Sample records for single-variant test statistics

  1. Meta-analysis of gene-level associations for rare variants based on single-variant statistics.

    PubMed

    Hu, Yi-Juan; Berndt, Sonja I; Gustafsson, Stefan; Ganna, Andrea; Hirschhorn, Joel; North, Kari E; Ingelsson, Erik; Lin, Dan-Yu

    2013-08-08

    Meta-analysis of genome-wide association studies (GWASs) has led to the discoveries of many common variants associated with complex human diseases. There is a growing recognition that identifying "causal" rare variants also requires large-scale meta-analysis. The fact that association tests with rare variants are performed at the gene level rather than at the variant level poses unprecedented challenges in the meta-analysis. First, different studies may adopt different gene-level tests, so the results are not compatible. Second, gene-level tests require multivariate statistics (i.e., components of the test statistic and their covariance matrix), which are difficult to obtain. To overcome these challenges, we propose to perform gene-level tests for rare variants by combining the results of single-variant analysis (i.e., p values of association tests and effect estimates) from participating studies. This simple strategy is possible because of an insight that multivariate statistics can be recovered from single-variant statistics, together with the correlation matrix of the single-variant test statistics, which can be estimated from one of the participating studies or from a publicly available database. We show both theoretically and numerically that the proposed meta-analysis approach provides accurate control of the type I error and is as powerful as joint analysis of individual participant data. This approach accommodates any disease phenotype and any study design and produces all commonly used gene-level tests. An application to the GWAS summary results of the Genetic Investigation of ANthropometric Traits (GIANT) consortium reveals rare and low-frequency variants associated with human height. The relevant software is freely available. Copyright © 2013 The American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.
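
    As a rough illustration of how single-variant results plus a correlation matrix can yield a gene-level test, the sketch below computes a generic burden-type statistic from z-scores. It is not taken from the paper or its software: the function name `burden_from_z`, the weights, and the toy inputs are assumptions made for this example, and the correlation matrix is simply supplied by the caller.

```python
import math

def burden_from_z(z, corr, weights):
    """Burden-type gene-level statistic built from single-variant z-scores.

    z       : single-variant z-scores (length m)
    corr    : m x m correlation matrix of the z-scores (list of lists)
    weights : per-variant weights (length m); hypothetical, chosen by the analyst
    Returns (statistic, p_value). Under the null, and if corr is the true
    correlation of z, the statistic is chi-square with 1 degree of freedom.
    """
    m = len(z)
    # Squared weighted sum of the z-scores.
    numerator = sum(w * zi for w, zi in zip(weights, z)) ** 2
    # Variance of the weighted sum: w' R w.
    variance = sum(weights[i] * corr[i][j] * weights[j]
                   for i in range(m) for j in range(m))
    stat = numerator / variance
    # Survival function of chi-square(1): P(X > t) = erfc(sqrt(t / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

# Toy input: two uncorrelated variants with equal weights (assumed values).
z_scores = [2.0, 2.0]
identity = [[1.0, 0.0], [0.0, 1.0]]
stat, p = burden_from_z(z_scores, identity, [1.0, 1.0])
print(stat, p)
```

    With positively correlated variants the denominator grows, so the same z-scores give a smaller statistic; this is why the correlation matrix is needed in addition to the single-variant results.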

  2. Association analysis of multiple traits by an approach of combining P values.

    PubMed

    Chen, Lili; Wang, Yong; Zhou, Yajing

    2018-03-01

    Increasing evidence shows that one variant can affect multiple traits, which is a widespread phenomenon in complex diseases. Joint analysis of multiple traits can increase the statistical power of association analysis and uncover the underlying genetic mechanism. Although there are many statistical methods to analyse multiple traits, most of these methods are suited to detecting common variants associated with multiple traits. Because rare variants have low minor allele frequencies, these methods are not optimal for rare variant association analysis. In this paper, we extend an adaptive combination of P values method (termed ADA) for a single trait to test association between multiple traits and rare variants in a given region. For a given region, we use a reverse regression model to test each rare variant for association with multiple traits and obtain the P value of the single-variant test. Further, we take the weighted combination of these P values as the test statistic. Extensive simulation studies show that our approach is more powerful than several other comparison methods in most cases and is robust to the inclusion of a high proportion of neutral variants and to different directions of effects of causal variants.

  3. Statistical tests for detecting associations with groups of genetic variants: generalization, evaluation, and implementation

    PubMed Central

    Ferguson, John; Wheeler, William; Fu, YiPing; Prokunina-Olsson, Ludmila; Zhao, Hongyu; Sampson, Joshua

    2013-01-01

    With recent advances in sequencing, genotyping arrays, and imputation, GWAS now aim to identify associations with rare and uncommon genetic variants. Here, we describe and evaluate a class of statistics, generalized score statistics (GSS), that can test for an association between a group of genetic variants and a phenotype. GSS are a simple weighted sum of single-variant statistics and their cross-products. We show that the majority of statistics currently used to detect associations with rare variants are equivalent to choosing a specific set of weights within this framework. We then evaluate the power of various weighting schemes as a function of variant characteristics, such as MAF, the proportion associated with the phenotype, and the direction of effect. Ultimately, we find that two classical tests are robust and powerful, but details are provided as to when other GSS may perform favorably. The software package CRaVe is available at our website (http://dceg.cancer.gov/bb/tools/crave). PMID:23092956
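
    The description of GSS as a weighted sum of single-variant statistics and their cross-products can be read as a quadratic form in the vector of single-variant statistics. The sketch below is an illustration under that reading, not the CRaVe implementation: the helper names and the toy z-scores are assumptions. It shows two familiar weight choices, an outer-product weight matrix (a burden-type statistic) and a diagonal weight matrix (a sum-of-squares statistic).

```python
def quadratic_form(z, weight_matrix):
    """Weighted sum of squares z_i^2 and cross-products z_i * z_j."""
    m = len(z)
    return sum(weight_matrix[i][j] * z[i] * z[j]
               for i in range(m) for j in range(m))

def burden_weights(w):
    """Outer product w w': the quadratic form becomes (sum_i w_i z_i)^2."""
    return [[wi * wj for wj in w] for wi in w]

def diagonal_weights(w):
    """Diagonal matrix: the quadratic form becomes sum_i w_i z_i^2."""
    n = len(w)
    return [[w[i] if i == j else 0.0 for j in range(n)] for i in range(n)]

# Toy single-variant statistics with mixed directions of effect (assumed values).
z = [1.0, 2.0, -1.0]
unit = [1.0, 1.0, 1.0]

burden = quadratic_form(z, burden_weights(unit))    # (1 + 2 - 1)^2
sum_sq = quadratic_form(z, diagonal_weights(unit))  # 1 + 4 + 1
print(burden, sum_sq)
```

    In this toy case the negative statistic partly cancels the positive ones in the burden-type form but not in the sum-of-squares form, which is one way the direction of effect can change the relative power of different weightings.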

  4. Single-variant and multi-variant trend tests for genetic association with next-generation sequencing that are robust to sequencing error.

    PubMed

    Kim, Wonkuk; Londono, Douglas; Zhou, Lisheng; Xing, Jinchuan; Nato, Alejandro Q; Musolf, Anthony; Matise, Tara C; Finch, Stephen J; Gordon, Derek

    2012-01-01

    As with any new technology, next-generation sequencing (NGS) has potential advantages and potential challenges. One advantage is the identification of multiple causal variants for disease that might otherwise be missed by SNP-chip technology. One potential challenge is misclassification error (as with any emerging technology) and the issue of power loss due to multiple testing. Here, we develop an extension of the linear trend test for association that incorporates differential misclassification error and may be applied to any number of SNPs. We call the statistic the linear trend test allowing for error, applied to NGS, or LTTae,NGS. This statistic allows for differential misclassification. The observed data are phenotypes for unrelated cases and controls, coverage, and the number of putative causal variants for every individual at all SNPs. We simulate data considering multiple factors (disease mode of inheritance, genotype relative risk, causal variant frequency, sequence error rate in cases, sequence error rate in controls, number of loci, and others) and evaluate type I error rate and power for each vector of factor settings. We compare our results with two recently published NGS statistics. Also, we create a fictitious disease model based on downloaded 1000 Genomes data for 5 SNPs and 388 individuals, and apply our statistic to those data. We find that the LTTae,NGS maintains the correct type I error rate in all simulations (differential and non-differential error), while the other statistics show large inflation in type I error for lower coverage. Power is approximately the same for all three statistics in the presence of non-differential error. Application of our statistic to the 1000 Genomes data suggests that, for the data downloaded, there is a 1.5% sequence misclassification rate over all SNPs. Finally, application of the multi-variant form of LTTae,NGS shows high power for a number of simulation settings, although it can have lower power than the corresponding single-variant simulation results, most probably due to our specification of multi-variant SNP correlation values. In conclusion, our LTTae,NGS addresses two key challenges with NGS disease studies: first, it allows for differential misclassification when computing the statistic; and second, it addresses the multiple-testing issue in that there is a multi-variant form of the statistic that has only one degree of freedom, and provides a single p value, no matter how many loci. Copyright © 2013 S. Karger AG, Basel.

  5. Single variant and multi-variant trend tests for genetic association with next generation sequencing that are robust to sequencing error

    PubMed Central

    Kim, Wonkuk; Londono, Douglas; Zhou, Lisheng; Xing, Jinchuan; Nato, Andrew; Musolf, Anthony; Matise, Tara C.; Finch, Stephen J.; Gordon, Derek

    2013-01-01

    As with any new technology, next generation sequencing (NGS) has potential advantages and potential challenges. One advantage is the identification of multiple causal variants for disease that might otherwise be missed by SNP-chip technology. One potential challenge is misclassification error (as with any emerging technology) and the issue of power loss due to multiple testing. Here, we develop an extension of the linear trend test for association that incorporates differential misclassification error and may be applied to any number of SNPs. We call the statistic the linear trend test allowing for error, applied to NGS, or LTTae,NGS. This statistic allows for differential misclassification. The observed data are phenotypes for unrelated cases and controls, coverage, and the number of putative causal variants for every individual at all SNPs. We simulate data considering multiple factors (disease mode of inheritance, genotype relative risk, causal variant frequency, sequence error rate in cases, sequence error rate in controls, number of loci, and others) and evaluate type I error rate and power for each vector of factor settings. We compare our results with two recently published NGS statistics. Also, we create a fictitious disease model, based on downloaded 1000 Genomes data for 5 SNPs and 388 individuals, and apply our statistic to those data. We find that the LTTae,NGS maintains the correct type I error rate in all simulations (differential and non-differential error), while the other statistics show large inflation in type I error for lower coverage. Power is approximately the same for all three statistics in the presence of non-differential error. Application of our statistic to the 1000 Genomes data suggests that, for the data downloaded, there is a 1.5% sequence misclassification rate over all SNPs. Finally, application of the multi-variant form of LTTae,NGS shows high power for a number of simulation settings, although it can have lower power than the corresponding single variant simulation results, most probably due to our specification of multi-variant SNP correlation values. In conclusion, our LTTae,NGS addresses two key challenges with NGS disease studies: first, it allows for differential misclassification when computing the statistic; and second, it addresses the multiple-testing issue in that there is a multi-variant form of the statistic that has only one degree of freedom, and provides a single p-value, no matter how many loci. PMID:23594495

  6. General Framework for Meta-analysis of Rare Variants in Sequencing Association Studies

    PubMed Central

    Lee, Seunggeun; Teslovich, Tanya M.; Boehnke, Michael; Lin, Xihong

    2013-01-01

    We propose a general statistical framework for meta-analysis of gene- or region-based multimarker rare variant association tests in sequencing association studies. In genome-wide association studies, single-marker meta-analysis has been widely used to increase statistical power by combining results via regression coefficients and standard errors from different studies. In analysis of rare variants in sequencing studies, region-based multimarker tests are often used to increase power. We propose meta-analysis methods for commonly used gene- or region-based rare variants tests, such as burden tests and variance component tests. Because estimation of regression coefficients of individual rare variants is often unstable or not feasible, the proposed method avoids this difficulty by calculating score statistics instead that only require fitting the null model for each study and then aggregating these score statistics across studies. Our proposed meta-analysis rare variant association tests are conducted based on study-specific summary statistics, specifically score statistics for each variant and between-variant covariance-type (linkage disequilibrium) relationship statistics for each gene or region. The proposed methods are able to incorporate different levels of heterogeneity of genetic effects across studies and are applicable to meta-analysis of multiple ancestry groups. We show that the proposed methods are essentially as powerful as joint analysis by directly pooling individual level genotype data. We conduct extensive simulations to evaluate the performance of our methods by varying levels of heterogeneity across studies, and we apply the proposed methods to meta-analysis of rare variant effects in a multicohort study of the genetics of blood lipid levels. PMID:23768515
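
    The idea of aggregating study-specific score statistics and covariance-type matrices can be sketched as below. This is a simplified illustration that assumes the same genetic effects in every study (the heterogeneity handling mentioned in the abstract is not modelled), and the function names, weights, and numbers are all assumptions made for the example rather than anything taken from the paper.

```python
import math

def combine_studies(scores, covariances):
    """Sum per-variant score statistics and their covariance matrices over studies."""
    m = len(scores[0])
    total_u = [sum(u[i] for u in scores) for i in range(m)]
    total_v = [[sum(v[i][j] for v in covariances) for j in range(m)]
               for i in range(m)]
    return total_u, total_v

def meta_burden(total_u, total_v, weights):
    """Burden-type test on the pooled scores: (w'U)^2 / (w'Vw), chi-square(1) under the null."""
    m = len(total_u)
    numerator = sum(w * u for w, u in zip(weights, total_u)) ** 2
    variance = sum(weights[i] * total_v[i][j] * weights[j]
                   for i in range(m) for j in range(m))
    stat = numerator / variance
    return stat, math.erfc(math.sqrt(stat / 2.0))

# Two hypothetical studies reporting scores and covariances for the same two variants.
study_scores = [[1.0, 2.0], [0.5, -1.0]]
study_covs = [
    [[2.0, 0.5], [0.5, 3.0]],
    [[1.0, 0.0], [0.0, 2.0]],
]
pooled_u, pooled_v = combine_studies(study_scores, study_covs)
meta_stat, meta_p = meta_burden(pooled_u, pooled_v, [1.0, 1.0])
print(pooled_u, pooled_v, meta_stat, meta_p)
```

    Only the null model has to be fitted in each study to obtain these inputs, which is why no per-variant regression coefficients appear in the sketch.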

  7. gsSKAT: Rapid gene set analysis and multiple testing correction for rare-variant association studies using weighted linear kernels.

    PubMed

    Larson, Nicholas B; McDonnell, Shannon; Cannon Albright, Lisa; Teerlink, Craig; Stanford, Janet; Ostrander, Elaine A; Isaacs, William B; Xu, Jianfeng; Cooney, Kathleen A; Lange, Ethan; Schleutker, Johanna; Carpten, John D; Powell, Isaac; Bailey-Wilson, Joan E; Cussenot, Olivier; Cancel-Tassin, Geraldine; Giles, Graham G; MacInnis, Robert J; Maier, Christiane; Whittemore, Alice S; Hsieh, Chih-Lin; Wiklund, Fredrik; Catalona, William J; Foulkes, William; Mandal, Diptasri; Eeles, Rosalind; Kote-Jarai, Zsofia; Ackerman, Michael J; Olson, Timothy M; Klein, Christopher J; Thibodeau, Stephen N; Schaid, Daniel J

    2017-05-01

    Next-generation sequencing technologies have afforded unprecedented characterization of low-frequency and rare genetic variation. Due to low power for single-variant testing, aggregative methods are commonly used to combine observed rare variation within a single gene. Causal variation may also aggregate across multiple genes within relevant biomolecular pathways. Kernel-machine regression and adaptive testing methods for aggregative rare-variant association testing have been demonstrated to be powerful approaches for pathway-level analysis, although these methods tend to be computationally intensive at high-variant dimensionality and require access to complete data. An additional analytical issue in scans of large pathway definition sets is multiple testing correction. Gene set definitions may exhibit substantial genic overlap, and the impact of the resultant correlation in test statistics on Type I error rate control for large agnostic gene set scans has not been fully explored. Herein, we first outline a statistical strategy for aggregative rare-variant analysis using component gene-level linear kernel score test summary statistics and derive simple estimators of the effective number of tests for family-wise error rate control. We then conduct extensive simulation studies to characterize the behavior of our approach relative to direct application of kernel and adaptive methods under a variety of conditions. We also apply our method to two case-control studies, evaluating rare variation in hereditary prostate cancer and schizophrenia, respectively. Finally, we provide open-source R code for public use to facilitate easy application of our methods to existing rare-variant analysis results. © 2017 WILEY PERIODICALS, INC.

  8. Multiple Phenotype Association Tests Using Summary Statistics in Genome-Wide Association Studies

    PubMed Central

    Liu, Zhonghua; Lin, Xihong

    2017-01-01

    We study in this paper jointly testing the associations of a genetic variant with correlated multiple phenotypes using the summary statistics of individual phenotype analysis from Genome-Wide Association Studies (GWASs). We estimated the between-phenotype correlation matrix using the summary statistics of individual phenotype GWAS analyses, and developed genetic association tests for multiple phenotypes by accounting for between-phenotype correlation without the need to access individual-level data. Since genetic variants often affect multiple phenotypes differently across the genome and the between-phenotype correlation can be arbitrary, we proposed robust and powerful multiple phenotype testing procedures by jointly testing a common mean and a variance component in linear mixed models for summary statistics. We computed the p-values of the proposed tests analytically. This computational advantage makes our methods practically appealing in large-scale GWASs. We performed simulation studies to show that the proposed tests maintained correct type I error rates, and to compare their powers in various settings with the existing methods. We applied the proposed tests to a GWAS Global Lipids Genetics Consortium summary statistics data set and identified additional genetic variants that were missed by the original single-trait analysis. PMID:28653391

  9. Multiple phenotype association tests using summary statistics in genome-wide association studies.

    PubMed

    Liu, Zhonghua; Lin, Xihong

    2018-03-01

    We study in this article jointly testing the associations of a genetic variant with correlated multiple phenotypes using the summary statistics of individual phenotype analysis from Genome-Wide Association Studies (GWASs). We estimated the between-phenotype correlation matrix using the summary statistics of individual phenotype GWAS analyses, and developed genetic association tests for multiple phenotypes by accounting for between-phenotype correlation without the need to access individual-level data. Since genetic variants often affect multiple phenotypes differently across the genome and the between-phenotype correlation can be arbitrary, we proposed robust and powerful multiple phenotype testing procedures by jointly testing a common mean and a variance component in linear mixed models for summary statistics. We computed the p-values of the proposed tests analytically. This computational advantage makes our methods practically appealing in large-scale GWASs. We performed simulation studies to show that the proposed tests maintained correct type I error rates, and to compare their powers in various settings with the existing methods. We applied the proposed tests to a GWAS Global Lipids Genetics Consortium summary statistics data set and identified additional genetic variants that were missed by the original single-trait analysis. © 2017, The International Biometric Society.

  10. A variational Bayes discrete mixture test for rare variant association

    PubMed Central

    Logsdon, Benjamin A.; Dai, James Y.; Auer, Paul L.; Johnsen, Jill M.; Ganesh, Santhi K.; Smith, Nicholas L.; Wilson, James G.; Tracy, Russell P.; Lange, Leslie A.; Jiao, Shuo; Rich, Stephen S.; Lettre, Guillaume; Carlson, Christopher S.; Jackson, Rebecca D.; O’Donnell, Christopher J.; Wurfel, Mark M.; Nickerson, Deborah A.; Tang, Hua; Reiner, Alexander P.; Kooperberg, Charles

    2014-01-01

    Recently, many statistical methods have been proposed to test for associations between rare genetic variants and complex traits. Most of these methods test for association by aggregating genetic variations within a predefined region, such as a gene. Although there is evidence that “aggregate” tests are more powerful than the single marker test, these tests generally ignore neutral variants and therefore are unable to identify specific variants driving the association with phenotype. We propose a novel aggregate rare-variant test that explicitly models a fraction of variants as neutral, tests associations at the gene-level, and infers the rare-variants driving the association. Simulations show that in the practical scenario where there are many variants within a given region of the genome with only a fraction causal our approach has greater power compared to other popular tests such as the Sequence Kernel Association Test (SKAT), the Weighted Sum Statistic (WSS), and the collapsing method of Morris and Zeggini (MZ). Our algorithm leverages a fast variational Bayes approximate inference methodology to scale to exome-wide analyses, a significant computational advantage over exact inference model selection methodologies. To demonstrate the efficacy of our methodology we test for associations between von Willebrand Factor (VWF) levels and VWF missense rare-variants imputed from the National Heart, Lung, and Blood Institute’s Exome Sequencing project into 2,487 African Americans within the VWF gene. Our method suggests that a relatively small fraction (~10%) of the imputed rare missense variants within VWF are strongly associated with lower VWF levels in African Americans. PMID:24482836

  11. A variational Bayes discrete mixture test for rare variant association.

    PubMed

    Logsdon, Benjamin A; Dai, James Y; Auer, Paul L; Johnsen, Jill M; Ganesh, Santhi K; Smith, Nicholas L; Wilson, James G; Tracy, Russell P; Lange, Leslie A; Jiao, Shuo; Rich, Stephen S; Lettre, Guillaume; Carlson, Christopher S; Jackson, Rebecca D; O'Donnell, Christopher J; Wurfel, Mark M; Nickerson, Deborah A; Tang, Hua; Reiner, Alexander P; Kooperberg, Charles

    2014-01-01

    Recently, many statistical methods have been proposed to test for associations between rare genetic variants and complex traits. Most of these methods test for association by aggregating genetic variations within a predefined region, such as a gene. Although there is evidence that "aggregate" tests are more powerful than the single marker test, these tests generally ignore neutral variants and therefore are unable to identify specific variants driving the association with phenotype. We propose a novel aggregate rare-variant test that explicitly models a fraction of variants as neutral, tests associations at the gene-level, and infers the rare-variants driving the association. Simulations show that in the practical scenario where there are many variants within a given region of the genome with only a fraction causal our approach has greater power compared to other popular tests such as the Sequence Kernel Association Test (SKAT), the Weighted Sum Statistic (WSS), and the collapsing method of Morris and Zeggini (MZ). Our algorithm leverages a fast variational Bayes approximate inference methodology to scale to exome-wide analyses, a significant computational advantage over exact inference model selection methodologies. To demonstrate the efficacy of our methodology we test for associations between von Willebrand Factor (VWF) levels and VWF missense rare-variants imputed from the National Heart, Lung, and Blood Institute's Exome Sequencing project into 2,487 African Americans within the VWF gene. Our method suggests that a relatively small fraction (~10%) of the imputed rare missense variants within VWF are strongly associated with lower VWF levels in African Americans.

  12. Detecting Genomic Clustering of Risk Variants from Sequence Data: Cases vs. Controls

    PubMed Central

    Schaid, Daniel J.; Sinnwell, Jason P.; McDonnell, Shannon K.; Thibodeau, Stephen N.

    2013-01-01

    As the ability to measure dense genetic markers approaches the limit of the DNA sequence itself, taking advantage of possible clustering of genetic variants in, and around, a gene would benefit genetic association analyses, and likely provide biological insights. The greatest benefit might be realized when multiple rare variants cluster in a functional region. Several statistical tests have been developed, one of which is based on the popular Kulldorff scan statistic for spatial clustering of disease. We extended another popular spatial clustering method, Tango’s statistic, to genomic sequence data. An advantage of Tango’s method is that it is rapid to compute, and when a single test statistic is computed, its distribution is well approximated by a scaled chi-square distribution, making computation of p-values very rapid. We compared the Type-I error rates and power of several clustering statistics, as well as the omnibus sequence kernel association test (SKAT). Although our version of Tango’s statistic, which we call the “Kernel Distance” statistic, took approximately half as long to compute as the Kulldorff scan statistic, it had slightly less power than the scan statistic. Our results showed that the Ionita-Laza version of Kulldorff’s scan statistic had the greatest power over a range of clustering scenarios. PMID:23842950

  13. Impact of genotyping errors on statistical power of association tests in genomic analyses: A case study

    PubMed Central

    Hou, Lin; Sun, Ning; Mane, Shrikant; Sayward, Fred; Rajeevan, Nallakkandi; Cheung, Kei-Hoi; Cho, Kelly; Pyarajan, Saiju; Aslan, Mihaela; Miller, Perry; Harvey, Philip D.; Gaziano, J. Michael; Concato, John; Zhao, Hongyu

    2017-01-01

    A key step in genomic studies is to assess high throughput measurements across millions of markers for each participant’s DNA, either using microarrays or sequencing techniques. Accurate genotype calling is essential for downstream statistical analysis of genotype-phenotype associations, and next generation sequencing (NGS) has recently become a more common approach in genomic studies. How the accuracy of variant calling in NGS-based studies affects downstream association analysis has not, however, been studied using empirical data in which both microarrays and NGS were available. In this article, we investigate the impact of variant calling errors on the statistical power to identify associations between single nucleotides and disease, and on associations between multiple rare variants and disease. Both differential and nondifferential genotyping errors are considered. Our results show that the power of burden tests for rare variants is strongly influenced by the specificity in variant calling, but is rather robust with regard to sensitivity. By using the variant calling accuracies estimated from a substudy of a Cooperative Studies Program project conducted by the Department of Veterans Affairs, we show that the power of association tests is mostly retained with commonly adopted variant calling pipelines. An R package, GWAS.PC, is provided to accommodate power analysis that takes account of genotyping errors (http://zhaocenter.org/software/). PMID:28019059

  14. Rare Variant Association Test with Multiple Phenotypes

    PubMed Central

    Lee, Selyeong; Won, Sungho; Kim, Young Jin; Kim, Yongkang; Kim, Bong-Jo; Park, Taesung

    2016-01-01

    Although genome-wide association studies (GWAS) have now discovered thousands of genetic variants associated with common traits, such variants cannot explain the large degree of “missing heritability,” likely due to rare variants. The advent of next generation sequencing technology has allowed rare variant detection and association with common traits, often by investigating specific genomic regions for rare variant effects on a trait. Although multiple correlated phenotypes are often concurrently observed in GWAS, most studies analyze only single phenotypes, which may lessen statistical power. To increase power, multivariate analyses, which consider correlations between multiple phenotypes, can be used. However, few existing multi-variant analyses can identify rare variants for assessing multiple phenotypes. Here, we propose Multivariate Association Analysis using Score Statistics (MAAUSS), to identify rare variants associated with multiple phenotypes, based on the widely used Sequence Kernel Association Test (SKAT) for a single phenotype. We applied MAAUSS to Whole Exome Sequencing (WES) data from a Korean population of 1,058 subjects, to discover genes associated with multiple traits of liver function. We then validated those genes in a replication study, using an independent dataset of 3,445 individuals. Notably, we detected the gene ZNF620 among five significant genes. We then performed a simulation study to compare MAAUSS's performance with existing methods. Overall, MAAUSS successfully conserved type I error rates and, in many cases, had higher power than the existing methods. This study illustrates a feasible and straightforward approach for identifying rare variants correlated with multiple phenotypes, with likely relevance to missing heritability. PMID:28039885

  15. Illustrating, Quantifying, and Correcting for Bias in Post-hoc Analysis of Gene-Based Rare Variant Tests of Association

    PubMed Central

    Grinde, Kelsey E.; Arbet, Jaron; Green, Alden; O'Connell, Michael; Valcarcel, Alessandra; Westra, Jason; Tintle, Nathan

    2017-01-01

    To date, gene-based rare variant testing approaches have focused on aggregating information across sets of variants to maximize statistical power in identifying genes showing significant association with diseases. Beyond identifying genes that are associated with diseases, the identification of causal variant(s) in those genes and estimation of their effect is crucial for planning replication studies and characterizing the genetic architecture of the locus. However, we illustrate that straightforward single-marker association statistics can suffer from substantial bias introduced by conditioning on gene-based test significance, due to the phenomenon often referred to as “winner's curse.” We illustrate the ramifications of this bias on variant effect size estimation and variant prioritization/ranking approaches, outline parameters of genetic architecture that affect this bias, and propose a bootstrap resampling method to correct for this bias. We find that our correction method significantly reduces the bias due to winner's curse (average two-fold decrease in bias, p < 2.2 × 10^-6) and, consequently, substantially improves mean squared error and variant prioritization/ranking. The method is particularly helpful in adjustment for winner's curse effects when the initial gene-based test has low power and for relatively more common, non-causal variants. Adjustment for winner's curse is recommended for all post-hoc estimation and ranking of variants after a gene-based test. Further work is necessary to continue seeking ways to reduce bias and improve inference in post-hoc analysis of gene-based tests under a wide variety of genetic architectures. PMID:28959274

  16. Common variants of the EPDR1 gene and the risk of Dupuytren’s disease.

    PubMed

    Dębniak, T; Żyluk, A; Puchalski, P; Serrano-Fernandez, P

    2013-10-01

    This study investigated three common single nucleotide polymorphism variants of the ependymin-related gene 1 (EPDR1) and their association with the occurrence of Dupuytren's disease. DNA samples were obtained from the peripheral blood of 508 consecutive patients. The control group comprised 515 healthy adults who were age-matched with the Dupuytren's patients. The three common variants were analysed using TaqMan® genotyping assays and sequencing. The differences in the frequencies of variants of single nucleotide polymorphisms in patients and the control group were statistically tested. Additionally, haplotype frequency and linkage disequilibrium were analysed for these variants. A statistically significant association was noted between rs16879765_CT, rs16879765_TT and rs13240429_AA variants and Dupuytren's disease. Two haplotypes, rs2722280_C+rs13240429_A+rs16879765_C and rs2722280_C+rs13240429_G+rs16879765_T, were found to be statistically significantly associated with Dupuytren's disease. Moreover, we found that rs13240429 and rs16879765 variants were in strong linkage disequilibrium, while rs2722280 was only in moderate linkage disequilibrium. No significant differences were found in the frequencies of the variants of the gene between the groups with a positive and negative familial history of Dupuytren's disease. In conclusion, the results of this study suggest that the EPDR1 gene can be added to a growing list of genes associated with Dupuytren's disease development. © Georg Thieme Verlag KG Stuttgart · New York.

  17. Higher criticism approach to detect rare variants using whole genome sequencing data

    PubMed Central

    2014-01-01

    Because of low statistical power of single-variant tests for whole genome sequencing (WGS) data, the association test for variant groups is a key approach for genetic mapping. To address the features of sparse and weak genetic effects to be detected, the higher criticism (HC) approach has been proposed and theoretically has proven optimal for detecting sparse and weak genetic effects. Here we develop a strategy to apply the HC approach to WGS data that contains rare variants as the majority. By using Genetic Analysis Workshop 18 "dose" genetic data with simulated phenotypes, we assess the performance of HC under a variety of strategies for grouping variants and collapsing rare variants. The HC approach is compared with the minimal p-value method and the sequence kernel association test. The results show that the HC approach is preferred for detecting weak genetic effects. PMID:25519367
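
    For readers unfamiliar with higher criticism, the sketch below computes one common textbook form of the HC statistic from a group of variant-level p-values, alongside the minimal p-value it is compared with. It is an illustration only: the abstract does not say which HC variant, grouping rule, or collapsing strategy the authors used, and the function names and p-values here are assumptions.

```python
import math

def higher_criticism(p_values):
    """Maximum standardized gap between the empirical and uniform p-value distributions.

    For sorted p-values p_(1) <= ... <= p_(n), computes
        max_i sqrt(n) * (i/n - p_(i)) / sqrt(p_(i) * (1 - p_(i))).
    P-values equal to 0 or 1 are skipped to avoid division by zero.
    """
    n = len(p_values)
    ordered = sorted(p_values)
    best = float("-inf")
    for i, p in enumerate(ordered, start=1):
        if p <= 0.0 or p >= 1.0:
            continue
        hc_i = math.sqrt(n) * (i / n - p) / math.sqrt(p * (1.0 - p))
        best = max(best, hc_i)
    return best

def minimal_p(p_values):
    """Minimal p-value in the group (needs multiplicity adjustment before use)."""
    return min(p_values)

# Hypothetical p-values for three variants in one group.
p_vals = [0.5, 0.01, 0.9]
hc = higher_criticism(p_vals)
print(hc, minimal_p(p_vals))
```

    Some formulations restrict the maximum to a subset of the ordered p-values (for example, the smallest half); that restriction is omitted here for brevity.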

  18. Using volcano plots and regularized-chi statistics in genetic association studies.

    PubMed

    Li, Wentian; Freudenberg, Jan; Suh, Young Ju; Yang, Yaning

    2014-02-01

    Labor intensive experiments are typically required to identify the causal disease variants from a list of disease associated variants in the genome. For designing such experiments, candidate variants are ranked by their strength of genetic association with the disease. However, the two commonly used measures of genetic association, the odds-ratio (OR) and p-value, may rank variants in different orders. To integrate these two measures into a single analysis, here we transfer the volcano plot methodology from gene expression analysis to genetic association studies. In its original setting, volcano plots are scatter plots of fold-change and t-test statistic (or -log of the p-value), with the latter being more sensitive to sample size. In genetic association studies, the OR and Pearson's chi-square statistic (or equivalently its square root, chi; or the standardized log(OR)) can be analogously used in a volcano plot, allowing for their visual inspection. Moreover, the geometric interpretation of these plots leads to an intuitive method for filtering results by a combination of both OR and chi-square statistic, which we term "regularized-chi". This method selects associated markers by a smooth curve in the volcano plot instead of the right-angled lines which correspond to independent cutoffs for OR and chi-square statistic. The regularized-chi incorporates relatively more signals from variants with lower minor-allele-frequencies than the chi-square test statistic. As rare variants tend to have stronger functional effects, regularized-chi is better suited to the task of prioritization of candidate genes. Copyright © 2013 Elsevier Ltd. All rights reserved.
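
    The two axes of such a volcano plot can be computed from a 2x2 table of allele counts. The sketch below is illustrative only: the function name `odds_ratio_and_chi` and the counts are hypothetical, and the regularized-chi filter itself is not implemented because its formula is not given in the abstract.

```python
import math


def odds_ratio_and_chi(a, b, c, d):
    """Return (OR, chi) for a 2x2 table of allele counts.

    a, b: minor / major allele counts in cases
    c, d: minor / major allele counts in controls
    chi is the square root of Pearson's chi-square statistic (1 df,
    no continuity correction).
    """
    n = a + b + c + d
    odds_ratio = (a * d) / (b * c)
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    return odds_ratio, math.sqrt(chi2)


# Hypothetical counts chosen only for illustration.
or_, chi = odds_ratio_and_chi(30, 70, 15, 85)
print(round(math.log(or_), 3), round(chi, 3))
```

    Plotting log(OR) against chi for every variant gives the genetic-association analogue of the fold-change versus t-statistic plot described above.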

  19. Utilizing population controls in rare-variant case-parent association tests.

    PubMed

    Jiang, Yu; Satten, Glen A; Han, Yujun; Epstein, Michael P; Heinzen, Erin L; Goldstein, David B; Allen, Andrew S

    2014-06-05

    There is great interest in detecting associations between human traits and rare genetic variation. To address the low power implicit in single-locus tests of rare genetic variants, many rare-variant association approaches attempt to accumulate information across a gene, often by taking linear combinations of single-locus contributions to a statistic. Using the right linear combination is key: an optimal test will up-weight true causal variants, down-weight neutral variants, and correctly assign the direction of effect for causal variants. Here, we propose a procedure that exploits data from population controls to estimate the linear combination to be used in a case-parent trio rare-variant association test. Specifically, we estimate the linear combination by comparing population control allele frequencies with allele frequencies in the parents of affected offspring. These estimates are then used to construct a rare-variant transmission disequilibrium test (rvTDT) in the case-parent data. Because the rvTDT is conditional on the parents' data, using parental data in estimating the linear combination does not affect the validity or asymptotic distribution of the rvTDT. By using simulation, we show that our new population-control-based rvTDT can dramatically improve power over rvTDTs that do not use population control information across a wide variety of genetic architectures. It also remains valid under population stratification. We apply the approach to a cohort of epileptic encephalopathy (EE) trios and find that dominant (or additive) inherited rare variants are unlikely to play a substantial role within EE genes previously identified through de novo mutation studies. Copyright © 2014 The American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.

  20. A Novel Genome-Information Content-Based Statistic for Genome-Wide Association Analysis Designed for Next-Generation Sequencing Data

    PubMed Central

    Luo, Li; Zhu, Yun

    2012-01-01

    The genome-wide association studies (GWAS) designed for next-generation sequencing data involve testing association of genomic variants, including common, low frequency, and rare variants. The current strategies for association studies are well developed for identifying association of common variants with the common diseases, but may be ill-suited when large amounts of allelic heterogeneity are present in sequence data. Recently, group tests that analyze their collective frequency differences between cases and controls shift the current variant-by-variant analysis paradigm for GWAS of common variants to the collective test of multiple variants in the association analysis of rare variants. However, group tests ignore differences in genetic effects among SNPs at different genomic locations. As an alternative to group tests, we developed a novel genome-information content-based statistics for testing association of the entire allele frequency spectrum of genomic variation with the diseases. To evaluate the performance of the proposed statistics, we use large-scale simulations based on whole genome low coverage pilot data in the 1000 Genomes Project to calculate the type 1 error rates and power of seven alternative statistics: a genome-information content-based statistic, the generalized T2, collapsing method, multivariate and collapsing (CMC) method, individual χ2 test, weighted-sum statistic, and variable threshold statistic. Finally, we apply the seven statistics to published resequencing dataset from ANGPTL3, ANGPTL4, ANGPTL5, and ANGPTL6 genes in the Dallas Heart Study. We report that the genome-information content-based statistic has significantly improved type 1 error rates and higher power than the other six statistics in both simulated and empirical datasets. PMID:22651812

  1. A novel genome-information content-based statistic for genome-wide association analysis designed for next-generation sequencing data.

    PubMed

    Luo, Li; Zhu, Yun; Xiong, Momiao

    2012-06-01

    The genome-wide association studies (GWAS) designed for next-generation sequencing data involve testing association of genomic variants, including common, low frequency, and rare variants. The current strategies for association studies are well developed for identifying association of common variants with the common diseases, but may be ill-suited when large amounts of allelic heterogeneity are present in sequence data. Recently, group tests that analyze their collective frequency differences between cases and controls shift the current variant-by-variant analysis paradigm for GWAS of common variants to the collective test of multiple variants in the association analysis of rare variants. However, group tests ignore differences in genetic effects among SNPs at different genomic locations. As an alternative to group tests, we developed a novel genome-information content-based statistics for testing association of the entire allele frequency spectrum of genomic variation with the diseases. To evaluate the performance of the proposed statistics, we use large-scale simulations based on whole genome low coverage pilot data in the 1000 Genomes Project to calculate the type 1 error rates and power of seven alternative statistics: a genome-information content-based statistic, the generalized T(2), collapsing method, multivariate and collapsing (CMC) method, individual χ(2) test, weighted-sum statistic, and variable threshold statistic. Finally, we apply the seven statistics to published resequencing dataset from ANGPTL3, ANGPTL4, ANGPTL5, and ANGPTL6 genes in the Dallas Heart Study. We report that the genome-information content-based statistic has significantly improved type 1 error rates and higher power than the other six statistics in both simulated and empirical datasets.

  2. DoEstRare: A statistical test to identify local enrichments in rare genomic variants associated with disease.

    PubMed

    Persyn, Elodie; Karakachoff, Matilde; Le Scouarnec, Solena; Le Clézio, Camille; Campion, Dominique; Consortium, French Exome; Schott, Jean-Jacques; Redon, Richard; Bellanger, Lise; Dina, Christian

    2017-01-01

    Next-generation sequencing technologies made it possible to assay the effect of rare variants on complex diseases. As an extension of the "common disease-common variant" paradigm, rare variant studies are necessary to get a more complete insight into the genetic architecture of human traits. Association studies of these rare variations show new challenges in terms of statistical analysis. Due to their low frequency, rare variants must be tested by groups. This approach is then hindered by the fact that an unknown proportion of the variants could be neutral. The risk level of a rare variation may be determined by its impact but also by its position in the protein sequence. More generally, the molecular mechanisms underlying the disease architecture may involve specific protein domains or inter-genic regulatory regions. While a large variety of methods are optimizing functionality weights for each single marker, few evaluate variant position differences between cases and controls. Here, we propose a test called DoEstRare, which aims to simultaneously detect clusters of disease risk variants and global allele frequency differences in genomic regions. This test estimates, for cases and controls, variant position densities in the genetic region by a kernel method, weighted by a function of allele frequencies. We compared DoEstRare with previously published strategies through simulation studies as well as re-analysis of real datasets. Based on simulation under various scenarios, DoEstRare was the sole to consistently show highest performance, in terms of type I error and power both when variants were clustered or not. DoEstRare was also applied to Brugada syndrome and early-onset Alzheimer's disease data and provided complementary results to other existing tests. DoEstRare, by integrating variant position information, gives new opportunities to explain disease susceptibility. DoEstRare is implemented in a user-friendly R package.

  3. Exome Array Analysis of Nuclear Lens Opacity.

    PubMed

    Loomis, Stephanie J; Klein, Alison P; Lee, Kristine E; Chen, Fei; Bomotti, Samantha; Truitt, Barbara; Iyengar, Sudha K; Klein, Ronald; Klein, Barbara E K; Duggal, Priya

    2018-06-01

    Nuclear cataract is the most common subtype of age-related cataract, the leading cause of blindness worldwide. It results from advanced nuclear sclerosis, or opacity in the center of the optic lens, and is affected by both genetic and environmental risk factors, including smoking. We sought to understand the genetic factors associated with nuclear sclerosis through interrogation of rare and low frequency coding variants using exome array data. We analyzed Illumina Human Exome Array data for 1,488 participants of European ancestry in the Beaver Dam Eye Study who were without cataract surgery for association with nuclear sclerosis grade, controlling for age and sex. We performed single-variant regression analysis for 32,138 variants with minor allele frequency (MAF) ≥0.003. In addition, gene-based analysis of 11,844 genes containing at least two variants with MAF < 0.05 was performed using a gene-based unified burden and non-burden sequence kernel association test (SKAT-O). Additionally, both single-variant and gene-based analyses were stratified by smoking status. No single-variant test was statistically significant after Bonferroni correction (p < 1.6 × 10^-6; top single nucleotide polymorphism (SNP): rs144458991, p = 2.83 × 10^-5). Gene-based tests were suggestively associated with the gene RNF149 overall (p = 8.29 × 10^-6) and among never smokers (N = 790, p = 2.67 × 10^-6). This study did not find a significant genetic association with nuclear sclerosis; however, the possible association with the RNF149 gene highlights a potential candidate gene for future studies that aim to understand the genetic architecture of nuclear sclerosis.
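
    The single-variant Bonferroni threshold quoted in this record is consistent with dividing a family-wise error rate of 0.05 by the 32,138 variants tested. The 0.05 level is an assumption here, since the abstract does not state it; the short check below only reproduces that arithmetic.

```python
# Worked check of the Bonferroni threshold for the single-variant analysis.
n_tests = 32138   # single-variant tests reported in the abstract
alpha = 0.05      # ASSUMED family-wise error rate (not stated in the abstract)
top_p = 2.83e-5   # p-value of the top SNP, rs144458991

threshold = alpha / n_tests
print(f"Bonferroni threshold: {threshold:.2e}")
print("top SNP significant:", top_p < threshold)
```

    The threshold rounds to the value reported in the abstract, and the top SNP's p-value lies above it, in line with the statement that no single-variant test was significant.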

  4. Utilising family-based designs for detecting rare variant disease associations.

    PubMed

    Preston, Mark D; Dudbridge, Frank

    2014-03-01

    Rare genetic variants are thought to be important components in the causality of many diseases but discovering these associations is challenging. We demonstrate how best to use family-based designs to improve the power to detect rare variant disease associations. We show that using genetic data from enriched families (those pedigrees with greater than one affected member) increases the power and sensitivity of existing case-control rare variant tests. However, we show that transmission- (or within-family-) based tests do not benefit from this enrichment. This means that, in studies where a limited amount of genotyping is available, choosing a single case from each of many pedigrees has greater power than selecting multiple cases from fewer pedigrees. Finally, we show how a pseudo-case-control design allows a greater range of statistical tests to be applied to family data. © 2014 The Authors. Annals of Human Genetics published by John Wiley & Sons Ltd/University College London.

  5. Statistical method to compare massive parallel sequencing pipelines.

    PubMed

    Elsensohn, M H; Leblay, N; Dimassi, S; Campan-Fournier, A; Labalme, A; Roucher-Boulez, F; Sanlaville, D; Lesca, G; Bardel, C; Roy, P

    2017-03-01

    Today, sequencing is frequently carried out by Massive Parallel Sequencing (MPS) that cuts drastically sequencing time and expenses. Nevertheless, Sanger sequencing remains the main validation method to confirm the presence of variants. The analysis of MPS data involves the development of several bioinformatic tools, academic or commercial. We present here a statistical method to compare MPS pipelines and test it in a comparison between an academic (BWA-GATK) and a commercial pipeline (TMAP-NextGENe®), with and without reference to a gold standard (here, Sanger sequencing), on a panel of 41 genes in 43 epileptic patients. This method used the number of variants to fit log-linear models for pairwise agreements between pipelines. To assess the heterogeneity of the margins and the odds ratios of agreement, four log-linear models were used: a full model, a homogeneous-margin model, a model with single odds ratio for all patients, and a model with single intercept. Then a log-linear mixed model was fitted considering the biological variability as a random effect. Among the 390,339 base-pairs sequenced, TMAP-NextGENe® and BWA-GATK found, on average, 2253.49 and 1857.14 variants (single nucleotide variants and indels), respectively. Against the gold standard, the pipelines had similar sensitivities (63.47% vs. 63.42%) and close but significantly different specificities (99.57% vs. 99.65%; p < 0.001). Same-trend results were obtained when only single nucleotide variants were considered (99.98% specificity and 76.81% sensitivity for both pipelines). The method allows thus pipeline comparison and selection. It is generalizable to all types of MPS data and all pipelines.

  6. The effect of rare variants on inflation of the test statistics in case-control analyses.

    PubMed

    Pirie, Ailith; Wood, Angela; Lush, Michael; Tyrer, Jonathan; Pharoah, Paul D P

    2015-02-20

    The detection of bias due to cryptic population structure is an important step in the evaluation of findings of genetic association studies. The standard method of measuring this bias in a genetic association study is to compare the observed median association test statistic to the expected median test statistic. This ratio is inflated in the presence of cryptic population structure. However, inflation may also be caused by the properties of the association test itself particularly in the analysis of rare variants. We compared the properties of the three most commonly used association tests: the likelihood ratio test, the Wald test and the score test when testing rare variants for association using simulated data. We found evidence of inflation in the median test statistics of the likelihood ratio and score tests for tests of variants with less than 20 heterozygotes across the sample, regardless of the total sample size. The test statistics for the Wald test were under-inflated at the median for variants below the same minor allele frequency. In a genetic association study, if a substantial proportion of the genetic variants tested have rare minor allele frequencies, the properties of the association test may mask the presence or absence of bias due to population structure. The use of either the likelihood ratio test or the score test is likely to lead to inflation in the median test statistic in the absence of population structure. In contrast, the use of the Wald test is likely to result in under-inflation of the median test statistic which may mask the presence of population structure.
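
    The ratio described in this record (observed median test statistic over expected median) is commonly called the genomic inflation factor. The sketch below assumes 1-df chi-square test statistics and uses only the Python standard library; the function name `genomic_inflation` and the toy statistics are hypothetical, and none of the authors' simulations are reproduced.

```python
from statistics import NormalDist, median


def genomic_inflation(chi2_stats):
    """Observed median of 1-df chi-square statistics over the expected median.

    The expected median of a 1-df chi-square variable is the square of the
    0.75 quantile of the standard normal distribution (about 0.455).
    """
    expected_median = NormalDist().inv_cdf(0.75) ** 2
    return median(chi2_stats) / expected_median


# Toy null statistics: chi-square quantiles at evenly spaced p-values.
n = 101
null_stats = [
    NormalDist().inv_cdf(1 - ((k - 0.5) / n) / 2) ** 2 for k in range(1, n + 1)
]
inflated_stats = [1.1 * s for s in null_stats]  # uniform 10% inflation

print(genomic_inflation(null_stats), genomic_inflation(inflated_stats))
```

    A value near 1 indicates no inflation at the median. As the abstract points out, for rare variants the test itself can push this ratio above or below 1 even without population structure.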

  7. TATES: Efficient Multivariate Genotype-Phenotype Analysis for Genome-Wide Association Studies

    PubMed Central

    van der Sluis, Sophie; Posthuma, Danielle; Dolan, Conor V.

    2013-01-01

    To date, the genome-wide association study (GWAS) is the primary tool to identify genetic variants that cause phenotypic variation. As GWAS analyses are generally univariate in nature, multivariate phenotypic information is usually reduced to a single composite score. This practice often results in loss of statistical power to detect causal variants. Multivariate genotype–phenotype methods do exist but attain maximal power only in special circumstances. Here, we present a new multivariate method that we refer to as TATES (Trait-based Association Test that uses Extended Simes procedure), inspired by the GATES procedure proposed by Li et al (2011). For each component of a multivariate trait, TATES combines p-values obtained in standard univariate GWAS to acquire one trait-based p-value, while correcting for correlations between components. Extensive simulations, probing a wide variety of genotype–phenotype models, show that TATES's false positive rate is correct, and that TATES's statistical power to detect causal variants explaining 0.5% of the variance can be 2.5–9 times higher than the power of univariate tests based on composite scores and 1.5–2 times higher than the power of the standard MANOVA. Unlike other multivariate methods, TATES detects both genetic variants that are common to multiple phenotypes and genetic variants that are specific to a single phenotype, i.e. TATES provides a more complete view of the genetic architecture of complex traits. As the actual causal genotype–phenotype model is usually unknown and probably phenotypically and genetically complex, TATES, available as an open source program, constitutes a powerful new multivariate strategy that allows researchers to identify novel causal variants, while the complexity of traits is no longer a limiting factor. PMID:23359524
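
    For orientation, the ordinary Simes procedure that TATES builds on can be sketched in a few lines. This is not the TATES implementation. To my understanding the extended procedure replaces the raw counts of tests with effective numbers of tests derived from the correlations between phenotypes, and that step is omitted here. The function name `simes_p` and the example p-values are hypothetical.

```python
def simes_p(p_values):
    """Ordinary Simes combined p-value: min over j of m * p_(j) / j.

    p_(1) <= ... <= p_(m) are the ordered univariate p-values. The extended
    procedure used by TATES is NOT implemented here.
    """
    p = sorted(p_values)
    m = len(p)
    combined = min(m * p_j / j for j, p_j in enumerate(p, start=1))
    return min(1.0, combined)


# Hypothetical univariate p-values for one variant across three phenotypes.
print(simes_p([0.01, 0.04, 0.2]))
```

    The combined value is driven by whichever ordered p-value is smallest relative to its rank, which fits the abstract's point that the approach can pick up both variants shared across phenotypes and variants specific to one.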

  8. GEE-based SNP set association test for continuous and discrete traits in family-based association studies.

    PubMed

    Wang, Xuefeng; Lee, Seunggeun; Zhu, Xiaofeng; Redline, Susan; Lin, Xihong

    2013-12-01

    Family-based genetic association studies of related individuals provide opportunities to detect genetic variants that complement studies of unrelated individuals. Most statistical methods for family association studies for common variants are single marker based, which test one SNP at a time. In this paper, we consider testing the effect of an SNP set, e.g., SNPs in a gene, in family studies, for both continuous and discrete traits. Specifically, we propose a generalized estimating equations (GEEs) based kernel association test, a variance component based testing method, to test for the association between a phenotype and multiple variants in an SNP set jointly using family samples. The proposed approach allows for both continuous and discrete traits, where the correlation among family members is taken into account through the use of an empirical covariance estimator. We derive the theoretical distribution of the proposed statistic under the null and develop analytical methods to calculate the P-values. We also propose an efficient resampling method for correcting for small sample size bias in family studies. The proposed method allows for easily incorporating covariates and SNP-SNP interactions. Simulation studies show that the proposed method properly controls for type I error rates under both random and ascertained sampling schemes in family studies. We demonstrate through simulation studies that our approach has superior performance for association mapping compared to the single marker based minimum P-value GEE test for an SNP-set effect over a range of scenarios. We illustrate the application of the proposed method using data from the Cleveland Family GWAS Study. © 2013 WILEY PERIODICALS, INC.

  9. Rare-Variant Association Analysis: Study Designs and Statistical Tests

    PubMed Central

    Lee, Seunggeung; Abecasis, Gonçalo R.; Boehnke, Michael; Lin, Xihong

    2014-01-01

    Despite the extensive discovery of trait- and disease-associated common variants, much of the genetic contribution to complex traits remains unexplained. Rare variants can explain additional disease risk or trait variability. An increasing number of studies are underway to identify trait- and disease-associated rare variants. In this review, we provide an overview of statistical issues in rare-variant association studies with a focus on study designs and statistical tests. We present the design and analysis pipeline of rare-variant studies and review cost-effective sequencing designs and genotyping platforms. We compare various gene- or region-based association tests, including burden tests, variance-component tests, and combined omnibus tests, in terms of their assumptions and performance. Also discussed are the related topics of meta-analysis, population-stratification adjustment, genotype imputation, follow-up studies, and heritability due to rare variants. We provide guidelines for analysis and discuss some of the challenges inherent in these studies and future research directions. PMID:24995866

  10. Testing Genetic Pleiotropy with GWAS Summary Statistics for Marginal and Conditional Analyses.

    PubMed

    Deng, Yangqing; Pan, Wei

    2017-12-01

    There is growing interest in testing genetic pleiotropy, which is when a single genetic variant influences multiple traits. Several methods have been proposed; however, these methods have some limitations. First, all the proposed methods are based on the use of individual-level genotype and phenotype data; in contrast, for logistical, and other, reasons, summary statistics of univariate SNP-trait associations are typically only available based on meta- or mega-analyzed large genome-wide association study (GWAS) data. Second, existing tests are based on marginal pleiotropy, which cannot distinguish between direct and indirect associations of a single genetic variant with multiple traits due to correlations among the traits. Hence, it is useful to consider conditional analysis, in which a subset of traits is adjusted for another subset of traits. For example, in spite of substantial lowering of low-density lipoprotein cholesterol (LDL) with statin therapy, some patients still maintain high residual cardiovascular risk, and, for these patients, it might be helpful to reduce their triglyceride (TG) level. For this purpose, in order to identify new therapeutic targets, it would be useful to identify genetic variants with pleiotropic effects on LDL and TG after adjusting the latter for LDL; otherwise, a pleiotropic effect of a genetic variant detected by a marginal model could simply be due to its association with LDL only, given the well-known correlation between the two types of lipids. Here, we develop a new pleiotropy testing procedure based only on GWAS summary statistics that can be applied for both marginal analysis and conditional analysis. Although the main technical development is based on published union-intersection testing methods, care is needed in specifying conditional models to avoid invalid statistical estimation and inference. In addition to the previously used likelihood ratio test, we also propose using generalized estimating equations under the working independence model for robust inference. We provide numerical examples based on both simulated and real data, including two large lipid GWAS summary association datasets based on ∼100,000 and ∼189,000 samples, respectively, to demonstrate the difference between marginal and conditional analyses, as well as the effectiveness of our new approach. Copyright © 2017 by the Genetics Society of America.

  11. The admixture maximum likelihood test to test for association between rare variants and disease phenotypes.

    PubMed

    Tyrer, Jonathan P; Guo, Qi; Easton, Douglas F; Pharoah, Paul D P

    2013-06-06

    The development of genotyping arrays containing hundreds of thousands of rare variants across the genome and advances in high-throughput sequencing technologies have made feasible empirical genetic association studies to search for rare disease susceptibility alleles. As single variant testing is underpowered to detect associations, the development of statistical methods to combine analysis across variants - so-called "burden tests" - is an area of active research interest. We previously developed a method, the admixture maximum likelihood test, to test multiple, common variants for association with a trait of interest. We have extended this method, called the rare admixture maximum likelihood test (RAML), for the analysis of rare variants. In this paper we compare the performance of RAML with six other burden tests designed to test for association of rare variants. We used simulation testing over a range of scenarios to test the power of RAML compared to the other rare variant association testing methods. These scenarios modelled differences in effect variability, the average direction of effect and the proportion of associated variants. We evaluated the power for all the different scenarios. RAML tended to have the greatest power for most scenarios where the proportion of associated variants was small, whereas SKAT-O performed a little better for the scenarios with a higher proportion of associated variants. The RAML method makes no assumptions about the proportion of variants that are associated with the phenotype of interest or the magnitude and direction of their effect. The method is flexible and can be applied to both dichotomous and quantitative traits and allows for the inclusion of covariates in the underlying regression model. The RAML method performed well compared to the other methods over a wide range of scenarios. Generally power was moderate in most of the scenarios, underscoring the need for large sample sizes in any form of association testing.

  12. Are Interactions between cis-Regulatory Variants Evidence for Biological Epistasis or Statistical Artifacts?

    PubMed

    Fish, Alexandra E; Capra, John A; Bush, William S

    2016-10-06

    The importance of epistasis (or statistical interactions between genetic variants) to the development of complex disease in humans has been controversial. Genome-wide association studies of statistical interactions influencing human traits have recently become computationally feasible and have identified many putative interactions. However, statistical models used to detect interactions can be confounded, which makes it difficult to be certain that observed statistical interactions are evidence for true molecular epistasis. In this study, we investigate whether there is evidence for epistatic interactions between genetic variants within the cis-regulatory region that influence gene expression after accounting for technical, statistical, and biological confounding factors. We identified 1,119 (FDR = 5%) interactions that appear to regulate gene expression in human lymphoblastoid cell lines, a tightly controlled, largely genetically determined phenotype. Many of these interactions replicated in an independent dataset (90 of 803 tested, Bonferroni threshold). We then performed an exhaustive analysis of both known and novel confounders, including ceiling/floor effects, missing genotype combinations, haplotype effects, single variants tagged through linkage disequilibrium, and population stratification. Every interaction could be explained by at least one of these confounders, and replication in independent datasets did not protect against some confounders. Assuming that the confounding factors provide a more parsimonious explanation for each interaction, we find it unlikely that cis-regulatory interactions contribute strongly to human gene expression, which calls into question the relevance of cis-regulatory interactions for other human phenotypes. We additionally propose several best practices for epistasis testing to protect future studies from confounding. Copyright © 2016 American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.

  13. Further Evidence of the Association of the Diacylglycerol Kinase Kappa (DGKK) Gene With Hypospadias.

    PubMed

    Hozyasz, Kamil Konrad; Mostowska, Adrianna; Kowal, Andrzej; Mydlak, Dariusz; Tsibulski, Alexander; Jagodzinski, Pawel P

    2018-02-18

    Hypospadias is a common developmental anomaly of the male external genitalia. In previous studies conducted on West European, Californian, and Han Chinese populations, a relationship between polymorphic variants of the diacylglycerol kinase kappa (DGKK) gene and hypospadias has been reported. The aim was to study the possible associations between polymorphic variants of the DGKK gene and hypospadias using an independent sample of the Polish population. Ten single nucleotide polymorphisms in DGKK, which were reported to have an impact on the risk of hypospadias in other populations, were genotyped using high-resolution melting curve analysis in a group of 166 boys with isolated anterior (66%) and middle (34%) forms of hypospadias and 285 properly matched controls without congenital anomalies. Two DGKK variants, rs11091748 and rs12171755, were associated with increased risk of hypospadias in the Polish population. These results were statistically significant, even after applying the Bonferroni correction for multiple comparisons (P < .005). All the tested nucleotide variants were involved in haplotype combinations associated with hypospadias. The global p-values for haplotypes comprising rs4143304-rs11091748, rs11091748-rs17328236, rs1934179-rs4554617, rs1934183-rs1934179-rs4554617 and rs12171755-rs1934183-rs1934179-rs4554617 were statistically significant, even after the permutation test correction. Our study provides strong evidence of an association between DGKK nucleotide variants, haplotypes and hypospadias susceptibility.

  14. Methods for meta-analysis of multiple traits using GWAS summary statistics.

    PubMed

    Ray, Debashree; Boehnke, Michael

    2018-03-01

    Genome-wide association studies (GWAS) for complex diseases have focused primarily on single-trait analyses for disease status and disease-related quantitative traits. For example, GWAS on risk factors for coronary artery disease analyze genetic associations of plasma lipids such as total cholesterol, LDL-cholesterol, HDL-cholesterol, and triglycerides (TGs) separately. However, traits are often correlated and a joint analysis may yield increased statistical power for association over multiple univariate analyses. Recently several multivariate methods have been proposed that require individual-level data. Here, we develop metaUSAT (where USAT is unified score-based association test), a novel unified association test of a single genetic variant with multiple traits that uses only summary statistics from existing GWAS. Although the existing methods either perform well when most correlated traits are affected by the genetic variant in the same direction or are powerful when only a few of the correlated traits are associated, metaUSAT is designed to be robust to the association structure of correlated traits. metaUSAT does not require individual-level data and can test genetic associations of categorical and/or continuous traits. One can also use metaUSAT to analyze a single trait over multiple studies, appropriately accounting for overlapping samples, if any. metaUSAT provides an approximate asymptotic P-value for association and is computationally efficient for implementation at a genome-wide level. Simulation experiments show that metaUSAT maintains proper type-I error at low error levels. It has similar and sometimes greater power to detect association across a wide array of scenarios compared to existing methods, which are usually powerful for some specific association scenarios only. When applied to plasma lipids summary data from the METSIM and the T2D-GENES studies, metaUSAT detected genome-wide significant loci beyond the ones identified by univariate analyses. Evidence from larger studies suggests that the variants additionally detected by our test are, indeed, associated with lipid levels in humans. In summary, metaUSAT can provide novel insights into the genetic architecture of a common disease or traits. © 2017 WILEY PERIODICALS, INC.

  15. Gene- and pathway-based association tests for multiple traits with GWAS summary statistics.

    PubMed

    Kwak, Il-Youp; Pan, Wei

    2017-01-01

    To identify novel genetic variants associated with complex traits and to shed new insights on underlying biology, in addition to the most popular single SNP-single trait association analysis, it would be useful to explore multiple correlated (intermediate) traits at the gene- or pathway-level by mining existing single GWAS or meta-analyzed GWAS data. For this purpose, we present an adaptive gene-based test and a pathway-based test for association analysis of multiple traits with GWAS summary statistics. The proposed tests are adaptive at both the SNP- and trait-levels; that is, they account for possibly varying association patterns (e.g. signal sparsity levels) across SNPs and traits, thus maintaining high power across a wide range of situations. Furthermore, the proposed methods are general: they can be applied to mixed types of traits, and to Z-statistics or P-values as summary statistics obtained from either a single GWAS or a meta-analysis of multiple GWAS. Our numerical studies with simulated and real data demonstrated the promising performance of the proposed methods. The methods are implemented in R package aSPU, freely and publicly available at: https://cran.r-project.org/web/packages/aSPU/ Contact: weip@biostat.umn.edu. Supplementary information: Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  16. Testing cross-phenotype effects of rare variants in longitudinal studies of complex traits.

    PubMed

    Rudra, Pratyaydipta; Broadaway, K Alaine; Ware, Erin B; Jhun, Min A; Bielak, Lawrence F; Zhao, Wei; Smith, Jennifer A; Peyser, Patricia A; Kardia, Sharon L R; Epstein, Michael P; Ghosh, Debashis

    2018-06-01

    Many gene mapping studies of complex traits have identified genes or variants that influence multiple phenotypes. With the advent of next-generation sequencing technology, there has been substantial interest in identifying rare variants in genes that possess cross-phenotype effects. In the presence of such effects, modeling both the phenotypes and rare variants collectively using multivariate models can achieve higher statistical power compared to univariate methods that either model each phenotype separately or perform separate tests for each variant. Several studies collect phenotypic data over time and using such longitudinal data can further increase the power to detect genetic associations. Although rare-variant approaches exist for testing cross-phenotype effects at a single time point, there is no analogous method for performing such analyses using longitudinal outcomes. In order to fill this important gap, we propose an extension of Gene Association with Multiple Traits (GAMuT) test, a method for cross-phenotype analysis of rare variants using a framework based on the distance covariance. The approach allows for both binary and continuous phenotypes and can also adjust for covariates. Our simple adjustment to the GAMuT test allows it to handle longitudinal data and to gain power by exploiting temporal correlation. The approach is computationally efficient and applicable on a genome-wide scale due to the use of a closed-form test whose significance can be evaluated analytically. We use simulated data to demonstrate that our method has favorable power over competing approaches and also apply our approach to exome chip data from the Genetic Epidemiology Network of Arteriopathy. © 2018 WILEY PERIODICALS, INC.

  17. Gene-Based Association Analysis for Censored Traits Via Fixed Effect Functional Regressions.

    PubMed

    Fan, Ruzong; Wang, Yifan; Yan, Qi; Ding, Ying; Weeks, Daniel E; Lu, Zhaohui; Ren, Haobo; Cook, Richard J; Xiong, Momiao; Swaroop, Anand; Chew, Emily Y; Chen, Wei

    2016-02-01

    Genetic studies of survival outcomes have been proposed and conducted recently, but statistical methods for identifying genetic variants that affect disease progression are rarely developed. Motivated by our ongoing real studies, here we develop Cox proportional hazard models using functional regression (FR) to perform gene-based association analysis of survival traits while adjusting for covariates. The proposed Cox models are fixed effect models where the genetic effects of multiple genetic variants are assumed to be fixed. We introduce likelihood ratio test (LRT) statistics to test for associations between the survival traits and multiple genetic variants in a genetic region. Extensive simulation studies demonstrate that the proposed Cox FR LRT statistics have well-controlled type I error rates. To evaluate power, we compare the Cox FR LRT with the previously developed burden test (BT) in a Cox model and sequence kernel association test (SKAT), which is based on mixed effect Cox models. The Cox FR LRT statistics have higher power than or similar power as Cox SKAT LRT except when 50%/50% causal variants had negative/positive effects and all causal variants are rare. In addition, the Cox FR LRT statistics have higher power than Cox BT LRT. The models and related test statistics can be useful in the whole genome and whole exome association studies. An age-related macular degeneration dataset was analyzed as an example. © 2016 WILEY PERIODICALS, INC.
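
    As a generic illustration of a likelihood ratio test (not the authors' Cox FR implementation), the sketch below computes the statistic 2 × (log-likelihood of the full model − log-likelihood of the null model) and refers it to a chi-square distribution with degrees of freedom equal to the number of extra parameters. The chi-square reference is the usual asymptotic result for nested fixed-effect models and is an assumption here, since the abstract does not spell it out. The log-likelihood values and function names are hypothetical.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if df < 1:
        raise ValueError("df must be a positive integer")
    if x <= 0.0:
        return 1.0
    z = x / 2.0
    # Start from a closed form (df = 2 or df = 1), then apply the recurrence
    # Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1) up to a = df / 2.
    if df % 2 == 0:
        a, q = 1.0, math.exp(-z)
    else:
        a, q = 0.5, math.erfc(math.sqrt(z))
    while a < df / 2.0:
        q += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return q


def likelihood_ratio_test(loglik_null: float, loglik_full: float, extra_params: int):
    """Return (LRT statistic, asymptotic chi-square p-value) for nested models."""
    statistic = 2.0 * (loglik_full - loglik_null)
    return statistic, chi2_sf(statistic, extra_params)


# Hypothetical log-likelihoods: the full model adds 3 genetic-effect parameters.
statistic, p_value = likelihood_ratio_test(-1234.5, -1229.0, 3)
```

    The closed-form tail function keeps the example free of third-party dependencies; in practice a statistics library would normally supply it.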

  18. Gene-based Association Analysis for Censored Traits Via Fixed Effect Functional Regressions

    PubMed Central

    Fan, Ruzong; Wang, Yifan; Yan, Qi; Ding, Ying; Weeks, Daniel E.; Lu, Zhaohui; Ren, Haobo; Cook, Richard J; Xiong, Momiao; Swaroop, Anand; Chew, Emily Y.; Chen, Wei

    2015-01-01

    Genetic studies of survival outcomes have been proposed and conducted recently, but statistical methods for identifying genetic variants that affect disease progression are rarely developed. Motivated by our ongoing real studies, we develop here Cox proportional hazard models using functional regression (FR) to perform gene-based association analysis of survival traits while adjusting for covariates. The proposed Cox models are fixed effect models where the genetic effects of multiple genetic variants are assumed to be fixed. We introduce likelihood ratio test (LRT) statistics to test for associations between the survival traits and multiple genetic variants in a genetic region. Extensive simulation studies demonstrate that the proposed Cox FR LRT statistics have well-controlled type I error rates. To evaluate power, we compare the Cox FR LRT with the previously developed burden test (BT) in a Cox model and sequence kernel association test (SKAT) which is based on mixed effect Cox models. The Cox FR LRT statistics have higher power than or similar power as Cox SKAT LRT except when 50%/50% causal variants had negative/positive effects and all causal variants are rare. In addition, the Cox FR LRT statistics have higher power than Cox BT LRT. The models and related test statistics can be useful in the whole genome and whole exome association studies. An age-related macular degeneration dataset was analyzed as an example. PMID:26782979

  19. Functional Regression Models for Epistasis Analysis of Multiple Quantitative Traits.

    PubMed

    Zhang, Futao; Xie, Dan; Liang, Meimei; Xiong, Momiao

    2016-04-01

    To date, most genetic analyses of phenotypes have focused on analyzing single traits or analyzing each phenotype independently. However, joint epistasis analysis of multiple complementary traits will increase statistical power and improve our understanding of the complicated genetic structure of the complex diseases. Despite their importance in uncovering the genetic structure of complex traits, the statistical methods for identifying epistasis in multiple phenotypes remain fundamentally unexplored. To fill this gap, we formulate a test for interaction between two genes in multiple quantitative trait analysis as a multiple functional regression (MFRG) in which the genotype functions (genetic variant profiles) are defined as a function of the genomic position of the genetic variants. We use large-scale simulations to calculate Type I error rates for testing interaction between two genes with multiple phenotypes and to compare the power with multivariate pairwise interaction analysis and single trait interaction analysis by a single variate functional regression model. To further evaluate performance, the MFRG for epistasis analysis is applied to five phenotypes of exome sequence data from the NHLBI's Exome Sequencing Project (ESP) to detect pleiotropic epistasis. A total of 267 pairs of genes that formed a genetic interaction network showed significant evidence of epistasis influencing five traits. The results demonstrate that the joint interaction analysis of multiple phenotypes has a much higher power to detect interaction than the interaction analysis of a single trait and may open a new direction to fully uncovering the genetic structure of multiple phenotypes.

  20. A Statistical Approach for Testing Cross-Phenotype Effects of Rare Variants

    PubMed Central

    Broadaway, K. Alaine; Cutler, David J.; Duncan, Richard; Moore, Jacob L.; Ware, Erin B.; Jhun, Min A.; Bielak, Lawrence F.; Zhao, Wei; Smith, Jennifer A.; Peyser, Patricia A.; Kardia, Sharon L.R.; Ghosh, Debashis; Epstein, Michael P.

    2016-01-01

    Increasing empirical evidence suggests that many genetic variants influence multiple distinct phenotypes. When cross-phenotype effects exist, multivariate association methods that consider pleiotropy are often more powerful than univariate methods that model each phenotype separately. Although several statistical approaches exist for testing cross-phenotype effects for common variants, there is a lack of similar tests for gene-based analysis of rare variants. In order to fill this important gap, we introduce a statistical method for cross-phenotype analysis of rare variants using a nonparametric distance-covariance approach that compares similarity in multivariate phenotypes to similarity in rare-variant genotypes across a gene. The approach can accommodate both binary and continuous phenotypes and further can adjust for covariates. Our approach yields a closed-form test whose significance can be evaluated analytically, thereby improving computational efficiency and permitting application on a genome-wide scale. We use simulated data to demonstrate that our method, which we refer to as the Gene Association with Multiple Traits (GAMuT) test, provides increased power over competing approaches. We also illustrate our approach using exome-chip data from the Genetic Epidemiology Network of Arteriopathy. PMID:26942286

  1. Filtering genetic variants and placing informative priors based on putative biological function.

    PubMed

    Friedrichs, Stefanie; Malzahn, Dörthe; Pugh, Elizabeth W; Almeida, Marcio; Liu, Xiao Qing; Bailey, Julia N

    2016-02-03

    High-density genetic marker data, especially sequence data, imply an immense multiple testing burden. This can be ameliorated by filtering genetic variants, exploiting or accounting for correlations between variants, jointly testing variants, and by incorporating informative priors. Priors can be based on biological knowledge or predicted variant function, or even be used to integrate gene expression or other omics data. Based on Genetic Analysis Workshop (GAW) 19 data, this article discusses diversity and usefulness of functional variant scores provided, for example, by PolyPhen2, SIFT, or RegulomeDB annotations. Incorporating functional scores into variant filters or weights and adjusting the significance level for correlations between variants yielded significant associations with blood pressure traits in a large family study of Mexican Americans (GAW19 data set). Marker rs218966 in gene PHF14 and rs9836027 in MAP4 significantly associated with hypertension; additionally, rare variants in SNUPN significantly associated with systolic blood pressure. Variant weights strongly influenced the power of kernel methods and burden tests. Apart from variant weights in test statistics, prior weights may also be used when combining test statistics or to informatively weight p values while controlling false discovery rate (FDR). Indeed, power improved when gene expression data for FDR-controlled informative weighting of association test p values of genes was used. Finally, approaches exploiting variant correlations included identity-by-descent mapping and the optimal strategy for joint testing rare and common variants, which was observed to depend on linkage disequilibrium structure.

  2. Common variants in the obesity-associated genes FTO and MC4R are not associated with risk of colorectal cancer

    PubMed Central

    Yang, Baiyu; Thrift, Aaron P.; Figueiredo, Jane C.; Jenkins, Mark A.; Schumacher, Fredrick R.; Conti, David V.; Lin, Yi; Win, Aung Ko; Limburg, Paul J.; Berndt, Sonja I.; Brenner, Hermann; Chan, Andrew T.; Chang-Claude, Jenny; Hoffmeister, Michael; Hudson, Thomas J.; Marchand, Loïc Le; Newcomb, Polly A.; Slattery, Martha L.; White, Emily; Peters, Ulrike; Casey, Graham; Campbell, Peter T.

    2016-01-01

    Background: Obesity is a convincing risk factor for colorectal cancer. Genetic variants in or near FTO and MC4R are consistently associated with body mass index and other body size measures, but whether they are also associated with colorectal cancer risk is unclear. Methods: In the discovery stage, we tested associations of 677 FTO and 323 MC4R single nucleotide polymorphisms (SNPs) 100kb upstream and 300kb downstream from each respective locus with risk of colorectal cancer in data from the Colon Cancer Family Registry (CCFR: 1,960 cases; 1,777 controls). Next, all SNPs that were nominally statistically significant (p<0.05) in the discovery stage were included in replication analyses in data from the Genetics and Epidemiology of Colorectal Cancer Consortium (GECCO: 9,716 cases; 9,844 controls). Results: In the discovery stage, 43 FTO variants and 18 MC4R variants were associated with colorectal cancer risk (p<0.05). No SNPs remained statistically significant in the replication analysis after accounting for multiple comparisons. Conclusion: We found no evidence that individual variants in or near the obesity-related genes FTO and MC4R are associated with risk of colorectal cancer. PMID:27449576

  3. Common variants in the obesity-associated genes FTO and MC4R are not associated with risk of colorectal cancer.

    PubMed

    Yang, Baiyu; Thrift, Aaron P; Figueiredo, Jane C; Jenkins, Mark A; Schumacher, Fredrick R; Conti, David V; Lin, Yi; Win, Aung Ko; Limburg, Paul J; Berndt, Sonja I; Brenner, Hermann; Chan, Andrew T; Chang-Claude, Jenny; Hoffmeister, Michael; Hudson, Thomas J; Marchand, Loïc Le; Newcomb, Polly A; Slattery, Martha L; White, Emily; Peters, Ulrike; Casey, Graham; Campbell, Peter T

    2016-10-01

    Obesity is a convincing risk factor for colorectal cancer. Genetic variants in or near FTO and MC4R are consistently associated with body mass index and other body size measures, but whether they are also associated with colorectal cancer risk is unclear. In the discovery stage, we tested associations of 677 FTO and 323 MC4R single nucleotide polymorphisms (SNPs) 100kb upstream and 300kb downstream from each respective locus with risk of colorectal cancer in data from the Colon Cancer Family Registry (CCFR: 1960 cases; 1777 controls). Next, all SNPs that were nominally statistically significant (p<0.05) in the discovery stage were included in replication analyses in data from the Genetics and Epidemiology of Colorectal Cancer Consortium (GECCO: 9716 cases; 9844 controls). In the discovery stage, 43 FTO variants and 18 MC4R variants were associated with colorectal cancer risk (p<0.05). No SNPs remained statistically significant in the replication analysis after accounting for multiple comparisons. We found no evidence that individual variants in or near the obesity-related genes FTO and MC4R are associated with risk of colorectal cancer. Copyright © 2016 Elsevier Ltd. All rights reserved.

  4. A statistical framework for applying RNA profiling to chemical hazard detection.

    PubMed

    Kostich, Mitchell S

    2017-12-01

    Use of 'omics technologies in environmental science is expanding. However, application is mostly restricted to characterizing molecular steps leading from toxicant interaction with molecular receptors to apical endpoints in laboratory species. Use in environmental decision-making is limited, due to difficulty in elucidating mechanisms in sufficient detail to make quantitative outcome predictions in any single species or in extending predictions to aquatic communities. Here we introduce a mechanism-agnostic statistical approach, supplementing mechanistic investigation by allowing probabilistic outcome prediction even when understanding of molecular pathways is limited, and facilitating extrapolation from results in laboratory test species to predictions about aquatic communities. We use concepts familiar to environmental managers, supplemented with techniques employed for clinical interpretation of 'omics-based biomedical tests. We describe the framework in step-wise fashion, beginning with single test replicates of a single RNA variant, then extending to multi-gene RNA profiling, collections of test replicates, and integration of complementary data. In order to simplify the presentation, we focus on using RNA profiling for distinguishing presence versus absence of chemical hazards, but the principles discussed can be extended to other types of 'omics measurements, multi-class problems, and regression. We include a supplemental file demonstrating many of the concepts using the open source R statistical package. Published by Elsevier Ltd.

  5. A Unified Mixed-Effects Model for Rare-Variant Association in Sequencing Studies

    PubMed Central

    Sun, Jianping; Zheng, Yingye; Hsu, Li

    2013-01-01

    For rare-variant association analysis, due to the extremely low frequencies of these variants, it is necessary to aggregate them by a prior set (e.g., genes and pathways) in order to achieve adequate power. In this paper, we consider hierarchical models to relate a set of rare variants to phenotype by modeling the effects of variants as a function of variant characteristics while allowing for variant-specific effect (heterogeneity). We derive a set of two score statistics, testing the group effect by variant characteristics and the heterogeneity effect. We make a novel modification to these score statistics so that they are independent under the null hypothesis and their asymptotic distributions can be derived. As a result, the computational burden is greatly reduced compared with permutation-based tests. Our approach provides a general testing framework for rare variants association, which includes many commonly used tests, such as the burden test [Li and Leal, 2008] and the sequence kernel association test [Wu et al., 2011], as special cases. Furthermore, in contrast to these tests, our proposed test has an added capacity to identify which components of variant characteristics and heterogeneity contribute to the association. Simulations under a wide range of scenarios show that the proposed test is valid, robust and powerful. An application to the Dallas Heart Study illustrates that apart from identifying genes with significant associations, the new method also provides additional information regarding the source of the association. Such information may be useful for generating hypotheses in future studies. PMID:23483651
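
    The two special cases named in this abstract can both be written in terms of per-variant score statistics. The sketch below uses the common textbook forms for a continuous trait with an intercept-only null model: U_j = Σ_i g_ij (y_i − ȳ), burden statistic (Σ_j w_j U_j)², and SKAT statistic Σ_j w_j² U_j². It is not the hierarchical model proposed in the paper; covariates, variance estimates and p-value calculation are omitted, and the toy data and function names are hypothetical.

```python
def score_statistics(genotypes, phenotypes):
    """Per-variant scores U_j = sum_i g_ij * (y_i - mean(y)).

    genotypes: list of rows (one per individual) of allele counts per variant.
    phenotypes: list of continuous trait values, one per individual.
    """
    n = len(phenotypes)
    mean_y = sum(phenotypes) / n
    residuals = [y - mean_y for y in phenotypes]
    n_variants = len(genotypes[0])
    return [
        sum(genotypes[i][j] * residuals[i] for i in range(n))
        for j in range(n_variants)
    ]


def burden_statistic(scores, weights):
    """Collapse first, then square: (sum_j w_j * U_j) ** 2."""
    return sum(w * u for w, u in zip(weights, scores)) ** 2


def skat_statistic(scores, weights):
    """Square first, then sum: sum_j (w_j * U_j) ** 2."""
    return sum((w * u) ** 2 for w, u in zip(weights, scores))


# Toy data: two rare variants with effects in opposite directions.
genotypes = [[1, 0], [0, 1], [0, 0], [0, 0]]
phenotypes = [2.0, -2.0, 0.0, 0.0]
weights = [1.0, 1.0]

scores = score_statistics(genotypes, phenotypes)
burden = burden_statistic(scores, weights)
skat = skat_statistic(scores, weights)
```

    In this toy example the opposite-signed scores cancel in the burden statistic but not in the SKAT statistic, which is the usual motivation for variance-component tests when effect directions are mixed.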

  6. Pathway analysis with next-generation sequencing data.

    PubMed

    Zhao, Jinying; Zhu, Yun; Boerwinkle, Eric; Xiong, Momiao

    2015-04-01

    Although pathway analysis methods have been developed and successfully applied to association studies of common variants, the statistical methods for pathway-based association analysis of rare variants have not been well developed. Many investigators observed highly inflated false-positive rates and low power in pathway-based tests of association of rare variants. The inflated false-positive rates and low true-positive rates of the current methods are mainly due to their lack of ability to account for gametic phase disequilibrium. To overcome these serious limitations, we develop a novel statistic that is based on the smoothed functional principal component analysis (SFPCA) for pathway association tests with next-generation sequencing data. The developed statistic has the ability to capture position-level variant information and account for gametic phase disequilibrium. By intensive simulations, we demonstrate that the SFPCA-based statistic for testing pathway association with either rare or common or both rare and common variants has the correct type 1 error rates. The power of the SFPCA-based statistic and of 22 additional existing statistics is also evaluated. We found that the SFPCA-based statistic has a much higher power than other existing statistics in all the scenarios considered. To further evaluate its performance, the SFPCA-based statistic is applied to pathway analysis of exome sequencing data in the early-onset myocardial infarction (EOMI) project. We identify three pathways significantly associated with EOMI after the Bonferroni correction. In addition, our preliminary results show that the SFPCA-based statistic has much smaller P-values to identify pathway association than other existing methods.

  7. Improving coeliac disease risk prediction by testing non-HLA variants additional to HLA variants.

    PubMed

    Romanos, Jihane; Rosén, Anna; Kumar, Vinod; Trynka, Gosia; Franke, Lude; Szperl, Agata; Gutierrez-Achury, Javier; van Diemen, Cleo C; Kanninga, Roan; Jankipersadsing, Soesma A; Steck, Andrea; Eisenbarth, Georges; van Heel, David A; Cukrowska, Bozena; Bruno, Valentina; Mazzilli, Maria Cristina; Núñez, Concepcion; Bilbao, Jose Ramon; Mearin, M Luisa; Barisani, Donatella; Rewers, Marian; Norris, Jill M; Ivarsson, Anneli; Boezen, H Marieke; Liu, Edwin; Wijmenga, Cisca

    2014-03-01

    The majority of coeliac disease (CD) patients are not being properly diagnosed and therefore remain untreated, leading to a greater risk of developing CD-associated complications. The major genetic risk heterodimer, HLA-DQ2 and DQ8, is already used clinically to help exclude disease. However, approximately 40% of the population carry these alleles and the majority never develop CD. We explored whether CD risk prediction can be improved by adding non-HLA-susceptible variants to common HLA testing. We developed an average weighted genetic risk score with 10, 26 and 57 single nucleotide polymorphisms (SNP) in 2675 cases and 2815 controls and assessed the improvement in risk prediction provided by the non-HLA SNP. Moreover, we assessed the transferability of the genetic risk model with 26 non-HLA variants to a nested case-control population (n=1709) and a prospective cohort (n=1245) and then tested how well this model predicted CD outcome for 985 independent individuals. Adding 57 non-HLA variants to HLA testing showed a statistically significant improvement compared to scores from models based on HLA only, HLA plus 10 SNP and HLA plus 26 SNP. With 57 non-HLA variants, the area under the receiver operator characteristic curve reached 0.854 compared to 0.823 for HLA only, and 11.1% of individuals were reclassified to a more accurate risk group. We show that the risk model with HLA plus 26 SNP is useful in independent populations. Predicting risk with 57 additional non-HLA variants improved the identification of potential CD patients. This demonstrates a possible role for combined HLA and non-HLA genetic testing in diagnostic work for CD.
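
    The abstract does not define its "average weighted genetic risk score", so the following is only one plausible reading: each SNP's risk-allele count is weighted by the log odds ratio of that SNP and the weighted sum is divided by the number of SNPs. The second function estimates the area under the ROC curve as the proportion of case-control pairs in which the case has the higher score (ties count one half). All names and inputs are hypothetical.

```python
import math


def weighted_grs(allele_counts, odds_ratios):
    """Average weighted risk score: mean over SNPs of log(OR_j) * allele count_j."""
    if len(allele_counts) != len(odds_ratios):
        raise ValueError("one odds ratio is needed per SNP")
    weights = [math.log(o) for o in odds_ratios]
    return sum(w * g for w, g in zip(weights, allele_counts)) / len(weights)


def auc(case_scores, control_scores):
    """Probability that a random case scores higher than a random control."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical individual: 2, 1 and 0 risk alleles at three SNPs.
example_score = weighted_grs([2, 1, 0], [1.5, 1.2, 1.1])
# Hypothetical scores for two cases and two controls.
example_auc = auc([3.0, 2.0], [1.0, 2.0])
```

    The pairwise AUC is quadratic in sample size; it is used here for transparency rather than speed.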

  8. Investigation of exomic variants associated with overall survival in ovarian cancer

    PubMed Central

    Ann Chen, Yian; Larson, Melissa C; Fogarty, Zachary C; Earp, Madalene A; Anton-Culver, Hoda; Bandera, Elisa V; Cramer, Daniel; Doherty, Jennifer A; Goodman, Marc T; Gronwald, Jacek; Karlan, Beth Y; Kjaer, Susanne K; Levine, Douglas A; Menon, Usha; Ness, Roberta B; Pearce, Celeste L; Pejovic, Tanja; Rossing, Mary Anne; Wentzensen, Nicolas; Bean, Yukie T; Bisogna, Maria; Brinton, Louise A; Carney, Michael E; Cunningham, Julie M; Cybulski, Cezary; deFazio, Anna; Dicks, Ed M; Edwards, Robert P; Gayther, Simon A; Gentry-Maharaj, Aleksandra; Gore, Martin; Iversen, Edwin S; Jensen, Allan; Johnatty, Sharon E; Lester, Jenny; Lin, Hui-Yi; Lissowska, Jolanta; Lubinski, Jan; Menkiszak, Janusz; Modugno, Francesmary; Moysich, Kirsten B; Orlow, Irene; Pike, Malcolm C; Ramus, Susan J; Song, Honglin; Terry, Kathryn L; Thompson, Pamela J; Tyrer, Jonathan P; van den Berg, David J; Vierkant, Robert A; Vitonis, Allison F; Walsh, Christine; Wilkens, Lynne R; Wu, Anna H; Yang, Hannah; Ziogas, Argyrios; Berchuck, Andrew; Chenevix-Trench, Georgia; Schildkraut, Joellen M; Permuth-Wey, Jennifer; Phelan, Catherine M; Pharoah, Paul D P; Fridley, Brooke L

    2016-01-01

    Background: While numerous susceptibility loci for epithelial ovarian cancer (EOC) have been identified, few associations have been reported with overall survival. In the absence of common prognostic genetic markers, we hypothesize that rare coding variants may be associated with overall EOC survival and assessed their contribution in two exome-based genotyping projects of the Ovarian Cancer Association Consortium (OCAC). Methods: The primary patient set (Set 1) included 14 independent EOC studies (4293 patients) and 227,892 variants, and a secondary patient set (Set 2) included six additional EOC studies (1744 patients) and 114,620 variants. Because power to detect rare variants individually is reduced, gene-level tests were conducted. Sets were analyzed separately at individual variants and by gene, and then combined with meta-analyses (73,203 variants and 13,163 genes overlapped). Results: No individual variant reached genome-wide statistical significance. A SNP previously implicated to be associated with EOC risk and, to a lesser extent, survival, rs8170, showed the strongest evidence of association with survival and similar effect size estimates across sets (Pmeta=1.1E-6, HRSet1=1.17, HRSet2=1.14). Rare variants in ATG2B, an autophagy gene important for apoptosis, were significantly associated with survival after multiple testing correction (Pmeta=1.1E-6; Pcorrected=0.01). Conclusions: Common variant rs8170 and rare variants in ATG2B may be associated with EOC overall survival, although further study is needed. Impact: This study represents the first exome-wide association study of EOC survival to include rare variant analyses, and suggests that complementary single variant and gene-level analyses in large studies are needed to identify rare variants that warrant follow-up study. PMID:26747452
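
    The abstract states that the two patient sets were combined by meta-analysis but does not name the method. A fixed-effect inverse-variance combination of log hazard ratios is one common choice and is sketched below as an assumption, not as the authors' procedure. The hazard ratios are the two reported for rs8170; the standard errors are hypothetical placeholders, so the resulting p-value is not expected to reproduce the reported Pmeta.

```python
import math


def fixed_effect_meta(log_effects, standard_errors):
    """Inverse-variance fixed-effect meta-analysis.

    Returns (pooled log effect, pooled standard error, two-sided normal p-value).
    """
    weights = [1.0 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, log_effects)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled / pooled_se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # equals 2 * (1 - Phi(|z|))
    return pooled, pooled_se, p_value


# Hazard ratios from the abstract; the standard errors are hypothetical.
log_hrs = [math.log(1.17), math.log(1.14)]
hypothetical_ses = [0.04, 0.07]

pooled_log_hr, pooled_se, p_value = fixed_effect_meta(log_hrs, hypothetical_ses)
pooled_hr = math.exp(pooled_log_hr)
```

    Because the pooled estimate is a weighted average on the log scale, the pooled hazard ratio always lies between the two set-specific estimates, whatever the standard errors.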

  9. Hybrid flower pollination algorithm strategies for t-way test suite generation.

    PubMed

    Nasser, Abdullah B; Zamli, Kamal Z; Alsewari, AbdulRahman A; Ahmed, Bestoun S

    2018-01-01

    The application of meta-heuristic algorithms for t-way testing has recently become prevalent. Consequently, many useful meta-heuristic algorithms have been developed on the basis of the implementation of t-way strategies (where t indicates the interaction strength). Mixed results have been reported in the literature to highlight the fact that no single strategy appears to be superior compared with other configurations. The hybridization of two or more algorithms can enhance the overall search capabilities, that is, by compensating the limitation of one algorithm with the strength of others. Thus, hybrid variants of the flower pollination algorithm (FPA) are proposed in the current work. Four hybrid variants of FPA are considered by combining FPA with other algorithmic components. The experimental results demonstrate that FPA hybrids overcome the problems of slow convergence in the original FPA and offer statistically superior performance compared with existing t-way strategies in terms of test suite size.
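
    To make "t-way" concrete: a test suite has t-way coverage when every combination of values for every choice of t parameters appears in at least one test. The sketch below only checks coverage of a given suite; it does not implement the flower pollination algorithm or any of its hybrids. The function name and the three binary parameters are hypothetical.

```python
from itertools import combinations, product


def uncovered_interactions(parameter_values, test_suite, t):
    """Return the set of t-way interactions that no test in the suite covers.

    parameter_values: list of value lists, one per parameter.
    test_suite: list of tuples, one value per parameter.
    Each interaction is (parameter indices, values at those indices).
    """
    required = set()
    for columns in combinations(range(len(parameter_values)), t):
        for values in product(*(parameter_values[c] for c in columns)):
            required.add((columns, values))

    covered = set()
    for test in test_suite:
        for columns in combinations(range(len(test)), t):
            covered.add((columns, tuple(test[c] for c in columns)))

    return required - covered


# Three binary parameters: exhaustive testing needs 8 tests,
# but these 4 tests already cover all 12 pairwise (t = 2) interactions.
parameters = [[0, 1], [0, 1], [0, 1]]
pairwise_suite = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]

missing_full = uncovered_interactions(parameters, pairwise_suite, 2)
missing_after_drop = uncovered_interactions(parameters, pairwise_suite[:3], 2)
```

    A generation strategy would search for the smallest suite for which this function returns an empty set, which is the test suite size criterion mentioned in the abstract.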

  10. Hybrid flower pollination algorithm strategies for t-way test suite generation

    PubMed Central

    Zamli, Kamal Z.; Alsewari, AbdulRahman A.

    2018-01-01

    The application of meta-heuristic algorithms for t-way testing has recently become prevalent. Consequently, many useful meta-heuristic algorithms have been developed on the basis of the implementation of t-way strategies (where t indicates the interaction strength). Mixed results have been reported in the literature to highlight the fact that no single strategy appears to be superior compared with other configurations. The hybridization of two or more algorithms can enhance the overall search capabilities, that is, by compensating the limitation of one algorithm with the strength of others. Thus, hybrid variants of the flower pollination algorithm (FPA) are proposed in the current work. Four hybrid variants of FPA are considered by combining FPA with other algorithmic components. The experimental results demonstrate that FPA hybrids overcome the problems of slow convergence in the original FPA and offer statistically superior performance compared with existing t-way strategies in terms of test suite size. PMID:29718918

  11. Generalized functional linear models for gene-based case-control association studies.

    PubMed

    Fan, Ruzong; Wang, Yifan; Mills, James L; Carter, Tonia C; Lobach, Iryna; Wilson, Alexander F; Bailey-Wilson, Joan E; Weeks, Daniel E; Xiong, Momiao

    2014-11-01

    By using functional data analysis techniques, we developed generalized functional linear models for testing association between a dichotomous trait and multiple genetic variants in a genetic region while adjusting for covariates. Both fixed and mixed effect models are developed and compared. Extensive simulations show that Rao's efficient score tests of the fixed effect models are very conservative since they generate lower type I errors than nominal levels, and global tests of the mixed effect models generate accurate type I errors. Furthermore, we found that the Rao's efficient score test statistics of the fixed effect models have higher power than the sequence kernel association test (SKAT) and its optimal unified version (SKAT-O) in most cases when the causal variants are both rare and common. When the causal variants are all rare (i.e., minor allele frequencies less than 0.03), the Rao's efficient score test statistics and the global tests have similar or slightly lower power than SKAT and SKAT-O. In practice, it is not known whether rare variants or common variants in a gene region are disease related. All we can assume is that a combination of rare and common variants influences disease susceptibility. Thus, the improved performance of our models when the causal variants are both rare and common shows that the proposed models can be very useful in dissecting complex traits. We compare the performance of our methods with SKAT and SKAT-O on real neural tube defects and Hirschsprung's disease datasets. The Rao's efficient score test statistics and the global tests are more sensitive than SKAT and SKAT-O in the real data analysis. Our methods can be used in either gene-disease genome-wide/exome-wide association studies or candidate gene analyses. © 2014 WILEY PERIODICALS, INC.

  12. Generalized Functional Linear Models for Gene-based Case-Control Association Studies

    PubMed Central

    Mills, James L.; Carter, Tonia C.; Lobach, Iryna; Wilson, Alexander F.; Bailey-Wilson, Joan E.; Weeks, Daniel E.; Xiong, Momiao

    2014-01-01

    By using functional data analysis techniques, we developed generalized functional linear models for testing association between a dichotomous trait and multiple genetic variants in a genetic region while adjusting for covariates. Both fixed and mixed effect models are developed and compared. Extensive simulations show that Rao's efficient score tests of the fixed effect models are very conservative since they generate lower type I errors than nominal levels, and global tests of the mixed effect models generate accurate type I errors. Furthermore, we found that the Rao's efficient score test statistics of the fixed effect models have higher power than the sequence kernel association test (SKAT) and its optimal unified version (SKAT-O) in most cases when the causal variants are both rare and common. When the causal variants are all rare (i.e., minor allele frequencies less than 0.03), the Rao's efficient score test statistics and the global tests have similar or slightly lower power than SKAT and SKAT-O. In practice, it is not known whether rare variants or common variants in a gene are disease-related. All we can assume is that a combination of rare and common variants influences disease susceptibility. Thus, the improved performance of our models when the causal variants are both rare and common shows that the proposed models can be very useful in dissecting complex traits. We compare the performance of our methods with SKAT and SKAT-O on real neural tube defects and Hirschsprung's disease data sets. The Rao's efficient score test statistics and the global tests are more sensitive than SKAT and SKAT-O in the real data analysis. Our methods can be used in either gene-disease genome-wide/exome-wide association studies or candidate gene analyses. PMID:25203683

  13. Fine-mapping inflammatory bowel disease loci to single variant resolution

    PubMed Central

    Huang, Hailiang; Fang, Ming; Jostins, Luke; Mirkov, Maša Umićević; Boucher, Gabrielle; Anderson, Carl A; Andersen, Vibeke; Cleynen, Isabelle; Cortes, Adrian; Crins, François; D'Amato, Mauro; Deffontaine, Valérie; Dimitrieva, Julia; Docampo, Elisa; Elansary, Mahmoud; Farh, Kyle Kai-How; Franke, Andre; Gori, Ann-Stephan; Goyette, Philippe; Halfvarson, Jonas; Haritunians, Talin; Knight, Jo; Lawrance, Ian C; Lees, Charlie W; Louis, Edouard; Mariman, Rob; Meuwissen, Theo; Mni, Myriam; Momozawa, Yukihide; Parkes, Miles; Spain, Sarah L; Théâtre, Emilie; Trynka, Gosia; Satsangi, Jack; van Sommeren, Suzanne; Vermeire, Severine; Xavier, Ramnik J; Weersma, Rinse K; Duerr, Richard H; Mathew, Christopher G; Rioux, John D; McGovern, Dermot PB; Cho, Judy H; Georges, Michel; Daly, Mark J; Barrett, Jeffrey C

    2017-01-01

    The inflammatory bowel diseases (IBD) are chronic gastrointestinal inflammatory disorders that affect millions worldwide. Genome-wide association studies have identified 200 IBD-associated loci, but few have been conclusively resolved to specific functional variants. Here we report fine-mapping of 94 IBD loci using high-density genotyping in 67,852 individuals. We pinpointed 18 associations to a single causal variant with >95% certainty, and an additional 27 associations to a single variant with >50% certainty. These 45 variants are significantly enriched for protein-coding changes (n=13), direct disruption of transcription factor binding sites (n=3) and tissue-specific epigenetic marks (n=10), with the latter category showing enrichment in specific immune cells among associations stronger in Crohn's disease (CD) and in gut mucosa among associations stronger in ulcerative colitis (UC). The results of this study suggest that high-resolution fine-mapping in large samples can convert many GWAS discoveries into statistically convincing causal variants, providing a powerful substrate for experimental elucidation of disease mechanisms. PMID:28658209

  14. Detecting epistasis with the marginal epistasis test in genetic mapping studies of quantitative traits

    PubMed Central

    Zeng, Ping; Mukherjee, Sayan; Zhou, Xiang

    2017-01-01

    Epistasis, commonly defined as the interaction between multiple genes, is an important genetic component underlying phenotypic variation. Many statistical methods have been developed to model and identify epistatic interactions between genetic variants. However, because of the large combinatorial search space of interactions, most epistasis mapping methods face enormous computational challenges and often suffer from low statistical power due to multiple test correction. Here, we present a novel, alternative strategy for mapping epistasis: instead of directly identifying individual pairwise or higher-order interactions, we focus on mapping variants that have non-zero marginal epistatic effects—the combined pairwise interaction effects between a given variant and all other variants. By testing marginal epistatic effects, we can identify candidate variants that are involved in epistasis without the need to identify the exact partners with which the variants interact, thus potentially alleviating much of the statistical and computational burden associated with standard epistatic mapping procedures. Our method is based on a variance component model, and relies on a recently developed variance component estimation method for efficient parameter inference and p-value computation. We refer to our method as the “MArginal ePIstasis Test”, or MAPIT. With simulations, we show how MAPIT can be used to estimate and test marginal epistatic effects, produce calibrated test statistics under the null, and facilitate the detection of pairwise epistatic interactions. We further illustrate the benefits of MAPIT in a QTL mapping study by analyzing the gene expression data of over 400 individuals from the GEUVADIS consortium. PMID:28746338

  15. Identification of rare X-linked neuroligin variants by massively parallel sequencing in males with autism spectrum disorder.

    PubMed

    Steinberg, Karyn Meltz; Ramachandran, Dhanya; Patel, Viren C; Shetty, Amol C; Cutler, David J; Zwick, Michael E

    2012-09-28

    Autism spectrum disorder (ASD) is highly heritable, but the genetic risk factors for it remain largely unknown. Although structural variants with large effect sizes may explain up to 15% of ASD, genome-wide association studies have failed to uncover common single nucleotide variants with large effects on phenotype. The focus within ASD genetics is now shifting to the examination of rare sequence variants of modest effect, which is most often achieved via exome selection and sequencing. This strategy has indeed identified some rare candidate variants; however, the approach does not capture the full spectrum of genetic variation that might contribute to the phenotype. We surveyed two loci with known rare variants that contribute to ASD, the X-linked neuroligin genes, by performing massively parallel Illumina sequencing of the coding and noncoding regions from these genes in males from families with multiplex autism. We annotated all variant sites and functionally tested a subset to identify other rare mutations contributing to ASD susceptibility. We found seven rare variants at evolutionarily conserved sites in our study population. Functional analyses of the three 3' UTR variants did not show statistically significant effects on the expression of NLGN3 and NLGN4X. In addition, we identified two NLGN3 intronic variants located within conserved transcription factor binding sites that could potentially affect gene regulation. These data demonstrate the power of massively parallel, targeted sequencing studies of affected individuals for identifying rare, potentially disease-contributing variation. However, they also point out the challenges and limitations of current methods of direct functional testing of rare variants and the difficulties of identifying alleles with modest effects.

  16. Identification of rare X-linked neuroligin variants by massively parallel sequencing in males with autism spectrum disorder

    PubMed Central

    2012-01-01

    Background: Autism spectrum disorder (ASD) is highly heritable, but the genetic risk factors for it remain largely unknown. Although structural variants with large effect sizes may explain up to 15% of ASD, genome-wide association studies have failed to uncover common single nucleotide variants with large effects on phenotype. The focus within ASD genetics is now shifting to the examination of rare sequence variants of modest effect, which is most often achieved via exome selection and sequencing. This strategy has indeed identified some rare candidate variants; however, the approach does not capture the full spectrum of genetic variation that might contribute to the phenotype. Methods: We surveyed two loci with known rare variants that contribute to ASD, the X-linked neuroligin genes, by performing massively parallel Illumina sequencing of the coding and noncoding regions from these genes in males from families with multiplex autism. We annotated all variant sites and functionally tested a subset to identify other rare mutations contributing to ASD susceptibility. Results: We found seven rare variants at evolutionarily conserved sites in our study population. Functional analyses of the three 3’ UTR variants did not show statistically significant effects on the expression of NLGN3 and NLGN4X. In addition, we identified two NLGN3 intronic variants located within conserved transcription factor binding sites that could potentially affect gene regulation. Conclusions: These data demonstrate the power of massively parallel, targeted sequencing studies of affected individuals for identifying rare, potentially disease-contributing variation. However, they also point out the challenges and limitations of current methods of direct functional testing of rare variants and the difficulties of identifying alleles with modest effects. PMID:23020841

  17. Multi-variant study of obesity risk genes in African Americans: The Jackson Heart Study.

    PubMed

    Liu, Shijian; Wilson, James G; Jiang, Fan; Griswold, Michael; Correa, Adolfo; Mei, Hao

    2016-11-30

    Genome-wide association studies (GWAS) have been successful in identifying obesity risk genes by single-variant association analysis. In this study, we designed a stepwise analysis strategy to identify multi-variant effects on obesity risk among candidate genes. Our analyses focused on 2137 African American participants with body mass index measured in the Jackson Heart Study and 657 common single nucleotide polymorphisms (SNPs) genotyped at 8 GWAS-identified obesity risk genes. Single-variant association tests showed that no SNPs reached significance after multiple testing adjustment. The subsequent gene-gene interaction analysis, which focused on SNPs with unadjusted p-value<0.10, identified 6 significant multi-variant associations. Logistic regression showed that SNPs in these associations did not have significant linear interactions; examination of genetic risk scores showed that 4 multi-variant associations had significant additive effects of risk SNPs; and haplotype association tests showed that all multi-variant associations contained one or several combinations of particular alleles or haplotypes associated with increased obesity risk. Our study provides evidence that obesity risk genes generate multi-variant effects, which can be additive or non-linear interactions, and that multi-variant study is an important supplement to existing GWAS for understanding genetic effects of obesity risk genes. Copyright © 2016 Elsevier B.V. All rights reserved.

  18. Incorporating gene-environment interaction in testing for association with rare genetic variants.

    PubMed

    Chen, Han; Meigs, James B; Dupuis, Josée

    2014-01-01

    The incorporation of gene-environment interactions could improve the ability to detect genetic associations with complex traits. For common genetic variants, single-marker interaction tests and joint tests of genetic main effects and gene-environment interaction have been well-established and used to identify novel association loci for complex diseases and continuous traits. For rare genetic variants, however, single-marker tests are severely underpowered due to the low minor allele frequency, and only a few gene-environment interaction tests have been developed. We aimed to develop powerful and computationally efficient tests for gene-environment interaction with rare variants. In this paper, we propose interaction and joint tests for testing gene-environment interaction of rare genetic variants. Our approach is a generalization of existing gene-environment interaction tests for multiple genetic variants under certain conditions. We show in our simulation studies that our interaction and joint tests have correct type I errors, and that the joint test is a powerful approach for testing genetic association, allowing for gene-environment interaction. We also illustrate our approach in a real data example from the Framingham Heart Study. Our approach can be applied to both binary and continuous traits and is powerful and computationally efficient.

  19. Compartmentalization of HIV-1 within the female genital tract is due to monotypic and low-diversity variants not distinct viral populations.

    PubMed

    Bull, Marta; Learn, Gerald; Genowati, Indira; McKernan, Jennifer; Hitti, Jane; Lockhart, David; Tapia, Kenneth; Holte, Sarah; Dragavon, Joan; Coombs, Robert; Mullins, James; Frenkel, Lisa

    2009-09-22

    Compartmentalization of HIV-1 between the genital tract and blood was noted in half of 57 women included in 12 studies primarily using cell-free virus. To further understand differences between genital tract and blood viruses of women with chronic HIV-1 infection, cell-free and cell-associated virus populations were sequenced from these tissues, reasoning that integrated viral DNA includes variants archived from earlier in infection, and provides a greater array of genotypes for comparisons. Multiple sequences from single-genome-amplification of HIV-1 RNA and DNA from the genital tract and blood of each woman were compared in a cross-sectional study. Maximum likelihood phylogenies were evaluated for evidence of compartmentalization using four statistical tests. Genital tract and blood HIV-1 appears compartmentalized in 7/13 women by ≥2 statistical analyses. These subjects' phylograms were characterized by low diversity genital-specific viral clades interspersed between clades containing both genital and blood sequences. Many of the genital-specific clades contained monotypic HIV-1 sequences. In 2/7 women, HIV-1 populations were significantly compartmentalized across all four statistical tests; both had low diversity genital tract-only clades. Collapsing monotypic variants into a single sequence diminished the prevalence and extent of compartmentalization. Viral sequences did not demonstrate tissue-specific signature amino acid residues, differential immune selection, or co-receptor usage. In women with chronic HIV-1 infection multiple identical sequences suggest proliferation of HIV-1-infected cells, and low diversity tissue-specific phylogenetic clades are consistent with bursts of viral replication. These monotypic and tissue-specific viruses provide statistical support for compartmentalization of HIV-1 between the female genital tract and blood. However, the intermingling of these clades with clades comprised of both genital and blood sequences and the absence of tissue-specific genetic features suggests compartmentalization between blood and genital tract may be due to viral replication and proliferation of infected cells, and questions whether HIV-1 in the female genital tract is distinct from blood.

  20. Pathway-based discovery of genetic interactions in breast cancer

    PubMed Central

    Xu, Zack Z.; Boone, Charles; Lange, Carol A.

    2017-01-01

    Breast cancer is the second largest cause of cancer death among U.S. women and the leading cause of cancer death among women worldwide. Genome-wide association studies (GWAS) have identified several genetic variants associated with susceptibility to breast cancer, but these still explain less than half of the estimated genetic contribution to the disease. Combinations of variants (i.e. genetic interactions) may play an important role in breast cancer susceptibility. However, due to a lack of statistical power, the current tests for genetic interactions from GWAS data mainly leverage prior knowledge to focus on small sets of genes or SNPs that are known to have an association with breast cancer. Thus, many genetic interactions, particularly among novel variants, remain understudied. Reverse-genetic interaction screens in model organisms have shown that genetic interactions frequently cluster into highly structured motifs, where members of the same pathway share similar patterns of genetic interactions. Based on this key observation, we recently developed a method called BridGE to search for such structured motifs in genetic networks derived from GWAS studies and identify pathway-level genetic interactions in human populations. We applied BridGE to six independent breast cancer cohorts and identified significant pathway-level interactions in five cohorts. Joint analysis across all five cohorts revealed a high confidence consensus set of genetic interactions with support in multiple cohorts. The discovered interactions implicated the glutathione conjugation, vitamin D receptor, purine metabolism, mitotic prometaphase, and steroid hormone biosynthesis pathways as major modifiers of breast cancer risk. Notably, while many of the pathways identified by BridGE show clear relevance to breast cancer, variants in these pathways had not been previously discovered by traditional single variant association tests, or single pathway enrichment analysis that does not consider SNP-SNP interactions. PMID:28957314

  1. Fine-mapping inflammatory bowel disease loci to single-variant resolution.

    PubMed

    Huang, Hailiang; Fang, Ming; Jostins, Luke; Umićević Mirkov, Maša; Boucher, Gabrielle; Anderson, Carl A; Andersen, Vibeke; Cleynen, Isabelle; Cortes, Adrian; Crins, François; D'Amato, Mauro; Deffontaine, Valérie; Dmitrieva, Julia; Docampo, Elisa; Elansary, Mahmoud; Farh, Kyle Kai-How; Franke, Andre; Gori, Ann-Stephan; Goyette, Philippe; Halfvarson, Jonas; Haritunians, Talin; Knight, Jo; Lawrance, Ian C; Lees, Charlie W; Louis, Edouard; Mariman, Rob; Meuwissen, Theo; Mni, Myriam; Momozawa, Yukihide; Parkes, Miles; Spain, Sarah L; Théâtre, Emilie; Trynka, Gosia; Satsangi, Jack; van Sommeren, Suzanne; Vermeire, Severine; Xavier, Ramnik J; Weersma, Rinse K; Duerr, Richard H; Mathew, Christopher G; Rioux, John D; McGovern, Dermot P B; Cho, Judy H; Georges, Michel; Daly, Mark J; Barrett, Jeffrey C

    2017-07-13

    Inflammatory bowel diseases are chronic gastrointestinal inflammatory disorders that affect millions of people worldwide. Genome-wide association studies have identified 200 inflammatory bowel disease-associated loci, but few have been conclusively resolved to specific functional variants. Here we report fine-mapping of 94 inflammatory bowel disease loci using high-density genotyping in 67,852 individuals. We pinpoint 18 associations to a single causal variant with greater than 95% certainty, and an additional 27 associations to a single variant with greater than 50% certainty. These 45 variants are significantly enriched for protein-coding changes (n = 13), direct disruption of transcription-factor binding sites (n = 3), and tissue-specific epigenetic marks (n = 10), with the last category showing enrichment in specific immune cells among associations stronger in Crohn's disease and in gut mucosa among associations stronger in ulcerative colitis. The results of this study suggest that high-resolution fine-mapping in large samples can convert many discoveries from genome-wide association studies into statistically convincing causal variants, providing a powerful substrate for experimental elucidation of disease mechanisms.

  2. Gene Level Meta-Analysis of Quantitative Traits by Functional Linear Models.

    PubMed

    Fan, Ruzong; Wang, Yifan; Boehnke, Michael; Chen, Wei; Li, Yun; Ren, Haobo; Lobach, Iryna; Xiong, Momiao

    2015-08-01

    Meta-analysis of genetic data must account for differences among studies including study designs, markers genotyped, and covariates. The effects of genetic variants may differ from population to population, i.e., heterogeneity. Thus, meta-analysis of combining data of multiple studies is difficult. Novel statistical methods for meta-analysis are needed. In this article, functional linear models are developed for meta-analyses that connect genetic data to quantitative traits, adjusting for covariates. The models can be used to analyze rare variants, common variants, or a combination of the two. Both likelihood-ratio test (LRT) and F-distributed statistics are introduced to test association between quantitative traits and multiple variants in one genetic region. Extensive simulations are performed to evaluate empirical type I error rates and power performance of the proposed tests. The proposed LRT and F-distributed statistics control the type I error very well and have higher power than the existing methods of the meta-analysis sequence kernel association test (MetaSKAT). We analyze four blood lipid levels in data from a meta-analysis of eight European studies. The proposed methods detect more significant associations than MetaSKAT and the P-values of the proposed LRT and F-distributed statistics are usually much smaller than those of MetaSKAT. The functional linear models and related test statistics can be useful in whole-genome and whole-exome association studies. Copyright © 2015 by the Genetics Society of America.

  3. Integrative pathway analysis of a genome-wide association study of V̇o2max response to exercise training

    PubMed Central

    Vivar, Juan C.; Sarzynski, Mark A.; Sung, Yun Ju; Timmons, James A.; Bouchard, Claude; Rankinen, Tuomo

    2013-01-01

    We previously reported the findings from a genome-wide association study of the response of maximal oxygen uptake (V̇o2max) to an exercise program. Here we follow up on these results to generate hypotheses on genes, pathways, and systems involved in the ability to respond to exercise training. A systems biology approach can help us better establish a comprehensive physiological description of what underlies V̇o2max trainability. The primary material for this exploration was the individual single-nucleotide polymorphism (SNP), SNP-gene mapping, and statistical significance levels. We aimed to generate novel hypotheses through analyses that go beyond statistical association of single-locus markers. This was accomplished through three complementary approaches: 1) building de novo evidence of gene candidacy through informatics-driven literature mining; 2) aggregating evidence from statistical associations to link variant enrichment in biological pathways to V̇o2max trainability; and 3) predicting possible consequences of variants residing in the pathways of interest. We started with candidate gene prioritization followed by pathway analysis focused on overrepresentation analysis and gene set enrichment analysis. Subsequently, leads were followed using in silico analysis of predicted SNP functions. Pathways related to cellular energetics (pantothenate and CoA biosynthesis; PPAR signaling) and immune functions (complement and coagulation cascades) had the highest levels of SNP burden. In particular, long-chain fatty acid transport and fatty acid oxidation genes and sequence variants were found to influence differences in V̇o2max trainability. Together, these methods allow for the hypothesis-driven ranking and prioritization of genes and pathways for future experimental testing and validation. PMID:23990238

  4. A Standardized DNA Variant Scoring System for Pathogenicity Assessments in Mendelian Disorders

    PubMed Central

    Karbassi, Izabela; Maston, Glenn A.; Love, Angela; DiVincenzo, Christina; Braastad, Corey D.; Elzinga, Christopher D.; Bright, Alison R.; Previte, Domenic; Zhang, Ke; Rowland, Charles M.; McCarthy, Michele; Lapierre, Jennifer L.; Dubois, Felicita; Medeiros, Katelyn A.; Batish, Sat Dev; Jones, Jeffrey; Liaquat, Khalida; Hoffman, Carol A.; Jaremko, Malgorzata; Wang, Zhenyuan; Sun, Weimin; Buller‐Burckle, Arlene; Strom, Charles M.; Keiles, Steven B.

    2015-01-01

    We developed a rules‐based scoring system to classify DNA variants into five categories including pathogenic, likely pathogenic, variant of uncertain significance (VUS), likely benign, and benign. Over 16,500 pathogenicity assessments on 11,894 variants from 338 genes were analyzed for pathogenicity based on prediction tools, population frequency, co‐occurrence, segregation, and functional studies collected from internal and external sources. Scores were calculated by trained scientists using a quantitative framework that assigned differential weighting to these five types of data. We performed descriptive and comparative statistics on the dataset and tested interobserver concordance among the trained scientists. Private variants defined as variants found within single families (n = 5,182), were either VUS (80.5%; n = 4,169) or likely pathogenic (19.5%; n = 1,013). The remaining variants (n = 6,712) were VUS (38.4%; n = 2,577) or likely benign/benign (34.7%; n = 2,327) or likely pathogenic/pathogenic (26.9%, n = 1,808). Exact agreement between the trained scientists on the final variant score was 98.5% [95% confidence interval (CI) (98.0, 98.9)] with an interobserver consistency of 97% [95% CI (91.5, 99.4)]. Variant scores were stable and showed increasing odds of being in agreement with new data when re‐evaluated periodically. This carefully curated, standardized variant pathogenicity scoring system provides reliable pathogenicity scores for DNA variants encountered in a clinical laboratory setting. PMID:26467025

  5. A Standardized DNA Variant Scoring System for Pathogenicity Assessments in Mendelian Disorders.

    PubMed

    Karbassi, Izabela; Maston, Glenn A; Love, Angela; DiVincenzo, Christina; Braastad, Corey D; Elzinga, Christopher D; Bright, Alison R; Previte, Domenic; Zhang, Ke; Rowland, Charles M; McCarthy, Michele; Lapierre, Jennifer L; Dubois, Felicita; Medeiros, Katelyn A; Batish, Sat Dev; Jones, Jeffrey; Liaquat, Khalida; Hoffman, Carol A; Jaremko, Malgorzata; Wang, Zhenyuan; Sun, Weimin; Buller-Burckle, Arlene; Strom, Charles M; Keiles, Steven B; Higgins, Joseph J

    2016-01-01

    We developed a rules-based scoring system to classify DNA variants into five categories including pathogenic, likely pathogenic, variant of uncertain significance (VUS), likely benign, and benign. Over 16,500 pathogenicity assessments on 11,894 variants from 338 genes were analyzed for pathogenicity based on prediction tools, population frequency, co-occurrence, segregation, and functional studies collected from internal and external sources. Scores were calculated by trained scientists using a quantitative framework that assigned differential weighting to these five types of data. We performed descriptive and comparative statistics on the dataset and tested interobserver concordance among the trained scientists. Private variants defined as variants found within single families (n = 5,182), were either VUS (80.5%; n = 4,169) or likely pathogenic (19.5%; n = 1,013). The remaining variants (n = 6,712) were VUS (38.4%; n = 2,577) or likely benign/benign (34.7%; n = 2,327) or likely pathogenic/pathogenic (26.9%, n = 1,808). Exact agreement between the trained scientists on the final variant score was 98.5% [95% confidence interval (CI) (98.0, 98.9)] with an interobserver consistency of 97% [95% CI (91.5, 99.4)]. Variant scores were stable and showed increasing odds of being in agreement with new data when re-evaluated periodically. This carefully curated, standardized variant pathogenicity scoring system provides reliable pathogenicity scores for DNA variants encountered in a clinical laboratory setting. © 2015 The Authors. Human Mutation published by Wiley Periodicals, Inc.

  6. Comparison of gene-based rare variant association mapping methods for quantitative traits in a bovine population with complex familial relationships.

    PubMed

    Zhang, Qianqian; Guldbrandtsen, Bernt; Calus, Mario P L; Lund, Mogens Sandø; Sahana, Goutam

    2016-08-17

    There is growing interest in the role of rare variants in the variation of complex traits due to increasing evidence that rare variants are associated with quantitative traits. However, association methods that are commonly used for mapping common variants are not effective for mapping rare variants. Besides, livestock populations have large half-sib families and the occurrence of rare variants may be confounded with family structure, which makes it difficult to disentangle their effects from family mean effects. We compared the power of methods that are commonly applied in human genetics to map rare variants in cattle using whole-genome sequence data and simulated phenotypes. We also studied the power of mapping rare variants using linear mixed models (LMM), which are the method of choice to account for both family relationships and population structure in cattle. We observed that the power of the LMM approach was low for mapping a rare variant (defined as those that have frequencies lower than 0.01) with a moderate effect (5 to 8% of phenotypic variance explained by multiple rare variants that vary from 5 to 21 in number) contributing to a QTL with a sample size of 1000. In contrast, across the scenarios studied, statistical methods that are specialized for mapping rare variants increased power regardless of whether multiple rare variants or a single rare variant underlie a QTL. Different methods for combining rare variants in the test single nucleotide polymorphism set resulted in similar power irrespective of the proportion of total genetic variance explained by the QTL. However, when the QTL variance is very small (only 0.1% of the total genetic variance), these specialized methods for mapping rare variants and LMM generally had no power to map the variants within a gene with sample sizes of 1000 or 5000. We observed that the methods that combine multiple rare variants within a gene into a meta-variant generally had greater power to map rare variants compared to LMM. Therefore, it is recommended to use rare variant association mapping methods to map rare genetic variants that affect quantitative traits in livestock, such as bovine populations.
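    The idea of combining multiple rare variants within a gene into a single meta-variant can be illustrated with a minimal burden-style collapse. This is only a sketch under stated assumptions, not any of the specific methods compared in the record above: the function names are hypothetical, genotypes are assumed to be coded as minor-allele dosages (0, 1, or 2), and the 0.01 frequency cutoff is taken from the abstract's definition of a rare variant.

```python
from typing import List


def allele_frequencies(genotypes: List[List[int]]) -> List[float]:
    """Per-variant allele frequency from an individuals-by-variants dosage matrix."""
    n_individuals = len(genotypes)
    n_variants = len(genotypes[0])
    return [
        sum(row[j] for row in genotypes) / (2 * n_individuals)
        for j in range(n_variants)
    ]


def collapse_rare_variants(
    genotypes: List[List[int]], maf_threshold: float = 0.01
) -> List[int]:
    """Sum dosages over rare variants to get one meta-variant score per individual.

    Assumes each dosage counts the minor allele (hypothetical coding choice).
    Variants with frequency 0 or at/above the threshold are ignored.
    """
    freqs = allele_frequencies(genotypes)
    rare = [j for j, f in enumerate(freqs) if 0 < f < maf_threshold]
    return [sum(row[j] for j in rare) for row in genotypes]


# Toy data: 100 individuals, 3 variants in one gene.
# Variants 0 and 1 are singletons (frequency 1/200 = 0.005); variant 2 is common.
genotypes = [[0, 0, 1] for _ in range(100)]
genotypes[0][0] = 1
genotypes[1][1] = 1

burden = collapse_rare_variants(genotypes)
print(burden[:3])
```

    The resulting per-individual score can then be tested against the phenotype as if it were a single variant; real analyses in structured populations would additionally need to account for relatedness, which this sketch does not attempt.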

  7. An analytical framework for whole-genome sequence association studies and its implications for autism spectrum disorder.

    PubMed

    Werling, Donna M; Brand, Harrison; An, Joon-Yong; Stone, Matthew R; Zhu, Lingxue; Glessner, Joseph T; Collins, Ryan L; Dong, Shan; Layer, Ryan M; Markenscoff-Papadimitriou, Eirene; Farrell, Andrew; Schwartz, Grace B; Wang, Harold Z; Currall, Benjamin B; Zhao, Xuefang; Dea, Jeanselle; Duhn, Clif; Erdman, Carolyn A; Gilson, Michael C; Yadav, Rachita; Handsaker, Robert E; Kashin, Seva; Klei, Lambertus; Mandell, Jeffrey D; Nowakowski, Tomasz J; Liu, Yuwen; Pochareddy, Sirisha; Smith, Louw; Walker, Michael F; Waterman, Matthew J; He, Xin; Kriegstein, Arnold R; Rubenstein, John L; Sestan, Nenad; McCarroll, Steven A; Neale, Benjamin M; Coon, Hilary; Willsey, A Jeremy; Buxbaum, Joseph D; Daly, Mark J; State, Matthew W; Quinlan, Aaron R; Marth, Gabor T; Roeder, Kathryn; Devlin, Bernie; Talkowski, Michael E; Sanders, Stephan J

    2018-05-01

    Genomic association studies of common or rare protein-coding variation have established robust statistical approaches to account for multiple testing. Here we present a comparable framework to evaluate rare and de novo noncoding single-nucleotide variants, insertion/deletions, and all classes of structural variation from whole-genome sequencing (WGS). Integrating genomic annotations at the level of nucleotides, genes, and regulatory regions, we define 51,801 annotation categories. Analyses of 519 autism spectrum disorder families did not identify association with any categories after correction for 4,123 effective tests. Without appropriate correction, biologically plausible associations are observed in both cases and controls. Despite excluding previously identified gene-disrupting mutations, coding regions still exhibited the strongest associations. Thus, in autism, the contribution of de novo noncoding variation is probably modest in comparison to that of de novo coding variants. Robust results from future WGS studies will require large cohorts and comprehensive analytical strategies that consider the substantial multiple-testing burden.
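    The multiple-testing burden described above can be made concrete with a small sketch. It assumes a Bonferroni-style correction and a family-wise error rate of 0.05; the abstract reports only the number of effective tests (4,123), so both choices and the function names are illustrative rather than the authors' procedure.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


def passes_correction(p_value: float, alpha: float, n_tests: int) -> bool:
    """True if a single p-value survives correction for n_tests tests."""
    return p_value < bonferroni_threshold(alpha, n_tests)


# 4,123 effective tests is taken from the abstract; alpha = 0.05 is an assumption.
threshold = bonferroni_threshold(0.05, 4123)
print(f"per-test threshold: {threshold:.3e}")
```

    Under these assumptions a category would need a p-value of roughly 1.2e-5 or smaller to be reported as associated.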

  8. IGESS: a statistical approach to integrating individual-level genotype data and summary statistics in genome-wide association studies.

    PubMed

    Dai, Mingwei; Ming, Jingsi; Cai, Mingxuan; Liu, Jin; Yang, Can; Wan, Xiang; Xu, Zongben

    2017-09-15

    Results from genome-wide association studies (GWAS) suggest that a complex phenotype is often affected by many variants with small effects, known as 'polygenicity'. Tens of thousands of samples are often required to ensure statistical power of identifying these variants with small effects. However, it is often the case that a research group can only get approval for the access to individual-level genotype data with a limited sample size (e.g. a few hundreds or thousands). Meanwhile, summary statistics generated using single-variant-based analysis are becoming publicly available. The sample sizes associated with the summary statistics datasets are usually quite large. How to make the most efficient use of existing abundant data resources largely remains an open question. In this study, we propose a statistical approach, IGESS, to increasing statistical power of identifying risk variants and improving accuracy of risk prediction by integrating individual-level genotype data and summary statistics. An efficient algorithm based on variational inference is developed to handle the genome-wide analysis. Through comprehensive simulation studies, we demonstrated the advantages of IGESS over the methods which take either individual-level data or summary statistics data as input. We applied IGESS to perform integrative analysis of Crohn's disease from WTCCC and summary statistics from other studies. IGESS was able to significantly increase the statistical power of identifying risk variants and improve the risk prediction accuracy from 63.2% (±0.4%) to 69.4% (±0.1%) using about 240,000 variants. The IGESS software is available at https://github.com/daviddaigithub/IGESS. Contact: zbxu@xjtu.edu.cn, xwan@comp.hkbu.edu.hk, or eeyang@hkbu.edu.hk. Supplementary data are available at Bioinformatics online. © The Author (2017). Published by Oxford University Press. All rights reserved.

  9. DMET-Miner: Efficient discovery of association rules from pharmacogenomic data.

    PubMed

    Agapito, Giuseppe; Guzzi, Pietro H; Cannataro, Mario

    2015-08-01

    Microarray platforms enable the investigation of allelic variants that may be correlated to phenotypes. Among those, the Affymetrix DMET (Drug Metabolism Enzymes and Transporters) platform enables the simultaneous investigation of all the genes that are related to drug absorption, distribution, metabolism and excretion (ADME). Although recent studies demonstrated the effectiveness of the use of DMET data for studying drug response or toxicity in clinical studies, there is a lack of tools for the automatic analysis of DMET data. In a previous work we developed DMET-Analyzer, a methodology and a supporting platform able to automatize the statistical study of allelic variants, that has been validated in several clinical studies. Although DMET-Analyzer is able to correlate a single variant for each probe (related to a portion of a gene) through the use of the Fisher test, it is unable to discover multiple associations among allelic variants, due to its underlying statistical analysis strategy, which considers a single variant at a time. To overcome those limitations, here we propose a new analysis methodology for DMET data based on Association Rules mining, and an efficient implementation of this methodology, named DMET-Miner. DMET-Miner extends the DMET-Analyzer tool with data mining capabilities and correlates the presence of a set of allelic variants with the conditions of patients' samples by exploiting association rules. To face the high number of frequent itemsets generated when considering large clinical studies based on DMET data, DMET-Miner uses an efficient data structure and implements an optimized search strategy that reduces the search space and the execution time. Preliminary experiments on synthetic DMET datasets show that DMET-Miner outperforms off-the-shelf data mining suites such as the FP-Growth algorithms available in Weka and RapidMiner. To demonstrate the biological relevance of the extracted association rules and the effectiveness of the proposed approach from a medical point of view, some preliminary studies on a real clinical dataset are currently under medical investigation. Copyright © 2015 Elsevier Inc. All rights reserved.

  10. Segment-Wise Genome-Wide Association Analysis Identifies a Candidate Region Associated with Schizophrenia in Three Independent Samples

    PubMed Central

    Rietschel, Marcella; Mattheisen, Manuel; Breuer, René; Schulze, Thomas G.; Nöthen, Markus M.; Levinson, Douglas; Shi, Jianxin; Gejman, Pablo V.; Cichon, Sven; Ophoff, Roel A.

    2012-01-01

    Recent studies suggest that variation in complex disorders (e.g., schizophrenia) is explained by a large number of genetic variants with small effect size (Odds Ratio∼1.05–1.1). The statistical power to detect these genetic variants in Genome Wide Association (GWA) studies with large numbers of cases and controls (∼15,000) is still low. As it will be difficult to further increase sample size, we decided to explore an alternative method for analyzing GWA data in a study of schizophrenia, dramatically reducing the number of statistical tests. The underlying hypothesis was that at least some of the genetic variants related to a common outcome are collocated in segments of chromosomes at a wider scale than single genes. Our approach was therefore to study the association between relatively large segments of DNA and disease status. An association test was performed for each SNP and the number of nominally significant tests in a segment was counted. We then performed a permutation-based binomial test to determine whether this region contained significantly more nominally significant SNPs than expected under the null hypothesis of no association, taking linkage into account. Genome Wide Association data of three independent schizophrenia case/control cohorts with European ancestry (Dutch, German, and US) using segments of DNA with variable length (2 to 32 Mbp) was analyzed. Using this approach we identified a region at chromosome 5q23.3-q31.3 (128–160 Mbp) that was significantly enriched with nominally associated SNPs in three independent case-control samples. We conclude that considering relatively wide segments of chromosomes may reveal reliable relationships between the genome and schizophrenia, suggesting novel methodological possibilities as well as raising theoretical questions. PMID:22723893
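    The segment-wise procedure in this abstract (count nominally significant SNPs in a segment, then ask whether the count exceeds the null expectation) can be sketched as follows. This is a simplified illustration, not the authors' implementation: the function names, the nominal level of 0.05, and the toy numbers are assumptions, and the permuted counts are supplied as plain input, whereas in practice they would come from re-running the association tests on datasets with shuffled case/control labels so that linkage between SNPs is preserved.

```python
from math import comb


def count_nominal(p_values, alpha=0.05):
    """Number of SNPs in a segment with p-value below the nominal level."""
    return sum(p < alpha for p in p_values)


def binomial_tail(k, n, prob):
    """P(X >= k) for X ~ Binomial(n, prob). Treats SNPs as independent,
    so it ignores linkage and is only a naive reference value."""
    return sum(comb(n, i) * prob**i * (1 - prob) ** (n - i) for i in range(k, n + 1))


def empirical_p(observed_count, permuted_counts):
    """Permutation p-value: share of permuted datasets whose segment count
    is at least as large as the observed one (with the usual +1 correction)."""
    exceed = sum(c >= observed_count for c in permuted_counts)
    return (exceed + 1) / (len(permuted_counts) + 1)


# Toy data (hypothetical): per-SNP p-values in one segment and the
# counts obtained from nine label permutations.
segment_p = [0.01, 0.2, 0.03, 0.6, 0.04, 0.5, 0.002, 0.9]
permuted_counts = [0, 1, 0, 2, 1, 0, 1, 0, 4]

observed = count_nominal(segment_p)
naive_p = binomial_tail(observed, len(segment_p), 0.05)
perm_p = empirical_p(observed, permuted_counts)
print(observed, naive_p, perm_p)
```

    The permutation p-value is typically larger than the naive binomial tail because correlated SNPs tend to become nominally significant together.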

  11. Improved score statistics for meta-analysis in single-variant and gene-level association studies.

    PubMed

    Yang, Jingjing; Chen, Sai; Abecasis, Gonçalo

    2018-06-01

    Meta-analysis is now an essential tool for genetic association studies, allowing them to combine large studies and greatly accelerating the pace of genetic discovery. Although the standard meta-analysis methods perform equivalently to the more cumbersome joint analysis under ideal settings, they result in substantial power loss under unbalanced settings with various case-control ratios. Here, we investigate the power loss problem by the standard meta-analysis methods for unbalanced studies, and further propose novel meta-analysis methods performing equivalently to the joint analysis under both balanced and unbalanced settings. We derive improved meta-score-statistics that can accurately approximate the joint-score-statistics with combined individual-level data, for both linear and logistic regression models, with and without covariates. In addition, we propose a novel approach to adjust for population stratification by correcting for known population structures through minor allele frequencies. In the simulated gene-level association studies under unbalanced settings, our method recovered up to 85% power loss caused by the standard methods. We further showed the power gain of our methods in gene-level tests with 26 unbalanced studies of age-related macular degeneration. In addition, we took the meta-analysis of three unbalanced studies of type 2 diabetes as an example to discuss the challenges of meta-analyzing multi-ethnic samples. In summary, our improved meta-score-statistics with corrections for population stratification can be used to construct both single-variant and gene-level association studies, providing a useful framework for ensuring well-powered, convenient, cross-study analyses. © 2018 WILEY PERIODICALS, INC.
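    For orientation, the standard score-statistic meta-analysis that this abstract takes as its baseline can be sketched for a single variant. The sketch assumes each study reports a score statistic U_k and its variance V_k and combines them as Z = sum(U_k) / sqrt(sum(V_k)); it does not implement the improved statistics proposed in the paper, and the function names and toy values are hypothetical.

```python
import math


def meta_score_z(scores, variances):
    """Standard meta-analysis Z from per-study score statistics and variances."""
    if len(scores) != len(variances):
        raise ValueError("scores and variances must have the same length")
    total_variance = sum(variances)
    if total_variance <= 0:
        raise ValueError("total variance must be positive")
    return sum(scores) / math.sqrt(total_variance)


def two_sided_p(z):
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))


# Toy per-study inputs (hypothetical values for three studies).
z = meta_score_z([3.0, 5.0, -1.0], [4.0, 9.0, 3.0])
p = two_sided_p(z)
print(z, p)
```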

  12. Identification of missing variants by combining multiple analytic pipelines.

    PubMed

    Ren, Yingxue; Reddy, Joseph S; Pottier, Cyril; Sarangi, Vivekananda; Tian, Shulan; Sinnwell, Jason P; McDonnell, Shannon K; Biernacka, Joanna M; Carrasquillo, Minerva M; Ross, Owen A; Ertekin-Taner, Nilüfer; Rademakers, Rosa; Hudson, Matthew; Mainzer, Liudmila Sergeevna; Asmann, Yan W

    2018-04-16

    After decades of identifying risk factors using array-based genome-wide association studies (GWAS), genetic research of complex diseases has shifted to sequencing-based rare variants discovery. This requires large sample sizes for statistical power and has brought up questions about whether the current variant calling practices are adequate for large cohorts. It is well-known that there are discrepancies between variants called by different pipelines, and that using a single pipeline always misses true variants exclusively identifiable by other pipelines. Nonetheless, it is common practice today to call variants by one pipeline due to computational cost and assume that false negative calls are a small percent of total. We analyzed 10,000 exomes from the Alzheimer's Disease Sequencing Project (ADSP) using multiple analytic pipelines consisting of different read aligners and variant calling strategies. We compared variants identified by using two aligners in 50, 100, 200, 500, 1000, and 1952 samples; and compared variants identified by adding single-sample genotyping to the default multi-sample joint genotyping in 50, 100, 500, 2000, 5000 and 10,000 samples. We found that using a single pipeline missed increasing numbers of high-quality variants correlated with sample sizes. By combining two read aligners and two variant calling strategies, we rescued 30% of pass-QC variants at sample size of 2000, and 56% at 10,000 samples. The rescued variants had higher proportions of low frequency (minor allele frequency [MAF] 1-5%) and rare (MAF < 1%) variants, which are the very type of variants of interest. In 660 Alzheimer's disease cases with earlier onset ages of ≤65, 4 out of 13 (31%) previously-published rare pathogenic and protective mutations in APP, PSEN1, and PSEN2 genes were undetected by the default one-pipeline approach but recovered by the multi-pipeline approach.
    Identification of the complete variant set from sequencing data is the prerequisite of genetic association analyses. The current analytic practice of calling genetic variants from sequencing data using a single bioinformatics pipeline is no longer adequate with the increasingly large projects. The number and percentage of quality variants that passed quality filters but are missed by the one-pipeline approach rapidly increased with sample size.

  13. The effect of migration of instantaneous centre of knee orthosis rotation during gait - in vivo displacement measurements in two experimental variants.

    PubMed

    Bogucki, Artur J

    2014-01-01

    The knee joint is a bicondylar hinge two-level joint with six degrees of freedom. The location of the functional axis of flexion-extension motion is still a subject of research and discussions. During the swing phase, the femoral condyles do not have direct contact with the tibial articular surfaces and the intra-articular space narrows with increasing weight bearing. The geometry of knee movements is determined by the shape of articular surfaces. A digital recording of the gait of a healthy volunteer was analysed. In the first experimental variant, the subject was wearing a knee orthosis controlling flexion and extension with a hinge-type single-axis joint. In the second variant, the examination involved a hinge-type double-axis orthosis. Statistical analysis involved mathematically calculated values of displacement P. Scatter graphs with a fourth-order polynomial trend line with a confidence interval of 0.95 due to noise were prepared for each experimental variant. In Variant 1, the average displacement was 15.1 mm, the number of tests was 43, standard deviation was 8.761, and the confidence interval was 2.2. The maximum value of displacement was 30.9 mm and the minimum value was 0.7 mm. In Variant 2, the average displacement was 13.4 mm, the number of tests was 44, standard deviation was 7.275, and the confidence interval was 1.8. The maximum value of displacement was 30.2 mm and the minimum value was 3.4 mm. An analysis of moving averages for both experimental variants revealed that displacement trends for both types of orthosis were compatible from the mid-stance to the mid-swing phase. Conclusions: 1. The method employed in the experiment allows for determining the alignment between the axis of the knee joint and that of shin and thigh orthoses. 2. Migration of the single and double-axis orthoses during the gait cycle exceeded 3 cm. 3. During weight bearing, the double-axis orthosis was positioned more correctly. 4. The study results may be helpful in designing new hinge-type knee joints.

  14. Meta-analysis of gene-level tests for rare variant association.

    PubMed

    Liu, Dajiang J; Peloso, Gina M; Zhan, Xiaowei; Holmen, Oddgeir L; Zawistowski, Matthew; Feng, Shuang; Nikpay, Majid; Auer, Paul L; Goel, Anuj; Zhang, He; Peters, Ulrike; Farrall, Martin; Orho-Melander, Marju; Kooperberg, Charles; McPherson, Ruth; Watkins, Hugh; Willer, Cristen J; Hveem, Kristian; Melander, Olle; Kathiresan, Sekar; Abecasis, Gonçalo R

    2014-02-01

    The majority of reported complex disease associations for common genetic variants have been identified through meta-analysis, a powerful approach that enables the use of large sample sizes while protecting against common artifacts due to population structure and repeated small-sample analyses sharing individual-level data. As the focus of genetic association studies shifts to rare variants, genes and other functional units are becoming the focus of analysis. Here we propose and evaluate new approaches for performing meta-analysis of rare variant association tests, including burden tests, weighted burden tests, variable-threshold tests and tests that allow variants with opposite effects to be grouped together. We show that our approach retains useful features from single-variant meta-analysis approaches and demonstrate its use in a study of blood lipid levels in ∼18,500 individuals genotyped with exome arrays.
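    A weighted burden test of the kind listed in this abstract can be written in terms of single-variant summary statistics. The sketch below uses the generic form Z = w'U / sqrt(w'Vw), where U is the vector of single-variant score statistics for a gene, V their covariance matrix, and w the variant weights; it is an illustration under these assumptions, not the exact statistics of the paper, and the names and toy values are hypothetical.

```python
import math


def burden_z(weights, scores, cov):
    """Weighted burden statistic from single-variant score statistics.

    weights: per-variant weights w (length m)
    scores:  single-variant score statistics U (length m)
    cov:     m x m covariance matrix V of the score statistics
    """
    m = len(weights)
    if len(scores) != m or len(cov) != m or any(len(row) != m for row in cov):
        raise ValueError("dimension mismatch between weights, scores and cov")
    numerator = sum(w * u for w, u in zip(weights, scores))
    variance = sum(
        weights[i] * cov[i][j] * weights[j] for i in range(m) for j in range(m)
    )
    if variance <= 0:
        raise ValueError("w'Vw must be positive")
    return numerator / math.sqrt(variance)


# Toy gene with three rare variants and equal weights (hypothetical values).
weights = [1.0, 1.0, 1.0]
scores = [2.0, 1.0, 3.0]
cov = [
    [4.0, 1.0, 0.0],
    [1.0, 3.0, 0.5],
    [0.0, 0.5, 2.0],
]
z_burden = burden_z(weights, scores, cov)
print(z_burden)
```

    In a meta-analysis setting, U and V would first be summed across studies and the same formula applied to the pooled quantities.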

  15. Identification and replication of the interplay of four genetic high-risk variants for urinary bladder cancer

    PubMed Central

    Selinski, Silvia; Blaszkewicz, Meinolf; Ickstadt, Katja; Gerullis, Holger; Otto, Thomas; Roth, Emanuel; Volkert, Frank; Ovsiannikov, Daniel; Moormann, Oliver; Banfi, Gergely; Nyirady, Peter; Vermeulen, Sita H; Garcia-Closas, Montserrat; Figueroa, Jonine D; Johnson, Alison; Karagas, Margaret R; Kogevinas, Manolis; Malats, Nuria; Schwenn, Molly; Silverman, Debra T; Koutros, Stella; Rothman, Nathaniel; Kiemeney, Lambertus A; Hengstler, Jan G; Golka, Klaus

    2017-01-01

    Little is known about whether genetic variants identified in genome-wide association studies interact to increase bladder cancer risk. Recently, we identified two- and three-variant combinations associated with a particular increase of bladder cancer risk in a urinary bladder cancer case–control series (Leibniz Research Centre for Working Environment and Human Factors at TU Dortmund (IfADo), 1501 cases, 1565 controls). In an independent case–control series (Nijmegen Bladder Cancer Study, NBCS, 1468 cases, 1720 controls) we confirmed these two- and three-variant combinations. Pooled analysis of the two studies as discovery group (IfADo-NBCS) resulted in sufficient statistical power to test up to four-variant combinations by a logistic regression approach. The New England and Spanish Bladder Cancer Studies (2080 cases and 2167 controls) were used as a replication series. Twelve previously identified risk variants were considered. The strongest four-variant combination was obtained in never smokers. The combination of rs1014971[AA] near apolipoprotein B mRNA editing enzyme, catalytic polypeptide-like 3A (APOBEC3A) and chromobox homolog 6 (CBX6), solute carrier family 14 (urea transporter), member 1 (Kidd blood group) (SLC14A1) exon single nucleotide polymorphism (SNP) rs1058396[AG, GG], UDP glucuronosyltransferase 1 family, polypeptide A complex locus (UGT1A) intron SNP rs11892031[AA] and rs8102137[CC, CT] near cyclin E1 (CCNE1) resulted in an unadjusted odds ratio (OR) of 2.59 (95% CI = 1.93–3.47; P = 1.87 × 10^-10), while the individual variant ORs ranged only between 1.11 and 1.30. The combination replicated in the New England and Spanish Bladder Cancer Studies (unadjusted OR = 1.60, 95% CI = 1.10–2.33; P = 0.013). The four-variant combination is relatively frequent, with 25% in never smoking cases and 11% in never smoking controls (total study group: 19% cases, 14% controls). In conclusion, we show that four high-risk variants can statistically interact to confer increased bladder cancer risk particularly in never smokers. PMID:29028944

  16. Sequence variants of Toll-like receptor 4 and susceptibility to prostate cancer.

    PubMed

    Chen, Yen-Ching; Giovannucci, Edward; Lazarus, Ross; Kraft, Peter; Ketkar, Shamika; Hunter, David J

    2005-12-15

    Chronic inflammation has been hypothesized to be a risk factor for prostate cancer. The Toll-like receptor 4 (TLR4) presents the bacterial lipopolysaccharide (LPS), which interacts with ligand-binding protein and CD14 (LPS receptor) and activates expression of inflammatory genes through nuclear factor-kappaB and mitogen-activated protein kinase signaling. A previous case-control study found a modest association of a polymorphism in the TLR4 gene [11381G/C, GG versus GC/CC: odds ratio (OR), 1.26] with risk of prostate cancer. We assessed if sequence variants of TLR4 were associated with the risk of prostate cancer. In a nested case-control design within the Health Professionals Follow-up Study, we identified 700 participants with prostate cancer diagnosed after they had provided a blood specimen in 1993 and before January 2000. Controls were 700 age-matched men without prostate cancer who had had a prostate-specific antigen test after providing a blood specimen. We genotyped 16 common (>5%) single nucleotide polymorphisms (SNP) discovered in a resequencing study spanning TLR4 to test for association between sequence variation in TLR4 and prostate cancer. Homozygosity for the variant alleles of eight SNPs was associated with a statistically significantly lower risk of prostate cancer (TLR4_1893, TLR4_2032, TLR4_2437, TLR4_7764, TLR4_11912, TLR4_16649, TLR4_17050, and TLR4_17923), but the TLR4_15844 polymorphism corresponding to 11381G/C was not associated with prostate cancer (GG versus CG/CC: OR, 1.01; 95% confidence interval, 0.79-1.29). Six common haplotypes (cumulative frequency, 81%) were observed; the global test for association between haplotypes and prostate cancer was statistically significant (χ² = 14.8 on 6 degrees of freedom; P = 0.02). Two common haplotypes were statistically significantly associated with altered risk of prostate cancer. Inherited polymorphisms of the innate immune gene TLR4 are associated with risk of prostate cancer.
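    The reported global haplotype test (a chi-square statistic of 14.8 on 6 degrees of freedom, P = 0.02) can be checked by hand. For an even number of degrees of freedom the chi-square upper tail has a closed form, exp(-x/2) multiplied by the sum of (x/2)^k / k! for k from 0 to df/2 - 1, which the sketch below evaluates with the standard library only. The function name is illustrative.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variable with even df,
    using the closed form exp(-x/2) * sum_{k < df/2} (x/2)^k / k!."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("df must be a positive even integer")
    if x < 0:
        raise ValueError("x must be non-negative")
    half = x / 2.0
    term = 1.0   # (x/2)^0 / 0!
    total = 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total


# Statistic and degrees of freedom as reported in the abstract.
p_global = chi2_sf_even_df(14.8, 6)
print(round(p_global, 3))
```

    The result is about 0.022, which agrees with the rounded P = 0.02 given in the abstract.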

  17. Breast Cancer Clinical Trial of Chemotherapy and Trastuzumab: Potential Tool to Identify Cardiac Modifying Variants of Dilated Cardiomyopathy

    PubMed Central

    Serie, Daniel J.; Crook, Julia E.; Necela, Brian M.; Axenfeld, Bianca C.; Dockter, Travis J.; Colon-Otero, Gerardo; Perez, Edith A.; Thompson, E. Aubrey; Norton, Nadine

    2017-01-01

    Doxorubicin and the ERBB2 targeted therapy, trastuzumab, are routinely used in the treatment of HER2+ breast cancer. In mouse models, doxorubicin is known to cause cardiomyopathy and conditional cardiac knock out of Erbb2 results in dilated cardiomyopathy and increased sensitivity to doxorubicin-induced cell death. In humans, these drugs also result in cardiac phenotypes, but severity and reversibility is highly variable. We examined the association of decline in left ventricular ejection fraction (LVEF) at 15,204 single nucleotide polymorphisms (SNPs) spanning 72 cardiomyopathy genes, in 800 breast cancer patients who received doxorubicin and trastuzumab. For 7033 common SNPs (minor allele frequency (MAF) > 0.01) we performed single marker linear regression. For all SNPs, we performed gene-based testing with SNP-set (Sequence) Kernel Association Tests: SKAT, SKAT-O and SKAT-common/rare under rare variant non-burden; rare variant optimized burden and non-burden tests; and a combination of rare and common variants respectively. Single marker analyses identified seven missense variants in OBSCN (p = 0.0045–0.0009, MAF = 0.18–0.50) and two in TTN (both p = 0.04, MAF = 0.22). Gene-based rare variant analyses, SKAT and SKAT-O, performed very similarly (ILK, TCAP, DSC2, VCL, FXN, DSP and KCNQ1, p = 0.042–0.006). Gene-based tests of rare/common variants were significant at the nominal 5% level for OBSCN as well as TCAP, DSC2, VCL, NEXN, KCNJ2 and DMD (p = 0.044–0.008). Our results suggest that rare and common variants in OBSCN, as well as in other genes, could have modifying effects in cardiomyopathy. PMID:29367538

  18. Allelic association of sequence variants in the herpes virus entry mediator-B gene (PVRL2) with the severity of multiple sclerosis.

    PubMed

    Schmidt, S; Pericak-Vance, M A; Sawcer, S; Barcellos, L F; Hart, J; Sims, J; Prokop, A M; van der Walt, J; DeLoa, C; Lincoln, R R; Oksenberg, J R; Compston, A; Hauser, S L; Haines, J L; Gregory, S G

    2006-07-01

    Discrepant findings have been reported regarding an association of the apolipoprotein E (APOE) gene with the clinical course of multiple sclerosis (MS). To resolve these discrepancies, we examined common sequence variation in six candidate genes residing in a 380-kb genomic region surrounding and including the APOE locus for an association with MS severity. We genotyped at least three polymorphisms in each of six candidate genes in 1,540 Caucasian MS families (729 single-case and multiple-case families from the United States, 811 single-case families from the UK). By applying the quantitative transmission/disequilibrium test to a recently proposed MS severity score, the only statistically significant (P=0.003) association with MS severity was found for an intronic variant in the Herpes Virus Entry Mediator-B Gene PVRL2. Additional genotyping extended the association to a 16.6 kb block spanning intron 1 to intron 2 of the gene. Sequencing of PVRL2 failed to identify variants with an obvious functional role. In conclusion, the analysis of a very large data set suggests that genetic polymorphisms in PVRL2 may influence MS severity and supports the possibility that viral factors may contribute to the clinical course of MS, consistent with previous reports.

  19. Association between RTEL1, PHLDB1, and TREH Polymorphisms and Glioblastoma Risk: A Case-Control Study

    PubMed Central

    Yang, Bo; Heng, Liang; Du, Shuli; Yang, Hua; Jin, Tianbo; Lang, Hongjuan; Li, Shanqu

    2015-01-01

    Background: Glioblastoma (GBM) is a highly invasive, aggressive, and incurable brain tumor. Genetic factors play important roles in GBM risk. The aim of this study was to elucidate the influence of gene polymorphism on GBM susceptibility. Material/Methods: In this case-control study, we included 72 GBM patients and 320 healthy controls to analyze the association between 29 single-nucleotide polymorphisms and GBM cancer risk in the Chinese Han population. The single-nucleotide polymorphisms were determined by Sequenom MassARRAY RS1000 and statistical analysis was performed using SPSS software and SNPStats software. Results: Using the χ² test, we found that rs2297440 and rs6010620 in RTEL1 increased risk of GBM. In the recessive model, we also found that the genotypes “CC” of rs2297440 and “GG” of rs6010620 in RTEL1 significantly increased GBM risk. The variant TT genotype of TREH rs17748 and the variant TT genotype of PHLDB1 rs498872 decreased GBM risk in the recessive model. We also found that the TREH rs17748 variant C allele showed an increased risk in males in the dominant model. Conclusions: Our results suggest a significant association between the RTEL1, TREH, and PHLDB1 genes and GBM development in the Han Chinese population. PMID:26156397

  20. Association between RTEL1, PHLDB1, and TREH Polymorphisms and Glioblastoma Risk: A Case-Control Study.

    PubMed

    Yang, Bo; Heng, Liang; Du, Shuli; Yang, Hua; Jin, Tianbo; Lang, Hongjun; Li, Shanqu

    2015-07-09

    Glioblastoma (GBM) is a highly invasive, aggressive, and incurable brain tumor. Genetic factors play important roles in GBM risk. The aim of this study was to elucidate the influence of gene polymorphism on GBM susceptibility. In this case-control study, we included 72 GBM patients and 320 healthy controls to analyze the association between 29 single-nucleotide polymorphisms and GBM cancer risk in the Chinese Han population. The single-nucleotide polymorphisms were determined by Sequenom MassARRAY RS1000 and statistical analysis was performed using SPSS software and SNPStats software. Using the χ² test, we found that rs2297440 and rs6010620 in RTEL1 increased risk of GBM. In the recessive model, we also found that the genotypes "CC" of rs2297440 and "GG" of rs6010620 in RTEL1 significantly increased GBM risk. The variant TT genotype of TREH rs17748 and the variant TT genotype of PHLDB1 rs498872 decreased GBM risk in the recessive model. We also found that the TREH rs17748 variant C allele showed an increased risk in males in the dominant model. Our results suggest a significant association between the RTEL1, TREH, and PHLDB1 genes and GBM development in the Han Chinese population.

  1. Investigation of the role of TCF4 rare sequence variants in schizophrenia.

    PubMed

    Basmanav, F Buket; Forstner, Andreas J; Fier, Heide; Herms, Stefan; Meier, Sandra; Degenhardt, Franziska; Hoffmann, Per; Barth, Sandra; Fricker, Nadine; Strohmaier, Jana; Witt, Stephanie H; Ludwig, Michael; Schmael, Christine; Moebus, Susanne; Maier, Wolfgang; Mössner, Rainald; Rujescu, Dan; Rietschel, Marcella; Lange, Christoph; Nöthen, Markus M; Cichon, Sven

    2015-07-01

    Transcription factor 4 (TCF4) is one of the most robust of all reported schizophrenia risk loci and is supported by several genetic and functional lines of evidence. While numerous studies have implicated common genetic variation at TCF4 in schizophrenia risk, the role of rare, small-sized variants at this locus, such as single nucleotide variants and short indels that are below the resolution of chip-based arrays, requires further exploration. The aim of the present study was to investigate the association between rare TCF4 sequence variants and schizophrenia. Exon-targeted resequencing was performed in 190 German schizophrenia patients. Six rare variants at the coding exons and flanking sequences of the TCF4 gene were identified, including two missense variants and one splice site variant. These six variants were then pooled with nine additional rare variants identified in 379 European participants of the 1000 Genomes Project, and all 15 variants were genotyped in an independent German sample (n = 1,808 patients; n = 2,261 controls). These data were then analyzed using six statistical methods developed for the association analysis of rare variants. No significant association (P < 0.05) was found. However, the results from our association and power analyses suggest that further research into the possible involvement of rare TCF4 sequence variants in schizophrenia risk is warranted by the assessment of larger cohorts with higher statistical power to identify rare variant associations. © 2015 Wiley Periodicals, Inc.

  2. Identifying Causal Variants at Loci with Multiple Signals of Association

    PubMed Central

    Hormozdiari, Farhad; Kostem, Emrah; Kang, Eun Yong; Pasaniuc, Bogdan; Eskin, Eleazar

    2014-01-01

    Although genome-wide association studies have successfully identified thousands of risk loci for complex traits, only a handful of the biologically causal variants, responsible for association at these loci, have been successfully identified. Current statistical methods for identifying causal variants at risk loci either use the strength of the association signal in an iterative conditioning framework or estimate probabilities for variants to be causal. A main drawback of existing methods is that they rely on the simplifying assumption of a single causal variant at each risk locus, which is typically invalid at many risk loci. In this work, we propose a new statistical framework that allows for the possibility of an arbitrary number of causal variants when estimating the posterior probability of a variant being causal. A direct benefit of our approach is that we predict a set of variants for each locus that under reasonable assumptions will contain all of the true causal variants with a high confidence level (e.g., 95%) even when the locus contains multiple causal variants. We use simulations to show that our approach provides 20–50% improvement in our ability to identify the causal variants compared to the existing methods at loci harboring multiple causal variants. We validate our approach using empirical data from an expression QTL study of CHI3L2 to identify new causal variants that affect gene expression at this locus. CAVIAR is publicly available online at http://genetics.cs.ucla.edu/caviar/. PMID:25104515

  3. Identifying causal variants at loci with multiple signals of association.

    PubMed

    Hormozdiari, Farhad; Kostem, Emrah; Kang, Eun Yong; Pasaniuc, Bogdan; Eskin, Eleazar

    2014-10-01

    Although genome-wide association studies have successfully identified thousands of risk loci for complex traits, only a handful of the biologically causal variants, responsible for association at these loci, have been successfully identified. Current statistical methods for identifying causal variants at risk loci either use the strength of the association signal in an iterative conditioning framework or estimate probabilities for variants to be causal. A main drawback of existing methods is that they rely on the simplifying assumption of a single causal variant at each risk locus, which is typically invalid at many risk loci. In this work, we propose a new statistical framework that allows for the possibility of an arbitrary number of causal variants when estimating the posterior probability of a variant being causal. A direct benefit of our approach is that we predict a set of variants for each locus that under reasonable assumptions will contain all of the true causal variants with a high confidence level (e.g., 95%) even when the locus contains multiple causal variants. We use simulations to show that our approach provides 20-50% improvement in our ability to identify the causal variants compared to the existing methods at loci harboring multiple causal variants. We validate our approach using empirical data from an expression QTL study of CHI3L2 to identify new causal variants that affect gene expression at this locus. CAVIAR is publicly available online at http://genetics.cs.ucla.edu/caviar/. Copyright © 2014 by the Genetics Society of America.

  4. No Association of Coronary Artery Disease with X-Chromosomal Variants in Comprehensive International Meta-Analysis.

    PubMed

    Loley, Christina; Alver, Maris; Assimes, Themistocles L; Bjonnes, Andrew; Goel, Anuj; Gustafsson, Stefan; Hernesniemi, Jussi; Hopewell, Jemma C; Kanoni, Stavroula; Kleber, Marcus E; Lau, King Wai; Lu, Yingchang; Lyytikäinen, Leo-Pekka; Nelson, Christopher P; Nikpay, Majid; Qu, Liming; Salfati, Elias; Scholz, Markus; Tukiainen, Taru; Willenborg, Christina; Won, Hong-Hee; Zeng, Lingyao; Zhang, Weihua; Anand, Sonia S; Beutner, Frank; Bottinger, Erwin P; Clarke, Robert; Dedoussis, George; Do, Ron; Esko, Tõnu; Eskola, Markku; Farrall, Martin; Gauguier, Dominique; Giedraitis, Vilmantas; Granger, Christopher B; Hall, Alistair S; Hamsten, Anders; Hazen, Stanley L; Huang, Jie; Kähönen, Mika; Kyriakou, Theodosios; Laaksonen, Reijo; Lind, Lars; Lindgren, Cecilia; Magnusson, Patrik K E; Marouli, Eirini; Mihailov, Evelin; Morris, Andrew P; Nikus, Kjell; Pedersen, Nancy; Rallidis, Loukianos; Salomaa, Veikko; Shah, Svati H; Stewart, Alexandre F R; Thompson, John R; Zalloua, Pierre A; Chambers, John C; Collins, Rory; Ingelsson, Erik; Iribarren, Carlos; Karhunen, Pekka J; Kooner, Jaspal S; Lehtimäki, Terho; Loos, Ruth J F; März, Winfried; McPherson, Ruth; Metspalu, Andres; Reilly, Muredach P; Ripatti, Samuli; Sanghera, Dharambir K; Thiery, Joachim; Watkins, Hugh; Deloukas, Panos; Kathiresan, Sekar; Samani, Nilesh J; Schunkert, Heribert; Erdmann, Jeanette; König, Inke R

    2016-10-12

    In recent years, genome-wide association studies have identified 58 independent risk loci for coronary artery disease (CAD) on the autosome. However, due to the sex-specific data structure of the X chromosome, it has been excluded from most of these analyses. While females have 2 copies of chromosome X, males have only one. Also, one of the female X chromosomes may be inactivated. Therefore, special test statistics and quality control procedures are required. Thus, little is known about the role of X-chromosomal variants in CAD. To fill this gap, we conducted a comprehensive X-chromosome-wide meta-analysis including more than 43,000 CAD cases and 58,000 controls from 35 international study cohorts. For quality control, sex-specific filters were used to adequately take the special structure of X-chromosomal data into account. For single study analyses, several logistic regression models were calculated allowing for inactivation of one female X-chromosome, adjusting for sex and investigating interactions between sex and genetic variants. Then, meta-analyses including all 35 studies were conducted using random effects models. None of the investigated models revealed genome-wide significant associations for any variant. Although we analyzed the largest-to-date sample, currently available methods were not able to detect any associations of X-chromosomal variants with CAD.
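
    The abstract above says that per-study results were pooled with random effects models but does not name the estimator. The following is a minimal sketch of one common choice, the DerSimonian-Laird estimator, applied to per-cohort log odds ratios. The function name, the input values, and the 5e-8 significance threshold are illustrative assumptions, not details taken from the study.

```python
import math

def dersimonian_laird(betas, ses):
    """Pool per-study effects with a DerSimonian-Laird random-effects model.

    betas: per-study log odds ratios; ses: their standard errors.
    Returns (pooled_beta, pooled_se, tau2, two_sided_p).
    """
    k = len(betas)
    w = [1.0 / se ** 2 for se in ses]              # inverse-variance weights
    sw = sum(w)
    beta_fixed = sum(wi * b for wi, b in zip(w, betas)) / sw
    # Cochran's Q measures between-study heterogeneity.
    q = sum(wi * (b - beta_fixed) ** 2 for wi, b in zip(w, betas))
    c = sw - sum(wi ** 2 for wi in w) / sw
    # Method-of-moments estimate of the between-study variance, floored at 0.
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    w_re = [1.0 / (se ** 2 + tau2) for se in ses]  # random-effects weights
    sw_re = sum(w_re)
    beta_re = sum(wi * b for wi, b in zip(w_re, betas)) / sw_re
    se_re = math.sqrt(1.0 / sw_re)
    # Two-sided P-value from the standard normal distribution.
    p = math.erfc(abs(beta_re / se_re) / math.sqrt(2.0))
    return beta_re, se_re, tau2, p

# Hypothetical per-cohort estimates for one X-chromosomal variant.
cohort_betas = [0.02, -0.01, 0.04]
cohort_ses = [0.03, 0.05, 0.04]
beta, se, tau2, p = dersimonian_laird(cohort_betas, cohort_ses)
print(f"pooled log OR = {beta:.4f}, SE = {se:.4f}, tau^2 = {tau2:.4f}, P = {p:.3g}")
# 5e-8 is the conventional genome-wide threshold, assumed here for illustration.
print("genome-wide significant:", p < 5e-8)
```

    When the between-study variance estimate is zero, the random-effects weights reduce to the fixed-effect weights, so the two models give the same pooled estimate.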

  5. BALSA: integrated secondary analysis for whole-genome and whole-exome sequencing, accelerated by GPU.

    PubMed

    Luo, Ruibang; Wong, Yiu-Lun; Law, Wai-Chun; Lee, Lap-Kei; Cheung, Jeanno; Liu, Chi-Man; Lam, Tak-Wah

    2014-01-01

    This paper reports an integrated solution, called BALSA, for the secondary analysis of next generation sequencing data; it exploits the computational power of GPU and an intricate memory management to give a fast and accurate analysis. From raw reads to variants (including SNPs and Indels), BALSA, using just a single computing node with a commodity GPU board, takes 5.5 h to process 50-fold whole genome sequencing (∼750 million 100 bp paired-end reads), or just 25 min for 210-fold whole exome sequencing. BALSA's speed is rooted at its parallel algorithms to effectively exploit a GPU to speed up processes like alignment, realignment and statistical testing. BALSA incorporates a 16-genotype model to support the calling of SNPs and Indels and achieves competitive variant calling accuracy and sensitivity when compared to the ensemble of six popular variant callers. BALSA also supports efficient identification of somatic SNVs and CNVs; experiments showed that BALSA recovers all the previously validated somatic SNVs and CNVs, and it is more sensitive for somatic Indel detection. BALSA outputs variants in VCF format. A pileup-like SNAPSHOT format, while maintaining the same fidelity as BAM in variant calling, enables efficient storage and indexing, and facilitates the App development of downstream analyses. BALSA is available at: http://sourceforge.net/p/balsa.

  6. Genetic Variants in IRF6 and the Risk of Facial Clefts: Single-Marker and Haplotype-Based Analyses in a Population-Based Case-Control Study of Facial Clefts in Norway

    PubMed Central

    Jugessur, Astanand; Rahimov, Fedik; Lie, Rolv T.; Wilcox, Allen J.; Gjessing, Håkon K.; Nilsen, Roy M.; Nguyen, Truc Trung; Murray, Jeffrey C.

    2009-01-01

    Mutations in the gene encoding interferon regulatory factor 6 (IRF6) underlie a common form of syndromic clefting known as Van der Woude syndrome. Lip pits and missing teeth are the only additional features distinguishing the syndrome from isolated clefts. Van der Woude syndrome, therefore, provides an excellent model for studying the isolated forms of clefting. From a population-based case-control study of facial clefts in Norway (1996–2001), we selected 377 cleft lip with or without cleft palate (CL/P), 196 cleft palate only (CPO), and 763 control infant-parent triads for analysis. We genotyped six single nucleotide polymorphisms within the IRF6 locus and estimated the relative risks (RR) conferred on the child by alleles and haplotypes of the child and of the mother. On the whole, there were strong statistical associations with CL/P but not CPO in our data. In single-marker analyses, mothers with a double-dose of the ‘a’-allele at rs4844880 had an increased risk of having a child with CL/P (RR = 1.85, 95% confidence interval: 1.04–3.25; P = 0.036). An RR of 0.38 (95% confidence interval: 0.16–0.92; P = 0.031) was obtained when the child carried a single-dose of the ‘a’-allele at rs2235371 (the p.V274I polymorphism). The P-value for the overall test was <0.001. In haplotype analyses, several of the fetal and maternal haplotype relative risks were statistically significant individually but were not strong enough to show up on the overall test (P = 0.113). Taken together, these findings further support a role for IRF6 variants in clefting of the lip and provide specific risk estimates in a Norwegian population. PMID:18278815

  7. A Protein Domain and Family Based Approach to Rare Variant Association Analysis.

    PubMed

    Richardson, Tom G; Shihab, Hashem A; Rivas, Manuel A; McCarthy, Mark I; Campbell, Colin; Timpson, Nicholas J; Gaunt, Tom R

    2016-01-01

    It has become common practice to analyse large scale sequencing data with statistical approaches based around the aggregation of rare variants within the same gene. We applied a novel approach to rare variant analysis by collapsing variants together using protein domain and family coordinates, regarded to be a more discrete definition of a biologically functional unit. Using Pfam definitions, we collapsed rare variants (Minor Allele Frequency ≤ 1%) together in three different ways 1) variants within single genomic regions which map to individual protein domains 2) variants within two individual protein domain regions which are predicted to be responsible for a protein-protein interaction 3) all variants within combined regions from multiple genes responsible for coding the same protein domain (i.e. protein families). A conventional collapsing analysis using gene coordinates was also undertaken for comparison. We used UK10K sequence data and investigated associations between regions of variants and lipid traits using the sequence kernel association test (SKAT). We observed no strong evidence of association between regions of variants based on Pfam domain definitions and lipid traits. Quantile-Quantile plots illustrated that the overall distributions of p-values from the protein domain analyses were comparable to that of a conventional gene-based approach. Deviations from this distribution suggested that collapsing by either protein domain or gene definitions may be favourable depending on the trait analysed. We have collapsed rare variants together using protein domain and family coordinates to present an alternative approach over collapsing across conventionally used gene-based regions. Although no strong evidence of association was detected in these analyses, future studies may still find value in adopting these approaches to detect previously unidentified association signals.
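
    The grouping step described above (keep variants with minor allele frequency ≤ 1% and collapse them by the region they fall in) can be sketched as follows. This is only an illustration of the collapsing idea under simple assumptions: the tuple layouts, region names, and coordinates are hypothetical, and the sketch covers neither the Pfam lookup nor the SKAT test itself.

```python
def collapse_by_region(variants, regions, maf_max=0.01):
    """Group rare variants by the region(s) that contain them.

    variants: iterable of (variant_id, chrom, pos, maf) tuples.
    regions:  iterable of (region_name, chrom, start, end) tuples, inclusive.
    Returns a dict mapping region_name to a list of variant ids.
    """
    groups = {name: [] for name, _, _, _ in regions}
    for var_id, chrom, pos, maf in variants:
        if maf > maf_max:
            continue  # keep only rare variants (MAF <= maf_max)
        for name, r_chrom, start, end in regions:
            if chrom == r_chrom and start <= pos <= end:
                groups[name].append(var_id)
    return groups

# Hypothetical regions standing in for protein-domain coordinates.
example_regions = [("domain_A", "1", 100, 300), ("domain_B", "1", 800, 1000)]
example_variants = [
    ("v1", "1", 150, 0.005),  # rare, inside domain_A
    ("v2", "1", 250, 0.20),   # common, filtered out
    ("v3", "1", 900, 0.001),  # rare, inside domain_B
]
print(collapse_by_region(example_variants, example_regions))
```

    Passing the same region name on several rows with different coordinates would pool variants across genes, which corresponds to the protein-family style of collapsing mentioned in the abstract.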

  8. Sequence data and association statistics from 12,940 type 2 diabetes cases and controls.

    PubMed

    Flannick, Jason; Fuchsberger, Christian; Mahajan, Anubha; Teslovich, Tanya M; Agarwala, Vineeta; Gaulton, Kyle J; Caulkins, Lizz; Koesterer, Ryan; Ma, Clement; Moutsianas, Loukas; McCarthy, Davis J; Rivas, Manuel A; Perry, John R B; Sim, Xueling; Blackwell, Thomas W; Robertson, Neil R; Rayner, N William; Cingolani, Pablo; Locke, Adam E; Tajes, Juan Fernandez; Highland, Heather M; Dupuis, Josee; Chines, Peter S; Lindgren, Cecilia M; Hartl, Christopher; Jackson, Anne U; Chen, Han; Huyghe, Jeroen R; van de Bunt, Martijn; Pearson, Richard D; Kumar, Ashish; Müller-Nurasyid, Martina; Grarup, Niels; Stringham, Heather M; Gamazon, Eric R; Lee, Jaehoon; Chen, Yuhui; Scott, Robert A; Below, Jennifer E; Chen, Peng; Huang, Jinyan; Go, Min Jin; Stitzel, Michael L; Pasko, Dorota; Parker, Stephen C J; Varga, Tibor V; Green, Todd; Beer, Nicola L; Day-Williams, Aaron G; Ferreira, Teresa; Fingerlin, Tasha; Horikoshi, Momoko; Hu, Cheng; Huh, Iksoo; Ikram, Mohammad Kamran; Kim, Bong-Jo; Kim, Yongkang; Kim, Young Jin; Kwon, Min-Seok; Lee, Juyoung; Lee, Selyeong; Lin, Keng-Han; Maxwell, Taylor J; Nagai, Yoshihiko; Wang, Xu; Welch, Ryan P; Yoon, Joon; Zhang, Weihua; Barzilai, Nir; Voight, Benjamin F; Han, Bok-Ghee; Jenkinson, Christopher P; Kuulasmaa, Teemu; Kuusisto, Johanna; Manning, Alisa; Ng, Maggie C Y; Palmer, Nicholette D; Balkau, Beverley; Stančáková, Alena; Abboud, Hanna E; Boeing, Heiner; Giedraitis, Vilmantas; Prabhakaran, Dorairaj; Gottesman, Omri; Scott, James; Carey, Jason; Kwan, Phoenix; Grant, George; Smith, Joshua D; Neale, Benjamin M; Purcell, Shaun; Butterworth, Adam S; Howson, Joanna M M; Lee, Heung Man; Lu, Yingchang; Kwak, Soo-Heon; Zhao, Wei; Danesh, John; Lam, Vincent K L; Park, Kyong Soo; Saleheen, Danish; So, Wing Yee; Tam, Claudia H T; Afzal, Uzma; Aguilar, David; Arya, Rector; Aung, Tin; Chan, Edmund; Navarro, Carmen; Cheng, Ching-Yu; Palli, Domenico; Correa, Adolfo; Curran, Joanne E; Rybin, Dennis; Farook, Vidya S; Fowler, Sharon P; Freedman, Barry I; 
Griswold, Michael; Hale, Daniel Esten; Hicks, Pamela J; Khor, Chiea-Chuen; Kumar, Satish; Lehne, Benjamin; Thuillier, Dorothée; Lim, Wei Yen; Liu, Jianjun; Loh, Marie; Musani, Solomon K; Puppala, Sobha; Scott, William R; Yengo, Loïc; Tan, Sian-Tsung; Taylor, Herman A; Thameem, Farook; Wilson, Gregory; Wong, Tien Yin; Njølstad, Pål Rasmus; Levy, Jonathan C; Mangino, Massimo; Bonnycastle, Lori L; Schwarzmayr, Thomas; Fadista, João; Surdulescu, Gabriela L; Herder, Christian; Groves, Christopher J; Wieland, Thomas; Bork-Jensen, Jette; Brandslund, Ivan; Christensen, Cramer; Koistinen, Heikki A; Doney, Alex S F; Kinnunen, Leena; Esko, Tõnu; Farmer, Andrew J; Hakaste, Liisa; Hodgkiss, Dylan; Kravic, Jasmina; Lyssenko, Valeri; Hollensted, Mette; Jørgensen, Marit E; Jørgensen, Torben; Ladenvall, Claes; Justesen, Johanne Marie; Käräjämäki, Annemari; Kriebel, Jennifer; Rathmann, Wolfgang; Lannfelt, Lars; Lauritzen, Torsten; Narisu, Narisu; Linneberg, Allan; Melander, Olle; Milani, Lili; Neville, Matt; Orho-Melander, Marju; Qi, Lu; Qi, Qibin; Roden, Michael; Rolandsson, Olov; Swift, Amy; Rosengren, Anders H; Stirrups, Kathleen; Wood, Andrew R; Mihailov, Evelin; Blancher, Christine; Carneiro, Mauricio O; Maguire, Jared; Poplin, Ryan; Shakir, Khalid; Fennell, Timothy; DePristo, Mark; de Angelis, Martin Hrabé; Deloukas, Panos; Gjesing, Anette P; Jun, Goo; Nilsson, Peter; Murphy, Jacquelyn; Onofrio, Robert; Thorand, Barbara; Hansen, Torben; Meisinger, Christa; Hu, Frank B; Isomaa, Bo; Karpe, Fredrik; Liang, Liming; Peters, Annette; Huth, Cornelia; O'Rahilly, Stephen P; Palmer, Colin N A; Pedersen, Oluf; Rauramaa, Rainer; Tuomilehto, Jaakko; Salomaa, Veikko; Watanabe, Richard M; Syvänen, Ann-Christine; Bergman, Richard N; Bharadwaj, Dwaipayan; Bottinger, Erwin P; Cho, Yoon Shin; Chandak, Giriraj R; Chan, Juliana Cn; Chia, Kee Seng; Daly, Mark J; Ebrahim, Shah B; Langenberg, Claudia; Elliott, Paul; Jablonski, Kathleen A; Lehman, Donna M; Jia, Weiping; Ma, Ronald C W; Pollin, Toni I; 
Sandhu, Manjinder; Tandon, Nikhil; Froguel, Philippe; Barroso, Inês; Teo, Yik Ying; Zeggini, Eleftheria; Loos, Ruth J F; Small, Kerrin S; Ried, Janina S; DeFronzo, Ralph A; Grallert, Harald; Glaser, Benjamin; Metspalu, Andres; Wareham, Nicholas J; Walker, Mark; Banks, Eric; Gieger, Christian; Ingelsson, Erik; Im, Hae Kyung; Illig, Thomas; Franks, Paul W; Buck, Gemma; Trakalo, Joseph; Buck, David; Prokopenko, Inga; Mägi, Reedik; Lind, Lars; Farjoun, Yossi; Owen, Katharine R; Gloyn, Anna L; Strauch, Konstantin; Tuomi, Tiinamaija; Kooner, Jaspal Singh; Lee, Jong-Young; Park, Taesung; Donnelly, Peter; Morris, Andrew D; Hattersley, Andrew T; Bowden, Donald W; Collins, Francis S; Atzmon, Gil; Chambers, John C; Spector, Timothy D; Laakso, Markku; Strom, Tim M; Bell, Graeme I; Blangero, John; Duggirala, Ravindranath; Tai, E Shyong; McVean, Gilean; Hanis, Craig L; Wilson, James G; Seielstad, Mark; Frayling, Timothy M; Meigs, James B; Cox, Nancy J; Sladek, Rob; Lander, Eric S; Gabriel, Stacey; Mohlke, Karen L; Meitinger, Thomas; Groop, Leif; Abecasis, Goncalo; Scott, Laura J; Morris, Andrew P; Kang, Hyun Min; Altshuler, David; Burtt, Noël P; Florez, Jose C; Boehnke, Michael; McCarthy, Mark I

    2017-12-19

    To investigate the genetic basis of type 2 diabetes (T2D) to high resolution, the GoT2D and T2D-GENES consortia catalogued variation from whole-genome sequencing of 2,657 European individuals and exome sequencing of 12,940 individuals of multiple ancestries. Over 27M SNPs, indels, and structural variants were identified, including 99% of low-frequency (minor allele frequency [MAF] 0.1-5%) non-coding variants in the whole-genome sequenced individuals and 99.7% of low-frequency coding variants in the whole-exome sequenced individuals. Each variant was tested for association with T2D in the sequenced individuals, and, to increase power, most were tested in larger numbers of individuals (>80% of low-frequency coding variants in ~82 K Europeans via the exome chip, and ~90% of low-frequency non-coding variants in ~44 K Europeans via genotype imputation). The variants, genotypes, and association statistics from these analyses provide the largest reference to date of human genetic information relevant to T2D, for use in activities such as T2D-focused genotype imputation, functional characterization of variants or genes, and other novel analyses to detect associations between sequence variation and T2D.

  9. Sequence data and association statistics from 12,940 type 2 diabetes cases and controls

    PubMed Central

    Flannick, Jason; Fuchsberger, Christian; Mahajan, Anubha; Teslovich, Tanya M.; Agarwala, Vineeta; Gaulton, Kyle J.; Caulkins, Lizz; Koesterer, Ryan; Ma, Clement; Moutsianas, Loukas; McCarthy, Davis J.; Rivas, Manuel A.; Perry, John R. B.; Sim, Xueling; Blackwell, Thomas W.; Robertson, Neil R.; Rayner, N. William; Cingolani, Pablo; Locke, Adam E.; Tajes, Juan Fernandez; Highland, Heather M.; Dupuis, Josee; Chines, Peter S.; Lindgren, Cecilia M.; Hartl, Christopher; Jackson, Anne U.; Chen, Han; Huyghe, Jeroen R.; van de Bunt, Martijn; Pearson, Richard D.; Kumar, Ashish; Müller-Nurasyid, Martina; Grarup, Niels; Stringham, Heather M.; Gamazon, Eric R.; Lee, Jaehoon; Chen, Yuhui; Scott, Robert A.; Below, Jennifer E.; Chen, Peng; Huang, Jinyan; Go, Min Jin; Stitzel, Michael L.; Pasko, Dorota; Parker, Stephen C. J.; Varga, Tibor V.; Green, Todd; Beer, Nicola L.; Day-Williams, Aaron G.; Ferreira, Teresa; Fingerlin, Tasha; Horikoshi, Momoko; Hu, Cheng; Huh, Iksoo; Ikram, Mohammad Kamran; Kim, Bong-Jo; Kim, Yongkang; Kim, Young Jin; Kwon, Min-Seok; Lee, Juyoung; Lee, Selyeong; Lin, Keng-Han; Maxwell, Taylor J.; Nagai, Yoshihiko; Wang, Xu; Welch, Ryan P.; Yoon, Joon; Zhang, Weihua; Barzilai, Nir; Voight, Benjamin F.; Han, Bok-Ghee; Jenkinson, Christopher P.; Kuulasmaa, Teemu; Kuusisto, Johanna; Manning, Alisa; Ng, Maggie C. Y.; Palmer, Nicholette D.; Balkau, Beverley; Stančáková, Alena; Abboud, Hanna E.; Boeing, Heiner; Giedraitis, Vilmantas; Prabhakaran, Dorairaj; Gottesman, Omri; Scott, James; Carey, Jason; Kwan, Phoenix; Grant, George; Smith, Joshua D.; Neale, Benjamin M.; Purcell, Shaun; Butterworth, Adam S.; Howson, Joanna M. M.; Lee, Heung Man; Lu, Yingchang; Kwak, Soo-Heon; Zhao, Wei; Danesh, John; Lam, Vincent K. L.; Park, Kyong Soo; Saleheen, Danish; So, Wing Yee; Tam, Claudia H. T.; Afzal, Uzma; Aguilar, David; Arya, Rector; Aung, Tin; Chan, Edmund; Navarro, Carmen; Cheng, Ching-Yu; Palli, Domenico; Correa, Adolfo; Curran, Joanne E.; Rybin, Dennis; Farook, Vidya S.; Fowler, Sharon P.; Freedman, Barry I.; Griswold, Michael; Hale, Daniel Esten; Hicks, Pamela J.; Khor, Chiea-Chuen; Kumar, Satish; Lehne, Benjamin; Thuillier, Dorothée; Lim, Wei Yen; Liu, Jianjun; Loh, Marie; Musani, Solomon K.; Puppala, Sobha; Scott, William R.; Yengo, Loïc; Tan, Sian-Tsung; Taylor, Herman A.; Thameem, Farook; Wilson, Gregory; Wong, Tien Yin; Njølstad, Pål Rasmus; Levy, Jonathan C.; Mangino, Massimo; Bonnycastle, Lori L.; Schwarzmayr, Thomas; Fadista, João; Surdulescu, Gabriela L.; Herder, Christian; Groves, Christopher J.; Wieland, Thomas; Bork-Jensen, Jette; Brandslund, Ivan; Christensen, Cramer; Koistinen, Heikki A.; Doney, Alex S. F.; Kinnunen, Leena; Esko, Tõnu; Farmer, Andrew J.; Hakaste, Liisa; Hodgkiss, Dylan; Kravic, Jasmina; Lyssenko, Valeri; Hollensted, Mette; Jørgensen, Marit E.; Jørgensen, Torben; Ladenvall, Claes; Justesen, Johanne Marie; Käräjämäki, Annemari; Kriebel, Jennifer; Rathmann, Wolfgang; Lannfelt, Lars; Lauritzen, Torsten; Narisu, Narisu; Linneberg, Allan; Melander, Olle; Milani, Lili; Neville, Matt; Orho-Melander, Marju; Qi, Lu; Qi, Qibin; Roden, Michael; Rolandsson, Olov; Swift, Amy; Rosengren, Anders H.; Stirrups, Kathleen; Wood, Andrew R.; Mihailov, Evelin; Blancher, Christine; Carneiro, Mauricio O.; Maguire, Jared; Poplin, Ryan; Shakir, Khalid; Fennell, Timothy; DePristo, Mark; de Angelis, Martin Hrabé; Deloukas, Panos; Gjesing, Anette P.; Jun, Goo; Nilsson, Peter; Murphy, Jacquelyn; Onofrio, Robert; Thorand, Barbara; Hansen, Torben; Meisinger, Christa; Hu, Frank B.; Isomaa, Bo; Karpe, Fredrik; Liang, Liming; Peters, Annette; Huth, Cornelia; O'Rahilly, Stephen P.; Palmer, Colin N. A.; Pedersen, Oluf; Rauramaa, Rainer; Tuomilehto, Jaakko; Salomaa, Veikko; Watanabe, Richard M.; Syvänen, Ann-Christine; Bergman, Richard N.; Bharadwaj, Dwaipayan; Bottinger, Erwin P.; Cho, Yoon Shin; Chandak, Giriraj R.; Chan, Juliana C. N.; Chia, Kee Seng; Daly, Mark J.; Ebrahim, Shah B.; Langenberg, Claudia; Elliott, Paul; Jablonski, Kathleen A.; Lehman, Donna M.; Jia, Weiping; Ma, Ronald C. W.; Pollin, Toni I.; Sandhu, Manjinder; Tandon, Nikhil; Froguel, Philippe; Barroso, Inês; Teo, Yik Ying; Zeggini, Eleftheria; Loos, Ruth J. F.; Small, Kerrin S.; Ried, Janina S.; DeFronzo, Ralph A.; Grallert, Harald; Glaser, Benjamin; Metspalu, Andres; Wareham, Nicholas J.; Walker, Mark; Banks, Eric; Gieger, Christian; Ingelsson, Erik; Im, Hae Kyung; Illig, Thomas; Franks, Paul W.; Buck, Gemma; Trakalo, Joseph; Buck, David; Prokopenko, Inga; Mägi, Reedik; Lind, Lars; Farjoun, Yossi; Owen, Katharine R.; Gloyn, Anna L.; Strauch, Konstantin; Tuomi, Tiinamaija; Kooner, Jaspal Singh; Lee, Jong-Young; Park, Taesung; Donnelly, Peter; Morris, Andrew D.; Hattersley, Andrew T.; Bowden, Donald W.; Collins, Francis S.; Atzmon, Gil; Chambers, John C.; Spector, Timothy D.; Laakso, Markku; Strom, Tim M.; Bell, Graeme I.; Blangero, John; Duggirala, Ravindranath; Tai, E. Shyong; McVean, Gilean; Hanis, Craig L.; Wilson, James G.; Seielstad, Mark; Frayling, Timothy M.; Meigs, James B.; Cox, Nancy J.; Sladek, Rob; Lander, Eric S.; Gabriel, Stacey; Mohlke, Karen L.; Meitinger, Thomas; Groop, Leif; Abecasis, Goncalo; Scott, Laura J.; Morris, Andrew P.; Kang, Hyun Min; Altshuler, David; Burtt, Noël P.; Florez, Jose C.; Boehnke, Michael; McCarthy, Mark I.

    2017-01-01

    To investigate the genetic basis of type 2 diabetes (T2D) to high resolution, the GoT2D and T2D-GENES consortia catalogued variation from whole-genome sequencing of 2,657 European individuals and exome sequencing of 12,940 individuals of multiple ancestries. Over 27M SNPs, indels, and structural variants were identified, including 99% of low-frequency (minor allele frequency [MAF] 0.1–5%) non-coding variants in the whole-genome sequenced individuals and 99.7% of low-frequency coding variants in the whole-exome sequenced individuals. Each variant was tested for association with T2D in the sequenced individuals, and, to increase power, most were tested in larger numbers of individuals (>80% of low-frequency coding variants in ~82 K Europeans via the exome chip, and ~90% of low-frequency non-coding variants in ~44 K Europeans via genotype imputation). The variants, genotypes, and association statistics from these analyses provide the largest reference to date of human genetic information relevant to T2D, for use in activities such as T2D-focused genotype imputation, functional characterization of variants or genes, and other novel analyses to detect associations between sequence variation and T2D. PMID:29257133

  10. Epistasis analysis for quantitative traits by functional regression model.

    PubMed

    Zhang, Futao; Boerwinkle, Eric; Xiong, Momiao

    2014-06-01

    The critical barrier in interaction analysis for rare variants is that most traditional statistical methods for testing interactions were originally designed for testing the interaction between common variants and are difficult to apply to rare variants because of their prohibitive computational time and poor ability. The great challenges for successful detection of interactions with next-generation sequencing (NGS) data are (1) lack of methods for interaction analysis with rare variants, (2) severe multiple testing, and (3) time-consuming computations. To meet these challenges, we shift the paradigm of interaction analysis between two loci to interaction analysis between two sets of loci or genomic regions and collectively test interactions between all possible pairs of SNPs within two genomic regions. In other words, we take a genome region as a basic unit of interaction analysis and use high-dimensional data reduction and functional data analysis techniques to develop a novel functional regression model to collectively test interactions between all possible pairs of single nucleotide polymorphisms (SNPs) within two genome regions. By intensive simulations, we demonstrate that the functional regression models for interaction analysis of the quantitative trait have the correct type 1 error rates and a much better ability to detect interactions than the current pairwise interaction analysis. The proposed method was applied to exome sequence data from the NHLBI's Exome Sequencing Project (ESP) and CHARGE-S study. We discovered 27 pairs of genes showing significant interactions after applying the Bonferroni correction (P-values < 4.58 × 10⁻¹⁰) in the ESP, and 11 were replicated in the CHARGE-S study. © 2014 Zhang et al.; Published by Cold Spring Harbor Laboratory Press.

  11. metaCCA: summary statistics-based multivariate meta-analysis of genome-wide association studies using canonical correlation analysis.

    PubMed

    Cichonska, Anna; Rousu, Juho; Marttinen, Pekka; Kangas, Antti J; Soininen, Pasi; Lehtimäki, Terho; Raitakari, Olli T; Järvelin, Marjo-Riitta; Salomaa, Veikko; Ala-Korpela, Mika; Ripatti, Samuli; Pirinen, Matti

    2016-07-01

    A dominant approach to genetic association studies is to perform univariate tests between genotype-phenotype pairs. However, analyzing related traits together increases statistical power, and certain complex associations become detectable only when several variants are tested jointly. Currently, modest sample sizes of individual cohorts, and restricted availability of individual-level genotype-phenotype data across the cohorts limit conducting multivariate tests. We introduce metaCCA, a computational framework for summary statistics-based analysis of a single or multiple studies that allows multivariate representation of both genotype and phenotype. It extends the statistical technique of canonical correlation analysis to the setting where original individual-level records are not available, and employs a covariance shrinkage algorithm to achieve robustness. Multivariate meta-analysis of two Finnish studies of nuclear magnetic resonance metabolomics by metaCCA, using standard univariate output from the program SNPTEST, shows an excellent agreement with the pooled individual-level analysis of original data. Motivated by strong multivariate signals in the lipid genes tested, we envision that multivariate association testing using metaCCA has a great potential to provide novel insights from already published summary statistics from high-throughput phenotyping technologies. Code is available at https://github.com/aalto-ics-kepaco. Contact: anna.cichonska@helsinki.fi or matti.pirinen@helsinki.fi. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.

  12. metaCCA: summary statistics-based multivariate meta-analysis of genome-wide association studies using canonical correlation analysis

    PubMed Central

    Cichonska, Anna; Rousu, Juho; Marttinen, Pekka; Kangas, Antti J.; Soininen, Pasi; Lehtimäki, Terho; Raitakari, Olli T.; Järvelin, Marjo-Riitta; Salomaa, Veikko; Ala-Korpela, Mika; Ripatti, Samuli; Pirinen, Matti

    2016-01-01

    Motivation: A dominant approach to genetic association studies is to perform univariate tests between genotype-phenotype pairs. However, analyzing related traits together increases statistical power, and certain complex associations become detectable only when several variants are tested jointly. Currently, modest sample sizes of individual cohorts, and restricted availability of individual-level genotype-phenotype data across the cohorts limit conducting multivariate tests. Results: We introduce metaCCA, a computational framework for summary statistics-based analysis of a single or multiple studies that allows multivariate representation of both genotype and phenotype. It extends the statistical technique of canonical correlation analysis to the setting where original individual-level records are not available, and employs a covariance shrinkage algorithm to achieve robustness. Multivariate meta-analysis of two Finnish studies of nuclear magnetic resonance metabolomics by metaCCA, using standard univariate output from the program SNPTEST, shows an excellent agreement with the pooled individual-level analysis of original data. Motivated by strong multivariate signals in the lipid genes tested, we envision that multivariate association testing using metaCCA has a great potential to provide novel insights from already published summary statistics from high-throughput phenotyping technologies. Availability and implementation: Code is available at https://github.com/aalto-ics-kepaco Contacts: anna.cichonska@helsinki.fi or matti.pirinen@helsinki.fi Supplementary information: Supplementary data are available at Bioinformatics online. PMID:27153689

  13. OPATs: Omnibus P-value association tests.

    PubMed

    Chen, Chia-Wei; Yang, Hsin-Chou

    2017-07-10

    Combining statistical significances (P-values) from a set of single-locus association tests in genome-wide association studies is a proof-of-principle method for identifying disease-associated genomic segments, functional genes and biological pathways. We review P-value combinations for genome-wide association studies and introduce an integrated analysis tool, Omnibus P-value Association Tests (OPATs), which provides popular analysis methods of P-value combinations. The software OPATs programmed in R and R graphical user interface features a user-friendly interface. In addition to analysis modules for data quality control and single-locus association tests, OPATs provides three types of set-based association test: window-, gene- and biopathway-based association tests. P-value combinations with or without threshold and rank truncation are provided. The significance of a set-based association test is evaluated by using resampling procedures. Performance of the set-based association tests in OPATs has been evaluated by simulation studies and real data analyses. These set-based association tests help boost the statistical power, alleviate the multiple-testing problem, reduce the impact of genetic heterogeneity, increase the replication efficiency of association tests and facilitate the interpretation of association signals by streamlining the testing procedures and integrating the genetic effects of multiple variants in genomic regions of biological relevance. In summary, P-value combinations facilitate the identification of marker sets associated with disease susceptibility and uncover missing heritability in association studies, thereby establishing a foundation for the genetic dissection of complex diseases and traits. OPATs provides an easy-to-use and statistically powerful analysis tool for P-value combinations. OPATs, examples, and user guide can be downloaded from http://www.stat.sinica.edu.tw/hsinchou/genetics/association/OPATs.htm. © The Author 2017. Published by Oxford University Press.
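
    As an illustration of what a P-value combination does, the sketch below implements Fisher's method, one classic way to combine single-locus P-values into a set-level P-value. It is not OPATs code (OPATs is written in R), and it assumes independent tests so that a closed-form chi-square null can be used. The abstract above states that OPATs evaluates significance by resampling, which is the safer route when SNPs are in linkage disequilibrium or when truncated combinations are used. The function name and the example P-values are hypothetical.

```python
import math

def fisher_combined_p(p_values):
    """Combine independent P-values with Fisher's method.

    Under the null, X = -2 * sum(ln p_i) follows a chi-square distribution
    with 2k degrees of freedom, where k is the number of P-values.
    """
    k = len(p_values)
    half = -sum(math.log(p) for p in p_values)   # this is X / 2
    # Chi-square survival function for even df = 2k has a closed form:
    # exp(-X/2) * sum_{i=0}^{k-1} (X/2)^i / i!
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return math.exp(-half) * total

# Hypothetical single-SNP P-values within one gene.
snp_p_values = [0.04, 0.20, 0.01, 0.65]
print(f"gene-level P = {fisher_combined_p(snp_p_values):.4g}")
```

    With a single P-value the function returns that P-value unchanged, which is a quick sanity check on the formula.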

  14. Smoking Gun or Circumstantial Evidence? Comparison of Statistical Learning Methods using Functional Annotations for Prioritizing Risk Variants.

    PubMed

    Gagliano, Sarah A; Ravji, Reena; Barnes, Michael R; Weale, Michael E; Knight, Jo

    2015-08-24

    Although technology has triumphed in facilitating routine genome sequencing, new challenges have been created for the data-analyst. Genome-scale surveys of human variation generate volumes of data that far exceed capabilities for laboratory characterization. By incorporating functional annotations as predictors, statistical learning has been widely investigated for prioritizing genetic variants likely to be associated with complex disease. We compared three published prioritization procedures, which use different statistical learning algorithms and different predictors with regard to the quantity, type and coding. We also explored different combinations of algorithm and annotation set. As an application, we tested which methodology performed best for prioritizing variants using data from a large schizophrenia meta-analysis by the Psychiatric Genomics Consortium. Results suggest that all methods have considerable (and similar) predictive accuracies (AUCs 0.64-0.71) in test set data, but there is more variability in the application to the schizophrenia GWAS. In conclusion, a variety of algorithms and annotations seem to have a similar potential to effectively enrich true risk variants in genome-scale datasets, however none offer more than incremental improvement in prediction. We discuss how methods might be evolved for risk variant prediction to address the impending bottleneck of the new generation of genome re-sequencing studies.

  15. Asian-American variants of human papillomavirus 16 and risk for cervical cancer: a case-control study.

    PubMed

    Berumen, J; Ordoñez, R M; Lazcano, E; Salmeron, J; Galvan, S C; Estrada, R A; Yunes, E; Garcia-Carranca, A; Gonzalez-Lira, G; Madrigal-de la Campa, A

    2001-09-05

    Human papillomavirus 16 (HPV16) has a number of variants, each with a different geographic distribution and some that are associated more often with invasive neoplasias. We investigated whether the high incidence of cervical cancer in Mexico (50 cases per 100 000 women) may be associated with a high prevalence of oncogenic HPV16 variants. Cervical samples were collected from 181 case patients with cervical cancer and from 181 age-matched control subjects, all from Mexico City. HPV16 was detected with an E6/E7 gene-specific polymerase chain reaction, and variant HPV classes and subclasses were identified by sequencing regions of the E6 and L1/MY genes. Clinical data and data on tumor characteristics were also collected. All statistical tests were two-sided. HPV16 was detected in cervical scrapes from 50.8% (92 of 181) of case patients and from 11% (20 of 181) of control subjects. All HPV16-positive samples, except one, contained European (E) or Asian-American (AA) variants. AA and E variants were found statistically significantly more often in case patients (AA = 23.2% [42 of 181]; E = 27.1% [49 of 181]) than in control subjects (AA = 1.1% [two of 181]; E = 10% [18 of 181]) (P < .001 for case versus control subjects for either E or AA variants, χ² test). However, the frequency of AA variants was 21 times higher in cancer patients than in control subjects, whereas that ratio for E variants was only 2.7 (P = .006, χ² test). The odds ratio (OR) for cervical cancer associated with AA variants (OR = 27.0; 95% confidence interval [CI] = 6.4 to 113.7) was higher than that associated with E variants (OR = 3.4; 95% CI = 1.9 to 6.0). AA-positive case patients (46.2 ± 12.5 years [mean ± standard deviation]) were 7.7 years younger than E-positive case patients (53.9 ± 12.2 years) (P = .004, Student's t test). AA variants were associated with squamous cell carcinomas and adenocarcinomas, but E variants were associated with only squamous cell carcinomas (P = .014, Fisher's exact test). The high frequency of HPV16 AA variants, which appear to be more oncogenic than E variants, might contribute to the high incidence of cervical cancer in Mexico.
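
    The odds ratios in this abstract can be approximately recovered from the counts it reports. The sketch below is one way to do so in Python, under two assumptions that the abstract does not state: that the reference group for each variant is all remaining subjects in the same arm, and that the interval is a Woolf (log-scale normal) interval. The function name is hypothetical.

```python
import math


def odds_ratio_ci(exposed_cases, total_cases, exposed_controls, total_controls, z=1.96):
    """Odds ratio and Woolf-type confidence interval from a 2x2 table.

    Assumption: everyone without the variant (including carriers of other
    variants and HPV16-negative subjects) forms the reference group.
    """
    a = exposed_cases
    b = total_cases - exposed_cases
    c = exposed_controls
    d = total_controls - exposed_controls
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) for a 2x2 table.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Counts quoted in the abstract: AA in 42/181 cases and 2/181 controls,
# E in 49/181 cases and 18/181 controls.
aa_or, aa_lo, aa_hi = odds_ratio_ci(42, 181, 2, 181)
e_or, e_lo, e_hi = odds_ratio_ci(49, 181, 18, 181)

print(f"AA: OR={aa_or:.1f}, 95% CI {aa_lo:.1f} to {aa_hi:.1f}")
print(f"E:  OR={e_or:.1f}, 95% CI {e_lo:.1f} to {e_hi:.1f}")
```

    Under these assumptions the results round to the reported figures, which suggests (but does not prove) that the authors used a comparable calculation. The very wide interval for AA variants reflects the fact that only two control subjects carried them.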

  16. Comprehensive genetic testing for female and male infertility using next-generation sequencing.

    PubMed

    Patel, Bonny; Parets, Sasha; Akana, Matthew; Kellogg, Gregory; Jansen, Michael; Chang, Chihyu; Cai, Ying; Fox, Rebecca; Niknazar, Mohammad; Shraga, Roman; Hunter, Colby; Pollock, Andrew; Wisotzkey, Robert; Jaremko, Malgorzata; Bisignano, Alex; Puig, Oscar

    2018-05-19

    To develop a comprehensive genetic test for female and male infertility in support of medical decisions during assisted reproductive technology (ART) protocols. We developed a next-generation sequencing (NGS) gene panel consisting of 87 genes including promoters, 5' and 3' untranslated regions, exons, and selected introns. In addition, sex chromosome aneuploidies and Y chromosome microdeletions were analyzed concomitantly using the same panel. The NGS panel was analytically validated by retrospective analysis of 118 genomic DNA samples with known variants in loci representative of female and male infertility. Our results showed analytical accuracy of > 99%, with > 98% sensitivity for single-nucleotide variants (SNVs) and > 91% sensitivity for insertions/deletions (indels). Clinical sensitivity was assessed with samples containing variants representative of male and female infertility, and it was 100% for SNVs/indels, CFTR IVS8-5T variants, sex chromosome aneuploidies, and copy number variants (CNVs) and > 93% for Y chromosome microdeletions. Cost analysis shows potential savings when comparing this single NGS assay with the standard approach, which includes multiple assays. A single, comprehensive, NGS panel can simplify the ordering process for healthcare providers, reduce turnaround time, and lower the overall cost of testing for genetic assessment of infertility in females and males, while maintaining accuracy.

  17. Molecular screening of the ghrelin gene in Italian obese children: the Leu72Met variant is associated with an earlier onset of obesity.

    PubMed

    Miraglia del Giudice, E; Santoro, N; Cirillo, G; Raimondo, P; Grandone, A; D'Aniello, A; Di Nardo, M; Perrone, L

    2004-03-01

    To test whether ghrelin variants could play a role in modulating some aspects of the obese phenotype during childhood. We screened the ghrelin gene in 300 Italian obese children and adolescents (mean age 10.5 ± 3.2 y; range 4-19 y) and 200 controls by using the single-strand conformation polymorphism and the restriction fragment length polymorphism analysis. No mutations were detected with the exception of two previously described polymorphisms, Arg51Gln and Leu72Met. For both variations, allelic frequencies were similar between patients and controls. Interestingly, we showed that the Leu72Met polymorphism was associated with differences in the age at obesity onset. Patients with the Met72 allele became obese earlier than patients homozygous for the wild-type Leu72 allele. The logrank test comparing the plots of the complement of Kaplan-Meier estimates between the two groups of patients was statistically significant (P<0.0001). It is unlikely that ghrelin variations cause the obesity due to single-gene mutations. The Leu72Met polymorphism of the ghrelin gene seems to play a role in anticipating the onset of obesity among children, suggesting, therefore, that ghrelin may be involved in the pathophysiology of human adiposity.

  18. Clinical application of high throughput molecular screening techniques for pharmacogenomics

    PubMed Central

    Wiita, Arun P; Schrijver, Iris

    2011-01-01

    Genetic analysis is one of the fastest-growing areas of clinical diagnostics. Fortunately, as our knowledge of clinically relevant genetic variants rapidly expands, so does our ability to detect these variants in patient samples. Increasing demand for genetic information may necessitate the use of high throughput diagnostic methods as part of clinically validated testing. Here we provide a general overview of our current and near-future abilities to perform large-scale genetic testing in the clinical laboratory. First we review in detail molecular methods used for high throughput mutation detection, including techniques able to monitor thousands of genetic variants for a single patient or to genotype a single genetic variant for thousands of patients simultaneously. These methods are analyzed in the context of pharmacogenomic testing in the clinical laboratories, with a focus on tests that are currently validated as well as those that hold strong promise for widespread clinical application in the near future. We further discuss the unique economic and clinical challenges posed by pharmacogenomic markers. Our ability to detect genetic variants frequently outstrips our ability to accurately interpret them in a clinical context, carrying implications both for test development and introduction into patient management algorithms. These complexities must be taken into account prior to the introduction of any pharmacogenomic biomarker into routine clinical testing. PMID:23226057

  19. No association between oxytocin or prolactin gene variants and childhood-onset mood disorders

    PubMed Central

    Strauss, John S.; Freeman, Natalie L.; Shaikh, Sajid A.; Vetró, Ágnes; Kiss, Enikő; Kapornai, Krisztina; Daróczi, Gabriella; Rimay, Timea; Kothencné, Viola Osváth; Dombovári, Edit; Kaczvinszk, Emília; Tamás, Zsuzsa; Baji, Ildikó; Besny, Márta; Gádoros, Julia; DeLuca, Vincenzo; George, Charles J.; Dempster, Emma; Barr, Cathy L.; Kovacs, Maria; Kennedy, James L.

    2010-01-01

    Background: Oxytocin (OXT) and prolactin (PRL) are neuropeptide hormones that interact with the serotonin system and are involved in the stress response and social affiliation. In human studies, serum OXT and PRL levels have been associated with depression and related phenotypes. Our purpose was to determine if single nucleotide polymorphisms (SNPs) at the loci for OXT, PRL and their receptors, OXTR and PRLR, were associated with childhood-onset mood disorders (COMD). Methods: Using 678 families in a family-based association design, we genotyped sixteen SNPs at OXT, PRL, OXTR and PRLR to test for association with COMD. Results: No significant associations were found for SNPs in the OXTR, PRL, or PRLR genes. Two of three SNPs 3' of the OXT gene were associated with COMD (p ≤ 0.02), significant after spectral decomposition, but were not significant after additionally correcting for the number of genes tested. Supplementary analyses of parent-of-origin and proband sex effects for OXT SNPs by Fisher’s exact test were not significant after Bonferroni correction. Conclusions: We have examined sixteen OXT and PRL system gene variants, with no evidence of statistically significant association after correction for multiple tests. PMID:20547007

  20. Canary: an atomic pipeline for clinical amplicon assays.

    PubMed

    Doig, Kenneth D; Ellul, Jason; Fellowes, Andrew; Thompson, Ella R; Ryland, Georgina; Blombery, Piers; Papenfuss, Anthony T; Fox, Stephen B

    2017-12-15

    High throughput sequencing requires bioinformatics pipelines to process large volumes of data into meaningful variants that can be translated into a clinical report. These pipelines often suffer from a number of shortcomings: they lack robustness and have many components written in multiple languages, each with a variety of resource requirements. Pipeline components must be linked together with a workflow system to achieve the processing of FASTQ files through to a VCF file of variants. Crafting these pipelines requires considerable bioinformatics and IT skills beyond the reach of many clinical laboratories. Here we present Canary, a single program that can be run on a laptop, which takes FASTQ files from amplicon assays through to an annotated VCF file ready for clinical analysis. Canary can be installed and run with a single command using Docker containerization or run as a single JAR file on a wide range of platforms. Although it is a single utility, Canary performs all the functions present in more complex and unwieldy pipelines. All variants identified by Canary are 3' shifted and represented in their most parsimonious form to provide a consistent nomenclature, irrespective of sequencing variation. Further, proximate in-phase variants are represented as a single HGVS 'delins' variant. This allows for correct nomenclature and consequences to be ascribed to complex multi-nucleotide polymorphisms (MNPs), which are otherwise difficult to represent and interpret. Variants can also be annotated with hundreds of attributes sourced from MyVariant.info to give up-to-date details on pathogenicity, population statistics and in-silico predictors. Canary has been used at the Peter MacCallum Cancer Centre in Melbourne for the last 2 years for the processing of clinical sequencing data. By encapsulating clinical features in a single, easily installed executable, Canary makes sequencing more accessible to all pathology laboratories. Canary is available for download as source or a Docker image at https://github.com/PapenfussLab/Canary under a GPL-3.0 License.

  1. Genome-wide scans of genetic variants for psychophysiological endophenotypes: a methodological overview.

    PubMed

    Iacono, William G; Malone, Stephen M; Vaidyanathan, Uma; Vrieze, Scott I

    2014-12-01

    This article provides an introductory overview of the investigative strategy employed to evaluate the genetic basis of 17 endophenotypes examined as part of a 20-year data collection effort from the Minnesota Center for Twin and Family Research. Included are characterization of the study samples, descriptive statistics for key properties of the psychophysiological measures, and rationale behind the steps taken in the molecular genetic study design. The statistical approach included (a) biometric analysis of twin and family data, (b) heritability analysis using 527,829 single nucleotide polymorphisms (SNPs), (c) genome-wide association analysis of these SNPs and 17,601 autosomal genes, (d) follow-up analyses of candidate SNPs and genes hypothesized to have an association with each endophenotype, (e) rare variant analysis of nonsynonymous SNPs in the exome, and (f) whole genome sequencing association analysis using 27 million genetic variants. These methods were used in the accompanying empirical articles comprising this special issue, Genome-Wide Scans of Genetic Variants for Psychophysiological Endophenotypes. Copyright © 2014 Society for Psychophysiological Research.

  2. Identification of seven novel loci associated with amino acid levels using single-variant and gene-based tests in 8545 Finnish men from the METSIM study.

    PubMed

    Teslovich, Tanya M; Kim, Daniel Seung; Yin, Xianyong; Stancáková, Alena; Jackson, Anne U; Wielscher, Matthias; Naj, Adam; Perry, John R B; Huyghe, Jeroen R; Stringham, Heather M; Davis, James P; Raulerson, Chelsea K; Welch, Ryan P; Fuchsberger, Christian; Locke, Adam E; Sim, Xueling; Chines, Peter S; Narisu, Narisu; Kangas, Antti J; Soininen, Pasi; Ala-Korpela, Mika; Gudnason, Vilmundur; Musani, Solomon K; Jarvelin, Marjo-Riitta; Schellenberg, Gerard D; Speliotes, Elizabeth K; Kuusisto, Johanna; Collins, Francis S; Boehnke, Michael; Laakso, Markku; Mohlke, Karen L

    2018-05-01

    Comprehensive metabolite profiling captures many highly heritable traits, including amino acid levels, which are potentially sensitive biomarkers for disease pathogenesis. To better understand the contribution of genetic variation to amino acid levels, we performed single variant and gene-based tests of association between nine serum amino acids (alanine, glutamine, glycine, histidine, isoleucine, leucine, phenylalanine, tyrosine, and valine) and 16.6 million genotyped and imputed variants in 8545 non-diabetic Finnish men from the METabolic Syndrome In Men (METSIM) study with replication in Northern Finland Birth Cohort (NFBC1966). We identified five novel loci associated with amino acid levels (P < 5×10⁻⁸): LOC157273/PPP1R3B with glycine (rs9987289, P = 2.3×10⁻²⁶); ZFHX3 (chr16:73326579, minor allele frequency (MAF) = 0.42%, P = 3.6×10⁻⁹), LIPC (rs10468017, P = 1.5×10⁻⁸), and WWOX (rs9937914, P = 3.8×10⁻⁸) with alanine; and TRIB1 with tyrosine (rs28601761, P = 8×10⁻⁹). Gene-based tests identified two novel genes harboring missense variants of MAF <1% that show aggregate association with amino acid levels: PYCR1 with glycine (Pgene = 1.5×10⁻⁶) and BCAT2 with valine (Pgene = 7.4×10⁻⁷); neither gene was implicated by single variant association tests. These findings are among the first applications of gene-based tests to identify new loci for amino acid levels. In addition to the seven novel gene associations, we identified five independent signals at established amino acid loci, including two rare variant signals at GLDC (rs138640017, MAF = 0.95%, Pconditional = 5.8×10⁻⁴⁰) with glycine levels and HAL (rs141635447, MAF = 0.46%, Pconditional = 9.4×10⁻¹¹) with histidine levels. Examination of all single variant association results in our data revealed a strong inverse relationship between effect size and MAF (Ptrend<0.001). These novel signals provide further insight into the molecular mechanisms of amino acid metabolism and potentially, their perturbations in disease.

  3. Yield of the RYR2 Genetic Test in Suspected Catecholaminergic Polymorphic Ventricular Tachycardia and Implications for Test Interpretation.

    PubMed

    Kapplinger, Jamie D; Pundi, Krishna N; Larson, Nicholas B; Callis, Thomas E; Tester, David J; Bikker, Hennie; Wilde, Arthur A M; Ackerman, Michael J

    2018-02-01

    Pathogenic RYR2 variants account for ≈60% of clinically definite cases of catecholaminergic polymorphic ventricular tachycardia. However, the rate of rare benign RYR2 variants identified in the general population remains a challenge for genetic test interpretation. Therefore, we examined the results of the RYR2 genetic test among patients referred for commercial genetic testing and examined factors impacting variant interpretability. Frequency and location comparisons were made for RYR2 variants identified among 1355 total patients of varying clinical certainty and 60 706 Exome Aggregation Consortium controls. The impact of the clinical phenotype on the yield of RYR2 variants was examined. Six in silico tools were assessed using patient- and control-derived variants. A total of 18.2% (218/1200) of patients referred for commercial testing hosted rare RYR2 variants, statistically less than the 59% (46/78) yield among clinically definite cases, resulting in a much higher potential genetic false discovery rate among referrals considering the 3.2% background rate of rare, benign RYR2 variants. Exclusion of clearly putative pathogenic variants further complicates the interpretation of the next novel RYR2 variant. Exonic/topologic analyses revealed overrepresentation of patient variants in exons covering only one third of the protein. In silico tools largely failed to show evidence toward enhancement of variant interpretation. Current expert recommendations have resulted in increased use of RYR2 genetic testing in patients with questionable clinical phenotypes. Using the largest to date catecholaminergic polymorphic ventricular tachycardia patient versus control comparison, this study highlights important variables in the interpretation of variants to overcome the 3.2% background rate that confounds RYR2 variant interpretation. © 2018 American Heart Association, Inc.
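
    The contrast this abstract draws between referral and clinically definite cohorts can be made concrete by dividing the background rate by the observed yield. The Python sketch below does this with the figures quoted in the abstract. The abstract does not give a formula for its "potential genetic false discovery rate", so the ratio used here is an assumption, and the function name is hypothetical.

```python
def background_share(yield_rate, background_rate):
    """Share of rare-variant-positive results that the background rate of
    rare benign variants could account for on its own.

    Assumption: the share is approximated as background_rate / yield_rate.
    """
    return background_rate / yield_rate


BACKGROUND_RATE = 0.032  # 3.2% background rate quoted in the abstract

referral_share = background_share(218 / 1200, BACKGROUND_RATE)
definite_share = background_share(46 / 78, BACKGROUND_RATE)

print(f"referral cohort:     {referral_share:.1%}")
print(f"clinically definite: {definite_share:.1%}")
```

    On this rough reading, the background rate could explain about three times as large a share of positive results among referrals as among clinically definite cases, which is the direction of the effect the abstract describes.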

  4. Genome-Wide Association Study for Susceptibility to and Recoverability From Mastitis in Danish Holstein Cows.

    PubMed

    Welderufael, B G; Løvendahl, Peter; de Koning, Dirk-Jan; Janss, Lucas L G; Fikse, W F

    2018-01-01

    Because mastitis is very frequent and unavoidable, adding recovery information into the analysis for genetic evaluation of mastitis is of great interest from both an economic and an animal welfare point of view. Here we have performed genome-wide association studies (GWAS) to identify associated single nucleotide polymorphisms (SNPs) and investigate the genetic background not only for susceptibility to but also for recoverability from mastitis. Somatic cell count records from 993 Danish Holstein cows genotyped for a total of 39378 autosomal SNP markers were used for the association analysis. Single SNP regression analysis was performed using the statistical software package DMU. The substitution effect of each SNP was tested with a t-test, and a genome-wide significance level of P-value < 10⁻⁴ was used to declare significant SNP-trait association. A number of significant SNP variants were identified for both traits. Many of the SNP variants associated with either susceptibility to or recoverability from mastitis were located in or very near genes that have been reported for their role in the immune system. Genes involved in lymphocyte development (e.g., MAST3 and STAB2) and genes involved in macrophage recruitment and regulation of inflammation (PDGFD and PTX3) were suggested as possible causal genes for susceptibility to and recoverability from mastitis, respectively. However, this is the first GWAS for recoverability from mastitis, and our results need to be validated. The findings in the current study are, therefore, a starting point for further investigations in identifying causal genetic variants or chromosomal regions for both susceptibility to and recoverability from mastitis.

  5. Whole exome sequencing of rare variants in EIF4G1 and VPS35 in Parkinson disease

    PubMed Central

    Nuytemans, Karen; Bademci, Guney; Inchausti, Vanessa; Dressen, Amy; Kinnamon, Daniel D.; Mehta, Arpit; Wang, Liyong; Züchner, Stephan; Beecham, Gary W.; Martin, Eden R.; Scott, William K.

    2013-01-01

    Objective: Recently, vacuolar protein sorting 35 (VPS35) and eukaryotic translation initiation factor 4 gamma 1 (EIF4G1) have been identified as 2 causal Parkinson disease (PD) genes. We used whole exome sequencing for rapid, parallel analysis of variations in these 2 genes. Methods: We performed whole exome sequencing in 213 patients with PD and 272 control individuals. Those rare variants (RVs) with <5% frequency in the exome variant server database and our own control data were considered for analysis. We performed joint gene-based tests for association using RVASSOC and SKAT (Sequence Kernel Association Test) as well as single-variant test statistics. Results: We identified 3 novel VPS35 variations that changed the coded amino acid (nonsynonymous) in 3 cases. Two variations were in multiplex families and neither segregated with PD. In EIF4G1, we identified 11 (9 nonsynonymous and 2 small indels) RVs including the reported pathogenic mutation p.R1205H, which segregated in all affected members of a large family, but also in 1 unaffected 86-year-old family member. Two additional RVs were found in isolated patients only. Whereas initial association studies suggested an association (p = 0.04) with all RVs in EIF4G1, subsequent testing in a second dataset for the driving variant (p.F1461) suggested no association between RVs in the gene and PD. Conclusions: We confirm that the specific EIF4G1 variation p.R1205H seems to be a strong PD risk factor, but is nonpenetrant in at least one 86-year-old. A few other select RVs in both genes could not be ruled out as causal. However, there was no evidence for an overall contribution of genetic variability in VPS35 or EIF4G1 to PD development in our dataset. PMID:23408866

  6. No Association of Coronary Artery Disease with X-Chromosomal Variants in Comprehensive International Meta-Analysis

    PubMed Central

    Loley, Christina; Alver, Maris; Assimes, Themistocles L.; Bjonnes, Andrew; Goel, Anuj; Gustafsson, Stefan; Hernesniemi, Jussi; Hopewell, Jemma C.; Kanoni, Stavroula; Kleber, Marcus E.; Lau, King Wai; Lu, Yingchang; Lyytikäinen, Leo-Pekka; Nelson, Christopher P.; Nikpay, Majid; Qu, Liming; Salfati, Elias; Scholz, Markus; Tukiainen, Taru; Willenborg, Christina; Won, Hong-Hee; Zeng, Lingyao; Zhang, Weihua; Anand, Sonia S.; Beutner, Frank; Bottinger, Erwin P.; Clarke, Robert; Dedoussis, George; Do, Ron; Esko, Tõnu; Eskola, Markku; Farrall, Martin; Gauguier, Dominique; Giedraitis, Vilmantas; Granger, Christopher B.; Hall, Alistair S.; Hamsten, Anders; Hazen, Stanley L.; Huang, Jie; Kähönen, Mika; Kyriakou, Theodosios; Laaksonen, Reijo; Lind, Lars; Lindgren, Cecilia; Magnusson, Patrik K. E.; Marouli, Eirini; Mihailov, Evelin; Morris, Andrew P.; Nikus, Kjell; Pedersen, Nancy; Rallidis, Loukianos; Salomaa, Veikko; Shah, Svati H.; Stewart, Alexandre F. R.; Thompson, John R.; Zalloua, Pierre A.; Chambers, John C.; Collins, Rory; Ingelsson, Erik; Iribarren, Carlos; Karhunen, Pekka J.; Kooner, Jaspal S.; Lehtimäki, Terho; Loos, Ruth J. F.; März, Winfried; McPherson, Ruth; Metspalu, Andres; Reilly, Muredach P.; Ripatti, Samuli; Sanghera, Dharambir K.; Thiery, Joachim; Watkins, Hugh; Deloukas, Panos; Kathiresan, Sekar; Samani, Nilesh J.; Schunkert, Heribert; Erdmann, Jeanette; König, Inke R.

    2016-01-01

    In recent years, genome-wide association studies have identified 58 independent risk loci for coronary artery disease (CAD) on the autosome. However, due to the sex-specific data structure of the X chromosome, it has been excluded from most of these analyses. While females have 2 copies of chromosome X, males have only one. Also, one of the female X chromosomes may be inactivated. Therefore, special test statistics and quality control procedures are required. Thus, little is known about the role of X-chromosomal variants in CAD. To fill this gap, we conducted a comprehensive X-chromosome-wide meta-analysis including more than 43,000 CAD cases and 58,000 controls from 35 international study cohorts. For quality control, sex-specific filters were used to adequately take the special structure of X-chromosomal data into account. For single study analyses, several logistic regression models were calculated allowing for inactivation of one female X-chromosome, adjusting for sex and investigating interactions between sex and genetic variants. Then, meta-analyses including all 35 studies were conducted using random effects models. None of the investigated models revealed genome-wide significant associations for any variant. Although we analyzed the largest-to-date sample, currently available methods were not able to detect any associations of X-chromosomal variants with CAD. PMID:27731410
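
    The abstract states that per-study results were combined with random effects models but does not name the estimator. As a hedged illustration, the Python sketch below uses the DerSimonian-Laird method, a common choice that is assumed here rather than confirmed by the source. The function name and the input numbers are invented for the example.

```python
import math


def random_effects_pool(effects, variances):
    """DerSimonian-Laird random-effects pooling of per-study effect
    estimates (for example, log odds ratios) and their variances.

    Returns (pooled_effect, standard_error, tau_squared).
    """
    weights = [1.0 / v for v in variances]
    sum_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, effects)) / sum_w
    # Cochran's Q measures between-study heterogeneity.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    c = sum_w - sum(w * w for w in weights) / sum_w
    tau_squared = max(0.0, (q - df) / c) if c > 0 else 0.0
    # Re-weight each study by its within-study plus between-study variance.
    re_weights = [1.0 / (v + tau_squared) for v in variances]
    sum_re = sum(re_weights)
    pooled = sum(w * y for w, y in zip(re_weights, effects)) / sum_re
    return pooled, math.sqrt(1.0 / sum_re), tau_squared


# Made-up log odds ratios and variances for three hypothetical cohorts.
pooled, se, tau2 = random_effects_pool([0.02, -0.01, 0.05], [0.004, 0.006, 0.010])
z_score = pooled / se
print(f"pooled={pooled:.4f}, se={se:.4f}, tau^2={tau2:.4f}, z={z_score:.2f}")
```

    When the studies agree closely, the between-study variance estimate is truncated at zero and the result coincides with a fixed-effect inverse-variance average.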

  7. Clinical Variant Classification: A Comparison of Public Databases and a Commercial Testing Laboratory.

    PubMed

    Gradishar, William; Johnson, KariAnne; Brown, Krystal; Mundt, Erin; Manley, Susan

    2017-07-01

    There is a growing move to consult public databases following receipt of a genetic test result from a clinical laboratory; however, the well-documented limitations of these databases call into question how often clinicians will encounter discordant variant classifications that may introduce uncertainty into patient management. Here, we evaluate discordance in BRCA1 and BRCA2 variant classifications between a single commercial testing laboratory and a public database commonly consulted in clinical practice. BRCA1 and BRCA2 variant classifications were obtained from ClinVar and compared with the classifications from a reference laboratory. Full concordance and discordance were determined for variants whose ClinVar entries were of the same pathogenicity (pathogenic, benign, or uncertain). Variants with conflicting ClinVar classifications were considered partially concordant if ≥1 of the listed classifications agreed with the reference laboratory classification. Four thousand two hundred and fifty unique BRCA1 and BRCA2 variants were available for analysis. Overall, 73.2% of classifications were fully concordant and 12.3% were partially concordant. The remaining 14.5% of variants had discordant classifications, most of which had a definitive classification (pathogenic or benign) from the reference laboratory compared with an uncertain classification in ClinVar (14.0%). Here, we show that discrepant classifications between a public database and single reference laboratory potentially account for 26.7% of variants in BRCA1 and BRCA2. The time and expertise required of clinicians to research these discordant classifications call into question the practicality of checking all test results against a database and suggest that discordant classifications should be interpreted with these limitations in mind. With the increasing use of clinical genetic testing for hereditary cancer risk, accurate variant classification is vital to ensuring appropriate medical management. There is a growing move to consult public databases following receipt of a genetic test result from a clinical laboratory; however, we show that up to 26.7% of variants in BRCA1 and BRCA2 have discordant classifications between ClinVar and a reference laboratory. The findings presented in this paper serve as a note of caution regarding the utility of database consultation. © AlphaMed Press 2017.

  8. New insights into old methods for identifying causal rare variants.

    PubMed

    Wang, Haitian; Huang, Chien-Hsun; Lo, Shaw-Hwa; Zheng, Tian; Hu, Inchi

    2011-11-29

    The advance of high-throughput next-generation sequencing technology makes possible the analysis of rare variants. However, the investigation of rare variants in unrelated-individuals data sets faces the challenge of low power, and most methods circumvent the difficulty by using various collapsing procedures based on genes, pathways, or gene clusters. We suggest a new way to identify causal rare variants using the F-statistic and sliced inverse regression. The procedure is tested on the data set provided by the Genetic Analysis Workshop 17 (GAW17). After preliminary data reduction, we ranked markers according to their F-statistic values. Top-ranked markers were then subjected to sliced inverse regression, and those with higher absolute coefficients in the most significant sliced inverse regression direction were selected. The procedure yields good false discovery rates for the GAW17 data and thus is a promising method for future study on rare variants.

  9. Quantitative evaluation of cross correlation between two finite-length time series with applications to single-molecule FRET.

    PubMed

    Hanson, Jeffery A; Yang, Haw

    2008-11-06

    The statistical properties of the cross correlation between two time series have been studied. An analytical expression for the cross correlation function's variance has been derived. On the basis of these results, a statistically robust method has been proposed to detect the existence and determine the direction of cross correlation between two time series. The proposed method has been characterized by computer simulations. Applications to single-molecule fluorescence spectroscopy are discussed. The results may also find immediate applications in fluorescence correlation spectroscopy (FCS) and its variants.
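
    For readers unfamiliar with the quantity being analyzed, the Python sketch below computes a plain sample cross correlation between two equal-length series at a given lag and uses the location of its peak to suggest which series leads. It is a generic textbook estimator, not the authors' method: the analytical variance expression and the significance test described in the abstract are not reproduced. The function name and the data are made up for illustration.

```python
import math


def cross_correlation(x, y, lag):
    """Normalized sample cross correlation between x(t) and y(t + lag).

    Uses the biased estimator (division by the full series length), so
    values shrink slightly as |lag| grows.
    """
    n = len(x)
    if len(y) != n:
        raise ValueError("series must have equal length")
    if abs(lag) >= n:
        raise ValueError("lag must be smaller than the series length")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    std_x = math.sqrt(sum((v - mean_x) ** 2 for v in x) / n)
    std_y = math.sqrt(sum((v - mean_y) ** 2 for v in y) / n)
    if lag >= 0:
        pairs = zip(x[: n - lag], y[lag:])
    else:
        pairs = zip(x[-lag:], y[: n + lag])
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in pairs) / n
    return covariance / (std_x * std_y)


# Made-up example: y is a copy of x delayed by one time step.
x = [0, 0, 1, 3, 1, 0, 0, 0]
y = [0, 0, 0, 1, 3, 1, 0, 0]

best_lag = max(range(-3, 4), key=lambda k: cross_correlation(x, y, k))
print(best_lag, round(cross_correlation(x, y, best_lag), 3))
```

    A peak at a positive lag indicates that changes in x precede those in y. With short, noisy records such a peak can arise by chance, which is why a variance estimate of the kind derived in the paper is needed before drawing conclusions.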

  10. CCND2, CTNNB1, DDX3X, GLI2, SMARCA4, MYC, MYCN, PTCH1, TP53, and MLL2 gene variants and risk of childhood medulloblastoma.

    PubMed

    Dahlin, Anna M; Hollegaard, Mads V; Wibom, Carl; Andersson, Ulrika; Hougaard, David M; Deltour, Isabelle; Hjalmars, Ulf; Melin, Beatrice

    2015-10-01

    Recent studies have described a number of genes that are frequently altered in medulloblastoma tumors and that have putative key roles in the development of the disease. We hypothesized that common germline genetic variations in these genes may be associated with medulloblastoma development. Based on recent publications, we selected 10 genes that were frequently altered in medulloblastoma: CCND2, CTNNB1, DDX3X, GLI2, SMARCA4, MYC, MYCN, PTCH1, TP53, and MLL2 (now renamed as KMT2D). Common genetic variants (single nucleotide polymorphisms) annotating these genes (n = 221) were genotyped in germline DNA (neonatal dried blood spot samples) from 243 childhood medulloblastoma cases and 247 control subjects from Sweden and Denmark. Eight genetic variants annotating three genes in the sonic hedgehog signaling pathway (CCND2, PTCH1, and GLI2) were found to be associated with the risk of medulloblastoma (P(combined) < 0.05). The findings were, however, not statistically significant following correction for multiple testing by the very stringent Bonferroni method. The results do not support our hypothesis that common germline genetic variants in the ten studied genes are associated with the risk of developing medulloblastoma.
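
    To see why nominal associations at P < 0.05 do not survive correction here, the Bonferroni threshold can be computed directly. The Python sketch below assumes a family-wise error rate of 0.05 and that all 221 genotyped variants were counted as tests; the abstract states neither, so both numbers are assumptions, and the function name is hypothetical.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold that keeps the family-wise
    error rate at or below alpha across n_tests comparisons."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


# Assumed inputs: alpha = 0.05 and one test per genotyped variant (n = 221).
threshold = bonferroni_threshold(0.05, 221)
print(f"per-variant threshold: {threshold:.2e}")
```

    A variant with a nominal P value just under 0.05 falls short of this threshold by roughly two orders of magnitude, which is consistent with the abstract's description of the correction as very stringent.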

  11. Proteomic analysis of hair shafts from monozygotic twins: Expression profiles and genetically variant peptides.

    PubMed

    Wu, Pei-Wen; Mason, Katelyn E; Durbin-Johnson, Blythe P; Salemi, Michelle; Phinney, Brett S; Rocke, David M; Parker, Glendon J; Rice, Robert H

    2017-07-01

    Forensic association of hair shaft evidence with individuals is currently assessed by comparing mitochondrial DNA haplotypes of reference and casework samples, primarily for exclusionary purposes. Present work tests and validates more recent proteomic approaches to extract quantitative transcriptional and genetic information from hair samples of monozygotic twin pairs, which would be predicted to partition away from unrelated individuals if the datasets contain identifying information. Protein expression profiles and polymorphic, genetically variant hair peptides were generated from ten pairs of monozygotic twins. Profiling using the protein tryptic digests revealed that samples from identical twins had typically an order of magnitude fewer protein expression differences than unrelated individuals. The data did not indicate that the degree of difference within twin pairs increased with age. In parallel, data from the digests were used to detect genetically variant peptides that result from common nonsynonymous single nucleotide polymorphisms in genes expressed in the hair follicle. Compilation of the variants permitted sorting of the samples by hierarchical clustering, permitting accurate matching of twin pairs. The results demonstrate that genetic differences are detectable by proteomic methods and provide a framework for developing quantitative statistical estimates of personal identification that increase the value of hair shaft evidence. © 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  12. Nonsyndromic cleft lip with or without cleft palate: Increased burden of rare variants within Gremlin-1, a component of the bone morphogenetic protein 4 pathway.

    PubMed

    Al Chawa, Taofik; Ludwig, Kerstin U; Fier, Heide; Pötzsch, Bernd; Reich, Rudolf H; Schmidt, Gül; Braumann, Bert; Daratsianos, Nikolaos; Böhmer, Anne C; Schuencke, Hannah; Alblas, Margrieta; Fricker, Nadine; Hoffmann, Per; Knapp, Michael; Lange, Christoph; Nöthen, Markus M; Mangold, Elisabeth

    2014-06-01

    The genes Gremlin-1 (GREM1) and Noggin (NOG) are components of the bone morphogenetic protein 4 pathway, which has been implicated in craniofacial development. Both genes map to recently identified susceptibility loci (chromosomal region 15q13, 17q22) for nonsyndromic cleft lip with or without cleft palate (nsCL/P). The aim of the present study was to determine whether rare variants in either gene are implicated in nsCL/P etiology. The complete coding regions, untranslated regions, and splice sites of GREM1 and NOG were sequenced in 96 nsCL/P patients and 96 controls of Central European ethnicity. Three burden and four nonburden tests were performed. Statistically significant results were followed up in a second case-control sample (n = 96, respectively). For rare variants observed in cases, segregation analyses were performed. In NOG, four rare sequence variants (minor allele frequency < 1%) were identified. Here, burden and nonburden analyses generated nonsignificant results. In GREM1, 33 variants were identified, 15 of which were rare. Of these, five were novel. Significant p-values were generated in three nonburden analyses. Segregation analyses revealed incomplete penetrance for all variants investigated. Our study did not provide support for NOG being the causal gene at 17q22. However, the observation of a significant excess of rare variants in GREM1 supports the hypothesis that this is the causal gene at chr. 15q13. Because no single causal variant was identified, future sequencing analyses of GREM1 should involve larger samples and the investigation of regulatory elements. © 2014 Wiley Periodicals, Inc.

  13. A subregion-based burden test for simultaneous identification of susceptibility loci and subregions within.

    PubMed

    Zhu, Bin; Mirabello, Lisa; Chatterjee, Nilanjan

    2018-06-22

    In rare variant association studies, aggregating rare and/or low frequency variants, may increase statistical power for detection of the underlying susceptibility gene or region. However, it is unclear which variants, or class of them, in a gene contribute most to the association. We proposed a subregion-based burden test (REBET) to simultaneously select susceptibility genes and identify important underlying subregions. The subregions are predefined by shared common biologic characteristics, such as the protein domain or functional impact. Based on a subset-based approach considering local correlations between combinations of test statistics of subregions, REBET is able to properly control the type I error rate while adjusting for multiple comparisons in a computationally efficient manner. Simulation studies show that REBET can achieve power competitive to alternative methods when rare variants cluster within subregions. In two case studies, REBET is able to identify known disease susceptibility genes, and more importantly pinpoint the unreported most susceptible subregions, which represent protein domains essential for gene function. R package REBET is available at https://dceg.cancer.gov/tools/analysis/rebet. Published 2018. This article is a U.S. Government work and is in the public domain in the USA.

  14. Genome-Wide Association Study for Susceptibility to and Recoverability From Mastitis in Danish Holstein Cows

    PubMed Central

    Welderufael, B. G.; Løvendahl, Peter; de Koning, Dirk-Jan; Janss, Lucas L. G.; Fikse, W. F.

    2018-01-01

    Because mastitis is very frequent and unavoidable, adding recovery information into the analysis for genetic evaluation of mastitis is of great interest from economic and animal welfare points of view. Here we have performed genome-wide association studies (GWAS) to identify associated single nucleotide polymorphisms (SNPs) and investigate the genetic background not only for susceptibility to – but also for recoverability from mastitis. Somatic cell count records from 993 Danish Holstein cows genotyped for a total of 39378 autosomal SNP markers were used for the association analysis. Single SNP regression analysis was performed using the statistical software package DMU. Substitution effect of each SNP was tested with a t-test and a genome-wide significance level of P-value < 10(-4) was used to declare significant SNP-trait association. A number of significant SNP variants were identified for both traits. Many of the SNP variants associated either with susceptibility to – or recoverability from mastitis were located in or very near to genes that have been reported for their role in the immune system. Genes involved in lymphocyte developments (e.g., MAST3 and STAB2) and genes involved in macrophage recruitment and regulation of inflammations (PDGFD and PTX3) were suggested as possible causal genes for susceptibility to – and recoverability from mastitis, respectively. However, this is the first GWAS study for recoverability from mastitis and our results need to be validated. The findings in the current study are, therefore, a starting point for further investigations in identifying causal genetic variants or chromosomal regions for both susceptibility to – and recoverability from mastitis. PMID:29755506
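
    The single-SNP regression with a t-test described above can be sketched as follows. This is not the DMU software and not the authors' model (which may include additional fixed and random effects); it is a plain least-squares regression of a phenotype on allele dosage. The function name and toy data are hypothetical, and the p-value uses a normal approximation to the t distribution, which is only reasonable for large samples.

```python
import math
from statistics import NormalDist


def single_snp_test(dosage, phenotype):
    """Regress phenotype on allele dosage (0/1/2) and test the slope.

    Returns (beta, t, p). The two-sided p-value uses a normal
    approximation to the t distribution (assumes a large sample).
    """
    n = len(dosage)
    mean_x = sum(dosage) / n
    mean_y = sum(phenotype) / n
    sxx = sum((x - mean_x) ** 2 for x in dosage)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dosage, phenotype))
    beta = sxy / sxx                      # allele substitution effect
    intercept = mean_y - beta * mean_x
    rss = sum((y - intercept - beta * x) ** 2 for x, y in zip(dosage, phenotype))
    se = math.sqrt(rss / (n - 2) / sxx)   # standard error of the slope
    t = beta / se
    p = 2 * (1 - NormalDist().cdf(abs(t)))
    return beta, t, p


# Hypothetical toy data: 120 animals, true effect 0.5 per allele copy.
toy_dosage = [0, 0, 1, 1, 2, 2] * 20
toy_noise = [0.1, -0.1] * 60
toy_pheno = [0.5 * x + e for x, e in zip(toy_dosage, toy_noise)]

beta, t, p = single_snp_test(toy_dosage, toy_pheno)
print(f"beta={beta:.3f}, t={t:.1f}, significant at 1e-4: {p < 1e-4}")
```

    In a genome-wide scan the same test would be repeated for each marker and the resulting p-values compared against the chosen threshold.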

  15. Single-cell whole exome and targeted sequencing in NPM1/FLT3 positive pediatric acute myeloid leukemia.

    PubMed

    Walter, Christiane; Pozzorini, Christian; Reinhardt, Katarina; Geffers, Robert; Xu, Zhenyu; Reinhardt, Dirk; von Neuhoff, Nils; Hanenberg, Helmut

    2018-02-01

    The small portion of leukemic stem cells (LSCs) in acute myeloid leukemia (AML) present in children and adolescents is often masked by the high background of AML blasts and normal hematopoietic cells. The aim of the current study was to establish a simple workflow for reliable genetic analysis of single LSC-enriched blasts from pediatric patients. For three AMLs with mutations in nucleophosmin 1 and/or fms-like tyrosine kinase 3, we performed whole genome amplification on sorted single-cell DNA followed by whole exome sequencing (WES). The corresponding bulk bone marrow DNAs were also analyzed by WES and by targeted sequencing (TS) that included 54 genes associated with myeloid malignancies. Analysis revealed that read coverage statistics were comparable between single-cell and bulk WES data, indicating high-quality whole genome amplification. From 102 single-cell variants, 72 single nucleotide variants and insertions or deletions (70%) were consistently found in the two bulk DNA analyses. Variants reliably detected in single cells were also present in TS. However, initial screening by WES with read counts between 50-72× failed to detect rare AML subclones in the bulk DNAs. In summary, our study demonstrated that single-cell WES combined with bulk DNA TS is a promising tool set for detecting AML subclones and possibly LSCs. © 2017 Wiley Periodicals, Inc.

  16. Trans-Ethnic Meta-Analysis Identifies Common and Rare Variants Associated with Hepatocyte Growth Factor Levels in the Multi-Ethnic Study of Atherosclerosis (MESA)

    PubMed Central

    Larson, Nicholas B.; Berardi, Cecilia; Decker, Paul A.; Wassel, Christina L.; Kirsch, Phillip S.; Pankow, James S.; Sale, Michele M.; de Andrade, Mariza; Sicotte, Hugues; Tang, Weihong; Hanson, Naomi Q.; Tsai, Michael Y.; Taylor, Kent D.; Bielinski, Suzette J.

    2015-01-01

    Summary Hepatocyte growth factor (HGF) is a mesenchyme-derived pleiotropic factor that regulates cell growth, motility, mitogenesis, and morphogenesis in a variety of cells, and increased serum levels of HGF have been linked to a number of clinical and subclinical cardiovascular disease phenotypes. However, little is currently known regarding what genetic factors influence HGF levels, despite evidence of substantial genetic contributions to HGF variation. Based upon ethnicity-stratified single-variant association analysis and trans-ethnic meta-analysis of 6201 participants of the Multi-Ethnic Study of Atherosclerosis (MESA), we discovered five statistically significant common and low-frequency variants: HGF missense polymorphism rs5745687 (p.E299K) as well as four variants (rs16844364, rs4690098, rs114303452, rs3748034) within or in proximity to HGFAC. We also identified two significant ethnicity-specific gene-level associations (A1BG in African Americans; FASN in Chinese Americans) based upon low-frequency/rare variants, while meta-analysis of gene-level results identified a significant association for HGFAC. However, identified single-variant associations explained modest proportions of the total trait variation and were not significantly associated with coronary artery calcium or coronary heart disease. Our findings indicate genetic factors influencing circulating HGF levels may be complex and ethnically diverse. PMID:25998175

  17. Allelic Variants of Melanocortin 3 Receptor Gene (MC3R) and Weight Loss in Obesity: A Randomised Trial of Hypo-Energetic High- versus Low-Fat Diets

    PubMed Central

    Santos, José L.; De la Cruz, Rolando; Holst, Claus; Grau, Katrine; Naranjo, Carolina; Maiz, Alberto; Astrup, Arne; Saris, Wim H. M.; MacDonald, Ian; Oppert, Jean-Michel; Hansen, Torben; Pedersen, Oluf; Sorensen, Thorkild I. A.; Martinez, J. Alfredo

    2011-01-01

    Introduction The melanocortin system plays an important role in energy homeostasis. Mice genetically deficient in the melanocortin-3 receptor gene have a normal body weight with increased body fat, mild hypophagia compared to wild-type mice. In humans, Thr6Lys and Val81Ile variants of the melanocortin-3 receptor gene (MC3R) have been associated with childhood obesity, higher BMI Z-score and elevated body fat percentage compared to non-carriers. The aim of this study is to assess the association in adults between allelic variants of MC3R with weight loss induced by energy-restricted diets. Subjects and Methods This research is based on the NUGENOB study, a trial conducted to assess weight loss during a 10-week dietary intervention involving two different hypo-energetic (high-fat and low-fat) diets. A total of 760 obese patients were genotyped for 10 single nucleotide polymorphisms covering the single exon of MC3R gene and its flanking regions, including the missense variants Thr6Lys and Val81Ile. Linear mixed models and haplotype-based analysis were carried out to assess the potential association between genetic polymorphisms and differential weight loss, fat mass loss, waist change and resting energy expenditure changes. Results No differences in drop-out rate were found by MC3R genotypes. The rs6014646 polymorphism was significantly associated with weight loss using co-dominant (p = 0.04) and dominant models (p = 0.03). These p-values were not statistically significant after strict control for multiple testing. Haplotype-based multivariate analysis using permutations showed that rs3827103–rs1543873 (p = 0.06), rs6014646–rs6024730 (p = 0.05) and rs3746619–rs3827103 (p = 0.10) displayed near-statistical significant results in relation to weight loss. No other significant associations or gene*diet interactions were detected for weight loss, fat mass loss, waist change and resting energy expenditure changes. 
    Conclusion The study provided overall sufficient evidence to support that there is no major effect of genetic variants of MC3R on differential weight loss after a 10-week dietary intervention with hypo-energetic diets in obese Europeans. PMID:21695122

  18. [Statistical analysis of body and lung mass of animals subjected to a single experimental insufflation of soil dust and electro-energetic ashes].

    PubMed

    Matysiak, W; Królikowska-Prasał, I; Staszyc, J; Kifer, E; Romanowska-Sarlej, J

    1989-01-01

    The studies were performed on 44 white female Wistar rats which were intratracheally administered the suspension of the soil dust and the electro-energetic ashes. The electro-energetic ashes were collected from 6 different local heat and power generating plants while the soil dust from several random places of our country. The statistical analysis of the body and the lung mass of the animals subjected to the single dust and ash insufflation was performed. The applied variants proved the statistically significant differences between the body and the lung mass. The observed differences are connected with the kinds of dust and ash used in the experiment.

  19. Is High-Density Lipoprotein Cholesterol Causally Related to Kidney Function? Evidence From Genetic Epidemiological Studies.

    PubMed

    Coassin, Stefan; Friedel, Salome; Köttgen, Anna; Lamina, Claudia; Kronenberg, Florian

    2016-11-01

    A recent observational study with almost 2 million men reported an association between low high-density lipoprotein (HDL) cholesterol and worse kidney function. The causality of this association would be strongly supported if genetic variants associated with HDL cholesterol were also associated with kidney function. We used 68 genetic variants (single-nucleotide polymorphisms [SNPs]) associated with HDL cholesterol in genome-wide association studies including >188 000 subjects and tested their association with estimated glomerular filtration rate (eGFR) using summary statistics from another genome-wide association study meta-analysis of kidney function including ≤133 413 subjects. Fourteen of the 68 SNPs (21%) had a P value <0.05 compared with the 5% expected by chance (Binomial test P=5.8×10(-6)). After Bonferroni correction, 6 SNPs were still significantly associated with eGFR. The genetic variants with the strongest associations with HDL cholesterol concentrations were not the same as those with the strongest association with kidney function and vice versa. An evaluation of pleiotropy indicated that the effects of the HDL-associated SNPs on eGFR were not mediated by HDL cholesterol. In addition, we performed a Mendelian randomization analysis. This analysis revealed a positive but nonsignificant causal effect of HDL cholesterol-increasing variants on eGFR. In summary, our findings indicate that HDL cholesterol does not causally influence eGFR and propose pleiotropic effects on eGFR for some HDL cholesterol-associated SNPs. This may cause the observed association by mechanisms other than the mere HDL cholesterol concentration. © 2016 The Authors.
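
    The binomial enrichment test quoted above (14 of 68 SNPs with P < 0.05 against 5% expected by chance) can be reproduced approximately with a short sketch. It assumes a one-sided upper-tail test, which the abstract does not specify; the function name is hypothetical.

```python
from math import comb


def binomial_upper_tail(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p), summed exactly."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# 14 of 68 SNPs nominally significant, 5% expected under the null.
p_value = binomial_upper_tail(14, 68, 0.05)
print(f"one-sided binomial P = {p_value:.2e}")
```

    Under the one-sided assumption the result should be on the order of 10(-6), in line with the value reported in the abstract.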

  20. Meta-analysis of correlated traits via summary statistics from GWASs with an application in hypertension.

    PubMed

    Zhu, Xiaofeng; Feng, Tao; Tayo, Bamidele O; Liang, Jingjing; Young, J Hunter; Franceschini, Nora; Smith, Jennifer A; Yanek, Lisa R; Sun, Yan V; Edwards, Todd L; Chen, Wei; Nalls, Mike; Fox, Ervin; Sale, Michele; Bottinger, Erwin; Rotimi, Charles; Liu, Yongmei; McKnight, Barbara; Liu, Kiang; Arnett, Donna K; Chakravati, Aravinda; Cooper, Richard S; Redline, Susan

    2015-01-08

    Genome-wide association studies (GWASs) have identified many genetic variants underlying complex traits. Many detected genetic loci harbor variants that associate with multiple, even distinct, traits. Most current analysis approaches focus on single traits, even though the final results from multiple traits are evaluated together. Such approaches miss the opportunity to systemically integrate the phenome-wide data available for genetic association analysis. In this study, we propose a general approach that can integrate association evidence from summary statistics of multiple traits, either correlated, independent, continuous, or binary traits, which might come from the same or different studies. We allow for trait heterogeneity effects. Population structure and cryptic relatedness can also be controlled. Our simulations suggest that the proposed method has improved statistical power over single-trait analysis in most of the cases we studied. We applied our method to the Continental Origins and Genetic Epidemiology Network (COGENT) African ancestry samples for three blood pressure traits and identified four loci (CHIC2, HOXA-EVX1, IGFBP1/IGFBP3, and CDH17; p < 5.0 × 10(-8)) associated with hypertension-related traits that were missed by a single-trait analysis in the original report. Six additional loci with suggestive association evidence (p < 5.0 × 10(-7)) were also observed, including CACNA1D and WNT3. Our study strongly suggests that analyzing multiple phenotypes can improve statistical power and that such analysis can be executed with the summary statistics from GWASs. Our method also provides a way to study a cross phenotype (CP) association by using summary statistics from GWASs of multiple phenotypes. Copyright © 2015 The American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.

  1. Adaptive Set-Based Methods for Association Testing.

    PubMed

    Su, Yu-Chen; Gauderman, William James; Berhane, Kiros; Lewinger, Juan Pablo

    2016-02-01

    With a typical sample size of a few thousand subjects, a single genome-wide association study (GWAS) using traditional one single nucleotide polymorphism (SNP)-at-a-time methods can only detect genetic variants conferring a sizable effect on disease risk. Set-based methods, which analyze sets of SNPs jointly, can detect variants with smaller effects acting within a gene, a pathway, or other biologically relevant sets. Although self-contained set-based methods (those that test sets of variants without regard to variants not in the set) are generally more powerful than competitive set-based approaches (those that rely on comparison of variants in the set of interest with variants not in the set), there is no consensus as to which self-contained methods are best. In particular, several self-contained set tests have been proposed to directly or indirectly "adapt" to the a priori unknown proportion and distribution of effects of the truly associated SNPs in the set, which is a major determinant of their power. A popular adaptive set-based test is the adaptive rank truncated product (ARTP), which seeks the set of SNPs that yields the best-combined evidence of association. We compared the standard ARTP, several ARTP variations we introduced, and other adaptive methods in a comprehensive simulation study to evaluate their performance. We used permutations to assess significance for all the methods and thus provide a level playing field for comparison. We found the standard ARTP test to have the highest power across our simulations followed closely by the global model of random effects (GMRE) and a least absolute shrinkage and selection operator (LASSO)-based test. © 2015 WILEY PERIODICALS, INC.
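
    The adaptive rank truncated product idea described above can be sketched as follows. This is a simplified illustration, not the authors' implementation or any of their ARTP variations: it assumes the per-SNP p-values for the observed data and for each permutation are already available, and all function names, truncation points, and simulated inputs are hypothetical.

```python
import math
import random


def rtp_statistics(p_values, truncation_points):
    """Log rank-truncated products: for each K, the sum of the logs of the K smallest p-values."""
    logs = sorted(math.log(p) for p in p_values)
    return [sum(logs[:k]) for k in truncation_points]


def artp(observed_p, permuted_p, truncation_points):
    """Set-level p-value that adapts over the truncation point K.

    observed_p: per-SNP p-values from the real data.
    permuted_p: list of per-SNP p-value lists, one per permutation.
    """
    # Row 0 holds the observed statistics, rows 1..B the permuted ones.
    stats = [rtp_statistics(observed_p, truncation_points)]
    stats += [rtp_statistics(row, truncation_points) for row in permuted_p]
    n_rows = len(stats)

    # First layer: for every row, the smallest empirical p-value over K.
    min_p = []
    for row in stats:
        per_k = []
        for j in range(len(truncation_points)):
            as_extreme = sum(1 for other in stats if other[j] <= row[j])
            per_k.append(as_extreme / n_rows)
        min_p.append(min(per_k))

    # Second layer: reuse the same permutations to calibrate the minimum.
    return sum(1 for m in min_p if m <= min_p[0]) / n_rows


# Simulated null p-values stand in for real permutation results.
random.seed(0)
null_perms = [[1 - random.random() for _ in range(10)] for _ in range(199)]
signal = [1e-6, 1e-5, 1e-4, 0.3, 0.5, 0.7, 0.9, 0.2, 0.6, 0.8]

set_p = artp(signal, null_perms, truncation_points=[1, 3, 5])
print(f"set-level p-value: {set_p}")
```

    Reusing one pool of permutations for both layers keeps the cost at a single round of permutation, at the price of a resolution limited to 1/(B + 1) for B permutations.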

  2. Deep Sequencing of 71 Candidate Genes to Characterize Variation Associated with Alcohol Dependence.

    PubMed

    Clark, Shaunna L; McClay, Joseph L; Adkins, Daniel E; Kumar, Gaurav; Aberg, Karolina A; Nerella, Srilaxmi; Xie, Linying; Collins, Ann L; Crowley, James J; Quackenbush, Corey R; Hilliard, Christopher E; Shabalin, Andrey A; Vrieze, Scott I; Peterson, Roseann E; Copeland, William E; Silberg, Judy L; McGue, Matt; Maes, Hermine; Iacono, William G; Sullivan, Patrick F; Costello, Elizabeth J; van den Oord, Edwin J

    2017-04-01

    Previous genomewide association studies (GWASs) have identified a number of putative risk loci for alcohol dependence (AD). However, only a few loci have replicated and these replicated variants only explain a small proportion of AD risk. Using an innovative approach, the goal of this study was to generate hypotheses about potentially causal variants for AD that can be explored further through functional studies. We employed targeted capture of 71 candidate loci and flanking regions followed by next-generation deep sequencing (mean coverage 78X) in 806 European Americans. Regions included in our targeted capture library were genes identified through published GWAS of alcohol, all human alcohol and aldehyde dehydrogenases, reward system genes including dopaminergic and opioid receptors, prioritized candidate genes based on previous associations, and genes involved in the absorption, distribution, metabolism, and excretion of drugs. We performed single-locus tests to determine if any single variant was associated with AD symptom count. Sets of variants that overlapped with biologically meaningful annotations were tested for association in aggregate. No single, common variant was significantly associated with AD in our study. We did, however, find evidence for association with several variant sets. Two variant sets were significant at the q-value <0.10 level: a genic enhancer for ADHFE1 (p = 1.47 × 10(-5); q = 0.019), an alcohol dehydrogenase, and ADORA1 (p = 5.29 × 10(-5); q = 0.035), an adenosine receptor that belongs to a G-protein-coupled receptor gene family. To our knowledge, this is the first sequencing study of AD to examine variants in entire genes, including flanking and regulatory regions. We found that in addition to protein coding variant sets, regulatory variant sets may play a role in AD. From these findings, we have generated initial functional hypotheses about how these sets may influence AD. Copyright © 2017 by the Research Society on Alcoholism.

  3. Integrating multiple genomic data to predict disease-causing nonsynonymous single nucleotide variants in exome sequencing studies.

    PubMed

    Wu, Jiaxin; Li, Yanda; Jiang, Rui

    2014-03-01

    Exome sequencing has been widely used in detecting pathogenic nonsynonymous single nucleotide variants (SNVs) for human inherited diseases. However, traditional statistical genetics methods are ineffective in analyzing exome sequencing data, due to such facts as the large number of sequenced variants, the presence of non-negligible fraction of pathogenic rare variants or de novo mutations, and the limited size of affected and normal populations. Indeed, prevalent applications of exome sequencing have been appealing for an effective computational method for identifying causative nonsynonymous SNVs from a large number of sequenced variants. Here, we propose a bioinformatics approach called SPRING (Snv PRioritization via the INtegration of Genomic data) for identifying pathogenic nonsynonymous SNVs for a given query disease. Based on six functional effect scores calculated by existing methods (SIFT, PolyPhen2, LRT, MutationTaster, GERP and PhyloP) and five association scores derived from a variety of genomic data sources (gene ontology, protein-protein interactions, protein sequences, protein domain annotations and gene pathway annotations), SPRING calculates the statistical significance that an SNV is causative for a query disease and hence provides a means of prioritizing candidate SNVs. With a series of comprehensive validation experiments, we demonstrate that SPRING is valid for diseases whose genetic bases are either partly known or completely unknown and effective for diseases with a variety of inheritance styles. In applications of our method to real exome sequencing data sets, we show the capability of SPRING in detecting causative de novo mutations for autism, epileptic encephalopathies and intellectual disability. We further provide an online service, the standalone software and genome-wide predictions of causative SNVs for 5,080 diseases at http://bioinfo.au.tsinghua.edu.cn/spring.

  4. Kinetic analysis of single molecule FRET transitions without trajectories

    NASA Astrophysics Data System (ADS)

    Schrangl, Lukas; Göhring, Janett; Schütz, Gerhard J.

    2018-03-01

    Single molecule Förster resonance energy transfer (smFRET) is a popular tool to study biological systems that undergo topological transitions on the nanometer scale. smFRET experiments typically require recording of long smFRET trajectories and subsequent statistical analysis to extract parameters such as the states' lifetimes. Alternatively, analysis of probability distributions exploits the shapes of smFRET distributions at well chosen exposure times and hence works without the acquisition of time traces. Here, we describe a variant that utilizes statistical tests to compare experimental datasets with Monte Carlo simulations. For a given model, parameters are varied to cover the full realistic parameter space. As output, the method yields p-values which quantify the likelihood for each parameter setting to be consistent with the experimental data. The method provides suitable results even if the actual lifetimes differ by an order of magnitude. We also demonstrated the robustness of the method to inaccurately determined input parameters. As proof of concept, the new method was applied to the determination of transition rate constants for Holliday junctions.

  5. Joint Identification of Genetic Variants for Physical Activity in Korean Population

    PubMed Central

    Kim, Jayoun; Kim, Jaehee; Min, Haesook; Oh, Sohee; Kim, Yeonjung; Lee, Andy H.; Park, Taesung

    2014-01-01

    There has been limited research on genome-wide association with physical activity (PA). This study ascertained genetic associations between PA and 344,893 single nucleotide polymorphism (SNP) markers in 8842 Korean samples. PA data were obtained from a validated questionnaire that included information on PA intensity and duration. Metabolic equivalent of tasks were calculated to estimate the total daily PA level for each individual. In addition to single- and multiple-SNP association tests, a pathway enrichment analysis was performed to identify the biological significance of SNP markers. Although no significant SNP was found at genome-wide significance level via single-SNP association tests, 59 genetic variants mapped to 76 genes were identified via a multiple SNP approach using a bootstrap selection stability measure. Pathway analysis for these 59 variants showed that maturity onset diabetes of the young (MODY) was enriched. Joint identification of SNPs could enable the identification of multiple SNPs with good predictive power for PA and a pathway enriched for PA. PMID:25026172

  6. FARVATX: FAmily-based Rare Variant Association Test for X-linked genes

    PubMed Central

    Choi, Sungkyoung; Lee, Sungyoung; Qiao, Dandi; Hardin, Megan; Cho, Michael H.; Silverman, Edwin K; Park, Taesung; Won, Sungho

    2016-01-01

    Although the X chromosome has many genes that are functionally related to human diseases, the complicated biological properties of the X chromosome have prevented efficient genetic association analyses, and only a few significantly associated X-linked variants have been reported for complex traits. For instance, dosage compensation of X-linked genes is often achieved via the inactivation of one allele in each X-linked variant in females; however, some X-linked variants can escape this X chromosome inactivation. Efficient genetic analyses cannot be conducted without prior knowledge about the gene expression process of X-linked variants, and misspecified information can lead to power loss. In this report, we propose new statistical methods for rare X-linked variant genetic association analysis of dichotomous phenotypes with family-based samples. The proposed methods are computationally efficient and can complete X-linked analyses within a few hours. Simulation studies demonstrate the statistical efficiency of the proposed methods, which were then applied to rare-variant association analysis of the X chromosome in chronic obstructive pulmonary disease (COPD). Some promising significant X-linked genes were identified, illustrating the practical importance of the proposed methods. PMID:27325607

  7. FARVATX: Family-Based Rare Variant Association Test for X-Linked Genes.

    PubMed

    Choi, Sungkyoung; Lee, Sungyoung; Qiao, Dandi; Hardin, Megan; Cho, Michael H; Silverman, Edwin K; Park, Taesung; Won, Sungho

    2016-09-01

    Although the X chromosome has many genes that are functionally related to human diseases, the complicated biological properties of the X chromosome have prevented efficient genetic association analyses, and only a few significantly associated X-linked variants have been reported for complex traits. For instance, dosage compensation of X-linked genes is often achieved via the inactivation of one allele in each X-linked variant in females; however, some X-linked variants can escape this X chromosome inactivation. Efficient genetic analyses cannot be conducted without prior knowledge about the gene expression process of X-linked variants, and misspecified information can lead to power loss. In this report, we propose new statistical methods for rare X-linked variant genetic association analysis of dichotomous phenotypes with family-based samples. The proposed methods are computationally efficient and can complete X-linked analyses within a few hours. Simulation studies demonstrate the statistical efficiency of the proposed methods, which were then applied to rare-variant association analysis of the X chromosome in chronic obstructive pulmonary disease. Some promising significant X-linked genes were identified, illustrating the practical importance of the proposed methods. © 2016 WILEY PERIODICALS, INC.

  8. Single nucleotide polymorphisms associated with nonsyndromic cryptorchidism in Mexican patients.

    PubMed

    Chávez-Saldaña, M; Vigueras-Villaseñor, R M; Yokoyama-Rebollar, E; Landero-Huerta, D A; Rojas-Castañeda, J C; Taja-Chayeb, L; Cuevas-Alpuche, J O; Zambrano, E

    2018-02-01

    Cryptorchidism is a frequent genitourinary malformation considered as an important risk factor for infertility and testicular malignancy. The aetiology of cryptorchidism is multifactorial in which certain SNPs, capable of inhibiting the development of the gubernaculum, are implicated. We analysed 16 SNPs by allelic discrimination and automated sequencing in 85 patients and 99 healthy people, with the objective to identify the association between these variants and isolated cryptorchidism. In two different patients with unilateral cryptorchidism, we found the variants rs121912556 and p.R105R of INSL3 gene in a heterozygous form associated with cryptorchidism, so we could consider them as risk factors for cryptorchidism. On the other hand, SNPs rs10421916 of INSL3 gene, as well as the variants rs1555633 and rs7325513 in the RXFP2 gene, and rs3779456 variant of the HOXA10 gene were statistically significant when the patients and controls were compared and could be considered as protective factors since they are predominantly present in controls. The genotype-phenotype correlation did not show statistical significance. With these results, we could conclude that these polymorphisms can be considered as important variants in our population and would contribute to future knowledge of the aetiology and physiopathology of cryptorchidism. © 2017 Blackwell Verlag GmbH.

  9. Improving the detection of pathways in genome-wide association studies by combined effects of SNPs from Linkage Disequilibrium blocks.

    PubMed

    Zhao, Huiying; Nyholt, Dale R; Yang, Yuanhao; Wang, Jihua; Yang, Yuedong

    2017-06-14

    Genome-wide association studies (GWAS) have successfully identified single variants associated with diseases. To increase the power of GWAS, gene-based and pathway-based tests are commonly employed to detect more risk factors. However, the gene- and pathway-based association tests may be biased towards genes or pathways containing a large number of single-nucleotide polymorphisms (SNPs) with small P-values caused by high linkage disequilibrium (LD) correlations. To address such bias, numerous pathway-based methods have been developed. Here we propose a novel method, DGAT-path, to divide all SNPs assigned to genes in each pathway into LD blocks, and to sum the chi-square statistics of LD blocks for assessing the significance of the pathway by permutation tests. The method was proven robust with the type I error rate >1.6 times lower than other methods. Meanwhile, the method displays a higher power and is not biased by the pathway size. The applications to the GWAS summary statistics for schizophrenia and breast cancer indicate that the detected top pathways contain more genes close to associated SNPs than other methods. As a result, the method identified 17 and 12 significant pathways containing 20 and 21 novel associated genes, respectively, for the two diseases. The method is available online at http://sparks-lab.org/server/DGAT-path.

  10. Detection of clinically relevant copy-number variants by exome sequencing in a large cohort of genetic disorders

    PubMed Central

    Pfundt, Rolph; del Rosario, Marisol; Vissers, Lisenka E.L.M.; Kwint, Michael P.; Janssen, Irene M.; de Leeuw, Nicole; Yntema, Helger G.; Nelen, Marcel R.; Lugtenberg, Dorien; Kamsteeg, Erik-Jan; Wieskamp, Nienke; Stegmann, Alexander P.A.; Stevens, Servi J.C.; Rodenburg, Richard J.T.; Simons, Annet; Mensenkamp, Arjen R.; Rinne, Tuula; Gilissen, Christian; Scheffer, Hans; Veltman, Joris A.; Hehir-Kwa, Jayne Y.

    2017-01-01

    Purpose: Copy-number variation is a common source of genomic variation and an important genetic cause of disease. Microarray-based analysis of copy-number variants (CNVs) has become a first-tier diagnostic test for patients with neurodevelopmental disorders, with a diagnostic yield of 10–20%. However, for most other genetic disorders, the role of CNVs is less clear and most diagnostic genetic studies are generally limited to the study of single-nucleotide variants (SNVs) and other small variants. With the introduction of exome and genome sequencing, it is now possible to detect both SNVs and CNVs using an exome- or genome-wide approach with a single test. Methods: We performed exome-based read-depth CNV screening on data from 2,603 patients affected by a range of genetic disorders for which exome sequencing was performed in a diagnostic setting. Results: In total, 123 clinically relevant CNVs ranging in size from 727 bp to 15.3 Mb were detected, which resulted in 51 conclusive diagnoses and an overall increase in diagnostic yield of ~2% (ranging from 0 to 5.8% per disorder). Conclusions: This study shows that CNVs play an important role in a broad range of genetic disorders and that detection via exome-based CNV profiling results in an increase in the diagnostic yield without additional testing, bringing us closer to single-test genomics. Genet Med advance online publication 27 October 2016 PMID:28574513

  11. A systematic approach to assessing the clinical significance of genetic variants.

    PubMed

    Duzkale, H; Shen, J; McLaughlin, H; Alfares, A; Kelly, M A; Pugh, T J; Funke, B H; Rehm, H L; Lebo, M S

    2013-11-01

    Molecular genetic testing informs diagnosis, prognosis, and risk assessment for patients and their family members. Recent advances in low-cost, high-throughput DNA sequencing and computing technologies have enabled the rapid expansion of genetic test content, resulting in dramatically increased numbers of DNA variants identified per test. To address this challenge, our laboratory has developed a systematic approach to thorough and efficient assessments of variants for pathogenicity determination. We first search for existing data in publications and databases including internal, collaborative and public resources. We then perform full evidence-based assessments through statistical analyses of observations in the general population and disease cohorts, evaluation of experimental data from in vivo or in vitro studies, and computational predictions of potential impacts of each variant. Finally, we weigh all evidence to reach an overall conclusion on the potential for each variant to be disease causing. In this report, we highlight the principles of variant assessment, address the caveats and pitfalls, and provide examples to illustrate the process. By sharing our experience and providing a framework for variant assessment, including access to a freely available customizable tool, we hope to help move towards standardized and consistent approaches to variant assessment. © 2013 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

  12. Regularized rare variant enrichment analysis for case-control exome sequencing data.

    PubMed

    Larson, Nicholas B; Schaid, Daniel J

    2014-02-01

    Rare variants have recently garnered an immense amount of attention in genetic association analysis. However, unlike methods traditionally used for single marker analysis in GWAS, rare variant analysis often requires some method of aggregation, since single marker approaches are poorly powered for typical sequencing study sample sizes. Advancements in sequencing technologies have rendered next-generation sequencing platforms a realistic alternative to traditional genotyping arrays. Exome sequencing in particular not only provides base-level resolution of genetic coding regions, but also a natural paradigm for aggregation via genes and exons. Here, we propose the use of penalized regression in combination with variant aggregation measures to identify rare variant enrichment in exome sequencing data. In contrast to marginal gene-level testing, we simultaneously evaluate the effects of rare variants in multiple genes, focusing on gene-based least absolute shrinkage and selection operator (LASSO) and exon-based sparse group LASSO models. By using gene membership as a grouping variable, the sparse group LASSO can be used as a gene-centric analysis of rare variants while also providing a penalized approach toward identifying specific regions of interest. We apply extensive simulations to evaluate the performance of these approaches with respect to specificity and sensitivity, comparing these results to multiple competing marginal testing methods. Finally, we discuss our findings and outline future research. © 2013 WILEY PERIODICALS, INC.

  13. Pleiotropic analysis of cancer risk loci on esophageal adenocarcinoma risk

    PubMed Central

    Lee, Eunjung; Stram, Daniel O.; Ek, Weronica E.; Onstad, Lynn E; MacGregor, Stuart; Gharahkhani, Puya; Ye, Weimin; Lagergren, Jesper; Shaheen, Nicholas J.; Murray, Liam J.; Hardie, Laura J; Gammon, Marilie D.; Chow, Wong-Ho; Risch, Harvey A.; Corley, Douglas A.; Levine, David M; Whiteman, David C.; Bernstein, Leslie; Bird, Nigel C.; Vaughan, Thomas L.; Wu, Anna H.

    2015-01-01

    Background Several cancer-associated loci identified from genome-wide association studies (GWAS) have been associated with risks of multiple cancer sites, suggesting pleiotropic effects. We investigated whether GWAS-identified risk variants for other common cancers are associated with risk of esophageal adenocarcinoma (EA) or its precursor, Barrett's esophagus (BE). Methods We examined the associations between risks of EA and BE and 387 single nucleotide polymorphisms (SNPs) that have been associated with risks of other cancers, by using genotype imputation data on 2,163 control participants and 3,885 (1,501 EA and 2,384 BE) case patients from the Barrett's and Esophageal Adenocarcinoma Genetic Susceptibility Study, and investigated effect modification by smoking history, body mass index (BMI), and reflux/heartburn. Results After correcting for multiple testing, none of the tested 387 SNPs were statistically significantly associated with risk of EA or BE. No evidence of effect modification by smoking, BMI, or reflux/heartburn was observed. Conclusions Genetic risk variants for common cancers identified from GWAS appear not to be associated with risks of EA or BE. Impact To our knowledge, this is the first investigation of pleiotropic genetic associations with risks of EA and BE. PMID:26364162

  14. Fine Mapping Causal Variants with an Approximate Bayesian Method Using Marginal Test Statistics

    PubMed Central

    Chen, Wenan; Larrabee, Beth R.; Ovsyannikova, Inna G.; Kennedy, Richard B.; Haralambieva, Iana H.; Poland, Gregory A.; Schaid, Daniel J.

    2015-01-01

    Two recently developed fine-mapping methods, CAVIAR and PAINTOR, demonstrate better performance over other fine-mapping methods. They also have the advantage of using only the marginal test statistics and the correlation among SNPs. Both methods leverage the fact that the marginal test statistics asymptotically follow a multivariate normal distribution and are likelihood based. However, their relationship with Bayesian fine mapping, such as BIMBAM, is not clear. In this study, we first show that CAVIAR and BIMBAM are actually approximately equivalent to each other. This leads to a fine-mapping method using marginal test statistics in the Bayesian framework, which we call CAVIAR Bayes factor (CAVIARBF). Another advantage of the Bayesian framework is that it can answer both association and fine-mapping questions. We also used simulations to compare CAVIARBF with other methods under different numbers of causal variants. The results showed that both CAVIARBF and BIMBAM have better performance than PAINTOR and other methods. Compared to BIMBAM, CAVIARBF has the advantage of using only marginal test statistics and takes about one-quarter to one-fifth of the running time. We applied different methods on two independent cohorts of the same phenotype. Results showed that CAVIARBF, BIMBAM, and PAINTOR selected the same top 3 SNPs; however, CAVIARBF and BIMBAM had better consistency in selecting the top 10 ranked SNPs between the two cohorts. Software is available at https://bitbucket.org/Wenan/caviarbf. PMID:25948564

  15. ZNF208 polymorphisms associated with ischemic stroke in a southern Chinese Han population.

    PubMed

    Yu, Jianzhong; Zhou, Feng; Luo, Dong; Wang, Nianzhen; Zhang, Chong; Jin, Tianbo; Liang, Xiongfei; Yu, Dan

    2017-01-01

    Ischemic stroke is one of the most common diseases with a high burden of neurological deficits, disability and death. Zinc finger protein 208 (ZNF208) was found to be involved in coronary heart disease, although little information is available about its association with ischemic stroke. We performed the present case-control study to clarify the association between single-nucleotide polymorphisms (SNPs) within ZNF208 and the risk of ischemic stroke in a southern Chinese Han population. A total of 799 subjects (400 cases and 399 healthy controls) were enrolled in the present study. Five SNPs within the ZNF208 gene were selected and genotyped using Sequenom MassARRAY technology (Sequenom, Inc., San Diego, CA, USA). Data management and statistical analyses were conducted using Sequenom Typer, version 4.0, and a chi-squared test, as well as unconditional logistic regression. Statistical results showed that three variants were associated with the risk of ischemic stroke under allele models (rs2188971, rs2188972, rs8103163 and rs7248488). The variant rs2188972 was also associated with the risk of ischemic stroke in a recessive model after adjustment for age and sex. Haplotype analysis suggested that a significant difference existed between the A rs2188972 T rs2188971 A rs8103163 A rs7248488 haplotype and the risk of ischemic stroke, although this disappeared after adjustment for sex and age. The results obtained in the present study indicate a potential association between ZNF208 variants and the risk of ischemic stroke in a southern Chinese Han population. Copyright © 2016 John Wiley & Sons, Ltd.

  16. REVEL: An Ensemble Method for Predicting the Pathogenicity of Rare Missense Variants.

    PubMed

    Ioannidis, Nilah M; Rothstein, Joseph H; Pejaver, Vikas; Middha, Sumit; McDonnell, Shannon K; Baheti, Saurabh; Musolf, Anthony; Li, Qing; Holzinger, Emily; Karyadi, Danielle; Cannon-Albright, Lisa A; Teerlink, Craig C; Stanford, Janet L; Isaacs, William B; Xu, Jianfeng; Cooney, Kathleen A; Lange, Ethan M; Schleutker, Johanna; Carpten, John D; Powell, Isaac J; Cussenot, Olivier; Cancel-Tassin, Geraldine; Giles, Graham G; MacInnis, Robert J; Maier, Christiane; Hsieh, Chih-Lin; Wiklund, Fredrik; Catalona, William J; Foulkes, William D; Mandal, Diptasri; Eeles, Rosalind A; Kote-Jarai, Zsofia; Bustamante, Carlos D; Schaid, Daniel J; Hastie, Trevor; Ostrander, Elaine A; Bailey-Wilson, Joan E; Radivojac, Predrag; Thibodeau, Stephen N; Whittemore, Alice S; Sieh, Weiva

    2016-10-06

    The vast majority of coding variants are rare, and assessment of the contribution of rare variants to complex traits is hampered by low statistical power and limited functional data. Improved methods for predicting the pathogenicity of rare coding variants are needed to facilitate the discovery of disease variants from exome sequencing studies. We developed REVEL (rare exome variant ensemble learner), an ensemble method for predicting the pathogenicity of missense variants on the basis of individual tools: MutPred, FATHMM, VEST, PolyPhen, SIFT, PROVEAN, MutationAssessor, MutationTaster, LRT, GERP, SiPhy, phyloP, and phastCons. REVEL was trained with recently discovered pathogenic and rare neutral missense variants, excluding those previously used to train its constituent tools. When applied to two independent test sets, REVEL had the best overall performance (p < 10^-12) as compared to any individual tool and seven ensemble methods: MetaSVM, MetaLR, KGGSeq, Condel, CADD, DANN, and Eigen. Importantly, REVEL also had the best performance for distinguishing pathogenic from rare neutral variants with allele frequencies <0.5%. The area under the receiver operating characteristic curve (AUC) for REVEL was 0.046-0.182 higher in an independent test set of 935 recent SwissVar disease variants and 123,935 putatively neutral exome sequencing variants and 0.027-0.143 higher in an independent test set of 1,953 pathogenic and 2,406 benign variants recently reported in ClinVar than the AUCs for other ensemble methods. We provide pre-computed REVEL scores for all possible human missense variants to facilitate the identification of pathogenic variants in the sea of rare variants discovered as sequencing studies expand in scale. Copyright © 2016 American Society of Human Genetics. All rights reserved.

  17. Validation of PDE9A Gene Identified in GWAS Showing Strong Association with Milk Production Traits in Chinese Holstein.

    PubMed

    Yang, Shao-Hua; Bi, Xiao-Jun; Xie, Yan; Li, Cong; Zhang, Sheng-Li; Zhang, Qin; Sun, Dong-Xiao

    2015-11-05

    Phosphodiesterase9A (PDE9A) is a cyclic guanosine monophosphate (cGMP)-specific enzyme widely expressed among the tissues, which is important in activating cGMP-dependent signaling pathways. In our previous genome-wide association study, a single nucleotide polymorphism (SNP) (BTA-55340-no-rs(b)) located in intron 14 of PDE9A was found to be significantly associated with protein yield. In addition, we found that PDE9A was highly expressed in mammary gland by analyzing its mRNA expression in different tissues. The objectives of this study were to identify genetic polymorphisms of PDE9A and to determine the effects of these variants on milk production traits in dairy cattle. DNA sequencing identified 11 single nucleotide polymorphisms (SNPs), and six SNPs in the 5' regulatory region were genotyped for the subsequent association analyses. After Bonferroni correction for multiple testing, all these identified SNPs were statistically significant for one or more milk production traits (p < 0.0001~0.0077). Interestingly, haplotype-based association analysis revealed similar effects on milk production traits (p < 0.01). In follow-up RNA expression analyses, two SNPs (c.-1376 G>A, c.-724 A>G) were involved in the regulation of gene expression. Consequently, our findings provide confirmatory evidence for associations of PDE9A variants with milk production traits, and these identified SNPs may serve as genetic markers to accelerate the Chinese Holstein breeding program.

  18. Strong association of common variants in the CDKN2A/CDKN2B region with type 2 diabetes in French Europids.

    PubMed

    Duesing, K; Fatemifar, G; Charpentier, G; Marre, M; Tichet, J; Hercberg, S; Balkau, B; Froguel, P; Gibson, F

    2008-05-01

    Genome-wide association studies (GWASs) recently identified common variants in the CDKN2A/CDKN2B region on chromosome 9p as being strongly associated with type 2 diabetes. Since these association signals were not picked up by the French-Canadian GWAS, we sought to replicate these findings in the French Europid population and to further characterise the susceptibility variants at this novel locus. We genotyped 20 single nucleotide polymorphisms (SNPs) spanning the CDKN2A/CDKN2B locus in our type 2 diabetes case-control cohort. The association between CDKN2A/CDKN2B SNPs and quantitative metabolic traits was also examined in the normoglycaemic participants comprising the control cohort. We report replication of the strong association of rs10811661 with type 2 diabetes found in the GWASs (p = 3.8 × 10^-7; OR 1.43 [95% CI 1.24-1.64]). The other CDKN2A/CDKN2B susceptibility variant, rs564398, did not attain statistical significance (p = 0.053; OR 1.11 [95% CI 1.00-1.24]) in the present study. We also obtained several additional nominal association signals (p < 0.05) at the CDKN2A/CDKN2B locus; however, only the rs3218018 result (p = 0.002) survived Bonferroni correction for multiple testing (adjusted p = 0.04). Our comprehensive association study of common variation spanning the CDKN2A/CDKN2B locus confirms the strong association between the distal susceptibility variant rs10811661 and type 2 diabetes in the French population. Further genetic and functional studies are required to identify the aetiological variants at this locus and determine the cellular and physiological mechanisms by which they act to modulate type 2 diabetes susceptibility.
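
    The Bonferroni figures in the abstract above can be checked with a few lines of arithmetic. This sketch assumes the correction was applied over the 20 genotyped SNPs; the abstract does not state the number of tests, but 0.002 × 20 = 0.04 matches the reported adjusted value. The function name is illustrative only.

```python
def bonferroni_adjust(p_value, n_tests):
    """Bonferroni-adjusted p-value: multiply by the number of tests, cap at 1."""
    return min(1.0, p_value * n_tests)


N_SNPS = 20  # assumption: one test per genotyped SNP

# Nominal p-values quoted in the abstract.
for snp, p in [("rs3218018", 0.002), ("rs564398", 0.053)]:
    print(snp, round(bonferroni_adjust(p, N_SNPS), 3))
```

    Under that assumption rs3218018 stays below 0.05 after adjustment, while the rs564398 product exceeds 1 and is capped.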

  19. The impact of rare variation on gene expression across tissues.

    PubMed

    Li, Xin; Kim, Yungil; Tsang, Emily K; Davis, Joe R; Damani, Farhan N; Chiang, Colby; Hess, Gaelen T; Zappala, Zachary; Strober, Benjamin J; Scott, Alexandra J; Li, Amy; Ganna, Andrea; Bassik, Michael C; Merker, Jason D; Hall, Ira M; Battle, Alexis; Montgomery, Stephen B

    2017-10-11

    Rare genetic variants are abundant in humans and are expected to contribute to individual disease risk. While genetic association studies have successfully identified common genetic variants associated with susceptibility, these studies are not practical for identifying rare variants. Efforts to distinguish pathogenic variants from benign rare variants have leveraged the genetic code to identify deleterious protein-coding alleles, but no analogous code exists for non-coding variants. Therefore, ascertaining which rare variants have phenotypic effects remains a major challenge. Rare non-coding variants have been associated with extreme gene expression in studies using single tissues, but their effects across tissues are unknown. Here we identify gene expression outliers, or individuals showing extreme expression levels for a particular gene, across 44 human tissues by using combined analyses of whole genomes and multi-tissue RNA-sequencing data from the Genotype-Tissue Expression (GTEx) project v6p release. We find that 58% of underexpression and 28% of overexpression outliers have nearby conserved rare variants compared to 8% of non-outliers. Additionally, we developed RIVER (RNA-informed variant effect on regulation), a Bayesian statistical model that incorporates expression data to predict a regulatory effect for rare variants with higher accuracy than models using genomic annotations alone. Overall, we demonstrate that rare variants contribute to large gene expression changes across tissues and provide an integrative method for interpretation of rare variants in individual genomes.

  20. Accurate computation of survival statistics in genome-wide studies.

    PubMed

    Vandin, Fabio; Papoutsaki, Alexandra; Raphael, Benjamin J; Upfal, Eli

    2015-05-01

    A key challenge in genomics is to identify genetic variants that distinguish patients with different survival time following diagnosis or treatment. While the log-rank test is widely used for this purpose, nearly all implementations of the log-rank test rely on an asymptotic approximation that is not appropriate in many genomics applications. This is because the two populations determined by a genetic variant may have very different sizes, and the evaluation of many possible variants demands highly accurate computation of very small p-values. We demonstrate this problem for cancer genomics data where the standard log-rank test leads to many false positive associations between somatic mutations and survival time. We develop and analyze a novel algorithm, Exact Log-rank Test (ExaLT), that accurately computes the p-value of the log-rank statistic under an exact distribution that is appropriate for populations of any size. We demonstrate the advantages of ExaLT on data from published cancer genomics studies, finding significant differences from the reported p-values. We analyze somatic mutations in six cancer types from The Cancer Genome Atlas (TCGA), finding mutations with known association to survival as well as several novel associations. In contrast, standard implementations of the log-rank test report dozens to hundreds of likely false positive associations as more significant than these known associations.
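
    To make the contrast with the asymptotic approximation concrete, the sketch below computes a log-rank p-value by brute-force enumeration of every relabeling of patients with the same group size. It is not the ExaLT algorithm, whose details the abstract does not give, and it is feasible only for very small cohorts; the function names and the toy data are hypothetical.

```python
from itertools import combinations


def logrank_u(times, events, in_group):
    """Observed minus expected events in the group, summed over event times."""
    u = 0.0
    for t in sorted({ti for ti, ei in zip(times, events) if ei}):
        at_risk = [i for i, ti in enumerate(times) if ti >= t]
        died = [i for i in at_risk if times[i] == t and events[i]]
        n1 = sum(1 for i in at_risk if in_group[i])
        d1 = sum(1 for i in died if in_group[i])
        u += d1 - len(died) * n1 / len(at_risk)
    return u


def exact_permutation_p(times, events, group):
    """Two-sided p-value from all relabelings that keep the group size fixed."""
    n = len(times)

    def u_for(members):
        chosen = set(members)
        return logrank_u(times, events, [i in chosen for i in range(n)])

    observed = abs(u_for(group))
    relabelings = list(combinations(range(n), len(group)))
    # Small tolerance so floating-point ties with the observed value count.
    extreme = sum(1 for m in relabelings if abs(u_for(m)) >= observed - 1e-12)
    return extreme / len(relabelings)


# Toy cohort: six patients, all events observed, one carrier (index 0).
times = [1, 2, 3, 4, 5, 6]
events = [1, 1, 1, 1, 1, 1]
p_value = exact_permutation_p(times, events, group=[0])
print(p_value)
```

    With a single carrier among six patients there are only six possible relabelings, so the p-value can take just a handful of values. This is the kind of unbalanced setting where a continuous approximation is least trustworthy.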

  1. Accurate Computation of Survival Statistics in Genome-Wide Studies

    PubMed Central

    Vandin, Fabio; Papoutsaki, Alexandra; Raphael, Benjamin J.; Upfal, Eli

    2015-01-01

    A key challenge in genomics is to identify genetic variants that distinguish patients with different survival time following diagnosis or treatment. While the log-rank test is widely used for this purpose, nearly all implementations of the log-rank test rely on an asymptotic approximation that is not appropriate in many genomics applications. This is because the two populations determined by a genetic variant may have very different sizes, and the evaluation of many possible variants demands highly accurate computation of very small p-values. We demonstrate this problem for cancer genomics data where the standard log-rank test leads to many false positive associations between somatic mutations and survival time. We develop and analyze a novel algorithm, Exact Log-rank Test (ExaLT), that accurately computes the p-value of the log-rank statistic under an exact distribution that is appropriate for populations of any size. We demonstrate the advantages of ExaLT on data from published cancer genomics studies, finding significant differences from the reported p-values. We analyze somatic mutations in six cancer types from The Cancer Genome Atlas (TCGA), finding mutations with known association to survival as well as several novel associations. In contrast, standard implementations of the log-rank test report dozens to hundreds of likely false positive associations as more significant than these known associations. PMID:25950620

  2. Allergic sensitization and filaggrin variants predispose to the comorbidity of eczema, asthma, and rhinitis: results from the Isle of Wight birth cohort

    PubMed Central

    Ziyab, Ali H.; Karmaus, Wilfried; Zhang, Hongmei; Holloway, John W.; Steck, Susan E.; Ewart, Susan; Arshad, Syed Hasan

    2014-01-01

    Background Allergic sensitization and filaggrin gene (FLG) variants are important risk factors for allergic disorders; however, knowledge on their individual and interactive effects on the coexistence of eczema, asthma, and rhinitis is lacking. Objective This study aimed at investigating the single and combined effects of allergic sensitization and FLG variants on the development of single and multiple allergic disorders. Methods The Isle of Wight Birth Cohort (n = 1,456) has been examined at 1, 2, 4, 10, and 18 years of age. Repeated measurements of eczema, asthma, rhinitis, and skin prick tests were available for all follow-ups. FLG variants were genotyped in 1,150 participants. Associations of allergic sensitization and FLG variants with single and multiple allergic disorders were tested in log-binomial regression analysis. Results The prevalence of eczema-, asthma-, and rhinitis-only ranged from 5.6% to 8.5%, 4.9% to 10.2%, and 2.5% to 20.4%, respectively, during the first 18 years of life. The coexistence of allergic disorders is common, with approximately 2% of the population reporting the comorbidity of “eczema, asthma, and rhinitis” during the study period. In repeated measurement analyses, allergic sensitization and FLG variants, when analyzed separately, were associated with having single and multiple allergic disorders. Of particular significance, their combined effect increased the risk of “eczema and asthma” (RR = 13.67, 95% CI: 7.35 – 25.42), “asthma and rhinitis” (RR = 7.46, 95% CI: 5.07 – 10.98), and “eczema, asthma, and rhinitis” (RR = 23.44, 95% CI: 12.27 – 44.78). Conclusions and Clinical Relevance The coexistence of allergic disorders is frequent and allergic sensitization and FLG variants jointly increased risk of allergic comorbidities, which may represent more severe and complex clinical phenotypes. The interactive effect and the elevated proportion of allergic comorbidities associated with allergic sensitization and FLG variants emphasize their joint importance in the pathogenesis of allergic disorders. PMID:24708301

  3. The Relationship between Smoking and Replicated Sequence Variants on Chromosomes 8 and 9 with Familial Intracranial Aneurysm

    PubMed Central

    Deka, Ranjan; Koller, Daniel L.; Lai, Dongbing; Indugula, Subba Rao; Sun, Guangyun; Woo, Daniel; Sauerbeck, Laura; Moomaw, Charles J.; Hornung, Richard; Connolly, E. Sander; Anderson, Craig; Rouleau, Guy; Meissner, Irene; Bailey-Wilson, Joan E.; Huston, John; Brown, Robert D.; Kleindorfer, Dawn O.; Flaherty, Matthew L.; Langefeld, Carl; Foroud, Tatiana; Broderick, Joseph P.

    2010-01-01

    Purpose To replicate the previous association of single nucleotide polymorphisms (SNPs) with risk of intracranial aneurysm (IA) and to examine the relationship of smoking with these variants and the risk of IA. Methods White probands with an IA from families with multiple affected members were identified by 26 clinical centers located throughout North America, New Zealand, and Australia. White controls free of stroke and IA were selected by random digit dialing from the Greater Cincinnati population. SNPs previously associated with IA on chromosomes 2, 8, and 9 were genotyped using a TaqMan assay or were included in the Affymetrix 6.0 array that was part of a genome-wide association study of 406 IA cases and 392 controls. Logistic regression modeling tested whether the association of replicated SNPs with IA was modulated by smoking. Results The strongest evidence of association with IA was found with the 8q SNP rs10958409 (genotypic P = 9.2 × 10^-5; allelic P = 1.3 × 10^-5; OR = 1.86, 95% CI: 1.40−2.47). We also replicated association with both SNPs on chromosome 9p, rs1333040 and rs10757278, but were not able to replicate the previously reported association of the two SNPs on chromosome 2q. Statistical testing showed a multiplicative relationship between the risk alleles and smoking with regard to the risk of IA. Conclusion Our data provide complementary evidence that the variants on chromosome 8q and 9p are associated with IA and that the risk of IA in patients with these variants is greatly increased with cigarette smoking. PMID:20190001

  4. The Rare-Variant Generalized Disequilibrium Test for Association Analysis of Nuclear and Extended Pedigrees with Application to Alzheimer Disease WGS Data.

    PubMed

    He, Zongxiao; Zhang, Di; Renton, Alan E; Li, Biao; Zhao, Linhai; Wang, Gao T; Goate, Alison M; Mayeux, Richard; Leal, Suzanne M

    2017-02-02

    Whole-genome and exome sequence data can be cost-effectively generated for the detection of rare-variant (RV) associations in families. Causal variants that aggregate in families usually have larger effect sizes than those found in sporadic cases, so family-based designs can be a more powerful approach than population-based designs. Moreover, some family-based designs are robust to confounding due to population admixture or substructure. We developed a RV extension of the generalized disequilibrium test (GDT) to analyze sequence data obtained from nuclear and extended families. The GDT utilizes genotype differences of all discordant relative pairs to assess associations within a family, and the RV extension combines the single-variant GDT statistic over a genomic region of interest. The RV-GDT has increased power by efficiently incorporating information beyond first-degree relatives and allows for the inclusion of covariates. Using simulated genetic data, we demonstrated that the RV-GDT method has well-controlled type I error rates, even when applied to admixed populations and populations with substructure. It is more powerful than existing family-based RV association methods, particularly for the analysis of extended pedigrees and pedigrees with missing data. We analyzed whole-genome sequence data from families affected by Alzheimer disease to illustrate the application of the RV-GDT. Given the capability of the RV-GDT to adequately control for population admixture or substructure and analyze pedigrees with missing genotype data and its superior power over other family-based methods, it is an effective tool for elucidating the involvement of RVs in the etiology of complex traits. Copyright © 2017 American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.

  5. A pooling-based approach to mapping genetic variants associated with DNA methylation

    PubMed Central

    Kaplow, Irene M.; MacIsaac, Julia L.; Mah, Sarah M.; McEwen, Lisa M.; Kobor, Michael S.; Fraser, Hunter B.

    2015-01-01

    DNA methylation is an epigenetic modification that plays a key role in gene regulation. Previous studies have investigated its genetic basis by mapping genetic variants that are associated with DNA methylation at specific sites, but these have been limited to microarrays that cover <2% of the genome and cannot account for allele-specific methylation (ASM). Other studies have performed whole-genome bisulfite sequencing on a few individuals, but these lack statistical power to identify variants associated with DNA methylation. We present a novel approach in which bisulfite-treated DNA from many individuals is sequenced together in a single pool, resulting in a truly genome-wide map of DNA methylation. Compared to methods that do not account for ASM, our approach increases statistical power to detect associations while sharply reducing cost, effort, and experimental variability. As a proof of concept, we generated deep sequencing data from a pool of 60 human cell lines; we evaluated almost twice as many CpGs as the largest microarray studies and identified more than 2000 genetic variants associated with DNA methylation. We found that these variants are highly enriched for associations with chromatin accessibility and CTCF binding but are less likely to be associated with traits indirectly linked to DNA, such as gene expression and disease phenotypes. In summary, our approach allows genome-wide mapping of genetic variants associated with DNA methylation in any tissue of any species, without the need for individual-level genotype or methylation data. PMID:25910490

  6. A pooling-based approach to mapping genetic variants associated with DNA methylation

    DOE PAGES

    Kaplow, Irene M.; MacIsaac, Julia L.; Mah, Sarah M.; ...

    2015-04-24

    DNA methylation is an epigenetic modification that plays a key role in gene regulation. Previous studies have investigated its genetic basis by mapping genetic variants that are associated with DNA methylation at specific sites, but these have been limited to microarrays that cover <2% of the genome and cannot account for allele-specific methylation (ASM). Other studies have performed whole-genome bisulfite sequencing on a few individuals, but these lack statistical power to identify variants associated with DNA methylation. We present a novel approach in which bisulfite-treated DNA from many individuals is sequenced together in a single pool, resulting in a truly genome-wide map of DNA methylation. Compared to methods that do not account for ASM, our approach increases statistical power to detect associations while sharply reducing cost, effort, and experimental variability. As a proof of concept, we generated deep sequencing data from a pool of 60 human cell lines; we evaluated almost twice as many CpGs as the largest microarray studies and identified more than 2000 genetic variants associated with DNA methylation. Here we found that these variants are highly enriched for associations with chromatin accessibility and CTCF binding but are less likely to be associated with traits indirectly linked to DNA, such as gene expression and disease phenotypes. In summary, our approach allows genome-wide mapping of genetic variants associated with DNA methylation in any tissue of any species, without the need for individual-level genotype or methylation data.

  7. Functional Assessment of Genetic Variants with Outcomes Adapted to Clinical Decision-Making

    PubMed Central

    Thouvenot, Pierre; Ben Yamin, Barbara; Fourrière, Lou; Lescure, Aurianne; Boudier, Thomas; Del Nery, Elaine; Chauchereau, Anne; Goldgar, David E.; Stoppa-Lyonnet, Dominique; Nicolas, Alain; Millot, Gaël A.

    2016-01-01

    Understanding the medical effect of an ever-growing number of human variants detected is a long term challenge in genetic counseling. Functional assays, based on in vitro or in vivo evaluations of the variant effects, provide essential information, but they require robust statistical validation, as well as adapted outputs, to be implemented in the clinical decision-making process. Here, we assessed 25 pathogenic and 15 neutral missense variants of the BRCA1 breast/ovarian cancer susceptibility gene in four BRCA1 functional assays. Next, we developed a novel approach that refines the variant ranking in these functional assays. Lastly, we developed a computational system that provides a probabilistic classification of variants, adapted to clinical interpretation. Using this system, the best functional assay exhibits a variant classification accuracy estimated at 93%. Additional theoretical simulations highlight the benefit of this ready-to-use system in the classification of variants after functional assessment, which should facilitate the consideration of functional evidences in the decision-making process after genetic testing. Finally, we demonstrate the versatility of the system with the classification of siRNAs tested for human cell growth inhibition in high throughput screening. PMID:27272900

  8. Molecular epidemiology identifies only a single rabies virus variant circulating in complex carnivore communities of the Serengeti

    PubMed Central

    Lembo, T; Haydon, D.T; Velasco-Villa, A; Rupprecht, C.E; Packer, C; Brandão, P.E; Kuzmin, I.V; Fooks, A.R; Barrat, J; Cleaveland, S

    2007-01-01

    Understanding the transmission dynamics of generalist pathogens that infect multiple host species is essential for their effective control. Only by identifying those host populations that are critical to the permanent maintenance of the pathogen, as opposed to populations in which outbreaks are the result of ‘spillover’ infections, can control measures be appropriately directed. Rabies virus is capable of infecting a wide range of host species, but in many ecosystems, particular variants circulate among only a limited range of potential host populations. The Serengeti ecosystem (in northwestern Tanzania) supports a complex community of wild carnivores that are threatened by generalist pathogens that also circulate in domestic dog populations surrounding the park boundaries. While the combined assemblage of host species appears capable of permanently maintaining rabies in the ecosystem, little is known about the patterns of circulation within and between these host populations. Here we use molecular phylogenetics to test whether distinct virus–host associations occur in this species-rich carnivore community. Our analysis identifies a single major variant belonging to the group of southern Africa canid-associated viruses (Africa 1b) to be circulating within this ecosystem, and no evidence for species-specific grouping. A statistical parsimony analysis of nucleoprotein and glycoprotein gene sequence data is consistent with both within- and between-species transmission events. While likely differential sampling effort between host species precludes a definitive inference, the results are most consistent with dogs comprising the reservoir of rabies and emphasize the importance of applying control efforts in dog populations. PMID:17609187

  9. Molecular epidemiology identifies only a single rabies virus variant circulating in complex carnivore communities of the Serengeti.

    PubMed

    Lembo, T; Haydon, D T; Velasco-Villa, A; Rupprecht, C E; Packer, C; Brandão, P E; Kuzmin, I V; Fooks, A R; Barrat, J; Cleaveland, S

    2007-09-07

    Understanding the transmission dynamics of generalist pathogens that infect multiple host species is essential for their effective control. Only by identifying those host populations that are critical to the permanent maintenance of the pathogen, as opposed to populations in which outbreaks are the result of 'spillover' infections, can control measures be appropriately directed. Rabies virus is capable of infecting a wide range of host species, but in many ecosystems, particular variants circulate among only a limited range of potential host populations. The Serengeti ecosystem (in northwestern Tanzania) supports a complex community of wild carnivores that are threatened by generalist pathogens that also circulate in domestic dog populations surrounding the park boundaries. While the combined assemblage of host species appears capable of permanently maintaining rabies in the ecosystem, little is known about the patterns of circulation within and between these host populations. Here we use molecular phylogenetics to test whether distinct virus-host associations occur in this species-rich carnivore community. Our analysis identifies a single major variant belonging to the group of southern Africa canid-associated viruses (Africa 1b) to be circulating within this ecosystem, and no evidence for species-specific grouping. A statistical parsimony analysis of nucleoprotein and glycoprotein gene sequence data is consistent with both within- and between-species transmission events. While likely differential sampling effort between host species precludes a definitive inference, the results are most consistent with dogs comprising the reservoir of rabies and emphasize the importance of applying control efforts in dog populations.

  10. MYH9 genetic variants associated with glomerular disease: what is the role for genetic testing?

    PubMed

    Kopp, Jeffrey B; Winkler, Cheryl A; Nelson, George W

    2010-07-01

    Genetic variation in MYH9, encoding nonmuscle myosin IIA heavy chain, has been associated recently with increased risk for kidney disease. Previously, MYH9 missense mutations have been shown to cause the autosomal-dominant MYH9 (ADM9) spectrum, characterized by large platelets, leukocyte Döhle bodies, and, variably, sensorineural deafness, cataracts, and glomerulopathy. Genetic testing is indicated for familial and sporadic cases that fit this spectrum. By contrast, the MYH9 kidney risk variant is characterized by multiple intronic single nucleotide polymorphisms, but the causative variant has not been identified. Disease associations include human immunodeficiency virus-associated collapsing glomerulopathy, focal segmental glomerulosclerosis, hypertension-attributed end-stage kidney disease, and diabetes-attributed end-stage kidney disease. One plausible hypothesis is that the MYH9 kidney risk variant confers a fragile podocyte phenotype. In the case of hypertension-attributed kidney disease, it remains unclear if the hypertension is a contributing cause or a consequence of glomerular injury. The MYH9 kidney risk variant is strikingly more common among individuals of African descent, but only some will develop clinical kidney disease in their lifetime. Thus, it is likely that additional genes and/or environmental factors interact with the MYH9 kidney risk variant to trigger glomerular injury. A preliminary genetic risk stratification scheme, using two single nucleotide polymorphisms, may estimate lifetime risk for kidney disease. Nevertheless, at present, no role has been established for genetic testing as part of personalized medicine, but testing should be considered in clinical studies of glomerular diseases among populations of African descent. Such studies will address critical questions pertaining to MYH9-associated kidney disease, including mechanism, course, and response to therapy. Published by Elsevier Inc.

  11. Annotate-it: a Swiss-knife approach to annotation, analysis and interpretation of single nucleotide variation in human disease

    PubMed Central

    2012-01-01

    The increasing size and complexity of exome/genome sequencing data requires new tools for clinical geneticists to discover disease-causing variants. Bottlenecks in identifying the causative variation include poor cross-sample querying, constantly changing functional annotation and not considering existing knowledge concerning the phenotype. We describe a methodology that facilitates exploration of patient sequencing data towards identification of causal variants under different genetic hypotheses. Annotate-it facilitates handling, analysis and interpretation of high-throughput single nucleotide variant data. We demonstrate our strategy using three case studies. Annotate-it is freely available and test data are accessible to all users at http://www.annotate-it.org. PMID:23013645

  12. [The phonological variant of primary progressive aphasia, a single case study].

    PubMed

    Diesfeldt, H F A

    2011-04-01

    Primary progressive aphasia (PPA) is a neurodegenerative syndrome characterized by an insidious onset and gradual progression of deficits that can involve any aspect of language, including word finding, object naming, fluency, syntax, phonology and word comprehension. The initial symptoms occur in the absence of major deficits in other cognitive domains, including episodic memory, visuospatial abilities and visuoconstruction. According to recent diagnostic guidelines, PPA is typically divided into three variants: nonfluent variant PPA (also termed progressive nonfluent aphasia), semantic variant PPA (also termed semantic dementia) and logopenic/phonological variant PPA (also termed logopenic progressive aphasia). The paper describes a 79-year-old man who presented with normal motor speech and production rate, impaired single word retrieval and phonemic errors in spontaneous speech and confrontation naming. Confrontation naming was strongly affected by lexical frequency. He was impaired on repetition of sentences and phrases. Reading was intact for regularly spelled words but not for irregular words (surface dyslexia). Comprehension was spared at the single word level, but impaired for complex sentences. He performed within the normal range on the Dutch equivalent of the Pyramids and Palm Trees (PPT) Pictures Test, indicating that semantic processing was preserved. There was, however, a slight deficiency on the PPT Words Test, which appeals to semantic knowledge of verbal associations. His core deficit was interpreted as an inability to retrieve stored lexical-phonological information for spoken word production in spontaneous speech, confrontation naming, repetition and reading aloud.

  13. Joint association of nicotinic acetylcholine receptor variants with abdominal obesity in American Indians: the Strong Heart Family Study.

    PubMed

    Zhu, Yun; Yang, Jingyun; Yeh, Fawn; Cole, Shelley A; Haack, Karin; Lee, Elisa T; Howard, Barbara V; Zhao, Jinying

    2014-01-01

    Cigarette smoke is a strong risk factor for obesity and cardiovascular disease. The effect of genetic variants involved in nicotine metabolism on obesity or body composition has not been well studied. Though many genetic variants have previously been associated with adiposity or body fat distribution, a single variant usually confers a minimal individual risk. The goal of this study is to evaluate the joint association of multiple variants involved in cigarette smoke or nicotine dependence with obesity-related phenotypes in American Indians. To achieve this goal, we genotyped 61 tagSNPs in seven genes encoding nicotine acetylcholine receptors (nAChRs) in 3,665 American Indians participating in the Strong Heart Family Study. Single SNP association with obesity-related traits was tested using family-based association, adjusting for traditional risk factors including smoking. Joint association of all SNPs in the seven nAChRs genes were examined by gene-family analysis based on weighted truncated product method (TPM). Multiple testing was controlled by false discovery rate (FDR). Results demonstrate that multiple SNPs showed weak individual association with one or more measures of obesity, but none survived correction for multiple testing. However, gene-family analysis revealed significant associations with waist circumference (p = 0.0001) and waist-to-hip ratio (p = 0.0001), but not body mass index (p = 0.20) and percent body fat (p = 0.29), indicating that genetic variants are jointly associated with abdominal, but not general, obesity among American Indians. The observed combined genetic effect is independent of cigarette smoking per se. In conclusion, multiple variants in the nAChR gene family are jointly associated with abdominal obesity in American Indians, independent of general obesity and cigarette smoking per se.
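
    The truncated product method named in this abstract combines only the p-values at or below a threshold and compares that product with its null distribution. The following is a minimal sketch, not the study's implementation. It assumes independent tests, so the null is simulated from uniform p-values (correlated SNPs in family data would call for permutation instead). The function names, the default threshold tau = 0.05, and the optional weights are illustrative assumptions.

```python
import math
import random


def tpm_statistic(pvalues, tau=0.05, weights=None):
    """Log of the (weighted) truncated product.

    Sums w_i * log(p_i) over the p-values with p_i <= tau.
    Returns 0.0 (product of 1) when no p-value passes the threshold.
    """
    if weights is None:
        weights = [1.0] * len(pvalues)
    return sum(w * math.log(p) for p, w in zip(pvalues, weights) if p <= tau)


def tpm_pvalue(pvalues, tau=0.05, weights=None, n_sim=10000, seed=0):
    """Monte Carlo p-value for the truncated product.

    ASSUMPTION: the individual tests are independent, so null p-values
    are drawn as independent uniforms on (0, 1].
    """
    rng = random.Random(seed)
    observed = tpm_statistic(pvalues, tau, weights)
    k = len(pvalues)
    hits = 0
    for _ in range(n_sim):
        # 1 - random() lies in (0, 1], which avoids log(0).
        null_p = [1.0 - rng.random() for _ in range(k)]
        # A smaller (more negative) log-product is more extreme.
        if tpm_statistic(null_p, tau, weights) <= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_sim + 1)


if __name__ == "__main__":
    # Hypothetical single-SNP p-values for one gene family.
    example = [0.003, 0.04, 0.2, 0.6, 0.01]
    print(tpm_pvalue(example))
```

    A set of individually weak signals can yield a small combined p-value in this scheme even when none of them survives a per-test multiple-testing correction, which is the pattern the abstract reports.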

  14. Deep Sequencing of Three Loci Implicated in Large-Scale Genome-Wide Association Study Smoking Meta-Analyses.

    PubMed

    Clark, Shaunna L; McClay, Joseph L; Adkins, Daniel E; Aberg, Karolina A; Kumar, Gaurav; Nerella, Sri; Xie, Linying; Collins, Ann L; Crowley, James J; Quakenbush, Corey R; Hillard, Christopher E; Gao, Guimin; Shabalin, Andrey A; Peterson, Roseann E; Copeland, William E; Silberg, Judy L; Maes, Hermine; Sullivan, Patrick F; Costello, Elizabeth J; van den Oord, Edwin J

    2016-05-01

    Genome-wide association study meta-analyses have robustly implicated three loci that affect susceptibility for smoking: CHRNA5\CHRNA3\CHRNB4, CHRNB3\CHRNA6 and EGLN2\CYP2A6. Functional follow-up studies of these loci are needed to provide insight into biological mechanisms. However, these efforts have been hampered by a lack of knowledge about the specific causal variant(s) involved. In this study, we prioritized variants in terms of the likelihood they account for the reported associations. We employed targeted capture of the CHRNA5\CHRNA3\CHRNB4, CHRNB3\CHRNA6, and EGLN2\CYP2A6 loci and flanking regions followed by next-generation deep sequencing (mean coverage 78×) to capture genomic variation in 363 individuals. We performed single locus tests to determine if any single variant accounts for the association, and examined if sets of (rare) variants that overlapped with biologically meaningful annotations account for the associations. In total, we investigated 963 variants, of which 71.1% were rare (minor allele frequency < 0.01), 6.02% were insertion/deletions, and 51.7% were catalogued in dbSNP141. The single variant results showed that no variant fully accounts for the association in any region. In the variant set results, CHRNB4 accounts for most of the signal with significant sets consisting of directly damaging variants. CHRNA6 explains most of the signal in the CHRNB3\CHRNA6 locus with significant sets indicating a regulatory role for CHRNA6. Significant sets in CYP2A6 involved directly damaging variants while the significant variant sets suggested a regulatory role for EGLN2. We found that multiple variants implicating multiple processes explain the signal. Some variants can be prioritized for functional follow-up. © The Author 2015. Published by Oxford University Press on behalf of the Society for Research on Nicotine and Tobacco. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  15. Deep Sequencing of Three Loci Implicated in Large-Scale Genome-Wide Association Study Smoking Meta-Analyses

    PubMed Central

    McClay, Joseph L.; Adkins, Daniel E.; Aberg, Karolina A.; Kumar, Gaurav; Nerella, Sri; Xie, Linying; Collins, Ann L.; Crowley, James J.; Quakenbush, Corey R.; Hillard, Christopher E.; Gao, Guimin; Shabalin, Andrey A.; Peterson, Roseann E.; Copeland, William E.; Silberg, Judy L.; Maes, Hermine; Sullivan, Patrick F.; Costello, Elizabeth J.; van den Oord, Edwin J.

    2016-01-01

    Introduction: Genome-wide association study meta-analyses have robustly implicated three loci that affect susceptibility for smoking: CHRNA5\CHRNA3\CHRNB4, CHRNB3\CHRNA6 and EGLN2\CYP2A6. Functional follow-up studies of these loci are needed to provide insight into biological mechanisms. However, these efforts have been hampered by a lack of knowledge about the specific causal variant(s) involved. In this study, we prioritized variants in terms of the likelihood they account for the reported associations. Methods: We employed targeted capture of the CHRNA5\CHRNA3\CHRNB4, CHRNB3\CHRNA6, and EGLN2\CYP2A6 loci and flanking regions followed by next-generation deep sequencing (mean coverage 78×) to capture genomic variation in 363 individuals. We performed single locus tests to determine if any single variant accounts for the association, and examined if sets of (rare) variants that overlapped with biologically meaningful annotations account for the associations. Results: In total, we investigated 963 variants, of which 71.1% were rare (minor allele frequency < 0.01), 6.02% were insertion/deletions, and 51.7% were catalogued in dbSNP141. The single variant results showed that no variant fully accounts for the association in any region. In the variant set results, CHRNB4 accounts for most of the signal with significant sets consisting of directly damaging variants. CHRNA6 explains most of the signal in the CHRNB3\CHRNA6 locus with significant sets indicating a regulatory role for CHRNA6. Significant sets in CYP2A6 involved directly damaging variants while the significant variant sets suggested a regulatory role for EGLN2. Conclusions: We found that multiple variants implicating multiple processes explain the signal. Some variants can be prioritized for functional follow-up. PMID:26283763

  16. Visual-Motor Test Performance: Race and Achievement Variables.

    ERIC Educational Resources Information Center

    Fuller, Gerald B.; Friedrich, Douglas

    1979-01-01

    Rural Black and White children of variant academic achievement were tested on the Minnesota Percepto-Diagnostic Test, which consists of six gestalt designs for the subject to copy. Analyses resulted only in a significant achievement effect; when intellectual level was statistically controlled, race was not a significant variable. (Editor/SJL)

  17. Comparison of statistical tests for association between rare variants and binary traits.

    PubMed

    Bacanu, Silviu-Alin; Nelson, Matthew R; Whittaker, John C

    2012-01-01

    Genome-wide association studies have found thousands of common genetic variants associated with a wide variety of diseases and other complex traits. However, a large portion of the predicted genetic contribution to many traits remains unknown. One plausible explanation is that some of the missing variation is due to the effects of rare variants. Nonetheless, the statistical analysis of rare variants is challenging. A commonly used method is to contrast, within the same region (gene), the frequency of minor alleles at rare variants between cases and controls. However, this strategy is most useful under the assumption that the tested variants have similar effects. We previously proposed a method that can accommodate heterogeneous effects in the analysis of quantitative traits. Here we extend this method to include binary traits that can accommodate covariates. We use simulations for a variety of causal and covariate impact scenarios to compare the performance of the proposed method to standard logistic regression, C-alpha, SKAT, and EREC. We found that (i) logistic regression methods perform well when the heterogeneity of the effects is not extreme and (ii) SKAT and EREC have good performance under all tested scenarios but they can be computationally intensive. Consequently, it would be more computationally desirable to use a two-step strategy by (i) selecting promising genes by faster methods and (ii) analyzing selected genes using SKAT/EREC. To select promising genes one can use (1) regression methods when effect heterogeneity is assumed to be low and the covariates explain a non-negligible part of trait variability, (2) C-alpha when heterogeneity is assumed to be large and covariates explain a small fraction of trait variability and (3) the proposed trend and heterogeneity test when the heterogeneity is assumed to be non-trivial and the covariates explain a large fraction of trait variability.

  18. Calibrating genomic and allelic coverage bias in single-cell sequencing.

    PubMed

    Zhang, Cheng-Zhong; Adalsteinsson, Viktor A; Francis, Joshua; Cornils, Hauke; Jung, Joonil; Maire, Cecile; Ligon, Keith L; Meyerson, Matthew; Love, J Christopher

    2015-04-16

    Artifacts introduced in whole-genome amplification (WGA) make it difficult to derive accurate genomic information from single-cell genomes and require different analytical strategies from bulk genome analysis. Here, we describe statistical methods to quantitatively assess the amplification bias resulting from whole-genome amplification of single-cell genomic DNA. Analysis of single-cell DNA libraries generated by different technologies revealed universal features of the genome coverage bias predominantly generated at the amplicon level (1-10 kb). The magnitude of coverage bias can be accurately calibrated from low-pass sequencing (∼0.1 × ) to predict the depth-of-coverage yield of single-cell DNA libraries sequenced at arbitrary depths. We further provide a benchmark comparison of single-cell libraries generated by multi-strand displacement amplification (MDA) and multiple annealing and looping-based amplification cycles (MALBAC). Finally, we develop statistical models to calibrate allelic bias in single-cell whole-genome amplification and demonstrate a census-based strategy for efficient and accurate variant detection from low-input biopsy samples.

  19. Calibrating genomic and allelic coverage bias in single-cell sequencing

    PubMed Central

    Francis, Joshua; Cornils, Hauke; Jung, Joonil; Maire, Cecile; Ligon, Keith L.; Meyerson, Matthew; Love, J. Christopher

    2016-01-01

    Artifacts introduced in whole-genome amplification (WGA) make it difficult to derive accurate genomic information from single-cell genomes and require different analytical strategies from bulk genome analysis. Here, we describe statistical methods to quantitatively assess the amplification bias resulting from whole-genome amplification of single-cell genomic DNA. Analysis of single-cell DNA libraries generated by different technologies revealed universal features of the genome coverage bias predominantly generated at the amplicon level (1–10 kb). The magnitude of coverage bias can be accurately calibrated from low-pass sequencing (~0.1 ×) to predict the depth-of-coverage yield of single-cell DNA libraries sequenced at arbitrary depths. We further provide a benchmark comparison of single-cell libraries generated by multi-strand displacement amplification (MDA) and multiple annealing and looping-based amplification cycles (MALBAC). Finally, we develop statistical models to calibrate allelic bias in single-cell whole-genome amplification and demonstrate a census-based strategy for efficient and accurate variant detection from low-input biopsy samples. PMID:25879913

  20. Validation and optimization of the Ion Torrent S5 XL sequencer and Oncomine workflow for BRCA1 and BRCA2 genetic testing.

    PubMed

    Shin, Saeam; Kim, Yoonjung; Chul Oh, Seoung; Yu, Nae; Lee, Seung-Tae; Rak Choi, Jong; Lee, Kyung-A

    2017-05-23

    In this study, we validated the analytical performance of BRCA1/2 sequencing using Ion Torrent's new bench-top sequencer with amplicon panel with optimized bioinformatics pipelines. Using 43 samples that were previously validated by Illumina's MiSeq platform and/or by Sanger sequencing/multiplex ligation-dependent probe amplification, we amplified the target with the Oncomine™ BRCA Research Assay and sequenced on Ion Torrent S5 XL (Thermo Fisher Scientific, Waltham, MA, USA). We compared two bioinformatics pipelines for optimal processing of S5 XL sequence data: the Torrent Suite with a plug-in Torrent Variant Caller (Thermo Fisher Scientific), and commercial NextGENe software (Softgenetics, State College, PA, USA). All expected 681 single nucleotide variants, 15 small indels, and three copy number variants were correctly called, except one common variant adjacent to a rare variant on the primer-binding site. The sensitivity, specificity, false positive rate, and accuracy for detection of single nucleotide variant and small indels of S5 XL sequencing were 99.85%, 100%, 0%, and 99.99% for the Torrent Variant Caller and 99.85%, 99.99%, 0.14%, and 99.99% for NextGENe, respectively. The reproducibility of variant calling was 100%, and the precision of variant frequency also showed good performance with coefficients of variation between 0.32 and 5.29%. We obtained highly accurate data through uniform and sufficient coverage depth over all target regions and through optimization of the bioinformatics pipeline. We confirmed that our platform is accurate and practical for diagnostic BRCA1/2 testing in a clinical laboratory.
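
    The performance figures quoted in this abstract follow from the standard confusion-matrix definitions. A minimal sketch of those definitions is given here; the function name and the true-negative count in the example are hypothetical, since the abstract does not report the underlying counts.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Standard confusion-matrix summaries for a variant caller.

    tp/fn: expected variants that were / were not called.
    tn/fp: reference positions that were correctly left uncalled /
           wrongly called as variant.
    """
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "false_positive_rate": fp / (fp + tn),
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


if __name__ == "__main__":
    # ASSUMPTION: the single missed call is counted among the
    # 681 + 15 = 696 expected SNVs and small indels.
    # The true-negative count below is a made-up placeholder.
    metrics = diagnostic_metrics(tp=695, fp=0, tn=10000, fn=1)
    for name, value in metrics.items():
        print(f"{name}: {value:.4%}")
```

    Under that assumption the sensitivity is 695/696, or about 0.9986, which is in line with the 99.85% reported for both pipelines.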

  1. QQ-SNV: single nucleotide variant detection at low frequency by comparing the quality quantiles.

    PubMed

    Van der Borght, Koen; Thys, Kim; Wetzels, Yves; Clement, Lieven; Verbist, Bie; Reumers, Joke; van Vlijmen, Herman; Aerssens, Jeroen

    2015-11-10

    Next generation sequencing enables studying heterogeneous populations of viral infections. When the sequencing is done at high coverage depth ("deep sequencing"), low frequency variants can be detected. Here we present QQ-SNV (http://sourceforge.net/projects/qqsnv), a logistic regression classifier model developed for the Illumina sequencing platforms that uses the quantiles of the quality scores, to distinguish true single nucleotide variants from sequencing errors based on the estimated SNV probability. To train the model, we created a dataset of an in silico mixture of five HIV-1 plasmids. Testing of our method in comparison to the existing methods LoFreq, ShoRAH, and V-Phaser 2 was performed on two HIV and four HCV plasmid mixture datasets and one influenza H1N1 clinical dataset. For default application of QQ-SNV, variants were called using a SNV probability cutoff of 0.5 (QQ-SNV(D)). To improve the sensitivity we used a SNV probability cutoff of 0.0001 (QQ-SNV(HS)). To also increase specificity, SNVs called were overruled when their frequency was below the 80th percentile calculated on the distribution of error frequencies (QQ-SNV(HS-P80)). When comparing QQ-SNV versus the other methods on the plasmid mixture test sets, QQ-SNV(D) performed similarly to the existing approaches. QQ-SNV(HS) was more sensitive on all test sets but with more false positives. QQ-SNV(HS-P80) was found to be the most accurate method over all test sets by balancing sensitivity and specificity. When applied to a paired-end HCV sequencing study, with lowest spiked-in true frequency of 0.5%, QQ-SNV(HS-P80) revealed a sensitivity of 100% (vs. 40-60% for the existing methods) and a specificity of 100% (vs. 98.0-99.7% for the existing methods). In addition, QQ-SNV required the least overall computation time to process the test sets. Finally, when testing on a clinical sample, four putative true variants with frequency below 0.5% were consistently detected by QQ-SNV(HS-P80) from different generations of Illumina sequencers. We developed and successfully evaluated a novel method, called QQ-SNV, for highly efficient single nucleotide variant calling on Illumina deep sequencing virology data.
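
    The three calling modes described in this abstract differ only in the probability cutoff and in an extra frequency filter. The sketch below illustrates that decision logic only and is not the QQ-SNV software. The function and parameter names are hypothetical, the classifier's SNV probability is taken as a given input, the comparisons are assumed to be inclusive, and the 80th percentile of the error-frequency distribution is assumed to be supplied precomputed.

```python
# Probability cutoffs as stated in the abstract.
CUTOFFS = {"D": 0.5, "HS": 0.0001, "HS-P80": 0.0001}


def call_snv(snv_probability, frequency, mode="D", error_freq_p80=None):
    """Decide whether a candidate position is called as an SNV.

    snv_probability: output of the logistic regression classifier.
    frequency:       observed variant frequency at the position.
    error_freq_p80:  80th percentile of the error-frequency
                     distribution (needed only for mode "HS-P80").
    ASSUMPTION: cutoffs are inclusive (>=); the abstract does not say.
    """
    if mode not in CUTOFFS:
        raise ValueError(f"unknown mode: {mode!r}")
    called = snv_probability >= CUTOFFS[mode]
    if called and mode == "HS-P80":
        if error_freq_p80 is None:
            raise ValueError("mode 'HS-P80' needs error_freq_p80")
        # Overrule calls whose frequency is below the error percentile.
        called = frequency >= error_freq_p80
    return called


if __name__ == "__main__":
    # Hypothetical candidate: low classifier probability, 0.4% frequency.
    for mode in ("D", "HS", "HS-P80"):
        print(mode, call_snv(0.01, 0.004, mode, error_freq_p80=0.005))
```

    The example candidate is rejected in the default mode, accepted in the high-sensitivity mode, and overruled again once the frequency filter is applied, which mirrors the sensitivity and specificity trade-off the abstract describes.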

  2. Fine Mapping Causal Variants with an Approximate Bayesian Method Using Marginal Test Statistics.

    PubMed

    Chen, Wenan; Larrabee, Beth R; Ovsyannikova, Inna G; Kennedy, Richard B; Haralambieva, Iana H; Poland, Gregory A; Schaid, Daniel J

    2015-07-01

    Two recently developed fine-mapping methods, CAVIAR and PAINTOR, demonstrate better performance over other fine-mapping methods. They also have the advantage of using only the marginal test statistics and the correlation among SNPs. Both methods leverage the fact that the marginal test statistics asymptotically follow a multivariate normal distribution and are likelihood based. However, their relationship with Bayesian fine mapping, such as BIMBAM, is not clear. In this study, we first show that CAVIAR and BIMBAM are actually approximately equivalent to each other. This leads to a fine-mapping method using marginal test statistics in the Bayesian framework, which we call CAVIAR Bayes factor (CAVIARBF). Another advantage of the Bayesian framework is that it can answer both association and fine-mapping questions. We also used simulations to compare CAVIARBF with other methods under different numbers of causal variants. The results showed that both CAVIARBF and BIMBAM have better performance than PAINTOR and other methods. Compared to BIMBAM, CAVIARBF has the advantage of using only marginal test statistics and takes about one-quarter to one-fifth of the running time. We applied different methods on two independent cohorts of the same phenotype. Results showed that CAVIARBF, BIMBAM, and PAINTOR selected the same top 3 SNPs; however, CAVIARBF and BIMBAM had better consistency in selecting the top 10 ranked SNPs between the two cohorts. Software is available at https://bitbucket.org/Wenan/caviarbf. Copyright © 2015 by the Genetics Society of America.

  3. A Novel Binary Mixture of Helicoverpa armigera Single Nucleopolyhedrovirus Genotypic Variants Has Improved Insecticidal Characteristics for Control of Cotton Bollworms

    PubMed Central

    Arrizubieta, Maite; Simón, Oihane; Williams, Trevor

    2015-01-01

    The genotypic diversity of two Spanish isolates of Helicoverpa armigera single nucleopolyhedrovirus (HearSNPV) was evaluated with the aim of identifying mixtures of genotypes with improved insecticidal characteristics for control of the cotton bollworm. Two genotypic variants, HearSP1A and HearSP1B, were cloned in vitro from the most pathogenic wild-type isolate of the Iberian Peninsula, HearSNPV-SP1 (HearSP1-wt). Similarly, six genotypic variants (HearLB1 to -6) were obtained by endpoint dilution from larvae collected from cotton crops in southern Spain that died from virus disease during laboratory rearing. Variants differed significantly in their insecticidal properties, pathogenicity, speed of kill, and occlusion body (OB) production (OBs/larva). HearSP1B was ∼3-fold more pathogenic than HearSP1-wt and the other variants. HearLB1, HearLB2, HearLB5, and HearLB6 were the fastest-killing variants. Moreover, although highly virulent, HearLB1, HearLB4, and HearLB5 produced more OBs/larva than did the other variants. The co-occluded HearSP1B:LB6 mixture at a 1:1 proportion was 1.7- to 2.8-fold more pathogenic than any single variant and other mixtures tested and also killed larvae as fast as the most virulent genotypes. Serial passage resulted in modified proportions of the component variants of the HearSP1B:LB6 co-occluded mixture, suggesting that transmissibility could be further improved by this process. We conclude that the improved insecticidal phenotype of the HearSP1B:LB6 co-occluded mixture underlines the utility of the genotypic variant dissection and reassociation approach for the development of effective virus-based insecticides. PMID:25841011

  4. Assessing the suitability of summary data for two-sample Mendelian randomization analyses using MR-Egger regression: the role of the I² statistic.

    PubMed

    Bowden, Jack; Del Greco M, Fabiola; Minelli, Cosetta; Davey Smith, George; Sheehan, Nuala A; Thompson, John R

    2016-12-01

    MR-Egger regression has recently been proposed as a method for Mendelian randomization (MR) analyses incorporating summary data estimates of causal effect from multiple individual variants, which is robust to invalid instruments. It can be used to test for directional pleiotropy and provides an estimate of the causal effect adjusted for its presence. MR-Egger regression provides a useful additional sensitivity analysis to the standard inverse variance weighted (IVW) approach that assumes all variants are valid instruments. Both methods use weights that consider the single nucleotide polymorphism (SNP)-exposure associations to be known, rather than estimated. We call this the 'NO Measurement Error' (NOME) assumption. Causal effect estimates from the IVW approach exhibit weak instrument bias whenever the genetic variants utilized violate the NOME assumption, which can be reliably measured using the F-statistic. The effect of NOME violation on MR-Egger regression has yet to be studied. An adaptation of the I² statistic from the field of meta-analysis is proposed to quantify the strength of NOME violation for MR-Egger. It lies between 0 and 1, and indicates the expected relative bias (or dilution) of the MR-Egger causal estimate in the two-sample MR context. We call it I²_GX. The method of simulation extrapolation is also explored to counteract the dilution. Their joint utility is evaluated using simulated data and applied to a real MR example. In simulated two-sample MR analyses we show that, when a causal effect exists, the MR-Egger estimate of causal effect is biased towards the null when NOME is violated, and the stronger the violation (as indicated by lower values of I²_GX), the stronger the dilution. When additionally all genetic variants are valid instruments, the type I error rate of the MR-Egger test for pleiotropy is inflated and the causal effect underestimated. Simulation extrapolation is shown to substantially mitigate these adverse effects.
We demonstrate our proposed approach for a two-sample summary data MR analysis to estimate the causal effect of low-density lipoprotein on heart disease risk. A high value of I²_GX close to 1 indicates that dilution does not materially affect the standard MR-Egger analyses for these data. Care must be taken to assess the NOME assumption via the I²_GX statistic before implementing standard MR-Egger regression in the two-sample summary data context. If I²_GX is sufficiently low (less than 90%), inferences from the method should be interpreted with caution and adjustment methods considered. © The Author 2016. Published by Oxford University Press on behalf of the International Epidemiological Association.
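
The NOME-violation statistic described in this abstract can be illustrated with a small sketch. The abstract does not give a formula, so the code assumes the usual meta-analysis definition, I² = (Q − (k − 1)) / Q, applied to the k SNP-exposure estimates with inverse-variance weights. The function name `i_squared_gx` and the input values are hypothetical, and the weighted variant used for weighted MR-Egger is not shown.

```python
from typing import Sequence


def i_squared_gx(gamma: Sequence[float], se: Sequence[float]) -> float:
    """Heterogeneity-style statistic for SNP-exposure estimates.

    gamma: SNP-exposure association estimates (one per variant).
    se:    their standard errors.
    Assumed definition: (Q - (k - 1)) / Q, truncated at 0, where Q is
    Cochran's Q computed on the SNP-exposure estimates.
    """
    if len(gamma) != len(se) or len(gamma) < 2:
        raise ValueError("need at least two estimates with matching standard errors")
    weights = [1.0 / (s * s) for s in se]
    # Inverse-variance weighted mean of the SNP-exposure estimates.
    mean = sum(w * g for w, g in zip(weights, gamma)) / sum(weights)
    # Cochran's Q: weighted squared deviations from that mean.
    q = sum(w * (g - mean) ** 2 for w, g in zip(weights, gamma))
    if q <= 0.0:
        return 0.0
    return max(0.0, (q - (len(gamma) - 1)) / q)


# Hypothetical inputs: three variants with well-separated, precise estimates.
value = i_squared_gx([0.1, 0.2, 0.3], [0.01, 0.01, 0.01])
needs_caution = value < 0.9  # the 90% threshold mentioned in the abstract
```

Under this definition, values near 1 arise when the SNP-exposure estimates vary far more than their standard errors would explain, which is the situation the abstract describes as little dilution.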

  5. Family-Based Rare Variant Association Analysis: A Fast and Efficient Method of Multivariate Phenotype Association Analysis.

    PubMed

    Wang, Longfei; Lee, Sungyoung; Gim, Jungsoo; Qiao, Dandi; Cho, Michael; Elston, Robert C; Silverman, Edwin K; Won, Sungho

    2016-09-01

    Family-based designs have been repeatedly shown to be powerful in detecting the significant rare variants associated with human diseases. Furthermore, human diseases are often defined by the outcomes of multiple phenotypes, and thus we expect multivariate family-based analyses may be very efficient in detecting associations with rare variants. However, few statistical methods implementing this strategy have been developed for family-based designs. In this report, we describe one such implementation: the multivariate family-based rare variant association tool (mFARVAT). mFARVAT is a quasi-likelihood-based score test for rare variant association analysis with multiple phenotypes, and tests both homogeneous and heterogeneous effects of each variant on multiple phenotypes. Simulation results show that the proposed method is generally robust and efficient for various disease models, and we identify some promising candidate genes associated with chronic obstructive pulmonary disease. The software of mFARVAT is freely available at http://healthstat.snu.ac.kr/software/mfarvat/, implemented in C++ and supported on Linux and MS Windows. © 2016 WILEY PERIODICALS, INC.

  6. Two strategies to engineer flexible loops for improved enzyme thermostability

    PubMed Central

    Yu, Haoran; Yan, Yihan; Zhang, Cheng; Dalby, Paul A.

    2017-01-01

    Flexible sites are potential targets for engineering the stability of enzymes. Nevertheless, the success rate of the rigidifying flexible sites (RFS) strategy is still low due to a limited understanding of how to determine the best mutation candidates. In this study, two parallel strategies were applied to identify mutation candidates within the flexible loops of Escherichia coli transketolase (TK). The first was a “back to consensus mutations” approach, and the second was computational design based on ΔΔG calculations in Rosetta. Forty-nine single variants were generated and characterised experimentally. From these, three single-variants I189H, A282P, D143K were found to be more thermostable than wild-type TK. The combination of A282P with H192P, a variant constructed previously, resulted in the best all-round variant with a 3-fold improved half-life at 60 °C, 5-fold increased specific activity at 65 °C, 1.3-fold improved kcat and a Tm increased by 5 °C above that of wild type. Based on a statistical analysis of the stability changes for all variants, the qualitative prediction accuracy of the Rosetta program reached 65.3%. Both of the two strategies investigated were useful in guiding mutation candidates to flexible loops, and had the potential to be used for other enzymes. PMID:28145457

  7. Arrhythmias Following Comprehensive Stage II Surgical Palliation in Single Ventricle Patients.

    PubMed

    Wilhelm, Carolyn M; Paulus, Diane; Cua, Clifford L; Kertesz, Naomi J; Cheatham, John P; Galantowicz, Mark; Fernandez, Richard P

    2016-03-01

    Post-operative arrhythmias are common in pediatric patients following cardiac surgery. Following hybrid palliation in single ventricle patients, a comprehensive stage II palliation is performed. The incidence of arrhythmias in patients following comprehensive stage II palliation is unknown. The purpose of this study is to determine the incidence of arrhythmias following comprehensive stage II palliation. A single-center retrospective chart review was performed on all single ventricle patients undergoing a comprehensive stage II palliation from January 2010 to May 2014. Pre-operative, operative, and post-operative data were collected. A clinically significant arrhythmia was defined as an arrhythmia which led to cardiopulmonary resuscitation or required treatment with either pacing or antiarrhythmic medication. Statistical analysis was performed with Wilcoxon rank-sum test and Fisher's exact test with p < 0.05 significant. Forty-eight single ventricle patients were reviewed (32 hypoplastic left heart syndrome, 16 other single ventricle variants). Age at surgery was 185 ± 56 days. Cardiopulmonary bypass time was 259 ± 45 min. Average vasoactive-inotropic score was 5.97 ± 7.58. Six patients (12.5 %) had clinically significant arrhythmias: four sinus bradycardia, one 2:1 atrioventricular block, and one slow junctional rhythm. No tachyarrhythmias were documented for this patient population. Presence of arrhythmia was associated with elevated lactate (p = 0.04) and cardiac arrest (p = 0.002). Following comprehensive stage II palliation, single ventricle patients are at low risk for development of tachyarrhythmias. The most frequent arrhythmia seen in these patients was sinus bradycardia associated with respiratory compromise.

  8. Whole exome sequence-based association analyses of plasma amyloid-β in African and European Americans; the Atherosclerosis Risk in Communities-Neurocognitive Study.

    PubMed

    Simino, Jeannette; Wang, Zhiying; Bressler, Jan; Chouraki, Vincent; Yang, Qiong; Younkin, Steven G; Seshadri, Sudha; Fornage, Myriam; Boerwinkle, Eric; Mosley, Thomas H

    2017-01-01

    We performed single-variant and gene-based association analyses of plasma amyloid-β (aβ) concentrations using whole exome sequence from 1,414 African and European Americans. Our goal was to identify genes that influence plasma aβ42 concentrations and aβ42:aβ40 ratios in late middle age (mean = 59 years), old age (mean = 77 years), or change over time (mean = 18 years). Plasma aβ measures were linearly regressed onto age, gender, APOE ε4 carrier status, and time elapsed between visits (fold-changes only) separately by race. Following inverse normal transformation of the residuals, seqMeta was used to conduct race-specific single-variant and gene-based association tests while adjusting for population structure. Linear regression models were fit on autosomal variants with minor allele frequencies (MAF)≥1%. T5 burden and Sequence Kernel Association (SKAT) gene-based tests assessed functional variants with MAF≤5%. Cross-race fixed effects meta-analyses were Bonferroni-corrected for the number of variants or genes tested. Seven genes were associated with aβ in late middle age or change over time; no associations were identified in old age. Single variants in KLKB1 (rs3733402; p = 4.33 × 10⁻¹⁰) and F12 (rs1801020; p = 3.89 × 10⁻⁸) were significantly associated with midlife aβ42 levels through cross-race meta-analysis; the KLKB1 variant replicated internally using 1,014 additional participants with exome chip. ITPRIP, PLIN2, and TSPAN18 were associated with the midlife aβ42:aβ40 ratio via the T5 test; TSPAN18 was significant via the cross-race meta-analysis, whereas ITPRIP and PLIN2 were European American-specific. NCOA1 and NT5C3B were associated with the midlife aβ42:aβ40 ratio and the fold-change in aβ42, respectively, via SKAT in African Americans. No associations replicated externally (N = 725). 
We discovered age-dependent genetic effects, established associations between vascular-related genes (KLKB1, F12, PLIN2) and midlife plasma aβ levels, and identified a plausible Alzheimer's Disease candidate gene (ITPRIP) influencing cell death. Plasma aβ concentrations may have dynamic biological determinants across the lifespan; plasma aβ study designs or analyses must consider age.

  9. Evaluation and application of summary statistic imputation to discover new height-associated loci.

    PubMed

    Rüeger, Sina; McDaid, Aaron; Kutalik, Zoltán

    2018-05-01

    As most of the heritability of complex traits is attributed to common and low frequency genetic variants, imputing them by combining genotyping chips and large sequenced reference panels is the most cost-effective approach to discover the genetic basis of these traits. Association summary statistics from genome-wide meta-analyses are available for hundreds of traits. Updating these to ever-increasing reference panels is very cumbersome as it requires reimputation of the genetic data, rerunning the association scan, and meta-analysing the results. A much more efficient method is to directly impute the summary statistics, termed summary statistics imputation, which we improved to accommodate variable sample size across SNVs. Its performance relative to genotype imputation and practical utility has not yet been fully investigated. To this end, we compared the two approaches on real (genotyped and imputed) data from 120K samples from the UK Biobank and show that genotype imputation boasts a 3- to 5-fold lower root-mean-square error, and better distinguishes true associations from null ones. We observed the largest differences in power for variants with low minor allele frequency and low imputation quality. For fixed false positive rates of 0.001, 0.01, and 0.05, using summary statistics imputation yielded a decrease in statistical power by 9, 43 and 35%, respectively. To test its capacity to discover novel associations, we applied summary statistics imputation to the GIANT height meta-analysis summary statistics covering HapMap variants, and identified 34 novel loci, 19 of which replicated using data in the UK Biobank. Additionally, we successfully replicated 55 out of the 111 variants published in an exome chip study. Our study demonstrates that summary statistics imputation is a very efficient and cost-effective way to identify and fine-map trait-associated loci. 
Moreover, the ability to impute summary statistics is important for follow-up analyses, such as Mendelian randomisation or LD-score regression.
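
Directly imputing summary statistics, as described in this abstract, can be sketched as follows. The abstract does not state the estimator, so the code assumes the standard multivariate-normal conditional expectation used in this literature: the Z-score at an untyped variant is c' C⁻¹ z, where z holds the Z-scores at typed variants, C is their LD (correlation) matrix from a reference panel, and c is the LD between the untyped variant and each typed one. The names `solve` and `impute_z` and the optional `ridge` term are hypothetical, and the variable-sample-size adjustment the authors mention is not reproduced.

```python
from typing import List, Sequence


def solve(a: Sequence[Sequence[float]], b: Sequence[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("LD matrix is singular; consider a ridge term")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def impute_z(z_typed: Sequence[float],
             ld_typed: Sequence[Sequence[float]],
             ld_cross: Sequence[float],
             ridge: float = 0.0) -> float:
    """Impute the Z-score of one untyped variant from typed-variant Z-scores."""
    n = len(z_typed)
    regularised = [[ld_typed[i][j] + (ridge if i == j else 0.0) for j in range(n)]
                   for i in range(n)]
    # C is symmetric, so c' C^{-1} z equals (C^{-1} c)' z.
    weights = solve(regularised, ld_cross)
    return sum(w * z for w, z in zip(weights, z_typed))
```

If the untyped variant is in perfect LD with one typed variant, this estimator simply returns that variant's Z-score, which is a convenient sanity check.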

  10. Evaluation and application of summary statistic imputation to discover new height-associated loci

    PubMed Central

    2018-01-01

    As most of the heritability of complex traits is attributed to common and low frequency genetic variants, imputing them by combining genotyping chips and large sequenced reference panels is the most cost-effective approach to discover the genetic basis of these traits. Association summary statistics from genome-wide meta-analyses are available for hundreds of traits. Updating these to ever-increasing reference panels is very cumbersome as it requires reimputation of the genetic data, rerunning the association scan, and meta-analysing the results. A much more efficient method is to directly impute the summary statistics, termed summary statistics imputation, which we improved to accommodate variable sample size across SNVs. Its performance relative to genotype imputation and practical utility has not yet been fully investigated. To this end, we compared the two approaches on real (genotyped and imputed) data from 120K samples from the UK Biobank and show that genotype imputation boasts a 3- to 5-fold lower root-mean-square error, and better distinguishes true associations from null ones. We observed the largest differences in power for variants with low minor allele frequency and low imputation quality. For fixed false positive rates of 0.001, 0.01, and 0.05, using summary statistics imputation yielded a decrease in statistical power by 9, 43 and 35%, respectively. To test its capacity to discover novel associations, we applied summary statistics imputation to the GIANT height meta-analysis summary statistics covering HapMap variants, and identified 34 novel loci, 19 of which replicated using data in the UK Biobank. Additionally, we successfully replicated 55 out of the 111 variants published in an exome chip study. Our study demonstrates that summary statistics imputation is a very efficient and cost-effective way to identify and fine-map trait-associated loci. 
Moreover, the ability to impute summary statistics is important for follow-up analyses, such as Mendelian randomisation or LD-score regression. PMID:29782485

  11. PVRL1 as a Candidate Gene for Nonsyndromic Cleft Lip With or Without Cleft Palate: No Evidence for the Involvement of Common or Rare Variants in Southern Han Chinese Patients

    PubMed Central

    Cheng, Hong-Qiu; Huang, En-Min; Xu, Ming-Yan; Shu, Shen-You

    2012-01-01

    The poliovirus receptor related-1 (PVRL1) gene encodes nectin-1, a cell–cell adhesion molecule (OMIM #600644), and is mutated in the cleft lip with or without cleft palate/ectodermal dysplasia-1 syndrome (CLPED1, OMIM #225000). In addition, PVRL1 mutations have been associated with nonsyndromic cleft lip with or without a cleft palate (NSCL/P) in studies of multiethnic samples. To investigate the possible involvement of this gene in southern Han Chinese NSCL/P patients, we performed (i) a case–control association study, and (ii) a resequencing study. A set of 470 patients with NSCL/P and 693 controls were recruited, and a total of 45 tagging single-nucleotide polymorphisms (SNPs) were genotyped by matrix-assisted laser desorption/ionization time-of-flight mass spectrometry. In the resequencing study, the coding regions of the PVRL1 α isoform were directly sequenced in 45 trios from multiply affected families. One (rs7128327) of the 45 tested SNPs showed a trend toward statistical significance in the genotypic-level chi-square test (p = 0.009567). However, this result did not withstand correction for multiple testing. Likewise, sliding window haplotype analyses consisting of two, three, or four SNPs failed to detect any positive association. Resequencing analysis also failed to identify any novel rare sequence variants. In conclusion, the present study provided no support for the hypothesis that common or rare variants in PVRL1 play a significant role in NSCL/P development in the southern Han Chinese population. This is the first study that has used tagging SNPs covering all the coding and noncoding regions to search for common NSCL/P-associated mutations of PVRL1. PMID:22455396
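
The statement that the rs7128327 result did not withstand multiple-testing correction can be checked with simple arithmetic. The abstract does not say which correction was applied; the sketch assumes a Bonferroni correction at a family-wise level of 0.05 across the 45 tested SNPs. The function names and the 0.05 level are assumptions.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def survives_bonferroni(p_value: float, n_tests: int, alpha: float = 0.05) -> bool:
    """True if the p-value is below the Bonferroni-corrected threshold."""
    return p_value < bonferroni_threshold(alpha, n_tests)


# Values taken from the abstract: 45 tagging SNPs, best p = 0.009567.
threshold = bonferroni_threshold(0.05, 45)       # roughly 0.0011
significant = survives_bonferroni(0.009567, 45)  # below 0.05 but above the threshold
```

Under these assumptions the reported p-value is nominally significant but about nine times larger than the corrected threshold, which is consistent with the abstract's conclusion.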

  12. A statistical method for the detection of variants from next-generation resequencing of DNA pools.

    PubMed

    Bansal, Vikas

    2010-06-15

    Next-generation sequencing technologies have enabled the sequencing of several human genomes in their entirety. However, the routine resequencing of complete genomes remains infeasible. The massive capacity of next-generation sequencers can be harnessed for sequencing specific genomic regions in hundreds to thousands of individuals. Sequencing-based association studies are currently limited by the low level of multiplexing offered by sequencing platforms. Pooled sequencing represents a cost-effective approach for studying rare variants in large populations. To utilize the power of DNA pooling, it is important to accurately identify sequence variants from pooled sequencing data. Detection of rare variants from pooled sequencing represents a different challenge than detection of variants from individual sequencing. We describe a novel statistical approach, CRISP [Comprehensive Read analysis for Identification of Single Nucleotide Polymorphisms (SNPs) from Pooled sequencing] that is able to identify both rare and common variants by using two approaches: (i) comparing the distribution of allele counts across multiple pools using contingency tables and (ii) evaluating the probability of observing multiple non-reference base calls due to sequencing errors alone. Information about the distribution of reads between the forward and reverse strands and the size of the pools is also incorporated within this framework to filter out false variants. Validation of CRISP on two separate pooled sequencing datasets generated using the Illumina Genome Analyzer demonstrates that it can detect 80-85% of SNPs identified using individual sequencing while achieving a low false discovery rate (3-5%). Comparison with previous methods for pooled SNP detection demonstrates the significantly lower false positive and false negative rates for CRISP. Implementation of this method is available at http://polymorphism.scripps.edu/~vbansal/software/CRISP/.
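
The first of the two approaches named in this abstract compares allele counts across pools using contingency tables. As a rough illustration only, the sketch below computes a Pearson chi-square statistic on a pools-by-allele table of read counts. The function name and the choice of Pearson's statistic are assumptions: the abstract does not say which contingency-table test CRISP uses, and the sequencing-error and strand-distribution steps are not shown.

```python
from typing import Sequence


def pooled_allele_chi_square(table: Sequence[Sequence[int]]) -> float:
    """Pearson chi-square statistic for a pools x alleles table of read counts.

    Each row is one pool; each column is an allele (e.g. reference, alternate).
    A large value suggests allele counts differ between pools more than
    expected if every pool had the same allele frequency.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    if total == 0 or 0 in row_totals or 0 in col_totals:
        raise ValueError("every pool and every allele needs at least one read")
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat


# Hypothetical read counts: pool 2 carries alternate-allele reads, pool 1 does not.
stat = pooled_allele_chi_square([[100, 0], [80, 20]])
```

A variant confined to one pool produces a large statistic, whereas uniform sequencing error spread evenly across pools would not, which is the intuition behind comparing pools rather than inspecting each in isolation.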

  13. Systematic meta-analyses and field synopsis of genetic association studies in colorectal adenomas

    PubMed Central

    Montazeri, Zahra; Theodoratou, Evropi; Nyiraneza, Christine; Timofeeva, Maria; Chen, Wanjing; Svinti, Victoria; Sivakumaran, Shanya; Gresham, Gillian; Cubitt, Laura; Carvajal-Carmona, Luis; Bertagnolli, Monica M; Zauber, Ann G; Tomlinson, Ian; Farrington, Susan M; Dunlop, Malcolm G; Campbell, Harry; Little, Julian

    2018-01-01

    Background Low penetrance genetic variants, primarily single nucleotide polymorphisms, have substantial influence on colorectal cancer (CRC) susceptibility. Most CRCs develop from colorectal adenomas (CRA). Here, we report the first comprehensive field synopsis that catalogues all genetic association studies on CRA, with a parallel online database (http://www.chs.med.ed.ac.uk/CRAgene/). Methods We performed a systematic review, reviewing 9750 titles and then extracted data from 130 publications reporting on 181 polymorphisms in 74 genes. We conducted meta-analyses to derive summary effect estimates for 37 polymorphisms in 26 genes. We applied the Venice criteria and Bayesian False Discovery Probability (BFDP) to assess the levels of the credibility of associations. Results We considered the association with the rs6983267 variant at 8q24 as “highly credible”, reaching genome wide statistical significance in at least one meta-analysis model. We identified “less credible” associations (higher heterogeneity, lower statistical power, BFDP>0.02) with a further four variants of four independent genes: MTHFR c.677C>T p.A222V (rs1801133), TP53 c.215C>G p.R72P (rs1042522), NQO1 c.559C>T p.P187S (rs1800566), and NAT1 alleles imputed as fast acetylator genotypes. For the remaining 32 variants of 22 genes for which positive associations with CRA risk have been previously reported, the meta-analyses revealed no credible evidence to support these as true associations. Conclusions The limited number of credible associations between low penetrance genetic variants and CRA reflects the lower volume of evidence and associated lack of statistical power to detect associations of the magnitude typically observed for genetic variants and chronic diseases. The CRAgene database provides context for CRA genetic association data and will help inform future research directions. PMID:26451011

  14. Assessing interactions between the associations of common genetic susceptibility variants, reproductive history and body mass index with breast cancer risk in the breast cancer association consortium: a combined case-control study

    PubMed Central

    2010-01-01

    Introduction Several common breast cancer genetic susceptibility variants have recently been identified. We aimed to determine how these variants combine with a subset of other known risk factors to influence breast cancer risk in white women of European ancestry using case-control studies participating in the Breast Cancer Association Consortium. Methods We evaluated two-way interactions between each of age at menarche, ever having had a live birth, number of live births, age at first birth and body mass index (BMI) and each of 12 single nucleotide polymorphisms (SNPs) (10q26-rs2981582 (FGFR2), 8q24-rs13281615, 11p15-rs3817198 (LSP1), 5q11-rs889312 (MAP3K1), 16q12-rs3803662 (TOX3), 2q35-rs13387042, 5p12-rs10941679 (MRPS30), 17q23-rs6504950 (COX11), 3p24-rs4973768 (SLC4A7), CASP8-rs17468277, TGFB1-rs1982073 and ESR1-rs3020314). Interactions were tested for by fitting logistic regression models including per-allele and linear trend main effects for SNPs and risk factors, respectively, and single-parameter interaction terms for linear departure from independent multiplicative effects. Results These analyses were applied to data for up to 26,349 invasive breast cancer cases and up to 32,208 controls from 21 case-control studies. No statistical evidence of interaction was observed beyond that expected by chance. Analyses were repeated using data from 11 population-based studies, and results were very similar. Conclusions The relative risks for breast cancer associated with the common susceptibility variants identified to date do not appear to vary across women with different reproductive histories or body mass index (BMI). The assumption of multiplicative combined effects for these established genetic and other risk factors in risk prediction models appears justified. PMID:21194473

  15. A powerful score-based test statistic for detecting gene-gene co-association.

    PubMed

    Xu, Jing; Yuan, Zhongshang; Ji, Jiadong; Zhang, Xiaoshuai; Li, Hongkai; Wu, Xuesen; Xue, Fuzhong; Liu, Yanxun

    2016-01-29

    The genetic variants identified by genome-wide association studies (GWAS) can account for only a small proportion of the total heritability of complex diseases. The existence of gene-gene joint effects, which comprise the main effects and their co-association, is one possible explanation for the "missing heritability" problem. Gene-gene co-association refers to the extent to which the joint effects of two genes differ from the main effects, owing not only to the traditional interaction under a nearly independent condition but also to the correlation between genes. Generally, genes tend to work collaboratively within a specific pathway or network contributing to the disease, and the specific disease-associated loci will often be highly correlated (e.g. single nucleotide polymorphisms (SNPs) in linkage disequilibrium). Therefore, we proposed a novel score-based statistic (SBS) as a gene-based method for detecting gene-gene co-association. Various simulations illustrate that, under different sample sizes, marginal effects of causal SNPs and co-association levels, the proposed SBS performs better than other existing methods, including single SNP-based and principal component analysis (PCA)-based logistic regression models, the statistics based on canonical correlations (CCU), kernel canonical correlation analysis (KCCU), partial least squares path modeling (PLSPM) and the delta-square (δ²) statistic. The real data analysis of rheumatoid arthritis (RA) further confirmed its advantages in practice. SBS is a powerful and efficient gene-based method for detecting gene-gene co-association.

  16. A Functional ATG16L1 (T300A) Variant is Associated with Necrotizing Enterocolitis in Premature Infants

    PubMed Central

    Sampath, Venkatesh; Bhandari, Vineet; Berger, Jessica; Merchant, Daniel; Zhang, Liyun; Ladd, Mihoko; Menden, Heather; Garland, Jeffery; Ambalavanan, Namasivayam; Mulrooney, Neil; Quasney, Michael; Dagle, John; Lavoie, Pascal M; Simpson, Pippa; Dahmer, Mary

    2017-01-01

    Background The genetic basis of dysfunctional immune responses in necrotizing enterocolitis (NEC) remains unknown. We hypothesized that variants in Nucleotide binding and Oligomerization Domain (NOD)-Like Receptors (NLRs) and Autophagy (ATG) genes modulate vulnerability to NEC. Methods We genotyped a multi-center cohort of premature infants with and without NEC for NOD1, NOD2, ATG16L1, CARD8 and NLRP3 variants. Chi-square tests and logistic regression were used for statistical analysis. Results In our primary cohort (n=1015), 86 (8.5%) infants developed NEC. The A allele of the ATG16L1 (Thr300Ala) variant was associated with increased NEC (AA vs. AG vs. GG; 11.3% vs. 8.4% vs. 4.8%, p=0.009). In regression models for NEC that adjusted for epidemiological confounders, GA (p=0.033) and the AA genotype (p=0.038) of ATG16L1 variant were associated with NEC. The association between the A allele of the ATG16L1 variant and NEC remained significant among Caucasian infants (p=0.02). In a replication cohort (n=259), NEC rates were highest among infants with the AA genotype but did not reach statistical significance. Conclusion We report a novel association between a hypomorphic variant in an autophagy gene (ATG16L1) and NEC in premature infants. Our data suggest that decreased autophagy arising from genetic variants may confer protection against NEC. PMID:27893720

  17. Association of the distal region of the ectonucleotide pyrophosphatase/phosphodiesterase 1 gene with type 2 diabetes in an African-American population enriched for nephropathy.

    PubMed

    Keene, Keith L; Mychaleckyj, Josyf C; Smith, Shelly G; Leak, Tennille S; Perlegas, Peter S; Langefeld, Carl D; Freedman, Barry I; Rich, Stephen S; Bowden, Donald W; Sale, Michèle M

    2008-04-01

    Variants in the ectonucleotide pyrophosphatase/phosphodiesterase 1 (ENPP1) gene have shown positive associations with diabetes and related phenotypes, including insulin resistance, metabolic syndrome, and type 1 diabetic nephropathy. Additionally, evidence for linkage for type 2 diabetes in African Americans was observed at 6q24-27, with the proximal edge of the peak encompassing the ENPP1 gene. Our objective was to comprehensively evaluate variants in ENPP1 for association with type 2 diabetic end-stage renal disease (ESRD). Forty-nine single nucleotide polymorphisms (SNPs) located in the coding and flanking regions of ENPP1 were genotyped in 577 African-American individuals with type 2 diabetic ESRD and 596 African-American control subjects. Haplotypic association and genotypic association for the dominant, additive, and recessive models were tested by calculating a χ² statistic and corresponding P value. Nine SNPs showed nominal evidence for association (P < 0.05) with type 2 diabetic ESRD in one or more genotypic model. The most significant associations were observed with rs7754586 (P = 0.003 dominant model, P = 0.0005 additive, and P = 0.007 recessive), located in the 3' untranslated region, and an intron 24 SNP (rs1974201: P = 0.004 dominant, P = 0.0005 additive, and P = 0.005 recessive). However, the extensively studied K121Q variant (rs1044498) did not reveal evidence for association with type 2 diabetic ESRD in this African-American population. This study was the first to comprehensively evaluate variants of the ENPP1 gene for association in an African-American population with type 2 diabetes and ESRD and suggests that variants in the distal region of the ENPP1 gene may contribute to diabetes or diabetic nephropathy susceptibility in African Americans.

  18. A Comparison Study of Multivariate Fixed Models and Gene Association with Multiple Traits (GAMuT) for Next-Generation Sequencing

    PubMed Central

    Chiu, Chi-yang; Jung, Jeesun; Wang, Yifan; Weeks, Daniel E.; Wilson, Alexander F.; Bailey-Wilson, Joan E.; Amos, Christopher I.; Mills, James L.; Boehnke, Michael; Xiong, Momiao; Fan, Ruzong

    2016-01-01

    In this paper, extensive simulations are performed to compare two statistical methods to analyze multiple correlated quantitative phenotypes: (1) approximate F-distributed tests of multivariate functional linear models (MFLM) and additive models of multivariate analysis of variance (MANOVA), and (2) Gene Association with Multiple Traits (GAMuT) for association testing of high-dimensional genotype data. It is shown that approximate F-distributed tests of MFLM and MANOVA have higher power and are more appropriate for major gene association analysis (i.e., scenarios in which some genetic variants have relatively large effects on the phenotypes); GAMuT has higher power and is more appropriate for analyzing polygenic effects (i.e., effects from a large number of genetic variants each of which contributes a small amount to the phenotypes). MFLM and MANOVA are very flexible and can be used to perform association analysis for: (i) rare variants, (ii) common variants, and (iii) a combination of rare and common variants. Although GAMuT was designed to analyze rare variants, it can be applied to analyze a combination of rare and common variants and it performs well when (1) the number of genetic variants is large and (2) each variant contributes a small amount to the phenotypes (i.e., polygenes). MFLM and MANOVA are fixed effect models which perform well for major gene association analysis. GAMuT can be viewed as an extension of sequence kernel association tests (SKAT). Both GAMuT and SKAT are more appropriate for analyzing polygenic effects and they perform well not only in the rare variant case, but also in the case of a combination of rare and common variants. Data analyses of European cohorts and the Trinity Students Study are presented to compare the performance of the two methods. PMID:27917525

  19. Efficient inference for genetic association studies with multiple outcomes.

    PubMed

    Ruffieux, Helene; Davison, Anthony C; Hager, Jorg; Irincheeva, Irina

    2017-10-01

    Combined inference for heterogeneous high-dimensional data is critical in modern biology, where clinical and various kinds of molecular data may be available from a single study. Classical genetic association studies regress a single clinical outcome on many genetic variants one by one, but there is an increasing demand for joint analysis of many molecular outcomes and genetic variants in order to unravel functional interactions. Unfortunately, most existing approaches to joint modeling are either too simplistic to be powerful or are impracticable for computational reasons. Inspired by Richardson and others (2010, Bayesian Statistics 9), we consider a sparse multivariate regression model that allows simultaneous selection of predictors and associated responses. As Markov chain Monte Carlo (MCMC) inference on such models can be prohibitively slow when the number of genetic variants exceeds a few thousand, we propose a variational inference approach which produces posterior information very close to that of MCMC inference, at a much reduced computational cost. Extensive numerical experiments show that our approach outperforms popular variable selection methods and tailored Bayesian procedures, dealing within hours with problems involving hundreds of thousands of genetic variants and tens to hundreds of clinical or molecular outcomes. © The Author 2017. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  20. Pharmacogenetics of asthma

    PubMed Central

    Lima, John J.; Blake, Kathryn V.; Tantisira, Kelan G.; Weiss, Scott T.

    2009-01-01

    Purpose of review Patient response to the asthma drug classes (bronchodilators, inhaled corticosteroids and leukotriene modifiers) is characterized by a large degree of heterogeneity, which is attributable in part to genetic variation. Herein, we review and update the pharmacogenetics and pharmacogenomics of common asthma drugs. Recent findings Early studies suggest that bronchodilator reversibility and asthma worsening in patients on continuous short-acting and long-acting β-agonists are related to the Gly16Arg genotype for ADRB2. More recent studies, including genome-wide association studies, implicate variants in other genes as contributors to bronchodilator response heterogeneity and fail to replicate the asthma worsening associated with continuous β-agonist use. Genetic determinants of the safety of long-acting β-agonists require further study. Variants in CRHR1, TBX21, and FCER2 contribute to variability in response for lung function, airways responsiveness, and exacerbations in patients taking inhaled corticosteroids. Variants in ALOX5, LTA4H, LTC4S, ABCC1, CYSLTR2, and SLCO2B1 contribute to variability in response to leukotriene modifiers. Summary Identification of novel variants that contribute to response heterogeneity supports future studies of single nucleotide polymorphism discovery, including gene expression and genome-wide association studies. Statistical models that predict the genomics of response to asthma drugs will complement single nucleotide polymorphism discovery in moving toward personalized medicine. PMID:19077707

  1. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kaplow, Irene M.; MacIsaac, Julia L.; Mah, Sarah M.

    DNA methylation is an epigenetic modification that plays a key role in gene regulation. Previous studies have investigated its genetic basis by mapping genetic variants that are associated with DNA methylation at specific sites, but these have been limited to microarrays that cover <2% of the genome and cannot account for allele-specific methylation (ASM). Other studies have performed whole-genome bisulfite sequencing on a few individuals, but these lack statistical power to identify variants associated with DNA methylation. We present a novel approach in which bisulfite-treated DNA from many individuals is sequenced together in a single pool, resulting in a truly genome-wide map of DNA methylation. Compared to methods that do not account for ASM, our approach increases statistical power to detect associations while sharply reducing cost, effort, and experimental variability. As a proof of concept, we generated deep sequencing data from a pool of 60 human cell lines; we evaluated almost twice as many CpGs as the largest microarray studies and identified more than 2000 genetic variants associated with DNA methylation. Here we found that these variants are highly enriched for associations with chromatin accessibility and CTCF binding but are less likely to be associated with traits indirectly linked to DNA, such as gene expression and disease phenotypes. In summary, our approach allows genome-wide mapping of genetic variants associated with DNA methylation in any tissue of any species, without the need for individual-level genotype or methylation data.

  2. Improved methods for multi-trait fine mapping of pleiotropic risk loci.

    PubMed

    Kichaev, Gleb; Roytman, Megan; Johnson, Ruth; Eskin, Eleazar; Lindström, Sara; Kraft, Peter; Pasaniuc, Bogdan

    2017-01-15

    Genome-wide association studies (GWAS) have identified thousands of regions in the genome that contain genetic variants that increase risk for complex traits and diseases. However, the variants uncovered in GWAS are typically not biologically causal, but rather, correlated to the true causal variant through linkage disequilibrium (LD). To discern the true causal variant(s), a variety of statistical fine-mapping methods have been proposed to prioritize variants for functional validation. In this work we introduce a new approach, fastPAINTOR, that leverages evidence across correlated traits, as well as functional annotation data, to improve fine-mapping accuracy at pleiotropic risk loci. To improve computational efficiency, we describe a new importance sampling scheme to perform model inference. First, we demonstrate in simulations that by leveraging functional annotation data, fastPAINTOR increases fine-mapping resolution relative to existing methods. Next, we show that jointly modeling pleiotropic risk regions improves fine-mapping resolution compared to standard single trait and pleiotropic fine mapping strategies. We report a reduction in the number of SNPs required for follow-up in order to capture 90% of the causal variants from 23 SNPs per locus using a single trait to 12 SNPs when fine-mapping two traits simultaneously. Finally, we analyze summary association data from a large-scale GWAS of lipids and show that these improvements are largely sustained in real data. The fastPAINTOR framework is implemented in the PAINTOR v3.0 package, which is publicly available to the research community at http://bogdan.bioinformatics.ucla.edu/software/paintor. Contact: gkichaev@ucla.edu. Supplementary information: Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
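
    The "SNPs required for follow-up to capture 90% of the causal variants" figure in this record can be made concrete with a small sketch. The code below is not fastPAINTOR and does not reproduce its importance sampling; it only illustrates, under the simplifying assumption of one causal variant per locus, how a 90% set is read off from per-SNP posterior probabilities. The function name `credible_set`, the SNP identifiers, and the probabilities are all hypothetical.

```python
def credible_set(posteriors, coverage=0.90):
    """Return the smallest list of SNP ids whose normalized posterior
    probabilities sum to at least `coverage` (highest probability first)."""
    total = sum(posteriors.values())
    if total <= 0:
        raise ValueError("posterior probabilities must have a positive sum")
    # Rank SNPs from most to least probable; ties keep their input order.
    ranked = sorted(posteriors.items(), key=lambda kv: kv[1], reverse=True)
    chosen, cumulative = [], 0.0
    for snp, prob in ranked:
        chosen.append(snp)
        cumulative += prob / total
        if cumulative >= coverage:
            break
    return chosen


# Hypothetical posterior probabilities for five SNPs at one locus.
toy = {"rs1": 0.50, "rs2": 0.25, "rs3": 0.125, "rs4": 0.0625, "rs5": 0.0625}
print(credible_set(toy))
```

    A sharper posterior (more mass on fewer SNPs) yields a smaller set, which is the sense in which better fine-mapping resolution reduces the number of SNPs taken forward.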

  3. Discovering genetic variants in Crohn's disease by exploring genomic regions enriched of weak association signals.

    PubMed

    D'Addabbo, Annarita; Palmieri, Orazio; Maglietta, Rosalia; Latiano, Anna; Mukherjee, Sayan; Annese, Vito; Ancona, Nicola

    2011-08-01

    A meta-analysis has re-analysed previous genome-wide association scans, definitively confirming eleven genes and further identifying 21 new loci. However, the identified genes/loci still explain only a minority of the genetic predisposition to Crohn's disease. We aimed to identify genes weakly involved in disease predisposition by analysing chromosomal regions enriched for single nucleotide polymorphisms with modest statistical association. We utilized the WTCCC data set, evaluating 1748 CD cases and 2938 controls. The identification of candidate genes/loci was performed by a two-step procedure: first, chromosomal regions enriched for weak association signals were localized; subsequently, weak signals clustered in gene regions were identified. The statistical significance was assessed by nonparametric permutation tests. The cytoband enrichment analysis highlighted 44 regions (P≤0.05) enriched with single nucleotide polymorphisms significantly associated with the trait, including 23 out of 31 previously confirmed and replicated genes. Importantly, we highlight a further 20 novel chromosomal regions carrying approximately one hundred genes/loci with modest association. Amongst these we find compelling functional candidate genes such as MAPT, GRB2 and CREM, LCT, and IL12RB2. Our study suggests a different statistical perspective to discover genes weakly associated with a given trait, although further confirmatory functional studies are needed. Copyright © 2011 Editrice Gastroenterologica Italiana S.r.l. All rights reserved.

  4. Association of keratin 8/18 variants with non-alcoholic fatty liver disease and insulin resistance in Chinese patients: A case-control study.

    PubMed

    Li, Rui; Liao, Xian-Hua; Ye, Jun-Zhao; Li, Min-Rui; Wu, Yan-Qin; Hu, Xuan; Zhong, Bi-Hui

    2017-06-14

    To test the hypothesis that K8/K18 variants predispose humans to non-alcoholic fatty liver disease (NAFLD) progression and its metabolic phenotypes. We selected a total of 373 unrelated adult subjects from our Physical Examination Department, including 200 unrelated NAFLD patients and 173 controls of both genders and different ages. Diagnoses of NAFLD were established according to ultrasonic signs of fatty liver. All subjects were tested for population characteristics, lipid profile, liver tests, as well as glucose tests. Genomic DNA was obtained from peripheral blood with a DNeasy Tissue Kit. K8/K18 coding regions were analyzed, including 15 exons and exon-intron boundaries. Among 200 NAFLD patients, 10 (5%) heterozygous carriers of keratin variants were identified. There were 5 amino-acid-altering heterozygous variants and 6 non-coding heterozygous variants. One novel amino-acid-altering heterozygous variant (K18 N193S) and three novel non-coding variants were observed (K8 IVS5-9A→G, K8 IVS6+19G→A, K18 T195T). A total of 9 patients had a single variant and 1 patient had compound variants (K18 N193S+K8 IVS3-15C→G). Only one R341H variant was found in the control group (1 of 173, 0.58%). The frequency of keratin variants in NAFLD patients was significantly higher than that in the control group (5% vs 0.58%, P = 0.015). Notably, the keratin variants were significantly associated with insulin resistance (IR) in NAFLD patients (8.86% in NAFLD patients with IR vs 2.5% in NAFLD patients without IR, P = 0.043). K8/K18 variants are overrepresented in Chinese NAFLD patients and might accelerate liver fat storage through IR.
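
    The carrier-frequency comparison reported in this record (10 of 200 NAFLD patients versus 1 of 173 controls) can be rechecked from the counts alone. The abstract does not state which test produced P = 0.015, so the sketch below assumes a two-sided Fisher exact test as one plausible choice; the function name is hypothetical and the result need not match the published value exactly.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(k):
        # Probability that k of the col1 "carriers" fall in the first row.
        return comb(row1, k) * comb(row2, col1 - k) / denom

    lo, hi = max(0, col1 - row2), min(col1, row1)
    p_obs = prob(a)
    # Small relative tolerance guards against floating-point ties.
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))


# Counts taken from the abstract: carriers / non-carriers in each group.
p_value = fisher_exact_two_sided(10, 190, 1, 172)
print(f"carrier frequency: {10 / 200:.2%} vs {1 / 173:.2%}, p = {p_value:.4f}")
```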

  5. A Fast Multiple-Kernel Method With Applications to Detect Gene-Environment Interaction.

    PubMed

    Marceau, Rachel; Lu, Wenbin; Holloway, Shannon; Sale, Michèle M; Worrall, Bradford B; Williams, Stephen R; Hsu, Fang-Chi; Tzeng, Jung-Ying

    2015-09-01

    Kernel machine (KM) models are a powerful tool for exploring associations between sets of genetic variants and complex traits. Although most KM methods use a single kernel function to assess the marginal effect of a variable set, KM analyses involving multiple kernels have become increasingly popular. Multikernel analysis allows researchers to study more complex problems, such as assessing gene-gene or gene-environment interactions, incorporating variance-component based methods for population substructure into rare-variant association testing, and assessing the conditional effects of a variable set adjusting for other variable sets. The KM framework is robust, powerful, and provides efficient dimension reduction for multifactor analyses, but requires the estimation of high dimensional nuisance parameters. Traditional estimation techniques, including regularization and the "expectation-maximization (EM)" algorithm, have a large computational cost and are not scalable to large sample sizes needed for rare variant analysis. Therefore, under the context of gene-environment interaction, we propose a computationally efficient and statistically rigorous "fastKM" algorithm for multikernel analysis that is based on a low-rank approximation to the nuisance effect kernel matrices. Our algorithm is applicable to various trait types (e.g., continuous, binary, and survival traits) and can be implemented using any existing single-kernel analysis software. Through extensive simulation studies, we show that our algorithm has similar performance to an EM-based KM approach for quantitative traits while running much faster. We also apply our method to the Vitamin Intervention for Stroke Prevention (VISP) clinical trial, examining gene-by-vitamin effects on recurrent stroke risk and gene-by-age effects on change in homocysteine level. © 2015 WILEY PERIODICALS, INC.

  6. Further evidence of MAO-A gene variants associated with bipolar disorder.

    PubMed

    Müller, Daniel J; Serretti, Alessandro; Sicard, Tricia; Tharmalingam, Subi; King, Nicole; Artioli, Paola; Mandelli, Laura; Lorenzi, Cristina; Kennedy, James L

    2007-01-05

    The aim of this study was to investigate MAOA gene variants in bipolar disorder by using a family-based association approach. The first sample included 331 nuclear families from Western and Central Canada with at least 1 offspring affected with bipolar disorder comprising a total of 1,044 individuals. All subjects were genotyped for MAOA-941T > G and -uVNTR gene variants using PCR techniques. Haplotype TDT was statistically significant (LRS = 12.17; df = 3; P = 0.0068; permutation global significance = 0.00098), with the T-4 haplotype significantly associated with bipolar disorder (OR = 1.63, 95% CI = 1.11-2.37). Single marker analysis evidenced a borderline association for MAOA-941T > G (P = 0.04), but not for the uVNTR. Pooling the Canadian sample with a second previously reported Italian sample genotyped for the uVNTR variant, negative results were obtained as well. No different results were detected when analyzing female subjects separately. In conclusion, our family-based association study gives mild but further support of the involvement of MAOA variants in bipolar disorder.

  7. Metastatic phaeochromocytoma in a 23-year-old woman with an unclassified variant in the von Hippel Lindau disease gene: how can the pathogenicity of this variant be determined?

    PubMed

    Russell, Nicholas; Delatycki, Martin; Grossmann, Mathis

    2015-07-01

    A 23-year-old woman with metastatic phaeochromocytoma was found to have a previously unclassified variant in the von Hippel Lindau disease gene (c.361G>C). We use this case to highlight the issue of unclassified single nucleotide variants and the approaches to help predict whether they are disease causing or neutral. With increasing use of genetic testing, and widespread clinical use of next-generation sequencing around the corner, this issue is likely to become more prominent. © 2015 John Wiley & Sons Ltd.

  8. Single scan parameterization of space-variant point spread functions in image space via a printed array: the impact for two PET/CT scanners.

    PubMed

    Kotasidis, F A; Matthews, J C; Angelis, G I; Noonan, P J; Jackson, A; Price, P; Lionheart, W R; Reader, A J

    2011-05-21

    Incorporation of a resolution model during statistical image reconstruction often produces images of improved resolution and signal-to-noise ratio. A novel and practical methodology to rapidly and accurately determine the overall emission and detection blurring component of the system matrix using a printed point source array within a custom-made Perspex phantom is presented. The array was scanned at different positions and orientations within the field of view (FOV) to examine the feasibility of extrapolating the measured point source blurring to other locations in the FOV and the robustness of measurements from a single point source array scan. We measured the spatially-variant image-based blurring on two PET/CT scanners, the B-Hi-Rez and the TruePoint TrueV. These measured spatially-variant kernels and the spatially-invariant kernel at the FOV centre were then incorporated within an ordinary Poisson ordered subset expectation maximization (OP-OSEM) algorithm and compared to the manufacturer's implementation using projection space resolution modelling (RM). Comparisons were based on a point source array, the NEMA IEC image quality phantom, the Cologne resolution phantom and two clinical studies (carbon-11 labelled anti-sense oligonucleotide [(11)C]-ASO and fluorine-18 labelled fluoro-l-thymidine [(18)F]-FLT). Robust and accurate measurements of spatially-variant image blurring were successfully obtained from a single scan. Spatially-variant resolution modelling resulted in notable resolution improvements away from the centre of the FOV. Comparison between spatially-variant image-space methods and the projection-space approach (the first such report, using a range of studies) demonstrated very similar performance with our image-based implementation producing slightly better contrast recovery (CR) for the same level of image roughness (IR). These results demonstrate that image-based resolution modelling within reconstruction is a valid alternative to projection-based modelling, and that, when using the proposed practical methodology, the necessary resolution measurements can be obtained from a single scan. This approach avoids the relatively time-consuming and involved procedures previously proposed in the literature.

  9. [Fine mapping of complex disease susceptibility loci].

    PubMed

    Song, Qingfeng; Zhang, Hongxing; Ma, Yilong; Zhou, Gangqiao

    2014-01-01

    Genome-wide association studies (GWAS) using single nucleotide polymorphism (SNP) markers have identified more than 3800 susceptibility loci for more than 660 diseases or traits. However, the most significantly associated variants or causative variants in these loci and their biological functions have remained to be clarified. These causative variants can help to elucidate the pathogenesis and discover new biomarkers of complex diseases. One of the main goals in the post-GWAS era is to identify the causative variants and susceptibility genes, and clarify their functional aspects by fine mapping. For common variants, imputation or re-sequencing based strategies were implemented to increase the number of analyzed variants and help to identify the most significantly associated variants. In addition, functional element, expression quantitative trait locus (eQTL) and haplotype analyses were performed to identify functional common variants and susceptibility genes. For rare variants, fine mapping was carried out by re-sequencing, rare haplotype analysis, family-based analysis, burden test, etc. This review summarizes the strategies and problems for fine mapping.

  10. Screening for single nucleotide variants, small indels and exon deletions with a next-generation sequencing based gene panel approach for Usher syndrome

    PubMed Central

    Krawitz, Peter M; Schiska, Daniela; Krüger, Ulrike; Appelt, Sandra; Heinrich, Verena; Parkhomchuk, Dmitri; Timmermann, Bernd; Millan, Jose M; Robinson, Peter N; Mundlos, Stefan; Hecht, Jochen; Gross, Manfred

    2014-01-01

    Usher syndrome is an autosomal recessive disorder characterized both by deafness and blindness. For the three clinical subtypes of Usher syndrome causal mutations in altogether 12 genes and a modifier gene have been identified. Due to the genetic heterogeneity of Usher syndrome, the molecular analysis is predestined for a comprehensive and parallelized analysis of all known genes by next-generation sequencing (NGS) approaches. We describe here the targeted enrichment and deep sequencing for exons of Usher genes and compare the costs and workload of this approach compared to Sanger sequencing. We also present a bioinformatics analysis pipeline that allows us to detect single-nucleotide variants, short insertions and deletions, as well as copy number variations of one or more exons on the same sequence data. Additionally, we present a flexible in silico gene panel for the analysis of sequence variants, in which newly identified genes can easily be included. We applied this approach to a cohort of 44 Usher patients and detected biallelic pathogenic mutations in 35 individuals and monoallelic mutations in eight individuals of our cohort. Thirty-nine of the sequence variants, including two heterozygous deletions comprising several exons of USH2A, have not been reported so far. Our NGS-based approach allowed us to assess single-nucleotide variants, small indels, and whole exon deletions in a single test. The described diagnostic approach is fast and cost-effective with a high molecular diagnostic yield. PMID:25333064

  11. Screening for single nucleotide variants, small indels and exon deletions with a next-generation sequencing based gene panel approach for Usher syndrome.

    PubMed

    Krawitz, Peter M; Schiska, Daniela; Krüger, Ulrike; Appelt, Sandra; Heinrich, Verena; Parkhomchuk, Dmitri; Timmermann, Bernd; Millan, Jose M; Robinson, Peter N; Mundlos, Stefan; Hecht, Jochen; Gross, Manfred

    2014-09-01

    Usher syndrome is an autosomal recessive disorder characterized both by deafness and blindness. For the three clinical subtypes of Usher syndrome causal mutations in altogether 12 genes and a modifier gene have been identified. Due to the genetic heterogeneity of Usher syndrome, the molecular analysis is predestined for a comprehensive and parallelized analysis of all known genes by next-generation sequencing (NGS) approaches. We describe here the targeted enrichment and deep sequencing for exons of Usher genes and compare the costs and workload of this approach compared to Sanger sequencing. We also present a bioinformatics analysis pipeline that allows us to detect single-nucleotide variants, short insertions and deletions, as well as copy number variations of one or more exons on the same sequence data. Additionally, we present a flexible in silico gene panel for the analysis of sequence variants, in which newly identified genes can easily be included. We applied this approach to a cohort of 44 Usher patients and detected biallelic pathogenic mutations in 35 individuals and monoallelic mutations in eight individuals of our cohort. Thirty-nine of the sequence variants, including two heterozygous deletions comprising several exons of USH2A, have not been reported so far. Our NGS-based approach allowed us to assess single-nucleotide variants, small indels, and whole exon deletions in a single test. The described diagnostic approach is fast and cost-effective with a high molecular diagnostic yield.

  12. Microsatellites as targets of natural selection.

    PubMed

    Haasl, Ryan J; Payseur, Bret A

    2013-02-01

    The ability to survey polymorphism on a genomic scale has enabled genome-wide scans for the targets of natural selection. Theory that connects patterns of genetic variation to evidence of natural selection most often assumes a diallelic locus and no recurrent mutation. Although these assumptions are suitable to selection that targets single nucleotide variants, fundamentally different types of mutation generate abundant polymorphism in genomes. Moreover, recent empirical results suggest that mutationally complex, multiallelic loci including microsatellites and copy number variants are sometimes targeted by natural selection. Given their abundance, the lack of inference methods tailored to the mutational peculiarities of these types of loci represents a notable gap in our ability to interrogate genomes for signatures of natural selection. Previous theoretical investigations of mutation-selection balance at multiallelic loci include assumptions that limit their application to inference from empirical data. Focusing on microsatellites, we assess the dynamics and population-level consequences of selection targeting mutationally complex variants. We develop general models of a multiallelic fitness surface, a realistic model of microsatellite mutation, and an efficient simulation algorithm. Using these tools, we explore mutation-selection-drift equilibrium at microsatellites and investigate the mutational history and selective regime of the microsatellite that causes Friedreich's ataxia. We characterize microsatellite selective events by their duration and cost, note similarities to sweeps from standing point variation, and conclude that it is premature to label microsatellites as ubiquitous agents of efficient adaptive change. Together, our models and simulation algorithm provide a powerful framework for statistical inference, which can be used to test the neutrality of microsatellites and other multiallelic variants.

  13. Microsatellites as Targets of Natural Selection

    PubMed Central

    Haasl, Ryan J.; Payseur, Bret A.

    2013-01-01

    The ability to survey polymorphism on a genomic scale has enabled genome-wide scans for the targets of natural selection. Theory that connects patterns of genetic variation to evidence of natural selection most often assumes a diallelic locus and no recurrent mutation. Although these assumptions are suitable to selection that targets single nucleotide variants, fundamentally different types of mutation generate abundant polymorphism in genomes. Moreover, recent empirical results suggest that mutationally complex, multiallelic loci including microsatellites and copy number variants are sometimes targeted by natural selection. Given their abundance, the lack of inference methods tailored to the mutational peculiarities of these types of loci represents a notable gap in our ability to interrogate genomes for signatures of natural selection. Previous theoretical investigations of mutation-selection balance at multiallelic loci include assumptions that limit their application to inference from empirical data. Focusing on microsatellites, we assess the dynamics and population-level consequences of selection targeting mutationally complex variants. We develop general models of a multiallelic fitness surface, a realistic model of microsatellite mutation, and an efficient simulation algorithm. Using these tools, we explore mutation-selection-drift equilibrium at microsatellites and investigate the mutational history and selective regime of the microsatellite that causes Friedreich’s ataxia. We characterize microsatellite selective events by their duration and cost, note similarities to sweeps from standing point variation, and conclude that it is premature to label microsatellites as ubiquitous agents of efficient adaptive change. Together, our models and simulation algorithm provide a powerful framework for statistical inference, which can be used to test the neutrality of microsatellites and other multiallelic variants. PMID:23104080

  14. Independent test assessment using the extreme value distribution theory.

    PubMed

    Almeida, Marcio; Blondell, Lucy; Peralta, Juan M; Kent, Jack W; Jun, Goo; Teslovich, Tanya M; Fuchsberger, Christian; Wood, Andrew R; Manning, Alisa K; Frayling, Timothy M; Cingolani, Pablo E; Sladek, Robert; Dyer, Thomas D; Abecasis, Goncalo; Duggirala, Ravindranath; Blangero, John

    2016-01-01

    The new generation of whole genome sequencing platforms offers great possibilities and challenges for dissecting the genetic basis of complex traits. With a very high number of sequence variants, a naïve multiple hypothesis threshold correction hinders the identification of reliable associations by the overreduction of statistical power. In this report, we examine 2 alternative approaches to improve the statistical power of a whole genome association study to detect reliable genetic associations. The approaches were tested using the Genetic Analysis Workshop 19 (GAW19) whole genome sequencing data. The first tested method estimates the real number of effective independent tests actually being performed in a whole genome association project by the use of an extreme value distribution and a set of phenotype simulations. Given the familial nature of the GAW19 data and the finite number of pedigree founders in the sample, the number of correlations between genotypes is greater than in a set of unrelated samples. Using our procedure, we estimate that the effective number represents only 15% of the total number of independent tests performed. However, even using this corrected significance threshold, no genome-wide significant association could be detected for systolic and diastolic blood pressure traits. The second approach implements a biological relevance-driven hypothesis tested by exploiting prior computational predictions on the effect of nonsynonymous genetic variants detected in a whole genome sequencing association study. This guided testing approach was able to identify 2 promising single-nucleotide polymorphisms (SNPs), 1 for each trait, targeting biologically relevant genes that could help shed light on the genesis of human hypertension. The first gene, PFH14, associated with systolic blood pressure, interacts directly with genes involved in calcium-channel formation and the second gene, MAP4, encodes a microtubule-associated protein and had already been detected by previous genome-wide association study experiments conducted in an Asian population. Our results highlight the necessity of developing alternative approaches to improve the efficiency of detecting reasonable candidate associations in whole genome sequencing studies.
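
    The effect of the "15% effective tests" estimate in this record on a significance threshold can be shown with a short arithmetic sketch. It assumes a plain Bonferroni-style division of the family-wise error rate by the number of tests; the extreme value distribution fit and phenotype simulations the authors used to obtain the 15% figure are not reproduced. The function name and the variant count of one million are hypothetical.

```python
def bonferroni_threshold(n_tests, alpha=0.05, effective_fraction=1.0):
    """Per-test significance threshold for a family-wise error rate `alpha`.

    `effective_fraction` scales the nominal number of tests down to the
    estimated number of effectively independent tests.
    """
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    if not 0 < effective_fraction <= 1:
        raise ValueError("effective_fraction must be in (0, 1]")
    return alpha / (n_tests * effective_fraction)


n_variants = 1_000_000  # hypothetical number of tested variants
naive = bonferroni_threshold(n_variants)
corrected = bonferroni_threshold(n_variants, effective_fraction=0.15)
print(f"naive threshold:     {naive:.2e}")
print(f"corrected threshold: {corrected:.2e}")
```

    Under these assumptions the corrected threshold is 1/0.15 (roughly 6.7) times less stringent than the naive one, whatever the variant count.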

  15. Improving the Crossing-SIBTEST Statistic for Detecting Non-uniform DIF.

    PubMed

    Chalmers, R Philip

    2018-06-01

    This paper demonstrates that, after applying a simple modification to Li and Stout's (Psychometrika 61(4):647-677, 1996) CSIBTEST statistic, an improved variant of the statistic could be realized. It is shown that this modified version of CSIBTEST has a more direct association with the SIBTEST statistic presented by Shealy and Stout (Psychometrika 58(2):159-194, 1993). In particular, the asymptotic sampling distributions and general interpretation of the effect size estimates are the same for SIBTEST and the new CSIBTEST. Given the more natural connection to SIBTEST, it is shown that Li and Stout's hypothesis testing approach is insufficient for CSIBTEST; thus, an improved hypothesis testing procedure is required. Based on the presented arguments, a new chi-squared-based hypothesis testing approach is proposed for the modified CSIBTEST statistic. Positive results from a modest Monte Carlo simulation study strongly suggest the original CSIBTEST procedure and randomization hypothesis testing approach should be replaced by the modified statistic and hypothesis testing method.

  16. Looking beyond the exome: a phenotype-first approach to molecular diagnostic resolution in rare and undiagnosed diseases.

    PubMed

    Pena, Loren D M; Jiang, Yong-Hui; Schoch, Kelly; Spillmann, Rebecca C; Walley, Nicole; Stong, Nicholas; Rapisardo Horn, Sarah; Sullivan, Jennifer A; McConkie-Rosell, Allyn; Kansagra, Sujay; Smith, Edward C; El-Dairi, Mays; Bellet, Jane; Keels, Martha Ann; Jasien, Joan; Kranz, Peter G; Noel, Richard; Nagaraj, Shashi K; Lark, Robert K; Wechsler, Daniel S G; Del Gaudio, Daniela; Leung, Marco L; Hendon, Laura G; Parker, Collette C; Jones, Kelly L; Goldstein, David B; Shashi, Vandana

    2018-04-01

    Purpose: To describe examples of missed pathogenic variants on whole-exome sequencing (WES) and the importance of deep phenotyping for further diagnostic testing. Methods: Guided by phenotypic information, three children with negative WES underwent targeted single-gene testing. Results: Individual 1 had a clinical diagnosis consistent with infantile systemic hyalinosis, although WES and a next-generation sequencing (NGS)-based ANTXR2 test were negative. Sanger sequencing of ANTXR2 revealed a homozygous single base pair insertion, previously missed by the WES variant caller software. Individual 2 had neurodevelopmental regression and cerebellar atrophy, with no diagnosis on WES. New clinical findings prompted Sanger sequencing and copy number testing of PLA2G6. A novel homozygous deletion of the noncoding exon 1 (not included in the WES capture kit) was detected, with extension into the promoter, confirming the clinical suspicion of infantile neuroaxonal dystrophy. Individual 3 had progressive ataxia, spasticity, and magnetic resonance image changes of vanishing white matter leukoencephalopathy. An NGS leukodystrophy gene panel and WES showed a heterozygous pathogenic variant in EIF2B5; no deletions/duplications were detected. Sanger sequencing of EIF2B5 showed a frameshift indel, probably missed owing to failure of alignment. Conclusion: These cases illustrate potential pitfalls of WES/NGS testing and the importance of phenotype-guided molecular testing in yielding diagnoses.

  17. Looking beyond the exome: a phenotype-first approach to molecular diagnostic resolution in rare and undiagnosed diseases

    PubMed Central

    Pena, Loren DM; Jiang, Yong-Hui; Schoch, Kelly; Spillmann, Rebecca C.; Walley, Nicole; Stong, Nicholas; Horn, Sarah Rapisardo; Sullivan, Jennifer A.; McConkie-Rosell, Allyn; Kansagra, Sujay; Smith, Edward C.; El-Dairi, Mays; Bellet, Jane; Ann Keels, Martha; Jasien, Joan; Kranz, Peter G.; Noel, Richard; Nagaraj, Shashi K.; Lark, Robert K.; Wechsler, Daniel SG; del Gaudio, Daniela; Leung, Marco L.; Hendon, Laura G.; Parker, Collette C.; Jones, Kelly L.; Goldstein, David B.; Shashi, Vandana

    2017-01-01

    Purpose To describe examples of missed pathogenic variants on whole exome sequencing (WES) and the importance of deep phenotyping for further diagnostic testing. Methods Guided by phenotypic information, three children with negative WES underwent targeted single gene testing. Results Individual 1 had a clinical diagnosis consistent with infantile systemic hyalinosis, although WES and an NGS-based ANTXR2 test were negative. Sanger sequencing of ANTXR2 revealed a homozygous single base pair insertion, previously missed by the WES variant caller software. Individual 2 had neurodevelopmental regression and cerebellar atrophy, with no diagnosis on WES. New clinical findings prompted Sanger sequencing and copy number testing of PLA2G6. A novel homozygous deletion of the non-coding exon 1 (not included in the WES capture kit) was detected, with extension into the promoter, confirming the clinical suspicion of infantile neuroaxonal dystrophy. Individual 3 had progressive ataxia, spasticity and MRI changes of vanishing white matter leukoencephalopathy. An NGS leukodystrophy gene panel and WES showed a heterozygous pathogenic variant in EIF2B5; no deletions/duplications were detected. Sanger sequencing of EIF2B5 showed a frameshift indel, likely missed due to failure of alignment. Conclusions These cases illustrate potential pitfalls of WES/NGS testing, and the importance of phenotype-guided molecular testing in yielding diagnoses. PMID:28914269

  18. A Novel Binary Mixture of Helicoverpa armigera Single Nucleopolyhedrovirus Genotypic Variants Has Improved Insecticidal Characteristics for Control of Cotton Bollworms.

    PubMed

    Arrizubieta, Maite; Simón, Oihane; Williams, Trevor; Caballero, Primitivo

    2015-06-15

    The genotypic diversity of two Spanish isolates of Helicoverpa armigera single nucleopolyhedrovirus (HearSNPV) was evaluated with the aim of identifying mixtures of genotypes with improved insecticidal characteristics for control of the cotton bollworm. Two genotypic variants, HearSP1A and HearSP1B, were cloned in vitro from the most pathogenic wild-type isolate of the Iberian Peninsula, HearSNPV-SP1 (HearSP1-wt). Similarly, six genotypic variants (HearLB1 to -6) were obtained by endpoint dilution from larvae collected from cotton crops in southern Spain that died from virus disease during laboratory rearing. Variants differed significantly in their insecticidal properties, pathogenicity, speed of kill, and occlusion body (OB) production (OBs/larva). HearSP1B was ∼3-fold more pathogenic than HearSP1-wt and the other variants. HearLB1, HearLB2, HearLB5, and HearLB6 were the fastest-killing variants. Moreover, although highly virulent, HearLB1, HearLB4, and HearLB5 produced more OBs/larva than did the other variants. The co-occluded HearSP1B:LB6 mixture at a 1:1 proportion was 1.7- to 2.8-fold more pathogenic than any single variant and other mixtures tested and also killed larvae as fast as the most virulent genotypes. Serial passage resulted in modified proportions of the component variants of the HearSP1B:LB6 co-occluded mixture, suggesting that transmissibility could be further improved by this process. We conclude that the improved insecticidal phenotype of the HearSP1B:LB6 co-occluded mixture underlines the utility of the genotypic variant dissection and reassociation approach for the development of effective virus-based insecticides. Copyright © 2015, American Society for Microbiology. All Rights Reserved.

  19. Rare variant association analysis in case-parents studies by allowing for missing parental genotypes.

    PubMed

    Li, Yumei; Xiang, Yang; Xu, Chao; Shen, Hui; Deng, Hongwen

    2018-01-15

    The development of next-generation sequencing technologies has facilitated the identification of rare variants. Family-based design is commonly used to effectively control for population admixture and substructure, which is more prominent for rare variants. Case-parents studies, as typical strategies in family-based design, are widely used in rare variant-disease association analysis. Current methods in case-parents studies are based on complete case-parents data; however, parental genotypes may be missing in case-parents trios, and removing these data may lead to a loss in statistical power. The present study focuses on testing for rare variant-disease association in case-parents studies by allowing for missing parental genotypes. In this report, we extended the collapsing method for rare variant association analysis in case-parents studies to allow for missing parental genotypes, and investigated the performance of two methods by using the difference of genotypes between affected offspring and their corresponding "complements" in case-parent trios and the TDT framework. Using simulations, we showed that, compared with methods using only complete case-parents data, the proposed strategy allowing for missing parental genotypes, or even adding unrelated affected individuals, can greatly improve the statistical power and is not affected by population stratification. We conclude that adding case-parents data with missing parental genotypes to a complete case-parents data set can greatly improve the power of our strategy for rare variant-disease association.

  20. Robust inference from multiple test statistics via permutations: a better alternative to the single test statistic approach for randomized trials.

    PubMed

    Ganju, Jitendra; Yu, Xinxin; Ma, Guoguang Julie

    2013-01-01

    Formal inference in randomized clinical trials is based on controlling the type I error rate associated with a single pre-specified statistic. The deficiency of using just one method of analysis is that it depends on assumptions that may not be met. For robust inference, we propose pre-specifying multiple test statistics and relying on the minimum p-value for testing the null hypothesis of no treatment effect. The null hypothesis associated with the various test statistics is that the treatment groups are indistinguishable. The critical value for hypothesis testing comes from permutation distributions. Rejection of the null hypothesis when the smallest p-value is less than the critical value controls the type I error rate at its designated value. Even if one of the candidate test statistics has low power, the adverse effect on the power of the minimum p-value statistic is small. Its use is illustrated with examples. We conclude that it is better to rely on the minimum p-value rather than a single statistic, particularly when that single statistic is the logrank test, because of the cost and complexity of many survival trials. Copyright © 2013 John Wiley & Sons, Ltd.
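
    The minimum p-value procedure described in the abstract above can be sketched as follows. This is only an illustration under assumptions that are not in the source: the two candidate statistics (absolute difference in means and in medians), the function names, and the default of 999 permutations are all hypothetical choices, and the published method may differ in detail.

```python
import random
from statistics import mean, median


def abs_mean_diff(a, b):
    """Candidate statistic 1 (assumed for illustration)."""
    return abs(mean(a) - mean(b))


def abs_median_diff(a, b):
    """Candidate statistic 2 (assumed for illustration)."""
    return abs(median(a) - median(b))


def min_p_permutation_test(group_a, group_b,
                           stat_funcs=(abs_mean_diff, abs_median_diff),
                           n_perm=999, seed=0):
    """Permutation p-value for the minimum p-value over several statistics."""
    rng = random.Random(seed)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)

    # Row 0 holds the observed statistics; later rows hold permuted ones.
    rows = [[f(group_a, group_b) for f in stat_funcs]]
    for _ in range(n_perm):
        rng.shuffle(pooled)
        rows.append([f(pooled[:n_a], pooled[n_a:]) for f in stat_funcs])
    total = len(rows)

    def per_stat_p(row):
        # p-value of each statistic in `row` against the same reference set.
        return [sum(other[j] >= row[j] for other in rows) / total
                for j in range(len(stat_funcs))]

    # Minimum p-value for the observed data and for every permutation.
    min_ps = [min(per_stat_p(row)) for row in rows]
    observed_min_p = min_ps[0]

    # Calibrate the observed minimum against its permutation distribution.
    return sum(m <= observed_min_p for m in min_ps) / total
```

    Because the minimum p-value is itself referred to a permutation distribution built from the same relabellings, no separate multiplicity correction is applied to the individual statistics in this sketch.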

  1. Association Studies of 22 Candidate SNPs with Late-Onset Alzheimer's Disease

    PubMed Central

    Figgins, Jessica A.; Minster, Ryan L.; Demirci, F. Yesim; DeKosky, Steven T.; Kamboh, M. Ilyas

    2009-01-01

    Alzheimer's disease (AD) is a complex and multifactorial disease with the possible involvement of several genes. With the exception of the APOE gene as a susceptibility marker, no other genes have been shown consistently to be associated with late-onset AD (LOAD). A recent genome-wide association study of 17,343 gene-based putative functional single nucleotide polymorphisms (SNPs) found 19 significant variants, including 3 linked to APOE, showing association with LOAD (Hum Mol Genet 2007; 16:865–873). We have set out to replicate the 16 new significant associations in a large case-control cohort of American Whites. Additionally, we examined six variants present in positional and/or biological candidate genes for AD. We genotyped the 22 SNPs in up to 1,009 Caucasian Americans with LOAD and up to 1,010 age-matched healthy Caucasian Americans, using 5′ nuclease assays. We did not observe a statistically significant association between the SNPs and the risk of AD, either individually or stratified by APOE. Our data suggest that the association of the studied variants with LOAD risk, if it exists, is not statistically significant in our sample. PMID:18780302

  2. Gene variations in sex hormone pathways and the risk of testicular germ cell tumour: a case-parent triad study in a Norwegian-Swedish population.

    PubMed

    Kristiansen, W; Andreassen, K E; Karlsson, R; Aschim, E L; Bremnes, R M; Dahl, O; Fosså, S D; Klepp, O; Langberg, C W; Solberg, A; Tretli, S; Adami, H-O; Wiklund, F; Grotmol, T; Haugen, T B

    2012-05-01

    Testicular germ cell tumour (TGCT) is the most common cancer in young men, and an imbalance between the estrogen and androgen levels in utero is hypothesized to influence TGCT risk. Thus, polymorphisms in genes involved in the action of sex hormones may contribute to variability in an individual's susceptibility to TGCT. We conducted a Norwegian-Swedish case-parent study. A total of 105 single-nucleotide polymorphisms (SNPs) in 20 sex hormone pathway genes were genotyped using Sequenom MassArray iPLEX Gold, in 831 complete triads and 474 dyads. To increase the statistical power, the analysis was expanded to include 712 case singletons and 3922 Swedish controls, thus including triads, dyads and the case-control samples in a single test for association. Analysis for allelic associations was performed with the UNPHASED program, using a likelihood-based association test for nuclear families with missing data, and odds ratios (ORs) and 95% confidence intervals (CIs) were calculated. False discovery rate (FDR) was used to adjust for multiple testing. Five genetic variants across the ESR2 gene [encoding estrogen receptor beta (ERβ)] were statistically significantly associated with the risk of TGCT. In the case-parent analysis, the markers rs12434245 and rs10137185 were associated with a reduced risk of TGCT (OR = 0.66 and 0.72, respectively; both FDRs <5%), whereas rs2978381 and rs12435857 were associated with an increased risk of TGCT (OR = 1.21 and 1.19, respectively; both FDRs <5%). In the combined case-parent/case-control analysis, rs12435857 and rs10146204 were associated with an increased risk of TGCT (OR = 1.15 and 1.13, respectively; both FDRs <5%), whereas rs10137185 was associated with a reduced risk of TGCT (OR = 0.79, FDR <5%). In addition, we found that three genetic variants in CYP19A1 (encoding aromatase) were statistically significantly associated with the risk of TGCT in the case-parent analysis. 
    The T alleles of the rs2414099, rs8025374 and rs3751592 SNPs were associated with an increased risk of TGCT (OR = 1.30, 1.30 and 1.21, respectively; all FDRs <5%). We found no statistically significant differences in allelic effect estimates between parental inherited genetic variation in the sex hormone pathways and TGCT risk in the offspring, and no evidence of heterogeneity between seminomas and non-seminomas, or between the Norwegian and the Swedish population, in any of the SNPs examined. Our findings provide support for ERβ and aromatase being implicated in the aetiology of TGCT. Exploring the functional role of the TGCT risk-associated SNPs will further elucidate the biological mechanisms involved.
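
    The abstract above reports FDR-adjusted results but does not name the adjustment procedure. As a hedged illustration only, the sketch below assumes the widely used Benjamini-Hochberg step-up method; the function name and the example p-values are hypothetical and are not taken from the study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values for four SNPs (not from the source).
example = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
significant = [q < 0.05 for q in example]
```

    A test is then declared significant at a 5% FDR when its adjusted value falls below 0.05, which matches the "FDR <5%" style of reporting used in the abstract.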

  3. Best practices for evaluating single nucleotide variant calling methods for microbial genomics

    PubMed Central

    Olson, Nathan D.; Lund, Steven P.; Colman, Rebecca E.; Foster, Jeffrey T.; Sahl, Jason W.; Schupp, James M.; Keim, Paul; Morrow, Jayne B.; Salit, Marc L.; Zook, Justin M.

    2015-01-01

    Innovations in sequencing technologies have allowed biologists to make incredible advances in understanding biological systems. As experience grows, researchers increasingly recognize that analyzing the wealth of data provided by these new sequencing platforms requires careful attention to detail for robust results. Thus far, much of the scientific community's focus for use in bacterial genomics has been on evaluating genome assembly algorithms and rigorously validating assembly program performance. Missing, however, is a focus on critical evaluation of variant callers for these genomes. Variant calling is essential for comparative genomics as it yields insights into nucleotide-level organismal differences. Variant calling is a multistep process with a host of potential error sources that may lead to incorrect variant calls. Identifying and resolving these incorrect calls is critical for bacterial genomics to advance. The goal of this review is to provide guidance on validating algorithms and pipelines used in variant calling for bacterial genomics. First, we will provide an overview of the variant calling procedures and the potential sources of error associated with the methods. We will then identify appropriate datasets for use in evaluating algorithms and describe statistical methods for evaluating algorithm performance. As variant calling moves from basic research to the applied setting, standardized methods for performance evaluation and reporting are required; it is our hope that this review provides the groundwork for the development of these standards. PMID:26217378

  4. Fine-scale patterns of population stratification confound rare variant association tests.

    PubMed

    O'Connor, Timothy D; Kiezun, Adam; Bamshad, Michael; Rich, Stephen S; Smith, Joshua D; Turner, Emily; Leal, Suzanne M; Akey, Joshua M

    2013-01-01

    Advances in next-generation sequencing technology have enabled systematic exploration of the contribution of rare variation to Mendelian and complex diseases. Although it is well known that population stratification can generate spurious associations with common alleles, its impact on rare variant association methods remains poorly understood. Here, we performed exhaustive coalescent simulations with demographic parameters calibrated from exome sequence data to evaluate the performance of nine rare variant association methods in the presence of fine-scale population structure. We find that all methods have an inflated spurious association rate for parameter values that are consistent with levels of differentiation typical of European populations. For example, at a nominal significance level of 5%, some test statistics have a spurious association rate as high as 40%. Finally, we empirically assess the impact of population stratification in a large data set of 4,298 European American exomes. Our results have important implications for the design, analysis, and interpretation of rare variant genome-wide association studies.

  5. Unified Sequence-Based Association Tests Allowing for Multiple Functional Annotations and Meta-analysis of Noncoding Variation in Metabochip Data.

    PubMed

    He, Zihuai; Xu, Bin; Lee, Seunggeun; Ionita-Laza, Iuliana

    2017-09-07

    Substantial progress has been made in the functional annotation of genetic variation in the human genome. Integrative analysis that incorporates such functional annotations into sequencing studies can aid the discovery of disease-associated genetic variants, especially those with unknown function and located outside protein-coding regions. Direct incorporation of one functional annotation as weight in existing dispersion and burden tests can suffer substantial loss of power when the functional annotation is not predictive of the risk status of a variant. Here, we have developed unified tests that can utilize multiple functional annotations simultaneously for integrative association analysis with efficient computational techniques. We show that the proposed tests significantly improve power when variant risk status can be predicted by functional annotations. Importantly, when functional annotations are not predictive of risk status, the proposed tests incur only minimal loss of power in relation to existing dispersion and burden tests, and under certain circumstances they can even have improved power by learning a weight that better approximates the underlying disease model in a data-adaptive manner. The tests can be constructed with summary statistics of existing dispersion and burden tests for sequencing data, therefore allowing meta-analysis of multiple studies without sharing individual-level data. We applied the proposed tests to a meta-analysis of noncoding rare variants in Metabochip data on 12,281 individuals from eight studies for lipid traits. By incorporating the Eigen functional score, we detected significant associations between noncoding rare variants in SLC22A3 and low-density lipoprotein and total cholesterol, associations that are missed by standard dispersion and burden tests. Copyright © 2017 American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.

  6. AB087. Synergistic genetic effects of RET and NRG1 susceptibility variants in Hirschsprung disease

    PubMed Central

    Iskandar, Kristy; Makhmudi, Akhmad; Gunadi

    2017-01-01

    Background: Hirschsprung disease (HSCR) is a complex genetic disorder characterized by the absence of ganglion cells along variable lengths of the intestine in neonates; RET and NRG1 are reported as the most common susceptibility genes for HSCR development. Here, we investigated three common genetic markers, RET rs2506030 and NRG1 rs7835688 and rs16879552, to determine their potential interactions in susceptibility to HSCR in an Indonesian population. Methods: We ascertained 60 HSCR subjects and 118 non-HSCR controls. Three genetic markers of RET and NRG1 were examined using TaqMan assays. Case-control association tests between the three genetic markers and HSCR were performed using the χ² (chi-square) statistic and 2×2 contingency tables. We analyzed family-based association in duos and trios using the transmission disequilibrium test (TDT) in PLINK. Results: There was an association between the NRG1 rs7835688 variant (P=4.3×10⁻³) and HSCR, but not for RET rs2506030 (P=0.042) or NRG1 rs16879552 (P=0.097). TDT of 33 HSCR families demonstrated no genetic effect at RET rs2506030 (P=0.034), NRG1 rs7835688 (P=0.18), or rs16879552 (P=0.28). Two-locus analyses demonstrated that RET rs2506030 (GG), in combination with NRG1 rs7835688 (CC) or rs16879552 (CC), was associated with increased risk of HSCR (OR=6.22, P=0.028 and OR=3.34, P=6.0×10⁻⁴, respectively) compared with a single variant of either RET or NRG1. Conclusions: Our study shows that RET and NRG1 polymorphisms are common genetic risk factors for HSCR in Indonesia. These results also imply that synergistic effects of RET and NRG1 are necessary for normal ganglionosis.
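
    The two tests named in the Methods of the abstract above can be written out in a few lines. The sketch below uses the textbook forms (Pearson chi-square without continuity correction for a 2×2 table, and the TDT as a McNemar-type statistic on transmitted versus untransmitted alleles); whether the study or PLINK applied exactly these forms is an assumption, and all counts shown are hypothetical rather than data from the study.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (1 df, no continuity correction) for the table
    [[a, b], [c, d]], e.g. rows = cases/controls, columns = allele 1/allele 2."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


def tdt_statistic(transmitted, untransmitted):
    """TDT chi-square (1 df): counts of heterozygous parents who did or
    did not transmit the allele of interest to the affected child."""
    return (transmitted - untransmitted) ** 2 / (transmitted + untransmitted)


def p_value_1df(chi2):
    """Upper-tail probability of a chi-square variable with 1 df."""
    return math.erfc(math.sqrt(chi2 / 2.0))


# Hypothetical counts, for illustration only.
case_control_chi2 = chi_square_2x2(10, 20, 30, 40)
tdt_chi2 = tdt_statistic(30, 10)          # (30 - 10)^2 / 40 = 10.0
tdt_p = p_value_1df(tdt_chi2)
```

    The TDT uses only heterozygous parents, which is why it conditions on parental genotypes and is robust to population stratification in a way the case-control chi-square is not.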

  7. Copy Number Variation across European Populations

    PubMed Central

    Chen, Wanting; Hayward, Caroline; Wright, Alan F.; Hicks, Andrew A.; Vitart, Veronique; Knott, Sara; Wild, Sarah H.; Pramstaller, Peter P.; Wilson, James F.; Rudan, Igor; Porteous, David J.

    2011-01-01

    Genome analysis provides a powerful approach to test for evidence of genetic variation within and between geographical regions and local populations. Copy number variants which comprise insertions, deletions and duplications of genomic sequence provide one such convenient and informative source. Here, we investigate copy number variants from genome wide scans of single nucleotide polymorphisms in three European population isolates, the island of Vis in Croatia, the islands of Orkney in Scotland and the South Tyrol in Italy. We show that whereas the overall copy number variant frequencies are similar between populations, their distribution is highly specific to the population of origin, a finding which is supported by evidence for increased kinship correlation for specific copy number variants within populations. PMID:21829696

  8. Comparison of locus-specific databases for BRCA1 and BRCA2 variants reveals disparity in variant classification within and among databases.

    PubMed

    Vail, Paris J; Morris, Brian; van Kan, Aric; Burdett, Brianna C; Moyes, Kelsey; Theisen, Aaron; Kerr, Iain D; Wenstrup, Richard J; Eggington, Julie M

    2015-10-01

    Genetic variants of uncertain clinical significance (VUSs) are a common outcome of clinical genetic testing. Locus-specific variant databases (LSDBs) have been established for numerous disease-associated genes as a research tool for the interpretation of genetic sequence variants to facilitate variant interpretation via aggregated data. If LSDBs are to be used for clinical practice, consistent and transparent criteria regarding the deposition and interpretation of variants are vital, as variant classifications are often used to make important and irreversible clinical decisions. In this study, we performed a retrospective analysis of 2017 consecutive BRCA1 and BRCA2 genetic variants identified from 24,650 consecutive patient samples referred to our laboratory to establish an unbiased dataset representative of the types of variants seen in the US patient population, submitted by clinicians and researchers for BRCA1 and BRCA2 testing. We compared the clinical classifications of these variants among five publicly accessible BRCA1 and BRCA2 variant databases: BIC, ClinVar, HGMD (paid version), LOVD, and the UMD databases. Our results show substantial disparity of variant classifications among publicly accessible databases. Furthermore, it appears that discrepant classifications are not the result of a single outlier but widespread disagreement among databases. This study also shows that databases sometimes favor a clinical classification when current best practice guidelines (ACMG/AMP/CAP) would suggest an uncertain classification. Although LSDBs have been well established for research applications, our results suggest several challenges preclude their wider use in clinical practice.

  9. Pathogenic Germline Variants in 10,389 Adult Cancers.

    PubMed

    Huang, Kuan-Lin; Mashl, R Jay; Wu, Yige; Ritter, Deborah I; Wang, Jiayin; Oh, Clara; Paczkowska, Marta; Reynolds, Sheila; Wyczalkowski, Matthew A; Oak, Ninad; Scott, Adam D; Krassowski, Michal; Cherniack, Andrew D; Houlahan, Kathleen E; Jayasinghe, Reyka; Wang, Liang-Bo; Zhou, Daniel Cui; Liu, Di; Cao, Song; Kim, Young Won; Koire, Amanda; McMichael, Joshua F; Hucthagowder, Vishwanathan; Kim, Tae-Beom; Hahn, Abigail; Wang, Chen; McLellan, Michael D; Al-Mulla, Fahd; Johnson, Kimberly J; Lichtarge, Olivier; Boutros, Paul C; Raphael, Benjamin; Lazar, Alexander J; Zhang, Wei; Wendl, Michael C; Govindan, Ramaswamy; Jain, Sanjay; Wheeler, David; Kulkarni, Shashikant; Dipersio, John F; Reimand, Jüri; Meric-Bernstam, Funda; Chen, Ken; Shmulevich, Ilya; Plon, Sharon E; Chen, Feng; Ding, Li

    2018-04-05

    We conducted the largest investigation of predisposition variants in cancer to date, discovering 853 pathogenic or likely pathogenic variants in 8% of 10,389 cases from 33 cancer types. Twenty-one genes showed single or cross-cancer associations, including novel associations of SDHA in melanoma and PALB2 in stomach adenocarcinoma. The 659 predisposition variants and 18 additional large deletions in tumor suppressors, including ATM, BRCA1, and NF1, showed low gene expression and frequent (43%) loss of heterozygosity or biallelic two-hit events. We also discovered 33 such variants in oncogenes, including missense variants in MET, RET, and PTPN11 associated with high gene expression. We nominated 47 additional predisposition variants from prioritized VUSs supported by multiple lines of evidence involving case-control frequency, loss of heterozygosity, expression effect, and co-localization with mutations and modified residues. Our integrative approach links rare predisposition variants to functional consequences, informing future guidelines of variant classification and germline genetic testing in cancer. Copyright © 2018 The Authors. Published by Elsevier Inc. All rights reserved.

  10. Adaptive Set-Based Methods for Association Testing

    PubMed Central

    Su, Yu-Chen; Gauderman, W. James; Kiros, Berhane; Lewinger, Juan Pablo

    2017-01-01

    With a typical sample size of a few thousand subjects, a single genomewide association study (GWAS) using traditional one-SNP-at-a-time methods can only detect genetic variants conferring a sizable effect on disease risk. Set-based methods, which analyze sets of SNPs jointly, can detect variants with smaller effects acting within a gene, a pathway, or other biologically relevant sets. While self-contained set-based methods (those that test sets of variants without regard to variants not in the set) are generally more powerful than competitive set-based approaches (those that rely on comparison of variants in the set of interest with variants not in the set), there is no consensus as to which self-contained methods are best. In particular, several self-contained set tests have been proposed to directly or indirectly ‘adapt’ to the a priori unknown proportion and distribution of effects of the truly associated SNPs in the set, which is a major determinant of their power. A popular adaptive set-based test is the adaptive rank truncated product (ARTP), which seeks the set of SNPs that yields the best-combined evidence of association. We compared the standard ARTP, several ARTP variations we introduced, and other adaptive methods in a comprehensive simulation study to evaluate their performance. We used permutations to assess significance for all the methods and thus provide a level playing field for comparison. We found the standard ARTP test to have the highest power across our simulations followed closely by the global model of random effects (GMRE) and a LASSO based test. PMID:26707371

  11. A practical guide to environmental association analysis in landscape genomics.

    PubMed

    Rellstab, Christian; Gugerli, Felix; Eckert, Andrew J; Hancock, Angela M; Holderegger, Rolf

    2015-09-01

    Landscape genomics is an emerging research field that aims to identify the environmental factors that shape adaptive genetic variation and the gene variants that drive local adaptation. Its development has been facilitated by next-generation sequencing, which allows for screening thousands to millions of single nucleotide polymorphisms in many individuals and populations at reasonable costs. In parallel, data sets describing environmental factors have greatly improved and increasingly become publicly accessible. Accordingly, numerous analytical methods for environmental association studies have been developed. Environmental association analysis identifies genetic variants associated with particular environmental factors and has the potential to uncover adaptive patterns that are not discovered by traditional tests for the detection of outlier loci based on population genetic differentiation. We review methods for conducting environmental association analysis including categorical tests, logistic regressions, matrix correlations, general linear models and mixed effects models. We discuss the advantages and disadvantages of different approaches, provide a list of dedicated software packages and their specific properties, and stress the importance of incorporating neutral genetic structure in the analysis. We also touch on additional important aspects such as sampling design, environmental data preparation, pooled and reduced-representation sequencing, candidate-gene approaches, linearity of allele-environment associations and the combination of environmental association analyses with traditional outlier detection tests. We conclude by summarizing expected future directions in the field, such as the extension of statistical approaches, environmental association analysis for ecological gene annotation, and the need for replication and post hoc validation studies. © 2015 John Wiley & Sons Ltd.

  12. Novel features and enhancements in BioBin, a tool for the biologically inspired binning and association analysis of rare variants

    PubMed Central

    Byrska-Bishop, Marta; Wallace, John; Frase, Alexander T; Ritchie, Marylyn D

    2018-01-01

    Motivation: BioBin is an automated bioinformatics tool for the multi-level biological binning of sequence variants. Herein, we present a significant update to BioBin which expands the software to facilitate a comprehensive rare variant analysis and incorporates novel features and analysis enhancements. Results: In BioBin 2.3, we extend our software tool by implementing statistical association testing, updating the binning algorithm, as well as incorporating novel analysis features providing for a robust, highly customizable, and unified rare variant analysis tool. Availability and implementation: The BioBin software package is open source and freely available to users at http://www.ritchielab.com/software/biobin-download. Contact: mdritchie@geisinger.edu. Supplementary information: Supplementary data are available at Bioinformatics online. PMID:28968757

  13. HAPRAP: a haplotype-based iterative method for statistical fine mapping using GWAS summary statistics.

    PubMed

    Zheng, Jie; Rodriguez, Santiago; Laurin, Charles; Baird, Denis; Trela-Larsen, Lea; Erzurumluoglu, Mesut A; Zheng, Yi; White, Jon; Giambartolomei, Claudia; Zabaneh, Delilah; Morris, Richard; Kumari, Meena; Casas, Juan P; Hingorani, Aroon D; Evans, David M; Gaunt, Tom R; Day, Ian N M

    2017-01-01

    Fine mapping is a widely used approach for identifying the causal variant(s) at disease-associated loci. Standard methods (e.g. multiple regression) require individual level genotypes. Recent fine mapping methods using summary-level data require the pairwise correlation coefficients ([Formula: see text]) of the variants. However, haplotypes rather than pairwise [Formula: see text], are the true biological representation of linkage disequilibrium (LD) among multiple loci. In this article, we present an empirical iterative method, HAPlotype Regional Association analysis Program (HAPRAP), that enables fine mapping using summary statistics and haplotype information from an individual-level reference panel. Simulations with individual-level genotypes show that the results of HAPRAP and multiple regression are highly consistent. In simulation with summary-level data, we demonstrate that HAPRAP is less sensitive to poor LD estimates. In a parametric simulation using Genetic Investigation of ANthropometric Traits height data, HAPRAP performs well with a small training sample size (N < 2000) while other methods become suboptimal. Moreover, HAPRAP's performance is not affected substantially by single nucleotide polymorphisms (SNPs) with low minor allele frequencies. We applied the method to existing quantitative trait and binary outcome meta-analyses (human height, QTc interval and gallbladder disease); all previous reported association signals were replicated and two additional variants were independently associated with human height. Due to the growing availability of summary level data, the value of HAPRAP is likely to increase markedly for future analyses (e.g. functional prediction and identification of instruments for Mendelian randomization). 
    The HAPRAP package and documentation are available at http://apps.biocompute.org.uk/haprap/. Contact: jie.zheng@bristol.ac.uk or tom.gaunt@bristol.ac.uk. Supplementary information: Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.

  14. Genetic variants associated with myocardial infarction risk factors in over 8000 individuals from five ethnic groups: The INTERHEART Genetics Study.

    PubMed

    Anand, Sonia S; Xie, Changchun; Paré, Guillaume; Montpetit, Alexandre; Rangarajan, Sumathy; McQueen, Matthew J; Cordell, Heather J; Keavney, Bernard; Yusuf, Salim; Hudson, Thomas J; Engert, James C

    2009-02-01

    Myocardial infarction (MI) is a leading cause of death globally, but specific genetic variants that influence MI and MI risk factors have not been assessed on a global basis. We included 8795 individuals of European, South Asian, Arab, Iranian, and Nepalese origin from the INTERHEART case-control study that genotyped 1536 single-nucleotide polymorphisms (SNPs) from 103 genes. One hundred and two SNPs were nominally associated with MI, but the statistical significance did not remain after adjustment for multiple testing. A subset of 940 SNPs from 69 genes were tested against MI risk factors. One hundred and sixty-three SNPs were nominally associated with a MI risk factor and 13 remained significant after adjusting for multiple testing. Of these 13, 11 were associated with apolipoprotein (Apo) B/A1 levels: 8 SNPs from 3 genes were associated with Apo B, and 3 cholesteryl ester transfer protein SNPs were associated with Apo A1. Seven of 8 of the SNPs associated with Apo B levels were nominally associated with MI (P<0.05), whereas none of the 3 cholesteryl ester transfer protein SNPs were associated with MI (P≥0.17). Of the 3 SNPs most significantly associated with MI, rs7412, which defines the Apo E2 isoform, was associated with both a lower Apo B/A1 ratio (P=1.0×10⁻⁷) and lower MI risk (P=0.0004). Two low-density lipoprotein receptor variants, 1 intronic (rs6511720) and 1 in the 3' untranslated region (rs1433099) were both associated with a lower Apo B/A1 ratio (P<1.0×10⁻⁵) and a lower risk of MI (P=0.004 and P=0.003, respectively). Thirteen common SNPs were associated with MI risk factors. Importantly, SNPs associated with Apo B levels were associated with MI, whereas SNPs associated with Apo A1 levels were not. The Apo E isoform, and 2 common low-density lipoprotein receptor variants (rs1433099 and rs6511720) influence MI risk in this multiethnic sample.

  15. Prioritizing GWAS Results: A Review of Statistical Methods and Recommendations for Their Application

    PubMed Central

    Cantor, Rita M.; Lange, Kenneth; Sinsheimer, Janet S.

    2010-01-01

    Genome-wide association studies (GWAS) have rapidly become a standard method for disease gene discovery. A substantial number of recent GWAS indicate that for most disorders, only a few common variants are implicated and the associated SNPs explain only a small fraction of the genetic risk. This review is written from the viewpoint that findings from the GWAS provide preliminary genetic information that is available for additional analysis by statistical procedures that accumulate evidence, and that these secondary analyses are very likely to provide valuable information that will help prioritize the strongest constellations of results. We review and discuss three analytic methods to combine preliminary GWAS statistics to identify genes, alleles, and pathways for deeper investigations. Meta-analysis seeks to pool information from multiple GWAS to increase the chances of finding true positives among the false positives and provides a way to combine associations across GWAS, even when the original data are unavailable. Testing for epistasis within a single GWAS study can identify the stronger results that are revealed when genes interact. Pathway analysis of GWAS results is used to prioritize genes and pathways within a biological context. Following a GWAS, association results can be assigned to pathways and tested in aggregate with computational tools and pathway databases. Reviews of published methods with recommendations for their application are provided within the framework for each approach. PMID:20074509
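
    As a minimal sketch of how meta-analysis can combine associations across GWAS when only summary statistics are available, the Python function below pools per-study effect estimates with inverse-variance weights under a fixed-effect model. This is one common scheme, not a method prescribed by the review; the function name and the example effect sizes and standard errors are hypothetical.

```python
import math


def fixed_effect_meta(betas, ses):
    """Inverse-variance fixed-effect pooling of per-study effect estimates.

    betas: per-study effect estimates for one SNP (e.g. log odds ratios)
    ses:   matching standard errors
    Returns (pooled_beta, pooled_se, z, two_sided_p).
    """
    if len(betas) != len(ses) or not betas:
        raise ValueError("betas and ses must be non-empty and of equal length")
    weights = [1.0 / se ** 2 for se in ses]        # weight = 1 / variance
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled_beta / pooled_se
    # Two-sided p-value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled_beta, pooled_se, z, p


# Hypothetical summary statistics for one SNP from three studies.
beta, se, z, p = fixed_effect_meta([0.12, 0.08, 0.15], [0.05, 0.04, 0.07])
print(f"pooled beta={beta:.4f}, se={se:.4f}, z={z:.2f}, p={p:.3g}")
```

    Only the effect estimate and standard error from each study are needed, which is why this style of pooling works when the original genotype data are unavailable.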

  16. Genome of the Netherlands population-specific imputations identify an ABCA6 variant associated with cholesterol levels

    PubMed Central

    van Leeuwen, Elisabeth M.; Karssen, Lennart C.; Deelen, Joris; Isaacs, Aaron; Medina-Gomez, Carolina; Mbarek, Hamdi; Kanterakis, Alexandros; Trompet, Stella; Postmus, Iris; Verweij, Niek; van Enckevort, David J.; Huffman, Jennifer E.; White, Charles C.; Feitosa, Mary F.; Bartz, Traci M.; Manichaikul, Ani; Joshi, Peter K.; Peloso, Gina M.; Deelen, Patrick; van Dijk, Freerk; Willemsen, Gonneke; de Geus, Eco J.; Milaneschi, Yuri; Penninx, Brenda W.J.H.; Francioli, Laurent C.; Menelaou, Androniki; Pulit, Sara L.; Rivadeneira, Fernando; Hofman, Albert; Oostra, Ben A.; Franco, Oscar H.; Leach, Irene Mateo; Beekman, Marian; de Craen, Anton J.M.; Uh, Hae-Won; Trochet, Holly; Hocking, Lynne J.; Porteous, David J.; Sattar, Naveed; Packard, Chris J.; Buckley, Brendan M.; Brody, Jennifer A.; Bis, Joshua C.; Rotter, Jerome I.; Mychaleckyj, Josyf C.; Campbell, Harry; Duan, Qing; Lange, Leslie A.; Wilson, James F.; Hayward, Caroline; Polasek, Ozren; Vitart, Veronique; Rudan, Igor; Wright, Alan F.; Rich, Stephen S.; Psaty, Bruce M.; Borecki, Ingrid B.; Kearney, Patricia M.; Stott, David J.; Adrienne Cupples, L.; Neerincx, Pieter B.T.; Elbers, Clara C.; Francesco Palamara, Pier; Pe'er, Itsik; Abdellaoui, Abdel; Kloosterman, Wigard P.; van Oven, Mannis; Vermaat, Martijn; Li, Mingkun; Laros, Jeroen F.J.; Stoneking, Mark; de Knijff, Peter; Kayser, Manfred; Veldink, Jan H.; van den Berg, Leonard H.; Byelas, Heorhiy; den Dunnen, Johan T.; Dijkstra, Martijn; Amin, Najaf; Joeri van der Velde, K.; van Setten, Jessica; Kattenberg, Mathijs; van Schaik, Barbera D.C.; Bot, Jan; Nijman, Isaäc J.; Mei, Hailiang; Koval, Vyacheslav; Ye, Kai; Lameijer, Eric-Wubbo; Moed, Matthijs H.; Hehir-Kwa, Jayne Y.; Handsaker, Robert E.; Sunyaev, Shamil R.; Sohail, Mashaal; Hormozdiari, Fereydoun; Marschall, Tobias; Schönhuth, Alexander; Guryev, Victor; Suchiman, H. Eka D.; Wolffenbuttel, Bruce H.; Platteel, Mathieu; Pitts, Steven J.; Potluri, Shobha; Cox, David R.; Li, Qibin; Li, Yingrui; Du, Yuanping; Chen, Ruoyan; Cao, Hongzhi; Li, Ning; Cao, Sujie; Wang, Jun; Bovenberg, Jasper A.; Jukema, J. Wouter; van der Harst, Pim; Sijbrands, Eric J.; Hottenga, Jouke-Jan; Uitterlinden, Andre G.; Swertz, Morris A.; van Ommen, Gert-Jan B.; de Bakker, Paul I.W.; Eline Slagboom, P.; Boomsma, Dorret I.; Wijmenga, Cisca; van Duijn, Cornelia M.

    2015-01-01

    Variants associated with blood lipid levels may be population-specific. To identify low-frequency variants associated with this phenotype, population-specific reference panels may be used. Here we impute nine large Dutch biobanks (~35,000 samples) with the population-specific reference panel created by the Genome of the Netherlands Project and perform association testing with blood lipid levels. We report the discovery of five novel associations at four loci (P value <6.61 × 10^-4), including a rare missense variant in ABCA6 (rs77542162, p.Cys1359Arg, frequency 0.034), which is predicted to be deleterious. The frequency of this ABCA6 variant is 3.65-fold increased in the Dutch and its effect (β for LDL-C = 0.135, β for TC = 0.140) is estimated to be very similar to those observed for single variants in well-known lipid genes, such as LDLR. PMID:25751400

  17. Replication of prostate cancer risk loci in a Japanese case-control association study.

    PubMed

    Yamada, Hiroki; Penney, Kathryn L; Takahashi, Hiroyuki; Katoh, Takahiko; Yamano, Yuko; Yamakado, Minoru; Kimura, Takahiro; Kuruma, Hidetoshi; Kamata, Yuko; Egawa, Shin; Freedman, Matthew L

    2009-10-07

    Two prostate cancer genome-wide scans in populations of European ancestry identified several genetic variants that are strongly associated with prostate cancer risk. The effect of these risk variants and their cumulative effect in other populations are unknown. We evaluated the association of 23 risk single-nucleotide polymorphisms (SNPs) with prostate cancer risk and clinical covariates (Gleason score, tumor aggressiveness, and age at diagnosis) in men of Japanese ancestry (311 case subjects and 1035 control subjects) using unconditional logistic regression. We also used logistic regression to test the association between increasing numbers of independently associated risk alleles and the risk of prostate cancer, prostate cancer aggressiveness, and age at diagnosis. All statistical tests were two-sided. Seven of the 23 SNPs (five independent loci) were associated with prostate cancer risk (P values ranged from .0084 to 2.3 x 10(-8) and effect sizes [estimated as odds ratios, ORs] ranged from 1.35 to 1.82). None of the seven SNPs was associated with Gleason score or aggressive disease. rs6983561 and rs4430796 were associated with age at diagnosis (Ps = .0188 and .0339, respectively). Men with six or more risk alleles (27% of case patients and 11% of control subjects) had a higher risk of prostate cancer than men with two or fewer risk alleles (7% of case patients and 20% of control subjects) (OR = 6.22, P = 1.5 x 10(-12)). These results highlight the critical importance of considering ancestry in understanding how risk alleles influence disease and suggest that risk estimates and variants differ across populations. It is important to perform studies in multiple ancestral populations so that the composite genetic architecture of prostate cancer can be rigorously addressed.

  18. Genetic variant rs17225178 in the ARNT2 gene is associated with Asperger Syndrome.

    PubMed

    Di Napoli, Agnese; Warrier, Varun; Baron-Cohen, Simon; Chakrabarti, Bhismadev

    2015-01-01

    Autism Spectrum Conditions (ASC) are neurodevelopmental conditions characterized by difficulties in communication and social interaction, alongside unusually repetitive behaviours and narrow interests. Asperger Syndrome (AS) is one subgroup of ASC and differs from classic autism in that in AS there is no language or general cognitive delay. Genetic, epigenetic and environmental factors are implicated in ASC and genes involved in neural connectivity and neurodevelopment are good candidates for studying the susceptibility to ASC. The aryl-hydrocarbon receptor nuclear translocator 2 (ARNT2) gene encodes a transcription factor involved in neurodevelopmental processes, neuronal connectivity and cellular responses to hypoxia. A mutation in this gene has been identified in individuals with ASC and single nucleotide polymorphisms (SNPs) have been nominally associated with AS and autistic traits in previous studies. In this study, we tested 34 SNPs in ARNT2 for association with AS in 118 cases and 412 controls of Caucasian origin. P values were adjusted for multiple comparisons, and linkage disequilibrium (LD) among the SNPs analysed was calculated in our sample. Finally, SNP annotation allowed functional and structural analyses of the genetic variants in ARNT2. We tested the replicability of our result using the genome-wide association studies (GWAS) database of the Psychiatric Genomics Consortium (PGC). We report statistically significant association of rs17225178 with AS. This SNP modifies transcription factor binding sites and regions that regulate the chromatin state in neural cell lines. It is also included in a LD block in our sample, alongside other genetic variants that alter chromatin regulatory regions in neural cells. These findings demonstrate that rs17225178 in the ARNT2 gene is associated with AS and support previous studies that pointed out an involvement of this gene in the predisposition to ASC.

  19. Analysis of whole exome sequencing with cardiometabolic traits using family-based linkage and association in the IRAS Family Study

    PubMed Central

    Tabb, Keri L.; Hellwege, Jacklyn N.; Palmer, Nicholette D.; Dimitrov, Latchezar; Sajuthi, Satria; Taylor, Kent D.; NG, Maggie C.Y.; Hawkins, Gregory A.; Chen, Yii-Der Ida; Brown, W. Mark; McWilliams, David; Williams, Adrienne; Lorenzo, Carlos; Norris, Jill M.; Long, Jirong; Rotter, Jerome I.; Curran, Joanne E.; Blangero, John; Wagenknecht, Lynne E.; Langefeld, Carl D.; Bowden, Donald W.

    2017-01-01

    Summary: Family-based methods are a potentially powerful tool to identify trait-defining genetic variants in extended families, particularly when used to complement conventional association analysis. We utilized two-point linkage analysis and single variant association analysis to evaluate whole exome sequencing (WES) data from 1,205 Hispanic Americans (78 families) from the Insulin Resistance Atherosclerosis Family Study. WES identified 211,612 variants above the minor allele frequency threshold of ≥0.005. These variants were tested for linkage and/or association with 50 cardiometabolic traits after quality control checks. Two-point linkage analysis yielded 10,580,600 LOD scores with 1,148 LOD scores ≥3, 183 LOD scores ≥4, and 29 LOD scores ≥5. The maximal novel LOD score was 5.50 for rs2289043:T>C, in UNC5C with subcutaneous adipose tissue volume. Association analysis identified 13 variants attaining genome-wide significance (p<5×10^-8), with the strongest association between rs651821:C>T in APOA5, and triglyceride levels (p=3.67×10^-10). Overall, there was a 5.2-fold increase in the number of informative variants detected by WES compared to exome chip analysis in this population, nearly 30% of which were novel variants relative to dbSNP build 138. Thus, integration of results from two-point linkage and single-variant association analysis from WES data enabled identification of novel signals potentially contributing to cardiometabolic traits. PMID:28067407

  20. THE EFFECT OF SEED TREATMENT ON THE MAIN PATHOGENS PRESENT IN WHEAT AGROECOSYSTEMS.

    PubMed

    Stef, R; Grozea, I; Puia, C; Carabet, A; Vlad, M; Manea, D

    2014-01-01

    Wheat (Triticum aestivum L.), a member of the Poaceae family, is affected by many diseases that cause yield losses. This paper addresses a topic of economic, agrotechnical, and social importance: wheat ranks first among the crops cultivated in Romania and feeds 35 to 40% of the world population. The main objective of the study was to test the products Yunta 246 FS (imidacloprid 233 g/l + tebuconazole 13 g/l), Team Micorriza Plus (Glomus intraradices 150 spores/g + Glomus mosseae 150 spores/g + organic matter 56% and rhizosphere bacteria 10^7 CFU/g) and Condor (Trichoderma spp. 1 × 10^9 spores/g + Glomus sp. 10 spores/g + rhizosphere bacteria 1 × 10^7 CFU/g and organic matter 7%) applied in the wheat/pathogen pathosystem. The research was conducted in the western part of Romania in 2010-2012. The experiment was laid out according to the Latin rectangle method with 10 variants (differing in product and dose applied), and the data were analysed statistically. The pathogens Septoria tritici, Drechslera tritici-repentis and Drechslera teres were present in the experimental variants. Statistical analysis showed that the most effective chemical mixture was imidacloprid + tebuconazole at the highest dose tested (3 l/t). Among the non-chemical products tested, Condor gave positive results. The highest values of the quality parameters (protein and gluten) were obtained in the variants treated with Yunta 246 FS.

  1. Detection of ATM germline variants by the p53 mitotic centrosomal localization test in BRCA1/2-negative patients with early-onset breast cancer.

    PubMed

    Prodosmo, Andrea; Buffone, Amelia; Mattioni, Manlio; Barnabei, Agnese; Persichetti, Agnese; De Leo, Aurora; Appetecchia, Marialuisa; Nicolussi, Arianna; Coppa, Anna; Sciacchitano, Salvatore; Giordano, Carolina; Pinnarò, Paola; Sanguineti, Giuseppe; Strigari, Lidia; Alessandrini, Gabriele; Facciolo, Francesco; Cosimelli, Maurizio; Grazi, Gian Luca; Corrado, Giacomo; Vizza, Enrico; Giannini, Giuseppe; Soddu, Silvia

    2016-09-06

    Variant ATM heterozygotes have an increased risk of developing cancer, cardiovascular diseases, and diabetes. The cost and time of sequencing, together with the complexity of ATM variants, mean that large-scale screening of the general population is not yet cost-effective. Recently, we developed a straightforward, rapid, and inexpensive test based on p53 mitotic centrosomal localization (p53-MCL) in peripheral blood mononuclear cells (PBMCs) that diagnoses mutant ATM zygosity and recognizes tumor-associated ATM polymorphisms. Fresh PBMCs from 496 cancer patients were analyzed by p53-MCL: 90 cases with familial BRCA1/2-positive and -negative breast and/or ovarian cancer, 337 with sporadic cancers (ovarian, lung, colon, and post-menopausal breast cancers), and 69 with breast/thyroid cancer. Variants were confirmed by ATM sequencing. A total of seven individuals with ATM variants were identified, 5/65 (7.7%) in breast cancer cases of familial breast and/or ovarian cancer and 2/69 (2.9%) in breast/thyroid cancer. No variant ATM carriers were found among the other cancer cases. Excluding a single case in which both BRCA1 and ATM were mutated, no p53-MCL alterations were observed in BRCA1/2-positive cases. These data validate p53-MCL as a reliable and specific test for germline ATM variants, confirm ATM as a breast cancer susceptibility gene, and highlight a possible association with breast/thyroid cancers.

  2. Sequence variations of the human MPDZ gene and association with alcoholism in subjects with European ancestry.

    PubMed

    Karpyak, Victor M; Kim, Jeong-Hyun; Biernacka, Joanna M; Wieben, Eric D; Mrazek, David A; Black, John L; Choi, Doo-Sup

    2009-04-01

    Mpdz gene variations are known contributors of acute alcohol withdrawal severity and seizures in mice. To investigate the relevance of these findings for human alcoholism, we resequenced 46 exons, exon-intron boundaries, and 2 kilobases in the 5' region of the human MPDZ gene in 61 subjects with a history of alcohol withdrawal seizures (AWS), 59 subjects with a history of alcohol withdrawal without AWS, and 64 Coriell samples from self-reported nonalcoholic subjects [all European American (EA) ancestry] and compared with the Mpdz sequences of 3 mouse strains with different propensity to AWS. To explore potential associations of the human MPDZ gene with alcoholism and AWS, single SNP and haplotype analyses were performed using 13 common variants. Sixty-seven new, mostly rare variants were discovered in the human MPDZ gene. Sequence comparison revealed that the human gene does not have variations identical to those comprising Mpdz gene haplotype associated with AWS in mice. We also found no significant association between MPDZ haplotypes and AWS in humans. However, a global test of haplotype association revealed a significant difference in haplotype frequencies between alcohol-dependent subjects without AWS and Coriell controls (p = 0.015), suggesting a potential role of MPDZ in alcoholism and/or related phenotypes other than AWS. Haplotype-specific tests for the most common haplotypes (frequency > 0.05), revealed a specific high-risk haplotype (p = 0.006, maximum statistic p = 0.051), containing rs13297480G allele also found to be significantly more prevalent in alcoholics without AWS compared with nonalcoholic Coriell subjects (p = 0.019). Sequencing of MPDZ gene in individuals with EA ancestry revealed no variations in the sites identical to those associated with AWS in mice. Exploratory haplotype and single SNP association analyses suggest a possible association between the MPDZ gene and alcohol dependence but not AWS. Further functional genomic analysis of MPDZ variants and investigation of their association with a broader array of alcoholism-related phenotypes could reveal additional genetic markers of alcoholism.

  3. Resequencing of the CETP gene in American whites and African blacks: Association of rare and common variants with HDL-cholesterol levels

    PubMed Central

    Pirim, Dilek; Wang, Xingbin; Niemsiri, Vipavee; Radwan, Zaheda H.; Bunker, Clareann H.; Hokanson, John E.; Hamman, Richard F.; Barmada, M. Michael; Demirci, F. Yesim; Kamboh, M. Ilyas

    2015-01-01

    Background: Cholesteryl ester transfer protein (CETP) plays a crucial role in lipid metabolism. Associations of common CETP variants with variation in plasma lipid levels, and/or CETP mass/activity have been extensively studied and well-documented; however, the effects of uncommon/rare CETP variants on plasma lipid profile remain undefined. Hence, resequencing of the gene in extreme phenotypes and follow-up rare-variant association analyses are essential to fill this gap. Objective: To identify common and uncommon/rare variants in the CETP gene by resequencing the entire gene and test the effects of both common and uncommon/rare CETP variants on plasma lipid traits in two genetically distinct populations. Methods and Results: The entire CETP gene plus flanking regions were resequenced in 190 individuals comprising 95 non-Hispanic Whites (NHWs) and 95 African blacks with extreme HDL-C levels. A total of 279 sequence variants were identified, of which 25 were novel. Selected variants were genotyped in the entire samples of 623 NHWs and 788 African blacks and 184 QC-passed variants were tested in relation to plasma lipid traits by using gene-based, single-site, haplotype and rare variant association analyses (SKAT-O). Two novel and independent associations of rs1968905 and rs289740 with HDL-C were identified in African blacks. Using SKAT-O analysis, we also identified rare variants with minor allele frequency <0.01 to be associated with HDL-C in both NHWs (P=0.024) and African blacks (P=0.009). Conclusions: Our results point out that in addition to the common CETP variants, rare genetic variants in the CETP gene also contribute to the phenotypic variation of HDL-C in the general population. PMID:26683795

  4. Paraoxonase promoter and intronic variants modify risk of sporadic amyotrophic lateral sclerosis

    PubMed Central

    Cronin, Simon; Greenway, Matthew J; Prehn, Jochen H M; Hardiman, Orla

    2007-01-01

    Background: The paraoxonases, PON1–3, play a major protective role both against environmental toxins and as part of the antioxidant defence system. Recently, non‐synonymous coding single nucleotide polymorphisms (SNPs), known to lower serum PON activity, have been associated with sporadic ALS (SALS) in a Polish population. A separate trio based study described a detrimental allele at the PON3 intronic variant INS2+3651 (rs10487132). Association between PON gene cluster variants and SALS requires external validation in an independent dataset. Aims: To examine the association of the promoter SNPs PON1 −162G>A and PON1 −108T>C; the non‐synonymous functional SNPs PON1 Q192R and L55M and PON2 C311S and A148G; and the intronic marker PON3 INS2+3651A>G, with SALS in a genetically homogenous population. Methods: 221 Irish patients with SALS and 202 unrelated control subjects were genotyped using KASPar chemistries. Statistical analyses and haplotype estimations were conducted using Haploview and Unphased software. Multiple permutation testing, as implemented in Unphased, was applied to haplotype p values to correct for multiple hypotheses. Results: Two of the seven SNPs were associated with SALS in the Irish population: PON1 55M (OR 1.52, p = 0.006) and PON3 INS2+3651 G (OR 1.36, p = 0.03). Two locus haplotype analysis showed association only when both of these risk alleles were present (OR 1.7, p = 0.005), suggesting a potential effect modification. Low functioning promoter variants were observed to influence this effect when compared with wild‐type. Conclusions: These data provide additional evidence that genetic variation across the paraoxonase loci may be common susceptibility factors for SALS. PMID:17702780

  5. Haplotypes in CCR5-CCR2, CCL3 and CCL5 are associated with natural resistance to HIV-1 infection in a Colombian cohort.

    PubMed

    Vega, Jorge A; Villegas-Ospina, Simón; Aguilar-Jiménez, Wbeimar; Rugeles, María T; Bedoya, Gabriel; Zapata, Wildeman

    2017-06-01

    Variants in genes encoding for HIV-1 co-receptors and their natural ligands have been individually associated with natural resistance to HIV-1 infection. However, the simultaneous presence of these variants has been poorly studied. To evaluate the association of single and multilocus haplotypes in genes coding for the viral co-receptors CCR5 and CCR2, and their ligands CCL3 and CCL5, with resistance or susceptibility to HIV-1 infection. Nine variants in CCR5-CCR2, two SNPs in CCL3 and two in CCL5 were genotyped by PCR-RFLP in 35 seropositive (cases) and 49 HIV-1-exposed seronegative Colombian individuals (controls). Haplotypes were inferred using the Arlequin software, and their frequency in individual or combined loci was compared between cases and controls by the chi-square test. A p' value <0.05 after Bonferroni correction was considered significant. Homozygosis of the human haplogroup (HH) E was absent in controls and frequent in cases, showing a tendency to susceptibility. The haplotypes C-C and T-T in CCL3 were associated with susceptibility (p'=0.016) and resistance (p'<0.0001) to HIV-1 infection, respectively. Finally, in multilocus analysis, the haplotype combinations formed by HHC in CCR5-CCR2, T-T in CCL3 and G-C in CCL5 were associated with resistance (p'=0.006). Our results suggest that specific combinations of variants in genes from the same signaling pathway can define an HIV-1 resistant phenotype. Despite our small sample size, our statistically significant associations suggest strong effects; however, these results should be further validated in larger cohorts.
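
    The comparison described above, a chi-square test of haplotype frequency between cases and controls followed by Bonferroni correction, can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' analysis: the function names, the 2×2 counts and the number of tests are hypothetical, and the statistic is the uncorrected Pearson chi-square with one degree of freedom.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (1 df, no continuity correction) for the table
    [[a, b], [c, d]], e.g. haplotype present/absent in cases vs controls.
    Returns (statistic, p_value)."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("a row or column total is zero")
    stat = n * (a * d - b * c) ** 2 / denom
    # Upper-tail probability of a chi-square variable with 1 df.
    p = math.erfc(math.sqrt(stat / 2.0))
    return stat, p


def bonferroni(p, n_tests):
    """Bonferroni-adjusted p-value (p'), capped at 1."""
    return min(1.0, p * n_tests)


# Hypothetical haplotype counts: carriers and non-carriers among cases,
# then carriers and non-carriers among controls.
stat, p = chi_square_2x2(20, 15, 8, 41)
p_adj = bonferroni(p, n_tests=3)   # hypothetical number of haplotypes tested
print(f"chi2={stat:.2f}, p={p:.3g}, p'={p_adj:.3g}, significant={p_adj < 0.05}")
```

    With counts as small as those in this cohort, an exact test may be preferable to the chi-square approximation; the sketch only shows how the adjusted value p' relates to the raw p value.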

  6. A Genome-Wide Association Meta-Analysis of Attention-Deficit/Hyperactivity Disorder Symptoms in Population-Based Paediatric Cohorts

    PubMed Central

    Groen-Blokhuis, Maria M.; Pourcain, Beate St.; Greven, Corina U.; Pappa, Irene; Tiesler, Carla M.T.; Ang, Wei; Nolte, Ilja M.; Vilor-Tejedor, Natalia; Bacelis, Jonas; Ebejer, Jane L.; Zhao, Huiying; Davies, Gareth E.; Ehli, Erik A.; Evans, David M.; Fedko, Iryna O.; Guxens, Mònica; Hottenga, Jouke-Jan; Hudziak, James J.; Jugessur, Astanand; Kemp, John P.; Krapohl, Eva; Martin, Nicholas G.; Murcia, Mario; Myhre, Ronny; Ormel, Johan; Ring, Susan M.; Standl, Marie; Stergiakouli, Evie; Stoltenberg, Camilla; Thiering, Elisabeth; Timpson, Nicholas J.; Trzaskowski, Maciej; van der Most, Peter J.; Wang, Carol; Nyholt, Dale R.; Medland, Sarah E.; Neale, Benjamin; Jacobsson, Bo; Sunyer, Jordi; Hartman, Catharina A.; Whitehouse, Andrew J.O.; Pennell, Craig E.; Heinrich, Joachim; Plomin, Robert; Smith, George Davey; Tiemeier, Henning; Posthuma, Danielle; Boomsma, Dorret I.

    2016-01-01

    Objective: To elucidate the influence of common genetic variants on childhood attention-deficit/hyperactivity disorder (ADHD) symptoms, to identify genetic variants that explain its high heritability, and to investigate the genetic overlap of ADHD symptom scores with ADHD diagnosis. Method: Within the EArly Genetics and Lifecourse Epidemiology (EAGLE) consortium, genome-wide single nucleotide polymorphisms (SNPs) and ADHD symptom scores were available for 17,666 children (< 13 years) from nine population-based cohorts. SNP-based heritability was estimated in data from the three largest cohorts. Meta-analysis based on genome-wide association (GWA) analyses with SNPs was followed by gene-based association tests, and the overlap in results with a meta-analysis in the Psychiatric Genomics Consortium (PGC) case-control ADHD study was investigated. Results: SNP-based heritability ranged from 5% to 34%, indicating that variation in common genetic variants influences ADHD symptom scores. The meta-analysis did not detect genome-wide significant SNPs, but three genes, lying close to each other with SNPs in high linkage disequilibrium (LD), showed a gene-wide significant association (p values between 1.46×10^-6 and 2.66×10^-6). One gene, WASL, is involved in neuronal development. Both SNP- and gene-based analyses indicated overlap with the PGC meta-analysis results with the genetic correlation estimated at 0.96. Conclusion: The SNP-based heritability for ADHD symptom scores indicates a polygenic architecture and genes involved in neurite outgrowth are possibly involved. Continuous and dichotomous measures of ADHD appear to assess a genetically common phenotype. A next step is to combine data from population-based and case-control cohorts in genetic association studies to increase sample size and improve statistical power for identifying genetic variants. PMID:27663945

  7. Adapt-Mix: learning local genetic correlation structure improves summary statistics-based analyses

    PubMed Central

    Park, Danny S.; Brown, Brielin; Eng, Celeste; Huntsman, Scott; Hu, Donglei; Torgerson, Dara G.; Burchard, Esteban G.; Zaitlen, Noah

    2015-01-01

    Motivation: Approaches to identifying new risk loci, training risk prediction models, imputing untyped variants and fine-mapping causal variants from summary statistics of genome-wide association studies are playing an increasingly important role in the human genetics community. Current summary statistics-based methods rely on global ‘best guess’ reference panels to model the genetic correlation structure of the dataset being studied. This approach, especially in admixed populations, has the potential to produce misleading results, ignores variation in local structure and is not feasible when appropriate reference panels are missing or small. Here, we develop a method, Adapt-Mix, that combines information across all available reference panels to produce estimates of local genetic correlation structure for summary statistics-based methods in arbitrary populations. Results: We applied Adapt-Mix to estimate the genetic correlation structure of both admixed and non-admixed individuals using simulated and real data. We evaluated our method by measuring the performance of two summary statistics-based methods: imputation and joint-testing. When using our method as opposed to the current standard of ‘best guess’ reference panels, we observed a 28% decrease in mean-squared error for imputation and a 73.7% decrease in mean-squared error for joint-testing. Availability and implementation: Our method is publicly available in a software package called ADAPT-Mix available at https://github.com/dpark27/adapt_mix. Contact: noah.zaitlen@ucsf.edu PMID:26072481

  8. An investigation of causes of false positive single nucleotide polymorphisms using simulated reads from a small eukaryote genome.

    PubMed

    Ribeiro, Antonio; Golicz, Agnieszka; Hackett, Christine Anne; Milne, Iain; Stephen, Gordon; Marshall, David; Flavell, Andrew J; Bayer, Micha

    2015-11-11

    Single Nucleotide Polymorphisms (SNPs) are widely used molecular markers, and their use has increased massively since the inception of Next Generation Sequencing (NGS) technologies, which allow detection of large numbers of SNPs at low cost. However, both NGS data and their analysis are error-prone, which can lead to the generation of false positive (FP) SNPs. We explored the relationship between FP SNPs and seven factors involved in mapping-based variant calling - quality of the reference sequence, read length, choice of mapper and variant caller, mapping stringency and filtering of SNPs by read mapping quality and read depth. This resulted in 576 possible factor level combinations. We used error- and variant-free simulated reads to ensure that every SNP found was indeed a false positive. The variation in the number of FP SNPs generated ranged from 0 to 36,621 for the 120 million base pairs (Mbp) genome. All of the experimental factors tested had statistically significant effects on the number of FP SNPs generated and there was a considerable amount of interaction between the different factors. Using a fragmented reference sequence led to a dramatic increase in the number of FP SNPs generated, as did relaxed read mapping and a lack of SNP filtering. The choice of reference assembler, mapper and variant caller also significantly affected the outcome. The effect of read length was more complex and suggests a possible interaction between mapping specificity and the potential for contributing more false positives as read length increases. The choice of tools and parameters involved in variant calling can have a dramatic effect on the number of FP SNPs produced, with particularly poor combinations of software and/or parameter settings yielding tens of thousands in this experiment. Between-factor interactions make simple recommendations difficult for a SNP discovery pipeline but the quality of the reference sequence is clearly of paramount importance. Our findings are also a stark reminder that it can be unwise to use the relaxed mismatch settings provided as defaults by some read mappers when reads are being mapped to a relatively unfinished reference sequence from e.g. a non-model organism in its early stages of genomic exploration.

  9. Association of interleukin-1 gene variations with moderate to severe chronic periodontitis in multiple ethnicities

    PubMed Central

    Wu, X; Offenbacher, S; López, N J; Chen, D; Wang, H-Y; Rogus, J; Zhou, J; Beck, J; Jiang, S; Bao, X; Wilkins, L; Doucette-Stamm, L; Kornman, K

    2015-01-01

    Background and Objective: Genetic markers associated with disease are often non-functional and generally tag one or more functional “causative” variants in linkage disequilibrium. Markers may not show tight linkage to the causative variants across multiple ethnicities due to evolutionary divergence, and therefore may not be informative across different population groups. Validated markers of disease suggest causative variants exist in the gene and, if the causative variants can be identified, it is reasonable to hypothesize that such variants will be informative across diverse populations. The aim of this study was to test that hypothesis using functional Interleukin-1 (IL-1) gene variations across multiple ethnic populations to replace the non-functional markers originally associated with chronic adult periodontitis in Caucasians. Material and Methods: Adult chronic periodontitis cases and controls from four ethnic groups (Caucasians, African Americans, Hispanics and Asians) were recruited in the USA, Chile and China. Genotypes of IL1B gene single nucleotide polymorphisms (SNPs), including three functional SNPs (rs16944, rs1143623, rs4848306) in the promoter and one intronic SNP (rs1143633), were determined using a single base extension method or TaqMan 5′ nuclease assay. Logistic regression and other statistical analyses were used to examine the association between moderate to severe periodontitis and IL1B gene variations, including SNPs, haplotypes and composite genotypes. Genotype patterns associated with disease in the discovery study were then evaluated in independent validation studies. Results: Significant associations were identified in the discovery study, consisting of Caucasians and African Americans, between moderate to severe adult chronic periodontitis and functional variations in the IL1B gene, including a pattern of four IL1B SNPs (OR = 1.87, p < 0.0001). The association between the disease and this IL1B composite genotype pattern was validated in two additional studies consisting of Hispanics (OR = 1.95, p = 0.04) or Asians (OR = 3.27, p = 0.01). A meta-analysis of the three populations supported the association between the IL-1 genotype pattern and moderate to severe periodontitis (OR 1.95; p < 0.001). Our analysis also demonstrated that IL1B gene variations had added value to conventional risk factors in predicting chronic periodontitis. Conclusion: This study validated the influence of IL-1 genetic factors on the severity of chronic periodontitis in four different ethnicities. PMID:24690098

  10. A weighted U-statistic for genetic association analyses of sequencing data.

    PubMed

    Wei, Changshuai; Li, Ming; He, Zihuai; Vsevolozhskaya, Olga; Schaid, Daniel J; Lu, Qing

    2014-12-01

    With advancements in next-generation sequencing technology, a massive amount of sequencing data is generated, which offers a great opportunity to comprehensively investigate the role of rare variants in the genetic etiology of complex diseases. Nevertheless, the high-dimensional sequencing data poses a great challenge for statistical analysis. The association analyses based on traditional statistical methods suffer substantial power loss because of the low frequency of genetic variants and the extremely high dimensionality of the data. We developed a Weighted U Sequencing test, referred to as WU-SEQ, for the high-dimensional association analysis of sequencing data. Based on a nonparametric U-statistic, WU-SEQ makes no assumption of the underlying disease model and phenotype distribution, and can be applied to a variety of phenotypes. Through simulation studies and an empirical study, we showed that WU-SEQ outperformed a commonly used sequence kernel association test (SKAT) method when the underlying assumptions were violated (e.g., the phenotype followed a heavy-tailed distribution). Even when the assumptions were satisfied, WU-SEQ still attained comparable performance to SKAT. Finally, we applied WU-SEQ to sequencing data from the Dallas Heart Study (DHS), and detected an association between ANGPTL4 and very low density lipoprotein cholesterol. © 2014 WILEY PERIODICALS, INC.

  11. Exome Array Analysis Identifies a Common Variant in IL27 Associated with Chronic Obstructive Pulmonary Disease

    PubMed Central

    Parker, Margaret M.; Chen, Han; Lao, Taotao; Hardin, Megan; Qiao, Dandi; Hawrylkiewicz, Iwona; Sliwinski, Pawel; Yim, Jae-Joon; Kim, Woo Jin; Kim, Deog Kyeom; Castaldi, Peter J.; Hersh, Craig P.; Morrow, Jarrett; Celli, Bartolome R.; Pinto-Plata, Victor M.; Criner, Gerald J.; Marchetti, Nathaniel; Bueno, Raphael; Agustí, Alvar; Make, Barry J.; Crapo, James D.; Calverley, Peter M.; Donner, Claudio F.; Lomas, David A.; Wouters, Emiel F. M.; Vestbo, Jorgen; Paré, Peter D.; Levy, Robert D.; Rennard, Stephen I.; Zhou, Xiaobo; Laird, Nan M.; Lin, Xihong; Beaty, Terri H.; Silverman, Edwin K.

    2016-01-01

    Rationale: Chronic obstructive pulmonary disease (COPD) susceptibility is in part related to genetic variants. Most genetic studies have been focused on genome-wide common variants without a specific focus on coding variants, but common and rare coding variants may also affect COPD susceptibility. Objectives: To identify coding variants associated with COPD. Methods: We tested nonsynonymous, splice, and stop variants derived from the Illumina HumanExome array for association with COPD in five study populations enriched for COPD. We evaluated single variants with a minor allele frequency greater than 0.5% using logistic regression. Results were combined using a fixed effects meta-analysis. We replicated novel single-variant associations in three additional COPD cohorts. Measurements and Main Results: We included 6,004 control subjects and 6,161 COPD cases across five cohorts for analysis. Our top result was rs16969968 (P = 1.7 × 10−14) in CHRNA5, a locus previously associated with COPD susceptibility and nicotine dependence. Additional top results were found in AGER, MMP3, and SERPINA1. A nonsynonymous variant, rs181206, in IL27 (P = 4.7 × 10−6) was just below the level of exome-wide significance but attained exome-wide significance (P = 5.7 × 10−8) when combined with results from other cohorts. Gene expression datasets revealed an association of rs181206 and the surrounding locus with expression of multiple genes; several were differentially expressed in COPD lung tissue, including TUFM. Conclusions: In an exome array analysis of COPD, we identified nonsynonymous variants at previously described loci and a novel exome-wide significant variant in IL27. This variant is at a locus previously described in genome-wide associations with diabetes, inflammatory bowel disease, and obesity and appears to affect genes potentially related to COPD pathogenesis. PMID:26771213
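
    The fixed effects meta-analysis mentioned in the abstract above can be illustrated with a minimal sketch of inverse-variance weighting. It assumes each cohort reports a per-variant effect estimate (for example a log odds ratio) and its standard error. The function name `fixed_effects_meta` and the numbers in the usage lines are hypothetical and are not taken from the study, whose actual software and settings are not described here.

```python
import math


def fixed_effects_meta(betas, ses):
    """Combine per-cohort effect estimates by inverse-variance weighting.

    betas: per-cohort effect estimates (e.g. log odds ratios)
    ses:   matching standard errors (all > 0)
    Returns (pooled_beta, pooled_se, z, two_sided_p).
    """
    if not betas or len(betas) != len(ses):
        raise ValueError("betas and ses must be non-empty and of equal length")
    if any(se <= 0 for se in ses):
        raise ValueError("standard errors must be positive")

    # Each cohort is weighted by the inverse of its sampling variance.
    weights = [1.0 / (se * se) for se in ses]
    total_weight = sum(weights)

    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)

    # Wald z statistic and two-sided p value under a normal approximation.
    z = pooled_beta / pooled_se
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled_beta, pooled_se, z, p


# Hypothetical inputs: three cohorts reporting the same variant.
beta, se, z, p = fixed_effects_meta([0.25, 0.18, 0.30], [0.10, 0.08, 0.15])
print(f"pooled beta={beta:.3f}, se={se:.3f}, z={z:.2f}, p={p:.2e}")
```

    A fixed effects model treats every cohort as estimating one common effect, so between-cohort heterogeneity is not modeled in this sketch.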

  12. A sigmoidal model for biosorption of heavy metal cations from aqueous media.

    PubMed

    Özen, Rümeysa; Sayar, Nihat Alpagu; Durmaz-Sam, Selcen; Sayar, Ahmet Alp

    2015-07-01

    A novel multi-input single output (MISO) black-box sigmoid model is developed to simulate the biosorption of heavy metal cations by the fission yeast from aqueous medium. Validation and verification of the model are done through statistical chi-squared hypothesis tests and the model is evaluated by uncertainty and sensitivity analyses. The simulated results are in agreement with the data of the studied system in which Schizosaccharomyces pombe biosorbs Ni(II) cations at various process conditions. Experimental data is obtained originally for this work using dead cells of an adapted variant of S. pombe and represented by Freundlich isotherms. A process optimization scheme is proposed using the present model to build a novel application of a cost-merit objective function which would be useful to predict optimal operation conditions. Copyright © 2015. Published by Elsevier Inc.

  13. Beyond main effects of gene-sets: harsh parenting moderates the association between a dopamine gene-set and child externalizing behavior.

    PubMed

    Windhorst, Dafna A; Mileva-Seitz, Viara R; Rippe, Ralph C A; Tiemeier, Henning; Jaddoe, Vincent W V; Verhulst, Frank C; van IJzendoorn, Marinus H; Bakermans-Kranenburg, Marian J

    2016-08-01

    In a longitudinal cohort study, we investigated the interplay of harsh parenting and genetic variation across a set of functionally related dopamine genes, in association with children's externalizing behavior. This is one of the first studies to employ gene-based and gene-set approaches in tests of Gene by Environment (G × E) effects on complex behavior. This approach can offer an important alternative or complement to candidate gene and genome-wide environmental interaction (GWEI) studies in the search for genetic variation underlying individual differences in behavior. Genetic variants in 12 autosomal dopaminergic genes were available in an ethnically homogenous part of a population-based cohort. Harsh parenting was assessed with maternal (n = 1881) and paternal (n = 1710) reports at age 3. Externalizing behavior was assessed with the Child Behavior Checklist (CBCL) at age 5 (71 ± 3.7 months). We conducted gene-set analyses of the association between variation in dopaminergic genes and externalizing behavior, stratified for harsh parenting. The association was statistically significant or approached significance for children without harsh parenting experiences, but was absent in the group with harsh parenting. Similarly, significant associations between single genes and externalizing behavior were only found in the group without harsh parenting. Effect sizes in the groups with and without harsh parenting did not differ significantly. Gene-environment interaction tests were conducted for individual genetic variants, resulting in two significant interaction effects (rs1497023 and rs4922132) after correction for multiple testing. Our findings are suggestive of G × E interplay, with associations between dopamine genes and externalizing behavior present in children without harsh parenting, but not in children with harsh parenting experiences. Harsh parenting may overrule the role of genetic factors in externalizing behavior. 
Gene-based and gene-set analyses offer promising new alternatives to analyses focusing on single candidate polymorphisms when examining the interplay between genetic and environmental factors.

  14. Association of filaggrin variants with asthma and rhinitis: is eczema or allergic sensitization status an effect modifier?

    PubMed Central

    Ziyab, Ali H.; Karmaus, Wilfried; Zhang, Hongmei; Holloway, John W.; Steck, Susan E.; Ewart, Susan; Arshad, Syed Hasan

    2014-01-01

    Background Associations of filaggrin (FLG) variants with asthma and rhinitis have been shown to be modulated by eczema status. However, it is unknown whether allergic sensitization status modifies this association. The aim of this study was to determine whether FLG variants need eczema and/or allergic sensitization as a necessary component to execute its adverse effect on coexisting and subsequent asthma and rhinitis. Methods Repeated measurements of asthma, rhinitis, eczema, and allergic sensitization (documented by skin prick tests) at ages 1, 2, 4, 10, and 18 years were ascertained in the Isle of Wight birth cohort (n = 1,456). FLG haploinsufficiency was defined as having at least the minor allele of R501X, 2282del4, or S3247X variants. Log binomial regression models were used to test associations and statistical interactions. Results FLG variants increased the risk of asthma (RR = 1.39, 95% CI: 1.06 – 1.80) and rhinitis (RR = 1.37, 95% CI: 1.16 – 1.63). In delayed effect models, ‘FLG variants plus allergic sensitization’ and ‘FLG variants plus eczema’ increased the risk of subsequent asthma by 4.93-fold (95% CI: 3.61 – 6.71) and 3.33-fold (95% CI: 2.45 – 4.51), respectively, during the first 18 years of life. In contrast, neither eczema nor allergic sensitization in combination with FLG variants increased the risk of later rhinitis. Conclusions Allergic sensitization and eczema modulated the association between FLG variants and asthma, but not rhinitis. Results of our study imply that the mechanisms and pathways through which FLG variants predispose to increased risk of asthma and rhinitis may be different. PMID:25277085

  15. Surfactant Protein-C Promoter Variants Associated with Neonatal Respiratory Distress Syndrome Reduce Transcription

    PubMed Central

    Wambach, Jennifer A.; Yang, Ping; Wegner, Daniel J.; An, Ping; Hackett, Brian P.; Cole, F. S.; Hamvas, Aaron

    2010-01-01

    Dominant mutations in coding regions of the surfactant protein-C gene (SFTPC) cause respiratory distress syndrome (RDS) in infants. However, the contribution of variants in noncoding regions of SFTPC to pulmonary phenotypes is unknown. Using a case-control group of infants ≥34 weeks gestation (n=538), we used complete resequencing of SFTPC and its promoter, genotyping, and logistic regression to identify 80 single nucleotide polymorphisms (SNPs). Three promoter SNPs were statistically associated with neonatal RDS among European descent infants. To assess the transcriptional effects of these three promoter SNPs, we selectively mutated the SFTPC promoter and performed transient transfection using MLE-15 cells and a firefly luciferase reporter vector. Each promoter SNP decreased SFTPC transcription. The combination of two variants in high linkage disequilibrium also decreased SFTPC transcription. In silico evaluation of transcription factor binding demonstrated that the rare allele at g.-1167 disrupts a SOX (SRY-related high mobility group box) consensus motif and introduces a GATA-1 site, at g.-2385 removes an MZF-1 (myeloid zinc finger) binding site, and at g.-1647 removes a potential methylation site. This combined statistical, in vitro, and in silico approach suggests that reduced SFTPC transcription contributes to the genetic risk for neonatal RDS in developmentally susceptible infants. PMID:20539253

  16. DNA repair gene XRCC1 polymorphisms, smoking, and bladder cancer risk.

    PubMed

    Stern, M C; Umbach, D M; van Gils, C H; Lunn, R M; Taylor, J A

    2001-02-01

    Bladder cancer is the sixth most common cancer in the United States. The main identified risk factor is cigarette smoking, which is estimated to contribute to up to 50% of new cases in men and 20% in women. Besides containing other carcinogens, cigarette smoke is a rich source of reactive oxygen species (ROS) that can induce a variety of DNA damage, some of which is repaired by the base excision repair (BER) pathway. The XRCC1 gene protein plays an important role in BER by serving as a scaffold for other repair enzymes and by recognizing single-strand DNA breaks. Three polymorphisms that induce amino acid changes have been found in codon 194 (exon 6), codon 280 (exon 9), and codon 399 (exon 10) of this gene. We tested whether polymorphisms in XRCC1 were associated with bladder cancer risk and whether this association was modified by cigarette smoking. Therefore, we genotyped for the three polymorphisms in 235 bladder cancer cases and 213 controls who had been frequency matched to cases on age, sex, and ethnicity. We found no evidence of an association between the codon 280 variant and bladder cancer risk [odds ratio (OR), 1.2; 95% confidence interval (CI), 0.6-2.6]. We found some evidence of a protective effect for subjects that carried at least one copy of the codon 194 variant allele relative to those homozygous for the common allele (OR, 0.59; 95% CI, 0.3-1.0). The combined analysis with smoking history suggested a possible gene-exposure interaction; however, the results were not statistically significant. Similarly, for the codon 399 polymorphism, our data suggested a protective effect of the homozygous variant genotype relative to carriers of either one or two copies of the common allele (OR, 0.70; 95% CI, 0.4-1.3), and provided limited evidence, albeit not statistically significant, for a gene-smoking interaction.
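
    The odds ratios and 95% confidence intervals quoted in the abstract above can be illustrated with a minimal sketch for a single 2x2 table, using the standard log-scale (Woolf) interval. The abstract does not give genotype counts, so the counts below are hypothetical, and the function name `odds_ratio_ci` is an assumption. The study's own estimates may come from adjusted models, which this unadjusted sketch does not reproduce.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio with a Woolf (log-scale) confidence interval.

    a: cases carrying the variant      b: cases not carrying it
    c: controls carrying the variant   d: controls not carrying it
    z: normal quantile (1.96 gives an approximate 95% interval)
    Returns (odds_ratio, lower, upper).
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")

    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) from the four cell counts.
    se_log_or = math.sqrt(1.0 / a + 1.0 / b + 1.0 / c + 1.0 / d)
    log_or = math.log(odds_ratio)
    lower = math.exp(log_or - z * se_log_or)
    upper = math.exp(log_or + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, chosen only to illustrate the calculation.
or_, lo, hi = odds_ratio_ci(a=20, b=80, c=10, d=90)
print(f"OR={or_:.2f}, 95% CI {lo:.2f}-{hi:.2f}")
```

    An interval that includes 1, as several of the intervals reported above do, is consistent with no association at the chosen confidence level.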

  17. Genetic ancestry modifies the association between genetic risk variants and breast cancer risk among Hispanic and non-Hispanic white women

    PubMed Central

    Fejerman, Laura

    2013-01-01

    Hispanic women in the USA have lower breast cancer incidence than non-Hispanic white (NHW) women. Genetic factors may contribute to this difference. Breast cancer genome-wide association studies (GWAS) conducted in women of European or Asian descent have identified multiple risk variants. We tested the association between 10 previously reported single nucleotide polymorphisms (SNPs) and risk of breast cancer in a sample of 4697 Hispanic and 3077 NHW women recruited as part of three population-based case–control studies of breast cancer. We used stratified logistic regression analyses to compare the associations with different genetic variants in NHWs and Hispanics classified by their proportion of Indigenous American (IA) ancestry. Five of 10 SNPs were statistically significantly associated with breast cancer risk. Three of the five significant variants (rs17157903-RELN, rs7696175-TLR1 and rs13387042-2q35) were associated with risk among Hispanics but not in NHWs. The odds ratio (OR) for the heterozygous at 2q35 was 0.75 [95% confidence interval (CI) = 0.50–1.15] for low IA ancestry and 1.38 (95% CI = 1.04–1.82) for high IA ancestry (P interaction 0.02). The ORs for association at RELN were 0.87 (95% CI = 0.59–1.29) and 1.69 (95% CI = 1.04–2.73), respectively (P interaction 0.03). At the TLR1 locus, the ORs for women homozygous for the rare allele were 0.74 (95% CI = 0.42–1.31) and 1.73 (95% CI = 1.19–2.52) (P interaction 0.03). Our results suggest that the proportion of IA ancestry modifies the magnitude and direction of the association of 3 of the 10 previously reported variants. Genetic ancestry should be considered when assessing risk in women of mixed descent and in studies designed to discover causal mutations. PMID:23563089

  18. Single-Trial Normalization for Event-Related Spectral Decomposition Reduces Sensitivity to Noisy Trials

    PubMed Central

    Grandchamp, Romain; Delorme, Arnaud

    2011-01-01

    In electroencephalography, the classical event-related potential model often proves to be a limited method to study complex brain dynamics. For this reason, spectral techniques adapted from signal processing such as event-related spectral perturbation (ERSP) – and its variant event-related synchronization and event-related desynchronization – have been used over the past 20 years. They represent average spectral changes in response to a stimulus. These spectral methods do not have strong consensus for comparing pre- and post-stimulus activity. When computing ERSP, pre-stimulus baseline removal is usually performed after averaging the spectral estimate of multiple trials. Correcting the baseline of each single-trial prior to averaging spectral estimates is an alternative baseline correction method. However, we show that this method leads to positively skewed post-stimulus ERSP values. We eventually present new single-trial-based ERSP baseline correction methods that perform trial normalization or centering prior to applying classical baseline correction methods. We show that single-trial correction methods minimize the contribution of artifactual data trials with high-amplitude spectral estimates and are robust to outliers when performing statistical inference testing. We then characterize these methods in terms of their time–frequency responses and behavior compared to classical ERSP methods. PMID:21994498
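
    The contrast described in the abstract above, between classical baseline correction and normalizing each trial before averaging, can be sketched for a single frequency bin. This assumes a divisive baseline and a normalization that divides each trial by its own full-epoch mean. The function names and the toy power values are hypothetical, and the paper's exact procedures may differ.

```python
from statistics import mean


def classical_ersp(trials, n_base):
    """Average power across trials, then divide by the mean pre-stimulus power.

    trials: list of equal-length lists of spectral power over time
    n_base: number of leading samples that form the pre-stimulus baseline
    """
    avg = [mean(samples) for samples in zip(*trials)]
    baseline = mean(avg[:n_base])
    return [value / baseline for value in avg]


def full_epoch_normalized_ersp(trials, n_base):
    """Divide each trial by its own full-epoch mean, then apply the classical method."""
    normalized = [[value / mean(trial) for value in trial] for trial in trials]
    return classical_ersp(normalized, n_base)


# Two clean trials whose power doubles after the stimulus, plus one
# high-amplitude artifact trial that shows no response at all.
trials = [
    [1.0, 1.0, 2.0, 2.0],
    [1.0, 1.0, 2.0, 2.0],
    [50.0, 50.0, 50.0, 50.0],
]
print(classical_ersp(trials, n_base=2))
print(full_epoch_normalized_ersp(trials, n_base=2))
```

    In this toy case the artifact trial dominates the classical average, while per-trial normalization puts all trials on a common scale before averaging, so the post-stimulus increase seen in the clean trials is better preserved.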

  19. Novel oxytocin receptor variants in laboring women requiring high doses of oxytocin.

    PubMed

    Reinl, Erin L; Goodwin, Zane A; Raghuraman, Nandini; Lee, Grace Y; Jo, Erin Y; Gezahegn, Beakal M; Pillai, Meghan K; Cahill, Alison G; de Guzman Strong, Cristina; England, Sarah K

    2017-08-01

    Although oxytocin commonly is used to augment or induce labor, it is difficult to predict its effectiveness because oxytocin dose requirements vary significantly among women. One possibility is that women requiring high or low doses of oxytocin have variations in the oxytocin receptor gene. The objective was to identify oxytocin receptor gene variants in laboring women with low and high oxytocin dosage requirements. Term, nulliparous women requiring oxytocin doses of ≤4 mU/min (low-dose-requiring, n = 83) or ≥20 mU/min (high-dose-requiring, n = 104) for labor augmentation or induction provided consent to a postpartum blood draw as a source of genomic DNA. Targeted-amplicon sequencing (coverage >30×) with MiSeq (Illumina) was performed to discover variants in the coding exons of the oxytocin receptor gene. Baseline relevant clinical history, outcomes, demographics, and oxytocin receptor gene sequence variants and their allele frequencies were compared between low-dose-requiring and high-dose-requiring women. The Scale-Invariant Feature Transform algorithm was used to predict the effect of variants on oxytocin receptor function. The Fisher exact or χ² tests were used for categorical variables, and Student t tests or Wilcoxon rank sum tests were used for continuous variables. A P value < .05 was considered statistically significant. The high-dose-requiring women had greater rates of obesity and diabetes and were more likely to have undergone labor induction and required prostaglandins. High-dose-requiring women were more likely to undergo cesarean delivery for first-stage arrest and less likely to undergo cesarean delivery for nonreassuring fetal status. Targeted sequencing of the oxytocin receptor gene in the total cohort (n = 187) revealed 30 distinct coding variants: 17 nonsynonymous, 11 synonymous, and 2 small structural variants. One novel variant (A243T) was found in both the low- and high-dose-requiring groups. 
Three novel variants (Y106H, A240_A249del, and P197delfs*206) resulting in an amino acid substitution, loss of 9 amino acids, and a frameshift stop mutation, respectively, were identified only in low-dose-requiring women. Nine nonsynonymous variants were unique to the high-dose-requiring group. These included 3 known variants (R151C, G221S, and W228C) and 6 novel variants (M133V, R150L, H173R, A248V, G253R, and I266V). Of these, R150L, R151C, and H173R were predicted by Scale-Invariant Feature Transform algorithm to damage oxytocin receptor function. There was no statistically significant association between the numbers of synonymous and nonsynonymous substitutions in the patient groups. Obesity, diabetes, and labor induction were associated with the requirement for high doses of oxytocin. We did not identify significant differences in the prevalence of oxytocin receptor variants between low-dose-requiring and high-dose-requiring women, but novel oxytocin receptor variants were enriched in the high-dose-requiring women. We also found 3 oxytocin receptor variants (2 novel, 1 known) that were predicted to damage oxytocin receptor function and would likely increase an individual's risk for requiring a high oxytocin dose. Further investigation of oxytocin receptor variants and their effects on protein function will inform precision medicine in pregnant women. Copyright © 2017 The Authors. Published by Elsevier Inc. All rights reserved.

  20. Tourette's syndrome is not associated with interleukin-10 receptor 1 variants on chromosome 11q23.3.

    PubMed

    Kindler, Jochen; Schosser, Alexandra; Stamenkovic, Mara; Schloegelhofer, Monika; Leisch, Friedrich; Hornik, Kurt; Aschauer, Harald; Gasche, Christoph

    2008-01-15

    Interleukin-10 receptor 1 (IL-10R1) single nucleotide polymorphisms, located on chromosome 11q23 - a strong candidate for linkage with Tourette's syndrome (TS) - have been investigated for association with TS. DNA of 77 patients with a DSM-IV (Diagnostic and Statistical Manual IV) diagnosis of TS and 250 healthy controls was genotyped. IL-10R1 was not associated with TS.

  1. Fine-Mapping of Common Genetic Variants Associated with Colorectal Tumor Risk Identified Potential Functional Variants

    PubMed Central

    Gala, Manish; Abecasis, Goncalo; Bezieau, Stephane; Brenner, Hermann; Butterbach, Katja; Caan, Bette J.; Carlson, Christopher S.; Casey, Graham; Chang-Claude, Jenny; Conti, David V.; Curtis, Keith R.; Duggan, David; Gallinger, Steven; Haile, Robert W.; Harrison, Tabitha A.; Hayes, Richard B.; Hoffmeister, Michael; Hopper, John L.; Hudson, Thomas J.; Jenkins, Mark A.; Küry, Sébastien; Le Marchand, Loic; Leal, Suzanne M.; Newcomb, Polly A.; Nickerson, Deborah A.; Potter, John D.; Schoen, Robert E.; Schumacher, Fredrick R.; Seminara, Daniela; Slattery, Martha L.; Hsu, Li; Chan, Andrew T.; White, Emily; Berndt, Sonja I.; Peters, Ulrike

    2016-01-01

    Genome-wide association studies (GWAS) have identified many common single nucleotide polymorphisms (SNPs) associated with colorectal cancer risk. These SNPs may tag correlated variants with biological importance. Fine-mapping around GWAS loci can facilitate detection of functional candidates and additional independent risk variants. We analyzed 11,900 cases and 14,311 controls in the Genetics and Epidemiology of Colorectal Cancer Consortium and the Colon Cancer Family Registry. To fine-map genomic regions containing all known common risk variants, we imputed high-density genetic data from the 1000 Genomes Project. We tested single-variant associations with colorectal tumor risk for all variants spanning genomic regions 250-kb upstream or downstream of 31 GWAS-identified SNPs (index SNPs). We queried the University of California, Santa Cruz Genome Browser to examine evidence for biological function. Index SNPs did not show the strongest association signals with colorectal tumor risk in their respective genomic regions. Bioinformatics analysis of SNPs showing smaller P-values in each region revealed 21 functional candidates in 12 loci (5q31.1, 8q24, 11q13.4, 11q23, 12p13.32, 12q24.21, 14q22.2, 15q13, 18q21, 19q13.1, 20p12.3, and 20q13.33). We did not observe evidence of additional independent association signals in GWAS-identified regions. Our results support the utility of integrating data from comprehensive fine-mapping with expanding publicly available genomic databases to help clarify GWAS associations and identify functional candidates that warrant more onerous laboratory follow-up. Such efforts may aid the eventual discovery of disease-causing variant(s). PMID:27379672

  2. Conditional entropy in variation-adjusted windows detects selection signatures associated with expression quantitative trait loci (eQTLs)

    PubMed Central

    2015-01-01

    Background Over the past 50,000 years, shifts in human-environmental or human-human interactions shaped genetic differences within and among human populations, including variants under positive selection. Shaped by environmental factors, such variants influence the genetics of modern health, disease, and treatment outcome. Because evolutionary processes tend to act on gene regulation, we test whether regulatory variants are under positive selection. We introduce a new approach to enhance detection of genetic markers undergoing positive selection, using conditional entropy to capture recent local selection signals. Results We use conditional logistic regression to compare our Adjusted Haplotype Conditional Entropy (H|H) measure of positive selection to existing positive selection measures. H|H and existing measures were applied to published regulatory variants acting in cis (cis-eQTLs), with conditional logistic regression testing whether regulatory variants undergo stronger positive selection than the surrounding gene. These cis-eQTLs were drawn from six independent studies of genotype and RNA expression. The conditional logistic regression shows that, overall, H|H is substantially more powerful than existing positive-selection methods in identifying cis-eQTLs against other Single Nucleotide Polymorphisms (SNPs) in the same genes. When broken down by Gene Ontology, H|H predictions are particularly strong in some biological process categories, where regulatory variants are under strong positive selection compared to the bulk of the gene, distinct from those GO categories under overall positive selection. However, cis-eQTLs in a second group of genes lack positive selection signatures detectable by H|H, consistent with ancient short haplotypes compared to the surrounding gene (for example, in innate immunity GO:0042742); under such other modes of selection, H|H would not be expected to be a strong predictor. 
These conditional logistic regression models are adjusted for minor allele frequency (MAF); otherwise, ascertainment bias is a huge factor in all eQTL data sets. Relationships between Gene Ontology categories, positive selection and eQTL specificity were replicated with H|H in a single larger data set. Our measure, Adjusted Haplotype Conditional Entropy (H|H), was essential in generating all of the results above because it: 1) is a stronger overall predictor for eQTLs than comparable existing approaches, and 2) shows low sequential auto-correlation, overcoming problems with convergence of these conditional regression statistical models. Conclusions Our new method, H|H, provides a consistently more robust signal associated with cis-eQTLs compared to existing methods. We interpret this to indicate that some cis-eQTLs are under positive selection compared to their surrounding genes. Conditional entropy indicative of a selective sweep is an especially strong predictor of eQTLs for genes in several biological processes of medical interest. Where conditional entropy is a weak or negative predictor of eQTLs, such as innate immune genes, this would be consistent with balancing selection acting on such eQTLs over long time periods. Different measures of selection may be needed for variant prioritization under other modes of evolutionary selection. PMID:26111110

  3. No Association between Oxytocin Receptor (OXTR) Gene Polymorphisms and Experimentally Elicited Social Preferences

    PubMed Central

    Apicella, Coren L.; Cesarini, David; Johannesson, Magnus; Dawes, Christopher T.; Lichtenstein, Paul; Wallace, Björn; Beauchamp, Jonathan; Westberg, Lars

    2010-01-01

    Background Oxytocin (OXT) has been implicated in a suite of complex social behaviors including observed choices in economic laboratory experiments. However, actual studies of associations between oxytocin receptor (OXTR) gene variants and experimentally elicited social preferences are rare. Methodology/Principal Findings We test hypotheses of associations between social preferences, as measured by behavior in two economic games, and 9 single nucleotide polymorphisms (SNPs) of the OXTR gene in a sample of Swedish twins (n = 684). Two standard economic games, the dictator game and the trust game, both involving real monetary consequences, were used to elicit such preferences. After correction for multiple hypothesis testing, we found no significant associations between any of the 9 single nucleotide polymorphisms (SNPs) and behavior in either of the games. Conclusion We were unable to replicate the most significant association reported in previous research between the amount donated in a dictator game and an OXTR genetic variant. PMID:20585395

  4. No association between oxytocin receptor (OXTR) gene polymorphisms and experimentally elicited social preferences.

    PubMed

    Apicella, Coren L; Cesarini, David; Johannesson, Magnus; Dawes, Christopher T; Lichtenstein, Paul; Wallace, Björn; Beauchamp, Jonathan; Westberg, Lars

    2010-06-16

    Oxytocin (OXT) has been implicated in a suite of complex social behaviors including observed choices in economic laboratory experiments. However, actual studies of associations between oxytocin receptor (OXTR) gene variants and experimentally elicited social preferences are rare. We test hypotheses of associations between social preferences, as measured by behavior in two economic games, and 9 single nucleotide polymorphisms (SNPs) of the OXTR gene in a sample of Swedish twins (n = 684). Two standard economic games, the dictator game and the trust game, both involving real monetary consequences, were used to elicit such preferences. After correction for multiple hypothesis testing, we found no significant associations between any of the 9 single nucleotide polymorphisms (SNPs) and behavior in either of the games. We were unable to replicate the most significant association reported in previous research between the amount donated in a dictator game and an OXTR genetic variant.

  5. Contribution of 20 single nucleotide polymorphisms of 13 genes to dyslipidemia associated with antiretroviral therapy.

    PubMed

    Arnedo, Mireia; Taffé, Patrick; Sahli, Roland; Furrer, Hansjakob; Hirschel, Bernard; Elzi, Luigia; Weber, Rainer; Vernazza, Pietro; Bernasconi, Enos; Darioli, Roger; Bergmann, Sven; Beckmann, Jacques S; Telenti, Amalio; Tarr, Philip E

    2007-09-01

    HIV-1 infected individuals have an increased cardiovascular risk which is partially mediated by dyslipidemia. Single nucleotide polymorphisms in multiple genes involved in lipid transport and metabolism are presumed to modulate the risk of dyslipidemia in response to antiretroviral therapy. The contribution to dyslipidemia of 20 selected single nucleotide polymorphisms of 13 genes reported in the literature to be associated with plasma lipid levels (ABCA1, ADRB2, APOA5, APOC3, APOE, CETP, LIPC, LIPG, LPL, MDR1, MTP, SCARB1, and TNF) was assessed by longitudinally modeling more than 4400 plasma lipid determinations in 438 antiretroviral therapy-treated participants during a median period of 4.8 years. An exploratory genetic score was tested that takes into account the cumulative contribution of multiple gene variants to plasma lipids. Variants of ABCA1, APOA5, APOC3, APOE, and CETP contributed to plasma triglyceride levels, particularly in the setting of ritonavir-containing antiretroviral therapy. Variants of APOA5 and CETP contributed to high-density lipoprotein-cholesterol levels. Variants of CETP and LIPG contributed to non-high-density lipoprotein-cholesterol levels, a finding not reported previously. Sustained hypertriglyceridemia and low high-density lipoprotein-cholesterol during the study period was significantly associated with the genetic score. Single nucleotide polymorphisms of ABCA1, APOA5, APOC3, APOE, and CETP contribute to plasma triglyceride and high-density lipoprotein-cholesterol levels during antiretroviral therapy exposure. Genetic profiling may contribute to the identification of patients at risk for antiretroviral therapy-related dyslipidemia.

  6. Selection and explosive growth alter genetic architecture and hamper the detection of causal rare variants

    PubMed Central

    Zaitlen, Noah A.; Ye, Chun Jimmie; Witte, John S.

    2016-01-01

    The role of rare alleles in complex phenotypes has been hotly debated, but most rare variant association tests (RVATs) do not account for the evolutionary forces that affect genetic architecture. Here, we use simulation and numerical algorithms to show that explosive population growth, as experienced by human populations, can dramatically increase the impact of very rare alleles on trait variance. We then assess the ability of RVATs to detect causal loci using simulations and human RNA-seq data. Surprisingly, we find that statistical performance is worst for phenotypes in which genetic variance is due mainly to rare alleles, and explosive population growth decreases power. Although many studies have attempted to identify causal rare variants, few have reported novel associations. This has sometimes been interpreted to mean that rare variants make negligible contributions to complex trait heritability. Our work shows that RVATs are not robust to realistic human evolutionary forces, so general conclusions about the impact of rare variants on complex traits may be premature. PMID:27197206

  7. Association of five SNPs with human hair colour in the Polish population.

    PubMed

    Siewierska-Górska, A; Sitek, A; Żądzińska, E; Bartosz, G; Strapagiel, D

    2017-03-01

    Twenty-two variants (single nucleotide polymorphisms - SNPs) of the genes involved in hair pigmentation (OCA2, HERC2, MC1R, SLC24A5, SLC45A2, TPCN2, TYR, TYRP1) were genotyped in a group of 186 Polish participants, representing a range of hair colours (45 red, 64 blond, 77 dark). A genotype-phenotype association analysis was performed. Using z-statistics we identified three variants highly associated with different hair colour categories (rs12913832:A>G in HERC2, rs1805007:T>C and rs1805008:C>T in MC1R). Two variants, rs1800401:C>T in OCA2 and rs16891982:C>G in SLC45A2, showed a high probability of a relation with hair colour, although that probability did not exceed the threshold of statistical significance after applying the Bonferroni correction. We created and validated logistic regression models in order to test the usefulness of the sets of polymorphisms for hair colour prediction in the Polish population. We subjected four models to stratified cross-validation. The first model consisted of the three polymorphisms that proved to be important in the association analysis. The second model included, in addition to these polymorphisms, rs16891982:C>G in SLC45A2. The third model included, in addition to the variants relevant in the association analysis, rs1800401:C>T in OCA2. The fourth model consisted of the set of polymorphisms from the first model supplemented with rs16891982:C>G in SLC45A2 and rs1800401:C>T in OCA2. The validation of our models has shown that the inclusion of rs16891982:C>G in SLC45A2 and rs1800401:C>T in OCA2 improves the prediction of red hair in comparison with the algorithm including only rs12913832:A>G in HERC2, rs1805007:T>C and rs1805008:C>T in MC1R. The model consisting of all the five above-mentioned genetic variants has shown good prediction accuracies, expressed by the area under the curve (AUC) of the receiver operating characteristics: 0.84 for the red-haired, 0.82 for the dark-haired and 0.71 for the blond-haired. A genotype-phenotype association analysis brought results similar to those in other studies and confirmed the role of rs16891982:C>G, rs12913832:A>G, rs1805007:T>C and rs1805008:C>T in hair colour determination in the Polish population. Our study demonstrated for the first time a possible contribution of the rs1800401:C>T SNP in the OCA2 gene to hair colour determination. Including this single nucleotide polymorphism in the actual hair colour predicting models would improve their predictive accuracy. Copyright © 2017 Elsevier GmbH. All rights reserved.
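
    The Bonferroni correction mentioned in this record divides the family-wise significance level by the number of tests. The sketch below is only an illustration: it assumes one test per genotyped variant (22) and a family-wise level of 0.05, neither of which the abstract states explicitly, and the function names are hypothetical.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under the Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def passes_bonferroni(p_values, alpha=0.05):
    """Flag each p-value that stays significant after correction."""
    threshold = bonferroni_threshold(alpha, len(p_values))
    return [p <= threshold for p in p_values]


# Assumption: 22 tests (one per genotyped variant) at alpha = 0.05.
threshold = bonferroni_threshold(0.05, 22)
print(f"{threshold:.5f}")  # prints 0.00227
```

    Under these assumptions a variant with a nominal P of, say, 0.01 would look associated on its own yet fall short of the corrected threshold, which is the situation the abstract describes for the OCA2 and SLC45A2 variants.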

  8. The UK10K project identifies rare variants in health and disease.

    PubMed

    Walter, Klaudia; Min, Josine L; Huang, Jie; Crooks, Lucy; Memari, Yasin; McCarthy, Shane; Perry, John R B; Xu, ChangJiang; Futema, Marta; Lawson, Daniel; Iotchkova, Valentina; Schiffels, Stephan; Hendricks, Audrey E; Danecek, Petr; Li, Rui; Floyd, James; Wain, Louise V; Barroso, Inês; Humphries, Steve E; Hurles, Matthew E; Zeggini, Eleftheria; Barrett, Jeffrey C; Plagnol, Vincent; Richards, J Brent; Greenwood, Celia M T; Timpson, Nicholas J; Durbin, Richard; Soranzo, Nicole

    2015-10-01

    The contribution of rare and low-frequency variants to human traits is largely unexplored. Here we describe insights from sequencing whole genomes (low read depth, 7×) or exomes (high read depth, 80×) of nearly 10,000 individuals from population-based and disease collections. In extensively phenotyped cohorts we characterize over 24 million novel sequence variants, generate a highly accurate imputation reference panel and identify novel alleles associated with levels of triglycerides (APOB), adiponectin (ADIPOQ) and low-density lipoprotein cholesterol (LDLR and RGAG1) from single-marker and rare variant aggregation tests. We describe population structure and functional annotation of rare and low-frequency variants, use the data to estimate the benefits of sequencing for association studies, and summarize lessons from disease-specific collections. Finally, we make available an extensive resource, including individual-level genetic and phenotypic data and web-based tools to facilitate the exploration of association results.

  9. Novel Common Genetic Susceptibility Loci for Colorectal Cancer.

    PubMed

    Schmit, Stephanie L; Edlund, Christopher K; Schumacher, Fredrick R; Gong, Jian; Harrison, Tabitha A; Huyghe, Jeroen R; Qu, Chenxu; Melas, Marilena; Van Den Berg, David J; Wang, Hansong; Tring, Stephanie; Plummer, Sarah J; Albanes, Demetrius; Alonso, M Henar; Amos, Christopher I; Anton, Kristen; Aragaki, Aaron K; Arndt, Volker; Barry, Elizabeth L; Berndt, Sonja I; Bezieau, Stéphane; Bien, Stephanie; Bloomer, Amanda; Boehm, Juergen; Boutron-Ruault, Marie-Christine; Brenner, Hermann; Brezina, Stefanie; Buchanan, Daniel D; Butterbach, Katja; Caan, Bette J; Campbell, Peter T; Carlson, Christopher S; Castelao, Jose E; Chan, Andrew T; Chang-Claude, Jenny; Chanock, Stephen J; Cheng, Iona; Cheng, Ya-Wen; Chin, Lee Soo; Church, James M; Church, Timothy; Coetzee, Gerhard A; Cotterchio, Michelle; Cruz Correa, Marcia; Curtis, Keith R; Duggan, David; Easton, Douglas F; English, Dallas; Feskens, Edith J M; Fischer, Rocky; FitzGerald, Liesel M; Fortini, Barbara K; Fritsche, Lars G; Fuchs, Charles S; Gago-Dominguez, Manuela; Gala, Manish; Gallinger, Steven J; Gauderman, W James; Giles, Graham G; Giovannucci, Edward L; Gogarten, Stephanie M; Gonzalez-Villalpando, Clicerio; Gonzalez-Villalpando, Elena M; Grady, William M; Greenson, Joel K; Gsur, Andrea; Gunter, Marc; Haiman, Christopher A; Hampe, Jochen; Harlid, Sophia; Harju, John F; Hayes, Richard B; Hofer, Philipp; Hoffmeister, Michael; Hopper, John L; Huang, Shu-Chen; Huerta, Jose Maria; Hudson, Thomas J; Hunter, David J; Idos, Gregory E; Iwasaki, Motoki; Jackson, Rebecca D; Jacobs, Eric J; Jee, Sun Ha; Jenkins, Mark A; Jia, Wei-Hua; Jiao, Shuo; Joshi, Amit D; Kolonel, Laurence N; Kono, Suminori; Kooperberg, Charles; Krogh, Vittorio; Kuehn, Tilman; Küry, Sébastien; LaCroix, Andrea; Laurie, Cecelia A; Lejbkowicz, Flavio; Lemire, Mathieu; Lenz, Heinz-Josef; Levine, David; Li, Christopher I; Li, Li; Lieb, Wolfgang; Lin, Yi; Lindor, Noralane M; Liu, Yun-Ru; Loupakis, Fotios; Lu, Yingchang; Luh, Frank; Ma, Jing; Mancao, Christoph; 
Manion, Frank J; Markowitz, Sanford D; Martin, Vicente; Matsuda, Koichi; Matsuo, Keitaro; McDonnell, Kevin J; McNeil, Caroline E; Milne, Roger; Molina, Antonio J; Mukherjee, Bhramar; Murphy, Neil; Newcomb, Polly A; Offit, Kenneth; Omichessan, Hanane; Palli, Domenico; Cotoré, Jesus P Paredes; Pérez-Mayoral, Julyann; Pharoah, Paul D; Potter, John D; Qu, Conghui; Raskin, Leon; Rennert, Gad; Rennert, Hedy S; Riggs, Bridget M; Schafmayer, Clemens; Schoen, Robert E; Sellers, Thomas A; Seminara, Daniela; Severi, Gianluca; Shi, Wei; Shibata, David; Shu, Xiao-Ou; Siegel, Erin M; Slattery, Martha L; Southey, Melissa; Stadler, Zsofia K; Stern, Mariana C; Stintzing, Sebastian; Taverna, Darin; Thibodeau, Stephen N; Thomas, Duncan C; Trichopoulou, Antonia; Tsugane, Shoichiro; Ulrich, Cornelia M; van Duijnhoven, Franzel J B; van Guelpan, Bethany; Vijai, Joseph; Virtamo, Jarmo; Weinstein, Stephanie J; White, Emily; Win, Aung Ko; Wolk, Alicja; Woods, Michael; Wu, Anna H; Wu, Kana; Xiang, Yong-Bing; Yen, Yun; Zanke, Brent W; Zeng, Yi-Xin; Zhang, Ben; Zubair, Niha; Kweon, Sun-Seog; Figueiredo, Jane C; Zheng, Wei; Marchand, Loic Le; Lindblom, Annika; Moreno, Victor; Peters, Ulrike; Casey, Graham; Hsu, Li; Conti, David V; Gruber, Stephen B

    2018-06-16

    Previous genome-wide association studies (GWAS) have identified 42 loci (P < 5 × 10^-8) associated with risk of colorectal cancer (CRC). Expanded consortium efforts facilitating the discovery of additional susceptibility loci may capture unexplained familial risk. We conducted a GWAS in European descent CRC cases and control subjects using a discovery-replication design, followed by examination of novel findings in a multiethnic sample (cumulative n = 163 315). In the discovery stage (36 948 case subjects/30 864 control subjects), we identified genetic variants with a minor allele frequency of 1% or greater associated with risk of CRC using logistic regression followed by a fixed-effects inverse variance weighted meta-analysis. All novel independent variants reaching genome-wide statistical significance (two-sided P < 5 × 10^-8) were tested for replication in separate European ancestry samples (12 952 case subjects/48 383 control subjects). Next, we examined the generalizability of discovered variants in East Asians, African Americans, and Hispanics (12 085 case subjects/22 083 control subjects). Finally, we examined the contributions of novel risk variants to familial relative risk and examined the prediction capabilities of a polygenic risk score. All statistical tests were two-sided. The discovery GWAS identified 11 variants associated with CRC at P < 5 × 10^-8, of which nine (at 4q22.2/5p15.33/5p13.1/6p21.31/6p12.1/10q11.23/12q24.21/16q24.1/20q13.13) independently replicated at a P value of less than .05. Multiethnic follow-up supported the generalizability of discovery findings. These results demonstrated a 14.7% increase in familial relative risk explained by common risk alleles from 10.3% (95% confidence interval [CI] = 7.9% to 13.7%; known variants) to 11.9% (95% CI = 9.2% to 15.5%; known and novel variants). A polygenic risk score identified 4.3% of the population at an odds ratio for developing CRC of at least 2.0. This study provides insight into the architecture of common genetic variation contributing to CRC etiology and improves risk prediction for individualized screening.
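
    The fixed-effects inverse variance weighted meta-analysis named in this record can be sketched as follows. This is a minimal illustration, not the consortium's pipeline: the per-study effect estimates (log odds ratios) and standard errors are hypothetical, and the function name is an assumption.

```python
import math


def fixed_effects_meta(betas, ses):
    """Pool per-study effect estimates with inverse-variance weights.

    betas: per-study effect estimates (e.g. log odds ratios)
    ses:   matching standard errors
    Returns (pooled beta, pooled SE, z statistic, two-sided p-value).
    """
    weights = [1.0 / se ** 2 for se in ses]      # w_i = 1 / SE_i^2
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    z = beta / se
    # Two-sided p-value under a standard normal null.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return beta, se, z, p


# Hypothetical per-study log odds ratios and standard errors.
beta, se, z, p = fixed_effects_meta([0.10, 0.14, 0.08], [0.03, 0.05, 0.04])
print(f"beta={beta:.4f} se={se:.4f} z={z:.2f} p={p:.2e}")
```

    Studies with smaller standard errors receive larger weights, so the pooled estimate is pulled toward the most precise studies, and the pooled standard error is never larger than that of any single study.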

  10. Whole Genome Sequencing Increases Molecular Diagnostic Yield Compared with Current Diagnostic Testing for Inherited Retinal Disease.

    PubMed

    Ellingford, Jamie M; Barton, Stephanie; Bhaskar, Sanjeev; Williams, Simon G; Sergouniotis, Panagiotis I; O'Sullivan, James; Lamb, Janine A; Perveen, Rahat; Hall, Georgina; Newman, William G; Bishop, Paul N; Roberts, Stephen A; Leach, Rick; Tearle, Rick; Bayliss, Stuart; Ramsden, Simon C; Nemeth, Andrea H; Black, Graeme C M

    2016-05-01

    To compare the efficacy of whole genome sequencing (WGS) with targeted next-generation sequencing (NGS) in the diagnosis of inherited retinal disease (IRD). Case series. A total of 562 patients diagnosed with IRD. We performed a direct comparative analysis of current molecular diagnostics with WGS. We retrospectively reviewed the findings from a diagnostic NGS DNA test for 562 patients with IRD. A subset of 46 of 562 patients (encompassing potential clinical outcomes of diagnostic analysis) also underwent WGS, and we compared mutation detection rates and molecular diagnostic yields. In addition, we compared the sensitivity and specificity of the 2 techniques to identify known single nucleotide variants (SNVs) using 6 control samples with publicly available genotype data. Diagnostic yield of genomic testing. Across known disease-causing genes, targeted NGS and WGS achieved similar levels of sensitivity and specificity for SNV detection. However, WGS also identified 14 clinically relevant genetic variants that had not been identified by NGS diagnostic testing for the 46 individuals with IRD. These variants included large deletions and variants in noncoding regions of the genome. Identification of these variants confirmed a molecular diagnosis of IRD for 11 of the 33 individuals referred for WGS who had not obtained a molecular diagnosis through targeted NGS testing. Weighted estimates, accounting for population structure, suggest that WGS methods could result in an overall 29% (95% confidence interval, 15-45) uplift in diagnostic yield. We show that WGS methods can detect disease-causing genetic variants missed by current NGS diagnostic methodologies for IRD and thereby demonstrate the clinical utility and additional value of WGS. Copyright © 2016 American Academy of Ophthalmology. Published by Elsevier Inc. All rights reserved.

  11. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Fahrenkrog, Annette M.; Neves, Leandro G.; Resende, Jr., Marcio F. R.

    Genome-wide association studies (GWAS) have been used extensively to dissect the genetic regulation of complex traits in plants. These studies have focused largely on the analysis of common genetic variants despite the abundance of rare polymorphisms in several species, and their potential role in trait variation. Here, we conducted the first GWAS in Populus deltoides, a genetically diverse keystone forest species in North America and an important short rotation woody crop for the bioenergy industry. We searched for associations between eight growth and wood composition traits, and common and low-frequency single-nucleotide polymorphisms detected by targeted resequencing of 18 153 genes in a population of 391 unrelated individuals. To increase power to detect associations with low-frequency variants, multiple-marker association tests were used in combination with single-marker association tests. Significant associations were discovered for all phenotypes and are indicative that low-frequency polymorphisms contribute to phenotypic variance of several bioenergy traits. Our results suggest that both common and low-frequency variants need to be considered for a comprehensive understanding of the genetic regulation of complex traits, particularly in species that carry large numbers of rare polymorphisms. Lastly, these polymorphisms may be critical for the development of specialized plant feedstocks for bioenergy.

  12. Magnetic field induced random pulse trains of magnetic and acoustic noises in martensitic single-crystal Ni2MnGa

    NASA Astrophysics Data System (ADS)

    Daróczi, Lajos; Piros, Eszter; Tóth, László Z.; Beke, Dezső L.

    2017-07-01

    Jerky magnetic and acoustic noises were evoked in a single variant martensitic Ni2MnGa single crystal (produced by uniaxial compression) by application of an external magnetic field along the hard magnetization direction. It is shown that after reaching the detwinning threshold, spontaneous reorientation of martensite variants (twins) leads not only to acoustic emission but magnetic two-directional noises as well. At small magnetic fields, below the above threshold, unidirectional magnetic emission is also observed and attributed to a Barkhausen-type noise due to magnetic domain wall motions during magnetization along the hard direction. After the above first run, in cycles of decreasing and increasing magnetic field, at low-field values, weak, unidirectional Barkhausen noise is detected and attributed to the discontinuous motion of domain walls during magnetization along the easy magnetization direction. The magnetic noise is also measured by constraining the sample in the same initial variant state along the hard direction and, after the unidirectional noise (as obtained also in the first run), a two-directional noise package is developed and it is attributed to domain rotations. From the statistical analysis of the above noises, the critical exponents, characterizing the power-law behavior, are calculated and compared with each other and with the literature data. Time correlations within the magnetic as well as acoustic signals lead to a common scaled power function (with β =-1.25 exponent) for both types of signals.

  13. Analysis of Rare, Exonic Variation amongst Subjects with Autism Spectrum Disorders and Population Controls

    PubMed Central

    Liu, Li; Sabo, Aniko; Neale, Benjamin M.; Nagaswamy, Uma; Stevens, Christine; Lim, Elaine; Bodea, Corneliu A.; Muzny, Donna; Reid, Jeffrey G.; Banks, Eric; Coon, Hillary; DePristo, Mark; Dinh, Huyen; Fennel, Tim; Flannick, Jason; Gabriel, Stacey; Garimella, Kiran; Gross, Shannon; Hawes, Alicia; Lewis, Lora; Makarov, Vladimir; Maguire, Jared; Newsham, Irene; Poplin, Ryan; Ripke, Stephan; Shakir, Khalid; Samocha, Kaitlin E.; Wu, Yuanqing; Boerwinkle, Eric; Buxbaum, Joseph D.; Cook, Edwin H.; Devlin, Bernie; Schellenberg, Gerard D.; Sutcliffe, James S.; Daly, Mark J.; Gibbs, Richard A.; Roeder, Kathryn

    2013-01-01

    We report on results from whole-exome sequencing (WES) of 1,039 subjects diagnosed with autism spectrum disorders (ASD) and 870 controls selected from the NIMH repository to be of similar ancestry to cases. The WES data came from two centers using different methods to produce sequence and to call variants from it. Therefore, an initial goal was to ensure the distribution of rare variation was similar for data from different centers. This proved straightforward by filtering called variants by fraction of missing data, read depth, and balance of alternative to reference reads. Results were evaluated using seven samples sequenced at both centers and by results from the association study. Next we addressed how the data and/or results from the centers should be combined. Gene-based analyses of association was an obvious choice, but should statistics for association be combined across centers (meta-analysis) or should data be combined and then analyzed (mega-analysis)? Because of the nature of many gene-based tests, we showed by theory and simulations that mega-analysis has better power than meta-analysis. Finally, before analyzing the data for association, we explored the impact of population structure on rare variant analysis in these data. Like other recent studies, we found evidence that population structure can confound case-control studies by the clustering of rare variants in ancestry space; yet, unlike some recent studies, for these data we found that principal component-based analyses were sufficient to control for ancestry and produce test statistics with appropriate distributions. After using a variety of gene-based tests and both meta- and mega-analysis, we found no new risk genes for ASD in this sample. Our results suggest that standard gene-based tests will require much larger samples of cases and controls before being effective for gene discovery, even for a disorder like ASD. PMID:23593035

  14. Complex analysis of urate transporters SLC2A9, SLC22A12 and functional characterization of non-synonymous allelic variants of GLUT9 in the Czech population: no evidence of effect on hyperuricemia and gout.

    PubMed

    Hurba, Olha; Mancikova, Andrea; Krylov, Vladimir; Pavlikova, Marketa; Pavelka, Karel; Stibůrková, Blanka

    2014-01-01

    Using European descent Czech populations, we performed a study of SLC2A9 and SLC22A12 genes previously identified as being associated with serum uric acid concentrations and gout. This is the first study of the impact of non-synonymous allelic variants on the function of GLUT9 except for patients suffering from renal hypouricemia type 2. The cohort consisted of 250 individuals (150 controls, 54 nonspecific hyperuricemics and 46 primary gout and/or hyperuricemia subjects). We analyzed 13 exons of SLC2A9 (GLUT9 variant 1 and GLUT9 variant 2) and 10 exons of SLC22A12 by PCR amplification and direct sequencing. Allelic variants were prepared and their urate uptake and subcellular localization were studied using a Xenopus oocyte expression system. The functional studies were analyzed using the non-parametric Wilcoxon and Kruskal-Wallis tests; the association study used the Fisher exact test and a linear regression approach. We identified a total of 52 sequence variants (12 unpublished). Eight non-synonymous allelic variants were found only in SLC2A9: rs6820230, rs2276961, rs144196049, rs112404957, rs73225891, rs16890979, rs3733591 and rs2280205. None of these variants showed any significant difference in the expression of GLUT9 and in urate transport. In the association study, eight variants showed a possible association with hyperuricemia. However, seven of these were in introns and the one exon-located variant, rs7932775, did not show a statistically significant association with serum uric acid concentration. Our results did not confirm any effect of SLC22A12 and SLC2A9 variants on serum uric acid concentration. Our complex approach using association analysis together with functional and immunohistochemical characterization of non-synonymous allelic variants did not show any influence on expression, subcellular localization and urate uptake of GLUT9.
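
    The Fisher exact test used in this record's association study can be illustrated on a 2x2 table of allele or genotype counts. The sketch below is a plain-Python version of the two-sided test that sums the probabilities of all tables no more likely than the observed one; the counts are hypothetical and do not come from the study.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Rows could be cases/controls and columns carriers/non-carriers.
    Margins are held fixed; the p-value sums the hypergeometric
    probabilities of every table that is no more likely than the
    observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def numerator(k: int) -> int:
        # Unnormalised hypergeometric probability of k in the top-left cell.
        return comb(row1, k) * comb(row2, col1 - k)

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    observed = numerator(a)
    # Integer comparison avoids floating-point ties.
    extreme = sum(numerator(k) for k in range(lowest, highest + 1)
                  if numerator(k) <= observed)
    return extreme / comb(n, col1)


# Hypothetical counts: 3 carriers among 4 cases, 1 carrier among 4 controls.
print(round(fisher_exact_two_sided(3, 1, 1, 3), 4))  # prints 0.4857
```

    Because the test enumerates exact table probabilities instead of relying on a large-sample approximation, it remains valid for the small cell counts that are typical of rare variants in a cohort of 250 individuals.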

  15. Systematic Integration of Brain eQTL and GWAS Identifies ZNF323 as a Novel Schizophrenia Risk Gene and Suggests Recent Positive Selection Based on Compensatory Advantage on Pulmonary Function

    PubMed Central

    Luo, Xiong-Jian; Mattheisen, Manuel; Li, Ming; Huang, Liang; Rietschel, Marcella; Børglum, Anders D.; Als, Thomas D.; van den Oord, Edwin J.; Aberg, Karolina A.; Mors, Ole; Mortensen, Preben Bo; Luo, Zhenwu; Degenhardt, Franziska; Cichon, Sven; Schulze, Thomas G.; Nöthen, Markus M.; Su, Bing; Zhao, Zhongming; Gan, Lin; Yao, Yong-Gang

    2015-01-01

    Genome-wide association studies have identified multiple risk variants and loci that show robust association with schizophrenia. Nevertheless, it remains unclear how these variants confer risk to schizophrenia. In addition, the driving force that maintains the schizophrenia risk variants in human gene pool is poorly understood. To investigate whether expression-associated genetic variants contribute to schizophrenia susceptibility, we systematically integrated brain expression quantitative trait loci and genome-wide association data of schizophrenia using Sherlock, a Bayesian statistical framework. Our analyses identified ZNF323 as a schizophrenia risk gene (P = 2.22×10^-6). Subsequent analyses confirmed the association of the ZNF323 and its expression-associated single nucleotide polymorphism rs1150711 in independent samples (gene-expression: P = 1.40×10^-6; single-marker meta-analysis in the combined discovery and replication sample comprising 44123 individuals: P = 6.85×10^-10). We found that the ZNF323 was significantly downregulated in hippocampus and frontal cortex of schizophrenia patients (P = .0038 and P = .0233, respectively). Evidence for pleiotropic effects was detected (association of rs1150711 with lung function and gene expression of ZNF323 in lung: P = 6.62×10^-5 and P = 9.00×10^-5, respectively) with the risk allele (T allele) for schizophrenia acting as protective allele for lung function. Subsequent population genetics analyses suggest that the risk allele (T) of rs1150711 might have undergone recent positive selection in human population. Our findings suggest that the ZNF323 is a schizophrenia susceptibility gene whose expression may influence schizophrenia risk. Our study also illustrates a possible mechanism for maintaining schizophrenia risk variants in the human gene pool. PMID:25759474

  16. Single-Molecule Counting of Point Mutations by Transient DNA Binding

    NASA Astrophysics Data System (ADS)

    Su, Xin; Li, Lidan; Wang, Shanshan; Hao, Dandan; Wang, Lei; Yu, Changyuan

    2017-03-01

    High-confidence detection of point mutations is important for disease diagnosis and clinical practice. Hybridization probes are extensively used, but are hindered by their poor single-nucleotide selectivity. Shortening the length of DNA hybridization probes weakens the stability of the probe-target duplex, leading to transient binding between complementary sequences. The kinetics of probe-target binding events are highly dependent on the number of complementary base pairs. Here, we present a single-molecule assay for point mutation detection based on transient DNA binding and use of total internal reflection fluorescence microscopy. Statistical analysis of single-molecule kinetics enabled us to effectively discriminate between wild type DNA sequences and single-nucleotide variants at the single-molecule level. A higher single-nucleotide discrimination is achieved than in our previous work by optimizing the assay conditions, which is guided by statistical modeling of kinetics with a gamma distribution. The KRAS c.34 A mutation can be clearly differentiated from the wild type sequence (KRAS c.34 G) at a relative abundance as low as 0.01% mutant to WT. To demonstrate the feasibility of this method for analysis of clinically relevant biological samples, we used this technology to detect mutations in single-stranded DNA generated from asymmetric RT-PCR of mRNA from two cancer cell lines.

  17. Trends in Correlation-Based Pattern Recognition and Tracking in Forward-Looking Infrared Imagery

    PubMed Central

    Alam, Mohammad S.; Bhuiyan, Sharif M. A.

    2014-01-01

    In this paper, we review the recent trends and advancements on correlation-based pattern recognition and tracking in forward-looking infrared (FLIR) imagery. In particular, we discuss matched filter-based correlation techniques for target detection and tracking which are widely used for various real time applications. We analyze and present test results involving recently reported matched filters such as the maximum average correlation height (MACH) filter and its variants, and distance classifier correlation filter (DCCF) and its variants. Test results are presented for both single/multiple target detection and tracking using various real-life FLIR image sequences. PMID:25061840

  18. Evaluation of targeted exome sequencing for 28 protein-based blood group systems, including the homologous gene systems, for blood group genotyping.

    PubMed

    Schoeman, Elizna M; Lopez, Genghis H; McGowan, Eunike C; Millard, Glenda M; O'Brien, Helen; Roulis, Eileen V; Liew, Yew-Wah; Martin, Jacqueline R; McGrath, Kelli A; Powley, Tanya; Flower, Robert L; Hyland, Catherine A

    2017-04-01

    Blood group single nucleotide polymorphism genotyping probes for a limited range of polymorphisms. This study investigated whether massively parallel sequencing (also known as next-generation sequencing), with a targeted exome strategy, provides an extended blood group genotype and the extent to which massively parallel sequencing correctly genotypes in homologous gene systems, such as RH and MNS. Donor samples (n = 28) that were extensively phenotyped and genotyped using single nucleotide polymorphism typing, were analyzed using the TruSight One Sequencing Panel and MiSeq platform. Genes for 28 protein-based blood group systems, GATA1, and KLF1 were analyzed. Copy number variation analysis was used to characterize complex structural variants in the GYPC and RH systems. The average sequencing depth per target region was 66.2 ± 39.8. Each sample harbored on average 43 ± 9 variants, of which 10 ± 3 were used for genotyping. For the 28 samples, massively parallel sequencing variant sequences correctly matched expected sequences based on single nucleotide polymorphism genotyping data. Copy number variation analysis defined the Rh C/c alleles and complex RHD hybrids. Hybrid RHD*D-CE-D variants were correctly identified, but copy number variation analysis did not confidently distinguish between D and CE exon deletion versus rearrangement. The targeted exome sequencing strategy employed extended the range of blood group genotypes detected compared with single nucleotide polymorphism typing. This single-test format included detection of complex MNS hybrid cases and, with copy number variation analysis, defined RH hybrid genes along with the RHCE*C allele hitherto difficult to resolve by variant detection. The approach is economical compared with whole-genome sequencing and is suitable for a red blood cell reference laboratory setting. © 2017 AABB.

  19. Panel-based Genetic Diagnostic Testing for Inherited Eye Diseases is Highly Accurate and Reproducible and More Sensitive for Variant Detection Than Exome Sequencing

    PubMed Central

    Bujakowska, Kinga M.; Sousa, Maria E.; Fonseca-Kelly, Zoë D.; Taub, Daniel G.; Janessian, Maria; Wang, Dan Yi; Au, Elizabeth D.; Sims, Katherine B.; Sweetser, David A.; Fulton, Anne B.; Liu, Qin; Wiggs, Janey L.; Gai, Xiaowu; Pierce, Eric A.

    2015-01-01

    Purpose Next-generation sequencing (NGS) based methods are being adopted broadly for genetic diagnostic testing, but the performance characteristics of these techniques have not been fully defined with regard to test accuracy and reproducibility. Methods We developed a targeted enrichment and NGS approach for genetic diagnostic testing of patients with inherited eye disorders, including inherited retinal degenerations, optic atrophy and glaucoma. In preparation for providing this Genetic Eye Disease (GEDi) test on a CLIA-certified basis, we performed experiments to measure the sensitivity, specificity, reproducibility as well as the clinical sensitivity of the test. Results The GEDi test is highly reproducible and accurate, with sensitivity and specificity for single nucleotide variant detection of 97.9% and 100%, respectively. The sensitivity for variant detection was notably better than the 88.3% achieved by whole exome sequencing (WES) using the same metrics, due to better coverage of targeted genes in the GEDi test compared to commercially available exome capture sets. Prospective testing of 192 patients with IRDs indicated that the clinical sensitivity of the GEDi test is high, with a diagnostic rate of 51%. Conclusion The data suggest that based on quantified performance metrics, selective targeted enrichment is preferable to WES for genetic diagnostic testing. PMID:25412400
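
    The sensitivity and specificity figures quoted in this record follow from the standard confusion-matrix definitions. The sketch below shows the arithmetic only; the counts are hypothetical and are not the study's data, and the function names are assumptions.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of known variants that the test detects."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of reference (non-variant) sites correctly called as such."""
    return true_neg / (true_neg + false_pos)


# Hypothetical counts: 94 of 100 known variants detected, and no false
# variant calls at 100 reference sites.
print(f"sensitivity = {sensitivity(94, 6):.1%}")    # prints sensitivity = 94.0%
print(f"specificity = {specificity(100, 0):.1%}")   # prints specificity = 100.0%
```

    Sensitivity depends on how well the targeted genes are covered, which is why a panel with deeper coverage of those genes can detect more variants than an exome capture set while both keep specificity high.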

  20. Genome-Wide Association of the Laboratory-Based Nicotine Metabolite Ratio in Three Ancestries.

    PubMed

    Baurley, James W; Edlund, Christopher K; Pardamean, Carissa I; Conti, David V; Krasnow, Ruth; Javitz, Harold S; Hops, Hyman; Swan, Gary E; Benowitz, Neal L; Bergen, Andrew W

    2016-09-01

    Metabolic enzyme variation and other patient and environmental characteristics influence smoking behaviors, treatment success, and risk of related disease. Population-specific variation in metabolic genes contributes to challenges in developing and optimizing pharmacogenetic interventions. We applied a custom genome-wide genotyping array for addiction research (Smokescreen), to three laboratory-based studies of nicotine metabolism with oral or venous administration of labeled nicotine and cotinine, to model nicotine metabolism in multiple populations. The trans-3'-hydroxycotinine/cotinine ratio, the nicotine metabolite ratio (NMR), was the nicotine metabolism measure analyzed. Three hundred twelve individuals of self-identified European, African, and Asian American ancestry were genotyped and included in ancestry-specific genome-wide association scans (GWAS) and a meta-GWAS analysis of the NMR. We modeled natural-log transformed NMR with covariates: principal components of genetic ancestry, age, sex, body mass index, and smoking status. African and Asian American NMRs were statistically significantly (P values ≤ 5E-5) lower than European American NMRs. Meta-GWAS analysis identified 36 genome-wide significant variants over a 43 kilobase pair region at CYP2A6 with minimum P = 2.46E-18 at rs12459249, proximal to CYP2A6. Additional minima were located in intron 4 (rs56113850, P = 6.61E-18) and in the CYP2A6-CYP2A7 intergenic region (rs34226463, P = 1.45E-12). Most (34/36) genome-wide significant variants suggested reduced CYP2A6 activity; functional mechanisms were identified and tested in knowledge-bases. Conditional analysis resulted in intergenic variants of possible interest (P values < 5E-5). This meta-GWAS of the NMR identifies CYP2A6 variants, replicates the top-ranked single nucleotide polymorphism from a recent Finnish meta-GWAS of the NMR, identifies functional mechanisms, and provides pan-continental population biomarkers for nicotine metabolism. 
This multiple ancestry meta-GWAS of the laboratory study-based NMR provides novel evidence and replication for genome-wide association of CYP2A6 single nucleotide and insertion-deletion polymorphisms. We identify three regions of genome-wide significance: proximal, intronic, and distal to CYP2A6. We replicate the top-ranking single nucleotide polymorphism from a recent GWAS of the NMR in Finnish smokers, identify a functional mechanism for this intronic variant from in silico analyses of RNA-seq data that is consistent with CYP2A6 expression measured in postmortem lung and liver, and provide additional support for the intergenic region between CYP2A6 and CYP2A7. © The Author 2016. Published by Oxford University Press on behalf of the Society for Research on Nicotine and Tobacco.

  1. Genome-Wide Association of the Laboratory-Based Nicotine Metabolite Ratio in Three Ancestries

    PubMed Central

    Baurley, James W.; Edlund, Christopher K.; Pardamean, Carissa I.; Conti, David V.; Krasnow, Ruth; Javitz, Harold S.; Hops, Hyman; Swan, Gary E.; Benowitz, Neal L.

    2016-01-01

    Introduction: Metabolic enzyme variation and other patient and environmental characteristics influence smoking behaviors, treatment success, and risk of related disease. Population-specific variation in metabolic genes contributes to challenges in developing and optimizing pharmacogenetic interventions. We applied a custom genome-wide genotyping array for addiction research (Smokescreen), to three laboratory-based studies of nicotine metabolism with oral or venous administration of labeled nicotine and cotinine, to model nicotine metabolism in multiple populations. The trans-3′-hydroxycotinine/cotinine ratio, the nicotine metabolite ratio (NMR), was the nicotine metabolism measure analyzed. Methods: Three hundred twelve individuals of self-identified European, African, and Asian American ancestry were genotyped and included in ancestry-specific genome-wide association scans (GWAS) and a meta-GWAS analysis of the NMR. We modeled natural-log transformed NMR with covariates: principal components of genetic ancestry, age, sex, body mass index, and smoking status. Results: African and Asian American NMRs were statistically significantly (P values ≤ 5E-5) lower than European American NMRs. Meta-GWAS analysis identified 36 genome-wide significant variants over a 43 kilobase pair region at CYP2A6 with minimum P = 2.46E-18 at rs12459249, proximal to CYP2A6. Additional minima were located in intron 4 (rs56113850, P = 6.61E-18) and in the CYP2A6-CYP2A7 intergenic region (rs34226463, P = 1.45E-12). Most (34/36) genome-wide significant variants suggested reduced CYP2A6 activity; functional mechanisms were identified and tested in knowledge-bases. Conditional analysis resulted in intergenic variants of possible interest (P values < 5E-5). 
Conclusions: This meta-GWAS of the NMR identifies CYP2A6 variants, replicates the top-ranked single nucleotide polymorphism from a recent Finnish meta-GWAS of the NMR, identifies functional mechanisms, and provides pan-continental population biomarkers for nicotine metabolism. Implications: This multiple ancestry meta-GWAS of the laboratory study-based NMR provides novel evidence and replication for genome-wide association of CYP2A6 single nucleotide and insertion–deletion polymorphisms. We identify three regions of genome-wide significance: proximal, intronic, and distal to CYP2A6. We replicate the top-ranking single nucleotide polymorphism from a recent GWAS of the NMR in Finnish smokers, identify a functional mechanism for this intronic variant from in silico analyses of RNA-seq data that is consistent with CYP2A6 expression measured in postmortem lung and liver, and provide additional support for the intergenic region between CYP2A6 and CYP2A7. PMID:27113016

  2. Directed evolution to re-adapt a co-evolved network within an enzyme.

    PubMed

    Strafford, John; Payongsri, Panwajee; Hibbert, Edward G; Morris, Phattaraporn; Batth, Sukhjeet S; Steadman, David; Smith, Mark E B; Ward, John M; Hailes, Helen C; Dalby, Paul A

    2012-01-01

    We have previously used targeted active-site saturation mutagenesis to identify a number of transketolase single mutants that improved activity towards either glycolaldehyde (GA), or the non-natural substrate propionaldehyde (PA). Here, all attempts to recombine the singles into double mutants led to unexpected losses of specific activity towards both substrates. A typical trade-off occurred between soluble expression levels and specific activity for all single mutants, but many double mutants decreased both properties more severely suggesting a critical loss of protein stability or native folding. Statistical coupling analysis (SCA) of a large multiple sequence alignment revealed a network of nine co-evolved residues that affected all but one double mutant. Such networks maintain important functional properties such as activity, specificity, folding, stability, and solubility and may be rapidly disrupted by introducing one or more non-naturally occurring mutations. To identify variants of this network that would accept and improve upon our best D469 mutants for activity towards PA, we created a library of random single, double and triple mutants across seven of the co-evolved residues, combining our D469 variants with only naturally occurring mutations at the remaining sites. A triple mutant cluster at D469, E498 and R520 was found to behave synergistically for the specific activity towards PA. Protein expression was severely reduced by E498D and improved by R520Q, yet variants containing both mutations led to improved specific activity and enzyme expression, but with loss of solubility and the formation of inclusion bodies. D469S and R520Q combined synergistically to improve k(cat) 20-fold for PA, more than for any previous transketolase mutant. R520Q also doubled the specific activity of the previously identified D469T to create our most active transketolase mutant to date. 
Our results show that recombining active-site mutants obtained by saturation mutagenesis can rapidly destabilise critical networks of co-evolved residues, whereas beneficial single mutants can be retained and improved upon by randomly recombining them with natural variants at other positions in the network. Copyright © 2011 Elsevier B.V. All rights reserved.

  3. Towards a web-based decision support tool for selecting appropriate statistical test in medical and biological sciences.

    PubMed

    Suner, Aslı; Karakülah, Gökhan; Dicle, Oğuz

    2014-01-01

    Statistical hypothesis testing is an essential component of biological and medical studies for making inferences and estimates from the collected data; however, misuse of statistical tests is widespread. To prevent errors in selecting an appropriate statistical test, users can currently consult test selection algorithms developed for various purposes. However, no single flowchart presents the most common statistical tests used in biomedical research, which causes several problems: users must switch among algorithms, decision support in test selection is poor, and potential users are dissatisfied. Herein, we present a unified flowchart that covers the most commonly used statistical tests in the biomedical domain and provides a decision aid to non-statistician users choosing the appropriate test for their hypothesis. We also discuss some of the findings from integrating the existing flowcharts into a single, more comprehensive decision algorithm.

  4. Initial evaluation of discrete orthogonal basis reconstruction of ECT images

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Moody, E.B.; Donohue, K.D.

    1996-12-31

    Discrete orthogonal basis restoration (DOBR) is a linear, non-iterative, and robust method for solving inverse problems for systems characterized by shift-variant transfer functions. This simulation study evaluates the feasibility of using DOBR for reconstructing emission computed tomographic (ECT) images. The imaging system model uses typical SPECT parameters and incorporates the effects of attenuation, spatially-variant PSF, and Poisson noise in the projection process. Sample reconstructions and statistical error analyses for a class of digital phantoms compare the DOBR performance for Hartley and Walsh basis functions. Test results confirm that DOBR with either basis set produces images with good statistical properties. No problems were encountered with reconstruction instability. The flexibility of the DOBR method and its consistent performance warrant further investigation of DOBR as a means of ECT image reconstruction.

  5. Targeted Deep Resequencing Identifies Coding Variants in the PEAR1 Gene That Play a Role in Platelet Aggregation

    PubMed Central

    Kim, Yoonhee; Suktitipat, Bhoom; Yanek, Lisa R.; Faraday, Nauder; Wilson, Alexander F.; Becker, Diane M.; Becker, Lewis C.; Mathias, Rasika A.

    2013-01-01

    Platelet aggregation is heritable, and genome-wide association studies have detected strong associations with a common intronic variant of the platelet endothelial aggregation receptor 1 (PEAR1) gene both in African American and European American individuals. In this study, we used a sequencing approach to identify additional exonic variants in PEAR1 that may also determine variability in platelet aggregation in the GeneSTAR Study. A 0.3 Mb targeted region on chromosome 1q23.1 including the entire PEAR1 gene was Sanger sequenced in 104 subjects (45% male, 49% African American, age = 52±13) selected on the basis of hyper- and hypo-aggregation across three different agonists (collagen, epinephrine, and adenosine diphosphate). Single-variant and multi-variant burden tests for association were performed. Of the 235 variants identified through sequencing, 61 were novel, and three of these were missense variants. More rare variants (MAF<5%) were noted in African Americans compared to European Americans (108 vs. 45). The common intronic GWAS-identified variant (rs12041331) demonstrated the most significant association signal in African Americans (p = 4.020×10^−4); no association was seen for additional exonic variants in this group. In contrast, multi-variant burden tests indicated that exonic variants play a more significant role in European Americans (p = 0.0099 for the collective coding variants compared to p = 0.0565 for intronic variant rs12041331). Imputation of the individual exonic variants in the rest of the GeneSTAR European American cohort (N = 1,965) supports the results noted in the sequenced discovery sample: p = 3.56×10^−4, 2.27×10^−7, 5.20×10^−5 for coding synonymous variant rs56260937 and collagen, epinephrine and adenosine diphosphate induced platelet aggregation, respectively. 
Sequencing approaches confirm that a common intronic variant has the strongest association with platelet aggregation in African Americans, and show that exonic variants play an additional role in platelet aggregation in European Americans. PMID:23704978

  6. The Tumor Necrosis Factor α (-308 A/G) Polymorphism Is Associated with Cystic Fibrosis in Mexican Patients

    PubMed Central

    Sanchez-Dominguez, Celia N.; Reyes-Lopez, Miguel A.; Bustamante, Adriana; Cerda-Flores, Ricardo M.; Villalobos-Torres, Maria del C.; Gallardo-Blanco, Hugo L.; Rojas-Martinez, Augusto; Martinez-Rodriguez, Herminia G.; Barrera-Saldaña, Hugo A.; Ortiz-Lopez, Rocio

    2014-01-01

    Environmental and genetic factors may modify or contribute to the phenotypic differences observed in multigenic and monogenic diseases, such as cystic fibrosis (CF). An analysis of modifier genes can be helpful for estimating patient prognosis and directing preventive care. The aim of this study is to determine the association between seven genetic variants of four modifier genes and CF by comparing their corresponding allelic and genotypic frequencies in CF patients (n = 81) and control subjects (n = 104). Genetic variants of MBL2 exon 1 (A, B, C and D), the IL-8 promoter (−251 A/T), the TNFα promoter (TNF1/TNF2), and SERPINA1 (PI*Z and PI*S) were tested in CF patients and control subjects from northeastern Mexico by PCR-RFLP. Results: The TNF2 allele (P = 0.012, OR 3.43, 95% CI 1.25–9.38) was significantly associated with CF under the dominant and additive models but was not associated with CF under the recessive model. This association remained statistically significant after adjusting for multiple tests using the Bonferroni correction (P = 0.0482). The other tested variants and genotypes did not show any association with the disease. Conclusion: An analysis of seven genetic variants of four modifier genes showed that one variant, the TNF2 allele, appears to be significantly associated with CF in Mexican patients. PMID:24603877

  7. HIV-1 maternal and infant variants show similar sensitivity to broadly neutralizing antibodies, but sensitivity varies by subtype

    PubMed Central

    Jennifer, Mabuka; Leslie, Goo; Maxwel, Majiwa O.; Ruth, Nduati; Julie, Overbaugh

    2014-01-01

    Rationale: To protect against HIV infection, passively transferred and/or vaccine elicited neutralizing antibodies (NAbs) need to effectively target diverse subtypes that are transmitted globally. These variants are a limited subset of those present during chronic infection and display some unique features. In the case of mother-to-child transmission (MTCT), transmitted variants tend to be resistant to neutralization by maternal autologous NAbs. Method: To investigate whether variants transmitted during MTCT are generally resistant to HIV-1 specific NAbs, 107 maternal or infant variants representing the dominant HIV-1 subtypes were tested against six recently identified HIV-1-specific broadly neutralizing monoclonal antibodies (bNAbs), NIH45-46W, VRC01, PGT128, PGT121, PG9, and PGT145. Results: Infant and maternal variants did not differ in their neutralization sensitivity to individual bNAbs, nor did viruses from transmitting versus non-transmitting mothers, although there was a trend for viruses from transmitting mothers to be less sensitive overall. No single bNAb neutralized all viruses, but a combination of bNAbs that target distinct epitopes covered 100% of the variants tested. Compared to heterosexually transmitted variants, vertically transmitted variants were significantly more sensitive to neutralization by PGT128 and PGT121 (p=0.03 in both cases) but there were no differences for the other bNAbs. Overall, subtype A variants were significantly more sensitive to NIH45-46 (p=0.04), VRC01 (p=0.002) and PGT145 (p=0.03) compared to the non-subtype A and less sensitive to PGT121 than subtype Cs (p=0.0001). Conclusion: A combination of bNAbs against distinct epitopes may be needed to provide maximum coverage against viruses in different modes of transmission and diverse subtypes. PMID:23856624

  8. Genetic variants associated with altered plasma levels of C-reactive protein are not associated with late-life cognitive ability in four Scottish samples.

    PubMed

    Marioni, Riccardo E; Deary, Ian J; Murray, Gordon D; Lowe, Gordon D O; Rafnsson, Snorri B; Strachan, Mark W J; Luciano, Michelle; Houlihan, Lorna M; Gow, Alan J; Harris, Sarah E; Stewart, Marlene C; Rumley, Ann; Fowkes, F Gerry R; Price, Jackie F

    2010-01-01

    It is unknown whether the relationship between raised inflammatory biomarker levels and late-life cognitive ability is causal. We explored this issue by testing the association between genetic regulators of plasma C-reactive protein (CRP) and cognition. Data were analysed from four cohorts based in central Scotland (Total N = 4,782). Associations were tested between variants in the CRP gene and both plasma CRP levels and a battery of neuropsychological tests, including a vocabulary-based estimate of peak prior cognitive ability and a general (summary) cognitive factor score, or 'g'. CRP levels were associated with a number of variants in the CRP gene (SNPs), including rs1205, rs1130864, rs1800947, and rs1417938 (P range 4.2e-06 to 0.041). Higher CRP levels were also associated with vocabulary-adjusted cognitive ability, used here to estimate lifetime cognitive change (P range 1.7e-04 to 0.038). After correction for multiple testing and adjustment for age and sex, no statistically significant associations were found between the SNPs and cognition. CRP is unlikely to be a causal determinant of late-life cognitive ability.

  9. CorSig: a general framework for estimating statistical significance of correlation and its application to gene co-expression analysis.

    PubMed

    Wang, Hong-Qiang; Tsai, Chung-Jui

    2013-01-01

    With the rapid increase of omics data, correlation analysis has become an indispensable tool for inferring meaningful associations from a large number of observations. Pearson correlation coefficient (PCC) and its variants are widely used for such purposes. However, it remains challenging to test whether an observed association is reliable both statistically and biologically. We present here a new method, CorSig, for statistical inference of correlation significance. CorSig is based on a biology-informed null hypothesis, i.e., testing whether the true PCC (ρ) between two variables is statistically larger than a user-specified PCC cutoff (τ), as opposed to the simple null hypothesis of ρ = 0 in existing methods, i.e., testing whether an association can be declared without a threshold. CorSig incorporates Fisher's Z transformation of the observed PCC (r), which facilitates use of standard techniques for p-value computation and multiple testing corrections. We compared CorSig against two methods: one uses a minimum PCC cutoff while the other (Zhu's procedure) controls correlation strength and statistical significance in two discrete steps. CorSig consistently outperformed these methods in various simulation data scenarios by balancing between false positives and false negatives. When tested on real-world Populus microarray data, CorSig effectively identified co-expressed genes in the flavonoid pathway, and discriminated between closely related gene family members for their differential association with flavonoid and lignin pathways. The p-values obtained by CorSig can be used as a stand-alone parameter for stratification of co-expressed genes according to their correlation strength in lieu of an arbitrary cutoff. CorSig requires one single tunable parameter, and can be readily extended to other correlation measures. Thus, CorSig should be useful for a wide range of applications, particularly for network analysis of high-dimensional genomic data. 
A web server for CorSig is provided at http://202.127.200.1:8080/probeWeb. R code for CorSig is freely available for non-commercial use at http://aspendb.uga.edu/downloads.
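The thresholded null hypothesis described in this abstract can be illustrated with a short sketch. This is not the CorSig code itself, only a minimal example that assumes the textbook large-sample result that Fisher's Z transform of r is approximately normal with standard error 1/sqrt(n - 3); the function name `fisher_z_pvalue` and its arguments are hypothetical.

```python
import math


def fisher_z_pvalue(r, n, tau=0.0):
    """One-sided p-value for H0: rho <= tau against H1: rho > tau.

    Assumption (not taken from the CorSig paper): atanh(r) is treated as
    normal with mean atanh(rho) and standard error 1 / sqrt(n - 3).
    """
    if n <= 3:
        raise ValueError("need more than 3 observations")
    if not (-1.0 < r < 1.0 and -1.0 < tau < 1.0):
        raise ValueError("r and tau must lie strictly between -1 and 1")
    # Fisher's Z transformation of the observed correlation and the cutoff
    z_r = math.atanh(r)
    z_tau = math.atanh(tau)
    # Standardized distance of the observed correlation above the cutoff
    stat = (z_r - z_tau) * math.sqrt(n - 3)
    # Upper-tail probability of the standard normal distribution
    return 0.5 * math.erfc(stat / math.sqrt(2.0))


if __name__ == "__main__":
    # Same observed correlation, tested against no threshold and against 0.6
    print(fisher_z_pvalue(0.8, 50, tau=0.0))
    print(fisher_z_pvalue(0.8, 50, tau=0.6))
```

Raising tau makes the test stricter: the same observed r yields a larger p-value against a higher cutoff, which is the behavior the abstract contrasts with the simple null of ρ = 0.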

  10. Double Dutch: A Tool for Designing Combinatorial Libraries of Biological Systems.

    PubMed

    Roehner, Nicholas; Young, Eric M; Voigt, Christopher A; Gordon, D Benjamin; Densmore, Douglas

    2016-06-17

    Recently, semirational approaches that rely on combinatorial assembly of characterized DNA components have been used to engineer biosynthetic pathways. In practice, however, it is not practical to assemble and test millions of pathway variants in order to elucidate how different DNA components affect the behavior of a pathway. To address this challenge, we apply a rigorous mathematical approach known as design of experiments (DOE) that can be used to construct empirical models of system behavior without testing all variants. To support this approach, we have developed a tool named Double Dutch, which uses a formal grammar and heuristic algorithms to automate the process of DOE library design. Compared to designing by hand, Double Dutch enables users to more efficiently and scalably design libraries of pathway variants that can be used in a DOE framework and uniquely provides a means to flexibly balance design considerations of statistical analysis, construction cost, and risk of homologous recombination, thereby demonstrating the utility of automating decision making when faced with complex design trade-offs.

  11. The role of AMH and its receptor SNP in the pathogenesis of PCOS.

    PubMed

    Wang, Fang; Niu, Wen-Bin; Kong, Hui-Juan; Guo, Yi-Hong; Sun, Ying-Pu

    2017-01-05

    The etiology of polycystic ovaries syndrome (PCOS) is unknown. Studies probing the role of genetic variants of anti-Mullerian hormone (AMH) and its type II receptor (AMHR2) in the pathogenesis of PCOS have yielded inconsistent results. Thus, we performed a systematic review and meta-analysis to determine the role of genetic variants of AMH/AMHR2 in the pathogenesis of PCOS. A systematic search of electronic databases was performed. Statistical analysis was performed using the Comprehensive Meta-Analysis software (Version 3). Pooled Odds Ratios (OR) (95% confidence intervals) were determined to assess the association between genetic variants of AMH/AMHR2 and PCOS. Five studies, involving a total of 2042 PCOS cases and 1071 controls, were included in the meta-analysis. Single nucleotide polymorphisms of AMH and AMHR2 did not appear to confer a heightened risk for PCOS (OR: 0.954, 95% CI: 0.848-1.073; P = 0.435; and OR: 1.074, 95% CI: 0.875-1.318; P = 0.494, respectively). In this study, genetic variants of AMH or AMHR2 were not found to be associated with a higher risk for PCOS. Copyright © 2016. Published by Elsevier Ireland Ltd.
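Pooled odds ratios of the kind reported above are commonly obtained by inverse-variance weighting of the study-level log odds ratios. The sketch below shows a fixed-effect version under stated assumptions: the abstract does not say whether a fixed- or random-effects model was used, the function name `pool_odds_ratios` is hypothetical, and the inputs in the usage lines are made-up numbers, not data from the five included studies.

```python
import math


def pool_odds_ratios(studies, z=1.96):
    """Fixed-effect inverse-variance pooling of odds ratios.

    studies: iterable of (odds_ratio, ci_low, ci_high) with 95% limits.
    Returns (pooled_or, pooled_ci_low, pooled_ci_high).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for odds_ratio, ci_low, ci_high in studies:
        log_or = math.log(odds_ratio)
        # Recover the standard error of log(OR) from the width of the CI
        se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z)
        weight = 1.0 / se ** 2
        weighted_sum += weight * log_or
        weight_total += weight
    pooled_log_or = weighted_sum / weight_total
    pooled_se = math.sqrt(1.0 / weight_total)
    return (
        math.exp(pooled_log_or),
        math.exp(pooled_log_or - z * pooled_se),
        math.exp(pooled_log_or + z * pooled_se),
    )


if __name__ == "__main__":
    # Illustrative inputs only
    example = [(0.90, 0.70, 1.16), (1.05, 0.85, 1.30), (0.95, 0.75, 1.20)]
    print(pool_odds_ratios(example))
```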

  12. Mutation screening in the Greek population and evaluation of NLGN3 and NLGN4X genes causal factors for autism.

    PubMed

    Volaki, Konstantina; Pampanos, Andreas; Kitsiou-Tzeli, Sophia; Vrettou, Christina; Oikonomakis, Vasilis; Sofocleous, Christalena; Kanavakis, Emmanuel

    2013-10-01

    Molecular and neurobiological evidence for the involvement of neuroligins (particularly NLGN3 and NLGN4X genes) in autistic disorder is accumulating. However, previous mutation screening studies on these two genes have yielded controversial results. The present study explores, for the first time, the contribution of NLGN3 and NLGN4X genetic variants in Greek patients with autistic disorder. We analyzed the full exonic sequence of NLGN3 and NLGN4X genes in 40 patients strictly fulfilling the Diagnostic and Statistical Manual of Mental Disorders, 4th ed. criteria for autistic disorder. We identified nine nucleotide changes in NLGN4X--one probable causative mutation (p.K378R) previously reported by our research group, one novel variant (c.-206G>C), one nonvalidated single nucleotide polymorphism (SNP, rs111953947), and six known human SNPs reported in the SNP database--and one known human SNP in NLGN3 also reported in the SNP database. The variants identified are expected to be benign. However, they should be investigated in the context of variants in interacting cellular pathways to assess their contribution to the etiology of autism.

  13. The potential for increased power from combining P-values testing the same hypothesis.

    PubMed

    Ganju, Jitendra; Julie Ma, Guoguang

    2017-02-01

    The conventional approach to hypothesis testing for formal inference is to prespecify a single test statistic thought to be optimal. However, we usually have more than one test statistic in mind for testing the null hypothesis of no treatment effect but we do not know which one is the most powerful. Rather than relying on a single p-value, combining p-values from prespecified multiple test statistics can be used for inference. Combining functions include Fisher's combination test and the minimum p-value. Using randomization-based tests, the increase in power can be remarkable when compared with a single test and Simes's method. The versatility of the method is that it also applies when the number of covariates exceeds the number of observations. The increase in power is large enough to prefer combined p-values over a single p-value. The limitation is that the method does not provide an unbiased estimator of the treatment effect and does not apply to situations when the model includes treatment by covariate interaction.
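A minimal sketch of the idea of combining p-values from several prespecified test statistics within a randomization test follows. It is an illustration under stated assumptions, not the authors' procedure: the two statistics (absolute difference in means and in medians), the function names, and the number of random relabelings are all hypothetical choices, and Fisher's combining function is used here although the abstract also mentions the minimum p-value.

```python
import bisect
import math
import random
import statistics


def mean_diff(x, y):
    return abs(statistics.fmean(x) - statistics.fmean(y))


def median_diff(x, y):
    return abs(statistics.median(x) - statistics.median(y))


def combined_permutation_pvalue(x, y, stats=(mean_diff, median_diff),
                                n_perm=2000, seed=0):
    """Randomization p-value for Fisher's combination of several statistics."""
    rng = random.Random(seed)
    pooled = list(x) + list(y)
    n_x = len(x)
    # Row 0 holds the observed statistics; later rows are random relabelings
    table = [[s(x, y) for s in stats]]
    for _ in range(n_perm):
        rng.shuffle(pooled)
        table.append([s(pooled[:n_x], pooled[n_x:]) for s in stats])
    total = len(table)
    # Fisher's combining function, -2 * sum(log p), evaluated for every row
    fisher = [0.0] * total
    for j in range(len(stats)):
        column = sorted(row[j] for row in table)
        for i, row in enumerate(table):
            # p-value of this row: share of rows at least as extreme
            p = (total - bisect.bisect_left(column, row[j])) / total
            fisher[i] += -2.0 * math.log(p)
    # Compare the observed combined statistic with its randomization distribution
    return sum(1 for value in fisher if value >= fisher[0]) / total


if __name__ == "__main__":
    treated = [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
    control = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
    print(combined_permutation_pvalue(treated, control))
```

Because the combined statistic is referred to its own randomization distribution, no independence assumption between the individual test statistics is needed.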

  14. [Experimental model for the examination of inner pressure tolerance of telescopic anastomosis and other frequently performed anastomosis types of the esophagus].

    PubMed

    Szúcs, G; Tóth, I; Bráth, E; Gyáni, K; Miko, I

    2001-08-01

    We have had good results with the telescopic anastomosis technique in partial oesophagectomies and gastrectomies. Because we could not find data about the healing process of telescopic anastomoses, we began an experimental study. Inner pressure tolerance was examined immediately after performing the anastomoses by measuring the bursting pressure, using organs of pigs slaughtered in the meat industry. Both oesophago-gastrostomies and oesophago-jejunostomies were performed with telescopic, single layer interrupted, single layer continuous, double layer interrupted and double layer continuous-interrupted techniques, 9 of each anastomosis. A series of oesophago-jejunostomies was also performed with an EEA stapler. In total, 99 anastomoses of 11 types were investigated. We found that the inner pressure tolerance of telescopic oesophago-gastrostomy is better than that of either single layer variant. On the other hand, the double layer variants have much better pressure tolerance than the telescopic and the two single layer anastomoses. The difference is statistically significant. In oesophago-jejunostomies the pressure tolerance of the telescopic anastomosis is better than that of the single layer interrupted type, but the difference between the telescopic and single layer continuous anastomoses is not significant. The pressure tolerance of the double layer anastomoses is higher than that of the telescopic one, but the difference is significant only for the continuous-interrupted type. The inner pressure tolerances of telescopic and EEA stapler anastomoses are equal. The investigation of additional features of anastomosis healing is in progress.

  15. Single Day Construction of Multigene Circuits with 3G Assembly.

    PubMed

    Halleran, Andrew D; Swaminathan, Anandh; Murray, Richard M

    2018-05-18

    The ability to rapidly design, build, and test prototypes is of key importance to every engineering discipline. DNA assembly often serves as a rate limiting step of the prototyping cycle for synthetic biology. Recently developed DNA assembly methods such as isothermal assembly and type IIS restriction enzyme systems take different approaches to accelerate DNA construction. We introduce a hybrid method, Golden Gate-Gibson (3G), that takes advantage of modular part libraries introduced by type IIS restriction enzyme systems and isothermal assembly's ability to build large DNA constructs in single pot reactions. Our method is highly efficient and rapid, facilitating construction of entire multigene circuits in a single day. Additionally, 3G allows generation of variant libraries enabling efficient screening of different possible circuit constructions. We characterize the efficiency and accuracy of 3G assembly for various construct sizes, and demonstrate 3G by characterizing variants of an inducible cell-lysis circuit.

  16. Clonal architecture of secondary acute myeloid leukemia defined by single-cell sequencing.

    PubMed

    Hughes, Andrew E O; Magrini, Vincent; Demeter, Ryan; Miller, Christopher A; Fulton, Robert; Fulton, Lucinda L; Eades, William C; Elliott, Kevin; Heath, Sharon; Westervelt, Peter; Ding, Li; Conrad, Donald F; White, Brian S; Shao, Jin; Link, Daniel C; DiPersio, John F; Mardis, Elaine R; Wilson, Richard K; Ley, Timothy J; Walter, Matthew J; Graubert, Timothy A

    2014-07-01

    Next-generation sequencing has been used to infer the clonality of heterogeneous tumor samples. These analyses yield specific predictions-the population frequency of individual clones, their genetic composition, and their evolutionary relationships-which we set out to test by sequencing individual cells from three subjects diagnosed with secondary acute myeloid leukemia, each of whom had been previously characterized by whole genome sequencing of unfractionated tumor samples. Single-cell mutation profiling strongly supported the clonal architecture implied by the analysis of bulk material. In addition, it resolved the clonal assignment of single nucleotide variants that had been initially ambiguous and identified areas of previously unappreciated complexity. Accordingly, we find that many of the key assumptions underlying the analysis of tumor clonality by deep sequencing of unfractionated material are valid. Furthermore, we illustrate a single-cell sequencing strategy for interrogating the clonal relationships among known variants that is cost-effective, scalable, and adaptable to the analysis of both hematopoietic and solid tumors, or any heterogeneous population of cells.

  17. Estimating genetic effects and quantifying missing heritability explained by identified rare-variant associations.

    PubMed

    Liu, Dajiang J; Leal, Suzanne M

    2012-10-05

    Next-generation sequencing has led to many complex-trait rare-variant (RV) association studies. Although single-variant association analysis can be performed, it is grossly underpowered. Therefore, researchers have developed many RV association tests that aggregate multiple variant sites across a genetic region (e.g., gene), and test for the association between the trait and the aggregated genotype. After these aggregate tests detect an association, it is only possible to estimate the average genetic effect for a group of RVs. As a result of the "winner's curse," such an estimate can be biased. Although for common variants one can obtain unbiased estimates of genetic parameters by analyzing a replication sample, for RVs it is desirable to obtain unbiased genetic estimates for the study where the association is identified. This is because there can be substantial heterogeneity of RV sites and frequencies even among closely related populations. In order to obtain an unbiased estimate for aggregated RV analysis, we developed bootstrap-sample-split algorithms to reduce the bias of the winner's curse. The unbiased estimates are greatly important for understanding the population-specific contribution of RVs to the heritability of complex traits. We also demonstrate both theoretically and via simulations that for aggregate RV analysis the genetic variance for a gene or region will always be underestimated, sometimes substantially, because of the presence of noncausal variants or because of the presence of causal variants with effects of different magnitudes or directions. Therefore, even if RVs play a major role in the complex-trait etiologies, a portion of the heritability will remain missing, and the contribution of RVs to the complex-trait etiologies will be underestimated. Copyright © 2012 The American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.

  18. SNP rs356219 of the α-synuclein (SNCA) gene is associated with Parkinson's disease in a Chinese Han population.

    PubMed

    Pan, Fenghua; Dong, Hairong; Ding, Haixia; Ye, Min; Liu, Weiguo; Wu, Yanfeng; Zhang, Xueling; Chen, Zhuoyou; Luo, Yang; Ding, Xinsheng

    2012-06-01

    Over the last decades, increasing knowledge about the genetic architecture of Parkinson's disease (PD) has provided novel insights into the pathogenesis of the disorder. Recently, several studies in different populations have found a strong association between idiopathic PD and the single-nucleotide polymorphism (SNP) rs356219, which is located in the 3'UTR of the SNCA gene. In this study, we aimed to verify these findings and to explore further the nature of the association in a subset of Chinese Han PD patients. Four hundred and three unrelated patients with sporadic PD and 315 healthy ethnically matched control subjects were recruited consecutively for the study. Patients and normal controls were genotyped for the SNCA rs356219 variant by ligase detection reaction (LDR). A statistically significant difference was found in the frequencies of the single alleles of rs356219 (χ² = 12.986, P = 0.002) between PD patients and normal subjects. The distribution of A > G genotypes was different between patients and controls (χ² = 13.243, P < 0.001). The OR for subjects with the variant genotypes (AG and GG) was 1.88 (95% CI = 1.27-2.78, P = 0.001). The frequency of the homozygous genotype for this variant was 42.2% (170 patients), significantly higher than that in controls (32.4%, P < 0.001). The results suggest that the SNCA rs356219 variant may increase susceptibility to PD in a Chinese Han population. Further studies are needed to replicate the association that we found. Copyright © 2012 Elsevier Ltd. All rights reserved.

  19. Using published data in Mendelian randomization: a blueprint for efficient identification of causal risk factors.

    PubMed

    Burgess, Stephen; Scott, Robert A; Timpson, Nicholas J; Davey Smith, George; Thompson, Simon G

    2015-07-01

    Finding individual-level data for adequately powered Mendelian randomization analyses may be problematic. As publicly available summarized data on genetic associations with disease outcomes from large consortia are becoming more abundant, use of published data is an attractive analysis strategy for obtaining precise estimates of the causal effects of risk factors on outcomes. We detail the necessary steps for conducting Mendelian randomization investigations using published data, and present novel statistical methods for combining data on the associations of multiple (correlated or uncorrelated) genetic variants with the risk factor and outcome into a single causal effect estimate. A two-sample analysis strategy may be employed, in which evidence on the gene-risk factor and gene-outcome associations is taken from different data sources. These approaches allow the efficient identification of risk factors that are suitable targets for clinical intervention from published data, although the ability to assess the assumptions necessary for causal inference is diminished. Methods and guidance are illustrated using the example of the causal effect of serum calcium levels on fasting glucose concentrations. The estimated causal effect of a 1 standard deviation (0.13 mmol/L) increase in calcium levels on fasting glucose (mM) using a single lead variant from the CASR gene region is 0.044 (95% credible interval -0.002, 0.100). In contrast, using our method to account for the correlation between variants, the corresponding estimate using 17 genetic variants is 0.022 (95% credible interval 0.009, 0.035), a more clearly positive causal effect.
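
    As a rough illustration of how per-variant summary statistics can be combined into one causal estimate, the sketch below applies the standard inverse-variance weighted (IVW) ratio estimator for uncorrelated variants. This is a swapped-in textbook estimator and not the method of the abstract above, which reports credible intervals and also handles correlated variants. The function name ivw_estimate and all input numbers are hypothetical.

```python
import math
from typing import List, Tuple


def ivw_estimate(
    beta_x: List[float], beta_y: List[float], se_y: List[float]
) -> Tuple[float, float]:
    """Inverse-variance weighted causal estimate from summary statistics.

    beta_x: variant associations with the risk factor
    beta_y: variant associations with the outcome
    se_y:   standard errors of beta_y
    Assumes uncorrelated variants and ignores uncertainty in beta_x.
    """
    numerator = sum(x * y / s**2 for x, y, s in zip(beta_x, beta_y, se_y))
    denominator = sum(x * x / s**2 for x, s in zip(beta_x, se_y))
    estimate = numerator / denominator
    standard_error = math.sqrt(1.0 / denominator)
    return estimate, standard_error


# Hypothetical summary data for three variants (not taken from the study).
beta_x = [0.10, 0.20, 0.15]
beta_y = [0.02, 0.04, 0.03]
se_y = [0.01, 0.01, 0.01]

estimate, standard_error = ivw_estimate(beta_x, beta_y, se_y)
lower = estimate - 1.96 * standard_error
upper = estimate + 1.96 * standard_error
print(f"IVW estimate {estimate:.3f} (95% CI {lower:.3f}, {upper:.3f})")
```

    Every variant in this toy input has the same ratio beta_y / beta_x, so the pooled estimate equals that ratio; adding variants only narrows the interval, which mirrors the gain in precision the abstract describes when moving from one lead variant to many.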

  20. Deformation behavior of HCP titanium alloy: Experiment and Crystal plasticity modeling

    DOE PAGES

    Wronski, M.; Arul Kumar, Mariyappan; Capolungo, Laurent; ...

    2018-03-02

    The deformation behavior of commercially pure titanium is studied using experiments and a crystal plasticity model. Compression tests along the rolling, transverse, and normal directions, and tensile tests along the rolling and transverse directions are performed at room temperature to study the activation of slip and twinning in the hexagonal close-packed titanium. A detailed EBSD-based statistical analysis of the microstructure is performed to develop statistics of both {10-12} tensile and {11-22} compression twins. A simple Monte Carlo (MC) twin variant selection criterion is proposed within the framework of the visco-plastic self-consistent (VPSC) model with a dislocation density (DD) based law used to describe dislocation hardening. In the model, plasticity is accommodated by prismatic, basal and pyramidal slip modes, and {10-12} tensile and {11-22} compression twinning modes. Thus, the VPSC-MC model successfully captures the experimentally observed activation of low Schmid factor twin variants for both tensile and compression twin modes. The model also predicts macroscopic stress-strain response, texture evolution and twin volume fraction that are in agreement with experimental observations.

  1. Deformation behavior of HCP titanium alloy: Experiment and Crystal plasticity modeling

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wronski, M.; Arul Kumar, Mariyappan; Capolungo, Laurent

    The deformation behavior of commercially pure titanium is studied using experiments and a crystal plasticity model. Compression tests along the rolling, transverse, and normal directions, and tensile tests along the rolling and transverse directions are performed at room temperature to study the activation of slip and twinning in the hexagonal close-packed titanium. A detailed EBSD-based statistical analysis of the microstructure is performed to develop statistics of both {10-12} tensile and {11-22} compression twins. A simple Monte Carlo (MC) twin variant selection criterion is proposed within the framework of the visco-plastic self-consistent (VPSC) model with a dislocation density (DD) based law used to describe dislocation hardening. In the model, plasticity is accommodated by prismatic, basal and pyramidal slip modes, and {10-12} tensile and {11-22} compression twinning modes. Thus, the VPSC-MC model successfully captures the experimentally observed activation of low Schmid factor twin variants for both tensile and compression twin modes. The model also predicts macroscopic stress-strain response, texture evolution and twin volume fraction that are in agreement with experimental observations.

  2. Actionable exomic incidental findings in 6503 participants: challenges of variant classification

    PubMed Central

    Amendola, Laura M.; Dorschner, Michael O.; Robertson, Peggy D.; Salama, Joseph S.; Hart, Ragan; Shirts, Brian H.; Murray, Mitzi L.; Tokita, Mari J.; Gallego, Carlos J.; Kim, Daniel Seung; Bennett, James T.; Crosslin, David R.; Ranchalis, Jane; Jones, Kelly L.; Rosenthal, Elisabeth A.; Jarvik, Ella R.; Itsara, Andy; Turner, Emily H.; Herman, Daniel S.; Schleit, Jennifer; Burt, Amber; Jamal, Seema M.; Abrudan, Jenica L.; Johnson, Andrew D.; Conlin, Laura K.; Dulik, Matthew C.; Santani, Avni; Metterville, Danielle R.; Kelly, Melissa; Foreman, Ann Katherine M.; Lee, Kristy; Taylor, Kent D.; Guo, Xiuqing; Crooks, Kristy; Kiedrowski, Lesli A.; Raffel, Leslie J.; Gordon, Ora; Machini, Kalotina; Desnick, Robert J.; Biesecker, Leslie G.; Lubitz, Steven A.; Mulchandani, Surabhi; Cooper, Greg M.; Joffe, Steven; Richards, C. Sue; Yang, Yaoping; Rotter, Jerome I.; Rich, Stephen S.; O’Donnell, Christopher J.; Berg, Jonathan S.; Spinner, Nancy B.; Evans, James P.; Fullerton, Stephanie M.; Leppig, Kathleen A.; Bennett, Robin L.; Bird, Thomas; Sybert, Virginia P.; Grady, William M.; Tabor, Holly K.; Kim, Jerry H.; Bamshad, Michael J.; Wilfond, Benjamin; Motulsky, Arno G.; Scott, C. Ronald; Pritchard, Colin C.; Walsh, Tom D.; Burke, Wylie; Raskind, Wendy H.; Byers, Peter; Hisama, Fuki M.; Rehm, Heidi; Nickerson, Debbie A.; Jarvik, Gail P.

    2015-01-01

    Recommendations for laboratories to report incidental findings from genomic tests have stimulated interest in such results. In order to investigate the criteria and processes for assigning the pathogenicity of specific variants and to estimate the frequency of such incidental findings in patients of European and African ancestry, we classified potentially actionable pathogenic single-nucleotide variants (SNVs) in all 4300 European- and 2203 African-ancestry participants sequenced by the NHLBI Exome Sequencing Project (ESP). We considered 112 gene-disease pairs selected by an expert panel as associated with medically actionable genetic disorders that may be undiagnosed in adults. The resulting classifications were compared to classifications from other clinical and research genetic testing laboratories, as well as with in silico pathogenicity scores. Among European-ancestry participants, 30 of 4300 (0.7%) had a pathogenic SNV and six (0.1%) had a disruptive variant that was expected to be pathogenic, whereas 52 (1.2%) had likely pathogenic SNVs. For African-ancestry participants, six of 2203 (0.3%) had a pathogenic SNV and six (0.3%) had an expected pathogenic disruptive variant, whereas 13 (0.6%) had likely pathogenic SNVs. Genomic Evolutionary Rate Profiling mammalian conservation score and the Combined Annotation Dependent Depletion summary score of conservation, substitution, regulation, and other evidence were compared across pathogenicity assignments and appear to have utility in variant classification. This work provides a refined estimate of the burden of adult onset, medically actionable incidental findings expected from exome sequencing, highlights challenges in variant classification, and demonstrates the need for a better curated variant interpretation knowledge base. PMID:25637381

  3. Max-AUC Feature Selection in Computer-Aided Detection of Polyps in CT Colonography

    PubMed Central

    Xu, Jian-Wu; Suzuki, Kenji

    2014-01-01

    We propose a feature selection method based on a sequential forward floating selection (SFFS) procedure to improve the performance of a classifier in computerized detection of polyps in CT colonography (CTC). The feature selection method is coupled with a nonlinear support vector machine (SVM) classifier. Unlike the conventional linear method based on Wilks' lambda, the proposed method selected the most relevant features that would maximize the area under the receiver operating characteristic curve (AUC), which directly maximizes classification performance, evaluated based on AUC value, in the computer-aided detection (CADe) scheme. We presented two variants of the proposed method with different stopping criteria used in the SFFS procedure. The first variant searched all feature combinations allowed in the SFFS procedure and selected the subsets that maximize the AUC values. The second variant performed a statistical test at each step during the SFFS procedure, and it was terminated if the increase in the AUC value was not statistically significant. The advantage of the second variant is its lower computational cost. To test the performance of the proposed method, we compared it against the popular stepwise feature selection method based on Wilks' lambda for a colonic-polyp database (25 polyps and 2624 nonpolyps). We extracted 75 morphologic, gray-level-based, and texture features from the segmented lesion candidate regions. The two variants of the proposed feature selection method chose 29 and 7 features, respectively. Two SVM classifiers trained with these selected features yielded a 96% by-polyp sensitivity at false-positive (FP) rates of 4.1 and 6.5 per patient, respectively. Experiments showed a significant improvement in the performance of the classifier with the proposed feature selection method over that with the popular stepwise feature selection based on Wilks' lambda that yielded 18.0 FPs per patient at the same sensitivity level. PMID:24608058
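
    The search procedure described above can be sketched in a few lines. The code below is a minimal sequential forward floating selection (SFFS) loop that maximizes AUC, corresponding loosely to the first variant (search up to a size limit, keep the best subset). It is an illustration under stated assumptions, not the authors' implementation: the criterion scores a subset by the AUC of the summed feature values, a hypothetical stand-in for training the nonlinear SVM, and the toy data, function names, and max_size parameter are all invented for the example. The significance-test stopping rule of the second variant is not implemented.

```python
from typing import Callable, Dict, FrozenSet, List, Tuple


def auc(scores: List[float], labels: List[int]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 1/2."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def sffs(
    n_features: int,
    criterion: Callable[[FrozenSet[int]], float],
    max_size: int,
) -> Tuple[float, FrozenSet[int]]:
    """Return (best score, best subset) found by forward steps with floating removals."""
    max_size = min(max_size, n_features)
    best: Dict[int, Tuple[float, FrozenSet[int]]] = {}  # best subset seen per size
    current: FrozenSet[int] = frozenset()
    while len(current) < max_size:
        # Forward step: add the single feature that maximizes the criterion.
        current = max(
            (current | {f} for f in range(n_features) if f not in current),
            key=criterion,
        )
        score = criterion(current)
        if len(current) not in best or score > best[len(current)][0]:
            best[len(current)] = (score, current)
        # Floating step: drop a feature while that strictly beats the best
        # subset of the smaller size seen so far.
        while len(current) > 2:
            reduced = max((current - {f} for f in current), key=criterion)
            reduced_score = criterion(reduced)
            if reduced_score > best[len(reduced)][0]:
                current = reduced
                best[len(reduced)] = (reduced_score, reduced)
            else:
                break
    return max(best.values(), key=lambda item: item[0])


# Hypothetical toy data: 6 candidates, 3 features; label 1 = polyp, 0 = nonpolyp.
X = [
    [0.9, 0.50, 0.15],
    [0.8, 0.60, 0.35],
    [0.2, 0.40, 0.90],
    [0.3, 0.55, 0.20],
    [0.1, 0.45, 0.25],
    [0.4, 0.65, 0.10],
]
y = [1, 1, 1, 0, 0, 0]


def toy_criterion(subset: FrozenSet[int]) -> float:
    """Stand-in for classifier training: AUC of the summed selected features."""
    return auc([sum(row[f] for f in subset) for row in X], y)


score, subset = sffs(3, toy_criterion, max_size=3)
print(sorted(subset), score)
```

    In this toy data features 0 and 2 are individually imperfect but complementary, so the search settles on the pair and leaves out the uninformative feature 1. In practice the criterion would be a cross-validated AUC of the trained classifier.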

  4. Max-AUC feature selection in computer-aided detection of polyps in CT colonography.

    PubMed

    Xu, Jian-Wu; Suzuki, Kenji

    2014-03-01

    We propose a feature selection method based on a sequential forward floating selection (SFFS) procedure to improve the performance of a classifier in computerized detection of polyps in CT colonography (CTC). The feature selection method is coupled with a nonlinear support vector machine (SVM) classifier. Unlike the conventional linear method based on Wilks' lambda, the proposed method selected the most relevant features that would maximize the area under the receiver operating characteristic curve (AUC), which directly maximizes classification performance, evaluated based on AUC value, in the computer-aided detection (CADe) scheme. We presented two variants of the proposed method with different stopping criteria used in the SFFS procedure. The first variant searched all feature combinations allowed in the SFFS procedure and selected the subsets that maximize the AUC values. The second variant performed a statistical test at each step during the SFFS procedure, and it was terminated if the increase in the AUC value was not statistically significant. The advantage of the second variant is its lower computational cost. To test the performance of the proposed method, we compared it against the popular stepwise feature selection method based on Wilks' lambda for a colonic-polyp database (25 polyps and 2624 nonpolyps). We extracted 75 morphologic, gray-level-based, and texture features from the segmented lesion candidate regions. The two variants of the proposed feature selection method chose 29 and 7 features, respectively. Two SVM classifiers trained with these selected features yielded a 96% by-polyp sensitivity at false-positive (FP) rates of 4.1 and 6.5 per patient, respectively. Experiments showed a significant improvement in the performance of the classifier with the proposed feature selection method over that with the popular stepwise feature selection based on Wilks' lambda that yielded 18.0 FPs per patient at the same sensitivity level.

  5. A comparison of cosegregation analysis methods for the clinical setting.

    PubMed

    Rañola, John Michael O; Liu, Quanhui; Rosenthal, Elisabeth A; Shirts, Brian H

    2018-04-01

    Quantitative cosegregation analysis can help evaluate the pathogenicity of genetic variants. However, genetics professionals without statistical training often use simple methods, reporting only qualitative findings. We evaluate the potential utility of quantitative cosegregation in the clinical setting by comparing three methods. One thousand pedigrees each were simulated for benign and pathogenic variants in BRCA1 and MLH1 using United States historical demographic data to produce pedigrees similar to those seen in the clinic. These pedigrees were analyzed using two robust methods, full likelihood Bayes factors (FLB) and cosegregation likelihood ratios (CSLR), and a simpler method, counting meioses. Both FLB and CSLR outperform counting meioses when dealing with pathogenic variants, though counting meioses is not far behind. For benign variants, FLB and CSLR greatly outperform counting meioses, which is unable to generate evidence for benign variants. Comparing FLB and CSLR, we find that the two methods perform similarly, indicating that quantitative results from either of these methods could be combined in multifactorial calculations. Combining quantitative information will be important as isolated use of cosegregation in single families will yield classification for less than 1% of variants. To encourage wider use of robust cosegregation analysis, we present a website (http://www.analyze.myvariant.org) which implements the CSLR, FLB, and Counting Meioses methods for ATM, BRCA1, BRCA2, CHEK2, MEN1, MLH1, MSH2, MSH6, and PMS2. We also present an R package, CoSeg, which performs the CSLR analysis on any gene with user-supplied parameters. Future variant classification guidelines should allow nuanced inclusion of cosegregation evidence against pathogenicity.

  6. Multi-gene panel testing in Korean patients with common genetic generalized epilepsy syndromes.

    PubMed

    Lee, Cha Gon; Lee, Jeehun; Lee, Munhyang

    2018-01-01

    Genetic heterogeneity of common genetic generalized epilepsy syndromes is frequently considered. The present study conducted a focused analysis of potential candidate or susceptibility genes for common genetic generalized epilepsy syndromes using multi-gene panel testing with next-generation sequencing. This study included patients with juvenile myoclonic epilepsy, juvenile absence epilepsy, and epilepsy with generalized tonic-clonic seizures alone. We identified pathogenic variants according to the American College of Medical Genetics and Genomics guidelines and identified susceptibility variants using case-control association analyses and family analyses for familial cases. A total of 57 patients were enrolled, including 51 sporadic cases and 6 familial cases. Twenty-two pathogenic and likely pathogenic variants of 16 different genes were identified. CACNA1H was the most frequently observed single gene. Variants of voltage-gated Ca2+ channel genes, including CACNA1A, CACNA1G, and CACNA1H, accounted for 32% of variants (n = 7/22). Analyses to identify susceptibility variants using case-control association analysis indicated that KCNMA1 c.400G>C was associated with common genetic generalized epilepsy syndromes. Only 1 family (family A) exhibited a candidate pathogenic variant p.(Arg788His) on CACNA1H, as determined via family analyses. This study identified candidate genetic variants in about a quarter of patients (n = 16/57) and an average of 2.8 variants was identified in each patient. The results reinforce the view that common genetic generalized epilepsy (GGE) syndromes are polygenic disorders with very high locus and allelic heterogeneity. Further, voltage-gated Ca2+ channels are suggested as important contributors to common genetic generalized epilepsy syndromes. This study extends our comprehensive understanding of common genetic generalized epilepsy syndromes.

  7. Integration of bioinformatics and imaging informatics for identifying rare PSEN1 variants in Alzheimer's disease.

    PubMed

    Nho, Kwangsik; Horgusluoglu, Emrin; Kim, Sungeun; Risacher, Shannon L; Kim, Dokyoon; Foroud, Tatiana; Aisen, Paul S; Petersen, Ronald C; Jack, Clifford R; Shaw, Leslie M; Trojanowski, John Q; Weiner, Michael W; Green, Robert C; Toga, Arthur W; Saykin, Andrew J

    2016-08-12

    Pathogenic mutations in PSEN1 are known to cause familial early-onset Alzheimer's disease (EOAD) but common variants in PSEN1 have not been found to strongly influence late-onset AD (LOAD). The association of rare variants in PSEN1 with LOAD-related endophenotypes has received little attention. In this study, we performed a rare variant association analysis of PSEN1 with quantitative biomarkers of LOAD using whole genome sequencing (WGS) by integrating bioinformatics and imaging informatics. A WGS data set (N = 815) from the Alzheimer's Disease Neuroimaging Initiative (ADNI) cohort was used in this analysis. 757 non-Hispanic Caucasian participants underwent WGS from a blood sample and high resolution T1-weighted structural MRI at baseline. An automated MRI analysis technique (FreeSurfer) was used to measure cortical thickness and volume of neuroanatomical structures. We assessed imaging and cerebrospinal fluid (CSF) biomarkers as LOAD-related quantitative endophenotypes. Single variant analyses were performed using PLINK and gene-based analyses of rare variants were performed using the optimal Sequence Kernel Association Test (SKAT-O). A total of 839 rare variants (MAF < 1/√(2N) = 0.0257) were found within a region of ±10 kb from PSEN1. Among them, six exonic (three non-synonymous) variants were observed. A single variant association analysis showed that the PSEN1 p.E318G variant increases the risk of LOAD only in participants carrying the APOE ε4 allele, in whom carriers of the minor allele of this PSEN1 risk variant have lower CSF Aβ1-42 and higher CSF tau. A gene-based analysis resulted in a significant association of rare but not common (MAF ≥ 0.0257) PSEN1 variants with bilateral entorhinal cortical thickness. This is the first study to show that PSEN1 rare variants collectively show a significant association with the brain atrophy in regions preferentially affected by LOAD, providing further support for a role of PSEN1 in LOAD. The PSEN1 p.E318G variant increases the risk of LOAD only in APOE ε4 carriers. Integrating bioinformatics with imaging informatics for identification of rare variants could help explain the missing heritability in LOAD.
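
    The rare-variant cutoff quoted in this abstract can be checked by direct arithmetic. The sketch below assumes only the formula as written, MAF < 1/√(2N); the helper name maf_threshold is hypothetical. The stated value of 0.0257 is reproduced with N = 757 (the participants who underwent WGS and MRI), not with the full data set size of N = 815.

```python
import math


def maf_threshold(n_samples: int) -> float:
    """Minor allele frequency cutoff 1 / sqrt(2 * N) for calling a variant rare."""
    return 1.0 / math.sqrt(2 * n_samples)


threshold_757 = maf_threshold(757)  # participants analyzed in the abstract
threshold_815 = maf_threshold(815)  # full WGS data set size, for comparison

print(f"N = 757: {threshold_757:.4f}")
print(f"N = 815: {threshold_815:.4f}")
```

    Because the cutoff shrinks with the square root of the sample size, a larger cohort would classify fewer variants as rare under the same rule.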

  8. Apolipoprotein L1 gene variants in deceased organ donors are associated with renal allograft failure.

    PubMed

    Freedman, B I; Julian, B A; Pastan, S O; Israni, A K; Schladt, D; Gautreaux, M D; Hauptfeld, V; Bray, R A; Gebel, H M; Kirk, A D; Gaston, R S; Rogers, J; Farney, A C; Orlando, G; Stratta, R J; Mohan, S; Ma, L; Langefeld, C D; Hicks, P J; Palmer, N D; Adams, P L; Palanisamy, A; Reeves-Daniel, A M; Divers, J

    2015-06-01

    Apolipoprotein L1 gene (APOL1) nephropathy variants in African American deceased kidney donors were associated with shorter renal allograft survival in a prior single-center report. APOL1 G1 and G2 variants were genotyped in newly accrued DNA samples from African American deceased donors of kidneys recovered and/or transplanted in Alabama and North Carolina. APOL1 genotypes and allograft outcomes in subsequent transplants from 55 U.S. centers were linked, adjusting for age, sex and race/ethnicity of recipients, HLA match, cold ischemia time, panel reactive antibody levels, and donor type. For 221 transplantations from kidneys recovered in Alabama, there was a statistical trend toward shorter allograft survival in recipients of two-APOL1-nephropathy-variant kidneys (hazard ratio [HR] 2.71; p = 0.06). For all 675 kidneys transplanted from donors at both centers, APOL1 genotype (HR 2.26; p = 0.001) and African American recipient race/ethnicity (HR 1.60; p = 0.03) were associated with allograft failure. Kidneys from African American deceased donors with two APOL1 nephropathy variants reproducibly associate with higher risk for allograft failure after transplantation. These findings warrant consideration of rapidly genotyping deceased African American kidney donors for APOL1 risk variants at organ recovery and incorporation of results into allocation and informed-consent processes. © Copyright 2015 The American Society of Transplantation and the American Society of Transplant Surgeons.

  9. The genetic architecture of type 2 diabetes.

    PubMed

    Fuchsberger, Christian; Flannick, Jason; Teslovich, Tanya M; Mahajan, Anubha; Agarwala, Vineeta; Gaulton, Kyle J; Ma, Clement; Fontanillas, Pierre; Moutsianas, Loukas; McCarthy, Davis J; Rivas, Manuel A; Perry, John R B; Sim, Xueling; Blackwell, Thomas W; Robertson, Neil R; Rayner, N William; Cingolani, Pablo; Locke, Adam E; Tajes, Juan Fernandez; Highland, Heather M; Dupuis, Josee; Chines, Peter S; Lindgren, Cecilia M; Hartl, Christopher; Jackson, Anne U; Chen, Han; Huyghe, Jeroen R; van de Bunt, Martijn; Pearson, Richard D; Kumar, Ashish; Müller-Nurasyid, Martina; Grarup, Niels; Stringham, Heather M; Gamazon, Eric R; Lee, Jaehoon; Chen, Yuhui; Scott, Robert A; Below, Jennifer E; Chen, Peng; Huang, Jinyan; Go, Min Jin; Stitzel, Michael L; Pasko, Dorota; Parker, Stephen C J; Varga, Tibor V; Green, Todd; Beer, Nicola L; Day-Williams, Aaron G; Ferreira, Teresa; Fingerlin, Tasha; Horikoshi, Momoko; Hu, Cheng; Huh, Iksoo; Ikram, Mohammad Kamran; Kim, Bong-Jo; Kim, Yongkang; Kim, Young Jin; Kwon, Min-Seok; Lee, Juyoung; Lee, Selyeong; Lin, Keng-Han; Maxwell, Taylor J; Nagai, Yoshihiko; Wang, Xu; Welch, Ryan P; Yoon, Joon; Zhang, Weihua; Barzilai, Nir; Voight, Benjamin F; Han, Bok-Ghee; Jenkinson, Christopher P; Kuulasmaa, Teemu; Kuusisto, Johanna; Manning, Alisa; Ng, Maggie C Y; Palmer, Nicholette D; Balkau, Beverley; Stančáková, Alena; Abboud, Hanna E; Boeing, Heiner; Giedraitis, Vilmantas; Prabhakaran, Dorairaj; Gottesman, Omri; Scott, James; Carey, Jason; Kwan, Phoenix; Grant, George; Smith, Joshua D; Neale, Benjamin M; Purcell, Shaun; Butterworth, Adam S; Howson, Joanna M M; Lee, Heung Man; Lu, Yingchang; Kwak, Soo-Heon; Zhao, Wei; Danesh, John; Lam, Vincent K L; Park, Kyong Soo; Saleheen, Danish; So, Wing Yee; Tam, Claudia H T; Afzal, Uzma; Aguilar, David; Arya, Rector; Aung, Tin; Chan, Edmund; Navarro, Carmen; Cheng, Ching-Yu; Palli, Domenico; Correa, Adolfo; Curran, Joanne E; Rybin, Denis; Farook, Vidya S; Fowler, Sharon P; Freedman, Barry I; Griswold, 
Michael; Hale, Daniel Esten; Hicks, Pamela J; Khor, Chiea-Chuen; Kumar, Satish; Lehne, Benjamin; Thuillier, Dorothée; Lim, Wei Yen; Liu, Jianjun; van der Schouw, Yvonne T; Loh, Marie; Musani, Solomon K; Puppala, Sobha; Scott, William R; Yengo, Loïc; Tan, Sian-Tsung; Taylor, Herman A; Thameem, Farook; Wilson, Gregory; Wong, Tien Yin; Njølstad, Pål Rasmus; Levy, Jonathan C; Mangino, Massimo; Bonnycastle, Lori L; Schwarzmayr, Thomas; Fadista, João; Surdulescu, Gabriela L; Herder, Christian; Groves, Christopher J; Wieland, Thomas; Bork-Jensen, Jette; Brandslund, Ivan; Christensen, Cramer; Koistinen, Heikki A; Doney, Alex S F; Kinnunen, Leena; Esko, Tõnu; Farmer, Andrew J; Hakaste, Liisa; Hodgkiss, Dylan; Kravic, Jasmina; Lyssenko, Valeriya; Hollensted, Mette; Jørgensen, Marit E; Jørgensen, Torben; Ladenvall, Claes; Justesen, Johanne Marie; Käräjämäki, Annemari; Kriebel, Jennifer; Rathmann, Wolfgang; Lannfelt, Lars; Lauritzen, Torsten; Narisu, Narisu; Linneberg, Allan; Melander, Olle; Milani, Lili; Neville, Matt; Orho-Melander, Marju; Qi, Lu; Qi, Qibin; Roden, Michael; Rolandsson, Olov; Swift, Amy; Rosengren, Anders H; Stirrups, Kathleen; Wood, Andrew R; Mihailov, Evelin; Blancher, Christine; Carneiro, Mauricio O; Maguire, Jared; Poplin, Ryan; Shakir, Khalid; Fennell, Timothy; DePristo, Mark; de Angelis, Martin Hrabé; Deloukas, Panos; Gjesing, Anette P; Jun, Goo; Nilsson, Peter; Murphy, Jacquelyn; Onofrio, Robert; Thorand, Barbara; Hansen, Torben; Meisinger, Christa; Hu, Frank B; Isomaa, Bo; Karpe, Fredrik; Liang, Liming; Peters, Annette; Huth, Cornelia; O'Rahilly, Stephen P; Palmer, Colin N A; Pedersen, Oluf; Rauramaa, Rainer; Tuomilehto, Jaakko; Salomaa, Veikko; Watanabe, Richard M; Syvänen, Ann-Christine; Bergman, Richard N; Bharadwaj, Dwaipayan; Bottinger, Erwin P; Cho, Yoon Shin; Chandak, Giriraj R; Chan, Juliana C N; Chia, Kee Seng; Daly, Mark J; Ebrahim, Shah B; Langenberg, Claudia; Elliott, Paul; Jablonski, Kathleen A; Lehman, Donna M; Jia, Weiping; Ma, Ronald C 
W; Pollin, Toni I; Sandhu, Manjinder; Tandon, Nikhil; Froguel, Philippe; Barroso, Inês; Teo, Yik Ying; Zeggini, Eleftheria; Loos, Ruth J F; Small, Kerrin S; Ried, Janina S; DeFronzo, Ralph A; Grallert, Harald; Glaser, Benjamin; Metspalu, Andres; Wareham, Nicholas J; Walker, Mark; Banks, Eric; Gieger, Christian; Ingelsson, Erik; Im, Hae Kyung; Illig, Thomas; Franks, Paul W; Buck, Gemma; Trakalo, Joseph; Buck, David; Prokopenko, Inga; Mägi, Reedik; Lind, Lars; Farjoun, Yossi; Owen, Katharine R; Gloyn, Anna L; Strauch, Konstantin; Tuomi, Tiinamaija; Kooner, Jaspal Singh; Lee, Jong-Young; Park, Taesung; Donnelly, Peter; Morris, Andrew D; Hattersley, Andrew T; Bowden, Donald W; Collins, Francis S; Atzmon, Gil; Chambers, John C; Spector, Timothy D; Laakso, Markku; Strom, Tim M; Bell, Graeme I; Blangero, John; Duggirala, Ravindranath; Tai, E Shyong; McVean, Gilean; Hanis, Craig L; Wilson, James G; Seielstad, Mark; Frayling, Timothy M; Meigs, James B; Cox, Nancy J; Sladek, Rob; Lander, Eric S; Gabriel, Stacey; Burtt, Noël P; Mohlke, Karen L; Meitinger, Thomas; Groop, Leif; Abecasis, Goncalo; Florez, Jose C; Scott, Laura J; Morris, Andrew P; Kang, Hyun Min; Boehnke, Michael; Altshuler, David; McCarthy, Mark I

    2016-08-04

    The genetic architecture of common traits, including the number, frequency, and effect sizes of inherited variants that contribute to individual risk, has been long debated. Genome-wide association studies have identified scores of common variants associated with type 2 diabetes, but in aggregate, these explain only a fraction of the heritability of this disease. Here, to test the hypothesis that lower-frequency variants explain much of the remainder, the GoT2D and T2D-GENES consortia performed whole-genome sequencing in 2,657 European individuals with and without diabetes, and exome sequencing in 12,940 individuals from five ancestry groups. To increase statistical power, we expanded the sample size via genotyping and imputation in a further 111,548 subjects. Variants associated with type 2 diabetes after sequencing were overwhelmingly common and most fell within regions previously identified by genome-wide association studies. Comprehensive enumeration of sequence variation is necessary to identify functional alleles that provide important clues to disease pathophysiology, but large-scale sequencing does not support the idea that lower-frequency variants have a major role in predisposition to type 2 diabetes.

  10. Selection and explosive growth alter genetic architecture and hamper the detection of causal rare variants.

    PubMed

    Uricchio, Lawrence H; Zaitlen, Noah A; Ye, Chun Jimmie; Witte, John S; Hernandez, Ryan D

    2016-07-01

    The role of rare alleles in complex phenotypes has been hotly debated, but most rare variant association tests (RVATs) do not account for the evolutionary forces that affect genetic architecture. Here, we use simulation and numerical algorithms to show that explosive population growth, as experienced by human populations, can dramatically increase the impact of very rare alleles on trait variance. We then assess the ability of RVATs to detect causal loci using simulations and human RNA-seq data. Surprisingly, we find that statistical performance is worst for phenotypes in which genetic variance is due mainly to rare alleles, and explosive population growth decreases power. Although many studies have attempted to identify causal rare variants, few have reported novel associations. This has sometimes been interpreted to mean that rare variants make negligible contributions to complex trait heritability. Our work shows that RVATs are not robust to realistic human evolutionary forces, so general conclusions about the impact of rare variants on complex traits may be premature. © 2016 Uricchio et al.; Published by Cold Spring Harbor Laboratory Press.

  11. The genetic architecture of type 2 diabetes

    PubMed Central

    Ma, Clement; Fontanillas, Pierre; Moutsianas, Loukas; McCarthy, Davis J; Rivas, Manuel A; Perry, John R B; Sim, Xueling; Blackwell, Thomas W; Robertson, Neil R; Rayner, N William; Cingolani, Pablo; Locke, Adam E; Tajes, Juan Fernandez; Highland, Heather M; Dupuis, Josee; Chines, Peter S; Lindgren, Cecilia M; Hartl, Christopher; Jackson, Anne U; Chen, Han; Huyghe, Jeroen R; van de Bunt, Martijn; Pearson, Richard D; Kumar, Ashish; Müller-Nurasyid, Martina; Grarup, Niels; Stringham, Heather M; Gamazon, Eric R; Lee, Jaehoon; Chen, Yuhui; Scott, Robert A; Below, Jennifer E; Chen, Peng; Huang, Jinyan; Go, Min Jin; Stitzel, Michael L; Pasko, Dorota; Parker, Stephen C J; Varga, Tibor V; Green, Todd; Beer, Nicola L; Day-Williams, Aaron G; Ferreira, Teresa; Fingerlin, Tasha; Horikoshi, Momoko; Hu, Cheng; Huh, Iksoo; Ikram, Mohammad Kamran; Kim, Bong-Jo; Kim, Yongkang; Kim, Young Jin; Kwon, Min-Seok; Lee, Juyoung; Lee, Selyeong; Lin, Keng-Han; Maxwell, Taylor J; Nagai, Yoshihiko; Wang, Xu; Welch, Ryan P; Yoon, Joon; Zhang, Weihua; Barzilai, Nir; Voight, Benjamin F; Han, Bok-Ghee; Jenkinson, Christopher P; Kuulasmaa, Teemu; Kuusisto, Johanna; Manning, Alisa; Ng, Maggie C Y; Palmer, Nicholette D; Balkau, Beverley; Stančáková, Alena; Abboud, Hanna E; Boeing, Heiner; Giedraitis, Vilmantas; Prabhakaran, Dorairaj; Gottesman, Omri; Scott, James; Carey, Jason; Kwan, Phoenix; Grant, George; Smith, Joshua D; Neale, Benjamin M; Purcell, Shaun; Butterworth, Adam S; Howson, Joanna M M; Lee, Heung Man; Lu, Yingchang; Kwak, Soo-Heon; Zhao, Wei; Danesh, John; Lam, Vincent K L; Park, Kyong Soo; Saleheen, Danish; So, Wing Yee; Tam, Claudia H T; Afzal, Uzma; Aguilar, David; Arya, Rector; Aung, Tin; Chan, Edmund; Navarro, Carmen; Cheng, Ching-Yu; Palli, Domenico; Correa, Adolfo; Curran, Joanne E; Rybin, Denis; Farook, Vidya S; Fowler, Sharon P; Freedman, Barry I; Griswold, Michael; Hale, Daniel Esten; Hicks, Pamela J; Khor, Chiea-Chuen; Kumar, Satish; Lehne, Benjamin; Thuillier, Dorothée; 
Lim, Wei Yen; Liu, Jianjun; van der Schouw, Yvonne T; Loh, Marie; Musani, Solomon K; Puppala, Sobha; Scott, William R; Yengo, Loïc; Tan, Sian-Tsung; Taylor, Herman A; Thameem, Farook; Wilson, Gregory; Wong, Tien Yin; Njølstad, Pål Rasmus; Levy, Jonathan C; Mangino, Massimo; Bonnycastle, Lori L; Schwarzmayr, Thomas; Fadista, João; Surdulescu, Gabriela L; Herder, Christian; Groves, Christopher J; Wieland, Thomas; Bork-Jensen, Jette; Brandslund, Ivan; Christensen, Cramer; Koistinen, Heikki A; Doney, Alex S F; Kinnunen, Leena; Esko, Tõnu; Farmer, Andrew J; Hakaste, Liisa; Hodgkiss, Dylan; Kravic, Jasmina; Lyssenko, Valeriya; Hollensted, Mette; Jørgensen, Marit E; Jørgensen, Torben; Ladenvall, Claes; Justesen, Johanne Marie; Käräjämäki, Annemari; Kriebel, Jennifer; Rathmann, Wolfgang; Lannfelt, Lars; Lauritzen, Torsten; Narisu, Narisu; Linneberg, Allan; Melander, Olle; Milani, Lili; Neville, Matt; Orho-Melander, Marju; Qi, Lu; Qi, Qibin; Roden, Michael; Rolandsson, Olov; Swift, Amy; Rosengren, Anders H; Stirrups, Kathleen; Wood, Andrew R; Mihailov, Evelin; Blancher, Christine; Carneiro, Mauricio O; Maguire, Jared; Poplin, Ryan; Shakir, Khalid; Fennell, Timothy; DePristo, Mark; de Angelis, Martin Hrabé; Deloukas, Panos; Gjesing, Anette P; Jun, Goo; Nilsson, Peter; Murphy, Jacquelyn; Onofrio, Robert; Thorand, Barbara; Hansen, Torben; Meisinger, Christa; Hu, Frank B; Isomaa, Bo; Karpe, Fredrik; Liang, Liming; Peters, Annette; Huth, Cornelia; O'Rahilly, Stephen P; Palmer, Colin N A; Pedersen, Oluf; Rauramaa, Rainer; Tuomilehto, Jaakko; Salomaa, Veikko; Watanabe, Richard M; Syvänen, Ann-Christine; Bergman, Richard N; Bharadwaj, Dwaipayan; Bottinger, Erwin P; Cho, Yoon Shin; Chandak, Giriraj R; Chan, Juliana C N; Chia, Kee Seng; Daly, Mark J; Ebrahim, Shah B; Langenberg, Claudia; Elliott, Paul; Jablonski, Kathleen A; Lehman, Donna M; Jia, Weiping; Ma, Ronald C W; Pollin, Toni I; Sandhu, Manjinder; Tandon, Nikhil; Froguel, Philippe; Barroso, Inês; Teo, Yik Ying; Zeggini, 
Eleftheria; Loos, Ruth J F; Small, Kerrin S; Ried, Janina S; DeFronzo, Ralph A; Grallert, Harald; Glaser, Benjamin; Metspalu, Andres; Wareham, Nicholas J; Walker, Mark; Banks, Eric; Gieger, Christian; Ingelsson, Erik; Im, Hae Kyung; Illig, Thomas; Franks, Paul W; Buck, Gemma; Trakalo, Joseph; Buck, David; Prokopenko, Inga; Mägi, Reedik; Lind, Lars; Farjoun, Yossi; Owen, Katharine R; Gloyn, Anna L; Strauch, Konstantin; Tuomi, Tiinamaija; Kooner, Jaspal Singh; Lee, Jong-Young; Park, Taesung; Donnelly, Peter; Morris, Andrew D; Hattersley, Andrew T; Bowden, Donald W; Collins, Francis S; Atzmon, Gil; Chambers, John C; Spector, Timothy D; Laakso, Markku; Strom, Tim M; Bell, Graeme I; Blangero, John; Duggirala, Ravindranath; Tai, E Shyong; McVean, Gilean; Hanis, Craig L; Wilson, James G; Seielstad, Mark; Frayling, Timothy M; Meigs, James B; Cox, Nancy J; Sladek, Rob; Lander, Eric S; Gabriel, Stacey; Burtt, Noël P; Mohlke, Karen L; Meitinger, Thomas; Groop, Leif; Abecasis, Goncalo; Florez, Jose C; Scott, Laura J; Morris, Andrew P; Kang, Hyun Min; Boehnke, Michael; Altshuler, David; McCarthy, Mark I

    2016-01-01

    The genetic architecture of common traits, including the number, frequency, and effect sizes of inherited variants that contribute to individual risk, has been long debated. Genome-wide association studies have identified scores of common variants associated with type 2 diabetes, but in aggregate, these explain only a fraction of heritability. To test the hypothesis that lower-frequency variants explain much of the remainder, the GoT2D and T2D-GENES consortia performed whole genome sequencing in 2,657 Europeans with and without diabetes, and exome sequencing in a total of 12,940 subjects from five ancestral groups. To increase statistical power, we expanded sample size via genotyping and imputation in a further 111,548 subjects. Variants associated with type 2 diabetes after sequencing were overwhelmingly common and most fell within regions previously identified by genome-wide association studies. Comprehensive enumeration of sequence variation is necessary to identify functional alleles that provide important clues to disease pathophysiology, but large-scale sequencing does not support a major role for lower-frequency variants in predisposition to type 2 diabetes. PMID:27398621

  12. Ranking viruses: measures of positional importance within networks define core viruses for rational polyvalent vaccine development.

    PubMed

    Anderson, Tavis K; Laegreid, William W; Cerutti, Francesco; Osorio, Fernando A; Nelson, Eric A; Christopher-Hennings, Jane; Goldberg, Tony L

    2012-06-15

    The extraordinary genetic and antigenic variability of RNA viruses is arguably the greatest challenge to the development of broadly effective vaccines. No single viral variant can induce sufficiently broad immunity, and incorporating all known naturally circulating variants into one multivalent vaccine is not feasible. Furthermore, no objective strategies currently exist to select actual viral variants that should be included or excluded in polyvalent vaccines. To address this problem, we demonstrate a method based on graph theory that quantifies the relative importance of viral variants. We demonstrate our method through application to the envelope glycoprotein gene of a particularly diverse RNA virus of pigs: porcine reproductive and respiratory syndrome virus (PRRSV). Using distance matrices derived from sequence nucleotide difference, amino acid difference and evolutionary distance, we constructed viral networks and used common network statistics to assign each sequence an objective ranking of relative 'importance'. To validate our approach, we use an independent published algorithm to score our top-ranked wild-type variants for coverage of putative T-cell epitopes across the 9383 sequences in our dataset. Top-ranked viruses achieve significantly higher coverage than low-ranked viruses, and top-ranked viruses achieve nearly equal coverage as a synthetic mosaic protein constructed in silico from the same set of 9383 sequences. Our approach relies on the network structure of PRRSV but applies to any diverse RNA virus because it identifies subsets of viral variants that are most important to overall viral diversity. We suggest that this method, through the objective quantification of variant importance, provides criteria for choosing viral variants for further characterization, diagnostics, surveillance and ultimately polyvalent vaccine development.
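
    The ranking idea in this abstract (distance matrix, then network, then a network statistic per sequence) can be illustrated with a minimal sketch. The abstract does not say which network statistics or edge rule the authors used, so the distance threshold, the use of degree as the ranking statistic, and the function name `rank_by_degree` are all assumptions made for illustration only.

```python
def rank_by_degree(names, dist, threshold):
    """Rank sequences by degree in a thresholded distance network.

    names:     list of sequence labels (hypothetical)
    dist:      symmetric matrix of pairwise distances, dist[i][j]
    threshold: two sequences are linked when their distance is <= threshold
               (an assumed edge rule, not taken from the paper)
    """
    n = len(names)
    degree = {name: 0 for name in names}
    for i in range(n):
        for j in range(i + 1, n):
            if dist[i][j] <= threshold:
                degree[names[i]] += 1
                degree[names[j]] += 1
    # Highest degree first; ties broken alphabetically for a stable order.
    ranking = sorted(names, key=lambda s: (-degree[s], s))
    return ranking, degree


names = ["A", "B", "C", "D"]
dist = [
    [0.0, 0.1, 0.2, 0.9],
    [0.1, 0.0, 0.8, 0.7],
    [0.2, 0.8, 0.0, 0.95],
    [0.9, 0.7, 0.95, 0.0],
]
ranking, degree = rank_by_degree(names, dist, threshold=0.3)
```

    Other centrality measures (closeness, betweenness) could be substituted for degree without changing the overall shape of the procedure.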

  13. Evaluation of Allele-Specific Somatic Changes of Genome-Wide Association Study Susceptibility Alleles in Human Colorectal Cancers

    PubMed Central

    Gerber, Madelyn M.; Hampel, Heather; Schulz, Nathan P.; Fernandez, Soledad; Wei, Lai; Zhou, Xiao-Ping; de la Chapelle, Albert; Toland, Amanda Ewart

    2012-01-01

    Background Tumors frequently exhibit loss of tumor suppressor genes or allelic gains of activated oncogenes. A significant proportion of cancer susceptibility loci in the mouse show somatic losses or gains consistent with the presence of a tumor susceptibility or resistance allele. Thus, allele-specific somatic gains or losses at loci may demarcate the presence of resistance or susceptibility alleles. The goal of this study was to determine if previously mapped susceptibility loci for colorectal cancer show evidence of allele-specific somatic events in colon tumors. Methods We performed quantitative genotyping of 16 single nucleotide polymorphisms (SNPs) showing statistically significant association with colorectal cancer in published genome-wide association studies (GWAS). We genotyped 194 paired normal and colorectal tumor DNA samples and 296 paired validation samples to investigate these SNPs for allele-specific somatic gains and losses. We combined analysis of our data with published data for seven of these SNPs. Results No statistically significant evidence for allele-specific somatic selection was observed for the tested polymorphisms in the discovery set. The rs6983267 variant, which has shown preferential loss of the non-risk T allele and relative gain of the risk G allele in previous studies, favored relative gain of the G allele in the combined discovery and validation samples (corrected p-value = 0.03). When we combined our data with published allele-specific imbalance data for this SNP, the G allele of rs6983267 showed statistically significant evidence of relative retention (p-value = 2.06×10−4). Conclusions Our results suggest that the majority of variants identified as colon cancer susceptibility alleles through GWAS do not exhibit somatic allele-specific imbalance in colon tumors. Our data confirm previously published results showing allele-specific imbalance for rs6983267. 
These results indicate that allele-specific imbalance of cancer susceptibility alleles may not be a common phenomenon in colon cancer. PMID:22629442

  14. Multi-template analysis of human perirhinal cortex in brain MRI: Explicitly accounting for anatomical variability

    PubMed Central

    Xie, Long; Pluta, John B.; Das, Sandhitsu R.; Wisse, Laura E.M.; Wang, Hongzhi; Mancuso, Lauren; Kliot, Dasha; Avants, Brian B.; Ding, Song-Lin; Manjón, José V.; Wolk, David A.; Yushkevich, Paul A.

    2016-01-01

    Rationale The human perirhinal cortex (PRC) plays critical roles in episodic and semantic memory and visual perception. The PRC consists of Brodmann areas 35 and 36 (BA35, BA36). In Alzheimer's disease (AD), BA35 is the first cortical site affected by neurofibrillary tangle pathology, which is closely linked to neural injury in AD. Large anatomical variability, manifested in the form of different cortical folding and branching patterns, makes it difficult to segment the PRC in MRI scans. Pathology studies have found that in ~97% of specimens, the PRC falls into one of three discrete anatomical variants. However, current methods for PRC segmentation and morphometry in MRI are based on single-template approaches, which may not be able to accurately model these discrete variants. Methods A multi-template analysis pipeline that explicitly accounts for anatomical variability is used to automatically label the PRC and measure its thickness in T2-weighted MRI scans. The pipeline uses multi-atlas segmentation to automatically label medial temporal lobe cortices including entorhinal cortex, PRC and the parahippocampal cortex. Pairwise registration between label maps and clustering based on residual dissimilarity after registration are used to construct separate templates for the anatomical variants of the PRC. An optimal path of deformations linking these templates is used to establish correspondences between all the subjects. Experimental evaluation focuses on the ability of single-template and multi-template analyses to detect differences in the thickness of medial temporal lobe cortices between patients with amnestic mild cognitive impairment (aMCI, n=41) and age-matched controls (n=44). Results The proposed technique is able to generate templates that recover the three dominant discrete variants of PRC and establish more meaningful correspondences between subjects than a single-template approach. 
The largest reduction in thickness associated with aMCI, in absolute terms, was found in left BA35 using both regional and summary thickness measures. Further, statistical maps of regional thickness difference between aMCI and controls revealed different patterns for the three anatomical variants. PMID:27702610

  15. Genome-wide association study reveals putative regulators of bioenergy traits in Populus deltoides

    DOE PAGES

    Fahrenkrog, Annette M.; Neves, Leandro G.; Resende, Jr., Marcio F. R.; ...

    2016-09-06

    Genome-wide association studies (GWAS) have been used extensively to dissect the genetic regulation of complex traits in plants. These studies have focused largely on the analysis of common genetic variants despite the abundance of rare polymorphisms in several species, and their potential role in trait variation. Here, we conducted the first GWAS in Populus deltoides, a genetically diverse keystone forest species in North America and an important short rotation woody crop for the bioenergy industry. We searched for associations between eight growth and wood composition traits, and common and low-frequency single-nucleotide polymorphisms detected by targeted resequencing of 18 153 genes in a population of 391 unrelated individuals. To increase power to detect associations with low-frequency variants, multiple-marker association tests were used in combination with single-marker association tests. Significant associations were discovered for all phenotypes and are indicative that low-frequency polymorphisms contribute to phenotypic variance of several bioenergy traits. Our results suggest that both common and low-frequency variants need to be considered for a comprehensive understanding of the genetic regulation of complex traits, particularly in species that carry large numbers of rare polymorphisms. Lastly, these polymorphisms may be critical for the development of specialized plant feedstocks for bioenergy.

  16. Quantitative Missense Variant Effect Prediction Using Large-Scale Mutagenesis Data.

    PubMed

    Gray, Vanessa E; Hause, Ronald J; Luebeck, Jens; Shendure, Jay; Fowler, Douglas M

    2018-01-24

    Large datasets describing the quantitative effects of mutations on protein function are becoming increasingly available. Here, we leverage these datasets to develop Envision, which predicts the magnitude of a missense variant's molecular effect. Envision combines 21,026 variant effect measurements from nine large-scale experimental mutagenesis datasets, a hitherto untapped training resource, with a supervised, stochastic gradient boosting learning algorithm. Envision outperforms other missense variant effect predictors both on large-scale mutagenesis data and on an independent test dataset comprising 2,312 TP53 variants whose effects were measured using a low-throughput approach. This dataset was never used for hyperparameter tuning or model training and thus serves as an independent validation set. Envision prediction accuracy is also more consistent across amino acids than other predictors. Finally, we demonstrate that Envision's performance improves as more large-scale mutagenesis data are incorporated. We precompute Envision predictions for every possible single amino acid variant in human, mouse, frog, zebrafish, fruit fly, worm, and yeast proteomes (https://envision.gs.washington.edu/). Copyright © 2017 Elsevier Inc. All rights reserved.

  17. The Chandra Source Catalog: Source Variability

    NASA Astrophysics Data System (ADS)

    Nowak, Michael; Rots, A. H.; McCollough, M. L.; Primini, F. A.; Glotfelty, K. J.; Bonaventura, N. R.; Chen, J. C.; Davis, J. E.; Doe, S. M.; Evans, J. D.; Fabbiano, G.; Galle, E.; Gibbs, D. G.; Grier, J. D.; Hain, R.; Hall, D. M.; Harbo, P. N.; He, X.; Houck, J. C.; Karovska, M.; Lauer, J.; McDowell, J. C.; Miller, J. B.; Mitschang, A. W.; Morgan, D. L.; Nichols, J. S.; Plummer, D. A.; Refsdal, B. L.; Siemiginowska, A. L.; Sundheim, B. A.; Tibbetts, M. S.; Van Stone, D. W.; Winkelman, S. L.; Zografou, P.

    2009-01-01

    The Chandra Source Catalog (CSC) contains fields of view that have been studied with individual, uninterrupted observations that span integration times ranging from 1 ksec to 160 ksec, and a large number of which have received (multiple) repeat observations days to years later. The CSC thus offers an unprecedented look at the variability of the X-ray sky over a broad range of time scales, and across a wide diversity of variable X-ray sources: stars in the local galactic neighborhood, galactic and extragalactic X-ray binaries, Active Galactic Nuclei, etc. Here we describe the methods used to identify and quantify source variability within a single observation, and the methods used to assess the variability of a source when detected in multiple, individual observations. Three tests are used to detect source variability within a single observation: the Kolmogorov-Smirnov test and its variant, the Kuiper test, and a Bayesian approach originally suggested by Gregory and Loredo. The latter test not only provides an indicator of variability, but is also used to create a best estimate of the variable lightcurve shape. We assess the performance of these tests via simulation of statistically stationary, variable processes with arbitrary input power spectral densities (here we concentrate on results of red noise simulations) at a variety of mean count rates and fractional root mean square variabilities relevant to CSC sources. We also assess the false positive rate via simulations of constant sources whose sole source of fluctuation is Poisson noise. We compare these simulations to a preliminary assessment of the variability found in real CSC sources, and estimate the variability sensitivities of the CSC.

  18. The Chandra Source Catalog: Source Variability

    NASA Astrophysics Data System (ADS)

    Nowak, Michael; Rots, A. H.; McCollough, M. L.; Primini, F. A.; Glotfelty, K. J.; Bonaventura, N. R.; Chen, J. C.; Davis, J. E.; Doe, S. M.; Evans, J. D.; Evans, I.; Fabbiano, G.; Galle, E. C.; Gibbs, D. G., II; Grier, J. D.; Hain, R.; Hall, D. M.; Harbo, P. N.; He, X.; Houck, J. C.; Karovska, M.; Lauer, J.; McDowell, J. C.; Miller, J. B.; Mitschang, A. W.; Morgan, D. L.; Nichols, J. S.; Plummer, D. A.; Refsdal, B. L.; Siemiginowska, A. L.; Sundheim, B. A.; Tibbetts, M. S.; van Stone, D. W.; Winkelman, S. L.; Zografou, P.

    2009-09-01

    The Chandra Source Catalog (CSC) contains fields of view that have been studied with individual, uninterrupted observations that span integration times ranging from 1 ksec to 160 ksec, and a large number of which have received (multiple) repeat observations days to years later. The CSC thus offers an unprecedented look at the variability of the X-ray sky over a broad range of time scales, and across a wide diversity of variable X-ray sources: stars in the local galactic neighborhood, galactic and extragalactic X-ray binaries, Active Galactic Nuclei, etc. Here we describe the methods used to identify and quantify source variability within a single observation, and the methods used to assess the variability of a source when detected in multiple, individual observations. Three tests are used to detect source variability within a single observation: the Kolmogorov-Smirnov test and its variant, the Kuiper test, and a Bayesian approach originally suggested by Gregory and Loredo. The latter test not only provides an indicator of variability, but is also used to create a best estimate of the variable lightcurve shape. We assess the performance of these tests via simulation of statistically stationary, variable processes with arbitrary input power spectral densities (here we concentrate on results of red noise simulations) at a variety of mean count rates and fractional root mean square variabilities relevant to CSC sources. We also assess the false positive rate via simulations of constant sources whose sole source of fluctuation is Poisson noise. We compare these simulations to an assessment of the variability found in real CSC sources, and estimate the variability sensitivities of the CSC.
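
    As a rough illustration of the first of the three tests named above, the sketch below runs a one-sample Kolmogorov-Smirnov test of event arrival times against the uniform distribution expected from a constant source. It is not the CSC pipeline: the function name `ks_uniform`, the observation window arguments, and the use of the asymptotic Kolmogorov distribution for the p-value (reasonable only for moderately large event counts) are assumptions made here.

```python
import math


def ks_uniform(times, t_start, t_stop):
    """One-sample KS test of arrival times against a constant-rate source.

    Returns (D, p): the KS statistic and an asymptotic p-value.
    """
    n = len(times)
    span = t_stop - t_start
    d = 0.0
    for i, t in enumerate(sorted(times), start=1):
        f = (t - t_start) / span          # uniform CDF at this event
        # Largest gap between the empirical CDF and the uniform CDF.
        d = max(d, i / n - f, f - (i - 1) / n)
    lam = math.sqrt(n) * d
    if lam < 0.2:
        # The Kolmogorov survival function is 1 to within ~1e-12 here,
        # and the alternating series below converges too slowly.
        return d, 1.0
    p = 0.0
    for k in range(1, 101):
        p += 2.0 * (-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
    return d, min(max(p, 0.0), 1.0)


# Hypothetical event lists on a unit-length observation window.
steady = [(i + 0.5) / 100 for i in range(100)]
flare = [0.9 + 0.1 * (i + 0.5) / 100 for i in range(100)]
d_steady, p_steady = ks_uniform(steady, 0.0, 1.0)
d_flare, p_flare = ks_uniform(flare, 0.0, 1.0)
```

    A small p-value suggests the arrival times are inconsistent with a constant rate; the Kuiper variant differs in summing the largest positive and negative deviations rather than taking the larger of the two.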

  19. Test-Retest Reliability of Standard and Emotional Stroop Tasks: An Investigation of Color-Word and Picture-Word Versions

    ERIC Educational Resources Information Center

    Strauss, Gregory P.; Allen, Daniel N.; Jorgensen, Melinda L.; Cramer, Stacey L.

    2005-01-01

    Previous studies have examined the reliability of scores derived from various Stroop tasks. However, few studies have compared reliability of more recently developed Stroop variants such as emotional Stroop tasks to standard versions of the Stroop. The current study developed four different single-stimulus Stroop tasks and compared test-retest…

  20. Genetic variants and early cigarette smoking and nicotine dependence phenotypes in adolescents.

    PubMed

    O'Loughlin, Jennifer; Sylvestre, Marie-Pierre; Labbe, Aurélie; Low, Nancy C; Roy-Gagnon, Marie-Hélène; Dugas, Erika N; Karp, Igor; Engert, James C

    2014-01-01

    While the heritability of cigarette smoking and nicotine dependence (ND) is well-documented, the contribution of specific genetic variants to specific phenotypes has not been closely examined. The objectives of this study were to test the associations between 321 tagging single-nucleotide polymorphisms (SNPs) that capture common genetic variation in 24 genes, and early smoking and ND phenotypes in novice adolescent smokers, and to assess if genetic predictors differ across these phenotypes. In a prospective study of 1294 adolescents aged 12-13 years recruited from ten Montreal-area secondary schools, 544 participants who had smoked at least once during the 7-8 year follow-up provided DNA. 321 single-nucleotide polymorphisms (SNPs) in 24 candidate genes were tested for an association with number of cigarettes smoked in the past 3 months, and with five ND phenotypes (a modified version of the Fagerstrom Tolerance Questionnaire, the ICD-10 and three clusters of ND symptoms representing withdrawal symptoms, use of nicotine for self-medication, and a general ND/craving symptom indicator). The pattern of SNP-gene associations differed across phenotypes. Sixteen SNPs in seven genes (ANKK1, CHRNA7, DDC, DRD2, COMT, OPRM1, SLC6A3 (also known as DAT1)) were associated with at least one phenotype with a p-value <0.01 using linear mixed models. After permutation and FDR adjustment, none of the associations remained statistically significant, although the p-values for the association between rs557748 in OPRM1 and the ND/craving and self-medication phenotypes were both 0.076. Because the genetic predictors differ, specific cigarette smoking and ND phenotypes should be distinguished in genetic studies in adolescents. Fifteen of the 16 top-ranked SNPs identified in this study were from loci involved in dopaminergic pathways (ANKK1/DRD2, DDC, COMT, OPRM1, and SLC6A3). Dopaminergic pathways may be salient during early smoking and the development of ND.
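
    The abstract mentions FDR adjustment without naming the procedure. Assuming it resembles the widely used Benjamini-Hochberg step-up method (an assumption, not something the abstract states), the adjustment of a list of p-values could be sketched as follows; the function name and the example p-values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

    With many SNPs tested against several phenotypes, nominal p-values below 0.01 can easily rise above conventional thresholds after this kind of adjustment, which is consistent with the pattern the abstract reports.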

  1. Resequencing the susceptibility gene, ITGAM, identifies two functionally deleterious rare variants in systemic lupus erythematosus cases

    PubMed Central

    2014-01-01

    Introduction The majority of the genetic variance of systemic lupus erythematosus (SLE) remains unexplained by the common disease-common variant hypothesis. Rare variants, which are not detectable by genome-wide association studies because of their low frequencies, are predicted to explain part of this “missing heritability.” However, recent studies identifying rare variants within known disease-susceptibility loci have failed to show genetic associations because of their extremely low frequencies, leading to the questioning of the contribution of rare variants to disease susceptibility. A common (minor allele frequency = 17.4% in cases) nonsynonymous coding variant rs1143679 (R77H) in ITGAM (CD11b), which forms half of the heterodimeric integrin receptor, complement receptor 3 (CR3), is robustly associated with SLE and has been shown to impair CR3-mediated phagocytosis. Methods We resequenced ITGAM in 73 SLE cases and identified two previously unidentified, case-specific nonsynonymous variants, F941V and G1145S. Both variants were genotyped in 2,107 and 949 additional SLE cases, respectively, to estimate their frequencies in a disease population. An in vitro model was used to assess the impact of F941V and G1145S, together with two nonsynonymous ITGAM polymorphisms, A858V (rs1143683) and M441T (rs11861251), on CR3-mediated phagocytosis. A paired two-tailed t test was used to compare the phagocytic capabilities of each variant with that of wild-type CR3. Results Both rare variants, F941V and G1145S, significantly impair CR3-mediated phagocytosis in an in vitro model (61% reduction, P = 0.006; 26% reduction, P = 0.0232). However, neither of the common variants, M441T and A858V, had an effect on phagocytosis. Neither rare variant was observed again in the genotyping of additional SLE cases, suggesting that their frequencies are extremely low. 
Conclusions Our results add further evidence to the functional importance of ITGAM in SLE pathogenesis through impaired phagocytosis. Additionally, this study provides a new example of the identification of rare variants in common-allele-associated loci, which, because of their extremely low frequencies, are not statistically associated. However, the demonstration of their functional effects adds support to their contribution to disease risk, and questions the current notion of dismissing the contribution of very rare variants on purely statistical analyses. PMID:24886912

  2. A powerful approach for association analysis incorporating imprinting effects

    PubMed Central

    Xia, Fan; Zhou, Ji-Yuan; Fung, Wing Kam

    2011-01-01

    Motivation: For a diallelic marker locus, the transmission disequilibrium test (TDT) is a simple and powerful design for genetic studies. The TDT was originally proposed for use in families with both parents available (complete nuclear families) and has further been extended to 1-TDT for use in families with only one of the parents available (incomplete nuclear families). Currently, the increasing interest of the influence of parental imprinting on heritability indicates the importance of incorporating imprinting effects into the mapping of association variants. Results: In this article, we extend the TDT-type statistics to incorporate imprinting effects and develop a series of new test statistics in a general two-stage framework for association studies. Our test statistics enjoy the nature of family-based designs that need no assumption of Hardy–Weinberg equilibrium. Also, the proposed methods accommodate complete and incomplete nuclear families with one or more affected children. In the simulation study, we verify the validity of the proposed test statistics under various scenarios, and compare the powers of the proposed statistics with some existing test statistics. It is shown that our methods greatly improve the power for detecting association in the presence of imprinting effects. We further demonstrate the advantage of our methods by the application of the proposed test statistics to a rheumatoid arthritis dataset. Contact: wingfung@hku.hk Supplementary information: Supplementary data are available at Bioinformatics online. PMID:21798962

  3. A powerful approach for association analysis incorporating imprinting effects.

    PubMed

    Xia, Fan; Zhou, Ji-Yuan; Fung, Wing Kam

    2011-09-15

    For a diallelic marker locus, the transmission disequilibrium test (TDT) is a simple and powerful design for genetic studies. The TDT was originally proposed for use in families with both parents available (complete nuclear families) and has further been extended to 1-TDT for use in families with only one of the parents available (incomplete nuclear families). Currently, the increasing interest of the influence of parental imprinting on heritability indicates the importance of incorporating imprinting effects into the mapping of association variants. In this article, we extend the TDT-type statistics to incorporate imprinting effects and develop a series of new test statistics in a general two-stage framework for association studies. Our test statistics enjoy the nature of family-based designs that need no assumption of Hardy-Weinberg equilibrium. Also, the proposed methods accommodate complete and incomplete nuclear families with one or more affected children. In the simulation study, we verify the validity of the proposed test statistics under various scenarios, and compare the powers of the proposed statistics with some existing test statistics. It is shown that our methods greatly improve the power for detecting association in the presence of imprinting effects. We further demonstrate the advantage of our methods by the application of the proposed test statistics to a rheumatoid arthritis dataset. Contact: wingfung@hku.hk. Supplementary data are available at Bioinformatics online.
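
    For orientation, the classical TDT that this work extends has a simple McNemar-type form: with b transmissions and c non-transmissions of a given allele from heterozygous parents to affected children, the statistic is (b - c)^2 / (b + c), referred to a chi-square distribution with one degree of freedom. The sketch below computes only this classical statistic, not the imprinting-aware extensions proposed in the article; the function name and the counts are hypothetical.

```python
import math


def tdt(transmitted, untransmitted):
    """Classical TDT statistic and its 1-df chi-square p-value.

    transmitted:   times heterozygous parents passed the allele on (b)
    untransmitted: times heterozygous parents did not pass it on (c)
    """
    b, c = transmitted, untransmitted
    if b + c == 0:
        raise ValueError("no informative (heterozygous) parents")
    chi2 = (b - c) ** 2 / (b + c)
    # Survival function of chi-square with 1 df: erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


chi2, p_value = tdt(transmitted=60, untransmitted=40)
```

    Because only heterozygous parents contribute, the statistic does not depend on population genotype frequencies, which is the family-based property the abstract refers to when it says no Hardy-Weinberg assumption is needed.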

  4. Ovine Reference Materials and Assays for Prion Genetic Testing

    USDA-ARS's Scientific Manuscript database

    Codon variants implicated in scrapie susceptibility or disease progression include those at amino acid positions 112, 136, 141, 154, and 171. Nine single nucleotide polymorphisms (SNPs) determine which residues are encoded by the five implicated codons and accurately scoring these SNPs is essential...

  5. Computing Relative Free Energies of Solvation using Single Reference Thermodynamic Integration Augmented with Hamiltonian Replica Exchange.

    PubMed

    Khavrutskii, Ilja V; Wallqvist, Anders

    2010-11-09

    This paper introduces an efficient single-topology variant of Thermodynamic Integration (TI) for computing relative transformation free energies in a series of molecules with respect to a single reference state. The presented TI variant, which we refer to as Single-Reference TI (SR-TI), combines well-established molecular simulation methodologies into a practical computational tool. Augmented with Hamiltonian Replica Exchange (HREX), the SR-TI variant can deliver enhanced sampling in select degrees of freedom. The utility of the SR-TI variant is demonstrated in calculations of relative solvation free energies for a series of benzene derivatives with increasing complexity. Notably, the SR-TI variant with the HREX option provides converged results in a challenging case of an amide molecule with a high (13-15 kcal/mol) barrier for internal cis/trans interconversion using simulation times of only 1 to 4 ns.

  6. European multiple sclerosis risk variants in the south Asian population.

    PubMed

    Pandit, Lekha; Ban, Maria; Beecham, Ashley Harris; McCauley, Jacob L; Sawcer, Stephen; D'Cunha, Anitha; Malli, Chaitra; Malik, Omar

    2016-10-01

    In less than a decade, genomewide association studies have identified over 100 single-nucleotide variants that are associated with increased risk of developing multiple sclerosis. However, since these studies have focused almost exclusively on European populations, it is unclear what role these variants might play in determining risk in other ethnic groups. To assess the effects of European multiple sclerosis-associated risk variants in the south Asian population. Using a combination of chip-based genotyping and next-generation sequencing, we have assessed 109 European-associated variants in a total of 270 cases and 555 controls from the south Asian population. We found that two-thirds of the tested variants (72/109) showed overrepresentation of the European risk allele in south Asian cases (p < 0.0003). In the rest of the Immunochip array, the most associated variant was rs7318477, which maps close to TNFSF13B, the gene for the B-cell-related protein BAFF. Our data indicate substantial overlap in genetic risk architecture between Europeans and south Asians and suggest that the aetiology of the disease may be largely independent of ethnicity. © The Author(s), 2016.

  7. Association of common variants in PAH and LAT1 with non-syndromic cleft lip with or without cleft palate (NSCL/P) in the Polish population.

    PubMed

    Hozyasz, Kamil K; Mostowska, Adrianna; Wójcicki, Piotr; Lasota, Agnieszka; Wołkowicz, Anna; Dunin-Wilczyńska, Izabella; Jagodziński, Paweł P

    2014-04-01

    Non-syndromic cleft lip with or without cleft palate (NSCL/P) is a common structural malformation with a complex and multifactorial aetiology. Associations of abnormalities in phenylalanine metabolism and orofacial clefts have been suggested. Eight single nucleotide polymorphisms (SNPs) of genes encoding phenylalanine hydroxylase (PAH) and large neutral l-amino acid transporter type 1 (LAT1), as well as the PAH mutation that is most common in the Polish population (rs5030858; R408W), were investigated in 263 patients with NSCL/P and 270 matched controls using high resolution melting curve analysis (HRM). We found that two polymorphic variants of PAH appear to be risk factors for NSCL/P. The odds ratio (OR) for individuals with the rs7485331 A allele (AC or AA) compared to CC homozygotes was 0.616 (95% confidence interval [CI]=0.437-0.868; p=0.005) and this association remains statistically significant after multiple testing correction. The PAH rs12425434, previously associated with schizophrenia, was borderline associated with orofacial clefts. Moreover, haplotype analysis of polymorphisms in the PAH gene revealed a 4-marker combination that was significantly associated with NSCL/P. The global p-value for a haplotype comprised of SNPs rs74385331, rs12425434, rs1722392, and the mutation rs5030858 was 0.032, but this association did not survive multiple testing correction. This study suggests the involvement of the PAH gene in the aetiology of NSCL/P in the tested population. Further replication will be required in separate cohorts to confirm the consistency of the observed association. Copyright © 2014 Elsevier Ltd. All rights reserved.

  8. Insufficient evidence for association of NOD2/CARD15 or other inflammatory bowel disease–associated markers on GVHD incidence or other adverse outcomes in T-replete, unrelated donor transplantation

    PubMed Central

    Nguyen, Yume; Al-Lehibi, Abed; Gorbe, Elizabeth; Li, Ellen; Haagenson, Michael; Wang, Tao; Spellman, Stephen; Lee, Stephanie J.

    2010-01-01

    Previous European studies suggest NOD2/CARD15 and interleukin-23 receptor (IL-23R) donor or recipient variants are associated with adverse clinical outcomes in allogeneic hematopoietic stem cell transplantation. We reexamined these findings as well as the role of another inflammatory bowel disease (IBD) susceptibility gene (immunity-related GTPase family, M [IRGM]) on transplantation outcomes in 390 US patients and their matched unrelated donors, accrued between 1995 and 2004. Patients received T-replete grafts with mostly myeloablative conditioning regimens. Multivariate analyses were performed for overall survival, disease-free survival, transplantation-related mortality, relapse, and acute and chronic graft-versus-host disease. Of 390 pairs, NOD2/CARD15 variant single nucleotide polymorphisms (SNPs) were found in 14% of donors and 17% of recipients. In 3% both donor and recipient had a mutant SNP. Thirteen percent of donors and 16% of recipients had variant IL23R SNPs, with 3% having both donor and recipient variants. Twenty-three percent of both donors and recipients had variant IRGM SNPs. None of the 3 IBD-associated alleles showed a statistically significant association with any adverse clinical outcomes. Our results do not support an association between the 3 IBD-associated SNPs and adverse outcomes after matched unrelated donor hematopoietic cell transplantations in US patients. PMID:20177049

  9. A powerful and robust test in genetic association studies.

    PubMed

    Cheng, Kuang-Fu; Lee, Jen-Yu

    2014-01-01

    There are several well-known single SNP tests presented in the literature for detecting gene-disease association signals. Having in place an efficient and robust testing process across all genetic models would allow a more comprehensive approach to analysis. Although some studies have shown that it is possible to construct such a test when the variants are common and the genetic model satisfies certain conditions, the model conditions are too restrictive and in general difficult to verify. In this paper, we propose a powerful and robust test without assuming any model restrictions. Our test is based on the selected 2 × 2 tables derived from the usual 2 × 3 table. By signals from these tables, we show through simulations across a wide range of allele frequencies and genetic models that this approach may produce a test which is almost uniformly most powerful in the analysis of low- and high-frequency variants. Two cancer studies are used to demonstrate applications of the proposed test. © 2014 S. Karger AG, Basel.
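
    The step of deriving 2 × 2 tables from the usual 2 × 3 genotype table can be sketched as follows. This is only an illustration of the collapsing step under dominant and recessive codings, with a plain Pearson chi-square on each table; it is not the authors' test, whose table-selection rule is not given in the abstract. The genotype counts are hypothetical.

```python
import math

def chi2_2x2(a, b, c, d):
    """Pearson chi-square (1 df, no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    stat = n * (a * d - b * c) ** 2 / denom
    # Survival function of a chi-square variable with 1 degree of freedom.
    p = math.erfc(math.sqrt(stat / 2))
    return stat, p

def collapse(cases, controls):
    """Collapse genotype counts (0, 1, 2 risk alleles) into dominant and recessive 2x2 tables."""
    dominant = (cases[0], cases[1] + cases[2], controls[0], controls[1] + controls[2])
    recessive = (cases[0] + cases[1], cases[2], controls[0] + controls[1], controls[2])
    return {"dominant": dominant, "recessive": recessive}

# Hypothetical genotype counts, for illustration only.
cases = (120, 150, 30)
controls = (160, 120, 20)
results = {model: chi2_2x2(*table) for model, table in collapse(cases, controls).items()}
for model, (stat, p) in results.items():
    print(f"{model}: chi2={stat:.2f}, p={p:.4f}")
```

    With these made-up counts the dominant coding gives a much stronger signal than the recessive one, which shows why the choice of 2 × 2 table matters when the true genetic model is unknown.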

  10. Complex Analysis of Urate Transporters SLC2A9, SLC22A12 and Functional Characterization of Non-Synonymous Allelic Variants of GLUT9 in the Czech Population: No Evidence of Effect on Hyperuricemia and Gout

    PubMed Central

    Hurba, Olha; Mancikova, Andrea; Krylov, Vladimir; Pavlikova, Marketa; Pavelka, Karel; Stibůrková, Blanka

    2014-01-01

    Objective Using European descent Czech populations, we performed a study of SLC2A9 and SLC22A12 genes previously identified as being associated with serum uric acid concentrations and gout. This is the first study of the impact of non-synonymous allelic variants on the function of GLUT9 except for patients suffering from renal hypouricemia type 2. Methods The cohort consisted of 250 individuals (150 controls, 54 nonspecific hyperuricemics and 46 primary gout and/or hyperuricemia subjects). We analyzed 13 exons of SLC2A9 (GLUT9 variant 1 and GLUT9 variant 2) and 10 exons of SLC22A12 by PCR amplification and direct sequencing. Allelic variants were prepared and their urate uptake and subcellular localization were studied using the Xenopus oocyte expression system. The functional studies were analyzed using the non-parametric Wilcoxon and Kruskal-Wallis tests; the association study used the Fisher exact test and a linear regression approach. Results We identified a total of 52 sequence variants (12 unpublished). Eight non-synonymous allelic variants were found only in SLC2A9: rs6820230, rs2276961, rs144196049, rs112404957, rs73225891, rs16890979, rs3733591 and rs2280205. None of these variants showed any significant difference in the expression of GLUT9 and in urate transport. In the association study, eight variants showed a possible association with hyperuricemia. However, seven of these were in introns and the one exon-located variant, rs7932775, did not show a statistically significant association with serum uric acid concentration. Conclusion Our results did not confirm any effect of SLC22A12 and SLC2A9 variants on serum uric acid concentration. Our complex approach using association analysis together with functional and immunohistochemical characterization of non-synonymous allelic variants did not show any influence on expression, subcellular localization and urate uptake of GLUT9. PMID:25268603

  11. VirVarSeq: a low-frequency virus variant detection pipeline for Illumina sequencing using adaptive base-calling accuracy filtering.

    PubMed

    Verbist, Bie M P; Thys, Kim; Reumers, Joke; Wetzels, Yves; Van der Borght, Koen; Talloen, Willem; Aerssens, Jeroen; Clement, Lieven; Thas, Olivier

    2015-01-01

    In virology, massively parallel sequencing (MPS) opens many opportunities for studying viral quasi-species, e.g. in HIV-1- and HCV-infected patients. This is essential for understanding pathways to resistance, which can substantially improve treatment. Although MPS platforms allow in-depth characterization of sequence variation, their measurements still involve substantial technical noise. For Illumina sequencing, single base substitutions are the main error source and impede powerful assessment of low-frequency mutations. Fortunately, base calls are complemented with quality scores (Qs) that are useful for differentiating errors from the real low-frequency mutations. A variant calling tool, Q-cpileup, is proposed, which exploits the Qs of nucleotides in a filtering strategy to increase specificity. The tool is imbedded in an open-source pipeline, VirVarSeq, which allows variant calling starting from fastq files. Using both plasmid mixtures and clinical samples, we show that Q-cpileup is able to reduce the number of false-positive findings. The filtering strategy is adaptive and provides an optimized threshold for individual samples in each sequencing run. Additionally, linkage information is kept between single-nucleotide polymorphisms as variants are called at the codon level. This enables virologists to have an immediate biological interpretation of the reported variants with respect to their antiviral drug responses. A comparison with existing SNP caller tools reveals that calling variants at the codon level with Q-cpileup results in an outstanding sensitivity while maintaining a good specificity for variants with frequencies down to 0.5%. The VirVarSeq is available, together with a user's guide and test data, at sourceforge: http://sourceforge.net/projects/virtools/?source=directory. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
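
    The role of quality scores in separating sequencing errors from real low-frequency mutations can be illustrated with the standard Phred definition, Q = -10·log10(P_error). The sketch below is not Q-cpileup: it assumes Phred+33 encoded FASTQ quality strings and uses a fixed, hypothetical threshold, whereas the abstract describes an adaptive threshold optimized per sample.

```python
def phred_to_error_prob(q):
    """Phred definition: Q = -10 * log10(P_error), solved for P_error."""
    return 10 ** (-q / 10)

def decode_phred33(quality_string):
    """Decode a FASTQ quality string, assuming the Phred+33 ASCII offset."""
    return [ord(ch) - 33 for ch in quality_string]

def mask_low_quality(bases, quals, min_q):
    """Replace base calls whose quality is below min_q with 'N'.
    A fixed threshold is used here purely for illustration."""
    return "".join(b if q >= min_q else "N" for b, q in zip(bases, quals))

# Hypothetical read and quality string.
quals = decode_phred33("II5#I")
masked = mask_low_quality("ACGTA", quals, min_q=20)
print(quals, masked, phred_to_error_prob(30))
```

    A base with Q=30 has an expected error probability of 1 in 1,000, which is of the same order as the lowest variant frequencies the abstract reports (0.5%); this is why unfiltered base calls make low-frequency variants hard to distinguish from noise.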

  12. Accounting for Population Structure in Gene-by-Environment Interactions in Genome-Wide Association Studies Using Mixed Models.

    PubMed

    Sul, Jae Hoon; Bilow, Michael; Yang, Wen-Yun; Kostem, Emrah; Furlotte, Nick; He, Dan; Eskin, Eleazar

    2016-03-01

    Although genome-wide association studies (GWASs) have discovered numerous novel genetic variants associated with many complex traits and diseases, those genetic variants typically explain only a small fraction of phenotypic variance. Factors that account for phenotypic variance include environmental factors and gene-by-environment interactions (GEIs). Recently, several studies have conducted genome-wide gene-by-environment association analyses and demonstrated important roles of GEIs in complex traits. One of the main challenges in these association studies is to control effects of population structure that may cause spurious associations. Many studies have analyzed how population structure influences statistics of genetic variants and developed several statistical approaches to correct for population structure. However, the impact of population structure on GEI statistics in GWASs has not been extensively studied and nor have there been methods designed to correct for population structure on GEI statistics. In this paper, we show both analytically and empirically that population structure may cause spurious GEIs and use both simulation and two GWAS datasets to support our finding. We propose a statistical approach based on mixed models to account for population structure on GEI statistics. We find that our approach effectively controls population structure on statistics for GEIs as well as for genetic variants.

  13. Gene expression allelic imbalance in ovine brown adipose tissue impacts energy homeostasis

    PubMed Central

    Ghazanfar, Shila; Vuocolo, Tony; Morrison, Janna L.; Nicholas, Lisa M.; McMillen, Isabella C.; Yang, Jean Y. H.; Buckley, Michael J.

    2017-01-01

    Heritable trait variation within a population of organisms is largely governed by DNA variations that impact gene transcription and protein function. Identifying genetic variants that affect complex functional traits is a primary aim of population genetics studies, especially in the context of human disease and agricultural production traits. The identification of alleles directly altering mRNA expression and thereby biological function is challenging due to difficulty in isolating direct effects of cis-acting genetic variations from indirect trans-acting genetic effects. Allele specific gene expression or allelic imbalance in gene expression (AI) occurring at heterozygous loci provides an opportunity to identify genes directly impacted by cis-acting genetic variants as indirect trans-acting effects equally impact the expression of both alleles. However, the identification of genes showing AI in the context of the expression of all genes remains a challenge due to a variety of technical and statistical issues. The current study focuses on the discovery of genes showing AI using single nucleotide polymorphisms as allelic reporters. By developing a computational and statistical process that addressed multiple analytical challenges, we ranked 5,809 genes for evidence of AI using RNA-Seq data derived from brown adipose tissue samples from a cohort of late gestation fetal lambs and then identified a conservative subgroup of 1,293 genes. Thus, AI was extensive, representing approximately 25% of the tested genes. Genes associated with AI were enriched for multiple Gene Ontology (GO) terms relating to lipid metabolism, mitochondrial function and the extracellular matrix. These functions suggest that cis-acting genetic variations causing AI in the population are preferentially impacting genes involved in energy homeostasis and tissue remodelling. These functions may contribute to production traits likely to be under genetic selection in the population. PMID:28665992

  14. TAPAS: tools to assist the targeted protein quantification of human alternative splice variants.

    PubMed

    Yang, Jae-Seong; Sabidó, Eduard; Serrano, Luis; Kiel, Christina

    2014-10-15

    In proteomes of higher eukaryotes, many alternative splice variants can only be detected by their shared peptides. This makes it highly challenging to use peptide-centric mass spectrometry to distinguish and to quantify protein isoforms resulting from alternative splicing events. We have developed two complementary algorithms based on linear mathematical models to efficiently compute a minimal set of shared and unique peptides needed to quantify a set of isoforms and splice variants. Further, we developed a statistical method to estimate the splice variant abundances based on stable isotope labeled peptide quantities. The algorithms and databases are integrated in a web-based tool, and we have experimentally tested the limits of our quantification method using spiked proteins and cell extracts. The TAPAS server is available at http://davinci.crg.es/tapas/. Contact: luis.serrano@crg.eu or christina.kiel@crg.eu. Supplementary data are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  15. Single-Item Measurement of Suicidal Behaviors: Validity and Consequences of Misclassification

    PubMed Central

    Millner, Alexander J.; Lee, Michael D.; Nock, Matthew K.

    2015-01-01

    Suicide is a leading cause of death worldwide. Although research has made strides in better defining suicidal behaviors, there has been less focus on accurate measurement. Currently, the widespread use of self-report, single-item questions to assess suicide ideation, plans and attempts may contribute to measurement problems and misclassification. We examined the validity of single-item measurement and the potential for statistical errors. Over 1,500 participants completed an online survey containing single-item questions regarding a history of suicidal behaviors, followed by questions with more precise language, multiple response options and narrative responses to examine the validity of single-item questions. We also conducted simulations to test whether common statistical tests are robust against the degree of misclassification produced by the use of single-items. We found that 11.3% of participants that endorsed a single-item suicide attempt measure engaged in behavior that would not meet the standard definition of a suicide attempt. Similarly, 8.8% of those who endorsed a single-item measure of suicide ideation endorsed thoughts that would not meet standard definitions of suicide ideation. Statistical simulations revealed that this level of misclassification substantially decreases statistical power and increases the likelihood of false conclusions from statistical tests. Providing a wider range of response options for each item reduced the misclassification rate by approximately half. Overall, the use of single-item, self-report questions to assess the presence of suicidal behaviors leads to misclassification, increasing the likelihood of statistical decision errors. Improving the measurement of suicidal behaviors is critical to increase understanding and prevention of suicide. PMID:26496707

  16. Rare ADH Variant Constellations are Specific for Alcohol Dependence

    PubMed Central

    Zuo, Lingjun; Zhang, Heping; Malison, Robert T.; Li, Chiang-Shan R.; Zhang, Xiang-Yang; Wang, Fei; Lu, Lingeng; Lu, Lin; Wang, Xiaoping; Krystal, John H.; Zhang, Fengyu; Deng, Hong-Wen; Luo, Xingguang

    2013-01-01

    Aims: Some of the well-known functional alcohol dehydrogenase (ADH) gene variants (e.g. ADH1B*2, ADH1B*3 and ADH1C*2) that significantly affect the risk of alcohol dependence are rare variants in most populations. In the present study, we comprehensively examined the associations between rare ADH variants [minor allele frequency (MAF) <0.05] and alcohol dependence, with several other neuropsychiatric and neurological disorders as reference. Methods: A total of 49,358 subjects in 22 independent cohorts with 11 different neuropsychiatric and neurological disorders were analyzed, including 3 cohorts with alcohol dependence. The entire ADH gene cluster (ADH7–ADH1C–ADH1B–ADH1A–ADH6–ADH4–ADH5 at Chr4) was imputed in all samples using the same reference panels that included whole-genome sequencing data. We stringently cleaned the phenotype and genotype data to obtain a total of 870 single nucleotide polymorphisms with 0< MAF <0.05 for association analysis. Results: We found that a rare variant constellation across the entire ADH gene cluster was significantly associated with alcohol dependence in European-Americans (Fp1: simulated global P = 0.045), European-Australians (Fp5: global P = 0.027; collapsing: P = 0.038) and African-Americans (Fp5: global P = 0.050; collapsing: P = 0.038), but not with any other neuropsychiatric disease. Association signals in this region came principally from ADH6, ADH7, ADH1B and ADH1C. In particular, a rare ADH6 variant constellation showed a replicable association with alcohol dependence across these three independent cohorts. No individual rare variants were statistically significantly associated with any disease examined after group- and region-wide correction for multiple comparisons. Conclusion: We conclude that rare ADH variants are specific for alcohol dependence. The ADH gene cluster may harbor a causal variant(s) for alcohol dependence. PMID:23019235
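
    The inclusion rule 0 < MAF < 0.05 quoted above can be sketched as a simple filter. This is a minimal illustration assuming genotypes are coded as counts of the alternate allele (0, 1 or 2) in diploid individuals; the variant names and genotype vectors are hypothetical, and the imputation and cleaning steps described in the abstract are not reproduced.

```python
def minor_allele_frequency(dosages):
    """MAF from per-individual allele counts (0, 1 or 2 copies of the alternate allele)."""
    alt_freq = sum(dosages) / (2 * len(dosages))
    return min(alt_freq, 1 - alt_freq)

def is_rare(dosages, upper=0.05):
    """True when 0 < MAF < upper, the inclusion rule quoted in the abstract."""
    maf = minor_allele_frequency(dosages)
    return 0 < maf < upper

# Hypothetical genotype vectors for 50 diploid individuals (100 chromosomes).
variants = {
    "rare_variant": [1] * 2 + [0] * 48,     # 2 alternate alleles, MAF 0.02
    "common_variant": [1] * 20 + [0] * 30,  # 20 alternate alleles, MAF 0.20
    "monomorphic": [0] * 50,                # no alternate alleles, MAF 0
}
kept = [name for name, dosages in variants.items() if is_rare(dosages)]
print(kept)
```

    The strict lower bound drops monomorphic sites, which carry no information for an association test.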

  17. Attrition from Web-Based Cognitive Testing: A Repeated Measures Comparison of Gamification Techniques

    PubMed Central

    Skinner, Andy; Coyle, David; Lawrence, Natalia; Munafo, Marcus

    2017-01-01

    Background The prospect of assessing cognition longitudinally and remotely is attractive to researchers, health practitioners, and pharmaceutical companies alike. However, such repeated testing regimes place a considerable burden on participants, and with cognitive tasks typically being regarded as effortful and unengaging, these studies may experience high levels of participant attrition. One potential solution is to gamify these tasks to make them more engaging: increasing participant willingness to take part and reducing attrition. However, such an approach must balance task validity with the introduction of entertaining gamelike elements. Objective This study aims to investigate the effects of gamelike features on participant attrition using a between-subjects, longitudinal Web-based testing study. Methods We used three variants of a common cognitive task, the Stop Signal Task (SST), with a single gamelike feature in each: one variant where points were rewarded for performing optimally; another where the task was given a graphical theme; and a third variant, which was a standard SST and served as a control condition. Participants completed four compulsory test sessions over 4 consecutive days before entering a 6-day voluntary testing period where they faced a daily decision to either drop out or continue taking part. Participants were paid for each session they completed. Results A total of 482 participants signed up to take part in the study, with 265 completing the requisite four consecutive test sessions. No evidence of an effect of gamification on attrition was observed. A log-rank test showed no evidence of a difference in dropout rates between task variants (χ²(2)=3.0, P=.22), and a one-way analysis of variance of the mean number of sessions completed per participant in each variant also showed no evidence of a difference (F(2,262)=1.534, P=.21, partial η²=0.012). Conclusions Our findings raise doubts about the ability of gamification to reduce attrition from longitudinal cognitive testing studies. PMID:29167090

  18. Regionally variant collagen alignment correlates with viscoelastic properties of the disc of the human temporomandibular joint.

    PubMed

    Gutman, Shawn; Kim, Daniel; Tarafder, Solaiman; Velez, Sergio; Jeong, Julia; Lee, Chang H

    2018-02-01

    To determine the regionally variant quality of collagen alignment in human TMJ discs and its statistical correlation with viscoelastic properties. For quantitative analysis of the quality of collagen alignment, horizontal sections of human TMJ discs stained with Picrosirius Red were imaged under circularly polarized microscopy. Mean angle and angular deviation of collagen fibers in each region were analyzed using a well-established automated image-processing method for angular gradient. Instantaneous and relaxation moduli of each disc region were measured in stress-relaxation tests in both tension and compression. Spearman correlation analysis was then performed between the angular deviation and the moduli. To understand the effect of glycosaminoglycans on the correlation, TMJ disc samples were treated with chondroitinase ABC (C-ABC). Our image-processing analysis showed the region-variant direction of collagen alignment, consistent with previous findings. Interestingly, the quality of collagen alignment, not only the direction, differed significantly between regions. The angular deviation of fiber alignment in the anterior and intermediate regions was significantly smaller than in the posterior region. Medial and lateral regions showed significantly larger angular deviation than all the other regions. The regionally variant angular deviation values showed a statistically significant correlation with the tensile instantaneous modulus and the relaxation modulus, partially dependent on C-ABC treatment. Our findings suggest that the region-variant degree of collagen fiber alignment is likely linked to the heterogeneous viscoelastic properties of the TMJ disc, which may have significant implications for the development of regenerative therapy for the TMJ disc. Copyright © 2017 Elsevier Ltd. All rights reserved.
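
    Mean angle and angular deviation of fiber orientations are circular statistics, and fiber orientations are axial (an angle of 0° and 180° describe the same fiber). The sketch below shows one common way to compute them, using the angle-doubling convention and the angular deviation sqrt(2(1 - R)) from the mean resultant length R. Whether the image-processing method cited in the abstract uses exactly these definitions is an assumption; the orientation values are hypothetical.

```python
import math

def axial_mean_and_deviation(angles_deg):
    """Mean orientation and angular deviation (both in degrees) for axial data in [0, 180)."""
    # Double the angles so that 0 and 180 degrees map to the same direction.
    doubled = [math.radians(2 * a) for a in angles_deg]
    c = sum(math.cos(t) for t in doubled) / len(doubled)
    s = sum(math.sin(t) for t in doubled) / len(doubled)
    r = math.hypot(c, s)  # mean resultant length, between 0 and 1
    mean = (math.degrees(math.atan2(s, c)) / 2) % 180
    # Angular deviation on the doubled scale, halved to return to the original scale.
    deviation = math.degrees(math.sqrt(2 * (1 - r))) / 2
    return mean, deviation

# Hypothetical fiber orientations: one well-aligned region, one dispersed region.
aligned_mean, aligned_dev = axial_mean_and_deviation([88, 90, 92])
dispersed_mean, dispersed_dev = axial_mean_and_deviation([10, 60, 120, 170])
print(f"aligned: mean={aligned_mean:.1f}, deviation={aligned_dev:.1f}")
print(f"dispersed: mean={dispersed_mean:.1f}, deviation={dispersed_dev:.1f}")
```

    A smaller angular deviation indicates tighter alignment, which is the quantity the abstract correlates with the instantaneous and relaxation moduli.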

  19. Pleiotropic associations of risk variants identified for other cancers with lung cancer risk: the PAGE and TRICL consortia.

    PubMed

    Park, S Lani; Fesinmeyer, Megan D; Timofeeva, Maria; Caberto, Christian P; Kocarnik, Jonathan M; Han, Younghun; Love, Shelly-Ann; Young, Alicia; Dumitrescu, Logan; Lin, Yi; Goodloe, Robert; Wilkens, Lynne R; Hindorff, Lucia; Fowke, Jay H; Carty, Cara; Buyske, Steven; Schumacher, Frederick R; Butler, Anne; Dilks, Holli; Deelman, Ewa; Cote, Michele L; Chen, Wei; Pande, Mala; Christiani, David C; Field, John K; Bickeböller, Heike; Risch, Angela; Heinrich, Joachim; Brennan, Paul; Wang, Yufei; Eisen, Timothy; Houlston, Richard S; Thun, Michael; Albanes, Demetrius; Caporaso, Neil; Peters, Ulrike; North, Kari E; Heiss, Gerardo; Crawford, Dana C; Bush, William S; Haiman, Christopher A; Landi, Maria Teresa; Hung, Rayjean J; Kooperberg, Charles; Amos, Christopher I; Le Marchand, Loïc; Cheng, Iona

    2014-04-01

    Genome-wide association studies have identified hundreds of genetic variants associated with specific cancers. A few of these risk regions have been associated with more than one cancer site; however, a systematic evaluation of the associations between risk variants for other cancers and lung cancer risk has yet to be performed. We included 18023 patients with lung cancer and 60543 control subjects from two consortia, Population Architecture using Genomics and Epidemiology (PAGE) and Transdisciplinary Research in Cancer of the Lung (TRICL). We examined 165 single-nucleotide polymorphisms (SNPs) that were previously associated with at least one of 16 non-lung cancer sites. Study-specific logistic regression results underwent meta-analysis, and associations were also examined by race/ethnicity, histological cell type, sex, and smoking status. A Bonferroni-corrected P value of 2.5×10(-5) was used to assign statistical significance. The breast cancer SNP LSP1 rs3817198 was associated with an increased risk of lung cancer (odds ratio [OR] = 1.10; 95% confidence interval [CI] = 1.05 to 1.14; P = 2.8×10(-6)). This association was strongest for women with adenocarcinoma (P = 1.2×10(-4)) and not statistically significant in men (P = .14) with this cell type (P het by sex = .10). Two glioma risk variants, TERT rs2853676 and CDKN2BAS1 rs4977756, which are located in regions previously associated with lung cancer, were associated with increased risk of adenocarcinoma (OR = 1.16; 95% CI = 1.10 to 1.22; P = 1.1×10(-8)) and squamous cell carcinoma (OR = 1.13; CI = 1.07 to 1.19; P = 2.5×10(-5)), respectively. Our findings demonstrate a novel pleiotropic association between the breast cancer LSP1 risk region marked by variant rs3817198 and lung cancer risk.

  20. Pathogenic variants screening in seventeen candidate genes on 2p15 for association with ankylosing spondylitis in a Han Chinese population.

    PubMed

    Wang, Mengmeng; Xin, Lihong; Cai, Guoqi; Zhang, Xu; Yang, Xiao; Li, Xiaona; Xia, Qing; Wang, Li; Xu, Shengqian; Xu, Jianhua; Shuai, Zongwen; Ding, Changhai; Pan, Faming

    2017-01-01

    Previous studies have found an association between rs10865331 in the 2p15 area and ankylosing spondylitis (AS). This study aimed to identify additional functional genetic variants in the 2p15 region associated with AS susceptibility. We used next generation sequencing (NGS) in 100 AS cases and 100 healthy controls to screen for AS susceptibility variants, and validated these variants in 620 cases and 620 controls using the imLDR™ technique for single nucleotide polymorphism (SNP) genotyping. In total, we identified 12 SNPs that might confer susceptibility to AS. Of those SNPs, three (rs14170, rs2123111 and rs1729674) were nominally associated (P<0.05) with AS, but were no longer statistically significant after Bonferroni correction. After stratification by gender, another two SNPs (rs11428092 and rs10208769 in USP34) were associated with AS in males but not females, though this was not statistically significant after Bonferroni correction. In addition, rs1729674, rs14170, rs2123111 and rs10208769 were in strong linkage disequilibrium (LD) and were further included in haplotype analysis. A novel haplotype TAGA was found to be associated with a decreased risk of AS (odds ratio (OR) (95% confidence interval (CI)) = 0.832 (0.705-0.982)). Beyond that, we also demonstrated a strong relationship between rs10865331 and AS susceptibility (OR (95% CI) = 1.303 (1.111-1.526)). rs14170 and rs2123111 in USP34 and rs1729674 in C2orf74 may be associated with AS susceptibility in the Han Chinese population. USP34 and C2orf74 in the 2p15 region may be novel AS susceptibility genes.

  1. White matter pathology in ALS and lower motor neuron ALS variants: a diffusion tensor imaging study using tract-based spatial statistics.

    PubMed

    Prudlo, Johannes; Bißbort, Charlotte; Glass, Aenne; Grossmann, Annette; Hauenstein, Karlheinz; Benecke, Reiner; Teipel, Stefan J

    2012-09-01

    The aim of this work was to investigate white-matter microstructural changes within and outside the corticospinal tract in classical amyotrophic lateral sclerosis (ALS) and in lower motor neuron (LMN) ALS variants by means of diffusion tensor imaging (DTI). We investigated 22 ALS patients and 21 age-matched controls utilizing a whole-brain approach with a 1.5-T scanner for DTI. The patient group was comprised of 15 classical ALS- and seven LMN ALS-variant patients (progressive muscular atrophy, flail arm and flail leg syndrome). Disease severity was measured by the revised version of the functional rating scale. White matter fractional anisotropy (FA) was assessed using tract-based spatial statistics (TBSS) and a region of interest (ROI) approach. We found significant FA reductions in motor and extra-motor cerebral fiber tracts in classical ALS and in the LMN ALS-variant patients compared to controls. The voxel-based TBSS results were confirmed by the ROI findings. The white matter damage correlated with the disease severity in the patient group and was found in a similar distribution, but to a lesser extent, among the LMN ALS-variant subgroup. ALS and LMN ALS variants are multisystem degenerations. DTI shows the potential to determine an earlier diagnosis, particularly in LMN ALS variants. The statistically identical findings of white matter lesions in classical ALS and LMN variants as ascertained by DTI further underline that these variants should be regarded as part of the ALS spectrum.

  2. TCF7L2 Genetic Variants Contribute to Phenotypic Heterogeneity of Type 1 Diabetes.

    PubMed

    Redondo, Maria J; Geyer, Susan; Steck, Andrea K; Sosenko, Jay; Anderson, Mark; Antinozzi, Peter; Michels, Aaron; Wentworth, John; Xu, Ping; Pugliese, Alberto

    2018-02-01

    The phenotypic diversity of type 1 diabetes suggests heterogeneous etiopathogenesis. We investigated the relationship of type 2 diabetes-associated transcription factor 7 like 2 (TCF7L2) single nucleotide polymorphisms (SNPs) with immunologic and metabolic characteristics at type 1 diabetes diagnosis. We studied TrialNet participants with newly diagnosed autoimmune type 1 diabetes with available TCF7L2 rs4506565 and rs7901695 SNP data (n = 810; median age 13.6 years; range 3.3-58.6). We modeled the influence of carrying a TCF7L2 variant (i.e., having 1 or 2 minor alleles) on the number of islet autoantibodies and oral glucose tolerance test (OGTT)-stimulated C-peptide and glucose measures at diabetes diagnosis. All analyses were adjusted for known confounders. The rs4506565 variant was a significant independent factor of expressing a single autoantibody, instead of multiple autoantibodies, at diagnosis (odds ratio [OR] 1.66 [95% CI 1.07, 2.57], P = 0.024). Interaction analysis demonstrated that this association was only significant in participants ≥12 years old (n = 504; OR 2.12 [1.29, 3.47], P = 0.003) but not younger ones (n = 306, P = 0.73). The rs4506565 variant was independently associated with higher C-peptide area under the curve (AUC) (P = 0.008) and lower mean glucose AUC (P = 0.0127). The results were similar for the rs7901695 SNP. In this cohort of individuals with new-onset type 1 diabetes, type 2 diabetes-linked TCF7L2 variants were associated with single autoantibody (among those ≥12 years old), higher C-peptide AUC, and lower glucose AUC levels during an OGTT. Thus, carriers of the TCF7L2 variant had a milder immunologic and metabolic phenotype at type 1 diabetes diagnosis, which could be partly driven by type 2 diabetes-like pathogenic mechanisms. © 2017 by the American Diabetes Association.

  3. A meta-analysis of multiple myeloma risk regions in African and European ancestry populations identifies putatively functional loci

    PubMed Central

    Rand, Kristin A.; Song, Chi; Dean, Eric; Serie, Daniel J.; Curtin, Karen; Sheng, Xin; Hu, Donglei; Huff, Carol Ann; Bernal-Mizrachi, Leon; Tomasson, Michael H.; Ailawadhi, Sikander; Singhal, Seema; Pawlish, Karen; Peters, Edward S.; Bock, Cathryn H.; Stram, Alex; Van Den Berg, David J.; Edlund, Christopher K.; Conti, David V.; Zimmerman, Todd; Hwang, Amie E.; Huntsman, Scott; Graff, John; Nooka, Ajay; Kong, Yinfei; Pregja, Silvana L.; Berndt, Sonja I.; Blot, William J.; Carpten, John; Casey, Graham; Chu, Lisa; Diver, W. Ryan; Stevens, Victoria L.; Lieber, Michael R.; Goodman, Phyllis J.; Hennis, Anselm J.M.; Hsing, Ann W.; Mehta, Jayesh; Kittles, Rick A.; Kolb, Suzanne; Klein, Eric A.; Leske, Cristina; Murphy, Adam B.; Nemesure, Barbara; Neslund-Dudas, Christine; Strom, Sara S.; Vij, Ravi; Rybicki, Benjamin A.; Stanford, Janet L.; Signorello, Lisa B.; Witte, John S.; Ambrosone, Christine B.; Bhatti, Parveen; John, Esther M.; Bernstein, Leslie; Zheng, Wei; Olshan, Andrew F.; Hu, Jennifer J.; Ziegler, Regina G.; Nyante, Sarah J.; Bandera, Elisa V.; Birmann, Brenda M.; Ingles, Sue A.; Press, Michael F.; Atanackovic, Djordje; Glenn, Martha J.; Cannon-Albright, Lisa A.; Jones, Brandt; Tricot, Guido; Martin, Thomas G.; Kumar, Shaji K.; Wolf, Jeffrey L.; Deming, Sandra L.; Rothman, Nathaniel; Brooks-Wilson, Angela R.; Rajkumar, S. Vincent; Kolonel, Laurence N.; Chanock, Stephen J.; Slager, Susan L.; Severson, Richard K.; Janakiraman, Nalini; Terebelo, Howard R.; Brown, Elizabeth E.; De Roos, Anneclaire J.; Mohrbacher, Ann F.; Colditz, Graham A.; Giles, Graham G.; Spinelli, John J.; Chiu, Brian C.; Munshi, Nikhil C.; Anderson, Kenneth C.; Levy, Joan; Zonder, Jeffrey A.; Orlowski, Robert Z.; Lonial, Sagar; Camp, Nicola J.; Vachon, Celine M.; Ziv, Elad; Stram, Daniel O.; Hazelett, Dennis J.; Haiman, Christopher A.; Cozen, Wendy

    2017-01-01

    Background Genome-wide association studies (GWAS) in European populations have identified genetic risk variants associated with multiple myeloma (MM). Methods We performed association testing of common variation in eight regions in 1,264 MM patients and 1,479 controls of European ancestry (EA) and 1,305 MM patients and 7,078 controls of African ancestry (AA) and conducted a meta-analysis to localize the signals, with epigenetic annotation used to predict functionality. Results We found that variants in 7p15.3, 17p11.2, 22q13.1 were statistically significantly (p<0.05) associated with MM risk in AAs and EAs and the variant in 3p22.1 was associated in EAs only. In a combined AA-EA meta-analysis, variation in five regions (2p23.3, 3p22.1, 7p15.3, 17p11.2, 22q13.1) was statistically significantly associated with MM risk. In 3p22.1, the correlated variants clustered within the gene body of ULK4. Correlated variants in 7p15.3 clustered around an enhancer at the 3′ end of the CDCA7L transcription termination site. A missense variant at 17p11.2 (rs34562254, Pro251Leu, OR=1.32, p=2.93×10−7) lies in TNFRSF13B, which encodes a lymphocyte-specific protein in the tumor necrosis factor receptor family that interacts with the NF-κB pathway. SNPs correlated with the index signal in 22q13.1 cluster around the promoter and enhancer regions of CBX7. Conclusions We found that reported MM susceptibility regions contain risk variants important across populations, supporting the use of multiple racial/ethnic groups with different underlying genetic architecture to enhance the localization and identification of putatively functional alleles. Impact A subset of reported risk loci for multiple myeloma have consistent effects across populations and are likely to be functional. PMID:27587788

  4. Using high-resolution variant frequencies to empower clinical genome interpretation.

    PubMed

    Whiffin, Nicola; Minikel, Eric; Walsh, Roddy; O'Donnell-Luria, Anne H; Karczewski, Konrad; Ing, Alexander Y; Barton, Paul J R; Funke, Birgit; Cook, Stuart A; MacArthur, Daniel; Ware, James S

    2017-10-01

    Purpose: Whole-exome and whole-genome sequencing have transformed the discovery of genetic variants that cause human Mendelian disease, but discriminating pathogenic from benign variants remains a daunting challenge. Rarity is recognized as a necessary, although not sufficient, criterion for pathogenicity, but frequency cutoffs used in Mendelian analysis are often arbitrary and overly lenient. Recent very large reference datasets, such as the Exome Aggregation Consortium (ExAC), provide an unprecedented opportunity to obtain robust frequency estimates even for very rare variants. Methods: We present a statistical framework for the frequency-based filtering of candidate disease-causing variants, accounting for disease prevalence, genetic and allelic heterogeneity, inheritance mode, penetrance, and sampling variance in reference datasets. Results: Using the example of cardiomyopathy, we show that our approach reduces by two-thirds the number of candidate variants under consideration in the average exome, without removing true pathogenic variants (false-positive rate < 0.001). Conclusion: We outline a statistically robust framework for assessing whether a variant is "too common" to be causative for a Mendelian disorder of interest. We present precomputed allele frequency cutoffs for all variants in the ExAC dataset.
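
    As a rough illustration of frequency-based filtering, the sketch below computes a maximum credible population allele frequency for a dominant disorder and the largest allele count in a reference sample that is still compatible with it. The formula (prevalence × allelic contribution × genetic contribution / (2 × penetrance)), the Poisson bound, the function names, and every parameter value are assumptions chosen for illustration; the abstract itself does not state them.

```python
import math


def max_credible_af(prevalence, allelic_contribution, genetic_contribution, penetrance):
    """Upper bound on the population allele frequency of a causal variant
    under a dominant model (assumed formula, not taken from the abstract)."""
    return prevalence * allelic_contribution * genetic_contribution / (2.0 * penetrance)


def poisson_cdf(k, mean):
    """P(X <= k) for X ~ Poisson(mean)."""
    term = math.exp(-mean)
    total = term
    for i in range(1, k + 1):
        term *= mean / i
        total += term
    return total


def max_allele_count(af, n_chromosomes, confidence=0.95):
    """Smallest count k with P(X <= k) >= confidence when X ~ Poisson(af * n)."""
    mean = af * n_chromosomes
    k = 0
    while poisson_cdf(k, mean) < confidence:
        k += 1
    return k


# Hypothetical inputs: prevalence 1 in 500, at most 2% of cases per variant,
# one gene assumed to explain all cases, 50% penetrance.
af = max_credible_af(1 / 500, 0.02, 1.0, 0.5)
# Hypothetical reference panel of 60,000 individuals (120,000 chromosomes).
cutoff = max_allele_count(af, 120_000)
print(f"max credible AF = {af:.2e}, max tolerated allele count = {cutoff}")
```

    Under these assumptions, a variant seen more often than `cutoff` times in the reference panel would be treated as too common to be causative; the sampling-variance step is what keeps rare true positives from being filtered out by chance.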

  5. Listeners' processing of a given reduced word pronunciation variant directly reflects their exposure to this variant: Evidence from native listeners and learners of French.

    PubMed

    Brand, Sophie; Ernestus, Mirjam

    2018-05-01

    In casual conversations, words often lack segments. This study investigates whether listeners rely on their experience with reduced word pronunciation variants during the processing of single segment reduction. We tested three groups of listeners in a lexical decision experiment with French words produced either with or without word-medial schwa (e.g., /ʀəvy/ and /ʀvy/ for revue). Participants also rated the relative frequencies of the two pronunciation variants of the words. If the recognition accuracy and reaction times (RTs) for a given listener group correlate best with the frequencies of occurrence holding for that given listener group, recognition is influenced by listeners' exposure to these variants. Native listeners' relative frequency ratings correlated well with their accuracy scores and RTs. Dutch advanced learners' accuracy scores and RTs were best predicted by their own ratings. In contrast, the accuracy and RTs from Dutch beginner learners of French could not be predicted by any relative frequency rating; the rating task was probably too difficult for them. The participant groups showed behaviour reflecting their difference in experience with the pronunciation variants. Our results strongly suggest that listeners store the frequencies of occurrence of pronunciation variants, and consequently the variants themselves.

  6. graph-GPA: A graphical model for prioritizing GWAS results and investigating pleiotropic architecture.

    PubMed

    Chung, Dongjun; Kim, Hang J; Zhao, Hongyu

    2017-02-01

    Genome-wide association studies (GWAS) have identified tens of thousands of genetic variants associated with hundreds of phenotypes and diseases, which have provided clinical and medical benefits to patients with novel biomarkers and therapeutic targets. However, identification of risk variants associated with complex diseases remains challenging, as these diseases are often affected by many genetic variants with small or moderate effects. There has been accumulating evidence suggesting that different complex traits share a common risk basis, namely pleiotropy. Recently, several statistical methods have been developed to improve statistical power to identify risk variants for complex traits through a joint analysis of multiple GWAS datasets by leveraging pleiotropy. While these methods were shown to improve statistical power for association mapping compared to separate analyses, they are still limited in the number of phenotypes that can be integrated. In order to address this challenge, in this paper, we propose a novel statistical framework, graph-GPA, to integrate a large number of GWAS datasets for multiple phenotypes using a hidden Markov random field approach. Application of graph-GPA to a joint analysis of GWAS datasets for 12 phenotypes shows that graph-GPA improves statistical power to identify risk variants compared to statistical methods based on a smaller number of GWAS datasets. In addition, graph-GPA also promotes better understanding of genetic mechanisms shared among phenotypes, which can potentially be useful for the development of improved diagnosis and therapeutics. The R implementation of graph-GPA is currently available at https://dongjunchung.github.io/GGPA/.

  7. A quadratically regularized functional canonical correlation analysis for identifying the global structure of pleiotropy with NGS data

    PubMed Central

    Zhu, Yun; Fan, Ruzong; Xiong, Momiao

    2017-01-01

    Investigating the pleiotropic effects of genetic variants can increase statistical power, provide important information to achieve deep understanding of the complex genetic structures of disease, and offer powerful tools for designing effective treatments with fewer side effects. However, the current multiple phenotype association analysis paradigm lacks breadth (number of phenotypes and genetic variants jointly analyzed at the same time) and depth (hierarchical structure of phenotype and genotypes). A key issue for high dimensional pleiotropic analysis is to effectively extract informative internal representation and features from high dimensional genotype and phenotype data. To explore correlation information of genetic variants, effectively reduce data dimensions, and overcome critical barriers in advancing the development of novel statistical methods and computational algorithms for genetic pleiotropic analysis, we proposed a new statistical method referred to as a quadratically regularized functional CCA (QRFCCA) for association analysis which combines three approaches: (1) quadratically regularized matrix factorization, (2) functional data analysis and (3) canonical correlation analysis (CCA). Large-scale simulations show that the QRFCCA has a much higher power than that of the ten competing statistics while retaining appropriate type I error rates. To further evaluate performance, the QRFCCA and ten other statistics are applied to the whole genome sequencing dataset from the TwinsUK study. We identify a total of 79 genes with rare variants and 67 genes with common variants significantly associated with the 46 traits using QRFCCA. The results show that the QRFCCA substantially outperforms the ten other statistics. PMID:29040274

  8. Generalizing Terwilliger's likelihood approach: a new score statistic to test for genetic association.

    PubMed

    el Galta, Rachid; Uitte de Willige, Shirley; de Visser, Marieke C H; Helmer, Quinta; Hsu, Li; Houwing-Duistermaat, Jeanine J

    2007-09-24

    In this paper, we propose a one degree of freedom test for association between a candidate gene and a binary trait. This method is a generalization of Terwilliger's likelihood ratio statistic and is especially powerful for the situation of one associated haplotype. As an alternative to the likelihood ratio statistic, we derive a score statistic, which has a tractable expression. For haplotype analysis, we assume that phase is known. By means of a simulation study, we compare the performance of the score statistic to Pearson's chi-square statistic and the likelihood ratio statistic proposed by Terwilliger. We illustrate the method on three candidate genes studied in the Leiden Thrombophilia Study. We conclude that the statistic follows a chi square distribution under the null hypothesis and that the score statistic is more powerful than Terwilliger's likelihood ratio statistic when the associated haplotype has frequency between 0.1 and 0.4 and has a small impact on the studied disorder. With regard to Pearson's chi-square statistic, the score statistic has more power when the associated haplotype has frequency above 0.2 and the number of variants is above five.

  9. Identification of Metabolic Modifiers That Underlie Phenotypic Variations in Energy-Balance Regulation

    PubMed Central

    Chang, Chia Lin; Cai, James J.; Cheng, Po Jen; Chueh, Ho Yen; Hsu, Sheau Yu Teddy

    2011-01-01

    OBJECTIVE Although recent studies have shown that human genomes contain hundreds of loci that exhibit signatures of positive selection, variants that are associated with adaptation in energy-balance regulation remain elusive. We reasoned that the difficulty in identifying such variants could be due to heterogeneity in selection pressure and that an integrative approach that incorporated experiment-based evidence and population genetics-based statistical judgments would be needed to reveal important metabolic modifiers in humans. RESEARCH DESIGN AND METHODS To identify common metabolic modifiers that underlie phenotypic variation in diabetes-associated or obesity-associated traits in humans, or both, we screened 207 candidate loci for regulatory single nucleotide polymorphisms (SNPs) that exhibited evidence of gene–environmental interactions. RESULTS Three SNPs (rs3895874, rs3848460, and rs937301) at the 5′ gene region of human GIP were identified as prime metabolic-modifier candidates at the enteroinsular axis. Functional studies have shown that GIP promoter reporters carrying derived alleles of these three SNPs (haplotype GIP−1920A) have significantly lower transcriptional activities than those with ancestral alleles at corresponding positions (haplotype GIP−1920G). Consistently, studies of pregnant women who have undergone a screening test for gestational diabetes have shown that patients with a homozygous GIP−1920A/A genotype have significantly lower serum concentrations of glucose-dependent insulinotropic polypeptide (GIP) than those carrying an ancestral GIP−1920G haplotype. After controlling for a GIPR variation, we showed that serum glucose concentrations of patients carrying GIP−1920A/A homozygotes are significantly higher than those of patients carrying an ancestral GIP−1920G haplotype (odds ratio 3.53). CONCLUSIONS Our proof-of-concept study indicates that common regulatory GIP variants impart a difference in GIP and glucose metabolism. The study also provides a rare example that identified the common variant-common phenotypic variation pattern based on evidence of moderate gene–environmental interactions. PMID:21300845

  10. [Phenotypic and genotypic spectra of patients with glucose-6-phosphate dehydrogenase deficiency gene known pathogenic variants: a single-center study].

    PubMed

    Chen, X; Yang, L; Wang, H J; Wu, B B; Lu, Y L; Dong, X R; Zhou, W H

    2018-05-02

    Objective: To analyze the hotspots of known pathogenic disease-causing variants of glucose-6-phosphate dehydrogenase (G6PD) and the phenotype spectrum of neonatal patients with known pathogenic disease-causing variants of G6PD. Methods: The known pathogenic disease-causing variants of G6PD were collected from the Human Gene Mutation Database. Screening was performed for these variants among the 7,966 cases (2,357 neonatal, 5,609 non-neonatal) in the sequencing database at the Molecular Diagnosis Center, Children's Hospital of Fudan University. All these samples were from patients with a suspected genetic disorder. The database contained whole-exome sequencing data and clinical exome sequencing data. We screened out the patients with known pathogenic disease-causing variants of G6PD, and analyzed the hotspots of G6PD and the phenotype spectrum of neonatal patients with known pathogenic disease-causing variants of G6PD. Results: (1) Among the next generation sequencing data of the 7,966 samples, 86 samples (1.1%) were detected as positive for the known pathogenic disease-causing variants of G6PD (positive sample set). In the positive sample set, 51 patients (33 males, 18 females) were newborn babies. Forty-three patients (26 males, 17 females) had G6PD enzyme activity data. (2) Among the 86 samples, Arg463His, Arg459Leu, Leu342Phe, and Val291Met were the 4 leading disease-causing variants, found in 72 samples (84%). (3) Male neonatal patients with the same variant showed statistically significant differences in enzyme activity: among 13 patients with Arg463His, enzyme activity of 9 patients was ranked as grade Ⅲ, 1 case ranked as Ⅳ, and 3 cases had no activity data; among 10 patients with Arg459Leu, enzyme activity of 4 patients was ranked as Ⅱ, 4 cases ranked as Ⅲ, and 2 cases had no activity data; among 2 patients with His32Arg, enzyme activity of one patient was ranked as Ⅱ and the other as Ⅲ. Male neonatal patients with the same mutation and enzyme activity also showed statistically significant differences in phenotype spectrum: among 9 patients with Arg463His and level Ⅲ enzyme activity, 6 presented hyperbilirubinemia, 2 met the criteria for exchange transfusion therapy, and 2 showed hemolysis; among 4 patients with Arg459Leu and level Ⅱ enzyme activity, 3 presented hyperbilirubinemia; among 4 patients with Arg459Leu and level Ⅲ enzyme activity, 2 presented hyperbilirubinemia and 1 met the standard for exchange transfusion therapy; among 3 patients with Val291Met and level Ⅲ enzyme activity, 1 presented hyperbilirubinemia. Conclusions: Arg463His, Arg459Leu, Leu342Phe, and Val291Met were the hotspot variants of G6PD. Patients with the same G6PD variant and sex present different phenotypes, and patients with the same G6PD variant, sex, and enzyme activity also present different phenotypes.

  11. Haplotypic Analysis of Wellcome Trust Case Control Consortium Data

    PubMed Central

    Browning, Brian L.; Browning, Sharon R.

    2008-01-01

    We applied a recently developed multilocus association testing method (localized haplotype clustering) to Wellcome Trust Case Control Consortium data (14,000 cases of seven common diseases and 3,000 shared controls genotyped on the Affymetrix 500K array). After rigorous data quality filtering, we identified three disease-associated loci with strong statistical support from localized haplotype cluster tests but with only marginal significance in single marker tests. These loci are chromosomes 10p15.1 with type 1 diabetes (p = 5.1 × 10^-9), 12q15 with type 2 diabetes (p = 1.9 × 10^-7) and 15q26.2 with hypertension (p = 2.8 × 10^-8). We also detected the association of chromosome 9p21.3 with type 2 diabetes (p = 2.8 × 10^-8), although this locus did not pass our stringent genotype quality filters. The association of 10p15.1 with type 1 diabetes and 9p21.3 with type 2 diabetes have both been replicated in other studies using independent data sets. Overall, localized haplotype cluster analysis had better success detecting disease associated variants than a previous single-marker analysis of imputed HapMap SNPs. We found that stringent application of quality score thresholds to genotype data substantially reduced false-positive results arising from genotype error. In addition, we demonstrate that it is possible to simultaneously phase 16,000 individuals genotyped on genome-wide data (450K markers) using the Beagle software package. PMID:18224336

  12. Clinical Validation and Implementation of a Targeted Next-Generation Sequencing Assay to Detect Somatic Variants in Non-Small Cell Lung, Melanoma, and Gastrointestinal Malignancies

    PubMed Central

    Fisher, Kevin E.; Zhang, Linsheng; Wang, Jason; Smith, Geoffrey H.; Newman, Scott; Schneider, Thomas M.; Pillai, Rathi N.; Kudchadkar, Ragini R.; Owonikoko, Taofeek K.; Ramalingam, Suresh S.; Lawson, David H.; Delman, Keith A.; El-Rayes, Bassel F.; Wilson, Malania M.; Sullivan, H. Clifford; Morrison, Annie S.; Balci, Serdar; Adsay, N. Volkan; Gal, Anthony A.; Sica, Gabriel L.; Saxe, Debra F.; Mann, Karen P.; Hill, Charles E.; Khuri, Fadlo R.; Rossi, Michael R.

    2017-01-01

    We tested and clinically validated a targeted next-generation sequencing (NGS) mutation panel using 80 formalin-fixed, paraffin-embedded (FFPE) tumor samples. Forty non-small cell lung carcinoma (NSCLC), 30 melanoma, and 30 gastrointestinal (12 colonic, 10 gastric, and 8 pancreatic adenocarcinoma) FFPE samples were selected from laboratory archives. After appropriate specimen and nucleic acid quality control, 80 NGS libraries were prepared using the Illumina TruSight tumor (TST) kit and sequenced on the Illumina MiSeq. Sequence alignment, variant calling, and sequencing quality control were performed using vendor software and laboratory-developed analysis workflows. TST generated ≥500× coverage for 98.4% of the 13,952 targeted bases. Reproducible and accurate variant calling was achieved at ≥5% variant allele frequency with 8 to 12 multiplexed samples per MiSeq flow cell. TST detected 112 variants overall, and confirmed all known single-nucleotide variants (n = 27), deletions (n = 5), insertions (n = 3), and multinucleotide variants (n = 3). TST detected at least one variant in 85.0% (68/80), and two or more variants in 36.2% (29/80), of samples. TP53 was the most frequently mutated gene in NSCLC (13 variants; 13/32 samples), gastrointestinal malignancies (15 variants; 13/25 samples), and overall (30 variants; 28/80 samples). BRAF mutations were most common in melanoma (nine variants; 9/23 samples). Clinically relevant NGS data can be obtained from routine clinical FFPE solid tumor specimens using TST, benchtop instruments, and vendor-supplied bioinformatics pipelines. PMID:26801070

  13. Single Color Multiplexed ddPCR Copy Number Measurements and Single Nucleotide Variant Genotyping.

    PubMed

    Wood-Bouwens, Christina M; Ji, Hanlee P

    2018-01-01

    Droplet digital PCR (ddPCR) allows for accurate quantification of genetic events such as copy number variation and single nucleotide variants. Probe-based assays represent the current "gold-standard" for detection and quantification of these genetic events. Here, we introduce a cost-effective single color ddPCR assay that allows for single genome resolution quantification of copy number and single nucleotide variation.

  14. Actionable exomic incidental findings in 6503 participants: challenges of variant classification.

    PubMed

    Amendola, Laura M; Dorschner, Michael O; Robertson, Peggy D; Salama, Joseph S; Hart, Ragan; Shirts, Brian H; Murray, Mitzi L; Tokita, Mari J; Gallego, Carlos J; Kim, Daniel Seung; Bennett, James T; Crosslin, David R; Ranchalis, Jane; Jones, Kelly L; Rosenthal, Elisabeth A; Jarvik, Ella R; Itsara, Andy; Turner, Emily H; Herman, Daniel S; Schleit, Jennifer; Burt, Amber; Jamal, Seema M; Abrudan, Jenica L; Johnson, Andrew D; Conlin, Laura K; Dulik, Matthew C; Santani, Avni; Metterville, Danielle R; Kelly, Melissa; Foreman, Ann Katherine M; Lee, Kristy; Taylor, Kent D; Guo, Xiuqing; Crooks, Kristy; Kiedrowski, Lesli A; Raffel, Leslie J; Gordon, Ora; Machini, Kalotina; Desnick, Robert J; Biesecker, Leslie G; Lubitz, Steven A; Mulchandani, Surabhi; Cooper, Greg M; Joffe, Steven; Richards, C Sue; Yang, Yaoping; Rotter, Jerome I; Rich, Stephen S; O'Donnell, Christopher J; Berg, Jonathan S; Spinner, Nancy B; Evans, James P; Fullerton, Stephanie M; Leppig, Kathleen A; Bennett, Robin L; Bird, Thomas; Sybert, Virginia P; Grady, William M; Tabor, Holly K; Kim, Jerry H; Bamshad, Michael J; Wilfond, Benjamin; Motulsky, Arno G; Scott, C Ronald; Pritchard, Colin C; Walsh, Tom D; Burke, Wylie; Raskind, Wendy H; Byers, Peter; Hisama, Fuki M; Rehm, Heidi; Nickerson, Debbie A; Jarvik, Gail P

    2015-03-01

    Recommendations for laboratories to report incidental findings from genomic tests have stimulated interest in such results. In order to investigate the criteria and processes for assigning the pathogenicity of specific variants and to estimate the frequency of such incidental findings in patients of European and African ancestry, we classified potentially actionable pathogenic single-nucleotide variants (SNVs) in all 4300 European- and 2203 African-ancestry participants sequenced by the NHLBI Exome Sequencing Project (ESP). We considered 112 gene-disease pairs selected by an expert panel as associated with medically actionable genetic disorders that may be undiagnosed in adults. The resulting classifications were compared to classifications from other clinical and research genetic testing laboratories, as well as with in silico pathogenicity scores. Among European-ancestry participants, 30 of 4300 (0.7%) had a pathogenic SNV and six (0.1%) had a disruptive variant that was expected to be pathogenic, whereas 52 (1.2%) had likely pathogenic SNVs. For African-ancestry participants, six of 2203 (0.3%) had a pathogenic SNV and six (0.3%) had an expected pathogenic disruptive variant, whereas 13 (0.6%) had likely pathogenic SNVs. Genomic Evolutionary Rate Profiling mammalian conservation score and the Combined Annotation Dependent Depletion summary score of conservation, substitution, regulation, and other evidence were compared across pathogenicity assignments and appear to have utility in variant classification. This work provides a refined estimate of the burden of adult onset, medically actionable incidental findings expected from exome sequencing, highlights challenges in variant classification, and demonstrates the need for a better curated variant interpretation knowledge base. © 2015 Amendola et al.; Published by Cold Spring Harbor Laboratory Press.

  15. SLC30A8 nonsynonymous variant is associated with recovery following exercise and skeletal muscle size and strength.

    PubMed

    Sprouse, Courtney; Gordish-Dressman, Heather; Orkunoglu-Suer, E Funda; Lipof, Jason S; Moeckel-Cole, Stephanie; Patel, Ronak R; Adham, Kasra; Larkin, Justin S; Hubal, Monica J; Kearns, Amy K; Clarkson, Priscilla M; Thompson, Paul D; Angelopoulos, Theodore J; Gordon, Paul M; Moyna, Niall M; Pescatello, Linda S; Visich, Paul S; Zoeller, Robert F; Hoffman, Eric P; Tosi, Laura L; Devaney, Joseph M

    2014-01-01

    Genome-wide association studies have identified thousands of variants that are associated with numerous phenotypes. One such variant, rs13266634, a nonsynonymous single nucleotide polymorphism in the solute carrier family 30 (zinc transporter) member eight gene, is associated with a 53% increase in the risk of developing type 2 diabetes (T2D). We hypothesized that individuals with the protective allele against T2D would show a positive response to short-term and long-term resistance exercise. Two cohorts of young adults, the Eccentric Muscle Damage (EMD; n = 156) cohort and the Functional Single Nucleotide Polymorphisms Associated with Muscle Size and Strength Study (FAMuSS; n = 874), were tested for association of the rs13266634 variant with measures of skeletal muscle response to resistance exercise. Our results were sexually dimorphic in both cohorts. Men in the EMD study with two copies of the protective allele showed less post-exercise bout strength loss, less soreness, and lower creatine kinase values. In addition, men in the FAMuSS, homozygous for the protective allele, showed higher pre-exercise strength and larger arm skeletal muscle volume, but did not show a significant difference in skeletal muscle hypertrophy or strength with resistance training.

  16. Single nucleotide polymorphisms in an Indian cohort and association of CNTN4, MMP2 and SNTB1 variants with oral cancer.

    PubMed

    Yete, Subuhi; Pradhan, Sultan; Saranath, Dhananjaya

    2017-08-01

    Oral cancer is a high incidence cancer in India primarily due to the prevalent tobacco/areca nut chewing habits and hence a major health concern. India constitutes 26% of the global oral cancer burden. Besides the well-established risk factors, the genomic constitution of an individual plays a role in oral cancer. The aim of the current study was to analyse genomic variants represented as single nucleotide polymorphisms (SNPs), analyse their prevalence and investigate risk association of allelotypes/genotypes to oral cancers. Eleven SNPs in genes associated with biological functions were analysed in an Indian cohort (n = 1000) comprising 500 oral cancer patients and 500 long term tobacco habitués as controls, using Allelic discrimination Real-Time PCR assay with SYBR Green dye. Fisher's exact test and Odds Ratio were used for statistical analysis. Increased risk was observed for rs9849237 CC [P = 0.008; OR 1.412 (1.09-1.82)] and rs243865 CT [P = 0.004; OR 1.469 (1.13-1.90)] genotypes, whereas rs9849237 CT [P = 0.034; OR 0.755 (0.58-0.97)], rs243865 CC [P = 0.002; OR 0.669 (0.51-0.86)] and rs10090787 CC [P = 0.049; OR 0.774 (0.60-0.99)] genotypes indicated decreased risk to oral cancer. The other SNPs showed equidistribution in both groups. Our data indicated genotypes and alleles in specific SNPs rs9849237, rs243865 and rs10090787 with increased/decreased risk to oral cancer. Copyright © 2017 Elsevier Inc. All rights reserved.
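
    The two statistics named in this abstract, the odds ratio and Fisher's exact test, can be sketched for a single genotype as follows. The 2x2 counts are hypothetical (the abstract reports only P values and odds ratios, not the underlying tables), and the function names are illustrative; only the Python standard library is used.

```python
import math


def odds_ratio(a, b, c, d):
    """Odds ratio for a 2x2 table [[a, b], [c, d]]
    (rows: cases/controls, columns: genotype present/absent)."""
    return (a * d) / (b * c)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p value: sum the hypergeometric
    probabilities of all tables no more likely than the observed one."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = math.comb(row1 + row2, col1)

    def prob(x):
        return math.comb(row1, x) * math.comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))


# Hypothetical counts for 500 cases and 500 controls (not from the study).
cases_with, cases_without = 300, 200
controls_with, controls_without = 260, 240

or_value = odds_ratio(cases_with, cases_without, controls_with, controls_without)
p_value = fisher_exact_two_sided(cases_with, cases_without, controls_with, controls_without)
print(f"OR = {or_value:.3f}, two-sided Fisher p = {p_value:.4f}")
```

    An odds ratio above 1 with a small p value would correspond to the "increased risk" genotypes in the abstract, and one below 1 to the "decreased risk" genotypes.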

  17. Detection of EGFR Variants in Plasma: A Multilaboratory Comparison of the cobas EGFR Mutation Test v2 in Europe.

    PubMed

    Keppens, Cleo; Palma, John F; Das, Partha M; Scudder, Sidney; Wen, Wei; Normanno, Nicola; Van Krieken, J Han; Sacco, Alessandra; Fenizia, Francesca; de Castro, David Gonzalez; Hönigschnabl, Selma; Kern, Izidor; Lopez-Rios, Fernando; Lozano, Maria D; Marchetti, Antonio; Halfon, Philippe; Schuuring, Ed; Setinek, Ulrike; Sorensen, Boe; Taniere, Phillipe; Tiemann, Markus; Vosmikova, Hana; Dequeker, Elisabeth M C

    2018-04-25

    Molecular testing of EGFR is required to predict the response likelihood to targeted therapy in non-small-cell lung cancer. Analysis of circulating tumor DNA in plasma may complement limitations of tumor tissue. This study evaluated the interlaboratory performance and reproducibility of the cobas EGFR Mutation Test v2 to detect EGFR variants in plasma. Fourteen laboratories received two identical panels of 27 single-blinded plasma samples. Samples were wild-type or spiked with plasmid DNA to contain seven common EGFR variants at six predefined concentrations from 50 to 5000 copies per mL. The circulating tumor DNA was extracted by the cobas cfDNA Sample Preparation kit, followed by duplicate analysis with the EGFRv2 kit (Roche Molecular Systems, Pleasanton, CA). Lowest sensitivities were obtained for the c.2156G>C p.(Gly719Ala) and c.2573T>G p.(Leu858Arg) variants for the lowest target copies. For all other variants, sensitivities varied between 96.3% and 100.0%. Specificities were all 98.8% to 100.0%. Coefficients of variation indicated good intra and interlaboratory repeatability and reproducibility, but increased for decreasing concentrations. Prediction models revealed a significant correlation for all variants between the pre-defined copy number and the observed semiquantitative index values which reflects the samples' plasma mutation load. This study demonstrates an overall robust performance of the EGFRv2 kit in plasma. Prediction models may be applied to estimate the plasma mutation load for diagnostic or research purposes. Copyright © 2018 American Society for Investigative Pathology and the Association for Molecular Pathology. Published by Elsevier Inc. All rights reserved.

  18. [Evaluation of the usefulness of various PCR method variations and nucleic acid hybridization for CMV infection in immunosuppressed patients].

    PubMed

    Siennicka, J; Trzcińska, A; Litwińska, B; Durlik, M; Seferyńska, I; Pałynyczko, G; Kańtoch, M

    2000-01-01

    In the diagnosis of CMV infection various laboratory methods are used. Methods based on detection of viral nucleic acids have been introduced routinely in many laboratories. The aim of this study was to compare the nucleic acid hybridisation method and several variants of the PCR method with respect to their ability to detect CMV DNA. The studied material comprised 60 blood samples from 19 patients, including 13 renal transplant recipients and 6 patients with acute leukaemia. The samples were subjected to hybridisation (Murex Hybrid Capture System CMV DNA) and PCR carried out in 3 variants: with one pair of primers (single PCR), nested PCR, and the Digene SHARP System with detection of the PCR product using a genetic probe in an ELISA system. The sensitivity of the variants ranged from 10^0 particles of viral DNA in nested PCR to 10^2 in single PCR. The producer claimed the sensitivity of the hybridisation test to be 3 × 10^5, and it seems to be sufficient for detection of CMV infection. The obtained results show that the sensitivity of hybridisation was comparable to that of single PCR, and the possibility of obtaining quantitative results makes it superior, especially in monitoring CMV infection in immunosuppressed patients and in following the efficacy of antiviral treatment.

  19. Systematic Integration of Brain eQTL and GWAS Identifies ZNF323 as a Novel Schizophrenia Risk Gene and Suggests Recent Positive Selection Based on Compensatory Advantage on Pulmonary Function.

    PubMed

    Luo, Xiong-Jian; Mattheisen, Manuel; Li, Ming; Huang, Liang; Rietschel, Marcella; Børglum, Anders D; Als, Thomas D; van den Oord, Edwin J; Aberg, Karolina A; Mors, Ole; Mortensen, Preben Bo; Luo, Zhenwu; Degenhardt, Franziska; Cichon, Sven; Schulze, Thomas G; Nöthen, Markus M; Su, Bing; Zhao, Zhongming; Gan, Lin; Yao, Yong-Gang

    2015-11-01

    Genome-wide association studies have identified multiple risk variants and loci that show robust association with schizophrenia. Nevertheless, it remains unclear how these variants confer risk to schizophrenia. In addition, the driving force that maintains the schizophrenia risk variants in human gene pool is poorly understood. To investigate whether expression-associated genetic variants contribute to schizophrenia susceptibility, we systematically integrated brain expression quantitative trait loci and genome-wide association data of schizophrenia using Sherlock, a Bayesian statistical framework. Our analyses identified ZNF323 as a schizophrenia risk gene (P = 2.22×10(-6)). Subsequent analyses confirmed the association of the ZNF323 and its expression-associated single nucleotide polymorphism rs1150711 in independent samples (gene-expression: P = 1.40×10(-6); single-marker meta-analysis in the combined discovery and replication sample comprising 44123 individuals: P = 6.85×10(-10)). We found that the ZNF323 was significantly downregulated in hippocampus and frontal cortex of schizophrenia patients (P = .0038 and P = .0233, respectively). Evidence for pleiotropic effects was detected (association of rs1150711 with lung function and gene expression of ZNF323 in lung: P = 6.62×10(-5) and P = 9.00×10(-5), respectively) with the risk allele (T allele) for schizophrenia acting as protective allele for lung function. Subsequent population genetics analyses suggest that the risk allele (T) of rs1150711 might have undergone recent positive selection in human population. Our findings suggest that the ZNF323 is a schizophrenia susceptibility gene whose expression may influence schizophrenia risk. Our study also illustrates a possible mechanism for maintaining schizophrenia risk variants in the human gene pool. © The Author 2015. Published by Oxford University Press on behalf of the Maryland Psychiatric Research Center. All rights reserved. 

  20. Interaction of CYP2C19, P2Y12, and GPIIIa Variants Associates With Efficacy of Clopidogrel and Adverse Events on Patients With Ischemic Stroke.

    PubMed

    Yi, Xingyang; Wang, Yanfen; Lin, Jing; Cheng, Wen; Zhou, Qiang; Wang, Chun

    2017-10-01

    Clopidogrel is a clinically important oral antiplatelet agent for the treatment or prevention of cerebrovascular disease. However, individuals differ in their sensitivity to clopidogrel. This study assessed variants of different genes for association with response to clopidogrel, clinical outcome, and side effects in patients with ischemic stroke (IS). We consecutively enrolled 375 patients with IS after they received clopidogrel therapy. Venous blood samples were genotyped for allelic variants of genes modulating clopidogrel absorption (ATP binding cassette subfamily B1, ABCB1), metabolic activation (cytochrome P450 [CYP] 3A and CYP2C19), and biologic activity (platelet membrane receptors [P2Y12, P2Y1] and glycoprotein IIIa [GPIIIa]), and their interactions with clopidogrel sensitivity (CS) and adverse events (risk of IS recurrence, myocardial infarction, and death) during 6 months of follow-up were analyzed statistically. Adverse events occurred in 37 patients (31 had IS recurrence, 4 died, and 2 had myocardial infarction) during the first 6 months of follow-up. Single locus analysis showed that only the CYP2C19*2 (rs4244285) variant was independently associated with CS and risk of adverse events after adjusting for covariates. However, there was significant gene-gene interaction among CYP2C19*2 (rs4244285), P2Y12 (rs16863323), and GPIIIa (rs2317676) analyzed by generalized multifactor dimensionality reduction methods. The rate of adverse events among patients with the 3-loci interaction was 2.82 times the rate among those with no interaction (95% confidence interval: 2.04-8.63). Sensitivity of patients with IS to clopidogrel and clopidogrel-induced adverse clinical events may be multifactorial but is not determined by single gene polymorphisms.

  1. Genome-wide detection of intervals of genetic heterogeneity associated with complex traits

    PubMed Central

    Llinares-López, Felipe; Grimm, Dominik G.; Bodenham, Dean A.; Gieraths, Udo; Sugiyama, Mahito; Rowan, Beth; Borgwardt, Karsten

    2015-01-01

    Motivation: Genetic heterogeneity, the fact that several sequence variants give rise to the same phenotype, is a phenomenon that is of the utmost interest in the analysis of complex phenotypes. Current approaches for finding regions in the genome that exhibit genetic heterogeneity suffer from at least one of two shortcomings: (i) they require the definition of an exact interval in the genome that is to be tested for genetic heterogeneity, potentially missing intervals of high relevance, or (ii) they suffer from an enormous multiple hypothesis testing problem due to the large number of potential candidate intervals being tested, which results in either many false positives or a lack of power to detect true intervals. Results: Here, we present an approach that overcomes both problems: it allows one to automatically find all contiguous sequences of single nucleotide polymorphisms in the genome that are jointly associated with the phenotype. It also solves both the inherent computational efficiency problem and the statistical problem of multiple hypothesis testing, which are both caused by the huge number of candidate intervals. We demonstrate on Arabidopsis thaliana genome-wide association study data that our approach can discover regions that exhibit genetic heterogeneity and would be missed by single-locus mapping. Conclusions: Our novel approach can contribute to the genome-wide discovery of intervals that are involved in the genetic heterogeneity underlying complex phenotypes. Availability and implementation: The code can be obtained at: http://www.bsse.ethz.ch/mlcb/research/bioinformatics-and-computational-biology/sis.html. Contact: felipe.llinares@bsse.ethz.ch Supplementary information: Supplementary data are available at Bioinformatics online. PMID:26072488

  2. Serotonin-related FEV gene variant in the sudden infant death syndrome is a common polymorphism in the African-American population.

    PubMed

    Broadbelt, Kevin G; Barger, Melissa A; Paterson, David S; Holm, Ingrid A; Haas, Elisabeth A; Krous, Henry F; Kinney, Hannah C; Markianos, Kyriacos; Beggs, Alan H

    2009-12-01

    An important subset of the sudden infant death syndrome (SIDS) is associated with multiple serotonergic (5-HT) abnormalities in regions of the medulla oblongata. The mouse ortholog of the fifth Ewing variant gene (FEV) is critical for 5-HT neuronal development. A putatively rare intronic variant [IVS2-191_190insA, here referred to as c.128-(191_192)dupA] has been reported as a SIDS-associated mutation in an African-American population. We tested this association in an independent dataset: 137 autopsied cases (78 SIDS, 59 controls) and an additional 296 control DNA samples from Coriell Cell Repositories. In addition to the c.128-(191_192)dupA variant, we observed an associated single-base deletion [c.128-(301-306)delG] in a subset of the samples. Neither of the two FEV variants showed significant association with SIDS in either the African-American subgroup or the overall cohort. Although we found a significant association of c.128-(191_192)dupA with SIDS when San Diego Hispanic SIDS cases were compared with San Diego Hispanic controls plus Mexican controls (p = 0.04), this became nonsignificant after multiple testing correction. Among Coriell controls, 33 of 99 (33%) African-American and 0 of 197 (0%) of the remaining controls carry the polymorphism (c.128-(191_192)dupA). The polymorphism seems to be a common, likely nonpathogenic, variant in the African-American population.

  3. Contributions of PTCH Gene Variants to Isolated Cleft Lip and Palate

    PubMed Central

    Mansilla, M.A.; Cooper, M.E.; Goldstein, T.; Castilla, E.E.; Camelo, J.S. Lopez; Marazita, M.L.; Murray, J.C.

    2007-01-01

    Objective: Mutations in patched (PTCH) cause the nevoid basal cell carcinoma syndrome (NBCCS), or Gorlin syndrome. Nevoid basal cell carcinoma syndrome may present with developmental anomalies, including rib and craniofacial abnormalities, and predisposes to several tumor types, including basal cell carcinoma and medulloblastoma. Cleft palate is found in 4% of individuals with nevoid basal cell carcinoma syndrome. Because there might be specific sequence alterations in PTCH that limit expression to orofacial clefting, a genetic study of PTCH was undertaken in cases with cleft lip and/or palate (CL/P) known not to have nevoid basal cell carcinoma syndrome. Results: Seven new normal variants spread along the entire gene and three missense mutations were found among cases with cleft lip and/or palate. One of these variants (P295S) was not found in any of 1188 control samples. A second variant was found in a case and also in 1 of 1119 controls. The third missense (S827G) was found in 5 of 1369 cases and in 5 of 1104 controls and is likely a rare normal variant. Linkage and linkage disequilibrium were also assessed using normal variants in and adjacent to the PTCH gene in 220 families (1776 individuals), each with two or more individuals with isolated clefting. Although no statistically significant evidence of linkage (multipoint HLOD peak = 2.36) was uncovered, there was borderline evidence of significant transmission distortion for one haplotype of two single nucleotide polymorphisms located within the PTCH gene (p = .08). Conclusion: Missense mutations in PTCH may be rare causes of isolated cleft lip and/or palate. An as yet unidentified variant near PTCH may act as a modifier of cleft lip and/or palate. PMID:16405370

  4. Association of Cardiometabolic Genes with Arsenic Metabolism Biomarkers in American Indian Communities: The Strong Heart Family Study (SHFS)

    PubMed Central

    Balakrishnan, Poojitha; Vaidya, Dhananjay; Franceschini, Nora; Voruganti, V. Saroja; Gribble, Matthew O.; Haack, Karin; Laston, Sandra; Umans, Jason G.; Francesconi, Kevin A.; Goessler, Walter; North, Kari E.; Lee, Elisa; Yracheta, Joseph; Best, Lyle G.; MacCluer, Jean W.; Kent, Jack; Cole, Shelley A.; Navas-Acien, Ana

    2016-01-01

    Background: Metabolism of inorganic arsenic (iAs) is subject to inter-individual variability, which is explained partly by genetic determinants. Objectives: We investigated the association of genetic variants with arsenic species and principal components of arsenic species in the Strong Heart Family Study (SHFS). Methods: We examined variants previously associated with cardiometabolic traits (~200,000 from Illumina Cardio MetaboChip) or arsenic metabolism and toxicity (670) among 2,428 American Indian participants in the SHFS. Urine arsenic species were measured by high performance liquid chromatography–inductively coupled plasma mass spectrometry (HPLC-ICP-MS), and percent arsenic species [iAs, monomethylarsonate (MMA), and dimethylarsinate (DMA), divided by their sum × 100] were logit transformed. We created two orthogonal principal components that summarized iAs, MMA, and DMA and were also phenotypes for genetic analyses. Linear regression was performed for each phenotype, dependent on allele dosage of the variant. Models accounted for familial relatedness and were adjusted for age, sex, total arsenic levels, and population stratification. Single nucleotide polymorphism (SNP) associations were stratified by study site and were meta-analyzed. Bonferroni correction was used to account for multiple testing. Results: Variants at 10q24 were statistically significant for all percent arsenic species and principal components of arsenic species. The index SNP for iAs%, MMA%, and DMA% (rs12768205) and for the principal components (rs3740394, rs3740393) were located near AS3MT, whose gene product catalyzes methylation of iAs to MMA and DMA. Among the candidate arsenic variant associations, functional SNPs in AS3MT and 10q24 were most significant (p < 9.33 × 10(-5)). Conclusions: This hypothesis-driven association study supports the role of common variants in arsenic metabolism, particularly AS3MT and 10q24.
Citation: Balakrishnan P, Vaidya D, Franceschini N, Voruganti VS, Gribble MO, Haack K, Laston S, Umans JG, Francesconi KA, Goessler W, North KE, Lee E, Yracheta J, Best LG, MacCluer JW, Kent J Jr., Cole SA, Navas-Acien A. 2017. Association of cardiometabolic genes with arsenic metabolism biomarkers in American Indian communities: the Strong Heart Family Study (SHFS). Environ Health Perspect 125:15–22; http://dx.doi.org/10.1289/EHP251 PMID:27352405
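
    The phenotype construction described in the record above (percent arsenic species, then a logit transform) can be sketched as follows. This is only an illustrative sketch: the function names and the example concentrations are hypothetical and are not taken from the study.

```python
import math


def percent_species(ias, mma, dma):
    """Express each arsenic species as a percentage of their sum."""
    total = ias + mma + dma
    return {
        "iAs": 100.0 * ias / total,
        "MMA": 100.0 * mma / total,
        "DMA": 100.0 * dma / total,
    }


def logit_percent(pct):
    """Logit transform of a percentage strictly between 0 and 100."""
    p = pct / 100.0
    return math.log(p / (1.0 - p))


# Hypothetical urine concentrations, for illustration only.
pcts = percent_species(ias=1.0, mma=1.5, dma=7.5)
logits = {name: logit_percent(value) for name, value in pcts.items()}
print(pcts)
print(logits)
```

    The logit maps percentages bounded in (0, 100) onto the whole real line, which is one common reason to apply it before linear regression.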

  5. Association of Cardiometabolic Genes with Arsenic Metabolism Biomarkers in American Indian Communities: The Strong Heart Family Study (SHFS).

    PubMed

    Balakrishnan, Poojitha; Vaidya, Dhananjay; Franceschini, Nora; Voruganti, V Saroja; Gribble, Matthew O; Haack, Karin; Laston, Sandra; Umans, Jason G; Francesconi, Kevin A; Goessler, Walter; North, Kari E; Lee, Elisa; Yracheta, Joseph; Best, Lyle G; MacCluer, Jean W; Kent, Jack; Cole, Shelley A; Navas-Acien, Ana

    2017-01-01

    Metabolism of inorganic arsenic (iAs) is subject to inter-individual variability, which is explained partly by genetic determinants. We investigated the association of genetic variants with arsenic species and principal components of arsenic species in the Strong Heart Family Study (SHFS). We examined variants previously associated with cardiometabolic traits (~200,000 from Illumina Cardio MetaboChip) or arsenic metabolism and toxicity (670) among 2,428 American Indian participants in the SHFS. Urine arsenic species were measured by high performance liquid chromatography-inductively coupled plasma mass spectrometry (HPLC-ICP-MS), and percent arsenic species [iAs, monomethylarsonate (MMA), and dimethylarsinate (DMA), divided by their sum × 100] were logit transformed. We created two orthogonal principal components that summarized iAs, MMA, and DMA and were also phenotypes for genetic analyses. Linear regression was performed for each phenotype, dependent on allele dosage of the variant. Models accounted for familial relatedness and were adjusted for age, sex, total arsenic levels, and population stratification. Single nucleotide polymorphism (SNP) associations were stratified by study site and were meta-analyzed. Bonferroni correction was used to account for multiple testing. Variants at 10q24 were statistically significant for all percent arsenic species and principal components of arsenic species. The index SNP for iAs%, MMA%, and DMA% (rs12768205) and for the principal components (rs3740394, rs3740393) were located near AS3MT, whose gene product catalyzes methylation of iAs to MMA and DMA. Among the candidate arsenic variant associations, functional SNPs in AS3MT and 10q24 were most significant (p < 9.33 × 10(-5)). This hypothesis-driven association study supports the role of common variants in arsenic metabolism, particularly AS3MT and 10q24.
Citation: Balakrishnan P, Vaidya D, Franceschini N, Voruganti VS, Gribble MO, Haack K, Laston S, Umans JG, Francesconi KA, Goessler W, North KE, Lee E, Yracheta J, Best LG, MacCluer JW, Kent J Jr., Cole SA, Navas-Acien A. 2017. Association of cardiometabolic genes with arsenic metabolism biomarkers in American Indian communities: the Strong Heart Family Study (SHFS). Environ Health Perspect 125:15-22; http://dx.doi.org/10.1289/EHP251.

  6. ON MODEL SELECTION STRATEGIES TO IDENTIFY GENES UNDERLYING BINARY TRAITS USING GENOME-WIDE ASSOCIATION DATA.

    PubMed

    Wu, Zheyang; Zhao, Hongyu

    2012-01-01

    For more fruitful discoveries of genetic variants associated with diseases in genome-wide association studies, it is important to know whether joint analysis of multiple markers is more powerful than the commonly used single-marker analysis, especially in the presence of gene-gene interactions. This article provides a statistical framework to rigorously address this question through analytical power calculations for common model search strategies to detect binary trait loci: marginal search, exhaustive search, forward search, and two-stage screening search. Our approach incorporates linkage disequilibrium, random genotypes, and correlations among score test statistics of logistic regressions. We derive analytical results under two power definitions: the power of finding all the associated markers and the power of finding at least one associated marker. We also consider two types of error controls: the discovery number control and the Bonferroni type I error rate control. After demonstrating the accuracy of our analytical results by simulations, we apply them to consider a broad genetic model space to investigate the relative performances of different model search strategies. Our analytical study provides rapid computation as well as insights into the statistical mechanism of capturing genetic signals under different genetic models including gene-gene interactions. Even though we focus on genetic association analysis, our results on the power of model selection procedures are clearly very general and applicable to other studies.
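
    The two power definitions and the Bonferroni control named in the record above can be illustrated with a deliberately simplified sketch. It assumes the per-marker tests are independent, which the article itself does not assume (it models linkage disequilibrium and correlated score statistics), and all function names and numbers are hypothetical.

```python
def bonferroni_alpha(alpha, n_tests):
    """Per-test significance level under Bonferroni type I error control."""
    return alpha / n_tests


def power_all(per_marker_power):
    """Power of finding all associated markers, assuming independent tests."""
    result = 1.0
    for p in per_marker_power:
        result *= p
    return result


def power_at_least_one(per_marker_power):
    """Power of finding at least one associated marker, assuming independence."""
    miss_all = 1.0
    for p in per_marker_power:
        miss_all *= 1.0 - p
    return 1.0 - miss_all


# Hypothetical per-marker detection probabilities at the corrected threshold.
powers = [0.5, 0.8]
threshold = bonferroni_alpha(0.05, 1000)
print(threshold, power_all(powers), power_at_least_one(powers))
```

    Even in this toy setting the two definitions diverge sharply, which is why a search strategy can rank differently depending on which definition is used.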

  7. ON MODEL SELECTION STRATEGIES TO IDENTIFY GENES UNDERLYING BINARY TRAITS USING GENOME-WIDE ASSOCIATION DATA

    PubMed Central

    Wu, Zheyang; Zhao, Hongyu

    2013-01-01

    For more fruitful discoveries of genetic variants associated with diseases in genome-wide association studies, it is important to know whether joint analysis of multiple markers is more powerful than the commonly used single-marker analysis, especially in the presence of gene-gene interactions. This article provides a statistical framework to rigorously address this question through analytical power calculations for common model search strategies to detect binary trait loci: marginal search, exhaustive search, forward search, and two-stage screening search. Our approach incorporates linkage disequilibrium, random genotypes, and correlations among score test statistics of logistic regressions. We derive analytical results under two power definitions: the power of finding all the associated markers and the power of finding at least one associated marker. We also consider two types of error controls: the discovery number control and the Bonferroni type I error rate control. After demonstrating the accuracy of our analytical results by simulations, we apply them to consider a broad genetic model space to investigate the relative performances of different model search strategies. Our analytical study provides rapid computation as well as insights into the statistical mechanism of capturing genetic signals under different genetic models including gene-gene interactions. Even though we focus on genetic association analysis, our results on the power of model selection procedures are clearly very general and applicable to other studies. PMID:23956610

  8. Impact of a cis-associated gene expression SNP in 20q11.22 on bipolar disorder susceptibility, hippocampal structure and cognitive performance

    PubMed Central

    Li, Ming; Luo, Xiong-jian; Landén, Mikael; Bergen, Sarah E.; Hultman, Christina M.; Li, Xiao; Zhang, Wen; Yao, Yong-Gang; Zhang, Chen; Liu, Jiewei; Mattheisen, Manuel; Cichon, Sven; Mühleisen, Thomas W.; Degenhardt, Franziska A.; Nöthen, Markus M.; Schulze, Thomas G.; Grigoroiu-Serbanescu, Maria; Li, Hao; Fuller, Chris K.; Chen, Chunhui; Dong, Qi; Chen, Chuansheng; Jamain, Stéphane; Leboyer, Marion; Bellivier, Frank; Etain, Bruno; Kahn, Jean-Pierre; Henry, Chantal; Preisig, Martin; Kutalik, Zoltán; Castelao, Enrique; Wright, Adam; Mitchell, Philip B.; Fullerton, Janice M.; Schofield, Peter R.; Montgomery, Grant W.; Medland, Sarah E.; Gordon, Scott D.; Martin, Nicholas G.; Rietschel, Marcella; Liu, Chunyu; Kleinman, Joel E.; Hyde, Thomas M.; Weinberger, Daniel R.; Su, Bing

    2016-01-01

    Summary: Bipolar disorder (BPD) is a highly heritable polygenic disorder. Recent enrichment analyses suggest that there may be true risk variants for BPD among the expression quantitative trait loci (eQTL) in the brain. Aims: We sought to assess the impact of eQTL variants on BPD risk by combining data from both BPD genome-wide association study (GWAS) and brain eQTL. Method: To detect single-nucleotide polymorphisms (SNPs) that influence expression levels of genes associated with BPD, we jointly analyzed data from a BPD GWAS (7,481 cases and 9,250 controls) and a genome-wide brain (cortical) eQTL (193 healthy controls) using a Bayesian statistical method, with independent follow-up replications. The identified risk SNP was then further tested for association with hippocampal volume (N=5,775) and cognitive performance (N=342) among healthy subjects. Results: Integrative analysis revealed a significant association between a brain eQTL rs6088662 in 20q11.22 and BPD (Log Bayes Factor=5.48; BPD p-val=5.85×10(-5)). Follow-up studies across multiple independent samples confirmed the association of the risk SNP (rs6088662) with gene expression and BPD susceptibility (p-val=3.54×10(-8)). Further exploratory analysis revealed that rs6088662 is also associated with hippocampal volume and cognitive performance in healthy subjects. Conclusions: Our findings suggest that 20q11.22 is likely a risk region for BPD, highlighting the informativeness of integrating functional annotation of genetic variants for gene expression in advancing our understanding of the biological basis underlying complex diseases such as BPD. PMID:26338991

  9. Resequencing of IRS2 reveals rare variants for obesity but not fasting glucose homeostasis in Hispanic children.

    PubMed

    Butte, Nancy F; Voruganti, V Saroja; Cole, Shelley A; Haack, Karin; Comuzzie, Anthony G; Muzny, Donna M; Wheeler, David A; Chang, Kyle; Hawes, Alicia; Gibbs, Richard A

    2011-09-22

    Our objective was to resequence insulin receptor substrate 2 (IRS2) to identify variants associated with obesity- and diabetes-related traits in Hispanic children. Exonic and intronic segments, 5' and 3' flanking regions of IRS2 (∼14.5 kb), were bidirectionally sequenced for single nucleotide polymorphism (SNP) discovery in 934 Hispanic children using 3730XL DNA Sequencers. Additionally, 15 SNPs derived from Illumina HumanOmni1-Quad BeadChips were analyzed. Measured genotype analysis tested associations between SNPs and obesity and diabetes-related traits. Bayesian quantitative trait nucleotide analysis was used to statistically infer the most likely functional polymorphisms. A total of 140 SNPs were identified with minor allele frequencies (MAF) ranging from 0.001 to 0.47. Forty-two of the 70 coding SNPs result in nonsynonymous amino acid substitutions relative to the consensus sequence; 28 SNPs were detected in the promoter, 12 in introns, 28 in the 3'-UTR, and 2 in the 5'-UTR. Two insertion/deletions (indels) were detected. Ten independent rare SNPs (MAF = 0.001-0.009) were associated with obesity-related traits (P = 0.01-0.00002). SNP 10510452_139 in the promoter region was shown to have a high posterior probability (P = 0.77-0.86) of influencing BMI, fat mass, and waist circumference in Hispanic children. SNP 10510452_139 contributed between 2 and 4% of the population variance in body weight and composition. None of the SNPs or indels were associated with diabetes-related traits or accounted for a previously identified quantitative trait locus on chromosome 13 for fasting serum glucose. Rare but not common IRS2 variants may play a role in the regulation of body weight but not an essential role in fasting glucose homeostasis in Hispanic children.

  10. Impact of a cis-associated gene expression SNP on chromosome 20q11.22 on bipolar disorder susceptibility, hippocampal structure and cognitive performance.

    PubMed

    Li, Ming; Luo, Xiong-jian; Landén, Mikael; Bergen, Sarah E; Hultman, Christina M; Li, Xiao; Zhang, Wen; Yao, Yong-Gang; Zhang, Chen; Liu, Jiewei; Mattheisen, Manuel; Cichon, Sven; Mühleisen, Thomas W; Degenhardt, Franziska A; Nöthen, Markus M; Schulze, Thomas G; Grigoroiu-Serbanescu, Maria; Li, Hao; Fuller, Chris K; Chen, Chunhui; Dong, Qi; Chen, Chuansheng; Jamain, Stéphane; Leboyer, Marion; Bellivier, Frank; Etain, Bruno; Kahn, Jean-Pierre; Henry, Chantal; Preisig, Martin; Kutalik, Zoltán; Castelao, Enrique; Wright, Adam; Mitchell, Philip B; Fullerton, Janice M; Schofield, Peter R; Montgomery, Grant W; Medland, Sarah E; Gordon, Scott D; Martin, Nicholas G; Rietschel, Marcella; Liu, Chunyu; Kleinman, Joel E; Hyde, Thomas M; Weinberger, Daniel R; Su, Bing

    2016-02-01

    Bipolar disorder is a highly heritable polygenic disorder. Recent enrichment analyses suggest that there may be true risk variants for bipolar disorder in the expression quantitative trait loci (eQTL) in the brain. We sought to assess the impact of eQTL variants on bipolar disorder risk by combining data from both bipolar disorder genome-wide association studies (GWAS) and brain eQTL. To detect single nucleotide polymorphisms (SNPs) that influence expression levels of genes associated with bipolar disorder, we jointly analysed data from a bipolar disorder GWAS (7481 cases and 9250 controls) and a genome-wide brain (cortical) eQTL (193 healthy controls) using a Bayesian statistical method, with independent follow-up replications. The identified risk SNP was then further tested for association with hippocampal volume (n = 5775) and cognitive performance (n = 342) among healthy individuals. Integrative analysis revealed a significant association between a brain eQTL rs6088662 on chromosome 20q11.22 and bipolar disorder (log Bayes factor = 5.48; bipolar disorder P = 5.85 × 10(-5)). Follow-up studies across multiple independent samples confirmed the association of the risk SNP (rs6088662) with gene expression and bipolar disorder susceptibility (P = 3.54 × 10(-8)). Further exploratory analysis revealed that rs6088662 is also associated with hippocampal volume and cognitive performance in healthy individuals. Our findings suggest that 20q11.22 is likely a risk region for bipolar disorder; they also highlight the informative value of integrating functional annotation of genetic variants for gene expression in advancing our understanding of the biological basis underlying complex disorders, such as bipolar disorder. © The Royal College of Psychiatrists 2016.

  11. Targeted exome sequencing for the identification of a protective variant against Internet gaming disorder at rs2229910 of neurotrophic tyrosine kinase receptor, type 3 (NTRK3): A pilot study

    PubMed Central

    Kim, Jeong-Yu; Jeong, Jo-Eun; Rhee, Je-Keun; Cho, Hyun; Chun, Ji-Won; Kim, Tae-Min; Choi, Sam-Wook; Choi, Jung-Seok; Kim, Dai-Jin

    2016-01-01

    Background and aims: Internet gaming disorder (IGD) has gained recognition as a potential new diagnosis in the fifth revision of the Diagnostic and Statistical Manual of Mental Disorders, but genetic evidence supporting this disorder remains scarce. Methods: In this study, targeted exome sequencing was conducted in 30 IGD patients and 30 control subjects with a focus on genes linked to various neurotransmitters associated with substance and non-substance addictions, depression, and attention deficit hyperactivity disorder. Results: rs2229910 of neurotrophic tyrosine kinase receptor, type 3 (NTRK3) was the only single nucleotide polymorphism (SNP) that exhibited a significantly different minor allele frequency in IGD subjects compared to controls (p = .01932), suggesting that this SNP has a protective effect against IGD (odds ratio = 0.1541). The presence of this potentially protective allele was also associated with less time spent on Internet gaming and lower scores on the Young’s Internet Addiction Test and Korean Internet Addiction Proneness Scale for Adults. Conclusions: The results of this first targeted exome sequencing study of IGD subjects indicate that rs2229910 of NTRK3 is a genetic variant that is significantly related to IGD. These findings may have significant implications for future research investigating the genetics of IGD and other behavioral addictions. PMID:27826991

  12. Linkage of osteoporosis to chromosome 20p12 and association to BMP2.

    PubMed

    Styrkarsdottir, Unnur; Cazier, Jean-Baptiste; Kong, Augustine; Rolfsson, Ottar; Larsen, Helene; Bjarnadottir, Emma; Johannsdottir, Vala D; Sigurdardottir, Margret S; Bagger, Yu; Christiansen, Claus; Reynisdottir, Inga; Grant, Struan F A; Jonasson, Kristjan; Frigge, Michael L; Gulcher, Jeffrey R; Sigurdsson, Gunnar; Stefansson, Kari

    2003-12-01

    Osteoporotic fractures are a major cause of morbidity and mortality in ageing populations. Osteoporosis, defined as low bone mineral density (BMD) and associated fractures, has significant genetic components that are largely unknown. Linkage analysis in a large number of extended osteoporosis families in Iceland, using a phenotype that combines osteoporotic fractures and BMD measurements, showed linkage to Chromosome 20p12.3 (multipoint allele-sharing LOD, 5.10; p value, 6.3 x 10(-7)), results that are statistically significant after adjusting for the number of phenotypes tested and the genome-wide search. A follow-up association analysis using closely spaced polymorphic markers was performed. Three variants in the bone morphogenetic protein 2 (BMP2) gene, a missense polymorphism and two anonymous single nucleotide polymorphism haplotypes, were determined to be associated with osteoporosis in the Icelandic patients. The association is seen with many definitions of an osteoporotic phenotype, including osteoporotic fractures as well as low BMD, both before and after menopause. A replication study with a Danish cohort of postmenopausal women was conducted to confirm the contribution of the three identified variants. In conclusion, we find that a region on the short arm of Chromosome 20 contains a gene or genes that appear to be a major risk factor for osteoporosis and osteoporotic fractures, and our evidence supports the view that BMP2 is at least one of these genes.

  13. SOS2 and ACP1 Loci Identified through Large-Scale Exome Chip Analysis Regulate Kidney Development and Function.

    PubMed

    Li, Man; Li, Yong; Weeks, Olivia; Mijatovic, Vladan; Teumer, Alexander; Huffman, Jennifer E; Tromp, Gerard; Fuchsberger, Christian; Gorski, Mathias; Lyytikäinen, Leo-Pekka; Nutile, Teresa; Sedaghat, Sanaz; Sorice, Rossella; Tin, Adrienne; Yang, Qiong; Ahluwalia, Tarunveer S; Arking, Dan E; Bihlmeyer, Nathan A; Böger, Carsten A; Carroll, Robert J; Chasman, Daniel I; Cornelis, Marilyn C; Dehghan, Abbas; Faul, Jessica D; Feitosa, Mary F; Gambaro, Giovanni; Gasparini, Paolo; Giulianini, Franco; Heid, Iris; Huang, Jinyan; Imboden, Medea; Jackson, Anne U; Jeff, Janina; Jhun, Min A; Katz, Ronit; Kifley, Annette; Kilpeläinen, Tuomas O; Kumar, Ashish; Laakso, Markku; Li-Gao, Ruifang; Lohman, Kurt; Lu, Yingchang; Mägi, Reedik; Malerba, Giovanni; Mihailov, Evelin; Mohlke, Karen L; Mook-Kanamori, Dennis O; Robino, Antonietta; Ruderfer, Douglas; Salvi, Erika; Schick, Ursula M; Schulz, Christina-Alexandra; Smith, Albert V; Smith, Jennifer A; Traglia, Michela; Yerges-Armstrong, Laura M; Zhao, Wei; Goodarzi, Mark O; Kraja, Aldi T; Liu, Chunyu; Wessel, Jennifer; Boerwinkle, Eric; Borecki, Ingrid B; Bork-Jensen, Jette; Bottinger, Erwin P; Braga, Daniele; Brandslund, Ivan; Brody, Jennifer A; Campbell, Archie; Carey, David J; Christensen, Cramer; Coresh, Josef; Crook, Errol; Curhan, Gary C; Cusi, Daniele; de Boer, Ian H; de Vries, Aiko P J; Denny, Joshua C; Devuyst, Olivier; Dreisbach, Albert W; Endlich, Karlhans; Esko, Tõnu; Franco, Oscar H; Fulop, Tibor; Gerhard, Glenn S; Glümer, Charlotte; Gottesman, Omri; Grarup, Niels; Gudnason, Vilmundur; Hansen, Torben; Harris, Tamara B; Hayward, Caroline; Hocking, Lynne; Hofman, Albert; Hu, Frank B; Husemoen, Lise Lotte N; Jackson, Rebecca D; Jørgensen, Torben; Jørgensen, Marit E; Kähönen, Mika; Kardia, Sharon L R; König, Wolfgang; Kooperberg, Charles; Kriebel, Jennifer; Launer, Lenore J; Lauritzen, Torsten; Lehtimäki, Terho; Levy, Daniel; Linksted, Pamela; Linneberg, Allan; Liu, Yongmei; Loos, Ruth J F; Lupo, Antonio; Meisinger, 
Christine; Melander, Olle; Metspalu, Andres; Mitchell, Paul; Nauck, Matthias; Nürnberg, Peter; Orho-Melander, Marju; Parsa, Afshin; Pedersen, Oluf; Peters, Annette; Peters, Ulrike; Polasek, Ozren; Porteous, David; Probst-Hensch, Nicole M; Psaty, Bruce M; Qi, Lu; Raitakari, Olli T; Reiner, Alex P; Rettig, Rainer; Ridker, Paul M; Rivadeneira, Fernando; Rossouw, Jacques E; Schmidt, Frank; Siscovick, David; Soranzo, Nicole; Strauch, Konstantin; Toniolo, Daniela; Turner, Stephen T; Uitterlinden, André G; Ulivi, Sheila; Velayutham, Dinesh; Völker, Uwe; Völzke, Henry; Waldenberger, Melanie; Wang, Jie Jin; Weir, David R; Witte, Daniel; Kuivaniemi, Helena; Fox, Caroline S; Franceschini, Nora; Goessling, Wolfram; Köttgen, Anna; Chu, Audrey Y

    2017-03-01

    Genome-wide association studies have identified >50 common variants associated with kidney function, but these variants do not fully explain the variation in eGFR. We performed a two-stage meta-analysis of associations between genotypes from the Illumina exome array and eGFR on the basis of serum creatinine (eGFRcrea) among participants of European ancestry from the CKDGen Consortium (n Stage 1: 111,666; n Stage 2: 48,343). In single-variant analyses, we identified single nucleotide polymorphisms at seven new loci associated with eGFRcrea (PPM1J, EDEM3, ACP1, SPEG, EYA4, CYP1A1, and ATXN2L; P Stage 1 < 3.7×10(-7)), of which most were common and annotated as nonsynonymous variants. Gene-based analysis identified associations of functional rare variants in three genes with eGFRcrea, including a novel association with the SOS Ras/Rho guanine nucleotide exchange factor 2 gene, SOS2 (P = 5.4×10(-8) by sequence kernel association test). Experimental follow-up in zebrafish embryos revealed changes in glomerular gene expression and renal tubule morphology in the embryonic kidney of acp1- and sos2-knockdowns. These developmental abnormalities associated with altered blood clearance rate and heightened prevalence of edema. This study expands the number of loci associated with kidney function and identifies novel genes with potential roles in kidney formation. Copyright © 2017 by the American Society of Nephrology.
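
    The record above does not state which meta-analysis model was used. As a hedged illustration only, the sketch below shows a fixed-effect inverse-variance combination of per-study single-variant effect estimates, one common way such results are pooled. The function name and the input values are hypothetical.

```python
import math


def inverse_variance_meta(betas, ses):
    """Fixed-effect inverse-variance meta-analysis of per-study estimates.

    betas: per-study effect estimates for one variant
    ses:   their standard errors
    Returns the pooled estimate, its standard error, the z score,
    and a two-sided p value from the normal approximation.
    """
    weights = [1.0 / (se ** 2) for se in ses]
    weight_sum = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / weight_sum
    se = math.sqrt(1.0 / weight_sum)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return beta, se, z, p


# Hypothetical per-study estimates, for illustration only.
beta, se, z, p = inverse_variance_meta([0.10, 0.20], [0.05, 0.05])
print(beta, se, z, p)
```

    Studies with smaller standard errors receive larger weights, so the pooled standard error is never larger than that of the most precise single study.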

  14. Stabilization of the μ-opioid receptor by truncated single transmembrane splice variants through a chaperone-like action.

    PubMed

    Xu, Jin; Xu, Ming; Brown, Taylor; Rossi, Grace C; Hurd, Yasmin L; Inturrisi, Charles E; Pasternak, Gavril W; Pan, Ying-Xian

    2013-07-19

    The μ-opioid receptor gene, OPRM1, undergoes extensive alternative pre-mRNA splicing, as illustrated by the identification of an array of splice variants generated by both 5' and 3' alternative splicing. The current study reports the identification of another set of splice variants conserved across species that are generated through exon skipping or insertion that encodes proteins containing only a single transmembrane (TM) domain. Using a Tet-Off system, we demonstrated that the truncated single TM variants can dimerize with the full-length 7-TM μ-opioid receptor (MOR-1) in the endoplasmic reticulum, leading to increased expression of MOR-1 at the protein level by a chaperone-like function that minimizes endoplasmic reticulum-associated degradation. In vivo antisense studies suggested that the single TM variants play an important role in morphine analgesia, presumably through modulation of receptor expression levels. Our studies suggest the functional roles of truncated receptors in other G protein-coupled receptor families.

  15. BRCA2 Polymorphic Stop Codon K3326X and the Risk of Breast, Prostate, and Ovarian Cancers

    PubMed Central

    Meeks, Huong D.; Song, Honglin; Michailidou, Kyriaki; Bolla, Manjeet K.; Dennis, Joe; Wang, Qin; Barrowdale, Daniel; Frost, Debra; McGuffog, Lesley; Ellis, Steve; Feng, Bingjian; Buys, Saundra S.; Hopper, John L.; Southey, Melissa C.; Tesoriero, Andrea; James, Paul A.; Bruinsma, Fiona; Campbell, Ian G.; Broeks, Annegien; Schmidt, Marjanka K.; Hogervorst, Frans B. L.; Beckman, Matthias W.; Fasching, Peter A.; Fletcher, Olivia; Johnson, Nichola; Sawyer, Elinor J.; Riboli, Elio; Banerjee, Susana; Menon, Usha; Tomlinson, Ian; Burwinkel, Barbara; Hamann, Ute; Marme, Frederik; Rudolph, Anja; Janavicius, Ramunas; Tihomirova, Laima; Tung, Nadine; Garber, Judy; Cramer, Daniel; Terry, Kathryn L.; Poole, Elizabeth M.; Tworoger, Shelley S.; Dorfling, Cecilia M.; van Rensburg, Elizabeth J.; Godwin, Andrew K.; Guénel, Pascal; Truong, Thérèse; Stoppa-Lyonnet, Dominique; Damiola, Francesca; Mazoyer, Sylvie; Sinilnikova, Olga M.; Isaacs, Claudine; Maugard, Christine; Bojesen, Stig E.; Flyger, Henrik; Gerdes, Anne-Marie; Hansen, Thomas V. O.; Jensen, Allen; Kjaer, Susanne K.; Hogdall, Claus; Hogdall, Estrid; Pedersen, Inge Sokilde; Thomassen, Mads; Benitez, Javier; González-Neira, Anna; Osorio, Ana; de la Hoya, Miguel; Segura, Pedro Perez; Diez, Orland; Lazaro, Conxi; Brunet, Joan; Anton-Culver, Hoda; Eunjung, Lee; John, Esther M.; Neuhausen, Susan L.; Ding, Yuan Chun; Castillo, Danielle; Weitzel, Jeffrey N.; Ganz, Patricia A.; Nussbaum, Robert L.; Chan, Salina B.; Karlan, Beth Y.; Lester, Jenny; Wu, Anna; Gayther, Simon; Ramus, Susan J.; Sieh, Weiva; Whittermore, Alice S.; Monteiro, Alvaro N. A.; Phelan, Catherine M.; Terry, Mary Beth; Piedmonte, Marion; Offit, Kenneth; Robson, Mark; Levine, Douglas; Moysich, Kirsten B.; Cannioto, Rikki; Olson, Sara H.; Daly, Mary B.; Nathanson, Katherine L.; Domchek, Susan M.; Lu, Karen H.; Liang, Dong; Hildebrant, Michelle A. 
T.; Ness, Roberta; Modugno, Francesmary; Pearce, Leigh; Goodman, Marc T.; Thompson, Pamela J.; Brenner, Hermann; Butterbach, Katja; Meindl, Alfons; Hahnen, Eric; Wappenschmidt, Barbara; Brauch, Hiltrud; Brüning, Thomas; Blomqvist, Carl; Khan, Sofia; Nevanlinna, Heli; Pelttari, Liisa M.; Aittomäki, Kristiina; Butzow, Ralf; Bogdanova, Natalia V.; Dörk, Thilo; Lindblom, Annika; Margolin, Sara; Rantala, Johanna; Kosma, Veli-Matti; Mannermaa, Arto; Lambrechts, Diether; Neven, Patrick; Claes, Kathleen B. M.; Maerken, Tom Van; Chang-Claude, Jenny; Flesch-Janys, Dieter; Heitz, Florian; Varon-Mateeva, Raymonda; Peterlongo, Paolo; Radice, Paolo; Viel, Alessandra; Barile, Monica; Peissel, Bernard; Manoukian, Siranoush; Montagna, Marco; Oliani, Cristina; Peixoto, Ana; Teixeira, Manuel R.; Collavoli, Anita; Hallberg, Emily; Olson, Janet E.; Goode, Ellen L.; Hart, Steven N.; Shimelis, Hermela; Cunningham, Julie M.; Giles, Graham G.; Milne, Roger L.; Healey, Sue; Tucker, Kathy; Haiman, Christopher A.; Henderson, Brian E.; Goldberg, Mark S.; Tischkowitz, Marc; Simard, Jacques; Soucy, Penny; Eccles, Diana M.; Le, Nhu; Borresen-Dale, Anne-Lise; Kristensen, Vessela; Salvesen, Helga B.; Bjorge, Line; Bandera, Elisa V.; Risch, Harvey; Zheng, Wei; Beeghly-Fadiel, Alicia; Cai, Hui; Pylkäs, Katri; Tollenaar, Robert A. E. M.; van der Ouweland, Ans M. 
W.; Andrulis, Irene L.; Knight, Julia A.; Narod, Steven; Devilee, Peter; Winqvist, Robert; Figueroa, Jonine; Greene, Mark H.; Mai, Phuong L.; Loud, Jennifer T.; García-Closas, Montserrat; Schoemaker, Minouk J.; Czene, Kamila; Darabi, Hatef; McNeish, Iain; Siddiquil, Nadeem; Glasspool, Rosalind; Kwong, Ava; Park, Sue K.; Teo, Soo Hwang; Yoon, Sook-Yee; Matsuo, Keitaro; Hosono, Satoyo; Woo, Yin Ling; Gao, Yu-Tang; Foretova, Lenka; Singer, Christian F.; Rappaport-Feurhauser, Christine; Friedman, Eitan; Laitman, Yael; Rennert, Gad; Imyanitov, Evgeny N.; Hulick, Peter J.; Olopade, Olufunmilayo I.; Senter, Leigha; Olah, Edith; Doherty, Jennifer A.; Schildkraut, Joellen; Koppert, Linetta B.; Kiemeney, Lambertus A.; Massuger, Leon F. A. G.; Cook, Linda S.; Pejovic, Tanja; Li, Jingmei; Borg, Ake; Öfverholm, Anna; Rossing, Mary Anne; Wentzensen, Nicolas; Henriksson, Karin; Cox, Angela; Cross, Simon S.; Pasini, Barbara J.; Shah, Mitul; Kabisch, Maria; Torres, Diana; Jakubowska, Anna; Lubinski, Jan; Gronwald, Jacek; Agnarsson, Bjarni A.; Kupryjanczyk, Jolanta; Moes-Sosnowska, Joanna; Fostira, Florentia; Konstantopoulou, Irene; Slager, Susan; Jones, Michael; Antoniou, Antonis C.; Berchuck, Andrew; Swerdlow, Anthony; Chenevix-Trench, Georgia; Dunning, Alison M.; Pharoah, Paul D. P.; Hall, Per; Easton, Douglas F.; Couch, Fergus J.; Spurdle, Amanda B.

    2016-01-01

    Background: The K3326X variant in BRCA2 (BRCA2*c.9976A>T; p.Lys3326*; rs11571833) has been found to be associated with small increased risks of breast cancer. However, it is not clear to what extent linkage disequilibrium with fully pathogenic mutations might account for this association. There is scant information about the effect of K3326X in other hormone-related cancers. Methods: Using weighted logistic regression, we analyzed data from the large iCOGS study including 76 637 cancer case patients and 83 796 control patients to estimate odds ratios (ORw) and 95% confidence intervals (CIs) for K3326X variant carriers in relation to breast, ovarian, and prostate cancer risks, with weights defined as probability of not having a pathogenic BRCA2 variant. Using Cox proportional hazards modeling, we also examined the associations of K3326X with breast and ovarian cancer risks among 7183 BRCA1 variant carriers. All statistical tests were two-sided. Results: The K3326X variant was associated with breast (ORw = 1.28, 95% CI = 1.17 to 1.40, P = 5.9x10-6) and invasive ovarian cancer (ORw = 1.26, 95% CI = 1.10 to 1.43, P = 3.8x10-3). These associations were stronger for serous ovarian cancer and for estrogen receptor–negative breast cancer (ORw = 1.46, 95% CI = 1.2 to 1.70, P = 3.4x10-5 and ORw = 1.50, 95% CI = 1.28 to 1.76, P = 4.1x10-5, respectively). For BRCA1 mutation carriers, there was a statistically significant inverse association of the K3326X variant with risk of ovarian cancer (HR = 0.43, 95% CI = 0.22 to 0.84, P = .013) but no association with breast cancer. No association with prostate cancer was observed. Conclusions: Our study provides evidence that the K3326X variant is associated with risk of developing breast and ovarian cancers independent of other pathogenic variants in BRCA2. Further studies are needed to determine the biological mechanism of action responsible for these associations. PMID:26586665

  16. Difference between age-related macular degeneration and polypoidal choroidal vasculopathy in the hereditary contribution of the A69S variant of the age-related maculopathy susceptibility 2 gene (ARMS2).

    PubMed

    Yanagisawa, Suiho; Kondo, Naoshi; Miki, Akiko; Matsumiya, Wataru; Kusuhara, Sentaro; Tsukahara, Yasutomo; Honda, Shigeru; Negi, Akira

    2011-01-01

    To investigate whether the A69S variant of the age-related maculopathy susceptibility 2 gene (ARMS2) has a different hereditary contribution in neovascular age-related macular degeneration (AMD) and polypoidal choroidal vasculopathy (PCV). We initially conducted a comparative genetic analysis of neovascular AMD and PCV, genotyping the ARMS2 A69S variant in 181 subjects with neovascular AMD, 198 subjects with PCV, and 203 controls in a Japanese population. Genotyping was conducted using TaqMan technology. Results were then integrated into a meta-analysis of previous studies representing an assessment of the association between the ARMS2 A69S variant and neovascular AMD and/or PCV, comprising a total of 3,828 subjects of Asian descent. The Q-statistic test was used to assess between-study heterogeneity. Summary odds ratios (ORs) and 95% confidence intervals (CIs) were estimated using a fixed effects model. The genetic effect of the A69S variant was stronger in neovascular AMD (allelic summary OR=3.09 [95% CI, 2.71-3.51], fixed effects p<0.001) than in PCV (allelic summary OR=2.13 [95% CI, 1.91-2.38], fixed effects p<0.001). The pooled risk allele frequency was significantly higher in neovascular AMD (64.7%) than in PCV (55.6%). The population attributable risks for the variant allele were estimated to be 43.9% (95% CI, 39.0%-48.4%) and 29.7% (95% CI, 25.4%-34.0%) for neovascular AMD and PCV, respectively. No significant between-study heterogeneity was observed in any statistical analysis in this meta-analysis. Our meta-analysis provides substantial evidence that the ARMS2 A69S variant confers a significantly higher risk of neovascular AMD than PCV. Furthermore, there is compelling evidence that the risk attributable to the A69S variant differs between geographic atrophy and neovascular AMD. Together with defining the molecular basis of susceptibility, understanding the relationships between this genomic region and disease subtypes will yield important insights, elucidating the biologic architecture of this phenotypically heterogeneous disorder.
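
    The fixed effects pooling and Q-statistic mentioned in this abstract can be illustrated with a minimal sketch. It assumes the common inverse-variance approach on the log odds ratio scale, with each study's standard error recovered from its 95% confidence interval; the abstract does not state which fixed effects estimator the authors used, and the function name and the three input studies below are hypothetical, not data from the paper.

```python
import math

def fixed_effects_or(studies, z_crit=1.96):
    """Inverse-variance fixed effects pooling of odds ratios.

    `studies` is a list of (odds_ratio, ci_low, ci_high) tuples with 95% CIs.
    Returns the pooled OR, its 95% CI, and Cochran's Q with its degrees of freedom.
    """
    log_ors = [math.log(o) for o, _, _ in studies]
    # Standard error recovered from the CI width on the log scale.
    ses = [(math.log(hi) - math.log(lo)) / (2 * z_crit) for _, lo, hi in studies]
    weights = [1.0 / se ** 2 for se in ses]

    pooled = sum(w * b for w, b in zip(weights, log_ors)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    # Cochran's Q: weighted squared deviations of study effects from the pooled effect.
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, log_ors))

    return {
        "or": math.exp(pooled),
        "ci_low": math.exp(pooled - z_crit * pooled_se),
        "ci_high": math.exp(pooled + z_crit * pooled_se),
        "q": q,
        "df": len(studies) - 1,
    }

# Hypothetical study-level allelic odds ratios (illustrative values only).
example = fixed_effects_or([(2.8, 2.0, 3.9), (3.3, 2.5, 4.4), (3.0, 2.1, 4.3)])
print(example)
```

    Under the null hypothesis of no between-study heterogeneity, Q is compared against a chi-square distribution with `df` degrees of freedom; a small Q relative to `df` is what "no significant heterogeneity" would look like in this framing.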

  17. Failure of replicating the association between hippocampal volume and 3 single-nucleotide polymorphisms identified from the European genome-wide association study in Asian populations.

    PubMed

    Li, Ming; Ohi, Kazutaka; Chen, Chunhui; He, Qinghua; Liu, Jie-Wei; Chen, Chuansheng; Luo, Xiong-Jian; Dong, Qi; Hashimoto, Ryota; Su, Bing

    2014-12-01

    The hippocampus is a key brain structure for learning ability and memory processes, and hippocampal atrophy is a recognized biological marker of Alzheimer's disease. However, the genetic bases of hippocampal volume are still unclear although it is a heritable trait. Genome-wide association studies (GWASs) on hippocampal volume have implicated several significantly associated genetic variants in Europeans. Here, to test the contributions of these GWAS-identified genetic variants to hippocampal volume in different ethnic populations, we screened the GWAS-identified candidate single-nucleotide polymorphisms in 3 independent healthy Asian brain imaging samples (a total of 990 subjects). The results showed that none of these single-nucleotide polymorphisms were associated with hippocampal volume in either individual or combined Asian samples. The replication results suggested a complexity of genetic architecture for hippocampal volume and potential genetic heterogeneity between different ethnic populations. Copyright © 2014 Elsevier Inc. All rights reserved.

  18. Polygenic Overlap Between C-Reactive Protein, Plasma Lipids, and Alzheimer Disease.

    PubMed

    Desikan, Rahul S; Schork, Andrew J; Wang, Yunpeng; Thompson, Wesley K; Dehghan, Abbas; Ridker, Paul M; Chasman, Daniel I; McEvoy, Linda K; Holland, Dominic; Chen, Chi-Hua; Karow, David S; Brewer, James B; Hess, Christopher P; Williams, Julie; Sims, Rebecca; O'Donovan, Michael C; Choi, Seung Hoan; Bis, Joshua C; Ikram, M Arfan; Gudnason, Vilmundur; DeStefano, Anita L; van der Lee, Sven J; Psaty, Bruce M; van Duijn, Cornelia M; Launer, Lenore; Seshadri, Sudha; Pericak-Vance, Margaret A; Mayeux, Richard; Haines, Jonathan L; Farrer, Lindsay A; Hardy, John; Ulstein, Ingun Dina; Aarsland, Dag; Fladby, Tormod; White, Linda R; Sando, Sigrid B; Rongve, Arvid; Witoelar, Aree; Djurovic, Srdjan; Hyman, Bradley T; Snaedal, Jon; Steinberg, Stacy; Stefansson, Hreinn; Stefansson, Kari; Schellenberg, Gerard D; Andreassen, Ole A; Dale, Anders M

    2015-06-09

    Epidemiological findings suggest a relationship between Alzheimer disease (AD), inflammation, and dyslipidemia, although the nature of this relationship is not well understood. We investigated whether this phenotypic association arises from a shared genetic basis. Using summary statistics (P values and odds ratios) from genome-wide association studies of >200 000 individuals, we investigated overlap in single-nucleotide polymorphisms associated with clinically diagnosed AD and C-reactive protein (CRP), triglycerides, and high- and low-density lipoprotein levels. We found up to 50-fold enrichment of AD single-nucleotide polymorphisms for different levels of association with C-reactive protein, low-density lipoprotein, high-density lipoprotein, and triglyceride single-nucleotide polymorphisms using a false discovery rate threshold <0.05. By conditioning on polymorphisms associated with the 4 phenotypes, we identified 55 loci associated with increased AD risk. We then conducted a meta-analysis of these 55 variants across 4 independent AD cohorts (total: n=29 054 AD cases and 114 824 healthy controls) and discovered 2 genome-wide significant variants on chromosome 4 (rs13113697; closest gene, HS3ST1; odds ratio=1.07; 95% confidence interval=1.05-1.11; P=2.86×10(-8)) and chromosome 10 (rs7920721; closest gene, ECHDC3; odds ratio=1.07; 95% confidence interval=1.04-1.11; P=3.38×10(-8)). We also found that gene expression of HS3ST1 and ECHDC3 was altered in AD brains compared with control brains. We demonstrate genetic overlap between AD, C-reactive protein, and plasma lipids. By conditioning on the genetic association with the cardiovascular phenotypes, we identify novel AD susceptibility loci, including 2 genome-wide significant variants conferring increased risk for AD. © 2015 American Heart Association, Inc.

  19. Single Nucleotide Variants Associated With Polygenic Hypercholesterolemia in Families Diagnosed Clinically With Familial Hypercholesterolemia.

    PubMed

    Lamiquiz-Moneo, Itziar; Pérez-Ruiz, María Rosario; Jarauta, Estíbaliz; Tejedor, María Teresa; Bea, Ana M; Mateo-Gallego, Rocío; Pérez-Calahorra, Sofía; Baila-Rueda, Lucía; Marco-Benedí, Victoria; de Castro-Orós, Isabel; Cenarro, Ana; Civeira, Fernando

    2018-05-01

    Approximately 20% to 40% of clinically defined familial hypercholesterolemia cases do not show a causative mutation in candidate genes, and some of them may have a polygenic origin. A cholesterol gene risk score for the diagnosis of polygenic hypercholesterolemia has been demonstrated to be valuable to differentiate polygenic and monogenic hypercholesterolemia. The aim of this study was to determine the contribution to low-density lipoprotein cholesterol (LDL-C) of the single nucleotide variants associated with polygenic hypercholesterolemia in probands with genetic hypercholesterolemia without mutations in candidate genes (nonfamilial hypercholesterolemia genetic hypercholesterolemia) and the genetic score in cascade screening in their family members. We recruited 49 nonfamilial hypercholesterolemia genetic hypercholesterolemia families (294 participants) and calculated cholesterol gene scores, derived from single nucleotide variants in SORT1, APOB, ABCG8, APOE and LDLR and lipoprotein(a) plasma concentration. Risk alleles in SORT1, ABCG8, APOE, and LDLR showed a statistically significantly higher frequency in blood relatives than in the 1000 Genomes Project. However, there were no differences between affected and nonaffected members. The contribution of the cholesterol gene score to LDL-C was significantly higher in affected than in nonaffected participants (P = .048). The percentage of the LDL-C variation explained by the score was 3.1%, and this percentage increased to 6.9% in those families with the highest genetic score in the proband. Nonfamilial hypercholesterolemia genetic hypercholesterolemia families concentrate risk alleles for high LDL-C. Their contribution varies greatly among families, indicating the complexity and heterogeneity of these forms of hypercholesterolemias. The gene score explains a small percentage of LDL-C, which limits its use in diagnosis. Copyright © 2017 Sociedad Española de Cardiología. Published by Elsevier España, S.L.U. All rights reserved.

  20. Integrated sequence analysis pipeline provides one-stop solution for identifying disease-causing mutations.

    PubMed

    Hu, Hao; Wienker, Thomas F; Musante, Luciana; Kalscheuer, Vera M; Kahrizi, Kimia; Najmabadi, Hossein; Ropers, H Hilger

    2014-12-01

    Next-generation sequencing has greatly accelerated the search for disease-causing defects, but even for experts the data analysis can be a major challenge. To facilitate the data processing in a clinical setting, we have developed a novel medical resequencing analysis pipeline (MERAP). MERAP assesses the quality of sequencing, and has optimized capacity for calling variants, including single-nucleotide variants, insertions and deletions, copy-number variation, and other structural variants. MERAP identifies polymorphic and known causal variants by filtering against public domain databases, and flags nonsynonymous and splice-site changes. MERAP uses a logistic model to estimate the causal likelihood of a given missense variant. MERAP considers the relevant information such as phenotype and interaction with known disease-causing genes. MERAP compares favorably with GATK, one of the widely used tools, because of its higher sensitivity for detecting indels, its easy installation, and its economical use of computational resources. Upon testing more than 1,200 individuals with mutations in known and novel disease genes, MERAP proved highly reliable, as illustrated here for five families with disease-causing variants. We believe that the clinical implementation of MERAP will expedite the diagnostic process of many disease-causing defects. © 2014 WILEY PERIODICALS, INC.

  1. Exome sequencing in an admixed isolated population indicates NFXL1 variants confer a risk for specific language impairment.

    PubMed

    Villanueva, Pía; Nudel, Ron; Hoischen, Alexander; Fernández, María Angélica; Simpson, Nuala H; Gilissen, Christian; Reader, Rose H; Jara, Lillian; Echeverry, María Magdalena; Echeverry, Maria Magdalena; Francks, Clyde; Baird, Gillian; Conti-Ramsden, Gina; O'Hare, Anne; Bolton, Patrick F; Hennessy, Elizabeth R; Palomino, Hernán; Carvajal-Carmona, Luis; Veltman, Joris A; Cazier, Jean-Baptiste; De Barbieri, Zulema; Fisher, Simon E; Newbury, Dianne F

    2015-03-01

    Children affected by Specific Language Impairment (SLI) fail to acquire age appropriate language skills despite adequate intelligence and opportunity. SLI is highly heritable, but the understanding of underlying genetic mechanisms has proved challenging. In this study, we use molecular genetic techniques to investigate an admixed isolated founder population from the Robinson Crusoe Island (Chile), who are affected by a high incidence of SLI, increasing the power to discover contributory genetic factors. We utilize exome sequencing in selected individuals from this population to identify eight coding variants that are of putative significance. We then apply association analyses across the wider population to highlight a single rare coding variant (rs144169475, Minor Allele Frequency of 4.1% in admixed South American populations) in the NFXL1 gene that confers a nonsynonymous change (N150K) and is significantly associated with language impairment in the Robinson Crusoe population (p = 2.04 × 10-4, 8 variants tested). Subsequent sequencing of NFXL1 in 117 UK SLI cases identified four individuals with heterozygous variants predicted to be of functional consequence. We conclude that coding variants within NFXL1 confer an increased risk of SLI within a complex genetic model.
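
    The phrase "8 variants tested" suggests a multiple-testing correction. As a small worked check, the sketch below assumes a Bonferroni correction at a family-wise alpha of 0.05; the abstract does not say which correction was applied, so both the method and the alpha level are assumptions, and the function name is hypothetical.

```python
def bonferroni(p_value, n_tests, alpha=0.05):
    """Return the per-test threshold, the adjusted p-value, and whether it passes."""
    threshold = alpha / n_tests               # per-test significance threshold
    adjusted = min(1.0, p_value * n_tests)    # Bonferroni-adjusted p-value, capped at 1
    return threshold, adjusted, p_value < threshold

# Values quoted in the abstract: p = 2.04e-4 with 8 variants tested.
threshold, adjusted, significant = bonferroni(2.04e-4, 8)
print(threshold, adjusted, significant)
```

    Under these assumptions the per-test threshold is 0.05 / 8 = 0.00625, which the reported p-value clears.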

  2. Effects of human SAMHD1 polymorphisms on HIV-1 susceptibility

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    White, Tommy E.; Brandariz-Nuñez, Alberto; Valle-Casuso, Jose Carlos

    SAMHD1 is a human restriction factor that prevents efficient infection of macrophages, dendritic cells and resting CD4+ T cells by HIV-1. Here we explored the antiviral activity and biochemical properties of human SAMHD1 polymorphisms. Our studies focused on human SAMHD1 polymorphisms that were previously identified as evolving under positive selection for rapid amino acid replacement during primate speciation. The different human SAMHD1 polymorphisms were tested for their ability to block HIV-1, HIV-2 and equine infectious anemia virus (EIAV). All studied SAMHD1 variants block HIV-1, HIV-2 and EIAV infection when compared to wild type. We found that these variants did not lose their ability to oligomerize or to bind RNA. Furthermore, all tested variants were susceptible to degradation by Vpx, and localized to the nuclear compartment. We tested the ability of human SAMHD1 polymorphisms to decrease the dNTP cellular levels. In agreement, none of the different SAMHD1 variants lost their ability to reduce cellular levels of dNTPs. Finally, we found that none of the tested human SAMHD1 polymorphisms affected the ability of the protein to block LINE-1 retrotransposition. - Highlights: • Human SAMHD1 single-nucleotide polymorphisms block HIV-1 and HIV-2 infection. • SAMHD1 polymorphisms do not affect its ability to block LINE-1 retrotransposition. • SAMHD1 polymorphisms decrease the cellular levels of dNTPs.

  3. The suitability of matrix assisted laser desorption/ionization time of flight mass spectrometry in a laboratory developed test using cystic fibrosis carrier screening as a model.

    PubMed

    Farkas, Daniel H; Miltgen, Nicholas E; Stoerker, Jay; van den Boom, Dirk; Highsmith, W Edward; Cagasan, Lesley; McCullough, Ron; Mueller, Reinhold; Tang, Lin; Tynan, John; Tate, Courtney; Bombard, Allan

    2010-09-01

    We designed a laboratory developed test (LDT) by using an open platform for mutation/polymorphism detection. Using a 108-member (mutation plus variant) cystic fibrosis carrier screening panel as a model, we completed the last phase of LDT validation by using matrix-assisted laser desorption/ionization time of flight mass spectrometry. Panel customization was accomplished via specific amplification primer and extension probe design. Amplified genomic DNA was subjected to allele specific, single base extension endpoint analysis by mass spectrometry for inspection of the cystic fibrosis transmembrane regulator gene (NM_000492.3). The panel of mutations and variants was tested against 386 blinded samples supplied by "authority" laboratories highly experienced in cystic fibrosis transmembrane regulator genotyping; >98% concordance was observed. All discrepant and discordant results were resolved satisfactorily. Taken together, these results describe the concluding portion of the LDT validation process and the use of mass spectrometry to detect a large number of complex reactions within a single run as well as its suitability as a platform appropriate for interrogation of scores to hundreds of targets.

  4. A Novel β-Globin Chain Hemoglobin Variant, Hb Allentown [β137(H15)Val→Trp (GTG>TGG) HBB: c.412_413delinsTG, p.Val138Trp], Associated with Low Oxygen Saturation, Intermittent Aplastic Crises and Splenomegaly.

    PubMed

    Collier, Anderson B; Coon, Lea M; Monteleone, Philip; Umaru, Samuel; Swanson, Kenneth C; Hoyer, James D; Oliveira, Jennifer L

    2016-01-01

    Hemoglobin (Hb) variants may be associated with low oxygen saturation and exacerbated episodes of anemia from common stressors such as viral infections. These attributes frequently cause increased clinical concern and unnecessary and expensive testing if not considered early in the evaluation of the patient. Some clinically significant Hb variants result in a normal Hb electrophoresis result, which can be method-dependent. Herein we describe a patient with low oxygen saturation and a history of hemolytic anemia who was subsequently found to carry a novel, unstable β-globin variant that we have named Hb Allentown [β137(H15)Val→Trp (GTG>TGG) HBB: c.412_413delinsTG, p.Val138Trp] for the place of identification of the variant. Hb Allentown is formed by a rare double nucleotide substitution within the same codon. Additionally, positive identification of rare Hb variants characterized by a single method is discouraged, as the Hb variant was misclassified as Hb S-South End or β6(A3)Glu→Val;β132(H10)Lys→Asn (HBB: c.[20A > T;399A > C]) by the initial laboratory.

  5. Bayesian test for colocalisation between pairs of genetic association studies using summary statistics.

    PubMed

    Giambartolomei, Claudia; Vukcevic, Damjan; Schadt, Eric E; Franke, Lude; Hingorani, Aroon D; Wallace, Chris; Plagnol, Vincent

    2014-05-01

    Genetic association studies, in particular the genome-wide association study (GWAS) design, have provided a wealth of novel insights into the aetiology of a wide range of human diseases and traits, in particular cardiovascular diseases and lipid biomarkers. The next challenge consists of understanding the molecular basis of these associations. The integration of multiple association datasets, including gene expression datasets, can contribute to this goal. We have developed a novel statistical methodology to assess whether two association signals are consistent with a shared causal variant. An application is the integration of disease scans with expression quantitative trait locus (eQTL) studies, but any pair of GWAS datasets can be integrated in this framework. We demonstrate the value of the approach by re-analysing a gene expression dataset in 966 liver samples with a published meta-analysis of lipid traits including >100,000 individuals of European ancestry. Combining all lipid biomarkers, our re-analysis supported 26 out of 38 reported colocalisation results with eQTLs and identified 14 new colocalisation results, hence highlighting the value of a formal statistical test. In three cases of reported eQTL-lipid pairs (SYPL2, IFT172, TBKBP1) for which our analysis suggests that the eQTL pattern is not consistent with the lipid association, we identify alternative colocalisation results with SORT1, GCKR, and KPNB1, indicating that these genes are more likely to be causal in these genomic intervals. A key feature of the method is the ability to derive the output statistics from single SNP summary statistics, hence making it possible to perform systematic meta-analysis type comparisons across multiple GWAS datasets (implemented online at http://coloc.cs.ucl.ac.uk/coloc/). Our methodology provides information about candidate causal genes in associated intervals and has direct implications for the understanding of complex diseases as well as the design of drugs to target disease pathways.
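
    To make the idea of deriving colocalisation evidence from single SNP summary statistics concrete, here is a minimal sketch. It assumes Wakefield-style approximate Bayes factors per SNP and at most one causal variant per trait in the region, and it combines them into posterior probabilities for five hypotheses (H0: no association; H1/H2: association with one trait only; H3: two distinct causal variants; H4: one shared causal variant). The abstract gives no formulas, so the function names, the prior standard deviation, the prior probabilities, and the toy effect sizes are all illustrative assumptions rather than the authors' implementation.

```python
import math

def log_abf(beta, se, prior_sd=0.15):
    """Log approximate Bayes factor for one SNP from its effect estimate and SE.

    prior_sd is an assumed prior standard deviation of the true effect.
    """
    z = beta / se
    shrink = prior_sd ** 2 / (prior_sd ** 2 + se ** 2)
    return 0.5 * (math.log(1.0 - shrink) + shrink * z ** 2)

def logsumexp(values):
    """Numerically stable log(sum(exp(v))) for a non-empty list."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def colocalisation_posteriors(labf1, labf2, p1=1e-4, p2=1e-4, p12=1e-5):
    """Posterior probabilities [H0, H1, H2, H3, H4] for a region of >= 2 SNPs."""
    n = len(labf1)
    # H4: the same SNP is causal for both traits.
    shared = logsumexp([labf1[i] + labf2[i] for i in range(n)])
    # H3: different SNPs are causal for the two traits (O(n^2), fine for a sketch).
    distinct = logsumexp([labf1[i] + labf2[j]
                          for i in range(n) for j in range(n) if i != j])
    log_h = [
        0.0,
        math.log(p1) + logsumexp(labf1),
        math.log(p2) + logsumexp(labf2),
        math.log(p1) + math.log(p2) + distinct,
        math.log(p12) + shared,
    ]
    total = logsumexp(log_h)
    return [math.exp(h - total) for h in log_h]

# Toy region of five SNPs with equal standard errors (hypothetical values).
se = 0.02
trait1 = [log_abf(b, se) for b in [0.0, 0.0, 0.2, 0.0, 0.0]]
trait2_same = [log_abf(b, se) for b in [0.0, 0.0, 0.2, 0.0, 0.0]]
trait2_other = [log_abf(b, se) for b in [0.0, 0.0, 0.0, 0.0, 0.2]]

pp_shared = colocalisation_posteriors(trait1, trait2_same)     # H4 should dominate
pp_distinct = colocalisation_posteriors(trait1, trait2_other)  # H3 should dominate
print(pp_shared, pp_distinct)
```

    Because only per-SNP effect estimates and standard errors enter the calculation, the same routine could in principle be run on any pair of summary-statistic datasets covering the same SNPs, which is the property the abstract highlights.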

  6. The Necessity of the Hippocampus for Statistical Learning

    PubMed Central

    Covington, Natalie V.; Brown-Schmidt, Sarah; Duff, Melissa C.

    2018-01-01

    Converging evidence points to a role for the hippocampus in statistical learning, but open questions about its necessity remain. Evidence for necessity comes from Schapiro and colleagues who report that a single patient with damage to hippocampus and broader medial temporal lobe cortex was unable to discriminate new from old sequences in several statistical learning tasks. The aim of the current study was to replicate these methods in a larger group of patients who have either damage localized to hippocampus or a broader medial temporal lobe damage, to ascertain the necessity of the hippocampus in statistical learning. Patients with hippocampal damage consistently showed less learning overall compared with healthy comparison participants, consistent with an emerging consensus for hippocampal contributions to statistical learning. Interestingly, lesion size did not reliably predict performance. However, patients with hippocampal damage were not uniformly at chance and demonstrated above-chance performance in some task variants. These results suggest that hippocampus is necessary for statistical learning levels achieved by most healthy comparison participants but significant hippocampal pathology alone does not abolish such learning. PMID:29308986

  7. Examination of polymorphic glutathione S-transferase (GST) genes, tobacco smoking and prostate cancer risk among Men of African Descent: A case-control study

    PubMed Central

    2009-01-01

    Background: Polymorphisms in glutathione S-transferase (GST) genes may influence response to oxidative stress and modify prostate cancer (PCA) susceptibility. These enzymes generally detoxify endogenous and exogenous agents, but also participate in the activation and inactivation of oxidative metabolites that may contribute to PCA development. Genetic variations within selected GST genes may influence PCA risk following exposure to carcinogenic compounds found in cigarette smoke and decrease the ability to detoxify them. Thus, we evaluated the effects of polymorphic GSTs (M1, T1, and P1) alone and combined with cigarette smoking on PCA susceptibility. Methods: In order to evaluate the effects of GST polymorphisms in relation to PCA risk, we used TaqMan allelic discrimination assays along with a multi-faceted statistical strategy involving conventional and advanced statistical methodologies (e.g., Multifactor Dimensionality Reduction and Interaction Graphs). Genetic profiles collected from 873 men of African descent (208 cases and 665 controls) were utilized to systematically evaluate the single and joint modifying effects of GSTM1 and GSTT1 gene deletions, GSTP1 105 Val and cigarette smoking on PCA risk. Results: We observed a moderately significant association between PCA risk and possession of at least one variant GSTP1 105 Val allele (OR = 1.56; 95%CI = 0.95-2.58; p = 0.049), which was confirmed by MDR permutation testing (p = 0.001). We did not observe any significant single gene effects among GSTM1 (OR = 1.08; 95%CI = 0.65-1.82; p = 0.718) and GSTT1 (OR = 1.15; 95%CI = 0.66-2.02; p = 0.622) on PCA risk among all subjects. Although the GSTM1-GSTP1 pairwise combination was selected as the best two-factor LR and MDR models (p = 0.01), assessment of the hierarchical entropy graph suggested that the observed synergistic effect was primarily driven by the GSTP1 Val marker. Notably, the GSTM1-GSTP1 axis did not provide additional information gain when compared to either locus alone based on a hierarchical entropy algorithm and graph. Smoking status did not significantly modify the relationship between the GST SNPs and PCA. Conclusion: A moderately significant association was observed between PCA risk and men possessing at least one variant GSTP1 105 Val allele (p = 0.049) among men of African descent. We also observed a 2.1-fold increase in PCA risk associated with men possessing the GSTP1 (Val/Val) and GSTM1 (*1/*1 + *1/*0) alleles. MDR analysis validated these findings, detecting GSTP1 105 Val (p = 0.001) as the best single factor for predicting PCA risk. Our findings emphasize the importance of utilizing a combination of traditional and advanced statistical tools to identify and validate single gene and multi-locus interactions in relation to cancer susceptibility. PMID:19917083

  8. Inferring causal relationships between phenotypes using summary statistics from genome-wide association studies.

    PubMed

    Meng, Xiang-He; Shen, Hui; Chen, Xiang-Ding; Xiao, Hong-Mei; Deng, Hong-Wen

    2018-03-01

    Genome-wide association studies (GWAS) have successfully identified numerous genetic variants associated with diverse complex phenotypes and diseases, and provided tremendous opportunities for further analyses using summary association statistics. Recently, Pickrell et al. developed a robust method for causal inference using independent putative causal SNPs. However, this method may fail to infer the causal relationship between two phenotypes when only a limited number of independent putative causal SNPs are identified. Here, we extended Pickrell's method to make it more applicable to general situations. We extended the causal inference method by replacing the putative causal SNPs with the lead SNPs (the set of the most significant SNPs in each independent locus) and tested the performance of our extended method using both simulation and empirical data. Simulations suggested that when the same number of genetic variants is used, our extended method had a similar distribution of the test statistic under the null model as well as comparable power under the causal model compared with the original method by Pickrell et al. In practice, however, our extended method would generally be more powerful because the number of independent lead SNPs was often larger than the number of independent putative causal SNPs. Including more SNPs, on the other hand, would not cause more false positives. By applying our extended method to summary statistics from GWAS for blood metabolites and femoral neck bone mineral density (FN-BMD), we successfully identified ten blood metabolites that may causally influence FN-BMD. We extended a causal inference method for inferring putative causal relationship between two phenotypes using summary statistics from GWAS, and identified a number of potential causal metabolites for FN-BMD, which may provide novel insights into the pathophysiological mechanisms underlying osteoporosis.

  9. Association between Variants in Atopy-Related Immunologic Candidate Genes and Pancreatic Cancer Risk.

    PubMed

    Cotterchio, Michelle; Lowcock, Elizabeth; Bider-Canfield, Zoe; Lemire, Mathieu; Greenwood, Celia; Gallinger, Steven; Hudson, Thomas

    2015-01-01

    Many epidemiologic studies report that atopic conditions such as allergies are associated with reduced pancreas cancer risk. The reason for this relationship is not yet understood. This is the first study to comprehensively evaluate the association between variants in atopy-related candidate genes and pancreatic cancer risk. A population-based case-control study of pancreas cancer cases diagnosed during 2011-2012 (via Ontario Cancer Registry), and controls recruited using random digit dialing utilized DNA from 179 cases and 566 controls. Following an exhaustive literature review, SNPs in 180 candidate genes were pre-screened using dbGaP pancreas cancer GWAS data; 147 SNPs in 56 allergy-related immunologic genes were retained and genotyped. Logistic regression was used to estimate the age-adjusted odds ratio (AOR) for each variant and false discovery rate was used to adjust Wald p-values for multiple testing. Subsequently, a risk allele score was derived based on statistically significant variants. 18 SNPs in 14 candidate genes (CSF2, DENND1B, DPP10, FLG, IL13, IL13RA2, LRP1B, NOD1, NPSR1, ORMDL3, RORA, STAT4, TLR6, TRA) were significantly associated with pancreas cancer risk. After adjustment for multiple comparisons, two LRP1B SNPs remained statistically significant; for example, LRP1B rs1449477 (AA vs. CC: AOR=0.37, 95% CI: 0.22-0.62; p (adjusted)=0.04). Furthermore, the risk allele score was associated with a significant reduction in pancreas cancer risk (p=0.0007). Preliminary findings suggest certain atopy-related variants may be associated with pancreas cancer risk. Further studies are needed to replicate this, and to elucidate the biology behind the growing body of epidemiologic evidence suggesting allergies may reduce pancreatic cancer risk.
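
    The multiple-testing adjustment mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not say which false discovery rate procedure was used, so the Benjamini-Hochberg step-up adjustment is an assumption here, and the p-values below are made-up inputs, not study data.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical Wald p-values for four variants (illustration only).
example_p = [0.01, 0.04, 0.03, 0.20]
example_adjusted = benjamini_hochberg(example_p)
print(example_adjusted)
```

    A variant would then be called significant when its adjusted value falls below the chosen FDR level, which is how an adjusted p of 0.04 can remain significant at a 0.05 level.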

  10. Combination Testing Using a Single MSH5 Variant alongside HLA Haplotypes Improves the Sensitivity of Predicting Coeliac Disease Risk in the Polish Population.

    PubMed

    Paziewska, Agnieszka; Cukrowska, Bozena; Dabrowska, Michalina; Goryca, Krzysztof; Piatkowska, Magdalena; Kluska, Anna; Mikula, Michal; Karczmarski, Jakub; Oralewska, Beata; Rybak, Anna; Socha, Jerzy; Balabas, Aneta; Zeber-Lubecka, Natalia; Ambrozkiewicz, Filip; Konopka, Ewa; Trojanowska, Ilona; Zagroba, Malgorzata; Szperl, Malgorzata; Ostrowski, Jerzy

    2015-01-01

    Assessment of non-HLA variants alongside standard HLA testing was previously shown to improve the identification of potential coeliac disease (CD) patients. We intended to identify new genetic variants associated with CD in the Polish population that would improve CD risk prediction when used alongside HLA haplotype analysis. DNA samples of 336 CD and 264 unrelated healthy controls were used to create DNA pools for a genome wide association study (GWAS). GWAS findings were validated with individual HLA tag single nucleotide polymorphism (SNP) typing of 473 patients and 714 healthy controls. Association analysis using four HLA-tagging SNPs showed that, as was found in other populations, positive predicting genotypes (HLA-DQ2.5/DQ2.5, HLA-DQ2.5/DQ2.2, and HLA-DQ2.5/DQ8) were found at higher frequencies in CD patients than in healthy control individuals in the Polish population. Both CD-associated SNPs discovered by GWAS were found in the CD susceptibility region, confirming the previously-determined association of the major histocompatibility (MHC) region with CD pathogenesis. The two most significant SNPs from the GWAS were rs9272346 (HLA-dependent; localized within 1 Kb of DQA1) and rs3130484 (HLA-independent; mapped to MSH5). Specificity of CD prediction using the four HLA-tagging SNPs achieved 92.9%, but sensitivity was only 45.5%. However, when a testing combination of the HLA-tagging SNPs and the MSH5 SNP was used, specificity decreased to 80%, and sensitivity increased to 74%. This study confirmed that improvement of CD risk prediction sensitivity could be achieved by including non-HLA SNPs alongside HLA SNPs in genetic testing.
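
    The sensitivity/specificity trade-off reported above follows from the usual confusion-matrix definitions. The sketch below is only an illustration: the function name and the counts are hypothetical (chosen per 100 patients and 100 controls to give round percentages) and are not the study's data.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity


# Hypothetical counts, not study data: a stricter test misses more
# patients (low sensitivity) but flags few healthy controls.
strict_test = sensitivity_specificity(tp=45, fn=55, tn=93, fp=7)

# Adding a further marker catches more patients at the cost of
# more false positives among controls.
combined_test = sensitivity_specificity(tp=74, fn=26, tn=80, fp=20)

print(strict_test, combined_test)
```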

  11. Inosine triphosphatase polymorphisms and ribavirin pharmacokinetics as determinants of ribavirin-associate anemia in patients receiving standard anti-HCV treatment.

    PubMed

    DʼAvolio, Antonio; Ciancio, Alessia; Siccardi, Marco; Smedile, Antonina; Baietto, Lorena; Simiele, Marco; Marucco, Diego Aguilar; Cariti, Giuseppe; Calcagno, Andrea; de Requena, Daniel Gonzalez; Sciandra, Mauro; Cusato, Jessica; Troshina, Giulia; Bonora, Stefano; Rizzetto, Mario; Di Perri, Giovanni

    2012-04-01

    Functional variants of inosine triphosphatase (ITPA) were recently found to protect against ribavirin (RBV)-induced hemolytic anemia. However, no definitive data are yet available on the role of plasma RBV concentrations on hemoglobin (Hb) decrement. Moreover, no data have been published on the possible interplay between these 2 factors. A retrospective analysis included 167 patients. The ITPA variants rs7270101 and rs1127354 were genotyped and tested using the χ² test for association with Hb reduction at week 4. We also investigated, using multivariate logistic regression, the impact of RBV plasma exposure on Hb concentrations. Both single nucleotide polymorphisms were associated with Hb decrease. The carrier of at least 1 variant allele in the functional ITPA single nucleotide polymorphisms was associated with a lower decrement of Hb (-1.1 g/dL), as compared with patients without a variant allele (-2.75 g/dL; P = 4.09 × 10). RBV concentrations were not influenced by ITPA genotypes. A cut-off of 2.3 μg/mL of RBV was found to be associated with anemia (area-under-receiver operating characteristic = 0.630, sensitivity = 50.0%, and specificity = 69.5%, P = 0.008). In multivariate logistic regression analyses, the carrier of a variant allele (P = 0.005) and plasma RBV concentrations <2.3 μg/mL (P = 0.016) were independently associated with protection against clinically significant anemia at week 4. Although no direct relationship was found between ITPA polymorphisms and plasma RBV concentrations, both factors were shown to be significantly associated with anemia. A multivariate regression model based on ITPA genetic polymorphisms and RBV trough concentration was developed for predicting the risk of anemia. By relying upon these 2 variables, an individualized management of anemia seems to be feasible in recipients of pegylated interferon-RBV therapy.

  12. TCF7L2 gene polymorphisms do not predict susceptibility to diabetes in tropical calcific pancreatitis but may interact with SPINK1 and CTSB mutations in predicting diabetes.

    PubMed

    Mahurkar, Swapna; Bhaskar, Seema; Reddy, D Nageshwar; Prakash, Swami; Rao, G Venkat; Singh, Shivaram Prasad; Thomas, Varghese; Chandak, Giriraj Ratan

    2008-08-16

    Tropical calcific pancreatitis (TCP) is a type of chronic pancreatitis unique to developing countries in tropical regions and one of its important features is invariable progression to diabetes, a condition called fibro-calculous pancreatic diabetes (FCPD), but the nature of diabetes in TCP is controversial. We analysed the recently reported type 2 diabetes (T2D) associated polymorphisms in the TCF7L2 gene using a case-control approach, under the hypothesis that TCF7L2 variants should show similar association if diabetes in FCPD is similar to T2D. We also investigated the interaction between the TCF7L2 variants and N34S SPINK1 and L26V CTSB mutations, since they are strong predictors of risk for TCP. Two polymorphisms, rs7903146 and rs12255372, in the TCF7L2 gene were analyzed by direct sequencing in 478 well-characterized TCP patients and 661 healthy controls of Dravidian and Indo-European ethnicities. Their association with TCP with diabetes (FCPD) and without diabetes was tested in both populations independently using the chi-square test. Finally, a meta-analysis was performed on all the cases and controls for assessing the overall significance irrespective of ethnicity. We dichotomized the whole cohort based on the presence or absence of N34S SPINK1 and L26V CTSB mutations and further subdivided them into TCP and FCPD patients and compared the distribution of TCF7L2 variants between them. The allelic and genotypic frequencies for both TCF7L2 polymorphisms did not differ significantly between TCP patients and controls belonging to either of the ethnic groups or taken together. No statistically significant association of the SNPs was observed with TCP or FCPD or between carriers and non-carriers of N34S SPINK1 and L26V CTSB mutations. 
The minor allele frequency for rs7903146 was different between TCP and FCPD patients carrying the N34S SPINK1 variant but did not reach statistical significance (OR = 1.59, 95% CI = 0.93-2.70, P = 0.09), while the TCF7L2 variant showed a statistically significant association between TCP and FCPD patients carrying the 26V allele (OR = 1.69, 95% CI = 1.11-2.56, P = 0.013). Type 2 diabetes associated TCF7L2 variants are not associated with diabetes in TCP. Since TCF7L2 is a major susceptibility gene for T2D, it may be hypothesized that the diabetes in TCP patients may not be similar to T2D. Our data also suggest that co-existence of TCF7L2 variants and the SPINK1 and CTSB mutations, which predict susceptibility to exocrine damage, may interact to determine the onset of diabetes in TCP patients.

  13. A functional U-statistic method for association analysis of sequencing data.

    PubMed

    Jadhav, Sneha; Tong, Xiaoran; Lu, Qing

    2017-11-01

    Although sequencing studies hold great promise for uncovering novel variants predisposing to human diseases, the high dimensionality of the sequencing data brings tremendous challenges to data analysis. Moreover, for many complex diseases (e.g., psychiatric disorders) multiple related phenotypes are collected. These phenotypes can be different measurements of an underlying disease, or measurements characterizing multiple related diseases for studying common genetic mechanism. Although jointly analyzing these phenotypes could potentially increase the power of identifying disease-associated genes, the different types of phenotypes pose challenges for association analysis. To address these challenges, we propose a nonparametric method, functional U-statistic method (FU), for multivariate analysis of sequencing data. It first constructs smooth functions from individuals' sequencing data, and then tests the association of these functions with multiple phenotypes by using a U-statistic. The method provides a general framework for analyzing various types of phenotypes (e.g., binary and continuous phenotypes) with unknown distributions. Fitting the genetic variants within a gene using a smoothing function also allows us to capture complexities of gene structure (e.g., linkage disequilibrium, LD), which could potentially increase the power of association analysis. Through simulations, we compared our method to the multivariate outcome score test (MOST), and found that our test attained better performance than MOST. In a real data application, we apply our method to the sequencing data from Minnesota Twin Study (MTS) and found potential associations of several nicotine receptor subunit (CHRN) genes, including CHRNB3, associated with nicotine dependence and/or alcohol dependence. © 2017 WILEY PERIODICALS, INC.

  14. Gene and pathway level analyses of germline DNA-repair gene variants and prostate cancer susceptibility using the iCOGS-genotyping array.

    PubMed

    Saunders, Edward J; Dadaev, Tokhir; Leongamornlert, Daniel A; Al Olama, Ali Amin; Benlloch, Sara; Giles, Graham G; Wiklund, Fredrik; Gronberg, Henrik; Haiman, Christopher A; Schleutker, Johanna; Nordestgaard, Borge G; Travis, Ruth C; Neal, David; Pasayan, Nora; Khaw, Kay-Tee; Stanford, Janet L; Blot, William J; Thibodeau, Stephen N; Maier, Christiane; Kibel, Adam S; Cybulski, Cezary; Cannon-Albright, Lisa; Brenner, Hermann; Park, Jong Y; Kaneva, Radka; Batra, Jyotsna; Teixeira, Manuel R; Pandha, Hardev; Govindasami, Koveela; Muir, Ken; Easton, Douglas F; Eeles, Rosalind A; Kote-Jarai, Zsofia

    2016-04-12

    Germline mutations within DNA-repair genes are implicated in susceptibility to multiple forms of cancer. For prostate cancer (PrCa), rare mutations in BRCA2 and BRCA1 give rise to moderately elevated risk, whereas two of ~100 common, low-penetrance PrCa susceptibility variants identified so far by genome-wide association studies implicate RAD51B and RAD23B. Genotype data from the iCOGS array were imputed to the 1000 genomes phase 3 reference panel for 21 780 PrCa cases and 21 727 controls from the Prostate Cancer Association Group to Investigate Cancer Associated Alterations in the Genome (PRACTICAL) consortium. We subsequently performed single variant, gene and pathway-level analyses using 81 303 SNPs within 20 Kb of a panel of 179 DNA-repair genes. Single SNP analyses identified only the previously reported association with RAD51B. Gene-level analyses using the SKAT-C test from the SNP-set (Sequence) Kernel Association Test (SKAT) identified a significant association with PrCa for MSH5. Pathway-level analyses suggested a possible role for the translesion synthesis pathway in PrCa risk and Homologous recombination/Fanconi Anaemia pathway for PrCa aggressiveness, even though after adjustment for multiple testing these did not remain significant. MSH5 is a novel candidate gene warranting additional follow-up as a prospective PrCa-risk locus. MSH5 has previously been reported as a pleiotropic susceptibility locus for lung, colorectal and serous ovarian cancers.

  15. Thrombomodulin gene variants are associated with increased mortality after coronary artery bypass surgery in replicated analyses.

    PubMed

    Lobato, Robert L; White, William D; Mathew, Joseph P; Newman, Mark F; Smith, Peter K; McCants, Charles B; Alexander, John H; Podgoreanu, Mihai V

    2011-09-13

    We tested the hypothesis that genetic variation in thrombotic and inflammatory pathways is independently associated with long-term mortality after coronary artery bypass graft (CABG) surgery. Two separate cohorts of patients undergoing CABG surgery at a single institution were examined, and all-cause mortality between 30 days and 5 years after the index CABG was ascertained from the National Death Index. In a discovery cohort of 1018 patients, a panel of 90 single-nucleotide polymorphisms (SNPs) in 49 candidate genes was tested with Cox proportional hazard models to identify clinical and genomic multivariate predictors of incident death. After adjustment for multiple comparisons and clinical predictors of mortality, the homozygote minor allele of a common variant in the thrombomodulin (THBD) gene (rs1042579) was independently associated with significantly increased risk of all-cause mortality (hazard ratio, 2.26; 95% CI, 1.31 to 3.92; P=0.003). Six tag SNPs in the THBD gene, 1 of which (rs3176123) in complete linkage disequilibrium with rs1042579, were then assessed in an independent validation cohort of 930 patients. After multivariate adjustment for the clinical predictors identified in the discovery cohort and multiple testing, the homozygote minor allele of rs3176123 independently predicted all-cause mortality (hazard ratio, 3.6; 95% CI, 1.67 to 7.78; P=0.001). In 2 independent cardiac surgery cohorts, linked common allelic variants in the THBD gene are independently associated with increased long-term mortality risk after CABG and significantly improve the classification ability of traditional postoperative mortality prediction models.

  16. Generalization of Associations of Kidney-Related Genetic Loci to American Indians

    PubMed Central

    Haack, Karin; Almasy, Laura; Laston, Sandra; Lee, Elisa T.; Best, Lyle G.; Fabsitz, Richard R.; MacCluer, Jean W.; Howard, Barbara V.; Umans, Jason G.; Cole, Shelley A.

    2014-01-01

    Summary Background and objectives CKD disproportionally affects American Indians, who, similar to other populations, show genetic susceptibility to kidney outcomes. Recent studies have identified several loci associated with kidney traits, but their relevance in American Indians is unknown. Design, setting, participants, & measurements This study used data from a large, family-based genetic study of American Indians (the Strong Heart Family Study), which includes 94 multigenerational families enrolled from communities located in Oklahoma, the Dakotas, and Arizona. Individuals were recruited from the Strong Heart Study, a population-based study of cardiovascular disease in American Indians. This study selected 25 single nucleotide polymorphisms in 23 loci identified from recently published kidney-related genome-wide association studies in individuals of European ancestry to evaluate their associations with kidney function (estimated GFR; individuals 18 years or older, up to 3282 individuals) and albuminuria (urinary albumin to creatinine ratio; n=3552) in the Strong Heart Family Study. This study also examined the association of single nucleotide polymorphisms in the APOL1 region with estimated GFR in 1121 Strong Heart Family Study participants. GFR was estimated using the abbreviated Modification of Diet in Renal Disease Equation. Additive genetic models adjusted for age and sex were used. Results This study identified significant associations of single nucleotide polymorphisms with estimated GFR in or nearby PRKAG2, SLC6A13, UBE2Q2, PIP5K1B, and WDR72 (P<2.1 × 10⁻³ to account for multiple testing). Single nucleotide polymorphisms in these loci explained 2.2% of the estimated GFR total variance and 2.9% of its heritability. An intronic variant of BCAS3 was significantly associated with urinary albumin to creatinine ratio. 
APOL1 single nucleotide polymorphisms were not associated with estimated GFR in a single variant test or haplotype analyses, and the at-risk variants identified in individuals with African ancestry were not detected in DNA sequencing of American Indians. Conclusion This study extends the genetic associations of loci affecting kidney function to American Indians, a population at high risk of kidney disease, and provides additional support for a potential biologic relevance of these loci across ancestries. PMID:24311711

  17. TumorNext: A comprehensive tumor profiling assay that incorporates high resolution copy number analysis and germline status to improve testing accuracy

    PubMed Central

    Gray, Phillip N.; Vuong, Huy; Tsai, Pei; Lu, Hsaio-Mei; Mu, Wenbo; Hsuan, Vickie; Hoo, Jayne; Shah, Swati; Uyeda, Lisa; Fox, Susanne; Patel, Harshil; Janicek, Mike; Brown, Sandra; Dobrea, Lavinia; Wagman, Lawrence; Plimack, Elizabeth; Mehra, Ranee; Golemis, Erica A.; Bilusic, Marijo; Zibelman, Matthew; Elliott, Aaron

    2016-01-01

    The development of targeted therapies for both germline and somatic DNA mutations has increased the need for molecular profiling assays to determine the mutational status of specific genes. Moreover, the potential of off-label prescription of targeted therapies favors classifying tumors based on DNA alterations rather than traditional tissue pathology. Here we describe the analytical validation of a custom probe-based NGS tumor panel, TumorNext, which can detect single nucleotide variants, small insertions and deletions in 142 genes that are frequently mutated in somatic and/or germline cancers. TumorNext also detects gene fusions and structural variants, such as tandem duplications and inversions, in 15 frequently disrupted oncogenes and tumor suppressors. The assay uses a matched control and custom bioinformatics pipeline to differentiate between somatic and germline mutations, allowing precise variant classification. We tested 170 previously characterized samples, of which > 95% were formalin-fixed paraffin embedded tissue from 8 different cancer types, and highlight examples where lack of germline status may have led to the inappropriate prescription of therapy. We also describe the validation of the Affymetrix OncoScan platform, an array technology for high resolution copy number variant detection for use in parallel with the NGS panel that can detect single copy amplifications and hemizygous deletions. We analyzed 80 previously characterized formalin-fixed paraffin-embedded specimens and provide examples of hemizygous deletion detection in samples with known pathogenic germline mutations. Thus, the TumorNext combined approach of NGS and OncoScan potentially allows for the identification of the “second hit” in hereditary cancer patients. PMID:27626691

  18. Smoothing Forecasting Methods for Academic Library Circulations: An Evaluation and Recommendation.

    ERIC Educational Resources Information Center

    Brooks, Terrence A.; Forys, John W., Jr.

    1986-01-01

    Circulation time-series data from 50 midwest academic libraries were used to test 110 variants of 8 smoothing forecasting methods. Data and methodologies and illustrations of two recommended methods--the single exponential smoothing method and Brown's one-parameter linear exponential smoothing method--are given. Eight references are cited. (EJS)
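
    The two recommended methods named in this abstract have standard textbook forms, sketched below. This is a generic illustration, not the authors' implementation: the function names, the initialisation with the first observation, and the example series are assumptions.

```python
def single_exponential_smoothing(series, alpha):
    """Return the smoothed series; its last value is the one-step-ahead forecast."""
    smoothed = [series[0]]  # assumed initialisation: first observation
    for x in series[1:]:
        smoothed.append(alpha * x + (1 - alpha) * smoothed[-1])
    return smoothed


def brown_linear_forecast(series, alpha, horizon=1):
    """Brown's one-parameter linear exponential smoothing forecast (0 < alpha < 1)."""
    s1 = s2 = series[0]  # assumed initialisation for both smoothing passes
    for x in series[1:]:
        s1 = alpha * x + (1 - alpha) * s1   # first smoothing pass
        s2 = alpha * s1 + (1 - alpha) * s2  # smoothing of the smoothed values
    level = 2 * s1 - s2
    trend = alpha / (1 - alpha) * (s1 - s2)
    return level + trend * horizon


# Hypothetical monthly circulation counts with an upward trend.
circulation = [100, 110, 120, 130]
print(single_exponential_smoothing(circulation, alpha=0.5)[-1])
print(brown_linear_forecast(circulation, alpha=0.5, horizon=1))
```

    Single smoothing lags behind a trending series because it only tracks a level, while Brown's method uses the same single parameter to estimate both a level and a trend.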

  19. How single nucleotide polymorphism chips will advance our knowledge of factors controlling puberty and aid in selecting replacement beef females

    USDA-ARS's Scientific Manuscript database

    The promise of genomic selection is accurate prediction of animals' genetic potential from their genotypes. Simple DNA tests might replace low accuracy predictions for expensive or lowly heritable measures of puberty and fertility based on performance and pedigree. Knowing which DNA variants affec...

  20. Do Genetic Susceptibility Variants Associate with Disease Severity in Early Active Rheumatoid Arthritis?

    PubMed

    Scott, Ian C; Rijsdijk, Frühling; Walker, Jemma; Quist, Jelmar; Spain, Sarah L; Tan, Rachael; Steer, Sophia; Okada, Yukinori; Raychaudhuri, Soumya; Cope, Andrew P; Lewis, Cathryn M

    2015-07-01

    Genetic variants affect both the development and severity of rheumatoid arthritis (RA). Recent studies have expanded the number of RA susceptibility variants. We tested the hypothesis that these associated with disease severity in a clinical trial cohort of patients with early, active RA. We evaluated 524 patients with RA enrolled in the Combination Anti-Rheumatic Drugs in Early RA (CARDERA) trials. We tested validated susceptibility variants - 69 single-nucleotide polymorphisms (SNP), 15 HLA-DRB1 alleles, and amino acid polymorphisms in 6 HLA molecule positions - for their associations with progression in Larsen scoring, 28-joint Disease Activity Scores, and Health Assessment Questionnaire (HAQ) scores over 2 years using linear mixed-effects and latent growth curve models. HLA variants were associated with joint destruction. The *04:01 SNP (rs660895, p = 0.0003), *04:01 allele (p = 0.0002), and HLA-DRβ1 amino acids histidine at position 13 (p = 0.0005) and valine at position 11 (p = 0.0012) significantly associated with radiological progression. This association was only significant in anticitrullinated protein antibody (ACPA)-positive patients, suggesting that while their effects were not mediated by ACPA, they only predicted joint damage in ACPA-positive RA. Non-HLA variants did not associate with radiograph damage (assessed individually and cumulatively as a weighted genetic risk score). Two SNP - rs11889341 (STAT4, p = 0.0001) and rs653178 (SH2B3-PTPN11, p = 0.0004) - associated with HAQ scores over 6-24 months. HLA susceptibility variants play an important role in determining radiological progression in early, active ACPA-positive RA. Genome-wide and HLA-wide analyses across large populations are required to better characterize the genetic architecture of radiological progression in RA.

  1. Genetic variation and dopamine D2 receptor availability: a systematic review and meta-analysis of human in vivo molecular imaging studies.

    PubMed

    Gluskin, B S; Mickey, B J

    2016-03-01

    The D2 dopamine receptor mediates neuropsychiatric symptoms and is a target of pharmacotherapy. Inter-individual variation of D2 receptor density is thought to influence disease risk and pharmacological response. Numerous molecular imaging studies have tested whether common genetic variants influence D2 receptor binding potential (BP) in humans, but demonstration of robust effects has been limited by small sample sizes. We performed a systematic search of published human in vivo molecular imaging studies to estimate effect sizes of common genetic variants on striatal D2 receptor BP. We identified 21 studies examining 19 variants in 11 genes. The most commonly studied variant was a single-nucleotide polymorphism in ANKK1 (rs1800497, Glu713Lys, also called 'Taq1A'). Fixed- and random-effects meta-analyses of this variant (5 studies, 194 subjects total) revealed that striatal BP was significantly and robustly lower among carriers of the minor allele (Lys713) relative to major allele homozygotes. The weighted standardized mean difference was -0.57 under the fixed-effect model (95% confidence interval=(-0.87, -0.27), P=0.0002). The normal relationship between rs1800497 and BP was not apparent among subjects with neuropsychiatric diseases. Significant associations with baseline striatal D2 receptor BP have been reported for four DRD2 variants (rs1079597, rs1076560, rs6277 and rs1799732) and a PER2 repeat polymorphism, but none have yet been tested in more than two independent samples. Our findings resolve apparent discrepancies in the literature and establish that rs1800497 robustly influences striatal D2 receptor availability. This genetic variant is likely to contribute to important individual differences in human striatal function, neuropsychiatric disease risk and pharmacological response.
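
    The fixed-effect result reported above can be sanity-checked from its own confidence interval. The sketch below assumes a normal approximation and standard inverse-variance pooling; the function names are illustrative, and only the pooled estimate and interval are taken from the abstract (the per-study effects are not given there).

```python
import math


def fixed_effect_pool(effects, std_errors):
    """Inverse-variance weighted pooled effect and its standard error."""
    weights = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    return pooled, pooled_se


def two_sided_p(estimate, se):
    """Two-sided p-value under a normal approximation."""
    z = abs(estimate / se)
    return math.erfc(z / math.sqrt(2))


# Values reported in the abstract: SMD = -0.57, 95% CI (-0.87, -0.27).
# Recover the standard error from the interval width (normal approximation).
se_from_ci = (-0.27 - (-0.87)) / (2 * 1.96)
p_value = two_sided_p(-0.57, se_from_ci)
print(round(se_from_ci, 3), round(p_value, 4))
```

    Under these assumptions the recovered p-value rounds to the reported P=0.0002, so the interval and the p-value in the abstract are mutually consistent.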

  2. Whole-Exome Sequencing in Age-Related Macular Degeneration Identifies Rare Variants in COL8A1, a Component of Bruch's Membrane.

    PubMed

    Corominas, Jordi; Colijn, Johanna M; Geerlings, Maartje J; Pauper, Marc; Bakker, Bjorn; Amin, Najaf; Lores Motta, Laura; Kersten, Eveline; Garanto, Alejandro; Verlouw, Joost A M; van Rooij, Jeroen G J; Kraaij, Robert; de Jong, Paulus T V M; Hofman, Albert; Vingerling, Johannes R; Schick, Tina; Fauser, Sascha; de Jong, Eiko K; van Duijn, Cornelia M; Hoyng, Carel B; Klaver, Caroline C W; den Hollander, Anneke I

    2018-04-26

    Genome-wide association studies and targeted sequencing studies of candidate genes have identified common and rare variants that are associated with age-related macular degeneration (AMD). Whole-exome sequencing (WES) studies allow a more comprehensive analysis of rare coding variants across all genes of the genome and will contribute to a better understanding of the underlying disease mechanisms. To date, the number of WES studies in AMD case-control cohorts remains scarce and sample sizes are limited. To scrutinize the role of rare protein-altering variants in AMD cause, we performed the largest WES study in AMD to date in a large European cohort consisting of 1125 AMD patients and 1361 control participants. Genome-wide case-control association study of WES data. One thousand one hundred twenty-five AMD patients and 1361 control participants. A single variant association test of WES data was performed to detect variants that are associated individually with AMD. The cumulative effect of multiple rare variants within 1 gene was analyzed using a gene-based CMC burden test. Immunohistochemistry was performed to determine the localization of the Col8a1 protein in mouse eyes. Genetic variants associated with AMD. We detected significantly more rare protein-altering variants in the COL8A1 gene in patients (22/2250 alleles [1.0%]) than in control participants (11/2722 alleles [0.4%]; P = 7.07×10⁻⁵). The association of rare variants in the COL8A1 gene is independent of the common intergenic variant (rs140647181) near the COL8A1 gene previously associated with AMD. We demonstrated that the Col8a1 protein localizes at Bruch's membrane. This study supported a role for protein-altering variants in the COL8A1 gene in AMD pathogenesis. We demonstrated the presence of Col8a1 in Bruch's membrane, further supporting the role of COL8A1 variants in AMD pathogenesis. 
Protein-altering variants in COL8A1 may alter the integrity of Bruch's membrane, contributing to the accumulation of drusen and the development of AMD. Copyright © 2018 American Academy of Ophthalmology. Published by Elsevier Inc. All rights reserved.

  3. Novel Myopia Genes and Pathways Identified From Syndromic Forms of Myopia

    PubMed Central

    Loughman, James; Wildsoet, Christine F.; Williams, Cathy; Guggenheim, Jeremy A.

    2018-01-01

    Purpose To test the hypothesis that genes known to cause clinical syndromes featuring myopia also harbor polymorphisms contributing to nonsyndromic refractive errors. Methods Clinical phenotypes and syndromes that have refractive errors as a recognized feature were identified using the Online Mendelian Inheritance in Man (OMIM) database. One hundred fifty-four unique causative genes were identified, of which 119 were specifically linked with myopia and 114 represented syndromic myopia (i.e., myopia and at least one other clinical feature). Myopia was the only refractive error listed for 98 genes and hyperopia was the only refractive error noted for 28 genes, with the remaining 28 genes linked to phenotypes with multiple forms of refractive error. Pathway analysis was carried out to find biological processes overrepresented within these sets of genes. Genetic variants located within 50 kb of the 119 myopia-related genes were evaluated for involvement in refractive error by analysis of summary statistics from genome-wide association studies (GWAS) conducted by the CREAM Consortium and 23andMe, using both single-marker and gene-based tests. Results Pathway analysis identified several biological processes already implicated in refractive error development through prior GWAS analyses and animal studies, including extracellular matrix remodeling, focal adhesion, and axon guidance, supporting the research hypothesis. Novel pathways also implicated in myopia development included mannosylation, glycosylation, lens development, gliogenesis, and Schwann cell differentiation. Hyperopia was found to be linked to a different pattern of biological processes, mostly related to organogenesis. Comparison with GWAS findings further confirmed that syndromic myopia genes were enriched for genetic variants that influence refractive errors in the general population. 
Gene-based analyses implicated 21 novel candidate myopia genes (ADAMTS18, ADAMTS2, ADAMTSL4, AGK, ALDH18A1, ASXL1, COL4A1, COL9A2, ERBB3, FBN1, GJA1, GNPTG, IFIH1, KIF11, LTBP2, OCA2, POLR3B, POMT1, PTPN11, TFAP2A, ZNF469). Conclusions Common genetic variants within or nearby genes that cause syndromic myopia are enriched for variants that cause nonsyndromic, common myopia. Analysis of syndromic forms of refractive errors can provide new insights into the etiology of myopia and additional potential targets for therapeutic interventions. PMID:29346494

  4. SCN5A (NaV1.5) Variant Functional Perturbation and Clinical Presentation: Variants of a Certain Significance.

    PubMed

    Kroncke, Brett M; Glazer, Andrew M; Smith, Derek K; Blume, Jeffrey D; Roden, Dan M

    2018-05-01

    Accurately predicting the impact of rare nonsynonymous variants on disease risk is an important goal in precision medicine. Variants in the cardiac sodium channel SCN5A (protein NaV1.5; voltage-dependent cardiac Na+ channel) are associated with multiple arrhythmia disorders, including Brugada syndrome and long QT syndrome. Rare SCN5A variants also occur in ≈1% of unaffected individuals. We hypothesized that in vitro electrophysiological functional parameters explain a statistically significant portion of the variability in disease penetrance. From a comprehensive literature review, we quantified the number of carriers presenting with and without disease for 1712 reported SCN5A variants. For 356 variants, data were also available for 5 NaV1.5 electrophysiological parameters: peak current, late/persistent current, steady-state V1/2 of activation and inactivation, and recovery from inactivation. We found that peak and late current significantly associate with Brugada syndrome (P<0.001; ρ=-0.44; Spearman rank test) and long QT syndrome disease penetrance (P<0.001; ρ=0.37). Steady-state V1/2 activation and recovery from inactivation associate significantly with Brugada syndrome and long QT syndrome penetrance, respectively. Continuous estimates of disease penetrance align with the current American College of Medical Genetics classification paradigm. NaV1.5 in vitro electrophysiological parameters are correlated with Brugada syndrome and long QT syndrome disease risk. Our data emphasize the value of in vitro electrophysiological characterization and incorporating counts of affected and unaffected carriers to aid variant classification. This quantitative analysis of the electrophysiological literature should aid the interpretation of NaV1.5 variant electrophysiological abnormalities and help improve NaV1.5 variant classification. © 2018 American Heart Association, Inc.
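    The Spearman rank test cited in this abstract correlates the ranks of two measurements rather than their raw values. The following is a minimal sketch of that statistic in plain Python; the function names and the data are hypothetical illustrations and are not taken from the study.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


# Hypothetical example: peak current (% of wild type) against the fraction
# of carriers presenting with disease, for six made-up variants.
peak_current = [5, 20, 40, 60, 85, 100]
penetrance = [0.60, 0.45, 0.50, 0.20, 0.10, 0.05]
rho = spearman_rho(peak_current, penetrance)
```

    With these invented values rho is negative, the same direction as the reported association between peak current and Brugada syndrome penetrance; the magnitude carries no meaning because the data are illustrative.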

  5. Haplotype-based identification of a microsomal transfer protein marker associated with the human lifespan

    PubMed Central

    Geesaman, Bard J.; Benson, Erica; Brewster, Stephanie J.; Kunkel, Louis M.; Blanché, Hélène; Thomas, Gilles; Perls, Thomas T.; Daly, Mark J.; Puca, Annibale A.

    2003-01-01

    We previously reported a genomewide linkage study for human longevity using 308 long-lived individuals (LLI) (centenarians or near-centenarians) in 137 sibships and identified statistically significant linkage within chromosome 4 near microsatellite D4S1564. This interval spans 12 million bp and contains ≈50 putative genes. To identify the specific gene and gene variants impacting lifespan, we performed a haplotype-based fine-mapping study of the interval. The resulting genetic association study identified a haplotype marker within microsomal transfer protein as a modifier of human lifespan. This same variant was tested in a second cohort of LLI from France, and although the association was not replicated, there was evidence for statistical distortion in the form of Hardy–Weinberg disequilibrium. Microsomal transfer protein has been identified as the rate-limiting step in lipoprotein synthesis and may affect longevity by subtly modulating this pathway. This study provides proof of concept for the feasibility of using the genomes of LLI to identify genes impacting longevity. PMID:14615589

  6. MultiGeMS: detection of SNVs from multiple samples using model selection on high-throughput sequencing data.

    PubMed

    Murillo, Gabriel H; You, Na; Su, Xiaoquan; Cui, Wei; Reilly, Muredach P; Li, Mingyao; Ning, Kang; Cui, Xinping

    2016-05-15

    Single nucleotide variant (SNV) detection procedures are being utilized as never before to analyze the recent abundance of high-throughput DNA sequencing data, both on single and multiple sample datasets. Building on previously published work with the single sample SNV caller genotype model selection (GeMS), a multiple sample version of GeMS (MultiGeMS) is introduced. Unlike other popular multiple sample SNV callers, the MultiGeMS statistical model accounts for enzymatic substitution sequencing errors. It also addresses the multiple testing problem endemic to multiple sample SNV calling and utilizes high performance computing (HPC) techniques. A simulation study demonstrates that MultiGeMS ranks highest in precision among a selection of popular multiple sample SNV callers, while showing exceptional recall in calling common SNVs. Further, both simulation studies and real data analyses indicate that MultiGeMS is robust to low-quality data. We also demonstrate that accounting for enzymatic substitution sequencing errors not only improves SNV call precision at low mapping quality regions, but also improves recall at reference allele-dominated sites with high mapping quality. The MultiGeMS package can be downloaded from https://github.com/cui-lab/multigems. Contact: xinping.cui@ucr.edu. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  7. A comparison of ensemble post-processing approaches that preserve correlation structures

    NASA Astrophysics Data System (ADS)

    Schefzik, Roman; Van Schaeybroeck, Bert; Vannitsem, Stéphane

    2016-04-01

    Despite the fact that ensemble forecasts address the major sources of uncertainty, they exhibit biases and dispersion errors and therefore are known to improve by calibration or statistical post-processing. For instance, the ensemble model output statistics (EMOS) method, also known as the non-homogeneous regression approach (Gneiting et al., 2005), is known to strongly improve forecast skill. EMOS is based on fitting and adjusting a parametric probability density function (PDF). However, EMOS and other common post-processing approaches apply to a single weather quantity at a single location for a single look-ahead time. They are therefore unable to take into account spatial, inter-variable and temporal dependence structures. Recently many research efforts have been invested in designing post-processing methods that resolve this drawback but also in verification methods that enable the detection of dependence structures. New verification methods are applied to two classes of post-processing methods, both generating physically coherent ensembles. A first class uses the ensemble copula coupling (ECC) that starts from EMOS but adjusts the rank structure (Schefzik et al., 2013). The second class is a member-by-member post-processing (MBM) approach that maps each raw ensemble member to a corrected one (Van Schaeybroeck and Vannitsem, 2015). We compare variants of the EMOS-ECC and MBM classes and highlight a specific theoretical connection between them. All post-processing variants are applied in the context of the ensemble system of the European Centre for Medium-Range Weather Forecasts (ECMWF) and compared using multivariate verification tools including the energy score, the variogram score (Scheuerer and Hamill, 2015) and the band depth rank histogram (Thorarinsdottir et al., 2015).
    Gneiting, Raftery, Westveld, and Goldman, 2005: Calibrated probabilistic forecasting using ensemble model output statistics and minimum CRPS estimation. Mon. Wea. Rev., 133, 1098-1118. Scheuerer and Hamill, 2015: Variogram-based proper scoring rules for probabilistic forecasts of multivariate quantities. Mon. Wea. Rev., 143, 1321-1334. Schefzik, Thorarinsdottir, and Gneiting, 2013: Uncertainty quantification in complex simulation models using ensemble copula coupling. Statistical Science, 28, 616-640. Thorarinsdottir, Scheuerer, and Heinz, 2015: Assessing the calibration of high-dimensional ensemble forecasts using rank histograms. arXiv:1310.0236. Van Schaeybroeck and Vannitsem, 2015: Ensemble post-processing using member-by-member approaches: theoretical aspects. Q.J.R. Meteorol. Soc., 141, 807-818.
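    The rank adjustment that ensemble copula coupling performs can be sketched for a single margin (one variable, location and lead time) as follows. This is a minimal illustration under stated assumptions: the function name and the numbers are hypothetical, the calibrated samples are assumed to come from an already fitted EMOS distribution, and the sketch is not the implementation compared in the abstract.

```python
def ecc_reorder(raw_members, calibrated_samples):
    """Reorder calibrated samples so they inherit the rank order of the raw ensemble.

    raw_members[i] is the raw forecast of ensemble member i for one margin;
    calibrated_samples holds the same number of draws from the calibrated
    (for example EMOS) distribution for that margin, in any order.
    """
    m = len(raw_members)
    # order[k] is the index of the member holding the k-th smallest raw value.
    order = sorted(range(m), key=lambda i: raw_members[i])
    sorted_calibrated = sorted(calibrated_samples)
    reordered = [0.0] * m
    for rank, member in enumerate(order):
        # The member with the k-th smallest raw value gets the k-th smallest sample.
        reordered[member] = sorted_calibrated[rank]
    return reordered


# Hypothetical three-member ensemble: member 1 is warmest, member 0 coldest.
raw = [2.0, 5.0, 3.0]
calibrated = [12.0, 10.0, 11.0]
coupled = ecc_reorder(raw, calibrated)
```

    Applying the same reordering to every margin lets each member keep its relative position across variables, locations and lead times, which is how the raw ensemble's dependence structure is carried over to the calibrated one.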

  8. Quantitating the multiplicity of infection with human immunodeficiency virus type 1 subtype C reveals a non-poisson distribution of transmitted variants.

    PubMed

    Abrahams, M-R; Anderson, J A; Giorgi, E E; Seoighe, C; Mlisana, K; Ping, L-H; Athreya, G S; Treurnicht, F K; Keele, B F; Wood, N; Salazar-Gonzalez, J F; Bhattacharya, T; Chu, H; Hoffman, I; Galvin, S; Mapanje, C; Kazembe, P; Thebus, R; Fiscus, S; Hide, W; Cohen, M S; Karim, S Abdool; Haynes, B F; Shaw, G M; Hahn, B H; Korber, B T; Swanstrom, R; Williamson, C

    2009-04-01

    Identifying the specific genetic characteristics of successfully transmitted variants may prove central to the development of effective vaccine and microbicide interventions. Although human immunodeficiency virus transmission is associated with a population bottleneck, the extent to which different factors influence the diversity of transmitted viruses is unclear. We estimate here the number of transmitted variants in 69 heterosexual men and women with primary subtype C infections. From 1,505 env sequences obtained using a single genome amplification approach we show that 78% of infections involved single variant transmission and 22% involved multiple variant transmissions (median of 3). We found evidence for mutations selected for cytotoxic-T-lymphocyte or antibody escape and a high prevalence of recombination in individuals infected with multiple variants representing another potential escape pathway in these individuals. In a combined analysis of 171 subtype B and C transmission events, we found that infection with more than one variant does not follow a Poisson distribution, indicating that transmission of individual virions cannot be seen as independent events, each occurring with low probability. While most transmissions resulted from a single infectious unit, multiple variant transmissions represent a significant fraction of transmission events, suggesting that there may be important mechanistic differences between these groups that are not yet understood.
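    One way to see what a Poisson model would imply here is to condition on infection (at least one transmitted variant) and ask how many variants the multi-variant infections should then carry. The sketch below does this with a zero-truncated Poisson distribution. It is an illustrative assumption, not the analysis the authors performed; only the 78% single-variant figure is taken from the abstract, and the function names are hypothetical.

```python
import math


def zero_truncated_poisson_pmf(k, lam):
    """P(K = k | K >= 1) for K ~ Poisson(lam), with k >= 1."""
    # exp(-lam) * lam**k / k! divided by (1 - exp(-lam)) simplifies to this form.
    return lam ** k / (math.factorial(k) * math.expm1(lam))


def rate_matching_single_fraction(single_fraction, lo=1e-6, hi=20.0):
    """Bisection for the rate at which P(K = 1 | K >= 1) equals single_fraction."""
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if zero_truncated_poisson_pmf(1, mid) > single_fraction:
            lo = mid  # P(K = 1 | K >= 1) falls as the rate grows
        else:
            hi = mid
    return (lo + hi) / 2.0


# 78% of infections involved a single variant (figure from the abstract).
lam = rate_matching_single_fraction(0.78)
p_single = zero_truncated_poisson_pmf(1, lam)
# Share of multi-variant infections that would carry exactly two variants.
p_two_given_multiple = zero_truncated_poisson_pmf(2, lam) / (1.0 - p_single)
```

    Under this assumed model the large majority of multi-variant infections would involve exactly two variants, which can be set against the reported median of 3 among multi-variant transmissions.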

  9. BRCA1/2 missense mutations and the value of in-silico analyses.

    PubMed

    Sadowski, Carolin E; Kohlstedt, Daniela; Meisel, Cornelia; Keller, Katja; Becker, Kerstin; Mackenroth, Luisa; Rump, Andreas; Schröck, Evelin; Wimberger, Pauline; Kast, Karin

    2017-11-01

    The clinical implications of genetic variants in BRCA1/2 in healthy and affected individuals are considerable. Variant interpretation, however, is especially challenging for missense variants. The majority of them are classified as variants of unknown clinical significance (VUS). Computational (in-silico) predictive programs are easy to access, but represent only one tool out of a wide range of complemental approaches to classify VUS. With this single-center study, we aimed to evaluate the impact of in-silico analyses in a spectrum of different BRCA1/2 missense variants. We conducted mutation analysis of BRCA1/2 in 523 index patients with suspected hereditary breast and ovarian cancer (HBOC). Classification of the genetic variants was performed according to the German Consortium (GC)-HBOC database. Additionally, all missense variants were classified by the following three in-silico prediction tools: SIFT, Mutation Taster (MT2) and PolyPhen2 (PPH2). Overall 201 different variants, 68 of which constituted missense variants were ranked as pathogenic, neutral, or unknown. The classification of missense variants by in-silico tools resulted in a higher amount of pathogenic mutations (25% vs. 13.2%) compared to the GC-HBOC-classification. Altogether, more than fifty percent (38/68, 55.9%) of missense variants were ranked differently. Sensitivity of in-silico-tools for mutation prediction was 88.9% (PPH2), 100% (SIFT) and 100% (MT2). We found a relevant discrepancy in variant classification by using in-silico prediction tools, resulting in potential overestimation and/or underestimation of cancer risk. More reliable, notably gene-specific, prediction tools and functional tests are needed to improve clinical counseling. Copyright © 2017 Elsevier Masson SAS. All rights reserved.

  10. Genetics and Genomics of Single-Gene Cardiovascular Diseases: Common Hereditary Cardiomyopathies as Prototypes of Single-Gene Disorders

    PubMed Central

    Marian, Ali J.; van Rooij, Eva; Roberts, Robert

    2016-01-01

    This is the first of 2 review papers on genetics and genomics appearing as part of the series on “omics.” Genomics pertains to all components of an organism’s genes, whereas genetics involves analysis of a specific gene(s) in the context of heredity. The paper provides introductory comments, describes the basis of human genetic diversity, and addresses the phenotypic consequences of genetic variants. Rare variants with large effect sizes are responsible for single-gene disorders, whereas complex polygenic diseases are typically due to multiple genetic variants, each exerting a modest effect size. To illustrate the clinical implications of genetic variants with large effect sizes, 3 common forms of hereditary cardiomyopathies are discussed as prototypic examples of single-gene disorders, including their genetics, clinical manifestations, pathogenesis, and treatment. The genetic basis of complex traits is discussed in a separate paper. PMID:28007145

  11. Remarkable Diversity of Escherichia coli Carrying mcr-1 from Hospital Sewage with the Identification of Two New mcr-1 Variants.

    PubMed

    Zhao, Feifei; Feng, Yu; Lü, Xiaoju; McNally, Alan; Zong, Zhiyong

    2017-01-01

    The plasmid-borne colistin-resistant gene mcr-1 has rapidly become a worldwide public health concern. This study aims to determine the host bacterial strains, plasmids, and genetic contexts of mcr-1 in hospital sewage. A 1-ml hospital sewage sample was cultured. Colistin-resistant bacterial colonies were selected on agar plates and were subjected to whole genome sequencing and subsequent analysis. The transfer of mcr-1 between bacterial strains was tested using conjugation. New variants of mcr-1 were cloned to test the impact of variations on the function of mcr-1. Plasmids carrying mcr-1 were retrieved from GenBank for comparison based on concatenated backbone genes. In the sewage sample, we observed that mcr-1 was located in various genetic contexts on the chromosome, or plasmids of four different replicon types (IncHI2, IncI2, IncP, and IncX4), in Klebsiella pneumoniae, Kluyvera spp. and seven Escherichia coli strains of six different sequence types (ST10, ST34, ST48, ST1196, ST7086, and ST7087). We also identified two new variants of mcr-1, mcr-1.4 and mcr-1.7, both of which encode an amino acid variation from mcr-1. mcr-1-carrying IncX4 plasmids, which have a global distribution across the Enterobacteriaceae, are the result of global dissemination of a single common plasmid, while IncI2 mcr-1 plasmids appear to acquire mcr-1 in multiple events. In conclusion, the unprecedented remarkable diversity of species, strains, plasmids, and genetic contexts carrying mcr-1 present in a single sewage sample from a single healthcare site highlights the continued evolution and dynamic transmission of mcr-1 in healthcare-associated environments.

  12. Pleiotropic Effects of Variants in Dementia Genes in Parkinson Disease.

    PubMed

    Ibanez, Laura; Dube, Umber; Davis, Albert A; Fernandez, Maria V; Budde, John; Cooper, Breanna; Diez-Fairen, Monica; Ortega-Cubero, Sara; Pastor, Pau; Perlmutter, Joel S; Cruchaga, Carlos; Benitez, Bruno A

    2018-01-01

    Background: The prevalence of dementia in Parkinson disease (PD) increases dramatically with advancing age, approaching 80% in patients who survive 20 years with the disease. Increasing evidence suggests clinical, pathological and genetic overlap between Alzheimer disease, dementia with Lewy bodies and frontotemporal dementia with PD. However, the contribution of the dementia-causing genes to PD risk, cognitive impairment and dementia in PD is not fully established. Objective: To assess the contribution of coding variants in Mendelian dementia-causing genes on the risk of developing PD and the effect on cognitive performance of PD patients. Methods: We analyzed the coding regions of the amyloid-beta precursor protein (APP), Presenilin 1 and 2 (PSEN1, PSEN2), and Granulin (GRN) genes from 1,374 PD cases and 973 controls using pooled-DNA targeted sequencing, human exome-chip and whole-exome sequencing (WES) data by single-variant and gene-based (SKAT-O and burden tests) analyses. Global cognitive function was assessed using the Mini-Mental State Examination (MMSE) or the Montreal Cognitive Assessment (MoCA). The effect of coding variants in dementia-causing genes on cognitive performance was tested by multiple regression analysis adjusting for gender, disease duration, age at dementia assessment, study site and APOE carrier status. Results: Known AD pathogenic mutations in the PSEN1 (p.A79V) and PSEN2 (p.V148I) genes were found in 0.3% of all PD patients. There was a significant burden of rare, likely damaging variants in the GRN and PSEN1 genes in PD patients when compared with frequencies in the European population from the ExAC database. Multiple regression analysis revealed that PD patients carrying rare variants in the APP, PSEN1, PSEN2, and GRN genes exhibit lower cognitive test scores than non-carrier PD patients (p = 2.0 × 10⁻⁴), independent of age at PD diagnosis, age at evaluation, APOE status or recruitment site.
    Conclusions: Pathogenic mutations in the Alzheimer disease-causing genes (PSEN1 and PSEN2) are found in sporadic PD patients. PD patients with cognitive decline carry rare variants in dementia-causing genes. Variants in genes causing Mendelian neurodegenerative diseases exhibit pleiotropic effects.

  13. Single Assay for Simultaneous Detection and Differential Identification of Human and Avian Influenza Virus Types, Subtypes, and Emergent Variants

    PubMed Central

    Metzgar, David; Myers, Christopher A.; Russell, Kevin L.; Faix, Dennis; Blair, Patrick J.; Brown, Jason; Vo, Scott; Swayne, David E.; Thomas, Colleen; Stenger, David A.; Lin, Baochuan; Malanoski, Anthony P.; Wang, Zheng; Blaney, Kate M.; Long, Nina C.; Schnur, Joel M.; Saad, Magdi D.; Borsuk, Lisa A.; Lichanska, Agnieszka M.; Lorence, Matthew C.; Weslowski, Brian; Schafer, Klaus O.; Tibbetts, Clark

    2010-01-01

    For more than four decades the cause of most type A influenza virus infections of humans has been attributed to only two viral subtypes, A/H1N1 or A/H3N2. In contrast, avian and other vertebrate species are a reservoir of type A influenza virus genome diversity, hosting strains representing at least 120 of 144 combinations of 16 viral hemagglutinin and 9 viral neuraminidase subtypes. Viral genome segment reassortments and mutations emerging within this reservoir may spawn new influenza virus strains as imminent epidemic or pandemic threats to human health and poultry production. Traditional methods to detect and differentiate influenza virus subtypes are either time-consuming and labor-intensive (culture-based) or remarkably insensitive (antibody-based). Molecular diagnostic assays based upon reverse transcriptase-polymerase chain reaction (RT-PCR) have short assay cycle time, and high analytical sensitivity and specificity. However, none of these diagnostic tests determine viral gene nucleotide sequences to distinguish strains and variants of a detected pathogen from one specimen to the next. Decision-quality, strain- and variant-specific pathogen gene sequence information may be critical for public health, infection control, surveillance, epidemiology, or medical/veterinary treatment planning. The Resequencing Pathogen Microarray (RPM-Flu) is a robust, highly multiplexed and target gene sequencing-based alternative to both traditional culture- or biomarker-based diagnostic tests. RPM-Flu is a single, simultaneous differential diagnostic assay for all subtype combinations of type A influenza viruses and for 30 other viral and bacterial pathogens that may cause influenza-like illness. These other pathogen targets of RPM-Flu may co-infect and compound the morbidity and/or mortality of patients with influenza. 
The informative specificity of a single RPM-Flu test represents specimen-specific viral gene sequences as determinants of virus type, A/HN subtype, virulence, host-range, and resistance to antiviral agents. PMID:20140251

  14. Single assay for simultaneous detection and differential identification of human and avian influenza virus types, subtypes, and emergent variants.

    PubMed

    Metzgar, David; Myers, Christopher A; Russell, Kevin L; Faix, Dennis; Blair, Patrick J; Brown, Jason; Vo, Scott; Swayne, David E; Thomas, Colleen; Stenger, David A; Lin, Baochuan; Malanoski, Anthony P; Wang, Zheng; Blaney, Kate M; Long, Nina C; Schnur, Joel M; Saad, Magdi D; Borsuk, Lisa A; Lichanska, Agnieszka M; Lorence, Matthew C; Weslowski, Brian; Schafer, Klaus O; Tibbetts, Clark

    2010-02-03

    For more than four decades the cause of most type A influenza virus infections of humans has been attributed to only two viral subtypes, A/H1N1 or A/H3N2. In contrast, avian and other vertebrate species are a reservoir of type A influenza virus genome diversity, hosting strains representing at least 120 of 144 combinations of 16 viral hemagglutinin and 9 viral neuraminidase subtypes. Viral genome segment reassortments and mutations emerging within this reservoir may spawn new influenza virus strains as imminent epidemic or pandemic threats to human health and poultry production. Traditional methods to detect and differentiate influenza virus subtypes are either time-consuming and labor-intensive (culture-based) or remarkably insensitive (antibody-based). Molecular diagnostic assays based upon reverse transcriptase-polymerase chain reaction (RT-PCR) have short assay cycle time, and high analytical sensitivity and specificity. However, none of these diagnostic tests determine viral gene nucleotide sequences to distinguish strains and variants of a detected pathogen from one specimen to the next. Decision-quality, strain- and variant-specific pathogen gene sequence information may be critical for public health, infection control, surveillance, epidemiology, or medical/veterinary treatment planning. The Resequencing Pathogen Microarray (RPM-Flu) is a robust, highly multiplexed and target gene sequencing-based alternative to both traditional culture- or biomarker-based diagnostic tests. RPM-Flu is a single, simultaneous differential diagnostic assay for all subtype combinations of type A influenza viruses and for 30 other viral and bacterial pathogens that may cause influenza-like illness. These other pathogen targets of RPM-Flu may co-infect and compound the morbidity and/or mortality of patients with influenza. 
The informative specificity of a single RPM-Flu test represents specimen-specific viral gene sequences as determinants of virus type, A/HN subtype, virulence, host-range, and resistance to antiviral agents.

  15. Evaluation of Effect CAT -262C/T, SOD + 35A/C, GPx1 Pro197Leu Polymorphisms in Patients with IBD in the Polish Population.

    PubMed

    Mrowicki, Jerzy; Mrowicka, Małgorzata; Majsterek, Ireneusz; Mik, Michał; Dziki, Adam; Dziki, Łukasz

    2016-12-01

    Inflammatory bowel diseases (IBD) are a heterogeneous group of disorders characterized by chronic, recurrent gastrointestinal inflammation. IBD is believed to develop in patients with a genetic predisposition. Chronic inflammation develops as a result of an excessive reaction of the immune system, principally under the influence of environmental risk factors. Oxidative stress has been shown to be associated with the pathophysiology of IBD and is held responsible for the onset and progression of these diseases. The aim of the study was to evaluate the relationship between single nucleotide polymorphisms (SNPs) of individual antioxidant enzymes and the prevalence of inflammatory bowel disease, which may be associated with increased levels of oxidative stress. A total of 111 IBD patients, including 65 patients with ulcerative colitis (UC) and 46 with Crohn's disease (CD), and 125 healthy controls recruited from the Polish population were genotyped for the CAT -262C/T (rs1001179), SOD +35A/C (rs2234694), and GPx1 Pro197Leu polymorphisms. Genotyping of the CAT, SOD, and GPx gene polymorphisms was performed by RFLP-PCR. The analysis showed that the CAT -262C/T polymorphism may have a protective effect in patients with ulcerative colitis for the C/T genotype: OR = 0.49 (0.25-0.99), p = 0.044. A protective trend that did not reach statistical significance was also observed for the T/T genotype and T allele of the same polymorphism and for the genotypes and alleles of SOD1 +35A/C in UC, as well as for the CAT -262C/T, GPx1 Pro197Leu, and SOD1 +35A/C variants in CD. The results were compared with a control group of potentially healthy individuals without these diseases.
    The CAT -262C/T polymorphism may therefore have a protective effect in UC patients who carry the C/T genotype. A potential protective effect without statistical significance was also observed for the other genotypes and alleles of the studied polymorphic variants of antioxidant enzymes in CD and for CAT -262C/T and SOD1 +35A/C in UC. The study should be extended to a larger group of patients to confirm or refute the protective effect of CAT -262C/T observed in ulcerative colitis and the trends observed for the other polymorphic variants tested.
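    An odds ratio with a confidence interval, such as the OR = 0.49 (0.25-0.99) reported above, is commonly derived from a 2x2 table of genotype counts. The sketch below uses the log (Woolf) method for the interval; the abstract does not state which method the authors used, so that choice is an assumption, and the counts are hypothetical because the abstract does not give them.

```python
import math


def odds_ratio_ci(cases_with, cases_without, controls_with, controls_without, z=1.96):
    """Odds ratio and Woolf (log-scale) confidence interval from a 2x2 table.

    z = 1.96 gives an approximate 95% interval. All four counts must be > 0.
    """
    odds_ratio = (cases_with * controls_without) / (cases_without * controls_with)
    # Standard error of log(OR) is the root of the summed reciprocal counts.
    se_log = math.sqrt(
        1.0 / cases_with + 1.0 / cases_without
        + 1.0 / controls_with + 1.0 / controls_without
    )
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


# Hypothetical counts: carriers and non-carriers of a genotype among
# patients and healthy controls (not the study's data).
estimate, lower, upper = odds_ratio_ci(10, 20, 20, 10)
```

    An interval that lies entirely below 1, as in the reported result, is what supports reading the genotype as potentially protective at the 5% level.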

  16. Attrition from Web-Based Cognitive Testing: A Repeated Measures Comparison of Gamification Techniques.

    PubMed

    Lumsden, Jim; Skinner, Andy; Coyle, David; Lawrence, Natalia; Munafo, Marcus

    2017-11-22

    The prospect of assessing cognition longitudinally and remotely is attractive to researchers, health practitioners, and pharmaceutical companies alike. However, such repeated testing regimes place a considerable burden on participants, and with cognitive tasks typically being regarded as effortful and unengaging, these studies may experience high levels of participant attrition. One potential solution is to gamify these tasks to make them more engaging: increasing participant willingness to take part and reducing attrition. However, such an approach must balance task validity with the introduction of entertaining gamelike elements. This study aims to investigate the effects of gamelike features on participant attrition using a between-subjects, longitudinal Web-based testing study. We used three variants of a common cognitive task, the Stop Signal Task (SST), with a single gamelike feature in each: one variant where points were rewarded for performing optimally; another where the task was given a graphical theme; and a third variant, which was a standard SST and served as a control condition. Participants completed four compulsory test sessions over 4 consecutive days before entering a 6-day voluntary testing period where they faced a daily decision to either drop out or continue taking part. Participants were paid for each session they completed. A total of 482 participants signed up to take part in the study, with 265 completing the requisite four consecutive test sessions. No evidence of an effect of gamification on attrition was observed. A log-rank test showed no evidence of a difference in dropout rates between task variants (χ²₂=3.0, P=.22), and a one-way analysis of variance of the mean number of sessions completed per participant in each variant also showed no evidence of a difference (F(2,262)=1.534, P=.21, partial η²=0.012).
Our findings raise doubts about the ability of gamification to reduce attrition from longitudinal cognitive testing studies. ©Jim Lumsden, Andy Skinner, David Coyle, Natalia Lawrence, Marcus Munafo. Originally published in the Journal of Medical Internet Research (http://www.jmir.org), 22.11.2017.
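    The partial eta squared reported for the analysis of variance can be recovered from the F statistic and its degrees of freedom using the standard identity partial η² = (F × df_effect) / (F × df_effect + df_error). A short sketch follows; the function name is hypothetical and the inputs are the values quoted in the abstract.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# F(2,262) = 1.534, as quoted in the abstract.
effect_size = partial_eta_squared(1.534, 2, 262)
```

    Rounded to three decimals this matches the reported 0.012, and the error degrees of freedom (262) are consistent with 265 completers split across three task variants.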

  17. Wrapper-based selection of genetic features in genome-wide association studies through fast matrix operations

    PubMed Central

    2012-01-01

    Background Through the wealth of information contained within them, genome-wide association studies (GWAS) have the potential to provide researchers with a systematic means of associating genetic variants with a wide variety of disease phenotypes. Due to the limitations of approaches that have analyzed single variants one at a time, it has been proposed that the genetic basis of these disorders could be determined through detailed analysis of the genetic variants themselves and in conjunction with one another. The construction of models that account for these subsets of variants requires methodologies that generate predictions based on the total risk of a particular group of polymorphisms. However, due to the excessive number of variants, constructing these types of models has so far been computationally infeasible. Results We have implemented an algorithm, known as greedy RLS, that we use to perform the first known wrapper-based feature selection on the genome-wide level. The running time of greedy RLS grows linearly in the number of training examples, the number of features in the original data set, and the number of selected features. This speed is achieved through computational short-cuts based on matrix calculus. Since the memory consumption in present-day computers can form an even tighter bottleneck than running time, we also developed a space efficient variation of greedy RLS which trades running time for memory. These approaches are then compared to traditional wrapper-based feature selection implementations based on support vector machines (SVM) to reveal the relative speed-up and to assess the feasibility of the new algorithm. As a proof of concept, we apply greedy RLS to the Hypertension – UK National Blood Service WTCCC dataset and select the most predictive variants using 3-fold external cross-validation in less than 26 minutes on a high-end desktop. 
On this dataset, we also show that greedy RLS has a better classification performance on independent test data than a classifier trained using features selected by a statistical p-value-based filter, which is currently the most popular approach for constructing predictive models in GWAS. Conclusions Greedy RLS is the first known implementation of a machine learning based method with the capability to conduct a wrapper-based feature selection on an entire GWAS containing several thousand examples and over 400,000 variants. In our experiments, greedy RLS selected a highly predictive subset of genetic variants in a fraction of the time spent by wrapper-based selection methods used together with SVM classifiers. The proposed algorithms are freely available as part of the RLScore software library at http://users.utu.fi/aatapa/RLScore/. PMID:22551170
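    Wrapper-based greedy forward selection with regularized least squares (RLS, i.e. ridge regression) can be sketched as below. This is a deliberately naive illustration: it refits the model for every candidate feature and therefore lacks the matrix shortcuts that give greedy RLS its linear running time. It is not the RLScore API; all function names, the regularization value and the data are hypothetical.

```python
def invert(matrix):
    """Gauss-Jordan inverse of a small square matrix given as a list of lists."""
    n = len(matrix)
    aug = [list(row) + [float(i == j) for j in range(n)] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def loo_error(X, y, features, lam):
    """Sum of squared leave-one-out residuals of ridge regression on `features`."""
    k = len(features)
    rows = [[row[j] for j in features] for row in X]
    gram = [[sum(r[a] * r[b] for r in rows) + (lam if a == b else 0.0)
             for b in range(k)] for a in range(k)]
    g_inv = invert(gram)
    xty = [sum(r[a] * yi for r, yi in zip(rows, y)) for a in range(k)]
    w = [sum(g_inv[a][b] * xty[b] for b in range(k)) for a in range(k)]
    total = 0.0
    for r, yi in zip(rows, y):
        pred = sum(wa * ra for wa, ra in zip(w, r))
        # Leverage of this example; the LOO residual is residual / (1 - leverage).
        h = sum(r[a] * g_inv[a][b] * r[b] for a in range(k) for b in range(k))
        total += ((yi - pred) / (1.0 - h)) ** 2
    return total


def greedy_select(X, y, n_select, lam=1.0):
    """Add, one at a time, the feature that most reduces the leave-one-out error."""
    selected, remaining = [], list(range(len(X[0])))
    for _ in range(n_select):
        best = min(remaining, key=lambda j: loo_error(X, y, selected + [j], lam))
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical toy data: the third column is the one that predicts y.
X = [[0, 1, 1], [1, 0, 2], [0, 1, 3], [1, 1, 4], [0, 0, 5], [1, 0, 6]]
y = [1.1, 1.9, 3.0, 4.2, 4.9, 6.1]
chosen = greedy_select(X, y, n_select=2)
```

    The selection criterion is the model's own cross-validated error rather than a per-variant p-value, which is the distinction between wrapper and filter methods that the abstract draws.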

  18. Denoising DNA deep sequencing data—high-throughput sequencing errors and their correction

    PubMed Central

    Laehnemann, David; Borkhardt, Arndt

    2016-01-01

    Characterizing the errors generated by common high-throughput sequencing platforms and telling true genetic variation from technical artefacts are two interdependent steps, essential to many analyses such as single nucleotide variant calling, haplotype inference, sequence assembly and evolutionary studies. Both random and systematic errors can show a specific occurrence profile for each of the six prominent sequencing platforms surveyed here: 454 pyrosequencing, Complete Genomics DNA nanoball sequencing, Illumina sequencing by synthesis, Ion Torrent semiconductor sequencing, Pacific Biosciences single-molecule real-time sequencing and Oxford Nanopore sequencing. There is a large variety of programs available for error removal in sequencing read data, which differ in the error models and statistical techniques they use, the features of the data they analyse, the parameters they determine from them and the data structures and algorithms they use. We highlight the assumptions they make and for which data types these hold, providing guidance which tools to consider for benchmarking with regard to the data properties. While no benchmarking results are included here, such specific benchmarks would greatly inform tool choices and future software development. The development of stand-alone error correctors, as well as single nucleotide variant and haplotype callers, could also benefit from using more of the knowledge about error profiles and from (re)combining ideas from the existing approaches presented here. PMID:26026159

  19. An Econometric Model of External Labor Supply to the Establishment Within a Confined Geographic Market.

    ERIC Educational Resources Information Center

    Hines, Robert James

    The study, conducted in the Buffalo, New York standard metropolitan statistical area, was undertaken to formulate and test a simple model of labor supply for a local labor market. The principal variables to be examined to determine the external supply function of labor to the establishment are variants of the rate of change of the entry wage and…

  20. CHASM and SNVBox: toolkit for detecting biologically important single nucleotide mutations in cancer.

    PubMed

    Wong, Wing Chung; Kim, Dewey; Carter, Hannah; Diekhans, Mark; Ryan, Michael C; Karchin, Rachel

    2011-08-01

    Thousands of cancer exomes are currently being sequenced, yielding millions of non-synonymous single nucleotide variants (SNVs) of possible relevance to disease etiology. Here, we provide a software toolkit to prioritize SNVs based on their predicted contribution to tumorigenesis. It includes a database of precomputed, predictive features covering all positions in the annotated human exome and can be used either stand-alone or as part of a larger variant discovery pipeline. The MySQL database, source code and binaries are freely available for academic/government use at http://wiki.chasmsoftware.org. Source is in Python and C++. Requires a 32- or 64-bit Linux system (tested on Fedora Core 8, 10, 11 and Ubuntu 10), Python ≥2.5 and <3.0, MySQL server >5.0, 60 GB available hard disk space (50 MB for software and data files, 40 GB for MySQL database dump when uncompressed), and 2 GB of RAM.

  1. Plasminogen activator inhibitor 1 4G/5G and -844G/A variants in idiopathic recurrent pregnancy loss.

    PubMed

    Magdoud, Kalthoum; Herbepin, Viviana G; Touraine, Renaud; Almawi, Wassim Y; Mahjoub, Touhami

    2013-09-01

    Plasminogen activator inhibitor type 1 (PAI-1) regulates fibrinolysis, and the common promoter region variants -675G/A (4G/5G) and -844G/A are associated with increased thrombotic risk. Despite evidence linking altered fibrinolysis with adverse pregnancy events, including idiopathic recurrent pregnancy loss (RPL), the contribution of PAI-1 variants to RPL risk remains controversial. We investigated the association between the PAI-1 -844G/A and 4G/5G (-675G/A) variants with altered risk of RPL. This was a case-control study involving 304 women with confirmed RPL and 371 age- and ethnically matched control women. PAI-1 genotyping was performed by PCR single-specific primer -675 (G/A) and real-time PCR (-844G/A) analysis. Minor allele frequency (MAF) of 4G/5G (P < 0.001), but not -844G/A (P = 0.507), was higher in RPL cases. PAI-1 4G/5G single-nucleotide polymorphism (SNP) was significantly associated with RPL under additive, dominant, and recessive genetic models; no association of -844G/A with RPL was seen irrespective of the genetic model tested. Taking common -844G/5G haplotype as reference (OR = 1.00), multivariate analysis confirmed the association of 4G-containing -844A/4G (P < 0.001) and -844G/4G (P = 0.011) haplotypes with increased RPL risk. 4G/5G, but not -844G/A, PAI-1 variant is associated with an increased risk of RPL. © 2013 John Wiley & Sons Ltd.

  2. Initial Results of Multigene Panel Testing for Hereditary Breast and Ovarian Cancer and Lynch Syndrome.

    PubMed

    Howarth, Dt R; Lum, Sharon S; Esquivel, Pamela; Garberoglio, Carlos A; Senthil, Maheswari; Solomon, Naveenraj L

    2015-10-01

    Multigene panel testing for hereditary cancer risk has recently become commercially available; however, the impact of its use on patient care is undefined. We sought to evaluate results from implementation of panel testing in a multidisciplinary cancer center. We performed a retrospective review of consecutive patients undergoing genetic testing after initiating use of multigene panel testing at Loma Linda University Medical Center. From February 13 to August 25, 2014, 92 patients were referred for genetic testing based on National Comprehensive Cancer Network guidelines. Testing was completed in 90 patients. Overall, nine (10%) pathogenic mutations were identified: five BRCA1/2, and four in non-BRCA loci. Single-site testing identified one BRCA1 and one BRCA2 mutation. The remaining mutations were identified by use of panel testing for hereditary breast and ovarian cancer. There were 40 variants of uncertain significance identified in 34 patients. The use of panel testing more than doubled the identification rate of clinically significant pathogenic mutations that would have been missed with BRCA testing alone. The large number of variants of uncertain significance identified will require long-term follow-up for potential reclassification. Multigene panel testing provides additional information that may improve patient outcomes.

  3. Identification of co-occurrence in a patient with Dent's disease and ADA2-deficiency by exome sequencing.

    PubMed

    Günthner, Roman; Wagner, Matias; Thurm, Tobias; Ponsel, Sabine; Höfele, Julia; Lange-Sperandio, Bärbel

    2018-04-05

    Patients with co-occurrence of two independent pathologies pose a challenge for clinicians as the phenotype often presents as an unclear syndrome. In these cases, exome sequencing serves as a powerful instrument to determine the underlying genetic causes. Here, we present the case of a 4-year old boy with proteinuria, microhematuria, hypercalciuria, nephrocalcinosis, livedo-like rash, recurrent abdominal pain, anemia and continuously elevated CRP. Single exome sequencing revealed the pathogenic nonsense mutation p.(Arg98*) in the CLCN5 gene causing the X-linked inherited, renal tubular disorder Dent's disease. Furthermore, the two pathogenic and compound heterozygous missense variants p.(Gly47Ala) and p.(Pro251Leu) in the CECR1 gene could be identified. Mutations in the CECR1 gene are associated with a hereditary form of polyarteritis nodosa, called ADA2-deficiency. Both parents were carriers of a single heterozygous variant in CECR1 and the mother was carrier of the CLCN5 variant. This case evidently demonstrates the advantage of whole exome sequencing compared to single gene testing as the pathology in the CECR1 gene might have only been diagnosed after the occurrence of signs of systemic vasculitis like strokes or hemorrhages. Therefore, treatment and prevention can now start early to improve the outcome of these patients. Copyright © 2018 Elsevier B.V. All rights reserved.

  4. Mitochondrial DNA variants in obesity.

    PubMed

    Knoll, Nadja; Jarick, Ivonne; Volckmar, Anna-Lena; Klingenspor, Martin; Illig, Thomas; Grallert, Harald; Gieger, Christian; Wichmann, Heinz-Erich; Peters, Annette; Wiegand, Susanna; Biebermann, Heike; Fischer-Posovszky, Pamela; Wabitsch, Martin; Völzke, Henry; Nauck, Matthias; Teumer, Alexander; Rosskopf, Dieter; Rimmbach, Christian; Schreiber, Stefan; Jacobs, Gunnar; Lieb, Wolfgang; Franke, Andre; Hebebrand, Johannes; Hinney, Anke

    2014-01-01

    Heritability estimates for body mass index (BMI) variation are high. For mothers and their offspring higher BMI correlations have been described than for fathers. Variation(s) in the exclusively maternally inherited mitochondrial DNA (mtDNA) might contribute to this parental effect. Thirty-two to 40 mtDNA single nucleotide polymorphisms (SNPs) were available from genome-wide association study SNP arrays (Affymetrix 6.0). For discovery, we analyzed association in a case-control (CC) sample of 1,158 extremely obese children and adolescents and 435 lean adult controls. For independent confirmation, 7,014 population-based adults were analyzed as CC sample of n = 1,697 obese cases (BMI ≥ 30 kg/m2) and n = 2,373 normal weight and lean controls (BMI<25 kg/m2). SNPs were analyzed as single SNPs and haplogroups determined by HaploGrep. Fisher's two-sided exact test was used for association testing. Moreover, the D-loop was re-sequenced (Sanger) in 192 extremely obese children and adolescents and 192 lean adult controls. Association testing of detected variants was performed using Fisher's two-sided exact test. For discovery, nominal association with obesity was found for the frequent allele G of m.8994G/A (rs28358887, p = 0.002) located in ATP6. Haplogroup W was nominally overrepresented in the controls (p = 0.039). These findings could not be confirmed independently. For two of the 252 identified D-loop variants nominal association was detected (m.16292C/T, p = 0.007, m.16189T/C, p = 0.048). Only eight controls carried the m.16292T allele, five of whom belonged to haplogroup W that was initially enriched among these controls. m.16189T/C might create an uninterrupted poly-C tract located near a regulatory element involved in replication of mtDNA. 
Though follow-up of some D-loop variants still is conceivable, our hypothesis of a contribution of variation in the exclusively maternally inherited mtDNA to the observed larger correlations for BMI between mothers and their offspring could not be substantiated by the findings of the present study.
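    The association test named in this abstract, Fisher's two-sided exact test on case-control allele counts, can be sketched as follows. The abstract does not report the underlying 2x2 counts, so the table below is hypothetical, and the function name is an assumption made for illustration.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Rows could be cases/controls and columns allele 1/allele 2.
    The p-value sums the probabilities of all tables with the same margins
    that are no more likely than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    total = sum(prob(x) for x in range(lo, hi + 1)
                if prob(x) <= p_obs * (1 + 1e-9))  # tolerance for float ties
    return min(1.0, total)


# Hypothetical allele counts: cases (40 G, 10 A) versus controls (25 G, 25 A).
print(fisher_exact_two_sided(40, 10, 25, 25))
```

    With many variants tested, as in the study above, each nominal p-value from such a test would still need to be read against the number of tests performed.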

  5. A probabilistic method for testing and estimating selection differences between populations

    PubMed Central

    He, Yungang; Wang, Minxian; Huang, Xin; Li, Ran; Xu, Hongyang; Xu, Shuhua; Jin, Li

    2015-01-01

    Human populations around the world encounter various environmental challenges and, consequently, develop genetic adaptations to different selection forces. Identifying the differences in natural selection between populations is critical for understanding the roles of specific genetic variants in evolutionary adaptation. Although numerous methods have been developed to detect genetic loci under recent directional selection, a probabilistic solution for testing and quantifying selection differences between populations is lacking. Here we report the development of a probabilistic method for testing and estimating selection differences between populations. By use of a probabilistic model of genetic drift and selection, we showed that logarithm odds ratios of allele frequencies provide estimates of the differences in selection coefficients between populations. The estimates approximate a normal distribution, and variance can be estimated using genome-wide variants. This allows us to quantify differences in selection coefficients and to determine the confidence intervals of the estimate. Our work also revealed the link between genetic association testing and hypothesis testing of selection differences. It therefore supplies a solution for hypothesis testing of selection differences. This method was applied to a genome-wide data analysis of Han and Tibetan populations. The results confirmed that both the EPAS1 and EGLN1 genes are under statistically different selection in Han and Tibetan populations. We further estimated differences in the selection coefficients for genetic variants involved in melanin formation and determined their confidence intervals between continental population groups. Application of the method to empirical data demonstrated the outstanding capability of this novel approach for testing and quantifying differences in natural selection. PMID:26463656
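    The core quantity in this abstract, the log odds ratio of allele frequencies between two populations with a variance estimated from genome-wide variants, can be sketched as below. This is a simplified illustration and not the authors' method: it reports the raw log odds ratio without any scaling to a selection coefficient, it takes the empirical variance of a background set of variants as the variance estimate, and all frequencies and names are hypothetical.

```python
from math import log, sqrt
from statistics import NormalDist, variance


def log_odds_ratio(p1, p2):
    """Log odds ratio of an allele's frequency in population 1 versus population 2."""
    return log(p1 / (1.0 - p1)) - log(p2 / (1.0 - p2))


def selection_difference(p1, p2, background, level=0.95):
    """Return (estimate, confidence interval, two-sided p-value) for one variant.

    `background` is a list of (freq_pop1, freq_pop2) pairs for genome-wide
    variants; their log odds ratios supply the variance estimate (assumption).
    """
    estimate = log_odds_ratio(p1, p2)
    bg = [log_odds_ratio(a, b) for a, b in background]
    sd = sqrt(variance(bg))
    z_crit = NormalDist().inv_cdf(0.5 + level / 2.0)
    ci = (estimate - z_crit * sd, estimate + z_crit * sd)
    p_value = 2.0 * NormalDist().cdf(-abs(estimate) / sd)
    return estimate, ci, p_value


# Hypothetical allele frequencies for background variants in two populations.
background = [(0.30, 0.28), (0.55, 0.60), (0.12, 0.10), (0.70, 0.73), (0.45, 0.41)]
print(selection_difference(0.80, 0.20, background))
```

    A real analysis would use many thousands of background variants; five are used here only to keep the example short.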

  6. Fast single-pass alignment and variant calling using sequencing data

    USDA-ARS's Scientific Manuscript database

    Sequencing research requires efficient computation. Few programs use already known information about DNA variants when aligning sequence data to the reference map. New program findmap.f90 reads the previous variant list before aligning sequence, calling variant alleles, and summing the allele counts...

  7. ANRIL Genetic Variants in Iranian Breast Cancer Patients

    PubMed Central

    Khorshidi, Hamid Reza; Taheri, Mohammad; Noroozi, Rezvan; Sarrafzadeh, Shaghayegh; Sayad, Arezou; Ghafouri-Fard, Soudeh

    2017-01-01

    Objective The genetic variants of the long non-coding RNA ANRIL (an antisense noncoding RNA in the INK4 locus) as well as its expression have been shown to be associated with several human diseases including cancers. The aim of this study was to examine the association of ANRIL variants with breast cancer susceptibility in Iranian patients. Materials and Methods In this case-control study, we genotyped rs1333045, rs4977574, rs1333048 and rs10757278 single nucleotide polymorphisms (SNPs) in 122 breast cancer patients as well as in 200 normal age-matched subjects by tetra-primer amplification refractory mutation system polymerase chain reaction (T-ARMS-PCR). Results The TT genotype at rs1333045 was significantly over-represented among patients (P=0.038) but did not remain significant after multiple-testing correction. In addition, among all observed haplotypes (with SNP order of rs1333045, rs1333048, rs4977574 and rs10757278), four haplotypes were shown to be associated with breast cancer risk. However, after multiple-testing correction, TCGA was the only haplotype which remained significant. Conclusion These results suggest that breast cancer risk is significantly associated with ANRIL variants. Future work analyzing the expression of different associated ANRIL haplotypes would further shed light on the role of ANRIL in this disease. PMID:28580310

  8. Extreme-phenotype genome-wide association study (XP-GWAS): a method for identifying trait-associated variants by sequencing pools of individuals selected from a diversity panel.

    PubMed

    Yang, Jinliang; Jiang, Haiying; Yeh, Cheng-Ting; Yu, Jianming; Jeddeloh, Jeffrey A; Nettleton, Dan; Schnable, Patrick S

    2015-11-01

    Although approaches for performing genome-wide association studies (GWAS) are well developed, conventional GWAS requires high-density genotyping of large numbers of individuals from a diversity panel. Here we report a method for performing GWAS that does not require genotyping of large numbers of individuals. Instead XP-GWAS (extreme-phenotype GWAS) relies on genotyping pools of individuals from a diversity panel that have extreme phenotypes. This analysis measures allele frequencies in the extreme pools, enabling discovery of associations between genetic variants and traits of interest. This method was evaluated in maize (Zea mays) using the well-characterized kernel row number trait, which was selected to enable comparisons between the results of XP-GWAS and conventional GWAS. An exome-sequencing strategy was used to focus sequencing resources on genes and their flanking regions. A total of 0.94 million variants were identified and served as evaluation markers; comparisons among pools showed that 145 of these variants were statistically associated with the kernel row number phenotype. These trait-associated variants were significantly enriched in regions identified by conventional GWAS. XP-GWAS was able to resolve several linked QTL and detect trait-associated variants within a single gene under a QTL peak. XP-GWAS is expected to be particularly valuable for detecting genes or alleles responsible for quantitative variation in species for which extensive genotyping resources are not available, such as wild progenitors of crops, orphan crops, and other poorly characterized species such as those of ecological interest. © 2015 The Authors The Plant Journal published by Society for Experimental Biology and John Wiley & Sons Ltd.

  9. International interlaboratory study comparing single organism 16S rRNA gene sequencing data: Beyond consensus sequence comparisons

    PubMed Central

    Olson, Nathan D.; Lund, Steven P.; Zook, Justin M.; Rojas-Cornejo, Fabiola; Beck, Brian; Foy, Carole; Huggett, Jim; Whale, Alexandra S.; Sui, Zhiwei; Baoutina, Anna; Dobeson, Michael; Partis, Lina; Morrow, Jayne B.

    2015-01-01

    This study presents the results from an interlaboratory sequencing study for which we developed a novel high-resolution method for comparing data from different sequencing platforms for a multi-copy, paralogous gene. The combination of PCR amplification and 16S ribosomal RNA gene (16S rRNA) sequencing has revolutionized bacteriology by enabling rapid identification, frequently without the need for culture. To assess variability between laboratories in sequencing 16S rRNA, six laboratories sequenced the gene encoding the 16S rRNA from Escherichia coli O157:H7 strain EDL933 and Listeria monocytogenes serovar 4b strain NCTC11994. Participants performed sequencing methods and protocols available in their laboratories: Sanger sequencing, Roche 454 pyrosequencing®, or Ion Torrent PGM®. The sequencing data were evaluated on three levels: (1) identity of biologically conserved position, (2) ratio of 16S rRNA gene copies featuring identified variants, and (3) the collection of variant combinations in a set of 16S rRNA gene copies. The same set of biologically conserved positions was identified for each sequencing method. Analytical methods using Bayesian and maximum likelihood statistics were developed to estimate variant copy ratios, which describe the ratio of nucleotides at each identified biologically variable position, as well as the likely set of variant combinations present in 16S rRNA gene copies. Our results indicate that estimated variant copy ratios at biologically variable positions were only reproducible for high throughput sequencing methods. Furthermore, the likely variant combination set was only reproducible with increased sequencing depth and longer read lengths. We also demonstrate novel methods for evaluating variable positions when comparing multi-copy gene sequence data from multiple laboratories generated using multiple sequencing technologies. PMID:27077030

  10. A Meta-analysis of Multiple Myeloma Risk Regions in African and European Ancestry Populations Identifies Putatively Functional Loci.

    PubMed

    Rand, Kristin A; Song, Chi; Dean, Eric; Serie, Daniel J; Curtin, Karen; Sheng, Xin; Hu, Donglei; Huff, Carol Ann; Bernal-Mizrachi, Leon; Tomasson, Michael H; Ailawadhi, Sikander; Singhal, Seema; Pawlish, Karen; Peters, Edward S; Bock, Cathryn H; Stram, Alex; Van Den Berg, David J; Edlund, Christopher K; Conti, David V; Zimmerman, Todd; Hwang, Amie E; Huntsman, Scott; Graff, John; Nooka, Ajay; Kong, Yinfei; Pregja, Silvana L; Berndt, Sonja I; Blot, William J; Carpten, John; Casey, Graham; Chu, Lisa; Diver, W Ryan; Stevens, Victoria L; Lieber, Michael R; Goodman, Phyllis J; Hennis, Anselm J M; Hsing, Ann W; Mehta, Jayesh; Kittles, Rick A; Kolb, Suzanne; Klein, Eric A; Leske, Cristina; Murphy, Adam B; Nemesure, Barbara; Neslund-Dudas, Christine; Strom, Sara S; Vij, Ravi; Rybicki, Benjamin A; Stanford, Janet L; Signorello, Lisa B; Witte, John S; Ambrosone, Christine B; Bhatti, Parveen; John, Esther M; Bernstein, Leslie; Zheng, Wei; Olshan, Andrew F; Hu, Jennifer J; Ziegler, Regina G; Nyante, Sarah J; Bandera, Elisa V; Birmann, Brenda M; Ingles, Sue A; Press, Michael F; Atanackovic, Djordje; Glenn, Martha J; Cannon-Albright, Lisa A; Jones, Brandt; Tricot, Guido; Martin, Thomas G; Kumar, Shaji K; Wolf, Jeffrey L; Deming Halverson, Sandra L; Rothman, Nathaniel; Brooks-Wilson, Angela R; Rajkumar, S Vincent; Kolonel, Laurence N; Chanock, Stephen J; Slager, Susan L; Severson, Richard K; Janakiraman, Nalini; Terebelo, Howard R; Brown, Elizabeth E; De Roos, Anneclaire J; Mohrbacher, Ann F; Colditz, Graham A; Giles, Graham G; Spinelli, John J; Chiu, Brian C; Munshi, Nikhil C; Anderson, Kenneth C; Levy, Joan; Zonder, Jeffrey A; Orlowski, Robert Z; Lonial, Sagar; Camp, Nicola J; Vachon, Celine M; Ziv, Elad; Stram, Daniel O; Hazelett, Dennis J; Haiman, Christopher A; Cozen, Wendy

    2016-12-01

    Genome-wide association studies (GWAS) in European populations have identified genetic risk variants associated with multiple myeloma. We performed association testing of common variation in eight regions in 1,318 patients with multiple myeloma and 1,480 controls of European ancestry and 1,305 patients with multiple myeloma and 7,078 controls of African ancestry and conducted a meta-analysis to localize the signals, with epigenetic annotation used to predict functionality. We found that variants in 7p15.3, 17p11.2, and 22q13.1 were statistically significantly (P < 0.05) associated with multiple myeloma risk in persons of African ancestry and persons of European ancestry, and the variant in 3p22.1 was associated in European ancestry only. In a combined African ancestry-European ancestry meta-analysis, variation in five regions (2p23.3, 3p22.1, 7p15.3, 17p11.2, 22q13.1) was statistically significantly associated with multiple myeloma risk. In 3p22.1, the correlated variants clustered within the gene body of ULK4. Correlated variants in 7p15.3 clustered around an enhancer at the 3' end of the CDCA7L transcription termination site. A missense variant at 17p11.2 (rs34562254, Pro251Leu, OR, 1.32; P = 2.93 × 10⁻⁷) in TNFRSF13B encodes a lymphocyte-specific protein in the TNF receptor family that interacts with the NF-κB pathway. SNPs correlated with the index signal in 22q13.1 cluster around the promoter and enhancer regions of CBX7. CONCLUSIONS: We found that reported multiple myeloma susceptibility regions contain risk variants important across populations, supporting the use of multiple racial/ethnic groups with different underlying genetic architecture to enhance the localization and identification of putatively functional alleles. A subset of reported risk loci for multiple myeloma has consistent effects across populations and is likely to be functional. Cancer Epidemiol Biomarkers Prev; 25(12); 1609-18. ©2016 AACR.
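    Combining per-ancestry association results, as in the meta-analysis above, is often done with a fixed-effect inverse-variance model on the log odds ratio scale. The abstract does not state which meta-analysis model was used, so the sketch below is a generic illustration under that assumption; the per-study estimates and the function name are hypothetical.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def fixed_effect_meta(log_ors, ses):
    """Inverse-variance fixed-effect pooling of per-study log odds ratios.

    Returns (pooled log OR, pooled standard error, two-sided p-value).
    """
    weights = [1.0 / se ** 2 for se in ses]
    pooled = sum(w * b for w, b in zip(weights, log_ors)) / sum(weights)
    pooled_se = sqrt(1.0 / sum(weights))
    p_value = 2.0 * NormalDist().cdf(-abs(pooled / pooled_se))
    return pooled, pooled_se, p_value


# Hypothetical per-ancestry estimates for one variant: (odds ratio, standard error of log OR).
studies = [(1.30, 0.08), (1.35, 0.07)]
pooled, se, p = fixed_effect_meta([log(o) for o, _ in studies],
                                  [s for _, s in studies])
print(exp(pooled), se, p)
```

    A fixed-effect model assumes a common effect size across ancestries; where effects differ between populations, a random-effects or trans-ethnic model would be the more appropriate choice.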

  11. Meta-analysis of quantitative pleiotropic traits for next-generation sequencing with multivariate functional linear models

    PubMed Central

    Chiu, Chi-yang; Jung, Jeesun; Chen, Wei; Weeks, Daniel E; Ren, Haobo; Boehnke, Michael; Amos, Christopher I; Liu, Aiyi; Mills, James L; Ting Lee, Mei-ling; Xiong, Momiao; Fan, Ruzong

    2017-01-01

    To analyze next-generation sequencing data, multivariate functional linear models are developed for a meta-analysis of multiple studies to connect genetic variant data to multiple quantitative traits adjusting for covariates. The goal is to take advantage of both meta-analysis and pleiotropic analysis in order to improve power and to carry out a unified association analysis of multiple studies and multiple traits of complex disorders. Three types of approximate F-distributions based on Pillai–Bartlett trace, Hotelling–Lawley trace, and Wilks's Lambda are introduced to test for association between multiple quantitative traits and multiple genetic variants. Simulation analysis is performed to evaluate false-positive rates and power of the proposed tests. The proposed methods are applied to analyze lipid traits in eight European cohorts. It is shown that it is more advantageous to perform multivariate analysis than univariate analysis in general, and it is more advantageous to perform meta-analysis of multiple studies instead of analyzing the individual studies separately. The proposed models require individual observations. The value of the current paper can be seen for at least two reasons: (a) the proposed methods can be applied to studies that have individual genotype data; (b) the proposed methods can be used as a criterion for future work that uses summary statistics to build test statistics to meta-analyze the data. PMID:28000696

  12. Meta-analysis of quantitative pleiotropic traits for next-generation sequencing with multivariate functional linear models.

    PubMed

    Chiu, Chi-Yang; Jung, Jeesun; Chen, Wei; Weeks, Daniel E; Ren, Haobo; Boehnke, Michael; Amos, Christopher I; Liu, Aiyi; Mills, James L; Ting Lee, Mei-Ling; Xiong, Momiao; Fan, Ruzong

    2017-02-01

    To analyze next-generation sequencing data, multivariate functional linear models are developed for a meta-analysis of multiple studies to connect genetic variant data to multiple quantitative traits adjusting for covariates. The goal is to take advantage of both meta-analysis and pleiotropic analysis in order to improve power and to carry out a unified association analysis of multiple studies and multiple traits of complex disorders. Three types of approximate F-distributions based on Pillai-Bartlett trace, Hotelling-Lawley trace, and Wilks's Lambda are introduced to test for association between multiple quantitative traits and multiple genetic variants. Simulation analysis is performed to evaluate false-positive rates and power of the proposed tests. The proposed methods are applied to analyze lipid traits in eight European cohorts. It is shown that it is more advantageous to perform multivariate analysis than univariate analysis in general, and it is more advantageous to perform meta-analysis of multiple studies instead of analyzing the individual studies separately. The proposed models require individual observations. The value of the current paper can be seen for at least two reasons: (a) the proposed methods can be applied to studies that have individual genotype data; (b) the proposed methods can be used as a criterion for future work that uses summary statistics to build test statistics to meta-analyze the data.

  13. Robust inference for group sequential trials.

    PubMed

    Ganju, Jitendra; Lin, Yunzhi; Zhou, Kefei

    2017-03-01

    For ethical reasons, group sequential trials were introduced to allow trials to stop early in the event of extreme results. Endpoints in such trials are usually mortality or irreversible morbidity. For a given endpoint, the norm is to use a single test statistic and to use that same statistic for each analysis. This approach is risky because the test statistic has to be specified before the study is unblinded, and there is loss in power if the assumptions that ensure optimality for each analysis are not met. To minimize the risk of moderate to substantial loss in power due to a suboptimal choice of a statistic, a robust method was developed for nonsequential trials. The concept is analogous to diversification of financial investments to minimize risk. The method is based on combining P values from multiple test statistics for formal inference while controlling the type I error rate at its designated value. This article evaluates the performance of 2 P value combining methods for group sequential trials. The emphasis is on time to event trials although results from less complex trials are also included. The gain or loss in power with the combination method relative to a single statistic is asymmetric in its favor. Depending on the power of each individual test, the combination method can give more power than any single test or give power that is closer to the test with the most power. The versatility of the method is that it can combine P values from different test statistics for analysis at different times. The robustness of results suggests that inference from group sequential trials can be strengthened with the use of combined tests. Copyright © 2017 John Wiley & Sons, Ltd.

  14. A Phylogenetic Analysis of 34 Chloroplast Genomes Elucidates the Relationships between Wild and Domestic Species within the Genus Citrus

    PubMed Central

    Carbonell-Caballero, Jose; Alonso, Roberto; Ibañez, Victoria; Terol, Javier; Talon, Manuel; Dopazo, Joaquin

    2015-01-01

    Citrus genus includes some of the most important cultivated fruit trees worldwide. Despite being extensively studied because of its commercial relevance, the origin of cultivated citrus species and the history of its domestication still remain an open question. Here, we present a phylogenetic analysis of the chloroplast genomes of 34 citrus genotypes which constitutes the most comprehensive and detailed study to date on the evolution and variability of the genus Citrus. A statistical model was used to estimate divergence times between the major citrus groups. Additionally, a complete map of the variability across the genome of different citrus species was produced, including single nucleotide variants, heteroplasmic positions, indels (insertions and deletions), and large structural variants. The distribution of all these variants provided further independent support to the phylogeny obtained. An unexpected finding was the high level of heteroplasmy found in several of the analyzed genomes. The use of the complete chloroplast DNA not only paves the way for a better understanding of the phylogenetic relationships within the Citrus genus but also provides original insights into other elusive evolutionary processes, such as chloroplast inheritance, heteroplasmy, and gene selection. PMID:25873589

  15. [Construction of haplotype and haplotype block based on tag single nucleotide polymorphisms and their applications in association studies].

    PubMed

    Gu, Ming-liang; Chu, Jia-you

    2007-12-01

    The human genome has haplotype and haplotype block structures, which provide valuable information on human evolutionary history and may lead to the development of more efficient strategies to identify genetic variants that increase susceptibility to complex diseases. The genome can be divided into discrete blocks of limited haplotype diversity. In each block, a small fraction of "tag SNPs" can be used to distinguish a large fraction of the haplotypes. These tag SNPs are potentially useful for the construction of haplotypes and haplotype blocks and for association studies in complex diseases. There are two general classes of methods for constructing haplotypes and haplotype blocks, based on genotypes from large pedigrees and on statistical algorithms, respectively. The authors evaluate several construction methods to assess the power of different association tests with a variety of disease models and block-partitioning criteria. The advantages, limitations and applications of each method, and their use in association studies, are discussed. With the completion of the HapMap and the development of statistical algorithms for haplotype reconstruction, haplotype construction approaches that combine mathematics, physics and computer science will have profound impacts on population genetics, on the localization and cloning of susceptibility genes in complex diseases, and on related areas of the life sciences.

  16. Type 2 Diabetes Susceptibility in the Greek-Cypriot Population: Replication of Associations with TCF7L2, FTO, HHEX, SLC30A8 and IGF2BP2 Polymorphisms

    PubMed Central

    Votsi, Christina; Toufexis, Costas; Michailidou, Kyriaki; Antoniades, Athos; Skordis, Nicos; Karaolis, Minas; Pattichis, Constantinos S.; Christodoulou, Kyproula

    2017-01-01

    Type 2 diabetes (T2D) has been the subject of numerous genetic studies in recent years which revealed associations of the disease with a large number of susceptibility loci. We hereby initiate the evaluation of T2D susceptibility loci in the Greek-Cypriot population by performing a replication case-control study. One thousand and eighteen individuals (528 T2D patients, 490 controls) were genotyped at 21 T2D susceptibility loci, using the allelic discrimination method. Statistically significant associations of T2D with five of the tested single nucleotide polymorphisms (SNPs) (TCF7L2 rs7901695, FTO rs8050136, HHEX rs5015480, SLC30A8 rs13266634 and IGF2BP2 rs4402960) were observed in this study population. Furthermore, 14 of the tested SNPs had odds ratios (ORs) in the same direction as the previously published studies, suggesting that these variants can potentially be used in the Greek-Cypriot population for predictive testing of T2D. In conclusion, our findings expand the genetic assessment of T2D susceptibility loci and reconfirm five of the worldwide established loci in a distinct, relatively small, newly investigated population. PMID:28067832

  17. Association Analysis of FOXO3 Longevity Variants With Blood Pressure and Essential Hypertension

    PubMed Central

    Chen, Randi; Donlon, Timothy A.; Evans, Daniel S.; Tranah, Gregory J.; Parimi, Neeta; Ehret, Georg B.; Newton-Cheh, Christopher; Seto, Todd; Willcox, D. Craig; Masaki, Kamal H.; Kamide, Kei; Ryuno, Hirochika; Oguro, Ryosuke; Nakama, Chikako; Kabayama, Mai; Yamamoto, Koichi; Sugimoto, Ken; Ikebe, Kazunori; Masui, Yukie; Arai, Yasumichi; Ishizaki, Tatsuro; Gondo, Yasuyuki; Rakugi, Hiromi; Willcox, Bradley J.

    2016-01-01

    BACKGROUND The minor alleles of 3 FOXO3 single nucleotide polymorphisms (SNPs)—rs2802292, rs2253310, and rs2802288—are associated with human longevity. The aim of the present study was to test these SNPs for association with blood pressure (BP) and essential hypertension (EHT). METHODS In a primary study involving Americans of Japanese ancestry drawn from the Family Blood Pressure Program II we genotyped 411 female and 432 male subjects aged 40–79 years and tested for statistical association by contingency table analysis and generalized linear models that included logistic regression adjusting for sibling correlation in the data set. Replication of rs2802292 with EHT was attempted in Japanese SONIC study subjects and of each SNP in a meta-analysis of genome-wide association studies of BP in individuals of European ancestry. RESULTS In Americans of Japanese ancestry, women homozygous for the longevity-associated (minor) allele of each FOXO3 SNP had 6 mm Hg lower systolic BP and 3 mm Hg lower diastolic BP compared with major allele homozygotes (Bonferroni corrected P < 0.05 and >0.05, respectively). Frequencies of minor allele homozygotes were 3.3–3.9% in women with EHT compared with 9.5–9.6% in normotensive women (P = 0.03–0.04; haplotype analysis P = 0.0002). No association with BP or EHT was evident in males. An association with EHT was seen for the minor allele of rs2802292 in the Japanese SONIC cohort (P = 0.03), while in European subjects the minor allele of each SNP was associated with higher systolic and diastolic BP. CONCLUSION Longevity-associated FOXO3 variants may be associated with lower BP and EHT in Japanese women. PMID:26476085

  18. Association Analysis of FOXO3 Longevity Variants With Blood Pressure and Essential Hypertension.

    PubMed

    Morris, Brian J; Chen, Randi; Donlon, Timothy A; Evans, Daniel S; Tranah, Gregory J; Parimi, Neeta; Ehret, Georg B; Newton-Cheh, Christopher; Seto, Todd; Willcox, D Craig; Masaki, Kamal H; Kamide, Kei; Ryuno, Hirochika; Oguro, Ryosuke; Nakama, Chikako; Kabayama, Mai; Yamamoto, Koichi; Sugimoto, Ken; Ikebe, Kazunori; Masui, Yukie; Arai, Yasumichi; Ishizaki, Tatsuro; Gondo, Yasuyuki; Rakugi, Hiromi; Willcox, Bradley J

    2016-11-01

    The minor alleles of 3 FOXO3 single nucleotide polymorphisms (SNPs), rs2802292, rs2253310, and rs2802288, are associated with human longevity. The aim of the present study was to test these SNPs for association with blood pressure (BP) and essential hypertension (EHT). In a primary study involving Americans of Japanese ancestry drawn from the Family Blood Pressure Program II we genotyped 411 female and 432 male subjects aged 40-79 years and tested for statistical association by contingency table analysis and generalized linear models that included logistic regression adjusting for sibling correlation in the data set. Replication of rs2802292 with EHT was attempted in Japanese SONIC study subjects and of each SNP in a meta-analysis of genome-wide association studies of BP in individuals of European ancestry. In Americans of Japanese ancestry, women homozygous for the longevity-associated (minor) allele of each FOXO3 SNP had 6 mm Hg lower systolic BP and 3 mm Hg lower diastolic BP compared with major allele homozygotes (Bonferroni corrected P < 0.05 and >0.05, respectively). Frequencies of minor allele homozygotes were 3.3-3.9% in women with EHT compared with 9.5-9.6% in normotensive women (P = 0.03-0.04; haplotype analysis P = 0.0002). No association with BP or EHT was evident in males. An association with EHT was seen for the minor allele of rs2802292 in the Japanese SONIC cohort (P = 0.03), while in European subjects the minor allele of each SNP was associated with higher systolic and diastolic BP. Longevity-associated FOXO3 variants may be associated with lower BP and EHT in Japanese women.

  19. Association of multiple genetic variants with chronic obstructive pulmonary disease susceptibility in Hainan region.

    PubMed

    Ding, Yipeng; Niu, Huan; Zhou, Long; Zhou, Wenjing; Chen, Jiannan; Xie, Shiliang; Geng, Tingting; Ouyang, Yanhong; He, Ping; Sun, Pei; Feng, Tian; Jin, Tianbo

    2017-11-01

    Recent genome-wide association studies have shown associations between variants in loci (4q28.1, 6p21.32, 6p21.1, 6q16.1, 10q22.1 and 10q22.3) and chronic obstructive pulmonary disease (COPD) or smoking behaviors. The objective of this study was to look for associations between 16 single nucleotide polymorphisms (SNPs) at these six loci and COPD susceptibility in the Hainan region. A case-control cohort composed of 200 COPD cases and 401 controls was genotyped and analyzed statistically. Odds ratios (OR) and 95% confidence intervals (CIs) were computed by chi-square (χ²) test and genetic models by unconditional logistic regression. After Hardy-Weinberg equilibrium (HWE) P value screening, we excluded the SNP rs12220777 with P < 0.001. By χ² test, rs9296092, located on 6p21.32, provided the strongest evidence of an increased risk of COPD, with an OR of 3.28 (95% CI = 1.03 - 2.32; P = 0.003) between cases and controls. By genetic model analysis, we not only found that rs9296092 increased COPD risk, but also found that in the over-dominant model the genotype 'C/T' of rs950063 (OR = 0.55; 95% CI = 0.33 - 0.93; P = 0.023) was associated with decreased COPD risk. This study is the first to provide evidence of the importance of rs9296092 and rs950063 for risk of COPD in Hainan Province. Further studies are needed to characterize the functional sequences that cause COPD. © 2015 John Wiley & Sons Ltd.
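
    The odds ratio and 95% confidence interval reported in a case-control abstract like the one above can be illustrated with a short sketch. It assumes a 2x2 carrier-by-status table and the standard log-odds (Woolf) approximation for the interval; the study's own software and exact method are not stated in the abstract. The group sizes (200 cases, 401 controls) come from the abstract, but the carrier counts and the function name odds_ratio_ci are hypothetical and chosen only for illustration.

```python
import math

def odds_ratio_ci(case_carriers, case_noncarriers,
                  control_carriers, control_noncarriers, z=1.96):
    """Odds ratio with a Woolf (log-odds) confidence interval.

    z = 1.96 gives an approximate 95% interval. All four cell counts
    must be positive for the standard error to be defined.
    """
    a, b = case_carriers, case_noncarriers
    c, d = control_carriers, control_noncarriers
    if min(a, b, c, d) <= 0:
        raise ValueError("all cell counts must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Woolf approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Hypothetical carrier counts: 30 of 200 cases, 40 of 401 controls.
or_, lo, hi = odds_ratio_ci(30, 170, 40, 361)
print(f"OR = {or_:.2f} (95% CI = {lo:.2f} - {hi:.2f})")
```

    Because the interval is built symmetrically around log(OR), the point estimate always lies between the lower and upper bounds, which makes this a quick consistency check when reading reported ORs and CIs.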

  20. A common variant in DRD3 receptor is associated with autism spectrum disorder.

    PubMed

    de Krom, Mariken; Staal, Wouter G; Ophoff, Roel A; Hendriks, Judith; Buitelaar, Jan; Franke, Barbara; de Jonge, Maretha V; Bolton, Patrick; Collier, David; Curran, Sarah; van Engeland, Herman; van Ree, Jan M

    2009-04-01

    The presence of specific and common genetic etiologies for autism spectrum disorders (ASD) and attention-deficit/hyperactivity disorder (ADHD) was investigated for 132 candidate genes in a two-stage design-association study. 1,536 single nucleotide polymorphisms (SNPs) covering these candidate genes were tested in ASD (n = 144) and ADHD (n = 110) patients and control subjects (n = 404) from The Netherlands. A second stage was performed with those SNPs from Stage I reaching a significance threshold for association of p < .01 in an independent sample of ASD patients (n = 128) and controls (n = 124) from the United Kingdom and a Dutch ADHD (n = 150) and control (n = 149) sample. No shared association was found between ASD and ADHD. However, in the first and second ASD samples and in a joint statistical analysis, a significant association with SNP rs167771, located in the DRD3 gene, was found (joint analysis uncorrected: p = 3.11 × 10^-6; corrected for multiple testing and potential stratification: p = .00162). The DRD3 gene is related to stereotyped behavior, liability to side effects of antipsychotic medication, and movement disorders and may therefore have important clinical implications for ASD.

  1. Protein Interaction Networks Reveal Novel Autism Risk Genes within GWAS Statistical Noise

    PubMed Central

    Correia, Catarina; Oliveira, Guiomar; Vicente, Astrid M.

    2014-01-01

    Genome-wide association studies (GWAS) for Autism Spectrum Disorder (ASD) thus far met limited success in the identification of common risk variants, consistent with the notion that variants with small individual effects cannot be detected individually in single SNP analysis. To further capture disease risk gene information from ASD association studies, we applied a network-based strategy to the Autism Genome Project (AGP) and the Autism Genetics Resource Exchange GWAS datasets, combining family-based association data with Human Protein-Protein interaction (PPI) data. Our analysis showed that autism-associated proteins at higher than conventional levels of significance (P<0.1) directly interact more than random expectation and are involved in a limited number of interconnected biological processes, indicating that they are functionally related. The functionally coherent networks generated by this approach contain ASD-relevant disease biology, as demonstrated by an improved positive predictive value and sensitivity in retrieving known ASD candidate genes relative to the top associated genes from either GWAS, as well as a higher gene overlap between the two ASD datasets. Analysis of the intersection between the networks obtained from the two ASD GWAS and six unrelated disease datasets identified fourteen genes exclusively present in the ASD networks. These are mostly novel genes involved in abnormal nervous system phenotypes in animal models, and in fundamental biological processes previously implicated in ASD, such as axon guidance, cell adhesion or cytoskeleton organization. Overall, our results highlighted novel susceptibility genes previously hidden within GWAS statistical “noise” that warrant further analysis for causal variants. PMID:25409314

  2. Protein interaction networks reveal novel autism risk genes within GWAS statistical noise.

    PubMed

    Correia, Catarina; Oliveira, Guiomar; Vicente, Astrid M

    2014-01-01

    Genome-wide association studies (GWAS) for Autism Spectrum Disorder (ASD) thus far met limited success in the identification of common risk variants, consistent with the notion that variants with small individual effects cannot be detected individually in single SNP analysis. To further capture disease risk gene information from ASD association studies, we applied a network-based strategy to the Autism Genome Project (AGP) and the Autism Genetics Resource Exchange GWAS datasets, combining family-based association data with Human Protein-Protein interaction (PPI) data. Our analysis showed that autism-associated proteins at higher than conventional levels of significance (P<0.1) directly interact more than random expectation and are involved in a limited number of interconnected biological processes, indicating that they are functionally related. The functionally coherent networks generated by this approach contain ASD-relevant disease biology, as demonstrated by an improved positive predictive value and sensitivity in retrieving known ASD candidate genes relative to the top associated genes from either GWAS, as well as a higher gene overlap between the two ASD datasets. Analysis of the intersection between the networks obtained from the two ASD GWAS and six unrelated disease datasets identified fourteen genes exclusively present in the ASD networks. These are mostly novel genes involved in abnormal nervous system phenotypes in animal models, and in fundamental biological processes previously implicated in ASD, such as axon guidance, cell adhesion or cytoskeleton organization. Overall, our results highlighted novel susceptibility genes previously hidden within GWAS statistical "noise" that warrant further analysis for causal variants.

  3. ColoSeq provides comprehensive lynch and polyposis syndrome mutational analysis using massively parallel sequencing.

    PubMed

    Pritchard, Colin C; Smith, Christina; Salipante, Stephen J; Lee, Ming K; Thornton, Anne M; Nord, Alex S; Gulden, Cassandra; Kupfer, Sonia S; Swisher, Elizabeth M; Bennett, Robin L; Novetsky, Akiva P; Jarvik, Gail P; Olopade, Olufunmilayo I; Goodfellow, Paul J; King, Mary-Claire; Tait, Jonathan F; Walsh, Tom

    2012-07-01

    Lynch syndrome (hereditary nonpolyposis colon cancer) and adenomatous polyposis syndromes frequently have overlapping clinical features. Current approaches for molecular genetic testing are often stepwise, taking a best-candidate gene approach with testing of additional genes if initial results are negative. We report a comprehensive assay called ColoSeq that detects all classes of mutations in Lynch and polyposis syndrome genes using targeted capture and massively parallel next-generation sequencing on the Illumina HiSeq2000 instrument. In blinded specimens and colon cancer cell lines with defined mutations, ColoSeq correctly identified 28/28 (100%) pathogenic mutations in MLH1, MSH2, MSH6, PMS2, EPCAM, APC, and MUTYH, including single nucleotide variants (SNVs), small insertions and deletions, and large copy number variants. There was 100% reproducibility of detection mutation between independent runs. The assay correctly identified 222 of 224 heterozygous SNVs (99.4%) in HapMap samples, demonstrating high sensitivity of calling all variants across each captured gene. Average coverage was greater than 320 reads per base pair when the maximum of 96 index samples with barcodes were pooled. In a specificity study of 19 control patients without cancer from different ethnic backgrounds, we did not find any pathogenic mutations but detected two variants of uncertain significance. ColoSeq offers a powerful, cost-effective means of genetic testing for Lynch and polyposis syndromes that eliminates the need for stepwise testing and multiple follow-up clinical visits. Copyright © 2012 American Society for Investigative Pathology and the Association for Molecular Pathology. Published by Elsevier Inc. All rights reserved.

  4. The (in)famous GWAS P-value threshold revisited and updated for low-frequency variants.

    PubMed

    Fadista, João; Manning, Alisa K; Florez, Jose C; Groop, Leif

    2016-08-01

    Genome-wide association studies (GWAS) have long relied on proposed statistical significance thresholds to be able to differentiate true positives from false positives. Although the genome-wide significance P-value threshold of 5 × 10^-8 has become a standard for common-variant GWAS, it has not been updated to cope with the lower allele frequency spectrum used in many recent array-based GWAS studies and sequencing studies. Using a whole-genome- and -exome-sequencing data set of 2875 individuals of European ancestry from the Genetics of Type 2 Diabetes (GoT2D) project and a whole-exome-sequencing data set of 13 000 individuals from five ancestries from the GoT2D and T2D-GENES (Type 2 Diabetes Genetic Exploration by Next-generation sequencing in multi-Ethnic Samples) projects, we describe guidelines for genome- and exome-wide association P-value thresholds needed to correct for multiple testing, explaining the impact of linkage disequilibrium thresholds for distinguishing independent variants, minor allele frequency and ancestry characteristics. We emphasize the advantage of studying recent genetic isolate populations when performing rare and low-frequency genetic association analyses, as the multiple testing burden is diminished due to higher genetic homogeneity.

  5. Influence of Hydrogen and Number of Particle Variants on Ordinary and Two-Way Shape Memory Effects in Ti-Ni Single Crystals

    NASA Astrophysics Data System (ADS)

    Kireeva, I. V.; Platonova, Yu. N.; Chumlyakov, Yu. I.

    2017-02-01

    The ordinary and two-way shape memory effects (SMEs) are investigated for $[\bar{1}12]$ single crystals of Ti-51.3Ni (at.%) alloy aged at 823 K for 1.5 h in free state and under tensile stress of 150 MPa without hydrogen and after saturation by hydrogen. It is established that without hydrogen in $[\bar{1}12]$ single crystals with one and four variants of Ti3Ni4 particles the maximum magnitude of the ordinary SME is 1.9-2.6% under the external stress σext = 250 MPa. Under σext > 250 MPa, crystals are destroyed. The magnitude of the two-way SME caused by the B2-R-B19' MT equal to 1.1% at σext = 0 is observed in $[\bar{1}12]$ single crystals with one variant of Ti3Ni4 particles. The physical reason for the observed two-way SME is the internal compressive stresses oriented along the $[\bar{1}12]$ directions arising from one variant of Ti3Ni4 particles as a result of aging under tensile stress of 150 MPa. It is established that hydrogen does not influence the TR temperature, reduces the plasticity, and suppresses the two-way SME. The suppression of two-way SME in the $[\bar{1}12]$ single crystals of the Ti-51.3Ni (at.%) alloy with one variant of Ti3Ni4 particles is caused by shielding of stress fields from one variant of Ti3Ni4 particles and multiple nucleation of R- and B19' martensite variants under loading with saturation by hydrogen.

  6. Targeted next-generation sequencing makes new molecular diagnoses and expands genotype-phenotype relationship in Ehlers-Danlos syndrome.

    PubMed

    Weerakkody, Ruwan A; Vandrovcova, Jana; Kanonidou, Christina; Mueller, Michael; Gampawar, Piyush; Ibrahim, Yousef; Norsworthy, Penny; Biggs, Jennifer; Abdullah, Abdulshakur; Ross, David; Black, Holly A; Ferguson, David; Cheshire, Nicholas J; Kazkaz, Hanadi; Grahame, Rodney; Ghali, Neeti; Vandersteen, Anthony; Pope, F Michael; Aitman, Timothy J

    2016-11-01

    Ehlers-Danlos syndrome (EDS) comprises a group of overlapping hereditary disorders of connective tissue with significant morbidity and mortality, including major vascular complications. We sought to identify the diagnostic utility of a next-generation sequencing (NGS) panel in a mixed EDS cohort. We developed and applied PCR-based NGS assays for targeted, unbiased sequencing of 12 collagen and aortopathy genes to a cohort of 177 unrelated EDS patients. Variants were scored blind to previous genetic testing and then compared with results of previous Sanger sequencing. Twenty-eight pathogenic variants in COL5A1/2, COL3A1, FBN1, and COL1A1 and four likely pathogenic variants in COL1A1, TGFBR1/2, and SMAD3 were identified by the NGS assays. These included all previously detected single-nucleotide and other short pathogenic variants in these genes, and seven newly detected pathogenic or likely pathogenic variants leading to clinically significant diagnostic revisions. Twenty-two variants of uncertain significance were identified, seven of which were in aortopathy genes and required clinical follow-up. Unbiased NGS-based sequencing made new molecular diagnoses outside the expected EDS genotype-phenotype relationship and identified previously undetected clinically actionable variants in aortopathy susceptibility genes. These data may be of value in guiding future clinical pathways for genetic diagnosis in EDS. Genet Med 18(11), 1119-1127.

  7. Exome Sequencing in an Admixed Isolated Population Indicates NFXL1 Variants Confer a Risk for Specific Language Impairment

    PubMed Central

    Villanueva, Pía; Nudel, Ron; Hoischen, Alexander; Fernández, María Angélica; Simpson, Nuala H.; Gilissen, Christian; Reader, Rose H.; Jara, Lillian; Echeverry, Maria Magdalena; Francks, Clyde; Baird, Gillian; Conti-Ramsden, Gina; O’Hare, Anne; Bolton, Patrick F.; Hennessy, Elizabeth R.; Palomino, Hernán; Carvajal-Carmona, Luis; Veltman, Joris A.; Cazier, Jean-Baptiste; De Barbieri, Zulema

    2015-01-01

    Children affected by Specific Language Impairment (SLI) fail to acquire age appropriate language skills despite adequate intelligence and opportunity. SLI is highly heritable, but the understanding of underlying genetic mechanisms has proved challenging. In this study, we use molecular genetic techniques to investigate an admixed isolated founder population from the Robinson Crusoe Island (Chile), who are affected by a high incidence of SLI, increasing the power to discover contributory genetic factors. We utilize exome sequencing in selected individuals from this population to identify eight coding variants that are of putative significance. We then apply association analyses across the wider population to highlight a single rare coding variant (rs144169475, Minor Allele Frequency of 4.1% in admixed South American populations) in the NFXL1 gene that confers a nonsynonymous change (N150K) and is significantly associated with language impairment in the Robinson Crusoe population (p = 2.04 × 10^-4, 8 variants tested). Subsequent sequencing of NFXL1 in 117 UK SLI cases identified four individuals with heterozygous variants predicted to be of functional consequence. We conclude that coding variants within NFXL1 confer an increased risk of SLI within a complex genetic model. PMID:25781923

  8. Genome-wide association study of response to cognitive-behavioural therapy in children with anxiety disorders.

    PubMed

    Coleman, Jonathan R I; Lester, Kathryn J; Keers, Robert; Roberts, Susanna; Curtis, Charles; Arendt, Kristian; Bögels, Susan; Cooper, Peter; Creswell, Cathy; Dalgleish, Tim; Hartman, Catharina A; Heiervang, Einar R; Hötzel, Katrin; Hudson, Jennifer L; In-Albon, Tina; Lavallee, Kristen; Lyneham, Heidi J; Marin, Carla E; Meiser-Stedman, Richard; Morris, Talia; Nauta, Maaike H; Rapee, Ronald M; Schneider, Silvia; Schneider, Sophie C; Silverman, Wendy K; Thastum, Mikael; Thirlwall, Kerstin; Waite, Polly; Wergeland, Gro Janne; Breen, Gerome; Eley, Thalia C

    2016-09-01

    Anxiety disorders are common, and cognitive-behavioural therapy (CBT) is a first-line treatment. Candidate gene studies have suggested a genetic basis to treatment response, but findings have been inconsistent. To perform the first genome-wide association study (GWAS) of psychological treatment response in children with anxiety disorders (n = 980). Presence and severity of anxiety was assessed using semi-structured interview at baseline, on completion of treatment (post-treatment), and 3 to 12 months after treatment completion (follow-up). DNA was genotyped using the Illumina Human Core Exome-12v1.0 array. Linear mixed models were used to test associations between genetic variants and response (change in symptom severity) immediately post-treatment and at 6-month follow-up. No variants passed a genome-wide significance threshold (P = 5 × 10^-8) in either analysis. Four variants met criteria for suggestive significance (P < 5 × 10^-6) in association with response post-treatment, and three variants in the 6-month follow-up analysis. This is the first genome-wide therapygenetic study. It suggests no common variants of very high effect underlie response to CBT. Future investigations should maximise power to detect single-variant and polygenic effects by using larger, more homogeneous cohorts. © The Royal College of Psychiatrists 2016.

  9. Genome-wide association study of response to cognitive–behavioural therapy in children with anxiety disorders

    PubMed Central

    Coleman, Jonathan R. I.; Lester, Kathryn J.; Keers, Robert; Roberts, Susanna; Curtis, Charles; Arendt, Kristian; Bögels, Susan; Cooper, Peter; Creswell, Cathy; Dalgleish, Tim; Hartman, Catharina A.; Heiervang, Einar R.; Hötzel, Katrin; Hudson, Jennifer L.; In-Albon, Tina; Lavallee, Kristen; Lyneham, Heidi J.; Marin, Carla E.; Meiser-Stedman, Richard; Morris, Talia; Nauta, Maaike H.; Rapee, Ronald M.; Schneider, Silvia; Schneider, Sophie C.; Silverman, Wendy K.; Thastum, Mikael; Thirlwall, Kerstin; Waite, Polly; Wergeland, Gro Janne; Breen, Gerome; Eley, Thalia C.

    2016-01-01

    Background Anxiety disorders are common, and cognitive–behavioural therapy (CBT) is a first-line treatment. Candidate gene studies have suggested a genetic basis to treatment response, but findings have been inconsistent. Aims To perform the first genome-wide association study (GWAS) of psychological treatment response in children with anxiety disorders (n = 980). Method Presence and severity of anxiety was assessed using semi-structured interview at baseline, on completion of treatment (post-treatment), and 3 to 12 months after treatment completion (follow-up). DNA was genotyped using the Illumina Human Core Exome-12v1.0 array. Linear mixed models were used to test associations between genetic variants and response (change in symptom severity) immediately post-treatment and at 6-month follow-up. Results No variants passed a genome-wide significance threshold (P = 5 × 10^-8) in either analysis. Four variants met criteria for suggestive significance (P < 5 × 10^-6) in association with response post-treatment, and three variants in the 6-month follow-up analysis. Conclusions This is the first genome-wide therapygenetic study. It suggests no common variants of very high effect underlie response to CBT. Future investigations should maximise power to detect single-variant and polygenic effects by using larger, more homogeneous cohorts. PMID:26989097

  10. regSNPs: a strategy for prioritizing regulatory single nucleotide substitutions

    PubMed Central

    Teng, Mingxiang; Ichikawa, Shoji; Padgett, Leah R.; Wang, Yadong; Mort, Matthew; Cooper, David N.; Koller, Daniel L.; Foroud, Tatiana; Edenberg, Howard J.; Econs, Michael J.; Liu, Yunlong

    2012-01-01

    Motivation: One of the fundamental questions in genetics study is to identify functional DNA variants that are responsible to a disease or phenotype of interest. Results from large-scale genetics studies, such as genome-wide association studies (GWAS), and the availability of high-throughput sequencing technologies provide opportunities in identifying causal variants. Despite the technical advances, informatics methodologies need to be developed to prioritize thousands of variants for potential causative effects. Results: We present regSNPs, an informatics strategy that integrates several established bioinformatics tools, for prioritizing regulatory SNPs, i.e. the SNPs in the promoter regions that potentially affect phenotype through changing transcription of downstream genes. Comparing to existing tools, regSNPs has two distinct features. It considers degenerative features of binding motifs by calculating the differences on the binding affinity caused by the candidate variants and integrates potential phenotypic effects of various transcription factors. When tested by using the disease-causing variants documented in the Human Gene Mutation Database, regSNPs showed mixed performance on various diseases. regSNPs predicted three SNPs that can potentially affect bone density in a region detected in an earlier linkage study. Potential effects of one of the variants were validated using luciferase reporter assay. Contact: yunliu@iupui.edu Supplementary information: Supplementary data are available at Bioinformatics online PMID:22611130

  11. Genotyping of 25 leukemia-associated genes in a single work flow by next-generation sequencing technology with low amounts of input template DNA.

    PubMed

    Rinke, Jenny; Schäfer, Vivien; Schmidt, Mathias; Ziermann, Janine; Kohlmann, Alexander; Hochhaus, Andreas; Ernst, Thomas

    2013-08-01

    We sought to establish a convenient, sensitive next-generation sequencing (NGS) method for genotyping the 26 most commonly mutated leukemia-associated genes in a single work flow and to optimize this method for low amounts of input template DNA. We designed 184 PCR amplicons that cover all of the candidate genes. NGS was performed with genomic DNA (gDNA) from a cohort of 10 individuals with chronic myelomonocytic leukemia. The results were compared with NGS data obtained from sequencing of DNA generated by whole-genome amplification (WGA) of 20 ng template gDNA. Differences between gDNA and WGA samples in variant frequencies were determined for 2 different WGA kits. For gDNA samples, 25 of 26 genes were successfully sequenced with a sensitivity of 5%, which was achieved by a median coverage of 492 reads (range, 308-636 reads) per amplicon. We identified 24 distinct mutations in 11 genes. With WGA samples, we reliably detected all mutations above 5% sensitivity with a median coverage of 506 reads (range, 256-653 reads) per amplicon. With all variants included in the analysis, WGA amplification by the 2 kits tested yielded differences in variant frequencies that ranged from -28.19% to +9.94% [mean (SD) difference, -0.2% (4.08%)] and from -35.03% to +18.67% [mean difference, -0.75% (5.12%)]. Our method permits simultaneous analysis of a wide range of leukemia-associated target genes in a single sequencing run. NGS can be performed after WGA of template DNA for reliable detection of variants without introducing appreciable bias.

  12. Exploring genetic variants predisposing to diabetes mellitus and their association with indicators of socioeconomic status.

    PubMed

    Schmidt, Börge; Dragano, Nico; Scherag, André; Pechlivanis, Sonali; Hoffmann, Per; Nöthen, Markus M; Erbel, Raimund; Jöckel, Karl-Heinz; Moebus, Susanne

    2014-06-16

    The relevance of disease-related genetic variants for the explanation of social inequalities in complex diseases is unclear and empirical analyses are largely missing. The aim of our study was to examine whether genetic variants predisposing to diabetes mellitus are associated with socioeconomic status in a population-based cohort. We genotyped 11 selected diabetes-related single nucleotide polymorphisms in 4655 participants (age 45-75 years) of the Heinz Nixdorf Recall study. Diabetes status was self-reported or defined by blood glucose levels. Education, income and paternal occupation were assessed as indicators of socioeconomic status. Multiple regression analyses were used to examine the association of socioeconomic status and diabetes by estimating sex-specific and age-adjusted prevalence ratios and their corresponding 95%-confidence intervals. To explore the relationship between individual single nucleotide polymorphisms and socioeconomic status sex- and age-adjusted odds ratios were computed. We adjusted the alpha-level for multiple testing of 11 single nucleotide polymorphisms using Bonferroni's method (α(BF) ~ 0.005). In addition, we explored the association of a genetic risk score with socioeconomic status. Social inequalities in diabetes were observed for all indicators of socioeconomic status. However, there were no significant associations between individual diabetes-related risk alleles and socioeconomic status with odds ratios ranging from 0.87 to 1.23. Similarly, the genetic risk score analysis revealed no evidence for an association. Our data provide no evidence for an association between 11 diabetes-related risk alleles and different indicators of socioeconomic status in a population-based cohort, suggesting that the explored genetic variants do not contribute to health inequalities in diabetes.
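
    The Bonferroni adjustment mentioned in the abstract above can be worked through in a few lines. This is a minimal sketch that assumes a family-wise significance level of 0.05, which the abstract does not state explicitly but which is consistent with its reported α(BF) ~ 0.005 for 11 tests. The function name bonferroni_alpha and the example p-values are hypothetical.

```python
def bonferroni_alpha(family_alpha, n_tests):
    """Per-test significance level under Bonferroni's method."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return family_alpha / n_tests

# Assumed family-wise level of 0.05 spread over 11 SNPs.
alpha_bf = bonferroni_alpha(0.05, 11)
print(round(alpha_bf, 4))  # 0.0045

# Hypothetical per-SNP p-values, for illustration only.
p_values = {"snp_a": 0.030, "snp_b": 0.004, "snp_c": 0.200}
significant = [snp for snp, p in p_values.items() if p < alpha_bf]
print(significant)  # ['snp_b']
```

    Note that a p-value of 0.03 would pass an unadjusted 0.05 threshold but not the adjusted one, which is the point of correcting for the 11 tests.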

  13. ClinGen Pathogenicity Calculator: a configurable system for assessing pathogenicity of genetic variants.

    PubMed

    Patel, Ronak Y; Shah, Neethu; Jackson, Andrew R; Ghosh, Rajarshi; Pawliczek, Piotr; Paithankar, Sameer; Baker, Aaron; Riehle, Kevin; Chen, Hailin; Milosavljevic, Sofia; Bizon, Chris; Rynearson, Shawn; Nelson, Tristan; Jarvik, Gail P; Rehm, Heidi L; Harrison, Steven M; Azzariti, Danielle; Powell, Bradford; Babb, Larry; Plon, Sharon E; Milosavljevic, Aleksandar

    2017-01-12

    The success of the clinical use of sequencing based tests (from single gene to genomes) depends on the accuracy and consistency of variant interpretation. Aiming to improve the interpretation process through practice guidelines, the American College of Medical Genetics and Genomics (ACMG) and the Association for Molecular Pathology (AMP) have published standards and guidelines for the interpretation of sequence variants. However, manual application of the guidelines is tedious and prone to human error. Web-based tools and software systems may not only address this problem but also document reasoning and supporting evidence, thus enabling transparency of evidence-based reasoning and resolution of discordant interpretations. In this report, we describe the design, implementation, and initial testing of the Clinical Genome Resource (ClinGen) Pathogenicity Calculator, a configurable system and web service for the assessment of pathogenicity of Mendelian germline sequence variants. The system allows users to enter the applicable ACMG/AMP-style evidence tags for a specific allele with links to supporting data for each tag and generate guideline-based pathogenicity assessment for the allele. Through automation and comprehensive documentation of evidence codes, the system facilitates more accurate application of the ACMG/AMP guidelines, improves standardization in variant classification, and facilitates collaborative resolution of discordances. The rules of reasoning are configurable with gene-specific or disease-specific guideline variations (e.g. cardiomyopathy-specific frequency thresholds and functional assays). The software is modular, equipped with robust application program interfaces (APIs), and available under a free open source license and as a cloud-hosted web service, thus facilitating both stand-alone use and integration with existing variant curation and interpretation systems. 
The Pathogenicity Calculator is accessible at http://calculator.clinicalgenome.org. By enabling evidence-based reasoning about the pathogenicity of genetic variants and by documenting supporting evidence, the Calculator contributes toward the creation of a knowledge commons and more accurate interpretation of sequence variants in research and clinical care.
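    The step of turning evidence tags into a guideline-based assessment can be illustrated with the default combining rules of the 2015 ACMG/AMP guidelines. The sketch below is not the Calculator's code or API: the function `classify` and its count-based interface are hypothetical, and the gene-specific or disease-specific rule variations mentioned in the abstract are not modeled.

```python
def classify(pvs=0, ps=0, pm=0, pp=0, ba=0, bs=0, bp=0):
    """Combine counts of ACMG/AMP evidence tags into a five-tier class.

    pvs/ps/pm/pp: very strong, strong, moderate, supporting pathogenic tags.
    ba/bs/bp: stand-alone, strong, supporting benign tags.
    """
    pathogenic = (
        (pvs >= 1 and (ps >= 1 or pm >= 2 or (pm == 1 and pp == 1) or pp >= 2))
        or ps >= 2
        or (ps == 1 and (pm >= 3 or (pm == 2 and pp >= 2) or (pm == 1 and pp >= 4)))
    )
    likely_pathogenic = (
        (pvs == 1 and pm == 1)
        or (ps == 1 and 1 <= pm <= 2)
        or (ps == 1 and pp >= 2)
        or pm >= 3
        or (pm == 2 and pp >= 2)
        or (pm == 1 and pp >= 4)
    )
    benign = ba >= 1 or bs >= 2
    likely_benign = (bs == 1 and bp == 1) or bp >= 2

    pathogenic_side = pathogenic or likely_pathogenic
    benign_side = benign or likely_benign
    # Contradictory evidence, or no rule met, yields uncertain significance.
    if pathogenic_side and benign_side:
        return "Uncertain significance"
    if pathogenic:
        return "Pathogenic"
    if likely_pathogenic:
        return "Likely pathogenic"
    if benign:
        return "Benign"
    if likely_benign:
        return "Likely benign"
    return "Uncertain significance"


print(classify(pvs=1, ps=1))  # Pathogenic
print(classify(pm=3))         # Likely pathogenic
print(classify(ps=2, bs=2))   # Uncertain significance (conflicting evidence)
```

    Keeping the rules as plain boolean expressions makes it easy to see where a configurable system would substitute a different rule set.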

  14. Asymmetric single-strand polymorphism: an accurate and cost-effective method to amplify and sequence allelic variants

    USDA-ARS's Scientific Manuscript database

    We needed to obtain an alternative to conventional cloning to generate high-quality DNA sequences from a variety of nuclear orthologs for phylogenetic studies in potato, to save time and money and to avoid problems typically encountered in cloning. We tested a variety of SSCP protocols to include pu...

  15. CYP2D6 gene variants: association with breast cancer specific survival in a cohort of breast cancer patients from the United Kingdom treated with adjuvant tamoxifen

    PubMed Central

    2010-01-01

    Introduction: Tamoxifen is one of the most effective adjuvant breast cancer therapies available. Its metabolism involves the phase I enzyme, cytochrome P4502D6 (CYP2D6), encoded by the highly polymorphic CYP2D6 gene. CYP2D6 variants resulting in poor metabolism of tamoxifen are hypothesised to reduce its efficacy. An FDA-approved pre-treatment CYP2D6 gene testing assay is available. However, evidence from published studies evaluating CYP2D6 variants as predictive factors of tamoxifen efficacy and clinical outcome is conflicting, querying the clinical utility of CYP2D6 testing. We investigated the association of CYP2D6 variants with breast cancer specific survival (BCSS) in breast cancer patients receiving tamoxifen. Methods: This was a population based case-cohort study. We genotyped known functional variants (n = 7; minor allele frequency (MAF) > 0.01) and single nucleotide polymorphisms (SNPs) (n = 5; MAF > 0.05) tagging all known common variants (tagSNPs), in CYP2D6 in 6640 DNA samples from patients with invasive breast cancer from SEARCH (Studies of Epidemiology and Risk factors in Cancer Heredity); 3155 cases had received tamoxifen therapy. There were 312 deaths from breast cancer in the tamoxifen-treated patients, with over 18000 years of cumulative follow-up. The association between genotype and BCSS was evaluated using Cox proportional hazards regression analysis. Results: In tamoxifen-treated patients, there was weak evidence that the poor-metaboliser variant, CYP2D6*6 (MAF = 0.01), was associated with decreased BCSS (P = 0.02; HR = 1.95; 95% CI = 1.12-3.40). No other variants, including CYP2D6*4 (MAF = 0.20), previously reported to be associated with poorer clinical outcomes, were associated with differences in BCSS, in either the tamoxifen or non-tamoxifen groups. Conclusions: CYP2D6*6 may affect BCSS in tamoxifen-treated patients. 
However, the absence of an association with survival in more frequent variants, including CYP2D6*4, questions the validity of the reported association between CYP2D6 genotype and treatment response in breast cancer. Until larger, prospective studies confirming any associations are available, routine CYP2D6 genetic testing should not be used in the clinical setting. PMID:20731819
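    The hazard ratio, confidence interval and p value reported for CYP2D6*6 can be checked against each other with a short calculation. The sketch below assumes a Wald-type interval that is symmetric on the log scale, which is the usual form for Cox regression output but is not stated in the abstract; the function name is illustrative.

```python
import math


def z_and_p_from_ratio_ci(ratio, lower, upper, z_crit=1.959964):
    """Recover the Wald z statistic and two-sided p value from a ratio and its 95% CI."""
    # On the log scale the CI is estimate +/- z_crit * SE, so its width gives SE.
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(ratio) / se
    # Two-sided normal tail probability.
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


z, p = z_and_p_from_ratio_ci(1.95, 1.12, 3.40)
print(f"z = {z:.2f}, p = {p:.3f}")
```

    The recovered p value is close to the reported P = 0.02, so the three published numbers are mutually consistent under this assumption.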

  16. The t-CWT: a new ERP detection and quantification method based on the continuous wavelet transform and Student's t-statistics.

    PubMed

    Bostanov, Vladimir; Kotchoubey, Boris

    2006-12-01

    This study was aimed at developing a method for extraction and assessment of event-related brain potentials (ERP) from single-trials. This method should be applicable in the assessment of single persons' ERPs and should be able to handle both single ERP components and whole waveforms. We adopted a recently developed ERP feature extraction method, the t-CWT, for the purposes of hypothesis testing in the statistical assessment of ERPs. The t-CWT is based on the continuous wavelet transform (CWT) and Student's t-statistics. The method was tested in two ERP paradigms, oddball and semantic priming, by assessing individual-participant data on a single-trial basis, and testing the significance of selected ERP components, P300 and N400, as well as of whole ERP waveforms. The t-CWT was also compared to other univariate and multivariate ERP assessment methods: peak picking, area computation, discrete wavelet transform (DWT) and principal component analysis (PCA). The t-CWT produced better results than all of the other assessment methods it was compared with. The t-CWT can be used as a reliable and powerful method for ERP-component detection and testing of statistical hypotheses concerning both single ERP components and whole waveforms extracted from either single persons' or group data. The t-CWT is the first such method based explicitly on the criteria of maximal statistical difference between two average ERPs in the time-frequency domain and is particularly suitable for ERP assessment of individual data (e.g. in clinical settings), but also for the investigation of small and/or novel ERP effects from group data.

  17. ERASE-Seq: Leveraging replicate measurements to enhance ultralow frequency variant detection in NGS data

    PubMed Central

    Kamps-Hughes, Nick; McUsic, Andrew; Kurihara, Laurie; Harkins, Timothy T.; Pal, Prithwish; Ray, Claire

    2018-01-01

    The accurate detection of ultralow allele frequency variants in DNA samples is of interest in both research and medical settings, particularly in liquid biopsies where cancer mutational status is monitored from circulating DNA. Next-generation sequencing (NGS) technologies employing molecular barcoding have shown promise but significant sensitivity and specificity improvements are still needed to detect mutations in a majority of patients before the metastatic stage. To address this we present analytical validation data for ERASE-Seq (Elimination of Recurrent Artifacts and Stochastic Errors), a method for accurate and sensitive detection of ultralow frequency DNA variants in NGS data. ERASE-Seq differs from previous methods by creating a robust statistical framework to utilize technical replicates in conjunction with background error modeling, providing a 10 to 100-fold reduction in false positive rates compared to published molecular barcoding methods. ERASE-Seq was tested using spiked human DNA mixtures with clinically realistic DNA input quantities to detect SNVs and indels between 0.05% and 1% allele frequency, the range commonly found in liquid biopsy samples. Variants were detected with greater than 90% sensitivity and a false positive rate below 0.1 calls per 10,000 possible variants. The approach represents a significant performance improvement compared to molecular barcoding methods and does not require changing molecular reagents. PMID:29630678

  18. Variation in PTCHD2, CRISP3, NAP1L4, FSCB, and AP3B2 associated with spherical equivalent.

    PubMed

    Chen, Fei; Duggal, Priya; Klein, Barbara E K; Lee, Kristine E; Truitt, Barbara; Klein, Ronald; Iyengar, Sudha K; Klein, Alison P

    2016-01-01

    Ocular refraction is measured in spherical equivalent as the power of the external lens required to focus images on the retina. Myopia (nearsightedness) and hyperopia (farsightedness) are the most common refractive errors, and the leading causes of visual impairment and blindness in the world. The goal of this study is to identify rare and low-frequency variants that influence spherical equivalent. We conducted variant-level and gene-level quantitative trait association analyses for mean spherical equivalent, using data from 1,560 individuals in the Beaver Dam Eye Study. Genotyping was conducted using the Illumina exome array. We analyzed 34,976 single nucleotide variants and 11,571 autosomal genes across the genome, using single-variant tests as well as gene-based tests. Spherical equivalent was significantly associated with five genes in gene-based analysis: PTCHD2 at 1p36.22 (p = 3.6 × 10(-7)), CRISP3 at 6p12.3 (p = 4.3 × 10(-6)), NAP1L4 at 11p15.5 (p = 3.6 × 10(-6)), FSCB at 14q21.2 (p = 1.5 × 10(-7)), and AP3B2 at 15q25.2 (p = 1.6 × 10(-7)). The variant-based tests identified evidence suggestive of association with two novel variants in linkage disequilibrium (pairwise r(2) = 0.80) in the TCTE1 gene region at 6p21.1 (rs2297336, minor allele frequency (MAF) = 14.1%, β = -0.62, p = 3.7 × 10(-6); rs324146, MAF = 16.9%, β = -0.55, p = 1.4 × 10(-5)). In addition to these novel findings, we successfully replicated a previously reported association with rs634990 near GJD2 at 15q14 (MAF = 47%, β = -0.29, p = 1.8 × 10(-3)). We also found evidence of association with spherical equivalent on 2q37.1 in PRSS56 at rs1550094 (MAF = 31%, β = -0.33, p = 1.7 × 10(-3)), a region previously associated with myopia. We identified several novel candidate genes that may play a role in the control of spherical equivalent. However, further studies are needed to replicate these findings. 
In addition, our results contribute to the increasing evidence that variation in the GJD2 and PRSS56 genes influences the development of refractive errors. Identifying that variation in these genes is associated with spherical equivalent may provide further insight into the etiology of myopia and consequent vision loss.

  19. Multidimensional structure-function relationships in human β-cardiac myosin from population-scale genetic variation

    PubMed Central

    Homburger, Julian R.; Green, Eric M.; Caleshu, Colleen; Sunitha, Margaret S.; Taylor, Rebecca E.; Ruppel, Kathleen M.; Metpally, Raghu Prasad Rao; Colan, Steven D.; Michels, Michelle; Day, Sharlene M.; Olivotto, Iacopo; Bustamante, Carlos D.; Dewey, Frederick E.; Ho, Carolyn Y.; Spudich, James A.; Ashley, Euan A.

    2016-01-01

    Myosin motors are the fundamental force-generating elements of muscle contraction. Variation in the human β-cardiac myosin heavy chain gene (MYH7) can lead to hypertrophic cardiomyopathy (HCM), a heritable disease characterized by cardiac hypertrophy, heart failure, and sudden cardiac death. How specific myosin variants alter motor function or clinical expression of disease remains incompletely understood. Here, we combine structural models of myosin from multiple stages of its chemomechanical cycle, exome sequencing data from two population cohorts of 60,706 and 42,930 individuals, and genetic and phenotypic data from 2,913 patients with HCM to identify regions of disease enrichment within β-cardiac myosin. We first developed computational models of the human β-cardiac myosin protein before and after the myosin power stroke. Then, using a spatial scan statistic modified to analyze genetic variation in protein 3D space, we found significant enrichment of disease-associated variants in the converter, a kinetic domain that transduces force from the catalytic domain to the lever arm to accomplish the power stroke. Focusing our analysis on surface-exposed residues, we identified a larger region significantly enriched for disease-associated variants that contains both the converter domain and residues on a single flat surface on the myosin head described as the myosin mesa. Notably, patients with HCM with variants in the enriched regions have earlier disease onset than patients who have HCM with variants elsewhere. Our study provides a model for integrating protein structure, large-scale genetic sequencing, and detailed phenotypic data to reveal insight into time-shifted protein structures and genetic disease. PMID:27247418

  20. A Bioinformatics Workflow for Variant Peptide Detection in Shotgun Proteomics

    PubMed Central

    Li, Jing; Su, Zengliu; Ma, Ze-Qiang; Slebos, Robbert J. C.; Halvey, Patrick; Tabb, David L.; Liebler, Daniel C.; Pao, William; Zhang, Bing

    2011-01-01

    Shotgun proteomics data analysis usually relies on database search. However, commonly used protein sequence databases do not contain information on protein variants and thus prevent variant peptides and proteins from being identified. Including known coding variations in protein sequence databases could help alleviate this problem. Based on our recently published human Cancer Proteome Variation Database, we have created a protein sequence database that comprehensively annotates thousands of cancer-related coding variants collected in the Cancer Proteome Variation Database as well as noncancer-specific ones from the Single Nucleotide Polymorphism Database (dbSNP). Using this database, we then developed a data analysis workflow for variant peptide identification in shotgun proteomics. The high risk of false positive variant identifications was addressed by a modified false discovery rate estimation method. Analysis of colorectal cancer cell lines SW480, RKO, and HCT-116 revealed a total of 81 peptides that contain either noncancer-specific or cancer-related variations. Twenty-three out of 26 variants randomly selected from the 81 were confirmed by genomic sequencing. We further applied the workflow to data sets from three individual colorectal tumor specimens. A total of 204 distinct variant peptides were detected, and five carried known cancer-related mutations. Each individual showed a specific pattern of cancer-related mutations, suggesting potential use of this type of information for personalized medicine. Compatibility of the workflow has been tested with four popular database search engines including Sequest, Mascot, X!Tandem, and MyriMatch. In summary, we have developed a workflow that effectively uses existing genomic data to enable variant peptide detection in proteomics. PMID:21389108

  1. Fine-mapping the human leukocyte antigen locus in rheumatoid arthritis and other rheumatic diseases: identifying causal amino acid variants?

    PubMed

    van Heemst, Jurgen; Huizinga, Tom J W; van der Woude, Diane; Toes, René E M

    2015-05-01

    To provide an update on and the context of the recent findings obtained with novel statistical methods on the association of the human leukocyte antigen (HLA) locus with rheumatic diseases. Novel single nucleotide polymorphism fine-mapping data obtained for the HLA locus have indicated the strongest association with amino acid positions 11 and 13 of the HLA-DRB1 molecule for several rheumatic diseases. On the basis of these data, a dominant role for position 11/13 in driving the association with these diseases is proposed and the identification of causal variants in the HLA region in relation to disease susceptibility implicated. The HLA class II locus is the most important risk factor for several rheumatic diseases. Recently, new statistical approaches have identified previously unrecognized amino acid positions in the HLA-DR molecule that associate with anticitrullinated protein antibody-negative and anticitrullinated protein antibody-positive rheumatoid arthritis. Likewise, similar findings have been made for other rheumatic conditions such as giant-cell arteritis and systemic lupus erythematosus. Interestingly, all these studies point toward an association with the same amino acid positions: amino acid positions 11 and 13 of the HLA-DR β chain. As both these positions influence peptide binding by HLA-DR and have been implicated in antigen presentation, the novel fine-mapping approach is proposed to map causal variants in the HLA region relevant to rheumatoid arthritis and several rheumatic diseases. If these interpretations are correct, they would direct the biological research aiming to address the explanation for the HLA-disease association. Here, we provide an overview of the recent findings and evidence from the literature that, although relevant new insights have been obtained on HLA-disease associations, the interpretation of the biological role of these amino acids as causal variants explaining such associations should be taken with caution.

  2. BRCA2 Polymorphic Stop Codon K3326X and the Risk of Breast, Prostate, and Ovarian Cancers.

    PubMed

    Meeks, Huong D; Song, Honglin; Michailidou, Kyriaki; Bolla, Manjeet K; Dennis, Joe; Wang, Qin; Barrowdale, Daniel; Frost, Debra; McGuffog, Lesley; Ellis, Steve; Feng, Bingjian; Buys, Saundra S; Hopper, John L; Southey, Melissa C; Tesoriero, Andrea; James, Paul A; Bruinsma, Fiona; Campbell, Ian G; Broeks, Annegien; Schmidt, Marjanka K; Hogervorst, Frans B L; Beckman, Matthias W; Fasching, Peter A; Fletcher, Olivia; Johnson, Nichola; Sawyer, Elinor J; Riboli, Elio; Banerjee, Susana; Menon, Usha; Tomlinson, Ian; Burwinkel, Barbara; Hamann, Ute; Marme, Frederik; Rudolph, Anja; Janavicius, Ramunas; Tihomirova, Laima; Tung, Nadine; Garber, Judy; Cramer, Daniel; Terry, Kathryn L; Poole, Elizabeth M; Tworoger, Shelley S; Dorfling, Cecilia M; van Rensburg, Elizabeth J; Godwin, Andrew K; Guénel, Pascal; Truong, Thérèse; Stoppa-Lyonnet, Dominique; Damiola, Francesca; Mazoyer, Sylvie; Sinilnikova, Olga M; Isaacs, Claudine; Maugard, Christine; Bojesen, Stig E; Flyger, Henrik; Gerdes, Anne-Marie; Hansen, Thomas V O; Jensen, Allen; Kjaer, Susanne K; Hogdall, Claus; Hogdall, Estrid; Pedersen, Inge Sokilde; Thomassen, Mads; Benitez, Javier; González-Neira, Anna; Osorio, Ana; Hoya, Miguel de la; Segura, Pedro Perez; Diez, Orland; Lazaro, Conxi; Brunet, Joan; Anton-Culver, Hoda; Eunjung, Lee; John, Esther M; Neuhausen, Susan L; Ding, Yuan Chun; Castillo, Danielle; Weitzel, Jeffrey N; Ganz, Patricia A; Nussbaum, Robert L; Chan, Salina B; Karlan, Beth Y; Lester, Jenny; Wu, Anna; Gayther, Simon; Ramus, Susan J; Sieh, Weiva; Whittermore, Alice S; Monteiro, Alvaro N A; Phelan, Catherine M; Terry, Mary Beth; Piedmonte, Marion; Offit, Kenneth; Robson, Mark; Levine, Douglas; Moysich, Kirsten B; Cannioto, Rikki; Olson, Sara H; Daly, Mary B; Nathanson, Katherine L; Domchek, Susan M; Lu, Karen H; Liang, Dong; Hildebrant, Michelle A T; Ness, Roberta; Modugno, Francesmary; Pearce, Leigh; Goodman, Marc T; Thompson, Pamela J; Brenner, Hermann; Butterbach, Katja; Meindl, Alfons; Hahnen, Eric; 
Wappenschmidt, Barbara; Brauch, Hiltrud; Brüning, Thomas; Blomqvist, Carl; Khan, Sofia; Nevanlinna, Heli; Pelttari, Liisa M; Aittomäki, Kristiina; Butzow, Ralf; Bogdanova, Natalia V; Dörk, Thilo; Lindblom, Annika; Margolin, Sara; Rantala, Johanna; Kosma, Veli-Matti; Mannermaa, Arto; Lambrechts, Diether; Neven, Patrick; Claes, Kathleen B M; Maerken, Tom Van; Chang-Claude, Jenny; Flesch-Janys, Dieter; Heitz, Florian; Varon-Mateeva, Raymonda; Peterlongo, Paolo; Radice, Paolo; Viel, Alessandra; Barile, Monica; Peissel, Bernard; Manoukian, Siranoush; Montagna, Marco; Oliani, Cristina; Peixoto, Ana; Teixeira, Manuel R; Collavoli, Anita; Hallberg, Emily; Olson, Janet E; Goode, Ellen L; Hart, Steven N; Shimelis, Hermela; Cunningham, Julie M; Giles, Graham G; Milne, Roger L; Healey, Sue; Tucker, Kathy; Haiman, Christopher A; Henderson, Brian E; Goldberg, Mark S; Tischkowitz, Marc; Simard, Jacques; Soucy, Penny; Eccles, Diana M; Le, Nhu; Borresen-Dale, Anne-Lise; Kristensen, Vessela; Salvesen, Helga B; Bjorge, Line; Bandera, Elisa V; Risch, Harvey; Zheng, Wei; Beeghly-Fadiel, Alicia; Cai, Hui; Pylkäs, Katri; Tollenaar, Robert A E M; Ouweland, Ans M W van der; Andrulis, Irene L; Knight, Julia A; Narod, Steven; Devilee, Peter; Winqvist, Robert; Figueroa, Jonine; Greene, Mark H; Mai, Phuong L; Loud, Jennifer T; García-Closas, Montserrat; Schoemaker, Minouk J; Czene, Kamila; Darabi, Hatef; McNeish, Iain; Siddiquil, Nadeem; Glasspool, Rosalind; Kwong, Ava; Park, Sue K; Teo, Soo Hwang; Yoon, Sook-Yee; Matsuo, Keitaro; Hosono, Satoyo; Woo, Yin Ling; Gao, Yu-Tang; Foretova, Lenka; Singer, Christian F; Rappaport-Feurhauser, Christine; Friedman, Eitan; Laitman, Yael; Rennert, Gad; Imyanitov, Evgeny N; Hulick, Peter J; Olopade, Olufunmilayo I; Senter, Leigha; Olah, Edith; Doherty, Jennifer A; Schildkraut, Joellen; Koppert, Linetta B; Kiemeney, Lambertus A; Massuger, Leon F A G; Cook, Linda S; Pejovic, Tanja; Li, Jingmei; Borg, Ake; Öfverholm, Anna; Rossing, Mary Anne; Wentzensen, 
Nicolas; Henriksson, Karin; Cox, Angela; Cross, Simon S; Pasini, Barbara J; Shah, Mitul; Kabisch, Maria; Torres, Diana; Jakubowska, Anna; Lubinski, Jan; Gronwald, Jacek; Agnarsson, Bjarni A; Kupryjanczyk, Jolanta; Moes-Sosnowska, Joanna; Fostira, Florentia; Konstantopoulou, Irene; Slager, Susan; Jones, Michael; Antoniou, Antonis C; Berchuck, Andrew; Swerdlow, Anthony; Chenevix-Trench, Georgia; Dunning, Alison M; Pharoah, Paul D P; Hall, Per; Easton, Douglas F; Couch, Fergus J; Spurdle, Amanda B; Goldgar, David E

    2016-02-01

    The K3326X variant in BRCA2 (BRCA2*c.9976A>T; p.Lys3326*; rs11571833) has been found to be associated with small increased risks of breast cancer. However, it is not clear to what extent linkage disequilibrium with fully pathogenic mutations might account for this association. There is scant information about the effect of K3326X in other hormone-related cancers. Using weighted logistic regression, we analyzed data from the large iCOGS study including 76 637 cancer case patients and 83 796 control patients to estimate odds ratios (ORw) and 95% confidence intervals (CIs) for K3326X variant carriers in relation to breast, ovarian, and prostate cancer risks, with weights defined as probability of not having a pathogenic BRCA2 variant. Using Cox proportional hazards modeling, we also examined the associations of K3326X with breast and ovarian cancer risks among 7183 BRCA1 variant carriers. All statistical tests were two-sided. The K3326X variant was associated with breast (ORw = 1.28, 95% CI = 1.17 to 1.40, P = 5.9x10(-6)) and invasive ovarian cancer (ORw = 1.26, 95% CI = 1.10 to 1.43, P = 3.8x10(-3)). These associations were stronger for serous ovarian cancer and for estrogen receptor-negative breast cancer (ORw = 1.46, 95% CI = 1.2 to 1.70, P = 3.4x10(-5) and ORw = 1.50, 95% CI = 1.28 to 1.76, P = 4.1x10(-5), respectively). For BRCA1 mutation carriers, there was a statistically significant inverse association of the K3326X variant with risk of ovarian cancer (HR = 0.43, 95% CI = 0.22 to 0.84, P = .013) but no association with breast cancer. No association with prostate cancer was observed. Our study provides evidence that the K3326X variant is associated with risk of developing breast and ovarian cancers independent of other pathogenic variants in BRCA2. Further studies are needed to determine the biological mechanism of action responsible for these associations. © The Author 2015. Published by Oxford University Press. All rights reserved. 

  3. Abelson Helper Integration Site-1 Gene Variants on Major Depressive Disorder and Bipolar Disorder

    PubMed Central

    Porcelli, Stefano; Han, Changsu; Lee, Soo-Jung; Patkar, Ashwin A.; Masand, Prakash S.; Balzarro, Beatrice; Alberti, Siegfried; De Ronchi, Diana; Serretti, Alessandro

    2014-01-01

    Objective: The present study aimed to explore whether 4 single nucleotide polymorphisms (SNPs) within the AHI1 gene could be associated with major depressive disorder (MD) and bipolar disorder (BD), and whether they could predict clinical outcomes in mood disorders. Methods: One hundred and eighty-four (184) patients with MD, 170 patients with BD and 170 healthy controls were genotyped for 4 AHI1 SNPs (rs11154801, rs7750586, rs9647635 and rs9321501). Baseline and final clinical measures for MD patients were assessed through the Hamilton Rating Scale for Depression (HAM-D). Allelic and genotypic frequencies were compared among the MD, BD and healthy groups using χ2 statistics. Repeated measures ANOVA was used to test possible influences of SNPs on treatment efficacy. Results: The rs9647635 A/A genotype was more represented in subjects with BD than in MD and healthy subjects together. The rs9647635 A/A genotype was also more represented in patients with MD than in healthy subjects. In the allelic analysis, the rs9647635 A allele was more represented in subjects with BD than in healthy subjects, whereas no such difference was observed between patients with MD and healthy subjects. Conclusion: Our findings provide potential evidence of an association between some variants of AHI1 and susceptibility to mood disorders, but not with clinical outcomes. However, given the clear limitations of this study, more adequately powered and advanced association studies are needed before any conclusion can be drawn. PMID:25395981

  4. Type I error rates of rare single nucleotide variants are inflated in tests of association with non-normally distributed traits using simple linear regression methods.

    PubMed

    Schwantes-An, Tae-Hwi; Sung, Heejong; Sabourin, Jeremy A; Justice, Cristina M; Sorant, Alexa J M; Wilson, Alexander F

    2016-01-01

    In this study, the effects of (a) the minor allele frequency of the single nucleotide variant (SNV), (b) the degree of departure from normality of the trait, and (c) the position of the SNVs on type I error rates were investigated in the Genetic Analysis Workshop (GAW) 19 whole exome sequence data. To test the distribution of the type I error rate, 5 simulated traits were considered: standard normal and gamma distributed traits; 2 transformed versions of the gamma trait (log 10 and rank-based inverse normal transformations); and trait Q1 provided by GAW 19. Each trait was tested with 313,340 SNVs. Tests of association were performed with simple linear regression and average type I error rates were determined for minor allele frequency classes. Rare SNVs (minor allele frequency < 0.05) showed inflated type I error rates for non-normally distributed traits that increased as the minor allele frequency decreased. The inflation of average type I error rates increased as the significance threshold decreased. Normally distributed traits did not show inflated type I error rates with respect to the minor allele frequency for rare SNVs. There was no consistent effect of transformation on the uniformity of the distribution of the location of SNVs with a type I error.
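    One of the two transformations named above, the rank-based inverse normal transformation, can be sketched in a few lines. The abstract does not say which rank offset was used; the Blom offset c = 3/8 below is an assumption, as are the function name and the choice to give tied values their average rank.

```python
from statistics import NormalDist


def rank_inverse_normal(values, c=3 / 8):
    """Map values to normal quantiles of their ranks (ties get the average rank)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the run of tied values starting at sorted position i.
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    standard_normal = NormalDist()
    return [standard_normal.inv_cdf((r - c) / (n - 2 * c + 1)) for r in ranks]


# A small right-skewed sample (made-up numbers, for illustration only).
skewed = [0.1, 0.4, 0.2, 5.0, 12.0]
transformed = rank_inverse_normal(skewed)
print([round(x, 3) for x in transformed])
```

    The transformed values keep the original ordering but are symmetric around zero, which is why this transformation is often applied to skewed traits before linear regression.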

  5. Field Synopsis and Re-analysis of Systematic Meta-analyses of Genetic Association Studies in Multiple Sclerosis: a Bayesian Approach.

    PubMed

    Park, Jae Hyon; Kim, Joo Hi; Jo, Kye Eun; Na, Se Whan; Eisenhut, Michael; Kronbichler, Andreas; Lee, Keum Hwa; Shin, Jae Il

    2018-07-01

    To provide an up-to-date summary of multiple sclerosis susceptibility gene variants and assess their noteworthiness in hopes of finding true associations, we investigated the results of 44 meta-analyses on gene variants and multiple sclerosis published through December 2016. Out of 70 statistically significant genotype associations, roughly a fifth (21%) of the comparisons showed noteworthy false-positive report probability (FPRP) at a statistical power to detect an OR of 1.5 and at a prior probability of 10(-6) assumed for a random single nucleotide polymorphism. These associations (IRF8/rs17445836, STAT3/rs744166, HLA/rs4959093, HLA/rs2647046, HLA/rs7382297, HLA/rs17421624, HLA/rs2517646, HLA/rs9261491, HLA/rs2857439, HLA/rs16896944, HLA/rs3132671, HLA/rs2857435, HLA/rs9261471, HLA/rs2523393, HLA-DRB1/rs3135388, RGS1/rs2760524, PTGER4/rs9292777) also showed a noteworthy Bayesian false discovery probability (BFDP) and one additional association (CD24 rs8734/rs52812045) was also noteworthy via BFDP computation. Herein, we have identified several noteworthy biomarkers of multiple sclerosis susceptibility. We hope these data are used to study multiple sclerosis genetics and inform future screening programs.
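    The FPRP used above combines a significance level, statistical power and a prior probability in a single ratio. The sketch below uses the standard form FPRP = α(1 − π) / [α(1 − π) + (1 − β)π]; the function name and the example inputs are illustrative assumptions, and the way the authors derived power for an OR of 1.5 is not reproduced here.

```python
def fprp(alpha: float, power: float, prior: float) -> float:
    """False-positive report probability.

    alpha: significance level (often the observed p value)
    power: 1 - beta, the power to detect the assumed effect size
    prior: prior probability that the association is real
    """
    false_positive_mass = alpha * (1 - prior)
    true_positive_mass = power * prior
    return false_positive_mass / (false_positive_mass + true_positive_mass)


# Hypothetical inputs: p = 1e-8, power 0.9, prior 1e-6 as in the abstract.
print(f"{fprp(1e-8, 0.9, 1e-6):.4f}")
```

    With a prior as small as 10(-6), only associations with very small p values keep the FPRP low, which helps explain why most of the significant associations were not noteworthy.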

  6. Hb Molfetta [beta126(H4)Val-->Leu, GTG-->CTG]: a new, silent, neutral beta chain variant found in an Italian woman.

    PubMed

    Qualtieri, Antonio; Le, Pera Maria; Pedace, Vera; Magariello, Angela; Brancati, Carlo

    2002-02-01

    We have identified a new neutral hemoglobin variant in a pregnant Italian woman, that resulted from a GTG-->CTG replacement at codon 126 of the beta chain, corresponding to a Val-->Leu amino acid change at position beta126(H4). Thermal and isopropanol stability tests were normal and there were no abnormal clinical features. Routine electrophoretic and ion exchange chromatographic methods for hemoglobin separation failed to show this variant, but reversed phase high performance liquid chromatography revealed an abnormal peak eluting near the normal beta chain. No abnormal tryptic peptide was revealed on the high performance liquid chromatographic elution pattern of the total globin digest. The mutation was determined at the DNA level by amplification of the three beta exons by polymerase chain reaction and direct sequencing of one exon that showed an abnormal migration on single strand conformational polymorphism analysis.

  7. Strategic approaches to unraveling genetic causes of cardiovascular diseases

    USDA-ARS's Scientific Manuscript database

    DNA sequence variants are major components of the "causal field" for virtually all medical phenotypes, whether single gene familial disorders or complex traits without a clear familial aggregation. The causal variants in single gene disorders are necessary and sufficient to impart large effects. In ...

  8. Rare missense variants in CHRNB3 and CHRNA3 are associated with risk of alcohol and cocaine dependence.

    PubMed

    Haller, Gabe; Kapoor, Manav; Budde, John; Xuei, Xiaoling; Edenberg, Howard; Nurnberger, John; Kramer, John; Brooks, Andy; Tischfield, Jay; Almasy, Laura; Agrawal, Arpana; Bucholz, Kathleen; Rice, John; Saccone, Nancy; Bierut, Laura; Goate, Alison

    2014-02-01

    Previous findings have demonstrated that variants in nicotinic receptor genes are associated with nicotine, alcohol and cocaine dependence. Because of the substantial comorbidity, it has often been unclear whether a variant is associated with multiple substances or whether the association is actually with a single substance. To investigate the possible contribution of rare variants to the development of substance dependencies other than nicotine dependence, specifically alcohol and cocaine dependence, we undertook pooled sequencing of the coding regions and flanking sequence of CHRNA5, CHRNA3, CHRNB4, CHRNA6 and CHRNB3 in 287 African American and 1028 European American individuals from the Collaborative Study of the Genetics of Alcoholism (COGA). All members of families for whom any individual was sequenced (2504 African Americans and 7318 European Americans) were then genotyped for all variants identified by sequencing. For each gene, we then tested for association using FamSKAT. For European Americans, we find increased DSM-IV cocaine dependence symptoms (FamSKAT P = 2 × 10(-4)) and increased DSM-IV alcohol dependence symptoms (FamSKAT P = 5 × 10(-4)) among carriers of missense variants in CHRNB3. Additionally, one variant (rs149775276; H329Y) shows association with both cocaine dependence symptoms (P = 7.4 × 10(-5), β = 2.04) and alcohol dependence symptoms (P = 2.6 × 10(-4), β = 2.04). For African Americans, we find decreased cocaine dependence symptoms among carriers of missense variants in CHRNA3 (FamSKAT P = 0.005). Replication in an independent sample supports the role of rare variants in CHRNB3 and alcohol dependence (P = 0.006). These are the first results to implicate rare variants in CHRNB3 or CHRNA3 in risk for alcohol dependence or cocaine dependence.

  9. ENGINES: exploring single nucleotide variation in entire human genomes.

    PubMed

    Amigo, Jorge; Salas, Antonio; Phillips, Christopher

    2011-04-19

    Next generation ultra-sequencing technologies are starting to produce extensive quantities of data from entire human genome or exome sequences, and therefore new software is needed to present and analyse this vast amount of information. The 1000 Genomes project has recently released raw data for 629 complete genomes representing several human populations through their Phase I interim analysis and, although there are certain public tools available that allow exploration of these genomes, to date there is no tool that permits comprehensive population analysis of the variation catalogued by such data. We have developed a genetic variant site explorer able to retrieve data for Single Nucleotide Variation (SNVs), population by population, from entire genomes without compromising future scalability and agility. ENGINES (ENtire Genome INterface for Exploring SNVs) uses data from the 1000 Genomes Phase I to demonstrate its capacity to handle large amounts of genetic variation (>7.3 billion genotypes and 28 million SNVs), as well as deriving summary statistics of interest for medical and population genetics applications. The whole dataset is pre-processed and summarized into a data mart accessible through a web interface. The query system allows the combination and comparison of each available population sample, while searching by rs-number list, chromosome region, or genes of interest. Frequency and FST filters are available to further refine queries, while results can be visually compared with other large-scale Single Nucleotide Polymorphism (SNP) repositories such as HapMap or Perlegen. ENGINES is capable of accessing large-scale variation data repositories in a fast and comprehensive manner. It allows quick browsing of whole genome variation, while providing statistical information for each variant site such as allele frequency, heterozygosity or FST values for genetic differentiation. 
    Access to the data mart generating scripts and to the web interface is available at http://spsmart.cesga.es/engines.php. © 2011 Amigo et al; licensee BioMed Central Ltd.

  10. Chosen single nucleotide polymorphisms (SNPs) of enamel formation genes and dental caries in a population of Polish children.

    PubMed

    Gerreth, Karolina; Zaorska, Katarzyna; Zabel, Maciej; Borysewicz-Lewicka, Maria; Nowicki, Michał

    2017-09-01

    It is increasingly emphasized that host factors, particularly genetic ones, are of great interest in the etiology of dental caries. The aim of the study was to analyze the genotype and allele frequencies of single nucleotide polymorphisms (SNPs) in AMELX, AMBN, TUFT1, TFIP11, MMP20 and KLK4 genes and to assess their association with dental caries occurrence in a population of Polish children. The study was performed in 96 children (48 individuals with caries - "cases" and 48 free of this disease - "controls"), aged 20-42 months, chosen out of 262 individuals who had a dental examination and attended 4 day nurseries located in Poznań (Poland). Oral swabs were collected from both groups for molecular evaluation. Eleven selected SNP markers were genotyped by Sanger sequencing. Genotype and allele frequencies were calculated and a standard χ2 analysis was used to test for deviation from Hardy-Weinberg equilibrium. The association of genetic variations with caries susceptibility or resistance was assessed by Fisher's exact test and p ≤ 0.05 was considered statistically significant. Five markers were significantly associated with caries incidence in children in the study: rs17878486 in AMELX (p < 0.0001), rs34538475 in AMBN (p < 0.0001), rs2337360 in TUFT1 (p < 0.0001), and rs2235091 (p = 0.0085) and rs198969 (p = 0.0069) in KLK4. Genotype and allele frequencies indicated both risk and protective variants for these markers. Single nucleotide polymorphisms in the AMELX, AMBN, TUFT1 and KLK4 genes may be considered risk factors for dental caries occurrence in Polish children.
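
    The χ2 test for deviation from Hardy-Weinberg equilibrium mentioned in this record can be sketched as follows for a biallelic SNP. This is an illustration under assumptions, not the authors' pipeline: the function name and genotype counts are hypothetical, and the p-value uses the 1-degree-of-freedom chi-square survival function written in terms of `math.erfc`.

```python
import math


def hwe_chi_square(n_aa: int, n_ab: int, n_bb: int) -> tuple[float, float]:
    """Chi-square test (1 df) for deviation from Hardy-Weinberg equilibrium.

    n_aa, n_ab, n_bb: observed genotype counts for a biallelic marker.
    Returns (chi-square statistic, p-value).
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical genotype counts, for illustration only.
stat, pval = hwe_chi_square(30, 40, 30)
print(f"chi2 = {stat:.3f}, p = {pval:.4f}")
```

    In a sample as small as the one described (48 cases and 48 controls), expected genotype counts can be low, in which case an exact test is often preferred over the χ2 approximation.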

  11. Targeted Genetic Screen in Amyotrophic Lateral Sclerosis Reveals Novel Genetic Variants with Synergistic Effect on Clinical Phenotype.

    PubMed

    Cooper-Knock, Johnathan; Robins, Henry; Niedermoser, Isabell; Wyles, Matthew; Heath, Paul R; Higginbottom, Adrian; Walsh, Theresa; Kazoka, Mbombe; Ince, Paul G; Hautbergue, Guillaume M; McDermott, Christopher J; Kirby, Janine; Shaw, Pamela J

    2017-01-01

    Amyotrophic lateral sclerosis (ALS) is underpinned by an oligogenic rare variant architecture. Identified genetic variants of ALS include RNA-binding proteins containing prion-like domains (PrLDs). We hypothesized that screening genes encoding additional similar proteins will yield novel genetic causes of ALS. The most common genetic variant of ALS patients is a G4C2-repeat expansion within C9ORF72. We have shown that G4C2-repeat RNA sequesters RNA-binding proteins. A logical consequence of this is that loss-of-function mutations in G4C2-binding partners might contribute to ALS pathogenesis independently of and/or synergistically with C9ORF72 expansions. Targeted sequencing of genomic DNA encoding either RNA-binding proteins or known ALS genes (n = 274 genes) was performed in ALS patients to identify rare deleterious genetic variants and explore genotype-phenotype relationships. Genomic DNA was extracted from 103 ALS patients including 42 familial ALS patients and 61 young-onset (average age of onset 41 years) sporadic ALS patients; patients were chosen to maximize the probability of identifying genetic causes of ALS. Thirteen patients carried a G4C2-repeat expansion of C9ORF72. We identified 42 patients with rare deleterious variants; 6 patients carried more than one variant. Twelve mutations were discovered in known ALS genes which served as a validation of our strategy. Rare deleterious variants in RNA-binding proteins were significantly enriched in ALS patients compared to control frequencies (p = 5.31E-18). Nineteen patients featured at least one variant in an RNA-binding protein containing a PrLD. The number of variants per patient correlated with rate of disease progression (t-test, p = 0.033). We identified eighteen patients with a single variant in a G4C2-repeat binding protein. Patients with a G4C2-binding protein variant in combination with a C9ORF72 expansion had a significantly faster disease course (t-test, p = 0.025).
    Our data are consistent with an oligogenic model of ALS. We provide evidence for a number of entirely novel genetic variants of ALS caused by mutations in RNA-binding proteins. Moreover we show that these mutations act synergistically with each other and with C9ORF72 expansions to modify the clinical phenotype of ALS. A key finding is that this synergy is present only between functionally interacting variants. This work has significant implications for ALS therapy development.

  12. Genome-Wide Analysis of Gene-Gene and Gene-Environment Interactions Using Closed-Form Wald Tests.

    PubMed

    Yu, Zhaoxia; Demetriou, Michael; Gillen, Daniel L

    2015-09-01

    Despite the successful discovery of hundreds of variants for complex human traits using genome-wide association studies, the degree to which genes and environmental risk factors jointly affect disease risk is largely unknown. One obstacle toward this goal is that the computational effort required for testing gene-gene and gene-environment interactions is enormous. As a result, numerous computationally efficient tests were recently proposed. However, the validity of these methods often relies on unrealistic assumptions such as additive main effects, main effects at only one variable, no linkage disequilibrium between the two single-nucleotide polymorphisms (SNPs) in a pair or gene-environment independence. Here, we derive closed-form and consistent estimates for interaction parameters and propose to use Wald tests for testing interactions. The Wald tests are asymptotically equivalent to the likelihood ratio tests (LRTs), largely considered to be the gold standard tests but generally too computationally demanding for genome-wide interaction analysis. Simulation studies show that the proposed Wald tests have very similar performances with the LRTs but are much more computationally efficient. Applying the proposed tests to a genome-wide study of multiple sclerosis, we identify interactions within the major histocompatibility complex region. In this application, we find that (1) focusing on pairs where both SNPs are marginally significant leads to more significant interactions when compared to focusing on pairs where at least one SNP is marginally significant; and (2) parsimonious parameterization of interaction effects might decrease, rather than increase, statistical power. © 2015 WILEY PERIODICALS, INC.
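
    The generic form of a Wald test, as referred to in this record, can be sketched as below. The closed-form interaction estimators derived by the authors are not reproduced here; the sketch assumes that an estimate of the interaction parameter and its standard error are already available, and the function name and inputs are hypothetical.

```python
import math


def wald_test(estimate: float, std_error: float) -> tuple[float, float]:
    """Two-sided Wald test of the null hypothesis that a parameter equals zero.

    estimate: point estimate of the parameter (e.g. an interaction log odds ratio)
    std_error: estimated standard error of that estimate
    Returns (z statistic, two-sided p-value under a standard normal reference).
    """
    z = estimate / std_error
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # equals 2 * (1 - Phi(|z|))
    return z, p_value


# Hypothetical interaction estimate and standard error, for illustration only.
z_stat, p_val = wald_test(0.5, 0.25)
print(f"z = {z_stat:.2f}, p = {p_val:.4f}")
```

    Because the statistic needs only an estimate and its variance, no iterative model fitting is required per SNP pair, which is the source of the computational savings over likelihood ratio tests described in the abstract.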

  13. High-throughput matrix-assisted laser desorption ionization-time of flight mass spectrometry as an alternative approach to monitoring drug resistance of hepatitis B virus.

    PubMed

    Rybicka, Magda; Stalke, Piotr; Dreczewski, Marcin; Smiatacz, Tomasz; Bielawski, Krzysztof Piotr

    2014-01-01

    Long-term antiviral therapy of chronic hepatitis B virus (HBV) infection can lead to the selection of drug-resistant HBV variants and treatment failure. Moreover, these HBV strains are possibly present in treatment-naive patients. Currently available assays for the detection of HBV drug resistance can identify mutants that constitute ≥5% of the viral population. Furthermore, drug-resistant HBV variants can be detected when a viral load is >10(4) copies/ml (1,718 IU/ml). The aim of this study was to compare matrix-assisted laser desorption ionization-time of flight mass spectrometry (MALDI-TOF MS) and multitemperature single-strand conformation polymorphism (MSSCP) with commercially available assays for the detection of drug-resistant HBV strains. HBV DNA was extracted from 87 serum samples acquired from 45 chronic hepatitis B (CHB) patients. The 37 selected HBV variants were analyzed in 4 separate primer extension reactions on the MALDI-TOF MS. Moreover, MSSCP for identifying drug-resistant HBV YMDD variants was developed and turned out to be more sensitive than INNOLiPA HBV DR and direct sequencing. MALDI-TOF MS had the capability to detect mutant strains within a mixed viral population occurring with an allelic frequency of approximately 1% (with a specific value of ≥10(2) copies/ml, also expressed as ≥17.18 IU/ml). In our study, MSSCP detected 98% of the HBV YMDD variants among strains detected by the MALDI-TOF MS assay. The routine tests revealed results of 40% and 11%, respectively, for INNOLiPA and direct sequencing. The commonly available HBV tests are less sensitive than MALDI-TOF MS in the detection of HBV-resistant variants, including quasispecies.

  14. Association of Cancer Susceptibility Variants with Risk of Multiple Primary Cancers: the Population Architecture using Genomics and Epidemiology Study

    PubMed Central

    Park, S. Lani; Caberto, Christian P.; Lin, Yi; Goodloe, Robert J.; Dumitrescu, Logan; Love, Shelly-Ann; Matise, Tara C.; Hindorff, Lucia A.; Fowke, Jay H.; Schumacher, Fredrick R.; Beebe-Dimmer, Jennifer; Chen, Chu; Hou, Lifang; Thomas, Fridtjof; Deelman, Ewa; Han, Ying; Peters, Ulrike; North, Kari E.; Heiss, Gerardo; Crawford, Dana C.; Haiman, Christopher A.; Wilkens, Lynne R.; Bush, William S.; Kooperberg, Charles; Cheng, Iona; Le Marchand, Loïc

    2014-01-01

    Background: Multiple primary cancers account for ~16% of all incident cancers in the U.S. While genome-wide association studies (GWAS) have identified many common genetic variants associated with various cancer sites, no study has examined the association of these genetic variants with risk of multiple primary cancers (MPC). Methods: As part of the NHGRI Population Architecture using Genomics and Epidemiology (PAGE) study, we used data from the Multiethnic Cohort and Women’s Health Initiative. Incident MPC (IMPC) cases (n=1,385) were defined as participants diagnosed with >1 incident cancers after cohort entry. Participants diagnosed with only one incident cancer after cohort entry with follow-up equal to or longer than IMPC cases served as controls (single-index cancer controls; n=9,626). Fixed-effects meta-analyses of unconditional logistic regression analyses were used to evaluate the association between cancer risk variants and IMPC risk. To account for multiple comparisons, we used the false positive report probability (FPRP) to determine statistical significance. Results: A nicotine dependence-associated and lung cancer variant, CHRNA3 rs578776 (OR=1.16, 95% CI=1.05–1.26; p=0.004) and two breast cancer variants, EMBP1 rs11249433 and TOX3 rs3803662 (OR=1.16, 95% CI=1.04–1.28; p=0.005 and OR=1.13, 95% CI=1.03–1.23; p=0.006) were significantly associated with risk of IMPC. The associations for rs578776 and rs11249433 remained (p<0.05) after removing subjects who had lung or breast cancers, respectively (p-values≤0.046). These associations did not show significant heterogeneity by smoking status (p-heterogeneity≥0.53). Conclusions: Our study has identified rs578776 and rs11249433 as risk variants for IMPC. Impact: These findings may help to identify genetic regions associated with IMPC risk. PMID:25139936
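
    A fixed-effects meta-analysis of per-cohort logistic regression results, as named in this record, is commonly computed by inverse-variance weighting of log odds ratios. The following is a minimal sketch under that assumption; the function name and the per-cohort numbers are hypothetical and are not taken from the study.

```python
import math


def fixed_effects_meta(log_ors: list[float], std_errors: list[float]) -> tuple[float, float]:
    """Inverse-variance weighted fixed-effects pooling of log odds ratios.

    log_ors: per-study log odds ratio estimates
    std_errors: per-study standard errors of those estimates
    Returns (pooled log odds ratio, standard error of the pooled estimate).
    """
    weights = [1.0 / se ** 2 for se in std_errors]  # more precise studies weigh more
    total_weight = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, log_ors)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled, pooled_se


# Hypothetical per-cohort estimates, for illustration only.
pooled, se = fixed_effects_meta([math.log(1.20), math.log(1.12)], [0.07, 0.06])
low, high = math.exp(pooled - 1.96 * se), math.exp(pooled + 1.96 * se)
print(f"pooled OR = {math.exp(pooled):.2f} (95% CI {low:.2f} to {high:.2f})")
```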

  15. Genetic variants in ATM, H2AFX and MRE11 genes and susceptibility to breast cancer in the Polish population.

    PubMed

    Podralska, Marta; Ziółkowska-Suchanek, Iwona; Żurawek, Magdalena; Dzikiewicz-Krawczyk, Agnieszka; Słomski, Ryszard; Nowak, Jerzy; Stembalska, Agnieszka; Pesz, Karolina; Mosor, Maria

    2018-04-20

    DNA damage repair is a complex process, which can trigger the development of cancer if disturbed. In this study, we hypothesize a role of variants in the ATM, H2AFX and MRE11 genes in determining breast cancer (BC) susceptibility. We examined the whole sequence of the ATM kinase domain and estimated the frequency of founder mutations in the ATM gene (c.5932G > T, c.6095G > A, and c.7630-2A > C) and single nucleotide polymorphisms (SNPs) in H2AFX (rs643788, rs8551, rs7759, and rs2509049) and MRE11 (rs1061956 and rs2155209) among 315 breast cancer patients and 515 controls. The analysis was performed using high-resolution melting for new variants and the polymerase chain reaction-restriction fragment length polymorphism (PCR-RFLP) method for recurrent ATM mutations. H2AFX and MRE11 polymorphisms were analyzed using TaqMan assays. The cumulative genetic risk scores (CGRS) were calculated using unweighted and weighted approaches. We identified four mutations (c.6067G > A, c.8314G > A, c.8187A > T, and c.6095G > A) in the ATM gene in three BC cases and two control subjects. We observed a statistically significant association of H2AFX variants with BC. Risk alleles (the G of rs7759 and the T of rs8551 and rs2509049) were observed more frequently in BC cases compared to the control group, with P values, odds ratios (OR) and 95% confidence intervals (CIs) of 0.0018, 1.47 (1.19 to 1.82); 0.018, 1.33 (1.09 to 1.64); and 0.024, 1.3 (1.06 to 1.59), respectively. Haplotype-based tests identified a significant association of the H2AFX CACT haplotype with BC (P <  0.0001, OR = 27.29, 95% CI 3.56 to 209.5). The risk of BC increased with the growing number of risk alleles. The OR (95% CI) for carriers of ≥ four risk alleles was 1.71 (1.11 to 2.62) for the CGRS. This study confirms that H2AFX variants are associated with an increased risk of BC. The above-reported sequence variants of MRE11 genes may not constitute a risk factor of breast cancer in the Polish population. 
    The contribution of mutations detected in the ATM gene to the development of breast cancer needs further detailed study.
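
    Odds ratios with 95% confidence intervals of the kind reported in this record can be obtained from a 2x2 table of allele counts. The sketch below assumes the standard log-based (Woolf) interval; the record does not state which method the authors used, and the function name and counts are hypothetical.

```python
import math


def odds_ratio_ci(a: int, b: int, c: int, d: int, z: float = 1.96) -> tuple[float, float, float]:
    """Odds ratio and Wald-type confidence interval from a 2x2 table.

    a: risk-allele count in cases      b: other-allele count in cases
    c: risk-allele count in controls   d: other-allele count in controls
    z: normal quantile for the interval (1.96 gives roughly 95% coverage)
    Returns (odds ratio, lower bound, upper bound). All counts must be positive.
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical allele counts, for illustration only.
or_value, low, high = odds_ratio_ci(10, 20, 5, 40)
print(f"OR = {or_value:.2f} (95% CI {low:.2f} to {high:.2f})")
```

    The interval widens sharply when any cell count is small, which would be consistent with the very wide interval reported for the rare CACT haplotype (3.56 to 209.5).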

  16. DEIVA: a web application for interactive visual analysis of differential gene expression profiles.

    PubMed

    Harshbarger, Jayson; Kratz, Anton; Carninci, Piero

    2017-01-07

    Differential gene expression (DGE) analysis is a technique to identify statistically significant differences in RNA abundance for genes or arbitrary features between different biological states. The result of a DGE test is typically further analyzed using statistical software, spreadsheets or custom ad hoc algorithms. We identified a need for a web-based system to share DGE statistical test results, and locate and identify genes in DGE statistical test results with a very low barrier of entry. We have developed DEIVA, a free and open source, browser-based single page application (SPA) with a strong emphasis on being user friendly that enables locating and identifying single or multiple genes in an immediate, interactive, and intuitive manner. By design, DEIVA scales with very large numbers of users and datasets. Compared to existing software, DEIVA offers a unique combination of design decisions that enable inspection and analysis of DGE statistical test results with an emphasis on ease of use.

  17. Long memory and multifractality: A joint test

    NASA Astrophysics Data System (ADS)

    Goddard, John; Onali, Enrico

    2016-06-01

    The properties of statistical tests for hypotheses concerning the parameters of the multifractal model of asset returns (MMAR) are investigated, using Monte Carlo techniques. We show that, in the presence of multifractality, conventional tests of long memory tend to over-reject the null hypothesis of no long memory. Our test addresses this issue by jointly estimating long memory and multifractality. The estimation and test procedures are applied to exchange rate data for 12 currencies. Among the nested model specifications that are investigated, in 11 out of 12 cases, daily returns are most appropriately characterized by a variant of the MMAR that applies a multifractal time-deformation process to NIID returns. There is no evidence of long memory.

  18. Rapid functional analysis of computationally complex rare human IRF6 gene variants using a novel zebrafish model.

    PubMed

    Li, Edward B; Truong, Dawn; Hallett, Shawn A; Mukherjee, Kusumika; Schutte, Brian C; Liao, Eric C

    2017-09-01

    Large-scale sequencing efforts have captured a rapidly growing catalogue of genetic variations. However, the accurate establishment of gene variant pathogenicity remains a central challenge in translating personal genomics information to clinical decisions. Interferon Regulatory Factor 6 (IRF6) gene variants are significant genetic contributors to orofacial clefts. Although approximately three hundred IRF6 gene variants have been documented, their effects on protein functions remain difficult to interpret. Here, we demonstrate the protein functions of human IRF6 missense gene variants could be rapidly assessed in detail by their abilities to rescue the irf6 -/- phenotype in zebrafish through variant mRNA microinjections at the one-cell stage. The results revealed many missense variants previously predicted by traditional statistical and computational tools to be loss-of-function and pathogenic retained partial or full protein function and rescued the zebrafish irf6 -/- periderm rupture phenotype. Through mRNA dosage titration and analysis of the Exome Aggregation Consortium (ExAC) database, IRF6 missense variants were grouped by their abilities to rescue at various dosages into three functional categories: wild type function, reduced function, and complete loss-of-function. This sensitive and specific biological assay was able to address the nuanced functional significances of IRF6 missense gene variants and overcome many limitations faced by current statistical and computational tools in assigning variant protein function and pathogenicity. Furthermore, it unlocked the possibility for characterizing yet undiscovered human IRF6 missense gene variants from orofacial cleft patients, and illustrated a generalizable functional genomics paradigm in personalized medicine.

  19. Characterization of the two intra-individual sequence variants in the 18S rRNA gene in the plant parasitic nematode, Rotylenchulus reniformis.

    PubMed

    Nyaku, Seloame T; Sripathi, Venkateswara R; Kantety, Ramesh V; Gu, Yong Q; Lawrence, Kathy; Sharma, Govind C

    2013-01-01

    The 18S rRNA gene is fundamental to cellular and organismal protein synthesis and because of its stable persistence through generations it is also used in phylogenetic analysis among taxa. Sequence variation in this gene within a single species is rare, but it has been observed in a few metazoan organisms. More often, such variation has been reported in the non-transcribed spacer region. Here, we have identified two sequence variants within the near-full coding region of the 18S rRNA gene from a single reniform nematode (RN) Rotylenchulus reniformis labeled as reniform nematode variant 1 (RN_VAR1) and variant 2 (RN_VAR2). All sequences from three of the four isolates had both RN variants in their sequences; however, isolate 13B had only RN variant 2 sequence. Specific variable base sites (96 or 5.5%) were found within the 18S rRNA gene that can clearly distinguish the two 18S rDNA variants of RN, in 11 (25.0%) and 33 (75.0%) of the 44 RN clones, for RN_VAR1 and RN_VAR2, respectively. Neighbor-joining trees show that the RN_VAR1 is very similar to the previously existing R. reniformis sequence in GenBank, while the RN_VAR2 sequence is more divergent. This is the first report of the identification of two major variants of the 18S rRNA gene in the same single RN; it documents the specific base variation between the two variants and hypothesizes on the simultaneous co-existence of these two variants for this gene.

  20. Characterization of the Two Intra-Individual Sequence Variants in the 18S rRNA Gene in the Plant Parasitic Nematode, Rotylenchulus reniformis

    PubMed Central

    Nyaku, Seloame T.; Sripathi, Venkateswara R.; Kantety, Ramesh V.; Gu, Yong Q.; Lawrence, Kathy; Sharma, Govind C.

    2013-01-01

    The 18S rRNA gene is fundamental to cellular and organismal protein synthesis and because of its stable persistence through generations it is also used in phylogenetic analysis among taxa. Sequence variation in this gene within a single species is rare, but it has been observed in a few metazoan organisms. More often, such variation has been reported in the non-transcribed spacer region. Here, we have identified two sequence variants within the near-full coding region of the 18S rRNA gene from a single reniform nematode (RN) Rotylenchulus reniformis labeled as reniform nematode variant 1 (RN_VAR1) and variant 2 (RN_VAR2). All sequences from three of the four isolates had both RN variants in their sequences; however, isolate 13B had only RN variant 2 sequence. Specific variable base sites (96 or 5.5%) were found within the 18S rRNA gene that can clearly distinguish the two 18S rDNA variants of RN, in 11 (25.0%) and 33 (75.0%) of the 44 RN clones, for RN_VAR1 and RN_VAR2, respectively. Neighbor-joining trees show that the RN_VAR1 is very similar to the previously existing R. reniformis sequence in GenBank, while the RN_VAR2 sequence is more divergent. This is the first report of the identification of two major variants of the 18S rRNA gene in the same single RN; it documents the specific base variation between the two variants and hypothesizes on the simultaneous co-existence of these two variants for this gene. PMID:23593343

  1. Bioavailability of Lumefantrine Is Significantly Enhanced with a Novel Formulation Approach, an Outcome from a Randomized, Open-Label Pharmacokinetic Study in Healthy Volunteers.

    PubMed

    Jain, Jay Prakash; Leong, F Joel; Chen, Lan; Kalluri, Sampath; Koradia, Vishal; Stein, Daniel S; Wolf, Marie-Christine; Sunkara, Gangadhar; Kota, Jagannath

    2017-09-01

    The artemether-lumefantrine combination requires food intake for the optimal absorption of lumefantrine. In an attempt to enhance the bioavailability of lumefantrine, new solid dispersion formulations (SDF) were developed, and the pharmacokinetics of two SDF variants were assessed in a randomized, open-label, sequential two-part study in healthy volunteers. In part 1, the relative bioavailability of the two SDF variants was compared with that of the conventional formulation after administration of a single dose of 480 mg under fasted conditions in three parallel cohorts. In part 2, the pharmacokinetics of lumefantrine from both SDF variants were evaluated after a single dose of 480 mg under fed conditions and a single dose of 960 mg under fasted conditions. The bioavailability of lumefantrine from SDF variant 1 and variant 2 increased up to ∼48-fold and ∼24-fold, respectively, relative to that of the conventional formulation. Both variants demonstrated a positive food effect and a less than proportional increase in exposure between the 480-mg and 960-mg doses. Most adverse events (AEs) were mild to moderate in severity and not suspected to be related to the study drug. All five drug-related AEs occurred in subjects taking SDF variant 2. No clinically significant treatment-emergent changes in vital signs, electrocardiograms, or laboratory blood assessments were noted. The solid dispersion formulation enhances the lumefantrine bioavailability to a significant extent, and SDF variant 1 is superior to SDF variant 2. Copyright © 2017 Jain et al.

  2. Empirical Bayes scan statistics for detecting clusters of disease risk variants in genetic studies.

    PubMed

    McCallum, Kenneth J; Ionita-Laza, Iuliana

    2015-12-01

    Recent developments of high-throughput genomic technologies offer an unprecedented detailed view of the genetic variation in various human populations, and promise to lead to significant progress in understanding the genetic basis of complex diseases. Despite this tremendous advance in data generation, it remains very challenging to analyze and interpret these data due to their sparse and high-dimensional nature. Here, we propose novel applications and new developments of empirical Bayes scan statistics to identify genomic regions significantly enriched with disease risk variants. We show that the proposed empirical Bayes methodology can be substantially more powerful than existing scan statistics methods especially so in the presence of many non-disease risk variants, and in situations when there is a mixture of risk and protective variants. Furthermore, the empirical Bayes approach has greater flexibility to accommodate covariates such as functional prediction scores and additional biomarkers. As proof-of-concept we apply the proposed methods to a whole-exome sequencing study for autism spectrum disorders and identify several promising candidate genes. © 2015, The International Biometric Society.

  3. Clinical utility of genetic testing in pediatric drug-resistant epilepsy: a pilot study.

    PubMed

    Ream, Margie A; Mikati, Mohamad A

    2014-08-01

    The utility of genetic testing in pediatric drug-resistant epilepsy (PDRE), its yield in "real life" clinical practice, and the practical implications of such testing are yet to be determined. We aimed to start addressing these gaps in our knowledge as they apply to a patient population seen in a tertiary care center. We retrospectively reviewed our experience with the use of clinically available genetic tests in the diagnosis and management of PDRE in one clinic over one year. Genetic testing included, depending on clinical judgment, one or more of the following: karyotype, chromosomal microarray, single gene sequencing, gene sequencing panels, and/or whole exome sequencing (WES). We were more likely to perform genetic testing in patients with developmental delay, epileptic encephalopathy, and generalized epilepsy. In our unique population, the yield of specific genetic diagnosis was relatively high: karyotype 14.3%, microarray 16.7%, targeted single gene sequencing 15.4%, gene panels 46.2%, and WES 16.7%. Overall yield of diagnosis from at least one of the above tests was 34.5%. Disease-causing mutations that were not clinically suspected based on the patients' phenotypes and representing novel phenotypes were found in 6.9% (2/29), with an additional 17.2% (5/29) demonstrating pharmacologic variants. Three patients were incidentally found to be carriers of recessive neurologic diseases (10.3%). Variants of unknown significance (VUSs) were identified in 34.5% (10/29). We conclude that genetic testing had at least some utility in our patient population of PDRE, that future similar larger studies in various populations are warranted, and that clinics offering such tests must be prepared to address the complicated questions raised by the results of such testing. Copyright © 2014. Published by Elsevier Inc.

  4. Low Frequency Variants, Collapsed Based on Biological Knowledge, Uncover Complexity of Population Stratification in 1000 Genomes Project Data

    PubMed Central

    Moore, Carrie B.; Wallace, John R.; Wolfe, Daniel J.; Frase, Alex T.; Pendergrass, Sarah A.; Weiss, Kenneth M.; Ritchie, Marylyn D.

    2013-01-01

    Analyses investigating low frequency variants have the potential for explaining additional genetic heritability of many complex human traits. However, the natural frequencies of rare variation between human populations strongly confound genetic analyses. We have applied a novel collapsing method to identify biological features with low frequency variant burden differences in thirteen populations sequenced by the 1000 Genomes Project. Our flexible collapsing tool utilizes expert biological knowledge from multiple publicly available database sources to direct feature selection. Variants were collapsed according to genetically driven features, such as evolutionarily conserved regions, regulatory regions, genes, and pathways. We have conducted an extensive comparison of low frequency variant burden differences (MAF<0.03) between populations from 1000 Genomes Project Phase I data. We found that on average 26.87% of gene bins, 35.47% of intergenic bins, 42.85% of pathway bins, 14.86% of ORegAnno regulatory bins, and 5.97% of evolutionarily conserved regions show statistically significant differences in low frequency variant burden across populations from the 1000 Genomes Project. The proportion of bins with significant differences in low frequency burden depends on the ancestral similarity of the two populations compared and types of features tested. Even closely related populations had notable differences in low frequency burden, but fewer differences than populations from different continents. Furthermore, conserved or functionally relevant regions had fewer significant differences in low frequency burden than regions under less evolutionary constraint. This degree of low frequency variant differentiation across diverse populations and feature elements highlights the critical importance of considering population stratification in the new era of DNA sequencing and low frequency variant genomic analyses. PMID:24385916

  5. Polygenic obesity in humans.

    PubMed

    Hinney, Anke; Hebebrand, Johannes

    2008-01-01

    The molecular genetic analysis of obesity has led to the identification of a limited number of confirmed major genes. While such major genes have a clear influence on the development of the phenotype, the underlying mutations are, however, (extremely) infrequent and thus of only minor clinical importance. The genetic predisposition to obesity must thus be polygenic; a number of such variants should be found in most obese subjects; however, these variants predisposing to obesity are also found in normal weight and even lean individuals. Therefore, a polygene can only be identified and validated by statistical analyses: the appropriate gene variant (allele) occurs more frequently in obese than in non-obese subjects. Each single polygene makes only a small contribution to the development of obesity. The 103Ile allele of the Val103Ile single nucleotide polymorphism (SNP) of the melanocortin-4 receptor gene (MC4R) was the first confirmed polygenetic variant with an influence on the body mass index (BMI); the more common Val103 allele is more frequent in obese individuals. As determined in a recent, large-scale meta-analysis, the effect size of this allele on mean BMI was approximately -0.5 kg/m². The first genome-wide association study (GWA) for obesity, based on approximately 100,000 SNPs analyzed in families of the Framingham study, revealed that a SNP in the proximity of the insulin-induced gene 2 (INSIG2) was associated with obesity. The positive result was replicated in independent samples; however, some other study groups detected no association. Currently, a meta-analysis is ongoing; its result will contribute to the evaluation of the importance of the INSIG2 polymorphism in body weight regulation. SNP alleles in intron 1 of the fat mass and obesity associated gene (FTO) confer the most relevant polygenic effect on obesity. In the first GWA for extreme early onset obesity we substantiated that variation in FTO strongly contributes to early onset obesity. Copyright 2008 S. Karger AG, Basel.

  6. An exponential filter model predicts lightness illusions

    PubMed Central

    Zeman, Astrid; Brooks, Kevin R.; Ghebreab, Sennay

    2015-01-01

    Lightness, or perceived reflectance of a surface, is influenced by surrounding context. This is demonstrated by the Simultaneous Contrast Illusion (SCI), where a gray patch is perceived lighter against a black background and vice versa. Conversely, assimilation is where the lightness of the target patch moves toward that of the bounding areas and can be demonstrated in White's effect. Blakeslee and McCourt (1999) introduced an oriented difference-of-Gaussian (ODOG) model that is able to account for both contrast and assimilation in a number of lightness illusions and that has been subsequently improved using localized normalization techniques. We introduce a model inspired by image statistics that is based on a family of exponential filters, with kernels spanning across multiple sizes and shapes. We include an optional second stage of normalization based on contrast gain control. Our model was tested on a well-known set of lightness illusions that have previously been used to evaluate ODOG and its variants, and model lightness values were compared with typical human data. We investigate whether predictive success depends on filters of a particular size or shape and whether pooling information across filters can improve performance. The best single filter correctly predicted the direction of lightness effects for 21 out of 27 illusions. Combining two filters together increased the best performance to 23, with asymptotic performance at 24 for an arbitrarily large combination of filter outputs. While normalization improved prediction magnitudes, it only slightly improved overall scores in direction predictions. The prediction performance of 24 out of 27 illusions equals that of the best performing ODOG variant, with greater parsimony. 
    Our model shows that V1-style orientation-selectivity is not necessary to account for lightness illusions and that a low-level model based on image statistics is able to account for a wide range of both contrast and assimilation effects. PMID:26157381

  7. Cloning and characterization of human immunodeficiency virus type 1 variants diminished in the ability to induce syncytium-independent cytolysis.

    PubMed Central

    Stevenson, M; Haggerty, S; Lamonica, C; Mann, A M; Meier, C; Wasiak, A

    1990-01-01

    The phenomenon of interference was exploited to isolate low-abundance noncytopathic human immunodeficiency virus type 1 (HIV-1) variants from a primary HIV-1 isolate from an asymptomatic HIV-1-seropositive hemophiliac. Successive rounds of virus infection of a cytolysis-susceptible CD4+ cell line and isolation of surviving cells resulted in selective amplification of an HIV-1 variant reduced in the ability to induce cytolysis. The presence of a PvuII polymorphism facilitated subsequent amplification and cloning of cytopathic and noncytopathic HIV-1 variants from the primary isolate. Cloned virus stocks from cytopathic and noncytopathic variants exhibited similar replication kinetics, infectivity, and syncytium induction in susceptible host cells. The noncytopathic HIV-1 variant was unable, however, to induce single-cell killing in susceptible host cells. Construction of viral hybrids in which regions of cytopathic and noncytopathic variants were exchanged indicated that determinants for the noncytopathic phenotype map to the envelope glycoprotein. Sequence analysis of the envelope coding regions indicated the absence of two highly conserved N-linked glycosylation sites in the noncytopathic HIV-1 variant, which accompanied differences in processing of precursor gp160 envelope glycoprotein. These results demonstrate that determinants for syncytium-independent single-cell killing are located within the envelope glycoprotein and suggest that single-cell killing is profoundly influenced by alterations in envelope sequence which affect posttranslational processing of HIV-1 envelope glycoprotein within the infected cell. PMID:1695254

  8. A probabilistic method for testing and estimating selection differences between populations.

    PubMed

    He, Yungang; Wang, Minxian; Huang, Xin; Li, Ran; Xu, Hongyang; Xu, Shuhua; Jin, Li

    2015-12-01

    Human populations around the world encounter various environmental challenges and, consequently, develop genetic adaptations to different selection forces. Identifying the differences in natural selection between populations is critical for understanding the roles of specific genetic variants in evolutionary adaptation. Although numerous methods have been developed to detect genetic loci under recent directional selection, a probabilistic solution for testing and quantifying selection differences between populations is lacking. Here we report the development of a probabilistic method for testing and estimating selection differences between populations. By use of a probabilistic model of genetic drift and selection, we showed that logarithm odds ratios of allele frequencies provide estimates of the differences in selection coefficients between populations. The estimates approximate a normal distribution, and variance can be estimated using genome-wide variants. This allows us to quantify differences in selection coefficients and to determine the confidence intervals of the estimate. Our work also revealed the link between genetic association testing and hypothesis testing of selection differences. It therefore supplies a solution for hypothesis testing of selection differences. This method was applied to a genome-wide data analysis of Han and Tibetan populations. The results confirmed that both the EPAS1 and EGLN1 genes are under statistically different selection in Han and Tibetan populations. We further estimated differences in the selection coefficients for genetic variants involved in melanin formation and determined their confidence intervals between continental population groups. Application of the method to empirical data demonstrated the outstanding capability of this novel approach for testing and quantifying differences in natural selection. © 2015 He et al.; Published by Cold Spring Harbor Laboratory Press.
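
    The quantity named in this abstract, a log odds ratio of allele frequencies treated as approximately normal, can be illustrated with a minimal sketch. The function names, the example frequencies, and the variance value are hypothetical; in the abstract the variance is estimated from genome-wide variants, and any scaling from the log odds ratio to a selection-coefficient difference is not shown here.

```python
import math


def log_odds_ratio(freq_a: float, freq_b: float) -> float:
    """Log odds ratio of one allele's frequency in population A versus B."""
    return math.log(freq_a / (1.0 - freq_a)) - math.log(freq_b / (1.0 - freq_b))


def normal_test(estimate: float, variance: float, z_crit: float = 1.959964):
    """Two-sided z test and 95% interval under a normal approximation."""
    se = math.sqrt(variance)
    z = estimate / se
    # Two-sided tail probability of a standard normal: P(|Z| > |z|).
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value, (estimate - z_crit * se, estimate + z_crit * se)


# Hypothetical inputs for illustration only, not values from the study.
estimate = log_odds_ratio(0.80, 0.10)
z, p_value, ci = normal_test(estimate, variance=0.25)
print(f"log OR = {estimate:.3f}, z = {z:.2f}, p = {p_value:.2e}, "
      f"95% CI = ({ci[0]:.3f}, {ci[1]:.3f})")
```

    A confidence interval that excludes zero would correspond, under these assumptions, to rejecting equal selection at that locus in the two populations.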

  9. Influence of cytochrome 2C19 allelic variants on on-treatment platelet reactivity evaluated by five different platelet function tests.

    PubMed

    Gremmel, Thomas; Kopp, Christoph W; Moertl, Deddo; Seidinger, Daniela; Koppensteiner, Renate; Panzer, Simon; Mannhalter, Christine; Steiner, Sabine

    2012-05-01

    The antiplatelet effect of clopidogrel has been linked to cytochrome P450 2C19 (CYP2C19) carrier status. The presence of loss of function and gain of function variants was found to have a gene-dose effect on clopidogrel metabolism. However, genotyping is only one aspect of predicting response to clopidogrel and several platelet function tests are available to measure platelet response. Patients and methods: We studied the influence of CYP2C19 allelic variants on on-treatment platelet reactivity as assessed by light transmission aggregometry (LTA), the VerifyNow P2Y12 assay, the VASP assay, multiple electrode aggregometry (MEA), and the Impact-R in 288 patients after stenting for cardiovascular disease. Allelic variants of CYP2C19 were determined with the Infiniti® CYP450 2C19+ assay and categorized into four metabolizer states (ultrarapid, extensive, intermediate, poor). Platelet reactivity increased linearly from ultrarapid to poor metabolizers using the VerifyNow P2Y12 assay (P = 0.04), the VASP assay (P = 0.02) and the Impact-R (P = 0.04). The proportion of patients with high on-treatment residual platelet reactivity (HRPR) identified by LTA, the VerifyNow P2Y12 assay and the VASP assay increased when the metabolizer status decreased, while no such relationship could be identified for results of MEA and Impact-R. The presence of loss of function variants (*2/*2, *2-8*/wt, *2/*17) was an independent predictor of HRPR in LTA and the VASP assay while it did not reach statistical significance in the VerifyNow P2Y12 assay, MEA, and the Impact-R. Depending on the type of platelet function test, differences in the association of on-treatment platelet reactivity with CYP2C19 carrier status are observed. Copyright © 2011 Elsevier Ltd. All rights reserved.

  10. A frequent regulatory variant of the estrogen-related receptor alpha gene associated with BMD in French-Canadian premenopausal women.

    PubMed

    Laflamme, Nathalie; Giroux, Sylvie; Loredo-Osti, J Concepción; Elfassihi, Latifa; Dodin, Sylvie; Blanchet, Claudine; Morgan, Kenneth; Giguère, Vincent; Rousseau, François

    2005-06-01

    Genes are important BMD determinants. We studied the association of an ESRRA gene functional variant with BMD in 1335 premenopausal women. The ESRRA genotype was an independent predictor of L2-L4 BMD, with an effect similar to smoking and equivalent to a 10-kg difference in weight. Several genetic polymorphisms have been associated with osteoporosis or osteoporosis fractures, but no functional effect has been shown for most of these gene variants. Because functional studies have implicated estrogen-related receptor alpha (ESRRA) in bone metabolism, we evaluated whether a recently described regulatory variant of the ESRRA gene is associated with lumbar and hip BMD as measured by DXA and with heel bone parameters as measured by quantitative ultrasound (QUS). Heel bone parameters were measured by right calcaneal QUS in 1335 healthy French-Canadian premenopausal women, and one-half of these women also had their BMD evaluated at two sites: femoral neck and lumbar spine (L2-L4) by DXA. All bone measures were tested separately for association with the ESRRA genotype by analysis of covariance. The significance of the ESRRA contribution to the model was also assessed by two different permutation tests. A statistically significant association between ESRRA genotype and lumbar spine BMD was observed: women carrying the long ESRRA genotype had a 3.9% (0.045 g/cm2) higher lumbar spine BMD than those carrying the short ESRRA genotype (p = 0.004), independently of other risk factors measured. This effect of ESRRA genotype is similar to the effect of smoking and equivalent to a 10-kg difference in weight. This association was confirmed by permutation tests (p = 0.004). The same trend was observed for femoral neck BMD (2.6%, p = 0.07). However, no association was observed between ESRRA and QUS heel bone measures. These results support the genetic influence of this ESRRA regulatory variant on BMD.

  11. Rare variant testing across methods and thresholds using the multi-kernel sequence kernel association test (MK-SKAT).

    PubMed

    Urrutia, Eugene; Lee, Seunggeun; Maity, Arnab; Zhao, Ni; Shen, Judong; Li, Yun; Wu, Michael C

    Analysis of rare genetic variants has focused on region-based analysis wherein a subset of the variants within a genomic region is tested for association with a complex trait. Two important practical challenges have emerged. First, it is difficult to choose which test to use. Second, it is unclear which group of variants within a region should be tested. Both depend on the unknown true state of nature. Therefore, we develop the Multi-Kernel SKAT (MK-SKAT) which tests across a range of rare variant tests and groupings. Specifically, we demonstrate that several popular rare variant tests are special cases of the sequence kernel association test which compares pair-wise similarity in trait value to similarity in the rare variant genotypes between subjects as measured through a kernel function. Choosing a particular test is equivalent to choosing a kernel. Similarly, choosing which group of variants to test also reduces to choosing a kernel. Thus, MK-SKAT uses perturbation to test across a range of kernels. Simulations and real data analyses show that our framework controls type I error while maintaining high power across settings: MK-SKAT loses power when compared to the kernel for a particular scenario but has much greater power than poor choices.

  12. Crohn’s Disease and Genetic Hitchhiking at IBD5

    PubMed Central

    Huff, Chad D.; Witherspoon, David J.; Zhang, Yuhua; Gatenbee, Chandler; Denson, Lee A.; Kugathasan, Subra; Hakonarson, Hakon; Whiting, April; Davis, Chadwick T.; Wu, Wilfred; Xing, Jinchuan; Watkins, W. Scott; Bamshad, Michael J.; Bradfield, Jonathan P.; Bulayeva, Kazima; Simonson, Tatum S.; Jorde, Lynn B.; Guthery, Stephen L.

    2012-01-01

    Inflammatory bowel disease 5 (IBD5) is a 250 kb haplotype on chromosome 5 that is associated with an increased risk of Crohn’s disease in Europeans. The OCTN1 gene is centrally located on IBD5 and encodes a transporter of the antioxidant ergothioneine (ET). The 503F variant of OCTN1 is strongly associated with IBD5 and is a gain-of-function mutation that increases absorption of ET. Although 503F has been implicated as the variant potentially responsible for Crohn’s disease susceptibility at IBD5, there is little evidence beyond statistical association to support its role in disease causation. We hypothesize that 503F is a recent adaptation in Europeans that swept to relatively high frequency and that disease association at IBD5 results not from 503F itself, but from one or more nearby hitchhiking variants, in the genes IRF1 or IL5. To test for evidence of recent positive selection on the 503F allele, we employed the iHS statistic, which was significant in the European CEU HapMap population (P = 0.0007) and European Human Genome Diversity Panel populations (P ≤ 0.01). To evaluate the hypothesis of disease-variant hitchhiking, we performed haplotype association tests on high-density microarray data in a sample of 1,868 Crohn’s disease cases and 5,550 controls. We found that 503F haplotypes with recombination breakpoints between OCTN1 and IRF1 or IL5 were not associated with disease (odds ratio [OR]: 1.05, P = 0.21). In contrast, we observed strong disease association for 503F haplotypes with no recombination between these three genes (OR: 1.24, P = 2.6 × 10⁻⁸), as expected if the sweeping haplotype harbored one or more disease-causing mutations in IRF1 or IL5. To further evaluate these disease-gene candidates, we obtained expression data from lower gastrointestinal biopsies of healthy individuals and Crohn’s disease patients. We observed a 72% increase in gene expression of IRF1 among Crohn’s disease patients (P = 0.0006) and no significant difference in expression of OCTN1. Collectively, these data indicate that the 503F variant has increased in frequency due to recent positive selection and that disease-causing variants in linkage disequilibrium with 503F have hitchhiked to relatively high frequency, thus forming the IBD5 risk haplotype. Finally, our association results and expression data support IRF1 as a strong candidate for Crohn’s disease causation. PMID:21816865

  13. Predicting type 2 diabetes using genetic and environmental risk factors in a multi-ethnic Malaysian cohort.

    PubMed

    Abdullah, N; Abdul Murad, N A; Mohd Haniff, E A; Syafruddin, S E; Attia, J; Oldmeadow, C; Kamaruddin, M A; Abd Jalal, N; Ismail, N; Ishak, M; Jamal, R; Scott, R J; Holliday, E G

    2017-08-01

    Malaysia has a high and rising prevalence of type 2 diabetes (T2D). While environmental (non-genetic) risk factors for the disease are well established, the role of genetic variations and gene-environment interactions remains understudied in this population. This study aimed to estimate the relative contributions of environmental and genetic risk factors to T2D in Malaysia and also to assess evidence for gene-environment interactions that may explain additional risk variation. This was a case-control study including 1604 Malays, 1654 Chinese and 1728 Indians from the Malaysian Cohort Project. The proportion of T2D risk variance explained by known genetic and environmental factors was assessed by fitting multivariable logistic regression models and evaluating McFadden's pseudo R² and the area under the receiver-operating characteristic curve (AUC). Models with and without the genetic risk score (GRS) were compared using the log likelihood ratio Chi-squared test and AUCs. Multiplicative interaction between genetic and environmental risk factors was assessed via logistic regression within and across ancestral groups. Interactions were assessed for the GRS and its 62 constituent variants. The models including environmental risk factors only had pseudo R² values of 16.5-28.3% and AUC of 0.75-0.83. Incorporating a genetic score aggregating 62 T2D-associated risk variants significantly increased the model fit (likelihood ratio P-values of 2.50 × 10⁻⁴ to 4.83 × 10⁻¹²) and increased the pseudo R² by about 1-2% and AUC by 1-3%. None of the gene-environment interactions reached significance after multiple testing adjustment, either for the GRS or individual variants. For individual variants, 33 out of 310 tested associations showed nominal statistical significance with 0.001 < P < 0.05. This study suggests that known genetic risk variants contribute a significant but small amount to overall T2D risk variation in Malaysian population groups. If gene-environment interactions involving common genetic variants exist, they are likely of small effect, requiring substantially larger samples for detection. Copyright © 2017 The Royal Society for Public Health. All rights reserved.
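
    The two model-comparison quantities named in this abstract can be sketched as follows. This is a minimal illustration, assuming the genetic risk score enters as a single added term (one degree of freedom); the function names and the log-likelihood values are hypothetical and are not taken from the study.

```python
import math


def mcfadden_r2(loglik_model: float, loglik_null: float) -> float:
    """McFadden's pseudo R^2: 1 - lnL(model) / lnL(intercept-only model)."""
    return 1.0 - loglik_model / loglik_null


def likelihood_ratio_test_df1(loglik_reduced: float, loglik_full: float):
    """Likelihood ratio statistic and its chi-squared p-value for 1 df."""
    stat = 2.0 * (loglik_full - loglik_reduced)
    # For 1 degree of freedom, P(chi2 > x) equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(max(stat, 0.0) / 2.0))
    return stat, p_value


# Hypothetical log-likelihoods for illustration only.
ll_null = -1000.0        # intercept-only model
ll_environment = -800.0  # environmental risk factors only
ll_with_grs = -790.0     # environmental factors plus genetic risk score

print("pseudo R2 (environment):", mcfadden_r2(ll_environment, ll_null))
print("pseudo R2 (with GRS):   ", mcfadden_r2(ll_with_grs, ll_null))
print("LR test:", likelihood_ratio_test_df1(ll_environment, ll_with_grs))
```

    A small gain in pseudo R² can still give a very small likelihood ratio p-value in a large sample, which matches the pattern the abstract reports of a significant but small genetic contribution.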

  14. Prognostic Relevance of Urinary Bladder Cancer Susceptibility Loci

    PubMed Central

    Grotenhuis, Anne J.; Dudek, Aleksandra M.; Verhaegh, Gerald W.; Witjes, J. Alfred; Aben, Katja K.; van der Marel, Saskia L.; Vermeulen, Sita H.; Kiemeney, Lambertus A.

    2014-01-01

    In the last few years, susceptibility loci have been identified for urinary bladder cancer (UBC) through candidate-gene and genome-wide association studies. Prognostic relevance of most of these loci is yet unknown. In this study, we used data of the Nijmegen Bladder Cancer Study (NBCS) to perform a comprehensive evaluation of the prognostic relevance of all confirmed UBC susceptibility loci. Detailed clinical data concerning diagnosis, stage, treatment, and disease course of a population-based series of 1,602 UBC patients were collected retrospectively based on a medical file survey. Kaplan-Meier survival analyses and Cox proportional hazard regression were performed, and log-rank tests calculated, to evaluate the association between 12 confirmed UBC susceptibility variants and recurrence and progression in non-muscle invasive bladder cancer (NMIBC) patients. Among muscle-invasive or metastatic bladder cancer (MIBC) patients, association of these variants with overall survival was tested. Subgroup analyses by tumor aggressiveness and smoking status were performed in NMIBC patients. In the overall NMIBC group (n = 1,269), a statistically significant association between rs9642880 at 8q24 and risk of progression was observed (GT vs. TT: HR = 1.08 (95% CI: 0.76–1.54), GG vs. TT: HR = 1.81 (95% CI: 1.23–2.66), P for trend = 2.6 × 10⁻³). In subgroup analyses, several other variants showed suggestive, though non-significant, prognostic relevance for recurrence and progression in NMIBC and survival in MIBC. This study provides suggestive evidence that genetic loci involved in UBC etiology may influence disease prognosis. Elucidation of the causal variant(s) could further our understanding of the mechanism of disease, could point to new therapeutic targets, and might aid in improvement of prognostic tools. PMID:24586564

  15. PPM1D Mosaic Truncating Variants in Ovarian Cancer Cases May Be Treatment-Related Somatic Mutations

    PubMed Central

    Pharoah, Paul D. P.; Song, Honglin; Dicks, Ed; Intermaggio, Maria P.; Harrington, Patricia; Baynes, Caroline; Alsop, Kathryn; Bogdanova, Natalia; Cicek, Mine S.; Cunningham, Julie M.; Fridley, Brooke L.; Gentry-Maharaj, Aleksandra; Hillemanns, Peter; Lele, Shashi; Lester, Jenny; McGuire, Valerie; Moysich, Kirsten B.; Poblete, Samantha; Sieh, Weiva; Sucheston-Campbell, Lara; Widschwendter, Martin; Whittemore, Alice S.; Dörk, Thilo; Menon, Usha; Odunsi, Kunle; Goode, Ellen L.; Karlan, Beth Y.; Bowtell, David D.; Gayther, Simon A.; Ramus, Susan J.

    2016-01-01

    Mosaic truncating mutations in the protein phosphatase, Mg2+/Mn2+-dependent, 1D (PPM1D) gene have recently been reported with a statistically significantly greater frequency in lymphocyte DNA from ovarian cancer case patients compared with unaffected control patients. Using massively parallel sequencing (MPS) we identified truncating PPM1D mutations in 12 of 3236 epithelial ovarian cancer (EOC) case patients (0.37%) but in only one of 3431 unaffected control patients (0.03%) (P = .001). All statistical tests were two-sided. A combination of Sanger sequencing, pyrosequencing, and MPS data suggested that 12 of the 13 mutations were mosaic. All mutations were identified in post-chemotherapy treatment blood samples from case patients (n = 1827) (average 1234 days post-treatment in carriers) rather than from cases collected pretreatment (less than 14 days after diagnosis, n = 1384) (P = .002). These data suggest that PPM1D variants in EOC cases are primarily somatic mosaic mutations caused by treatment and are not associated with germline predisposition to EOC. PMID:26823519
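
    The carrier counts quoted above (12 of 3236 case patients versus 1 of 3431 control patients) form a 2×2 table that can be tested directly. The abstract states only that all tests were two-sided and does not name the test; using Fisher's exact test here is an assumption, and the function name is hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more probable than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(k: int) -> float:
        # Probability that k of the col1 carriers fall in the first row.
        return comb(row1, k) * comb(row2, col1 - k) / denom

    lowest = max(0, col1 - row2)
    highest = min(col1, row1)
    observed = prob(a)
    # Small relative tolerance guards against floating-point ties.
    total = sum(p for p in (prob(k) for k in range(lowest, highest + 1))
                if p <= observed * (1.0 + 1e-7))
    return min(1.0, total)


# Rows: case patients, control patients. Columns: carriers, non-carriers.
p_value = fisher_exact_two_sided(12, 3236 - 12, 1, 3431 - 1)
print(f"two-sided p = {p_value:.4f}")
```

    The exact integer binomial coefficients avoid overflow for tables of this size; a library routine would normally be preferred in practice.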

  16. Quantitating the Multiplicity of Infection with Human Immunodeficiency Virus Type 1 Subtype C Reveals a Non-Poisson Distribution of Transmitted Variants

    PubMed Central

    Abrahams, M.-R.; Anderson, J. A.; Giorgi, E. E.; Seoighe, C.; Mlisana, K.; Ping, L.-H.; Athreya, G. S.; Treurnicht, F. K.; Keele, B. F.; Wood, N.; Salazar-Gonzalez, J. F.; Bhattacharya, T.; Chu, H.; Hoffman, I.; Galvin, S.; Mapanje, C.; Kazembe, P.; Thebus, R.; Fiscus, S.; Hide, W.; Cohen, M. S.; Karim, S. Abdool; Haynes, B. F.; Shaw, G. M.; Hahn, B. H.; Korber, B. T.; Swanstrom, R.; Williamson, C.

    2009-01-01

    Identifying the specific genetic characteristics of successfully transmitted variants may prove central to the development of effective vaccine and microbicide interventions. Although human immunodeficiency virus transmission is associated with a population bottleneck, the extent to which different factors influence the diversity of transmitted viruses is unclear. We estimate here the number of transmitted variants in 69 heterosexual men and women with primary subtype C infections. From 1,505 env sequences obtained using a single genome amplification approach we show that 78% of infections involved single variant transmission and 22% involved multiple variant transmissions (median of 3). We found evidence for mutations selected for cytotoxic-T-lymphocyte or antibody escape and a high prevalence of recombination in individuals infected with multiple variants representing another potential escape pathway in these individuals. In a combined analysis of 171 subtype B and C transmission events, we found that infection with more than one variant does not follow a Poisson distribution, indicating that transmission of individual virions cannot be seen as independent events, each occurring with low probability. While most transmissions resulted from a single infectious unit, multiple variant transmissions represent a significant fraction of transmission events, suggesting that there may be important mechanistic differences between these groups that are not yet understood. PMID:19193811
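
    To see what a Poisson model would imply for the proportions quoted above, one can fit a zero-truncated Poisson to the 78% single-variant fraction and ask how the multiple-variant infections would then be distributed. This is an illustrative sketch only and is not the authors' analysis; the zero-truncation, the function names, and the bisection bounds are all assumptions.

```python
import math


def single_fraction(lam: float) -> float:
    """P(exactly one variant | at least one) for a Poisson(lam) count."""
    return lam / math.expm1(lam)


def solve_rate(target_single: float, lo: float = 1e-9, hi: float = 20.0,
               iterations: int = 200) -> float:
    """Bisection for the rate whose single-variant fraction hits the target."""
    # single_fraction decreases monotonically from 1 toward 0 as lam grows.
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if single_fraction(mid) > target_single:
            lo = mid  # rate too small: too many single-variant infections
        else:
            hi = mid
    return 0.5 * (lo + hi)


def share_of_multiples_with_exactly_two(lam: float) -> float:
    """P(exactly two variants | two or more) for a Poisson(lam) count."""
    p0 = math.exp(-lam)
    p1 = lam * p0
    p2 = lam * lam * p0 / 2.0
    return p2 / (1.0 - p0 - p1)


lam = solve_rate(0.78)
print(f"implied rate = {lam:.3f}")
print(f"share of multiple-variant infections with exactly two variants = "
      f"{share_of_multiples_with_exactly_two(lam):.3f}")
```

    Under these assumptions, most multiple-variant infections would involve exactly two variants, which can be compared with the median of 3 reported in the abstract.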

  17. EPHA2 Polymorphisms in Estonian Patients with Age-Related Cataract.

    PubMed

    Celojevic, Dragana; Abramsson, Alexandra; Seibt Palmér, Mona; Tasa, Gunnar; Juronen, Erkki; Zetterberg, Henrik; Zetterberg, Madeleine

    2016-01-01

    Ephrin receptors (Ephs) are tyrosine kinases that together with their ligands, ephrins, are considered important in cell-cell communication, especially during embryogenesis but also for epithelium homeostasis. Studies have demonstrated the involvement of mutations or common variants of the gene encoding Eph receptor A2 (EPHA2), in congenital cataract and in age-related cataract. This study investigated a number of disease-associated single nucleotide polymorphisms (SNPs) in EPHA2 in patients with age-related cataract. The study included 491 Estonian patients who had surgery for age-related cataract, classified as nuclear, cortical, posterior subcapsular and mixed lens opacities, and 185 controls of the same ethnic origin. Seven SNPs in EPHA2 (rs7543472, rs11260867, rs7548209, rs3768293, rs6603867, rs6678616, rs477558) were genotyped using TaqMan Allelic Discrimination. Statistical analyses for single factor associations used the χ² test, and logistic regression was performed including relevant covariates (age, sex and smoking). In single-SNP allele analysis, only rs7543472 showed a borderline significant association with risk of cataract (p = 0.048). Regression analysis with known risk factors for cataract showed no significant associations of the studied SNPs with cataract. Stratification by cataract subtype did not alter the results. Adjusted odds ratios were between 0.82 and 1.16 (95% confidence interval 0.61-1.60). The present study does not support a major role of EphA2 in cataractogenesis in an Estonian population.

  18. Statistical behavior of the tensile property of heated cotton fiber

    USDA-ARS's Scientific Manuscript database

    The temperature dependence of the tensile property of single cotton fiber was studied in the range of 160-300°C using Favimat test, and its statistical behavior was interpreted in terms of structural changes. The tenacity of control cotton fiber was well described by the single Weibull distribution,...

  19. Variability of Creatine Metabolism Genes in Children with Autism Spectrum Disorder.

    PubMed

    Cameron, Jessie M; Levandovskiy, Valeriy; Roberts, Wendy; Anagnostou, Evdokia; Scherer, Stephen; Loh, Alvin; Schulze, Andreas

    2017-07-31

    Creatine deficiency syndrome (CDS) comprises three separate enzyme deficiencies with overlapping clinical presentations: arginine:glycine amidinotransferase (GATM gene, glycine amidinotransferase), guanidinoacetate methyltransferase (GAMT gene), and creatine transporter deficiency (SLC6A8 gene, solute carrier family 6 member 8). CDS presents with developmental delays/regression, intellectual disability, speech and language impairment, autistic behaviour, epileptic seizures, treatment-refractory epilepsy, and extrapyramidal movement disorders; symptoms that are also evident in children with autism. The objective of the study was to test the hypothesis that genetic variability in creatine metabolism genes is associated with autism. We sequenced GATM, GAMT and SLC6A8 genes in 166 patients with autism (coding sequence, introns and adjacent untranslated regions). A total of 29, 16 and 25 variants were identified in each gene, respectively. Four variants were novel in GATM, and 5 in SLC6A8 (not present in the 1000 Genomes, Exome Sequencing Project (ESP) or Exome Aggregation Consortium (ExAC) databases). A single variant in each gene was identified as non-synonymous, and computationally predicted to be potentially damaging. Nine variants in GATM were shown to have a lower minor allele frequency (MAF) in the autism population than in the 1000 Genomes database, specifically in the East Asian population (Fisher's exact test). Two variants also had lower MAFs in the European population. In summary, there were no apparent associations of variants in GAMT and SLC6A8 genes with autism. The data implying there could be a lower association of some specific GATM gene variants with autism is an observation that would need to be corroborated in a larger group of autism patients, and with sub-populations of Asian ethnicities. Overall, our findings suggest that the genetic variability of creatine synthesis/transport is unlikely to play a part in the pathogenesis of autism spectrum disorder (ASD) in children.

  20. Kernel-Based Measure of Variable Importance for Genetic Association Studies.

    PubMed

    Gallego, Vicente; Luz Calle, M; Oller, Ramon

    2017-06-17

    The identification of genetic variants that are associated with disease risk is an important goal of genetic association studies. Standard approaches perform univariate analysis where each genetic variant, usually a Single Nucleotide Polymorphism (SNP), is tested for association with disease status. Though many genetic variants have been identified and validated so far using this univariate approach, for most complex diseases a large part of their genetic component is still unknown, the so-called missing heritability. We propose a Kernel-based measure of variable importance (KVI) that provides the contribution of a SNP, or a group of SNPs, to the joint genetic effect of a set of genetic variants. KVI can be used for ranking genetic markers individually, sets of markers that form blocks of linkage disequilibrium or sets of genetic variants that lie in a gene or a genetic pathway. We prove that, unlike the univariate analysis, KVI captures the relationship with other genetic variants in the analysis, even when measured at the individual level for each genetic variable separately. This is especially relevant and powerful for detecting genetic interactions. We illustrate the results with data from an Alzheimer's disease study and show through simulations that the rankings based on KVI improve those rankings based on two measures of importance provided by the Random Forest. We also prove with a simulation study that KVI is very powerful for detecting genetic interactions.

  1. Assessing the Power of Exome Chips.

    PubMed

    Page, Christian Magnus; Baranzini, Sergio E; Mevik, Bjørn-Helge; Bos, Steffan Daniel; Harbo, Hanne F; Andreassen, Bettina Kulle

    2015-01-01

    Genotyping chips for rare and low-frequency variants have recently gained popularity with the introduction of exome chips, but the utility of these chips remains unclear. These chips were designed using exome sequencing data from mainly American-European individuals, enriched for a narrow set of common diseases. In addition, it is well known that the statistical power to detect associations with rare and low-frequency variants is much lower than in studies exclusively involving common variants. We developed a simulation program adaptable to any exome chip design to empirically evaluate the power of the exome chips. We implemented the main properties of the Illumina HumanExome BeadChip array. The simulated data sets were used to assess the power of exome chip based studies for varying effect sizes and causal variant scenarios. We applied two widely used statistical approaches for rare and low-frequency variants, which collapse the variants into genetic regions or genes. Under optimal conditions, we found that a sample size of 20,000 to 30,000 individuals was needed in order to detect modest effect sizes (0.5% < PAR < 1%) with 80% power. For small effect sizes (PAR < 0.5%), 60,000-100,000 individuals were needed in the presence of non-causal variants. In conclusion, we found that at least tens of thousands of individuals are necessary to detect modest effects under optimal conditions. In addition, when using rare variant chips on cohorts or diseases they were not originally designed for, the identification of associated variants or genes will be even more challenging.

  2. Statistical approaches to assessing single and multiple outcome measures in dry eye therapy and diagnosis.

    PubMed

    Tomlinson, Alan; Hair, Mario; McFadyen, Angus

    2013-10-01

    Dry eye is a multifactorial disease which would require a broad spectrum of test measures in the monitoring of its treatment and diagnosis. However, studies have typically reported improvements in individual measures with treatment. Alternative approaches involve multiple, combined outcomes being assessed by different statistical analyses. In order to assess the effect of various statistical approaches to the use of single and combined test measures in dry eye, this review reanalyzed measures from two previous studies (osmolarity, evaporation, tear turnover rate, and lipid film quality). These analyses assessed the measures as single variables within groups, pre- and post-intervention with a lubricant supplement, by creating combinations of these variables and by validating these combinations with the combined sample of data from all groups of dry eye subjects. The effectiveness of single measures and combinations in diagnosis of dry eye was also considered. Copyright © 2013. Published by Elsevier Inc.

  3. Genetically elevated fetuin-A levels, fasting glucose levels, and risk of type 2 diabetes: the cardiovascular health study.

    PubMed

    Jensen, Majken K; Bartz, Traci M; Djoussé, Luc; Kizer, Jorge R; Zieman, Susan J; Rimm, Eric B; Siscovick, David S; Psaty, Bruce M; Ix, Joachim H; Mukamal, Kenneth J

    2013-10-01

    Fetuin-A levels are associated with higher risk of type 2 diabetes, but it is unknown if the association is causal. We investigated common (>5%) genetic variants in the fetuin-A gene (AHSG) in relation to fetuin-A levels, fasting glucose, and risk of type 2 diabetes. Genetic variation, fetuin-A levels, and fasting glucose were assessed in 2,893 Caucasian and 542 African American community-living individuals 65 years of age or older in 1992-1993. Common AHSG variants (rs4917 and rs2248690) were strongly associated with fetuin-A concentrations (P<0.0001). In analyses of 259 incident cases of type 2 diabetes, the single nucleotide polymorphisms (SNPs) were not associated with diabetes risk during follow-up and similar null associations were observed when 579 prevalent cases were included. As expected, higher fetuin-A levels were associated with higher fasting glucose concentrations (1.9 mg/dL [95% CI, 1.2-2.7] higher per SD in Caucasians), but Mendelian randomization analyses using both SNPs as unbiased proxies for measured fetuin-A did not support an association between genetically predicted fetuin-A levels and fasting glucose (-0.3 mg/dL [95% CI, -1.9 to 1.3] lower per SD in Caucasians). The difference between the associations of fasting glucose with actual and genetically predicted fetuin-A level was statistically significant (P=0.001). Results among the smaller sample of African Americans trended in similar directions but were not statistically significant. Common variants in the AHSG gene are strongly associated with plasma fetuin-A concentrations, but not with risk of type 2 diabetes or glucose concentrations, raising the possibility that the association between fetuin-A and type 2 diabetes may not be causal.

  4. Childhood physical, environmental, and genetic predictors of adult hypertension: the cardiovascular risk in young Finns study.

    PubMed

    Juhola, Jonna; Oikonen, Mervi; Magnussen, Costan G; Mikkilä, Vera; Siitonen, Niina; Jokinen, Eero; Laitinen, Tomi; Würtz, Peter; Gidding, Samuel S; Taittonen, Leena; Seppälä, Ilkka; Jula, Antti; Kähönen, Mika; Hutri-Kähönen, Nina; Lehtimäki, Terho; Viikari, Jorma S A; Juonala, Markus; Raitakari, Olli T

    2012-07-24

    Hypertension is a major modifiable cardiovascular risk factor. The present longitudinal study aimed to examine the best combination of childhood physical and environmental factors to predict adult hypertension and furthermore whether newly identified genetic variants for blood pressure increase the prediction of adult hypertension. The study cohort included 2625 individuals from the Cardiovascular Risk in Young Finns Study who were followed up for 21 to 27 years since baseline (1980; age, 3-18 years). In addition to dietary factors and biomarkers related to blood pressure, we examined whether a genetic risk score based on 29 newly identified single-nucleotide polymorphisms enhances the prediction of adult hypertension. Hypertension in adulthood was defined as systolic blood pressure ≥ 130 mm Hg and/or diastolic blood pressure ≥ 85 mm Hg or medication for the condition. Independent childhood risk factors for adult hypertension included the individual's own blood pressure (P<0.0001), parental hypertension (P<0.0001), childhood overweight/obesity (P=0.005), low parental occupational status (P=0.003), and high genetic risk score (P<0.0001). Risk assessment based on childhood overweight/obesity status, parental hypertension, and parental occupational status was superior in predicting hypertension compared with the approach using only data on childhood blood pressure levels (C statistics, 0.718 versus 0.733; P=0.0007). Inclusion of both parental hypertension history and data on novel genetic variants for hypertension further improved the C statistics (0.742; P=0.015). Prediction of adult hypertension was enhanced by taking into account known physical and environmental childhood risk factors, family history of hypertension, and novel genetic variants. A multifactorial approach may be useful in identifying children at high risk for adult hypertension.
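    The C statistics quoted in the record above measure how well a risk model ranks individuals who develop the outcome above those who do not. A minimal sketch of that quantity follows, assuming a binary outcome and one predicted risk per person; the function name `c_statistic` and the toy numbers are illustrative and do not come from the study.

```python
from itertools import product


def c_statistic(scores, outcomes):
    """Share of (case, non-case) pairs in which the case has the higher score.

    Tied scores count as half a concordant pair.
    """
    cases = [s for s, y in zip(scores, outcomes) if y == 1]
    non_cases = [s for s, y in zip(scores, outcomes) if y == 0]
    if not cases or not non_cases:
        raise ValueError("need at least one case and one non-case")
    concordant = 0.0
    for case_score, other_score in product(cases, non_cases):
        if case_score > other_score:
            concordant += 1.0
        elif case_score == other_score:
            concordant += 0.5
    return concordant / (len(cases) * len(non_cases))


# Made-up predicted risks and outcomes (1 = hypertensive in adulthood).
toy_scores = [0.9, 0.8, 0.4, 0.35, 0.2, 0.1]
toy_outcomes = [1, 0, 1, 0, 0, 0]
toy_c = c_statistic(toy_scores, toy_outcomes)
print(f"C statistic = {toy_c:.3f}")
```

    A value of 0.5 corresponds to ranking no better than chance and 1.0 to perfect ranking, which is the scale on which the reported values of 0.718 to 0.742 sit.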

  5. A systematic variant screening in familial cases of congenital heart defects demonstrates the usefulness of molecular genetics in this field

    PubMed Central

    El Malti, Rajae; Liu, Hui; Doray, Bérénice; Thauvin, Christel; Maltret, Alice; Dauphin, Claire; Gonçalves-Rocha, Miguel; Teboul, Michel; Blanchet, Patricia; Roume, Joëlle; Gronier, Céline; Ducreux, Corinne; Veyrier, Magali; Marçon, François; Acar, Philippe; Lusson, Jean-René; Levy, Marilyne; Beyler, Constance; Vigneron, Jacqueline; Cordier-Alex, Marie-Pierre; Heitz, François; Sanlaville, Damien; Bonnet, Damien; Bouvagnet, Patrice

    2016-01-01

    The etiology of congenital heart defect (CHD) combines environmental and genetic factors. So far, studies have reported on the screening of a single gene in unselected CHD or in familial cases selected for specific CHD types. Our goal was to systematically screen the probands of familial CHD cases with a set of genetic tests to evaluate the prevalence of disease-causing variant identification. A systematic screening of GATA4, NKX2-5, ZIC3 and Multiplex ligation-dependent probe amplification (MLPA) P311 Kit was set up on the probands of 154 families with at least two cases of non-syndromic CHD. Additionally, ELN screening was performed on families with supravalvular arterial stenosis. Twenty-two variants were found, but segregation analysis confirmed unambiguously the causality of 16 variants: GATA4 (1 ×), NKX2-5 (6 ×), ZIC3 (3 ×), MLPA (2 ×) and ELN (4 ×). Therefore, this approach was able to identify the causal variant in 10.4% of familial CHD cases. This study demonstrated the existence of a de novo variant even in familial CHD cases and the impact of CHD variants on adult cardiac condition even in the absence of CHD. This study showed that the systematic screening of genetic factors is useful in familial CHD cases with up to 10.4% elucidated cases. When successful, it drastically improved genetic counseling by discovering unaffected variant carriers who are at risk of transmitting their variant and of developing cardiac complications during adulthood, thus prompting long-term cardiac follow-up. This study provides an important baseline at the dawn of the next-generation sequencing era. PMID:26014430

  6. Rare, evolutionarily unlikely missense substitutions in CHEK2 contribute to breast cancer susceptibility: results from a breast cancer family registry case-control mutation-screening study.

    PubMed

    Le Calvez-Kelm, Florence; Lesueur, Fabienne; Damiola, Francesca; Vallée, Maxime; Voegele, Catherine; Babikyan, Davit; Durand, Geoffroy; Forey, Nathalie; McKay-Chopin, Sandrine; Robinot, Nivonirina; Nguyen-Dumont, Tù; Thomas, Alun; Byrnes, Graham B; Hopper, John L; Southey, Melissa C; Andrulis, Irene L; John, Esther M; Tavtigian, Sean V

    2011-01-18

    Both protein-truncating variants and some missense substitutions in CHEK2 confer increased risk of breast cancer. However, no large-scale study has used full open reading frame mutation screening to assess the contribution of rare missense substitutions in CHEK2 to breast cancer risk. This absence has been due in part to a lack of validated statistical methods for summarizing risk attributable to large numbers of individually rare missense substitutions. Previously, we adapted an in silico assessment of missense substitutions used for analysis of unclassified missense substitutions in BRCA1 and BRCA2 to the problem of assessing candidate genes using rare missense substitution data observed in case-control mutation-screening studies. The method involves stratifying rare missense substitutions observed in cases and/or controls into a series of grades ordered a priori from least to most likely to be evolutionarily deleterious, followed by a logistic regression test for trends to compare the frequency distributions of the graded missense substitutions in cases versus controls. Here we used this approach to analyze CHEK2 mutation-screening data from a population-based series of 1,303 female breast cancer patients and 1,109 unaffected female controls. We found evidence of risk associated with rare, evolutionarily unlikely CHEK2 missense substitutions. Additional findings were that (1) the risk estimate for the most severe grade of CHEK2 missense substitutions (denoted C65) is approximately equivalent to that of CHEK2 protein-truncating variants; (2) the population attributable fraction and the familial relative risk explained by the pool of rare missense substitutions were similar to those explained by the pool of protein-truncating variants; and (3) post hoc power calculations implied that scaling up case-control mutation screening to examine entire biochemical pathways would require roughly 2,000 cases and controls to achieve acceptable statistical power. 
    This study shows that CHEK2 harbors many rare sequence variants that confer increased risk of breast cancer and that a substantial proportion of these are missense substitutions. The study validates our analytic approach to rare missense substitutions and provides a method to combine data from protein-truncating variants and rare missense substitutions into a one degree of freedom per gene test.

  7. Rare Variant, Gene-Based Association Study of Hereditary Melanoma Using Whole-Exome Sequencing.

    PubMed

    Artomov, Mykyta; Stratigos, Alexander J; Kim, Ivana; Kumar, Raj; Lauss, Martin; Reddy, Bobby Y; Miao, Benchun; Daniela Robles-Espinoza, Carla; Sankar, Aravind; Njauw, Ching-Ni; Shannon, Kristen; Gragoudas, Evangelos S; Marie Lane, Anne; Iyer, Vivek; Newton-Bishop, Julia A; Timothy Bishop, D; Holland, Elizabeth A; Mann, Graham J; Singh, Tarjinder; Daly, Mark J; Tsao, Hensin

    2017-12-01

    Extraordinary progress has been made in our understanding of common variants in many diseases, including melanoma. Because the contribution of rare coding variants is not as well characterized, we performed an exome-wide, gene-based association study of familial cutaneous melanoma (CM) and ocular melanoma (OM). Using 11 990 jointly processed individual DNA samples, whole-exome sequencing was performed, followed by large-scale joint variant calling using GATK (Genome Analysis ToolKit). PLINK/SEQ was used for statistical analysis of genetic variation. Four models were used to estimate the association among different types of variants. In vitro functional validation was performed using three human melanoma cell lines in 2D and 3D proliferation assays. In vivo tumor growth was assessed using xenografts of human melanoma A375 melanoma cells in nude mice (eight mice per group). All statistical tests were two-sided. Strong signals were detected for CDKN2A (Pmin = 6.16 × 10(-8)) in the CM cohort (n = 273) and BAP1 (Pmin = 3.83 × 10(-6)) in the OM (n = 99) cohort. Eleven genes that exhibited borderline association (P < 10(-4)) were independently validated using The Cancer Genome Atlas melanoma cohort (379 CM, 47 OM) and a matched set of 3563 European controls with CDKN2A (P = .009), BAP1 (P = .03), and EBF3 (P = 4.75 × 10(-4)), a candidate risk locus, all showing evidence of replication. EBF3 was then evaluated using germline data from a set of 132 familial melanoma cases and 4769 controls of UK origin (joint P = 1.37 × 10(-5)). Somatically, loss of EBF3 expression correlated with progression, poorer outcome, and high MITF tumors. Functionally, induction of EBF3 in melanoma cells reduced cell growth in vitro, retarded tumor formation in vivo, and reduced MITF levels. The results of this large rare variant germline association study further define the mutational landscape of hereditary melanoma and implicate EBF3 as a possible CM predisposition gene.

  8. Rare Variant, Gene-Based Association Study of Hereditary Melanoma Using Whole-Exome Sequencing

    PubMed Central

    Artomov, Mykyta; Stratigos, Alexander J; Kim, Ivana; Kumar, Raj; Lauss, Martin; Reddy, Bobby Y; Miao, Benchun; Daniela Robles-Espinoza, Carla; Sankar, Aravind; Njauw, Ching-Ni; Shannon, Kristen; Gragoudas, Evangelos S; Marie Lane, Anne; Iyer, Vivek; Newton-Bishop, Julia A; Timothy Bishop, D; Holland, Elizabeth A; Mann, Graham J; Singh, Tarjinder; Daly, Mark J; Tsao, Hensin

    2017-01-01

    Background: Extraordinary progress has been made in our understanding of common variants in many diseases, including melanoma. Because the contribution of rare coding variants is not as well characterized, we performed an exome-wide, gene-based association study of familial cutaneous melanoma (CM) and ocular melanoma (OM). Methods: Using 11 990 jointly processed individual DNA samples, whole-exome sequencing was performed, followed by large-scale joint variant calling using GATK (Genome Analysis ToolKit). PLINK/SEQ was used for statistical analysis of genetic variation. Four models were used to estimate the association among different types of variants. In vitro functional validation was performed using three human melanoma cell lines in 2D and 3D proliferation assays. In vivo tumor growth was assessed using xenografts of human melanoma A375 melanoma cells in nude mice (eight mice per group). All statistical tests were two-sided. Results: Strong signals were detected for CDKN2A (Pmin = 6.16 × 10(-8)) in the CM cohort (n = 273) and BAP1 (Pmin = 3.83 × 10(-6)) in the OM (n = 99) cohort. Eleven genes that exhibited borderline association (P < 10(-4)) were independently validated using The Cancer Genome Atlas melanoma cohort (379 CM, 47 OM) and a matched set of 3563 European controls with CDKN2A (P = .009), BAP1 (P = .03), and EBF3 (P = 4.75 × 10(-4)), a candidate risk locus, all showing evidence of replication. EBF3 was then evaluated using germline data from a set of 132 familial melanoma cases and 4769 controls of UK origin (joint P = 1.37 × 10(-5)). Somatically, loss of EBF3 expression correlated with progression, poorer outcome, and high MITF tumors. Functionally, induction of EBF3 in melanoma cells reduced cell growth in vitro, retarded tumor formation in vivo, and reduced MITF levels.
    Conclusions: The results of this large rare variant germline association study further define the mutational landscape of hereditary melanoma and implicate EBF3 as a possible CM predisposition gene. PMID:29522175

  9. A SCN10A SNP biases human pain sensitivity

    PubMed Central

    Duan, Guangyou; Han, Chongyang; Wang, Qingli; Guo, Shanna; Zhang, Yuhao; Ying, Ying; Huang, Penghao; Zhang, Li; Macala, Lawrence; Shah, Palak; Zhang, Mi; Li, Ningbo; Dib-Hajj, Sulayman D; Zhang, Xianwei

    2016-01-01

    Background: Nav1.8 sodium channels, encoded by SCN10A, are preferentially expressed in nociceptive neurons and play an important role in human pain. Although rare gain-of-function variants in SCN10A have been identified in individuals with painful peripheral neuropathies, whether more common variants in SCN10A can have an effect at the channel level and at the dorsal root ganglion neuronal level, leading to a pain disorder or an altered normal pain threshold, has not been determined. Results: A candidate single nucleotide polymorphism association approach together with experimental pain testing in human subjects was used to explore possible common SCN10A missense variants that might affect human pain sensitivity. We demonstrated an association between rs6795970 (G > A; p.Ala1073Val) and higher thresholds for mechanical pain in a discovery cohort (496 subjects) and confirmed it in a larger replication cohort (1005 female subjects). Functional assessments showed that although the minor allele shifts channel activation by −4.3 mV, a proexcitatory attribute, it accelerates inactivation, an antiexcitatory attribute, with the net effect being reduced repetitive firing of dorsal root ganglion neurons, consistent with lower mechanical pain sensitivity. Conclusions: At the association and mechanistic levels, the SCN10A single nucleotide polymorphism rs6795970 biases human pain sensitivity. PMID:27590072

  10. Role of the DGAT gene C79T single-nucleotide polymorphism in French obese subjects.

    PubMed

    Coudreau, Sylvie Kipfer; Tounian, Patrick; Bonhomme, Geneviève; Froguel, Philippe; Girardet, Jean-Philippe; Guy-Grand, Bernard; Basdevant, Arnaud; Clément, Karine

    2003-10-01

    Acyl-coenzyme A:diacylglycerol acyltransferase (DGAT) is a key enzyme involved in adipose-cell triglyceride storage. A 79-bp T-to-C single-nucleotide polymorphism (SNP) on the 3' region of the DGAT transcriptional site has been reported to increase promoter activity and is associated with higher BMI in Turkish women. To validate the possible role of this genetic variant in obesity, as well as the variant's possible cellular-functional significance, we performed an association study between the T79C change and several obesity-related phenotypes in 1357 obese French adults and children. The prevalence of the T79C SNP was similar between obese adults and children when each group was compared with the controls. (CC genotype carrier frequencies were 0.25 to 0.29 in the obese groups and 0.21 in controls; p > 0.05.) In each of the obese adult and child groups studied, the T79C variant was not found to be associated with any of the obesity-related phenotypes tested. Although the T79C SNP of the DGAT gene was studied in several groups of white subjects, the association between this SNP and obesity-related phenotypes, previously described, was not confirmed in our population.

  11. Association of genetic variants in RAB23 and ANXA11 with uveitis in sarcoidosis

    PubMed Central

    Davoudi, Samaneh; Chang, Victoria S.; Navarro-Gomez, Daniel; Stanwyck, Lynn K.; Sevgi, Damla Duriye; Papavasileiou, Evangelia; Ren, Aiai; Uchiyama, Eduardo; Sullivan, Lynn; Lobo, Ann-Marie; Papaliodis, George N.

    2018-01-01

    Purpose: Uveitis occurs in a subset of patients with sarcoidosis. The purpose of this study was to determine whether genetic variants that have been associated previously with overall sarcoidosis are associated with increased risk of developing uveitis. Methods: Seventy-seven subjects were enrolled, including 45 patients diagnosed with sarcoidosis-related uveitis as cases and 32 patients with systemic sarcoidosis without ocular involvement as controls. Thirty-eight single nucleotide polymorphisms (SNPs) previously associated with sarcoidosis, sarcoidosis severity, or other organ-specific sarcoidosis involvement were identified. Allele frequencies in ocular sarcoidosis cases versus controls were compared using the chi-square test, and p values were corrected for multiple hypotheses testing using permutation. All analyses were conducted with PLINK. Results: SNPs rs1040461 and rs61860052, in ras-related protein RAS23 (RAB23) and annexin A11 (ANXA11) genes, respectively, were associated with sarcoidosis-associated uveitis. The T allele of rs1040461 and the A allele of rs61860052 were found to be more prevalent in ocular sarcoidosis cases. These associations remained after correction for the multiple hypotheses tested (p=0.01 and p=0.02). In a subanalysis of Caucasian Americans only, two additional variants within the major histocompatibility complex (MHC) genes on chromosome 6, in HLA-DRB5 and HLA-DRB1, were associated with uveitis as well (p=0.009 and p=0.04). Conclusions: Genetic variants in RAB23 and ANXA11 genes were associated with an increased risk of sarcoidosis-associated uveitis. These loci have previously been associated with overall sarcoidosis risk. PMID:29416296
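    The allelic comparison described in the record above (allele counts in cases versus controls, compared with a chi-square test) can be sketched for a single SNP as follows. This is only an illustration under stated assumptions: the function name `allelic_chi_square` and the allele counts are made up, the statistic is the plain Pearson chi-square on a 2x2 table without continuity correction, and the permutation-based multiple-testing correction that the authors ran in PLINK is not reproduced here.

```python
import math


def allelic_chi_square(case_alt, case_ref, control_alt, control_ref):
    """Pearson chi-square (1 degree of freedom) on a 2x2 table of allele counts.

    Returns (statistic, p_value). No continuity correction is applied.
    """
    table = [[case_alt, case_ref], [control_alt, control_ref]]
    row_sums = [sum(row) for row in table]
    col_sums = [case_alt + control_alt, case_ref + control_ref]
    total = sum(row_sums)
    if 0 in row_sums or 0 in col_sums:
        raise ValueError("every row and column needs a non-zero total")
    statistic = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_sums[i] * col_sums[j] / total
            statistic += (table[i][j] - expected) ** 2 / expected
    # For 1 degree of freedom, P(X > x) equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Hypothetical allele counts: 45 cases (90 alleles) and 32 controls (64 alleles).
toy_stat, toy_p = allelic_chi_square(case_alt=30, case_ref=60,
                                     control_alt=10, control_ref=54)
print(f"chi-square = {toy_stat:.2f}, p = {toy_p:.4f}")
```

    With 38 SNPs tested, an unadjusted p value like this one would still need a multiple-testing correction, which is the role the permutation step plays in the study.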

  12. Association of the I264T Variant in the Sulfide Quinone Reductase-Like (SQRDL) Gene with Osteoporosis in Korean Postmenopausal Women

    PubMed Central

    Park, Eunkuk; Kim, Bo-Young; Choi, Vit-Na; Yoo, Young-Hyun; Kim, Bom-Taeck; Jeong, Seon-Yong

    2015-01-01

    To identify novel susceptibility variants for osteoporosis in Korean postmenopausal women, we performed a genome-wide association analysis of 1180 nonsynonymous single nucleotide polymorphisms (nsSNPs) in 405 individuals with osteoporosis and 722 normal controls of the Korean Association Resource cohort. A logistic regression analysis revealed 72 nsSNPs that showed a significant association with osteoporosis (p<0.05). The top 10 nsSNPs showing the lowest p-values (p = 5.2×10(-4) to 8.5×10(-3)) were further studied to investigate their effects at the protein level. Based on the results of an in silico prediction of the protein’s functional effect based on amino acid alterations and a sequence conservation evaluation of the amino acid residues at the positions of the nsSNPs among orthologues, we selected one nsSNP in the SQRDL gene (rs1044032, SQRDL I264T) as a meaningful genetic variant associated with postmenopausal osteoporosis. To assess whether the SQRDL I264T variant played a functional role in the pathogenesis of osteoporosis, we examined the in vitro effect of the nsSNP on bone remodeling. Overexpression of the SQRDL I264T variant in the preosteoblast MC3T3-E1 cells significantly increased alkaline phosphatase activity, mineralization, and the mRNA expression of osteoblastogenesis markers, Runx2, Sp7, and Bglap genes, whereas the SQRDL wild type had no effect or a negative effect on osteoblast differentiation. Overexpression of the SQRDL I264T variant did not affect osteoclast differentiation of the primary-cultured monocytes. The known effects of hydrogen sulfide (H2S) on bone remodeling may explain the findings of the current study, which demonstrated the functional role of the H2S-catalyzing enzyme SQRDL I264T variant in osteoblast differentiation.
    In conclusion, the results of the statistical and experimental analyses indicate that the SQRDL I264T nsSNP may be a significant susceptibility variant for osteoporosis in Korean postmenopausal women that is involved in osteoblast differentiation. PMID:26258864

  13. Total Zinc Intake May Modify the Glucose-Raising Effect of a Zinc Transporter (SLC30A8) Variant

    PubMed Central

    Kanoni, Stavroula; Nettleton, Jennifer A.; Hivert, Marie-France; Ye, Zheng; van Rooij, Frank J.A.; Shungin, Dmitry; Sonestedt, Emily; Ngwa, Julius S.; Wojczynski, Mary K.; Lemaitre, Rozenn N.; Gustafsson, Stefan; Anderson, Jennifer S.; Tanaka, Toshiko; Hindy, George; Saylor, Georgia; Renstrom, Frida; Bennett, Amanda J.; van Duijn, Cornelia M.; Florez, Jose C.; Fox, Caroline S.; Hofman, Albert; Hoogeveen, Ron C.; Houston, Denise K.; Hu, Frank B.; Jacques, Paul F.; Johansson, Ingegerd; Lind, Lars; Liu, Yongmei; McKeown, Nicola; Ordovas, Jose; Pankow, James S.; Sijbrands, Eric J.G.; Syvänen, Ann-Christine; Uitterlinden, André G.; Yannakoulia, Mary; Zillikens, M. Carola; Wareham, Nick J.; Prokopenko, Inga; Bandinelli, Stefania; Forouhi, Nita G.; Cupples, L. Adrienne; Loos, Ruth J.; Hallmans, Goran; Dupuis, Josée; Langenberg, Claudia; Ferrucci, Luigi; Kritchevsky, Stephen B.; McCarthy, Mark I.; Ingelsson, Erik; Borecki, Ingrid B.; Witteman, Jacqueline C.M.; Orho-Melander, Marju; Siscovick, David S.; Meigs, James B.; Franks, Paul W.; Dedoussis, George V.

    2011-01-01

    OBJECTIVE Many genetic variants have been associated with glucose homeostasis and type 2 diabetes in genome-wide association studies. Zinc is an essential micronutrient that is important for β-cell function and glucose homeostasis. We tested the hypothesis that zinc intake could influence the glucose-raising effect of specific variants. RESEARCH DESIGN AND METHODS We conducted a 14-cohort meta-analysis to assess the interaction of 20 genetic variants known to be related to glycemic traits and zinc metabolism with dietary zinc intake (food sources) and a 5-cohort meta-analysis to assess the interaction with total zinc intake (food sources and supplements) on fasting glucose levels among individuals of European ancestry without diabetes. RESULTS We observed a significant association of total zinc intake with lower fasting glucose levels (β-coefficient ± SE per 1 mg/day of zinc intake: −0.0012 ± 0.0003 mmol/L, summary P value = 0.0003), while the association of dietary zinc intake was not significant. We identified a nominally significant interaction between total zinc intake and the SLC30A8 rs11558471 variant on fasting glucose levels (β-coefficient ± SE per A allele for 1 mg/day of greater total zinc intake: −0.0017 ± 0.0006 mmol/L, summary interaction P value = 0.005); this result suggests a stronger inverse association between total zinc intake and fasting glucose in individuals carrying the glucose-raising A allele compared with individuals who do not carry it. None of the other interaction tests were statistically significant. CONCLUSIONS Our results suggest that higher total zinc intake may attenuate the glucose-raising effect of the rs11558471 SLC30A8 (zinc transporter) variant. Our findings also support evidence for the association of higher total zinc intake with lower fasting glucose levels. PMID:21810599

  14. Amyotrophic lateral sclerosis onset is influenced by the burden of rare variants in known amyotrophic lateral sclerosis genes.

    PubMed

    Cady, Janet; Allred, Peggy; Bali, Taha; Pestronk, Alan; Goate, Alison; Miller, Timothy M; Mitra, Robi D; Ravits, John; Harms, Matthew B; Baloh, Robert H

    2015-01-01

    To define the genetic landscape of amyotrophic lateral sclerosis (ALS) and assess the contribution of possible oligogenic inheritance, we aimed to comprehensively sequence 17 known ALS genes in 391 ALS patients from the United States. Targeted pooled-sample sequencing was used to identify variants in 17 ALS genes. Fragment size analysis was used to define ATXN2 and C9ORF72 expansion sizes. Genotype-phenotype correlations were made with individual variants and total burden of variants. Rare variant associations for risk of ALS were investigated at both the single variant and gene level. A total of 64.3% of familial and 27.8% of sporadic subjects carried potentially pathogenic novel or rare coding variants identified by sequencing or an expanded repeat in C9ORF72 or ATXN2; 3.8% of subjects had variants in >1 ALS gene, and these individuals had disease onset 10 years earlier (p = 0.0046) than subjects with variants in a single gene. The number of potentially pathogenic coding variants did not influence disease duration or site of onset. Rare and potentially pathogenic variants in known ALS genes are present in >25% of apparently sporadic and 64% of familial patients, significantly higher than previous reports using less comprehensive sequencing approaches. A significant number of subjects carried variants in >1 gene, which influenced the age of symptom onset and supports oligogenic inheritance as relevant to disease pathogenesis. © 2014 American Neurological Association.

  15. Investigation of variants identified in caucasian genome-wide association studies for plasma high-density lipoprotein cholesterol and triglycerides levels in Mexican dyslipidemic study samples.

    PubMed

    Weissglas-Volkov, Daphna; Aguilar-Salinas, Carlos A; Sinsheimer, Janet S; Riba, Laura; Huertas-Vazquez, Adriana; Ordoñez-Sánchez, Maria L; Rodriguez-Guillen, Rosario; Cantor, Rita M; Tusie-Luna, Teresa; Pajukanta, Päivi

    2010-02-01

    Although epidemiological studies have demonstrated an increased predisposition to low high-density lipoprotein cholesterol and high triglyceride levels in the Mexican population, Mexicans have not been included in any of the previously reported genome-wide association studies for lipids. We investigated 6 single-nucleotide polymorphisms associated with triglycerides, 7 with high-density lipoprotein cholesterol, and 1 with both triglycerides and high-density lipoprotein cholesterol in recent Caucasian genome-wide association studies in Mexican familial combined hyperlipidemia families and hypertriglyceridemia case-control study samples. These variants were within or near the genes ABCA1, ANGPTL3, APOA5, APOB, CETP, GALNT2, GCKR, LCAT, LIPC, LPL (2), MMAB-MVK, TRIB1, and XKR6-AMAC1L2. We performed a combined analysis of the family-based and case-control studies (n=2298) using the Z method to combine statistics. Ten of the single-nucleotide polymorphisms were nominally significant and 5 were significant after Bonferroni correction (P=2.20 x 10(-3) to 2.6 x 10(-11)) for the number of tests performed (APOA5, CETP, GCKR, and GALNT2). Interestingly, our strongest signal was obtained for triglycerides with the minor allele of rs964184 (P=2.6 x 10(-11)) in the APOA1/C3/A4/A5 gene cluster region that is significantly more common in Mexicans (27%) than in whites (12%). It is important to confirm whether known loci have a consistent effect across ethnic groups. We show replication of 5 Caucasian genome-wide association studies lipid associations in Mexicans. The remaining loci will require a comprehensive investigation to exclude or verify their significance in Mexicans. We also demonstrate that rs964184 has a large effect (odds ratio, 1.74) and is more frequent in the Mexican population, and thus it may contribute to the high predisposition to dyslipidemias in Mexicans.
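    The record above combines family-based and case-control results "using the Z method". A minimal sketch of one common form of that idea follows, assuming a Stouffer-style weighted Z-score combination with square-root sample-size weights; the abstract does not state the weighting, so the weights, the function names `combine_z` and `bonferroni_threshold`, and all numbers below are assumptions for illustration only.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def combine_z(p_values, directions, sample_sizes):
    """Weighted Stouffer combination of two-sided p-values.

    directions holds +1 or -1 for the sign of each study's effect estimate.
    Returns (combined_z, combined_two_sided_p).
    """
    if not (len(p_values) == len(directions) == len(sample_sizes)):
        raise ValueError("inputs must have the same length")
    weights = [sqrt(n) for n in sample_sizes]
    # Convert each two-sided p-value to a signed Z-score.
    z_scores = [d * _STD_NORMAL.inv_cdf(1.0 - p / 2.0)
                for p, d in zip(p_values, directions)]
    numerator = sum(w * z for w, z in zip(weights, z_scores))
    z_combined = numerator / sqrt(sum(w * w for w in weights))
    p_combined = 2.0 * (1.0 - _STD_NORMAL.cdf(abs(z_combined)))
    return z_combined, p_combined


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level after Bonferroni correction."""
    return alpha / n_tests


# Hypothetical inputs: two studies of one SNP, effects in the same direction.
z_same, p_same = combine_z([0.05, 0.05], [+1, +1], [100, 100])
# Same p-values but opposite effect directions cancel out.
z_opposite, p_opposite = combine_z([0.05, 0.05], [+1, -1], [100, 100])
# Hypothetical number of tests; the abstract does not report the count used.
threshold = bonferroni_threshold(0.05, 14)
print(f"z = {z_same:.3f}, p = {p_same:.4f}, threshold = {threshold:.5f}")
```

    Carrying the effect direction into the Z-score is what lets concordant studies reinforce each other while discordant ones cancel, which matters when pooling family-based and case-control designs.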

  16. A power set-based statistical selection procedure to locate susceptible rare variants associated with complex traits with sequencing data.

    PubMed

    Sun, Hokeun; Wang, Shuang

    2014-08-15

    Existing association methods for rare variants from sequencing data have focused on aggregating variants in a gene or a genetic region because analysing individual rare variants is underpowered. However, these existing rare variant detection methods are not able to identify which of the rare variants in a gene or a genetic region are associated with the complex diseases or traits. Once phenotypic associations of a gene or a genetic region are identified, the natural next step in the association study with sequencing data is to locate the susceptible rare variants within the gene or the genetic region. In this article, we propose a power set-based statistical selection procedure that is able to identify the locations of the potentially susceptible rare variants within a disease-related gene or a genetic region. The selection performance of the proposed procedure was evaluated through simulation studies, where we demonstrated its feasibility and superior power over several comparable existing methods. In particular, the proposed method is able to handle mixed effects when both risk and protective variants are present in a gene or a genetic region. The proposed selection procedure was also applied to the sequence data on the ANGPTL gene family from the Dallas Heart Study to identify potentially susceptible rare variants within the trait-related genes. An R package 'rvsel' can be downloaded from http://www.columbia.edu/~sw2206/ and http://statsun.pusan.ac.kr. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
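
    The abstract does not describe the selection statistic used by the procedure, so the following is only a rough sketch of the general idea of searching over a power set: enumerate every non-empty subset of the variants in a gene and keep the subset with the best score. The functions `nonempty_subsets` and `select_best_subset`, and the toy scoring function, are hypothetical and are not the interface of the 'rvsel' package.

```python
from itertools import combinations


def nonempty_subsets(variants):
    """Yield every non-empty subset of `variants` as a tuple.

    For m variants there are 2**m - 1 such subsets, so exhaustive
    enumeration is only practical for a modest number of variants.
    """
    for size in range(1, len(variants) + 1):
        yield from combinations(variants, size)


def select_best_subset(variants, score):
    """Return the non-empty subset with the highest score.

    `score` is a caller-supplied function mapping a tuple of variants to a
    number; a real analysis would use an association statistic here.
    """
    return max(nonempty_subsets(variants), key=score)


if __name__ == "__main__":
    # Toy per-variant contributions, purely illustrative.
    toy_effect = {"v1": 1.0, "v2": -0.5, "v3": 2.0}
    best = select_best_subset(
        list(toy_effect),
        score=lambda subset: sum(toy_effect[v] for v in subset),
    )
    print(best)
```

    The exponential growth in the number of subsets is the main practical constraint of any exhaustive search of this kind.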

  17. A multicenter study confirms CD226 gene association with systemic sclerosis-related pulmonary fibrosis

    PubMed Central

    2012-01-01

    Introduction: CD226 genetic variants have been associated with a number of autoimmune diseases and recently with systemic sclerosis (SSc). The aim of this study was to test the influence of CD226 loci in SSc susceptibility, clinical phenotypes and autoantibody status in a large multicenter European population. Methods: A total of seven European populations of Caucasian ancestry were included, comprising 2,131 patients with SSc and 3,966 healthy controls. Three CD226 single nucleotide polymorphisms (SNPs), rs763361, rs34794968 and rs727088, were genotyped using Taqman 5' allelic discrimination assays. Results: Pooled analyses showed no evidence of association of any of the three SNPs with either the global disease or the analyzed subphenotypes. However, haplotype block analysis revealed a significant association for the TCG haplotype (SNP order: rs763361, rs34794968, rs727088) with lung fibrosis-positive patients (PBonf = 3.18E-02, OR 1.27 (1.05 to 1.54)). Conclusion: Our data suggest that the tested genetic variants do not individually influence SSc susceptibility but that a CD226 three-variant haplotype is related to genetic predisposition to SSc-related pulmonary fibrosis. PMID:22531499

  18. Inner-volume echo volumar imaging (IVEVI) for robust fetal brain imaging.

    PubMed

    Nunes, Rita G; Ferrazzi, Giulio; Price, Anthony N; Hutter, Jana; Gaspar, Andreia S; Rutherford, Mary A; Hajnal, Joseph V

    2018-07-01

    Fetal functional MRI studies using conventional 2-dimensional single-shot echo-planar imaging sequences may require discarding a large data fraction as a result of fetal and maternal motion. Increasing the temporal resolution using echo volumar imaging (EVI) could provide an effective alternative strategy. Echo volumar imaging was combined with inner volume (IV) imaging (IVEVI) to locally excite the fetal brain and acquire full 3-dimensional images, fast enough to freeze most fetal head motion. IVEVI was implemented by modifying a standard multi-echo echo-planar imaging sequence. A spin echo with orthogonal excitation and refocusing ensured localized excitation. To introduce T2* weighting and to save time, the k-space center was shifted relative to the spin echo. Both single and multi-shot variants were tested. Acoustic noise was controlled by adjusting the amplitude and switching frequency of the readout gradient. Image-based shimming was used to minimize B0 inhomogeneities within the fetal brain. The sequence was first validated in an adult. Eight fetuses were scanned using single-shot IVEVI at a 3.5 × 3.5 × 5.0 mm³ resolution with a readout duration of 383 ms. Multishot IVEVI showed reduced geometric distortions along the second phase-encode direction. Fetal EVI remains challenging. Although effective echo times comparable to the T2* values of fetal cortical gray matter at 3 T could be achieved, controlling acoustic noise required longer readouts, leading to substantial distortions in single-shot images. Although multishot variants enabled us to reduce susceptibility-induced geometric distortions, sensitivity to motion was increased. Future studies should therefore focus on improvements to multishot variants. Magn Reson Med 80:279-285, 2018. © 2017 International Society for Magnetic Resonance in Medicine.

  19. Natural single amino acid polymorphism (F19Y) in human galectin-8: detection of structural alterations and increased growth-regulatory activity on tumor cells.

    PubMed

    Ruiz, Federico M; Scholz, Barbara A; Buzamet, Eliza; Kopitz, Jürgen; André, Sabine; Menéndez, Margarita; Romero, Antonio; Solís, Dolores; Gabius, Hans-Joachim

    2014-03-01

    Natural amino acid substitution by single-site nucleotide polymorphism can become a valuable tool for structure-activity correlations, especially if evidence for association to disease parameters exists. Focusing on the F19Y change in human galectin-8, connected clinically to rheumatoid arthritis, we here initiate the study of consequences of a single-site substitution in the carbohydrate recognition domain of this family of cellular effectors. We apply a strategically combined set of structural and cell biological techniques for comparing properties of the wild-type and variant proteins. The overall hydrodynamic behavior of the full-length protein and of the separate N-domain is not noticeably altered, but displacements in the F0 β-strand of the β-sandwich fold in the N-domain are induced, as evidenced by protein crystallography. Analysis of thermal stability by circular dichroism spectroscopy revealed perceptible differences for the full-length proteins, pointing to an impact of the substitution beyond the N-domain. In addition, small differences in thermodynamic parameters of carbohydrate binding are detected. On the level of two types of tumor cells, characteristics of binding appeared rather similar. In further comparison of the influence on proliferation, the variant proved to be more active as growth regulator in the six tested lines of neuroblastoma, erythroleukemia and colon adenocarcinoma. The seemingly subtle structural change identified here thus has functional implications in vitro, encouraging further analysis in autoimmune regulation and, in a broad context, in work with other natural single-site variants, using the documented combined strategy. The atomic coordinates and structure factors (codes 4BMB, 4BME) have been deposited in the Protein Data Bank. © 2014 FEBS.

  20. Germline contamination and leakage in whole genome somatic single nucleotide variant detection.

    PubMed

    Sendorek, Dorota H; Caloian, Cristian; Ellrott, Kyle; Bare, J Christopher; Yamaguchi, Takafumi N; Ewing, Adam D; Houlahan, Kathleen E; Norman, Thea C; Margolin, Adam A; Stuart, Joshua M; Boutros, Paul C

    2018-01-31

    The clinical sequencing of cancer genomes to personalize therapy is becoming routine across the world. However, concerns over patient re-identification from these data lead to questions about how tightly access should be controlled. It is not thought to be possible to re-identify patients from somatic variant data. However, somatic variant detection pipelines can mistakenly identify germline variants as somatic ones, a process called "germline leakage". The rate of germline leakage across different somatic variant detection pipelines is not well understood, and it is uncertain whether somatic variant calls should be considered re-identifiable. To fill this gap, we quantified germline leakage across 259 sets of whole-genome somatic single nucleotide variant (SNV) predictions made by 21 teams as part of the ICGC-TCGA DREAM Somatic Mutation Calling Challenge. The median somatic SNV prediction set contained 4325 somatic SNVs and leaked one germline polymorphism. The level of germline leakage was inversely correlated with somatic SNV prediction accuracy and positively correlated with the amount of infiltrating normal cells. The specific germline variants leaked differed by tumour and algorithm. To aid in quantitation and correction of leakage, we created a tool, called GermlineFilter, for use in public-facing somatic SNV databases. The potential for patient re-identification from leaked germline variants in somatic SNV predictions has led to divergent open data access policies, based on different assessments of the risks. Indeed, a single, well-publicized re-identification event could reshape public perceptions of the values of genomic data sharing. We find that modern somatic SNV prediction pipelines have low germline-leakage rates, which can be further reduced, especially for cloud-sharing, using pre-filtering software.
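
    To make the notion of germline leakage concrete, the sketch below counts how many predicted somatic SNVs coincide with a patient's known germline variants, and shows a simple pre-filter that removes them. It assumes variants can be compared as (chromosome, position, reference allele, alternate allele) tuples; the function names and this representation are illustrative assumptions and do not describe how GermlineFilter itself works.

```python
def find_germline_leakage(somatic_calls, germline_variants):
    """Return the predicted somatic SNVs that match known germline variants.

    Each variant is a (chrom, pos, ref, alt) tuple. The representation is an
    assumption of this sketch; real pipelines typically read VCF files.
    """
    germline = set(germline_variants)  # set membership tests are O(1)
    return [call for call in somatic_calls if call in germline]


def prefilter_somatic_calls(somatic_calls, germline_variants):
    """Return the somatic calls with any leaked germline variants removed."""
    germline = set(germline_variants)
    return [call for call in somatic_calls if call not in germline]


if __name__ == "__main__":
    # Made-up example variants, not data from the study.
    germline = [("chr1", 1000, "A", "G"), ("chr2", 2000, "C", "T")]
    somatic = [("chr1", 1000, "A", "G"), ("chr3", 3000, "G", "A")]

    leaked = find_germline_leakage(somatic, germline)
    print(f"{len(leaked)} leaked germline variant(s): {leaked}")
    print("filtered calls:", prefilter_somatic_calls(somatic, germline))
```

    Matching on exact tuples is deliberately strict; a production tool would also need to handle details such as variant normalization, which this sketch omits.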
