HapZipper: sharing HapMap populations just got easier.
Chanda, Pritam; Elhaik, Eran; Bader, Joel S
2012-11-01
The rapidly growing amount of genomic sequence data being generated and made publicly available necessitates the development of new data storage and archiving methods. The vast amount of data being shared and manipulated also creates new challenges for network resources. Thus, developing advanced data compression techniques is becoming an integral part of data production and analysis. The HapMap project is one of the largest public resources of human single-nucleotide polymorphisms (SNPs), characterizing over 3 million SNPs genotyped in over 1000 individuals. The standard format and biological properties of HapMap data suggest that a dedicated genetic compression method can outperform generic compression tools. We propose a compression methodology for genetic data by introducing HapZipper, a lossless compression tool tailored to compress HapMap data beyond benchmarks defined by generic tools such as gzip, bzip2 and lzma. We demonstrate the usefulness of HapZipper by compressing HapMap 3 populations to <5% of their original sizes. HapZipper is freely downloadable from https://bitbucket.org/pchanda/hapzipper/downloads/HapZipper.tar.bz2.
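To illustrate the kind of generic baseline the abstract mentions, the following minimal sketch measures gzip, bzip2 and lzma ratios using Python's standard library. It does not reproduce HapZipper's method. The genotype text is synthetic, and the row layout, the function names and the sample counts are assumptions made only for this example.

```python
import bz2
import gzip
import lzma
import random


def toy_genotype_text(n_snps=2000, n_samples=50, seed=0):
    """Build synthetic genotype rows (hypothetical layout, not real HapMap data)."""
    rng = random.Random(seed)
    rows = []
    for i in range(n_snps):
        maf = rng.uniform(0.01, 0.5)  # per-SNP minor allele frequency
        calls = []
        for _ in range(n_samples):
            a = "G" if rng.random() < maf else "A"
            b = "G" if rng.random() < maf else "A"
            calls.append(a + b)
        rows.append(f"rs{i} " + " ".join(calls))
    return ("\n".join(rows) + "\n").encode("ascii")


def compression_ratios(data):
    """Return compressed size divided by original size for each generic codec."""
    codecs = {"gzip": gzip.compress, "bzip2": bz2.compress, "lzma": lzma.compress}
    return {name: len(fn(data)) / len(data) for name, fn in codecs.items()}


data = toy_genotype_text()
ratios = compression_ratios(data)
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.1%} of original size")
```

Ratios on synthetic text say nothing about real HapMap files; the sketch only shows how such a baseline comparison could be set up.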
interPopula: a Python API to access the HapMap Project dataset
2010-01-01
Background The HapMap project is a publicly available catalogue of common genetic variants that occur in humans, currently including several million SNPs across 1115 individuals spanning 11 different populations. This important database does not provide any programmatic access to the dataset; furthermore, no standard relational database interface is provided. Results interPopula is a Python API to access the HapMap dataset. interPopula provides integration facilities with both the Python ecology of software (e.g. Biopython and matplotlib) and other relevant human population datasets (e.g. Ensembl gene annotation and UCSC Known Genes). A set of guidelines and code examples to address possible inconsistencies across heterogeneous data sources is also provided. Conclusions interPopula is a straightforward and flexible Python API that facilitates the construction of scripts and applications that require access to the HapMap dataset. PMID:21210977
HapMap filter 1.0: a tool to preprocess the HapMap genotypic data for association studies.
Zhang, Wei; Duan, Shiwei; Dolan, M Eileen
2008-05-13
The International HapMap Project provides a resource of genotypic data on single nucleotide polymorphisms (SNPs), which can be used in various association studies to identify the genetic determinants for phenotypic variations. Prior to the association studies, the HapMap dataset should be preprocessed in order to reduce the computation time and control the multiple testing problem. Less informative SNPs, including those with a very low genotyping rate and those with rare minor alleles in one or more populations, are removed. Some research designs only use SNPs in a subset of HapMap cell lines. Although the HapMap website and other association software packages have provided some basic tools for optimizing these datasets, a fast and user-friendly program to generate the output for filtered genotypic data would be beneficial for association studies. Here, we present a flexible, straightforward bioinformatics program that can be useful in preparing the HapMap genotypic data for association studies by specifying cell lines and two common filtering criteria: minor allele frequencies and genotyping rate. The software was developed for Microsoft Windows and written in C++. The Windows executable and source code in Microsoft Visual C++ are available at Google Code (http://hapmap-filter-v1.googlecode.com/) or upon request. Their distribution is subject to GNU General Public License v3.
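The two filtering criteria named in this abstract can be sketched in a few lines of Python. This is not the HapMap filter 1.0 code (which is written in C++). The data layout, the "NN" missing-genotype marker, the sample IDs and the default thresholds are all assumptions chosen for illustration, and SNPs are assumed to be biallelic.

```python
def call_rate(genotypes):
    """Fraction of samples with a non-missing genotype ('NN' marks missing here)."""
    called = [g for g in genotypes if g != "NN"]
    return len(called) / len(genotypes)


def minor_allele_frequency(genotypes):
    """MAF over called genotypes of a biallelic SNP; 0.0 if fewer than two alleles seen."""
    counts = {}
    for g in genotypes:
        if g == "NN":
            continue
        for allele in g:
            counts[allele] = counts.get(allele, 0) + 1
    if len(counts) < 2:
        return 0.0
    return min(counts.values()) / sum(counts.values())


def filter_snps(snps, samples, keep_samples, min_maf=0.05, min_call_rate=0.95):
    """Restrict to the chosen samples, then keep SNPs passing both thresholds."""
    idx = [samples.index(s) for s in keep_samples]
    kept = {}
    for rsid, genos in snps.items():
        subset = [genos[i] for i in idx]
        if call_rate(subset) >= min_call_rate and minor_allele_frequency(subset) >= min_maf:
            kept[rsid] = subset
    return kept


# Hypothetical toy input: four samples, three SNPs.
samples = ["S1", "S2", "S3", "S4"]
snps = {
    "rs1": ["AA", "AG", "GG", "AG"],  # polymorphic, fully called
    "rs2": ["CC", "CC", "CC", "CC"],  # monomorphic
    "rs3": ["TT", "NN", "NN", "CT"],  # half the calls missing
}
passed = filter_snps(snps, samples, keep_samples=samples, min_call_rate=0.75)
print(sorted(passed))
```

In this toy input only rs1 passes: rs2 fails the MAF threshold and rs3 fails the genotyping-rate threshold.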
HapMap tagSNP transferability in multiple populations: general guidelines
Xing, Jinchuan; Witherspoon, David J.; Watkins, W. Scott; Zhang, Yuhua; Tolpinrud, Whitney; Jorde, Lynn B.
2008-01-01
Linkage disequilibrium (LD) has received much recent attention because of its value in localizing disease-causing genes. Due to the extensive LD between neighboring loci in the human genome, it is believed that a subset of the single nucleotide polymorphisms in a region (tagSNPs) can be selected to capture most of the remaining SNP variants. In this study, we examined LD patterns and HapMap tagSNP transferability in more than 300 individuals. A South Indian and an African Mbuti Pygmy population sample were included to evaluate the performance of HapMap tagSNPs in geographically distinct and genetically isolated populations. Our results show that HapMap tagSNPs selected with r2 >= 0.8 can capture more than 85% of the SNPs in populations that are from the same continental group. Combined tagSNPs from HapMap CEU and CHB+JPT serve as the best reference for the Indian sample. The HapMap YRI are a sufficient reference for tagSNP selection in the Pygmy sample. In addition to our findings, we reviewed over 25 recent studies of tagSNP transferability and propose a general guideline for selecting tagSNPs from HapMap populations. PMID:18482828
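As a rough illustration of tagging at an r2 threshold of 0.8, the sketch below picks tag SNPs greedily. It is not the selection procedure used in the study. It assumes unphased genotypes coded as allele dosages (0, 1, 2) and uses the squared Pearson correlation of dosages as a stand-in for haplotype-based r2; the SNP names and data are hypothetical.

```python
def r_squared(x, y):
    """Squared Pearson correlation of two dosage vectors (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy * sxy / (sxx * syy)


def greedy_tag_snps(dosages, threshold=0.8):
    """Repeatedly pick the SNP that captures the most still-untagged SNPs."""
    untagged = set(dosages)
    tags = []
    while untagged:
        best = max(
            sorted(untagged),  # sorted so ties resolve deterministically
            key=lambda s: sum(
                r_squared(dosages[s], dosages[t]) >= threshold for t in untagged
            ),
        )
        tags.append(best)
        captured = {
            t for t in untagged
            if t == best or r_squared(dosages[best], dosages[t]) >= threshold
        }
        untagged -= captured
    return tags


# Hypothetical dosages for four SNPs in six individuals.
dosages = {
    "snpA": [0, 1, 2, 1, 0, 2],
    "snpB": [0, 1, 2, 1, 0, 2],  # identical to snpA
    "snpC": [2, 1, 0, 1, 2, 0],  # mirror image of snpA
    "snpD": [0, 0, 1, 2, 1, 0],  # uncorrelated with snpA
}
tags = greedy_tag_snps(dosages)
print(tags)
```

Here snpA captures snpB and snpC, while snpD must be kept as its own tag. Transferability, as studied in the abstract, would then ask how well tags chosen in one population capture SNPs in another.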
Buchanan, Carrie C; Torstenson, Eric S; Bush, William S; Ritchie, Marylyn D
2012-01-01
Since publication of the human genome in 2003, geneticists have been interested in risk variant associations to resolve the etiology of traits and complex diseases. The International HapMap Consortium undertook an effort to catalog all common variation across the genome (variants with a minor allele frequency (MAF) of at least 5% in one or more ethnic groups). HapMap along with advances in genotyping technology led to genome-wide association studies which have identified common variants associated with many traits and diseases. In 2008 the 1000 Genomes Project aimed to sequence 2500 individuals and identify rare variants and 99% of variants with a MAF of <1%. To determine whether the 1000 Genomes Project includes all the variants in HapMap, we examined the overlap between single nucleotide polymorphisms (SNPs) genotyped in the two resources using merged phase II/III HapMap data and low coverage pilot data from 1000 Genomes. Comparison of the two data sets showed that approximately 72% of HapMap SNPs were also found in 1000 Genomes Project pilot data. After filtering out HapMap variants with a MAF of <5% (separately for each population), 99% of HapMap SNPs were found in 1000 Genomes data. Not all variants cataloged in HapMap are also cataloged in 1000 Genomes. This could affect decisions about which resource to use for SNP queries, rare variant validation, or imputation. Both the HapMap and 1000 Genomes Project databases are useful resources for human genetics, but it is important to understand the assumptions made and filtering strategies employed by these projects.
Buchanan, Carrie C; Torstenson, Eric S; Bush, William S
2012-01-01
Background Since publication of the human genome in 2003, geneticists have been interested in risk variant associations to resolve the etiology of traits and complex diseases. The International HapMap Consortium undertook an effort to catalog all common variation across the genome (variants with a minor allele frequency (MAF) of at least 5% in one or more ethnic groups). HapMap along with advances in genotyping technology led to genome-wide association studies which have identified common variants associated with many traits and diseases. In 2008 the 1000 Genomes Project aimed to sequence 2500 individuals and identify rare variants and 99% of variants with a MAF of <1%. Methods To determine whether the 1000 Genomes Project includes all the variants in HapMap, we examined the overlap between single nucleotide polymorphisms (SNPs) genotyped in the two resources using merged phase II/III HapMap data and low coverage pilot data from 1000 Genomes. Results Comparison of the two data sets showed that approximately 72% of HapMap SNPs were also found in 1000 Genomes Project pilot data. After filtering out HapMap variants with a MAF of <5% (separately for each population), 99% of HapMap SNPs were found in 1000 Genomes data. Conclusions Not all variants cataloged in HapMap are also cataloged in 1000 Genomes. This could affect decisions about which resource to use for SNP queries, rare variant validation, or imputation. Both the HapMap and 1000 Genomes Project databases are useful resources for human genetics, but it is important to understand the assumptions made and filtering strategies employed by these projects. PMID:22319179
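The overlap comparison described above amounts to a set intersection computed before and after a MAF filter. The sketch below uses invented SNP identifiers and frequencies and a single pooled MAF per SNP, whereas the study filtered separately for each population; the function name is an assumption.

```python
def overlap_fraction(hapmap_maf, other_ids, min_maf=0.0):
    """Fraction of HapMap SNPs with MAF >= min_maf that also appear in other_ids."""
    kept = {rsid for rsid, maf in hapmap_maf.items() if maf >= min_maf}
    if not kept:
        return 0.0
    return len(kept & other_ids) / len(kept)


# Hypothetical toy data: SNP id -> minor allele frequency in one population.
hapmap_maf = {"rs1": 0.30, "rs2": 0.12, "rs3": 0.02, "rs4": 0.01, "rs5": 0.25}
other_ids = {"rs1", "rs2", "rs5", "rs9"}

unfiltered = overlap_fraction(hapmap_maf, other_ids)
common_only = overlap_fraction(hapmap_maf, other_ids, min_maf=0.05)
print(unfiltered, common_only)
```

In this toy case the overlap rises from 3 of 5 SNPs to 3 of 3 once rare variants are excluded, mirroring the direction (not the magnitude) of the reported change.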
EvoSNP-DB: A database of genetic diversity in East Asian populations.
Kim, Young Uk; Kim, Young Jin; Lee, Jong-Young; Park, Kiejung
2013-08-01
Genome-wide association studies (GWAS) have become popular as an approach for the identification of large numbers of phenotype-associated variants. However, differences in genetic architecture and environmental factors mean that the effect of variants can vary across populations. Understanding population genetic diversity is valuable for the investigation of possible population specific and independent effects of variants. EvoSNP-DB aims to provide information regarding genetic diversity among East Asian populations, including Chinese, Japanese, and Korean. Non-redundant SNPs (1.6 million) were genotyped in 54 Korean trios (162 samples) and were compared with 4 million SNPs from HapMap phase II populations. EvoSNP-DB provides two user interfaces for data query and visualization, and integrates scores of genetic diversity (Fst and VarLD) at the level of SNPs, genes, and chromosome regions. EvoSNP-DB is a web-based application that allows users to navigate and visualize measurements of population genetic differences in an interactive manner, and is available online at [http://biomi.cdc.go.kr/EvoSNP/].
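For readers unfamiliar with Fst, a minimal per-SNP sketch follows. The abstract does not say which estimator EvoSNP-DB uses, so this assumes the simple heterozygosity-based definition Fst = (H_T - H_S) / H_T with equally weighted populations and no sample-size correction.

```python
def fst(freqs):
    """Fst for one biallelic SNP from per-population allele frequencies.

    Assumes equal population weights and ignores sampling error.
    """
    n = len(freqs)
    p_bar = sum(freqs) / n
    h_t = 2 * p_bar * (1 - p_bar)  # expected heterozygosity of the pooled population
    if h_t == 0:
        return 0.0  # SNP is monomorphic everywhere
    h_s = sum(2 * p * (1 - p) for p in freqs) / n  # mean within-population value
    return (h_t - h_s) / h_t


print(fst([0.5, 0.5]))  # identical frequencies
print(fst([0.2, 0.8]))  # moderately differentiated
print(fst([0.0, 1.0]))  # fixed for different alleles
```

Published analyses usually prefer estimators that correct for sample size, such as that of Weir and Cockerham.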
Vongpaisarnsin, Kornkiat; Listman, Jennifer Beth; Malison, Robert T; Gelernter, Joel
2015-01-01
The main purpose of this work was to identify a set of AIMs that stratify the genetic structure and diversity of the Thai population from a high-throughput autosomal genome-wide association study. In this study, more than one million SNPs from the International HapMap database and the Thai depression genome-wide association study have been examined to identify ancestry informative markers (AIMs) that distinguish between Thai populations. An efficient strategy is proposed to identify and characterize such SNPs and to test high-resolution SNP data from international HapMap populations. The best AIMs are identified to stratify the population and to infer genetic ancestry structure. A total of 124 AIMs were clearly clustered geographically across the continent, whereas only 89 AIMs stratified the Thai population from East Asian populations. Finally, a set of 273 AIMs was able to distinguish northern from southern Thai subpopulations. These markers will be of particular value in identifying the ethnic origins in regions where matching by self-reports is unavailable or unreliable, which usually occurs in real forensic cases. PMID:25759192
Vongpaisarnsin, Kornkiat; Listman, Jennifer Beth; Malison, Robert T; Gelernter, Joel
2015-07-01
The main purpose of this work was to identify a set of AIMs that stratify the genetic structure and diversity of the Thai population from a high-throughput autosomal genome-wide association study. In this study, more than one million SNPs from the international HapMap database and the Thai depression genome-wide association study have been examined to identify ancestry informative markers (AIMs) that distinguish between Thai populations. An efficient strategy is proposed to identify and characterize such SNPs and to test high-resolution SNP data from international HapMap populations. The best AIMs are identified to stratify the population and to infer genetic ancestry structure. A total of 124 AIMs were clearly clustered geographically across the continent, whereas only 89 AIMs stratified the Thai population from East Asian populations. Finally, a set of 273 AIMs was able to distinguish northern from southern Thai subpopulations. These markers will be of particular value in identifying the ethnic origins in regions where matching by self-reports is unavailable or unreliable, which usually occurs in real forensic cases.
Genome-wide association analysis of ischemic stroke in young adults.
Cheng, Yu-Ching; O'Connell, Jeffrey R; Cole, John W; Stine, O Colin; Dueker, Nicole; McArdle, Patrick F; Sparks, Mary J; Shen, Jess; Laurie, Cathy C; Nelson, Sarah; Doheny, Kimberly F; Ling, Hua; Pugh, Elizabeth W; Brott, Thomas G; Brown, Robert D; Meschia, James F; Nalls, Michael; Rich, Stephen S; Worrall, Bradford; Anderson, Christopher D; Biffi, Alessandro; Cortellini, Lynelle; Furie, Karen L; Rost, Natalia S; Rosand, Jonathan; Manolio, Teri A; Kittner, Steven J; Mitchell, Braxton D
2011-11-01
Ischemic stroke (IS) is among the leading causes of death in Western countries. There is a significant genetic component to IS susceptibility, especially among young adults. To date, research to identify genetic loci predisposing to stroke has met only with limited success. We performed a genome-wide association (GWA) analysis of early-onset IS to identify potential stroke susceptibility loci. The GWA analysis was conducted by genotyping 1 million SNPs in a biracial population of 889 IS cases and 927 controls, ages 15-49 years. Genotypes were imputed using the HapMap3 reference panel to provide 1.4 million SNPs for analysis. Logistic regression models adjusting for age, recruitment stages, and population structure were used to determine the association of IS with individual SNPs. Although no single SNP reached genome-wide significance (P < 5 × 10(-8)), we identified two SNPs in chromosome 2q23.3, rs2304556 (in FMNL2; P = 1.2 × 10(-7)) and rs1986743 (in ARL6IP6; P = 2.7 × 10(-7)), strongly associated with early-onset stroke. These data suggest that a novel locus on human chromosome 2q23.3 may be associated with IS susceptibility among young adults.
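The genome-wide significance threshold quoted above is commonly motivated as a Bonferroni correction of a 0.05 family-wise error rate over roughly one million independent tests; that motivation is general background, not something stated in the abstract. The short sketch below reproduces the arithmetic and checks the two reported P values against it.

```python
import math

ALPHA = 0.05          # family-wise error rate
N_TESTS = 1_000_000   # assumed number of independent tests
threshold = ALPHA / N_TESTS

# P values as reported in the abstract.
p_values = {"rs2304556": 1.2e-7, "rs1986743": 2.7e-7}
significant = {snp: p < threshold for snp, p in p_values.items()}

print(f"threshold = {threshold:.1e}")
print(significant)
```

Both SNPs fall just short of the threshold, which matches the abstract's statement that no single SNP reached genome-wide significance.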
Singapore Genome Variation Project: a haplotype map of three Southeast Asian populations.
Teo, Yik-Ying; Sim, Xueling; Ong, Rick T H; Tan, Adrian K S; Chen, Jieming; Tantoso, Erwin; Small, Kerrin S; Ku, Chee-Seng; Lee, Edmund J D; Seielstad, Mark; Chia, Kee-Seng
2009-11-01
The Singapore Genome Variation Project (SGVP) provides a publicly available resource of 1.6 million single nucleotide polymorphisms (SNPs) genotyped in 268 individuals from the Chinese, Malay, and Indian population groups in Southeast Asia. This online database catalogs information and summaries on genotype and phased haplotype data, including allele frequencies, assessment of linkage disequilibrium (LD), and recombination rates in a format similar to the International HapMap Project. Here, we introduce this resource and describe the analysis of human genomic variation upon agglomerating data from the HapMap and the Human Genome Diversity Project, providing useful insights into the population structure of the three major population groups in Asia. In addition, this resource also surveyed across the genome for variation in regional patterns of LD between the HapMap and SGVP populations, and for signatures of positive natural selection using two well-established metrics: iHS and XP-EHH. The raw and processed genetic data, together with all population genetic summaries, are publicly available for download and browsing through a web browser modeled with the Generic Genome Browser.
Singapore Genome Variation Project: A haplotype map of three Southeast Asian populations
Teo, Yik-Ying; Sim, Xueling; Ong, Rick T.H.; Tan, Adrian K.S.; Chen, Jieming; Tantoso, Erwin; Small, Kerrin S.; Ku, Chee-Seng; Lee, Edmund J.D.; Seielstad, Mark; Chia, Kee-Seng
2009-01-01
The Singapore Genome Variation Project (SGVP) provides a publicly available resource of 1.6 million single nucleotide polymorphisms (SNPs) genotyped in 268 individuals from the Chinese, Malay, and Indian population groups in Southeast Asia. This online database catalogs information and summaries on genotype and phased haplotype data, including allele frequencies, assessment of linkage disequilibrium (LD), and recombination rates in a format similar to the International HapMap Project. Here, we introduce this resource and describe the analysis of human genomic variation upon agglomerating data from the HapMap and the Human Genome Diversity Project, providing useful insights into the population structure of the three major population groups in Asia. In addition, this resource also surveyed across the genome for variation in regional patterns of LD between the HapMap and SGVP populations, and for signatures of positive natural selection using two well-established metrics: iHS and XP-EHH. The raw and processed genetic data, together with all population genetic summaries, are publicly available for download and browsing through a web browser modeled with the Generic Genome Browser. PMID:19700652
Fine-Scale Map of Encyclopedia of DNA Elements Regions in the Korean Population
Yoo, Yeon-Kyeong; Ke, Xiayi; Hong, Sungwoo; Jang, Hye-Yoon; Park, Kyunghee; Kim, Sook; Ahn, TaeJin; Lee, Yeun-Du; Song, Okryeol; Rho, Na-Young; Lee, Moon Sue; Lee, Yeon-Su; Kim, Jaeheup; Kim, Young J.; Yang, Jun-Mo; Song, Kyuyoung; Kimm, Kyuchan; Weir, Bruce; Cardon, Lon R.; Lee, Jong-Eun; Hwang, Jung-Joo
2006-01-01
The International HapMap Project aims to generate detailed human genome variation maps by densely genotyping single-nucleotide polymorphisms (SNPs) in CEPH, Chinese, Japanese, and Yoruba samples. This will undoubtedly become an important facility for genetic studies of diseases and complex traits in the four populations. To address how the genetic information contained in such variation maps is transferable to other populations, the Korean government, industries, and academics have launched the Korean HapMap project to genotype high-density Encyclopedia of DNA Elements (ENCODE) regions in 90 Korean individuals. Here we show that the LD pattern, block structure, haplotype diversity, and recombination rate are highly concordant between Korean and the two HapMap Asian samples, particularly Japanese. The availability of information from both Chinese and Japanese samples helps to predict more accurately the possible performance of HapMap markers in Korean disease-gene studies. Tagging SNPs selected from the two HapMap Asian maps, especially the Japanese map, were shown to be very effective for Korean samples. These results demonstrate that the HapMap variation maps are robust in related populations and will serve as an important resource for the studies of the Korean population in particular. PMID:16702437
A second generation human haplotype map of over 3.1 million SNPs.
Frazer, Kelly A; Ballinger, Dennis G; Cox, David R; Hinds, David A; Stuve, Laura L; Gibbs, Richard A; Belmont, John W; Boudreau, Andrew; Hardenbol, Paul; Leal, Suzanne M; Pasternak, Shiran; Wheeler, David A; Willis, Thomas D; Yu, Fuli; Yang, Huanming; Zeng, Changqing; Gao, Yang; Hu, Haoran; Hu, Weitao; Li, Chaohua; Lin, Wei; Liu, Siqi; Pan, Hao; Tang, Xiaoli; Wang, Jian; Wang, Wei; Yu, Jun; Zhang, Bo; Zhang, Qingrun; Zhao, Hongbin; Zhao, Hui; Zhou, Jun; Gabriel, Stacey B; Barry, Rachel; Blumenstiel, Brendan; Camargo, Amy; Defelice, Matthew; Faggart, Maura; Goyette, Mary; Gupta, Supriya; Moore, Jamie; Nguyen, Huy; Onofrio, Robert C; Parkin, Melissa; Roy, Jessica; Stahl, Erich; Winchester, Ellen; Ziaugra, Liuda; Altshuler, David; Shen, Yan; Yao, Zhijian; Huang, Wei; Chu, Xun; He, Yungang; Jin, Li; Liu, Yangfan; Shen, Yayun; Sun, Weiwei; Wang, Haifeng; Wang, Yi; Wang, Ying; Xiong, Xiaoyan; Xu, Liang; Waye, Mary M Y; Tsui, Stephen K W; Xue, Hong; Wong, J Tze-Fei; Galver, Luana M; Fan, Jian-Bing; Gunderson, Kevin; Murray, Sarah S; Oliphant, Arnold R; Chee, Mark S; Montpetit, Alexandre; Chagnon, Fanny; Ferretti, Vincent; Leboeuf, Martin; Olivier, Jean-François; Phillips, Michael S; Roumy, Stéphanie; Sallée, Clémentine; Verner, Andrei; Hudson, Thomas J; Kwok, Pui-Yan; Cai, Dongmei; Koboldt, Daniel C; Miller, Raymond D; Pawlikowska, Ludmila; Taillon-Miller, Patricia; Xiao, Ming; Tsui, Lap-Chee; Mak, William; Song, You Qiang; Tam, Paul K H; Nakamura, Yusuke; Kawaguchi, Takahisa; Kitamoto, Takuya; Morizono, Takashi; Nagashima, Atsushi; Ohnishi, Yozo; Sekine, Akihiro; Tanaka, Toshihiro; Tsunoda, Tatsuhiko; Deloukas, Panos; Bird, Christine P; Delgado, Marcos; Dermitzakis, Emmanouil T; Gwilliam, Rhian; Hunt, Sarah; Morrison, Jonathan; Powell, Don; Stranger, Barbara E; Whittaker, Pamela; Bentley, David R; Daly, Mark J; de Bakker, Paul I W; Barrett, Jeff; Chretien, Yves R; Maller, Julian; McCarroll, Steve; Patterson, Nick; Pe'er, Itsik; Price, Alkes; Purcell, Shaun; Richter, 
Daniel J; Sabeti, Pardis; Saxena, Richa; Schaffner, Stephen F; Sham, Pak C; Varilly, Patrick; Altshuler, David; Stein, Lincoln D; Krishnan, Lalitha; Smith, Albert Vernon; Tello-Ruiz, Marcela K; Thorisson, Gudmundur A; Chakravarti, Aravinda; Chen, Peter E; Cutler, David J; Kashuk, Carl S; Lin, Shin; Abecasis, Gonçalo R; Guan, Weihua; Li, Yun; Munro, Heather M; Qin, Zhaohui Steve; Thomas, Daryl J; McVean, Gilean; Auton, Adam; Bottolo, Leonardo; Cardin, Niall; Eyheramendy, Susana; Freeman, Colin; Marchini, Jonathan; Myers, Simon; Spencer, Chris; Stephens, Matthew; Donnelly, Peter; Cardon, Lon R; Clarke, Geraldine; Evans, David M; Morris, Andrew P; Weir, Bruce S; Tsunoda, Tatsuhiko; Mullikin, James C; Sherry, Stephen T; Feolo, Michael; Skol, Andrew; Zhang, Houcan; Zeng, Changqing; Zhao, Hui; Matsuda, Ichiro; Fukushima, Yoshimitsu; Macer, Darryl R; Suda, Eiko; Rotimi, Charles N; Adebamowo, Clement A; Ajayi, Ike; Aniagwu, Toyin; Marshall, Patricia A; Nkwodimmah, Chibuzor; Royal, Charmaine D M; Leppert, Mark F; Dixon, Missy; Peiffer, Andy; Qiu, Renzong; Kent, Alastair; Kato, Kazuto; Niikawa, Norio; Adewole, Isaac F; Knoppers, Bartha M; Foster, Morris W; Clayton, Ellen Wright; Watkin, Jessica; Gibbs, Richard A; Belmont, John W; Muzny, Donna; Nazareth, Lynne; Sodergren, Erica; Weinstock, George M; Wheeler, David A; Yakub, Imtaz; Gabriel, Stacey B; Onofrio, Robert C; Richter, Daniel J; Ziaugra, Liuda; Birren, Bruce W; Daly, Mark J; Altshuler, David; Wilson, Richard K; Fulton, Lucinda L; Rogers, Jane; Burton, John; Carter, Nigel P; Clee, Christopher M; Griffiths, Mark; Jones, Matthew C; McLay, Kirsten; Plumb, Robert W; Ross, Mark T; Sims, Sarah K; Willey, David L; Chen, Zhu; Han, Hua; Kang, Le; Godbout, Martin; Wallenburg, John C; L'Archevêque, Paul; Bellemare, Guy; Saeki, Koji; Wang, Hongguang; An, Daochang; Fu, Hongbo; Li, Qing; Wang, Zhen; Wang, Renwu; Holden, Arthur L; Brooks, Lisa D; McEwen, Jean E; Guyer, Mark S; Wang, Vivian Ota; Peterson, Jane L; Shi, Michael; 
Spiegel, Jack; Sung, Lawrence M; Zacharia, Lynn F; Collins, Francis S; Kennedy, Karen; Jamieson, Ruth; Stewart, John
2007-10-18
We describe the Phase II HapMap, which characterizes over 3.1 million human single nucleotide polymorphisms (SNPs) genotyped in 270 individuals from four geographically diverse populations and includes 25-35% of common SNP variation in the populations surveyed. The map is estimated to capture untyped common variation with an average maximum r2 of between 0.9 and 0.96 depending on population. We demonstrate that the current generation of commercial genome-wide genotyping products captures common Phase II SNPs with an average maximum r2 of up to 0.8 in African and up to 0.95 in non-African populations, and that potential gains in power in association studies can be obtained through imputation. These data also reveal novel aspects of the structure of linkage disequilibrium. We show that 10-30% of pairs of individuals within a population share at least one region of extended genetic identity arising from recent ancestry and that up to 1% of all common variants are untaggable, primarily because they lie within recombination hotspots. We show that recombination rates vary systematically around genes and between genes of different function. Finally, we demonstrate increased differentiation at non-synonymous, compared to synonymous, SNPs, resulting from systematic differences in the strength or efficacy of natural selection between populations.
USDA-ARS's Scientific Manuscript database
Using next generation sequencing technology the International Swine SNP Consortium has identified 500,000 SNPs and used these to design an Illumina Infinium iSelect™ SNP BeadChip with a selection of 60,218 SNPs. The selected SNPs include previously validated SNPs and SNPs identified de novo using se...
Integrating common and rare genetic variation in diverse human populations.
Altshuler, David M; Gibbs, Richard A; Peltonen, Leena; Altshuler, David M; Gibbs, Richard A; Peltonen, Leena; Dermitzakis, Emmanouil; Schaffner, Stephen F; Yu, Fuli; Peltonen, Leena; Dermitzakis, Emmanouil; Bonnen, Penelope E; Altshuler, David M; Gibbs, Richard A; de Bakker, Paul I W; Deloukas, Panos; Gabriel, Stacey B; Gwilliam, Rhian; Hunt, Sarah; Inouye, Michael; Jia, Xiaoming; Palotie, Aarno; Parkin, Melissa; Whittaker, Pamela; Yu, Fuli; Chang, Kyle; Hawes, Alicia; Lewis, Lora R; Ren, Yanru; Wheeler, David; Gibbs, Richard A; Muzny, Donna Marie; Barnes, Chris; Darvishi, Katayoon; Hurles, Matthew; Korn, Joshua M; Kristiansson, Kati; Lee, Charles; McCarrol, Steven A; Nemesh, James; Dermitzakis, Emmanouil; Keinan, Alon; Montgomery, Stephen B; Pollack, Samuela; Price, Alkes L; Soranzo, Nicole; Bonnen, Penelope E; Gibbs, Richard A; Gonzaga-Jauregui, Claudia; Keinan, Alon; Price, Alkes L; Yu, Fuli; Anttila, Verneri; Brodeur, Wendy; Daly, Mark J; Leslie, Stephen; McVean, Gil; Moutsianas, Loukas; Nguyen, Huy; Schaffner, Stephen F; Zhang, Qingrun; Ghori, Mohammed J R; McGinnis, Ralph; McLaren, William; Pollack, Samuela; Price, Alkes L; Schaffner, Stephen F; Takeuchi, Fumihiko; Grossman, Sharon R; Shlyakhter, Ilya; Hostetter, Elizabeth B; Sabeti, Pardis C; Adebamowo, Clement A; Foster, Morris W; Gordon, Deborah R; Licinio, Julio; Manca, Maria Cristina; Marshall, Patricia A; Matsuda, Ichiro; Ngare, Duncan; Wang, Vivian Ota; Reddy, Deepa; Rotimi, Charles N; Royal, Charmaine D; Sharp, Richard R; Zeng, Changqing; Brooks, Lisa D; McEwen, Jean E
2010-09-02
Despite great progress in identifying genetic variants that influence human disease, most inherited risk remains unexplained. A more complete understanding requires genome-wide studies that fully examine less common alleles in populations with a wide range of ancestry. To inform the design and interpretation of such studies, we genotyped 1.6 million common single nucleotide polymorphisms (SNPs) in 1,184 reference individuals from 11 global populations, and sequenced ten 100-kilobase regions in 692 of these individuals. This integrated data set of common and rare alleles, called 'HapMap 3', includes both SNPs and copy number polymorphisms (CNPs). We characterized population-specific differences among low-frequency variants, measured the improvement in imputation accuracy afforded by the larger reference panel, especially in imputing SNPs with a minor allele frequency of
Li, Cong; Sun, Dongxiao; Zhang, Shengli; Yang, Shaohua; Alim, M A; Zhang, Qin; Li, Yanhua; Liu, Lin
2016-07-28
A previous genome-wide association study deduced that one (ARS-BFGL-NGS-39328), two (Hapmap26001-BTC-038813 and Hapmap31284-BTC-039204), two (Hapmap26001-BTC-038813 and BTB-00246150), and one (Hapmap50366-BTA-46960) genome-wide significant single nucleotide polymorphisms (SNPs) associated with milk fatty acids were close to or within the fatty acid synthase (FASN), peroxisome proliferator-activated receptor gamma, coactivator 1 alpha (PPARGC1A), ATP-binding cassette, sub-family G, member 2 (ABCG2) and insulin-like growth factor 1 (IGF1) genes. To further confirm the linkage and reveal the genetic effects of these four candidate genes on milk fatty acid composition, genetic polymorphisms were identified and genotype-phenotype associations were performed in a Chinese Holstein cattle population. Nine SNPs were identified in FASN, among which SNP rs41919985 was predicted to result in an amino acid substitution from threonine (ACC) to alanine (GCC), five SNPs (rs136947640, rs134340637, rs41919992, rs41919984 and rs41919986) were synonymous mutations, and the remaining three (rs41919999, rs132865003 and rs133498277) were found in FASN introns. Only one SNP each was identified for PPARGC1A, ABCG2 and IGF1. Association studies revealed that FASN, PPARGC1A, ABCG2 and IGF1 were mainly associated with medium-chain saturated fatty acids and long-chain unsaturated fatty acids, especially FASN for C10:0, C12:0 and C14:0. Strong linkage disequilibrium was observed among ARS-BFGL-NGS-39328 and rs132865003 and rs134340637 in FASN (D´ > 0.9), and among Hapmap26001-BTC-038813 and Hapmap31284-BTC-039204 and rs109579682 in PPARGC1A (D´ > 0.9). Subsequently, haplotype-based analysis revealed significant associations of the haplotypes encompassing eight FASN SNPs (rs41919999, rs132865003, rs134340637, rs41919992, rs133498277, rs41919984, rs41919985 and rs41919986) with C10:0, C12:0, C14:0, C18:1n9c, saturated fatty acids (SFA) and unsaturated fatty acids (UFA) (P = 0.0204 to P < 0.0001). 
Our study confirmed the linkage between the significant SNPs in our previous genome-wide association study and variants in FASN and PPARGC1A. SNPs within FASN, PPARGC1A, ABCG2 and IGF1 showed significant genetic effects on milk fatty acid composition in dairy cattle, indicating their potential functions in milk fatty acids synthesis and metabolism. The findings presented here provide evidence for the selection of dairy cows with healthier milk fatty acid composition by marker-assisted breeding or genomic selection schemes, as well as furthering our understanding of technological processing aspects of cows' milk.
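The D´ statistic used above to report linkage disequilibrium can be computed from haplotype and allele frequencies using Lewontin's standard definition. The sketch below assumes those frequencies are already known (in practice they are estimated from phased or inferred haplotypes) and returns the absolute value; the function name and the example numbers are illustrative only.

```python
def d_prime(p_ab, p_a, p_b):
    """Absolute Lewontin's D' from haplotype frequency p_ab and allele frequencies."""
    d = p_ab - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    if d_max == 0:
        return 0.0  # at least one locus is monomorphic
    return abs(d) / d_max


print(d_prime(0.50, 0.5, 0.5))  # alleles always travel together
print(d_prime(0.25, 0.5, 0.5))  # linkage equilibrium
print(d_prime(0.48, 0.5, 0.5))  # strong LD
```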
Kim, Kyung-Seon; Kim, Ghi-Su; Hwang, Joo-Yeon; Lee, Hye-Ja; Park, Mi-Hyun; Kim, Kwang-joong; Jung, Jongsun; Cha, Hyo-Soung; Shin, Hyoung Doo; Kang, Jong-Ho; Park, Eui Kyun; Kim, Tae-Ho; Hong, Jung-Min; Koh, Jung-Min; Oh, Bermseok; Kimm, Kuchan; Kim, Shin-Yoon; Lee, Jong-Young
2007-01-01
Background Osteoporosis is defined as the loss of bone mineral density that leads to bone fragility with aging. Population-based case-control studies have identified polymorphisms in many candidate genes that have been associated with bone mass maintenance or osteoporotic fracture. To investigate single nucleotide polymorphisms (SNPs) that are associated with osteoporosis, we examined the genetic variation among Koreans by analyzing 81 genes according to their function in bone formation and resorption during bone remodeling. Methods We resequenced all the exons, splice junctions and promoter regions of candidate osteoporosis genes using 24 unrelated Korean individuals. Using the common SNPs from our study and the HapMap database, a statistical analysis of deviation in heterozygosity was performed. Results We identified 942 variants, including 888 SNPs, 43 insertion/deletion polymorphisms, and 11 microsatellite markers. Of the SNPs, 557 (63%) had been previously identified and 331 (37%) were newly discovered in the Korean population. When SNPs in the Korean population were compared with those in the HapMap database, 1% (or less) of SNPs in the Japanese and Chinese subpopulations and 20% of those in Caucasian and African subpopulations were significantly differentiated from the Hardy-Weinberg expectations. In addition, an analysis of the genetic diversity showed that there were no significant differences among Korean, Han Chinese and Japanese populations, but African and Caucasian populations were significantly differentiated in selected genes. Nevertheless, in the detailed analysis of genetic properties, the LD and haplotype block patterns among the five sub-populations were substantially different from one another. Conclusion Through the resequencing of 81 osteoporosis candidate genes, 118 unknown SNPs with a minor allele frequency (MAF) > 0.05 were discovered in the Korean population.
In addition, using the common SNPs between our study and HapMap, an analysis of genetic diversity and deviation in heterozygosity was performed and the polymorphisms of the above genes among the five populations were substantially differentiated from one another. Further studies of osteoporosis could utilize the polymorphisms identified in our data since they may have important implications for the selection of highly informative SNPs for future association studies. PMID:18036257
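The abstract does not say which test was used to detect deviation from Hardy-Weinberg expectations. As an illustration only, a minimal sketch of the textbook one-degree-of-freedom chi-square test for a biallelic SNP is given below; the function name and the genotype counts in the usage note are hypothetical, and with only 24 individuals an exact test may be preferable in practice.

```python
from __future__ import annotations

from math import erfc, sqrt


def hwe_chi_square(n_aa: int, n_ab: int, n_bb: int) -> tuple[float, float]:
    """Return (chi-square statistic, p-value) for a biallelic SNP.

    Hypothetical helper: compares observed genotype counts with the counts
    expected under Hardy-Weinberg equilibrium (1 degree of freedom).
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("at least one genotype is required")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele B
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square variable with 1 df: erfc(sqrt(x / 2)).
    p_value = erfc(sqrt(chi2 / 2.0))
    return chi2, p_value
```

For example, `hwe_chi_square(25, 50, 25)` describes a sample that matches the expected proportions exactly, whereas `hwe_chi_square(50, 0, 50)` describes a complete absence of heterozygotes and yields a very small p-value.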
A tool for selecting SNPs for association studies based on observed linkage disequilibrium patterns.
De La Vega, Francisco M; Isaac, Hadar I; Scafe, Charles R
2006-01-01
The design of genetic association studies using single-nucleotide polymorphisms (SNPs) requires the selection of subsets of the variants providing high statistical power at a reasonable cost. SNPs must be selected to maximize the probability that a causative mutation is in linkage disequilibrium (LD) with at least one marker genotyped in the study. The HapMap project performed a genome-wide survey of genetic variation with about a million SNPs typed in four populations, providing a rich resource to inform the design of association studies. A number of strategies have been proposed for the selection of SNPs based on observed LD, including construction of metric LD maps and the selection of haplotype tagging SNPs. Power calculations are important at the study design stage to ensure successful results. Integrating these methods and annotations can be challenging: the algorithms required to implement these methods are complex to deploy, and all the necessary data and annotations are deposited in disparate databases. Here, we present the SNPbrowser Software, a freely available tool to assist in the LD-based selection of markers for association studies. This stand-alone application provides fast query capabilities and swift visualization of SNPs, gene annotations, power, haplotype blocks, and LD map coordinates. Wizards implement several common SNP selection workflows including the selection of optimal subsets of SNPs (e.g. tagging SNPs). Selected SNPs are screened for their conversion potential to either TaqMan SNP Genotyping Assays or the SNPlex Genotyping System, two commercially available genotyping platforms, expediting the set-up of genetic studies with an increased probability of success.
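The selection algorithms inside SNPbrowser are not described in this abstract. To make the idea of LD-based tagging concrete, the sketch below uses a generic greedy procedure over pairwise r² values, which is an assumed stand-in rather than the tool's method; the function name, the 0.8 threshold, and the rs identifiers in the usage note are hypothetical.

```python
from __future__ import annotations


def greedy_tag_snps(
    snps: list[str],
    r2: dict[tuple[str, str], float],
    threshold: float = 0.8,
) -> list[str]:
    """Pick tag SNPs so that every SNP is a tag or has r2 >= threshold with one.

    `r2` maps unordered SNP pairs to their pairwise r-squared; every SNP named
    in `r2` must also appear in `snps`. Pairs that are absent are treated as
    unlinked.
    """
    # Each SNP can represent itself plus every partner in strong LD with it.
    covers = {s: {s} for s in snps}
    for (a, b), value in r2.items():
        if value >= threshold:
            covers[a].add(b)
            covers[b].add(a)

    untagged = set(snps)
    tags: list[str] = []
    while untagged:
        # Choose the SNP that captures the most still-untagged SNPs.
        best = max(snps, key=lambda s: len(covers[s] & untagged))
        tags.append(best)
        untagged -= covers[best]
    return tags
```

With `snps = ["rs1", "rs2", "rs3", "rs4"]` and `r2 = {("rs1", "rs2"): 0.9, ("rs2", "rs3"): 0.85, ("rs3", "rs4"): 0.2}`, the sketch selects `rs2` first, because it captures three SNPs, and then `rs4`, which is linked to nothing else.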
Randhawa, April Kaur; Horne, David J.; Adams, Mark D.; Shey, Muki; Barnholtz-Sloan, Jill; Mayanja-Kizza, Harriet; Kaplan, Gilla; Hanekom, Willem A.; Boom, W. Henry; Hawn, Thomas R.; Stein, Catherine M.
2012-01-01
Genetic epidemiological studies of complex diseases often rely on data from the International HapMap Consortium for identification of single nucleotide polymorphisms (SNPs), particularly those that tag haplotypes. However, little is known about the relevance of the African populations used to collect HapMap data for studies conducted elsewhere in Africa. Toll-like receptor (TLR) genes play a key role in susceptibility to various infectious diseases, including tuberculosis. We conducted full-exon sequencing in samples obtained from Uganda (n = 48) and South Africa (n = 48), in four genes in the TLR pathway: TLR2, TLR4, TLR6, and TIRAP. We identified one novel TIRAP SNP (with minor allele frequency [MAF] 3.2%) and a novel TLR6 SNP (MAF 8%) in the Ugandan population, and a TLR6 SNP that is unique to the South African population (MAF 14%). These SNPs were also not present in the 1000 Genomes data. Genotype and haplotype frequencies and linkage disequilibrium patterns in Uganda and South Africa were similar to those of African populations in the HapMap datasets. Multidimensional scaling analysis of polymorphisms in all four genes suggested broad overlap of all of the examined African populations. Based on these data, we propose that there is enough similarity among African populations represented in the HapMap database to justify initial SNP selection for genetic epidemiological studies in Uganda and South Africa. We also discovered three novel polymorphisms that appear to be population-specific and would only be detected by sequencing efforts. PMID:23112821
Johnson, Eric O; Hancock, Dana B; Levy, Joshua L; Gaddis, Nathan C; Saccone, Nancy L; Bierut, Laura J; Page, Grier P
2013-05-01
A great promise of publicly sharing genome-wide association data is the potential to create composite sets of controls. However, studies often use different genotyping arrays, and imputation to a common set of SNPs has shown substantial bias: a problem which has no broadly applicable solution. Based on the idea that using differing genotyped SNP sets as inputs creates differential imputation errors and thus bias in the composite set of controls, we examined the degree to which each of the following occurs: (1) imputation based on the union of genotyped SNPs (i.e., SNPs available on one or more arrays) results in bias, as evidenced by spurious associations (type 1 error) between imputed genotypes and arbitrarily assigned case/control status; (2) imputation based on the intersection of genotyped SNPs (i.e., SNPs available on all arrays) does not evidence such bias; and (3) imputation quality varies by the size of the intersection of genotyped SNP sets. Imputations were conducted in European Americans and African Americans with reference to HapMap phase II and III data. Imputation based on the union of genotyped SNPs across the Illumina 1M and 550v3 arrays showed spurious associations for 0.2 % of SNPs: ~2,000 false positives per million SNPs imputed. Biases remained problematic for very similar arrays (550v1 vs. 550v3) and were substantial for dissimilar arrays (Illumina 1M vs. Affymetrix 6.0). In all instances, imputing based on the intersection of genotyped SNPs (as few as 30 % of the total SNPs genotyped) eliminated such bias while still achieving good imputation quality.
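The difference between the union and the intersection of genotyped SNP sets can be illustrated with plain set operations. The sketch below is only a bookkeeping example, not the authors' pipeline; the function name, array labels, and SNP identifiers are hypothetical.

```python
from __future__ import annotations


def imputation_input_sets(
    array_snps: dict[str, set[str]],
) -> tuple[set[str], set[str]]:
    """Return (union, intersection) of the SNPs genotyped on each array.

    union: SNPs available on one or more arrays.
    intersection: SNPs available on all arrays.
    """
    snp_sets = list(array_snps.values())
    if not snp_sets:
        return set(), set()
    union = set().union(*snp_sets)
    intersection = set.intersection(*snp_sets)
    return union, intersection
```

Given `{"array_A": {"rs1", "rs2", "rs3"}, "array_B": {"rs2", "rs3", "rs4"}}`, the union holds four SNPs and the intersection holds two, so the intersection covers 50% of the SNPs genotyped overall. The abstract reports that intersections as small as 30% still gave good imputation quality.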
Muralidharan, Niveditha; Gulati, Reena; Misra, Durga Prasanna; Negi, Vir S
2018-02-01
The aim of the study was to look for any association of MTR 2756A>G and MTRR 66A>G gene polymorphisms with clinical phenotype, methotrexate (MTX) treatment response, and MTX-induced adverse events in South Indian Tamil patients with rheumatoid arthritis (RA). A total of 335 patients with RA were investigated. MTR 2756A>G gene polymorphism was analyzed by PCR-RFLP, and MTRR 66A>G SNP was analyzed by TaqMan 5' nuclease assay. The allele frequencies were compared with HapMap groups. MTR 2756G allele was found to be associated with risk of developing RA. The allele frequencies of MTR 2756A>G and MTRR 66A>G SNPs in controls differed significantly when compared with HapMap groups. Neither of the SNPs influenced the MTX treatment outcome and adverse effects. Neither of the SNPs seems to be associated with MTX treatment outcome and adverse events in South Indian Tamil patients with RA.
Sung, Yun J; Gu, C Charles; Tiwari, Hemant K; Arnett, Donna K; Broeckel, Ulrich; Rao, Dabeeru C
2012-07-01
Genotype imputation infers untyped single nucleotide polymorphisms (SNPs) that are present on a reference panel such as those from the HapMap Project. It is popular for increasing statistical power and comparing results across studies using different platforms. Imputation for African American populations is challenging because their linkage disequilibrium blocks are shorter and also because no ideal reference panel is available due to admixture. In this paper, we evaluated three imputation strategies for African Americans. The intersection strategy used a combined panel consisting of SNPs polymorphic in both CEU and YRI. The union strategy used a panel consisting of SNPs polymorphic in either CEU or YRI. The merge strategy merged results from two separate imputations, one using CEU and the other using YRI. Because recent investigators are increasingly using the data from the 1000 Genomes (1KG) Project for genotype imputation, we evaluated both 1KG-based imputations and HapMap-based imputations. We used 23,707 SNPs from chromosomes 21 and 22 on Affymetrix SNP Array 6.0 genotyped for 1,075 HyperGEN African Americans. We found that 1KG-based imputations provided a substantially larger number of variants than HapMap-based imputations, about three times as many common variants and eight times as many rare and low-frequency variants. This higher yield is expected because the 1KG panel includes more SNPs. Accuracy rates using 1KG data were slightly lower than those using HapMap data before filtering, but slightly higher after filtering. The union strategy provided the highest imputation yield with the next-highest accuracy. The intersection strategy provided the lowest imputation yield but the highest accuracy. The merge strategy provided the lowest imputation accuracy. We observed that SNPs polymorphic only in CEU had much lower accuracy, reducing the accuracy of the union strategy.
Our findings suggest that 1KG-based imputations can facilitate discovery of significant associations for SNPs across the whole MAF spectrum. Because the 1KG Project is still under way, we expect that later versions will provide better imputation performance. © 2012 Wiley Periodicals, Inc.
Medaka: a promising model animal for comparative population genomics
Matsumoto, Yoshifumi; Oota, Hiroki; Asaoka, Yoichi; Nishina, Hiroshi; Watanabe, Koji; Bujnicki, Janusz M; Oda, Shoji; Kawamura, Shoji; Mitani, Hiroshi
2009-01-01
Background Within-species genome diversity has been best studied in humans. The international HapMap project has revealed a tremendous amount of single-nucleotide polymorphisms (SNPs) among humans, many of which show signals of positive selection during human evolution. In most of the cases, however, functional differences between the alleles remain experimentally unverified due to the inherent difficulty of human genetic studies. It would therefore be highly useful to have a vertebrate model with the following characteristics: (1) high within-species genetic diversity, (2) a variety of gene-manipulation protocols already developed, and (3) a completely sequenced genome. Medaka (Oryzias latipes) and its congeneric species, tiny fresh-water teleosts distributed broadly in East and Southeast Asia, meet these criteria. Findings Using Oryzias species from 27 local populations, we conducted a simple screening of nonsynonymous SNPs for 11 genes with apparent orthology between medaka and humans. We found medaka SNPs for which the same sites in human orthologs are known to be highly differentiated among the HapMap populations. Importantly, some of these SNPs show signals of positive selection. Conclusion These results indicate that medaka is a promising model system for comparative population genomics exploring the functional and adaptive significance of allelic differentiations. PMID:19426554
Patterns of linkage disequilibrium at PARK16 may explain variances in genetic association studies.
Li, Huihua; Teo, Yik-Ying; Tan, Eng-King
2015-09-01
Reproducing genomewide association studies findings in different populations is challenging, because the reproducibility fundamentally relies on the similar patterns of linkage disequilibrium between the unknown causal variants and the genotyped single-nucleotide polymorphisms (SNPs). The PARK16 locus was reported to alter the risk of Parkinson's disease (PD) in genomewide association studies in Japanese and Caucasians. We evaluated the regional linkage disequilibrium pattern at PARK16 locus in Caucasians, Japanese, and Chinese from HapMap and Chinese, Malays, and Indians from the Singapore Genome Variation Project, using the traditional heatmaps and targeted analysis of PARK16 gene via Monte Carlo simulation through varLD scores of these ethnic groups. One hundred SNPs in Caucasians, 95 SNPs in Chinese, 78 SNPs in Japanese from HapMap, 86 SNPs in Chinese, 99 SNPs in Indians, and 97 SNPs in Malays from the Singapore Genome Variation Project were included. Our targeted analysis showed that the linkage disequilibrium pattern of SNPs close to rs947211 was similar in Caucasians and Asians, including Chinese, Japanese, and Malay (all P > 0.0001), whereas different linkage disequilibrium patterns around rs823128, rs823156, and rs708730 were found between Caucasians and these Asian groups (all P < 0.0001). Our study suggests a higher chance to detect the association between rs947211 and PD in Chinese, Malay, and other Caucasian groups because of the similar linkage disequilibrium pattern around rs947211. The associations between rs823128/rs823156/rs708730 and PD are more likely to be replicated in Chinese and Malay populations. © 2015 International Parkinson and Movement Disorder Society.
Empirical Distributions of F(ST) from Large-Scale Human Polymorphism Data
Elhaik, Eran
2012-01-01
Studies of the apportionment of human genetic variation have long established that most human variation is within population groups and that the additional variation between population groups is small but greatest when comparing different continental populations. These studies often used Wright's F(ST), which apportions the standardized variance in allele frequencies within and between population groups. Because local adaptations increase population differentiation, high F(ST) may be found at closely linked loci under selection and used to identify genes undergoing directional or heterotic selection. We re-examined these processes using HapMap data. We analyzed 3 million SNPs on 602 samples from eight worldwide populations and a consensus subset of 1 million SNPs found in all populations. We identified four major features of the data: First, a hierarchical F(ST) analysis showed that only a small fraction (12%) of the total genetic variation is distributed between continental populations and an even smaller fraction (1%) is found between intra-continental populations. Second, the global F(ST) distribution closely follows an exponential distribution. Third, although the overall F(ST) distribution is similarly shaped (inverse J), F(ST) distributions vary markedly by allele frequency when divided into non-overlapping groups by allele frequency range. Because the mean allele frequency is a crude indicator of allele age, these distributions mark the time-dependent change in genetic differentiation. Finally, the change in mean F(ST) of these groups is linear in allele frequency. These results suggest that investigating the extremes of the F(ST) distribution for each allele frequency group is more efficient for detecting selection. Consequently, we demonstrate that such extreme SNPs are more clustered along the chromosomes than expected from linkage disequilibrium for each allele frequency group. These genomic regions are therefore likely candidates for natural selection.
PMID:23185452
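The abstract does not state which F(ST) estimator was applied. As a hedged illustration of how Wright's statistic apportions variation for one biallelic SNP, the sketch below uses the basic heterozygosity form (H_T - H_S) / H_T with equally weighted subpopulations; the function name and the allele frequencies in the usage note are hypothetical, and sample-size corrections are ignored.

```python
from __future__ import annotations


def wright_fst(allele_freqs: list[float]) -> float:
    """Wright's F_ST for one biallelic SNP from subpopulation allele frequencies.

    Assumes equally weighted subpopulations and no sample-size correction.
    """
    if not allele_freqs:
        raise ValueError("at least one subpopulation is required")
    k = len(allele_freqs)
    p_bar = sum(allele_freqs) / k
    h_total = 2.0 * p_bar * (1.0 - p_bar)  # expected heterozygosity, pooled
    if h_total == 0.0:
        return 0.0  # monomorphic everywhere: no variation to apportion
    # Mean expected heterozygosity within subpopulations.
    h_within = sum(2.0 * p * (1.0 - p) for p in allele_freqs) / k
    return (h_total - h_within) / h_total
```

Identical frequencies, as in `wright_fst([0.5, 0.5])`, give 0; fixed differences, as in `wright_fst([0.0, 1.0])`, give 1; and `wright_fst([0.2, 0.8])` gives 0.36.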
LD2SNPing: linkage disequilibrium plotter and RFLP enzyme mining for tag SNPs
Chang, Hsueh-Wei; Chuang, Li-Yeh; Chang, Yan-Jhu; Cheng, Yu-Huei; Hung, Yu-Chen; Chen, Hsiang-Chi; Yang, Cheng-Hong
2009-01-01
Background Linkage disequilibrium (LD) mapping is commonly used to evaluate markers for genome-wide association studies. Most types of LD software focus strictly on LD analysis and visualization, but lack supporting services for genotyping. Results We developed a freeware called LD2SNPing, which provides a complete package of mining tools for genotyping and LD analysis environments. The software provides SNP ID- and gene-centric online retrievals for SNP information and tag SNP selection from dbSNP/NCBI and HapMap, respectively. Restriction fragment length polymorphism (RFLP) enzyme information for SNP genotype is available to all SNP IDs and tag SNPs. Single and multiple SNP inputs are possible in order to perform LD analysis by online retrieval from HapMap and NCBI. An LD statistics section provides D, D', r2, δQ, ρ, and the P values of the Hardy-Weinberg Equilibrium for each SNP marker, and Chi-square and likelihood-ratio tests for the pair-wise association of two SNPs in LD calculation. Finally, 2D and 3D plots, as well as plain-text output of the results, can be selected. Conclusion LD2SNPing thus provides a novel visualization environment for multiple SNP input, which facilitates SNP association studies. The software, user manual, and tutorial are freely available at . PMID:19500380
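Three of the LD statistics listed above (D, D', and r²) follow directly from haplotype and allele frequencies. The sketch below shows the standard two-locus definitions and is not taken from the LD2SNPing source code; the function name is hypothetical, and D' is reported as an absolute value.

```python
from __future__ import annotations


def ld_statistics(p_ab: float, p_a: float, p_b: float) -> tuple[float, float, float]:
    """Return (D, |D'|, r2) for two biallelic SNPs.

    p_ab: frequency of the haplotype carrying allele A at SNP 1 and B at SNP 2.
    p_a, p_b: frequencies of allele A at SNP 1 and allele B at SNP 2.
    """
    denom = p_a * (1.0 - p_a) * p_b * (1.0 - p_b)
    if denom <= 0.0:
        raise ValueError("both SNPs must be polymorphic")
    d = p_ab - p_a * p_b
    # D is normalised by its largest attainable magnitude given the allele
    # frequencies; the bound depends on the sign of D.
    if d >= 0.0:
        d_max = min(p_a * (1.0 - p_b), (1.0 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1.0 - p_a) * (1.0 - p_b))
    d_prime = abs(d) / d_max
    r2 = d * d / denom
    return d, d_prime, r2
```

When the two alleles always travel together, for instance `ld_statistics(0.5, 0.5, 0.5)`, both |D'| and r² equal 1; when the haplotype frequency equals the product of the allele frequencies, all three statistics are 0.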
Hajiloo, Mohsen; Sapkota, Yadav; Mackey, John R; Robson, Paula; Greiner, Russell; Damaraju, Sambasivarao
2013-02-22
Population stratification is a systematic difference in allele frequencies between subpopulations. This can lead to spurious association findings in the case-control genome wide association studies (GWASs) used to identify single nucleotide polymorphisms (SNPs) associated with disease-linked phenotypes. Methods such as self-declared ancestry, ancestry informative markers, genomic control, structured association, and principal component analysis are used to assess and correct population stratification but each has limitations. We provide an alternative technique to address population stratification. We propose a novel machine learning method, ETHNOPRED, which uses the genotype and ethnicity data from the HapMap project to learn ensembles of disjoint decision trees, capable of accurately predicting an individual's continental and sub-continental ancestry. To predict an individual's continental ancestry, ETHNOPRED produced an ensemble of 3 decision trees involving a total of 10 SNPs, with 10-fold cross validation accuracy of 100% using HapMap II dataset. We extended this model to involve 29 disjoint decision trees over 149 SNPs, and showed that this ensemble has an accuracy of ≥ 99.9%, even if some of those 149 SNP values were missing. On an independent dataset, predominantly of Caucasian origin, our continental classifier showed 96.8% accuracy and improved genomic control's λ from 1.22 to 1.11. We next used the HapMap III dataset to learn classifiers to distinguish European subpopulations (North-Western vs. Southern), East Asian subpopulations (Chinese vs. Japanese), African subpopulations (Eastern vs. Western), North American subpopulations (European vs. Chinese vs. African vs. Mexican vs. Indian), and Kenyan subpopulations (Luhya vs. Maasai). 
In these cases, ETHNOPRED produced ensembles of 3, 39, 21, 11, and 25 disjoint decision trees, respectively involving 31, 502, 526, 242 and 271 SNPs, with 10-fold cross validation accuracy of 86.5% ± 2.4%, 95.6% ± 3.9%, 95.6% ± 2.1%, 98.3% ± 2.0%, and 95.9% ± 1.5%. However, ETHNOPRED was unable to produce a classifier that can accurately distinguish Chinese in Beijing vs. Chinese in Denver. ETHNOPRED is a novel technique for producing classifiers that can identify an individual's continental and sub-continental heritage, based on a small number of SNPs. We show that its learned classifiers are simple, cost-efficient, accurate, transparent, flexible, fast, applicable to large scale GWASs, and robust to missing values.
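The genomic control λ quoted above (1.22 before correction, 1.11 after) is conventionally the median of the observed association chi-square statistics divided by the median of a one-degree-of-freedom chi-square distribution, about 0.4549. The sketch below shows that calculation only; it is not part of ETHNOPRED, and the function name and input values are hypothetical.

```python
from __future__ import annotations

from statistics import median

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def genomic_control_lambda(chi2_stats: list[float]) -> float:
    """Inflation factor: observed median chi-square over the expected median."""
    if not chi2_stats:
        raise ValueError("at least one test statistic is required")
    return median(chi2_stats) / CHI2_1DF_MEDIAN
```

A value near 1 suggests little inflation of test statistics, while larger values are usually read as a sign of population stratification or other confounding.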
Population differences in the rate of proliferation of international HapMap cell lines.
Stark, Amy L; Zhang, Wei; Zhou, Tong; O'Donnell, Peter H; Beiswanger, Christine M; Huang, R Stephanie; Cox, Nancy J; Dolan, M Eileen
2010-12-10
The International HapMap Project is a resource for researchers containing genotype, sequencing, and expression information for EBV-transformed lymphoblastoid cell lines derived from populations across the world. The expansion of the HapMap beyond the four initial populations of Phase 2, referred to as Phase 3, has increased the sample number and ethnic diversity available for investigation. However, differences in the rate of cellular proliferation between the populations can serve as confounders in phenotype-genotype studies using these cell lines. Within the Phase 2 populations, the JPT and CHB cell lines grow faster (p < 0.0001) than the CEU or YRI cell lines. Phase 3 YRI cell lines grow significantly slower than Phase 2 YRI lines (p < 0.0001), with no widespread genetic differences based on common SNPs. In addition, we found significant growth differences between the cell lines in the Phase 2 ASN populations and the Han Chinese from the Denver metropolitan area panel in Phase 3 (p < 0.0001). Therefore, studies that separate HapMap panels into discovery and replication sets must take this into consideration. Copyright © 2010 The American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.
Cloud Computing-Based TagSNP Selection Algorithm for Human Genome Data
Hung, Che-Lun; Chen, Wen-Pei; Hua, Guan-Jie; Zheng, Huiru; Tsai, Suh-Jen Jane; Lin, Yaw-Ling
2015-01-01
Single nucleotide polymorphisms (SNPs) play a fundamental role in human genetic variation and are used in medical diagnostics, phylogeny construction, and drug design. They provide the highest-resolution genetic fingerprint for identifying disease associations and human features. Haplotypes are regions of linked genetic variants that are closely spaced on the genome and tend to be inherited together. Genetics research has revealed SNPs within certain haplotype blocks that introduce few distinct common haplotypes into most of the population. Haplotype block structures are used in association-based methods to map disease genes. In this paper, we propose an efficient algorithm for identifying haplotype blocks in the genome. In chromosomal haplotype data retrieved from the HapMap project website, the proposed algorithm identified longer haplotype blocks than an existing algorithm. To enhance its performance, we extended the proposed algorithm into a parallel algorithm that copies data in parallel via the Hadoop MapReduce framework. The proposed MapReduce-paralleled combinatorial algorithm performed well on real-world data obtained from the HapMap dataset; the improvement in computational efficiency was proportional to the number of processors used. PMID:25569088
Controversial opinion: evaluation of EGR1 and LAMA2 loci for high myopia in Chinese populations.
Lin, Fang-yu; Huang, Zhu; Lu, Ning; Chen, Wei; Fang, Hui; Han, Wei
2016-03-01
Functional studies have suggested the important role of early growth response 1 (EGR1) and Laminin α2-chain (LAMA2) in human eye development. Genetic studies have reported a significant association of the single nucleotide polymorphism (SNP) in the LAMA2 gene with myopia. This study aimed to evaluate the association of the tagging SNPs (tSNPs) in the EGR1 and LAMA2 genes with high myopia in two independent Han Chinese populations. Four tSNPs (rs11743810 in the EGR1 gene; rs2571575, rs9321170, and rs1889891 in the LAMA2 gene) were selected, according to the HapMap database (http://hapmap.ncbi.nlm.nih.gov), and were genotyped using the ligase detection reaction (LDR) approach for 167 Han Chinese nuclear families with extremely highly myopic offspring (<-10.0 diopters) and an independent group with 485 extremely highly myopic cases (<-10.0 diopters) and 499 controls. Direct sequencing was used to confirm the LDR results in twenty randomly selected subjects. Family-based association analysis was performed using the family-based association test (FBAT) software package (Version 1.5.5). Population-based association analysis was performed using the Chi-square test. The association analysis power was estimated using online software (http://design.cs.ucla.edu). The FBAT demonstrated that all four tSNPs tested did not show association with high myopia (P>0.05). Haplotype analysis of tSNPs in the LAMA2 genes also did not show a significant association (P>0.05). Meanwhile, population-based association analysis also showed no significant association results with high myopia (P>0.05). On the basis of our family- and population-based analyses for the Han Chinese population, we did not find positive association signals of the four SNPs in the LAMA2 and EGR1 genes with high myopia.
Novel and efficient tag SNPs selection algorithms.
Chen, Wen-Pei; Hung, Che-Lun; Tsai, Suh-Jen Jane; Lin, Yaw-Ling
2014-01-01
SNPs are the most abundant forms of genetic variations amongst species; the association studies between complex diseases and SNPs or haplotypes have received great attention. However, these studies are restricted by the cost of genotyping all SNPs; thus, it is necessary to find smaller subsets, or tag SNPs, representing the rest of the SNPs. Existing tag SNP selection algorithms, however, are notoriously time-consuming. An efficient algorithm for tag SNP selection was presented, which was applied to analyze the HapMap YRI data. The experimental results show that the proposed algorithm can achieve better performance than the existing tag SNP selection algorithms; in most cases, this proposed algorithm is at least ten times faster than the existing methods. In many cases, when the redundant ratio of the block is high, the proposed algorithm can even be thousands of times faster than the previously known methods. Tools and web services for haplotype block analysis integrated by the Hadoop MapReduce framework are also developed using the proposed algorithm as computation kernels.
WASP: a Web-based Allele-Specific PCR assay designing tool for detecting SNPs and mutations
Wangkumhang, Pongsakorn; Chaichoompu, Kridsadakorn; Ngamphiw, Chumpol; Ruangrit, Uttapong; Chanprasert, Juntima; Assawamakin, Anunchai; Tongsima, Sissades
2007-01-01
Background Allele-specific (AS) Polymerase Chain Reaction is a convenient and inexpensive method for genotyping Single Nucleotide Polymorphisms (SNPs) and mutations. It is applied in many recent studies including population genetics, molecular genetics and pharmacogenomics. Creating primers with known AS primer design tools is cumbersome for inexperienced users, since information about the SNP/mutation must be acquired from public databases prior to the design. Furthermore, most of these tools do not offer mismatch enhancement for the designed primers. The available web applications do not provide a user-friendly graphical input interface or intuitive visualization of their primer results. Results This work presents a web-based AS primer design application called WASP. This tool can efficiently design AS primers for human SNPs as well as mutations. To assist scientists with collecting necessary information about target polymorphisms, this tool provides a local SNP database containing over 10 million SNPs of various populations from public domain databases, namely NCBI dbSNP, HapMap and JSNP. This database is tightly integrated with the tool so that users can perform the design for existing SNPs without going off the site. To guarantee specificity of AS primers, the proposed system incorporates a primer specificity enhancement technique widely used in experimental protocols. In particular, WASP makes use of different destabilizing effects by introducing one deliberate 'mismatch' at the penultimate (second to last of the 3'-end) base of AS primers to improve the resulting AS primers. Furthermore, WASP offers a graphical user interface through scalable vector graphics (SVG) drawings that allow users to select SNPs and graphically visualize designed primers and their conditions. Conclusion WASP offers a tool for designing AS primers for both SNPs and mutations.
By integrating the database for known SNPs (using gene ID or rs number), this tool facilitates the awkward process of getting flanking sequences and other related information from public SNP databases. It takes into account the underlying destabilizing effect to ensure the effectiveness of designed primers. With user-friendly SVG interface, WASP intuitively presents resulting designed primers, which assist users to export or to make further adjustment to the design. This software can be freely accessed at . PMID:17697334
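The penultimate-base mismatch described above can be sketched in a few lines. The substitution table below is a made-up placeholder, since the abstract does not give WASP's actual mismatch rules, and the function name, primer length, and sequences are likewise hypothetical. The sketch builds one forward primer per allele: the 3'-terminal base is the allele itself, and the base before it is deliberately changed.

```python
from __future__ import annotations

# Hypothetical substitution table: any base other than the original creates
# a mismatch. WASP's real choice of destabilizing base is not given here.
MISMATCH = {"A": "C", "C": "A", "G": "T", "T": "G"}


def allele_specific_primers(
    upstream: str, alleles: tuple[str, str], length: int = 20
) -> dict[str, str]:
    """Build one forward AS primer per allele from the sequence 5' of the SNP.

    The last primer base is the allele; the penultimate base carries the
    deliberate mismatch.
    """
    upstream = upstream.upper()
    if length < 2 or len(upstream) < length - 1:
        raise ValueError("upstream sequence is too short for this primer length")
    body = upstream[-(length - 1):]        # bases immediately 5' of the SNP
    body = body[:-1] + MISMATCH[body[-1]]  # mismatch at the penultimate primer base
    return {allele.upper(): body + allele.upper() for allele in alleles}
```

For the toy input `allele_specific_primers("ACGTACGTAC", ("A", "G"), length=6)`, the five bases before the SNP are `CGTAC`; the final `C` is swapped for `A`, giving primers `CGTAAA` and `CGTAAG`. A real design would also check melting temperature and the reverse primer, which this sketch omits.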
Aurora-A as a Modifier of Breast Cancer Risk in BRCA 1/2 Mutation Carriers
2007-06-01
Dieter Schaefer, Institute of Human Genetics, University of Frankfurt, Frankfurt, Germany; Norbert Arnold, University of Schleswig-Holstein, Campus... [Figure 3 | The FGFR2 locus. a, Map of the whole FGFR2 gene, viewed relative to common SNPs on HapMap. Figure labels: Intron 2; Opossum, Mouse, Rat, Cow, Dog; Intron 1.]
HapMap scanning of novel human minor histocompatibility antigens.
Kamei, Michi; Nannya, Yasuhito; Torikai, Hiroki; Kawase, Takakazu; Taura, Kenjiro; Inamoto, Yoshihiro; Takahashi, Taro; Yazaki, Makoto; Morishima, Satoko; Tsujimura, Kunio; Miyamura, Koichi; Ito, Tetsuya; Togari, Hajime; Riddell, Stanley R; Kodera, Yoshihisa; Morishima, Yasuo; Takahashi, Toshitada; Kuzushima, Kiyotaka; Ogawa, Seishi; Akatsuka, Yoshiki
2009-05-21
Minor histocompatibility antigens (mHags) are molecular targets of allo-immunity associated with hematopoietic stem cell transplantation (HSCT) and involved in graft-versus-host disease, but they also have beneficial antitumor activity. mHags are typically defined by host SNPs that are not shared by the donor and are immunologically recognized by cytotoxic T cells isolated from post-HSCT patients. However, the number of molecularly identified mHags is still too small to allow prospective studies of their clinical importance in transplantation medicine, mostly due to the lack of an efficient method for isolation. Here we show that when combined with conventional immunologic assays, the large data set from the International HapMap Project can be directly used for genetic mapping of novel mHags. Based on the immunologically determined mHag status in HapMap panels, a target mHag locus can be uniquely mapped through whole genome association scanning taking advantage of the unprecedented resolution and power obtained with more than 3 000 000 markers. The feasibility of our approach could be supported by extensive simulations and further confirmed by actually isolating 2 novel mHags as well as 1 previously identified example. The HapMap data set represents an invaluable resource for investigating human variation, with obvious applications in genetic mapping of clinically relevant human traits.
Effect of read-mapping biases on detecting allele-specific expression from RNA-sequencing data
Degner, Jacob F.; Marioni, John C.; Pai, Athma A.; Pickrell, Joseph K.; Nkadori, Everlyne; Gilad, Yoav; Pritchard, Jonathan K.
2009-01-01
Motivation: Next-generation sequencing has become an important tool for genome-wide quantification of DNA and RNA. However, a major technical hurdle lies in the need to map short sequence reads back to their correct locations in a reference genome. Here, we investigate the impact of SNP variation on the reliability of read-mapping in the context of detecting allele-specific expression (ASE). Results: We generated 16 million 35 bp reads from mRNA of each of two HapMap Yoruba individuals. When we mapped these reads to the human genome we found that, at heterozygous SNPs, there was a significant bias toward higher mapping rates of the allele in the reference sequence, compared with the alternative allele. Masking known SNP positions in the genome sequence eliminated the reference bias but, surprisingly, did not lead to more reliable results overall. We find that even after masking, ∼5–10% of SNPs still have an inherent bias toward more effective mapping of one allele. Filtering out inherently biased SNPs removes 40% of the top signals of ASE. The remaining SNPs showing ASE are enriched in genes previously known to harbor cis-regulatory variation or known to show uniparental imprinting. Our results have implications for a variety of applications involving detection of alternate alleles from short-read sequence data. Availability: Scripts, written in Perl and R, for simulating short reads, masking SNP variation in a reference genome and analyzing the simulation output are available upon request from JFD. Raw short read data were deposited in GEO (http://www.ncbi.nlm.nih.gov/geo/) under accession number GSE18156. Contact: jdegner@uchicago.edu; marioni@uchicago.edu; gilad@uchicago.edu; pritch@uchicago.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:19808877
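The masking step and the reference-bias measure discussed in this abstract can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' Perl/R scripts: the function names are hypothetical, positions are taken to be 0-based, and "N" is assumed as the masking character because the abstract does not say what was substituted.

```python
def mask_snps(reference, snp_positions, mask_char="N"):
    """Return a copy of `reference` with every listed 0-based SNP position
    replaced by `mask_char`, so that neither allele matches the reference."""
    bases = list(reference)
    for pos in snp_positions:
        if not 0 <= pos < len(bases):
            raise IndexError(f"SNP position {pos} is outside the sequence")
        bases[pos] = mask_char
    return "".join(bases)


def reference_allele_fraction(ref_reads, alt_reads):
    """Fraction of reads at a heterozygous SNP that carry the reference
    allele; with no mapping bias and no true ASE this is expected near 0.5."""
    total = ref_reads + alt_reads
    if total == 0:
        raise ValueError("no reads cover this SNP")
    return ref_reads / total


print(mask_snps("ACGTACGT", [1, 6]))
print(reference_allele_fraction(60, 40))
```

A fraction that stays away from 0.5 after masking, at a site with no expected allelic imbalance, would correspond to the inherent per-SNP bias the abstract reports.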
2013-01-01
Background Population stratification is a systematic difference in allele frequencies between subpopulations. This can lead to spurious association findings in the case–control genome wide association studies (GWASs) used to identify single nucleotide polymorphisms (SNPs) associated with disease-linked phenotypes. Methods such as self-declared ancestry, ancestry informative markers, genomic control, structured association, and principal component analysis are used to assess and correct population stratification but each has limitations. We provide an alternative technique to address population stratification. Results We propose a novel machine learning method, ETHNOPRED, which uses the genotype and ethnicity data from the HapMap project to learn ensembles of disjoint decision trees, capable of accurately predicting an individual’s continental and sub-continental ancestry. To predict an individual’s continental ancestry, ETHNOPRED produced an ensemble of 3 decision trees involving a total of 10 SNPs, with 10-fold cross validation accuracy of 100% using HapMap II dataset. We extended this model to involve 29 disjoint decision trees over 149 SNPs, and showed that this ensemble has an accuracy of ≥ 99.9%, even if some of those 149 SNP values were missing. On an independent dataset, predominantly of Caucasian origin, our continental classifier showed 96.8% accuracy and improved genomic control’s λ from 1.22 to 1.11. We next used the HapMap III dataset to learn classifiers to distinguish European subpopulations (North-Western vs. Southern), East Asian subpopulations (Chinese vs. Japanese), African subpopulations (Eastern vs. Western), North American subpopulations (European vs. Chinese vs. African vs. Mexican vs. Indian), and Kenyan subpopulations (Luhya vs. Maasai). 
In these cases, ETHNOPRED produced ensembles of 3, 39, 21, 11, and 25 disjoint decision trees, respectively involving 31, 502, 526, 242 and 271 SNPs, with 10-fold cross validation accuracy of 86.5% ± 2.4%, 95.6% ± 3.9%, 95.6% ± 2.1%, 98.3% ± 2.0%, and 95.9% ± 1.5%. However, ETHNOPRED was unable to produce a classifier that can accurately distinguish Chinese in Beijing vs. Chinese in Denver. Conclusions ETHNOPRED is a novel technique for producing classifiers that can identify an individual’s continental and sub-continental heritage, based on a small number of SNPs. We show that its learned classifiers are simple, cost-efficient, accurate, transparent, flexible, fast, applicable to large scale GWASs, and robust to missing values. PMID:23432980
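How an ensemble of disjoint decision trees can remain usable when some SNP values are missing is easiest to see in a toy example. The sketch below is an assumption-laden illustration, not ETHNOPRED itself: each "tree" is reduced to a single-SNP lookup, the SNP identifiers and labels are invented, and letting a tree abstain when its SNP is missing is one plausible way to handle missing data that the abstract does not confirm.

```python
from collections import Counter


def make_stump(snp_id, label_by_genotype):
    """Return a one-SNP 'tree' (hypothetical stand-in for a learned tree).

    It abstains (returns None) when its SNP is missing or unrecognised.
    """
    def tree(genotypes):
        genotype = genotypes.get(snp_id)
        if genotype is None:
            return None
        return label_by_genotype.get(genotype)
    return tree


def ensemble_predict(trees, genotypes):
    """Majority vote over the trees that are able to vote."""
    votes = [v for v in (tree(genotypes) for tree in trees) if v is not None]
    if not votes:
        return None
    return Counter(votes).most_common(1)[0][0]


# Because the trees use disjoint SNPs, one missing SNP silences one tree only.
trees = [
    make_stump("snp_a", {"AA": "CEU", "AG": "CEU", "GG": "YRI"}),
    make_stump("snp_b", {"CC": "YRI", "CT": "CEU", "TT": "CEU"}),
    make_stump("snp_c", {"GG": "YRI", "GT": "YRI", "TT": "CEU"}),
]
print(ensemble_predict(trees, {"snp_a": "AA", "snp_b": "CT"}))  # snp_c missing
```

The design point is the disjointness: since no SNP is shared between trees, a missing value can remove at most one vote from the ensemble.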
2010-10-01
reproducibility of genotype calls among the four batches by comparing the HapMap samples across batches. We also calculated identity-by-descent (IBD...used to aid clinicians in personalizing dosage to improve the therapeutic index of radiotherapy treatment for prostate cancer.
2011-10-01
the HapMap samples across batches. We also calculated identity-by-descent (IBD) and identity-by-state (IBS) measures to confirm the identity of the...development of adverse events following radiotherapy. Such a tool could be used to aid clinicians in personalizing dosage to improve the therapeutic index of
Howie, Bryan N.; Donnelly, Peter; Marchini, Jonathan
2009-01-01
Genotype imputation methods are now being widely used in the analysis of genome-wide association studies. Most imputation analyses to date have used the HapMap as a reference dataset, but new reference panels (such as controls genotyped on multiple SNP chips and densely typed samples from the 1,000 Genomes Project) will soon allow a broader range of SNPs to be imputed with higher accuracy, thereby increasing power. We describe a genotype imputation method (IMPUTE version 2) that is designed to address the challenges presented by these new datasets. The main innovation of our approach is a flexible modelling framework that increases accuracy and combines information across multiple reference panels while remaining computationally feasible. We find that IMPUTE v2 attains higher accuracy than other methods when the HapMap provides the sole reference panel, but that the size of the panel constrains the improvements that can be made. We also find that imputation accuracy can be greatly enhanced by expanding the reference panel to contain thousands of chromosomes and that IMPUTE v2 outperforms other methods in this setting at both rare and common SNPs, with overall error rates that are 15%–20% lower than those of the closest competing method. One particularly challenging aspect of next-generation association studies is to integrate information across multiple reference panels genotyped on different sets of SNPs; we show that our approach to this problem has practical advantages over other suggested solutions. PMID:19543373
Evaluating the association of common APOA2 variants with type 2 diabetes
Duesing, Konsta; Charpentier, Guillaume; Marre, Michel; Tichet, Jean; Hercberg, Serge; Balkau, Beverley; Froguel, Philippe; Gibson, Fernando
2009-01-01
Background APOA2 is a positional and biological candidate gene for type 2 diabetes at the chromosome 1q21-q24 susceptibility locus. The aim of this study was to examine if HapMap phase II tag SNPs in APOA2 are associated with type 2 diabetes and quantitative traits in French Caucasian subjects. Methods We genotyped the three HapMap phase II tagging SNPs (rs6413453, rs5085 and rs5082) required to capture the common variation spanning the APOA2 locus in our type 2 diabetes case-control cohort comprising 3,093 French Caucasian subjects. The association between these variants and quantitative traits was also examined in the normoglycaemic adults of the control cohort. In addition, meta-analysis of publicly available whole genome association data was performed. Results None of the APOA2 tag SNPs were associated with type 2 diabetes in the French Caucasian case-control cohort (rs6413453, P = 0.619; rs5085, P = 0.245; rs5082, P = 0.591). However, rs5082 was marginally associated with total cholesterol levels (P = 0.026) and waist-to-hip ratio (P = 0.029). The meta-analysis of data from 12,387 subjects confirmed our finding that common variation at the APOA2 locus is not associated with type 2 diabetes. Conclusion The available data does not support a role for common variants in APOA2 on type 2 diabetes susceptibility or related quantitative traits in Northern Europeans. PMID:19216768
Evaluating the association of common APOA2 variants with type 2 diabetes.
Duesing, Konsta; Charpentier, Guillaume; Marre, Michel; Tichet, Jean; Hercberg, Serge; Balkau, Beverley; Froguel, Philippe; Gibson, Fernando
2009-02-13
APOA2 is a positional and biological candidate gene for type 2 diabetes at the chromosome 1q21-q24 susceptibility locus. The aim of this study was to examine if HapMap phase II tag SNPs in APOA2 are associated with type 2 diabetes and quantitative traits in French Caucasian subjects. We genotyped the three HapMap phase II tagging SNPs (rs6413453, rs5085 and rs5082) required to capture the common variation spanning the APOA2 locus in our type 2 diabetes case-control cohort comprising 3,093 French Caucasian subjects. The association between these variants and quantitative traits was also examined in the normoglycaemic adults of the control cohort. In addition, meta-analysis of publicly available whole genome association data was performed. None of the APOA2 tag SNPs were associated with type 2 diabetes in the French Caucasian case-control cohort (rs6413453, P = 0.619; rs5085, P = 0.245; rs5082, P = 0.591). However, rs5082 was marginally associated with total cholesterol levels (P = 0.026) and waist-to-hip ratio (P = 0.029). The meta-analysis of data from 12,387 subjects confirmed our finding that common variation at the APOA2 locus is not associated with type 2 diabetes. The available data does not support a role for common variants in APOA2 on type 2 diabetes susceptibility or related quantitative traits in Northern Europeans.
The diploid genome sequence of an Asian individual
Wang, Jun; Wang, Wei; Li, Ruiqiang; Li, Yingrui; Tian, Geng; Goodman, Laurie; Fan, Wei; Zhang, Junqing; Li, Jun; Zhang, Juanbin; Guo, Yiran; Feng, Binxiao; Li, Heng; Lu, Yao; Fang, Xiaodong; Liang, Huiqing; Du, Zhenglin; Li, Dong; Zhao, Yiqing; Hu, Yujie; Yang, Zhenzhen; Zheng, Hancheng; Hellmann, Ines; Inouye, Michael; Pool, John; Yi, Xin; Zhao, Jing; Duan, Jinjie; Zhou, Yan; Qin, Junjie; Ma, Lijia; Li, Guoqing; Yang, Zhentao; Zhang, Guojie; Yang, Bin; Yu, Chang; Liang, Fang; Li, Wenjie; Li, Shaochuan; Li, Dawei; Ni, Peixiang; Ruan, Jue; Li, Qibin; Zhu, Hongmei; Liu, Dongyuan; Lu, Zhike; Li, Ning; Guo, Guangwu; Zhang, Jianguo; Ye, Jia; Fang, Lin; Hao, Qin; Chen, Quan; Liang, Yu; Su, Yeyang; san, A.; Ping, Cuo; Yang, Shuang; Chen, Fang; Li, Li; Zhou, Ke; Zheng, Hongkun; Ren, Yuanyuan; Yang, Ling; Gao, Yang; Yang, Guohua; Li, Zhuo; Feng, Xiaoli; Kristiansen, Karsten; Wong, Gane Ka-Shu; Nielsen, Rasmus; Durbin, Richard; Bolund, Lars; Zhang, Xiuqing; Li, Songgang; Yang, Huanming; Wang, Jian
2009-01-01
Here we present the first diploid genome sequence of an Asian individual. The genome was sequenced to 36-fold average coverage using massively parallel sequencing technology. We aligned the short reads onto the NCBI human reference genome to 99.97% coverage, and guided by the reference genome, we used uniquely mapped reads to assemble a high-quality consensus sequence for 92% of the Asian individual's genome. We identified approximately 3 million single-nucleotide polymorphisms (SNPs) inside this region, of which 13.6% were not in the dbSNP database. Genotyping analysis showed that SNP identification had high accuracy and consistency, indicating the high sequence quality of this assembly. We also carried out heterozygote phasing and haplotype prediction against HapMap CHB and JPT haplotypes (Chinese and Japanese, respectively), sequence comparison with the two available individual genomes (J. D. Watson and J. C. Venter), and structural variation identification. These variations were considered for their potential biological impact. Our sequence data and analyses demonstrate the potential usefulness of next-generation sequencing technologies for personal genomics. PMID:18987735
Allelic expression mapping across cellular lineages to establish impact of non-coding SNPs
Adoue, Veronique; Schiavi, Alicia; Light, Nicholas; Almlöf, Jonas Carlsson; Lundmark, Per; Ge, Bing; Kwan, Tony; Caron, Maxime; Rönnblom, Lars; Wang, Chuan; Chen, Shu-Huang; Goodall, Alison H; Cambien, Francois; Deloukas, Panos; Ouwehand, Willem H; Syvänen, Ann-Christine; Pastinen, Tomi
2014-01-01
Most complex disease-associated genetic variants are located in non-coding regions and are therefore thought to be regulatory in nature. Association mapping of differential allelic expression (AE) is a powerful method to identify SNPs with direct cis-regulatory impact (cis-rSNPs). We used AE mapping to identify cis-rSNPs regulating gene expression in 55 and 63 HapMap lymphoblastoid cell lines from a Caucasian and an African population, respectively, 70 fibroblast cell lines, and 188 purified monocyte samples and found 40–60% of these cis-rSNPs to be shared across cell types. We uncover a new class of cis-rSNPs, which disrupt footprint-derived de novo motifs that are predominantly bound by repressive factors and are implicated in disease susceptibility through overlaps with GWAS SNPs. Finally, we provide the proof-of-principle for a new approach for genome-wide functional validation of transcription factor–SNP interactions. By perturbing NFκB action in lymphoblasts, we identified 489 cis-regulated transcripts with altered AE after NFκB perturbation. Altogether, we perform a comprehensive analysis of cis-variation in four cell populations and provide new tools for the identification of functional variants associated to complex diseases. PMID:25326100
Stefanaki, Irene; Panagiotou, Orestis A; Kodela, Elisavet; Gogas, Helen; Kypreou, Katerina P; Chatzinasiou, Foteini; Nikolaou, Vasiliki; Plaka, Michaela; Kalfa, Iro; Antoniou, Christina; Ioannidis, John P A; Evangelou, Evangelos; Stratigos, Alexander J
2013-01-01
Genetic association studies have revealed numerous polymorphisms conferring susceptibility to melanoma. We aimed to replicate previously discovered melanoma-associated single-nucleotide polymorphisms (SNPs) in a Greek case-control population, and examine their predictive value. Based on a field synopsis of genetic variants of melanoma (MelGene), we genotyped 284 patients and 284 controls at 34 melanoma-associated SNPs of which 19 were derived from GWAS. We tested each one of the 33 SNPs passing quality control for association with melanoma both with and without accounting for the presence of well-established phenotypic risk factors. We compared the risk allele frequencies between the Greek population and the HapMap CEU sample. Finally, we evaluated the predictive ability of the replicated SNPs. Risk allele frequencies were significantly lower compared to the HapMap CEU for eight SNPs (rs16891982--SLC45A2, rs12203592--IRF4, rs258322--CDK10, rs1805007--MC1R, rs1805008--MC1R, rs910873--PIGU, rs17305573--PIGU, and rs1885120--MTAP) and higher for one SNP (rs6001027--PLA2G6) indicating a different profile of genetic susceptibility in the studied population. Previously identified effect estimates modestly correlated with those found in our population (r = 0.72, P<0.0001). The strongest associations were observed for rs401681-T in CLPTM1L (odds ratio [OR] 1.60, 95% CI 1.22-2.10; P = 0.001), rs16891982-C in SLC45A2 (OR 0.51, 95% CI 0.34-0.76; P = 0.001), and rs1805007-T in MC1R (OR 4.38, 95% CI 2.03-9.43; P = 2×10⁻⁵). Nominally statistically significant associations were seen also for another 5 variants (rs258322-T in CDK10, rs1805005-T in MC1R, rs1885120-C in MYH7B, rs2218220-T in MTAP and rs4911442-G in the ASIP region). The addition of all SNPs with nominal significance to a clinical non-genetic model did not substantially improve melanoma risk prediction (AUC for clinical model 83.3% versus 83.9%, p = 0.66). 
Overall, our study has validated genetic variants that are likely to contribute to melanoma susceptibility in the Greek population.
Genetic ancestry is associated with colorectal adenomas and adenocarcinomas in Latino populations.
Hernandez-Suarez, Gustavo; Sanabria, Maria Carolina; Serrano, Marta; Herran, Oscar F; Perez, Jesus; Plata, Jose L; Zabaleta, Jovanny; Tenesa, Albert
2014-10-01
Colorectal cancer rates in Latin American countries are less than half of those observed in the United States. Latin Americans are the result of generations of admixture among Native American, European, and African individuals. The potential role of genetic admixture in colorectal carcinogenesis has not been examined. We evaluate the association of genetic ancestry with colorectal neoplasms in 190 adenocarcinomas, 113 sporadic adenomas and 243 age- and sex-matched controls enrolled in a multicentric case-control study in Colombia. Individual ancestral genetic fractions were estimated using the STRUCTURE software, based on allele frequencies and assuming three distinct population origins. We used the Illumina Cancer Panel to genotype 1,421 sparse single-nucleotide polymorphisms (SNPs), and used the Northern and Western European ancestry, LWJ, and Han Chinese in Beijing, China populations from the HapMap project as references. A total of 678 autosomal SNPs overlapped with the HapMap data set SNPs and were used for ancestry estimations. African mean ancestry fraction was higher in adenomas (0.13, 95% confidence interval (95% CI)=0.11-0.15) and cancer cases (0.14, 95% CI=0.12-0.16) compared with controls (0.11, 95% CI=0.10-0.12). Conditional logistic regression analysis, controlling for known risk factors, showed a positive association of African ancestry per 10% increase with both colorectal adenoma (odds ratio (OR)=1.12, 95% CI=0.97-1.30) and adenocarcinoma (OR=1.19, 95% CI=1.05-1.35). In conclusion, increased African ancestry (or variants linked to it) contributes to the increased susceptibility to colorectal cancer in admixed Latin American populations.
Khatkar, Mehar S.; Zenger, Kyall R.; Hobbs, Matthew; Hawken, Rachel J.; Cavanagh, Julie A. L.; Barris, Wes; McClintock, Alexander E.; McClintock, Sara; Thomson, Peter C.; Tier, Bruce; Nicholas, Frank W.; Raadsma, Herman W.
2007-01-01
Analysis of data on 1000 Holstein–Friesian bulls genotyped for 15,036 single-nucleotide polymorphisms (SNPs) has enabled genomewide identification of haplotype blocks and tag SNPs. A final subset of 9195 SNPs in Hardy–Weinberg equilibrium and mapped on autosomes on the bovine sequence assembly (release Btau 3.1) was used in this study. The average intermarker spacing was 251.8 kb. The average minor allele frequency (MAF) was 0.29 (0.05–0.5). Following recent precedents in human HapMap studies, a haplotype block was defined where 95% of combinations of SNPs within a region are in very high linkage disequilibrium. A total of 727 haplotype blocks consisting of ≥3 SNPs were identified. The average block length was 69.7 ± 7.7 kb, which is ∼5–10 times larger than in humans. These blocks comprised a total of 2964 SNPs and covered 50,638 kb of the sequence map, which constitutes 2.18% of the length of all autosomes. A set of tag SNPs, which will be useful for further fine-mapping studies, has been identified. Overall, the results suggest that as many as 75,000–100,000 tag SNPs would be needed to track all important haplotype blocks in the bovine genome. This would require ∼250,000 SNPs in the discovery phase. PMID:17435229
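The two per-SNP quantities used to filter markers in this abstract, minor allele frequency and agreement with Hardy–Weinberg equilibrium, can both be computed from genotype counts. The Python sketch below shows the standard textbook calculations; the function names and example counts are hypothetical, and the abstract does not say which test or threshold the authors applied, so the chi-square goodness-of-fit statistic is an assumption.

```python
def minor_allele_frequency(n_aa, n_ab, n_bb):
    """MAF from counts of the three genotypes AA, AB and BB."""
    total_alleles = 2 * (n_aa + n_ab + n_bb)
    if total_alleles == 0:
        raise ValueError("no genotypes supplied")
    freq_a = (2 * n_aa + n_ab) / total_alleles
    return min(freq_a, 1 - freq_a)


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square goodness-of-fit statistic (1 degree of freedom) comparing
    observed genotype counts with Hardy-Weinberg expectations."""
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes supplied")
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)


# Hypothetical counts for 1000 animals, chosen to sit exactly on HWE.
print(minor_allele_frequency(490, 420, 90))
print(hwe_chi_square(490, 420, 90))
```

A large statistic flags a SNP whose genotype counts depart from equilibrium, which is commonly treated as a sign of genotyping problems.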
Genetic ancestry is associated with colorectal adenomas and adenocarcinomas in Latino populations
Hernandez-Suarez, Gustavo; Sanabria, Maria Carolina; Serrano, Marta; Herran, Oscar F; Perez, Jesus; Plata, Jose L; Zabaleta, Jovanny; Tenesa, Albert
2014-01-01
Colorectal cancer rates in Latin American countries are less than half of those observed in the United States. Latin Americans are the result of generations of admixture among Native American, European, and African individuals. The potential role of genetic admixture in colorectal carcinogenesis has not been examined. We evaluate the association of genetic ancestry with colorectal neoplasms in 190 adenocarcinomas, 113 sporadic adenomas and 243 age- and sex-matched controls enrolled in a multicentric case–control study in Colombia. Individual ancestral genetic fractions were estimated using the STRUCTURE software, based on allele frequencies and assuming three distinct population origins. We used the Illumina Cancer Panel to genotype 1,421 sparse single-nucleotide polymorphisms (SNPs), and used the Northern and Western European ancestry, LWJ, and Han Chinese in Beijing, China populations from the HapMap project as references. A total of 678 autosomal SNPs overlapped with the HapMap data set SNPs and were used for ancestry estimations. African mean ancestry fraction was higher in adenomas (0.13, 95% confidence interval (95% CI)=0.11–0.15) and cancer cases (0.14, 95% CI=0.12–0.16) compared with controls (0.11, 95% CI=0.10–0.12). Conditional logistic regression analysis, controlling for known risk factors, showed a positive association of African ancestry per 10% increase with both colorectal adenoma (odds ratio (OR)=1.12, 95% CI=0.97–1.30) and adenocarcinoma (OR=1.19, 95% CI=1.05–1.35). In conclusion, increased African ancestry (or variants linked to it) contributes to the increased susceptibility to colorectal cancer in admixed Latin American populations. PMID:24518838
A Comparison of Phasing Algorithms for Trios and Unrelated Individuals
Marchini, Jonathan; Cutler, David; Patterson, Nick; Stephens, Matthew; Eskin, Eleazar; Halperin, Eran; Lin, Shin; Qin, Zhaohui S.; Munro, Heather M.; Abecasis, Gonçalo R.; Donnelly, Peter
2006-01-01
Knowledge of haplotype phase is valuable for many analysis methods in the study of disease, population, and evolutionary genetics. Considerable research effort has been devoted to the development of statistical and computational methods that infer haplotype phase from genotype data. Although a substantial number of such methods have been developed, they have focused principally on inference from unrelated individuals, and comparisons between methods have been rather limited. Here, we describe the extension of five leading algorithms for phase inference for handling father-mother-child trios. We performed a comprehensive assessment of the methods applied to both trios and to unrelated individuals, with a focus on genomic-scale problems, using both simulated data and data from the HapMap project. The most accurate algorithm was PHASE (v2.1). For this method, the percentages of genotypes whose phase was incorrectly inferred were 0.12%, 0.05%, and 0.16% for trios from simulated data, HapMap Centre d'Etude du Polymorphisme Humain (CEPH) trios, and HapMap Yoruban trios, respectively, and 5.2% and 5.9% for unrelated individuals in simulated data and the HapMap CEPH data, respectively. The other methods considered in this work had comparable but slightly worse error rates. The error rates for trios are similar to the levels of genotyping error and missing data expected. We thus conclude that all the methods considered will provide highly accurate estimates of haplotypes when applied to trio data sets. Running times differ substantially between methods. Although it is one of the slowest methods, PHASE (v2.1) was used to infer haplotypes for the 1 million–SNP HapMap data set. Finally, we evaluated methods of estimating the value of r² between a pair of SNPs and concluded that all methods estimated r² well when the estimated value was ≥0.8. PMID:16465620
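The pairwise r² statistic mentioned at the end of this abstract has a simple closed form once haplotypes are phased. The Python sketch below computes it from a list of two-SNP haplotypes using the standard definition r² = D² / (p_A(1 − p_A) p_B(1 − p_B)) with D = p_AB − p_A p_B. The function name and the 0/1 allele coding are assumptions for illustration; this is not one of the estimation methods compared in the paper, which work from unphased genotypes.

```python
def r_squared(haplotypes):
    """r^2 between two biallelic SNPs.

    `haplotypes` is a list of (allele_at_snp1, allele_at_snp2) pairs,
    one per phased chromosome, with alleles coded 0 or 1.
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("no haplotypes supplied")
    p_a = sum(a for a, _ in haplotypes) / n           # freq of allele 1 at SNP 1
    p_b = sum(b for _, b in haplotypes) / n           # freq of allele 1 at SNP 2
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                              # linkage disequilibrium D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both SNPs must be polymorphic")
    return d * d / denom


print(r_squared([(0, 0), (1, 1), (0, 0), (1, 1)]))  # alleles always co-occur
print(r_squared([(0, 0), (0, 1), (1, 0), (1, 1)]))  # alleles independent
```

Errors in inferred phase change the haplotype counts fed into this formula, which is why phasing accuracy and r² estimation are evaluated together.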
Efficient association study design via power-optimized tag SNP selection
Han, Buhm; Kang, Hyun Min; Seo, Myeong Seong; Zaitlen, Noah; Eskin, Eleazar
2008-01-01
Discovering statistical correlation between causal genetic variation and clinical traits through association studies is an important method for identifying the genetic basis of human diseases. Since fully resequencing a cohort is prohibitively costly, genetic association studies take advantage of local correlation structure (or linkage disequilibrium) between single nucleotide polymorphisms (SNPs) by selecting a subset of SNPs to be genotyped (tag SNPs). While many current association studies are performed using commercially available high-throughput genotyping products that define a set of tag SNPs, choosing tag SNPs remains an important problem both for custom follow-up studies and for designing the high-throughput genotyping products themselves. The most widely used tag SNP selection method optimizes over the correlation between SNPs (r²). However, tag SNPs chosen based on an r² criterion do not necessarily maximize the statistical power of an association study. We propose a study design framework that chooses SNPs to maximize power and efficiently measures the power through empirical simulation. Empirical results based on the HapMap data show that our method gains considerable power over a widely used r²-based method, or equivalently reduces the number of tag SNPs required to attain the desired power of a study. Our power-optimized 100k whole genome tag set provides equivalent power to the Affymetrix 500k chip for the CEU population. For the design of custom follow-up studies, our method provides up to twice the power increase using the same number of tag SNPs as r²-based methods. Our method is publicly available via web server at http://design.cs.ucla.edu. PMID:18702637
Association of the oxytocin receptor gene (OXTR) in Caucasian children and adolescents with autism.
Jacob, Suma; Brune, Camille W; Carter, C S; Leventhal, Bennett L; Lord, Catherine; Cook, Edwin H
2007-04-24
The oxytocin receptor gene (OXTR) has been studied in autism because of the role of oxytocin (OT) in social cognition. Linkage has also been demonstrated to the region of OXTR in a large sample. Two single nucleotide polymorphisms (SNPs) and a haplotype constructed from them in OXTR have been associated with autism in the Chinese Han population. We tested whether these associations replicated in a Caucasian sample with strictly defined autistic disorder. We genotyped the two previously associated SNPs (rs2254298, rs53576) in 57 Caucasian autism trios. Probands met clinical, ADI-R, and ADOS criteria for autistic disorder. Significant association was detected at rs2254298 (p=0.03) but not rs53576. For rs2254298, overtransmission of the G allele to probands with autistic disorder was found, which contrasts with the overtransmission of A previously reported in the Chinese Han sample. In both samples, G was more frequent than A. However, in our Caucasian autism trios and the CEU Caucasian HapMap samples the frequency of A was less than that reported in the Chinese Han and Chinese in Beijing HapMap samples. The haplotype test of association did not reveal excess transmission from parents to affected offspring. These findings provide support for association of OXTR with autism in a Caucasian population. Overtransmission of different alleles in different populations may be due to a different pattern of linkage disequilibrium between the marker rs2254298 and an as yet undetermined susceptibility variant in OXTR.
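Overtransmission of an allele in parent-offspring trios is commonly assessed with the transmission disequilibrium test (TDT), which compares how often heterozygous parents pass on each allele. The Python sketch below shows that standard calculation as an illustration. The abstract does not name the test that was used and does not report transmission counts, so both the choice of the TDT and the counts in the example are assumptions.

```python
import math


def tdt(transmitted_g, transmitted_a):
    """Transmission disequilibrium test for one biallelic marker.

    Arguments are the numbers of heterozygous (G/A) parents who transmitted
    G and A, respectively, to an affected child. Returns the McNemar-type
    chi-square statistic and its p-value on 1 degree of freedom.
    """
    informative = transmitted_g + transmitted_a
    if informative == 0:
        raise ValueError("no informative (heterozygous) parents")
    chi2 = (transmitted_g - transmitted_a) ** 2 / informative
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts, not taken from the study.
chi2, p_value = tdt(30, 15)
print(chi2, p_value)
```

Because only transmissions from heterozygous parents count, the test is robust to population stratification, which matters when the same marker shows opposite allele effects in different populations.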
Fine mapping on chromosome 13q32-34 and brain expression analysis implicates MYO16 in schizophrenia.
Rodriguez-Murillo, Laura; Xu, Bin; Roos, J Louw; Abecasis, Gonçalo R; Gogos, Joseph A; Karayiorgou, Maria
2014-03-01
We previously reported linkage of schizophrenia and schizoaffective disorder to 13q32-34 in the European descent Afrikaner population from South Africa. The nature of genetic variation underlying linkage peaks in psychiatric disorders remains largely unknown and both rare and common variants may be contributing. Here, we examine the contribution of common variants located under the 13q32-34 linkage region. We used densely spaced SNPs to fine map the linkage peak region using both a discovery sample of 415 families and a meta-analysis incorporating two additional replication family samples. In a second phase of the study, we use one family-based data set with 237 families and independent case-control data sets for fine mapping of the common variant association signal using HapMap SNPs. We report a significant association with a genetic variant (rs9583277) within the gene encoding for the myosin heavy-chain Myr 8 (MYO16), which has been implicated in neuronal phosphoinositide 3-kinase signaling. Follow-up analysis of HapMap variation within MYO16 in a second set of Afrikaner families and additional case-control data sets of European descent highlighted a region across introns 2-6 as the most likely region to harbor common MYO16 risk variants. Expression analysis revealed a significant increase in the level of MYO16 expression in the brains of schizophrenia patients. Our results suggest that common variation within MYO16 may contribute to the genetic liability to schizophrenia.
Ojeda, Diego A; Forero, Diego A
2014-10-01
Non-synonymous single nucleotide polymorphisms (nsSNPs) in brain-expressed genes represent interesting candidates for genetic research in neuropsychiatric disorders. We aimed to study novel nsSNPs in brain-expressed genes in a sample of Colombian subjects. We applied an approach based on in silico mining of available genomic data to identify and select novel nsSNPs in brain-expressed genes. We developed novel genotyping assays, based on allele-specific PCR methods, for these nsSNPs and genotyped them in 171 Colombian subjects. Five common nsSNPs (rs6855837, p.Leu395Ile; rs2305160, p.Thr394Ala; rs10503929, p.Met289Thr; rs2270641, p.Thr4Pro; and rs3822659, p.Ser735Ala) were studied, located in the CLOCK, NPAS2, NRG1, SLC18A1 and WWC1 genes. We reported allele and genotype frequencies in a sample of South American healthy subjects. There is previous experimental evidence, arising from genome-wide expression and association studies, for the involvement of these genes in several neuropsychiatric disorders and endophenotypes, such as schizophrenia, mood disorders or memory performance. Frequencies for these nsSNPs in the Colombian samples varied in comparison with different HapMap populations. Future study of these nsSNPs in brain-expressed genes, a synaptogenomics approach, will be important for a better understanding of neuropsychiatric diseases and endophenotypes in different populations.
Namgoong, Suhg; Cheong, Hyun Sub; Kim, Ji On; Kim, Lyoung Hyo; Na, Han Sung; Koh, In Song; Chung, Myeon Woo; Shin, Hyoung Doo
2015-11-01
Organic anion-transporting polypeptide (OATP; gene symbol, SLCO) transporters are generally involved in the uptake of multiple drugs and their metabolites at most epithelial barriers. The pattern of single-nucleotide polymorphisms (SNPs) in these transporters may be a determinant of interindividual variability in drug disposition and response. The objective of this study was to define the distribution of SNPs of three SLCO genes, SLCO1B1, SLCO1B3, and SLCO2B1, in a Korean population and other ethnic groups. Genomic DNA from 450 interethnic subjects was screened using the Illumina GoldenGate assay, which included 11 pharmacogenetic core variants and 76 HapMap tagging SNPs. The genotype distribution of the Korean population was similar to that of East Asian populations, but significantly different from that of African American and European American cohorts. These interethnic differences will be useful information for prospective studies, including genetic association and pharmacogenetic studies of drug metabolism by SLCO families.
Association of the Oxytocin Receptor Gene (OXTR) in Caucasian Children and Adolescents with Autism
Jacob, Suma; Brune, Camille W.; Carter, C. S.; Leventhal, Bennett L.; Lord, Catherine; Cook, Edwin H.
2009-01-01
Background The oxytocin receptor gene (OXTR) has been studied in autism because of the role of oxytocin (OT) in social cognition. Linkage has also been demonstrated to the region of OXTR in a large sample. Two single nucleotide polymorphisms (SNPs) and a haplotype constructed from them in OXTR have been associated with autism in the Chinese Han population. We tested whether these associations replicated in a Caucasian sample with strictly defined autistic disorder. Methods We genotyped the two previously associated SNPs (rs2254298, rs53576) in 57 Caucasian autism trios. Probands met clinical, ADI-R, and ADOS criteria for autistic disorder. Results Significant association was detected at rs2254298 (p = 0.03) but not rs53576. For rs2254298, overtransmission of the G allele to probands with autistic disorder was found which contrasts with the overtransmission of A previously reported in the Chinese Han sample. In both samples, G was more frequent than A. However, in our Caucasian autism trios and the CEU Caucasian HapMap samples the frequency of A was less than that reported in the Chinese Han and Chinese in Beijing HapMap samples. The haplotype test of association did not reveal excess transmission from parents to affected offspring. Conclusions These findings provide support for association of OXTR with autism in a Caucasian population. Overtransmission of different alleles in different populations may be due to a different pattern of linkage disequilibrium between the marker rs2254298 and an as yet undetermined susceptibility variant in OXTR. PMID:17383819
Fine Mapping on Chromosome 13q32–34 and Brain Expression Analysis Implicates MYO16 in Schizophrenia
Rodriguez-Murillo, Laura; Xu, Bin; Roos, J Louw; Abecasis, Gonçalo R; Gogos, Joseph A; Karayiorgou, Maria
2014-01-01
We previously reported linkage of schizophrenia and schizoaffective disorder to 13q32–34 in the European descent Afrikaner population from South Africa. The nature of genetic variation underlying linkage peaks in psychiatric disorders remains largely unknown and both rare and common variants may be contributing. Here, we examine the contribution of common variants located under the 13q32–34 linkage region. We used densely spaced SNPs to fine map the linkage peak region using both a discovery sample of 415 families and a meta-analysis incorporating two additional replication family samples. In a second phase of the study, we use one family-based data set with 237 families and independent case–control data sets for fine mapping of the common variant association signal using HapMap SNPs. We report a significant association with a genetic variant (rs9583277) within the gene encoding for the myosin heavy-chain Myr 8 (MYO16), which has been implicated in neuronal phosphoinositide 3-kinase signaling. Follow-up analysis of HapMap variation within MYO16 in a second set of Afrikaner families and additional case–control data sets of European descent highlighted a region across introns 2–6 as the most likely region to harbor common MYO16 risk variants. Expression analysis revealed a significant increase in the level of MYO16 expression in the brains of schizophrenia patients. Our results suggest that common variation within MYO16 may contribute to the genetic liability to schizophrenia. PMID:24141571
O'Donnell, Peter H.; Gamazon, Eric; Zhang, Wei; Stark, Amy L.; Kistner-Griffin, Emily O.; Huang, R. Stephanie; Dolan, M. Eileen
2010-01-01
Objectives Clinical studies show that Asians (ASN) are more susceptible to toxicities associated with platinum-containing regimens. We hypothesized that studying ASN as an 'enriched phenotype' population could enable the discovery of novel genetic determinants of platinum susceptibility. Methods Using well-genotyped lymphoblastoid cell lines from the HapMap, we determined cisplatin and carboplatin cytotoxicity phenotypes (IC50s) for ASN, Caucasians (CEU), and Africans (YRI). IC50s were used in genome-wide association studies. Results ASN were most sensitive to platinums, corroborating clinical findings. ASN genome-wide association studies produced 479 single-nucleotide polymorphisms (SNPs) associating with cisplatin susceptibility and 199 with carboplatin susceptibility (P < 10⁻⁴). Considering only the most significant variants (P < 9.99 × 10⁻⁶), backwards elimination was then used to identify reduced-model SNPs, which robustly described the drug phenotypes within ASN. These SNPs comprised highly descriptive genetic signatures of susceptibility, with 12 SNPs explaining more than 95% of the susceptibility phenotype variation for cisplatin, and eight SNPs approximately 75% for carboplatin. To determine the possible function of these variants in ASN, the SNPs were tested for association with differential expression of target genes. SNPs were highly associated with the expression of multiple target genes, and notably, the histone H3 family was implicated for both drugs, suggesting a platinum-class mechanism. Histone H3 has repeatedly been described as regulating the formation of platinum-DNA adducts, but this is the first evidence that specific genetic variants might mediate these interactions in a pharmacogenetic manner. Finally, to determine whether any ASN-identified SNPs might also be important in other human populations, we interrogated all 479/199 SNPs for association with platinum susceptibility in an independent combined CEU/YRI population. Three unique SNPs for cisplatin and 10 for carboplatin replicated in CEU/YRI. Conclusion Enriched 'platinum susceptible' populations can be used to discover novel genetic determinants governing interindividual platinum chemotherapy susceptibility. PMID:20393316
Genetic polymorphisms associated with increased risk of developing chronic myelogenous leukemia
Bruzzoni-Giovanelli, Heriberto; González, Juan R.; Sigaux, François; Villoutreix, Bruno O.; Cayuela, Jean Michel; Guilhot, Joëlle; Preudhomme, Claude; Guilhot, François; Poyet, Jean-Luc; Rousselot, Philippe
2015-01-01
Little is known about inherited factors associated with the risk of developing chronic myelogenous leukemia (CML). We used a dedicated DNA chip containing 16 561 single nucleotide polymorphisms (SNPs) covering 1 916 candidate genes to analyze 437 CML patients and 1 144 healthy control individuals. Single SNP association analysis identified 139 SNPs that passed multiple comparisons (1% false discovery rate). The HDAC9, AVEN, SEMA3C, IKBKB, GSTA3, RIPK1 and FGF2 genes were each represented by three SNPs, the PSM family by four SNPs and the SLC15A1 gene by six. Haplotype analysis showed that certain combinations of rare alleles of these genes increased the risk of developing CML by more than two or three-fold. A classification tree model identified five SNPs belonging to the genes PSMB10, TNFRSF10D, PSMB2, PPARD and CYP26B1, which were associated with CML predisposition. A CML-risk-allele score was created using these five SNPs. This score was accurate for discriminating CML status (AUC: 0.61, 95%CI: 0.58–0.64). Interestingly, the score was associated with age at diagnosis and the average number of risk alleles was significantly higher in younger patients. The risk-allele score showed the same distribution in the general population (HapMap CEU samples) as in our control individuals and was associated with differential gene expression patterns of two genes (VAPA and TDRKH). In conclusion, we describe haplotypes and a genetic score that are significantly associated with a predisposition to develop CML. The SNPs identified will also serve to drive fundamental research on the putative role of these genes in CML development. PMID:26474455
Davis, Charronne F; Dorak, M Tevfik
2010-04-01
The most common mutation of the HFE gene, C282Y, has shown a risk association with childhood acute lymphoblastic leukemia (ALL) in Welsh and Scottish case-control studies. This finding has not been replicated outside Britain. Here, we present a thorough analysis of the HFE gene in a panel of HLA homozygous reference cell lines and in the original population sample from South Wales (117 childhood ALL cases and 414 newborn controls). Twenty-one of the 24 variants analyzed were from the HFE gene region extending 52 kb from the histone gene HIST1H1C to HIST1H1T. We identified the single-nucleotide polymorphism (SNP) rs807212 as a tagging SNP for the most common HFE region haplotype, which contains wild-type alleles of all HFE variants examined. This intergenic SNP rs807212 yielded a strong male-specific protective association (per allele OR = 0.38, 95% CI = 0.22-0.64, P (trend) = 0.0002; P = 0.48 in females), which accounted for the original C282Y risk association. In the HapMap project data, rs807212 was in strong linkage disequilibrium with 25 other SNPs spanning 151 kb around HFE. Minor alleles of these 26 SNPs characterized the most common haplotype for the HFE region, which lacked all disease-associated HFE variants. The HapMap data suggested positive selection in this region even in populations where the HFE C282Y mutation is absent. These results have implications for the sex-specific associations observed in this region and suggest the inclusion of rs807212 in future studies of the HFE gene and the extended HLA class I region.
PACSIN2 polymorphism influences TPMT activity and mercaptopurine-related gastrointestinal toxicity.
Stocco, Gabriele; Yang, Wenjian; Crews, Kristine R; Thierfelder, William E; Decorti, Giuliana; Londero, Margherita; Franca, Raffaella; Rabusin, Marco; Valsecchi, Maria Grazia; Pei, Deqing; Cheng, Cheng; Paugh, Steven W; Ramsey, Laura B; Diouf, Barthelemy; McCorkle, Joseph Robert; Jones, Terreia S; Pui, Ching-Hon; Relling, Mary V; Evans, William E
2012-11-01
Treatment-related toxicity can be life-threatening and is the primary cause of interruption or discontinuation of chemotherapy for acute lymphoblastic leukemia (ALL), leading to an increased risk of relapse. Mercaptopurine is an essential component of continuation therapy in all ALL treatment protocols worldwide. Genetic polymorphisms in thiopurine S-methyltransferase (TPMT) are known to have a marked effect on mercaptopurine metabolism and toxicity; however, some patients with wild-type TPMT develop toxicity during mercaptopurine treatment for reasons that are not well understood. To identify additional genetic determinants of mercaptopurine toxicity, a genome-wide analysis was performed in a panel of human HapMap cell lines to identify trans-acting genes whose expression and/or single-nucleotide polymorphisms (SNPs) are related to TPMT activity, then validated in patients with ALL. The highest ranking gene with both mRNA expression and SNPs associated with TPMT activity in HapMap cell lines was protein kinase C and casein kinase substrate in neurons 2 (PACSIN2). The association of a PACSIN2 SNP (rs2413739) with TPMT activity was confirmed in patients and knock-down of PACSIN2 mRNA in human leukemia cells (NALM6) resulted in significantly lower TPMT activity. Moreover, this PACSIN2 SNP was significantly associated with the incidence of severe gastrointestinal (GI) toxicity during consolidation therapy containing mercaptopurine, and remained significant in a multivariate analysis including TPMT and SLCO1B1 as covariates, consistent with its influence on TPMT activity. The association with GI toxicity was also validated in a separate cohort of pediatric patients with ALL. These data indicate that polymorphism in PACSIN2 significantly modulates TPMT activity and influences the risk of GI toxicity associated with mercaptopurine therapy.
PACSIN2 polymorphism influences TPMT activity and mercaptopurine-related gastrointestinal toxicity
Stocco, Gabriele; Yang, Wenjian; Crews, Kristine R.; Thierfelder, William E.; Decorti, Giuliana; Londero, Margherita; Franca, Raffaella; Rabusin, Marco; Valsecchi, Maria Grazia; Pei, Deqing; Cheng, Cheng; Paugh, Steven W.; Ramsey, Laura B.; Diouf, Barthelemy; McCorkle, Joseph Robert; Jones, Terreia S.; Pui, Ching-Hon; Relling, Mary V.; Evans, William E.
2012-01-01
Treatment-related toxicity can be life-threatening and is the primary cause of interruption or discontinuation of chemotherapy for acute lymphoblastic leukemia (ALL), leading to an increased risk of relapse. Mercaptopurine is an essential component of continuation therapy in all ALL treatment protocols worldwide. Genetic polymorphisms in thiopurine S-methyltransferase (TPMT) are known to have a marked effect on mercaptopurine metabolism and toxicity; however, some patients with wild-type TPMT develop toxicity during mercaptopurine treatment for reasons that are not well understood. To identify additional genetic determinants of mercaptopurine toxicity, a genome-wide analysis was performed in a panel of human HapMap cell lines to identify trans-acting genes whose expression and/or single-nucleotide polymorphisms (SNPs) are related to TPMT activity, then validated in patients with ALL. The highest ranking gene with both mRNA expression and SNPs associated with TPMT activity in HapMap cell lines was protein kinase C and casein kinase substrate in neurons 2 (PACSIN2). The association of a PACSIN2 SNP (rs2413739) with TPMT activity was confirmed in patients and knock-down of PACSIN2 mRNA in human leukemia cells (NALM6) resulted in significantly lower TPMT activity. Moreover, this PACSIN2 SNP was significantly associated with the incidence of severe gastrointestinal (GI) toxicity during consolidation therapy containing mercaptopurine, and remained significant in a multivariate analysis including TPMT and SLCO1B1 as covariates, consistent with its influence on TPMT activity. The association with GI toxicity was also validated in a separate cohort of pediatric patients with ALL. These data indicate that polymorphism in PACSIN2 significantly modulates TPMT activity and influences the risk of GI toxicity associated with mercaptopurine therapy. PMID:22846425
Outcomes of methotrexate therapy for psoriasis and relationship to genetic polymorphisms.
Warren, R B; Smith, R L; Campalani, E; Eyre, S; Smith, C H; Barker, J N W N; Worthington, J; Griffiths, C E M
2009-02-01
The use of methotrexate is limited by interindividual variability in response. Previous studies in patients with either rheumatoid arthritis or psoriasis suggest that genetic variation across the methotrexate metabolic pathway might enable prediction of both efficacy and toxicity of the drug. To assess if single nucleotide polymorphisms (SNPs) across four genes that are relevant to methotrexate metabolism [folypolyglutamate synthase (FPGS), gamma-glutamyl hydrolase (GGH), methylenetetrahydrofolate reductase (MTHFR) and 5-aminoimidazole-4-carboxamide ribonucleotide transformylase (ATIC)] are related to treatment outcomes in patients with psoriasis. DNA was collected from 374 patients with psoriasis who had been treated with methotrexate. Data were available on individual outcomes to therapy, namely efficacy and toxicity. Haplotype-tagging SNPs (r² > 0.8) for the four genes with a minor allele frequency of > 5% were selected from the HAPMAP phase II data. Genotyping was undertaken using the MassARRAY spectrometric method (Sequenom). There were no significant associations detected between clinical outcomes in patients with psoriasis treated with methotrexate and SNPs in the four genes investigated. Genetic variation in four key genes relevant to the intracellular metabolism of methotrexate does not appear to predict response to methotrexate therapy in patients with psoriasis.
Linkage disequilibrium between STRPs and SNPs across the human genome.
Payseur, Bret A; Place, Michael; Weber, James L
2008-05-01
Patterns of linkage disequilibrium (LD) reveal the action of evolutionary processes and provide crucial information for association mapping of disease genes. Although recent studies have described the landscape of LD among single nucleotide polymorphisms (SNPs) from across the human genome, associations involving other classes of molecular variation remain poorly understood. In addition to recombination and population history, mutation rate and process are expected to shape LD. To test this idea, we measured associations between short-tandem-repeat polymorphisms (STRPs), which can mutate rapidly and recurrently, and SNPs in 721 regions across the human genome. We directly compared STRP-SNP LD with SNP-SNP LD from the same genomic regions in the human HapMap populations. The intensity of STRP-SNP LD, measured by the average of D', was reduced, consistent with the action of recurrent mutation. Nevertheless, a higher fraction of STRP-SNP pairs than SNP-SNP pairs showed significant LD, on both short (up to 50 kb) and long (cM) scales. These results reveal the substantial effects of mutational processes on LD at STRPs and provide important measures of the potential of STRPs for association mapping of disease genes.
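The LD measures named in the abstract above can be illustrated with a minimal Python sketch for the simplest case of two biallelic loci. This is an assumption-laden illustration, not the authors' method: STRPs are multi-allelic and the abstract does not say how D' was averaged across alleles. The function name `ld_stats` and its arguments (the frequency of the A-B haplotype and of alleles A and B) are hypothetical.

```python
def ld_stats(p_ab, p_a, p_b):
    """Return (D, |D'|, r^2) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a, p_b: frequencies of allele A and allele B
    (hypothetical inputs; biallelic case only)
    """
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both loci must be polymorphic")
    # Raw disequilibrium coefficient
    d = p_ab - p_a * p_b
    # Largest |D| attainable given the allele frequencies
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    r2 = d * d / denom
    return d, d_prime, r2
```

D' reaches 1 whenever at least one of the four possible haplotypes is absent, whereas r² reaches 1 only when the two loci also have matching allele frequencies, which is why the two measures can rank the same pair of markers differently.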
Williams, Robert C; Elston, Robert C; Kumar, Pankaj; Knowler, William C; Abboud, Hanna E; Adler, Sharon; Bowden, Donald W; Divers, Jasmin; Freedman, Barry I; Igo, Robert P; Ipp, Eli; Iyengar, Sudha K; Kimmel, Paul L; Klag, Michael J; Kohn, Orly; Langefeld, Carl D; Leehey, David J; Nelson, Robert G; Nicholas, Susanne B; Pahl, Madeleine V; Parekh, Rulan S; Rotter, Jerome I; Schelling, Jeffrey R; Sedor, John R; Shah, Vallabh O; Smith, Michael W; Taylor, Kent D; Thameem, Farook; Thornley-Brown, Denyse; Winkler, Cheryl A; Guo, Xiuqing; Zager, Phillip; Hanson, Robert L
2016-05-04
The presence of population structure in a sample may confound the search for important genetic loci associated with disease. Our four samples in the Family Investigation of Nephropathy and Diabetes (FIND), European Americans, Mexican Americans, African Americans, and American Indians are part of a genome-wide association study in which population structure might be particularly important. We therefore decided to study in detail one component of this, individual genetic ancestry (IGA). From SNPs present on the Affymetrix 6.0 Human SNP array, we identified 3 sets of ancestry informative markers (AIMs), each maximized for the information in one of the three contrasts among ancestral populations: Europeans (HAPMAP, CEU), Africans (HAPMAP, YRI and LWK), and Native Americans (full heritage Pima Indians). We estimate IGA and present an algorithm for their standard errors, compare IGA to principal components, emphasize the importance of balancing information in the ancestry informative markers (AIMs), and test the association of IGA with diabetic nephropathy in the combined sample. A fixed parental allele maximum likelihood algorithm was applied to the FIND to estimate IGA in four samples: 869 American Indians; 1385 African Americans; 1451 Mexican Americans; and 826 European Americans. When the information in the AIMs is unbalanced, the estimates are incorrect with large error. Individual genetic admixture is highly correlated with principal components for capturing population structure. It takes ~700 SNPs to reduce the average standard error of individual admixture below 0.01. When the samples are combined, the resulting population structure creates associations between IGA and diabetic nephropathy. The identified set of AIMs, which include American Indian parental allele frequencies, may be particularly useful for estimating genetic admixture in populations from the Americas. Failure to balance information in maximum likelihood, poly-ancestry models creates biased estimates of individual admixture with large error. This also occurs when estimating IGA using the Bayesian clustering method as implemented in the program STRUCTURE. Odds ratios for the associations of IGA with disease are consistent with what is known about the incidence and prevalence of diabetic nephropathy in these populations.
Kang, Eun Yong; Martin, Lisa J.; Mangul, Serghei; Isvilanonda, Warin; Zou, Jennifer; Ben-David, Eyal; Han, Buhm; Lusis, Aldons J.; Shifman, Sagiv; Eskin, Eleazar
2016-01-01
The study of the genetics of gene expression is of considerable importance to understanding the nature of common, complex diseases. The most widely applied approach to identifying relationships between genetic variation and gene expression is the expression quantitative trait loci (eQTL) approach. Here, we increased the computational power of eQTL with an alternative and complementary approach based on analyzing allele specific expression (ASE). We designed a novel analytical method to identify cis-acting regulatory variants based on genome sequencing and measurements of ASE from RNA-sequencing (RNA-seq) data. We evaluated the power and resolution of our method using simulated data. We then applied the method to map regulatory variants affecting gene expression in lymphoblastoid cell lines (LCLs) from 77 unrelated northern and western European individuals (CEU), which were part of the HapMap project. A total of 2309 SNPs were identified as being associated with ASE patterns. The SNPs associated with ASE were enriched within promoter regions and were significantly more likely to signal strong evidence for a regulatory role. Finally, among the candidate regulatory SNPs, we identified 108 SNPs that were previously associated with human immune diseases. With further improvements in quantifying ASE from RNA-seq, the application of our method to other datasets is expected to accelerate our understanding of the biological basis of common diseases. PMID:27765809
Delaney, Jessica T; Jeff, Janina M; Brown, Nancy J; Pretorius, Mias; Okafor, Henry E; Darbar, Dawood; Roden, Dan M; Crawford, Dana C
2012-01-01
Despite a greater burden of risk factors, atrial fibrillation (AF) is less common among African Americans than European-descent populations. Genome-wide association studies (GWAS) for AF in European-descent populations have identified three predominant genomic regions associated with increased risk (1q21, 4q25, and 16q22). The contribution of these loci to AF risk in African Americans is unknown. We studied 73 African Americans with AF from the Vanderbilt-Meharry AF registry and 71 African American controls, with no history of AF including after cardiac surgery. Tests of association were performed for 148 SNPs across the three regions associated with AF, and 22 SNPs were significantly associated with AF (P<0.05). The SNPs with the strongest associations in African Americans were both different from the index SNPs identified in European-descent populations and independent from the index European-descent population SNPs (r² < 0.40 in HapMap CEU): 1q21 rs4845396 (odds ratio [OR] 0.30, 95% confidence interval [CI] 0.13-0.67, P = 0.003), 4q25 rs4631108 (OR 3.43, 95% CI 1.59-7.42, P = 0.002), and 16q22 rs16971547 (OR 8.1, 95% CI 1.46-45.4, P = 0.016). Estimates of European ancestry were similar among cases (23.6%) and controls (23.8%). Accordingly, the probability of having two copies of the European derived chromosomes at each region did not differ between cases and controls. Variable European admixture at known AF loci does not explain decreased AF susceptibility in African Americans. These data support the role of 1q21, 4q25, and 16q22 variants in AF risk for African Americans, although the index SNPs differ from those identified in European-descent populations.
A survey of the population genetic variation in the human kinome.
Zhang, Wei; Catenacci, Daniel V T; Duan, Shiwei; Ratain, Mark J
2009-08-01
Protein kinases are key regulators of various biological processes, such as control of cell growth, metabolism, differentiation and apoptosis. Therefore, protein kinases have been an important class of targets for anticancer drugs. Health-related disparities such as differential drug response have been observed between human populations. A survey of the human kinases and their ligand genes for those containing population-specific genetic variants could provide new insights into the mechanisms of these health disparities and suggest novel targets for ethnicity-specific personalized medicine. Using the International HapMap Project genotypic data on single-nucleotide polymorphisms (SNPs), the protein kinase complement of the human genome (kinome) and some experimentally verified ligand genes were scanned for the existence of population-specific SNPs (eSNPs). In general, protein kinases were found to contain a much higher proportion of eSNPs than the whole genome background, indicating a stronger pressure for adaptation in individual populations. In contrast, the proportion of ligand genes containing eSNPs was not different from that of the whole genome background. Although with some important limitations, our results suggest that human kinases are more likely to be under recent positive selection than ligands. Our findings suggest that the health-related disparities associated with kinase signaling pathways are more likely to be driven by the genetic variation in the kinase genes than their cognate ligands. Illustrating the role of molecular evolution in the genetic variation of the human kinome could provide a promising route to understand the ethnic differences in cancer and facilitate the realization of ethnicity-based individualized medicine.
Leak, T. S.; Perlegas, P.S.; Smith, S.G.; Keene, K.L.; Hicks, P.J.; Langefeld, C.D.; Mychaleckyj, J.C.; Rich, S.S.; Kirk, J.K.; Freedman, B.I.; Bowden, D.W.; Sale, M.M.
2009-01-01
Variants in the engulfment and cell motility 1 (ELMO1) gene are associated with nephropathy due to type 2 diabetes mellitus (T2DM) in a Japanese cohort. We comprehensively evaluated this gene in African American (AA) T2DM patients with end-stage renal disease (ESRD). Three hundred nine HapMap tagging SNPs and 9 reportedly associated SNPs were genotyped in 577 AA T2DM-ESRD patients and 596 AA non-diabetic controls, plus 43 non-diabetic European American controls and 45 Yoruba Nigerian samples for admixture adjustment. Replication analyses were conducted in 558 AAs with T2DM-ESRD and 564 controls without diabetes. Extension analyses included 328 AA with T2DM lacking nephropathy and 326 with non-diabetic ESRD. The original and replication analyses confirmed association with four SNPs in intron 13 (permutation p-values for combined analyses = 0.001-0.003), one in intron 1 (P = 0.004) and one in intron 5 (P = 0.002) with T2DM-associated ESRD. In a subsequent combined analysis of all 1,135 T2DM-ESRD cases and 1,160 controls, an additional 7 intron 13 SNPs produced evidence of association (P = 3.5×10⁻⁵ to P = 0.05). No associations were seen with these SNPs in those with T2DM lacking nephropathy or with ESRD due to non-diabetic causes. Variants in intron 13 of the ELMO1 gene appear to confer risk for diabetic nephropathy in AA. PMID:19183347
Yamaguchi-Kabata, Yumi; Nakazono, Kazuyuki; Takahashi, Atsushi; Saito, Susumu; Hosono, Naoya; Kubo, Michiaki; Nakamura, Yusuke; Kamatani, Naoyuki
2008-10-01
Because population stratification can cause spurious associations in case-control studies, understanding the population structure is important. Here, we examined Japanese population structure by "Eigenanalysis," using the genotypes for 140,387 SNPs in 7003 Japanese individuals, along with 60 European, 60 African, and 90 East-Asian individuals, in the HapMap project. Most Japanese individuals fell into two main clusters, Hondo and Ryukyu; the Hondo cluster includes most of the individuals from the main islands in Japan, and the Ryukyu cluster includes most of the individuals from Okinawa. The SNPs with the greatest frequency differences between the Hondo and Ryukyu clusters were found in the HLA region in chromosome 6. The nonsynonymous SNPs with the greatest frequency differences between the Hondo and Ryukyu clusters were the Val/Ala polymorphism (rs3827760) in the EDAR gene, associated with hair thickness, and the Gly/Ala polymorphism (rs17822931) in the ABCC11 gene, associated with ear-wax type. Genetic differentiation was observed, even among different regions in Honshu Island, the largest island of Japan. Simulation studies showed that the inclusion of different proportions of individuals from different regions of Japan in case and control groups can lead to an inflated rate of false-positive results when the sample sizes are large.
Patterns of population differentiation of candidate genes for cardiovascular disease.
Kullo, Iftikhar J; Ding, Keyue
2007-07-12
The basis for ethnic differences in cardiovascular disease (CVD) susceptibility is not fully understood. We investigated patterns of population differentiation (FST) of a set of genes in etiologic pathways of CVD among 3 ethnic groups: Yoruba in Nigeria (YRI), Utah residents with European ancestry (CEU), and Han Chinese (CHB) + Japanese (JPT). We identified 37 pathways implicated in CVD based on the PANTHER classification and 416 genes in these pathways were further studied; these genes belonged to 6 biological processes (apoptosis, blood circulation and gas exchange, blood clotting, homeostasis, immune response, and lipoprotein metabolism). Genotype data were obtained from the HapMap database. We calculated FST for 15,559 common SNPs (minor allele frequency ≥ 0.10 in at least one population) in genes that co-segregated among the populations, as well as an average-weighted FST for each gene. SNPs were classified as putatively functional (non-synonymous and untranslated regions) or non-functional (intronic and synonymous sites). Mean FST values for common putatively functional variants were significantly higher than FST values for nonfunctional variants. A significant variation in FST was also seen based on biological processes; the processes of 'apoptosis' and 'lipoprotein metabolism' showed an excess of genes with high FST. Thus, putative functional SNPs in genes in etiologic pathways for CVD show greater population differentiation than non-functional SNPs and a significant variance of FST values was noted among pairwise population comparisons for different biological processes. These results suggest a possible basis for varying susceptibility to CVD among ethnic groups.
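As a rough illustration of the per-SNP FST calculation mentioned above, the following Python sketch uses the simple heterozygosity-based definition FST = (H_T − H_S) / H_T with equally weighted populations. The abstract does not state which estimator the authors used (sample-size-corrected estimators such as Weir and Cockerham's are common), so this is an assumed stand-in, and the function name `fst_biallelic` and the example frequencies are hypothetical.

```python
def fst_biallelic(freqs):
    """Heterozygosity-based FST for one biallelic SNP.

    freqs: allele frequency of the same allele in each population,
    e.g. [p_YRI, p_CEU, p_CHB_JPT] (hypothetical values).
    Populations are weighted equally; no sample-size correction.
    """
    k = len(freqs)
    if k < 2:
        raise ValueError("need at least two populations")
    p_bar = sum(freqs) / k
    # Expected heterozygosity of the pooled population
    h_t = 2 * p_bar * (1 - p_bar)
    if h_t == 0:
        return 0.0  # SNP is monomorphic everywhere
    # Mean expected heterozygosity within populations
    h_s = sum(2 * p * (1 - p) for p in freqs) / k
    return (h_t - h_s) / h_t
```

Under this definition a SNP with identical frequencies in all populations gives 0 and a SNP fixed for different alleles in two populations gives 1; a gene-level value could then be summarized from the per-SNP values, although the weighting scheme behind the abstract's "average-weighted FST" is not specified.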
Genetic variation in cell death genes and risk of non-Hodgkin lymphoma.
Schuetz, Johanna M; Daley, Denise; Graham, Jinko; Berry, Brian R; Gallagher, Richard P; Connors, Joseph M; Gascoyne, Randy D; Spinelli, John J; Brooks-Wilson, Angela R
2012-01-01
Non-Hodgkin lymphomas are a heterogeneous group of solid tumours that constitute the 5th highest cause of cancer mortality in the United States and Canada. Poor control of cell death in lymphocytes can lead to autoimmune disease or cancer, making genes involved in programmed cell death of lymphocytes logical candidate genes for lymphoma susceptibility. We tested for genetic association with NHL and NHL subtypes, of SNPs in lymphocyte cell death genes using an established population-based study. 17 candidate genes were chosen based on biological function, with 123 SNPs tested. These included tagSNPs from HapMap and novel SNPs discovered by re-sequencing 47 cases in genes for which SNP representation was judged to be low. The main analysis, which estimated odds ratios by fitting data to an additive logistic regression model, used European ancestry samples that passed quality control measures (569 cases and 547 controls). A two-tiered approach for multiple testing correction was used: correction for number of tests within each gene by permutation-based methodology, followed by correction for the number of genes tested using the false discovery rate. Variant rs928883, near miR-155, showed an association (OR per A-allele: 2.80 [95% CI: 1.63-4.82]; p(F) = 0.027) with marginal zone lymphoma that is significant after correction for multiple testing. This is the first reported association between a germline polymorphism at a miRNA locus and lymphoma.
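The second tier of the multiple-testing scheme described above, a false discovery rate correction across genes, might look like the following Python sketch. It implements the Benjamini-Hochberg step-up adjustment, which is an assumption: the abstract names the false discovery rate but not the specific procedure. The function name `bh_adjust` and the example p-values are hypothetical and do not come from the study.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in input order.

    pvalues: one p-value per gene (hypothetical; in a two-tiered
    scheme these would already be corrected within each gene).
    """
    m = len(pvalues)
    # Indices of the p-values from smallest to largest
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical gene-level p-values, for illustration only
gene_p = [0.01, 0.04, 0.03, 0.005]
gene_q = bh_adjust(gene_p)
```

A gene would then be reported when its adjusted value falls below the chosen FDR threshold; the running minimum keeps the adjusted values from decreasing as the raw p-values increase.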
Keinan, Alon; Mullikin, James C; Patterson, Nick; Reich, David
2007-10-01
Large data sets on human genetic variation have been collected recently, but their usefulness for learning about history and natural selection has been limited by biases in the ways polymorphisms were chosen. We report large subsets of SNPs from the International HapMap Project that allow us to overcome these biases and to provide accurate measurement of a quantity of crucial importance for understanding genetic variation: the allele frequency spectrum. Our analysis shows that East Asian and northern European ancestors shared the same population bottleneck expanding out of Africa but that both also experienced more recent genetic drift, which was greater in East Asians.
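The allele frequency spectrum mentioned above is, in its simplest form, a histogram of how many sites carry each possible count of the derived allele in a sample. The following is a minimal sketch under the assumption that derived-allele counts per SNP are already available; it does not reproduce the ascertainment corrections that are the subject of the paper, and the function name and data are illustrative.

```python
from collections import Counter


def allele_frequency_spectrum(derived_counts, n_chromosomes):
    """Unfolded allele frequency spectrum for segregating sites.

    derived_counts: derived-allele count at each SNP (0..n_chromosomes).
    Returns a list whose entry k-1 is the number of SNPs with exactly
    k derived copies, for k = 1 .. n_chromosomes - 1. Sites with 0 or
    n_chromosomes copies are not segregating and are ignored.
    """
    for c in derived_counts:
        if not 0 <= c <= n_chromosomes:
            raise ValueError(f"count {c} outside 0..{n_chromosomes}")
    tally = Counter(derived_counts)
    return [tally.get(k, 0) for k in range(1, n_chromosomes)]


if __name__ == "__main__":
    # Hypothetical counts for six SNPs typed on four chromosomes.
    print(allele_frequency_spectrum([1, 1, 2, 3, 3, 3], 4))
```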
Leak, Tennille S.; Mychaleckyj, Josyf C.; Smith, Shelly G.; Keene, Keith L.; Gordon, Candace J.; Hicks, Pamela J.; Freedman, Barry I.; Bowden, Donald W.; Sale, Michèle M.
2009-01-01
Previously we performed a genome scan for type 2 diabetes (T2DM) using 638 African-American (AA) affected sibling pairs from 247 families; non-parametric linkage analysis suggested evidence of linkage at 6q24-27 (LOD 2.26). To comprehensively evaluate this region we performed a 2-stage association study by first constructing a SNP map of 754 SNPs selected from HapMap on the basis of linkage disequilibrium (LD) in 300 AA T2DM-ESRD subjects, 311 AA controls, 43 European American controls and 45 Yoruba Nigerian samples (Set 1). Replication analyses were conducted in an independent population of 283 AA T2DM-ESRD subjects and 282 AA controls (Set 2). In addition, we adjusted for the impact of admixture on association results by using ancestry informative markers (AIMs). In Stage 1, 137 (18.2%) SNPs showed nominal evidence of association (P<0.05) in one or more tests of association: allelic (n=33), dominant (n=36), additive (n=29), or recessive (n=34) genotypic models, and 2- (n=47) and 3-SNP (n=43) haplotypic analyses. These SNPs were selected for follow-up genotyping. Stage 2 analyses confirmed association with a predicted 2-SNP “risk” haplotype in the PARK2 gene. Also, two intergenic SNPs showed consistent genotypic association with T2DM-ESRD: rs12197043 and rs4897081. Combined analysis of all subjects from both stages revealed nominal associations with 17 SNPs within genes, including suggestive associations in ESR1 and PARK2. This study confirms known diabetic nephropathy loci and identifies potentially novel susceptibility variants located within 6q24-27 in AA. PMID:18560894
Fox, Ervin R.; Young, J. Hunter; Li, Yali; Dreisbach, Albert W.; Keating, Brendan J.; Musani, Solomon K.; Liu, Kiang; Morrison, Alanna C.; Ganesh, Santhi; Kutlar, Abdullah; Ramachandran, Vasan S.; Polak, Josef F.; Fabsitz, Richard R.; Dries, Daniel L.; Farlow, Deborah N.; Redline, Susan; Adeyemo, Adebowale; Hirschorn, Joel N.; Sun, Yan V.; Wyatt, Sharon B.; Penman, Alan D.; Palmas, Walter; Rotter, Jerome I.; Townsend, Raymond R.; Doumatey, Ayo P.; Tayo, Bamidele O.; Mosley, Thomas H.; Lyon, Helen N.; Kang, Sun J.; Rotimi, Charles N.; Cooper, Richard S.; Franceschini, Nora; Curb, J. David; Martin, Lisa W.; Eaton, Charles B.; Kardia, Sharon L.R.; Taylor, Herman A.; Caulfield, Mark J.; Ehret, Georg B.; Johnson, Toby; Chakravarti, Aravinda; Zhu, Xiaofeng; Levy, Daniel; Munroe, Patricia B.; Rice, Kenneth M.; Bochud, Murielle; Johnson, Andrew D.; Chasman, Daniel I.; Smith, Albert V.; Tobin, Martin D.; Verwoert, Germaine C.; Hwang, Shih-Jen; Pihur, Vasyl; Vollenweider, Peter; O'Reilly, Paul F.; Amin, Najaf; Bragg-Gresham, Jennifer L.; Teumer, Alexander; Glazer, Nicole L.; Launer, Lenore; Zhao, Jing Hua; Aulchenko, Yurii; Heath, Simon; Sõber, Siim; Parsa, Afshin; Luan, Jian'an; Arora, Pankaj; Dehghan, Abbas; Zhang, Feng; Lucas, Gavin; Hicks, Andrew A.; Jackson, Anne U.; Peden, John F.; Tanaka, Toshiko; Wild, Sarah H.; Rudan, Igor; Igl, Wilmar; Milaneschi, Yuri; Parker, Alex N.; Fava, Cristiano; Chambers, John C.; Kumari, Meena; JinGo, Min; van der Harst, Pim; Kao, Wen Hong Linda; Sjögren, Marketa; Vinay, D.G.; Alexander, Myriam; Tabara, Yasuharu; Shaw-Hawkins, Sue; Whincup, Peter H.; Liu, Yongmei; Shi, Gang; Kuusisto, Johanna; Seielstad, Mark; Sim, Xueling; Nguyen, Khanh-Dung Hoang; Lehtimäki, Terho; Matullo, Giuseppe; Wu, Ying; Gaunt, Tom R.; Charlotte Onland-Moret, N.; Cooper, Matthew N.; Platou, Carl G.P.; Org, Elin; Hardy, Rebecca; Dahgam, Santosh; Palmen, Jutta; Vitart, Veronique; Braund, Peter S.; Kuznetsova, Tatiana; Uiterwaal, Cuno S.P.M.; Campbell, Harry; Ludwig, 
Barbara; Tomaszewski, Maciej; Tzoulaki, Ioanna; Palmer, Nicholette D.; Aspelund, Thor; Garcia, Melissa; Chang, Yen-Pei C.; O'Connell, Jeffrey R.; Steinle, Nanette I.; Grobbee, Diederick E.; Arking, Dan E.; Hernandez, Dena; Najjar, Samer; McArdle, Wendy L.; Hadley, David; Brown, Morris J.; Connell, John M.; Hingorani, Aroon D.; Day, Ian N.M.; Lawlor, Debbie A.; Beilby, John P.; Lawrence, Robert W.; Clarke, Robert; Collins, Rory; Hopewell, Jemma C.; Ongen, Halit; Bis, Joshua C.; Kähönen, Mika; Viikari, Jorma; Adair, Linda S.; Lee, Nanette R.; Chen, Ming-Huei; Olden, Matthias; Pattaro, Cristian; Hoffman Bolton, Judith A.; Köttgen, Anna; Bergmann, Sven; Mooser, Vincent; Chaturvedi, Nish; Frayling, Timothy M.; Islam, Muhammad; Jafar, Tazeen H.; Erdmann, Jeanette; Kulkarni, Smita R.; Bornstein, Stefan R.; Grässler, Jürgen; Groop, Leif; Voight, Benjamin F.; Kettunen, Johannes; Howard, Philip; Taylor, Andrew; Guarrera, Simonetta; Ricceri, Fulvio; Emilsson, Valur; Plump, Andrew; Barroso, Inês; Khaw, Kay-Tee; Weder, Alan B.; Hunt, Steven C.; Bergman, Richard N.; Collins, Francis S.; Bonnycastle, Lori L.; Scott, Laura J.; Stringham, Heather M.; Peltonen, Leena; Perola, Markus; Vartiainen, Erkki; Brand, Stefan-Martin; Staessen, Jan A.; Wang, Thomas J.; Burton, Paul R.; SolerArtigas, Maria; Dong, Yanbin; Snieder, Harold; Wang, Xiaoling; Zhu, Haidong; Lohman, Kurt K.; Rudock, Megan E.; Heckbert, Susan R.; Smith, Nicholas L.; Wiggins, Kerri L.; Shriner, Daniel; Veldre, Gudrun; Viigimaa, Margus; Kinra, Sanjay; Prabhakaran, Dorairajan; Tripathy, Vikal; Langefeld, Carl D.; Rosengren, Annika; Thelle, Dag S.; MariaCorsi, Anna; Singleton, Andrew; Forrester, Terrence; Hilton, Gina; McKenzie, Colin A.; Salako, Tunde; Iwai, Naoharu; Kita, Yoshikuni; Ogihara, Toshio; Ohkubo, Takayoshi; Okamura, Tomonori; Ueshima, Hirotsugu; Umemura, Satoshi; Eyheramendy, Susana; Meitinger, Thomas; Wichmann, H.-Erich; Cho, Yoon Shin; Kim, Hyung-Lae; Lee, Jong-Young; Scott, James; Sehmi, Joban S.; Zhang, 
Weihua; Hedblad, Bo; Nilsson, Peter; Smith, George Davey; Wong, Andrew; Narisu, Narisu; Stančáková, Alena; Raffel, Leslie J.; Yao, Jie; Kathiresan, Sekar; O'Donnell, Chris; Schwartz, Steven M.; Arfan Ikram, M.; Longstreth, Will T.; Seshadri, Sudha; Shrine, Nick R.G.; Wain, Louise V.; Morken, Mario A.; Swift, Amy J.; Laitinen, Jaana; Prokopenko, Inga; Zitting, Paavo; Cooper, Jackie A.; Humphries, Steve E.; Danesh, John; Rasheed, Asif; Goel, Anuj; Hamsten, Anders; Watkins, Hugh; Bakker, Stephan J.L.; van Gilst, Wiek H.; Janipalli, Charles S.; Radha Mani, K.; Yajnik, Chittaranjan S.; Hofman, Albert; Mattace-Raso, Francesco U.S.; Oostra, Ben A.; Demirkan, Ayse; Isaacs, Aaron; Rivadeneira, Fernando; Lakatta, Edward G.; Orru, Marco; Scuteri, Angelo; Ala-Korpela, Mika; Kangas, Antti J.; Lyytikäinen, Leo-Pekka; Soininen, Pasi; Tukiainen, Taru; Würz, Peter; Twee-Hee Ong, Rick; Dörr, Marcus; Kroemer, Heyo K.; Völker, Uwe; Völzke, Henry; Galan, Pilar; Hercberg, Serge; Lathrop, Mark; Zelenika, Diana; Deloukas, Panos; Mangino, Massimo; Spector, Tim D.; Zhai, Guangju; Meschia, James F.; Nalls, Michael A.; Sharma, Pankaj; Terzic, Janos; Kranthi Kumar, M.J.; Denniff, Matthew; Zukowska-Szczechowska, Ewa; Wagenknecht, Lynne E.; Fowkes, Gerald R.; Charchar, Fadi J.; Schwarz, Peter E.H.; Hayward, Caroline; Guo, Xiuqing; Bots, Michiel L.; Brand, Eva; Samani, Nilesh J.; Polasek, Ozren; Talmud, Philippa J.; Nyberg, Fredrik; Kuh, Diana; Laan, Maris; Hveem, Kristian; Palmer, Lyle J.; van der Schouw, Yvonne T.; Casas, Juan P.; Mohlke, Karen L.; Vineis, Paolo; Raitakari, Olli; Wong, Tien Y.; Shyong Tai, E.; Laakso, Markku; Rao, Dabeeru C.; Harris, Tamara B.; Morris, Richard W.; Dominiczak, Anna F.; Kivimaki, Mika; Marmot, Michael G.; Miki, Tetsuro; Saleheen, Danish; Chandak, Giriraj R.; Coresh, Josef; Navis, Gerjan; Salomaa, Veikko; Han, Bok-Ghee; Kooner, Jaspal S.; Melander, Olle; Ridker, Paul M.; Bandinelli, Stefania; Gyllensten, Ulf B.; Wright, Alan F.; Wilson, James F.; Ferrucci, Luigi; 
Farrall, Martin; Tuomilehto, Jaakko; Pramstaller, Peter P.; Elosua, Roberto; Soranzo, Nicole; Sijbrands, Eric J.G.; Altshuler, David; Loos, Ruth J.F.; Shuldiner, Alan R.; Gieger, Christian; Meneton, Pierre; Uitterlinden, Andre G.; Wareham, Nicholas J.; Gudnason, Vilmundur; Rettig, Rainer; Uda, Manuela; Strachan, David P.; Witteman, Jacqueline C.M.; Hartikainen, Anna-Liisa; Beckmann, Jacques S.; Boerwinkle, Eric; Boehnke, Michael; Larson, Martin G.; Järvelin, Marjo-Riitta; Psaty, Bruce M.; Abecasis, Gonçalo R.; Elliott, Paul; van Duijn, Cornelia M.; Newton-Cheh, Christopher
2011-01-01
The prevalence of hypertension in African Americans (AAs) is higher than in other US groups; yet, few have performed genome-wide association studies (GWASs) in AA. Among people of European descent, GWASs have identified genetic variants at 13 loci that are associated with blood pressure. It is unknown if these variants confer susceptibility in people of African ancestry. Here, we examined genome-wide and candidate gene associations with systolic blood pressure (SBP) and diastolic blood pressure (DBP) using the Candidate Gene Association Resource (CARe) consortium consisting of 8591 AAs. Genotypes included genome-wide single-nucleotide polymorphism (SNP) data utilizing the Affymetrix 6.0 array with imputation to 2.5 million HapMap SNPs and candidate gene SNP data utilizing a 50K cardiovascular gene-centric array (ITMAT-Broad-CARe [IBC] array). For Affymetrix data, the strongest signal for DBP was rs10474346 (P = 3.6 × 10⁻⁸) located near GPR98 and ARRDC3. For SBP, the strongest signal was rs2258119 in C21orf91 (P = 4.7 × 10⁻⁸). The top IBC association for SBP was rs2012318 (P = 6.4 × 10⁻⁶) near SLC25A42 and for DBP was rs2523586 (P = 1.3 × 10⁻⁶) near HLA-B. None of the top variants replicated in additional AA (n = 11 882) or European-American (n = 69 899) cohorts. We replicated previously reported European-American blood pressure SNPs in our AA samples (SH2B3, P = 0.009; TBX3-TBX5, P = 0.03; and CSK-ULK3, P = 0.0004). These genetic loci represent the best evidence of genetic influences on SBP and DBP in AAs to date. More broadly, this work supports the notion that blood pressure among AAs is a trait with genetic underpinnings but also with significant complexity. PMID:21378095
Vasan, Ramachandran S; Glazer, Nicole L; Felix, Janine F; Lieb, Wolfgang; Wild, Philipp S; Felix, Stephan B; Watzinger, Norbert; Larson, Martin G; Smith, Nicholas L; Dehghan, Abbas; Grosshennig, Anika; Schillert, Arne; Teumer, Alexander; Schmidt, Reinhold; Kathiresan, Sekar; Lumley, Thomas; Aulchenko, Yurii S; König, Inke R; Zeller, Tanja; Homuth, Georg; Struchalin, Maksim; Aragam, Jayashri; Bis, Joshua C; Rivadeneira, Fernando; Erdmann, Jeanette; Schnabel, Renate B; Dörr, Marcus; Zweiker, Robert; Lind, Lars; Rodeheffer, Richard J; Greiser, Karin Halina; Levy, Daniel; Haritunians, Talin; Deckers, Jaap W; Stritzke, Jan; Lackner, Karl J; Völker, Uwe; Ingelsson, Erik; Kullo, Iftikhar; Haerting, Johannes; O'Donnell, Christopher J; Heckbert, Susan R; Stricker, Bruno H; Ziegler, Andreas; Reffelmann, Thorsten; Redfield, Margaret M; Werdan, Karl; Mitchell, Gary F; Rice, Kenneth; Arnett, Donna K; Hofman, Albert; Gottdiener, John S; Uitterlinden, Andre G; Meitinger, Thomas; Blettner, Maria; Friedrich, Nele; Wang, Thomas J; Psaty, Bruce M; van Duijn, Cornelia M; Wichmann, H-Erich; Munzel, Thomas F; Kroemer, Heyo K; Benjamin, Emelia J; Rotter, Jerome I; Witteman, Jacqueline C; Schunkert, Heribert; Schmidt, Helena; Völzke, Henry; Blankenberg, Stefan
2009-07-08
Echocardiographic measures of left ventricular (LV) structure and function are heritable phenotypes of cardiovascular disease. The objective was to identify common genetic variants associated with cardiac structure and function by conducting a meta-analysis of genome-wide association data in 5 population-based cohort studies (stage 1) with replication (stage 2) in 2 other community-based samples. Within each of 5 community-based cohorts comprising the EchoGen consortium (stage 1; n = 12 612 individuals of European ancestry; 55% women, aged 26-95 years; examinations between 1978 and 2008), we estimated the association between approximately 2.5 million single-nucleotide polymorphisms (SNPs; imputed to the HapMap CEU panel) and echocardiographic traits. In stage 2, SNPs significantly associated with traits in stage 1 were tested for association in 2 other cohorts (n = 4094 people of European ancestry). Using a prespecified P value threshold of 5 × 10⁻⁷ to indicate genome-wide significance, we performed an inverse variance-weighted fixed-effects meta-analysis of genome-wide association data from each cohort. The echocardiographic traits examined were LV mass, internal dimensions, wall thickness, systolic dysfunction, aortic root, and left atrial size. In stage 1, 16 genetic loci were associated with 5 echocardiographic traits: 1 each with LV internal dimensions and systolic dysfunction, 3 each with LV mass and wall thickness, and 8 with aortic root size. In stage 2, 5 loci replicated (6q22 locus associated with LV diastolic dimensions, explaining <1% of trait variance; 5q23, 12p12, 12q14, and 17p13 associated with aortic root size, explaining 1%-3% of trait variance). We identified 5 genetic loci harboring common variants that were associated with variation in LV diastolic dimensions and aortic root size, but such findings explained a very small proportion of variance. Further studies are required to replicate these findings, identify the causal variants at or near these loci, characterize their functional significance, and determine whether they are related to overt cardiovascular disease.
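The inverse variance-weighted fixed-effects meta-analysis named in this abstract combines per-cohort effect estimates by weighting each one by the reciprocal of its squared standard error. The sketch below illustrates that standard calculation for a single SNP. The function name, the example effect sizes and the use of a two-sided normal p-value are assumptions of this illustration rather than details taken from the study.

```python
import math


def fixed_effects_meta(betas, standard_errors):
    """Inverse variance-weighted fixed-effects meta-analysis for one SNP.

    betas: per-cohort effect estimates.
    standard_errors: matching per-cohort standard errors (all > 0).
    Returns (pooled_beta, pooled_se, two_sided_p) using a normal approximation.
    """
    if len(betas) != len(standard_errors) or not betas:
        raise ValueError("need matching, non-empty inputs")
    weights = [1.0 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled_beta / pooled_se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))   # two-sided normal p-value
    return pooled_beta, pooled_se, p_value


if __name__ == "__main__":
    # Hypothetical estimates from five cohorts for one SNP.
    beta, se, p = fixed_effects_meta(
        [0.12, 0.08, 0.15, 0.10, 0.11],
        [0.04, 0.05, 0.06, 0.03, 0.05],
    )
    genome_wide = p < 5e-7   # threshold quoted in the abstract
    print(beta, se, p, genome_wide)
```

A fixed-effects model assumes every cohort estimates the same underlying effect; heterogeneity between cohorts would call for a different model, which is outside the scope of this sketch.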
A map of copy number variations in Chinese populations.
Lou, Haiyi; Li, Shilin; Yang, Yajun; Kang, Longli; Zhang, Xin; Jin, Wenfei; Wu, Bailin; Jin, Li; Xu, Shuhua
2011-01-01
It has been shown that the human genome contains extensive copy number variations (CNVs). Investigating the medical and evolutionary impacts of CNVs requires knowledge of their locations, sizes and frequency distributions within and between populations. However, CNV study of Chinese minorities, which harbor the majority of genetic diversity of Chinese populations, has been underrepresented relative to efforts in other populations. Here we constructed, to our knowledge, the first CNV map in seven Chinese populations representing the major linguistic groups in China, with 1,440 CNV regions identified using the Affymetrix SNP 6.0 Array. Considerable differences in distributions of CNV regions between populations and substantial population structures were observed. We showed that ∼35% of CNV regions identified in minority ethnic groups are not shared by the Han Chinese population, indicating that the contribution of the minorities to the genetic architecture of the Chinese population cannot be ignored. We further identified highly differentiated CNV regions between populations. For example, a common deletion in Dong and Zhuang (44.4% and 50%), which overlaps two keratin-associated protein genes contributing to the structure of hair fibers, was not observed in Han Chinese. Interestingly, the most differentiated CNV deletion between HapMap CEU and YRI, containing the CCL3L1 gene and reported in previous studies, was also the most highly differentiated region between Tibetan and other populations. In addition, by jointly analyzing CNVs and SNPs, we found that a CNV region containing the gene CTDSPL was in almost perfect linkage disequilibrium with flanking SNPs in Tibetans but not in other populations except HapMap CHD. Furthermore, we found that the SNP taggability of CNVs in Chinese populations was much lower than that in European populations. Our results suggest the necessity of a full characterization of CNVs in Chinese populations, and the CNV map we constructed serves as a useful resource for further evolutionary and medical studies.
Chen, Rong; Corona, Erik; Sikora, Martin; Dudley, Joel T.; Morgan, Alex A.; Moreno-Estrada, Andres; Nilsen, Geoffrey B.; Ruau, David; Lincoln, Stephen E.; Bustamante, Carlos D.; Butte, Atul J.
2012-01-01
Many disease-susceptible SNPs exhibit significant disparity in ancestral and derived allele frequencies across worldwide populations. While previous studies have examined population differentiation of alleles at specific SNPs, global ethnic patterns of ensembles of disease risk alleles across human diseases are unexamined. To examine these patterns, we manually curated ethnic disease association data from 5,065 papers on human genetic studies representing 1,495 diseases, recording the precise risk alleles and their measured population frequencies and estimated effect sizes. We systematically compared the population frequencies of cross-ethnic risk alleles for each disease across 1,397 individuals from 11 HapMap populations, 1,064 individuals from 53 HGDP populations, and 49 individuals with whole-genome sequences from 10 populations. Type 2 diabetes (T2D) demonstrated extreme directional differentiation of risk allele frequencies across human populations, compared with null distributions of European-frequency matched control genomic alleles and risk alleles for other diseases. Most T2D risk alleles share a consistent pattern of decreasing frequencies along human migration into East Asia. Furthermore, we show that these patterns contribute to disparities in predicted genetic risk across 1,397 HapMap individuals, T2D genetic risk being consistently higher for individuals in the African populations and lower in the Asian populations, irrespective of the ethnicity considered in the initial discovery of risk alleles. We observed a similar pattern in the distribution of T2D Genetic Risk Scores, which are associated with an increased risk of developing diabetes in the Diabetes Prevention Program cohort, for the same individuals. This disparity may be attributable to the promotion of energy storage and usage appropriate to environments and inconsistent energy intake. 
Our results indicate that the differential frequencies of T2D risk alleles may contribute to the observed disparity in T2D incidence rates across ethnic populations. PMID:22511877
Congruence as a measurement of extended haplotype structure across the genome
2012-01-01
Background Historically, extended haplotypes have been defined using only a few data points, such as alleles for several HLA genes in the MHC. High-density SNP data, and the increasing affordability of whole genome SNP typing, create the opportunity to define higher resolution extended haplotypes. This drives the need for new tools that support quantification and visualization of extended haplotypes as defined by as many as 2000 SNPs. Confronted with high-density SNP data across the major histocompatibility complex (MHC) for 2,300 complete families, compiled by the Type 1 Diabetes Genetics Consortium (T1DGC), we developed software for studying extended haplotypes. Methods The software, called ExHap (Extended Haplotype), uses a similarity measurement we term congruence to identify and quantify long-range allele identity. Using ExHap, we analyzed congruence in both the T1DGC data and family-phased data from the International HapMap Project. Results Congruent chromosomes from the T1DGC data have between 96.5% and 99.9% allele identity over 1,818 SNPs spanning 2.64 megabases of the MHC (HLA-DRB1 to HLA-A). Thirty-three of 132 DQ-DR-B-A defined haplotype groups have > 50% congruent chromosomes in this region. For example, 92% of chromosomes within the DR3-B8-A1 haplotype are congruent from HLA-DRB1 to HLA-A (99.8% allele identity). We also applied ExHap to all 22 autosomes for both CEU and YRI cohorts from the International HapMap Project, identifying multiple candidate extended haplotypes. Conclusions Long-range congruence is not unique to the MHC region. Patterns of allele identity on phased chromosomes provide a simple, straightforward approach to visually and quantitatively inspect complex long-range structural patterns in the genome. Such patterns aid the biologist in appreciating genetic similarities and differences across cohorts, and can lead to hypothesis generation for subsequent studies. PMID:22369243
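The abstract reports allele identity between phased chromosomes as a percentage but does not give ExHap's exact definition of congruence. As a rough sketch only, the following Python computes the fraction of SNP positions at which two phased chromosomes carry the same allele, skipping positions where either call is missing. The function name, the missing-data symbol "N" and the example sequences are hypothetical and are not part of ExHap.

```python
def allele_identity(chrom_a, chrom_b, missing="N"):
    """Fraction of comparable SNP positions with identical alleles.

    chrom_a, chrom_b: equal-length sequences of allele calls for the same
    ordered SNPs on two phased chromosomes. Positions where either call
    equals `missing` are skipped (an assumption of this sketch).
    """
    if len(chrom_a) != len(chrom_b):
        raise ValueError("chromosomes must cover the same SNPs")
    compared = 0
    matches = 0
    for a, b in zip(chrom_a, chrom_b):
        if a == missing or b == missing:
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return matches / compared


if __name__ == "__main__":
    # Two hypothetical phased chromosomes typed at six SNPs.
    print(allele_identity("ACGTAC", "ACGTAA"))
```

Whether a pair counts as congruent would then depend on a threshold applied to this fraction; the value of any such threshold is not stated in the abstract.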
Identification of polymorphic inversions from genotypes
2012-01-01
Background Polymorphic inversions are a source of genetic variability with a direct impact on recombination frequencies. Given the difficulty of their experimental study, computational methods have been developed to infer their existence in a large number of individuals using genome-wide data of nucleotide variation. Methods based on haplotype tagging of known inversions attempt to classify individuals as having a normal or inverted allele. Other methods that measure differences between linkage disequilibrium attempt to identify regions with inversions but are unable to classify subjects accurately, an essential requirement for association studies. Results We present a novel method to both identify polymorphic inversions from genome-wide genotype data and classify individuals as containing a normal or inverted allele. Our method, a generalization of a published method for haplotype data [1], utilizes linkage between groups of SNPs to partition a set of individuals into normal and inverted subpopulations. We employ a sliding window scan to identify regions likely to have an inversion, and accumulation of evidence from neighboring SNPs is used to accurately determine the inversion status of each subject. Further, our approach detects inversions directly from genotype data, thus increasing its usability for current genome-wide association studies (GWAS). Conclusions We demonstrate the accuracy of our method in detecting inversions and classifying individuals on principled-simulated genotypes, produced by the evolution of an inversion event within a coalescent model [2]. We applied our method to real genotype data from HapMap Phase III to characterize the inversion status of two known inversions within the regions 17q21 and 8p23 across 1184 individuals. Finally, we scan the full genomes of the European Origin (CEU) and Yoruba (YRI) HapMap samples. We find population-based evidence for 9 out of 15 well-established autosomal inversions, and for 52 regions previously predicted by independent experimental methods in ten (9+1) individuals [3,4]. We provide efficient implementations of both genotype and haplotype methods as a unified R package inveRsion. PMID:22321652
A Map of Copy Number Variations in Chinese Populations
Yang, Yajun; Kang, Longli; Zhang, Xin; Jin, Wenfei; Wu, Bailin; Jin, Li; Xu, Shuhua
2011-01-01
It has been shown that the human genome contains extensive copy number variations (CNVs). Investigating the medical and evolutionary impacts of CNVs requires knowledge of their locations, sizes and frequency distributions within and between populations. However, CNV study of Chinese minorities, which harbor the majority of genetic diversity of Chinese populations, has been underrepresented relative to efforts in other populations. Here we constructed, to our knowledge, the first CNV map in seven Chinese populations representing the major linguistic groups in China, with 1,440 CNV regions identified using the Affymetrix SNP 6.0 Array. Considerable differences in distributions of CNV regions between populations and substantial population structures were observed. We showed that ∼35% of CNV regions identified in minority ethnic groups are not shared by the Han Chinese population, indicating that the contribution of the minorities to the genetic architecture of the Chinese population cannot be ignored. We further identified highly differentiated CNV regions between populations. For example, a common deletion in Dong and Zhuang (44.4% and 50%), which overlaps two keratin-associated protein genes contributing to the structure of hair fibers, was not observed in Han Chinese. Interestingly, the most differentiated CNV deletion between HapMap CEU and YRI, containing the CCL3L1 gene and reported in previous studies, was also the most highly differentiated region between Tibetan and other populations. In addition, by jointly analyzing CNVs and SNPs, we found that a CNV region containing the gene CTDSPL was in almost perfect linkage disequilibrium with flanking SNPs in Tibetans but not in other populations except HapMap CHD. Furthermore, we found that the SNP taggability of CNVs in Chinese populations was much lower than that in European populations. Our results suggest the necessity of a full characterization of CNVs in Chinese populations, and the CNV map we constructed serves as a useful resource for further evolutionary and medical studies. PMID:22087296
Population Structure With Localized Haplotype Clusters
Browning, Sharon R.; Weir, Bruce S.
2010-01-01
We propose a multilocus version of FST and a measure of haplotype diversity using localized haplotype clusters. Specifically, we use haplotype clusters identified with BEAGLE, which is a program implementing a hidden Markov model for localized haplotype clustering and performing several functions including inference of haplotype phase. We apply this methodology to HapMap phase 3 data. With this haplotype-cluster approach, African populations have highest diversity and lowest divergence from the ancestral population, East Asian populations have lowest diversity and highest divergence, and other populations (European, Indian, and Mexican) have intermediate levels of diversity and divergence. These relationships accord with expectation based on other studies and accepted models of human history. In contrast, the population-specific FST estimates obtained directly from single-nucleotide polymorphisms (SNPs) do not reflect such expected relationships. We show that ascertainment bias of SNPs has less impact on the proposed haplotype-cluster-based FST than on the SNP-based version, which provides a potential explanation for these results. Thus, these new measures of FST and haplotype-cluster diversity provide an important new tool for population genetic analysis of high-density SNP data. PMID:20457877
Wood, Andrew R.; Perry, John R. B.; Tanaka, Toshiko; Hernandez, Dena G.; Zheng, Hou-Feng; Melzer, David; Gibbs, J. Raphael; Nalls, Michael A.; Weedon, Michael N.; Spector, Tim D.; Richards, J. Brent; Bandinelli, Stefania; Ferrucci, Luigi; Singleton, Andrew B.; Frayling, Timothy M.
2013-01-01
Genome-wide association (GWA) studies have been limited by the reliance on common variants present on microarrays or imputable from the HapMap Project data. More recently, the completion of the 1000 Genomes Project has provided variant and haplotype information for several million variants derived from sequencing over 1,000 individuals. To help understand the extent to which more variants (including low frequency (1% ≤ MAF <5%) and rare variants (<1%)) can enhance previously identified associations and identify novel loci, we selected 93 quantitative circulating factors for which data were available from the InCHIANTI population study. These phenotypes included cytokines, binding proteins, hormones, vitamins and ions. We selected these phenotypes because many have known strong genetic associations and are potentially important to help understand disease processes. We performed a genome-wide scan for these 93 phenotypes in InCHIANTI. We identified 21 signals and 33 signals that reached P<5×10⁻⁸ based on HapMap and 1000 Genomes imputation, respectively, and 9 and 11 that reached a stricter, likely conservative, threshold of P<5×10⁻¹¹, respectively. Imputation of 1000 Genomes genotype data modestly improved the strength of known associations. Of 20 associations detected at P<5×10⁻⁸ in both analyses (17 of which represent well replicated signals in the NHGRI catalogue), six were captured by the same index SNP, five were nominally more strongly associated in 1000 Genomes imputed data and one was nominally more strongly associated in HapMap imputed data. We also detected an association between a low frequency variant and phenotype that was previously missed by HapMap based imputation approaches. An association between rs112635299 and alpha-1 globulin near the SERPINA gene represented the known association between rs28929474 (MAF = 0.007) and alpha1-antitrypsin that predisposes to emphysema (P = 2.5×10⁻¹²). Our data provide important proof of principle that 1000 Genomes imputation will detect novel, low frequency-large effect associations. PMID:23696881
Genome-wide association study of smoking behaviors in COPD patients
Siedlinski, Mateusz; Cho, Michael H.; Bakke, Per; Gulsvik, Amund; Lomas, David A.; Anderson, Wayne; Kong, Xiangyang; Rennard, Stephen I.; Beaty, Terri H.; Hokanson, John E.; Crapo, James D.; Silverman, Edwin K.
2012-01-01
Background Cigarette smoking is a major risk factor for COPD and COPD severity. Previous genome-wide association studies (GWAS) have identified numerous single nucleotide polymorphisms (SNPs) associated with the number of cigarettes smoked per day (CPD) and a Dopamine Beta-Hydroxylase (DBH) locus associated with smoking cessation in multiple populations. Objective To identify SNPs associated with lifetime average and current CPD, age at smoking initiation, and smoking cessation in COPD subjects. Methods GWAS were conducted in 4 independent cohorts encompassing 3,441 ever-smoking COPD subjects (GOLD stage II or higher). Untyped SNPs were imputed using HapMap (phase II) panel. Results from all cohorts were meta-analyzed. Results Several SNPs near the HLA region on chromosome 6p21 and in an intergenic region on chromosome 2q21 showed associations with age at smoking initiation, both with the lowest p=2×10−7. No SNPs were associated with lifetime average CPD, current CPD or smoking cessation with p<10−6. Nominally significant associations with candidate SNPs within alpha-nicotinic acetylcholine receptors 3/5 (CHRNA3/CHRNA5; e.g. p=0.00011 for SNP rs1051730) and Cytochrome P450 2A6 (CYP2A6; e.g. p=2.78×10−5 for a nonsynonymous SNP rs1801272) regions were observed for lifetime average CPD, however only CYP2A6 showed evidence of significant association with current CPD. A candidate SNP (rs3025343) in the DBH was significantly (p=0.015) associated with smoking cessation. Conclusion We identified two candidate regions associated with age at smoking initiation in COPD subjects. Associations of CHRNA3/CHRNA5 and CYP2A6 loci with CPD and DBH with smoking cessation are also likely of importance in the smoking behaviors of COPD patients. PMID:21685187
Tang, Shaowen; Lv, Xiaozhen; Zhang, Yuan; Wu, Shanshan; Yang, Zhirong; Xia, Yinyin; Tu, Dehua; Deng, Peiyuan; Ma, Yu; Chen, Dafang; Zhan, Siyan
2013-01-01
The pathogenic mechanism of anti-tuberculosis (anti-TB) drug-induced hepatitis is associated with drug metabolizing enzymes. No tagging single-nucleotide polymorphisms (tSNPs) of cytochrome P450 2E1 (CYP2E1) in the risk of anti-TB drug-induced hepatitis have been reported. The present study was aimed at exploring the role of tSNPs in the CYP2E1 gene in a population-based anti-TB treatment cohort. A nested case-control study was designed. Each hepatitis case was matched 1:4 with controls by age, gender, treatment history, disease severity and drug dosage. The tSNPs were selected by using Haploview 4.2 based on the HapMap database of Han Chinese in Beijing, and detected by using TaqMan allelic discrimination technology. Eighty-nine anti-TB drug-induced hepatitis cases and 356 controls were included in this study. Six tSNPs (rs2031920, rs2070672, rs915908, rs8192775, rs2515641, rs2515644) were genotyped and minor allele frequencies of these tSNPs were 21.9%, 23.0%, 19.1%, 23.6%, 20.8% and 44.4% in the cases and 20.9%, 22.7%, 18.9%, 23.2%, 18.2% and 43.2% in the controls, respectively. No significant difference was observed in genotypes or allele frequencies of the 6 tSNPs between case group and control group, and no haplotype in block 1 or block 2 was significantly associated with the development of hepatitis. Based on the Chinese anti-TB treatment cohort, we did not find a statistically significant association between genetic polymorphisms of CYP2E1 and the risk of anti-TB drug-induced hepatitis. None of the haplotypes showed a significant association with the development of hepatitis in Chinese TB population.
Delaney, Jessica T.; Jeff, Janina M.; Brown, Nancy J.; Pretorius, Mias; Okafor, Henry E.; Darbar, Dawood; Roden, Dan M.; Crawford, Dana C.
2012-01-01
Background Despite a greater burden of risk factors, atrial fibrillation (AF) is less common among African Americans than European-descent populations. Genome-wide association studies (GWAS) for AF in European-descent populations have identified three predominant genomic regions associated with increased risk (1q21, 4q25, and 16q22). The contribution of these loci to AF risk in African American is unknown. Methodology/Principal Findings We studied 73 African Americans with AF from the Vanderbilt-Meharry AF registry and 71 African American controls, with no history of AF including after cardiac surgery. Tests of association were performed for 148 SNPs across the three regions associated with AF, and 22 SNPs were significantly associated with AF (P<0.05). The SNPs with the strongest associations in African Americans were both different from the index SNPs identified in European-descent populations and independent from the index European-descent population SNPs (r2<0.40 in HapMap CEU): 1q21 rs4845396 (odds ratio [OR] 0.30, 95% confidence interval [CI] 0.13–0.67, P = 0.003), 4q25 rs4631108 (OR 3.43, 95% CI 1.59–7.42, P = 0.002), and 16q22 rs16971547 (OR 8.1, 95% CI 1.46–45.4, P = 0.016). Estimates of European ancestry were similar among cases (23.6%) and controls (23.8%). Accordingly, the probability of having two copies of the European derived chromosomes at each region did not differ between cases and controls. Conclusions/Significance Variable European admixture at known AF loci does not explain decreased AF susceptibility in African Americans. These data support the role of 1q21, 4q25, and 16q22 variants in AF risk for African Americans, although the index SNPs differ from those identified in European-descent populations. PMID:22384221
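The r2<0.40 criterion in the abstract above refers to the squared allelic correlation between two SNPs. As a minimal sketch of how that quantity is usually defined, the function below computes r2 from a haplotype frequency and the two allele frequencies. The function name and the example frequencies are hypothetical and are not taken from HapMap CEU or from the study.

```python
def ld_r2(p_ab: float, p_a: float, p_b: float) -> float:
    """Squared allelic correlation r^2 between two biallelic SNPs.

    p_ab: frequency of the haplotype carrying allele A (SNP 1) and allele B (SNP 2)
    p_a:  frequency of allele A at SNP 1
    p_b:  frequency of allele B at SNP 2
    """
    d = p_ab - p_a * p_b  # linkage disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both SNPs must be polymorphic")
    return d * d / denom


# Hypothetical frequencies, for illustration only.
r2_weak = ld_r2(p_ab=0.30, p_a=0.40, p_b=0.50)      # partial correlation
r2_perfect = ld_r2(p_ab=0.40, p_a=0.40, p_b=0.40)   # alleles always co-occur
```

Under this definition, a pair with r2 below 0.40 would be treated as largely independent in the sense used above.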
Jiang, Chao Qiang; Liu, Bin; Cheung, Bernard MY; Lam, Tai Hing; Lin, Jie Ming; Li Jin, Ya; Yue, Xiao Jun; Ong, Kwok Leung; Tam, Sidney; Wong, Ka Sing; Tomlinson, Brian; Lam, Karen SL; Thomas, G Neil
2010-01-01
Single nucleotide polymorphisms (SNPs) in the apolipoprotein A5 (APOA5) gene have been associated with hypertriglyceridaemia. We investigated which SNPs in the APOA5 gene were associated with triglyceride levels in two independent Chinese populations. In all, 1375 subjects in the Hong Kong Cardiovascular Risk Factor Prevalence Study were genotyped for five tagging SNPs chosen from HapMap. Replication was sought in 1996 subjects from the Guangzhou Biobank Cohort Study. Among the five SNPs, rs662799 (-1131T>C) was strongly related to log-transformed triglyceride levels among Hong Kong subjects (β=0.192, P=2.6 × 10−13). Plasma triglyceride level was 36.1% higher in CC compared to TT genotype. This association was confirmed in Guangzhou subjects (β=0.159, P=1.3 × 10−12), and was significant irrespective of sex, age group, obesity, metabolic syndrome, hypertension, diabetes, smoking and alcohol drinking. The odds ratios and 95% confidence intervals for plasma triglycerides ≥1.7 mmol/l associated with TC and CC genotypes were, respectively, 1.81 (1.37–2.39) and 2.22 (1.44–3.43) in Hong Kong and 1.27 (1.05–1.54) and 1.97 (1.42–2.73) in Guangzhou. Haplotype analysis suggested the association was due to rs662799 only. The corroborative findings in two independent populations indicate that the APOA5-1131T>C polymorphism is an important and clinically relevant determinant of plasma triglyceride levels in the Chinese population. PMID:20571505
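Odds ratios with 95% confidence intervals, as reported above, can be illustrated with the unadjusted Wald (Woolf) interval for a 2x2 table. This is only a sketch: the counts below are hypothetical, and the published estimates may well come from adjusted regression models rather than from this simple formula.

```python
import math


def odds_ratio_ci(a: int, b: int, c: int, d: int, z: float = 1.96):
    """Unadjusted odds ratio and Wald (Woolf) confidence interval.

    a: carriers with the outcome      b: carriers without the outcome
    c: non-carriers with the outcome  d: non-carriers without the outcome
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Woolf approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, not data from either cohort.
or_est, ci_low, ci_high = odds_ratio_ci(a=40, b=60, c=20, d=80)
```

An interval that excludes 1, as in this made-up table, corresponds to a nominally significant association at the 5% level.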
Castaldi, Peter J; Cho, Michael H; Litonjua, Augusto A; Bakke, Per; Gulsvik, Amund; Lomas, David A; Anderson, Wayne; Beaty, Terri H; Hokanson, John E; Crapo, James D; Laird, Nan; Silverman, Edwin K
2011-12-01
Two recent metaanalyses of genome-wide association studies conducted by the CHARGE and SpiroMeta consortia identified novel loci yielding evidence of association at or near genome-wide significance (GWS) with FEV(1) and FEV(1)/FVC. We hypothesized that a subset of these markers would also be associated with chronic obstructive pulmonary disease (COPD) susceptibility. Thirty-two single-nucleotide polymorphisms (SNPs) in or near 17 genes in 11 previously identified GWS spirometric genomic regions were tested for association with COPD status in four COPD case-control study samples (NETT/NAS, the Norway case-control study, ECLIPSE, and the first 1,000 subjects in COPDGene; total sample size, 3,456 cases and 1,906 controls). In addition to testing the 32 spirometric GWS SNPs, we tested a dense panel of imputed HapMap2 SNP markers from the 17 genes located near the 32 GWS SNPs and in a set of 21 well studied COPD candidate genes. Of the previously identified GWS spirometric genomic regions, three loci harbored SNPs associated with COPD susceptibility at a 5% false discovery rate: the 4q24 locus including FLJ20184/INTS12/GSTCD/NPNT, the 6p21 locus including AGER and PPT2, and the 5q33 locus including ADAM19. In conclusion, markers previously associated at or near GWS with spirometric measures were tested for association with COPD status in data from four COPD case-control studies, and three loci showed evidence of association with COPD susceptibility at a 5% false discovery rate.
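The abstract above reports associations at a 5% false discovery rate without naming the procedure. As an illustration only, the sketch below implements the Benjamini-Hochberg step-up procedure, which is one common way to control FDR; whether the authors used this exact method is an assumption, and the p-values are hypothetical.

```python
def benjamini_hochberg(pvalues, q=0.05):
    """Return the indices of hypotheses rejected at FDR level q (BH step-up)."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        # Track the largest rank k with p_(k) <= q * k / m.
        if pvalues[idx] <= q * rank / m:
            cutoff_rank = rank
    # Reject every hypothesis ranked at or below the cutoff.
    return sorted(order[:cutoff_rank])


# Hypothetical p-values for four SNPs.
rejected = benjamini_hochberg([0.02, 0.03, 0.035, 0.04], q=0.05)
```

In this made-up example the two smallest p-values miss their own thresholds, yet all four hypotheses are rejected because the third and fourth pass theirs; that is the step-up behaviour that distinguishes BH from a per-test cutoff.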
Paschou, Peristera
2010-01-01
Recent large-scale studies of European populations have demonstrated the existence of population genetic structure within Europe and the potential to accurately infer individual ancestry when information from hundreds of thousands of genetic markers is used. In fact, when genomewide genetic variation of European populations is projected down to a two-dimensional Principal Components Analysis plot, a surprising correlation with actual geographic coordinates of self-reported ancestry has been reported. This substructure can hamper the search of susceptibility genes for common complex disorders leading to spurious correlations. The identification of genetic markers that can correct for population stratification becomes therefore of paramount importance. Analyzing 1,200 individuals from 11 populations genotyped for more than 500,000 SNPs (Population Reference Sample), we present a systematic exploration of the extent to which geographic coordinates of origin within Europe can be predicted, with small panels of SNPs. Markers are selected to correlate with the top principal components of the dataset, as we have previously demonstrated. Performing thorough cross-validation experiments we show that it is indeed possible to predict individual ancestry within Europe down to a few hundred kilometers from actual individual origin, using information from carefully selected panels of 500 or 1,000 SNPs. Furthermore, we show that these panels can be used to correctly assign the HapMap Phase 3 European populations to their geographic origin. The SNPs that we propose can prove extremely useful in a variety of different settings, such as stratification correction or genetic ancestry testing, and the study of the history of European populations. PMID:20805874
Patterns of population differentiation of candidate genes for cardiovascular disease
Kullo, Iftikhar J; Ding, Keyue
2007-01-01
Background The basis for ethnic differences in cardiovascular disease (CVD) susceptibility is not fully understood. We investigated patterns of population differentiation (FST) of a set of genes in etiologic pathways of CVD among 3 ethnic groups: Yoruba in Nigeria (YRI), Utah residents with European ancestry (CEU), and Han Chinese (CHB) + Japanese (JPT). We identified 37 pathways implicated in CVD based on the PANTHER classification and 416 genes in these pathways were further studied; these genes belonged to 6 biological processes (apoptosis, blood circulation and gas exchange, blood clotting, homeostasis, immune response, and lipoprotein metabolism). Genotype data were obtained from the HapMap database. Results We calculated FST for 15,559 common SNPs (minor allele frequency ≥ 0.10 in at least one population) in genes that co-segregated among the populations, as well as an average-weighted FST for each gene. SNPs were classified as putatively functional (non-synonymous and untranslated regions) or non-functional (intronic and synonymous sites). Mean FST values for common putatively functional variants were significantly higher than FST values for nonfunctional variants. A significant variation in FST was also seen based on biological processes; the processes of 'apoptosis' and 'lipoprotein metabolism' showed an excess of genes with high FST. Thus, putative functional SNPs in genes in etiologic pathways for CVD show greater population differentiation than non-functional SNPs and a significant variance of FST values was noted among pairwise population comparisons for different biological processes. Conclusion These results suggest a possible basis for varying susceptibility to CVD among ethnic groups. PMID:17626638
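The abstract above does not state which FST estimator was used. As a rough sketch of the underlying idea, the function below computes Wright's FST for one biallelic SNP as (HT - HS) / HT, weighting populations equally and ignoring sample-size corrections. The allele frequencies are hypothetical and do not come from HapMap.

```python
def wright_fst(freqs):
    """Wright's F_ST for one biallelic SNP from per-population allele frequencies.

    Populations are weighted equally; no sample-size correction is applied.
    """
    n = len(freqs)
    p_bar = sum(freqs) / n
    h_t = 2 * p_bar * (1 - p_bar)                 # expected heterozygosity, pooled
    h_s = sum(2 * p * (1 - p) for p in freqs) / n  # mean within-population value
    if h_t == 0:
        raise ValueError("SNP is monomorphic across all populations")
    return (h_t - h_s) / h_t


# Hypothetical allele frequencies in three populations.
fst_diverged = wright_fst([0.1, 0.5, 0.9])
fst_uniform = wright_fst([0.3, 0.3, 0.3])
```

Identical frequencies give an FST of zero, while strongly diverged frequencies push the value toward one, which is the sense in which higher FST indicates greater population differentiation.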
A SNP resource for Douglas-fir: de novo transcriptome assembly and SNP detection and validation.
Howe, Glenn T; Yu, Jianbin; Knaus, Brian; Cronn, Richard; Kolpak, Scott; Dolan, Peter; Lorenz, W Walter; Dean, Jeffrey F D
2013-02-28
Douglas-fir (Pseudotsuga menziesii), one of the most economically and ecologically important tree species in the world, also has one of the largest tree breeding programs. Although the coastal and interior varieties of Douglas-fir (vars. menziesii and glauca) are native to North America, the coastal variety is also widely planted for timber production in Europe, New Zealand, Australia, and Chile. Our main goal was to develop a SNP resource large enough to facilitate genomic selection in Douglas-fir breeding programs. To accomplish this, we developed a 454-based reference transcriptome for coastal Douglas-fir, annotated and evaluated the quality of the reference, identified putative SNPs, and then validated a sample of those SNPs using the Illumina Infinium genotyping platform. We assembled a reference transcriptome consisting of 25,002 isogroups (unique gene models) and 102,623 singletons from 2.76 million 454 and Sanger cDNA sequences from coastal Douglas-fir. We identified 278,979 unique SNPs by mapping the 454 and Sanger sequences to the reference, and by mapping four datasets of Illumina cDNA sequences from multiple seed sources, genotypes, and tissues. The Illumina datasets represented coastal Douglas-fir (64.00 and 13.41 million reads), interior Douglas-fir (80.45 million reads), and a Yakima population similar to interior Douglas-fir (8.99 million reads). We assayed 8067 SNPs on 260 trees using an Illumina Infinium SNP genotyping array. Of these SNPs, 5847 (72.5%) were called successfully and were polymorphic. Based on our validation efficiency, our SNP database may contain as many as ~200,000 true SNPs, and as many as ~69,000 SNPs that could be genotyped at ~20,000 gene loci using an Infinium II array, which is more SNPs than are needed to use genomic selection in tree breeding programs.
Ultimately, these genomic resources will enhance Douglas-fir breeding and allow us to better understand landscape-scale patterns of genetic variation and potential responses to climate change.
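The validation figures quoted in this abstract can be checked with simple arithmetic. The sketch below recomputes the 72.5% validation rate and then scales the 278,979 candidate SNPs by that rate. Treating the "~200,000 true SNPs" estimate as this product is an assumption about how the authors extrapolated, not something the abstract states.

```python
# Counts reported in the abstract.
assayed_snps = 8067
validated_snps = 5847
candidate_snps = 278_979

# Fraction of assayed SNPs that were called successfully and polymorphic.
validation_rate = validated_snps / assayed_snps

# Assumed extrapolation: apply the same rate to every candidate SNP.
expected_true_snps = candidate_snps * validation_rate

print(f"validation rate: {validation_rate:.1%}")
print(f"expected true SNPs: {expected_true_snps:,.0f}")
```

Under that assumption the product lands slightly above 200,000, which is consistent with the rounded figure in the text.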
A SNP resource for Douglas-fir: de novo transcriptome assembly and SNP detection and validation
2013-01-01
Background Douglas-fir (Pseudotsuga menziesii), one of the most economically and ecologically important tree species in the world, also has one of the largest tree breeding programs. Although the coastal and interior varieties of Douglas-fir (vars. menziesii and glauca) are native to North America, the coastal variety is also widely planted for timber production in Europe, New Zealand, Australia, and Chile. Our main goal was to develop a SNP resource large enough to facilitate genomic selection in Douglas-fir breeding programs. To accomplish this, we developed a 454-based reference transcriptome for coastal Douglas-fir, annotated and evaluated the quality of the reference, identified putative SNPs, and then validated a sample of those SNPs using the Illumina Infinium genotyping platform. Results We assembled a reference transcriptome consisting of 25,002 isogroups (unique gene models) and 102,623 singletons from 2.76 million 454 and Sanger cDNA sequences from coastal Douglas-fir. We identified 278,979 unique SNPs by mapping the 454 and Sanger sequences to the reference, and by mapping four datasets of Illumina cDNA sequences from multiple seed sources, genotypes, and tissues. The Illumina datasets represented coastal Douglas-fir (64.00 and 13.41 million reads), interior Douglas-fir (80.45 million reads), and a Yakima population similar to interior Douglas-fir (8.99 million reads). We assayed 8067 SNPs on 260 trees using an Illumina Infinium SNP genotyping array. Of these SNPs, 5847 (72.5%) were called successfully and were polymorphic. Conclusions Based on our validation efficiency, our SNP database may contain as many as ~200,000 true SNPs, and as many as ~69,000 SNPs that could be genotyped at ~20,000 gene loci using an Infinium II array—more SNPs than are needed to use genomic selection in tree breeding programs. 
Ultimately, these genomic resources will enhance Douglas-fir breeding and allow us to better understand landscape-scale patterns of genetic variation and potential responses to climate change. PMID:23445355
Genome-Wide Association Studies of the PR Interval in African Americans
Palmer, Cameron; Meng, Yan A.; Soliman, Elsayed Z.; Musani, Solomon K.; Kerr, Kathleen F.; Schnabel, Renate B.; Lubitz, Steven A.; Sotoodehnia, Nona; Redline, Susan; Pfeufer, Arne; Müller, Martina; Evans, Daniel S.; Nalls, Michael A.; Liu, Yongmei; Newman, Anne B.; Zonderman, Alan B.; Evans, Michele K.; Deo, Rajat; Ellinor, Patrick T.; Paltoo, Dina N.
2011-01-01
The PR interval on the electrocardiogram reflects atrial and atrioventricular nodal conduction time. The PR interval is heritable, provides important information about arrhythmia risk, and has been suggested to differ among human races. Genome-wide association (GWA) studies have identified common genetic determinants of the PR interval in individuals of European and Asian ancestry, but there is a general paucity of GWA studies in individuals of African ancestry. We performed GWA studies in African American individuals from four cohorts (n = 6,247) to identify genetic variants associated with PR interval duration. Genotyping was performed using the Affymetrix 6.0 microarray. Imputation was performed for 2.8 million single nucleotide polymorphisms (SNPs) using combined YRI and CEU HapMap phase II panels. We observed a strong signal (rs3922844) within the gene encoding the cardiac sodium channel (SCN5A) with genome-wide significant association (p<2.5×10−8) in two of the four cohorts and in the meta-analysis. The signal explained 2% of PR interval variability in African Americans (beta = 5.1 msec per minor allele, 95% CI = 4.1–6.1, p = 3×10−23). This SNP was also associated with PR interval (beta = 2.4 msec per minor allele, 95% CI = 1.8–3.0, p = 3×10−16) in individuals of European ancestry (n = 14,042), but with a smaller effect size (p for heterogeneity <0.001) and variability explained (0.5%). Further meta-analysis of the four cohorts identified genome-wide significant associations with SNPs in SCN10A (rs6798015), MEIS1 (rs10865355), and TBX5 (rs7312625) that were highly correlated with SNPs identified in European and Asian GWA studies. African ancestry was associated with increased PR duration (13.3 msec, p = 0.009) in one but not the other three cohorts. 
Our findings demonstrate the relevance of common variants to African Americans at four loci previously associated with PR interval in European and Asian samples and identify an association signal at one of these loci that is more strongly associated with PR interval in African Americans than in Europeans. PMID:21347284
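The effect sizes and confidence intervals reported above are enough for a rough plausibility check of the heterogeneity statement. The sketch below backs out standard errors from the 95% intervals and applies a two-sample z-test to the difference in betas. This is an illustrative approximation under a normality assumption; the authors' actual heterogeneity test is not described in the abstract and may differ.

```python
import math


def se_from_ci(lower: float, upper: float, z: float = 1.96) -> float:
    """Approximate standard error implied by a symmetric 95% confidence interval."""
    return (upper - lower) / (2 * z)


def difference_z_test(beta1: float, se1: float, beta2: float, se2: float):
    """z statistic and two-sided normal p-value for beta1 - beta2."""
    z = (beta1 - beta2) / math.sqrt(se1 ** 2 + se2 ** 2)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability
    return z, p


# Values reported for rs3922844 (msec per minor allele).
se_african_american = se_from_ci(4.1, 6.1)
se_european = se_from_ci(1.8, 3.0)
z_het, p_het = difference_z_test(5.1, se_african_american, 2.4, se_european)
```

With these inputs the approximate p-value falls well below 0.001, in line with the reported "p for heterogeneity <0.001".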
Vallée Marcotte, Bastien; Cormier, Hubert; Rudkowska, Iwona; Lemieux, Simone; Couture, Patrick; Vohl, Marie-Claude
2017-11-06
The objective was to test whether FFAR4 single nucleotide polymorphisms (SNPs) are associated with glycemic control-related traits in humans following fish oil supplementation. A total of 210 participants were given 3 g/day of omega-3 (n-3) fatty acids (FA) (1.9-2.2 g of eicosapentaenoic acid (EPA) and 1.1 g of docosahexaenoic acid (DHA)) during six weeks. Biochemical parameters were taken before and after the supplementation. Using the HapMap database and the tagger procedure in Haploview, 12 tagging SNPs in FFAR4 were selected and then genotyped using TaqMan technology. Transcript expression levels were measured for 30 participants in peripheral mononuclear blood cells. DNA methylation levels were measured for 35 participants in leukocytes. In silico analyses were also performed. Four gene-diet interactions on fasting insulin levels and homeostatic model assessment of insulin resistance (HOMA-IR) index values were found. rs17108973 explained a significant proportion of the variance of insulin levels (3.0%) and HOMA-IR (2.03%) index values. Splice site prediction was different depending on the allele for rs11187527. rs17108973 and rs17484310 had different affinity for transcription factors depending on the allele. n-3 FAs effectively improve insulin-related traits for major allele homozygotes of four FFAR4 SNPs as opposed to carriers of the minor alleles.
Vallée Marcotte, Bastien; Cormier, Hubert; Rudkowska, Iwona; Lemieux, Simone; Couture, Patrick
2017-01-01
The objective was to test whether FFAR4 single nucleotide polymorphisms (SNPs) are associated with glycemic control-related traits in humans following fish oil supplementation. A total of 210 participants were given 3 g/day of omega-3 (n-3) fatty acids (FA) (1.9–2.2 g of eicosapentaenoic acid (EPA) and 1.1 g of docosahexaenoic acid (DHA)) during six weeks. Biochemical parameters were taken before and after the supplementation. Using the HapMap database and the tagger procedure in Haploview, 12 tagging SNPs in FFAR4 were selected and then genotyped using TaqMan technology. Transcript expression levels were measured for 30 participants in peripheral mononuclear blood cells. DNA methylation levels were measured for 35 participants in leukocytes. In silico analyses were also performed. Four gene–diet interactions on fasting insulin levels and homeostatic model assessment of insulin resistance (HOMA-IR) index values were found. rs17108973 explained a significant proportion of the variance of insulin levels (3.0%) and HOMA-IR (2.03%) index values. Splice site prediction was different depending on the allele for rs11187527. rs17108973 and rs17484310 had different affinity for transcription factors depending on the allele. n-3 FAs effectively improve insulin-related traits for major allele homozygotes of four FFAR4 SNPs as opposed to carriers of the minor alleles. PMID:29113108
SNP-RFLPing 2: an updated and integrated PCR-RFLP tool for SNP genotyping.
Chang, Hsueh-Wei; Cheng, Yu-Huei; Chuang, Li-Yeh; Yang, Cheng-Hong
2010-04-08
PCR-restriction fragment length polymorphism (RFLP) assay is a cost-effective method for SNP genotyping and mutation detection, but the manual mining for restriction enzyme sites is challenging and cumbersome. Three years after we constructed SNP-RFLPing, a freely accessible database and analysis tool for restriction enzyme mining of SNPs, significant improvements over the 2006 version have been made and incorporated into the latest version, SNP-RFLPing 2. The primary aim of SNP-RFLPing 2 is to provide comprehensive PCR-RFLP information with multiple functionality about SNPs, such as SNP retrieval to multiple species, different polymorphism types (bi-allelic, tri-allelic, tetra-allelic or indels), gene-centric searching, HapMap tagSNPs, gene ontology-based searching, miRNAs, and SNP500Cancer. The RFLP restriction enzymes and the corresponding PCR primers for the natural and mutagenic types of each SNP are simultaneously analyzed. All the RFLP restriction enzyme prices are also provided to aid selection. Furthermore, the previously encountered updating problems for most SNP related databases are resolved by an on-line retrieval system. The user interfaces for functional SNP analyses have been substantially improved and integrated. SNP-RFLPing 2 offers a new and user-friendly interface for RFLP genotyping that can be used in association studies and is freely available at http://bio.kuas.edu.tw/snp-rflping2.
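The core idea behind restriction enzyme mining for a SNP is to find an enzyme whose recognition site is present with one allele and absent with the other. The sketch below shows that check for a single site. It is not the SNP-RFLPing 2 algorithm; the function name and flanking sequences are hypothetical, and GAATTC is used because it is the well-known EcoRI recognition sequence.

```python
def site_presence_by_allele(left_flank, alleles, right_flank, site):
    """Map each SNP allele to whether the recognition site occurs in its sequence.

    Simplification: the whole sequence is searched, so a copy of the site lying
    entirely within a flank would mask an allele-specific difference.
    """
    site = site.upper()
    return {
        allele: site in (left_flank + allele + right_flank).upper()
        for allele in alleles
    }


def is_rflp_informative(presence):
    """A SNP is distinguishable by this enzyme if the alleles differ in cutting."""
    return len(set(presence.values())) > 1


# Hypothetical flanking sequence around an A/G SNP; GAATTC is the EcoRI site.
presence = site_presence_by_allele("ACGTGA", ("A", "G"), "TTCGGA", "GAATTC")
informative = is_rflp_informative(presence)
```

Here the A allele completes the site and the G allele destroys it, so digestion would yield different fragment patterns for the two alleles. A fuller tool would also handle non-palindromic sites on the reverse strand and mutagenic primers, which this sketch leaves out.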
Han, Summer S.; Yeager, Meredith; Moore, Lee E.; Wei, Ming-Hui; Pfeiffer, Ruth; Toure, Ousmane; Purdue, Mark P.; Johansson, Mattias; Scelo, Ghislaine; Chung, Charles C.; Gaborieau, Valerie; Zaridze, David; Schwartz, Kendra; Szeszenia-Dabrowska, Neonilia; Davis, Faith; Bencko, Vladimir; Colt, Joanne S.; Janout, Vladimir; Matveev, Vsevolod; Foretova, Lenka; Mates, Dana; Navratilova, M.; Boffetta, Paolo; Berg, Christine D.; Grubb, Robert L.; Stevens, Victoria L.; Thun, Michael J.; Diver, W. Ryan; Gapstur, Susan M.; Albanes, Demetrius; Weinstein, Stephanie J.; Virtamo, Jarmo; Burdett, Laurie; Brisuda, Antonin; McKay, James D.; Fraumeni, Joseph F.; Chatterjee, Nilanjan; Rosenberg, Philip S.; Rothman, Nathaniel; Brennan, Paul; Chow, Wong-Ho; Tucker, Margaret A.; Chanock, Stephen J.; Toro, Jorge R.
2012-01-01
In follow-up of a recent genome-wide association study (GWAS) that identified a locus in chromosome 2p21 associated with risk for renal cell carcinoma (RCC), we conducted a fine mapping analysis of a 120 kb region that includes EPAS1. We genotyped 59 tagged common single-nucleotide polymorphisms (SNPs) in 2278 RCC and 3719 controls of European background and observed a novel signal for rs9679290 [P = 5.75 × 10−8, per-allele odds ratio (OR) = 1.27, 95% confidence interval (CI): 1.17–1.39]. Imputation of common SNPs surrounding rs9679290 using HapMap 3 and 1000 Genomes data yielded two additional signals, rs4953346 (P = 4.09 × 10−14) and rs12617313 (P = 7.48 × 10−12), both highly correlated with rs9679290 (r2 > 0.95), but interestingly not correlated with the two SNPs reported in the GWAS: rs11894252 and rs7579899 (r2 < 0.1 with rs9679290). Genotype analysis of rs12617313 confirmed an association with RCC risk (P = 1.72 × 10−9, per-allele OR = 1.28, 95% CI: 1.18–1.39). In conclusion, we report that chromosome 2p21 harbors a complex genetic architecture for common RCC risk variants. PMID:22113997
Zhang, Suhua; Bian, Yingnan; Chen, Anqi; Zheng, Hancheng; Gao, Yuzhen; Hou, Yiping; Li, Chengtao
2017-03-01
Utilizing massively parallel sequencing (MPS) technology for SNP testing in forensic genetics is becoming attractive because of the shortcomings of STR markers, such as their high mutation rates and disadvantages associated with the current PCR-CE method as well as its limitations regarding multiplex capabilities. MPS offers the potential to genotype hundreds to thousands of SNPs from multiple samples in a single experimental run. In this study, we designed a customized SNP panel that includes 273 forensically relevant identity SNPs chosen from SNPforID, IISNP, and the HapMap database as well as related previous studies, and evaluated the levels of genotyping precision, sequence coverage, sensitivity and SNP performance using the Ion Torrent PGM. In a concordance study of the custom MPS-SNP panel, only four MPS calls were missing because read coverage was too low (<20), whereas the others were fully concordant with Sanger sequencing results across the two control samples, 9947A and 9948. The analyses indicated balanced coverage among the included loci, with the exception of 16 SNPs that showed inconsistent allele balance and/or lower read coverage among 50 tested individuals from the Chinese Han population and the above controls. With the exception of the 16 poorly performing SNPs, the sequence coverage obtained was extensive for the bulk of the SNPs, and only three Y-SNPs (rs16980601, rs11096432, rs3900) showed a mean coverage below 1000. Analyses of the dilution series of control DNA 9948 yielded reproducible results down to 1 ng of DNA input. In addition, we provide an analysis tool for automated data quality control and genotyping checks, and we conclude that the SNP targets are polymorphic and independent in the Chinese Han population.
In summary, the evaluation of the sensitivity, accuracy and genotyping performance provides strong support for the application of MPS technology in forensic SNP analysis, and the assay offers a straightforward sample-to-genotype workflow that could be beneficial in forensic casework with respect to both individual identification and complex kinship issues. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
RTEL1 tagging SNPs and haplotypes were associated with glioma development.
Li, Gang; Jin, Tianbo; Liang, Hongjuan; Zhang, Zhiguo; He, Shiming; Tu, Yanyang; Yang, Haixia; Geng, Tingting; Cui, Guangbin; Chen, Chao; Gao, Guodong
2013-05-17
Glioma is the most prevalent primary solid tumor of the central nervous system, and certain single-nucleotide polymorphisms (SNPs) may be related to increased glioma risk and have implications in carcinogenesis. The present case-control study was carried out to elucidate how common variants contribute to glioma susceptibility. Ten candidate tagging SNPs (tSNPs) were selected from seven genes whose polymorphisms have been reported in the literature and in reliable databases to be related to glioma, each with a minor allele frequency (MAF) >5% in the HapMap Asian population. The selected tSNPs were genotyped in 629 glioma patients and 645 controls from a Han Chinese population using the calibrated multiplexed SNP MassEXTEND assay. Two tSNPs in the RTEL1 gene were significantly associated with glioma risk (rs6010620, P=0.0016, OR: 1.32, 95% CI: 1.11-1.56; rs2297440, P=0.001, OR: 1.33, 95% CI: 1.12-1.58) by χ2 test. The genotype "GG" of rs6010620 was identified as a protective genotype for glioma (OR, 0.46; 95% CI, 0.31-0.7; P=0.0002), as was the genotype "CC" of rs2297440 (OR, 0.47; 95% CI, 0.31-0.71; P=0.0003). Furthermore, haplotype "GCT" in the RTEL1 gene was associated with a decreased risk of glioma (OR, 0.7; 95% CI, 0.57-0.86; Fisher's P=0.0005; Pearson's P=0.0005), and haplotype "ATT" was associated with an increased risk of glioma (OR, 1.32; 95% CI, 1.12-1.57; Fisher's P=0.0013; Pearson's P=0.0013). In summary, two single variants in the RTEL1 gene (genotype "GG" of rs6010620 and genotype "CC" of rs2297440), together with the two haplotypes GCT and ATT, were associated with glioma development. Screening for these RTEL1 tagging SNPs and haplotypes might be used to evaluate the risk of glioma development. The virtual slides for this article can be found here: http://www.diagnosticpathology.diagnomx.eu/vs/1993021136961998.
Pemberton, T J; Jakobsson, M; Conrad, D F; Coop, G; Wall, J D; Pritchard, J K; Patel, P I; Rosenberg, N A
2008-07-01
When performing association studies in populations that have not been the focus of large-scale investigations of haplotype variation, it is often helpful to rely on genomic databases in other populations for study design and analysis - such as in the selection of tag SNPs and in the imputation of missing genotypes. One way of improving the use of these databases is to rely on a mixture of database samples that is similar to the population of interest, rather than using the single most similar database sample. We demonstrate the effectiveness of the mixture approach in the application of African, European, and East Asian HapMap samples for tag SNP selection in populations from India, a genetically intermediate region underrepresented in genomic studies of haplotype variation.
Liu, Hanmei; Wang, Xuewen; Wei, Bin; Wang, Yongbin; Liu, Yinghong; Zhang, Junjie; Hu, Yufeng; Yu, Guowu; Li, Jian; Xu, Zhanbin; Huang, Yubi
2016-01-01
In southwest China, some maize landraces have long been isolated geographically, and have phenotypes that differ from those of widely grown cultivars. These landraces may harbor rich genetic variation responsible for those phenotypes. Four-row Wax is one such landrace, with four rows of kernels on the cob. We resequenced the genome of Four-row Wax, obtaining 50.46 Gb sequence at 21.87× coverage, then identified and characterized 3,252,194 SNPs, 213,181 short InDels (1–5 bp) and 39,631 structural variations (greater than 5 bp). Of those, 312,511 (9.6%) SNPs were novel compared to the most detailed haplotype map (HapMap) SNP database of maize. Characterization of variations in reported kernel row number (KRN) related genes and KRN QTL regions revealed potential causal mutations in fea2, td1, kn1, and te1. Genome-wide comparisons revealed abundant genetic variations in Four-row Wax, which may be associated with environmental adaptation. The sequence and SNP variations described here enrich genetic resources of maize, and provide guidance into study of seed numbers for crop yield improvement. PMID:27242868
Onuki, Ritsuko; Yamaguchi, Rui; Shibuya, Tetsuo; Kanehisa, Minoru; Goto, Susumu
2017-01-01
Genome-wide scans for positive selection have become important for genomic medicine, and many studies aim to find genomic regions affected by positive selection that are associated with risk allele variations among populations. Most such studies are designed to detect recent positive selection. However, we hypothesize that ancient positive selection is also important for adaptation to pathogens, and has affected current immune-mediated common diseases. Based on this hypothesis, we developed a novel linkage disequilibrium-based pipeline, which aims to detect regions associated with ancient positive selection across populations from single nucleotide polymorphism (SNP) data. By applying this pipeline to the genotypes in the International HapMap project database, we show that genes in the detected regions are enriched in pathways related to the immune system and infectious diseases. The detected regions also contain SNPs reported to be associated with cancers and metabolic diseases, obesity-related traits, type 2 diabetes, and allergic sensitization. These SNPs were further mapped to biological pathways to determine the associations between phenotypes and molecular functions. Assessments of candidate regions to identify functions associated with variations in incidence rates of these diseases are needed in the future. PMID:28445522
Sabidó, Eduard; Bosch, Elena
2016-01-01
Essential trace elements possess vital functions at molecular, cellular, and physiological levels in health and disease, and they are tightly regulated in the human body. In order to assess variability and potential adaptive evolution of trace element homeostasis, we quantified 18 trace elements in 150 liver samples, together with the expression levels of 90 genes and abundances of 40 proteins involved in their homeostasis. Additionally, we genotyped 169 single nucleotide polymorphisms (SNPs) in the same sample set. We detected significant associations for 8 protein quantitative trait loci (pQTLs), 10 expression quantitative trait loci (eQTLs), and 15 micronutrient quantitative trait loci (nutriQTLs). Six of these exceeded the false discovery rate cutoff and were related to essential trace elements: 1) one pQTL for GPX2 (rs10133290); 2) two previously described eQTLs for HFE (rs12346) and SELO (rs4838862) expression; and 3) three nutriQTLs: the pathogenic C282Y mutation at HFE affecting iron (rs1800562), and two SNPs within several clustered metallothionein genes determining selenium concentration (rs1811322 and rs904773). Within the complete set of significant QTLs (which involved 30 SNPs and 20 gene regions), we identified 12 SNPs with extreme patterns of population differentiation (FST values in the top 5% in at least one HapMap population pair) and significant evidence for selective sweeps involving QTLs at GPX1, SELENBP1, GPX3, SLC30A9, and SLC39A8. Overall, this detailed study of various molecular phenotypes illustrates the role of regulatory variants in explaining differences in trace element homeostasis among populations and in the human adaptive response to environmental pressures related to micronutrients. PMID:26582562
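The population-differentiation filter described in this abstract (FST in the top 5% for at least one HapMap population pair) can be illustrated with a minimal sketch. The abstract does not say which FST estimator was used, so the heterozygosity-based definition below, the equal weighting of the two populations, the function names, and any allele frequencies passed in are assumptions made purely for illustration.

```python
def fst_two_pops(p1: float, p2: float) -> float:
    """Heterozygosity-based F_ST for one biallelic SNP in two populations.

    p1 and p2 are the frequencies of the same allele in each population.
    Assumption: the two populations are weighted equally.
    """
    p_bar = (p1 + p2) / 2
    h_total = 2 * p_bar * (1 - p_bar)            # expected heterozygosity, pooled
    if h_total == 0:
        return 0.0                                # SNP is monomorphic overall
    h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    return (h_total - h_within) / h_total


def in_top_fraction(values, fraction=0.05):
    """Flag values at or above the empirical cutoff for the top `fraction`."""
    if not values:
        return []
    ranked = sorted(values, reverse=True)
    k = max(1, int(len(ranked) * fraction))       # number of values in the top tail
    cutoff = ranked[k - 1]
    return [v >= cutoff for v in values]
```

A SNP would then count as "extreme" if its FST is flagged by `in_top_fraction` against the genome-wide FST distribution for at least one population pair; how that background distribution was built is not stated in the abstract.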
Meldrum, Suzanne J; Li, Yuchun; Zhang, Guicheng; Heaton, Alexandra E M; D'Vaz, Nina; Manz, Judith; Reischl, Eva; Koletzko, Berthold V; Prescott, Susan L; Simmer, Karen
2017-09-19
The enzymes encoded by fatty acid desaturases (FADS) genes determine the desaturation of long-chain polyunsaturated fatty acids (LCPUFA). We investigated if haplotype and single nucleotide polymorphisms (SNPs) in FADS gene cluster can influence LCPUFA status in infants who received either fish oil or placebo supplementation. Children enrolled in the Infant Fish Oil Supplementation Study (IFOS) were randomly allocated to receive either fish oil or placebo from birth to 6 months of age. Blood was collected at 6 months of age for the measurement of fatty acids and for DNA extraction. A total of 276 participant DNA samples underwent genotyping, and 126 erythrocyte and 133 plasma fatty acid measurements were available for analysis. Twenty-two FADS SNPs were selected on the basis of literature and linkage disequilibrium patterns identified from the HapMap data. Haplotype construction was completed using PHASE. For participants allocated to the fish oil group who had two copies of the FADS1 haplotype consisting of SNP minor alleles, DHA levels were significantly higher compared to other haplotypes. This finding was not observed for the placebo group. Furthermore, for members of the fish oil group only, the minor homozygous carriers of all the FADS1 SNPs investigated had significantly higher DHA than other genotypes (rs174545, rs174546, rs174548, rs174553, rs174556, rs174537, rs174448, and rs174455). Overall results of this preliminary study suggest that supplementation with fish oil may only significantly increase DHA in minor allele carriers of FADS1 SNPs. Further research is required to confirm this novel finding.
Physiogenomic analysis of the Puerto Rican population.
Ruaño, Gualberto; Duconge, Jorge; Windemuth, Andreas; Cadilla, Carmen L; Kocherla, Mohan; Villagra, David; Renta, Jessica; Holford, Theodore; Santiago-Borrero, Pedro J
2009-04-01
Admixture in the population of the island of Puerto Rico is of general interest with regards to pharmacogenetics to develop comprehensive strategies for personalized healthcare in Latin Americans. This research was aimed at determining the frequencies of SNPs in key physiological, pharmacological and biochemical genes to infer population structure and ancestry in the Puerto Rican population. A noninterventional, cross-sectional, retrospective study design was implemented following a controlled, stratified-by-region, random sampling protocol. The sample was based on birthrates in each region of the island of Puerto Rico, according to the 2004 National Birth Registry. Genomic DNA samples from 100 newborns were obtained from the Puerto Rico Newborn Screening Program in dried-blood spot cards. Genotyping using a physiogenomic array was performed for 332 SNPs from 196 cardiometabolic and neuroendocrine genes. Population structure was examined using a Bayesian clustering approach as well as by allelic dissimilarity as a measure of allele sharing. The Puerto Rican sample was found to be broadly heterogeneous. We observed three main clusters in the population, which we hypothesize to reflect the historical admixture in the Puerto Rican population from Amerindian, African and European ancestors. We present evidence for this interpretation by comparing allele frequencies for the three clusters with those for the same SNPs available from the International HapMap project for Asian, African and European populations. Our results demonstrate that population analysis can be performed with a physiogenomic array of cardiometabolic and neuroendocrine genes to facilitate the translation of genome diversity into personalized medicine.
Identification of selection signatures in cattle breeds selected for dairy production.
Stella, Alessandra; Ajmone-Marsan, Paolo; Lazzari, Barbara; Boettcher, Paul
2010-08-01
The genomics revolution has spurred the undertaking of HapMap studies of numerous species, allowing for population genomics to increase the understanding of how selection has created genetic differences between subspecies populations. The objectives of this study were to (1) develop an approach to detect signatures of selection in subsets of phenotypically similar breeds of livestock by comparing single nucleotide polymorphism (SNP) diversity between the subset and a larger population, (2) verify this method in breeds selected for simply inherited traits, and (3) apply this method to the dairy breeds in the International Bovine HapMap (IBHM) study. The data consisted of genotypes for 32,689 SNPs of 497 animals from 19 breeds. For a given subset of breeds, the test statistic was the parametric composite log likelihood (CLL) of the differences in allelic frequencies between the subset and the IBHM for a sliding window of SNPs. The null distribution was obtained by calculating CLL for 50,000 random subsets (per chromosome) of individuals. The validity of this approach was confirmed by obtaining extremely large CLLs at the sites of causative variation for polled (BTA1) and black-coat-color (BTA18) phenotypes. Across the 30 bovine chromosomes, 699 putative selection signatures were detected. The largest CLL was on BTA6 and corresponded to KIT, which is responsible for the piebald phenotype present in four of the five dairy breeds. Potassium channel-related genes were at the site of the largest CLL on three chromosomes (BTA14, -16, and -25) whereas integrins (BTA18 and -19) and serine/arginine rich splicing factors (BTA20 and -23) each had the largest CLL on two chromosomes. On the basis of the results of this study, the application of population genomics to farm animals seems quite promising. 
Comparisons between breed groups have the potential to identify genomic regions influencing complex traits with no need for complex equipment and the collection of extensive phenotypic records and can contribute to the identification of candidate genes and to the understanding of the biological mechanisms controlling complex traits.
Oparina, Nina Y; Delgado-Vega, Angelica M; Martinez-Bueno, Manuel; Magro-Checa, César; Fernández, Concepción; Castro, Rafaela Ortega; Pons-Estel, Bernardo A; D'Alfonso, Sandra; Sebastiani, Gian Domenico; Witte, Torsten; Lauwerys, Bernard R; Endreffy, Emoke; Kovács, László; Escudero, Alejandro; López-Pedrera, Chary; Vasconcelos, Carlos; da Silva, Berta Martins; Frostegård, Johan; Truedsson, Lennart; Martin, Javier; Raya, Enrique; Ortego-Centeno, Norberto; de Los Angeles Aguirre, Maria; de Ramón Garrido, Enrique; Palma, María-Jesús Castillo; Alarcon-Riquelme, Marta E; Kozyrev, Sergey V
2015-03-01
To perform fine mapping of the PXK locus associated with systemic lupus erythematosus (SLE) and study functional effects that lead to susceptibility to the disease. Linkage disequilibrium (LD) mapping was conducted by using 1251 SNPs (single nucleotide polymorphism) covering a 862 kb genomic region on 3p14.3 comprising the PXK locus in 1467 SLE patients and 2377 controls of European origin. Tag SNPs and genotypes imputed with IMPUTE2 were tested for association by using SNPTEST and PLINK. The expression QTLs data included three independent datasets for lymphoblastoid cells of European donors: HapMap3, MuTHER and the cross-platform eQTL catalogue. Correlation analysis of eQTLs was performed using Vassarstats. Alternative splicing for the PXK gene was analysed on mRNA from PBMCs. Fine mapping revealed long-range LD (>200 kb) extended over the ABHD6, RPP14, PXK, and PDHB genes on 3p14.3. The highly correlated variants tagged an SLE-associated haplotype that was less frequent in the patients compared with the controls (OR=0.89, p=0.00684). A robust correlation between the association with SLE and enhanced expression of ABHD6 gene was revealed, while neither expression, nor splicing alterations associated with SLE susceptibility were detected for PXK. The SNP allele frequencies as well as eQTL pattern analysed in the CEU and CHB HapMap3 populations indicate that the SLE association and the effect on ABHD6 expression are specific to Europeans. These results confirm the genetic association of the locus 3p14.3 with SLE in Europeans and point to the ABHD6 and not PXK, as the major susceptibility gene in the region. We suggest a pathogenic mechanism mediated by the upregulation of ABHD6 in individuals carrying the SLE-risk variants.
A genetic variation map for chicken with 2.8 million single nucleotide polymorphisms
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wong, G K; Hillier, L; Brandstrom, M
2005-02-20
We describe a genetic variation map for the chicken genome containing 2.8 million single nucleotide polymorphisms (SNPs), based on a comparison of the sequences of 3 domestic chickens (broiler, layer, Silkie) to their wild ancestor Red Jungle Fowl (RJF). Subsequent experiments indicate that at least 90% are true SNPs, and at least 70% are common SNPs that segregate in many domestic breeds. Mean nucleotide diversity is about 5 SNP/kb for almost every possible comparison between RJF and domestic lines, between two different domestic lines, and within domestic lines--contrary to the idea that domestic animals are highly inbred relative to their wild ancestors. In fact, most of the SNPs originated prior to domestication, and there is little to no evidence of selective sweeps for adaptive alleles on length scales of greater than 100 kb.
de Vries, Paul S; Sabater-Lleal, Maria; Chasman, Daniel I; Trompet, Stella; Ahluwalia, Tarunveer S; Teumer, Alexander; Kleber, Marcus E; Chen, Ming-Huei; Wang, Jie Jin; Attia, John R; Marioni, Riccardo E; Steri, Maristella; Weng, Lu-Chen; Pool, Rene; Grossmann, Vera; Brody, Jennifer A; Venturini, Cristina; Tanaka, Toshiko; Rose, Lynda M; Oldmeadow, Christopher; Mazur, Johanna; Basu, Saonli; Frånberg, Mattias; Yang, Qiong; Ligthart, Symen; Hottenga, Jouke J; Rumley, Ann; Mulas, Antonella; de Craen, Anton J M; Grotevendt, Anne; Taylor, Kent D; Delgado, Graciela E; Kifley, Annette; Lopez, Lorna M; Berentzen, Tina L; Mangino, Massimo; Bandinelli, Stefania; Morrison, Alanna C; Hamsten, Anders; Tofler, Geoffrey; de Maat, Moniek P M; Draisma, Harmen H M; Lowe, Gordon D; Zoledziewska, Magdalena; Sattar, Naveed; Lackner, Karl J; Völker, Uwe; McKnight, Barbara; Huang, Jie; Holliday, Elizabeth G; McEvoy, Mark A; Starr, John M; Hysi, Pirro G; Hernandez, Dena G; Guan, Weihua; Rivadeneira, Fernando; McArdle, Wendy L; Slagboom, P Eline; Zeller, Tanja; Psaty, Bruce M; Uitterlinden, André G; de Geus, Eco J C; Stott, David J; Binder, Harald; Hofman, Albert; Franco, Oscar H; Rotter, Jerome I; Ferrucci, Luigi; Spector, Tim D; Deary, Ian J; März, Winfried; Greinacher, Andreas; Wild, Philipp S; Cucca, Francesco; Boomsma, Dorret I; Watkins, Hugh; Tang, Weihong; Ridker, Paul M; Jukema, Jan W; Scott, Rodney J; Mitchell, Paul; Hansen, Torben; O'Donnell, Christopher J; Smith, Nicholas L; Strachan, David P; Dehghan, Abbas
2017-01-01
An increasing number of genome-wide association (GWA) studies are now using the higher resolution 1000 Genomes Project reference panel (1000G) for imputation, with the expectation that 1000G imputation will lead to the discovery of additional associated loci when compared to HapMap imputation. In order to assess the improvement of 1000G over HapMap imputation in identifying associated loci, we compared the results of GWA studies of circulating fibrinogen based on the two reference panels. Using both HapMap and 1000G imputation we performed a meta-analysis of 22 studies comprising the same 91,953 individuals. We identified six additional signals using 1000G imputation, while 29 loci were associated using both HapMap and 1000G imputation. One locus identified using HapMap imputation was not significant using 1000G imputation. The genome-wide significance threshold of 5×10⁻⁸ is based on the number of independent statistical tests using HapMap imputation, and 1000G imputation may lead to further independent tests that should be corrected for. When using a stricter Bonferroni correction for the 1000G GWA study (P-value < 2.5×10⁻⁸), the number of loci significant only using HapMap imputation increased to 4 while the number of loci significant only using 1000G decreased to 5. In conclusion, 1000G imputation enabled the identification of 20% more loci than HapMap imputation, although the advantage of 1000G imputation became less clear when a stricter Bonferroni correction was used. More generally, our results provide insights that are applicable to the implementation of other dense reference panels that are under development.
de Vries, Paul S.; Sabater-Lleal, Maria; Chasman, Daniel I.; Trompet, Stella; Kleber, Marcus E.; Chen, Ming-Huei; Wang, Jie Jin; Attia, John R.; Marioni, Riccardo E.; Weng, Lu-Chen; Grossmann, Vera; Brody, Jennifer A.; Venturini, Cristina; Tanaka, Toshiko; Rose, Lynda M.; Oldmeadow, Christopher; Mazur, Johanna; Basu, Saonli; Yang, Qiong; Ligthart, Symen; Hottenga, Jouke J.; Rumley, Ann; Mulas, Antonella; de Craen, Anton J. M.; Grotevendt, Anne; Taylor, Kent D.; Delgado, Graciela E.; Kifley, Annette; Lopez, Lorna M.; Berentzen, Tina L.; Mangino, Massimo; Bandinelli, Stefania; Morrison, Alanna C.; Hamsten, Anders; Tofler, Geoffrey; de Maat, Moniek P. M.; Draisma, Harmen H. M.; Lowe, Gordon D.; Zoledziewska, Magdalena; Sattar, Naveed; Lackner, Karl J.; Völker, Uwe; McKnight, Barbara; Huang, Jie; Holliday, Elizabeth G.; McEvoy, Mark A.; Starr, John M.; Hysi, Pirro G.; Hernandez, Dena G.; Guan, Weihua; Rivadeneira, Fernando; McArdle, Wendy L.; Slagboom, P. Eline; Zeller, Tanja; Psaty, Bruce M.; Uitterlinden, André G.; de Geus, Eco J. C.; Stott, David J.; Binder, Harald; Hofman, Albert; Franco, Oscar H.; Rotter, Jerome I.; Ferrucci, Luigi; Spector, Tim D.; Deary, Ian J.; März, Winfried; Greinacher, Andreas; Wild, Philipp S.; Cucca, Francesco; Boomsma, Dorret I.; Watkins, Hugh; Tang, Weihong; Ridker, Paul M.; Jukema, Jan W.; Scott, Rodney J.; Mitchell, Paul; Hansen, Torben; O'Donnell, Christopher J.; Smith, Nicholas L.; Strachan, David P.
2017-01-01
An increasing number of genome-wide association (GWA) studies are now using the higher resolution 1000 Genomes Project reference panel (1000G) for imputation, with the expectation that 1000G imputation will lead to the discovery of additional associated loci when compared to HapMap imputation. In order to assess the improvement of 1000G over HapMap imputation in identifying associated loci, we compared the results of GWA studies of circulating fibrinogen based on the two reference panels. Using both HapMap and 1000G imputation we performed a meta-analysis of 22 studies comprising the same 91,953 individuals. We identified six additional signals using 1000G imputation, while 29 loci were associated using both HapMap and 1000G imputation. One locus identified using HapMap imputation was not significant using 1000G imputation. The genome-wide significance threshold of 5×10⁻⁸ is based on the number of independent statistical tests using HapMap imputation, and 1000G imputation may lead to further independent tests that should be corrected for. When using a stricter Bonferroni correction for the 1000G GWA study (P-value < 2.5×10⁻⁸), the number of loci significant only using HapMap imputation increased to 4 while the number of loci significant only using 1000G decreased to 5. In conclusion, 1000G imputation enabled the identification of 20% more loci than HapMap imputation, although the advantage of 1000G imputation became less clear when a stricter Bonferroni correction was used. More generally, our results provide insights that are applicable to the implementation of other dense reference panels that are under development. PMID:28107422
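The two significance thresholds quoted in this abstract follow the usual Bonferroni form, alpha divided by the number of independent tests. The sketch below works through that arithmetic. The family-wise alpha of 0.05 and the test counts (one million and two million) are assumptions chosen only because they reproduce the quoted thresholds; the abstract does not state them. The locus counts are taken from the abstract, and the 20% figure is reproduced here under one possible reading (additional 1000G signals relative to all HapMap loci).

```python
def bonferroni_threshold(n_tests: int, alpha: float = 0.05) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


# Assumed numbers of independent tests (not given in the abstract).
HAPMAP_TESTS = 1_000_000
KG_TESTS = 2_000_000

hapmap_threshold = bonferroni_threshold(HAPMAP_TESTS)   # conventional GWA threshold
kg_threshold = bonferroni_threshold(KG_TESTS)           # stricter 1000G threshold

# Locus counts reported in the abstract at the conventional threshold.
shared_loci = 29       # significant with both reference panels
hapmap_only = 1        # significant with HapMap imputation only
kg_only = 6            # additional signals with 1000G imputation only

hapmap_total = shared_loci + hapmap_only
relative_gain = kg_only / hapmap_total   # one reading of "20% more loci"
```

Halving the threshold corresponds to doubling the assumed number of independent tests, which is why the stricter correction trims the apparent advantage of the denser panel.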
RTEL1 tagging SNPs and haplotypes were associated with glioma development
2013-01-01
Abstract Glioma is the most prevalent primary solid tumor of the central nervous system, and certain single-nucleotide polymorphisms (SNPs) may be related to increased glioma risk and have implications in carcinogenesis. The present case–control study was carried out to elucidate how common variants contribute to glioma susceptibility. Ten candidate tagging SNPs (tSNPs) were selected from seven genes whose polymorphisms have been reported in the literature and in reliable databases to be related to gliomas, each with a minor allele frequency (MAF) > 5% in the HapMap Asian population. The selected tSNPs were genotyped in 629 glioma patients and 645 controls from a Han Chinese population using the multiplexed SNP MassEXTEND assay calibrated. Two tSNPs in the RTEL1 gene were significantly associated with glioma risk (rs6010620, P = 0.0016, OR: 1.32, 95% CI: 1.11-1.56; rs2297440, P = 0.001, OR: 1.33, 95% CI: 1.12-1.58) by χ² test. The genotype “GG” of rs6010620 was identified as a protective genotype for glioma (OR, 0.46; 95% CI, 0.31-0.7; P = 0.0002), as was the genotype “CC” of rs2297440 (OR, 0.47; 95% CI, 0.31-0.71; P = 0.0003). Furthermore, haplotype “GCT” in the RTEL1 gene was associated with a decreased risk of glioma (OR, 0.7; 95% CI, 0.57-0.86; Fisher’s P = 0.0005; Pearson’s P = 0.0005), and haplotype “ATT” was associated with an increased risk of glioma (OR, 1.32; 95% CI, 1.12-1.57; Fisher’s P = 0.0013; Pearson’s P = 0.0013). Two single variants in the RTEL1 gene (genotype “GG” of rs6010620 and genotype “CC” of rs2297440), together with the two haplotypes GCT and ATT, were associated with glioma development. Screening for these RTEL1 tagging SNPs and haplotypes might be used to evaluate the risk of glioma development.
Virtual slides The virtual slides for this article can be found here: http://www.diagnosticpathology.diagnomx.eu/vs/1993021136961998 PMID:23683922
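The odds ratios and 95% confidence intervals reported in this abstract are the kind of summary obtained from a 2×2 table of allele (or genotype) counts in cases and controls. The sketch below shows one common way to compute them, using the log odds ratio with a normal-approximation (Woolf) interval. The abstract does not say which interval method the authors used and does not give the underlying counts, so the method, the function name, and any counts supplied to it are assumptions for illustration only.

```python
import math


def odds_ratio_ci(case_exposed, case_unexposed, ctrl_exposed, ctrl_unexposed, z=1.96):
    """Odds ratio with a Woolf (log-scale, normal approximation) confidence interval.

    All four cell counts must be positive. z=1.96 gives an approximate 95% interval.
    Returns (odds_ratio, lower_bound, upper_bound).
    """
    cells = (case_exposed, case_unexposed, ctrl_exposed, ctrl_unexposed)
    if min(cells) <= 0:
        raise ValueError("all cell counts must be positive")
    odds_ratio = (case_exposed * ctrl_unexposed) / (case_unexposed * ctrl_exposed)
    # Standard error of log(OR) is the root of the summed reciprocal counts.
    se_log_or = math.sqrt(sum(1 / c for c in cells))
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper
```

An interval that lies entirely below 1 (as for the "GG" and "CC" genotypes above) indicates a protective association, and one entirely above 1 indicates increased risk.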
HapMap-based study of CIP2A gene polymorphisms and HCC susceptibility
LI, YUCHUN; WANG, KAIJUAN; DAI, LIPING; WANG, PENG; SONG, CHUNHUA; SHI, JIANXIANG; REN, PENGFEI; YE, HUA; ZHANG, JIANYING
2012-01-01
CIP2A is a human oncoprotein that inhibits PP2A and stabilizes c-myc in human malignancies. Autoantibodies to CIP2A protein have been reported to be present in higher levels in sera from patients with hepatocellular carcinoma (HCC) than in sera of healthy individuals. The CIP2A gene has been demonstrated as a potential cancer susceptibility gene. To elucidate whether common CIP2A variants are associated with HCC susceptibility, we conducted a case-control study comprising 233 cases of HCC and 280 controls matched on age, gender and ethnicity in the Chinese Han population. Two haplotype-tagging single nucleotide polymorphisms (htSNPs) (rs2278911 and rs4855656) from the HapMap database were analyzed, which provide an almost complete coverage of the genetic variations in the CIP2A gene. We found that neither of these htSNPs and haplotypes were associated with the risk of HCC. However, an interaction was observed between hepatitis virus B and C infection (HBV and HCV) and the C carriers (TC or CC) of rs2278911 on HCC risk (OR=12.35; 95% CI, 4.93–19.87). No such association was found for rs4855656. Our study also demonstrated that two htSNPs (rs2278911 and rs4855656) in the CIP2A gene are not associated with the risk of HCC. HBV and HCV infection was found to exert a synergistic effect on the risk of HCC in individuals with the C carriers (TC or CC) of rs2278911 in the Chinese Han population. PMID:22844383
Wang, Hansong; Haiman, Christopher A.; Kolonel, Laurence N.; Henderson, Brian E.; Wilkens, Lynne R.; Le Marchand, Loïc; Stram, Daniel O.
2011-01-01
It is well-known that population substructure may lead to confounding in case-control association studies. Here, we examined genetic structure in a large racially and ethnically diverse sample consisting of 5 ethnic groups of the Multiethnic Cohort study (African Americans, Japanese Americans, Latinos, European Americans and Native Hawaiians) using 2,509 SNPs distributed across the genome. Principal component analysis on 6,213 study participants, 18 Native Americans and 11 HapMap III populations revealed 4 important principal components (PCs): the first two separated Asians, Europeans and Africans, and the third and fourth corresponded to Native American and Native Hawaiian (Polynesian) ancestry, respectively. Individual ethnic composition derived from self-reported parental information matched well to genetic ancestry for Japanese and European Americans. STRUCTURE-estimated individual ancestral proportions for African Americans and Latinos are consistent with previous reports. We quantified the East Asian (mean 27%), European (mean 27%) and Polynesian (mean 46%) ancestral proportions for the first time, to our knowledge, for Native Hawaiians. Simulations based on realistic settings of case-control studies nested in the Multiethnic Cohort found that the effect of population stratification was modest and readily corrected by adjusting for race/ethnicity or by adjusting for top PCs derived from all SNPs or from ancestry informative markers; the power of these approaches was similar when averaged across causal variants simulated based on allele frequencies of the 2,509 genotyped markers. The bias may be large in case-only analysis of gene by gene interactions but it can be corrected by top PCs derived from all SNPs. PMID:20499252
The Role of Local Ancestry Adjustment in Association Studies Using Admixed Populations
Zhang, Jianqi; Stram, Daniel O.
2016-01-01
Association analysis using admixed populations imposes challenges and opportunities for disease mapping. By developing some explicit results for the variance of an allele of interest conditional on either local or global ancestry and by simulation of recently admixed genomes we evaluate power and false-positive rates under a variety of scenarios concerning linkage disequilibrium (LD) and the presence of unmeasured variants. Pairwise LD patterns were compared between admixed and nonadmixed populations using the HapMap phase 3 data. Based on the above, we showed the following: For causal variants with similar effect size in all populations, power is generally higher in a study using an admixed population than in one using a nonadmixed population, especially for highly differentiated SNPs. This gain of power is achieved with adjustment of global ancestry, which completely removes any cross-chromosome inflation of type I error rates, and addresses much of the intrachromosome inflation. If reliably estimated, adjusting for local ancestry precisely recovers the localization that could have been achieved in a stratified analysis of source populations. Improved localization is most evident for highly differentiated SNPs; however, the advantage of higher power is lost on exactly the same differentiated SNPs. In real admixed populations such as African Americans and Latinos, the expansion of LD is not as dramatic as in our simulation. While adjustment for global ancestry is required prior to announcing a novel association seen in an admixed population, local ancestry adjustment may best be regarded as a localization tool not strictly required for discovery purposes. PMID:25043967
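Pairwise LD comparisons like the one mentioned in this abstract are often summarized with the coefficient D and the squared correlation r² between two biallelic loci. The abstract does not specify which LD measure was compared, so the choice of r², the function name, and the use of phased haplotype frequencies as input are assumptions in this minimal sketch.

```python
def ld_d_and_r2(p_ab: float, p_a: float, p_b: float):
    """Return (D, r^2) for two biallelic loci.

    p_ab : frequency of the haplotype carrying allele A at locus 1 and B at locus 2
    p_a  : frequency of allele A at locus 1
    p_b  : frequency of allele B at locus 2
    """
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined when either locus is monomorphic")
    d = p_ab - p_a * p_b          # deviation from independent assortment
    return d, (d * d) / denom
```

Computing r² for the same SNP pairs in an admixed sample and in each source population would give a direct, if simplified, view of how much admixture extends LD.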
Physiogenomic analysis of the Puerto Rican population
Ruaño, Gualberto; Duconge, Jorge; Windemuth, Andreas; Cadilla, Carmen L; Kocherla, Mohan; Villagra, David; Renta, Jessica; Holford, Theodore; Santiago-Borrero, Pedro J
2009-01-01
Aims Admixture in the population of the island of Puerto Rico is of general interest with regards to pharmacogenetics to develop comprehensive strategies for personalized healthcare in Latin Americans. This research was aimed at determining the frequencies of SNPs in key physiological, pharmacological and biochemical genes to infer population structure and ancestry in the Puerto Rican population. Materials & methods A noninterventional, cross-sectional, retrospective study design was implemented following a controlled, stratified-by-region, random sampling protocol. The sample was based on birthrates in each region of the island of Puerto Rico, according to the 2004 National Birth Registry. Genomic DNA samples from 100 newborns were obtained from the Puerto Rico Newborn Screening Program in dried-blood spot cards. Genotyping using a physiogenomic array was performed for 332 SNPs from 196 cardiometabolic and neuroendocrine genes. Population structure was examined using a Bayesian clustering approach as well as by allelic dissimilarity as a measure of allele sharing. Results The Puerto Rican sample was found to be broadly heterogeneous. We observed three main clusters in the population, which we hypothesize to reflect the historical admixture in the Puerto Rican population from Amerindian, African and European ancestors. We present evidence for this interpretation by comparing allele frequencies for the three clusters with those for the same SNPs available from the International HapMap project for Asian, African and European populations. Conclusion Our results demonstrate that population analysis can be performed with a physiogenomic array of cardiometabolic and neuroendocrine genes to facilitate the translation of genome diversity into personalized medicine. PMID:19374515
Nagaie, Satoshi; Ogishima, Soichi; Nakaya, Jun; Tanaka, Hiroshi
2015-01-01
Genome-wide association studies (GWAS) and linkage analysis have identified many single nucleotide polymorphisms (SNPs) related to disease. There are many unknown SNPs with minor allele frequencies (MAFs) as low as 0.005 that have intermediate effects, with odds ratios between 1.5 and 3.0. Low-frequency variants having intermediate effects on disease pathogenesis are believed to have complex interactions with environmental factors, called gene-environment interactions (GxE). Hence, we describe a model using a 3D Manhattan plot, called the GxE landscape plot, to visualize the association p-values for gene-environment interactions (GxE). We used the Gene-Environment iNteraction Simulator 2 (GENS2) program to simulate interactions between two genetic loci and one environmental factor in this exercise. The dataset used for training contains disease status, gender, 20 environmental exposures and 100 genotypes for 170 subjects, and p-values were calculated by the Cochran-Mantel-Haenszel chi-squared test on known data. Subsequently, we created a 3D GxE landscape plot of the negative logarithm of the association p-values for all the possible combinations of genetic and environmental factors with their hierarchical clustering. Thus, the GxE landscape plot is a valuable model to predict association p-values for GxE and similarity among genotypes and environments in the context of disease pathogenesis. Abbreviations: GxE - Gene-environment interactions, GWAS - Genome-wide association study, MAFs - Minor allele frequencies, SNPs - Single nucleotide polymorphisms, EWAS - Environment-wide association study, FDR - False discovery rate, JPT+CHB - HapMap populations of Japanese in Tokyo, Japan, and Han Chinese in Beijing.
Association of Transcription Factor Gene LMX1B with Autism
Yamada, Kazuo; Iwayama, Yoshimi; Toyota, Tomoko; Tsujii, Masatsugu; Iwata, Yasuhide; Suzuki, Katsuaki; Matsuzaki, Hideo; Iwata, Keiko; Sugiyama, Toshiro; Yoshikawa, Takeo; Mori, Norio
2011-01-01
Multiple lines of evidence suggest a serotoninergic dysfunction in autism. The role of LMX1B in the development and maintenance of serotoninergic neurons is well known. In order to examine the role, if any, of LMX1B in autism pathophysiology, a trio-based SNP association study using 252 family samples from the AGRE was performed. Using a pair-wise tagging method, 24 SNPs were selected from the HapMap data, based on their location and minor allele frequency. Two SNPs (rs10732392 and rs12336217) showed moderate association with autism, with p values of 0.018 and 0.022, respectively, in the transmission disequilibrium test. The haplotype AGCGTG also showed significant association (p = 0.008). Further, LMX1B mRNA expression was studied in postmortem brain tissues of autism subjects and healthy control samples. LMX1B transcript levels were found to be significantly lower in the anterior cingulate gyrus region of autism patients compared with controls (p = 0.049). Our study suggests a possible role of LMX1B in the pathophysiology of autism. Based on previous reports, it is likely to be mediated through a serotoninergic mechanism. This is the first report on the association of LMX1B with autism, though it should be viewed with some caution considering the modest associations we report. PMID:21901133
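The transmission disequilibrium test named in this abstract compares how often heterozygous parents transmit one allele versus the other to an affected child. A minimal sketch of the classical biallelic form of the statistic follows. The abstract does not report transmission counts or name the software used, so the function name and any counts passed in are hypothetical; the p-value uses the chi-square distribution with one degree of freedom, written in terms of the complementary error function.

```python
import math


def tdt(transmitted: int, untransmitted: int):
    """Classical TDT for one biallelic marker.

    transmitted   : times heterozygous parents passed the test allele to an affected child
    untransmitted : times they passed the other allele instead
    Returns (chi_square, p_value), with 1 degree of freedom.
    """
    total = transmitted + untransmitted
    if total == 0:
        raise ValueError("no informative (heterozygous) parental transmissions")
    chi_square = (transmitted - untransmitted) ** 2 / total
    # Survival function of chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi_square / 2))
    return chi_square, p_value
```

Because only transmissions from heterozygous parents enter the statistic, the test is robust to population stratification, which is one reason trio designs such as the one described here are used.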
GStream: Improving SNP and CNV Coverage on Genome-Wide Association Studies
Alonso, Arnald; Marsal, Sara; Tortosa, Raül; Canela-Xandri, Oriol; Julià, Antonio
2013-01-01
We present GStream, a method that combines genome-wide SNP and CNV genotyping in the Illumina microarray platform with unprecedented accuracy. This new method outperforms previous well-established SNP genotyping software. More importantly, the CNV calling algorithm of GStream dramatically improves the results obtained by previous state-of-the-art methods and yields an accuracy that is close to that obtained by purely CNV-oriented technologies like Comparative Genomic Hybridization (CGH). We demonstrate the superior performance of GStream using microarray data generated from HapMap samples. Using the reference CNV calls generated by the 1000 Genomes Project (1KGP) and well-known studies on whole genome CNV characterization based either on CGH or genotyping microarray technologies, we show that GStream can increase the number of reliably detected variants up to 25% compared to previously developed methods. Furthermore, the increased genome coverage provided by GStream allows the discovery of CNVs in close linkage disequilibrium with SNPs, previously associated with disease risk in published Genome-Wide Association Studies (GWAS). These results could provide important insights into the biological mechanism underlying the detected disease risk association. With GStream, large-scale GWAS will not only benefit from the combined genotyping of SNPs and CNVs at an unprecedented accuracy, but will also take advantage of the computational efficiency of the method. PMID:23844243
Identification of genomic regions contributing to etoposide-induced cytotoxicity.
Bleibel, Wasim K; Duan, Shiwei; Huang, R Stephanie; Kistner, Emily O; Shukla, Sunita J; Wu, Xiaolin; Badner, Judith A; Dolan, M Eileen
2009-03-01
Etoposide is routinely used in combination-based chemotherapy for testicular cancer and small-cell lung cancer; however, myelosuppression, therapy-related leukemia and neurotoxicity limit its utility. To determine the genetic contribution to cellular sensitivity to etoposide, we evaluated cell growth inhibition in Centre d'Etude du Polymorphisme Humain lymphoblastoid cell lines from 24 multi-generational pedigrees (321 samples) following treatment with 0.02-2.5 µM etoposide for 72 h. Heritability analysis showed that genetic variation contributes significantly to the cytotoxic phenotypes (h² = 0.17-0.25, P = 4.9 × 10⁻⁵ to 7.3 × 10⁻³). Whole genome linkage scans uncovered 8 regions with peak LOD scores ranging from 1.57 to 2.55, with the most significant signals being found on chromosome 5 (LOD = 2.55) and chromosome 6 (LOD = 2.52). Linkage-directed association was performed on a subset of HapMap samples within the pedigrees to find 22 SNPs significantly associated with etoposide cytotoxicity at one or more treatment concentrations. UVRAG, a DNA repair gene, SEMA5A, SLC7A6 and PRMT7 are implicated from these unbiased studies. Our findings suggest that susceptibility to etoposide-induced cytotoxicity is heritable and using an integrated genomics approach we identified both genomic regions and SNPs associated with the cytotoxic phenotypes.
CGDSNPdb: a database resource for error-checked and imputed mouse SNPs.
Hutchins, Lucie N; Ding, Yueming; Szatkiewicz, Jin P; Von Smith, Randy; Yang, Hyuna; de Villena, Fernando Pardo-Manuel; Churchill, Gary A; Graber, Joel H
2010-07-06
The Center for Genome Dynamics Single Nucleotide Polymorphism Database (CGDSNPdb) is an open-source value-added database with more than nine million mouse single nucleotide polymorphisms (SNPs), drawn from multiple sources, with genotypes assigned to multiple inbred strains of laboratory mice. All SNPs are checked for accuracy and annotated for properties specific to the SNP as well as those implied by changes to overlapping protein-coding genes. CGDSNPdb serves as the primary interface to two unique data sets, the 'imputed genotype resource' in which a Hidden Markov Model was used to assess local haplotypes and the most probable base assignment at several million genomic loci in tens of strains of mice, and the Affymetrix Mouse Diversity Genotyping Array, a high density microarray with over 600,000 SNPs and over 900,000 invariant genomic probes. CGDSNPdb is accessible online through either a web-based query tool or a MySQL public login. Database URL: http://cgd.jax.org/cgdsnpdb/
Natural positive selection and north–south genetic diversity in East Asia
Suo, Chen; Xu, Haiyan; Khor, Chiea-Chuen; Ong, Rick TH; Sim, Xueling; Chen, Jieming; Tay, Wan-Ting; Sim, Kar-Seng; Zeng, Yi-Xin; Zhang, Xuejun; Liu, Jianjun; Tai, E-Shyong; Wong, Tien-Yin; Chia, Kee-Seng; Teo, Yik-Ying
2012-01-01
Recent reports have identified a north–south cline in genetic variation in East and South-East Asia, but these studies have not formally explored the basis of these clinal differences. Understanding the origins of these variations may provide valuable insights in tracking down the functional variants in genomic regions identified by genetic association studies. Here we investigate the genetic basis of these differences with genome-wide data from the HapMap, the Human Genome Diversity Project and the Singapore Genome Variation Project. We implemented four bioinformatic measures to discover genomic regions that are considerably differentiated either between two Han Chinese populations in the north and south of China, or across 22 populations in East and South-East Asia. These measures prioritized genomic stretches with: (i) regional differences in the allelic spectrum for SNPs common to the two Han Chinese populations; (ii) differential evidence of positive selection between the two populations as quantified by integrated haplotype score (iHS) and cross-population extended haplotype homozygosity (XP-EHH); (iii) significant correlation between allele frequencies and geographical latitudes of the 22 populations. We also explored the extent of linkage disequilibrium variations in these regions, which is important in combining genetic association studies from North and South Chinese. Two of the regions that emerged are found in HLA class I and II, suggesting that the HLA imputation panel from the HapMap may not be directly applicable to every Chinese sample. This has important implications for autoimmune studies that plan to impute the classical HLA alleles to fine map the SNP association signals. PMID:21792231
Association of Genetic Polymorphisms on VEGFA and VEGFR2 With Risk of Coronary Heart Disease
Liu, Doxing; Song, Jiantao; Ji, Xianfei; Liu, Zunqi; Cong, Mulin; Hu, Bo
2016-01-01
Coronary heart disease (CHD) is a cardiovascular disease to which abnormal neovascularization contributes. VEGFA (vascular endothelial growth factor A) and VEGFR2 (vascular endothelial growth factor receptor 2) have been shown to be involved in pathological angiogenesis. This study was intended to confirm whether single nucleotide polymorphisms (SNPs) of VEGFA and VEGFR2 were associated with CHD in a Chinese population, considering pathological features and living habits of CHD patients. Peripheral blood samples were collected from 810 CHD patients and 805 healthy individuals. Six tag SNPs within VEGFA and VEGFR2 were obtained from the HapMap database. Genotyping of SNPs was performed using the SNaPshot method (Applied Biosystems, Foster City, CA). Odds ratios (ORs) and their 95% confidence intervals (95% CIs) were calculated to evaluate the association between SNPs and CHD risk. Under the allelic model, 6 SNPs of VEGFA and VEGFR2 were remarkably associated with the susceptibility to CHD. Genotype CT of rs3025039, TT of rs2305948, and AA of rs1870377 were associated with a reduced risk of CHD when smoking, alcohol intake and diabetes were considered, while homozygote GG of rs1570360 might elevate the susceptibility to CHD (all P < 0.05) for patients who were addicted to smoking or those with hypertension. All of the combined effects of rs699947 (CC/CA) and rs2305948 (TT), rs3025039 (TT) and rs2305948 (TT), rs3025039 (CT) and rs1870377 (AA) had positive effects on the risk of CHD, respectively (all P < 0.05). By contrast, the synthetic effects of rs699947 (CA/AA) and rs1870377 (TA), rs699947 (CA) and rs7667298 (GG), rs699947 (AA) and rs7667298 (GG), rs1570360 (GG) and rs2305948 (TT), as well as rs1570360 (GG) and rs1870377 (AA) all exhibited adverse effects on the risk of CHD, respectively (all P < 0.05). Six polymorphisms in VEGFA and VEGFR2 may have substantial influence on the susceptibility to CHD in a Han Chinese population. Prospective cohort studies should be further designed to confirm the above conclusions. PMID:27175642
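The odds ratios and 95% confidence intervals reported above can be sketched from a 2×2 table of carrier status by disease status. This is a minimal illustration that assumes the Woolf (log-scale) interval; the abstract does not say which interval method or covariate adjustment the authors used, and the function name and example counts are hypothetical.

```python
import math


def odds_ratio_ci(a: int, b: int, c: int, d: int, z: float = 1.96) -> tuple[float, float, float]:
    """Odds ratio with a Woolf-type confidence interval from a 2x2 table.

    a: cases carrying the allele/genotype      b: cases not carrying it
    c: controls carrying the allele/genotype   d: controls not carrying it
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for the log-scale interval")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under Woolf's approximation
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper
```

With the invented counts `odds_ratio_ci(20, 80, 10, 90)`, the odds ratio is 2.25 and the interval spans 1, so such a result would not be significant at the 5% level.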
Lerer, E; Levi, S; Salomon, S; Darvasi, A; Yirmiya, N; Ebstein, R P
2008-10-01
Evidence both from animal and human studies suggests that common polymorphisms in the oxytocin receptor (OXTR) gene are likely candidates to confer risk for autism spectrum disorders (ASD). In lower mammals, oxytocin is important in a wide range of social behaviors, and recent human studies have shown that administration of oxytocin modulates behavior in both clinical and non-clinical groups. Additionally, two linkage studies and two recent association investigations also underscore a possible role for the OXTR gene in predisposing to ASD. We undertook a comprehensive study of all 18 tagged SNPs across the entire OXTR gene region identified using HapMap data and the Haploview algorithm. Altogether 152 subjects diagnosed with ASDs (that is, DSM IV autistic disorder or pervasive developmental disorder--NOS) from 133 families were genotyped (parents and affected siblings). Both individual SNPs and haplotypes were tested for association using family-based association tests as provided in the UNPHASED set of programs. Significant association with single SNPs and haplotypes (global P-values <0.05, following permutation test adjustment) were observed with ASD. Association was also observed with IQ and the Vineland Adaptive Behavior Scales (VABS). In particular, a five-locus haplotype block (rs237897-rs13316193-rs237889-rs2254298-rs2268494) was significantly associated with ASD (nominal global P=0.000019; adjusted global P=0.009) and a single haplotype (carried by 7% of the population) within that block showed highly significant association (P=0.00005). This is the third association study, in a third ethnic group, showing that SNPs and haplotypes in the OXTR gene confer risk for ASD. The current investigation also shows association with IQ and total VABS scores (as well as the communication, daily living skills and socialization subdomains), suggesting that this gene shapes both cognition and daily living skills that may cross diagnostic boundaries.
2012-01-01
Introduction The largest genetic risk to develop rheumatoid arthritis (RA) arises from a group of alleles of the HLA DRB1 locus ('shared epitope', SE). Over 30 non-HLA single nucleotide polymorphisms (SNPs) predisposing to disease have been identified in Caucasians, but they have never been investigated in West/Central Africa. We previously reported a lower prevalence of the SE in RA patients in Cameroon compared to European patients and aimed in the present study to investigate the contribution of Caucasian non-HLA RA SNPs to disease susceptibility in Black Africans. Methods RA cases and controls from Cameroon were genotyped for Caucasian RA susceptibility SNPs using Sequenom MassArray technology. Genotype data were also available for 5024 UK cases and 4281 UK controls and for 119 Yoruba individuals in Ibadan, Nigeria (YRI, HapMap). A Caucasian aggregate genetic-risk score (GRS) was calculated as the sum of the weighted risk-allele counts. Results After genotyping quality control procedures were performed, data on 28 Caucasian non-HLA susceptibility SNPs were available in 43 Cameroonian RA cases and 44 controls. The minor allele frequencies (MAF) were tightly correlated between Cameroonian controls and YRI individuals (correlation coefficient 93.8%, p = 1.7E-13), and they were pooled together. There was no correlation between MAF of UK and African controls; 13 markers differed by more than 20%. The MAF for markers at PTPN22, IL2RA, FCGR2A and IL2/IL21 was below 2% in Africans. The GRS showed a strong association with RA in the UK. However, the GRS did not predict RA in Africans (OR = 0.71, 95% CI 0.29 - 1.74, p = 0.456). Random sampling from the UK cohort showed that this difference in association is unlikely to be explained by small sample size or chance, but is statistically significant with p<0.001. 
Conclusions The MAFs of non-HLA Caucasian RA susceptibility SNPs are different between Caucasians and Africans, and several polymorphisms are barely detectable in West/Central Africa. The genetic risk of developing RA conferred by a set of 28 Caucasian susceptibility SNPs is significantly different between the UK and Africa with p<0.001. Taken together, these observations strengthen the hypothesis that the genetic architecture of RA susceptibility is different in different ethnic backgrounds. PMID:23121884
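The aggregate genetic-risk score described above, a sum of weighted risk-allele counts, can be sketched as follows. The choice of the natural log of each SNP's published odds ratio as the weight is an assumption, a common convention that the abstract does not confirm, and the function name and inputs are hypothetical.

```python
import math


def genetic_risk_score(risk_allele_counts: list[int], odds_ratios: list[float]) -> float:
    """Weighted genetic-risk score for one individual.

    risk_allele_counts: number of risk alleles (0, 1 or 2) carried at each SNP
    odds_ratios: per-allele odds ratio for each SNP, used here as log-OR weights
                 (assumed weighting scheme, not taken from the abstract)
    """
    if len(risk_allele_counts) != len(odds_ratios):
        raise ValueError("counts and odds ratios must have the same length")
    if any(count not in (0, 1, 2) for count in risk_allele_counts):
        raise ValueError("risk-allele counts must be 0, 1 or 2")
    return sum(count * math.log(odds_ratio)
               for count, odds_ratio in zip(risk_allele_counts, odds_ratios))
```

An individual carrying one risk allele at a SNP with odds ratio 1.2 and none elsewhere would score `log(1.2)`; a score built this way is only as transferable as the odds ratios used as weights, which is the point the abstract makes about applying European-derived weights in African samples.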
DOE Office of Scientific and Technical Information (OSTI.GOV)
Shi, Jinna; Song, Tao; Jiao, Xiaohui
2011-07-15
Highlights: IRF6 rs642961 polymorphism is significantly associated with NSCLP. IRF6 rs2235371 polymorphism is not associated with NSCLP in the northern Chinese population. This investigation failed to yield any evidence for the involvement of TFAP2A polymorphisms in NSCLP in the northern Chinese population. Abstract: Non-syndromic cleft lip with or without cleft palate (NSCLP) is a common birth defect that is presumably caused by genetic factors alone or gene alterations in combination with environmental changes. A number of studies have shown an association between NSCLP and single-nucleotide polymorphisms (SNPs) in the interferon regulatory factor 6 (IRF6) gene in several populations. The transcription factor AP-2a (TFAP2A), which is involved in regulating mid-face development and upper lip fusion, has also been considered a candidate gene contributing to the etiology of NSCLP. The potential importance of IRF6 and TFAP2A in NSCLP is further highlighted by a study showing that the two molecules are in the same developmental pathway. To further assess the roles of IRF6 and TFAP2A in NSCLP, we investigated two identified IRF6 SNPs (rs2235371, rs642961) and three TFAP2A tag SNPs (rs3798691, rs1675414, rs303050) selected from HapMap data in a northern Chinese population, a group with a high prevalence of NSCLP. These SNPs were examined for association with NSCLP in 175 patients and 160 healthy controls. We observed a significant correlation between IRF6 rs642961 and NSCLP, and a lack of association between the IRF6 rs2235371 polymorphism and NSCLP in this population. This investigation indicated that there is no association between the three SNPs in TFAP2A and NSCLP, suggesting that TFAP2A may not be involved in the development of NSCLP in the northern Chinese population. Our study provides further evidence regarding the role of IRF6 variations in NSCLP development and finds no significant association between TFAP2A and NSCLP in this northern Chinese population.
Viatte, Sebastien; Flynn, Edward; Lunt, Mark; Barnes, Joanne; Singwe-Ngandeu, Madeleine; Bas, Sylvette; Barton, Anne; Gabay, Cem
2012-11-03
The largest genetic risk to develop rheumatoid arthritis (RA) arises from a group of alleles of the HLA DRB1 locus ('shared epitope', SE). Over 30 non-HLA single nucleotide polymorphisms (SNPs) predisposing to disease have been identified in Caucasians, but they have never been investigated in West/Central Africa. We previously reported a lower prevalence of the SE in RA patients in Cameroon compared to European patients and aimed in the present study to investigate the contribution of Caucasian non-HLA RA SNPs to disease susceptibility in Black Africans. RA cases and controls from Cameroon were genotyped for Caucasian RA susceptibility SNPs using Sequenom MassArray technology. Genotype data were also available for 5024 UK cases and 4281 UK controls and for 119 Yoruba individuals in Ibadan, Nigeria (YRI, HapMap). A Caucasian aggregate genetic-risk score (GRS) was calculated as the sum of the weighted risk-allele counts. After genotyping quality control procedures were performed, data on 28 Caucasian non-HLA susceptibility SNPs were available in 43 Cameroonian RA cases and 44 controls. The minor allele frequencies (MAF) were tightly correlated between Cameroonian controls and YRI individuals (correlation coefficient 93.8%, p = 1.7E-13), and they were pooled together. There was no correlation between MAF of UK and African controls; 13 markers differed by more than 20%. The MAF for markers at PTPN22, IL2RA, FCGR2A and IL2/IL21 was below 2% in Africans. The GRS showed a strong association with RA in the UK. However, the GRS did not predict RA in Africans (OR = 0.71, 95% CI 0.29 - 1.74, p = 0.456). Random sampling from the UK cohort showed that this difference in association is unlikely to be explained by small sample size or chance, but is statistically significant with p<0.001. The MAFs of non-HLA Caucasian RA susceptibility SNPs are different between Caucasians and Africans, and several polymorphisms are barely detectable in West/Central Africa. 
The genetic risk of developing RA conferred by a set of 28 Caucasian susceptibility SNPs is significantly different between the UK and Africa with p<0.001. Taken together, these observations strengthen the hypothesis that the genetic architecture of RA susceptibility is different in different ethnic backgrounds.
Asthma and genes encoding components of the vitamin D pathway
2009-01-01
Background Genetic variants at the vitamin D receptor (VDR) locus are associated with asthma and atopy. We hypothesized that polymorphisms in other genes of the vitamin D pathway are associated with asthma or atopy. Methods Eleven candidate genes were chosen for this study, five of which code for proteins in the vitamin D metabolism pathway (CYP27A1, CYP27B1, CYP2R1, CYP24A1, GC) and six that are known to be transcriptionally regulated by vitamin D (IL10, IL1RL1, CD28, CD86, IL8, SKIIP). For each gene, we selected a maximally informative set of common SNPs (tagSNPs) using the European-derived (CEU) HapMap dataset. A total of 87 SNPs were genotyped in a French-Canadian family sample ascertained through asthmatic probands (388 nuclear families, 1064 individuals) and evaluated using the Family Based Association Test (FBAT) program. We then sought to replicate the positive findings in four independent samples: two from Western Canada, one from Australia and one from the USA (CAMP). Results A number of SNPs in the IL10, CYP24A1, CYP2R1, IL1RL1 and CD86 genes were modestly associated with asthma and atopy (p < 0.05). Two-gene models testing for both main effects and the interaction were then performed using conditional logistic regression. Two-gene models implicating functional variants in the IL10 and VDR genes as well as in the IL10 and IL1RL1 genes were associated with asthma (p < 0.0002). In the replicate samples, SNPs in the IL10 and CYP24A1 genes were again modestly associated with asthma and atopy (p < 0.05). However, the SNPs or the orientation of the risk alleles were different between populations. A two-gene model involving IL10 and VDR was replicated in CAMP, but not in the other populations. Conclusion A number of genes involved in the vitamin D pathway demonstrate modest levels of association with asthma and atopy. 
Multilocus models testing genes in the same pathway are potentially more effective to evaluate the risk of asthma, but the effects are not uniform across populations. PMID:19852851
Prasad, Pushplata; Kumar, Ashok; Gupta, Rajiva; Juyal, Ramesh C.; B. K., Thelma
2012-01-01
Genome-wide association studies and meta-analysis indicate that several genes/loci are consistently associated with rheumatoid arthritis (RA) in European and Asian populations. To evaluate the transferability status of these findings to an ethnically diverse north Indian population, we performed a replication analysis. We investigated the association of 47 single-nucleotide polymorphisms (SNPs) at 43 of these genes/loci with RA in a north Indian cohort comprising 983 RA cases and 1007 age and gender matched controls. Genotyping was done using Infinium human 660w-quad. Association analysis by chi-square test implemented in plink was carried out in two steps. Firstly, association of the index or surrogate SNP (r² > 0.8, calculated from the reference GIH HapMap population) was tested. In the second step, evidence for allelic/locus heterogeneity at the aforementioned genes/loci was assessed by testing additional flanking SNPs in linkage equilibrium with the index/surrogate marker. Of the 44 European specific index SNPs, neither index nor surrogate SNPs were present for nine SNPs in the genotyping array. Of the remaining 35, associations were replicated at seven genes, namely PTPN22 (rs1217407, p = 3×10⁻³); IL2–21 (rs13119723, p = 0.008); HLA-DRB1 (rs660895, p = 2.56×10⁻⁵; rs6457617, p = 1.6×10⁻⁹; rs13192471, p = 6.7×10⁻¹⁶); TNFAIP3 (rs9321637, p = 0.03); CCL21 (rs13293020, p = 0.01); IL2RA (rs2104286, p = 1.9×10⁻⁴) and ZEB1 (rs2793108, p = 0.006). Of the three Asian specific loci tested, rs2977227 in PADI4 showed modest association (p < 0.02). Further, of the 140 SNPs (in LE with index/surrogate variant) tested, association was observed at 11 additional genes: PTPRC, AFF3, CD28, CTLA4, PXK, ANKRD55, TAGAP, CCR6, BLK, CD40 and IL2RB. This study indicates limited replication of European and Asian index SNPs and apparent allelic heterogeneity in RA etiology among north Indians, warranting independent GWAS in this population. 
However, replicated associations of HLA-DRB1, PTPN22 (which confer ∼50% of the heritable risk to RA) and IL2RA suggest that cross-ethnicity fine mapping of such loci is apposite for identification of causal variants. PMID:22355377
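The r² > 0.8 criterion used above to pick surrogate SNPs can be illustrated with the standard pairwise linkage disequilibrium formula. The sketch below assumes phased haplotype frequencies are available for two biallelic SNPs; it does not reproduce how plink or the HapMap reference data were actually used in the study, and the function name and inputs are hypothetical.

```python
def ld_r_squared(freq_ab: float, freq_a: float, freq_b: float) -> float:
    """Squared correlation (r^2) between alleles A and B at two biallelic SNPs.

    freq_ab: frequency of the haplotype carrying both allele A and allele B
    freq_a:  frequency of allele A at the first SNP
    freq_b:  frequency of allele B at the second SNP
    """
    if not (0.0 < freq_a < 1.0 and 0.0 < freq_b < 1.0):
        raise ValueError("both SNPs must be polymorphic (0 < frequency < 1)")
    # D measures the departure of the haplotype frequency from independence
    d = freq_ab - freq_a * freq_b
    return d * d / (freq_a * (1.0 - freq_a) * freq_b * (1.0 - freq_b))
```

Two SNPs whose alleles always travel together (`ld_r_squared(0.5, 0.5, 0.5)`) give r² = 1, while independent SNPs (`ld_r_squared(0.25, 0.5, 0.5)`) give r² = 0; a surrogate is accepted in the scheme above when this value exceeds 0.8 in the reference population.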
Genome Wide Association Study of Sepsis in Extremely Premature Infants
Srinivasan, Lakshmi; Page, Grier; Kirpalani, Haresh; Murray, Jeffrey C.; Das, Abhik; Higgins, Rosemary D.; Carlo, Waldemar A.; Bell, Edward F.; Goldberg, Ronald N.; Schibler, Kurt; Sood, Beena G.; Stevenson, David K.; Stoll, Barbara J.; Van Meurs, Krisa P.; Johnson, Karen J.; Levy, Joshua; McDonald, Scott A.; Zaterka-Baxter, Kristin M.; Kennedy, Kathleen A.; Sánchez, Pablo J.; Duara, Shahnaz; Walsh, Michele C.; Shankaran, Seetha; Wynn, James L.; Cotten, C. Michael
2017-01-01
Objective To identify genetic variants associated with sepsis (early and late-onset) using a genome wide association (GWA) analysis in a cohort of extremely premature infants. Study Design Previously generated GWA data from the Neonatal Research Network’s anonymized genomic database biorepository of extremely premature infants were used for this study. Sepsis was defined as culture-positive early-onset or late-onset sepsis or culture-proven meningitis. Genomic and whole genome amplified DNA was genotyped for 1.2 million single nucleotide polymorphisms (SNPs); 91% of SNPs were successfully genotyped. We imputed 7.2 million additional SNPs. P values and false discovery rates were calculated from multivariate logistic regression analysis adjusting for gender, gestational age and ancestry. Target statistical value was p < 10⁻⁵. Secondary analyses assessed associations of SNPs with pathogen type. Pathway analyses were also run on primary and secondary end points. Results Data from 757 extremely premature infants were included: 351 infants with sepsis and 406 infants without sepsis. No SNPs reached genome-wide significance levels (5×10⁻⁸); two SNPs in proximity to FOXC2 and FOXL1 genes achieved target levels of significance. In secondary analyses, SNPs for ELMO1, IRAK2 (Gram positive sepsis), RALA, IMMP2L (Gram negative sepsis) and PIEZO2 (fungal sepsis) met target significance levels. Pathways associated with sepsis and Gram negative sepsis included gap junctions, fibroblast growth factor receptors, regulators of cell division and Interleukin-1 associated receptor kinase 2 (p values < 0.001 and FDR < 20%). Conclusions No SNPs met genome-wide significance in this cohort of ELBW infants; however, areas of potential association and pathways meriting further study were identified. PMID:28283553
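The false discovery rates mentioned above can be sketched with the Benjamini-Hochberg step-up procedure. Using this particular procedure is an assumption, since the abstract does not name the FDR method, and the function name and example p-values are hypothetical.

```python
def benjamini_hochberg(p_values: list[float]) -> list[float]:
    """Return Benjamini-Hochberg adjusted p-values (q-values) in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each q-value is the minimum of
    # p * m / rank over its own rank and all higher ranks (enforces monotonicity)
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        q_values[index] = running_min
    return q_values
```

A result is then reported at, say, FDR < 20% when its q-value is below 0.2. In a genome-wide setting `p_values` would hold one entry per tested SNP or pathway.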
Performance of genotype imputation for low frequency and rare variants from the 1000 genomes.
Zheng, Hou-Feng; Rong, Jing-Jing; Liu, Ming; Han, Fang; Zhang, Xing-Wei; Richards, J Brent; Wang, Li
2015-01-01
Genotype imputation is now routinely applied in genome-wide association studies (GWAS) and meta-analyses. However, most imputations have been run using HapMap samples as the reference, and imputation of low frequency and rare variants (minor allele frequency (MAF) < 5%) has not been systematically assessed. With the emergence of next-generation sequencing, large reference panels (such as the 1000 Genomes panel) are available to facilitate imputation of these variants. Therefore, in order to estimate the performance of low frequency and rare variant imputation, we imputed 153 individuals, each of whom had 3 different genotype array data sets (317k, 610k and 1 million SNPs), to three different reference panels: the 1000 Genomes pilot March 2010 release (1KGpilot), the 1000 Genomes interim August 2010 release (1KGinterim), and the 1000 Genomes phase1 November 2010 and May 2011 release (1KGphase1), using IMPUTE version 2. The differences between these three releases of the 1000 Genomes data are the sample size, ancestry diversity, number of variants and their frequency spectrum. We found that both reference panel and GWAS chip density affect the imputation of low frequency and rare variants. 1KGphase1 outperformed the other 2 panels, with a higher concordance rate, a higher proportion of well-imputed variants (info > 0.4) and a higher mean info score in each MAF bin. Similarly, the 1M chip array outperformed 610K and 317K. However, for very rare variants (MAF ≤ 0.3%), only 0-1% of the variants were well imputed. We conclude that the imputation of low frequency and rare variants improves with larger reference panels and higher density of genome-wide genotyping arrays. Yet, despite a large reference panel size and dense genotyping density, very rare variants remain difficult to impute.
Blair, Lily M; Feldman, Marcus W
2015-07-14
Demography and environmental adaptation can affect the global distribution of genetic variants and possibly the distribution of disease. Population heterozygosity of single nucleotide polymorphisms has been shown to decrease strongly with distance from Africa and this has been attributed to the effect of serial founding events during the migration of humans out of Africa. Additionally, population allele frequencies have been shown to change due to environmental adaptation. Here, we investigate the relationship of Out-of-Africa migration and climatic variables to the distribution of risk alleles for 21 diseases. For each disease, we computed the regression of average heterozygosity and average allele frequency of the risk alleles with distance from Africa and 9 environmental variables. We compared these regressions to a null distribution created by regressing statistics for SNPs not associated with disease on distance from Africa and these environmental variables. Additionally, we used Bayenv 2.0 to assess the signal of environmental adaptation associated with individual risk SNPs. For those SNPs in HGDP and HapMap that are risk alleles for type 2 diabetes, we cannot reject that their distribution is as expected from Out-of-Africa migration. However, the allelic statistics for many other diseases correlate more closely with environmental variables than would be expected from the serial founder effect and show signals of environmental adaptation. We report strong environmental interactions with several autoimmune diseases, and note a particularly strong interaction between asthma and summer humidity. Additionally, we identified several risk genes with strong environmental associations. For most diseases, migration does not explain the distribution of risk alleles and the worldwide pattern of allele frequencies for some diseases may be better explained by environmental associations, which suggests that some selection has acted on these diseases.
Jia, Jing; Wei, Yi-Liang; Qin, Cui-Jiao; Hu, Lan; Wan, Li-Hua; Li, Cai-Xia
2014-01-01
Inferring the ancestral origin of DNA samples can be helpful in correcting population stratification in disease association studies or guiding crime investigations. Populations throughout the world vary in appearance features and biological characteristics. Based on this idea, we performed a genome-wide scan for SNPs within genes that are related to physical and biological traits. Using the HapMap database, we screened 52 genes and their flanking regions. Thirty-five SNPs that displayed highly contrasting allele frequencies (F_ST > 0.3, linkage disequilibrium r² < 0.2, and Hardy-Weinberg equilibrium P > 0.001) among Africans, Europeans, and East Asians were selected and validated. A multiplexed assay was developed to genotype these 35 SNPs in 357 individuals from 10 populations worldwide. This panel provided accurate estimates of individual ancestry proportions with balanced discriminatory power among the three continental ancestries: Africans, Europeans, and East Asians. It also proved very effective in evaluating admixed populations living in joint regions of continents (e.g., Uyghurs and Indians) and discriminating some subpopulations within each of the three continents. Structure analysis was performed to establish and evaluate the panel of ancestry-informative markers, and the components of each population were also described to indicate the structural composition. The 21 population structures in our study are consistent with geographic patterns, and individuals were properly assigned to their original ancestral populations with proportion analyses and random match probability calculations. Thus, the panel and its population information will be useful resources to minimize the effects of population stratification in association analyses and to assign the most likely origin of an unknown DNA contributor in forensic investigations.
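The F_ST > 0.3 filter used above to pick SNPs with contrasting allele frequencies can be illustrated with a simple heterozygosity-based estimate. The sketch assumes Nei's formulation, F_ST = (H_T − H_S) / H_T, with equal population weights; the abstract does not say which estimator the authors applied, and the function name and frequencies are hypothetical.

```python
def fst_biallelic(allele_freqs: list[float]) -> float:
    """F_ST for one biallelic SNP from per-population allele frequencies.

    allele_freqs: frequency of the same allele in each population, weighted equally
    (an assumption; sample-size weighting or other estimators would differ).
    """
    if not allele_freqs:
        raise ValueError("at least one population is required")
    n = len(allele_freqs)
    mean_freq = sum(allele_freqs) / n
    # Expected heterozygosity of the pooled population
    h_total = 2.0 * mean_freq * (1.0 - mean_freq)
    # Mean expected heterozygosity within populations
    h_within = sum(2.0 * p * (1.0 - p) for p in allele_freqs) / n
    if h_total == 0.0:
        return 0.0  # the SNP is monomorphic everywhere, so there is no differentiation
    return (h_total - h_within) / h_total
```

With invented frequencies of 0.1, 0.5 and 0.9 in three populations, `fst_biallelic([0.1, 0.5, 0.9])` is about 0.43, which would pass a 0.3 threshold, whereas identical frequencies in every population give 0.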
Maternal and offspring genetic variants of AKR1C3 and the risk of childhood leukemia
Liu, Chen-yu; Hsu, Yi-Hsiang; Pan, Pi-Chen; Wu, Ming-Tsang; Ho, Chi-Kung; Su, Li; Xu, Xin; Li, Yi; Christiani, David C.
2008-01-01
The aldo-keto reductase 1C3 (AKR1C3) gene located on chromosome 10p15-p14, a regulator of myeloid cell proliferation and differentiation, represents an important candidate gene for studying human carcinogenesis. In a prospectively enrolled population-based case–control study of Han Chinese conducted in Kaohsiung in southern Taiwan, a total of 114 leukemia cases and 221 controls <20 years old were recruited between November 1997 and December 2005. The present study set out to evaluate the association between childhood leukemia and both maternal and offspring genotypes. To do so, we conducted a systematic assessment of common single-nucleotide polymorphisms (SNPs) from the 5′ flanking 10 kb to the 3′ UTR of the AKR1C3 gene. Gln5His, three tagSNPs (rs2245191, rs10508293 and rs3209896) and one multimarker (rs2245191, rs10508293 and rs3209896) were selected with an average of 90% coverage of untagged SNPs by using the HapMap II data set. Odds ratios and 95% confidence intervals were adjusted for age and gender. After correcting for multiple comparisons, we observed that the risk of developing childhood leukemia is significantly associated with the rs10508293 polymorphism in intron 4 of the AKR1C3 gene, both in offspring alone and in the combined maternal and offspring genotypes (nominal P < 0.0001, permutation P < 0.005). The maternal methylenetetrahydrofolate reductase A1298C polymorphism was found to be an effect modifier of the association between the maternal intron 4 polymorphism of the AKR1C3 gene (rs10508293) and childhood leukemia risk. In conclusion, this study suggests that AKR1C3 polymorphisms may be important predictive markers for childhood leukemia susceptibility. PMID:18339682
Elfassihi, Latifa; Giroux, Sylvie; Bureau, Alexandre; Laflamme, Nathalie; Cole, David Ec; Rousseau, François
2010-04-01
Osteoporosis is a bone disease characterized by low bone mineral density (BMD), a highly heritable polygenic trait. Women are more prone than men to develop osteoporosis owing to a lower peak bone mass and accelerated bone loss at menopause. Lack of estrogen thus is a major risk factor for osteoporosis. In addition to having strong similarity to the estrogen receptor 1 (ESR1), the orphan nuclear estrogen-related receptor gamma (ESRRgamma) is widely expressed and shows overlap with ESR1 expression in tissues where estrogen has important physiologic functions. For these reasons, we have undertaken a study of ESRRgamma sequence variants in association with bone measurements [heel quantitative ultrasound (QUS) by measurements of broadband ultrasound attenuation (BUA), speed of sound (SOS), and stiffness index (SI) and dual-energy X-ray absorptiometry (DXA) at the femoral neck (FN) and lumbar spine (LS)]. A silent variant was found to be associated with multiple bone measurements (LS, BUA, SOS, and SI), the p values ranging from .006 to .04 in a sample of 5144 Quebec women. The region of this variant was analyzed using the HapMap database and the Gabriel method to define a block of 20 kb. Using the Tagger method, eight TagSNPs were identified and genotyped in a sample of 1335 women. Four of these SNPs capture the five major block haplotypes. One SNP (rs2818964) and one haplotype were significantly associated with multiple bone measures. All SNPs involved in the associations were analyzed in two other sample sets with significant results in the same direction. These results suggest involvement of ESRRgamma in the determination of bone density in women. Copyright 2010 American Society for Bone and Mineral Research.
Nucleotide, cytogenetic and expression impact of the human chromosome 8p23.1 inversion polymorphism.
Bosch, Nina; Morell, Marta; Ponsa, Immaculada; Mercader, Josep Maria; Armengol, Lluís; Estivill, Xavier
2009-12-14
The human chromosome 8p23.1 region contains a 3.8-4.5 Mb segment which can be found in different orientations (defined as genomic inversion) among individuals. The identification of single nucleotide polymorphisms (SNPs) tightly linked to the genomic orientation of a given region should be useful to indirectly evaluate the genotypes of large genomic orientations in the individuals. We have identified 16 SNPs, which are in linkage disequilibrium (LD) with the 8p23.1 inversion as detected by fluorescent in situ hybridization (FISH). The variability of the 8p23.1 orientation in 150 HapMap samples was predicted using this set of SNPs and was verified by FISH in a subset of samples. Four genes (NEIL2, MSRA, CTSB and BLK) were found differentially expressed (p<0.0005) according to the orientation of the 8p23.1 region. Finally, we have found variable levels of mosaicism for the orientation of the 8p23.1 as determined by FISH. By means of dense SNP genotyping of the region, haplotype-based computational analyses and FISH experiments we could infer and verify the orientation status of alleles in the 8p23.1 region by detecting two short haplotype stretches at both ends of the inverted region, which are likely the relic of the chromosome in which the original inversion occurred. Moreover, an impact of 8p23.1 inversion on gene expression levels cannot be ruled out, since four genes from this region have statistically significant different expression levels depending on the inversion status. FISH results in lymphoblastoid cell lines suggest the presence of mosaicism regarding the 8p23.1 inversion.
LS-SNP: large-scale annotation of coding non-synonymous SNPs based on multiple information sources.
Karchin, Rachel; Diekhans, Mark; Kelly, Libusha; Thomas, Daryl J; Pieper, Ursula; Eswar, Narayanan; Haussler, David; Sali, Andrej
2005-06-15
The NCBI dbSNP database lists over 9 million single nucleotide polymorphisms (SNPs) in the human genome, but currently contains limited annotation information. SNPs that result in amino acid residue changes (nsSNPs) are of critical importance in variation between individuals, including disease and drug sensitivity. We have developed LS-SNP, a genomic scale software pipeline to annotate nsSNPs. LS-SNP comprehensively maps nsSNPs onto protein sequences, functional pathways and comparative protein structure models, and predicts positions where nsSNPs destabilize proteins, interfere with the formation of domain-domain interfaces, have an effect on protein-ligand binding or severely impact human health. It currently annotates 28,043 validated SNPs that produce amino acid residue substitutions in human proteins from the SwissProt/TrEMBL database. Annotations can be viewed via a web interface either in the context of a genomic region or by selecting sets of SNPs, genes, proteins or pathways. These results are useful for identifying candidate functional SNPs within a gene, haplotype or pathway and in probing molecular mechanisms responsible for functional impacts of nsSNPs. Availability: http://www.salilab.org/LS-SNP. Contact: rachelk@salilab.org. Supplementary information: http://salilab.org/LS-SNP/supp-info.pdf.
Characterization of six human disease-associated inversion polymorphisms.
Antonacci, Francesca; Kidd, Jeffrey M; Marques-Bonet, Tomas; Ventura, Mario; Siswara, Priscillia; Jiang, Zhaoshi; Eichler, Evan E
2009-07-15
The human genome is a highly dynamic structure that shows a wide range of genetic polymorphic variation. Unlike other types of structural variation, little is known about inversion variants within normal individuals because such events are typically balanced and are difficult to detect and analyze by standard molecular approaches. Using sequence-based, cytogenetic and genotyping approaches, we characterized six large inversion polymorphisms that map to regions associated with genomic disorders with complex segmental duplications mapping at the breakpoints. We developed a metaphase FISH-based assay to genotype inversions and analyzed the chromosomes of 27 individuals from three HapMap populations. In this subset, we find that these inversions are less frequent or absent in Asians when compared with European and Yoruban populations. Analyzing multiple individuals from outgroup species of great apes, we show that most of these large inversion polymorphisms are specific to the human lineage with two exceptions, 17q21.31 and 8p23 inversions, which are found to be similarly polymorphic in other great ape species and where the inverted allele represents the ancestral state. Investigating linkage disequilibrium relationships with genotyped SNPs, we provide evidence that most of these inversions appear to have arisen on at least two different haplotype backgrounds. In these cases, discovery and genotyping methods based on SNPs may be confounded and molecular cytogenetics remains the only method to genotype these inversions.
Type 2 diabetes mellitus: distribution of genetic markers in Kazakh population.
Sikhayeva, Nurgul; Talzhanov, Yerkebulan; Iskakova, Aisha; Dzharmukhanov, Jarkyn; Nugmanova, Raushan; Zholdybaeva, Elena; Ramanculov, Erlan
2018-01-01
Ethnic differences exist in the frequencies of genetic variations that contribute to the risk of common disease. This study aimed to analyse the distribution of several genes, previously associated with susceptibility to type 2 diabetes and obesity-related phenotypes, in a Kazakh population. A total of 966 individuals belonging to the Kazakh ethnicity were recruited from an outpatient clinic. We genotyped 41 common single nucleotide polymorphisms (SNPs) previously associated with type 2 diabetes in other ethnic groups and 31 of these were in Hardy-Weinberg equilibrium. The obtained allele frequencies were further compared to publicly available data from other ethnic populations. Allele frequencies for other (compared) populations were pooled from the haplotype map (HapMap) database. Principal component analysis (PCA), cluster analysis, and multidimensional scaling (MDS) were used for the analysis of genetic relationship between the populations. Comparative analysis of allele frequencies of the studied SNPs showed significant differentiation among the studied populations. The Kazakh population was grouped with Asian populations according to the cluster analysis and with the Caucasian populations according to PCA. According to MDS, results of the current study show that the Kazakh population holds an intermediate position between Caucasian and Asian populations. A high percentage of population differentiation was observed between Kazakh and world populations. The Kazakh population was clustered with Caucasian populations, and this result may indicate a significant Caucasian component in the Kazakh gene pool.
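The Hardy-Weinberg check mentioned in the abstract above can be sketched as follows. This is a minimal illustration that assumes a one-degree-of-freedom chi-square goodness-of-fit test on genotype counts; the abstract does not state which HWE test was applied, and the function name `hwe_chi_square` is hypothetical.

```python
from math import erfc, sqrt


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test of Hardy-Weinberg proportions for one biallelic SNP.

    Returns (chi2, p_value). The p-value uses the identity that, for one
    degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes supplied")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1 - p
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_aa, n_ab, n_bb)
    # Skip classes with zero expectation (monomorphic SNPs).
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    return chi2, erfc(sqrt(chi2 / 2))
```

Genotype counts that match the expected proportions exactly give a statistic of zero, whereas a complete absence of heterozygotes at intermediate allele frequency produces a very small p-value.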
Vitamin D receptor gene methylation is associated with ethnicity, tuberculosis and TaqI polymorphism
Andraos, Charlene; Koorsen, Gerrit; Knight, Julian C; Bornman, Liza
2014-01-01
The Vitamin D Receptor (VDR) gene encodes a transcription factor which, on activation by vitamin D, modulates diverse biological processes including calcium homeostasis and immune function. Genetic variation involving VDR shows striking differences in allele frequency between populations and has been associated with disease susceptibility including tuberculosis and autoimmunity, although results have often been conflicting. We hypothesized that methylation of VDR may be population specific and that the combination of differential methylation and genetic variation may characterise TB predisposition. We use bisulphite conversion and/or pyrosequencing to analyse the methylation status of 17 CpGs of VDR and to genotype 7 SNPs in the 3′ CpG Island (CGI 1060), including the commonly studied SNPs ApaI (rs7975232) and TaqI (rs731236). We show that for lymphoblastoid cell lines from two ethnically diverse populations (Yoruba from HapMap, n=30 and Caucasians, n=30) together with TB cases (n=32) and controls (n=29) from the Venda population of South Africa there are methylation variable positions (MVPs) in the 3′ end that significantly distinguish ethnicity (9/17 CpGs) and TB status (3/17 CpGs). Moreover methylation status shows complex association with TaqI genotype highlighting the need to consider both genetic and epigenetic variants in genetic studies of VDR association with disease. PMID:21168462
Identification of genomic regions contributing to etoposide-induced cytotoxicity
Bleibel, Wasim K.; Duan, Shiwei; Huang, R. Stephanie; Kistner, Emily O.; Shukla, Sunita J.; Wu, Xiaolin; Badner, Judith A.
2009-01-01
Etoposide is routinely used in combination based chemotherapy for testicular cancer and small-cell lung cancer; however, myelosuppression, therapy-related leukemia and neurotoxicity limit its utility. To determine the genetic contribution to cellular sensitivity to etoposide, we evaluated cell growth inhibition in Centre d'Etude du Polymorphisme Humain lymphoblastoid cell lines from 24 multi-generational pedigrees (321 samples) following treatment with 0.02–2.5 µM etoposide for 72 h. Heritability analysis showed that genetic variation contributes significantly to the cytotoxic phenotypes (h² = 0.17–0.25, P = 4.9 × 10⁻⁵ to 7.3 × 10⁻³). Whole genome linkage scans uncovered 8 regions with peak LOD scores ranging from 1.57 to 2.55, with the most significant signals being found on chromosome 5 (LOD = 2.55) and chromosome 6 (LOD = 2.52). Linkage-directed association was performed on a subset of HapMap samples within the pedigrees to find 22 SNPs significantly associated with etoposide cytotoxicity at one or more treatment concentrations. UVRAG, a DNA repair gene, SEMA5A, SLC7A6 and PRMT7 are implicated from these unbiased studies. Our findings suggest that susceptibility to etoposide-induced cytotoxicity is heritable and using an integrated genomics approach we identified both genomic regions and SNPs associated with the cytotoxic phenotypes. PMID:19089452
LDSplitDB: a database for studies of meiotic recombination hotspots in MHC using human genomic data.
Guo, Jing; Chen, Hao; Yang, Peng; Lee, Yew Ti; Wu, Min; Przytycka, Teresa M; Kwoh, Chee Keong; Zheng, Jie
2018-04-20
Meiotic recombination happens during meiosis, when chromosomes inherited from two parents exchange genetic material to generate the chromosomes of the gamete cells. Recombination events tend to occur in narrow genomic regions called recombination hotspots. Dysregulation of recombination could lead to serious human diseases such as birth defects. Although the regulatory mechanism of recombination events is still unclear, DNA sequence polymorphisms have been found to play crucial roles in the regulation of recombination hotspots. To facilitate the studies of the underlying mechanism, we developed a database named LDSplitDB which provides an integrative and interactive data mining and visualization platform for the genome-wide association studies of recombination hotspots. It contains the pre-computed association maps of the major histocompatibility complex (MHC) region in the 1000 Genomes Project and the HapMap Phase III datasets, and a genome-scale study of the European population from the HapMap Phase II dataset. Besides the recombination profiles, related data of genes, SNPs and different types of epigenetic modifications, which could be associated with meiotic recombination, are provided for comprehensive analysis. To meet the computational requirements of the rapidly increasing population genomics data, we prepared a lookup table of 400 haplotypes for recombination rate estimation using the well-known LDhat algorithm which includes all possible two-locus haplotype configurations. To the best of our knowledge, LDSplitDB is the first large-scale database for the association analysis of human recombination hotspots with DNA sequence polymorphisms. It provides valuable resources for the discovery of the mechanism of meiotic recombination hotspots. The information about MHC in this database could help understand the roles of recombination in the human immune system. DATABASE URL: http://histone.scse.ntu.edu.sg/LDSplitDB.
2012-01-01
Background Identification of genomic regions that have been targets of selection for phenotypic traits is one of the most important and challenging areas of research in animal genetics. However, currently there are relatively few genomic regions identified that have been subject to positive selection. In this study, a genome-wide scan using ~50,000 Single Nucleotide Polymorphisms (SNPs) was performed in an attempt to identify genomic regions associated with fat deposition in fat-tail breeds. This trait and its modification are very important in those countries grazing these breeds. Results Two independent experiments using either Iranian or Ovine HapMap genotyping data contrasted thin and fat tail breeds. Population differentiation using FST in Iranian thin and fat tail breeds revealed seven genomic regions. Almost all of these regions overlapped with QTLs that had previously been identified as affecting fat and carcass yield traits in beef and dairy cattle. Study of selection sweep signatures using FST in thin and fat tail breeds sampled from the Ovine HapMap project confirmed three of these regions located on Chromosomes 5, 7 and X. We found increased homozygosity in these regions in favour of fat tail breeds on chromosome 5 and X and in favour of thin tail breeds on chromosome 7. Conclusions In this study, we were able to identify three novel regions associated with fat deposition in thin and fat tail sheep breeds. Two of these were associated with an increase of homozygosity in the fat tail breeds which would be consistent with selection for mutations affecting fat tail size several thousand years after domestication. PMID:22364287
An unusual haplotype structure on human chromosome 8p23 derived from the inversion polymorphism.
Deng, Libin; Zhang, Yuezheng; Kang, Jian; Liu, Tao; Zhao, Hongbin; Gao, Yang; Li, Chaohua; Pan, Hao; Tang, Xiaoli; Wang, Dunmei; Niu, Tianhua; Yang, Huanming; Zeng, Changqing
2008-10-01
Chromosomal inversion is an important type of genomic variation involved in both evolution and disease pathogenesis. Here, we describe the refined genetic structure of a 3.8-Mb inversion polymorphism at chromosome 8p23. Using HapMap data of 1,073 SNPs generated from 209 unrelated samples (CEPH-Utah residents with ancestry from northern and western Europe (CEU); Yoruba in Ibadan, Nigeria (YRI); and Asian (ASN) samples, which comprised Han Chinese from Beijing, China (CHB) and Japanese from Tokyo, Japan (JPT)), we successfully deduced the inversion orientations of all their 418 haplotypes. In particular, distinct haplotype subgroups were identified based on principal component analysis (PCA). Such genetic substructures were consistent with clustering patterns based on neighbor-joining tree reconstruction, which revealed a total of four haplotype clades across all samples. Metaphase fluorescence in situ hybridization (FISH) in a subset of 10 HapMap samples verified their inversion orientations predicted by PCA or phylogenetic tree reconstruction. Positioning of the outgroup haplotype within one of the YRI clades suggested that the Human NCBI Build 36-inverted order is most likely the ancestral orientation. Furthermore, the population differentiation test and the relative extended haplotype homozygosity (REHH) analysis in this region discovered multiple selection signals, also in a population-specific manner. A positive selection signal was detected at XKR6 in the ASN population. These results revealed the correlation of inversion polymorphisms to population-specific genetic structures, and various selection patterns as possible mechanisms for the maintenance of a large chromosomal rearrangement at the 8p23 region during evolution. In addition, our study also showed that haplotype-based clustering methods, such as PCA, can be applied in scanning for cryptic inversion polymorphisms at a genome-wide scale.
Genome-wide detection and characterization of positive selection in human populations.
Sabeti, Pardis C; Varilly, Patrick; Fry, Ben; Lohmueller, Jason; Hostetter, Elizabeth; Cotsapas, Chris; Xie, Xiaohui; Byrne, Elizabeth H; McCarroll, Steven A; Gaudet, Rachelle; Schaffner, Stephen F; Lander, Eric S; Frazer, Kelly A; Ballinger, Dennis G; Cox, David R; Hinds, David A; Stuve, Laura L; Gibbs, Richard A; Belmont, John W; Boudreau, Andrew; Hardenbol, Paul; Leal, Suzanne M; Pasternak, Shiran; Wheeler, David A; Willis, Thomas D; Yu, Fuli; Yang, Huanming; Zeng, Changqing; Gao, Yang; Hu, Haoran; Hu, Weitao; Li, Chaohua; Lin, Wei; Liu, Siqi; Pan, Hao; Tang, Xiaoli; Wang, Jian; Wang, Wei; Yu, Jun; Zhang, Bo; Zhang, Qingrun; Zhao, Hongbin; Zhao, Hui; Zhou, Jun; Gabriel, Stacey B; Barry, Rachel; Blumenstiel, Brendan; Camargo, Amy; Defelice, Matthew; Faggart, Maura; Goyette, Mary; Gupta, Supriya; Moore, Jamie; Nguyen, Huy; Onofrio, Robert C; Parkin, Melissa; Roy, Jessica; Stahl, Erich; Winchester, Ellen; Ziaugra, Liuda; Altshuler, David; Shen, Yan; Yao, Zhijian; Huang, Wei; Chu, Xun; He, Yungang; Jin, Li; Liu, Yangfan; Shen, Yayun; Sun, Weiwei; Wang, Haifeng; Wang, Yi; Wang, Ying; Xiong, Xiaoyan; Xu, Liang; Waye, Mary M Y; Tsui, Stephen K W; Xue, Hong; Wong, J Tze-Fei; Galver, Luana M; Fan, Jian-Bing; Gunderson, Kevin; Murray, Sarah S; Oliphant, Arnold R; Chee, Mark S; Montpetit, Alexandre; Chagnon, Fanny; Ferretti, Vincent; Leboeuf, Martin; Olivier, Jean-François; Phillips, Michael S; Roumy, Stéphanie; Sallée, Clémentine; Verner, Andrei; Hudson, Thomas J; Kwok, Pui-Yan; Cai, Dongmei; Koboldt, Daniel C; Miller, Raymond D; Pawlikowska, Ludmila; Taillon-Miller, Patricia; Xiao, Ming; Tsui, Lap-Chee; Mak, William; Song, You Qiang; Tam, Paul K H; Nakamura, Yusuke; Kawaguchi, Takahisa; Kitamoto, Takuya; Morizono, Takashi; Nagashima, Atsushi; Ohnishi, Yozo; Sekine, Akihiro; Tanaka, Toshihiro; Tsunoda, Tatsuhiko; Deloukas, Panos; Bird, Christine P; Delgado, Marcos; Dermitzakis, Emmanouil T; Gwilliam, Rhian; Hunt, Sarah; Morrison, Jonathan; Powell, Don; Stranger, Barbara 
E; Whittaker, Pamela; Bentley, David R; Daly, Mark J; de Bakker, Paul I W; Barrett, Jeff; Chretien, Yves R; Maller, Julian; McCarroll, Steve; Patterson, Nick; Pe'er, Itsik; Price, Alkes; Purcell, Shaun; Richter, Daniel J; Sabeti, Pardis; Saxena, Richa; Schaffner, Stephen F; Sham, Pak C; Varilly, Patrick; Altshuler, David; Stein, Lincoln D; Krishnan, Lalitha; Smith, Albert Vernon; Tello-Ruiz, Marcela K; Thorisson, Gudmundur A; Chakravarti, Aravinda; Chen, Peter E; Cutler, David J; Kashuk, Carl S; Lin, Shin; Abecasis, Gonçalo R; Guan, Weihua; Li, Yun; Munro, Heather M; Qin, Zhaohui Steve; Thomas, Daryl J; McVean, Gilean; Auton, Adam; Bottolo, Leonardo; Cardin, Niall; Eyheramendy, Susana; Freeman, Colin; Marchini, Jonathan; Myers, Simon; Spencer, Chris; Stephens, Matthew; Donnelly, Peter; Cardon, Lon R; Clarke, Geraldine; Evans, David M; Morris, Andrew P; Weir, Bruce S; Tsunoda, Tatsuhiko; Johnson, Todd A; Mullikin, James C; Sherry, Stephen T; Feolo, Michael; Skol, Andrew; Zhang, Houcan; Zeng, Changqing; Zhao, Hui; Matsuda, Ichiro; Fukushima, Yoshimitsu; Macer, Darryl R; Suda, Eiko; Rotimi, Charles N; Adebamowo, Clement A; Ajayi, Ike; Aniagwu, Toyin; Marshall, Patricia A; Nkwodimmah, Chibuzor; Royal, Charmaine D M; Leppert, Mark F; Dixon, Missy; Peiffer, Andy; Qiu, Renzong; Kent, Alastair; Kato, Kazuto; Niikawa, Norio; Adewole, Isaac F; Knoppers, Bartha M; Foster, Morris W; Clayton, Ellen Wright; Watkin, Jessica; Gibbs, Richard A; Belmont, John W; Muzny, Donna; Nazareth, Lynne; Sodergren, Erica; Weinstock, George M; Wheeler, David A; Yakub, Imtaz; Gabriel, Stacey B; Onofrio, Robert C; Richter, Daniel J; Ziaugra, Liuda; Birren, Bruce W; Daly, Mark J; Altshuler, David; Wilson, Richard K; Fulton, Lucinda L; Rogers, Jane; Burton, John; Carter, Nigel P; Clee, Christopher M; Griffiths, Mark; Jones, Matthew C; McLay, Kirsten; Plumb, Robert W; Ross, Mark T; Sims, Sarah K; Willey, David L; Chen, Zhu; Han, Hua; Kang, Le; Godbout, Martin; Wallenburg, John C; L'Archevêque, Paul; 
Bellemare, Guy; Saeki, Koji; Wang, Hongguang; An, Daochang; Fu, Hongbo; Li, Qing; Wang, Zhen; Wang, Renwu; Holden, Arthur L; Brooks, Lisa D; McEwen, Jean E; Guyer, Mark S; Wang, Vivian Ota; Peterson, Jane L; Shi, Michael; Spiegel, Jack; Sung, Lawrence M; Zacharia, Lynn F; Collins, Francis S; Kennedy, Karen; Jamieson, Ruth; Stewart, John
2007-10-18
With the advent of dense maps of human genetic variation, it is now possible to detect positive natural selection across the human genome. Here we report an analysis of over 3 million polymorphisms from the International HapMap Project Phase 2 (HapMap2). We used 'long-range haplotype' methods, which were developed to identify alleles segregating in a population that have undergone recent selection, and we also developed new methods that are based on cross-population comparisons to discover alleles that have swept to near-fixation within a population. The analysis reveals more than 300 strong candidate regions. Focusing on the strongest 22 regions, we develop a heuristic for scrutinizing these regions to identify candidate targets of selection. In a complementary analysis, we identify 26 non-synonymous, coding, single nucleotide polymorphisms showing regional evidence of positive selection. Examination of these candidates highlights three cases in which two genes in a common biological process have apparently undergone positive selection in the same population: LARGE and DMD, both related to infection by the Lassa virus, in West Africa; SLC24A5 and SLC45A2, both involved in skin pigmentation, in Europe; and EDAR and EDA2R, both involved in development of hair follicles, in Asia.
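Long-range haplotype methods of the kind mentioned in the abstract above are commonly built on extended haplotype homozygosity (EHH): the probability that two randomly chosen chromosomes carrying the same core allele are identical at every marker between the core and a given distance. The sketch below computes that quantity for toy phased haplotypes encoded as strings; the encoding, the function name `ehh`, and the data are illustrative assumptions, not the paper's pipeline.

```python
from collections import Counter
from math import comb


def ehh(haplotypes, core, end):
    """Extended haplotype homozygosity between marker indices core and end.

    haplotypes: equal-length strings, all carrying the same core allele.
    Returns the fraction of haplotype pairs identical over the interval.
    """
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two haplotypes")
    lo, hi = sorted((core, end))
    # Group chromosomes by their allele pattern over the interval.
    groups = Counter(h[lo:hi + 1] for h in haplotypes)
    return sum(comb(c, 2) for c in groups.values()) / comb(n, 2)


# Toy data: four chromosomes sharing allele '0' at the core (index 0).
toy = ["0101", "0101", "0111", "0100"]
decay = [ehh(toy, 0, end) for end in range(4)]
```

EHH starts at 1 at the core and decays as markers further away split the carriers into distinct haplotypes; an allele whose EHH stays unusually high over long distances, given its frequency, is a candidate for recent selection.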
Guo, Xiaosen; Brenner, Max; Zhang, Xuemei; Laragione, Teresina; Tai, Shuaishuai; Li, Yanhong; Bu, Junjie; Yin, Ye; Shah, Anish A.; Kwan, Kevin; Li, Yingrui; Jun, Wang; Gulko, Pércio S.
2013-01-01
DA (D-blood group of Palm and Agouti, also known as Dark Agouti) and F344 (Fischer) are two inbred rat strains with differences in several phenotypes, including susceptibility to autoimmune disease models and inflammatory responses. While these strains have been extensively studied, little information is available about the DA and F344 genomes, as only the Brown Norway (BN) and spontaneously hypertensive rat strains have been sequenced to date. Here we report the sequencing of the DA and F344 genomes using next-generation Illumina paired-end read technology and the first de novo assembly of a rat genome. DA and F344 were sequenced with an average depth of 32-fold, covered 98.9% of the BN reference genome, and included 97.97% of known rat ESTs. New sequences could be assigned to 59 million positions with previously unknown data in the BN reference genome. Differences between DA, F344, and BN included 19 million positions in novel scaffolds, 4.09 million single nucleotide polymorphisms (SNPs) (including 1.37 million new SNPs), 458,224 short insertions and deletions, and 58,174 structural variants. Genetic differences between DA, F344, and BN, including high-impact SNPs and short insertions and deletions affecting >2500 genes, are likely to account for most of the phenotypic variation between these strains. The new DA and F344 genome sequencing data should facilitate gene discovery efforts in rat models of human disease. PMID:23695301
Association of NCOA3 polymorphisms with Dyslipidemia in the Chinese Han population.
Yu, Mingxi; Gilbert, Siame; Li, Yong; Zhang, Huiping; Qiao, Yichun; Lu, Yuping; Tang, Yuan; Zhen, Qing; Cheng, Yi; Liu, Yawen
2015-10-09
Nuclear receptor coactivator-3 (NCOA3) is involved in various physiological processes. Emerging evidence from previous studies using animal models suggests that the NCOA3 gene (NCOA3) plays a critical role in lipid metabolism as well as adipogenesis and obesity. The present study aims to investigate the association between NCOA3 SNPs and dyslipidemia in the Chinese Han population. Five hundred and twenty-nine (529) Chinese Han subjects were recruited. Four tag SNPs (rs2425955G > T, rs6066394T > C, rs10485463C > G, and rs6094753G > A) in NCOA3, selected from the HapMap website, were genotyped using MALDI-TOF mass spectrometry. Data analysis was performed using SPSS 16.0, SNPStats and haploview 4.2. Four SNPs (rs2425955, rs6066394, rs10485463, and rs6094753) were associated with triglyceride levels. Except for SNP rs10485463, genotype distributions and allele frequencies of the other three NCOA3 SNPs (rs2425955, rs6066394, and rs6094753) were significantly different between hypertriglyceridemia subjects and normal group. Significant differences were also observed in allele frequencies and genotype distributions of SNP rs10485463 between low-HDL cholesterolemia subjects and normal group. Carriers of rs2425955 T allele had a lower risk of hypertriglyceridemia compared to GG genotype. Similar results were observed from rs6094753. Subjects with rs6066394 CT genotype had a lower risk of hypertriglyceridemia than those with the TT genotype; however, CC and TT genotypes showed no significant difference in the risk of hypertriglyceridemia. Similar results were found in the association between rs6066394 and hypercholesterolemia. The variant alleles of rs2425955, rs6066394 and rs6094753 were associated with a lower risk of hypertriglyceridemia compared with the wild-type alleles. The G allele of rs10485463 was associated with an increased risk of low-HDL cholesterolemia. 
In the log-additive model, the association between rs2425955 and hypertriglyceridemia remained significant after Bonferroni correction, and genotypes with variant alleles were associated with a lower risk of hypertriglyceridemia. In summary, this study demonstrated that variation in NCOA3 might influence the risk of dyslipidemia and serum lipid levels in the Chinese Han population.
Type 2 diabetes mellitus: distribution of genetic markers in Kazakh population
Sikhayeva, Nurgul; Talzhanov, Yerkebulan; Iskakova, Aisha; Dzharmukhanov, Jarkyn; Nugmanova, Raushan; Zholdybaeva, Elena; Ramanculov, Erlan
2018-01-01
Background Ethnic differences exist in the frequencies of genetic variations that contribute to the risk of common disease. This study aimed to analyse the distribution of several genes, previously associated with susceptibility to type 2 diabetes and obesity-related phenotypes, in a Kazakh population. Methods A total of 966 individuals belonging to the Kazakh ethnicity were recruited from an outpatient clinic. We genotyped 41 common single nucleotide polymorphisms (SNPs) previously associated with type 2 diabetes in other ethnic groups and 31 of these were in Hardy–Weinberg equilibrium. The obtained allele frequencies were further compared to publicly available data from other ethnic populations. Allele frequencies for other (compared) populations were pooled from the haplotype map (HapMap) database. Principal component analysis (PCA), cluster analysis, and multidimensional scaling (MDS) were used for the analysis of genetic relationship between the populations. Results Comparative analysis of allele frequencies of the studied SNPs showed significant differentiation among the studied populations. The Kazakh population was grouped with Asian populations according to the cluster analysis and with the Caucasian populations according to PCA. According to MDS, results of the current study show that the Kazakh population holds an intermediate position between Caucasian and Asian populations. Conclusion A high percentage of population differentiation was observed between Kazakh and world populations. The Kazakh population was clustered with Caucasian populations, and this result may indicate a significant Caucasian component in the Kazakh gene pool. PMID:29551892
Sugishita, Mihoko; Imai, Tsuneo; Kikumori, Toyone; Mitsuma, Ayako; Shimokata, Tomoya; Shibata, Takashi; Morita, Sachi; Inada-Inoue, Megumi; Sawaki, Masataka; Hasegawa, Yoshinori; Ando, Yuichi
2016-03-01
Genetic risk factors for febrile neutropenia (FN), the major adverse event of perioperative chemotherapy for early breast cancer, remain unclear. This study retrospectively explored pharmacogenetic associations of single nucleotide polymorphisms (SNPs) of the uridine glucuronosyltransferase 2B7 (UGT2B7, rs7668258), glutathione-S-transferase pi 1 (GSTP1, rs1695), and microcephalin 1 (MCPH1, rs2916733) genes with chemotherapy-related adverse events in 102 Japanese women who received epirubicin and cyclophosphamide as perioperative chemotherapy for early breast cancer. The allele frequencies for all of the SNPs were in concordance with the HapMap data of Japanese individuals. Among the 24 patients who had FN at least once during all courses of chemotherapy, 23 had the A/A genotype, and 1 had the A/G genotype of the GSTP1 polymorphism (rs1695, P = 0.001); 23 of the 70 patients with the A/A genotype had FN, as compared with only 1 of the 32 patients with the A/G and G/G genotypes. The genotype distributions of the UGT2B7 and MCPH1 polymorphisms did not differ between the patients who had FN or grade 3/4 neutropenia and those who did not. Among Japanese women who received epirubicin and cyclophosphamide as perioperative chemotherapy for early breast cancer, those with the A/A genotype of the GSTP1 polymorphism (rs1695) were more likely to have FN.
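The GSTP1 counts reported above form a 2×2 table (A/A: 23 with FN and 47 without; A/G or G/G: 1 with FN and 31 without). The abstract does not name the test behind its P value, so the two-sided Fisher exact test below is only one plausible way to examine such a table; the function name is hypothetical and the result is not claimed to reproduce the published figure.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of seeing x in the top-left cell
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    p_obs = prob(a)
    # Sum the probabilities of all tables no more likely than the observed one
    # (small relative tolerance guards against floating-point ties)
    return sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= p_obs * (1 + 1e-9))

# Rows: A/A vs. A/G + G/G; columns: FN vs. no FN (counts from the abstract)
p_value = fisher_exact_two_sided(23, 47, 1, 31)
```

The margins are held fixed, so the only free quantity is the top-left cell, which is why a single loop over its admissible range suffices.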
EPIGEN-Brazil Initiative resources: a Latin American imputation panel and the Scientific Workflow.
Magalhães, Wagner C S; Araujo, Nathalia M; Leal, Thiago P; Araujo, Gilderlanio S; Viriato, Paula J S; Kehdy, Fernanda S; Costa, Gustavo N; Barreto, Mauricio L; Horta, Bernardo L; Lima-Costa, Maria Fernanda; Pereira, Alexandre C; Tarazona-Santos, Eduardo; Rodrigues, Maíra R
2018-06-14
EPIGEN-Brazil is one of the largest Latin American initiatives at the interface of human genomics, public health, and computational biology. Here, we present two resources to address two challenges to the global dissemination of precision medicine and the development of the bioinformatics know-how to support it. To address the underrepresentation of non-European individuals in human genome diversity studies, we present the EPIGEN-5M+1KGP imputation panel, the fusion of the public 1000 Genomes Project (1KGP) Phase 3 imputation panel with haplotypes derived from the EPIGEN-5M data set (a product of the genotyping of 4.3 million SNPs in 265 admixed individuals from the EPIGEN-Brazil Initiative). When we imputed a target SNPs data set (6487 admixed individuals genotyped for 2.2 million SNPs from the EPIGEN-Brazil project) with the EPIGEN-5M+1KGP panel, we gained 140,452 more SNPs in total than when using the 1KGP Phase 3 panel alone and 788,873 additional high confidence SNPs (info score ≥ 0.8). Thus, the major effect of the inclusion of the EPIGEN-5M data set in this new imputation panel is not only to gain more SNPs but also to improve the quality of imputation. To address the lack of transparency and reproducibility of bioinformatics protocols, we present a conceptual Scientific Workflow in the form of a website that models the scientific process (by including publications, flowcharts, masterscripts, documents, and bioinformatics protocols), making it accessible and interactive. Its applicability is shown in the context of the development of our EPIGEN-5M+1KGP imputation panel. The Scientific Workflow also serves as a repository of bioinformatics resources. © 2018 Magalhães et al.; Published by Cold Spring Harbor Laboratory Press.
Intricacies in arrangement of SNP haplotypes suggest "Great Admixture" that created modern humans.
Dutta, Rajib; Mainsah, Joseph; Yatskiv, Yuriy; Chakrabortty, Sharmistha; Brennan, Patrick; Khuder, Basil; Qiu, Shuhao; Fedorova, Larisa; Fedorov, Alexei
2017-06-05
Inferring history from genomic sequences is challenging and problematic because chromosomes are mosaics of thousands of small identical-by-descent (IBD) fragments, each of them having their own unique story. However, the main events in recent evolution might be deciphered from comparative analysis of numerous loci. A paradox of why humans, whose effective population size is only 10^4, have nearly three million frequent SNPs is formulated and examined. We studied 5398 loci evenly covering all human autosomes. Common haplotypes built from frequent SNPs that are present in people from various populations have been examined. We demonstrated highly non-random arrangement of alleles in common haplotypes. Abundance of mutually exclusive pairs of common haplotypes that have different alleles at every polymorphic position (so-called Yin/Yang haplotypes) was found in 56% of loci. A novel widely spread category of common haplotypes named Mosaic has been described. Mosaic consists of numerous pieces of Yin/Yang haplotypes and represents an ancestral stage of one of them. Scenarios of possible appearance of large number of frequent human SNPs and their habitual arrangement in Yin/Yang common haplotypes have been evaluated with an advanced genomic simulation algorithm. Computer modeling demonstrated that the observed arrangement of 2.9 million frequent SNPs could not originate from a sole stand-alone population. A "Great Admixture" event has been proposed that can explain peculiarities with frequent SNP distributions. This Great Admixture presumably occurred 100-300 thousand years ago between two ancestral populations that had been separated from each other about a million years ago. Our programs and algorithms can be applied to other species to perform evolutionary and comparative genomics.
Gu, Jun-dong; Hua, Feng; Mei, Chao-rong; Zheng, De-jie; Wang, Guo-fan; Zhou, Qing-hua
2014-01-01
Aim: Myeloperoxidase (MPO) and glutathione S-transferase pi 1 (GSTP1) are important carcinogen-metabolizing enzymes. The aim of this study was to investigate the association between the common polymorphisms of MPO and GSTP1 genes and lung cancer risk in a Chinese Han population. Methods: A total of 266 subjects with lung cancer and 307 controls without personal history of the disease were recruited in this case-control study. The tagSNPs approach was used to assess the common polymorphisms of MPO and GSTP1 genes and lung cancer risk according to the disequilibrium information from the HapMap project. The tagSNP rs7208693 was selected as the polymorphism site for MPO, while the haplotype-tagging SNPs rs1695, rs4891, rs762803 and rs749174 were selected as the polymorphism sites for GSTP1. The gene polymorphisms were confirmed using real-time PCR, cloning and sequencing. Results: The four GSTP1 haplotype-tagging SNPs rs1695, rs4891, rs762803 and rs749174, but not the MPO tagSNP rs7208693, exhibited an association with lung cancer susceptibility in smokers in the overall population and in the studied subgroups. When Phase 2 software was used to reconstruct the haplotype for GSTP1, the haplotype CACA (rs749174+rs1695+rs762803+rs4891) exhibited an increased risk of lung cancer among smokers (adjusted odds ratio 1.53; 95%CI 1.04–2.25, P=0.033). Furthermore, diplotype analyses demonstrated a significant association between the risk haplotype and lung cancer. The risk haplotypes co-segregated with one or more biologically functional polymorphisms and corresponded to a recessive inheritance model. Conclusion: The common polymorphisms of the GSTP1 gene may be candidate SNP markers for lung cancer susceptibility in the Chinese Han population. PMID:24786234
Whole genome SNP discovery and analysis of genetic diversity in Turkey (Meleagris gallopavo)
2012-01-01
Background The turkey (Meleagris gallopavo) is an important agricultural species and the second largest contributor to the world’s poultry meat production. Genetic improvement is attributed largely to selective breeding programs that rely on highly heritable phenotypic traits, such as body size and breast muscle development. Commercial breeding with small effective population sizes and epistasis can result in loss of genetic diversity, which in turn can lead to reduced individual fitness and reduced response to selection. The presence of genomic diversity in domestic livestock species is therefore of great importance and a prerequisite for rapid and accurate genetic improvement of selected breeds in various environments, as well as for rapid adaptation to potential changes in breeding goals. Genomic selection requires a large number of genetic markers such as single nucleotide polymorphisms (SNPs), the most abundant source of genetic variation within the genome. Results Alignment of next generation sequencing data of 32 individual turkeys from different populations was used for the discovery of 5.49 million SNPs, which subsequently were used for the analysis of genetic diversity among the different populations. All of the commercial lines branched from a single node relative to the heritage varieties and the South Mexican turkey population. Heterozygosity of all individuals from the different turkey populations ranged from 0.17-2.73 SNPs/Kb, while heterozygosity of populations ranged from 0.73-1.64 SNPs/Kb. The average frequency of heterozygous SNPs in individual turkeys was 1.07 SNPs/Kb. Five genomic regions with very low nucleotide variation were identified in domestic turkeys that showed a state of fixation towards alleles different from the wild alleles. Conclusion The turkey genome is much less diverse, with a relatively low frequency of heterozygous SNPs, compared to other livestock species like chicken and pig. The whole genome SNP discovery study in turkey resulted in the detection of 5.49 million putative SNPs compared to the reference genome. All commercial lines appear to share a common origin. Presence of different alleles/haplotypes in the SM population highlights that specific haplotypes have been selected in the modern domesticated turkey. PMID:22891612
Kobayashi, Masaaki; Nagasaki, Hideki; Garcia, Virginie; Just, Daniel; Bres, Cécile; Mauxion, Jean-Philippe; Le Paslier, Marie-Christine; Brunel, Dominique; Suda, Kunihiro; Minakuchi, Yohei; Toyoda, Atsushi; Fujiyama, Asao; Toyoshima, Hiromi; Suzuki, Takayuki; Igarashi, Kaori; Rothan, Christophe; Kaminuma, Eli; Nakamura, Yasukazu; Yano, Kentaro; Aoki, Koh
2014-02-01
Tomato (Solanum lycopersicum) is regarded as a model plant of the Solanaceae family. The genome sequencing of the tomato cultivar 'Heinz 1706' was recently completed. To accelerate the progress of tomato genomics studies, systematic bioresources, such as mutagenized lines and full-length cDNA libraries, have been established for the cultivar 'Micro-Tom'. However, these resources cannot be utilized to their full potential without the completion of the genome sequencing of 'Micro-Tom'. We undertook the genome sequencing of 'Micro-Tom' and here report the identification of single nucleotide polymorphisms (SNPs) and insertion/deletions (indels) between 'Micro-Tom' and 'Heinz 1706'. The analysis demonstrated the presence of 1.23 million SNPs and 0.19 million indels between the two cultivars. The density of SNPs and indels was high in chromosomes 2, 5 and 11, but was low in chromosomes 6, 8 and 10. Three known mutations of 'Micro-Tom' were localized on chromosomal regions where the density of SNPs and indels was low, which was consistent with the fact that these mutations were relatively new and introgressed into 'Micro-Tom' during the breeding of this cultivar. We also report SNP analysis for two 'Micro-Tom' varieties that have been maintained independently in Japan and France, both of which have served as standard lines for 'Micro-Tom' mutant collections. Approximately 28,000 SNPs were identified between these two 'Micro-Tom' lines. These results provide high-resolution DNA polymorphic information on 'Micro-Tom' and represent a valuable contribution to the 'Micro-Tom'-based genomics resources.
Murabito, Joanne M.; White, Charles C.; Kavousi, Maryam; Sun, Yan V.; Feitosa, Mary F.; Nambi, Vijay; Lamina, Claudia; Schillert, Arne; Coassin, Stefan; Bis, Joshua C.; Broer, Linda; Crawford, Dana C.; Franceschini, Nora; Frikke-Schmidt, Ruth; Haun, Margot; Holewijn, Suzanne; Huffman, Jennifer E.; Hwang, Shih-Jen; Kiechl, Stefan; Kollerits, Barbara; Montasser, May E.; Nolte, Ilja M.; Rudock, Megan E.; Senft, Andrea; Teumer, Alexander; van der Harst, Pim; Vitart, Veronique; Waite, Lindsay L.; Wood, Andrew R.; Wassel, Christina L.; Absher, Devin M.; Allison, Matthew A.; Amin, Najaf; Arnold, Alice; Asselbergs, Folkert W.; Aulchenko, Yurii; Bandinelli, Stefania; Barbalic, Maja; Boban, Mladen; Brown-Gentry, Kristin; Couper, David J.; Criqui, Michael H.; Dehghan, Abbas; Heijer, Martin den; Dieplinger, Benjamin; Ding, Jingzhong; Dörr, Marcus; Espinola-Klein, Christine; Felix, Stephan B.; Ferrucci, Luigi; Folsom, Aaron R.; Fraedrich, Gustav; Gibson, Quince; Goodloe, Robert; Gunjaca, Grgo; Haltmayer, Meinhard; Heiss, Gerardo; Hofman, Albert; Kieback, Arne; Kiemeney, Lambertus A.; Kolcic, Ivana; Kullo, Iftikhar J.; Kritchevsky, Stephen B.; Lackner, Karl J.; Li, Xiaohui; Lieb, Wolfgang; Lohman, Kurt; Meisinger, Christa; Melzer, David; Mohler, Emile R; Mudnic, Ivana; Mueller, Thomas; Navis, Gerjan; Oberhollenzer, Friedrich; Olin, Jeffrey W.; O’Connell, Jeff; O’Donnell, Christopher J.; Palmas, Walter; Penninx, Brenda W.; Petersmann, Astrid; Polasek, Ozren; Psaty, Bruce M.; Rantner, Barbara; Rice, Ken; Rivadeneira, Fernando; Rotter, Jerome I.; Seldenrijk, Adrie; Stadler, Marietta; Summerer, Monika; Tanaka, Toshiko; Tybjaerg-Hansen, Anne; Uitterlinden, Andre G.; van Gilst, Wiek H.; Vermeulen, Sita H.; Wild, Sarah H.; Wild, Philipp S.; Willeit, Johann; Zeller, Tanja; Zemunik, Tatijana; Zgaga, Lina; Assimes, Themistocles L.; Blankenberg, Stefan; Boerwinkle, Eric; Campbell, Harry; Cooke, John P.; de Graaf, Jacqueline; Herrington, David; Kardia, Sharon L. 
R.; Mitchell, Braxton D.; Murray, Anna; Münzel, Thomas; Newman, Anne; Oostra, Ben A.; Rudan, Igor; Shuldiner, Alan R.; Snieder, Harold; van Duijn, Cornelia M.; Völker, Uwe; Wright, Alan F.; Wichmann, H.-Erich; Wilson, James F.; Witteman, Jacqueline C.M.; Liu, Yongmei; Hayward, Caroline; Borecki, Ingrid B.; Ziegler, Andreas; North, Kari E.; Cupples, L. Adrienne; Kronenberg, Florian
2012-01-01
Background Genetic determinants of peripheral arterial disease (PAD) remain largely unknown. To identify genetic variants associated with the ankle-brachial index (ABI), a noninvasive measure of PAD, we conducted a meta-analysis of genome-wide association study data from 21 population-based cohorts. Methods and Results Continuous ABI and PAD (ABI≤0.9) phenotypes adjusted for age and sex were examined. Each study conducted genotyping and imputed data to the ~2.5 million SNPs in HapMap. Linear and logistic regression models were used to test each SNP for association with ABI and PAD using additive genetic models. Study-specific data were combined using fixed-effects inverse variance weighted meta-analyses. There were a total of 41,692 participants of European ancestry (~60% women, mean ABI 1.02 to 1.19), including 3,409 participants with PAD and with GWAS data available. In the discovery meta-analysis, rs10757269 on chromosome 9 near CDKN2B had the strongest association with ABI (β = −0.006, p = 2.46 × 10^-8). We sought replication of the 6 strongest SNP associations in 5 population-based studies and 3 clinical samples (n=16,717). The association for rs10757269 strengthened in the combined discovery and replication analysis (p = 2.65 × 10^-9). No other SNP associations for ABI or PAD achieved genome-wide significance. However, two previously reported candidate genes for PAD and one SNP associated with coronary artery disease (CAD) were associated with ABI: DAB21P (rs13290547, p = 3.6 × 10^-5); CYBA (rs3794624, p = 6.3 × 10^-5); and rs1122608 (LDLR, p=0.0026). Conclusions GWAS in more than 40,000 individuals identified one genome-wide significant association on chromosome 9p21 with ABI. Two candidate genes for PAD and 1 SNP for CAD are associated with ABI. PMID:22199011
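The fixed-effects inverse variance weighted combination named in the Methods above can be sketched as follows. This is a generic textbook formulation, not the consortium's actual pipeline; the function name `fixed_effects_meta` and the per-study values in the usage note are hypothetical.

```python
import math

def fixed_effects_meta(betas, standard_errors):
    """Fixed-effects inverse-variance-weighted meta-analysis of one SNP.

    betas and standard_errors hold per-study effect estimates and their
    standard errors. Returns (pooled_beta, pooled_se, z, two_sided_p).
    """
    if len(betas) != len(standard_errors) or not betas:
        raise ValueError("need one standard error per effect estimate")
    weights = [1.0 / se ** 2 for se in standard_errors]  # w_i = 1 / se_i^2
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled_beta / pooled_se
    # Two-sided p-value under a standard normal null distribution
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled_beta, pooled_se, z, p
```

Calling `fixed_effects_meta([-0.006, -0.006], [0.002, 0.002])` pools two equally precise (made-up) studies: the pooled effect is unchanged while the standard error shrinks by a factor of √2, which is how combining cohorts can push an association past a genome-wide threshold.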
Gene Flow between the Korean Peninsula and Its Neighboring Countries
Cho, Yoon Shin; Oh, Ji Hee; Ryu, Min Hyung; Chung, Hye Won; Seo, Jeong-Sun; Lee, Jong-Eun; Oh, Bermseok; Bhak, Jong; Kim, Hyung-Lae
2010-01-01
SNP markers provide the primary data for population structure analysis. In this study, we employed whole-genome autosomal SNPs as a marker set (54,836 SNP markers) and tested their possible effects on genetic ancestry using 320 subjects covering 24 regional groups including Northern (n = 16) and Southern (n = 3) Asians, Amerindians (n = 1), and four HapMap populations (YRI, CEU, JPT, and CHB). Additionally, we evaluated the effectiveness and robustness of 50K autosomal SNPs with various clustering methods, along with their dependencies on recombination hotspots (RH), linkage disequilibrium (LD), missing calls and regional specific markers. The RH- and LD-free multi-dimensional scaling (MDS) method showed a broad picture of human migration from Africa to North-East Asia on our genome map, supporting results from previous haploid DNA studies. Of the Asian groups, the East Asian group showed greater differentiation than the Northern and Southern Asian groups with respect to Fst statistics. By extension, the analysis of monomorphic markers implied that nine out of ten historical regions in South Korea, and Tokyo in Japan, showed signs of genetic drift caused by the later settlement of East Asia (South Korea, Japan and China), while Gyeongju in South East Korea showed signs of the earliest settlement in East Asia. In the genome map, the gene flow to the Korean Peninsula from its neighboring countries indicated that some genetic signals from Northern populations such as the Siberians and Mongolians still remain in the South East and West regions, while few signals remain from the early Southern lineages. PMID:20686617
Schönherr, Sebastian; Neuner, Mathias; Forer, Lukas; Specht, Günther; Kloss-Brandstätter, Anita; Kronenberg, Florian; Coassin, Stefan
2013-01-01
Single nucleotide polymorphisms (SNPs) play a prominent role in modern genetics. Current genotyping technologies such as Sequenom iPLEX, ABI TaqMan and KBioscience KASPar made the genotyping of huge SNP sets in large populations straightforward and allow the generation of hundreds of thousands of genotypes even in medium sized labs. While data generation is straightforward, the subsequent data conversion, storage and quality control steps are time-consuming, error-prone and require extensive bioinformatic support. In order to ease this tedious process, we developed SNPflow. SNPflow is a lightweight, intuitive and easily deployable application, which processes genotype data from Sequenom MassARRAY (iPLEX) and ABI 7900HT (TaqMan, KASPar) systems and is extendible to other genotyping methods as well. SNPflow automatically converts the raw output files to ready-to-use genotype lists, calculates all standard quality control values such as call rate, expected and real amount of replicates, minor allele frequency, absolute number of discordant replicates, discordance rate and the p-value of the HWE test, checks the plausibility of the observed genotype frequencies by comparing them to HapMap/1000-Genomes, provides a module for the processing of SNPs, which allow sex determination for DNA quality control purposes and, finally, stores all data in a relational database. SNPflow runs on all common operating systems and comes as both stand-alone version and multi-user version for laboratory-wide use. The software, a user manual, screenshots and a screencast illustrating the main features are available at http://genepi-snpflow.i-med.ac.at. PMID:23527209
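Two of the quality control values listed above, call rate and minor allele frequency, are simple enough to illustrate directly. The sketch below is not SNPflow code: the function name `qc_summary`, the two-character genotype strings, and the use of `None` for a failed call are all assumptions made for the example.

```python
from collections import Counter

def qc_summary(genotypes):
    """Call rate and minor allele frequency for one biallelic SNP.

    genotypes is a list of two-character strings such as "AG";
    None marks a sample whose genotype call failed.
    """
    if not genotypes:
        raise ValueError("no samples supplied")
    called = [g for g in genotypes if g is not None]
    call_rate = len(called) / len(genotypes)
    # Count individual alleles across all called genotypes
    alleles = Counter(allele for g in called for allele in g)
    total = sum(alleles.values())
    if len(alleles) == 2:
        maf = min(alleles.values()) / total
    else:
        maf = 0.0  # monomorphic or no calls; other cases are not handled here
    return {"call_rate": call_rate, "maf": maf, "allele_counts": dict(alleles)}

summary = qc_summary(["AA", "AA", "AG", None])
```

In this made-up example three of four samples were called and the G allele appears once among six alleles. Replicate discordance and the HWE p-value, also mentioned above, would need replicate identifiers and genotype counts respectively and are left out.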
ITGB5 and AGFG1 variants are associated with severity of airway responsiveness.
Himes, Blanca E; Qiu, Weiliang; Klanderman, Barbara; Ziniti, John; Senter-Sylvia, Jody; Szefler, Stanley J; Lemanske, Robert F; Zeiger, Robert S; Strunk, Robert C; Martinez, Fernando D; Boushey, Homer; Chinchilli, Vernon M; Israel, Elliot; Mauger, David; Koppelman, Gerard H; Nieuwenhuis, Maartje A E; Postma, Dirkje S; Vonk, Judith M; Rafaels, Nicholas; Hansel, Nadia N; Barnes, Kathleen; Raby, Benjamin; Tantisira, Kelan G; Weiss, Scott T
2013-08-28
Airway hyperresponsiveness (AHR), a primary characteristic of asthma, involves increased airway smooth muscle contractility in response to certain exposures. We sought to determine whether common genetic variants were associated with AHR severity. A genome-wide association study (GWAS) of AHR, quantified as the natural log of the dosage of methacholine causing a 20% drop in FEV1, was performed with 994 non-Hispanic white asthmatic subjects from three drug clinical trials: CAMP, CARE, and ACRN. Genotyping was performed on Affymetrix 6.0 arrays, and imputed data based on HapMap Phase 2 were used to measure the association of SNPs with AHR using a linear regression model. Replication of primary findings was attempted in 650 white subjects from DAG and 3,354 white subjects from LHS. Evidence that the top SNPs were eQTL of their respective genes was sought using expression data available for 419 white CAMP subjects. The top primary GWAS associations were in rs848788 (P-value 7.2E-07) and rs6731443 (P-value 2.5E-06), located within the ITGB5 and AGFG1 genes, respectively. The AGFG1 result replicated at a nominally significant level in one independent population (LHS P-value 0.012), and the SNP had a nominally significant unadjusted P-value (0.0067) for being an eQTL of AGFG1. Based on current knowledge of ITGB5 and AGFG1, our results suggest that variants within these genes may be involved in modulating AHR. Future functional studies are required to confirm that our associations represent true biologically significant findings.
GABRG1 and GABRA2 as Independent Predictors for Alcoholism in Two Populations
Enoch, Mary-Anne; Hodgkinson, Colin A.; Yuan, Qiaoping; Albaugh, Bernard; Virkkunen, Matti; Goldman, David
2008-01-01
The chromosome 4 cluster of GABAA receptor genes is predominantly expressed in the brain reward circuitry and this chromosomal region has been implicated in linkage scans for alcoholism. Variation in one chromosome 4 gene, GABRA2, has been robustly associated with alcohol use disorders (AUD) although no functional locus has been identified. Since HapMap data reveals moderate long-distance linkage disequilibrium across GABRA2 and the adjacent gene, GABRG1, it is possible that the functional locus is in GABRG1. We genotyped 24 SNPs across GABRG1 and GABRA2 in two population isolates: 547 Finnish Caucasian men (266 alcoholics) and 311 community-derived Plains Indian men and women (181 alcoholics). In both the Plains Indians and the Caucasians: (a) the GABRG1 haplotype block(s) did not extend to GABRA2; (b) GABRG1 haplotypes and SNPs were significantly associated with AUD; (c) there was no association between GABRA2 haplotypes and AUD; (d) there were several common (≥ 0.05) haplotypes that spanned GABRG1 and GABRA2 (341 kb), three of which were present in both populations: one of these ancestral haplotypes was associated with AUD, the other two were more common in non-alcoholics; this association was determined by GABRG1; (e) in the Finns, three less common (< 0.05) extended haplotypes showed an association with AUD that was determined by GABRA2. Our results suggest that there are likely to be independent, complex contributions from both GABRG1 and GABRA2 to alcoholism vulnerability. PMID:18818659
Meta-analysis of genome-wide association studies for personality
de Moor, Marleen H.M.; Costa, Paul T.; Terracciano, Antonio; Krueger, Robert F.; de Geus, Eco J.C.; Toshiko, Tanaka; Penninx, Brenda W.J.H.; Esko, Tõnu; Madden, Pamela A F; Derringer, Jaime; Amin, Najaf; Willemsen, Gonneke; Hottenga, Jouke-Jan; Distel, Marijn A.; Uda, Manuela; Sanna, Serena; Spinhoven, Philip; Hartman, Catharina A.; Sullivan, Patrick; Realo, Anu; Allik, Jüri; Heath, Andrew C; Pergadia, Michele L; Agrawal, Arpana; Lin, Peng; Grucza, Richard; Nutile, Teresa; Ciullo, Marina; Rujescu, Dan; Giegling, Ina; Konte, Bettina; Widen, Elisabeth; Cousminer, Diana L; Eriksson, Johan G.; Palotie, Aarno; Luciano, Michelle; Tenesa, Albert; Davies, Gail; Lopez, Lorna M.; Hansell, Narelle K.; Medland, Sarah E.; Ferrucci, Luigi; Schlessinger, David; Montgomery, Grant W.; Wright, Margaret J.; Aulchenko, Yurii S.; Janssens, A.Cecile J.W.; Oostra, Ben A.; Metspalu, Andres; Abecasis, Gonçalo R.; Deary, Ian J.; Räikkönen, Katri; Bierut, Laura J.; Martin, Nicholas G.; van Duijn, Cornelia M.; Boomsma, Dorret I.
2013-01-01
Personality can be thought of as a set of characteristics that influence people’s thoughts, feelings, and behaviour across a variety of settings. Variation in personality is predictive of many outcomes in life, including mental health. Here we report on a meta-analysis of genome-wide association (GWA) data for personality in ten discovery samples (17 375 adults) and five in-silico replication samples (3 294 adults). All participants were of European ancestry. Personality scores for Neuroticism, Extraversion, Openness to Experience, Agreeableness, and Conscientiousness were based on the NEO Five-Factor Inventory. Genotype data were available for ~2.4M Single Nucleotide Polymorphisms (SNPs; directly typed and imputed using HAPMAP data). In the discovery samples, classical association analyses were performed under an additive model followed by meta-analysis using the weighted inverse variance method. Results showed genome-wide significance for Openness to Experience near the RASA1 gene on 5q14.3 (rs1477268 and rs2032794, P = 2.8 × 10^-8 and 3.1 × 10^-8) and for Conscientiousness in the brain-expressed KATNAL2 gene on 18q21.1 (rs2576037, P = 4.9 × 10^-8). We further conducted a gene-based test that confirmed the association of KATNAL2 to Conscientiousness. In-silico replication did not, however, show significant associations of the top SNPs with Openness and Conscientiousness, although the direction of effect of the KATNAL2 SNP on Conscientiousness was consistent in all replication samples. Larger scale GWA studies and alternative approaches are required for confirmation of KATNAL2 as a novel gene affecting Conscientiousness. PMID:21173776
Population Structure of Hispanics in the United States: The Multi-Ethnic Study of Atherosclerosis
Manichaikul, Ani; Palmas, Walter; Rodriguez, Carlos J.; Peralta, Carmen A.; Divers, Jasmin; Guo, Xiuqing; Chen, Wei-Min; Wong, Quenna; Williams, Kayleen; Kerr, Kathleen F.; Taylor, Kent D.; Tsai, Michael Y.; Goodarzi, Mark O.; Sale, Michèle M.; Diez-Roux, Ana V.; Rich, Stephen S.; Rotter, Jerome I.; Mychaleckyj, Josyf C.
2012-01-01
Using ∼60,000 SNPs selected for minimal linkage disequilibrium, we perform population structure analysis of 1,374 unrelated Hispanic individuals from the Multi-Ethnic Study of Atherosclerosis (MESA), with self-identification corresponding to Central America (n = 93), Cuba (n = 50), the Dominican Republic (n = 203), Mexico (n = 708), Puerto Rico (n = 192), and South America (n = 111). By projection of principal components (PCs) of ancestry to samples from the HapMap phase III and the Human Genome Diversity Panel (HGDP), we show the first two PCs quantify the Caucasian, African, and Native American origins, while the third and fourth PCs bring out an axis that aligns with known South-to-North geographic location of HGDP Native American samples and further separates MESA Mexican versus Central/South American samples along the same axis. Using k-means clustering computed from the first four PCs, we define four subgroups of the MESA Hispanic cohort that show close agreement with self-identification, labeling the clusters as primarily Dominican/Cuban, Mexican, Central/South American, and Puerto Rican. To demonstrate our recommendations for genetic analysis in the MESA Hispanic cohort, we present pooled and stratified association analysis of triglycerides for selected SNPs in the LPL and TRIB1 gene regions, previously reported in GWAS of triglycerides in Caucasians but as yet unconfirmed in Hispanic populations. We report statistically significant evidence for genetic association in both genes, and we further demonstrate the importance of considering population substructure and genetic heterogeneity in genetic association studies performed in the United States Hispanic population. PMID:22511882
SNP discovery and genotyping using Genotyping-by-Sequencing in Pekin ducks.
Zhu, Feng; Cui, Qian-Qian; Hou, Zhuo-Cheng
2016-11-15
Genomic selection and genome-wide association studies need thousands to millions of SNPs. However, many non-model species do not have reference chips for detecting variation. Our goal was to develop and validate an inexpensive but effective method for detecting SNP variation. Genotyping by sequencing (GBS) can be a highly efficient strategy for genome-wide SNP detection, as an alternative to microarray chips. Here, we developed a GBS protocol for ducks and tested it to genotype 49 Pekin ducks. A total of 169,209 SNPs were identified from all animals, with a mean of 55,920 SNPs per individual. The average SNP density reached 1156 SNPs/MB. In this study, the first application of GBS to ducks, we demonstrate the power and simplicity of this method. GBS can be used in genetic studies as an effective method for genome-wide SNP discovery.
Jha, Ruchira Menka; Koleck, Theresa A; Puccio, Ava M; Okonkwo, David O; Park, Seo-Young; Zusman, Benjamin E; Clark, Robert S B; Shutter, Lori A; Wallisch, Jessica S; Empey, Philip E; Kochanek, Patrick M; Conley, Yvette P
2018-04-19
ABCC8 encodes sulfonylurea receptor 1, a key regulatory protein of cerebral oedema in many neurological disorders including traumatic brain injury (TBI). Sulfonylurea-receptor-1 inhibition has been promising in ameliorating cerebral oedema in clinical trials. We evaluated whether ABCC8 tag single-nucleotide polymorphisms predicted oedema and outcome in TBI. DNA was extracted from 485 prospectively enrolled patients with severe TBI. 410 were analysed after quality control. ABCC8 tag single-nucleotide polymorphisms (SNPs) were identified (HapMap, r^2 > 0.8, minor-allele frequency > 0.20) and sequenced (iPlex-Gold, MassArray). Outcomes included radiographic oedema, intracranial pressure (ICP) and 3-month Glasgow Outcome Scale (GOS) score. Proxy SNPs, spatial modelling, amino acid topology and functional predictions were determined using established software programs. Wild-type rs7105832 and rs2237982 alleles and genotypes were associated with lower average ICP (β=-2.91, p=0.001; β=-2.28, p=0.003) and decreased radiographic oedema (OR 0.42, p=0.012; OR 0.52, p=0.017). Wild-type rs2237982 also increased favourable 3-month GOS (OR 2.45, p=0.006); this was partially mediated by oedema (p=0.03). Different polymorphisms predicted 3-month outcome: variant rs11024286 increased (OR 1.84, p=0.006) and wild-type rs4148622 decreased (OR 0.40, p=0.01) the odds of favourable outcome. Significant tag and concordant proxy SNPs regionally span introns/exons 2-15 of the 39-exon gene. This study identifies four ABCC8 tag SNPs associated with cerebral oedema and/or outcome in TBI, tagging a region including 33 polymorphisms. In polymorphisms predictive of oedema, variant alleles/genotypes confer increased risk. Different variant polymorphisms were associated with favourable outcome, potentially suggesting distinct mechanisms.
Significant polymorphisms spatially clustered flanking exons encoding the sulfonylurea receptor site and transmembrane domain 0/loop 0 (juxtaposing the channel pore/binding site). This, if validated, may help build a foundation for developing future strategies that may guide individualised care, treatment response, prognosis and patient selection for clinical trials.
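The tag SNP criterion quoted in this abstract is a pairwise linkage disequilibrium threshold on r squared (above 0.8). As a rough illustration of what that statistic measures, the sketch below computes r squared for two biallelic SNPs from phased haplotype counts. It assumes phased data and is not the selection procedure used in the study; the function name and the counts in the usage note are hypothetical.

```python
def ld_r_squared(n11, n10, n01, n00):
    """Pairwise LD r^2 between two biallelic SNPs from phased haplotype counts.

    n11: haplotypes carrying allele 1 at both SNPs
    n10: allele 1 at the first SNP, allele 0 at the second
    n01: allele 0 at the first SNP, allele 1 at the second
    n00: allele 0 at both SNPs
    """
    n = n11 + n10 + n01 + n00
    if n == 0:
        raise ValueError("no haplotypes supplied")
    p1 = (n11 + n10) / n   # frequency of allele 1 at the first SNP
    q1 = (n11 + n01) / n   # frequency of allele 1 at the second SNP
    d = n11 / n - p1 * q1  # disequilibrium coefficient D
    denom = p1 * (1.0 - p1) * q1 * (1.0 - q1)
    if denom == 0.0:
        raise ValueError("r^2 is undefined when either SNP is monomorphic")
    return d * d / denom
```

With made-up counts, `ld_r_squared(30, 0, 0, 70)` describes two SNPs whose alleles always travel together, while `ld_r_squared(25, 25, 25, 25)` describes two SNPs that are statistically independent. A SNP pair above the chosen threshold lets one SNP stand in for the other, which is what makes tagging economical.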
Liu, He; Jiang, Xia; Zhang, Ming-wu; Pan, Yi-feng; Yu, Yun-xian; Zhang, Shan-chun; Ma, Xin-yuan; Li, Qi-long; Chen, Kun
2013-01-01
The initiators caspase-9 (CASP9) and caspase-10 (CASP10) are two key controllers of apoptosis and play important roles in carcinogenesis. This study aims to explore the association between CASP gene polymorphisms and colorectal cancer (CRC) susceptibility in a population-based study. A two-stage population-based case-control study was carried out, including a testing set with 300 cases and 296 controls and a validation set with 206 cases and 845 controls. A total of eight tag single nucleotide polymorphisms (SNPs) in CASP9 and CASP10 were chosen based on HapMap and the National Center of Biotechnology Information (NCBI) datasets and genotyped by restriction fragment length polymorphism (RFLP) assay. Multivariate logistic regression models were applied to evaluate the association of SNPs with CRC risk. In the first stage, three of the eight tag SNPs, rs4646077 (odds ratio (OR)(AA+AG): 0.654, 95% confidence interval (CI): 0.406–1.055; P=0.082), rs4233532 (OR(CC): 1.667, 95% CI: 0.967–2.876; OR(CT): 1.435, 95% CI: 0.998–2.063; P=0.077), and rs2881930 (OR(CC): 0.263, 95% CI: 0.095–0.728, P=0.036), showed possible association with CRC risk. However, none of the three SNPs, rs4646077 (OR(AA+AG): 1.233, 95% CI: 0.903–1.683), rs4233532 (OR(CC): 0.892, 95% CI: 0.640–1.243; OR(CT): 1.134, 95% CI: 0.897–1.433), and rs2881930 (OR(CC): 1.096, 95% CI: 0.620–1.938; OR(CT): 1.009, 95% CI: 0.801–1.271), remained significantly associated with CRC risk in the validation set, even after stratification by tumor location (colon or rectum). In addition, never drinking tea was associated with a significantly increased risk of CRC in the testing and validation sets combined (OR: 1.755, 95% CI: 1.319–2.334). Our results suggest that polymorphisms of the CASP9 and CASP10 genes may not contribute to CRC risk in the Chinese population, and that larger-scale case-control studies may be warranted. In addition, tea drinking was a protective factor for CRC. PMID:23303631
Natarajan, Sathishkumar; Kim, Hoy-Taek; Thamilarasan, Senthil Kumar; Veerappan, Karpagam; Park, Jong-In; Nou, Ill-Sup
2016-01-01
Powdery mildew is one of the most common fungal diseases in the world. This disease frequently affects melon (Cucumis melo L.) and other Cucurbitaceous family crops in both open field and greenhouse cultivation. One of the goals of genomics is to identify the polymorphic loci responsible for variation in phenotypic traits. In this study, powdery mildew disease assessment scores were calculated for four melon accessions, 'SCNU1154', 'Edisto47', 'MR-1', and 'PMR5'. To investigate the genetic variation of these accessions, whole genome re-sequencing using the Illumina HiSeq 2000 platform was performed. A total of 754,759,704 quality-filtered reads were generated, with an average of 82.64% coverage relative to the reference genome. Comparisons of the sequences for the melon accessions revealed around 7.4 million single nucleotide polymorphisms (SNPs), 1.9 million InDels, and 182,398 putative structural variations (SVs). Functional enrichment analysis of detected variations classified them into biological process, cellular component and molecular function categories. Further, a disease-associated QTL map was constructed for 390 SNPs and 45 InDels identified as related to defense-response genes. Among them 112 SNPs and 12 InDels were observed in powdery mildew responsive chromosomes. Accordingly, this whole genome re-sequencing study identified SNPs and InDels associated with defense genes that will serve as candidate polymorphisms in the search for sources of resistance against powdery mildew disease and could accelerate marker-assisted breeding in melon.
Blåhed, Ida-Maria; Königsson, Helena; Ericsson, Göran; Spong, Göran
2018-01-01
Monitoring of wild animal populations is challenging, yet reliable information about population processes is important for both management and conservation efforts. Access to molecular markers, such as SNPs, enables population monitoring through genotyping of various DNA sources. We have developed 96 high quality SNP markers for individual identification of moose (Alces alces), an economically and ecologically important top-herbivore in boreal regions. Reduced representation libraries constructed from 34 moose were high-throughput de novo sequenced, generating nearly 50 million read pairs. About 50 000 stacks of aligned reads containing one or more SNPs were discovered with the Stacks pipeline. Several quality criteria were applied on the candidate SNPs to find markers informative on the individual level and well representative for the population. An empirical validation by genotyping of sequenced individuals and additional moose, resulted in the selection of a final panel of 86 high quality autosomal SNPs. Additionally, five sex-specific SNPs and five SNPs for sympatric species diagnostics are included in the panel. The genotyping error rate was 0.002 for the total panel and probability of identities were low enough to separate individuals with high confidence. Moreover, the autosomal SNPs were highly informative also for population level analyses. The potential applications of this SNP panel are thus many including investigations of population size, sex ratios, relatedness, reproductive success and population structure. Ideally, SNP-based studies could improve today's population monitoring and increase our knowledge about moose population dynamics.
Construction of the third-generation Zea mays haplotype map.
Bukowski, Robert; Guo, Xiaosen; Lu, Yanli; Zou, Cheng; He, Bing; Rong, Zhengqin; Wang, Bo; Xu, Dawen; Yang, Bicheng; Xie, Chuanxiao; Fan, Longjiang; Gao, Shibin; Xu, Xun; Zhang, Gengyun; Li, Yingrui; Jiao, Yinping; Doebley, John F; Ross-Ibarra, Jeffrey; Lorant, Anne; Buffalo, Vince; Romay, M Cinta; Buckler, Edward S; Ware, Doreen; Lai, Jinsheng; Sun, Qi; Xu, Yunbi
2018-04-01
Characterization of genetic variations in maize has been challenging, mainly due to deterioration of collinearity between individual genomes in the species. An international consortium of maize research groups combined resources to develop the maize haplotype version 3 (HapMap 3), built from whole-genome sequencing data from 1218 maize lines, covering predomestication and domesticated Zea mays varieties across the world. A new computational pipeline was set up to process more than 12 trillion bp of sequencing data, and a set of population genetics filters was applied to identify more than 83 million variant sites. We identified polymorphisms in regions where collinearity is largely preserved in the maize species. However, the fact that the B73 genome used as the reference only represents a fraction of all haplotypes is still an important limiting factor.
Quantifying the utility of single nucleotide polymorphisms to guide colorectal cancer screening
Jenkins, Mark A; Makalic, Enes; Dowty, James G; Schmidt, Daniel F; Dite, Gillian S; MacInnis, Robert J; Ait Ouakrim, Driss; Clendenning, Mark; Flander, Louisa B; Stanesby, Oliver K; Hopper, John L; Win, Aung K; Buchanan, Daniel D
2016-01-01
Aim: To determine whether single nucleotide polymorphisms (SNPs) can be used to identify people who should be screened for colorectal cancer. Methods: We simulated one million people with and without colorectal cancer based on published SNP allele frequencies and strengths of colorectal cancer association. We estimated 5-year risks of colorectal cancer by number of risk alleles. Results: We identified 45 SNPs with an average 1.14-fold increase colorectal cancer risk per allele (range: 1.05–1.53). The colorectal cancer risk for people in the highest quintile of risk alleles was 1.81-times that for the average person. Conclusion: We have quantified the extent to which known susceptibility SNPs can stratify the population into clinically useful colorectal cancer risk categories. PMID:26846999
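The idea of ranking people by their number of risk alleles can be sketched under a simple log-additive model. This is an illustrative assumption, not the simulation the authors ran: SNPs are treated as independent and in Hardy-Weinberg equilibrium, per-allele odds ratios are treated as relative risks, and the function name and inputs are hypothetical.

```python
def relative_risk_vs_average(genotypes, per_allele_ors, risk_allele_freqs):
    """Risk of one person relative to the population average.

    genotypes         : risk-allele count (0, 1 or 2) at each SNP
    per_allele_ors    : per-allele odds ratio at each SNP
    risk_allele_freqs : population frequency of the risk allele at each SNP

    Under Hardy-Weinberg equilibrium the allele count G is Binomial(2, p),
    so the population mean of OR**G is (1 - p + p * OR) ** 2.
    """
    rr = 1.0
    for g, odds_ratio, p in zip(genotypes, per_allele_ors, risk_allele_freqs):
        individual = odds_ratio ** g
        population_mean = (1 - p + p * odds_ratio) ** 2
        rr *= individual / population_mean
    return rr


# Hypothetical person carrying one risk allele at each of three SNPs.
print(relative_risk_vs_average([1, 1, 1], [1.14, 1.05, 1.53], [0.3, 0.5, 0.1]))
```

Dividing by the population mean at each SNP keeps the average relative risk at 1, which is what makes a statement such as "1.81 times that for the average person" interpretable.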
New genetic variants associated with prostate cancer
Researchers have newly identified 23 common genetic variants -- one-letter changes in DNA known as single-nucleotide polymorphisms or SNPs -- that are associated with risk of prostate cancer. These results come from an analysis of more than 10 million SNP
Iacono, William G; Malone, Stephen M; Vaidyanathan, Uma; Vrieze, Scott I
2014-12-01
This article provides an introductory overview of the investigative strategy employed to evaluate the genetic basis of 17 endophenotypes examined as part of a 20-year data collection effort from the Minnesota Center for Twin and Family Research. Included are characterization of the study samples, descriptive statistics for key properties of the psychophysiological measures, and rationale behind the steps taken in the molecular genetic study design. The statistical approach included (a) biometric analysis of twin and family data, (b) heritability analysis using 527,829 single nucleotide polymorphisms (SNPs), (c) genome-wide association analysis of these SNPs and 17,601 autosomal genes, (d) follow-up analyses of candidate SNPs and genes hypothesized to have an association with each endophenotype, (e) rare variant analysis of nonsynonymous SNPs in the exome, and (f) whole genome sequencing association analysis using 27 million genetic variants. These methods were used in the accompanying empirical articles comprising this special issue, Genome-Wide Scans of Genetic Variants for Psychophysiological Endophenotypes.
Panicker, Vijay; Cluett, Christie; Shields, Beverley; Murray, Anna; Parnell, Kirstie S.; Perry, John R. B.; Weedon, Michael N.; Singleton, Andrew; Hernandez, Dena; Evans, Jonathan; Durant, Claire; Ferrucci, Luigi; Melzer, David; Saravanan, Ponnusamy; Visser, Theo J.; Ceresini, Graziano; Hattersley, Andrew T.; Vaidya, Bijay; Dayan, Colin M.; Frayling, Timothy M.
2008-01-01
Introduction: Genetic factors influence circulating thyroid hormone levels, but the common gene variants involved have not been conclusively identified. The genes encoding the iodothyronine deiodinases are good candidates because they alter the balance of thyroid hormones. We aimed to thoroughly examine the role of common variation across the three deiodinase genes in relation to thyroid hormones. Methods: We used HapMap data to select single-nucleotide polymorphisms (SNPs) that captured a large proportion of the common genetic variation across the three deiodinase genes. We analyzed these initially in a cohort of 552 people on T4 replacement. Suggestive findings were taken forward into three additional studies in people not on T4 (total n = 2513) and metaanalyzed for confirmation. Results: A SNP in the DIO1 gene, rs2235544, was associated with the free T3 to free T4 ratio with genome-wide levels of significance (P = 3.6 × 10−13). The C-allele of this SNP was associated with increased deiodinase 1 (D1) function with resulting increase in free T3/T4 ratio and free T3 and decrease in free T4 and rT3. There was no effect on serum TSH levels. None of the SNPs in the genes coding for D2 or D3 had any influence on hormone levels. Conclusions: This study provides convincing evidence that common genetic variation in DIO1 alters deiodinase function, resulting in an alteration in the balance of circulating free T3 to free T4. This should prove a valuable tool to assess the relative effects of circulating free T3 vs. free T4 on a wide range of biological parameters. PMID:18492748
Association Analysis of the Ephrin-B2 Gene in African-Americans with End-Stage Renal Disease
Hicks, Pamela J.; Staten, Jennifer L.; Palmer, Nicholette D.; Langefeld, Carl D.; Ziegler, Julie T.; Keene, Keith L.; Sale, Michele M.; Bowden, Donald W.; Freedman, Barry I.
2008-01-01
Background Genome scans in African-Americans with end-stage renal disease (ESRD) identified linkage on chromosome 13q33 in the region containing the ephrin-B2 ligand (EFNB2) genes. Interactions between the ephrin-B2 receptor and ephrin-B2 ligand play essential roles in renal angiogenesis, blood vessel maturation, and kidney disease. Methods The EFNB2 gene was evaluated as a positional candidate for non-diabetic and diabetic ESRD susceptibility in 1,071 unrelated African-American subjects; 316 with non-diabetic etiologies of ESRD, 394 with type 2 diabetes-associated ESRD and 361 healthy controls. Single nucleotide polymorphism (SNP) genotyping was performed on the Sequenom Mass Array System. Statistical analyses were computed using Dandelion version 1.26, Snpaddmix version 1.4 and Haploview version 3.32. Results Twenty-eight HapMap tag SNPs were genotyped spanning the 39 kilobases (kb) of the EFNB2 coding region, with average spacing of 1.43 kb. Analysis of 710 ESRD patient samples and 361 controls provided no evidence of single SNP associations in either diabetic or non-diabetic ESRD; although nominal evidence of association with all-cause ESRD was observed with a two SNP (p = 0.022) and three SNP (p = 0.023) haplotype, both containing SNPs rs7490924 and rs2391335 in intron 1. Conclusions Although an attractive positional candidate gene, polymorphisms in the EFNB2 gene do not appear to contribute in a substantial way to non-diabetic, diabetic or all-cause ESRD susceptibility in African-Americans. Additional genes within the chromosome 13q33 linkage interval are likely contributors to African-American non-diabetic ESRD. PMID:18580054
Dixon, Peter H; Wadsworth, Christopher A; Chambers, Jennifer; Donnelly, Jennifer; Cooley, Sharon; Buckley, Rebecca; Mannino, Ramona; Jarvis, Sheba; Syngelaki, Argyro; Geenes, Victoria; Paul, Priyadarshini; Sothinathan, Meera; Kubitz, Ralf; Lammert, Frank; Tribe, Rachel M; Ch'ng, Chin Lye; Marschall, Hanns-Ulrich; Glantz, Anna; Khan, Shahid A; Nicolaides, Kypros; Whittaker, John; Geary, Michael; Williamson, Catherine
2014-01-01
OBJECTIVES: Intrahepatic cholestasis of pregnancy (ICP) has a complex etiology with a significant genetic component. Heterozygous mutations of canalicular transporters occur in a subset of ICP cases and a population susceptibility allele (p.444A) has been identified in ABCB11. We sought to expand our knowledge of the detailed genetic contribution to ICP by investigation of common variation around candidate loci with biological plausibility for a role in ICP (ABCB4, ABCB11, ABCC2, ATP8B1, NR1H4, and FGF19). METHODS: ICP patients (n=563) of white western European origin and controls (n=642) were analyzed in a case–control design. Single-nucleotide polymorphism (SNP) markers (n=83) were selected from the HapMap data set (Tagger, Haploview 4.1 (build 22)). Genotyping was performed by allelic discrimination assay on a robotic platform. Following quality control, SNP data were analyzed by Armitage's trend test. RESULTS: Cochran–Armitage trend testing identified six SNPs in ABCB11 together with six SNPs in ABCB4 that showed significant evidence of association. The minimum Bonferroni corrected P value for trend testing ABCB11 was 5.81×10−4 (rs3815676) and for ABCB4 it was 4.6×10−7(rs2109505). Conditional analysis of the two clusters of association signals suggested a single signal in ABCB4 but evidence for two independent signals in ABCB11. To confirm these findings, a second study was performed in a further 227 cases, which confirmed and strengthened the original findings. CONCLUSIONS: Our analysis of a large cohort of ICP cases has identified a key role for common variation around the ABCB4 and ABCB11 loci, identified the core associations, and expanded our knowledge of ICP susceptibility. PMID:24366234
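The Cochran-Armitage trend test named in this abstract can be sketched for a single SNP as follows. This is a minimal illustration with additive weights (0, 1, 2) and a normal approximation for the P value; the genotype counts are invented for the example, and the Bonferroni helper simply multiplies by a hypothetical number of tests.

```python
import math


def cochran_armitage_trend(cases, controls, weights=(0, 1, 2)):
    """Cochran-Armitage trend test on a 2 x k genotype table.

    cases, controls : counts per genotype class (e.g. 0, 1, 2 risk alleles)
    Returns (z, two_sided_p) using the normal approximation.
    """
    col_totals = [r + s for r, s in zip(cases, controls)]
    n_cases, n_controls = sum(cases), sum(controls)
    n = n_cases + n_controls
    sum_tr = sum(t * r for t, r in zip(weights, cases))
    sum_tn = sum(t * c for t, c in zip(weights, col_totals))
    sum_t2n = sum(t * t * c for t, c in zip(weights, col_totals))
    statistic = n * sum_tr - n_cases * sum_tn
    variance = n_cases * n_controls * (n * sum_t2n - sum_tn ** 2) / n
    if variance == 0:
        return 0.0, 1.0
    z = statistic / math.sqrt(variance)
    p = math.erfc(abs(z) / math.sqrt(2))      # two-sided normal tail
    return z, p


def bonferroni(p, n_tests):
    """Bonferroni-corrected P value, capped at 1."""
    return min(1.0, p * n_tests)


# Invented counts: risk genotype enriched among cases.
z, p = cochran_armitage_trend(cases=[10, 20, 30], controls=[30, 20, 10])
print(z, p, bonferroni(p, n_tests=83))
```

With the default weights the test looks for a linear change in the proportion of cases with each additional copy of the allele, which is why it is commonly paired with an additive genetic model.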
Landgren, Sara; Jerlhag, Elisabet; Zetterberg, Henrik; Gonzalez-Quintela, Arturo; Campos, Joaquin; Olofsson, Ulrica; Nilsson, Staffan; Blennow, Kaj; Engel, Jörgen A
2008-12-01
Ghrelin, an orexigenic peptide, acts on growth hormone secretagogue receptors (GHS-R1A), expressed in the hypothalamus as well as in important reward nodes such as the ventral tegmental area. Interestingly, ghrelin has been found to activate an important part of the reward systems, i.e., the cholinergic-dopaminergic reward link. Additionally, the rewarding and neurochemical properties of alcohol are, at least in part, mediated via this reward link. There is comorbidity between alcohol dependence and eating disorders. Thus, plasma levels of ghrelin are altered in patients with addictive behaviors such as alcohol and nicotine dependence and in binge eating disorder. This overlap prompted us to investigate the pro-ghrelin and GHS-R1A genes in a haplotype analysis of heavy alcohol-using individuals. A total of 417 Spanish individuals (abstainers, moderate, and heavy alcohol drinkers) were investigated in a haplotype analysis of the pro-ghrelin and GHS-R1A genes. Tag SNPs were chosen using HapMap data and the Tagger and Haploview software. These SNPs were then genotyped using TaqMan Allelic Discrimination. SNP rs2232165 of the GHS-R1A gene was associated with heavy alcohol consumption, and SNP rs2948694 of the same gene, as well as haplotypes of both the pro-ghrelin and the GHS-R1A genes, were associated with body mass in heavy alcohol-consuming individuals. The present findings are the first to disclose an association between the pro-ghrelin and GHS-R1A genes and heavy alcohol use, further strengthening the role of the ghrelin system in addictive behaviors and brain reward.
Lewis, Joshua P.; Palmer, Nicholette D.; Hicks, Pamela J.; Sale, Michele M.; Langefeld, Carl D.; Freedman, Barry I.; Divers, Jasmin; Bowden, Donald W.
2008-01-01
OBJECTIVE— Several whole-genome association studies have reported identification of type 2 diabetes susceptibility genes in various European-derived study populations. Little investigation of these loci has been reported in other ethnic groups, specifically African Americans. Striking differences exist between these populations, suggesting they may not share identical genetic risk factors. Our objective was to examine the influence of type 2 diabetes genes identified in whole-genome association studies in a large African American case-control population. RESEARCH DESIGN AND METHODS— Single nucleotide polymorphisms (SNPs) in 12 loci (e.g., TCF7L2, IDE/KIF11/HHEX, SLC30A8, CDKAL1, PKN2, IGF2BP2, FLJ39370, and EXT2/ALX4) associated with type 2 diabetes in European-derived populations were genotyped in 993 African American type 2 diabetic and 1,054 African American control subjects. Additionally, 68 ancestry-informative markers were genotyped to account for the impact of admixture on association results. RESULTS— Little evidence of association was observed between SNPs, with the exception of those in TCF7L2, and type 2 diabetes in African Americans. One TCF7L2 SNP (rs7903146) showed compelling evidence of association with type 2 diabetes (admixture-adjusted additive P [Pa] = 1.59 × 10−6). Only the intragenic SNP on 11p12 (rs9300039, dominant P [Pd] = 0.029) was also associated with type 2 diabetes after admixture adjustments. Interestingly, four of the SNPs are monomorphic in the Yoruba population of the HAPMAP project, with only the risk allele from the populations of European descent present. CONCLUSIONS— Results suggest that these variants do not significantly contribute to interindividual susceptibility to type 2 diabetes in African Americans. Consequently, genes contributing to type 2 diabetes in African Americans may, in part, be different from those in European-derived study populations. 
High frequency of risk alleles in several of these genes may, however, contribute to the increased prevalence of type 2 diabetes in African Americans. PMID:18443202
Wojczynski, M.K.; Parnell, L.D.; Pollin, T.I.; Lai, C.Q.; Feitosa, M.F.; O’Connell, J.R.; Frazier-Wood, A.C.; Gibson, Q.; Aslibekyan, S.; Ryan, K.A.; Province, M.A.; Tiwari, H.K.; Ordovas, J.M.; Shuldiner, A.R.; Arnett, D.K.; Borecki, I.B.
2015-01-01
Objective The triglyceride (TG) response to a high-fat meal (postprandial lipemia, PPL) affects cardiovascular disease risk and is influenced by genes and environment. Genes involved in lipid metabolism have dominated genetic studies of PPL TG response. We sought to elucidate common genetic variants through a genome-wide association (GWA) study in the Genetics of Lipid Lowering Drugs and Diet Network (GOLDN). Methods The GOLDN GWAS discovery sample consisted of 872 participants within families of European ancestry. Genotypes for 2,543,887 variants were measured or imputed from HapMap. Replication of our top results was performed in the Heredity and Phenotype Intervention (HAPI) Heart Study (n=843). PPL TG response phenotypes were constructed from plasma TG measured at baseline (fasting, 0 hour), 3.5 and 6 hours after a high-fat meal, using a random coefficient regression model. Association analyses were adjusted for covariates and principal components, as necessary, in a linear mixed model using the kinship matrix; additional models further adjusted for fasting TG were also performed. Meta-analysis of the discovery and replication studies (n=1,715) was performed on the top SNPs from GOLDN. Results GOLDN revealed 111 suggestive (p<1E-05) associations, with two SNPs meeting GWA significance level (p<5E-08). Of the two significant SNPs, rs964184 demonstrated evidence of replication (p=1.20E-03) in the HAPI Heart Study and in a joint analysis, was GWA significant (p=1.26E-09). Rs964184 has been associated with fasting lipids (TG and HDL) and is near ZPR1 (formerly ZNF259), close to the APOA1/C3/A4/A5 cluster. This association was attenuated upon additional adjustment for fasting TG. Conclusion This is the first report of a genome-wide significant association with replication for a novel phenotype, namely PPL TG response. Future investigation into response phenotypes is warranted using pathway analyses, or newer genetic technologies such as metabolomics. PMID:26256467
Butsch Kovacic, Melinda; Biagini Myers, Jocelyn M.; Wang, Ning; Martin, Lisa J.; Lindsey, Mark; Ericksen, Mark B.; He, Hua; Patterson, Tia L.; Baye, Tesfaye M.; Torgerson, Dara; Roth, Lindsey A.; Gupta, Jayanta; Sivaprasad, Umasundari; Gibson, Aaron M.; Tsoras, Anna M.; Hu, Donglei; Eng, Celeste; Chapela, Rocío; Rodríguez-Santana, José R.; Rodríguez-Cintrón, William; Avila, Pedro C.; Beckman, Kenneth; Seibold, Max A.; Gignoux, Chris; Musaad, Salma M.; Chen, Weiguo; Burchard, Esteban González; Khurana Hershey, Gurjit K.
2011-01-01
Background Asthma is a chronic inflammatory disease with a strong genetic predisposition. A major challenge for candidate gene association studies in asthma is the selection of biologically relevant genes. Methodology/Principal Findings Using epithelial RNA expression arrays, HapMap allele frequency variation, and the literature, we identified six possible candidate susceptibility genes for childhood asthma including ADCY2, DNAH5, KIF3A, PDE4B, PLAU, SPRR2B. To evaluate these genes, we compared the genotypes of 194 predominantly tagging SNPs in 790 asthmatic, allergic and non-allergic children. We found that SNPs in all six genes were nominally associated with asthma (p<0.05) in our discovery cohort and in three independent cohorts at either the SNP or gene level (p<0.05). Further, we determined that our selection approach was superior to random selection of genes either differentially expressed in asthmatics compared to controls (p = 0.0049) or selected based on the literature alone (p = 0.0049), substantiating the validity of our gene selection approach. Importantly, we observed that 7 of 9 SNPs in the KIF3A gene more than doubled the odds of asthma (OR = 2.3, p<0.0001) and increased the odds of allergic disease (OR = 1.8, p<0.008). Our data indicate that KIF3A rs7737031 (T-allele) has an asthma population attributable risk of 18.5%. The association between KIF3A rs7737031 and asthma was validated in 3 independent populations, further substantiating the validity of our gene selection approach. Conclusions/Significance Our study demonstrates that KIF3A, a member of the kinesin superfamily of microtubule associated motors that are important in the transport of protein complexes within cilia, is a novel candidate gene for childhood asthma. Polymorphisms in KIF3A may in part be responsible for poor mucus and/or allergen clearance from the airways. 
Furthermore, our study provides a promising framework for the identification and evaluation of novel candidate susceptibility genes. PMID:21912604
Genetic polymorphisms of pharmacogenomic VIP variants in the Yi population from China.
Yan, Mengdan; Li, Dianzhen; Zhao, Guige; Li, Jing; Niu, Fanglin; Li, Bin; Chen, Peng; Jin, Tianbo
2018-03-30
Drug response and target therapeutic dosage differ among individuals. The variability is largely genetically determined. With the development of pharmacogenetics and pharmacogenomics, widespread research has provided a wealth of information on drug-related genetic polymorphisms, and the very important pharmacogenetic (VIP) variants have been identified for the major populations around the world, whereas less is known regarding minorities in China, including the Yi ethnic group. Our research aims to screen the potential pharmacogenomic variants in the Yi population and provide a theoretical basis for future medication guidance. In the present study, 80 VIP variants (selected from the PharmGKB database) were genotyped in 100 unrelated and healthy Yi adults recruited for our research. Through statistical analysis, we compared the Yi with the other 11 populations listed in the HapMap database to detect significant SNPs. Two specific SNPs were subsequently included in an analysis of global allele distribution, with the frequencies downloaded from the ALlele FREquency Database. Moreover, F-statistics (Fst), genetic structure and phylogenetic tree analyses were conducted to determine the genetic similarity between the 12 ethnic groups. Using χ² tests, rs1128503 (ABCB1), rs7294 (VKORC1), rs9934438 (VKORC1), rs1540339 (VDR) and rs689466 (PTGS2) were identified as the significantly different loci for further analysis. The global allele distribution revealed that the allele "A" of rs1540339 and rs9934438 was more frequent in Yi people, which was consistent with most populations in East Asia. F-statistics (Fst), genetic structure and phylogenetic tree analyses demonstrated that the Yi and CHD shared the closest relationship in genetic background. Additionally, the Yi were considered similar to the Han people from Shaanxi province among the domestic ethnic populations in China. 
Our results demonstrated significant differences in several polymorphic SNPs and supplement the pharmacogenomic information for the Yi population, which could provide new strategies for optimizing clinical medication in accordance with the genetic determinants of drug toxicity and efficacy. Copyright © 2018 Elsevier B.V. All rights reserved.
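The abstract above uses Fst to compare populations but does not say which estimator was applied. As a minimal sketch only, assuming Wright's heterozygosity-based definition for a single biallelic SNP and two equally weighted populations, the quantity can be computed as below. The function name and the example frequencies are hypothetical and are not taken from the study.

```python
def fst_two_populations(p1: float, p2: float) -> float:
    """Wright's Fst for one biallelic SNP, two equally weighted populations.

    p1 and p2 are the frequencies of the same allele in each population.
    This is an illustrative estimator, not the one used in the study.
    """
    for p in (p1, p2):
        if not 0.0 <= p <= 1.0:
            raise ValueError("allele frequencies must lie in [0, 1]")
    p_bar = (p1 + p2) / 2.0
    # Expected heterozygosity of the pooled population.
    h_total = 2.0 * p_bar * (1.0 - p_bar)
    # Mean expected heterozygosity within the two populations.
    h_within = (2.0 * p1 * (1.0 - p1) + 2.0 * p2 * (1.0 - p2)) / 2.0
    if h_total == 0.0:
        return 0.0  # the locus is monomorphic overall, so no differentiation
    return (h_total - h_within) / h_total


# Hypothetical allele frequencies, for illustration only.
example_fst = fst_two_populations(0.2, 0.8)
```

Identical frequencies give 0 and fixed opposite alleles give 1, so values near 0 indicate genetically similar populations at that locus.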
Lewis, Joshua P; Palmer, Nicholette D; Hicks, Pamela J; Sale, Michele M; Langefeld, Carl D; Freedman, Barry I; Divers, Jasmin; Bowden, Donald W
2008-08-01
Several whole-genome association studies have reported identification of type 2 diabetes susceptibility genes in various European-derived study populations. Little investigation of these loci has been reported in other ethnic groups, specifically African Americans. Striking differences exist between these populations, suggesting they may not share identical genetic risk factors. Our objective was to examine the influence of type 2 diabetes genes identified in whole-genome association studies in a large African American case-control population. Single nucleotide polymorphisms (SNPs) in 12 loci (e.g., TCF7L2, IDE/KIF11/HHEX, SLC30A8, CDKAL1, PKN2, IGF2BP2, FLJ39370, and EXT2/ALX4) associated with type 2 diabetes in European-derived populations were genotyped in 993 African American type 2 diabetic and 1,054 African American control subjects. Additionally, 68 ancestry-informative markers were genotyped to account for the impact of admixture on association results. Little evidence of association was observed between SNPs, with the exception of those in TCF7L2, and type 2 diabetes in African Americans. One TCF7L2 SNP (rs7903146) showed compelling evidence of association with type 2 diabetes (admixture-adjusted additive P [P(a)] = 1.59 × 10^-6). Only the intragenic SNP on 11p12 (rs9300039, dominant P [P(d)] = 0.029) was also associated with type 2 diabetes after admixture adjustments. Interestingly, four of the SNPs are monomorphic in the Yoruba population of the HapMap project, with only the risk allele from the populations of European descent present. Results suggest that these variants do not significantly contribute to interindividual susceptibility to type 2 diabetes in African Americans. Consequently, genes contributing to type 2 diabetes in African Americans may, in part, be different from those in European-derived study populations. 
High frequency of risk alleles in several of these genes may, however, contribute to the increased prevalence of type 2 diabetes in African Americans.
Improving mapping and SNP-calling performance in multiplexed targeted next-generation sequencing
2012-01-01
Background Compared to classical genotyping, targeted next-generation sequencing (tNGS) can be custom-designed to interrogate entire genomic regions of interest, in order to detect novel as well as known variants. To bring down the per-sample cost, one approach is to pool barcoded NGS libraries before sample enrichment. Still, we lack a complete understanding of how this multiplexed tNGS approach and the varying performance of the ever-evolving analytical tools can affect the quality of variant discovery. Therefore, we evaluated the impact of different software tools and analytical approaches on the discovery of single nucleotide polymorphisms (SNPs) in multiplexed tNGS data. To generate our own test model, we combined a sequence capture method with NGS in three experimental stages of increasing complexity (E. coli genes, multiplexed E. coli, and multiplexed HapMap BRCA1/2 regions). Results We successfully enriched barcoded NGS libraries instead of genomic DNA, achieving reproducible coverage profiles (Pearson correlation coefficients of up to 0.99) across multiplexed samples, with <10% strand bias. However, the SNP calling quality was substantially affected by the choice of tools and mapping strategy. With the aim of reducing computational requirements, we compared conventional whole-genome mapping and SNP-calling with a new faster approach: target-region mapping with subsequent ‘read-backmapping’ to the whole genome to reduce the false detection rate. Consequently, we developed a combined mapping pipeline, which includes standard tools (BWA, SAMtools, etc.), and tested it on public HiSeq2000 exome data from the 1000 Genomes Project. Our pipeline saved 12 hours of run time per Hiseq2000 exome sample and detected ~5% more SNPs than the conventional whole genome approach. This suggests that more potential novel SNPs may be discovered using both approaches than with just the conventional approach. 
Conclusions We recommend applying our general ‘two-step’ mapping approach for more efficient SNP discovery in tNGS. Our study has also shown the benefit of computing inter-sample SNP-concordances and inspecting read alignments in order to attain more confident results. PMID:22913592
Chen, Jun; Källman, Thomas; Ma, Xiao-Fei; Zaina, Giusi; Morgante, Michele; Lascoux, Martin
2016-01-01
The joint inference of selection and past demography remains a costly and demanding task. We used next generation sequencing of two pools of 48 Norway spruce mother trees, one corresponding to the Fennoscandian domain, and the other to the Alpine domain, to assess nucleotide polymorphism at 88 nuclear genes. These genes are candidate genes for phenological traits, and most belong to the photoperiod pathway. Estimates of population genetic summary statistics from the pooled data are similar to previous estimates, suggesting that pooled sequencing is reliable. The nonsynonymous SNPs tended to have both lower frequency differences and lower FST values between the two domains than silent ones. These results suggest the presence of purifying selection. The divergence between the two domains based on synonymous changes was around 5 million yr, a time similar to a recent phylogenetic estimate of 6 million yr, but much larger than earlier estimates based on isozymes. Two approaches, one of them novel and considering both FST and the difference in allele frequencies between the two domains, were used to identify SNPs potentially under diversifying selection. SNPs from around 20 genes were detected, including genes previously identified as main targets for selection, such as PaPRR3 and PaGI. PMID:27172202
Chen, Jun; Källman, Thomas; Ma, Xiao-Fei; Zaina, Giusi; Morgante, Michele; Lascoux, Martin
2016-07-07
The joint inference of selection and past demography remains a costly and demanding task. We used next generation sequencing of two pools of 48 Norway spruce mother trees, one corresponding to the Fennoscandian domain, and the other to the Alpine domain, to assess nucleotide polymorphism at 88 nuclear genes. These genes are candidate genes for phenological traits, and most belong to the photoperiod pathway. Estimates of population genetic summary statistics from the pooled data are similar to previous estimates, suggesting that pooled sequencing is reliable. The nonsynonymous SNPs tended to have both lower frequency differences and lower FST values between the two domains than silent ones. These results suggest the presence of purifying selection. The divergence between the two domains based on synonymous changes was around 5 million yr, a time similar to a recent phylogenetic estimate of 6 million yr, but much larger than earlier estimates based on isozymes. Two approaches, one of them novel and considering both FST and the difference in allele frequencies between the two domains, were used to identify SNPs potentially under diversifying selection. SNPs from around 20 genes were detected, including genes previously identified as main targets for selection, such as PaPRR3 and PaGI. Copyright © 2016 Chen et al.
Single nucleotide polymorphism discovery in bovine liver using RNA-seq technology.
Pareek, Chandra Shekhar; Błaszczyk, Paweł; Dziuba, Piotr; Czarnik, Urszula; Fraser, Leyland; Sobiech, Przemysław; Pierzchała, Mariusz; Feng, Yaping; Kadarmideen, Haja N; Kumar, Dibyendu
2017-01-01
RNA-seq is a useful next-generation sequencing (NGS) technology that has been widely used to understand mammalian transcriptome architecture and function. In this study, a breed-specific RNA-seq experiment was utilized to detect putative single nucleotide polymorphisms (SNPs) in liver tissue of young bulls of the Polish Red, Polish Holstein-Friesian (HF) and Hereford breeds, and to understand the genomic variation in the three cattle breeds that may reflect differences in production traits. The RNA-seq experiment on bovine liver produced 1,071,144,072 raw paired-end reads, with an average of approximately 60 million paired-end reads per library. Breed-wise, a total of 345.06, 290.04 and 436.03 million paired-end reads were obtained from the Polish Red, Polish HF, and Hereford breeds, respectively. Burrows-Wheeler Aligner (BWA) read alignments showed that 81.35%, 82.81% and 84.21% of the mapped sequencing reads were properly paired to the Polish Red, Polish HF, and Hereford breeds, respectively. This study identified 5,641,401 SNPs and insertion and deletion (indel) positions expressed in the bovine liver with an average of 313,411 SNPs and indels per young bull. Following the removal of the indel mutations, a total of 1,953,804, 1,527,120 and 2,053,184 raw SNPs expressed in bovine liver were identified for the Polish Red, Polish HF, and Hereford breeds, respectively. Breed-wise, three highly reliable breed-specific SNP-databases (SNP-dbs) with 31,562, 24,945 and 28,194 SNP records were constructed for the Polish Red, Polish HF, and Hereford breeds, respectively. Using a combination of stringent parameters of a minimum depth of ≥10 mapping reads that support the polymorphic nucleotide base and 100% SNP ratio, 4,368, 3,780 and 3,800 SNP records were detected in the Polish Red, Polish HF, and Hereford breeds, respectively. The SNP detections using RNA-seq data were successfully validated by Kompetitive allele-specific PCR (KASP™) SNP genotyping assay. 
The comprehensive QTL/CG analysis of 110 QTL/CG with RNA-seq data identified 20 monomorphic SNP hit loci (CARTPT, GAD1, GDF5, GHRH, GHRL, GRB10, IGFBPL1, IGFL1, LEP, LHX4, MC4R, MSTN, NKAIN1, PLAG1, POU1F1, SDR16C5, SH2B2, TOX, UCP3 and WNT10B) in all three cattle breeds. However, six SNP loci (CCSER1, GHR, KCNIP4, MTSS1, EGFR and NSMCE2) were identified as highly polymorphic among the cattle breeds. This study identified breed-specific SNPs with greater SNP ratio and excellent mapping coverage, as well as monomorphic and highly polymorphic putative SNP loci within QTL/CGs of bovine liver tissue. A breed-specific SNP-db constructed for bovine liver yielded nearly six million SNPs. In addition, a KASP™ SNP genotyping assay, as a reliable cost-effective method, successfully validated the breed-specific putative SNPs originating from the RNA-seq experiments.
Roorkiwal, Manish; Jain, Ankit; Kale, Sandip M; Doddamani, Dadakhalandar; Chitikineni, Annapurna; Thudi, Mahendar; Varshney, Rajeev K
2018-04-01
To accelerate genomics research and molecular breeding applications in chickpea, a high-throughput SNP genotyping platform 'Axiom® CicerSNP Array' has been designed, developed and validated. Screening of whole-genome resequencing data from 429 chickpea lines identified 4.9 million SNPs, from which a subset of 70 463 high-quality nonredundant SNPs was selected using different stringent filter criteria. This was further narrowed down to 61 174 SNPs based on p-convert score ≥0.3, of which 50 590 SNPs could be tiled on array. Among these tiled SNPs, a total of 11 245 SNPs (22.23%) were from the coding regions of 3673 different genes. The developed Axiom® CicerSNP Array was used for genotyping two recombinant inbred line populations, namely ICCRIL03 (ICC 4958 × ICC 1882) and ICCRIL04 (ICC 283 × ICC 8261). Genotyping data reflected high success and polymorphic rate, with 15 140 (29.93%; ICCRIL03) and 20 018 (39.57%; ICCRIL04) polymorphic SNPs. High-density genetic maps comprising 13 679 SNPs spanning 1033.67 cM and 7769 SNPs spanning 1076.35 cM were developed for ICCRIL03 and ICCRIL04 populations, respectively. QTL analysis using multilocation, multiseason phenotyping data on these RILs identified 70 (ICCRIL03) and 120 (ICCRIL04) main-effect QTLs on genetic map. Higher precision and potential of this array is expected to advance chickpea genetics and breeding applications. © 2017 The Authors. Plant Biotechnology Journal published by Society for Experimental Biology and The Association of Applied Biologists and John Wiley & Sons Ltd.
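The percentages quoted in the abstract above all appear to use the 50 590 tiled SNPs as the denominator. A short arithmetic check, using only counts stated in the abstract (the variable names and labels are illustrative), reproduces them:

```python
# Counts taken from the abstract; 50 590 SNPs were tiled on the array.
tiled_snps = 50590
counts = {
    "coding-region SNPs": 11245,
    "polymorphic in ICCRIL03": 15140,
    "polymorphic in ICCRIL04": 20018,
}

# Express each count as a percentage of the tiled SNPs, to two decimals.
percentages = {
    label: round(100.0 * n / tiled_snps, 2) for label, n in counts.items()
}

for label, pct in percentages.items():
    print(f"{label}: {pct:.2f}%")
```

The results match the reported 22.23%, 29.93% and 39.57%.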
Nakajima, Ayaka; Kawaguchi, Fuki; Uemoto, Yoshinobu; Fukushima, Moriyuki; Yoshida, Emi; Iwamoto, Eiji; Akiyama, Takayuki; Kohama, Namiko; Kobayashi, Eiji; Honda, Takeshi; Oyama, Kenji; Mannen, Hideyuki; Sasazaki, Shinji
2018-05-01
The objective of this study was to identify genomic regions associated with fat-related traits using a Japanese Black cattle population in Hyogo. From 1836 animals, those with high or low values were selected on the basis of corrected phenotype and then pooled into high and low groups (n = 100 each), respectively. DNA pool-based genome-wide association study (GWAS) was performed using Illumina BovineSNP50 BeadChip v2 with three replicate assays for each pooled sample. GWAS detected that two single nucleotide polymorphisms (SNPs) on BTA7 (ARS-BFGL-NGS-35463 and Hapmap23838-BTA-163815) and one SNP on BTA12 (ARS-BFGL-NGS-2915) significantly affected fat percentage (FAR). The significance of ARS-BFGL-NGS-35463 on BTA7 was confirmed by individual genotyping in all pooled samples. Moreover, association analysis between SNP and FAR in 803 Japanese Black cattle revealed a significant effect of SNP on FAR. Thus, further investigation of these regions is required to identify FAR-associated genes and mutations, which can lead to the development of DNA markers for marker-assisted selection for the genetic improvement of beef quality. © 2018 Japanese Society of Animal Science.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Saccone, Scott F; Chesler, Elissa J; Bierut, Laura J
Commercial SNP microarrays now provide comprehensive and affordable coverage of the human genome. However, some diseases have biologically relevant genomic regions that may require additional coverage. Addiction, for example, is thought to be influenced by complex interactions among many relevant genes and pathways. We have assembled a list of 486 biologically relevant genes nominated by a panel of experts on addiction. We then added 424 genes that showed evidence of association with addiction phenotypes through mouse QTL mappings and gene co-expression analysis. We demonstrate that there are a substantial number of SNPs in these genes that are not well represented by commercial SNP platforms. We address this problem by introducing a publicly available SNP database for addiction. The database is annotated using numeric prioritization scores indicating the extent of biological relevance. The scores incorporate a number of factors such as SNP/gene functional properties (including synonymy and promoter regions), data from mouse systems genetics and measures of human/mouse evolutionary conservation. We then used HapMap genotyping data to determine if a SNP is tagged by a commercial microarray through linkage disequilibrium. This combination of biological prioritization scores and LD tagging annotation will enable addiction researchers to supplement commercial SNP microarrays to ensure comprehensive coverage of biologically relevant regions.
Construction of the third-generation Zea mays haplotype map
Bukowski, Robert; Guo, Xiaosen; Lu, Yanli; Zou, Cheng; He, Bing; Rong, Zhengqin; Wang, Bo; Xu, Dawen; Yang, Bicheng; Xie, Chuanxiao; Fan, Longjiang; Gao, Shibin; Xu, Xun; Zhang, Gengyun; Li, Yingrui; Jiao, Yinping; Doebley, John F; Ross-Ibarra, Jeffrey; Lorant, Anne; Buffalo, Vince; Romay, M Cinta; Buckler, Edward S; Ware, Doreen; Lai, Jinsheng; Sun, Qi
2017-01-01
Abstract Background Characterization of genetic variations in maize has been challenging, mainly due to deterioration of collinearity between individual genomes in the species. An international consortium of maize research groups combined resources to develop the maize haplotype version 3 (HapMap 3), built from whole-genome sequencing data from 1218 maize lines, covering predomestication and domesticated Zea mays varieties across the world. Results A new computational pipeline was set up to process more than 12 trillion bp of sequencing data, and a set of population genetics filters was applied to identify more than 83 million variant sites. Conclusions We identified polymorphisms in regions where collinearity is largely preserved in the maize species. However, the fact that the B73 genome used as the reference only represents a fraction of all haplotypes is still an important limiting factor. PMID:29300887
RFMix: A Discriminative Modeling Approach for Rapid and Robust Local-Ancestry Inference
Maples, Brian K.; Gravel, Simon; Kenny, Eimear E.; Bustamante, Carlos D.
2013-01-01
Local-ancestry inference is an important step in the genetic analysis of fully sequenced human genomes. Current methods can only detect continental-level ancestry (i.e., European versus African versus Asian) accurately even when using millions of markers. Here, we present RFMix, a powerful discriminative modeling approach that is faster (∼30×) and more accurate than existing methods. We accomplish this by using a conditional random field parameterized by random forests trained on reference panels. RFMix is capable of learning from the admixed samples themselves to boost performance and autocorrect phasing errors. RFMix shows high sensitivity and specificity in simulated Hispanics/Latinos and African Americans and admixed Europeans, Africans, and Asians. Finally, we demonstrate that African Americans in HapMap contain modest (but nonzero) levels of Native American ancestry (∼0.4%). PMID:23910464
Nikpay, Majid; Goel, Anuj; Won, Hong-Hee; Hall, Leanne M; Willenborg, Christina; Kanoni, Stavroula; Saleheen, Danish; Kyriakou, Theodosios; Nelson, Christopher P; Hopewell, Jemma C; Webb, Thomas R; Zeng, Lingyao; Dehghan, Abbas; Alver, Maris; Armasu, Sebastian M; Auro, Kirsi; Bjonnes, Andrew; Chasman, Daniel I; Chen, Shufeng; Ford, Ian; Franceschini, Nora; Gieger, Christian; Grace, Christopher; Gustafsson, Stefan; Huang, Jie; Hwang, Shih-Jen; Kim, Yun Kyoung; Kleber, Marcus E; Lau, King Wai; Lu, Xiangfeng; Lu, Yingchang; Lyytikäinen, Leo-Pekka; Mihailov, Evelin; Morrison, Alanna C; Pervjakova, Natalia; Qu, Liming; Rose, Lynda M; Salfati, Elias; Saxena, Richa; Scholz, Markus; Smith, Albert V; Tikkanen, Emmi; Uitterlinden, Andre; Yang, Xueli; Zhang, Weihua; Zhao, Wei; de Andrade, Mariza; de Vries, Paul S; van Zuydam, Natalie R; Anand, Sonia S; Bertram, Lars; Beutner, Frank; Dedoussis, George; Frossard, Philippe; Gauguier, Dominique; Goodall, Alison H; Gottesman, Omri; Haber, Marc; Han, Bok-Ghee; Huang, Jianfeng; Jalilzadeh, Shapour; Kessler, Thorsten; König, Inke R; Lannfelt, Lars; Lieb, Wolfgang; Lind, Lars; Lindgren, Cecilia M; Lokki, Marja-Liisa; Magnusson, Patrik K; Mallick, Nadeem H; Mehra, Narinder; Meitinger, Thomas; Memon, Fazal-Ur-Rehman; Morris, Andrew P; Nieminen, Markku S; Pedersen, Nancy L; Peters, Annette; Rallidis, Loukianos S; Rasheed, Asif; Samuel, Maria; Shah, Svati H; Sinisalo, Juha; Stirrups, Kathleen E; Trompet, Stella; Wang, Laiyuan; Zaman, Khan S; Ardissino, Diego; Boerwinkle, Eric; Borecki, Ingrid B; Bottinger, Erwin P; Buring, Julie E; Chambers, John C; Collins, Rory; Cupples, L Adrienne; Danesh, John; Demuth, Ilja; Elosua, Roberto; Epstein, Stephen E; Esko, Tõnu; Feitosa, Mary F; Franco, Oscar H; Franzosi, Maria Grazia; Granger, Christopher B; Gu, Dongfeng; Gudnason, Vilmundur; Hall, Alistair S; Hamsten, Anders; Harris, Tamara B; Hazen, Stanley L; Hengstenberg, Christian; Hofman, Albert; Ingelsson, Erik; Iribarren, Carlos; Jukema, J Wouter; 
Karhunen, Pekka J; Kim, Bong-Jo; Kooner, Jaspal S; Kullo, Iftikhar J; Lehtimäki, Terho; Loos, Ruth J F; Melander, Olle; Metspalu, Andres; März, Winfried; Palmer, Colin N; Perola, Markus; Quertermous, Thomas; Rader, Daniel J; Ridker, Paul M; Ripatti, Samuli; Roberts, Robert; Salomaa, Veikko; Sanghera, Dharambir K; Schwartz, Stephen M; Seedorf, Udo; Stewart, Alexandre F; Stott, David J; Thiery, Joachim; Zalloua, Pierre A; O'Donnell, Christopher J; Reilly, Muredach P; Assimes, Themistocles L; Thompson, John R; Erdmann, Jeanette; Clarke, Robert; Watkins, Hugh; Kathiresan, Sekar; McPherson, Ruth; Deloukas, Panos; Schunkert, Heribert; Samani, Nilesh J; Farrall, Martin
2015-10-01
Existing knowledge of genetic variants affecting risk of coronary artery disease (CAD) is largely based on genome-wide association study (GWAS) analysis of common SNPs. Leveraging phased haplotypes from the 1000 Genomes Project, we report a GWAS meta-analysis of ∼185,000 CAD cases and controls, interrogating 6.7 million common (minor allele frequency (MAF) > 0.05) and 2.7 million low-frequency (0.005 < MAF < 0.05) variants. In addition to confirming most known CAD-associated loci, we identified ten new loci (eight additive and two recessive) that contain candidate causal genes newly implicating biological processes in vessel walls. We observed intralocus allelic heterogeneity but little evidence of low-frequency variants with larger effects and no evidence of synthetic association. Our analysis provides a comprehensive survey of the fine genetic architecture of CAD, showing that genetic susceptibility to this common disease is largely determined by common SNPs of small effect size.
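The two frequency classes in the abstract above are defined by minor allele frequency (MAF) thresholds. A minimal sketch of that binning follows; the thresholds are the ones quoted in the abstract, while the function name and the "unclassified" label for everything outside both ranges are assumptions made for illustration.

```python
def classify_maf(maf: float) -> str:
    """Bin a variant by minor allele frequency using the abstract's cut-offs.

    common:        MAF > 0.05
    low-frequency: 0.005 < MAF < 0.05
    Anything else (including the boundary values) is labeled "unclassified",
    which is an illustrative choice, not a category from the study.
    """
    if not 0.0 <= maf <= 0.5:
        raise ValueError("a minor allele frequency must lie in [0, 0.5]")
    if maf > 0.05:
        return "common"
    if 0.005 < maf < 0.05:
        return "low-frequency"
    return "unclassified"


# Hypothetical frequencies, for illustration only.
labels = [classify_maf(m) for m in (0.20, 0.01, 0.001)]
```

Both inequalities are strict as stated in the abstract, so a variant with MAF exactly 0.05 falls in neither class here.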
Jia, Guanqing; Huang, Xuehui; Zhi, Hui; Zhao, Yan; Zhao, Qiang; Li, Wenjun; Chai, Yang; Yang, Lifang; Liu, Kunyan; Lu, Hengyun; Zhu, Chuanrang; Lu, Yiqi; Zhou, Congcong; Fan, Danlin; Weng, Qijun; Guo, Yunli; Huang, Tao; Zhang, Lei; Lu, Tingting; Feng, Qi; Hao, Hangfei; Liu, Hongkuan; Lu, Ping; Zhang, Ning; Li, Yuhui; Guo, Erhu; Wang, Shujun; Wang, Suying; Liu, Jinrong; Zhang, Wenfei; Chen, Guoqiu; Zhang, Baojin; Li, Wei; Wang, Yongfang; Li, Haiquan; Zhao, Baohua; Li, Jiayang; Diao, Xianmin; Han, Bin
2013-08-01
Foxtail millet (Setaria italica) is an important grain crop that is grown in arid regions. Here we sequenced 916 diverse foxtail millet varieties, identified 2.58 million SNPs and used 0.8 million common SNPs to construct a haplotype map of the foxtail millet genome. We classified the foxtail millet varieties into two divergent groups that are strongly correlated with early and late flowering times. We phenotyped the 916 varieties under five different environments and identified 512 loci associated with 47 agronomic traits by genome-wide association studies. We performed a de novo assembly of deeply sequenced genomes of a Setaria viridis accession (the wild progenitor of S. italica) and an S. italica variety and identified complex interspecies and intraspecies variants. We also identified 36 selective sweeps that seem to have occurred during modern breeding. This study provides fundamental resources for genetics research and genetic improvement in foxtail millet.
McGue, Matt; Iacono, William G.
2017-01-01
In a recent comprehensive investigation, we largely failed to identify significant genetic markers associated with P3 amplitude or to corroborate previous associations between P3 and specific single nucleotide polymorphisms (SNPs) or genes. In the present study we extended this line of investigation to examine time-frequency (TF) activity and intertrial phase coherence (ITPC) in the P3 time window, both of which are associated with P3 amplitude. Previous genome-wide research has reported associations between P3-related theta and delta activity and individual genetic variants. A large, population-based sample of 4211 subjects, comprising male and female adolescent twins and their parents, was genotyped for 527,828 single nucleotide polymorphisms (SNPs), from which over six million SNPs were accurately imputed. Heritability estimates were greater for TF energy than ITPC, whether based on biometric models or the combined influence of all measured SNPs (derived from genome-wide complex trait analysis). The magnitude of overlap in the specific SNPs associated with delta energy and ITPC and P3 amplitude was significant. A genome-wide analysis of all SNPs, accompanied by an analysis of approximately 17,600 genes, indicated a region of chromosome 2 around TEKT4 that was significantly associated with theta ITPC. Analysis of candidate SNPs and genes previously reported to be associated with P3 or related phenotypes yielded one association surviving correction for multiple tests: between theta energy and CRHR1. However, we did not obtain significant associations for SNPs implicated in previous genome-wide studies of TF measures. Identifying specific genetic variants associated with P3 amplitude remains a challenge. PMID:27871913
A Genome-Wide Association Study Identifies A New Ovarian Cancer Susceptibility Locus On 9p22.2
Song, Honglin; Ramus, Susan J.; Tyrer, Jonathan; Bolton, Kelly L.; Gentry-Maharaj, Aleksandra; Wozniak, Eva; Anton-Culver, Hoda; Chang-Claude, Jenny; Cramer, Daniel W.; DiCioccio, Richard; Dörk, Thilo; Goode, Ellen L.; Goodman, Marc T; Schildkraut, Joellen M; Sellers, Thomas; Baglietto, Laura; Beckmann, Matthias W.; Beesley, Jonathan; Blaakaer, Jan; Carney, Michael E; Chanock, Stephen; Chen, Zhihua; Cunningham, Julie M.; Dicks, Ed; Doherty, Jennifer A.; Dürst, Matthias; Ekici, Arif B.; Fenstermacher, David; Fridley, Brooke L.; Giles, Graham; Gore, Martin E.; De Vivo, Immaculata; Hillemanns, Peter; Hogdall, Claus; Hogdall, Estrid; Iversen, Edwin S; Jacobs, Ian J; Jakubowska, Anna; Li, Dong; Lissowska, Jolanta; Lubiński, Jan; Lurie, Galina; McGuire, Valerie; McLaughlin, John; Mędrek, Krzysztof; Moorman, Patricia G.; Moysich, Kirsten; Narod, Steven; Phelan, Catherine; Pye, Carole; Risch, Harvey; Runnebaum, Ingo B; Severi, Gianluca; Southey, Melissa; Stram, Daniel O.; Thiel, Falk C.; Terry, Kathryn L.; Tsai, Ya-Yu; Tworoger, Shelley S.; Van Den Berg, David J.; Vierkant, Robert A.; Wang-Gohrke, Shan; Webb, Penelope M.; Wilkens, Lynne R.; Wu, Anna H; Yang, Hannah; Brewster, Wendy; Ziogas, Argyrios; Houlston, Richard; Tomlinson, Ian; Whittemore, Alice S; Rossing, Mary Anne; Ponder, Bruce A.J.; Pearce, Celeste Leigh; Ness, Roberta B.; Menon, Usha; Kjaer, Susanne Krüger; Gronwald, Jacek; Garcia-Closas, Montserrat; Fasching, Peter A.; Easton, Douglas F; Chenevix-Trench, Georgia; Berchuck, Andrew; Pharoah, Paul D.P.; Gayther, Simon A.
2009-01-01
Epithelial ovarian cancer has a major heritable component, but the known susceptibility genes explain less than half the excess familial risk. We performed a genome-wide association study (GWAS) to identify common ovarian cancer susceptibility alleles. We evaluated 507,094 SNPs genotyped in 1,817 cases and 2,353 controls from the UK and ~2 million imputed SNPs. We genotyped the 22,790 top ranked SNPs in 4,274 cases and 4,809 controls of European ancestry from Europe, USA and Australia. We identified 12 SNPs at 9p22 associated with disease risk (P < 10^-8). The most significant SNP (rs3814113; P = 2.5 × 10^-17) was genotyped in a further 2,670 ovarian cancer cases and 4,668 controls, confirming its association (combined data odds ratio = 0.82, 95% CI 0.79–0.86, P-trend = 5.1 × 10^-19). The association differs by histological subtype, being strongest for serous ovarian cancers (OR 0.77, 95% CI 0.73–0.81, P-trend = 4.1 × 10^-21). PMID:19648919
The impact of low-frequency and rare variants on lipid levels
Surakka, Ida; Horikoshi, Momoko; Mägi, Reedik; Sarin, Antti-Pekka; Mahajan, Anubha; Lagou, Vasiliki; Marullo, Letizia; Ferreira, Teresa; Miraglio, Benjamin; Timonen, Sanna; Kettunen, Johannes; Pirinen, Matti; Karjalainen, Juha; Thorleifsson, Gudmar; Hägg, Sara; Hottenga, Jouke-Jan; Isaacs, Aaron; Ladenvall, Claes; Beekman, Marian; Esko, Tõnu; Ried, Janina S; Nelson, Christopher P; Willenborg, Christina; Gustafsson, Stefan; Westra, Harm-Jan; Blades, Matthew; de Craen, Anton JM; de Geus, Eco J; Deelen, Joris; Grallert, Harald; Hamsten, Anders; Havulinna, Aki S.; Hengstenberg, Christian; Houwing-Duistermaat, Jeanine J; Hyppönen, Elina; Karssen, Lennart C; Lehtimäki, Terho; Lyssenko, Valeriya; Magnusson, Patrik KE; Mihailov, Evelin; Müller-Nurasyid, Martina; Mpindi, John-Patrick; Pedersen, Nancy L; Penninx, Brenda WJH; Perola, Markus; Pers, Tune H; Peters, Annette; Rung, Johan; Smit, Johannes H; Steinthorsdottir, Valgerdur; Tobin, Martin D; Tsernikova, Natalia; van Leeuwen, Elisabeth M; Viikari, Jorma S; Willems, Sara M; Willemsen, Gonneke; Schunkert, Heribert; Erdmann, Jeanette; Samani, Nilesh J; Kaprio, Jaakko; Lind, Lars; Gieger, Christian; Metspalu, Andres; Slagboom, P Eline; Groop, Leif; van Duijn, Cornelia M; Eriksson, Johan G; Jula, Antti; Salomaa, Veikko; Boomsma, Dorret I; Power, Christine; Raitakari, Olli T; Ingelsson, Erik; Järvelin, Marjo-Riitta; Stefansson, Kari; Franke, Lude; Ikonen, Elina; Kallioniemi, Olli; Pietiäinen, Vilja; Lindgren, Cecilia M; Thorsteinsdottir, Unnur; Palotie, Aarno; McCarthy, Mark I; Morris, Andrew P; Prokopenko, Inga; Ripatti, Samuli
2016-01-01
Using a genome-wide screen of 9.6 million genetic variants achieved through 1000 Genomes imputation in 62,166 samples, we identify association to lipids in 93 loci, including 79 previously identified loci with new lead-SNPs, 10 new loci, 15 loci with a low-frequency lead-SNP, 10 loci with missense lead-SNPs, and 2 loci with an accumulation of rare variants. In six loci, SNPs with established function in lipid genetics (CELSR2, GCKR, LIPC, and APOE), or candidate missense mutations with predicted damaging function (CD300LG and TM6SF2), explained the locus associations. The low-frequency variants increased the proportion of variance explained, particularly for LDL-C and TC. Altogether, our results highlight the impact of low-frequency variants in complex traits and show that imputation offers a cost-effective alternative to re-sequencing. PMID:25961943
Single nucleotide polymorphism discovery in bovine liver using RNA-seq technology
Pareek, Chandra Shekhar; Błaszczyk, Paweł; Dziuba, Piotr; Czarnik, Urszula; Fraser, Leyland; Sobiech, Przemysław; Pierzchała, Mariusz; Feng, Yaping; Kadarmideen, Haja N.; Kumar, Dibyendu
2017-01-01
Background RNA-seq is a useful next-generation sequencing (NGS) technology that has been widely used to understand mammalian transcriptome architecture and function. In this study, a breed-specific RNA-seq experiment was utilized to detect putative single nucleotide polymorphisms (SNPs) in liver tissue of young bulls of the Polish Red, Polish Holstein-Friesian (HF) and Hereford breeds, and to understand the genomic variation in the three cattle breeds that may reflect differences in production traits. Results The RNA-seq experiment on bovine liver produced 107,114,4072 raw paired-end reads, with an average of approximately 60 million paired-end reads per library. Breed-wise, a total of 345.06, 290.04 and 436.03 million paired-end reads were obtained from the Polish Red, Polish HF, and Hereford breeds, respectively. Burrows-Wheeler Aligner (BWA) read alignments showed that 81.35%, 82.81% and 84.21% of the mapped sequencing reads were properly paired to the Polish Red, Polish HF, and Hereford breeds, respectively. This study identified 5,641,401 SNPs and insertion and deletion (indel) positions expressed in the bovine liver with an average of 313,411 SNPs and indel per young bull. Following the removal of the indel mutations, a total of 195,3804, 152,7120 and 205,3184 raw SNPs expressed in bovine liver were identified for the Polish Red, Polish HF, and Hereford breeds, respectively. Breed-wise, three highly reliable breed-specific SNP-databases (SNP-dbs) with 31,562, 24,945 and 28,194 SNP records were constructed for the Polish Red, Polish HF, and Hereford breeds, respectively. Using a combination of stringent parameters of a minimum depth of ≥10 mapping reads that support the polymorphic nucleotide base and 100% SNP ratio, 4,368, 3,780 and 3,800 SNP records were detected in the Polish Red, Polish HF, and Hereford breeds, respectively. The SNP detections using RNA-seq data were successfully validated by kompetitive allele-specific PCR (KASPTM) SNP genotyping assay. 
The comprehensive QTL/CG analysis of 110 QTL/CG with RNA-seq data identified 20 monomorphic SNP hit loci (CARTPT, GAD1, GDF5, GHRH, GHRL, GRB10, IGFBPL1, IGFL1, LEP, LHX4, MC4R, MSTN, NKAIN1, PLAG1, POU1F1, SDR16C5, SH2B2, TOX, UCP3 and WNT10B) in all three cattle breeds. However, six SNP loci (CCSER1, GHR, KCNIP4, MTSS1, EGFR and NSMCE2) were identified as highly polymorphic among the cattle breeds. Conclusions This study identified breed-specific SNPs with greater SNP ratio and excellent mapping coverage, as well as monomorphic and highly polymorphic putative SNP loci within QTL/CGs of bovine liver tissue. A breed-specific SNP-db constructed for bovine liver yielded nearly six million SNPs. In addition, a KASP™ SNP genotyping assay, as a reliable cost-effective method, successfully validated the breed-specific putative SNPs originating from the RNA-seq experiments. PMID:28234981
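The stringent filter described in this abstract (at least 10 reads supporting the polymorphic base and a 100% SNP ratio) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the `SnpRecord` fields, the function names, and the reading of "SNP ratio" as the fraction of covering reads that carry the variant base are all assumptions.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SnpRecord:
    """Hypothetical per-position summary; field names are illustrative."""
    chrom: str
    pos: int
    alt_reads: int    # reads supporting the polymorphic base
    total_reads: int  # all reads covering the position


def snp_ratio(rec: SnpRecord) -> float:
    """Fraction of covering reads carrying the variant base (assumed definition)."""
    if rec.total_reads == 0:
        return 0.0
    return rec.alt_reads / rec.total_reads


def filter_snps(records: List[SnpRecord],
                min_alt_depth: int = 10,
                min_ratio: float = 1.0) -> List[SnpRecord]:
    """Keep records that meet both the depth and the ratio threshold."""
    return [r for r in records
            if r.alt_reads >= min_alt_depth and snp_ratio(r) >= min_ratio]


# Made-up example records, not data from the study.
records = [
    SnpRecord("chr1", 1000, alt_reads=12, total_reads=12),  # passes both thresholds
    SnpRecord("chr1", 2000, alt_reads=9, total_reads=9),    # too few supporting reads
    SnpRecord("chr2", 3000, alt_reads=15, total_reads=20),  # ratio 0.75, below 100%
]
kept = filter_snps(records)
```

Relaxing `min_ratio` shows how strongly the 100% requirement trims a raw SNP set, which is consistent with the drop from tens of thousands to a few thousand records per breed reported above.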
Li, Liming; Wang, Yi; Yang, Shuping; Xia, Mingying; Yang, Yajun; Wang, Jiucun; Lu, Daru; Pan, Xingwei; Ma, Teng; Jiang, Pei; Yu, Ge; Zhao, Ziqin; Ping, Yuan; Zhou, Huaigu; Zhao, Xueying; Sun, Hui; Liu, Bing; Jia, Dongtao; Li, Chengtao; Hu, Rile; Lu, Hongzhou; Liu, Xiaoyang; Chen, Wenqing; Mi, Qin; Xue, Fuzhong; Su, Yongdong; Jin, Li; Li, Shilin
2017-05-01
The applications of DNA profiling aim to identify perpetrators, missing family members and disaster victims in forensic investigations. Forensic applications based on single nucleotide polymorphisms (SNPs) are emerging rapidly, with the potential to replace the widely used panels based on short tandem repeats (STRs), and a well-designed SNP panel is needed to support this transition. Here we present a panel of 175 SNP markers (referred to as Fudan ID Panel or FID), selected from ∼3.6 million SNPs, for the application of personal identification. We optimized and validated the FID panel in 729 Chinese individuals using a next generation sequencing (NGS) technology. We showed that the SNPs in the panel possess very high heterozygosity as well as low within- and among-continent differentiation, enabling the FID panel to exhibit discrimination power in both regional and worldwide populations, with the average match probabilities ranging from 4.77×10^-71 to 1.06×10^-64 across 54 world populations. With the advance of biomedical research, SNPs linked to physical anthropological, physiological, behavioral and phenotypic traits will eventually be added to forensic panels, which will revolutionize criminal investigation. Copyright © 2017 Elsevier B.V. All rights reserved.
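To see why a panel of 175 highly heterozygous SNPs can reach match probabilities this small, the sketch below computes a combined random match probability under two textbook assumptions that are not taken from the paper: Hardy-Weinberg proportions at each biallelic locus and independence between loci. The function names and the input frequencies are illustrative.

```python
import math
from typing import Iterable


def locus_match_probability(p: float) -> float:
    """Probability that two random individuals share a genotype at one
    biallelic locus with allele frequency p (Hardy-Weinberg assumed)."""
    q = 1.0 - p
    genotype_freqs = (p * p, 2.0 * p * q, q * q)
    return sum(g * g for g in genotype_freqs)


def log10_combined_match_probability(freqs: Iterable[float]) -> float:
    """log10 of the product of per-locus match probabilities
    (loci assumed independent); logs keep the value readable."""
    return sum(math.log10(locus_match_probability(p)) for p in freqs)


# Illustrative input only: 175 loci, each with allele frequency 0.5.
log10_mp = log10_combined_match_probability([0.5] * 175)
```

For this idealized input the result is on the order of 10^-75, a few orders of magnitude below the reported range, which is expected because real allele frequencies deviate from 0.5.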
A HapMap harvest of insights into the genetics of common disease
Manolio, Teri A.; Brooks, Lisa D.; Collins, Francis S.
2008-01-01
The International HapMap Project was designed to create a genome-wide database of patterns of human genetic variation, with the expectation that these patterns would be useful for genetic association studies of common diseases. This expectation has been amply fulfilled with just the initial output of genome-wide association studies, identifying nearly 100 loci for nearly 40 common diseases and traits. These associations provided new insights into pathophysiology, suggesting previously unsuspected etiologic pathways for common diseases that will be of use in identifying new therapeutic targets and developing targeted interventions based on genetically defined risk. In addition, HapMap-based discoveries have shed new light on the impact of evolutionary pressures on the human genome, suggesting multiple loci important for adapting to disease-causing pathogens and new environments. In this review we examine the origin, development, and current status of the HapMap; its prospects for continued evolution; and its current and potential future impact on biomedical science. PMID:18451988
Peace, Cameron; Bassil, Nahla; Main, Dorrie; Ficklin, Stephen; Rosyara, Umesh R.; Stegmeir, Travis; Sebolt, Audrey; Gilmore, Barbara; Lawley, Cindy; Mockler, Todd C.; Bryant, Douglas W.; Wilhelm, Larry; Iezzoni, Amy
2012-01-01
High-throughput genome scans are important tools for genetic studies and breeding applications. Here, a 6K SNP array for use with the Illumina Infinium® system was developed for diploid sweet cherry (Prunus avium) and allotetraploid sour cherry (P. cerasus). This effort was led by RosBREED, a community initiative to enable marker-assisted breeding for rosaceous crops. Next-generation sequencing in diverse breeding germplasm provided 25 billion basepairs (Gb) of cherry DNA sequence from which were identified genome-wide SNPs for sweet cherry and for the two sour cherry subgenomes derived from sweet cherry (avium subgenome) and P. fruticosa (fruticosa subgenome). Anchoring to the peach genome sequence, recently released by the International Peach Genome Initiative, predicted relative physical locations of the 1.9 million putative SNPs detected, preliminarily filtered to 368,943 SNPs. Further filtering was guided by results of a 144-SNP subset examined with the Illumina GoldenGate® assay on 160 accessions. A 6K Infinium® II array was designed with SNPs evenly spaced genetically across the sweet and sour cherry genomes. SNPs were developed for each sour cherry subgenome by using minor allele frequency in the sour cherry detection panel to enrich for subgenome-specific SNPs followed by targeting to either subgenome according to alleles observed in sweet cherry. The array was evaluated using panels of sweet (n = 269) and sour (n = 330) cherry breeding germplasm. Approximately one third of array SNPs were informative for each crop. A total of 1825 polymorphic SNPs were verified in sweet cherry, 13% of these originally developed for sour cherry. Allele dosage was resolved for 2058 polymorphic SNPs in sour cherry, one third of these being originally developed for sweet cherry. 
This publicly available genomics resource represents a significant advance in cherry genome-scanning capability that will accelerate marker-locus-trait association discovery, genome structure investigation, and genetic diversity assessment in this diploid-tetraploid crop group. PMID:23284615
Huang, Shunmou; Yang, Hongli; Zhan, Gaomiao; Wang, Xinfa; Liu, Guihua; Wang, Hanzhong
2012-01-01
Background Single nucleotide polymorphisms (SNPs) are an important class of genetic marker for target gene mapping. As yet, there is no rapid and effective method to identify SNPs linked with agronomic traits in rapeseed and other crop species. Methodology/Principal Findings We demonstrate a novel method for identifying SNP markers in rapeseed by deep sequencing a representative library and performing bulk segregant analysis. With this method, SNPs associated with rapeseed pod shatter-resistance were discovered. Firstly, a reduced representation of the rapeseed genome was used. Genomic fragments ranging from 450–550 bp were prepared from the susceptible bulk (ten F2 plants with the silique shattering resistance index, SSRI <0.10) and the resistant bulk (ten F2 plants with SSRI >0.90), and Solexa sequencing produced 90 bp reads. Approximately 50 million of these sequence reads were assembled into contigs to a depth of 20-fold coverage. Secondly, 60,396 ‘simple SNPs’ were identified, and their statistical significance was evaluated using Fisher's exact test. Seventy associated SNPs with a −log10 p value above 16 were selected for further analysis. The distribution of these SNPs showed a tight cluster of 14 associated SNPs within a 396 kb region on chromosome A09. Our evidence indicates that this region contains a major quantitative trait locus (QTL). Finally, two associated SNPs from this region were mapped on a major QTL region. Conclusions/Significance Seventy associated SNPs were discovered and a major QTL for rapeseed pod shatter-resistance was found on chromosome A09 using our novel method. The associated SNP markers were used for mapping of the QTL, and may be useful for improving pod shatter-resistance in rapeseed through marker-assisted selection and map-based cloning. This approach will accelerate the discovery of major QTLs and the cloning of functional genes for important agronomic traits in rapeseed and other crop species.
PMID:22529909
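The significance step in the rapeseed abstract (Fisher's exact test on each SNP, then a −log10 p cutoff of 16) can be sketched with the standard library alone. The table layout is an assumption made for illustration: rows are the two bulks and columns are read counts for the two alleles. The function names and the example counts are hypothetical.

```python
import math


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test p-value for the table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2

    def weight(x: int) -> int:
        # Unnormalised hypergeometric weight of the table with x in the top-left cell.
        return math.comb(row1, x) * math.comb(row2, col1 - x)

    observed = weight(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is no more likely than the observed one
    # (exact integer comparison, so no floating-point tolerance is needed).
    tail = sum(w for w in (weight(x) for x in range(lo, hi + 1)) if w <= observed)
    return tail / math.comb(total, col1)


def neg_log10_p(a: int, b: int, c: int, d: int) -> float:
    """Association score on the -log10 scale used in the abstract."""
    return -math.log10(fisher_exact_two_sided(a, b, c, d))


# Hypothetical read counts: resistant bulk (40 ref, 0 alt), susceptible bulk (0 ref, 40 alt).
score = neg_log10_p(40, 0, 0, 40)
is_associated = score > 16
```

A cutoff of 16 is demanding: it can only be reached when both bulks are deeply covered and the alleles are almost completely separated between them.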
Novel efficient genome-wide SNP panels for the conservation of the highly endangered Iberian lynx.
Kleinman-Ruiz, Daniel; Martínez-Cruz, Begoña; Soriano, Laura; Lucena-Perez, Maria; Cruz, Fernando; Villanueva, Beatriz; Fernández, Jesús; Godoy, José A
2017-07-21
The Iberian lynx (Lynx pardinus) has been acknowledged as the most endangered felid species in the world. An intense contraction and fragmentation during the twentieth century left less than 100 individuals split in two isolated and genetically eroded populations by 2002. Genetic monitoring and management so far have been based on 36 STRs, but their limited variability and the more complex situation of current populations demand more efficient molecular markers. The recent characterization of the Iberian lynx genome identified more than 1.6 million SNPs, of which 1536 were selected and genotyped in an extended Iberian lynx sample. We validated 1492 SNPs and analysed their heterozygosity, Hardy-Weinberg equilibrium, and linkage disequilibrium. We then selected a panel of 343 minimally linked autosomal SNPs from which we extracted subsets optimized for four different typical tasks in conservation applications: individual identification, parentage assignment, relatedness estimation, and admixture classification, and compared their power to currently used STR panels. We ascribed 21 SNPs to chromosome X based on their segregation patterns, and identified one additional marker that showed significant differentiation between sexes. For all applications considered, panels of autosomal SNPs showed higher power than the currently used STR set with only a very modest increase in the number of markers. These novel panels of highly informative genome-wide SNPs provide more powerful, efficient, and flexible tools for the genetic management and non-invasive monitoring of Iberian lynx populations. This example highlights an important outcome of whole-genome studies in genetically threatened species.
Tang, Shao-Wen; Lv, Xiao-Zhen; Chen, Ru; Wu, Shan-Shan; Yang, Zhi-Rong; Chen, Da-Fang; Zhan, Si-Yan
2013-05-01
The precise pathogenic mechanism of antituberculosis (anti-TB) drug-induced liver injury (ATLI) is poorly understood. It may be associated with drug-metabolizing enzymes, such as cytochrome P450 (CYP) 3A4, CYP2C9 and CYP2C19. The aim of the present study was to explore the role of tagging single nucleotide polymorphisms (tSNPs) of CYP3A4, CYP2C9 and CYP2C19 in the risk of ATLI in a population-based anti-TB treatment cohort. A nested case-control study was designed. Each ATLI case was matched 1 : 4 with controls on the basis of age, gender, treatment history, disease severity and drug dosage. The tSNPs were selected using Haploview 4.2 based on the HapMap database of Han Chinese in Beijing and genotyped by TaqMan allelic discrimination technology. Eighty-nine patients with ATLI and 356 controls were included in the study. One tSNP in CYP3A4 (rs12333983), two in CYP2C9 (rs4918758, rs9332098) and two in CYP2C19 (rs11568732, rs4986894) were selected and genotyped. The minor allele frequencies of rs12333983, rs4918758, rs9332098, rs11568732 and rs4986894 were 36.0%, 41.4%, 1.1%, 5.7% and 35.7%, respectively, in the patients, compared with 31.7%, 42.9%, 3.4%, 8.9% and 35.1%, respectively, in the controls. No significant differences were observed in genotypes or allele frequencies of the five tSNPs between the two groups and none of the CYP2C9 or CYP2C19 haplotypes was significantly associated with the development of ATLI. Based on the Chinese anti-TB treatment cohort, we did not find a significant association between the risk of ATLI and genetic polymorphisms of CYP3A4, CYP2C9 and CYP2C19. None of the haplotypes exhibited a significant association with the development of ATLI in a Chinese tuberculosis population. © 2013 The Authors Clinical and Experimental Pharmacology and Physiology © 2013 Wiley Publishing Asia Pty Ltd.
DoGSD: the dog and wolf genome SNP database.
Bai, Bing; Zhao, Wen-Ming; Tang, Bi-Xia; Wang, Yan-Qing; Wang, Lu; Zhang, Zhang; Yang, He-Chuan; Liu, Yan-Hu; Zhu, Jun-Wei; Irwin, David M; Wang, Guo-Dong; Zhang, Ya-Ping
2015-01-01
The rapid advancement of next-generation sequencing technology has generated a deluge of genomic data from domesticated dogs and their wild ancestor, grey wolves, which have simultaneously broadened our understanding of domestication and diseases that are shared by humans and dogs. To address the scarcity of single nucleotide polymorphism (SNP) data provided by authorized databases and to make SNP data easier to use and more readily available, we propose DoGSD (http://dogsd.big.ac.cn), the first Canidae-specific database which focuses on whole genome SNP data from domesticated dogs and grey wolves. The DoGSD is a web-based, open-access resource comprising ∼19 million high-quality whole-genome SNPs. In addition to the dbSNP data set (build 139), DoGSD incorporates a comprehensive collection of SNPs from two newly sequenced samples (1 wolf and 1 dog) and SNPs collected from the three latest dog/wolf genetic studies (7 wolves and 68 dogs), which were analysed together using the population genetic statistic Fst. In addition, DoGSD integrates closely related information including SNP annotation, summary lists of SNPs located in genes, synonymous and non-synonymous SNPs, sampling location and breed information. All these features make DoGSD a useful resource for in-depth analysis in dog-/wolf-related studies. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
Kamitsuji, Shigeo; Matsuda, Takashi; Nishimura, Koichi; Endo, Seiko; Wada, Chisa; Watanabe, Kenji; Hasegawa, Koichi; Hishigaki, Haretsugu; Masuda, Masatoshi; Kuwahara, Yusuke; Tsuritani, Katsuki; Sugiura, Kenkichi; Kubota, Tomoko; Miyoshi, Shinji; Okada, Kinya; Nakazono, Kazuyuki; Sugaya, Yuki; Yang, Woosung; Sawamoto, Taiji; Uchida, Wataru; Shinagawa, Akira; Fujiwara, Tsutomu; Yamada, Hisaharu; Suematsu, Koji; Tsutsui, Naohisa; Kamatani, Naoyuki; Liou, Shyh-Yuh
2015-06-01
The Japan Pharmacogenomics Data Science Consortium (JPDSC) has assembled a database for conducting pharmacogenomics (PGx) studies in Japanese subjects. The database contains the genotypes of 2.5 million single-nucleotide polymorphisms (SNPs) and 5 human leukocyte antigen loci from 2994 Japanese healthy volunteers, as well as 121 kinds of clinical information, including self-reports, physiological data, hematological data and biochemical data. In this article, the reliability of our data was evaluated by principal component analysis (PCA) and association analysis for hematological and biochemical traits by using genome-wide SNP data. PCA of the SNPs showed that all the samples were collected from the Japanese population and that the samples were separated into two major clusters by birthplace, Okinawa and other than Okinawa, as had been previously reported. Among 87 SNPs that have been reported to be associated with 18 hematological and biochemical traits in genome-wide association studies (GWAS), the associations of 56 SNPs were replicated using our database. Statistical power simulations showed that the sample size of the JPDSC control database is large enough to detect genetic markers having a relatively strong association even when the case sample size is small. The JPDSC database will be useful as control data for conducting PGx studies to explore genetic markers to improve the safety and efficacy of drugs either during clinical development or in post-marketing.
Methods for discovering and validating relationships among genotyped animals
USDA-ARS's Scientific Manuscript database
Genomic selection based on single-nucleotide polymorphisms (SNPs) has led to the collection of genotypes for over 2.2 million animals by the Council on Dairy Cattle Breeding in the United States. To assure that a genotype is assigned to the correct animal and that the animal’s pedigree is correct, t...
Genome-Wide Association Study of Intelligence: Additive Effects of Novel Brain Expressed Genes
ERIC Educational Resources Information Center
Loo, Sandra K.; Shtir, Corina; Doyle, Alysa E.; Mick, Eric; McGough, James J.; McCracken, James; Biederman, Joseph; Smalley, Susan L.; Cantor, Rita M.; Faraone, Stephen V.; Nelson, Stanley F.
2012-01-01
Objective: The purpose of the present study was to identify common genetic variants that are associated with human intelligence or general cognitive ability. Method: We performed a genome-wide association analysis with a dense set of 1 million single-nucleotide polymorphisms (SNPs) and quantitative intelligence scores within an ancestrally…
Gu, Ming-liang; Chu, Jia-you
2007-12-01
The human genome is organized into haplotypes and haplotype blocks, which provide valuable information on human evolutionary history and may lead to more efficient strategies for identifying genetic variants that increase susceptibility to complex diseases. The genome can be divided into discrete blocks of limited haplotype diversity. In each block, a small fraction of "tag SNPs" can be used to distinguish a large fraction of the haplotypes. These tag SNPs are potentially useful for constructing haplotypes and haplotype blocks and for association studies of complex diseases. There are two general classes of methods for constructing haplotypes and haplotype blocks, based on genotypes from large pedigrees and on statistical algorithms, respectively. The authors evaluate several construction methods to assess the power of different association tests under a variety of disease models and block-partitioning criteria. The advantages, limitations and applications of each method, and their use in association studies, are discussed. With the completion of the HapMap and the development of statistical algorithms for haplotype reconstruction, approaches to haplotype construction that combine mathematics, physics and computer science will have a profound impact on population genetics, on the mapping and cloning of susceptibility genes for complex diseases, and on related areas of the life sciences.
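The tag-SNP idea in this abstract, that a few SNPs can stand in for many correlated ones, can be illustrated with a small sketch. The code computes pairwise r² from phased 0/1 haplotypes and then picks tags greedily. The greedy strategy, the 0.8 threshold, the function names and the toy haplotypes are assumptions chosen for illustration; the review does not prescribe this particular algorithm.

```python
from typing import List, Sequence


def r_squared(x: Sequence[int], y: Sequence[int]) -> float:
    """Squared correlation (r^2) between two biallelic SNPs coded 0/1."""
    n = len(x)
    p_a = sum(x) / n
    p_b = sum(y) / n
    p_ab = sum(xi * yi for xi, yi in zip(x, y)) / n
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return 0.0  # a monomorphic SNP carries no LD information
    d = p_ab - p_a * p_b  # linkage disequilibrium coefficient D
    return d * d / denom


def greedy_tag_snps(haplotypes: List[str], threshold: float = 0.8) -> List[int]:
    """Pick SNP indices so that every SNP has r^2 >= threshold with some tag."""
    n_snps = len(haplotypes[0])
    columns = [[int(h[j]) for h in haplotypes] for j in range(n_snps)]
    covers = [
        {k for k in range(n_snps)
         if k == j or r_squared(columns[j], columns[k]) >= threshold}
        for j in range(n_snps)
    ]
    untagged = set(range(n_snps))
    tags: List[int] = []
    while untagged:
        # Choose the SNP that captures the most still-untagged SNPs
        # (ties go to the lowest index).
        best = max(sorted(untagged), key=lambda j: len(covers[j] & untagged))
        tags.append(best)
        untagged -= covers[best]
    return tags


# Toy phased haplotypes: SNPs 0 and 1 are identical, as are SNPs 2 and 3.
haplotypes = ["0000", "0011", "1100", "1111"]
tags = greedy_tag_snps(haplotypes)
```

In the toy data two tags are enough to represent four SNPs, which is the saving that makes tag SNPs attractive for association studies.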
Single Nucleotide Polymorphism Discovery in Bovine Pituitary Gland Using RNA-Seq Technology
Pareek, Chandra Shekhar; Smoczyński, Rafał; Kadarmideen, Haja N.; Dziuba, Piotr; Błaszczyk, Paweł; Sikora, Marcin; Walendzik, Paulina; Grzybowski, Tomasz; Pierzchała, Mariusz; Horbańczuk, Jarosław; Szostak, Agnieszka; Ogluszka, Magdalena; Zwierzchowski, Lech; Czarnik, Urszula; Fraser, Leyland; Sobiech, Przemysław; Wąsowicz, Krzysztof; Gelfand, Brian; Feng, Yaping; Kumar, Dibyendu
2016-01-01
Examination of the bovine pituitary gland transcriptome by strand-specific RNA-seq allows the detection of putative single nucleotide polymorphisms (SNPs) within potential candidate genes (CGs) or QTL regions, and helps to understand the genomic variations that contribute to economic traits. Here we report a breed-specific model to successfully perform the detection of SNPs in the pituitary gland of young growing bulls representing Polish Holstein-Friesian (HF), Polish Red, and Hereford breeds at three developmental ages viz., six months, nine months, and twelve months. A total of 18 bovine pituitary gland polyA transcriptome libraries were prepared and sequenced using the Illumina NextSeq 500 platform. Sequenced FastQ files of all 18 young bulls were submitted to the NCBI-SRA database under accession number SRS1296732. For the investigated young bulls, a total of 1,138,823,098 raw paired-end reads with a length of 156 bases were obtained, resulting in approximately 63 million paired-end reads per library. Breed-wise, a total of 515.38, 215.39, and 408.04 million paired-end reads were obtained for Polish HF, Polish Red, and Hereford breeds, respectively. Burrows-Wheeler Aligner (BWA) read alignments showed 93.04%, 94.39%, and 83.46% of the mapped sequencing reads were properly paired to the Polish HF, Polish Red, and Hereford breeds, respectively. The constructed breed-specific SNP-db of the three cattle breeds yielded 13,775,885 SNPs. On average, 765,326 breed-specific SNPs per young bull were identified. Using two stringent filtering parameters, i.e., a minimum 10 SNP reads per base with an accuracy ≥ 90% and a minimum 10 SNP reads per base with an accuracy = 100%, SNP-db records were trimmed to construct a highly reliable SNP-db. This resulted in a reduction of 95.7% and 96.4% cut-off mark of constructed raw SNP-db. Finally, SNP discoveries using RNA-Seq data were validated by KASP™ SNP genotyping assay.
The comprehensive QTLs/CGs analysis of 76 QTLs/CGs with RNA-seq data identified the KCNIP4, CCSER1, DPP6, MAP3K5 and GHR CGs as having the highest numbers of SNP hit loci in all three breeds and developmental ages. However, the CAST CG, with more than 100 SNP hits, was observed only in the Polish HF and Hereford breeds. These findings are important for the identification and construction of novel tissue-specific and breed-specific SNP-db datasets; by screening putative SNPs against the QTL db and candidate genes for bovine growth and reproduction traits, one can develop genomic selection strategies for these traits. PMID:27606429
Complete physical mapping of IL6 reveals a new marker associated with chronic periodontitis.
Farhat, S B; de Souza, C M; Braosi, A P R; Kim, S H; Tramontina, V A; Papalexiou, V; Olandoski, M; Mira, M T; Luczyszyn, S M; Trevilatto, P C
2017-04-01
Interleukin-6 (IL-6) is a powerful stimulator of osteoclast differentiation and bone resorption. Production of IL-6 is modulated by polymorphisms, and higher levels of this cytokine are found locally in patients with chronic periodontitis. In this study we performed a modern approach - Complete physical mapping of the IL6 gene - to identify the polymorphisms associated with chronic periodontitis in a southern Brazilian population sample. One-hundred and nine individuals of both genders (mean age: 41.5 ± 8.5 years) were divided into a study group (56 participants with periodontitis) and a control group (53 individuals without periodontitis). After collection and purification of DNA, nine tag single nucleotide polymorphisms (SNPs; rs1524107, rs2069835, rs2069837, rs2069838, rs2069840, rs2069842, rs2069843, rs2069845 and rs2069849) covering the entire gene were selected according to the information available on the International HapMap Project website and evaluated using real-time PCR. Differences in the distribution of the following parameters were statistically significant between study and control groups: number of teeth (p = 0.030); probing depth (p < 0.001); clinical attachment level (p < 0.001); gingival index (p < 0.001); plaque index (p = 0.003); calculus index (p < 0.001); and dental mobility (p < 0.001). It was found that marker rs2069837 (located in intron 2 of IL6) under G dominant was associated with protection against chronic periodontitis in a Brazilian population in the presence of clinical variables, such as visible plaque, dentist visit frequency and dental floss use, and was suggested for the first time as a marker of susceptibility to chronic periodontitis. Complete physical mapping of IL6 (using tag SNPs) was carried out for the first time, unveiling allele G of polymorphism rs2069837 (located in the second intron of IL6) as a suggestive marker of protection against chronic periodontitis in a Brazilian population. © 2016 John Wiley & Sons A/S. 
Published by John Wiley & Sons Ltd.
Orho-Melander, Marju; Melander, Olle; Guiducci, Candace; Perez-Martinez, Pablo; Corella, Dolores; Roos, Charlotta; Tewhey, Ryan; Rieder, Mark J.; Hall, Jennifer; Abecasis, Goncalo; Tai, E. Shyong; Welch, Cullan; Arnett, Donna K.; Lyssenko, Valeriya; Lindholm, Eero; Saxena, Richa; de Bakker, Paul I.W.; Burtt, Noel; Voight, Benjamin F.; Hirschhorn, Joel N.; Tucker, Katherine L.; Hedner, Thomas; Tuomi, Tiinamaija; Isomaa, Bo; Eriksson, Karl-Fredrik; Taskinen, Marja-Riitta; Wahlstrand, Björn; Hughes, Thomas E.; Parnell, Laurence D.; Lai, Chao-Qiang; Berglund, Göran; Peltonen, Leena; Vartiainen, Erkki; Jousilahti, Pekka; Havulinna, Aki S.; Salomaa, Veikko; Nilsson, Peter; Groop, Leif; Altshuler, David; Ordovas, Jose M.; Kathiresan, Sekar
2008-01-01
OBJECTIVE: Using the genome-wide association approach, we recently identified the glucokinase regulatory protein gene (GCKR, rs780094) region as a novel quantitative trait locus for plasma triglyceride concentration in Europeans. Here, we sought to study the association of GCKR variants with metabolic phenotypes, including measures of glucose homeostasis, to evaluate the GCKR locus in samples of non-European ancestry and to fine-map across the associated genomic interval. RESEARCH DESIGN AND METHODS: We performed association studies in 12 independent cohorts comprising >45,000 individuals representing several ancestral groups (whites from Northern and Southern Europe, whites from the U.S., African Americans from the U.S., Hispanics of Caribbean origin, and Chinese, Malays, and Asian Indians from Singapore). We conducted genetic fine-mapping across the ∼417-kb region of linkage disequilibrium spanning GCKR and 16 other genes on chromosome 2p23 by imputing untyped HapMap single nucleotide polymorphisms (SNPs) and genotyping 104 SNPs across the associated genomic interval. RESULTS: We provide comprehensive evidence that GCKR rs780094 is associated with opposite effects on fasting plasma triglyceride (Pmeta = 3 × 10^-56) and glucose (Pmeta = 1 × 10^-13) concentrations. In addition, we confirmed recent reports that the same SNP is associated with C-reactive protein (CRP) level (P = 5 × 10^-5). Both fine-mapping approaches revealed a common missense GCKR variant (rs1260326, Pro446Leu, 34% frequency, r² = 0.93 with rs780094) as the strongest association signal in the region. CONCLUSIONS: These findings point to a molecular mechanism in humans by which higher triglycerides and CRP can be coupled with lower plasma glucose concentrations and position GCKR in central pathways regulating both hepatic triglyceride and glucose metabolism. PMID:18678614
SPSmart: adapting population based SNP genotype databases for fast and comprehensive web access.
Amigo, Jorge; Salas, Antonio; Phillips, Christopher; Carracedo, Angel
2008-10-10
In the last five years large online resources of human variability have appeared, notably HapMap, Perlegen and the CEPH foundation. These databases of genotypes with population information act as catalogues of human diversity, and are widely used as reference sources for population genetics studies. Although many useful conclusions may be extracted by querying databases individually, the lack of flexibility for combining data from within and between each database does not allow the calculation of key population variability statistics. We have developed a novel tool for accessing and combining large-scale genomic databases of single nucleotide polymorphisms (SNPs) in widespread use in human population genetics: SPSmart (SNPs for Population Studies). A fast pipeline creates and maintains a data mart from the most commonly accessed databases of genotypes containing population information: data is mined, summarized into the standard statistical reference indices, and stored into a relational database that currently handles as many as 4 × 10^9 genotypes and that can be easily extended to new database initiatives. We have also built a web interface to the data mart that allows the browsing of underlying data indexed by population and the combining of populations, allowing intuitive and straightforward comparison of population groups. All the information served is optimized for web display, and most of the computations are already pre-processed in the data mart to speed up the data browsing and any computational treatment requested. In practice, SPSmart allows populations to be combined into user-defined groups, while multiple databases can be accessed and compared in a few simple steps from a single query. It performs the queries rapidly and gives straightforward graphical summaries of SNP population variability through visual inspection of allele frequencies outlined in standard pie-chart format.
In addition, full numerical description of the data is output in statistical results panels that include common population genetics metrics such as heterozygosity, Fst and In.
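Two of the metrics named here, heterozygosity and Fst, can be illustrated for a single biallelic SNP. The sketch uses the textbook heterozygosity-based definition Fst = (Ht − Hs) / Ht with equally weighted populations; whether SPSmart computes its statistics this way is not stated in the abstract, so the formula choice, the function names and the example frequencies are assumptions.

```python
from typing import Sequence


def expected_heterozygosity(p: float) -> float:
    """Expected heterozygosity 2p(1 - p) at a biallelic locus."""
    return 2.0 * p * (1.0 - p)


def fst(pop_freqs: Sequence[float]) -> float:
    """Fst = (Ht - Hs) / Ht from per-population allele frequencies,
    with every population given equal weight."""
    n = len(pop_freqs)
    h_s = sum(expected_heterozygosity(p) for p in pop_freqs) / n  # mean within-population
    p_bar = sum(pop_freqs) / n
    h_t = expected_heterozygosity(p_bar)  # pooled (total) heterozygosity
    if h_t == 0:
        return 0.0  # locus is fixed for the same allele everywhere
    return (h_t - h_s) / h_t


# Illustrative frequencies of one allele in two populations.
same = fst([0.5, 0.5])       # identical frequencies: no differentiation
fixed = fst([0.0, 1.0])      # alternative alleles fixed: complete differentiation
partial = fst([0.2, 0.8])    # intermediate case
```

Combining populations into a user-defined group, as the tool allows, corresponds to pooling their genotype counts before the frequencies passed to `fst` are computed.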
Jin, Peng; Andiappan, Anand Kumar; Quek, Jia Min; Lee, Bernett; Au, Bijin; Sio, Yang Yie; Irwanto, Astrid; Schurmann, Claudia; Grabe, Hans Jörgen; Suri, Bani Kaur; Matta, Sri Anusha; Westra, Harm-Jan; Franke, Lude; Esko, Tonu; Sun, Liangdan; Zhang, Xuejun; Liu, Hong; Zhang, Furen; Larbi, Anis; Xu, Xin; Poidinger, Michael; Liu, Jianjun; Chew, Fook Tim; Rotzschke, Olaf; Shi, Li; Wang, De Yun
2015-06-01
Brain-derived neurotrophic factor (BDNF) is a secretory protein that has been implicated in the pathogenesis of allergic rhinitis (AR), atopic asthma, and eczema, but it is currently unknown whether BDNF polymorphisms influence susceptibility to moderate-to-severe AR. We sought to identify disease associations and the functional effect of BDNF genetic variants in patients with moderate-to-severe AR. Tagging single nucleotide polymorphisms (SNPs) spanning the BDNF gene were selected from the human HapMap Han Chinese from Beijing (CHB) data set, and associations with moderate-to-severe AR were assessed in 2 independent cohorts of Chinese patients (2216 from Shandong province and 1239 living in Singapore). The functional effects of the BDNF genetic variants were determined by using both in vitro and ex vivo assays. The tagging SNP rs10767664 was significantly associated with the risk of moderate-to-severe AR in both Singapore Chinese (P = .0017; odds ratio, 1.324) and Shandong Chinese populations (P = .039; odds ratio, 1.180). The coding nonsynonymous SNP rs6265 was in perfect linkage with rs10767664 and conferred increased BDNF protein secretion by a human cell line in vitro. Subjects bearing the AA genotype of rs10767664 exhibited increased risk of moderate-to-severe AR and displayed increased BDNF protein and total IgE levels in plasma. Using a large-scale expression quantitative trait locus study, we demonstrated that BDNF SNPs are significantly associated with altered BDNF concentrations in peripheral blood. A common genetic variant of the BDNF gene is associated with increased risk of moderate-to-severe AR, and the AA genotype is associated with increased BDNF mRNA levels in peripheral blood. Together, these data indicate that functional BDNF gene variants increase the risk of moderate-to-severe AR. Copyright © 2015 American Academy of Allergy, Asthma & Immunology. Published by Elsevier Inc. All rights reserved.
Kwan, Patrick; Poon, Wai Sang; Ng, Ho-Keung; Kang, David E; Wong, Virginia; Ng, Ping Wing; Lui, Colin H T; Sin, Ngai Chuen; Wong, Ka S; Baum, Larry
2008-11-01
Many antiepileptic drugs (AEDs) prevent seizures by blocking voltage-gated brain sodium channels. However, treatment is ineffective in 30% of epilepsy patients, which might, at least in part, result from polymorphisms of the sodium channel genes. We investigated the association of AED responsiveness with genetic polymorphisms and correlated any association with mRNA expression of the neuronal sodium channels. We performed genotyping of tagging and candidate single nucleotide polymorphisms (SNPs) of SCN1A, 2A, and 3A in 471 Chinese epilepsy patients (272 drug responsive and 199 drug resistant). A total of 27 SNPs were selected based on the HapMap database. Genotype distributions in drug-responsive and drug-resistant patients were compared. SCN2A mRNA was quantified by real-time PCR in 24 brain and 57 blood samples. Its level was compared between patients with different genotypes of an SCN2A SNP found to be associated with drug responsiveness. SCN2A IVS7-32A>G (rs2304016) A alleles were associated with drug resistance (odds ratio = 2.1, 95% confidence interval: 1.2-3.7, P=0.007). Haplotypes containing the IVS7-32A>G allele A were also associated with drug resistance. IVS7-32A>G is located within the putative splicing branch site for splicing exons 7 and 9. PCR of reverse-transcribed RNA from blood or brain of patients with different IVS7-32A>G genotypes using primers in exons 7 and 9 showed no skipping of exon 8, and real-time PCR showed no difference in SCN2A mRNA levels among genotypes. Results of this study suggest an association between SCN2A IVS7-32A>G and AED responsiveness, without evidence of an effect on splicing or mRNA expression.
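The odds ratio and 95% confidence interval reported above are the kind of quantity one obtains from a 2 × 2 table of allele counts. The sketch below shows the usual Wald (log-scale) interval in Python. The counts are hypothetical, since the abstract does not give them, and the abstract does not state which interval method the authors used.

```python
import math


def odds_ratio_wald_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval for a 2 x 2 table.

    a, b: risk-allele and other-allele counts in cases (e.g. drug resistant)
    c, d: risk-allele and other-allele counts in controls (e.g. drug responsive)
    z:    normal quantile, 1.96 for a 95% interval
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the usual large-sample approximation.
    se_log_or = math.sqrt(1.0 / a + 1.0 / b + 1.0 / c + 1.0 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


if __name__ == "__main__":
    # Hypothetical counts, not taken from the study.
    print(odds_ratio_wald_ci(20, 10, 10, 20))
```

An interval that excludes 1, as the reported 1.2-3.7 does, corresponds to a nominally significant association at the 5% level.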
Zhao, Hui; Wu, Xuan; Dong, Chun-Ling; Wang, Bi-Ying; Zhao, Jiao; Cao, Xian-E
2017-08-01
This study was designed to investigate the association between single nucleotide polymorphisms (SNPs) of the β2-adrenergic receptor (ADRB2) gene and the risk of chronic obstructive pulmonary disease (COPD) in a Chinese population. From January 2010 to October 2014, 261 COPD patients were selected as the case group and 239 healthy subjects were selected as the control group. Pulmonary function tests were performed to detect forced vital capacity (FVC), 1-s forced expiratory volume (FEV1), and FEV1/FVC (%). rs1042711, rs1042714, and rs1042718 were selected as tagSNPs of the ADRB2 gene from the HapMap database in accordance with previous studies. The ADRB2 genotypes were established by real-time polymerase chain reaction assays using TaqMan-labeled probes. The relationships between the ADRB2 polymorphisms and COPD risk were estimated using logistic regression analyses. The frequency of the genotypes and alleles of rs1042711 in ADRB2 showed a significant difference between the COPD and control groups (p < 0.05); compared with the CC genotype, the non-CC genotypes showed an increased COPD risk (p = 0.002). Compared with the CC haplotype, the TG haplotype increased COPD risk, while the CG haplotype reduced COPD risk for normal individuals. Compared with the CC genotype, the TT genotype showed significantly lower FEV1 and FEV1/FVC (p = 0.022, p = 0.0191, respectively). Both the TC and TG haplotypes showed lower FEV1 and FEV1/FVC in comparison with the CC haplotype (both p < 0.05). The results of logistic regression analysis showed that rs1042711 of ADRB2 and smoking history were associated with COPD risk (both p < 0.05). It is indicated that the TT genotype of rs1042711 and smoking pack years are both risk factors for COPD.
Aguado, Cristina; Gayà-Vidal, Magdalena; Villatoro, Sergi; Oliva, Meritxell; Izquierdo, David; Giner-Delgado, Carla; Montalvo, Víctor; García-González, Judit; Martínez-Fundichely, Alexander; Capilla, Laia; Ruiz-Herrera, Aurora; Estivill, Xavier; Puig, Marta; Cáceres, Mario
2014-01-01
In recent years different types of structural variants (SVs) have been discovered in the human genome and their functional impact has become increasingly clear. Inversions, however, are poorly characterized and more difficult to study, especially those mediated by inverted repeats or segmental duplications. Here, we describe the results of a simple and fast inverse PCR (iPCR) protocol for high-throughput genotyping of a wide variety of inversions using a small amount of DNA. In particular, we analyzed 22 inversions predicted in humans ranging from 5.1 kb to 226 kb and mediated by inverted repeat sequences of 1.6–24 kb. First, we validated 17 of the 22 inversions in a panel of nine HapMap individuals from different populations, and we genotyped them in 68 additional individuals of European origin, with correct genetic transmission in ∼12 mother-father-child trios. Global inversion minor allele frequency varied between 1% and 49% and inversion genotypes were consistent with Hardy-Weinberg equilibrium. By analyzing the nucleotide variation and the haplotypes in these regions, we found that only four inversions have linked tag-SNPs and that in many cases there are multiple shared SNPs between standard and inverted chromosomes, suggesting an unexpected high degree of inversion recurrence during human evolution. iPCR was also used to check 16 of these inversions in four chimpanzees and two gorillas, and 10 showed both orientations either within or between species, providing additional support for their multiple origin. Finally, we have identified several inversions that include genes in the inverted or breakpoint regions, and at least one disrupts a potential coding gene. Thus, these results represent a significant advance in our understanding of inversion polymorphism in human populations and challenge the common view of a single origin of inversions, with important implications for inversion analysis in SNP-based studies. PMID:24651690
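The statement that inversion genotypes were consistent with Hardy-Weinberg equilibrium can be made concrete with a chi-square goodness-of-fit check. The following Python sketch is a generic textbook test, not the authors' procedure, and the genotype counts in the usage line are hypothetical.

```python
def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square statistic comparing observed genotype counts with Hardy-Weinberg expectations."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A (e.g. the standard orientation)
    q = 1.0 - p                      # frequency of allele B (e.g. the inverted orientation)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    # Skip classes with zero expectation (monomorphic case) to avoid division by zero.
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)


if __name__ == "__main__":
    # Hypothetical genotype counts for one inversion in 68 individuals.
    print(hwe_chi_square(30, 30, 8))
```

With one degree of freedom, a statistic below about 3.84 is compatible with equilibrium at the 5% level. For samples as small as the 68 individuals mentioned above, an exact test is often preferred over this approximation.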
Sequence variants at CYP1A1–CYP1A2 and AHR associate with coffee consumption
Sulem, Patrick; Gudbjartsson, Daniel F.; Geller, Frank; Prokopenko, Inga; Feenstra, Bjarke; Aben, Katja K.H.; Franke, Barbara; den Heijer, Martin; Kovacs, Peter; Stumvoll, Michael; Mägi, Reedik; Yanek, Lisa R.; Becker, Lewis C.; Boyd, Heather A.; Stacey, Simon N.; Walters, G. Bragi; Jonasdottir, Adalbjorg; Thorleifsson, Gudmar; Holm, Hilma; Gudjonsson, Sigurjon A.; Rafnar, Thorunn; Björnsdottir, Gyda; Becker, Diane M.; Melbye, Mads; Kong, Augustine; Tönjes, Anke; Thorgeirsson, Thorgeir; Thorsteinsdottir, Unnur; Kiemeney, Lambertus A.; Stefansson, Kari
2011-01-01
Coffee is the most commonly used stimulant and caffeine is its main psychoactive ingredient. The heritability of coffee consumption has been estimated at around 50%. We performed a meta-analysis of four genome-wide association studies of coffee consumption among coffee drinkers from Iceland (n = 2680), the Netherlands (n = 2791), the Sorbs Slavonic population isolate in Germany (n = 771) and the USA (n = 369) using both directly genotyped and imputed single nucleotide polymorphisms (SNPs) (2.5 million SNPs). SNPs at the two most significant loci were also genotyped in a sample set from Iceland (n = 2430) and a Danish sample set consisting of pregnant women (n = 1620). Combining all data, two sequence variants significantly associated with increased coffee consumption: rs2472297-T located between CYP1A1 and CYP1A2 at 15q24 (P = 5.4 × 10^−14) and rs6968865-T near aryl hydrocarbon receptor (AHR) at 7p21 (P = 2.3 × 10^−11). An effect of ∼0.2 cups a day per allele was observed for both SNPs. CYP1A2 is the main caffeine metabolizing enzyme and is also involved in drug metabolism. AHR detects xenobiotics, such as polycyclic aryl hydrocarbons found in roasted coffee, and induces transcription of CYP1A1 and CYP1A2. The association of these SNPs with coffee consumption was present in both smokers and non-smokers. PMID:21357676
High throughput SNP discovery and genotyping in hexaploid wheat.
Rimbert, Hélène; Darrier, Benoît; Navarro, Julien; Kitt, Jonathan; Choulet, Frédéric; Leveugle, Magalie; Duarte, Jorge; Rivière, Nathalie; Eversole, Kellye; Le Gouis, Jacques; Davassi, Alessandro; Balfourier, François; Le Paslier, Marie-Christine; Berard, Aurélie; Brunel, Dominique; Feuillet, Catherine; Poncet, Charles; Sourdille, Pierre; Paux, Etienne
2018-01-01
Because of their abundance and their amenability to high-throughput genotyping techniques, Single Nucleotide Polymorphisms (SNPs) are powerful tools for efficient genetics and genomics studies, including characterization of genetic resources, genome-wide association studies and genomic selection. In wheat, most of the previous SNP discovery initiatives targeted the coding fraction, leaving almost 98% of the wheat genome largely unexploited. Here we report on the use of whole-genome resequencing data from eight wheat lines to mine for SNPs in the genic, the repetitive and non-repetitive intergenic fractions of the wheat genome. Eventually, we identified 3.3 million SNPs, 49% being located on the B-genome, 41% on the A-genome and 10% on the D-genome. We also describe the development of the TaBW280K high-throughput genotyping array containing 280,226 SNPs. Performance of this chip was examined by genotyping a set of 96 wheat accessions representing the worldwide diversity. Sixty-nine percent of the SNPs can be efficiently scored, half of them showing a diploid-like clustering. The TaBW280K was proven to be a very efficient tool for diversity analyses, as well as for breeding as it can discriminate between closely related elite varieties. Finally, the TaBW280K array was used to genotype a population derived from a cross between Chinese Spring and Renan, leading to the construction of a dense genetic map comprising 83,721 markers. The results described here will provide the wheat community with powerful tools for both basic and applied research.
Ni, Guiyan; Cavero, David; Fangmann, Anna; Erbe, Malena; Simianer, Henner
2017-01-16
With the availability of next-generation sequencing technologies, genomic prediction based on whole-genome sequencing (WGS) data is now feasible in animal breeding schemes and was expected to lead to higher predictive ability, since such data may contain all genomic variants including causal mutations. Our objective was to compare prediction ability with high-density (HD) array data and WGS data in a commercial brown layer line with genomic best linear unbiased prediction (GBLUP) models using various approaches to weight single nucleotide polymorphisms (SNPs). A total of 892 chickens from a commercial brown layer line were genotyped with 336 K segregating SNPs (array data) that included 157 K genic SNPs (i.e. SNPs in or around a gene). For these individuals, genome-wide sequence information was imputed based on data from re-sequencing runs of 25 individuals, leading to 5.2 million (M) imputed SNPs (WGS data), including 2.6 M genic SNPs. De-regressed proofs (DRP) for eggshell strength, feed intake and laying rate were used as quasi-phenotypic data in genomic prediction analyses. Four weighting factors for building a trait-specific genomic relationship matrix were investigated: identical weights, -log10(P) from genome-wide association study results, squares of SNP effects from random regression BLUP, and variable selection based weights (known as BLUP|GA). Predictive ability was measured as the correlation between DRP and direct genomic breeding values in five replications of a fivefold cross-validation. Averaged over the three traits, the highest predictive ability (0.366 ± 0.075) was obtained when only genic SNPs from WGS data were used. Predictive abilities with genic SNPs and all SNPs from HD array data were 0.361 ± 0.072 and 0.353 ± 0.074, respectively.
Prediction with -log10(P) or squares of SNP effects as weighting factors for building a genomic relationship matrix or BLUP|GA did not increase accuracy, compared to that with identical weights, regardless of the SNP set used. Our results show that little or no benefit was gained when using all imputed WGS data to perform genomic prediction compared to using HD array data regardless of the weighting factors tested. However, using only genic SNPs from WGS data had a positive effect on prediction ability.
Spontaneous preterm birth and single nucleotide gene polymorphisms: a recent update.
Sheikh, Ishfaq A; Ahmad, Ejaz; Jamal, Mohammad S; Rehan, Mohd; Assidi, Mourad; Tayubi, Iftikhar A; AlBasri, Samera F; Bajouh, Osama S; Turki, Rola F; Abuzenadah, Adel M; Damanhouri, Ghazi A; Beg, Mohd A; Al-Qahtani, Mohammed
2016-10-17
Preterm birth (PTB), birth at <37 weeks of gestation, is a significant global public health problem. Worldwide, about 15 million babies are born preterm each year, resulting in more than a million deaths of children. Preterm neonates are more prone to problems and need intensive care hospitalization. Health issues may persist through early adulthood and even be carried on to the next generation. The majority (70%) of PTBs are spontaneous, about half of them without any apparent cause and the other half associated with a number of risk factors. Genetic factors are one of the significant risks for PTB. The focus of this review is on single nucleotide gene polymorphisms (SNPs) that are reported to be associated with PTB. A comprehensive evaluation of studies on SNPs known to confer potential risk of PTB was done by performing a targeted PubMed search for the years 2007-2015 and systematically reviewing all relevant studies. Evaluation of 92 studies identified 119 candidate genes with SNPs that had potential association with PTB. The genes were associated with functions of a wide spectrum of tissue and cell types such as endocrine, tissue remodeling, vascular, metabolic, and immune and inflammatory systems. A number of potential functional candidate gene variants have been reported that predispose women to PTB. Understanding the complex genomic landscape of PTB needs high-throughput genome sequencing methods such as whole-exome sequencing and whole-genome sequencing approaches that will significantly enhance the understanding of PTB. Identification of high risk women, avoidance of possible risk factors, and provision of personalized health care are important to manage PTB.
Fast and Accurate Approximation to Significance Tests in Genome-Wide Association Studies
Zhang, Yu; Liu, Jun S.
2011-01-01
Genome-wide association studies commonly involve simultaneous tests of millions of single nucleotide polymorphisms (SNPs) for disease association. The SNPs in nearby genomic regions, however, are often highly correlated due to linkage disequilibrium (LD, a genetic term for correlation). Simple Bonferroni correction for multiple comparisons is therefore too conservative. Permutation tests, which are often employed in practice, are both computationally expensive for genome-wide studies and limited in their scope. We present an accurate and computationally efficient method, based on Poisson de-clumping heuristics, for approximating genome-wide significance of SNP associations. Compared with permutation tests and other multiple comparison adjustment approaches, our method computes the most accurate and robust p-value adjustments for millions of correlated comparisons within seconds. We demonstrate analytically that the accuracy and the efficiency of our method are nearly independent of the sample size, the number of SNPs, and the scale of p-values to be adjusted. In addition, our method can be easily adapted to estimate false discovery rate. When applied to genome-wide SNP datasets, we observed highly variable p-value adjustment results evaluated from different genomic regions. The variation in adjustments along the genome, however, is well conserved between the European and the African populations. The p-value adjustments are significantly correlated with LD among SNPs, recombination rates, and SNP densities. Given the large variability of sequence features in the genome, we further discuss a novel approach of using SNP-specific (local) thresholds to detect genome-wide significant associations. This article has supplementary material online. PMID:22140288
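For reference, the plain Bonferroni adjustment that the abstract above describes as too conservative can be sketched in a few lines of Python. This is only the baseline, not the Poisson de-clumping method, whose details the abstract does not give. The function names are illustrative assumptions.

```python
def bonferroni_adjust(p_values):
    """Multiply each p-value by the number of tests, capping the result at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


if __name__ == "__main__":
    print(bonferroni_adjust([0.01, 0.5, 1e-9]))
    # One million tests at alpha = 0.05 gives the familiar 5e-8 genome-wide threshold.
    print(bonferroni_threshold(0.05, 1_000_000))
```

Because the adjustment treats every test as independent, it overcorrects when neighbouring SNPs are in LD and the effective number of tests is smaller than the number of SNPs.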
Ojeda, Dario I; Dhillon, Braham; Tsui, Clement K M; Hamelin, Richard C
2014-03-01
Single-nucleotide polymorphisms (SNPs) are rapidly becoming the standard markers in population genomics studies; however, their use in nonmodel organisms is limited due to the lack of cost-effective approaches to uncover genome-wide variation, and the large number of individuals needed in the screening process to reduce ascertainment bias. To discover SNPs for population genomics studies in the fungal symbionts of the mountain pine beetle (MPB), we developed a road map to discover SNPs and to produce a genotyping platform. We undertook a whole-genome sequencing approach of Leptographium longiclavatum in combination with available genomics resources of another MPB symbiont, Grosmannia clavigera. We sequenced 71 individuals pooled into four groups using the Illumina sequencing technology. We generated between 27 and 30 million reads of 75 bp that resulted in a total of 1,181 contigs longer than 2 kb and an assembled genome size of 28.9 Mb (N50 = 48 kb, average depth = 125x). A total of 9052 proteins were annotated, and between 9531 and 17,266 SNPs were identified in the four pools. A subset of 206 genes (containing 574 SNPs, 11% false positives) was used to develop a genotyping platform for this species. Using this roadmap, we developed a genotyping assay with a total of 147 SNPs located in 121 genes using the Illumina® Sequenom iPLEX Gold. Our preliminary genotyping (success rate = 85%) of 304 individuals from 36 populations supports the utility of this approach for population genomics studies in other MPB fungal symbionts and other fungal nonmodel species. © 2013 John Wiley & Sons Ltd.
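The N50 figure quoted above is a standard assembly statistic: the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the assembly. A minimal Python sketch follows. The contig lengths in the usage line are made up for illustration and are unrelated to the L. longiclavatum assembly.

```python
def n50(contig_lengths):
    """Return the N50 of a list of contig lengths."""
    if not contig_lengths:
        raise ValueError("at least one contig length is required")
    total = sum(contig_lengths)
    running = 0
    # Walk from the longest contig down until half of the assembly is covered.
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


if __name__ == "__main__":
    # Hypothetical contig lengths in kb.
    print(n50([2, 2, 2, 3, 3, 4, 8, 8]))
```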
An integrated map of genetic variation from 1,092 human genomes
2012-01-01
Summary Through characterising the geographic and functional spectrum of human genetic variation, the 1000 Genomes Project aims to build a resource to help understand the genetic contribution to disease. We describe the genomes of 1,092 individuals from 14 populations, constructed using a combination of low-coverage whole-genome and exome sequencing. By developing methodologies to integrate information across multiple algorithms and diverse data sources we provide a validated haplotype map of 38 million SNPs, 1.4 million indels and over 14 thousand larger deletions. We show that individuals from different populations carry different profiles of rare and common variants and that low-frequency variants show substantial geographic differentiation, which is further increased by the action of purifying selection. We show that evolutionary conservation and coding consequence are key determinants of the strength of purifying selection, that rare-variant load varies substantially across biological pathways and that each individual harbours hundreds of rare non-coding variants at conserved sites, such as transcription-factor-motif disrupting changes. This resource, which captures up to 98% of accessible SNPs at a frequency of 1% in populations of medical genetics focus, enables analysis of common and low-frequency variants in individuals from diverse, including admixed, populations. PMID:23128226
Bassil, Nahla V; Davis, Thomas M; Zhang, Hailong; Ficklin, Stephen; Mittmann, Mike; Webster, Teresa; Mahoney, Lise; Wood, David; Alperin, Elisabeth S; Rosyara, Umesh R; Koehorst-Vanc Putten, Herma; Monfort, Amparo; Sargent, Daniel J; Amaya, Iraida; Denoyes, Beatrice; Bianco, Luca; van Dijk, Thijs; Pirani, Ali; Iezzoni, Amy; Main, Dorrie; Peace, Cameron; Yang, Yilong; Whitaker, Vance; Verma, Sujeet; Bellon, Laurent; Brew, Fiona; Herrera, Raul; van de Weg, Eric
2015-03-07
A high-throughput genotyping platform is needed to enable marker-assisted breeding in the allo-octoploid cultivated strawberry Fragaria × ananassa. Short-read sequences from one diploid and 19 octoploid accessions were aligned to the diploid Fragaria vesca 'Hawaii 4' reference genome to identify single nucleotide polymorphisms (SNPs) and indels for incorporation into a 90 K Affymetrix® Axiom® array. We report the development and preliminary evaluation of this array. About 36 million sequence variants were identified in a 19 member, octoploid germplasm panel. Strategies and filtering pipelines were developed to identify and incorporate markers of several types: di-allelic SNPs (66.6%), multi-allelic SNPs (1.8%), indels (10.1%), and ploidy-reducing "haploSNPs" (11.7%). The remaining SNPs included those discovered in the diploid progenitor F. iinumae (3.9%), and speculative "codon-based" SNPs (5.9%). In genotyping 306 octoploid accessions, SNPs were assigned to six classes with Affymetrix's "SNPolisher" R package. The highest quality classes, PolyHigh Resolution (PHR), No Minor Homozygote (NMH), and Off-Target Variant (OTV) comprised 25%, 38%, and 1% of array markers, respectively. These markers were suitable for genetic studies as demonstrated in the full-sib family 'Holiday' × 'Korona' with the generation of a genetic linkage map consisting of 6,594 PHR SNPs evenly distributed across 28 chromosomes with an average density of approximately one marker per 0.5 cM, thus exceeding our goal of one marker per cM. The Affymetrix IStraw90 Axiom array is the first high-throughput genotyping platform for cultivated strawberry and is commercially available to the worldwide scientific community. The array's high success rate is likely driven by the presence of naturally occurring variation in ploidy level within the nominally octoploid genome, and by effectiveness of the employed array design and ploidy-reducing strategies. 
This array enables genetic analyses including generation of high-density linkage maps, identification of quantitative trait loci for economically important traits, and genome-wide association studies, thus providing a basis for marker-assisted breeding in this high value crop.
Ribeiro, Antonio; Golicz, Agnieszka; Hackett, Christine Anne; Milne, Iain; Stephen, Gordon; Marshall, David; Flavell, Andrew J; Bayer, Micha
2015-11-11
Single Nucleotide Polymorphisms (SNPs) are widely used molecular markers, and their use has increased massively since the inception of Next Generation Sequencing (NGS) technologies, which allow detection of large numbers of SNPs at low cost. However, both NGS data and their analysis are error-prone, which can lead to the generation of false positive (FP) SNPs. We explored the relationship between FP SNPs and seven factors involved in mapping-based variant calling - quality of the reference sequence, read length, choice of mapper and variant caller, mapping stringency and filtering of SNPs by read mapping quality and read depth. This resulted in 576 possible factor level combinations. We used error- and variant-free simulated reads to ensure that every SNP found was indeed a false positive. The variation in the number of FP SNPs generated ranged from 0 to 36,621 for the 120 million base pairs (Mbp) genome. All of the experimental factors tested had statistically significant effects on the number of FP SNPs generated and there was a considerable amount of interaction between the different factors. Using a fragmented reference sequence led to a dramatic increase in the number of FP SNPs generated, as did relaxed read mapping and a lack of SNP filtering. The choice of reference assembler, mapper and variant caller also significantly affected the outcome. The effect of read length was more complex and suggests a possible interaction between mapping specificity and the potential for contributing more false positives as read length increases. The choice of tools and parameters involved in variant calling can have a dramatic effect on the number of FP SNPs produced, with particularly poor combinations of software and/or parameter settings yielding tens of thousands in this experiment. Between-factor interactions make simple recommendations difficult for a SNP discovery pipeline but the quality of the reference sequence is clearly of paramount importance. 
Our findings are also a stark reminder that it can be unwise to use the relaxed mismatch settings provided as defaults by some read mappers when reads are being mapped to a relatively unfinished reference sequence from e.g. a non-model organism in its early stages of genomic exploration.
Coassin, Stefan; Friedel, Salome; Köttgen, Anna; Lamina, Claudia; Kronenberg, Florian
2016-11-01
A recent observational study with almost 2 million men reported an association between low high-density lipoprotein (HDL) cholesterol and worse kidney function. The causality of this association would be strongly supported if genetic variants associated with HDL cholesterol were also associated with kidney function. We used 68 genetic variants (single-nucleotide polymorphisms [SNPs]) associated with HDL cholesterol in genome-wide association studies including >188 000 subjects and tested their association with estimated glomerular filtration rate (eGFR) using summary statistics from another genome-wide association study meta-analysis of kidney function including ≤133 413 subjects. Fourteen of the 68 SNPs (21%) had a P value <0.05 compared with the 5% expected by chance (binomial test P = 5.8 × 10^−6). After Bonferroni correction, 6 SNPs were still significantly associated with eGFR. The genetic variants with the strongest associations with HDL cholesterol concentrations were not the same as those with the strongest association with kidney function and vice versa. An evaluation of pleiotropy indicated that the effects of the HDL-associated SNPs on eGFR were not mediated by HDL cholesterol. In addition, we performed a Mendelian randomization analysis. This analysis revealed a positive but nonsignificant causal effect of HDL cholesterol-increasing variants on eGFR. In summary, our findings indicate that HDL cholesterol does not causally influence eGFR and propose pleiotropic effects on eGFR for some HDL cholesterol-associated SNPs. This may cause the observed association by mechanisms other than the mere HDL cholesterol concentration. © 2016 The Authors.
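The enrichment reported above (14 of 68 SNPs with P < 0.05 against 5% expected by chance) can be checked with a one-sided binomial tail probability. The Python sketch below assumes the test was one-sided and treats the 68 SNPs as independent. The abstract does not state either point, so both are assumptions.

```python
from math import comb


def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), summed exactly over the upper tail."""
    return sum(comb(n, i) * p ** i * (1.0 - p) ** (n - i) for i in range(k, n + 1))


if __name__ == "__main__":
    # 14 or more nominally significant SNPs out of 68 when 5% are expected by chance.
    print(binomial_upper_tail(14, 68, 0.05))
```

Under these assumptions the tail probability should come out on the order of 10^−6, the same order as the reported binomial test result.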
High throughput SNP discovery and genotyping in hexaploid wheat
Navarro, Julien; Kitt, Jonathan; Choulet, Frédéric; Leveugle, Magalie; Duarte, Jorge; Rivière, Nathalie; Eversole, Kellye; Le Gouis, Jacques; Davassi, Alessandro; Balfourier, François; Le Paslier, Marie-Christine; Berard, Aurélie; Brunel, Dominique; Feuillet, Catherine; Poncet, Charles; Sourdille, Pierre
2018-01-01
Because of their abundance and their amenability to high-throughput genotyping techniques, Single Nucleotide Polymorphisms (SNPs) are powerful tools for efficient genetics and genomics studies, including characterization of genetic resources, genome-wide association studies and genomic selection. In wheat, most of the previous SNP discovery initiatives targeted the coding fraction, leaving almost 98% of the wheat genome largely unexploited. Here we report on the use of whole-genome resequencing data from eight wheat lines to mine for SNPs in the genic, the repetitive and non-repetitive intergenic fractions of the wheat genome. Eventually, we identified 3.3 million SNPs, 49% being located on the B-genome, 41% on the A-genome and 10% on the D-genome. We also describe the development of the TaBW280K high-throughput genotyping array containing 280,226 SNPs. Performance of this chip was examined by genotyping a set of 96 wheat accessions representing the worldwide diversity. Sixty-nine percent of the SNPs can be efficiently scored, half of them showing a diploid-like clustering. The TaBW280K was proven to be a very efficient tool for diversity analyses, as well as for breeding as it can discriminate between closely related elite varieties. Finally, the TaBW280K array was used to genotype a population derived from a cross between Chinese Spring and Renan, leading to the construction of a dense genetic map comprising 83,721 markers. The results described here will provide the wheat community with powerful tools for both basic and applied research. PMID:29293495
Trifonova, E A; Eremina, E R; Urnov, F D; Stepanov, V A
2012-01-01
The structure of the haplotypes and linkage disequilibrium (LD) of the methylenetetrahydrofolate reductase gene (MTHFR) in 9 population groups from Northern Eurasia and populations of the international HapMap project was investigated in the present study. The data suggest that the architecture of LD in the human genome is largely determined by the evolutionary history of populations; however, the results of phylogenetic and haplotype analyses seem to suggest that in fact there may be a common "old" mechanism for the formation of certain patterns of LD. Variability in the structure of LD and the level of diversity of MTHFR haplotypes cause a certain set of tagSNPs with an established prognostic significance for each population. In our opinion, the results obtained in the present study are of considerable interest for understanding multiple genetic phenomena: namely, the association of interpopulation differences in the patterns of LD with structures possessing a genetic susceptibility to complex diseases, and the functional significance of the pleiotropic MTHFR gene effect. Summarizing the results of this study, a conclusion can be made that the genetic variability analysis with emphasis on the structure of LD in human populations is a powerful tool that can make a significant contribution to such areas of biomedical science as human evolutionary biology, functional genomics, genetics of complex diseases, and pharmacogenomics.
1000 Genomes-based meta-analysis identifies 10 novel loci for kidney function
Gorski, Mathias; van der Most, Peter J.; Teumer, Alexander; Chu, Audrey Y.; Li, Man; Mijatovic, Vladan; Nolte, Ilja M.; Cocca, Massimiliano; Taliun, Daniel; Gomez, Felicia; Li, Yong; Tayo, Bamidele; Tin, Adrienne; Feitosa, Mary F.; Aspelund, Thor; Attia, John; Biffar, Reiner; Bochud, Murielle; Boerwinkle, Eric; Borecki, Ingrid; Bottinger, Erwin P.; Chen, Ming-Huei; Chouraki, Vincent; Ciullo, Marina; Coresh, Josef; Cornelis, Marilyn C.; Curhan, Gary C.; d’Adamo, Adamo Pio; Dehghan, Abbas; Dengler, Laura; Ding, Jingzhong; Eiriksdottir, Gudny; Endlich, Karlhans; Enroth, Stefan; Esko, Tõnu; Franco, Oscar H.; Gasparini, Paolo; Gieger, Christian; Girotto, Giorgia; Gottesman, Omri; Gudnason, Vilmundur; Gyllensten, Ulf; Hancock, Stephen J.; Harris, Tamara B.; Helmer, Catherine; Höllerer, Simon; Hofer, Edith; Hofman, Albert; Holliday, Elizabeth G.; Homuth, Georg; Hu, Frank B.; Huth, Cornelia; Hutri-Kähönen, Nina; Hwang, Shih-Jen; Imboden, Medea; Johansson, Åsa; Kähönen, Mika; König, Wolfgang; Kramer, Holly; Krämer, Bernhard K.; Kumar, Ashish; Kutalik, Zoltan; Lambert, Jean-Charles; Launer, Lenore J.; Lehtimäki, Terho; de Borst, Martin; Navis, Gerjan; Swertz, Morris; Liu, Yongmei; Lohman, Kurt; Loos, Ruth J. F.; Lu, Yingchang; Lyytikäinen, Leo-Pekka; McEvoy, Mark A.; Meisinger, Christa; Meitinger, Thomas; Metspalu, Andres; Metzger, Marie; Mihailov, Evelin; Mitchell, Paul; Nauck, Matthias; Oldehinkel, Albertine J.; Olden, Matthias; WJH Penninx, Brenda; Pistis, Giorgio; Pramstaller, Peter P.; Probst-Hensch, Nicole; Raitakari, Olli T.; Rettig, Rainer; Ridker, Paul M.; Rivadeneira, Fernando; Robino, Antonietta; Rosas, Sylvia E.; Ruderfer, Douglas; Ruggiero, Daniela; Saba, Yasaman; Sala, Cinzia; Schmidt, Helena; Schmidt, Reinhold; Scott, Rodney J.; Sedaghat, Sanaz; Smith, Albert V.; Sorice, Rossella; Stengel, Benedicte; Stracke, Sylvia; Strauch, Konstantin; Toniolo, Daniela; Uitterlinden, Andre G.; Ulivi, Sheila; Viikari, Jorma S.; Völker, Uwe; Vollenweider, Peter; Völzke, Henry; Vuckovic, Dragana; Waldenberger, Melanie; Jin Wang, Jie; Yang, Qiong; Chasman, Daniel I.; Tromp, Gerard; Snieder, Harold; Heid, Iris M.; Fox, Caroline S.; Köttgen, Anna; Pattaro, Cristian; Böger, Carsten A.; Fuchsberger, Christian
2017-01-01
HapMap imputed genome-wide association studies (GWAS) have revealed >50 loci at which common variants with minor allele frequency >5% are associated with kidney function. GWAS using more complete reference sets for imputation, such as those from The 1000 Genomes project, promise to identify novel loci that have been missed by previous efforts. To investigate the value of such a more complete variant catalog, we conducted a GWAS meta-analysis of kidney function based on the estimated glomerular filtration rate (eGFR) in 110,517 European ancestry participants using 1000 Genomes imputed data. We identified 10 novel loci with p-value < 5 × 10−8 previously missed by HapMap-based GWAS. Six of these loci (HOXD8, ARL15, PIK3R1, EYA4, ASTN2, and EPB41L3) are tagged by common SNPs unique to the 1000 Genomes reference panel. Using pathway analysis, we identified 39 significant (FDR < 0.05) genes and 127 significantly (FDR < 0.05) enriched gene sets, which were missed by our previous analyses. Among those, the 10 identified novel genes are part of pathways of kidney development, carbohydrate metabolism, cardiac septum development and glucose metabolism. These results highlight the utility of re-imputing from denser reference panels, until whole-genome sequencing becomes feasible in large samples. PMID:28452372
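The abstract above relies on two different significance rules: a fixed genome-wide threshold (p < 5 × 10−8) for single variants and a false discovery rate cutoff (FDR < 0.05) for genes and gene sets. The sketch below illustrates both in Python. It assumes the Benjamini-Hochberg step-up procedure for FDR control, which the abstract does not name, and the function names are hypothetical.

```python
GENOME_WIDE_THRESHOLD = 5e-8  # conventional GWAS cutoff quoted in the abstract


def genome_wide_significant(p_value):
    """True if a single-variant p-value passes the fixed genome-wide threshold."""
    return p_value < GENOME_WIDE_THRESHOLD


def benjamini_hochberg(p_values, alpha=0.05):
    """Return one boolean per input p-value: True where the test is declared
    significant at FDR level alpha (Benjamini-Hochberg step-up procedure;
    an assumption, the abstract only says 'FDR < 0.05')."""
    m = len(p_values)
    # Indices of the tests, ordered by ascending p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Largest 1-based rank k with p_(k) <= (k / m) * alpha.
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_k = rank
    # Every test ranked at or below max_k is declared significant.
    significant = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_k:
            significant[idx] = True
    return significant
```

Because the procedure is step-up, a p-value that misses its own rank threshold can still be declared significant when a larger p-value further down the ranking passes its threshold.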
1000 Genomes-based meta-analysis identifies 10 novel loci for kidney function.
Gorski, Mathias; van der Most, Peter J; Teumer, Alexander; Chu, Audrey Y; Li, Man; Mijatovic, Vladan; Nolte, Ilja M; Cocca, Massimiliano; Taliun, Daniel; Gomez, Felicia; Li, Yong; Tayo, Bamidele; Tin, Adrienne; Feitosa, Mary F; Aspelund, Thor; Attia, John; Biffar, Reiner; Bochud, Murielle; Boerwinkle, Eric; Borecki, Ingrid; Bottinger, Erwin P; Chen, Ming-Huei; Chouraki, Vincent; Ciullo, Marina; Coresh, Josef; Cornelis, Marilyn C; Curhan, Gary C; d'Adamo, Adamo Pio; Dehghan, Abbas; Dengler, Laura; Ding, Jingzhong; Eiriksdottir, Gudny; Endlich, Karlhans; Enroth, Stefan; Esko, Tõnu; Franco, Oscar H; Gasparini, Paolo; Gieger, Christian; Girotto, Giorgia; Gottesman, Omri; Gudnason, Vilmundur; Gyllensten, Ulf; Hancock, Stephen J; Harris, Tamara B; Helmer, Catherine; Höllerer, Simon; Hofer, Edith; Hofman, Albert; Holliday, Elizabeth G; Homuth, Georg; Hu, Frank B; Huth, Cornelia; Hutri-Kähönen, Nina; Hwang, Shih-Jen; Imboden, Medea; Johansson, Åsa; Kähönen, Mika; König, Wolfgang; Kramer, Holly; Krämer, Bernhard K; Kumar, Ashish; Kutalik, Zoltan; Lambert, Jean-Charles; Launer, Lenore J; Lehtimäki, Terho; de Borst, Martin; Navis, Gerjan; Swertz, Morris; Liu, Yongmei; Lohman, Kurt; Loos, Ruth J F; Lu, Yingchang; Lyytikäinen, Leo-Pekka; McEvoy, Mark A; Meisinger, Christa; Meitinger, Thomas; Metspalu, Andres; Metzger, Marie; Mihailov, Evelin; Mitchell, Paul; Nauck, Matthias; Oldehinkel, Albertine J; Olden, Matthias; Wjh Penninx, Brenda; Pistis, Giorgio; Pramstaller, Peter P; Probst-Hensch, Nicole; Raitakari, Olli T; Rettig, Rainer; Ridker, Paul M; Rivadeneira, Fernando; Robino, Antonietta; Rosas, Sylvia E; Ruderfer, Douglas; Ruggiero, Daniela; Saba, Yasaman; Sala, Cinzia; Schmidt, Helena; Schmidt, Reinhold; Scott, Rodney J; Sedaghat, Sanaz; Smith, Albert V; Sorice, Rossella; Stengel, Benedicte; Stracke, Sylvia; Strauch, Konstantin; Toniolo, Daniela; Uitterlinden, Andre G; Ulivi, Sheila; Viikari, Jorma S; Völker, Uwe; Vollenweider, Peter; Völzke, Henry; Vuckovic, Dragana; 
Waldenberger, Melanie; Jin Wang, Jie; Yang, Qiong; Chasman, Daniel I; Tromp, Gerard; Snieder, Harold; Heid, Iris M; Fox, Caroline S; Köttgen, Anna; Pattaro, Cristian; Böger, Carsten A; Fuchsberger, Christian
2017-04-28
HapMap imputed genome-wide association studies (GWAS) have revealed >50 loci at which common variants with minor allele frequency >5% are associated with kidney function. GWAS using more complete reference sets for imputation, such as those from The 1000 Genomes project, promise to identify novel loci that have been missed by previous efforts. To investigate the value of such a more complete variant catalog, we conducted a GWAS meta-analysis of kidney function based on the estimated glomerular filtration rate (eGFR) in 110,517 European ancestry participants using 1000 Genomes imputed data. We identified 10 novel loci with p-value < 5 × 10−8 previously missed by HapMap-based GWAS. Six of these loci (HOXD8, ARL15, PIK3R1, EYA4, ASTN2, and EPB41L3) are tagged by common SNPs unique to the 1000 Genomes reference panel. Using pathway analysis, we identified 39 significant (FDR < 0.05) genes and 127 significantly (FDR < 0.05) enriched gene sets, which were missed by our previous analyses. Among those, the 10 identified novel genes are part of pathways of kidney development, carbohydrate metabolism, cardiac septum development and glucose metabolism. These results highlight the utility of re-imputing from denser reference panels, until whole-genome sequencing becomes feasible in large samples.
Feng, Xing-Ling; Sun, Qi-Fan; Liu, Hong; Wei, Yi-Liang; DU, Wei-An; Li, Cai-Xia; Chen, Ling; Liu, Chao
2016-04-20
To validate the efficiency of a 27-plex single nucleotide polymorphism (SNP) multiplex system for ancestry inference. The 27-plex SNP system was validated for its sensitivity and species specificity. A total of 533 samples were collected from African, Southern Chinese Han, China's ethnic minorities (Yi, Hui, Miao, Tibet, and Uygur), European, Central Asian, Western Asian, Southern Asian, Southeast Asian and South American populations for clustering analysis of the genotypes, using 3 representative continental ancestral groups [East Asia (CHB), Europe (CEU), and Africa (YRI)] from the HapMap database as references. The system sensitivity is 0.125 ng. Twenty and six genotypes were detected in chimpanzee and monkeys, respectively. Except at rs10496971, no products were found in other animals. The system was capable of differentiating intercontinental populations but not of distinguishing between East Asian and Southeast Asian populations or between the Southern Chinese Han population and Chinese ethnic populations (Hui, Miao, Yi and Tibet). This system achieved 100% accuracy for intercontinental population source inference for 46 blind test samples. The 27-plex SNP multiplex system has high sensitivity and species specificity and can correctly differentiate the ancestry origins of individuals from African, European and East Asian populations for criminal case investigation. However, this system is not capable of distinguishing subpopulation groups, and more specific ancestry-informative markers are needed to improve its recognition of Southeast Asian and Chinese ethnic populations.
GENOME-WIDE GENE-SODIUM INTERACTION ANALYSES ON BLOOD PRESSURE: THE GENSALT STUDY
Li, Changwei; He, Jiang; Chen, Jing; Zhao, Jinying; Gu, Dongfeng; Hixson, James E.; Rao, Dabeeru C.; Jaquish, Cashell E.; Gu, Charles C.; Chen, Jichun; Huang, Jianfeng; Chen, Shufeng; Kelly, Tanika N.
2016-01-01
We performed genome-wide analyses to identify genomic loci that interact with sodium to influence blood pressure (BP) using single marker (one and two degree-of-freedom joint tests) and gene-based tests among 1,876 Chinese participants of the Genetic Epidemiology Network of Salt-Sensitivity (GenSalt) study. Among GenSalt participants, the average of three urine samples was used to estimate sodium excretion. Nine BP measurements were taken using a random-zero sphygmomanometer. A total of 2.05 million SNPs were imputed using Affymetrix 6.0 genotype data and the Chinese Han of Beijing and Japanese of Tokyo HapMap reference panel. Promising findings (P<1.00×10−4) from GenSalt were evaluated for replication among 775 Chinese participants of the Multi-ethnic Study of Atherosclerosis (MESA). SNP and gene-based results were meta-analyzed across the GenSalt and MESA studies to determine genome-wide significance. The one degree-of-freedom tests identified interactions for UST rs13211840 on diastolic BP (P=3.13×10−9). The two degree-of-freedom tests additionally identified associations for CLGN rs2567241 (P=3.90×10−12) and LOC105369882 rs11104632 (P=4.51×10−8) with systolic BP. The CLGN variant rs2567241 was also associated with diastolic BP (P=3.11×10−22) and mean arterial pressure (P=2.86×10−15). Genome-wide gene-based analysis identified MKNK1 (P=6.70×10−7), C2orf80 (P<1.00×10−12), EPHA6 (P=2.88×10−7), SCOC-AS1 (P=4.35×10−14), SCOC (P=6.46×10−11), CLGN (P=3.68×10−13), MGAT4D (P=4.73×10−11), ARHGAP42 (P<1.00×10−12), CASP4 (P=1.31×10−8), and LINC01478 (P=6.75×10−10) that were associated with at least one BP phenotype. In summary, we identified 8 novel and 1 previously reported BP loci through the examination of SNP and gene-based interactions with sodium. PMID:27271309
Fox, Caroline S; Liu, Yongmei; White, Charles C; Feitosa, Mary; Smith, Albert V; Heard-Costa, Nancy; Lohman, Kurt; Johnson, Andrew D; Foster, Meredith C; Greenawalt, Danielle M; Griffin, Paula; Ding, Jinghong; Newman, Anne B; Tylavsky, Fran; Miljkovic, Iva; Kritchevsky, Stephen B; Launer, Lenore; Garcia, Melissa; Eiriksdottir, Gudny; Carr, J Jeffrey; Gudnason, Vilmunder; Harris, Tamara B; Cupples, L Adrienne; Borecki, Ingrid B
2012-01-01
Body fat distribution, particularly centralized obesity, is associated with metabolic risk above and beyond total adiposity. We performed genome-wide association of abdominal adipose depots quantified using computed tomography (CT) to uncover novel loci for body fat distribution among participants of European ancestry. Subcutaneous and visceral fat were quantified in 5,560 women and 4,997 men from 4 population-based studies. Genome-wide genotyping was performed using standard arrays and imputed to ~2.5 million HapMap SNPs. Each study performed a genome-wide association analysis of subcutaneous adipose tissue (SAT), visceral adipose tissue (VAT), VAT adjusted for body mass index, and VAT/SAT ratio (a metric of the propensity to store fat viscerally as compared to subcutaneously) in the overall sample and in women and men separately. A weighted z-score meta-analysis was conducted. For the VAT/SAT ratio, our most significant p-value was rs11118316 at LYPLAL1 gene (p = 3.1 × 10−9), previously identified in association with waist-hip ratio. For SAT, the most significant SNP was in the FTO gene (p = 5.9 × 10−8). Given the known gender differences in body fat distribution, we performed sex-specific analyses. Our most significant finding was for VAT in women, rs1659258 near THNSL2 (p = 1.6 × 10−8), but not men (p = 0.75). Validation of this SNP in the GIANT consortium data demonstrated a similar sex-specific pattern, with observed significance in women (p = 0.006) but not men (p = 0.24) for BMI and waist circumference (p = 0.04 [women], p = 0.49 [men]). Finally, we interrogated our data for the 14 recently published loci for body fat distribution (measured by waist-hip ratio adjusted for BMI); associations were observed at 7 of these loci. In contrast, we observed associations at only 7/32 loci previously identified in association with BMI; the majority of overlap was observed with SAT.
Genome-wide association for visceral and subcutaneous fat revealed a SNP for VAT in women. More refined phenotypes for body composition and fat distribution can detect new loci not previously uncovered in large-scale GWAS of anthropometric traits.
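The abstract above states that a weighted z-score meta-analysis was conducted across the 4 studies but does not give the weights. A minimal Python sketch of one common variant follows, assuming each study's z-score is weighted by the square root of its sample size; the function name and that weighting are illustrative assumptions, not the authors' documented method.

```python
import math


def weighted_z_meta(z_scores, sample_sizes):
    """Combine per-study z-scores into one meta-analysis z-score.

    Assumes weights w_i = sqrt(n_i) (sample-size weighting), so that
    Z = sum(w_i * z_i) / sqrt(sum(w_i ** 2)).
    Returns (Z, two-sided p-value under the standard normal).
    """
    if len(z_scores) != len(sample_sizes) or not z_scores:
        raise ValueError("need one sample size per z-score")
    weights = [math.sqrt(n) for n in sample_sizes]
    numerator = sum(w * z for w, z in zip(weights, z_scores))
    denominator = math.sqrt(sum(w * w for w in weights))
    z_meta = numerator / denominator
    # Two-sided tail probability of a standard normal: erfc(|Z| / sqrt(2)).
    p_value = math.erfc(abs(z_meta) / math.sqrt(2))
    return z_meta, p_value
```

The sign of each z-score must refer to the same effect allele in every study; opposite-direction effects of equal weight cancel to Z = 0.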
Ain, Qurat-ul; Rasheed, Awais; Anwar, Alia; Mahmood, Tariq; Imtiaz, Muhammad; Mahmood, Tariq; Xia, Xianchun; He, Zhonghu; Quraishi, Umar M.
2015-01-01
Genome-wide association studies (GWAS) were undertaken to identify SNP markers associated with yield and yield-related traits in 123 Pakistani historical wheat cultivars evaluated during 2011–2014 seasons under rainfed field conditions. The population was genotyped by using high-density Illumina iSelect 90K single nucleotide polymorphism (SNP) assay, and finally 14,960 high quality SNPs were used in GWAS. Population structure examined using 1000 unlinked markers identified seven subpopulations (K = 7) that were representative of different breeding programs in Pakistan, in addition to local landraces. Forty-four stable marker-trait associations (MTAs) with -log p > 4 were identified for nine yield-related traits. Nine multi-trait MTAs were found on chromosomes 1AL, 1BS, 2AL, 2BS, 2BL, 4BL, 5BL, 6AL, and 6BL, and those on 5BL and 6AL were stable across two seasons. Gene annotation and synteny identified that 14 trait-associated SNPs were linked to genes having significant importance in plant development. Favorable alleles for days to heading (DH), plant height (PH), thousand grain weight (TGW), and grain yield (GY) showed minor additive effects and their frequencies were slightly higher in cultivars released after 2000. However, no selection pressure on any favorable allele was identified. These genomic regions identified have historically contributed to achieve yield gains from 2.63 million tons in 1947 to 25.7 million tons in 2015. Future breeding strategies can be devised to initiate marker assisted breeding to accumulate these favorable alleles of SNPs associated with yield-related traits to increase grain yield. Additionally, in silico identification of 454-contigs corresponding to MTAs will facilitate fine mapping and subsequent cloning of candidate genes and functional marker development. PMID:26442056
Zohra, Rozi; Song, M S; Iliham, Nizam; Dolikun, Mamatyusupu
2016-08-16
To investigate the characteristics of genetic recombination hotspots and linkage disequilibrium (LD) patterns in the peroxisome proliferator-activated receptor gamma (PPARG) gene in Kirgiz and Uyghur ethnic groups. Blood samples were collected from 100 Kirgiz (50 healthy controls and 50 patients with type 2 diabetes mellitus) residents in Halajun County, Artux City, Kizilsu Kirgiz Autonomous Prefecture, Xinjiang in August 2013, and 50 healthy Uyghur residents in Hotan Prefecture of Xinjiang Uygur Autonomous Region in May 2012. Thirty-one tagSNPs in the PPARG gene were genotyped using the Matrix-Assisted Laser Desorption/Ionization Time of Flight Mass Spectrometry (MALDI-TOF-MS) method. The recombination hotspots and LD patterns within the PPARG gene were estimated by analyzing the SNP genotyping data using the Hotspot Fisher program and Haploview software, respectively. Eighteen tagSNPs (rs1151999, rs1175540, rs1875796, rs1899951, rs2292101, rs2921190, rs2938397, rs2959272, rs2959273, rs2972162, rs3856806, rs4135247, rs4135275, rs709151, rs4135354, rs6805419, rs17036700 and rs4135304) showed the same relatively high recombination rates in Kirgiz patients with type 2 diabetes mellitus (T2DM), Kirgiz healthy controls, and Uyghur healthy controls. Five haplotype blocks with an LD coefficient D' value of 1, indicating that no genetic recombination occurred within the region, were observed in the healthy controls of the Kirgiz ethnic group, whereas five haplotype blocks with an LD coefficient D' value less than 1 were observed in the Kirgiz patients with T2DM, indicating that historical recombination events occurred within the region. Four haplotype blocks with an LD coefficient D' value of 1 were observed in the Uyghur healthy controls, indicating that no genetic recombination occurred within the region. There were significantly different recombination hotspot profiles between the Kirgiz, Uyghur, Utah residents with Northern and Western European ancestry (CEU), Yoruba in Ibadan, Nigeria (YRI), Han Chinese in Beijing (CHB) and Japanese in Tokyo (JPT) samples. There are six recombination hotspots in the HapMap profile of genetic recombination. The last 5 SNPs within the PPARG gene showed lower recombination rates in the Kirgiz, whereas no recombination hotspot was found in the Uyghur. Variable recombination rates may be present in certain chromosome regions between patients and healthy controls within the same ethnic group or between different ethnic groups. There may be ethnicity-specific recombination hotspots with variable recombination rates.
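The abstract above interprets an LD coefficient D' of 1 as no evidence of historical recombination between two SNPs. A short Python sketch of how D' is computed for one SNP pair follows. It assumes phased two-locus haplotype counts are already available (the study used Haploview, which estimates haplotype frequencies from genotypes), and the function name is hypothetical.

```python
def d_prime(n_AB, n_Ab, n_aB, n_ab):
    """Normalized LD coefficient |D'| for two biallelic loci.

    Inputs are counts of the four haplotypes (alleles A/a at the first
    locus, B/b at the second). Assumes phased haplotype counts.
    """
    total = n_AB + n_Ab + n_aB + n_ab
    if total == 0:
        raise ValueError("no haplotypes supplied")
    p_AB = n_AB / total
    p_A = (n_AB + n_Ab) / total
    p_B = (n_AB + n_aB) / total
    # Raw disequilibrium: observed minus expected AB haplotype frequency.
    d = p_AB - p_A * p_B
    # Theoretical maximum of |D| given the allele frequencies.
    if d >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    if d_max == 0:
        return 0.0  # at least one locus is monomorphic; D' is undefined
    return abs(d) / d_max
```

D' reaches 1 whenever at least one of the four possible haplotypes is absent, which is why it is read as a lack of detectable recombination between the two sites.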
A global reference for human genetic variation
2016-01-01
The 1000 Genomes Project set out to provide a comprehensive description of common human genetic variation by applying whole-genome sequencing to a diverse set of individuals from multiple populations. Here we report completion of the project, having reconstructed the genomes of 2,504 individuals from 26 populations using a combination of low-coverage whole-genome sequencing, deep exome sequencing, and dense microarray genotyping. We characterized a broad spectrum of genetic variation, in total over 88 million variants (84.7 million single nucleotide polymorphisms (SNPs), 3.6 million short insertions/deletions (indels), and 60,000 structural variants), all phased onto high-quality haplotypes. This resource includes >99% of SNP variants with a frequency of >1% for a variety of ancestries. We describe the distribution of genetic variation across the global sample, and discuss the implications for common disease studies. PMID:26432245
A remark on copy number variation detection methods.
Li, Shuo; Dou, Xialiang; Gao, Ruiqi; Ge, Xinzhou; Qian, Minping; Wan, Lin
2018-01-01
Copy number variations (CNVs) are gains and losses of DNA sequence in a genome. High-throughput platforms such as microarrays and next-generation sequencing (NGS) technologies have been applied to detect genome-wide copy number losses. Although progress has been made in both approaches, the accuracy and consistency of CNV calling from the two platforms remain in dispute. In this study, we perform a deep analysis of copy number losses in 254 human DNA samples, which have both SNP microarray data and NGS data publicly available from the HapMap Project and the 1000 Genomes Project, respectively. We show that the copy number losses reported by the HapMap Project and the 1000 Genomes Project have < 30% overlap, even though these reports were required by their corresponding projects to have cross-platform (e.g. PCR, microarray and high-throughput sequencing) experimental support and state-of-the-art calling methods were employed. On the other hand, copy number losses were called directly from HapMap microarray data by an accurate algorithm, CNVhac; almost all of these have lower read mapping depth in NGS data, and 88% of them are supported by sequences with breakpoints in NGS data. Our results suggest that microarrays are capable of calling CNVs and that the unnecessary requirement of additional cross-platform support may introduce false negatives. The inconsistency of CNV reports from the HapMap Project and the 1000 Genomes Project might result from inadequate information contained in microarray data, inconsistent detection criteria, or the filtering effect of cross-platform support. Statistical tests on CNVs called by CNVhac show that microarray data can offer reliable CNV reports, and that the majority of CNV candidates can be confirmed by raw sequences. Therefore, CNV candidates given by a good caller can be highly reliable without cross-platform support, so additional experimental information should be applied when needed rather than as a requirement.
Efficient analysis of large-scale genome-wide data with two R packages: bigstatsr and bigsnpr.
Privé, Florian; Aschard, Hugues; Ziyatdinov, Andrey; Blum, Michael G B
2017-03-30
Genome-wide datasets produced for association studies have dramatically increased in size over the past few years, with modern datasets commonly including millions of variants measured in tens of thousands of individuals. This increase in data size is a major challenge severely slowing down genomic analyses, leading to some software becoming obsolete and researchers having limited access to diverse analysis tools. Here we present two R packages, bigstatsr and bigsnpr, allowing for the analysis of large scale genomic data to be performed within R. To address large data size, the packages use memory-mapping for accessing data matrices stored on disk instead of in RAM. To perform data pre-processing and data analysis, the packages integrate most of the tools that are commonly used, either through transparent system calls to existing software, or through updated or improved implementation of existing methods. In particular, the packages implement fast and accurate computations of principal component analysis and association studies, functions to remove SNPs in linkage disequilibrium and algorithms to learn polygenic risk scores on millions of SNPs. We illustrate applications of the two R packages by analyzing a case-control genomic dataset for celiac disease, performing an association study and computing Polygenic Risk Scores. Finally, we demonstrate the scalability of the R packages by analyzing a simulated genome-wide dataset including 500,000 individuals and 1 million markers on a single desktop computer. Availability: https://privefl.github.io/bigstatsr/ & https://privefl.github.io/bigsnpr/. Contact: florian.prive@univ-grenoble-alpes.fr & michael.blum@univ-grenoble-alpes.fr. Supplementary materials are available at Bioinformatics online.
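The memory-mapping idea described in the abstract above can be sketched with Python's standard library. This is not the bigstatsr or bigsnpr API. It is a generic illustration that assumes a genotype matrix stored on disk as one unsigned byte per entry in column-major order (one SNP per column), and both function names are hypothetical.

```python
import mmap
import os
import tempfile


def write_demo_matrix(path, columns):
    """Write a small matrix to disk, one byte per entry, column after column.
    (Illustrative helper; real genotype files use their own formats.)"""
    with open(path, "wb") as fh:
        for col in columns:
            fh.write(bytes(col))


def column_means(path, n_rows, n_cols):
    """Mean of each column of an on-disk n_rows x n_cols uint8 matrix.

    The file is memory-mapped, so the operating system pages data in on
    demand and only one column at a time is copied into Python memory.
    """
    means = []
    with open(path, "rb") as fh:
        with mmap.mmap(fh.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            for j in range(n_cols):
                start = j * n_rows
                col = mm[start:start + n_rows]  # bytes for column j only
                means.append(sum(col) / n_rows)
    return means
```

With genotypes coded 0, 1 or 2, dividing each column mean by 2 gives the allele frequency of that SNP, and the same column-at-a-time access pattern keeps memory use independent of the number of SNPs.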
Livingstone, Donald; Stack, Conrad; Mustiga, Guiliana M; Rodezno, Dayana C; Suarez, Carmen; Amores, Freddy; Feltus, Frank A; Mockaitis, Keithanne; Cornejo, Omar E; Motamayor, Juan C
2017-01-01
Cacao (Theobroma cacao L.) is an important cash crop in tropical regions around the world and has a rich agronomic history in South America. As a key component in the cosmetic and confectionery industries, millions of people worldwide use products made from cacao, ranging from shampoo to chocolate. An Illumina Infinium II array was created using 13,530 SNPs identified within a small diversity panel of cacao. Of these SNPs, 12,643 derive from variation within annotated cacao genes. The genotypes of 3,072 trees were obtained, including two mapping populations from Ecuador. High-density linkage maps for these two populations were generated and compared to the cacao genome assembly. Phenotypic data from these populations were combined with the linkage maps to identify the QTLs for yield and disease resistance.
Association analyses of 249,796 individuals reveal eighteen new loci associated with body mass index
Speliotes, Elizabeth K.; Willer, Cristen J.; Berndt, Sonja I.; Monda, Keri L.; Thorleifsson, Gudmar; Jackson, Anne U.; Allen, Hana Lango; Lindgren, Cecilia M.; Luan, Jian’an; Mägi, Reedik; Randall, Joshua C.; Vedantam, Sailaja; Winkler, Thomas W.; Qi, Lu; Workalemahu, Tsegaselassie; Heid, Iris M.; Steinthorsdottir, Valgerdur; Stringham, Heather M.; Weedon, Michael N.; Wheeler, Eleanor; Wood, Andrew R.; Ferreira, Teresa; Weyant, Robert J.; Segré, Ayellet V.; Estrada, Karol; Liang, Liming; Nemesh, James; Park, Ju-Hyun; Gustafsson, Stefan; Kilpeläinen, Tuomas O.; Yang, Jian; Bouatia-Naji, Nabila; Esko, Tõnu; Feitosa, Mary F.; Kutalik, Zoltán; Mangino, Massimo; Raychaudhuri, Soumya; Scherag, Andre; Smith, Albert Vernon; Welch, Ryan; Zhao, Jing Hua; Aben, Katja K.; Absher, Devin M.; Amin, Najaf; Dixon, Anna L.; Fisher, Eva; Glazer, Nicole L.; Goddard, Michael E.; Heard-Costa, Nancy L.; Hoesel, Volker; Hottenga, Jouke-Jan; Johansson, Åsa; Johnson, Toby; Ketkar, Shamika; Lamina, Claudia; Li, Shengxu; Moffatt, Miriam F.; Myers, Richard H.; Narisu, Narisu; Perry, John R.B.; Peters, Marjolein J.; Preuss, Michael; Ripatti, Samuli; Rivadeneira, Fernando; Sandholt, Camilla; Scott, Laura J.; Timpson, Nicholas J.; Tyrer, Jonathan P.; van Wingerden, Sophie; Watanabe, Richard M.; White, Charles C.; Wiklund, Fredrik; Barlassina, Christina; Chasman, Daniel I.; Cooper, Matthew N.; Jansson, John-Olov; Lawrence, Robert W.; Pellikka, Niina; Prokopenko, Inga; Shi, Jianxin; Thiering, Elisabeth; Alavere, Helene; Alibrandi, Maria T. 
S.; Almgren, Peter; Arnold, Alice M.; Aspelund, Thor; Atwood, Larry D.; Balkau, Beverley; Balmforth, Anthony J.; Bennett, Amanda J.; Ben-Shlomo, Yoav; Bergman, Richard N.; Bergmann, Sven; Biebermann, Heike; Blakemore, Alexandra I.F.; Boes, Tanja; Bonnycastle, Lori L.; Bornstein, Stefan R.; Brown, Morris J.; Buchanan, Thomas A.; Busonero, Fabio; Campbell, Harry; Cappuccio, Francesco P.; Cavalcanti-Proença, Christine; Chen, Yii-Der Ida; Chen, Chih-Mei; Chines, Peter S.; Clarke, Robert; Coin, Lachlan; Connell, John; Day, Ian N.M.; den Heijer, Martin; Duan, Jubao; Ebrahim, Shah; Elliott, Paul; Elosua, Roberto; Eiriksdottir, Gudny; Erdos, Michael R.; Eriksson, Johan G.; Facheris, Maurizio F.; Felix, Stephan B.; Fischer-Posovszky, Pamela; Folsom, Aaron R.; Friedrich, Nele; Freimer, Nelson B.; Fu, Mao; Gaget, Stefan; Gejman, Pablo V.; Geus, Eco J.C.; Gieger, Christian; Gjesing, Anette P.; Goel, Anuj; Goyette, Philippe; Grallert, Harald; Gräßler, Jürgen; Greenawalt, Danielle M.; Groves, Christopher J.; Gudnason, Vilmundur; Guiducci, Candace; Hartikainen, Anna-Liisa; Hassanali, Neelam; Hall, Alistair S.; Havulinna, Aki S.; Hayward, Caroline; Heath, Andrew C.; Hengstenberg, Christian; Hicks, Andrew A.; Hinney, Anke; Hofman, Albert; Homuth, Georg; Hui, Jennie; Igl, Wilmar; Iribarren, Carlos; Isomaa, Bo; Jacobs, Kevin B.; Jarick, Ivonne; Jewell, Elizabeth; John, Ulrich; Jørgensen, Torben; Jousilahti, Pekka; Jula, Antti; Kaakinen, Marika; Kajantie, Eero; Kaplan, Lee M.; Kathiresan, Sekar; Kettunen, Johannes; Kinnunen, Leena; Knowles, Joshua W.; Kolcic, Ivana; König, Inke R.; Koskinen, Seppo; Kovacs, Peter; Kuusisto, Johanna; Kraft, Peter; Kvaløy, Kirsti; Laitinen, Jaana; Lantieri, Olivier; Lanzani, Chiara; Launer, Lenore J.; Lecoeur, Cecile; Lehtimäki, Terho; Lettre, Guillaume; Liu, Jianjun; Lokki, Marja-Liisa; Lorentzon, Mattias; Luben, Robert N.; Ludwig, Barbara; Manunta, Paolo; Marek, Diana; Marre, Michel; Martin, Nicholas G.; McArdle, Wendy L.; McCarthy, Anne; McKnight, 
Barbara; Meitinger, Thomas; Melander, Olle; Meyre, David; Midthjell, Kristian; Montgomery, Grant W.; Morken, Mario A.; Morris, Andrew P.; Mulic, Rosanda; Ngwa, Julius S.; Nelis, Mari; Neville, Matt J.; Nyholt, Dale R.; O’Donnell, Christopher J.; O’Rahilly, Stephen; Ong, Ken K.; Oostra, Ben; Paré, Guillaume; Parker, Alex N.; Perola, Markus; Pichler, Irene; Pietiläinen, Kirsi H.; Platou, Carl G.P.; Polasek, Ozren; Pouta, Anneli; Rafelt, Suzanne; Raitakari, Olli; Rayner, Nigel W.; Ridderstråle, Martin; Rief, Winfried; Ruokonen, Aimo; Robertson, Neil R.; Rzehak, Peter; Salomaa, Veikko; Sanders, Alan R.; Sandhu, Manjinder S.; Sanna, Serena; Saramies, Jouko; Savolainen, Markku J.; Scherag, Susann; Schipf, Sabine; Schreiber, Stefan; Schunkert, Heribert; Silander, Kaisa; Sinisalo, Juha; Siscovick, David S.; Smit, Jan H.; Soranzo, Nicole; Sovio, Ulla; Stephens, Jonathan; Surakka, Ida; Swift, Amy J.; Tammesoo, Mari-Liis; Tardif, Jean-Claude; Teder-Laving, Maris; Teslovich, Tanya M.; Thompson, John R.; Thomson, Brian; Tönjes, Anke; Tuomi, Tiinamaija; van Meurs, Joyce B.J.; van Ommen, Gert-Jan; Vatin, Vincent; Viikari, Jorma; Visvikis-Siest, Sophie; Vitart, Veronique; Vogel, Carla I. G.; Voight, Benjamin F.; Waite, Lindsay L.; Wallaschofski, Henri; Walters, G. Bragi; Widen, Elisabeth; Wiegand, Susanna; Wild, Sarah H.; Willemsen, Gonneke; Witte, Daniel R.; Witteman, Jacqueline C.; Xu, Jianfeng; Zhang, Qunyuan; Zgaga, Lina; Ziegler, Andreas; Zitting, Paavo; Beilby, John P.; Farooqi, I. Sadaf; Hebebrand, Johannes; Huikuri, Heikki V.; James, Alan L.; Kähönen, Mika; Levinson, Douglas F.; Macciardi, Fabio; Nieminen, Markku S.; Ohlsson, Claes; Palmer, Lyle J.; Ridker, Paul M.; Stumvoll, Michael; Beckmann, Jacques S.; Boeing, Heiner; Boerwinkle, Eric; Boomsma, Dorret I.; Caulfield, Mark J.; Chanock, Stephen J.; Collins, Francis S.; Cupples, L. 
Adrienne; Smith, George Davey; Erdmann, Jeanette; Froguel, Philippe; Grönberg, Henrik; Gyllensten, Ulf; Hall, Per; Hansen, Torben; Harris, Tamara B.; Hattersley, Andrew T.; Hayes, Richard B.; Heinrich, Joachim; Hu, Frank B.; Hveem, Kristian; Illig, Thomas; Jarvelin, Marjo-Riitta; Kaprio, Jaakko; Karpe, Fredrik; Khaw, Kay-Tee; Kiemeney, Lambertus A.; Krude, Heiko; Laakso, Markku; Lawlor, Debbie A.; Metspalu, Andres; Munroe, Patricia B.; Ouwehand, Willem H.; Pedersen, Oluf; Penninx, Brenda W.; Peters, Annette; Pramstaller, Peter P.; Quertermous, Thomas; Reinehr, Thomas; Rissanen, Aila; Rudan, Igor; Samani, Nilesh J.; Schwarz, Peter E.H.; Shuldiner, Alan R.; Spector, Timothy D.; Tuomilehto, Jaakko; Uda, Manuela; Uitterlinden, André; Valle, Timo T.; Wabitsch, Martin; Waeber, Gérard; Wareham, Nicholas J.; Watkins, Hugh; Wilson, James F.; Wright, Alan F.; Zillikens, M. Carola; Chatterjee, Nilanjan; McCarroll, Steven A.; Purcell, Shaun; Schadt, Eric E.; Visscher, Peter M.; Assimes, Themistocles L.; Borecki, Ingrid B.; Deloukas, Panos; Fox, Caroline S.; Groop, Leif C.; Haritunians, Talin; Hunter, David J.; Kaplan, Robert C.; Mohlke, Karen L.; O’Connell, Jeffrey R.; Peltonen, Leena; Schlessinger, David; Strachan, David P.; van Duijn, Cornelia M.; Wichmann, H.-Erich; Frayling, Timothy M.; Thorsteinsdottir, Unnur; Abecasis, Gonçalo R.; Barroso, Inês; Boehnke, Michael; Stefansson, Kari; North, Kari E.; McCarthy, Mark I.; Hirschhorn, Joel N.; Ingelsson, Erik; Loos, Ruth J.F.
2010-01-01
Obesity is globally prevalent and highly heritable, but the underlying genetic factors remain largely elusive. To identify genetic loci for obesity-susceptibility, we examined associations between body mass index (BMI) and ~2.8 million SNPs in up to 123,865 individuals, with targeted follow-up of 42 SNPs in up to 125,931 additional individuals. We confirmed 14 known obesity-susceptibility loci and identified 18 new loci associated with BMI (P<5×10−8), one of which includes a copy number variant near GPRC5B. Some loci (MC4R, POMC, SH2B1, BDNF) map near key hypothalamic regulators of energy balance, and one is near GIPR, an incretin receptor. Furthermore, genes in other newly-associated loci may provide novel insights into human body weight regulation. PMID:20935630
Veenstra, Jenna; Kalsbeek, Anya; Westra, Jason; Disselkoen, Craig; Smith, Caren; Tintle, Nathan
2017-08-18
Numerous genetic loci have been identified as being associated with circulating fatty acid (FA) levels and/or inflammatory biomarkers of cardiovascular health (e.g., C-reactive protein). Recently, using red blood cell (RBC) FA data from the Framingham Offspring Study, we conducted a genome-wide association study of over 2.5 million single nucleotide polymorphisms (SNPs) and 22 RBC FAs (and associated ratios), including the four Omega-3 FAs (ALA, DHA, DPA, and EPA). Our analyses identified numerous causal loci. In this manuscript, we investigate the extent to which polyunsaturated fatty acid (PUFA) levels moderate the relationship of genetics to cardiovascular health biomarkers using a genome-wide interaction study approach. In particular, we test for possible gene-FA interactions on 9 inflammatory biomarkers, with 2.5 million SNPs and 12 FAs, including all Omega-3 PUFAs. We identified eighteen novel loci, including loci which demonstrate strong evidence of modifying the impact of heritable genetics on biomarker levels, and subsequently cardiovascular health. The identified genes provide increased clarity on the biological functioning and role of Omega-3 PUFAs, as well as other common fatty acids, in cardiovascular health, and suggest numerous candidate loci for future replication and biological characterization.
Gutierrez, Alejandro P; Turner, Frances; Gharbi, Karim; Talbot, Richard; Lowe, Natalie R; Peñaloza, Carolina; McCullough, Mark; Prodöhl, Paulo A; Bean, Tim P; Houston, Ross D
2017-07-05
SNP arrays are enabling tools for high-resolution studies of the genetic basis of complex traits in farmed and wild animals. Oysters are of critical importance in many regions from both an ecological and economic perspective, and oyster aquaculture forms a key component of global food security. The aim of our study was to design a combined-species, medium density SNP array for Pacific oyster (Crassostrea gigas) and European flat oyster (Ostrea edulis), and to test the performance of this array on farmed and wild populations from multiple locations, with a focus on European populations. SNP discovery was carried out by whole-genome sequencing (WGS) of pooled genomic DNA samples from eight C. gigas populations, and restriction site-associated DNA sequencing (RAD-Seq) of 11 geographically diverse O. edulis populations. Nearly 12 million candidate SNPs were discovered and filtered based on several criteria, including preference for SNPs segregating in multiple populations and SNPs with monomorphic flanking regions. An Affymetrix Axiom Custom Array was created and tested on a diverse set of samples (n = 219), showing ∼27 K high quality SNPs for C. gigas and ∼11 K high quality SNPs for O. edulis segregating in these populations. A high proportion of SNPs were segregating in each of the populations, and the array was used to detect population structure and levels of linkage disequilibrium (LD). Further testing of the array on three C. gigas nuclear families (n = 165) revealed that the array can be used to clearly distinguish between the families based on identity-by-state (IBS) clustering and parental assignment software. This medium density, combined-species array will be publicly available through Affymetrix, and will be applied for genome-wide association and evolutionary genetic studies, and for genomic selection in oyster breeding programs. Copyright © 2017 Gutierrez et al.
SNP-markers in Allium species to facilitate introgression breeding in onion.
Scholten, Olga E; van Kaauwen, Martijn P W; Shahin, Arwa; Hendrickx, Patrick M; Keizer, L C Paul; Burger, Karin; van Heusden, Adriaan W; van der Linden, C Gerard; Vosman, Ben
2016-08-31
Within onion, Allium cepa L., the availability of disease resistance is limited. The identification of sources of resistance in related species, such as Allium roylei and Allium fistulosum, was a first step towards the improvement of onion cultivars by breeding. SNP markers linked to resistance and polymorphic between these related species and onion cultivars are a valuable tool to efficiently introgress disease resistance genes. In this paper we describe the identification and validation of SNP markers valuable for onion breeding. Transcriptome sequencing resulted in 192 million RNA seq reads from the interspecific F1 hybrid between A. roylei and A. fistulosum (RF) and nine onion cultivars. After assembly, reliable SNPs were discovered in about 36 % of the contigs. For genotyping of the interspecific three-way cross population, derived from a cross between an onion cultivar and the RF (CCxRF), 1100 SNPs that are polymorphic in RF and monomorphic in the onion cultivars (RF SNPs) were selected for the development of KASP assays. A molecular linkage map based on 667 RF-SNP markers was constructed for CCxRF. In addition, KASP assays were developed for 1600 onion-SNPs (SNPs polymorphic among onion cultivars). A second linkage map was constructed for an F2 of onion x A. roylei (F2(CxR)) that consisted of 182 onion-SNPs and 119 RF-SNPs, and 76 previously mapped markers. Markers co-segregating in both the F2(CxR) and the CCxRF population were used to assign the linkage groups of RF to onion chromosomes. To validate usefulness of these SNP markers, QTL mapping was applied in the CCxRF population that segregates for resistance to Botrytis squamosa and resulted in a QTL for resistance on chromosome 6 of A. roylei. Our research has more than doubled the publicly available marker sequences of expressed onion genes and two onion-related species. It resulted in a detailed genetic map for the interspecific CCxRF population. 
This is the first paper that reports the detection of a QTL for resistance to B. squamosa in A. roylei.
Kavakiotis, Ioannis; Samaras, Patroklos; Triantafyllidis, Alexandros; Vlahavas, Ioannis
2017-11-01
Single Nucleotide Polymorphisms (SNPs) are becoming the marker of choice for biological analyses involving a wide range of applications with great medical, biological, economic and environmental interest. Classification tasks, i.e. the assignment of individuals to groups of origin based on their (multi-locus) genotypes, are performed in many fields such as forensic investigations, discrimination between wild and/or farmed populations and others. These tasks should be performed with a small number of loci, for computational as well as biological reasons. Thus, feature selection should precede classification tasks, especially for SNP datasets, where the number of features can amount to hundreds of thousands or millions. In this paper, we present a novel data mining approach, called FIFS (Frequent Item Feature Selection), based on the use of frequent items for selection of the most informative markers from population genomic data. It is a modular method, consisting of two main components. The first one identifies the most frequent and unique genotypes for each sampled population. The second one selects the most appropriate among them, in order to create the informative SNP subsets to be returned. The proposed method (FIFS) was tested on a real dataset, which comprised a comprehensive coverage of pig breed types present in Britain. This dataset consisted of 446 individuals divided into 14 sub-populations, genotyped at 59,436 SNPs. Our method outperforms the state-of-the-art and baseline methods in every case. More specifically, our method surpassed the assignment accuracy threshold of 95% needing only half the number of SNPs selected by other methods (FIFS: 28 SNPs, Delta: 70 SNPs, Pairwise FST: 70 SNPs, In: 100 SNPs). CONCLUSION: Our approach successfully deals with the problem of informative marker selection in high dimensional genomic datasets.
It offers better results compared to existing approaches and can aid biologists in selecting the most informative markers with maximum discrimination power for optimization of cost-effective panels with applications related to e.g. species identification, wildlife management, and forensics. Copyright © 2017 Elsevier Ltd. All rights reserved.
Michailidou, S; Tsangaris, G; Fthenakis, G C; Tzora, A; Skoufos, I; Karkabounas, S C; Banos, G; Argiriou, A; Arsenos, G
2018-06-01
In the present study, genome-wide genotyping was applied to characterize the genetic diversity and population structure of three autochthonous Greek breeds: Boutsko, Karagouniko and Chios. Dairy sheep are among the most significant livestock species in Greece, numbering approximately 9 million animals, which are characterized by large phenotypic variation and reared under various farming systems. A total of 96 animals were genotyped with Illumina's OvineSNP50K microarray beadchip to study the population structure of the breeds and develop a specialized panel of single-nucleotide polymorphisms (SNPs) that could distinguish one breed from the others. Quality control on the dataset resulted in 46,125 SNPs, which were used to evaluate the genetic structure of the breeds. Population structure was assessed through principal component analysis (PCA) and admixture analysis, whereas inbreeding was estimated based on runs of homozygosity (ROHs) coefficients, genomic relationship matrix inbreeding coefficients (F_GRM) and patterns of linkage disequilibrium (LD). Associations between SNPs and breeds were analyzed with different inheritance models, to identify SNPs that distinguish among the breeds. Results showed high levels of genetic heterogeneity in the three breeds. Genetic distances among breeds were modest, despite their different ancestries. Chios and Karagouniko breeds were more genetically related to each other than to Boutsko. Analysis revealed 3802 candidate SNPs that can be used to identify two-breed crosses and purebred animals. The present study provides, for the first time, data on the genetic background of three Greek indigenous dairy sheep breeds as well as a specialized marker panel that can be applied for traceability purposes as well as targeted genetic improvement schemes and conservation programs.
Theunert, Christoph; Pugach, Irina; Li, Jing; Nandineni, Madhusudan R.; Gross, Arnd; Scholz, Markus; Stoneking, Mark
2009-01-01
Background Genome-wide scans of hundreds of thousands of single-nucleotide polymorphisms (SNPs) have resulted in the identification of new susceptibility variants to common diseases and are providing new insights into the genetic structure and relationships of human populations. Moreover, genome-wide data can be used to search for signals of recent positive selection, thereby providing new insights into the genetic adaptations that occurred as modern humans spread out of Africa and around the world. Methodology We genotyped approximately 500,000 SNPs in 255 individuals (5 individuals from each of 51 worldwide populations) from the Human Genome Diversity Panel (HGDP-CEPH). When merged with non-overlapping SNPs typed previously in 250 of these same individuals, the resulting data consist of over 950,000 SNPs. We then analyzed the genetic relationships and ancestry of individuals without assigning them to populations, and we also identified candidate regions of recent positive selection at both the population and regional (continental) level. Conclusions Our analyses both confirm and extend previous studies; in particular, we highlight the impact of various dispersals, and the role of substructure in Africa, on human genetic diversity. We also identified several novel candidate regions for recent positive selection, and a gene ontology (GO) analysis identified several GO groups that were significantly enriched for such candidate genes, including immunity and defense related genes, sensory perception genes, membrane proteins, signal receptors, lipid binding/metabolism genes, and genes involved in the nervous system. Among the novel candidate genes identified are two genes involved in the thyroid hormone pathway that show signals of selection in African Pygmies that may be related to their short stature. PMID:19924308
Identification, validation and high-throughput genotyping of transcribed gene SNPs in cassava.
Ferguson, Morag E; Hearne, Sarah J; Close, Timothy J; Wanamaker, Steve; Moskal, William A; Town, Christopher D; de Young, Joe; Marri, Pradeep Reddy; Rabbi, Ismail Yusuf; de Villiers, Etienne P
2012-03-01
The availability of genomic resources can facilitate progress in plant breeding through the application of advanced molecular technologies for crop improvement. This is particularly important in the case of less researched crops such as cassava, a staple and food security crop for more than 800 million people. Here, expressed sequence tags (ESTs) were generated from five drought stressed and well-watered cassava varieties. Two cDNA libraries were developed: one from root tissue (CASR), the other from leaf, stem and stem meristem tissue (CASL). Sequencing generated 706 contigs and 3,430 singletons. These sequences were combined with those from two other EST sequencing initiatives and filtered based on the sequence quality. Quality sequences were aligned using CAP3 and embedded in a Windows browser called HarvEST:Cassava which is made available. HarvEST:Cassava consists of a Unigene set of 22,903 quality sequences. A total of 2,954 putative SNPs were identified. Of these 1,536 SNPs from 1,170 contigs and 53 cassava genotypes were selected for SNP validation using Illumina's GoldenGate assay. As a result 1,190 SNPs were validated technically and biologically. The location of validated SNPs on scaffolds of the cassava genome sequence (v.4.1) is provided. A diversity assessment of 53 cassava varieties reveals some sub-structure based on the geographical origin, greater diversity in the Americas as opposed to Africa, and similar levels of diversity in West Africa and southern, eastern and central Africa. The resources presented allow for improved genetic dissection of economically important traits and the application of modern genomics-based approaches to cassava breeding and conservation.
Humble, E; Martinez-Barrio, A; Forcada, J; Trathan, P N; Thorne, M A S; Hoffmann, M; Wolf, J B W; Hoffman, J I
2016-07-01
Custom genotyping arrays provide a flexible and accurate means of genotyping single nucleotide polymorphisms (SNPs) in a large number of individuals of essentially any organism. However, validation rates, defined as the proportion of putative SNPs that are verified to be polymorphic in a population, are often very low. A number of potential causes of assay failure have been identified, but none have been explored systematically. In particular, as SNPs are often developed from transcriptomes, parameters relating to the genomic context are rarely taken into account. Here, we assembled a draft Antarctic fur seal (Arctocephalus gazella) genome (assembly size: 2.41 Gb; scaffold/contig N50 : 3.1 Mb/27.5 kb). We then used this resource to map the probe sequences of 144 putative SNPs genotyped in 480 individuals. The number of probe-to-genome mappings and alignment length together explained almost a third of the variation in validation success, indicating that sequence uniqueness and proximity to intron-exon boundaries play an important role. The same pattern was found after mapping the probe sequences to the Walrus and Weddell seal genomes, suggesting that the genomes of species divergent by as much as 23 million years can hold information relevant to SNP validation outcomes. Additionally, reanalysis of genotyping data from seven previous studies found the same two variables to be significantly associated with SNP validation success across a variety of taxa. Finally, our study reveals considerable scope for validation rates to be improved, either by simply filtering for SNPs whose flanking sequences align uniquely and completely to a reference genome, or through predictive modelling. © 2015 John Wiley & Sons Ltd.
Han, Jun Hyun; Lee, Yong Seong; Kim, Hae Jong; Lee, Shin Young; Myung, Soon Chul
2015-01-01
In this study, we evaluated genetic variants of the androgen metabolism genes CYP17A1, CYP3A4, and CYP3A43 to determine whether they play a role in the development of prostate cancer (PCa) in Korean men. The study population included 240 pathologically diagnosed cases of PCa and 223 age-matched controls. Among the 789 single-nucleotide polymorphism (SNP) database variants detected, 129 were reported in two Asian groups (Han Chinese and Japanese) in the HapMap database. Only 21 polymorphisms of CYP17A1, CYP3A4, and CYP3A43 were selected based on linkage disequilibrium in Asians (r2 = 1), locations (SNPs in exons were preferred), and amino acid changes and were assessed. In addition, we performed haplotype analysis for the 21 SNPs in CYP17A1, CYP3A4, and CYP3A43 genes. To determine the association between genotype and haplotype distributions of patients and controls, logistic analyses were carried out, controlling for age. Twelve sequence variants and five major haplotypes were identified in CYP17A1. Five sequence variants and two major haplotypes were identified in CYP3A4. Four sequence variants and four major haplotypes were observed in CYP3A43. CYP17A1 haplotype-2 (Ht-2) (odds ratio [OR], 1.51; 95% confidence interval [CI], 1.04–2.18) was associated with PCa susceptibility. CYP3A4 Ht-2 (OR: 1.87; 95% CI: 1.02–3.43) was associated with PCa metastatic potential according to tumor stage. rs17115149 (OR: 1.96; 95% CI: 1.04–3.68) and CYP17A1 Ht-4 (OR: 2.01; 95% CI: 1.07–4.11) showed a significant association with histologic aggressiveness according to Gleason score. Genetic variants of CYP17A1 and CYP3A4 may play a role in the development of PCa in Korean men. PMID:25337833
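The reported odds ratios and 95% confidence intervals can be sanity-checked with a little arithmetic. The sketch below assumes the intervals are Wald intervals, symmetric on the log-odds scale; the abstract does not say how they were computed, so this is an assumption. It recovers the implied standard error and point estimate from the CYP17A1 Ht-2 bounds (1.04 to 2.18). The function name `wald_from_ci` is illustrative only and does not come from the study.

```python
import math

def wald_from_ci(lower: float, upper: float, z_crit: float = 1.96):
    """Recover the log-odds standard error and the implied odds ratio
    from a 95% CI, assuming a Wald interval symmetric on the log scale."""
    log_lo, log_hi = math.log(lower), math.log(upper)
    se = (log_hi - log_lo) / (2 * z_crit)          # CI half-width divided by z
    implied_or = math.exp((log_lo + log_hi) / 2)   # geometric midpoint of the bounds
    return se, implied_or

# CYP17A1 Ht-2 bounds as reported in the abstract above
se, implied_or = wald_from_ci(1.04, 2.18)
print(f"SE(log OR) = {se:.3f}, implied OR = {implied_or:.2f}")
```

Under that assumption the geometric midpoint of the bounds comes out at about 1.51, matching the reported odds ratio. An interval whose midpoint disagrees noticeably with its stated point estimate would suggest rounding or a different interval method.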
Anderson, Eric C
2012-11-08
Advances in genotyping that allow tens of thousands of individuals to be genotyped at a moderate number of single nucleotide polymorphisms (SNPs) permit parentage inference to be pursued on a very large scale. The intergenerational tagging this capacity allows is revolutionizing the management of cultured organisms (cows, salmon, etc.) and is poised to do the same for scientific studies of natural populations. Currently, however, there are no likelihood-based methods of parentage inference which are implemented in a manner that allows them to quickly handle a very large number of potential parents or parent pairs. Here we introduce an efficient likelihood-based method applicable to the specialized case of cultured organisms in which both parents can be reliably sampled. We develop a Markov chain representation for the cumulative number of Mendelian incompatibilities between an offspring and its putative parents and we exploit it to develop a fast algorithm for simulation-based estimates of statistical confidence in SNP-based assignments of offspring to pairs of parents. The method is implemented in the freely available software SNPPIT. We describe the method in detail, then assess its performance in a large simulation study using known allele frequencies at 96 SNPs from ten hatchery salmon populations. The simulations verify that the method is fast and accurate and that 96 well-chosen SNPs can provide sufficient power to identify the correct pair of parents from amongst millions of candidate pairs.
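The quantity at the centre of this method, the number of Mendelian incompatibilities between an offspring and a putative parent pair, can be illustrated with a short sketch. This is not the SNPPIT algorithm, which builds a Markov chain over this count and accounts for statistical confidence. It only shows how the raw count could be computed for biallelic SNPs, assuming genotypes are coded as 0, 1 or 2 copies of one allele and ignoring genotyping error. All names are hypothetical.

```python
from typing import Sequence

# Allele copies (0 or 1) that a parent with genotype 0, 1 or 2 can transmit.
TRANSMITTABLE = {0: {0}, 1: {0, 1}, 2: {1}}

def is_compatible(child: int, mother: int, father: int) -> bool:
    """True if the child's genotype can arise from one allele of each parent."""
    return any(
        m + f == child
        for m in TRANSMITTABLE[mother]
        for f in TRANSMITTABLE[father]
    )

def count_incompatibilities(
    child: Sequence[int], mother: Sequence[int], father: Sequence[int]
) -> int:
    """Count loci at which the trio violates Mendelian inheritance."""
    return sum(
        not is_compatible(c, m, f) for c, m, f in zip(child, mother, father)
    )

# Toy trio over four SNPs; only the last locus is incompatible
# (child 2, mother 0, father 1: the mother cannot transmit the counted allele).
print(count_incompatibilities([0, 1, 2, 2], [0, 2, 1, 0], [0, 0, 2, 1]))
```

A true parent pair should show zero or very few incompatibilities across a panel of SNPs, whereas unrelated pairs accumulate them quickly, which is why a modest number of well-chosen markers can separate candidates.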
Genome-wide scan in Hispanics highlights candidate loci for brain white matter hyperintensities
Beecham, Ashley; Dong, Chuanhui; Wright, Clinton B.; Dueker, Nicole; Brickman, Adam M.; Wang, Liyong; DeCarli, Charles; Blanton, Susan H.; Rundek, Tatjana; Mayeux, Richard
2017-01-01
Objective: To investigate genetic variants influencing white matter hyperintensities (WMHs) in the understudied Hispanic population. Methods: Using 6.8 million single nucleotide polymorphisms (SNPs), we conducted a genome-wide association study (GWAS) to identify SNPs associated with WMH volume (WMHV) in 922 Hispanics who underwent brain MRI as a cross-section of 2 community-based cohorts in the Northern Manhattan Study and the Washington Heights–Inwood Columbia Aging Project. Multiple linear modeling with PLINK was performed to examine the additive genetic effects on ln(WMHV) after controlling for age, sex, total intracranial volume, and principal components of ancestry. Gene-based tests of association were performed using VEGAS. Replication was performed in independent samples of Europeans, African Americans, and Asians. Results: From the SNP analysis, a total of 17 independent SNPs in 7 genes had suggestive evidence of association with WMHV in Hispanics (p < 1 × 10−5) and 5 genes from the gene-based analysis with p < 1 × 10−3. One SNP (rs9957475 in GATA6) and 1 gene (UBE2C) demonstrated evidence of association (p < 0.05) in the African American sample. Four SNPs with p < 1 × 10−5 were shown to affect binding of SPI1 using RegulomeDB. Conclusions: This GWAS of 2 community-based Hispanic cohorts revealed several novel WMH-associated genetic variants. Further replication is needed in independent Hispanic samples to validate these suggestive associations, and fine mapping is needed to pinpoint causal variants. PMID:28975155
Livingstone, Donald; Stack, Conrad; Mustiga, Guiliana M.; Rodezno, Dayana C.; Suarez, Carmen; Amores, Freddy; Feltus, Frank A.; Mockaitis, Keithanne; Cornejo, Omar E.; Motamayor, Juan C.
2017-01-01
Cacao (Theobroma cacao L.) is an important cash crop in tropical regions around the world and has a rich agronomic history in South America. As a key component in the cosmetic and confectionary industries, millions of people worldwide use products made from cacao, ranging from shampoo to chocolate. An Illumina Infinity II array was created using 13,530 SNPs identified within a small diversity panel of cacao. Of these SNPs, 12,643 derive from variation within annotated cacao genes. The genotypes of 3,072 trees were obtained, including two mapping populations from Ecuador. High-density linkage maps for these two populations were generated and compared to the cacao genome assembly. Phenotypic data from these populations were combined with the linkage maps to identify the QTLs for yield and disease resistance. PMID:29259608
Genome-wide association study identifies multiple loci influencing human serum metabolite levels
Kettunen, Johannes; Tukiainen, Taru; Sarin, Antti-Pekka; Ortega-Alonso, Alfredo; Tikkanen, Emmi; Lyytikäinen, Leo-Pekka; Kangas, Antti J; Soininen, Pasi; Würtz, Peter; Silander, Kaisa; Dick, Danielle M; Rose, Richard J; Savolainen, Markku J; Viikari, Jorma; Kähönen, Mika; Lehtimäki, Terho; Pietiläinen, Kirsi H; Inouye, Michael; McCarthy, Mark I; Jula, Antti; Eriksson, Johan; Raitakari, Olli T; Salomaa, Veikko; Kaprio, Jaakko; Järvelin, Marjo-Riitta; Peltonen, Leena; Perola, Markus; Freimer, Nelson B; Ala-Korpela, Mika; Palotie, Aarno; Ripatti, Samuli
2013-01-01
Nuclear magnetic resonance assays allow for measurement of a wide range of metabolic phenotypes. We report here the results of a GWAS on 8,330 Finnish individuals genotyped and imputed at 7.7 million SNPs for a range of 216 serum metabolic phenotypes assessed by NMR of serum samples. We identified significant associations (P < 2.31 × 10−10) at 31 loci, including 11 for which there have not been previous reports of associations to a metabolic trait or disorder. Analyses of Finnish twin pairs suggested that the metabolic measures reported here show higher heritability than comparable conventional metabolic phenotypes. In accordance with our expectations, SNPs at the 31 loci associated with individual metabolites account for a greater proportion of the genetic component of trait variance (up to 40%) than is typically observed for conventional serum metabolic phenotypes. The identification of such associations may provide substantial insight into cardiometabolic disorders. PMID:22286219
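The significance threshold quoted above (2.31 × 10−10, i.e. 2.31e-10) can be reproduced by a Bonferroni-style correction of the conventional genome-wide level of 5e-8 for the 216 phenotypes tested. The abstract does not state that the threshold was derived this way, so the sketch below is an assumption that happens to match the reported figure. The helper name is illustrative.

```python
GENOME_WIDE_ALPHA = 5e-8   # conventional GWAS threshold (assumed starting point)
N_PHENOTYPES = 216         # number of serum metabolic phenotypes, from the abstract

def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Divide a significance level evenly across n_tests hypotheses."""
    return alpha / n_tests

threshold = bonferroni_threshold(GENOME_WIDE_ALPHA, N_PHENOTYPES)
print(f"{threshold:.2e}")
```

Dividing by the number of phenotypes treats them as independent tests, which is conservative when the metabolic measures are correlated.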
Variation resources at UC Santa Cruz.
Thomas, Daryl J; Trumbower, Heather; Kern, Andrew D; Rhead, Brooke L; Kuhn, Robert M; Haussler, David; Kent, W James
2007-01-01
The variation resources within the University of California Santa Cruz Genome Browser include polymorphism data drawn from public collections and analyses of these data, along with their display in the context of other genomic annotations. Primary data from dbSNP is included for many organisms, with added information including genomic alleles and orthologous alleles for closely related organisms. Display filtering and coloring is available by variant type, functional class or other annotations. Annotation of potential errors is highlighted and a genomic alignment of the variant's flanking sequence is displayed. HapMap allele frequencies and linkage disequilibrium (LD) are available for each HapMap population, along with non-human primate alleles. The browsing and analysis tools, downloadable data files and links to documentation and other information can be found at http://genome.ucsc.edu/.
Population substructure in Cache County, Utah: the Cache County study
2014-01-01
Background Population stratification is a key concern for genetic association analyses. In addition, extreme homogeneity of ethnic origins of a population can make it difficult to interpret how genetic associations in that population may translate into other populations. Here we have evaluated the genetic substructure of samples from the Cache County study relative to the HapMap Reference populations and data from the Alzheimer's Disease Neuroimaging Initiative (ADNI). Results Our findings show that the Cache County study is similar in ethnic diversity to the self-reported "Whites" in the ADNI sample and less homogenous than the HapMap CEU population. Conclusions We conclude that the Cache County study is genetically representative of the general European American population in the USA and is an appropriate population for conducting broadly applicable genetic studies. PMID:25078123
Characterization of recombination features and the genetic basis in multiple cattle breeds.
Shen, Botong; Jiang, Jicai; Seroussi, Eyal; Liu, George E; Ma, Li
2018-04-27
Crossover generated by meiotic recombination is a fundamental event that facilitates meiosis and sexual reproduction. Comparative studies have shown wide variation in recombination rate among species, but the characterization of recombination features between cattle breeds has not yet been performed. Cattle populations in North America count millions, and the dairy industry has genotyped millions of individuals with pedigree information that provide a unique opportunity to study breed-level variations in recombination. Based on large pedigrees of Jersey, Ayrshire and Brown Swiss cattle with genotype data, we identified over 3.4 million maternal and paternal crossover events from 161,309 three-generation families. We constructed six breed- and sex-specific genome-wide recombination maps using 58,982 autosomal SNPs for two sexes in the three dairy cattle breeds. A comparative analysis of the six recombination maps revealed similar global recombination patterns between cattle breeds but with significant differences between sexes. We confirmed that male recombination map is 10% longer than the female map in all three cattle breeds, consistent with previously reported results in Holstein cattle. When comparing recombination hotspot regions between cattle breeds, we found that 30% and 10% of the hotspots were shared between breeds in males and females, respectively, with each breed exhibiting some breed-specific hotspots. Finally, our multiple-breed GWAS found that SNPs in eight loci affected recombination rate and that the PRDM9 gene associated with hotspot usage in multiple cattle breeds, indicating a shared genetic basis for recombination across dairy cattle breeds. 
Collectively, our results generated breed- and sex-specific recombination maps for multiple cattle breeds, provided a comprehensive characterization and comparison of recombination patterns between breeds, and expanded our understanding of the breed-level variations in recombination features within an important livestock species.
A Genome-Wide Association Study of Circulating Galectin-3
van Veldhuisen, Dirk J.; Westra, Harm-Jan; Bakker, Stephan J. L.; Gansevoort, Ron T.; Muller Kobold, Anneke C.; van Gilst, Wiek H.; Franke, Lude
2012-01-01
Galectin-3 is a lectin involved in fibrosis, inflammation and proliferation. Increased circulating levels of galectin-3 have been associated with various diseases, including cancer, immunological disorders, and cardiovascular disease. To enhance our knowledge on galectin-3 biology we performed the first genome-wide association study (GWAS) using the Illumina HumanCytoSNP-12 array imputed with the HapMap 2 CEU panel on plasma galectin-3 levels in 3,776 subjects and follow-up genotyping in an additional 3,516 subjects. We identified 2 genome wide significant loci associated with plasma galectin-3 levels. One locus harbours the LGALS3 gene (rs2274273; P = 2.35×10−188) and the other locus the ABO gene (rs644234; P = 3.65×10−47). The variance explained by the LGALS3 locus was 25.6% and by the ABO locus 3.8% and jointly they explained 29.2%. Rs2274273 lies in high linkage disequilibrium with two non-synonymous SNPs (rs4644; r2 = 1.0, and rs4652; r2 = 0.91) and wet lab follow-up genotyping revealed that both are strongly associated with galectin-3 levels (rs4644; P = 4.97×10−465 and rs4652 P = 1.50×10−421) and were also associated with LGALS3 gene-expression. The origins of our associations should be further validated by means of functional experiments. PMID:23056639
Haplotypic Analysis of Wellcome Trust Case Control Consortium Data
Browning, Brian L.; Browning, Sharon R.
2008-01-01
We applied a recently developed multilocus association testing method (localized haplotype clustering) to Wellcome Trust Case Control Consortium data (14,000 cases of seven common diseases and 3,000 shared controls genotyped on the Affymetrix 500K array). After rigorous data quality filtering, we identified three disease-associated loci with strong statistical support from localized haplotype cluster tests but with only marginal significance in single marker tests. These loci are chromosomes 10p15.1 with type 1 diabetes (p = 5.1 × 10−9), 12q15 with type 2 diabetes (p = 1.9 × 10−7) and 15q26.2 with hypertension (p = 2.8 × 10−8). We also detected the association of chromosome 9p21.3 with type 2 diabetes (p = 2.8 × 10−8), although this locus did not pass our stringent genotype quality filters. The associations of 10p15.1 with type 1 diabetes and 9p21.3 with type 2 diabetes have both been replicated in other studies using independent data sets. Overall, localized haplotype cluster analysis had better success detecting disease-associated variants than a previous single-marker analysis of imputed HapMap SNPs. We found that stringent application of quality score thresholds to genotype data substantially reduced false-positive results arising from genotype error. In addition, we demonstrate that it is possible to simultaneously phase 16,000 individuals genotyped on genome-wide data (450K markers) using the Beagle software package. PMID:18224336
An overview of the genetic dissection of complex traits.
Rao, D C
2008-01-01
Thanks to the recent revolutionary genomic advances such as the International HapMap consortium, resolution of the genetic architecture of common complex traits is beginning to look hopeful. While demonstrating the feasibility of genome-wide association (GWA) studies, the pathbreaking Wellcome Trust Case Control Consortium (WTCCC) study also serves to underscore the critical importance of very large sample sizes and draws attention to potential problems, which need to be addressed as part of the study design. Even the large WTCCC study had vastly inadequate power for several of the associations reported (and confirmed) and, therefore, most of the regions harboring relevant associations may not be identified anytime soon. This chapter provides an overview of some of the key developments in the methodological approaches to genetic dissection of common complex traits. Constrained Bayesian networks are suggested as especially useful for analysis of pathway-based SNPs. Likewise, composite likelihood is suggested as a promising method for modeling complex systems. It discusses the key steps in a study design, with an emphasis on GWA studies. Potential limitations highlighted by the WTCCC GWA study are discussed, including problems associated with massive genotype imputation, analysis of pooled national samples, shared controls, and the critical role of interactions. GWA studies clearly need massive sample sizes that are only possible through genuine collaborations. After all, for common complex traits, the question is not whether we can find some pieces of the puzzle, but how large and what kind of a sample we need to (nearly) solve the genetic puzzle.
Espinoza, Jose R.; Alvarez, Giancarlo; León-Velarde, Fabiola; Ju Preciado, Hugo F.; Macarlupu, Jose-Luis; Rivera-Ch, Maria; Rodriguez, Jorge; Favier, Judith; Gimenez-Roqueplo, Anne-Paule
2014-01-01
Abstract Espinoza, Jose R., Giancarlo Alvarez, Fabiola León-Velarde, Hugo F. Ju Preciado, Jose-Luis Macarlupu, Maria Rivera-Ch, Jorge Rodriguez, Judith Favier, Anne-Paule Gimenez-Roqueplo, and Jean-Paul Richalet. Vascular endothelial growth factor-A is associated with chronic mountain sickness in Andean population. High Alt Med Biol. 15:146–154, 2014. A study of chronic mountain sickness (CMS) with a candidate gene, vascular endothelial growth factor A (VEGFA), was carried out in a Peruvian population living at high altitude in Cerro de Pasco (4380 m). The study was performed by genotyping 11 tag SNPs encompassing a 2.2 kb region of the VEGFA gene in patients with a diagnosis of CMS (n=131; 49.1±12.7 years old) and unrelated healthy controls (n=84; 47.2±13.4 years old). The VEGFA tag SNP rs3025033 was found to be associated with CMS (p<0.05): individuals with the AG genotype had about 2.5 times the risk of CMS compared with those with the GG genotype (p<0.02; OR, 2.54; 95% CI: 1.10–5.88). Pairwise Fst and Nei's distance indicate genetic differentiation between the Cerro de Pasco population and the HapMap3 population (Fst>0.36, p<0.01), suggesting selection is operating on the VEGF gene. Our results suggest that VEGFA is associated with CMS in long-term residents at high altitude in the Peruvian Andes. PMID:24971768
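An odds ratio with a 95% confidence interval, as reported in this abstract, is commonly derived from a 2x2 genotype-by-status table. The sketch below uses the Wald (log-odds) interval as one common choice; the abstract does not say which method the authors used, and the genotype counts here are hypothetical, not the study's data.

```python
import math

def odds_ratio_wald_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval for a 2x2 table.

    a: cases with the risk genotype      b: cases without it
    c: controls with the risk genotype   d: controls without it
    z: normal quantile (1.96 for a 95% interval)
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Wald approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Hypothetical counts, for illustration only.
or_est, ci_low, ci_high = odds_ratio_wald_ci(30, 100, 9, 75)
print(f"OR = {or_est:.2f}, 95% CI: {ci_low:.2f} to {ci_high:.2f}")
```

An interval whose lower bound stays above 1 is what supports reading the genotype as associated with increased odds of disease; with small control groups the interval is wide, as it is in the abstract.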
Brief Guide to Genomics: DNA, Genes and Genomes
Bertolini, Francesca; Scimone, Concetta; Geraci, Claudia; Schiavo, Giuseppina; Utzeri, Valerio Joe; Chiofalo, Vincenzo; Fontanesi, Luca
2015-01-01
Few studies have investigated the donkey (Equus asinus) at the whole genome level so far. Here, we sequenced the genome of two male donkeys using a next generation semiconductor based sequencing platform (the Ion Proton sequencer) and compared the obtained sequence information with the available donkey draft genome (and the Illumina reads from which it originated) and with the EquCab2.0 assembly of the horse genome. Moreover, the Ion Torrent Personal Genome Analyzer was used to sequence reduced representation libraries (RRL) obtained from a DNA pool including donkeys of different breeds (Grigio Siciliano, Ragusano and Martina Franca). The number of next generation sequencing reads aligned with the EquCab2.0 horse genome was larger than the number aligned with the draft donkey genome. This was due to the larger N50 for contigs and scaffolds of the horse genome. Nucleotide divergence between E. caballus and E. asinus was estimated to be ~ 0.52-0.57%. Regions with low nucleotide divergence were identified in several autosomal chromosomes and in the whole chromosome X. These regions might be evolutionarily important in equids. Comparing Y-chromosome regions we identified variants that could be useful to track donkey paternal lineages. Moreover, about 4.8 million single nucleotide polymorphisms (SNPs) in the donkey genome were identified and annotated combining sequencing data from Ion Proton (whole genome sequencing) and Ion Torrent (RRL) runs with Illumina reads. A higher density of SNPs was present in regions homologous to horse chromosome 12, in which several studies reported a high frequency of copy number variants. The SNPs we identified constitute a first resource useful to describe variability at the population genomic level in E. asinus and to establish monitoring systems for the conservation of donkey genetic resources. PMID:26151450
Ashrafi, Hamid; Hill, Theresa; Stoffel, Kevin; Kozik, Alexander; Yao, Jiqiang; Chin-Wo, Sebastian Reyes; Van Deynze, Allen
2012-10-30
Molecular breeding of pepper (Capsicum spp.) can be accelerated by developing DNA markers associated with transcriptomes in breeding germplasm. Before the advent of next generation sequencing (NGS) technologies, the majority of sequencing data were generated by the Sanger sequencing method. By leveraging Sanger EST data, we have generated a wealth of genetic information for pepper including thousands of SNPs and Single Position Polymorphic (SPP) markers. To complement and enhance these resources, we applied NGS to three pepper genotypes: Maor, Early Jalapeño and Criollo de Morelos-334 (CM334) to identify SNPs and SSRs in the assembly of these three genotypes. Two pepper transcriptome assemblies were developed with different purposes. The first reference sequence, assembled by CAP3 software, comprises 31,196 contigs from >125,000 Sanger-EST sequences that were mainly derived from a Korean F1-hybrid line, Bukang. Overlapping probes were designed for 30,815 unigenes to construct a pepper Affymetrix GeneChip® microarray for whole genome analyses. In addition, custom Python scripts were used to identify 4,236 SNPs in contigs of the assembly. A total of 2,489 simple sequence repeats (SSRs) were identified from the assembly, and primers were designed for the SSRs. Annotation of contigs using Blast2GO software resulted in information for 60% of the unigenes in the assembly. The second transcriptome assembly was constructed from more than 200 million Illumina Genome Analyzer II reads (80-120 nt) using a combination of Velvet, CLC workbench and CAP3 software packages. BWA, SAMtools and in-house Perl scripts were used to identify SNPs among three pepper genotypes. The SNPs were filtered to be at least 50 bp from any intron-exon junctions as well as flanking SNPs. More than 22,000 high-quality putative SNPs were identified. 
Using the MISA software, 10,398 SSR markers were also identified within the Illumina transcriptome assembly and primers were designed for the identified markers. The assembly was annotated by Blast2GO and 14,740 (12%) of annotated contigs were associated with functional proteins. Before the pepper genome sequence became available, assembling transcriptomes of this economically important crop was required to generate thousands of high-quality molecular markers that could be used in breeding programs. In order to have a better understanding of the assembled sequences and to identify candidate genes underlying QTLs, we annotated the contigs of the Sanger-EST and Illumina transcriptome assemblies. This and other information has been curated in a database dedicated to the pepper project.
Chadaeva, Irina V; Ponomarenko, Mikhail P; Rasskazov, Dmitry A; Sharypova, Ekaterina B; Kashina, Elena V; Matveeva, Marina Yu; Arshinova, Tatjana V; Ponomarenko, Petr M; Arkova, Olga V; Bondar, Natalia P; Savinkova, Ludmila K; Kolchanov, Nikolay A
2016-12-28
Aggressiveness in humans is a hereditary behavioral trait that mobilizes all systems of the body (first of all the nervous and endocrine systems, and then the respiratory, vascular, muscular, and others), e.g., for the defense of oneself, children, family, shelter, territory, and other possessions as well as personal interests. The level of aggressiveness of a person determines many other characteristics of quality of life and lifespan, acting as a stress factor. Aggressive behavior depends on many parameters such as age, gender, diseases and treatment, diet, and environmental conditions. Among them, genetic factors are believed to be the main parameters that are well-studied at the factual level, but in actuality, genome-wide studies of aggressive behavior appeared relatively recently. One of the biggest projects of modern science, 1000 Genomes, involves identification of single nucleotide polymorphisms (SNPs), i.e., differences of individual genomes from the reference genome. SNPs can be associated with hereditary diseases, their complications, comorbidities, and responses to stress or a drug. Clinical comparisons between cohorts of patients and healthy volunteers (as a control) allow for identifying SNPs whose allele frequencies significantly separate them from one another as markers of the above conditions. Computer-based preliminary analysis of millions of SNPs detected by the 1000 Genomes project can accelerate clinical search for SNP markers due to preliminary whole-genome search for the most meaningful candidate SNP markers and discarding of neutral and poorly substantiated SNPs. Here, we combine two computer-based search methods for SNPs that alter gene expression: (i) the Web service SNP_TATA_Comparator (DNA sequence analysis) and (ii) a PubMed-based manual search for articles on aggressiveness using heuristic keywords.
Near the known binding sites for TATA-binding protein (TBP) in human gene promoters, we found aggressiveness-related candidate SNP markers, including rs1143627 (associated with higher aggressiveness in patients undergoing cytokine immunotherapy), rs544850971 (higher aggressiveness in old women taking lipid-lowering medication), and rs10895068 (childhood aggressiveness-related obesity in adolescence with cardiovascular complications in adulthood). After validation of these candidate markers by clinical protocols, these SNPs may become useful for physicians (may help to improve treatment of patients) and for the general population (a lifestyle choice preventing aggressiveness-related complications).
Mining sequence variations in representative polyploid sugarcane germplasm accessions
DOE Office of Scientific and Technical Information (OSTI.GOV)
Yang, Xiping; Song, Jian; You, Qian; ...
2017-08-09
Sugarcane (Saccharum spp.) is one of the most important economic crops because of its high sugar production and biofuel potential. Due to the high polyploid level and complex genome of sugarcane, it has been a huge challenge to investigate genomic sequence variations, which are critical for identifying alleles contributing to important agronomic traits. In order to mine the genetic variations in sugarcane, genotyping by sequencing (GBS), was used to genotype 14 representative Saccharum complex accessions. GBS is a method to generate a large number of markers, enabled by next generation sequencing (NGS) and the genome complexity reduction using restriction enzymes. To use GBS for high throughput genotyping highly polyploid sugarcane, the GBS analysis pipelines in 14 Saccharum complex accessions were established by evaluating different alignment methods, sequence variants callers, and sequence depth for single nucleotide polymorphism (SNP) filtering. By using the established pipeline, a total of 76,251 non-redundant SNPs, 5642 InDels, 6380 presence/absence variants (PAVs), and 826 copy number variations (CNVs) were detected among the 14 accessions. In addition, non-reference based universal network enabled analysis kit and Stacks de novo called 34,353 and 109,043 SNPs, respectively. In the 14 accessions, the percentages of single dose SNPs ranged from 38.3% to 62.3% with an average of 49.6%, much more than the portions of multiple dosage SNPs. Concordantly called SNPs were used to evaluate the phylogenetic relationship among the 14 accessions. The results showed that the divergence time between the Erianthus genus and the Saccharum genus was more than 10 million years ago (MYA). The Saccharum species separated from their common ancestors ranging from 0.19 to 1.65 MYA.
The GBS pipelines including the reference sequences, alignment methods, sequence variant callers, and sequence depth were recommended and discussed for the Saccharum complex and other related species. A large number of sequence variations were discovered in the Saccharum complex, including SNPs, InDels, PAVs, and CNVs. Genome-wide SNPs were further used to illustrate sequence features of polyploid species and demonstrated the divergence of different species in the Saccharum complex. The results of this study showed that GBS was an effective NGS-based method to discover genomic sequence variations in highly polyploid and heterozygous species.
Teixeira, Thallita Monteiro; da Silva, Hugo Delleon; Goveia, Rebeca Mota; Ribolla, Paulo Eduardo Martins; Alonso, Diego Peres; Alves, Alessandro Arruda; Melo E Silva, Daniela; Collevatti, Rosane Garcia; Bicudo, Lucilene Arilho; Bérgamo, Nádia Aparecida; de Paula Silveira-Lacerda, Elisângela
2017-12-01
Worldwide, different studies have reported an association of alcohol-use disorder (AUD) with different types of Single Nucleotide Polymorphisms (SNPs) in the genes for aldehyde dehydrogenase (ALDH) and alcohol dehydrogenase (ADH). In Brazil, there is little information about the occurrence of these SNPs in the AUD population and an absence of studies characterizing the population in the Central-West Region of Brazil. Currently, more than 4 million people in Brazil have AUD. Despite the major health hazards of AUD, alcohol consumption and its consequences are not well understood. Therefore, it is extremely important to characterize these SNPs for the better understanding of AUD as a genetic disease in the Brazilian population. The present study, unlike other studies in other countries, is done with a subject population that shows a significant amount of racial homogenization. We evaluated the presence of SNPs in the ADH (ADH1B, ADH1C, and ADH4) and ALDH (ALDH2) genes in alcohol users of Goiânia, State of Goiás - Brazil, and then we established a possible relationship with AUD by allelic and genotypic study. This study was conducted with a population of people with AUD (n = 99) from Goiás Alcohol Dependence Recovery Center (GO CEREA) and Psychosocial Care Center for Alcohol and Drugs (CAPS AD), and with a population of people without AUD as controls (n = 100). DNA was extracted from whole-blood samples and the genotyping was performed using TaqMan® SNP genotyping assays. For characterization and evaluation of SNPs in the population, genotype frequency, allele frequency, haplotype frequency, Hardy-Weinberg equilibrium, and linkage disequilibrium were analyzed. Statistical analyses were performed with GENEPOP 4.5 and Haploview software. Allele 1 was considered "wild" (or *1) and allele 2 mutant (or *2). Significant differences were found for ADH1B*, ADH4*2, and ALDH2*2 SNPs when the genotype and allele frequencies were analyzed.
In addition, four haplotypes were observed between ADH1B*2 and ADH1C*2 through linkage disequilibrium analysis. The genetic variants may be associated with protection against AUD in the population studied. Copyright © 2017 Elsevier Inc. All rights reserved.
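The Hardy-Weinberg equilibrium check mentioned in this abstract can be illustrated with the textbook chi-square goodness-of-fit test. This is only a sketch: the study ran its analyses in GENEPOP and Haploview, which may apply a different (for example, exact) test, and the genotype counts below are hypothetical.

```python
def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square statistic for Hardy-Weinberg equilibrium at a biallelic SNP.

    n_aa, n_ab, n_bb: observed genotype counts.
    Returns (frequency of allele a, chi-square statistic with 1 df).
    Assumes both alleles are observed, so no expected count is zero.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # allele frequency from genotype counts
    q = 1 - p
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return p, chi2

CRITICAL_1DF_05 = 3.841  # chi-square critical value, 1 df, alpha = 0.05

# Hypothetical sample that matches Hardy-Weinberg proportions exactly.
p_eq, chi2_eq = hwe_chi_square(49, 42, 9)

# Hypothetical sample with a heterozygote deficit.
p_dev, chi2_dev = hwe_chi_square(60, 20, 20)
deviates = chi2_dev > CRITICAL_1DF_05
```

Control groups are usually expected to be in equilibrium; a strong deviation in controls is often treated as a sign of genotyping error rather than biology.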
Das, Shouvik; Singh, Mohar; Srivastava, Rishi; Bajaj, Deepak; Saxena, Maneesha S.; Rana, Jai C.; Bansal, Kailash C.; Tyagi, Akhilesh K.; Parida, Swarup K.
2016-01-01
The present study used a whole-genome, NGS resequencing-based mQTL-seq (multiple QTL-seq) strategy in two inter-specific mapping populations (Pusa 1103 × ILWC 46 and Pusa 256 × ILWC 46) to scan the major genomic region(s) underlying QTL(s) governing pod number trait in chickpea. Essentially, the whole-genome resequencing of low and high pod number-containing parental accessions and homozygous individuals (constituting bulks) from each of these two mapping populations discovered >8 million high-quality homozygous SNPs with respect to the reference kabuli chickpea. The functional significance of the physically mapped SNPs was apparent from the identified 2,264 non-synonymous and 23,550 regulatory SNPs, with 8–10% of these SNPs-carrying genes corresponding to transcription factors and disease resistance-related proteins. The utilization of these mined SNPs in Δ (SNP index)-led QTL-seq analysis and their correlation between two mapping populations based on mQTL-seq, narrowed down two (CaqaPN4.1: 867.8 kb and CaqaPN4.2: 1.8 Mb) major genomic regions harbouring robust pod number QTLs into the high-resolution short QTL intervals (CaqbPN4.1: 637.5 kb and CaqbPN4.2: 1.28 Mb) on chickpea chromosome 4. The integration of mQTL-seq-derived one novel robust QTL with QTL region-specific association analysis delineated the regulatory (C/T) and coding (C/A) SNPs-containing one pentatricopeptide repeat (PPR) gene at a major QTL region regulating pod number in chickpea. This target gene exhibited anther, mature pollen and pod-specific expression, including pronounced up-regulation (∼3.5-fold) of transcript expression in high pod number-containing parental accessions and homozygous individuals of two mapping populations especially during pollen and pod development.
The proposed mQTL-seq-driven combinatorial strategy has profound efficacy in rapid genome-wide scanning of potential candidate gene(s) underlying trait-associated high-resolution robust QTL(s), thereby expediting genomics-assisted breeding and genetic enhancement of crop plants, including chickpea. PMID:26685680
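The Δ (SNP index) statistic named in this abstract is, in the usual QTL-seq formulation, the difference between the fraction of reads carrying the non-reference allele in the two phenotypic bulks. A minimal sketch under that assumption follows; the read counts and function names are hypothetical, and the study's own filtering and windowing steps are not reproduced.

```python
def snp_index(alt_reads, total_reads):
    """Fraction of reads at a site that carry the non-reference allele."""
    if total_reads <= 0:
        raise ValueError("site has no read coverage")
    return alt_reads / total_reads

def delta_snp_index(high_bulk, low_bulk):
    """SNP index of the high-trait bulk minus that of the low-trait bulk.

    Each argument is an (alt_reads, total_reads) pair for the same site.
    """
    return snp_index(*high_bulk) - snp_index(*low_bulk)

# Hypothetical site: the alternate allele dominates the high pod number bulk
# and is rare in the low pod number bulk.
delta_linked = delta_snp_index(high_bulk=(18, 20), low_bulk=(2, 20))

# Hypothetical unlinked site: both bulks carry the allele at similar levels.
delta_unlinked = delta_snp_index(high_bulk=(10, 20), low_bulk=(10, 20))
```

Sites unlinked to the trait are expected to give values near zero, while a run of sites with values far from zero marks a candidate QTL interval.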
Brief Overview of a Decade of Genome-Wide Association Studies on Primary Hypertension.
Azam, Afifah Binti; Azizan, Elena Aisha Binti
2018-01-01
Primary hypertension is widely believed to be a complex polygenic disorder whose manifestation is influenced by the interactions of genomic and environmental factors, making identification of susceptibility genes a major challenge. With major advances in high-throughput genotyping technology, the genome-wide association study (GWAS) has become a powerful tool for researchers studying genetically complex diseases. GWASs work by revealing links between DNA sequence variation and a disease or trait with biomedical importance. The human genome is a very long DNA sequence which consists of billions of nucleotides arranged in a unique way. A single base-pair change in the DNA sequence is known as a single nucleotide polymorphism (SNP). With the help of modern genotyping techniques such as chip-based genotyping arrays, thousands of SNPs can be genotyped easily. Large-scale GWASs, in which more than half a million common SNPs are genotyped and analyzed for disease association in hundreds of thousands of cases and controls, have been broadly successful in identifying SNPs associated with heart diseases, diabetes, autoimmune diseases, and psychiatric disorders. It is, however, still debatable whether GWAS is the best approach for hypertension. The following is a brief overview of the outcomes of a decade of GWASs on primary hypertension.
Genomic data for 78 chickens from 14 populations
Li, Diyan; Che, Tiandong; Chen, Binlong; Tian, Shilin; Zhou, Xuming; Zhang, Guolong; Li, Miao; Gaur, Uma; Li, Yan; Luo, Majing; Zhang, Long; Xu, Zhongxian; Zhao, Xiaoling; Yin, Huadong; Wang, Yan; Jin, Long; Tang, Qianzi; Xu, Huailiang; Yang, Mingyao; Zhou, Rongjia; Li, Ruiqiang
2017-01-01
Abstract Background: Since the domestication of the red jungle fowls (Gallus gallus; dating back to ∼10 000 B.P.) in Asia, domestic chickens (Gallus gallus domesticus) have been subjected to the combined effects of natural selection and human-driven artificial selection; this has resulted in marked phenotypic diversity in a number of traits, including behavior, body composition, egg production, and skin color. Population genomic variations through diversifying selection have not been fully investigated. Findings: The whole genomes of 78 domestic chickens were sequenced to an average of 18-fold coverage for each bird. By combining this data with publicly available genomes of five wild red jungle fowls and eight Xishuangbanna game fowls, we conducted a comprehensive comparative genomics analysis of 91 chickens from 17 populations. After aligning ∼21.30 gigabases (Gb) of high-quality data from each individual to the reference chicken genome, we identified ∼6.44 million (M) single nucleotide polymorphisms (SNPs) for each population. These SNPs included 1.10 M novel SNPs in 17 populations that were absent in the current chicken dbSNP (Build 145) entries. Conclusions: The current data is important for population genetics and further studies in chickens and will serve as a valuable resource for investigating diversifying selection and candidate genes for selective breeding in chickens. PMID:28431039
High-density SNP assay development for genetic analysis in maritime pine (Pinus pinaster).
Plomion, C; Bartholomé, J; Lesur, I; Boury, C; Rodríguez-Quilón, I; Lagraulet, H; Ehrenmann, F; Bouffier, L; Gion, J M; Grivet, D; de Miguel, M; de María, N; Cervera, M T; Bagnoli, F; Isik, F; Vendramin, G G; González-Martínez, S C
2016-03-01
Maritime pine provides essential ecosystem services in the south-western Mediterranean basin, where it covers around 4 million ha. Its scattered distribution over a range of environmental conditions makes it an ideal forest tree species for studies of local adaptation and evolutionary responses to climatic change. Highly multiplexed single nucleotide polymorphism (SNP) genotyping arrays are increasingly used to study genetic variation in living organisms and for practical applications in plant and animal breeding and genetic resource conservation. We developed a 9k Illumina Infinium SNP array and genotyped maritime pine trees from (i) a three-generation inbred (F2) pedigree, (ii) the French breeding population and (iii) natural populations from Portugal and the French Atlantic coast. A large proportion of the exploitable SNPs (2052/8410, i.e. 24.4%) segregated in the mapping population and could be mapped, providing the densest ever gene-based linkage map for this species. Based on 5016 SNPs, natural and breeding populations from the French gene pool exhibited similar level of genetic diversity. Population genetics and structure analyses based on 3981 SNP markers common to the Portuguese and French gene pools revealed high levels of differentiation, leading to the identification of a set of highly differentiated SNPs that could be used for seed provenance certification. Finally, we discuss how the validated SNPs could facilitate the identification of ecologically and economically relevant genes in this species, improving our understanding of the demography and selective forces shaping its natural genetic diversity, and providing support for new breeding strategies. © 2015 John Wiley & Sons Ltd.
Novel genes identified in a high-density genome wide association study for nicotine dependence.
Bierut, Laura Jean; Madden, Pamela A F; Breslau, Naomi; Johnson, Eric O; Hatsukami, Dorothy; Pomerleau, Ovide F; Swan, Gary E; Rutter, Joni; Bertelsen, Sarah; Fox, Louis; Fugman, Douglas; Goate, Alison M; Hinrichs, Anthony L; Konvicka, Karel; Martin, Nicholas G; Montgomery, Grant W; Saccone, Nancy L; Saccone, Scott F; Wang, Jen C; Chase, Gary A; Rice, John P; Ballinger, Dennis G
2007-01-01
Tobacco use is a leading contributor to disability and death worldwide, and genetic factors contribute in part to the development of nicotine dependence. To identify novel genes for which natural variation contributes to the development of nicotine dependence, we performed a comprehensive genome wide association study using nicotine dependent smokers as cases and non-dependent smokers as controls. To allow the efficient, rapid, and cost effective screen of the genome, the study was carried out using a two-stage design. In the first stage, genotyping of over 2.4 million single nucleotide polymorphisms (SNPs) was completed in case and control pools. In the second stage, we selected SNPs for individual genotyping based on the most significant allele frequency differences between cases and controls from the pooled results. Individual genotyping was performed in 1050 cases and 879 controls using 31 960 selected SNPs. The primary analysis, a logistic regression model with covariates of age, gender, genotype and gender by genotype interaction, identified 35 SNPs with P-values less than 10^-4 (minimum P-value 1.53 × 10^-6). Although none of the individual findings is statistically significant after correcting for multiple tests, additional statistical analyses support the existence of true findings in this group. Our study nominates several novel genes, such as Neurexin 1 (NRXN1), in the development of nicotine dependence while also identifying a known candidate gene, the beta3 nicotinic cholinergic receptor. This work anticipates the future directions of large-scale genome wide association studies with state-of-the-art methodological approaches and sharing of data with the scientific community.
Hagen, Ingerid J; Billing, Anna M; Rønning, Bernt; Pedersen, Sindre A; Pärn, Henrik; Slate, Jon; Jensen, Henrik
2013-05-01
With the advent of next generation sequencing, new avenues have opened to study genomics in wild populations of non-model species. Here, we describe a successful approach to a genome-wide medium density Single Nucleotide Polymorphism (SNP) panel in a non-model species, the house sparrow (Passer domesticus), through the development of a 10 K Illumina iSelect HD BeadChip. Genomic DNA and cDNA derived from six individuals were sequenced on a 454 GS FLX system and generated a total of 1.2 million sequences, in which SNPs were detected. As no reference genome exists for the house sparrow, we used the zebra finch (Taeniopygia guttata) reference genome to determine the most likely position of each SNP. The 10 000 SNPs on the SNP-chip were selected to be distributed evenly across 31 chromosomes, giving on average one SNP per 100 000 bp. The SNP-chip was screened across 1968 individual house sparrows from four island populations. Of the original 10 000 SNPs, 7413 were found to be variable, and 99% of these SNPs were successfully called in at least 93% of all individuals. We used the SNP-chip to demonstrate the ability of such genome-wide marker data to detect population sub-division, and compared these results to similar analyses using microsatellites. The SNP-chip will be used to map Quantitative Trait Loci (QTL) for fitness-related phenotypic traits in natural populations. © 2013 Blackwell Publishing Ltd.
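The marker figures in this abstract can be checked with simple arithmetic. The sketch below only restates the numbers given above; the "implied span" is an inference from the stated average spacing, not a genome size reported by the authors.

```python
# Figures quoted in the abstract.
n_snps_on_chip = 10_000
avg_bp_per_snp = 100_000
n_variable_snps = 7_413

# Genomic span implied by one SNP per 100 000 bp on average (an inference).
implied_span_bp = n_snps_on_chip * avg_bp_per_snp

# Share of the chip's SNPs that turned out to be variable.
variable_fraction = n_variable_snps / n_snps_on_chip

print(f"implied span: {implied_span_bp / 1e9:.1f} Gb")
print(f"variable SNPs: {variable_fraction:.2%}")
```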
D'Cunha, Anitha; Pandit, Lekha; Malli, Chaithra
2017-06-01
Indian data have been largely missing from genome-wide databases that provide information on genetic variations in different populations. This hinders association studies for complex disorders in India. This study aimed to determine whether the complex genetic structure and endogamy among Indians could potentially influence the design of case-control studies for autoimmune disorders in the south Indian population. A total of 12 single nucleotide variations (SNVs) related to genes associated with autoimmune disorders were genotyped in 370 healthy individuals belonging to six different caste groups in southern India. Allele frequencies were estimated; genetic divergence and phylogenetic relationship within the various caste groups and other HapMap populations were ascertained. Allele frequencies for all genotyped SNVs did not vary significantly among the different groups studied. Wright's FST was 0.001 per cent among the study population and 0.38 per cent when compared with the Gujarati in Houston (GIH) population in HapMap data. The analysis of molecular variance results showed a 97 per cent variation attributable to differences within the study population and <1 per cent variation due to differences between castes. Phylogenetic analysis showed a separation of the Dravidian population from other HapMap populations and particularly from the GIH population. Despite the complex genetic origins of the Indian population, our study indicated a low level of genetic differentiation among Dravidian language-speaking people of south India. Case-control studies of association among Dravidians of south India may not require stratification based on language and caste.
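For readers unfamiliar with Wright's FST, the following minimal sketch computes it for a single biallelic locus as (H_T - H_S) / H_T, with equally weighted subpopulations. The allele frequencies are hypothetical and are not taken from the study; the abstract does not describe which estimator the authors used, so this is only one common textbook form.

```python
def wright_fst(freqs):
    """Wright's F_ST for one biallelic locus.

    freqs: allele frequency of the same allele in each subpopulation.
    Uses F_ST = (H_T - H_S) / H_T with equally weighted subpopulations.
    """
    if not freqs:
        raise ValueError("need at least one subpopulation")
    p_bar = sum(freqs) / len(freqs)
    h_t = 2 * p_bar * (1 - p_bar)                            # total expected heterozygosity
    h_s = sum(2 * p * (1 - p) for p in freqs) / len(freqs)   # mean within-group heterozygosity
    if h_t == 0:
        return 0.0  # locus is fixed everywhere, so there is no differentiation to measure
    return (h_t - h_s) / h_t


# Hypothetical frequencies for six groups with very similar allele frequencies.
similar = wright_fst([0.30, 0.31, 0.29, 0.30, 0.32, 0.30])
# Hypothetical frequencies for two strongly differentiated groups.
divergent = wright_fst([0.10, 0.90])
print(f"similar groups: {similar:.5f}, divergent groups: {divergent:.2f}")
```

Nearly identical frequencies across groups give a value close to zero, which is the pattern the abstract describes among the caste groups.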
Evidence for large inversion polymorphisms in the human genome from HapMap data
Bansal, Vikas; Bashir, Ali; Bafna, Vineet
2007-01-01
Knowledge about structural variation in the human genome has grown tremendously in the past few years. However, inversions represent a class of structural variation that remains difficult to detect. We present a statistical method to identify large inversion polymorphisms using unusual Linkage Disequilibrium (LD) patterns from high-density SNP data. The method is designed to detect chromosomal segments that are inverted (in a majority of the chromosomes) in a population with respect to the reference human genome sequence. We demonstrate the power of this method to detect such inversion polymorphisms through simulations done using the HapMap data. Application of this method to the data from the first phase of the International HapMap project resulted in 176 candidate inversions ranging from 200 kb to several megabases in length. Our predicted inversions include an 800-kb polymorphic inversion at 7p22, a 1.1-Mb inversion at 16p12, and a novel 1.2-Mb inversion on chromosome 10 that is supported by the presence of two discordant fosmids. Analysis of the genomic sequence around inversion breakpoints showed that 11 predicted inversions are flanked by pairs of highly homologous repeats in the inverted orientation. In addition, for three candidate inversions, the inverted orientation is represented in the Celera genome assembly. Although the power of our method to detect inversions is restricted because of inherently noisy LD patterns in population data, inversions predicted by our method represent strong candidates for experimental validation and analysis. PMID:17185644
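The method above relies on linkage disequilibrium between SNPs. As background only, the sketch below computes the standard pairwise r^2 statistic from phased 0/1 haplotypes. It is not the authors' inversion-detection statistic, which the abstract does not specify, and the haplotype vectors are hypothetical.

```python
def ld_r2(hap_a, hap_b):
    """Pairwise LD (r^2) between two biallelic SNPs from phased 0/1 haplotypes."""
    if len(hap_a) != len(hap_b) or not hap_a:
        raise ValueError("haplotype vectors must be non-empty and of equal length")
    n = len(hap_a)
    p_a = sum(hap_a) / n                                   # frequency of allele 1 at SNP A
    p_b = sum(hap_b) / n                                   # frequency of allele 1 at SNP B
    p_ab = sum(a & b for a, b in zip(hap_a, hap_b)) / n    # frequency of the 1-1 haplotype
    d = p_ab - p_a * p_b                                   # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return 0.0  # a monomorphic SNP carries no LD information
    return d * d / denom


# Hypothetical phased haplotypes at three SNPs across eight chromosomes.
snp1 = [0, 0, 1, 1, 0, 1, 0, 1]
snp2 = [0, 0, 1, 1, 0, 1, 0, 1]   # identical to snp1: complete LD
snp3 = [0, 1, 0, 1, 0, 1, 0, 1]   # only partly correlated with snp1

print(ld_r2(snp1, snp2), ld_r2(snp1, snp3))
```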
A mega-analysis of genome-wide association studies for major depressive disorder.
Ripke, Stephan; Wray, Naomi R; Lewis, Cathryn M; Hamilton, Steven P; Weissman, Myrna M; Breen, Gerome; Byrne, Enda M; Blackwood, Douglas H R; Boomsma, Dorret I; Cichon, Sven; Heath, Andrew C; Holsboer, Florian; Lucae, Susanne; Madden, Pamela A F; Martin, Nicholas G; McGuffin, Peter; Muglia, Pierandrea; Noethen, Markus M; Penninx, Brenda P; Pergadia, Michele L; Potash, James B; Rietschel, Marcella; Lin, Danyu; Müller-Myhsok, Bertram; Shi, Jianxin; Steinberg, Stacy; Grabe, Hans J; Lichtenstein, Paul; Magnusson, Patrik; Perlis, Roy H; Preisig, Martin; Smoller, Jordan W; Stefansson, Kari; Uher, Rudolf; Kutalik, Zoltan; Tansey, Katherine E; Teumer, Alexander; Viktorin, Alexander; Barnes, Michael R; Bettecken, Thomas; Binder, Elisabeth B; Breuer, René; Castro, Victor M; Churchill, Susanne E; Coryell, William H; Craddock, Nick; Craig, Ian W; Czamara, Darina; De Geus, Eco J; Degenhardt, Franziska; Farmer, Anne E; Fava, Maurizio; Frank, Josef; Gainer, Vivian S; Gallagher, Patience J; Gordon, Scott D; Goryachev, Sergey; Gross, Magdalena; Guipponi, Michel; Henders, Anjali K; Herms, Stefan; Hickie, Ian B; Hoefels, Susanne; Hoogendijk, Witte; Hottenga, Jouke Jan; Iosifescu, Dan V; Ising, Marcus; Jones, Ian; Jones, Lisa; Jung-Ying, Tzeng; Knowles, James A; Kohane, Isaac S; Kohli, Martin A; Korszun, Ania; Landen, Mikael; Lawson, William B; Lewis, Glyn; Macintyre, Donald; Maier, Wolfgang; Mattheisen, Manuel; McGrath, Patrick J; McIntosh, Andrew; McLean, Alan; Middeldorp, Christel M; Middleton, Lefkos; Montgomery, Grant M; Murphy, Shawn N; Nauck, Matthias; Nolen, Willem A; Nyholt, Dale R; O'Donovan, Michael; Oskarsson, Högni; Pedersen, Nancy; Scheftner, William A; Schulz, Andrea; Schulze, Thomas G; Shyn, Stanley I; Sigurdsson, Engilbert; Slager, Susan L; Smit, Johannes H; Stefansson, Hreinn; Steffens, Michael; Thorgeirsson, Thorgeir; Tozzi, Federica; Treutlein, Jens; Uhr, Manfred; van den Oord, Edwin J C G; Van Grootheest, Gerard; Völzke, Henry; Weilburg, Jeffrey B; Willemsen, Gonneke; Zitman, Frans G; Neale, Benjamin; Daly, Mark; Levinson, Douglas F; Sullivan, Patrick F
2013-04-01
Prior genome-wide association studies (GWAS) of major depressive disorder (MDD) have met with limited success. We sought to increase statistical power to detect disease loci by conducting a GWAS mega-analysis for MDD. In the MDD discovery phase, we analyzed more than 1.2 million autosomal and X chromosome single-nucleotide polymorphisms (SNPs) in 18 759 independent and unrelated subjects of recent European ancestry (9240 MDD cases and 9519 controls). In the MDD replication phase, we evaluated 554 SNPs in independent samples (6783 MDD cases and 50 695 controls). We also conducted a cross-disorder meta-analysis using 819 autosomal SNPs with P<0.0001 for either MDD or the Psychiatric GWAS Consortium bipolar disorder (BIP) mega-analysis (9238 MDD cases/8039 controls and 6998 BIP cases/7775 controls). No SNPs achieved genome-wide significance in the MDD discovery phase, the MDD replication phase or in pre-planned secondary analyses (by sex, recurrent MDD, recurrent early-onset MDD, age of onset, pre-pubertal onset MDD or typical-like MDD from a latent class analysis of the MDD criteria). In the MDD-bipolar cross-disorder analysis, 15 SNPs exceeded genome-wide significance (P<5 × 10(-8)), and all were in a 248 kb interval of high LD on 3p21.1 (chr3:52 425 083-53 822 102, minimum P=5.9 × 10(-9) at rs2535629). Although this is the largest genome-wide analysis of MDD yet conducted, the high prevalence of MDD means that the sample is still underpowered to detect genetic effects typical for complex traits. Therefore, we were unable to identify robust and replicable findings. We discuss what this means for genetic research for MDD. The 3p21.1 MDD-BIP finding should be interpreted with caution as the most significant SNP did not replicate in MDD samples, and genotyping in independent samples will be needed to resolve its status.
Johnatty, Sharon E; Tyrer, Jonathan P; Kar, Siddhartha; Beesley, Jonathan; Lu, Yi; Gao, Bo; Fasching, Peter A; Hein, Alexander; Ekici, Arif B; Beckmann, Matthias W; Lambrechts, Diether; Van Nieuwenhuysen, Els; Vergote, Ignace; Lambrechts, Sandrina; Rossing, Mary Anne; Doherty, Jennifer A; Chang-Claude, Jenny; Modugno, Francesmary; Ness, Roberta B; Moysich, Kirsten B; Levine, Douglas A; Kiemeney, Lambertus A; Massuger, Leon F A G; Gronwald, Jacek; Lubiński, Jan; Jakubowska, Anna; Cybulski, Cezary; Brinton, Louise; Lissowska, Jolanta; Wentzensen, Nicolas; Song, Honglin; Rhenius, Valerie; Campbell, Ian; Eccles, Diana; Sieh, Weiva; Whittemore, Alice S; McGuire, Valerie; Rothstein, Joseph H; Sutphen, Rebecca; Anton-Culver, Hoda; Ziogas, Argyrios; Gayther, Simon A; Gentry-Maharaj, Aleksandra; Menon, Usha; Ramus, Susan J; Pearce, Celeste L; Pike, Malcolm C; Stram, Daniel O; Wu, Anna H; Kupryjanczyk, Jolanta; Dansonka-Mieszkowska, Agnieszka; Rzepecka, Iwona K; Spiewankiewicz, Beata; Goodman, Marc T; Wilkens, Lynne R; Carney, Michael E; Thompson, Pamela J; Heitz, Florian; du Bois, Andreas; Schwaab, Ira; Harter, Philipp; Pisterer, Jacobus; Hillemanns, Peter; Karlan, Beth Y; Walsh, Christine; Lester, Jenny; Orsulic, Sandra; Winham, Stacey J; Earp, Madalene; Larson, Melissa C; Fogarty, Zachary C; Høgdall, Estrid; Jensen, Allan; Kjaer, Susanne Kruger; Fridley, Brooke L; Cunningham, Julie M; Vierkant, Robert A; Schildkraut, Joellen M; Iversen, Edwin S; Terry, Kathryn L; Cramer, Daniel W; Bandera, Elisa V; Orlow, Irene; Pejovic, Tanja; Bean, Yukie; Høgdall, Claus; Lundvall, Lene; McNeish, Ian; Paul, James; Carty, Karen; Siddiqui, Nadeem; Glasspool, Rosalind; Sellers, Thomas; Kennedy, Catherine; Chiew, Yoke-Eng; Berchuck, Andrew; MacGregor, Stuart; Pharoah, Paul D P; Goode, Ellen L; deFazio, Anna; Webb, Penelope M; Chenevix-Trench, Georgia
2015-12-01
Chemotherapy resistance remains a major challenge in the treatment of ovarian cancer. We hypothesize that germline polymorphisms might be associated with clinical outcome. We analyzed approximately 2.8 million genotyped and imputed SNPs from the iCOGS experiment for progression-free survival (PFS) and overall survival (OS) in 2,901 European epithelial ovarian cancer (EOC) patients who underwent first-line treatment of cytoreductive surgery and chemotherapy regardless of regimen, and in a subset of 1,098 patients treated with ≥ 4 cycles of paclitaxel and carboplatin at standard doses. We evaluated the top SNPs in 4,434 EOC patients, including patients from The Cancer Genome Atlas. In addition, we conducted pathway analysis of all intragenic SNPs and tested their association with PFS and OS using gene set enrichment analysis. Five SNPs were significantly associated (P ≤ 1.0 × 10(-5)) with poorer outcomes in at least one of the four analyses, three of which, rs4910232 (11p15.3), rs2549714 (16q23), and rs6674079 (1q22), were located in long noncoding RNAs (lncRNAs) RP11-179A10.1, RP11-314O13.1, and RP11-284F21.8, respectively (P ≤ 7.1 × 10(-6)). ENCODE ChIP-seq data at 1q22 for normal ovary show evidence of histone modification around RP11-284F21.8, and rs6674079 is perfectly correlated with another SNP within the super-enhancer MEF2D, expression levels of which were reportedly associated with prognosis in another solid tumor. YAP1- and WWTR1 (TAZ)-stimulated gene expression and high-density lipoprotein (HDL)-mediated lipid transport pathways were associated with PFS and OS, respectively, in the cohort who had standard chemotherapy (pGSEA ≤ 6 × 10(-3)). We have identified SNPs in three lncRNAs that might be important targets for novel EOC therapies. ©2015 American Association for Cancer Research.
Keel, B N; Nonneman, D J; Rohrer, G A
2017-08-01
Genetic variants detected from sequence have been used to successfully identify causal variants and map complex traits in several organisms. High and moderate impact variants, those expected to alter or disrupt the protein coded by a gene and those that regulate protein production, likely have a more significant effect on phenotypic variation than do other types of genetic variants. Hence, a comprehensive list of these functional variants would be of considerable interest in swine genomic studies, particularly those targeting fertility and production traits. Whole-genome sequence was obtained from 72 of the founders of an intensely phenotyped experimental swine herd at the U.S. Meat Animal Research Center (USMARC). These animals included all 24 of the founding boars (12 Duroc and 12 Landrace) and 48 Yorkshire-Landrace composite sows. Sequence reads were mapped to the Sscrofa10.2 genome build, resulting in a mean of 6.1 fold (×) coverage per genome. A total of 22 342 915 high confidence SNPs were identified from the sequenced genomes. These included 21 million previously reported SNPs and 79% of the 62 163 SNPs on the PorcineSNP60 BeadChip assay. Variation was detected in the coding sequence or untranslated regions (UTRs) of 87.8% of the genes in the porcine genome: loss-of-function variants were predicted in 504 genes, 10 202 genes contained nonsynonymous variants, 10 773 had variation in UTRs and 13 010 genes contained synonymous variants. Approximately 139 000 SNPs were classified as loss-of-function, nonsynonymous or regulatory, which suggests that over 99% of the variation detected in our pigs could potentially be ignored, allowing us to focus on a much smaller number of functional SNPs during future analyses. Published 2017. This article is a U.S. Government work and is in the public domain in the USA.
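The closing claim in the abstract above can be checked with simple arithmetic. The short sketch below uses only the two counts quoted in the abstract; the count of functional SNPs is stated there as approximate, so the resulting percentages are approximate as well.

```python
TOTAL_SNPS = 22_342_915     # high confidence SNPs reported in the abstract
FUNCTIONAL_SNPS = 139_000   # "approximately 139 000" loss-of-function, nonsynonymous or regulatory

functional_fraction = FUNCTIONAL_SNPS / TOTAL_SNPS
ignorable_fraction = 1 - functional_fraction

print(f"functional: {functional_fraction:.2%}, remaining: {ignorable_fraction:.2%}")
```

The functional class comes to roughly 0.6% of all detected SNPs, which is consistent with the statement that over 99% of the variation could potentially be set aside.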
Johnatty, Sharon E.; Tyrer, Jonathan P.; Kar, Siddhartha; Beesley, Jonathan; Lu, Yi; Gao, Bo; Fasching, Peter A.; Hein, Alexander; Ekici, Arif B.; Beckmann, Matthias W.; Lambrechts, Diether; Nieuwenhuysen, Els Van; Vergote, Ignace; Lambrechts, Sandrina; Rossing, Mary Anne; Doherty, Jennifer A.; Chang-Claude, Jenny; Modugno, Francesmary; Ness, Roberta B.; Moysich, Kirsten B.; Levine, Douglas A.; Kiemeney, Lambertus A.; Massuger, Leon F.A.G.; Gronwald, Jacek; Lubiński, Jan; Jakubowska, Anna; Cybulski, Cezary; Brinton, Louise; Lissowska, Jolanta; Wentzensen, Nicolas; Song, Honglin; Rhenius, Valerie; Campbell, Ian; Eccles, Diana; Sieh, Weiva; Whittemore, Alice S.; McGuire, Valerie; Rothstein, Joseph H.; Sutphen, Rebecca; Anton-Culver, Hoda; Ziogas, Argyrios; Gayther, Simon A.; Gentry-Maharaj, Aleksandra; Menon, Usha; Ramus, Susan J.; Pearce, Celeste L; Pike, Malcolm C; Stram, Daniel O.; Wu, Anna H.; Kupryjanczyk, Jolanta; Dansonka-Mieszkowska, Agnieszka; Rzepecka, Iwona K.; Spiewankiewicz, Beata; Goodman, Marc T.; Wilkens, Lynne R.; Carney, Michael E.; Thompson, Pamela J; Heitz, Florian; du Bois, Andreas; Schwaab, Ira; Harter, Philipp; Pisterer, Jacobus; Hillemanns, Peter; Karlan, Beth Y.; Walsh, Christine; Lester, Jenny; Orsulic, Sandra; Winham, Stacey J; Earp, Madalene; Larson, Melissa C.; Fogarty, Zachary C.; Høgdall, Estrid; Jensen, Allan; Kjaer, Susanne Kruger; Fridley, Brooke L.; Cunningham, Julie M.; Vierkant, Robert A.; Schildkraut, Joellen M.; Iversen, Edwin S.; Terry, Kathryn L.; Cramer, Daniel W.; Bandera, Elisa V.; Orlow, Irene; Pejovic, Tanja; Bean, Yukie; Høgdall, Claus; Lundvall, Lene; McNeish, Ian; Paul, James; Carty, Karen; Siddiqui, Nadeem; Glasspool, Rosalind; Sellers, Thomas; Kennedy, Catherine; Chiew, Yoke-Eng; Berchuck, Andrew; MacGregor, Stuart; deFazio, Anna; Pharoah, Paul D.P.; Goode, Ellen L.; deFazio, Anna; Webb, Penelope M.; Chenevix-Trench, Georgia
2015-01-01
Purpose Chemotherapy resistance remains a major challenge in the treatment of ovarian cancer. We hypothesize that germline polymorphisms might be associated with clinical outcome. Experimental Design We analyzed ~2.8 million genotyped and imputed SNPs from the iCOGS experiment for progression-free survival (PFS) and overall survival (OS) in 2,901 European epithelial ovarian cancer (EOC) patients who underwent first-line treatment of cytoreductive surgery and chemotherapy regardless of regimen, and in a subset of 1,098 patients treated with ≥4 cycles of paclitaxel and carboplatin at standard doses. We evaluated the top SNPs in 4,434 EOC patients including patients from The Cancer Genome Atlas. Additionally, we conducted pathway analysis of all intragenic SNPs and tested their association with PFS and OS using gene set enrichment analysis. Results Five SNPs were significantly associated (p≤1.0x10−5) with poorer outcomes in at least one of the four analyses, three of which, rs4910232 (11p15.3), rs2549714 (16q23) and rs6674079 (1q22), were located in long non-coding RNAs (lncRNAs) RP11–179A10.1, RP11–314O13.1 and RP11–284F21.8, respectively (p≤7.1x10−6). ENCODE ChIP-seq data at 1q22 for normal ovary show evidence of histone modification around RP11–284F21.8, and rs6674079 is perfectly correlated with another SNP within the super-enhancer MEF2D, expression levels of which were reportedly associated with prognosis in another solid tumor. YAP1- and WWTR1 (TAZ)-stimulated gene expression, and HDL-mediated lipid transport pathways were associated with PFS and OS, respectively, in the cohort who had standard chemotherapy (pGSEA≤6x10−3). Conclusion We have identified SNPs in three lncRNAs that might be important targets for novel EOC therapies. PMID:26152742
Sabir, Jamal S M; Arasappan, Dhivya; Bahieldin, Ahmed; Abo-Aba, Salah; Bafeel, Sameera; Zari, Talal A; Edris, Sherif; Shokry, Ahmed M; Gadalla, Nour O; Ramadan, Ahmed M; Atef, Ahmed; Al-Kordy, Magdy A; El-Domyati, Fotoh M; Jansen, Robert K
2014-01-01
Date palm is a very important crop in western Asia and northern Africa, and it is the oldest domesticated fruit tree with archaeological records dating back 5000 years. The huge economic value of this crop has generated considerable interest in breeding programs to enhance production of dates. One of the major limitations of these efforts is the uncertainty regarding the number of date palm cultivars, which are currently based on fruit shape, size, color, and taste. Whole mitochondrial and plastid genome sequences were utilized to examine single nucleotide polymorphisms (SNPs) of date palms to evaluate the efficacy of this approach for molecular characterization of cultivars. Mitochondrial and plastid genomes of nine Saudi Arabian cultivars were sequenced. For each species about 60 million 100 bp paired-end reads were generated from total genomic DNA using the Illumina HiSeq 2000 platform. For each cultivar, sequences were aligned separately to the published date palm plastid and mitochondrial reference genomes, and SNPs were identified. The results identified cultivar-specific SNPs for eight of the nine cultivars. Two previous SNP analyses of mitochondrial and plastid genomes identified substantial intra-cultivar ( = intra-varietal) polymorphisms in organellar genomes but these studies did not properly take into account the fact that nearly half of the plastid genome has been integrated into the mitochondrial genome. Filtering all sequencing reads that mapped to both organellar genomes nearly eliminated mitochondrial heteroplasmy but all plastid SNPs remained heteroplasmic. This investigation provides valuable insights into how to deal with interorganellar DNA transfer in performing SNP analyses from total genomic DNA. The results confirm recent suggestions that plastid heteroplasmy is much more common than previously thought. 
Finally, low levels of sequence variation in plastid and mitochondrial genomes argue for using nuclear SNPs for molecular characterization of date palm cultivars.
Ma, Zhiying; He, Shoupu; Wang, Xingfen; Sun, Junling; Zhang, Yan; Zhang, Guiyin; Wu, Liqiang; Li, Zhikun; Liu, Zhihao; Sun, Gaofei; Yan, Yuanyuan; Jia, Yinhua; Yang, Jun; Pan, Zhaoe; Gu, Qishen; Li, Xueyuan; Sun, Zhengwen; Dai, Panhong; Liu, Zhengwen; Gong, Wenfang; Wu, Jinhua; Wang, Mi; Liu, Hengwei; Feng, Keyun; Ke, Huifeng; Wang, Junduo; Lan, Hongyu; Wang, Guoning; Peng, Jun; Wang, Nan; Wang, Liru; Pang, Baoyin; Peng, Zhen; Li, Ruiqiang; Tian, Shilin; Du, Xiongming
2018-05-07
Upland cotton is the most important natural-fiber crop. The genomic variation of diverse germplasms and alleles underpinning fiber quality and yield should be extensively explored. Here, we resequenced a core collection comprising 419 accessions with 6.55-fold coverage depth and identified approximately 3.66 million SNPs for evaluating the genomic variation. We performed phenotyping across 12 environments and conducted genome-wide association study of 13 fiber-related traits. 7,383 unique SNPs were significantly associated with these traits and were located within or near 4,820 genes; more associated loci were detected for fiber quality than fiber yield, and more fiber genes were detected in the D than the A subgenome. Several previously undescribed causal genes for days to flowering, fiber length, and fiber strength were identified. Phenotypic selection for these traits increased the frequency of elite alleles during domestication and breeding. These results provide targets for molecular selection and genetic manipulation in cotton improvement.
Genotype Imputation for Latinos Using the HapMap and 1000 Genomes Project Reference Panels.
Gao, Xiaoyi; Haritunians, Talin; Marjoram, Paul; McKean-Cowdin, Roberta; Torres, Mina; Taylor, Kent D; Rotter, Jerome I; Gauderman, William J; Varma, Rohit
2012-01-01
Genotype imputation is a vital tool in genome-wide association studies (GWAS) and meta-analyses of multiple GWAS results. Imputation enables researchers to increase genomic coverage and to pool data generated using different genotyping platforms. HapMap samples are often employed as the reference panel. More recently, the 1000 Genomes Project resource is becoming the primary source for reference panels. Multiple GWAS and meta-analyses are targeting Latinos, the most populous and fastest-growing minority group in the US. However, genotype imputation resources for Latinos are at present rather limited compared to those for individuals of European ancestry, largely because of the lack of good reference data. One choice of reference panel for Latinos is one derived from the population of Mexican individuals in Los Angeles contained in the HapMap Phase 3 project and the 1000 Genomes Project. However, a detailed evaluation of the quality of the imputed genotypes derived from the public reference panels has not yet been reported. Using simulation studies, the Illumina OmniExpress GWAS data from the Los Angeles Latino Eye Study and the MACH software package, we evaluated the accuracy of genotype imputation in Latinos. Our results show that the 1000 Genomes Project AMR + CEU + YRI reference panel provides the highest imputation accuracy for Latinos, and that also including Asian samples in the panel can reduce imputation accuracy. We also provide the imputation accuracy for each autosomal chromosome using the 1000 Genomes Project panel for Latinos. Our results serve as a guide to future imputation-based analysis in Latinos.
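The abstract does not state how imputation accuracy was scored. Two measures commonly used for this purpose are genotype concordance (best-guess genotype versus a masked true genotype) and the squared correlation between imputed dosage and true genotype. The sketch below shows both on hypothetical data; the function names and values are illustrative assumptions, not the authors' pipeline or MACH output.

```python
def concordance(true_genotypes, imputed_genotypes):
    """Fraction of samples whose best-guess imputed genotype matches the truth."""
    if len(true_genotypes) != len(imputed_genotypes) or not true_genotypes:
        raise ValueError("inputs must be non-empty and of equal length")
    matches = sum(t == g for t, g in zip(true_genotypes, imputed_genotypes))
    return matches / len(true_genotypes)


def dosage_r2(true_genotypes, dosages):
    """Squared Pearson correlation between true allele counts and imputed dosages."""
    if len(true_genotypes) != len(dosages) or not true_genotypes:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(true_genotypes)
    mean_t = sum(true_genotypes) / n
    mean_d = sum(dosages) / n
    cov = sum((t - mean_t) * (d - mean_d) for t, d in zip(true_genotypes, dosages))
    var_t = sum((t - mean_t) ** 2 for t in true_genotypes)
    var_d = sum((d - mean_d) ** 2 for d in dosages)
    if var_t == 0 or var_d == 0:
        return 0.0  # correlation is undefined for a constant vector
    return cov * cov / (var_t * var_d)


# Hypothetical masked genotypes (0, 1 or 2 copies of the alternate allele) at one SNP.
truth = [0, 1, 2, 1, 0, 2, 1, 0]
best_guess = [0, 1, 2, 1, 0, 1, 1, 0]                  # one sample imputed incorrectly
dosage = [0.1, 0.9, 1.8, 1.1, 0.0, 1.4, 1.0, 0.2]      # expected allele counts

print(concordance(truth, best_guess), dosage_r2(truth, dosage))
```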
Genome-wide association uncovers shared genetic effects among personality traits and mood states.
Luciano, Michelle; Huffman, Jennifer E; Arias-Vásquez, Alejandro; Vinkhuyzen, Anna A E; Middeldorp, Christel M; Giegling, Ina; Payton, Antony; Davies, Gail; Zgaga, Lina; Janzing, Joost; Ke, Xiayi; Galesloot, Tessel; Hartmann, Annette M; Ollier, William; Tenesa, Albert; Hayward, Caroline; Verhagen, Maaike; Montgomery, Grant W; Hottenga, Jouke-Jan; Konte, Bettina; Starr, John M; Vitart, Veronique; Vos, Pieter E; Madden, Pamela A F; Willemsen, Gonneke; Konnerth, Heike; Horan, Michael A; Porteous, David J; Campbell, Harry; Vermeulen, Sita H; Heath, Andrew C; Wright, Alan; Polasek, Ozren; Kovacevic, Sanja B; Hastie, Nicholas D; Franke, Barbara; Boomsma, Dorret I; Martin, Nicholas G; Rujescu, Dan; Wilson, James F; Buitelaar, Jan; Pendleton, Neil; Rudan, Igor; Deary, Ian J
2012-09-01
Measures of personality and psychological distress are correlated and exhibit genetic covariance. We conducted univariate genome-wide SNP (~2.5 million) and gene-based association analyses of these traits and examined the overlap in results across traits, including a prediction analysis of mood states using genetic polygenic scores for personality. Measures of neuroticism, extraversion, and symptoms of anxiety, depression, and general psychological distress were collected in eight European cohorts (n ranged 546-1,338; maximum total n = 6,268) whose mean age ranged from 55 to 79 years. Meta-analysis of the cohort results was performed, with follow-up associations of the top SNPs and genes investigated in independent cohorts (n = 527-6,032). Suggestive association (P = 8 × 10(-8)) of rs1079196 in the FHIT gene was observed with symptoms of anxiety. Other notable associations (P < 6.09 × 10(-6)) included SNPs in five genes for neuroticism (LCE3C, POLR3A, LMAN1L, ULK3, SCAMP2), KIAA0802 for extraversion, and NOS1 for general psychological distress. An association between symptoms of depression and rs7582472 (near to MGAT5 and NCKAP5) was replicated in two independent samples, but other replication findings were less consistent. Gene-based tests identified a significant locus on chromosome 15 (spanning five genes) associated with neuroticism which replicated (P < 0.05) in an independent cohort. Support for common genetic effects among personality and mood (particularly neuroticism and depressive symptoms) was found in terms of SNP association overlap and polygenic score prediction. The variance explained by individual SNPs was very small (up to 1%) confirming that there are no moderate/large effects of common SNPs on personality and related traits. Copyright © 2012 Wiley Periodicals, Inc.
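The polygenic score prediction mentioned above can be pictured as a weighted sum of allele counts. The sketch below is a minimal version with hypothetical per-allele weights and genotypes; the abstract does not describe how SNPs were selected or weighted, so those details are assumptions here.

```python
def polygenic_score(allele_counts, weights):
    """Weighted sum of effect-allele counts (0, 1 or 2) across SNPs."""
    if len(allele_counts) != len(weights):
        raise ValueError("need one weight per SNP")
    return sum(count * weight for count, weight in zip(allele_counts, weights))


# Hypothetical per-allele effect estimates for three SNPs from a discovery trait.
weights = [0.02, -0.01, 0.03]
# Hypothetical effect-allele counts for one individual at the same three SNPs.
person = [2, 1, 0]

score = polygenic_score(person, weights)
print(f"polygenic score = {score:.3f}")
```

In a prediction analysis of the kind described, a score like this would be computed for each individual in a target cohort and then tested for association with the second trait.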
2012-01-01
Background Molecular breeding of pepper (Capsicum spp.) can be accelerated by developing DNA markers associated with transcriptomes in breeding germplasm. Before the advent of next generation sequencing (NGS) technologies, the majority of sequencing data were generated by the Sanger sequencing method. By leveraging Sanger EST data, we have generated a wealth of genetic information for pepper including thousands of SNPs and Single Position Polymorphic (SPP) markers. To complement and enhance these resources, we applied NGS to three pepper genotypes: Maor, Early Jalapeño and Criollo de Morelos-334 (CM334) to identify SNPs and SSRs in the assembly of these three genotypes. Results Two pepper transcriptome assemblies were developed with different purposes. The first reference sequence, assembled by CAP3 software, comprises 31,196 contigs from >125,000 Sanger-EST sequences that were mainly derived from a Korean F1-hybrid line, Bukang. Overlapping probes were designed for 30,815 unigenes to construct a pepper Affymetrix GeneChip® microarray for whole genome analyses. In addition, custom Python scripts were used to identify 4,236 SNPs in contigs of the assembly. A total of 2,489 simple sequence repeats (SSRs) were identified from the assembly, and primers were designed for the SSRs. Annotation of contigs using Blast2GO software resulted in information for 60% of the unigenes in the assembly. The second transcriptome assembly was constructed from more than 200 million Illumina Genome Analyzer II reads (80–120 nt) using a combination of Velvet, CLC workbench and CAP3 software packages. BWA, SAMtools and in-house Perl scripts were used to identify SNPs among three pepper genotypes. The SNPs were filtered to be at least 50 bp from any intron-exon junctions as well as flanking SNPs. More than 22,000 high-quality putative SNPs were identified. 
Using the MISA software, 10,398 SSR markers were also identified within the Illumina transcriptome assembly and primers were designed for the identified markers. The assembly was annotated by Blast2GO and 14,740 (12%) of annotated contigs were associated with functional proteins. Conclusions Before a pepper genome sequence became available, assembling transcriptomes of this economically important crop was required to generate thousands of high-quality molecular markers that could be used in breeding programs. To better understand the assembled sequences and to identify candidate genes underlying QTLs, we annotated the contigs of the Sanger-EST and Illumina transcriptome assemblies. This and other information has been curated in a database dedicated to the pepper project. PMID:23110314
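The filtering rule described in the Results, keeping SNPs at least 50 bp from any intron-exon junction and from flanking SNPs, can be sketched as follows. The authors used in-house Perl scripts that are not shown in the abstract; this Python version, its function name, and the positions are hypothetical, and it assumes that distances are checked against all candidate SNPs on one contig before any are removed.

```python
def filter_snps(snp_positions, junction_positions, min_distance=50):
    """Keep SNPs at least min_distance bp from every junction and every other SNP."""
    snps = sorted(snp_positions)
    kept = []
    for pos in snps:
        near_junction = any(abs(pos - j) < min_distance for j in junction_positions)
        near_snp = any(other != pos and abs(pos - other) < min_distance for other in snps)
        if not near_junction and not near_snp:
            kept.append(pos)
    return kept


# Hypothetical coordinates (bp) on a single contig.
snps = [100, 130, 400, 700, 980]
junctions = [420, 1000]

# 100 and 130 are too close to each other; 400 and 980 are too close to a junction.
print(filter_snps(snps, junctions))
```

Checking against all candidates, rather than only against SNPs already kept, is the stricter reading of the rule; the alternative would retain one SNP from each close pair.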
Ali, Mohammad; Liu, Xuanyao; Pillai, Esakimuthu Nisha; Chen, Peng; Khor, Chiea-Chuen; Ong, Rick Twee-Hee; Teo, Yik-Ying
2014-07-22
India is home to many ethnically and linguistically diverse populations. It is hypothesized that a history of invasions by people from Persia and Central Asia, who are referred to as Aryans in Hindu Holy Scriptures, had a defining role in shaping the Indian population canvas. A shift in spoken languages from Dravidian languages to Indo-European languages around 1500 B.C. is central to the Aryan Invasion Theory. Here we investigate the genetic differences between two sub-populations of India consisting of: (1) the Indo-European language speaking Gujarati Indians with genome-wide data from the International HapMap Project; and (2) the Dravidian language speaking Tamil Indians with genome-wide data from the Singapore Genome Variation Project. We implemented three population genetics measures to identify genomic regions that are significantly differentiated between the two Indian populations originating from the north and south of India. These measures singled out genomic regions with: (i) SNPs exhibiting significant variation in allele frequencies in the two Indian populations; and (ii) differential signals of positive natural selection as quantified by the integrated haplotype score (iHS) and cross-population extended haplotype homozygosity (XP-EHH). One of the regions that emerged spans the SLC24A5 gene that has been functionally shown to affect skin pigmentation, with a higher degree of genetic sharing between Gujarati Indians and Europeans. Our finding points to gene flow from Europe to north India that provides an explanation for the lighter skin tones present in North Indians in comparison to South Indians.
Pharmacoethnicity in Paclitaxel-Induced Sensory Peripheral Neuropathy
Komatsu, Masaaki; Wheeler, Heather E.; Chung, Suyoun; Low, Siew-Kee; Wing, Claudia; Delaney, Shannon M.; Gorsic, Lidija K.; Takahashi, Atsushi; Kubo, Michiaki; Kroetz, Deanna L.; Zhang, Wei; Nakamura, Yusuke; Dolan, M. Eileen
2015-01-01
Purpose Paclitaxel is used worldwide in the treatment of breast, lung, ovarian and other cancers. Sensory peripheral neuropathy is an associated adverse effect that cannot be predicted, prevented or mitigated. To better understand the contribution of germline genetic variation to paclitaxel-induced peripheral neuropathy, we undertook an integrative approach that combines genome-wide association study (GWAS) data generated from HapMap lymphoblastoid cell lines (LCLs) and Asian patients. Methods GWAS was performed with paclitaxel-induced cytotoxicity generated in 363 LCLs and with paclitaxel-induced neuropathy from 145 Asian patients. A gene-based approach was used to identify overlapping genes and compare to a European clinical cohort of paclitaxel-induced neuropathy. Neurons derived from human induced pluripotent stem cells were used for functional validation of candidate genes. Results SNPs near AIPL1 were significantly associated with paclitaxel-induced cytotoxicity in Asian LCLs (P < 10−6). Decreased expression of AIPL1 resulted in decreased sensitivity of neurons to paclitaxel by inducing neurite morphological changes as measured by increased relative total outgrowth, number of processes and mean process length. Using a gene-based analysis, there were 32 genes that overlapped between Asian LCL cytotoxicity and Asian patient neuropathy (P < 0.05) including BCR. Upon BCR knockdown, there was an increase in neuronal sensitivity to paclitaxel as measured by neurite morphological characteristics. Conclusion We identified genetic variants associated with Asian paclitaxel-induced cytotoxicity and functionally validated the AIPL1 and BCR in a neuronal cell model. Furthermore, the integrative pharmacogenomics approach of LCL/patient GWAS may help prioritize target genes associated with chemotherapeutic-induced peripheral neuropathy. PMID:26015512
Koning-Boucoiran, Carole F S; Esselink, G Danny; Vukosavljev, Mirjana; van 't Westende, Wendy P C; Gitonga, Virginia W; Krens, Frans A; Voorrips, Roeland E; van de Weg, W Eric; Schulz, Dietmar; Debener, Thomas; Maliepaard, Chris; Arens, Paul; Smulders, Marinus J M
2015-01-01
In order to develop a versatile and large SNP array for rose, we set out to mine ESTs from diverse sets of rose germplasm. For this, RNA-Seq libraries containing about 700 million reads were generated from tetraploid cut and garden roses using Illumina paired-end sequencing, and from diploid Rosa multiflora using 454 sequencing. Separate de novo assemblies were performed in order to identify single nucleotide polymorphisms (SNPs) within and between rose varieties. SNPs among tetraploid roses were selected for constructing a genotyping array that can be employed for genetic mapping and marker-trait association discovery in breeding programs based on tetraploid germplasm, both from cut roses and from garden roses. In total 68,893 SNPs were included on the WagRhSNP Axiom array. Next, an orthology-guided assembly was performed for the construction of a non-redundant rose transcriptome database. A total of 21,740 transcripts had significant hits with orthologous genes in the strawberry (Fragaria vesca L.) genome. Of these, 13,390 appeared to contain the full-length coding regions. This newly established transcriptome resource adds considerably to the currently available sequence resources for the Rosaceae family in general and the genus Rosa in particular.
Genome-wide association analysis of age-at-onset in Alzheimer’s disease
Kamboh, M. Ilyas; Barmada, M. Michael; Demirci, F. Yesim; Minster, Ryan L.; Carrasquillo, Minerva M.; Pankratz, V. Shane; Younkin, Steven G.; Saykin, Andrew J.; Sweet, Robert A.; Feingold, Eleanor; DeKosky, Steven T.; Lopez, Oscar L.
2011-01-01
The risk of Alzheimer’s disease (AD) is strongly determined by genetic factors and recent genome-wide association studies (GWAS) have identified several genes for the disease risk. In addition to the disease risk, age-at-onset (AAO) of AD also has a strong genetic component with an estimated heritability of 42%. Identification of AAO genes may help to understand the biological mechanisms that regulate the onset of the disease. Here we report the first GWAS focused on identifying genes for the AAO of AD. We performed a genome-wide meta-analysis on 3 samples comprising a total of 2,222 AD cases. A total of ~2.5 million directly genotyped or imputed SNPs were analyzed in relation to AAO of AD. As expected, the most significant associations were observed in the APOE region on chromosome 19 where several SNPs surpassed the conservative genome-wide significance threshold (P<5E-08). The most significant SNP outside the APOE region was located in the DCHS2 gene on chromosome 4q31.3 (rs1466662; P=4.95E-07). There were 19 additional significant SNPs in this region at P<1E-04 and the DCHS2 gene is expressed in the cerebral cortex and thus is a potential candidate for affecting AAO in AD. These findings need to be confirmed in additional well-powered samples. PMID:22005931
Genome-wide association analysis of age-at-onset in Alzheimer's disease.
Kamboh, M I; Barmada, M M; Demirci, F Y; Minster, R L; Carrasquillo, M M; Pankratz, V S; Younkin, S G; Saykin, A J; Sweet, R A; Feingold, E; DeKosky, S T; Lopez, O L
2012-12-01
The risk of Alzheimer's disease (AD) is strongly determined by genetic factors and recent genome-wide association studies (GWAS) have identified several genes for the disease risk. In addition to the disease risk, age-at-onset (AAO) of AD also has a strong genetic component with an estimated heritability of 42%. Identification of AAO genes may help to understand the biological mechanisms that regulate the onset of the disease. Here we report the first GWAS focused on identifying genes for the AAO of AD. We performed a genome-wide meta-analysis on three samples comprising a total of 2222 AD cases. A total of ~2.5 million directly genotyped or imputed single-nucleotide polymorphisms (SNPs) were analyzed in relation to AAO of AD. As expected, the most significant associations were observed in the apolipoprotein E (APOE) region on chromosome 19 where several SNPs surpassed the conservative genome-wide significance threshold (P<5E-08). The most significant SNP outside the APOE region was located in the DCHS2 gene on chromosome 4q31.3 (rs1466662; P=4.95E-07). There were 19 additional significant SNPs in this region at P<1E-04 and the DCHS2 gene is expressed in the cerebral cortex and thus is a potential candidate for affecting AAO in AD. These findings need to be confirmed in additional well-powered samples.
Bian, Yang; De Vries, Brian; Tracy, William F.
2016-01-01
Physiological leaf spotting, or flecking, is a mild-lesion phenotype observed on the leaves of several commonly used maize (Zea mays) inbred lines and has been anecdotally linked to enhanced broad-spectrum disease resistance. Flecking was assessed in the maize nested association mapping (NAM) population, comprising 4,998 recombinant inbred lines from 25 biparental families, and in an association population, comprising 279 diverse maize inbreds. Joint family linkage analysis was conducted with 7,386 markers in the NAM population. Genome-wide association tests were performed with 26.5 million single-nucleotide polymorphisms (SNPs) in the NAM population and with 246,497 SNPs in the association population, resulting in the identification of 18 and three loci associated with variation in flecking, respectively. Many of the candidate genes colocalizing with associated SNPs are similar to genes that function in plant defense response via cell wall modification, salicylic acid- and jasmonic acid-dependent pathways, redox homeostasis, stress response, and vesicle trafficking/remodeling. Significant positive correlations were found between increased flecking, stronger defense response, increased disease resistance, and increased pest resistance. A nonlinear relationship with total kernel weight also was observed whereby lines with relatively high levels of flecking had, on average, lower total kernel weight. We present evidence suggesting that mild flecking could be used as a selection criterion for breeding programs trying to incorporate broad-spectrum disease resistance. PMID:27670817
Schweighofer, Carmen D.; Coombes, Kevin R.; Majewski, Tadeusz; Barron, Lynn L.; Lerner, Susan; Sargent, Rachel L.; O'Brien, Susan; Ferrajoli, Alessandra; Wierda, William G.; Czerniak, Bogdan A.; Medeiros, L. Jeffrey; Keating, Michael J.; Abruzzo, Lynne V.
2013-01-01
Genomic abnormalities, such as deletions in 11q22 or 17p13, are associated with poorer prognosis in patients with chronic lymphocytic leukemia (CLL). We hypothesized that unknown regions of copy number variation (CNV) affect clinical outcome and can be detected by array-based single-nucleotide polymorphism (SNP) genotyping. We compared SNP genotypes from 168 untreated patients with CLL with genotypes from 73 white HapMap controls. We identified 322 regions of recurrent CNV, 82 of which occurred significantly more often in CLL than in HapMap (CLL-specific CNV), including regions typically aberrant in CLL: deletions in 6q21, 11q22, 13q14, and 17p13 and trisomy 12. In univariate analyses, 35 of total and 11 of CLL-specific CNVs were associated with unfavorable time-to-event outcomes, including gains or losses in chromosomes 2p, 4p, 4q, 6p, 6q, 7q, 11p, 11q, and 17p. In multivariate analyses, six CNVs (ie, CLL-specific variations in 11p15.1-15.4 or 6q27) predicted time-to-treatment or overall survival independently of established markers of prognosis. Moreover, genotypic complexity (ie, the number of independent CNVs per patient) significantly predicted prognosis, with a median time-to-treatment of 64 months versus 23 months in patients with zero to one versus two or more CNVs, respectively (P = 3.3 × 10^-8). In summary, a comparison of SNP genotypes from patients with CLL with HapMap controls allowed us to identify known and unknown recurrent CNVs and to determine regions and rates of CNV that predict poorer prognosis in patients with CLL. PMID:23273604
Unexpected Relationships and Inbreeding in HapMap Phase III Populations
Stevens, Eric L.; Baugher, Joseph D.; Shirley, Matthew D.; Frelin, Laurence P.; Pevsner, Jonathan
2012-01-01
Correct annotation of the genetic relationships between samples is essential for population genomic studies, which could be biased by errors or omissions. To this end, we used identity-by-state (IBS) and identity-by-descent (IBD) methods to assess genetic relatedness of individuals within HapMap phase III data. We analyzed data from 1,397 individuals across 11 ethnic populations. Our results support previous studies (Pemberton et al., 2010; Kyriazopoulou-Panagiotopoulou et al., 2011) assessing unknown relatedness present within this population. Additionally, we present evidence for 1,657 novel pairwise relationships across 9 populations. Surprisingly, significant Cotterman's coefficients of relatedness K1 (IBD1) values were detected between pairs of known parents. Furthermore, significant K2 (IBD2) values were detected in 32 previously annotated parent-child relationships. Consistent with a hypothesis of inbreeding, regions of homozygosity (ROH) were identified in the offspring of related parents, of which a subset overlapped those reported in previous studies (Gibson et al. 2010; Johnson et al. 2011). In total, we inferred 28 inbred individuals with ROH that overlapped areas of relatedness between the parents and/or IBD2 sharing at a different genomic locus between a child and a parent. Finally, 8 previously annotated parent-child relationships had unexpected K0 (IBD0) values (resulting from a chromosomal abnormality or genotype error), and 10 previously annotated second-degree relationships along with 38 other novel pairwise relationships had unexpected IBD2 (indicating two separate paths of recent ancestry). These newly described types of relatedness may impact the outcome of previous studies and should inform the design of future studies relying on the HapMap Phase III resource. PMID:23185369
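The K0, K1 and K2 values discussed in the abstract above are the probabilities that a pair of individuals shares zero, one or two alleles identical by descent. As a rough illustration only (not the authors' pipeline), the sketch below derives the coefficient of relatedness and the kinship coefficient from these values and flags coefficients that deviate from the textbook expectation for an annotated relationship. The class name, the function name and the tolerance of 0.05 are hypothetical choices made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IBDEstimate:
    """Cotterman coefficients for one pair: P(IBD=0), P(IBD=1), P(IBD=2)."""
    k0: float
    k1: float
    k2: float

    def relatedness(self) -> float:
        # Expected fraction of the genome shared IBD: r = K1/2 + K2.
        return 0.5 * self.k1 + self.k2

    def kinship(self) -> float:
        # Kinship coefficient: phi = K1/4 + K2/2 (half of r).
        return 0.25 * self.k1 + 0.5 * self.k2


# Textbook expectations for outbred pairs, as (K0, K1, K2).
EXPECTED = {
    "parent-child": (0.0, 1.0, 0.0),
    "full-sibling": (0.25, 0.5, 0.25),
    "second-degree": (0.5, 0.5, 0.0),
    "unrelated": (1.0, 0.0, 0.0),
}


def deviations(est: IBDEstimate, relationship: str, tol: float = 0.05) -> list:
    """Names of coefficients differing from expectation by more than tol.

    tol is an arbitrary illustrative threshold, not a value from the study.
    """
    expected = EXPECTED[relationship]
    observed = (est.k0, est.k1, est.k2)
    names = ("K0", "K1", "K2")
    return [n for n, o, e in zip(names, observed, expected) if abs(o - e) > tol]
```

Under these assumptions, an annotated parent-child pair with a non-zero K2, such as `deviations(IBDEstimate(0.0, 0.9, 0.1), "parent-child")`, would be flagged on K1 and K2, which is the kind of pattern the abstract interprets as a sign of inbreeding.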
Single-molecule optical genome mapping of a human HapMap and a colorectal cancer cell line.
Teo, Audrey S M; Verzotto, Davide; Yao, Fei; Nagarajan, Niranjan; Hillmer, Axel M
2015-01-01
Next-generation sequencing (NGS) technologies have changed our understanding of the variability of the human genome. However, the identification of genome structural variations based on NGS approaches with read lengths of 35-300 bases remains a challenge. Single-molecule optical mapping technologies allow the analysis of DNA molecules of up to 2 Mb and as such are suitable for the identification of large-scale genome structural variations, and for de novo genome assemblies when combined with short-read NGS data. Here we present optical mapping data for two human genomes: the HapMap cell line GM12878 and the colorectal cancer cell line HCT116. High molecular weight DNA was obtained by embedding GM12878 and HCT116 cells, respectively, in agarose plugs, followed by DNA extraction under mild conditions. Genomic DNA was digested with KpnI and 310,000 and 296,000 DNA molecules (≥ 150 kb and 10 restriction fragments), respectively, were analyzed per cell line using the Argus optical mapping system. Maps were aligned to the human reference by OPTIMA, a new glocal alignment method. Genome coverage of 6.8× and 5.7× was obtained, respectively; 2.9× and 1.7× more than the coverage obtained with previously available software. Optical mapping allows the resolution of large-scale structural variations of the genome, and the scaffold extension of NGS-based de novo assemblies. OPTIMA is an efficient new alignment method; our optical mapping data provide a resource for genome structure analyses of the human HapMap reference cell line GM12878, and the colorectal cancer cell line HCT116.
Maceachern, Sean; Muir, William M; Crosby, Seth; Cheng, Hans H
2011-06-03
Marek's disease (MD), a T cell lymphoma induced by the highly oncogenic α-herpesvirus Marek's disease virus (MDV), is the main chronic infectious disease concern threatening the poultry industry. Enhancing genetic resistance to MD in commercial poultry is an attractive method to augment MD vaccines, which are currently the control method of choice. In order to optimally implement this control strategy through marker-assisted selection (MAS) and to gain biological information, it is necessary to identify specific genes that influence MD incidence. A genome-wide screen for allele-specific expression (ASE) in response to MDV infection was conducted. The highly inbred ADOL chicken lines 6 (MD resistant) and 7 (MD susceptible) were inter-mated in reciprocal crosses and half of the progeny challenged with MDV. Splenic RNA pools for each treatment group at a single time point after infection were generated, sequenced using a next generation sequencer, then analyzed for ASE. To validate and extend the results, Illumina GoldenGate assays for selected cSNPs were developed and used on all RNA samples from all 6 time points following MDV challenge. RNA sequencing resulted in 11-13+ million mappable reads per treatment group, 1.7+ Gb total sequence, and 22,655 high-confidence cSNPs. Analysis of these cSNPs revealed that 5360 cSNPs in 3773 genes exhibited statistically significant allelic imbalance. Of the 1536 GoldenGate assays, 1465 were successfully scored with all but 19 exhibiting evidence for allelic imbalance. ASE is an efficient method to identify potentially all or most of the genes influencing this complex trait. The identified cSNPs can be further evaluated in resource populations to determine their allelic direction and size of effect on genetic resistance to MD as well as being directly implemented in genomic selection programs. The described method, although demonstrated in inbred chicken lines, is applicable to all traits in any diploid species, and should prove to be a simple method to identify the majority of genes controlling any complex trait.
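The abstract above does not say which statistical test was used to call allelic imbalance. One common way to frame the question, shown here purely as an assumed illustration, is an exact two-sided binomial test of the reads carrying each allele at a heterozygous cSNP against the 50:50 ratio expected under balanced expression. The function name is hypothetical and only the Python standard library is used.

```python
from math import comb


def allelic_imbalance_p(ref_reads: int, alt_reads: int) -> float:
    """Exact two-sided binomial p-value for a 50:50 allelic ratio.

    Assumed illustration only; not the test reported in the study.
    """
    n = ref_reads + alt_reads
    if n == 0:
        raise ValueError("at least one read is required")
    k = min(ref_reads, alt_reads)
    # Lower-tail probability P(X <= k) for X ~ Binomial(n, 0.5).
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    # The null distribution is symmetric, so double the tail and cap at 1.
    return min(1.0, 2.0 * tail)
```

In a genome-wide screen the resulting p-values would still need correction for the number of cSNPs tested, and a real analysis would also have to account for mapping bias toward the reference allele, which this sketch ignores.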
Large-scale whole-genome sequencing of the Icelandic population.
Gudbjartsson, Daniel F; Helgason, Hannes; Gudjonsson, Sigurjon A; Zink, Florian; Oddson, Asmundur; Gylfason, Arnaldur; Besenbacher, Soren; Magnusson, Gisli; Halldorsson, Bjarni V; Hjartarson, Eirikur; Sigurdsson, Gunnar Th; Stacey, Simon N; Frigge, Michael L; Holm, Hilma; Saemundsdottir, Jona; Helgadottir, Hafdis Th; Johannsdottir, Hrefna; Sigfusson, Gunnlaugur; Thorgeirsson, Gudmundur; Sverrisson, Jon Th; Gretarsdottir, Solveig; Walters, G Bragi; Rafnar, Thorunn; Thjodleifsson, Bjarni; Bjornsson, Einar S; Olafsson, Sigurdur; Thorarinsdottir, Hildur; Steingrimsdottir, Thora; Gudmundsdottir, Thora S; Theodors, Asgeir; Jonasson, Jon G; Sigurdsson, Asgeir; Bjornsdottir, Gyda; Jonsson, Jon J; Thorarensen, Olafur; Ludvigsson, Petur; Gudbjartsson, Hakon; Eyjolfsson, Gudmundur I; Sigurdardottir, Olof; Olafsson, Isleifur; Arnar, David O; Magnusson, Olafur Th; Kong, Augustine; Masson, Gisli; Thorsteinsdottir, Unnur; Helgason, Agnar; Sulem, Patrick; Stefansson, Kari
2015-05-01
Here we describe the insights gained from sequencing the whole genomes of 2,636 Icelanders to a median depth of 20×. We found 20 million SNPs and 1.5 million insertions-deletions (indels). We describe the density and frequency spectra of sequence variants in relation to their functional annotation, gene position, pathway and conservation score. We demonstrate an excess of homozygosity and rare protein-coding variants in Iceland. We imputed these variants into 104,220 individuals down to a minor allele frequency of 0.1% and found a recessive frameshift mutation in MYL4 that causes early-onset atrial fibrillation, several mutations in ABCB4 that increase risk of liver diseases and an intronic variant in GNAS associating with increased thyroid-stimulating hormone levels when maternally inherited. These data provide a study design that can be used to determine how variation in the sequence of the human genome gives rise to human diversity.
Genomic characteristics of cattle copy number variations
USDA-ARS's Scientific Manuscript database
We performed a systematic analysis of cattle copy number variations (CNVs) using the Bovine HapMap SNP genotyping data, including 539 animals of 21 modern cattle breeds and 6 outgroups. After correcting genomic waves and considering the trio information, we identified 682 candidate CNV regions (CNVR...
Mutations that Cause Human Disease: A Computational/Experimental Approach
DOE Office of Scientific and Technical Information (OSTI.GOV)
Beernink, P; Barsky, D; Pesavento, B
International genome sequencing projects have produced billions of nucleotides (letters) of DNA sequence data, including the complete genome sequences of 74 organisms. These genome sequences have created many new scientific opportunities, including the ability to identify sequence variations among individuals within a species. These genetic differences, which are known as single nucleotide polymorphisms (SNPs), are particularly important in understanding the genetic basis for disease susceptibility. Since the report of the complete human genome sequence, over two million human SNPs have been identified, including a large-scale comparison of an entire chromosome from twenty individuals. Of the protein coding SNPs (cSNPs), approximately half lead to a single amino acid change in the encoded protein (non-synonymous coding SNPs). Most of these changes are functionally silent, while the remainder negatively impact the protein and sometimes cause human disease. To date, over 550 SNPs have been found to cause single locus (monogenic) diseases and many others have been associated with polygenic diseases. SNPs have been linked to specific human diseases, including late-onset Parkinson disease, autism, rheumatoid arthritis and cancer. The ability to predict accurately the effects of these SNPs on protein function would represent a major advance toward understanding these diseases. To date several attempts have been made toward predicting the effects of such mutations. The most successful of these is a computational approach called "Sorting Intolerant From Tolerant" (SIFT). This method uses sequence conservation among many similar proteins to predict which residues in a protein are functionally important. However, this method suffers from several limitations. First, a query sequence must have a sufficient number of relatives to infer sequence conservation. Second, this method does not make use of or provide any information on protein structure, which can be used to understand how an amino acid change affects the protein. The experimental methods that provide the most detailed structural information on proteins are X-ray crystallography and NMR spectroscopy. However, these methods are labor intensive and currently cannot be carried out on a genomic scale. Nonetheless, Structural Genomics projects are being pursued by more than a dozen groups and consortia worldwide and as a result the number of experimentally determined structures is rising exponentially. Based on the expectation that protein structures will continue to be determined at an ever-increasing rate, reliable structure prediction schemes will become increasingly valuable, leading to information on protein function and disease for many different proteins. Given known genetic variability and experimentally determined protein structures, can we accurately predict the effects of single amino acid substitutions? An objective assessment of this question would involve comparing predicted and experimentally determined structures, which thus far has not been rigorously performed. The completed research leveraged existing expertise at LLNL in computational and structural biology, as well as significant computing resources, to address this question.
Chopra, Ratan; Burow, Gloria; Farmer, Andrew; Mudge, Joann; Simpson, Charles E; Wilkins, Thea A; Baring, Michael R; Puppala, Naveen; Chamberlin, Kelly D; Burow, Mark D
2015-06-01
Single-nucleotide polymorphisms, which can be identified in the thousands or millions from comparisons of transcriptome or genome sequences, are ideally suited for making high-resolution genetic maps, investigating population evolutionary history, and discovering marker-trait linkages. Despite significant results from their use in human genetics, progress in identification and use in plants, and particularly polyploid plants, has lagged. As part of a long-term project to identify and use SNPs suitable for these purposes in cultivated peanut, which is tetraploid, we generated transcriptome sequences of four peanut cultivars, namely OLin, New Mexico Valencia C, Tamrun OL07 and Jupiter, which represent the four major market classes of peanut grown in the world, and which are important economically to the US southwest peanut growing region. Copy DNA (cDNA) libraries of each genotype were used to generate 2 × 54 paired-end reads using an Illumina GAIIx sequencer. Raw reads were mapped to a custom reference consisting of Tifrunner 454 sequences plus peanut ESTs in GenBank, comprising 43,108 contigs; 263,840 SNP and indel variants were identified among four genotypes compared to the reference. A subset of 6 variants was assayed across 24 genotypes representing four market types using KASP chemistry to assess the criteria for SNP selection. Results demonstrated that transcriptome sequencing can identify SNPs usable as selectable DNA-based markers in complex polyploid species such as peanut. Criteria for effective use of SNPs as markers are discussed in this context.
Wen, Wanqing; Kato, Norihiro; Hwang, Joo-Yeon; Guo, Xingyi; Tabara, Yasuharu; Li, Huaixing; Dorajoo, Rajkumar; Yang, Xiaobo; Tsai, Fuu-Jen; Li, Shengxu; Wu, Ying; Wu, Tangchun; Kim, Soriul; Guo, Xiuqing; Liang, Jun; Shungin, Dmitry; Adair, Linda S.; Akiyama, Koichi; Allison, Matthew; Cai, Qiuyin; Chang, Li-Ching; Chen, Chien-Hsiun; Chen, Yuan-Tsong; Cho, Yoon Shin; Choi, Bo Youl; Gao, Yutang; Go, Min Jin; Gu, Dongfeng; Han, Bok-Ghee; He, Meian; Hixson, James E.; Hu, Yanling; Huang, Tao; Isono, Masato; Jung, Keum Ji; Kang, Daehee; Kim, Young Jin; Kita, Yoshikuni; Lee, Juyoung; Lee, Nanette R.; Lee, Jeannette; Wang, Yiqin; Liu, Jian-Jun; Long, Jirong; Moon, Sanghoon; Nakamura, Yasuyuki; Nakatochi, Masahiro; Ohnaka, Keizo; Rao, Dabeeru; Shi, Jiajun; Sull, Jae Woong; Tan, Aihua; Ueshima, Hirotsugu; Wu, Chen; Xiang, Yong-Bing; Yamamoto, Ken; Yao, Jie; Ye, Xingwang; Yokota, Mitsuhiro; Zhang, Xiaomin; Zheng, Yan; Qi, Lu; Rotter, Jerome I.; Jee, Sun Ha; Lin, Dongxin; Mohlke, Karen L.; He, Jiang; Mo, Zengnan; Wu, Jer-Yuarn; Tai, E. Shyong; Lin, Xu; Miki, Tetsuro; Kim, Bong-Jo; Takeuchi, Fumihiko; Zheng, Wei; Shu, Xiao-Ou
2016-01-01
Sixty genetic loci associated with abdominal obesity, measured by waist circumference (WC) and waist-hip ratio (WHR), have been previously identified, primarily from studies conducted in European-ancestry populations. We conducted a meta-analysis of associations of abdominal obesity with approximately 2.5 million single nucleotide polymorphisms (SNPs) among 53,052 (for WC) and 48,312 (for WHR) individuals of Asian descent, and replicated 33 selected SNPs among 3,762 to 17,110 additional individuals. We identified four novel loci near the EFEMP1, ADAMTSL3, CNPY2, and GNAS genes that were associated with WC after adjustment for body mass index (BMI); two loci near the NID2 and HLA-DRB5 genes associated with WHR after adjustment for BMI, and three loci near the CEP120, TSC22D2, and SLC22A2 genes associated with WC without adjustment for BMI. Functional enrichment analyses revealed enrichment of corticotropin-releasing hormone signaling, GNRH signaling, and/or CDK5 signaling pathways for those newly-identified loci. Our study provides additional insight on genetic contribution to abdominal obesity. PMID:26785701
Genomic and evolutionary characteristics of cattle copy number variations
USDA-ARS's Scientific Manuscript database
We performed a systematic analysis of cattle copy number variations (CNVs) using the Bovine HapMap SNP genotyping data, including 539 animals of 21 modern cattle breeds and 6 outgroups. After correcting genomic waves and considering the trio information, we identified 682 candidate CNV regions (CNVR...
Maize HapMap2 identifies extant variation from a genome in flux
USDA-ARS's Scientific Manuscript database
The maize genome is the largest, most diverse and complex plant genome sequenced to date. Using high-throughput sequencing to access genetic variation and a population genetics model to score the polymorphisms, we characterize and unite the diversity of the world’s key breeding germplasm, wild rela...
Rare copy number variants in a population-based investigation of hypoplastic right heart syndrome.
Dimopoulos, Aggeliki; Sicko, Robert J; Kay, Denise M; Rigler, Shannon L; Druschel, Charlotte M; Caggana, Michele; Browne, Marilyn L; Fan, Ruzong; Romitti, Paul A; Brody, Lawrence C; Mills, James L
2017-01-20
Hypoplastic right heart syndrome (HRHS) is a rare congenital defect characterized by underdevelopment of the right heart structures commonly accompanied by an atrial septal defect. Familial HRHS reports suggest genetic factor involvement. We examined the role of copy number variants (CNVs) in HRHS. We genotyped 32 HRHS cases identified from all New York State live births (1998-2005) using Illumina HumanOmni2.5 microarrays. CNVs were called with PennCNV and prioritized if they were ≥20 Kb, contained ≥10 SNPs and had minimal overlap with CNVs from in-house controls, the Database of Genomic Variants, HapMap3, and Children's Hospital of Philadelphia database. We identified 28 CNVs in 17 cases; several encompassed genes important for right heart development. One case had a 2p16-2p23 duplication spanning LBH, a limb and heart development transcription factor. Lbh mis-expression results in right ventricular hypoplasia and pulmonary valve defects. This duplication also encompassed SOS1, a factor associated with pulmonary valve stenosis in Noonan syndrome. Sos1 -/- mice display thin and poorly trabeculated ventricles. In another case, we identified a 1.5 Mb deletion associated with Williams-Beuren syndrome, a disorder that includes valvular malformations. A third case had a 24 Kb deletion upstream of the TGFβ ligand ITGB8. Embryos genetically null for Itgb8, and its intracellular interactant Band 4.1B, display lethal cardiac phenotypes. To our knowledge, this is the first study of CNVs in HRHS. We identified several rare CNVs that overlap genes related to right ventricular wall and valve development, suggesting that genetics plays a role in HRHS and providing clues for further investigation. Birth Defects Research 109:16-26, 2017. © 2016 Wiley Periodicals, Inc.
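The prioritization rule in the abstract above (length of at least 20 Kb, at least 10 SNPs, minimal overlap with reference CNV sets) can be sketched as a simple filter. This is a minimal illustration under stated assumptions: the record layout, the 1-based inclusive coordinates, the `ref_overlap` field and the 0.5 overlap cut-off are all hypothetical, since the abstract does not quantify "minimal overlap", and none of this reflects PennCNV's actual output format.

```python
from dataclasses import dataclass

MIN_LENGTH_BP = 20_000  # ">=20 Kb" criterion from the abstract
MIN_SNPS = 10           # ">=10 SNPs" criterion from the abstract


@dataclass(frozen=True)
class CnvCall:
    chrom: str
    start: int          # 1-based inclusive start (assumed convention)
    end: int            # 1-based inclusive end (assumed convention)
    n_snps: int         # number of array SNPs inside the call
    ref_overlap: float  # hypothetical: fraction covered by reference CNVs

    @property
    def length_bp(self) -> int:
        return self.end - self.start + 1


def prioritize(calls, max_ref_overlap: float = 0.5):
    """Keep calls meeting the size, SNP-count and overlap criteria.

    max_ref_overlap is an assumed placeholder for "minimal overlap".
    """
    return [
        c for c in calls
        if c.length_bp >= MIN_LENGTH_BP
        and c.n_snps >= MIN_SNPS
        and c.ref_overlap <= max_ref_overlap
    ]
```

With these assumptions, a 20,000 bp call spanning 12 SNPs with little reference overlap is retained, while calls that are one base too short, contain only 9 SNPs, or are largely covered by known CNVs are dropped.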
Population and genomic lessons from genetic analysis of two Indian populations.
Juyal, Garima; Mondal, Mayukh; Luisi, Pierre; Laayouni, Hafid; Sood, Ajit; Midha, Vandana; Heutink, Peter; Bertranpetit, Jaume; Thelma, B K; Casals, Ferran
2014-10-01
Indian demographic history includes special features such as founder effects, interpopulation segregation, complex social structure with a caste system and elevated frequency of consanguineous marriages. It also presents a higher frequency for some rare mendelian disorders and in the last two decades increased prevalence of some complex disorders. Despite the fact that India represents about one-sixth of the human population, deep genetic studies from this terrain have been scarce. In this study, we analyzed high-density genotyping and whole-exome sequencing data of a North and a South Indian population. Indian populations show higher differentiation levels than those reported between populations of other continents. In this work, we have analyzed its consequences, by specifically assessing the transferability of genetic markers from or to Indian populations. We show that there is limited genetic marker portability from available genetic resources such as HapMap or the 1000 Genomes Project to Indian populations, which also present an excess of private rare variants. Conversely, tagSNPs show a high level of portability between the two Indian populations, in contrast to the common belief that North and South Indian populations are genetically very different. By estimating kinship from mates and consanguinity in our data from trios, we also describe different patterns of assortative mating and inbreeding in the two populations, in agreement with distinct mating preferences and social structures. In addition, this analysis has allowed us to describe genomic regions under recent adaptive selection, indicating differential adaptive histories for North and South Indian populations. Our findings highlight the importance of considering demography for design and analysis of genetic studies, as well as the need for extending human genetic variation catalogs to new populations and particularly to those with particular demographic histories.
Rare Copy Number Variants in a Population Based Investigation of Hypoplastic Right Heart Syndrome
Dimopoulos, Aggeliki; Sicko, Robert J.; Kay, Denise M.; Rigler, Shannon L.; Druschel, Charlotte M.; Caggana, Michele; Browne, Marilyn L.; Fan, Ruzong; Romitti, Paul A.; Brody, Lawrence C.; Mills, James L.
2016-01-01
Background Hypoplastic right heart syndrome (HRHS) is a rare congenital defect characterized by underdevelopment of the right heart structures commonly accompanied by an atrial septal defect. Familial HRHS reports suggest genetic factor involvement. We examined the role of copy number variants (CNVs) in HRHS. Methods We genotyped 32 HRHS cases identified from all New York State live births (1998–2005) using Illumina HumanOmni2.5 microarrays. CNVs were called with PennCNV and prioritized if they were ≥20Kb, contained ≥10 SNPs and had minimal overlap with CNVs from in-house controls, the Database of Genomic Variants, HapMap3 and CHOP database. Results We identified 28 CNVs in 17 cases; several encompassed genes important for right heart development. One case had a 2p16–2p23 duplication spanning LBH, a limb and heart development transcription factor. Lbh mis-expression results in right ventricular hypoplasia and pulmonary valve defects. This duplication also encompassed SOS1, a factor associated with pulmonary valve stenosis in Noonan syndrome. Sos1−/− mice display thin and poorly trabeculated ventricles. In another case, we identified a 1.5Mb deletion associated with Williams Beuren syndrome, a disorder that includes valvular malformations. A third case had a 24Kb deletion upstream of the TGFβ ligand ITGB8. Embryos genetically null for Itgb8, and its intracellular interactant Band 4.1B, display lethal cardiac phenotypes. Conclusions To our knowledge, this is the first study of CNVs in HRHS. We identified several rare CNVs that overlap genes related to right ventricular wall and valve development, suggesting that genetics plays a role in HRHS and providing clues for further investigation. PMID:28009100
Wong, Gerard; Leckie, Christopher; Gorringe, Kylie L; Haviv, Izhak; Campbell, Ian G; Kowalczyk, Adam
2010-04-15
High-density single nucleotide polymorphism (SNP) genotyping arrays are efficient and cost effective platforms for the detection of copy number variation (CNV). To ensure accuracy in probe synthesis and to minimize production costs, short oligonucleotide probe sequences are used. The use of short probe sequences limits the specificity of binding targets in the human genome. The specificity of these short probeset sequences has yet to be fully analysed against a normal reference human genome. Sequence similarity can artificially elevate or suppress copy number measurements, and hence reduce the reliability of affected probe readings. For the purpose of detecting narrow CNVs reliably down to the width of a single probeset, sequence similarity is an important issue that needs to be addressed. We surveyed the Affymetrix Human Mapping SNP arrays for probeset sequence similarity against the reference human genome. Utilizing sequence similarity results, we identified a collection of fine-scaled putative CNVs between gender from autosomal probesets whose sequence matches various loci on the sex chromosomes. To detect these variations, we utilized our statistical approach, Detecting REcurrent Copy number change using rank-order Statistics (DRECS), and showed that its performance was superior and more stable than the t-test in detecting CNVs. Through the application of DRECS on the HapMap population datasets with multi-matching probesets filtered, we identified biologically relevant SNPs in aberrant regions across populations with known association to physical traits, such as height, covered by the span of a single probe. This provided empirical confirmation of the existence of naturally occurring narrow CNVs as well as the sensitivity of the Affymetrix SNP array technology in detecting them. The MATLAB implementation of DRECS is available at http://ww2.cs.mu.oz.au/~gwong/DRECS/index.html.
Genomic profiling of 766 cancer-related genes in archived esophageal normal and carcinoma tissues.
Chen, Jing; Guo, Liping; Peiffer, Daniel A; Zhou, Lixin; Chan, Owen Tsan Mo; Bibikova, Marina; Wickham-Garcia, Eliza; Lu, Shih-Hsin; Zhan, Qimin; Wang-Rodriguez, Jessica; Jiang, Wei; Fan, Jian-Bing
2008-05-15
We employed the BeadArray™ technology to perform a genetic analysis in 33 formalin-fixed, paraffin-embedded (FFPE) human esophageal carcinomas, mostly squamous-cell-carcinoma (ESCC), and their adjacent normal tissues. A total of 1,432 single nucleotide polymorphisms (SNPs) derived from 766 cancer-related genes were genotyped with partially degraded genomic DNAs isolated from these samples. This directly targeted genomic profiling identified not only previously reported somatic gene amplifications (e.g., CCND1) and deletions (e.g., CDKN2A and CDKN2B) but also novel genomic aberrations. Among these novel targets, the most frequently deleted genomic regions were chromosome 3p (including tumor suppressor genes FANCD2 and CTNNB1) and chromosome 5 (including tumor suppressor gene APC). The most frequently amplified genomic region was chromosome 3q (containing DVL3, MLF1, ABCC5, BCL6, AGTR1 and known oncogenes TNK2, TNFSF10, FGF12). The chromosome 3p deletion and 3q amplification occurred coincidently in nearly all of the affected cases, suggesting a molecular mechanism for the generation of somatic chromosomal aberrations. We also detected significant differences in germline allele frequency between the esophageal cohort of our study and normal control samples from the International HapMap Project for 10 genes (CSF1, KIAA1804, IL2, PMS2, IRF7, FLT3, NTRK2, MAP3K9, ERBB2 and PRKAR1A), suggesting that they might play roles in esophageal cancer susceptibility and/or development. Taken together, our results demonstrated the utility of the BeadArray technology for high-throughput genetic analysis in FFPE tumor tissues and provided a detailed genetic profiling of cancer-related genes in human esophageal cancer. (c) 2008 Wiley-Liss, Inc.
Chen, Hao; Yang, Peng; Guo, Jing; Kwoh, Chee Keong; Przytycka, Teresa M; Zheng, Jie
2015-01-01
Meiotic recombination hotspots play important roles in various aspects of genomics, but the underlying mechanisms for regulating the locations and strengths of recombination hotspots are not yet fully revealed. Most existing algorithms for estimating recombination rates from sequence polymorphism data can only output average recombination rates of a population, although there is evidence for the heterogeneity in recombination rates among individuals. For genome-wide association studies (GWAS) of recombination hotspots, an efficient algorithm that estimates the individualized strengths of recombination hotspots is highly desirable. In this work, we propose a novel graph mining algorithm named ARG-walker, based on random walks on ancestral recombination graphs (ARG), to estimate individual-specific recombination hotspot strengths. Extensive simulations demonstrate that ARG-walker is able to distinguish the hot allele of a recombination hotspot from the cold allele. Integrated with output of ARG-walker, we performed GWAS on the phased haplotype data of the 22 autosome chromosomes of the HapMap Asian population samples of Chinese and Japanese (JPT+CHB). Significant cis-regulatory signals have been detected, which is corroborated by the enrichment of the well-known 13-mer motif CCNCCNTNNCCNC of PRDM9 protein. Moreover, two new DNA motifs have been identified in the flanking regions of the significantly associated SNPs (single nucleotide polymorphisms), which are likely to be new cis-regulatory elements of meiotic recombination hotspots of the human genome. Our results on both simulated and real data suggest that ARG-walker is a promising new method for estimating the individual recombination variations. In the future, it could be used to uncover the mechanisms of recombination regulation and human diseases related with recombination hotspots.
Sequence capture of ultraconserved elements from bird museum specimens.
McCormack, John E; Tsai, Whitney L E; Faircloth, Brant C
2016-09-01
New DNA sequencing technologies are allowing researchers to explore the genomes of the millions of natural history specimens collected prior to the molecular era. Yet, we know little about how well specific next-generation sequencing (NGS) techniques work with the degraded DNA typically extracted from museum specimens. Here, we use one type of NGS approach, sequence capture of ultraconserved elements (UCEs), to collect data from bird museum specimens as old as 120 years. We targeted 5060 UCE loci in 27 western scrub-jays (Aphelocoma californica) representing three evolutionary lineages that could be species, and we collected an average of 3749 UCE loci containing 4460 single nucleotide polymorphisms (SNPs). Despite older specimens producing fewer and shorter loci in general, we collected thousands of markers from even the oldest specimens. More sequencing reads per individual helped to boost the number of UCE loci we recovered from older specimens, but more sequencing was not as successful at increasing the length of loci. We detected contamination in some samples and determined that contamination was more prevalent in older samples that were subject to less sequencing. For the phylogeny generated from concatenated UCE loci, contamination led to incorrect placement of some individuals. In contrast, a species tree constructed from SNPs called within UCE loci correctly placed individuals into three monophyletic groups, perhaps because of the stricter analytical procedures used for SNP calling. This study and other recent studies on the genomics of museum specimens have profound implications for natural history collections, where millions of older specimens should now be considered genomic resources. © 2015 The Authors. Molecular Ecology Resources Published by John Wiley & Sons Ltd.
USDA-ARS's Scientific Manuscript database
Copy number variations (CNV) are well known genomic variants, which often complicate structural and functional genomics studies. Here, we integrated the CNV region (CNVR) result detected from 1,682 Nellore cattle with the equivalent result derived from the Bovine HapMap samples. Through comparing CN...
Fu, Chong-Yun; Liu, Wu-Ge; Liu, Di-Lin; Li, Ji-Hua; Zhu, Man-Shan; Liao, Yi-Long; Liu, Zhen-Rong; Zeng, Xue-Qin; Wang, Feng
2016-03-01
Next-generation sequencing technologies provide opportunities to further understand genetic variation, even within closely related cultivars. We performed whole genome resequencing of two elite indica rice varieties, RGD-7S and Taifeng B, whose F1 progeny showed hybrid weakness and hybrid vigor when grown in the early- and late-cropping seasons, respectively. Approximately 150 million 100-bp pair-end reads were generated, which covered ∼86% of the rice (Oryza sativa L. japonica 'Nipponbare') reference genome. A total of 2,758,740 polymorphic sites including 2,408,845 SNPs and 349,895 InDels were detected in RGD-7S and Taifeng B, respectively. Applying stringent parameters, we identified 961,791 SNPs and 46,640 InDels between RGD-7S and Taifeng B (RGD-7S/Taifeng B). The density of DNA polymorphisms was 256.8 SNPs and 12.5 InDels per 100 kb for RGD-7S/Taifeng B. Copy number variations (CNVs) were also investigated. In RGD-7S, 1989 of 2727 CNVs were overlapped in 218 genes, and 1231 of 2010 CNVs were annotated in 175 genes in Taifeng B. In addition, we verified a subset of InDels in the interval of hybrid weakness genes, Hw3 and Hw4, and obtained some polymorphic InDel markers, which will provide a sound foundation for cloning hybrid weakness genes. Analysis of genomic variations will also contribute to understanding the genetic basis of hybrid weakness and heterosis.
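The reported densities can be sanity-checked with simple arithmetic. The sketch below back-calculates the sequence length implied by the counts and the per-100-kb densities quoted in the abstract; the abstract does not state which denominator the authors used, so the function name and the interpretation of the result are assumptions made for illustration only.

```python
# Counts and densities as quoted in the abstract (RGD-7S/Taifeng B, stringent set).
SNP_COUNT = 961_791
INDEL_COUNT = 46_640
SNP_DENSITY_PER_100KB = 256.8
INDEL_DENSITY_PER_100KB = 12.5


def implied_length_bp(count, density_per_100kb):
    """Sequence length (bp) implied by a variant count and a per-100-kb density.

    Hypothetical helper: density = count / (length / 100 kb), solved for length.
    """
    return count / density_per_100kb * 100_000


snp_implied = implied_length_bp(SNP_COUNT, SNP_DENSITY_PER_100KB)
indel_implied = implied_length_bp(INDEL_COUNT, INDEL_DENSITY_PER_100KB)

print(f"Length implied by SNP density:   {snp_implied / 1e6:.1f} Mb")
print(f"Length implied by InDel density: {indel_implied / 1e6:.1f} Mb")
```

Both figures imply a denominator of roughly 373 to 375 Mb, so the two densities are mutually consistent to within rounding.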
e-GRASP: an integrated evolutionary and GRASP resource for exploring disease associations.
Karim, Sajjad; NourEldin, Hend Fakhri; Abusamra, Heba; Salem, Nada; Alhathli, Elham; Dudley, Joel; Sanderford, Max; Scheinfeldt, Laura B; Chaudhary, Adeel G; Al-Qahtani, Mohammed H; Kumar, Sudhir
2016-10-17
Genome-wide association studies (GWAS) have become a mainstay of biological research concerned with discovering genetic variation linked to phenotypic traits and diseases. Both discrete and continuous traits can be analyzed in GWAS to discover associations between single nucleotide polymorphisms (SNPs) and traits of interest. Associations are typically determined by estimating the significance of the statistical relationship between genetic loci and the given trait. However, the prioritization of bona fide, reproducible genetic associations from GWAS results remains a central challenge in identifying genomic loci underlying common complex diseases. Evolutionary-aware meta-analysis of the growing GWAS literature is one way to address this challenge and to advance from association to causation in the discovery of genotype-phenotype relationships. We have created an evolutionary GWAS resource to enable in-depth query and exploration of published GWAS results. This resource uses the publicly available GWAS results annotated in the GRASP2 database. The GRASP2 database includes results from 2082 studies, 177 broad phenotype categories, and ~8.87 million SNP-phenotype associations. For each SNP in e-GRASP, we present information from the GRASP2 database for convenience as well as evolutionary information (e.g., rate and timespan). Users can, therefore, identify not only SNPs with highly significant phenotype-association P-values, but also SNPs that are highly replicated and/or occur at evolutionarily conserved sites that are likely to be functionally important. Additionally, we provide an evolutionary-adjusted SNP association ranking (E-rank) that uses cross-species evolutionary conservation scores and population allele frequencies to transform P-values in an effort to enhance the discovery of SNPs with a greater probability of biologically meaningful disease associations.
By adding an evolutionary dimension to the GWAS results available in the GRASP2 database, our e-GRASP resource will enable a more effective exploration of SNPs not only by the statistical significance of trait associations, but also by the number of studies in which associations have been replicated, and the evolutionary context of the associated mutations. Therefore, e-GRASP will be a valuable resource for aiding researchers in the identification of bona fide, reproducible genetic associations from GWAS results. This resource is freely available at http://www.mypeg.info/egrasp .
Hansen, Tina V A; Thamsborg, Stig M; Olsen, Annette; Prichard, Roger K; Nejsum, Peter
2013-08-12
The whipworm Trichuris trichiura has been estimated to infect 604 - 795 million people worldwide. The current control strategy against trichuriasis using the benzimidazoles (BZs) albendazole (400 mg) or mebendazole (500 mg) as single-dose treatment is not satisfactory. The occurrence of single nucleotide polymorphisms (SNPs) in codons 167, 198 or 200 of the beta-tubulin gene has been reported to convey BZ-resistance in intestinal nematodes of veterinary importance. It was hypothesised that the low susceptibility of T. trichiura to BZ could be due to a natural occurrence of such SNPs. The aim of this study was to investigate whether these SNPs were present in the beta-tubulin gene of Trichuris spp. from humans and baboons. As a secondary objective, the degree of identity between T. trichiura from humans and Trichuris spp. from baboons was evaluated based on the beta-tubulin gene and the internal transcribed spacer 2 region (ITS2). Nucleotide sequences of the beta-tubulin gene were generated by PCR using degenerate primers, specific primers and DNA from worms and eggs of T. trichiura and worms of Trichuris spp. from baboons. The ITS2 region was amplified using adult Trichuris spp. from baboons. PCR products were sequenced and analysed. The beta-tubulin fragments were studied for SNPs in codons 167, 198 or 200 and the ITS2 amplicons were compared with GenBank records of T. trichiura. No SNPs in codons 167, 198 or 200 were identified in any of the analysed Trichuris spp. from humans and baboons. Based on the ITS2 region, the similarity between Trichuris spp. from baboons and GenBank records of T. trichiura was found to be 98 - 99%. Single nucleotide polymorphisms in codon 167, 198 and 200, known to confer BZ-resistance in other nematodes, were absent in the studied material. This study does not provide data that could explain previous reports of poor BZ treatment efficacy in terms of polymorphism in these codons of beta-tubulin. 
Based on a fragment of the beta-tubulin gene and the ITS2 region sequenced, it was found that T. trichiura from humans and Trichuris spp. isolated from baboons are closely related and may be the same species.
2013-01-01
Background The whipworm Trichuris trichiura has been estimated to infect 604 – 795 million people worldwide. The current control strategy against trichuriasis using the benzimidazoles (BZs) albendazole (400 mg) or mebendazole (500 mg) as single-dose treatment is not satisfactory. The occurrence of single nucleotide polymorphisms (SNPs) in codons 167, 198 or 200 of the beta-tubulin gene has been reported to convey BZ-resistance in intestinal nematodes of veterinary importance. It was hypothesised that the low susceptibility of T. trichiura to BZ could be due to a natural occurrence of such SNPs. The aim of this study was to investigate whether these SNPs were present in the beta-tubulin gene of Trichuris spp. from humans and baboons. As a secondary objective, the degree of identity between T. trichiura from humans and Trichuris spp. from baboons was evaluated based on the beta-tubulin gene and the internal transcribed spacer 2 region (ITS2). Methods Nucleotide sequences of the beta-tubulin gene were generated by PCR using degenerate primers, specific primers and DNA from worms and eggs of T. trichiura and worms of Trichuris spp. from baboons. The ITS2 region was amplified using adult Trichuris spp. from baboons. PCR products were sequenced and analysed. The beta-tubulin fragments were studied for SNPs in codons 167, 198 or 200 and the ITS2 amplicons were compared with GenBank records of T. trichiura. Results No SNPs in codons 167, 198 or 200 were identified in any of the analysed Trichuris spp. from humans and baboons. Based on the ITS2 region, the similarity between Trichuris spp. from baboons and GenBank records of T. trichiura was found to be 98 – 99%. Conclusions Single nucleotide polymorphisms in codon 167, 198 and 200, known to confer BZ-resistance in other nematodes, were absent in the studied material. This study does not provide data that could explain previous reports of poor BZ treatment efficacy in terms of polymorphism in these codons of beta-tubulin. 
Based on a fragment of the beta-tubulin gene and the ITS2 region sequenced, it was found that T. trichiura from humans and Trichuris spp. isolated from baboons are closely related and may be the same species. PMID:23938038
Phillips, C; Ballard, D; Gill, P; Court, D Syndercombe; Carracedo, A; Lareu, M V
2012-05-01
Family studies can be used to measure the genetic distance between same-chromosome (syntenic) STRs in order to detect physical linkage or linkage disequilibrium. However, family studies are expensive and time consuming, in many cases uninformative, and lack a reliable means to infer the phase of the diplotypes obtained. HapMap provides a more comprehensive and fine-scale estimation of recombination rates using high density multi-point SNP data (average inter-SNP distance: 900 nucleotides). Data at this fine scale detects sub-kilobase genetic distances across the whole recombining human genome. We have used the most recent HapMap SNP data release 22 to measure and compare genetic distances, and by inference fine-scale recombination rates, between 29 syntenic STR pairs identified from 39 validated STRs currently available for forensic use. The 39 STRs comprise 23 core loci: SE33, Penta D & E, 13 CODIS and 7 non-CODIS European Standard Set STRs, plus supplementary STRs in the recently released Promega CS-7™ and Qiagen Investigator HDplex™ kits. Also included were D9S1120, a marker we developed for forensic use unique to chromosome 9, and the novel D6S1043 component STR of SinoFiler™ (Applied Biosystems). The data collated provides reliable estimates of recombination rates between each STR pair, that can then be placed into haplotype frequency calculators for short pedigrees with multiple meiotic inputs and which just requires the addition of allele frequencies. This allows all current STR sets or their combinations to be used in supplemented paternity analyses without the need for further adjustment for physical linkage. The detailed analysis of recombination rates made for autosomal forensic STRs was extended to the more than 50 X chromosome STRs established or in development for complex kinship analyses. Copyright © 2011 Elsevier Ireland Ltd. All rights reserved.
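Genetic distances such as those collated above are usually converted to recombination fractions before they enter a linkage-aware likelihood calculation. The abstract does not say which conversion the authors or the downstream calculators use; the sketch below shows the two standard map functions (Haldane and Kosambi) purely as an illustration, and the function names are hypothetical.

```python
import math


def haldane(distance_cm):
    """Recombination fraction from map distance (cM), Haldane map function.

    Assumes no crossover interference: r = 0.5 * (1 - exp(-2d)), d in Morgans.
    """
    d = distance_cm / 100.0
    return 0.5 * (1.0 - math.exp(-2.0 * d))


def kosambi(distance_cm):
    """Recombination fraction from map distance (cM), Kosambi map function.

    Allows for positive interference: r = 0.5 * tanh(2d), d in Morgans.
    """
    d = distance_cm / 100.0
    return 0.5 * math.tanh(2.0 * d)


# Illustrative distances only; these are not values from the study.
for cm in (1.0, 10.0, 50.0):
    print(f"{cm:5.1f} cM  Haldane r = {haldane(cm):.4f}  Kosambi r = {kosambi(cm):.4f}")
```

For short distances both functions give a recombination fraction close to the distance in Morgans; they diverge as the distance grows, and both approach 0.5 (free recombination) for widely separated loci.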
Accuracy of CNV Detection from GWAS Data.
Zhang, Dandan; Qian, Yudong; Akula, Nirmala; Alliey-Rodriguez, Ney; Tang, Jinsong; Gershon, Elliot S; Liu, Chunyu
2011-01-13
Several computer programs are available for detecting copy number variants (CNVs) using genome-wide SNP arrays. We evaluated the performance of four CNV detection software suites--Birdsuite, Partek, HelixTree, and PennCNV-Affy--in the identification of both rare and common CNVs. Each program's performance was assessed in two ways. The first was its recovery rate, i.e., its ability to call 893 CNVs previously identified in eight HapMap samples by paired-end sequencing of whole-genome fosmid clones, and 51,440 CNVs identified by array Comparative Genome Hybridization (aCGH) followed by validation procedures, in 90 HapMap CEU samples. The second evaluation was program performance calling rare and common CNVs in the Bipolar Genome Study (BiGS) data set (1001 bipolar cases and 1033 controls, all of European ancestry) as measured by the Affymetrix SNP 6.0 array. Accuracy in calling rare CNVs was assessed by positive predictive value, based on the proportion of rare CNVs validated by quantitative real-time PCR (qPCR), while accuracy in calling common CNVs was assessed by false positive/false negative rates based on qPCR validation results from a subset of common CNVs. Birdsuite recovered the highest percentages of known HapMap CNVs containing >20 markers in two reference CNV datasets. The recovery rate increased with decreased CNV frequency. In the tested rare CNV data, Birdsuite and Partek had higher positive predictive values than the other software suites. In a test of three common CNVs in the BiGS dataset, Birdsuite's call was 98.8% consistent with qPCR quantification in one CNV region, but the other two regions showed an unacceptable degree of accuracy. We found relatively poor consistency between the two "gold standards," the sequence data of Kidd et al., and aCGH data of Conrad et al. Algorithms for calling CNVs especially common ones need substantial improvement, and a "gold standard" for detection of CNVs remains to be established.
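The two evaluation measures described above, recovery rate and positive predictive value, can be sketched as follows. The overlap rule used here (a reference CNV counts as recovered if any call on the same chromosome overlaps it by at least one base) is an assumption chosen for simplicity; the abstract does not specify the matching criterion, and the coordinates and counts below are invented.

```python
def overlaps(a, b):
    """True if two (chrom, start, end) intervals share at least one base."""
    return a[0] == b[0] and a[1] <= b[2] and b[1] <= a[2]


def recovery_rate(reference, calls):
    """Fraction of reference CNVs overlapped by at least one called CNV."""
    if not reference:
        raise ValueError("reference set is empty")
    recovered = sum(1 for ref in reference if any(overlaps(ref, c) for c in calls))
    return recovered / len(reference)


def positive_predictive_value(n_validated, n_tested):
    """Proportion of tested calls confirmed by an independent assay (e.g. qPCR)."""
    if n_tested == 0:
        raise ValueError("no calls were tested")
    return n_validated / n_tested


# Made-up example data.
reference = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 100, 200), ("chr3", 1, 50)]
calls = [("chr1", 150, 250), ("chr2", 300, 400), ("chr3", 40, 80)]

print("recovery rate:", recovery_rate(reference, calls))
print("PPV:", positive_predictive_value(18, 20))
```

A stricter criterion, such as a minimum reciprocal overlap, would lower the recovery rate for the same calls, which is one reason results from different benchmarks are hard to compare.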
The Qatar genome: a population-specific tool for precision medicine in the Middle East
Fakhro, Khalid A; Staudt, Michelle R; Ramstetter, Monica Denise; Robay, Amal; Malek, Joel A; Badii, Ramin; Al-Marri, Ajayeb Al-Nabet; Khalil, Charbel Abi; Al-Shakaki, Alya; Chidiac, Omar; Stadler, Dora; Zirie, Mahmoud; Jayyousi, Amin; Salit, Jacqueline; Mezey, Jason G; Crystal, Ronald G; Rodriguez-Flores, Juan L
2016-01-01
Reaching the full potential of precision medicine depends on the quality of personalized genome interpretation. In order to facilitate precision medicine in regions of the Middle East and North Africa (MENA), a population-specific genome for the indigenous Arab population of Qatar (QTRG) was constructed by incorporating allele frequency data from sequencing of 1,161 Qataris, representing 0.4% of the population. A total of 20.9 million single nucleotide polymorphisms (SNPs) and 3.1 million indels were observed in Qatar, including an average of 1.79% novel variants per individual genome. Replacement of the GRCh37 standard reference with QTRG in a best practices genome analysis workflow resulted in an average of 7× deeper coverage depth (an improvement of 23%) and 756,671 fewer variants on average, a reduction of 16% that is attributed to common Qatari alleles being present in QTRG. The benefit for using QTRG varies across ancestries, a factor that should be taken into consideration when selecting an appropriate reference for analysis. PMID:27408750
Multiple testing and power calculations in genetic association studies.
So, Hon-Cheong; Sham, Pak C
2011-01-01
Modern genetic association studies typically involve multiple single-nucleotide polymorphisms (SNPs) and/or multiple genes. With the development of high-throughput genotyping technologies and the reduction in genotyping cost, investigators can now assay up to a million SNPs for direct or indirect association with disease phenotypes. In addition, some studies involve multiple disease or related phenotypes and use multiple methods of statistical analysis. The combination of multiple genetic loci, multiple phenotypes, and multiple methods of evaluating associations between genotype and phenotype means that modern genetic studies often involve the testing of an enormous number of hypotheses. When multiple hypothesis tests are performed in a study, there is a risk of inflation of the type I error rate (i.e., the chance of falsely claiming an association when there is none). Several methods for multiple-testing correction are in popular use, and they all have strengths and weaknesses. Because no single method is universally adopted or always appropriate, it is important to understand the principles, strengths, and weaknesses of the methods so that they can be applied appropriately in practice. In this article, we review the three principal methods for multiple-testing correction and provide guidance for calculating statistical power.
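To make the idea of multiple-testing correction concrete, the sketch below implements two widely used procedures: the Bonferroni correction, which controls the family-wise error rate, and the Benjamini-Hochberg step-up procedure, which controls the false discovery rate. These two were chosen for illustration; the abstract does not name the methods it reviews, and the p-values below are invented.

```python
def bonferroni(p_values, alpha=0.05):
    """Reject H0 for every test whose p-value is at most alpha / m."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def benjamini_hochberg(p_values, alpha=0.05):
    """Benjamini-Hochberg step-up procedure.

    Find the largest rank k such that p_(k) <= (k / m) * alpha, then reject
    the k hypotheses with the smallest p-values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            cutoff = rank
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff:
            rejected[idx] = True
    return rejected


# Made-up p-values from five hypothetical association tests.
p_values = [0.001, 0.008, 0.025, 0.039, 0.20]
print("Bonferroni:        ", bonferroni(p_values))
print("Benjamini-Hochberg:", benjamini_hochberg(p_values))
```

With these inputs Bonferroni rejects only the two smallest p-values, while Benjamini-Hochberg rejects four, which illustrates the trade-off between strict type I error control and power.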
Natural Allelic Variations in Highly Polyploidy Saccharum Complex
DOE Office of Scientific and Technical Information (OSTI.GOV)
Song, Jian; Yang, Xiping; Resende, Jr., Marcio F. R.
Sugarcane (Saccharum spp.) is an important sugar and biofuel crop with high polyploid and complex genomes. The Saccharum complex, comprised of Saccharum genus and a few related genera, are important genetic resources for sugarcane breeding. A large amount of natural variation exists within the Saccharum complex. Though understanding their allelic variation has been challenging, it is critical to dissect allelic structure and to identify the alleles controlling important traits in sugarcane. To characterize natural variations in Saccharum complex, a target enrichment sequencing approach was used to assay 12 representative germplasm accessions. In total, 55,946 highly efficient probes were designed based on the sorghum genome and sugarcane unigene set targeting a total of 6 Mb of the sugarcane genome. A pipeline specifically tailored for polyploid sequence variants and genotype calling was established. BWA-mem and the sorghum genome proved to be an acceptable aligner and reference for sugarcane target enrichment sequence analysis, respectively. Genetic variations including 1,166,066 non-redundant SNPs, 150,421 InDels, 919 gene copy number variations, and 1,257 gene presence/absence variations were detected. SNPs from three different callers (Samtools, Freebayes, and GATK) were compared and the validation rates were nearly 90%. Based on the SNP loci of each accession and their ploidy levels, 999,258 single dosage SNPs were identified and most loci were estimated as largely homozygotes. An average of 34,397 haplotype blocks for each accession was inferred. The highest divergence time among the Saccharum spp. was estimated as 1.2 million years ago (MYA). Saccharum spp. diverged from Erianthus and Sorghum approximately 5 and 6 MYA, respectively. Furthermore, the target enrichment sequencing approach provided an effective way to discover and catalog natural allelic variation in highly polyploid or heterozygous genomes.
The Impact of Ancestry and Common Genetic Variants on QT Interval in African Americans
Smith, J. Gustav; Avery, Christy L.; Evans, Daniel S.; Nalls, Michael A.; Meng, Yan A.; Smith, Erin N.; Palmer, Cameron; Tanaka, Toshiko; Mehra, Reena; Butler, Anne M.; Young, Taylor; Buxbaum, Sarah G.; Kerr, Kathleen F.; Berenson, Gerald S.; Schnabel, Renate B.; Li, Guo; Ellinor, Patrick T.; Magnani, Jared W.; Chen, Wei; Bis, Joshua C.; Curb, J. David; Hsueh, Wen-Chi; Rotter, Jerome I.; Liu, Yongmei; Newman, Anne B.; Limacher, Marian C.; North, Kari E.; Reiner, Alexander P.; Quibrera, P. Miguel; Schork, Nicholas J.; Singleton, Andrew B.; Psaty, Bruce M.; Soliman, Elsayed Z.; Solomon, Allen J.; Srinivasan, Sathanur R.; Alonso, Alvaro; Wallace, Robert; Redline, Susan; Zhang, Zhu-Ming; Post, Wendy S.; Zonderman, Alan B.; Taylor, Herman A.; Murray, Sarah S.; Ferrucci, Luigi; Arking, Dan E.; Evans, Michele K.; Fox, Ervin R.; Sotoodehnia, Nona; Heckbert, Susan R.; Whitsel, Eric A.; Newton-Cheh, Christopher
2013-01-01
Background Ethnic differences in cardiac arrhythmia incidence have been reported, with a particularly high incidence of sudden cardiac death (SCD) and low incidence of atrial fibrillation in individuals of African ancestry. We tested the hypotheses that African ancestry and common genetic variants are associated with prolonged duration of cardiac repolarization, a central pathophysiological determinant of arrhythmia, as measured by the electrocardiographic QT interval. Methods and Results First, individual estimates of African and European ancestry were inferred from genome-wide single nucleotide polymorphism (SNP) data in seven population-based cohorts of African Americans (n=12 097) and regressed on measured QT interval from electrocardiograms. Second, imputation was performed for 2.8 million SNPs and a genome-wide association (GWA) study of QT interval performed in ten cohorts (n=13 105). There was no evidence of association between genetic ancestry and QT interval (p=0.94). Genome-wide significant associations (p<2.5×10^-8) were identified with SNPs at two loci, upstream of the genes NOS1AP (rs12143842, p=2×10^-15) and ATP1B1 (rs1320976, p=2×10^-10). The most significant SNP in NOS1AP was the same as the strongest SNP previously associated with QT interval in individuals of European ancestry. Low p-values (p<10^-5) were observed for SNPs at several other loci previously identified in GWA studies in individuals of European ancestry, including KCNQ1, KCNH2, LITAF and PLN. Conclusions We observed no difference in duration of cardiac repolarization with global genetic indices of African ancestry. In addition, our GWA study extends the association of polymorphisms at several loci associated with repolarization in individuals of European ancestry to include African Americans. PMID:23166209
Natural Allelic Variations in Highly Polyploidy Saccharum Complex
Song, Jian; Yang, Xiping; Resende, Jr., Marcio F. R.; ...
2016-06-08
Sugarcane (Saccharum spp.) is an important sugar and biofuel crop with high polyploid and complex genomes. The Saccharum complex, comprised of Saccharum genus and a few related genera, are important genetic resources for sugarcane breeding. A large amount of natural variation exists within the Saccharum complex. Though understanding their allelic variation has been challenging, it is critical to dissect allelic structure and to identify the alleles controlling important traits in sugarcane. To characterize natural variations in Saccharum complex, a target enrichment sequencing approach was used to assay 12 representative germplasm accessions. In total, 55,946 highly efficient probes were designed based on the sorghum genome and sugarcane unigene set targeting a total of 6 Mb of the sugarcane genome. A pipeline specifically tailored for polyploid sequence variants and genotype calling was established. BWA-mem and the sorghum genome proved to be an acceptable aligner and reference for sugarcane target enrichment sequence analysis, respectively. Genetic variations including 1,166,066 non-redundant SNPs, 150,421 InDels, 919 gene copy number variations, and 1,257 gene presence/absence variations were detected. SNPs from three different callers (Samtools, Freebayes, and GATK) were compared and the validation rates were nearly 90%. Based on the SNP loci of each accession and their ploidy levels, 999,258 single dosage SNPs were identified and most loci were estimated as largely homozygotes. An average of 34,397 haplotype blocks for each accession was inferred. The highest divergence time among the Saccharum spp. was estimated as 1.2 million years ago (MYA). Saccharum spp. diverged from Erianthus and Sorghum approximately 5 and 6 MYA, respectively. Furthermore, the target enrichment sequencing approach provided an effective way to discover and catalog natural allelic variation in highly polyploid or heterozygous genomes.
Impact of ancestry and common genetic variants on QT interval in African Americans.
Smith, J Gustav; Avery, Christy L; Evans, Daniel S; Nalls, Michael A; Meng, Yan A; Smith, Erin N; Palmer, Cameron; Tanaka, Toshiko; Mehra, Reena; Butler, Anne M; Young, Taylor; Buxbaum, Sarah G; Kerr, Kathleen F; Berenson, Gerald S; Schnabel, Renate B; Li, Guo; Ellinor, Patrick T; Magnani, Jared W; Chen, Wei; Bis, Joshua C; Curb, J David; Hsueh, Wen-Chi; Rotter, Jerome I; Liu, Yongmei; Newman, Anne B; Limacher, Marian C; North, Kari E; Reiner, Alexander P; Quibrera, P Miguel; Schork, Nicholas J; Singleton, Andrew B; Psaty, Bruce M; Soliman, Elsayed Z; Solomon, Allen J; Srinivasan, Sathanur R; Alonso, Alvaro; Wallace, Robert; Redline, Susan; Zhang, Zhu-Ming; Post, Wendy S; Zonderman, Alan B; Taylor, Herman A; Murray, Sarah S; Ferrucci, Luigi; Arking, Dan E; Evans, Michele K; Fox, Ervin R; Sotoodehnia, Nona; Heckbert, Susan R; Whitsel, Eric A; Newton-Cheh, Christopher
2012-12-01
Ethnic differences in cardiac arrhythmia incidence have been reported, with a particularly high incidence of sudden cardiac death and low incidence of atrial fibrillation in individuals of African ancestry. We tested the hypotheses that African ancestry and common genetic variants are associated with prolonged duration of cardiac repolarization, a central pathophysiological determinant of arrhythmia, as measured by the electrocardiographic QT interval. First, individual estimates of African and European ancestry were inferred from genome-wide single-nucleotide polymorphism (SNP) data in 7 population-based cohorts of African Americans (n=12,097) and regressed on measured QT interval from ECGs. Second, imputation was performed for 2.8 million SNPs, and a genome-wide association study of QT interval was performed in 10 cohorts (n=13,105). There was no evidence of association between genetic ancestry and QT interval (P=0.94). Genome-wide significant associations (P<2.5 × 10^-8) were identified with SNPs at 2 loci, upstream of the genes NOS1AP (rs12143842, P=2 × 10^-15) and ATP1B1 (rs1320976, P=2 × 10^-10). The most significant SNP in NOS1AP was the same as the strongest SNP previously associated with QT interval in individuals of European ancestry. Low probability values (P<10^-5) were observed for SNPs at several other loci previously identified in genome-wide association studies in individuals of European ancestry, including KCNQ1, KCNH2, LITAF, and PLN. We observed no difference in duration of cardiac repolarization with global genetic indices of African American ancestry. In addition, our genome-wide association study extends the association of polymorphisms at several loci associated with repolarization in individuals of European ancestry to include individuals of African ancestry.
van Rooij, Frank J. A.; Ehret, Georg B.; Boerwinkle, Eric; Felix, Janine F.; Leak, Tennille S.; Harris, Tamara B.; Yang, Qiong; Dehghan, Abbas; Aspelund, Thor; Katz, Ronit; Homuth, Georg; Kocher, Thomas; Rettig, Rainer; Ried, Janina S.; Gieger, Christian; Prucha, Hanna; Pfeufer, Arne; Meitinger, Thomas; Coresh, Josef; Hofman, Albert; Sarnak, Mark J.; Chen, Yii-Der Ida; Uitterlinden, André G.; Chakravarti, Aravinda; Psaty, Bruce M.; van Duijn, Cornelia M.; Kao, W. H. Linda; Witteman, Jacqueline C. M.; Gudnason, Vilmundur; Siscovick, David S.; Fox, Caroline S.; Köttgen, Anna
2010-01-01
Magnesium, potassium, and sodium, cations commonly measured in serum, are involved in many physiological processes including energy metabolism, nerve and muscle function, signal transduction, and fluid and blood pressure regulation. To evaluate the contribution of common genetic variation to normal physiologic variation in serum concentrations of these cations, we conducted genome-wide association studies of serum magnesium, potassium, and sodium concentrations using ∼2.5 million genotyped and imputed common single nucleotide polymorphisms (SNPs) in 15,366 participants of European descent from the international CHARGE Consortium. Study-specific results were combined using fixed-effects inverse-variance weighted meta-analysis. SNPs demonstrating genome-wide significant (p<5×10−8) or suggestive associations (p<4×10−7) were evaluated for replication in an additional 8,463 subjects of European descent. The association of common variants at six genomic regions (in or near MUC1, ATP2B1, DCDC5, TRPM6, SHROOM3, and MDS1) with serum magnesium levels was genome-wide significant when meta-analyzed with the replication dataset. All initially significant SNPs from the CHARGE Consortium showed nominal association with clinically defined hypomagnesemia, two showed association with kidney function, two with bone mineral density, and one of these also associated with fasting glucose levels. Common variants in CNNM2, a magnesium transporter studied only in model systems to date, as well as in CNNM3 and CNNM4, were also associated with magnesium concentrations in this study. We observed no associations with serum sodium or potassium levels exceeding p<4×10−7. Follow-up studies of newly implicated genomic loci may provide additional insights into the regulation and homeostasis of human serum magnesium levels. PMID:20700443
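The fixed-effects inverse-variance weighted meta-analysis mentioned above can be sketched as follows. This is a minimal illustration, not the consortium's actual pipeline: the function name `fixed_effects_meta` and the example inputs are hypothetical, and it assumes each study reports a per-SNP effect estimate and its standard error on a common scale.

```python
import math

def fixed_effects_meta(betas, ses):
    """Pool per-study effect estimates with inverse-variance weights.

    betas: per-study effect estimates for one SNP (hypothetical inputs)
    ses:   matching standard errors
    Returns (pooled_beta, pooled_se, z, two_sided_p).
    """
    if not betas or len(betas) != len(ses):
        raise ValueError("betas and ses must be non-empty and of equal length")
    # Each study is weighted by the inverse of its sampling variance.
    weights = [1.0 / (se * se) for se in ses]
    total = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total
    pooled_se = math.sqrt(1.0 / total)
    z = pooled_beta / pooled_se
    # Two-sided p-value under a standard normal null distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled_beta, pooled_se, z, p

if __name__ == "__main__":
    beta, se, z, p = fixed_effects_meta([0.10, 0.20], [0.1, 0.1])
    print(f"beta={beta:.3f} se={se:.4f} z={z:.3f} p={p:.4f}")
```

A pooled p-value computed this way would then be compared with the thresholds quoted in the abstract (p<5×10−8 for genome-wide significance, p<4×10−7 for suggestive association).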
Genetic Indicators of Drug Resistance in the Highly Repetitive Genome of Trichomonas vaginalis.
Bradic, Martina; Warring, Sally D; Tooley, Grace E; Scheid, Paul; Secor, William E; Land, Kirkwood M; Huang, Po-Jung; Chen, Ting-Wen; Lee, Chi-Ching; Tang, Petrus; Sullivan, Steven A; Carlton, Jane M
2017-06-01
Trichomonas vaginalis, the most common nonviral sexually transmitted parasite, causes ∼283 million trichomoniasis infections annually and is associated with pregnancy complications and increased risk of HIV-1 acquisition. The antimicrobial drug metronidazole is used for treatment, but in a fraction of clinical cases, the parasites can become resistant to this drug. We undertook sequencing of multiple clinical isolates and lab derived lines to identify genetic markers and mechanisms of metronidazole resistance. Reduced representation genome sequencing of ∼100 T. vaginalis clinical isolates identified 3,923 SNP markers and presence of a bipartite population structure. Linkage disequilibrium was found to decay rapidly, suggesting genome-wide recombination and the feasibility of genetic association studies in the parasite. We identified 72 SNPs associated with metronidazole resistance, and a comparison of SNPs within several lab-derived resistant lines revealed an overlap with the clinically resistant isolates. We identified SNPs in genes for which no function has yet been assigned, as well as in functionally-characterized genes relevant to drug resistance (e.g., pyruvate:ferredoxin oxidoreductase). Transcription profiles of resistant strains showed common changes in genes involved in drug activation (e.g., flavin reductase), accumulation (e.g., multidrug resistance pump), and detoxification (e.g., nitroreductase). Finally, we identified convergent genetic changes in lab-derived resistant lines of Tritrichomonas foetus, a distantly related species that causes venereal disease in cattle. Shared genetic changes within and between T. vaginalis and Tr. foetus parasites suggest conservation of the pathways through which adaptation has occurred. These findings extend our knowledge of drug resistance in the parasite, providing a panel of markers that can be used as a diagnostic tool. © The Author 2017. 
Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Genetic Indicators of Drug Resistance in the Highly Repetitive Genome of Trichomonas vaginalis
Bradic, Martina; Warring, Sally D.; Tooley, Grace E.; Scheid, Paul; Secor, William E.; Land, Kirkwood M.; Huang, Po-Jung; Chen, Ting-Wen; Lee, Chi-Ching; Tang, Petrus; Sullivan, Steven A.
2017-01-01
Abstract Trichomonas vaginalis, the most common nonviral sexually transmitted parasite, causes ∼283 million trichomoniasis infections annually and is associated with pregnancy complications and increased risk of HIV-1 acquisition. The antimicrobial drug metronidazole is used for treatment, but in a fraction of clinical cases, the parasites can become resistant to this drug. We undertook sequencing of multiple clinical isolates and lab derived lines to identify genetic markers and mechanisms of metronidazole resistance. Reduced representation genome sequencing of ∼100 T. vaginalis clinical isolates identified 3,923 SNP markers and presence of a bipartite population structure. Linkage disequilibrium was found to decay rapidly, suggesting genome-wide recombination and the feasibility of genetic association studies in the parasite. We identified 72 SNPs associated with metronidazole resistance, and a comparison of SNPs within several lab-derived resistant lines revealed an overlap with the clinically resistant isolates. We identified SNPs in genes for which no function has yet been assigned, as well as in functionally-characterized genes relevant to drug resistance (e.g., pyruvate:ferredoxin oxidoreductase). Transcription profiles of resistant strains showed common changes in genes involved in drug activation (e.g., flavin reductase), accumulation (e.g., multidrug resistance pump), and detoxification (e.g., nitroreductase). Finally, we identified convergent genetic changes in lab-derived resistant lines of Tritrichomonas foetus, a distantly related species that causes venereal disease in cattle. Shared genetic changes within and between T. vaginalis and Tr. foetus parasites suggest conservation of the pathways through which adaptation has occurred. These findings extend our knowledge of drug resistance in the parasite, providing a panel of markers that can be used as a diagnostic tool. PMID:28633446
Kim, Tae-Sung; He, Qiang; Kim, Kyu-Won; Yoon, Min-Young; Ra, Won-Hee; Li, Feng Peng; Tong, Wei; Yu, Jie; Oo, Win Htet; Choi, Buung; Heo, Eun-Beom; Yun, Byoung-Kook; Kwon, Soon-Jae; Kwon, Soon-Wook; Cho, Yoo-Hyun; Lee, Chang-Yong; Park, Beom-Seok; Park, Yong-Jin
2016-05-26
Rice germplasm collections continue to grow in number and size around the world. Since maintaining and screening such massive resources remains challenging, it is important to establish practical methods to manage them. A core collection, by definition, refers to a subset of the entire population that preserves the majority of genetic diversity, enhancing the efficiency of germplasm utilization. Here, we report whole-genome resequencing of the 137-accession rice mini core collection, or Korean rice core set (KRICE_CORE), that represents 25,604 rice germplasms deposited in the Korean genebank of the Rural Development Administration (RDA). We used the Illumina HiSeq 2000 and 2500 platforms to produce short reads and then assembled them at a depth of 9.8× using Nipponbare as a reference. Comparisons of the sequences with the reference genome yielded more than 15 million (M) single nucleotide polymorphisms (SNPs) and 1.3 M INDELs. Phylogenetic and population analyses using 2,046,529 high-quality SNPs successfully assigned rice accessions to the relevant rice subgroups, suggesting that these SNPs capture evolutionary signatures that have accumulated in rice subpopulations. Furthermore, genome-wide association studies (GWAS) for four exemplary agronomic traits in KRICE_CORE demonstrate the utility of the collection; that is, they identify previously defined genes or novel genetic factors that potentially regulate important phenotypes. This study provides strong evidence that KRICE_CORE, although small, contains high genetic and functional diversity across the genome. Thus, our resequencing results will be useful for future breeding, as well as functional and evolutionary studies, in the post-genomic era.
Whole-genome sequencing and genetic variant analysis of a Quarter Horse mare.
Doan, Ryan; Cohen, Noah D; Sawyer, Jason; Ghaffari, Noushin; Johnson, Charlie D; Dindot, Scott V
2012-02-17
The catalog of genetic variants in the horse genome originates from a few select animals, the majority originating from the Thoroughbred mare used for the equine genome sequencing project. The purpose of this study was to identify genetic variants, including single nucleotide polymorphisms (SNPs), insertion/deletion polymorphisms (INDELs), and copy number variants (CNVs) in the genome of an individual Quarter Horse mare sequenced by next-generation sequencing. Using massively parallel paired-end sequencing, we generated 59.6 Gb of DNA sequence from a Quarter Horse mare resulting in an average of 24.7X sequence coverage. Reads were mapped to approximately 97% of the reference Thoroughbred genome. Unmapped reads were de novo assembled resulting in 19.1 Mb of new genomic sequence in the horse. Using a stringent filtering method, we identified 3.1 million SNPs, 193 thousand INDELs, and 282 CNVs. Genetic variants were annotated to determine their impact on gene structure and function. Additionally, we genotyped this Quarter Horse for mutations of known diseases and for variants associated with particular traits. Functional clustering analysis of genetic variants revealed that most of the genetic variation in the horse's genome was enriched in sensory perception, signal transduction, and immunity and defense pathways. This is the first sequencing of a horse genome by next-generation sequencing and the first genomic sequence of an individual Quarter Horse mare. We have increased the catalog of genetic variants for use in equine genomics by the addition of novel SNPs, INDELs, and CNVs. The genetic variants described here will be a useful resource for future studies of genetic variation regulating performance traits and diseases in equids.
Evans, Joseph; Crisovan, Emily; Barry, Kerrie; ...
2015-10-01
Panicum virgatum L. (switchgrass) is a polyploid, perennial grass species that is native to North America, and is being developed as a future biofuel feedstock crop. Switchgrass is present primarily in two ecotypes: a northern upland ecotype, composed of tetraploid and octoploid accessions, and a southern lowland ecotype, composed of primarily tetraploid accessions. We employed high-coverage exome capture sequencing (~2.4 Tb) to genotype 537 individuals from 45 upland and 21 lowland populations. From these data, we identified ~27 million single-nucleotide polymorphisms (SNPs), of which 1 590 653 high-confidence SNPs were used in downstream analyses of diversity within and between the populations. From the 66 populations, we identified five primary population groups within the upland and lowland ecotypes, a result that was further supported through genetic distance analysis. We identified conserved, ecotype-restricted, non-synonymous SNPs that are predicted to affect the protein function of CONSTANS (CO) and EARLY HEADING DATE 1 (EHD1), key genes involved in flowering, which may contribute to the phenotypic differences between the two ecotypes. We also identified, relative to the near-reference Kanlow population, 17 228 genes present in more copies than in the reference genome (up-CNVs), 112 630 genes present in fewer copies than in the reference genome (down-CNVs) and 14 430 presence/absence variants (PAVs), affecting a total of 9979 genes, including two upland-specific CNV clusters. In total, 45 719 genes were affected by an SNP, CNV, or PAV across the panel, providing a firm foundation to identify functional variation associated with phenotypic traits of interest for biofuel feedstock production.
Slattery, Martha L.; Lundgreen, Abbie; Herrick, Jennifer S.; Wolff, Roger K.; Caan, Bette J.
2012-01-01
BACKGROUND The transforming growth factor-β (TGF-β) signaling pathway is involved in many aspects of tumorigenesis, including angiogenesis and metastasis. The authors evaluated this pathway in association with survival after a diagnosis of colon or rectal cancer. METHODS The study included 1553 patients with colon cancer and 754 patients with rectal cancer who had incident first primary disease and were followed for a minimum of 7 years after diagnosis. Genetic variations were evaluated in the genes TGF-β1 (2 single nucleotide polymorphisms [SNPs]), TGF-β receptor 1 (TGF-βR1) (3 SNPs), smooth muscle actin/mothers against decapentaplegic homolog 1 (Smad1) (5 SNPs), Smad2 (4 SNPs), Smad3 (37 SNPs), Smad4 (2 SNPs), Smad7 (11 SNPs), bone morphogenetic protein 1 (BMP1) (11 SNPs), BMP2 (5 SNPs), BMP4 (3 SNPs), bone morphogenetic protein receptor 1A (BMPR1A) (9 SNPs), BMPR1B (21 SNPs), BMPR2 (11 SNPs), growth differentiation factor 10 (GDF10) (7 SNPs), Runt-related transcription factor 1 (RUNX1) (40 SNPs), RUNX2 (19 SNPs), RUNX3 (9 SNPs), eukaryotic translation initiation factor 4E (eiF4E) (3 SNPs), eukaryotic translation initiation factor 4E-binding protein 2 (eiF4EBP2) (2 SNPs), eiF4EBP3 (2 SNPs), and mitogen-activated protein kinase 1 (MAPK1) (6 SNPs). RESULTS After adjusting for American Joint Committee on Cancer stage and tumor molecular phenotype, 12 genes and 18 SNPs were associated with survival in patients with colon cancer, and 7 genes and 15 tagSNPs were associated with survival after a diagnosis of rectal cancer. A summary score based on “at-risk” genotypes revealed a hazard rate ratio of 5.10 (95% confidence interval, 2.56-10.15) for the group with the greatest number of “at-risk” genotypes; for rectal cancer, the hazard rate ratio was 6.03 (95% confidence interval, 2.83-12.75).
CONCLUSIONS The current findings suggest that the presence of several higher risk alleles in the TGF-β signaling pathway increases the likelihood of dying after a diagnosis of colon or rectal cancer. PMID:21365634
Estimating haplotype frequencies by combining data from large DNA pools with database information.
Gasbarra, Dario; Kulathinal, Sangita; Pirinen, Matti; Sillanpää, Mikko J
2011-01-01
We assume that allele frequency data have been extracted from several large DNA pools, each containing genetic material of up to hundreds of sampled individuals. Our goal is to estimate the haplotype frequencies among the sampled individuals by combining the pooled allele frequency data with prior knowledge about the set of possible haplotypes. Such prior information can be obtained, for example, from a database such as HapMap. We present a Bayesian haplotyping method for pooled DNA based on a continuous approximation of the multinomial distribution. The proposed method is applicable when the sizes of the DNA pools and/or the number of considered loci exceed the limits of several earlier methods. In the example analyses, the proposed model clearly outperforms a deterministic greedy algorithm on real data from the HapMap database. With a small number of loci, the performance of the proposed method is similar to that of an EM-algorithm, which uses a multinormal approximation for the pooled allele frequencies, but which does not utilize prior information about the haplotypes. The method has been implemented using Matlab and the code is available upon request from the authors.
Luo, Ruibang; Wong, Yiu-Lun; Law, Wai-Chun; Lee, Lap-Kei; Cheung, Jeanno; Liu, Chi-Man; Lam, Tak-Wah
2014-01-01
This paper reports an integrated solution, called BALSA, for the secondary analysis of next generation sequencing data; it exploits the computational power of GPU and an intricate memory management to give a fast and accurate analysis. From raw reads to variants (including SNPs and Indels), BALSA, using just a single computing node with a commodity GPU board, takes 5.5 h to process 50-fold whole genome sequencing (∼750 million 100 bp paired-end reads), or just 25 min for 210-fold whole exome sequencing. BALSA's speed is rooted at its parallel algorithms to effectively exploit a GPU to speed up processes like alignment, realignment and statistical testing. BALSA incorporates a 16-genotype model to support the calling of SNPs and Indels and achieves competitive variant calling accuracy and sensitivity when compared to the ensemble of six popular variant callers. BALSA also supports efficient identification of somatic SNVs and CNVs; experiments showed that BALSA recovers all the previously validated somatic SNVs and CNVs, and it is more sensitive for somatic Indel detection. BALSA outputs variants in VCF format. A pileup-like SNAPSHOT format, while maintaining the same fidelity as BAM in variant calling, enables efficient storage and indexing, and facilitates the App development of downstream analyses. BALSA is available at: http://sourceforge.net/p/balsa.
Fang, Lu; Yang, Yuchen; Guo, Wuxia; Li, Jianfang; Zhong, Cairong; Huang, Yelin; Zhou, Renchao; Shi, Suhua
2016-08-01
Aegiceras corniculatum (L.) Blanco is one of the most salt tolerant mangrove species and can thrive in 3% salinity at the seaward edge of mangrove forests. Here we sequenced the transcriptome of A. corniculatum using the Illumina GA platform to develop genomic resources for ecological and evolutionary studies. We obtained about 50 million high-quality paired-end reads of 75 bp in length. Using the short read assembler Velvet, we obtained 49,437 contigs with an average length of 625 bp. A total of 32,744 (66.23%) contigs showed significant similarity to the GenBank non-redundant (NR) protein database. Of these sequences, 30,911 and 18,004 were assigned to Gene Ontology terms and eukaryotic orthologous groups of proteins (KOG), respectively. A total of 4942 transcripts from our assemblies had significant similarity with KEGG Orthologs and were involved in 144 KEGG pathways, while 9899 unigenes had enzyme commission (EC) numbers. In addition, 9792 transcriptome-derived SSRs were identified from 7342 sequences. With our strict criteria, 4165 candidate SNPs were also identified from 2058 contigs. Some of these SNPs were further validated by Sanger sequencing. Genomic resources generated in this study should be valuable in ecological, evolutionary, and functional genomics studies for this mangrove species. Copyright © 2016 Elsevier B.V. All rights reserved.
Lu, Qiongshi; Li, Boyang; Ou, Derek; Erlendsdottir, Margret; Powles, Ryan L; Jiang, Tony; Hu, Yiming; Chang, David; Jin, Chentian; Dai, Wei; He, Qidu; Liu, Zefeng; Mukherjee, Shubhabrata; Crane, Paul K; Zhao, Hongyu
2017-12-07
Despite the success of large-scale genome-wide association studies (GWASs) on complex traits, our understanding of their genetic architecture is far from complete. Jointly modeling multiple traits' genetic profiles has provided insights into the shared genetic basis of many complex traits. However, large-scale inference sets a high bar for both statistical power and biological interpretability. Here we introduce a principled framework to estimate annotation-stratified genetic covariance between traits using GWAS summary statistics. Through theoretical and numerical analyses, we demonstrate that our method provides accurate covariance estimates, thereby enabling researchers to dissect both the shared and distinct genetic architecture across traits to better understand their etiologies. Among 50 complex traits with publicly accessible GWAS summary statistics (N total ≈ 4.5 million), we identified more than 170 pairs with statistically significant genetic covariance. In particular, we found strong genetic covariance between late-onset Alzheimer disease (LOAD) and amyotrophic lateral sclerosis (ALS), two major neurodegenerative diseases, in single-nucleotide polymorphisms (SNPs) with high minor allele frequencies and in SNPs located in the predicted functional genome. Joint analysis of LOAD, ALS, and other traits highlights LOAD's correlation with cognitive traits and hints at an autoimmune component for ALS. Copyright © 2017 American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.
Genetic Modifiers of Ovarian Cancer
2014-08-01
samples from many countries. To account for population stratification, the genotyping data in combination with HapMap data (CEU, Yoruban, Han Chinese...Cambridge, we evaluated associations with both breast and ovarian cancer using a retrospective likelihood model. This accounts for the age extremes of...carriers we used a competing risk analysis that accounted for the effects on breast and ovarian cancer in parallel. In this competing risk analysis
Singh, Kh Dhanachandra; Karthikeyan, Muthusamy
2014-12-01
The renin-angiotensin-aldosterone system (RAAS) plays a key role in the regulation of blood pressure (BP). Mutations in the genes that encode components of the RAAS have played a significant role in genetic susceptibility to hypertension and have been intensively scrutinized. The identification of such probably causal mutations not only provides insight into the RAAS but may also yield antihypertensive therapeutic targets and diagnostic markers. Because the large set of reported SNPs contains both functional and neutral variants, testing every SNP experimentally to determine its biological significance is challenging. To explore the functional significance of these genetic mutations (SNPs), we adopted a combined sequence-based and sequence-structure-based SNP analysis algorithm. Out of 3864 SNPs reported in dbSNP, we found 108 missense SNPs in the coding region, with the remainder in non-coding regions. In this study, we report a coding-region SNP as deleterious only when three or more tools predict it to be deleterious and it has a high RMSD from the native structure. Based on these analyses, we identified as deleterious two SNPs in the REN gene, eight SNPs in the AGT gene, three SNPs in the ACE gene, two SNPs in the AT1R gene, three SNPs in the CYP11B2 gene and three SNPs in the CMA1 gene. This type of study will help reduce the cost and time needed to identify potential SNPs and to select, from the SNP pool, candidates for experimental study.
DNA Compass: a secure, client-side site for navigating personal genetic information
Curnin, Charles; Gordon, Assaf; Erlich, Yaniv
2017-01-01
Abstract Motivation: Millions of individuals have access to raw genomic data using direct-to-consumer companies. The advent of large-scale sequencing projects, such as the Precision Medicine Initiative, will further increase the number of individuals with access to their own genomic information. However, querying genomic data requires a computer terminal and computational skill to analyze the data—an impediment for the general public. Results: DNA Compass is a website designed to empower the public by enabling simple navigation of personal genomic data. Users can query the status of their genomic variants for over 1658 markers or tens of millions of documented single nucleotide polymorphisms (SNPs). DNA Compass presents the relevant genotypes of the user side-by-side with explanatory scientific resources. The genotype data never leaves the user’s computer, a feature that provides improved security and performance. More than 12 000 unique users, mainly from the general genetic genealogy community, have already used DNA Compass, demonstrating its utility. Availability and Implementation: DNA Compass is freely available on https://compass.dna.land. Contact: yaniv@cs.columbia.edu PMID:28334237
Linear reduction method for predictive and informative tag SNP selection.
He, Jingwu; Westbrooks, Kelly; Zelikovsky, Alexander
2005-01-01
Constructing a complete human haplotype map is helpful when associating complex diseases with their related SNPs. Unfortunately, the number of SNPs is very large and it is costly to sequence many individuals. Therefore, it is desirable to reduce the number of SNPs that should be sequenced to a small number of informative representatives called tag SNPs. In this paper, we propose a new linear algebra-based method for selecting and using tag SNPs. We measure the quality of our tag SNP selection algorithm by comparing actual SNPs with SNPs predicted from selected linearly independent tag SNPs. Our experiments show that for sufficiently long haplotypes, knowing only 0.4% of all SNPs the proposed linear reduction method predicts an unknown haplotype with the error rate below 2% based on 10% of the population.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Banerjee, Poulabi; Bahlo, Melanie; Schwartz, Jody R.
2002-01-01
Genome wide disease association analysis using SNPs is being explored as a method for dissecting complex genetic traits and a vast number of SNPs have been generated for this purpose. As there are cost and throughput limitations of genotyping large numbers of SNPs and statistical issues regarding the large number of dependent tests on the same data set, to make association analysis practical it has been proposed that SNPs should be prioritized based on likely functional importance. The most easily identifiable functional SNPs are coding SNPs (cSNPs) and accordingly cSNPs have been screened in a number of studies. SNPs in gene regulatory sequences embedded in noncoding DNA are another class of SNPs suggested for prioritization due to their predicted quantitative impact on gene expression. The main challenge in evaluating these SNPs, in contrast to cSNPs is a lack of robust algorithms and databases for recognizing regulatory sequences in noncoding DNA. Approaches that have been previously used to delineate noncoding sequences with gene regulatory activity include cross-species sequence comparisons and the search for sequences recognized by transcription factors. We combined these two methods to sift through mouse human genomic sequences to identify putative gene regulatory elements and subsequently localized SNPs within these sequences in a 1 Megabase (Mb) region of human chromosome 5q31, orthologous to mouse chromosome 11 containing the Interleukin cluster.
Genetic Variation in the Acorn Barnacle from Allozymes to Population Genomics
Flight, Patrick A.; Rand, David M.
2012-01-01
Understanding the patterns of genetic variation within and among populations is a central problem in population and evolutionary genetics. We examine this question in the acorn barnacle, Semibalanus balanoides, in which the allozyme loci Mpi and Gpi have been implicated in balancing selection due to varying selective pressures at different spatial scales. We review the patterns of genetic variation at the Mpi locus, compare this to levels of population differentiation at mtDNA and microsatellites, and place these data in the context of genome-wide variation from high-throughput sequencing of population samples spanning the North Atlantic. Despite considerable geographic variation in the patterns of selection at the Mpi allozyme, this locus shows rather low levels of population differentiation at ecological and trans-oceanic scales (FST ∼ 5%). Pooled population sequencing was performed on samples from Rhode Island (RI), Maine (ME), and Southwold, England (UK). Analysis of more than 650 million reads identified approximately 335,000 high-quality SNPs in 19 million base pairs of the S. balanoides genome. Much variation is shared across the Atlantic, but there are significant examples of strong population differentiation among samples from RI, ME, and UK. An FST outlier screen of more than 22,000 contigs provided a genome-wide context for interpretation of earlier studies on allozymes, mtDNA, and microsatellites. FST values for allozymes, mtDNA and microsatellites are close to the genome-wide average for random SNPs, with the exception of the trans-Atlantic FST for mtDNA. The majority of FST outliers were unique between individual pairs of populations, but some genes show shared patterns of excess differentiation. These data indicate that gene flow is high, that selection is strong on a subset of genes, and that a variety of genes are experiencing diversifying selection at large spatial scales. This survey of polymorphism in S. balanoides provides a number of genomic tools that promise to make this a powerful model for ecological genomics of the rocky intertidal. PMID:22767487
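To make the FST figures quoted above concrete, the following is a minimal sketch of a per-SNP FST for a biallelic site, computed from population allele frequencies of the kind pooled sequencing yields. The abstract does not state which estimator the authors used; this sketch assumes the textbook heterozygosity-based definition, FST = (H_T − H_S) / H_T, with equal weight per population and no sample-size correction. The function name `fst_biallelic` is hypothetical.

```python
def fst_biallelic(freqs):
    """Heterozygosity-based FST for one biallelic SNP.

    freqs: allele frequency of the same allele in each population
           (equal weighting per population is an assumption here).
    """
    if len(freqs) < 2:
        raise ValueError("need allele frequencies from at least two populations")
    if any(p < 0.0 or p > 1.0 for p in freqs):
        raise ValueError("allele frequencies must lie in [0, 1]")
    n = len(freqs)
    p_bar = sum(freqs) / n
    # Expected heterozygosity of the pooled (total) population.
    h_total = 2.0 * p_bar * (1.0 - p_bar)
    if h_total == 0.0:
        return 0.0  # monomorphic site: no differentiation to measure
    # Mean expected heterozygosity within populations.
    h_within = sum(2.0 * p * (1.0 - p) for p in freqs) / n
    return (h_total - h_within) / h_total

if __name__ == "__main__":
    # Hypothetical frequencies for three sampled populations.
    print(round(fst_biallelic([0.40, 0.60, 0.50]), 4))
```

An outlier screen of the kind described would compute such a value per SNP or per contig and flag loci far above the genome-wide average.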
Abdulkadir, Mohamed; Londono, Douglas; Gordon, Derek; Fernandez, Thomas V; Brown, Lawrence W; Cheon, Keun-Ah; Coffey, Barbara J; Elzerman, Lonneke; Fremer, Carolin; Fründt, Odette; Garcia-Delgar, Blanca; Gilbert, Donald L; Grice, Dorothy E; Hedderly, Tammy; Heyman, Isobel; Hong, Hyun Ju; Huyser, Chaim; Ibanez-Gomez, Laura; Jakubovski, Ewgeni; Kim, Young Key; Kim, Young Shin; Koh, Yun-Joo; Kook, Sodahm; Kuperman, Samuel; Leventhal, Bennett; Ludolph, Andrea G; Madruga-Garrido, Marcos; Maras, Athanasios; Mir, Pablo; Morer, Astrid; Müller-Vahl, Kirsten; Münchau, Alexander; Murphy, Tara L; Plessen, Kerstin J; Roessner, Veit; Shin, Eun-Young; Song, Dong-Ho; Song, Jungeun; Tübing, Jennifer; van den Ban, Els; Visscher, Frank; Wanderer, Sina; Woods, Martin; Zinner, Samuel H; King, Robert A; Tischfield, Jay A; Heiman, Gary A; Hoekstra, Pieter J; Dietrich, Andrea
2018-04-01
Genetic studies in Tourette syndrome (TS) are characterized by scattered and poorly replicated findings. We aimed to replicate findings from candidate gene and genome-wide association studies (GWAS). Our cohort included 465 probands with chronic tic disorder (93% TS) and both parents from 412 families (some probands were siblings). We assessed 75 single nucleotide polymorphisms (SNPs) in 465 parent-child trios; 117 additional SNPs in 211 trios; and 4 additional SNPs in 254 trios. We performed SNP and gene-based transmission disequilibrium tests and compared nominally significant SNP results with those from a large independent case-control cohort. After quality control 71 SNPs were available in 371 trios; 112 SNPs in 179 trios; and 3 SNPs in 192 trios. 17 were candidate SNPs implicated in TS and 2 were implicated in obsessive-compulsive disorder (OCD) or autism spectrum disorder (ASD); 142 were tagging SNPs from eight monoamine neurotransmitter-related genes (including dopamine and serotonin); 10 were top SNPs from TS GWAS; and 13 top SNPs from attention-deficit/hyperactivity disorder, OCD, or ASD GWAS. None of the SNPs or genes reached significance after adjustment for multiple testing. We observed nominal significance for the candidate SNPs rs3744161 (TBCD) and rs4565946 (TPH2) and for five tagging SNPs; none of these showed significance in the independent cohort. Also, SLC1A1 in our gene-based analysis and two TS GWAS SNPs showed nominal significance, rs11603305 (intergenic) and rs621942 (PICALM). We found no convincing support for previously implicated genetic polymorphisms. Targeted re-sequencing should fully appreciate the relevance of candidate genes.
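The SNP-based transmission disequilibrium test used in the trio analysis above can be illustrated with a short sketch. It assumes the classical McNemar-type statistic, (b − c)² / (b + c) on 1 degree of freedom, where b and c count how often heterozygous parents transmit or fail to transmit a given allele to the affected child; the abstract does not specify the exact software or variant of the test, and the function name `tdt` and the counts are hypothetical.

```python
import math

def tdt(transmitted, untransmitted):
    """Classical transmission disequilibrium test for one allele.

    transmitted:   times heterozygous parents passed the allele to the proband
    untransmitted: times heterozygous parents did not pass it on
    Returns (chi_square, p_value) with 1 degree of freedom.
    """
    n = transmitted + untransmitted
    if n == 0:
        raise ValueError("no informative (heterozygous) parental transmissions")
    chi2 = (transmitted - untransmitted) ** 2 / n
    # Survival function of a 1-df chi-square: P(X > x) = erfc(sqrt(x / 2)).
    p = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p

if __name__ == "__main__":
    chi2, p = tdt(60, 40)  # hypothetical counts
    print(f"chi2={chi2:.2f} p={p:.4f}")
```

A nominal p-value from such a test would still need adjustment for the number of SNPs examined, which is the multiple-testing step the abstract refers to.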
2009-01-01
Background Population structure and admixture have strong confounding effects on genetic association studies. Discordant frequencies for age-related macular degeneration (AMD) risk alleles and for AMD incidence and prevalence rates are reported across different ethnic groups. We examined the genomic ancestry characterizing 538 Latinos drawn from the Los Angeles Latino Eye Study [LALES] as part of an ongoing AMD-association study. To help assess the degree of Native American ancestry inherited by Latino populations we sampled 25 Mayans and 5 Mexican Indians collected through Coriell's Institute. Levels of European, Asian, and African descent in Latinos were inferred through the USC Multiethnic Panel (USC MEP), formed from a sample from the Multiethnic Cohort (MEC) study, the Yoruba African samples from HapMap II, the Singapore Chinese Health Study, and a prospective cohort from Shanghai, China. A total of 233 ancestry informative markers were genotyped for 538 LALES Latinos, 30 Native Americans, and 355 USC MEP individuals (African Americans, Japanese, Chinese, European Americans, Latinos, and Native Hawaiians). Sensitivity of ancestry estimates to relative sample size was considered. Results We detected strong evidence for recent population admixture in LALES Latinos. Gradients of increasing Native American background and of correspondingly decreasing European ancestry were observed as a function of birth origin from North to South. The strongest excess of homozygosity, a reflection of recent population admixture, was observed in non-US born Latinos that recently populated the US. A set of 42 SNPs especially informative for distinguishing between Native Americans and Europeans were identified. 
Conclusion These findings reflect the historic migration patterns of Native Americans and suggest that while the 'Latino' label is used to categorize the entire population, there exists a strong degree of heterogeneity within that population, and that it will be important to assess this heterogeneity within future association studies on Latino populations. Our study raises awareness of the diversity within "Latinos" and the necessity to assess appropriate risk and treatment management. PMID:19903357
Shtir, Corina J; Marjoram, Paul; Azen, Stanley; Conti, David V; Le Marchand, Loic; Haiman, Christopher A; Varma, Rohit
2009-11-10
Population structure and admixture have strong confounding effects on genetic association studies. Discordant frequencies for age-related macular degeneration (AMD) risk alleles and for AMD incidence and prevalence rates are reported across different ethnic groups. We examined the genomic ancestry characterizing 538 Latinos drawn from the Los Angeles Latino Eye Study [LALES] as part of an ongoing AMD-association study. To help assess the degree of Native American ancestry inherited by Latino populations we sampled 25 Mayans and 5 Mexican Indians collected through Coriell's Institute. Levels of European, Asian, and African descent in Latinos were inferred through the USC Multiethnic Panel (USC MEP), formed from a sample from the Multiethnic Cohort (MEC) study, the Yoruba African samples from HapMap II, the Singapore Chinese Health Study, and a prospective cohort from Shanghai, China. A total of 233 ancestry informative markers were genotyped for 538 LALES Latinos, 30 Native Americans, and 355 USC MEP individuals (African Americans, Japanese, Chinese, European Americans, Latinos, and Native Hawaiians). Sensitivity of ancestry estimates to relative sample size was considered. We detected strong evidence for recent population admixture in LALES Latinos. Gradients of increasing Native American background and of correspondingly decreasing European ancestry were observed as a function of birth origin from North to South. The strongest excess of homozygosity, a reflection of recent population admixture, was observed in non-US born Latinos that recently populated the US. A set of 42 SNPs especially informative for distinguishing between Native Americans and Europeans were identified. 
These findings reflect the historic migration patterns of Native Americans and suggest that while the 'Latino' label is used to categorize the entire population, there exists a strong degree of heterogeneity within that population, and that it will be important to assess this heterogeneity within future association studies on Latino populations. Our study raises awareness of the diversity within "Latinos" and the necessity to assess appropriate risk and treatment management.
Jones, Nathan R; Spratt, Thomas E; Berg, Arthur S; Muscat, Joshua E; Lazarus, Philip; Gallagher, Carla J
2011-04-01
The formation of bulky DNA adducts caused by diol epoxide derivatives of polycyclic aromatic hydrocarbons has been associated with tobacco-induced cancers, and inefficient repair of such adducts by the nucleotide excision repair (NER) system has been linked to increased risk of tobacco-induced lung and head and neck (H&N) cancers. The human excision repair cross-complementation group 1 (ERCC1) protein is essential for a functional NER system and genetic variation in ERCC1 may contribute to impaired DNA repair capacity and increased lung and H&N cancer risk. In order to comprehensively capture common genetic variation in the ERCC1 gene, Caucasian data from the International HapMap project was used to assess linkage disequilibrium and choose four tagSNPs (rs1319052, rs3212955, rs3212948, and rs735482) in the ERCC1 gene to genotype 452 lung cancer cases, 175 H&N cancer cases, and 790 healthy controls. Haplotypes were estimated using expectation maximization (EM) algorithm, and haplotype association with cancer was investigated using Haplo.stats software adjusting for known covariates. The genotype and haplotype frequencies matched previous estimates from Caucasians. There was no significant difference in the prevalence of rs1319052, rs3212955, rs3212948, and rs735482 when comparing lung or H&N cancer cases with controls (p-values>0.05). Similarly, there was no association between ERCC1 haplotypes and lung or H&N cancer susceptibility in this Caucasian population (p-values>0.05). No associations were found when stratifying lung cancer cases by histology, sex, smoking status, or smoking intensity. This study suggests that ERCC1 polymorphisms and haplotypes do not play a role in lung and H&N cancer susceptibility in Caucasians. Copyright © 2010 Elsevier Ltd. All rights reserved.
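Tag SNP selection of the kind described above rests on pairwise linkage disequilibrium. As a rough sketch only, the following computes the r² statistic between two biallelic SNPs, assuming phased haplotypes coded 0/1 are already available. The study estimated haplotypes with an EM algorithm and does not state its r² threshold, so the input format and the function name here are assumptions.

```python
def ld_r2(haplotypes):
    """Compute r^2 between two biallelic SNPs.

    haplotypes: list of (allele_at_snp1, allele_at_snp2) pairs, each coded 0 or 1.
    This phased input is an assumption made for illustration.
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("no haplotypes supplied")
    p_a = sum(h[0] for h in haplotypes) / n            # frequency of allele 1 at SNP 1
    p_b = sum(h[1] for h in haplotypes) / n            # frequency of allele 1 at SNP 2
    p_ab = sum(1 for h in haplotypes if h == (1, 1)) / n  # frequency of the 1-1 haplotype
    d = p_ab - p_a * p_b                               # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("at least one SNP is monomorphic")
    return d * d / denom


if __name__ == "__main__":
    sample = [(0, 0), (1, 1), (0, 0), (1, 1), (1, 0), (0, 0)]
    print(round(ld_r2(sample), 3))
```

Two SNPs with r² close to 1 carry nearly the same information, so genotyping one of them as a tag is usually enough to capture both.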
Association of ORAI1 Haplotypes with the Risk of HLA-B27 Positive Ankylosing Spondylitis
Wei, James Cheng-Chung; Yen, Jeng-Hsien; Juo, Suh-Hang Hank; Chen, Wei-Chiao; Wang, Yu-Shiuan; Chiu, Yi-Ching; Hsieh, Tusty-Jiuan; Guo, Yuh-Cherng; Huang, Chun-Huang; Wong, Ruey-Hong; Wang, Hui-Po; Tsai, Ke-Li; Wu, Yang-Chang; Chang, Hsueh-Wei; Hsi, Edward; Chang, Wei-Pin; Chang, Wei-Chiao
2011-01-01
Ankylosing spondylitis (AS) is a chronic inflammation of the sacroiliac joints, spine and peripheral joints. The aetiology of ankylosing spondylitis is still unclear. Previous studies have indicated that genetic factors such as human leukocyte antigen HLA-B27 are associated with AS susceptibility. We carried out a case-control study to determine whether genetic polymorphisms of the ORAI1 gene, a major component of store-operated calcium channels that is involved in the regulation of the immune system, are a susceptibility factor for AS in a Taiwanese population. We enrolled 361 AS patients who fulfilled the modified New York criteria and 379 controls from the community. Five tagging single nucleotide polymorphisms (tSNPs) at ORAI1 were selected from the data of the Han Chinese population in the HapMap project. Clinical status of AS was assessed by the Bath Ankylosing Spondylitis Disease Activity Index (BASDAI), Bath Ankylosing Spondylitis Functional Index (BASFI), and Bath Ankylosing Spondylitis Global Index (BAS-G). Our results indicated that subjects carrying the minor allele homozygote (CC) of the promoter SNP rs12313273 or the TT homozygote of the SNP rs7135617 had an increased risk of HLA-B27 positive AS. The minor allele C of the 3′UTR SNP rs712853 exerted a protective effect against HLA-B27 positive AS. Furthermore, the rs12313273/rs7135617 pairwise allele analysis found that the C-G (OR 1.69, 95% CI 1.27, 2.25; p = 0.0003) and T-T (OR 1.75, 95% CI 1.36, 2.27; p<0.0001) haplotypes were significantly associated with the risk of HLA-B27-positive AS in comparison with T-G carriers. This is the first study to indicate that haplotypes of ORAI1 (rs12313273 and rs7135617) are associated with the risk of HLA-B27 positive AS. PMID:21674042
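The odds ratios and 95% confidence intervals quoted in this abstract follow a standard calculation, sketched below. The sketch assumes an unadjusted 2x2 table and a Wald interval on the log scale. The abstract does not report the underlying counts or whether covariates were adjusted for, so the counts in the example are hypothetical and are not meant to reproduce the published values.

```python
import math


def odds_ratio_wald_ci(a, b, c, d, z=1.96):
    """Return (odds ratio, lower bound, upper bound) for a 2x2 table.

    a: exposed cases      b: unexposed cases
    c: exposed controls   d: unexposed controls
    z=1.96 gives an approximate 95% Wald interval on the log-odds scale.
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for a Wald interval")
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    log_or = math.log(odds_ratio)
    lower = math.exp(log_or - z * se_log_or)
    upper = math.exp(log_or + z * se_log_or)
    return odds_ratio, lower, upper


if __name__ == "__main__":
    # Hypothetical haplotype carrier counts, for illustration only.
    or_, lo, hi = odds_ratio_wald_ci(20, 10, 10, 20)
    print(f"OR = {or_:.2f}, 95% CI {lo:.2f} to {hi:.2f}")
```

An interval that excludes 1, as both reported intervals do, corresponds to a nominally significant association at the 5% level.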
Mapping Genetic Variants Associated with Beta-Adrenergic Responses in Inbred Mice
Hersch, Micha; Peter, Bastian; Kang, Hyun Min; Schüpfer, Fanny; Abriel, Hugues; Pedrazzini, Thierry; Eskin, Eleazar; Beckmann, Jacques S.
2012-01-01
β-blockers and β-agonists are primarily used to treat cardiovascular diseases. Inter-individual variability in response to both drug classes is well recognized, yet the identity and relative contribution of the genetic players involved are poorly understood. This work is the first genome-wide association study (GWAS) addressing the values and susceptibility of cardiovascular-related traits to a selective β1-blocker, Atenolol (ate), and a β-agonist, Isoproterenol (iso). The phenotypic dataset consisted of 27 highly heritable traits, each measured across 22 inbred mouse strains and four pharmacological conditions. The genotypic panel comprised 79922 informative SNPs of the mouse HapMap resource. Associations were mapped by Efficient Mixed Model Association (EMMA), a method that corrects for the population structure and genetic relatedness of the various strains. A total of 205 separate genome-wide scans were analyzed. The most significant hits include three candidate loci related to cardiac and body weight, three loci for electrocardiographic (ECG) values, two loci for the susceptibility of atrial weight index to iso, four loci for the susceptibility of systolic blood pressure (SBP) to perturbations of the β-adrenergic system, and one locus for the responsiveness of QTc (p<10⁻⁸). An additional 60 loci were suggestive for one or the other of the 27 traits, while 46 others were suggestive for one or the other drug effects (p<10⁻⁶). Most hits tagged unexpected regions, yet at least two loci for the susceptibility of SBP to β-adrenergic drugs pointed at members of the hypothalamic-pituitary-thyroid axis. Loci for cardiac-related traits were preferentially enriched in genes expressed in the heart, while 23% of the testable loci were replicated with datasets of the Mouse Phenome Database (MPD). Altogether these data and validation tests indicate that the mapped loci are relevant to the traits and responses studied. PMID:22859963
Biobank/Genomic Research in Nigeria: Examining Relevant Privacy and Confidentiality Frameworks.
Nnamuchi, Obiajulu
2015-01-01
Nigeria's commitment to genomic research and biobanking is beyond dispute. Proof, if there is need for one, is that the country is one of only six nations (others are Canada, China, Japan, the United Kingdom, and the United States) involved in the International HapMap Project. The HapMap Project is an innovative enterprise aimed at developing a haplotype map of the human genome, a tool that is helpful to studying the genetic basis of disease as well as the genetic or hereditary factors that contribute to variation in response to environmental factors, in susceptibility to infection, and in the effectiveness of, and adverse responses to, drugs and vaccines. In addition, the country is home to H3Africa biobank (with 45,358 human samples in storage), affiliated with the Institute of Human Virology of Nigeria (IHVN), and several others. Benefits accruing from genomic research and biobanking are enormous; so also is protection of research subjects. The protection envisaged centers primarily on, inter alia, securing informed consent, safeguarding privacy and maintaining confidentiality of health information - all of which are enshrined in ethicolegal regimes in Nigeria. But whether these frameworks are consistent with international best practices is not at all clear, hence the need for this paper. © 2015 American Society of Law, Medicine & Ethics, Inc.
Zhang, Ruijie; Lv, Wenhua; Luan, Meiwei; Zheng, Jiajia; Shi, Miao; Zhu, Hongjie; Li, Jin; Lv, Hongchao; Zhang, Mingming; Shang, Zhenwei; Duan, Lian; Jiang, Yongshuai
2015-11-24
Different human genes often exhibit different degrees of stability in their DNA methylation levels between tissues, samples or cell types. This may be related to the evolution of human genome. Thus, we compared the evolutionary conservation between two types of genes: genes with stable DNA methylation levels (SM genes) and genes with fluctuant DNA methylation levels (FM genes). For long-term evolutionary characteristics between species, we compared the percentage of the orthologous genes, evolutionary rate dn/ds and protein sequence identity. We found that the SM genes had greater percentages of the orthologous genes, lower dn/ds, and higher protein sequence identities in all the 21 species. These results indicated that the SM genes were more evolutionarily conserved than the FM genes. For short-term evolutionary characteristics among human populations, we compared the single nucleotide polymorphism (SNP) density, and the linkage disequilibrium (LD) degree in HapMap populations and 1000 genomes project populations. We observed that the SM genes had lower SNP densities, and higher degrees of LD in all the 11 HapMap populations and 13 1000 genomes project populations. These results mean that the SM genes had more stable chromosome genetic structures, and were more conserved than the FM genes.
Screening large-scale association study data: exploiting interactions using random forests.
Lunetta, Kathryn L; Hayward, L Brooke; Segal, Jonathan; Van Eerdewegh, Paul
2004-12-10
Genome-wide association studies for complex diseases will produce genotypes on hundreds of thousands of single nucleotide polymorphisms (SNPs). A logical first approach to dealing with massive numbers of SNPs is to use some test to screen the SNPs, retaining only those that meet some criterion for further study. For example, SNPs can be ranked by p-value, and those with the lowest p-values retained. When SNPs have large interaction effects but small marginal effects in a population, they are unlikely to be retained when univariate tests are used for screening. However, model-based screens that pre-specify interactions are impractical for data sets with thousands of SNPs. Random forest analysis is an alternative method that produces a single measure of importance for each predictor variable that takes into account interactions among variables without requiring model specification. Interactions increase the importance for the individual interacting variables, making them more likely to be given high importance relative to other variables. We test the performance of random forests as a screening procedure to identify small numbers of risk-associated SNPs from among large numbers of unassociated SNPs using complex disease models with up to 32 loci, incorporating both genetic heterogeneity and multi-locus interaction. Keeping other factors constant, if risk SNPs interact, the random forest importance measure significantly outperforms the Fisher Exact test as a screening tool. As the number of interacting SNPs increases, the improvement in performance of random forest analysis relative to Fisher Exact test for screening also increases. Random forests perform similarly to the univariate Fisher Exact test as a screening tool when SNPs in the analysis do not interact. 
In the context of large-scale genetic association studies where unknown interactions exist among true risk-associated SNPs or SNPs and environmental covariates, screening SNPs using random forest analyses can significantly reduce the number of SNPs that need to be retained for further study compared to standard univariate screening methods.
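The univariate baseline that this abstract compares against, ranking SNPs by Fisher Exact test p-value and retaining the lowest, can be sketched as follows. This is an illustration under assumptions: each SNP is reduced to a 2x2 case/control by allele table, the two-sided p-value sums all tables no more probable than the observed one, and the SNP names and counts are hypothetical. The random forest importance measure itself is not reproduced here.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, col1, col2, n = a + b, a + c, b + d, a + b + c + d

    def prob(x):
        # Hypergeometric probability that the top-left cell equals x.
        return comb(col1, x) * comb(col2, row1 - x) / comb(n, row1)

    p_observed = prob(a)
    lowest, highest = max(0, row1 - col2), min(row1, col1)
    # Sum every table at least as extreme (no more probable) than the observed one.
    total = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )
    return min(1.0, total)


def screen_snps(tables, keep):
    """Return the `keep` SNP names with the smallest p-values.

    tables: dict mapping a SNP name to its (a, b, c, d) counts (hypothetical format).
    """
    ranked = sorted(tables, key=lambda name: fisher_exact_two_sided(*tables[name]))
    return ranked[:keep]


if __name__ == "__main__":
    example = {"snpA": (8, 2, 1, 5), "snpB": (5, 5, 3, 3), "snpC": (7, 3, 2, 4)}
    print(screen_snps(example, keep=2))
```

Because each SNP is scored in isolation, a screen like this cannot reward SNPs whose effect appears only in combination, which is the gap the abstract reports random forests filling.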
Al-Tobasei, Rafet; Ali, Ali; Leeds, Timothy D; Liu, Sixin; Palti, Yniv; Kenney, Brett; Salem, Mohamed
2017-08-07
Coding/functional SNPs change the biological function of a gene and, therefore, could serve as "large-effect" genetic markers. In this study, we used two bioinformatics pipelines, GATK and SAMtools, for discovering coding/functional SNPs with allelic-imbalances associated with total body weight, muscle yield, muscle fat content, shear force, and whiteness. Phenotypic data were collected for approximately 500 fish, representing 98 families (5 fish/family), from a growth-selected line, and the muscle transcriptome was sequenced from 22 families with divergent phenotypes (4 low- versus 4 high-ranked families per trait). GATK detected 59,112 putative SNPs; of these SNPs, 4798 showed allelic imbalances (>2.0 as an amplification and <0.5 as loss of heterozygosity). SAMtools detected 87,066 putative SNPs; and of them, 4962 had allelic imbalances between the low- and high-ranked families. Only 1829 SNPs with allelic imbalances were common between the two datasets, indicating significant differences in algorithms. The two datasets contained 7930 non-redundant SNPs of which 4439 mapped to 1498 protein-coding genes (with 6.4% non-synonymous SNPs) and 684 mapped to 295 lncRNAs. Validation of a subset of 92 SNPs revealed 1) 86.7-93.8% success rate in calling polymorphic SNPs and 2) 95.4% consistent matching between DNA and cDNA genotypes indicating a high rate of identifying SNPs with allelic imbalances. In addition, 4.64% SNPs revealed random monoallelic expression. Genome distribution of the SNPs with allelic imbalances exhibited high density for all five traits in several chromosomes, especially chromosome 9, 20 and 28. Most of the SNP-harboring genes were assigned to important growth-related metabolic pathways. These results demonstrate utility of RNA-Seq in assessing phenotype-associated allelic imbalances in pooled RNA-Seq samples. 
The SNPs identified in this study were included in a new SNP-Chip design (available from Affymetrix) for genomic and genetic analyses in rainbow trout.
Yates, Christopher M; Sternberg, Michael J E
2013-11-01
Non-synonymous single nucleotide polymorphisms (nsSNPs) are single base changes leading to a change to the amino acid sequence of the encoded protein. Many of these variants are associated with disease, so nsSNPs have been well studied, with studies looking at the effects of nsSNPs on individual proteins, for example, on stability and enzyme active sites. In recent years, the impact of nsSNPs upon protein-protein interactions has also been investigated, giving a greater insight into the mechanisms by which nsSNPs can lead to disease. In this review, we summarize these studies, looking at the various mechanisms by which nsSNPs can affect protein-protein interactions. We focus on structural changes that can impair interaction, changes to disorder, gain of interaction, and post-translational modifications before looking at some examples of nsSNPs at human-pathogen protein-protein interfaces and the analysis of nsSNPs from a network perspective. © 2013.
Bensen, Jeannette T; Xu, Zongli; Smith, Gary J; Mohler, James L; Fontham, Elizabeth T H; Taylor, Jack A
2013-01-01
Genome-wide association studies have established a number of replicated single nucleotide polymorphisms (SNPs) for susceptibility to prostate cancer (CaP), but it is unclear whether these susceptibility SNPs are also associated with disease aggressiveness. This study evaluates whether such replication SNPs or other candidate SNPs are associated with CaP aggressiveness in African-American (AA) and European-American (EA) men. A 1,536 SNP panel which included 34 genome-wide association study (GWAS) replication SNPs, 38 flanking SNPs, a set of ancestry informative markers, and SNPs in candidate genes and other areas was genotyped in 1,060 AA and 1,087 EA men with incident CaP from the North Carolina-Louisiana Prostate Cancer Project (PCaP). Tests for association were conducted using ordinal logistic regression with a log-additive genotype model and a 3-category CaP aggressiveness variable. Four GWAS replication SNPs (rs2660753, rs13254738, rs10090154, rs2735839) and seven flanking SNPs were associated with CaP aggressiveness (P < 0.05) in three genomic regions: one at 3p12 (EA), seven at 8q24 (5 AA, 2 EA), and three at 19q13 at the kallikrein-related peptidase 3 (KLK3) locus (two AA, one AA and EA). The KLK3 SNPs also were associated with serum prostate-specific antigen (PSA) levels in AA (P < 0.001) but not in EA. A number of the other SNPs showed some evidence of association but none met study-wide significance levels after adjusting for multiple comparisons. Some replicated GWAS susceptibility SNPs may play a role in CaP aggressiveness. However, like susceptibility, these associations are not consistent between racial groups. Copyright © 2012 Wiley Periodicals, Inc.
Bensen, Jeannette T.; Xu, Zongli; Smith, Gary J.; Mohler, James L.; Fontham, Elizabeth T.H.; Taylor, Jack A.
2012-01-01
BACKGROUND Genome-wide association studies have established a number of replicated single nucleotide polymorphisms (SNPs) for susceptibility to prostate cancer (CaP), but it is unclear whether these susceptibility SNPs are also associated with disease aggressiveness. This study evaluates whether such replication SNPs or other candidate SNPs are associated with CaP aggressiveness in African-American (AA) and European-American (EA) men. METHODS A 1,536 SNP panel which included 34 genome-wide association study (GWAS) replication SNPs, 38 flanking SNPs, a set of ancestry informative markers, and SNPs in candidate genes and other areas was genotyped in 1,060 AA and 1,087 EA men with incident CaP from the North Carolina-Louisiana Prostate Cancer Project (PCaP). Tests for association were conducted using ordinal logistic regression with a log-additive genotype model and a 3-category CaP aggressiveness variable. RESULTS 4 GWAS replication SNPs (rs2660753, rs13254738, rs10090154, rs2735839) and 7 flanking SNPs were associated with CaP aggressiveness (P<0.05) in 3 genomic regions: one at 3p12 (EA), 7 at 8q24 (5 AA, 2 EA), and 3 at 19q13 at the kallikrein-related peptidase 3 (KLK3) locus (2 AA, 1 AA and EA). The KLK3 SNPs also were associated with serum prostate-specific antigen (PSA) levels in AA (p < 0.001) but not in EA. A number of the other SNPs showed some evidence of association but none met study-wide significance levels after adjusting for multiple comparisons. CONCLUSIONS Some replicated GWAS susceptibility SNPs may play a role in CaP aggressiveness. However, like susceptibility, these associations are not consistent between racial groups. PMID:22549899
Robinson, Nicholas; Baranski, Matthew; Mahapatra, Kanta Das; Saha, Jatindra Nath; Das, Sweta; Mishra, Jashobanta; Das, Paramananda; Kent, Matthew; Arnyasi, Mariann; Sahoo, Pramoda Kumar
2014-06-30
Production of carp dominates world aquaculture. More than 1.1 million tonnes of rohu carp, Labeo rohita (Hamilton), were produced in 2010. Aeromonas hydrophila is a bacterial pathogen causing aeromoniasis in rohu, and is a major problem for carp production worldwide. There is a need to better understand the genetic mechanisms affecting resistance to this disease, and to develop tools that can be used with selective breeding to improve resistance. Here we use a 6 K SNP array to genotype 21 full-sibling families of L. rohita that were experimentally challenged intra-peritoneally with a virulent strain of A. hydrophila to scan the genome for quantitative trait loci associated with disease resistance. In all, 3193 SNPs were found to be informative and were used to create a linkage map and to scan for QTL affecting resistance to A. hydrophila. The linkage map consisted of 25 linkage groups, corresponding to the number of haploid chromosomes in L. rohita. Male and female linkage maps were similar in terms of order, coverage (1384 and 1393 cM, respectively) and average interval distances (1.32 and 1.35 cM, respectively). Forty-one percent of the SNPs were annotated with gene identity using BLAST (cut off E-score of 0.001). Twenty-one SNPs mapping to ten linkage groups showed significant associations with the traits hours of survival and dead or alive (P <0.05 after Bonferroni correction). Of the SNPs showing significant or suggestive associations with the traits, several were homologous to genes of known immune function or were in close linkage to such genes. 
Genes of interest included heat shock proteins (70, 60, 105 and "small heat shock proteins"), mucin (5b precursor and 2), lectin (receptor and CD22), tributyltin-binding protein, major histocompatibility loci (I and II), complement protein component c7-1, perforin 1, ubiquitin (ligase, factor e4b isoform 2 and conjugation enzyme e2 c), proteasome subunit, T-cell antigen receptor and lymphocyte specific protein tyrosine kinase. A panel of markers has been identified that will be validated for use with both genomic and marker-assisted selection to improve resistance of L. rohita to A. hydrophila.
Hong, Yanbin; Pandey, Manish K; Liu, Ying; Chen, Xiaoping; Liu, Hong; Varshney, Rajeev K; Liang, Xuanqiang; Huang, Shangzhi
2015-01-01
The cultivated peanut (Arachis hypogaea L.) is an allotetraploid (AABB) species derived from the A-genome (Arachis duranensis) and B-genome (Arachis ipaensis) progenitors. Presence of two versions of a DNA sequence based on the two progenitor genomes poses a serious technical and analytical problem during single nucleotide polymorphism (SNP) marker identification and analysis. In this context, we have analyzed 200 amplicons derived from expressed sequence tags (ESTs) and genome survey sequences (GSS) to identify SNPs in a panel of genotypes consisting of 12 cultivated peanut varieties and two diploid progenitors representing the ancestral genomes. A total of 18 EST-SNPs and 44 genomic-SNPs were identified in 12 peanut varieties by aligning the sequence of A. hypogaea with diploid progenitors. The average frequency of sequence polymorphism was higher for genomic-SNPs than the EST-SNPs with one genomic-SNP every 1011 bp as compared to one EST-SNP every 2557 bp. In order to estimate the potential and further applicability of these identified SNPs, 96 peanut varieties were genotyped using high resolution melting (HRM) method. Polymorphism information content (PIC) values for EST-SNPs ranged between 0.021 and 0.413 with a mean of 0.172 in the set of peanut varieties, while genomic-SNPs ranged between 0.080 and 0.478 with a mean of 0.249. Total 33 SNPs were used for polymorphism detection among the parents and 10 selected lines from mapping population Y13Zh (Zhenzhuhei × Yueyou13). Of the total 33 SNPs, nine SNPs showed polymorphism in the mapping population Y13Zh, and seven SNPs were successfully mapped into five linkage groups. Our results showed that SNPs can be identified in allotetraploid peanut with high accuracy through amplicon sequencing and HRM assay. The identified SNPs were very informative and can be used for different genetic and breeding applications in peanut.
2013-01-01
Background Identification of single nucleotide polymorphisms (SNPs) for specific genes involved in reproduction might improve reliability of genomic estimates for these low-heritability traits. Semen from 550 Holstein bulls of high (≥ 1.7; n = 288) or low (≤ −2; n = 262) daughter pregnancy rate (DPR) was genotyped for 434 candidate SNPs using the Sequenom MassARRAY® system. Three types of SNPs were evaluated: SNPs previously reported to be associated with reproductive traits or physically close to genetic markers for reproduction, SNPs in genes that are well known to be involved in reproductive processes, and SNPs in genes that are differentially expressed between physiological conditions in a variety of tissues associated in reproductive function. Eleven reproduction and production traits were analyzed. Results A total of 40 SNPs were associated (P < 0.05) with DPR. Among these were genes involved in the endocrine system, cell signaling, immune function and inhibition of apoptosis. A total of 10 genes were regulated by estradiol. In addition, 22 SNPs were associated with heifer conception rate, 33 with cow conception rate, 36 with productive life, 34 with net merit, 23 with milk yield, 19 with fat yield, 13 with fat percent, 19 with protein yield, 22 with protein percent, and 13 with somatic cell score. The allele substitution effect for SNPs associated with heifer conception rate, cow conception rate, productive life and net merit were in the same direction as for DPR. Allele substitution effects for several SNPs associated with production traits were in the opposite direction as DPR. Nonetheless, there were 29 SNPs associated with DPR that were not negatively associated with production traits. Conclusion SNPs in a total of 40 genes associated with DPR were identified as well as SNPs for other traits. It might be feasible to include these SNPs into genomic tests of reproduction and other traits. 
The genes associated with DPR are likely to be important for understanding the physiology of reproduction. Given the large number of SNPs associated with DPR that were not negatively associated with production traits, it should be possible to select for DPR without compromising production. PMID:23759029
Abe, Makiko; Ito, Hidemi; Oze, Isao; Nomura, Masatoshi; Ogawa, Yoshihiro; Matsuo, Keitaro
2017-12-01
Little is known about differences in genetic predisposition to colorectal cancer (CRC) between ethnicities, although many genetic traits common to colorectal cancer have been identified. This study investigated whether adding SNPs identified in GWAS of East Asian populations could improve risk prediction in Japanese individuals, and explored the possible application of genetic risk groups as an instrument of risk communication. The derivation study included 558 patients with histologically verified colorectal cancer and 1116 first-visit outpatients; the replication study included 547 cases and 547 controls. In each population, we evaluated a prediction model for CRC risk that used a genetic risk group based on SNPs from GWAS in European populations, and a similarly developed model that added SNPs from GWAS in East Asian populations. We examined whether adding East Asian-specific SNPs would improve discrimination. Six SNPs (rs6983267, rs4779584, rs4444235, rs9929218, rs10936599, rs16969681) of 23 SNPs from European-based GWAS and five SNPs (rs704017, rs11196172, rs10774214, rs647161, rs2423279) of ten SNPs from Asian-based GWAS were selected for the CRC risk prediction model. Compared with the 6-SNP model, an 11-SNP model including the Asian GWAS SNPs showed improved discrimination in receiver operating characteristic analysis. The 11-SNP model gave a statistically significant improvement over the 6-SNP model in both the derivation (P = 0.0039) and replication (P = 0.0018) studies. We estimated the cumulative risk of CRC using genetic risk groups based on the 11 SNPs and found that the cumulative risk at age 80 was approximately 13% in the high-risk group and 6% in the low-risk group. We constructed a more efficient CRC risk prediction model with 11 SNPs, including newly identified East Asian-based GWAS SNPs (rs704017, rs11196172, rs10774214, rs647161, rs2423279). Risk grouping based on the 11 SNPs revealed differences in lifetime CRC risk. 
This might be useful for effective individualized prevention in East Asians.
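The comparison of discrimination between a 6-SNP and an 11-SNP model can be illustrated with a small sketch. It assumes an unweighted risk score (a simple count of risk alleles per person) and computes the area under the ROC curve as the probability that a randomly chosen case scores higher than a randomly chosen control. The abstract does not describe how the risk groups were weighted or how the curves were compared statistically, so the scoring scheme, function names, and genotypes below are all hypothetical.

```python
def risk_score(genotypes):
    """Unweighted genetic risk score: total risk alleles (0, 1 or 2 per SNP)."""
    return sum(genotypes)


def auc(case_scores, control_scores):
    """Area under the ROC curve via pairwise comparison (ties count one half)."""
    if not case_scores or not control_scores:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


if __name__ == "__main__":
    # Hypothetical risk-allele counts at three SNPs per person.
    cases = [risk_score(g) for g in [(2, 1, 1), (1, 1, 2), (2, 2, 0)]]
    controls = [risk_score(g) for g in [(0, 1, 1), (1, 0, 0), (2, 1, 1)]]
    print(round(auc(cases, controls), 3))
```

Running the same calculation with and without the additional SNPs gives two AUC values whose difference reflects the added discrimination, although a formal test of that difference needs a method such as the one the authors applied.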
Replication of Caucasian Loci Associated with Osteoporosis-related Traits in East Asians
Kim, Beom-Jun; Ahn, Seong Hee; Kim, Hyeon-Mok; Ikegawa, Shiro; Yang, Tie-Lin; Guo, Yan; Deng, Hong-Wen; Koh, Jung-Min
2016-01-01
Background Most reported genome-wide association studies (GWAS) seeking to identify the loci of osteoporosis-related traits have involved Caucasian populations. We aimed to identify the single nucleotide polymorphisms (SNPs) of osteoporosis-related traits among East Asian populations from the bone mineral density (BMD)-related loci of an earlier GWAS meta-analysis. Methods A total of 95 SNPs, identified at the discovery stage of the largest GWAS meta-analysis of BMD, were tested to determine associations with osteoporosis-related traits (BMD, osteoporosis, or fracture) in Korean subjects (n=1,269). The identified SNPs of osteoporosis-related traits in Korean subjects were included in the replication analysis using Chinese (n=2,327) and Japanese (n=768) cohorts. Results A total of 17 SNPs were associated with low BMD in Korean subjects. Specifically, 9, 6, 9, and 5 SNPs were associated with the presence of osteoporosis, non-vertebral fractures, vertebral fractures, and any fracture, respectively. Collectively, 35 of the 95 SNPs (36.8%) were associated with one or more osteoporosis-related trait in Korean subjects. Of the 35 SNPs, 19 SNPs (54.3%) were also associated with one or more osteoporosis-related traits in East Asian populations. Twelve SNPs were associated with low BMD in the Chinese and Japanese cohorts. Specifically, 3, 4, and 2 SNPs were associated with the presence of hip fractures, vertebral fractures, and any fracture, respectively. Conclusions Our results identified the common SNPs of osteoporosis-related traits in both Caucasian and East Asian populations. These SNPs should be further investigated to assess whether they are true genetic markers of osteoporosis. PMID:27965945
Seo, Dong-Won; Oh, Jae-Don; Jin, Shil; Song, Ki-Duk; Park, Hee-Bok; Heo, Kang-Nyeong; Shin, Younhee; Jung, Myunghee; Park, Junhyung; Jo, Cheorun; Lee, Hak-Kyo; Lee, Jun-Heon
2015-02-01
There are five native chicken lines in Korea, which are mainly classified by plumage colors (black, white, red, yellow, gray). These five lines are very important genetic resources in the Korean poultry industry. Based on a next generation sequencing technology, whole genome sequence and reference assemblies were performed using Gallus_gallus_4.0 (NCBI) with whole genome sequences from these lines to identify common and novel single nucleotide polymorphisms (SNPs). We obtained 36,660,731,136 ± 1,257,159,120 bp of raw sequence and average 26.6-fold of 25-29 billion reference assembly sequences representing 97.288 % coverage. Also, 4,006,068 ± 97,534 SNPs were observed from 29 autosomes and the Z chromosome and, of these, 752,309 SNPs are the common SNPs across lines. Among the identified SNPs, the number of novel- and known-location assigned SNPs was 1,047,951 ± 14,956 and 2,948,648 ± 81,414, respectively. The number of unassigned known SNPs was 1,181 ± 150 and unassigned novel SNPs was 8,238 ± 1,019. Synonymous SNPs, non-synonymous SNPs, and SNPs having character changes were 26,266 ± 1,456, 11,467 ± 604, 8,180 ± 458, respectively. Overall, 443,048 ± 26,389 SNPs in each bird were identified by comparing with dbSNP in NCBI. The presently obtained genome sequence and SNP information in Korean native chickens have wide applications for further genome studies such as genetic diversity studies to detect causative mutations for economic and disease related traits.
Your Guide to Medicare Special Needs Plans (SNPs)
... Needs Plans Where Are Medicare SNPs Offered? Each year, different types of Medicare SNPs may be available in different parts of the country. Insurance companies decide where they will do business, so Medicare SNPs may not be available in ...
Iquebal, M A; Jaiswal, Sarika; Mahato, Ajay Kumar; Jayaswal, Pawan K; Angadi, U B; Kumar, Neeraj; Sharma, Nimisha; Singh, Anand K; Srivastav, Manish; Prakash, Jai; Singh, S K; Khan, Kasim; Mishra, Rupesh K; Rajan, Shailendra; Bajpai, Anju; Sandhya, B S; Nischita, Puttaraju; Ravishankar, K V; Dinesh, M R; Rai, Anil; Kumar, Dinesh; Sharma, Tilak R; Singh, Nagendra K
2017-11-02
Mango is one of the most important fruits of the tropical regions of the world, well known for its nutritive value, aroma and taste. Its world production is >45MT, worth >200 billion US dollars. Genomic resources are required to improve productivity and management of mango germplasm, yet no web-based genomic resource is available for mango. Hence rapid and cost-effective high-throughput discovery of putative markers is required to develop such resources. RAD-based marker discovery can meet this urgent need until the whole genome sequence of mango becomes available. Using a panel of 84 mango varieties, a total of 28.6 Gb of data was generated by the ddRAD-Seq approach on the Illumina HiSeq 2000 platform. A total of 1.25 million SNPs were discovered. A phylogenetic tree built from 749 SNPs common across these varieties revealed three major lineages, which were compared with geographical locations. A web genomic resource, MiSNPDb, available at http://webtom.cabgrid.res.in/mangosnps/, is based on a 3-tier architecture and was developed using PHP, MySQL and JavaScript. This resource can be of immense use in the development of high-density linkage maps, QTL discovery, varietal differentiation, traceability, genome finishing and SNP chip development for future GWAS in genomic selection programs. We report here the world's first web-based genomic resource for genetic improvement and germplasm management of mango.
Whole-genome analyses of Korean native and Holstein cattle breeds by massively parallel sequencing.
Choi, Jung-Woo; Liao, Xiaoping; Stothard, Paul; Chung, Won-Hyong; Jeon, Heoyn-Jeong; Miller, Stephen P; Choi, So-Young; Lee, Jeong-Koo; Yang, Bokyoung; Lee, Kyung-Tai; Han, Kwang-Jin; Kim, Hyeong-Cheol; Jeong, Dongkee; Oh, Jae-Don; Kim, Namshin; Kim, Tae-Hun; Lee, Hak-Kyo; Lee, Sung-Jin
2014-01-01
A main goal of cattle genomics is to identify DNA differences that account for variations in economically important traits. In this study, we performed whole-genome analyses of three important cattle breeds in Korea--Hanwoo, Jeju Heugu, and Korean Holstein--using the Illumina HiSeq 2000 sequencing platform. We achieved 25.5-, 29.6-, and 29.5-fold coverage of the Hanwoo, Jeju Heugu, and Korean Holstein genomes, respectively, and identified a total of 10.4 million single nucleotide polymorphisms (SNPs), of which 54.12% were found to be novel. We also detected 1,063,267 insertions-deletions (InDels) across the genomes (78.92% novel). Annotations of the datasets identified a total of 31,503 nonsynonymous SNPs and 859 frameshift InDels that could affect phenotypic variations in traits of interest. Furthermore, genome-wide copy number variation regions (CNVRs) were detected by comparing the Hanwoo, Jeju Heugu, and previously published Chikso genomes against that of Korean Holstein. A total of 992, 284, and 1881 CNVRs, respectively, were detected throughout the genome. Moreover, 53, 65, 45, and 82 putative regions of homozygosity (ROH) were identified in Hanwoo, Jeju Heugu, Chikso, and Korean Holstein respectively. The results of this study provide a valuable foundation for further investigations to dissect the molecular mechanisms underlying variation in economically important traits in cattle and to develop genetic markers for use in cattle breeding.
Zhou, Liyuan; Liu, Shouye; Wu, Weixun; Chen, Daibo; Zhan, Xiaodeng; Zhu, Aike; Zhang, Yingxin; Cheng, Shihua; Cao, Liyong; Lou, Xiangyang; Xu, Haiming
2016-01-01
Xieyou9308 is a certified super hybrid rice cultivar with a high grain yield. To investigate its underlying genetic basis of high yield potential, a recombinant inbred line (RIL) population derived from the cross between the maintainer line XieqingzaoB (XQZB) and the restorer line Zhonghui9308 (ZH9308) was constructed for identification of quantitative trait SNPs (QTSs) associated with two important agronomic traits, plant height (PH) and heading date (HD). By re-sequencing of 138 recombinant inbred lines (RILs), a total of ~0.7 million SNPs were identified for the association studies on the PH and HD. Three association mapping strategies (including hypothesis-free genome-wide association and its two complementary hypothesis-engaged ones, QTL-based association and gene-based association) were adopted for data analysis. Using a saturated mixed linear model including epistasis and environmental interaction, we identified a total of 31 QTSs associated with either the PH or the HD. The total estimated heritability across three analyses ranged from 37.22% to 45.63% and from 37.53% to 55.96% for the PH and HD, respectively. In this study we examined the feasibility of association studies in an experimental population (RIL) and identified several common loci through multiple strategies which could be preferred candidates for further research. PMID:27406081
Stölting, Kai N; Paris, Margot; Meier, Cécile; Heinze, Berthold; Castiglione, Stefano; Bartha, Denes; Lexer, Christian
2015-08-01
Studying the divergence continuum in plants is relevant to fundamental and applied biology because of the potential to reveal functionally important genetic variation. In this context, whole-genome sequencing (WGS) provides the necessary rigour for uncovering footprints of selection. We resequenced populations of two divergent phylogeographic lineages of Populus alba (n = 48), thoroughly characterized by microsatellites (n = 317), and scanned their genomes for regions of unusually high allelic differentiation and reduced diversity using > 1.7 million single nucleotide polymorphisms (SNPs) from WGS. Results were confirmed by Sanger sequencing. On average, 9134 high-differentiation (≥ 4 standard deviations) outlier SNPs were uncovered between populations, 848 of which were shared by ≥ three replicate comparisons. Annotation revealed that 545 of these were located in 437 predicted genes. Twelve percent of differentiation outlier genome regions exhibited significantly reduced genetic diversity. Gene ontology (GO) searches were successful for 327 high-differentiation genes, and these were enriched for 63 GO terms. Our results provide a snapshot of the roles of 'hard selective sweeps' vs divergent selection of standing genetic variation in distinct postglacial recolonization lineages of P. alba. Thus, this study adds to our understanding of the mechanisms responsible for the origin of functionally relevant variation in temperate trees. © 2015 The Authors. New Phytologist © 2015 New Phytologist Trust.
Whole-Genome Analyses of Korean Native and Holstein Cattle Breeds by Massively Parallel Sequencing
Stothard, Paul; Chung, Won-Hyong; Jeon, Heoyn-Jeong; Miller, Stephen P.; Choi, So-Young; Lee, Jeong-Koo; Yang, Bokyoung; Lee, Kyung-Tai; Han, Kwang-Jin; Kim, Hyeong-Cheol; Jeong, Dongkee; Oh, Jae-Don; Kim, Namshin; Kim, Tae-Hun; Lee, Hak-Kyo; Lee, Sung-Jin
2014-01-01
A main goal of cattle genomics is to identify DNA differences that account for variations in economically important traits. In this study, we performed whole-genome analyses of three important cattle breeds in Korea—Hanwoo, Jeju Heugu, and Korean Holstein—using the Illumina HiSeq 2000 sequencing platform. We achieved 25.5-, 29.6-, and 29.5-fold coverage of the Hanwoo, Jeju Heugu, and Korean Holstein genomes, respectively, and identified a total of 10.4 million single nucleotide polymorphisms (SNPs), of which 54.12% were found to be novel. We also detected 1,063,267 insertions–deletions (InDels) across the genomes (78.92% novel). Annotations of the datasets identified a total of 31,503 nonsynonymous SNPs and 859 frameshift InDels that could affect phenotypic variations in traits of interest. Furthermore, genome-wide copy number variation regions (CNVRs) were detected by comparing the Hanwoo, Jeju Heugu, and previously published Chikso genomes against that of Korean Holstein. A total of 992, 284, and 1881 CNVRs, respectively, were detected throughout the genome. Moreover, 53, 65, 45, and 82 putative regions of homozygosity (ROH) were identified in Hanwoo, Jeju Heugu, Chikso, and Korean Holstein respectively. The results of this study provide a valuable foundation for further investigations to dissect the molecular mechanisms underlying variation in economically important traits in cattle and to develop genetic markers for use in cattle breeding. PMID:24992012
Yi, Liuxi; Gao, Fengyun; Siqin, Bateer; Zhou, Yu; Li, Qiang; Zhao, Xiaoqing; Jia, Xiaoyun; Zhang, Hui
2017-01-01
Flax is an important crop for oil and fiber; however, no high-density genetic maps have been reported for this species. Specific length amplified fragment sequencing (SLAF-seq) is a high-resolution strategy for large scale de novo discovery and genotyping of single nucleotide polymorphisms. In this study, SLAF-seq was employed to develop SNP markers in an F2 population to construct a high-density genetic map for flax. In total, 196.29 million paired-end reads were obtained. The average sequencing depth was 25.08 in the male parent, 32.17 in the female parent, and 9.64 in each F2 progeny. In total, 389,288 polymorphic SLAFs were detected, from which 260,380 polymorphic SNPs were developed. After filtering, 4,638 SNPs were found suitable for genetic map construction. The final genetic map included 4,145 SNP markers on 15 linkage groups and was 2,632.94 cM in length, with an average distance of 0.64 cM between adjacent markers. To our knowledge, this map is the densest SNP-based genetic map for flax. The SNP markers and genetic map reported here will serve as a foundation for the fine mapping of quantitative trait loci (QTLs), map-based gene cloning and marker assisted selection (MAS) for flax.
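The reported marker density can be checked from the figures in the abstract. The short Python sketch below is only an illustration: it assumes that a linkage group carrying k markers contributes k - 1 intervals, which the abstract does not state, and the function name is hypothetical.

```python
def mean_marker_spacing(map_length_cm, n_markers, n_linkage_groups):
    """Average distance between adjacent markers on a genetic map.

    Assumption (not stated in the abstract): each linkage group with
    k markers has k - 1 intervals, so the map as a whole has
    n_markers - n_linkage_groups intervals.
    """
    n_intervals = n_markers - n_linkage_groups
    if n_intervals <= 0:
        raise ValueError("need more markers than linkage groups")
    return map_length_cm / n_intervals


# Figures taken from the abstract: 4,145 markers, 15 linkage groups, 2,632.94 cM.
spacing = mean_marker_spacing(2632.94, 4145, 15)
print(f"{spacing:.2f} cM between adjacent markers")
```

Dividing by the raw marker count instead of the interval count gives the same value after rounding to two decimals, so either convention is consistent with the 0.64 cM in the abstract.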
Yokoyama, Eiji; Hirai, Shinichiro; Ishige, Taichiro; Murakami, Satoshi
2018-01-02
Seventeen clusters of Shiga toxin-producing Escherichia coli O157:H7/- (O157) strains, determined by cluster analysis of pulsed-field gel electrophoresis patterns, were analyzed using whole genome sequence (WGS) data to investigate this pathogen's molecular epidemiology. The 17 clusters included 136 strains containing strains from nine outbreaks, with each outbreak caused by a single source contaminated with the organism, as shown by epidemiological contact surveys. WGS data of these strains were used to identify single nucleotide polymorphisms (SNPs) by two methods: short read data were directly mapped to a reference genome (mapping derived SNPs) and common SNPs between the mapping derived SNPs and SNPs in assembled data of short read data (common SNPs). Among both SNPs, those that were detected in genes with a gap were excluded to remove ambiguous SNPs from further analysis. The effectiveness of both SNPs was investigated among all the concatenated SNPs that were detected (whole SNP set); SNPs were divided into three categories based on the genes in which they were located (i.e., backbone SNP set, O-island SNP set, and mobile element SNP set); and SNPs in non-coding regions (intergenic region SNP set). When SNPs from strains isolated from the nine single source derived outbreaks were analyzed using an unweighted pair group method with arithmetic mean tree (UPGMA) and a minimum spanning tree (MST), the maximum pair-wise distances of the backbone SNP set of the mapping derived SNPs were significantly smaller than those of the whole and intergenic region SNP set on both UPGMAs and MSTs. This significant difference was also observed when the backbone SNP set of the common SNPs were examined (Steel-Dwass test, P≤0.01). When the maximum pair-wise distances were compared between the mapping derived and common SNPs, significant differences were observed in those of the whole, mobile element, and intergenic region SNP set (Wilcoxon signed rank test, P≤0.01). 
When all the strains included in one complex on an MST or one cluster on a UPGMA were designated as the same genotype, the values of the Hunter-Gaston Discriminatory Power Index for the backbone SNP set of the mapping derived and common SNPs were higher than those of other SNP sets. In contrast, the mobile element SNP set could not robustly subdivide lineage I strains of tested O157 strains using both the mapping derived and common SNPs. These results suggested that the backbone SNP set were the most effective for analysis of WGS data for O157 in enabling an appropriation of its molecular epidemiology. Copyright © 2017 Elsevier B.V. All rights reserved.
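The Hunter-Gaston Discriminatory Power Index used above is Simpson's index of diversity applied to typing results: D = 1 - [1 / (N(N-1))] * sum over genotypes of n_j(n_j - 1), where N is the number of strains and n_j the number of strains assigned to genotype j. The Python sketch below illustrates the calculation only; the genotype labels are hypothetical and are not data from the study.

```python
from collections import Counter


def hunter_gaston_index(genotypes):
    """Hunter-Gaston discriminatory index for a list of genotype labels.

    D = 1 - sum(n_j * (n_j - 1)) / (N * (N - 1)), where n_j is the number
    of strains assigned to genotype j and N is the total number of strains.
    """
    n = len(genotypes)
    if n < 2:
        raise ValueError("at least two strains are required")
    counts = Counter(genotypes).values()
    return 1.0 - sum(c * (c - 1) for c in counts) / (n * (n - 1))


# Hypothetical toy assignment: four strains, two of which share a genotype.
toy = ["g1", "g1", "g2", "g3"]
print(round(hunter_gaston_index(toy), 4))
```

A value of 1 means every strain received its own genotype, and 0 means all strains were placed in a single genotype, so a higher index for one SNP set indicates finer subdivision of the same strains.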
Herold, Christine; Hooli, Basavaraj V.; Mullin, Kristina; Liu, Tian; Roehr, Johannes T; Mattheisen, Manuel; Parrado, Antonio R.; Bertram, Lars; Lange, Christoph; Tanzi, Rudolph E.
2015-01-01
The genetic basis of Alzheimer's disease (AD) is complex and heterogeneous. Over 200 highly penetrant pathogenic variants in the genes APP, PSEN1 and PSEN2 cause a subset of early-onset familial Alzheimer's disease (EOFAD). On the other hand, susceptibility to late-onset forms of AD (LOAD) is indisputably associated with the ε4 allele in the gene APOE, and more recently with variants in more than two dozen additional genes identified in large-scale genome-wide association studies (GWAS) and meta-analysis reports. Taken together, however, although the heritability in AD is estimated to be as high as 80%, a large proportion of the underlying genetic factors still remain to be elucidated. In this study we performed a systematic family-based genome-wide association and meta-analysis on close to 15 million imputed variants from three large collections of AD families (~3,500 subjects from 1,070 families). Using a multivariate phenotype combining affection status and onset age, meta-analysis of the association results revealed three single nucleotide polymorphisms (SNPs) that achieved genome-wide significance for association with AD risk: rs7609954 in the gene PTPRG (P-value = 3.98 × 10^-8), rs1347297 in the gene OSBPL6 (P-value = 4.53 × 10^-8), and rs1513625 near PDCL3 (P-value = 4.28 × 10^-8). In addition, rs72953347 in OSBPL6 (P-value = 6.36 × 10^-7) and two SNPs in the gene CDKAL1 showed marginally significant association with LOAD (rs10456232, P-value: 4.76 × 10^-7; rs62400067, P-value: 3.54 × 10^-7). In summary, family-based GWAS meta-analysis of imputed SNPs revealed novel genomic variants in (or near) PTPRG, OSBPL6, and PDCL3 that influence risk for AD with genome-wide significance. PMID:26830138
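The split between genome-wide significant and marginally significant SNPs can be reproduced from the reported P-values. The Python sketch below assumes the conventional genome-wide threshold of 5 × 10^-8; the abstract does not state which threshold the authors applied, so the constant is an assumption.

```python
# Assumption: the conventional genome-wide significance threshold.
# The abstract does not state the threshold that was actually used.
GENOME_WIDE_ALPHA = 5e-8

# P-values as reported in the abstract.
reported_p_values = {
    "rs7609954": 3.98e-8,   # PTPRG
    "rs1347297": 4.53e-8,   # OSBPL6
    "rs1513625": 4.28e-8,   # near PDCL3
    "rs72953347": 6.36e-7,  # OSBPL6
    "rs10456232": 4.76e-7,  # CDKAL1
    "rs62400067": 3.54e-7,  # CDKAL1
}

# SNPs below the threshold, and the remainder.
significant = sorted(
    rs for rs, p in reported_p_values.items() if p < GENOME_WIDE_ALPHA
)
marginal = sorted(set(reported_p_values) - set(significant))

print("genome-wide significant:", significant)
print("below threshold:", marginal)
```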
Refining Susceptibility Loci of Chronic Obstructive Pulmonary Disease with Lung eQTLs
Lamontagne, Maxime; Couture, Christian; Postma, Dirkje S.; Timens, Wim; Sin, Don D.; Paré, Peter D.; Hogg, James C.; Nickle, David; Laviolette, Michel; Bossé, Yohan
2013-01-01
Chronic obstructive pulmonary disease (COPD) is the fourth leading cause of mortality worldwide. Recent genome-wide association studies (GWAS) have identified robust susceptibility loci associated with COPD. However, the mechanisms mediating the risk conferred by these loci remain to be found. The goal of this study was to identify causal genes/variants within susceptibility loci associated with COPD. In the discovery cohort, genome-wide gene expression profiles of 500 non-tumor lung specimens were obtained from patients undergoing lung surgery. Blood DNA from the same patients was genotyped for 1.2 million SNPs. Following genotyping and gene expression quality control filters, 409 samples were analyzed. Lung expression quantitative trait loci (eQTLs) were identified and overlaid onto three COPD susceptibility loci derived from GWAS: 4q31 (HHIP), 4q22 (FAM13A), and 19q13 (RAB4B, EGLN2, MIA, CYP2A6). Significant eQTLs were replicated in two independent datasets (n = 363 and 339). SNPs previously associated with COPD and lung function on 4q31 (rs1828591, rs13118928) were associated with the mRNA expression of HHIP. An association between mRNA expression level of FAM13A and SNP rs2045517 was detected at 4q22, but did not reach statistical significance. At 19q13, significant eQTLs were detected with EGLN2. In summary, this study supports HHIP, FAM13A, and EGLN2 as the most likely causal COPD genes on 4q31, 4q22, and 19q13, respectively. Strong lung eQTL SNPs identified in this study will need to be tested for association with COPD in case-control studies. Further functional studies will also be needed to understand the role of genes regulated by disease-related variants in COPD. PMID:23936167
Genome-wide association study of Alzheimer's disease.
Kamboh, M I; Demirci, F Y; Wang, X; Minster, R L; Carrasquillo, M M; Pankratz, V S; Younkin, S G; Saykin, A J; Jun, G; Baldwin, C; Logue, M W; Buros, J; Farrer, L; Pericak-Vance, M A; Haines, J L; Sweet, R A; Ganguli, M; Feingold, E; Dekosky, S T; Lopez, O L; Barmada, M M
2012-05-15
In addition to apolipoprotein E (APOE), recent large genome-wide association studies (GWASs) have identified nine other genes/loci (CR1, BIN1, CLU, PICALM, MS4A4/MS4A6E, CD2AP, CD33, EPHA1 and ABCA7) for late-onset Alzheimer's disease (LOAD). However, the genetic effect attributable to known loci is about 50%, indicating that additional risk genes for LOAD remain to be identified. In this study, we have used a new GWAS data set from the University of Pittsburgh (1291 cases and 938 controls) to examine in detail the recently implicated nine new regions with Alzheimer's disease (AD) risk, and also performed a meta-analysis utilizing the top 1% GWAS single-nucleotide polymorphisms (SNPs) with P<0.01 along with four independent data sets (2727 cases and 3336 controls) for these SNPs in an effort to identify new AD loci. The new GWAS data were generated on the Illumina Omni1-Quad chip and imputed at ~2.5 million markers. As expected, several markers in the APOE regions showed genome-wide significant associations in the Pittsburgh sample. While we observed nominally significant associations (P<0.05) either within or adjacent to five genes (PICALM, BIN1, ABCA7, MS4A4/MS4A6E and EPHA1), significant signals were observed 69-180 kb outside of the remaining four genes (CD33, CLU, CD2AP and CR1). Meta-analysis on the top 1% SNPs revealed a suggestive novel association in the PPP1R3B gene (top SNP rs3848140 with P = 3.05E-07). The association of this SNP with AD risk was consistent in all five samples with a meta-analysis odds ratio of 2.43. This is a potential candidate gene for AD as this is expressed in the brain and is involved in lipid metabolism. These findings need to be confirmed in additional samples.
Genome-wide association study of Alzheimer's disease
Kamboh, M I; Demirci, F Y; Wang, X; Minster, R L; Carrasquillo, M M; Pankratz, V S; Younkin, S G; Saykin, A J; Jun, G; Baldwin, C; Logue, M W; Buros, J; Farrer, L; Pericak-Vance, M A; Haines, J L; Sweet, R A; Ganguli, M; Feingold, E; DeKosky, S T; Lopez, O L; Barmada, M M
2012-01-01
In addition to apolipoprotein E (APOE), recent large genome-wide association studies (GWASs) have identified nine other genes/loci (CR1, BIN1, CLU, PICALM, MS4A4/MS4A6E, CD2AP, CD33, EPHA1 and ABCA7) for late-onset Alzheimer's disease (LOAD). However, the genetic effect attributable to known loci is about 50%, indicating that additional risk genes for LOAD remain to be identified. In this study, we have used a new GWAS data set from the University of Pittsburgh (1291 cases and 938 controls) to examine in detail the recently implicated nine new regions with Alzheimer's disease (AD) risk, and also performed a meta-analysis utilizing the top 1% GWAS single-nucleotide polymorphisms (SNPs) with P<0.01 along with four independent data sets (2727 cases and 3336 controls) for these SNPs in an effort to identify new AD loci. The new GWAS data were generated on the Illumina Omni1-Quad chip and imputed at ∼2.5 million markers. As expected, several markers in the APOE regions showed genome-wide significant associations in the Pittsburgh sample. While we observed nominally significant associations (P<0.05) either within or adjacent to five genes (PICALM, BIN1, ABCA7, MS4A4/MS4A6E and EPHA1), significant signals were observed 69–180 kb outside of the remaining four genes (CD33, CLU, CD2AP and CR1). Meta-analysis on the top 1% SNPs revealed a suggestive novel association in the PPP1R3B gene (top SNP rs3848140 with P=3.05E–07). The association of this SNP with AD risk was consistent in all five samples with a meta-analysis odds ratio of 2.43. This is a potential candidate gene for AD as this is expressed in the brain and is involved in lipid metabolism. These findings need to be confirmed in additional samples. PMID:22832961
Design and characterization of a 52K SNP chip for goats.
Tosser-Klopp, Gwenola; Bardou, Philippe; Bouchez, Olivier; Cabau, Cédric; Crooijmans, Richard; Dong, Yang; Donnadieu-Tonon, Cécile; Eggen, André; Heuven, Henri C M; Jamli, Saadiah; Jiken, Abdullah Johari; Klopp, Christophe; Lawley, Cynthia T; McEwan, John; Martin, Patrice; Moreno, Carole R; Mulsant, Philippe; Nabihoudine, Ibouniyamine; Pailhoux, Eric; Palhière, Isabelle; Rupp, Rachel; Sarry, Julien; Sayre, Brian L; Tircazes, Aurélie; Jun Wang; Wang, Wen; Zhang, Wenguang
2014-01-01
The success of Genome Wide Association Studies in the discovery of sequence variation linked to complex traits in humans has increased interest in high throughput SNP genotyping assays in livestock species. Primary goals are QTL detection and genomic selection. The purpose here was design of a 50-60,000 SNP chip for goats. The success of a moderate density SNP assay depends on reliable bioinformatic SNP detection procedures, the technological success rate of the SNP design, even spacing of SNPs on the genome and selection of Minor Allele Frequencies (MAF) suitable to use in diverse breeds. Through the federation of three SNP discovery projects consolidated as the International Goat Genome Consortium, we have identified approximately twelve million high quality SNP variants in the goat genome stored in a database together with their biological and technical characteristics. These SNPs were identified within and between six breeds (meat, milk and mixed): Alpine, Boer, Creole, Katjang, Saanen and Savanna, comprising a total of 97 animals. Whole genome and Reduced Representation Library sequences were aligned on >10 kb scaffolds of the de novo goat genome assembly. The 60,000 selected SNPs, evenly spaced on the goat genome, were submitted for oligo manufacturing (Illumina, Inc) and published in dbSNP along with flanking sequences and map position on goat assemblies (i.e. scaffolds and pseudo-chromosomes), sheep genome V2 and cattle UMD3.1 assembly. Ten breeds were then used to validate the SNP content and 52,295 loci could be successfully genotyped and used to generate a final cluster file. The combined strategy of using mainly whole genome Next Generation Sequencing and mapping on a contig genome assembly, complemented with Illumina design tools proved to be efficient in producing this GoatSNP50 chip. Advances in use of molecular markers are expected to accelerate goat genomic studies in coming years.
Design and Characterization of a 52K SNP Chip for Goats
Tosser-Klopp, Gwenola; Bardou, Philippe; Bouchez, Olivier; Cabau, Cédric; Crooijmans, Richard; Dong, Yang; Donnadieu-Tonon, Cécile; Eggen, André; Heuven, Henri C. M.; Jamli, Saadiah; Jiken, Abdullah Johari; Klopp, Christophe; Lawley, Cynthia T.; McEwan, John; Martin, Patrice; Moreno, Carole R.; Mulsant, Philippe; Nabihoudine, Ibouniyamine; Pailhoux, Eric; Palhière, Isabelle; Rupp, Rachel; Sarry, Julien; Sayre, Brian L.; Tircazes, Aurélie; Jun Wang; Wang, Wen; Zhang, Wenguang
2014-01-01
The success of Genome Wide Association Studies in the discovery of sequence variation linked to complex traits in humans has increased interest in high throughput SNP genotyping assays in livestock species. Primary goals are QTL detection and genomic selection. The purpose here was design of a 50–60,000 SNP chip for goats. The success of a moderate density SNP assay depends on reliable bioinformatic SNP detection procedures, the technological success rate of the SNP design, even spacing of SNPs on the genome and selection of Minor Allele Frequencies (MAF) suitable to use in diverse breeds. Through the federation of three SNP discovery projects consolidated as the International Goat Genome Consortium, we have identified approximately twelve million high quality SNP variants in the goat genome stored in a database together with their biological and technical characteristics. These SNPs were identified within and between six breeds (meat, milk and mixed): Alpine, Boer, Creole, Katjang, Saanen and Savanna, comprising a total of 97 animals. Whole genome and Reduced Representation Library sequences were aligned on >10 kb scaffolds of the de novo goat genome assembly. The 60,000 selected SNPs, evenly spaced on the goat genome, were submitted for oligo manufacturing (Illumina, Inc) and published in dbSNP along with flanking sequences and map position on goat assemblies (i.e. scaffolds and pseudo-chromosomes), sheep genome V2 and cattle UMD3.1 assembly. Ten breeds were then used to validate the SNP content and 52,295 loci could be successfully genotyped and used to generate a final cluster file. The combined strategy of using mainly whole genome Next Generation Sequencing and mapping on a contig genome assembly, complemented with Illumina design tools proved to be efficient in producing this GoatSNP50 chip. Advances in use of molecular markers are expected to accelerate goat genomic studies in coming years. PMID:24465974
Ahsan, Muhammad; Ek, Weronica E.; Karlsson, Torgny; Gyllensten, Ulf
2017-01-01
Associations between epigenetic alterations and disease status have been identified for many diseases. However, there is no strong evidence that epigenetic alterations are directly causal for disease pathogenesis. In this study, we combined SNP and DNA methylation data with measurements of protein biomarkers for cancer, inflammation or cardiovascular disease, to investigate the relative contribution of genetic and epigenetic variation on biomarker levels. A total of 121 protein biomarkers were measured and analyzed in relation to DNA methylation at 470,000 genomic positions and to over 10 million SNPs. We performed epigenome-wide association study (EWAS) and genome-wide association study (GWAS) analyses, and integrated biomarker, DNA methylation and SNP data using between 698 and 1033 samples depending on data availability for the different analyses. We identified 124 and 45 loci (Bonferroni adjusted P < 0.05) with effect sizes up to 0.22 standard units’ change per 1% change in DNA methylation levels and up to four standard units’ change per copy of the effective allele in the EWAS and GWAS respectively. Most GWAS loci were cis-regulatory whereas most EWAS loci were located in trans. Eleven EWAS loci were associated with multiple biomarkers, including one in NLRC5 associated with CXCL11, CXCL9, IL-12, and IL-18 levels. All EWAS signals that overlapped with a GWAS locus were driven by underlying genetic variants and three EWAS signals were confounded by smoking. While some cis-regulatory SNPs for biomarkers appeared to have an effect also on DNA methylation levels, cis-regulatory SNPs for DNA methylation were not observed to affect biomarker levels. We present associations between protein biomarker and DNA methylation levels at numerous loci in the genome. The associations are likely to reflect the underlying pattern of genetic variants, specific environmental exposures, or represent secondary effects to the pathogenesis of disease. PMID:28915241
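The "Bonferroni adjusted P < 0.05" criterion mentioned above multiplies each raw P-value by the number of tests performed and caps the result at 1. The Python sketch below shows that adjustment on hypothetical P-values; the number of tests in the study itself is not given in the abstract, so the inputs here are illustrative only.

```python
def bonferroni_adjust(p_values):
    """Bonferroni-adjusted P-values: p * m, capped at 1, for m tests."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def significant_after_bonferroni(p_values, alpha=0.05):
    """Indices of tests whose adjusted P-value is below alpha."""
    adjusted = bonferroni_adjust(p_values)
    return [i for i, p in enumerate(adjusted) if p < alpha]


# Hypothetical raw P-values for four tests (not data from the study).
raw = [0.001, 0.02, 0.04, 0.6]
print(bonferroni_adjust(raw))
print(significant_after_bonferroni(raw))
```

With four tests, only the first hypothetical P-value stays below 0.05 after adjustment, which illustrates how the correction becomes stricter as the number of tested positions or SNPs grows.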
CerealsDB 3.0: expansion of resources and data integration.
Wilkinson, Paul A; Winfield, Mark O; Barker, Gary L A; Tyrrell, Simon; Bian, Xingdong; Allen, Alexandra M; Burridge, Amanda; Coghill, Jane A; Waterfall, Christy; Caccamo, Mario; Davey, Robert P; Edwards, Keith J
2016-06-24
The increase in human populations around the world has put pressure on resources, and as a consequence food security has become an important challenge for the 21st century. Wheat (Triticum aestivum) is one of the most important crops in human and livestock diets, and the development of wheat varieties that produce higher yields, combined with increased resistance to pests and resilience to changes in climate, has meant that wheat breeding has become an important focus of scientific research. In an attempt to facilitate these improvements in wheat, plant breeders have employed molecular tools to help them identify genes for important agronomic traits that can be bred into new varieties. Modern molecular techniques have ensured that the rapid and inexpensive characterisation of SNP markers and their validation with modern genotyping methods has produced a valuable resource that can be used in marker assisted selection. CerealsDB was created as a means of quickly disseminating this information to breeders and researchers around the globe. CerealsDB version 3.0 is an online resource that contains a wide range of genomic datasets for wheat that will assist plant breeders and scientists to select the most appropriate markers for use in marker assisted selection. CerealsDB includes a database which currently contains in excess of a million putative varietal SNPs, of which several hundreds of thousands have been experimentally validated. In addition, CerealsDB also contains new data on functional SNPs predicted to have a major effect on protein function and we have constructed a web service to encourage data integration and high-throughput programmatic access. CerealsDB is an open access website that hosts information on SNPs that are considered useful for both plant breeders and research scientists. 
The recent inclusion of web services designed to federate genomic data resources allows the information on CerealsDB to be more fully integrated with the WheatIS network and other biological databases.
Ravelombola, Waltram; Shi, Ainong; Weng, Yuejin; Mou, Beiquan; Motes, Dennis; Clark, John; Chen, Pengyin; Srivastava, Vibha; Qin, Jun; Dong, Lingdi; Yang, Wei; Bhattarai, Gehendra; Sugihara, Yuichi
2018-01-01
This is the first report on association analysis of salt tolerance and identification of SNP markers associated with salt tolerance in cowpea. Cowpea (Vigna unguiculata (L.) Walp) is one of the most important cultivated legumes in Africa. The worldwide annual production in cowpea dry seed is 5.4 million metric tons. However, cowpea is unfavorably affected by salinity stress at germination and seedling stages, which is exacerbated by the effects of climate change. The lack of knowledge on the genetic basis underlying salt tolerance in cowpea limits the establishment of a breeding strategy for developing salt-tolerant cowpea cultivars. The objectives of this study were to conduct association mapping for salt tolerance at germination and seedling stages and to identify SNP markers associated with salt tolerance in cowpea. We analyzed the salt tolerance index of 116 and 155 cowpea accessions at germination and seedling stages, respectively. A total of 1049 SNPs postulated from genotyping-by-sequencing were used for association analysis. Population structure was inferred using Structure 2.3.4; the optimal K was determined using Structure Harvester. TASSEL 5, GAPIT, and FarmCPU, with three models (single marker regression, general linear model, and mixed linear model), were used for the association study. Substantial variation in salt tolerance index for germination rate, plant height reduction, fresh and dry shoot biomass reduction, foliar leaf injury, and inhibition of the first trifoliate leaf was observed. The cowpea accessions were structured into two subpopulations. Three SNPs, Scaffold87490_622, Scaffold87490_630, and C35017374_128, were highly associated with salt tolerance at germination stage. Seven SNPs, Scaffold93827_270, Scaffold68489_600, Scaffold87490_633, Scaffold87490_640, Scaffold82042_3387, C35069468_1916, and Scaffold93942_1089, were found to be associated with salt tolerance at seedling stage. 
The SNP markers were consistent across the three models and could be used as a tool to select salt-tolerant lines for breeding improved cowpea tolerance to salinity.
A Bayesian Method for Evaluating and Discovering Disease Loci Associations
Jiang, Xia; Barmada, M. Michael; Cooper, Gregory F.; Becich, Michael J.
2011-01-01
Background A genome-wide association study (GWAS) typically involves examining representative SNPs in individuals from some population. A GWAS data set can concern a million SNPs and may soon concern billions. Researchers investigate the association of each SNP individually with a disease, and it is becoming increasingly commonplace to also analyze multi-SNP associations. Techniques for handling so many hypotheses include the Bonferroni correction and recently developed Bayesian methods. These methods can encounter problems. Most importantly, they are not applicable to a complex multi-locus hypothesis which has several competing hypotheses rather than only a null hypothesis. A method that computes the posterior probability of complex hypotheses is a pressing need. Methodology/Findings We introduce the Bayesian network posterior probability (BNPP) method which addresses the difficulties. The method represents the relationship between a disease and SNPs using a directed acyclic graph (DAG) model, and computes the likelihood of such models using a Bayesian network scoring criterion. The posterior probability of a hypothesis is computed based on the likelihoods of all competing hypotheses. The BNPP can not only be used to evaluate a hypothesis that has previously been discovered or suspected, but also to discover new disease loci associations. The results of experiments using simulated and real data sets are presented. Our results concerning simulated data sets indicate that the BNPP exhibits both better evaluation and discovery performance than does a p-value based method. For the real data sets, previous findings in the literature are confirmed and additional findings are found. Conclusions/Significance We conclude that the BNPP resolves a pressing problem by providing a way to compute the posterior probability of complex multi-locus hypotheses. A researcher can use the BNPP to determine the expected utility of investigating a hypothesis further. 
Furthermore, we conclude that the BNPP is a promising method for discovering disease loci associations. PMID:21853025
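The abstract above states that the posterior probability of a hypothesis is computed from the likelihoods of all competing hypotheses. A minimal sketch of that normalisation step follows, assuming the model likelihoods are already available as log values. The function name, the uniform default prior, and the example scores are illustrative assumptions; this is not the BNPP implementation or its Bayesian network scoring criterion.

```python
import math


def posterior_probabilities(log_likelihoods, priors=None):
    """Turn log-likelihoods of competing hypotheses into posterior probabilities.

    Applies Bayes' rule, P(H_i | D) = P(D | H_i) P(H_i) / sum_j P(D | H_j) P(H_j),
    in log space so that very small likelihoods do not underflow.
    A uniform prior is assumed when none is given (an assumption of this sketch).
    """
    n = len(log_likelihoods)
    if priors is None:
        priors = [1.0 / n] * n
    log_joint = [ll + math.log(p) for ll, p in zip(log_likelihoods, priors)]
    shift = max(log_joint)  # subtract the maximum before exponentiating
    weights = [math.exp(v - shift) for v in log_joint]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical log-likelihoods for three competing hypotheses (made-up numbers).
scores = {
    "no association": -1050.0,
    "SNP A only": -1046.0,
    "SNP A and SNP B": -1043.5,
}
posteriors = posterior_probabilities(list(scores.values()))
for name, p in zip(scores, posteriors):
    print(f"{name}: {p:.4f}")
```

Because every competing hypothesis enters the denominator, a multi-locus hypothesis is judged against its alternatives rather than against a single null hypothesis.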
Genotyping of 75 SNPs using arrays for individual identification in five population groups.
Hwa, Hsiao-Lin; Wu, Lawrence Shih Hsin; Lin, Chun-Yen; Huang, Tsun-Ying; Yin, Hsiang-I; Tseng, Li-Hui; Lee, James Chun-I
2016-01-01
Single nucleotide polymorphism (SNP) typing offers promise to forensic genetics. Various strategies and panels for analyzing SNP markers for individual identification have been published. However, the best panels with fewer identity SNPs for all major population groups are still under discussion. This study aimed to find more autosomal SNPs with high heterozygosity for individual identification among Asian populations. Ninety-six autosomal SNPs of 502 DNA samples from unrelated individuals of five population groups (208 Taiwanese Han, 83 Filipinos, 62 Thais, 69 Indonesians, and 80 individuals with European, Near Eastern, or South Asian ancestry) were analyzed using arrays in an initial screening, and 75 SNPs (group A, 46 newly selected SNPs; group B, 29 SNPs based on a previous SNP panel) were selected for further statistical analyses. Some SNPs with high heterozygosity from Asian populations were identified. The combined random match probability of the best 40 and 45 SNPs was between 3.16 × 10^-17 and 7.75 × 10^-17 and between 2.33 × 10^-19 and 7.00 × 10^-19, respectively, in all five populations. These loci offer comparable power to short tandem repeats (STRs) for routine forensic profiling. In this study, we demonstrated the population genetic characteristics and forensic parameters of 75 SNPs with high heterozygosity from five population groups. This SNP panel can provide valuable genotypic information and can be helpful in forensic casework for individual identification among these populations.
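To illustrate how a combined random match probability of this magnitude can arise, the sketch below multiplies per-locus match probabilities across a panel. It assumes biallelic SNPs in Hardy-Weinberg equilibrium and independence between loci. These are simplifying assumptions of the example, the allele frequencies are hypothetical, and the abstract does not describe the authors' exact calculation.

```python
def locus_match_probability(p):
    """Probability that two unrelated individuals share a genotype at one SNP.

    Assumes a biallelic locus in Hardy-Weinberg equilibrium with allele
    frequency p, so the genotype frequencies are p^2, 2pq and q^2.
    """
    q = 1.0 - p
    genotype_freqs = (p * p, 2.0 * p * q, q * q)
    return sum(f * f for f in genotype_freqs)


def combined_match_probability(allele_freqs):
    """Product of per-locus match probabilities, assuming independent loci."""
    combined = 1.0
    for p in allele_freqs:
        combined *= locus_match_probability(p)
    return combined


# Hypothetical panel: 40 SNPs, each with allele frequency 0.5
# (the maximum-heterozygosity case for a biallelic marker).
panel = [0.5] * 40
print(f"{combined_match_probability(panel):.2e}")
```

A locus with allele frequency 0.5 has a match probability of 0.375, the minimum for a biallelic marker, which is why high-heterozygosity SNPs are preferred for identification panels.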
Nieuwenhuis, Maartje A.; Siedlinski, Matteusz; van den Berge, Maarten; Granell, Raquel; Li, Xingnan; Niens, Marijke; van der Vlies, Pieter; Altmüller, Janine; Nürnberg, Peter; Kerkhof, Marjan; van Schayck, Onno C.; Riemersma, Ronald A.; van der Molen, Thys; de Monchy, Jan G.; Bossé, Yohan; Sandford, Andrew; Bruijnzeel-Koomen, Carla A.; van Wijk, Roy G.; ten Hacken, Nick H.; Timens, Wim; Boezen, H. Marike; Henderson, John; Kabesch, Michael; Vonk, Judith M.; Postma, Dirkje S.; Koppelman, Gerard H.
2016-01-01
Background Genome wide association studies (GWAS) of asthma have identified single nucleotide polymorphisms (SNPs) that modestly increase the risk for asthma. This could be due to phenotypic heterogeneity of asthma. Bronchial hyperresponsiveness (BHR) is a phenotypic hallmark of asthma. We aim to identify susceptibility genes for asthma combined with BHR and analyse the presence of cis-eQTLs among replicated SNPs. Secondly, we compare the genetic association of SNPs previously associated with (doctor diagnosed) asthma to our GWAS of asthma with BHR. Methods A GWAS was performed in 920 asthmatics with BHR and 980 controls. Top SNPs of our GWAS were analysed in four replication cohorts and lung cis-eQTL analysis was performed on replicated SNPs. We investigated association of SNPs previously associated with asthma in our data. Results 368 SNPs were followed up for replication. Six SNPs in genes encoding ABI3BP, NAF1, MICA and the 17q21 locus replicated in one or more cohorts, with one locus (17q21) achieving genome wide significance after meta-analysis. Five out of 6 replicated SNPs regulated 35 gene transcripts in whole lung. Eight of 20 asthma associated SNPs from previous GWAS were significantly associated with asthma and BHR. Three SNPs, in IL-33 and GSDMB, showed larger effect sizes in our data compared to published literature. Conclusions Combining GWAS with subsequent lung eQTL analysis revealed disease associated SNPs regulating lung mRNA expression levels of potential new asthma genes. Adding BHR to the asthma definition does not lead to an overall larger genetic effect size than analysing (doctor’s diagnosed) asthma. PMID:27439200
Speakman, John R.; Westerterp, Klaas R.
2013-01-01
SUMMARY The thrifty-gene hypothesis (TGH) posits that the modern genetic predisposition to obesity stems from a historical past where famine selected for genes that promote efficient fat deposition. It has been previously argued that such a scenario is unfeasible because under such strong selection any gene favouring fat deposition would rapidly move to fixation. Hence, we should all be predisposed to obesity: which we are not. The genetic architecture of obesity that has been revealed by genome-wide association studies (GWAS), however, calls into question such an argument. Obesity is caused by mutations in many hundreds (maybe thousands) of genes, each with a very minor, independent and additive impact. Selection on such genes would probably be very weak because the individual advantages they would confer would be very small. Hence, the genetic architecture of the epidemic may indeed be compatible with, and hence support, the TGH. To evaluate whether this is correct, it is necessary to know the likely effects of the identified GWAS alleles on survival during starvation. This would allow definition of their advantage in famine conditions, and hence the likely selection pressure for such alleles to have spread over the time course of human evolution. We constructed a mathematical model of weight loss under total starvation using the established principles of energy balance. Using the model, we found that fatter individuals would indeed survive longer and, at a given body weight, females would survive longer than males, when totally starved. An allele causing deposition of an extra 80 g of fat would result in an extension of life under total starvation by about 1.1–1.6% in an individual with 10 kg of fat and by 0.25–0.27% in an individual carrying 32 kg of fat. A mutation causing a per allele effect of 0.25% would become completely fixed in a population with an effective size of 5 million individuals in 6000 selection events. 
Because there have probably been about 24,000 famine events since the evolution of hominins 4 million years ago, there has been ample time even for genes with only very minor impacts on adiposity to move to fixation. The observed polymorphic variation in the genes causing the predisposition to obesity is incompatible with the TGH, unless all these single nucleotide polymorphisms (SNPs) arose in the last 900,000 years, a requirement we know is incorrect. The TGH is further weakened by the observation of no link between the effect size of these SNPs and their prevalence, which would be anticipated under the TGH model of selection if all the SNPs had arisen in the last 900,000 years. PMID:22864023
Whole genome sequencing and bioinformatics analysis of two Egyptian genomes.
ElHefnawi, Mahmoud; Jeon, Sungwon; Bhak, Youngjune; ElFiky, Asmaa; Horaiz, Ahmed; Jun, JeHoon; Kim, Hyunho; Bhak, Jong
2018-05-15
We report two Egyptian male genomes (EGP1 and EGP2) sequenced at ~30× sequencing depth. EGP1 had 4.7 million variants, of which 198,877 were novel, while EGP2 had 209,109 novel variants out of 4.8 million variants. The mitochondrial haplogroups of the two individuals were identified as H7b1 and L2a1c, respectively. We also identified the Y haplogroup of EGP1 (R1b) and EGP2 (J1a2a1a2 > P58 > FGC11). EGP1 had a mutation in ND4, an NADH dehydrogenase gene of the mitochondrial genome (m.11778 G > A), that causes Leber's hereditary optic neuropathy. Some SNPs shared by the two genomes were associated with an increased level of cholesterol and triglycerides, probably related to obesity in Egyptians. Comparison of these genomes with African and Western-Asian genomes can provide insights into Egyptian ancestry and genetic history. This resource can be used to further understand genomic diversity and functional classification of variants as well as human migration and evolution across Africa and Western-Asia. Copyright © 2017. Published by Elsevier B.V.
Genome-wide variation within and between wild and domestic yak.
Wang, Kun; Hu, Quanjun; Ma, Hui; Wang, Lizhong; Yang, Yongzhi; Luo, Wenchun; Qiu, Qiang
2014-07-01
The yak is one of the few animals that can thrive in the harsh environment of the Qinghai-Tibetan Plateau and adjacent Alpine regions. Yak provides essential resources allowing Tibetans to live at high altitudes. However, genetic variation within and between wild and domestic yak remains unknown. Here, we present a genome-wide study of the genetic variation within and between wild and domestic yak. Using next-generation sequencing technology, we resequenced three wild and three domestic yak with a mean of fivefold coverage using our published domestic yak genome as a reference. We identified a total of 8.38 million SNPs (7.14 million novel), 383,241 InDels and 126,352 structural variants between the six yak. We observed higher linkage disequilibrium in domestic yak than in wild yak and a modest but distinct genetic divergence between these two groups. We further identified more than a thousand potential selected regions (PSRs) for the three domestic yak by scanning the whole genome. These genomic resources can be further used to study genetic diversity and select superior breeds of yak and other bovid species. © 2014 John Wiley & Sons Ltd.
Alsaif, Mohammed A.; Al Shammari, Sulaiman A.; Alhamdan, Adel A.
2012-01-01
Introduction Single-nucleotide polymorphisms (SNPs) are biomarkers for exploring the genetic basis of many complex human diseases. The prediction of SNPs is promising in modern genetic analysis but it is still a great challenge to identify the functional SNPs in a disease-related gene. Computational approaches have helped to address this challenge, increasing the success rate of genetic association studies and reducing the cost of genotyping. The objective of this study is to identify deleterious non-synonymous SNPs (nsSNPs) associated with the COL1A1 gene. Material and methods The SNPs were retrieved from the Single Nucleotide Polymorphism Database (dbSNP). Using I-Mutant, protein stability change was calculated. The potentially functional nsSNPs and their effect on proteins were predicted by PolyPhen and SIFT respectively. FASTSNP was used for estimation of risk score. Results Our analysis revealed 247 SNPs as non-synonymous, out of which 5 nsSNPs were found to be least stable by I-Mutant 2.0 with a DDG value of > –1.0. Four nsSNPs, namely rs17853657, rs17857117, rs57377812 and rs1059454, showed a highly deleterious tolerance index score of 0.00 with a change in their physicochemical properties by the SIFT server. Seven nsSNPs, namely rs1059454, rs8179178, rs17853657, rs17857117, rs72656340, rs72656344 and rs72656351, were found to be probably damaging with a PSIC score difference between 2.0 and 3.5 by the PolyPhen server. Three nsSNPs, namely rs1059454, rs17853657 and rs17857117, were found to be highly polymorphic with a risk score of 3-4 with a possible effect of non-conservative change and splicing regulation by FASTSNP. Conclusions Three nsSNPs, namely rs1059454, rs17853657 and rs17857117, are potential functional polymorphisms that are likely to have a functional impact on the COL1A1 gene. PMID:24273577
Genetic polymorphisms associated with breast cancer in Malaysian cohort.
Chahil, Jagdish Kaur; Munretnam, Khamsigan; Samsudin, Nurulhafizah; Lye, Say Hean; Hashim, Nikman Adli Nor; Ramzi, Nurul Hanis; Velapasamy, Sharmila; Wee, Ler Lian; Alex, Livy
2015-04-01
Genome-wide association studies have discovered multiple single nucleotide polymorphisms (SNPs) associated with the risk of common diseases. The objective of this study was to demonstrate the replication of previously published SNPs that showed statistical significance for breast cancer in the Malaysian population. In this case-control study, 80 subjects for each group were recruited from various hospitals in Malaysia. A total of 768 SNPs were genotyped and analyzed to distinguish risk and protective alleles. Three SNPs were found to be associated with increased risk of breast cancer, while six SNPs showed a protective effect. All nine SNPs were statistically significant (p ≤ 0.01), and five SNPs from previous studies were successfully replicated in our study. Significant modifiable (diet) and non-modifiable (family history of breast cancer in a first-degree relative) risk factors were also observed. We identified nine SNPs in this study that confer either susceptibility to or protection from breast cancer and that may serve as potential markers in risk prediction.
Genomic Regions Associated with Root Traits under Drought Stress in Tropical Maize (Zea mays L.)
Zaidi, P. H.; Krishna, Girish; Krishnamurthy, L.; Gajanan, S.; Babu, Raman; Zerka, M.; Vinayan, M. T.; Vivek, B. S.
2016-01-01
An association mapping panel, named the CIMMYT Asia association mapping (CAAM) panel and comprising 396 diverse tropical maize lines, was phenotyped for various structural and functional traits of roots under drought and well-watered conditions. The experiment was conducted during the Kharif (summer-rainy) seasons of 2012 and 2013 in the root phenotyping facility at CIMMYT-Hyderabad, India. The CAAM panel was genotyped to generate 955,690 SNPs through GBS v2.7 using Illumina Hi-seq 2000/2500 at the Institute for Genomic Diversity, Cornell University, Ithaca, NY, USA. GWAS analysis using 331,390 SNPs filtered from the entire set revealed a total of 50 and 67 SNPs significantly associated with root functional traits (transpiration efficiency, flowering period water use) and structural traits (rooting depth, root dry weight, root length, root volume, root surface area and root length density), respectively. In addition, 37 SNPs were identified for grain yield and shoot biomass under well-watered and drought-stress conditions. Although many SNPs were significantly associated with the traits under study, only SNPs common to more than one trait are discussed in detail. A total of 18 SNPs were associated with more than one trait, of which 12 were found within or near various gene functional regions. In this study we attempted to identify trait-specific maize lines based on the presence of favorable alleles for the SNPs associated with multiple traits. Two SNPs, S3_128533512 and S7_151238865, were associated with transpiration efficiency, shoot biomass and grain yield under well-watered conditions. Based on the favorable alleles for these SNPs, seven inbred lines were identified. 
Similarly, four lines were identified for transpiration efficiency and shoot biomass under drought stress based on the presence of favorable allele for the common SNPs S1_211520521, S2_20017716, S3_57210184 and S7_130878458 and three lines were identified for flowering period water-use, transpiration efficiency, root dry weight and root volume based on the presence of favorable allele for the common SNPs S3_162065732 and S3_225760139. PMID:27768702
CsSNP: A Web-Based Tool for the Detecting of Comparative Segments SNPs.
Wang, Yi; Wang, Shuangshuang; Zhou, Dongjie; Yang, Shuai; Xu, Yongchao; Yang, Chao; Yang, Long
2016-07-01
SNPs (single nucleotide polymorphisms) are popular markers for the study of genetic diversity, evolution, and other areas. It is therefore necessary to develop a convenient, useful, robust, rapid, and open-source SNP detection tool for all researchers. Because SNP detection requires special software and a series of steps including alignment, detection, analysis and presentation, the study of SNPs is limited for nonprofessional users. CsSNP (Comparative segments SNP, http://biodb.sdau.edu.cn/cssnp/) is a freely available web tool based on the Blat, Blast, and Perl programs that detects comparative segments SNPs and shows detailed information about them. The results are filtered and presented in a statistics figure and a Gbrowse map. The platform contains the reference genomic sequences and coding sequences of 60 plant species, and provides new opportunities for users to detect SNPs easily. CsSNP gives nonprofessional users a convenient tool to find comparative segments SNPs in their own sequences, provides information on and analysis of those SNPs, and displays the data in a dynamic map. It provides a new method to detect SNPs and may accelerate related studies.
Ishida, Kelly; Cipriano, Talita Ferreira; Rocha, Gustavo Miranda; Weissmüller, Gilberto; Gomes, Fabio; Miranda, Kildare; Rozental, Sonia
2013-01-01
The microbial synthesis of nanoparticles is a green chemistry approach that combines nanotechnology and microbial biotechnology. The aim of this study was to obtain silver nanoparticles (SNPs) using aqueous extract from the filamentous fungus Fusarium oxysporum as an alternative to chemical procedures and to evaluate its antifungal activity. SNPs production increased in a concentration-dependent way up to 1 mM silver nitrate until 30 days of reaction. Monodispersed and spherical SNPs were predominantly produced. After 60 days, it was possible to observe degenerated SNPs with an additional needle morphology. The SNPs showed a high antifungal activity against Candida and Cryptococcus, with minimum inhibitory concentration values ≤ 1.68 µg/mL for both genera. Morphological alterations of Cryptococcus neoformans treated with SNPs were observed, such as disruption of the cell wall and cytoplasmic membrane and loss of the cytoplasm content. This work revealed that SNPs can be easily produced by F. oxysporum aqueous extracts and may be a feasible, low-cost, environmentally friendly method for generating stable and uniformly sized SNPs. Finally, we have demonstrated that these SNPs are active against pathogenic fungi, such as Candida and Cryptococcus. PMID:24714966
Ishida, Kelly; Cipriano, Talita Ferreira; Rocha, Gustavo Miranda; Weissmüller, Gilberto; Gomes, Fabio; Miranda, Kildare; Rozental, Sonia
2014-04-01
The microbial synthesis of nanoparticles is a green chemistry approach that combines nanotechnology and microbial biotechnology. The aim of this study was to obtain silver nanoparticles (SNPs) using aqueous extract from the filamentous fungus Fusarium oxysporum as an alternative to chemical procedures and to evaluate its antifungal activity. SNPs production increased in a concentration-dependent way up to 1 mM silver nitrate until 30 days of reaction. Monodispersed and spherical SNPs were predominantly produced. After 60 days, it was possible to observe degenerated SNPs with an additional needle morphology. The SNPs showed a high antifungal activity against Candida and Cryptococcus, with minimum inhibitory concentration values ≤ 1.68 µg/mL for both genera. Morphological alterations of Cryptococcus neoformans treated with SNPs were observed, such as disruption of the cell wall and cytoplasmic membrane and loss of the cytoplasm content. This work revealed that SNPs can be easily produced by F. oxysporum aqueous extracts and may be a feasible, low-cost, environmentally friendly method for generating stable and uniformly sized SNPs. Finally, we have demonstrated that these SNPs are active against pathogenic fungi, such as Candida and Cryptococcus.
Impact of SNPs on Protein Phosphorylation Status in Rice (Oryza sativa L.).
Lin, Shoukai; Chen, Lijuan; Tao, Huan; Huang, Jian; Xu, Chaoqun; Li, Lin; Ma, Shiwei; Tian, Tian; Liu, Wei; Xue, Lichun; Ai, Yufang; He, Huaqin
2016-11-11
Single nucleotide polymorphisms (SNPs) are widely used in functional genomics and genetics research. The high-quality sequence of the rice genome has provided a genome-wide SNP and proteome resource. However, the impact of SNPs on protein phosphorylation status in rice is not fully understood. In this paper, we first updated the rice SNP resource based on the new rice genome Ver. 7.0, then systematically analyzed the potential impact of non-synonymous SNPs (nsSNPs) on protein phosphorylation status. There were 3,897,312 SNPs in the Ver. 7.0 rice genome, of which 9.9% were nsSNPs. In addition, a total of 2,508,261 phosphorylated sites were predicted in the rice proteome. Interestingly, we observed that 150,197 (39.1%) nsSNPs could influence protein phosphorylation status, among which 52.2% might induce changes of protein kinase (PK) types for adjacent phosphorylation sites. We constructed a database, SNP_rice, to deposit the updated rice SNP resource and phosSNPs information. It is freely available to academic researchers at http://bioinformatics.fafu.edu.cn. As a case study, we detected five nsSNPs that potentially influenced the phosphorylation status of heterotrimeric G proteins in rice, indicating that genetic polymorphisms can affect signal transduction by influencing the phosphorylation status of heterotrimeric G proteins. The results of this work could be a useful resource for future experimental identification and provide interesting information for better rice breeding.
Hall, Barry G
2014-01-01
SNP-association studies are a starting point for identifying genes that may be responsible for specific phenotypes, such as disease traits. The vast bulk of tools for SNP-association studies are directed toward SNPs in the human genome, and I am unaware of any tools designed specifically for such studies in bacterial or viral genomes. The PPFS (Predict Phenotypes From SNPs) package described here is an add-on to kSNP , a program that can identify SNPs in a data set of hundreds of microbial genomes. PPFS identifies those SNPs that are non-randomly associated with a phenotype based on the χ² probability, then uses those diagnostic SNPs for two distinct, but related, purposes: (1) to predict the phenotypes of strains whose phenotypes are unknown, and (2) to identify those diagnostic SNPs that are most likely to be causally related to the phenotype. In the example illustrated here, from a set of 68 E. coli genomes, for 67 of which the pathogenicity phenotype was known, there were 418,500 SNPs. Using the phenotypes of 36 of those strains, PPFS identified 207 diagnostic SNPs. The diagnostic SNPs predicted the phenotypes of all of the genomes with 97% accuracy. It then identified 97 SNPs whose probability of being causally related to the pathogenic phenotype was >0.999. In a second example, from a set of 116 E. coli genome sequences, using the phenotypes of 65 strains PPFS identified 101 SNPs that predicted the source host (human or non-human) with 90% accuracy.
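The abstract above describes selecting SNPs that are non-randomly associated with a phenotype based on the χ² probability. A minimal sketch of such a test for one SNP is given below, using a 2×2 table of allele presence against phenotype. The function name, the table layout, and the counts are illustrative assumptions, and the exact calculation used by PPFS is not described in the abstract.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (no continuity correction) for the table

                      phenotype +   phenotype -
        allele present     a             b
        allele absent      c             d

    Returns (statistic, p_value). With 1 degree of freedom the upper-tail
    probability of the chi-square distribution equals erfc(sqrt(x / 2)).
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if min(row1, row2, col1, col2) == 0:
        return 0.0, 1.0  # a degenerate table carries no evidence of association
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical counts: the allele is found in 16 of 18 pathogenic strains
# and in 2 of 18 non-pathogenic strains.
stat, p = chi_square_2x2(16, 2, 2, 16)
print(f"chi-square = {stat:.2f}, p = {p:.2e}")
```

When hundreds of thousands of SNPs are screened in this way, the p-value threshold would need to account for multiple testing; the abstract does not say how PPFS handles this.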
Zhu, Qian-Hao; Spriggs, Andrew; Taylor, Jennifer M.; Llewellyn, Danny; Wilson, Iain
2014-01-01
Varietal single nucleotide polymorphisms (SNPs) are the differences within one of the two subgenomes between different tetraploid cotton varieties and have not been practically used in cotton genetics and breeding because they are difficult to identify due to low genetic diversity and very high sequence identity between homeologous genes in cotton. We have used transcriptome and restriction site−associated DNA sequencing to identify varietal SNPs among 18 G. hirsutum varieties based on the rationale that varietal SNPs can be more confidently called when flanked by subgenome-specific SNPs. Using transcriptome data, we successfully identified 37,413 varietal SNPs and, of these, 22,121 did not have an additional varietal SNP within their 20-bp flanking regions so can be used in most SNP genotyping assays. From restriction site−associated DNA sequencing data, we identified an additional 3090 varietal SNPs between two of the varieties. Of the 1583 successful SNP assays achieved using different genotyping platforms, 1363 were verified. Many of the SNPs behaved as dominant markers because of coamplification from homeologous loci, but the number of SNPs acting as codominant markers increased when one or more subgenome-specific SNP(s) were incorporated in their assay primers, giving them greater utility for breeding applications. A G. hirsutum genetic map with 1244 SNP markers was constructed covering 5557.42 centiMorgan and used to map qualitative and quantitative traits. This collection of G. hirsutum varietal SNPs complements existing intra-specific SNPs and provides the cotton community with a valuable marker resource applicable to genetic analyses and breeding programs. PMID:25106949
HGDP and HapMap Analysis by Ancestry Mapper Reveals Local and Global Population Relationships
Magalhães, Tiago R.; Casey, Jillian P.; Conroy, Judith; Regan, Regina; Fitzpatrick, Darren J.; Shah, Naisha; Sobral, João; Ennis, Sean
2012-01-01
Knowledge of human origins, migrations, and expansions is greatly enhanced by the availability of large datasets of genetic information from different populations and by the development of bioinformatic tools used to analyze the data. We present Ancestry Mapper, which we believe improves on existing methods, for the assignment of genetic ancestry to an individual and to study the relationships between local and global populations. The principal function of the method, named Ancestry Mapper, is to give each individual analyzed a genetic identifier, made up of just 51 genetic coordinates, that corresponds to its relationship to the HGDP reference population. As a consequence, the Ancestry Mapper Id (AMid) has intrinsic biological meaning and provides a tool to measure similarity between world populations. We applied Ancestry Mapper to a dataset comprised of the HGDP and HapMap data. The results show distinctions at the continental level, while simultaneously giving details at the population level. We clustered AMids of HGDP/HapMap and observe a recapitulation of human migrations: for a small number of clusters, individuals are grouped according to continental origins; for a larger number of clusters, regional and population distinctions are evident. Calculating distances between AMids allows us to infer ancestry. The number of coordinates is expandable, increasing the power of Ancestry Mapper. An R package called Ancestry Mapper is available to apply this method to any high density genomic data set. PMID:23189146
Shriner, Daniel; Adeyemo, Adebowale; Gerry, Norman P.; Herbert, Alan; Chen, Guanjie; Doumatey, Ayo; Huang, Hanxia; Zhou, Jie; Christman, Michael F.; Rotimi, Charles N.
2009-01-01
Human height is the prototypical polygenic quantitative trait. Recently, several genetic variants influencing adult height were identified, primarily in individuals of East Asian (Chinese Han or Korean) or European ancestry. Here, we examined 152 genetic variants representing 107 independent loci previously associated with adult height for transferability in a well-powered sample of 1,016 unrelated African Americans. When we tested just the reported variants originally identified as associated with adult height in individuals of East Asian or European ancestry, only 8.3% of these loci transferred (p-values ≤ 0.05 under an additive genetic model with directionally consistent effects) to our African American sample. However, when we comprehensively evaluated all HapMap variants in linkage disequilibrium (r² ≥ 0.3) with the reported variants, the transferability rate increased to 54.1%. The transferability rate was 70.8% for associations originally reported as genome-wide significant and 38.0% for associations originally reported as suggestive. An additional 23 loci were significantly associated but failed to transfer because of directionally inconsistent effects. Six loci were associated with adult height in all three groups. Using differences in linkage disequilibrium patterns between HapMap CEU or CHB reference data and our African American sample, we fine-mapped these six loci, improving both the localization and the annotation of these transferable associations. PMID:20027299
Abbey, Darren; Hickman, Meleah; Gresham, David; Berman, Judith
2011-01-01
Phenotypic diversity can arise rapidly through loss of heterozygosity (LOH) or by the acquisition of copy number variations (CNV) spanning whole chromosomes or shorter contiguous chromosome segments. In Candida albicans, a heterozygous diploid yeast pathogen with no known meiotic cycle, homozygosis and aneuploidy alter clinical characteristics, including drug resistance. Here, we developed a high-resolution microarray that simultaneously detects ∼39,000 single nucleotide polymorphism (SNP) alleles and ∼20,000 copy number variation loci across the C. albicans genome. An important feature of the array analysis is a computational pipeline that determines SNP allele ratios based upon chromosome copy number. Using the array and analysis tools, we constructed a haplotype map (hapmap) of strain SC5314 to assign SNP alleles to specific homologs, and we used it to follow the acquisition of loss of heterozygosity (LOH) and copy number changes in a series of derived laboratory strains. This high-resolution SNP/CGH microarray and the associated hapmap facilitated the phasing of alleles in lab strains and revealed detrimental genome changes that arose frequently during molecular manipulations of laboratory strains. Furthermore, it provided a useful tool for rapid, high-resolution, and cost-effective characterization of changes in allele diversity as well as changes in chromosome copy number in new C. albicans isolates. PMID:22384363
HGDP and HapMap analysis by Ancestry Mapper reveals local and global population relationships.
Magalhães, Tiago R; Casey, Jillian P; Conroy, Judith; Regan, Regina; Fitzpatrick, Darren J; Shah, Naisha; Sobral, João; Ennis, Sean
2012-01-01
Knowledge of human origins, migrations, and expansions is greatly enhanced by the availability of large datasets of genetic information from different populations and by the development of bioinformatic tools used to analyze the data. We present Ancestry Mapper, which we believe improves on existing methods, for the assignment of genetic ancestry to an individual and to study the relationships between local and global populations. The principal function of the method, named Ancestry Mapper, is to give each individual analyzed a genetic identifier, made up of just 51 genetic coordinates, that corresponds to its relationship to the HGDP reference population. As a consequence, the Ancestry Mapper Id (AMid) has intrinsic biological meaning and provides a tool to measure similarity between world populations. We applied Ancestry Mapper to a dataset comprised of the HGDP and HapMap data. The results show distinctions at the continental level, while simultaneously giving details at the population level. We clustered AMids of HGDP/HapMap and observed a recapitulation of human migrations: for a small number of clusters, individuals are grouped according to continental origins; for a larger number of clusters, regional and population distinctions are evident. Calculating distances between AMids allows us to infer ancestry. The number of coordinates is expandable, increasing the power of Ancestry Mapper. An R package called Ancestry Mapper is available to apply this method to any high density genomic data set.
No association of dynamin binding protein (DNMBP) gene SNPs and Alzheimer's disease.
Minster, Ryan L; DeKosky, Steven T; Kamboh, M Ilyas
2008-10-01
A recent scan of single nucleotide polymorphisms (SNPs) on chromosome 10q found significant association of six correlated SNPs with late-onset Alzheimer's disease (AD) among Japanese. We examined the SNP with the highest statistical significance (rs3740058) in a large Caucasian American case-control cohort and the remaining five SNPs in a smaller subset of cases and controls. We observed no association of statistical significance in either the total sample or the APOE*4 non-carriers for any of the SNPs.
NASA Astrophysics Data System (ADS)
Singh, Tej; Shekhawat, Dharmender Singh; Jyoti, Kumari
2018-05-01
The synthesis of silver nanoparticles (SNPs) by chemical and physical methods produces harmful products which may cause various environmental problems; thus, there is an increasing demand to use ecofriendly methods. Therefore, biosynthesis of SNPs using Justicia adhatoda flower extract is demonstrated in the present study. The biosynthesized SNPs were characterized by UV-visible spectroscopy, Fourier transform-infrared spectroscopy (FTIR), transmission electron microscopy (TEM), selected area electron diffraction (SAED) and atomic force microscopy (AFM) analysis. The UV-visible spectrum peaked at 417 nm, corresponding to the plasmon absorbance of SNPs. The TEM and SAED results reveal the crystalline nature of the SNPs. FTIR spectroscopy was used to identify the possible biomolecules responsible for the conversion of silver ions to SNPs. The study concluded that Justicia adhatoda flower extract acts as an excellent reducing agent and that the green synthesized SNPs are safer to the environment.
Evolutionary evidence of the effect of rare variants on disease etiology.
Gorlov, I P; Gorlova, O Y; Frazier, M L; Spitz, M R; Amos, C I
2011-03-01
The common disease/common variant hypothesis has been popular for describing the genetic architecture of common human diseases for several years. According to the originally stated hypothesis, one or a few common genetic variants with a large effect size control the risk of common diseases. A growing body of evidence, however, suggests that rare single-nucleotide polymorphisms (SNPs), i.e. those with a minor allele frequency of less than 5%, are also an important component of the genetic architecture of common human diseases. In this study, we analyzed the relevance of rare SNPs to the risk of common diseases from an evolutionary perspective and found that rare SNPs are more likely than common SNPs to be functional and tend to have a stronger effect size than do common SNPs. This observation, and the fact that most of the SNPs in the human genome are rare, suggests that rare SNPs are a crucial element of the genetic architecture of common human diseases. We propose that the next generation of genomic studies should focus on analyzing rare SNPs. Further, targeting patients with a family history of the disease, an extreme phenotype, or early disease onset may facilitate the detection of risk-associated rare SNPs. © 2010 John Wiley & Sons A/S.
Sabir, Aneela; Shafiq, Muhammad; Islam, Atif; Jabeen, Faiza; Shafeeq, Amir; Ahmad, Adnan; Zahid Butt, Muhammad Taqi; Jacob, Karl I; Jamil, Tahir
2016-01-20
A thermally induced phase separation (TIPS) method was used to synthesize polymer matrix (PM) membranes for reverse osmosis from cellulose acetate/polyethylene glycol (CA/PEG300) conjugated with silica nanoparticles (SNPs). Experimental data showed that the conjugation of SNPs changed the surface properties, giving a dense and asymmetric composite structure. The results were explicitly determined by the permeability flux and salt rejection efficiency of the PM-SNPs membranes. The effect of SNPs conjugation on MgSO4 salt rejection was more significant in magnitude than on permeation flux, i.e. 2.38 L/m²h. FTIR verified that SNPs were successfully conjugated on the surface of the PM membrane. DSC of PM-SNPs shows an improved Tg, from 76.2 to 101.8 °C for PM and PM-S4, respectively. Thermal stability of the PM-SNPs membranes was observed by TGA and was significantly enhanced by the conjugation of SNPs. The SEM and AFM micrographs showed the morphological changes and an increase in the valleys and ridges on the membrane surface. Experimental data showed that the PM-S4 (0.4 wt% SNPs) membrane has maximum salt rejection capacity, and it was selected as the optimal membrane. Copyright © 2015 Elsevier Ltd. All rights reserved.
Vallée Marcotte, Bastien; Cormier, Hubert; Guénard, Frédéric; Rudkowska, Iwona; Lemieux, Simone; Couture, Patrick; Vohl, Marie-Claude
2016-01-01
A recent genome-wide association study (GWAS) by our group identified 13 loci associated with the plasma triglyceride (TG) response to omega-3 (n-3) fatty acid (FA) supplementation. This study aimed to test whether single-nucleotide polymorphisms (SNPs) within the IQCJ, NXPH1, PHF17 and MYB genes are associated with the plasma TG response to an n-3 FA supplementation. A total of 208 subjects followed a 6-week n-3 FA supplementation of 5 g/day of fish oil (1.9-2.2 g of eicosapentaenoic acid and 1.1 g of docosahexaenoic acid). Measurements of plasma lipids were made before and after the supplementation. Sixty-seven tagged SNPs were selected to increase the density of markers near GWAS hits. In a repeated model, independent effects of the genotype and the gene-supplementation interaction were associated with plasma TG. Genotype effects were observed with two SNPs of NXPH1, and gene-diet interactions were observed with ten SNPs of IQCJ, four SNPs of NXPH1 and three SNPs of MYB. Positive and negative responders showed different genotype frequencies with nine SNPs of IQCJ, two SNPs of NXPH1 and two SNPs of MYB. Fine mapping in GWAS-associated loci allowed the identification of SNPs partly explaining the large interindividual variability observed in plasma TG levels in response to an n-3 FA supplementation. © 2016 S. Karger AG, Basel.
Linkage Disequilibrium and Inversion-Typing of the Drosophila melanogaster Genome Reference Panel
Houle, David; Márquez, Eladio J.
2015-01-01
We calculated the linkage disequilibrium between all pairs of variants in the Drosophila Genome Reference Panel with minor allele count ≥5. We used r² ≥ 0.5 as the cutoff for a highly correlated SNP. We make available the list of all highly correlated SNPs for use in association studies. Seventy-six percent of variant SNPs are highly correlated with at least one other SNP, and the mean number of highly correlated SNPs per variant over the whole genome is 83.9. Disequilibrium between distant SNPs is also common when minor allele frequency (MAF) is low: 37% of SNPs with MAF < 0.1 are highly correlated with SNPs more than 100 kb distant. Although SNPs within regions with polymorphic inversions are highly correlated with somewhat larger numbers of SNPs, and these correlated SNPs are on average farther away, the probability that a SNP in such regions is highly correlated with at least one other SNP is very similar to SNPs outside inversions. Previous karyotyping of the DGRP lines has been inconsistent, and we used LD and genotype to investigate these discrepancies. When previous studies agreed on inversion karyotype, our analysis was almost perfectly concordant with those assignments. In discordant cases, and for inversion heterozygotes, our results suggest errors in two previous analyses or discordance between genotype and karyotype. Heterozygosities of chromosome arms are, in many cases, surprisingly highly correlated, suggesting strong epistatic selection during the inbreeding and maintenance of the DGRP lines. PMID:26068573
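The r² cutoff described in this abstract can be illustrated with a minimal sketch. It assumes each line is fully inbred, so every site can be coded as 0 or 1 per line; the function names `r_squared` and `highly_correlated` are hypothetical and are not taken from the authors' pipeline.

```python
def r_squared(a, b):
    """Squared correlation (r^2) between two biallelic sites.

    a, b: equal-length sequences of 0/1 allele codes, one entry per
    inbred line (an assumption made for this illustration).
    """
    if len(a) != len(b) or not a:
        raise ValueError("allele vectors must be non-empty and of equal length")
    n = len(a)
    p_a = sum(a) / n                      # frequency of allele 1 at site A
    p_b = sum(b) / n                      # frequency of allele 1 at site B
    p_ab = sum(1 for x, y in zip(a, b) if x == 1 and y == 1) / n
    d = p_ab - p_a * p_b                  # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return float("nan")               # undefined for a monomorphic site
    return d * d / denom


def highly_correlated(r2, cutoff=0.5):
    """Apply the r^2 >= 0.5 cutoff quoted in the abstract."""
    return r2 >= cutoff
```

For example, two sites with identical allele vectors give r² = 1 and pass the cutoff, while two statistically independent sites give r² = 0.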
Bipartite Community Structure of eQTLs.
Platig, John; Castaldi, Peter J; DeMeo, Dawn; Quackenbush, John
2016-09-01
Genome Wide Association Studies (GWAS) and expression quantitative trait locus (eQTL) analyses have identified genetic associations with a wide range of human phenotypes. However, many of these variants have weak effects and understanding their combined effect remains a challenge. One hypothesis is that multiple SNPs interact in complex networks to influence functional processes that ultimately lead to complex phenotypes, including disease states. Here we present CONDOR, a method that represents both cis- and trans-acting SNPs and the genes with which they are associated as a bipartite graph and then uses the modular structure of that graph to place SNPs into a functional context. In applying CONDOR to eQTLs in chronic obstructive pulmonary disease (COPD), we found the global network "hub" SNPs were devoid of disease associations through GWAS. However, the network was organized into 52 communities of SNPs and genes, many of which were enriched for genes in specific functional classes. We identified local hubs within each community ("core SNPs") and these were enriched for GWAS SNPs for COPD and many other diseases. These results speak to our intuition: rather than single SNPs influencing single genes, we see groups of SNPs associated with the expression of families of functionally related genes and that disease SNPs are associated with the perturbation of those functions. These methods are not limited in their application to COPD and can be used in the analysis of a wide variety of disease processes and other phenotypic traits.
Screening and Evaluation of Deleterious SNPs in APOE Gene of Alzheimer's Disease.
Masoodi, Tariq Ahmad; Al Shammari, Sulaiman A; Al-Muammar, May N; Alhamdan, Adel A
2012-01-01
Introduction. Apolipoprotein E (APOE) is an important risk factor for Alzheimer's disease (AD) and is present in 30-50% of patients who develop late-onset AD. Several single-nucleotide polymorphisms (SNPs) are present in the APOE gene, which act as biomarkers for exploring the genetic basis of this disease. The objective of this study is to identify deleterious nsSNPs associated with the APOE gene. Methods. The SNPs were retrieved from dbSNP. Using I-Mutant, protein stability change was calculated. The potentially functional nonsynonymous (ns) SNPs and their effect on the protein were predicted by PolyPhen and SIFT, respectively. FASTSNP was used for functional analysis and estimation of risk score. The functional impact on the APOE protein was evaluated by using Swiss PDB viewer and NOMAD-Ref server. Results. Six nsSNPs were found to be least stable by I-Mutant 2.0 with DDG value of >-1.0. Four nsSNPs showed a highly deleterious tolerance index score of 0.00. Nine nsSNPs were found to be probably damaging with position-specific independent counts (PSICs) score of ≥2.0. Seven nsSNPs were found to be highly polymorphic with a risk score of 3-4. The total energies and root-mean-square deviation (RMSD) values were higher for three mutant-type structures compared to the native modeled structure. Conclusion. We concluded that three nsSNPs, namely rs11542041, rs11542040, and rs11542034, are potentially functional polymorphisms.
Linkage Disequilibrium and Inversion-Typing of the Drosophila melanogaster Genome Reference Panel.
Houle, David; Márquez, Eladio J
2015-06-10
We calculated the linkage disequilibrium between all pairs of variants in the Drosophila Genome Reference Panel with minor allele count ≥5. We used r² ≥ 0.5 as the cutoff for a highly correlated SNP. We make available the list of all highly correlated SNPs for use in association studies. Seventy-six percent of variant SNPs are highly correlated with at least one other SNP, and the mean number of highly correlated SNPs per variant over the whole genome is 83.9. Disequilibrium between distant SNPs is also common when minor allele frequency (MAF) is low: 37% of SNPs with MAF < 0.1 are highly correlated with SNPs more than 100 kb distant. Although SNPs within regions with polymorphic inversions are highly correlated with somewhat larger numbers of SNPs, and these correlated SNPs are on average farther away, the probability that a SNP in such regions is highly correlated with at least one other SNP is very similar to SNPs outside inversions. Previous karyotyping of the DGRP lines has been inconsistent, and we used LD and genotype to investigate these discrepancies. When previous studies agreed on inversion karyotype, our analysis was almost perfectly concordant with those assignments. In discordant cases, and for inversion heterozygotes, our results suggest errors in two previous analyses or discordance between genotype and karyotype. Heterozygosities of chromosome arms are, in many cases, surprisingly highly correlated, suggesting strong epistatic selection during the inbreeding and maintenance of the DGRP lines. Copyright © 2015 Houle and Márquez.
Tan, Xiang-Lin; Moyer, Ann M.; Fridley, Brooke L.; Schaid, Daniel J.; Niu, Nifang; Batzler, Anthony J.; Jenkins, Gregory D.; Abo, Ryan P.; Li, Liang; Cunningham, Julie M.; Sun, Zhifu; Yang, Ping; Wang, Liewei
2011-01-01
Purpose Inherited variability in the prognosis of lung cancer patients treated with platinum-based chemotherapy has been widely investigated. However, the overall contribution of genetic variation to platinum response is not well established. To identify novel candidate SNPs/genes, we performed a genome-wide association study (GWAS) for cisplatin cytotoxicity using lymphoblastoid cell lines (LCLs), followed by an association study of selected SNPs from the GWAS with overall survival (OS) in lung cancer patients. Experimental Design GWAS for cisplatin were performed with 283 ethnically diverse LCLs. 168 top SNPs were genotyped in 222 small cell and 961 non-small cell lung cancer (SCLC, NSCLC) patients treated with platinum-based therapy. Association of the SNPs with OS was determined using the Cox regression model. Selected candidate genes were functionally validated by siRNA knockdown in human lung cancer cells. Results Among 157 successfully genotyped SNPs, 9 and 10 SNPs were top SNPs associated with OS for patients with NSCLC and SCLC, respectively, although they were not significant after adjusting for multiple testing. Fifteen genes, including 7 located within 200 kb up or downstream of the four top SNPs and 8 genes for which expression was correlated with three SNPs in LCLs were selected for siRNA screening. Knockdown of DAPK3 and METTL6, for which expression levels were correlated with the rs11169748 and rs2440915 SNPs, significantly decreased cisplatin sensitivity in lung cancer cells. Conclusions This series of clinical and complementary laboratory-based functional studies identified several candidate genes/SNPs that might help predict treatment outcomes for platinum-based therapy of lung cancer. PMID:21775533
Geographic differences in allele frequencies of susceptibility SNPs for cardiovascular disease
2011-01-01
Background We hypothesized that the frequencies of risk alleles of SNPs mediating susceptibility to cardiovascular diseases differ among populations of varying geographic origin and that population-specific selection has operated on some of these variants. Methods From the database of genome-wide association studies (GWAS), we selected 36 cardiovascular phenotypes including coronary heart disease, hypertension, and stroke, as well as related quantitative traits (eg, body mass index and plasma lipid levels). We identified 292 SNPs in 270 genes associated with a disease or trait at P < 5 × 10⁻⁸. As part of the Human Genome-Diversity Project (HGDP), 158 (54.1%) of these SNPs have been genotyped in 938 individuals belonging to 52 populations from seven geographic areas. A measure of population differentiation, FST, was calculated to quantify differences in risk allele frequencies (RAFs) among populations and geographic areas. Results Large differences in RAFs were noted in populations of Africa, East Asia, America and Oceania, when compared with other geographic regions. The mean global FST (0.1042) for 158 SNPs among the populations was not significantly higher than the mean global FST of 158 autosomal SNPs randomly sampled from the HGDP database. Significantly higher global FST (P < 0.05) was noted in eight SNPs, based on an empirical distribution of global FST of 2036 putatively neutral SNPs. For four of these SNPs, additional evidence of selection was noted based on the integrated Haplotype Score. Conclusion Large differences in RAFs for a set of common SNPs that influence risk of cardiovascular disease were noted between the major world populations. Pairwise comparisons revealed RAF differences for at least eight SNPs that might be due to population-specific selection or demographic factors. These findings are relevant to a better understanding of geographic variation in the prevalence of cardiovascular disease. PMID:21507254
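As a rough illustration of how FST summarizes differences in risk allele frequencies, the sketch below uses the simple heterozygosity-based form FST = (HT − HS) / HT with equally weighted populations. This is an assumption for illustration only: the abstract does not state which FST estimator the study used, and the function name `fst_from_frequencies` is hypothetical.

```python
def fst_from_frequencies(risk_allele_freqs):
    """Heterozygosity-based FST for one biallelic SNP.

    risk_allele_freqs: risk allele frequency in each population,
    with all populations weighted equally (illustrative assumption).
    """
    if not risk_allele_freqs:
        raise ValueError("need at least one population")
    k = len(risk_allele_freqs)
    # Mean expected heterozygosity within populations (HS).
    h_s = sum(2 * p * (1 - p) for p in risk_allele_freqs) / k
    # Expected heterozygosity of the pooled population (HT).
    p_bar = sum(risk_allele_freqs) / k
    h_t = 2 * p_bar * (1 - p_bar)
    if h_t == 0:
        return float("nan")  # allele fixed or absent everywhere
    return (h_t - h_s) / h_t
```

With frequencies of 0.2 and 0.8 in two populations, HS = 0.32 and HT = 0.5, giving FST = 0.36; identical frequencies give FST = 0.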
Lepoittevin, Camille; Frigerio, Jean-Marc; Garnier-Géré, Pauline; Salin, Franck; Cervera, María-Teresa; Vornam, Barbara; Harvengt, Luc; Plomion, Christophe
2010-01-01
Background There is considerable interest in the high-throughput discovery and genotyping of single nucleotide polymorphisms (SNPs) to accelerate genetic mapping and enable association studies. This study provides an assessment of EST-derived and resequencing-derived SNP quality in maritime pine (Pinus pinaster Ait.), a conifer characterized by a huge genome size (∼23.8 Gb/C). Methodology/Principal Findings A 384-SNPs GoldenGate genotyping array was built from i/ 184 SNPs originally detected in a set of 40 re-sequenced candidate genes (in vitro SNPs), chosen on the basis of functionality scores, presence of neighboring polymorphisms, minor allele frequencies and linkage disequilibrium and ii/ 200 SNPs screened from ESTs (in silico SNPs) selected based on the number of ESTs used for SNP detection, the SNP minor allele frequency and the quality of SNP flanking sequences. The global success rate of the assay was 66.9%, and a conversion rate (considering only polymorphic SNPs) of 51% was achieved. In vitro SNPs showed significantly higher genotyping-success and conversion rates than in silico SNPs (+11.5% and +18.5%, respectively). The reproducibility was 100%, and the genotyping error rate very low (0.54%, dropping down to 0.06% when removing four SNPs showing elevated error rates). Conclusions/Significance This study demonstrates that ESTs provide a resource for SNP identification in non-model species, which do not require any additional bench work and little bio-informatics analysis. However, the time and cost benefits of in silico SNPs are counterbalanced by a lower conversion rate than in vitro SNPs. This drawback is acceptable for population-based experiments, but could be dramatic in experiments involving samples from narrow genetic backgrounds. 
In addition, we showed that both the visual inspection of genotyping clusters and the estimation of a per SNP error rate should help identify markers that are not suitable to the GoldenGate technology in species characterized by a large and complex genome. PMID:20543950
Hüls, Anke; Ickstadt, Katja; Schikowski, Tamara; Krämer, Ursula
2017-06-12
In the analysis of gene-environment (GxE) interactions, single nucleotide polymorphisms (SNPs) are commonly used to characterize genetic susceptibility, an approach that mostly lacks power and has poor reproducibility. One promising approach to overcome this problem might be the use of weighted genetic risk scores (GRS), which are defined as weighted sums of risk alleles of gene variants. The gold standard is to use external weights from published meta-analyses. In this study, we used internal weights from the marginal genetic effects of the SNPs estimated by a multivariate elastic net regression and thereby provided a method that can be used if there are no external weights available. We conducted a simulation study for the detection of GxE interactions and compared power and type I error of single-SNP analyses with Bonferroni correction and corresponding analysis with unweighted and our weighted GRS approach in scenarios with six risk SNPs and an increasing number of highly correlated (up to 210) and noise SNPs (up to 840). Applying weighted GRS increased the power enormously in comparison to the common single-SNP approach (e.g. 94.2% vs. 35.4%, respectively, to detect a weak interaction with an OR ≈ 1.04 for six uncorrelated risk SNPs and n = 700 with a well-controlled type I error). Furthermore, weighted GRS outperformed the unweighted GRS, in particular in the presence of SNPs without any effect on the phenotype (e.g. 90.1% vs. 43.9%, respectively, when 20 noise SNPs were added to the six risk SNPs). This advantage of the weighted GRS was confirmed in a real data application on lung inflammation in the SALIA cohort (n = 402). However, in scenarios with a high number of noise SNPs (>200 vs. 6 risk SNPs), larger sample sizes are needed to avoid an increased type I error, whereas a high number of correlated SNPs can be handled even in small samples (e.g. n = 400). 
In conclusion, weighted GRS with weights from the marginal genetic effects of the SNPs estimated by a multivariate elastic net regression were shown to be a powerful tool to detect gene-environment interactions in scenarios of high linkage disequilibrium and noise.
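The definition of a genetic risk score as a (weighted) sum of risk alleles can be sketched as follows. This is only an illustration of the score itself: the weights passed in are placeholders, the elastic net step used in the study to estimate them is not reproduced, and the function name `genetic_risk_score` is hypothetical.

```python
def genetic_risk_score(risk_allele_counts, weights=None):
    """Genetic risk score for one individual.

    risk_allele_counts: number of risk alleles (0, 1 or 2) at each SNP.
    weights: optional per-SNP weights; if omitted, the unweighted score
    (a plain count of risk alleles) is returned.
    """
    if weights is None:
        return float(sum(risk_allele_counts))
    if len(weights) != len(risk_allele_counts):
        raise ValueError("one weight is required per SNP")
    return sum(w * g for w, g in zip(weights, risk_allele_counts))
```

A SNP with weight 0 contributes nothing to the weighted score, which is one intuitive reason a weighted score can be less affected by noise SNPs than the unweighted count.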
Chatsuriyawong, Siriporn; Gozal, David; Kheirandish-Gozal, Leila; Bhattacharjee, Rakesh; Khalyfa, Ahamed A; Wang, Yang; Sukhumsirichart, Wasana; Khalyfa, Abdelnaby
2013-09-06
Obstructive sleep apnea (OSA) is associated with adverse and interdependent cognitive and cardiovascular consequences. Increasing evidence suggests that nitric oxide synthase (NOS) and endothelin family (EDN) genes underlie mechanistic aspects of OSA-associated morbidities. We aimed to identify single nucleotide polymorphisms (SNPs) in the NOS family (3 isoforms) and EDN family (3 isoforms) and to identify potential associations of these SNPs in children with OSA. A pediatric community cohort (ages 5-10 years) enriched for snoring underwent overnight polysomnography (NPSG) and a fasting morning blood draw. The diagnostic criteria for OSA were an obstructive apnea-hypopnea index (AHI) >2/h total sleep time (TST), snoring during the night, and a nadir oxyhemoglobin saturation <92%. Control children were defined as non-snoring children with AHI <2/h TST (NOSA). Endothelial function was assessed using a modified post-occlusive hyperemic test. The time to peak reperfusion (Tmax) was considered as the indicator for normal endothelial function (NEF; Tmax<45 sec), or ED (Tmax ≥ 45 sec). Genomic DNA from peripheral blood was extracted and allelic frequencies were assessed for NOS1 (209 SNPs), NOS2 (122 SNPs), NOS3 (50 SNPs), EDN1 (43 SNPs), EDN2 (48 SNPs), EDN3 (14 SNPs), endothelin receptor A, EDNRA, (27 SNPs), and endothelin receptor B, EDNRB (23 SNPs) using a custom SNPs array. The relative frequencies of NOS-1,-2, and -3, and EDN-1,-2,-3,-EDNRA, and-EDNRB genotypes were evaluated in 608 subjects [128 with OSA, and 480 without OSA (NOSA)]. Furthermore, subjects with OSA were divided into 2 subgroups: OSA with normal endothelial function (OSA-NEF), and OSA with endothelial dysfunction (OSA-ED). Linkage disequilibrium was analyzed using Haploview version 4.2 software. For NOSA vs. OSA groups, 15 differentially distributed SNPs for NOS1 gene, and 1 SNP for NOS3 emerged, while 4 SNPs for EDN1 and 1 SNP for both EDN2 and EDN3 were identified. 
However, in the smaller sub-group for whom endothelial function was available, none of the significant SNPs was retained due to lack of statistical power. Differences in the distribution of polymorphisms among NOS and EDN gene families suggest that these SNPs could play a contributory role in the pathophysiology and risk of OSA-induced cardiovascular morbidity. Thus, analysis of genotype-phenotype interactions in children with OSA may assist in the formulation of categorical risk estimates.
Yu, Yang; Wei, Jiankai; Zhang, Xiaojun; Liu, Jingwen; Liu, Chengzhang; Li, Fuhua; Xiang, Jianhai
2014-01-01
The application of next generation sequencing technology has greatly facilitated high throughput single nucleotide polymorphism (SNP) discovery and genotyping in genetic research. In the present study, SNPs were discovered based on two transcriptomes of Litopenaeus vannamei (L. vannamei) generated from Illumina sequencing platform HiSeq 2000. One transcriptome of L. vannamei was obtained through sequencing on the RNA from larvae at mysis stage and its reference sequence was de novo assembled. The data from another transcriptome were downloaded from NCBI and the reads of the two transcriptomes were mapped separately to the assembled reference by BWA. SNP calling was performed using SAMtools. A total of 58,717 and 36,277 SNPs with high quality were predicted from the two transcriptomes, respectively. SNP calling was also performed using the reads of two transcriptomes together, and a total of 96,040 SNPs with high quality were predicted. Among these 96,040 SNPs, 5,242 and 29,129 were predicted as non-synonymous and synonymous SNPs respectively. Characterization analysis of the predicted SNPs in L. vannamei showed that the estimated SNP frequency was 0.21% (one SNP per 476 bp) and the estimated ratio for transition to transversion was 2.0. Fifty SNPs were randomly selected for validation by Sanger sequencing after PCR amplification and 76% of SNPs were confirmed, which indicated that the SNPs predicted in this study were reliable. These SNPs will be very useful for genetic study in L. vannamei, especially for the high density linkage map construction and genome-wide association studies. PMID:24498047
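Two of the summary statistics quoted in this abstract are simple to reproduce: one SNP per 476 bp corresponds to 1/476 ≈ 0.21% of positions, and the transition-to-transversion ratio follows from classifying each substitution by base type. The sketch below is illustrative only; the function names are hypothetical and the input format (reference base, alternate base) is an assumption, not the authors' SAMtools workflow.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def is_transition(ref, alt):
    """True for purine<->purine or pyrimidine<->pyrimidine changes."""
    ref, alt = ref.upper(), alt.upper()
    if ref == alt:
        raise ValueError("ref and alt must differ")
    return {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES


def ts_tv_ratio(substitutions):
    """Transition/transversion ratio for (ref, alt) base pairs."""
    ts = sum(1 for ref, alt in substitutions if is_transition(ref, alt))
    tv = len(substitutions) - ts
    if tv == 0:
        return float("inf")  # no transversions observed
    return ts / tv


def snp_frequency_percent(bp_per_snp):
    """Convert 'one SNP per N bp' into a percentage of positions."""
    return 100.0 / bp_per_snp
```

Here `snp_frequency_percent(476)` rounds to 0.21, matching the frequency reported in the abstract.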
USDA-ARS's Scientific Manuscript database
The dissection of complex traits of economic importance for the pig industry requires the availability of a significant number of genetic markers, such as SNPs. This study was conducted in order to discover thousands of porcine SNPs using next generation sequencing technologies and use those SNPs, a...
Rasulov, Bakhtiyor; Rustamova, Nigora; Yili, Abulimiti; Zhao, Hai-Qing; Aisa, Haji A
2016-07-01
Silver nanoparticles (SNPs) were synthesized on the basis of exopolysaccharides (low and high molar mass) of diazotrophic Bradyrhizobium japonicum 36 strain. The synthesis of SNPs was carried out by direct reduction of silver nitrate with ethanol-insoluble (high molar mass, HMW) and ethanol-soluble (low molar mass, LMW) fractions of exopolysaccharides (EPS), produced by diazotrophic strain B. japonicum 36. SNPs were characterized using UV-vis spectroscopy, transmission electron microscopy (TEM), X-ray diffraction (XRD), and Fourier transform infrared spectroscopy (FTIR). SNPs synthesized on the basis of LMW EPS absorbed radiation in the visible region at 420 nm, whereas SNPs based on the HMW EPS have a wavelength maximum at 450 nm because of the strong SPR transition. Moreover, the antibacterial and antifungal activities of the SNPs were examined in vitro against Escherichia coli, Staphylococcus aureus, and Candida albicans. SNPs synthesized on the basis of LMW EPS were more active than those synthesized on the basis of HMW EPS. Besides, UV-visible spectroscopic evaluation confirmed that SNPs synthesized on the basis of LMW EPS were far more stable than those obtained on the basis of HMW EPS.
A periodic pattern of SNPs in the human genome
Madsen, Bo Eskerod; Villesen, Palle; Wiuf, Carsten
2007-01-01
By surveying a filtered, high-quality set of SNPs in the human genome, we have found that SNPs positioned 1, 2, 4, 6, or 8 bp apart are more frequent than SNPs positioned 3, 5, 7, or 9 bp apart. The observed pattern is not restricted to genomic regions that are known to cause sequencing or alignment errors, for example, transposable elements (SINE, LINE, and LTR), tandem repeats, and large duplicated regions. However, we found that the pattern is almost entirely confined to what we define as “periodic DNA.” Periodic DNA is a genomic region with a high degree of periodicity in nucleotide usage. It turned out that periodic DNA is mainly small regions (average length 16.9 bp), widely distributed in the genome. Furthermore, periodic DNA has a 1.8 times higher SNP density than the rest of the genome and SNPs inside periodic DNA have a significantly higher genotyping error rate than SNPs outside periodic DNA. Our results suggest that not all SNPs in the human genome are created by independent single nucleotide mutations, and that care should be taken in analysis of SNPs from periodic DNA. The latter may have important consequences for SNP and association studies. PMID:17673700
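The kind of spacing survey described in this abstract can be sketched by tabulating the distances between neighbouring SNPs on one chromosome. Treating only adjacent SNP pairs is an assumption made for this illustration, since the abstract does not say whether all pairs or only neighbours were counted, and the function name `spacing_counts` is hypothetical.

```python
from collections import Counter


def spacing_counts(positions, max_distance=9):
    """Count distances (in bp) between neighbouring SNP positions.

    positions: SNP coordinates on a single chromosome, in any order.
    Only distances from 1 to max_distance are tallied.
    """
    ordered = sorted(set(positions))
    counts = Counter()
    for left, right in zip(ordered, ordered[1:]):
        distance = right - left
        if distance <= max_distance:
            counts[distance] += 1
    return counts
```

Comparing the tallies for distances 1, 2, 4, 6 and 8 against those for 3, 5, 7 and 9 would show whether a given SNP set has the periodic excess the abstract reports.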
Inferring Alcoholism SNPs and Regulatory Chemical Compounds Based on Ensemble Bayesian Network.
Chen, Huan; Sun, Jiatong; Jiang, Hong; Wang, Xianyue; Wu, Lingxiang; Wu, Wei; Wang, Qh
2017-01-01
The disturbance of consciousness is one of the most common symptoms in those who have alcoholism and may cause disability and mortality. Previous studies indicated that several single nucleotide polymorphisms (SNPs) increase the susceptibility to alcoholism. In this study, we utilized the Ensemble Bayesian Network (EBN) method to identify causal SNPs of alcoholism based on the verified GAW14 data. We built a Bayesian network combining random process and greedy search by using the Genetic Analysis Workshop 14 (GAW14) dataset to establish an EBN of SNPs. Then we predicted the association between SNPs and alcoholism by determining Bayes' prior probability. Thirteen out of eighteen SNPs directly connected with alcoholism were found to be concordant with potential risk regions of alcoholism in the OMIM database. As many SNPs were found to contribute to alterations in gene expression, known as expression quantitative trait loci (eQTLs), we further sought to identify chemical compounds acting as regulators of alcoholism genes captured by causal SNPs. Chloroprene and valproic acid were identified as the expression regulators for genes C11orf66 and SALL3, respectively, which were captured by alcoholism SNPs. Copyright© Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.
Qiu, Chao; Chang, Ranran; Yang, Jie; Ge, Shengju; Xiong, Liu; Zhao, Mei; Li, Man; Sun, Qingjie
2017-04-15
Essential oils (EOs), including menthone, oregano, cinnamon, lavender, and citral, are natural products that have antimicrobial and antioxidant activities. However, extremely low water solubility and easy degradation by heat restrict their application. The aim of this work was to evaluate the enhancement in antioxidative and antimicrobial activities of EOs encapsulated in starch nanoparticles (SNPs) prepared by short glucan chains. For the first time, we have successfully fabricated menthone-loaded SNPs (SNPs-M) at different complexation temperatures (30, 60, and 90°C) by an in situ nanoprecipitation method. The SNPs-M displayed spherical shapes, and the particle sizes ranged from 93 to 113 nm. The encapsulation efficiency (EE) of SNPs-M increased significantly with an increase in complexation temperature, and the maximum EE was 86.6%. The SNPs-M formed at 90°C had high crystallization and thermal stability. The durations of the antioxidant and antimicrobial activities of EOs were extended by their encapsulation in the SNPs. Copyright © 2016 Elsevier Ltd. All rights reserved.
Heterospecific SNP diversity in humans and rhesus macaque (Macaca mulatta)
Ng, Jillian; Trask, Jessica Satkoski; Smith, David Glenn; Kanthaswamy, Sree
2018-01-01
Background Conservation of single nucleotide polymorphisms (SNPs) between human and other primates (i.e., heterospecific SNPs) in candidate genes can be used to assess the utility of those organisms as models for human biomedical research. Methods 59,691 heterospecific SNPs in 22 rhesus macaques and 20 humans were analyzed for human trait associations and 4,207 heterospecific SNPs biallelic in both taxa were compared for genetic variation. Results Variation comparisons at the 4,207 SNPs showed that humans were more genetically diverse than rhesus macaques with observed and expected heterozygosities of 0.337 and 0.323 versus 0.119 and 0.102, and minor allele frequencies of 0.239 and 0.063, respectively. 431 of the 59,691 heterospecific SNPs are reportedly associated with human-specific traits. Conclusion While comparisons between human and rhesus macaque genomes are plausible, functional studies of heterospecific SNPs are necessary to determine whether rhesus macaque alleles are associated with the same phenotypes as their corresponding human alleles. PMID:25963897
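The diversity measures compared in this abstract (observed and expected heterozygosity, minor allele frequency) can be sketched for a single biallelic SNP as follows. This is a minimal illustration, assuming genotypes are coded as 0, 1, or 2 copies of one allele; the coding, the function name, and the uncorrected expected-heterozygosity formula 2p(1 - p) are assumptions and not necessarily what the authors used.

```python
def diversity_stats(genotypes):
    """Return (observed heterozygosity, expected heterozygosity, MAF).

    `genotypes` holds one value per individual: 0, 1, or 2 copies of
    an arbitrarily chosen allele at a biallelic SNP (assumed coding).
    """
    n = len(genotypes)
    if n == 0:
        raise ValueError("at least one genotype is required")
    # Observed heterozygosity: share of individuals carrying one copy.
    h_obs = sum(1 for g in genotypes if g == 1) / n
    # Allele frequency of the counted allele among 2n chromosomes.
    p = sum(genotypes) / (2 * n)
    # Expected heterozygosity under Hardy-Weinberg proportions
    # (no small-sample correction applied in this sketch).
    h_exp = 2 * p * (1 - p)
    maf = min(p, 1 - p)
    return h_obs, h_exp, maf


# Toy genotypes for four individuals, not real data.
h_obs, h_exp, maf = diversity_stats([0, 0, 1, 0])
```

Averaging these per-SNP values across all compared SNPs would give population-level summaries of the kind reported above.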
Nguyen, Thanh-Tung; Huang, Joshua; Wu, Qingyao; Nguyen, Thuy; Li, Mark
2015-01-01
Single-nucleotide polymorphism (SNP) selection and identification are the most important tasks in genome-wide association data analysis. The problem is difficult because genome-wide association data are very high dimensional and a large portion of the SNPs in the data are irrelevant to the disease. Advanced machine learning methods have been successfully used in genome-wide association studies (GWAS) for identification of genetic variants that have relatively big effects in some common, complex diseases. Among them, the most successful one is Random Forests (RF). Despite performing well in terms of prediction accuracy in some data sets of moderate size, RF still struggles in GWAS when selecting informative SNPs and building accurate prediction models. In this paper, we propose to use a new two-stage quality-based sampling method in random forests, named ts-RF, for SNP subspace selection for GWAS. The method first applies p-value assessment to find a cut-off point that separates informative and irrelevant SNPs into two groups. The informative SNPs group is further divided into two sub-groups: highly informative and weakly informative SNPs. When sampling the SNP subspace for building trees for the forest, only those SNPs from the two sub-groups are taken into account. The feature subspaces always contain highly informative SNPs when used to split a node of a tree. This approach enables one to generate more accurate trees with a lower prediction error, while possibly avoiding overfitting. It allows one to detect interactions of multiple SNPs with the diseases, and to reduce the dimensionality and the amount of genome-wide association data needed for learning the RF model. 
Extensive experiments on two genome-wide SNP data sets (Parkinson case-control data comprising 408,803 SNPs and Alzheimer case-control data comprising 380,157 SNPs) and 10 gene data sets have demonstrated that the proposed model significantly reduced prediction errors and outperformed most existing state-of-the-art random forests. The top 25 SNPs in the Parkinson data set were identified by the proposed model, including four interesting genes associated with neurological disorders. The presented approach has been shown to be effective in selecting informative sub-groups of SNPs potentially associated with diseases that traditional statistical approaches might fail to detect. The new RF works well for data where the number of case-control objects is much smaller than the number of SNPs, which is a typical problem in gene data and GWAS. Experimental results demonstrated the effectiveness of the proposed RF model, which outperformed state-of-the-art RFs, including Breiman's RF, GRRF and wsRF methods.
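The two-stage subspace sampling described above can be sketched roughly as follows. This is not the authors' ts-RF implementation: the function name, both p-value thresholds, and the rule of drawing about half of each subspace from the highly informative group are hypothetical choices made only to show the idea that irrelevant SNPs are excluded and every subspace contains highly informative SNPs.

```python
import random


def sample_snp_subspace(p_values, cutoff=0.05, strong_cutoff=0.001,
                        mtry=4, rng=None):
    """Draw SNP indices for one tree node from the informative SNPs only.

    Thresholds and the half/half split are illustrative assumptions.
    """
    rng = rng or random.Random()
    # Stage 1: discard SNPs whose p-value exceeds the cut-off.
    # Stage 2: split the remainder into two informative sub-groups.
    strong = [i for i, p in enumerate(p_values) if p <= strong_cutoff]
    weak = [i for i, p in enumerate(p_values)
            if strong_cutoff < p <= cutoff]
    if not strong:
        raise ValueError("no highly informative SNPs at this threshold")
    # Always include at least one highly informative SNP.
    n_strong = max(1, min(len(strong), mtry // 2))
    n_weak = min(len(weak), mtry - n_strong)
    return rng.sample(strong, n_strong) + rng.sample(weak, n_weak)


# Toy p-values for seven SNPs, not real data.
toy_p = [0.0001, 0.2, 0.01, 0.0005, 0.03, 0.9, 0.04]
subspace = sample_snp_subspace(toy_p, rng=random.Random(0))
```

In a full random forest, a subspace like this would be drawn afresh at each node before choosing the best split among the sampled SNPs.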
Utilizing the protein corona around silica nanoparticles for dual drug loading and release
NASA Astrophysics Data System (ADS)
Shahabi, Shakiba; Treccani, Laura; Dringen, Ralf; Rezwan, Kurosch
2015-10-01
A protein corona forms spontaneously around silica nanoparticles (SNPs) in serum-containing media. To test whether this protein corona can be utilized for the loading and release of anticancer drugs we incorporated the hydrophilic doxorubicin, the hydrophobic meloxicam as well as their combination in the corona around SNPs. The application of corona-covered SNPs to osteosarcoma cells revealed that drug-free particles did not affect the cell viability. In contrast, SNPs carrying a protein corona with doxorubicin or meloxicam lowered the cell proliferation in a concentration-dependent manner. In addition, these particles had an even greater antiproliferative potential than the respective concentrations of free drugs. The best antiproliferative effects were observed for SNPs containing both doxorubicin and meloxicam in their corona. Co-localization studies revealed the presence of doxorubicin fluorescence in the nucleus and lysosomes of cells exposed to doxorubicin-containing coated SNPs, suggesting that endocytotic uptake of the SNPs facilitates the cellular accumulation of the drug. Our data demonstrate that the protein corona, which spontaneously forms around nanoparticles, can be efficiently exploited for loading the particles with multiple drugs for therapeutic purposes. As drugs are efficiently released from such particles they may have a great potential for nanomedical applications. Electronic supplementary information (ESI) available. See DOI: 10.1039/c5nr04726a
Østergaard, Søren D.; Mukherjee, Shubhabrata; Sharp, Stephen J.; Proitsi, Petroula; Lotta, Luca A.; Day, Felix; Perry, John R. B.; Boehme, Kevin L.; Walter, Stefan; Kauwe, John S.; Gibbons, Laura E.; Larson, Eric B.; Powell, John F.; Langenberg, Claudia; Crane, Paul K.; Wareham, Nicholas J.; Scott, Robert A.
2015-01-01
Background Potentially modifiable risk factors including obesity, diabetes, hypertension, and smoking are associated with Alzheimer disease (AD) and represent promising targets for intervention. However, the causality of these associations is unclear. We sought to assess the causal nature of these associations using Mendelian randomization (MR). Methods and Findings We used SNPs associated with each risk factor as instrumental variables in MR analyses. We considered type 2 diabetes (T2D, N SNPs = 49), fasting glucose (N SNPs = 36), insulin resistance (N SNPs = 10), body mass index (BMI, N SNPs = 32), total cholesterol (N SNPs = 73), HDL-cholesterol (N SNPs = 71), LDL-cholesterol (N SNPs = 57), triglycerides (N SNPs = 39), systolic blood pressure (SBP, N SNPs = 24), smoking initiation (N SNPs = 1), smoking quantity (N SNPs = 3), university completion (N SNPs = 2), and years of education (N SNPs = 1). We calculated MR estimates of associations between each exposure and AD risk using an inverse-variance weighted approach, with summary statistics of SNP–AD associations from the International Genomics of Alzheimer’s Project, comprising a total of 17,008 individuals with AD and 37,154 cognitively normal elderly controls. We found that genetically predicted higher SBP was associated with lower AD risk (odds ratio [OR] per standard deviation [15.4 mm Hg] of SBP [95% CI]: 0.75 [0.62–0.91]; p = 3.4 × 10^−3). Genetically predicted higher SBP was also associated with a higher probability of taking antihypertensive medication (p = 6.7 × 10^−8). Genetically predicted smoking quantity was associated with lower AD risk (OR per ten cigarettes per day [95% CI]: 0.67 [0.51–0.89]; p = 6.5 × 10^−3), although we were unable to stratify by smoking history; genetically predicted smoking initiation was not associated with AD risk (OR = 0.70 [0.37, 1.33]; p = 0.28). 
We saw no evidence of causal associations between glycemic traits, T2D, BMI, or educational attainment and risk of AD (all p > 0.1). Potential limitations of this study include the small proportion of intermediate trait variance explained by genetic variants and other implicit limitations of MR analyses. Conclusions Inherited lifetime exposure to higher SBP is associated with lower AD risk. These findings suggest that higher blood pressure—or some environmental exposure associated with higher blood pressure, such as use of antihypertensive medications—may reduce AD risk. PMID:26079503
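The inverse-variance weighted MR estimate mentioned in this abstract can be illustrated with a short Python sketch. It uses the commonly cited fixed-effect form, in which the estimate is sum(bx * by / se_y^2) / sum(bx^2 / se_y^2) over the instrument SNPs; this specific form, the function name, and the toy numbers are assumptions, since the abstract does not spell out the authors' exact computation.

```python
import math


def ivw_mr_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect inverse-variance weighted MR estimate and its SE.

    beta_exposure: per-SNP effects on the risk factor
    beta_outcome:  per-SNP effects on the outcome (e.g. log odds)
    se_outcome:    standard errors of the outcome effects
    """
    if not (len(beta_exposure) == len(beta_outcome) == len(se_outcome)):
        raise ValueError("inputs must have equal length")
    numerator = sum(bx * by / se ** 2
                    for bx, by, se in zip(beta_exposure, beta_outcome,
                                          se_outcome))
    denominator = sum(bx ** 2 / se ** 2
                      for bx, se in zip(beta_exposure, se_outcome))
    estimate = numerator / denominator
    std_error = math.sqrt(1.0 / denominator)
    return estimate, std_error


# Toy summary statistics for two SNPs, not real data.
estimate, std_error = ivw_mr_estimate([0.1, 0.2], [0.05, 0.1], [0.01, 0.01])
```

For a binary outcome such as AD, exponentiating the estimate would give an odds ratio per unit of the exposure, which is the scale the abstract reports.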
Bioinformatics challenges for genome-wide association studies.
Moore, Jason H; Asselbergs, Folkert W; Williams, Scott M
2010-02-15
The sequencing of the human genome has made it possible to identify an informative set of >1 million single nucleotide polymorphisms (SNPs) across the genome that can be used to carry out genome-wide association studies (GWASs). The availability of massive amounts of GWAS data has necessitated the development of new biostatistical methods for quality control, imputation and analysis issues including multiple testing. This work has been successful and has enabled the discovery of new associations that have been replicated in multiple studies. However, it is now recognized that most SNPs discovered via GWAS have small effects on disease susceptibility and thus may not be suitable for improving health care through genetic testing. One likely explanation for the mixed results of GWAS is that the current biostatistical analysis paradigm is by design agnostic or unbiased in that it ignores all prior knowledge about disease pathobiology. Further, the linear modeling framework that is employed in GWAS often considers only one SNP at a time thus ignoring their genomic and environmental context. There is now a shift away from the biostatistical approach toward a more holistic approach that recognizes the complexity of the genotype-phenotype relationship that is characterized by significant heterogeneity and gene-gene and gene-environment interaction. We argue here that bioinformatics has an important role to play in addressing the complexity of the underlying genetic basis of common human diseases. The goal of this review is to identify and discuss those GWAS challenges that will require computational methods.
Irvin, Marguerite R; Sitlani, Colleen M; Noordam, Raymond; Avery, Christie L; Bis, Joshua C; Floyd, James S; Li, Jin; Limdi, Nita A; Srinivasasainagendra, Vinodh; Stewart, James; de Mutsert, Renée; Mook-Kanamori, Dennis O; Lipovich, Leonard; Kleinbrink, Erica L; Smith, Albert; Bartz, Traci M; Whitsel, Eric A; Uitterlinden, Andre G; Wiggins, Kerri L; Wilson, James G; Zhi, Degui; Stricker, Bruno H; Rotter, Jerome I; Arnett, Donna K; Psaty, Bruce M; Lange, Leslie A
2018-06-01
We evaluated interactions of SNP-by-ACE-I/ARB and SNP-by-TD on serum potassium (K+) among users of antihypertensive treatments (anti-HTN). Our study included seven European-ancestry (EA) (N = 4835) and four African-ancestry (AA) cohorts (N = 2016). We performed race-stratified, fixed-effect, inverse-variance-weighted meta-analyses of 2.5 million SNP-by-drug interaction estimates; race-combined meta-analysis; and trans-ethnic fine-mapping. Among EAs, we identified 11 significant SNPs (P < 5 × 10^-8) for SNP-ACE-I/ARB interactions on serum K+ that were located between NR2F1-AS1 and ARRDC3-AS1 on chromosome 5 (top SNP rs6878413 P = 1.7 × 10^-8; ratio of serum K+ in ACE-I/ARB exposed compared to unexposed is 1.0476, 1.0280, 1.0088 for the TT, AT, and AA genotypes, respectively). Trans-ethnic fine mapping identified the same group of SNPs on chromosome 5 as genome-wide significant for the ACE-I/ARB analysis. In conclusion, SNP-by-ACE-I/ARB interaction analyses uncovered loci that, if replicated, could have future implications for the prevention of arrhythmias due to anti-HTN treatment-related hyperkalemia. Before these loci can be identified as clinically relevant, future validation studies of equal or greater size in comparison to our discovery effort are needed.
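A fixed-effect, inverse-variance-weighted meta-analysis of the kind named in this abstract can be sketched for a single SNP as below. The function name and toy inputs are hypothetical, and the sketch covers only the textbook pooling step (weights equal to 1 / SE^2), not the authors' full pipeline.

```python
import math


def fixed_effect_meta(estimates, std_errors):
    """Pool per-cohort estimates with inverse-variance weights.

    Returns the pooled estimate and its standard error.
    """
    if len(estimates) != len(std_errors) or not estimates:
        raise ValueError("need matching, non-empty inputs")
    weights = [1.0 / se ** 2 for se in std_errors]
    total_weight = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled, pooled_se


# Toy interaction estimates from two cohorts, not real data.
pooled, pooled_se = fixed_effect_meta([1.0, 3.0], [1.0, 1.0])
```

Dividing the pooled estimate by its standard error gives a z statistic, from which a p-value could be compared against a genome-wide threshold such as 5 × 10^-8.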
Weidinger, Stephan; Willis-Owen, Saffron A G; Kamatani, Yoichiro; Baurecht, Hansjörg; Morar, Nilesh; Liang, Liming; Edser, Pauline; Street, Teresa; Rodriguez, Elke; O'Regan, Grainne M; Beattie, Paula; Fölster-Holst, Regina; Franke, Andre; Novak, Natalija; Fahy, Caoimhe M; Winge, Mårten C G; Kabesch, Michael; Illig, Thomas; Heath, Simon; Söderhäll, Cilla; Melén, Erik; Pershagen, Göran; Kere, Juha; Bradley, Maria; Lieden, Agne; Nordenskjold, Magnus; Harper, John I; McLean, W H Irwin; Brown, Sara J; Cookson, William O C; Lathrop, G Mark; Irvine, Alan D; Moffatt, Miriam F
2013-12-01
Atopic dermatitis (AD) is the most common dermatological disease of childhood. Many children with AD have asthma and AD shares regions of genetic linkage with psoriasis, another chronic inflammatory skin disease. We present here a genome-wide association study (GWAS) of childhood-onset AD in 1563 European cases with known asthma status and 4054 European controls. Using Illumina genotyping followed by imputation, we generated 268 034 consensus genotypes and in excess of 2 million single nucleotide polymorphisms (SNPs) for analysis. Association signals were assessed for replication in a second panel of 2286 European cases and 3160 European controls. Four loci achieved genome-wide significance for AD and replicated consistently across all cohorts. These included the epidermal differentiation complex (EDC) on chromosome 1, the genomic region proximal to LRRC32 on chromosome 11, the RAD50/IL13 locus on chromosome 5 and the major histocompatibility complex (MHC) on chromosome 6; reflecting action of classical HLA alleles. We observed variation in the contribution towards co-morbid asthma for these regions of association. We further explored the genetic relationship between AD, asthma and psoriasis by examining previously identified susceptibility SNPs for these diseases. We found considerable overlap between AD and psoriasis together with variable coincidence between allergic rhinitis (AR) and asthma. Our results indicate that the pathogenesis of AD incorporates immune and epidermal barrier defects with combinations of specific and overlapping effects at individual loci.
Weidinger, Stephan; Willis-Owen, Saffron A.G.; Kamatani, Yoichiro; Baurecht, Hansjörg; Morar, Nilesh; Liang, Liming; Edser, Pauline; Street, Teresa; Rodriguez, Elke; O'Regan, Grainne M.; Beattie, Paula; Fölster-Holst, Regina; Franke, Andre; Novak, Natalija; Fahy, Caoimhe M.; Winge, Mårten C.G.; Kabesch, Michael; Illig, Thomas; Heath, Simon; Söderhäll, Cilla; Melén, Erik; Pershagen, Göran; Kere, Juha; Bradley, Maria; Lieden, Agne; Nordenskjold, Magnus; Harper, John I.; Mclean, W.H. Irwin; Brown, Sara J.; Cookson, William O.C.; Lathrop, G. Mark; Irvine, Alan D.; Moffatt, Miriam F.
2013-01-01
Atopic dermatitis (AD) is the most common dermatological disease of childhood. Many children with AD have asthma and AD shares regions of genetic linkage with psoriasis, another chronic inflammatory skin disease. We present here a genome-wide association study (GWAS) of childhood-onset AD in 1563 European cases with known asthma status and 4054 European controls. Using Illumina genotyping followed by imputation, we generated 268 034 consensus genotypes and in excess of 2 million single nucleotide polymorphisms (SNPs) for analysis. Association signals were assessed for replication in a second panel of 2286 European cases and 3160 European controls. Four loci achieved genome-wide significance for AD and replicated consistently across all cohorts. These included the epidermal differentiation complex (EDC) on chromosome 1, the genomic region proximal to LRRC32 on chromosome 11, the RAD50/IL13 locus on chromosome 5 and the major histocompatibility complex (MHC) on chromosome 6; reflecting action of classical HLA alleles. We observed variation in the contribution towards co-morbid asthma for these regions of association. We further explored the genetic relationship between AD, asthma and psoriasis by examining previously identified susceptibility SNPs for these diseases. We found considerable overlap between AD and psoriasis together with variable coincidence between allergic rhinitis (AR) and asthma. Our results indicate that the pathogenesis of AD incorporates immune and epidermal barrier defects with combinations of specific and overlapping effects at individual loci. PMID:23886662
Zhu, Xiang; Stephens, Matthew
2017-01-01
Bayesian methods for large-scale multiple regression provide attractive approaches to the analysis of genome-wide association studies (GWAS). For example, they can estimate heritability of complex traits, allowing for both polygenic and sparse models; and by incorporating external genomic data into the priors, they can increase power and yield new biological insights. However, these methods require access to individual genotypes and phenotypes, which are often not easily available. Here we provide a framework for performing these analyses without individual-level data. Specifically, we introduce a “Regression with Summary Statistics” (RSS) likelihood, which relates the multiple regression coefficients to univariate regression results that are often easily available. The RSS likelihood requires estimates of correlations among covariates (SNPs), which also can be obtained from public databases. We perform Bayesian multiple regression analysis by combining the RSS likelihood with previously proposed prior distributions, sampling posteriors by Markov chain Monte Carlo. In a wide range of simulations RSS performs similarly to analyses using the individual data, both for estimating heritability and detecting associations. We apply RSS to a GWAS of human height that contains 253,288 individuals typed at 1.06 million SNPs, for which analyses of individual-level data are practically impossible. Estimates of heritability (52%) are consistent with, but more precise than, previous results using subsets of these data. We also identify many previously unreported loci that show evidence for association with height in our analyses. Software is available at https://github.com/stephenslab/rss. PMID:29399241
Anand, Sonia S; Xie, Changchun; Paré, Guillaume; Montpetit, Alexandre; Rangarajan, Sumathy; McQueen, Matthew J; Cordell, Heather J; Keavney, Bernard; Yusuf, Salim; Hudson, Thomas J; Engert, James C
2009-02-01
Myocardial infarction (MI) is a leading cause of death globally, but specific genetic variants that influence MI and MI risk factors have not been assessed on a global basis. We included 8795 individuals of European, South Asian, Arab, Iranian, and Nepalese origin from the INTERHEART case-control study that genotyped 1536 single-nucleotide polymorphisms (SNPs) from 103 genes. One hundred and two SNPs were nominally associated with MI, but the statistical significance did not remain after adjustment for multiple testing. A subset of 940 SNPs from 69 genes were tested against MI risk factors. One hundred and sixty-three SNPs were nominally associated with an MI risk factor and 13 remained significant after adjusting for multiple testing. Of these 13, 11 were associated with apolipoprotein (Apo) B/A1 levels: 8 SNPs from 3 genes were associated with Apo B, and 3 cholesteryl ester transfer protein SNPs were associated with Apo A1. Seven of 8 of the SNPs associated with Apo B levels were nominally associated with MI (P<0.05), whereas none of the 3 cholesteryl ester transfer protein SNPs were associated with MI (P≥0.17). Of the 3 SNPs most significantly associated with MI, rs7412, which defines the Apo E2 isoform, was associated with both a lower Apo B/A1 ratio (P=1.0×10^-7) and lower MI risk (P=0.0004). Two low-density lipoprotein receptor variants, 1 intronic (rs6511720) and 1 in the 3' untranslated region (rs1433099) were both associated with a lower Apo B/A1 ratio (P<1.0×10^-5) and a lower risk of MI (P=0.004 and P=0.003, respectively). Thirteen common SNPs were associated with MI risk factors. Importantly, SNPs associated with Apo B levels were associated with MI, whereas SNPs associated with Apo A1 levels were not. The Apo E isoform, and 2 common low-density lipoprotein receptor variants (rs1433099 and rs6511720) influence MI risk in this multiethnic sample.
Chen, Zhanghua; Pereira, Mark A.; Seielstad, Mark; Koh, Woon-Puay; Tai, E. Shyong; Teo, Yik-Ying; Liu, Jianjun; Hsu, Chris; Wang, Renwei; Odegaard, Andrew O.; Thyagarajan, Bharat; Koratkar, Revati; Yuan, Jian-Min; Gross, Myron D.; Stram, Daniel O.
2014-01-01
Background Genome-wide association studies (GWAS) have identified genetic factors in type 2 diabetes (T2D), mostly among individuals of European ancestry. We tested whether previously identified T2D-associated single nucleotide polymorphisms (SNPs) replicate and whether SNPs in regions near known T2D SNPs were associated with T2D within the Singapore Chinese Health Study. Methods 2338 cases and 2339 T2D controls from the Singapore Chinese Health Study were genotyped for 507,509 SNPs. Imputation extended the genotyped SNPs to 7,514,461 with high estimated certainty (r^2>0.8). Replication of known index SNP associations in T2D was attempted. Risk scores were computed as the sum of index risk alleles. SNPs in regions ±100 kb around each index were tested for associations with T2D in conditional fine-mapping analysis. Results Of 69 index SNPs, 20 were genotyped directly and genotypes at 35 others were well imputed. Among the 55 SNPs with data, disease associations were replicated (at p<0.05) for 15 SNPs, while 32 more were directionally consistent with previous reports. Risk score was a significant predictor with a 2.03-fold higher risk (CI 1.69–2.44) of T2D comparing the highest to lowest quintile of risk allele burden (p = 5.72×10^−14). Two improved SNPs around index rs10923931 and 5 new candidate SNPs around indices rs10965250 and rs1111875 passed simple Bonferroni corrections for significance in conditional analysis. Nonetheless, only a small fraction (2.3% on the disease liability scale) of T2D burden in Singapore is explained by these SNPs. Conclusions While diabetes risk in Singapore Chinese involves genetic variants, most disease risk remains unexplained. Further genetic work is ongoing in the Singapore Chinese population to identify unique common variants not already seen in earlier studies. 
However, rapid increases in T2D risk have occurred in recent decades in this population, indicating that dynamic environmental influences and possibly gene by environment interactions complicate the genetic architecture of this disease. PMID:24520337
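The abstract states that risk scores were computed as the sum of index risk alleles and that candidate SNPs had to pass a simple Bonferroni correction. A minimal Python sketch of both calculations follows; the function names, the 0/1/2 allele-count coding, and the example number of tests are illustrative assumptions and are not taken from the study.

```python
def risk_allele_score(risk_allele_counts):
    """Unweighted risk score: total number of risk alleles carried.

    `risk_allele_counts` has one entry per index SNP, each 0, 1, or 2
    (assumed coding).
    """
    if any(count not in (0, 1, 2) for count in risk_allele_counts):
        raise ValueError("allele counts must be 0, 1, or 2")
    return sum(risk_allele_counts)


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


# Toy genotype at four index SNPs and a hypothetical 100 tests.
score = risk_allele_score([0, 1, 2, 1])
threshold = bonferroni_threshold(0.05, 100)
```

Ranking individuals by such a score and splitting them into quintiles is one way to set up the highest-versus-lowest comparison the abstract describes.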
In silico SNP analysis of the breast cancer antigen NY-BR-1.
Kosaloglu, Zeynep; Bitzer, Julia; Halama, Niels; Huang, Zhiqin; Zapatka, Marc; Schneeweiss, Andreas; Jäger, Dirk; Zörnig, Inka
2016-11-18
Breast cancer is one of the most common malignancies with increasing incidences every year and a leading cause of death among women. Although early stage breast cancer can be effectively treated, there are limited numbers of treatment options available for patients with advanced and metastatic disease. The novel breast cancer associated antigen NY-BR-1 was identified by SEREX analysis and is expressed in the majority (>70%) of breast tumors as well as metastases, in normal breast tissue, in testis and occasionally in prostate tissue. The biological function and regulation of NY-BR-1 are to date unknown. We performed an in silico analysis on the genetic variations of the NY-BR-1 gene using data available in public SNP databases and the tools SIFT, Polyphen and Provean to find possible functional SNPs. Additionally, we considered the allele frequency of the damaging SNPs found and also analyzed data from an in-house sequencing project of 55 breast cancer samples for recurring SNPs recorded in dbSNP. Over 2800 SNPs are recorded in the dbSNP and NHLBI ESP databases for the NY-BR-1 gene. Of these, 65 (2.07%) are synonymous SNPs, 191 (6.09%) are non-synonymous SNPs, and 2430 (77.48%) are noncoding intronic SNPs. As a result, 69 non-synonymous SNPs were predicted to be damaging by at least two, and 16 SNPs were predicted as damaging by all three of the used tools. The SNPs rs200639888, rs367841401 and rs377750885 were categorized as highly damaging by all three tools. Eight damaging SNPs are located in the ankyrin repeat domain (ANK), a domain known for its frequent involvement in protein-protein interactions. No distinctive features could be observed in the allele frequency of the analyzed SNPs. Considering these results we expect to gain more insights into the variations of the NY-BR-1 gene and their possible impact on giving rise to splice variants and therefore influence the function of NY-BR-1 in healthy tissue as well as in breast cancer.
Chen, Zhanghua; Pereira, Mark A; Seielstad, Mark; Koh, Woon-Puay; Tai, E Shyong; Teo, Yik-Ying; Liu, Jianjun; Hsu, Chris; Wang, Renwei; Odegaard, Andrew O; Thyagarajan, Bharat; Koratkar, Revati; Yuan, Jian-Min; Gross, Myron D; Stram, Daniel O
2014-01-01
Genome-wide association studies (GWAS) have identified genetic factors in type 2 diabetes (T2D), mostly among individuals of European ancestry. We tested whether previously identified T2D-associated single nucleotide polymorphisms (SNPs) replicate and whether SNPs in regions near known T2D SNPs were associated with T2D within the Singapore Chinese Health Study. 2338 cases and 2339 T2D controls from the Singapore Chinese Health Study were genotyped for 507,509 SNPs. Imputation extended the genotyped SNPs to 7,514,461 with high estimated certainty (r^2>0.8). Replication of known index SNP associations in T2D was attempted. Risk scores were computed as the sum of index risk alleles. SNPs in regions ± 100 kb around each index were tested for associations with T2D in conditional fine-mapping analysis. Of 69 index SNPs, 20 were genotyped directly and genotypes at 35 others were well imputed. Among the 55 SNPs with data, disease associations were replicated (at p<0.05) for 15 SNPs, while 32 more were directionally consistent with previous reports. Risk score was a significant predictor with a 2.03-fold higher risk (CI 1.69-2.44) of T2D comparing the highest to lowest quintile of risk allele burden (p = 5.72 × 10^-14). Two improved SNPs around index rs10923931 and 5 new candidate SNPs around indices rs10965250 and rs1111875 passed simple Bonferroni corrections for significance in conditional analysis. Nonetheless, only a small fraction (2.3% on the disease liability scale) of T2D burden in Singapore is explained by these SNPs. While diabetes risk in Singapore Chinese involves genetic variants, most disease risk remains unexplained. Further genetic work is ongoing in the Singapore Chinese population to identify unique common variants not already seen in earlier studies. 
However, rapid increases in T2D risk have occurred in recent decades in this population, indicating that dynamic environmental influences and possibly gene by environment interactions complicate the genetic architecture of this disease.
2015-01-01
Background Obesity affects quality of life and life expectancy and is associated with cardiovascular disorders, cancer, diabetes, reproductive disorders in women, prostate diseases in men, and congenital anomalies in children. The use of single nucleotide polymorphism (SNP) markers of diseases and drug responses (i.e., significant differences of personal genomes of patients from the reference human genome) can help physicians to improve treatment. Clinical research can validate SNP markers via genotyping of patients and demonstration that SNP alleles are significantly more frequent in patients than in healthy people. The search for biomedical SNP markers of interest can be accelerated by computer-based analysis of hundreds of millions of SNPs in the 1000 Genomes project because of selection of the most meaningful candidate SNP markers and elimination of neutral SNPs. Results We cross-validated the output of two computer-based methods: DNA sequence analysis using Web service SNP_TATA_Comparator and keyword search for articles on comorbidities of obesity. Near the sites binding to TATA-binding protein (TBP) in human gene promoters, we found 22 obesity-related candidate SNP markers, including rs10895068 (male breast cancer in obesity); rs35036378 (reduced risk of obesity after ovariectomy); rs201739205 (reduced risk of obesity-related cancers due to weight loss by diet/exercise in obese postmenopausal women); rs183433761 (obesity resistance during a high-fat diet); rs367732974 and rs549591993 (both: cardiovascular complications in obese patients with type 2 diabetes mellitus); rs200487063 and rs34104384 (both: obesity-caused hypertension); rs35518301, rs72661131, and rs562962093 (all: obesity); and rs397509430, rs33980857, rs34598529, rs33931746, rs33981098, rs34500389, rs63750953, rs281864525, rs35518301, and rs34166473 (all: chronic inflammation in comorbidities of obesity). 
Using an electrophoretic mobility shift assay under nonequilibrium conditions, we empirically validated the statistical significance (α < 0.00025) of the differences in TBP affinity values between the minor and ancestral alleles of 4 out of the 22 SNPs: rs200487063, rs201381696, rs34104384, and rs183433761. We also measured half-life (t1/2), Gibbs free energy change (ΔG), and the association and dissociation rate constants, ka and kd, of the TBP-DNA complex for these SNPs. Conclusions Validation of the 22 candidate SNP markers by proper clinical protocols appears to have a strong rationale and may advance postgenomic predictive preventive personalized medicine. PMID:26694100
Germline sequence variants in TGM3 and RGS22 confer risk of basal cell carcinoma
Stacey, Simon N.; Sulem, Patrick; Gudbjartsson, Daniel F.; Jonasdottir, Aslaug; Thorleifsson, Gudmar; Gudjonsson, Sigurjon A.; Masson, Gisli; Gudmundsson, Julius; Sigurgeirsson, Bardur; Benediktsdottir, Kristrun R.; Thorisdottir, Kristin; Ragnarsson, Rafn; Fuentelsaz, Victoria; Corredera, Cristina; Grasa, Matilde; Planelles, Dolores; Sanmartin, Onofre; Rudnai, Peter; Gurzau, Eugene; Koppova, Kvetoslava; Hemminki, Kari; Nexø, Bjørn A; Tjønneland, Anne; Overvad, Kim; Johannsdottir, Hrefna; Helgadottir, Hafdis T.; Thorsteinsdottir, Unnur; Kong, Augustine; Vogel, Ulla; Kumar, Rajiv; Nagore, Eduardo; Mayordomo, José I.; Rafnar, Thorunn; Olafsson, Jon H.; Stefansson, Kari
2014-01-01
To search for new sequence variants that confer risk of cutaneous basal cell carcinoma (BCC), we conducted a genome-wide association study of 38.5 million single nucleotide polymorphisms (SNPs) and small indels identified through whole-genome sequencing of 2230 Icelanders. We imputed genotypes for 4208 BCC patients and 109 408 controls using Illumina SNP chip typing data, carried out association tests and replicated the findings in independent population samples. We found new BCC susceptibility loci at TGM3 (rs214782[G], P = 5.5 × 10−17, OR = 1.29) and RGS22 (rs7006527[C], P = 8.7 × 10−13, OR = 0.77). TGM3 encodes transglutaminase type 3, which plays a key role in production of the cornified envelope during epidermal differentiation. PMID:24403052
2011-01-01
Background Elucidation of the molecular mechanism of silver nanoparticles (SNPs) biosynthesis is important to control their size, shape and monodispersity. The evaluation of the molecular mechanism of biosynthesis of SNPs is of prime importance for commercialization and for developing methods to control the shape and size (uniform distribution) of SNPs. The unicellular alga Chlamydomonas reinhardtii was exploited as a model system to elucidate the role of cellular proteins in SNP biosynthesis. Results Synthesis of silver nanoparticles mediated by C. reinhardtii cell-free extract (in vitro) and by in vivo cells yielded SNPs in the size ranges 5 ± 1 to 15 ± 2 nm and 5 ± 1 to 35 ± 5 nm, respectively. In vivo biosynthesized SNPs were localized in the peripheral cytoplasm and at one side of the flagella root, the site of the ATP transport pathway and its synthesis-related enzymes. This provides evidence for the involvement of oxidoreductive proteins in the biosynthesis and stabilization of SNPs. Alteration in size distribution and a decrease in the synthesis rate of SNPs in protein-depleted fractions confirmed the involvement of cellular proteins in SNP biosynthesis. Spectroscopic and SDS-PAGE analyses indicate the association of various proteins with C. reinhardtii-mediated in vivo and in vitro biosynthesized SNPs. Using MALDI-MS-MS, we identified various cellular proteins associated with biosynthesized (in vivo and in vitro) SNPs, such as ATP synthase, superoxide dismutase, carbonic anhydrase, ferredoxin-NADP+ reductase and histone, among others. However, these proteins did not associate with pre-synthesized silver nanoparticles incubated in vitro. Conclusion The present study indicates the involvement of molecular machinery and various cellular proteins in the biosynthesis of silver nanoparticles. This report focuses mainly on understanding the role of diverse cellular proteins in the synthesis and capping of silver nanoparticles using C. reinhardtii as a model system. PMID:22152042
Replication of Caucasian loci associated with bone mineral density in Koreans.
Kim, Y A; Choi, H J; Lee, J Y; Han, B G; Shin, C S; Cho, N H
2013-10-01
Most bone mineral density (BMD) loci were reported in Caucasian genome-wide association studies (GWAS). This study investigated the association between 59 known BMD loci (+200 suggestive SNPs) and DXA-derived BMD in East Asian population with respect to sex and site specificity. We also identified four novel BMD candidate loci from the suggestive SNPs. Most GWAS have reported BMD-related variations in Caucasian populations. This study investigates whether the BMD loci discovered in Caucasian GWAS are also associated with BMD in East Asian ethnic samples. A total of 2,729 unrelated Korean individuals from a population-based cohort were analyzed. We selected 747 single-nucleotide polymorphisms (SNPs). These markers included 547 SNPs from 59 loci with genome-wide significance (GWS, p value less than 5 × 10(-8)) levels and 200 suggestive SNPs that showed weaker BMD association with p value less than 5 × 10(-5). After quality control, 535 GWS SNPs and 182 suggestive SNPs were included in the replication analysis. Of the 535 GWS SNPs, 276 from 25 loci were replicated (p < 0.05) in the Korean population with 51.6 % replication rate. Of the 182 suggestive variants, 16 were replicated (p < 0.05, 8.8 % of replication rate), and five reached a significant combined p value (less than 7.0 × 10(-5), 0.05/717 SNPs, corrected for multiple testing). Two markers (rs11711157, rs3732477) are for the same signal near the gene CPN2 (carboxypeptidase N, polypeptide 2). The other variants, rs6436440 and rs2291296, were located in the genes AP1S3 (adaptor-related protein complex 1, sigma 3 subunit) and RARB (retinoic acid receptor, beta). Our results illustrate ethnic differences in BMD susceptibility genes and underscore the need for further genetic studies in each ethnic group. We were also able to replicate some SNPs with suggestive associations. These SNPs may be BMD-related genetic markers and should be further investigated.
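The replication rates and the multiple-testing threshold quoted in the abstract above can be re-derived from the reported counts. The following is a minimal sketch; the function names are illustrative, and the threshold is assumed to be a plain Bonferroni correction, as the "0.05/717 SNPs" wording suggests.

```python
# Counts reported in the abstract above (after quality control).
gws_tested, gws_replicated = 535, 276
suggestive_tested, suggestive_replicated = 182, 16


def replication_rate(replicated: int, tested: int) -> float:
    """Percentage of tested SNPs that replicated at p < 0.05."""
    return 100.0 * replicated / tested


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


total_tested = gws_tested + suggestive_tested
threshold = bonferroni_threshold(0.05, total_tested)

print(f"GWS replication rate: {replication_rate(gws_replicated, gws_tested):.1f} %")
print(f"Suggestive replication rate: "
      f"{replication_rate(suggestive_replicated, suggestive_tested):.1f} %")
print(f"Corrected threshold for {total_tested} SNPs: {threshold:.1e}")
```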
Abnet, Christian C.; Wang, Zhaoming; Song, Xin; Hu, Nan; Zhou, Fu-You; Freedman, Neal D.; Li, Xue-Min; Yu, Kai; Shu, Xiao-Ou; Yuan, Jian-Min; Zheng, Wei; Dawsey, Sanford M.; Liao, Linda M.; Lee, Maxwell P.; Ding, Ti; Qiao, You-Lin; Gao, Yu-Tang; Koh, Woon-Puay; Xiang, Yong-Bing; Tang, Ze-Zhong; Fan, Jin-Hu; Chung, Charles C.; Wang, Chaoyu; Wheeler, William; Yeager, Meredith; Yuenger, Jeff; Hutchinson, Amy; Jacobs, Kevin B.; Giffen, Carol A.; Burdett, Laurie; Fraumeni, Joseph F.; Tucker, Margaret A.; Chow, Wong-Ho; Zhao, Xue-Ke; Li, Jiang-Man; Li, Ai-Li; Sun, Liang-Dan; Wei, Wu; Li, Ji-Lin; Zhang, Peng; Li, Hong-Lei; Cui, Wen-Yan; Wang, Wei-Peng; Liu, Zhi-Cai; Yang, Xia; Fu, Wen-Jing; Cui, Ji-Li; Lin, Hong-Li; Zhu, Wen-Liang; Liu, Min; Chen, Xi; Chen, Jie; Guo, Li; Han, Jing-Jing; Zhou, Sheng-Li; Huang, Jia; Wu, Yue; Yuan, Chao; Huang, Jing; Ji, Ai-Fang; Kul, Jian-Wei; Fan, Zhong-Min; Wang, Jian-Po; Zhang, Dong-Yun; Zhang, Lian-Qun; Zhang, Wei; Chen, Yuan-Fang; Ren, Jing-Li; Li, Xiu-Min; Dong, Jin-Cheng; Xing, Guo-Lan; Guo, Zhi-Gang; Yang, Jian-Xue; Mao, Yi-Ming; Yuan, Yuan; Guo, Er-Tao; Zhang, Wei; Hou, Zhi-Chao; Liu, Jing; Li, Yan; Tang, Sa; Chang, Jia; Peng, Xiu-Qin; Han, Min; Yin, Wan-Li; Liu, Ya-Li; Hu, Yan-Long; Liu, Yu; Yang, Liu-Qin; Zhu, Fu-Guo; Yang, Xiu-Feng; Feng, Xiao-Shan; Wang, Zhou; Li, Yin; Gao, She-Gan; Liu, Hai-Lin; Yuan, Ling; Jin, Yan; Zhang, Yan-Rui; Sheyhidin, Ilyar; Li, Feng; Chen, Bao-Ping; Ren, Shu-Wei; Liu, Bin; Li, Dan; Zhang, Gao-Fu; Yue, Wen-Bin; Feng, Chang-Wei; Qige, Qirenwang; Zhao, Jian-Ting; Yang, Wen-Jun; Lei, Guang-Yan; Chen, Long-Qi; Li, En-Min; Xu, Li-Yan; Wu, Zhi-Yong; Bao, Zhi-Qin; Chen, Ji-Li; Li, Xian-Chang; Zhuang, Xiang; Zhou, Ying-Fa; Zuo, Xian-Bo; Dong, Zi-Ming; Wang, Lu-Wen; Fan, Xue-Pin; Wang, Jin; Zhou, Qi; Ma, Guo-Shun; Zhang, Qin-Xian; Liu, Hai; Jian, Xin-Ying; Lian, Sin-Yong; Wang, Jin-Sheng; Chang, Fu-Bao; Lu, Chang-Dong; Miao, Jian-Jun; Chen, Zhi-Guo; Wang, Ran; Guo, Ming; Fan, Zeng-Lin; Tao, Ping; Liu, 
Tai-Jing; Wei, Jin-Chang; Kong, Qing-Peng; Fan, Lei; Wang, Xian-Zeng; Gao, Fu-Sheng; Wang, Tian-Yun; Xie, Dong; Wang, Li; Chen, Shu-Qing; Yang, Wan-Cai; Hong, Jun-Yan; Wang, Liang; Qiu, Song-Liang; Goldstein, Alisa M.; Yuan, Zhi-Qing; Chanock, Stephen J.; Zhang, Xue-Jun; Taylor, Philip R.; Wang, Li-Dong
2012-01-01
Genome-wide association studies have identified susceptibility loci for esophageal squamous cell carcinoma (ESCC). We conducted a meta-analysis of all single-nucleotide polymorphisms (SNPs) that showed nominally significant P-values in two previously published genome-wide scans that included a total of 2961 ESCC cases and 3400 controls. The meta-analysis revealed five SNPs at 2q33 with P< 5 × 10−8, and the strongest signal was rs13016963, with a combined odds ratio (95% confidence interval) of 1.29 (1.19–1.40) and P= 7.63 × 10−10. An imputation analysis of 4304 SNPs at 2q33 suggested a single association signal, and the strongest imputed SNP associations were similar to those from the genotyped SNPs. We conducted an ancestral recombination graph analysis with 53 SNPs to identify one or more haplotypes that harbor the variants directly responsible for the detected association signal. This showed that the five SNPs exist in a single haplotype along with 45 imputed SNPs in strong linkage disequilibrium, and the strongest candidate was rs10201587, one of the genotyped SNPs. Our meta-analysis found genome-wide significant SNPs at 2q33 that map to the CASP8/ALS2CR12/TRAK2 gene region. Variants in CASP8 have been extensively studied across a spectrum of cancers with mixed results. The locus we identified appears to be distinct from the widely studied rs3834129 and rs1045485 SNPs in CASP8. Future studies of esophageal and other cancers should focus on comprehensive sequencing of this 2q33 locus and functional analysis of rs13016963 and rs10201587 and other strongly correlated variants. PMID:22323360
Manickam, Madhumathi; Ravanan, Palaniyandi; Singh, Pratibha; Talwar, Priti
2014-01-01
Gaucher's disease (GD) is an autosomal recessive disorder caused by the deficiency of glucocerebrosidase, a lysosomal enzyme that catalyses the hydrolysis of the glycolipid glucocerebroside to ceramide and glucose. Polymorphisms in GBA gene have been associated with the development of Gaucher disease. We hypothesize that prediction of SNPs using multiple state of the art software tools will help in increasing the confidence in identification of SNPs involved in GD. Enzyme replacement therapy is the only option for GD. Our goal is to use several state of art SNP algorithms to predict/address harmful SNPs using comparative studies. In this study seven different algorithms (SIFT, MutPred, nsSNP Analyzer, PANTHER, PMUT, PROVEAN, and SNPs&GO) were used to predict the harmful polymorphisms. Among the seven programs, SIFT found 47 nsSNPs as deleterious, MutPred found 46 nsSNPs as harmful. nsSNP Analyzer program found 43 out of 47 nsSNPs are disease causing SNPs whereas PANTHER found 32 out of 47 as highly deleterious, 22 out of 47 are classified as pathological mutations by PMUT, 44 out of 47 were predicted to be deleterious by PROVEAN server, all 47 shows the disease related mutations by SNPs&GO. Twenty two nsSNPs were commonly predicted by all the seven different algorithms. The common 22 targeted mutations are F251L, C342G, W312C, P415R, R463C, D127V, A309V, G46E, G202E, P391L, Y363C, Y205C, W378C, I402T, S366R, F397S, Y418C, P401L, G195E, W184R, R48W, and T43R.
Carty, Cara L; Buzková, Petra; Fornage, Myriam; Franceschini, Nora; Cole, Shelley; Heiss, Gerardo; Hindorff, Lucia A; Howard, Barbara V; Mann, Sue; Martin, Lisa W; Zhang, Ying; Matise, Tara C; Prentice, Ross; Reiner, Alexander P; Kooperberg, Charles
2012-04-01
Genome-wide association studies (GWAS) have identified loci associated with ischemic stroke (IS) and cardiovascular disease (CVD) in European-descent individuals, but their replication in different populations has been largely unexplored. Nine single nucleotide polymorphisms (SNPs) selected from GWAS and meta-analyses of stroke, and 86 SNPs previously associated with myocardial infarction and CVD risk factors, including blood lipids (high density lipoprotein [HDL], low density lipoprotein [LDL], and triglycerides), type 2 diabetes, and body mass index (BMI), were investigated for associations with incident IS in European Americans (EA) N=26 276, African-Americans (AA) N=8970, and American Indians (AI) N=3570 from the Population Architecture using Genomics and Epidemiology Study. Ancestry-specific fixed effects meta-analysis with inverse variance weighting was used to combine study-specific log hazard ratios from Cox proportional hazards models. Two of 9 stroke SNPs (rs783396 and rs1804689) were significantly associated with IS hazard in AA; none were significant in this large EA cohort. Of 73 CVD risk factor SNPs tested in EA, 2 (HDL and triglycerides SNPs) were associated with IS. In AA, SNPs associated with LDL, HDL, and BMI were significantly associated with IS (3 of 86 SNPs tested). Out of 58 SNPs tested in AI, 1 LDL SNP was significantly associated with IS. Our analyses showing lack of replication in spite of reasonable power for many stroke SNPs and differing results by ancestry highlight the need to follow up on GWAS findings and conduct genetic association studies in diverse populations. We found modest IS associations with BMI and lipids SNPs, though these findings require confirmation.
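The fixed effects meta-analysis with inverse variance weighting mentioned above can be sketched as follows. This is a generic illustration of the technique, not the study's code; the function name and the example log hazard ratios and standard errors are hypothetical.

```python
import math


def fixed_effect_meta(log_hrs, std_errs):
    """Combine study-specific log hazard ratios by inverse variance weighting.

    Returns the pooled log hazard ratio, its standard error, the z statistic
    and a two-sided p value from the normal approximation.
    """
    weights = [1.0 / se ** 2 for se in std_errs]
    total_weight = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, log_hrs)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled / pooled_se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled, pooled_se, z, p_value


# Hypothetical per-study estimates for one SNP (not taken from the paper).
pooled, pooled_se, z, p_value = fixed_effect_meta([0.10, 0.20], [0.10, 0.20])
print(f"pooled HR = {math.exp(pooled):.3f}, SE(log HR) = {pooled_se:.4f}, p = {p_value:.3f}")
```

Studies with smaller standard errors receive larger weights, so the pooled estimate here sits closer to the first, more precise study.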
Genome-wide association study of sporadic brain arteriovenous malformations.
Weinsheimer, Shantel; Bendjilali, Nasrine; Nelson, Jeffrey; Guo, Diana E; Zaroff, Jonathan G; Sidney, Stephen; McCulloch, Charles E; Al-Shahi Salman, Rustam; Berg, Jonathan N; Koeleman, Bobby P C; Simon, Matthias; Bostroem, Azize; Fontanella, Marco; Sturiale, Carmelo L; Pola, Roberto; Puca, Alfredo; Lawton, Michael T; Young, William L; Pawlikowska, Ludmila; Klijn, Catharina J M; Kim, Helen
2016-09-01
The pathogenesis of sporadic brain arteriovenous malformations (BAVMs) remains unknown, but studies suggest a genetic component. We estimated the heritability of sporadic BAVM and performed a genome-wide association study (GWAS) to investigate association of common single nucleotide polymorphisms (SNPs) with risk of sporadic BAVM in the international, multicentre Genetics of Arteriovenous Malformation (GEN-AVM) consortium. The Caucasian discovery cohort included 515 BAVM cases and 1191 controls genotyped using Affymetrix genome-wide SNP arrays. Genotype data were imputed to 1000 Genomes Project data, and well-imputed SNPs (>0.01 minor allele frequency) were analysed for association with BAVM. 57 top BAVM-associated SNPs (51 SNPs with p<10(-05) or p<10(-04) in candidate pathway genes, and 6 candidate BAVM SNPs) were tested in a replication cohort including 608 BAVM cases and 744 controls. The estimated heritability of BAVM was 17.6% (SE 8.9%, age and sex-adjusted p=0.015). None of the SNPs were significantly associated with BAVM in the replication cohort after correction for multiple testing. 6 SNPs had a nominal p<0.1 in the replication cohort and map to introns in EGFEM1P, SP4 and CDKAL1 or near JAG1 and BNC2. Of the 6 candidate SNPs, 2 in ACVRL1 and MMP3 had a nominal p<0.05 in the replication cohort. We performed the first GWAS of sporadic BAVM in the largest BAVM cohort assembled to date. No GWAS SNPs were replicated, suggesting that common SNPs do not contribute strongly to BAVM susceptibility. However, heritability estimates suggest a modest but significant genetic contribution.
CLUSTAG: hierarchical clustering and graph methods for selecting tag SNPs.
Ao, S I; Yip, Kevin; Ng, Michael; Cheung, David; Fong, Pui-Yee; Melhado, Ian; Sham, Pak C
2005-04-15
Cluster and set-cover algorithms are developed to obtain a set of tag single nucleotide polymorphisms (SNPs) that can represent all the known SNPs in a chromosomal region, subject to the constraint that all SNPs must have a squared correlation R^2 > C with at least one tag SNP, where C is specified by the user. Availability: http://hkumath.hku.hk/web/link/CLUSTAG/CLUSTAG.html Contact: mng@maths.hku.hk
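The coverage constraint described above (every SNP must exceed the correlation threshold C with at least one tag SNP) can be illustrated with a greedy set-cover heuristic. This is a sketch of the general idea only, not CLUSTAG's actual algorithm; the function name, SNP labels and pairwise squared correlations are hypothetical.

```python
def greedy_tag_snps(snps, r2, c):
    """Pick tag SNPs so that every SNP has r2 > c with at least one tag.

    snps: list of SNP identifiers.
    r2:   dict mapping frozenset({a, b}) to the squared correlation of a and b;
          missing pairs are treated as uncorrelated (assumption).
    c:    user-specified threshold.
    """
    # Each SNP "covers" itself plus every SNP it is correlated with above c.
    covers = {
        s: {t for t in snps if s == t or r2.get(frozenset((s, t)), 0.0) > c}
        for s in snps
    }
    uncovered = set(snps)
    tags = []
    while uncovered:
        # Greedy step: take the SNP covering the most still-uncovered SNPs.
        best = max(snps, key=lambda s: len(covers[s] & uncovered))
        tags.append(best)
        uncovered -= covers[best]
    return tags


# Hypothetical region with four SNPs.
snps = ["snpA", "snpB", "snpC", "snpD"]
r2 = {
    frozenset(("snpA", "snpB")): 0.90,
    frozenset(("snpB", "snpC")): 0.85,
    frozenset(("snpC", "snpD")): 0.30,
}
tags = greedy_tag_snps(snps, r2, c=0.8)
print(tags)
```

The greedy choice does not guarantee the smallest possible tag set, but it always terminates because each SNP covers at least itself.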
Rashmi, Venkatasubbaiah; Sanjay, Konasur R
2017-04-01
Consistent search of plants for green synthesis of silver nanoparticles (SNPs) is an important arena in Nanomedicine. This study focuses on synthesis of SNPs using bioreduction of silver nitrate (AgNO3) by aqueous root extract of Decalepis hamiltonii. The biosynthesis of SNPs was monitored by UV-vis analysis at absorbance maxima 432 nm. The fluorescence emission spectra of SNPs illustrated the broad emission peak 450-483 nm at different excitation wavelengths. The surface characteristics were studied by scanning electron microscope and atomic force microscopy, showed spherical shape of SNPs and dynamic light scattering analysis confirmed the average particle size 32.5 nm and the presence of metallic silver was confirmed by energy dispersive X-ray. Face centred cubic structure with crystal size 33.3 nm was revealed by powder X-ray diffraction. Fourier transform infrared spectroscopy indicated the biomolecules involved in the reduction mainly polyols and phenols present in root extracts were found to be responsible for the synthesis of SNPs. The stability and charge on SNPs were revealed by zeta potential analysis. In addition, on therapeutic forum, the synthesised SNPs elicit antioxidant and antimicrobial activity against Bacillus cereus, Bacillus licheniformis, Escherichia coli, Pseudomonas aeruginosa and Staphylococcus aureus.
The versican gene and the risk of intracranial aneurysms.
Ruigrok, Ynte M; Rinkel, Gabriël J E; Wijmenga, Cisca
2006-09-01
The proteoglycan versican is an excellent candidate gene for intracranial aneurysms (IAs) because it plays an important role in extracellular matrix assembly and is localized in a previously implicated locus for IAs on chromosome 5q. We analyzed all the common variations using 16-tag single nucleotide polymorphisms (SNPs) and haplotypes in the versican gene using a 2-stage genotyping approach. For stage 1, 16 SNPs were genotyped in 307 cases and 639 controls. For stage 2, the two SNPs yielding the most significant associations (P<0.01) were genotyped in a second independent cohort of 310 cases for confirmation of the associations. In stage 1, we found several SNPs in strong linkage disequilibrium and haplotypes constituting these SNPs associated with IAs in the Dutch population (strongest SNP association for rs173686 with odds ratio=1.34, 95% CI=1.09 to 1.65, P=0.004). In stage 2, we confirmed association for the 2 SNPs with the most significant associations (strongest SNP association for rs173686 with odds ratio=1.36, 95% CI=1.11 to 1.67, P=0.003). SNPs in strong linkage disequilibrium and haplotypes constituting these SNPs in the versican gene are associated with IAs suggesting that variation in or near the versican gene plays a role in susceptibility to IAs.
Liu, Yanfang; Liao, Huidan; Liu, Ying; Guo, Juanjuan; Sun, Yi; Fu, Xiaoliang; Xiao, Ding; Cai, Jifeng; Lan, Lingmei; Xie, Pingli; Zha, Lagabaiyila
2017-04-01
Nonbinary single-nucleotide polymorphisms (SNPs) are potential forensic genetic markers because their discrimination power is greater than that of normal binary SNPs and because they can be typed in highly degraded samples. We previously developed a nonbinary SNP multiplex typing assay. In this study, we selected an additional 20 nonbinary SNPs from the NCBI SNP database and verified them through pyrosequencing. These 20 nonbinary SNPs were analyzed using the fluorescent-labeled SNaPshot multiplex SNP typing method. The allele frequencies and genetic parameters of these 20 nonbinary SNPs were determined among 314 unrelated individuals from Han populations from China. The total power of discrimination was 0.9999999999994, and the cumulative probability of exclusion was 0.9986. Moreover, combining this 20 nonbinary SNP assay with the 20 nonbinary SNP assay we previously developed gave a cumulative probability of exclusion of 0.999991 for the 40 nonbinary SNPs, and no significant linkage disequilibrium was observed among the 40 nonbinary SNPs. Thus, we concluded that this new system of 20 new nonbinary SNPs could provide highly informative polymorphic data for forensic application and would serve as a potentially valuable supplement to forensic DNA analysis.
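Cumulative forensic parameters such as the total power of discrimination and the cumulative probability of exclusion are conventionally obtained by combining per-locus values under the assumption that loci are independent, which is consistent with the absence of significant linkage disequilibrium reported above. The sketch below shows that combination rule; the per-locus values are hypothetical and the function name is illustrative.

```python
from math import prod


def cumulative_power(per_locus_values):
    """Combine per-locus probabilities as 1 - product(1 - p_i).

    Applies to both the power of discrimination and the probability of
    exclusion, assuming the loci are statistically independent.
    """
    return 1.0 - prod(1.0 - p for p in per_locus_values)


# Hypothetical per-locus probabilities of exclusion for a 20-SNP panel.
panel = [0.30] * 20
cpe = cumulative_power(panel)
print(f"cumulative probability of exclusion: {cpe:.6f}")

# Two independent panels combine by the same rule applied to their totals.
combined = cumulative_power([cpe, cpe])
print(f"two such panels combined: {combined:.8f}")
```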
Delgado-Lista, Javier; Perez-Martinez, Pablo; Solivera, Juan; Garcia-Rios, Antonio; Perez-Caballero, A I; Lovegrove, Julie A; Drevon, Christian A; Defoort, Catherine; Blaak, Ellen E; Dembinska-Kieć, Aldona; Risérus, Ulf; Herruzo-Gomez, Ezequiel; Camargo, Antonio; Ordovas, Jose M; Roche, Helen; Lopez-Miranda, José
2014-02-01
Metabolic syndrome (MetS) is a high-prevalence condition characterized by altered energy metabolism, insulin resistance, and elevated cardiovascular risk. Although many individual single nucleotide polymorphisms (SNPs) have been linked to certain MetS features, there are few studies analyzing the influence of SNPs on carbohydrate metabolism in MetS. A total of 904 SNPs (tag SNPs and functional SNPs) were tested for influence on 8 fasting and dynamic markers of carbohydrate metabolism, by performance of an intravenous glucose tolerance test in 450 participants in the LIPGENE study. From 382 initial gene-phenotype associations between SNPs and any phenotypic variables, 61 (16% of the preselected variables) remained significant after bootstrapping. Top SNPs affecting glucose metabolism variables were as follows: fasting glucose, rs26125 (PPARGC1B); fasting insulin, rs4759277 (LRP1); C-peptide, rs4759277 (LRP1); homeostasis assessment of insulin resistance, rs4759277 (LRP1); quantitative insulin sensitivity check index, rs184003 (AGER); sensitivity index, rs7301876 (ABCC9), acute insulin response to glucose, rs290481 (TCF7L2); and disposition index, rs12691 (CEBPA). We describe here the top SNPs linked to phenotypic features in carbohydrate metabolism among approximately 1000 candidate gene variations in fasting and postprandial samples of 450 patients with MetS from the LIPGENE study.
Elastic-net regularization approaches for genome-wide association studies of rheumatoid arthritis.
Cho, Seoae; Kim, Haseong; Oh, Sohee; Kim, Kyunga; Park, Taesung
2009-12-15
The current trend in genome-wide association studies is to identify regions where the true disease-causing genes may lie by evaluating thousands of single-nucleotide polymorphisms (SNPs) across the whole genome. However, many challenges exist in detecting disease-causing genes among the thousands of SNPs. Examples include multicollinearity and multiple testing issues, especially when a large number of correlated SNPs are simultaneously tested. Multicollinearity can often occur when predictor variables in a multiple regression model are highly correlated, and can cause imprecise estimation of association. In this study, we propose a simple stepwise procedure that identifies disease-causing SNPs simultaneously by employing elastic-net regularization, a variable selection method that allows one to address multicollinearity. At Step 1, the single-marker association analysis was conducted to screen SNPs. At Step 2, the multiple-marker association was scanned based on the elastic-net regularization. The proposed approach was applied to the rheumatoid arthritis (RA) case-control data set of Genetic Analysis Workshop 16. While the selected SNPs at the screening step are located mostly on chromosome 6, the elastic-net approach identified putative RA-related SNPs on other chromosomes in an increased proportion. For some of those putative RA-related SNPs, we identified the interactions with sex, a well known factor affecting RA susceptibility.
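The second step described above relies on elastic-net regularization, which mixes an L1 penalty (variable selection) with an L2 penalty (stability under correlated predictors). The following is a minimal coordinate-descent sketch for a linear model without intercept; it is an assumption-laden illustration, not the authors' procedure, which analysed case-control data. The objective assumed here is (1/2n)·RSS + lam·(alpha·|beta|_1 + (1 - alpha)/2·|beta|_2^2), and all names and data are hypothetical.

```python
def soft_threshold(z, gamma):
    """Shrink z toward zero by gamma; values within [-gamma, gamma] become 0."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def elastic_net(X, y, lam, alpha, n_iter=200):
    """Coordinate descent for elastic-net penalized least squares.

    X: list of rows (genotype codes per SNP), y: list of responses.
    lam: overall penalty strength; alpha: L1 share of the penalty (0..1).
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            # Covariance of column j with the residual that ignores beta[j].
            rho = 0.0
            for i in range(n):
                others = sum(X[i][k] * beta[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - others)
            rho /= n
            denom = sum(X[i][j] ** 2 for i in range(n)) / n + lam * (1.0 - alpha)
            beta[j] = soft_threshold(rho, lam * alpha) / denom if denom > 0 else 0.0
    return beta


# Hypothetical data: two perfectly correlated SNP columns and one response.
X = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [1.0, 1.0]]
y = [0.1, 1.0, 2.1, 0.9]
beta = elastic_net(X, y, lam=0.1, alpha=0.5)
print(beta)
```

With alpha below 1, the L2 part keeps the update well defined even when SNP columns are identical, which is the multicollinearity situation the abstract refers to.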
Allele-Skewed DNA Modification in the Brain: Relevance to a Schizophrenia GWAS
Gagliano, Sarah A.; Ptak, Carolyn; Mak, Denise Y.F.; Shamsi, Mehrdad; Oh, Gabriel; Knight, Joanne; Boutros, Paul C.; Petronis, Arturas
2016-01-01
Numerous recent studies have suggested that phenotypic effects of DNA sequence variants can be mediated or modulated by their epigenetic marks, such as allele-skewed DNA modification (ASM). Using Affymetrix SNP microarrays, we performed a comprehensive search of ASM effects in human post-mortem brain and sperm samples (total n = 256) from individuals with major psychosis and control individuals. Depending on the phenotypic category of the brain samples, 1.4%–7.5% of interrogated SNPs exhibited ASM effects. Next, we investigated ASM in the context of genetic studies of schizophrenia and detected that brain ASM SNPs were significantly overrepresented among sub-threshold SNPs from a schizophrenia genome-wide association study (GWAS). Brain ASM SNPs showed a much stronger enrichment in a schizophrenia GWAS than in 17 large GWASs of non-psychiatric diseases and traits, arguing that ASM effects are at least partially tissue specific. Studies of germline and control brain ASM SNPs supported a causal association between ASM and schizophrenia. Finally, significantly higher proportions of ASM SNPs than of non-ASM SNPs were detected at loci exhibiting epigenetic signatures of enhancers and promoters, and they were overrepresented within transcription factor binding regions and DNase I hypersensitive sites. All of these findings collectively indicate that ASM SNPs should be prioritized in follow-up GWASs. PMID:27087318
regSNPs: a strategy for prioritizing regulatory single nucleotide substitutions
Teng, Mingxiang; Ichikawa, Shoji; Padgett, Leah R.; Wang, Yadong; Mort, Matthew; Cooper, David N.; Koller, Daniel L.; Foroud, Tatiana; Edenberg, Howard J.; Econs, Michael J.; Liu, Yunlong
2012-01-01
Motivation: One of the fundamental questions in genetics is to identify functional DNA variants that are responsible for a disease or phenotype of interest. Results from large-scale genetics studies, such as genome-wide association studies (GWAS), and the availability of high-throughput sequencing technologies provide opportunities for identifying causal variants. Despite the technical advances, informatics methodologies need to be developed to prioritize thousands of variants for potential causative effects. Results: We present regSNPs, an informatics strategy that integrates several established bioinformatics tools for prioritizing regulatory SNPs, i.e. the SNPs in the promoter regions that potentially affect phenotype through changing transcription of downstream genes. Compared with existing tools, regSNPs has two distinct features. It considers degenerative features of binding motifs by calculating the differences in binding affinity caused by the candidate variants, and it integrates potential phenotypic effects of various transcription factors. When tested using the disease-causing variants documented in the Human Gene Mutation Database, regSNPs showed mixed performance on various diseases. regSNPs predicted three SNPs that can potentially affect bone density in a region detected in an earlier linkage study. Potential effects of one of the variants were validated using a luciferase reporter assay. Contact: yunliu@iupui.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:22611130
Chao, Chia-Sheng; Wei, Jeng; Huang, Hurng-Wern; Yang, Shyh-Chyun
2014-07-01
Methylenetetrahydrofolate reductase (MTHFR) C677T and A1298C gene polymorphisms are associated with the risk of patent ductus arteriosus (PDA) congenital heart defects. This study aimed to determine the association of these polymorphisms in patients with isolated PDA and in non-PDA patients group without congenital heart disease. This retrospective case-controlled study was undertaken in 17 patients with isolated PDA and a control non-PDA group consisting of 34 subjects without congenital heart disease. MTHFR gene polymorphisms were analysed using polymerase chain reaction-restriction fragment length polymorphism (PCR-RFLP). In addition, the genotype distribution of the MTHFR gene was compared among different ethnicities using the HapMap database. In contrast to the MTHFR C677T polymorphism, differences in the MTHFR A1298C genotype were observed between the two groups (P=0.002); a greater proportion of the PDA patients had the MTHFR 1298CC and 1298AA genotypes as compared to the non-PDA control group. After merging the data obtained from the Taiwanese participants with that from the HapMap database, genetic diversity of the MTHFR 1298AA genotype was observed. Thus, the MTHFR A1298C polymorphism is associated with isolated PDA in Taiwan. Larger studies are necessary to evaluate the prognostic value of determining MTHFR polymorphism in PDA.
Genome-Wide Analysis in Brazilian Xavante Indians Reveals Low Degree of Admixture
Kuhn, Patricia C.; Horimoto, Andréa R. V. Russo.; Sanches, José Maurício; Vieira Filho, João Paulo B.; Franco, Luciana; Fabbro, Amaury Dal; Franco, Laercio Joel; Pereira, Alexandre C.; Moises, Regina S
2012-01-01
Characterization of population genetic variation and structure can be used as a tool for research in human genetics, and population isolates are of great interest. The aim of the present study was to characterize the genetic structure of Xavante Indians and compare it with other populations. The Xavante, an indigenous population living in the Brazilian Central Plateau, is one of the largest native groups in Brazil. A subset of 53 unrelated subjects was selected from the initial sample of 300 Xavante Indians. Using 86,197 markers, Xavante were compared with all populations of the HapMap Phase III and HGDP-CEPH projects and with a Southeast Brazilian population sample to establish its population structure. Principal Components Analysis showed that the Xavante Indians are concentrated in the Amerindian axis near other populations of known Amerindian ancestry such as Karitiana, Pima, Surui and Maya, and a low degree of genetic admixture was observed. This is consistent with the historical records of bottlenecks and cultural isolation. By calculating pair-wise Fst statistics we characterized the genetic differentiation between Xavante Indians and representative populations of the HapMap and HGDP-CEPH projects. We found that the genetic differentiation between Xavante Indians and populations of Amerindian, Asian, European, and African ancestry increased progressively. Our results indicate that the Xavante is a population that remained genetically isolated over the past decades and can offer advantages for genome-wide mapping studies of inherited disorders. PMID:22900041
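A pair-wise Fst comparison of the kind described above can be sketched from allele frequencies alone. The abstract does not say which estimator was used, so the following is only an illustration of the basic heterozygosity-based definition, Fst = (Ht - Hs) / Ht, combined across markers as a ratio of sums. It assumes biallelic markers and equal weighting of the two populations; the function name and the example frequencies are hypothetical.

```python
def fst_two_populations(freqs_a, freqs_b):
    """Heterozygosity-based Fst between two populations.

    freqs_a, freqs_b: allele frequencies of the same allele at the same
    markers in population A and population B (values between 0 and 1).
    """
    numerator = 0.0
    denominator = 0.0
    for p_a, p_b in zip(freqs_a, freqs_b):
        p_bar = (p_a + p_b) / 2
        # Mean expected heterozygosity within populations.
        h_s = (2 * p_a * (1 - p_a) + 2 * p_b * (1 - p_b)) / 2
        # Expected heterozygosity of the pooled population.
        h_t = 2 * p_bar * (1 - p_bar)
        numerator += h_t - h_s
        denominator += h_t
    return numerator / denominator if denominator > 0 else 0.0


# Hypothetical frequencies at three markers in two populations.
fst_example = fst_two_populations([0.10, 0.50, 0.90], [0.30, 0.55, 0.40])
```

Published analyses usually prefer sample-size-corrected estimators (for example Weir and Cockerham's), which this sketch does not implement.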
Muller, Ryan Y; Hammond, Ming C; Rio, Donald C; Lee, Yeon J
2015-12-01
The Encyclopedia of DNA Elements (ENCODE) Project aims to identify all functional sequence elements in the human genome sequence by use of high-throughput DNA/cDNA sequencing approaches. To aid the standardization, comparison, and integration of data sets produced from different technologies and platforms, the ENCODE Consortium selected several standard human cell lines to be used by the ENCODE Projects. The Tier 1 ENCODE cell lines include GM12878, K562, and H1 human embryonic stem cell lines. GM12878 is a lymphoblastoid cell line, transformed with the Epstein-Barr virus, that was selected by the International HapMap Project for whole genome and transcriptome sequencing by use of the Illumina platform. K562 is an immortalized myelogenous leukemia cell line. The GM12878 cell line is attractive for the ENCODE Projects, as it offers potential synergy with the International HapMap Project. Despite the vast amount of sequencing data available on the GM12878 cell line through the ENCODE Project, including transcriptome data and chromatin immunoprecipitation-sequencing for histone marks and transcription factors, no small interfering RNA (siRNA)-mediated knockdown studies have been performed in the GM12878 cell line, as cationic lipid-mediated transfection methods are inefficient for lymphoid cell lines. Here, we present an efficient and reproducible method for transfection of a variety of siRNAs into the GM12878 and K562 cell lines, which subsequently results in targeted protein depletion.
Liu, L B; Wu, C M; Wen, J; Chen, J L; Zheng, M Q; Zhao, G P
2009-03-01
Antibody titers raised for vaccinations against avian influenza (AI) and Newcastle disease (ND) were higher in Chinese Beijing-You (BJY) than in White Leghorn (WL) (P < 0.001), but there was no breed difference in titers for sheep red blood cells (SRBC). Genotyping by PCR-SSCP identified seven haplotypes in WL and 17 in BJY. After sequencing PCR products (35 and 85, respectively), 43 (WL) and 47 (BJY) single nucleotide polymorphisms (SNPs) were found in the 264 bp of exon 2. In WL chickens, significant associations were found with antibody responses to AI (two SNPs), ND (six SNPs), and SRBC (one SNP), while in BJY there was association with responses to ND (two SNPs) and SRBC (two SNPs), but none with AI. These results indicate that the genomic region bearing exon 2 of the major histocompatibility complex B-F gene has significant effects on antibody responses to SRBC and vaccination against AI and ND. Different SNPs affected antibody titers for each of the antigens and they differed between these very distinct breeds.
Evaluating information content of SNPs for sample-tagging in re-sequencing projects.
Hu, Hao; Liu, Xiang; Jin, Wenfei; Hilger Ropers, H; Wienker, Thomas F
2015-05-15
Sample-tagging is designed to identify accidental sample mix-ups, which are a major issue in re-sequencing studies. In this work, we develop a model to measure the information content of SNPs, so that we can optimize a panel of SNPs that approaches the maximal information for discrimination. The analysis shows that as few as 60 optimized SNPs can differentiate the individuals in a population as large as the present world population, and only 30 optimized SNPs are in practice sufficient for labeling up to 100 thousand individuals. In the simulated populations of 100 thousand individuals, the average Hamming distance generated by the optimized set of 30 SNPs is larger than 18, and the duality frequency is lower than 1 in 10 thousand. This strategy of sample discrimination proved robust across large sample sizes and different datasets. The optimized sets of SNPs are designed for whole-exome sequencing, and a program is provided for SNP selection, allowing for customized SNP numbers and genes of interest. The sample-tagging plan based on this framework will improve re-sequencing projects in terms of reliability and cost-effectiveness.
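The Hamming-distance figure quoted above can be made concrete with a small simulation. This is a rough sketch rather than the authors' model: it assumes 30 independent SNPs, each with a minor allele frequency of 0.5 and genotypes in Hardy-Weinberg proportions, and it uses a much smaller population (200 individuals) than the abstract. All names are hypothetical. Under these assumptions two random individuals differ at a given SNP with probability 1 - (0.25² + 0.5² + 0.25²) = 0.625, so the expected distance is 30 × 0.625 = 18.75.

```python
import random
from itertools import combinations


def simulate_genotypes(n_individuals, mafs, rng):
    """Draw genotypes (0, 1 or 2 minor alleles) per SNP under Hardy-Weinberg."""
    return [
        tuple((rng.random() < maf) + (rng.random() < maf) for maf in mafs)
        for _ in range(n_individuals)
    ]


def hamming(profile_a, profile_b):
    """Number of SNPs at which two genotype profiles differ."""
    return sum(x != y for x, y in zip(profile_a, profile_b))


def expected_hamming(mafs):
    """Analytic expected distance between two random, unrelated individuals."""
    total = 0.0
    for q in mafs:
        genotype_probs = ((1 - q) ** 2, 2 * q * (1 - q), q ** 2)
        total += 1 - sum(p * p for p in genotype_probs)
    return total


rng = random.Random(0)          # fixed seed so the run is repeatable
mafs = [0.5] * 30               # assumed panel: 30 SNPs, each with MAF 0.5
profiles = simulate_genotypes(200, mafs, rng)
distances = [hamming(a, b) for a, b in combinations(profiles, 2)]
mean_distance = sum(distances) / len(distances)
identical_pairs = sum(d == 0 for d in distances)
```

Lower minor allele frequencies or linked SNPs reduce the expected distance, which is why panel optimization favours common, independent markers.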
Genomics in Cardiovascular Disease
Roberts, Robert; Marian, A.J.; Dandona, Sonny; Stewart, Alexandre F.R.
2013-01-01
A paradigm shift towards biology occurred in the 1990s, subsequently catalyzed by the sequencing of the human genome in 2000. The cost of DNA sequencing has gone from millions to thousands of dollars, with sequencing of one’s entire genome costing only $1,000. Rapid DNA sequencing is being embraced for single gene disorders, particularly for sporadic cases and those from small families. Transmission of lethal genes, such as those associated with Huntington’s disease, to one’s offspring can be avoided through in-vitro fertilization. DNA sequencing will meet the challenge of elucidating the genetic predisposition for common polygenic diseases, especially in determining the function of the novel common genetic risk variants and identifying the rare variants, which may also partially ascertain the source of the missing heritability. The challenge for DNA sequencing remains great: although human genome sequences are 99.5% identical, the 3 million single nucleotide polymorphisms (SNPs) responsible for most of the unique features add up to 60 new mutations per person which, for 7 billion people, is 420 billion mutations. It is claimed that DNA sequencing has increased 10,000 fold while information storage and retrieval has increased only 16 fold. The physician and health user will be challenged by the convergence of two major trends, whole genome sequencing and the storage/retrieval and integration of the data. PMID:23524054
A GENOME-WIDE LINKAGE AND ASSOCIATION SCAN REVEALS NOVEL LOCI FOR AUTISM
Weiss, Lauren A.; Arking, Dan E.
2009-01-01
Summary Although autism is a highly heritable neurodevelopmental disorder, attempts to identify specific susceptibility genes have thus far met with limited success [1]. Genome-wide association studies (GWAS) using half a million or more markers, particularly those with very large sample sizes achieved through meta-analysis, have shown great success in mapping genes for other complex genetic traits (http://www.genome.gov/26525384). Consequently, we initiated a linkage and association mapping study using half a million genome-wide SNPs in a common set of 1,031 multiplex autism families (1,553 affected offspring). We identified regions of suggestive and significant linkage on chromosomes 6q27 and 20p13, respectively. Initial analysis did not yield genome-wide significant associations; however, genotyping of top hits in additional families revealed a SNP on chromosome 5p15 (between SEMA5A and TAS2R1) that was significantly associated with autism (P = 2 × 10^-7). We also demonstrated that expression of SEMA5A is reduced in brains from autistic patients, further implicating SEMA5A as an autism susceptibility gene. The linkage regions reported here provide targets for rare variation screening while the discovery of a single novel association demonstrates the action of common variants. PMID:19812673
Maddinedi, Sireesh Babu; Mandal, Badal Kumar; Anna, Kiran Kumar
2017-04-01
A simple, green approach for the size-controllable preparation of silver nanoparticles (SNPs) using tyrosine as reducing and capping agent is shown here. The size of the SNPs is controlled by varying the pH of the tyrosine solution. The as-synthesized SNPs are characterized using XRD, UV-Visible, DLS, TEM and SAED. Zeta potential measurements revealed the stability of tyrosine-capped silver nanocolloids. Furthermore, catalytic activity studies concluded that the smaller SNPs act as good catalysts and that the catalytic activity depends on the size of the nanoparticles. Further, the in-vitro cytotoxicity experiments concluded that the cytotoxicity of the prepared SNPs towards mouse fibroblast (3T3) cell lines is size and dose dependent. Additionally, the present approach is a substitute for the traditional methods currently used for size-controlled synthesis of SNPs. Copyright © 2017 Elsevier B.V. All rights reserved.
Gu, Wanjun; Gurguis, Christopher I.; Zhou, Jin J.; Zhu, Yihua; Ko, Eun-A.; Ko, Jae-Hong; Wang, Ting; Zhou, Tong
2015-01-01
Genetic variation arising from single nucleotide polymorphisms (SNPs) is ubiquitously found among human populations. While disease-causing variants are known in some cases, identifying functional or causative variants for most human diseases remains a challenging task. Rare SNPs, rather than common ones, are thought to be more important in the pathology of most human diseases. We propose that rare SNPs should be divided into two categories dependent on whether the minor alleles are derived or ancestral. Derived alleles are less likely to have been purified by evolutionary processes and may be more likely to induce deleterious effects. We therefore hypothesized that the rare SNPs with derived minor alleles would be more important for human diseases and predicted that these variants would have larger functional or structural consequences relative to the rare variants for which the minor alleles are ancestral. We systematically investigated the consequences of the exonic SNPs on protein function, mRNA structure, and translation. We found that the functional and structural consequences are more significant for the rare exonic variants for which the minor alleles are derived. However, this pattern is reversed when the minor alleles are ancestral. Thus, the rare exonic SNPs with derived minor alleles are more likely to be deleterious. Age estimation of rare SNPs confirms that these potentially deleterious SNPs are recently evolved in the human population. These results have important implications for understanding the function of genetic variations in human exonic regions and for prioritizing functional SNPs in genome-wide association studies of human diseases. PMID:26454016
Singh, Vikas K; Khan, Aamir W; Saxena, Rachit K; Kumar, Vinay; Kale, Sandip M; Sinha, Pallavi; Chitikineni, Annapurna; Pazhamala, Lekha T; Garg, Vanika; Sharma, Mamta; Sameer Kumar, Chanda Venkata; Parupalli, Swathi; Vechalapu, Suryanarayana; Patil, Suyash; Muniswamy, Sonnappa; Ghanta, Anuradha; Yamini, Kalinati Narasimhan; Dharmaraj, Pallavi Subbanna; Varshney, Rajeev K
2016-05-01
To map resistance genes for Fusarium wilt (FW) and sterility mosaic disease (SMD) in pigeonpea, sequencing-based bulked segregant analysis (Seq-BSA) was used. Resistant (R) and susceptible (S) bulks from the extreme recombinant inbred lines of ICPL 20096 × ICPL 332 were sequenced. Subsequently, the SNP index was calculated between R- and S-bulks with the help of the draft genome sequence and a reference-guided assembly of ICPL 20096 (resistant parent). Seq-BSA provided seven candidate SNPs for FW and SMD resistance in pigeonpea. In parallel, four additional genotypes were re-sequenced and their combined analysis with R- and S-bulks provided a total of 8362 nonsynonymous (ns) SNPs. Of 8362 nsSNPs, 60 were found within the 2-Mb flanking regions of seven candidate SNPs identified through Seq-BSA. Haplotype analysis narrowed these down to eight nsSNPs in seven genes. These eight nsSNPs were further validated by re-sequencing 11 genotypes that are resistant and susceptible to FW and SMD. This analysis revealed association of four candidate nsSNPs in four genes with FW resistance and four candidate nsSNPs in three genes with SMD resistance. Further, in silico protein analysis and expression profiling identified the two most promising candidate genes, namely C.cajan_01839 for SMD resistance and C.cajan_03203 for FW resistance. Identified candidate genomic regions/SNPs will be useful for genomics-assisted breeding in pigeonpea. © 2015 The Authors. Plant Biotechnology Journal published by Society for Experimental Biology and The Association of Applied Biologists and John Wiley & Sons Ltd.
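The SNP-index comparison between bulks can be sketched as follows. The abstract does not give the exact formula or thresholds, so this assumes the commonly used definition (fraction of reads at a site that carry the non-reference allele) and a simple difference between bulks. The read counts, the 0.8 cut-off and the function names are hypothetical.

```python
def snp_index(alt_reads, total_reads):
    """Fraction of reads at a site that carry the non-reference allele."""
    if total_reads <= 0:
        raise ValueError("site has no read coverage")
    return alt_reads / total_reads


def delta_snp_index(r_bulk, s_bulk):
    """SNP index of the resistant bulk minus that of the susceptible bulk.

    Each argument is an (alt_reads, total_reads) pair for the same site.
    """
    return snp_index(*r_bulk) - snp_index(*s_bulk)


# Hypothetical sites: position -> (R-bulk counts, S-bulk counts).
sites = {
    101_500: ((1, 40), (38, 40)),    # bulks carry different alleles
    203_750: ((19, 40), (21, 40)),   # both bulks segregate about 50:50
}

# Keep sites where the two bulks differ strongly (assumed cut-off of 0.8).
candidates = [
    pos for pos, (r, s) in sites.items() if abs(delta_snp_index(r, s)) >= 0.8
]
```

Because the reference in the study is the resistant parent, a site linked to resistance would be expected to show an index near 0 in the R-bulk and near 1 in the S-bulk, as in the first hypothetical site.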
Genome-wide association study of coronary and aortic calcification in lung cancer screening CT
NASA Astrophysics Data System (ADS)
de Vos, Bob D.; van Setten, Jessica; de Jong, Pim A.; Mali, Willem P.; Oudkerk, Matthijs; Viergever, Max A.; Išgum, Ivana
2016-03-01
Arterial calcification has been related to cardiovascular disease (CVD) and osteoporosis. However, little is known about the role of genetics and the exact pathways leading to arterial calcification and its relation to bone density changes indicating osteoporosis. In this study, we conducted a genome-wide association study of arterial calcification burden, followed by a look-up of known single nucleotide polymorphisms (SNPs) for coronary artery disease (CAD) and myocardial infarction (MI), and bone mineral density (BMD) to test for a shared genetic basis between the traits. The study included a subcohort of the Dutch-Belgian lung cancer screening trial comprising 2,561 participants. Participants underwent baseline CT screening in one of two hospitals participating in the trial. Low-dose chest CT images were acquired without contrast enhancement and without ECG-synchronization. In these images coronary and aortic calcifications were identified automatically. Subsequently, the detected calcifications were quantified using coronary artery calcium Agatston and volume scores. Genotype data were available for these participants. A genome-wide association study was conducted on 10,220,814 SNPs using a linear regression model. To reduce the multiple testing burden, known CAD/MI and BMD SNPs were specifically tested (45 SNPs from the CARDIoGRAMplusC4D consortium and 60 SNPs from the GEFOS consortium). No novel significant SNPs were found. Significant enrichment for CAD/MI SNPs was observed in testing Agatston and coronary artery calcium volume scores. Moreover, a significant enrichment of BMD SNPs was shown in aortic calcium volume scores. This may indicate a genetic relation between BMD SNPs and arterial calcification burden.
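A single-SNP linear regression of the kind used in such a scan can be sketched in a few lines. This is an illustration only: it regresses a quantitative trait on the allele count at one SNP without covariates, the data are made up, and the Bonferroni threshold shown is an assumption, since the abstract does not state which significance threshold was applied.

```python
import math


def simple_linear_regression(genotypes, phenotype):
    """Ordinary least squares of phenotype on allele count (0, 1 or 2).

    Returns the per-allele effect (slope) and its standard error.
    """
    n = len(genotypes)
    mean_x = sum(genotypes) / n
    mean_y = sum(phenotype) / n
    sxx = sum((x - mean_x) ** 2 for x in genotypes)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(genotypes, phenotype))
    beta = sxy / sxx
    alpha = mean_y - beta * mean_x
    rss = sum((y - alpha - beta * x) ** 2 for x, y in zip(genotypes, phenotype))
    standard_error = math.sqrt(rss / (n - 2) / sxx)
    return beta, standard_error


# Hypothetical data: six participants, one SNP, one quantitative score.
genotypes = [0, 1, 2, 0, 1, 2]
scores = [1.0, 2.1, 2.9, 1.2, 1.9, 3.1]
beta, se = simple_linear_regression(genotypes, scores)
t_statistic = beta / se

# Assumed Bonferroni threshold if all 10,220,814 SNPs were tested at alpha = 0.05.
bonferroni_threshold = 0.05 / 10_220_814
```

Testing only the 45 + 60 previously reported SNPs relaxes the per-test threshold considerably, which is the motivation for the look-up described in the abstract.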
2011-01-01
Background: Single nucleotide polymorphisms (SNPs) are the most abundant source of genetic variation among individuals of a species. New genotyping technologies allow hundreds to thousands of SNPs to be examined in a single reaction for a wide range of applications such as genetic diversity analysis, linkage mapping, fine QTL mapping, association studies, and marker-assisted or genome-wide selection. In this paper, we evaluated the potential of highly-multiplexed SNP genotyping for genetic mapping in maritime pine (Pinus pinaster Ait.), the main conifer used for commercial plantation in southwestern Europe. Results: We designed a custom GoldenGate assay for 1,536 SNPs detected through the resequencing of gene fragments (707 in vitro SNPs/Indels) and from Sanger-derived Expressed Sequence Tags assembled into a unigene set (829 in silico SNPs/Indels). Offspring from three-generation outbred (G2) and inbred (F2) pedigrees were genotyped. The success rate of the assay was 63.6% and 74.8% for in silico and in vitro SNPs, respectively. A genotyping error rate of 0.4% was further estimated from segregating data of SNPs belonging to the same gene. Overall, 394 SNPs were available for mapping. A total of 287 SNPs were integrated with previously mapped markers in the G2 parental maps, while 179 SNPs were localized on the map generated from the analysis of the F2 progeny. Based on 98 markers segregating in both pedigrees, we were able to generate a consensus map comprising 357 SNPs from 292 different loci. Finally, the analysis of sequence homology between mapped markers and their orthologs in a Pinus taeda linkage map made it possible to align the 12 linkage groups of both species. Conclusions: Our results show that the GoldenGate assay can be used successfully for high-throughput SNP genotyping in maritime pine, a conifer species that has a genome seven times the size of the human genome.
This SNP-array will be extended thanks to recent sequencing effort using new generation sequencing technologies and will include SNPs from comparative orthologous sequences that were identified in the present study, providing a wider collection of anchor points for comparative genomics among the conifers. PMID:21767361
Chamala, Srikar; Beckstead, Wesley A; Rowe, Mark J; McClellan, David A
2007-01-01
We investigated whether the effect of evolutionary selection on three recent Single Nucleotide Polymorphisms (SNPs) in the mitochondrial sub-haplogroups of Pima Indians is consistent with their effects on metabolic efficiency. The mitochondrial SNPs impact metabolic rate and respiratory quotient, and may be adaptations to caloric restriction in a desert habitat. Using TreeSAAP software, we examined evolutionary selection in 107 mammalian species at these SNPs, characterising the biochemical shifts produced by the amino acid substitutions. Our results suggest that two SNPs were affected by selection during mammalian evolution in a manner consistent with their effects on metabolic efficiency in Pima Indians.
Kharrat, Najla; Abdelmouleh, Wafa; Abdelhedi, Rania; Alfadhli, Suad; Rebai, Ahmed
2012-01-01
DNA variations within the Angiotensin-Converting Enzyme (ACE) gene have been shown to be involved in the aetiology of several common diseases and the therapeutic response. This study reports a comparison of haplotype analysis of five SNPs in the ACE gene region using a sample of 100 healthy subjects derived from five different populations (Tunisian, Iranian, Kuwaiti, Bahraini and Indian). Strong linkage disequilibrium was found among all SNPs studied for all populations. Two SNPs (rs1800764 and rs4340) were identified as key SNPs for all populations. These SNPs will be valuable for future effective association studies of the ACE gene polymorphisms in Arab and Asian populations.
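Pairwise linkage disequilibrium between two SNPs, as discussed above, is usually summarized by D, D′ and r². The sketch below computes them from phased haplotypes; it is a generic illustration with made-up data, and it assumes phase is known, whereas studies on unphased genotypes typically estimate haplotype frequencies first. The function name and allele coding are hypothetical.

```python
def ld_statistics(haplotypes):
    """Return (D, |D'|, r^2) for two biallelic SNPs.

    haplotypes: list of (allele_at_snp1, allele_at_snp2) pairs coded 0 or 1.
    """
    n = len(haplotypes)
    p_a = sum(h[0] for h in haplotypes) / n          # frequency of allele 1 at SNP 1
    p_b = sum(h[1] for h in haplotypes) / n          # frequency of allele 1 at SNP 2
    p_ab = sum(1 for h in haplotypes if h == (1, 1)) / n
    d = p_ab - p_a * p_b
    variance_product = p_a * (1 - p_a) * p_b * (1 - p_b)
    r_squared = d * d / variance_product if variance_product > 0 else float("nan")
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else float("nan")
    return d, d_prime, r_squared


# Hypothetical sample of ten phased haplotypes with one recombinant.
sample = [(1, 1)] * 5 + [(0, 0)] * 4 + [(1, 0)]
d, d_prime, r_squared = ld_statistics(sample)
```

An r² of 1 means one SNP perfectly predicts the other, which is the sense in which a small set of key SNPs can stand in for the rest of a region in strong LD.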
Green synthesis of silver nanoparticles: characterization and determination of antibacterial potency
NASA Astrophysics Data System (ADS)
Annamalai, Jayshree; Nallamuthu, Thangaraju
2016-02-01
Silver ions (Ag+) and their compounds are highly toxic to microorganisms, exhibiting strong biocidal effects on many species of bacteria, but have a low toxicity toward animal cells. In the present study, silver nanoparticles (SNPs) were biosynthesized using an aqueous extract of Chlorella vulgaris as reducing agent, and the size of the SNPs synthesized ranged between 15 and 47 nm. The SNPs were characterized by UV-visible spectroscopy, scanning electron microscopy, transmission electron microscopy, X-ray diffraction and Fourier transform infrared spectroscopy, and analyzed for their antibacterial properties against human pathogens. This approach to SNP synthesis, which involves a green chemistry process, can be considered for the large-scale production of SNPs and in the development of biomedicines.
Getting DNA copy numbers without control samples
2012-01-01
Background The selection of the reference to scale the data in a copy number analysis is of paramount importance for achieving accurate estimates. Usually this reference is generated using control samples included in the study. However, these control samples are not always available, and in these cases an artificial reference must be created. A proper generation of this signal is crucial in terms of both noise and bias. We propose NSA (Normality Search Algorithm), a scaling method that works with and without control samples. It is based on the assumption that genomic regions enriched in SNPs with identical copy numbers in both alleles are likely to be normal. These normal regions are predicted for each sample individually and used to calculate the final reference signal. NSA can be applied to any CN data regardless of the microarray technology and preprocessing method. It also finds an optimal weighting of the samples, minimizing possible batch effects. Results Five human datasets (a subset of HapMap samples, Glioblastoma Multiforme (GBM), Ovarian, Prostate and Lung Cancer experiments) have been analyzed. It is shown that, using only tumoral samples, NSA is able to remove the bias in the copy number estimation, to reduce the noise and, therefore, to increase the ability to detect copy number aberrations (CNAs). These improvements allow NSA to also detect recurrent aberrations more accurately than other state-of-the-art methods. Conclusions NSA provides a robust and accurate reference for scaling probe signal data to CN values without the need for control samples. It minimizes the problems of bias, noise and batch effects in the estimation of CNs. Therefore, the NSA scaling approach helps to detect recurrent CNAs better than current methods. The automatic selection of references makes it useful for bulk analysis of many GEO or ArrayExpress experiments without the need to develop a parser to find the normal samples or possible batches within the data.
The method is available in the open-source R package NSA, which is an add-on to the aroma.cn framework. http://www.aroma-project.org/addons. PMID:22898240
Getting DNA copy numbers without control samples.
Ortiz-Estevez, Maria; Aramburu, Ander; Rubio, Angel
2012-08-16
The selection of the reference to scale the data in a copy number analysis is of paramount importance for achieving accurate estimates. Usually this reference is generated using control samples included in the study. However, these control samples are not always available, and in these cases an artificial reference must be created. A proper generation of this signal is crucial in terms of both noise and bias. We propose NSA (Normality Search Algorithm), a scaling method that works with and without control samples. It is based on the assumption that genomic regions enriched in SNPs with identical copy numbers in both alleles are likely to be normal. These normal regions are predicted for each sample individually and used to calculate the final reference signal. NSA can be applied to any CN data regardless of the microarray technology and preprocessing method. It also finds an optimal weighting of the samples, minimizing possible batch effects. Five human datasets (a subset of HapMap samples, Glioblastoma Multiforme (GBM), Ovarian, Prostate and Lung Cancer experiments) have been analyzed. It is shown that, using only tumoral samples, NSA is able to remove the bias in the copy number estimation, to reduce the noise and, therefore, to increase the ability to detect copy number aberrations (CNAs). These improvements allow NSA to also detect recurrent aberrations more accurately than other state-of-the-art methods. NSA provides a robust and accurate reference for scaling probe signal data to CN values without the need for control samples. It minimizes the problems of bias, noise and batch effects in the estimation of CNs. Therefore, the NSA scaling approach helps to detect recurrent CNAs better than current methods. The automatic selection of references makes it useful for bulk analysis of many GEO or ArrayExpress experiments without the need to develop a parser to find the normal samples or possible batches within the data.
The method is available in the open-source R package NSA, which is an add-on to the aroma.cn framework. http://www.aroma-project.org/addons.
2012-01-01
Background High-density genotyping arrays that measure hybridization of genomic DNA fragments to allele-specific oligonucleotide probes are widely used to genotype single nucleotide polymorphisms (SNPs) in genetic studies, including human genome-wide association studies. Hybridization intensities are converted to genotype calls by clustering algorithms that assign each sample to a genotype class at each SNP. Data for SNP probes that do not conform to the expected pattern of clustering are often discarded, contributing to ascertainment bias and resulting in lost information - as much as 50% in a recent genome-wide association study in dogs. Results We identified atypical patterns of hybridization intensities that were highly reproducible and demonstrated that these patterns represent genetic variants that were not accounted for in the design of the array platform. We characterized variable intensity oligonucleotide (VINO) probes that display such patterns and are found in all hybridization-based genotyping platforms, including those developed for human, dog, cattle, and mouse. When recognized and properly interpreted, VINOs recovered a substantial fraction of discarded probes and counteracted SNP ascertainment bias. We developed software (MouseDivGeno) that identifies VINOs and improves the accuracy of genotype calling. MouseDivGeno produced highly concordant genotype calls when compared with other methods but it uniquely identified more than 786000 VINOs in 351 mouse samples. We used whole-genome sequence from 14 mouse strains to confirm the presence of novel variants explaining 28000 VINOs in those strains. We also identified VINOs in human HapMap 3 samples, many of which were specific to an African population. Incorporating VINOs in phylogenetic analyses substantially improved the accuracy of a Mus species tree and local haplotype assignment in laboratory mouse strains. 
Conclusion The problems of ascertainment bias and missing information due to genotyping errors are widely recognized as limiting factors in genetic studies. We have conducted the first formal analysis of the effect of novel variants on genotyping arrays, and we have shown that these variants account for a large portion of miscalled and uncalled genotypes. Genetic studies will benefit from substantial improvements in the accuracy of their results by incorporating VINOs in their analyses. PMID:22260749
Exploiting Genome Structure in Association Analysis
Kim, Seyoung
2014-01-01
A genome-wide association study involves examining a large number of single-nucleotide polymorphisms (SNPs) to identify SNPs that are significantly associated with the given phenotype, while trying to reduce the false positive rate. Although haplotype-based association methods have been proposed to accommodate correlation information across nearby SNPs that are in linkage disequilibrium, none of these methods directly incorporates structural information such as recombination events along the chromosome. In this paper, we propose a new approach called stochastic block lasso for association mapping that exploits prior knowledge on linkage disequilibrium structure in the genome, such as recombination rates and distances between adjacent SNPs, in order to increase the power of detecting true associations while reducing false positives. Following a typical linear regression framework with the genotypes as inputs and the phenotype as output, our proposed method employs a sparsity-enforcing Laplacian prior for the regression coefficients, augmented by a first-order Markov process along the sequence of SNPs that incorporates the prior information on the linkage disequilibrium structure. The Markov-chain prior models the structural dependencies between a pair of adjacent SNPs, and allows us to look for associated SNPs in a coupled manner, combining strength from multiple nearby SNPs. Our results on HapMap-simulated datasets and mouse datasets show that there is a significant advantage in incorporating the prior knowledge on linkage disequilibrium structure for marker identification under whole-genome association. PMID:21548809
Langlois, Christine; Abadi, Arkan; Peralta-Romero, Jesus; Alyass, Akram; Suarez, Fernando; Gomez-Zamudio, Jaime; Burguete-Garcia, Ana I.; Yazdi, Fereshteh T.; Cruz, Miguel; Meyre, David
2016-01-01
Genome wide association studies (GWAS) have identified single-nucleotide polymorphisms (SNPs) that are associated with fasting plasma glucose (FPG) in adult European populations. The contribution of these SNPs to FPG in non-Europeans and children is unclear. We studied the association of 15 GWAS SNPs and a genotype score (GS) with FPG and 7 metabolic traits in 1,421 Mexican children and adolescents from Mexico City. Genotyping of the 15 SNPs was performed using TaqMan Open Array. We used multivariate linear regression models adjusted for age, sex, body mass index standard deviation score, and recruitment center. We identified significant associations between 3 SNPs (G6PC2 (rs560887), GCKR (rs1260326), MTNR1B (rs10830963)), the GS and FPG level. The FPG risk alleles of 11 out of the 15 SNPs (73.3%) displayed significant or non-significant beta values for FPG directionally consistent with those reported in adult European GWAS. The risk allele frequencies for 11 of 15 (73.3%) SNPs differed significantly in Mexican children and adolescents compared to European adults from the 1000G Project, but no significant enrichment in FPG risk alleles was observed in the Mexican population. Our data support a partial transferability of European GWAS FPG association signals in children and adolescents from the admixed Mexican population. PMID:27782183
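A genotype score of the kind mentioned above is often built by counting risk alleles across the selected SNPs. The abstract does not say whether the score was weighted, so the sketch below shows both an unweighted count and an optional effect-size weighting as assumptions. The weights, the genotypes and the function name are hypothetical; only the three rs identifiers come from the abstract.

```python
def genotype_score(risk_allele_counts, weights=None):
    """Sum of risk alleles carried across SNPs, optionally weighted.

    risk_allele_counts: number of risk alleles (0, 1 or 2) at each SNP.
    weights: optional per-SNP effect sizes, in the same order as the counts.
    """
    if any(count not in (0, 1, 2) for count in risk_allele_counts):
        raise ValueError("each count must be 0, 1 or 2")
    if weights is None:
        return sum(risk_allele_counts)
    if len(weights) != len(risk_allele_counts):
        raise ValueError("weights and counts must have the same length")
    return sum(w * c for w, c in zip(weights, risk_allele_counts))


# Hypothetical child genotyped at the three SNPs named in the abstract.
snps = ["rs560887", "rs1260326", "rs10830963"]
counts = [2, 1, 0]
unweighted = genotype_score(counts)
weighted = genotype_score(counts, weights=[0.07, 0.03, 0.08])  # made-up weights
```

With 15 SNPs, an unweighted score of this form ranges from 0 to 30 risk alleles per individual.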
Munretnam, Khamsigan; Alex, Livy; Ramzi, Nurul Hanis; Chahil, Jagdish Kaur; Kavitha, I S; Hashim, Nikman Adli Nor; Lye, Say Hean; Velapasamy, Sharmila; Ler, Lian Wee
2014-01-01
There is growing global interest in stratifying men by their risk of developing prostate cancer, so it is important to identify common genetic variants that confer this risk. Although many studies have identified more than a dozen common genetic variants that are strongly associated with prostate cancer, none has been conducted in the Malaysian population. To determine the association of such variants with prostate cancer in Malaysian men, we evaluated a panel of 768 SNPs previously associated with various cancers, including prostate-specific SNPs, in a population-based case-control study (51 case subjects with prostate cancer and 51 control subjects) of Malaysian men of Malay, Chinese and Indian ethnicity. We identified 21 SNPs significantly associated with prostate cancer. Of these, 12 SNPs were strongly associated with increased risk of prostate cancer, while the remaining nine were associated with reduced risk. However, after stratification by ethnicity, only five SNPs in Malays and three SNPs in Chinese remained significant. This could be due to the small sample size in each ethnic group. Significant non-genetic risk factors for prostate cancer were also identified. Our study is the first to investigate the involvement of multiple variants in susceptibility to prostate cancer (PC) in Malaysian men using a genotyping approach. The identified SNPs and non-genetic risk factors have a significant association with prostate cancer.
Setsirichok, Damrongrit; Tienboon, Phuwadej; Jaroonruang, Nattapong; Kittichaijaroen, Somkit; Wongseree, Waranyu; Piroonratana, Theera; Usavanarong, Touchpong; Limwongse, Chanin; Aporntewan, Chatchawit; Phadoongsidhi, Marong; Chaiyaratana, Nachol
2013-01-01
This article presents the ability of an omnibus permutation test on ensembles of two-locus analyses (2LOmb) to detect pure epistasis in the presence of genetic heterogeneity. The performance of 2LOmb is evaluated in various simulation scenarios covering two independent causes of complex disease where each cause is governed by a purely epistatic interaction. Different scenarios are set up by varying the number of available single nucleotide polymorphisms (SNPs) in data, number of causative SNPs and ratio of case samples from two affected groups. The simulation results indicate that 2LOmb outperforms multifactor dimensionality reduction (MDR) and random forest (RF) techniques in terms of a low number of output SNPs and a high number of correctly-identified causative SNPs. Moreover, 2LOmb is capable of identifying the number of independent interactions in tractable computational time and can be used in genome-wide association studies. 2LOmb is subsequently applied to a type 1 diabetes mellitus (T1D) data set, which is collected from a UK population by the Wellcome Trust Case Control Consortium (WTCCC). After screening for SNPs that locate within or near genes and exhibit no marginal single-locus effects, the T1D data set is reduced to 95,991 SNPs from 12,146 genes. The 2LOmb search in the reduced T1D data set reveals that 12 SNPs, which can be divided into two independent sets, are associated with the disease. The first SNP set consists of three SNPs from MUC21 (mucin 21, cell surface associated), three SNPs from MUC22 (mucin 22), two SNPs from PSORS1C1 (psoriasis susceptibility 1 candidate 1) and one SNP from TCF19 (transcription factor 19). A four-locus interaction between these four genes is also detected. The second SNP set consists of three SNPs from ATAD1 (ATPase family, AAA domain containing 1). 
Overall, the findings indicate the detection of pure epistasis in the presence of genetic heterogeneity and provide an alternative explanation for the aetiology of T1D in the UK population.
Salas, Antonio; Amigo, Jorge
2010-05-03
The high levels of variation characterising the mitochondrial DNA (mtDNA) molecule are due ultimately to its high average mutation rate; moreover, mtDNA variation is deeply structured in different populations and ethnic groups. There is growing interest in selecting a reduced number of mtDNA single nucleotide polymorphisms (mtSNPs) that account for the maximum level of discrimination power in a given population. Applications of the selected mtSNP panel range from anthropologic and medical studies to forensic genetic casework. This study proposes a new simulation-based method that explores the ability of different mtSNP panels to yield the maximum levels of discrimination power. The method explores subsets of mtSNPs of different sizes randomly chosen from a preselected panel of mtSNPs based on frequency. More than 2,000 complete genomes representing three main continental human population groups (Africa, Europe, and Asia) and two admixed populations ("African-Americans" and "Hispanics") were collected from GenBank and the literature, and were used as training sets. Haplotype diversity was measured for each combination of mtSNP and compared with existing mtSNP panels available in the literature. The data indicates that only a reduced number of mtSNPs ranging from six to 22 are needed to account for 95% of the maximum haplotype diversity of a given population sample. However, only a small proportion of the best mtSNPs are shared between populations, indicating that there is not a perfect set of "universal" mtSNPs suitable for all population contexts. The discrimination power provided by these mtSNPs is much higher than the power of the mtSNP panels proposed in the literature to date. Some mtSNP combinations also yield high diversity values in admixed populations. The proposed computational approach for exploring combinations of mtSNPs that optimise the discrimination power of a given set of mtSNPs is more efficient than previous empirical approaches. 
In contrast to previous findings, the results seem to indicate that only a few mtSNPs are needed to reach high levels of discrimination power in a population, independently of its ancestral background.
Salas, Antonio; Amigo, Jorge
2010-01-01
Background The high levels of variation characterising the mitochondrial DNA (mtDNA) molecule are due ultimately to its high average mutation rate; moreover, mtDNA variation is deeply structured in different populations and ethnic groups. There is growing interest in selecting a reduced number of mtDNA single nucleotide polymorphisms (mtSNPs) that account for the maximum level of discrimination power in a given population. Applications of the selected mtSNP panel range from anthropologic and medical studies to forensic genetic casework. Methodology/Principal Findings This study proposes a new simulation-based method that explores the ability of different mtSNP panels to yield the maximum levels of discrimination power. The method explores subsets of mtSNPs of different sizes randomly chosen from a preselected panel of mtSNPs based on frequency. More than 2,000 complete genomes representing three main continental human population groups (Africa, Europe, and Asia) and two admixed populations (“African-Americans” and “Hispanics”) were collected from GenBank and the literature, and were used as training sets. Haplotype diversity was measured for each combination of mtSNP and compared with existing mtSNP panels available in the literature. The data indicates that only a reduced number of mtSNPs ranging from six to 22 are needed to account for 95% of the maximum haplotype diversity of a given population sample. However, only a small proportion of the best mtSNPs are shared between populations, indicating that there is not a perfect set of “universal” mtSNPs suitable for all population contexts. The discrimination power provided by these mtSNPs is much higher than the power of the mtSNP panels proposed in the literature to date. Some mtSNP combinations also yield high diversity values in admixed populations. 
Conclusions/Significance The proposed computational approach for exploring combinations of mtSNPs that optimise the discrimination power of a given set of mtSNPs is more efficient than previous empirical approaches. In contrast to previous findings, the results seem to indicate that only a few mtSNPs are needed to reach high levels of discrimination power in a population, independently of its ancestral background. PMID:20454657
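The random-subset search described in this abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not say which diversity estimator was used, so the code assumes Nei's unbiased haplotype diversity, and the function names (`haplotype_diversity`, `panel_diversity`, `best_random_panel`) are hypothetical.

```python
import random
from collections import Counter


def haplotype_diversity(haplotypes):
    """Nei's haplotype diversity (assumed estimator): n/(n-1) * (1 - sum(p_i^2))."""
    n = len(haplotypes)
    if n < 2:
        return 0.0
    counts = Counter(haplotypes)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


def panel_diversity(samples, panel):
    """Diversity after reducing every sample to its alleles at the panel positions."""
    reduced = [tuple(sample[i] for i in panel) for sample in samples]
    return haplotype_diversity(reduced)


def best_random_panel(samples, candidates, size, trials, seed=0):
    """Draw `trials` random panels of `size` positions and keep the most diverse one."""
    rng = random.Random(seed)
    best_panel, best_h = None, -1.0
    for _ in range(trials):
        panel = tuple(sorted(rng.sample(candidates, size)))
        h = panel_diversity(samples, panel)
        if h > best_h:
            best_panel, best_h = panel, h
    return best_panel, best_h


if __name__ == "__main__":
    # Toy sequences (hypothetical data): position 0 is constant, 1 and 2 vary.
    toy = ["AAT", "AAC", "AGT", "AGC"]
    print(best_random_panel(toy, [0, 1, 2], size=2, trials=50))
```

In this toy input only the panel made of positions 1 and 2 separates all four sequences, which is the kind of small, population-specific panel the abstract reports.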
A genomic scale map of genetic diversity in Trypanosoma cruzi
2012-01-01
Background Trypanosoma cruzi, the causal agent of Chagas Disease, affects more than 16 million people in Latin America. The clinical outcome of the disease results from a complex interplay between environmental factors and the genetic background of both the human host and the parasite. However, knowledge of the genetic diversity of the parasite is currently limited to a number of highly studied loci. The availability of a number of genomes from different evolutionary lineages of T. cruzi provides an unprecedented opportunity to look at the genetic diversity of the parasite at a genomic scale. Results Using a bioinformatic strategy, we have clustered T. cruzi sequence data available in the public domain and obtained multiple sequence alignments in which one or two alleles from the reference CL-Brener were included. These data cover 4 major evolutionary lineages (DTUs): TcI, TcII, TcIII, and the hybrid TcVI. Using this set of alignments we have identified 288,957 high quality single nucleotide polymorphisms and 1,480 indels. In a reduced re-sequencing study we were able to validate ~97% of high-quality SNPs identified in 47 loci. Analysis of how these changes affect encoded protein products showed a 0.77 ratio of synonymous to non-synonymous changes in the T. cruzi genome. We observed 113 changes that introduce or remove a stop codon, some causing significant functional changes, and a number of tri-allelic and tetra-allelic SNPs that could be exploited in strain typing assays. Based on an analysis of the observed nucleotide diversity we show that the T. cruzi genome contains a core set of genes that are under apparent purifying selection. Interestingly, orthologs of known druggable targets show statistically significant lower nucleotide diversity values. Conclusions This study provides the first look at the genetic diversity of T. cruzi at a genomic scale. 
The analysis covers an estimated ~60% of the genetic diversity present in the population, providing an essential resource for future studies on the development of new drugs and diagnostics for Chagas Disease. These data are available through the TcSNP database (http://snps.tcruzi.org). PMID:23270511
Hou, Liping; Heilbronner, Urs; Degenhardt, Franziska; Adli, Mazda; Akiyama, Kazufumi; Akula, Nirmala; Ardau, Raffaella; Arias, Bárbara; Backlund, Lena; Banzato, Claudio E M; Benabarre, Antoni; Bengesser, Susanne; Bhattacharjee, Abesh Kumar; Biernacka, Joanna M; Birner, Armin; Brichant-Petitjean, Clara; Bui, Elise T; Cervantes, Pablo; Chen, Guo-Bo; Chen, Hsi-Chung; Chillotti, Caterina; Cichon, Sven; Clark, Scott R; Colom, Francesc; Cousins, David A; Cruceanu, Cristiana; Czerski, Piotr M; Dantas, Clarissa R; Dayer, Alexandre; Étain, Bruno; Falkai, Peter; Forstner, Andreas J; Frisén, Louise; Fullerton, Janice M; Gard, Sébastien; Garnham, Julie S; Goes, Fernando S; Grof, Paul; Gruber, Oliver; Hashimoto, Ryota; Hauser, Joanna; Herms, Stefan; Hoffmann, Per; Hofmann, Andrea; Jamain, Stephane; Jiménez, Esther; Kahn, Jean-Pierre; Kassem, Layla; Kittel-Schneider, Sarah; Kliwicki, Sebastian; König, Barbara; Kusumi, Ichiro; Lackner, Nina; Laje, Gonzalo; Landén, Mikael; Lavebratt, Catharina; Leboyer, Marion; Leckband, Susan G; Jaramillo, Carlos A López; MacQueen, Glenda; Manchia, Mirko; Martinsson, Lina; Mattheisen, Manuel; McCarthy, Michael J; McElroy, Susan L; Mitjans, Marina; Mondimore, Francis M; Monteleone, Palmiero; Nievergelt, Caroline M; Nöthen, Markus M; Ösby, Urban; Ozaki, Norio; Perlis, Roy H; Pfennig, Andrea; Reich-Erkelenz, Daniela; Rouleau, Guy A; Schofield, Peter R; Schubert, K Oliver; Schweizer, Barbara W; Seemüller, Florian; Severino, Giovanni; Shekhtman, Tatyana; Shilling, Paul D; Shimoda, Kazutaka; Simhandl, Christian; Slaney, Claire M; Smoller, Jordan W; Squassina, Alessio; Stamm, Thomas; Stopkova, Pavla; Tighe, Sarah K; Tortorella, Alfonso; Turecki, Gustavo; Volkert, Julia; Witt, Stephanie; Wright, Adam; Young, L Trevor; Zandi, Peter P; Potash, James B; DePaulo, J Raymond; Bauer, Michael; Reininghaus, Eva Z; Novák, Tomas; Aubry, Jean-Michel; Maj, Mario; Baune, Bernhard T; Mitchell, Philip B; Vieta, Eduard; Frye, Mark A; Rybakowski, Janusz K; Kuo, Po-Hsiu; 
Kato, Tadafumi; Grigoroiu-Serbanescu, Maria; Reif, Andreas; Del Zompo, Maria; Bellivier, Frank; Schalling, Martin; Wray, Naomi R; Kelsoe, John R; Alda, Martin; Rietschel, Marcella; McMahon, Francis J; Schulze, Thomas G
2016-03-12
Lithium is a first-line treatment in bipolar disorder, but individual response is variable. Previous studies have suggested that lithium response is a heritable trait. However, no genetic markers of treatment response have been reproducibly identified. Here, we report the results of a genome-wide association study of lithium response in 2563 patients collected by 22 participating sites from the International Consortium on Lithium Genetics (ConLiGen). Data from common single nucleotide polymorphisms (SNPs) were tested for association with categorical and continuous ratings of lithium response. Lithium response was measured using a well established scale (Alda scale). Genotyped SNPs were used to generate data at more than 6 million sites, using standard genomic imputation methods. Traits were regressed against genotype dosage. Results were combined across two batches by meta-analysis. A single locus of four linked SNPs on chromosome 21 met genome-wide significance criteria for association with lithium response (rs79663003, p=1·37 × 10^-8; rs78015114, p=1·31 × 10^-8; rs74795342, p=3·31 × 10^-9; and rs75222709, p=3·50 × 10^-9). In an independent, prospective study of 73 patients treated with lithium monotherapy for a period of up to 2 years, carriers of the response-associated alleles had a significantly lower rate of relapse than carriers of the alternate alleles (p=0·03268, hazard ratio 3·8, 95% CI 1·1-13·0). The response-associated region contains two genes for long, non-coding RNAs (lncRNAs), AL157359.3 and AL157359.4. LncRNAs are increasingly appreciated as important regulators of gene expression, particularly in the CNS. Confirmed biomarkers of lithium response would constitute an important step forward in the clinical management of bipolar disorder. Further studies are needed to establish the biological context and potential clinical utility of these findings. Funding: Deutsche Forschungsgemeinschaft, National Institute of Mental Health Intramural Research Program. 
Du, Zhong-Jun; Cui, Guan-Qun; Zhang, Juan; Liu, Xiao-Mei; Zhang, Zhi-Hu; Jia, Qiang; Ng, Jack C; Peng, Cheng; Bo, Cun-Xiang; Shao, Hua
2017-01-01
Gap junction intercellular communication (GJIC) between cardiomyocytes is essential for synchronous heart contraction and relies on connexin-containing channels. Connexin 43 (Cx43) is a major component involved in GJIC in heart tissue, and its abnormal expression is closely associated with various cardiac diseases. Silica nanoparticles (SNPs) are known to induce cardiovascular toxicity. However, the mechanisms through which GJIC plays a role in cardiomyocytes apoptosis induced by SNPs remain unknown. The aim of the present study is to determine whether SNPs-decreased GJIC promotes apoptosis in rat cardiomyocytes cell line (H9c2 cells) via the mitochondrial pathway using CCK-8 Kit, scrape-loading dye transfer technique, Annexin V/PI double-staining assays, and Western blot analysis. The results showed that SNPs elicited cytotoxicity in H9c2 cells in a time- and concentration-dependent manner. SNPs also reduced GJIC in H9c2 cells in a concentration-dependent manner through downregulation of Cx43 and upregulation of P-Cx43. Inhibition of gap junctions by gap junction blocker carbenoxolone disodium resulted in decreased survival and increased apoptosis, whereas enhancement of the gap junctions by retinoic acid led to enhanced survival but decreased apoptosis. Furthermore, SNPs-induced apoptosis through the disrupted functional gap junction was correlated with abnormal expressions of the proteins involved in the mitochondrial pathway-related apoptosis such as Bcl-2/Bax, cytochrome C, Caspase-9, and Caspase-3. Taken together, our results provide the first evidence that SNPs-decreased GJIC promotes apoptosis in cardiomyocytes via the mitochondrial pathway. In addition, downregulation of GJIC by SNPs in cardiomyocytes is mediated through downregulation of Cx43 and upregulation of P-Cx43. 
These results suggest that, in a rat cardiomyocyte cell line, GJIC plays a protective role against SNPs-induced apoptosis and that GJIC may be one of the targets for SNPs-induced biological effects.
2013-01-01
Background SNPs&GO is a method for the prediction of deleterious Single Amino acid Polymorphisms (SAPs) using protein functional annotation. In this work, we present the web server implementation of SNPs&GO (WS-SNPs&GO). The server is based on Support Vector Machines (SVM) and for a given protein, its input comprises: the sequence and/or its three-dimensional structure (when available), a set of target variations and its functional Gene Ontology (GO) terms. The output of the server provides, for each protein variation, the probabilities to be associated to human diseases. Results The server consists of two main components, including updated versions of the sequence-based SNPs&GO (recently scored as one of the best algorithms for predicting deleterious SAPs) and of the structure-based SNPs&GO3d programs. Sequence and structure based algorithms are extensively tested on a large set of annotated variations extracted from the SwissVar database. Selecting a balanced dataset with more than 38,000 SAPs, the sequence-based approach achieves 81% overall accuracy, 0.61 correlation coefficient and an Area Under the Curve (AUC) of the Receiver Operating Characteristic (ROC) curve of 0.88. For the subset of ~6,600 variations mapped on protein structures available at the Protein Data Bank (PDB), the structure-based method scores with 84% overall accuracy, 0.68 correlation coefficient, and 0.91 AUC. When tested on a new blind set of variations, the results of the server are 79% and 83% overall accuracy for the sequence-based and structure-based inputs, respectively. Conclusions WS-SNPs&GO is a valuable tool that includes in a unique framework information derived from protein sequence, structure, evolutionary profile, and protein function. WS-SNPs&GO is freely available at http://snps.biofold.org/snps-and-go. PMID:23819482
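The accuracy and correlation figures quoted in this abstract are standard summaries of a binary confusion matrix. The sketch below shows how such values are computed. It assumes that "correlation coefficient" means the Matthews correlation coefficient, which the abstract does not state, and the counts in the usage lines are hypothetical, not taken from the SwissVar benchmark.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient; defined as 0.0 when a margin is empty."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


if __name__ == "__main__":
    # Hypothetical balanced test set of 100 variants (50 disease, 50 neutral).
    tp, tn, fp, fn = 40, 41, 9, 10
    print(accuracy(tp, tn, fp, fn))
    print(matthews_corrcoef(tp, tn, fp, fn))
```

On a balanced dataset the two measures move together, which is why an accuracy near 80% typically goes with a correlation near 0.6.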
Li, Yi; Gao, Yuxuan; Kim, You-Sam; Iqbal, Asif; Kim, Jong-Joo
2017-01-01
A whole genome association study was conducted to identify single nucleotide polymorphisms (SNPs) with additive and dominant effects for growth and carcass traits in Korean native cattle, Hanwoo. The data set comprised 61 sires and their 486 Hanwoo steers that were born between spring of 2005 and fall of 2007. The steers were genotyped with the 35,968 SNPs that were embedded in the Illumina bovine SNP 50K beadchip, and six growth and carcass quality traits were measured for the steers. A series of lack-of-fit tests between the models was applied to classify the gene expression pattern as additive or dominant. A total of 18 (0), 15 (3), 12 (8), 15 (18), 11 (7), and 21 (1) SNPs were detected at the 5% chromosome (genome)-wise level for weaning weight (WWT), yearling weight (YWT), carcass weight (CWT), backfat thickness (BFT), longissimus dorsi muscle area (LMA) and marbling score, respectively. Among the 129 significant SNPs, 56 SNPs had additive effects, 20 SNPs dominance effects, and 53 SNPs both additive and dominance effects, suggesting that dominance inheritance mode be considered in genetic improvement for growth and carcass quality in Hanwoo. The significant SNPs were located at 33 quantitative trait locus (QTL) regions on 18 Bos taurus chromosomes (i.e. BTA 3, 4, 5, 6, 7, 9, 11, 12, 13, 14, 16, 17, 18, 20, 23, 26, 28, and 29). There is strong evidence that BTA14 is the key chromosome affecting CWT. Also, BTA20 is the key chromosome for almost all traits measured (WWT, YWT, LMA). The application of various additive and dominance SNP models enabled better characterization of SNP inheritance mode for growth and carcass quality traits in Hanwoo, and many of the detected SNPs or QTL had dominance effects, suggesting that dominance be considered for the whole-genome SNP data and implementation of successive molecular breeding schemes in Hanwoo.
Explaining the disease phenotype of intergenic SNP through predicted long range regulation
Chen, Jingqi; Tian, Weidong
2016-01-01
Thousands of disease-associated SNPs (daSNPs) are located in intergenic regions (IGR), making it difficult to understand their association with disease phenotypes. Recent analysis found that non-coding daSNPs were frequently located in or close to regulatory elements, inspiring us to try to explain the disease phenotypes of IGR daSNPs through nearby regulatory sequences. Hence, after locating the nearest distal regulatory element (DRE) to a given IGR daSNP, we applied a computational method named INTREPID to predict the target genes regulated by the DRE, and then investigated their functional relevance to the IGR daSNP's disease phenotypes. Of all IGR daSNP-disease phenotype associations investigated, 36.8% were possibly explainable through the predicted target genes, which were enriched with, were functionally relevant to, or consisted of the corresponding disease genes. This proportion could be further increased to 60.5% if the LD SNPs of daSNPs were also considered. Furthermore, the predicted SNP-target gene pairs were enriched with known eQTL/mQTL SNP-gene relationships. Overall, IGR daSNPs likely contribute to disease phenotypes by interfering with the regulatory function of their nearby DREs and causing abnormal expression of disease genes. PMID:27280978
Development of a spreadsheet for SNPs typing using Microsoft EXCEL.
Hashiyada, Masaki; Itakura, Yukio; Takahashi, Shirushi; Sakai, Jun; Funayama, Masato
2009-04-01
Single-nucleotide polymorphisms (SNPs) have characteristics that make them well suited to forensic studies and applications. In our institute, SNP typing was performed with TaqMan SNP Genotyping Assays using the ABI PRISM 7500 FAST Real-Time PCR System (Applied Biosystems) and Sequence Detection Software ver. 1.4 (Applied Biosystems). The TaqMan method requires two positive controls (Allele 1 and Allele 2) and one negative control to analyze each SNP locus. Therefore, up to 24 loci per person can be analyzed on a 96-well plate at the same time. If SNP analysis is to be applied to biometric authentication, 48 or more loci are required to identify a person. In this study, we designed a spreadsheet package using Microsoft EXCEL, with population data taken from our population studies of 120 SNPs. On the spreadsheet, we defined SNP types using "template files" instead of positive and negative controls. The template files consisted of the results of 94 unknown samples and two negative controls for each of the 120 SNP loci we had previously studied. Using these files, the spreadsheet could analyze 96 SNPs on a 96-well plate simultaneously.
A novel method for in silico identification of regulatory SNPs in human genome.
Li, Rong; Zhong, Dexing; Liu, Ruiling; Lv, Hongqiang; Zhang, Xinman; Liu, Jun; Han, Jiuqiang
2017-02-21
Regulatory single nucleotide polymorphisms (rSNPs), a kind of functional noncoding genetic variant, can affect gene expression in a regulatory way, and they are thought to be associated with increased susceptibility to complex diseases. Here a novel computational approach to identify potential rSNPs is presented. Unlike most other rSNP-finding methods, which are based on the hypothesis that SNPs causing large allele-specific changes in transcription factor binding affinities are more likely to play regulatory functions, we use a set of documented, experimentally verified rSNPs and nonfunctional background SNPs to train classifiers and thereby find the discriminating features. To characterize variants, an extensive range of characteristics, such as sequence context, DNA structure and evolutionary conservation, is analyzed. A support vector machine is adopted to build the classifier model, together with an ensemble method to deal with unbalanced data. The 10-fold cross-validation result shows that our method achieves a sensitivity of ~78% and a specificity of ~82%. Furthermore, our method performs better at handling false positives than some other algorithms based on the aforementioned hypothesis. The original data and the MATLAB source code are available at https://sourceforge.net/projects/rsnppredict/.
A Fundamental Relationship Between Genotype Frequencies and Fitnesses
Lachance, Joseph
2008-01-01
The set of possible postselection genotype frequencies in an infinite, randomly mating population is found. Geometric mean heterozygote frequency divided by geometric mean homozygote frequency equals two times the geometric mean heterozygote fitness divided by geometric mean homozygote fitness. The ratio of genotype frequencies provides a measure of genetic variation that is independent of allele frequencies. When this ratio does not equal two, either selection or population structure is present. Within-population HapMap data show population-specific patterns, while pooled data show an excess of homozygotes. PMID:18780726
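The stated relationship can be checked numerically in the simplest case. The sketch below assumes a single locus with two alleles, Hardy-Weinberg proportions before selection, and viability selection with fitnesses `w_AA`, `w_Aa`, `w_aa` (names chosen here for illustration). In that case the heterozygote frequency divided by the geometric mean of the two homozygote frequencies equals 2·w_Aa / sqrt(w_AA·w_aa), whatever the allele frequency.

```python
import math


def postselection_frequencies(p, w_AA, w_Aa, w_aa):
    """Genotype frequencies after selection, starting from Hardy-Weinberg proportions."""
    q = 1.0 - p
    raw = (p * p * w_AA, 2.0 * p * q * w_Aa, q * q * w_aa)
    mean_fitness = sum(raw)
    return tuple(x / mean_fitness for x in raw)


def heterozygote_ratio(f_AA, f_Aa, f_aa):
    """Heterozygote frequency over the geometric mean of the homozygote frequencies."""
    return f_Aa / math.sqrt(f_AA * f_aa)


if __name__ == "__main__":
    w_AA, w_Aa, w_aa = 0.9, 1.0, 0.6
    expected = 2.0 * w_Aa / math.sqrt(w_AA * w_aa)
    for p in (0.1, 0.5, 0.8):
        ratio = heterozygote_ratio(*postselection_frequencies(p, w_AA, w_Aa, w_aa))
        print(p, ratio, expected)
```

With equal fitnesses the ratio is exactly two, so a departure from two points to selection or population structure, as the abstract notes.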
Pombar-Gomez, Maria; Lopez-Lopez, Elixabet; Martin-Guerrero, Idoia; Garcia-Orad Carles, Africa; de Pancorbo, Marian M
2015-05-01
Single nucleotide polymorphisms (SNPs) are an interesting option to facilitate the analysis of highly degraded DNA by allowing the reduction of the size of the DNA amplicons. The SNPforID 52-plex panel is a clear example of the use of non-coding SNPs in forensic genetics. However, nonstop advances in studies of genetic polymorphisms are leading to the discovery of new associations between SNPs and diseases. The aim of this study was to perform a comprehensive review of the state of association between the 52 SNPs in the 52-plex panel and diseases or other traits related to their treatment, such as drug response characters. In order to achieve this goal, we have conducted a bioinformatic search for each SNP included in the panel and the SNPs in linkage disequilibrium (LD) with them in the European population (r² > 0.8). A total of 424 SNPs (52 in the panel and 372 in LD) were investigated in PubMed, Scopus, and dbSNP databases. Our results show that three SNPs in the SNPforID 52-plex panel (rs2107612, rs1979255, rs1463729) have been associated with diseases such as hypertension or macular degeneration, as well as drug response. Similarly, three out of the 372 SNPs in LD (rs2107614, r² = 0.859; rs765250, r² = 0.858; rs11064560, r² = 0.887) are also associated with various pathologies. In view of these results, we propose the need for a periodic review of the SNPs used in forensic genetics in order to keep their associations with diseases or related phenotypes updated and to evaluate their continuity in forensic panels for avoiding legal and ethical conflicts.
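The linkage-disequilibrium threshold used in this abstract rests on the squared correlation between alleles at two loci. As a minimal sketch, assuming phased haplotypes with alleles coded 0/1 (the abstract does not describe how LD was obtained, and `ld_r_squared` is a hypothetical name), the statistic is r² = D² / (pA(1 − pA) pB(1 − pB)) with D = pAB − pA·pB:

```python
def ld_r_squared(haplotypes):
    """r^2 between two biallelic loci from phased haplotypes given as (a, b) pairs of 0/1."""
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n
    p_b = sum(b for _, b in haplotypes) / n
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b  # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both loci must be polymorphic")
    return d * d / denom


if __name__ == "__main__":
    # Hypothetical haplotypes: the second locus almost always tracks the first.
    sample = [(1, 1)] * 9 + [(0, 0)] * 9 + [(1, 0), (0, 1)]
    r2 = ld_r_squared(sample)
    print(r2, r2 > 0.8)
```

A SNP pair would count as "in LD" under the abstract's criterion when the returned value exceeds 0.8.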
A whole genome analyses of genetic variants in two Kelantan Malay individuals.
Wan Juhari, Wan Khairunnisa; Md Tamrin, Nur Aida; Mat Daud, Mohd Hanif Ridzuan; Isa, Hatin Wan; Mohd Nasir, Nurfazreen; Maran, Sathiya; Abdul Rajab, Nur Shafawati; Ahmad Amin Noordin, Khairul Bariah; Nik Hassan, Nik Norliza; Tearle, Rick; Razali, Rozaimi; Merican, Amir Feisal; Zilfalil, Bin Alwi
2014-12-01
Sequencing the genomes of two members of the Royal Kelantan Malay family provides insights into Kelantan Malay whole-genome sequences. The two Kelantan Malay genomes were analyzed for SNP markers associated with thalassemia and Helicobacter pylori infection. Helicobacter pylori infection has been reported to have a low prevalence in the north-east compared with the west coast of Peninsular Malaysia, and beta-thalassemia is known to be one of the most common inherited genetic disorders in Malaysia. By combining SNP information from the literature, GWAS and NCBI ClinVar, 18 unique SNPs were selected for further analysis. Of these 18 SNPs, 10 came from a previous study of Helicobacter pylori infection among Malay patients, 6 from NCBI ClinVar and 2 from GWAS. The analysis reveals that both Royal Kelantan Malay genomes shared all 10 SNPs identified by Maran (Single Nucleotide Polymorphisms (SNPs) genotypic profiling of Malay patients with and without Helicobacter pylori infection in Kelantan, 2011) and one SNP from a GWAS. In addition, the analysis reveals that both genomes shared three SNP markers, HBG1 (rs1061234), HBB (rs1609812) and BCL11A (rs766432), all of which were associated with beta-thalassemia. Our findings suggest that the Royal Kelantan Malays carry SNPs that are associated with protection against Helicobacter pylori infection. In addition, they carry SNPs that are associated with beta-thalassemia. These findings are in line with those of other researchers who studied thalassemia and Helicobacter pylori infection in the non-royal Malay population.
Sohani, Zahra N; Deng, Wei Q; Pare, Guillaume; Meyre, David; Gerstein, Hertzel C; Anand, Sonia S
2014-11-01
South Asians are up to four times more likely to develop type 2 diabetes than white Europeans. It is postulated that the higher prevalence results from greater genetic risk. To evaluate this hypothesis, we: (1) systematically reviewed the literature for single nucleotide polymorphisms (SNPs) predisposing to type 2 diabetes in South Asians; (2) compared risk estimates, risk alleles and risk allele frequencies of predisposing SNPs between South Asians and white Europeans; and (3) tested the association of novel SNPs discovered from South Asians in white Europeans. MEDLINE, Embase, the Cumulative Index to Nursing and Allied Health Literature (CINAHL) and the Cochrane registry were searched for studies of genetic variants associated with type 2 diabetes in South Asians. Meta-analysis estimates for common and novel bi-allelic SNPs in South Asians were compared with white Europeans from the DIAbetes Genetics Replication And Meta-analysis (DIAGRAM) consortium. The population burden from predisposing SNPs was assessed using a genotype score. Twenty-four SNPs from 21 loci were associated with type 2 diabetes in South Asians after meta-analysis. The majority of SNPs increase odds of the disorder by 15-35% per risk allele. No substantial differences appear to exist in risk estimates between South Asians and white Europeans from SNPs common to both groups, and the population burden also does not differ. Eight of the 24 are novel SNPs discovered from South Asian genome-wide association studies, some of which show nominal associations with type 2 diabetes in white Europeans. Based on current literature there is no strong evidence to indicate that South Asians possess a greater genetic risk of type 2 diabetes than white Europeans.
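A genotype score of the kind mentioned in this abstract can be sketched as a count of risk alleles. The abstract does not say whether the score was weighted, so the code assumes the unweighted form. The combined odds ratio uses a multiplicative (log-additive) per-allele model, which is also an assumption, and the per-allele odds ratios in the usage lines are hypothetical values inside the 15-35% range quoted above.

```python
def genotype_score(risk_allele_counts):
    """Unweighted genotype score: total number of risk alleles across SNPs."""
    if any(count not in (0, 1, 2) for count in risk_allele_counts):
        raise ValueError("each SNP must carry 0, 1 or 2 risk alleles")
    return sum(risk_allele_counts)


def combined_odds_ratio(risk_allele_counts, per_allele_odds_ratios):
    """Odds ratio relative to carrying no risk alleles, assuming multiplicative effects."""
    result = 1.0
    for count, odds_ratio in zip(risk_allele_counts, per_allele_odds_ratios):
        result *= odds_ratio ** count
    return result


if __name__ == "__main__":
    counts = [2, 0, 1]                 # risk alleles carried at three SNPs
    odds_ratios = [1.20, 1.30, 1.15]   # hypothetical per-allele odds ratios
    print(genotype_score(counts))
    print(combined_odds_ratio(counts, odds_ratios))
```

Comparing the distribution of such scores between populations is one way to express the "population burden" the abstract refers to.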
Wang, Yanru; Freedman, Jennifer A; Liu, Hongliang; Moorman, Patricia G; Hyslop, Terry; George, Daniel J; Lee, Norman H; Patierno, Steven R; Wei, Qingyi
2017-08-15
Evidence suggests that cells with a stemness phenotype play a pivotal role in oncogenesis, and prostate cells exhibiting this phenotype have been identified. We used two genome-wide association study (GWAS) datasets of African descendants, from the Multiethnic/Minority Cohort Study of Diet and Cancer (MEC) and the Ghana Prostate Study, and two GWAS datasets of non-Hispanic whites, from the Prostate, Lung, Colorectal and Ovarian (PLCO) Cancer Screening Trial and the Breast and Prostate Cancer Cohort Consortium (BPC3), to analyze the associations between genetic variants of stemness-related genes and racial disparities in susceptibility to prostate cancer. We evaluated associations of single-nucleotide polymorphisms (SNPs) in 25 stemness-related genes with prostate cancer risk in 1,609 cases and 2,550 controls of non-Hispanic whites (4,934 SNPs) and 1,144 cases and 1,116 controls of African descendants (5,448 SNPs) with correction by false discovery rate ≤0.2. We identified 32 SNPs in five genes (TP63, ALDH1A1, WNT1, MET and EGFR) that were significantly associated with prostate cancer risk, of which six SNPs in three genes (TP63, ALDH1A1 and WNT1) and eight EGFR SNPs showed heterogeneity in susceptibility between these two racial groups. In addition, 13 SNPs in MET and one in ALDH1A1 were found only in African descendants. The in silico bioinformatics analyses revealed that EGFR rs2072454 and SNPs in linkage with the identified SNPs in MET and ALDH1A1 (r² > 0.6) were predicted to regulate RNA splicing. These variants may serve as novel biomarkers for racial disparities in prostate cancer risk.
Genetic variants in the vitamin D pathway and breast cancer disease-free survival
Brewster, Abenaa M.
2013-01-01
Epidemiological studies have investigated the association between vitamin D pathway genes and breast cancer risk; however, little is known about the association between vitamin D pathway genes and breast cancer prognosis. In a retrospective cohort of 1029 patients with early-stage breast cancer, we analyzed the association between 106 tagging single nucleotide polymorphisms (SNPs) in eight vitamin D pathway genes and breast cancer disease-free survival (DFS) using Cox regression analysis adjusted for known prognostic variables. Using a false discovery rate of 10%, six intronic SNPs were significantly associated with poorer DFS: retinoid-X receptor alpha (RXRA) SNPs (rs881658, rs11185659, rs10881583, rs881657 and rs7864987) and plasminogen activator and urokinase receptor (PLAUR) SNP (rs4251864). Treatment received (no systemic therapy, hormone therapy alone or chemotherapy) was an effect modifier of the RXRA SNPs association with DFS (P < 0.05); therefore, we stratified further analysis by treatment group. Among patients who did not receive systemic therapy, RXRA SNP [rs10881583 (P = 0.02)] was associated with poorer DFS, and among patients who received chemotherapy, RXRA SNPs (rs881658, rs11185659, rs10881583, rs881657 and rs7864987) were associated with poorer DFS (P < 0.001 for all SNPs). However, RXRA SNPs: rs10881583 (P < 0.001) and rs881657 (P = 0.02) were associated with improved DFS in patients treated with hormone therapy alone. Our results suggest that SNPs in the RXRA and PLAUR genes in the vitamin D pathway may contribute to breast cancer DFS. In particular, SNPs in RXRA may predict for poorer or improved DFS in patients, according to type of systemic treatment received. If validated, these markers could be used for risk stratification of breast cancer patients. PMID:23180655
SNP selection and classification of genome-wide SNP data using stratified sampling random forests.
Wu, Qingyao; Ye, Yunming; Liu, Yang; Ng, Michael K
2012-09-01
In high-dimensional genome-wide association (GWA) case-control data for complex diseases, a large proportion of single-nucleotide polymorphisms (SNPs) is usually irrelevant to the disease. Simple random sampling in a random forest, using the default mtry parameter to choose the feature subspace, will select too many subspaces without informative SNPs. An exhaustive search for an optimal mtry is often required in order to include useful and relevant SNPs and discard the vast number of non-informative SNPs. However, such a search is too time-consuming to be practical for high-dimensional GWA data. The main aim of this paper is to propose a stratified sampling method for feature subspace selection to generate decision trees in a random forest for high-dimensional GWA data. Our idea is to design an equal-width discretization scheme for informativeness to divide SNPs into multiple groups. In feature subspace selection, we randomly select the same number of SNPs from each group and combine them to form a subspace to generate a decision tree. The advantage of this stratified sampling procedure is that it ensures each subspace contains enough useful SNPs while avoiding the very high computational cost of an exhaustive search for an optimal mtry and maintaining the randomness of a random forest. We employ two genome-wide SNP data sets (Parkinson case-control data comprising 408,803 SNPs and Alzheimer case-control data comprising 380,157 SNPs) to demonstrate that the proposed stratified sampling method is effective and that it generates better random forests, with higher accuracy and a lower error bound, than Breiman's random forest generation method. For the Parkinson data, we also show some interesting genes identified by the method, which may be associated with neurological disorders, for further biological investigation.
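The subspace selection step described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes each SNP already has a numeric informativeness score (the abstract does not say how it is computed), and the function names, the number of groups and the per-group sample size are all hypothetical.

```python
import random

def equal_width_groups(scores, n_groups):
    """Bin feature indices into n_groups intervals of equal width
    spanning the range of the informativeness scores."""
    lo, hi = min(scores), max(scores)
    width = (hi - lo) / n_groups
    if width == 0:
        width = 1.0  # all scores identical: everything lands in group 0
    groups = [[] for _ in range(n_groups)]
    for idx, score in enumerate(scores):
        # Clamp so the maximum score falls in the last group.
        b = min(int((score - lo) / width), n_groups - 1)
        groups[b].append(idx)
    return groups

def stratified_subspace(groups, per_group, rng):
    """Draw the same number of features from every group (fewer if a
    group is smaller) and merge them into one feature subspace."""
    subspace = []
    for members in groups:
        k = min(per_group, len(members))
        subspace.extend(rng.sample(members, k))
    return subspace

# Toy scores for six SNPs (hypothetical values).
scores = [0.0, 0.1, 0.5, 0.6, 0.9, 1.0]
groups = equal_width_groups(scores, n_groups=2)
subspace = stratified_subspace(groups, per_group=1, rng=random.Random(0))
```

Each decision tree in the forest would be grown on a freshly drawn subspace, so low-scoring and high-scoring SNPs are both represented in every tree without tuning mtry.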
Ramos, Antonio M.; Crooijmans, Richard P. M. A.; Affara, Nabeel A.; Amaral, Andreia J.; Archibald, Alan L.; Beever, Jonathan E.; Bendixen, Christian; Churcher, Carol; Clark, Richard; Dehais, Patrick; Hansen, Mark S.; Hedegaard, Jakob; Hu, Zhi-Liang; Kerstens, Hindrik H.; Law, Andy S.; Megens, Hendrik-Jan; Milan, Denis; Nonneman, Danny J.; Rohrer, Gary A.; Rothschild, Max F.; Smith, Tim P. L.; Schnabel, Robert D.; Van Tassell, Curt P.; Taylor, Jeremy F.; Wiedmann, Ralph T.; Schook, Lawrence B.; Groenen, Martien A. M.
2009-01-01
Background The dissection of complex traits of economic importance to the pig industry requires the availability of a significant number of genetic markers, such as single nucleotide polymorphisms (SNPs). This study was conducted to discover several hundreds of thousands of porcine SNPs using next generation sequencing technologies and use these SNPs, as well as others from different public sources, to design a high-density SNP genotyping assay. Methodology/Principal Findings A total of 19 reduced representation libraries derived from four swine breeds (Duroc, Landrace, Large White, Pietrain) and a Wild Boar population and three restriction enzymes (AluI, HaeIII and MspI) were sequenced using Illumina's Genome Analyzer (GA). The SNP discovery effort resulted in the de novo identification of over 372K SNPs. More than 549K SNPs were used to design the Illumina Porcine 60K+SNP iSelect Beadchip, now commercially available as the PorcineSNP60. A total of 64,232 SNPs were included on the Beadchip. Results from genotyping the 158 individuals used for sequencing showed a high overall SNP call rate (97.5%). Of the 62,621 loci that could be reliably scored, 58,994 were polymorphic yielding a SNP conversion success rate of 94%. The average minor allele frequency (MAF) for all scorable SNPs was 0.274. Conclusions/Significance Overall, the results of this study indicate the utility of using next generation sequencing technologies to identify large numbers of reliable SNPs. In addition, the validation of the PorcineSNP60 Beadchip demonstrated that the assay is an excellent tool that will likely be used in a variety of future studies in pigs. PMID:19654876
Dennis, Jessica; Medina-Rivera, Alejandra; Truong, Vinh; Antounians, Lina; Zwingerman, Nora; Carrasco, Giovana; Strug, Lisa; Wells, Phil; Trégouët, David-Alexandre; Morange, Pierre-Emmanuel; Wilson, Michael D; Gagnon, France
2017-07-01
Tissue factor pathway inhibitor (TFPI) regulates the formation of intravascular blood clots, which manifest clinically as ischemic heart disease, ischemic stroke, and venous thromboembolism (VTE). TFPI plasma levels are heritable, but the genetics underlying TFPI plasma level variability are poorly understood. Herein we report the first genome-wide association scan (GWAS) of TFPI plasma levels, conducted in 251 individuals from five extended French-Canadian families ascertained on VTE. To improve discovery, we also applied a hypothesis-driven (HD) GWAS approach that prioritized single nucleotide polymorphisms (SNPs) in (1) hemostasis pathway genes, and (2) vascular endothelial cell (EC) regulatory regions, which are among the highest expressers of TFPI. Our GWAS identified 131 SNPs with suggestive evidence of association (P-value < 5 × 10⁻⁸), but no SNPs reached the genome-wide threshold for statistical significance. Hemostasis pathway genes were not enriched for TFPI plasma level associated SNPs (global hypothesis test P-value = 0.147), but EC regulatory regions contained more TFPI plasma level associated SNPs than expected by chance (global hypothesis test P-value = 0.046). We therefore stratified our genome-wide SNPs, prioritizing those in EC regulatory regions via stratified false discovery rate (sFDR) control, and reranked the SNPs by q-value. The minimum q-value was 0.27, and the top-ranked SNPs did not show association evidence in the MARTHA replication sample of 1,033 unrelated VTE cases. Although this study did not result in new loci for TFPI, our work lays out a strategy to utilize epigenomic data in prioritization schemes for future GWAS studies. © 2017 WILEY PERIODICALS, INC.
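The stratify-then-rerank idea in this abstract can be illustrated with a short sketch. It assumes Benjamini-Hochberg q-values are computed separately within each stratum and then pooled for ranking; the abstract does not state which q-value estimator was used, and the function names, stratum labels and p-values below are hypothetical.

```python
def bh_qvalues(pvalues):
    """Benjamini-Hochberg q-values for one list of p-values."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        q[i] = running_min
    return q

def stratified_qvalues(pvalues, strata):
    """Compute q-values within each stratum independently and return
    them in the original SNP order."""
    by_stratum = {}
    for i, label in enumerate(strata):
        by_stratum.setdefault(label, []).append(i)
    q = [None] * len(pvalues)
    for members in by_stratum.values():
        stratum_q = bh_qvalues([pvalues[i] for i in members])
        for i, value in zip(members, stratum_q):
            q[i] = value
    return q

# Toy example: two SNPs in a prioritized stratum, four elsewhere.
pvalues = [0.01, 0.5, 0.01, 0.5, 0.9, 0.9]
strata = ["ec", "ec", "other", "other", "other", "other"]
q = stratified_qvalues(pvalues, strata)
# Rerank all SNPs by q-value, smallest first.
ranking = sorted(range(len(q)), key=lambda i: q[i])
```

In this toy input the two SNPs with the same p-value receive different q-values because the smaller stratum carries a lighter multiple-testing burden, which is how stratification can move prioritized SNPs up the ranking.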
Yoo, Jinho; Kim, Bo-Hyung; Kim, Soo-Hwan; Kim, Yangseok; Yim, Sung-Vin
2016-05-01
The study aimed to identify single nucleotide polymorphisms (SNPs) that significantly influenced the level of improvement in two kinds of training responses, maximal O2 uptake (V'O2max) and knee peak torque, in healthy adults participating in a high intensity training (HIT) program. The study also aimed to use these SNPs to develop prediction models for individual training responses. Seventy-nine healthy volunteers participated in the HIT program. A genome-wide association study, based on 2,391,739 SNPs, was performed to identify SNPs that were significantly associated with gains in V'O2max and knee peak torque following 9 weeks of the HIT program. To predict the two training responses, two independent SNP sets were determined using linear regression and iterative binary logistic regression analysis. False discovery rate analysis and permutation tests were performed to avoid false-positive findings. To predict gains in V'O2max, 7 SNPs were identified. These SNPs accounted for 26.0% of the variance in the increment of V'O2max and discriminated the subjects into three subgroups (non-responders, medium responders, and high responders) with a prediction accuracy of 86.1%. For knee peak torque, 6 SNPs were identified, which accounted for 27.5% of the variance in the increment of knee peak torque. The prediction accuracy for discriminating the subjects into the three subgroups was estimated as 77.2%. The novel SNPs found in this study could explain and predict inter-individual variability in gains in V'O2max and knee peak torque. Furthermore, with these genetic markers, the methodology suggested in this study provides a sound approach for personalized training programs.