Sample records for genome wide scan

  1. A Genome-Wide Breast Cancer Scan in African Americans

    DTIC Science & Technology

    2010-06-01

    SNPs from the African American breast cancer scan to COGs , a European collaborative study which is has designed a SNP array with that will be genotyped...Award Number: W81XWH-08-1-0383 TITLE: A Genome-wide Breast Cancer Scan in African Americans PRINCIPAL INVESTIGATOR: Christopher A...SUBTITLE A Genome-wide Breast Cancer Scan in African Americans 5a. CONTRACT NUMBER 5b. GRANT NUMBER W81XWH-08-1-0383 5c. PROGRAM

  2. Genome-wide scans for loci under selection in humans

    PubMed Central

    2005-01-01

    Natural selection, which can be defined as the differential contribution of genetic variants to future generations, is the driving force of Darwinian evolution. Identifying regions of the human genome that have been targets of natural selection is an important step in clarifying human evolutionary history and understanding how genetic variation results in phenotypic diversity, it may also facilitate the search for complex disease genes. Technological advances in high-throughput DNA sequencing and single nucleotide polymorphism genotyping have enabled several genome-wide scans of natural selection to be undertaken. Here, some of the observations that are beginning to emerge from these studies will be reviewed, including evidence for geographically restricted selective pressures (ie local adaptation) and a relationship between genes subject to natural selection and human disease. In addition, the paper will highlight several important problems that need to be addressed in future genome-wide studies of natural selection. PMID:16004726

  3. Genome-wide scans of genetic variants for psychophysiological endophenotypes: a methodological overview.

    PubMed

    Iacono, William G; Malone, Stephen M; Vaidyanathan, Uma; Vrieze, Scott I

    2014-12-01

    This article provides an introductory overview of the investigative strategy employed to evaluate the genetic basis of 17 endophenotypes examined as part of a 20-year data collection effort from the Minnesota Center for Twin and Family Research. Included are characterization of the study samples, descriptive statistics for key properties of the psychophysiological measures, and rationale behind the steps taken in the molecular genetic study design. The statistical approach included (a) biometric analysis of twin and family data, (b) heritability analysis using 527,829 single nucleotide polymorphisms (SNPs), (c) genome-wide association analysis of these SNPs and 17,601 autosomal genes, (d) follow-up analyses of candidate SNPs and genes hypothesized to have an association with each endophenotype, (e) rare variant analysis of nonsynonymous SNPs in the exome, and (f) whole genome sequencing association analysis using 27 million genetic variants. These methods were used in the accompanying empirical articles comprising this special issue, Genome-Wide Scans of Genetic Variants for Psychophysiological Endophenotypes. Copyright © 2014 Society for Psychophysiological Research.

  4. Genome-wide scan of healthy human connectome discovers SPON1 gene variant influencing dementia severity

    PubMed Central

    Jahanshad, Neda; Rajagopalan, Priya; Hua, Xue; Hibar, Derrek P.; Nir, Talia M.; Toga, Arthur W.; Jack, Clifford R.; Saykin, Andrew J.; Green, Robert C.; Weiner, Michael W.; Medland, Sarah E.; Montgomery, Grant W.; Hansell, Narelle K.; McMahon, Katie L.; de Zubicaray, Greig I.; Martin, Nicholas G.; Wright, Margaret J.; Thompson, Paul M.; Weiner, Michael; Aisen, Paul; Weiner, Michael; Aisen, Paul; Petersen, Ronald; Jack, Clifford R.; Jagust, William; Trojanowski, John Q.; Toga, Arthur W.; Beckett, Laurel; Green, Robert C.; Saykin, Andrew J.; Morris, John; Liu, Enchi; Green, Robert C.; Montine, Tom; Petersen, Ronald; Aisen, Paul; Gamst, Anthony; Thomas, Ronald G.; Donohue, Michael; Walter, Sarah; Gessert, Devon; Sather, Tamie; Beckett, Laurel; Harvey, Danielle; Gamst, Anthony; Donohue, Michael; Kornak, John; Jack, Clifford R.; Dale, Anders; Bernstein, Matthew; Felmlee, Joel; Fox, Nick; Thompson, Paul; Schuff, Norbert; Alexander, Gene; DeCarli, Charles; Jagust, William; Bandy, Dan; Koeppe, Robert A.; Foster, Norm; Reiman, Eric M.; Chen, Kewei; Mathis, Chet; Morris, John; Cairns, Nigel J.; Taylor-Reinwald, Lisa; Trojanowki, J.Q.; Shaw, Les; Lee, Virginia M.Y.; Korecka, Magdalena; Toga, Arthur W.; Crawford, Karen; Neu, Scott; Saykin, Andrew J.; Foroud, Tatiana M.; Potkin, Steven; Shen, Li; Khachaturian, Zaven; Frank, Richard; Snyder, Peter J.; Molchan, Susan; Kaye, Jeffrey; Quinn, Joseph; Lind, Betty; Dolen, Sara; Schneider, Lon S.; Pawluczyk, Sonia; Spann, Bryan M.; Brewer, James; Vanderswag, Helen; Heidebrink, Judith L.; Lord, Joanne L.; Petersen, Ronald; Johnson, Kris; Doody, Rachelle S.; Villanueva-Meyer, Javier; Chowdhury, Munir; Stern, Yaakov; Honig, Lawrence S.; Bell, Karen L.; Morris, John C.; Ances, Beau; Carroll, Maria; Leon, Sue; Mintun, Mark A.; Schneider, Stacy; Marson, Daniel; Griffith, Randall; Clark, David; Grossman, Hillel; Mitsis, Effie; Romirowsky, Aliza; deToledo-Morrell, Leyla; Shah, Raj C.; Duara, Ranjan; Varon, Daniel; Roberts, Peggy; Albert, Marilyn; Onyike, Chiadi; Kielb, Stephanie; Rusinek, Henry; de Leon, Mony J.; Glodzik, Lidia; De Santi, Susan; Doraiswamy, P. Murali; Petrella, Jeffrey R.; Coleman, R. Edward; Arnold, Steven E.; Karlawish, Jason H.; Wolk, David; Smith, Charles D.; Jicha, Greg; Hardy, Peter; Lopez, Oscar L.; Oakley, MaryAnn; Simpson, Donna M.; Porsteinsson, Anton P.; Goldstein, Bonnie S.; Martin, Kim; Makino, Kelly M.; Ismail, M. Saleem; Brand, Connie; Mulnard, Ruth A.; Thai, Gaby; Mc-Adams-Ortiz, Catherine; Womack, Kyle; Mathews, Dana; Quiceno, Mary; Diaz-Arrastia, Ramon; King, Richard; Weiner, Myron; Martin-Cook, Kristen; DeVous, Michael; Levey, Allan I.; Lah, James J.; Cellar, Janet S.; Burns, Jeffrey M.; Anderson, Heather S.; Swerdlow, Russell H.; Apostolova, Liana; Lu, Po H.; Bartzokis, George; Silverman, Daniel H.S.; Graff-Radford, Neill R.; Parfitt, Francine; Johnson, Heather; Farlow, Martin R.; Hake, Ann Marie; Matthews, Brandy R.; Herring, Scott; van Dyck, Christopher H.; Carson, Richard E.; MacAvoy, Martha G.; Chertkow, Howard; Bergman, Howard; Hosein, Chris; Black, Sandra; Stefanovic, Bojana; Caldwell, Curtis; Hsiung, Ging-Yuek Robin; Feldman, Howard; Mudge, Benita; Assaly, Michele; Kertesz, Andrew; Rogers, John; Trost, Dick; Bernick, Charles; Munic, Donna; Kerwin, Diana; Mesulam, Marek-Marsel; Lipowski, Kristina; Wu, Chuang-Kuo; Johnson, Nancy; Sadowsky, Carl; Martinez, Walter; Villena, Teresa; Turner, Raymond Scott; Johnson, Kathleen; Reynolds, Brigid; Sperling, Reisa A.; Johnson, Keith A.; Marshall, Gad; Frey, Meghan; Yesavage, Jerome; Taylor, Joy L.; Lane, Barton; Rosen, Allyson; Tinklenberg, Jared; Sabbagh, Marwan; Belden, Christine; Jacobson, Sandra; Kowall, Neil; Killiany, Ronald; Budson, Andrew E.; Norbash, Alexander; Johnson, Patricia Lynn; Obisesan, Thomas O.; Wolday, Saba; Bwayo, Salome K.; Lerner, Alan; Hudson, Leon; Ogrocki, Paula; Fletcher, Evan; Carmichael, Owen; Olichney, John; DeCarli, Charles; Kittur, Smita; Borrie, Michael; Lee, T.-Y.; Bartha, Rob; Johnson, Sterling; Asthana, Sanjay; Carlsson, Cynthia M.; Potkin, Steven G.; Preda, Adrian; Nguyen, Dana; Tariot, Pierre; Fleisher, Adam; Reeder, Stephanie; Bates, Vernice; Capote, Horacio; Rainka, Michelle; Scharre, Douglas W.; Kataki, Maria; Zimmerman, Earl A.; Celmins, Dzintra; Brown, Alice D.; Pearlson, Godfrey D.; Blank, Karen; Anderson, Karen; Saykin, Andrew J.; Santulli, Robert B.; Schwartz, Eben S.; Sink, Kaycee M.; Williamson, Jeff D.; Garg, Pradeep; Watkins, Franklin; Ott, Brian R.; Querfurth, Henry; Tremont, Geoffrey; Salloway, Stephen; Malloy, Paul; Correia, Stephen; Rosen, Howard J.; Miller, Bruce L.; Mintzer, Jacobo; Longmire, Crystal Flynn; Spicer, Kenneth; Finger, Elizabeth; Rachinsky, Irina; Rogers, John; Kertesz, Andrew; Drost, Dick

    2013-01-01

    Aberrant connectivity is implicated in many neurological and psychiatric disorders, including Alzheimer’s disease and schizophrenia. However, other than a few disease-associated candidate genes, we know little about the degree to which genetics play a role in the brain networks; we know even less about specific genes that influence brain connections. Twin and family-based studies can generate estimates of overall genetic influences on a trait, but genome-wide association scans (GWASs) can screen the genome for specific variants influencing the brain or risk for disease. To identify the heritability of various brain connections, we scanned healthy young adult twins with high-field, high-angular resolution diffusion MRI. We adapted GWASs to screen the brain’s connectivity pattern, allowing us to discover genetic variants that affect the human brain’s wiring. The association of connectivity with the SPON1 variant at rs2618516 on chromosome 11 (11p15.2) reached connectome-wide, genome-wide significance after stringent statistical corrections were enforced, and it was replicated in an independent subsample. rs2618516 was shown to affect brain structure in an elderly population with varying degrees of dementia. Older people who carried the connectivity variant had significantly milder clinical dementia scores and lower risk of Alzheimer’s disease. As a posthoc analysis, we conducted GWASs on several organizational and topological network measures derived from the matrices to discover variants in and around genes associated with autism (MACROD2), development (NEDD4), and mental retardation (UBE2A) significantly associated with connectivity. Connectome-wide, genome-wide screening offers substantial promise to discover genes affecting brain connectivity and risk for brain diseases. PMID:23471985

  5. Genome-wide scan of healthy human connectome discovers SPON1 gene variant influencing dementia severity.

    PubMed

    Jahanshad, Neda; Rajagopalan, Priya; Hua, Xue; Hibar, Derrek P; Nir, Talia M; Toga, Arthur W; Jack, Clifford R; Saykin, Andrew J; Green, Robert C; Weiner, Michael W; Medland, Sarah E; Montgomery, Grant W; Hansell, Narelle K; McMahon, Katie L; de Zubicaray, Greig I; Martin, Nicholas G; Wright, Margaret J; Thompson, Paul M

    2013-03-19

    Aberrant connectivity is implicated in many neurological and psychiatric disorders, including Alzheimer's disease and schizophrenia. However, other than a few disease-associated candidate genes, we know little about the degree to which genetics play a role in the brain networks; we know even less about specific genes that influence brain connections. Twin and family-based studies can generate estimates of overall genetic influences on a trait, but genome-wide association scans (GWASs) can screen the genome for specific variants influencing the brain or risk for disease. To identify the heritability of various brain connections, we scanned healthy young adult twins with high-field, high-angular resolution diffusion MRI. We adapted GWASs to screen the brain's connectivity pattern, allowing us to discover genetic variants that affect the human brain's wiring. The association of connectivity with the SPON1 variant at rs2618516 on chromosome 11 (11p15.2) reached connectome-wide, genome-wide significance after stringent statistical corrections were enforced, and it was replicated in an independent subsample. rs2618516 was shown to affect brain structure in an elderly population with varying degrees of dementia. Older people who carried the connectivity variant had significantly milder clinical dementia scores and lower risk of Alzheimer's disease. As a posthoc analysis, we conducted GWASs on several organizational and topological network measures derived from the matrices to discover variants in and around genes associated with autism (MACROD2), development (NEDD4), and mental retardation (UBE2A) significantly associated with connectivity. Connectome-wide, genome-wide screening offers substantial promise to discover genes affecting brain connectivity and risk for brain diseases.

  6. Meta-Analysis of Genome-Wide Scans Provides Evidence for Sex- and Site-Specific Regulation of Bone Mass

    PubMed Central

    Sham, Pak C; Zintzaras, Elias; Lewis, Cathryn M; Deng, Hong-Wen; Econs, Michael J; Karasik, David; Devoto, Marcella; Kammerer, Candace M; Spector, Tim; Andrew, Toby; Cupples, L Adrienne; Duncan, Emma L; Foroud, Tatiana; Kiel, Douglas P; Koller, Daniel; Langdahl, Bente; Mitchell, Braxton D; Peacock, Munro; Recker, Robert; Shen, Hui; Sol-Church, Katia; Spotila, Loretta D; Uitterlinden, Andre G; Wilson, Scott G; Kung, Annie WC; Ralston, Stuart H

    2014-01-01

    Several genome-wide scans have been performed to detect loci that regulate BMD, but these have yielded inconsistent results, with limited replication of linkage peaks in different studies. In an effort to improve statistical power for detection of these loci, we performed a meta-analysis of genome-wide scans in which spine or hip BMD were studied. Evidence was gained to suggest that several chromosomal loci regulate BMD in a site-specific and sex-specific manner. Introduction BMD is a heritable trait and an important predictor of osteoporotic fracture risk. Several genome-wide scans have been performed in an attempt to detect loci that regulate BMD, but there has been limited replication of linkage peaks between studies. In an attempt to resolve these inconsistencies, we conducted a collaborative meta-analysis of genome-wide linkage scans in which femoral neck BMD (FN-BMD) or lumbar spine BMD (LS-BMD) had been studied. Materials and Methods Data were accumulated from nine genome-wide scans involving 11,842 subjects. Data were analyzed separately for LS-BMD and FN-BMD and by sex. For each study, genomic bins of 30 cM were defined and ranked according to the maximum LOD score they contained. While various densitometers were used in different studies, the ranking approach that we used means that the results are not confounded by the fact that different measurement devices were used. Significance for high average rank and heterogeneity was obtained through Monte Carlo testing. Results For LS-BMD, the quantitative trait locus (QTL) with greatest significance was on chromosome 1p13.3-q23.3 (p = 0.004), but this exhibited high heterogeneity and the effect was specific for women. Other significant LS-BMD QTLs were on chromosomes 12q24.31-qter, 3p25.3-p22.1, 11p12-q13.3, and 1q32-q42.3, including one on 18p11-q12.3 that had not been detected by individual studies. For FN-BMD, the strongest QTL was on chromosome 9q31.1-q33.3 (p = 0.002). Other significant QTLs were

  7. A scan statistic to extract causal gene clusters from case-control genome-wide rare CNV data.

    PubMed

    Nishiyama, Takeshi; Takahashi, Kunihiko; Tango, Toshiro; Pinto, Dalila; Scherer, Stephen W; Takami, Satoshi; Kishino, Hirohisa

    2011-05-26

    Several statistical tests have been developed for analyzing genome-wide association data by incorporating gene pathway information in terms of gene sets. Using these methods, hundreds of gene sets are typically tested, and the tested gene sets often overlap. This overlapping greatly increases the probability of generating false positives, and the results obtained are difficult to interpret, particularly when many gene sets show statistical significance. We propose a flexible statistical framework to circumvent these problems. Inspired by spatial scan statistics for detecting clustering of disease occurrence in the field of epidemiology, we developed a scan statistic to extract disease-associated gene clusters from a whole gene pathway. Extracting one or a few significant gene clusters from a global pathway limits the overall false positive probability, which results in increased statistical power, and facilitates the interpretation of test results. In the present study, we applied our method to genome-wide association data for rare copy-number variations, which have been strongly implicated in common diseases. Application of our method to a simulated dataset demonstrated the high accuracy of this method in detecting disease-associated gene clusters in a whole gene pathway. The scan statistic approach proposed here shows a high level of accuracy in detecting gene clusters in a whole gene pathway. This study has provided a sound statistical framework for analyzing genome-wide rare CNV data by incorporating topological information on the gene pathway.

  8. Family-Based Genome-Wide Association Scan of Attention-Deficit/Hyperactivity Disorder

    ERIC Educational Resources Information Center

    Mick, Eric; Todorov, Alexandre; Smalley, Susan; Hu, Xiaolan; Loo, Sandra; Todd, Richard D.; Biederman, Joseph; Byrne, Deirdre; Dechairo, Bryan; Guiney, Allan; McCracken, James; McGough, James; Nelson, Stanley F.; Reiersen, Angela M.; Wilens, Timothy E.; Wozniak, Janet; Neale, Benjamin M.; Faraone, Stephen V.

    2010-01-01

    Objective: Genes likely play a substantial role in the etiology of attention-deficit/hyperactivity disorder (ADHD). However, the genetic architecture of the disorder is unknown, and prior genome-wide association studies (GWAS) have not identified a genome-wide significant association. We have conducted a third, independent, multisite GWAS of…

  9. Genome-wide scans reveal variants at EDAR predominantly affecting hair straightness in Han Chinese and Uyghur populations.

    PubMed

    Wu, Sijie; Tan, Jingze; Yang, Yajun; Peng, Qianqian; Zhang, Manfei; Li, Jinxi; Lu, Dongsheng; Liu, Yu; Lou, Haiyi; Feng, Qidi; Lu, Yan; Guan, Yaqun; Zhang, Zhaoxia; Jiao, Yi; Sabeti, Pardis; Krutmann, Jean; Tang, Kun; Jin, Li; Xu, Shuhua; Wang, Sijia

    2016-11-01

    Hair straightness/curliness is one of the most conspicuous features of human variation and is particularly diverse among populations. A recent genome-wide scan found common variants in the Trichohyalin (TCHH) gene that are associated with hair straightness in Europeans, but different genes might affect this phenotype in other populations. By sampling 2899 Han Chinese, we performed the first genome-wide scan of hair straightness in East Asians, and found EDAR (rs3827760) as the predominant gene (P = 4.67 × 10 -16 ), accounting for 3.66 % of the total variance. The candidate gene approach did not find further significant associations, suggesting that hair straightness may be affected by a large number of genes with subtle effects. Notably, genetic variants associated with hair straightness in Europeans are generally low in frequency in Han Chinese, and vice versa. To evaluate the relative contribution of these variants, we performed a second genome-wide scan in 709 samples from the Uyghur, an admixed population with both eastern and western Eurasian ancestries. In Uyghurs, both EDAR (rs3827760: P = 1.92 × 10 -12 ) and TCHH (rs11803731: P = 1.46 × 10 -3 ) are associated with hair straightness, but EDAR (OR 0.415) has a greater effect than TCHH (OR 0.575). We found no significant interaction between EDAR and TCHH (P = 0.645), suggesting that these two genes affect hair straightness through different mechanisms. Furthermore, haplotype analysis indicates that TCHH is not subject to selection. While EDAR is under strong selection in East Asia, it does not appear to be subject to selection after the admixture in Uyghurs. These suggest that hair straightness is unlikely a trait under selection.

  10. A Genome-Wide Breast Cancer Scan in African Americans

    DTIC Science & Technology

    2011-06-01

    cancer in women of African ancestry. 13 References 1. Easton DF, P.K., Dunning AM, Pharoah PDP, Thompson D, Ballinger DG, et al . Genome...M, Hankinson, SE, et al . A genome-wide association study identifies alleles in FGFR2 associated with risk of sporadic postmenopausal breast cancer...Millikan, R.C. Race, breast cancer subtypes, and survival in the Carolina Breast Cancer Study. Jama 295, 2492-502 ( 2006 ). 16 17. Huo, D., Ikpatt

  11. Revealing phenotype-associated functional differences by genome-wide scan of ancient haplotype blocks

    PubMed Central

    Onuki, Ritsuko; Yamaguchi, Rui; Shibuya, Tetsuo; Kanehisa, Minoru; Goto, Susumu

    2017-01-01

    Genome-wide scans for positive selection have become important for genomic medicine, and many studies aim to find genomic regions affected by positive selection that are associated with risk allele variations among populations. Most such studies are designed to detect recent positive selection. However, we hypothesize that ancient positive selection is also important for adaptation to pathogens, and has affected current immune-mediated common diseases. Based on this hypothesis, we developed a novel linkage disequilibrium-based pipeline, which aims to detect regions associated with ancient positive selection across populations from single nucleotide polymorphism (SNP) data. By applying this pipeline to the genotypes in the International HapMap project database, we show that genes in the detected regions are enriched in pathways related to the immune system and infectious diseases. The detected regions also contain SNPs reported to be associated with cancers and metabolic diseases, obesity-related traits, type 2 diabetes, and allergic sensitization. These SNPs were further mapped to biological pathways to determine the associations between phenotypes and molecular functions. Assessments of candidate regions to identify functions associated with variations in incidence rates of these diseases are needed in the future. PMID:28445522

  12. A genome-wide association scan in admixed Latin Americans identifies loci influencing facial and scalp hair features

    PubMed Central

    Adhikari, Kaustubh; Fontanil, Tania; Cal, Santiago; Mendoza-Revilla, Javier; Fuentes-Guajardo, Macarena; Chacón-Duque, Juan-Camilo; Al-Saadi, Farah; Johansson, Jeanette A.; Quinto-Sanchez, Mirsha; Acuña-Alonzo, Victor; Jaramillo, Claudia; Arias, William; Barquera Lozano, Rodrigo; Macín Pérez, Gastón; Gómez-Valdés, Jorge; Villamil-Ramírez, Hugo; Hunemeier, Tábita; Ramallo, Virginia; Silva de Cerqueira, Caio C.; Hurtado, Malena; Villegas, Valeria; Granja, Vanessa; Gallo, Carla; Poletti, Giovanni; Schuler-Faccini, Lavinia; Salzano, Francisco M.; Bortolini, Maria-Cátira; Canizales-Quinteros, Samuel; Rothhammer, Francisco; Bedoya, Gabriel; Gonzalez-José, Rolando; Headon, Denis; López-Otín, Carlos; Tobin, Desmond J.; Balding, David; Ruiz-Linares, Andrés

    2016-01-01

    We report a genome-wide association scan in over 6,000 Latin Americans for features of scalp hair (shape, colour, greying, balding) and facial hair (beard thickness, monobrow, eyebrow thickness). We found 18 signals of association reaching genome-wide significance (P values 5 × 10−8 to 3 × 10−119), including 10 novel associations. These include novel loci for scalp hair shape and balding, and the first reported loci for hair greying, monobrow, eyebrow and beard thickness. A newly identified locus influencing hair shape includes a Q30R substitution in the Protease Serine S1 family member 53 (PRSS53). We demonstrate that this enzyme is highly expressed in the hair follicle, especially the inner root sheath, and that the Q30R substitution affects enzyme processing and secretion. The genome regions associated with hair features are enriched for signals of selection, consistent with proposals regarding the evolution of human hair. PMID:26926045

  13. A genome-wide SNP scan accelerates trait-regulatory genomic loci identification in chickpea

    PubMed Central

    Kujur, Alice; Bajaj, Deepak; Upadhyaya, Hari D.; Das, Shouvik; Ranjan, Rajeev; Shree, Tanima; Saxena, Maneesha S.; Badoni, Saurabh; Kumar, Vinod; Tripathi, Shailesh; Gowda, C.L.L.; Sharma, Shivali; Singh, Sube; Tyagi, Akhilesh K.; Parida, Swarup K.

    2015-01-01

    We identified 44844 high-quality SNPs by sequencing 92 diverse chickpea accessions belonging to a seed and pod trait-specific association panel using reference genome- and de novo-based GBS (genotyping-by-sequencing) assays. A GWAS (genome-wide association study) in an association panel of 211, including the 92 sequenced accessions, identified 22 major genomic loci showing significant association (explaining 23–47% phenotypic variation) with pod and seed number/plant and 100-seed weight. Eighteen trait-regulatory major genomic loci underlying 13 robust QTLs were validated and mapped on an intra-specific genetic linkage map by QTL mapping. A combinatorial approach of GWAS, QTL mapping and gene haplotype-specific LD mapping and transcript profiling uncovered one superior haplotype and favourable natural allelic variants in the upstream regulatory region of a CesA-type cellulose synthase (Ca_Kabuli_CesA3) gene regulating high pod and seed number/plant (explaining 47% phenotypic variation) in chickpea. The up-regulation of this superior gene haplotype correlated with increased transcript expression of Ca_Kabuli_CesA3 gene in the pollen and pod of high pod/seed number accession, resulting in higher cellulose accumulation for normal pollen and pollen tube growth. A rapid combinatorial genome-wide SNP genotyping-based approach has potential to dissect complex quantitative agronomic traits and delineate trait-regulatory genomic loci (candidate genes) for genetic enhancement in crop plants, including chickpea. PMID:26058368

  14. Transethnic genome-wide scan identifies novel Alzheimer's disease loci.

    PubMed

    Jun, Gyungah R; Chung, Jaeyoon; Mez, Jesse; Barber, Robert; Beecham, Gary W; Bennett, David A; Buxbaum, Joseph D; Byrd, Goldie S; Carrasquillo, Minerva M; Crane, Paul K; Cruchaga, Carlos; De Jager, Philip; Ertekin-Taner, Nilufer; Evans, Denis; Fallin, M Danielle; Foroud, Tatiana M; Friedland, Robert P; Goate, Alison M; Graff-Radford, Neill R; Hendrie, Hugh; Hall, Kathleen S; Hamilton-Nelson, Kara L; Inzelberg, Rivka; Kamboh, M Ilyas; Kauwe, John S K; Kukull, Walter A; Kunkle, Brian W; Kuwano, Ryozo; Larson, Eric B; Logue, Mark W; Manly, Jennifer J; Martin, Eden R; Montine, Thomas J; Mukherjee, Shubhabrata; Naj, Adam; Reiman, Eric M; Reitz, Christiane; Sherva, Richard; St George-Hyslop, Peter H; Thornton, Timothy; Younkin, Steven G; Vardarajan, Badri N; Wang, Li-San; Wendlund, Jens R; Winslow, Ashley R; Haines, Jonathan; Mayeux, Richard; Pericak-Vance, Margaret A; Schellenberg, Gerard; Lunetta, Kathryn L; Farrer, Lindsay A

    2017-07-01

    Genetic loci for Alzheimer's disease (AD) have been identified in whites of European ancestry, but the genetic architecture of AD among other populations is less understood. We conducted a transethnic genome-wide association study (GWAS) for late-onset AD in Stage 1 sample including whites of European Ancestry, African-Americans, Japanese, and Israeli-Arabs assembled by the Alzheimer's Disease Genetics Consortium. Suggestive results from Stage 1 from novel loci were followed up using summarized results in the International Genomics Alzheimer's Project GWAS dataset. Genome-wide significant (GWS) associations in single-nucleotide polymorphism (SNP)-based tests (P < 5 × 10 -8 ) were identified for SNPs in PFDN1/HBEGF, USP6NL/ECHDC3, and BZRAP1-AS1 and for the interaction of the (apolipoprotein E) APOE ε4 allele with NFIC SNP. We also obtained GWS evidence (P < 2.7 × 10 -6 ) for gene-based association in the total sample with a novel locus, TPBG (P = 1.8 × 10 -6 ). Our findings highlight the value of transethnic studies for identifying novel AD susceptibility loci. Copyright © 2017 The Authors. Published by Elsevier Inc. All rights reserved.

  15. A 2cM genome-wide scan of European Holstein cattle affected by classical BSE

    PubMed Central

    2010-01-01

    Background Classical bovine spongiform encephalopathy (BSE) is an acquired prion disease that is invariably fatal in cattle and has been implicated as a significant human health risk. Polymorphisms that alter the prion protein of sheep or humans have been associated with variations in transmissible spongiform encephalopathy susceptibility or resistance. In contrast, there is no strong evidence that non-synonymous mutations in the bovine prion gene (PRNP) are associated with classical BSE disease susceptibility. However, two bovine PRNP insertion/deletion polymorphisms, one within the promoter region and the other in intron 1, have been associated with susceptibility to classical BSE. These associations do not explain the full extent of BSE susceptibility, and loci outside of PRNP appear to be associated with disease incidence in some cattle populations. To test for associations with BSE susceptibility, we conducted a genome wide scan using a panel of 3,072 single nucleotide polymorphism (SNP) markers on 814 animals representing cases and control Holstein cattle from the United Kingdom BSE epidemic. Results Two sets of BSE affected Holstein cattle were analyzed in this study, one set with known family relationships and the second set of paired cases with controls. The family set comprises half-sibling progeny from six sires. The progeny from four of these sires had previously been scanned with microsatellite markers. The results obtained from the current analysis of the family set yielded both some supporting and new results compared with those obtained in the earlier study. The results revealed 27 SNPs representing 18 chromosomes associated with incidence of BSE disease. These results confirm a region previously reported on chromosome 20, and identify additional regions on chromosomes 2, 14, 16, 21 and 28. This study did not identify a significant association near the PRNP in the family sample set. The only association found in the PRNP region was in the case

  16. Genome-wide scan for visceral leishmaniasis in mixed-breed dogs identifies candidate genes involved in T helper cells and macrophage signaling

    USDA-ARS?s Scientific Manuscript database

    We conducted a genome-wide scan for visceral leishmaniasis in mixed-breed dogs from a highly endemic area in Brazil using 149,648 single nucleotide polymorphism (SNP) markers genotyped in 20 cases and 28 controls. Using a mixed model approach, we found two candidate loci on canine autosomes 1 and 2....

  17. A GENOME-WIDE LINKAGE AND ASSOCIATION SCAN REVEALS NOVEL LOCI FOR AUTISM

    PubMed Central

    Weiss, Lauren A.; Arking, Dan E.

    2009-01-01

    Summary Although autism is a highly heritable neurodevelopmental disorder, attempts to identify specific susceptibility genes have thus far met with limited success 1. Genome-wide association studies (GWAS) using half a million or more markers, particularly those with very large sample sizes achieved through meta-analysis, have shown great success in mapping genes for other complex genetic traits (http://www.genome.gov/26525384). Consequently, we initiated a linkage and association mapping study using half a million genome-wide SNPs in a common set of 1,031 multiplex autism families (1,553 affected offspring). We identified regions of suggestive and significant linkage on chromosomes 6q27 and 20p13, respectively. Initial analysis did not yield genome-wide significant associations; however, genotyping of top hits in additional families revealed a SNP on chromosome 5p15 (between SEMA5A and TAS2R1) that was significantly associated with autism (P = 2 × 10−7). We also demonstrated that expression of SEMA5A is reduced in brains from autistic patients, further implicating SEMA5A as an autism susceptibility gene. The linkage regions reported here provide targets for rare variation screening while the discovery of a single novel association demonstrates the action of common variants. PMID:19812673

  18. A Genome-Wide Scan for Breast Cancer Risk Haplotypes among African American Women

    PubMed Central

    Song, Chi; Chen, Gary K.; Millikan, Robert C.; Ambrosone, Christine B.; John, Esther M.; Bernstein, Leslie; Zheng, Wei; Hu, Jennifer J.; Ziegler, Regina G.; Nyante, Sarah; Bandera, Elisa V.; Ingles, Sue A.; Press, Michael F.; Deming, Sandra L.; Rodriguez-Gil, Jorge L.; Chanock, Stephen J.; Wan, Peggy; Sheng, Xin; Pooler, Loreall C.; Van Den Berg, David J.; Le Marchand, Loic; Kolonel, Laurence N.; Henderson, Brian E.; Haiman, Chris A.; Stram, Daniel O.

    2013-01-01

    Genome-wide association studies (GWAS) simultaneously investigating hundreds of thousands of single nucleotide polymorphisms (SNP) have become a powerful tool in the investigation of new disease susceptibility loci. Haplotypes are sometimes thought to be superior to SNPs and are promising in genetic association analyses. The application of genome-wide haplotype analysis, however, is hindered by the complexity of haplotypes themselves and sophistication in computation. We systematically analyzed the haplotype effects for breast cancer risk among 5,761 African American women (3,016 cases and 2,745 controls) using a sliding window approach on the genome-wide scale. Three regions on chromosomes 1, 4 and 18 exhibited moderate haplotype effects. Furthermore, among 21 breast cancer susceptibility loci previously established in European populations, 10p15 and 14q24 are likely to harbor novel haplotype effects. We also proposed a heuristic of determining the significance level and the effective number of independent tests by the permutation analysis on chromosome 22 data. It suggests that the effective number was approximately half of the total (7,794 out of 15,645), thus the half number could serve as a quick reference to evaluating genome-wide significance if a similar sliding window approach of haplotype analysis is adopted in similar populations using similar genotype density. PMID:23468962

  19. Using the gene ontology to scan multilevel gene sets for associations in genome wide association studies.

    PubMed

    Schaid, Daniel J; Sinnwell, Jason P; Jenkins, Gregory D; McDonnell, Shannon K; Ingle, James N; Kubo, Michiaki; Goss, Paul E; Costantino, Joseph P; Wickerham, D Lawrence; Weinshilboum, Richard M

    2012-01-01

    Gene-set analyses have been widely used in gene expression studies, and some of the developed methods have been extended to genome wide association studies (GWAS). Yet, complications due to linkage disequilibrium (LD) among single nucleotide polymorphisms (SNPs), and variable numbers of SNPs per gene and genes per gene-set, have plagued current approaches, often leading to ad hoc "fixes." To overcome some of the current limitations, we developed a general approach to scan GWAS SNP data for both gene-level and gene-set analyses, building on score statistics for generalized linear models, and taking advantage of the directed acyclic graph structure of the gene ontology when creating gene-sets. However, other types of gene-set structures can be used, such as the popular Kyoto Encyclopedia of Genes and Genomes (KEGG). Our approach combines SNPs into genes, and genes into gene-sets, but assures that positive and negative effects of genes on a trait do not cancel. To control for multiple testing of many gene-sets, we use an efficient computational strategy that accounts for LD and provides accurate step-down adjusted P-values for each gene-set. Application of our methods to two different GWAS provide guidance on the potential strengths and weaknesses of our proposed gene-set analyses. © 2011 Wiley Periodicals, Inc.

  20. Horizon scanning for new genomic tests.

    PubMed

    Gwinn, Marta; Grossniklaus, Daurice A; Yu, Wei; Melillo, Stephanie; Wulf, Anja; Flome, Jennifer; Dotson, W David; Khoury, Muin J

    2011-02-01

    The development of health-related genomic tests is decentralized and dynamic, involving government, academic, and commercial entities. Consequently, it is not easy to determine which tests are in development, currently available, or discontinued. We developed and assessed the usefulness of a systematic approach to identifying new genomic tests on the Internet. We devised targeted queries of Web pages, newspaper articles, and blogs (Google Alerts) to identify new genomic tests. We finalized search and review procedures during a pilot phase that ended in March 2010. Queries continue to run daily and are compiled weekly; selected data are indexed in an online database, the Genomic Applications in Practice and Prevention Finder. After the pilot phase, our scan detected approximately two to three new genomic tests per week. Nearly two thirds of all tests (122/188, 65%) were related to cancer; only 6% were related to hereditary disorders. Although 88 (47%) of the tests, including 2 marketed directly to consumers, were commercially available, only 12 (6%) claimed United States Food and Drug Administration licensure. Systematic surveillance of the Internet provides information about genomic tests that can be used in combination with other resources to evaluate genomic tests. The Genomic Applications in Practice and Prevention Finder makes this information accessible to a wide group of stakeholders.

  1. A Genome-wide Admixture Scan for Ancestry-linked Genes Predisposing to Sarcoidosis in African Americans

    PubMed Central

    Rybicki, Benjamin A.; Levin, Albert M.; McKeigue, Paul; Datta, Indrani; Gray-McGuire, Courtney; Colombo, Marco; Reich, David; Burke, Robert R.; Iannuzzi, Michael C.

    2010-01-01

    Genome-wide linkage and association studies have uncovered variants associated with sarcoidosis, a multi-organ granulomatous inflammatory disease. African ancestry may influence disease pathogenesis since African Americans are more commonly affected by sarcoidosis. Therefore, we conducted the first sarcoidosis genome-wide ancestry scan using a map of 1,384 highly ancestry informative single nucleotide polymorphisms genotyped on 1,357 sarcoidosis cases and 703 unaffected controls self-identified as African American. The most significant ancestry association was at marker rs11966463 on chromosome 6p22.3 (ancestry association risk ratio (aRR)= 1.90; p=0.0002). When we restricted the analysis to biopsy-confirmed cases, the aRR for this marker increased to 2.01; p=0.00007. Among the eight other markers that demonstrated suggestive ancestry associations with sarcoidosis were rs1462906 on chromosome 8p12 which had the most significant association with European ancestry (aRR=0.65; p=0.002), and markers on chromosomes 5p13 (aRR=1.46; p=0.005) and 5q31 (aRR=0.67; p=0.005), which correspond to regions we previously identified through sib pair linkage analyses. Overall, the most significant ancestry association for Scadding stage IV cases was to marker rs7919137 on chromosome 10p11.22 (aRR=0.27; p=2×10−5), a region not associated with disease susceptibility. In summary, through admixture mapping of sarcoidosis we have confirmed previous genetic linkages and identified several novel putative candidate loci for sarcoidosis. PMID:21179114

  2. A High Resolution Genome-Wide Scan for Significant Selective Sweeps: An Application to Pooled Sequence Data in Laying Chickens

    PubMed Central

    Qanbari, Saber; Strom, Tim M.; Haberer, Georg; Weigend, Steffen; Gheyas, Almas A.; Turner, Frances; Burt, David W.; Preisinger, Rudolf; Gianola, Daniel; Simianer, Henner

    2012-01-01

    In most studies aimed at localizing footprints of past selection, outliers at tails of the empirical distribution of a given test statistic are assumed to reflect locus-specific selective forces. Significance cutoffs are subjectively determined, rather than being related to a clear set of hypotheses. Here, we define an empirical p-value for the summary statistic by means of a permutation method that uses the observed SNP structure in the real data. To illustrate the methodology, we applied our approach to a panel of 2.9 million autosomal SNPs identified from re-sequencing a pool of 15 individuals from a brown egg layer line. We scanned the genome for local reductions in heterozygosity, suggestive of selective sweeps. We also employed a modified sliding window approach that accounts for gaps in the sequence and increases scanning resolution by moving the overlapping windows by steps of one SNP only, and suggest to call this a “creeping window” strategy. The approach confirmed selective sweeps in the region of previously described candidate genes, i.e. TSHR, PRL, PRLHR, INSR, LEPR, IGF1, and NRAMP1 when used as positive controls. The genome scan revealed 82 distinct regions with strong evidence of selection (genome-wide p-value<0.001), including genes known to be associated with eggshell structure and immune system such as CALB1 and GAL cluster, respectively. A substantial proportion of signals was found in poor gene content regions including the most extreme signal on chromosome 1. The observation of multiple signals in a highly selected layer line of chicken is consistent with the hypothesis that egg production is a complex trait controlled by many genes. PMID:23209582

  3. Genome-wide transposon insertion scanning of environmental survival functions in the polycyclic aromatic hydrocarbon degrading bacterium Sphingomonas wittichii RW1.

    PubMed

    Roggo, Clémence; Coronado, Edith; Moreno-Forero, Silvia K; Harshman, Keith; Weber, Johann; van der Meer, Jan Roelof

    2013-10-01

    Sphingomonas wittichii RW1 is a dibenzofuran and dibenzodioxin-degrading bacterium with potentially interesting properties for bioaugmentation of contaminated sites. In order to understand the capacity of the microorganism to survive in the environment we used a genome-wide transposon scanning approach. RW1 transposon libraries were generated with around 22,000 independent insertions. Libraries were grown for an average of 50 generations (five successive passages in batch liquid medium) with salicylate as sole carbon and energy source in presence or absence of salt stress at -1.5 MPa. Alternatively, libraries were grown in sand with salicylate, at 50% water holding capacity, for 4 and 10 days (equivalent to 7 generations). Library DNA was recovered from the different growth conditions and scanned by ultrahigh throughput sequencing for the positions and numbers of inserted transposed kanamycin resistance gene. No transposon reads were recovered in 579 genes (10% of all annotated genes in the RW1 genome) in any of the libraries, suggesting those to be essential for survival under the used conditions. Libraries recovered from sand differed strongly from those incubated in liquid batch medium. In particular, important functions for survival of cells in sand at the short term concerned nutrient scavenging, energy metabolism and motility. In contrast to this, fatty acid metabolism and oxidative stress response were essential for longer term survival of cells in sand. Comparison to transcriptome data suggested important functions in sand for flagellar movement, pili synthesis, trehalose and polysaccharide synthesis and putative cell surface antigen proteins. Interestingly, a variety of genes were also identified, interruption of which cause significant increase in fitness during growth on salicylate. One of these was an Lrp family transcription regulator and mutants in this gene covered more than 90% of the total library after 50 generations of growth on salicylate. Our

  4. Meta-analysis of 32 genome-wide linkage studies of schizophrenia

    PubMed Central

    Ng, MYM; Levinson, DF; Faraone, SV; Suarez, BK; DeLisi, LE; Arinami, T; Riley, B; Paunio, T; Pulver, AE; Irmansyah; Holmans, PA; Escamilla, M; Wildenauer, DB; Williams, NM; Laurent, C; Mowry, BJ; Brzustowicz, LM; Maziade, M; Sklar, P; Garver, DL; Abecasis, GR; Lerer, B; Fallin, MD; Gurling, HMD; Gejman, PV; Lindholm, E; Moises, HW; Byerley, W; Wijsman, EM; Forabosco, P; Tsuang, MT; Hwu, H-G; Okazaki, Y; Kendler, KS; Wormley, B; Fanous, A; Walsh, D; O’Neill, FA; Peltonen, L; Nestadt, G; Lasseter, VK; Liang, KY; Papadimitriou, GM; Dikeos, DG; Schwab, SG; Owen, MJ; O’Donovan, MC; Norton, N; Hare, E; Raventos, H; Nicolini, H; Albus, M; Maier, W; Nimgaonkar, VL; Terenius, L; Mallet, J; Jay, M; Godard, S; Nertney, D; Alexander, M; Crowe, RR; Silverman, JM; Bassett, AS; Roy, M-A; Mérette, C; Pato, CN; Pato, MT; Roos, J Louw; Kohn, Y; Amann-Zalcenstein, D; Kalsi, G; McQuillin, A; Curtis, D; Brynjolfson, J; Sigmundsson, T; Petursson, H; Sanders, AR; Duan, J; Jazin, E; Myles-Worsley, M; Karayiorgou, M; Lewis, CM

    2009-01-01

    A genome scan meta-analysis (GSMA) was carried out on 32 independent genome-wide linkage scan analyses that included 3255 pedigrees with 7413 genotyped cases affected with schizophrenia (SCZ) or related disorders. The primary GSMA divided the autosomes into 120 bins, rank-ordered the bins within each study according to the most positive linkage result in each bin, summed these ranks (weighted for study size) for each bin across studies and determined the empirical probability of a given summed rank (PSR) by simulation. Suggestive evidence for linkage was observed in two single bins, on chromosomes 5q (142-168 Mb) and 2q (103-134 Mb). Genome-wide evidence for linkage was detected on chromosome 2q (119-152 Mb) when bin boundaries were shifted to the middle of the previous bins. The primary analysis met empirical criteria for ‘aggregate’ genome-wide significance, indicating that some or all of 10 bins are likely to contain loci linked to SCZ, including regions of chromosomes 1, 2q, 3q, 4q, 5q, 8p and 10q. In a secondary analysis of 22 studies of European-ancestry samples, suggestive evidence for linkage was observed on chromosome 8p (16-33 Mb). Although the newer genome-wide association methodology has greater power to detect weak associations to single common DNA sequence variants, linkage analysis can detect diverse genetic effects that segregate in families, including multiple rare variants within one locus or several weakly associated loci in the same region. Therefore, the regions supported by this meta-analysis deserve close attention in future studies. PMID:19349958

  5. Genome-wide scans for microalbuminuria in Mexican Americans: the San Antonio Family Heart Study.

    PubMed

    Arar, Nedal; Nath, Subrata; Thameem, Farook; Bauer, Richard; Voruganti, Saroja; Comuzzie, Anthony; Cole, Shelley; Blangero, John; MacCluer, Jean; Abboud, Hanna

    2007-02-01

    Microalbuminuria, defined as urine albumin-to-creatinine ratio of 0.03 to 0.299 mg/mg, is a major risk factor for cardiovascular disease. Several genetic epidemiological studies have established that microalbuminuria clusters in families, suggesting a genetic predisposition. We estimated heritability of microalbuminuria and performed a genome-wide linkage analysis to identify chromosomal regions influencing urine albumin-to-creatinine ratio in 486 Mexican Americans from 26 multiplex families. Significant heritability was demonstrated for urine albumin-to-creatinine ratio (h = 24%, P < 0.003) after accounting for age, sex, body mass index, triglycerides, and hypertension. Genome scan revealed significant evidence of linkage of urine albumin-to-creatinine ratio to a region on chromosome 20q12 (LOD score of 3.5, P < 0.001) near marker D20S481. This region also exhibited a LOD score of 2.8 with diabetes status as a covariate and 3.0 with hypertension status as a covariate suggesting that the effect of this locus on urine albumin-to-creatinine ratio is largely independent of diabetes and hypertension. Findings indicate that there is a gene or genes located on human chromosome 20q12 that may have functional relevance to albumin excretion in Mexican Americans. Identifying and understanding the role of the genes that determine albumin excretion would lead to the development of novel therapeutic strategies targeted at high-risk individuals in whom intensive preventive measures may be most beneficial.

  6. Does parental expressed emotion moderate genetic effects in ADHD? An exploration using a genome wide association scan.

    PubMed

    Sonuga-Barke, Edmund J S; Lasky-Su, Jessica; Neale, Benjamin M; Oades, Robert; Chen, Wai; Franke, Barbara; Buitelaar, Jan; Banaschewski, Tobias; Ebstein, Richard; Gill, Michael; Anney, Richard; Miranda, Ana; Mulas, Fernando; Roeyers, Herbert; Rothenberger, Aribert; Sergeant, Joseph; Steinhausen, Hans Christoph; Thompson, Margaret; Asherson, Philip; Faraone, Stephen V

    2008-12-05

    Studies of gene x environment (G x E) interaction in ADHD have previously focused on known risk genes for ADHD and environmentally mediated biological risk. Here we use G x E analysis in the context of a genome-wide association scan to identify novel genes whose effects on ADHD symptoms and comorbid conduct disorder are moderated by high maternal expressed emotion (EE). SNPs (600,000) were genotyped in 958 ADHD proband-parent trios. After applying data cleaning procedures we examined 429,981 autosomal SNPs in 909 family trios. ADHD symptom severity and comorbid conduct disorder was measured using the Parental Account of Childhood Symptoms interview. Maternal criticism and warmth (i.e., EE) were coded by independent observers on comments made during the interview. No G x E interactions reached genome-wide significance. Nominal effects were found both with and without genetic main effects. For those with genetic main effects 36 uncorrected interaction P-values were <10(-5) implicating both novel genes as well as some previously supported candidates. These were found equally often for all of the interactions being investigated. The observed interactions in SLC1A1 and NRG3 SNPs represent reasonable candidate genes for further investigation given their previous association with several psychiatric illnesses. We find evidence for the role of EE in moderating the effects of genes on ADHD severity and comorbid conduct disorder, implicating both novel and established candidates. These findings need replicating in larger independent samples. Copyright 2008 Wiley-Liss, Inc.

  7. Genome-wide Scan of 29,141 African Americans Finds No Evidence of Directional Selection since Admixture

    PubMed Central

    Bhatia, Gaurav; Tandon, Arti; Patterson, Nick; Aldrich, Melinda C.; Ambrosone, Christine B.; Amos, Christopher; Bandera, Elisa V.; Berndt, Sonja I.; Bernstein, Leslie; Blot, William J.; Bock, Cathryn H.; Caporaso, Neil; Casey, Graham; Deming, Sandra L.; Diver, W. Ryan; Gapstur, Susan M.; Gillanders, Elizabeth M.; Harris, Curtis C.; Henderson, Brian E.; Ingles, Sue A.; Isaacs, William; De Jager, Phillip L.; John, Esther M.; Kittles, Rick A.; Larkin, Emma; McNeill, Lorna H.; Millikan, Robert C.; Murphy, Adam; Neslund-Dudas, Christine; Nyante, Sarah; Press, Michael F.; Rodriguez-Gil, Jorge L.; Rybicki, Benjamin A.; Schwartz, Ann G.; Signorello, Lisa B.; Spitz, Margaret; Strom, Sara S.; Tucker, Margaret A.; Wiencke, John K.; Witte, John S.; Wu, Xifeng; Yamamura, Yuko; Zanetti, Krista A.; Zheng, Wei; Ziegler, Regina G.; Chanock, Stephen J.; Haiman, Christopher A.; Reich, David; Price, Alkes L.

    2014-01-01

    The extent of recent selection in admixed populations is currently an unresolved question. We scanned the genomes of 29,141 African Americans and failed to find any genome-wide-significant deviations in local ancestry, indicating no evidence of selection influencing ancestry after admixture. A recent analysis of data from 1,890 African Americans reported that there was evidence of selection in African Americans after their ancestors left Africa, both before and after admixture. Selection after admixture was reported on the basis of deviations in local ancestry, and selection before admixture was reported on the basis of allele-frequency differences between African Americans and African populations. The local-ancestry deviations reported by the previous study did not replicate in our very large sample, and we show that such deviations were expected purely by chance, given the number of hypotheses tested. We further show that the previous study’s conclusion of selection in African Americans before admixture is also subject to doubt. This is because the FST statistics they used were inflated and because true signals of unusual allele-frequency differences between African Americans and African populations would be best explained by selection that occurred in Africa prior to migration to the Americas. PMID:25242497

  8. A genome-wide association scan for acute insulin response to glucose in Hispanic Americans: The IRAS Family Study

    PubMed Central

    Rich, S. S.; Goodarzi, M. O.; Palmer, N. D.; Langefeld, C. D.; Ziegler, J.; Haffner, S. M.; Bryer-Ash, M.; Norris, J. M.; Taylor, K. D.; Haritunians, T.; Rotter, J. I.; Chen, Y-D. I.; Wagenknecht, L. E.; Bowden, D. W.; Bergman, R. N.

    2009-01-01

    Aims/Hypothesis The goal of this study was to identify genes and regions in the human genome that are associated with the acute insulin response to glucose (AIRg), an important predictor of type 2 diabetes, in Hispanic-American participants from the Insulin Resistance Atherosclerosis Family Study (IRAS FS). Methods A two-stage genome-wide association scan (GWAS) was performed in IRAS FS Hispanic-American samples. In the first stage, 318K single nucleotide polymorphisms (SNPs) were assessed in 229 Hispanic-American DNA samples (from 34 families) from San Antonio, TX. SNPs with the most significant associations with AIRg were genotyped in the entire set of IRAS FS Hispanic-American samples (n = 1190). In chromosomal regions with evidence of association, additional SNPs were genotyped to capture variation in genes. Results No individual SNP achieved genome-wide levels of significance (P < 5 × 10-7); however, two regions — chromosomes 6p21 and 20p11 — had multiple highly-ranked SNPs that were associated with AIRg. Additional genotyping in these regions supported the initial evidence for variants contributing to variation in AIRg. One region resides in a gene desert between PXT1 and KCTD20 on 6p21 while the region on 20p11 has several viable candidate genes (ENTPD6, PYGB, GINS1 and R4-691N24.1). Conclusions/Interpretation A GWAS in Hispanic-American samples identified several candidate genes and loci that may be associated with AIRg. These associations explain a small component of variation in AIRg. The genes identified are involved in phosphorylation and ion transport and provide preliminary evidence that these processes have importance in beta cell response. PMID:19430760

  9. Genome wide scan for quantitative trait loci affecting tick resistance in cattle (Bos taurus × Bos indicus)

    PubMed Central

    2010-01-01

    Background In tropical countries, losses caused by bovine tick Rhipicephalus (Boophilus) microplus infestation have a tremendous economic impact on cattle production systems. Genetic variation between Bos taurus and Bos indicus to tick resistance and molecular biology tools might allow for the identification of molecular markers linked to resistance traits that could be used as an auxiliary tool in selection programs. The objective of this work was to identify QTL associated with tick resistance/susceptibility in a bovine F2 population derived from the Gyr (Bos indicus) × Holstein (Bos taurus) cross. Results Through a whole genome scan with microsatellite markers, we were able to map six genomic regions associated with bovine tick resistance. For most QTL, we have found that depending on the tick evaluation season (dry and rainy) different sets of genes could be involved in the resistance mechanism. We identified dry season specific QTL on BTA 2 and 10, rainy season specific QTL on BTA 5, 11 and 27. We also found a highly significant genome wide QTL for both dry and rainy seasons in the central region of BTA 23. Conclusions The experimental F2 population derived from Gyr × Holstein cross successfully allowed the identification of six highly significant QTL associated with tick resistance in cattle. QTL located on BTA 23 might be related with the bovine histocompatibility complex. Further investigation of these QTL will help to isolate candidate genes involved with tick resistance in cattle. PMID:20433753

  10. Genome-wide significant loci for addiction and anxiety.

    PubMed

    Hodgson, K; Almasy, L; Knowles, E E M; Kent, J W; Curran, J E; Dyer, T D; Göring, H H H; Olvera, R L; Fox, P T; Pearlson, G D; Krystal, J H; Duggirala, R; Blangero, J; Glahn, D C

    2016-08-01

    Psychiatric comorbidity is common among individuals with addictive disorders, with patients frequently suffering from anxiety disorders. While the genetic architecture of comorbid addictive and anxiety disorders remains unclear, elucidating the genes involved could provide important insights into the underlying etiology. Here we examine a sample of 1284 Mexican-Americans from randomly selected extended pedigrees. Variance decomposition methods were used to examine the role of genetics in addiction phenotypes (lifetime history of alcohol dependence, drug dependence or chronic smoking) and various forms of clinically relevant anxiety. Genome-wide univariate and bivariate linkage scans were conducted to localize the chromosomal regions influencing these traits. Addiction phenotypes and anxiety were shown to be heritable and univariate genome-wide linkage scans revealed significant quantitative trait loci for drug dependence (14q13.2-q21.2, LOD=3.322) and a broad anxiety phenotype (12q24.32-q24.33, LOD=2.918). Significant positive genetic correlations were observed between anxiety and each of the addiction subtypes (ρg=0.550-0.655) and further investigation with bivariate linkage analyses identified significant pleiotropic signals for alcohol dependence-anxiety (9q33.1-q33.2, LOD=3.054) and drug dependence-anxiety (18p11.23-p11.22, LOD=3.425). This study confirms the shared genetic underpinnings of addiction and anxiety and identifies genomic loci involved in the etiology of these comorbid disorders. The linkage signal for anxiety on 12q24 spans the location of TMEM132D, an emerging gene of interest from previous GWAS of anxiety traits, whilst the bivariate linkage signal identified for anxiety-alcohol on 9q33 peak coincides with a region where rare CNVs have been associated with psychiatric disorders. Other signals identified implicate novel regions of the genome in addiction genetics. Copyright © 2016 Elsevier Masson SAS. All rights reserved.

  11. Genome-wide scan of 29,141 African Americans finds no evidence of directional selection since admixture.

    PubMed

    Bhatia, Gaurav; Tandon, Arti; Patterson, Nick; Aldrich, Melinda C; Ambrosone, Christine B; Amos, Christopher; Bandera, Elisa V; Berndt, Sonja I; Bernstein, Leslie; Blot, William J; Bock, Cathryn H; Caporaso, Neil; Casey, Graham; Deming, Sandra L; Diver, W Ryan; Gapstur, Susan M; Gillanders, Elizabeth M; Harris, Curtis C; Henderson, Brian E; Ingles, Sue A; Isaacs, William; De Jager, Phillip L; John, Esther M; Kittles, Rick A; Larkin, Emma; McNeill, Lorna H; Millikan, Robert C; Murphy, Adam; Neslund-Dudas, Christine; Nyante, Sarah; Press, Michael F; Rodriguez-Gil, Jorge L; Rybicki, Benjamin A; Schwartz, Ann G; Signorello, Lisa B; Spitz, Margaret; Strom, Sara S; Tucker, Margaret A; Wiencke, John K; Witte, John S; Wu, Xifeng; Yamamura, Yuko; Zanetti, Krista A; Zheng, Wei; Ziegler, Regina G; Chanock, Stephen J; Haiman, Christopher A; Reich, David; Price, Alkes L

    2014-10-02

    The extent of recent selection in admixed populations is currently an unresolved question. We scanned the genomes of 29,141 African Americans and failed to find any genome-wide-significant deviations in local ancestry, indicating no evidence of selection influencing ancestry after admixture. A recent analysis of data from 1,890 African Americans reported that there was evidence of selection in African Americans after their ancestors left Africa, both before and after admixture. Selection after admixture was reported on the basis of deviations in local ancestry, and selection before admixture was reported on the basis of allele-frequency differences between African Americans and African populations. The local-ancestry deviations reported by the previous study did not replicate in our very large sample, and we show that such deviations were expected purely by chance, given the number of hypotheses tested. We further show that the previous study's conclusion of selection in African Americans before admixture is also subject to doubt. This is because the FST statistics they used were inflated and because true signals of unusual allele-frequency differences between African Americans and African populations would be best explained by selection that occurred in Africa prior to migration to the Americas. Copyright © 2014 The American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.

  12. A genome-wide scan for common alleles affecting risk for autism.

    PubMed

    Anney, Richard; Klei, Lambertus; Pinto, Dalila; Regan, Regina; Conroy, Judith; Magalhaes, Tiago R; Correia, Catarina; Abrahams, Brett S; Sykes, Nuala; Pagnamenta, Alistair T; Almeida, Joana; Bacchelli, Elena; Bailey, Anthony J; Baird, Gillian; Battaglia, Agatino; Berney, Tom; Bolshakova, Nadia; Bölte, Sven; Bolton, Patrick F; Bourgeron, Thomas; Brennan, Sean; Brian, Jessica; Carson, Andrew R; Casallo, Guillermo; Casey, Jillian; Chu, Su H; Cochrane, Lynne; Corsello, Christina; Crawford, Emily L; Crossett, Andrew; Dawson, Geraldine; de Jonge, Maretha; Delorme, Richard; Drmic, Irene; Duketis, Eftichia; Duque, Frederico; Estes, Annette; Farrar, Penny; Fernandez, Bridget A; Folstein, Susan E; Fombonne, Eric; Freitag, Christine M; Gilbert, John; Gillberg, Christopher; Glessner, Joseph T; Goldberg, Jeremy; Green, Jonathan; Guter, Stephen J; Hakonarson, Hakon; Heron, Elizabeth A; Hill, Matthew; Holt, Richard; Howe, Jennifer L; Hughes, Gillian; Hus, Vanessa; Igliozzi, Roberta; Kim, Cecilia; Klauck, Sabine M; Kolevzon, Alexander; Korvatska, Olena; Kustanovich, Vlad; Lajonchere, Clara M; Lamb, Janine A; Laskawiec, Magdalena; Leboyer, Marion; Le Couteur, Ann; Leventhal, Bennett L; Lionel, Anath C; Liu, Xiao-Qing; Lord, Catherine; Lotspeich, Linda; Lund, Sabata C; Maestrini, Elena; Mahoney, William; Mantoulan, Carine; Marshall, Christian R; McConachie, Helen; McDougle, Christopher J; McGrath, Jane; McMahon, William M; Melhem, Nadine M; Merikangas, Alison; Migita, Ohsuke; Minshew, Nancy J; Mirza, Ghazala K; Munson, Jeff; Nelson, Stanley F; Noakes, Carolyn; Noor, Abdul; Nygren, Gudrun; Oliveira, Guiomar; Papanikolaou, Katerina; Parr, Jeremy R; Parrini, Barbara; Paton, Tara; Pickles, Andrew; Piven, Joseph; Posey, David J; Poustka, Annemarie; Poustka, Fritz; Prasad, Aparna; Ragoussis, Jiannis; Renshaw, Katy; Rickaby, Jessica; Roberts, Wendy; Roeder, Kathryn; Roge, Bernadette; Rutter, Michael L; Bierut, Laura J; Rice, John P; Salt, Jeff; Sansom, Katherine; Sato, Daisuke; Segurado, Ricardo; Senman, Lili; Shah, Naisha; Sheffield, Val C; Soorya, Latha; Sousa, Inês; Stoppioni, Vera; Strawbridge, Christina; Tancredi, Raffaella; Tansey, Katherine; Thiruvahindrapduram, Bhooma; Thompson, Ann P; Thomson, Susanne; Tryfon, Ana; Tsiantis, John; Van Engeland, Herman; Vincent, John B; Volkmar, Fred; Wallace, Simon; Wang, Kai; Wang, Zhouzhi; Wassink, Thomas H; Wing, Kirsty; Wittemeyer, Kerstin; Wood, Shawn; Yaspan, Brian L; Zurawiecki, Danielle; Zwaigenbaum, Lonnie; Betancur, Catalina; Buxbaum, Joseph D; Cantor, Rita M; Cook, Edwin H; Coon, Hilary; Cuccaro, Michael L; Gallagher, Louise; Geschwind, Daniel H; Gill, Michael; Haines, Jonathan L; Miller, Judith; Monaco, Anthony P; Nurnberger, John I; Paterson, Andrew D; Pericak-Vance, Margaret A; Schellenberg, Gerard D; Scherer, Stephen W; Sutcliffe, James S; Szatmari, Peter; Vicente, Astrid M; Vieland, Veronica J; Wijsman, Ellen M; Devlin, Bernie; Ennis, Sean; Hallmayer, Joachim

    2010-10-15

    Although autism spectrum disorders (ASDs) have a substantial genetic basis, most of the known genetic risk has been traced to rare variants, principally copy number variants (CNVs). To identify common risk variation, the Autism Genome Project (AGP) Consortium genotyped 1558 rigorously defined ASD families for 1 million single-nucleotide polymorphisms (SNPs) and analyzed these SNP genotypes for association with ASD. In one of four primary association analyses, the association signal for marker rs4141463, located within MACROD2, crossed the genome-wide association significance threshold of P < 5 × 10(-8). When a smaller replication sample was analyzed, the risk allele at rs4141463 was again over-transmitted; yet, consistent with the winner's curse, its effect size in the replication sample was much smaller; and, for the combined samples, the association signal barely fell below the P < 5 × 10(-8) threshold. Exploratory analyses of phenotypic subtypes yielded no significant associations after correction for multiple testing. They did, however, yield strong signals within several genes, KIAA0564, PLD5, POU6F2, ST8SIA2 and TAF1C.

  13. Refining genome-wide linkage intervals using a meta-analysis of genome-wide association studies identifies loci influencing personality dimensions

    PubMed Central

    Amin, Najaf; Hottenga, Jouke-Jan; Hansell, Narelle K; Janssens, A Cecile JW; de Moor, Marleen HM; Madden, Pamela AF; Zorkoltseva, Irina V; Penninx, Brenda W; Terracciano, Antonio; Uda, Manuela; Tanaka, Toshiko; Esko, Tonu; Realo, Anu; Ferrucci, Luigi; Luciano, Michelle; Davies, Gail; Metspalu, Andres; Abecasis, Goncalo R; Deary, Ian J; Raikkonen, Katri; Bierut, Laura J; Costa, Paul T; Saviouk, Viatcheslav; Zhu, Gu; Kirichenko, Anatoly V; Isaacs, Aaron; Aulchenko, Yurii S; Willemsen, Gonneke; Heath, Andrew C; Pergadia, Michele L; Medland, Sarah E; Axenovich, Tatiana I; de Geus, Eco; Montgomery, Grant W; Wright, Margaret J; Oostra, Ben A; Martin, Nicholas G; Boomsma, Dorret I; van Duijn, Cornelia M

    2013-01-01

    Personality traits are complex phenotypes related to psychosomatic health. Individually, various gene finding methods have not achieved much success in finding genetic variants associated with personality traits. We performed a meta-analysis of four genome-wide linkage scans (N=6149 subjects) of five basic personality traits assessed with the NEO Five-Factor Inventory. We compared the significant regions from the meta-analysis of linkage scans with the results of a meta-analysis of genome-wide association studies (GWAS) (N∼17 000). We found significant evidence of linkage of neuroticism to chromosome 3p14 (rs1490265, LOD=4.67) and to chromosome 19q13 (rs628604, LOD=3.55); of extraversion to 14q32 (ATGG002, LOD=3.3); and of agreeableness to 3p25 (rs709160, LOD=3.67) and to two adjacent regions on chromosome 15, including 15q13 (rs970408, LOD=4.07) and 15q14 (rs1055356, LOD=3.52) in the individual scans. In the meta-analysis, we found strong evidence of linkage of extraversion to 4q34, 9q34, 10q24 and 11q22, openness to 2p25, 3q26, 9p21, 11q24, 15q26 and 19q13 and agreeableness to 4q34 and 19p13. Significant evidence of association in the GWAS was detected between openness and rs677035 at 11q24 (P-value=2.6 × 10−06, KCNJ1). The findings of our linkage meta-analysis and those of the GWAS suggest that 11q24 is a susceptible locus for openness, with KCNJ1 as the possible candidate gene. PMID:23211697

  14. Genome-wide linkage scan for submaximal exercise heart rate in the HERITAGE family study.

    PubMed

    Spielmann, Nadine; Leon, Arthur S; Rao, D C; Rice, Treva; Skinner, James S; Rankinen, Tuomo; Bouchard, Claude

    2007-12-01

    The purpose of this study was to identify regions of the human genome linked to submaximal exercise heart rates in the sedentary state and in response to a standardized 20-wk endurance training program in blacks and whites of the HERITAGE Family Study. A total of 701 polymorphic markers covering the 22 autosomes were used in the genome-wide linkage scan, with 328 sibling pairs from 99 white nuclear families and 102 pairs from 115 black family units. Steady-state heart rates were measured at the relative intensity of 60% maximal oxygen uptake (HR60) and at the absolute intensity of 50 W (HR50). Baseline phenotypes were adjusted for age, sex, and baseline body mass index (BMI) and training responses (posttraining minus baseline, Delta) were adjusted for age, sex, baseline BMI, and baseline value of the phenotype. Two analytic strategies were used, a multipoint variance components and a regression-based multipoint linkage analysis. In whites, promising linkages (LOD > 1.75) were identified on 18q21-q22 for baseline HR50 (LOD = 2.64; P = 0.0002) and DeltaHR60 (LOD = 2.10; P = 0.0009) and on chromosome 2q33.3 for DeltaHR50 (LOD = 2.13; P = 0.0009). In blacks, evidence of promising linkage for baseline HR50 was detected with several markers within the chromosomal region 10q24-q25.3 (peak LOD = 2.43, P = 0.0004 with D10S597). The most promising regions for fine mapping in the HERITAGE Family Study were found on 2q33 for HR50 training response in whites, on 10q25-26 for baseline HR60 in blacks, and on 18q21-22 for both baseline HR50 and DeltaHR60 in whites.

  15. Genome-Wide Variation Patterns Uncover the Origin and Selection in Cultivated Ginseng (Panax ginseng Meyer)

    PubMed Central

    Li, Ming-Rui; Shi, Feng-Xue; Li, Ya-Ling; Jiang, Peng; Jiao, Lili

    2017-01-01

    Abstract Chinese ginseng (Panax ginseng Meyer) is a medicinally important herb and plays crucial roles in traditional Chinese medicine. Pharmacological analyses identified diverse bioactive components from Chinese ginseng. However, basic biological attributes including domestication and selection of the ginseng plant remain under-investigated. Here, we presented a genome-wide view of the domestication and selection of cultivated ginseng based on the whole genome data. A total of 8,660 protein-coding genes were selected for genome-wide scanning of the 30 wild and cultivated ginseng accessions. In complement, the 45s rDNA, chloroplast and mitochondrial genomes were included to perform phylogenetic and population genetic analyses. The observed spatial genetic structure between northern cultivated ginseng (NCG) and southern cultivated ginseng (SCG) accessions suggested multiple independent origins of cultivated ginseng. Genome-wide scanning further demonstrated that NCG and SCG have undergone distinct selection pressures during the domestication process, with more genes identified in the NCG (97 genes) than in the SCG group (5 genes). Functional analyses revealed that these genes are involved in diverse pathways, including DNA methylation, lignin biosynthesis, and cell differentiation. These findings suggested that the SCG and NCG groups have distinct demographic histories. Candidate genes identified are useful for future molecular breeding of cultivated ginseng. PMID:28922794

  16. A genome-wide scan for common alleles affecting risk for autism

    PubMed Central

    Anney, Richard; Klei, Lambertus; Pinto, Dalila; Regan, Regina; Conroy, Judith; Magalhaes, Tiago R.; Correia, Catarina; Abrahams, Brett S.; Sykes, Nuala; Pagnamenta, Alistair T.; Almeida, Joana; Bacchelli, Elena; Bailey, Anthony J.; Baird, Gillian; Battaglia, Agatino; Berney, Tom; Bolshakova, Nadia; Bölte, Sven; Bolton, Patrick F.; Bourgeron, Thomas; Brennan, Sean; Brian, Jessica; Carson, Andrew R.; Casallo, Guillermo; Casey, Jillian; Chu, Su H.; Cochrane, Lynne; Corsello, Christina; Crawford, Emily L.; Crossett, Andrew; Dawson, Geraldine; de Jonge, Maretha; Delorme, Richard; Drmic, Irene; Duketis, Eftichia; Duque, Frederico; Estes, Annette; Farrar, Penny; Fernandez, Bridget A.; Folstein, Susan E.; Fombonne, Eric; Freitag, Christine M.; Gilbert, John; Gillberg, Christopher; Glessner, Joseph T.; Goldberg, Jeremy; Green, Jonathan; Guter, Stephen J.; Hakonarson, Hakon; Heron, Elizabeth A.; Hill, Matthew; Holt, Richard; Howe, Jennifer L.; Hughes, Gillian; Hus, Vanessa; Igliozzi, Roberta; Kim, Cecilia; Klauck, Sabine M.; Kolevzon, Alexander; Korvatska, Olena; Kustanovich, Vlad; Lajonchere, Clara M.; Lamb, Janine A.; Laskawiec, Magdalena; Leboyer, Marion; Le Couteur, Ann; Leventhal, Bennett L.; Lionel, Anath C.; Liu, Xiao-Qing; Lord, Catherine; Lotspeich, Linda; Lund, Sabata C.; Maestrini, Elena; Mahoney, William; Mantoulan, Carine; Marshall, Christian R.; McConachie, Helen; McDougle, Christopher J.; McGrath, Jane; McMahon, William M.; Melhem, Nadine M.; Merikangas, Alison; Migita, Ohsuke; Minshew, Nancy J.; Mirza, Ghazala K.; Munson, Jeff; Nelson, Stanley F.; Noakes, Carolyn; Noor, Abdul; Nygren, Gudrun; Oliveira, Guiomar; Papanikolaou, Katerina; Parr, Jeremy R.; Parrini, Barbara; Paton, Tara; Pickles, Andrew; Piven, Joseph; Posey, David J; Poustka, Annemarie; Poustka, Fritz; Prasad, Aparna; Ragoussis, Jiannis; Renshaw, Katy; Rickaby, Jessica; Roberts, Wendy; Roeder, Kathryn; Roge, Bernadette; Rutter, Michael L.; Bierut, Laura J.; Rice, John P.; Salt, Jeff; Sansom, Katherine; Sato, Daisuke; Segurado, Ricardo; Senman, Lili; Shah, Naisha; Sheffield, Val C.; Soorya, Latha; Sousa, Inês; Stoppioni, Vera; Strawbridge, Christina; Tancredi, Raffaella; Tansey, Katherine; Thiruvahindrapduram, Bhooma; Thompson, Ann P.; Thomson, Susanne; Tryfon, Ana; Tsiantis, John; Van Engeland, Herman; Vincent, John B.; Volkmar, Fred; Wallace, Simon; Wang, Kai; Wang, Zhouzhi; Wassink, Thomas H.; Wing, Kirsty; Wittemeyer, Kerstin; Wood, Shawn; Yaspan, Brian L.; Zurawiecki, Danielle; Zwaigenbaum, Lonnie; Betancur, Catalina; Buxbaum, Joseph D.; Cantor, Rita M.; Cook, Edwin H.; Coon, Hilary; Cuccaro, Michael L.; Gallagher, Louise; Geschwind, Daniel H.; Gill, Michael; Haines, Jonathan L.; Miller, Judith; Monaco, Anthony P.; Nurnberger, John I.; Paterson, Andrew D.; Pericak-Vance, Margaret A.; Schellenberg, Gerard D.; Scherer, Stephen W.; Sutcliffe, James S.; Szatmari, Peter; Vicente, Astrid M.; Vieland, Veronica J.; Wijsman, Ellen M.; Devlin, Bernie; Ennis, Sean; Hallmayer, Joachim

    2010-01-01

    Although autism spectrum disorders (ASDs) have a substantial genetic basis, most of the known genetic risk has been traced to rare variants, principally copy number variants (CNVs). To identify common risk variation, the Autism Genome Project (AGP) Consortium genotyped 1558 rigorously defined ASD families for 1 million single-nucleotide polymorphisms (SNPs) and analyzed these SNP genotypes for association with ASD. In one of four primary association analyses, the association signal for marker rs4141463, located within MACROD2, crossed the genome-wide association significance threshold of P < 5 × 10−8. When a smaller replication sample was analyzed, the risk allele at rs4141463 was again over-transmitted; yet, consistent with the winner's curse, its effect size in the replication sample was much smaller; and, for the combined samples, the association signal barely fell below the P < 5 × 10−8 threshold. Exploratory analyses of phenotypic subtypes yielded no significant associations after correction for multiple testing. They did, however, yield strong signals within several genes, KIAA0564, PLD5, POU6F2, ST8SIA2 and TAF1C. PMID:20663923

  17. Enhancer scanning to locate regulatory regions in genomic loci

    PubMed Central

    Buckley, Melissa; Gjyshi, Anxhela; Mendoza-Fandiño, Gustavo; Baskin, Rebekah; Carvalho, Renato S.; Carvalho, Marcelo A.; Woods, Nicholas T.; Monteiro, Alvaro N.A.

    2016-01-01

    The present protocol provides a rapid, streamlined and scalable strategy to systematically scan genomic regions for the presence of transcriptional regulatory regions active in a specific cell type. It creates genomic tiles spanning a region of interest that are subsequently cloned by recombination into a luciferase reporter vector containing the Simian Virus 40 promoter. Tiling clones are transfected into specific cell types to test for the presence of transcriptional regulatory regions. The protocol includes testing of different SNP (single nucleotide polymorphism) alleles to determine their effect on regulatory activity. This procedure provides a systematic framework to identify candidate functional SNPs within a locus during functional analysis of genome-wide association studies. This protocol adapts and combines previous well-established molecular biology methods to provide a streamlined strategy, based on automated primer design and recombinational cloning to rapidly go from a genomic locus to a set of candidate functional SNPs in eight weeks. PMID:26658467

  18. Genome-wide Linkage Scan of Antisocial Behavior, Depression and Impulsive Substance Use in the UCSF Family Alcoholism Study

    PubMed Central

    Gizer, Ian R.; Ehlers, Cindy L.; Vieten, Cassandra; Feiler, Heidi S.; Gilder, David A.; Wilhelmsen, Kirk C.

    2012-01-01

    OBJECTIVE Epidemiological and clinical studies suggest that rates of antisocial behavior, depression, and impulsive substance use are increased among individuals diagnosed with alcohol dependence relative to those who are not. Thus, the present study conducted genome-wide linkage scans of antisocial behavior, depression, and impulsive substance use in the University of California at San Francisco Family Alcoholism Study. METHODS Antisocial behavior, depressive symptoms, and impulsive substance use were assessed using three scales from the MMPI-2, the Antisocial Practices content scale (ASP), the Depression content scale (DEP), and the revised MacAndrew Alcoholism scale (MAC-R). Linkage analyses were conducted using a variance components approach. RESULTS Suggestive evidence of linkage to three genomic regions independent of alcohol and cannabis dependence diagnostic status was observed: the ASP scale showed evidence of linkage to chromosome 13 at 11 cM, the MAC-R scale showed evidence of linkage to chromosome 15 at 47 cM, and all 3 scales showed evidence of linkage to chromosome 17 at 57–58 cM. CONCLUSIONS Each of these regions has shown prior evidence of linkage and association to substance dependence as well as other psychiatric disorders such as mood and anxiety disorders, ADHD, and schizophrenia thus suggesting potentially broad relations between these regions and psychopathology. PMID:22517380

  19. A genome-wide linkage scan for quantitative trait loci underlying obesity related phenotypes in 434 Caucasian families.

    PubMed

    Zhao, Lan-Juan; Xiao, Peng; Liu, Yong-Jun; Xiong, Dong-Hai; Shen, Hui; Recker, Robert R; Deng, Hong-Wen

    2007-03-01

    To identify quantitative trait loci (QTLs) that contribute to obesity, we performed a large-scale whole genome linkage scan (WGS) involving 4,102 individuals from 434 Caucasian families. The most pronounced linkage evidence was found at the genomic region 20p11-12 for fat mass (LOD = 3.31) and percentage fat mass (PFM) (LOD = 2.92). We also identified several regions showing suggestive linkage signals (threshold LOD = 1.9) for obesity phenotypes, including 5q35, 8q13, 10p12, and 17q11.

  20. Genome-Wide Association Scan in HIV-1-Infected Individuals Identifying Variants Influencing Disease Course

    PubMed Central

    van Manen, Daniëlle; Delaneau, Olivier; Kootstra, Neeltje A.; Boeser-Nunnink, Brigitte D.; Limou, Sophie; Bol, Sebastiaan M.; Burger, Judith A.; Zwinderman, Aeilko H.; Moerland, Perry D.; van 't Slot, Ruben; Zagury, Jean-François; van 't Wout, Angélique B.; Schuitemaker, Hanneke

    2011-01-01

    Background AIDS develops typically after 7–11 years of untreated HIV-1 infection, with extremes of very rapid disease progression (<2 years) and long-term non-progression (>15 years). To reveal additional host genetic factors that may impact on the clinical course of HIV-1 infection, we designed a genome-wide association study (GWAS) in 404 participants of the Amsterdam Cohort Studies on HIV-1 infection and AIDS. Methods The association of SNP genotypes with the clinical course of HIV-1 infection was tested in Cox regression survival analyses using AIDS-diagnosis and AIDS-related death as endpoints. Results Multiple, not previously identified SNPs, were identified to be strongly associated with disease progression after HIV-1 infection, albeit not genome-wide significant. However, three independent SNPs in the top ten associations between SNP genotypes and time between seroconversion and AIDS-diagnosis, and one from the top ten associations between SNP genotypes and time between seroconversion and AIDS-related death, had P-values smaller than 0.05 in the French Genomics of Resistance to Immunodeficiency Virus cohort on disease progression. Conclusions Our study emphasizes that the use of different phenotypes in GWAS may be useful to unravel the full spectrum of host genetic factors that may be associated with the clinical course of HIV-1 infection. PMID:21811574

  1. Lack of association between arterial stiffness and genetic variants by genome-wide association scan.

    PubMed

    Park, Sungha; Lee, Ji-Young; Kim, Byeong-Keuk; Lee, Sang-Hak; Chang, Hyuk-Jae; Choi, DongHoon; Jang, Yangsoo

    2015-01-01

    Arterial stiffness is an independent predictor of cardiovascular disease risk. However, whether genetic risk variants are associated with arterial stiffness measures, such as pulse-wave velocity (PWV), is largely unknown. Therefore, we performed a genome-wide association study (GWAS) to identify single-nucleotide polymorphisms (SNPs) associated with PWV in a Korea population. Study participants consisted of 402 patients in the Yonsei cardiovascular genome center cohort. Arterial stiffness was measured as brachial-ankle pulse-wave velocity (baPWV). Genotyping was performed in 402 subjects with the Axiom Genome-Wide ASI 1 Array Plate containing more than 600,000 SNP markers. The findings were tested for replication in independent subjects from a community-based cohort of 1206 individuals, using a Taqman assay to include two candidate SNPs. Associations with PWV were evaluated using an additive genetic model that included age, gender, systolic blood pressure and diastolic blood pressure as covariates. GWAS and replication analyses were conducted using the measured genotype method implemented in PLINK and SAS. We observed two candidate SNPs associated with baPWV in GWAS: rs7271920 (p = 7.15 × 10(-9)) and rs10125157 (p = 8.25 × 10(-7)). However, neither of these was significant in the replication cohort. In summary, we did not identify any common genetic variants associated with baPWV in cardiovascular patients.

  2. Genome-Wide Variation Patterns Uncover the Origin and Selection in Cultivated Ginseng (Panax ginseng Meyer).

    PubMed

    Li, Ming-Rui; Shi, Feng-Xue; Li, Ya-Ling; Jiang, Peng; Jiao, Lili; Liu, Bao; Li, Lin-Feng

    2017-09-01

    Chinese ginseng (Panax ginseng Meyer) is a medicinally important herb and plays crucial roles in traditional Chinese medicine. Pharmacological analyses identified diverse bioactive components from Chinese ginseng. However, basic biological attributes including domestication and selection of the ginseng plant remain under-investigated. Here, we presented a genome-wide view of the domestication and selection of cultivated ginseng based on the whole genome data. A total of 8,660 protein-coding genes were selected for genome-wide scanning of the 30 wild and cultivated ginseng accessions. In complement, the 45s rDNA, chloroplast and mitochondrial genomes were included to perform phylogenetic and population genetic analyses. The observed spatial genetic structure between northern cultivated ginseng (NCG) and southern cultivated ginseng (SCG) accessions suggested multiple independent origins of cultivated ginseng. Genome-wide scanning further demonstrated that NCG and SCG have undergone distinct selection pressures during the domestication process, with more genes identified in the NCG (97 genes) than in the SCG group (5 genes). Functional analyses revealed that these genes are involved in diverse pathways, including DNA methylation, lignin biosynthesis, and cell differentiation. These findings suggested that the SCG and NCG groups have distinct demographic histories. Candidate genes identified are useful for future molecular breeding of cultivated ginseng. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  3. Genome-wide patterns of selection in 230 ancient Eurasians

    PubMed Central

    Mathieson, Iain; Lazaridis, Iosif; Rohland, Nadin; Mallick, Swapan; Patterson, Nick; Roodenberg, Songül Alpaslan; Harney, Eadaoin; Stewardson, Kristin; Fernandes, Daniel; Novak, Mario; Sirak, Kendra; Gamba, Cristina; Jones, Eppie R.; Llamas, Bastien; Dryomov, Stanislav; Pickrel, Joseph; Arsuaga, Juan Luís; de Castro, José María Bermúdez; Carbonell, Eudald; Gerritsen, Fokke; Khokhlov, Aleksandr; Kuznetsov, Pavel; Lozano, Marina; Meller, Harald; Mochalov, Oleg; Moiseyev, Vayacheslav; Rojo Guerra, Manuel A.; Roodenberg, Jacob; Vergès, Josep Maria; Krause, Johannes; Cooper, Alan; Alt, Kurt W.; Brown, Dorcas; Anthony, David; Lalueza-Fox, Carles; Haak, Wolfgang; Pinhasi, Ron; Reich, David

    2016-01-01

    Ancient DNA makes it possible to directly witness natural selection by analyzing samples from populations before, during and after adaptation events. Here we report the first scan for selection using ancient DNA, capitalizing on the largest genome-wide dataset yet assembled: 230 West Eurasians dating to between 6500 and 1000 BCE, including 163 with newly reported data. The new samples include the first genome-wide data from the Anatolian Neolithic culture whose genetic material we extracted from the DNA-rich petrous bone and who we show were members of the population that was the source of Europe’s first farmers. We also report a complete transect of the steppe region in Samara between 5500 and 1200 BCE that allows us to recognize admixture from at least two external sources into steppe populations during this period. We detect selection at loci associated with diet, pigmentation and immunity, and two independent episodes of selection on height. PMID:26595274

  4. Genome-wide association Scan of dental caries in the permanent dentition

    PubMed Central

    2012-01-01

    Background Over 90% of adults aged 20 years or older with permanent teeth have suffered from dental caries leading to pain, infection, or even tooth loss. Although caries prevalence has decreased over the past decade, there are still about 23% of dentate adults who have untreated carious lesions in the US. Dental caries is a complex disorder affected by both individual susceptibility and environmental factors. Approximately 35-55% of caries phenotypic variation in the permanent dentition is attributable to genes, though few specific caries genes have been identified. Therefore, we conducted the first genome-wide association study (GWAS) to identify genes affecting susceptibility to caries in adults. Methods Five independent cohorts were included in this study, totaling more than 7000 participants. For each participant, dental caries was assessed and genetic markers (single nucleotide polymorphisms, SNPs) were genotyped or imputed across the entire genome. Due to the heterogeneity among the five cohorts regarding age, genotyping platform, quality of dental caries assessment, and study design, we first conducted genome-wide association (GWA) analyses on each of the five independent cohorts separately. We then performed three meta-analyses to combine results for: (i) the comparatively younger, Appalachian cohorts (N = 1483) with well-assessed caries phenotype, (ii) the comparatively older, non-Appalachian cohorts (N = 5960) with inferior caries phenotypes, and (iii) all five cohorts (N = 7443). Top ranking genetic loci within and across meta-analyses were scrutinized for biologically plausible roles on caries. Results Different sets of genes were nominated across the three meta-analyses, especially between the younger and older age cohorts. In general, we identified several suggestive loci (P-value ≤ 10E-05) within or near genes with plausible biological roles for dental caries, including RPS6KA2 and PTK2B, involved in p38-depenedent MAPK signaling

  5. The Genetic Architecture of Seed Composition in Soybean Is Refined by Genome-Wide Association Scans Across Multiple Populations

    PubMed Central

    Vaughn, Justin N.; Nelson, Randall L.; Song, Qijian; Cregan, Perry B.; Li, Zenglu

    2014-01-01

    Soybean oil and meal are major contributors to world-wide food production. Consequently, the genetic basis for soybean seed composition has been intensely studied using family-based mapping. Population-based mapping approaches, in the form of genome-wide association (GWA) scans, have been able to resolve loci controlling moderately complex quantitative traits (QTL) in numerous crop species. Yet, it is still unclear how soybean’s unique population history will affect GWA scans. Using one of the populations in this study, we simulated phenotypes resulting from a range of genetic architectures. We found that with a heritability of 0.5, ∼100% and ∼33% of the 4 and 20 simulated QTL can be recovered, respectively, with a false-positive rate of less than ∼6×10−5 per marker tested. Additionally, we demonstrated that combining information from multi-locus mixed models and compressed linear-mixed models improves QTL identification and interpretation. We applied these insights to exploring seed composition in soybean, refining the linkage group I (chromosome 20) protein QTL and identifying additional oil QTL that may allow some decoupling of highly correlated oil and protein phenotypes. Because the value of protein meal is closely related to its essential amino acid profile, we attempted to identify QTL underlying methionine, threonine, cysteine, and lysine content. Multiple QTL were found that have not been observed in family-based mapping studies, and each trait exhibited associations across multiple populations. Chromosomes 1 and 8 contain strong candidate alleles for essential amino acid increases. Overall, we present these and additional data that will be useful in determining breeding strategies for the continued improvement of soybean’s nutrient portfolio. PMID:25246241

  6. Genome-wide scan of IQ finds significant linkage to a quantitative trait locus on 2q.

    PubMed

    Luciano, M; Wright, M J; Duffy, D L; Wainwright, M A; Zhu, G; Evans, D M; Geffen, G M; Montgomery, G W; Martin, N G

    2006-01-01

    A genome-wide linkage scan of 795 microsatellite markers (761 autosomal, 34 X chromosome) was performed on Multidimensional Aptitude Battery subtests and verbal, performance and full scale scores, the WAIS-R Digit Symbol subtest, and two word-recognition tests (Schonell Graded Word Reading Test, Cambridge Contextual Reading Test) highly predictive of IQ. The sample included 361 families comprising 2-5 siblings who ranged in age from 15.7 to 22.2 years; genotype, but not phenotype, data were available for 81% of parents. A variance components analysis which controlled for age and sex effects showed significant linkage for the Cambridge reading test and performance IQ to the same region on chromosome 2, with respective LOD scores of 4.15 and 3.68. Suggestive linkage (LOD score>2.2) for various measures was further supported on chromosomes 6, 7, 11, 14, 21 and 22. Where location of linkage peaks converged for IQ subtests within the same scale, the overall scale score provided increased evidence for linkage to that region over any individual subtest. Association studies of candidate genes, particularly those involved in neural transmission and development, will be directed to genes located under the linkage peaks identified in this study.

  7. Genome Wide Scan for Loci influencing Warner Bratzler Shear Force in Five Bos taurus Breeds

    USDA-ARS?s Scientific Manuscript database

    Genetic tests for beef tenderness are currently limited to single nucleotide polymorphisms (SNPs) within µ-calpain (CAPN1) and calpastatin (CAST) and explain little of the phenotypic variation in Warner-Bratzler shear force (WBSF). We performed a genome-wide association study for WBSF by genotyping...

  8. A genome-wide association scan implicates DCHS2, RUNX2, GLI3, PAX1 and EDAR in human facial variation

    PubMed Central

    Adhikari, Kaustubh; Fuentes-Guajardo, Macarena; Quinto-Sánchez, Mirsha; Mendoza-Revilla, Javier; Camilo Chacón-Duque, Juan; Acuña-Alonzo, Victor; Jaramillo, Claudia; Arias, William; Lozano, Rodrigo Barquera; Pérez, Gastón Macín; Gómez-Valdés, Jorge; Villamil-Ramírez, Hugo; Hunemeier, Tábita; Ramallo, Virginia; Silva de Cerqueira, Caio C.; Hurtado, Malena; Villegas, Valeria; Granja, Vanessa; Gallo, Carla; Poletti, Giovanni; Schuler-Faccini, Lavinia; Salzano, Francisco M.; Bortolini, Maria- Cátira; Canizales-Quinteros, Samuel; Cheeseman, Michael; Rosique, Javier; Bedoya, Gabriel; Rothhammer, Francisco; Headon, Denis; González-José, Rolando; Balding, David; Ruiz-Linares, Andrés

    2016-01-01

    We report a genome-wide association scan for facial features in ∼6,000 Latin Americans. We evaluated 14 traits on an ordinal scale and found significant association (P values<5 × 10−8) at single-nucleotide polymorphisms (SNPs) in four genomic regions for three nose-related traits: columella inclination (4q31), nose bridge breadth (6p21) and nose wing breadth (7p13 and 20p11). In a subsample of ∼3,000 individuals we obtained quantitative traits related to 9 of the ordinal phenotypes and, also, a measure of nasion position. Quantitative analyses confirmed the ordinal-based associations, identified SNPs in 2q12 associated to chin protrusion, and replicated the reported association of nasion position with SNPs in PAX3. Strongest association in 2q12, 4q31, 6p21 and 7p13 was observed for SNPs in the EDAR, DCHS2, RUNX2 and GLI3 genes, respectively. Associated SNPs in 20p11 extend to PAX1. Consistent with the effect of EDAR on chin protrusion, we documented alterations of mandible length in mice with modified Edar funtion. PMID:27193062

  9. A genome-wide association scan implicates DCHS2, RUNX2, GLI3, PAX1 and EDAR in human facial variation.

    PubMed

    Adhikari, Kaustubh; Fuentes-Guajardo, Macarena; Quinto-Sánchez, Mirsha; Mendoza-Revilla, Javier; Camilo Chacón-Duque, Juan; Acuña-Alonzo, Victor; Jaramillo, Claudia; Arias, William; Lozano, Rodrigo Barquera; Pérez, Gastón Macín; Gómez-Valdés, Jorge; Villamil-Ramírez, Hugo; Hunemeier, Tábita; Ramallo, Virginia; Silva de Cerqueira, Caio C; Hurtado, Malena; Villegas, Valeria; Granja, Vanessa; Gallo, Carla; Poletti, Giovanni; Schuler-Faccini, Lavinia; Salzano, Francisco M; Bortolini, Maria-Cátira; Canizales-Quinteros, Samuel; Cheeseman, Michael; Rosique, Javier; Bedoya, Gabriel; Rothhammer, Francisco; Headon, Denis; González-José, Rolando; Balding, David; Ruiz-Linares, Andrés

    2016-05-19

    We report a genome-wide association scan for facial features in ∼6,000 Latin Americans. We evaluated 14 traits on an ordinal scale and found significant association (P values<5 × 10(-8)) at single-nucleotide polymorphisms (SNPs) in four genomic regions for three nose-related traits: columella inclination (4q31), nose bridge breadth (6p21) and nose wing breadth (7p13 and 20p11). In a subsample of ∼3,000 individuals we obtained quantitative traits related to 9 of the ordinal phenotypes and, also, a measure of nasion position. Quantitative analyses confirmed the ordinal-based associations, identified SNPs in 2q12 associated to chin protrusion, and replicated the reported association of nasion position with SNPs in PAX3. Strongest association in 2q12, 4q31, 6p21 and 7p13 was observed for SNPs in the EDAR, DCHS2, RUNX2 and GLI3 genes, respectively. Associated SNPs in 20p11 extend to PAX1. Consistent with the effect of EDAR on chin protrusion, we documented alterations of mandible length in mice with modified Edar funtion.

  10. InterProScan 5: genome-scale protein function classification

    PubMed Central

    Jones, Philip; Binns, David; Chang, Hsin-Yu; Fraser, Matthew; Li, Weizhong; McAnulla, Craig; McWilliam, Hamish; Maslen, John; Mitchell, Alex; Nuka, Gift; Pesseat, Sebastien; Quinn, Antony F.; Sangrador-Vegas, Amaia; Scheremetjew, Maxim; Yong, Siew-Yit; Lopez, Rodrigo; Hunter, Sarah

    2014-01-01

    Motivation: Robust large-scale sequence analysis is a major challenge in modern genomic science, where biologists are frequently trying to characterize many millions of sequences. Here, we describe a new Java-based architecture for the widely used protein function prediction software package InterProScan. Developments include improvements and additions to the outputs of the software and the complete reimplementation of the software framework, resulting in a flexible and stable system that is able to use both multiprocessor machines and/or conventional clusters to achieve scalable distributed data analysis. InterProScan is freely available for download from the EMBl-EBI FTP site and the open source code is hosted at Google Code. Availability and implementation: InterProScan is distributed via FTP at ftp://ftp.ebi.ac.uk/pub/software/unix/iprscan/5/ and the source code is available from http://code.google.com/p/interproscan/. Contact: http://www.ebi.ac.uk/support or interhelp@ebi.ac.uk or mitchell@ebi.ac.uk PMID:24451626

  11. A genome-wide scan for signatures of differential artificial selection in ten cattle breeds.

    PubMed

    Rothammer, Sophie; Seichter, Doris; Förster, Martin; Medugorac, Ivica

    2013-12-21

    Since the times of domestication, cattle have been continually shaped by the influence of humans. Relatively recent history, including breed formation and the still enduring enormous improvement of economically important traits, is expected to have left distinctive footprints of selection within the genome. The purpose of this study was to map genome-wide selection signatures in ten cattle breeds and thus improve the understanding of the genome response to strong artificial selection and support the identification of the underlying genetic variants of favoured phenotypes. We analysed 47,651 single nucleotide polymorphisms (SNP) using Cross Population Extended Haplotype Homozygosity (XP-EHH). We set the significance thresholds using the maximum XP-EHH values of two essentially artificially unselected breeds and found up to 229 selection signatures per breed. Through a confirmation process we verified selection for three distinct phenotypes typical for one breed (polledness in Galloway, double muscling in Blanc-Bleu Belge and red coat colour in Red Holstein cattle). Moreover, we detected six genes strongly associated with known QTL for beef or dairy traits (TG, ABCG2, DGAT1, GH1, GHR and the Casein Cluster) within selection signatures of at least one breed. A literature search for genes lying in outstanding signatures revealed further promising candidate genes. However, in concordance with previous genome-wide studies, we also detected a substantial number of signatures without any yet known gene content. These results show the power of XP-EHH analyses in cattle to discover promising candidate genes and raise the hope of identifying phenotypically important variants in the near future. The finding of plausible functional candidates in some short signatures supports this hope. For instance, MAP2K6 is the only annotated gene of two signatures detected in Galloway and Gelbvieh cattle and is already known to be associated with carcass weight, back fat thickness and

  12. Genome-wide Association Study Identifies African-Specific Susceptibility Loci in African Americans with Inflammatory Bowel Disease

    PubMed Central

    Brant, Steven R.; Okou, David T.; Simpson, Claire L.; Cutler, David J.; Haritunians, Talin; Bradfield, Jonathan P.; Chopra, Pankaj; Prince, Jarod; Begum, Ferdouse; Kumar, Archana; Huang, Chengrui; Venkateswaran, Suresh; Datta, Lisa W.; Wei, Zhi; Thomas, Kelly; Herrinton, Lisa J.; Klapproth, Jan-Micheal A.; Quiros, Antonio J.; Seminerio, Jenifer; Liu, Zhenqiu; Alexander, Jonathan S.; Baldassano, Robert N.; Dudley-Brown, Sharon; Cross, Raymond K.; Dassopoulos, Themistocles; Denson, Lee A.; Dhere, Tanvi A.; Dryden, Gerald W.; Hanson, John S.; Hou, Jason K.; Hussain, Sunny Z.; Hyams, Jeffrey S.; Isaacs, Kim L.; Kader, Howard; Kappelman, Michael D.; Katz, Jeffry; Kellermayer, Richard; Kirschner, Barbara S.; Kuemmerle, John F.; Kwon, John H.; Lazarev, Mark; Li, Ellen; Mack, David; Mannon, Peter; Moulton, Dedrick E.; Newberry, Rodney D.; Osuntokun, Bankole O.; Patel, Ashish S.; Saeed, Shehzad A.; Targan, Stephan R.; Valentine, John F.; Wang, Ming-Hsi; Zonca, Martin; Rioux, John D.; Duerr, Richard H.; Silverberg, Mark S.; Cho, Judy H.; Hakonarson, Hakon; Zwick, Michael E.; McGovern, Dermot P.B.; Kugathasan, Subra

    2016-01-01

    Background & Aims The inflammatory bowel diseases (IBD) ulcerative colitis (UC) and Crohn’s disease (CD) cause significant morbidity and are increasing in prevalence among all populations, including African Americans. More than 200 susceptibility loci have been identified in populations of predominantly European ancestry, but few loci have been associated with IBD in other ethnicities. Methods We performed 2 high-density, genome-wide scans comprising 2345 cases of African Americans with IBD (1646 with CD, 583 with UC, and 116 inflammatory bowel disease unclassified [IBD-U]) and 5002 individuals without IBD (controls, identified from the Health Retirement Study and Kaiser Permanente database). Single-nucleotide polymorphisms (SNPs) associated at P<5.0×10−8 in meta-analysis with a nominal evidence (P<.05) in each scan were considered to have genome-wide significance. Results We detected SNPs at HLA-DRB1, and African-specific SNPs at ZNF649 and LSAMP, with associations of genome-wide significance for UC. We detected SNPs at USP25 with associations of genome-wide significance associations for IBD. No associations of genome-wide significance were detected for CD. In addition, 9 genes previously associated with IBD contained SNPs with significant evidence for replication (P<1.6×10−6): ADCY3, CXCR6, HLA-DRB1 to HLA-DQA1 (genome-wide significance on conditioning), IL12B, PTGER4, and TNC for IBD; IL23R, PTGER4, and SNX20 (in strong linkage disequilibrium with NOD2) for CD; and KCNQ2 (near TNFRSF6B) for UC. Several of these genes, such as TNC (near TNFSF15), CXCR6, and genes associated with IBD at the HLA locus, contained SNPs with unique association patterns with African-specific alleles. Conclusions We performed a genome-wide association study of African Americans with IBD and identified loci associated with CD and UC in only this population; we also replicated loci identified in European populations. The detection of variants associated with IBD risk in only

  13. Genome-Wide Association Study Identifies African-Specific Susceptibility Loci in African Americans With Inflammatory Bowel Disease.

    PubMed

    Brant, Steven R; Okou, David T; Simpson, Claire L; Cutler, David J; Haritunians, Talin; Bradfield, Jonathan P; Chopra, Pankaj; Prince, Jarod; Begum, Ferdouse; Kumar, Archana; Huang, Chengrui; Venkateswaran, Suresh; Datta, Lisa W; Wei, Zhi; Thomas, Kelly; Herrinton, Lisa J; Klapproth, Jan-Micheal A; Quiros, Antonio J; Seminerio, Jenifer; Liu, Zhenqiu; Alexander, Jonathan S; Baldassano, Robert N; Dudley-Brown, Sharon; Cross, Raymond K; Dassopoulos, Themistocles; Denson, Lee A; Dhere, Tanvi A; Dryden, Gerald W; Hanson, John S; Hou, Jason K; Hussain, Sunny Z; Hyams, Jeffrey S; Isaacs, Kim L; Kader, Howard; Kappelman, Michael D; Katz, Jeffry; Kellermayer, Richard; Kirschner, Barbara S; Kuemmerle, John F; Kwon, John H; Lazarev, Mark; Li, Ellen; Mack, David; Mannon, Peter; Moulton, Dedrick E; Newberry, Rodney D; Osuntokun, Bankole O; Patel, Ashish S; Saeed, Shehzad A; Targan, Stephan R; Valentine, John F; Wang, Ming-Hsi; Zonca, Martin; Rioux, John D; Duerr, Richard H; Silverberg, Mark S; Cho, Judy H; Hakonarson, Hakon; Zwick, Michael E; McGovern, Dermot P B; Kugathasan, Subra

    2017-01-01

    The inflammatory bowel diseases (IBD) ulcerative colitis (UC) and Crohn's disease (CD) cause significant morbidity and are increasing in prevalence among all populations, including African Americans. More than 200 susceptibility loci have been identified in populations of predominantly European ancestry, but few loci have been associated with IBD in other ethnicities. We performed 2 high-density, genome-wide scans comprising 2345 cases of African Americans with IBD (1646 with CD, 583 with UC, and 116 inflammatory bowel disease unclassified) and 5002 individuals without IBD (controls, identified from the Health Retirement Study and Kaiser Permanente database). Single-nucleotide polymorphisms (SNPs) associated at P < 5.0 × 10 -8 in meta-analysis with a nominal evidence (P < .05) in each scan were considered to have genome-wide significance. We detected SNPs at HLA-DRB1, and African-specific SNPs at ZNF649 and LSAMP, with associations of genome-wide significance for UC. We detected SNPs at USP25 with associations of genome-wide significance for IBD. No associations of genome-wide significance were detected for CD. In addition, 9 genes previously associated with IBD contained SNPs with significant evidence for replication (P < 1.6 × 10 -6 ): ADCY3, CXCR6, HLA-DRB1 to HLA-DQA1 (genome-wide significance on conditioning), IL12B,PTGER4, and TNC for IBD; IL23R, PTGER4, and SNX20 (in strong linkage disequilibrium with NOD2) for CD; and KCNQ2 (near TNFRSF6B) for UC. Several of these genes, such as TNC (near TNFSF15), CXCR6, and genes associated with IBD at the HLA locus, contained SNPs with unique association patterns with African-specific alleles. We performed a genome-wide association study of African Americans with IBD and identified loci associated with UC in only this population; we also replicated IBD, CD, and UC loci identified in European populations. The detection of variants associated with IBD risk in only people of African descent demonstrates the

  14. High-resolution genome-wide scan of genes, gene-networks and cellular systems impacting the yeast ionome

    USDA-ARS?s Scientific Manuscript database

    To balance the demand for uptake of essential elements with their potential toxicity living cells have complex regulatory mechanisms. Here, we describe a genome-wide screen to identify genes that impact the elemental composition (‘ionome’) of yeast Saccharomyces cerevisiae. Using inductively coupled...

  15. Memory management in genome-wide association studies

    PubMed Central

    2009-01-01

    Genome-wide association is a powerful tool for the identification of genes that underlie common diseases. Genome-wide association studies generate billions of genotypes and pose significant computational challenges for most users including limited computer memory. We applied a recently developed memory management tool to two analyses of North American Rheumatoid Arthritis Consortium studies and measured the performance in terms of central processing unit and memory usage. We conclude that our memory management approach is simple, efficient, and effective for genome-wide association studies. PMID:20018047

  16. A genome-wide association scan for acute insulin response to glucose in Hispanic-Americans: the Insulin Resistance Atherosclerosis Family Study (IRAS FS).

    PubMed

    Rich, S S; Goodarzi, M O; Palmer, N D; Langefeld, C D; Ziegler, J; Haffner, S M; Bryer-Ash, M; Norris, J M; Taylor, K D; Haritunians, T; Rotter, J I; Chen, Y-D I; Wagenknecht, L E; Bowden, D W; Bergman, R N

    2009-07-01

    This study sought to identify genes and regions in the human genome that are associated with the acute insulin response to glucose (AIRg), an important predictor of type 2 diabetes, in Hispanic-American participants from the Insulin Resistance Atherosclerosis Family Study (IRAS FS). A two-stage genome-wide association scan (GWAS) was performed in IRAS FS Hispanic-American samples. In the first stage, 317K single nucleotide polymorphisms (SNPs) were assessed in 229 Hispanic-American DNA samples from 34 families from San Antonio, TX, USA. SNPs with the most significant associations with AIRg were genotyped in the entire set of IRAS FS Hispanic-American samples (n = 1,190). In chromosomal regions with evidence of association, additional SNPs were genotyped to capture variation in genes. No individual SNP achieved genome-wide levels of significance (p < 5 x 10(-7)); however, two regions (chromosomes 6p21 and 20p11) had multiple highly ranked SNPs that were associated with AIRg. Additional genotyping in these regions supported the initial evidence of variants contributing to variation in AIRg. One region resides in a gene desert between PXT1 and KCTD20 on 6p21, while the region on 20p11 has several viable candidate genes (ENTPD6, PYGB, GINS1 and RP4-691N24.1). A GWAS in Hispanic-American samples identified several candidate genes and loci that may be associated with AIRg. These associations explain a small component of variation in AIRg. The genes identified are involved in phosphorylation and ion transport, and provide preliminary evidence that these processes are important in beta cell response.

  17. A genome-wide pleiotropy scan for prostate cancer risk.

    PubMed

    Panagiotou, Orestis A; Travis, Ruth C; Campa, Daniele; Berndt, Sonja I; Lindstrom, Sara; Kraft, Peter; Schumacher, Fredrick R; Siddiq, Afshan; Papatheodorou, Stefania I; Stanford, Janet L; Albanes, Demetrius; Virtamo, Jarmo; Weinstein, Stephanie J; Diver, W Ryan; Gapstur, Susan M; Stevens, Victoria L; Boeing, Heiner; Bueno-de-Mesquita, H Bas; Barricarte Gurrea, Aurelio; Kaaks, Rudolf; Khaw, Kay-Tee; Krogh, Vittorio; Overvad, Kim; Riboli, Elio; Trichopoulos, Dimitrios; Giovannucci, Edward; Stampfer, Meir; Haiman, Christopher; Henderson, Brian; Le Marchand, Loic; Gaziano, J Michael; Hunter, David J; Koutros, Stella; Yeager, Meredith; Hoover, Robert N; Chanock, Stephen J; Wacholder, Sholom; Key, Timothy J; Tsilidis, Konstantinos K

    2015-04-01

    No single-nucleotide polymorphisms (SNPs) specific for aggressive prostate cancer have been identified in genome-wide association studies (GWAS). To test if SNPs associated with other traits may also affect the risk of aggressive prostate cancer. SNPs implicated in any phenotype other than prostate cancer (p≤10(-7)) were identified through the catalog of published GWAS and tested in 2891 aggressive prostate cancer cases and 4592 controls from the Breast and Prostate Cancer Cohort Consortium (BPC3). The 40 most significant SNPs were followed up in 4872 aggressive prostate cancer cases and 24,534 controls from the Prostate Cancer Association Group to Investigate Cancer Associated Alterations in the Genome (PRACTICAL) consortium. Odds ratios (ORs) and 95% confidence intervals (CIs) for aggressive prostate cancer were estimated. A total of 4666 SNPs were evaluated by the BPC3. Two signals were seen in regions already reported for prostate cancer risk. rs7014346 at 8q24.21 was marginally associated with aggressive prostate cancer in the BPC3 trial (p=1.6×10(-6)), whereas after meta-analysis by PRACTICAL the summary OR was 1.21 (95% CI 1.16-1.27; p=3.22×10(-18)). rs9900242 at 17q24.3 was also marginally associated with aggressive disease in the meta-analysis (OR 0.90, 95% CI 0.86-0.94; p=2.5×10(-6)). Neither of these SNPs remained statistically significant when conditioning on correlated known prostate cancer SNPs. The meta-analysis by BPC3 and PRACTICAL identified a third promising signal, marked by rs16844874 at 2q34, independent of known prostate cancer loci (OR 1.12, 95% CI 1.06-1.19; p=4.67×10(-5)); it has been shown that SNPs correlated with this signal affect glycine concentrations. The main limitation is the heterogeneity in the definition of aggressive prostate cancer between BPC3 and PRACTICAL. We did not identify new SNPs for aggressive prostate cancer. However, rs16844874 may provide preliminary genetic evidence on the role of the glycine pathway in

  18. Genomic resources and their influence on the detection of the signal of positive selection in genome scans.

    PubMed

    Manel, S; Perrier, C; Pratlong, M; Abi-Rached, L; Paganini, J; Pontarotti, P; Aurelle, D

    2016-01-01

    Genome scans represent powerful approaches to investigate the action of natural selection on the genetic variation of natural populations and to better understand local adaptation. This is very useful, for example, in the field of conservation biology and evolutionary biology. Thanks to Next Generation Sequencing, genomic resources are growing exponentially, improving genome scan analyses in non-model species. Thousands of SNPs called using Reduced Representation Sequencing are increasingly used in genome scans. Besides, genome sequences are also becoming increasingly available, allowing better processing of short-read data, offering physical localization of variants, and improving haplotype reconstruction and data imputation. Ultimately, genome sequences are also becoming the raw material for selection inferences. Here, we discuss how the increasing availability of such genomic resources, notably genome sequences, influences the detection of signals of selection. Mainly, increasing data density and having the information of physical linkage data expand genome scans by (i) improving the overall quality of the data, (ii) helping the reconstruction of demographic history for the population studied to decrease false-positive rates and (iii) improving the statistical power of methods to detect the signal of selection. Of particular importance, the availability of a high-quality reference genome can improve the detection of the signal of selection by (i) allowing matching the potential candidate loci to linked coding regions under selection, (ii) rapidly moving the investigation to the gene and function and (iii) ensuring that the highly variable regions of the genomes that include functional genes are also investigated. For all those reasons, using reference genomes in genome scan analyses is highly recommended. © 2015 John Wiley & Sons Ltd.

  19. Realization of an Ultra-thin Metasurface to Facilitate Wide Bandwidth, Wide Angle Beam Scanning.

    PubMed

    Bah, Alpha O; Qin, Pei-Yuan; Ziolkowski, Richard W; Cheng, Qiang; Guo, Y Jay

    2018-03-19

    A wide bandwidth, ultra-thin, metasurface is reported that facilitates wide angle beam scanning. Each unit cell of the metasurface contains a multi-resonant, strongly-coupled unequal arm Jerusalem cross element. This element consists of two bent-arm, orthogonal, capacitively loaded strips. The wide bandwidth of the metasurface is achieved by taking advantage of the strong coupling within and between its multi-resonant elements. A prototype of the proposed metasurface has been fabricated and measured. The design concept has been validated by the measured results. The proposed metasurface is able to alleviate the well-known problem of impedance mismatch caused by mutual coupling when the main beam of an array is scanned. In order to validate the wideband and wide scanning ability of the proposed metasurface, it is integrated with a wideband antenna array as a wide angle impedance matching element. The metasurface-array combination facilitates wide angle scanning over a 6:1 impedance bandwidth without the need for bulky dielectrics or multi-layered structures.

  20. Genome-Wide Association Studies with a Genomic Relationship Matrix: A Case Study with Wheat and Arabidopsis

    PubMed Central

    Gianola, Daniel; Fariello, Maria I.; Naya, Hugo; Schön, Chris-Carolin

    2016-01-01

    Standard genome-wide association studies (GWAS) scan for relationships between each of p molecular markers and a continuously distributed target trait. Typically, a marker-based matrix of genomic similarities among individuals (G) is constructed, to account more properly for the covariance structure in the linear regression model used. We show that the generalized least-squares estimator of the regression of phenotype on one or on m markers is invariant with respect to whether or not the marker(s) tested is(are) used for building G, provided variance components are unaffected by exclusion of such marker(s) from G. The result is arrived at by using a matrix expression such that one can find many inverses of genomic relationship, or of phenotypic covariance matrices, stemming from removing markers tested as fixed, but carrying out a single inversion. When eigenvectors of the genomic relationship matrix are used as regressors with fixed regression coefficients, e.g., to account for population stratification, their removal from G does matter. Removal of eigenvectors from G can have a noticeable effect on estimates of genomic and residual variances, so caution is needed. Concepts were illustrated using genomic data on 599 wheat inbred lines, with grain yield as target trait, and on close to 200 Arabidopsis thaliana accessions. PMID:27520956

  1. GWAMA: software for genome-wide association meta-analysis.

    PubMed

    Mägi, Reedik; Morris, Andrew P

    2010-05-28

    Despite the recent success of genome-wide association studies in identifying novel loci contributing effects to complex human traits, such as type 2 diabetes and obesity, much of the genetic component of variation in these phenotypes remains unexplained. One way to improving power to detect further novel loci is through meta-analysis of studies from the same population, increasing the sample size over any individual study. Although statistical software analysis packages incorporate routines for meta-analysis, they are ill equipped to meet the challenges of the scale and complexity of data generated in genome-wide association studies. We have developed flexible, open-source software for the meta-analysis of genome-wide association studies. The software incorporates a variety of error trapping facilities, and provides a range of meta-analysis summary statistics. The software is distributed with scripts that allow simple formatting of files containing the results of each association study and generate graphical summaries of genome-wide meta-analysis results. The GWAMA (Genome-Wide Association Meta-Analysis) software has been developed to perform meta-analysis of summary statistics generated from genome-wide association studies of dichotomous phenotypes or quantitative traits. Software with source files, documentation and example data files are freely available online at http://www.well.ox.ac.uk/GWAMA.

  2. A Genome-Wide Scan for Evidence of Selection in a Maize Population Under Long-Term Artificial Selection for Ear Number

    PubMed Central

    Beissinger, Timothy M.; Hirsch, Candice N.; Vaillancourt, Brieanne; Deshpande, Shweta; Barry, Kerrie; Buell, C. Robin; Kaeppler, Shawn M.; Gianola, Daniel; de Leon, Natalia

    2014-01-01

    A genome-wide scan to detect evidence of selection was conducted in the Golden Glow maize long-term selection population. The population had been subjected to selection for increased number of ears per plant for 30 generations, with an empirically estimated effective population size ranging from 384 to 667 individuals and an increase of more than threefold in the number of ears per plant. Allele frequencies at >1.2 million single-nucleotide polymorphism loci were estimated from pooled whole-genome resequencing data, and FST values across sliding windows were employed to assess divergence between the population preselection and the population postselection. Twenty-eight highly divergent regions were identified, with half of these regions providing gene-level resolution on potentially selected variants. Approximately 93% of the divergent regions do not demonstrate a significant decrease in heterozygosity, which suggests that they are not approaching fixation. Also, most regions display a pattern consistent with a soft-sweep model as opposed to a hard-sweep model, suggesting that selection mostly operated on standing genetic variation. For at least 25% of the regions, results suggest that selection operated on variants located outside of currently annotated coding regions. These results provide insights into the underlying genetic effects of long-term artificial selection and identification of putative genetic elements underlying number of ears per plant in maize. PMID:24381334

  3. Genome-wide variation within and between wild and domestic yak.

    PubMed

    Wang, Kun; Hu, Quanjun; Ma, Hui; Wang, Lizhong; Yang, Yongzhi; Luo, Wenchun; Qiu, Qiang

    2014-07-01

    The yak is one of the few animals that can thrive in the harsh environment of the Qinghai-Tibetan Plateau and adjacent Alpine regions. Yak provides essential resources allowing Tibetans to live at high altitudes. However, genetic variation within and between wild and domestic yak remain unknown. Here, we present a genome-wide study of the genetic variation within and between wild and domestic yak. Using next-generation sequencing technology, we resequenced three wild and three domestic yak with a mean of fivefold coverage using our published domestic yak genome as a reference. We identified a total of 8.38 million SNPs (7.14 million novel), 383,241 InDels and 126,352 structural variants between the six yak. We observed higher linkage disequilibrium in domestic yak than in wild yak and a modest but distinct genetic divergence between these two groups. We further identified more than a thousand of potential selected regions (PSRs) for the three domestic yak by scanning the whole genome. These genomic resources can be further used to study genetic diversity and select superior breeds of yak and other bovid species. © 2014 John Wiley & Sons Ltd.

  4. Genome-wide analysis of epistasis in body mass index using multiple human populations.

    PubMed

    Wei, Wen-Hua; Hemani, Gib; Gyenesei, Attila; Vitart, Veronique; Navarro, Pau; Hayward, Caroline; Cabrera, Claudia P; Huffman, Jennifer E; Knott, Sara A; Hicks, Andrew A; Rudan, Igor; Pramstaller, Peter P; Wild, Sarah H; Wilson, James F; Campbell, Harry; Hastie, Nicholas D; Wright, Alan F; Haley, Chris S

    2012-08-01

    We surveyed gene-gene interactions (epistasis) in human body mass index (BMI) in four European populations (n<1200) via exhaustive pair-wise genome scans where interactions were computed as F ratios by testing a linear regression model fitting two single-nucleotide polymorphisms (SNPs) with interactions against the one without. Before the association tests, BMI was corrected for sex and age, normalised and adjusted for relatedness. Neither single SNPs nor SNP interactions were genome-wide significant in either cohort based on the consensus threshold (P=5.0E-08) and a Bonferroni corrected threshold (P=1.1E-12), respectively. Next we compared sub genome-wide significant SNP interactions (P<5.0E-08) across cohorts to identify common epistatic signals, where SNPs were annotated to genes to test for gene ontology (GO) enrichment. Among the epistatic genes contributing to the commonly enriched GO terms, 19 were shared across study cohorts of which 15 are previously published genome-wide association loci, including CDH13 (cadherin 13) associated with height and SORCS2 (sortilin-related VPS10 domain containing receptor 2) associated with circulating insulin-like growth factor 1 and binding protein 3. Interactions between the 19 shared epistatic genes and those involving BMI candidate loci (P<5.0E-08) were tested across cohorts and found eight replicated at the SNP level (P<0.05) in at least one cohort, which were further tested and showed limited replication in a separate European population (n>5000). We conclude that genome-wide analysis of epistasis in multiple populations is an effective approach to provide new insights into the genetic regulation of BMI but requires additional efforts to confirm the findings.

  5. Genome-wide SNP scan in a porcine Large White×Minzhu intercross population reveals a locus influencing muscle mass on chromosome 2.

    PubMed

    Liu, Xin; Wang, Li Gang; Luo, Wei Zhen; Li, Yong; Liang, Jing; Yan, Hua; Zhao, Ke Bin; Wang, Li Xian; Zhang, Long Chao

    2014-12-01

    A high-density single nucleotide polymorphism (SNP) array containing 62 163 markers was employed for a genome-wide association study (GWAS) to identify variants associated with lean meat in ham (LMH, %) and lean meat percentage (LMP, %) within a porcine Large White×Minzhu intercross population. For each individual, LMH and LMP were measured after slaughter at the age of 240±7 days. A total of 557 F2 animals were genotyped. The GWAS revealed that 21 SNPs showed significant genome-wide or chromosome-wide associations with LMH and LMP by the Genome-wide Rapid Association using Mixed Model and Regression-Genomic Control approach. Nineteen significant genome-wide SNPs were mapped to the distal end of Sus Scrofa Chromosome (SSC) 2, where a major known gene responsible for muscle mass, IGF2 is located. A conditioned analysis, in which the genotype of the strongest associated SNP is included as a fixed effect in the model, showed that those significant SNPs on SSC2 were derived from a single quantitative trait locus. The two chromosome-wide association SNPs on SSC1 disappeared after conditioned analysis suggested the association signal is a false association derived from using a F2 population. The present result is expected to lead to novel insights into muscle mass in different pig breeds and lays a preliminary foundation for follow-up studies for identification of causal mutations for subsequent application in marker-assisted selection programs for improving muscle mass in pigs. © 2014 Japanese Society of Animal Science.

  6. A Genome-Wide Scan of Selective Sweeps and Association Mapping of Fruit Traits Using Microsatellite Markers in Watermelon

    PubMed Central

    Reddy, Umesh K.; Abburi, Lavanya; Abburi, Venkata Lakshmi; Saminathan, Thangasamy; Cantrell, Robert; Vajja, Venkata Gopinath; Reddy, Rishi; Tomason, Yan R.; Levi, Amnon; Wehner, Todd C.; Nimmakayala, Padma

    2015-01-01

    Our genetic diversity study uses microsatellites of known map position to estimate genome level population structure and linkage disequilibrium, and to identify genomic regions that have undergone selection during watermelon domestication and improvement. Thirty regions that showed evidence of selective sweep were scanned for the presence of candidate genes using the watermelon genome browser (www.icugi.org). We localized selective sweeps in intergenic regions, close to the promoters, and within the exons and introns of various genes. This study provided an evidence of convergent evolution for the presence of diverse ecotypes with special reference to American and European ecotypes. Our search for location of linked markers in the whole-genome draft sequence revealed that BVWS00358, a GA repeat microsatellite, is the GAGA type transcription factor located in the 5′ untranslated regions of a structure and insertion element that expresses a Cys2His2 Zinc finger motif, with presumed biological processes related to chitin response and transcriptional regulation. In addition, BVWS01708, an ATT repeat microsatellite, located in the promoter of a DTW domain-containing protein (Cla002761); and 2 other simple sequence repeats that association mapping link to fruit length and rind thickness. PMID:25425675

  7. Genome-Wide Association Studies with a Genomic Relationship Matrix: A Case Study with Wheat and Arabidopsis.

    PubMed

    Gianola, Daniel; Fariello, Maria I; Naya, Hugo; Schön, Chris-Carolin

    2016-10-13

    Standard genome-wide association studies (GWAS) scan for relationships between each of p molecular markers and a continuously distributed target trait. Typically, a marker-based matrix of genomic similarities among individuals ( G: ) is constructed, to account more properly for the covariance structure in the linear regression model used. We show that the generalized least-squares estimator of the regression of phenotype on one or on m markers is invariant with respect to whether or not the marker(s) tested is(are) used for building G,: provided variance components are unaffected by exclusion of such marker(s) from G: The result is arrived at by using a matrix expression such that one can find many inverses of genomic relationship, or of phenotypic covariance matrices, stemming from removing markers tested as fixed, but carrying out a single inversion. When eigenvectors of the genomic relationship matrix are used as regressors with fixed regression coefficients, e.g., to account for population stratification, their removal from G: does matter. Removal of eigenvectors from G: can have a noticeable effect on estimates of genomic and residual variances, so caution is needed. Concepts were illustrated using genomic data on 599 wheat inbred lines, with grain yield as target trait, and on close to 200 Arabidopsis thaliana accessions. Copyright © 2016 Gianola et al.

  8. Genome-wide signals of positive selection in human evolution

    PubMed Central

    Enard, David; Messer, Philipp W.; Petrov, Dmitri A.

    2014-01-01

    The role of positive selection in human evolution remains controversial. On the one hand, scans for positive selection have identified hundreds of candidate loci, and the genome-wide patterns of polymorphism show signatures consistent with frequent positive selection. On the other hand, recent studies have argued that many of the candidate loci are false positives and that most genome-wide signatures of adaptation are in fact due to reduction of neutral diversity by linked deleterious mutations, known as background selection. Here we analyze human polymorphism data from the 1000 Genomes Project and detect signatures of positive selection once we correct for the effects of background selection. We show that levels of neutral polymorphism are lower near amino acid substitutions, with the strongest reduction observed specifically near functionally consequential amino acid substitutions. Furthermore, amino acid substitutions are associated with signatures of recent adaptation that should not be generated by background selection, such as unusually long and frequent haplotypes and specific distortions in the site frequency spectrum. We use forward simulations to argue that the observed signatures require a high rate of strongly adaptive substitutions near amino acid changes. We further demonstrate that the observed signatures of positive selection correlate better with the presence of regulatory sequences, as predicted by the ENCODE Project Consortium, than with the positions of amino acid substitutions. Our results suggest that adaptation was frequent in human evolution and provide support for the hypothesis of King and Wilson that adaptive divergence is primarily driven by regulatory changes. PMID:24619126

  9. A Genome-Wide Linkage Scan for Age at Menarche in Three Populations of European Descent

    PubMed Central

    Anderson, Carl A.; Zhu, Gu; Falchi, Mario; van den Berg, Stéphanie M.; Treloar, Susan A.; Spector, Timothy D.; Martin, Nicholas G.; Boomsma, Dorret I.; Visscher, Peter M.; Montgomery, Grant W.

    2008-01-01

    Context: Age at menarche (AAM) is an important trait both biologically and socially, a clearly defined event in female pubertal development, and has been associated with many clinically significant phenotypes. Objective: The objective of the study was to identify genetic loci influencing variation in AAM in large population-based samples from three countries. Design/Participants: Recalled AAM data were collected from 13,697 individuals and 4,899 pseudoindependent sister-pairs from three different populations (Australia, The Netherlands, and the United Kingdom) by mailed questionnaire or interview. Genome-wide variance components linkage analysis was implemented on each sample individually and in combination. Results: The mean, sd, and heritability of AAM across the three samples was 13.1 yr, 1.5 yr, and 0.69, respectively. No loci were detected that reached genome-wide significance in the combined analysis, but a suggestive locus was detected on chromosome 12 (logarithm of the odds = 2.0). Three loci of suggestive significance were seen in the U.K. sample on chromosomes 1, 4, and 18 (logarithm of the odds = 2.4, 2.2 and 3.2, respectively). Conclusions: There was no evidence for common highly penetrant variants influencing AAM. Linkage and association suggest that one trait locus for AAM is located on chromosome 12, but further studies are required to replicate these results. PMID:18647812

  10. Development and application of a novel genome-wide SNP array reveals domestication history in soybean

    PubMed Central

    Wang, Jiao; Chu, Shanshan; Zhang, Huairen; Zhu, Ying; Cheng, Hao; Yu, Deyue

    2016-01-01

    Domestication of soybeans occurred under the intense human-directed selections aimed at developing high-yielding lines. Tracing the domestication history and identifying the genes underlying soybean domestication require further exploration. Here, we developed a high-throughput NJAU 355 K SoySNP array and used this array to study the genetic variation patterns in 367 soybean accessions, including 105 wild soybeans and 262 cultivated soybeans. The population genetic analysis suggests that cultivated soybeans have tended to originate from northern and central China, from where they spread to other regions, accompanied with a gradual increase in seed weight. Genome-wide scanning for evidence of artificial selection revealed signs of selective sweeps involving genes controlling domestication-related agronomic traits including seed weight. To further identify genomic regions related to seed weight, a genome-wide association study (GWAS) was conducted across multiple environments in wild and cultivated soybeans. As a result, a strong linkage disequilibrium region on chromosome 20 was found to be significantly correlated with seed weight in cultivated soybeans. Collectively, these findings should provide an important basis for genomic-enabled breeding and advance the study of functional genomics in soybean. PMID:26856884

  11. Development and application of a novel genome-wide SNP array reveals domestication history in soybean.

    PubMed

    Wang, Jiao; Chu, Shanshan; Zhang, Huairen; Zhu, Ying; Cheng, Hao; Yu, Deyue

    2016-02-09

    Domestication of soybeans occurred under the intense human-directed selections aimed at developing high-yielding lines. Tracing the domestication history and identifying the genes underlying soybean domestication require further exploration. Here, we developed a high-throughput NJAU 355 K SoySNP array and used this array to study the genetic variation patterns in 367 soybean accessions, including 105 wild soybeans and 262 cultivated soybeans. The population genetic analysis suggests that cultivated soybeans have tended to originate from northern and central China, from where they spread to other regions, accompanied with a gradual increase in seed weight. Genome-wide scanning for evidence of artificial selection revealed signs of selective sweeps involving genes controlling domestication-related agronomic traits including seed weight. To further identify genomic regions related to seed weight, a genome-wide association study (GWAS) was conducted across multiple environments in wild and cultivated soybeans. As a result, a strong linkage disequilibrium region on chromosome 20 was found to be significantly correlated with seed weight in cultivated soybeans. Collectively, these findings should provide an important basis for genomic-enabled breeding and advance the study of functional genomics in soybean.

  12. Extension of type 2 diabetes genome-wide association scan results in the diabetes prevention program.

    PubMed

    Moore, Allan F; Jablonski, Kathleen A; McAteer, Jarred B; Saxena, Richa; Pollin, Toni I; Franks, Paul W; Hanson, Robert L; Shuldiner, Alan R; Knowler, William C; Altshuler, David; Florez, Jose C

    2008-09-01

    Genome-wide association scans (GWASs) have identified novel diabetes-associated genes. We evaluated how these variants impact diabetes incidence, quantitative glycemic traits, and response to preventive interventions in 3,548 subjects at high risk of type 2 diabetes enrolled in the Diabetes Prevention Program (DPP), which examined the effects of lifestyle intervention, metformin, and troglitazone versus placebo. We genotyped selected single nucleotide polymorphisms (SNPs) in or near diabetes-associated loci, including EXT2, CDKAL1, CDKN2A/B, IGF2BP2, HHEX, LOC387761, and SLC30A8 in DPP participants and performed Cox regression analyses using genotype, intervention, and their interactions as predictors of diabetes incidence. We evaluated their effect on insulin resistance and secretion at 1 year. None of the selected SNPs were associated with increased diabetes incidence in this population. After adjustments for ethnicity, baseline insulin secretion was lower in subjects with the risk genotype at HHEX rs1111875 (P = 0.01); there were no significant differences in baseline insulin sensitivity. Both at baseline and at 1 year, subjects with the risk genotype at LOC387761 had paradoxically increased insulin secretion; adjustment for self-reported ethnicity abolished these differences. In ethnicity-adjusted analyses, we noted a nominal differential improvement in beta-cell function for carriers of the protective genotype at CDKN2A/B after 1 year of troglitazone treatment (P = 0.01) and possibly lifestyle modification (P = 0.05). We were unable to replicate the GWAS findings regarding diabetes risk in the DPP. We did observe genotype associations with differences in baseline insulin secretion at the HHEX locus and a possible pharmacogenetic interaction at CDKNA2/B.

  13. Parent-Of-Origin Effects in Autism Identified through Genome-Wide Linkage Analysis of 16,000 SNPs

    PubMed Central

    Fradin, Delphine; Cheslack-Postava, Keely; Ladd-Acosta, Christine; Newschaffer, Craig; Chakravarti, Aravinda; Arking, Dan E.; Feinberg, Andrew; Fallin, M. Daniele

    2010-01-01

    Background Autism is a common heritable neurodevelopmental disorder with complex etiology. Several genome-wide linkage and association scans have been carried out to identify regions harboring genes related to autism or autism spectrum disorders, with mixed results. Given the overlap in autism features with genetic abnormalities known to be associated with imprinting, one possible reason for lack of consistency would be the influence of parent-of-origin effects that may mask the ability to detect linkage and association. Methods and Findings We have performed a genome-wide linkage scan that accounts for potential parent-of-origin effects using 16,311 SNPs among families from the Autism Genetic Resource Exchange (AGRE) and the National Institute of Mental Health (NIMH) autism repository. We report parametric (GH, Genehunter) and allele-sharing linkage (Aspex) results using a broad spectrum disorder case definition. Paternal-origin genome-wide statistically significant linkage was observed on chromosomes 4 (LODGH = 3.79, empirical p<0.005 and LODAspex = 2.96, p = 0.008), 15 (LODGH = 3.09, empirical p<0.005 and LODAspex = 3.62, empirical p = 0.003) and 20 (LODGH = 3.36, empirical p<0.005 and LODAspex = 3.38, empirical p = 0.006). Conclusions These regions may harbor imprinted sites associated with the development of autism and offer fruitful domains for molecular investigation into the role of epigenetic mechanisms in autism. PMID:20824079

  14. Scanning the human genome at kilobase resolution.

    PubMed

    Chen, Jun; Kim, Yeong C; Jung, Yong-Chul; Xuan, Zhenyu; Dworkin, Geoff; Zhang, Yanming; Zhang, Michael Q; Wang, San Ming

    2008-05-01

    Normal genome variation and pathogenic genome alteration frequently affect small regions in the genome. Identifying those genomic changes remains a technical challenge. We report here the development of the DGS (Ditag Genome Scanning) technique for high-resolution analysis of genome structure. The basic features of DGS include (1) use of high-frequent restriction enzymes to fractionate the genome into small fragments; (2) collection of two tags from two ends of a given DNA fragment to form a ditag to represent the fragment; (3) application of the 454 sequencing system to reach a comprehensive ditag sequence collection; (4) determination of the genome origin of ditags by mapping to reference ditags from known genome sequences; (5) use of ditag sequences directly as the sense and antisense PCR primers to amplify the original DNA fragment. To study the relationship between ditags and genome structure, we performed a computational study by using the human genome reference sequences as a model, and analyzed the ditags experimentally collected from the well-characterized normal human DNA GM15510 and the leukemic human DNA of Kasumi-1 cells. Our studies show that DGS provides a kilobase resolution for studying genome structure with high specificity and high genome coverage. DGS can be applied to validate genome assembly, to compare genome similarity and variation in normal populations, and to identify genomic abnormality including insertion, inversion, deletion, translocation, and amplification in pathological genomes such as cancer genomes.

  15. A Discovery Genome-Wide Association Study of Entrepreneurship

    ERIC Educational Resources Information Center

    Quaye, Lydia; Nicolaou, Nicos; Shane, Scott; Mangino, Massimo

    2012-01-01

    To identify specific genetic variants influencing the phenotype of entrepreneurship, we conducted a genome-wide association study (GWAS) with 3,933 Caucasian females from the TwinsUK Adult Twin Registry. Following stringent genotype quality control, GWAF (genome-wide association analyses for family data) software was used to assess the association…

  16. SuperDCA for genome-wide epistasis analysis.

    PubMed

    Puranen, Santeri; Pesonen, Maiju; Pensar, Johan; Xu, Ying Ying; Lees, John A; Bentley, Stephen D; Croucher, Nicholas J; Corander, Jukka

    2018-05-29

    The potential for genome-wide modelling of epistasis has recently surfaced given the possibility of sequencing densely sampled populations and the emerging families of statistical interaction models. Direct coupling analysis (DCA) has previously been shown to yield valuable predictions for single protein structures, and has recently been extended to genome-wide analysis of bacteria, identifying novel interactions in the co-evolution between resistance, virulence and core genome elements. However, earlier computational DCA methods have not been scalable to enable model fitting simultaneously to 10 4 -10 5 polymorphisms, representing the amount of core genomic variation observed in analyses of many bacterial species. Here, we introduce a novel inference method (SuperDCA) that employs a new scoring principle, efficient parallelization, optimization and filtering on phylogenetic information to achieve scalability for up to 10 5 polymorphisms. Using two large population samples of Streptococcus pneumoniae, we demonstrate the ability of SuperDCA to make additional significant biological findings about this major human pathogen. We also show that our method can uncover signals of selection that are not detectable by genome-wide association analysis, even though our analysis does not require phenotypic measurements. SuperDCA, thus, holds considerable potential in building understanding about numerous organisms at a systems biological level.

  17. A note on generalized Genome Scan Meta-Analysis statistics

    PubMed Central

    Koziol, James A; Feng, Anne C

    2005-01-01

    Background Wise et al. introduced a rank-based statistical technique for meta-analysis of genome scans, the Genome Scan Meta-Analysis (GSMA) method. Levinson et al. recently described two generalizations of the GSMA statistic: (i) a weighted version of the GSMA statistic, so that different studies could be ascribed different weights for analysis; and (ii) an order statistic approach, reflecting the fact that a GSMA statistic can be computed for each chromosomal region or bin width across the various genome scan studies. Results We provide an Edgeworth approximation to the null distribution of the weighted GSMA statistic, and, we examine the limiting distribution of the GSMA statistics under the order statistic formulation, and quantify the relevance of the pairwise correlations of the GSMA statistics across different bins on this limiting distribution. We also remark on aggregate criteria and multiple testing for determining significance of GSMA results. Conclusion Theoretical considerations detailed herein can lead to clarification and simplification of testing criteria for generalizations of the GSMA statistic. PMID:15717930

  18. A genome-wide scan for selection signatures in Nelore cattle

    USDA-ARS?s Scientific Manuscript database

    Brazilian Nelore cattle have been selected for growth traits over more than four decades. In recent years, reproductive and meat quality traits have become more important because of increasing consumption, exports and consumer demand. The identification of genomic regions altered by artificial selec...

  19. A genome-wide scan for signatures of selection in Chinese indigenous and commercial pig breeds.

    PubMed

    Yang, Songbai; Li, Xiuling; Li, Kui; Fan, Bin; Tang, Zhonglin

    2014-01-15

    Modern breeding and artificial selection play critical roles in pig domestication and shape the genetic variation of different breeds. China has many indigenous pig breeds with various characteristics in morphology and production performance that differ from those of foreign commercial pig breeds. However, the signatures of selection on genes implying for economic traits between Chinese indigenous and commercial pigs have been poorly understood. We identified footprints of positive selection at the whole genome level, comprising 44,652 SNPs genotyped in six Chinese indigenous pig breeds, one developed breed and two commercial breeds. An empirical genome-wide distribution of Fst (F-statistics) was constructed based on estimations of Fst for each SNP across these nine breeds. We detected selection at the genome level using the High-Fst outlier method and found that 81 candidate genes show high evidence of positive selection. Furthermore, the results of network analyses showed that the genes that displayed evidence of positive selection were mainly involved in the development of tissues and organs, and the immune response. In addition, we calculated the pairwise Fst between Chinese indigenous and commercial breeds (CHN VS EURO) and between Northern and Southern Chinese indigenous breeds (Northern VS Southern). The IGF1R and ESR1 genes showed evidence of positive selection in the CHN VS EURO and Northern VS Southern groups, respectively. In this study, we first identified the genomic regions that showed evidences of selection between Chinese indigenous and commercial pig breeds using the High-Fst outlier method. These regions were found to be involved in the development of tissues and organs, the immune response, growth and litter size. The results of this study provide new insights into understanding the genetic variation and domestication in pigs.

  20. A genome-wide scan for signatures of selection in Chinese indigenous and commercial pig breeds

    PubMed Central

    2014-01-01

    Background Modern breeding and artificial selection play critical roles in pig domestication and shape the genetic variation of different breeds. China has many indigenous pig breeds with various characteristics in morphology and production performance that differ from those of foreign commercial pig breeds. However, the signatures of selection on genes implying for economic traits between Chinese indigenous and commercial pigs have been poorly understood. Results We identified footprints of positive selection at the whole genome level, comprising 44,652 SNPs genotyped in six Chinese indigenous pig breeds, one developed breed and two commercial breeds. An empirical genome-wide distribution of Fst (F-statistics) was constructed based on estimations of Fst for each SNP across these nine breeds. We detected selection at the genome level using the High-Fst outlier method and found that 81 candidate genes show high evidence of positive selection. Furthermore, the results of network analyses showed that the genes that displayed evidence of positive selection were mainly involved in the development of tissues and organs, and the immune response. In addition, we calculated the pairwise Fst between Chinese indigenous and commercial breeds (CHN VS EURO) and between Northern and Southern Chinese indigenous breeds (Northern VS Southern). The IGF1R and ESR1 genes showed evidence of positive selection in the CHN VS EURO and Northern VS Southern groups, respectively. Conclusions In this study, we first identified the genomic regions that showed evidences of selection between Chinese indigenous and commercial pig breeds using the High-Fst outlier method. These regions were found to be involved in the development of tissues and organs, the immune response, growth and litter size. The results of this study provide new insights into understanding the genetic variation and domestication in pigs. PMID:24422716

  1. Genome scan for nonadditive heterotic trait loci reveals mainly underdominant effects in Saccharomyces cerevisiae.

    PubMed

    Laiba, Efrat; Glikaite, Ilana; Levy, Yael; Pasternak, Zohar; Fridman, Eyal

    2016-04-01

    The overdominant model of heterosis explains the superior phenotype of hybrids by synergistic allelic interaction within heterozygous loci. To map such genetic variation in yeast, we used a population doubling time dataset of Saccharomyces cerevisiae 16 × 16 diallel and searched for major contributing heterotic trait loci (HTL). Heterosis was observed for the majority of hybrids, as they surpassed their best parent growth rate. However, most of the local heterozygous loci identified by genome scan were surprisingly underdominant, i.e., reduced growth. We speculated that in these loci adverse effects on growth resulted from incompatible allelic interactions. To test this assumption, we eliminated these allelic interactions by creating hybrids with local hemizygosity for the underdominant HTLs, as well as for control random loci. Growth of hybrids was indeed elevated for most hemizygous to HTL genes but not for control genes, hence validating the results of our genome scan. Assessing the consequences of local heterozygosity by reciprocal hemizygosity and allele replacement assays revealed the influence of genetic background on the underdominant effects of HTLs. Overall, this genome-wide study on a multi-parental hybrid population provides a strong argument against single gene overdominance as a major contributor to heterosis, and favors the dominance complementation model.

  2. Quality control and quality assurance in genotypic data for genome-wide association studies

    PubMed Central

    Laurie, Cathy C.; Doheny, Kimberly F.; Mirel, Daniel B.; Pugh, Elizabeth W.; Bierut, Laura J.; Bhangale, Tushar; Boehm, Frederick; Caporaso, Neil E.; Cornelis, Marilyn C.; Edenberg, Howard J.; Gabriel, Stacy B.; Harris, Emily L.; Hu, Frank B.; Jacobs, Kevin; Kraft, Peter; Landi, Maria Teresa; Lumley, Thomas; Manolio, Teri A.; McHugh, Caitlin; Painter, Ian; Paschall, Justin; Rice, John P.; Rice, Kenneth M.; Zheng, Xiuwen; Weir, Bruce S.

    2011-01-01

    Genome-wide scans of nucleotide variation in human subjects are providing an increasing number of replicated associations with complex disease traits. Most of the variants detected have small effects and, collectively, they account for a small fraction of the total genetic variance. Very large sample sizes are required to identify and validate findings. In this situation, even small sources of systematic or random error can cause spurious results or obscure real effects. The need for careful attention to data quality has been appreciated for some time in this field, and a number of strategies for quality control and quality assurance (QC/QA) have been developed. Here we extend these methods and describe a system of QC/QA for genotypic data in genome-wide association studies. This system includes some new approaches that (1) combine analysis of allelic probe intensities and called genotypes to distinguish gender misidentification from sex chromosome aberrations, (2) detect autosomal chromosome aberrations that may affect genotype calling accuracy, (3) infer DNA sample quality from relatedness and allelic intensities, (4) use duplicate concordance to infer SNP quality, (5) detect genotyping artifacts from dependence of Hardy-Weinberg equilibrium (HWE) test p-values on allelic frequency, and (6) demonstrate sensitivity of principal components analysis (PCA) to SNP selection. The methods are illustrated with examples from the ‘Gene Environment Association Studies’ (GENEVA) program. The results suggest several recommendations for QC/QA in the design and execution of genome-wide association studies. PMID:20718045

  3. Genetic Predisposition to Self-Curing Infection with the Protozoan Leishmania chagasi: A Genome Wide Scan

    PubMed Central

    Jeronimo, Selma M. B.; Duggal, Priya; Ettinger, Nicholas A.; Nascimento, Eliana T.; Monteiro, Gloria R.; Cabral, Angela P.; Pontes, Núbia N.; Lacerda, Hênio G.; Queiroz, Paula V.; Maia, Carlos G.; Pearson, Richard D.; Blackwell, Jenefer M.; Beaty, Terri H.; Wilson, Mary E.

    2008-01-01

    The protozoan Leishmania chagasi can cause disseminated, fatal visceral leishmaniasis (VL) or asymptomatic human infection. We hypothesized that genetic factors contribute to this variable response to infection. A family study was performed in endemic neighborhoods near Natal, northeast Brazil. Subjects were assessed for VL or asymptomatic infection, defined as a positive delayed type hypersensitivity (DTH) skin test response to Leishmania antigen without disease symptoms. A genome scan of 405 microsatellite markers in 1254 subjects was analyzed for regions of linkage. The results indicated loci of potential linkage to DTH response on chromosomes 2, 13, 15 and 19, and a novel region of potential interest for VL on chromosome 9. An understanding of the genetic factors determining whether an individual will develop symptomatic or asymptomatic infection with L. chagasi may illuminate proteins essential for immune protection against this parasitic disease; findings could reveal strategies for immunotherapy or prevention. PMID:17955446

  4. Genome-wide scan of gastrointestinal nematode resistance in closed Angus population selected for minimized influence of MHC.

    PubMed

    Kim, Eui-Soo; Sonstegard, Tad S; da Silva, Marcos V G B; Gasbarre, Louis C; Van Tassell, Curtis P

    2015-01-01

    Genetic markers associated with parasite indicator traits are ideal targets for study of marker assisted selection aimed at controlling infections that reduce herd use of anthelminthics. For this study, we collected gastrointestinal (GI) nematode fecal egg count (FEC) data from post-weaning animals of an Angus resource population challenged to a 26 week natural exposure on pasture. In all, data from 487 animals was collected over a 16 year period between 1992 and 2007, most of which were selected for a specific DRB1 allele to reduce the influence of potential allelic variant effects of the MHC locus. A genome-wide association study (GWAS) based on BovineSNP50 genotypes revealed six genomic regions located on bovine Chromosomes 3, 5, 8, 15 and 27; which were significantly associated (-log10 p=4.3) with Box-Cox transformed mean FEC (BC-MFEC). DAVID analysis of the genes within the significant genomic regions suggested a correlation between our results and annotation for genes involved in inflammatory response to infection. Furthermore, ROH and selection signature analyses provided strong evidence that the genomic regions associated BC-MFEC have not been affected by local autozygosity or recent experimental selection. These findings provide useful information for parasite resistance prediction for young grazing cattle and suggest new candidate gene targets for development of disease-modifying therapies or future studies of host response to GI parasite infection.

  5. Genome scans for divergent selection in natural populations of the widespread hardwood species Eucalyptus grandis (Myrtaceae) using microsatellites

    PubMed Central

    Song, Zhijiao; Zhang, Miaomiao; Li, Fagen; Weng, Qijie; Zhou, Chanpin; Li, Mei; Li, Jie; Huang, Huanhua; Mo, Xiaoyong; Gan, Siming

    2016-01-01

    Identification of loci or genes under natural selection is important for both understanding the genetic basis of local adaptation and practical applications, and genome scans provide a powerful means for such identification purposes. In this study, genome-wide simple sequence repeats markers (SSRs) were used to scan for molecular footprints of divergent selection in Eucalyptus grandis, a hardwood species occurring widely in costal areas from 32° S to 16° S in Australia. High population diversity levels and weak population structure were detected with putatively neutral genomic SSRs. Using three FST outlier detection methods, a total of 58 outlying SSRs were collectively identified as loci under divergent selection against three non-correlated climatic variables, namely, mean annual temperature, isothermality and annual precipitation. Using a spatial analysis method, nine significant associations were revealed between FST outlier allele frequencies and climatic variables, involving seven alleles from five SSR loci. Of the five significant SSRs, two (EUCeSSR1044 and Embra394) contained alleles of putative genes with known functional importance for response to climatic factors. Our study presents critical information on the population diversity and structure of the important woody species E. grandis and provides insight into the adaptive responses of perennial trees to climatic variations. PMID:27748400

  6. Genome wide approaches to identify protein-DNA interactions.

    PubMed

    Ma, Tao; Ye, Zhenqing; Wang, Liguo

    2018-05-29

    Transcription factors are DNA-binding proteins that play key roles in many fundamental biological processes. Unraveling their interactions with DNA is essential to identify their target genes and understand the regulatory network. Genome-wide identification of their binding sites became feasible thanks to recent progress in experimental and computational approaches. ChIP-chip, ChIP-seq, and ChIP-exo are three widely used techniques to demarcate genome-wide transcription factor binding sites. This review aims to provide an overview of these three techniques including their experiment procedures, computational approaches, and popular analytic tools. ChIP-chip, ChIP-seq, and ChIP-exo have been the major techniques to study genome-wide in vivo protein-DNA interaction. Due to the rapid development of next-generation sequencing technology, array-based ChIP-chip is deprecated and ChIP-seq has become the most widely used technique to identify transcription factor binding sites in genome-wide. The newly developed ChIP-exo further improves the spatial resolution to single nucleotide. Numerous tools have been developed to analyze ChIP-chip, ChIP-seq and ChIP-exo data. However, different programs may employ different mechanisms or underlying algorithms thus each will inherently include its own set of statistical assumption and bias. So choosing the most appropriate analytic program for a given experiment needs careful considerations. Moreover, most programs only have command line interface so their installation and usage will require basic computation expertise in Unix/Linux. Copyright© Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.

  7. Significance of genome-wide association studies in molecular anthropology.

    PubMed

    Gupta, Vipin; Khadgawat, Rajesh; Sachdeva, Mohinder Pal

    2009-12-01

    The successful advent of a genome-wide approach in association studies raises the hopes of human geneticists for solving a genetic maze of complex traits especially the disorders. This approach, which is replete with the application of cutting-edge technology and supported by big science projects (like Human Genome Project; and even more importantly the International HapMap Project) and various important databases (SNP database, CNV database, etc.), has had unprecedented success in rapidly uncovering many of the genetic determinants of complex disorders. The magnitude of this approach in the genetics of classical anthropological variables like height, skin color, eye color, and other genome diversity projects has certainly expanded the horizons of molecular anthropology. Therefore, in this article we have proposed a genome-wide association approach in molecular anthropological studies by providing lessons from the exemplary study of the Wellcome Trust Case Control Consortium. We have also highlighted the importance and uniqueness of Indian population groups in facilitating the design and finding optimum solutions for other genome-wide association-related challenges.

  8. Genome-Wide Association of the Laboratory-Based Nicotine Metabolite Ratio in Three Ancestries

    PubMed Central

    Baurley, James W.; Edlund, Christopher K.; Pardamean, Carissa I.; Conti, David V.; Krasnow, Ruth; Javitz, Harold S.; Hops, Hyman; Swan, Gary E.; Benowitz, Neal L.

    2016-01-01

    Introduction: Metabolic enzyme variation and other patient and environmental characteristics influence smoking behaviors, treatment success, and risk of related disease. Population-specific variation in metabolic genes contributes to challenges in developing and optimizing pharmacogenetic interventions. We applied a custom genome-wide genotyping array for addiction research (Smokescreen), to three laboratory-based studies of nicotine metabolism with oral or venous administration of labeled nicotine and cotinine, to model nicotine metabolism in multiple populations. The trans-3′-hydroxycotinine/cotinine ratio, the nicotine metabolite ratio (NMR), was the nicotine metabolism measure analyzed. Methods: Three hundred twelve individuals of self-identified European, African, and Asian American ancestry were genotyped and included in ancestry-specific genome-wide association scans (GWAS) and a meta-GWAS analysis of the NMR. We modeled natural-log transformed NMR with covariates: principal components of genetic ancestry, age, sex, body mass index, and smoking status. Results: African and Asian American NMRs were statistically significantly (P values ≤ 5E-5) lower than European American NMRs. Meta-GWAS analysis identified 36 genome-wide significant variants over a 43 kilobase pair region at CYP2A6 with minimum P = 2.46E-18 at rs12459249, proximal to CYP2A6. Additional minima were located in intron 4 (rs56113850, P = 6.61E-18) and in the CYP2A6-CYP2A7 intergenic region (rs34226463, P = 1.45E-12). Most (34/36) genome-wide significant variants suggested reduced CYP2A6 activity; functional mechanisms were identified and tested in knowledge-bases. Conditional analysis resulted in intergenic variants of possible interest (P values < 5E-5). Conclusions: This meta-GWAS of the NMR identifies CYP2A6 variants, replicates the top-ranked single nucleotide polymorphism from a recent Finnish meta-GWAS of the NMR, identifies functional mechanisms, and provides pan

  9. Genome-wide screening and identification of antigens for rickettsial vaccine development

    USDA-ARS?s Scientific Manuscript database

    The capacity to identify immunogens for vaccine development by genome-wide screening has been markedly enhanced by the availability of complete microbial genome sequences coupled to rapid proteomic and bioinformatic analysis. Critical to this genome-wide screening is in vivo testing in the context o...

  10. A genome-wide association study in soybean

    USDA-ARS?s Scientific Manuscript database

    A genome-wide association study (GWAS) was performed to estimate the feasibility of identifying genes controlling the quantitative traits, seed protein and oil concentration, in 298 soybean germplasm accessions exhibiting a wide range of seed protein and oil content. A total of 55,159 single nucleo...

  11. Educational Attainment: A Genome Wide Association Study in 9538 Australians

    PubMed Central

    Martin, Nicolas W.; Medland, Sarah E.; Verweij, Karin J. H.; Lee, S. Hong; Nyholt, Dale R.; Madden, Pamela A.; Heath, Andrew C.; Montgomery, Grant W.; Wright, Margaret J.; Martin, Nicholas G.

    2011-01-01

    Background Correlations between Educational Attainment (EA) and measures of cognitive performance are as high as 0.8. This makes EA an attractive alternative phenotype for studies wishing to map genes affecting cognition due to the ease of collecting EA data compared to other cognitive phenotypes such as IQ. Methodology In an Australian family sample of 9538 individuals we performed a genome-wide association scan (GWAS) using the imputed genotypes of ∼2.4 million single nucleotide polymorphisms (SNP) for a 6-point scale measure of EA. Top hits were checked for replication in an independent sample of 968 individuals. A gene-based test of association was then applied to the GWAS results. Additionally we performed prediction analyses using the GWAS results from our discovery sample to assess the percentage of EA and full scale IQ variance explained by the predicted scores. Results The best SNP fell short of having a genome-wide significant p-value (p = 9.77×10−7). In our independent replication sample six SNPs among the top 50 hits pruned for linkage disequilibrium (r2<0.8) had a p-value<0.05 but only one of these SNPs survived correction for multiple testing - rs7106258 (p = 9.7*10−4) located in an intergenic region of chromosome 11q14.1. The gene based test results were non-significant and our prediction analyses show that the predicted scores explained little variance in EA in our replication sample. Conclusion While we have identified a polymorphism chromosome 11q14.1 associated with EA, further replication is warranted. Overall, the absence of genome-wide significant p-values in our large discovery sample confirmed the high polygenic architecture of EA. Only the assembly of large samples or meta-analytic efforts will be able to assess the implication of common DNA polymorphisms in the etiology of EA. PMID:21694764

  12. Genome-wide high-throughput SNP discovery and genotyping for understanding natural (functional) allelic diversity and domestication patterns in wild chickpea

    PubMed Central

    Bajaj, Deepak; Das, Shouvik; Badoni, Saurabh; Kumar, Vinod; Singh, Mohar; Bansal, Kailash C.; Tyagi, Akhilesh K.; Parida, Swarup K.

    2015-01-01

    We identified 82489 high-quality genome-wide SNPs from 93 wild and cultivated Cicer accessions through integrated reference genome- and de novo-based GBS assays. High intra- and inter-specific polymorphic potential (66–85%) and broader natural allelic diversity (6–64%) detected by genome-wide SNPs among accessions signify their efficacy for monitoring introgression and transferring target trait-regulating genomic (gene) regions/allelic variants from wild to cultivated Cicer gene pools for genetic improvement. The population-specific assignment of wild Cicer accessions pertaining to the primary gene pool are more influenced by geographical origin/phenotypic characteristics than species/gene-pools of origination. The functional significance of allelic variants (non-synonymous and regulatory SNPs) scanned from transcription factors and stress-responsive genes in differentiating wild accessions (with potential known sources of yield-contributing and stress tolerance traits) from cultivated desi and kabuli accessions, fine-mapping/map-based cloning of QTLs and determination of LD patterns across wild and cultivated gene-pools are suitably elucidated. The correlation between phenotypic (agromorphological traits) and molecular diversity-based admixed domestication patterns within six structured populations of wild and cultivated accessions via genome-wide SNPs was apparent. This suggests utility of whole genome SNPs as a potential resource for identifying naturally selected trait-regulating genomic targets/functional allelic variants adaptive to diverse agroclimatic regions for genetic enhancement of cultivated gene-pools. PMID:26208313

  13. Plasma levels of sIL-2Rα: associations with clinical cardiovascular events and genome- wide association scan

    PubMed Central

    Durda, Peter; Sabourin, Jeremy; Lange, Ethan M.; Nalls, Mike A.; Mychaleckyj, Josyf C.; Jenny, Nancy Swords; Li, Jin; Walston, Jeremy; Harris, Tamara B.; Psaty, Bruce M.; Valdar, William; Liu, Yongmei; Cushman, Mary; Reiner, Alex P.; Tracy, Russell P.; Lange, Leslie A.

    2017-01-01

    Objective Interleukin-2 receptor subunit alpha (IL-2Rα) regulates lymphocyte activation, which plays an important role in atherosclerosis. Associations between soluble IL-2Rα and cardiovascular disease (CVD) have not been widely studied and little is known about the genetic determinants of sIL-2Rα levels. Approach and Results We measured baseline levels of sIL- 2Rα in 4408 European-American (EA) and 766 African-American (AA) adults from the Cardiovascular Health Study (CHS) and examined associations with baseline CVD risk factors, subclinical CVD and incident CVD events. We also performed a genome-wide association study (GWAS) for sIL-2Rα in CHS (2964 EAs and 683 AAs) and further combined CHS EA results with those from two other EA cohorts in a meta-analysis (N=4464 EAs). In age, sex- and race- adjusted models, sIL-2Rα was positively associated with current smoking, type 2 diabetes, hypertension, insulin, waist circumference, C-reactive protein, interleukin-6, fibrinogen, internal carotid wall thickness, all-cause mortality, CVD mortality, and incident CVD, stroke and heart failure. When adjusted for baseline CVD risk factors and subclinical CVD, associations with all- cause mortality, CVD mortality and heart failure remained significant in both EAs and AAs. In the EA GWAS analysis, we observed 52 single nucleotide polymorphisms (SNPs) in the chromosome 10p15-14 region, which contains IL2RA, IL15RA and RMB17, that reached genome-wide significance (p<5×10-8). The most significant SNP was rs7911500 (p=1.31×10-75). The EA meta-analysis results were highly consistent with CHS-only results. No SNPs reached statistical significance in the AAs. Conclusions These results support a role for sIL-2Rα in atherosclerosis and provide evidence for multiple associated SNPs at chromosome 10p15-14. PMID:26293465

  14. Genome-Wide Detection and Analysis of Multifunctional Genes

    PubMed Central

    Pritykin, Yuri; Ghersi, Dario; Singh, Mona

    2015-01-01

    Many genes can play a role in multiple biological processes or molecular functions. Identifying multifunctional genes at the genome-wide level and studying their properties can shed light upon the complexity of molecular events that underpin cellular functioning, thereby leading to a better understanding of the functional landscape of the cell. However, to date, genome-wide analysis of multifunctional genes (and the proteins they encode) has been limited. Here we introduce a computational approach that uses known functional annotations to extract genes playing a role in at least two distinct biological processes. We leverage functional genomics data sets for three organisms—H. sapiens, D. melanogaster, and S. cerevisiae—and show that, as compared to other annotated genes, genes involved in multiple biological processes possess distinct physicochemical properties, are more broadly expressed, tend to be more central in protein interaction networks, tend to be more evolutionarily conserved, and are more likely to be essential. We also find that multifunctional genes are significantly more likely to be involved in human disorders. These same features also hold when multifunctionality is defined with respect to molecular functions instead of biological processes. Our analysis uncovers key features about multifunctional genes, and is a step towards a better genome-wide understanding of gene multifunctionality. PMID:26436655

  15. Genome Wide Analysis of Fertility and Production Traits in Italian Holstein Cattle

    PubMed Central

    Stella, Alessandra; Biffani, Stefano; Negrini, Riccardo; Lazzari, Barbara; Ajmone-Marsan, Paolo; Williams, John L .

    2013-01-01

    A genome wide scan was performed on a total of 2093 Italian Holstein proven bulls genotyped with 50K single nucleotide polymorphisms (SNPs), with the objective of identifying loci associated with fertility related traits and to test their effects on milk production traits. The analysis was carried out using estimated breeding values for the aggregate fertility index and for each trait contributing to the index: angularity, calving interval, non-return rate at 56 days, days to first service, and 305 day first parity lactation. In addition, two production traits not included in the aggregate fertility index were analysed: fat yield and protein yield. Analyses were carried out using all SNPs treated separately, further the most significant marker on BTA14 associated to milk quality located in the DGAT1 region was treated as fixed effect. Genome wide association analysis identified 61 significant SNPs and 75 significant marker-trait associations. Eight additional SNP associations were detected when SNP located near DGAT1 was included as a fixed effect. As there were no obvious common SNPs between the traits analyzed independently in this study, a network analysis was carried out to identify unforeseen relationships that may link production and fertility traits. PMID:24265800

  16. Genome-Wide Association Study (GWAS) and Genome-Wide Environment Interaction Study (GWEIS) of Depressive Symptoms in African American and Hispanic/Latina Women

    PubMed Central

    Dunn, Erin C.; Wiste, Anna; Radmanesh, Farid; Almli, Lynn M.; Gogarten, Stephanie M.; Sofer, Tamar; Faul, Jessica D.; Kardia, Sharon L.R.; Smith, Jennifer A.; Weir, David R.; Zhao, Wei; Soare, Thomas W.; Mirza, Saira S.; Hek, Karin; Tiemeier, Henning W.; Goveas, Joseph S.; Sarto, Gloria E.; Snively, Beverly M.; Cornelis, Marilyn; Koenen, Karestan C.; Kraft, Peter; Purcell, Shaun; Ressler, Kerry J.; Rosand, Jonathan; Wassertheil-Smoller, Sylvia; Smoller, Jordan W.

    2016-01-01

    Background Genome-wide association studies (GWAS) have been unable to identify variants linked to depression. We hypothesized that examining depressive symptoms and considering gene-environment interaction (G×E) might improve efficiency for gene discovery. We therefore conducted a GWAS and genome-wide environment interaction study (GWEIS) of depressive symptoms. Methods Using data from the SHARe cohort of the Women’s Health Initiative, comprising African Americans (n=7179) and Hispanics/Latinas (n=3138), we examined genetic main effects and G×E with stressful life events and social support. We also conducted a heritability analysis using genome-wide complex trait analysis (GCTA). Replication was attempted in four independent cohorts. Results No SNPs achieved genome-wide significance for main effects in either discovery sample. The top signals in African Americans were rs73531535 (located 20kb from GPR139, p=5.75×10−8) and rs75407252 (intronic to CACNA2D3, p=6.99×10−7). In Hispanics/Latinas, the top signals were rs2532087 (located 27kb from CD38, p=2.44×10−7) and rs4542757 (intronic to DCC, p=7.31×10−7). In the GWEIS with stressful life events, one interaction signal was genome-wide significant in African Americans (rs4652467; p=4.10×10−10; located 14kb from CEP350). This interaction was not observed in a smaller replication cohort. Although heritability estimates for depressive symptoms and stressful life events were each less than 10%, they were strongly genetically correlated (rG=0.95), suggesting that common variation underlying depressive symptoms and stressful life event exposure, though modest on their own, were highly overlapping in this sample. Conclusions Our results underscore the need for larger samples, more GWEIS, and greater investigation into genetic and environmental determinants of depressive symptoms in minorities. PMID:27038408

  17. Genome-Wide Association of the Laboratory-Based Nicotine Metabolite Ratio in Three Ancestries.

    PubMed

    Baurley, James W; Edlund, Christopher K; Pardamean, Carissa I; Conti, David V; Krasnow, Ruth; Javitz, Harold S; Hops, Hyman; Swan, Gary E; Benowitz, Neal L; Bergen, Andrew W

    2016-09-01

    Metabolic enzyme variation and other patient and environmental characteristics influence smoking behaviors, treatment success, and risk of related disease. Population-specific variation in metabolic genes contributes to challenges in developing and optimizing pharmacogenetic interventions. We applied a custom genome-wide genotyping array for addiction research (Smokescreen), to three laboratory-based studies of nicotine metabolism with oral or venous administration of labeled nicotine and cotinine, to model nicotine metabolism in multiple populations. The trans-3'-hydroxycotinine/cotinine ratio, the nicotine metabolite ratio (NMR), was the nicotine metabolism measure analyzed. Three hundred twelve individuals of self-identified European, African, and Asian American ancestry were genotyped and included in ancestry-specific genome-wide association scans (GWAS) and a meta-GWAS analysis of the NMR. We modeled natural-log transformed NMR with covariates: principal components of genetic ancestry, age, sex, body mass index, and smoking status. African and Asian American NMRs were statistically significantly (P values ≤ 5E-5) lower than European American NMRs. Meta-GWAS analysis identified 36 genome-wide significant variants over a 43 kilobase pair region at CYP2A6 with minimum P = 2.46E-18 at rs12459249, proximal to CYP2A6. Additional minima were located in intron 4 (rs56113850, P = 6.61E-18) and in the CYP2A6-CYP2A7 intergenic region (rs34226463, P = 1.45E-12). Most (34/36) genome-wide significant variants suggested reduced CYP2A6 activity; functional mechanisms were identified and tested in knowledge-bases. Conditional analysis resulted in intergenic variants of possible interest (P values < 5E-5). This meta-GWAS of the NMR identifies CYP2A6 variants, replicates the top-ranked single nucleotide polymorphism from a recent Finnish meta-GWAS of the NMR, identifies functional mechanisms, and provides pan-continental population biomarkers for nicotine metabolism. This

  18. Genome-wide single nucleotide polymorphism scan suggests adaptation to urbanization in an important pollinator, the red-tailed bumblebee (Bombus lapidarius L.).

    PubMed

    Theodorou, Panagiotis; Radzevičiūtė, Rita; Kahnt, Belinda; Soro, Antonella; Grosse, Ivo; Paxton, Robert J

    2018-04-25

    Urbanization is considered a global threat to biodiversity; the growth of cities results in an increase in impervious surfaces, soil and air pollution, fragmentation of natural vegetation and invasion of non-native species, along with numerous environmental changes, including the heat island phenomenon. The combination of these effects constitutes a challenge for both the survival and persistence of many native species, while also imposing altered selective regimes. Here, using 110 314 single nucleotide polymorphisms generated by restriction-site-associated DNA sequencing, we investigated the genome-wide effects of urbanization on putative neutral and adaptive genomic diversity in a major insect pollinator, Bombus lapidarius , collected from nine German cities and nine paired rural sites. Overall, genetic differentiation among sites was low and there was no obvious genome-wide genetic structuring, suggesting the absence of strong effects of urbanization on gene flow. We nevertheless identified several loci under directional selection, a subset of which was associated with urban land use, including the percentage of impervious surface surrounding each sampling site. Overall, our results provide evidence of local adaptation to urbanization in the face of gene flow in a highly mobile insect pollinator. © 2018 The Author(s).

  19. COPS: Detecting Co-Occurrence and Spatial Arrangement of Transcription Factor Binding Motifs in Genome-Wide Datasets

    PubMed Central

    Lohmann, Ingrid

    2012-01-01

    In multi-cellular organisms, spatiotemporal activity of cis-regulatory DNA elements depends on their occupancy by different transcription factors (TFs). In recent years, genome-wide ChIP-on-Chip, ChIP-Seq and DamID assays have been extensively used to unravel the combinatorial interaction of TFs with cis-regulatory modules (CRMs) in the genome. Even though genome-wide binding profiles are increasingly becoming available for different TFs, single TF binding profiles are in most cases not sufficient for dissecting complex regulatory networks. Thus, potent computational tools detecting statistically significant and biologically relevant TF-motif co-occurrences in genome-wide datasets are essential for analyzing context-dependent transcriptional regulation. We have developed COPS (Co-Occurrence Pattern Search), a new bioinformatics tool based on a combination of association rules and Markov chain models, which detects co-occurring TF binding sites (BSs) on genomic regions of interest. COPS scans DNA sequences for frequent motif patterns using a Frequent-Pattern tree based data mining approach, which allows efficient performance of the software with respect to both data structure and implementation speed, in particular when mining large datasets. Since transcriptional gene regulation very often relies on the formation of regulatory protein complexes mediated by closely adjoining TF binding sites on CRMs, COPS additionally detects preferred short distance between co-occurring TF motifs. The performance of our software with respect to biological significance was evaluated using three published datasets containing genomic regions that are independently bound by several TFs involved in a defined biological process. In sum, COPS is a fast, efficient and user-friendly tool mining statistically and biologically significant TFBS co-occurrences and therefore allows the identification of TFs that combinatorially regulate gene expression. PMID:23272209

  20. Layers of epistasis: genome-wide regulatory networks and network approaches to genome-wide association studies.

    PubMed

    Cowper-Sal lari, Richard; Cole, Michael D; Karagas, Margaret R; Lupien, Mathieu; Moore, Jason H

    2011-01-01

    The conceptual foundation of the genome-wide association study (GWAS) has advanced unchecked since its conception. A revision might seem premature as the potential of GWAS has not been fully realized. Multiple technical and practical limitations need to be overcome before GWAS can be fairly criticized. But with the completion of hundreds of studies and a deeper understanding of the genetic architecture of disease, warnings are being raised. The results compiled to date indicate that risk-associated variants lie predominantly in noncoding regions of the genome. Additionally, alternative methodologies are uncovering large and heterogeneous sets of rare variants underlying disease. The fear is that, even in its fulfillment, the current GWAS paradigm might be incapable of dissecting all kinds of phenotypes. In the following text, we review several initiatives that aim to overcome these limitations. The overarching theme of these studies is the inclusion of biological knowledge to both the analysis and interpretation of genotyping data. GWAS is uninformed of biology by design and although there is some virtue in its simplicity, it is also its most conspicuous deficiency. We propose a framework in which to integrate these novel approaches, both empirical and theoretical, in the form of a genome-wide regulatory network (GWRN). By processing experimental data into networks, emerging data types based on chromatin immunoprecipitation are made computationally tractable. This will give GWAS re-analysis efforts the most current and relevant substrates, and root them firmly on our knowledge of human disease. Copyright © 2010 John Wiley & Sons, Inc.

  1. Landscape genomics reveals altered genome wide diversity within revegetated stands of Eucalyptus microcarpa (Grey Box).

    PubMed

    Jordan, Rebecca; Dillon, Shannon K; Prober, Suzanne M; Hoffmann, Ary A

    2016-12-01

    In order to contribute to evolutionary resilience and adaptive potential in highly modified landscapes, revegetated areas should ideally reflect levels of genetic diversity within and across natural stands. Landscape genomic analyses enable such diversity patterns to be characterized at genome and chromosomal levels. Landscape-wide patterns of genomic diversity were assessed in Eucalyptus microcarpa, a dominant tree species widely used in revegetation in Southeastern Australia. Trees from small and large patches within large remnants, small isolated remnants and revegetation sites were assessed across the now highly fragmented distribution of this species using the DArTseq genomic approach. Genomic diversity was similar within all three types of remnant patches analysed, although often significantly but only slightly lower in revegetation sites compared with natural remnants. Differences in diversity between stand types varied across chromosomes. Genomic differentiation was higher between small, isolated remnants, and among revegetated sites compared with natural stands. We conclude that small remnants and revegetated sites of our E. microcarpa samples largely but not completely capture patterns in genomic diversity across the landscape. Genomic approaches provide a powerful tool for assessing restoration efforts across the landscape. © 2016 The Authors. New Phytologist © 2016 New Phytologist Trust.

  2. Genome-wide association scan in women with systemic lupus erythematosus identifies susceptibility variants in ITGAM, PXK, KIAA1542 and other loci.

    PubMed

    Harley, John B; Alarcón-Riquelme, Marta E; Criswell, Lindsey A; Jacob, Chaim O; Kimberly, Robert P; Moser, Kathy L; Tsao, Betty P; Vyse, Timothy J; Langefeld, Carl D; Nath, Swapan K; Guthridge, Joel M; Cobb, Beth L; Mirel, Daniel B; Marion, Miranda C; Williams, Adrienne H; Divers, Jasmin; Wang, Wei; Frank, Summer G; Namjou, Bahram; Gabriel, Stacey B; Lee, Annette T; Gregersen, Peter K; Behrens, Timothy W; Taylor, Kimberly E; Fernando, Michelle; Zidovetzki, Raphael; Gaffney, Patrick M; Edberg, Jeffrey C; Rioux, John D; Ojwang, Joshua O; James, Judith A; Merrill, Joan T; Gilkeson, Gary S; Seldin, Michael F; Yin, Hong; Baechler, Emily C; Li, Quan-Zhen; Wakeland, Edward K; Bruner, Gail R; Kaufman, Kenneth M; Kelly, Jennifer A

    2008-02-01

    Systemic lupus erythematosus (SLE) is a common systemic autoimmune disease with complex etiology but strong clustering in families (lambda(S) = approximately 30). We performed a genome-wide association scan using 317,501 SNPs in 720 women of European ancestry with SLE and in 2,337 controls, and we genotyped consistently associated SNPs in two additional independent sample sets totaling 1,846 affected women and 1,825 controls. Aside from the expected strong association between SLE and the HLA region on chromosome 6p21 and the previously confirmed non-HLA locus IRF5 on chromosome 7q32, we found evidence of association with replication (1.1 x 10(-7) < P(overall) < 1.6 x 10(-23); odds ratio = 0.82-1.62) in four regions: 16p11.2 (ITGAM), 11p15.5 (KIAA1542), 3p14.3 (PXK) and 1q25.1 (rs10798269). We also found evidence for association (P < 1 x 10(-5)) at FCGR2A, PTPN22 and STAT4, regions previously associated with SLE and other autoimmune diseases, as well as at > or =9 other loci (P < 2 x 10(-7)). Our results show that numerous genes, some with known immune-related functions, predispose to SLE.

  3. GENOME-WIDE ASSOCIATION STUDY (GWAS) AND GENOME-WIDE BY ENVIRONMENT INTERACTION STUDY (GWEIS) OF DEPRESSIVE SYMPTOMS IN AFRICAN AMERICAN AND HISPANIC/LATINA WOMEN.

    PubMed

    Dunn, Erin C; Wiste, Anna; Radmanesh, Farid; Almli, Lynn M; Gogarten, Stephanie M; Sofer, Tamar; Faul, Jessica D; Kardia, Sharon L R; Smith, Jennifer A; Weir, David R; Zhao, Wei; Soare, Thomas W; Mirza, Saira S; Hek, Karin; Tiemeier, Henning; Goveas, Joseph S; Sarto, Gloria E; Snively, Beverly M; Cornelis, Marilyn; Koenen, Karestan C; Kraft, Peter; Purcell, Shaun; Ressler, Kerry J; Rosand, Jonathan; Wassertheil-Smoller, Sylvia; Smoller, Jordan W

    2016-04-01

    Genome-wide association studies (GWAS) have made little progress in identifying variants linked to depression. We hypothesized that examining depressive symptoms and considering gene-environment interaction (GxE) might improve efficiency for gene discovery. We therefore conducted a GWAS and genome-wide by environment interaction study (GWEIS) of depressive symptoms. Using data from the SHARe cohort of the Women's Health Initiative, comprising African Americans (n = 7,179) and Hispanics/Latinas (n = 3,138), we examined genetic main effects and GxE with stressful life events and social support. We also conducted a heritability analysis using genome-wide complex trait analysis (GCTA). Replication was attempted in four independent cohorts. No SNPs achieved genome-wide significance for main effects in either discovery sample. The top signals in African Americans were rs73531535 (located 20 kb from GPR139, P = 5.75 × 10(-8) ) and rs75407252 (intronic to CACNA2D3, P = 6.99 × 10(-7) ). In Hispanics/Latinas, the top signals were rs2532087 (located 27 kb from CD38, P = 2.44 × 10(-7) ) and rs4542757 (intronic to DCC, P = 7.31 × 10(-7) ). In the GEWIS with stressful life events, one interaction signal was genome-wide significant in African Americans (rs4652467; P = 4.10 × 10(-10) ; located 14 kb from CEP350). This interaction was not observed in a smaller replication cohort. Although heritability estimates for depressive symptoms and stressful life events were each less than 10%, they were strongly genetically correlated (rG = 0.95), suggesting that common variation underlying self-reported depressive symptoms and stressful life event exposure, though modest on their own, were highly overlapping in this sample. Our results underscore the need for larger samples, more GEWIS, and greater investigation into genetic and environmental determinants of depressive symptoms in minorities. © 2016 Wiley Periodicals, Inc.

  4. FGWAS: Functional genome wide association analysis.

    PubMed

    Huang, Chao; Thompson, Paul; Wang, Yalin; Yu, Yang; Zhang, Jingwen; Kong, Dehan; Colen, Rivka R; Knickmeyer, Rebecca C; Zhu, Hongtu

    2017-10-01

    Functional phenotypes (e.g., subcortical surface representation), which commonly arise in imaging genetic studies, have been used to detect putative genes for complexly inherited neuropsychiatric and neurodegenerative disorders. However, existing statistical methods largely ignore the functional features (e.g., functional smoothness and correlation). The aim of this paper is to develop a functional genome-wide association analysis (FGWAS) framework to efficiently carry out whole-genome analyses of functional phenotypes. FGWAS consists of three components: a multivariate varying coefficient model, a global sure independence screening procedure, and a test procedure. Compared with the standard multivariate regression model, the multivariate varying coefficient model explicitly models the functional features of functional phenotypes through the integration of smooth coefficient functions and functional principal component analysis. Statistically, compared with existing methods for genome-wide association studies (GWAS), FGWAS can substantially boost the detection power for discovering important genetic variants influencing brain structure and function. Simulation studies show that FGWAS outperforms existing GWAS methods for searching sparse signals in an extremely large search space, while controlling for the family-wise error rate. We have successfully applied FGWAS to large-scale analysis of data from the Alzheimer's Disease Neuroimaging Initiative for 708 subjects, 30,000 vertices on the left and right hippocampal surfaces, and 501,584 SNPs. Copyright © 2017 Elsevier Inc. All rights reserved.

  5. Meta-analysis of genome-wide association from genomic prediction models

    USDA-ARS?s Scientific Manuscript database

    A limitation of many genome-wide association studies (GWA) in animal breeding is that there are many loci with small effect sizes; thus, larger sample sizes (N) are required to guarantee suitable power of detection. To increase sample size, results from different GWA can be combined in a meta-analys...

  6. PEPIS: A Pipeline for Estimating Epistatic Effects in Quantitative Trait Locus Mapping and Genome-Wide Association Studies.

    PubMed

    Zhang, Wenchao; Dai, Xinbin; Wang, Qishan; Xu, Shizhong; Zhao, Patrick X

    2016-05-01

    The term epistasis refers to interactions between multiple genetic loci. Genetic epistasis is important in regulating biological function and is considered to explain part of the 'missing heritability,' which involves marginal genetic effects that cannot be accounted for in genome-wide association studies. Thus, the study of epistasis is of great interest to geneticists. However, estimating epistatic effects for quantitative traits is challenging due to the large number of interaction effects that must be estimated, thus significantly increasing computing demands. Here, we present a new web server-based tool, the Pipeline for estimating EPIStatic genetic effects (PEPIS), for analyzing polygenic epistatic effects. The PEPIS software package is based on a new linear mixed model that has been used to predict the performance of hybrid rice. The PEPIS includes two main sub-pipelines: the first for kinship matrix calculation, and the second for polygenic component analyses and genome scanning for main and epistatic effects. To accommodate the demand for high-performance computation, the PEPIS utilizes C/C++ for mathematical matrix computing. In addition, the modules for kinship matrix calculations and main and epistatic-effect genome scanning employ parallel computing technology that effectively utilizes multiple computer nodes across our networked cluster, thus significantly improving the computational speed. For example, when analyzing the same immortalized F2 rice population genotypic data examined in a previous study, the PEPIS returned identical results at each analysis step with the original prototype R code, but the computational time was reduced from more than one month to about five minutes. These advances will help overcome the bottleneck frequently encountered in genome wide epistatic genetic effect analysis and enable accommodation of the high computational demand. The PEPIS is publically available at http://bioinfo.noble.org/PolyGenic_QTL/.

  7. Genome-Wide Association Study of the Genetic Determinants of Emphysema Distribution

    PubMed Central

    Boueiz, Adel; Lutz, Sharon M.; Cho, Michael H.; Hersh, Craig P.; Bowler, Russell P.; Washko, George R.; Halper-Stromberg, Eitan; Bakke, Per; Gulsvik, Amund; Laird, Nan M.; Beaty, Terri H.; Coxson, Harvey O.; Crapo, James D.; Silverman, Edwin K.; Castaldi, Peter J.

    2017-01-01

    Rationale: Emphysema has considerable variability in the severity and distribution of parenchymal destruction throughout the lungs. Upper lobe–predominant emphysema has emerged as an important predictor of response to lung volume reduction surgery. Yet, aside from alpha-1 antitrypsin deficiency, the genetic determinants of emphysema distribution remain largely unknown. Objectives: To identify the genetic influences of emphysema distribution in non–alpha-1 antitrypsin–deficient smokers. Methods: A total of 11,532 subjects with complete genotype and computed tomography densitometry data in the COPDGene (Genetic Epidemiology of Chronic Obstructive Pulmonary Disease [COPD]; non-Hispanic white and African American), ECLIPSE (Evaluation of COPD Longitudinally to Identify Predictive Surrogate Endpoints), and GenKOLS (Genetics of Chronic Obstructive Lung Disease) studies were analyzed. Two computed tomography scan emphysema distribution measures (difference between upper-third and lower-third emphysema; ratio of upper-third to lower-third emphysema) were tested for genetic associations in all study subjects. Separate analyses in each study population were followed by a fixed effect metaanalysis. Single-nucleotide polymorphism–, gene-, and pathway-based approaches were used. In silico functional evaluation was also performed. Measurements and Main Results: We identified five loci associated with emphysema distribution at genome-wide significance. These loci included two previously reported associations with COPD susceptibility (4q31 near HHIP and 15q25 near CHRNA5) and three new associations near SOWAHB, TRAPPC9, and KIAA1462. Gene set analysis and in silico functional evaluation revealed pathways and cell types that may potentially contribute to the pathogenesis of emphysema distribution. Conclusions: This multicohort genome-wide association study identified new genomic loci associated with differential emphysematous destruction throughout the lungs. These findings

  8. Genome-Wide Association Study of the Genetic Determinants of Emphysema Distribution.

    PubMed

    Boueiz, Adel; Lutz, Sharon M; Cho, Michael H; Hersh, Craig P; Bowler, Russell P; Washko, George R; Halper-Stromberg, Eitan; Bakke, Per; Gulsvik, Amund; Laird, Nan M; Beaty, Terri H; Coxson, Harvey O; Crapo, James D; Silverman, Edwin K; Castaldi, Peter J; DeMeo, Dawn L

    2017-03-15

    Emphysema has considerable variability in the severity and distribution of parenchymal destruction throughout the lungs. Upper lobe-predominant emphysema has emerged as an important predictor of response to lung volume reduction surgery. Yet, aside from alpha-1 antitrypsin deficiency, the genetic determinants of emphysema distribution remain largely unknown. To identify the genetic influences of emphysema distribution in non-alpha-1 antitrypsin-deficient smokers. A total of 11,532 subjects with complete genotype and computed tomography densitometry data in the COPDGene (Genetic Epidemiology of Chronic Obstructive Pulmonary Disease [COPD]; non-Hispanic white and African American), ECLIPSE (Evaluation of COPD Longitudinally to Identify Predictive Surrogate Endpoints), and GenKOLS (Genetics of Chronic Obstructive Lung Disease) studies were analyzed. Two computed tomography scan emphysema distribution measures (difference between upper-third and lower-third emphysema; ratio of upper-third to lower-third emphysema) were tested for genetic associations in all study subjects. Separate analyses in each study population were followed by a fixed effect metaanalysis. Single-nucleotide polymorphism-, gene-, and pathway-based approaches were used. In silico functional evaluation was also performed. We identified five loci associated with emphysema distribution at genome-wide significance. These loci included two previously reported associations with COPD susceptibility (4q31 near HHIP and 15q25 near CHRNA5) and three new associations near SOWAHB, TRAPPC9, and KIAA1462. Gene set analysis and in silico functional evaluation revealed pathways and cell types that may potentially contribute to the pathogenesis of emphysema distribution. This multicohort genome-wide association study identified new genomic loci associated with differential emphysematous destruction throughout the lungs. These findings may point to new biologic pathways on which to expand diagnostic and therapeutic

  9. Fast and Accurate Approximation to Significance Tests in Genome-Wide Association Studies

    PubMed Central

    Zhang, Yu; Liu, Jun S.

    2011-01-01

    Genome-wide association studies commonly involve simultaneous tests of millions of single nucleotide polymorphisms (SNP) for disease association. The SNPs in nearby genomic regions, however, are often highly correlated due to linkage disequilibrium (LD, a genetic term for correlation). Simple Bonferonni correction for multiple comparisons is therefore too conservative. Permutation tests, which are often employed in practice, are both computationally expensive for genome-wide studies and limited in their scopes. We present an accurate and computationally efficient method, based on Poisson de-clumping heuristics, for approximating genome-wide significance of SNP associations. Compared with permutation tests and other multiple comparison adjustment approaches, our method computes the most accurate and robust p-value adjustments for millions of correlated comparisons within seconds. We demonstrate analytically that the accuracy and the efficiency of our method are nearly independent of the sample size, the number of SNPs, and the scale of p-values to be adjusted. In addition, our method can be easily adopted to estimate false discovery rate. When applied to genome-wide SNP datasets, we observed highly variable p-value adjustment results evaluated from different genomic regions. The variation in adjustments along the genome, however, are well conserved between the European and the African populations. The p-value adjustments are significantly correlated with LD among SNPs, recombination rates, and SNP densities. Given the large variability of sequence features in the genome, we further discuss a novel approach of using SNP-specific (local) thresholds to detect genome-wide significant associations. This article has supplementary material online. PMID:22140288

  10. Pervasive, Genome-Wide Transcription in the Organelle Genomes of Diverse Plastid-Bearing Protists.

    PubMed

    Sanitá Lima, Matheus; Smith, David Roy

    2017-11-06

    Organelle genomes are among the most sequenced kinds of chromosome. This is largely because they are small and widely used in molecular studies, but also because next-generation sequencing technologies made sequencing easier, faster, and cheaper. However, studies of organelle RNA have not kept pace with those of DNA, despite huge amounts of freely available eukaryotic RNA-sequencing (RNA-seq) data. Little is known about organelle transcription in nonmodel species, and most of the available eukaryotic RNA-seq data have not been mined for organelle transcripts. Here, we use publicly available RNA-seq experiments to investigate organelle transcription in 30 diverse plastid-bearing protists with varying organelle genomic architectures. Mapping RNA-seq data to organelle genomes revealed pervasive, genome-wide transcription, regardless of the taxonomic grouping, gene organization, or noncoding content. For every species analyzed, transcripts covered ≥85% of the mitochondrial and/or plastid genomes (all of which were ≤105 kb), indicating that most of the organelle DNA-coding and noncoding-is transcriptionally active. These results follow earlier studies of model species showing that organellar transcription is coupled and ubiquitous across the genome, requiring significant downstream processing of polycistronic transcripts. Our findings suggest that noncoding organelle DNA can be transcriptionally active, raising questions about the underlying function of these transcripts and underscoring the utility of publicly available RNA-seq data for recovering complete genome sequences. If pervasive transcription is also found in bigger organelle genomes (>105 kb) and across a broader range of eukaryotes, this could indicate that noncoding organelle RNAs are regulating fundamental processes within eukaryotic cells. Copyright © 2017 Sanitá Lima and Smith.

  11. A genome-wide scan study identifies a single nucleotide substitution in ASIP associated with white versus non-white coat-colour variation in sheep (Ovis aries)

    PubMed Central

    Li, M-H; Tiirikka, T; Kantanen, J

    2014-01-01

    In sheep, coat colour (and pattern) is one of the important traits of great biological, economic and social importance. However, the genetics of sheep coat colour has not yet been fully clarified. We conducted a genome-wide association study of sheep coat colours by genotyping 47 303 single-nucleotide polymorphisms (SNPs) in the Finnsheep population in Finland. We identified 35 SNPs associated with all the coat colours studied, which cover genomic regions encompassing three known pigmentation genes (TYRP1, ASIP and MITF) in sheep. Eighteen of these associations were confirmed in further tests between white versus non-white individuals, but none of the 35 associations were significant in the analysis of only non-white colours. Across the tests, the s66432.1 in ASIP showed significant association (P=4.2 × 10−11 for all the colours; P=2.3 × 10−11 for white versus non-white colours) with the variation in coat colours and strong linkage disequilibrium with other significant variants surrounding the ASIP gene. The signals detected around the ASIP gene were explained by differences in white versus non-white alleles. Further, a genome scan for selection for white coat pigmentation identified a strong and striking selection signal spanning ASIP. Our study identified the main candidate gene for the coat colour variation between white and non-white as ASIP, an autosomal gene that has been directly implicated in the pathway regulating melanogenesis. Together with ASIP, the two other newly identified genes (TYRP1 and MITF) in the Finnsheep, bordering associated SNPs, represent a new resource for enriching sheep coat-colour genetics and breeding. PMID:24022497

  12. In Search of Genes Associated with Risk for Psychopathic Tendencies in Children: A Two-Stage Genome-Wide Association Study of Pooled DNA

    ERIC Educational Resources Information Center

    Viding, Essi; Hanscombe, Ken B.; Curtis, Charles J. C.; Davis, Oliver S. P.; Meaburn, Emma L.; Plomin, Robert

    2010-01-01

    Background: Quantitative genetic data from our group indicates that antisocial behaviour (AB) is strongly heritable when coupled with psychopathic, callous-unemotional (CU) personality traits. We have also demonstrated that the genetic influences for AB and CU overlap considerably. We conducted a genome-wide association scan that capitalises on…

  13. Genome-wide and fine-resolution association analysis of malaria in West Africa.

    PubMed

    Jallow, Muminatou; Teo, Yik Ying; Small, Kerrin S; Rockett, Kirk A; Deloukas, Panos; Clark, Taane G; Kivinen, Katja; Bojang, Kalifa A; Conway, David J; Pinder, Margaret; Sirugo, Giorgio; Sisay-Joof, Fatou; Usen, Stanley; Auburn, Sarah; Bumpstead, Suzannah J; Campino, Susana; Coffey, Alison; Dunham, Andrew; Fry, Andrew E; Green, Angela; Gwilliam, Rhian; Hunt, Sarah E; Inouye, Michael; Jeffreys, Anna E; Mendy, Alieu; Palotie, Aarno; Potter, Simon; Ragoussis, Jiannis; Rogers, Jane; Rowlands, Kate; Somaskantharajah, Elilan; Whittaker, Pamela; Widden, Claire; Donnelly, Peter; Howie, Bryan; Marchini, Jonathan; Morris, Andrew; SanJoaquin, Miguel; Achidi, Eric Akum; Agbenyega, Tsiri; Allen, Angela; Amodu, Olukemi; Corran, Patrick; Djimde, Abdoulaye; Dolo, Amagana; Doumbo, Ogobara K; Drakeley, Chris; Dunstan, Sarah; Evans, Jennifer; Farrar, Jeremy; Fernando, Deepika; Hien, Tran Tinh; Horstmann, Rolf D; Ibrahim, Muntaser; Karunaweera, Nadira; Kokwaro, Gilbert; Koram, Kwadwo A; Lemnge, Martha; Makani, Julie; Marsh, Kevin; Michon, Pascal; Modiano, David; Molyneux, Malcolm E; Mueller, Ivo; Parker, Michael; Peshu, Norbert; Plowe, Christopher V; Puijalon, Odile; Reeder, John; Reyburn, Hugh; Riley, Eleanor M; Sakuntabhai, Anavaj; Singhasivanon, Pratap; Sirima, Sodiomon; Tall, Adama; Taylor, Terrie E; Thera, Mahamadou; Troye-Blomberg, Marita; Williams, Thomas N; Wilson, Michael; Kwiatkowski, Dominic P

    2009-06-01

    We report a genome-wide association (GWA) study of severe malaria in The Gambia. The initial GWA scan included 2,500 children genotyped on the Affymetrix 500K GeneChip, and a replication study included 3,400 children. We used this to examine the performance of GWA methods in Africa. We found considerable population stratification, and also that signals of association at known malaria resistance loci were greatly attenuated owing to weak linkage disequilibrium (LD). To investigate possible solutions to the problem of low LD, we focused on the HbS locus, sequencing this region of the genome in 62 Gambian individuals and then using these data to conduct multipoint imputation in the GWA samples. This increased the signal of association, from P = 4 × 10(-7) to P = 4 × 10(-14), with the peak of the signal located precisely at the HbS causal variant. Our findings provide proof of principle that fine-resolution multipoint imputation, based on population-specific sequencing data, can substantially boost authentic GWA signals and enable fine mapping of causal variants in African populations.

  14. Gigwa-Genotype investigator for genome-wide analyses.

    PubMed

    Sempéré, Guilhem; Philippe, Florian; Dereeper, Alexis; Ruiz, Manuel; Sarah, Gautier; Larmande, Pierre

    2016-06-06

    Exploring the structure of genomes and analyzing their evolution is essential to understanding the ecological adaptation of organisms. However, with the large amounts of data being produced by next-generation sequencing, computational challenges arise in terms of storage, search, sharing, analysis and visualization. This is particularly true with regards to studies of genomic variation, which are currently lacking scalable and user-friendly data exploration solutions. Here we present Gigwa, a web-based tool that provides an easy and intuitive way to explore large amounts of genotyping data by filtering it not only on the basis of variant features, including functional annotations, but also on genotype patterns. The data storage relies on MongoDB, which offers good scalability properties. Gigwa can handle multiple databases and may be deployed in either single- or multi-user mode. In addition, it provides a wide range of popular export formats. The Gigwa application is suitable for managing large amounts of genomic variation data. Its user-friendly web interface makes such processing widely accessible. It can either be simply deployed on a workstation or be used to provide a shared data portal for a given community of researchers.

  15. Genome-wide association studies and resting heart rate.

    PubMed

    Kilpeläinen, Tuomas O

    Genome-wide association studies (GWASs) have revolutionized the search for genetic variants regulating resting heart rate. In the last 10years, GWASs have led to the identification of at least 21 novel heart rate loci. These discoveries have provided valuable insights into the mechanisms and pathways that regulate heart rate and link heart rate to cardiovascular morbidity and mortality. GWASs capture majority of genetic variation in a population sample by utilizing high-throughput genotyping chips measuring genotypes for up to several millions of SNPs across the genome in thousands of individuals. This allows the identification of the strongest heart rate associated signals at genome-wide level. While GWASs provide robust statistical evidence of the association of a given genetic locus with heart rate, they are only the starting point for detailed follow-up studies to locate the causal variants and genes and gain further insights into the biological mechanisms underlying the observed associations. Copyright © 2016 Elsevier Inc. All rights reserved.

  16. Case-Control Genome-Wide Association Study of Attention-Deficit/Hyperactivity Disorder

    ERIC Educational Resources Information Center

    Neale, Benjamin M.; Medland, Sarah; Ripke, Stephan; Anney, Richard J. L.; Asherson, Philip; Buitelaar, Jan; Franke, Barbara; Gill, Michael; Kent, Lindsey; Holmans, Peter; Middleton, Frank; Thapar, Anita; Lesch, Klaus-Peter; Faraone, Stephen V.; Daly, Mark; Nguyen, Thuy Trang; Schafer, Helmut; Steinhausen, Hans-Christoph; Reif, Andreas; Renner, Tobias J.; Romanos, Marcel; Romanos, Jasmin; Warnke, Andreas; Walitza, Susanne; Freitag, Christine; Meyer, Jobst; Palmason, Haukur; Rothenberger, Aribert; Hawi, Ziarih; Sergeant, Joseph; Roeyers, Herbert; Mick, Eric; Biederman, Joseph

    2010-01-01

    Objective: Although twin and family studies have shown attention-deficit/hyperactivity disorder (ADHD) to be highly heritable, genetic variants influencing the trait at a genome-wide significant level have yet to be identified. Thus additional genome-wide association studies (GWAS) are needed. Method: We used case-control analyses of 896 cases…

  17. snpGeneSets: An R Package for Genome-Wide Study Annotation

    PubMed Central

    Mei, Hao; Li, Lianna; Jiang, Fan; Simino, Jeannette; Griswold, Michael; Mosley, Thomas; Liu, Shijian

    2016-01-01

    Genome-wide studies (GWS) of SNP associations and differential gene expressions have generated abundant results; next-generation sequencing technology has further boosted the number of variants and genes identified. Effective interpretation requires massive annotation and downstream analysis of these genome-wide results, a computationally challenging task. We developed the snpGeneSets package to simplify annotation and analysis of GWS results. Our package integrates local copies of knowledge bases for SNPs, genes, and gene sets, and implements wrapper functions in the R language to enable transparent access to low-level databases for efficient annotation of large genomic data. The package contains functions that execute three types of annotations: (1) genomic mapping annotation for SNPs and genes and functional annotation for gene sets; (2) bidirectional mapping between SNPs and genes, and genes and gene sets; and (3) calculation of gene effect measures from SNP associations and performance of gene set enrichment analyses to identify functional pathways. We applied snpGeneSets to type 2 diabetes (T2D) results from the NHGRI genome-wide association study (GWAS) catalog, a Finnish GWAS, and a genome-wide expression study (GWES). These studies demonstrate the usefulness of snpGeneSets for annotating and performing enrichment analysis of GWS results. The package is open-source, free, and can be downloaded at: https://www.umc.edu/biostats_software/. PMID:27807048

  18. Genome-wide identification of runs of homozygosity islands and associated genes in local dairy cattle breeds.

    PubMed

    Mastrangelo, S; Sardina, M T; Tolone, M; Di Gerlando, R; Sutera, A M; Fontanesi, L; Portolano, B

    2018-03-26

    Runs of homozygosity (ROH) are widely used as predictors of whole-genome inbreeding levels in cattle. They identify regions that have an unfavorable effect on a phenotype when homozygous, but also identify the genes associated with traits of economic interest present in these regions. Here, the distribution of ROH islands and enriched genes within these regions in four dairy cattle breeds were investigated. Cinisara (71), Modicana (72), Reggiana (168) and Italian Holstein (96) individuals were genotyped using the 50K v2 Illumina BeadChip. The genomic regions most commonly associated with ROHs were identified by selecting the top 1% of the single nucleotide polymorphisms (SNPs) most commonly observed in the ROH of each breed. In total, 11 genomic regions were identified in Cinisara and Italian Holstein, and eight in Modicana and Reggiana, indicating an increased ROH frequency level. Generally, ROH islands differed between breeds. The most homozygous region (>45% of individuals with ROH) was found in Modicana on chromosome 6 within a quantitative trail locus affecting milk fat and protein concentrations. We identified between 126 and 347 genes within ROH islands, which are involved in multiple signaling and signal transduction pathways in a wide variety of biological processes. The gene ontology enrichment provided information on possible molecular functions, biological processes and cellular components under selection related to milk production, reproduction, immune response and resistance/susceptibility to infection and diseases. Thus, scanning the genome for ROH could be an alternative strategy to detect genomic regions and genes related to important economic traits.

  19. A genome-wide search for linkage of estimated glomerular filtration rate (eGFR) in the Family Investigation of Nephropathy and Diabetes (FIND).

    PubMed

    Thameem, Farook; Igo, Robert P; Freedman, Barry I; Langefeld, Carl; Hanson, Robert L; Schelling, Jeffrey R; Elston, Robert C; Duggirala, Ravindranath; Nicholas, Susanne B; Goddard, Katrina A B; Divers, Jasmin; Guo, Xiuqing; Ipp, Eli; Kimmel, Paul L; Meoni, Lucy A; Shah, Vallabh O; Smith, Michael W; Winkler, Cheryl A; Zager, Philip G; Knowler, William C; Nelson, Robert G; Pahl, Madeline V; Parekh, Rulan S; Kao, W H Linda; Rasooly, Rebekah S; Adler, Sharon G; Abboud, Hanna E; Iyengar, Sudha K; Sedor, John R

    2013-01-01

    Estimated glomerular filtration rate (eGFR), a measure of kidney function, is heritable, suggesting that genes influence renal function. Genes that influence eGFR have been identified through genome-wide association studies. However, family-based linkage approaches may identify loci that explain a larger proportion of the heritability. This study used genome-wide linkage and association scans to identify quantitative trait loci (QTL) that influence eGFR. Genome-wide linkage and sparse association scans of eGFR were performed in families ascertained by probands with advanced diabetic nephropathy (DN) from the multi-ethnic Family Investigation of Nephropathy and Diabetes (FIND) study. This study included 954 African Americans (AA), 781 American Indians (AI), 614 European Americans (EA) and 1,611 Mexican Americans (MA). A total of 3,960 FIND participants were genotyped for 6,000 single nucleotide polymorphisms (SNPs) using the Illumina Linkage IVb panel. GFR was estimated by the Modification of Diet in Renal Disease (MDRD) formula. The non-parametric linkage analysis, accounting for the effects of diabetes duration and BMI, identified the strongest evidence for linkage of eGFR on chromosome 20q11 (log of the odds [LOD] = 3.34; P = 4.4 × 10(-5)) in MA and chromosome 15q12 (LOD = 2.84; P = 1.5 × 10(-4)) in EA. In all subjects, the strongest linkage signal for eGFR was detected on chromosome 10p12 (P = 5.5 × 10(-4)) at 44 cM near marker rs1339048. A subsequent association scan in both ancestry-specific groups and the entire population identified several SNPs significantly associated with eGFR across the genome. The present study describes the localization of QTL influencing eGFR on 20q11 in MA, 15q21 in EA and 10p12 in the combined ethnic groups participating in the FIND study. Identification of causal genes/variants influencing eGFR, within these linkage and association loci, will open new avenues for functional analyses and development of novel diagnostic markers

  20. Non-random mate choice in humans: insights from a genome scan.

    PubMed

    Laurent, R; Toupance, B; Chaix, R

    2012-02-01

    Little is known about the genetic factors influencing mate choice in humans. Still, there is evidence for non-random mate choice with respect to physical traits. In addition, some studies suggest that the Major Histocompatibility Complex may affect pair formation. Nowadays, the availability of high density genomic data sets gives the opportunity to scan the genome for signatures of non-random mate choice without prior assumptions on which genes may be involved, while taking into account socio-demographic factors. Here, we performed a genome scan to detect extreme patterns of similarity or dissimilarity among spouses throughout the genome in three populations of African, European American, and Mexican origins from the HapMap 3 database. Our analyses identified genes and biological functions that may affect pair formation in humans, including genes involved in skin appearance, morphogenesis, immunity and behaviour. We found little overlap between the three populations, suggesting that the biological functions potentially influencing mate choice are population specific, in other words are culturally driven. Moreover, whenever the same functional category of genes showed a significant signal in two populations, different genes were actually involved, which suggests the possibility of evolutionary convergences. © 2011 Blackwell Publishing Ltd.

  1. A genome-wide scan identifies variants in NFIB associated with metastasis in patients with osteosarcoma

    PubMed Central

    Mirabello, Lisa; Koster, Roelof; Moriarity, Branden S.; Spector, Logan G.; Meltzer, Paul S.; Gary, Joy; Machiela, Mitchell J.; Pankratz, Nathan; Panagiotou, Orestis A.; Largaespada, David; Wang, Zhaoming; Gastier-Foster, Julie M.; Gorlick, Richard; Khanna, Chand; de Toledo, Silvia Regina Caminada; Petrilli, Antonio S.; Patiño-Garcia, Ana; Sierrasesúmaga, Luis; Lecanda, Fernando; Andrulis, Irene L.; Wunder, Jay S.; Gokgoz, Nalan; Serra, Massimo; Hattinger, Claudia; Picci, Piero; Scotlandi, Katia; Flanagan, Adrienne M.; Tirabosco, Roberto; Amary, Maria Fernanda; Halai, Dina; Ballinger, Mandy L.; Thomas, David M.; Davis, Sean; Barkauskas, Donald A.; Marina, Neyssa; Helman, Lee; Otto, George M.; Becklin, Kelsie L.; Wolf, Natalie K.; Weg, Madison T.; Tucker, Margaret; Wacholder, Sholom; Fraumeni, Joseph F.; Caporaso, Neil E.; Boland, Joseph F.; Hicks, Belynda D.; Vogt, Aurelie; Burdett, Laurie; Yeager, Meredith; Hoover, Robert N.; Chanock, Stephen J.; Savage, Sharon A.

    2015-01-01

    Metastasis is the leading cause of death in osteosarcoma patients, the most common pediatric bone malignancy. We conducted a multi-stage genome-wide association study of osteosarcoma metastasis at diagnosis in 935 osteosarcoma patients to determine whether germline genetic variation contributes to risk of metastasis. We identified a SNP, rs7034162, in NFIB significantly associated with metastasis in European osteosarcoma cases, as well as in cases of African and Brazilian ancestry (meta-analysis of all cases: P=1.2×10−9, OR 2.43, 95% CI 1.83–3.24). The risk allele was significantly associated with lowered NFIB expression, which led to increased osteosarcoma cell migration, proliferation, and colony formation. Additionally, a transposon screen in mice identified a significant proportion of osteosarcomas harboring inactivating insertions in Nfib, and had lowered Nfib expression. These data suggest that germline genetic variation at rs7034162 is important in osteosarcoma metastasis, and that NFIB is an osteosarcoma metastasis susceptibility gene. PMID:26084801

  2. Genome-wide comparative analysis of four Indian Drosophila species.

    PubMed

    Mohanty, Sujata; Khanna, Radhika

    2017-12-01

    Comparative analysis of multiple genomes of closely or distantly related Drosophila species undoubtedly creates excitement among evolutionary biologists in exploring the genomic changes with an ecology and evolutionary perspective. We present herewith the de novo assembled whole genome sequences of four Drosophila species, D. bipectinata, D. takahashii, D. biarmipes and D. nasuta of Indian origin using Next Generation Sequencing technology on an Illumina platform along with their detailed assembly statistics. The comparative genomics analysis, e.g. gene predictions and annotations, functional and orthogroup analysis of coding sequences and genome wide SNP distribution were performed. The whole genome of Zaprionus indianus of Indian origin published earlier by us and the genome sequences of previously sequenced 12 Drosophila species available in the NCBI database were included in the analysis. The present work is a part of our ongoing genomics project of Indian Drosophila species.

  3. Genome-wide association studies and epigenome-wide association studies go together in cancer control

    PubMed Central

    Verma, Mukesh

    2016-01-01

    Completion of the human genome a decade ago laid the foundation for: using genetic information in assessing risk to identify individuals and populations that are likely to develop cancer, and designing treatments based on a person's genetic profiling (precision medicine). Genome-wide association studies (GWAS) completed during the past few years have identified risk-associated single nucleotide polymorphisms that can be used as screening tools in epidemiologic studies of a variety of tumor types. This led to the conduct of epigenome-wide association studies (EWAS). This article discusses the current status, challenges and research opportunities in GWAS and EWAS. Information gained from GWAS and EWAS has potential applications in cancer control and treatment. PMID:27079684

  4. Microfluidics for genome-wide studies involving next generation sequencing

    PubMed Central

    Murphy, Travis W.; Lu, Chang

    2017-01-01

    Next-generation sequencing (NGS) has revolutionized how molecular biology studies are conducted. Its decreasing cost and increasing throughput permit profiling of genomic, transcriptomic, and epigenomic features for a wide range of applications. Microfluidics has been proven to be highly complementary to NGS technology with its unique capabilities for handling small volumes of samples and providing platforms for automation, integration, and multiplexing. In this article, we review recent progress on applying microfluidics to facilitate genome-wide studies. We emphasize on several technical aspects of NGS and how they benefit from coupling with microfluidic technology. We also summarize recent efforts on developing microfluidic technology for genomic, transcriptomic, and epigenomic studies, with emphasis on single cell analysis. We envision rapid growth in these directions, driven by the needs for testing scarce primary cell samples from patients in the context of precision medicine. PMID:28396707

  5. Genome-Wide Association Scan Meta-Analysis Identifies Three Loci Influencing Adiposity and Fat Distribution

    PubMed Central

    Qi, Lu; Speliotes, Elizabeth K.; Thorleifsson, Gudmar; Willer, Cristen J.; Herrera, Blanca M.; Jackson, Anne U.; Lim, Noha; Scheet, Paul; Soranzo, Nicole; Amin, Najaf; Aulchenko, Yurii S.; Chambers, John C.; Drong, Alexander; Luan, Jian'an; Lyon, Helen N.; Rivadeneira, Fernando; Sanna, Serena; Timpson, Nicholas J.; Zillikens, M. Carola; Zhao, Jing Hua; Almgren, Peter; Bandinelli, Stefania; Bennett, Amanda J.; Bergman, Richard N.; Bonnycastle, Lori L.; Bumpstead, Suzannah J.; Chanock, Stephen J.; Cherkas, Lynn; Chines, Peter; Coin, Lachlan; Cooper, Cyrus; Crawford, Gabriel; Doering, Angela; Dominiczak, Anna; Doney, Alex S. F.; Ebrahim, Shah; Elliott, Paul; Erdos, Michael R.; Estrada, Karol; Ferrucci, Luigi; Fischer, Guido; Forouhi, Nita G.; Gieger, Christian; Grallert, Harald; Groves, Christopher J.; Grundy, Scott; Guiducci, Candace; Hadley, David; Hamsten, Anders; Havulinna, Aki S.; Hofman, Albert; Holle, Rolf; Holloway, John W.; Illig, Thomas; Isomaa, Bo; Jacobs, Leonie C.; Jameson, Karen; Jousilahti, Pekka; Karpe, Fredrik; Kuusisto, Johanna; Laitinen, Jaana; Lathrop, G. Mark; Lawlor, Debbie A.; Mangino, Massimo; McArdle, Wendy L.; Meitinger, Thomas; Morken, Mario A.; Morris, Andrew P.; Munroe, Patricia; Narisu, Narisu; Nordström, Anna; Nordström, Peter; Oostra, Ben A.; Palmer, Colin N. A.; Payne, Felicity; Peden, John F.; Prokopenko, Inga; Renström, Frida; Ruokonen, Aimo; Salomaa, Veikko; Sandhu, Manjinder S.; Scott, Laura J.; Scuteri, Angelo; Silander, Kaisa; Song, Kijoung; Yuan, Xin; Stringham, Heather M.; Swift, Amy J.; Tuomi, Tiinamaija; Uda, Manuela; Vollenweider, Peter; Waeber, Gerard; Wallace, Chris; Walters, G. Bragi; Weedon, Michael N.; Witteman, Jacqueline C. M.; Zhang, Cuilin; Zhang, Weihua; Caulfield, Mark J.; Collins, Francis S.; Davey Smith, George; Day, Ian N. M.; Franks, Paul W.; Hattersley, Andrew T.; Hu, Frank B.; Jarvelin, Marjo-Riitta; Kong, Augustine; Kooner, Jaspal S.; Laakso, Markku; Lakatta, Edward; Mooser, Vincent; Morris, Andrew D.; Peltonen, Leena; Samani, Nilesh J.; Spector, Timothy D.; Strachan, David P.; Tanaka, Toshiko; Tuomilehto, Jaakko; Uitterlinden, André G.; van Duijn, Cornelia M.; Wareham, Nicholas J.; Watkins for the PROCARDIS consortia, Hugh; Waterworth, Dawn M.; Boehnke, Michael; Deloukas, Panos; Groop, Leif; Hunter, David J.; Thorsteinsdottir, Unnur; Schlessinger, David; Wichmann, H.-Erich; Frayling, Timothy M.; Abecasis, Gonçalo R.; Hirschhorn, Joel N.; Loos, Ruth J. F.; Stefansson, Kari; Mohlke, Karen L.; Barroso, Inês; McCarthy for the GIANT consortium, Mark I.

    2009-01-01

    To identify genetic loci influencing central obesity and fat distribution, we performed a meta-analysis of 16 genome-wide association studies (GWAS, N = 38,580) informative for adult waist circumference (WC) and waist–hip ratio (WHR). We selected 26 SNPs for follow-up, for which the evidence of association with measures of central adiposity (WC and/or WHR) was strong and disproportionate to that for overall adiposity or height. Follow-up studies in a maximum of 70,689 individuals identified two loci strongly associated with measures of central adiposity; these map near TFAP2B (WC, P = 1.9×10−11) and MSRA (WC, P = 8.9×10−9). A third locus, near LYPLAL1, was associated with WHR in women only (P = 2.6×10−8). The variants near TFAP2B appear to influence central adiposity through an effect on overall obesity/fat-mass, whereas LYPLAL1 displays a strong female-only association with fat distribution. By focusing on anthropometric measures of central obesity and fat distribution, we have identified three loci implicated in the regulation of human adiposity. PMID:19557161

  6. The Glyphosate-Based Herbicide Roundup Does not Elevate Genome-Wide Mutagenesis of Escherichia coli.

    PubMed

    Tincher, Clayton; Long, Hongan; Behringer, Megan; Walker, Noah; Lynch, Michael

    2017-10-05

    Mutations induced by pollutants may promote pathogen evolution, for example by accelerating mutations conferring antibiotic resistance. Generally, evaluating the genome-wide mutagenic effects of long-term sublethal pollutant exposure at single-nucleotide resolution is extremely difficult. To overcome this technical barrier, we use the mutation accumulation/whole-genome sequencing (MA/WGS) method as a mutagenicity test, to quantitatively evaluate genome-wide mutagenesis of Escherichia coli after long-term exposure to a wide gradient of the glyphosate-based herbicide (GBH) Roundup Concentrate Plus. The genome-wide mutation rate decreases as GBH concentration increases, suggesting that even long-term GBH exposure does not compromise the genome stability of bacteria. Copyright © 2017 Tincher et al.

  7. Inferring genome-wide interplay landscape between DNA methylation and transcriptional regulation.

    PubMed

    Tang, Binhua; Wang, Xin

    2015-01-01

    DNA methylation and transcriptional regulation play important roles in cancer cell development and differentiation processes. Based on the currently available cell line profiling information from the ENCODE Consortium, we propose a Bayesian inference model to infer and construct genome-wide interaction landscape between DNA methylation and transcriptional regulation, which sheds light on the underlying complex functional mechanisms important within the human cancer and disease context. For the first time, we select all the currently available cell lines (>=20) and transcription factors (>=80) profiling information from the ENCODE Consortium portal. Through the integration of those genome-wide profiling sources, our genome-wide analysis detects multiple functional loci of interest, and indicates that DNA methylation is cell- and region-specific, due to the interplay mechanisms with transcription regulatory activities. We validate our analysis results with the corresponding RNA-sequencing technique for those detected genomic loci. Our results provide novel and meaningful insights for the interplay mechanisms of transcriptional regulation and gene expression for the human cancer and disease studies.

  8. TEAM: efficient two-locus epistasis tests in human genome-wide association study.

    PubMed

    Zhang, Xiang; Huang, Shunping; Zou, Fei; Wang, Wei

    2010-06-15

    As a promising tool for identifying genetic markers underlying phenotypic differences, genome-wide association study (GWAS) has been extensively investigated in recent years. In GWAS, detecting epistasis (or gene-gene interaction) is preferable over single locus study since many diseases are known to be complex traits. A brute force search is infeasible for epistasis detection in the genome-wide scale because of the intensive computational burden. Existing epistasis detection algorithms are designed for dataset consisting of homozygous markers and small sample size. In human study, however, the genotype may be heterozygous, and number of individuals can be up to thousands. Thus, existing methods are not readily applicable to human datasets. In this article, we propose an efficient algorithm, TEAM, which significantly speeds up epistasis detection for human GWAS. Our algorithm is exhaustive, i.e. it does not ignore any epistatic interaction. Utilizing the minimum spanning tree structure, the algorithm incrementally updates the contingency tables for epistatic tests without scanning all individuals. Our algorithm has broader applicability and is more efficient than existing methods for large sample study. It supports any statistical test that is based on contingency tables, and enables both family-wise error rate and false discovery rate controlling. Extensive experiments show that our algorithm only needs to examine a small portion of the individuals to update the contingency tables, and it achieves at least an order of magnitude speed up over the brute force approach.

  9. Genome-wide association and genomic prediction of resistance to viral nervous necrosis in European sea bass (Dicentrarchus labrax) using RAD sequencing.

    PubMed

    Palaiokostas, Christos; Cariou, Sophie; Bestin, Anastasia; Bruant, Jean-Sebastien; Haffray, Pierrick; Morin, Thierry; Cabon, Joëlle; Allal, François; Vandeputte, Marc; Houston, Ross D

    2018-06-08

    European sea bass (Dicentrarchus labrax) is one of the most important species for European aquaculture. Viral nervous necrosis (VNN), commonly caused by the redspotted grouper nervous necrosis virus (RGNNV), can result in high levels of morbidity and mortality, mainly during the larval and juvenile stages of cultured sea bass. In the absence of efficient therapeutic treatments, selective breeding for host resistance offers a promising strategy to control this disease. Our study aimed at investigating genetic resistance to VNN and genomic-based approaches to improve disease resistance by selective breeding. A population of 1538 sea bass juveniles from a factorial cross between 48 sires and 17 dams was challenged with RGNNV with mortalities and survivors being recorded and sampled for genotyping by the RAD sequencing approach. We used genome-wide genotype data from 9195 single nucleotide polymorphisms (SNPs) for downstream analysis. Estimates of heritability of survival on the underlying scale for the pedigree and genomic relationship matrices were 0.27 (HPD interval 95%: 0.14-0.40) and 0.43 (0.29-0.57), respectively. Classical genome-wide association analysis detected genome-wide significant quantitative trait loci (QTL) for resistance to VNN on chromosomes (unassigned scaffolds in the case of 'chromosome' 25) 3, 20 and 25 (P < 1e06). Weighted genomic best linear unbiased predictor provided additional support for the QTL on chromosome 3 and suggested that it explained 4% of the additive genetic variation. Genomic prediction approaches were tested to investigate the potential of using genome-wide SNP data to estimate breeding values for resistance to VNN and showed that genomic prediction resulted in a 13% increase in successful classification of resistant and susceptible animals compared to pedigree-based methods, with Bayes A and Bayes B giving the highest predictive ability. Genome-wide significant QTL were identified but each with relatively small effects on

  10. ARG-based genome-wide analysis of cacao cultivars.

    PubMed

    Utro, Filippo; Cornejo, Omar Eduardo; Livingstone, Donald; Motamayor, Juan Carlos; Parida, Laxmi

    2012-01-01

    Ancestral recombinations graph (ARG) is a topological structure that captures the relationship between the extant genomic sequences in terms of genetic events including recombinations. IRiS is a system that estimates the ARG on sequences of individuals, at genomic scales, capturing the relationship between these individuals of the species. Recently, this system was used to estimate the ARG of the recombining X Chromosome of a collection of human populations using relatively dense, bi-allelic SNP data. While the ARG is a natural model for capturing the inter-relationship between a single chromosome of the individuals of a species, it is not immediately apparent how the model can utilize whole-genome (across chromosomes) diploid data. Also, the sheer complexity of an ARG structure presents a challenge to graph visualization techniques. In this paper we examine the ARG reconstruction for (1) genome-wide or multiple chromosomes, (2) multi-allelic and (3) extremely sparse data. To aid in the visualization of the results of the reconstructed ARG, we additionally construct a much simplified topology, a classification tree, suggested by the ARG.As the test case, we study the problem of extracting the relationship between populations of Theobroma cacao. The chocolate tree is an outcrossing species in the wild, due to self-incompatibility mechanisms at play. Thus a principled approach to understanding the inter-relationships between the different populations must take the shuffling of the genomic segments into account. The polymorphisms in the test data are short tandem repeats (STR) and are multi-allelic (sometimes as high as 30 distinct possible values at a locus). Each is at a genomic location that is bilaterally transmitted, hence the ARG is a natural model for this data. Another characteristic of this plant data set is that while it is genome-wide, across 10 linkage groups or chromosomes, it is very sparse, i.e., only 96 loci from a genome of approximately 400 megabases

  11. ARG-based genome-wide analysis of cacao cultivars

    PubMed Central

    2012-01-01

    Background Ancestral recombinations graph (ARG) is a topological structure that captures the relationship between the extant genomic sequences in terms of genetic events including recombinations. IRiS is a system that estimates the ARG on sequences of individuals, at genomic scales, capturing the relationship between these individuals of the species. Recently, this system was used to estimate the ARG of the recombining X Chromosome of a collection of human populations using relatively dense, bi-allelic SNP data. Results While the ARG is a natural model for capturing the inter-relationship between a single chromosome of the individuals of a species, it is not immediately apparent how the model can utilize whole-genome (across chromosomes) diploid data. Also, the sheer complexity of an ARG structure presents a challenge to graph visualization techniques. In this paper we examine the ARG reconstruction for (1) genome-wide or multiple chromosomes, (2) multi-allelic and (3) extremely sparse data. To aid in the visualization of the results of the reconstructed ARG, we additionally construct a much simplified topology, a classification tree, suggested by the ARG. As the test case, we study the problem of extracting the relationship between populations of Theobroma cacao. The chocolate tree is an outcrossing species in the wild, due to self-incompatibility mechanisms at play. Thus a principled approach to understanding the inter-relationships between the different populations must take the shuffling of the genomic segments into account. The polymorphisms in the test data are short tandem repeats (STR) and are multi-allelic (sometimes as high as 30 distinct possible values at a locus). Each is at a genomic location that is bilaterally transmitted, hence the ARG is a natural model for this data. Another characteristic of this plant data set is that while it is genome-wide, across 10 linkage groups or chromosomes, it is very sparse, i.e., only 96 loci from a genome of

  12. Genome-wide association study to identify common variants associated with brachial circumference: a meta-analysis of 14 cohorts.

    PubMed

    Boraska, Vesna; Day-Williams, Aaron; Franklin, Christopher S; Elliott, Katherine S; Panoutsopoulou, Kalliope; Tachmazidou, Ioanna; Albrecht, Eva; Bandinelli, Stefania; Beilin, Lawrence J; Bochud, Murielle; Cadby, Gemma; Ernst, Florian; Evans, David M; Hayward, Caroline; Hicks, Andrew A; Huffman, Jennifer; Huth, Cornelia; James, Alan L; Klopp, Norman; Kolcic, Ivana; Kutalik, Zoltán; Lawlor, Debbie A; Musk, Arthur W; Pehlic, Marina; Pennell, Craig E; Perry, John R B; Peters, Annette; Polasek, Ozren; St Pourcain, Beate; Ring, Susan M; Salvi, Erika; Schipf, Sabine; Staessen, Jan A; Teumer, Alexander; Timpson, Nicholas; Vitart, Veronique; Warrington, Nicole M; Yaghootkar, Hanieh; Zemunik, Tatijana; Zgaga, Lina; An, Ping; Anttila, Verneri; Borecki, Ingrid B; Holmen, Jostein; Ntalla, Ioanna; Palotie, Aarno; Pietiläinen, Kirsi H; Wedenoja, Juho; Winsvold, Bendik S; Dedoussis, George V; Kaprio, Jaakko; Province, Michael A; Zwart, John-Anker; Burnier, Michel; Campbell, Harry; Cusi, Daniele; Smith, George Davey; Frayling, Timothy M; Gieger, Christian; Palmer, Lyle J; Pramstaller, Peter P; Rudan, Igor; Völzke, Henry; Wichmann, H-Erich; Wright, Alan F; Zeggini, Eleftheria

    2012-01-01

    Brachial circumference (BC), also known as upper arm or mid arm circumference, can be used as an indicator of muscle mass and fat tissue, which are distributed differently in men and women. Analysis of anthropometric measures of peripheral fat distribution such as BC could help in understanding the complex pathophysiology behind overweight and obesity. The purpose of this study is to identify genetic variants associated with BC through a large-scale genome-wide association scan (GWAS) meta-analysis. We used fixed-effects meta-analysis to synthesise summary results across 14 GWAS discovery and 4 replication cohorts comprising overall 22,376 individuals (12,031 women and 10,345 men) of European ancestry. Individual analyses were carried out for men, women, and combined across sexes using linear regression and an additive genetic model: adjusted for age and adjusted for age and BMI. We prioritised signals for follow-up in two-stages. We did not detect any signals reaching genome-wide significance. The FTO rs9939609 SNP showed nominal evidence for association (p<0.05) in the age-adjusted strata for men and across both sexes. In this first GWAS meta-analysis for BC to date, we have not identified any genome-wide significant signals and do not observe robust association of previously established obesity loci with BC. Large-scale collaborations will be necessary to achieve higher power to detect loci underlying BC.

  13. Enhancing genomic prediction with genome-wide association studies in multiparental maize populations

    USDA-ARS?s Scientific Manuscript database

    Genome-wide association mapping using dense marker sets has identified some nucleotide variants affecting complex traits which have been validated with fine-mapping and functional analysis. Many sequence variants associated with complex traits in maize have small effects and low repeatability, howev...

  14. Heritability and molecular genetic basis of acoustic startle eye blink and affectively modulated startle response: A genome-wide association study

    PubMed Central

    VAIDYANATHAN, UMA; MALONE, STEPHEN M.; MILLER, MICHAEL B.; McGUE, MATT; IACONO, WILLIAM G.

    2014-01-01

    Acoustic startle responses have been studied extensively in relation to individual differences and psychopathology. We examined three indices of the blink response in a picture-viewing paradigm—overall startle magnitude across all picture types, and aversive and pleasant modulation scores—in 3,323 twins and parents. Biometric models and molecular genetic analyses showed that half the variance in overall startle was due to additive genetic effects. No single nucleotide polymorphism was genome-wide significant, but GRIK3 did produce a significant effect when examined as part of a candidate gene set. In contrast, emotion modulation scores showed little evidence of heritability in either biometric or molecular genetic analyses. However, in a genome-wide scan, PARP14 did produce a significant effect for aversive modulation. We conclude that, although overall startle retains potential as an endophenotype, emotion-modulated startle does not. PMID:25387708

  15. Genome-wide scan for seed composition provides insights into the improvement of soybean quality and the impacts of domestication and modern breeding

    USDA-ARS?s Scientific Manuscript database

    Soybean (Glycine max (L.) Merrill) is a world-widely grown major crop rich in both protein and oil. Improvement of seed nutrients has long been one of the most important breeding objectives in soybean. To better understand the genetic architecture of the traits for improvement, we conducted genome-w...

  16. Genome-wide interaction study of smoking and bladder cancer risk

    PubMed Central

    Figueroa, Jonine D.; Han, Summer S.; Garcia-Closas, Montserrat; Baris, Dalsu; Jacobs, Eric J.; Kogevinas, Manolis; Schwenn, Molly; Malats, Nuria; Johnson, Alison; Purdue, Mark P.; Caporaso, Neil; Landi, Maria Teresa; Prokunina-Olsson, Ludmila; Wang, Zhaoming; Hutchinson, Amy; Burdette, Laurie; Wheeler, William; Vineis, Paolo; Siddiq, Afshan; Cortessis, Victoria K.; Kooperberg, Charles; Cussenot, Olivier; Benhamou, Simone; Prescott, Jennifer; Porru, Stefano; Bueno-de-Mesquita, H.Bas; Trichopoulos, Dimitrios; Ljungberg, Börje; Clavel-Chapelon, Françoise; Weiderpass, Elisabete; Krogh, Vittorio; Dorronsoro, Miren; Travis, Ruth; Tjønneland, Anne; Brenan, Paul; Chang-Claude, Jenny; Riboli, Elio; Conti, David; Gago-Dominguez, Manuela; Stern, Mariana C.; Pike, Malcolm C.; Van Den Berg, David; Yuan, Jian-Min; Hohensee, Chancellor; Rodabough, Rebecca; Cancel-Tassin, Geraldine; Roupret, Morgan; Comperat, Eva; Chen, Constance; De Vivo, Immaculata; Giovannucci, Edward; Hunter, David J.; Kraft, Peter; Lindstrom, Sara; Carta, Angela; Pavanello, Sofia; Arici, Cecilia; Mastrangelo, Giuseppe; Karagas, Margaret R.; Schned, Alan; Armenti, Karla R.; Hosain, G.M.Monawar; Haiman, Chris A.; Fraumeni, Joseph F.; Chanock, Stephen J.; Chatterjee, Nilanjan; Rothman, Nathaniel; Silverman, Debra T.

    2014-01-01

    Bladder cancer is a complex disease with known environmental and genetic risk factors. We performed a genome-wide interaction study (GWAS) of smoking and bladder cancer risk based on primary scan data from 3002 cases and 4411 controls from the National Cancer Institute Bladder Cancer GWAS. Alternative methods were used to evaluate both additive and multiplicative interactions between individual single nucleotide polymorphisms (SNPs) and smoking exposure. SNPs with interaction P values < 5 × 10− 5 were evaluated further in an independent dataset of 2422 bladder cancer cases and 5751 controls. We identified 10 SNPs that showed association in a consistent manner with the initial dataset and in the combined dataset, providing evidence of interaction with tobacco use. Further, two of these novel SNPs showed strong evidence of association with bladder cancer in tobacco use subgroups that approached genome-wide significance. Specifically, rs1711973 (FOXF2) on 6p25.3 was a susceptibility SNP for never smokers [combined odds ratio (OR) = 1.34, 95% confidence interval (CI) = 1.20–1.50, P value = 5.18 × 10− 7]; and rs12216499 (RSPH3-TAGAP-EZR) on 6q25.3 was a susceptibility SNP for ever smokers (combined OR = 0.75, 95% CI = 0.67–0.84, P value = 6.35 × 10− 7). In our analysis of smoking and bladder cancer, the tests for multiplicative interaction seemed to more commonly identify susceptibility loci with associations in never smokers, whereas the additive interaction analysis identified more loci with associations among smokers—including the known smoking and NAT2 acetylation interaction. Our findings provide additional evidence of gene–environment interactions for tobacco and bladder cancer. PMID:24662972

  17. Genome-wide scan for selection signatures in six cattle breeds in South Africa.

    PubMed

    Makina, Sithembile O; Muchadeyi, Farai C; van Marle-Köster, Este; Taylor, Jerry F; Makgahlela, Mahlako L; Maiwashe, Azwihangwisi

    2015-11-26

    The detection of selection signatures in breeds of livestock species can contribute to the identification of regions of the genome that are, or have been, functionally important and, as a consequence, have been targeted by selection. This study used two approaches to detect signatures of selection within and between six cattle breeds in South Africa, including Afrikaner (n = 44), Nguni (n = 54), Drakensberger (n = 47), Bonsmara (n = 44), Angus (n = 31) and Holstein (n = 29). The first approach was based on the detection of genomic regions in which haplotypes have been driven towards complete fixation within breeds. The second approach identified regions of the genome that had very different allele frequencies between populations (F ST). Forty-seven candidate genomic regions were identified as harbouring putative signatures of selection using both methods. Twelve of these candidate selected regions were shared among the breeds and ten were validated by previous studies. Thirty-three of these regions were successfully annotated and candidate genes were identified. Among these genes the keratin genes (KRT222, KRT24, KRT25, KRT26, and KRT27) and one heat shock protein gene (HSPB9) on chromosome 19 between 42,896,570 and 42,897,840 bp were detected for the Nguni breed. These genes were previously associated with adaptation to tropical environments in Zebu cattle. In addition, a number of candidate genes associated with the nervous system (WNT5B, FMOD, PRELP, and ATP2B), immune response (CYM, CDC6, and CDK10), production (MTPN, IGFBP4, TGFB1, and AJAP1) and reproductive performance (ADIPOR2, OVOS2, and RBBP8) were also detected as being under selection. The results presented here provide a foundation for detecting mutations that underlie genetic variation of traits that have economic importance for cattle breeds in South Africa.

  18. Meta-Analysis of Genome-Wide Association Studies of Attention-Deficit/Hyperactivity Disorder

    ERIC Educational Resources Information Center

    Neale, Benjamin M.; Medland, Sarah E.; Ripke, Stephan; Asherson, Philip; Franke, Barbara; Lesch, Klaus-Peter; Faraone, Stephen V.; Nguyen, Thuy Trang; Schafer, Helmut; Holmans, Peter; Daly, Mark; Steinhausen, Hans-Christoph; Freitag, Christine; Reif, Andreas; Renner, Tobias J.; Romanos, Marcel; Romanos, Jasmin; Walitza, Susanne; Warnke, Andreas; Meyer, Jobst; Palmason, Haukur; Buitelaar, Jan; Vasquez, Alejandro Arias; Lambregts-Rommelse, Nanda; Gill, Michael; Anney, Richard J. L.; Langely, Kate; O'Donovan, Michael; Williams, Nigel; Owen, Michael; Thapar, Anita; Kent, Lindsey; Sergeant, Joseph; Roeyers, Herbert; Mick, Eric; Biederman, Joseph; Doyle, Alysa; Smalley, Susan; Loo, Sandra; Hakonarson, Hakon; Elia, Josephine; Todorov, Alexandre; Miranda, Ana; Mulas, Fernando; Ebstein, Richard P.; Rothenberger, Aribert; Banaschewski, Tobias; Oades, Robert D.; Sonuga-Barke, Edmund; McGough, James; Nisenbaum, Laura; Middleton, Frank; Hu, Xiaolan; Nelson, Stan

    2010-01-01

    Objective: Although twin and family studies have shown attention-deficit/hyperactivity disorder (ADHD) to be highly heritable, genetic variants influencing the trait at a genome-wide significant level have yet to be identified. As prior genome-wide association studies (GWAS) have not yielded significant results, we conducted a meta-analysis of…

  19. Genome-wide association study of Tourette Syndrome

    PubMed Central

    Scharf, Jeremiah M.; Yu, Dongmei; Mathews, Carol A.; Neale, Benjamin M.; Stewart, S. Evelyn; Fagerness, Jesen A; Evans, Patrick; Gamazon, Eric; Edlund, Christopher K.; Service, Susan; Tikhomirov, Anna; Osiecki, Lisa; Illmann, Cornelia; Pluzhnikov, Anna; Konkashbaev, Anuar; Davis, Lea K; Han, Buhm; Crane, Jacquelyn; Moorjani, Priya; Crenshaw, Andrew T.; Parkin, Melissa A.; Reus, Victor I.; Lowe, Thomas L.; Rangel-Lugo, Martha; Chouinard, Sylvain; Dion, Yves; Girard, Simon; Cath, Danielle C; Smit, Jan H; King, Robert A.; Fernandez, Thomas; Leckman, James F.; Kidd, Kenneth K.; Kidd, Judith R.; Pakstis, Andrew J.; State, Matthew; Herrera, Luis Diego; Romero, Roxana; Fournier, Eduardo; Sandor, Paul; Barr, Cathy L; Phan, Nam; Gross-Tsur, Varda; Benarroch, Fortu; Pollak, Yehuda; Budman, Cathy L.; Bruun, Ruth D.; Erenberg, Gerald; Naarden, Allan L; Lee, Paul C; Weiss, Nicholas; Kremeyer, Barbara; Berrío, Gabriel Bedoya; Campbell, Desmond; Silgado, Julio C. Cardona; Ochoa, William Cornejo; Restrepo, Sandra C. Mesa; Muller, Heike; Duarte, Ana V. Valencia; Lyon, Gholson J; Leppert, Mark; Morgan, Jubel; Weiss, Robert; Grados, Marco A.; Anderson, Kelley; Davarya, Sarah; Singer, Harvey; Walkup, John; Jankovic, Joseph; Tischfield, Jay A.; Heiman, Gary A.; Gilbert, Donald L.; Hoekstra, Pieter J.; Robertson, Mary M.; Kurlan, Roger; Liu, Chunyu; Gibbs, J. Raphael; Singleton, Andrew; Hardy, John; Strengman, Eric; Ophoff, Roel; Wagner, Michael; Moessner, Rainald; Mirel, Daniel B.; Posthuma, Danielle; Sabatti, Chiara; Eskin, Eleazar; Conti, David V.; Knowles, James A.; Ruiz-Linares, Andres; Rouleau, Guy A.; Purcell, Shaun; Heutink, Peter; Oostra, Ben A.; McMahon, William; Freimer, Nelson; Cox, Nancy J.; Pauls, David L.

    2012-01-01

    Tourette Syndrome (TS) is a developmental disorder that has one of the highest familial recurrence rates among neuropsychiatric diseases with complex inheritance. However, the identification of definitive TS susceptibility genes remains elusive. Here, we report the first genome-wide association study (GWAS) of TS in 1285 cases and 4964 ancestry-matched controls of European ancestry, including two European-derived population isolates, Ashkenazi Jews from North America and Israel, and French Canadians from Quebec, Canada. In a primary meta-analysis of GWAS data from these European ancestry samples, no markers achieved a genome-wide threshold of significance (p<5 × 10−8); the top signal was found in rs7868992 on chromosome 9q32 within COL27A1 (p=1.85 × 10−6). A secondary analysis including an additional 211 cases and 285 controls from two closely-related Latin-American population isolates from the Central Valley of Costa Rica and Antioquia, Colombia also identified rs7868992 as the top signal (p=3.6 × 10−7 for the combined sample of 1496 cases and 5249 controls following imputation with 1000 Genomes data). This study lays the groundwork for the eventual identification of common TS susceptibility variants in larger cohorts and helps to provide a more complete understanding of the full genetic architecture of this disorder. PMID:22889924

  20. Genome-Wide Association Study and Linkage Analysis of the Healthy Aging Index

    PubMed Central

    Minster, Ryan L.; Sanders, Jason L.; Singh, Jatinder; Kammerer, Candace M.; Barmada, M. Michael; Matteini, Amy M.; Zhang, Qunyuan; Wojczynski, Mary K.; Daw, E. Warwick; Brody, Jennifer A.; Arnold, Alice M.; Lunetta, Kathryn L.; Murabito, Joanne M.; Christensen, Kaare; Perls, Thomas T.; Province, Michael A.

    2015-01-01

    Background. The Healthy Aging Index (HAI) is a tool for measuring the extent of health and disease across multiple systems. Methods. We conducted a genome-wide association study and a genome-wide linkage analysis to map quantitative trait loci associated with the HAI and a modified HAI weighted for mortality risk in 3,140 individuals selected for familial longevity from the Long Life Family Study. The genome-wide association study used the Long Life Family Study as the discovery cohort and individuals from the Cardiovascular Health Study and the Framingham Heart Study as replication cohorts. Results. There were no genome-wide significant findings from the genome-wide association study; however, several single-nucleotide polymorphisms near ZNF704 on chromosome 8q21.13 were suggestively associated with the HAI in the Long Life Family Study (p < 10− 6) and nominally replicated in the Cardiovascular Health Study and Framingham Heart Study. Linkage results revealed significant evidence (log-odds score = 3.36) for a quantitative trait locus for mortality-optimized HAI in women on chromosome 9p24–p23. However, results of fine-mapping studies did not implicate any specific candidate genes within this region of interest. Conclusions. ZNF704 may be a potential candidate gene for studies of the genetic underpinnings of longevity. PMID:25758594

  1. Genome-wide association study of the four-constitution medicine.

    PubMed

    Yin, Chang Shik; Park, Hi Joon; Chung, Joo-Ho; Lee, Hye-Jung; Lee, Byung-Cheol

    2009-12-01

    Four-constitution medicine (FCM), also known as Sasang constitutional medicine, and the heritage of the long history of individualized acupuncture medicine tradition, is one of the holistic and traditional systems of constitution to appraise and categorize individual differences into four major types. This study first reports a genome-wide association study on FCM, to explore the genetic basis of FCM and facilitate the integration of FCM with conventional individual differences research. Healthy individuals of the Korean population were classified into the four constitutional types (FCTs). A total of 353,202 single nucleotide polymorphisms (SNPs) were typed using whole genome amplified samples, and six-way comparison of FCM types provided lists of significantly differential SNPs. In one-to-one FCT comparisons, 15,944 SNPs were significantly differential, and 5 SNPs were commonly significant in all of the three comparisons. In one-to-two FCT comparisons, 22,616 SNPs were significantly differential, and 20 SNPs were commonly significant in all of the three comparison groups. This study presents the association between genome-wide SNP profiles and the categorization of the FCM, and it could further provide a starting point of genome-based identification and research of the constitutions of FCM.

  2. Wide scanning spherical antenna

    NASA Technical Reports Server (NTRS)

    Shen, Bing (Inventor); Stutzman, Warren L. (Inventor)

    1995-01-01

    A novel method for calculating the surface shapes for subreflectors in a suboptic assembly of a tri-reflector spherical antenna system is introduced, modeled from a generalization of Galindo-Israel's method of solving partial differential equations to correct for spherical aberration and provide uniform feed to aperture mapping. In a first embodiment, the suboptic assembly moves as a single unit to achieve scan while the main reflector remains stationary. A feed horn is tilted during scan to maintain the illuminated area on the main spherical reflector fixed throughout the scan thereby eliminating the need to oversize the main spherical reflector. In an alternate embodiment, both the main spherical reflector and the suboptic assembly are fixed. A flat mirror is used to create a virtual image of the suboptic assembly. Scan is achieved by rotating the mirror about the spherical center of the main reflector. The feed horn is tilted during scan to maintain the illuminated area on the main spherical reflector fixed throughout the scan.

  3. Annotation-based genome-wide SNP discovery in the large and complex Aegilops tauschii genome using next-generation sequencing without a reference genome sequence

    PubMed Central

    2011-01-01

    Background Many plants have large and complex genomes with an abundance of repeated sequences. Many plants are also polyploid. Both of these attributes typify the genome architecture in the tribe Triticeae, whose members include economically important wheat, rye and barley. Large genome sizes, an abundance of repeated sequences, and polyploidy present challenges to genome-wide SNP discovery using next-generation sequencing (NGS) of total genomic DNA by making alignment and clustering of short reads generated by the NGS platforms difficult, particularly in the absence of a reference genome sequence. Results An annotation-based, genome-wide SNP discovery pipeline is reported using NGS data for large and complex genomes without a reference genome sequence. Roche 454 shotgun reads with low genome coverage of one genotype are annotated in order to distinguish single-copy sequences and repeat junctions from repetitive sequences and sequences shared by paralogous genes. Multiple genome equivalents of shotgun reads of another genotype generated with SOLiD or Solexa are then mapped to the annotated Roche 454 reads to identify putative SNPs. A pipeline program package, AGSNP, was developed and used for genome-wide SNP discovery in Aegilops tauschii-the diploid source of the wheat D genome, and with a genome size of 4.02 Gb, of which 90% is repetitive sequences. Genomic DNA of Ae. tauschii accession AL8/78 was sequenced with the Roche 454 NGS platform. Genomic DNA and cDNA of Ae. tauschii accession AS75 was sequenced primarily with SOLiD, although some Solexa and Roche 454 genomic sequences were also generated. A total of 195,631 putative SNPs were discovered in gene sequences, 155,580 putative SNPs were discovered in uncharacterized single-copy regions, and another 145,907 putative SNPs were discovered in repeat junctions. These SNPs were dispersed across the entire Ae. tauschii genome. To assess the false positive SNP discovery rate, DNA containing putative SNPs was

  4. Genomic prediction and genome-wide association analysis of female longevity in a composite beef cattle breed.

    PubMed

    Hamidi Hay, E; Roberts, A

    2017-04-01

    Longevity is a highly important trait to the efficiency of beef cattle production. The objective of this study was to evaluate the genomic prediction of longevity and identify genomic regions associated with this trait. The data used in this study consisted of 547 Composite Gene Combination cows (1/2 Red Angus, 1/4 Charolais, 1/4 Tarentaise) born from 2002 to 2011 genotyped with Illumina BovineSNP50 BeadChip. Three models were used to assess genomic prediction: Bayes A, Bayes B and GBLUP using a genomic relationship matrix. To identify genomic regions associated with longevity 2 approaches were adopted: single marker genome wide association and Bayesian approach using GenSel software. The genomic prediction accuracy was low 0.28, 0.25, and 0.22 for Bayes A, Bayes B and GBLUP, respectively. The single-marker genome wide association study (GWAS)identified 5 loci with -value less than 0.05 after false discovery correction: UA-IFASA-7571 on chromosome 19 (58.03 Mb), ARS-BFGL-BAC-15059 on BTA 1 (28.8 Mb), ARS-BFGL-NGS-104159 on BTA3 (29.4 Mb), ARS-BFGL-NGS-32882 on BTA9 (104.07 Mb) and ARS-BFGL-NGS-32883 on BTA25 (33.77 Mb). The Bayesian GWAS yielded 4 genomic regions overlapping with the single marker GWAS results. The region with the highest percentage of genomic variance (3.73%) was detected on chromosome 19. Both GWAS approaches adopted in this study showed evidence for association with various chromosomal locations.

  5. A Genome-Wide Association Study of Depressive Symptoms

    PubMed Central

    Cornelis, Marilyn C.; Amin, Najaf; Bakshis, Erin; Baumert, Jens; Ding, Jingzhong; Liu, Yongmei; Marciante, Kristin; Meirelles, Osorio; Nalls, Michael A.; Sun, Yan V.; Vogelzangs, Nicole; Yu, Lei; Bandinelli, Stefania; Benjamin, Emelia J.; Bennett, David A.; Boomsma, Dorret; Cannas, Alessandra; Coker, Laura H.; de Geus, Eco; De Jager, Philip L.; Diez-Roux, Ana V.; Purcell, Shaun; Hu, Frank B.; Rimma, Eric B.; Hunter, David J.; Jensen, Majken K.; Curhan, Gary; Rice, Kenneth; Penman, Alan D.; Rotter, Jerome I.; Sotoodehnia, Nona; Emeny, Rebecca; Eriksson, Johan G.; Evans, Denis A.; Ferrucci, Luigi; Fornage, Myriam; Gudnason, Vilmundur; Hofman, Albert; Illig, Thomas; Kardia, Sharon; Kelly-Hayes, Margaret; Koenen, Karestan; Kraft, Peter; Kuningas, Maris; Massaro, Joseph M.; Melzer, David; Mulas, Antonella; Mulder, Cornelis L.; Murray, Anna; Oostra, Ben A.; Palotie, Aarno; Penninx, Brenda; Petersmann, Astrid; Pilling, Luke C.; Psaty, Bruce; Rawal, Rajesh; Reiman, Eric M.; Schulz, Andrea; Shulman, Joshua M.; Singleton, Andrew B.; Smith, Albert V.; Sutin, Angelina R.; Uitterlinden, André G.; Völzke, Henry; Widen, Elisabeth; Yaffe, Kristine; Zonderman, Alan B.; Cucca, Francesco; Harris, Tamara; Ladwig, Karl-Heinz; Llewellyn, David J.; Räikkönen, Katri; Tanaka, Toshiko

    2013-01-01

    Background Depression is a heritable trait that exists on a continuum of varying severity and duration. Yet, the search for genetic variants associated with depression has had few successes. We exploit the entire continuum of depression to find common variants for depressive symptoms. Methods In this genome-wide association study, we combined the results of 17 population-based studies assessing depressive symptoms with the Center for Epidemiological Studies Depression Scale. Replication of the independent top hits (p < 1 × 10−5) was performed in five studies assessing depressive symptoms with other instruments. In addition, we performed a combined meta-analysis of all 22 discovery and replication studies. Results The discovery sample comprised 34,549 individuals (mean age of 66.5) and no loci reached genome-wide significance (lowest p = 1.05 × 10−7). Seven independent single nucleotide polymorphisms were considered for replication. In the replication set (n = 16,709), we found suggestive association of one single nucleotide polymorphism with depressive symptoms (rs161645, 5q21, p = 9.19 × 10−3). This 5q21 region reached genome-wide significance (p = 4.78 × 10−8) in the overall meta-analysis combining discovery and replication studies (n = 51,258). Conclusions The results suggest that only a large sample comprising more than 50,000 subjects may be sufficiently powered to detect genes for depressive symptoms. PMID:23290196

  6. Empirical estimation of genome-wide significance thresholds based on the 1000 Genomes Project data set.

    PubMed

    Kanai, Masahiro; Tanaka, Toshihiro; Okada, Yukinori

    2016-10-01

    To assess the statistical significance of associations between variants and traits, genome-wide association studies (GWAS) should employ an appropriate threshold that accounts for the massive burden of multiple testing in the study. Although most studies in the current literature commonly set a genome-wide significance threshold at the level of P=5.0 × 10 -8 , the adequacy of this value for respective populations has not been fully investigated. To empirically estimate thresholds for different ancestral populations, we conducted GWAS simulations using the 1000 Genomes Phase 3 data set for Africans (AFR), Europeans (EUR), Admixed Americans (AMR), East Asians (EAS) and South Asians (SAS). The estimated empirical genome-wide significance thresholds were P sig =3.24 × 10 -8 (AFR), 9.26 × 10 -8 (EUR), 1.83 × 10 -7 (AMR), 1.61 × 10 -7 (EAS) and 9.46 × 10 -8 (SAS). We additionally conducted trans-ethnic meta-analyses across all populations (ALL) and all populations except for AFR (ΔAFR), which yielded P sig =3.25 × 10 -8 (ALL) and 4.20 × 10 -8 (ΔAFR). Our results indicate that the current threshold (P=5.0 × 10 -8 ) is overly stringent for all ancestral populations except for Africans; however, we should employ a more stringent threshold when conducting a meta-analysis, regardless of the presence of African samples.

  7. Genome Wide Association Study of Sepsis in Extremely Premature Infants

    PubMed Central

    Srinivasan, Lakshmi; Page, Grier; Kirpalani, Haresh; Murray, Jeffrey C.; Das, Abhik; Higgins, Rosemary D.; Carlo, Waldemar A.; Bell, Edward F.; Goldberg, Ronald N.; Schibler, Kurt; Sood, Beena G.; Stevenson, David K.; Stoll, Barbara J.; Van Meurs, Krisa P.; Johnson, Karen J.; Levy, Joshua; McDonald, Scott A.; Zaterka-Baxter, Kristin M.; Kennedy, Kathleen A.; Sánchez, Pablo J.; Duara, Shahnaz; Walsh, Michele C.; Shankaran, Seetha; Wynn, James L.; Cotten, C. Michael

    2017-01-01

    Objective To identify genetic variants associated with sepsis (early and late-onset) using a genome wide association (GWA) analysis in a cohort of extremely premature infants. Study Design Previously generated GWA data from the Neonatal Research Network’s anonymized genomic database biorepository of extremely premature infants were used for this study. Sepsis was defined as culture-positive early-onset or late-onset sepsis or culture-proven meningitis. Genomic and whole genome amplified DNA was genotyped for 1.2 million single nucleotide polymorphisms (SNPs); 91% of SNPs were successfully genotyped. We imputed 7.2 million additional SNPs. P values and false discovery rates were calculated from multivariate logistic regression analysis adjusting for gender, gestational age and ancestry. Target statistical value was p<10−5. Secondary analyses assessed associations of SNPs with pathogen type. Pathway analyses were also run on primary and secondary end points. Results Data from 757 extremely premature infants were included: 351 infants with sepsis and 406 infants without sepsis. No SNPs reached genome-wide significance levels (5×10−8); two SNPs in proximity to FOXC2 and FOXL1 genes achieved target levels of significance. In secondary analyses, SNPs for ELMO1, IRAK2 (Gram positive sepsis), RALA, IMMP2L (Gram negative sepsis) and PIEZO2 (fungal sepsis) met target significance levels. Pathways associated with sepsis and Gram negative sepsis included gap junctions, fibroblast growth factor receptors, regulators of cell division and Interleukin-1 associated receptor kinase 2 (p values<0.001 and FDR<20%). Conclusions No SNPs met genome-wide significance in this cohort of ELBW infants; however, areas of potential association and pathways meriting further study were identified. PMID:28283553

  8. Genome-Wide Divergence and Linkage Disequilibrium Analyses for Capsicum baccatum Revealed by Genome-Anchored Single Nucleotide Polymorphisms

    PubMed Central

    Nimmakayala, Padma; Abburi, Venkata L.; Saminathan, Thangasamy; Almeida, Aldo; Davenport, Brittany; Davidson, Joshua; Reddy, C. V. Chandra Mohan; Hankins, Gerald; Ebert, Andreas; Choi, Doil; Stommel, John; Reddy, Umesh K.

    2016-01-01

    Principal component analysis (PCA) with 36,621 polymorphic genome-anchored single nucleotide polymorphisms (SNPs) identified collectively for Capsicum annuum and Capsicum baccatum was used to characterize population structure and species domestication of these two important incompatible cultivated pepper species. Estimated mean nucleotide diversity (π) and Tajima's D across various chromosomes revealed biased distribution toward negative values on all chromosomes (except for chromosome 4) in cultivated C. baccatum, indicating a population bottleneck during domestication of C. baccatum. In contrast, C. annuum chromosomes showed positive π and Tajima's D on all chromosomes except chromosome 8, which may be because of domestication at multiple sites contributing to wider genetic diversity. For C. baccatum, 13,129 SNPs were available, with minor allele frequency (MAF) ≥0.05; PCA of the SNPs revealed 283 C. baccatum accessions grouped into 3 distinct clusters, for strong population structure. The fixation index (FST) between domesticated C. annuum and C. baccatum was 0.78, which indicates genome-wide divergence. We conducted extensive linkage disequilibrium (LD) analysis of C. baccatum var. pendulum cultivars on all adjacent SNP pairs within a chromosome to identify regions of high and low LD interspersed with a genome-wide average LD block size of 99.1 kb. We characterized 1742 haplotypes containing 4420 SNPs (range 9–2 SNPs per haplotype). Genome-wide association study (GWAS) of peduncle length, a trait that differentiates wild and domesticated C. baccatum types, revealed 36 significantly associated genome-wide SNPs. Population structure, identity by state (IBS) and LD patterns across the genome will be of potential use for future GWAS of economically important traits in C. baccatum peppers. PMID:27857720

  9. Genome-Wide Divergence and Linkage Disequilibrium Analyses for Capsicum baccatum Revealed by Genome-Anchored Single Nucleotide Polymorphisms.

    PubMed

    Nimmakayala, Padma; Abburi, Venkata L; Saminathan, Thangasamy; Almeida, Aldo; Davenport, Brittany; Davidson, Joshua; Reddy, C V Chandra Mohan; Hankins, Gerald; Ebert, Andreas; Choi, Doil; Stommel, John; Reddy, Umesh K

    2016-01-01

    Principal component analysis (PCA) with 36,621 polymorphic genome-anchored single nucleotide polymorphisms (SNPs) identified collectively for Capsicum annuum and Capsicum baccatum was used to characterize population structure and species domestication of these two important incompatible cultivated pepper species. Estimated mean nucleotide diversity (π) and Tajima's D across various chromosomes revealed biased distribution toward negative values on all chromosomes (except for chromosome 4) in cultivated C. baccatum , indicating a population bottleneck during domestication of C. baccatum . In contrast, C. annuum chromosomes showed positive π and Tajima's D on all chromosomes except chromosome 8, which may be because of domestication at multiple sites contributing to wider genetic diversity. For C. baccatum , 13,129 SNPs were available, with minor allele frequency (MAF) ≥0.05; PCA of the SNPs revealed 283 C. baccatum accessions grouped into 3 distinct clusters, for strong population structure. The fixation index ( F ST ) between domesticated C. annuum and C. baccatum was 0.78, which indicates genome-wide divergence. We conducted extensive linkage disequilibrium (LD) analysis of C. baccatum var. pendulum cultivars on all adjacent SNP pairs within a chromosome to identify regions of high and low LD interspersed with a genome-wide average LD block size of 99.1 kb. We characterized 1742 haplotypes containing 4420 SNPs (range 9-2 SNPs per haplotype). Genome-wide association study (GWAS) of peduncle length, a trait that differentiates wild and domesticated C. baccatum types, revealed 36 significantly associated genome-wide SNPs. Population structure, identity by state (IBS) and LD patterns across the genome will be of potential use for future GWAS of economically important traits in C. baccatum peppers.

  10. Genome-wide association studies identify genetic loci for low von Willebrand factor levels

    PubMed Central

    van Loon, Janine; Dehghan, Abbas; Weihong, Tang; Trompet, Stella; McArdle, Wendy L; Asselbergs, Folkert F W; Chen, Ming-Huei; Lopez, Lorna M; Huffman, Jennifer E; Leebeek, Frank W G; Basu, Saonli; Stott, David J; Rumley, Ann; Gansevoort, Ron T; Davies, Gail; Wilson, James J F; Witteman, Jacqueline C M; Cao, Xiting; de Craen, Anton J M; Bakker, Stephan J L; Psaty, Bruce M; Starr, John M; Hofman, Albert; Wouter Jukema, J; Deary, Ian J; Hayward, Caroline; van der Harst, Pim; Lowe, Gordon D O; Folsom, Aaron R; Strachan, David P; Smith, Nicolas; de Maat, Moniek P M; O'Donnell, Christopher

    2016-01-01

    Low von Willebrand factor (VWF) levels are associated with bleeding symptoms and are a diagnostic criterion for von Willebrand disease, the most common inherited bleeding disorder. To date, it is unclear which genetic loci are associated with reduced VWF levels. Therefore, we conducted a meta-analysis of genome-wide association studies to identify genetic loci associated with low VWF levels. For this meta-analysis, we included 31 149 participants of European ancestry from 11 community-based studies. From all participants, VWF antigen (VWF:Ag) measurements and genome-wide single-nucleotide polymorphism (SNP) scans were available. Each study conducted analyses using logistic regression of SNPs on dichotomized VWF:Ag measures (lowest 5% for blood group O and non-O) with an additive genetic model adjusted for age and sex. An inverse-variance weighted meta-analysis was performed for VWF:Ag levels. A total of 97 SNPs exceeded the genome-wide significance threshold of 5 × 10−8 and comprised five loci on four different chromosomes: 6q24 (smallest P-value 5.8 × 10−10), 9q34 (2.4 × 10−64), 12p13 (5.3 × 10−22), 12q23 (1.2 × 10−8) and 13q13 (2.6 × 10−8). All loci were within or close to genes, including STXBP5 (Syntaxin Binding Protein 5) (6q24), STAB5 (stabilin-5) (12q23), ABO (9q34), VWF (12p13) and UFM1 (ubiquitin-fold modifier 1) (13q13). Of these, UFM1 has not been previously associated with VWF:Ag levels. Four genes that were previously associated with VWF levels (VWF, ABO, STXBP5 and STAB2) were also associated with low VWF levels, and, in addition, we identified a new gene, UFM1, that is associated with low VWF levels. These findings point to novel mechanisms for the occurrence of low VWF levels. PMID:26486471

  11. Genome-wide association study of alcohol dependence

    PubMed Central

    Treutlein, Jens; Cichon, Sven; Ridinger, Monika; Wodarz, Norbert; Soyka, Michael; Zill, Peter; Maier, Wolfgang; Moessner, Rainald; Gaebel, Wolfgang; Dahmen, Norbert; Fehr, Christoph; Scherbaum, Norbert; Steffens, Michael; Ludwig, Kerstin U.; Frank, Josef; Wichmann, H.- Erich; Schreiber, Stefan; Dragano, Nico; Sommer, Wolfgang; Leonardi-Essmann, Fernando; Lourdusamy, Anbarasu; Gebicke-Haerter, Peter; Wienker, Thomas F.; Sullivan, Patrick F.; Nöthen, Markus M.; Kiefer, Falk; Spanagel, Rainer; Mann, Karl; Rietschel, Marcella

    2014-01-01

    Context Identification of genes contributing to alcohol dependence will improve our understanding of the mechanisms underlying this disorder. Objective To identify susceptibility genes for alcohol dependence through a genome-wide association study (GWAS) and follow-up study in a population of German male inpatients with an early age at onset. Design The GWAS included 487 male inpatients with DSM-IV alcohol dependence with an age at onset below 28 years and 1,358 population based control individuals. The follow-up study included 1,024 male inpatients and 996 age-matched male controls. All subjects were of German descent. The GWAS tested 524,396 single nucleotide polymorphisms (SNPs). All SNPs with p<10-4 were subjected to the follow-up study. In addition, nominally significant SNPs from those genes that had also shown expression changes in rat brains after chronic alcohol consumption were selected for the follow-up step. Results The GWAS produced 121 SNPs with nominal p<10-4. These, together with 19 additional SNPs from homologs of rat genes showing differential expression, were genotyped in the follow-up sample. Fifteen SNPs showed significant association with the same allele as in the GWAS. In the combined analysis, two closely linked intergenic SNPs met genome-wide significance (rs7590720 p=9.72×10-9; rs1344694 p=1.69×10-8). They are located on chromosome 2q35, a region which has been implicated in linkage studies for alcohol phenotypes. Nine SNPs were located in genes, including CDH13 and ADH1C genes which have been reported to be associated with alcohol dependence. Conclusion This is the first GWAS and follow-up study to identify a genome-wide significant association in alcohol dependence. Further independent studies are required to confirm these findings. PMID:19581569

  12. Transcription facilitated genome-wide recruitment of topoisomerase I and DNA gyrase.

    PubMed

    Ahmed, Wareed; Sala, Claudia; Hegde, Shubhada R; Jha, Rajiv Kumar; Cole, Stewart T; Nagaraja, Valakunja

    2017-05-01

    Movement of the transcription machinery along a template alters DNA topology resulting in the accumulation of supercoils in DNA. The positive supercoils generated ahead of transcribing RNA polymerase (RNAP) and the negative supercoils accumulating behind impose severe topological constraints impeding transcription process. Previous studies have implied the role of topoisomerases in the removal of torsional stress and the maintenance of template topology but the in vivo interaction of functionally distinct topoisomerases with heterogeneous chromosomal territories is not deciphered. Moreover, how the transcription-induced supercoils influence the genome-wide recruitment of DNA topoisomerases remains to be explored in bacteria. Using ChIP-Seq, we show the genome-wide occupancy profile of both topoisomerase I and DNA gyrase in conjunction with RNAP in Mycobacterium tuberculosis taking advantage of minimal topoisomerase representation in the organism. The study unveils the first in vivo genome-wide interaction of both the topoisomerases with the genomic regions and establishes that transcription-induced supercoils govern their recruitment at genomic sites. Distribution profiles revealed co-localization of RNAP and the two topoisomerases on the active transcriptional units (TUs). At a given locus, topoisomerase I and DNA gyrase were localized behind and ahead of RNAP, respectively, correlating with the twin-supercoiled domains generated. The recruitment of topoisomerases was higher at the genomic loci with higher transcriptional activity and/or at regions under high torsional stress compared to silent genomic loci. Importantly, the occupancy of DNA gyrase, sole type II topoisomerase in Mtb, near the Ter domain of the Mtb chromosome validates its function as a decatenase.

  13. Transcription facilitated genome-wide recruitment of topoisomerase I and DNA gyrase

    PubMed Central

    Ahmed, Wareed; Sala, Claudia; Hegde, Shubhada R.; Jha, Rajiv Kumar

    2017-01-01

    Movement of the transcription machinery along a template alters DNA topology resulting in the accumulation of supercoils in DNA. The positive supercoils generated ahead of transcribing RNA polymerase (RNAP) and the negative supercoils accumulating behind impose severe topological constraints impeding transcription process. Previous studies have implied the role of topoisomerases in the removal of torsional stress and the maintenance of template topology but the in vivo interaction of functionally distinct topoisomerases with heterogeneous chromosomal territories is not deciphered. Moreover, how the transcription-induced supercoils influence the genome-wide recruitment of DNA topoisomerases remains to be explored in bacteria. Using ChIP-Seq, we show the genome-wide occupancy profile of both topoisomerase I and DNA gyrase in conjunction with RNAP in Mycobacterium tuberculosis taking advantage of minimal topoisomerase representation in the organism. The study unveils the first in vivo genome-wide interaction of both the topoisomerases with the genomic regions and establishes that transcription-induced supercoils govern their recruitment at genomic sites. Distribution profiles revealed co-localization of RNAP and the two topoisomerases on the active transcriptional units (TUs). At a given locus, topoisomerase I and DNA gyrase were localized behind and ahead of RNAP, respectively, correlating with the twin-supercoiled domains generated. The recruitment of topoisomerases was higher at the genomic loci with higher transcriptional activity and/or at regions under high torsional stress compared to silent genomic loci. Importantly, the occupancy of DNA gyrase, sole type II topoisomerase in Mtb, near the Ter domain of the Mtb chromosome validates its function as a decatenase. PMID:28463980

  14. Genome-wide analysis of tandem repeats in plants and green algae

    Treesearch

    Zhixin Zhao; Cheng Guo; Sreeskandarajan Sutharzan; Pei Li; Craig Echt; Jie Zhang; Chun Liang

    2014-01-01

    Tandem repeats (TRs) extensively exist in the genomes of prokaryotes and eukaryotes. Based on the sequenced genomes and gene annotations of 31 plant and algal species in Phytozome version 8.0 (http://www.phytozome.net/), we examined TRs in a genome-wide scale, characterized their distributions and motif features, and explored their putative biological functions. Among...

  15. Investigation of common, low-frequency and rare genome-wide variation in anorexia nervosa.

    PubMed

    Huckins, L M; Hatzikotoulas, K; Southam, L; Thornton, L M; Steinberg, J; Aguilera-McKay, F; Treasure, J; Schmidt, U; Gunasinghe, C; Romero, A; Curtis, C; Rhodes, D; Moens, J; Kalsi, G; Dempster, D; Leung, R; Keohane, A; Burghardt, R; Ehrlich, S; Hebebrand, J; Hinney, A; Ludolph, A; Walton, E; Deloukas, P; Hofman, A; Palotie, A; Palta, P; van Rooij, F J A; Stirrups, K; Adan, R; Boni, C; Cone, R; Dedoussis, G; van Furth, E; Gonidakis, F; Gorwood, P; Hudson, J; Kaprio, J; Kas, M; Keski-Rahonen, A; Kiezebrink, K; Knudsen, G-P; Slof-Op 't Landt, M C T; Maj, M; Monteleone, A M; Monteleone, P; Raevuori, A H; Reichborn-Kjennerud, T; Tozzi, F; Tsitsika, A; van Elburg, A; Collier, D A; Sullivan, P F; Breen, G; Bulik, C M; Zeggini, E

    2018-05-01

    Anorexia nervosa (AN) is a complex neuropsychiatric disorder presenting with dangerously low body weight, and a deep and persistent fear of gaining weight. To date, only one genome-wide significant locus associated with AN has been identified. We performed an exome-chip based genome-wide association studies (GWAS) in 2158 cases from nine populations of European origin and 15 485 ancestrally matched controls. Unlike previous studies, this GWAS also probed association in low-frequency and rare variants. Sixteen independent variants were taken forward for in silico and de novo replication (11 common and 5 rare). No findings reached genome-wide significance. Two notable common variants were identified: rs10791286, an intronic variant in OPCML (P=9.89 × 10 -6 ), and rs7700147, an intergenic variant (P=2.93 × 10 -5 ). No low-frequency variant associations were identified at genome-wide significance, although the study was well-powered to detect low-frequency variants with large effect sizes, suggesting that there may be no AN loci in this genomic search space with large effect sizes.

  16. Investigation of common, low-frequency and rare genome-wide variation in anorexia nervosa

    PubMed Central

    Huckins, L M; Hatzikotoulas, K; Southam, L; Thornton, L M; Steinberg, J; Aguilera-McKay, F; Treasure, J; Schmidt, U; Gunasinghe, C; Romero, A; Curtis, C; Rhodes, D; Moens, J; Kalsi, G; Dempster, D; Leung, R; Keohane, A; Burghardt, R; Ehrlich, S; Hebebrand, J; Hinney, A; Ludolph, A; Walton, E; Deloukas, P; Hofman, A; Palotie, A; Palta, P; van Rooij, F J A; Stirrups, K; Adan, R; Boni, C; Cone, R; Dedoussis, G; van Furth, E; Gonidakis, F; Gorwood, P; Hudson, J; Kaprio, J; Kas, M; Keski-Rahonen, A; Kiezebrink, K; Knudsen, G-P; Slof-Op 't Landt, M C T; Maj, M; Monteleone, A M; Monteleone, P; Raevuori, A H; Reichborn-Kjennerud, T; Tozzi, F; Tsitsika, A; van Elburg, A; Adan, R A H; Alfredsson, L; Ando, T; Andreassen, O A; Aschauer, H; Baker, J H; Barrett, J C; Bencko, V; Bergen, A W; Berrettini, W H; Birgegard, A; Boni, C; Boraska Perica, V; Brandt, H; Breen, G; Bulik, C M; Carlberg, L; Cassina, M; Cichon, S; Clementi, M; Cohen-Woods, S; Coleman, J; Cone, R D; Courtet, P; Crawford, S; Crow, S; Crowley, J; Danner, U N; Davis, O S P; de Zwaan, M; Dedoussis, G; Degortes, D; DeSocio, J E; Dick, D M; Dikeos, D; Dina, C; Ding, B; Dmitrzak-Weglarz, M; Docampo, E; Duncan, L; Egberts, K; Ehrlich, S; Escaramís, G; Esko, T; Espeseth, T; Estivill, X; Favaro, A; Fernández-Aranda, F; Fichter, M M; Finan, C; Fischer, K; Floyd, J A B; Foretova, L; Forzan, M; Franklin, C S; Gallinger, S; Gambaro, G; Gaspar, H A; Giegling, I; Gonidakis, F; Gorwood, P; Gratacos, M; Guillaume, S; Guo, Y; Hakonarson, H; Halmi, K A; Hatzikotoulas, K; Hauser, J; Hebebrand, J; Helder, S; Herms, S; Herpertz-Dahlmann, B; Herzog, W; Hilliard, C E; Hinney, A; Hübel, C; Huckins, L M; Hudson, J I; Huemer, J; Inoko, H; Janout, V; Jiménez-Murcia, S; Johnson, C; Julià, A; Juréus, A; Kalsi, G; Kaminska, D; Kaplan, A S; Kaprio, J; Karhunen, L; Karwautz, A; Kas, M J H; Kaye, W; Kennedy, J L; Keski-Rahkonen, A; Kiezebrink, K; Klareskog, L; Klump, K L; Knudsen, G P S; Koeleman, B P C; Koubek, D; La Via, M C; Landén, M; Le Hellard, S; Levitan, R D; Li, D; Lichtenstein, P; Lilenfeld, L; Lissowska, J; Lundervold, A; Magistretti, P; Maj, M; Mannik, K; Marsal, S; Martin, N; Mattingsdal, M; McDevitt, S; McGuffin, P; Merl, E; Metspalu, A; Meulenbelt, I; Micali, N; Mitchell, J; Mitchell, K; Monteleone, P; Monteleone, A M; Mortensen, P; Munn-Chernoff, M A; Navratilova, M; Nilsson, I; Norring, C; Ntalla, I; Ophoff, R A; O'Toole, J K; Palotie, A; Pante, J; Papezova, H; Pinto, D; Rabionet, R; Raevuori, A; Rajewski, A; Ramoz, N; Rayner, N W; Reichborn-Kjennerud, T; Ripatti, S; Roberts, M; Rotondo, A; Rujescu, D; Rybakowski, F; Santonastaso, P; Scherag, A; Scherer, S W; Schmidt, U; Schork, N J; Schosser, A; Slachtova, L; Sladek, R; Slagboom, P E; Slof-Op 't Landt, M C T; Slopien, A; Soranzo, N; Southam, L; Steen, V M; Strengman, E; Strober, M; Sullivan, P F; Szatkiewicz, J P; Szeszenia-Dabrowska, N; Tachmazidou, I; Tenconi, E; Thornton, L M; Tortorella, A; Tozzi, F; Treasure, J; Tsitsika, A; Tziouvas, K; van Elburg, A A; van Furth, E F; Wagner, G; Walton, E; Watson, H; Wichmann, H-E; Widen, E; Woodside, D B; Yanovski, J; Yao, S; Yilmaz, Z; Zeggini, E; Zerwas, S; Zipfel, S; Collier, D A; Sullivan, P F; Breen, G; Bulik, C M; Zeggini, E

    2018-01-01

    Anorexia nervosa (AN) is a complex neuropsychiatric disorder presenting with dangerously low body weight, and a deep and persistent fear of gaining weight. To date, only one genome-wide significant locus associated with AN has been identified. We performed an exome-chip based genome-wide association studies (GWAS) in 2158 cases from nine populations of European origin and 15 485 ancestrally matched controls. Unlike previous studies, this GWAS also probed association in low-frequency and rare variants. Sixteen independent variants were taken forward for in silico and de novo replication (11 common and 5 rare). No findings reached genome-wide significance. Two notable common variants were identified: rs10791286, an intronic variant in OPCML (P=9.89 × 10−6), and rs7700147, an intergenic variant (P=2.93 × 10−5). No low-frequency variant associations were identified at genome-wide significance, although the study was well-powered to detect low-frequency variants with large effect sizes, suggesting that there may be no AN loci in this genomic search space with large effect sizes. PMID:29155802

  17. Genomic selection and complex trait prediction using a fast EM algorithm applied to genome-wide markers

    PubMed Central

    2010-01-01

    Background The information provided by dense genome-wide markers using high throughput technology is of considerable potential in human disease studies and livestock breeding programs. Genome-wide association studies relate individual single nucleotide polymorphisms (SNP) from dense SNP panels to individual measurements of complex traits, with the underlying assumption being that any association is caused by linkage disequilibrium (LD) between SNP and quantitative trait loci (QTL) affecting the trait. Often SNP are in genomic regions of no trait variation. Whole genome Bayesian models are an effective way of incorporating this and other important prior information into modelling. However a full Bayesian analysis is often not feasible due to the large computational time involved. Results This article proposes an expectation-maximization (EM) algorithm called emBayesB which allows only a proportion of SNP to be in LD with QTL and incorporates prior information about the distribution of SNP effects. The posterior probability of being in LD with at least one QTL is calculated for each SNP along with estimates of the hyperparameters for the mixture prior. A simulated example of genomic selection from an international workshop is used to demonstrate the features of the EM algorithm. The accuracy of prediction is comparable to a full Bayesian analysis but the EM algorithm is considerably faster. The EM algorithm was accurate in locating QTL which explained more than 1% of the total genetic variation. A computational algorithm for very large SNP panels is described. Conclusions emBayesB is a fast and accurate EM algorithm for implementing genomic selection and predicting complex traits by mapping QTL in genome-wide dense SNP marker data. Its accuracy is similar to Bayesian methods but it takes only a fraction of the time. PMID:20969788

  18. Genome-wide association study of rice grain width variation.

    PubMed

    Zheng, Xiao-Ming; Gong, Tingting; Ou, Hong-Ling; Xue, Dayuan; Qiao, Weihua; Wang, Junrui; Liu, Sha; Yang, Qingwen; Olsen, Kenneth M

    2018-04-01

    Seed size is variable within many plant species, and understanding the underlying genetic factors can provide insights into mechanisms of local environmental adaptation. Here we make use of the abundant genomic and germplasm resources available for rice (Oryza sativa) to perform a large-scale genome-wide association study (GWAS) of grain width. Grain width varies widely within the crop and is also known to show climate-associated variation across populations of its wild progenitor. Using a filtered dataset of >1.9 million genome-wide SNPs in a sample of 570 cultivated and wild rice accessions, we performed GWAS with two complementary models, GLM and MLM. The models yielded 10 and 33 significant associations, respectively, and jointly yielded seven candidate locus regions, two of which have been previously identified. Analyses of nucleotide diversity and haplotype distributions at these loci revealed signatures of selection and patterns consistent with adaptive introgression of grain width alleles across rice variety groups. The results provide a 50% increase in the total number of rice grain width loci mapped to date and support a polygenic model whereby grain width is shaped by gene-by-environment interactions. These loci can potentially serve as candidates for studies of adaptive seed size variation in wild grass species.

  19. AID/APOBEC cytosine deaminase induces genome-wide kataegis

    PubMed Central

    2012-01-01

    Clusters of localized hypermutation in human breast cancer genomes, named “kataegis” (from the Greek for thunderstorm), are hypothesized to result from multiple cytosine deaminations catalyzed by AID/APOBEC proteins. However, a direct link between APOBECs and kataegis is still lacking. We have sequenced the genomes of yeast mutants induced in diploids by expression of the gene for PmCDA1, a hypermutagenic deaminase from sea lamprey. Analysis of the distribution of 5,138 induced mutations revealed localized clusters very similar to those found in tumors. Our data provide evidence that unleashed cytosine deaminase activity is an evolutionary conserved, prominent source of genome-wide kataegis events. Reviewers This article was reviewed by: Professor Sandor Pongor, Professor Shamil R. Sunyaev, and Dr Vladimir Kuznetsov. PMID:23249472

  20. Arabidopsis transcription factors: genome-wide comparative analysis among eukaryotes.

    PubMed

    Riechmann, J L; Heard, J; Martin, G; Reuber, L; Jiang, C; Keddie, J; Adam, L; Pineda, O; Ratcliffe, O J; Samaha, R R; Creelman, R; Pilgrim, M; Broun, P; Zhang, J Z; Ghandehari, D; Sherman, B K; Yu, G

    2000-12-15

    The completion of the Arabidopsis thaliana genome sequence allows a comparative analysis of transcriptional regulators across the three eukaryotic kingdoms. Arabidopsis dedicates over 5% of its genome to code for more than 1500 transcription factors, about 45% of which are from families specific to plants. Arabidopsis transcription factors that belong to families common to all eukaryotes do not share significant similarity with those of the other kingdoms beyond the conserved DNA binding domains, many of which have been arranged in combinations specific to each lineage. The genome-wide comparison reveals the evolutionary generation of diversity in the regulation of transcription.

  1. Genome-Wide Association Study and Linkage Analysis of the Healthy Aging Index.

    PubMed

    Minster, Ryan L; Sanders, Jason L; Singh, Jatinder; Kammerer, Candace M; Barmada, M Michael; Matteini, Amy M; Zhang, Qunyuan; Wojczynski, Mary K; Daw, E Warwick; Brody, Jennifer A; Arnold, Alice M; Lunetta, Kathryn L; Murabito, Joanne M; Christensen, Kaare; Perls, Thomas T; Province, Michael A; Newman, Anne B

    2015-08-01

    The Healthy Aging Index (HAI) is a tool for measuring the extent of health and disease across multiple systems. We conducted a genome-wide association study and a genome-wide linkage analysis to map quantitative trait loci associated with the HAI and a modified HAI weighted for mortality risk in 3,140 individuals selected for familial longevity from the Long Life Family Study. The genome-wide association study used the Long Life Family Study as the discovery cohort and individuals from the Cardiovascular Health Study and the Framingham Heart Study as replication cohorts. There were no genome-wide significant findings from the genome-wide association study; however, several single-nucleotide polymorphisms near ZNF704 on chromosome 8q21.13 were suggestively associated with the HAI in the Long Life Family Study (p < 10(-) (6)) and nominally replicated in the Cardiovascular Health Study and Framingham Heart Study. Linkage results revealed significant evidence (log-odds score = 3.36) for a quantitative trait locus for mortality-optimized HAI in women on chromosome 9p24-p23. However, results of fine-mapping studies did not implicate any specific candidate genes within this region of interest. ZNF704 may be a potential candidate gene for studies of the genetic underpinnings of longevity. © The Author 2015. Published by Oxford University Press on behalf of The Gerontological Society of America. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  2. Genome-wide scan in Hispanics highlights candidate loci for brain white matter hyperintensities

    PubMed Central

    Beecham, Ashley; Dong, Chuanhui; Wright, Clinton B.; Dueker, Nicole; Brickman, Adam M.; Wang, Liyong; DeCarli, Charles; Blanton, Susan H.; Rundek, Tatjana; Mayeux, Richard

    2017-01-01

    Objective: To investigate genetic variants influencing white matter hyperintensities (WMHs) in the understudied Hispanic population. Methods: Using 6.8 million single nucleotide polymorphisms (SNPs), we conducted a genome-wide association study (GWAS) to identify SNPs associated with WMH volume (WMHV) in 922 Hispanics who underwent brain MRI as a cross-section of 2 community-based cohorts in the Northern Manhattan Study and the Washington Heights–Inwood Columbia Aging Project. Multiple linear modeling with PLINK was performed to examine the additive genetic effects on ln(WMHV) after controlling for age, sex, total intracranial volume, and principal components of ancestry. Gene-based tests of association were performed using VEGAS. Replication was performed in independent samples of Europeans, African Americans, and Asians. Results: From the SNP analysis, a total of 17 independent SNPs in 7 genes had suggestive evidence of association with WMHV in Hispanics (p < 1 × 10−5) and 5 genes from the gene-based analysis with p < 1 × 10−3. One SNP (rs9957475 in GATA6) and 1 gene (UBE2C) demonstrated evidence of association (p < 0.05) in the African American sample. Four SNPs with p < 1 × 10−5 were shown to affect binding of SPI1 using RegulomeDB. Conclusions: This GWAS of 2 community-based Hispanic cohorts revealed several novel WMH-associated genetic variants. Further replication is needed in independent Hispanic samples to validate these suggestive associations, and fine mapping is needed to pinpoint causal variants. PMID:28975155

  3. Genome-Wide Mapping of Loci Explaining Variance in Scrotal Circumference in Nellore Cattle

    PubMed Central

    Utsunomiya, Yuri T.; Carmo, Adriana S.; Neves, Haroldo H. R.; Carvalheiro, Roberto; Matos, Márcia C.; Zavarez, Ludmilla B.; Ito, Pier K. R. K.; Pérez O'Brien, Ana M.; Sölkner, Johann; Porto-Neto, Laercio R.; Schenkel, Flávio S.; McEwan, John; Cole, John B.; da Silva, Marcos V. G. B.; Van Tassell, Curtis P.; Sonstegard, Tad S.; Garcia, José Fernando

    2014-01-01

    The reproductive performance of bulls has a high impact on the beef cattle industry. Scrotal circumference (SC) is the most recorded reproductive trait in beef herds, and is used as a major selection criterion to improve precocity and fertility. The characterization of genomic regions affecting SC can contribute to the identification of diagnostic markers for reproductive performance and uncover molecular mechanisms underlying complex aspects of bovine reproductive biology. In this paper, we report a genome-wide scan for chromosome segments explaining differences in SC, using data of 861 Nellore bulls (Bos indicus) genotyped for over 777,000 single nucleotide polymorphisms. Loci that excel from the genome background were identified on chromosomes 4, 6, 7, 10, 14, 18 and 21. The majority of these regions were previously found to be associated with reproductive and body size traits in cattle. The signal on chromosome 14 replicates the pleiotropic quantitative trait locus encompassing PLAG1 that affects male fertility in cattle and stature in several species. Based on intensive literature mining, SP4, MAGEL2, SH3RF2, PDE5A and SNAI2 are proposed as novel candidate genes for SC, as they affect growth and testicular size in other animal models. These findings contribute to linking reproductive phenotypes to gene functions, and may offer new insights on the molecular biology of male fertility. PMID:24558400

  4. A Genome-Wide Search for Linkage of Estimated Glomerular Filtration Rate (eGFR) in the Family Investigation of Nephropathy and Diabetes (FIND)

    PubMed Central

    Thameem, Farook; Igo, Robert P.; Freedman, Barry I.; Langefeld, Carl; Hanson, Robert L.; Schelling, Jeffrey R.; Elston, Robert C.; Duggirala, Ravindranath; Nicholas, Susanne B.; Goddard, Katrina A. B.; Divers, Jasmin; Guo, Xiuqing; Ipp, Eli; Kimmel, Paul L.; Meoni, Lucy A.; Shah, Vallabh O.; Smith, Michael W.; Winkler, Cheryl A.; Zager, Philip G.; Knowler, William C.; Nelson, Robert G.; Pahl, Madeline V.; Parekh, Rulan S.; Kao, W. H. Linda; Rasooly, Rebekah S.; Adler, Sharon G.; Abboud, Hanna E.; Iyengar, Sudha K.; Sedor, John R.

    2013-01-01

    Objective Estimated glomerular filtration rate (eGFR), a measure of kidney function, is heritable, suggesting that genes influence renal function. Genes that influence eGFR have been identified through genome-wide association studies. However, family-based linkage approaches may identify loci that explain a larger proportion of the heritability. This study used genome-wide linkage and association scans to identify quantitative trait loci (QTL) that influence eGFR. Methods Genome-wide linkage and sparse association scans of eGFR were performed in families ascertained by probands with advanced diabetic nephropathy (DN) from the multi-ethnic Family Investigation of Nephropathy and Diabetes (FIND) study. This study included 954 African Americans (AA), 781 American Indians (AI), 614 European Americans (EA) and 1,611 Mexican Americans (MA). A total of 3,960 FIND participants were genotyped for 6,000 single nucleotide polymorphisms (SNPs) using the Illumina Linkage IVb panel. GFR was estimated by the Modification of Diet in Renal Disease (MDRD) formula. Results The non-parametric linkage analysis, accounting for the effects of diabetes duration and BMI, identified the strongest evidence for linkage of eGFR on chromosome 20q11 (log of the odds [LOD] = 3.34; P = 4.4×10−5) in MA and chromosome 15q12 (LOD = 2.84; P = 1.5×10−4) in EA. In all subjects, the strongest linkage signal for eGFR was detected on chromosome 10p12 (P = 5.5×10−4) at 44 cM near marker rs1339048. A subsequent association scan in both ancestry-specific groups and the entire population identified several SNPs significantly associated with eGFR across the genome. Conclusion The present study describes the localization of QTL influencing eGFR on 20q11 in MA, 15q21 in EA and 10p12 in the combined ethnic groups participating in the FIND study. Identification of causal genes/variants influencing eGFR, within these linkage and association loci, will open new avenues for functional

  5. Citalopram and escitalopram plasma drug and metabolite concentrations: genome-wide associations

    PubMed Central

    Ji, Yuan; Schaid, Daniel J; Desta, Zeruesenay; Kubo, Michiaki; Batzler, Anthony J; Snyder, Karen; Mushiroda, Taisei; Kamatani, Naoyuki; Ogburn, Evan; Hall-Flavin, Daniel; Flockhart, David; Nakamura, Yusuke; Mrazek, David A; Weinshilboum, Richard M

    2014-01-01

    Aims Citalopram (CT) and escitalopram (S-CT) are among the most widely prescribed selective serotonin reuptake inhibitors used to treat major depressive disorder (MDD). We applied a genome-wide association study to identify genetic factors that contribute to variation in plasma concentrations of CT or S-CT and their metabolites in MDD patients treated with CT or S-CT. Methods Our genome-wide association study was performed using samples from 435 MDD patients. Linear mixed models were used to account for within-subject correlations of longitudinal measures of plasma drug/metabolite concentrations (4 and 8 weeks after the initiation of drug therapy), and single-nucleotide polymorphisms (SNPs) were modelled as additive allelic effects. Results Genome-wide significant associations were observed for S-CT concentration with SNPs in or near the CYP2C19 gene on chromosome 10 (rs1074145, P = 4.1 × 10−9) and with S-didesmethylcitalopram concentration for SNPs near the CYP2D6 locus on chromosome 22 (rs1065852, P = 2.0 × 10−16), supporting the important role of these cytochrome P450 (CYP) enzymes in biotransformation of citalopram. After adjustment for the effect of CYP2C19 functional alleles, the analyses also identified novel loci that will require future replication and functional validation. Conclusions In vitro and in vivo studies have suggested that the biotransformation of CT to monodesmethylcitalopram and didesmethylcitalopram is mediated by CYP isozymes. The results of our genome-wide association study performed in MDD patients treated with CT or S-CT have confirmed those observations but also identified novel genomic loci that might play a role in variation in plasma levels of CT or its metabolites during the treatment of MDD patients with these selective serotonin reuptake inhibitors. PMID:24528284

  6. Citalopram and escitalopram plasma drug and metabolite concentrations: genome-wide associations.

    PubMed

    Ji, Yuan; Schaid, Daniel J; Desta, Zeruesenay; Kubo, Michiaki; Batzler, Anthony J; Snyder, Karen; Mushiroda, Taisei; Kamatani, Naoyuki; Ogburn, Evan; Hall-Flavin, Daniel; Flockhart, David; Nakamura, Yusuke; Mrazek, David A; Weinshilboum, Richard M

    2014-08-01

    Citalopram (CT) and escitalopram (S-CT) are among the most widely prescribed selective serotonin reuptake inhibitors used to treat major depressive disorder (MDD). We applied a genome-wide association study to identify genetic factors that contribute to variation in plasma concentrations of CT or S-CT and their metabolites in MDD patients treated with CT or S-CT. Our genome-wide association study was performed using samples from 435 MDD patients. Linear mixed models were used to account for within-subject correlations of longitudinal measures of plasma drug/metabolite concentrations (4 and 8 weeks after the initiation of drug therapy), and single-nucleotide polymorphisms (SNPs) were modelled as additive allelic effects. Genome-wide significant associations were observed for S-CT concentration with SNPs in or near the CYP2C19 gene on chromosome 10 (rs1074145, P = 4.1 × 10(-9) ) and with S-didesmethylcitalopram concentration for SNPs near the CYP2D6 locus on chromosome 22 (rs1065852, P = 2.0 × 10(-16) ), supporting the important role of these cytochrome P450 (CYP) enzymes in biotransformation of citalopram. After adjustment for the effect of CYP2C19 functional alleles, the analyses also identified novel loci that will require future replication and functional validation. In vitro and in vivo studies have suggested that the biotransformation of CT to monodesmethylcitalopram and didesmethylcitalopram is mediated by CYP isozymes. The results of our genome-wide association study performed in MDD patients treated with CT or S-CT have confirmed those observations but also identified novel genomic loci that might play a role in variation in plasma levels of CT or its metabolites during the treatment of MDD patients with these selective serotonin reuptake inhibitors. © 2014 The British Pharmacological Society.

  7. A genome-wide approach to children's aggressive behavior: The EAGLE consortium.

    PubMed

    Pappa, Irene; St Pourcain, Beate; Benke, Kelly; Cavadino, Alana; Hakulinen, Christian; Nivard, Michel G; Nolte, Ilja M; Tiesler, Carla M T; Bakermans-Kranenburg, Marian J; Davies, Gareth E; Evans, David M; Geoffroy, Marie-Claude; Grallert, Harald; Groen-Blokhuis, Maria M; Hudziak, James J; Kemp, John P; Keltikangas-Järvinen, Liisa; McMahon, George; Mileva-Seitz, Viara R; Motazedi, Ehsan; Power, Christine; Raitakari, Olli T; Ring, Susan M; Rivadeneira, Fernando; Rodriguez, Alina; Scheet, Paul A; Seppälä, Ilkka; Snieder, Harold; Standl, Marie; Thiering, Elisabeth; Timpson, Nicholas J; Veenstra, René; Velders, Fleur P; Whitehouse, Andrew J O; Smith, George Davey; Heinrich, Joachim; Hypponen, Elina; Lehtimäki, Terho; Middeldorp, Christel M; Oldehinkel, Albertine J; Pennell, Craig E; Boomsma, Dorret I; Tiemeier, Henning

    2016-07-01

    Individual differences in aggressive behavior emerge in early childhood and predict persisting behavioral problems and disorders. Studies of antisocial and severe aggression in adulthood indicate substantial underlying biology. However, little attention has been given to genome-wide approaches of aggressive behavior in children. We analyzed data from nine population-based studies and assessed aggressive behavior using well-validated parent-reported questionnaires. This is the largest sample exploring children's aggressive behavior to date (N = 18,988), with measures in two developmental stages (N = 15,668 early childhood and N = 16,311 middle childhood/early adolescence). First, we estimated the additive genetic variance of children's aggressive behavior based on genome-wide SNP information, using genome-wide complex trait analysis (GCTA). Second, genetic associations within each study were assessed using a quasi-Poisson regression approach, capturing the highly right-skewed distribution of aggressive behavior. Third, we performed meta-analyses of genome-wide associations for both the total age-mixed sample and the two developmental stages. Finally, we performed a gene-based test using the summary statistics of the total sample. GCTA quantified variance tagged by common SNPs (10-54%). The meta-analysis of the total sample identified one region in chromosome 2 (2p12) at near genome-wide significance (top SNP rs11126630, P = 5.30 × 10(-8) ). The separate meta-analyses of the two developmental stages revealed suggestive evidence of association at the same locus. The gene-based analysis indicated association of variation within AVPR1A with aggressive behavior. We conclude that common variants at 2p12 show suggestive evidence for association with childhood aggression. Replication of these initial findings is needed, and further studies should clarify its biological meaning. © 2015 Wiley Periodicals, Inc. © 2015 Wiley Periodicals, Inc.

  8. Genome-wide comparisons of phylogenetic similarities between partial genomic regions and the full-length genome in Hepatitis E virus genotyping.

    PubMed

    Wang, Shuai; Wei, Wei; Luo, Xuenong; Cai, Xuepeng

    2014-01-01

    Besides the complete genome, different partial genomic sequences of Hepatitis E virus (HEV) have been used in genotyping studies, making it difficult to compare the results based on them. No commonly agreed partial region for HEV genotyping has been determined. In this study, we used a statistical method to evaluate the phylogenetic performance of each partial genomic sequence from a genome wide, by comparisons of evolutionary distances between genomic regions and the full-length genomes of 101 HEV isolates to identify short genomic regions that can reproduce HEV genotype assignments based on full-length genomes. Several genomic regions, especially one genomic region at the 3'-terminal of the papain-like cysteine protease domain, were detected to have relatively high phylogenetic correlations with the full-length genome. Phylogenetic analyses confirmed the identical performances between these regions and the full-length genome in genotyping, in which the HEV isolates involved could be divided into reasonable genotypes. This analysis may be of value in developing a partial sequence-based consensus classification of HEV species.

  9. GUIDE-Seq enables genome-wide profiling of off-target cleavage by CRISPR-Cas nucleases

    PubMed Central

    Nguyen, Nhu T.; Liebers, Matthew; Topkar, Ved V.; Thapar, Vishal; Wyvekens, Nicolas; Khayter, Cyd; Iafrate, A. John; Le, Long P.; Aryee, Martin J.; Joung, J. Keith

    2014-01-01

    CRISPR RNA-guided nucleases (RGNs) are widely used genome-editing reagents, but methods to delineate their genome-wide off-target cleavage activities have been lacking. Here we describe an approach for global detection of DNA double-stranded breaks (DSBs) introduced by RGNs and potentially other nucleases. This method, called Genome-wide Unbiased Identification of DSBs Enabled by Sequencing (GUIDE-Seq), relies on capture of double-stranded oligodeoxynucleotides into breaks Application of GUIDE-Seq to thirteen RGNs in two human cell lines revealed wide variability in RGN off-target activities and unappreciated characteristics of off-target sequences. The majority of identified sites were not detected by existing computational methods or ChIP-Seq. GUIDE-Seq also identified RGN-independent genomic breakpoint ‘hotspots’. Finally, GUIDE-Seq revealed that truncated guide RNAs exhibit substantially reduced RGN-induced off-target DSBs. Our experiments define the most rigorous framework for genome-wide identification of RGN off-target effects to date and provide a method for evaluating the safety of these nucleases prior to clinical use. PMID:25513782

  10. Evaluation of the genetic overlap between osteoarthritis with body mass index and height using genome-wide association scan data.

    PubMed

    Elliott, Katherine S; Chapman, Kay; Day-Williams, Aaron; Panoutsopoulou, Kalliope; Southam, Lorraine; Lindgren, Cecilia M; Arden, Nigel; Aslam, Nadim; Birrell, Fraser; Carluke, Ian; Carr, Andrew; Deloukas, Panos; Doherty, Michael; Loughlin, John; McCaskie, Andrew; Ollier, William E R; Rai, Ashok; Ralston, Stuart; Reed, Mike R; Spector, Timothy D; Valdes, Ana M; Wallis, Gillian A; Wilkinson, Mark; Zeggini, Eleftheria

    2013-06-01

    Obesity as measured by body mass index (BMI) is one of the major risk factors for osteoarthritis. In addition, genetic overlap has been reported between osteoarthritis and normal adult height variation. We investigated whether this relationship is due to a shared genetic aetiology on a genome-wide scale. We compared genetic association summary statistics (effect size, p value) for BMI and height from the GIANT consortium genome-wide association study (GWAS) with genetic association summary statistics from the arcOGEN consortium osteoarthritis GWAS. Significance was evaluated by permutation. Replication of osteoarthritis association of the highlighted signals was investigated in an independent dataset. Phenotypic information of height and BMI was accounted for in a separate analysis using osteoarthritis-free controls. We found significant overlap between osteoarthritis and height (p=3.3×10(-5) for signals with p≤0.05) when the GIANT and arcOGEN GWAS were compared. For signals with p≤0.001 we found 17 shared signals between osteoarthritis and height and four between osteoarthritis and BMI. However, only one of the height or BMI signals that had shown evidence of association with osteoarthritis in the arcOGEN GWAS was also associated with osteoarthritis in the independent dataset: rs12149832, within the FTO gene (combined p=2.3×10(-5)). As expected, this signal was attenuated when we adjusted for BMI. We found a significant excess of shared signals between both osteoarthritis and height and osteoarthritis and BMI, suggestive of a common genetic aetiology. However, only one signal showed association with osteoarthritis when followed up in a new dataset.

  11. Genome-wide linkage scan for contraction velocity characteristics of knee musculature in the Leuven Genes for Muscular Strength Study.

    PubMed

    De Mars, Gunther; Windelinckx, An; Huygens, Wim; Peeters, Maarten W; Beunen, Gaston P; Aerssens, Jeroen; Vlietinck, Robert; Thomis, Martine A I

    2008-09-17

    The torque-velocity relationship is known to be affected by ageing, decreasing its protective role in the prevention of falls. Interindividual variability in this torque-velocity relationship is partly determined by genetic factors (h(2): 44-67%). As a first attempt, this genome-wide linkage study aimed to identify chromosomal regions linked to the torque-velocity relationship of the knee flexors and extensors. A selection of 283 informative male siblings (17-36 yr), belonging to 105 families, was used to conduct a genome-wide SNP-based (Illumina Linkage IVb panel) multipoint linkage analysis for the torque-velocity relationship of the knee flexors and extensors. The strongest evidence for linkage was found at 15q23 for the torque-velocity slope of the knee extensors (TVSE). Other interesting linkage regions with LOD scores >2 were found at 7p12.3 [logarithm of the odds ratio (LOD) = 2.03, P = 0.0011] for the torque-velocity ratio of the knee flexors (TVRF), at 2q14.3 (LOD = 2.25, P = 0.0006) for TVSE, and at 4p14 and 18q23 for the torque-velocity ratio of the knee extensors TVRE (LOD = 2.23 and 2.08; P = 0.0007 and 0.001, respectively). We conclude that many small contributing genes are involved in causing variation in the torque-velocity relationship of the knee flexor and extensor muscles. Several earlier reported candidate genes for muscle strength and muscle mass and new candidates are harbored within or in close vicinity of the linkage regions reported in the present study.

  12. Genome-wide association as a means to understanding the mammary gland

    USDA-ARS?s Scientific Manuscript database

    Next-generation sequencing and related technologies have facilitated the creation of enormous public databases that catalogue genomic variation. These databases have facilitated a variety of approaches to discover new genes that regulate normal biology as well as disease. Genome wide association (...

  13. Genome-wide association analysis of red blood cell traits in African Americans: the COGENT Network

    PubMed Central

    Chen, Zhao; Tang, Hua; Qayyum, Rehan; Schick, Ursula M.; Nalls, Michael A.; Handsaker, Robert; Li, Jin; Lu, Yingchang; Yanek, Lisa R.; Keating, Brendan; Meng, Yan; van Rooij, Frank J.A.; Okada, Yukinori; Kubo, Michiaki; Rasmussen-Torvik, Laura; Keller, Margaux F.; Lange, Leslie; Evans, Michele; Bottinger, Erwin P.; Linderman, Michael D.; Ruderfer, Douglas M.; Hakonarson, Hakon; Papanicolaou, George; Zonderman, Alan B.; Gottesman, Omri; Thomson, Cynthia; Ziv, Elad; Singleton, Andrew B.; Loos, Ruth J.F.; Sleiman, Patrick M.A.; Ganesh, Santhi; McCarroll, Steven; Becker, Diane M.; Wilson, James G.; Lettre, Guillaume; Reiner, Alexander P.

    2013-01-01

    Laboratory red blood cell (RBC) measurements are clinically important, heritable and differ among ethnic groups. To identify genetic variants that contribute to RBC phenotypes in African Americans (AAs), we conducted a genome-wide association study in up to ∼16 500 AAs. The alpha-globin locus on chromosome 16pter [lead SNP rs13335629 in ITFG3 gene; P < 1E−13 for hemoglobin (Hgb), RBC count, mean corpuscular volume (MCV), MCH and MCHC] and the G6PD locus on Xq28 [lead SNP rs1050828; P < 1E − 13 for Hgb, hematocrit (Hct), MCV, RBC count and red cell distribution width (RDW)] were each associated with multiple RBC traits. At the alpha-globin region, both the common African 3.7 kb deletion and common single nucleotide polymorphisms (SNPs) appear to contribute independently to RBC phenotypes among AAs. In the 2p21 region, we identified a novel variant of PRKCE distinctly associated with Hct in AAs. In a genome-wide admixture mapping scan, local European ancestry at the 6p22 region containing HFE and LRRC16A was associated with higher Hgb. LRRC16A has been previously associated with the platelet count and mean platelet volume in AAs, but not with Hgb. Finally, we extended to AAs the findings of association of erythrocyte traits with several loci previously reported in Europeans and/or Asians, including CD164 and HBS1L-MYB. In summary, this large-scale genome-wide analysis in AAs has extended the importance of several RBC-associated genetic loci to AAs and identified allelic heterogeneity and pleiotropy at several previously known genetic loci associated with blood cell traits in AAs. PMID:23446634

  14. Genome-Wide Association Study to Identify Common Variants Associated with Brachial Circumference: A Meta-Analysis of 14 Cohorts

    PubMed Central

    Boraska, Vesna; Day-Williams, Aaron; Franklin, Christopher S.; Elliott, Katherine S.; Panoutsopoulou, Kalliope; Tachmazidou, Ioanna; Albrecht, Eva; Bandinelli, Stefania; Beilin, Lawrence J.; Bochud, Murielle; Cadby, Gemma; Ernst, Florian; Evans, David M.; Hayward, Caroline; Hicks, Andrew A.; Huffman, Jennifer; Huth, Cornelia; James, Alan L.; Klopp, Norman; Kolcic, Ivana; Kutalik, Zoltán; Lawlor, Debbie A.; Musk, Arthur W.; Pehlic, Marina; Pennell, Craig E.; Perry, John R. B.; Peters, Annette; Polasek, Ozren; Pourcain, Beate St; Ring, Susan M.; Salvi, Erika; Schipf, Sabine; Staessen, Jan A.; Teumer, Alexander; Timpson, Nicholas; Vitart, Veronique; Warrington, Nicole M.; Yaghootkar, Hanieh; Zemunik, Tatijana; Zgaga, Lina; An, Ping; Anttila, Verneri; Borecki, Ingrid B.; Holmen, Jostein; Ntalla, Ioanna; Palotie, Aarno; Pietiläinen, Kirsi H.; Wedenoja, Juho; Winsvold, Bendik S.; Dedoussis, George V.; Kaprio, Jaakko; Province, Michael A.; Zwart, John-Anker; Burnier, Michel; Campbell, Harry; Cusi, Daniele; Davey Smith, George; Frayling, Timothy M.; Gieger, Christian; Palmer, Lyle J.; Pramstaller, Peter P.; Rudan, Igor; Völzke, Henry; Wichmann, H. -Erich; Wright, Alan F.; Zeggini, Eleftheria

    2012-01-01

    Brachial circumference (BC), also known as upper arm or mid arm circumference, can be used as an indicator of muscle mass and fat tissue, which are distributed differently in men and women. Analysis of anthropometric measures of peripheral fat distribution such as BC could help in understanding the complex pathophysiology behind overweight and obesity. The purpose of this study is to identify genetic variants associated with BC through a large-scale genome-wide association scan (GWAS) meta-analysis. We used fixed-effects meta-analysis to synthesise summary results across 14 GWAS discovery and 4 replication cohorts comprising overall 22,376 individuals (12,031 women and 10,345 men) of European ancestry. Individual analyses were carried out for men, women, and combined across sexes using linear regression and an additive genetic model: adjusted for age and adjusted for age and BMI. We prioritised signals for follow-up in two-stages. We did not detect any signals reaching genome-wide significance. The FTO rs9939609 SNP showed nominal evidence for association (p<0.05) in the age-adjusted strata for men and across both sexes. In this first GWAS meta-analysis for BC to date, we have not identified any genome-wide significant signals and do not observe robust association of previously established obesity loci with BC. Large-scale collaborations will be necessary to achieve higher power to detect loci underlying BC. PMID:22479309

  15. Genome-Wide Association Mapping for Intelligence in Military Working Dogs: Canine Cohort, Canine Intelligence Assessment Regimen, Genome-Wide Single Nucleotide Polymorphism (SNP) Typing, and Unsupervised Classification Algorithm for Genome-Wide Association Data Analysis

    DTIC Science & Technology

    2011-09-01

    Almasy, L, Blangero, J. (2009) Human QTL linkage mapping. Genetica 136:333-340. Amos, CI. (2007) Successful design and conduct of genome-wide...quantitative trait loci. Genetica 136:237-243. Skol AD, Scott LJ, Abecasis GR, Boehnke M. (2006) Joint analysis is more efficient than replication

  16. Genome-wide selection components analysis in a fish with male pregnancy.

    PubMed

    Flanagan, Sarah P; Jones, Adam G

    2017-04-01

    A major goal of evolutionary biology is to identify the genome-level targets of natural and sexual selection. With the advent of next-generation sequencing, whole-genome selection components analysis provides a promising avenue in the search for loci affected by selection in nature. Here, we implement a genome-wide selection components analysis in the sex role reversed Gulf pipefish, Syngnathus scovelli. Our approach involves a double-digest restriction-site associated DNA sequencing (ddRAD-seq) technique, applied to adult females, nonpregnant males, pregnant males, and their offspring. An F ST comparison of allele frequencies among these groups reveals 47 genomic regions putatively experiencing sexual selection, as well as 468 regions showing a signature of differential viability selection between males and females. A complementary likelihood ratio test identifies similar patterns in the data as the F ST analysis. Sexual selection and viability selection both tend to favor the rare alleles in the population. Ultimately, we conclude that genome-wide selection components analysis can be a useful tool to complement other approaches in the effort to pinpoint genome-level targets of selection in the wild. © 2017 The Author(s). Evolution © 2017 The Society for the Study of Evolution.

  17. Genome-wide expression profiling in pediatric septic shock

    PubMed Central

    Wong, Hector R.

    2013-01-01

    For nearly a decade, our research group has had the privilege of developing and mining a multi-center, microarray-based, genome-wide expression database of critically ill children (≤ 10 years of age) with septic shock. Using bioinformatic and systems biology approaches, the expression data generated through this discovery-oriented, exploratory approach have been leveraged for a variety of objectives, which will be reviewed. Fundamental observations include wide spread repression of gene programs corresponding to the adaptive immune system, and biologically significant differential patterns of gene expression across developmental age groups. The data have also identified gene expression-based subclasses of pediatric septic shock having clinically relevant phenotypic differences. The data have also been leveraged for the discovery of novel therapeutic targets, and for the discovery and development of novel stratification and diagnostic biomarkers. Almost a decade of genome-wide expression profiling in pediatric septic shock is now demonstrating tangible results. The studies have progressed from an initial discovery-oriented and exploratory phase, to a new phase where the data are being translated and applied to address several areas of clinical need. PMID:23329198

  18. A Genome-Wide Association Study for Regulators of Micronucleus Formation in Mice.

    PubMed

    McIntyre, Rebecca E; Nicod, Jérôme; Robles-Espinoza, Carla Daniela; Maciejowski, John; Cai, Na; Hill, Jennifer; Verstraten, Ruth; Iyer, Vivek; Rust, Alistair G; Balmus, Gabriel; Mott, Richard; Flint, Jonathan; Adams, David J

    2016-08-09

    In mammals the regulation of genomic instability plays a key role in tumor suppression and also controls genome plasticity, which is important for recombination during the processes of immunity and meiosis. Most studies to identify regulators of genomic instability have been performed in cells in culture or in systems that report on gross rearrangements of the genome, yet subtle differences in the level of genomic instability can contribute to whole organism phenotypes such as tumor predisposition. Here we performed a genome-wide association study in a population of 1379 outbred Crl:CFW(SW)-US_P08 mice to dissect the genetic landscape of micronucleus formation, a biomarker of chromosomal breaks, whole chromosome loss, and extranuclear DNA. Variation in micronucleus levels is a complex trait with a genome-wide heritability of 53.1%. We identify seven loci influencing micronucleus formation (false discovery rate <5%), and define candidate genes at each locus. Intriguingly at several loci we find evidence for sexual dimorphism in micronucleus formation, with a locus on chromosome 11 being specific to males. Copyright © 2016 McIntyre et al.

  19. StereoGene: rapid estimation of genome-wide correlation of continuous or interval feature data.

    PubMed

    Stavrovskaya, Elena D; Niranjan, Tejasvi; Fertig, Elana J; Wheelan, Sarah J; Favorov, Alexander V; Mironov, Andrey A

    2017-10-15

    Genomics features with similar genome-wide distributions are generally hypothesized to be functionally related, for example, colocalization of histones and transcription start sites indicate chromatin regulation of transcription factor activity. Therefore, statistical algorithms to perform spatial, genome-wide correlation among genomic features are required. Here, we propose a method, StereoGene, that rapidly estimates genome-wide correlation among pairs of genomic features. These features may represent high-throughput data mapped to reference genome or sets of genomic annotations in that reference genome. StereoGene enables correlation of continuous data directly, avoiding the data binarization and subsequent data loss. Correlations are computed among neighboring genomic positions using kernel correlation. Representing the correlation as a function of the genome position, StereoGene outputs the local correlation track as part of the analysis. StereoGene also accounts for confounders such as input DNA by partial correlation. We apply our method to numerous comparisons of ChIP-Seq datasets from the Human Epigenome Atlas and FANTOM CAGE to demonstrate its wide applicability. We observe the changes in the correlation between epigenomic features across developmental trajectories of several tissue types consistent with known biology and find a novel spatial correlation of CAGE clusters with donor splice sites and with poly(A) sites. These analyses provide examples for the broad applicability of StereoGene for regulatory genomics. The StereoGene C ++ source code, program documentation, Galaxy integration scripts and examples are available from the project homepage http://stereogene.bioinf.fbb.msu.ru/. favorov@sensi.org. Supplementary data are available at Bioinformatics online. © The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com

  20. Association of genome-wide variation with the risk of incident heart failure in adults of European and African ancestry: a prospective meta-analysis from the cohorts for heart and aging research in genomic epidemiology (CHARGE) consortium.

    PubMed

    Smith, Nicholas L; Felix, Janine F; Morrison, Alanna C; Demissie, Serkalem; Glazer, Nicole L; Loehr, Laura R; Cupples, L Adrienne; Dehghan, Abbas; Lumley, Thomas; Rosamond, Wayne D; Lieb, Wolfgang; Rivadeneira, Fernando; Bis, Joshua C; Folsom, Aaron R; Benjamin, Emelia; Aulchenko, Yurii S; Haritunians, Talin; Couper, David; Murabito, Joanne; Wang, Ying A; Stricker, Bruno H; Gottdiener, John S; Chang, Patricia P; Wang, Thomas J; Rice, Kenneth M; Hofman, Albert; Heckbert, Susan R; Fox, Ervin R; O'Donnell, Christopher J; Uitterlinden, Andre G; Rotter, Jerome I; Willerson, James T; Levy, Daniel; van Duijn, Cornelia M; Psaty, Bruce M; Witteman, Jacqueline C M; Boerwinkle, Eric; Vasan, Ramachandran S

    2010-06-01

    Although genetic factors contribute to the onset of heart failure (HF), no large-scale genome-wide investigation of HF risk has been published to date. We have investigated the association of 2,478,304 single-nucleotide polymorphisms with incident HF by meta-analyzing data from 4 community-based prospective cohorts: the Atherosclerosis Risk in Communities Study, the Cardiovascular Health Study, the Framingham Heart Study, and the Rotterdam Study. Eligible participants for these analyses were of European or African ancestry and free of clinical HF at baseline. Each study independently conducted genome-wide scans and imputed data to the approximately 2.5 million single-nucleotide polymorphisms in HapMap. Within each study, Cox proportional hazards regression models provided age- and sex-adjusted estimates of the association between each variant and time to incident HF. Fixed-effect meta-analyses combined results for each single-nucleotide polymorphism from the 4 cohorts to produce an overall association estimate and P value. A genome-wide significance P value threshold was set a priori at 5.0x10(-7). During a mean follow-up of 11.5 years, 2526 incident HF events (12%) occurred in 20 926 European-ancestry participants. The meta-analysis identified a genome-wide significant locus at chromosomal position 15q22 (1.4x10(-8)), which was 58.8 kb from USP3. Among 2895 African-ancestry participants, 466 incident HF events (16%) occurred during a mean follow-up of 13.7 years. One genome-wide significant locus was identified at 12q14 (6.7x10(-8)), which was 6.3 kb from LRIG3. We identified 2 loci that were associated with incident HF and exceeded genome-wide significance. The findings merit replication in other community-based settings of incident HF.

  1. Genome-wide mapping of mutations at single-nucleotide resolution for protein, metabolic and genome engineering.

    PubMed

    Garst, Andrew D; Bassalo, Marcelo C; Pines, Gur; Lynch, Sean A; Halweg-Edwards, Andrea L; Liu, Rongming; Liang, Liya; Wang, Zhiwen; Zeitoun, Ramsey; Alexander, William G; Gill, Ryan T

    2017-01-01

    Improvements in DNA synthesis and sequencing have underpinned comprehensive assessment of gene function in bacteria and eukaryotes. Genome-wide analyses require high-throughput methods to generate mutations and analyze their phenotypes, but approaches to date have been unable to efficiently link the effects of mutations in coding regions or promoter elements in a highly parallel fashion. We report that CRISPR-Cas9 gene editing in combination with massively parallel oligomer synthesis can enable trackable editing on a genome-wide scale. Our method, CRISPR-enabled trackable genome engineering (CREATE), links each guide RNA to homologous repair cassettes that both edit loci and function as barcodes to track genotype-phenotype relationships. We apply CREATE to site saturation mutagenesis for protein engineering, reconstruction of adaptive laboratory evolution experiments, and identification of stress tolerance and antibiotic resistance genes in bacteria. We provide preliminary evidence that CREATE will work in yeast. We also provide a webtool to design multiplex CREATE libraries.

  2. Elastic-net regularization approaches for genome-wide association studies of rheumatoid arthritis.

    PubMed

    Cho, Seoae; Kim, Haseong; Oh, Sohee; Kim, Kyunga; Park, Taesung

    2009-12-15

    The current trend in genome-wide association studies is to identify regions where the true disease-causing genes may lie by evaluating thousands of single-nucleotide polymorphisms (SNPs) across the whole genome. However, many challenges exist in detecting disease-causing genes among the thousands of SNPs. Examples include multicollinearity and multiple testing issues, especially when a large number of correlated SNPs are simultaneously tested. Multicollinearity can often occur when predictor variables in a multiple regression model are highly correlated, and can cause imprecise estimation of association. In this study, we propose a simple stepwise procedure that identifies disease-causing SNPs simultaneously by employing elastic-net regularization, a variable selection method that allows one to address multicollinearity. At Step 1, the single-marker association analysis was conducted to screen SNPs. At Step 2, the multiple-marker association was scanned based on the elastic-net regularization. The proposed approach was applied to the rheumatoid arthritis (RA) case-control data set of Genetic Analysis Workshop 16. While the selected SNPs at the screening step are located mostly on chromosome 6, the elastic-net approach identified putative RA-related SNPs on other chromosomes in an increased proportion. For some of those putative RA-related SNPs, we identified the interactions with sex, a well known factor affecting RA susceptibility.

  3. Genome-Wide Landscapes of Human Local Adaptation in Asia

    PubMed Central

    Lu, Dongsheng; Xu, Shuhua

    2013-01-01

    Genetic studies of human local adaptation have been facilitated greatly by recent advances in high-throughput genotyping and sequencing technologies. However, few studies have investigated local adaptation in Asian populations on a genome-wide scale and with a high geographic resolution. In this study, taking advantage of the dense population coverage in Southeast Asia, which is the part of the world least studied in term of natural selection, we depicted genome-wide landscapes of local adaptations in 63 Asian populations representing the majority of linguistic and ethnic groups in Asia. Using genome-wide data analysis, we discovered many genes showing signs of local adaptation or natural selection. Notable examples, such as FOXQ1, MAST2, and CDH4, were found to play a role in hair follicle development and human cancer, signal transduction, and tumor repression, respectively. These showed strong indications of natural selection in Philippine Negritos, a group of aboriginal hunter-gatherers living in the Philippines. MTTP, which has associations with metabolic syndrome, body mass index, and insulin regulation, showed a strong signature of selection in Southeast Asians, including Indonesians. Functional annotation analysis revealed that genes and genetic variants underlying natural selections were generally enriched in the functional category of alternative splicing. Specifically, many genes showing significant difference with respect to allele frequency between northern and southern Asian populations were found to be associated with human height and growth and various immune pathways. In summary, this study contributes to the overall understanding of human local adaptation in Asia and has identified both known and novel signatures of natural selection in the human genome. PMID:23349834

  4. Genome-wide scans for candidate genes involved in the aquatic adaptation of dolphins.

    PubMed

    Sun, Yan-Bo; Zhou, Wei-Ping; Liu, He-Qun; Irwin, David M; Shen, Yong-Yi; Zhang, Ya-Ping

    2013-01-01

    Since their divergence from the terrestrial artiodactyls, cetaceans have fully adapted to an aquatic lifestyle, which represents one of the most dramatic transformations in mammalian evolutionary history. Numerous morphological and physiological characters of cetaceans have been acquired in response to this drastic habitat transition, such as thickened blubber, echolocation, and ability to hold their breath for a long period of time. However, knowledge about the molecular basis underlying these adaptations is still limited. The sequence of the genome of Tursiops truncates provides an opportunity for a comparative genomic analyses to examine the molecular adaptation of this species. Here, we constructed 11,838 high-quality orthologous gene alignments culled from the dolphin and four other terrestrial mammalian genomes and screened for positive selection occurring in the dolphin lineage. In total, 368 (3.1%) of the genes were identified as having undergone positive selection by the branch-site model. Functional characterization of these genes showed that they are significantly enriched in the categories of lipid transport and localization, ATPase activity, sense perception of sound, and muscle contraction, areas that are potentially related to cetacean adaptations. In contrast, we did not find a similar pattern in the cow, a closely related species. We resequenced some of the positively selected sites (PSSs), within the positively selected genes, and showed that most of our identified PSSs (50/52) could be replicated. The results from this study should have important implications for our understanding of cetacean evolution and their adaptations to the aquatic environment.

  5. Water-Immersible MEMS scanning mirror designed for wide-field fast-scanning photoacoustic microscopy

    NASA Astrophysics Data System (ADS)

    Yao, Junjie; Huang, Chih-Hsien; Martel, Catherine; Maslov, Konstantin I.; Wang, Lidai; Yang, Joon-Mo; Gao, Liang; Randolph, Gwendalyn; Zou, Jun; Wang, Lihong V.

    2013-03-01

    By offering images with high spatial resolution and unique optical absorption contrast, optical-resolution photoacoustic microscopy (OR-PAM) has gained increasing attention in biomedical research. Recent developments in OR-PAM have improved its imaging speed, but have sacrificed either the detection sensitivity or field of view or both. We have developed a wide-field fast-scanning OR-PAM by using a water-immersible MEMS scanning mirror (MEMS-ORPAM). Made of silicon with a gold coating, the MEMS mirror plate can reflect both optical and acoustic beams. Because it uses an electromagnetic driving force, the whole MEMS scanning system can be submerged in water. In MEMS-ORPAM, the optical and acoustic beams are confocally configured and simultaneously steered, which ensures uniform detection sensitivity. A B-scan imaging speed as high as 400 Hz can be achieved over a 3 mm scanning range. A diffraction-limited lateral resolution of 2.4 μm in water and a maximum imaging depth of 1.1 mm in soft tissue have been experimentally determined. Using the system, we imaged the flow dynamics of both red blood cells and carbon particles in a mouse ear in vivo. By using Evans blue dye as the contrast agent, we also imaged the flow dynamics of lymphatic vessels in a mouse tail in vivo. The results show that MEMS-OR-PAM could be a powerful tool for studying highly dynamic and time-sensitive biological phenomena.

  6. Detecting DNA double-stranded breaks in mammalian genomes by linear amplification-mediated high-throughput genome-wide translocation sequencing.

    PubMed

    Hu, Jiazhi; Meyers, Robin M; Dong, Junchao; Panchakshari, Rohit A; Alt, Frederick W; Frock, Richard L

    2016-05-01

    Unbiased, high-throughput assays for detecting and quantifying DNA double-stranded breaks (DSBs) across the genome in mammalian cells will facilitate basic studies of the mechanisms that generate and repair endogenous DSBs. They will also enable more applied studies, such as those to evaluate the on- and off-target activities of engineered nucleases. Here we describe a linear amplification-mediated high-throughput genome-wide sequencing (LAM-HTGTS) method for the detection of genome-wide 'prey' DSBs via their translocation in cultured mammalian cells to a fixed 'bait' DSB. Bait-prey junctions are cloned directly from isolated genomic DNA using LAM-PCR and unidirectionally ligated to bridge adapters; subsequent PCR steps amplify the single-stranded DNA junction library in preparation for Illumina Miseq paired-end sequencing. A custom bioinformatics pipeline identifies prey sequences that contribute to junctions and maps them across the genome. LAM-HTGTS differs from related approaches because it detects a wide range of broken end structures with nucleotide-level resolution. Familiarity with nucleic acid methods and next-generation sequencing analysis is necessary for library generation and data interpretation. LAM-HTGTS assays are sensitive, reproducible, relatively inexpensive, scalable and straightforward to implement with a turnaround time of <1 week.

  7. Genome-wide introgression among distantly related Heliconius butterfly species.

    PubMed

    Zhang, Wei; Dasmahapatra, Kanchon K; Mallet, James; Moreira, Gilson R P; Kronforst, Marcus R

    2016-02-27

    Although hybridization is thought to be relatively rare in animals, the raw genetic material introduced via introgression may play an important role in fueling adaptation and adaptive radiation. The butterfly genus Heliconius is an excellent system to study hybridization and introgression but most studies have focused on closely related species such as H. cydno and H. melpomene. Here we characterize genome-wide patterns of introgression between H. besckei, the only species with a red and yellow banded 'postman' wing pattern in the tiger-striped silvaniform clade, and co-mimetic H. melpomene nanna. We find a pronounced signature of putative introgression from H. melpomene into H. besckei in the genomic region upstream of the gene optix, known to control red wing patterning, suggesting adaptive introgression of wing pattern mimicry between these two distantly related species. At least 39 additional genomic regions show signals of introgression as strong or stronger than this mimicry locus. Gene flow has been on-going, with evidence of gene exchange at multiple time points, and bidirectional, moving from the melpomene to the silvaniform clade and vice versa. The history of gene exchange has also been complex, with contributions from multiple silvaniform species in addition to H. besckei. We also detect a signature of ancient introgression of the entire Z chromosome between the silvaniform and melpomene/cydno clades. Our study provides a genome-wide portrait of introgression between distantly related butterfly species. We further propose a comprehensive and efficient workflow for gene flow identification in genomic data sets.

  8. Genome-wide Association Analysis of Kernel Weight in Hard Winter Wheat

    USDA-ARS?s Scientific Manuscript database

    Wheat kernel weight is an important and heritable component of wheat grain yield and a key predictor of flour extraction. Genome-wide association analysis was conducted to identify genomic regions associated with kernel weight and kernel weight environmental response in 8 trials of 299 hard winter ...

  9. A genome-wide scan for signatures of directional selection in domesticated pigs.

    PubMed

    Moon, Sunjin; Kim, Tae-Hun; Lee, Kyung-Tai; Kwak, Woori; Lee, Taeheon; Lee, Si-Woo; Kim, Myung-Jick; Cho, Kyuho; Kim, Namshin; Chung, Won-Hyong; Sung, Samsun; Park, Taesung; Cho, Seoae; Groenen, Martien Am; Nielsen, Rasmus; Kim, Yuseob; Kim, Heebal

    2015-02-25

    Animal domestication involved drastic phenotypic changes driven by strong artificial selection and also resulted in new populations of breeds, established by humans. This study aims to identify genes that show evidence of recent artificial selection during pig domestication. Whole-genome resequencing of 30 individual pigs from domesticated breeds, Landrace and Yorkshire, and 10 Asian wild boars at ~16-fold coverage was performed resulting in over 4.3 million SNPs for 19,990 genes. We constructed a comprehensive genome map of directional selection by detecting selective sweeps using an F ST-based approach that detects directional selection in lineages leading to the domesticated breeds and using a haplotype-based test that detects ongoing selective sweeps within the breeds. We show that candidate genes under selection are significantly enriched for loci implicated in quantitative traits important to pig reproduction and production. The candidate gene with the strongest signals of directional selection belongs to group III of the metabolomics glutamate receptors, known to affect brain functions associated with eating behavior, suggesting that loci under strong selection include loci involved in behaviorial traits in domesticated pigs including tameness. We show that a significant proportion of selection signatures coincide with loci that were previously inferred to affect phenotypic variation in pigs. We further identify functional enrichment related to behavior, such as signal transduction and neuronal activities, for those targets of selection during domestication in pigs.

  10. Evaluation of the genetic overlap between osteoarthritis with body mass index and height using genome-wide association scan data

    PubMed Central

    Elliott, Katherine S; Chapman, Kay; Day-Williams, Aaron; Panoutsopoulou, Kalliope; Southam, Lorraine; Lindgren, Cecilia M; Arden, Nigel; Aslam, Nadim; Birrell, Fraser; Carluke, Ian; Carr, Andrew; Deloukas, Panos; Doherty, Michael; Loughlin, John; McCaskie, Andrew; Ollier, William E R; Rai, Ashok; Ralston, Stuart; Reed, Mike R; Spector, Timothy D; Valdes, Ana M; Wallis, Gillian A; Wilkinson, Mark; Zeggini, Eleftheria

    2013-01-01

    Objectives Obesity as measured by body mass index (BMI) is one of the major risk factors for osteoarthritis. In addition, genetic overlap has been reported between osteoarthritis and normal adult height variation. We investigated whether this relationship is due to a shared genetic aetiology on a genome-wide scale. Methods We compared genetic association summary statistics (effect size, p value) for BMI and height from the GIANT consortium genome-wide association study (GWAS) with genetic association summary statistics from the arcOGEN consortium osteoarthritis GWAS. Significance was evaluated by permutation. Replication of osteoarthritis association of the highlighted signals was investigated in an independent dataset. Phenotypic information of height and BMI was accounted for in a separate analysis using osteoarthritis-free controls. Results We found significant overlap between osteoarthritis and height (p=3.3×10−5 for signals with p≤0.05) when the GIANT and arcOGEN GWAS were compared. For signals with p≤0.001 we found 17 shared signals between osteoarthritis and height and four between osteoarthritis and BMI. However, only one of the height or BMI signals that had shown evidence of association with osteoarthritis in the arcOGEN GWAS was also associated with osteoarthritis in the independent dataset: rs12149832, within the FTO gene (combined p=2.3×10−5). As expected, this signal was attenuated when we adjusted for BMI. Conclusions We found a significant excess of shared signals between both osteoarthritis and height and osteoarthritis and BMI, suggestive of a common genetic aetiology. However, only one signal showed association with osteoarthritis when followed up in a new dataset. PMID:22956599

  11. Meta-Analysis in Genome-Wide Association Datasets: Strategies and Application in Parkinson Disease

    PubMed Central

    Evangelou, Evangelos; Maraganore, Demetrius M.; Ioannidis, John P.A.

    2007-01-01

    Background Genome-wide association studies hold substantial promise for identifying common genetic variants that regulate susceptibility to complex diseases. However, for the detection of small genetic effects, single studies may be underpowered. Power may be improved by combining genome-wide datasets with meta-analytic techniques. Methodology/Principal Findings Both single and two-stage genome-wide data may be combined and there are several possible strategies. In the two-stage framework, we considered the options of (1) enhancement of replication data and (2) enhancement of first-stage data, and then, we also considered (3) joint meta-analyses including all first-stage and second-stage data. These strategies were examined empirically using data from two genome-wide association studies (three datasets) on Parkinson disease. In the three strategies, we derived 12, 5, and 49 single nucleotide polymorphisms that show significant associations at conventional levels of statistical significance. None of these remained significant after conservative adjustment for the number of performed analyses in each strategy. However, some may warrant further consideration: 6 SNPs were identified with at least 2 of the 3 strategies and 3 SNPs [rs1000291 on chromosome 3, rs2241743 on chromosome 4 and rs3018626 on chromosome 11] were identified with all 3 strategies and had no or minimal between-dataset heterogeneity (I2 = 0, 0 and 15%, respectively). Analyses were primarily limited by the suboptimal overlap of tested polymorphisms across different datasets (e.g., only 31,192 shared polymorphisms between the two tier 1 datasets). Conclusions/Significance Meta-analysis may be used to improve the power and examine the between-dataset heterogeneity of genome-wide association studies. Prospective designs may be most efficient, if they try to maximize the overlap of genotyping platforms and anticipate the combination of data across many genome-wide association studies. PMID:17332845

  12. An empirical Bayes method for updating inferences in analysis of quantitative trait loci using information from related genome scans.

    PubMed

    Zhang, Kui; Wiener, Howard; Beasley, Mark; George, Varghese; Amos, Christopher I; Allison, David B

    2006-08-01

    Individual genome scans for quantitative trait loci (QTL) mapping often suffer from low statistical power and imprecise estimates of QTL location and effect. This lack of precision yields large confidence intervals for QTL location, which are problematic for subsequent fine mapping and positional cloning. In prioritizing areas for follow-up after an initial genome scan and in evaluating the credibility of apparent linkage signals, investigators typically examine the results of other genome scans of the same phenotype and informally update their beliefs about which linkage signals in their scan most merit confidence and follow-up via a subjective-intuitive integration approach. A method that acknowledges the wisdom of this general paradigm but formally borrows information from other scans to increase confidence in objectivity would be a benefit. We developed an empirical Bayes analytic method to integrate information from multiple genome scans. The linkage statistic obtained from a single genome scan study is updated by incorporating statistics from other genome scans as prior information. This technique does not require that all studies have an identical marker map or a common estimated QTL effect. The updated linkage statistic can then be used for the estimation of QTL location and effect. We evaluate the performance of our method by using extensive simulations based on actual marker spacing and allele frequencies from available data. Results indicate that the empirical Bayes method can account for between-study heterogeneity, estimate the QTL location and effect more precisely, and provide narrower confidence intervals than results from any single individual study. We also compared the empirical Bayes method with a method originally developed for meta-analysis (a closely related but distinct purpose). In the face of marked heterogeneity among studies, the empirical Bayes method outperforms the comparator.

  13. A genome-wide association study of chronic obstructive pulmonary disease in Hispanics.

    PubMed

    Chen, Wei; Brehm, John M; Manichaikul, Ani; Cho, Michael H; Boutaoui, Nadia; Yan, Qi; Burkart, Kristin M; Enright, Paul L; Rotter, Jerome I; Petersen, Hans; Leng, Shuguang; Obeidat, Ma'en; Bossé, Yohan; Brandsma, Corry-Anke; Hao, Ke; Rich, Stephen S; Powell, Rhea; Avila, Lydiana; Soto-Quiros, Manuel; Silverman, Edwin K; Tesfaigzi, Yohannes; Barr, R Graham; Celedón, Juan C

    2015-03-01

    Genome-wide association studies (GWAS) of chronic obstructive pulmonary disease (COPD) have identified disease-susceptibility loci, mostly in subjects of European descent. We hypothesized that by studying Hispanic populations we would be able to identify unique loci that contribute to COPD pathogenesis in Hispanics but remain undetected in GWAS of non-Hispanic populations. We conducted a metaanalysis of two GWAS of COPD in independent cohorts of Hispanics in Costa Rica and the United States (Multi-Ethnic Study of Atherosclerosis [MESA]). We performed a replication study of the top single-nucleotide polymorphisms in an independent Hispanic cohort in New Mexico (the Lovelace Smokers Cohort). We also attempted to replicate prior findings from genome-wide studies in non-Hispanic populations in Hispanic cohorts. We found no genome-wide significant association with COPD in our metaanalysis of Costa Rica and MESA. After combining the top results from this metaanalysis with those from our replication study in the Lovelace Smokers Cohort, we identified two single-nucleotide polymorphisms approaching genome-wide significance for an association with COPD. The first (rs858249, combined P value = 6.1 × 10(-8)) is near the genes KLHL7 and NUPL2 on chromosome 7. The second (rs286499, combined P value = 8.4 × 10(-8)) is located in an intron of DLG2. The two most significant single-nucleotide polymorphisms in FAM13A from a previous genome-wide study in non-Hispanics were associated with COPD in Hispanics. We have identified two novel loci (in or near the genes KLHL7/NUPL2 and DLG2) that may play a role in COPD pathogenesis in Hispanic populations.

  14. Development and Evaluation of a Genome-Wide 6K SNP Array for Diploid Sweet Cherry and Tetraploid Sour Cherry

    PubMed Central

    Peace, Cameron; Bassil, Nahla; Main, Dorrie; Ficklin, Stephen; Rosyara, Umesh R.; Stegmeir, Travis; Sebolt, Audrey; Gilmore, Barbara; Lawley, Cindy; Mockler, Todd C.; Bryant, Douglas W.; Wilhelm, Larry; Iezzoni, Amy

    2012-01-01

    High-throughput genome scans are important tools for genetic studies and breeding applications. Here, a 6K SNP array for use with the Illumina Infinium® system was developed for diploid sweet cherry (Prunus avium) and allotetraploid sour cherry (P. cerasus). This effort was led by RosBREED, a community initiative to enable marker-assisted breeding for rosaceous crops. Next-generation sequencing in diverse breeding germplasm provided 25 billion basepairs (Gb) of cherry DNA sequence from which were identified genome-wide SNPs for sweet cherry and for the two sour cherry subgenomes derived from sweet cherry (avium subgenome) and P. fruticosa (fruticosa subgenome). Anchoring to the peach genome sequence, recently released by the International Peach Genome Initiative, predicted relative physical locations of the 1.9 million putative SNPs detected, preliminarily filtered to 368,943 SNPs. Further filtering was guided by results of a 144-SNP subset examined with the Illumina GoldenGate® assay on 160 accessions. A 6K Infinium® II array was designed with SNPs evenly spaced genetically across the sweet and sour cherry genomes. SNPs were developed for each sour cherry subgenome by using minor allele frequency in the sour cherry detection panel to enrich for subgenome-specific SNPs followed by targeting to either subgenome according to alleles observed in sweet cherry. The array was evaluated using panels of sweet (n = 269) and sour (n = 330) cherry breeding germplasm. Approximately one third of array SNPs were informative for each crop. A total of 1825 polymorphic SNPs were verified in sweet cherry, 13% of these originally developed for sour cherry. Allele dosage was resolved for 2058 polymorphic SNPs in sour cherry, one third of these being originally developed for sweet cherry. This publicly available genomics resource represents a significant advance in cherry genome-scanning capability that will accelerate marker-locus-trait association discovery, genome

  15. Genome-wide meta-analysis of common variant differences between men and women

    PubMed Central

    Boraska, Vesna; Jerončić, Ana; Colonna, Vincenza; Southam, Lorraine; Nyholt, Dale R.; William Rayner, Nigel; Perry, John R.B.; Toniolo, Daniela; Albrecht, Eva; Ang, Wei; Bandinelli, Stefania; Barbalic, Maja; Barroso, Inês; Beckmann, Jacques S.; Biffar, Reiner; Boomsma, Dorret; Campbell, Harry; Corre, Tanguy; Erdmann, Jeanette; Esko, Tõnu; Fischer, Krista; Franceschini, Nora; Frayling, Timothy M.; Girotto, Giorgia; Gonzalez, Juan R.; Harris, Tamara B.; Heath, Andrew C.; Heid, Iris M.; Hoffmann, Wolfgang; Hofman, Albert; Horikoshi, Momoko; Hua Zhao, Jing; Jackson, Anne U.; Hottenga, Jouke-Jan; Jula, Antti; Kähönen, Mika; Khaw, Kay-Tee; Kiemeney, Lambertus A.; Klopp, Norman; Kutalik, Zoltán; Lagou, Vasiliki; Launer, Lenore J.; Lehtimäki, Terho; Lemire, Mathieu; Lokki, Marja-Liisa; Loley, Christina; Luan, Jian'an; Mangino, Massimo; Mateo Leach, Irene; Medland, Sarah E.; Mihailov, Evelin; Montgomery, Grant W.; Navis, Gerjan; Newnham, John; Nieminen, Markku S.; Palotie, Aarno; Panoutsopoulou, Kalliope; Peters, Annette; Pirastu, Nicola; Polašek, Ozren; Rehnström, Karola; Ripatti, Samuli; Ritchie, Graham R.S.; Rivadeneira, Fernando; Robino, Antonietta; Samani, Nilesh J.; Shin, So-Youn; Sinisalo, Juha; Smit, Johannes H.; Soranzo, Nicole; Stolk, Lisette; Swinkels, Dorine W.; Tanaka, Toshiko; Teumer, Alexander; Tönjes, Anke; Traglia, Michela; Tuomilehto, Jaakko; Valsesia, Armand; van Gilst, Wiek H.; van Meurs, Joyce B.J.; Smith, Albert Vernon; Viikari, Jorma; Vink, Jacqueline M.; Waeber, Gerard; Warrington, Nicole M.; Widen, Elisabeth; Willemsen, Gonneke; Wright, Alan F.; Zanke, Brent W.; Zgaga, Lina; Boehnke, Michael; d'Adamo, Adamo Pio; de Geus, Eco; Demerath, Ellen W.; den Heijer, Martin; Eriksson, Johan G.; Ferrucci, Luigi; Gieger, Christian; Gudnason, Vilmundur; Hayward, Caroline; Hengstenberg, Christian; Hudson, Thomas J.; Järvelin, Marjo-Riitta; Kogevinas, Manolis; Loos, Ruth J.F.; Martin, Nicholas G.; Metspalu, Andres; Pennell, Craig E.; Penninx, Brenda W.; Perola, Markus; Raitakari, Olli; Salomaa, Veikko; Schreiber, Stefan; Schunkert, Heribert; Spector, Tim D.; Stumvoll, Michael; Uitterlinden, André G.; Ulivi, Sheila; van der Harst, Pim; Vollenweider, Peter; Völzke, Henry; Wareham, Nicholas J.; Wichmann, H.-Erich; Wilson, James F.; Rudan, Igor; Xue, Yali; Zeggini, Eleftheria

    2012-01-01

    The male-to-female sex ratio at birth is constant across world populations with an average of 1.06 (106 male to 100 female live births) for populations of European descent. The sex ratio is considered to be affected by numerous biological and environmental factors and to have a heritable component. The aim of this study was to investigate the presence of common allele modest effects at autosomal and chromosome X variants that could explain the observed sex ratio at birth. We conducted a large-scale genome-wide association scan (GWAS) meta-analysis across 51 studies, comprising overall 114 863 individuals (61 094 women and 53 769 men) of European ancestry and 2 623 828 common (minor allele frequency >0.05) single-nucleotide polymorphisms (SNPs). Allele frequencies were compared between men and women for directly-typed and imputed variants within each study. Forward-time simulations for unlinked, neutral, autosomal, common loci were performed under the demographic model for European populations with a fixed sex ratio and a random mating scheme to assess the probability of detecting significant allele frequency differences. We do not detect any genome-wide significant (P < 5 × 10−8) common SNP differences between men and women in this well-powered meta-analysis. The simulated data provided results entirely consistent with these findings. This large-scale investigation across ∼115 000 individuals shows no detectable contribution from common genetic variants to the observed skew in the sex ratio. The absence of sex-specific differences is useful in guiding genetic association study design, for example when using mixed controls for sex-biased traits. PMID:22843499

  16. Genome-wide analysis of disease progression in age-related macular degeneration.

    PubMed

    Yan, Qi; Ding, Ying; Liu, Yi; Sun, Tao; Fritsche, Lars G; Clemons, Traci; Ratnapriya, Rinki; Klein, Michael L; Cook, Richard J; Liu, Yu; Fan, Ruzong; Wei, Lai; Abecasis, Gonçalo R; Swaroop, Anand; Chew, Emily Y; Weeks, Daniel E; Chen, Wei

    2018-03-01

    Family- and population-based genetic studies have successfully identified multiple disease-susceptibility loci for Age-related macular degeneration (AMD), one of the first batch and most successful examples of genome-wide association study. However, most genetic studies to date have focused on case-control studies of late AMD (choroidal neovascularization or geographic atrophy). The genetic influences on disease progression are largely unexplored. We assembled unique resources to perform a genome-wide bivariate time-to-event analysis to test for association of time-to-late-AMD with ∼9 million variants on 2721 Caucasians from a large multi-center randomized clinical trial, the Age-Related Eye Disease Study. To our knowledge, this is the first genome-wide association study of disease progression (bivariate survival outcome) in AMD genetic studies, thus providing novel insights to AMD genetics. We used a robust Cox proportional hazards model to appropriately account for between-eye correlation when analyzing the progression time in the two eyes of each participant. We identified four previously reported susceptibility loci showing genome-wide significant association with AMD progression: ARMS2-HTRA1 (P = 8.1 × 10-43), CFH (P = 3.5 × 10-37), C2-CFB-SKIV2L (P = 8.1 × 10-10) and C3 (P = 1.2 × 10-9). Furthermore, we detected association of rs58978565 near TNR (P = 2.3 × 10-8), rs28368872 near ATF7IP2 (P = 2.9 × 10-8) and rs142450006 near MMP9 (P = 0.0006) with progression to choroidal neovascularization but not geographic atrophy. Secondary analysis limited to 34 reported risk variants revealed that LIPC and CTRB2-CTRB1 were also associated with AMD progression (P < 0.0015). Our genome-wide analysis thus expands the genetics in both development and progression of AMD and should assist in early identification of high risk individuals.

  17. Potential assessment of genome-wide association study and genomic selection in Japanese pear Pyrus pyrifolia

    PubMed Central

    Iwata, Hiroyoshi; Hayashi, Takeshi; Terakami, Shingo; Takada, Norio; Sawamura, Yutaka; Yamamoto, Toshiya

    2013-01-01

    Although the potential of marker-assisted selection (MAS) in fruit tree breeding has been reported, bi-parental QTL mapping before MAS has hindered the introduction of MAS to fruit tree breeding programs. Genome-wide association studies (GWAS) are an alternative to bi-parental QTL mapping in long-lived perennials. Selection based on genomic predictions of breeding values (genomic selection: GS) is another alternative for MAS. This study examined the potential of GWAS and GS in pear breeding with 76 Japanese pear cultivars to detect significant associations of 162 markers with nine agronomic traits. We applied multilocus Bayesian models accounting for ordinal categorical phenotypes for GWAS and GS model training. Significant associations were detected at harvest time, black spot resistance and the number of spurs and two of the associations were closely linked to known loci. Genome-wide predictions for GS were accurate at the highest level (0.75) in harvest time, at medium levels (0.38–0.61) in resistance to black spot, firmness of flesh, fruit shape in longitudinal section, fruit size, acid content and number of spurs and at low levels (<0.2) in all soluble solid content and vigor of tree. Results suggest the potential of GWAS and GS for use in future breeding programs in Japanese pear. PMID:23641189

  18. Insights to the Genetics of Diabetic Nephropathy through a Genome-wide Association Study of the GoKinD Collection

    PubMed Central

    Pezzolesi, Marcus G.; Skupien, Jan; Krolewski, Andrzej S.

    2010-01-01

    The Genetics of Kidneys in Diabetes (GoKinD) study was initiated to facilitate research aimed at identifying genes involved in diabetic nephropathy (DN) in type 1 diabetes (T1D). In this review, we present on overview of this study and the various reports that have utilized its collection. At the forefront of these efforts is the recent genome-wide association (GWA) scan implemented on the GoKinD collection. We highlight the results from our analysis of these data and describe compelling evidence from animal models that further support the potential role of associated loci in the susceptibility of DN. To enhance our analysis of genetic associations in GoKinD, using genome-wide imputation (GWI), we expanded our analysis of this collection to include genotype data from more than 2.4 million common SNPs. We illustrate the added utility of this enhanced dataset through the comprehensive fine-mapping of candidate genomic regions previously linked with DN and the targeted investigation of genes involved in candidate pathway implicated in its pathogenesis. Collectively, GWA and GWI data from the GoKinD collection will serve as a springboard for future investigations into the genetic basis of DN in T1D. PMID:20347642

  19. SINE_scan: an efficient tool to discover short interspersed nuclear elements (SINEs) in large-scale genomic datasets.

    PubMed

    Mao, Hongliang; Wang, Hao

    2017-03-01

    Short Interspersed Nuclear Elements (SINEs) are transposable elements (TEs) that amplify through a copy-and-paste mode via RNA intermediates. The computational identification of new SINEs are challenging because of their weak structural signals and rapid diversification in sequences. Here we report SINE_Scan, a highly efficient program to predict SINE elements in genomic DNA sequences. SINE_Scan integrates hallmark of SINE transposition, copy number and structural signals to identify a SINE element. SINE_Scan outperforms the previously published de novo SINE discovery program. It shows high sensitivity and specificity in 19 plant and animal genome assemblies, of which sizes vary from 120 Mb to 3.5 Gb. It identifies numerous new families and substantially increases the estimation of the abundance of SINEs in these genomes. The code of SINE_Scan is freely available at http://github.com/maohlzj/SINE_Scan , implemented in PERL and supported on Linux. wangh8@fudan.edu.cn. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.

  20. SINE_scan: an efficient tool to discover short interspersed nuclear elements (SINEs) in large-scale genomic datasets

    PubMed Central

    Mao, Hongliang

    2017-01-01

    Abstract Motivation: Short Interspersed Nuclear Elements (SINEs) are transposable elements (TEs) that amplify through a copy-and-paste mode via RNA intermediates. The computational identification of new SINEs are challenging because of their weak structural signals and rapid diversification in sequences. Results: Here we report SINE_Scan, a highly efficient program to predict SINE elements in genomic DNA sequences. SINE_Scan integrates hallmark of SINE transposition, copy number and structural signals to identify a SINE element. SINE_Scan outperforms the previously published de novo SINE discovery program. It shows high sensitivity and specificity in 19 plant and animal genome assemblies, of which sizes vary from 120 Mb to 3.5 Gb. It identifies numerous new families and substantially increases the estimation of the abundance of SINEs in these genomes. Availability and Implementation: The code of SINE_Scan is freely available at http://github.com/maohlzj/SINE_Scan, implemented in PERL and supported on Linux. Contact: wangh8@fudan.edu.cn Supplementary information: Supplementary data are available at Bioinformatics online. PMID:28062442

  1. Vive la résistance: genome-wide selection against introduced alleles in invasive hybrid zones

    USGS Publications Warehouse

    Kovach, Ryan P.; Hand, Brian K.; Hohenlohe, Paul A.; Cosart, Ted F.; Boyer, Matthew C.; Neville, Helen H.; Muhlfeld, Clint C.; Amish, Stephen J.; Carim, Kellie; Narum, Shawn R.; Lowe, Winsor H.; Allendorf, Fred W.; Luikart, Gordon

    2016-01-01

    Evolutionary and ecological consequences of hybridization between native and invasive species are notoriously complicated because patterns of selection acting on non-native alleles can vary throughout the genome and across environments. Rapid advances in genomics now make it feasible to assess locus-specific and genome-wide patterns of natural selection acting on invasive introgression within and among natural populations occupying diverse environments. We quantified genome-wide patterns of admixture across multiple independent hybrid zones of native westslope cutthroat trout and invasive rainbow trout, the world's most widely introduced fish, by genotyping 339 individuals from 21 populations using 9380 species-diagnostic loci. A significantly greater proportion of the genome appeared to be under selection favouring native cutthroat trout (rather than rainbow trout), and this pattern was pervasive across the genome (detected on most chromosomes). Furthermore, selection against invasive alleles was consistent across populations and environments, even in those where rainbow trout were predicted to have a selective advantage (warm environments). These data corroborate field studies showing that hybrids between these species have lower fitness than the native taxa, and show that these fitness differences are due to selection favouring many native genes distributed widely throughout the genome.

  2. Meta-analysis of sex-specific genome-wide association studies.

    PubMed

    Magi, Reedik; Lindgren, Cecilia M; Morris, Andrew P

    2010-12-01

    Despite the success of genome-wide association studies, much of the genetic contribution to complex human traits is still unexplained. One potential source of genetic variation that may contribute to this "missing heritability" is that which differs in magnitude and/or direction between males and females, which could result from sexual dimorphism in gene expression. Such sex-differentiated effects are common in model organisms, and are becoming increasingly evident in human complex traits through large-scale male- and female-specific meta-analyses. In this article, we review the methodology for meta-analysis of sex-specific genome-wide association studies, and propose a sex-differentiated test of association with quantitative or dichotomous traits, which allows for heterogeneity of allelic effects between males and females. We perform detailed simulations to compare the power of the proposed sex-differentiated meta-analysis with the more traditional "sex-combined" approach, which is ambivalent to gender. The results of this study highlight only a small loss in power for the sex-differentiated meta-analysis when the allelic effects of the causal variant are the same in males and females. However, over a range of models of heterogeneity in allelic effects between genders, our sex-differentiated meta-analysis strategy offers substantial gains in power, and thus has the potential to discover novel loci contributing effects to complex human traits with existing genome-wide association data. © 2010 Wiley-Liss, Inc.

  3. Comprehensive evaluation of genome-wide 5-hydroxymethylcytosine profiling approaches in human DNA.

    PubMed

    Skvortsova, Ksenia; Zotenko, Elena; Luu, Phuc-Loi; Gould, Cathryn M; Nair, Shalima S; Clark, Susan J; Stirzaker, Clare

    2017-01-01

    The discovery that 5-methylcytosine (5mC) can be oxidized to 5-hydroxymethylcytosine (5hmC) by the ten-eleven translocation (TET) proteins has prompted wide interest in the potential role of 5hmC in reshaping the mammalian DNA methylation landscape. The gold-standard bisulphite conversion technologies to study DNA methylation do not distinguish between 5mC and 5hmC. However, new approaches to mapping 5hmC genome-wide have advanced rapidly, although it is unclear how the different methods compare in accurately calling 5hmC. In this study, we provide a comparative analysis on brain DNA using three 5hmC genome-wide approaches, namely whole-genome bisulphite/oxidative bisulphite sequencing (WG Bis/OxBis-seq), Infinium HumanMethylation450 BeadChip arrays coupled with oxidative bisulphite (HM450K Bis/OxBis) and antibody-based immunoprecipitation and sequencing of hydroxymethylated DNA (hMeDIP-seq). We also perform loci-specific TET-assisted bisulphite sequencing (TAB-seq) for validation of candidate regions. We show that whole-genome single-base resolution approaches are advantaged in providing precise 5hmC values but require high sequencing depth to accurately measure 5hmC, as this modification is commonly in low abundance in mammalian cells. HM450K arrays coupled with oxidative bisulphite provide a cost-effective representation of 5hmC distribution, at CpG sites with 5hmC levels >~10%. However, 5hmC analysis is restricted to the genomic location of the probes, which is an important consideration as 5hmC modification is commonly enriched at enhancer elements. Finally, we show that the widely used hMeDIP-seq method provides an efficient genome-wide profile of 5hmC and shows high correlation with WG Bis/OxBis-seq 5hmC distribution in brain DNA. However, in cell line DNA with low levels of 5hmC, hMeDIP-seq-enriched regions are not detected by WG Bis/OxBis or HM450K, either suggesting misinterpretation of 5hmC calls by hMeDIP or lack of sensitivity of the latter methods. We

  4. The effect of genome-wide association scan quality control on imputation outcome for common variants.

    PubMed

    Southam, Lorraine; Panoutsopoulou, Kalliope; Rayner, N William; Chapman, Kay; Durrant, Caroline; Ferreira, Teresa; Arden, Nigel; Carr, Andrew; Deloukas, Panos; Doherty, Michael; Loughlin, John; McCaskie, Andrew; Ollier, William E R; Ralston, Stuart; Spector, Timothy D; Valdes, Ana M; Wallis, Gillian A; Wilkinson, J Mark; Marchini, Jonathan; Zeggini, Eleftheria

    2011-05-01

    Imputation is an extremely valuable tool in conducting and synthesising genome-wide association studies (GWASs). Directly typed SNP quality control (QC) is thought to affect imputation quality. It is, therefore, common practise to use quality-controlled (QCed) data as an input for imputing genotypes. This study aims to determine the effect of commonly applied QC steps on imputation outcomes. We performed several iterations of imputing SNPs across chromosome 22 in a dataset consisting of 3177 samples with Illumina 610 k (Illumina, San Diego, CA, USA) GWAS data, applying different QC steps each time. The imputed genotypes were compared with the directly typed genotypes. In addition, we investigated the correlation between alternatively QCed data. We also applied a series of post-imputation QC steps balancing elimination of poorly imputed SNPs and information loss. We found that the difference between the unQCed data and the fully QCed data on imputation outcome was minimal. Our study shows that imputation of common variants is generally very accurate and robust to GWAS QC, which is not a major factor affecting imputation outcome. A minority of common-frequency SNPs with particular properties cannot be accurately imputed regardless of QC stringency. These findings may not generalise to the imputation of low frequency and rare variants.

  5. Whole-genome scan identifies quantitative trait loci for chronic pastern dermatitis in German draft horses.

    PubMed

    Mittmann, E Henrike; Mömke, Stefanie; Distl, Ottmar

    2010-02-01

    Chronic pastern dermatitis (CPD), also known as chronic progressive lymphedema (CPL), is a skin disease that affects draft horses. This disease causes painful lower-leg swelling, nodule formation, and skin ulceration, interfering with movement. The aim of this whole-genome scan was to identify quantitative trait loci (QTL) for CPD in German draft horses. We recorded clinical data for CPD in 917 German draft horses and collected blood samples from these horses. Of these 917 horses, 31 paternal half-sib families comprising 378 horses from the breeds Rhenish German, Schleswig, Saxon-Thuringian, and South German were chosen for genotyping. Each half-sib family was constituted by only one draft horse breed. Genotyping was done for 318 polymorphic microsatellites evenly distributed on all equine autosomes and the X chromosome with a mean distance of 7.5 Mb. An across-breed multipoint linkage analysis revealed chromosome-wide significant QTL on horse chromosomes (ECA) 1, 9, 16, and 17. Analyses by breed confirmed the QTL on ECA1 in South German and the QTL on ECA9, 16, and 17 in Saxon-Thuringian draft horses. For the Rhenish German and Schleswig draft horses, additional QTL on ECA4 and 10 and for the South German draft horses an additional QTL on ECA7 were found. This is the first whole-genome scan for CPD in draft horses and it is an important step toward the identification of candidate genes.

  6. Multi-Instance Metric Transfer Learning for Genome-Wide Protein Function Prediction.

    PubMed

    Xu, Yonghui; Min, Huaqing; Wu, Qingyao; Song, Hengjie; Ye, Bicui

    2017-02-06

    Multi-Instance (MI) learning has been proven to be effective for the genome-wide protein function prediction problems where each training example is associated with multiple instances. Many studies in this literature attempted to find an appropriate Multi-Instance Learning (MIL) method for genome-wide protein function prediction under a usual assumption, the underlying distribution from testing data (target domain, i.e., TD) is the same as that from training data (source domain, i.e., SD). However, this assumption may be violated in real practice. To tackle this problem, in this paper, we propose a Multi-Instance Metric Transfer Learning (MIMTL) approach for genome-wide protein function prediction. In MIMTL, we first transfer the source domain distribution to the target domain distribution by utilizing the bag weights. Then, we construct a distance metric learning method with the reweighted bags. At last, we develop an alternative optimization scheme for MIMTL. Comprehensive experimental evidence on seven real-world organisms verifies the effectiveness and efficiency of the proposed MIMTL approach over several state-of-the-art methods.

  7. Genome-wide molecular dissection of serotype M3 group A Streptococcus strains causing two epidemics of invasive infections.

    PubMed

    Beres, Stephen B; Sylva, Gail L; Sturdevant, Daniel E; Granville, Chanel N; Liu, Mengyao; Ricklefs, Stacy M; Whitney, Adeline R; Parkins, Larye D; Hoe, Nancy P; Adams, Gerald J; Low, Donald E; DeLeo, Frank R; McGeer, Allison; Musser, James M

    2004-08-10

    Molecular factors that contribute to the emergence of new virulent bacterial subclones and epidemics are poorly understood. We hypothesized that analysis of a population-based strain sample of serotype M3 group A Streptococcus (GAS) recovered from patients with invasive infection by using genome-wide investigative methods would provide new insight into this fundamental infectious disease problem. Serotype M3 GAS strains (n = 255) cultured from patients in Ontario, Canada, over 11 years and representing two distinct infection peaks were studied. Genetic diversity was indexed by pulsed-field gel electrophoresis, DNA-DNA microarray, whole-genome PCR scanning, prophage genotyping, targeted gene sequencing, and single-nucleotide polymorphism genotyping. All variation in gene content was attributable to acquisition or loss of prophages, a molecular process that generated unique combinations of proven or putative virulence genes. Distinct serotype M3 genotypes experienced rapid population expansion and caused infections that differed significantly in character and severity. Molecular genetic analysis, combined with immunologic studies, implicated a 4-aa duplication in the extreme N terminus of M protein as a factor contributing to an epidemic wave of serotype M3 invasive infections. This finding has implications for GAS vaccine research. Genome-wide analysis of population-based strain samples cultured from clinically well defined patients is crucial for understanding the molecular events underlying bacterial epidemics.

  8. Genome-wide nucleosome map and cytosine methylation levels of an ancient human genome.

    PubMed

    Pedersen, Jakob Skou; Valen, Eivind; Velazquez, Amhed M Vargas; Parker, Brian J; Rasmussen, Morten; Lindgreen, Stinus; Lilje, Berit; Tobin, Desmond J; Kelly, Theresa K; Vang, Søren; Andersson, Robin; Jones, Peter A; Hoover, Cindi A; Tikhonov, Alexei; Prokhortchouk, Egor; Rubin, Edward M; Sandelin, Albin; Gilbert, M Thomas P; Krogh, Anders; Willerslev, Eske; Orlando, Ludovic

    2014-03-01

    Epigenetic information is available from contemporary organisms, but is difficult to track back in evolutionary time. Here, we show that genome-wide epigenetic information can be gathered directly from next-generation sequence reads of DNA isolated from ancient remains. Using the genome sequence data generated from hair shafts of a 4000-yr-old Paleo-Eskimo belonging to the Saqqaq culture, we generate the first ancient nucleosome map coupled with a genome-wide survey of cytosine methylation levels. The validity of both nucleosome map and methylation levels were confirmed by the recovery of the expected signals at promoter regions, exon/intron boundaries, and CTCF sites. The top-scoring nucleosome calls revealed distinct DNA positioning biases, attesting to nucleotide-level accuracy. The ancient methylation levels exhibited high conservation over time, clustering closely with modern hair tissues. Using ancient methylation information, we estimated the age at death of the Saqqaq individual and illustrate how epigenetic information can be used to infer ancient gene expression. Similar epigenetic signatures were found in other fossil material, such as 110,000- to 130,000-yr-old bones, supporting the contention that ancient epigenomic information can be reconstructed from a deep past. Our findings lay the foundation for extracting epigenomic information from ancient samples, allowing shifts in epialleles to be tracked through evolutionary time, as well as providing an original window into modern epigenomics.

  9. Genome-wide nucleosome map and cytosine methylation levels of an ancient human genome

    PubMed Central

    Pedersen, Jakob Skou; Valen, Eivind; Velazquez, Amhed M. Vargas; Parker, Brian J.; Rasmussen, Morten; Lindgreen, Stinus; Lilje, Berit; Tobin, Desmond J.; Kelly, Theresa K.; Vang, Søren; Andersson, Robin; Jones, Peter A.; Hoover, Cindi A.; Tikhonov, Alexei; Prokhortchouk, Egor; Rubin, Edward M.; Sandelin, Albin; Gilbert, M. Thomas P.; Krogh, Anders; Willerslev, Eske; Orlando, Ludovic

    2014-01-01

    Epigenetic information is available from contemporary organisms, but is difficult to track back in evolutionary time. Here, we show that genome-wide epigenetic information can be gathered directly from next-generation sequence reads of DNA isolated from ancient remains. Using the genome sequence data generated from hair shafts of a 4000-yr-old Paleo-Eskimo belonging to the Saqqaq culture, we generate the first ancient nucleosome map coupled with a genome-wide survey of cytosine methylation levels. The validity of both nucleosome map and methylation levels were confirmed by the recovery of the expected signals at promoter regions, exon/intron boundaries, and CTCF sites. The top-scoring nucleosome calls revealed distinct DNA positioning biases, attesting to nucleotide-level accuracy. The ancient methylation levels exhibited high conservation over time, clustering closely with modern hair tissues. Using ancient methylation information, we estimated the age at death of the Saqqaq individual and illustrate how epigenetic information can be used to infer ancient gene expression. Similar epigenetic signatures were found in other fossil material, such as 110,000- to 130,000-yr-old bones, supporting the contention that ancient epigenomic information can be reconstructed from a deep past. Our findings lay the foundation for extracting epigenomic information from ancient samples, allowing shifts in epialleles to be tracked through evolutionary time, as well as providing an original window into modern epigenomics. PMID:24299735

  10. Scanning the genome for gene single nucleotide polymorphisms involved in adaptive population differentiation in white spruce

    PubMed Central

    Namroud, Marie-Claire; Beaulieu, Jean; Juge, Nicolas; Laroche, Jérôme; Bousquet, Jean

    2008-01-01

    Conifers are characterized by a large genome size and a rapid decay of linkage disequilibrium, most often within gene limits. Genome scans based on noncoding markers are less likely to detect molecular adaptation linked to genes in these species. In this study, we assessed the effectiveness of a genome-wide single nucleotide polymorphism (SNP) scan focused on expressed genes in detecting local adaptation in a conifer species. Samples were collected from six natural populations of white spruce (Picea glauca) moderately differentiated for several quantitative characters. A total of 534 SNPs representing 345 expressed genes were analysed. Genes potentially under natural selection were identified by estimating the differentiation in SNP frequencies among populations (FST) and identifying outliers, and by estimating local differentiation using a Bayesian approach. Both average expected heterozygosity and population differentiation estimates (HE = 0.270 and FST = 0.006) were comparable to those obtained with other genetic markers. Of all genes, 5.5% were identified as outliers with FST at the 95% confidence level, while 14% were identified as candidates for local adaptation with the Bayesian method. There was some overlap between the two gene sets. More than half of the candidate genes for local adaptation were specific to the warmest population, about 20% to the most arid population, and 15% to the coldest and most humid higher altitude population. These adaptive trends were consistent with the genes’ putative functions and the divergence in quantitative traits noted among the populations. The results suggest that an approach separating the locus and population effects is useful to identify genes potentially under selection. These candidates are worth exploring in more details at the physiological and ecological levels. PMID:18662225

  11. HIV Genome-Wide Protein Associations: a Review of 30 Years of Research

    PubMed Central

    2016-01-01

    SUMMARY The HIV genome encodes a small number of viral proteins (i.e., 16), invariably establishing cooperative associations among HIV proteins and between HIV and host proteins, to invade host cells and hijack their internal machineries. As a known example, the HIV envelope glycoprotein GP120 is closely associated with GP41 for viral entry. From a genome-wide perspective, a hypothesis can be worked out to determine whether 16 HIV proteins could develop 120 possible pairwise associations either by physical interactions or by functional associations mediated via HIV or host molecules. Here, we present the first systematic review of experimental evidence on HIV genome-wide protein associations using a large body of publications accumulated over the past 3 decades. Of 120 possible pairwise associations between 16 HIV proteins, at least 34 physical interactions and 17 functional associations have been identified. To achieve efficient viral replication and infection, HIV protein associations play essential roles (e.g., cleavage, inhibition, and activation) during the HIV life cycle. In either a dispensable or an indispensable manner, each HIV protein collaborates with another viral protein to accomplish specific activities that precisely take place at the proper stages of the HIV life cycle. In addition, HIV genome-wide protein associations have an impact on anti-HIV inhibitors due to the extensive cross talk between drug-inhibited proteins and other HIV proteins. Overall, this study presents for the first time a comprehensive overview of HIV genome-wide protein associations, highlighting meticulous collaborations between all viral proteins during the HIV life cycle. PMID:27357278

  12. A Genome-Wide Association Study of Chronic Obstructive Pulmonary Disease in Hispanics

    PubMed Central

    Chen, Wei; Brehm, John M.; Manichaikul, Ani; Cho, Michael H.; Boutaoui, Nadia; Yan, Qi; Burkart, Kristin M.; Enright, Paul L.; Rotter, Jerome I.; Petersen, Hans; Leng, Shuguang; Obeidat, Ma’en; Bossé, Yohan; Brandsma, Corry-Anke; Hao, Ke; Rich, Stephen S.; Powell, Rhea; Avila, Lydiana; Soto-Quiros, Manuel; Silverman, Edwin K.; Tesfaigzi, Yohannes; Barr, R. Graham

    2015-01-01

    Rationale: Genome-wide association studies (GWAS) of chronic obstructive pulmonary disease (COPD) have identified disease-susceptibility loci, mostly in subjects of European descent. Objectives: We hypothesized that by studying Hispanic populations we would be able to identify unique loci that contribute to COPD pathogenesis in Hispanics but remain undetected in GWAS of non-Hispanic populations. Methods: We conducted a metaanalysis of two GWAS of COPD in independent cohorts of Hispanics in Costa Rica and the United States (Multi-Ethnic Study of Atherosclerosis [MESA]). We performed a replication study of the top single-nucleotide polymorphisms in an independent Hispanic cohort in New Mexico (the Lovelace Smokers Cohort). We also attempted to replicate prior findings from genome-wide studies in non-Hispanic populations in Hispanic cohorts. Measurements and Main Results: We found no genome-wide significant association with COPD in our metaanalysis of Costa Rica and MESA. After combining the top results from this metaanalysis with those from our replication study in the Lovelace Smokers Cohort, we identified two single-nucleotide polymorphisms approaching genome-wide significance for an association with COPD. The first (rs858249, combined P value = 6.1 × 10−8) is near the genes KLHL7 and NUPL2 on chromosome 7. The second (rs286499, combined P value = 8.4 × 10−8) is located in an intron of DLG2. The two most significant single-nucleotide polymorphisms in FAM13A from a previous genome-wide study in non-Hispanics were associated with COPD in Hispanics. Conclusions: We have identified two novel loci (in or near the genes KLHL7/NUPL2 and DLG2) that may play a role in COPD pathogenesis in Hispanic populations. PMID:25584925

  13. Assessing Predictive Properties of Genome-Wide Selection in Soybeans

    PubMed Central

    Xavier, Alencar; Muir, William M.; Rainey, Katy Martin

    2016-01-01

    Many economically important traits in plant breeding have low heritability or are difficult to measure. For these traits, genomic selection has attractive features and may boost genetic gains. Our goal was to evaluate alternative scenarios to implement genomic selection for yield components in soybean (Glycine max L. merr). We used a nested association panel with cross validation to evaluate the impacts of training population size, genotyping density, and prediction model on the accuracy of genomic prediction. Our results indicate that training population size was the factor most relevant to improvement in genome-wide prediction, with greatest improvement observed in training sets up to 2000 individuals. We discuss assumptions that influence the choice of the prediction model. Although alternative models had minor impacts on prediction accuracy, the most robust prediction model was the combination of reproducing kernel Hilbert space regression and BayesB. Higher genotyping density marginally improved accuracy. Our study finds that breeding programs seeking efficient genomic selection in soybeans would best allocate resources by investing in a representative training set. PMID:27317786

  14. Genome-wide Association Study of Obsessive-Compulsive Disorder

    PubMed Central

    Stewart, S Evelyn; Yu, Dongmei; Scharf, Jeremiah M; Neale, Benjamin M; Fagerness, Jesen A; Mathews, Carol A; Arnold, Paul D; Evans, Patrick D; Gamazon, Eric R; Osiecki, Lisa; McGrath, Lauren; Haddad, Stephen; Crane, Jacquelyn; Hezel, Dianne; Illman, Cornelia; Mayerfeld, Catherine; Konkashbaev, Anuar; Liu, Chunyu; Pluzhnikov, Anna; Tikhomirov, Anna; Edlund, Christopher K; Rauch, Scott L; Moessner, Rainald; Falkai, Peter; Maier, Wolfgang; Ruhrmann, Stephan; Grabe, Hans-Jörgen; Lennertz, Leonard; Wagner, Michael; Bellodi, Laura; Cavallini, Maria Cristina; Richter, Margaret A; Cook, Edwin H; Kennedy, James L; Rosenberg, David; Stein, Dan J; Hemmings, Sian MJ; Lochner, Christine; Azzam, Amin; Chavira, Denise A; Fournier, Eduardo; Garrido, Helena; Sheppard, Brooke; Umaña, Paul; Murphy, Dennis L; Wendland, Jens R; Veenstra-VanderWeele, Jeremy; Denys, Damiaan; Blom, Rianne; Deforce, Dieter; Van Nieuwerburgh, Filip; Westenberg, Herman GM; Walitza, Susanne; Egberts, Karin; Renner, Tobias; Miguel, Euripedes Constantino; Cappi, Carolina; Hounie, Ana G; Conceição do Rosário, Maria; Sampaio, Aline S; Vallada, Homero; Nicolini, Humberto; Lanzagorta, Nuria; Camarena, Beatriz; Delorme, Richard; Leboyer, Marion; Pato, Carlos N; Pato, Michele T; Voyiaziakis, Emanuel; Heutink, Peter; Cath, Danielle C; Posthuma, Danielle; Smit, Jan H; Samuels, Jack; Bienvenu, O Joseph; Cullen, Bernadette; Fyer, Abby J; Grados, Marco A; Greenberg, Benjamin D; McCracken, James T; Riddle, Mark A; Wang, Ying; Coric, Vladimir; Leckman, James F; Bloch, Michael; Pittenger, Christopher; Eapen, Valsamma; Black, Donald W; Ophoff, Roel A; Strengman, Eric; Cusi, Daniele; Turiel, Maurizio; Frau, Francesca; Macciardi, Fabio; Gibbs, J Raphael; Cookson, Mark R; Singleton, Andrew; Hardy, John; Crenshaw, Andrew T; Parkin, Melissa A; Mirel, Daniel B; Conti, David V; Purcell, Shaun; Nestadt, Gerald; Hanna, Gregory L; Jenike, Michael A; Knowles, James A; Cox, Nancy; Pauls, David L

    2014-01-01

    Obsessive-compulsive disorder (OCD) is a common, debilitating neuropsychiatric illness with complex genetic etiology. The International OCD Foundation Genetics Collaborative (IOCDF-GC) is a multi-national collaboration established to discover the genetic variation predisposing to OCD. A set of individuals affected with DSM-IV OCD, a subset of their parents, and unselected controls, were genotyped with several different Illumina SNP microarrays. After extensive data cleaning, 1,465 cases, 5,557 ancestry-matched controls and 400 complete trios remained, with a common set of 469,410 autosomal and 9,657 X-chromosome SNPs. Ancestry-stratified case-control association analyses were conducted for three genetically-defined subpopulations and combined in two meta-analyses, with and without the trio-based analysis. In the case-control analysis, the lowest two p-values were located within DLGAP1 (p=2.49×10-6 and p=3.44×10-6), a member of the neuronal postsynaptic density complex. In the trio analysis, rs6131295, near BTBD3, exceeded the genome-wide significance threshold with a p-value=3.84 × 10-8. However, when trios were meta-analyzed with the combined case-control samples, the p-value for this variant was 3.62×10-5, losing genome-wide significance. Although no SNPs were identified to be associated with OCD at a genome-wide significant level in the combined trio-case-control sample, a significant enrichment of methylation-QTLs (p<0.001) and frontal lobe eQTLs (p=0.001) was observed within the top-ranked SNPs (p<0.01) from the trio-case-control analysis, suggesting these top signals may have a broad role in gene expression in the brain, and possibly in the etiology of OCD. PMID:22889921

  15. Genome-Wide Association Mapping and Genomic Prediction Elucidate the Genetic Architecture of Morphological Traits in Arabidopsis.

    PubMed

    Kooke, Rik; Kruijer, Willem; Bours, Ralph; Becker, Frank; Kuhn, André; van de Geest, Henri; Buntjer, Jaap; Doeswijk, Timo; Guerra, José; Bouwmeester, Harro; Vreugdenhil, Dick; Keurentjes, Joost J B

    2016-04-01

    Quantitative traits in plants are controlled by a large number of genes and their interaction with the environment. To disentangle the genetic architecture of such traits, natural variation within species can be explored by studying genotype-phenotype relationships. Genome-wide association studies that link phenotypes to thousands of single nucleotide polymorphism markers are nowadays common practice for such analyses. In many cases, however, the identified individual loci cannot fully explain the heritability estimates, suggesting missing heritability. We analyzed 349 Arabidopsis accessions and found extensive variation and high heritabilities for different morphological traits. The number of significant genome-wide associations was, however, very low. The application of genomic prediction models that take into account the effects of all individual loci may greatly enhance the elucidation of the genetic architecture of quantitative traits in plants. Here, genomic prediction models revealed different genetic architectures for the morphological traits. Integrating genomic prediction and association mapping enabled the assignment of many plausible candidate genes explaining the observed variation. These genes were analyzed for functional and sequence diversity, and good indications that natural allelic variation in many of these genes contributes to phenotypic variation were obtained. For ACS11, an ethylene biosynthesis gene, haplotype differences explaining variation in the ratio of petiole and leaf length could be identified. © 2016 American Society of Plant Biologists. All Rights Reserved.

  16. Genome-wide patterns of copy number variation in the diversified chicken genomes using next-generation sequencing.

    PubMed

    Yi, Guoqiang; Qu, Lujiang; Liu, Jianfeng; Yan, Yiyuan; Xu, Guiyun; Yang, Ning

    2014-11-07

    Copy number variation (CNV) is important and widespread in the genome, and is a major cause of disease and phenotypic diversity. Herein, we performed a genome-wide CNV analysis in 12 diversified chicken genomes based on whole genome sequencing. A total of 8,840 CNV regions (CNVRs) covering 98.2 Mb and representing 9.4% of the chicken genome were identified, ranging in size from 1.1 to 268.8 kb with an average of 11.1 kb. Sequencing-based predictions were confirmed at a high validation rate by two independent approaches, including array comparative genomic hybridization (aCGH) and quantitative PCR (qPCR). The Pearson's correlation coefficients between sequencing and aCGH results ranged from 0.435 to 0.755, and qPCR experiments revealed a positive validation rate of 91.71% and a false negative rate of 22.43%. In total, 2,214 (25.0%) predicted CNVRs span 2,216 (36.4%) RefSeq genes associated with specific biological functions. Besides two previously reported copy number variable genes EDN3 and PRLR, we also found some promising genes with potential in phenotypic variation. Two genes, FZD6 and LIMS1, related to disease susceptibility/resistance are covered by CNVRs. The highly duplicated SOCS2 may lead to higher bone mineral density. Entire or partial duplication of some genes like POPDC3 may have great economic importance in poultry breeding. Our results based on extensive genetic diversity provide a more refined chicken CNV map and genome-wide gene copy number estimates, and warrant future CNV association studies for important traits in chickens.

  17. GStream: Improving SNP and CNV Coverage on Genome-Wide Association Studies

    PubMed Central

    Alonso, Arnald; Marsal, Sara; Tortosa, Raül; Canela-Xandri, Oriol; Julià, Antonio

    2013-01-01

    We present GStream, a method that combines genome-wide SNP and CNV genotyping in the Illumina microarray platform with unprecedented accuracy. This new method outperforms previous well-established SNP genotyping software. More importantly, the CNV calling algorithm of GStream dramatically improves the results obtained by previous state-of-the-art methods and yields an accuracy that is close to that obtained by purely CNV-oriented technologies like Comparative Genomic Hybridization (CGH). We demonstrate the superior performance of GStream using microarray data generated from HapMap samples. Using the reference CNV calls generated by the 1000 Genomes Project (1KGP) and well-known studies on whole genome CNV characterization based either on CGH or genotyping microarray technologies, we show that GStream can increase the number of reliably detected variants up to 25% compared to previously developed methods. Furthermore, the increased genome coverage provided by GStream allows the discovery of CNVs in close linkage disequilibrium with SNPs, previously associated with disease risk in published Genome-Wide Association Studies (GWAS). These results could provide important insights into the biological mechanism underlying the detected disease risk association. With GStream, large-scale GWAS will not only benefit from the combined genotyping of SNPs and CNVs at an unprecedented accuracy, but will also take advantage of the computational efficiency of the method. PMID:23844243

  18. Hematopoietic transcriptional mechanisms: from locus-specific to genome-wide vantage points.

    PubMed

    DeVilbiss, Andrew W; Sanalkumar, Rajendran; Johnson, Kirby D; Keles, Sunduz; Bresnick, Emery H

    2014-08-01

    Hematopoiesis is an exquisitely regulated process in which stem cells in the developing embryo and the adult generate progenitor cells that give rise to all blood lineages. Master regulatory transcription factors control hematopoiesis by integrating signals from the microenvironment and dynamically establishing and maintaining genetic networks. One of the most rudimentary aspects of cell type-specific transcription factor function, how they occupy a highly restricted cohort of cis-elements in chromatin, remains poorly understood. Transformative technologic advances involving the coupling of next-generation DNA sequencing technology with the chromatin immunoprecipitation assay (ChIP-seq) have enabled genome-wide mapping of factor occupancy patterns. However, formidable problems remain; notably, ChIP-seq analysis yields hundreds to thousands of chromatin sites occupied by a given transcription factor, and only a fraction of the sites appear to be endowed with critical, non-redundant function. It has become en vogue to map transcription factor occupancy patterns genome-wide, while using powerful statistical tools to establish correlations to inform biology and mechanisms. With the advent of revolutionary genome editing technologies, one can now reach beyond correlations to conduct definitive hypothesis testing. This review focuses on key discoveries that have emerged during the path from single loci to genome-wide analyses, specifically in the context of hematopoietic transcriptional mechanisms. Copyright © 2014 ISEH - International Society for Experimental Hematology. Published by Elsevier Inc. All rights reserved.

  19. Creating a RAW264.7 CRISPR-Cas9 Genome Wide Library

    PubMed Central

    Napier, Brooke A; Monack, Denise M

    2017-01-01

    The bacterial clustered regularly interspaced short palindromic repeats (CRISPR)-Cas9 genome editing tools are used in mammalian cells to knock-out specific genes of interest to elucidate gene function. The CRISPR-Cas9 system requires that the mammalian cell expresses Cas9 endonuclease, guide RNA (gRNA) to lead the endonuclease to the gene of interest, and the PAM sequence that links the Cas9 to the gRNA. CRISPR-Cas9 genome wide libraries are used to screen the effect of each gene in the genome on the cellular phenotype of interest, in an unbiased high-throughput manner. In this protocol, we describe our method of creating a CRISPR-Cas9 genome wide library in a transformed murine macrophage cell-line (RAW264.7). We have employed this library to identify novel mediators in the caspase-11 cell death pathway (Napier et al., 2016); however, this library can then be used to screen the importance of specific genes in multiple murine macrophage cellular pathways. PMID:28868328

  20. DNA Breaks and End Resection Measured Genome-wide by End Sequencing.

    PubMed

    Canela, Andres; Sridharan, Sriram; Sciascia, Nicholas; Tubbs, Anthony; Meltzer, Paul; Sleckman, Barry P; Nussenzweig, André

    2016-09-01

    DNA double-strand breaks (DSBs) arise during physiological transcription, DNA replication, and antigen receptor diversification. Mistargeting or misprocessing of DSBs can result in pathological structural variation and mutation. Here we describe a sensitive method (END-seq) to monitor DNA end resection and DSBs genome-wide at base-pair resolution in vivo. We utilized END-seq to determine the frequency and spectrum of restriction-enzyme-, zinc-finger-nuclease-, and RAG-induced DSBs. Beyond sequence preference, chromatin features dictate the repertoire of these genome-modifying enzymes. END-seq can detect at least one DSB per cell among 10,000 cells not harboring DSBs, and we estimate that up to one out of 60 cells contains off-target RAG cleavage. In addition to site-specific cleavage, we detect DSBs distributed over extended regions during immunoglobulin class-switch recombination. Thus, END-seq provides a snapshot of DNA ends genome-wide, which can be utilized for understanding genome-editing specificities and the influence of chromatin on DSB pathway choice. Published by Elsevier Inc.

  1. A genome-wide 3C-method for characterizing the three-dimensional architectures of genomes.

    PubMed

    Duan, Zhijun; Andronescu, Mirela; Schutz, Kevin; Lee, Choli; Shendure, Jay; Fields, Stanley; Noble, William S; Anthony Blau, C

    2012-11-01

    Accumulating evidence demonstrates that the three-dimensional (3D) organization of chromosomes within the eukaryotic nucleus reflects and influences genomic activities, including transcription, DNA replication, recombination and DNA repair. In order to uncover structure-function relationships, it is necessary first to understand the principles underlying the folding and the 3D arrangement of chromosomes. Chromosome conformation capture (3C) provides a powerful tool for detecting interactions within and between chromosomes. A high throughput derivative of 3C, chromosome conformation capture on chip (4C), executes a genome-wide interrogation of interaction partners for a given locus. We recently developed a new method, a derivative of 3C and 4C, which, similar to Hi-C, is capable of comprehensively identifying long-range chromosome interactions throughout a genome in an unbiased fashion. Hence, our method can be applied to decipher the 3D architectures of genomes. Here, we provide a detailed protocol for this method. Published by Elsevier Inc.

  2. Genome-wide prediction of cis-regulatory regions using supervised deep learning methods.

    PubMed

    Li, Yifeng; Shi, Wenqiang; Wasserman, Wyeth W

    2018-05-31

    In the human genome, 98% of DNA sequences are non-protein-coding regions that were previously disregarded as junk DNA. In fact, non-coding regions host a variety of cis-regulatory regions which precisely control the expression of genes. Thus, Identifying active cis-regulatory regions in the human genome is critical for understanding gene regulation and assessing the impact of genetic variation on phenotype. The developments of high-throughput sequencing and machine learning technologies make it possible to predict cis-regulatory regions genome wide. Based on rich data resources such as the Encyclopedia of DNA Elements (ENCODE) and the Functional Annotation of the Mammalian Genome (FANTOM) projects, we introduce DECRES based on supervised deep learning approaches for the identification of enhancer and promoter regions in the human genome. Due to their ability to discover patterns in large and complex data, the introduction of deep learning methods enables a significant advance in our knowledge of the genomic locations of cis-regulatory regions. Using models for well-characterized cell lines, we identify key experimental features that contribute to the predictive performance. Applying DECRES, we delineate locations of 300,000 candidate enhancers genome wide (6.8% of the genome, of which 40,000 are supported by bidirectional transcription data), and 26,000 candidate promoters (0.6% of the genome). The predicted annotations of cis-regulatory regions will provide broad utility for genome interpretation from functional genomics to clinical applications. The DECRES model demonstrates potentials of deep learning technologies when combined with high-throughput sequencing data, and inspires the development of other advanced neural network models for further improvement of genome annotations.

  3. Genome-wide linkage scan for loci of musical aptitude in Finnish families: evidence for a major locus at 4q22

    PubMed Central

    Pulli, K; Karma, K; Norio, R; Sistonen, P; Göring, H H H; Järvelä, I

    2008-01-01

    Background: Music perception and performance are comprehensive human cognitive functions and thus provide an excellent model system for studying human behaviour and brain function. However, the molecules involved in mediating music perception and performance are so far uncharacterised. Objective: To unravel the biological background of music perception, using molecular and statistical genetic approaches. Methods: 15 Finnish multigenerational families (with a total of 234 family members) were recruited via a nationwide search. The phenotype of all family members was determined using three tests used in defining musical aptitude: a test for auditory structuring ability (Karma Music test; KMT) commonly used in Finland, and the Seashore pitch and time discrimination subtests (SP and ST respectively) used internationally. We calculated heritabilities and performed a genome-wide variance components-based linkage scan using genotype data for 1113 microsatellite markers. Results: The heritability estimates were 42% for KMT, 57% for SP, 21% for ST and 48% for the combined music test scores. Significant evidence of linkage was obtained on chromosome 4q22 (LOD 3.33) and suggestive evidence of linkage at 8q13-21 (LOD 2.29) with the combined music test scores, using variance component linkage analyses. The major contribution of the 4q22 locus was obtained for the KMT (LOD 2.91). Interestingly, a positive LOD score of 1.69 was shown at 18q, a region previously linked to dyslexia (DYX6) using combined music test scores. Conclusion: Our results show that there is a genetic contribution to musical aptitude that is likely to be regulated by several predisposing genes or variants. PMID:18424507

  4. Genome-wide association studies on HIV susceptibility, pathogenesis and pharmacogenomics

    PubMed Central

    2012-01-01

    Susceptibility to HIV-1 and the clinical course after infection show a substantial heterogeneity between individuals. Part of this variability can be attributed to host genetic variation. Initial candidate gene studies have revealed interesting host factors that influence HIV infection, replication and pathogenesis. Recently, genome-wide association studies (GWAS) were utilized for unbiased searches at a genome-wide level to discover novel genetic factors and pathways involved in HIV-1 infection. This review gives an overview of findings from the GWAS performed on HIV infection, within different cohorts, with variable patient and phenotype selection. Furthermore, novel techniques and strategies in research that might contribute to the complete understanding of virus-host interactions and its role on the pathogenesis of HIV infection are discussed. PMID:22920050

  5. Rapid scoring of genes in microbial pan-genome-wide association studies with Scoary.

    PubMed

    Brynildsrud, Ola; Bohlin, Jon; Scheffer, Lonneke; Eldholm, Vegard

    2016-11-25

    Genome-wide association studies (GWAS) have become indispensable in human medicine and genomics, but very few have been carried out on bacteria. Here we introduce Scoary, an ultra-fast, easy-to-use, and widely applicable software tool that scores the components of the pan-genome for associations to observed phenotypic traits while accounting for population stratification, with minimal assumptions about evolutionary processes. We call our approach pan-GWAS to distinguish it from traditional, single nucleotide polymorphism (SNP)-based GWAS. Scoary is implemented in Python and is available under an open source GPLv3 license at https://github.com/AdmiralenOla/Scoary .

  6. A Genome-Wide Investigation of Autozygosity and Breast Cancer Risk

    DTIC Science & Technology

    2011-07-01

    cases than in controls, using logistic regression methods. Using genome-wide SNP data (525,000 SNPs) on 1,647 non-Hispanic white, early-onset...premenopausal breast cancer cases and 1,556 matched controls we identified over 65,000 individual RoHs and 423 genomic regions harbor RoHs for at least 10...we hypothesize that germline autozygosity is more common in breast cancer cases than in controls. More specifically, we hypothesize that there are

  7. Cycloid scanning for wide field optical coherence tomography endomicroscopy and angiography in vivo

    PubMed Central

    Liang, Kaicheng; Wang, Zhao; Ahsen, Osman O.; Lee, Hsiang-Chieh; Potsaid, Benjamin M.; Jayaraman, Vijaysekhar; Cable, Alex; Mashimo, Hiroshi; Li, Xingde; Fujimoto, James G.

    2018-01-01

    Devices that perform wide field-of-view (FOV) precision optical scanning are important for endoscopic assessment and diagnosis of luminal organ disease such as in gastroenterology. Optical scanning for in vivo endoscopic imaging has traditionally relied on one or more proximal mechanical actuators, limiting scan accuracy and imaging speed. There is a need for rapid and precise two-dimensional (2D) microscanning technologies to enable the translation of benchtop scanning microscopies to in vivo endoscopic imaging. We demonstrate a new cycloid scanner in a tethered capsule for ultrahigh speed, side-viewing optical coherence tomography (OCT) endomicroscopy in vivo. The cycloid capsule incorporates two scanners: a piezoelectrically actuated resonant fiber scanner to perform a precision, small FOV, fast scan and a micromotor scanner to perform a wide FOV, slow scan. Together these scanners distally scan the beam circumferentially in a 2D cycloid pattern, generating an unwrapped 1 mm × 38 mm strip FOV. Sequential strip volumes can be acquired with proximal pullback to image centimeter-long regions. Using ultrahigh speed 1.3 μm wavelength swept-source OCT at a 1.17 MHz axial scan rate, we imaged the human rectum at 3 volumes/s. Each OCT strip volume had 166 × 2322 axial scans with 8.5 μm axial and 30 μm transverse resolution. We further demonstrate OCT angiography at 0.5 volumes/s, producing volumetric images of vasculature. In addition to OCT applications, cycloid scanning promises to enable precision 2D optical scanning for other imaging modalities, including fluorescence confocal and nonlinear microscopy. PMID:29682598

  8. Genome-wide identification of significant aberrations in cancer genome.

    PubMed

    Yuan, Xiguo; Yu, Guoqiang; Hou, Xuchu; Shih, Ie-Ming; Clarke, Robert; Zhang, Junying; Hoffman, Eric P; Wang, Roger R; Zhang, Zhen; Wang, Yue

    2012-07-27

    Somatic Copy Number Alterations (CNAs) in human genomes are present in almost all human cancers. Systematic efforts to characterize such structural variants must effectively distinguish significant consensus events from random background aberrations. Here we introduce Significant Aberration in Cancer (SAIC), a new method for characterizing and assessing the statistical significance of recurrent CNA units. Three main features of SAIC include: (1) exploiting the intrinsic correlation among consecutive probes to assign a score to each CNA unit instead of single probes; (2) performing permutations on CNA units that preserve correlations inherent in the copy number data; and (3) iteratively detecting Significant Copy Number Aberrations (SCAs) and estimating an unbiased null distribution by applying an SCA-exclusive permutation scheme. We test and compare the performance of SAIC against four peer methods (GISTIC, STAC, KC-SMART, CMDS) on a large number of simulation datasets. Experimental results show that SAIC outperforms peer methods in terms of larger area under the Receiver Operating Characteristics curve and increased detection power. We then apply SAIC to analyze structural genomic aberrations acquired in four real cancer genome-wide copy number data sets (ovarian cancer, metastatic prostate cancer, lung adenocarcinoma, glioblastoma). When compared with previously reported results, SAIC successfully identifies most SCAs known to be of biological significance and associated with oncogenes (e.g., KRAS, CCNE1, and MYC) or tumor suppressor genes (e.g., CDKN2A/B). Furthermore, SAIC identifies a number of novel SCAs in these copy number data that encompass tumor related genes and may warrant further studies. Supported by a well-grounded theoretical framework, SAIC has been developed and used to identify SCAs in various cancer copy number data sets, providing useful information to study the landscape of cancer genomes. Open-source and platform-independent SAIC software is

  9. Genome-wide association mapping of qualitatively inherited traits in a germplasm collection

    USDA-ARS?s Scientific Manuscript database

    Genome-wide association (GWA) has been used as a tool for dissecting the genetic architecture of quantitatively inherited traits. We demonstrate here that GWA can also be highly useful for detecting the genomic locations of major genes governing categorically defined phenotype variants that exist fo...

  10. A genome-wide scan for signatures of selection in Azeri and Khuzestani buffalo breeds.

    PubMed

    Mokhber, Mahdi; Moradi-Shahrbabak, Mohammad; Sadeghi, Mostafa; Moradi-Shahrbabak, Hossein; Stella, Alessandra; Nicolzzi, Ezequiel; Rahmaninia, Javad; Williams, John L

    2018-06-11

    Identification of genomic regions that have been targets of selection may shed light on the genetic history of livestock populations and help to identify variation controlling commercially important phenotypes. The Azeri and Kuzestani buffalos are the most common indigenous Iranian breeds which have been subjected to divergent selection and are well adapted to completely different regions. Examining the genetic structure of these populations may identify genomic regions associated with adaptation to the different environments and production goals. A set of 385 water buffalo samples from Azeri (N = 262) and Khuzestani (N = 123) breeds were genotyped using the Axiom® Buffalo Genotyping 90 K Array. The unbiased fixation index method (F ST ) was used to detect signatures of selection. In total, 13 regions with outlier F ST values (0.1%) were identified. Annotation of these regions using the UMD3.1 Bos taurus Genome Assembly was performed to find putative candidate genes and QTLs within the selected regions. Putative candidate genes identified include FBXO9, NDFIP1, ACTR3, ARHGAP26, SERPINF2, BOLA-DRB3, BOLA-DQB, CLN8, and MYOM2. Candidate genes identified in regions potentially under selection were associated with physiological pathways including milk production, cytoskeleton organization, growth, metabolic function, apoptosis and domestication-related changes include immune and nervous system development. The QTL identified are involved in economically important traits in buffalo related to milk composition, udder structure, somatic cell count, meat quality, and carcass and body weight.

  11. Genome-wide single nucleotide polymorphisms reveal population history and adaptive divergence in wild guppies.

    PubMed

    Willing, Eva-Maria; Bentzen, Paul; van Oosterhout, Cock; Hoffmann, Margarete; Cable, Joanne; Breden, Felix; Weigel, Detlef; Dreyer, Christine

    2010-03-01

    Adaptation of guppies (Poecilia reticulata) to contrasting upland and lowland habitats has been extensively studied with respect to behaviour, morphology and life history traits. Yet population history has not been studied at the whole-genome level. Although single nucleotide polymorphisms (SNPs) are the most abundant form of variation in many genomes and consequently very informative for a genome-wide picture of standing natural variation in populations, genome-wide SNP data are rarely available for wild vertebrates. Here we use genetically mapped SNP markers to comprehensively survey genetic variation within and among naturally occurring guppy populations from a wide geographic range in Trinidad and Venezuela. Results from three different clustering methods, Neighbor-net, principal component analysis (PCA) and Bayesian analysis show that the population substructure agrees with geographic separation and largely with previously hypothesized patterns of historical colonization. Within major drainages (Caroni, Oropouche and Northern), populations are genetically similar, but those in different geographic regions are highly divergent from one another, with some indications of ancient shared polymorphisms. Clear genomic signatures of a previous introduction experiment were seen, and we detected additional potential admixture events. Headwater populations were significantly less heterozygous than downstream populations. Pairwise F(ST) values revealed marked differences in allele frequencies among populations from different regions, and also among populations within the same region. F(ST) outlier methods indicated some regions of the genome as being under directional selection. Overall, this study demonstrates the power of a genome-wide SNP data set to inform for studies on natural variation, adaptation and evolution of wild populations.

  12. Genome-Wide Identification of Molecular Mimicry Candidates in Parasites

    PubMed Central

    Ludin, Philipp; Nilsson, Daniel; Mäser, Pascal

    2011-01-01

    Among the many strategies employed by parasites for immune evasion and host manipulation, one of the most fascinating is molecular mimicry. With genome sequences available for host and parasite, mimicry of linear amino acid epitopes can be investigated by comparative genomics. Here we developed an in silico pipeline for genome-wide identification of molecular mimicry candidate proteins or epitopes. The predicted proteome of a given parasite was broken down into overlapping fragments, each of which was screened for close hits in the human proteome. Control searches were carried out against unrelated, free-living eukaryotes to eliminate the generally conserved proteins, and with randomized versions of the parasite proteins to get an estimate of statistical significance. This simple but computation-intensive approach yielded interesting candidates from human-pathogenic parasites. From Plasmodium falciparum, it returned a 14 amino acid motif in several of the PfEMP1 variants identical to part of the heparin-binding domain in the immunosuppressive serum protein vitronectin. And in Brugia malayi, fragments were detected that matched to periphilin-1, a protein of cell-cell junctions involved in barrier formation. All the results are publicly available by means of mimicDB, a searchable online database for molecular mimicry candidates from pathogens. To our knowledge, this is the first genome-wide survey for molecular mimicry proteins in parasites. The strategy can be adopted to any pair of host and pathogen, once appropriate negative control organisms are chosen. MimicDB provides a host of new starting points to gain insights into the molecular nature of host-pathogen interactions. PMID:21408160

  13. Pooled genome wide association detects association upstream of FCRL3 with Graves' disease.

    PubMed

    Khong, Jwu Jin; Burdon, Kathryn P; Lu, Yi; Laurie, Kate; Leonardos, Lefta; Baird, Paul N; Sahebjada, Srujana; Walsh, John P; Gajdatsy, Adam; Ebeling, Peter R; Hamblin, Peter Shane; Wong, Rosemary; Forehan, Simon P; Fourlanos, Spiros; Roberts, Anthony P; Doogue, Matthew; Selva, Dinesh; Montgomery, Grant W; Macgregor, Stuart; Craig, Jamie E

    2016-11-18

    Graves' disease is an autoimmune thyroid disease of complex inheritance. Multiple genetic susceptibility loci are thought to be involved in Graves' disease and it is therefore likely that these can be identified by genome wide association studies. This study aimed to determine if a genome wide association study, using a pooling methodology, could detect genomic loci associated with Graves' disease. Nineteen of the top ranking single nucleotide polymorphisms including HLA-DQA1 and C6orf10, were clustered within the Major Histo-compatibility Complex region on chromosome 6p21, with rs1613056 reaching genome wide significance (p = 5 × 10 -8 ). Technical validation of top ranking non-Major Histo-compatablity complex single nucleotide polymorphisms with individual genotyping in the discovery cohort revealed four single nucleotide polymorphisms with p ≤ 10 -4 . Rs17676303 on chromosome 1q23.1, located upstream of FCRL3, showed evidence of association with Graves' disease across the discovery, replication and combined cohorts. A second single nucleotide polymorphism rs9644119 downstream of DPYSL2 showed some evidence of association supported by finding in the replication cohort that warrants further study. Pooled genome wide association study identified a genetic variant upstream of FCRL3 as a susceptibility locus for Graves' disease in addition to those identified in the Major Histo-compatibility Complex. A second locus downstream of DPYSL2 is potentially a novel genetic variant in Graves' disease that requires further confirmation.

  14. Genome-wide scan of job-related exhaustion with three replication studies implicate a susceptibility variant at the UST gene locus

    PubMed Central

    Sulkava, Sonja; Ollila, Hanna M.; Ahola, Kirsi; Partonen, Timo; Viitasalo, Katriina; Kettunen, Johannes; Lappalainen, Maarit; Kivimäki, Mika; Vahtera, Jussi; Lindström, Jaana; Härmä, Mikko; Puttonen, Sampsa; Salomaa, Veikko; Paunio, Tiina

    2013-01-01

    Job-related exhaustion is the core dimension of burnout, a work-related stress syndrome that has several negative health consequences. In this study, we explored the molecular genetic background of job-related exhaustion. A genome-wide analysis of job-related exhaustion was performed in the GENMETS subcohort (n = 1256) of the Finnish population-based Health 2000 study. Replication analyses included an analysis of the strongest associations in the rest of the Health 2000 sample (n = 1660 workers) and in three independent populations (the FINRISK population cohort, n = 10 753; two occupational cohorts, total n = 1451). Job-related exhaustion was ascertained using a standard self-administered questionnaire (the Maslach Burnout Inventory (MBI)-GS exhaustion scale in the Health 2000 sample and the occupational cohorts) or a single question (FINRISK). A variant located in an intron of UST, uronyl-2-sulfotransferase (rs13219957), gave the strongest statistical evidence in the initial genome-wide study (P = 1.55 × 10−7), and was associated with job-related exhaustion in all the replication sets (P < 0.05; P = 6.75 × 10−7 from the meta-analysis). Consistent with studies of mood disorders, individual common genetic variants did not have any strong effect on job-related exhaustion. However, the nominally significant signals from the allelic variant of UST in four separate samples suggest that this variant might be a weak risk factor for job-related exhaustion. Together with the previously reported associations of other dermatan/chondroitin sulfate genes with mood disorders, these results indicate a potential molecular pathway for stress-related traits and mark a candidate region for further studies of job-related and general exhaustion. PMID:23620144

  15. A robust clustering algorithm for identifying problematic samples in genome-wide association studies.

    PubMed

    Bellenguez, Céline; Strange, Amy; Freeman, Colin; Donnelly, Peter; Spencer, Chris C A

    2012-01-01

    High-throughput genotyping arrays provide an efficient way to survey single nucleotide polymorphisms (SNPs) across the genome in large numbers of individuals. Downstream analysis of the data, for example in genome-wide association studies (GWAS), often involves statistical models of genotype frequencies across individuals. The complexities of the sample collection process and the potential for errors in the experimental assay can lead to biases and artefacts in an individual's inferred genotypes. Rather than attempting to model these complications, it has become a standard practice to remove individuals whose genome-wide data differ from the sample at large. Here we describe a simple, but robust, statistical algorithm to identify samples with atypical summaries of genome-wide variation. Its use as a semi-automated quality control tool is demonstrated using several summary statistics, selected to identify different potential problems, and it is applied to two different genotyping platforms and sample collections. The algorithm is written in R and is freely available at www.well.ox.ac.uk/chris-spencer chris.spencer@well.ox.ac.uk Supplementary data are available at Bioinformatics online.

  16. Genome-wide association studies in maize: praise and stargaze

    USDA-ARS?s Scientific Manuscript database

    Genome-wide association study (GWAS) has appeared as a widespread strategy in decoding genotype-phenotype associations in many species thanks to technical advances in next-generation sequencing (NGS) applications. Maize is an ideal crop for GWAS and significant progress has been made in the last dec...

  17. Genome-wide association identifies genetic variants associated with lentiform nucleus volume in N = 1345 young and elderly subjects.

    PubMed

    Hibar, Derrek P; Stein, Jason L; Ryles, April B; Kohannim, Omid; Jahanshad, Neda; Medland, Sarah E; Hansell, Narelle K; McMahon, Katie L; de Zubicaray, Greig I; Montgomery, Grant W; Martin, Nicholas G; Wright, Margaret J; Saykin, Andrew J; Jack, Clifford R; Weiner, Michael W; Toga, Arthur W; Thompson, Paul M

    2013-06-01

    Deficits in lentiform nucleus volume and morphometry are implicated in a number of genetically influenced disorders, including Parkinson's disease, schizophrenia, and ADHD. Here we performed genome-wide searches to discover common genetic variants associated with differences in lentiform nucleus volume in human populations. We assessed structural MRI scans of the brain in two large genotyped samples: the Alzheimer's Disease Neuroimaging Initiative (ADNI; N = 706) and the Queensland Twin Imaging Study (QTIM; N = 639). Statistics of association from each cohort were combined meta-analytically using a fixed-effects model to boost power and to reduce the prevalence of false positive findings. We identified a number of associations in and around the flavin-containing monooxygenase (FMO) gene cluster. The most highly associated SNP, rs1795240, was located in the FMO3 gene; after meta-analysis, it showed genome-wide significant evidence of association with lentiform nucleus volume (P MA  = 4.79 × 10(-8)). This commonly-carried genetic variant accounted for 2.68 % and 0.84 % of the trait variability in the ADNI and QTIM samples, respectively, even though the QTIM sample was on average 50 years younger. Pathway enrichment analysis revealed significant contributions of this gene to the cytochrome P450 pathway, which is involved in metabolizing numerous therapeutic drugs for pain, seizures, mania, depression, anxiety, and psychosis. The genetic variants we identified provide replicated, genome-wide significant evidence for the FMO gene cluster's involvement in lentiform nucleus volume differences in human populations.

  18. Genome-wide association studies of obesity and metabolic syndrome.

    PubMed

    Fall, Tove; Ingelsson, Erik

    2014-01-25

    Until just a few years ago, the genetic determinants of obesity and metabolic syndrome were largely unknown, with the exception of a few forms of monogenic extreme obesity. Since genome-wide association studies (GWAS) became available, large advances have been made. The first single nucleotide polymorphism robustly associated with increased body mass index (BMI) was in 2007 mapped to a gene with for the time unknown function. This gene, now known as fat mass and obesity associated (FTO) has been repeatedly replicated in several ethnicities and is affecting obesity by regulating appetite. Since the first report from a GWAS of obesity, an increasing number of markers have been shown to be associated with BMI, other measures of obesity or fat distribution and metabolic syndrome. This systematic review of obesity GWAS will summarize genome-wide significant findings for obesity and metabolic syndrome and briefly give a few suggestions of what is to be expected in the next few years. Copyright © 2012 Elsevier Ireland Ltd. All rights reserved.

  19. cisprimertool: software to implement a comparative genomics strategy for the development of conserved intron scanning (CIS) markers.

    PubMed

    Jayashree, B; Jagadeesh, V T; Hoisington, D

    2008-05-01

    The availability of complete, annotated genomic sequence information in model organisms is a rich resource that can be extended to understudied orphan crops through comparative genomic approaches. We report here a software tool (cisprimertool) for the identification of conserved intron scanning regions using expressed sequence tag alignments to a completely sequenced model crop genome. The method used is based on earlier studies reporting the assessment of conserved intron scanning primers (called CISP) within relatively conserved exons located near exon-intron boundaries from onion, banana, sorghum and pearl millet alignments with rice. The tool is freely available to academic users at http://www.icrisat.org/gt-bt/CISPTool.htm. © 2007 ICRISAT.

  20. Microbial genome-wide association studies: lessons from human GWAS.

    PubMed

    Power, Robert A; Parkhill, Julian; de Oliveira, Tulio

    2017-01-01

    The reduced costs of sequencing have led to whole-genome sequences for a large number of microorganisms, enabling the application of microbial genome-wide association studies (GWAS). Given the successes of human GWAS in understanding disease aetiology and identifying potential drug targets, microbial GWAS are likely to further advance our understanding of infectious diseases. These advances include insights into pressing global health problems, such as antibiotic resistance and disease transmission. In this Review, we outline the methodologies of GWAS, the current state of the field of microbial GWAS, and how lessons from human GWAS can direct the future of the field.

  1. MAGNAMWAR: an R package for genome-wide association studies of bacterial orthologs.

    PubMed

    Sexton, Corinne E; Smith, Hayden Z; Newell, Peter D; Douglas, Angela E; Chaston, John M

    2018-06-01

    Here we report on an R package for genome-wide association studies of orthologous genes in bacteria. Before using the software, orthologs from bacterial genomes or metagenomes are defined using local or online implementations of OrthoMCL. These presence-absence patterns are statistically associated with variation in user-collected phenotypes using the Mono-Associated GNotobiotic Animals Metagenome-Wide Association R package (MAGNAMWAR). Genotype-phenotype associations can be performed with several different statistical tests based on the type and distribution of the data. MAGNAMWAR is available on CRAN. john_chaston@byu.edu.

  2. Transcriptome sequencing reveals genome-wide variation in molecular evolutionary rate among ferns.

    PubMed

    Grusz, Amanda L; Rothfels, Carl J; Schuettpelz, Eric

    2016-08-30

    Transcriptomics in non-model plant systems has recently reached a point where the examination of nuclear genome-wide patterns in understudied groups is an achievable reality. This progress is especially notable in evolutionary studies of ferns, for which molecular resources to date have been derived primarily from the plastid genome. Here, we utilize transcriptome data in the first genome-wide comparative study of molecular evolutionary rate in ferns. We focus on the ecologically diverse family Pteridaceae, which comprises about 10 % of fern diversity and includes the enigmatic vittarioid ferns-an epiphytic, tropical lineage known for dramatically reduced morphologies and radically elongated phylogenetic branch lengths. Using expressed sequence data for 2091 loci, we perform pairwise comparisons of molecular evolutionary rate among 12 species spanning the three largest clades in the family and ask whether previously documented heterogeneity in plastid substitution rates is reflected in their nuclear genomes. We then inquire whether variation in evolutionary rate is being shaped by genes belonging to specific functional categories and test for differential patterns of selection. We find significant, genome-wide differences in evolutionary rate for vittarioid ferns relative to all other lineages within the Pteridaceae, but we recover few significant correlations between faster/slower vittarioid loci and known functional gene categories. We demonstrate that the faster rates characteristic of the vittarioid ferns are likely not driven by positive selection, nor are they unique to any particular type of nucleotide substitution. Our results reinforce recently reviewed mechanisms hypothesized to shape molecular evolutionary rates in vittarioid ferns and provide novel insight into substitution rate variation both within and among fern nuclear genomes.

  3. A review of genome-wide approaches to study the genetic basis for spermatogenic defects.

    PubMed

    Aston, Kenneth I; Conrad, Donald F

    2013-01-01

    Rapidly advancing tools for genetic analysis on a genome-wide scale have been instrumental in identifying the genetic bases for many complex diseases. About half of male infertility cases are of unknown etiology in spite of tremendous efforts to characterize the genetic basis for the disorder. Advancing our understanding of the genetic basis for male infertility will require the application of established and emerging genomic tools. This chapter introduces many of the tools available for genetic studies on a genome-wide scale along with principles of study design and data analysis.

  4. Genome-wide association analysis identifies 30 new susceptibility loci for schizophrenia.

    PubMed

    Li, Zhiqiang; Chen, Jianhua; Yu, Hao; He, Lin; Xu, Yifeng; Zhang, Dai; Yi, Qizhong; Li, Changgui; Li, Xingwang; Shen, Jiawei; Song, Zhijian; Ji, Weidong; Wang, Meng; Zhou, Juan; Chen, Boyu; Liu, Yahui; Wang, Jiqiang; Wang, Peng; Yang, Ping; Wang, Qingzhong; Feng, Guoyin; Liu, Benxiu; Sun, Wensheng; Li, Baojie; He, Guang; Li, Weidong; Wan, Chunling; Xu, Qi; Li, Wenjin; Wen, Zujia; Liu, Ke; Huang, Fang; Ji, Jue; Ripke, Stephan; Yue, Weihua; Sullivan, Patrick F; O'Donovan, Michael C; Shi, Yongyong

    2017-11-01

    We conducted a genome-wide association study (GWAS) with replication in 36,180 Chinese individuals and performed further transancestry meta-analyses with data from the Psychiatry Genomics Consortium (PGC2). Approximately 95% of the genome-wide significant (GWS) index alleles (or their proxies) from the PGC2 study were overrepresented in Chinese schizophrenia cases, including ∼50% that achieved nominal significance and ∼75% that continued to be GWS in the transancestry analysis. The Chinese-only analysis identified seven GWS loci; three of these also were GWS in the transancestry analyses, which identified 109 GWS loci, thus yielding a total of 113 GWS loci (30 novel) in at least one of these analyses. We observed improvements in the fine-mapping resolution at many susceptibility loci. Our results provide several lines of evidence supporting candidate genes at many loci and highlight some pathways for further research. Together, our findings provide novel insight into the genetic architecture and biological etiology of schizophrenia.

  5. Partitioning heritability by functional annotation using genome-wide association summary statistics.

    PubMed

    Finucane, Hilary K; Bulik-Sullivan, Brendan; Gusev, Alexander; Trynka, Gosia; Reshef, Yakir; Loh, Po-Ru; Anttila, Verneri; Xu, Han; Zang, Chongzhi; Farh, Kyle; Ripke, Stephan; Day, Felix R; Purcell, Shaun; Stahl, Eli; Lindstrom, Sara; Perry, John R B; Okada, Yukinori; Raychaudhuri, Soumya; Daly, Mark J; Patterson, Nick; Neale, Benjamin M; Price, Alkes L

    2015-11-01

    Recent work has demonstrated that some functional categories of the genome contribute disproportionately to the heritability of complex diseases. Here we analyze a broad set of functional elements, including cell type-specific elements, to estimate their polygenic contributions to heritability in genome-wide association studies (GWAS) of 17 complex diseases and traits with an average sample size of 73,599. To enable this analysis, we introduce a new method, stratified LD score regression, for partitioning heritability from GWAS summary statistics while accounting for linked markers. This new method is computationally tractable at very large sample sizes and leverages genome-wide information. Our findings include a large enrichment of heritability in conserved regions across many traits, a very large immunological disease-specific enrichment of heritability in FANTOM5 enhancers and many cell type-specific enrichments, including significant enrichment of central nervous system cell types in the heritability of body mass index, age at menarche, educational attainment and smoking behavior.

  6. Cooperative Genome-Wide Analysis Shows Increased Homozygosity in Early Onset Parkinson's Disease

    PubMed Central

    Nalls, Michael A.; Martinez, Maria; Schulte, Claudia; Holmans, Peter; Gasser, Thomas; Hardy, John; Singleton, Andrew B.; Wood, Nicholas W.; Brice, Alexis; Heutink, Peter; Williams, Nigel; Morris, Huw R.

    2012-01-01

    Parkinson's disease (PD) occurs in both familial and sporadic forms, and both monogenic and complex genetic factors have been identified. Early onset PD (EOPD) is particularly associated with autosomal recessive (AR) mutations, and three genes, PARK2, PARK7 and PINK1, have been found to carry mutations leading to AR disease. Since mutations in these genes account for less than 10% of EOPD patients, we hypothesized that further recessive genetic factors are involved in this disorder, which may appear in extended runs of homozygosity. We carried out genome wide SNP genotyping to look for extended runs of homozygosity (ROHs) in 1,445 EOPD cases and 6,987 controls. Logistic regression analyses showed an increased level of genomic homozygosity in EOPD cases compared to controls. These differences are larger for ROH of 9 Mb and above, where there is a more than three-fold increase in the proportion of cases carrying a ROH. These differences are not explained by occult recessive mutations at existing loci. Controlling for genome wide homozygosity in logistic regression analyses increased the differences between cases and controls, indicating that in EOPD cases ROHs do not simply relate to genome wide measures of inbreeding. Homozygosity at a locus on chromosome19p13.3 was identified as being more common in EOPD cases as compared to controls. Sequencing analysis of genes and predicted transcripts within this locus failed to identify a novel mutation causing EOPD in our cohort. There is an increased rate of genome wide homozygosity in EOPD, as measured by an increase in ROHs. These ROHs are a signature of inbreeding and do not necessarily harbour disease-causing genetic variants. Although there might be other regions of interest apart from chromosome 19p13.3, we lack the power to detect them with this analysis. PMID:22427796

  7. Genome-Wide Association of Heroin Dependence in Han Chinese.

    PubMed

    Kalsi, Gursharan; Euesden, Jack; Coleman, Jonathan R I; Ducci, Francesca; Aliev, Fazil; Newhouse, Stephen J; Liu, Xiehe; Ma, Xiaohong; Wang, Yingcheng; Collier, David A; Asherson, Philip; Li, Tao; Breen, Gerome

    2016-01-01

    Drug addiction is a costly and recurring healthcare problem, necessitating a need to understand risk factors and mechanisms of addiction, and to identify new biomarkers. To date, genome-wide association studies (GWAS) for heroin addiction have been limited; moreover they have been restricted to examining samples of European and African-American origin due to difficulty of recruiting samples from other populations. This is the first study to test a Han Chinese population; we performed a GWAS on a homogeneous sample of 370 Han Chinese subjects diagnosed with heroin dependence using the DSM-IV criteria and 134 ethnically matched controls. Analysis using the diagnostic criteria of heroin dependence yielded suggestive evidence for association between variants in the genes CCDC42 (coiled coil domain 42; p = 2.8x10-7) and BRSK2 (BR serine/threonine 2; p = 4.110-6). In addition, we found evidence for risk variants within the ARHGEF10 (Rho guanine nucleotide exchange factor 10) gene on chromosome 8 and variants in a region on chromosome 20q13, which is gene-poor but has a concentration of mRNAs and predicted miRNAs. Gene-based association analysis identified genome-wide significant association between variants in CCDC42 and heroin addiction. Additionally, when we investigated shared risk variants between heroin addiction and risk of other addiction-related and psychiatric phenotypes using polygenic risk scores, we found a suggestive relationship with variants predicting tobacco addiction, and a significant relationship with variants predicting schizophrenia. Our genome wide association study of heroin dependence provides data in a novel sample, with functionally plausible results and evidence of genetic data of value to the field.

  8. [Genome-wide screening of predicted sugar transporters in Neurospora crassa and the application in hexose fermentation by Saccharomyces cerevisiae].

    PubMed

    Gao, Jingfang; Wang, Bang; Han, Xiaoyun; Tian, Chaoguang

    2017-01-25

    The lignocellulolytic filamentous fungus Neurospora crassa is able to assimilate various mono- and oligo-saccharides. However, more than half of predicted sugar transporters in the genome are still waiting for functional elucidation. In this study, system analysis of substrate spectra of predicted sugar transporters in N. crassa was performed at genome-wide level. NCU01868 and NCU08152 have the capability of uptaking various hexose, which are named as NcHXT-1 and NcHXT-2 respectively. Their transport activities for glucose were further confirmed by fluorescence resonance energy transfer analysis. Over-expression of either NcHXT-1 or NcHXT-2 in the null-hexose-transporter yeast EBY.VW4000 restored the growth and ethanol fermentation under submerged fermentation with glucose, galactose, or mannose as the sole carbon source. NcHXT-1/-2 homologues were found in a variety of cellulolytic fungi. Functional identification of two filamentous fungal-conserved hexose transporters NcHXT-1/-2 via genome scanning would represent novel targets for ongoing efforts in engineering cellulolytic fungi and hexose fermentation in yeast.

  9. A genome-wide association study of corneal astigmatism: The CREAM Consortium.

    PubMed

    Shah, Rupal L; Li, Qing; Zhao, Wanting; Tedja, Milly S; Tideman, J Willem L; Khawaja, Anthony P; Fan, Qiao; Yazar, Seyhan; Williams, Katie M; Verhoeven, Virginie J M; Xie, Jing; Wang, Ya Xing; Hess, Moritz; Nickels, Stefan; Lackner, Karl J; Pärssinen, Olavi; Wedenoja, Juho; Biino, Ginevra; Concas, Maria Pina; Uitterlinden, André; Rivadeneira, Fernando; Jaddoe, Vincent W V; Hysi, Pirro G; Sim, Xueling; Tan, Nicholas; Tham, Yih-Chung; Sensaki, Sonoko; Hofman, Albert; Vingerling, Johannes R; Jonas, Jost B; Mitchell, Paul; Hammond, Christopher J; Höhn, René; Baird, Paul N; Wong, Tien-Yin; Cheng, Chinfsg-Yu; Teo, Yik Ying; Mackey, David A; Williams, Cathy; Saw, Seang-Mei; Klaver, Caroline C W; Guggenheim, Jeremy A; Bailey-Wilson, Joan E

    2018-01-01

    To identify genes and genetic markers associated with corneal astigmatism. A meta-analysis of genome-wide association studies (GWASs) of corneal astigmatism undertaken for 14 European ancestry (n=22,250) and 8 Asian ancestry (n=9,120) cohorts was performed by the Consortium for Refractive Error and Myopia. Cases were defined as having >0.75 diopters of corneal astigmatism. Subsequent gene-based and gene-set analyses of the meta-analyzed results of European ancestry cohorts were performed using VEGAS2 and MAGMA software. Additionally, estimates of single nucleotide polymorphism (SNP)-based heritability for corneal and refractive astigmatism and the spherical equivalent were calculated for Europeans using LD score regression. The meta-analysis of all cohorts identified a genome-wide significant locus near the platelet-derived growth factor receptor alpha ( PDGFRA ) gene: top SNP: rs7673984, odds ratio=1.12 (95% CI:1.08-1.16), p=5.55×10 -9 . No other genome-wide significant loci were identified in the combined analysis or European/Asian ancestry-specific analyses. Gene-based analysis identified three novel candidate genes for corneal astigmatism in Europeans-claudin-7 ( CLDN7 ), acid phosphatase 2, lysosomal ( ACP2 ), and TNF alpha-induced protein 8 like 3 ( TNFAIP8L3 ). In addition to replicating a previously identified genome-wide significant locus for corneal astigmatism near the PDGFRA gene, gene-based analysis identified three novel candidate genes, CLDN7 , ACP2 , and TNFAIP8L3 , that warrant further investigation to understand their role in the pathogenesis of corneal astigmatism. The much lower number of genetic variants and genes demonstrating an association with corneal astigmatism compared to published spherical equivalent GWAS analyses suggest a greater influence of rare genetic variants, non-additive genetic effects, or environmental factors in the development of astigmatism.

  10. A genome-wide association study of corneal astigmatism: The CREAM Consortium

    PubMed Central

    Shah, Rupal L.; Li, Qing; Zhao, Wanting; Tedja, Milly S.; Tideman, J. Willem L.; Khawaja, Anthony P.; Fan, Qiao; Yazar, Seyhan; Williams, Katie M.; Verhoeven, Virginie J.M.; Xie, Jing; Wang, Ya Xing; Hess, Moritz; Nickels, Stefan; Lackner, Karl J.; Pärssinen, Olavi; Wedenoja, Juho; Biino, Ginevra; Concas, Maria Pina; Uitterlinden, André; Rivadeneira, Fernando; Jaddoe, Vincent W.V.; Hysi, Pirro G.; Sim, Xueling; Tan, Nicholas; Tham, Yih-Chung; Sensaki, Sonoko; Hofman, Albert; Vingerling, Johannes R.; Jonas, Jost B.; Mitchell, Paul; Hammond, Christopher J.; Höhn, René; Baird, Paul N.; Wong, Tien-Yin; Cheng, Chinfsg-Yu; Teo, Yik Ying; Mackey, David A.; Williams, Cathy; Saw, Seang-Mei; Klaver, Caroline C.W.; Bailey-Wilson, Joan E.

    2018-01-01

    Purpose To identify genes and genetic markers associated with corneal astigmatism. Methods A meta-analysis of genome-wide association studies (GWASs) of corneal astigmatism undertaken for 14 European ancestry (n=22,250) and 8 Asian ancestry (n=9,120) cohorts was performed by the Consortium for Refractive Error and Myopia. Cases were defined as having >0.75 diopters of corneal astigmatism. Subsequent gene-based and gene-set analyses of the meta-analyzed results of European ancestry cohorts were performed using VEGAS2 and MAGMA software. Additionally, estimates of single nucleotide polymorphism (SNP)-based heritability for corneal and refractive astigmatism and the spherical equivalent were calculated for Europeans using LD score regression. Results The meta-analysis of all cohorts identified a genome-wide significant locus near the platelet-derived growth factor receptor alpha (PDGFRA) gene: top SNP: rs7673984, odds ratio=1.12 (95% CI:1.08–1.16), p=5.55×10−9. No other genome-wide significant loci were identified in the combined analysis or European/Asian ancestry-specific analyses. Gene-based analysis identified three novel candidate genes for corneal astigmatism in Europeans—claudin-7 (CLDN7), acid phosphatase 2, lysosomal (ACP2), and TNF alpha-induced protein 8 like 3 (TNFAIP8L3). Conclusions In addition to replicating a previously identified genome-wide significant locus for corneal astigmatism near the PDGFRA gene, gene-based analysis identified three novel candidate genes, CLDN7, ACP2, and TNFAIP8L3, that warrant further investigation to understand their role in the pathogenesis of corneal astigmatism. The much lower number of genetic variants and genes demonstrating an association with corneal astigmatism compared to published spherical equivalent GWAS analyses suggest a greater influence of rare genetic variants, non-additive genetic effects, or environmental factors in the development of astigmatism. PMID:29422769

  11. Population Stratification in the Context of Diverse Epidemiologic Surveys Sans Genome-Wide Data

    PubMed Central

    Oetjens, Matthew T.; Brown-Gentry, Kristin; Goodloe, Robert; Dilks, Holli H.; Crawford, Dana C.

    2016-01-01

    Population stratification or confounding by genetic ancestry is a potential cause of false associations in genetic association studies. Estimation of and adjustment for genetic ancestry has become common practice thanks in part to the availability of ancestry informative markers on genome-wide association study (GWAS) arrays. While array data is now widespread, these data are not ubiquitous as several large epidemiologic and clinic-based studies lack genome-wide data. One such large epidemiologic-based study lacking genome-wide data accessible to investigators is the National Health and Nutrition Examination Surveys (NHANES), population-based cross-sectional surveys of Americans linked to demographic, health, and lifestyle data conducted by the Centers for Disease Control and Prevention. DNA samples (n = 14,998) were extracted from biospecimens from consented NHANES participants between 1991–1994 (NHANES III, phase 2) and 1999–2002 and represent three major self-identified racial/ethnic groups: non-Hispanic whites (n = 6,634), non-Hispanic blacks (n = 3,458), and Mexican Americans (n = 3,950). We as the Epidemiologic Architecture for Genes Linked to Environment study genotyped candidate gene and GWAS-identified index variants in NHANES as part of the larger Population Architecture using Genomics and Epidemiology I study for collaborative genetic association studies. To enable basic quality control such as estimation of genetic ancestry to control for population stratification in NHANES san genome-wide data, we outline here strategies that use limited genetic data to identify the markers optimal for characterizing genetic ancestry. From among 411 and 295 autosomal SNPs available in NHANES III and NHANES 1999–2002, we demonstrate that markers with ancestry information can be identified to estimate global ancestry. Despite limited resolution, global genetic ancestry is highly correlated with self-identified race for the majority of participants, although less so

  12. Genome-wide investigation reveals high evolutionary rates in annual model plants.

    PubMed

    Yue, Jia-Xing; Li, Jinpeng; Wang, Dan; Araki, Hitoshi; Tian, Dacheng; Yang, Sihai

    2010-11-09

    Rates of molecular evolution vary widely among species. While significant deviations from molecular clock have been found in many taxa, effects of life histories on molecular evolution are not fully understood. In plants, annual/perennial life history traits have long been suspected to influence the evolutionary rates at the molecular level. To date, however, the number of genes investigated on this subject is limited and the conclusions are mixed. To evaluate the possible heterogeneity in evolutionary rates between annual and perennial plants at the genomic level, we investigated 85 nuclear housekeeping genes, 10 non-housekeeping families, and 34 chloroplast genes using the genomic data from model plants including Arabidopsis thaliana and Medicago truncatula for annuals and grape (Vitis vinifera) and popular (Populus trichocarpa) for perennials. According to the cross-comparisons among the four species, 74-82% of the nuclear genes and 71-97% of the chloroplast genes suggested higher rates of molecular evolution in the two annuals than those in the two perennials. The significant heterogeneity in evolutionary rate between annuals and perennials was consistently found both in nonsynonymous sites and synonymous sites. While a linear correlation of evolutionary rates in orthologous genes between species was observed in nonsynonymous sites, the correlation was weak or invisible in synonymous sites. This tendency was clearer in nuclear genes than in chloroplast genes, in which the overall evolutionary rate was small. The slope of the regression line was consistently lower than unity, further confirming the higher evolutionary rate in annuals at the genomic level. The higher evolutionary rate in annuals than in perennials appears to be a universal phenomenon both in nuclear and chloroplast genomes in the four dicot model plants we investigated. Therefore, such heterogeneity in evolutionary rate should result from factors that have genome-wide influence, most likely those

  13. Genome-wide association study of antisocial personality disorder

    PubMed Central

    Rautiainen, M-R; Paunio, T; Repo-Tiihonen, E; Virkkunen, M; Ollila, H M; Sulkava, S; Jolanki, O; Palotie, A; Tiihonen, J

    2016-01-01

    The pathophysiology of antisocial personality disorder (ASPD) remains unclear. Although the most consistent biological finding is reduced grey matter volume in the frontal cortex, about 50% of the total liability to developing ASPD has been attributed to genetic factors. The contributing genes remain largely unknown. Therefore, we sought to study the genetic background of ASPD. We conducted a genome-wide association study (GWAS) and a replication analysis of Finnish criminal offenders fulfilling DSM-IV criteria for ASPD (N=370, N=5850 for controls, GWAS; N=173, N=3766 for controls and replication sample). The GWAS resulted in suggestive associations of two clusters of single-nucleotide polymorphisms at 6p21.2 and at 6p21.32 at the human leukocyte antigen (HLA) region. Imputation of HLA alleles revealed an independent association with DRB1*01:01 (odds ratio (OR)=2.19 (1.53–3.14), P=1.9 × 10-5). Two polymorphisms at 6p21.2 LINC00951–LRFN2 gene region were replicated in a separate data set, and rs4714329 reached genome-wide significance (OR=1.59 (1.37–1.85), P=1.6 × 10−9) in the meta-analysis. The risk allele also associated with antisocial features in the general population conditioned for severe problems in childhood family (β=0.68, P=0.012). Functional analysis in brain tissue in open access GTEx and Braineac databases revealed eQTL associations of rs4714329 with LINC00951 and LRFN2 in cerebellum. In humans, LINC00951 and LRFN2 are both expressed in the brain, especially in the frontal cortex, which is intriguing considering the role of the frontal cortex in behavior and the neuroanatomical findings of reduced gray matter volume in ASPD. To our knowledge, this is the first study showing genome-wide significant and replicable findings on genetic variants associated with any personality disorder. PMID:27598967

  14. Genome-wide association study of antisocial personality disorder.

    PubMed

    Rautiainen, M-R; Paunio, T; Repo-Tiihonen, E; Virkkunen, M; Ollila, H M; Sulkava, S; Jolanki, O; Palotie, A; Tiihonen, J

    2016-09-06

    The pathophysiology of antisocial personality disorder (ASPD) remains unclear. Although the most consistent biological finding is reduced grey matter volume in the frontal cortex, about 50% of the total liability to developing ASPD has been attributed to genetic factors. The contributing genes remain largely unknown. Therefore, we sought to study the genetic background of ASPD. We conducted a genome-wide association study (GWAS) and a replication analysis of Finnish criminal offenders fulfilling DSM-IV criteria for ASPD (N=370, N=5850 for controls, GWAS; N=173, N=3766 for controls and replication sample). The GWAS resulted in suggestive associations of two clusters of single-nucleotide polymorphisms at 6p21.2 and at 6p21.32 at the human leukocyte antigen (HLA) region. Imputation of HLA alleles revealed an independent association with DRB1*01:01 (odds ratio (OR)=2.19 (1.53-3.14), P=1.9 × 10(-5)). Two polymorphisms at 6p21.2 LINC00951-LRFN2 gene region were replicated in a separate data set, and rs4714329 reached genome-wide significance (OR=1.59 (1.37-1.85), P=1.6 × 10(-9)) in the meta-analysis. The risk allele also associated with antisocial features in the general population conditioned for severe problems in childhood family (β=0.68, P=0.012). Functional analysis in brain tissue in open access GTEx and Braineac databases revealed eQTL associations of rs4714329 with LINC00951 and LRFN2 in cerebellum. In humans, LINC00951 and LRFN2 are both expressed in the brain, especially in the frontal cortex, which is intriguing considering the role of the frontal cortex in behavior and the neuroanatomical findings of reduced gray matter volume in ASPD. To our knowledge, this is the first study showing genome-wide significant and replicable findings on genetic variants associated with any personality disorder.

  15. Genome-wide characterization of Mediator recruitment, function, and regulation.

    PubMed

    Grünberg, Sebastian; Zentner, Gabriel E

    2017-05-27

    Mediator is a conserved and essential coactivator complex broadly required for RNA polymerase II (RNAPII) transcription. Recent genome-wide studies of Mediator binding in budding yeast have revealed new insights into the functions of this critical complex and raised new questions about its role in the regulation of gene expression.

  16. Multi-trait analysis of genome-wide association summary statistics using MTAG.

    PubMed

    Turley, Patrick; Walters, Raymond K; Maghzian, Omeed; Okbay, Aysu; Lee, James J; Fontana, Mark Alan; Nguyen-Viet, Tuan Anh; Wedow, Robbee; Zacher, Meghan; Furlotte, Nicholas A; Magnusson, Patrik; Oskarsson, Sven; Johannesson, Magnus; Visscher, Peter M; Laibson, David; Cesarini, David; Neale, Benjamin M; Benjamin, Daniel J

    2018-02-01

    We introduce multi-trait analysis of GWAS (MTAG), a method for joint analysis of summary statistics from genome-wide association studies (GWAS) of different traits, possibly from overlapping samples. We apply MTAG to summary statistics for depressive symptoms (N eff  = 354,862), neuroticism (N = 168,105), and subjective well-being (N = 388,538). As compared to the 32, 9, and 13 genome-wide significant loci identified in the single-trait GWAS (most of which are themselves novel), MTAG increases the number of associated loci to 64, 37, and 49, respectively. Moreover, association statistics from MTAG yield more informative bioinformatics analyses and increase the variance explained by polygenic scores by approximately 25%, matching theoretical expectations.

  17. Position-based scanning for comparative genomics and identification of genetic islands in Haemophilus influenzae type b.

    PubMed

    Bergman, Nicholas H; Akerley, Brian J

    2003-03-01

    Bacteria exhibit extensive genetic heterogeneity within species. In many cases, these differences account for virulence properties unique to specific strains. Several such loci have been discovered in the genome of the type b serotype of Haemophilus influenzae, a human pathogen able to cause meningitis, pneumonia, and septicemia. Here we report application of a PCR-based scanning procedure to compare the genome of a virulent type b (Hib) strain with that of the laboratory-passaged Rd KW20 strain for which a complete genome sequence is available. We have identified seven DNA segments or H. influenzae genetic islands (HiGIs) present in the type b genome and absent from the Rd genome. These segments vary in size and content and show signs of horizontal gene transfer in that their percent G+C content differs from that of the rest of the H. influenzae genome, they contain genes similar to those found on phages or other mobile elements, or they are flanked by DNA repeats. Several of these loci represent potential pathogenicity islands, because they contain genes likely to mediate interactions with the host. These newly identified genetic islands provide areas of investigation into both the evolution and pathogenesis of H. influenzae. In addition, the genome scanning approach developed to identify these islands provides a rapid means to compare the genomes of phenotypically diverse bacterial strains once the genome sequence of one representative strain has been determined.

  18. Bioinformatics challenges for genome-wide association studies.

    PubMed

    Moore, Jason H; Asselbergs, Folkert W; Williams, Scott M

    2010-02-15

    The sequencing of the human genome has made it possible to identify an informative set of >1 million single nucleotide polymorphisms (SNPs) across the genome that can be used to carry out genome-wide association studies (GWASs). The availability of massive amounts of GWAS data has necessitated the development of new biostatistical methods for quality control, imputation and analysis issues including multiple testing. This work has been successful and has enabled the discovery of new associations that have been replicated in multiple studies. However, it is now recognized that most SNPs discovered via GWAS have small effects on disease susceptibility and thus may not be suitable for improving health care through genetic testing. One likely explanation for the mixed results of GWAS is that the current biostatistical analysis paradigm is by design agnostic or unbiased in that it ignores all prior knowledge about disease pathobiology. Further, the linear modeling framework that is employed in GWAS often considers only one SNP at a time thus ignoring their genomic and environmental context. There is now a shift away from the biostatistical approach toward a more holistic approach that recognizes the complexity of the genotype-phenotype relationship that is characterized by significant heterogeneity and gene-gene and gene-environment interaction. We argue here that bioinformatics has an important role to play in addressing the complexity of the underlying genetic basis of common human diseases. The goal of this review is to identify and discuss those GWAS challenges that will require computational methods.

  19. Genome-wide Analysis Reveals Extensive Functional Interaction between DNA Replication Initiation and Transcription in the Genome of Trypanosoma brucei

    PubMed Central

    Tiengwe, Calvin; Marcello, Lucio; Farr, Helen; Dickens, Nicholas; Kelly, Steven; Swiderski, Michal; Vaughan, Diane; Gull, Keith; Barry, J. David; Bell, Stephen D.; McCulloch, Richard

    2012-01-01

    Summary Identification of replication initiation sites, termed origins, is a crucial step in understanding genome transmission in any organism. Transcription of the Trypanosoma brucei genome is highly unusual, with each chromosome comprising a few discrete transcription units. To understand how DNA replication occurs in the context of such organization, we have performed genome-wide mapping of the binding sites of the replication initiator ORC1/CDC6 and have identified replication origins, revealing that both localize to the boundaries of the transcription units. A remarkably small number of active origins is seen, whose spacing is greater than in any other eukaryote. We show that replication and transcription in T. brucei have a profound functional overlap, as reducing ORC1/CDC6 levels leads to genome-wide increases in mRNA levels arising from the boundaries of the transcription units. In addition, ORC1/CDC6 loss causes derepression of silent Variant Surface Glycoprotein genes, which are critical for host immune evasion. PMID:22840408

  20. Five endometrial cancer risk loci identified through genome-wide association analysis.

    PubMed

    Cheng, Timothy Ht; Thompson, Deborah J; O'Mara, Tracy A; Painter, Jodie N; Glubb, Dylan M; Flach, Susanne; Lewis, Annabelle; French, Juliet D; Freeman-Mills, Luke; Church, David; Gorman, Maggie; Martin, Lynn; Hodgson, Shirley; Webb, Penelope M; Attia, John; Holliday, Elizabeth G; McEvoy, Mark; Scott, Rodney J; Henders, Anjali K; Martin, Nicholas G; Montgomery, Grant W; Nyholt, Dale R; Ahmed, Shahana; Healey, Catherine S; Shah, Mitul; Dennis, Joe; Fasching, Peter A; Beckmann, Matthias W; Hein, Alexander; Ekici, Arif B; Hall, Per; Czene, Kamila; Darabi, Hatef; Li, Jingmei; Dörk, Thilo; Dürst, Matthias; Hillemanns, Peter; Runnebaum, Ingo; Amant, Frederic; Schrauwen, Stefanie; Zhao, Hui; Lambrechts, Diether; Depreeuw, Jeroen; Dowdy, Sean C; Goode, Ellen L; Fridley, Brooke L; Winham, Stacey J; Njølstad, Tormund S; Salvesen, Helga B; Trovik, Jone; Werner, Henrica Mj; Ashton, Katie; Otton, Geoffrey; Proietto, Tony; Liu, Tao; Mints, Miriam; Tham, Emma; Consortium, Chibcha; Jun Li, Mulin; Yip, Shun H; Wang, Junwen; Bolla, Manjeet K; Michailidou, Kyriaki; Wang, Qin; Tyrer, Jonathan P; Dunlop, Malcolm; Houlston, Richard; Palles, Claire; Hopper, John L; Peto, Julian; Swerdlow, Anthony J; Burwinkel, Barbara; Brenner, Hermann; Meindl, Alfons; Brauch, Hiltrud; Lindblom, Annika; Chang-Claude, Jenny; Couch, Fergus J; Giles, Graham G; Kristensen, Vessela N; Cox, Angela; Cunningham, Julie M; Pharoah, Paul D P; Dunning, Alison M; Edwards, Stacey L; Easton, Douglas F; Tomlinson, Ian; Spurdle, Amanda B

    2016-06-01

    We conducted a meta-analysis of three endometrial cancer genome-wide association studies (GWAS) and two follow-up phases totaling 7,737 endometrial cancer cases and 37,144 controls of European ancestry. Genome-wide imputation and meta-analysis identified five new risk loci of genome-wide significance at likely regulatory regions on chromosomes 13q22.1 (rs11841589, near KLF5), 6q22.31 (rs13328298, in LOC643623 and near HEY2 and NCOA7), 8q24.21 (rs4733613, telomeric to MYC), 15q15.1 (rs937213, in EIF2AK4, near BMF) and 14q32.33 (rs2498796, in AKT1, near SIVA1). We also found a second independent 8q24.21 signal (rs17232730). Functional studies of the 13q22.1 locus showed that rs9600103 (pairwise r(2) = 0.98 with rs11841589) is located in a region of active chromatin that interacts with the KLF5 promoter region. The rs9600103[T] allele that is protective in endometrial cancer suppressed gene expression in vitro, suggesting that regulation of the expression of KLF5, a gene linked to uterine development, is implicated in tumorigenesis. These findings provide enhanced insight into the genetic and biological basis of endometrial cancer.

  1. Sniffing out significant "Pee values": genome wide association study of asparagus anosmia.

    PubMed

    Markt, Sarah C; Nuttall, Elizabeth; Turman, Constance; Sinnott, Jennifer; Rimm, Eric B; Ecsedy, Ethan; Unger, Robert H; Fall, Katja; Finn, Stephen; Jensen, Majken K; Rider, Jennifer R; Kraft, Peter; Mucci, Lorelei A

    2016-12-13

     To determine the inherited factors associated with the ability to smell asparagus metabolites in urine.  Genome wide association study.  Nurses' Health Study and Health Professionals Follow-up Study cohorts.  6909 men and women of European-American descent with available genetic data from genome wide association studies.  Participants were characterized as asparagus smellers if they strongly agreed with the prompt "after eating asparagus, you notice a strong characteristic odor in your urine," and anosmic if otherwise. We calculated per-allele estimates of asparagus anosmia for about nine million single nucleotide polymorphisms using logistic regression. P values <5×10 -8 were considered as genome wide significant.  58.0% of men (n=1449/2500) and 61.5% of women (n=2712/4409) had anosmia. 871 single nucleotide polymorphisms reached genome wide significance for asparagus anosmia, all in a region on chromosome 1 (1q44: 248139851-248595299) containing multiple genes in the olfactory receptor 2 (OR2) family. Conditional analyses revealed three independent markers associated with asparagus anosmia: rs13373863, rs71538191, and rs6689553.  A large proportion of people have asparagus anosmia. Genetic variation near multiple olfactory receptor genes is associated with the ability of an individual to smell the metabolites of asparagus in urine. Future replication studies are necessary before considering targeted therapies to help anosmic people discover what they are missing. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://group.bmj.com/group/rights-licensing/permissions.

  2. Genome-wide analysis identifies 12 loci influencing human reproductive behavior.

    PubMed

    Barban, Nicola; Jansen, Rick; de Vlaming, Ronald; Vaez, Ahmad; Mandemakers, Jornt J; Tropf, Felix C; Shen, Xia; Wilson, James F; Chasman, Daniel I; Nolte, Ilja M; Tragante, Vinicius; van der Laan, Sander W; Perry, John R B; Kong, Augustine; Ahluwalia, Tarunveer S; Albrecht, Eva; Yerges-Armstrong, Laura; Atzmon, Gil; Auro, Kirsi; Ayers, Kristin; Bakshi, Andrew; Ben-Avraham, Danny; Berger, Klaus; Bergman, Aviv; Bertram, Lars; Bielak, Lawrence F; Bjornsdottir, Gyda; Bonder, Marc Jan; Broer, Linda; Bui, Minh; Barbieri, Caterina; Cavadino, Alana; Chavarro, Jorge E; Turman, Constance; Concas, Maria Pina; Cordell, Heather J; Davies, Gail; Eibich, Peter; Eriksson, Nicholas; Esko, Tõnu; Eriksson, Joel; Falahi, Fahimeh; Felix, Janine F; Fontana, Mark Alan; Franke, Lude; Gandin, Ilaria; Gaskins, Audrey J; Gieger, Christian; Gunderson, Erica P; Guo, Xiuqing; Hayward, Caroline; He, Chunyan; Hofer, Edith; Huang, Hongyan; Joshi, Peter K; Kanoni, Stavroula; Karlsson, Robert; Kiechl, Stefan; Kifley, Annette; Kluttig, Alexander; Kraft, Peter; Lagou, Vasiliki; Lecoeur, Cecile; Lahti, Jari; Li-Gao, Ruifang; Lind, Penelope A; Liu, Tian; Makalic, Enes; Mamasoula, Crysovalanto; Matteson, Lindsay; Mbarek, Hamdi; McArdle, Patrick F; McMahon, George; Meddens, S Fleur W; Mihailov, Evelin; Miller, Mike; Missmer, Stacey A; Monnereau, Claire; van der Most, Peter J; Myhre, Ronny; Nalls, Mike A; Nutile, Teresa; Kalafati, Ioanna Panagiota; Porcu, Eleonora; Prokopenko, Inga; Rajan, Kumar B; Rich-Edwards, Janet; Rietveld, Cornelius A; Robino, Antonietta; Rose, Lynda M; Rueedi, Rico; Ryan, Kathleen A; Saba, Yasaman; Schmidt, Daniel; Smith, Jennifer A; Stolk, Lisette; Streeten, Elizabeth; Tönjes, Anke; Thorleifsson, Gudmar; Ulivi, Sheila; Wedenoja, Juho; Wellmann, Juergen; Willeit, Peter; Yao, Jie; Yengo, Loic; Zhao, Jing Hua; Zhao, Wei; Zhernakova, Daria V; Amin, Najaf; Andrews, Howard; Balkau, Beverley; Barzilai, Nir; Bergmann, Sven; Biino, Ginevra; Bisgaard, Hans; Bønnelykke, Klaus; Boomsma, Dorret I; Buring, Julie E; Campbell, Harry; Cappellani, Stefania; Ciullo, Marina; Cox, Simon R; Cucca, Francesco; Toniolo, Daniela; Davey-Smith, George; Deary, Ian J; Dedoussis, George; Deloukas, Panos; van Duijn, Cornelia M; de Geus, Eco J C; Eriksson, Johan G; Evans, Denis A; Faul, Jessica D; Sala, Cinzia Felicita; Froguel, Philippe; Gasparini, Paolo; Girotto, Giorgia; Grabe, Hans-Jörgen; Greiser, Karin Halina; Groenen, Patrick J F; de Haan, Hugoline G; Haerting, Johannes; Harris, Tamara B; Heath, Andrew C; Heikkilä, Kauko; Hofman, Albert; Homuth, Georg; Holliday, Elizabeth G; Hopper, John; Hyppönen, Elina; Jacobsson, Bo; Jaddoe, Vincent W V; Johannesson, Magnus; Jugessur, Astanand; Kähönen, Mika; Kajantie, Eero; Kardia, Sharon L R; Keavney, Bernard; Kolcic, Ivana; Koponen, Päivikki; Kovacs, Peter; Kronenberg, Florian; Kutalik, Zoltan; La Bianca, Martina; Lachance, Genevieve; Iacono, William G; Lai, Sandra; Lehtimäki, Terho; Liewald, David C; Lindgren, Cecilia M; Liu, Yongmei; Luben, Robert; Lucht, Michael; Luoto, Riitta; Magnus, Per; Magnusson, Patrik K E; Martin, Nicholas G; McGue, Matt; McQuillan, Ruth; Medland, Sarah E; Meisinger, Christa; Mellström, Dan; Metspalu, Andres; Traglia, Michela; Milani, Lili; Mitchell, Paul; Montgomery, Grant W; Mook-Kanamori, Dennis; de Mutsert, Renée; Nohr, Ellen A; Ohlsson, Claes; Olsen, Jørn; Ong, Ken K; Paternoster, Lavinia; Pattie, Alison; Penninx, Brenda W J H; Perola, Markus; Peyser, Patricia A; Pirastu, Mario; Polasek, Ozren; Power, Chris; Kaprio, Jaakko; Raffel, Leslie J; Räikkönen, Katri; Raitakari, Olli; Ridker, Paul M; Ring, Susan M; Roll, Kathryn; Rudan, Igor; Ruggiero, Daniela; Rujescu, Dan; Salomaa, Veikko; Schlessinger, David; Schmidt, Helena; Schmidt, Reinhold; Schupf, Nicole; Smit, Johannes; Sorice, Rossella; Spector, Tim D; Starr, John M; Stöckl, Doris; Strauch, Konstantin; Stumvoll, Michael; Swertz, Morris A; Thorsteinsdottir, Unnur; Thurik, A Roy; Timpson, Nicholas J; Tung, Joyce Y; Uitterlinden, André G; Vaccargiu, Simona; Viikari, Jorma; Vitart, Veronique; Völzke, Henry; Vollenweider, Peter; Vuckovic, Dragana; Waage, Johannes; Wagner, Gert G; Wang, Jie Jin; Wareham, Nicholas J; Weir, David R; Willemsen, Gonneke; Willeit, Johann; Wright, Alan F; Zondervan, Krina T; Stefansson, Kari; Krueger, Robert F; Lee, James J; Benjamin, Daniel J; Cesarini, David; Koellinger, Philipp D; den Hoed, Marcel; Snieder, Harold; Mills, Melinda C

    2016-12-01

    The genetic architecture of human reproductive behavior-age at first birth (AFB) and number of children ever born (NEB)-has a strong relationship with fitness, human development, infertility and risk of neuropsychiatric disorders. However, very few genetic loci have been identified, and the underlying mechanisms of AFB and NEB are poorly understood. We report a large genome-wide association study of both sexes including 251,151 individuals for AFB and 343,072 individuals for NEB. We identified 12 independent loci that are significantly associated with AFB and/or NEB in a SNP-based genome-wide association study and 4 additional loci associated in a gene-based effort. These loci harbor genes that are likely to have a role, either directly or by affecting non-local gene expression, in human reproduction and infertility, thereby increasing understanding of these complex traits.

  3. A genome-wide BAC-end sequence survey provides first insights into sweetpotato (Ipomoea batatas (L.) Lam.) genome composition.

    PubMed

    Si, Zengzhi; Du, Bing; Huo, Jinxi; He, Shaozhen; Liu, Qingchang; Zhai, Hong

    2016-11-21

    Sweetpotato, Ipomoea batatas (L.) Lam., is an important food crop widely grown in the world. However, little is known about the genome of this species because it is a highly heterozygous hexaploid. Gaining a more in-depth knowledge of sweetpotato genome is therefore necessary and imperative. In this study, the first bacterial artificial chromosome (BAC) library of sweetpotato was constructed. Clones from the BAC library were end-sequenced and analyzed to provide genome-wide information about this species. The BAC library contained 240,384 clones with an average insert size of 101 kb and had a 7.93-10.82 × coverage of the genome, and the probability of isolating any single-copy DNA sequence from the library was more than 99%. Both ends of 8310 BAC clones randomly selected from the library were sequenced to generate 11,542 high-quality BAC-end sequences (BESs), with an accumulative length of 7,595,261 bp and an average length of 658 bp. Analysis of the BESs revealed that 12.17% of the sweetpotato genome were known repetitive DNA, including 7.37% long terminal repeat (LTR) retrotransposons, 1.15% Non-LTR retrotransposons and 1.42% Class II DNA transposons etc., 18.31% of the genome were identified as sweetpotato-unique repetitive DNA and 10.00% of the genome were predicted to be coding regions. In total, 3,846 simple sequences repeats (SSRs) were identified, with a density of one SSR per 1.93 kb, from which 288 SSRs primers were designed and tested for length polymorphism using 20 sweetpotato accessions, 173 (60.07%) of them produced polymorphic bands. Sweetpotato BESs had significant hits to the genome sequences of I. trifida and more matches to the whole-genome sequences of Solanum lycopersicum than those of Vitis vinifera, Theobroma cacao and Arabidopsis thaliana. The first BAC library for sweetpotato has been successfully constructed. The high quality BESs provide first insights into sweetpotato genome composition, and have significant hits to the genome

  4. Genome-Wide Analysis in Brazilians Reveals Highly Differentiated Native American Genome Regions

    PubMed Central

    Havt, Alexandre; Nayak, Uma; Pinkerton, Relana; Farber, Emily; Concannon, Patrick; Lima, Aldo A.; Guerrant, Richard L.

    2017-01-01

    Despite its population, geographic size, and emerging economic importance, disproportionately little genome-scale research exists into genetic factors that predispose Brazilians to disease, or the population genetics of risk. After identification of suitable proxy populations and careful analysis of tri-continental admixture in 1,538 North-Eastern Brazilians to estimate individual ancestry and ancestral allele frequencies, we computed 400,000 genome-wide locus-specific branch length (LSBL) Fst statistics of Brazilian Amerindian ancestry compared to European and African; and a similar set of differentiation statistics for their Amerindian component compared with the closest Asian 1000 Genomes population (surprisingly, Bengalis in Bangladesh). After ranking SNPs by these statistics, we identified the top 10 highly differentiated SNPs in five genome regions in the LSBL tests of Brazilian Amerindian ancestry compared to European and African; and the top 10 SNPs in eight regions comparing their Amerindian component to the closest Asian 1000 Genomes population. We found SNPs within or proximal to the genes CIITA (rs6498115), SMC6 (rs1834619), and KLHL29 (rs2288697) were most differentiated in the Amerindian-specific branch, while SNPs in the genes ADAMTS9 (rs7631391), DOCK2 (rs77594147), SLC28A1 (rs28649017), ARHGAP5 (rs7151991), and CIITA (rs45601437) were most highly differentiated in the Asian comparison. These genes are known to influence immune function, metabolic and anthropometry traits, and embryonic development. These analyses have identified candidate genes for selection within Amerindian ancestry, and by comparison of the two analyses, those for which the differentiation may have arisen during the migration from Asia to the Americas. PMID:28100790

  5. Genome-wide identification of conserved intronic non-coding sequences using a Bayesian segmentation approach.

    PubMed

    Algama, Manjula; Tasker, Edward; Williams, Caitlin; Parslow, Adam C; Bryson-Richardson, Robert J; Keith, Jonathan M

    2017-03-27

    Computational identification of non-coding RNAs (ncRNAs) is a challenging problem. We describe a genome-wide analysis using Bayesian segmentation to identify intronic elements highly conserved between three evolutionarily distant vertebrate species: human, mouse and zebrafish. We investigate the extent to which these elements include ncRNAs (or conserved domains of ncRNAs) and regulatory sequences. We identified 655 deeply conserved intronic sequences in a genome-wide analysis. We also performed a pathway-focussed analysis on genes involved in muscle development, detecting 27 intronic elements, of which 22 were not detected in the genome-wide analysis. At least 87% of the genome-wide and 70% of the pathway-focussed elements have existing annotations indicative of conserved RNA secondary structure. The expression of 26 of the pathway-focused elements was examined using RT-PCR, providing confirmation that they include expressed ncRNAs. Consistent with previous studies, these elements are significantly over-represented in the introns of transcription factors. This study demonstrates a novel, highly effective, Bayesian approach to identifying conserved non-coding sequences. Our results complement previous findings that these sequences are enriched in transcription factors. However, in contrast to previous studies which suggest the majority of conserved sequences are regulatory factor binding sites, the majority of conserved sequences identified using our approach contain evidence of conserved RNA secondary structures, and our laboratory results suggest most are expressed. Functional roles at DNA and RNA levels are not mutually exclusive, and many of our elements possess evidence of both. Moreover, ncRNAs play roles in transcriptional and post-transcriptional regulation, and this may contribute to the over-representation of these elements in introns of transcription factors. We attribute the higher sensitivity of the pathway-focussed analysis compared to the genome-wide

  6. Genome-Wide Analyses Reveal Genes Subject to Positive Selection in Pasteurella multocida

    PubMed Central

    Cao, Peili; Guo, Dongchun; Liu, Jiasen; Jiang, Qian; Xu, Zhuofei; Qu, Liandong

    2017-01-01

    Pasteurella multocida, a Gram-negative opportunistic pathogen, has led to a broad range of diseases in mammals and birds, including fowl cholera in poultry, pneumonia and atrophic rhinitis in swine and rabbit, hemorrhagic septicemia in cattle, and bite infections in humans. In order to better interpret the genetic diversity and adaptation evolution of this pathogen, seven genomes of P. multocida strains isolated from fowls, rabbit and pigs were determined by using high-throughput sequencing approach. Together with publicly available P. multocida genomes, evolutionary features were systematically analyzed in this study. Clustering of 70,565 protein-coding genes showed that the pangenome of 33 P. multocida strains was composed of 1,602 core genes, 1,364 dispensable genes, and 1,070 strain-specific genes. Of these, we identified a full spectrum of genes related to virulence factors and revealed genetic diversity of these potential virulence markers across P. multocida strains, e.g., bcbAB, fcbC, lipA, bexDCA, ctrCD, lgtA, lgtC, lic2A involved in biogenesis of surface polysaccharides, hsf encoding autotransporter adhesin, and fhaB encoding filamentous haemagglutinin. Furthermore, based on genome-wide positive selection scanning, a total of 35 genes were subject to strong selection pressure. Extensive analyses of protein subcellular location indicated that membrane-associated genes were highly abundant among all positively selected genes. The detected amino acid sites undergoing adaptive selection were preferably located in extracellular space, perhaps associated with bacterial evasion of host immune responses. Our findings shed more light on conservation and distribution of virulence-associated genes across P. multocida strains. Meanwhile, this study provides a genetic context for future researches on the mechanism of adaptive evolution in P. multocida. PMID:28611758

  7. Genome-Wide Profiling of DNA Double-Strand Breaks by the BLESS and BLISS Methods.

    PubMed

    Mirzazadeh, Reza; Kallas, Tomasz; Bienko, Magda; Crosetto, Nicola

    2018-01-01

    DNA double-strand breaks (DSBs) are major DNA lesions that are constantly formed during physiological processes such as DNA replication, transcription, and recombination, or as a result of exogenous agents such as ionizing radiation, radiomimetic drugs, and genome editing nucleases. Unrepaired DSBs threaten genomic stability by leading to the formation of potentially oncogenic rearrangements such as translocations. In past few years, several methods based on next-generation sequencing (NGS) have been developed to study the genome-wide distribution of DSBs or their conversion to translocation events. We developed Breaks Labeling, Enrichment on Streptavidin, and Sequencing (BLESS), which was the first method for direct labeling of DSBs in situ followed by their genome-wide mapping at nucleotide resolution (Crosetto et al., Nat Methods 10:361-365, 2013). Recently, we have further expanded the quantitative nature, applicability, and scalability of BLESS by developing Breaks Labeling In Situ and Sequencing (BLISS) (Yan et al., Nat Commun 8:15058, 2017). Here, we first present an overview of existing methods for genome-wide localization of DSBs, and then focus on the BLESS and BLISS methods, discussing different assay design options depending on the sample type and application.

  8. Developmental Stability Covaries with Genome-Wide and Single-Locus Heterozygosity in House Sparrows

    PubMed Central

    Vangestel, Carl; Mergeay, Joachim; Dawson, Deborah A.; Vandomme, Viki; Lens, Luc

    2011-01-01

    Fluctuating asymmetry (FA), a measure of developmental instability, has been hypothesized to increase with genetic stress. Despite numerous studies providing empirical evidence for associations between FA and genome-wide properties such as multi-locus heterozygosity, support for single-locus effects remains scant. Here we test if, and to what extent, FA co-varies with single- and multilocus markers of genetic diversity in house sparrow (Passer domesticus) populations along an urban gradient. In line with theoretical expectations, FA was inversely correlated with genetic diversity estimated at genome level. However, this relationship was largely driven by variation at a single key locus. Contrary to our expectations, relationships between FA and genetic diversity were not stronger in individuals from urban populations that experience higher nutritional stress. We conclude that loss of genetic diversity adversely affects developmental stability in P. domesticus, and more generally, that the molecular basis of developmental stability may involve complex interactions between local and genome-wide effects. Further study on the relative effects of single-locus and genome-wide effects on the developmental stability of populations with different genetic properties is therefore needed. PMID:21747940

  9. Genome-wide characterization of Mediator recruitment, function, and regulation

    PubMed Central

    2017-01-01

    ABSTRACT Mediator is a conserved and essential coactivator complex broadly required for RNA polymerase II (RNAPII) transcription. Recent genome-wide studies of Mediator binding in budding yeast have revealed new insights into the functions of this critical complex and raised new questions about its role in the regulation of gene expression. PMID:28301289

  10. GST-PRIME: an algorithm for genome-wide primer design.

    PubMed

    Leister, Dario; Varotto, Claudio

    2007-01-01

    The profiling of mRNA expression based on DNA arrays has become a powerful tool to study genome-wide transcription of genes in a number of organisms. GST-PRIME is a software package created to facilitate large-scale primer design for the amplification of probes to be immobilized on arrays for transcriptome analyses, even though it can be also applied in low-throughput approaches. GST-PRIME allows highly efficient, direct amplification of gene-sequence tags (GSTs) from genomic DNA (gDNA), starting from annotated genome or transcript sequences. GST-PRIME provides a customer-friendly platform for automatic primer design, and despite the relative simplicity of the algorithm, experimental tests in the model plant species Arabidopsis thaliana confirmed the reliability of the software. This chapter describes the algorithm used for primer design, its input and output files, and the installation of the standalone package and its use.

  11. Genome-wide linkage in Utah autism pedigrees

    PubMed Central

    Allen-Brady, K; Robison, R; Cannon, D; Varvil, T; Villalobos, M; Pingree, C; Leppert, MF; Miller, J; McMahon, WM; Coon, H

    2014-01-01

    Genetic studies of autism over the past decade suggest a complex landscape of multiple genes. In the face of this heterogeneity, studies that include large extended pedigrees may offer valuable insight, as the relatively few susceptibility genes within single large families may be more easily discerned. This genome-wide screen of 70 families includes 20 large extended pedigrees of 6–9 generations, 6 moderate-sized families of 4–5 generations, and 44 smaller families of 2–3 generations. The Center for Inherited Disease Research (CIDR) provided genotyping using the Illumina Linkage Panel 12, a 6K single nucleotide polymorphism (SNP) platform. Results from 192 subjects with an Autism Spectrum Disorder (ASD), and 461 of their relatives revealed genome-wide significance on chromosome 15q, with three possibly distinct peaks: 15q13.1-q14 (HLOD=4.09 at 29,459,872bp); 15q14-q21.1 (HLOD=3.59 at 36,837,208bp); and 15q21.1-q22.2 (HLOD=5.31 at 55,629,733bp). Two of these peaks replicate previous findings. There were additional suggestive results on chromosomes 2p25.3-p24.1 (HLOD=1.87), 7q31.31-q32.3 (HLOD=1.97), and 13q12.11-q12.3 (HLOD=1.93). Affected subjects in families supporting the linkage peaks found in this study did not reveal strong evidence for distinct phenotypic subgroups. PMID:19455147

  12. A genome-wide 20 K citrus microarray for gene expression analysis

    PubMed Central

    Martinez-Godoy, M Angeles; Mauri, Nuria; Juarez, Jose; Marques, M Carmen; Santiago, Julia; Forment, Javier; Gadea, Jose

    2008-01-01

    Background Understanding of genetic elements that contribute to key aspects of citrus biology will impact future improvements in this economically important crop. Global gene expression analysis demands microarray platforms with a high genome coverage. In the last years, genome-wide EST collections have been generated in citrus, opening the possibility to create new tools for functional genomics in this crop plant. Results We have designed and constructed a publicly available genome-wide cDNA microarray that include 21,081 putative unigenes of citrus. As a functional companion to the microarray, a web-browsable database [1] was created and populated with information about the unigenes represented in the microarray, including cDNA libraries, isolated clones, raw and processed nucleotide and protein sequences, and results of all the structural and functional annotation of the unigenes, like general description, BLAST hits, putative Arabidopsis orthologs, microsatellites, putative SNPs, GO classification and PFAM domains. We have performed a Gene Ontology comparison with the full set of Arabidopsis proteins to estimate the genome coverage of the microarray. We have also performed microarray hybridizations to check its usability. Conclusion This new cDNA microarray replaces the first 7K microarray generated two years ago and allows gene expression analysis at a more global scale. We have followed a rational design to minimize cross-hybridization while maintaining its utility for different citrus species. Furthermore, we also provide access to a website with full structural and functional annotation of the unigenes represented in the microarray, along with the ability to use this site to directly perform gene expression analysis using standard tools at different publicly available servers. Furthermore, we show how this microarray offers a good representation of the citrus genome and present the usefulness of this genomic tool for global studies in citrus by using it to

  13. Navigating the Interface Between Landscape Genetics and Landscape Genomics.

    PubMed

    Storfer, Andrew; Patton, Austin; Fraik, Alexandra K

    2018-01-01

    As next-generation sequencing data become increasingly available for non-model organisms, a shift has occurred in the focus of studies of the geographic distribution of genetic variation. Whereas landscape genetics studies primarily focus on testing the effects of landscape variables on gene flow and genetic population structure, landscape genomics studies focus on detecting candidate genes under selection that indicate possible local adaptation. Navigating the transition between landscape genomics and landscape genetics can be challenging. The number of molecular markers analyzed has shifted from what used to be a few dozen loci to thousands of loci and even full genomes. Although genome scale data can be separated into sets of neutral loci for analyses of gene flow and population structure and putative loci under selection for inference of local adaptation, there are inherent differences in the questions that are addressed in the two study frameworks. We discuss these differences and their implications for study design, marker choice and downstream analysis methods. Similar to the rapid proliferation of analysis methods in the early development of landscape genetics, new analytical methods for detection of selection in landscape genomics studies are burgeoning. We focus on genome scan methods for detection of selection, and in particular, outlier differentiation methods and genetic-environment association tests because they are the most widely used. Use of genome scan methods requires an understanding of the potential mismatches between the biology of a species and assumptions inherent in analytical methods used, which can lead to high false positive rates of detected loci under selection. Key to choosing appropriate genome scan methods is an understanding of the underlying demographic structure of study populations, and such data can be obtained using neutral loci from the generated genome-wide data or prior knowledge of a species' phylogeographic history. To

  14. Navigating the Interface Between Landscape Genetics and Landscape Genomics

    PubMed Central

    Storfer, Andrew; Patton, Austin; Fraik, Alexandra K.

    2018-01-01

    As next-generation sequencing data become increasingly available for non-model organisms, a shift has occurred in the focus of studies of the geographic distribution of genetic variation. Whereas landscape genetics studies primarily focus on testing the effects of landscape variables on gene flow and genetic population structure, landscape genomics studies focus on detecting candidate genes under selection that indicate possible local adaptation. Navigating the transition between landscape genomics and landscape genetics can be challenging. The number of molecular markers analyzed has shifted from what used to be a few dozen loci to thousands of loci and even full genomes. Although genome scale data can be separated into sets of neutral loci for analyses of gene flow and population structure and putative loci under selection for inference of local adaptation, there are inherent differences in the questions that are addressed in the two study frameworks. We discuss these differences and their implications for study design, marker choice and downstream analysis methods. Similar to the rapid proliferation of analysis methods in the early development of landscape genetics, new analytical methods for detection of selection in landscape genomics studies are burgeoning. We focus on genome scan methods for detection of selection, and in particular, outlier differentiation methods and genetic-environment association tests because they are the most widely used. Use of genome scan methods requires an understanding of the potential mismatches between the biology of a species and assumptions inherent in analytical methods used, which can lead to high false positive rates of detected loci under selection. Key to choosing appropriate genome scan methods is an understanding of the underlying demographic structure of study populations, and such data can be obtained using neutral loci from the generated genome-wide data or prior knowledge of a species' phylogeographic history. To

  15. Implementing meta-analysis from genome-wide association studies for pork quality traits

    USDA-ARS?s Scientific Manuscript database

    Pork quality plays an important role in the meat processing industry, thus different methodologies have been implemented to elucidate the genetic architecture of traits affecting meat quality. One of the most common and widely used approaches is to perform genome-wide association (GWA) studies. Howe...

  16. Asymmetric Introgression in the Horticultural Living Fossil Cycas Sect. Asiorientales Using a Genome-Wide Scanning Approach

    PubMed Central

    Chiang, Yu-Chung; Huang, Bing-Hong; Chang, Chun-Wen; Wan, Yu-Ting; Lai, Shih-Jie; Huang, Shong; Liao, Pei-Chun

    2013-01-01

    The Asian cycads are mostly allopatric, distributed in small population sizes. Hybridization between allopatric species provides clues in determining the mechanism of species divergence. Horticultural introduction provides the chance of interspecific gene flow between allopatric species. Two allopatrically eastern Asian Cycas sect. Asiorientales species, C. revoluta and C. taitungensis, which are widely distributed in Ryukyus and Fujian Province and endemic to Taiwan, respectively, were planted in eastern Taiwan for horticultural reason. Higher degrees of genetic admixture in cultivated samples than wild populations in both cycad species were detected based on multilocus scans by neutral AFLP markers. Furthermore, bidirectional but asymmetric introgression by horticultural introduction of C. revoluta is evidenced by the reanalyses of species associated loci, which are assumed to be diverged after species divergence. Partial loci introgressed from native cycad to the invaders were also detected at the loci of strong species association. Consistent results tested by all neutral loci, and the species-associated loci, specify the recent introgression from the paradox of sharing of ancestral polymorphisms. Phenomenon of introgression of cultivated cycads implies niche conservation among two geographic-isolated cycads, even though the habitats of the extant wild populations of two species are distinct. PMID:23591840

  17. Genic rather than genome-wide differences between sexually deceptive Ophrys orchids with different pollinators.

    PubMed

    Sedeek, Khalid E M; Scopece, Giovanni; Staedler, Yannick M; Schönenberger, Jürg; Cozzolino, Salvatore; Schiestl, Florian P; Schlüter, Philipp M

    2014-12-01

    High pollinator specificity and the potential for simple genetic changes to affect pollinator attraction make sexually deceptive orchids an ideal system for the study of ecological speciation, in which change of flower odour is likely important. This study surveys reproductive barriers and differences in floral phenotypes in a group of four closely related, coflowering sympatric Ophrys species and uses a genotyping-by-sequencing (GBS) approach to obtain information on the proportion of the genome that is differentiated between species. Ophrys species were found to effectively lack postpollination barriers, but are strongly isolated by their different pollinators (floral isolation) and, to a smaller extent, by shifts in flowering time (temporal isolation). Although flower morphology and perhaps labellum coloration may contribute to floral isolation, reproductive barriers may largely be due to differences in flower odour chemistry. GBS revealed shared polymorphism throughout the Ophrys genome, with very little population structure between species. Genome scans for FST outliers identified few markers that are highly differentiated between species and repeatable in several populations. These genome scans also revealed highly differentiated polymorphisms in genes with putative involvement in floral odour production, including a previously identified candidate gene thought to be involved in the biosynthesis of pseudo-pheromones by the orchid flowers. Taken together, these data suggest that ecological speciation associated with different pollinators in sexually deceptive orchids has a genic rather than a genomic basis, placing these species at an early phase of genomic divergence within the 'speciation continuum'. © 2014 The Authors. Molecular Ecology published by John Wiley & Sons Ltd.

  18. Genome-wide alterations of the DNA replication program during tumor progression

    NASA Astrophysics Data System (ADS)

    Arneodo, A.; Goldar, A.; Argoul, F.; Hyrien, O.; Audit, B.

    2016-08-01

    Oncogenic stress is a major driving force in the early stages of cancer development. Recent experimental findings reveal that, in precancerous lesions and cancers, activated oncogenes may induce stalling and dissociation of DNA replication forks resulting in DNA damage. Replication timing is emerging as an important epigenetic feature that recapitulates several genomic, epigenetic and functional specificities of even closely related cell types. There is increasing evidence that chromosome rearrangements, the hallmark of many cancer genomes, are intimately associated with the DNA replication program and that epigenetic replication timing changes often precede chromosomic rearrangements. The recent development of a novel methodology to map replication fork polarity using deep sequencing of Okazaki fragments has provided new and complementary genome-wide replication profiling data. We review the results of a wavelet-based multi-scale analysis of genomic and epigenetic data including replication profiles along human chromosomes. These results provide new insight into the spatio-temporal replication program and its dynamics during differentiation. Here our goal is to bring to cancer research, the experimental protocols and computational methodologies for replication program profiling, and also the modeling of the spatio-temporal replication program. To illustrate our purpose, we report very preliminary results obtained for the chronic myelogeneous leukemia, the archetype model of cancer. Finally, we discuss promising perspectives on using genome-wide DNA replication profiling as a novel efficient tool for cancer diagnosis, prognosis and personalized treatment.

  19. Genome-wide identification of bacterial plant colonization genes

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Cole, Benjamin J.; Feltcher, Meghan E.; Waters, Robert J.

    Diverse soil-resident bacteria can contribute to plant growth and health, but the molecular mechanisms enabling them to effectively colonize their plant hosts remain poorly understood. We used randomly barcoded transposon mutagenesis sequencing (RB-TnSeq) in Pseudomonas simiae, a model root-colonizing bacterium, to establish a genome-wide map of bacterial genes required for colonization of the Arabidopsis thaliana root system. We identified 115 genes (2% of all P. simiae genes) with functions that are required for maximal competitive colonization of the root system. Among the genes we identified were some with obvious colonization-related roles in motility and carbon metabolism, as well as 44more » other genes that had no or vague functional predictions. Independent validation assays of individual genes confirmed colonization functions for 20 of 22 (91%) cases tested. To further characterize genes identified by our screen, we compared the functional contributions of P. simiae genes to growth in 90 distinct in vitro conditions by RB-TnSeq, highlighting specific metabolic functions associated with root colonization genes. Here, our analysis of bacterial genes by sequence-driven saturation mutagenesis revealed a genome-wide map of the genetic determinants of plant root colonization and offers a starting point for targeted improvement of the colonization capabilities of plant-beneficial microbes.« less

  20. Genome-wide identification of bacterial plant colonization genes

    DOE PAGES

    Cole, Benjamin J.; Feltcher, Meghan E.; Waters, Robert J.; ...

    2017-09-22

    Diverse soil-resident bacteria can contribute to plant growth and health, but the molecular mechanisms enabling them to effectively colonize their plant hosts remain poorly understood. We used randomly barcoded transposon mutagenesis sequencing (RB-TnSeq) in Pseudomonas simiae, a model root-colonizing bacterium, to establish a genome-wide map of bacterial genes required for colonization of the Arabidopsis thaliana root system. We identified 115 genes (2% of all P. simiae genes) with functions that are required for maximal competitive colonization of the root system. Among the genes we identified were some with obvious colonization-related roles in motility and carbon metabolism, as well as 44more » other genes that had no or vague functional predictions. Independent validation assays of individual genes confirmed colonization functions for 20 of 22 (91%) cases tested. To further characterize genes identified by our screen, we compared the functional contributions of P. simiae genes to growth in 90 distinct in vitro conditions by RB-TnSeq, highlighting specific metabolic functions associated with root colonization genes. Here, our analysis of bacterial genes by sequence-driven saturation mutagenesis revealed a genome-wide map of the genetic determinants of plant root colonization and offers a starting point for targeted improvement of the colonization capabilities of plant-beneficial microbes.« less

  1. Optimal selection of markers for validation or replication from genome-wide association studies.

    PubMed

    Greenwood, Celia M T; Rangrej, Jagadish; Sun, Lei

    2007-07-01

    With reductions in genotyping costs and the fast pace of improvements in genotyping technology, it is not uncommon for the individuals in a single study to undergo genotyping using several different platforms, where each platform may contain different numbers of markers selected via different criteria. For example, a set of cases and controls may be genotyped at markers in a small set of carefully selected candidate genes, and shortly thereafter, the same cases and controls may be used for a genome-wide single nucleotide polymorphism (SNP) association study. After such initial investigations, often, a subset of "interesting" markers is selected for validation or replication. Specifically, by validation, we refer to the investigation of associations between the selected subset of markers and the disease in independent data. However, it is not obvious how to choose the best set of markers for this validation. There may be a prior expectation that some sets of genotyping data are more likely to contain real associations. For example, it may be more likely for markers in plausible candidate genes to show disease associations than markers in a genome-wide scan. Hence, it would be desirable to select proportionally more markers from the candidate gene set. When a fixed number of markers are selected for validation, we propose an approach for identifying an optimal marker-selection configuration by basing the approach on minimizing the stratified false discovery rate. We illustrate this approach using a case-control study of colorectal cancer from Ontario, Canada, and we show that this approach leads to substantial reductions in the estimated false discovery rates in the Ontario dataset for the selected markers, as well as reductions in the expected false discovery rates for the proposed validation dataset. Copyright 2007 Wiley-Liss, Inc.

  2. Genome-wide analysis identifies 12 loci influencing human reproductive behavior

    PubMed Central

    Barban, Nicola; Jansen, Rick; de Vlaming, Ronald; Vaez, Ahmad; Mandemakers, Jornt J.; Tropf, Felix C.; Shen, Xia; Wilson, James F.; Chasman, Daniel I.; Nolte, Ilja M.; Tragante, Vinicius; van der Laan, Sander W.; Perry, John R. B.; Kong, Augustine; Ahluwalia, Tarunveer; Albrecht, Eva; Yerges-Armstrong, Laura; Atzmon, Gil; Auro, Kirsi; Ayers, Kristin; Bakshi, Andrew; Ben-Avraham, Danny; Berger, Klaus; Bergman, Aviv; Bertram, Lars; Bielak, Lawrence F.; Bjornsdottir, Gyda; Bonder, Marc Jan; Broer, Linda; Bui, Minh; Barbieri, Caterina; Cavadino, Alana; Chavarro, Jorge E; Turman, Constance; Concas, Maria Pina; Cordell, Heather J.; Davies, Gail; Eibich, Peter; Eriksson, Nicholas; Esko, Tõnu; Eriksson, Joel; Falahi, Fahimeh; Felix, Janine F.; Fontana, Mark Alan; Franke, Lude; Gandin, Ilaria; Gaskins, Audrey J.; Gieger, Christian; Gunderson, Erica P.; Guo, Xiuqing; Hayward, Caroline; He, Chunyan; Hofer, Edith; Huang, Hongyan; Joshi, Peter K.; Kanoni, Stavroula; Karlsson, Robert; Kiechl, Stefan; Kifley, Annette; Kluttig, Alexander; Kraft, Peter; Lagou, Vasiliki; Lecoeur, Cecile; Lahti, Jari; Li-Gao, Ruifang; Lind, Penelope A.; Liu, Tian; Makalic, Enes; Mamasoula, Crysovalanto; Matteson, Lindsay; Mbarek, Hamdi; McArdle, Patrick F.; McMahon, George; Meddens, S. Fleur W.; Mihailov, Evelin; Miller, Mike; Missmer, Stacey A.; Monnereau, Claire; van der Most, Peter J.; Myhre, Ronny; Nalls, Mike A.; Nutile, Teresa; Panagiota, Kalafati Ioanna; Porcu, Eleonora; Prokopenko, Inga; Rajan, Kumar B.; Rich-Edwards, Janet; Rietveld, Cornelius A.; Robino, Antonietta; Rose, Lynda M.; Rueedi, Rico; Ryan, Kathy; Saba, Yasaman; Schmidt, Daniel; Smith, Jennifer A.; Stolk, Lisette; Streeten, Elizabeth; Tonjes, Anke; Thorleifsson, Gudmar; Ulivi, Sheila; Wedenoja, Juho; Wellmann, Juergen; Willeit, Peter; Yao, Jie; Yengo, Loic; Zhao, Jing Hua; Zhao, Wei; Zhernakova, Daria V.; Amin, Najaf; Andrews, Howard; Balkau, Beverley; Barzilai, Nir; Bergmann, Sven; Biino, Ginevra; Bisgaard, Hans; Bønnelykke, Klaus; Boomsma, Dorret I.; Buring, Julie E.; Campbell, Harry; Cappellani, Stefania; Ciullo, Marina; Cox, Simon R.; Cucca, Francesco; Daniela, Toniolo; Davey-Smith, George; Deary, Ian J.; Dedoussis, George; Deloukas, Panos; van Duijn, Cornelia M.; de Geus, Eco JC.; Eriksson, Johan G.; Evans, Denis A.; Faul, Jessica D.; Felicita, Sala Cinzia; Froguel, Philippe; Gasparini, Paolo; Girotto, Giorgia; Grabe, Hans-Jörgen; Greiser, Karin Halina; Groenen, Patrick J.F.; de Haan, Hugoline G.; Haerting, Johannes; Harris, Tamara B.; Heath, Andrew C.; Heikkilä, Kauko; Hofman, Albert; Homuth, Georg; Holliday, Elizabeth G; Hopper, John; Hypponen, Elina; Jacobsson, Bo; Jaddoe, Vincent W. V.; Johannesson, Magnus; Jugessur, Astanand; Kähönen, Mika; Kajantie, Eero; Kardia, Sharon L.R.; Keavney, Bernard; Kolcic, Ivana; Koponen, Päivikki; Kovacs, Peter; Kronenberg, Florian; Kutalik, Zoltan; La Bianca, Martina; Lachance, Genevieve; Iacono, William; Lai, Sandra; Lehtimäki, Terho; Liewald, David C; Lindgren, Cecilia; Liu, Yongmei; Luben, Robert; Lucht, Michael; Luoto, Riitta; Magnus, Per; Magnusson, Patrik K.E.; Martin, Nicholas G.; McGue, Matt; McQuillan, Ruth; Medland, Sarah E.; Meisinger, Christa; Mellström, Dan; Metspalu, Andres; Michela, Traglia; Milani, Lili; Mitchell, Paul; Montgomery, Grant W.; Mook-Kanamori, Dennis; de Mutsert, Renée; Nohr, Ellen A; Ohlsson, Claes; Olsen, Jørn; Ong, Ken K.; Paternoster, Lavinia; Pattie, Alison; Penninx, Brenda WJH; Perola, Markus; Peyser, Patricia A.; Pirastu, Mario; Polasek, Ozren; Power, Chris; Kaprio, Jaakko; Raffel, Leslie J.; Räikkönen, Katri; Raitakari, Olli; Ridker, Paul M.; Ring, Susan M.; Roll, Kathryn; Rudan, Igor; Ruggiero, Daniela; Rujescu, Dan; Salomaa, Veikko; Schlessinger, David; Schmidt, Helena; Schmidt, Reinhold; Schupf, Nicole; Smit, Johannes; Sorice, Rossella; Spector, Tim D.; Starr, John M.; Stöckl, Doris; Strauch, Konstantin; Stumvoll, Michael; Swertz, Morris A.; Thorsteinsdottir, Unnur; Thurik, A. Roy; Timpson, Nicholas J.; Tönjes, Anke; Tung, Joyce Y.; Uitterlinden, André G.; Vaccargiu, Simona; Viikari, Jorma; Vitart, Veronique; Völzke, Henry; Vollenweider, Peter; Vuckovic, Dragana; Waage, Johannes; Wagner, Gert G.; Wang, Jie Jin; Wareham, Nicholas J.; Weir, David R.; Willemsen, Gonneke; Willeit, Johann; Wright, Alan F.; Zondervan, Krina T.; Stefansson, Kari; Krueger, Robert F.; Lee, James J.; Benjamin, Daniel J.; Cesarini, David; Koellinger, Philipp D.; den Hoed, Marcel; Snieder, Harold; Mills, Melinda C.

    2017-01-01

    The genetic architecture of human reproductive behavior – age at first birth (AFB) and number of children ever born (NEB) – has a strong relationship with fitness, human development, infertility and risk of neuropsychiatric disorders. However, very few genetic loci have been identified and the underlying mechanisms of AFB and NEB are poorly understood. We report the largest genome-wide association study to date of both sexes including 251,151 individuals for AFB and 343,072 for NEB. We identified 12 independent loci that are significantly associated with AFB and/or NEB in a SNP-based genome-wide association study, and four additional loci in a gene-based effort. These loci harbor genes that are likely to play a role – either directly or by affecting non-local gene expression – in human reproduction and infertility, thereby increasing our understanding of these complex traits. PMID:27798627

  3. Quality control and conduct of genome-wide association meta-analyses

    PubMed Central

    Winkler, Thomas W; Day, Felix R; Croteau-Chonka, Damien C; Wood, Andrew R; Locke, Adam E; Mägi, Reedik; Ferreira, Teresa; Fall, Tove; Graff, Mariaelisa; Justice, Anne E; Luan, Jian'an; Gustafsson, Stefan; Randall, Joshua C; Vedantam, Sailaja; Workalemahu, Tsegaselassie; Kilpeläinen, Tuomas O; Scherag, André; Esko, Tonu; Kutalik, Zoltán; Heid, Iris M; Loos, Ruth JF

    2014-01-01

    Rigorous organization and quality control (QC) are necessary to facilitate successful genome-wide association meta-analyses (GWAMAs) of statistics aggregated across multiple genome-wide association studies. This protocol provides guidelines for [1] organizational aspects of GWAMAs, and for [2] QC at the study file level, the meta-level across studies, and the meta-analysis output level. Real–world examples highlight issues experienced and solutions developed by the GIANT Consortium that has conducted meta-analyses including data from 125 studies comprising more than 330,000 individuals. We provide a general protocol for conducting GWAMAs and carrying out QC to minimize errors and to guarantee maximum use of the data. We also include details for use of a powerful and flexible software package called EasyQC. For consortia of comparable size to the GIANT consortium, the present protocol takes a minimum of about 10 months to complete. PMID:24762786

  4. Genome-wide association analysis identifies 13 new risk loci for schizophrenia.

    PubMed

    Ripke, Stephan; O'Dushlaine, Colm; Chambert, Kimberly; Moran, Jennifer L; Kähler, Anna K; Akterin, Susanne; Bergen, Sarah E; Collins, Ann L; Crowley, James J; Fromer, Menachem; Kim, Yunjung; Lee, Sang Hong; Magnusson, Patrik K E; Sanchez, Nick; Stahl, Eli A; Williams, Stephanie; Wray, Naomi R; Xia, Kai; Bettella, Francesco; Borglum, Anders D; Bulik-Sullivan, Brendan K; Cormican, Paul; Craddock, Nick; de Leeuw, Christiaan; Durmishi, Naser; Gill, Michael; Golimbet, Vera; Hamshere, Marian L; Holmans, Peter; Hougaard, David M; Kendler, Kenneth S; Lin, Kuang; Morris, Derek W; Mors, Ole; Mortensen, Preben B; Neale, Benjamin M; O'Neill, Francis A; Owen, Michael J; Milovancevic, Milica Pejovic; Posthuma, Danielle; Powell, John; Richards, Alexander L; Riley, Brien P; Ruderfer, Douglas; Rujescu, Dan; Sigurdsson, Engilbert; Silagadze, Teimuraz; Smit, August B; Stefansson, Hreinn; Steinberg, Stacy; Suvisaari, Jaana; Tosato, Sarah; Verhage, Matthijs; Walters, James T; Levinson, Douglas F; Gejman, Pablo V; Kendler, Kenneth S; Laurent, Claudine; Mowry, Bryan J; O'Donovan, Michael C; Owen, Michael J; Pulver, Ann E; Riley, Brien P; Schwab, Sibylle G; Wildenauer, Dieter B; Dudbridge, Frank; Holmans, Peter; Shi, Jianxin; Albus, Margot; Alexander, Madeline; Campion, Dominique; Cohen, David; Dikeos, Dimitris; Duan, Jubao; Eichhammer, Peter; Godard, Stephanie; Hansen, Mark; Lerer, F Bernard; Liang, Kung-Yee; Maier, Wolfgang; Mallet, Jacques; Nertney, Deborah A; Nestadt, Gerald; Norton, Nadine; O'Neill, Francis A; Papadimitriou, George N; Ribble, Robert; Sanders, Alan R; Silverman, Jeremy M; Walsh, Dermot; Williams, Nigel M; Wormley, Brandon; Arranz, Maria J; Bakker, Steven; Bender, Stephan; Bramon, Elvira; Collier, David; Crespo-Facorro, Benedicto; Hall, Jeremy; Iyegbe, Conrad; Jablensky, Assen; Kahn, Rene S; Kalaydjieva, Luba; Lawrie, Stephen; Lewis, Cathryn M; Lin, Kuang; Linszen, Don H; Mata, Ignacio; McIntosh, Andrew; Murray, Robin M; Ophoff, Roel A; Powell, John; Rujescu, Dan; Van Os, Jim; Walshe, Muriel; Weisbrod, Matthias; Wiersma, Durk; Donnelly, Peter; Barroso, Ines; Blackwell, Jenefer M; Bramon, Elvira; Brown, Matthew A; Casas, Juan P; Corvin, Aiden P; Deloukas, Panos; Duncanson, Audrey; Jankowski, Janusz; Markus, Hugh S; Mathew, Christopher G; Palmer, Colin N A; Plomin, Robert; Rautanen, Anna; Sawcer, Stephen J; Trembath, Richard C; Viswanathan, Ananth C; Wood, Nicholas W; Spencer, Chris C A; Band, Gavin; Bellenguez, Céline; Freeman, Colin; Hellenthal, Garrett; Giannoulatou, Eleni; Pirinen, Matti; Pearson, Richard D; Strange, Amy; Su, Zhan; Vukcevic, Damjan; Donnelly, Peter; Langford, Cordelia; Hunt, Sarah E; Edkins, Sarah; Gwilliam, Rhian; Blackburn, Hannah; Bumpstead, Suzannah J; Dronov, Serge; Gillman, Matthew; Gray, Emma; Hammond, Naomi; Jayakumar, Alagurevathi; McCann, Owen T; Liddle, Jennifer; Potter, Simon C; Ravindrarajah, Radhi; Ricketts, Michelle; Tashakkori-Ghanbaria, Avazeh; Waller, Matthew J; Weston, Paul; Widaa, Sara; Whittaker, Pamela; Barroso, Ines; Deloukas, Panos; Mathew, Christopher G; Blackwell, Jenefer M; Brown, Matthew A; Corvin, Aiden P; McCarthy, Mark I; Spencer, Chris C A; Bramon, Elvira; Corvin, Aiden P; O'Donovan, Michael C; Stefansson, Kari; Scolnick, Edward; Purcell, Shaun; McCarroll, Steven A; Sklar, Pamela; Hultman, Christina M; Sullivan, Patrick F

    2013-10-01

    Schizophrenia is an idiopathic mental disorder with a heritable component and a substantial public health impact. We conducted a multi-stage genome-wide association study (GWAS) for schizophrenia beginning with a Swedish national sample (5,001 cases and 6,243 controls) followed by meta-analysis with previous schizophrenia GWAS (8,832 cases and 12,067 controls) and finally by replication of SNPs in 168 genomic regions in independent samples (7,413 cases, 19,762 controls and 581 parent-offspring trios). We identified 22 loci associated at genome-wide significance; 13 of these are new, and 1 was previously implicated in bipolar disorder. Examination of candidate genes at these loci suggests the involvement of neuronal calcium signaling. We estimate that 8,300 independent, mostly common SNPs (95% credible interval of 6,300-10,200 SNPs) contribute to risk for schizophrenia and that these collectively account for at least 32% of the variance in liability. Common genetic variation has an important role in the etiology of schizophrenia, and larger studies will allow more detailed understanding of this disorder.

  5. Genetically contextual effects of smoking on genome wide DNA methylation.

    PubMed

    Dogan, Meeshanthini V; Beach, Steven R H; Philibert, Robert A

    2017-09-01

    Smoking is the leading cause of death in the United States. It exerts its effects by increasing susceptibility to a variety of complex disorders among those who smoke, and if pregnant, to their unborn children. In prior efforts to understand the epigenetic mechanisms through which this increased vulnerability is conveyed, a number of investigators have conducted genome wide methylation analyses. Unfortunately, secondary to methodological limitations, these studies were unable to examine methylation in gene regions with significant amounts of genetic variation. Using genome wide genetic and epigenetic data from the Framingham Heart Study, we re-examined the relationship of smoking status to genome wide methylation status. When only methylation status is considered, smoking was significantly associated with differential methylation in 310 genes that map to a variety of biological process and cellular differentiation pathways. However, when SNP effects on the magnitude of smoking associated methylation changes are also considered, cis and trans-interaction effects were noted at a total of 266 and 4353 genes with no marked enrichment for any biological pathways. Furthermore, the SNP variation participating in the significant interaction effects is enriched for loci previously associated with complex medical illnesses. The enlarged scope of the methylome shown to be affected by smoking may better explicate the mediational pathways linking smoking with a myriad of smoking related complex syndromes. Additionally, these results strongly suggest that combined epigenetic and genetic data analyses may be critical for a more complete understanding of the relationship between environmental variables, such as smoking, and pathophysiological outcomes. © 2017 Wiley Periodicals, Inc.

  6. Evolution of the pygmy phenotype: evidence of positive selection fro genome-wide scans in African, Asian, and Melanesian pygmies.

    PubMed

    Migliano, Andrea Bamberg; Romero, Irene Gallego; Metspalu, Mait; Leavesley, Matthew; Pagani, Luca; Antao, Tiago; Huang, Da-Wei; Sherman, Brad T; Siddle, Katharine; Scholes, Clarissa; Hudjashov, Georgi; Kaitokai, Elton; Babalu, Avis; Belatti, Maggie; Cagan, Alex; Hopkinshaw, Byrony; Shaw, Colin; Nelis, Mari; Metspalu, Ene; Mägi, Reedik; Lempicki, Richard A; Villems, Richard; Lahr, Marta Mirazon; Kivisild, Toomas

    2013-01-01

    Human pygmy populations inhabit different regions of the world, from Africa to Melanesia. In Asia, short-statured populations are often referred to as "negritos." Their short stature has been interpreted as a consequence of thermoregulatory, nutritional, and/or locomotory adaptations to life in tropical forests. A more recent hypothesis proposes that their stature is the outcome of a life history trade-off in high-mortality environments, where early reproduction is favored and, consequently, early sexual maturation and early growth cessation have coevolved. Some serological evidence of deficiencies in the growth hormone/insulin-like growth factor axis have been previously associated with pygmies' short stature. Using genome-wide single-nucleotide polymorphism genotype data, we first tested whether different negrito groups living in the Philippines and Papua New Guinea are closely related and then investigated genomic signals of recent positive selection in African, Asian, and Papuan pygmy populations. We found that negritos in the Philippines and Papua New Guinea are genetically more similar to their nonpygmy neighbors than to one another and have experienced positive selection at different genes. These results indicate that geographically distant pygmy groups are likely to have evolved their short stature independently. We also found that selection on common height variants is unlikely to explain their short stature and that different genes associated with growth, thyroid function, and sexual development are under selection in different pygmy groups. Copyright © 2013 Wayne State University Press, Detroit, Michigan 48201-1309.

  7. Genome-wide selective sweeps and gene-specific sweeps in natural bacterial populations

    DOE PAGES

    Bendall, Matthew L.; Stevens, Sarah L.R.; Chan, Leong-Keat; ...

    2016-01-08

    Multiple models describe the formation and evolution of distinct microbial phylogenetic groups. These evolutionary models make different predictions regarding how adaptive alleles spread through populations and how genetic diversity is maintained. Processes predicted by competing evolutionary models, for example, genome-wide selective sweeps vs gene-specific sweeps, could be captured in natural populations using time-series metagenomics if the approach were applied over a sufficiently long time frame. Direct observations of either process would help resolve how distinct microbial groups evolve. Using a 9-year metagenomic study of a freshwater lake (2005–2013), we explore changes in single-nucleotide polymorphism (SNP) frequencies and patterns of genemore » gain and loss in 30 bacterial populations. SNP analyses revealed substantial genetic heterogeneity within these populations, although the degree of heterogeneity varied by >1000-fold among populations. SNP allele frequencies also changed dramatically over time within some populations. Interestingly, nearly all SNP variants were slowly purged over several years from one population of green sulfur bacteria, while at the same time multiple genes either swept through or were lost from this population. Furthermore, these patterns were consistent with a genome-wide selective sweep in progress, a process predicted by the ‘ecotype model’ of speciation but not previously observed in nature. In contrast, other populations contained large, SNP-free genomic regions that appear to have swept independently through the populations prior to the study without purging diversity elsewhere in the genome. Finally, evidence for both genome-wide and gene-specific sweeps suggests that different models of bacterial speciation may apply to different populations coexisting in the same environment.« less

  8. Sniffing out significant “Pee values”: genome wide association study of asparagus anosmia

    PubMed Central

    Markt, Sarah C; Nuttall, Elizabeth; Turman, Constance; Sinnott, Jennifer; Rimm, Eric B; Ecsedy, Ethan; Unger, Robert H; Fall, Katja; Finn, Stephen; Jensen, Majken K; Rider, Jennifer R; Kraft, Peter

    2016-01-01

    Objective To determine the inherited factors associated with the ability to smell asparagus metabolites in urine. Design Genome wide association study. Setting Nurses’ Health Study and Health Professionals Follow-up Study cohorts. Participants 6909 men and women of European-American descent with available genetic data from genome wide association studies. Main outcome measure Participants were characterized as asparagus smellers if they strongly agreed with the prompt “after eating asparagus, you notice a strong characteristic odor in your urine,” and anosmic if otherwise. We calculated per-allele estimates of asparagus anosmia for about nine million single nucleotide polymorphisms using logistic regression. P values <5×10-8 were considered as genome wide significant. Results 58.0% of men (n=1449/2500) and 61.5% of women (n=2712/4409) had anosmia. 871 single nucleotide polymorphisms reached genome wide significance for asparagus anosmia, all in a region on chromosome 1 (1q44: 248139851-248595299) containing multiple genes in the olfactory receptor 2 (OR2) family. Conditional analyses revealed three independent markers associated with asparagus anosmia: rs13373863, rs71538191, and rs6689553. Conclusion A large proportion of people have asparagus anosmia. Genetic variation near multiple olfactory receptor genes is associated with the ability of an individual to smell the metabolites of asparagus in urine. Future replication studies are necessary before considering targeted therapies to help anosmic people discover what they are missing. PMID:27965198

  9. Genome-wide association analysis of age-at-onset in Alzheimer's disease.

    PubMed

    Kamboh, M I; Barmada, M M; Demirci, F Y; Minster, R L; Carrasquillo, M M; Pankratz, V S; Younkin, S G; Saykin, A J; Sweet, R A; Feingold, E; DeKosky, S T; Lopez, O L

    2012-12-01

    The risk of Alzheimer's disease (AD) is strongly determined by genetic factors and recent genome-wide association studies (GWAS) have identified several genes for the disease risk. In addition to the disease risk, age-at-onset (AAO) of AD has also strong genetic component with an estimated heritability of 42%. Identification of AAO genes may help to understand the biological mechanisms that regulate the onset of the disease. Here we report the first GWAS focused on identifying genes for the AAO of AD. We performed a genome-wide meta-analysis on three samples comprising a total of 2222 AD cases. A total of ~2.5 million directly genotyped or imputed single-nucleotide polymorphisms (SNPs) were analyzed in relation to AAO of AD. As expected, the most significant associations were observed in the apolipoprotein E (APOE) region on chromosome 19 where several SNPs surpassed the conservative genome-wide significant threshold (P<5E-08). The most significant SNP outside the APOE region was located in the DCHS2 gene on chromosome 4q31.3 (rs1466662; P=4.95E-07). There were 19 additional significant SNPs in this region at P<1E-04 and the DCHS2 gene is expressed in the cerebral cortex and thus is a potential candidate for affecting AAO in AD. These findings need to be confirmed in additional well-powered samples.

  10. Investigation of Maternal Genotype Effects in Autism by Genome-Wide Association

    PubMed Central

    Yuan, Han; Dougherty, Joseph D.

    2014-01-01

    Lay Abstract Autism spectrum disorders (ASDs) are pervasive developmental disorders which have both a genetic and environmental component. One source of the environmental component is the in utero (prenatal) environment. The maternal genome can potentially contribute to the risk of autism in children by altering this prenatal environment. In this study, the possibility of maternal genotype effects was explored by looking for common variants (single nucleotide polymorphisms, or SNPs) in the maternal genome associated with increased risk of autism in children. We performed a case/control genome-wide association study (GWAS) using mothers of probands as cases and either fathers of probands or normal females as controls, using two collections of families with autism. We did not identify any SNP that reached significance and thus a common variant of large effect is unlikely. However, there was evidence for the possibility of a large number of alleles each carrying a small effect. This suggested that if there is a contribution to autism risk through common-variant maternal genetic effects, it may be the result of multiple loci of small effects. We did not investigate rare variants in this study. Scientific Abstract Like most psychiatric disorders, autism spectrum disorders have both a genetic and an environmental component. While previous studies have clearly demonstrated the contribution of in utero (prenatal) environment on autism risk, most of them focused on transient environmental factors. Based on a recent sibling study, we hypothesized that environmental factors could also come from the maternal genome, which would result in persistent effects across siblings. In this study, the possibility of maternal genotype effects was examined by looking for common variants (single nucleotide polymorphisms, or SNPs) in the maternal genome associated with increased risk of autism in children. A case/control genome-wide association study (GWAS) was performed using mothers of

  11. Genome-wide association between DNA methylation and alternative splicing in an invertebrate

    PubMed Central

    2012-01-01

    Background Gene bodies are the most evolutionarily conserved targets of DNA methylation in eukaryotes. However, the regulatory functions of gene body DNA methylation remain largely unknown. DNA methylation in insects appears to be primarily confined to exons. Two recent studies in Apis mellifera (honeybee) and Nasonia vitripennis (jewel wasp) analyzed transcription and DNA methylation data for one gene in each species to demonstrate that exon-specific DNA methylation may be associated with alternative splicing events. In this study we investigated the relationship between DNA methylation, alternative splicing, and cross-species gene conservation on a genome-wide scale using genome-wide transcription and DNA methylation data. Results We generated RNA deep sequencing data (RNA-seq) to measure genome-wide mRNA expression at the exon- and gene-level. We produced a de novo transcriptome from this RNA-seq data and computationally predicted splice variants for the honeybee genome. We found that exons that are included in transcription are higher methylated than exons that are skipped during transcription. We detected enrichment for alternative splicing among methylated genes compared to unmethylated genes using fisher’s exact test. We performed a statistical analysis to reveal that the presence of DNA methylation or alternative splicing are both factors associated with a longer gene length and a greater number of exons in genes. In concordance with this observation, a conservation analysis using BLAST revealed that each of these factors is also associated with higher cross-species gene conservation. Conclusions This study constitutes the first genome-wide analysis exhibiting a positive relationship between exon-level DNA methylation and mRNA expression in the honeybee. Our finding that methylated genes are enriched for alternative splicing suggests that, in invertebrates, exon-level DNA methylation may play a role in the construction of splice variants by positively

  12. Genome-wide association studies in cardiac electrophysiology: recent discoveries and implications for clinical practice.

    PubMed

    Milan, David J; Lubitz, Steven A; Kääb, Stefan; Ellinor, Patrick T

    2010-08-01

    Genome-wide association studies have been increasingly used to study the genetics of complex human diseases. Within the field of cardiac electrophysiology, this technique has been applied to conditions such as atrial fibrillation, and several electrocardiographic parameters including the QT interval. While these studies have identified multiple genomic regions associated with each trait, questions remain, including the best way to explore the pathophysiology of each association and the potential for clinical utility. This review will summarize recent genome-wide association study results within cardiac electrophysiology and discuss their broader implications in basic science and clinical medicine. Copyright 2010 Heart Rhythm Society. Published by Elsevier Inc. All rights reserved.

  13. Accurate computation of survival statistics in genome-wide studies.

    PubMed

    Vandin, Fabio; Papoutsaki, Alexandra; Raphael, Benjamin J; Upfal, Eli

    2015-05-01

    A key challenge in genomics is to identify genetic variants that distinguish patients with different survival time following diagnosis or treatment. While the log-rank test is widely used for this purpose, nearly all implementations of the log-rank test rely on an asymptotic approximation that is not appropriate in many genomics applications. This is because: the two populations determined by a genetic variant may have very different sizes; and the evaluation of many possible variants demands highly accurate computation of very small p-values. We demonstrate this problem for cancer genomics data where the standard log-rank test leads to many false positive associations between somatic mutations and survival time. We develop and analyze a novel algorithm, Exact Log-rank Test (ExaLT), that accurately computes the p-value of the log-rank statistic under an exact distribution that is appropriate for any size populations. We demonstrate the advantages of ExaLT on data from published cancer genomics studies, finding significant differences from the reported p-values. We analyze somatic mutations in six cancer types from The Cancer Genome Atlas (TCGA), finding mutations with known association to survival as well as several novel associations. In contrast, standard implementations of the log-rank test report dozens-hundreds of likely false positive associations as more significant than these known associations.

  14. Accurate Computation of Survival Statistics in Genome-Wide Studies

    PubMed Central

    Vandin, Fabio; Papoutsaki, Alexandra; Raphael, Benjamin J.; Upfal, Eli

    2015-01-01

    A key challenge in genomics is to identify genetic variants that distinguish patients with different survival time following diagnosis or treatment. While the log-rank test is widely used for this purpose, nearly all implementations of the log-rank test rely on an asymptotic approximation that is not appropriate in many genomics applications. This is because: the two populations determined by a genetic variant may have very different sizes; and the evaluation of many possible variants demands highly accurate computation of very small p-values. We demonstrate this problem for cancer genomics data where the standard log-rank test leads to many false positive associations between somatic mutations and survival time. We develop and analyze a novel algorithm, Exact Log-rank Test (ExaLT), that accurately computes the p-value of the log-rank statistic under an exact distribution that is appropriate for any size populations. We demonstrate the advantages of ExaLT on data from published cancer genomics studies, finding significant differences from the reported p-values. We analyze somatic mutations in six cancer types from The Cancer Genome Atlas (TCGA), finding mutations with known association to survival as well as several novel associations. In contrast, standard implementations of the log-rank test report dozens-hundreds of likely false positive associations as more significant than these known associations. PMID:25950620

  15. Genome-wide analytical approaches for reverse metabolic engineering of industrially relevant phenotypes in yeast.

    PubMed

    Oud, Bart; van Maris, Antonius J A; Daran, Jean-Marc; Pronk, Jack T

    2012-03-01

    Successful reverse engineering of mutants that have been obtained by nontargeted strain improvement has long presented a major challenge in yeast biotechnology. This paper reviews the use of genome-wide approaches for analysis of Saccharomyces cerevisiae strains originating from evolutionary engineering or random mutagenesis. On the basis of an evaluation of the strengths and weaknesses of different methods, we conclude that for the initial identification of relevant genetic changes, whole genome sequencing is superior to other analytical techniques, such as transcriptome, metabolome, proteome, or array-based genome analysis. Key advantages of this technique over gene expression analysis include the independency of genome sequences on experimental context and the possibility to directly and precisely reproduce the identified changes in naive strains. The predictive value of genome-wide analysis of strains with industrially relevant characteristics can be further improved by classical genetics or simultaneous analysis of strains derived from parallel, independent strain improvement lineages. © 2011 Federation of European Microbiological Societies. Published by Blackwell Publishing Ltd. All rights reserved.

  16. Genome-wide analytical approaches for reverse metabolic engineering of industrially relevant phenotypes in yeast

    PubMed Central

    Oud, Bart; Maris, Antonius J A; Daran, Jean-Marc; Pronk, Jack T

    2012-01-01

    Successful reverse engineering of mutants that have been obtained by nontargeted strain improvement has long presented a major challenge in yeast biotechnology. This paper reviews the use of genome-wide approaches for analysis of Saccharomyces cerevisiae strains originating from evolutionary engineering or random mutagenesis. On the basis of an evaluation of the strengths and weaknesses of different methods, we conclude that for the initial identification of relevant genetic changes, whole genome sequencing is superior to other analytical techniques, such as transcriptome, metabolome, proteome, or array-based genome analysis. Key advantages of this technique over gene expression analysis include the independency of genome sequences on experimental context and the possibility to directly and precisely reproduce the identified changes in naive strains. The predictive value of genome-wide analysis of strains with industrially relevant characteristics can be further improved by classical genetics or simultaneous analysis of strains derived from parallel, independent strain improvement lineages. PMID:22152095

  17. A genome-wide search for genes affecting circulating fibrinogen levels in the Framingham Heart Study.

    PubMed

    Yang, Qiong; Tofler, Geoffrey H; Cupples, L Adrienne; Larson, Martin G; Feng, DaLi; Lindpaintner, Klaus; Levy, Daniel; D'Agostino, Ralph B; O'Donnell, Christopher J

    2003-04-15

    Circulating levels of fibrinogen are associated with atherosclerosis and predict future coronary heart disease and stroke. Levels of fibrinogen are correlated among family members, suggesting a heritable component. Variants of the beta-fibrinogen gene subunit on 4q28 are associated with fibrinogen levels but explain only a small proportion of the total genetic variability. It remains unknown what role, if any, is played by other genetic variants in the inter-individual variability in levels of fibrinogen in the general population. We conducted a 10-cM spaced genome-wide scan using 402 original cohort subjects and 1193 offspring subjects from 330 extended families of the Framingham Heart Study. Heritability and linkage analyses were carried out using variance component methods. Regression analyses were performed to adjust for traditional risk factors and HindIII beta-148 genotypes. The total heritability was estimated as 0.24. The highest and second highest LOD scores of linkage were found on chromosomes 2 (LOD=1.5 at 243 cM) and 10 (LOD=2.4 at 87 cM) using only offspring subjects in the analysis, and on chromosomes 2 (LOD=2.1 at 242 cM) and 10(LOD=1.4 at 86 cM), 17 (LOD=1.4 at 96 cM) and 20 (LOD=1.4 at 80 cM) using both original cohort and offspring. These results suggest that there may be influential genetic regions on these chromosomes. While no linkage with genome-wide significance was detected, further research to confirm our findings is warranted.

  18. SvABA: genome-wide detection of structural variants and indels by local assembly.

    PubMed

    Wala, Jeremiah A; Bandopadhayay, Pratiti; Greenwald, Noah F; O'Rourke, Ryan; Sharpe, Ted; Stewart, Chip; Schumacher, Steve; Li, Yilong; Weischenfeldt, Joachim; Yao, Xiaotong; Nusbaum, Chad; Campbell, Peter; Getz, Gad; Meyerson, Matthew; Zhang, Cheng-Zhong; Imielinski, Marcin; Beroukhim, Rameen

    2018-04-01

    Structural variants (SVs), including small insertion and deletion variants (indels), are challenging to detect through standard alignment-based variant calling methods. Sequence assembly offers a powerful approach to identifying SVs, but is difficult to apply at scale genome-wide for SV detection due to its computational complexity and the difficulty of extracting SVs from assembly contigs. We describe SvABA, an efficient and accurate method for detecting SVs from short-read sequencing data using genome-wide local assembly with low memory and computing requirements. We evaluated SvABA's performance on the NA12878 human genome and in simulated and real cancer genomes. SvABA demonstrates superior sensitivity and specificity across a large spectrum of SVs and substantially improves detection performance for variants in the 20-300 bp range, compared with existing methods. SvABA also identifies complex somatic rearrangements with chains of short (<1000 bp) templated-sequence insertions copied from distant genomic regions. We applied SvABA to 344 cancer genomes from 11 cancer types and found that short templated-sequence insertions occur in ∼4% of all somatic rearrangements. Finally, we demonstrate that SvABA can identify sites of viral integration and cancer driver alterations containing medium-sized (50-300 bp) SVs. © 2018 Wala et al.; Published by Cold Spring Harbor Laboratory Press.

  19. A Genome Scan Conducted in a Multigenerational Pedigree with Convergent Strabismus Supports a Complex Genetic Determinism

    PubMed Central

    Georges, Anouk; Cambisano, Nadine; Ahariz, Naïma; Karim, Latifa; Georges, Michel

    2013-01-01

    A genome-wide linkage scan was conducted in a Northern-European multigenerational pedigree with nine of 40 related members affected with concomitant strabismus. Twenty-seven members of the pedigree including all affected individuals were genotyped using a SNP array interrogating > 300,000 common SNPs. We conducted parametric and non-parametric linkage analyses assuming segregation of an autosomal dominant mutation, yet allowing for incomplete penetrance and phenocopies. We detected two chromosome regions with near-suggestive evidence for linkage, respectively on chromosomes 8 and 18. The chromosome 8 linkage implied a penetrance of 0.80 and a rate of phenocopy of 0.11, while the chromosome 18 linkage implied a penetrance of 0.64 and a rate of phenocopy of 0. Our analysis excludes a simple genetic determinism of strabismus in this pedigree. PMID:24376720

  20. A genome scan conducted in a multigenerational pedigree with convergent strabismus supports a complex genetic determinism.

    PubMed

    Georges, Anouk; Cambisano, Nadine; Ahariz, Naïma; Karim, Latifa; Georges, Michel

    2013-01-01

    A genome-wide linkage scan was conducted in a Northern-European multigenerational pedigree with nine of 40 related members affected with concomitant strabismus. Twenty-seven members of the pedigree including all affected individuals were genotyped using a SNP array interrogating > 300,000 common SNPs. We conducted parametric and non-parametric linkage analyses assuming segregation of an autosomal dominant mutation, yet allowing for incomplete penetrance and phenocopies. We detected two chromosome regions with near-suggestive evidence for linkage, respectively on chromosomes 8 and 18. The chromosome 8 linkage implied a penetrance of 0.80 and a rate of phenocopy of 0.11, while the chromosome 18 linkage implied a penetrance of 0.64 and a rate of phenocopy of 0. Our analysis excludes a simple genetic determinism of strabismus in this pedigree.

  1. Global methylation screening in the Arabidopsis thaliana and Mus musculus genome: applications of virtual image restriction landmark genomic scanning (Vi-RLGS)

    PubMed Central

    Matsuyama, Tomoki; Kimura, Makoto T.; Koike, Kuniaki; Abe, Tomoko; Nakano, Takeshi; Asami, Tadao; Ebisuzaki, Toshikazu; Held, William A.; Yoshida, Shigeo; Nagase, Hiroki

    2003-01-01

    Understanding the role of ‘epigenetic’ changes such as DNA methylation and chromatin remodeling has now become critical in understanding many biological processes. In order to delineate the global methylation pattern in a given genomic DNA, computer software has been developed to create a virtual image of restriction landmark genomic scanning (Vi-RLGS). When using a methylation- sensitive enzyme such as NotI as the restriction landmark, the comparison between real and in silico RLGS profiles of the genome provides a methylation map of genomic NotI sites. A methylation map of the Arabidopsis genome was created that could be confirmed by a methylation-sensitive PCR assay. The method has also been applied to the mouse genome. Although a complete methylation map has not been completed, a region of methylation difference between two tissues has been tested and confirmed by bisulfite sequencing. Vi-RLGS in conjunction with real RLGS will make it possible to develop a more complete map of genomic sites that are methylated or demethylated as a consequence of normal or abnormal development. PMID:12888509

  2. Efficient genome-wide association in biobanks using topic modeling identifies multiple novel disease loci.

    PubMed

    McCoy, Thomas H; Castro, Victor M; Snapper, Leslie A; Hart, Kamber L; Perlis, Roy H

    2017-08-31

    Biobanks and national registries represent a powerful tool for genomic discovery, but rely on diagnostic codes that may be unreliable and fail to capture the relationship between related diagnoses. We developed an efficient means of conducting genome-wide association studies using combinations of diagnostic codes from electronic health records (EHR) for 10845 participants in a biobanking program at two large academic medical centers. Specifically, we applied latent Dirichilet allocation to fit 50 disease topics based on diagnostic codes, then conducted genome-wide common-variant association for each topic. In sensitivity analysis, these results were contrasted with those obtained from traditional single-diagnosis phenome-wide association analysis, as well as those in which only a subset of diagnostic codes are included per topic. In meta-analysis across three biobank cohorts, we identified 23 disease-associated loci with p<1e-15, including previously associated autoimmune disease loci. In all cases, observed significant associations were of greater magnitude than for single phenome-wide diagnostic codes, and incorporation of less strongly-loading diagnostic codes enhanced association. This strategy provides a more efficient means of phenome-wide association in biobanks with coded clinical data.

  3. Efficient Genome-wide Association in Biobanks Using Topic Modeling Identifies Multiple Novel Disease Loci

    PubMed Central

    McCoy, Thomas H; Castro, Victor M; Snapper, Leslie A; Hart, Kamber L; Perlis, Roy H

    2017-01-01

    Biobanks and national registries represent a powerful tool for genomic discovery, but rely on diagnostic codes that can be unreliable and fail to capture relationships between related diagnoses. We developed an efficient means of conducting genome-wide association studies using combinations of diagnostic codes from electronic health records for 10,845 participants in a biobanking program at two large academic medical centers. Specifically, we applied latent Dirichilet allocation to fit 50 disease topics based on diagnostic codes, then conducted a genome-wide common-variant association for each topic. In sensitivity analysis, these results were contrasted with those obtained from traditional single-diagnosis phenome-wide association analysis, as well as those in which only a subset of diagnostic codes were included per topic. In meta-analysis across three biobank cohorts, we identified 23 disease-associated loci with p < 1e-15, including previously associated autoimmune disease loci. In all cases, observed significant associations were of greater magnitude than single phenome-wide diagnostic codes, and incorporation of less strongly loading diagnostic codes enhanced association. This strategy provides a more efficient means of identifying phenome-wide associations in biobanks with coded clinical data. PMID:28861588

  4. Wide-field reflective scanning optical systems

    NASA Technical Reports Server (NTRS)

    Abel, I. R.

    1973-01-01

    Catoptric optical scanning system provides relatively fast line-scan rate for two-dimensional coverage. Rapid scan rates require low focal ratios between components and smallest possible masses. System is relatively free from monochromatic defects and chromatic aberrations.

  5. Genome-wide Association Studies from the Cancer Genetic Markers of Susceptibility (CGEMS) Initiative | Office of Cancer Genomics

    Cancer.gov

    CGEMS identifies common inherited genetic variations associated with a number of cancers, including breast and prostate. Data from these genome-wide association studies (GWAS) are available through the Division of Cancer Epidemiology & Genetics website.

  6. Susceptibility to Childhood Pneumonia: A Genome-Wide Analysis.

    PubMed

    Hayden, Lystra P; Cho, Michael H; McDonald, Merry-Lynn N; Crapo, James D; Beaty, Terri H; Silverman, Edwin K; Hersh, Craig P

    2017-01-01

    Previous studies have indicated that in adult smokers, a history of childhood pneumonia is associated with reduced lung function and chronic obstructive pulmonary disease. There have been few previous investigations using genome-wide association studies to investigate genetic predisposition to pneumonia. This study aims to identify the genetic variants associated with the development of pneumonia during childhood and over the course of the lifetime. Study subjects included current and former smokers with and without chronic obstructive pulmonary disease participating in the COPDGene Study. Pneumonia was defined by subject self-report, with childhood pneumonia categorized as having the first episode at <16 years. Genome-wide association studies for childhood pneumonia (843 cases, 9,091 control subjects) and lifetime pneumonia (3,766 cases, 5,659 control subjects) were performed separately in non-Hispanic whites and African Americans. Non-Hispanic white and African American populations were combined in the meta-analysis. Top genetic variants from childhood pneumonia were assessed in network analysis. No single-nucleotide polymorphisms reached genome-wide significance, although we identified potential regions of interest. In the childhood pneumonia analysis, this included variants in NGR1 (P = 6.3 × 10 -8 ), PAK6 (P = 3.3 × 10 -7 ), and near MATN1 (P = 2.8 × 10 -7 ). In the lifetime pneumonia analysis, this included variants in LOC339862 (P = 8.7 × 10 -7 ), RAPGEF2 (P = 8.4 × 10 -7 ), PHACTR1 (P = 6.1 × 10 -7 ), near PRR27 (P = 4.3 × 10 -7 ), and near MCPH1 (P = 2.7 × 10 -7 ). Network analysis of the genes associated with childhood pneumonia included top networks related to development, blood vessel morphogenesis, muscle contraction, WNT signaling, DNA damage, apoptosis, inflammation, and immune response (P ≤ 0.05). We have identified genes potentially associated with the risk of pneumonia

  7. Genome-wide association studies in Alzheimer disease.

    PubMed

    Waring, Stephen C; Rosenberg, Roger N

    2008-03-01

    The genetics of Alzheimer disease (AD) to date support an age-dependent dichotomous model whereby earlier age of disease onset (< 60 years) is explained by 3 fully penetrant genes (APP [NCBI Entrez gene 351], PSEN1 [NCBI Entrez gene 5663], and PSEN2 [NCBI Entrez gene 5664]), whereas later age of disease onset (> or = 65 years) representing most cases of AD has yet to be explained by a purely genetic model. The APOE gene (NCBI Entrez gene 348) is the strongest genetic risk factor for later onset, although it is neither sufficient nor necessary to explain all occurrences of disease. Numerous putative genetic risk alleles and genetic variants have been reported. Although all have relevance to biological mechanisms that may be associated with AD pathogenesis, they await replication in large representative populations. Genome-wide association studies have emerged as an increasingly effective tool for identifying genetic contributions to complex diseases and represent the next frontier for furthering our understanding of the underlying etiologic, biological, and pathologic mechanisms associated with chronic complex disorders. There have already been success stories for diseases such as macular degeneration and diabetes mellitus. Whether this will hold true for a genetically complex and heterogeneous disease such as AD is not known, although early reports are encouraging. This review considers recent publications from studies that have successfully applied genome-wide association methods to investigations of AD by taking advantage of the currently available high-throughput arrays, bioinformatics, and software advances. The inherent strengths, limitations, and challenges associated with study design issues in the context of AD are presented herein.

  8. Quality control and conduct of genome-wide association meta-analyses.

    PubMed

    Winkler, Thomas W; Day, Felix R; Croteau-Chonka, Damien C; Wood, Andrew R; Locke, Adam E; Mägi, Reedik; Ferreira, Teresa; Fall, Tove; Graff, Mariaelisa; Justice, Anne E; Luan, Jian'an; Gustafsson, Stefan; Randall, Joshua C; Vedantam, Sailaja; Workalemahu, Tsegaselassie; Kilpeläinen, Tuomas O; Scherag, André; Esko, Tonu; Kutalik, Zoltán; Heid, Iris M; Loos, Ruth J F

    2014-05-01

    Rigorous organization and quality control (QC) are necessary to facilitate successful genome-wide association meta-analyses (GWAMAs) of statistics aggregated across multiple genome-wide association studies. This protocol provides guidelines for (i) organizational aspects of GWAMAs, and for (ii) QC at the study file level, the meta-level across studies and the meta-analysis output level. Real-world examples highlight issues experienced and solutions developed by the GIANT Consortium that has conducted meta-analyses including data from 125 studies comprising more than 330,000 individuals. We provide a general protocol for conducting GWAMAs and carrying out QC to minimize errors and to guarantee maximum use of the data. We also include details for the use of a powerful and flexible software package called EasyQC. Precise timings will be greatly influenced by consortium size. For consortia of comparable size to the GIANT Consortium, this protocol takes a minimum of about 10 months to complete.

  9. A Genome Wide Survey of SNP Variation Reveals the Genetic Structure of Sheep Breeds

    USDA-ARS?s Scientific Manuscript database

    The genetic structure of sheep reflects their domestication and subsequent formation into discrete breeds. Understanding genetic structure is essential for achieving genetic improvement through genome-wide association studies, genomic selection and the dissection of quantitative traits. After identi...

  10. Detection of genome-wide copy number variants in myeloid malignancies using next-generation sequencing.

    PubMed

    Shen, Wei; Paxton, Christian N; Szankasi, Philippe; Longhurst, Maria; Schumacher, Jonathan A; Frizzell, Kimberly A; Sorrells, Shelly M; Clayton, Adam L; Jattani, Rakhi P; Patel, Jay L; Toydemir, Reha; Kelley, Todd W; Xu, Xinjie

    2018-04-01

    Genetic abnormalities, including copy number variants (CNV), copy number neutral loss of heterozygosity (CN-LOH) and gene mutations, underlie the pathogenesis of myeloid malignancies and serve as important diagnostic, prognostic and/or therapeutic markers. Currently, multiple testing strategies are required for comprehensive genetic testing in myeloid malignancies. The aim of this proof-of-principle study was to investigate the feasibility of combining detection of genome-wide large CNVs, CN-LOH and targeted gene mutations into a single assay using next-generation sequencing (NGS). For genome-wide CNV detection, we designed a single nucleotide polymorphism (SNP) sequencing backbone with 22 762 SNP regions evenly distributed across the entire genome. For targeted mutation detection, 62 frequently mutated genes in myeloid malignancies were targeted. We combined this SNP sequencing backbone with a targeted mutation panel, and sequenced 9 healthy individuals and 16 patients with myeloid malignancies using NGS. We detected 52 somatic CNVs, 11 instances of CN-LOH and 39 oncogenic mutations in the 16 patients with myeloid malignancies, and none in the 9 healthy individuals. All CNVs and CN-LOH were confirmed by SNP microarray analysis. We describe a genome-wide SNP sequencing backbone which allows for sensitive detection of genome-wide CNVs and CN-LOH using NGS. This proof-of-principle study has demonstrated that this strategy can provide more comprehensive genetic profiling for patients with myeloid malignancies using a single assay. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2018. All rights reserved. No commercial use is permitted unless otherwise expressly granted.

  11. GeneCount: genome-wide calculation of absolute tumor DNA copy numbers from array comparative genomic hybridization data

    PubMed Central

    Lyng, Heidi; Lando, Malin; Brøvig, Runar S; Svendsrud, Debbie H; Johansen, Morten; Galteland, Eivind; Brustugun, Odd T; Meza-Zepeda, Leonardo A; Myklebost, Ola; Kristensen, Gunnar B; Hovig, Eivind; Stokke, Trond

    2008-01-01

    Absolute tumor DNA copy numbers can currently be achieved only on a single gene basis by using fluorescence in situ hybridization (FISH). We present GeneCount, a method for genome-wide calculation of absolute copy numbers from clinical array comparative genomic hybridization data. The tumor cell fraction is reliably estimated in the model. Data consistent with FISH results are achieved. We demonstrate significant improvements over existing methods for exploring gene dosages and intratumor copy number heterogeneity in cancers. PMID:18500990

  12. Linkage Disequilibrium And Genome-Wide Association Studies In O. sativa

    USDA-ARS?s Scientific Manuscript database

    There is increasing evidence that genome-wide association studies provide a powerful approach to find the genetic basis of complex phenotypic variation in all kinds of species. For this purpose, we developed the first generation 44K Affymetrix SNP array in rice (see Tung et al. poster). We genotyped...

  13. A genome-wide linkage study of mammographic density, a risk factor for breast cancer

    PubMed Central

    2011-01-01

    Introduction Mammographic breast density is a highly heritable (h2 > 0.6) and strong risk factor for breast cancer. We conducted a genome-wide linkage study to identify loci influencing mammographic breast density (MD). Methods Epidemiological data were assembled on 1,415 families from the Australia, Northern California and Ontario sites of the Breast Cancer Family Registry, and additional families recruited in Australia and Ontario. Families consisted of sister pairs with age-matched mammograms and data on factors known to influence MD. Single nucleotide polymorphism (SNP) genotyping was performed on 3,952 individuals using the Illumina Infinium 6K linkage panel. Results Using a variance components method, genome-wide linkage analysis was performed using quantitative traits obtained by adjusting MD measurements for known covariates. Our primary trait was formed by fitting a linear model to the square root of the percentage of the breast area that was dense (PMD), adjusting for age at mammogram, number of live births, menopausal status, weight, height, weight squared, and menopausal hormone therapy. The maximum logarithm of odds (LOD) score from the genome-wide scan was on chromosome 7p14.1-p13 (LOD = 2.69; 63.5 cM) for covariate-adjusted PMD, with a 1-LOD interval spanning 8.6 cM. A similar signal was seen for the covariate adjusted area of the breast that was dense (DA) phenotype. Simulations showed that the complete sample had adequate power to detect LOD scores of 3 or 3.5 for a locus accounting for 20% of phenotypic variance. A modest peak initially seen on chromosome 7q32.3-q34 increased in strength when only the 513 families with at least two sisters below 50 years of age were included in the analysis (LOD 3.2; 140.7 cM, 1-LOD interval spanning 9.6 cM). In a subgroup analysis, we also found a LOD score of 3.3 for DA phenotype on chromosome 12.11.22-q13.11 (60.8 cM, 1-LOD interval spanning 9.3 cM), overlapping a region identified in a previous study

  14. Meta-Analyses of Genome-Wide Association Data Hold New Promise for Addiction Genetics.

    PubMed

    Agrawal, Arpana; Edenberg, Howard J; Gelernter, Joel

    2016-09-01

    Meta-analyses of genome-wide association study data have begun to lead to promising new discoveries for behavioral and psychiatrically relevant phenotypes (e.g., schizophrenia, educational attainment). We outline how this methodology can similarly lead to novel discoveries in genomic studies of substance use disorders, and discuss challenges that will need to be overcome to accomplish this goal. We illustrate our approach with the work of the newly established Substance Use Disorders workgroup of the Psychiatric Genomics Consortium.

  15. Genomic prediction in contrast to a genome-wide association study in explaining heritable variation of complex growth traits in breeding populations of Eucalyptus.

    PubMed

    Müller, Bárbara S F; Neves, Leandro G; de Almeida Filho, Janeo E; Resende, Márcio F R; Muñoz, Patricio R; Dos Santos, Paulo E T; Filho, Estefano Paludzyszyn; Kirst, Matias; Grattapaglia, Dario

    2017-07-11

    The advent of high-throughput genotyping technologies coupled to genomic prediction methods established a new paradigm to integrate genomics and breeding. We carried out whole-genome prediction and contrasted it to a genome-wide association study (GWAS) for growth traits in breeding populations of Eucalyptus benthamii (n =505) and Eucalyptus pellita (n =732). Both species are of increasing commercial interest for the development of germplasm adapted to environmental stresses. Predictive ability reached 0.16 in E. benthamii and 0.44 in E. pellita for diameter growth. Predictive abilities using either Genomic BLUP or different Bayesian methods were similar, suggesting that growth adequately fits the infinitesimal model. Genomic prediction models using ~5000-10,000 SNPs provided predictive abilities equivalent to using all 13,787 and 19,506 SNPs genotyped in the E. benthamii and E. pellita populations, respectively. No difference was detected in predictive ability when different sets of SNPs were utilized, based on position (equidistantly genome-wide, inside genes, linkage disequilibrium pruned or on single chromosomes), as long as the total number of SNPs used was above ~5000. Predictive abilities obtained by removing relatedness between training and validation sets fell near zero for E. benthamii and were halved for E. pellita. These results corroborate the current view that relatedness is the main driver of genomic prediction, although some short-range historical linkage disequilibrium (LD) was likely captured for E. pellita. A GWAS identified only one significant association for volume growth in E. pellita, illustrating the fact that while genome-wide regression is able to account for large proportions of the heritability, very little or none of it is captured into significant associations using GWAS in breeding populations of the size evaluated in this study. This study provides further experimental data supporting positive prospects of using genome-wide data to

  16. Inference of gene regulatory networks from genome-wide knockout fitness data

    PubMed Central

    Wang, Liming; Wang, Xiaodong; Arkin, Adam P.; Samoilov, Michael S.

    2013-01-01

    Motivation: Genome-wide fitness is an emerging type of high-throughput biological data generated for individual organisms by creating libraries of knockouts, subjecting them to broad ranges of environmental conditions, and measuring the resulting clone-specific fitnesses. Since fitness is an organism-scale measure of gene regulatory network behaviour, it may offer certain advantages when insights into such phenotypical and functional features are of primary interest over individual gene expression. Previous works have shown that genome-wide fitness data can be used to uncover novel gene regulatory interactions, when compared with results of more conventional gene expression analysis. Yet, to date, few algorithms have been proposed for systematically using genome-wide mutant fitness data for gene regulatory network inference. Results: In this article, we describe a model and propose an inference algorithm for using fitness data from knockout libraries to identify underlying gene regulatory networks. Unlike most prior methods, the presented approach captures not only structural, but also dynamical and non-linear nature of biomolecular systems involved. A state–space model with non-linear basis is used for dynamically describing gene regulatory networks. Network structure is then elucidated by estimating unknown model parameters. Unscented Kalman filter is used to cope with the non-linearities introduced in the model, which also enables the algorithm to run in on-line mode for practical use. Here, we demonstrate that the algorithm provides satisfying results for both synthetic data as well as empirical measurements of GAL network in yeast Saccharomyces cerevisiae and TyrR–LiuR network in bacteria Shewanella oneidensis. Availability: MATLAB code and datasets are available to download at http://www.duke.edu/∼lw174/Fitness.zip and http://genomics.lbl.gov/supplemental/fitness-bioinf/ Contact: wangx@ee.columbia.edu or mssamoilov@lbl.gov Supplementary information

  17. Genome-wide meta-analyses of stratified depression in Generation Scotland and UK Biobank.

    PubMed

    Hall, Lynsey S; Adams, Mark J; Arnau-Soler, Aleix; Clarke, Toni-Kim; Howard, David M; Zeng, Yanni; Davies, Gail; Hagenaars, Saskia P; Maria Fernandez-Pujals, Ana; Gibson, Jude; Wigmore, Eleanor M; Boutin, Thibaud S; Hayward, Caroline; Scotland, Generation; Porteous, David J; Deary, Ian J; Thomson, Pippa A; Haley, Chris S; McIntosh, Andrew M

    2018-01-10

    Few replicable genetic associations for Major Depressive Disorder (MDD) have been identified. Recent studies of MDD have identified common risk variants by using a broader phenotype definition in very large samples, or by reducing phenotypic and ancestral heterogeneity. We sought to ascertain whether it is more informative to maximize the sample size using data from all available cases and controls, or to use a sex or recurrent stratified subset of affected individuals. To test this, we compared heritability estimates, genetic correlation with other traits, variance explained by MDD polygenic score, and variants identified by genome-wide meta-analysis for broad and narrow MDD classifications in two large British cohorts - Generation Scotland and UK Biobank. Genome-wide meta-analysis of MDD in males yielded one genome-wide significant locus on 3p22.3, with three genes in this region (CRTAP, GLB1, and TMPPE) demonstrating a significant association in gene-based tests. Meta-analyzed MDD, recurrent MDD and female MDD yielded equivalent heritability estimates, showed no detectable difference in association with polygenic scores, and were each genetically correlated with six health-correlated traits (neuroticism, depressive symptoms, subjective well-being, MDD, a cross-disorder phenotype and Bipolar Disorder). Whilst stratified GWAS analysis revealed a genome-wide significant locus for male MDD, the lack of independent replication, and the consistent pattern of results in other MDD classifications suggests that phenotypic stratification using recurrence or sex in currently available sample sizes is currently weakly justified. Based upon existing studies and our findings, the strategy of maximizing sample sizes is likely to provide the greater gain.

  18. Nonlinear analysis and dynamic compensation of stylus scanning measurement with wide range

    NASA Astrophysics Data System (ADS)

    Hui, Heiyang; Liu, Xiaojun; Lu, Wenlong

    2011-12-01

    Surface topography is an important geometrical feature of a workpiece that influences its quality and functions such as friction, wearing, lubrication and sealing. Precision measurement of surface topography is fundamental for product quality characterizing and assurance. Stylus scanning technique is a widely used method for surface topography measurement, and it is also regarded as the international standard method for 2-D surface characterizing. Usually surface topography, including primary profile, waviness and roughness, can be measured precisely and efficiently by this method. However, by stylus scanning method to measure curved surface topography, the nonlinear error is unavoidable because of the difference of horizontal position of the actual measured point from given sampling point and the nonlinear transformation process from vertical displacement of the stylus tip to angle displacement of the stylus arm, and the error increases with the increasing of measuring range. In this paper, a wide range stylus scanning measurement system based on cylindrical grating interference principle is constructed, the originations of the nonlinear error are analyzed, the error model is established and a solution to decrease the nonlinear error is proposed, through which the error of the collected data is dynamically compensated.

  19. Impacts of Genome-Wide Analyses on Our Understanding of Human Herpesvirus Diversity and Evolution.

    PubMed

    Renner, Daniel W; Szpara, Moriah L

    2018-01-01

    Until fairly recently, genome-wide evolutionary dynamics and within-host diversity were more commonly examined in the context of small viruses than in the context of large double-stranded DNA viruses such as herpesviruses. The high mutation rates and more compact genomes of RNA viruses have inspired the investigation of population dynamics for these species, and recent data now suggest that herpesviruses might also be considered candidates for population modeling. High-throughput sequencing (HTS) and bioinformatics have expanded our understanding of herpesviruses through genome-wide comparisons of sequence diversity, recombination, allele frequency, and selective pressures. Here we discuss recent data on the mechanisms that generate herpesvirus genomic diversity and underlie the evolution of these virus families. We focus on human herpesviruses, with key insights drawn from veterinary herpesviruses and other large DNA virus families. We consider the impacts of cell culture on herpesvirus genomes and how to accurately describe the viral populations under study. The need for a strong foundation of high-quality genomes is also discussed, since it underlies all secondary genomic analyses such as RNA sequencing (RNA-Seq), chromatin immunoprecipitation, and ribosome profiling. Areas where we foresee future progress, such as the linking of viral genetic differences to phenotypic or clinical outcomes, are highlighted as well. Copyright © 2017 Renner and Szpara.

  20. Genome-Wide Mapping of Copy Number Variation in Humans: Comparative Analysis of High Resolution Array Platforms

    PubMed Central

    Haraksingh, Rajini R.; Abyzov, Alexej; Gerstein, Mark; Urban, Alexander E.; Snyder, Michael

    2011-01-01

    Accurate and efficient genome-wide detection of copy number variants (CNVs) is essential for understanding human genomic variation, genome-wide CNV association type studies, cytogenetics research and diagnostics, and independent validation of CNVs identified from sequencing based technologies. Numerous, array-based platforms for CNV detection exist utilizing array Comparative Genome Hybridization (aCGH), Single Nucleotide Polymorphism (SNP) genotyping or both. We have quantitatively assessed the abilities of twelve leading genome-wide CNV detection platforms to accurately detect Gold Standard sets of CNVs in the genome of HapMap CEU sample NA12878, and found significant differences in performance. The technologies analyzed were the NimbleGen 4.2 M, 2.1 M and 3×720 K Whole Genome and CNV focused arrays, the Agilent 1×1 M CGH and High Resolution and 2×400 K CNV and SNP+CGH arrays, the Illumina Human Omni1Quad array and the Affymetrix SNP 6.0 array. The Gold Standards used were a 1000 Genomes Project sequencing-based set of 3997 validated CNVs and an ultra high-resolution aCGH-based set of 756 validated CNVs. We found that sensitivity, total number, size range and breakpoint resolution of CNV calls were highest for CNV focused arrays. Our results are important for cost effective CNV detection and validation for both basic and clinical applications. PMID:22140474

  1. Genome-wide association study identifies multiple loci associated with bladder cancer risk

    PubMed Central

    Figueroa, Jonine D.; Ye, Yuanqing; Siddiq, Afshan; Garcia-Closas, Montserrat; Chatterjee, Nilanjan; Prokunina-Olsson, Ludmila; Cortessis, Victoria K.; Kooperberg, Charles; Cussenot, Olivier; Benhamou, Simone; Prescott, Jennifer; Porru, Stefano; Dinney, Colin P.; Malats, Núria; Baris, Dalsu; Purdue, Mark; Jacobs, Eric J.; Albanes, Demetrius; Wang, Zhaoming; Deng, Xiang; Chung, Charles C.; Tang, Wei; Bas Bueno-de-Mesquita, H.; Trichopoulos, Dimitrios; Ljungberg, Börje; Clavel-Chapelon, Françoise; Weiderpass, Elisabete; Krogh, Vittorio; Dorronsoro, Miren; Travis, Ruth; Tjønneland, Anne; Brenan, Paul; Chang-Claude, Jenny; Riboli, Elio; Conti, David; Gago-Dominguez, Manuela; Stern, Mariana C.; Pike, Malcolm C.; Van Den Berg, David; Yuan, Jian-Min; Hohensee, Chancellor; Rodabough, Rebecca; Cancel-Tassin, Geraldine; Roupret, Morgan; Comperat, Eva; Chen, Constance; De Vivo, Immaculata; Giovannucci, Edward; Hunter, David J.; Kraft, Peter; Lindstrom, Sara; Carta, Angela; Pavanello, Sofia; Arici, Cecilia; Mastrangelo, Giuseppe; Kamat, Ashish M.; Lerner, Seth P.; Barton Grossman, H.; Lin, Jie; Gu, Jian; Pu, Xia; Hutchinson, Amy; Burdette, Laurie; Wheeler, William; Kogevinas, Manolis; Tardón, Adonina; Serra, Consol; Carrato, Alfredo; García-Closas, Reina; Lloreta, Josep; Schwenn, Molly; Karagas, Margaret R.; Johnson, Alison; Schned, Alan; Armenti, Karla R.; Hosain, G.M.; Andriole, Gerald; Grubb, Robert; Black, Amanda; Ryan Diver, W.; Gapstur, Susan M.; Weinstein, Stephanie J.; Virtamo, Jarmo; Haiman, Chris A.; Landi, Maria T.; Caporaso, Neil; Fraumeni, Joseph F.; Vineis, Paolo; Wu, Xifeng; Silverman, Debra T.; Chanock, Stephen; Rothman, Nathaniel

    2014-01-01

    Candidate gene and genome-wide association studies (GWAS) have identified 11 independent susceptibility loci associated with bladder cancer risk. To discover additional risk variants, we conducted a new GWAS of 2422 bladder cancer cases and 5751 controls, followed by a meta-analysis with two independently published bladder cancer GWAS, resulting in a combined analysis of 6911 cases and 11 814 controls of European descent. TaqMan genotyping of 13 promising single nucleotide polymorphisms with P < 1 × 10−5 was pursued in a follow-up set of 801 cases and 1307 controls. Two new loci achieved genome-wide statistical significance: rs10936599 on 3q26.2 (P = 4.53 × 10−9) and rs907611 on 11p15.5 (P = 4.11 × 10−8). Two notable loci were also identified that approached genome-wide statistical significance: rs6104690 on 20p12.2 (P = 7.13 × 10−7) and rs4510656 on 6p22.3 (P = 6.98 × 10−7); these require further studies for confirmation. In conclusion, our study has identified new susceptibility alleles for bladder cancer risk that require fine-mapping and laboratory investigation, which could further understanding into the biological underpinnings of bladder carcinogenesis. PMID:24163127

  2. Genome-wide gene–environment interaction analysis for asbestos exposure in lung cancer susceptibility

    PubMed Central

    Wei, Qingyi Wei

    2012-01-01

    Asbestos exposure is a known risk factor for lung cancer. Although recent genome-wide association studies (GWASs) have identified some novel loci for lung cancer risk, few addressed genome-wide gene–environment interactions. To determine gene–asbestos interactions in lung cancer risk, we conducted genome-wide gene–environment interaction analyses at levels of single nucleotide polymorphisms (SNPs), genes and pathways, using our published Texas lung cancer GWAS dataset. This dataset included 317 498 SNPs from 1154 lung cancer cases and 1137 cancer-free controls. The initial SNP-level P-values for interactions between genetic variants and self-reported asbestos exposure were estimated by unconditional logistic regression models with adjustment for age, sex, smoking status and pack-years. The P-value for the most significant SNP rs13383928 was 2.17×10–6, which did not reach the genome-wide statistical significance. Using a versatile gene-based test approach, we found that the top significant gene was C7orf54, located on 7q32.1 (P = 8.90×10–5). Interestingly, most of the other significant genes were located on 11q13. When we used an improved gene-set-enrichment analysis approach, we found that the Fas signaling pathway and the antigen processing and presentation pathway were most significant (nominal P < 0.001; false discovery rate < 0.05) among 250 pathways containing 17 572 genes. We believe that our analysis is a pilot study that first describes the gene–asbestos interaction in lung cancer risk at levels of SNPs, genes and pathways. Our findings suggest that immune function regulation-related pathways may be mechanistically involved in asbestos-associated lung cancer risk. Abbreviations:CIconfidence intervalEenvironmentFDRfalse discovery rateGgeneGSEAgene-set-enrichment analysisGWASgenome-wide association studiesi-GSEAimproved gene-set-enrichment analysis approachORodds ratioSNPsingle nucleotide polymorphism PMID:22637743

  3. Genome-wide mapping of autonomous promoter activity in human cells

    PubMed Central

    van Arensbergen, Joris; FitzPatrick, Vincent D.; de Haas, Marcel; Pagie, Ludo; Sluimer, Jasper; Bussemaker, Harmen J.; van Steensel, Bas

    2017-01-01

    Previous methods to systematically characterize sequence-intrinsic activity of promoters have been limited by relatively low throughput and the length of sequences that could be tested. Here we present Survey of Regulatory Elements (SuRE), a method to assay more than 108 DNA fragments, each 0.2–2kb in size, for their ability to drive transcription autonomously. In SuRE, a plasmid library is constructed of random genomic fragments upstream of a 20bp barcode and decoded by paired-end sequencing. This library is then transfected into cells and transcribed barcodes are quantified in the RNA by high throughput sequencing. When applied to the human genome, we achieved a 55-fold genome coverage, allowing us to map autonomous promoter activity genome-wide. By computational modeling we delineated subregions within promoters that are relevant for their activity. For instance, we show that antisense promoter transcription is generally dependent on the sense core promoter sequences, and that most enhancers and several families of repetitive elements act as autonomous transcription initiation sites. PMID:28024146

  4. Genome-wide association study identifies 74 loci associated with educational attainment

    PubMed Central

    Okbay, Aysu; Beauchamp, Jonathan P.; Fontana, Mark A.; Lee, James J.; Pers, Tune H.; Rietveld, Cornelius A.; Turley, Patrick; Chen, Guo-Bo; Emilsson, Valur; Meddens, S. Fleur W.; Oskarsson, Sven; Pickrell, Joseph K.; Thom, Kevin; Timshel, Pascal; de Vlaming, Ronald; Abdellaoui, Abdel; Ahluwalia, Tarunveer S.; Bacelis, Jonas; Baumbach, Clemens; Bjornsdottir, Gyda; Brandsma, Johannes H.; Concas, Maria Pina; Derringer, Jaime; Furlotte, Nicholas A.; Galesloot, Tessel E.; Girotto, Giorgia; Gupta, Richa; Hall, Leanne M.; Harris, Sarah E.; Hofer, Edith; Horikoshi, Momoko; Huffman, Jennifer E.; Kaasik, Kadri; Kalafati, Ioanna P.; Karlsson, Robert; Kong, Augustine; Lahti, Jari; van der Lee, Sven J.; de Leeuw, Christiaan; Lind, Penelope A.; Lindgren, Karl-Oskar; Liu, Tian; Mangino, Massimo; Marten, Jonathan; Mihailov, Evelin; Miller, Michael B.; van der Most, Peter J.; Oldmeadow, Christopher; Payton, Antony; Pervjakova, Natalia; Peyrot, Wouter J.; Qian, Yong; Raitakari, Olli; Rueedi, Rico; Salvi, Erika; Schmidt, Börge; Schraut, Katharina E.; Shi, Jianxin; Smith, Albert V.; Poot, Raymond A.; Pourcain, Beate; Teumer, Alexander; Thorleifsson, Gudmar; Verweij, Niek; Vuckovic, Dragana; Wellmann, Juergen; Westra, Harm-Jan; Yang, Jingyun; Zhao, Wei; Zhu, Zhihong; Alizadeh, Behrooz Z.; Amin, Najaf; Bakshi, Andrew; Baumeister, Sebastian E.; Biino, Ginevra; Bønnelykke, Klaus; Boyle, Patricia A.; Campbell, Harry; Cappuccio, Francesco P.; Davies, Gail; De Neve, Jan-Emmanuel; Deloukas, Panos; Demuth, Ilja; Ding, Jun; Eibich, Peter; Eisele, Lewin; Eklund, Niina; Evans68, David M.; Faul, Jessica D.; Feitosa, Mary F.; Forstner, Andreas J.; Gandin, Ilaria; Gunnarsson, Bjarni; Halldórsson, Bjarni V.; Harris, Tamara B.; Heath, Andrew C.; Hocking, Lynne J.; Holliday, Elizabeth G.; Homuth, Georg; Horan, Michael A.; Hottenga, Jouke-Jan; de Jager, Philip L.; Joshi, Peter K.; Jugessur, Astanand; Kaakinen, Marika A.; Kähönen, Mika; Kanoni, Stavroula; Keltigangas-Järvinen, Liisa; Kiemeney, Lambertus A.L.M.; Kolcic, Ivana; Koskinen, Seppo; Kraja, Aldi T.; Kroh, Martin; Kutalik, Zoltan; Latvala, Antti; Launer, Lenore J.; Lebreton, Maël P.; Levinson, Douglas F.; Lichtenstein, Paul; Lichtner, Peter; Liewald, David C.M.; Loukola, Anu; Madden, Pamela A.; Mägi, Reedik; Mäki-Opas, Tomi; Marioni, Riccardo E.; Marques-Vidal, Pedro; Meddens, Gerardus A.; McMahon, George; Meisinger, Christa; Meitinger, Thomas; Milaneschi, Yusplitri; Milani, Lili; Montgomery, Grant W.; Myhre, Ronny; Nelson, Christopher P.; Nyholt, Dale R.; Ollier, William E.R.; Palotie, Aarno; Paternoster, Lavinia; Pedersen, Nancy L.; Petrovic, Katja E.; Porteous, David J.; Räikkönen, Katri; Ring, Susan M.; Robino, Antonietta; Rostapshova, Olga; Rudan, Igor; Rustichini, Aldo; Salomaa, Veikko; Sanders, Alan R.; Sarin, Antti-Pekka; Schmidt, Helena; Scott, Rodney J.; Smith, Blair H.; Smith, Jennifer A.; Staessen, Jan A.; Steinhagen-Thiessen, Elisabeth; Strauch, Konstantin; Terracciano, Antonio; Tobin, Martin D.; Ulivi, Sheila; Vaccargiu, Simona; Quaye, Lydia; van Rooij, Frank J.A.; Venturini, Cristina; Vinkhuyzen, Anna A.E.; Völker, Uwe; Völzke, Henry; Vonk, Judith M.; Vozzi, Diego; Waage, Johannes; Ware, Erin B.; Willemsen, Gonneke; Attia, John R.; Bennett, David A.; Berger, Klaus; Bertram, Lars; Bisgaard, Hans; Boomsma, Dorret I.; Borecki, Ingrid B.; Bultmann, Ute; Chabris, Christopher F.; Cucca, Francesco; Cusi, Daniele; Deary, Ian J.; Dedoussis, George V.; van Duijn, Cornelia M.; Eriksson, Johan G.; Franke, Barbara; Franke, Lude; Gasparini, Paolo; Gejman, Pablo V.; Gieger, Christian; Grabe, Hans-Jörgen; Gratten, Jacob; Groenen, Patrick J.F.; Gudnason, Vilmundur; van der Harst, Pim; Hayward, Caroline; Hinds, David A.; Hoffmann, Wolfgang; Hyppönen, Elina; Iacono, William G.; Jacobsson, Bo; Järvelin, Marjo-Riitta; Jöckel, Karl-Heinz; Kaprio, Jaakko; Kardia, Sharon L.R.; Lehtimäki, Terho; Lehrer, Steven F.; Magnusson, Patrik K.E.; Martin, Nicholas G.; McGue, Matt; Metspalu, Andres; Pendleton, Neil; Penninx, Brenda W.J.H.; Perola, Markus; Pirastu, Nicola; Pirastu, Mario; Polasek, Ozren; Posthuma, Danielle; Power, Christine; Province, Michael A.; Samani, Nilesh J.; Schlessinger, David; Schmidt, Reinhold; Sørensen, Thorkild I.A.; Spector, Tim D.; Stefansson, Kari; Thorsteinsdottir, Unnur; Thurik, A. Roy; Timpson, Nicholas J.; Tiemeier, Henning; Tung, Joyce Y.; Uitterlinden, André G.; Vitart, Veronique; Vollenweider, Peter; Weir, David R.; Wilson, James F.; Wright, Alan F.; Conley, Dalton C.; Krueger, Robert F.; Smith, George Davey; Hofman, Albert; Laibson, David I.; Medland, Sarah E.; Meyer, Michelle N.; Yang, Jian; Johannesson, Magnus; Visscher, Peter M.; Esko, Tõnu; Koellinger, Philipp D.; Cesarini, David; Benjamin, Daniel J.

    2016-01-01

    Summary Educational attainment (EA) is strongly influenced by social and other environmental factors, but genetic factors are also estimated to account for at least 20% of the variation across individuals1. We report the results of a genome-wide association study (GWAS) for EA that extends our earlier discovery sample1,2 of 101,069 individuals to 293,723 individuals, and a replication in an independent sample of 111,349 individuals from the UK Biobank. We now identify 74 genome-wide significant loci associated with number of years of schooling completed. Single-nucleotide polymorphisms (SNPs) associated with educational attainment are disproportionately found in genomic regions regulating gene expression in the fetal brain. Candidate genes are preferentially expressed in neural tissue, especially during the prenatal period, and enriched for biological pathways involved in neural development. Our findings demonstrate that, even for a behavioral phenotype that is mostly environmentally determined, a well-powered GWAS identifies replicable associated genetic variants that suggest biologically relevant pathways. Because EA is measured in large numbers of individuals, it will continue to be useful as a proxy phenotype in efforts to characterize the genetic influences of related phenotypes, including cognition and neuropsychiatric disease. PMID:27225129

  5. Genome-wide association study identifies 74 loci associated with educational attainment.

    PubMed

    Okbay, Aysu; Beauchamp, Jonathan P; Fontana, Mark Alan; Lee, James J; Pers, Tune H; Rietveld, Cornelius A; Turley, Patrick; Chen, Guo-Bo; Emilsson, Valur; Meddens, S Fleur W; Oskarsson, Sven; Pickrell, Joseph K; Thom, Kevin; Timshel, Pascal; de Vlaming, Ronald; Abdellaoui, Abdel; Ahluwalia, Tarunveer S; Bacelis, Jonas; Baumbach, Clemens; Bjornsdottir, Gyda; Brandsma, Johannes H; Pina Concas, Maria; Derringer, Jaime; Furlotte, Nicholas A; Galesloot, Tessel E; Girotto, Giorgia; Gupta, Richa; Hall, Leanne M; Harris, Sarah E; Hofer, Edith; Horikoshi, Momoko; Huffman, Jennifer E; Kaasik, Kadri; Kalafati, Ioanna P; Karlsson, Robert; Kong, Augustine; Lahti, Jari; van der Lee, Sven J; deLeeuw, Christiaan; Lind, Penelope A; Lindgren, Karl-Oskar; Liu, Tian; Mangino, Massimo; Marten, Jonathan; Mihailov, Evelin; Miller, Michael B; van der Most, Peter J; Oldmeadow, Christopher; Payton, Antony; Pervjakova, Natalia; Peyrot, Wouter J; Qian, Yong; Raitakari, Olli; Rueedi, Rico; Salvi, Erika; Schmidt, Börge; Schraut, Katharina E; Shi, Jianxin; Smith, Albert V; Poot, Raymond A; St Pourcain, Beate; Teumer, Alexander; Thorleifsson, Gudmar; Verweij, Niek; Vuckovic, Dragana; Wellmann, Juergen; Westra, Harm-Jan; Yang, Jingyun; Zhao, Wei; Zhu, Zhihong; Alizadeh, Behrooz Z; Amin, Najaf; Bakshi, Andrew; Baumeister, Sebastian E; Biino, Ginevra; Bønnelykke, Klaus; Boyle, Patricia A; Campbell, Harry; Cappuccio, Francesco P; Davies, Gail; De Neve, Jan-Emmanuel; Deloukas, Panos; Demuth, Ilja; Ding, Jun; Eibich, Peter; Eisele, Lewin; Eklund, Niina; Evans, David M; Faul, Jessica D; Feitosa, Mary F; Forstner, Andreas J; Gandin, Ilaria; Gunnarsson, Bjarni; Halldórsson, Bjarni V; Harris, Tamara B; Heath, Andrew C; Hocking, Lynne J; Holliday, Elizabeth G; Homuth, Georg; Horan, Michael A; Hottenga, Jouke-Jan; de Jager, Philip L; Joshi, Peter K; Jugessur, Astanand; Kaakinen, Marika A; Kähönen, Mika; Kanoni, Stavroula; Keltigangas-Järvinen, Liisa; Kiemeney, Lambertus A L M; Kolcic, Ivana; Koskinen, Seppo; Kraja, Aldi T; Kroh, Martin; Kutalik, Zoltan; Latvala, Antti; Launer, Lenore J; Lebreton, Maël P; Levinson, Douglas F; Lichtenstein, Paul; Lichtner, Peter; Liewald, David C M; Loukola, Anu; Madden, Pamela A; Mägi, Reedik; Mäki-Opas, Tomi; Marioni, Riccardo E; Marques-Vidal, Pedro; Meddens, Gerardus A; McMahon, George; Meisinger, Christa; Meitinger, Thomas; Milaneschi, Yusplitri; Milani, Lili; Montgomery, Grant W; Myhre, Ronny; Nelson, Christopher P; Nyholt, Dale R; Ollier, William E R; Palotie, Aarno; Paternoster, Lavinia; Pedersen, Nancy L; Petrovic, Katja E; Porteous, David J; Räikkönen, Katri; Ring, Susan M; Robino, Antonietta; Rostapshova, Olga; Rudan, Igor; Rustichini, Aldo; Salomaa, Veikko; Sanders, Alan R; Sarin, Antti-Pekka; Schmidt, Helena; Scott, Rodney J; Smith, Blair H; Smith, Jennifer A; Staessen, Jan A; Steinhagen-Thiessen, Elisabeth; Strauch, Konstantin; Terracciano, Antonio; Tobin, Martin D; Ulivi, Sheila; Vaccargiu, Simona; Quaye, Lydia; van Rooij, Frank J A; Venturini, Cristina; Vinkhuyzen, Anna A E; Völker, Uwe; Völzke, Henry; Vonk, Judith M; Vozzi, Diego; Waage, Johannes; Ware, Erin B; Willemsen, Gonneke; Attia, John R; Bennett, David A; Berger, Klaus; Bertram, Lars; Bisgaard, Hans; Boomsma, Dorret I; Borecki, Ingrid B; Bültmann, Ute; Chabris, Christopher F; Cucca, Francesco; Cusi, Daniele; Deary, Ian J; Dedoussis, George V; van Duijn, Cornelia M; Eriksson, Johan G; Franke, Barbara; Franke, Lude; Gasparini, Paolo; Gejman, Pablo V; Gieger, Christian; Grabe, Hans-Jörgen; Gratten, Jacob; Groenen, Patrick J F; Gudnason, Vilmundur; van der Harst, Pim; Hayward, Caroline; Hinds, David A; Hoffmann, Wolfgang; Hyppönen, Elina; Iacono, William G; Jacobsson, Bo; Järvelin, Marjo-Riitta; Jöckel, Karl-Heinz; Kaprio, Jaakko; Kardia, Sharon L R; Lehtimäki, Terho; Lehrer, Steven F; Magnusson, Patrik K E; Martin, Nicholas G; McGue, Matt; Metspalu, Andres; Pendleton, Neil; Penninx, Brenda W J H; Perola, Markus; Pirastu, Nicola; Pirastu, Mario; Polasek, Ozren; Posthuma, Danielle; Power, Christine; Province, Michael A; Samani, Nilesh J; Schlessinger, David; Schmidt, Reinhold; Sørensen, Thorkild I A; Spector, Tim D; Stefansson, Kari; Thorsteinsdottir, Unnur; Thurik, A Roy; Timpson, Nicholas J; Tiemeier, Henning; Tung, Joyce Y; Uitterlinden, André G; Vitart, Veronique; Vollenweider, Peter; Weir, David R; Wilson, James F; Wright, Alan F; Conley, Dalton C; Krueger, Robert F; Davey Smith, George; Hofman, Albert; Laibson, David I; Medland, Sarah E; Meyer, Michelle N; Yang, Jian; Johannesson, Magnus; Visscher, Peter M; Esko, Tõnu; Koellinger, Philipp D; Cesarini, David; Benjamin, Daniel J

    2016-05-26

    Educational attainment is strongly influenced by social and other environmental factors, but genetic factors are estimated to account for at least 20% of the variation across individuals. Here we report the results of a genome-wide association study (GWAS) for educational attainment that extends our earlier discovery sample of 101,069 individuals to 293,723 individuals, and a replication study in an independent sample of 111,349 individuals from the UK Biobank. We identify 74 genome-wide significant loci associated with the number of years of schooling completed. Single-nucleotide polymorphisms associated with educational attainment are disproportionately found in genomic regions regulating gene expression in the fetal brain. Candidate genes are preferentially expressed in neural tissue, especially during the prenatal period, and enriched for biological pathways involved in neural development. Our findings demonstrate that, even for a behavioural phenotype that is mostly environmentally determined, a well-powered GWAS identifies replicable associated genetic variants that suggest biologically relevant pathways. Because educational attainment is measured in large numbers of individuals, it will continue to be useful as a proxy phenotype in efforts to characterize the genetic influences of related phenotypes, including cognition and neuropsychiatric diseases.

  6. Genome-wide differentiation of various melon horticultural groups for use in genome wide association study for fruit firmness and construction of a high resolution genetic map

    USDA-ARS?s Scientific Manuscript database

    We generated 13,789 single nucleotide plymorphism (SNP) markers from 97 melon accessions using genotyping by sequencing and anchored them to chromosomes to understand genome-wide fixation index between various melon morphotypes and linkage disequilibrium (LD) decay for inodorus and cantalupensis, th...

  7. Genetic link between family socioeconomic status and children's educational achievement estimated from genome-wide SNPs.

    PubMed

    Krapohl, E; Plomin, R

    2016-03-01

    One of the best predictors of children's educational achievement is their family's socioeconomic status (SES), but the degree to which this association is genetically mediated remains unclear. For 3000 UK-representative unrelated children we found that genome-wide single-nucleotide polymorphisms could explain a third of the variance of scores on an age-16 UK national examination of educational achievement and half of the correlation between their scores and family SES. Moreover, genome-wide polygenic scores based on a previously published genome-wide association meta-analysis of total number of years in education accounted for ~3.0% variance in educational achievement and ~2.5% in family SES. This study provides the first molecular evidence for substantial genetic influence on differences in children's educational achievement and its association with family SES.

  8. Genetic link between family socioeconomic status and children's educational achievement estimated from genome-wide SNPs

    PubMed Central

    Krapohl, E; Plomin, R

    2016-01-01

    One of the best predictors of children's educational achievement is their family's socioeconomic status (SES), but the degree to which this association is genetically mediated remains unclear. For 3000 UK-representative unrelated children we found that genome-wide single-nucleotide polymorphisms could explain a third of the variance of scores on an age-16 UK national examination of educational achievement and half of the correlation between their scores and family SES. Moreover, genome-wide polygenic scores based on a previously published genome-wide association meta-analysis of total number of years in education accounted for ~3.0% variance in educational achievement and ~2.5% in family SES. This study provides the first molecular evidence for substantial genetic influence on differences in children's educational achievement and its association with family SES. PMID:25754083

  9. Genome-Wide Microsatellite Characterization and Marker Development in the Sequenced Brassica Crop Species

    PubMed Central

    Shi, Jiaqin; Huang, Shunmou; Zhan, Jiepeng; Yu, Jingyin; Wang, Xinfa; Hua, Wei; Liu, Shengyi; Liu, Guihua; Wang, Hanzhong

    2014-01-01

    Although much research has been conducted, the pattern of microsatellite distribution has remained ambiguous, and the development/utilization of microsatellite markers has still been limited/inefficient in Brassica, due to the lack of genome sequences. In view of this, we conducted genome-wide microsatellite characterization and marker development in three recently sequenced Brassica crops: Brassica rapa, Brassica oleracea and Brassica napus. The analysed microsatellite characteristics of these Brassica species were highly similar or almost identical, which suggests that the pattern of microsatellite distribution is likely conservative in Brassica. The genomic distribution of microsatellites was highly non-uniform and positively or negatively correlated with genes or transposable elements, respectively. Of the total of 115 869, 185 662 and 356 522 simple sequence repeat (SSR) markers developed with high frequencies (408.2, 343.8 and 356.2 per Mb or one every 2.45, 2.91 and 2.81 kb, respectively), most represented new SSR markers, the majority had determined physical positions, and a large number were genic or putative single-locus SSR markers. We also constructed a comprehensive database for the newly developed SSR markers, which was integrated with public Brassica SSR markers and annotated genome components. The genome-wide SSR markers developed in this study provide a useful tool to extend the annotated genome resources of sequenced Brassica species to genetic study/breeding in different Brassica species. PMID:24130371

  10. Genome-wide microsatellite characterization and marker development in the sequenced Brassica crop species.

    PubMed

    Shi, Jiaqin; Huang, Shunmou; Zhan, Jiepeng; Yu, Jingyin; Wang, Xinfa; Hua, Wei; Liu, Shengyi; Liu, Guihua; Wang, Hanzhong

    2014-02-01

    Although much research has been conducted, the pattern of microsatellite distribution has remained ambiguous, and the development/utilization of microsatellite markers has still been limited/inefficient in Brassica, due to the lack of genome sequences. In view of this, we conducted genome-wide microsatellite characterization and marker development in three recently sequenced Brassica crops: Brassica rapa, Brassica oleracea and Brassica napus. The analysed microsatellite characteristics of these Brassica species were highly similar or almost identical, which suggests that the pattern of microsatellite distribution is likely conservative in Brassica. The genomic distribution of microsatellites was highly non-uniform and positively or negatively correlated with genes or transposable elements, respectively. Of the total of 115 869, 185 662 and 356 522 simple sequence repeat (SSR) markers developed with high frequencies (408.2, 343.8 and 356.2 per Mb or one every 2.45, 2.91 and 2.81 kb, respectively), most represented new SSR markers, the majority had determined physical positions, and a large number were genic or putative single-locus SSR markers. We also constructed a comprehensive database for the newly developed SSR markers, which was integrated with public Brassica SSR markers and annotated genome components. The genome-wide SSR markers developed in this study provide a useful tool to extend the annotated genome resources of sequenced Brassica species to genetic study/breeding in different Brassica species.

  11. A Meta-Analysis of Genome-Wide Association Scans Identifies IL18RAP, PTPN2, TAGAP, and PUS10 As Shared Risk Loci for Crohn's Disease and Celiac Disease

    PubMed Central

    Boucher, Gabrielle; Beauchamp, Claudine; Trynka, Gosia; Dubois, Patrick C.; Lagacé, Caroline; Stokkers, Pieter C. F.; Hommes, Daan W.; Barisani, Donatella; Palmieri, Orazio; Annese, Vito; van Heel, David A.; Weersma, Rinse K.; Daly, Mark J.; Wijmenga, Cisca; Rioux, John D.

    2011-01-01

    Crohn's disease (CD) and celiac disease (CelD) are chronic intestinal inflammatory diseases, involving genetic and environmental factors in their pathogenesis. The two diseases can co-occur within families, and studies suggest that CelD patients have a higher risk to develop CD than the general population. These observations suggest that CD and CelD may share common genetic risk loci. Two such shared loci, IL18RAP and PTPN2, have already been identified independently in these two diseases. The aim of our study was to explicitly identify shared risk loci for these diseases by combining results from genome-wide association study (GWAS) datasets of CD and CelD. Specifically, GWAS results from CelD (768 cases, 1,422 controls) and CD (3,230 cases, 4,829 controls) were combined in a meta-analysis. Nine independent regions had nominal association p-value <1.0×10−5 in this meta-analysis and showed evidence of association to the individual diseases in the original scans (p-value <1×10−2 in CelD and <1×10−3 in CD). These include the two previously reported shared loci, IL18RAP and PTPN2, with p-values of 3.37×10−8 and 6.39×10−9, respectively, in the meta-analysis. The other seven had not been reported as shared loci and thus were tested in additional CelD (3,149 cases and 4,714 controls) and CD (1,835 cases and 1,669 controls) cohorts. Two of these loci, TAGAP and PUS10, showed significant evidence of replication (Bonferroni corrected p-values <0.0071) in the combined CelD and CD replication cohorts and were firmly established as shared risk loci of genome-wide significance, with overall combined p-values of 1.55×10−10 and 1.38×10−11 respectively. Through a meta-analysis of GWAS data from CD and CelD, we have identified four shared risk loci: PTPN2, IL18RAP, TAGAP, and PUS10. The combined analysis of the two datasets provided the power, lacking in the individual GWAS for single diseases, to detect shared loci with a relatively small effect. PMID:21298027

  12. Accounting for selection and correlation in the analysis of two-stage genome-wide association studies.

    PubMed

    Robertson, David S; Prevost, A Toby; Bowden, Jack

    2016-10-01

    The problem of selection bias has long been recognized in the analysis of two-stage trials, where promising candidates are selected in stage 1 for confirmatory analysis in stage 2. To efficiently correct for bias, uniformly minimum variance conditionally unbiased estimators (UMVCUEs) have been proposed for a wide variety of trial settings, but where the population parameter estimates are assumed to be independent. We relax this assumption and derive the UMVCUE in the multivariate normal setting with an arbitrary known covariance structure. One area of application is the estimation of odds ratios (ORs) when combining a genome-wide scan with a replication study. Our framework explicitly accounts for correlated single nucleotide polymorphisms, as might occur due to linkage disequilibrium. We illustrate our approach on the measurement of the association between 11 genetic variants and the risk of Crohn's disease, as reported in Parkes and others (2007. Sequence variants in the autophagy gene IRGM and multiple other replicating loci contribute to Crohn's disease susceptibility. Nat. Gen. 39: (7), 830-832.), and show that the estimated ORs can vary substantially if both selection and correlation are taken into account. © The Author 2016. Published by Oxford University Press.

  13. Genome-Wide Meta-Analysis of Longitudinal Alcohol Consumption Across Youth and Early Adulthood.

    PubMed

    Adkins, Daniel E; Clark, Shaunna L; Copeland, William E; Kennedy, Martin; Conway, Kevin; Angold, Adrian; Maes, Hermine; Liu, Youfang; Kumar, Gaurav; Erkanli, Alaattin; Patkar, Ashwin A; Silberg, Judy; Brown, Tyson H; Fergusson, David M; Horwood, L John; Eaves, Lindon; van den Oord, Edwin J C G; Sullivan, Patrick F; Costello, E J

    2015-08-01

    The public health burden of alcohol is unevenly distributed across the life course, with levels of use, abuse, and dependence increasing across adolescence and peaking in early adulthood. Here, we leverage this temporal patterning to search for common genetic variants predicting developmental trajectories of alcohol consumption. Comparable psychiatric evaluations measuring alcohol consumption were collected in three longitudinal community samples (N=2,126, obs=12,166). Consumption-repeated measurements spanning adolescence and early adulthood were analyzed using linear mixed models, estimating individual consumption trajectories, which were then tested for association with Illumina 660W-Quad genotype data (866,099 SNPs after imputation and QC). Association results were combined across samples using standard meta-analysis methods. Four meta-analysis associations satisfied our pre-determined genome-wide significance criterion (FDR<0.1) and six others met our 'suggestive' criterion (FDR<0.2). Genome-wide significant associations were highly biological plausible, including associations within GABA transporter 1, SLC6A1 (solute carrier family 6, member 1), and exonic hits in LOC100129340 (mitofusin-1-like). Pathway analyses elaborated single marker results, indicating significant enriched associations to intuitive biological mechanisms, including neurotransmission, xenobiotic pharmacodynamics, and nuclear hormone receptors (NHR). These findings underscore the value of combining longitudinal behavioral data and genome-wide genotype information in order to study developmental patterns and improve statistical power in genomic studies.

  14. Genome-wide Pleiotropy Between Parkinson Disease and Autoimmune Diseases.

    PubMed

    Witoelar, Aree; Jansen, Iris E; Wang, Yunpeng; Desikan, Rahul S; Gibbs, J Raphael; Blauwendraat, Cornelis; Thompson, Wesley K; Hernandez, Dena G; Djurovic, Srdjan; Schork, Andrew J; Bettella, Francesco; Ellinghaus, David; Franke, Andre; Lie, Benedicte A; McEvoy, Linda K; Karlsen, Tom H; Lesage, Suzanne; Morris, Huw R; Brice, Alexis; Wood, Nicholas W; Heutink, Peter; Hardy, John; Singleton, Andrew B; Dale, Anders M; Gasser, Thomas; Andreassen, Ole A; Sharma, Manu

    2017-07-01

    Recent genome-wide association studies (GWAS) and pathway analyses supported long-standing observations of an association between immune-mediated diseases and Parkinson disease (PD). The post-GWAS era provides an opportunity for cross-phenotype analyses between different complex phenotypes. To test the hypothesis that there are common genetic risk variants conveying risk of both PD and autoimmune diseases (ie, pleiotropy) and to identify new shared genetic variants and their pathways by applying a novel statistical framework in a genome-wide approach. Using the conjunction false discovery rate method, this study analyzed GWAS data from a selection of archetypal autoimmune diseases among 138 511 individuals of European ancestry and systemically investigated pleiotropy between PD and type 1 diabetes, Crohn disease, ulcerative colitis, rheumatoid arthritis, celiac disease, psoriasis, and multiple sclerosis. NeuroX data (6927 PD cases and 6108 controls) were used for replication. The study investigated the biological correlation between the top loci through protein-protein interaction and changes in the gene expression and methylation levels. The dates of the analysis were June 10, 2015, to March 4, 2017. The primary outcome was a list of novel loci and their pathways involved in PD and autoimmune diseases. Genome-wide conjunctional analysis identified 17 novel loci at false discovery rate less than 0.05 with overlap between PD and autoimmune diseases, including known PD loci adjacent to GAK, HLA-DRB5, LRRK2, and MAPT for rheumatoid arthritis, ulcerative colitis and Crohn disease. Replication confirmed the involvement of HLA, LRRK2, MAPT, TRIM10, and SETD1A in PD. Among the novel genes discovered, WNT3, KANSL1, CRHR1, BOLA2, and GUCY1A3 are within a protein-protein interaction network with known PD genes. A subset of novel loci was significantly associated with changes in methylation or expression levels of adjacent genes. The study findings provide novel mechanistic

  15. A Genome Wide Association Study Identifies Common Variants Associated with Lipid Levels in the Chinese Population

    PubMed Central

    Wu, Chen; Yang, Handong; Yu, Dianke; Yang, Xiaobo; Zhang, Xiaomin; Wang, Yiqin; Sun, Jielin; Gao, Yong; Tan, Aihua; He, Yunfeng; Zhang, Haiying; Qin, Xue; Zhu, Jingwen; Li, Huaixing; Lin, Xu; Zhu, Jiang; Min, Xinwen; Lang, Mingjian; Li, Dongfeng; Zhai, Kan; Chang, Jiang; Tan, Wen; Yuan, Jing; Chen, Weihong; Wang, Youjie; Wei, Sheng; Miao, Xiaoping; Wang, Feng; Fang, Weimin; Liang, Yuan; Deng, Qifei; Dai, Xiayun; Lin, Dafeng; Huang, Suli; Guo, Huan; Lilly Zheng, S.; Xu, Jianfeng; Lin, Dongxin; Hu, Frank B.; Wu, Tangchun

    2013-01-01

    Plasma lipid levels are important risk factors for cardiovascular disease and are influenced by genetic and environmental factors. Recent genome wide association studies (GWAS) have identified several lipid-associated loci, but these loci have been identified primarily in European populations. In order to identify genetic markers for lipid levels in a Chinese population and analyze the heterogeneity between Europeans and Asians, especially Chinese, we performed a meta-analysis of two genome wide association studies on four common lipid traits including total cholesterol (TC), triglycerides (TG), low-density lipoprotein cholesterol (LDL) and high-density lipoprotein cholesterol (HDL) in a Han Chinese population totaling 3,451 healthy subjects. Replication was performed in an additional 8,830 subjects of Han Chinese ethnicity. We replicated eight loci associated with lipid levels previously reported in a European population. The loci genome wide significantly associated with TC were near DOCK7, HMGCR and ABO; those genome wide significantly associated with TG were near APOA1/C3/A4/A5 and LPL; those genome wide significantly associated with LDL were near HMGCR, ABO and TOMM40; and those genome wide significantly associated with HDL were near LPL, LIPC and CETP. In addition, an additive genotype score of eight SNPs representing the eight loci that were found to be associated with lipid levels was associated with higher TC, TG and LDL levels (P = 5.52×10-16, 1.38×10-6 and 5.59×10-9, respectively). These findings suggest the cumulative effects of multiple genetic loci on plasma lipid levels. Comparisons with previous GWAS of lipids highlight heterogeneity in allele frequency and in effect size for some loci between Chinese and European populations. The results from our GWAS provided comprehensive and convincing evidence of the genetic determinants of plasma lipid levels in a Chinese population. PMID:24386095

  16. Natural CMT2 Variation Is Associated With Genome-Wide Methylation Changes and Temperature Seasonality

    PubMed Central

    Shen, Xia; De Jonge, Jennifer; Forsberg, Simon K. G.; Pettersson, Mats E.; Sheng, Zheya; Hennig, Lars; Carlborg, Örjan

    2014-01-01

    As Arabidopsis thaliana has colonized a wide range of habitats across the world it is an attractive model for studying the genetic mechanisms underlying environmental adaptation. Here, we used public data from two collections of A. thaliana accessions to associate genetic variability at individual loci with differences in climates at the sampling sites. We use a novel method to screen the genome for plastic alleles that tolerate a broader climate range than the major allele. This approach reduces confounding with population structure and increases power compared to standard genome-wide association methods. Sixteen novel loci were found, including an association between Chromomethylase 2 (CMT2) and temperature seasonality where the genome-wide CHH methylation was different for the group of accessions carrying the plastic allele. Cmt2 mutants were shown to be more tolerant to heat-stress, suggesting genetic regulation of epigenetic modifications as a likely mechanism underlying natural adaptation to variable temperatures, potentially through differential allelic plasticity to temperature-stress. PMID:25503602

  17. Genome-wide inference of regulatory networks in Streptomyces coelicolor.

    PubMed

    Castro-Melchor, Marlene; Charaniya, Salim; Karypis, George; Takano, Eriko; Hu, Wei-Shou

    2010-10-18

    The onset of antibiotics production in Streptomyces species is co-ordinated with differentiation events. An understanding of the genetic circuits that regulate these coupled biological phenomena is essential to discover and engineer the pharmacologically important natural products made by these species. The availability of genomic tools and access to a large warehouse of transcriptome data for the model organism, Streptomyces coelicolor, provides incentive to decipher the intricacies of the regulatory cascades and develop biologically meaningful hypotheses. In this study, more than 500 samples of genome-wide temporal transcriptome data, comprising wild-type and more than 25 regulatory gene mutants of Streptomyces coelicolor probed across multiple stress and medium conditions, were investigated. Information based on transcript and functional similarity was used to update a previously-predicted whole-genome operon map and further applied to predict transcriptional networks constituting modules enriched in diverse functions such as secondary metabolism, and sigma factor. The predicted network displays a scale-free architecture with a small-world property observed in many biological networks. The networks were further investigated to identify functionally-relevant modules that exhibit functional coherence and a consensus motif in the promoter elements indicative of DNA-binding elements. Despite the enormous experimental as well as computational challenges, a systems approach for integrating diverse genome-scale datasets to elucidate complex regulatory networks is beginning to emerge. We present an integrated analysis of transcriptome data and genomic features to refine a whole-genome operon map and to construct regulatory networks at the cistron level in Streptomyces coelicolor. The functionally-relevant modules identified in this study pose as potential targets for further studies and verification.

  18. Genome-Wide Association Analysis to Identify Loci for Milk Yield in Gyr Breed

    USDA-ARS?s Scientific Manuscript database

    A genome scan was conducted to identify QTL affecting milk yield in a Brazilian Gyr population of progeny test bulls (N=319). Data used in this study was derived from traditional genetic evaluation records computed by the Embrapa Dairy Cattleand released in May/2009 (http://www.cnpgl.embrapa.br/nova...

  19. Copy Number Variations in Tilapia Genomes.

    PubMed

    Li, Bi Jun; Li, Hong Lian; Meng, Zining; Zhang, Yong; Lin, Haoran; Yue, Gen Hua; Xia, Jun Hong

    2017-02-01

    Discovering the nature and pattern of genome variation is fundamental in understanding phenotypic diversity among populations. Although several millions of single nucleotide polymorphisms (SNPs) have been discovered in tilapia, the genome-wide characterization of larger structural variants, such as copy number variation (CNV) regions has not been carried out yet. We conducted a genome-wide scan for CNVs in 47 individuals from three tilapia populations. Based on 254 Gb of high-quality paired-end sequencing reads, we identified 4642 distinct high-confidence CNVs. These CNVs account for 1.9% (12.411 Mb) of the used Nile tilapia reference genome. A total of 1100 predicted CNVs were found overlapping with exon regions of protein genes. Further association analysis based on linear model regression found 85 CNVs ranging between 300 and 27,000 base pairs significantly associated to population types (R 2  > 0.9 and P > 0.001). Our study sheds first insights on genome-wide CNVs in tilapia. These CNVs among and within tilapia populations may have functional effects on phenotypes and specific adaptation to particular environments.

  20. Genome wide association analyses based on a multiple trait approach for modeling feed efficiency

    USDA-ARS?s Scientific Manuscript database

    Genome wide association (GWA) of feed efficiency (FE) could help target important genomic regions influencing FE. Data provided by an international dairy FE research consortium consisted of phenotypic records on dry matter intakes (DMI), milk energy (MILKE), and metabolic body weight (MBW) on 6,937 ...

  1. Genome-wide association study identifies three novel loci for type 2 diabetes.

    PubMed

    Hara, Kazuo; Fujita, Hayato; Johnson, Todd A; Yamauchi, Toshimasa; Yasuda, Kazuki; Horikoshi, Momoko; Peng, Chen; Hu, Cheng; Ma, Ronald C W; Imamura, Minako; Iwata, Minoru; Tsunoda, Tatsuhiko; Morizono, Takashi; Shojima, Nobuhiro; So, Wing Yee; Leung, Ting Fan; Kwan, Patrick; Zhang, Rong; Wang, Jie; Yu, Weihui; Maegawa, Hiroshi; Hirose, Hiroshi; Kaku, Kohei; Ito, Chikako; Watada, Hirotaka; Tanaka, Yasushi; Tobe, Kazuyuki; Kashiwagi, Atsunori; Kawamori, Ryuzo; Jia, Weiping; Chan, Juliana C N; Teo, Yik Ying; Shyong, Tai E; Kamatani, Naoyuki; Kubo, Michiaki; Maeda, Shiro; Kadowaki, Takashi

    2014-01-01

    Although over 60 loci for type 2 diabetes (T2D) have been identified, there still remains a large genetic component to be clarified. To explore unidentified loci for T2D, we performed a genome-wide association study (GWAS) of 6 209 637 single-nucleotide polymorphisms (SNPs), which were directly genotyped or imputed using East Asian references from the 1000 Genomes Project (June 2011 release) in 5976 Japanese patients with T2D and 20 829 nondiabetic individuals. Nineteen unreported loci were selected and taken forward to follow-up analyses. Combined discovery and follow-up analyses (30 392 cases and 34 814 controls) identified three new loci with genome-wide significance, which were MIR129-LEP [rs791595; risk allele = A; risk allele frequency (RAF) = 0.080; P = 2.55 × 10(-13); odds ratio (OR) = 1.17], GPSM1 [rs11787792; risk allele = A; RAF = 0.874; P = 1.74 × 10(-10); OR = 1.15] and SLC16A13 (rs312457; risk allele = G; RAF = 0.078; P = 7.69 × 10(-13); OR = 1.20). This study demonstrates that GWASs based on the imputation of genotypes using modern reference haplotypes such as that from the 1000 Genomes Project data can assist in identification of new loci for common diseases.

  2. Genome-Wide Identification and Characterization of WRKY Gene Family in Peanut.

    PubMed

    Song, Hui; Wang, Pengfei; Lin, Jer-Young; Zhao, Chuanzhi; Bi, Yuping; Wang, Xingjun

    2016-01-01

    WRKY, an important transcription factor family, is widely distributed in the plant kingdom. Many reports focused on analysis of phylogenetic relationship and biological function of WRKY protein at the whole genome level in different plant species. However, little is known about WRKY proteins in the genome of Arachis species and their response to salicylic acid (SA) and jasmonic acid (JA) treatment. In this study, we identified 77 and 75 WRKY proteins from the two wild ancestral diploid genomes of cultivated tetraploid peanut, Arachis duranensis and Arachis ipaënsis, using bioinformatics approaches. Most peanut WRKY coding genes were located on A. duranensis chromosome A6 and A. ipaënsis chromosome B3, while the least number of WRKY genes was found in chromosome 9. The WRKY orthologous gene pairs in A. duranensis and A. ipaënsis chromosomes were highly syntenic. Our analysis indicated that segmental duplication events played a major role in AdWRKY and AiWRKY genes, and strong purifying selection was observed in gene duplication pairs. Furthermore, we translate the knowledge gained from the genome-wide analysis result of wild ancestral peanut to cultivated peanut to reveal that gene activities of specific cultivated peanut WRKY gene were changed due to SA and JA treatment. Peanut WRKY7, 8 and 13 genes were down-regulated, whereas WRKY1 and 12 genes were up-regulated with SA and JA treatment. These results could provide valuable information for peanut improvement.

  3. Genome-Wide Identification and Characterization of WRKY Gene Family in Peanut

    PubMed Central

    Song, Hui; Wang, Pengfei; Lin, Jer-Young; Zhao, Chuanzhi; Bi, Yuping; Wang, Xingjun

    2016-01-01

    WRKY, an important transcription factor family, is widely distributed in the plant kingdom. Many reports focused on analysis of phylogenetic relationship and biological function of WRKY protein at the whole genome level in different plant species. However, little is known about WRKY proteins in the genome of Arachis species and their response to salicylic acid (SA) and jasmonic acid (JA) treatment. In this study, we identified 77 and 75 WRKY proteins from the two wild ancestral diploid genomes of cultivated tetraploid peanut, Arachis duranensis and Arachis ipaënsis, using bioinformatics approaches. Most peanut WRKY coding genes were located on A. duranensis chromosome A6 and A. ipaënsis chromosome B3, while the least number of WRKY genes was found in chromosome 9. The WRKY orthologous gene pairs in A. duranensis and A. ipaënsis chromosomes were highly syntenic. Our analysis indicated that segmental duplication events played a major role in AdWRKY and AiWRKY genes, and strong purifying selection was observed in gene duplication pairs. Furthermore, we translate the knowledge gained from the genome-wide analysis result of wild ancestral peanut to cultivated peanut to reveal that gene activities of specific cultivated peanut WRKY gene were changed due to SA and JA treatment. Peanut WRKY7, 8 and 13 genes were down-regulated, whereas WRKY1 and 12 genes were up-regulated with SA and JA treatment. These results could provide valuable information for peanut improvement. PMID:27200012

  4. Cell Context Dependent p53 Genome-Wide Binding Patterns and Enrichment at Repeats

    DOE PAGES

    Botcheva, Krassimira; McCorkle, Sean R.

    2014-11-21

    The p53 ability to elicit stress specific and cell type specific responses is well recognized, but how that specificity is established remains to be defined. Whether upon activation p53 binds to its genomic targets in a cell type and stress type dependent manner is still an open question. Here we show that the p53 binding to the human genome is selective and cell context-dependent. We mapped the genomic binding sites for the endogenous wild type p53 protein in the human cancer cell line HCT116 and compared them to those we previously determined in the normal cell line IMR90. We reportmore » distinct p53 genome-wide binding landscapes in two different cell lines, analyzed under the same treatment and experimental conditions, using the same ChIP-seq approach. This is evidence for cell context dependent p53 genomic binding. The observed differences affect the p53 binding sites distribution with respect to major genomic and epigenomic elements (promoter regions, CpG islands and repeats). We correlated the high-confidence p53 ChIP-seq peaks positions with the annotated human repeats (UCSC Human Genome Browser) and observed both common and cell line specific trends. In HCT116, the p53 binding was specifically enriched at LINE repeats, compared to IMR90 cells. The p53 genome-wide binding patterns in HCT116 and IMR90 likely reflect the different epigenetic landscapes in these two cell lines, resulting from cancer-associated changes (accumulated in HCT116) superimposed on tissue specific differences (HCT116 has epithelial, while IMR90 has mesenchymal origin). In conclusion, our data support the model for p53 binding to the human genome in a highly selective manner, mobilizing distinct sets of genes, contributing to distinct pathways.« less

  5. Whole-genome resequencing of 292 pigeonpea accessions identifies genomic regions associated with domestication and agronomic traits.

    PubMed

    Varshney, Rajeev K; Saxena, Rachit K; Upadhyaya, Hari D; Khan, Aamir W; Yu, Yue; Kim, Changhoon; Rathore, Abhishek; Kim, Dongseon; Kim, Jihun; An, Shaun; Kumar, Vinay; Anuradha, Ghanta; Yamini, Kalinati Narasimhan; Zhang, Wei; Muniswamy, Sonnappa; Kim, Jong-So; Penmetsa, R Varma; von Wettberg, Eric; Datta, Swapan K

    2017-07-01

    Pigeonpea (Cajanus cajan), a tropical grain legume with low input requirements, is expected to continue to have an important role in supplying food and nutritional security in developing countries in Asia, Africa and the tropical Americas. From whole-genome resequencing of 292 Cajanus accessions encompassing breeding lines, landraces and wild species, we characterize genome-wide variation. On the basis of a scan for selective sweeps, we find several genomic regions that were likely targets of domestication and breeding. Using genome-wide association analysis, we identify associations between several candidate genes and agronomically important traits. Candidate genes for these traits in pigeonpea have sequence similarity to genes functionally characterized in other plants for flowering time control, seed development and pod dehiscence. Our findings will allow acceleration of genetic gains for key traits to improve yield and sustainability in pigeonpea.

  6. Genome-wide analyses of LINE–LINE-mediated nonallelic homologous recombination

    PubMed Central

    Startek, Michał; Szafranski, Przemyslaw; Gambin, Tomasz; Campbell, Ian M.; Hixson, Patricia; Shaw, Chad A.; Stankiewicz, Paweł; Gambin, Anna

    2015-01-01

    Nonallelic homologous recombination (NAHR), occurring between low-copy repeats (LCRs) >10 kb in size and sharing >97% DNA sequence identity, is responsible for the majority of recurrent genomic rearrangements in the human genome. Recent studies have shown that transposable elements (TEs) can also mediate recurrent deletions and translocations, indicating the features of substrates that mediate NAHR may be significantly less stringent than previously believed. Using >4 kb length and >95% sequence identity criteria, we analyzed of the genome-wide distribution of long interspersed element (LINE) retrotransposon and their potential to mediate NAHR. We identified 17 005 directly oriented LINE pairs located <10 Mbp from each other as potential NAHR substrates, placing 82.8% of the human genome at risk of LINE–LINE-mediated instability. Cross-referencing these regions with CNVs in the Baylor College of Medicine clinical chromosomal microarray database of 36 285 patients, we identified 516 CNVs potentially mediated by LINEs. Using long-range PCR of five different genomic regions in a total of 44 patients, we confirmed that the CNV breakpoints in each patient map within the LINE elements. To additionally assess the scale of LINE–LINE/NAHR phenomenon in the human genome, we tested DNA samples from six healthy individuals on a custom aCGH microarray targeting LINE elements predicted to mediate CNVs and identified 25 LINE–LINE rearrangements. Our data indicate that LINE–LINE-mediated NAHR is widespread and under-recognized, and is an important mechanism of structural rearrangement contributing to human genomic variability. PMID:25613453

  7. Genome-wide divergence, haplotype distribution and population demographic histories for Gossypium hirsutum and Gossypium barbadense as revealed by genome-anchored SNPs

    PubMed Central

    Reddy, Umesh K.; Nimmakayala, Padma; Abburi, Venkata Lakshmi; Reddy, C. V. C. M.; Saminathan, Thangasamy; Percy, Richard G.; Yu, John Z.; Frelichowski, James; Udall, Joshua A.; Page, Justin T.; Zhang, Dong; Shehzad, Tariq; Paterson, Andrew H.

    2017-01-01

    Use of 10,129 singleton SNPs of known genomic location in tetraploid cotton provided unique opportunities to characterize genome-wide diversity among 440 Gossypium hirsutum and 219 G. barbadense cultivars and landrace accessions of widespread origin. Using the SNPs distributed genome-wide, we examined genetic diversity, haplotype distribution and linkage disequilibrium patterns in the G. hirsutum and G. barbadense genomes to clarify population demographic history. Diversity and identity-by-state analyses have revealed little sharing of alleles between the two cultivated allotetraploid genomes, with a few exceptions that indicated sporadic gene flow. We found a high number of new alleles, representing increased nucleotide diversity, on chromosomes 1 and 2 in cultivated G. hirsutum as compared with low nucleotide diversity on these chromosomes in landrace G. hirsutum. In contrast, G. barbadense chromosomes showed negative Tajima’s D on several chromosomes for both cultivated and landrace types, which indicate that speciation of G. barbadense itself, might have occurred with relatively narrow genetic diversity. The presence of conserved linkage disequilibrium (LD) blocks and haplotypes between G. hirsutum and G. barbadense provides strong evidence for comparable patterns of evolution in their domestication processes. Our study illustrates the potential use of population genetic techniques to identify genomic regions for domestication. PMID:28128280

  8. Genetics of cortisol secretion and depressive symptoms: a candidate gene and genome wide association approach.

    PubMed

    Velders, Fleur P; Kuningas, Maris; Kumari, Meena; Dekker, Marieke J; Uitterlinden, Andre G; Kirschbaum, Clemens; Hek, Karin; Hofman, Albert; Verhulst, Frank C; Kivimaki, Mika; Van Duijn, Cornelia M; Walker, Brian R; Tiemeier, Henning

    2011-08-01

    Depressive patients often have altered cortisol secretion, but few studies have investigated genetic variants in relation to both cortisol secretion and depression. To identify genes related to both these conditions, we: (1) tested the association of single nucleotide polymorphisms (SNPs) in hypothalamic-pituitary-adrenal-axis (HPA-axis) candidate genes with a summary measure of total cortisol secretion during the day (cortisol(AUC)), (2) performed a genome wide association study (GWAS) of cortisol(AUC), and (3) tested the association of identified cortisol-related SNPs with depressive symptoms. We analyzed data on candidate SNPs for the HPA-axis, genome-wide scans, cortisol secretion (n=1711) and depressive symptoms (the Centre for Epidemiology Studies Depression Scale, CES-D) (n=2928) in elderly persons of the Rotterdam Study. We used data from the Whitehall II study (n=2836) to replicate the GWAS findings. Of the 1456 SNPs in 33 candidate genes, minor alleles of 4 SNPs (rs9470080, rs9394309, rs7748266 and rs1360780) in the FKBP5 gene were associated with a decreased cortisol(AUC) (p<1×10(-4) after correction for multiple testing using permutations). These SNPs were also associated with an increased risk of depressive symptoms (rs9470080: OR 1.19 (95%CI 1.0; 1.4)). The GWAS for cortisol yielded 2 SNPs with p-values of 1×10(-06) (rs8062512, rs2252459), but these associations could not be replicated. These results suggest that variation in the FKBP5 gene is associated with both cortisol(AUC) and the likelihood of depressive symptoms. Copyright © 2011 Elsevier Ltd. All rights reserved.

  9. Meta-analysis of genome-wide linkage studies in BMI and obesity.

    PubMed

    Saunders, Catherine L; Chiodini, Benedetta D; Sham, Pak; Lewis, Cathryn M; Abkevich, Victor; Adeyemo, Adebowale A; de Andrade, Mariza; Arya, Rector; Berenson, Gerald S; Blangero, John; Boehnke, Michael; Borecki, Ingrid B; Chagnon, Yvon C; Chen, Wei; Comuzzie, Anthony G; Deng, Hong-Wen; Duggirala, Ravindranath; Feitosa, Mary F; Froguel, Philippe; Hanson, Robert L; Hebebrand, Johannes; Huezo-Dias, Patricia; Kissebah, Ahmed H; Li, Weidong; Luke, Amy; Martin, Lisa J; Nash, Matthew; Ohman, Miina; Palmer, Lyle J; Peltonen, Leena; Perola, Markus; Price, R Arlen; Redline, Susan; Srinivasan, Sathanur R; Stern, Michael P; Stone, Steven; Stringham, Heather; Turner, Stephen; Wijmenga, Cisca; Collier, David A

    2007-09-01

    The objective was to provide an overall assessment of genetic linkage data of BMI and BMI-defined obesity using a nonparametric genome scan meta-analysis. We identified 37 published studies containing data on over 31,000 individuals from more than >10,000 families and obtained genome-wide logarithm of the odds (LOD) scores, non-parametric linkage (NPL) scores, or maximum likelihood scores (MLS). BMI was analyzed in a pooled set of all studies, as a subgroup of 10 studies that used BMI-defined obesity, and for subgroups ascertained through type 2 diabetes, hypertension, or subjects of European ancestry. Bins at chromosome 13q13.2- q33.1, 12q23-q24.3 achieved suggestive evidence of linkage to BMI in the pooled analysis and samples ascertained for hypertension. Nominal evidence of linkage to these regions and suggestive evidence for 11q13.3-22.3 were also observed for BMI-defined obesity. The FTO obesity gene locus at 16q12.2 also showed nominal evidence for linkage. However, overall distribution of summed rank p values <0.05 is not different from that expected by chance. The strongest evidence was obtained in the families ascertained for hypertension at 9q31.1-qter and 12p11.21-q23 (p < 0.01). Despite having substantial statistical power, we did not unequivocally implicate specific loci for BMI or obesity. This may be because genes influencing adiposity are of very small effect, with substantial genetic heterogeneity and variable dependence on environmental factors. However, the observation that the FTO gene maps to one of the highest ranking bins for obesity is interesting and, while not a validation of this approach, indicates that other potential loci identified in this study should be investigated further.

  10. Genome-wide significant locus for Research Diagnostic Criteria Schizoaffective Disorder Bipolar type.

    PubMed

    Green, Elaine K; Di Florio, Arianna; Forty, Liz; Gordon-Smith, Katherine; Grozeva, Detelina; Fraser, Christine; Richards, Alexander L; Moran, Jennifer L; Purcell, Shaun; Sklar, Pamela; Kirov, George; Owen, Michael J; O'Donovan, Michael C; Craddock, Nick; Jones, Lisa; Jones, Ian R

    2017-12-01

    Studies have suggested that Research Diagnostic Criteria for Schizoaffective Disorder Bipolar type (RDC-SABP) might identify a more genetically homogenous subgroup of bipolar disorder. Aiming to identify loci associated with RDC-SABP, we have performed a replication study using independent RDC-SABP cases (n = 144) and controls (n = 6,559), focusing on the 10 loci that reached a p-value <10 -5 for RDC-SABP in the Wellcome Trust Case Control Consortium (WTCCC) bipolar disorder sample. Combining the WTCCC and replication datasets by meta-analysis (combined RDC-SABP, n = 423, controls, n = 9,494), we observed genome-wide significant association at one SNP, rs2352974, located within the intron of the gene TRAIP on chromosome 3p21.31 (p-value, 4.37 × 10 -8 ). This locus did not reach genome-wide significance in bipolar disorder or schizophrenia large Psychiatric Genomic Consortium datasets, suggesting that it may represent a relatively specific genetic risk for the bipolar subtype of schizoaffective disorder. © 2017 Wiley Periodicals, Inc.

  11. Genome-wide identification, functional and evolutionary analysis of terpene synthases in pineapple.

    PubMed

    Chen, Xiaoe; Yang, Wei; Zhang, Liqin; Wu, Xianmiao; Cheng, Tian; Li, Guanglin

    2017-10-01

    Terpene synthases (TPSs) are vital for the biosynthesis of active terpenoids, which have important physiological, ecological and medicinal value. Although terpenoids have been reported in pineapple (Ananas comosus), genome-wide investigations of the TPS genes responsible for pineapple terpenoid synthesis are still lacking. By integrating pineapple genome and proteome data, twenty-one putative terpene synthase genes were found in pineapple and divided into five subfamilies. Tandem duplication is the cause of TPS gene family duplication. Furthermore, functional differentiation between each TPS subfamily may have occurred for several reasons. Sixty-two key amino acid sites were identified as being type-II functionally divergence between TPS-a and TPS-c subfamily. Finally, coevolution analysis indicated that multiple amino acid residues are involved in coevolutionary processes. In addition, the enzyme activity of two TPSs were tested. This genome-wide identification, functional and evolutionary analysis of pineapple TPS genes provide a new insight into understanding the roles of TPS family and lay the basis for further characterizing the function and evolution of TPS gene family. Copyright © 2017 Elsevier Ltd. All rights reserved.

  12. A genome-wide survey of transgenerational genetic effects in autism.

    PubMed

    Tsang, Kathryn M; Croen, Lisa A; Torres, Anthony R; Kharrazi, Martin; Delorenze, Gerald N; Windham, Gayle C; Yoshida, Cathleen K; Zerbo, Ousseny; Weiss, Lauren A

    2013-01-01

    Effects of parental genotype or parent-offspring genetic interaction are well established in model organisms for a variety of traits. However, these transgenerational genetic models are rarely studied in humans. We have utilized an autism case-control study with 735 mother-child pairs to perform genome-wide screening for maternal genetic effects and maternal-offspring genetic interaction. We used simple models of single locus parent-child interaction and identified suggestive results (P<10(-4)) that cannot be explained by main effects, but no genome-wide significant signals. Some of these maternal and maternal-child associations were in or adjacent to autism candidate genes including: PCDH9, FOXP1, GABRB3, NRXN1, RELN, MACROD2, FHIT, RORA, CNTN4, CNTNAP2, FAM135B, LAMA1, NFIA, NLGN4X, RAPGEF4, and SDK1. We attempted validation of potential autism association under maternal-specific models using maternal-paternal comparison in family-based GWAS datasets. Our results suggest that further study of parental genetic effects and parent-child interaction in autism is warranted.

  13. HITS-CLIP yields genome-wide insights into brain alternative RNA processing

    NASA Astrophysics Data System (ADS)

    Licatalosi, Donny D.; Mele, Aldo; Fak, John J.; Ule, Jernej; Kayikci, Melis; Chi, Sung Wook; Clark, Tyson A.; Schweitzer, Anthony C.; Blume, John E.; Wang, Xuning; Darnell, Jennifer C.; Darnell, Robert B.

    2008-11-01

    Protein-RNA interactions have critical roles in all aspects of gene expression. However, applying biochemical methods to understand such interactions in living tissues has been challenging. Here we develop a genome-wide means of mapping protein-RNA binding sites in vivo, by high-throughput sequencing of RNA isolated by crosslinking immunoprecipitation (HITS-CLIP). HITS-CLIP analysis of the neuron-specific splicing factor Nova revealed extremely reproducible RNA-binding maps in multiple mouse brains. These maps provide genome-wide in vivo biochemical footprints confirming the previous prediction that the position of Nova binding determines the outcome of alternative splicing; moreover, they are sufficiently powerful to predict Nova action de novo. HITS-CLIP revealed a large number of Nova-RNA interactions in 3' untranslated regions, leading to the discovery that Nova regulates alternative polyadenylation in the brain. HITS-CLIP, therefore, provides a robust, unbiased means to identify functional protein-RNA interactions in vivo.

  14. The application of genome-wide 5-hydroxymethylcytosine studies in cancer research.

    PubMed

    Thomson, John P; Meehan, Richard R

    2017-01-01

    Early detection and characterization of molecular events associated with tumorgenesis remain high priorities. Genome-wide epigenetic assays are promising diagnostic tools, as aberrant epigenetic events are frequent and often cancer specific. The deposition and analysis of multiple patient-derived cancer epigenomic profiles contributes to our appreciation of the underlying biology; aiding the detection of novel identifiers for cancer subtypes. Modifying enzymes and co-factors regulating these epigenetic marks are frequently mutated in cancers, and as epigenetic modifications themselves are reversible, this makes their study very attractive with respect to pharmaceutical intervention. Here we focus on the novel modified base, 5-hydoxymethylcytosine, and discuss how genome-wide 5-hydoxymethylcytosine profiling expedites our molecular understanding of cancer, serves as a lineage tracer, classifies the mode of action of potentially carcinogenic agents and clarifies the roles of potential novel cancer drug targets; thus assisting the development of new diagnostic/prognostic tools.

  15. Genome-Wide Association Study Identifies Novel Loci Associated With Diisocyanate-Induced Occupational Asthma

    PubMed Central

    Yucesoy, Berran; Kaufman, Kenneth M.; Lummus, Zana L.; Weirauch, Matthew T.; Zhang, Ge; Cartier, André; Boulet, Louis-Philippe; Sastre, Joaquin; Quirce, Santiago; Tarlo, Susan M.; Cruz, Maria-Jesus; Munoz, Xavier; Harley, John B.; Bernstein, David I.

    2015-01-01

    Diisocyanates, reactive chemicals used to produce polyurethane products, are the most common causes of occupational asthma. The aim of this study is to identify susceptibility gene variants that could contribute to the pathogenesis of diisocyanate asthma (DA) using a Genome-Wide Association Study (GWAS) approach. Genome-wide single nucleotide polymorphism (SNP) genotyping was performed in 74 diisocyanate-exposed workers with DA and 824 healthy controls using Omni-2.5 and Omni-5 SNP microarrays. We identified 11 SNPs that exceeded genome-wide significance; the strongest association was for the rs12913832 SNP located on chromosome 15, which has been mapped to the HERC2 gene (p = 6.94 × 10−14). Strong associations were also found for SNPs near the ODZ3 and CDH17 genes on chromosomes 4 and 8 (rs908084, p = 8.59 × 10−9 and rs2514805, p = 1.22 × 10−8, respectively). We also prioritized 38 SNPs with suggestive genome-wide significance (p < 1 × 10−6). Among them, 17 SNPs map to the PITPNC1, ACMSD, ZBTB16, ODZ3, and CDH17 gene loci. Functional genomics data indicate that 2 of the suggestive SNPs (rs2446823 and rs2446824) are located within putative binding sites for the CCAAT/Enhancer Binding Protein (CEBP) and Hepatocyte Nuclear Factor 4, Alpha transcription factors (TFs), respectively. This study identified SNPs mapping to the HERC2, CDH17, and ODZ3 genes as potential susceptibility loci for DA. Pathway analysis indicated that these genes are associated with antigen processing and presentation, and other immune pathways. Overlap of 2 suggestive SNPs with likely TF binding sites suggests possible roles in disruption of gene regulation. These results provide new insights into the genetic architecture of DA and serve as a basis for future functional and mechanistic studies. PMID:25918132

  16. Signatures of positive selection in East African Shorthorn Zebu: a genome-wide SNP analysis

    USDA-ARS?s Scientific Manuscript database

    The small East African Shorthorn Zebu is the main indigenous cattle across East Africa. A recent genome wide SNPs analysis has revealed their ancient stable African taurine x Asian zebu admixture. Here, we assess the presence of candidate signature of positive selection in their genome, with the aim...

  17. Genome-wide association analysis of age-at-onset in Alzheimer’s disease

    PubMed Central

    Kamboh, M. Ilyas; Barmada, M. Michael; Demirci, F. Yesim; Minster, Ryan L.; Carrasquillo, Minerva M.; Pankratz, V. Shane; Younkin, Steven G.; Saykin, Andrew J.; Sweet, Robert A.; Feingold, Eleanor; DeKosky, Steven T.; Lopez, Oscar L.

    2011-01-01

    The risk of Alzheimer’s disease (AD) is strongly determined by genetic factors and recent genome-wide association studies (GWAS) have identified several genes for the disease risk. In addition to the disease risk, age-at-onset (AAO) of AD has also strong genetic component with an estimated heritability of 42%. Identification of AAO genes may help to understand the biological mechanisms that regulate the onset of the disease. Here we report the first GWAS focused on identifying genes for the AAO of AD. We performed a genome-wide meta analysis on 3 samples comprising a total of 2,222 AD cases. A total of ~2.5 million directly genotyped or imputed SNPs were analyzed in relation to AAO of AD. As expected, the most significant associations were observed in the APOE region on chromosome 19 where several SNPs surpassed the conservative genome-wide significant threshold (P<5E-08). The most significant SNP outside the APOE region was located in the DCHS2 gene on chromosome 4q31.3 (rs1466662; P=4.95E-07). There were 19 additional significant SNPs in this region at P<1E-04 and the DCHS2 gene is expressed in the cerebral cortex and thus is a potential candidate for affecting AAO in AD. These findings need to be confirmed in additional well-powered samples. PMID:22005931

  18. A Genome-Wide Association Study Identifies Genetic Variants Associated with Mathematics Ability

    PubMed Central

    Chen, Huan; Gu, Xiao-hong; Zhou, Yuxi; Ge, Zeng; Wang, Bin; Siok, Wai Ting; Wang, Guoqing; Huen, Michael; Jiang, Yuyang; Tan, Li-Hai; Sun, Yimin

    2017-01-01

    Mathematics ability is a complex cognitive trait with polygenic heritability. Genome-wide association study (GWAS) has been an effective approach to investigate genetic components underlying mathematic ability. Although previous studies reported several candidate genetic variants, none of them exceeded genome-wide significant threshold in general populations. Herein, we performed GWAS in Chinese elementary school students to identify potential genetic variants associated with mathematics ability. The discovery stage included 494 and 504 individuals from two independent cohorts respectively. The replication stage included another cohort of 599 individuals. In total, 28 of 81 candidate SNPs that met validation criteria were further replicated. Combined meta-analysis of three cohorts identified four SNPs (rs1012694, rs11743006, rs17778739 and rs17777541) of SPOCK1 gene showing association with mathematics ability (minimum p value 5.67 × 10−10, maximum β −2.43). The SPOCK1 gene is located on chromosome 5q31.2 and encodes a highly conserved glycoprotein testican-1 which was associated with tumor progression and prognosis as well as neurogenesis. This is the first study to report genome-wide significant association of individual SNPs with mathematics ability in general populations. Our preliminary results further supported the role of SPOCK1 during neurodevelopment. The genetic complexities underlying mathematics ability might contribute to explain the basis of human cognition and intelligence at genetic level. PMID:28155865

  19. A Genome-Wide Association Study Identifies Genetic Variants Associated with Mathematics Ability.

    PubMed

    Chen, Huan; Gu, Xiao-Hong; Zhou, Yuxi; Ge, Zeng; Wang, Bin; Siok, Wai Ting; Wang, Guoqing; Huen, Michael; Jiang, Yuyang; Tan, Li-Hai; Sun, Yimin

    2017-02-03

    Mathematics ability is a complex cognitive trait with polygenic heritability. Genome-wide association study (GWAS) has been an effective approach to investigate genetic components underlying mathematic ability. Although previous studies reported several candidate genetic variants, none of them exceeded genome-wide significant threshold in general populations. Herein, we performed GWAS in Chinese elementary school students to identify potential genetic variants associated with mathematics ability. The discovery stage included 494 and 504 individuals from two independent cohorts respectively. The replication stage included another cohort of 599 individuals. In total, 28 of 81 candidate SNPs that met validation criteria were further replicated. Combined meta-analysis of three cohorts identified four SNPs (rs1012694, rs11743006, rs17778739 and rs17777541) of SPOCK1 gene showing association with mathematics ability (minimum p value 5.67 × 10 -10 , maximum β -2.43). The SPOCK1 gene is located on chromosome 5q31.2 and encodes a highly conserved glycoprotein testican-1 which was associated with tumor progression and prognosis as well as neurogenesis. This is the first study to report genome-wide significant association of individual SNPs with mathematics ability in general populations. Our preliminary results further supported the role of SPOCK1 during neurodevelopment. The genetic complexities underlying mathematics ability might contribute to explain the basis of human cognition and intelligence at genetic level.

  20. Genome-Wide Identification of Regulatory Sequences Undergoing Accelerated Evolution in the Human Genome

    PubMed Central

    Dong, Xinran; Wang, Xiao; Zhang, Feng; Tian, Weidong

    2016-01-01

    Accelerated evolution of regulatory sequence can alter the expression pattern of target genes, and cause phenotypic changes. In this study, we used DNase I hypersensitive sites (DHSs) to annotate putative regulatory sequences in the human genome, and conducted a genome-wide analysis of the effects of accelerated evolution on regulatory sequences. Working under the assumption that local ancient repeat elements of DHSs are under neutral evolution, we discovered that ∼0.44% of DHSs are under accelerated evolution (ace-DHSs). We found that ace-DHSs tend to be more active than background DHSs, and are strongly associated with epigenetic marks of active transcription. The target genes of ace-DHSs are significantly enriched in neuron-related functions, and their expression levels are positively selected in the human brain. Thus, these lines of evidences strongly suggest that accelerated evolution on regulatory sequences plays important role in the evolution of human-specific phenotypes. PMID:27401230

  1. From conservation genetics to conservation genomics: a genome-wide assessment of blue whales (Balaenoptera musculus) in Australian feeding aggregations

    PubMed Central

    Sandoval-Castillo, Jonathan; Jenner, K. Curt S.; Gill, Peter C.; Jenner, Micheline-Nicole M.; Morrice, Margaret G.

    2018-01-01

    Genetic datasets of tens of markers have been superseded through next-generation sequencing technology with genome-wide datasets of thousands of markers. Genomic datasets improve our power to detect low population structure and identify adaptive divergence. The increased population-level knowledge can inform the conservation management of endangered species, such as the blue whale (Balaenoptera musculus). In Australia, there are two known feeding aggregations of the pygmy blue whale (B. m. brevicauda) which have shown no evidence of genetic structure based on a small dataset of 10 microsatellites and mtDNA. Here, we develop and implement a high-resolution dataset of 8294 genome-wide filtered single nucleotide polymorphisms, the first of its kind for blue whales. We use these data to assess whether the Australian feeding aggregations constitute one population and to test for the first time whether there is adaptive divergence between the feeding aggregations. We found no evidence of neutral population structure and negligible evidence of adaptive divergence. We propose that individuals likely travel widely between feeding areas and to breeding areas, which would require them to be adapted to a wide range of environmental conditions. This has important implications for their conservation as this blue whale population is likely vulnerable to a range of anthropogenic threats both off Australia and elsewhere. PMID:29410806

  2. Genotypic variants at 2q33 and risk of esophageal squamous cell carcinoma in China: a meta-analysis of genome-wide association studies.

    PubMed

    Abnet, Christian C; Wang, Zhaoming; Song, Xin; Hu, Nan; Zhou, Fu-You; Freedman, Neal D; Li, Xue-Min; Yu, Kai; Shu, Xiao-Ou; Yuan, Jian-Min; Zheng, Wei; Dawsey, Sanford M; Liao, Linda M; Lee, Maxwell P; Ding, Ti; Qiao, You-Lin; Gao, Yu-Tang; Koh, Woon-Puay; Xiang, Yong-Bing; Tang, Ze-Zhong; Fan, Jin-Hu; Chung, Charles C; Wang, Chaoyu; Wheeler, William; Yeager, Meredith; Yuenger, Jeff; Hutchinson, Amy; Jacobs, Kevin B; Giffen, Carol A; Burdett, Laurie; Fraumeni, Joseph F; Tucker, Margaret A; Chow, Wong-Ho; Zhao, Xue-Ke; Li, Jiang-Man; Li, Ai-Li; Sun, Liang-Dan; Wei, Wu; Li, Ji-Lin; Zhang, Peng; Li, Hong-Lei; Cui, Wen-Yan; Wang, Wei-Peng; Liu, Zhi-Cai; Yang, Xia; Fu, Wen-Jing; Cui, Ji-Li; Lin, Hong-Li; Zhu, Wen-Liang; Liu, Min; Chen, Xi; Chen, Jie; Guo, Li; Han, Jing-Jing; Zhou, Sheng-Li; Huang, Jia; Wu, Yue; Yuan, Chao; Huang, Jing; Ji, Ai-Fang; Kul, Jian-Wei; Fan, Zhong-Min; Wang, Jian-Po; Zhang, Dong-Yun; Zhang, Lian-Qun; Zhang, Wei; Chen, Yuan-Fang; Ren, Jing-Li; Li, Xiu-Min; Dong, Jin-Cheng; Xing, Guo-Lan; Guo, Zhi-Gang; Yang, Jian-Xue; Mao, Yi-Ming; Yuan, Yuan; Guo, Er-Tao; Zhang, Wei; Hou, Zhi-Chao; Liu, Jing; Li, Yan; Tang, Sa; Chang, Jia; Peng, Xiu-Qin; Han, Min; Yin, Wan-Li; Liu, Ya-Li; Hu, Yan-Long; Liu, Yu; Yang, Liu-Qin; Zhu, Fu-Guo; Yang, Xiu-Feng; Feng, Xiao-Shan; Wang, Zhou; Li, Yin; Gao, She-Gan; Liu, Hai-Lin; Yuan, Ling; Jin, Yan; Zhang, Yan-Rui; Sheyhidin, Ilyar; Li, Feng; Chen, Bao-Ping; Ren, Shu-Wei; Liu, Bin; Li, Dan; Zhang, Gao-Fu; Yue, Wen-Bin; Feng, Chang-Wei; Qige, Qirenwang; Zhao, Jian-Ting; Yang, Wen-Jun; Lei, Guang-Yan; Chen, Long-Qi; Li, En-Min; Xu, Li-Yan; Wu, Zhi-Yong; Bao, Zhi-Qin; Chen, Ji-Li; Li, Xian-Chang; Zhuang, Xiang; Zhou, Ying-Fa; Zuo, Xian-Bo; Dong, Zi-Ming; Wang, Lu-Wen; Fan, Xue-Pin; Wang, Jin; Zhou, Qi; Ma, Guo-Shun; Zhang, Qin-Xian; Liu, Hai; Jian, Xin-Ying; Lian, Sin-Yong; Wang, Jin-Sheng; Chang, Fu-Bao; Lu, Chang-Dong; Miao, Jian-Jun; Chen, Zhi-Guo; Wang, Ran; Guo, Ming; Fan, Zeng-Lin; Tao, Ping; Liu, Tai-Jing; Wei, Jin-Chang; Kong, Qing-Peng; Fan, Lei; Wang, Xian-Zeng; Gao, Fu-Sheng; Wang, Tian-Yun; Xie, Dong; Wang, Li; Chen, Shu-Qing; Yang, Wan-Cai; Hong, Jun-Yan; Wang, Liang; Qiu, Song-Liang; Goldstein, Alisa M; Yuan, Zhi-Qing; Chanock, Stephen J; Zhang, Xue-Jun; Taylor, Philip R; Wang, Li-Dong

    2012-05-01

    Genome-wide association studies have identified susceptibility loci for esophageal squamous cell carcinoma (ESCC). We conducted a meta-analysis of all single-nucleotide polymorphisms (SNPs) that showed nominally significant P-values in two previously published genome-wide scans that included a total of 2961 ESCC cases and 3400 controls. The meta-analysis revealed five SNPs at 2q33 with P< 5 × 10(-8), and the strongest signal was rs13016963, with a combined odds ratio (95% confidence interval) of 1.29 (1.19-1.40) and P= 7.63 × 10(-10). An imputation analysis of 4304 SNPs at 2q33 suggested a single association signal, and the strongest imputed SNP associations were similar to those from the genotyped SNPs. We conducted an ancestral recombination graph analysis with 53 SNPs to identify one or more haplotypes that harbor the variants directly responsible for the detected association signal. This showed that the five SNPs exist in a single haplotype along with 45 imputed SNPs in strong linkage disequilibrium, and the strongest candidate was rs10201587, one of the genotyped SNPs. Our meta-analysis found genome-wide significant SNPs at 2q33 that map to the CASP8/ALS2CR12/TRAK2 gene region. Variants in CASP8 have been extensively studied across a spectrum of cancers with mixed results. The locus we identified appears to be distinct from the widely studied rs3834129 and rs1045485 SNPs in CASP8. Future studies of esophageal and other cancers should focus on comprehensive sequencing of this 2q33 locus and functional analysis of rs13016963 and rs10201587 and other strongly correlated variants.

  3. A mega-analysis of genome-wide association studies for major depressive disorder.

    PubMed

    Ripke, Stephan; Wray, Naomi R; Lewis, Cathryn M; Hamilton, Steven P; Weissman, Myrna M; Breen, Gerome; Byrne, Enda M; Blackwood, Douglas H R; Boomsma, Dorret I; Cichon, Sven; Heath, Andrew C; Holsboer, Florian; Lucae, Susanne; Madden, Pamela A F; Martin, Nicholas G; McGuffin, Peter; Muglia, Pierandrea; Noethen, Markus M; Penninx, Brenda P; Pergadia, Michele L; Potash, James B; Rietschel, Marcella; Lin, Danyu; Müller-Myhsok, Bertram; Shi, Jianxin; Steinberg, Stacy; Grabe, Hans J; Lichtenstein, Paul; Magnusson, Patrik; Perlis, Roy H; Preisig, Martin; Smoller, Jordan W; Stefansson, Kari; Uher, Rudolf; Kutalik, Zoltan; Tansey, Katherine E; Teumer, Alexander; Viktorin, Alexander; Barnes, Michael R; Bettecken, Thomas; Binder, Elisabeth B; Breuer, René; Castro, Victor M; Churchill, Susanne E; Coryell, William H; Craddock, Nick; Craig, Ian W; Czamara, Darina; De Geus, Eco J; Degenhardt, Franziska; Farmer, Anne E; Fava, Maurizio; Frank, Josef; Gainer, Vivian S; Gallagher, Patience J; Gordon, Scott D; Goryachev, Sergey; Gross, Magdalena; Guipponi, Michel; Henders, Anjali K; Herms, Stefan; Hickie, Ian B; Hoefels, Susanne; Hoogendijk, Witte; Hottenga, Jouke Jan; Iosifescu, Dan V; Ising, Marcus; Jones, Ian; Jones, Lisa; Jung-Ying, Tzeng; Knowles, James A; Kohane, Isaac S; Kohli, Martin A; Korszun, Ania; Landen, Mikael; Lawson, William B; Lewis, Glyn; Macintyre, Donald; Maier, Wolfgang; Mattheisen, Manuel; McGrath, Patrick J; McIntosh, Andrew; McLean, Alan; Middeldorp, Christel M; Middleton, Lefkos; Montgomery, Grant M; Murphy, Shawn N; Nauck, Matthias; Nolen, Willem A; Nyholt, Dale R; O'Donovan, Michael; Oskarsson, Högni; Pedersen, Nancy; Scheftner, William A; Schulz, Andrea; Schulze, Thomas G; Shyn, Stanley I; Sigurdsson, Engilbert; Slager, Susan L; Smit, Johannes H; Stefansson, Hreinn; Steffens, Michael; Thorgeirsson, Thorgeir; Tozzi, Federica; Treutlein, Jens; Uhr, Manfred; van den Oord, Edwin J C G; Van Grootheest, Gerard; Völzke, Henry; Weilburg, Jeffrey B; Willemsen, Gonneke; Zitman, Frans G; Neale, Benjamin; Daly, Mark; Levinson, Douglas F; Sullivan, Patrick F

    2013-04-01

    Prior genome-wide association studies (GWAS) of major depressive disorder (MDD) have met with limited success. We sought to increase statistical power to detect disease loci by conducting a GWAS mega-analysis for MDD. In the MDD discovery phase, we analyzed more than 1.2 million autosomal and X chromosome single-nucleotide polymorphisms (SNPs) in 18 759 independent and unrelated subjects of recent European ancestry (9240 MDD cases and 9519 controls). In the MDD replication phase, we evaluated 554 SNPs in independent samples (6783 MDD cases and 50 695 controls). We also conducted a cross-disorder meta-analysis using 819 autosomal SNPs with P<0.0001 for either MDD or the Psychiatric GWAS Consortium bipolar disorder (BIP) mega-analysis (9238 MDD cases/8039 controls and 6998 BIP cases/7775 controls). No SNPs achieved genome-wide significance in the MDD discovery phase, the MDD replication phase or in pre-planned secondary analyses (by sex, recurrent MDD, recurrent early-onset MDD, age of onset, pre-pubertal onset MDD or typical-like MDD from a latent class analyses of the MDD criteria). In the MDD-bipolar cross-disorder analysis, 15 SNPs exceeded genome-wide significance (P<5 × 10(-8)), and all were in a 248 kb interval of high LD on 3p21.1 (chr3:52 425 083-53 822 102, minimum P=5.9 × 10(-9) at rs2535629). Although this is the largest genome-wide analysis of MDD yet conducted, its high prevalence means that the sample is still underpowered to detect genetic effects typical for complex traits. Therefore, we were unable to identify robust and replicable findings. We discuss what this means for genetic research for MDD. The 3p21.1 MDD-BIP finding should be interpreted with caution as the most significant SNP did not replicate in MDD samples, and genotyping in independent samples will be needed to resolve its status.

  4. Genome-Wide Association Analysis of Blood Biomarkers in Chronic Obstructive Pulmonary Disease

    PubMed Central

    Kim, Deog Kyeom; Cho, Michael H.; Hersh, Craig P.; Lomas, David A.; Miller, Bruce E.; Kong, Xiangyang; Bakke, Per; Gulsvik, Amund; Agustí, Alvar; Wouters, Emiel; Celli, Bartolome; Coxson, Harvey; Vestbo, Jørgen; MacNee, William; Yates, Julie C.; Rennard, Stephen; Litonjua, Augusto; Qiu, Weiliang; Beaty, Terri H.; Crapo, James D.; Riley, John H.; Tal-Singer, Ruth

    2012-01-01

    Rationale: A genome-wide association study (GWAS) for circulating chronic obstructive pulmonary disease (COPD) biomarkers could identify genetic determinants of biomarker levels and COPD susceptibility. Objectives: To identify genetic variants of circulating protein biomarkers and novel genetic determinants of COPD. Methods: GWAS was performed for two pneumoproteins, Clara cell secretory protein (CC16) and surfactant protein D (SP-D), and five systemic inflammatory markers (C-reactive protein, fibrinogen, IL-6, IL-8, and tumor necrosis factor-α) in 1,951 subjects with COPD. For genome-wide significant single nucleotide polymorphisms (SNPs) (P < 1 × 10−8), association with COPD susceptibility was tested in 2,939 cases with COPD and 1,380 smoking control subjects. The association of candidate SNPs with mRNA expression in induced sputum was also elucidated. Measurements and Main Results: Genome-wide significant susceptibility loci affecting biomarker levels were found only for the two pneumoproteins. Two discrete loci affecting CC16, one region near the CC16 coding gene (SCGB1A1) on chromosome 11 and another locus approximately 25 Mb away from SCGB1A1, were identified, whereas multiple SNPs on chromosomes 6 and 16, in addition to SNPs near SFTPD, had genome-wide significant associations with SP-D levels. Several SNPs affecting circulating CC16 levels were significantly associated with sputum mRNA expression of SCGB1A1 (P = 0.009–0.03). Several SNPs highly associated with CC16 or SP-D levels were nominally associated with COPD in a collaborative GWAS (P = 0.001–0.049), although these COPD associations were not replicated in two additional cohorts. Conclusions: Distant genetic loci and biomarker-coding genes affect circulating levels of COPD-related pneumoproteins. A subset of these protein quantitative trait loci may influence their gene expression in the lung and/or COPD susceptibility. Clinical trial registered with www.clinicaltrials.gov (NCT 00292552). PMID

  5. Genome-Wide Motif Statistics are Shaped by DNA Binding Proteins over Evolutionary Time Scales

    NASA Astrophysics Data System (ADS)

    Qian, Long; Kussell, Edo

    The composition of genomes with respect to short DNA motifs impacts the ability of DNA binding proteins to locate and bind their target sites. Since nonfunctional DNA binding can be detrimental to cellular functions and ultimately to organismal fitness, organisms could benefit from reducing the number of nonfunctional binding sites genome wide. Using in vitro measurements of binding affinities for a large collection of DNA binding proteins, in multiple species, we detect a significant global avoidance of weak binding sites in genomes. The underlying evolutionary process leaves a distinct genomic hallmark in that similar words have correlated frequencies, which we detect in all species across domains of life. We hypothesize that natural selection against weak binding sites contributes to this process, and using an evolutionary model we show that the strength of selection needed to maintain global word compositions is on the order of point mutation rates. Alternative contributions may come from interference of protein-DNA binding with replication and mutational repair processes, which operates with similar rates. We conclude that genome-wide word compositions have been molded by DNA binding proteins through tiny evolutionary steps over timescales spanning millions of generations.

  6. Systematic, genome-wide, sex-specific linkage of cardiovascular traits in French Canadians.

    PubMed

    Seda, Ondrej; Tremblay, Johanne; Gaudet, Daniel; Brunelle, Pierre-Luc; Gurau, Alexandru; Merlo, Ettore; Pilote, Louise; Orlov, Sergei N; Boulva, Francis; Petrovich, Milan; Kotchen, Theodore A; Cowley, Allen W; Hamet, Pavel

    2008-04-01

    The sexual dimorphism of cardiovascular traits, as well as susceptibility to a variety of related diseases, has long been recognized, yet their sex-specific genomic determinants are largely unknown. We systematically assessed the sex-specific heritability and linkage of 539 hemodynamic, metabolic, anthropometric, and humoral traits in 120 French-Canadian families from the Saguenay-Lac-St-Jean region of Quebec, Canada. We performed multipoint linkage analysis using microsatellite markers followed by peak-wide linkage scan based on Affymetrix Human Mapping 50K Array Xba240 single nucleotide polymorphism genotypes in 3 settings, including the entire sample and then separately in men and women. Nearly one half of the traits were age and sex independent, one quarter were both age and sex dependent, and one eighth were exclusively age or sex dependent. Sex-specific phenotypes are most frequent in heart rate and blood pressure categories, whereas sex- and age-independent determinants are predominant among humoral and biochemical parameters. Twenty sex-specific loci passing multiple testing criteria were corroborated by 2-point single nucleotide polymorphism linkage. Several resting systolic blood pressure measurements showed significant genotype-by-sex interaction, eg, male-specific locus at chromosome 12 (male-female logarithm of odds difference: 4.16; interaction P=0.0002), which was undetectable in the entire population, even after adjustment for sex. Detailed interrogation of this locus revealed a 220-kb block overlapping parts of TAO-kinase 3 and SUDS3 genes. In summary, a large number of complex cardiovascular traits display significant sexual dimorphism, for which we have demonstrated genomic determinants at the haplotype level. Many of these would have been missed in a traditional, sex-adjusted setting.

  7. Genome-wide scans between two honeybee populations reveal putative signatures of human-mediated selection.

    PubMed

    Parejo, M; Wragg, D; Henriques, D; Vignal, A; Neuditschko, M

    2017-12-01

    Human-mediated selection has left signatures in the genomes of many domesticated animals, including the European dark honeybee, Apis mellifera mellifera, which has been selected by apiculturists for centuries. Using whole-genome sequence information, we investigated selection signatures in spatially separated honeybee subpopulations (Switzerland, n = 39 and France, n = 17). Three different test statistics were calculated in windows of 2 kb (fixation index, cross-population extended haplotype homozygosity and cross-population composite likelihood ratio) and combined into a recently developed composite selection score. Applying a stringent false discovery rate of 0.01, we identified six significant selective sweeps distributed across five chromosomes covering eight genes. These genes are associated with multiple molecular and biological functions, including regulation of transcription, receptor binding and signal transduction. Of particular interest is a selection signature on chromosome 1, which corresponds to the WNT4 gene, the family of which is conserved across the animal kingdom with a variety of functions. In Drosophila melanogaster, WNT4 alleles have been associated with differential wing, cross vein and abdominal phenotypes. Defining phenotypic characteristics of different Apis mellifera ssp., which are typically used as selection criteria, include colour and wing venation pattern. This signal is therefore likely to be a good candidate for human mediated-selection arising from different applied breeding practices in the two managed populations. © 2017 The Authors. Animal Genetics published by John Wiley & Sons Ltd on behalf of Stichting International Foundation for Animal Genetics.

  8. A survey of genome-wide single nucleotide polymorphisms through genome resequencing in the Périgord black truffle (Tuber melanosporum Vittad.).

    PubMed

    Payen, Thibaut; Murat, Claude; Gigant, Anaïs; Morin, Emmanuelle; De Mita, Stéphane; Martin, Francis

    2015-09-01

    The Périgord black truffle (Tuber melanosporum Vittad.), considered a gastronomic delicacy worldwide, is an ectomycorrhizal filamentous fungus that is ecologically important in Mediterranean French, Italian and Spanish woodlands. In this study, we developed a novel resource of single nucleotide polymorphisms (SNPs) for T. melanosporum using Illumina high-throughput resequencing. The genome from six T. melanosporum geographical accessions was sequenced to a depth of approximately 20×. These geographical accessions were selected from different populations within the northern and southern regions of the geographical species distribution. Approximately 80% of the reads for each of the six resequenced geographical accessions mapped against the reference T. melanosporum genome assembly, estimating the core genome size of this organism to be approximately 110 Mbp. A total of 442 326 SNPs corresponding to 3540 SNPs/Mbps were identified as being included in all seven genomes. The SNPs occurred more frequently in repeated sequences (85%), although 4501 SNPs were also identified in the coding regions of 2587 genes. Using the ratio of nonsynonymous mutations per nonsynonymous site (pN) to synonymous mutations per synonymous site (pS) and Tajima's D index scanning the whole genome, we were able to identify genomic regions and genes potentially subjected to positive or purifying selection. The SNPs identified represent a valuable resource for future population genetics and genomics studies. © 2015 John Wiley & Sons Ltd.

  9. Genome-wide association analysis of milk yield traits in Nordic Red Cattle using imputed whole genome sequence variants.

    PubMed

    Iso-Touru, T; Sahana, G; Guldbrandtsen, B; Lund, M S; Vilkki, J

    2016-03-22

    The Nordic Red Cattle consisting of three different populations from Finland, Sweden and Denmark are under a joint breeding value estimation system. The long history of recording of production and health traits offers a great opportunity to study production traits and identify causal variants behind them. In this study, we used whole genome sequence level data from 4280 progeny tested Nordic Red Cattle bulls to scan the genome for loci affecting milk, fat and protein yields. Using a genome-wise significance threshold, regions on Bos taurus chromosomes 5, 14, 23, 25 and 26 were associated with fat yield. Regions on chromosomes 5, 14, 16, 19, 20 and 25 were associated with milk yield and chromosomes 5, 14 and 25 had regions associated with protein yield. Significantly associated variations were found in 227 genes for fat yield, 72 genes for milk yield and 30 genes for protein yield. Ingenuity Pathway Analysis was used to identify networks connecting these genes displaying significant hits. When compared to previously mapped genomic regions associated with fertility, significantly associated variations were found in 5 genes common for fat yield and fertility, thus linking these two traits via biological networks. This is the first time when whole genome sequence data is utilized to study genomic regions affecting milk production in the Nordic Red Cattle population. Sequence level data offers the possibility to study quantitative traits in detail but still cannot unambiguously reveal which of the associated variations is causative. Linkage disequilibrium creates difficulties to pinpoint the causative genes and variations. One solution to overcome these difficulties is the identification of the functional gene networks and pathways to reveal important interacting genes as candidates for the observed effects. This information on target genomic regions may be exploited to improve genomic prediction.

  10. Genome-wide detection of selection signatures in Chinese indigenous Laiwu pigs revealed candidate genes regulating fat deposition in muscle.

    PubMed

    Chen, Minhui; Wang, Jiying; Wang, Yanping; Wu, Ying; Fu, Jinluan; Liu, Jian-Feng

    2018-05-18

    Currently, genome-wide scans for positive selection signatures in commercial breed have been investigated. However, few studies have focused on selection footprints of indigenous breeds. Laiwu pig is an invaluable Chinese indigenous pig breed with extremely high proportion of intramuscular fat (IMF), and an excellent model to detect footprint as the result of natural and artificial selection for fat deposition in muscle. In this study, based on GeneSeek Genomic profiler Porcine HD data, three complementary methods, F ST , iHS (integrated haplotype homozygosity score) and CLR (composite likelihood ratio), were implemented to detect selection signatures in the whole genome of Laiwu pigs. Totally, 175 candidate selected regions were obtained by at least two of the three methods, which covered 43.75 Mb genomic regions and corresponded to 1.79% of the genome sequence. Gene annotation of the selected regions revealed a list of functionally important genes for feed intake and fat deposition, reproduction, and immune response. Especially, in accordance to the phenotypic features of Laiwu pigs, among the candidate genes, we identified several genes, NPY1R, NPY5R, PIK3R1 and JAKMIP1, involved in the actions of two sets of neurons, which are central regulators in maintaining the balance between food intake and energy expenditure. Our results identified a number of regions showing signatures of selection, as well as a list of functionally candidate genes with potential effect on phenotypic traits, especially fat deposition in muscle. Our findings provide insights into the mechanisms of artificial selection of fat deposition and further facilitate follow-up functional studies.

  11. Genome-wide single nucleotide polymorphisms (SNPs) for a model invasive ascidian Botryllus schlosseri.

    PubMed

    Gao, Yangchun; Li, Shiguo; Zhan, Aibin

    2018-04-01

    Invasive species cause huge damages to ecology, environment and economy globally. The comprehensive understanding of invasion mechanisms, particularly genetic bases of micro-evolutionary processes responsible for invasion success, is essential for reducing potential damages caused by invasive species. The golden star tunicate, Botryllus schlosseri, has become a model species in invasion biology, mainly owing to its high invasiveness nature and small well-sequenced genome. However, the genome-wide genetic markers have not been well developed in this highly invasive species, thus limiting the comprehensive understanding of genetic mechanisms of invasion success. Using restriction site-associated DNA (RAD) tag sequencing, here we developed a high-quality resource of 14,119 out of 158,821 SNPs for B. schlosseri. These SNPs were relatively evenly distributed at each chromosome. SNP annotations showed that the majority of SNPs (63.20%) were located at intergenic regions, and 21.51% and 14.58% were located at introns and exons, respectively. In addition, the potential use of the developed SNPs for population genomics studies was primarily assessed, such as the estimate of observed heterozygosity (H O ), expected heterozygosity (H E ), nucleotide diversity (π), Wright's inbreeding coefficient (F IS ) and effective population size (Ne). Our developed SNP resource would provide future studies the genome-wide genetic markers for genetic and genomic investigations, such as genetic bases of micro-evolutionary processes responsible for invasion success.

  12. Genome-wide association analysis of symbiotic nitrogen fixation in common bean

    USDA-ARS?s Scientific Manuscript database

    A genome-wide association study (GWAS) was conducted to explore the genetic basis of variation for symbiotic nitrogen fixation (SNF) and related traits in the Andean diversity panel (ADP) comprised of 259 common bean (Phaseolus vulgaris) genotypes. The ADP was evaluated for SNF and related traits in...

  13. Genome-wide analyses for personality traits identify six genomic loci and show correlations with psychiatric disorders

    PubMed Central

    Lo, Min-Tzu; Hinds, David A.; Tung, Joyce Y.; Franz, Carol; Fan, Chun-Chieh; Wang, Yunpeng; Smeland, Olav B.; Schork, Andrew; Holland, Dominic; Kauppi, Karolina; Sanyal, Nilotpal; Escott-Price, Valentina; Smith, Daniel J.; O'Donovan, Michael; Stefansson, Hreinn; Bjornsdottir, Gyda; Thorgeirsson, Thorgeir E.; Stefansson, Kari; McEvoy, Linda K.; Dale, Anders M.; Andreassen, Ole A.; Chen, Chi-Hua

    2017-01-01

    Summary Personality is influenced by genetic and environmental factors1, and associated with mental health. However, the underlying genetic determinants are largely unknown. We identified six genetic loci, including five novel loci2,3, significantly associated with personality traits in a meta-analysis of genome-wide association studies (N=123,132–260,861). Of these genome-wide significant loci, extraversion was associated with variants in WSCD2 and near PCDH15, and neuroticism with variants on chromosome 8p23.1 and in L3MBTL2. We performed a principal component analysis to extract major dimensions underlying genetic variations among five personality traits and six psychiatric disorders (N=5,422–18,759). The first genetic dimension separated personality traits and psychiatric disorders, except that neuroticism and openness to experience were clustered with the disorders. High genetic correlations were found between extraversion and attention-deficit/hyperactivity disorder (ADHD), and between openness and schizophrenia/bipolar disorder. The second genetic dimension was closely aligned with extraversion-introversion and grouped neuroticism with internalizing psychopathology (e.g., depression/anxiety). PMID:27918536

  14. Genome-wide analyses for personality traits identify six genomic loci and show correlations with psychiatric disorders.

    PubMed

    Lo, Min-Tzu; Hinds, David A; Tung, Joyce Y; Franz, Carol; Fan, Chun-Chieh; Wang, Yunpeng; Smeland, Olav B; Schork, Andrew; Holland, Dominic; Kauppi, Karolina; Sanyal, Nilotpal; Escott-Price, Valentina; Smith, Daniel J; O'Donovan, Michael; Stefansson, Hreinn; Bjornsdottir, Gyda; Thorgeirsson, Thorgeir E; Stefansson, Kari; McEvoy, Linda K; Dale, Anders M; Andreassen, Ole A; Chen, Chi-Hua

    2017-01-01

    Personality is influenced by genetic and environmental factors and associated with mental health. However, the underlying genetic determinants are largely unknown. We identified six genetic loci, including five novel loci, significantly associated with personality traits in a meta-analysis of genome-wide association studies (N = 123,132-260,861). Of these genome-wide significant loci, extraversion was associated with variants in WSCD2 and near PCDH15, and neuroticism with variants on chromosome 8p23.1 and in L3MBTL2. We performed a principal component analysis to extract major dimensions underlying genetic variations among five personality traits and six psychiatric disorders (N = 5,422-18,759). The first genetic dimension separated personality traits and psychiatric disorders, except that neuroticism and openness to experience were clustered with the disorders. High genetic correlations were found between extraversion and attention-deficit-hyperactivity disorder (ADHD) and between openness and schizophrenia and bipolar disorder. The second genetic dimension was closely aligned with extraversion-introversion and grouped neuroticism with internalizing psychopathology (e.g., depression or anxiety).

  15. Gene expression levels as endophenotypes in genome-wide association studies of Alzheimer disease

    PubMed Central

    Zou, F.; Carrasquillo, M. M.; Pankratz, V. S.; Belbin, O.; Morgan, K.; Allen, M.; Wilcox, S. L.; Ma, L.; Walker, L. P.; Kouri, N.; Burgess, J. D.; Younkin, L. H.; Younkin, Samuel G.; Younkin, C. S.; Bisceglio, G. D.; Crook, J. E.; Dickson, D. W.; Petersen, R. C.; Graff-Radford, N.; Younkin, Steven G.; Ertekin-Taner, N.

    2010-01-01

    Background: Late-onset Alzheimer disease (LOAD) is a common disorder with a substantial genetic component. We postulate that many disease susceptibility variants act by altering gene expression levels. Methods: We measured messenger RNA (mRNA) expression levels of 12 LOAD candidate genes in the cerebella of 200 subjects with LOAD. Using the genotypes from our LOAD genome-wide association study for the cis-single nucleotide polymorphisms (SNPs) (n = 619) of these 12 LOAD candidate genes, we tested for associations with expression levels as endophenotypes. The strongest expression cis-SNP was tested for AD association in 7 independent case-control series (2,280 AD and 2,396 controls). Results: We identified 3 SNPs that associated significantly with IDE (insulin degrading enzyme) expression levels. A single copy of the minor allele for each significant SNP was associated with ∼twofold higher IDE expression levels. The most significant SNP, rs7910977, is 4.2 kb beyond the 3′ end of IDE. The association observed with this SNP was significant even at the genome-wide level (p = 2.7 × 10−8). Furthermore, the minor allele of rs7910977 associated significantly (p = 0.0046) with reduced LOAD risk (OR = 0.81 with a 95% CI of 0.70-0.94), as expected biologically from its association with elevated IDE expression. Conclusions: These results provide strong evidence that IDE is a late-onset Alzheimer disease (LOAD) gene with variants that modify risk of LOAD by influencing IDE expression. They also suggest that the use of expression levels as endophenotypes in genome-wide association studies may provide a powerful approach for the identification of disease susceptibility alleles. GLOSSARY AD = Alzheimer disease; CI = confidence interval; GWAS = genome-wide association study; LOAD = late-onset Alzheimer disease; mRNA = messenger RNA; OR = odds ratio; SNP = single nucleotide polymorphism. PMID:20142614

  16. In vivo genome-wide profiling of RNA secondary structure reveals novel regulatory features.

    PubMed

    Ding, Yiliang; Tang, Yin; Kwok, Chun Kit; Zhang, Yu; Bevilacqua, Philip C; Assmann, Sarah M

    2014-01-30

    RNA structure has critical roles in processes ranging from ligand sensing to the regulation of translation, polyadenylation and splicing. However, a lack of genome-wide in vivo RNA structural data has limited our understanding of how RNA structure regulates gene expression in living cells. Here we present a high-throughput, genome-wide in vivo RNA structure probing method, structure-seq, in which dimethyl sulphate methylation of unprotected adenines and cytosines is identified by next-generation sequencing. Application of this method to Arabidopsis thaliana seedlings yielded the first in vivo genome-wide RNA structure map at nucleotide resolution for any organism, with quantitative structural information across more than 10,000 transcripts. Our analysis reveals a three-nucleotide periodic repeat pattern in the structure of coding regions, as well as a less-structured region immediately upstream of the start codon, and shows that these features are strongly correlated with translation efficiency. We also find patterns of strong and weak secondary structure at sites of alternative polyadenylation, as well as strong secondary structure at 5' splice sites that correlates with unspliced events. Notably, in vivo structures of messenger RNAs annotated for stress responses are poorly predicted in silico, whereas mRNA structures of genes related to cell function maintenance are well predicted. Global comparison of several structural features between these two categories shows that the mRNAs associated with stress responses tend to have more single-strandedness, longer maximal loop length and higher free energy per nucleotide, features that may allow these RNAs to undergo conformational changes in response to environmental conditions. Structure-seq allows the RNA structurome and its biological roles to be interrogated on a genome-wide scale and should be applicable to any organism.

  17. Genomic Predictions and Genome-Wide Association Study of Resistance Against Piscirickettsia salmonis in Coho Salmon (Oncorhynchus kisutch) Using ddRAD Sequencing

    PubMed Central

    Barría, Agustín; Christensen, Kris A.; Yoshida, Grazyella M.; Correa, Katharina; Jedlicki, Ana; Lhorente, Jean P.; Davidson, William S.; Yáñez, José M.

    2018-01-01

    Piscirickettsia salmonis is one of the main infectious diseases affecting coho salmon (Oncorhynchus kisutch) farming, and current treatments have been ineffective for the control of this disease. Genetic improvement for P. salmonis resistance has been proposed as a feasible alternative for the control of this infectious disease in farmed fish. Genotyping by sequencing (GBS) strategies allow genotyping of hundreds of individuals with thousands of single nucleotide polymorphisms (SNPs), which can be used to perform genome wide association studies (GWAS) and predict genetic values using genome-wide information. We used double-digest restriction-site associated DNA (ddRAD) sequencing to dissect the genetic architecture of resistance against P. salmonis in a farmed coho salmon population and to identify molecular markers associated with the trait. We also evaluated genomic selection (GS) models in order to determine the potential to accelerate the genetic improvement of this trait by means of using genome-wide molecular information. A total of 764 individuals from 33 full-sib families (17 highly resistant and 16 highly susceptible) were experimentally challenged against P. salmonis and their genotypes were assayed using ddRAD sequencing. A total of 9,389 SNPs markers were identified in the population. These markers were used to test genomic selection models and compare different GWAS methodologies for resistance measured as day of death (DD) and binary survival (BIN). Genomic selection models showed higher accuracies than the traditional pedigree-based best linear unbiased prediction (PBLUP) method, for both DD and BIN. The models showed an improvement of up to 95% and 155% respectively over PBLUP. One SNP related with B-cell development was identified as a potential functional candidate associated with resistance to P. salmonis defined as DD. PMID:29440129

  18. Genome-wide investigation of genetic changes during modern breeding of Brassica napus.

    PubMed

    Wang, Nian; Li, Feng; Chen, Biyun; Xu, Kun; Yan, Guixin; Qiao, Jiangwei; Li, Jun; Gao, Guizhen; Bancroft, Ian; Meng, Jingling; King, Graham J; Wu, Xiaoming

    2014-08-01

    Considerable genome variation had been incorporated within rapeseed breeding programs over past decades. In past decades, there have been substantial changes in phenotypic properties of rapeseed as a result of extensive breeding effort. Uncovering the underlying patterns of allelic variation in the context of genome organisation would provide knowledge to guide future genetic improvement. We assessed genome-wide genetic changes, including population structure, genetic relatedness, the extent of linkage disequilibrium, nucleotide diversity and genetic differentiation based on F ST outlier detection, for a panel of 472 Brassica napus inbred accessions using a 60 k Brassica Infinium® SNP array. We found genetic diversity varied in different sub-groups. Moreover, the genetic diversity increased from 1950 to 1980 and then remained at a similar level in China and Europe. We also found ~6-10 % genomic regions revealed high F ST values. Some QTLs previously associated with important agronomic traits overlapped with these regions. Overall, the B. napus C genome was found to have more high F ST signals than the A genome, and we concluded that the C genome may contribute more valuable alleles to generate elite traits. The results of this study indicate that considerable genome variation had been incorporated within rapeseed breeding programs over past decades. These results also contribute to understanding the impact of rapeseed improvement on available genome variation and the potential for dissecting complex agronomic traits.

  19. Genome-wide Association Study Identifies Loci for the Polled Phenotype in Yak

    PubMed Central

    Wu, Xiaoyun; Wang, Kun; Ding, Xuezhi; Wang, Mingcheng; Chu, Min; Xie, Xiuyue; Qiu, Qiang; Yan, Ping

    2016-01-01

    The absence of horns, known as the polled phenotype, is an economically important trait in modern yak husbandry, but the genomic structure and genetic basis of this phenotype have yet to be discovered. Here, we conducted a genome-wide association study with a panel of 10 horned and 10 polled yaks using whole genome sequencing. We mapped the POLLED locus to a 200-kb interval, which comprises three protein-coding genes. Further characterization of the candidate region showed recent artificial selection signals resulting from the breeding process. We suggest that expressional variations rather than structural variations in protein probably contribute to the polled phenotype. Our results not only represent the first and important step in establishing the genomic structure of the polled region in yak, but also add to our understanding of the polled trait in bovid species. PMID:27389700

  20. Using an Inbred Horse Breed in a High Density Genome-Wide Scan for Genetic Risk Factors of Insect Bite Hypersensitivity (IBH).

    PubMed

    Velie, Brandon D; Shrestha, Merina; Franҫois, Liesbeth; Schurink, Anouk; Tesfayonas, Yohannes G; Stinckens, Anneleen; Blott, Sarah; Ducro, Bart J; Mikko, Sofia; Thomas, Ruth; Swinburne, June E; Sundqvist, Marie; Eriksson, Susanne; Buys, Nadine; Lindgren, Gabriella

    2016-01-01

    While susceptibility to hypersensitive reactions is a common problem amongst humans and animals alike, the population structure of certain animal species and breeds provides a more advantageous route to better understanding the biology underpinning these conditions. The current study uses Exmoor ponies, a highly inbred breed of horse known to frequently suffer from insect bite hypersensitivity, to identify genomic regions associated with a type I and type IV hypersensitive reaction. A total of 110 cases and 170 controls were genotyped on the 670K Axiom Equine Genotyping Array. Quality control resulted in 452,457 SNPs and 268 individuals being tested for association. Genome-wide association analyses were performed using the GenABEL package in R and resulted in the identification of two regions of interest on Chromosome 8. The first region contained the most significant SNP identified, which was located in an intron of the DCC netrin 1 receptor gene. The second region identified contained multiple top SNPs and encompassed the PIGN, KIAA1468, TNFRSF11A, ZCCHC2, and PHLPP1 genes. Although additional studies will be needed to validate the importance of these regions in horses and the relevance of these regions in other species, the knowledge gained from the current study has the potential to be a step forward in unraveling the complex nature of hypersensitive reactions.

  1. Genome-wide engineering of an infectious clone of herpes simplex virus type 1 using synthetic genomics assembly methods.

    PubMed

    Oldfield, Lauren M; Grzesik, Peter; Voorhies, Alexander A; Alperovich, Nina; MacMath, Derek; Najera, Claudia D; Chandra, Diya Sabrina; Prasad, Sanjana; Noskov, Vladimir N; Montague, Michael G; Friedman, Robert M; Desai, Prashant J; Vashee, Sanjay

    2017-10-17

    Here, we present a transformational approach to genome engineering of herpes simplex virus type 1 (HSV-1), which has a large DNA genome, using synthetic genomics tools. We believe this method will enable more rapid and complex modifications of HSV-1 and other large DNA viruses than previous technologies, facilitating many useful applications. Yeast transformation-associated recombination was used to clone 11 fragments comprising the HSV-1 strain KOS 152 kb genome. Using overlapping sequences between the adjacent pieces, we assembled the fragments into a complete virus genome in yeast, transferred it into an Escherichia coli host, and reconstituted infectious virus following transfection into mammalian cells. The virus derived from this yeast-assembled genome, KOS YA , replicated with kinetics similar to wild-type virus. We demonstrated the utility of this modular assembly technology by making numerous modifications to a single gene, making changes to two genes at the same time and, finally, generating individual and combinatorial deletions to a set of five conserved genes that encode virion structural proteins. While the ability to perform genome-wide editing through assembly methods in large DNA virus genomes raises dual-use concerns, we believe the incremental risks are outweighed by potential benefits. These include enhanced functional studies, generation of oncolytic virus vectors, development of delivery platforms of genes for vaccines or therapy, as well as more rapid development of countermeasures against potential biothreats.

  2. Genome-wide engineering of an infectious clone of herpes simplex virus type 1 using synthetic genomics assembly methods

    PubMed Central

    Grzesik, Peter; Voorhies, Alexander A.; Alperovich, Nina; MacMath, Derek; Najera, Claudia D.; Chandra, Diya Sabrina; Prasad, Sanjana; Noskov, Vladimir N.; Montague, Michael G.; Friedman, Robert M.; Desai, Prashant J.

    2017-01-01

    Here, we present a transformational approach to genome engineering of herpes simplex virus type 1 (HSV-1), which has a large DNA genome, using synthetic genomics tools. We believe this method will enable more rapid and complex modifications of HSV-1 and other large DNA viruses than previous technologies, facilitating many useful applications. Yeast transformation-associated recombination was used to clone 11 fragments comprising the HSV-1 strain KOS 152 kb genome. Using overlapping sequences between the adjacent pieces, we assembled the fragments into a complete virus genome in yeast, transferred it into an Escherichia coli host, and reconstituted infectious virus following transfection into mammalian cells. The virus derived from this yeast-assembled genome, KOSYA, replicated with kinetics similar to wild-type virus. We demonstrated the utility of this modular assembly technology by making numerous modifications to a single gene, making changes to two genes at the same time and, finally, generating individual and combinatorial deletions to a set of five conserved genes that encode virion structural proteins. While the ability to perform genome-wide editing through assembly methods in large DNA virus genomes raises dual-use concerns, we believe the incremental risks are outweighed by potential benefits. These include enhanced functional studies, generation of oncolytic virus vectors, development of delivery platforms of genes for vaccines or therapy, as well as more rapid development of countermeasures against potential biothreats. PMID:28928148

  3. Genome-wide survey of single-nucleotide polymorphisms reveals fine-scale population structure and signs of selection in the threatened Caribbean elkhorn coral, Acropora palmata

    PubMed Central

    2017-01-01

    The advent of next-generation sequencing tools has made it possible to conduct fine-scale surveys of population differentiation and genome-wide scans for signatures of selection in non-model organisms. Such surveys are of particular importance in sharply declining coral species, since knowledge of population boundaries and signs of local adaptation can inform restoration and conservation efforts. Here, we use genome-wide surveys of single-nucleotide polymorphisms in the threatened Caribbean elkhorn coral, Acropora palmata, to reveal fine-scale population structure and infer the major barrier to gene flow that separates the eastern and western Caribbean populations between the Bahamas and Puerto Rico. The exact location of this break had been subject to discussion because two previous studies based on microsatellite data had come to differing conclusions. We investigate this contradiction by analyzing an extended set of 11 microsatellite markers including the five previously employed and discovered that one of the original microsatellite loci is apparently under selection. Exclusion of this locus reconciles the results from the SNP and the microsatellite datasets. Scans for outlier loci in the SNP data detected 13 candidate loci under positive selection, however there was no correlation between available environmental parameters and genetic distance. Together, these results suggest that reef restoration efforts should use local sources and utilize existing functional variation among geographic regions in ex situ crossing experiments to improve stress resistance of this species. PMID:29181279

  4. Genome-wide meta-analysis identifies five new susceptibility loci for cutaneous malignant melanoma.

    PubMed

    Law, Matthew H; Bishop, D Timothy; Lee, Jeffrey E; Brossard, Myriam; Martin, Nicholas G; Moses, Eric K; Song, Fengju; Barrett, Jennifer H; Kumar, Rajiv; Easton, Douglas F; Pharoah, Paul D P; Swerdlow, Anthony J; Kypreou, Katerina P; Taylor, John C; Harland, Mark; Randerson-Moor, Juliette; Akslen, Lars A; Andresen, Per A; Avril, Marie-Françoise; Azizi, Esther; Scarrà, Giovanna Bianchi; Brown, Kevin M; Dębniak, Tadeusz; Duffy, David L; Elder, David E; Fang, Shenying; Friedman, Eitan; Galan, Pilar; Ghiorzo, Paola; Gillanders, Elizabeth M; Goldstein, Alisa M; Gruis, Nelleke A; Hansson, Johan; Helsing, Per; Hočevar, Marko; Höiom, Veronica; Ingvar, Christian; Kanetsky, Peter A; Chen, Wei V; Landi, Maria Teresa; Lang, Julie; Lathrop, G Mark; Lubiński, Jan; Mackie, Rona M; Mann, Graham J; Molven, Anders; Montgomery, Grant W; Novaković, Srdjan; Olsson, Håkan; Puig, Susana; Puig-Butille, Joan Anton; Qureshi, Abrar A; Radford-Smith, Graham L; van der Stoep, Nienke; van Doorn, Remco; Whiteman, David C; Craig, Jamie E; Schadendorf, Dirk; Simms, Lisa A; Burdon, Kathryn P; Nyholt, Dale R; Pooley, Karen A; Orr, Nick; Stratigos, Alexander J; Cust, Anne E; Ward, Sarah V; Hayward, Nicholas K; Han, Jiali; Schulze, Hans-Joachim; Dunning, Alison M; Bishop, Julia A Newton; Demenais, Florence; Amos, Christopher I; MacGregor, Stuart; Iles, Mark M

    2015-09-01

    Thirteen common susceptibility loci have been reproducibly associated with cutaneous malignant melanoma (CMM). We report the results of an international 2-stage meta-analysis of CMM genome-wide association studies (GWAS). This meta-analysis combines 11 GWAS (5 previously unpublished) and a further three stage 2 data sets, totaling 15,990 CMM cases and 26,409 controls. Five loci not previously associated with CMM risk reached genome-wide significance (P < 5 × 10(-8)), as did 2 previously reported but unreplicated loci and all 13 established loci. Newly associated SNPs fall within putative melanocyte regulatory elements, and bioinformatic and expression quantitative trait locus (eQTL) data highlight candidate genes in the associated regions, including one involved in telomere biology.

  5. Genome-wide meta-analysis identifies five new susceptibility loci for cutaneous malignant melanoma

    PubMed Central

    Law, Matthew H.; Bishop, D. Timothy; Martin, Nicholas G.; Moses, Eric K.; Song, Fengju; Barrett, Jennifer H.; Kumar, Rajiv; Easton, Douglas F.; Pharoah, Paul D. P.; Swerdlow, Anthony J.; Kypreou, Katerina P.; Taylor, John C.; Harland, Mark; Randerson-Moor, Juliette; Akslen, Lars A.; Andresen, Per A.; Avril, Marie-Françoise; Azizi, Esther; Scarrà, Giovanna Bianchi; Brown, Kevin M.; Dębniak, Tadeusz; Duffy, David L.; Elder, David E.; Fang, Shenying; Friedman, Eitan; Galan, Pilar; Ghiorzo, Paola; Gillanders, Elizabeth M.; Goldstein, Alisa M.; Gruis, Nelleke A.; Hansson, Johan; Helsing, Per; Hočevar, Marko; Höiom, Veronica; Ingvar, Christian; Kanetsky, Peter A.; Chen, Wei V.; Landi, Maria Teresa; Lang, Julie; Lathrop, G. Mark; Lubiński, Jan; Mackie, Rona M.; Mann, Graham J.; Molven, Anders; Montgomery, Grant W.; Novaković, Srdjan; Olsson, Håkan; Puig, Susana; Puig-Butille, Joan Anton; Qureshi, Abrar A.; Radford-Smith, Graham L.; van der Stoep, Nienke; van Doorn, Remco; Whiteman, David C.; Craig, Jamie E.; Schadendorf, Dirk; Simms, Lisa A.; Burdon, Kathryn P.; Nyholt, Dale R.; Pooley, Karen A.; Orr, Nick; Stratigos, Alexander J.; Cust, Anne E.; Ward, Sarah V.; Hayward, Nicholas K.; Han, Jiali; Schulze, Hans-Joachim; Dunning, Alison M.; Bishop, Julia A. Newton; MacGregor, Stuart; Iles, Mark M.

    2015-01-01

    Thirteen common susceptibility loci have been reproducibly associated with cutaneous malignant melanoma (CMM). We report the results of an international 2-stage meta-analysis of CMM genome-wide association studies (GWAS). This meta-analysis combines 11 GWAS (5 previously unpublished) and a further three stage 2 data sets, totaling 15,990 CMM cases and 26,409 controls. Five loci not previously associated with CMM risk reached genome-wide significance (P < 5×10–8), as did two previously-reported but un-replicated loci and all thirteen established loci. Novel SNPs fall within putative melanocyte regulatory elements, and bioinformatic and expression quantitative trait locus (eQTL) data highlight candidate genes including one involved in telomere biology. PMID:26237428

  6. Genome-wide computational prediction and analysis of core promoter elements across plant monocots and dicots

    USDA-ARS?s Scientific Manuscript database

    Transcription initiation, essential to gene expression regulation, involves recruitment of basal transcription factors to the core promoter elements (CPEs). The distribution of currently known CPEs across plant genomes is largely unknown. This is the first large scale genome-wide report on the compu...

  7. Genome-wide population structure and evolutionary history of the Frizarta dairy sheep.

    PubMed

    Kominakis, A; Hager-Theodorides, A L; Saridaki, A; Antonakos, G; Tsiamis, G

    2017-10-01

    In the present study, we used genomic data, generated with a medium density single nucleotide polymorphisms (SNP) array, to acquire more information on the population structure and evolutionary history of the synthetic Frizarta dairy sheep. First, two typical measures of linkage disequilibrium (LD) were estimated at various physical distances that were then used to make inferences on the effective population size at key past time points. Population structure was also assessed by both multidimensional scaling analysis and k-means clustering on the distance matrix obtained from the animals' genomic relationships. The Wright's fixation F ST index was also employed to assess herds' genetic homogeneity and to indirectly estimate past migration rates. The Wright's fixation F IS index and genomic inbreeding coefficients based on the genomic relationship matrix as well as on runs of homozygosity were also estimated. The Frizarta breed displays relatively low LD levels with r 2 and |D'| equal to 0.18 and 0.50, respectively, at an average inter-marker distance of 31 kb. Linkage disequilibrium decayed rapidly by distance and persisted over just a few thousand base pairs. Rate of LD decay (β) varied widely among the 26 autosomes with larger values estimated for shorter chromosomes (e.g. β=0.057, for OAR6) and smaller values for longer ones (e.g. β=0.022, for OAR2). The inferred effective population size at the beginning of the breed's formation was as high as 549, was then reduced to 463 in 1981 (end of the breed's formation) and further declined to 187, one generation ago. Multidimensional scaling analysis and k-means clustering suggested a genetically homogenous population, F ST estimates indicated relatively low genetic differentiation between herds, whereas a heat map of the animals' genomic kinship relationships revealed a stratified population, at a herd level. Estimates of genomic inbreeding coefficients suggested that most recent parental relatedness may have been a

  8. Genome-wide association study of sporadic brain arteriovenous malformations.

    PubMed

    Weinsheimer, Shantel; Bendjilali, Nasrine; Nelson, Jeffrey; Guo, Diana E; Zaroff, Jonathan G; Sidney, Stephen; McCulloch, Charles E; Al-Shahi Salman, Rustam; Berg, Jonathan N; Koeleman, Bobby P C; Simon, Matthias; Bostroem, Azize; Fontanella, Marco; Sturiale, Carmelo L; Pola, Roberto; Puca, Alfredo; Lawton, Michael T; Young, William L; Pawlikowska, Ludmila; Klijn, Catharina J M; Kim, Helen

    2016-09-01

    The pathogenesis of sporadic brain arteriovenous malformations (BAVMs) remains unknown, but studies suggest a genetic component. We estimated the heritability of sporadic BAVM and performed a genome-wide association study (GWAS) to investigate association of common single nucleotide polymorphisms (SNPs) with risk of sporadic BAVM in the international, multicentre Genetics of Arteriovenous Malformation (GEN-AVM) consortium. The Caucasian discovery cohort included 515 BAVM cases and 1191 controls genotyped using Affymetrix genome-wide SNP arrays. Genotype data were imputed to 1000 Genomes Project data, and well-imputed SNPs (>0.01 minor allele frequency) were analysed for association with BAVM. 57 top BAVM-associated SNPs (51 SNPs with p<10(-05) or p<10(-04) in candidate pathway genes, and 6 candidate BAVM SNPs) were tested in a replication cohort including 608 BAVM cases and 744 controls. The estimated heritability of BAVM was 17.6% (SE 8.9%, age and sex-adjusted p=0.015). None of the SNPs were significantly associated with BAVM in the replication cohort after correction for multiple testing. 6 SNPs had a nominal p<0.1 in the replication cohort and map to introns in EGFEM1P, SP4 and CDKAL1 or near JAG1 and BNC2. Of the 6 candidate SNPs, 2 in ACVRL1 and MMP3 had a nominal p<0.05 in the replication cohort. We performed the first GWAS of sporadic BAVM in the largest BAVM cohort assembled to date. No GWAS SNPs were replicated, suggesting that common SNPs do not contribute strongly to BAVM susceptibility. However, heritability estimates suggest a modest but significant genetic contribution. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://www.bmj.com/company/products-services/rights-and-licensing/

  9. Genome-wide DNA methylation analysis of pseudohypoparathyroidism patients with GNAS imprinting defects.

    PubMed

    Rochtus, Anne; Martin-Trujillo, Alejandro; Izzi, Benedetta; Elli, Francesca; Garin, Intza; Linglart, Agnes; Mantovani, Giovanna; Perez de Nanclares, Guiomar; Thiele, Suzanne; Decallonne, Brigitte; Van Geet, Chris; Monk, David; Freson, Kathleen

    2016-01-01

    Pseudohypoparathyroidism (PHP) is caused by (epi)genetic defects in the imprinted GNAS cluster. Current classification of PHP patients is hampered by clinical and molecular diagnostic overlaps. The European Consortium for the study of PHP designed a genome-wide methylation study to improve molecular diagnosis. The HumanMethylation 450K BeadChip was used to analyze genome-wide methylation in 24 PHP patients with parathyroid hormone resistance and 20 age- and gender-matched controls. Patients were previously diagnosed with GNAS-specific differentially methylated regions (DMRs) and include 6 patients with known STX16 deletion (PHP(Δstx16)) and 18 without deletion (PHP(neg)). The array demonstrated that PHP patients do not show DNA methylation differences at the whole-genome level. Unsupervised clustering of GNAS-specific DMRs divides PHP(Δstx16) versus PHP(neg) patients. Interestingly, in contrast to the notion that all PHP patients share methylation defects in the A/B DMR while only PHP(Δstx16) patients have normal NESP, GNAS-AS1 and XL methylation, we found a novel DMR (named GNAS-AS2) in the GNAS-AS1 region that is significantly different in both PHP(Δstx16) and PHP(neg), as validated by Sequenom EpiTYPER in a larger PHP cohort. The analysis of 58 DMRs revealed that 8/18 PHP(neg) and 1/6 PHP(Δstx16) patients have multi-locus methylation defects. Validation was performed for FANCC and SVOPL DMRs. This is the first genome-wide methylation study for PHP patients that confirmed that GNAS is the most significant DMR, and the presence of STX16 deletion divides PHP patients in two groups. Moreover, a novel GNAS-AS2 DMR affects all PHP patients, and PHP patients seem sensitive to multi-locus methylation defects.

  10. Unraveling the Genetic Etiology of Adult Antisocial Behavior: A Genome-Wide Association Study

    PubMed Central

    Tielbeek, Jorim J.; Medland, Sarah E.; Benyamin, Beben; Byrne, Enda M.; Heath, Andrew C.; Madden, Pamela A. F.; Martin, Nicholas G.; Wray, Naomi R.; Verweij, Karin J. H.

    2012-01-01

    Crime poses a major burden for society. The heterogeneous nature of criminal behavior makes it difficult to unravel its causes. Relatively little research has been conducted on the genetic influences of criminal behavior. The few twin and adoption studies that have been undertaken suggest that about half of the variance in antisocial behavior can be explained by genetic factors. In order to identify the specific common genetic variants underlying this behavior, we conduct the first genome-wide association study (GWAS) on adult antisocial behavior. Our sample comprised a community sample of 4816 individuals who had completed a self-report questionnaire. No genetic polymorphisms reached genome-wide significance for association with adult antisocial behavior. In addition, none of the traditional candidate genes can be confirmed in our study. While not genome-wide significant, the gene with the strongest association (p-value = 8.7×10−5) was DYRK1A, a gene previously related to abnormal brain development and mental retardation. Future studies should use larger, more homogeneous samples to disentangle the etiology of antisocial behavior. Biosocial criminological research allows a more empirically grounded understanding of criminal behavior, which could ultimately inform and improve current treatment strategies. PMID:23077488

  11. Multi-instance multi-label distance metric learning for genome-wide protein function prediction.

    PubMed

    Xu, Yonghui; Min, Huaqing; Song, Hengjie; Wu, Qingyao

    2016-08-01

    Multi-instance multi-label (MIML) learning has been proven to be effective for the genome-wide protein function prediction problems where each training example is associated with not only multiple instances but also multiple class labels. To find an appropriate MIML learning method for genome-wide protein function prediction, many studies in the literature attempted to optimize objective functions in which dissimilarity between instances is measured using the Euclidean distance. But in many real applications, Euclidean distance may be unable to capture the intrinsic similarity/dissimilarity in feature space and label space. Unlike other previous approaches, in this paper, we propose to learn a multi-instance multi-label distance metric learning framework (MIMLDML) for genome-wide protein function prediction. Specifically, we learn a Mahalanobis distance to preserve and utilize the intrinsic geometric information of both feature space and label space for MIML learning. In addition, we try to deal with the sparsely labeled data by giving weight to the labeled data. Extensive experiments on seven real-world organisms covering the biological three-domain system (i.e., archaea, bacteria, and eukaryote; Woese et al., 1990) show that the MIMLDML algorithm is superior to most state-of-the-art MIML learning algorithms. Copyright © 2016 Elsevier Ltd. All rights reserved.

  12. Genome-Wide Linkage and Association Analysis Identifies Major Gene Loci for Guttural Pouch Tympany in Arabian and German Warmblood Horses

    PubMed Central

    Metzger, Julia; Ohnesorge, Bernhard; Distl, Ottmar

    2012-01-01

    Equine guttural pouch tympany (GPT) is a hereditary condition affecting foals in their first months of life. Complex segregation analyses in Arabian and German warmblood horses showed the involvement of a major gene as very likely. Genome-wide linkage and association analyses including a high density marker set of single nucleotide polymorphisms (SNPs) were performed to map the genomic region harbouring the potential major gene for GPT. A total of 85 Arabian and 373 German warmblood horses were genotyped on the Illumina equine SNP50 beadchip. Non-parametric multipoint linkage analyses showed genome-wide significance on horse chromosomes (ECA) 3 for German warmblood at 16–26 Mb and 34–55 Mb and for Arabian on ECA15 at 64–65 Mb. Genome-wide association analyses confirmed the linked regions for both breeds. In Arabian, genome-wide association was detected at 64 Mb within the region with the highest linkage peak on ECA15. For German warmblood, signals for genome-wide association were close to the peak region of linkage at 52 Mb on ECA3. The odds ratio for the SNP with the highest genome-wide association was 0.12 for the Arabian. In conclusion, the refinement of the regions with the Illumina equine SNP50 beadchip is an important step to unravel the responsible mutations for GPT. PMID:22848553

  13. Genome-wide association studies in Alzheimer's disease.

    PubMed

    Bertram, Lars; Tanzi, Rudolph E

    2009-10-15

    Genome-wide association studies (GWAS) have gained considerable momentum over the last couple of years for the identification of novel complex disease genes. In the field of Alzheimer's disease (AD), there are currently eight published and two provisionally reported GWAS, highlighting over two dozen novel potential susceptibility loci beyond the well-established APOE association. On the basis of the data available at the time of this writing, the most compelling novel GWAS signal has been observed in GAB2 (GRB2-associated binding protein 2), followed by less consistently replicated signals in galanin-like peptide (GALP), piggyBac transposable element derived 1 (PGBD1), tyrosine kinase, non-receptor 1 (TNK1). Furthermore, consistent replication has been recently announced for CLU (clusterin, also known as apolipoprotein J). Finally, there are at least three replicated loci in hitherto uncharacterized genomic intervals on chromosomes 14q32.13, 14q31.2 and 6q24.1 likely implicating the existence of novel AD genes in these regions. In this review, we will discuss the characteristics and potential relevance to pathogenesis of the outcomes of all currently available GWAS in AD. A particular emphasis will be laid on findings with independent data in favor of the original association.

  14. Genome-wide linkage scan for maximum and length-dependent knee muscle strength in young men: significant evidence for linkage at chromosome 14q24.3.

    PubMed

    De Mars, G; Windelinckx, A; Huygens, W; Peeters, M W; Beunen, G P; Aerssens, J; Vlietinck, R; Thomis, M A I

    2008-05-01

    Maintenance of high muscular fitness is positively related to bone health, functionality in daily life and increasing insulin sensitivity, and negatively related to falls and fractures, morbidity and mortality. Heritability of muscle strength phenotypes ranges between 31% and 95%, but little is known about the identity of the genes underlying this complex trait. As a first attempt, this genome-wide linkage study aimed to identify chromosomal regions linked to muscle and bone cross-sectional area, isometric knee flexion and extension torque, and torque-length relationship for knee flexors and extensors. In total, 283 informative male siblings (17-36 years old), belonging to 105 families, were used to conduct a genome-wide SNP-based multipoint linkage analysis. The strongest evidence for linkage was found for the torque-length relationship of the knee flexors at 14q24.3 (LOD = 4.09; p<10(-5)). Suggestive evidence for linkage was found at 14q32.2 (LOD = 3.00; P = 0.005) for muscle and bone cross-sectional area, at 2p24.2 (LOD = 2.57; p = 0.01) for isometric knee torque at 30 degrees flexion, at 1q21.3, 2p23.3 and 18q11.2 (LOD = 2.33, 2.69 and 2.21; p<10(-4) for all) for the torque-length relationship of the knee extensors and at 18p11.31 (LOD = 2.39; p = 0.0004) for muscle-mass adjusted isometric knee extension torque. We conclude that many small contributing genes rather than a few important genes are involved in causing variation in different underlying phenotypes of muscle strength. Furthermore, some overlap in promising genomic regions were identified among different strength phenotypes.

  15. Genome-wide linkage scan for maximum and length-dependent knee muscle strength in young men: significant evidence for linkage at chromosome 14q24.3

    PubMed Central

    De Mars, G; Windelinckx, A; Huygens, W; Peeters, M W; Beunen, G P; Aerssens, J; Vlietinck, R; Thomis, M A I

    2008-01-01

    Background: Maintenance of high muscular fitness is positively related to bone health, functionality in daily life and increasing insulin sensitivity, and negatively related to falls and fractures, morbidity and mortality. Heritability of muscle strength phenotypes ranges between 31% and 95%, but little is known about the identity of the genes underlying this complex trait. As a first attempt, this genome-wide linkage study aimed to identify chromosomal regions linked to muscle and bone cross-sectional area, isometric knee flexion and extension torque, and torque–length relationship for knee flexors and extensors. Methods: In total, 283 informative male siblings (17–36 years old), belonging to 105 families, were used to conduct a genome-wide SNP-based multipoint linkage analysis. Results: The strongest evidence for linkage was found for the torque–length relationship of the knee flexors at 14q24.3 (LOD  = 4.09; p<10−5). Suggestive evidence for linkage was found at 14q32.2 (LOD  = 3.00; P = 0.005) for muscle and bone cross-sectional area, at 2p24.2 (LOD  = 2.57; p = 0.01) for isometric knee torque at 30° flexion, at 1q21.3, 2p23.3 and 18q11.2 (LOD  = 2.33, 2.69 and 2.21; p<10−4 for all) for the torque–length relationship of the knee extensors and at 18p11.31 (LOD  = 2.39; p = 0.0004) for muscle-mass adjusted isometric knee extension torque. Conclusions: We conclude that many small contributing genes rather than a few important genes are involved in causing variation in different underlying phenotypes of muscle strength. Furthermore, some overlap in promising genomic regions were identified among different strength phenotypes. PMID:18178634

  16. Genetic determinants of common epilepsies: a meta-analysis of genome-wide association studies

    PubMed Central

    2014-01-01

    Summary Background The epilepsies are a clinically heterogeneous group of neurological disorders. Despite strong evidence for heritability, genome-wide association studies have had little success in identification of risk loci associated with epilepsy, probably because of relatively small sample sizes and insufficient power. We aimed to identify risk loci through meta-analyses of genome-wide association studies for all epilepsy and the two largest clinical subtypes (genetic generalised epilepsy and focal epilepsy). Methods We combined genome-wide association data from 12 cohorts of individuals with epilepsy and controls from population-based datasets. Controls were ethnically matched with cases. We phenotyped individuals with epilepsy into categories of genetic generalised epilepsy, focal epilepsy, or unclassified epilepsy. After standardised filtering for quality control and imputation to account for different genotyping platforms across sites, investigators at each site conducted a linear mixed-model association analysis for each dataset. Combining summary statistics, we conducted fixed-effects meta-analyses of all epilepsy, focal epilepsy, and genetic generalised epilepsy. We set the genome-wide significance threshold at p<1·66 × 10−8. Findings We included 8696 cases and 26 157 controls in our analysis. Meta-analysis of the all-epilepsy cohort identified loci at 2q24.3 (p=8·71 × 10−10), implicating SCN1A, and at 4p15.1 (p=5·44 × 10−9), harbouring PCDH7, which encodes a protocadherin molecule not previously implicated in epilepsy. For the cohort of genetic generalised epilepsy, we noted a single signal at 2p16.1 (p=9·99 × 10−9), implicating VRK2 or FANCL. No single nucleotide polymorphism achieved genome-wide significance for focal epilepsy. Interpretation This meta-analysis describes a new locus not previously implicated in epilepsy and provides further evidence about the genetic architecture of these disorders, with the

  17. Mycobacterium tuberculosis genome-wide screen exposes multiple CD8+ T cell epitopes

    PubMed Central

    Hammond, A S; Klein, M R; Corrah, T; Fox, A; Jaye, A; McAdam, K P; Brookes, R H

    2005-01-01

    Mounting evidence suggests human leucocyte antigen (HLA) class I-restricted CD8+ T cells play a role in protective immunity against tuberculosis yet relatively few epitopes specific for the causative organism, Mycobacterium tuberculosis, are reported. Here a total genome-wide screen of M. tuberculosis was used to identify putative HLA-B*3501 T cell epitopes. Of 479 predicted epitopes, 13 with the highest score were synthesized and used to restimulate lymphocytes from naturally exposed HLA-B*3501 healthy individuals in cultured and ex vivo enzyme-linked immunospot (ELISPOT) assays for interferon (IFN)-γ. All 13 peptides elicited a response that varied considerably between individuals. For three peptides CD8+ T cell lines were expanded and four of the 13 were recognized permissively through the HLA-B7 supertype family. Although further testing is required we show the genome-wide screen to be feasible for the identification of unknown mycobacterial antigens involved in immunity against natural infection. While the mechanisms of protective immunity against M. tuberculosis infection remain unclear, conventional class I-restricted CD8+ T cell responses appear to be widespread throughout the genome. PMID:15762882

  18. From Genome-Wide Association Study to Phenome-Wide Association Study: New Paradigms in Obesity Research.

    PubMed

    Zhang, Y-P; Zhang, Y-Y; Duan, D D

    2016-01-01

    Obesity is a condition in which excess body fat has accumulated over an extent that increases the risk of many chronic diseases. The current clinical classification of obesity is based on measurement of body mass index (BMI), waist-hip ratio, and body fat percentage. However, these measurements do not account for the wide individual variations in fat distribution, degree of fatness or health risks, and genetic variants identified in the genome-wide association studies (GWAS). In this review, we will address this important issue with the introduction of phenome, phenomics, and phenome-wide association study (PheWAS). We will discuss the new paradigm shift from GWAS to PheWAS in obesity research. In the era of precision medicine, phenomics and PheWAS provide the required approaches to better definition and classification of obesity according to the association of obese phenome with their unique molecular makeup, lifestyle, and environmental impact. Copyright © 2016 Elsevier Inc. All rights reserved.

  19. Genome-Wide Association Study of Erosive Tooth Wear in a Finnish Cohort.

    PubMed

    Alaraudanjoki, Viivi Karoliina; Koivisto, Salla; Pesonen, Paula; Männikkö, Minna; Leinonen, Jukka; Tjäderhane, Leo; Laitala, Marja-Liisa; Lussi, Adrian; Anttonen, Vuokko Anna-Marketta

    2018-06-13

    Erosive tooth wear is defined as irreversible loss of dental tissues due to intrinsic or extrinsic acids, exacerbated by mechanical forces. Recent studies have suggested a higher prevalence of erosive tooth wear in males, as well as a genetic contribution to susceptibility to erosive tooth wear. Our aim was to examine erosive tooth wear by performing a genome-wide association study (GWAS) in a sample of the Northern Finland Birth Cohort 1966 (n = 1,962). Erosive tooth wear was assessed clinically using the basic erosive wear examination. A GWAS was performed for the whole sample as well as separately for males and females. We identified one genome-wide significant signal (rs11681214) in the GWAS of the whole sample near the genes PXDN and MYT1L. When the sample was stratified by sex, the strongest genome-wide significant signals were observed in or near the genes FGFR1, C8orf86, CDH4, SCD5, F2R, and ING1. Additionally, multiple suggestive association signals were detected in all GWASs performed. Many of the signals were in or near the genes putatively related to oral environment or tooth development, and some were near the regions considered to be associated with dental caries, such as 2p24, 4q21, and 13q33. Replications of these associations in other samples, as well as experimental studies to determine the biological functions of associated genetic variants, are needed. © 2018 S. Karger AG, Basel.

  20. A Genome-Wide Landscape of Retrocopies in Primate Genomes.

    PubMed

    Navarro, Fábio C P; Galante, Pedro A F

    2015-07-29

    Gene duplication is a key factor contributing to phenotype diversity across and within species. Although the availability of complete genomes has led to the extensive study of genomic duplications, the dynamics and variability of gene duplications mediated by retrotransposition are not well understood. Here, we predict mRNA retrotransposition and use comparative genomics to investigate their origin and variability across primates. Analyzing seven anthropoid primate genomes, we found a similar number of mRNA retrotranspositions (∼7,500 retrocopies) in Catarrhini (Old Word Monkeys, including humans), but a surprising large number of retrocopies (∼10,000) in Platyrrhini (New World Monkeys), which may be a by-product of higher long interspersed nuclear element 1 activity in these genomes. By inferring retrocopy orthology, we dated most of the primate retrocopy origins, and estimated a decrease in the fixation rate in recent primate history, implying a smaller number of species-specific retrocopies. Moreover, using RNA-Seq data, we identified approximately 3,600 expressed retrocopies. As expected, most of these retrocopies are located near or within known genes, present tissue-specific and even species-specific expression patterns, and no expression correlation to their parental genes. Taken together, our results provide further evidence that mRNA retrotransposition is an active mechanism in primate evolution and suggest that retrocopies may not only introduce great genetic variability between lineages but also create a large reservoir of potentially functional new genomic loci in primate genomes. © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  1. CisMiner: Genome-Wide In-Silico Cis-Regulatory Module Prediction by Fuzzy Itemset Mining

    PubMed Central

    Navarro, Carmen; Lopez, Francisco J.; Cano, Carlos; Garcia-Alcalde, Fernando; Blanco, Armando

    2014-01-01

    Eukaryotic gene control regions are known to be spread throughout non-coding DNA sequences which may appear distant from the gene promoter. Transcription factors are proteins that coordinately bind to these regions at transcription factor binding sites to regulate gene expression. Several tools allow to detect significant co-occurrences of closely located binding sites (cis-regulatory modules, CRMs). However, these tools present at least one of the following limitations: 1) scope limited to promoter or conserved regions of the genome; 2) do not allow to identify combinations involving more than two motifs; 3) require prior information about target motifs. In this work we present CisMiner, a novel methodology to detect putative CRMs by means of a fuzzy itemset mining approach able to operate at genome-wide scale. CisMiner allows to perform a blind search of CRMs without any prior information about target CRMs nor limitation in the number of motifs. CisMiner tackles the combinatorial complexity of genome-wide cis-regulatory module extraction using a natural representation of motif combinations as itemsets and applying the Top-Down Fuzzy Frequent- Pattern Tree algorithm to identify significant itemsets. Fuzzy technology allows CisMiner to better handle the imprecision and noise inherent to regulatory processes. Results obtained for a set of well-known binding sites in the S. cerevisiae genome show that our method yields highly reliable predictions. Furthermore, CisMiner was also applied to putative in-silico predicted transcription factor binding sites to identify significant combinations in S. cerevisiae and D. melanogaster, proving that our approach can be further applied genome-wide to more complex genomes. CisMiner is freely accesible at: http://genome2.ugr.es/cisminer. CisMiner can be queried for the results presented in this work and can also perform a customized cis-regulatory module prediction on a query set of transcription factor binding sites provided by

  2. Significant Locus and Metabolic Genetic Correlations Revealed in Genome-Wide Association Study of Anorexia Nervosa.

    PubMed

    Duncan, Laramie; Yilmaz, Zeynep; Gaspar, Helena; Walters, Raymond; Goldstein, Jackie; Anttila, Verneri; Bulik-Sullivan, Brendan; Ripke, Stephan; Thornton, Laura; Hinney, Anke; Daly, Mark; Sullivan, Patrick F; Zeggini, Eleftheria; Breen, Gerome; Bulik, Cynthia M

    2017-09-01

    The authors conducted a genome-wide association study of anorexia nervosa and calculated genetic correlations with a series of psychiatric, educational, and metabolic phenotypes. Following uniform quality control and imputation procedures using the 1000 Genomes Project (phase 3) in 12 case-control cohorts comprising 3,495 anorexia nervosa cases and 10,982 controls, the authors performed standard association analysis followed by a meta-analysis across cohorts. Linkage disequilibrium score regression was used to calculate genome-wide common variant heritability (single-nucleotide polymorphism [SNP]-based heritability [h 2 SNP ]), partitioned heritability, and genetic correlations (r g ) between anorexia nervosa and 159 other phenotypes. Results were obtained for 10,641,224 SNPs and insertion-deletion variants with minor allele frequencies >1% and imputation quality scores >0.6. The h 2 SNP of anorexia nervosa was 0.20 (SE=0.02), suggesting that a substantial fraction of the twin-based heritability arises from common genetic variation. The authors identified one genome-wide significant locus on chromosome 12 (rs4622308) in a region harboring a previously reported type 1 diabetes and autoimmune disorder locus. Significant positive genetic correlations were observed between anorexia nervosa and schizophrenia, neuroticism, educational attainment, and high-density lipoprotein cholesterol, and significant negative genetic correlations were observed between anorexia nervosa and body mass index, insulin, glucose, and lipid phenotypes. Anorexia nervosa is a complex heritable phenotype for which this study has uncovered the first genome-wide significant locus. Anorexia nervosa also has large and significant genetic correlations with both psychiatric phenotypes and metabolic traits. The study results encourage a reconceptualization of this frequently lethal disorder as one with both psychiatric and metabolic etiology.

  3. Ab Initio structure prediction for Escherichia coli: towards genome-wide protein structure modeling and fold assignment

    PubMed Central

    Xu, Dong; Zhang, Yang

    2013-01-01

    Genome-wide protein structure prediction and structure-based function annotation have been a long-term goal in molecular biology but not yet become possible due to difficulties in modeling distant-homology targets. We developed a hybrid pipeline combining ab initio folding and template-based modeling for genome-wide structure prediction applied to the Escherichia coli genome. The pipeline was tested on 43 known sequences, where QUARK-based ab initio folding simulation generated models with TM-score 17% higher than that by traditional comparative modeling methods. For 495 unknown hard sequences, 72 are predicted to have a correct fold (TM-score > 0.5) and 321 have a substantial portion of structure correctly modeled (TM-score > 0.35). 317 sequences can be reliably assigned to a SCOP fold family based on structural analogy to existing proteins in PDB. The presented results, as a case study of E. coli, represent promising progress towards genome-wide structure modeling and fold family assignment using state-of-the-art ab initio folding algorithms. PMID:23719418

  4. Genome-wide association study of handedness excludes simple genetic models

    PubMed Central

    Armour, J AL; Davison, A; McManus, I C

    2014-01-01

    Handedness is a human behavioural phenotype that appears to be congenital, and is often assumed to be inherited, but for which the developmental origin and underlying causation(s) have been elusive. Models of the genetic basis of variation in handedness have been proposed that fit different features of the observed resemblance between relatives, but none has been decisively tested or a corresponding causative locus identified. In this study, we applied data from well-characterised individuals studied at the London Twin Research Unit. Analysis of genome-wide SNP data from 3940 twins failed to identify any locus associated with handedness at a genome-wide level of significance. The most straightforward interpretation of our analyses is that they exclude the simplest formulations of the ‘right-shift' model of Annett and the ‘dextral/chance' model of McManus, although more complex modifications of those models are still compatible with our observations. For polygenic effects, our study is inadequately powered to reliably detect alleles with effect sizes corresponding to an odds ratio of 1.2, but should have good power to detect effects at an odds ratio of 2 or more. PMID:24065183

  5. Genomic signatures of positive selection in humans and the limits of outlier approaches.

    PubMed

    Kelley, Joanna L; Madeoy, Jennifer; Calhoun, John C; Swanson, Willie; Akey, Joshua M

    2006-08-01

    Identifying regions of the human genome that have been targets of positive selection will provide important insights into recent human evolutionary history and may facilitate the search for complex disease genes. However, the confounding effects of population demographic history and selection on patterns of genetic variation complicate inferences of selection when a small number of loci are studied. To this end, identifying outlier loci from empirical genome-wide distributions of genetic variation is a promising strategy to detect targets of selection. Here, we evaluate the power and efficiency of a simple outlier approach and describe a genome-wide scan for positive selection using a dense catalog of 1.58 million SNPs that were genotyped in three human populations. In total, we analyzed 14,589 genes, 385 of which possess patterns of genetic variation consistent with the hypothesis of positive selection. Furthermore, several extended genomic regions were found, spanning >500 kb, that contained multiple contiguous candidate selection genes. More generally, these data provide important practical insights into the limits of outlier approaches in genome-wide scans for selection, provide strong candidate selection genes to study in greater detail, and may have important implications for disease related research.

  6. Exploiting the Proteome to Improve the Genome-Wide Genetic Analysis of Epistasis in Common Human Diseases

    PubMed Central

    Pattin, Kristine A.; Moore, Jason H.

    2009-01-01

    One of the central goals of human genetics is the identification of loci with alleles or genotypes that confer increased susceptibility. The availability of dense maps of single-nucleotide polymorphisms (SNPs) along with high-throughput genotyping technologies has set the stage for routine genome-wide association studies that are expected to significantly improve our ability to identify susceptibility loci. Before this promise can be realized, there are some significant challenges that need to be addressed. We address here the challenge of detecting epistasis or gene-gene interactions in genome-wide association studies. Discovering epistatic interactions in high dimensional datasets remains a challenge due to the computational complexity resulting from the analysis of all possible combinations of SNPs. One potential way to overcome the computational burden of a genome-wide epistasis analysis would be to devise a logical way to prioritize the many SNPs in a dataset so that the data may be analyzed more efficiently and yet still retain important biological information. One of the strongest demonstrations of the functional relationship between genes is protein-protein interaction. Thus, it is plausible that the expert knowledge extracted from protein interaction databases may allow for a more efficient analysis of genome-wide studies as well as facilitate the biological interpretation of the data. In this review we will discuss the challenges of detecting epistasis in genome-wide genetic studies and the means by which we propose to apply expert knowledge extracted from protein interaction databases to facilitate this process. We explore some of the fundamentals of protein interactions and the databases that are publicly available. PMID:18551320

  7. Genome-wide Association for Major Depression Through Age at Onset Stratification: Major Depressive Disorder Working Group of the Psychiatric Genomics Consortium.

    PubMed

    Power, Robert A; Tansey, Katherine E; Buttenschøn, Henriette Nørmølle; Cohen-Woods, Sarah; Bigdeli, Tim; Hall, Lynsey S; Kutalik, Zoltán; Lee, S Hong; Ripke, Stephan; Steinberg, Stacy; Teumer, Alexander; Viktorin, Alexander; Wray, Naomi R; Arolt, Volker; Baune, Bernard T; Boomsma, Dorret I; Børglum, Anders D; Byrne, Enda M; Castelao, Enrique; Craddock, Nick; Craig, Ian W; Dannlowski, Udo; Deary, Ian J; Degenhardt, Franziska; Forstner, Andreas J; Gordon, Scott D; Grabe, Hans J; Grove, Jakob; Hamilton, Steven P; Hayward, Caroline; Heath, Andrew C; Hocking, Lynne J; Homuth, Georg; Hottenga, Jouke J; Kloiber, Stefan; Krogh, Jesper; Landén, Mikael; Lang, Maren; Levinson, Douglas F; Lichtenstein, Paul; Lucae, Susanne; MacIntyre, Donald J; Madden, Pamela; Magnusson, Patrik K E; Martin, Nicholas G; McIntosh, Andrew M; Middeldorp, Christel M; Milaneschi, Yuri; Montgomery, Grant W; Mors, Ole; Müller-Myhsok, Bertram; Nyholt, Dale R; Oskarsson, Hogni; Owen, Michael J; Padmanabhan, Sandosh; Penninx, Brenda W J H; Pergadia, Michele L; Porteous, David J; Potash, James B; Preisig, Martin; Rivera, Margarita; Shi, Jianxin; Shyn, Stanley I; Sigurdsson, Engilbert; Smit, Johannes H; Smith, Blair H; Stefansson, Hreinn; Stefansson, Kari; Strohmaier, Jana; Sullivan, Patrick F; Thomson, Pippa; Thorgeirsson, Thorgeir E; Van der Auwera, Sandra; Weissman, Myrna M; Breen, Gerome; Lewis, Cathryn M

    2017-02-15

    Major depressive disorder (MDD) is a disabling mood disorder, and despite a known heritable component, a large meta-analysis of genome-wide association studies revealed no replicable genetic risk variants. Given prior evidence of heterogeneity by age at onset in MDD, we tested whether genome-wide significant risk variants for MDD could be identified in cases subdivided by age at onset. Discovery case-control genome-wide association studies were performed where cases were stratified using increasing/decreasing age-at-onset cutoffs; significant single nucleotide polymorphisms were tested in nine independent replication samples, giving a total sample of 22,158 cases and 133,749 control subjects for subsetting. Polygenic score analysis was used to examine whether differences in shared genetic risk exists between earlier and adult-onset MDD with commonly comorbid disorders of schizophrenia, bipolar disorder, Alzheimer's disease, and coronary artery disease. We identified one replicated genome-wide significant locus associated with adult-onset (>27 years) MDD (rs7647854, odds ratio: 1.16, 95% confidence interval: 1.11-1.21, p = 5.2 × 10 -11 ). Using polygenic score analyses, we show that earlier-onset MDD is genetically more similar to schizophrenia and bipolar disorder than adult-onset MDD. We demonstrate that using additional phenotype data previously collected by genetic studies to tackle phenotypic heterogeneity in MDD can successfully lead to the discovery of genetic risk factor despite reduced sample size. Furthermore, our results suggest that the genetic susceptibility to MDD differs between adult- and earlier-onset MDD, with earlier-onset cases having a greater genetic overlap with schizophrenia and bipolar disorder. Copyright © 2016 Society of Biological Psychiatry. Published by Elsevier Inc. All rights reserved.

  8. Challenges and Opportunities in Genome-Wide Environmental Interaction (GWEI) studies

    PubMed Central

    Aschard, Hugues; Lutz, Sharon; Maus, Bärbel; Duell, Eric J.; Fingerlin, Tasha; Chatterjee, Nilanjan; Kraft, Peter; Van Steen, Kristel

    2012-01-01

    The interest in performing gene-environment interaction studies has seen a significant increase with the increase of advanced molecular genetics techniques. Practically, it became possible to investigate the role of environmental factors in disease risk and hence to investigate their role as genetic effect modifiers. The understanding that genetics is important in the uptake and metabolism of toxic substances is an example of how genetic profiles can modify important environmental risk factors to disease. Several rationales exist to set up gene-environment interaction studies and the technical challenges related to these studies – when the number of environmental or genetic risk factors is relatively small – has been described before. In the post-genomic era, it is now possible to study thousands of genes and their interaction with the environment. This brings along a whole range of new challenges and opportunities. Despite a continuing effort in developing efficient methods and optimal bioinformatics infrastructures to deal with the available wealth of data, the challenge remains how to best present and analyze Genome-Wide Environmental Interaction (GWEI) studies involving multiple genetic and environmental factors. Since GWEIs are performed at the intersection of statistical genetics, bioinformatics and epidemiology, usually similar problems need to be dealt with as for Genome-Wide Association gene-gene Interaction (GWAI) studies. However, additional complexities need to be considered which are typical for large-scale epidemiological studies, but are also related to “joining” two heterogeneous types of data in explaining complex disease trait variation or for prediction purposes. PMID:22760307

  9. Genome-wide association of meat quality traits and tenderness in swine

    USDA-ARS?s Scientific Manuscript database

    Pork quality has a large impact on consumer preference and perception of eating quality. A genome-wide association was performed for pork quality traits [intramuscular fat (IMF)], slice shear force (SSF), color attributes, purge, cooking loss, and pH] from 531 to 1,237 records on barrows and gilts o...

  10. Mapping the sensory perception of apple using descriptive sensory evaluation in a genome wide association study

    PubMed Central

    Amyotte, Beatrice; Bowen, Amy J.; Banks, Travis; Rajcan, Istvan; Somers, Daryl J.

    2017-01-01

    Breeding apples is a long-term endeavour and it is imperative that new cultivars are selected to have outstanding consumer appeal. This study has taken the approach of merging sensory science with genome wide association analyses in order to map the human perception of apple flavour and texture onto the apple genome. The goal was to identify genomic associations that could be used in breeding apples for improved fruit quality. A collection of 85 apple cultivars was examined over two years through descriptive sensory evaluation by a trained sensory panel. The trained sensory panel scored randomized sliced samples of each apple cultivar for seventeen taste, flavour and texture attributes using controlled sensory evaluation practices. In addition, the apple collection was subjected to genotyping by sequencing for marker discovery. A genome wide association analysis suggested significant genomic associations for several sensory traits including juiciness, crispness, mealiness and fresh green apple flavour. The findings include previously unreported genomic regions that could be used in apple breeding and suggest that similar sensory association mapping methods could be applied in other plants. PMID:28231290

  11. Mapping the sensory perception of apple using descriptive sensory evaluation in a genome wide association study.

    PubMed

    Amyotte, Beatrice; Bowen, Amy J; Banks, Travis; Rajcan, Istvan; Somers, Daryl J

    2017-01-01

    Breeding apples is a long-term endeavour and it is imperative that new cultivars are selected to have outstanding consumer appeal. This study has taken the approach of merging sensory science with genome wide association analyses in order to map the human perception of apple flavour and texture onto the apple genome. The goal was to identify genomic associations that could be used in breeding apples for improved fruit quality. A collection of 85 apple cultivars was examined over two years through descriptive sensory evaluation by a trained sensory panel. The trained sensory panel scored randomized sliced samples of each apple cultivar for seventeen taste, flavour and texture attributes using controlled sensory evaluation practices. In addition, the apple collection was subjected to genotyping by sequencing for marker discovery. A genome wide association analysis suggested significant genomic associations for several sensory traits including juiciness, crispness, mealiness and fresh green apple flavour. The findings include previously unreported genomic regions that could be used in apple breeding and suggest that similar sensory association mapping methods could be applied in other plants.

  12. Genome-wide characterization of centromeric satellites from multiple mammalian genomes.

    PubMed

    Alkan, Can; Cardone, Maria Francesca; Catacchio, Claudia Rita; Antonacci, Francesca; O'Brien, Stephen J; Ryder, Oliver A; Purgato, Stefania; Zoli, Monica; Della Valle, Giuliano; Eichler, Evan E; Ventura, Mario

    2011-01-01

    Despite its importance in cell biology and evolution, the centromere has remained the final frontier in genome assembly and annotation due to its complex repeat structure. However, isolation and characterization of the centromeric repeats from newly sequenced species are necessary for a complete understanding of genome evolution and function. In recent years, various genomes have been sequenced, but the characterization of the corresponding centromeric DNA has lagged behind. Here, we present a computational method (RepeatNet) to systematically identify higher-order repeat structures from unassembled whole-genome shotgun sequence and test whether these sequence elements correspond to functional centromeric sequences. We analyzed genome datasets from six species of mammals representing the diversity of the mammalian lineage, namely, horse, dog, elephant, armadillo, opossum, and platypus. We define candidate monomer satellite repeats and demonstrate centromeric localization for five of the six genomes. Our analysis revealed the greatest diversity of centromeric sequences in horse and dog in contrast to elephant and armadillo, which showed high-centromeric sequence homogeneity. We could not isolate centromeric sequences within the platypus genome, suggesting that centromeres in platypus are not enriched in satellite DNA. Our method can be applied to the characterization of thousands of other vertebrate genomes anticipated for sequencing in the near future, providing an important tool for annotation of centromeres.

  13. [Genome-wide association study for adolescent idiopathic scoliosis].

    PubMed

    Ogura, Yoji; Kou, Ikuyo; Scoliosis, Japan; Matsumoto, Morio; Watanabe, Kota; Ikegawa, Shiro

    2016-04-01

    Adolescent idiopathic scoliosis(AIS)is a polygenic disease. Genome-wide association studies(GWASs)have been performed for a lot of polygenic diseases. For AIS, we conducted GWAS and identified the first AIS locus near LBX1. After the discovery, we have extended our study by increasing the numbers of subjects and SNPs. In total, our Japanese GWAS has identified four susceptibility genes. GWASs for AIS have also been performed in the USA and China, which identified one and three susceptibility genes, respectively. Here we review GWASs in Japan and abroad and functional analysis to clarify the pathomechanism of AIS.

  14. Placental genome and maternal-placental genetic interactions: a genome-wide and candidate gene association study of placental abruption.

    PubMed

    Denis, Marie; Enquobahrie, Daniel A; Tadesse, Mahlet G; Gelaye, Bizu; Sanchez, Sixto E; Salazar, Manuel; Ananth, Cande V; Williams, Michelle A

    2014-01-01

    While available evidence supports the role of genetics in the pathogenesis of placental abruption (PA), PA-related placental genome variations and maternal-placental genetic interactions have not been investigated. Maternal blood and placental samples collected from participants in the Peruvian Abruptio Placentae Epidemiology study were genotyped using Illumina's Cardio-Metabochip platform. We examined 118,782 genome-wide SNPs and 333 SNPs in 32 candidate genes from mitochondrial biogenesis and oxidative phosphorylation pathways in placental DNA from 280 PA cases and 244 controls. We assessed maternal-placental interactions in the candidate gene SNPS and two imprinted regions (IGF2/H19 and C19MC). Univariate and penalized logistic regression models were fit to estimate odds ratios. We examined the combined effect of multiple SNPs on PA risk using weighted genetic risk scores (WGRS) with repeated ten-fold cross-validations. A multinomial model was used to investigate maternal-placental genetic interactions. In placental genome-wide and candidate gene analyses, no SNP was significant after false discovery rate correction. The top genome-wide association study (GWAS) hits were rs544201, rs1484464 (CTNNA2), rs4149570 (TNFRSF1A) and rs13055470 (ZNRF3) (p-values: 1.11e-05 to 3.54e-05). The top 200 SNPs of the GWAS overrepresented genes involved in cell cycle, growth and proliferation. The top candidate gene hits were rs16949118 (COX10) and rs7609948 (THRB) (p-values: 6.00e-03 and 8.19e-03). Participants in the highest quartile of WGRS based on cross-validations using SNPs selected from the GWAS and candidate gene analyses had a 8.40-fold (95% CI: 5.8-12.56) and a 4.46-fold (95% CI: 2.94-6.72) higher odds of PA compared to participants in the lowest quartile. We found maternal-placental genetic interactions on PA risk for two SNPs in PPARG (chr3:12313450 and chr3:12412978) and maternal imprinting effects for multiple SNPs in the C19MC and IGF2/H19 regions. Variations in

  15. GEM System: automatic prototyping of cell-wide metabolic pathway models from genomes.

    PubMed

    Arakawa, Kazuharu; Yamada, Yohei; Shinoda, Kosaku; Nakayama, Yoichi; Tomita, Masaru

    2006-03-23

    Successful realization of a "systems biology" approach to analyzing cells is a grand challenge for our understanding of life. However, current modeling approaches to cell simulation are labor-intensive, manual affairs, and therefore constitute a major bottleneck in the evolution of computational cell biology. We developed the Genome-based Modeling (GEM) System for the purpose of automatically prototyping simulation models of cell-wide metabolic pathways from genome sequences and other public biological information. Models generated by the GEM System include an entire Escherichia coli metabolism model comprising 968 reactions of 1195 metabolites, achieving 100% coverage when compared with the KEGG database, 92.38% with the EcoCyc database, and 95.06% with iJR904 genome-scale model. The GEM System prototypes qualitative models to reduce the labor-intensive tasks required for systems biology research. Models of over 90 bacterial genomes are available at our web site.

  16. Genome-Wide Association Mapping and Genomic Selection for Alfalfa (Medicago sativa) Forage Quality Traits

    PubMed Central

    Pecetti, Luciano; Brummer, E. Charles; Palmonari, Alberto; Tava, Aldo

    2017-01-01

    Genetic progress for forage quality has been poor in alfalfa (Medicago sativa L.), the most-grown forage legume worldwide. This study aimed at exploring opportunities for marker-assisted selection (MAS) and genomic selection of forage quality traits based on breeding values of parent plants. Some 154 genotypes from a broadly-based reference population were genotyped by genotyping-by-sequencing (GBS), and phenotyped for leaf-to-stem ratio, leaf and stem contents of protein, neutral detergent fiber (NDF) and acid detergent lignin (ADL), and leaf and stem NDF digestibility after 24 hours (NDFD), of their dense-planted half-sib progenies in three growing conditions (summer harvest, full irrigation; summer harvest, suspended irrigation; autumn harvest). Trait-marker analyses were performed on progeny values averaged over conditions, owing to modest germplasm × condition interaction. Genomic selection exploited 11,450 polymorphic SNP markers, whereas a subset of 8,494 M. truncatula-aligned markers were used for a genome-wide association study (GWAS). GWAS confirmed the polygenic control of quality traits and, in agreement with phenotypic correlations, indicated substantially different genetic control of a given trait in stems and leaves. It detected several SNPs in different annotated genes that were highly linked to stem protein content. Also, it identified a small genomic region on chromosome 8 with high concentration of annotated genes associated with leaf ADL, including one gene probably involved in the lignin pathway. Three genomic selection models, i.e., Ridge-regression BLUP, Bayes B and Bayesian Lasso, displayed similar prediction accuracy, whereas SVR-lin was less accurate. Accuracy values were moderate (0.3–0.4) for stem NDFD and leaf protein content, modest for leaf ADL and NDFD, and low to very low for the other traits. Along with previous results for the same germplasm set, this study indicates that GBS data can be exploited to improve both quality traits

  17. Genome-Wide Association Mapping and Genomic Selection for Alfalfa (Medicago sativa) Forage Quality Traits.

    PubMed

    Biazzi, Elisa; Nazzicari, Nelson; Pecetti, Luciano; Brummer, E Charles; Palmonari, Alberto; Tava, Aldo; Annicchiarico, Paolo

    2017-01-01

    Genetic progress for forage quality has been poor in alfalfa (Medicago sativa L.), the most-grown forage legume worldwide. This study aimed at exploring opportunities for marker-assisted selection (MAS) and genomic selection of forage quality traits based on breeding values of parent plants. Some 154 genotypes from a broadly-based reference population were genotyped by genotyping-by-sequencing (GBS), and phenotyped for leaf-to-stem ratio, leaf and stem contents of protein, neutral detergent fiber (NDF) and acid detergent lignin (ADL), and leaf and stem NDF digestibility after 24 hours (NDFD), of their dense-planted half-sib progenies in three growing conditions (summer harvest, full irrigation; summer harvest, suspended irrigation; autumn harvest). Trait-marker analyses were performed on progeny values averaged over conditions, owing to modest germplasm × condition interaction. Genomic selection exploited 11,450 polymorphic SNP markers, whereas a subset of 8,494 M. truncatula-aligned markers were used for a genome-wide association study (GWAS). GWAS confirmed the polygenic control of quality traits and, in agreement with phenotypic correlations, indicated substantially different genetic control of a given trait in stems and leaves. It detected several SNPs in different annotated genes that were highly linked to stem protein content. Also, it identified a small genomic region on chromosome 8 with high concentration of annotated genes associated with leaf ADL, including one gene probably involved in the lignin pathway. Three genomic selection models, i.e., Ridge-regression BLUP, Bayes B and Bayesian Lasso, displayed similar prediction accuracy, whereas SVR-lin was less accurate. Accuracy values were moderate (0.3-0.4) for stem NDFD and leaf protein content, modest for leaf ADL and NDFD, and low to very low for the other traits. Along with previous results for the same germplasm set, this study indicates that GBS data can be exploited to improve both quality traits

  18. Genome-wide methylation analysis identifies genes silenced in non-seminoma cell lines

    PubMed Central

    Noor, Dzul Azri Mohamed; Jeyapalan, Jennie N; Alhazmi, Safiah; Carr, Matthew; Squibb, Benjamin; Wallace, Claire; Tan, Christopher; Cusack, Martin; Hughes, Jaime; Reader, Tom; Shipley, Janet; Sheer, Denise; Scotting, Paul J

    2016-01-01

    Silencing of genes by DNA methylation is a common phenomenon in many types of cancer. However, the genome-wide effect of DNA methylation on gene expression has been analysed in relatively few cancers. Germ cell tumours (GCTs) are a complex group of malignancies. They are unique in developing from a pluripotent progenitor cell. Previous analyses have suggested that non-seminomas exhibit much higher levels of DNA methylation than seminomas. The genomic targets that are methylated, the extent to which this results in gene silencing and the identity of the silenced genes most likely to play a role in the tumours’ biology have not yet been established. In this study, genome-wide methylation and expression analysis of GCT cell lines was combined with gene expression data from primary tumours to address this question. Genome methylation was analysed using the Illumina infinium HumanMethylome450 bead chip system and gene expression was analysed using Affymetrix GeneChip Human Genome U133 Plus 2.0 arrays. Regulation by methylation was confirmed by demethylation using 5-aza-2-deoxycytidine and reverse transcription–quantitative PCR. Large differences in the level of methylation of the CpG islands of individual genes between tumour cell lines correlated well with differential gene expression. Treatment of non-seminoma cells with 5-aza-2-deoxycytidine verified that methylation of all genes tested played a role in their silencing in yolk sac tumour cells and many of these genes were also differentially expressed in primary tumours. Genes silenced by methylation in the various GCT cell lines were identified. Several pluripotency-associated genes were identified as a major functional group of silenced genes. PMID:29263807

  19. Genome-wide methylation analysis identifies genes silenced in non-seminoma cell lines.

    PubMed

    Noor, Dzul Azri Mohamed; Jeyapalan, Jennie N; Alhazmi, Safiah; Carr, Matthew; Squibb, Benjamin; Wallace, Claire; Tan, Christopher; Cusack, Martin; Hughes, Jaime; Reader, Tom; Shipley, Janet; Sheer, Denise; Scotting, Paul J

    2016-01-01

    Silencing of genes by DNA methylation is a common phenomenon in many types of cancer. However, the genome-wide effect of DNA methylation on gene expression has been analysed in relatively few cancers. Germ cell tumours (GCTs) are a complex group of malignancies. They are unique in developing from a pluripotent progenitor cell. Previous analyses have suggested that non-seminomas exhibit much higher levels of DNA methylation than seminomas. The genomic targets that are methylated, the extent to which this results in gene silencing and the identity of the silenced genes most likely to play a role in the tumours' biology have not yet been established. In this study, genome-wide methylation and expression analysis of GCT cell lines was combined with gene expression data from primary tumours to address this question. Genome methylation was analysed using the Illumina infinium HumanMethylome450 bead chip system and gene expression was analysed using Affymetrix GeneChip Human Genome U133 Plus 2.0 arrays. Regulation by methylation was confirmed by demethylation using 5-aza-2-deoxycytidine and reverse transcription-quantitative PCR. Large differences in the level of methylation of the CpG islands of individual genes between tumour cell lines correlated well with differential gene expression. Treatment of non-seminoma cells with 5-aza-2-deoxycytidine verified that methylation of all genes tested played a role in their silencing in yolk sac tumour cells and many of these genes were also differentially expressed in primary tumours. Genes silenced by methylation in the various GCT cell lines were identified. Several pluripotency-associated genes were identified as a major functional group of silenced genes.

  20. Genome-wide detection of intervals of genetic heterogeneity associated with complex traits

    PubMed Central

    Llinares-López, Felipe; Grimm, Dominik G.; Bodenham, Dean A.; Gieraths, Udo; Sugiyama, Mahito; Rowan, Beth; Borgwardt, Karsten

    2015-01-01

    Motivation: Genetic heterogeneity, the fact that several sequence variants give rise to the same phenotype, is a phenomenon that is of the utmost interest in the analysis of complex phenotypes. Current approaches for finding regions in the genome that exhibit genetic heterogeneity suffer from at least one of two shortcomings: (i) they require the definition of an exact interval in the genome that is to be tested for genetic heterogeneity, potentially missing intervals of high relevance, or (ii) they suffer from an enormous multiple hypothesis testing problem due to the large number of potential candidate intervals being tested, which results in either many false positives or a lack of power to detect true intervals. Results: Here, we present an approach that overcomes both problems: it allows one to automatically find all contiguous sequences of single nucleotide polymorphisms in the genome that are jointly associated with the phenotype. It also solves both the inherent computational efficiency problem and the statistical problem of multiple hypothesis testing, which are both caused by the huge number of candidate intervals. We demonstrate on Arabidopsis thaliana genome-wide association study data that our approach can discover regions that exhibit genetic heterogeneity and would be missed by single-locus mapping. Conclusions: Our novel approach can contribute to the genome-wide discovery of intervals that are involved in the genetic heterogeneity underlying complex phenotypes. Availability and implementation: The code can be obtained at: http://www.bsse.ethz.ch/mlcb/research/bioinformatics-and-computational-biology/sis.html. Contact: felipe.llinares@bsse.ethz.ch Supplementary information: Supplementary data are available at Bioinformatics online. PMID:26072488

  1. Genome-wide association analysis for feed efficiency in Angus cattle.

    PubMed

    Rolf, M M; Taylor, J F; Schnabel, R D; McKay, S D; McClure, M C; Northcutt, S L; Kerley, M S; Weaber, R L

    2012-08-01

    Estimated breeding values for average daily feed intake (AFI; kg/day), residual feed intake (RFI; kg/day) and average daily gain (ADG; kg/day) were generated using a mixed linear model incorporating genomic relationships for 698 Angus steers genotyped with the Illumina BovineSNP50 assay. Association analyses of estimated breeding values (EBVs) were performed for 41,028 single nucleotide polymorphisms (SNPs), and permutation analysis was used to empirically establish the genome-wide significance threshold (P < 0.05) for each trait. SNPs significantly associated with each trait were used in a forward selection algorithm to identify genomic regions putatively harbouring genes with effects on each trait. A total of 53, 66 and 68 SNPs explained 54.12% (24.10%), 62.69% (29.85%) and 55.13% (26.54%) of the additive genetic variation (when accounting for the genomic relationships) in steer breeding values for AFI, RFI and ADG, respectively, within this population. Evaluation by pathway analysis revealed that many of these SNPs are in genomic regions that harbour genes with metabolic functions. The presence of genetic correlations between traits resulted in 13.2% of SNPs selected for AFI and 4.5% of SNPs selected for RFI also being selected for ADG in the analysis of breeding values. While our study identifies panels of SNPs significant for efficiency traits in our population, validation of all SNPs in independent populations will be necessary before commercialization. © 2011 The Authors, Animal Genetics © 2011 Stichting International Foundation for Animal Genetics.

  2. Genetics of Genome-Wide Recombination Rate Evolution in Mice from an Isolated Island.

    PubMed

    Wang, Richard J; Payseur, Bret A

    2017-08-01

    Recombination rate is a heritable quantitative trait that evolves despite the fundamentally conserved role that recombination plays in meiosis. Differences in recombination rate can alter the landscape of the genome and the genetic diversity of populations. Yet our understanding of the genetic basis of recombination rate evolution in nature remains limited. We used wild house mice ( Mus musculus domesticus ) from Gough Island (GI), which diverged recently from their mainland counterparts, to characterize the genetics of recombination rate evolution. We quantified genome-wide autosomal recombination rates by immunofluorescence cytology in spermatocytes from 240 F 2 males generated from intercrosses between GI-derived mice and the wild-derived inbred strain WSB/EiJ. We identified four quantitative trait loci (QTL) responsible for inter-F 2 variation in this trait, the strongest of which had effects that opposed the direction of the parental trait differences. Candidate genes and mutations for these QTL were identified by overlapping the detected intervals with whole-genome sequencing data and publicly available transcriptomic profiles from spermatocytes. Combined with existing studies, our findings suggest that genome-wide recombination rate divergence is not directional and its evolution within and between subspecies proceeds from distinct genetic loci. Copyright © 2017 by the Genetics Society of America.

  3. Genome-wide methylation study of diploid and triploid brown trout (Salmo trutta L.).

    PubMed

    Covelo-Soto, L; Leunda, P M; Pérez-Figueroa, A; Morán, P

    2015-06-01

    The induction of triploidization in fish is a very common practice in aquaculture. Although triploidization has been applied successfully in many salmonid species, little is known about the epigenetic mechanisms implicated in the maintenance of the normal functions of the new polyploid genome. By means of methylation-sensitive amplified polymorphism (MSAP) techniques, genome-wide methylation changes associated with triploidization were assessed in DNA samples obtained from diploid and triploid siblings of brown trout (Salmo trutta). Simple comparative body measurements showed that the triploid trout used in the study were statistically bigger, however, not heavier than their diploid counterparts. The statistical analysis of the MSAP data showed no significant differences between diploid and triploid brown trout in respect to brain, gill, heart, liver, kidney or muscle samples. Nonetheless, local analysis pointed to the possibility of differences in connection with concrete loci. This is the first study that has investigated DNA methylation alterations associated with triploidization in brown trout. Our results set the basis for new studies to be undertaken and provide a new approach concerning triploidization effects of the salmonid genome while also contributing to the better understanding of the genome-wide methylation processes. © 2015 Stichting International Foundation for Animal Genetics.

  4. Genome-Wide Networks of Amino Acid Covariances Are Common among Viruses

    PubMed Central

    Donlin, Maureen J.; Szeto, Brandon; Gohara, David W.; Aurora, Rajeev

    2012-01-01

    Coordinated variation among positions in amino acid sequence alignments can reveal genetic dependencies at noncontiguous positions, but methods to assess these interactions are incompletely developed. Previously, we found genome-wide networks of covarying residue positions in the hepatitis C virus genome (R. Aurora, M. J. Donlin, N. A. Cannon, and J. E. Tavis, J. Clin. Invest. 119:225–236, 2009). Here, we asked whether such networks are present in a diverse set of viruses and, if so, what they may imply about viral biology. Viral sequences were obtained for 16 viruses in 13 species from 9 families. The entire viral coding potential for each virus was aligned, all possible amino acid covariances were identified using the observed-minus-expected-squared algorithm at a false-discovery rate of ≤1%, and networks of covariances were assessed using standard methods. Covariances that spanned the viral coding potential were common in all viruses. In all cases, the covariances formed a single network that contained essentially all of the covariances. The hepatitis C virus networks had hub-and-spoke topologies, but all other networks had random topologies with an unusually large number of highly connected nodes. These results indicate that genome-wide networks of genetic associations and the coordinated evolution they imply are very common in viral genomes, that the networks rarely have the hub-and-spoke topology that dominates other biological networks, and that network topologies can vary substantially even within a given viral group. Five examples with hepatitis B virus and poliovirus are presented to illustrate how covariance network analysis can lead to inferences about viral biology. PMID:22238298

  5. Large meta-analysis of genome-wide association studies identifies five loci for lean body mass.

    PubMed

    Zillikens, M Carola; Demissie, Serkalem; Hsu, Yi-Hsiang; Yerges-Armstrong, Laura M; Chou, Wen-Chi; Stolk, Lisette; Livshits, Gregory; Broer, Linda; Johnson, Toby; Koller, Daniel L; Kutalik, Zoltán; Luan, Jian'an; Malkin, Ida; Ried, Janina S; Smith, Albert V; Thorleifsson, Gudmar; Vandenput, Liesbeth; Hua Zhao, Jing; Zhang, Weihua; Aghdassi, Ali; Åkesson, Kristina; Amin, Najaf; Baier, Leslie J; Barroso, Inês; Bennett, David A; Bertram, Lars; Biffar, Rainer; Bochud, Murielle; Boehnke, Michael; Borecki, Ingrid B; Buchman, Aron S; Byberg, Liisa; Campbell, Harry; Campos Obanda, Natalia; Cauley, Jane A; Cawthon, Peggy M; Cederberg, Henna; Chen, Zhao; Cho, Nam H; Jin Choi, Hyung; Claussnitzer, Melina; Collins, Francis; Cummings, Steven R; De Jager, Philip L; Demuth, Ilja; Dhonukshe-Rutten, Rosalie A M; Diatchenko, Luda; Eiriksdottir, Gudny; Enneman, Anke W; Erdos, Mike; Eriksson, Johan G; Eriksson, Joel; Estrada, Karol; Evans, Daniel S; Feitosa, Mary F; Fu, Mao; Garcia, Melissa; Gieger, Christian; Girke, Thomas; Glazer, Nicole L; Grallert, Harald; Grewal, Jagvir; Han, Bok-Ghee; Hanson, Robert L; Hayward, Caroline; Hofman, Albert; Hoffman, Eric P; Homuth, Georg; Hsueh, Wen-Chi; Hubal, Monica J; Hubbard, Alan; Huffman, Kim M; Husted, Lise B; Illig, Thomas; Ingelsson, Erik; Ittermann, Till; Jansson, John-Olov; Jordan, Joanne M; Jula, Antti; Karlsson, Magnus; Khaw, Kay-Tee; Kilpeläinen, Tuomas O; Klopp, Norman; Kloth, Jacqueline S L; Koistinen, Heikki A; Kraus, William E; Kritchevsky, Stephen; Kuulasmaa, Teemu; Kuusisto, Johanna; Laakso, Markku; Lahti, Jari; Lang, Thomas; Langdahl, Bente L; Launer, Lenore J; Lee, Jong-Young; Lerch, Markus M; Lewis, Joshua R; Lind, Lars; Lindgren, Cecilia; Liu, Yongmei; Liu, Tian; Liu, Youfang; Ljunggren, Östen; Lorentzon, Mattias; Luben, Robert N; Maixner, William; McGuigan, Fiona E; Medina-Gomez, Carolina; Meitinger, Thomas; Melhus, Håkan; Mellström, Dan; Melov, Simon; Michaëlsson, Karl; Mitchell, Braxton D; Morris, Andrew P; Mosekilde, Leif; Newman, Anne; Nielson, Carrie M; O'Connell, Jeffrey R; Oostra, Ben A; Orwoll, Eric S; Palotie, Aarno; Parker, Stephen C J; Peacock, Munro; Perola, Markus; Peters, Annette; Polasek, Ozren; Prince, Richard L; Räikkönen, Katri; Ralston, Stuart H; Ripatti, Samuli; Robbins, John A; Rotter, Jerome I; Rudan, Igor; Salomaa, Veikko; Satterfield, Suzanne; Schadt, Eric E; Schipf, Sabine; Scott, Laura; Sehmi, Joban; Shen, Jian; Soo Shin, Chan; Sigurdsson, Gunnar; Smith, Shad; Soranzo, Nicole; Stančáková, Alena; Steinhagen-Thiessen, Elisabeth; Streeten, Elizabeth A; Styrkarsdottir, Unnur; Swart, Karin M A; Tan, Sian-Tsung; Tarnopolsky, Mark A; Thompson, Patricia; Thomson, Cynthia A; Thorsteinsdottir, Unnur; Tikkanen, Emmi; Tranah, Gregory J; Tuomilehto, Jaakko; van Schoor, Natasja M; Verma, Arjun; Vollenweider, Peter; Völzke, Henry; Wactawski-Wende, Jean; Walker, Mark; Weedon, Michael N; Welch, Ryan; Wichmann, H-Erich; Widen, Elisabeth; Williams, Frances M K; Wilson, James F; Wright, Nicole C; Xie, Weijia; Yu, Lei; Zhou, Yanhua; Chambers, John C; Döring, Angela; van Duijn, Cornelia M; Econs, Michael J; Gudnason, Vilmundur; Kooner, Jaspal S; Psaty, Bruce M; Spector, Timothy D; Stefansson, Kari; Rivadeneira, Fernando; Uitterlinden, André G; Wareham, Nicholas J; Ossowski, Vicky; Waterworth, Dawn; Loos, Ruth J F; Karasik, David; Harris, Tamara B; Ohlsson, Claes; Kiel, Douglas P

    2017-07-19

    Lean body mass, consisting mostly of skeletal muscle, is important for healthy aging. We performed a genome-wide association study for whole body (20 cohorts of European ancestry with n = 38,292) and appendicular (arms and legs) lean body mass (n = 28,330) measured using dual energy X-ray absorptiometry or bioelectrical impedance analysis, adjusted for sex, age, height, and fat mass. Twenty-one single-nucleotide polymorphisms were significantly associated with lean body mass either genome wide (p < 5 × 10 -8 ) or suggestively genome wide (p < 2.3 × 10 -6 ). Replication in 63,475 (47,227 of European ancestry) individuals from 33 cohorts for whole body lean body mass and in 45,090 (42,360 of European ancestry) subjects from 25 cohorts for appendicular lean body mass was successful for five single-nucleotide polymorphisms in/near HSD17B11, VCAN, ADAMTSL3, IRS1, and FTO for total lean body mass and for three single-nucleotide polymorphisms in/near VCAN, ADAMTSL3, and IRS1 for appendicular lean body mass. Our findings provide new insight into the genetics of lean body mass.Lean body mass is a highly heritable trait and is associated with various health conditions. Here, Kiel and colleagues perform a meta-analysis of genome-wide association studies for whole body lean body mass and find five novel genetic loci to be significantly associated.

  6. Ensembl Genomes 2013: scaling up access to genome-wide data

    USDA-ARS?s Scientific Manuscript database

    Ensembl Genomes (http://www.ensemblgenomes.org) is an integrating resource for genome-scale data from non-vertebrate species. The project exploits and extends technologies for genome annotation, analysis and dissemination, developed in the context of the vertebrate-focused Ensembl project, and provi...

  7. Genome-Wide Identification of Regulatory Sequences Undergoing Accelerated Evolution in the Human Genome.

    PubMed

    Dong, Xinran; Wang, Xiao; Zhang, Feng; Tian, Weidong

    2016-10-01

    Accelerated evolution of regulatory sequence can alter the expression pattern of target genes, and cause phenotypic changes. In this study, we used DNase I hypersensitive sites (DHSs) to annotate putative regulatory sequences in the human genome, and conducted a genome-wide analysis of the effects of accelerated evolution on regulatory sequences. Working under the assumption that local ancient repeat elements of DHSs are under neutral evolution, we discovered that ∼0.44% of DHSs are under accelerated evolution (ace-DHSs). We found that ace-DHSs tend to be more active than background DHSs, and are strongly associated with epigenetic marks of active transcription. The target genes of ace-DHSs are significantly enriched in neuron-related functions, and their expression levels are positively selected in the human brain. Thus, these lines of evidences strongly suggest that accelerated evolution on regulatory sequences plays important role in the evolution of human-specific phenotypes. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  8. Anonymization of electronic medical records for validating genome-wide association studies

    PubMed Central

    Loukides, Grigorios; Gkoulalas-Divanis, Aris; Malin, Bradley

    2010-01-01

    Genome-wide association studies (GWAS) facilitate the discovery of genotype–phenotype relations from population-based sequence databases, which is an integral facet of personalized medicine. The increasing adoption of electronic medical records allows large amounts of patients’ standardized clinical features to be combined with the genomic sequences of these patients and shared to support validation of GWAS findings and to enable novel discoveries. However, disseminating these data “as is” may lead to patient reidentification when genomic sequences are linked to resources that contain the corresponding patients’ identity information based on standardized clinical features. This work proposes an approach that provably prevents this type of data linkage and furnishes a result that helps support GWAS. Our approach automatically extracts potentially linkable clinical features and modifies them in a way that they can no longer be used to link a genomic sequence to a small number of patients, while preserving the associations between genomic sequences and specific sets of clinical features corresponding to GWAS-related diseases. Extensive experiments with real patient data derived from the Vanderbilt's University Medical Center verify that our approach generates data that eliminate the threat of individual reidentification, while supporting GWAS validation and clinical case analysis tasks. PMID:20385806

  9. Genome wide selection in Citrus breeding.

    PubMed

    Gois, I B; Borém, A; Cristofani-Yaly, M; de Resende, M D V; Azevedo, C F; Bastianel, M; Novelli, V M; Machado, M A

    2016-10-17

    Genome wide selection (GWS) is essential for the genetic improvement of perennial species such as Citrus because of its ability to increase gain per unit time and to enable the efficient selection of characteristics with low heritability. This study assessed GWS efficiency in a population of Citrus and compared it with selection based on phenotypic data. A total of 180 individual trees from a cross between Pera sweet orange (Citrus sinensis Osbeck) and Murcott tangor (Citrus sinensis Osbeck x Citrus reticulata Blanco) were evaluated for 10 characteristics related to fruit quality. The hybrids were genotyped using 5287 DArT_seq TM (diversity arrays technology) molecular markers and their effects on phenotypes were predicted using the random regression - best linear unbiased predictor (rr-BLUP) method. The predictive ability, prediction bias, and accuracy of GWS were estimated to verify its effectiveness for phenotype prediction. The proportion of genetic variance explained by the markers was also computed. The heritability of the traits, as determined by markers, was 16-28%. The predictive ability of these markers ranged from 0.53 to 0.64, and the regression coefficients between predicted and observed phenotypes were close to unity. Over 35% of the genetic variance was accounted for by the markers. Accuracy estimates with GWS were lower than those obtained by phenotypic analysis; however, GWS was superior in terms of genetic gain per unit time. Thus, GWS may be useful for Citrus breeding as it can predict phenotypes early and accurately, and reduce the length of the selection cycle. This study demonstrates the feasibility of genomic selection in Citrus.

  10. Genotypic variants at 2q33 and risk of esophageal squamous cell carcinoma in China: a meta-analysis of genome-wide association studies

    PubMed Central

    Abnet, Christian C.; Wang, Zhaoming; Song, Xin; Hu, Nan; Zhou, Fu-You; Freedman, Neal D.; Li, Xue-Min; Yu, Kai; Shu, Xiao-Ou; Yuan, Jian-Min; Zheng, Wei; Dawsey, Sanford M.; Liao, Linda M.; Lee, Maxwell P.; Ding, Ti; Qiao, You-Lin; Gao, Yu-Tang; Koh, Woon-Puay; Xiang, Yong-Bing; Tang, Ze-Zhong; Fan, Jin-Hu; Chung, Charles C.; Wang, Chaoyu; Wheeler, William; Yeager, Meredith; Yuenger, Jeff; Hutchinson, Amy; Jacobs, Kevin B.; Giffen, Carol A.; Burdett, Laurie; Fraumeni, Joseph F.; Tucker, Margaret A.; Chow, Wong-Ho; Zhao, Xue-Ke; Li, Jiang-Man; Li, Ai-Li; Sun, Liang-Dan; Wei, Wu; Li, Ji-Lin; Zhang, Peng; Li, Hong-Lei; Cui, Wen-Yan; Wang, Wei-Peng; Liu, Zhi-Cai; Yang, Xia; Fu, Wen-Jing; Cui, Ji-Li; Lin, Hong-Li; Zhu, Wen-Liang; Liu, Min; Chen, Xi; Chen, Jie; Guo, Li; Han, Jing-Jing; Zhou, Sheng-Li; Huang, Jia; Wu, Yue; Yuan, Chao; Huang, Jing; Ji, Ai-Fang; Kul, Jian-Wei; Fan, Zhong-Min; Wang, Jian-Po; Zhang, Dong-Yun; Zhang, Lian-Qun; Zhang, Wei; Chen, Yuan-Fang; Ren, Jing-Li; Li, Xiu-Min; Dong, Jin-Cheng; Xing, Guo-Lan; Guo, Zhi-Gang; Yang, Jian-Xue; Mao, Yi-Ming; Yuan, Yuan; Guo, Er-Tao; Zhang, Wei; Hou, Zhi-Chao; Liu, Jing; Li, Yan; Tang, Sa; Chang, Jia; Peng, Xiu-Qin; Han, Min; Yin, Wan-Li; Liu, Ya-Li; Hu, Yan-Long; Liu, Yu; Yang, Liu-Qin; Zhu, Fu-Guo; Yang, Xiu-Feng; Feng, Xiao-Shan; Wang, Zhou; Li, Yin; Gao, She-Gan; Liu, Hai-Lin; Yuan, Ling; Jin, Yan; Zhang, Yan-Rui; Sheyhidin, Ilyar; Li, Feng; Chen, Bao-Ping; Ren, Shu-Wei; Liu, Bin; Li, Dan; Zhang, Gao-Fu; Yue, Wen-Bin; Feng, Chang-Wei; Qige, Qirenwang; Zhao, Jian-Ting; Yang, Wen-Jun; Lei, Guang-Yan; Chen, Long-Qi; Li, En-Min; Xu, Li-Yan; Wu, Zhi-Yong; Bao, Zhi-Qin; Chen, Ji-Li; Li, Xian-Chang; Zhuang, Xiang; Zhou, Ying-Fa; Zuo, Xian-Bo; Dong, Zi-Ming; Wang, Lu-Wen; Fan, Xue-Pin; Wang, Jin; Zhou, Qi; Ma, Guo-Shun; Zhang, Qin-Xian; Liu, Hai; Jian, Xin-Ying; Lian, Sin-Yong; Wang, Jin-Sheng; Chang, Fu-Bao; Lu, Chang-Dong; Miao, Jian-Jun; Chen, Zhi-Guo; Wang, Ran; Guo, Ming; Fan, Zeng-Lin; Tao, Ping; Liu, Tai-Jing; Wei, Jin-Chang; Kong, Qing-Peng; Fan, Lei; Wang, Xian-Zeng; Gao, Fu-Sheng; Wang, Tian-Yun; Xie, Dong; Wang, Li; Chen, Shu-Qing; Yang, Wan-Cai; Hong, Jun-Yan; Wang, Liang; Qiu, Song-Liang; Goldstein, Alisa M.; Yuan, Zhi-Qing; Chanock, Stephen J.; Zhang, Xue-Jun; Taylor, Philip R.; Wang, Li-Dong

    2012-01-01

    Genome-wide association studies have identified susceptibility loci for esophageal squamous cell carcinoma (ESCC). We conducted a meta-analysis of all single-nucleotide polymorphisms (SNPs) that showed nominally significant P-values in two previously published genome-wide scans that included a total of 2961 ESCC cases and 3400 controls. The meta-analysis revealed five SNPs at 2q33 with P< 5 × 10−8, and the strongest signal was rs13016963, with a combined odds ratio (95% confidence interval) of 1.29 (1.19–1.40) and P= 7.63 × 10−10. An imputation analysis of 4304 SNPs at 2q33 suggested a single association signal, and the strongest imputed SNP associations were similar to those from the genotyped SNPs. We conducted an ancestral recombination graph analysis with 53 SNPs to identify one or more haplotypes that harbor the variants directly responsible for the detected association signal. This showed that the five SNPs exist in a single haplotype along with 45 imputed SNPs in strong linkage disequilibrium, and the strongest candidate was rs10201587, one of the genotyped SNPs. Our meta-analysis found genome-wide significant SNPs at 2q33 that map to the CASP8/ALS2CR12/TRAK2 gene region. Variants in CASP8 have been extensively studied across a spectrum of cancers with mixed results. The locus we identified appears to be distinct from the widely studied rs3834129 and rs1045485 SNPs in CASP8. Future studies of esophageal and other cancers should focus on comprehensive sequencing of this 2q33 locus and functional analysis of rs13016963 and rs10201587 and other strongly correlated variants. PMID:22323360

  11. A Genome-Wide Association Study to Identify Genomic Modulators of Rate Control Therapy in Patients with Atrial Fibrillation

    PubMed Central

    Kolek, Matthew J.; Edwards, Todd L.; Muhammad, Raafia; Balouch, Adnan; Shoemaker, M. Benjamin; Blair, Marcia A.; Kor, Kaylen C.; Takahashi, Atsushi; Kubo, Michiaki; Roden, Dan M.; Tanaka, Toshihiro; Darbar, Dawood

    2014-01-01

    For many patients with atrial fibrillation (AF), ventricular rate control with atrioventricular (AV) nodal blockers is considered first-line therapy, though response to treatment is highly variable. Using an extreme phenotype of failure of rate control necessitating AV nodal ablation and pacemaker implantation, we conducted a genome wide association study (GWAS) to identify genomic modulators of rate control therapy. Cases included 95 patients who failed rate control therapy. Controls (N=190) achieved adequate rate control therapy with ≤2 AV nodal blockers using a conventional clinical definition. Genotyping was performed on the Illumina 610-Quad platform, and results were imputed to the 1000 Genomes reference haplotypes. 554,041 single nucleotide polymorphisms (SNPs) met criteria for minor allele frequency (>0.01), call rate (>95%), and quality control, and 6,055,224 SNPs were available after imputation. No SNP reached the canonical threshold for significance for GWAS of P<5 × 10−8. Sixty-three SNPs with P<10−5 at 6 genomic loci were genotyped in a validation cohort of 130 cases and 157 controls. These included 6q24.3 (near SAMD5/SASH1, P=9.36 × 10−8), 4q12 (IGFBP7, P=1.75 × 10−7), 6q22.33 (C6orf174, P=4.86 × 10−7), 3p21.31 (CDCP1, P=1.18 × 10−6), 12p12.1 (SOX5, P=1.62 × 10−6), and 7p11 (LANCL2, P=6.51 × 10−6). However, none of these were significant in the replication cohort or in a meta-analysis of both cohorts. In conclusion, we identified several potentially important genomic modulators of rate control therapy in AF, particularly SOX5, which was previously associated with resting heart rate and PR interval. However these failed to reach genome-wide significance. PMID:25015694

  12. A genome-wide association study to identify genomic modulators of rate control therapy in patients with atrial fibrillation.

    PubMed

    Kolek, Matthew J; Edwards, Todd L; Muhammad, Raafia; Balouch, Adnan; Shoemaker, M Benjamin; Blair, Marcia A; Kor, Kaylen C; Takahashi, Atsushi; Kubo, Michiaki; Roden, Dan M; Tanaka, Toshihiro; Darbar, Dawood

    2014-08-15

    For many patients with atrial fibrillation, ventricular rate control with atrioventricular (AV) nodal blockers is considered first-line therapy, although response to treatment is highly variable. Using an extreme phenotype of failure of rate control necessitating AV nodal ablation and pacemaker implantation, we conducted a genome-wide association study (GWAS) to identify genomic modulators of rate control therapy. Cases included 95 patients who failed rate control therapy. Controls (n = 190) achieved adequate rate control therapy with ≤2 AV nodal blockers using a conventional clinical definition. Genotyping was performed on the Illumina 610-Quad platform, and results were imputed to the 1000 Genomes reference haplotypes. A total of 554,041 single-nucleotide polymorphisms (SNPs) met criteria for minor allele frequency (>0.01), call rate (>95%), and quality control, and 6,055,224 SNPs were available after imputation. No SNP reached the canonical threshold for significance for GWAS of p <5 × 10(-8). Sixty-three SNPs with p <10(-5) at 6 genomic loci were genotyped in a validation cohort of 130 cases and 157 controls. These included 6q24.3 (near SAMD5/SASH1, p = 9.36 × 10(-8)), 4q12 (IGFBP7, p = 1.75 × 10(-7)), 6q22.33 (C6orf174, p = 4.86 × 10(-7)), 3p21.31 (CDCP1, p = 1.18 × 10(-6)), 12p12.1 (SOX5, p = 1.62 × 10(-6)), and 7p11 (LANCL2, p = 6.51 × 10(-6)). However, none of these were significant in the replication cohort or in a meta-analysis of both cohorts. In conclusion, we identified several potentially important genomic modulators of rate control therapy in atrial fibrillation, particularly SOX5, which was previously associated with heart rate at rest and PR interval. However, these failed to reach genome-wide significance. Copyright © 2014 Elsevier Inc. All rights reserved.

  13. CMDR based differential evolution identifies the epistatic interaction in genome-wide association studies.

    PubMed

    Yang, Cheng-Hong; Chuang, Li-Yeh; Lin, Yu-Da

    2017-08-01

    Detecting epistatic interactions in genome-wide association studies (GWAS) is a computational challenge. Such huge numbers of single-nucleotide polymorphism (SNP) combinations limit the some of the powerful algorithms to be applied to detect the potential epistasis in large-scale SNP datasets. We propose a new algorithm which combines the differential evolution (DE) algorithm with a classification based multifactor-dimensionality reduction (CMDR), termed DECMDR. DECMDR uses the CMDR as a fitness measure to evaluate values of solutions in DE process for scanning the potential statistical epistasis in GWAS. The results indicated that DECMDR outperforms the existing algorithms in terms of detection success rate by the large simulation and real data obtained from the Wellcome Trust Case Control Consortium. For running time comparison, DECMDR can efficient to apply the CMDR to detect the significant association between cases and controls amongst all possible SNP combinations in GWAS. DECMDR is freely available at https://goo.gl/p9sLuJ . chuang@isu.edu.tw or e0955767257@yahoo.com.tw. Supplementary data are available at Bioinformatics online. © The Author (2017). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com

  14. Type 2 diabetes mellitus disease risk genes identified by genome wide copy number variation scan in normal populations.

    PubMed

    Prabhanjan, Manasa; Suresh, Raviraj V; Murthy, Megha N; Ramachandra, Nallur B

    2016-03-01

    To identify the role of copy number variations (CNVs) on disease risk genes and its effect on disease phenotypes in type 2 diabetes mellitus (T2DM) in 12 random populations using high throughput arrays. CNV analysis was carried out on a total of 1715 individuals from 12 populations, from ArrayExpress Archive of the European Bioinformatics Institute along with our subjects using Affymetrix Genome Wide SNP 6.0 array. CNV effect on T2DM genes were analyzed using several bioinformatics tools and a molecular protein interaction network was constructed to identify the disease mechanism altered by the CNVs. Analysis showed 34.4% of the total population to be under CNV burden for T2DM, with 83 disease causal and associated genes being under CNV influence. Hotspots were identified on chromosomes 22, 12, 6, 19 and 11.Overlap studies with case cohorts revealed significant disease risk genes such as EGFR, E2F1, PPP1R3A, HLA and TSPAN8. CNVs play a significant role in predisposing T2DM in normal cohorts and contribute to the phenotypic effects. Thus, CNVs should be considered as one of the major contributors in predisposition of the disease. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.

  15. Human Genomic Loci Important in Common Infectious Diseases: Role of High-Throughput Sequencing and Genome-Wide Association Studies

    PubMed Central

    Sserwadda, Ivan; Amujal, Marion; Namatovu, Norah

    2018-01-01

    HIV/AIDS, tuberculosis (TB), and malaria are 3 major global public health threats that undermine development in many resource-poor settings. Recently, the notion that positive selection during epidemics or longer periods of exposure to common infectious diseases may have had a major effect in modifying the constitution of the human genome is being interrogated at a large scale in many populations around the world. This positive selection from infectious diseases increases power to detect associations in genome-wide association studies (GWASs). High-throughput sequencing (HTS) has transformed both the management of infectious diseases and continues to enable large-scale functional characterization of host resistance/susceptibility alleles and loci; a paradigm shift from single candidate gene studies. Application of genome sequencing technologies and genomics has enabled us to interrogate the host-pathogen interface for improving human health. Human populations are constantly locked in evolutionary arms races with pathogens; therefore, identification of common infectious disease-associated genomic variants/markers is important in therapeutic, vaccine development, and screening susceptible individuals in a population. This review describes a range of host-pathogen genomic loci that have been associated with disease susceptibility and resistant patterns in the era of HTS. We further highlight potential opportunities for these genetic markers. PMID:29755620

  16. Employing genome-wide SNP discovery and genotyping strategy to extrapolate the natural allelic diversity and domestication patterns in chickpea

    PubMed Central

    Kujur, Alice; Bajaj, Deepak; Upadhyaya, Hari D.; Das, Shouvik; Ranjan, Rajeev; Shree, Tanima; Saxena, Maneesha S.; Badoni, Saurabh; Kumar, Vinod; Tripathi, Shailesh; Gowda, C. L. L.; Sharma, Shivali; Singh, Sube; Tyagi, Akhilesh K.; Parida, Swarup K.

    2015-01-01

    The genome-wide discovery and high-throughput genotyping of SNPs in chickpea natural germplasm lines is indispensable to extrapolate their natural allelic diversity, domestication, and linkage disequilibrium (LD) patterns leading to the genetic enhancement of this vital legume crop. We discovered 44,844 high-quality SNPs by sequencing of 93 diverse cultivated desi, kabuli, and wild chickpea accessions using reference genome- and de novo-based GBS (genotyping-by-sequencing) assays that were physically mapped across eight chromosomes of desi and kabuli. Of these, 22,542 SNPs were structurally annotated in different coding and non-coding sequence components of genes. Genes with 3296 non-synonymous and 269 regulatory SNPs could functionally differentiate accessions based on their contrasting agronomic traits. A high experimental validation success rate (92%) and reproducibility (100%) along with strong sensitivity (93–96%) and specificity (99%) of GBS-based SNPs was observed. This infers the robustness of GBS as a high-throughput assay for rapid large-scale mining and genotyping of genome-wide SNPs in chickpea with sub-optimal use of resources. With 23,798 genome-wide SNPs, a relatively high intra-specific polymorphic potential (49.5%) and broader molecular diversity (13–89%)/functional allelic diversity (18–77%) was apparent among 93 chickpea accessions, suggesting their tremendous applicability in rapid selection of desirable diverse accessions/inter-specific hybrids in chickpea crossbred varietal improvement program. The genome-wide SNPs revealed complex admixed domestication pattern, extensive LD estimates (0.54–0.68) and extended LD decay (400–500 kb) in a structured population inclusive of 93 accessions. These findings reflect the utility of our identified SNPs for subsequent genome-wide association study (GWAS) and selective sweep-based domestication trait dissection analysis to identify potential genomic loci (gene-associated targets) specifically

  17. A genome-wide assessment of rare copy number variants in colorectal cancer.

    PubMed

    Li, Zhenli; Yu, Dan; Gan, Meifu; Shan, Qiaonan; Yin, Xiaoyang; Tang, Shunli; Zhang, Shuai; Shi, Yongyong; Zhu, Yimin; Lai, Maode; Zhang, Dandan

    2015-09-22

    Colorectal cancer (CRC) is a complex disease with an estimated heritability of approximately 35%. However, known CRC-related common single nucleotide polymorphisms (SNPs) can only explain ~0.65% of the heritability. This "missing heritability" may be explained partially by rare copy number variants (CNVs). In this study, we performed a genome-wide scan using Illumina Human-Omni Express BeadChip, 694 sporadic CRC cases and 1641 controls were eventually included in our analysis after quality control. The global burden analysis revealed a 1.53-fold excess of rare CNVs in CRC cases compared with controls (P < 1 × 10(-6)), and the difference being more pronounced for genic rare CNVs and CNVs overlapped with coding regions (1.65-fold and 1.84-fold, respectively, both P < 1 × 10(-6)). Interestingly, both the cases in the lowest and middle tertile of age carried a higher burden of rare CNVs comparing to the highest tertile. Furthermore, 639 CNV-disrupted genes exclusive to CRC cases were found to be significantly enriched in gene ontology (GO) terms concerning nucleosome assembly and olfactory receptor activity. Our study was the first to evaluate the burden of rare CNVs in sporadic CRC and suggested that rare CNVs contributed to the missing heritability of CRC.

  18. Genome-wide association studies in preterm birth: implications for the practicing obstetrician-gynaecologist

    PubMed Central

    2013-01-01

    Preterm birth has the highest mortality and morbidity of all pregnancy complications. The burden of preterm birth on public health worldwide is enormous, yet there are few effective means to prevent a preterm delivery. To date, much of its etiology is unexplained, but genetic predisposition is thought to play a major role. In the upcoming year, the international Preterm Birth Genome Project (PGP) consortium plans to publish a large genome wide association study in early preterm birth. Genome-wide association studies (GWAS) are designed to identify common genetic variants that influence health and disease. Despite the many challenges that are involved, GWAS can be an important discovery tool, revealing genetic variations that are associated with preterm birth. It is highly unlikely that findings of a GWAS can be directly translated into clinical practice in the short run. Nonetheless, it will help us to better understand the etiology of preterm birth and the GWAS results will generate new hypotheses for further research, thus enhancing our understanding of preterm birth and informing prevention efforts in the long run. PMID:23445776

  19. Genome-wide association studies in preterm birth: implications for the practicing obstetrician-gynaecologist.

    PubMed

    Dolan, Siobhan M; Christiaens, Inge

    2013-01-01

    Preterm birth has the highest mortality and morbidity of all pregnancy complications. The burden of preterm birth on public health worldwide is enormous, yet there are few effective means to prevent a preterm delivery. To date, much of its etiology is unexplained, but genetic predisposition is thought to play a major role. In the upcoming year, the international Preterm Birth Genome Project (PGP) consortium plans to publish a large genome wide association study in early preterm birth. Genome-wide association studies (GWAS) are designed to identify common genetic variants that influence health and disease. Despite the many challenges that are involved, GWAS can be an important discovery tool, revealing genetic variations that are associated with preterm birth. It is highly unlikely that findings of a GWAS can be directly translated into clinical practice in the short run. Nonetheless, it will help us to better understand the etiology of preterm birth and the GWAS results will generate new hypotheses for further research, thus enhancing our understanding of preterm birth and informing prevention efforts in the long run.

  20. genipe: an automated genome-wide imputation pipeline with automatic reporting and statistical tools.

    PubMed

    Lemieux Perreault, Louis-Philippe; Legault, Marc-André; Asselin, Géraldine; Dubé, Marie-Pierre

    2016-12-01

    Genotype imputation is now commonly performed following genome-wide genotyping experiments. Imputation increases the density of analyzed genotypes in the dataset, enabling fine-mapping across the genome. However, the process of imputation using the most recent publicly available reference datasets can require considerable computation power and the management of hundreds of large intermediate files. We have developed genipe, a complete genome-wide imputation pipeline which includes automatic reporting, imputed data indexing and management, and a suite of statistical tests for imputed data commonly used in genetic epidemiology (Sequence Kernel Association Test, Cox proportional hazards for survival analysis, and linear mixed models for repeated measurements in longitudinal studies). The genipe package is an open source Python software and is freely available for non-commercial use (CC BY-NC 4.0) at https://github.com/pgxcentre/genipe Documentation and tutorials are available at http://pgxcentre.github.io/genipe CONTACT: louis-philippe.lemieux.perreault@statgen.org or marie-pierre.dube@statgen.orgSupplementary information: Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.

  1. Harnessing the sorghum genome sequence:development of a genome-wide microsattelite (SSR) resource for swift genetic mapping and map based cloning in sorghum

    USDA-ARS?s Scientific Manuscript database

    Sorghum is the second cereal crop to have a full genome completely sequenced (Nature (2009), 457:551). This achievement is widely recognized as a scientific milestone for grass genetics and genomics in general. However, the true worth of genetic information lies in translating the sequence informa...

  2. Codon usage bias: causative factors, quantification methods and genome-wide patterns: with emphasis on insect genomes.

    PubMed

    Behura, Susanta K; Severson, David W

    2013-02-01

    Codon usage bias refers to the phenomenon where specific codons are used more often than other synonymous codons during translation of genes, the extent of which varies within and among species. Molecular evolutionary investigations suggest that codon bias is manifested as a result of balance between mutational and translational selection of such genes and that this phenomenon is widespread across species and may contribute to genome evolution in a significant manner. With the advent of whole-genome sequencing of numerous species, both prokaryotes and eukaryotes, genome-wide patterns of codon bias are emerging in different organisms. Various factors such as expression level, GC content, recombination rates, RNA stability, codon position, gene length and others (including environmental stress and population size) can influence codon usage bias within and among species. Moreover, there has been a continuous quest towards developing new concepts and tools to measure the extent of codon usage bias of genes. In this review, we outline the fundamental concepts of evolution of the genetic code, discuss various factors that may influence biased usage of synonymous codons and then outline different principles and methods of measurement of codon usage bias. Finally, we discuss selected studies performed using whole-genome sequences of different insect species to show how codon bias patterns vary within and among genomes. We conclude with generalized remarks on specific emerging aspects of codon bias studies and highlight the recent explosion of genome-sequencing efforts on arthropods (such as twelve Drosophila species, species of ants, honeybee, Nasonia and Anopheles mosquitoes as well as the recent launch of a genome-sequencing project involving 5000 insects and other arthropods) that may help us to understand better the evolution of codon bias and its biological significance. © 2012 The Authors. Biological Reviews © 2012 Cambridge Philosophical Society.

  3. A novel genome-wide microsatellite resource for species of Eucalyptus with linkage-to-physical correspondence on the reference genome sequence.

    PubMed

    Grattapaglia, Dario; Mamani, Eva M C; Silva-Junior, Orzenil B; Faria, Danielle A

    2015-03-01

    Keystone species in their native ranges, eucalypts, are ecologically and genetically very diverse, growing naturally along extensive latitudinal and altitudinal ranges and variable environments. Besides their ecological importance, eucalypts are also the most widely planted trees for sustainable forestry in the world. We report the development of a novel collection of 535 microsatellites for species of Eucalyptus, 494 designed from ESTs and 41 from genomic libraries. A selected subset of 223 was evaluated for individual identification, parentage testing, and ancestral information content in the two most extensively studied species, Eucalyptus grandis and Eucalyptus globulus. Microsatellites showed high transferability and overlapping allele size range, suggesting they have arisen still in their common ancestor and confirming the extensive genome conservation between these two species. A consensus linkage map with 437 microsatellites, the most comprehensive microsatellite-only genetic map for Eucalyptus, was built by assembling segregation data from three mapping populations and anchored to the Eucalyptus genome. An overall colinearity between recombination-based and physical positioning of 84% of the mapped microsatellites was observed, with some ordering discrepancies and sporadic locus duplications, consistent with the recently described whole genome duplication events in Eucalyptus. The linkage map covered 95.2% of the 605.8-Mbp assembled genome sequence, placing one microsatellite every 1.55 Mbp on average, and an overall estimate of physical to recombination distance of 618 kbp/cM. The genetic parameters estimates together with linkage and physical position data for this large set of microsatellites should assist marker choice for genome-wide population genetics and comparative mapping in Eucalyptus. © 2014 John Wiley & Sons Ltd.

  4. Genomic scan for genes predisposing to schizophrenia

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Coon, H.; Jensen. S.; Holik, J.

    1994-03-15

    We initiated a genome-wide search for genes predisposing to schizophrenia by ascertaining 9 families, each containing three to five cases of schizophrenia. The 9 pedigrees were initially genotyped with 329 polymorphic DNA loci distributed throughout the genome. Assuming either autosomal dominant or recessive inheritance, 254 DNA loci yielded lod scores less than -2.0 at {theta} = 0.0, 101 DNA markers gave lod scores less than -2.0 at {theta} = 0.05, while 5 DNA loci produced maximum lod scores greater than 1: D4S35, D14S17, D15S1, D22S84, and D22S55. Of the DNA markers yielding lod scores greater than 1, D4S35 and D22S55more » also were suggestive of linkage when the Affected-Pedigree-Member method was used. The families were then genotyped with four highly polymorphic simple sequence repeat markers; possible linkage diminished with DNA markers mapping nearby D4S35, while suggestive evidence of linkage remained with loci in the region of D22S55. Although follow-up investigation of these chromosomal regions may be warranted, our linkage results should be viewed as preliminary observations, as 35 unaffected persons are not past the age of risk. 90 refs., 3 tabs.« less

  5. Genome wide association study and genomic prediction for fatty acid composition in Chinese Simmental beef cattle using high density SNP array.

    PubMed

    Zhu, Bo; Niu, Hong; Zhang, Wengang; Wang, Zezhao; Liang, Yonghu; Guan, Long; Guo, Peng; Chen, Yan; Zhang, Lupei; Guo, Yong; Ni, Heming; Gao, Xue; Gao, Huijiang; Xu, Lingyang; Li, Junya

    2017-06-14

    Fatty acid composition of muscle is an important trait contributing to meat quality. Recently, genome-wide association study (GWAS) has been extensively used to explore the molecular mechanism underlying important traits in cattle. In this study, we performed GWAS using high density SNP array to analyze the association between SNPs and fatty acids and evaluated the accuracy of genomic prediction for fatty acids in Chinese Simmental cattle. Using the BayesB method, we identified 35 and 7 regions in Chinese Simmental cattle that displayed significant associations with individual fatty acids and fatty acid groups, respectively. We further obtained several candidate genes which may be involved in fatty acid biosynthesis including elongation of very long chain fatty acids protein 5 (ELOVL5), fatty acid synthase (FASN), caspase 2 (CASP2) and thyroglobulin (TG). Specifically, we obtained strong evidence of association signals for one SNP located at 51.3 Mb for FASN using Genome-wide Rapid Association Mixed Model and Regression-Genomic Control (GRAMMAR-GC) approaches. Also, region-based association test identified multiple SNPs within FASN and ELOVL5 for C14:0. In addition, our result revealed that the effectiveness of genomic prediction for fatty acid composition using BayesB was slightly superior over GBLUP in Chinese Simmental cattle. We identified several significantly associated regions and loci which can be considered as potential candidate markers for genomics-assisted breeding programs. Using multiple methods, our results revealed that FASN and ELOVL5 are associated with fatty acids with strong evidence. Our finding also suggested that it is feasible to perform genomic selection for fatty acids in Chinese Simmental cattle.

  6. Alzheimer Disease Pathology in Cognitively Healthy Elderly:A Genome-wide Study

    PubMed Central

    Kramer, Patricia L; Xu, Haiyan; Woltjer, Randall L; Westaway, Shawn K; Clark, David; Erten-Lyons, Deniz; Kaye, Jeffrey A; Welsh-Bohmer, Kathleen A; Troncoso, Juan C; Markesbery, William R; Petersen, Ronald C; Turner, R Scott; Kukull, Walter A; Bennett, David A; DouglasGalasko; Morris, John C; Ott, Jurg

    2010-01-01

    Many elderly individuals remain dementia-free throughout their life. However, some of these individuals exhibit Alzheimer disease neuropathology on autopsy, evidenced by neurofibrillary tangles (NFTs) in AD-specific brain regions. We conducted a genome-wide association study to identify genetic mechanisms that distinguish non-demented elderly with a heavy NFT burden from those with a low NFT burden. The study included 344 non-demented subjects with autopsy (201 subjects with low and 143 with high NFT levels). Both a genotype test, using logistic regression, and an allele test provided genome-wide significant evidence that variants in the RELNgene are associated with neuropathology in the context of cognitive health. Immunohistochemical data for reelin expression in AD-related brain regions added support for these findings. Reelin signaling pathways modulate phosphorylation of tau, the major component of NFTs, either directly or through β-amyloid pathways that influence tau phosphorylation. Our findings suggest that up-regulation of reelin may be a compensatory response to tau-related or beta-amyloid stress associated with AD even prior to the onset of dementia. PMID:20452100

  7. Implications of genome-wide association studies in cancer therapeutics.

    PubMed

    Patel, Jai N; McLeod, Howard L; Innocenti, Federico

    2013-09-01

    Genome wide association studies (GWAS) provide an agnostic approach to identifying potential genetic variants associated with disease susceptibility, prognosis of survival and/or predictive of drug response. Although these techniques are costly and interpretation of study results is challenging, they do allow for a more unbiased interrogation of the entire genome, resulting in the discovery of novel genes and understanding of novel biological associations. This review will focus on the implications of GWAS in cancer therapy, in particular germ-line mutations, including findings from major GWAS which have identified predictive genetic loci for clinical outcome and/or toxicity. Lessons and challenges in cancer GWAS are also discussed, including the need for functional analysis and replication, as well as future perspectives for biological and clinical utility. Given the large heterogeneity in response to cancer therapeutics, novel methods of identifying mechanisms and biology of variable drug response and ultimately treatment individualization will be indispensable. © 2013 The British Pharmacological Society.

  8. GenoGAM: genome-wide generalized additive models for ChIP-Seq analysis.

    PubMed

    Stricker, Georg; Engelhardt, Alexander; Schulz, Daniel; Schmid, Matthias; Tresch, Achim; Gagneur, Julien

    2017-08-01

    Chromatin immunoprecipitation followed by deep sequencing (ChIP-Seq) is a widely used approach to study protein-DNA interactions. Often, the quantities of interest are the differential occupancies relative to controls, between genetic backgrounds, treatments, or combinations thereof. Current methods for differential occupancy of ChIP-Seq data rely however on binning or sliding window techniques, for which the choice of the window and bin sizes are subjective. Here, we present GenoGAM (Genome-wide Generalized Additive Model), which brings the well-established and flexible generalized additive models framework to genomic applications using a data parallelism strategy. We model ChIP-Seq read count frequencies as products of smooth functions along chromosomes. Smoothing parameters are objectively estimated from the data by cross-validation, eliminating ad hoc binning and windowing needed by current approaches. GenoGAM provides base-level and region-level significance testing for full factorial designs. Application to a ChIP-Seq dataset in yeast showed increased sensitivity over existing differential occupancy methods while controlling for type I error rate. By analyzing a set of DNA methylation data and illustrating an extension to a peak caller, we further demonstrate the potential of GenoGAM as a generic statistical modeling tool for genome-wide assays. Software is available from Bioconductor: https://www.bioconductor.org/packages/release/bioc/html/GenoGAM.html . gagneur@in.tum.de. Supplementary information is available at Bioinformatics online. © The Author (2017). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com

  9. A GENOME WIDE ASSOCIATION STUDY FOR DIABETIC NEPHROPATHY GENES IN AFRICAN AMERICANS

    PubMed Central

    McDonough, Caitrin W.; Palmer, Nicholette D.; Hicks, Pamela J.; Roh, Bong H.; An, S. Sandy; Cooke, Jessica N.; Hester, Jessica M.; Wing, Maria R.; Bostrom, Meredith A.; Rudock, Megan E.; Lewis, Joshua P.; Talbert, Matthew E.; Blevins, Rebecca A.; Lu, Lingyi; Ng, Maggie C.Y.; Sale, Michele M.; Divers, Jasmin; Langefeld, Carl D.; Freedman, Barry I.; Bowden, Donald W.

    2011-01-01

    A genome-wide association study was performed using the Affymetrix 6.0 chip to identify genes associated with diabetic nephropathy in African Americans. Association analysis was performed adjusting for admixture in 965 type 2 diabetic African American patients with end-stage renal disease (ESRD) and in 1029 African Americans without type 2 diabetes or kidney disease as controls. The top 724 single nucleotide polymorphisms (SNPs) with evidence of association to diabetic nephropathy were then genotyped in a replication sample of an additional 709 type 2 diabetes-ESRD patients and 690 controls. SNPs with evidence of association in both the original and replication studies were tested in additional African American cohorts consisting of 1246 patients with type 2 diabetes without kidney disease and 1216 with non-diabetic ESRD to differentiate candidate loci for type 2 diabetes-ESRD, type 2 diabetes, and/or all-cause ESRD. Twenty-five SNPs were significantly associated with type 2 diabetes-ESRD in the genome-wide association and initial replication. Although genome-wide significance with type 2 diabetes was not found for any of these 25 SNPs, several genes, including RPS12, LIMK2, and SFI1 are strong candidates for diabetic nephropathy. A combined analysis of all 2890 patients with ESRD showed significant association SNPs in LIMK2 and SFI1 suggesting that they also contribute to all-cause ESRD. Thus, our results suggest that multiple loci underlie susceptibility to kidney disease in African Americans with type 2 diabetes and some may also contribute to all-cause ESRD. PMID:21150874

  10. A genome-wide association study for diabetic nephropathy genes in African Americans.

    PubMed

    McDonough, Caitrin W; Palmer, Nicholette D; Hicks, Pamela J; Roh, Bong H; An, S Sandy; Cooke, Jessica N; Hester, Jessica M; Wing, Maria R; Bostrom, Meredith A; Rudock, Megan E; Lewis, Joshua P; Talbert, Matthew E; Blevins, Rebecca A; Lu, Lingyi; Ng, Maggie C Y; Sale, Michele M; Divers, Jasmin; Langefeld, Carl D; Freedman, Barry I; Bowden, Donald W

    2011-03-01

    A genome-wide association study was performed using the Affymetrix 6.0 chip to identify genes associated with diabetic nephropathy in African Americans. Association analysis was performed adjusting for admixture in 965 type 2 diabetic African American patients with end-stage renal disease (ESRD) and in 1029 African Americans without type 2 diabetes or kidney disease as controls. The top 724 single nucleotide polymorphisms (SNPs) with evidence of association to diabetic nephropathy were then genotyped in a replication sample of an additional 709 type 2 diabetes-ESRD patients and 690 controls. SNPs with evidence of association in both the original and replication studies were tested in additional African American cohorts consisting of 1246 patients with type 2 diabetes without kidney disease and 1216 with non-diabetic ESRD to differentiate candidate loci for type 2 diabetes-ESRD, type 2 diabetes, and/or all-cause ESRD. Twenty-five SNPs were significantly associated with type 2 diabetes-ESRD in the genome-wide association and initial replication. Although genome-wide significance with type 2 diabetes was not found for any of these 25 SNPs, several genes, including RPS12, LIMK2, and SFI1 are strong candidates for diabetic nephropathy. A combined analysis of all 2890 patients with ESRD showed significant association SNPs in LIMK2 and SFI1 suggesting that they also contribute to all-cause ESRD. Thus, our results suggest that multiple loci underlie susceptibility to kidney disease in African Americans with type 2 diabetes and some may also contribute to all-cause ESRD.

  11. Evidence for a gene influencing the TG/HDL-C ratio on chromosome 7q32.3-qter: a genome-wide scan in the Framingham study.

    PubMed

    Shearman, A M; Ordovas, J M; Cupples, L A; Schaefer, E J; Harmon, M D; Shao, Y; Keen, J D; DeStefano, A L; Joost, O; Wilson, P W; Housman, D E; Myers, R H

    2000-05-22

    Some studies show that plasma triglyceride (TG) levels are a significant independent risk factor for cardiovascular disease (CVD). TG levels are inversely correlated with high density lipoprotein cholesterol (HDL-C) levels, and their metabolism may be closely interrelated. Therefore, the TG/HDL-C ratio may be a relevant CVD risk factor. Our analysis of families in the Framingham Heart Study gave a genetic heritability estimate for log(TG) of 0.40 and for log(TG/HDL-C) of 0.49, demonstrating an important genetic component for both. A 10 cM genome-wide scan for log(TG) level and log(TG/HDL-C) was carried out for the largest 332 extended families of the Framingham Heart Study (1702 genotyped individuals). The highest multipoint variance component LOD scores obtained for both log(TG) and log(TG/HDL-C) were on chromosome 7 (at 155 cM), where the results for the two phenotypes were 1.8 and 2.5, respectively. The 7q32.3-qter region contains several candidate genes. Four other regions with multipoint LOD scores greater than one were identified on chromosome 3 [LOD score for log(TG/HDL-C) = 1.8 at 140 cM], chromosome 11 [LOD score for log(TG/HDL-C) = 1.1 at 125 cM], chromosome 16 [LOD score for log(TG) = 1.5 at 70 cM, LOD score for log(TG/HDL-C) = 1.1 at 75 cM] and chromosome 20 [LOD score for log(TG/HDL-C) = 1.7 at 35 cM, LOD score for log(TG) = 1.3 at 40 cM]. These results identify loci worthy of further study.

  12. Ensembl Genomes 2013: scaling up access to genome-wide data.

    PubMed

    Kersey, Paul Julian; Allen, James E; Christensen, Mikkel; Davis, Paul; Falin, Lee J; Grabmueller, Christoph; Hughes, Daniel Seth Toney; Humphrey, Jay; Kerhornou, Arnaud; Khobova, Julia; Langridge, Nicholas; McDowall, Mark D; Maheswari, Uma; Maslen, Gareth; Nuhn, Michael; Ong, Chuang Kee; Paulini, Michael; Pedro, Helder; Toneva, Iliana; Tuli, Mary Ann; Walts, Brandon; Williams, Gareth; Wilson, Derek; Youens-Clark, Ken; Monaco, Marcela K; Stein, Joshua; Wei, Xuehong; Ware, Doreen; Bolser, Daniel M; Howe, Kevin Lee; Kulesha, Eugene; Lawson, Daniel; Staines, Daniel Michael

    2014-01-01

    Ensembl Genomes (http://www.ensemblgenomes.org) is an integrating resource for genome-scale data from non-vertebrate species. The project exploits and extends technologies for genome annotation, analysis and dissemination, developed in the context of the vertebrate-focused Ensembl project, and provides a complementary set of resources for non-vertebrate species through a consistent set of programmatic and interactive interfaces. These provide access to data including reference sequence, gene models, transcriptional data, polymorphisms and comparative analysis. This article provides an update to the previous publications about the resource, with a focus on recent developments. These include the addition of important new genomes (and related data sets) including crop plants, vectors of human disease and eukaryotic pathogens. In addition, the resource has scaled up its representation of bacterial genomes, and now includes the genomes of over 9000 bacteria. Specific extensions to the web and programmatic interfaces have been developed to support users in navigating these large data sets. Looking forward, analytic tools to allow targeted selection of data for visualization and download are likely to become increasingly important in future as the number of available genomes increases within all domains of life, and some of the challenges faced in representing bacterial data are likely to become commonplace for eukaryotes in future.

  13. Realizing privacy preserving genome-wide association studies.

    PubMed

    Simmons, Sean; Berger, Bonnie

    2016-05-01

    As genomics moves into the clinic, there has been much interest in using this medical data for research. At the same time the use of such data raises many privacy concerns. These circumstances have led to the development of various methods to perform genome-wide association studies (GWAS) on patient records while ensuring privacy. In particular, there has been growing interest in applying differentially private techniques to this challenge. Unfortunately, up until now all methods for finding high scoring SNPs in a differentially private manner have had major drawbacks in terms of either accuracy or computational efficiency. Here we overcome these limitations with a substantially modified version of the neighbor distance method for performing differentially private GWAS, and thus are able to produce a more viable mechanism. Specifically, we use input perturbation and an adaptive boundary method to overcome accuracy issues. We also design and implement a convex analysis based algorithm to calculate the neighbor distance for each SNP in constant time, overcoming the major computational bottleneck in the neighbor distance method. It is our hope that methods such as ours will pave the way for more widespread use of patient data in biomedical research. A python implementation is available at http://groups.csail.mit.edu/cb/DiffPriv/ bab@csail.mit.edu Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.

  14. Genome-wide association mapping of partial resistance to Aphanomyces euteiches in pea

    USDA-ARS?s Scientific Manuscript database

    Genome-wide association mapping has recently emerged as a valuable approach to refine genetic basis of polygenic resistance to plant diseases, which are increasingly used in integrated strategies for durable crop protection. Aphanomyces euteiches is a soil borne pathogen of pea and other legumes wor...

  15. A genome-wide linkage scan for dietary energy and nutrient intakes: the Health, Risk Factors, Exercise Training, and Genetics (HERITAGE) Family Study.

    PubMed

    Collaku, Agron; Rankinen, Tuomo; Rice, Treva; Leon, Arthur S; Rao, D C; Skinner, James S; Wilmore, Jack H; Bouchard, Claude

    2004-05-01

    A poor diet is a risk factor for chronic diseases such as obesity, cardiovascular disease, hypertension, and some cancers. Twin and family studies suggest that genetic factors potentially influence energy and nutrient intakes. We sought to identify genomic regions harboring genes affecting total energy, carbohydrate, protein, and fat intakes. We performed a genomic scan in 347 white sibling pairs and 99 black sibling pairs. Dietary energy and nutrient intakes were assessed by using Willett's food-frequency questionnaire. Single-point and multipoint Haseman-Elston regression techniques were used to test for linkage. These subjects were part of the Health, Risk Factors, Exercise Training, and Genetics (HERITAGE) Family Study, a multicenter project undertaken by 5 laboratories. In the whites, the strongest evidence of linkage appeared for dietary energy and nutrient intakes on chromosomes 1p21.2 (P = 0.0002) and 20q13.13 (P = 0.00007), and that for fat intake appeared on chromosome 12q14.1 (P = 0.0013). The linkage evidence on chromosomes 1 and 20 related to total energy intake rather than to the intake of specific macronutrients. In the blacks, promising linkages for macronutrient intakes occurred on chromosomes 12q23-q24.21, 1q32.1, and 7q11.1. Several potential candidate genes are encoded in and around the linkage regions on chromosomes 1p21.2, 12q14.1, and 20q13.13. These are the first reported human quantitative trait loci for dietary energy and macronutrient intakes. Further study may refine these quantitative trait loci to identify potential candidate genes for energy and specific macronutrient intakes that would be amenable to more detailed molecular studies.

  16. A genome-wide association study identifies multiple loci for variation in human ear morphology.

    PubMed

    Adhikari, Kaustubh; Reales, Guillermo; Smith, Andrew J P; Konka, Esra; Palmen, Jutta; Quinto-Sanchez, Mirsha; Acuña-Alonzo, Victor; Jaramillo, Claudia; Arias, William; Fuentes, Macarena; Pizarro, María; Barquera Lozano, Rodrigo; Macín Pérez, Gastón; Gómez-Valdés, Jorge; Villamil-Ramírez, Hugo; Hunemeier, Tábita; Ramallo, Virginia; Silva de Cerqueira, Caio C; Hurtado, Malena; Villegas, Valeria; Granja, Vanessa; Gallo, Carla; Poletti, Giovanni; Schuler-Faccini, Lavinia; Salzano, Francisco M; Bortolini, Maria-Cátira; Canizales-Quinteros, Samuel; Rothhammer, Francisco; Bedoya, Gabriel; Calderón, Rosario; Rosique, Javier; Cheeseman, Michael; Bhutta, Mahmood F; Humphries, Steve E; Gonzalez-José, Rolando; Headon, Denis; Balding, David; Ruiz-Linares, Andrés

    2015-06-24

    Here we report a genome-wide association study for non-pathological pinna morphology in over 5,000 Latin Americans. We find genome-wide significant association at seven genomic regions affecting: lobe size and attachment, folding of antihelix, helix rolling, ear protrusion and antitragus size (linear regression P values 2 × 10(-8) to 3 × 10(-14)). Four traits are associated with a functional variant in the Ectodysplasin A receptor (EDAR) gene, a key regulator of embryonic skin appendage development. We confirm expression of Edar in the developing mouse ear and that Edar-deficient mice have an abnormally shaped pinna. Two traits are associated with SNPs in a region overlapping the T-Box Protein 15 (TBX15) gene, a major determinant of mouse skeletal development. Strongest association in this region is observed for SNP rs17023457 located in an evolutionarily conserved binding site for the transcription factor Cartilage paired-class homeoprotein 1 (CART1), and we confirm that rs17023457 alters in vitro binding of CART1.

  17. Genome-Wide Motif Statistics are Shaped by DNA Binding Proteins over Evolutionary Time Scales

    NASA Astrophysics Data System (ADS)

    Qian, Long; Kussell, Edo

    2016-10-01

    The composition of a genome with respect to all possible short DNA motifs impacts the ability of DNA binding proteins to locate and bind their target sites. Since nonfunctional DNA binding can be detrimental to cellular functions and ultimately to organismal fitness, organisms could benefit from reducing the number of nonfunctional DNA binding sites genome wide. Using in vitro measurements of binding affinities for a large collection of DNA binding proteins, in multiple species, we detect a significant global avoidance of weak binding sites in genomes. We demonstrate that the underlying evolutionary process leaves a distinct genomic hallmark in that similar words have correlated frequencies, a signal that we detect in all species across domains of life. We consider the possibility that natural selection against weak binding sites contributes to this process, and using an evolutionary model we show that the strength of selection needed to maintain global word compositions is on the order of point mutation rates. Likewise, we show that evolutionary mechanisms based on interference of protein-DNA binding with replication and mutational repair processes could yield similar results and operate with similar rates. On the basis of these modeling and bioinformatic results, we conclude that genome-wide word compositions have been molded by DNA binding proteins acting through tiny evolutionary steps over time scales spanning millions of generations.

  18. Genome-wide study of resistant hypertension identified from electronic health records.

    PubMed

    Dumitrescu, Logan; Ritchie, Marylyn D; Denny, Joshua C; El Rouby, Nihal M; McDonough, Caitrin W; Bradford, Yuki; Ramirez, Andrea H; Bielinski, Suzette J; Basford, Melissa A; Chai, High Seng; Peissig, Peggy; Carrell, David; Pathak, Jyotishman; Rasmussen, Luke V; Wang, Xiaoming; Pacheco, Jennifer A; Kho, Abel N; Hayes, M Geoffrey; Matsumoto, Martha; Smith, Maureen E; Li, Rongling; Cooper-DeHoff, Rhonda M; Kullo, Iftikhar J; Chute, Christopher G; Chisholm, Rex L; Jarvik, Gail P; Larson, Eric B; Carey, David; McCarty, Catherine A; Williams, Marc S; Roden, Dan M; Bottinger, Erwin; Johnson, Julie A; de Andrade, Mariza; Crawford, Dana C

    2017-01-01

    Resistant hypertension is defined as high blood pressure that remains above treatment goals in spite of the concurrent use of three antihypertensive agents from different classes. Despite the important health consequences of resistant hypertension, few studies of resistant hypertension have been conducted. To perform a genome-wide association study for resistant hypertension, we defined and identified cases of resistant hypertension and hypertensives with treated, controlled hypertension among >47,500 adults residing in the US linked to electronic health records (EHRs) and genotyped as part of the electronic MEdical Records & GEnomics (eMERGE) Network. Electronic selection logic using billing codes, laboratory values, text queries, and medication records was used to identify resistant hypertension cases and controls at each site, and a total of 3,006 cases of resistant hypertension and 876 controlled hypertensives were identified among eMERGE Phase I and II sites. After imputation and quality control, a total of 2,530,150 SNPs were tested for an association among 2,830 multi-ethnic cases of resistant hypertension and 876 controlled hypertensives. No test of association was genome-wide significant in the full dataset or in the dataset limited to European American cases (n = 1,719) and controls (n = 708). The most significant finding was CLNK rs13144136 at p = 1.00x10-6 (odds ratio = 0.68; 95% CI = 0.58-0.80) in the full dataset with similar results in the European American only dataset. We also examined whether SNPs known to influence blood pressure or hypertension also influenced resistant hypertension. None was significant after correction for multiple testing. These data highlight both the difficulties and the potential utility of EHR-linked genomic data to study clinically-relevant traits such as resistant hypertension.

  19. Genome-wide association study of Alzheimer's disease.

    PubMed

    Kamboh, M I; Demirci, F Y; Wang, X; Minster, R L; Carrasquillo, M M; Pankratz, V S; Younkin, S G; Saykin, A J; Jun, G; Baldwin, C; Logue, M W; Buros, J; Farrer, L; Pericak-Vance, M A; Haines, J L; Sweet, R A; Ganguli, M; Feingold, E; Dekosky, S T; Lopez, O L; Barmada, M M

    2012-05-15

    In addition to apolipoprotein E (APOE), recent large genome-wide association studies (GWASs) have identified nine other genes/loci (CR1, BIN1, CLU, PICALM, MS4A4/MS4A6E, CD2AP, CD33, EPHA1 and ABCA7) for late-onset Alzheimer's disease (LOAD). However, the genetic effect attributable to known loci is about 50%, indicating that additional risk genes for LOAD remain to be identified. In this study, we have used a new GWAS data set from the University of Pittsburgh (1291 cases and 938 controls) to examine in detail the recently implicated nine new regions with Alzheimer's disease (AD) risk, and also performed a meta-analysis utilizing the top 1% GWAS single-nucleotide polymorphisms (SNPs) with P<0.01 along with four independent data sets (2727 cases and 3336 controls) for these SNPs in an effort to identify new AD loci. The new GWAS data were generated on the Illumina Omni1-Quad chip and imputed at ~2.5 million markers. As expected, several markers in the APOE regions showed genome-wide significant associations in the Pittsburg sample. While we observed nominal significant associations (P<0.05) either within or adjacent to five genes (PICALM, BIN1, ABCA7, MS4A4/MS4A6E and EPHA1), significant signals were observed 69-180 kb outside of the remaining four genes (CD33, CLU, CD2AP and CR1). Meta-analysis on the top 1% SNPs revealed a suggestive novel association in the PPP1R3B gene (top SNP rs3848140 with P = 3.05E-07). The association of this SNP with AD risk was consistent in all five samples with a meta-analysis odds ratio of 2.43. This is a potential candidate gene for AD as this is expressed in the brain and is involved in lipid metabolism. These findings need to be confirmed in additional samples.

  20. Genome-wide association study of Alzheimer's disease

    PubMed Central

    Kamboh, M I; Demirci, F Y; Wang, X; Minster, R L; Carrasquillo, M M; Pankratz, V S; Younkin, S G; Saykin, A J; Jun, G; Baldwin, C; Logue, M W; Buros, J; Farrer, L; Pericak-Vance, M A; Haines, J L; Sweet, R A; Ganguli, M; Feingold, E; DeKosky, S T; Lopez, O L; Barmada, M M

    2012-01-01

    In addition to apolipoprotein E (APOE), recent large genome-wide association studies (GWASs) have identified nine other genes/loci (CR1, BIN1, CLU, PICALM, MS4A4/MS4A6E, CD2AP, CD33, EPHA1 and ABCA7) for late-onset Alzheimer's disease (LOAD). However, the genetic effect attributable to known loci is about 50%, indicating that additional risk genes for LOAD remain to be identified. In this study, we have used a new GWAS data set from the University of Pittsburgh (1291 cases and 938 controls) to examine in detail the recently implicated nine new regions with Alzheimer's disease (AD) risk, and also performed a meta-analysis utilizing the top 1% GWAS single-nucleotide polymorphisms (SNPs) with P<0.01 along with four independent data sets (2727 cases and 3336 controls) for these SNPs in an effort to identify new AD loci. The new GWAS data were generated on the Illumina Omni1-Quad chip and imputed at ∼2.5 million markers. As expected, several markers in the APOE regions showed genome-wide significant associations in the Pittsburg sample. While we observed nominal significant associations (P<0.05) either within or adjacent to five genes (PICALM, BIN1, ABCA7, MS4A4/MS4A6E and EPHA1), significant signals were observed 69–180 kb outside of the remaining four genes (CD33, CLU, CD2AP and CR1). Meta-analysis on the top 1% SNPs revealed a suggestive novel association in the PPP1R3B gene (top SNP rs3848140 with P=3.05E–07). The association of this SNP with AD risk was consistent in all five samples with a meta-analysis odds ratio of 2.43. This is a potential candidate gene for AD as this is expressed in the brain and is involved in lipid metabolism. These findings need to be confirmed in additional samples. PMID:22832961

  1. A Novel Genome-Information Content-Based Statistic for Genome-Wide Association Analysis Designed for Next-Generation Sequencing Data

    PubMed Central

    Luo, Li; Zhu, Yun

    2012-01-01

    Abstract The genome-wide association studies (GWAS) designed for next-generation sequencing data involve testing association of genomic variants, including common, low frequency, and rare variants. The current strategies for association studies are well developed for identifying association of common variants with the common diseases, but may be ill-suited when large amounts of allelic heterogeneity are present in sequence data. Recently, group tests that analyze their collective frequency differences between cases and controls shift the current variant-by-variant analysis paradigm for GWAS of common variants to the collective test of multiple variants in the association analysis of rare variants. However, group tests ignore differences in genetic effects among SNPs at different genomic locations. As an alternative to group tests, we developed a novel genome-information content-based statistics for testing association of the entire allele frequency spectrum of genomic variation with the diseases. To evaluate the performance of the proposed statistics, we use large-scale simulations based on whole genome low coverage pilot data in the 1000 Genomes Project to calculate the type 1 error rates and power of seven alternative statistics: a genome-information content-based statistic, the generalized T2, collapsing method, multivariate and collapsing (CMC) method, individual χ2 test, weighted-sum statistic, and variable threshold statistic. Finally, we apply the seven statistics to published resequencing dataset from ANGPTL3, ANGPTL4, ANGPTL5, and ANGPTL6 genes in the Dallas Heart Study. We report that the genome-information content-based statistic has significantly improved type 1 error rates and higher power than the other six statistics in both simulated and empirical datasets. PMID:22651812

  2. A novel genome-information content-based statistic for genome-wide association analysis designed for next-generation sequencing data.

    PubMed

    Luo, Li; Zhu, Yun; Xiong, Momiao

    2012-06-01

    The genome-wide association studies (GWAS) designed for next-generation sequencing data involve testing association of genomic variants, including common, low frequency, and rare variants. The current strategies for association studies are well developed for identifying association of common variants with the common diseases, but may be ill-suited when large amounts of allelic heterogeneity are present in sequence data. Recently, group tests that analyze their collective frequency differences between cases and controls shift the current variant-by-variant analysis paradigm for GWAS of common variants to the collective test of multiple variants in the association analysis of rare variants. However, group tests ignore differences in genetic effects among SNPs at different genomic locations. As an alternative to group tests, we developed a novel genome-information content-based statistics for testing association of the entire allele frequency spectrum of genomic variation with the diseases. To evaluate the performance of the proposed statistics, we use large-scale simulations based on whole genome low coverage pilot data in the 1000 Genomes Project to calculate the type 1 error rates and power of seven alternative statistics: a genome-information content-based statistic, the generalized T(2), collapsing method, multivariate and collapsing (CMC) method, individual χ(2) test, weighted-sum statistic, and variable threshold statistic. Finally, we apply the seven statistics to published resequencing dataset from ANGPTL3, ANGPTL4, ANGPTL5, and ANGPTL6 genes in the Dallas Heart Study. We report that the genome-information content-based statistic has significantly improved type 1 error rates and higher power than the other six statistics in both simulated and empirical datasets.

  3. A Genome-Wide Map of Mitochondrial DNA Recombination in Yeast

    PubMed Central

    Fritsch, Emilie S.; Chabbert, Christophe D.; Klaus, Bernd; Steinmetz, Lars M.

    2014-01-01

    In eukaryotic cells, the production of cellular energy requires close interplay between nuclear and mitochondrial genomes. The mitochondrial genome is essential in that it encodes several genes involved in oxidative phosphorylation. Each cell contains several mitochondrial genome copies and mitochondrial DNA recombination is a widespread process occurring in plants, fungi, protists, and invertebrates. Saccharomyces cerevisiae has proved to be an excellent model to dissect mitochondrial biology. Several studies have focused on DNA recombination in this organelle, yet mostly relied on reporter genes or artificial systems. However, no complete mitochondrial recombination map has been released for any eukaryote so far. In the present work, we sequenced pools of diploids originating from a cross between two different S. cerevisiae strains to detect recombination events. This strategy allowed us to generate the first genome-wide map of recombination for yeast mitochondrial DNA. We demonstrated that recombination events are enriched in specific hotspots preferentially localized in non-protein-coding regions. Additionally, comparison of the recombination profiles of two different crosses showed that the genetic background affects hotspot localization and recombination rates. Finally, to gain insights into the mechanisms involved in mitochondrial recombination, we assessed the impact of individual depletion of four genes previously associated with this process. Deletion of NTG1 and MGT1 did not substantially influence the recombination landscape, alluding to the potential presence of additional regulatory factors. Our findings also revealed the loss of large mitochondrial DNA regions in the absence of MHR1, suggesting a pivotal role for Mhr1 in mitochondrial genome maintenance during mating. This study provides a comprehensive overview of mitochondrial DNA recombination in yeast and thus paves the way for future mechanistic studies of mitochondrial recombination and genome

  4. A genome-wide map of mitochondrial DNA recombination in yeast.

    PubMed

    Fritsch, Emilie S; Chabbert, Christophe D; Klaus, Bernd; Steinmetz, Lars M

    2014-10-01

    In eukaryotic cells, the production of cellular energy requires close interplay between nuclear and mitochondrial genomes. The mitochondrial genome is essential in that it encodes several genes involved in oxidative phosphorylation. Each cell contains several mitochondrial genome copies and mitochondrial DNA recombination is a widespread process occurring in plants, fungi, protists, and invertebrates. Saccharomyces cerevisiae has proved to be an excellent model to dissect mitochondrial biology. Several studies have focused on DNA recombination in this organelle, yet mostly relied on reporter genes or artificial systems. However, no complete mitochondrial recombination map has been released for any eukaryote so far. In the present work, we sequenced pools of diploids originating from a cross between two different S. cerevisiae strains to detect recombination events. This strategy allowed us to generate the first genome-wide map of recombination for yeast mitochondrial DNA. We demonstrated that recombination events are enriched in specific hotspots preferentially localized in non-protein-coding regions. Additionally, comparison of the recombination profiles of two different crosses showed that the genetic background affects hotspot localization and recombination rates. Finally, to gain insights into the mechanisms involved in mitochondrial recombination, we assessed the impact of individual depletion of four genes previously associated with this process. Deletion of NTG1 and MGT1 did not substantially influence the recombination landscape, alluding to the potential presence of additional regulatory factors. Our findings also revealed the loss of large mitochondrial DNA regions in the absence of MHR1, suggesting a pivotal role for Mhr1 in mitochondrial genome maintenance during mating. This study provides a comprehensive overview of mitochondrial DNA recombination in yeast and thus paves the way for future mechanistic studies of mitochondrial recombination and genome

  5. Genome-wide copy number variation in the bovine genome detected using low coverage sequence of popular beef breeds

    USDA-ARS?s Scientific Manuscript database

    Copy number variations (CNVs) are large insertions, deletions or duplications in the genome that vary between members of a species and are known to affect a wide variety of phenotypic traits. In this study, we identified CNVs in a population of bulls using low coverage next-generation sequence data....

  6. Lessons learned from the dog genome.

    PubMed

    Wayne, Robert K; Ostrander, Elaine A

    2007-11-01

    Extensive genetic resources and a high-quality genome sequence position the dog as an important model species for understanding genome evolution, population genetics and genes underlying complex phenotypic traits. Newly developed genomic resources have expanded our understanding of canine evolutionary history and dog origins. Domestication involved genetic contributions from multiple populations of gray wolves probably through backcrossing. More recently, the advent of controlled breeding practices has segregated genetic variability into distinct dog breeds that possess specific phenotypic traits. Consequently, genome-wide association and selective sweep scans now allow the discovery of genes underlying breed-specific characteristics. The dog is finally emerging as a novel resource for studying the genetic basis of complex traits, including behavior.

  7. Genome-wide compendium and functional assessment of in vivo heart enhancers

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Dickel, Diane E.; Barozzi, Iros; Zhu, Yiwen

    Whole-genome sequencing is identifying growing numbers of non-coding variants in human disease studies, but the lack of accurate functional annotations prevents their interpretation. We describe the genome-wide landscape of distant-acting enhancers active in the developing and adult human heart, an organ whose impairment is a predominant cause of mortality and morbidity. Using integrative analysis of > 35 epigenomic data sets from mouse and human pre-and postnatal hearts we created a comprehensive reference of > 80,000 putative human heart enhancers. To illustrate the importance of enhancers in the regulation of genes involved in heart disease, we deleted the mouse orthologs ofmore » two human enhancers near cardiac myosin genes. In both cases, we observe in vivo expression changes and cardiac phenotypes consistent with human heart disease. Our study provides a comprehensive catalogue of human heart enhancers for use in clinical whole-genome sequencing studies and highlights the importance of enhancers for cardiac function.« less

  8. Genome-wide compendium and functional assessment of in vivo heart enhancers

    DOE PAGES

    Dickel, Diane E.; Barozzi, Iros; Zhu, Yiwen; ...

    2016-10-05

    Whole-genome sequencing is identifying growing numbers of non-coding variants in human disease studies, but the lack of accurate functional annotations prevents their interpretation. We describe the genome-wide landscape of distant-acting enhancers active in the developing and adult human heart, an organ whose impairment is a predominant cause of mortality and morbidity. Using integrative analysis of > 35 epigenomic data sets from mouse and human pre-and postnatal hearts we created a comprehensive reference of > 80,000 putative human heart enhancers. To illustrate the importance of enhancers in the regulation of genes involved in heart disease, we deleted the mouse orthologs ofmore » two human enhancers near cardiac myosin genes. In both cases, we observe in vivo expression changes and cardiac phenotypes consistent with human heart disease. Our study provides a comprehensive catalogue of human heart enhancers for use in clinical whole-genome sequencing studies and highlights the importance of enhancers for cardiac function.« less

  9. Genome-wide compendium and functional assessment of in vivo heart enhancers

    PubMed Central

    Dickel, Diane E.; Barozzi, Iros; Zhu, Yiwen; Fukuda-Yuzawa, Yoko; Osterwalder, Marco; Mannion, Brandon J.; May, Dalit; Spurrell, Cailyn H.; Plajzer-Frick, Ingrid; Pickle, Catherine S.; Lee, Elizabeth; Garvin, Tyler H.; Kato, Momoe; Akiyama, Jennifer A.; Afzal, Veena; Lee, Ah Young; Gorkin, David U.; Ren, Bing; Rubin, Edward M.; Visel, Axel; Pennacchio, Len A.

    2016-01-01

    Whole-genome sequencing is identifying growing numbers of non-coding variants in human disease studies, but the lack of accurate functional annotations prevents their interpretation. We describe the genome-wide landscape of distant-acting enhancers active in the developing and adult human heart, an organ whose impairment is a predominant cause of mortality and morbidity. Using integrative analysis of >35 epigenomic data sets from mouse and human pre- and postnatal hearts we created a comprehensive reference of >80,000 putative human heart enhancers. To illustrate the importance of enhancers in the regulation of genes involved in heart disease, we deleted the mouse orthologs of two human enhancers near cardiac myosin genes. In both cases, we observe in vivo expression changes and cardiac phenotypes consistent with human heart disease. Our study provides a comprehensive catalogue of human heart enhancers for use in clinical whole-genome sequencing studies and highlights the importance of enhancers for cardiac function. PMID:27703156

  10. Genome-wide association studies to identify rice salt-tolerance markers.

    PubMed

    Patishtan, Juan; Hartley, Tom N; Fonseca de Carvalho, Raquel; Maathuis, Frans J M

    2018-05-01

    Salinity is an ever increasing menace that affects agriculture worldwide. Crops such as rice are salt sensitive, but its degree of susceptibility varies widely between cultivars pointing to extensive genetic diversity that can be exploited to identify genes and proteins that are relevant in the response of rice to salt stress. We used a diversity panel of 306 rice accessions and collected phenotypic data after short (6 h), medium (7 d) and long (30 d) salinity treatment (50 mm NaCl). A genome-wide association study (GWAS) was subsequently performed, which identified around 1200 candidate genes from many functional categories, but this was treatment period dependent. Further analysis showed the presence of cation transporters and transcription factors with a known role in salinity tolerance and those that hitherto were not known to be involved in salt stress. Localization analysis of single nucleotide polymorphisms (SNPs) showed the presence of several hundred non-synonymous SNPs (nsSNPs) in coding regions and earmarked specific genomic regions with increased numbers of nsSNPs. It points to components of the ubiquitination pathway as important sources of genetic diversity that could underpin phenotypic variation in stress tolerance. © 2017 John Wiley & Sons Ltd.

  11. Genome-wide association analysis of ischemic stroke in young adults.

    PubMed

    Cheng, Yu-Ching; O'Connell, Jeffrey R; Cole, John W; Stine, O Colin; Dueker, Nicole; McArdle, Patrick F; Sparks, Mary J; Shen, Jess; Laurie, Cathy C; Nelson, Sarah; Doheny, Kimberly F; Ling, Hua; Pugh, Elizabeth W; Brott, Thomas G; Brown, Robert D; Meschia, James F; Nalls, Michael; Rich, Stephen S; Worrall, Bradford; Anderson, Christopher D; Biffi, Alessandro; Cortellini, Lynelle; Furie, Karen L; Rost, Natalia S; Rosand, Jonathan; Manolio, Teri A; Kittner, Steven J; Mitchell, Braxton D

    2011-11-01

    Ischemic stroke (IS) is among the leading causes of death in Western countries. There is a significant genetic component to IS susceptibility, especially among young adults. To date, research to identify genetic loci predisposing to stroke has met only with limited success. We performed a genome-wide association (GWA) analysis of early-onset IS to identify potential stroke susceptibility loci. The GWA analysis was conducted by genotyping 1 million SNPs in a biracial population of 889 IS cases and 927 controls, ages 15-49 years. Genotypes were imputed using the HapMap3 reference panel to provide 1.4 million SNPs for analysis. Logistic regression models adjusting for age, recruitment stages, and population structure were used to determine the association of IS with individual SNPs. Although no single SNP reached genome-wide significance (P < 5 × 10(-8)), we identified two SNPs in chromosome 2q23.3, rs2304556 (in FMNL2; P = 1.2 × 10(-7)) and rs1986743 (in ARL6IP6; P = 2.7 × 10(-7)), strongly associated with early-onset stroke. These data suggest that a novel locus on human chromosome 2q23.3 may be associated with IS susceptibility among young adults.

  12. Genome-Wide Linkage, Exome Sequencing and Functional Analyses Identify ABCB6 as the Pathogenic Gene of Dyschromatosis Universalis Hereditaria

    PubMed Central

    Wang, Na; Wang, Chuan; Chen, Xuechao; Sheng, Donglai; Fu, Xi’an; See, Kelvin; Foo, Jia Nee; Low, Huiqi; Liany, Herty; Irwan, Ishak Darryl; Liu, Jian; Yang, Baoqi; Chen, Mingfei; Yu, Yongxiang; Yu, Gongqi; Niu, Guiye; You, Jiabao; Zhou, Yan; Ma, Shanshan; Wang, Ting; Yan, Xiaoxiao; Goh, Boon Kee; Common, John E. A.; Lane, Birgitte E.; Sun, Yonghu; Zhou, Guizhi; Lu, Xianmei; Wang, Zhenhua; Tian, Hongqing; Cao, Yuanhua; Chen, Shumin; Liu, Qiji; Liu, Jianjun; Zhang, Furen

    2014-01-01

    Background As a genetic disorder of abnormal pigmentation, the molecular basis of dyschromatosis universalis hereditaria (DUH) had remained unclear until recently when ABCB6 was reported as a causative gene of DUH. Methodology We performed genome-wide linkage scan using Illumina Human 660W-Quad BeadChip and exome sequencing analyses using Agilent SureSelect Human All Exon Kits in a multiplex Chinese DUH family to identify the pathogenic mutations and verified the candidate mutations using Sanger sequencing. Quantitative RT-PCR and Immunohistochemistry was performed to verify the expression of the pathogenic gene, Zebrafish was also used to confirm the functional role of ABCB6 in melanocytes and pigmentation. Results Genome-wide linkage (assuming autosomal dominant inheritance mode) and exome sequencing analyses identified ABCB6 as the disease candidate gene by discovering a coding mutation (c.1358C>T; p.Ala453Val) that co-segregates with the disease phenotype. Further mutation analysis of ABCB6 in four other DUH families and two sporadic cases by Sanger sequencing confirmed the mutation (c.1358C>T; p.Ala453Val) and discovered a second, co-segregating coding mutation (c.964A>C; p.Ser322Lys) in one of the four families. Both mutations were heterozygous in DUH patients and not present in the 1000 Genome Project and dbSNP database as well as 1,516 unrelated Chinese healthy controls. Expression analysis in human skin and mutagenesis interrogation in zebrafish confirmed the functional role of ABCB6 in melanocytes and pigmentation. Given the involvement of ABCB6 mutations in coloboma, we performed ophthalmological examination of the DUH carriers of ABCB6 mutations and found ocular abnormalities in them. Conclusion Our study has advanced our understanding of DUH pathogenesis and revealed the shared pathological mechanism between pigmentary DUH and ocular coloboma. PMID:24498303

  13. A Genome-Wide Association Study of Circulating Galectin-3

    PubMed Central

    van Veldhuisen, Dirk J.; Westra, Harm-Jan; Bakker, Stephan J. L.; Gansevoort, Ron T.; Muller Kobold, Anneke C.; van Gilst, Wiek H.; Franke, Lude

    2012-01-01

    Galectin-3 is a lectin involved in fibrosis, inflammation and proliferation. Increased circulating levels of galectin-3 have been associated with various diseases, including cancer, immunological disorders, and cardiovascular disease. To enhance our knowledge on galectin-3 biology we performed the first genome-wide association study (GWAS) using the Illumina HumanCytoSNP-12 array imputed with the HapMap 2 CEU panel on plasma galectin-3 levels in 3,776 subjects and follow-up genotyping in an additional 3,516 subjects. We identified 2 genome wide significant loci associated with plasma galectin-3 levels. One locus harbours the LGALS3 gene (rs2274273; P = 2.35×10−188) and the other locus the ABO gene (rs644234; P = 3.65×10−47). The variance explained by the LGALS3 locus was 25.6% and by the ABO locus 3.8% and jointly they explained 29.2%. Rs2274273 lies in high linkage disequilibrium with two non-synonymous SNPs (rs4644; r2 = 1.0, and rs4652; r2 = 0.91) and wet lab follow-up genotyping revealed that both are strongly associated with galectin-3 levels (rs4644; P = 4.97×10−465 and rs4652 P = 1.50×10−421) and were also associated with LGALS3 gene-expression. The origins of our associations should be further validated by means of functional experiments. PMID:23056639

  14. Nencki Genomics Database—Ensembl funcgen enhanced with intersections, user data and genome-wide TFBS motifs

    PubMed Central

    Krystkowiak, Izabella; Lenart, Jakub; Debski, Konrad; Kuterba, Piotr; Petas, Michal; Kaminska, Bozena; Dabrowski, Michal

    2013-01-01

    We present the Nencki Genomics Database, which extends the functionality of Ensembl Regulatory Build (funcgen) for the three species: human, mouse and rat. The key enhancements over Ensembl funcgen include the following: (i) a user can add private data, analyze them alongside the public data and manage access rights; (ii) inside the database, we provide efficient procedures for computing intersections between regulatory features and for mapping them to the genes. To Ensembl funcgen-derived data, which include data from ENCODE, we add information on conserved non-coding (putative regulatory) sequences, and on genome-wide occurrence of transcription factor binding site motifs from the current versions of two major motif libraries, namely, Jaspar and Transfac. The intersections and mapping to the genes are pre-computed for the public data, and the result of any procedure run on the data added by the users is stored back into the database, thus incrementally increasing the body of pre-computed data. As the Ensembl funcgen schema for the rat is currently not populated, our database is the first database of regulatory features for this frequently used laboratory animal. The database is accessible without registration using the mysql client: mysql –h database.nencki-genomics.org –u public. Registration is required only to add or access private data. A WSDL webservice provides access to the database from any SOAP client, including the Taverna Workbench with a graphical user interface. Database URL: http://www.nencki-genomics.org. PMID:24089456

  15. Optimization and quality control of genome-wide Hi-C library preparation.

    PubMed

    Zhang, Xiang-Yuan; He, Chao; Ye, Bing-Yu; Xie, De-Jian; Shi, Ming-Lei; Zhang, Yan; Shen, Wen-Long; Li, Ping; Zhao, Zhi-Hu

    2017-09-20

    Highest-throughput chromosome conformation capture (Hi-C) is one of the key assays for genome- wide chromatin interaction studies. It is a time-consuming process that involves many steps and many different kinds of reagents, consumables, and equipments. At present, the reproducibility is unsatisfactory. By optimizing the key steps of the Hi-C experiment, such as crosslinking, pretreatment of digestion, inactivation of restriction enzyme, and in situ ligation etc., we established a robust Hi-C procedure and prepared two biological replicates of Hi-C libraries from the GM12878 cells. After preliminary quality control by Sanger sequencing, the two replicates were high-throughput sequenced. The bioinformatics analysis of the raw sequencing data revealed the mapping-ability and pair-mate rate of the raw data were around 90% and 72%, respectively. Additionally, after removal of self-circular ligations and dangling-end products, more than 96% of the valid pairs were reached. Genome-wide interactome profiling shows clear topological associated domains (TADs), which is consistent with previous reports. Further correlation analysis showed that the two biological replicates strongly correlate with each other in terms of both bin coverage and all bin pairs. All these results indicated that the optimized Hi-C procedure is robust and stable, which will be very helpful for the wide applications of the Hi-C assay.

  16. Genome-Wide Analysis of the Arabidopsis Replication Timing Program1[OPEN

    PubMed Central

    Brooks, Ashley M.; Wheeler, Emily; LeBlanc, Chantal; Lee, Tae-Jin; Martienssen, Robert A.; Thompson, William F.

    2018-01-01

    Eukaryotes use a temporally regulated process, known as the replication timing program, to ensure that their genomes are fully and accurately duplicated during S phase. Replication timing programs are predictive of genomic features and activity and are considered to be functional readouts of chromatin organization. Although replication timing programs have been described for yeast and animal systems, much less is known about the temporal regulation of plant DNA replication or its relationship to genome sequence and chromatin structure. We used the thymidine analog, 5-ethynyl-2′-deoxyuridine, in combination with flow sorting and Repli-Seq to describe, at high-resolution, the genome-wide replication timing program for Arabidopsis (Arabidopsis thaliana) Col-0 suspension cells. We identified genomic regions that replicate predominantly during early, mid, and late S phase, and correlated these regions with genomic features and with data for chromatin state, accessibility, and long-distance interaction. Arabidopsis chromosome arms tend to replicate early while pericentromeric regions replicate late. Early and mid-replicating regions are gene-rich and predominantly euchromatic, while late regions are rich in transposable elements and primarily heterochromatic. However, the distribution of chromatin states across the different times is complex, with each replication time corresponding to a mixture of states. Early and mid-replicating sequences interact with each other and not with late sequences, but early regions are more accessible than mid regions. The replication timing program in Arabidopsis reflects a bipartite genomic organization with early/mid-replicating regions and late regions forming separate, noninteracting compartments. The temporal order of DNA replication within the early/mid compartment may be modulated largely by chromatin accessibility. PMID:29301956

  17. Genome-wide Analysis of Genetic Loci Associated with Alzheimer’s Disease

    PubMed Central

    Seshadri, Sudha; Fitzpatrick, Annette L.; Arfan Ikram, M; DeStefano, Anita L.; Gudnason, Vilmundur; Boada, Merce; Bis, Joshua C.; Smith, Albert V.; Carassquillo, Minerva M.; Charles Lambert, Jean; Harold, Denise; Schrijvers, Elisabeth M. C.; Ramirez-Lorca, Reposo; Debette, Stephanie; Longstreth, W.T.; Janssens, A. Cecile J.W.; Shane Pankratz, V.; Dartigues, Jean François; Hollingworth, Paul; Aspelund, Thor; Hernandez, Isabel; Beiser, Alexa; Kuller, Lewis H.; Koudstaal, Peter J.; Dickson, Dennis W.; Tzourio, Christophe; Abraham, Richard; Antunez, Carmen; Du, Yangchun; Rotter, Jerome I.; Aulchenko, Yurii S.; Harris, Tamara B.; Petersen, Ronald C.; Berr, Claudine; Owen, Michael J.; Lopez-Arrieta, Jesus; Varadarajan, Badri N.; Becker, James T.; Rivadeneira, Fernando; Nalls, Michael A.; Graff-Radford, Neill R.; Campion, Dominique; Auerbach, Sanford; Rice, Kenneth; Hofman, Albert; Jonsson, Palmi V.; Schmidt, Helena; Lathrop, Mark; Mosley, Thomas H.; Au, Rhoda; Psaty, Bruce M.; Uitterlinden, Andre G.; Farrer, Lindsay A.; Lumley, Thomas; Ruiz, Agustin; Williams, Julie; Amouyel, Philippe; Younkin, Steve G.; Wolf, Philip A.; Launer, Lenore J.; Lopez, Oscar L.; van Duijn, Cornelia M.; Breteler, Monique M. B.

    2010-01-01

    Context Genome wide association studies (GWAS) have recently identified CLU, PICALM and CR1 as novel genes for late-onset Alzheimer’s disease (AD). Objective In a three-stage analysis of new and previously published GWAS on over 35000 persons (8371 AD cases), we sought to identify and strengthen additional loci associated with AD and confirm these in an independent sample. We also examined the contribution of recently identified genes to AD risk prediction. Design, Setting, and Participants We identified strong genetic associations (p<10−3) in a Stage 1 sample of 3006 AD cases and 14642 controls by combining new data from the population-based Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) consortium (1367 AD cases (973 incident)) with previously reported results from the Translational Genomics Research Institute (TGEN) and Mayo AD GWAS. We identified 2708 single nucleotide polymorphisms (SNPs) with p-values<10−3, and in Stage 2 pooled results for these SNPs with the European AD Initiative (2032 cases, 5328 controls) to identify ten loci with p-values<10−5. In Stage 3, we combined data for these ten loci with data from the Genetic and Environmental Risk in AD consortium (3333 cases, 6995 controls) to identify four SNPs with a p-value<1.7×10−8. These four SNPs were replicated in an independent Spanish sample (1140 AD cases and 1209 controls). Main outcome measure Alzheimer’s Disease. Results We showed genome-wide significance for two new loci: rs744373 near BIN1 (OR:1.13; 95%CI:1.06–1.21 per copy of the minor allele; p=1.6×10−11) and rs597668 near EXOC3L2/BLOC1S3/MARK4 (OR:1.18; 95%CI1.07–1.29; p=6.5×10−9). Associations of CLU, PICALM, BIN1 and EXOC3L2 with AD were confirmed in the Spanish sample (p<0.05). However, CLU and PICALM did not improve incident AD prediction beyond age, sex, and APOE (improvement in area under receiver-operating-characteristic curve <0.003). Conclusions Two novel genetic loci for AD are reported

  18. SUSCEPTIBILITY LOCI FOR UMBILICAL HERNIA IN SWINE DETECTED BY GENOME-WIDE ASSOCIATION.

    PubMed

    Liao, X J; Lia, L; Zhang, Z Y; Long, Y; Yang, B; Ruan, G R; Su, Y; Ai, H S; Zhang, W C; Deng, W Y; Xiao, S J; Ren, J; Ding, N S; Huang, L S

    2015-10-01

    Umbilical hernia (UH) is a complex disorder caused by both genetic and environmental factors. UH brings animal welfare problems and severe economic loss to the pig industry. Until now, the genetic basis of UH is poorly understood. The high-density 60K porcine SNP array enables the rapid application of genome-wide association study (GWAS) to identify genetic loci for phenotypic traits at genome wide scale in pigs. The objective of this research was to identify susceptibility loci for swine umbilical hernia using the GWAS approach. We genotyped 478 piglets from 142 families representing three Western commercial breeds with the Illumina PorcineSNP60 BeadChip. Then significant SNPs were detected by GWAS using ROADTRIPS (Robust Association-Detection Test for Related Individuals with Population Substructure) software base on a Bonferroni corrected threshold (P = 1.67E-06) or suggestive threshold (P = 3.34E-05) and false discovery rate (FDR = 0.05). After quality control, 29,924 qualified SNPs and 472 piglets were used for GWAS. Two suggestive loci predisposing to pig UH were identified at 44.25MB on SSC2 (rs81358018, P = 3.34E-06, FDR = 0.049933) and at 45.90MB on SSC17 (rs81479278, P = 3.30E-06, FDR = 0.049933) in Duroc population, respectively. And no SNP was detected to be associated with pig UH at significant level in neither Landrace nor Large White population. Furthermore, we carried out a meta-analysis in the combined pure-breed population containing all the 472 piglets. rs81479278 (P = 1.16E-06, FDR = 0.022475) was identified to associate with pig UH at genome-wide significant level. SRC was characterized as plausible candidate gene for susceptibility to pig UH according to its genomic position and biological functions. To our knowledge, this study gives the first description of GWAS identifying susceptibility loci for umbilical hernia in pigs. Our findings provide deeper insights to the genetic architecture of umbilical hernia in pigs.

  19. Genetic dissection of Al tolerance QTLs in the maize genome by high density SNP scan

    USDA-ARS?s Scientific Manuscript database

    Aluminum (Al) toxicity is an important limitation to food security in the tropical and subtropical regions. High Al saturation in acid soils limits root development and its ability to uptake water and nutrients. In this study, we present a genome scan for Al tolerance loci with over 50,000 GBS-based...

  20. Genome-Wide Identification and Transferability of Microsatellite Markers between Palmae Species

    PubMed Central

    Xiao, Yong; Xia, Wei; Ma, Jianwei; Mason, Annaliese S.; Fan, Haikuo; Shi, Peng; Lei, Xintao; Ma, Zilong; Peng, Ming

    2016-01-01

    The Palmae family contains 202 genera and approximately 2800 species. Except for Elaeis guineensis and Phoenix dactylifera, almost no genetic and genomic information is available for Palmae species. Therefore, this is an obstacle to the conservation and genetic assessment of Palmae species, especially those that are currently endangered. The study was performed to develop a large number of microsatellite markers which can be used for genetic analysis in different Palmae species. Based on the assembled genome of E. guineensis and P. dactylifera, a total of 814 383 and 371 629 microsatellites were identified. Among these microsatellites identified in E. guineensis, 734 509 primer pairs could be designed from the flanking sequences of these microsatellites. The majority (618 762) of these designed primer pairs had in silico products in the genome of E. guineensis. These 618 762 primer pairs were subsequently used to in silico amplify the genome of P. dactylifera. A total of 7 265 conserved microsatellites were identified between E. guineensis and P. dactylifera. One hundred and thirty-five primer pairs flanking the conserved SSRs were stochastically selected and validated to have high cross-genera transferability, varying from 16.7 to 93.3% with an average of 73.7%. These genome-wide conserved microsatellite markers will provide a useful tool for genetic assessment and conservation of different Palmae species in the future. PMID:27826307

  1. Genome Scan Meta-Analysis of Schizophrenia and Bipolar Disorder, Part II: Schizophrenia

    PubMed Central

    Lewis, Cathryn M.; Levinson, Douglas F.; Wise, Lesley H.; DeLisi, Lynn E.; Straub, Richard E.; Hovatta, Iiris; Williams, Nigel M.; Schwab, Sibylle G.; Pulver, Ann E.; Faraone, Stephen V.; Brzustowicz, Linda M.; Kaufmann, Charles A.; Garver, David L.; Gurling, Hugh M. D.; Lindholm, Eva; Coon, Hilary; Moises, Hans W.; Byerley, William; Shaw, Sarah H.; Mesen, Andrea; Sherrington, Robin; O’Neill, F. Anthony; Walsh, Dermot; Kendler, Kenneth S.; Ekelund, Jesper; Paunio, Tiina; Lönnqvist, Jouko; Peltonen, Leena; O’Donovan, Michael C.; Owen, Michael J.; Wildenauer, Dieter B.; Maier, Wolfgang; Nestadt, Gerald; Blouin, Jean-Louis; Antonarakis, Stylianos E.; Mowry, Bryan J.; Silverman, Jeremy M.; Crowe, Raymond R.; Cloninger, C. Robert; Tsuang, Ming T.; Malaspina, Dolores; Harkavy-Friedman, Jill M.; Svrakic, Dragan M.; Bassett, Anne S.; Holcomb, Jennifer; Kalsi, Gursharan; McQuillin, Andrew; Brynjolfson, Jon; Sigmundsson, Thordur; Petursson, Hannes; Jazin, Elena; Zoëga, Tomas; Helgason, Tomas

    2003-01-01

    Schizophrenia is a common disorder with high heritability and a 10-fold increase in risk to siblings of probands. Replication has been inconsistent for reports of significant genetic linkage. To assess evidence for linkage across studies, rank-based genome scan meta-analysis (GSMA) was applied to data from 20 schizophrenia genome scans. Each marker for each scan was assigned to 1 of 120 30-cM bins, with the bins ranked by linkage scores (1 = most significant) and the ranks averaged across studies (Ravg) and then weighted for sample size (\\documentclass[12pt]{minimal} \\usepackage{amsmath} \\usepackage{wasysym} \\usepackage{amsfonts} \\usepackage{amssymb} \\usepackage{amsbsy} \\usepackage{mathrsfs} \\setlength{\\oddsidemargin}{-69pt} \\begin{document} \\begin{equation*}\\sqrt{N[affected cases]}\\end{equation*}\\end{document}). A permutation test was used to compute the probability of observing, by chance, each bin’s average rank (PAvgRnk) or of observing it for a bin with the same place (first, second, etc.) in the order of average ranks in each permutation (Pord). The GSMA produced significant genomewide evidence for linkage on chromosome 2q (PAvgRnk<.000417). Two aggregate criteria for linkage were also met (clusters of nominally significant P values that did not occur in 1,000 replicates of the entire data set with no linkage present): 12 consecutive bins with both PAvgRnk and Pord<.05, including regions of chromosomes 5q, 3p, 11q, 6p, 1q, 22q, 8p, 20q, and 14p, and 19 consecutive bins with Pord<.05, additionally including regions of chromosomes 16q, 18q, 10p, 15q, 6q, and 17q. There is greater consistency of linkage results across studies than has been previously recognized. The results suggest that some or all of these regions contain loci that increase susceptibility to schizophrenia in diverse populations. PMID:12802786

  2. A genome-wide association study platform built on iPlant cyber-infrastructure

    USDA-ARS?s Scientific Manuscript database

    We demonstrated a flexible Genome-Wide Association (GWA) Study (GWAS) platform built upon the iPlant Collaborative Cyber-infrastructure. The platform supports big data management, sharing, and large scale study of both genotype and phenotype data on clusters. End users can add their own analysis too...

  3. Genome-wide association study of type 2 diabetes in a sample from Mexico City and a meta-analysis of a Mexican-American sample from Starr County, Texas

    PubMed Central

    Parra, E. J.; Below, J. E.; Krithika, S.; Valladares, A.; Barta, J. L.; Cox, N. J.; Hanis, C. L.; Wacher, N.; Garcia-Mena, J.; Hu, P.; Shriver, M. D.; Kumate, J.; McKeigue, P. M.; Escobedo, J.; Cruz, M.

    2013-01-01

    Aims/hypothesis We report a genome-wide association study of type 2 diabetes in an admixed sample from Mexico City and describe the results of a meta-analysis of this study and another genome-wide scan in a Mexican-American sample from Starr County, TX, USA. The top signals observed in this meta-analysis were followed up in the Diabetes Genetics Replication and Meta-analysis Consortium (DIAGRAM) and DIAGRAM+ datasets. Methods We analysed 967 cases and 343 normoglycaemic controls. The samples were genotyped with the Affymetrix Genome-wide Human SNP array 5.0. Associations of genotyped and imputed markers with type 2 diabetes were tested using a missing data likelihood score test. A fixed-effects meta-analysis including 1,804 cases and 780 normoglycaemic controls was carried out by weighting the effect estimates by their inverse variances. Results In the meta-analysis of the two Hispanic studies, markers showing suggestive associations (p<10−5) were identified in two known diabetes genes, HNF1A and KCNQ1, as well as in several additional regions. Meta-analysis of the two Hispanic studies and the recent DIAGRAM+ dataset identified genome-wide significant signals (p<5×10−8) within or near the genes HNF1A and CDKN2A/CDKN2B, as well as suggestive associations in three additional regions, IGF2BP2, KCNQ1 and the previously unreported C14orf70. Conclusions/interpretation We observed numerous regions with suggestive associations with type 2 diabetes. Some of these signals correspond to regions described in previous studies. However, many of these regions could not be replicated in the DIAGRAM datasets. It is critical to carry out additional studies in Hispanic and American Indian populations, which have a high prevalence of type 2 diabetes. PMID:21573907

  4. Genome-Wide Analysis of A-to-I RNA Editing.

    PubMed

    Savva, Yiannis A; Laurent, Georges St; Reenan, Robert A

    2016-01-01

    Adenosine (A)-to-inosine (I) RNA editing is a fundamental posttranscriptional modification that ensures the deamination of A-to-I in double-stranded (ds) RNA molecules. Intriguingly, the A-to-I RNA editing system is particularly active in the nervous system of higher eukaryotes, altering a plethora of noncoding and coding sequences. Abnormal RNA editing is highly associated with many neurological phenotypes and neurodevelopmental disorders. However, the molecular mechanisms underlying RNA editing-mediated pathogenesis still remain enigmatic and have attracted increasing attention from researchers. Over the last decade, methods available to perform genome-wide transcriptome analysis, have evolved rapidly. Within the RNA editing field researchers have adopted next-generation sequencing technologies to identify RNA-editing sites within genomes and to elucidate the underlying process. However, technical challenges associated with editing site discovery have hindered efforts to uncover comprehensive editing site datasets, resulting in the general perception that the collections of annotated editing sites represent only a small minority of the total number of sites in a given organism, tissue, or cell type of interest. Additionally to doubts about sensitivity, existing RNA-editing site lists often contain high percentages of false positives, leading to uncertainty about their validity and usefulness in downstream studies. An accurate investigation of A-to-I editing requires properly validated datasets of editing sites with demonstrated and transparent levels of sensitivity and specificity. Here, we describe a high signal-to-noise method for RNA-editing site detection using single-molecule sequencing (SMS). With this method, authentic RNA-editing sites may be differentiated from artifacts. Machine learning approaches provide a procedure to improve upon and experimentally validate sequencing outcomes through use of computationally predicted, iterative feedback loops

  5. Identification of Susceptible Loci and Enriched Pathways for Bipolar II Disorder Using Genome-Wide Association Studies.

    PubMed

    Kao, Chung-Feng; Chen, Hui-Wen; Chen, Hsi-Chung; Yang, Jenn-Hwai; Huang, Ming-Chyi; Chiu, Yi-Hang; Lin, Shih-Ku; Lee, Ya-Chin; Liu, Chih-Min; Chuang, Li-Chung; Chen, Chien-Hsiun; Wu, Jer-Yuarn; Lu, Ru-Band; Kuo, Po-Hsiu

    2016-12-01

    This study aimed to identify susceptible loci and enriched pathways for bipolar disorder subtype II. We conducted a genome-wide association scan in discovery samples with 189 bipolar disorder subtype II patients and 1773 controls, and replication samples with 283 bipolar disorder subtype II patients and 500 controls in a Taiwanese Han population using Affymetrix Axiom Genome-Wide CHB1 Array. We performed single-marker and gene-based association analyses, as well as calculated polygeneic risk scores for bipolar disorder subtype II. Pathway enrichment analyses were employed to reveal significant biological pathways. Seven markers were found to be associated with bipolar disorder subtype II in meta-analysis combining both discovery and replication samples (P<5.0×10 -6 ), including markers in or close to MYO16, HSP90AB3P, noncoding gene LOC100507632, and markers in chromosomes 4 and 10. A novel locus, ETF1, was associated with bipolar disorder subtype II (P<6.0×10 -3 ) in gene-based association tests. Results of risk evaluation demonstrated that higher genetic risk scores were able to distinguish bipolar disorder subtype II patients from healthy controls in both discovery (P=3.9×10 -4 ~1.0×10 -3 ) and replication samples (2.8×10 -4 ~1.7×10 -3 ). Genetic variance explained by chip markers for bipolar disorder subtype II was substantial in the discovery (55.1%) and replication (60.5%) samples. Moreover, pathways related to neurodevelopmental function, signal transduction, neuronal system, and cell adhesion molecules were significantly associated with bipolar disorder subtype II. We reported novel susceptible loci for pure bipolar subtype II disorder that is less addressed in the literature. Future studies are needed to confirm the roles of these loci for bipolar disorder subtype II. © The Author 2016. Published by Oxford University Press on behalf of CINP.

  6. Efficiently Identifying Significant Associations in Genome-wide Association Studies

    PubMed Central

    Eskin, Eleazar

    2013-01-01

    Abstract Over the past several years, genome-wide association studies (GWAS) have implicated hundreds of genes in common disease. More recently, the GWAS approach has been utilized to identify regions of the genome that harbor variation affecting gene expression or expression quantitative trait loci (eQTLs). Unlike GWAS applied to clinical traits, where only a handful of phenotypes are analyzed per study, in eQTL studies, tens of thousands of gene expression levels are measured, and the GWAS approach is applied to each gene expression level. This leads to computing billions of statistical tests and requires substantial computational resources, particularly when applying novel statistical methods such as mixed models. We introduce a novel two-stage testing procedure that identifies all of the significant associations more efficiently than testing all the single nucleotide polymorphisms (SNPs). In the first stage, a small number of informative SNPs, or proxies, across the genome are tested. Based on their observed associations, our approach locates the regions that may contain significant SNPs and only tests additional SNPs from those regions. We show through simulations and analysis of real GWAS datasets that the proposed two-stage procedure increases the computational speed by a factor of 10. Additionally, efficient implementation of our software increases the computational speed relative to the state-of-the-art testing approaches by a factor of 75. PMID:24033261

  7. Pernicious plans revealed: Plasmodium falciparum genome wide expression analysis.

    PubMed

    Llinás, Manuel; DeRisi, Joseph L

    2004-08-01

    The asexual intraerythrocytic developmental cycle (IDC) of Plasmodium falciparum is responsible for the majority of the clinical manifestations of malaria in humans. Although malaria has been studied for over a century, the elucidation of the full genome sequence of P. falciparum has now allowed for in-depth studies of gene expression throughout the entire intraerythrocytic stage. As the mainstays of anti-malarial chemotherapy become increasingly ineffective, we need a deeper understanding of fundamental plasmodial bioregulatory mechanisms to successfully subvert them. Recent gene expression studies have begun to examine different aspects of the IDC and are providing key insights into the basic mechanisms of Plasmodium gene regulation and are helping to define gene functions. However, to date, no transcription factor has been fully characterized from Plasmodium and the definitive identification of cis-acting regulatory elements along with their corresponding trans-acting partners is still lacking. The characterization of the transcriptome of P. falciparum is the first major step towards the understanding of the genome wide regulation of gene expression in this parasite. IDC expression data for almost every gene in the P. falciparum genome can now be publicly queried at and. The results of these studies suggest promising leads for identifying novel targets for anti-malarial therapeutics and vaccines in addition to providing a solid foundation for the ongoing elucidation of plasmodial gene expression.

  8. Genome-Wide Association Study of Seed Dormancy and the Genomic Consequences of Improvement Footprints in Rice (Oryza sativa L.)

    PubMed Central

    Lu, Qing; Niu, Xiaojun; Zhang, Mengchen; Wang, Caihong; Xu, Qun; Feng, Yue; Yang, Yaolong; Wang, Shan; Yuan, Xiaoping; Yu, Hanyong; Wang, Yiping; Chen, Xiaoping; Liang, Xuanqiang; Wei, Xinghua

    2018-01-01

    Seed dormancy is an important agronomic trait affecting grain yield and quality because of pre-harvest germination and is influenced by both environmental and genetic factors. However, our knowledge of the factors controlling seed dormancy remains limited. To better reveal the molecular mechanism underlying this trait, a genome-wide association study was conducted in an indica-only population consisting of 453 accessions genotyped using 5,291 SNPs. Nine known and new significant SNPs were identified on eight chromosomes. These lead SNPs explained 34.9% of the phenotypic variation, and four of them were designed as dCAPS markers in the hope of accelerating molecular breeding. Moreover, a total of 212 candidate genes was predicted and eight candidate genes showed plant tissue-specific expression in expression profile data from different public bioinformatics databases. In particular, LOC_Os03g10110, which had a maize homolog involved in embryo development, was identified as a candidate regulator for further biological function investigations. Additionally, a polymorphism information content ratio method was used to screen improvement footprints and 27 selective sweeps were identified, most of which harbored domestication-related genes. Further studies suggested that three significant SNPs were adjacent to the candidate selection signals, supporting the accuracy of our genome-wide association study (GWAS) results. These findings show that genome-wide screening for selective sweeps can be used to identify new improvement-related DNA regions, although the phenotypes are unknown. This study enhances our knowledge of the genetic variation in seed dormancy, and the new dormancy-associated SNPs will provide real benefits in molecular breeding. PMID:29354150

  9. Genome-wide scan for commons SNPs affecting bovine leukemia virus infection level in dairy cattle.

    PubMed

    Carignano, Hugo A; Roldan, Dana L; Beribe, María J; Raschia, María A; Amadio, Ariel; Nani, Juan P; Gutierrez, Gerónimo; Alvarez, Irene; Trono, Karina; Poli, Mario A; Miretti, Marcos M

    2018-02-13

    Bovine leukemia virus (BLV) infection is omnipresent in dairy herds causing direct economic losses due to trade restrictions and lymphosarcoma-related deaths. Milk production drops and increase in the culling rate are also relevant and usually neglected. The BLV provirus persists throughout a lifetime and an inter-individual variation is observed in the level of infection (LI) in vivo. High LI is strongly correlated with disease progression and BLV transmission among herd mates. In a context of high prevalence, classical control strategies are economically prohibitive. Alternatively, host genomics studies aiming to dissect loci associated with LI are potentially useful tools for genetic selection programs tending to abrogate the viral spreading. The LI was measured through the proviral load (PVL) set-point and white blood cells (WBC) counts. The goals of this work were to gain insight into the contribution of SNPs (bovine 50KSNP panel) on LI variability and to identify genomics regions underlying this trait. We quantified anti-p24 response and total leukocytes count in peripheral blood from 1800 cows and used these to select 800 individuals with extreme phenotypes in WBCs and PVL. Two case-control genomic association studies using linear mixed models (LMMs) considering population stratification were performed. The proportion of the variance captured by all QC-passed SNPs represented 0.63 (SE ± 0.14) of the phenotypic variance for PVL and 0.56 (SE ± 0.15) for WBCs. Overall, significant associations (Bonferroni's corrected -log 10 p > 5.94) were shared for both phenotypes by 24 SNPs within the Bovine MHC. Founder haplotypes were used to measure the linkage disequilibrium (LD) extent (r 2  = 0.22 ± 0.27 at inter-SNP distance of 25-50 kb). The SNPs and LD blocks indicated genes potentially associated with LI in infected cows: i.e. relevant immune response related genes (DQA1, DRB3, BOLA-A, LTA, LTB, TNF, IER3, GRP111, CRISP1), several genes

  10. Genome-Wide Structural Variation Detection by Genome Mapping on Nanochannel Arrays.

    PubMed

    Mak, Angel C Y; Lai, Yvonne Y Y; Lam, Ernest T; Kwok, Tsz-Piu; Leung, Alden K Y; Poon, Annie; Mostovoy, Yulia; Hastie, Alex R; Stedman, William; Anantharaman, Thomas; Andrews, Warren; Zhou, Xiang; Pang, Andy W C; Dai, Heng; Chu, Catherine; Lin, Chin; Wu, Jacob J K; Li, Catherine M L; Li, Jing-Woei; Yim, Aldrin K Y; Chan, Saki; Sibert, Justin; Džakula, Željko; Cao, Han; Yiu, Siu-Ming; Chan, Ting-Fung; Yip, Kevin Y; Xiao, Ming; Kwok, Pui-Yan

    2016-01-01

    Comprehensive whole-genome structural variation detection is challenging with current approaches. With diploid cells as DNA source and the presence of numerous repetitive elements, short-read DNA sequencing cannot be used to detect structural variation efficiently. In this report, we show that genome mapping with long, fluorescently labeled DNA molecules imaged on nanochannel arrays can be used for whole-genome structural variation detection without sequencing. While whole-genome haplotyping is not achieved, local phasing (across >150-kb regions) is routine, as molecules from the parental chromosomes are examined separately. In one experiment, we generated genome maps from a trio from the 1000 Genomes Project, compared the maps against that derived from the reference human genome, and identified structural variations that are >5 kb in size. We find that these individuals have many more structural variants than those published, including some with the potential of disrupting gene function or regulation. Copyright © 2016 by the Genetics Society of America.

  11. Genome-Wide Association Study of Receptive Language Ability of 12-Year-Olds

    ERIC Educational Resources Information Center

    Harlaar, Nicole; Meaburn, Emma L.; Hayiou-Thomas, Marianna E.; Davis, Oliver S. P.; Docherty, Sophia; Hanscombe, Ken B.; Haworth, Claire M. A.; Price, Thomas S.; Trzaskowski, Maciej; Dale, Philip S.; Plomin, Robert

    2014-01-01

    Purpose: Researchers have previously shown that individual differences in measures of receptive language ability at age 12 are highly heritable. In the current study, the authors attempted to identify some of the genes responsible for the heritability of receptive language ability using a "genome-wide association" approach. Method: The…

  12. Genome-wide significant predictors of metabolites in the one-carbon metabolism pathway

    USDA-ARS?s Scientific Manuscript database

    Low plasma B-vitamin levels and elevated homocysteine have been associated with cancer, cardiovascular disease, and neurodegenerative disorders. Common variants in FUT2 on chromosome 19q13 were associated with plasma vitamin B12 levels among women in a genome-wide association study (GWAS) in the Nur...

  13. Genome-Wide Polygenic Scores Predict Reading Performance throughout the School Years

    ERIC Educational Resources Information Center

    Selzam, Saskia; Dale, Philip S.; Wagner, Richard K.; DeFries, John C.; Cederlöf, Martin; O'Reilly, Paul F.; Krapohl, Eva; Plomin, Robert

    2017-01-01

    It is now possible to create individual-specific genetic scores, called genome-wide polygenic scores (GPS). We used a GPS for years of education ("EduYears") to predict reading performance assessed at UK National Curriculum Key Stages 1 (age 7), 2 (age 12) and 3 (age 14) and on reading tests administered at ages 7 and 12 in a UK sample…

  14. Genome-wide association study with 1000 genomes imputation identifies signals for nine sex hormone-related phenotypes.

    PubMed

    Ruth, Katherine S; Campbell, Purdey J; Chew, Shelby; Lim, Ee Mun; Hadlow, Narelle; Stuckey, Bronwyn G A; Brown, Suzanne J; Feenstra, Bjarke; Joseph, John; Surdulescu, Gabriela L; Zheng, Hou Feng; Richards, J Brent; Murray, Anna; Spector, Tim D; Wilson, Scott G; Perry, John R B

    2016-02-01

    Genetic factors contribute strongly to sex hormone levels, yet knowledge of the regulatory mechanisms remains incomplete. Genome-wide association studies (GWAS) have identified only a small number of loci associated with sex hormone levels, with several reproductive hormones yet to be assessed. The aim of the study was to identify novel genetic variants contributing to the regulation of sex hormones. We performed GWAS using genotypes imputed from the 1000 Genomes reference panel. The study used genotype and phenotype data from a UK twin register. We included 2913 individuals (up to 294 males) from the Twins UK study, excluding individuals receiving hormone treatment. Phenotypes were standardised for age, sex, BMI, stage of menstrual cycle and menopausal status. We tested 7,879,351 autosomal SNPs for association with levels of dehydroepiandrosterone sulphate (DHEAS), oestradiol, free androgen index (FAI), follicle-stimulating hormone (FSH), luteinizing hormone (LH), prolactin, progesterone, sex hormone-binding globulin and testosterone. Eight independent genetic variants reached genome-wide significance (P<5 × 10(-8)), with minor allele frequencies of 1.3-23.9%. Novel signals included variants for progesterone (P=7.68 × 10(-12)), oestradiol (P=1.63 × 10(-8)) and FAI (P=1.50 × 10(-8)). A genetic variant near the FSHB gene was identified which influenced both FSH (P=1.74 × 10(-8)) and LH (P=3.94 × 10(-9)) levels. A separate locus on chromosome 7 was associated with both DHEAS (P=1.82 × 10(-14)) and progesterone (P=6.09 × 10(-14)). This study highlights loci that are relevant to reproductive function and suggests overlap in the genetic basis of hormone regulation.

  15. Genome-Wide Mutagenesis in Borrelia burgdorferi.

    PubMed

    Lin, Tao; Gao, Lihui

    2018-01-01

    population of mutants with different tags, after recovered from different tissues of infected mice and ticks, mutants from output pool and input pool are detected using high-throughput, semi-quantitative Luminex ® FLEXMAP™ or next-generation sequencing (Tn-seq) technologies. Thus far, we have created a high-density, sequence-defined transposon library of over 6600 STM mutants for the efficient genome-wide investigation of genes and gene products required for wild-type pathogenesis, host-pathogen interactions, in vitro growth, in vivo survival, physiology, morphology, chemotaxis, motility, structure, metabolism, gene regulation, plasmid maintenance and replication, etc. The insertion sites of 4480 transposon mutants have been determined. About 800 predicted protein-encoding genes in the genome were disrupted in the STM transposon library. The infectivity and some functions of 800 mutants in 500 genes have been determined. Analysis of these transposon mutants has yielded valuable information regarding the genes and gene products important in the pathogenesis and biology of B. burgdorferi and its tick vectors.

  16. Fluorescence Reporter-Based Genome-Wide RNA Interference Screening to Identify Alternative Splicing Regulators.

    PubMed

    Misra, Ashish; Green, Michael R

    2017-01-01

    Alternative splicing is a regulated process that leads to inclusion or exclusion of particular exons in a pre-mRNA transcript, resulting in multiple protein isoforms being encoded by a single gene. With more than 90 % of human genes known to undergo alternative splicing, it represents a major source for biological diversity inside cells. Although in vitro splicing assays have revealed insights into the mechanisms regulating individual alternative splicing events, our global understanding of alternative splicing regulation is still evolving. In recent years, genome-wide RNA interference (RNAi) screening has transformed biological research by enabling genome-scale loss-of-function screens in cultured cells and model organisms. In addition to resulting in the identification of new cellular pathways and potential drug targets, these screens have also uncovered many previously unknown mechanisms regulating alternative splicing. Here, we describe a method for the identification of alternative splicing regulators using genome-wide RNAi screening, as well as assays for further validation of the identified candidates. With modifications, this method can also be adapted to study the splicing regulation of pre-mRNAs that contain two or more splice isoforms.

  17. Genome-wide evidence for divergent selection between populations of a major agricultural pathogen.

    PubMed

    Hartmann, Fanny E; McDonald, Bruce A; Croll, Daniel

    2018-06-01

    The genetic and environmental homogeneity in agricultural ecosystems is thought to impose strong and uniform selection pressures. However, the impact of this selection on plant pathogen genomes remains largely unknown. We aimed to identify the proportion of the genome and the specific gene functions under positive selection in populations of the fungal wheat pathogen Zymoseptoria tritici. First, we performed genome scans in four field populations that were sampled from different continents and on distinct wheat cultivars to test which genomic regions are under recent selection. Based on extended haplotype homozygosity and composite likelihood ratio tests, we identified 384 and 81 selective sweeps affecting 4% and 0.5% of the 35 Mb core genome, respectively. We found differences both in the number and the position of selective sweeps across the genome between populations. Using a XtX-based outlier detection approach, we identified 51 extremely divergent genomic regions between the allopatric populations, suggesting that divergent selection led to locally adapted pathogen populations. We performed an outlier detection analysis between two sympatric populations infecting two different wheat cultivars to identify evidence for host-driven selection. Selective sweep regions harboured genes that are likely to play a role in successfully establishing host infections. We also identified secondary metabolite gene clusters and an enrichment in genes encoding transporter and protein localization functions. The latter gene functions mediate responses to environmental stress, including interactions with the host. The distinct gene functions under selection indicate that both local host genotypes and abiotic factors contributed to local adaptation. © 2018 The Authors. Molecular Ecology Published by John Wiley & Sons Ltd.

  18. Haplotype-Based Genome-Wide Prediction Models Exploit Local Epistatic Interactions Among Markers

    PubMed Central

    Jiang, Yong; Schmidt, Renate H.; Reif, Jochen C.

    2018-01-01

    Genome-wide prediction approaches represent versatile tools for the analysis and prediction of complex traits. Mostly they rely on marker-based information, but scenarios have been reported in which models capitalizing on closely-linked markers that were combined into haplotypes outperformed marker-based models. Detailed comparisons were undertaken to reveal under which circumstances haplotype-based genome-wide prediction models are superior to marker-based models. Specifically, it was of interest to analyze whether and how haplotype-based models may take local epistatic effects between markers into account. Assuming that populations consisted of fully homozygous individuals, a marker-based model in which local epistatic effects inside haplotype blocks were exploited (LEGBLUP) was linearly transformable into a haplotype-based model (HGBLUP). This theoretical derivation formally revealed that haplotype-based genome-wide prediction models capitalize on local epistatic effects among markers. Simulation studies corroborated this finding. Due to its computational efficiency the HGBLUP model promises to be an interesting tool for studies in which ultra-high-density SNP data sets are studied. Applying the HGBLUP model to empirical data sets revealed higher prediction accuracies than for marker-based models for both traits studied using a mouse panel. In contrast, only a small subset of the traits analyzed in crop populations showed such a benefit. Cases in which higher prediction accuracies are observed for HGBLUP than for marker-based models are expected to be of immediate relevance for breeders, due to the tight linkage a beneficial haplotype will be preserved for many generations. In this respect the inheritance of local epistatic effects very much resembles the one of additive effects. PMID:29549092

  19. Genome-wide survey and expression analysis of F-box genes in chickpea.

    PubMed

    Gupta, Shefali; Garg, Vanika; Kant, Chandra; Bhatia, Sabhyata

    2015-02-13

    The F-box genes constitute one of the largest gene families in plants involved in degradation of cellular proteins. F-box proteins can recognize a wide array of substrates and regulate many important biological processes such as embryogenesis, floral development, plant growth and development, biotic and abiotic stress, hormonal responses and senescence, among others. However, little is known about the F-box genes in the important legume crop, chickpea. The available draft genome sequence of chickpea allowed us to conduct a genome-wide survey of the F-box gene family in chickpea. A total of 285 F-box genes were identified in chickpea which were classified based on their C-terminal domain structures into 10 subfamilies. Thirteen putative novel motifs were also identified in F-box proteins with no known functional domain at their C-termini. The F-box genes were physically mapped on the 8 chickpea chromosomes and duplication events were investigated which revealed that the F-box gene family expanded largely due to tandem duplications. Phylogenetic analysis classified the chickpea F-box genes into 9 clusters. Also, maximum syntenic relationship was observed with soybean followed by Medicago truncatula, Lotus japonicus and Arabidopsis. Digital expression analysis of F-box genes in various chickpea tissues as well as under abiotic stress conditions utilizing the available chickpea transcriptome data revealed differential expression patterns with several F-box genes specifically expressing in each tissue, few of which were validated by using quantitative real-time PCR. The genome-wide analysis of chickpea F-box genes provides new opportunities for characterization of candidate F-box genes and elucidation of their function in growth, development and stress responses for utilization in chickpea improvement.

  20. Haplotype-Based Genome-Wide Prediction Models Exploit Local Epistatic Interactions Among Markers.

    PubMed

    Jiang, Yong; Schmidt, Renate H; Reif, Jochen C

    2018-05-04

    Genome-wide prediction approaches represent versatile tools for the analysis and prediction of complex traits. Mostly they rely on marker-based information, but scenarios have been reported in which models capitalizing on closely-linked markers that were combined into haplotypes outperformed marker-based models. Detailed comparisons were undertaken to reveal under which circumstances haplotype-based genome-wide prediction models are superior to marker-based models. Specifically, it was of interest to analyze whether and how haplotype-based models may take local epistatic effects between markers into account. Assuming that populations consisted of fully homozygous individuals, a marker-based model in which local epistatic effects inside haplotype blocks were exploited (LEGBLUP) was linearly transformable into a haplotype-based model (HGBLUP). This theoretical derivation formally revealed that haplotype-based genome-wide prediction models capitalize on local epistatic effects among markers. Simulation studies corroborated this finding. Due to its computational efficiency the HGBLUP model promises to be an interesting tool for studies in which ultra-high-density SNP data sets are studied. Applying the HGBLUP model to empirical data sets revealed higher prediction accuracies than for marker-based models for both traits studied using a mouse panel. In contrast, only a small subset of the traits analyzed in crop populations showed such a benefit. Cases in which higher prediction accuracies are observed for HGBLUP than for marker-based models are expected to be of immediate relevance for breeders, due to the tight linkage a beneficial haplotype will be preserved for many generations. In this respect the inheritance of local epistatic effects very much resembles the one of additive effects. Copyright © 2018 Jiang et al.