Sample records for genome-wide linkage search

  1. Meta-analysis of 32 genome-wide linkage studies of schizophrenia

    PubMed Central

    Ng, MYM; Levinson, DF; Faraone, SV; Suarez, BK; DeLisi, LE; Arinami, T; Riley, B; Paunio, T; Pulver, AE; Irmansyah; Holmans, PA; Escamilla, M; Wildenauer, DB; Williams, NM; Laurent, C; Mowry, BJ; Brzustowicz, LM; Maziade, M; Sklar, P; Garver, DL; Abecasis, GR; Lerer, B; Fallin, MD; Gurling, HMD; Gejman, PV; Lindholm, E; Moises, HW; Byerley, W; Wijsman, EM; Forabosco, P; Tsuang, MT; Hwu, H-G; Okazaki, Y; Kendler, KS; Wormley, B; Fanous, A; Walsh, D; O’Neill, FA; Peltonen, L; Nestadt, G; Lasseter, VK; Liang, KY; Papadimitriou, GM; Dikeos, DG; Schwab, SG; Owen, MJ; O’Donovan, MC; Norton, N; Hare, E; Raventos, H; Nicolini, H; Albus, M; Maier, W; Nimgaonkar, VL; Terenius, L; Mallet, J; Jay, M; Godard, S; Nertney, D; Alexander, M; Crowe, RR; Silverman, JM; Bassett, AS; Roy, M-A; Mérette, C; Pato, CN; Pato, MT; Roos, J Louw; Kohn, Y; Amann-Zalcenstein, D; Kalsi, G; McQuillin, A; Curtis, D; Brynjolfson, J; Sigmundsson, T; Petursson, H; Sanders, AR; Duan, J; Jazin, E; Myles-Worsley, M; Karayiorgou, M; Lewis, CM

    2009-01-01

    A genome scan meta-analysis (GSMA) was carried out on 32 independent genome-wide linkage scan analyses that included 3255 pedigrees with 7413 genotyped cases affected with schizophrenia (SCZ) or related disorders. The primary GSMA divided the autosomes into 120 bins, rank-ordered the bins within each study according to the most positive linkage result in each bin, summed these ranks (weighted for study size) for each bin across studies and determined the empirical probability of a given summed rank (PSR) by simulation. Suggestive evidence for linkage was observed in two single bins, on chromosomes 5q (142-168 Mb) and 2q (103-134 Mb). Genome-wide evidence for linkage was detected on chromosome 2q (119-152 Mb) when bin boundaries were shifted to the middle of the previous bins. The primary analysis met empirical criteria for ‘aggregate’ genome-wide significance, indicating that some or all of 10 bins are likely to contain loci linked to SCZ, including regions of chromosomes 1, 2q, 3q, 4q, 5q, 8p and 10q. In a secondary analysis of 22 studies of European-ancestry samples, suggestive evidence for linkage was observed on chromosome 8p (16-33 Mb). Although the newer genome-wide association methodology has greater power to detect weak associations to single common DNA sequence variants, linkage analysis can detect diverse genetic effects that segregate in families, including multiple rare variants within one locus or several weakly associated loci in the same region. Therefore, the regions supported by this meta-analysis deserve close attention in future studies. PMID:19349958
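
The ranking procedure described in this abstract can be sketched in a few lines. This is a toy illustration only, not the authors' implementation: the function names, the unit weights, and the four-bin scores are all hypothetical, ties are broken by bin order rather than averaged, and the simulation assumes that under the null hypothesis each study assigns a bin a rank drawn uniformly from 1 to the number of bins.

```python
import random

def summed_ranks(study_scores, weights):
    """Weighted summed rank per bin.

    study_scores[i][b] is the most positive linkage score that study i
    reported in bin b; weights[i] is a hypothetical study-size weight.
    """
    n_bins = len(study_scores[0])
    totals = [0.0] * n_bins
    for scores, weight in zip(study_scores, weights):
        # Rank bins within the study: the strongest bin gets rank n_bins.
        # Ties are broken by bin order here, which is a simplification.
        order = sorted(range(n_bins), key=lambda b: scores[b])
        for rank, b in enumerate(order, start=1):
            totals[b] += weight * rank
    return totals

def empirical_psr(observed, n_bins, weights, n_sim=2000, seed=0):
    """Share of simulated summed ranks at least as large as `observed`."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sim):
        # Null model (assumed): each study gives the bin a uniform random rank.
        simulated = sum(w * rng.randint(1, n_bins) for w in weights)
        if simulated >= observed:
            hits += 1
    return (hits + 1) / (n_sim + 1)  # add-one correction avoids p = 0

# Toy data: three studies, four bins (made-up scores, equal weights).
toy_scores = [[0.1, 2.5, 0.3, 1.0], [0.5, 3.0, 0.2, 0.9], [1.2, 2.0, 0.1, 0.4]]
toy_weights = [1.0, 1.0, 1.0]
totals = summed_ranks(toy_scores, toy_weights)
print(totals)
print(empirical_psr(totals[1], n_bins=4, weights=toy_weights))
```

A bin that ranks highly in most studies accumulates a large summed rank, and the simulation estimates how often a sum that large would arise by chance for a single bin.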

  2. Genome-wide high-density SNP linkage search for glioma susceptibility loci: results from the Gliogene Consortium

    PubMed Central

    Shete, Sanjay; Lau, Ching C; Houlston, Richard S; Claus, Elizabeth B; Barnholtz-Sloan, Jill; Lai, Rose; Il’yasova, Dora; Schildkraut, Joellen; Sadetzki, Siegal; Johansen, Christoffer; Bernstein, Jonine L; Olson, Sara H; Jenkins, Robert B; Yang, Ping; Vick, Nicholas A; Wrensch, Margaret; Davis, Faith G; McCarthy, Bridget J; Leung, Eastwood Hon-chiu; Davis, Caleb; Cheng, Rita; Hosking, Fay J; Armstrong, Georgina N; Liu, Yanhong; Yu, Robert K; Henriksson, Roger; Consortium, The Gliogene; Melin, Beatrice S; Bondy, Melissa L

    2011-01-01

    Gliomas, which generally have a poor prognosis, are the most common primary malignant brain tumors in adults. Recent genome-wide association studies have demonstrated that inherited susceptibility plays a role in the development of glioma. Although first-degree relatives of patients exhibit a two-fold increased risk of glioma, the search for susceptibility loci in familial forms of the disease has been challenging because the disease is relatively rare, fatal, and heterogeneous, making it difficult to collect sufficient biosamples from families for statistical power. To address this challenge, the Genetic Epidemiology of Glioma International Consortium (Gliogene) was formed to collect DNA samples from families with two or more cases of histologically confirmed glioma. In this study, we present results obtained from 46 U.S. families in which multipoint linkage analyses were undertaken using nonparametric (model-free) methods. After removal of high linkage disequilibrium SNPs, we obtained a maximum nonparametric linkage score (NPL) of 3.39 (P=0.0005) at 17q12–21.32 and the Z-score of 4.20 (P=0.000007). To replicate our findings, we genotyped 29 independent U.S. families and obtained a maximum NPL score of 1.26 (P=0.008) and the Z-score of 1.47 (P=0.035). Accounting for the genetic heterogeneity using the ordered subset analysis approach, the combined analyses of 75 families resulted in a maximum NPL score of 3.81 (P=0.00001). The genomic regions we have implicated in this study may offer novel insights into glioma susceptibility, focusing future work to identify genes that cause familial glioma. PMID:22037877

  3. A genome-wide search for linkage of estimated glomerular filtration rate (eGFR) in the Family Investigation of Nephropathy and Diabetes (FIND).

    PubMed

    Thameem, Farook; Igo, Robert P; Freedman, Barry I; Langefeld, Carl; Hanson, Robert L; Schelling, Jeffrey R; Elston, Robert C; Duggirala, Ravindranath; Nicholas, Susanne B; Goddard, Katrina A B; Divers, Jasmin; Guo, Xiuqing; Ipp, Eli; Kimmel, Paul L; Meoni, Lucy A; Shah, Vallabh O; Smith, Michael W; Winkler, Cheryl A; Zager, Philip G; Knowler, William C; Nelson, Robert G; Pahl, Madeline V; Parekh, Rulan S; Kao, W H Linda; Rasooly, Rebekah S; Adler, Sharon G; Abboud, Hanna E; Iyengar, Sudha K; Sedor, John R

    2013-01-01

    Estimated glomerular filtration rate (eGFR), a measure of kidney function, is heritable, suggesting that genes influence renal function. Genes that influence eGFR have been identified through genome-wide association studies. However, family-based linkage approaches may identify loci that explain a larger proportion of the heritability. This study used genome-wide linkage and association scans to identify quantitative trait loci (QTL) that influence eGFR. Genome-wide linkage and sparse association scans of eGFR were performed in families ascertained by probands with advanced diabetic nephropathy (DN) from the multi-ethnic Family Investigation of Nephropathy and Diabetes (FIND) study. This study included 954 African Americans (AA), 781 American Indians (AI), 614 European Americans (EA) and 1,611 Mexican Americans (MA). A total of 3,960 FIND participants were genotyped for 6,000 single nucleotide polymorphisms (SNPs) using the Illumina Linkage IVb panel. GFR was estimated by the Modification of Diet in Renal Disease (MDRD) formula. The non-parametric linkage analysis, accounting for the effects of diabetes duration and BMI, identified the strongest evidence for linkage of eGFR on chromosome 20q11 (log of the odds [LOD] = 3.34; P = 4.4 × 10^-5) in MA and chromosome 15q12 (LOD = 2.84; P = 1.5 × 10^-4) in EA. In all subjects, the strongest linkage signal for eGFR was detected on chromosome 10p12 (P = 5.5 × 10^-4) at 44 cM near marker rs1339048. A subsequent association scan in both ancestry-specific groups and the entire population identified several SNPs significantly associated with eGFR across the genome. The present study describes the localization of QTL influencing eGFR on 20q11 in MA, 15q21 in EA and 10p12 in the combined ethnic groups participating in the FIND study. Identification of causal genes/variants influencing eGFR, within these linkage and association loci, will open new avenues for functional analyses and development of novel diagnostic markers
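
The abstract pairs each LOD score with a P value but does not say how the P values were derived. A common asymptotic conversion, assumed here rather than taken from the paper, treats 2 ln(10) × LOD as a chi-square statistic with one degree of freedom and halves the upper tail because linkage is a one-sided alternative. The function name `lod_to_p` is hypothetical.

```python
import math

def lod_to_p(lod):
    """Approximate one-sided P value for a LOD score.

    Assumes the asymptotic result that 2*ln(10)*LOD follows a 50:50
    mixture of a point mass at zero and a chi-square with 1 df.
    """
    chi2 = 2.0 * math.log(10.0) * lod
    # Upper tail of chi-square(1 df) equals erfc(sqrt(x / 2)).
    upper_tail = math.erfc(math.sqrt(chi2 / 2.0))
    return 0.5 * upper_tail

for lod in (3.34, 2.84):
    print(f"LOD = {lod:.2f} -> P ~ {lod_to_p(lod):.2e}")
```

Under that assumption the two LOD scores above convert to values close to the reported 4.4 × 10^-5 and 1.5 × 10^-4, which suggests, but does not confirm, that a conversion of this kind was used.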

  4. A GENOME-WIDE LINKAGE AND ASSOCIATION SCAN REVEALS NOVEL LOCI FOR AUTISM

    PubMed Central

    Weiss, Lauren A.; Arking, Dan E.

    2009-01-01

    Summary: Although autism is a highly heritable neurodevelopmental disorder, attempts to identify specific susceptibility genes have thus far met with limited success. Genome-wide association studies (GWAS) using half a million or more markers, particularly those with very large sample sizes achieved through meta-analysis, have shown great success in mapping genes for other complex genetic traits (http://www.genome.gov/26525384). Consequently, we initiated a linkage and association mapping study using half a million genome-wide SNPs in a common set of 1,031 multiplex autism families (1,553 affected offspring). We identified regions of suggestive and significant linkage on chromosomes 6q27 and 20p13, respectively. Initial analysis did not yield genome-wide significant associations; however, genotyping of top hits in additional families revealed a SNP on chromosome 5p15 (between SEMA5A and TAS2R1) that was significantly associated with autism (P = 2 × 10^-7). We also demonstrated that expression of SEMA5A is reduced in brains from autistic patients, further implicating SEMA5A as an autism susceptibility gene. The linkage regions reported here provide targets for rare variation screening while the discovery of a single novel association demonstrates the action of common variants. PMID:19812673

  5. Genome-wide linkage in Utah autism pedigrees

    PubMed Central

    Allen-Brady, K; Robison, R; Cannon, D; Varvil, T; Villalobos, M; Pingree, C; Leppert, MF; Miller, J; McMahon, WM; Coon, H

    2014-01-01

    Genetic studies of autism over the past decade suggest a complex landscape of multiple genes. In the face of this heterogeneity, studies that include large extended pedigrees may offer valuable insight, as the relatively few susceptibility genes within single large families may be more easily discerned. This genome-wide screen of 70 families includes 20 large extended pedigrees of 6–9 generations, 6 moderate-sized families of 4–5 generations, and 44 smaller families of 2–3 generations. The Center for Inherited Disease Research (CIDR) provided genotyping using the Illumina Linkage Panel 12, a 6K single nucleotide polymorphism (SNP) platform. Results from 192 subjects with an Autism Spectrum Disorder (ASD), and 461 of their relatives revealed genome-wide significance on chromosome 15q, with three possibly distinct peaks: 15q13.1-q14 (HLOD=4.09 at 29,459,872bp); 15q14-q21.1 (HLOD=3.59 at 36,837,208bp); and 15q21.1-q22.2 (HLOD=5.31 at 55,629,733bp). Two of these peaks replicate previous findings. There were additional suggestive results on chromosomes 2p25.3-p24.1 (HLOD=1.87), 7q31.31-q32.3 (HLOD=1.97), and 13q12.11-q12.3 (HLOD=1.93). Affected subjects in families supporting the linkage peaks found in this study did not reveal strong evidence for distinct phenotypic subgroups. PMID:19455147

  6. A Genome-Wide Search for Linkage of Estimated Glomerular Filtration Rate (eGFR) in the Family Investigation of Nephropathy and Diabetes (FIND)

    PubMed Central

    Thameem, Farook; Igo, Robert P.; Freedman, Barry I.; Langefeld, Carl; Hanson, Robert L.; Schelling, Jeffrey R.; Elston, Robert C.; Duggirala, Ravindranath; Nicholas, Susanne B.; Goddard, Katrina A. B.; Divers, Jasmin; Guo, Xiuqing; Ipp, Eli; Kimmel, Paul L.; Meoni, Lucy A.; Shah, Vallabh O.; Smith, Michael W.; Winkler, Cheryl A.; Zager, Philip G.; Knowler, William C.; Nelson, Robert G.; Pahl, Madeline V.; Parekh, Rulan S.; Kao, W. H. Linda; Rasooly, Rebekah S.; Adler, Sharon G.; Abboud, Hanna E.; Iyengar, Sudha K.; Sedor, John R.

    2013-01-01

    Objective: Estimated glomerular filtration rate (eGFR), a measure of kidney function, is heritable, suggesting that genes influence renal function. Genes that influence eGFR have been identified through genome-wide association studies. However, family-based linkage approaches may identify loci that explain a larger proportion of the heritability. This study used genome-wide linkage and association scans to identify quantitative trait loci (QTL) that influence eGFR. Methods: Genome-wide linkage and sparse association scans of eGFR were performed in families ascertained by probands with advanced diabetic nephropathy (DN) from the multi-ethnic Family Investigation of Nephropathy and Diabetes (FIND) study. This study included 954 African Americans (AA), 781 American Indians (AI), 614 European Americans (EA) and 1,611 Mexican Americans (MA). A total of 3,960 FIND participants were genotyped for 6,000 single nucleotide polymorphisms (SNPs) using the Illumina Linkage IVb panel. GFR was estimated by the Modification of Diet in Renal Disease (MDRD) formula. Results: The non-parametric linkage analysis, accounting for the effects of diabetes duration and BMI, identified the strongest evidence for linkage of eGFR on chromosome 20q11 (log of the odds [LOD] = 3.34; P = 4.4 × 10^-5) in MA and chromosome 15q12 (LOD = 2.84; P = 1.5 × 10^-4) in EA. In all subjects, the strongest linkage signal for eGFR was detected on chromosome 10p12 (P = 5.5 × 10^-4) at 44 cM near marker rs1339048. A subsequent association scan in both ancestry-specific groups and the entire population identified several SNPs significantly associated with eGFR across the genome. Conclusion: The present study describes the localization of QTL influencing eGFR on 20q11 in MA, 15q21 in EA and 10p12 in the combined ethnic groups participating in the FIND study. Identification of causal genes/variants influencing eGFR, within these linkage and association loci, will open new avenues for functional

  7. Genome-Wide Association Study and Linkage Analysis of the Healthy Aging Index

    PubMed Central

    Minster, Ryan L.; Sanders, Jason L.; Singh, Jatinder; Kammerer, Candace M.; Barmada, M. Michael; Matteini, Amy M.; Zhang, Qunyuan; Wojczynski, Mary K.; Daw, E. Warwick; Brody, Jennifer A.; Arnold, Alice M.; Lunetta, Kathryn L.; Murabito, Joanne M.; Christensen, Kaare; Perls, Thomas T.; Province, Michael A.

    2015-01-01

    Background. The Healthy Aging Index (HAI) is a tool for measuring the extent of health and disease across multiple systems. Methods. We conducted a genome-wide association study and a genome-wide linkage analysis to map quantitative trait loci associated with the HAI and a modified HAI weighted for mortality risk in 3,140 individuals selected for familial longevity from the Long Life Family Study. The genome-wide association study used the Long Life Family Study as the discovery cohort and individuals from the Cardiovascular Health Study and the Framingham Heart Study as replication cohorts. Results. There were no genome-wide significant findings from the genome-wide association study; however, several single-nucleotide polymorphisms near ZNF704 on chromosome 8q21.13 were suggestively associated with the HAI in the Long Life Family Study (p < 10^-6) and nominally replicated in the Cardiovascular Health Study and Framingham Heart Study. Linkage results revealed significant evidence (log-odds score = 3.36) for a quantitative trait locus for mortality-optimized HAI in women on chromosome 9p24–p23. However, results of fine-mapping studies did not implicate any specific candidate genes within this region of interest. Conclusions. ZNF704 may be a potential candidate gene for studies of the genetic underpinnings of longevity. PMID:25758594

  8. Genome-Wide Association Study and Linkage Analysis of the Healthy Aging Index.

    PubMed

    Minster, Ryan L; Sanders, Jason L; Singh, Jatinder; Kammerer, Candace M; Barmada, M Michael; Matteini, Amy M; Zhang, Qunyuan; Wojczynski, Mary K; Daw, E Warwick; Brody, Jennifer A; Arnold, Alice M; Lunetta, Kathryn L; Murabito, Joanne M; Christensen, Kaare; Perls, Thomas T; Province, Michael A; Newman, Anne B

    2015-08-01

    The Healthy Aging Index (HAI) is a tool for measuring the extent of health and disease across multiple systems. We conducted a genome-wide association study and a genome-wide linkage analysis to map quantitative trait loci associated with the HAI and a modified HAI weighted for mortality risk in 3,140 individuals selected for familial longevity from the Long Life Family Study. The genome-wide association study used the Long Life Family Study as the discovery cohort and individuals from the Cardiovascular Health Study and the Framingham Heart Study as replication cohorts. There were no genome-wide significant findings from the genome-wide association study; however, several single-nucleotide polymorphisms near ZNF704 on chromosome 8q21.13 were suggestively associated with the HAI in the Long Life Family Study (p < 10^-6) and nominally replicated in the Cardiovascular Health Study and Framingham Heart Study. Linkage results revealed significant evidence (log-odds score = 3.36) for a quantitative trait locus for mortality-optimized HAI in women on chromosome 9p24-p23. However, results of fine-mapping studies did not implicate any specific candidate genes within this region of interest. ZNF704 may be a potential candidate gene for studies of the genetic underpinnings of longevity.

  9. Refining genome-wide linkage intervals using a meta-analysis of genome-wide association studies identifies loci influencing personality dimensions

    PubMed Central

    Amin, Najaf; Hottenga, Jouke-Jan; Hansell, Narelle K; Janssens, A Cecile JW; de Moor, Marleen HM; Madden, Pamela AF; Zorkoltseva, Irina V; Penninx, Brenda W; Terracciano, Antonio; Uda, Manuela; Tanaka, Toshiko; Esko, Tonu; Realo, Anu; Ferrucci, Luigi; Luciano, Michelle; Davies, Gail; Metspalu, Andres; Abecasis, Goncalo R; Deary, Ian J; Raikkonen, Katri; Bierut, Laura J; Costa, Paul T; Saviouk, Viatcheslav; Zhu, Gu; Kirichenko, Anatoly V; Isaacs, Aaron; Aulchenko, Yurii S; Willemsen, Gonneke; Heath, Andrew C; Pergadia, Michele L; Medland, Sarah E; Axenovich, Tatiana I; de Geus, Eco; Montgomery, Grant W; Wright, Margaret J; Oostra, Ben A; Martin, Nicholas G; Boomsma, Dorret I; van Duijn, Cornelia M

    2013-01-01

    Personality traits are complex phenotypes related to psychosomatic health. Individually, various gene finding methods have not achieved much success in finding genetic variants associated with personality traits. We performed a meta-analysis of four genome-wide linkage scans (N=6149 subjects) of five basic personality traits assessed with the NEO Five-Factor Inventory. We compared the significant regions from the meta-analysis of linkage scans with the results of a meta-analysis of genome-wide association studies (GWAS) (N∼17 000). We found significant evidence of linkage of neuroticism to chromosome 3p14 (rs1490265, LOD=4.67) and to chromosome 19q13 (rs628604, LOD=3.55); of extraversion to 14q32 (ATGG002, LOD=3.3); and of agreeableness to 3p25 (rs709160, LOD=3.67) and to two adjacent regions on chromosome 15, including 15q13 (rs970408, LOD=4.07) and 15q14 (rs1055356, LOD=3.52) in the individual scans. In the meta-analysis, we found strong evidence of linkage of extraversion to 4q34, 9q34, 10q24 and 11q22, openness to 2p25, 3q26, 9p21, 11q24, 15q26 and 19q13 and agreeableness to 4q34 and 19p13. Significant evidence of association in the GWAS was detected between openness and rs677035 at 11q24 (P-value=2.6 × 10^-6, KCNJ1). The findings of our linkage meta-analysis and those of the GWAS suggest that 11q24 is a susceptibility locus for openness, with KCNJ1 as the possible candidate gene. PMID:23211697

  10. Genome-wide linkage and association analysis of cardiometabolic phenotypes in Hispanic Americans.

    PubMed

    Hellwege, Jacklyn N; Palmer, Nicholette D; Dimitrov, Latchezar; Keaton, Jacob M; Tabb, Keri L; Sajuthi, Satria; Taylor, Kent D; Ng, Maggie C Y; Speliotes, Elizabeth K; Hawkins, Gregory A; Long, Jirong; Ida Chen, Yii-Der; Lorenzo, Carlos; Norris, Jill M; Rotter, Jerome I; Langefeld, Carl D; Wagenknecht, Lynne E; Bowden, Donald W

    2017-02-01

    Linkage studies of complex genetic diseases have been largely replaced by genome-wide association studies, due in part to limited success in complex trait discovery. However, recent interest in rare and low-frequency variants motivates re-examination of family-based methods. In this study, we investigated the performance of two-point linkage analysis for over 1.6 million single-nucleotide polymorphisms (SNPs) combined with single variant association analysis to identify high impact variants, which are both strongly linked and associated with cardiometabolic traits in up to 1414 Hispanics from the Insulin Resistance Atherosclerosis Family Study (IRASFS). Evaluation of all 50 phenotypes yielded 83 557 000 LOD (logarithm of the odds) scores, with 9214 LOD scores ⩾3.0, 845 ⩾4.0 and 89 ⩾5.0, with a maximal LOD score of 6.49 (rs12956744 in the LAMA1 gene for tumor necrosis factor-α (TNFα) receptor 2). Twenty-seven variants were associated with P<0.005 as well as having an LOD score >4, including variants in the NFIB gene under a linkage peak with TNFα receptor 2 levels on chromosome 9. Linkage regions of interest included a broad peak (31 Mb) on chromosome 1q with acute insulin response (max LOD=5.37). This region was previously documented with type 2 diabetes in family-based studies, providing support for the validity of these results. Overall, we have demonstrated the utility of two-point linkage and association in comprehensive genome-wide array-based SNP genotypes.

  11. Combined genome-wide linkage and targeted association analysis of head circumference in autism spectrum disorder families.

    PubMed

    Woodbury-Smith, M; Bilder, D A; Morgan, J; Jerominski, L; Darlington, T; Dyer, T; Paterson, A D; Coon, H

    2017-01-01

    It has long been recognized that there is an association between enlarged head circumference (HC) and autism spectrum disorder (ASD), but the genetics of HC in ASD is not well understood. In order to investigate the genetic underpinning of HC in ASD, we undertook a genome-wide linkage study of HC followed by association analysis targeted at the linkage signal among a sample of 67 extended pedigrees with ASD. HC measurements on members of 67 multiplex ASD extended pedigrees were used as a quantitative trait in a genome-wide linkage analysis. The Illumina 6K SNP linkage panel was used, and analyses were carried out using the variance components model implemented in SOLAR. Loci identified in this way formed the target for subsequent association analysis using the Illumina OmniExpress chip and imputed genotypes. A modification of the qTDT was used as implemented in SOLAR. We identified a linkage signal spanning 6p21.31 to 6p22.2 (maximum LOD = 3.4). Although targeted association did not find evidence of association with any SNP overall, in one family with the strongest evidence of linkage, there was evidence for association (rs17586672, p = 1.72E-07). Although this region does not overlap with ASD linkage signals in these same samples, it has been associated with other psychiatric risk, including ADHD, developmental dyslexia, schizophrenia, specific language impairment, and juvenile bipolar disorder. The genome-wide significant linkage signal represents the first reported observation of a potential quantitative trait locus for HC in ASD and may be relevant in the context of complex multivariate risk likely leading to ASD.

  12. Genome-Wide Linkage and Association Analysis Identifies Major Gene Loci for Guttural Pouch Tympany in Arabian and German Warmblood Horses

    PubMed Central

    Metzger, Julia; Ohnesorge, Bernhard; Distl, Ottmar

    2012-01-01

    Equine guttural pouch tympany (GPT) is a hereditary condition affecting foals in their first months of life. Complex segregation analyses in Arabian and German warmblood horses indicated that a major gene is very likely involved. Genome-wide linkage and association analyses including a high density marker set of single nucleotide polymorphisms (SNPs) were performed to map the genomic region harbouring the potential major gene for GPT. A total of 85 Arabian and 373 German warmblood horses were genotyped on the Illumina equine SNP50 beadchip. Non-parametric multipoint linkage analyses showed genome-wide significance on horse chromosome (ECA) 3 for German warmblood at 16–26 Mb and 34–55 Mb and for Arabian on ECA15 at 64–65 Mb. Genome-wide association analyses confirmed the linked regions for both breeds. In Arabian, genome-wide association was detected at 64 Mb within the region with the highest linkage peak on ECA15. For German warmblood, signals for genome-wide association were close to the peak region of linkage at 52 Mb on ECA3. The odds ratio for the SNP with the highest genome-wide association was 0.12 for the Arabian. In conclusion, the refinement of the regions with the Illumina equine SNP50 beadchip is an important step to unravel the responsible mutations for GPT. PMID:22848553

  13. Parent-Of-Origin Effects in Autism Identified through Genome-Wide Linkage Analysis of 16,000 SNPs

    PubMed Central

    Fradin, Delphine; Cheslack-Postava, Keely; Ladd-Acosta, Christine; Newschaffer, Craig; Chakravarti, Aravinda; Arking, Dan E.; Feinberg, Andrew; Fallin, M. Daniele

    2010-01-01

    Background: Autism is a common heritable neurodevelopmental disorder with complex etiology. Several genome-wide linkage and association scans have been carried out to identify regions harboring genes related to autism or autism spectrum disorders, with mixed results. Given the overlap in autism features with genetic abnormalities known to be associated with imprinting, one possible reason for lack of consistency would be the influence of parent-of-origin effects that may mask the ability to detect linkage and association. Methods and Findings: We have performed a genome-wide linkage scan that accounts for potential parent-of-origin effects using 16,311 SNPs among families from the Autism Genetic Resource Exchange (AGRE) and the National Institute of Mental Health (NIMH) autism repository. We report parametric (GH, Genehunter) and allele-sharing linkage (Aspex) results using a broad spectrum disorder case definition. Paternal-origin genome-wide statistically significant linkage was observed on chromosomes 4 (LOD_GH = 3.79, empirical p<0.005 and LOD_Aspex = 2.96, p = 0.008), 15 (LOD_GH = 3.09, empirical p<0.005 and LOD_Aspex = 3.62, empirical p = 0.003) and 20 (LOD_GH = 3.36, empirical p<0.005 and LOD_Aspex = 3.38, empirical p = 0.006). Conclusions: These regions may harbor imprinted sites associated with the development of autism and offer fruitful domains for molecular investigation into the role of epigenetic mechanisms in autism. PMID:20824079

  14. Meta-analysis of genome-wide linkage studies in BMI and obesity.

    PubMed

    Saunders, Catherine L; Chiodini, Benedetta D; Sham, Pak; Lewis, Cathryn M; Abkevich, Victor; Adeyemo, Adebowale A; de Andrade, Mariza; Arya, Rector; Berenson, Gerald S; Blangero, John; Boehnke, Michael; Borecki, Ingrid B; Chagnon, Yvon C; Chen, Wei; Comuzzie, Anthony G; Deng, Hong-Wen; Duggirala, Ravindranath; Feitosa, Mary F; Froguel, Philippe; Hanson, Robert L; Hebebrand, Johannes; Huezo-Dias, Patricia; Kissebah, Ahmed H; Li, Weidong; Luke, Amy; Martin, Lisa J; Nash, Matthew; Ohman, Miina; Palmer, Lyle J; Peltonen, Leena; Perola, Markus; Price, R Arlen; Redline, Susan; Srinivasan, Sathanur R; Stern, Michael P; Stone, Steven; Stringham, Heather; Turner, Stephen; Wijmenga, Cisca; Collier, David A

    2007-09-01

    The objective was to provide an overall assessment of genetic linkage data of BMI and BMI-defined obesity using a nonparametric genome scan meta-analysis. We identified 37 published studies containing data on over 31,000 individuals from more than 10,000 families and obtained genome-wide logarithm of the odds (LOD) scores, non-parametric linkage (NPL) scores, or maximum likelihood scores (MLS). BMI was analyzed in a pooled set of all studies, as a subgroup of 10 studies that used BMI-defined obesity, and for subgroups ascertained through type 2 diabetes, hypertension, or subjects of European ancestry. Bins at chromosomes 13q13.2-q33.1 and 12q23-q24.3 achieved suggestive evidence of linkage to BMI in the pooled analysis and samples ascertained for hypertension. Nominal evidence of linkage to these regions and suggestive evidence for 11q13.3-22.3 were also observed for BMI-defined obesity. The FTO obesity gene locus at 16q12.2 also showed nominal evidence for linkage. However, the overall distribution of summed rank p values <0.05 is not different from that expected by chance. The strongest evidence was obtained in the families ascertained for hypertension at 9q31.1-qter and 12p11.21-q23 (p < 0.01). Despite having substantial statistical power, we did not unequivocally implicate specific loci for BMI or obesity. This may be because genes influencing adiposity are of very small effect, with substantial genetic heterogeneity and variable dependence on environmental factors. However, the observation that the FTO gene maps to one of the highest ranking bins for obesity is interesting and, while not a validation of this approach, indicates that other potential loci identified in this study should be investigated further.

  15. A novel genome-wide microsatellite resource for species of Eucalyptus with linkage-to-physical correspondence on the reference genome sequence.

    PubMed

    Grattapaglia, Dario; Mamani, Eva M C; Silva-Junior, Orzenil B; Faria, Danielle A

    2015-03-01

    Eucalypts, keystone species in their native ranges, are ecologically and genetically very diverse, growing naturally along extensive latitudinal and altitudinal ranges and variable environments. Besides their ecological importance, eucalypts are also the most widely planted trees for sustainable forestry in the world. We report the development of a novel collection of 535 microsatellites for species of Eucalyptus, 494 designed from ESTs and 41 from genomic libraries. A selected subset of 223 was evaluated for individual identification, parentage testing, and ancestral information content in the two most extensively studied species, Eucalyptus grandis and Eucalyptus globulus. Microsatellites showed high transferability and overlapping allele size range, suggesting that they arose in the common ancestor of the two species and confirming the extensive genome conservation between them. A consensus linkage map with 437 microsatellites, the most comprehensive microsatellite-only genetic map for Eucalyptus, was built by assembling segregation data from three mapping populations and anchored to the Eucalyptus genome. An overall colinearity between recombination-based and physical positioning of 84% of the mapped microsatellites was observed, with some ordering discrepancies and sporadic locus duplications, consistent with the recently described whole genome duplication events in Eucalyptus. The linkage map covered 95.2% of the 605.8-Mbp assembled genome sequence, placing one microsatellite every 1.55 Mbp on average, and an overall estimate of physical to recombination distance of 618 kbp/cM. The genetic parameter estimates, together with linkage and physical position data for this large set of microsatellites, should assist marker choice for genome-wide population genetics and comparative mapping in Eucalyptus.

  16. Genome-Wide Linkage Analysis to Identify Genetic Modifiers of ALK Mutation Penetrance in Familial Neuroblastoma

    PubMed Central

    Devoto, Marcella; Specchia, Claudia; Laudenslager, Marci; Longo, Luca; Hakonarson, Hakon; Maris, John; Mossé, Yael

    2011-01-01

    Background: Neuroblastoma (NB) is an important childhood cancer with a strong genetic component related to disease susceptibility. Approximately 1% of NB cases have a positive family history. Following a genome-wide linkage analysis and sequencing of candidate genes in the critical region, we identified ALK as the major familial NB gene. Dominant mutations in ALK are found in more than 50% of familial NB cases. However, in the families used for the linkage study, only about 50% of carriers of ALK mutations are affected by NB. Methods: To test whether genetic variation may explain the reduced penetrance of the disease phenotype, we analyzed genome-wide genotype data in ALK mutation-positive families using a model-based linkage approach with different liability classes for carriers and non-carriers of ALK mutations. Results: The region with the highest LOD score was located at chromosome 2p23–p24 and included the ALK locus under models of dominant and recessive inheritance. Conclusions: This finding suggests that variants in the non-mutated ALK gene or another gene linked to it may affect penetrance of the ALK mutations and risk of developing NB in familial cases. PMID:21734404

  17. Genome-wide scan for genes involved in bipolar affective disorder in 70 European families ascertained through a bipolar type I early-onset proband: supportive evidence for linkage at 3p14

    PubMed Central

    Etain, Bruno; Mathieu, Flavie; Rietschel, Marcella; Maier, Wolfgang; Albus, Margot; Mckeon, Patrick; Roche, S.; Kealey, Carmel; Blackwood, Douglas; Muir, Walter; Bellivier, Franc; Henry, C.; Dina, Christian; Gallina, Sophie; Gurling, H.; Malafosse, Alain; Preisig, Martin; Ferrero, François; Cichon, Sven; Schumacher, J.; Ohlraun, Stéphanie; Borrmann-Hassenbach, M.; Propping, Peter; Abou Jamra, Rami; Schulze, Thomas G.; Marusic, Andrej; Dernovsek, Mojca Z.; Giros, Bruno; Bourgeron, Thomas; Lemainque, Arnaud; Bacq, Delphine; Betard, Christine; Charon, Céline; Nöthen, Markus M.; Lathrop, Mark; Leboyer, Marion

    2006-01-01

    Summary: Preliminary studies suggested that age at onset (AAO) may help to define homogeneous bipolar affective disorder (BPAD) subtypes. This candidate symptom approach might be useful to identify vulnerability genes. Thus, the probability of detecting major disease-causing genes might be increased by focusing on families with early-onset BPAD type I probands. This study was conducted as part of the European Collaborative Study of Early Onset BPAD (France, Germany, Ireland, Scotland, Switzerland, England, Slovenia). We performed a genome-wide search with 384 microsatellite markers using nonparametric linkage analysis in 87 sib-pairs ascertained through an early-onset BPAD type I proband (age at onset of 21 years or below). Nonparametric multipoint analysis suggested eight regions of linkage with p-values <0.01 (2p21, 2q14.3, 3p14, 5q33, 7q36, 10q23, 16q23 and 20p12). The 3p14 region showed the most significant linkage (genome-wide p-value of 0.015 [0.01–0.02], estimated over 10,000 simulated replicates). After the genome-wide search, we performed additional linkage analyses with increased marker density in four regions that were suggestive of linkage and had an information content lower than 75% (3p14, 10q23, 16q23 and 20p12). For these regions, the information content improved by about 10%. On chromosome 3, the nonparametric linkage score increased from 3.51 to 3.83. This study is the first to use early-onset bipolar type I probands in an attempt to increase sample homogeneity. These preliminary findings require confirmation in independent panels of families. PMID:16534504
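
    The genome-wide p-value quoted above is an empirical one: the observed peak is compared with the peaks obtained in simulated replicates. The sketch below shows one common way to compute such a value. It is an illustration only; the function name, the normal-approximation interval, and the toy numbers are assumptions, and the abstract does not say how its own interval was derived.

```python
import math


def empirical_genomewide_p(observed_max, simulated_maxima, z=1.96):
    """Share of simulated genome scans whose best score reaches the observed one.

    simulated_maxima holds one value per replicate: the highest linkage
    statistic found anywhere in that simulated genome scan.
    Returns (p, (lower, upper)) with a normal-approximation interval.
    """
    n = len(simulated_maxima)
    if n == 0:
        raise ValueError("need at least one simulated replicate")
    exceed = sum(1 for s in simulated_maxima if s >= observed_max)
    p = exceed / n
    se = math.sqrt(p * (1.0 - p) / n)
    return p, (max(0.0, p - z * se), min(1.0, p + z * se))


# Toy input (hypothetical): 1000 replicates whose maxima are 0, 1, ..., 999.
p, (low, high) = empirical_genomewide_p(985, list(range(1000)))
print(p, low, high)
```

    Counting replicate maxima, not individual marker scores, is what makes the result genome-wide: each replicate contributes its single best score, so the multiple testing across markers is built into the null distribution.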

  18. Systematic, genome-wide, sex-specific linkage of cardiovascular traits in French Canadians.

    PubMed

    Seda, Ondrej; Tremblay, Johanne; Gaudet, Daniel; Brunelle, Pierre-Luc; Gurau, Alexandru; Merlo, Ettore; Pilote, Louise; Orlov, Sergei N; Boulva, Francis; Petrovich, Milan; Kotchen, Theodore A; Cowley, Allen W; Hamet, Pavel

    2008-04-01

    The sexual dimorphism of cardiovascular traits, as well as susceptibility to a variety of related diseases, has long been recognized, yet their sex-specific genomic determinants are largely unknown. We systematically assessed the sex-specific heritability and linkage of 539 hemodynamic, metabolic, anthropometric, and humoral traits in 120 French-Canadian families from the Saguenay-Lac-St-Jean region of Quebec, Canada. We performed multipoint linkage analysis using microsatellite markers followed by peak-wide linkage scan based on Affymetrix Human Mapping 50K Array Xba240 single nucleotide polymorphism genotypes in 3 settings, including the entire sample and then separately in men and women. Nearly one half of the traits were age and sex independent, one quarter were both age and sex dependent, and one eighth were exclusively age or sex dependent. Sex-specific phenotypes are most frequent in heart rate and blood pressure categories, whereas sex- and age-independent determinants are predominant among humoral and biochemical parameters. Twenty sex-specific loci passing multiple testing criteria were corroborated by 2-point single nucleotide polymorphism linkage. Several resting systolic blood pressure measurements showed significant genotype-by-sex interaction, eg, male-specific locus at chromosome 12 (male-female logarithm of odds difference: 4.16; interaction P=0.0002), which was undetectable in the entire population, even after adjustment for sex. Detailed interrogation of this locus revealed a 220-kb block overlapping parts of TAO-kinase 3 and SUDS3 genes. In summary, a large number of complex cardiovascular traits display significant sexual dimorphism, for which we have demonstrated genomic determinants at the haplotype level. Many of these would have been missed in a traditional, sex-adjusted setting.

  19. Motor sequencing deficit as an endophenotype of speech sound disorder: a genome-wide linkage analysis in a multigenerational family.

    PubMed

    Peter, Beate; Matsushita, Mark; Raskind, Wendy H

    2012-10-01

    The aim of this pilot study was to investigate a measure of motor sequencing deficit as a potential endophenotype of speech sound disorder (SSD) in a multigenerational family with evidence of familial SSD. In a multigenerational family with evidence of a familial motor-based SSD, affectation status and a measure of motor sequencing during oral motor testing were obtained. To further investigate the role of motor sequencing as an endophenotype for genetic studies, parametric and nonparametric linkage analyses were carried out using a genome-wide panel of 404 microsatellites. In seven of the 10 family members with available data, SSD affectation status and motor sequencing status coincided. Linkage analysis revealed four regions of interest, 6p21, 7q32, 7q36, and 8q24, primarily identified with the measure of motor sequencing ability. The 6p21 region overlaps with a locus implicated in rapid alternating naming in a recent genome-wide dyslexia linkage study. The 7q32 locus contains a locus implicated in dyslexia. The 7q36 locus borders on a gene known to affect the component traits of language impairment. The results are consistent with a motor-based endophenotype of SSD that would be informative for genetic studies. The linkage results in this first genome-wide study in a multigenerational family with SSD warrant follow-up in additional families and with fine mapping or next-generation approaches to gene identification.

  20. Motor sequencing deficit as an endophenotype of speech sound disorder: A genome-wide linkage analysis in a multigenerational family

    PubMed Central

    Peter, Beate; Matsushita, Mark; Raskind, Wendy H.

    2012-01-01

    Objectives The purpose of this pilot study was to investigate a measure of motor sequencing deficit as a potential endophenotype of speech sound disorder (SSD) in a multigenerational family with evidence of familial SSD. Methods In a multigenerational family with evidence of a familial motor-based SSD, affectation status and a measure of motor sequencing during oral motor testing were obtained. To further investigate the role of motor sequencing as an endophenotype for genetic studies, parametric and nonparametric linkage analyses were conducted using a genome-wide panel of 404 microsatellites. Results In seven of the ten family members with available data, SSD affectation status and motor sequencing status coincided. Linkage analysis revealed four regions of interest, 6p21, 7q32, 7q36, and 8q24, primarily identified with the measure of motor sequencing ability. The 6p21 region overlaps with a locus implicated in rapid alternating naming in a recent genome-wide dyslexia linkage study. The 7q32 locus contains a locus implicated in dyslexia. The 7q36 locus borders on a gene known to affect component traits of language impairment. Conclusions Results are consistent with a motor-based endophenotype of SSD that would be informative for genetic studies. The linkage results in this first genome-wide study in a multigenerational family with SSD warrant follow-up in additional families and with fine mapping or next-generation approaches to gene identification. PMID:22517379

  1. Genome-Wide Divergence and Linkage Disequilibrium Analyses for Capsicum baccatum Revealed by Genome-Anchored Single Nucleotide Polymorphisms

    PubMed Central

    Nimmakayala, Padma; Abburi, Venkata L.; Saminathan, Thangasamy; Almeida, Aldo; Davenport, Brittany; Davidson, Joshua; Reddy, C. V. Chandra Mohan; Hankins, Gerald; Ebert, Andreas; Choi, Doil; Stommel, John; Reddy, Umesh K.

    2016-01-01

    Principal component analysis (PCA) with 36,621 polymorphic genome-anchored single nucleotide polymorphisms (SNPs) identified collectively for Capsicum annuum and Capsicum baccatum was used to characterize population structure and species domestication of these two important incompatible cultivated pepper species. Estimated mean nucleotide diversity (π) and Tajima's D across various chromosomes revealed biased distribution toward negative values on all chromosomes (except for chromosome 4) in cultivated C. baccatum, indicating a population bottleneck during domestication of C. baccatum. In contrast, C. annuum chromosomes showed positive π and Tajima's D on all chromosomes except chromosome 8, which may be because of domestication at multiple sites contributing to wider genetic diversity. For C. baccatum, 13,129 SNPs were available, with minor allele frequency (MAF) ≥0.05; PCA of the SNPs revealed 283 C. baccatum accessions grouped into 3 distinct clusters, for strong population structure. The fixation index (FST) between domesticated C. annuum and C. baccatum was 0.78, which indicates genome-wide divergence. We conducted extensive linkage disequilibrium (LD) analysis of C. baccatum var. pendulum cultivars on all adjacent SNP pairs within a chromosome to identify regions of high and low LD interspersed with a genome-wide average LD block size of 99.1 kb. We characterized 1742 haplotypes containing 4420 SNPs (range 9–2 SNPs per haplotype). Genome-wide association study (GWAS) of peduncle length, a trait that differentiates wild and domesticated C. baccatum types, revealed 36 significantly associated genome-wide SNPs. Population structure, identity by state (IBS) and LD patterns across the genome will be of potential use for future GWAS of economically important traits in C. baccatum peppers. PMID:27857720

  2. Genome-Wide Divergence and Linkage Disequilibrium Analyses for Capsicum baccatum Revealed by Genome-Anchored Single Nucleotide Polymorphisms.

    PubMed

    Nimmakayala, Padma; Abburi, Venkata L; Saminathan, Thangasamy; Almeida, Aldo; Davenport, Brittany; Davidson, Joshua; Reddy, C V Chandra Mohan; Hankins, Gerald; Ebert, Andreas; Choi, Doil; Stommel, John; Reddy, Umesh K

    2016-01-01

    Principal component analysis (PCA) with 36,621 polymorphic genome-anchored single nucleotide polymorphisms (SNPs) identified collectively for Capsicum annuum and Capsicum baccatum was used to characterize population structure and species domestication of these two important incompatible cultivated pepper species. Estimated mean nucleotide diversity (π) and Tajima's D across various chromosomes revealed biased distribution toward negative values on all chromosomes (except for chromosome 4) in cultivated C. baccatum, indicating a population bottleneck during domestication of C. baccatum. In contrast, C. annuum chromosomes showed positive π and Tajima's D on all chromosomes except chromosome 8, which may be because of domestication at multiple sites contributing to wider genetic diversity. For C. baccatum, 13,129 SNPs were available, with minor allele frequency (MAF) ≥0.05; PCA of the SNPs revealed 283 C. baccatum accessions grouped into 3 distinct clusters, for strong population structure. The fixation index (FST) between domesticated C. annuum and C. baccatum was 0.78, which indicates genome-wide divergence. We conducted extensive linkage disequilibrium (LD) analysis of C. baccatum var. pendulum cultivars on all adjacent SNP pairs within a chromosome to identify regions of high and low LD interspersed with a genome-wide average LD block size of 99.1 kb. We characterized 1742 haplotypes containing 4420 SNPs (range 9-2 SNPs per haplotype). Genome-wide association study (GWAS) of peduncle length, a trait that differentiates wild and domesticated C. baccatum types, revealed 36 significantly associated genome-wide SNPs. Population structure, identity by state (IBS) and LD patterns across the genome will be of potential use for future GWAS of economically important traits in C. baccatum peppers.

  3. Creative Activities in Music--A Genome-Wide Linkage Analysis.

    PubMed

    Oikkonen, Jaana; Kuusi, Tuire; Peltonen, Petri; Raijas, Pirre; Ukkola-Vuoti, Liisa; Karma, Kai; Onkamo, Päivi; Järvelä, Irma

    2016-01-01

    Creative activities in music represent a complex cognitive function of the human brain, whose biological basis is largely unknown. In order to elucidate the biological background of creative activities in music we performed genome-wide linkage and linkage disequilibrium (LD) scans in musically experienced individuals characterised for self-reported composing, arranging and non-music related creativity. The participants consisted of 474 individuals from 79 families, and 103 sporadic individuals. We found promising evidence for linkage at 16p12.1-q12.1 for arranging (LOD 2.75, 120 cases), 4q22.1 for composing (LOD 2.15, 103 cases) and Xp11.23 for non-music related creativity (LOD 2.50, 259 cases). Surprisingly, statistically significant evidence for linkage was found for the opposite phenotype of creative activity in music (neither composing nor arranging; NCNA) at 18q21 (LOD 3.09, 149 cases), which contains cadherin genes like CDH7 and CDH19. The locus at 4q22.1 overlaps the previously identified region of musical aptitude, music perception and performance giving further support for this region as a candidate region for broad range of music-related traits. The other regions at 18q21 and 16p12.1-q12.1 are also adjacent to the previously identified loci with musical aptitude. Pathway analysis of the genes suggestively associated with composing suggested an overrepresentation of the cerebellar long-term depression pathway (LTD), which is a cellular model for synaptic plasticity. The LTD also includes cadherins and AMPA receptors, whose component GSG1L was linked to arranging. These results suggest that molecular pathways linked to memory and learning via LTD affect music-related creative behaviour. Musical creativity is a complex phenotype where a common background with musicality and intelligence has been proposed. Here, we implicate genetic regions affecting music-related creative behaviour, which also include genes with neuropsychiatric associations. 

  4. A genome-wide linkage study of mammographic density, a risk factor for breast cancer

    PubMed Central

    2011-01-01

    Introduction: Mammographic breast density is a highly heritable (h² > 0.6) and strong risk factor for breast cancer. We conducted a genome-wide linkage study to identify loci influencing mammographic breast density (MD). Methods: Epidemiological data were assembled on 1,415 families from the Australia, Northern California and Ontario sites of the Breast Cancer Family Registry, and additional families recruited in Australia and Ontario. Families consisted of sister pairs with age-matched mammograms and data on factors known to influence MD. Single nucleotide polymorphism (SNP) genotyping was performed on 3,952 individuals using the Illumina Infinium 6K linkage panel. Results: Using a variance components method, genome-wide linkage analysis was performed using quantitative traits obtained by adjusting MD measurements for known covariates. Our primary trait was formed by fitting a linear model to the square root of the percentage of the breast area that was dense (PMD), adjusting for age at mammogram, number of live births, menopausal status, weight, height, weight squared, and menopausal hormone therapy. The maximum logarithm of odds (LOD) score from the genome-wide scan was on chromosome 7p14.1-p13 (LOD = 2.69; 63.5 cM) for covariate-adjusted PMD, with a 1-LOD interval spanning 8.6 cM. A similar signal was seen for the covariate-adjusted area of the breast that was dense (DA) phenotype. Simulations showed that the complete sample had adequate power to detect LOD scores of 3 or 3.5 for a locus accounting for 20% of phenotypic variance. A modest peak initially seen on chromosome 7q32.3-q34 increased in strength when only the 513 families with at least two sisters below 50 years of age were included in the analysis (LOD 3.2; 140.7 cM, 1-LOD interval spanning 9.6 cM). In a subgroup analysis, we also found a LOD score of 3.3 for DA phenotype on chromosome 12p11.22-q13.11 (60.8 cM, 1-LOD interval spanning 9.3 cM), overlapping a region identified in a previous study
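
    The covariate adjustment described in this abstract (a linear model fitted to the square root of PMD, with the residuals serving as the quantitative trait) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names, the two example covariates, and all numbers are hypothetical, and ordinary least squares via the normal equations is an assumed fitting method.

```python
import math


def solve_linear_system(a, b):
    """Solve a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def covariate_adjusted_trait(pmd_percent, covariates):
    """Residuals of sqrt(PMD) after regressing on an intercept plus covariates."""
    y = [math.sqrt(v) for v in pmd_percent]
    design = [[1.0] + list(row) for row in covariates]
    p = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(p)]
    beta = solve_linear_system(xtx, xty)
    return [yi - sum(b * xi for b, xi in zip(beta, r)) for r, yi in zip(design, y)]


# Hypothetical data: (age at mammogram, weight in kg) for four women.
covs = [(40, 60), (50, 70), (60, 65), (45, 80)]
pmd = [16.0, 25.0, 36.0, 20.0]
print(covariate_adjusted_trait(pmd, covs))
```

    Because the model includes an intercept, the residuals sum to zero, so the adjusted trait is centred before it enters the variance components analysis.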

  5. Genome-Wide Characterization and Linkage Mapping of Simple Sequence Repeats in Mei (Prunus mume Sieb. et Zucc.)

    PubMed Central

    Sun, Lidan; Yang, Weiru; Zhang, Qixiang; Cheng, Tangren; Pan, Huitang; Xu, Zongda; Zhang, Jie; Chen, Chuguang

    2013-01-01

    Because of its popularity as an ornamental plant in East Asia, mei (Prunus mume Sieb. et Zucc.) has received increasing attention in genetic and genomic research with the recent shotgun sequencing of its genome. Here, we performed the genome-wide characterization of simple sequence repeats (SSRs) in the mei genome and detected a total of 188,149 SSRs occurring at a frequency of 794 SSR/Mb. Mononucleotide repeats were the most common type of SSR in genomic regions, followed by di- and tetranucleotide repeats. Most of the SSRs in coding sequences (CDS) were composed of tri- or hexanucleotide repeat motifs, but mononucleotide repeats were always the most common in intergenic regions. Genome-wide comparison of SSR patterns among the mei, strawberry (Fragaria vesca), and apple (Malus×domestica) genomes showed mei to have the highest density of SSRs, slightly higher than that of strawberry (608 SSR/Mb) and almost twice as high as that of apple (398 SSR/Mb). Mononucleotide repeats were the dominant SSR motifs in the three Rosaceae species. Using 144 SSR markers, we constructed a 670 cM-long linkage map of mei delimited into eight linkage groups (LGs), with an average marker distance of 5 cM. Seventy one scaffolds covering about 27.9% of the assembled mei genome were anchored to the genetic map, depending on which the macro-colinearity between the mei genome and Prunus T×E reference map was identified. The framework map of mei constructed provides a first step into subsequent high-resolution genetic mapping and marker-assisted selection for this ornamental species. PMID:23555708
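
    A genome-wide SSR survey of the kind summarized above amounts to scanning a sequence for short motifs repeated in tandem. The sketch below shows the idea with regular expressions. It is an illustration under stated assumptions: the minimum repeat counts in MIN_REPEATS and the function names are hypothetical and are not taken from the study, which does not report its detection settings in the abstract.

```python
import re

# Hypothetical thresholds: motif length in bp -> minimum number of tandem copies.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 4, 5: 4, 6: 4}


def is_redundant_motif(motif):
    """True if the motif is itself a repeat of a shorter unit (e.g. 'ATAT')."""
    n = len(motif)
    return any(n % k == 0 and motif == motif[:k] * (n // k) for k in range(1, n))


def find_ssrs(sequence):
    """Return sorted (start, motif, copies) tuples for perfect SSRs."""
    seq = sequence.upper()
    hits = []
    for k, min_copies in MIN_REPEATS.items():
        # One motif of length k followed by at least min_copies - 1 more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_copies - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if is_redundant_motif(motif):
                continue  # already counted under its shorter unit
            hits.append((match.start(), motif, len(match.group(0)) // k))
    return sorted(hits)


def ssr_density_per_mb(n_ssrs, sequence_length_bp):
    """SSRs per megabase of sequence."""
    return n_ssrs / (sequence_length_bp / 1_000_000)


toy = "GG" + "AC" * 7 + "TT" + "AAG" * 5 + "C"
print(find_ssrs(toy))
```

    Dividing the reported 188,149 SSRs by the reported density of 794 SSR/Mb implies roughly 237 Mb of scanned sequence; that figure is derived here from the two reported numbers and is not stated in the abstract.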

  6. Genome-wide Linkage Scan of Antisocial Behavior, Depression and Impulsive Substance Use in the UCSF Family Alcoholism Study

    PubMed Central

    Gizer, Ian R.; Ehlers, Cindy L.; Vieten, Cassandra; Feiler, Heidi S.; Gilder, David A.; Wilhelmsen, Kirk C.

    2012-01-01

    OBJECTIVE Epidemiological and clinical studies suggest that rates of antisocial behavior, depression, and impulsive substance use are increased among individuals diagnosed with alcohol dependence relative to those who are not. Thus, the present study conducted genome-wide linkage scans of antisocial behavior, depression, and impulsive substance use in the University of California at San Francisco Family Alcoholism Study. METHODS Antisocial behavior, depressive symptoms, and impulsive substance use were assessed using three scales from the MMPI-2, the Antisocial Practices content scale (ASP), the Depression content scale (DEP), and the revised MacAndrew Alcoholism scale (MAC-R). Linkage analyses were conducted using a variance components approach. RESULTS Suggestive evidence of linkage to three genomic regions independent of alcohol and cannabis dependence diagnostic status was observed: the ASP scale showed evidence of linkage to chromosome 13 at 11 cM, the MAC-R scale showed evidence of linkage to chromosome 15 at 47 cM, and all 3 scales showed evidence of linkage to chromosome 17 at 57–58 cM. CONCLUSIONS Each of these regions has shown prior evidence of linkage and association to substance dependence as well as other psychiatric disorders such as mood and anxiety disorders, ADHD, and schizophrenia thus suggesting potentially broad relations between these regions and psychopathology. PMID:22517380

  7. Genome-wide distribution of genetic diversity and linkage disequilibrium in a mass-selected population of maritime pine

    PubMed Central

    2014-01-01

    Background: The accessibility of high-throughput genotyping technologies has contributed greatly to the development of genomic resources in non-model organisms. High-density genotyping arrays have only recently been developed for some economically important species such as conifers. The potential for using genomic technologies in association mapping and breeding depends largely on the genome-wide patterns of diversity and linkage disequilibrium in current breeding populations. This study aims to deepen our knowledge regarding these issues in maritime pine, the first species used for reforestation in southwestern Europe. Results: Using a new map merging algorithm, we first established a 1,712 cM composite linkage map (comprising 1,838 SNP markers in 12 linkage groups) by bringing together three already available genetic maps. Using rigorous statistical testing based on kernel density estimation and resampling, we identified cold and hot spots of recombination. In parallel, 186 unrelated trees of a mass-selected population were genotyped using a 12k-SNP array. A total of 2,600 informative SNPs allowed us to describe historical recombination, genetic diversity and genetic structure of this recently domesticated breeding pool that forms the basis of much of the current and future breeding of this species. We observe very low levels of population genetic structure and find no evidence that artificial selection has caused a reduction in genetic diversity. By combining these two pieces of information, we provided the map position of 1,671 SNPs corresponding to 1,192 different loci. This made it possible to analyze the spatial pattern of genetic diversity (He) and long distance linkage disequilibrium (LD) along the chromosomes. We found no particular pattern in the empirical variogram of He across the 12 linkage groups and, as expected for an outcrossing species with large effective population size, we observed an almost complete lack of long distance LD.

  8. Creative Activities in Music – A Genome-Wide Linkage Analysis

    PubMed Central

    Oikkonen, Jaana; Kuusi, Tuire; Peltonen, Petri; Raijas, Pirre; Ukkola-Vuoti, Liisa; Karma, Kai; Onkamo, Päivi; Järvelä, Irma

    2016-01-01

    Creative activities in music represent a complex cognitive function of the human brain, whose biological basis is largely unknown. In order to elucidate the biological background of creative activities in music we performed genome-wide linkage and linkage disequilibrium (LD) scans in musically experienced individuals characterised for self-reported composing, arranging and non-music related creativity. The participants consisted of 474 individuals from 79 families, and 103 sporadic individuals. We found promising evidence for linkage at 16p12.1-q12.1 for arranging (LOD 2.75, 120 cases), 4q22.1 for composing (LOD 2.15, 103 cases) and Xp11.23 for non-music related creativity (LOD 2.50, 259 cases). Surprisingly, statistically significant evidence for linkage was found for the opposite phenotype of creative activity in music (neither composing nor arranging; NCNA) at 18q21 (LOD 3.09, 149 cases), which contains cadherin genes like CDH7 and CDH19. The locus at 4q22.1 overlaps the previously identified region of musical aptitude, music perception and performance giving further support for this region as a candidate region for broad range of music-related traits. The other regions at 18q21 and 16p12.1-q12.1 are also adjacent to the previously identified loci with musical aptitude. Pathway analysis of the genes suggestively associated with composing suggested an overrepresentation of the cerebellar long-term depression pathway (LTD), which is a cellular model for synaptic plasticity. The LTD also includes cadherins and AMPA receptors, whose component GSG1L was linked to arranging. These results suggest that molecular pathways linked to memory and learning via LTD affect music-related creative behaviour. Musical creativity is a complex phenotype where a common background with musicality and intelligence has been proposed. Here, we implicate genetic regions affecting music-related creative behaviour, which also include genes with neuropsychiatric associations. 

  9. Genome-wide linkage scan for contraction velocity characteristics of knee musculature in the Leuven Genes for Muscular Strength Study.

    PubMed

    De Mars, Gunther; Windelinckx, An; Huygens, Wim; Peeters, Maarten W; Beunen, Gaston P; Aerssens, Jeroen; Vlietinck, Robert; Thomis, Martine A I

    2008-09-17

    The torque-velocity relationship is known to be affected by ageing, decreasing its protective role in the prevention of falls. Interindividual variability in this torque-velocity relationship is partly determined by genetic factors (h²: 44-67%). As a first attempt, this genome-wide linkage study aimed to identify chromosomal regions linked to the torque-velocity relationship of the knee flexors and extensors. A selection of 283 informative male siblings (17-36 yr), belonging to 105 families, was used to conduct a genome-wide SNP-based (Illumina Linkage IVb panel) multipoint linkage analysis for the torque-velocity relationship of the knee flexors and extensors. The strongest evidence for linkage was found at 15q23 for the torque-velocity slope of the knee extensors (TVSE). Other interesting linkage regions with LOD scores >2 were found at 7p12.3 [logarithm of the odds ratio (LOD) = 2.03, P = 0.0011] for the torque-velocity ratio of the knee flexors (TVRF), at 2q14.3 (LOD = 2.25, P = 0.0006) for TVSE, and at 4p14 and 18q23 for the torque-velocity ratio of the knee extensors TVRE (LOD = 2.23 and 2.08; P = 0.0007 and 0.001, respectively). We conclude that many small contributing genes are involved in causing variation in the torque-velocity relationship of the knee flexor and extensor muscles. Several earlier reported candidate genes for muscle strength and muscle mass and new candidates are harbored within or in close vicinity of the linkage regions reported in the present study.
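
    The LOD scores and P values paired in this abstract can be related through the conventional conversion in which 2·ln(10)·LOD is treated as a chi-square statistic with one degree of freedom and the tail probability is halved for a one-sided test. The sketch below applies that conversion. It is an assumption that the authors used it, since the abstract does not say so, although the conversion agrees with the four reported pairs to their printed precision. The function name is hypothetical.

```python
import math


def lod_to_pointwise_p(lod):
    """One-sided pointwise p-value for a LOD score.

    Treats 2 * ln(10) * LOD as a 1-df chi-square statistic and halves the
    upper-tail probability. For 1 df, P(X > x) = erfc(sqrt(x / 2)).
    """
    if lod < 0:
        raise ValueError("LOD score must be non-negative")
    chi_square = 2.0 * math.log(10.0) * lod
    return 0.5 * math.erfc(math.sqrt(chi_square / 2.0))


# LOD scores quoted in the abstract above.
for lod in (2.03, 2.25, 2.23, 2.08):
    print(lod, round(lod_to_pointwise_p(lod), 4))
```

    These are pointwise values for a single location; they do not correct for testing across the whole genome, which is why a LOD near 2 is described only as an interesting region.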

  10. Genome-wide scan of IQ finds significant linkage to a quantitative trait locus on 2q.

    PubMed

    Luciano, M; Wright, M J; Duffy, D L; Wainwright, M A; Zhu, G; Evans, D M; Geffen, G M; Montgomery, G W; Martin, N G

    2006-01-01

    A genome-wide linkage scan of 795 microsatellite markers (761 autosomal, 34 X chromosome) was performed on Multidimensional Aptitude Battery subtests and verbal, performance and full scale scores, the WAIS-R Digit Symbol subtest, and two word-recognition tests (Schonell Graded Word Reading Test, Cambridge Contextual Reading Test) highly predictive of IQ. The sample included 361 families comprising 2-5 siblings who ranged in age from 15.7 to 22.2 years; genotype, but not phenotype, data were available for 81% of parents. A variance components analysis which controlled for age and sex effects showed significant linkage for the Cambridge reading test and performance IQ to the same region on chromosome 2, with respective LOD scores of 4.15 and 3.68. Suggestive linkage (LOD score>2.2) for various measures was further supported on chromosomes 6, 7, 11, 14, 21 and 22. Where location of linkage peaks converged for IQ subtests within the same scale, the overall scale score provided increased evidence for linkage to that region over any individual subtest. Association studies of candidate genes, particularly those involved in neural transmission and development, will be directed to genes located under the linkage peaks identified in this study.

  11. Genome-wide linkage meta-analysis identifies susceptibility loci at 2q34 and 13q31.3 for genetic generalized epilepsies.

    PubMed

    Leu, Costin; de Kovel, Carolien G F; Zara, Federico; Striano, Pasquale; Pezzella, Marianna; Robbiano, Angela; Bianchi, Amedeo; Bisulli, Francesca; Coppola, Antonietta; Giallonardo, Anna Teresa; Beccaria, Francesca; Trenité, Dorothée Kasteleijn-Nolst; Lindhout, Dick; Gaus, Verena; Schmitz, Bettina; Janz, Dieter; Weber, Yvonne G; Becker, Felicitas; Lerche, Holger; Kleefuss-Lie, Ailing A; Hallman, Kerstin; Kunz, Wolfram S; Elger, Christian E; Muhle, Hiltrud; Stephani, Ulrich; Møller, Rikke S; Hjalgrim, Helle; Mullen, Saul; Scheffer, Ingrid E; Berkovic, Samuel F; Everett, Kate V; Gardiner, Mark R; Marini, Carla; Guerrini, Renzo; Lehesjoki, Anna-Elina; Siren, Auli; Nabbout, Rima; Baulac, Stephanie; Leguern, Eric; Serratosa, Jose M; Rosenow, Felix; Feucht, Martha; Unterberger, Iris; Covanis, Athanasios; Suls, Arvid; Weckhuysen, Sarah; Kaneva, Radka; Caglayan, Hande; Turkdogan, Dilsad; Baykan, Betul; Bebek, Nerses; Ozbek, Ugur; Hempelmann, Anne; Schulz, Herbert; Rüschendorf, Franz; Trucks, Holger; Nürnberg, Peter; Avanzini, Giuliano; Koeleman, Bobby P C; Sander, Thomas

    2012-02-01

    Genetic generalized epilepsies (GGEs) have a lifetime prevalence of 0.3% with heritability estimates of 80%. A considerable proportion of families with siblings affected by GGEs presumably display an oligogenic inheritance. The present genome-wide linkage meta-analysis aimed to map: (1) susceptibility loci shared by a broad spectrum of GGEs, and (2) seizure type-related genetic factors preferentially predisposing to either typical absence or myoclonic seizures, respectively. Meta-analysis of three genome-wide linkage datasets was carried out in 379 GGE-multiplex families of European ancestry including 982 relatives with GGEs. To dissect out seizure type-related susceptibility genes, two family subgroups were stratified comprising 235 families with predominantly genetic absence epilepsies (GAEs) and 118 families with an aggregation of juvenile myoclonic epilepsy (JME). To map shared and seizure type-related susceptibility loci, both nonparametric loci (NPL) and parametric linkage analyses were performed for a broad trait model (GGEs) in the entire set of GGE-multiplex families and a narrow trait model (typical absence or myoclonic seizures) in the subgroups of JME and GAE families. For the entire set of 379 GGE-multiplex families, linkage analysis revealed six loci achieving suggestive evidence for linkage at 1p36.22, 3p14.2, 5q34, 13q12.12, 13q31.3, and 19q13.42. The linkage finding at 5q34 was consistently supported by both NPL and parametric linkage results across all three family groups. A genome-wide significant nonparametric logarithm of odds score of 3.43 was obtained at 2q34 in 118 JME families. Significant parametric linkage to 13q31.3 was found in 235 GAE families assuming recessive inheritance (heterogeneity logarithm of odds = 5.02). Our linkage results support an oligogenic predisposition of familial GGE syndromes. The genetic risk factor at 5q34 confers risk to a broad spectrum of familial GGE syndromes, whereas susceptibility loci at 2q34 and 13q31

  12. Genome-wide linkage scan for maximum and length-dependent knee muscle strength in young men: significant evidence for linkage at chromosome 14q24.3.

    PubMed

    De Mars, G; Windelinckx, A; Huygens, W; Peeters, M W; Beunen, G P; Aerssens, J; Vlietinck, R; Thomis, M A I

    2008-05-01

    Maintenance of high muscular fitness is positively related to bone health, functionality in daily life and increasing insulin sensitivity, and negatively related to falls and fractures, morbidity and mortality. Heritability of muscle strength phenotypes ranges between 31% and 95%, but little is known about the identity of the genes underlying this complex trait. As a first attempt, this genome-wide linkage study aimed to identify chromosomal regions linked to muscle and bone cross-sectional area, isometric knee flexion and extension torque, and torque-length relationship for knee flexors and extensors. In total, 283 informative male siblings (17-36 years old), belonging to 105 families, were used to conduct a genome-wide SNP-based multipoint linkage analysis. The strongest evidence for linkage was found for the torque-length relationship of the knee flexors at 14q24.3 (LOD = 4.09; p < 10^-5). Suggestive evidence for linkage was found at 14q32.2 (LOD = 3.00; P = 0.005) for muscle and bone cross-sectional area, at 2p24.2 (LOD = 2.57; p = 0.01) for isometric knee torque at 30 degrees flexion, at 1q21.3, 2p23.3 and 18q11.2 (LOD = 2.33, 2.69 and 2.21; p < 10^-4 for all) for the torque-length relationship of the knee extensors and at 18p11.31 (LOD = 2.39; p = 0.0004) for muscle-mass adjusted isometric knee extension torque. We conclude that many small contributing genes rather than a few important genes are involved in causing variation in different underlying phenotypes of muscle strength. Furthermore, some overlap in promising genomic regions was identified among different strength phenotypes.

  13. Genome-wide linkage scan for maximum and length-dependent knee muscle strength in young men: significant evidence for linkage at chromosome 14q24.3

    PubMed Central

    De Mars, G; Windelinckx, A; Huygens, W; Peeters, M W; Beunen, G P; Aerssens, J; Vlietinck, R; Thomis, M A I

    2008-01-01

    Background: Maintenance of high muscular fitness is positively related to bone health, functionality in daily life and increasing insulin sensitivity, and negatively related to falls and fractures, morbidity and mortality. Heritability of muscle strength phenotypes ranges between 31% and 95%, but little is known about the identity of the genes underlying this complex trait. As a first attempt, this genome-wide linkage study aimed to identify chromosomal regions linked to muscle and bone cross-sectional area, isometric knee flexion and extension torque, and torque–length relationship for knee flexors and extensors. Methods: In total, 283 informative male siblings (17–36 years old), belonging to 105 families, were used to conduct a genome-wide SNP-based multipoint linkage analysis. Results: The strongest evidence for linkage was found for the torque–length relationship of the knee flexors at 14q24.3 (LOD = 4.09; p<10^-5). Suggestive evidence for linkage was found at 14q32.2 (LOD = 3.00; P = 0.005) for muscle and bone cross-sectional area, at 2p24.2 (LOD = 2.57; p = 0.01) for isometric knee torque at 30° flexion, at 1q21.3, 2p23.3 and 18q11.2 (LOD = 2.33, 2.69 and 2.21; p<10^-4 for all) for the torque–length relationship of the knee extensors and at 18p11.31 (LOD = 2.39; p = 0.0004) for muscle-mass adjusted isometric knee extension torque. Conclusions: We conclude that many small contributing genes rather than a few important genes are involved in causing variation in different underlying phenotypes of muscle strength. Furthermore, some overlap in promising genomic regions was identified among different strength phenotypes. PMID:18178634

  14. A Genome-Wide Linkage Scan for Age at Menarche in Three Populations of European Descent

    PubMed Central

    Anderson, Carl A.; Zhu, Gu; Falchi, Mario; van den Berg, Stéphanie M.; Treloar, Susan A.; Spector, Timothy D.; Martin, Nicholas G.; Boomsma, Dorret I.; Visscher, Peter M.; Montgomery, Grant W.

    2008-01-01

    Context: Age at menarche (AAM) is an important trait both biologically and socially, a clearly defined event in female pubertal development, and has been associated with many clinically significant phenotypes. Objective: The objective of the study was to identify genetic loci influencing variation in AAM in large population-based samples from three countries. Design/Participants: Recalled AAM data were collected from 13,697 individuals and 4,899 pseudoindependent sister-pairs from three different populations (Australia, The Netherlands, and the United Kingdom) by mailed questionnaire or interview. Genome-wide variance components linkage analysis was implemented on each sample individually and in combination. Results: The mean, sd, and heritability of AAM across the three samples was 13.1 yr, 1.5 yr, and 0.69, respectively. No loci were detected that reached genome-wide significance in the combined analysis, but a suggestive locus was detected on chromosome 12 (logarithm of the odds = 2.0). Three loci of suggestive significance were seen in the U.K. sample on chromosomes 1, 4, and 18 (logarithm of the odds = 2.4, 2.2 and 3.2, respectively). Conclusions: There was no evidence for common highly penetrant variants influencing AAM. Linkage and association suggest that one trait locus for AAM is located on chromosome 12, but further studies are required to replicate these results. PMID:18647812

  15. Genome-wide linkage scan for submaximal exercise heart rate in the HERITAGE family study.

    PubMed

    Spielmann, Nadine; Leon, Arthur S; Rao, D C; Rice, Treva; Skinner, James S; Rankinen, Tuomo; Bouchard, Claude

    2007-12-01

    The purpose of this study was to identify regions of the human genome linked to submaximal exercise heart rates in the sedentary state and in response to a standardized 20-wk endurance training program in blacks and whites of the HERITAGE Family Study. A total of 701 polymorphic markers covering the 22 autosomes were used in the genome-wide linkage scan, with 328 sibling pairs from 99 white nuclear families and 102 pairs from 115 black family units. Steady-state heart rates were measured at the relative intensity of 60% maximal oxygen uptake (HR60) and at the absolute intensity of 50 W (HR50). Baseline phenotypes were adjusted for age, sex, and baseline body mass index (BMI) and training responses (posttraining minus baseline, Delta) were adjusted for age, sex, baseline BMI, and baseline value of the phenotype. Two analytic strategies were used, a multipoint variance components and a regression-based multipoint linkage analysis. In whites, promising linkages (LOD > 1.75) were identified on 18q21-q22 for baseline HR50 (LOD = 2.64; P = 0.0002) and DeltaHR60 (LOD = 2.10; P = 0.0009) and on chromosome 2q33.3 for DeltaHR50 (LOD = 2.13; P = 0.0009). In blacks, evidence of promising linkage for baseline HR50 was detected with several markers within the chromosomal region 10q24-q25.3 (peak LOD = 2.43, P = 0.0004 with D10S597). The most promising regions for fine mapping in the HERITAGE Family Study were found on 2q33 for HR50 training response in whites, on 10q25-26 for baseline HR60 in blacks, and on 18q21-22 for both baseline HR50 and DeltaHR60 in whites.
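The paired LOD scores and P values quoted in the record above look consistent with the common large-sample conversion, in which 2·ln(10)·LOD is treated as a chi-square statistic with one degree of freedom and the test is one-sided. The abstract does not say how its P values were obtained, so the Python sketch below is only an illustration of that arithmetic; the function name `lod_to_pointwise_p` is hypothetical.

```python
import math


def lod_to_pointwise_p(lod: float) -> float:
    """Approximate one-sided pointwise P value for a LOD score.

    Assumption (not stated in the abstract): 2*ln(10)*LOD follows a
    chi-square distribution with 1 degree of freedom under the null,
    and the one-sided P value is half of the upper tail.
    """
    if lod < 0:
        raise ValueError("LOD score must be non-negative for this approximation")
    chi2 = 2.0 * math.log(10.0) * lod
    # For 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    upper_tail = math.erfc(math.sqrt(chi2 / 2.0))
    return 0.5 * upper_tail


if __name__ == "__main__":
    # LOD scores taken from the abstract above.
    for lod in (2.64, 2.10, 2.13, 2.43):
        print(f"LOD = {lod:.2f} -> approximate P = {lod_to_pointwise_p(lod):.4f}")
```

Under these assumptions the four LOD scores map to values close to the P values reported in the abstract; genome-wide significance would additionally require a correction for multiple testing across the scan.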

  16. Genome-wide patterns of recombination, linkage disequilibrium and nucleotide diversity from pooled resequencing and single nucleotide polymorphism genotyping unlock the evolutionary history of Eucalyptus grandis.

    PubMed

    Silva-Junior, Orzenil B; Grattapaglia, Dario

    2015-11-01

    We used high-density single nucleotide polymorphism (SNP) data and whole-genome pooled resequencing to examine the landscape of population recombination (ρ) and nucleotide diversity (ϴw), assess the extent of linkage disequilibrium (r²) and build the highest density linkage maps for Eucalyptus. At the genome-wide level, linkage disequilibrium (LD) decayed within c. 4-6 kb, slower than previously reported from candidate gene studies, but showing considerable variation from absence to complete LD up to 50 kb. A sharp decrease in the estimate of ρ was seen when going from short to genome-wide inter-SNP distances, highlighting the dependence of this parameter on the scale of observation adopted. Recombination was correlated with nucleotide diversity, gene density and distance from the centromere, with hotspots of recombination enriched for genes involved in chemical reactions and pathways of the normal metabolic processes. The high nucleotide diversity (ϴw = 0.022) of E. grandis revealed that mutation is more important than recombination in shaping its genomic diversity (ρ/ϴw = 0.645). Chromosome-wide ancestral recombination graphs allowed us to date the split of E. grandis (1.7-4.8 million yr ago) and identify a scenario for the recent demographic history of the species. Our results have considerable practical importance to Genome Wide Association Studies (GWAS), while indicating bright prospects for genomic prediction of complex phenotypes in eucalypt breeding. © 2015 The Authors. New Phytologist © 2015 New Phytologist Trust.
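The linkage disequilibrium measure r² mentioned in the record above is the squared correlation between alleles at two loci. As a minimal sketch, assuming phased haplotype frequencies are available, it can be computed as below. The function name `ld_r2` and the example frequencies are hypothetical, and this is not necessarily how the study estimated r² from pooled or genotype data.

```python
def ld_r2(p_ab: float, p_a: float, p_b: float) -> float:
    """Squared correlation (r^2) between alleles A and B at two biallelic loci.

    p_ab: frequency of the haplotype carrying both A and B
    p_a:  frequency of allele A at the first locus
    p_b:  frequency of allele B at the second locus
    """
    if not (0.0 < p_a < 1.0 and 0.0 < p_b < 1.0):
        raise ValueError("both loci must be polymorphic (0 < frequency < 1)")
    # D is the deviation of the haplotype frequency from its
    # expectation under independence of the two loci.
    d = p_ab - p_a * p_b
    return (d * d) / (p_a * (1.0 - p_a) * p_b * (1.0 - p_b))


if __name__ == "__main__":
    # Hypothetical frequencies, chosen only to show the two extremes.
    print(ld_r2(0.50, 0.5, 0.5))  # alleles always co-occur: complete LD
    print(ld_r2(0.25, 0.5, 0.5))  # haplotype at its independence expectation: no LD
```

Values near 1 indicate that the two SNPs carry nearly the same information, which is why the distance over which r² decays determines the marker density needed for association studies.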

  17. A Genome-Wide Linkage Study for Chronic Obstructive Pulmonary Disease in a Dutch Genetic Isolate Identifies Novel Rare Candidate Variants.

    PubMed

    Nedeljkovic, Ivana; Terzikhan, Natalie; Vonk, Judith M; van der Plaat, Diana A; Lahousse, Lies; van Diemen, Cleo C; Hobbs, Brian D; Qiao, Dandi; Cho, Michael H; Brusselle, Guy G; Postma, Dirkje S; Boezen, H M; van Duijn, Cornelia M; Amin, Najaf

    2018-01-01

    Chronic obstructive pulmonary disease (COPD) is a complex and heritable disease, associated with multiple genetic variants. Specific familial types of COPD may be explained by rare variants, which have not been widely studied. We aimed to discover rare genetic variants underlying COPD through a genome-wide linkage scan. Affected-only analysis was performed using the 6K Illumina Linkage IV Panel in 142 cases clustered in 27 families from a genetic isolate, the Erasmus Rucphen Family (ERF) study. Potential causal variants were identified by searching for shared rare variants in the exome-sequence data of the affected members of the families contributing most to the linkage peak. The identified rare variants were then tested for association with COPD in a large meta-analysis of several cohorts. Significant evidence for linkage was observed on chromosomes 15q14-15q25 [logarithm of the odds (LOD) score = 5.52], 11p15.4-11q14.1 (LOD = 3.71) and 5q14.3-5q33.2 (LOD = 3.49). In the chromosome 15 peak, which harbors the known COPD locus for nicotinic receptors, and in the chromosome 5 peak we could not identify shared variants. In the chromosome 11 locus, we identified four rare (minor allele frequency (MAF) <0.02), predicted pathogenic, missense variants. These were shared among the affected family members. The identified variants localize to genes including neuroblast differentiation-associated protein (AHNAK), previously associated with blood biomarkers in COPD, phospholipase C Beta 3 (PLCB3), shown to increase airway hyper-responsiveness, solute carrier family 22-A11 (SLC22A11), involved in amino acid metabolism and ion transport, and metallothionein-like protein 5 (MTL5), involved in nicotinate and nicotinamide metabolism. Association of SLC22A11 and MTL5 variants was confirmed in the meta-analysis of 9,888 cases and 27,060 controls. In conclusion, we have identified novel rare variants in plausible genes related to COPD. Further studies utilizing large sample

  18. First-generation linkage map of the gray, short-tailed opossum, Monodelphis domestica, reveals genome-wide reduction in female recombination rates.

    PubMed Central

    Samollow, Paul B; Kammerer, Candace M; Mahaney, Susan M; Schneider, Jennifer L; Westenberger, Scott J; VandeBerg, John L; Robinson, Edward S

    2004-01-01

    The gray, short-tailed opossum, Monodelphis domestica, is the most extensively used, laboratory-bred marsupial resource for basic biologic and biomedical research worldwide. To enhance the research utility of this species, we are building a linkage map, using both anonymous markers and functional gene loci, that will enable the localization of quantitative trait loci (QTL) and provide comparative information regarding the evolution of mammalian and other vertebrate genomes. The current map is composed of 83 loci distributed among eight autosomal linkage groups and the X chromosome. The autosomal linkage groups appear to encompass a very large portion of the genome, yet span a sex-average distance of only 633.0 cM, making this the most compact linkage map known among vertebrates. Most surprising, the male map is much larger than the female map (884.6 cM vs. 443.1 cM), a pattern contrary to that in eutherian mammals and other vertebrates. The finding of genome-wide reduction in female recombination in M. domestica, coupled with recombination data from two other, distantly related marsupial species, suggests that reduced female recombination might be a widespread metatherian attribute. We discuss possible explanations for reduced female recombination in marsupials as a consequence of the metatherian characteristic of determinate paternal X chromosome inactivation. PMID:15020427

  19. Neuropsychological Endophenotype Approach to Genome-wide Linkage Analysis Identifies Susceptibility Loci for ADHD on 2q21.1 and 13q12.11

    PubMed Central

    Rommelse, Nanda N.J.; Arias-Vásquez, Alejandro; Altink, Marieke E.; Buschgens, Cathelijne J.M.; Fliers, Ellen; Asherson, Philip; Faraone, Stephen V.; Buitelaar, Jan K.; Sergeant, Joseph A.; Oosterlaan, Jaap; Franke, Barbara

    2008-01-01

    ADHD linkage findings have not all been consistently replicated, suggesting that other approaches to linkage analysis in ADHD might be necessary, such as the use of (quantitative) endophenotypes (heritable traits associated with an increased risk for ADHD). Genome-wide linkage analyses were performed in the Dutch subsample of the International Multi-Center ADHD Genetics (IMAGE) study comprising 238 DSM-IV combined-type ADHD probands and their 112 affected and 195 nonaffected siblings. Eight candidate neuropsychological ADHD endophenotypes with heritabilities > 0.2 were used as quantitative traits. In addition, an overall component score of neuropsychological functioning was used. A total of 5407 autosomal single-nucleotide polymorphisms (SNPs) were used to run multipoint regression-based linkage analyses. Two significant genome-wide linkage signals were found, one for Motor Timing on chromosome 2q21.1 (LOD score: 3.944) and one for Digit Span on 13q12.11 (LOD score: 3.959). Ten suggestive linkage signals were found (LOD scores ≥ 2) on chromosomes 2p, 2q, 3p, 4q, 8q, 12p, 12q, 14q, and 17q. The suggestive linkage signal for the component score that was found at 2q14.3 (LOD score: 2.878) overlapped with the region significantly linked to Motor Timing. Endophenotype approaches may increase power to detect susceptibility loci in ADHD and possibly in other complex disorders. PMID:18599010

  20. Single nucleotide polymorphisms generated by genotyping by sequencing to characterize genome-wide diversity, linkage disequilibrium, and selective sweeps in cultivated watermelon

    USDA-ARS's Scientific Manuscript database

    Large datasets containing single nucleotide polymorphisms (SNPs) are used to analyze genome-wide diversity in a robust collection of cultivars from representative accessions, across the world. The extent of linkage disequilibrium (LD) within a population determines the number of markers required fo...

  1. A combined genome-wide linkage and association approach to find susceptibility loci for platelet function phenotypes in European American and African American families with coronary artery disease

    PubMed Central

    2010-01-01

    Background: The inability of aspirin (ASA) to adequately suppress platelet aggregation is associated with future risk of coronary artery disease (CAD). Heritability studies of agonist-induced platelet function phenotypes suggest that genetic variation may be responsible for ASA responsiveness. In this study, we leverage independent information from genome-wide linkage and association data to determine loci controlling platelet phenotypes before and after treatment with ASA. Methods: Clinical data on 37 agonist-induced platelet function phenotypes were evaluated before and after a 2-week trial of ASA (81 mg/day) in 1231 European American and 846 African American healthy subjects with a family history of premature CAD. Principal component analysis was performed to minimize the number of independent factors underlying the covariance of these various phenotypes. Multi-point sib-pair based linkage analysis was performed using a microsatellite marker set, and single-SNP association tests were performed using markers from the Illumina 1 M genotyping chip from deCODE Genetics, Inc. All analyses were performed separately within each ethnic group. Results: Several genomic regions appear to be linked to ASA response factors: a 10 cM region in African Americans on chromosome 5q11.2 had several STRs with suggestive (p-value < 7 × 10^-4) and significant (p-value < 2 × 10^-5) linkage to post aspirin platelet response to ADP, and ten additional factors had suggestive evidence for linkage (p-value < 7 × 10^-4) to thirteen genomic regions. All but one of these factors were aspirin response variables. While the strength of genome-wide SNP association signals for factors showing evidence for linkage is limited, especially at the strict thresholds of genome-wide criteria (N = 9 SNPs for 11 factors), more signals were considered significant when the association signal was weighted by evidence for linkage (N = 30 SNPs). Conclusions: Our study supports the hypothesis that platelet phenotypes

  2. Genome-wide SNP identification, linkage map construction and QTL mapping for seed mineral concentrations and contents in pea (Pisum sativum L.).

    PubMed

    Ma, Yu; Coyne, Clarice J; Grusak, Michael A; Mazourek, Michael; Cheng, Peng; Main, Dorrie; McGee, Rebecca J

    2017-02-13

    Marker-assisted breeding is now routinely used in major crops to facilitate more efficient cultivar improvement. This has been significantly enabled by the use of next-generation sequencing technology to identify loci and markers associated with traits of interest. While rich in a range of nutritional components, such as protein, mineral nutrients, carbohydrates and several vitamins, pea (Pisum sativum L.), one of the oldest domesticated crops in the world, remains behind many other crops in the availability of genomic and genetic resources. To further improve mineral nutrient levels in pea seeds requires the development of genome-wide tools. The objectives of this research were to develop these tools by: identifying genome-wide single nucleotide polymorphisms (SNPs) using genotyping by sequencing (GBS); constructing a high-density linkage map and comparative maps with other legumes, and identifying quantitative trait loci (QTL) for levels of boron, calcium, iron, potassium, magnesium, manganese, molybdenum, phosphorus, sulfur, and zinc in the seed, as well as for seed weight. In this study, 1609 high quality SNPs were found to be polymorphic between 'Kiflica' and 'Aragorn', two parents of an F6-derived recombinant inbred line (RIL) population. Mapping 1683 markers including 75 previously published markers and 1608 SNPs developed from the present study generated a linkage map of size 1310.1 cM. Comparative mapping with other legumes demonstrated that the highest level of synteny was observed between pea and the genome of Medicago truncatula. QTL analysis of the RIL population across two locations revealed at least one QTL for each of the mineral nutrient traits. In total, 46 seed mineral concentration QTLs, 37 seed mineral content QTLs, and 6 seed weight QTLs were discovered. The QTLs explained from 2.4% to 43.3% of the phenotypic variance. The genome-wide SNPs and the genetic linkage map developed in this study permitted QTL identification for pea seed mineral

  3. A genome-wide linkage scan for quantitative trait loci underlying obesity related phenotypes in 434 Caucasian families.

    PubMed

    Zhao, Lan-Juan; Xiao, Peng; Liu, Yong-Jun; Xiong, Dong-Hai; Shen, Hui; Recker, Robert R; Deng, Hong-Wen

    2007-03-01

    To identify quantitative trait loci (QTLs) that contribute to obesity, we performed a large-scale whole genome linkage scan (WGS) involving 4,102 individuals from 434 Caucasian families. The most pronounced linkage evidence was found at the genomic region 20p11-12 for fat mass (LOD = 3.31) and percentage fat mass (PFM) (LOD = 2.92). We also identified several regions showing suggestive linkage signals (threshold LOD = 1.9) for obesity phenotypes, including 5q35, 8q13, 10p12, and 17q11.

  4. Genome-wide Linkage Analysis for Identifying Quantitative Trait Loci Involved in the Regulation of Lipoprotein a (Lpa) Levels

    PubMed Central

    López, Sonia; Buil, Alfonso; Ordoñez, Jordi; Souto, Juan Carlos; Almasy, Laura; Lathrop, Mark; Blangero, John; Blanco-Vaca, Francisco; Fontcuberta, Jordi; Soria, José Manuel

    2009-01-01

    Lipoprotein Lp(a) levels are highly heritable and are associated with cardiovascular risk. We performed a genome-wide linkage analysis to delineate the genomic regions that influence the concentration of Lp(a) in families from the Genetic Analysis of Idiopathic Thrombophilia (GAIT) Project. Lp(a) levels were measured in 387 individuals belonging to 21 extended Spanish families. A total of 485 DNA microsatellite markers were genotyped to provide a 7.1 cM genetic map. A variance component linkage method was used to evaluate linkage and to detect quantitative trait loci (QTLs). The main QTL that showed strong evidence of linkage with Lp(a) levels was located at the structural gene for apo(a) on Chromosome 6 (LOD score=13.8). Interestingly, another QTL influencing Lp(a) concentration was located on Chromosome 2 with a LOD score of 2.01. This region contains several candidate genes. One of them is the tissue factor pathway inhibitor (TFPI), which has antithrombotic action and also has the ability to bind lipoproteins. However, quantitative trait association analyses performed with 12 SNPs in TFPI gene revealed no association with Lp(a) levels. Our study confirms previous results on the genetic basis of Lp(a) levels. In addition, we report a new QTL on Chromosome 2 involved in the quantitative variation of Lp(a). These data should serve as the basis for further detection of candidate genes and to elucidate the relationship between the concentration of Lp(a) and cardiovascular risk. PMID:18560444

  5. Identifying Loci for the Overlap between Attention-Deficit/Hyperactivity Disorder and Autism Spectrum Disorder Using a Genome-Wide QTL Linkage Approach

    ERIC Educational Resources Information Center

    Nijmeijer, Judith S.; Arias-Vasquez, Alejandro; Rommelse, Nanda N. J.; Altink, Marieke E.; Anney, Richard J. L.; Asherson, Philip; Banaschewski, Tobias; Buschgens, Cathelijne J. M.; Fliers, Ellen A.; Gill, Michael; Minderaa, Ruud B.; Poustka, Luise; Sergeant, Joseph A.; Buitelaar, Jan K.; Franke, Barbara; Ebstein, Richard P.; Miranda, Ana; Mulas, Fernando; Oades, Robert D.; Roeyers, Herbert; Rothenberger, Aribert; Sonuga-Barke, Edmund J. S.; Steinhausen, Hans-Christoph; Faraone, Stephen V.; Hartman, Catharina A.; Hoekstra, Pieter J.

    2010-01-01

    Objective: The genetic basis for autism spectrum disorder (ASD) symptoms in children with attention-deficit/hyperactivity disorder (ADHD) was addressed using a genome-wide linkage approach. Method: Participants of the International Multi-Center ADHD Genetics study comprising 1,143 probands with ADHD and 1,453 siblings were analyzed. The total and…

  6. Genome scan for linkage to asthma using a linkage disequilibrium-lod score test.

    PubMed

    Jiang, Y; Slager, S L; Huang, J

    2001-01-01

    We report a genome-wide linkage study of asthma on the German and Collaborative Study on the Genetics of Asthma (CSGA) data. Using a combined linkage and linkage disequilibrium test and the nonparametric linkage score, we identified 13 markers from the German data, 1 marker from the African American (CSGA) data, and 7 markers from the Caucasian (CSGA) data in which the p-values ranged between 0.0001 and 0.0100. From our analysis and taking into account previous published linkage studies of asthma, we suggest that three regions in chromosome 5 (around D5S418, D5S644, and D5S422), one region in chromosome 6 (around three neighboring markers D6S1281, D6S291, and D6S1019), one region in chromosome 11 (around D11S2362), and two regions in chromosome 12 (around D12S351 and D12S324) especially merit further investigation.

  7. Genome-wide linkage scan for loci of musical aptitude in Finnish families: evidence for a major locus at 4q22

    PubMed Central

    Pulli, K; Karma, K; Norio, R; Sistonen, P; Göring, H H H; Järvelä, I

    2008-01-01

    Background: Music perception and performance are comprehensive human cognitive functions and thus provide an excellent model system for studying human behaviour and brain function. However, the molecules involved in mediating music perception and performance are so far uncharacterised. Objective: To unravel the biological background of music perception, using molecular and statistical genetic approaches. Methods: 15 Finnish multigenerational families (with a total of 234 family members) were recruited via a nationwide search. The phenotype of all family members was determined using three tests used in defining musical aptitude: a test for auditory structuring ability (Karma Music test; KMT) commonly used in Finland, and the Seashore pitch and time discrimination subtests (SP and ST respectively) used internationally. We calculated heritabilities and performed a genome-wide variance components-based linkage scan using genotype data for 1113 microsatellite markers. Results: The heritability estimates were 42% for KMT, 57% for SP, 21% for ST and 48% for the combined music test scores. Significant evidence of linkage was obtained on chromosome 4q22 (LOD 3.33) and suggestive evidence of linkage at 8q13-21 (LOD 2.29) with the combined music test scores, using variance component linkage analyses. The major contribution of the 4q22 locus was obtained for the KMT (LOD 2.91). Interestingly, a positive LOD score of 1.69 was shown at 18q, a region previously linked to dyslexia (DYX6) using combined music test scores. Conclusion: Our results show that there is a genetic contribution to musical aptitude that is likely to be regulated by several predisposing genes or variants. PMID:18424507

  8. A genome-wide search for genes affecting circulating fibrinogen levels in the Framingham Heart Study.

    PubMed

    Yang, Qiong; Tofler, Geoffrey H; Cupples, L Adrienne; Larson, Martin G; Feng, DaLi; Lindpaintner, Klaus; Levy, Daniel; D'Agostino, Ralph B; O'Donnell, Christopher J

    2003-04-15

    Circulating levels of fibrinogen are associated with atherosclerosis and predict future coronary heart disease and stroke. Levels of fibrinogen are correlated among family members, suggesting a heritable component. Variants of the beta-fibrinogen gene subunit on 4q28 are associated with fibrinogen levels but explain only a small proportion of the total genetic variability. It remains unknown what role, if any, is played by other genetic variants in the inter-individual variability in levels of fibrinogen in the general population. We conducted a 10-cM spaced genome-wide scan using 402 original cohort subjects and 1193 offspring subjects from 330 extended families of the Framingham Heart Study. Heritability and linkage analyses were carried out using variance component methods. Regression analyses were performed to adjust for traditional risk factors and HindIII beta-148 genotypes. The total heritability was estimated as 0.24. The highest and second highest LOD scores of linkage were found on chromosomes 2 (LOD=1.5 at 243 cM) and 10 (LOD=2.4 at 87 cM) using only offspring subjects in the analysis, and on chromosomes 2 (LOD=2.1 at 242 cM) and 10 (LOD=1.4 at 86 cM), 17 (LOD=1.4 at 96 cM) and 20 (LOD=1.4 at 80 cM) using both original cohort and offspring. These results suggest that there may be influential genetic regions on these chromosomes. While no linkage with genome-wide significance was detected, further research to confirm our findings is warranted.

  9. Genome wide linkage disequilibrium and genetic structure in Sicilian dairy sheep breeds.

    PubMed

    Mastrangelo, Salvatore; Di Gerlando, Rosalia; Tolone, Marco; Tortorici, Lina; Sardina, Maria Teresa; Portolano, Baldassare

    2014-10-10

    The recent availability of sheep genome-wide SNP panels allows providing background information concerning genome structure in domestic animals. The aim of this work was to investigate the patterns of linkage disequilibrium (LD), the genetic diversity and population structure in Valle del Belice, Comisana, and Pinzirita dairy sheep breeds using the Illumina Ovine SNP50K Genotyping array. Average r² between adjacent SNPs across all chromosomes was 0.155 ± 0.204 for Valle del Belice, 0.156 ± 0.208 for Comisana, and 0.128 ± 0.188 for Pinzirita breeds, and some variations in LD value across chromosomes were observed, in particular for Valle del Belice and Comisana breeds. Average values of r² estimated for all pairwise combinations of SNPs pooled over all autosomes were 0.058 ± 0.023 for Valle del Belice, 0.056 ± 0.021 for Comisana, and 0.037 ± 0.017 for Pinzirita breeds. The LD declined as a function of distance and average r² was lower than the values observed in other sheep breeds. Consistency of results among the several used approaches (Principal component analysis, Bayesian clustering, FST, Neighbor networks) showed that while Valle del Belice and Pinzirita breeds formed a unique cluster, Comisana breed showed the presence of substructure. In Valle del Belice breed, the high level of genetic differentiation within breed, the heterogeneous cluster in Admixture analysis, but at the same time the highest inbreeding coefficient, suggested that the breed had a wide genetic base with inbred individuals belonging to the same flock. The Sicilian breeds were characterized by low genetic differentiation and high level of admixture. Pinzirita breed displayed the highest genetic diversity (He, Ne) whereas the lowest value was found in Valle del Belice breed. This study has reported for the first time estimates of LD and genetic diversity from a genome-wide perspective in Sicilian dairy sheep breeds. Our results indicate that breeds formed non

  10. Genome-wide linkage disequilibrium and genetic diversity in five populations of Australian domestic sheep.

    PubMed

    Al-Mamun, Hawlader Abdullah; Clark, Samuel A; Kwan, Paul; Gondro, Cedric

    2015-11-24

    Knowledge of the genetic structure and overall diversity of livestock species is important to maximise the potential of genome-wide association studies and genomic prediction. Commonly used measures such as linkage disequilibrium (LD), effective population size (Ne), heterozygosity, fixation index (FST) and runs of homozygosity (ROH) are widely used and help to improve our knowledge about genetic diversity in animal populations. The development of high-density single nucleotide polymorphism (SNP) arrays and the subsequent genotyping of large numbers of animals have greatly increased the accuracy of these population-based estimates. In this study, we used the Illumina OvineSNP50 BeadChip array to estimate and compare LD (measured by r² and D'), Ne, heterozygosity, FST and ROH in five Australian sheep populations: three pure breeds, i.e., Merino (MER), Border Leicester (BL), Poll Dorset (PD) and two crossbred populations i.e. F1 crosses of Merino and Border Leicester (MxB) and MxB crossed to Poll Dorset (MxBxP). Compared to other livestock species, the sheep populations that were analysed in this study had low levels of LD and high levels of genetic diversity. The rate of LD decay was greater in Merino than in the other pure breeds. Over short distances (<10 kb), the levels of LD were higher in BL and PD than in MER. Similarly, BL and PD had comparatively smaller Ne than MER. Observed heterozygosity in the pure breeds ranged from 0.3 in BL to 0.38 in MER. Genetic distances between breeds were modest compared to other livestock species (highest FST = 0.063) but the genetic diversity within breeds was high. Based on ROH, two chromosomal regions showed evidence of strong recent selection. This study shows that there is a large range of genome diversity in Australian sheep breeds, especially in Merino sheep. The observed range of diversity will influence the design of genome-wide association studies and the results that can be obtained from them. This

  11. Population genomic structure and linkage disequilibrium analysis of South African goat breeds using genome-wide SNP data.

    PubMed

    Mdladla, K; Dzomba, E F; Huson, H J; Muchadeyi, F C

    2016-08-01

    The sustainability of goat farming in marginal areas of southern Africa depends on local breeds that are adapted to specific agro-ecological conditions. Unimproved non-descript goats are the main genetic resources used for the development of commercial meat-type breeds of South Africa. Little is known about genetic diversity and the genetics of adaptation of these indigenous goat populations. This study investigated the genetic diversity, population structure and breed relations, linkage disequilibrium, effective population size and persistence of gametic phase in goat populations of South Africa. Three locally developed meat-type breeds of the Boer (n = 33), Savanna (n = 31), Kalahari Red (n = 40), a feral breed of Tankwa (n = 25) and unimproved non-descript village ecotypes (n = 110) from four goat-producing provinces of the Eastern Cape, KwaZulu-Natal, Limpopo and North West were assessed using the Illumina Goat 50K SNP Bead Chip assay. The proportion of SNPs with minor allele frequencies >0.05 ranged from 84.22% in the Tankwa to 97.58% in the Xhosa ecotype, with a mean of 0.32 ± 0.13 across populations. Principal components analysis, admixture and pairwise FST identified Tankwa as a genetically distinct population and supported clustering of the populations according to their historical origins. Genome-wide FST identified 101 markers potentially under positive selection in the Tankwa. Average linkage disequilibrium was highest in the Tankwa (r² = 0.25 ± 0.26) and lowest in the village ecotypes (r² range = 0.09 ± 0.12 to 0.11 ± 0.14). We observed an effective population size of <150 for all populations 13 generations ago. The estimated correlations for all breed pairs were lower than 0.80 at marker distances >100 kb with the exception of those in Savanna and Tswana populations. This study highlights the high level of genetic diversity in South African indigenous goats as well as the utility of the genome-wide SNP marker panels in

  12. Nonlinear Analysis of Time Series in Genome-Wide Linkage Disequilibrium Data

    NASA Astrophysics Data System (ADS)

    Hernández-Lemus, Enrique; Estrada-Gil, Jesús K.; Silva-Zolezzi, Irma; Fernández-López, J. Carlos; Hidalgo-Miranda, Alfredo; Jiménez-Sánchez, Gerardo

    2008-02-01

    The statistical study of large scale genomic data has turned out to be a very important tool in population genetics. Quantitative methods are essential to understand and implement association studies in the biomedical and health sciences. Nevertheless, the characterization of recently admixed populations has been an elusive problem due to the presence of a number of complex phenomena. For example, linkage disequilibrium structures are thought to be more complex than their non-recently admixed population counterparts, presenting the so-called ancestry blocks, admixed regions that are not yet smoothed by the effect of genetic recombination. In order to distinguish characteristic features for various populations we have implemented several methods, some of them borrowed or adapted from the analysis of nonlinear time series in statistical physics and quantitative physiology. We calculate the main fractal dimensions (Kolmogorov's capacity, information dimension and correlation dimension, usually named, D0, D1 and D2). We also have made detrended fluctuation analysis and information based similarity index calculations for the probability distribution of correlations of linkage disequilibrium coefficient of six recently admixed (mestizo) populations within the Mexican Genome Diversity Project [1] and for the non-recently admixed populations in the International HapMap Project [2]. Nonlinear correlations showed up as a consequence of internal structure within the haplotype distributions. The analysis of these correlations as well as the scope and limitations of these procedures within the biomedical sciences are discussed.

  13. Genome-wide significant loci for addiction and anxiety.

    PubMed

    Hodgson, K; Almasy, L; Knowles, E E M; Kent, J W; Curran, J E; Dyer, T D; Göring, H H H; Olvera, R L; Fox, P T; Pearlson, G D; Krystal, J H; Duggirala, R; Blangero, J; Glahn, D C

    2016-08-01

    Psychiatric comorbidity is common among individuals with addictive disorders, with patients frequently suffering from anxiety disorders. While the genetic architecture of comorbid addictive and anxiety disorders remains unclear, elucidating the genes involved could provide important insights into the underlying etiology. Here we examine a sample of 1284 Mexican-Americans from randomly selected extended pedigrees. Variance decomposition methods were used to examine the role of genetics in addiction phenotypes (lifetime history of alcohol dependence, drug dependence or chronic smoking) and various forms of clinically relevant anxiety. Genome-wide univariate and bivariate linkage scans were conducted to localize the chromosomal regions influencing these traits. Addiction phenotypes and anxiety were shown to be heritable and univariate genome-wide linkage scans revealed significant quantitative trait loci for drug dependence (14q13.2-q21.2, LOD=3.322) and a broad anxiety phenotype (12q24.32-q24.33, LOD=2.918). Significant positive genetic correlations were observed between anxiety and each of the addiction subtypes (ρg=0.550-0.655) and further investigation with bivariate linkage analyses identified significant pleiotropic signals for alcohol dependence-anxiety (9q33.1-q33.2, LOD=3.054) and drug dependence-anxiety (18p11.23-p11.22, LOD=3.425). This study confirms the shared genetic underpinnings of addiction and anxiety and identifies genomic loci involved in the etiology of these comorbid disorders. The linkage signal for anxiety on 12q24 spans the location of TMEM132D, an emerging gene of interest from previous GWAS of anxiety traits, whilst the peak of the bivariate linkage signal identified for anxiety-alcohol on 9q33 coincides with a region where rare CNVs have been associated with psychiatric disorders. Other signals identified implicate novel regions of the genome in addiction genetics. Copyright © 2016 Elsevier Masson SAS. All rights reserved.

  14. Genomic Characterization of DArT Markers Based on High-Density Linkage Analysis and Physical Mapping to the Eucalyptus Genome

    PubMed Central

    Petroli, César D.; Sansaloni, Carolina P.; Carling, Jason; Steane, Dorothy A.; Vaillancourt, René E.; Myburg, Alexander A.; da Silva, Orzenil Bonfim; Pappas, Georgios Joannis; Kilian, Andrzej; Grattapaglia, Dario

    2012-01-01

    Diversity Arrays Technology (DArT) provides a robust, high throughput, cost-effective method to query thousands of sequence polymorphisms in a single assay. Despite the extensive use of this genotyping platform for numerous plant species, little is known regarding the sequence attributes and genome-wide distribution of DArT markers. We investigated the genomic properties of the 7,680 DArT marker probes of a Eucalyptus array, by sequencing them, constructing a high density linkage map and carrying out detailed physical mapping analyses to the Eucalyptus grandis reference genome. A consensus linkage map with 2,274 DArT markers anchored to 210 microsatellites and a framework map, with improved support for ordering, displayed extensive collinearity with the genome sequence. Only 1.4 Mbp of the 75 Mbp of still unplaced scaffold sequence was captured by 45 linkage mapped but physically unaligned markers to the 11 main Eucalyptus pseudochromosomes, providing compelling evidence for the quality and completeness of the current Eucalyptus genome assembly. A highly significant correspondence was found between the locations of DArT markers and predicted gene models, while most of the 89 DArT probes unaligned to the genome correspond to sequences likely absent in E. grandis, consistent with the pan-genomic feature of this multi-Eucalyptus species DArT array. These comprehensive linkage-to-physical mapping analyses provide novel data regarding the genomic attributes of DArT markers in plant genomes in general and for Eucalyptus in particular. DArT markers preferentially target the gene space and display a largely homogeneous distribution across the genome, thereby providing superb coverage for mapping and genome-wide applications in breeding and diversity studies. Data reported on these ubiquitous properties of DArT markers will be particularly valuable to researchers working on less-studied crop species who already count on DArT genotyping arrays but for which no reference

  15. Linkage maps of the Atlantic salmon (Salmo salar) genome derived from RAD sequencing

    PubMed Central

    2014-01-01

    Background Genetic linkage maps are useful tools for mapping quantitative trait loci (QTL) influencing variation in traits of interest in a population. Genotyping-by-sequencing approaches such as Restriction-site Associated DNA sequencing (RAD-Seq) now enable the rapid discovery and genotyping of genome-wide SNP markers suitable for the development of dense SNP linkage maps, including in non-model organisms such as Atlantic salmon (Salmo salar). This paper describes the development and characterisation of a high density SNP linkage map based on SbfI RAD-Seq SNP markers from two Atlantic salmon reference families. Results Approximately 6,000 SNPs were assigned to 29 linkage groups, utilising markers from known genomic locations as anchors. Linkage maps were then constructed for the four mapping parents separately. Overall map lengths were comparable between male and female parents, but the distribution of the SNPs showed sex-specific patterns with a greater degree of clustering of sire-segregating SNPs to single chromosome regions. The maps were integrated with the Atlantic salmon draft reference genome contigs, allowing the unique assignment of ~4,000 contigs to a linkage group. 112 genome contigs mapped to two or more linkage groups, highlighting regions of putative homeology within the salmon genome. A comparative genomics analysis with the stickleback reference genome identified putative genes closely linked to approximately half of the ordered SNPs and demonstrated blocks of orthology between the Atlantic salmon and stickleback genomes. A subset of 47 RAD-Seq SNPs were successfully validated using a high-throughput genotyping assay, with a correspondence of 97% between the two assays. Conclusions This Atlantic salmon RAD-Seq linkage map is a resource for salmonid genomics research as genotyping-by-sequencing becomes increasingly common. This is aided by the integration of the SbfI RAD-Seq SNPs with existing reference maps and the draft reference genome, as well

  16. Genome-Wide Linkage, Exome Sequencing and Functional Analyses Identify ABCB6 as the Pathogenic Gene of Dyschromatosis Universalis Hereditaria

    PubMed Central

    Wang, Na; Wang, Chuan; Chen, Xuechao; Sheng, Donglai; Fu, Xi’an; See, Kelvin; Foo, Jia Nee; Low, Huiqi; Liany, Herty; Irwan, Ishak Darryl; Liu, Jian; Yang, Baoqi; Chen, Mingfei; Yu, Yongxiang; Yu, Gongqi; Niu, Guiye; You, Jiabao; Zhou, Yan; Ma, Shanshan; Wang, Ting; Yan, Xiaoxiao; Goh, Boon Kee; Common, John E. A.; Lane, Birgitte E.; Sun, Yonghu; Zhou, Guizhi; Lu, Xianmei; Wang, Zhenhua; Tian, Hongqing; Cao, Yuanhua; Chen, Shumin; Liu, Qiji; Liu, Jianjun; Zhang, Furen

    2014-01-01

    Background As a genetic disorder of abnormal pigmentation, the molecular basis of dyschromatosis universalis hereditaria (DUH) had remained unclear until recently when ABCB6 was reported as a causative gene of DUH. Methodology We performed a genome-wide linkage scan using Illumina Human 660W-Quad BeadChip and exome sequencing analyses using Agilent SureSelect Human All Exon Kits in a multiplex Chinese DUH family to identify the pathogenic mutations and verified the candidate mutations using Sanger sequencing. Quantitative RT-PCR and immunohistochemistry were performed to verify the expression of the pathogenic gene, and zebrafish were also used to confirm the functional role of ABCB6 in melanocytes and pigmentation. Results Genome-wide linkage (assuming autosomal dominant inheritance mode) and exome sequencing analyses identified ABCB6 as the disease candidate gene by discovering a coding mutation (c.1358C>T; p.Ala453Val) that co-segregates with the disease phenotype. Further mutation analysis of ABCB6 in four other DUH families and two sporadic cases by Sanger sequencing confirmed the mutation (c.1358C>T; p.Ala453Val) and discovered a second, co-segregating coding mutation (c.964A>C; p.Ser322Lys) in one of the four families. Both mutations were heterozygous in DUH patients and not present in the 1000 Genome Project and dbSNP database as well as 1,516 unrelated Chinese healthy controls. Expression analysis in human skin and mutagenesis interrogation in zebrafish confirmed the functional role of ABCB6 in melanocytes and pigmentation. Given the involvement of ABCB6 mutations in coloboma, we performed ophthalmological examination of the DUH carriers of ABCB6 mutations and found ocular abnormalities in them. Conclusion Our study has advanced our understanding of DUH pathogenesis and revealed the shared pathological mechanism between pigmentary DUH and ocular coloboma. PMID:24498303

  17. Exome sequencing and genome-wide linkage analysis in 17 families illustrate the complex contribution of TTN truncating variants to dilated cardiomyopathy.

    PubMed

    Norton, Nadine; Li, Duanxiang; Rampersaud, Evadnie; Morales, Ana; Martin, Eden R; Zuchner, Stephan; Guo, Shengru; Gonzalez, Michael; Hedges, Dale J; Robertson, Peggy D; Krumm, Niklas; Nickerson, Deborah A; Hershberger, Ray E

    2013-04-01

    BACKGROUND- Familial dilated cardiomyopathy (DCM) is a genetically heterogeneous disease with >30 known genes. TTN truncating variants were recently implicated in a candidate gene study to cause 25% of familial and 18% of sporadic DCM cases. METHODS AND RESULTS- We used an unbiased genome-wide approach using both linkage analysis and variant filtering across the exome sequences of 48 individuals affected with DCM from 17 families to identify genetic cause. Linkage analysis ranked the TTN region as falling under the second highest genome-wide multipoint linkage peak, multipoint logarithm of odds, 1.59. We identified 6 TTN truncating variants carried by individuals affected with DCM in 7 of 17 DCM families (logarithm of odds, 2.99); 2 of these 7 families also had novel missense variants that segregated with disease. Two additional novel truncating TTN variants did not segregate with DCM. Nucleotide diversity at the TTN locus, including missense variants, was comparable with 5 other known DCM genes. The average number of missense variants in the exome sequences from the DCM cases or the ≈5400 cases from the Exome Sequencing Project was ≈23 per individual. The average number of TTN truncating variants in the Exome Sequencing Project was 0.014 per individual. We also identified a region (chr9q21.11-q22.31) with no known DCM genes with a maximum heterogeneity logarithm of odds score of 1.74. CONCLUSIONS- These data suggest that TTN truncating variants contribute to DCM cause. However, the lack of segregation of all identified TTN truncating variants illustrates the challenge of determining variant pathogenicity even with full exome sequencing.

  18. Genome-Wide Linkage and Positional Association Analyses Identify Associations of Novel AFF3 and NTM Genes with Triglycerides: The GenSalt Study

    PubMed Central

    Li, Changwei; Bazzano, Lydia A.L.; Rao, Dabeeru C.; Hixson, James E.; He, Jiang; Gu, Dongfeng; Gu, Charles C.; Shimmin, Lawrence C.; Jaquish, Cashell E.; Schwander, Karen; Liu, De-Pei; Huang, Jianfeng; Lu, Fanghong; Cao, Jie; Chong, Shen; Lu, Xiangfeng; Kelly, Tanika N.

    2016-01-01

    We conducted a genome-wide linkage scan and positional association study to identify genes and variants influencing blood lipid levels among participants of the Genetic Epidemiology Network of Salt-Sensitivity (GenSalt) study. The GenSalt study was conducted among 1906 participants from 633 Han Chinese families. Lipids were measured from overnight fasting blood samples using standard methods. Multipoint quantitative trait genome-wide linkage scans were performed on the high-density lipoprotein, low-density lipoprotein, and log-transformed triglyceride phenotypes. Using dense panels of single nucleotide polymorphisms (SNPs), single-marker and gene-based association analyses were conducted to follow-up on promising linkage signals. Additive associations between each SNP and lipid phenotypes were tested using mixed linear regression models. Gene-based analyses were performed by combining P-values from single-marker analyses within each gene using the truncated product method (TPM). Significant associations were assessed for replication among 777 Asian participants of the Multi-ethnic Study of Atherosclerosis (MESA). Bonferroni correction was used to adjust for multiple testing. In the GenSalt study, suggestive linkage signals were identified at 2p11.2–2q12.1 [maximum multipoint LOD score (MML) = 2.18 at 2q11.2] and 11q24.3–11q25 (MML = 2.29 at 11q25) for the log-transformed triglyceride phenotype. Follow-up analyses of these two regions revealed gene-based associations of charged multivesicular body protein 3 (CHMP3), ring finger protein 103 (RNF103), AF4/FMR2 family, member 3 (AFF3), and neurotrimin (NTM) with triglycerides (P = 4 × 10⁻⁴, 1.00 × 10⁻⁵, 2.00 × 10⁻⁵, and 1.00 × 10⁻⁷, respectively). Both the AFF3 and NTM triglyceride associations were replicated among MESA study participants (P = 1.00 × 10⁻⁷ and 8.00 × 10⁻⁵, respectively). Furthermore, NTM explained the linkage signal on chromosome 11. In conclusion, we identified novel genes

  19. Genome-Wide Association Mapping for Intelligence in Military Working Dogs: Canine Cohort, Canine Intelligence Assessment Regimen, Genome-Wide Single Nucleotide Polymorphism (SNP) Typing, and Unsupervised Classification Algorithm for Genome-Wide Association Data Analysis

    DTIC Science & Technology

    2011-09-01

    Almasy, L, Blangero, J. (2009) Human QTL linkage mapping. Genetica 136:333-340. Amos, CI. (2007) Successful design and conduct of genome-wide...quantitative trait loci. Genetica 136:237-243. Skol AD, Scott LJ, Abecasis GR, Boehnke M. (2006) Joint analysis is more efficient than replication

  20. Exploiting genotyping by sequencing to characterize the genomic structure of the American cranberry through high-density linkage mapping.

    PubMed

    Covarrubias-Pazaran, Giovanny; Diaz-Garcia, Luis; Schlautman, Brandon; Deutsch, Joseph; Salazar, Walter; Hernandez-Ochoa, Miguel; Grygleski, Edward; Steffan, Shawn; Iorizzo, Massimo; Polashock, James; Vorsa, Nicholi; Zalapa, Juan

    2016-06-13

    The application of genotyping by sequencing (GBS) approaches, combined with data imputation methodologies, is narrowing the genetic knowledge gap between major and understudied, minor crops. GBS is an excellent tool to characterize the genomic structure of recently domesticated (~200 years) and understudied species, such as cranberry (Vaccinium macrocarpon Ait.), by generating large numbers of markers for genomic studies such as genetic mapping. We identified 10842 potentially mappable single nucleotide polymorphisms (SNPs) in a cranberry pseudo-testcross population wherein 5477 SNPs and 211 short sequence repeats (SSRs) were used to construct a high density linkage map in cranberry of which a total of 4849 markers were mapped. Recombination frequency, linkage disequilibrium (LD), and segregation distortion at the genomic level in the parental and integrated linkage maps were characterized for the first time in cranberry. SSR markers, used as the backbone in the map, revealed high collinearity with previously published linkage maps. The 4849 point map consisted of twelve linkage groups spanning 1112 cM, which anchored 2381 nuclear scaffolds accounting for ~13 Mb of the estimated 470 Mb cranberry genome. Bin mapping identified 592 and 672 unique bins in the parentals and a total of 1676 unique marker positions in the integrated map. Synteny analyses comparing the order of anchored cranberry scaffolds to their homologous positions in kiwifruit, grape, and coffee genomes provided initial evidence of homology between cranberry and closely related species. GBS data was used to rapidly saturate the cranberry genome with markers in a pseudo-testcross population. Collinearity between the present saturated genetic map and previous cranberry SSR maps suggests that the SNP locations represent accurate marker order and chromosome structure of the cranberry genome. SNPs greatly improved current marker genome coverage, which allowed for genome-wide structure investigations such

  1. Linkage Disequilibrium And Genome-Wide Association Studies In O. sativa

    USDA-ARS's Scientific Manuscript database

    There is increasing evidence that genome-wide association studies provide a powerful approach to find the genetic basis of complex phenotypic variation in all kinds of species. For this purpose, we developed the first generation 44K Affymetrix SNP array in rice (see Tung et al. poster). We genotyped...

  2. Genome-wide differentiation of various melon horticultural groups for use in genome wide association study for fruit firmness and construction of a high resolution genetic map

    USDA-ARS's Scientific Manuscript database

    We generated 13,789 single nucleotide polymorphism (SNP) markers from 97 melon accessions using genotyping by sequencing and anchored them to chromosomes to understand genome-wide fixation index between various melon morphotypes and linkage disequilibrium (LD) decay for inodorus and cantalupensis, th...

  3. Large-scale linkage analysis of 1302 affected relative pairs with rheumatoid arthritis

    PubMed Central

    Hamshere, Marian L; Segurado, Ricardo; Moskvina, Valentina; Nikolov, Ivan; Glaser, Beate; Holmans, Peter A

    2007-01-01

    Rheumatoid arthritis is the most common systemic autoimmune disease and its etiology is believed to have both strong genetic and environmental components. We demonstrate the utility of including genetic and clinical phenotypes as covariates within a linkage analysis framework to search for rheumatoid arthritis susceptibility loci. The raw genotypes of 1302 affected relative pairs were combined from four large family-based samples (North American Rheumatoid Arthritis Consortium, United Kingdom, European Consortium on Rheumatoid Arthritis Families, and Canada). The familiality of the clinical phenotypes was assessed. The affected relative pairs were subjected to autosomal multipoint affected relative-pair linkage analysis. Covariates were included in the linkage analysis to take account of heterogeneity within the sample. Evidence of familiality was observed with age at onset (p << 0.001) and rheumatoid factor (RF) IgM (p << 0.001), but not definite erosions (p = 0.21). Genome-wide significant evidence for linkage was observed on chromosome 6. Genome-wide suggestive evidence for linkage was observed on chromosomes 13 and 20 when conditioning on age at onset, chromosome 15 conditional on gender, and chromosome 19 conditional on RF IgM after allowing for multiple testing of covariates. PMID:18466440

  4. Genome-wide linkage disequilibrium and past effective population size in three Korean cattle breeds.

    PubMed

    Sudrajad, P; Seo, D W; Choi, T J; Park, B H; Roh, S H; Jung, W Y; Lee, S S; Lee, J H; Kim, S; Lee, S H

    2017-02-01

    The routine collection and use of genomic data are useful for effectively managing breeding programs for endangered populations. Linkage disequilibrium (LD) using high-density DNA markers has been widely used to determine population structures and predict the genomic regions that are associated with economic traits in beef cattle. The extent of LD also provides information about historical events, including past effective population size (Ne), and it allows inferences on the genetic diversity of breeds. The objective of this study was to estimate the LD and Ne in three Korean cattle breeds that are genetically similar but have different coat colors (Brown, Brindle and Jeju Black Hanwoo). Brindle and Jeju Black are endangered breeds with small populations, whereas Brown Hanwoo is the main breeding population in Korea. DNA samples from these cattle breeds were genotyped using the Illumina BovineSNP50 Bead Chip. We examined 13 cattle breeds, including European taurines, African taurines and indicines, and hybrids to compare their LD values. Brown Hanwoo consistently had the lowest mean LD compared to Jeju Black, Brindle and the other 13 cattle breeds (0.13, 0.19, 0.21 and 0.15-0.22 respectively). The high LD values of Brindle and Jeju Black contributed to small Ne values (53 and 60 respectively), which were distinct from that of Brown Hanwoo (531) at 11 generations ago. The differences in LD and Ne for each breed reflect the breeding strategy applied. The Ne for these endangered cattle breeds remain low; thus, effort is needed to bring them back to a sustainable tract. © 2016 Stichting International Foundation for Animal Genetics.
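
    The link between LD and past effective population size described above can be sketched with Sved's approximation, E[r²] ≈ 1/(1 + 4·Ne·c), where c is the genetic distance between markers in Morgans and the estimate refers to roughly 1/(2c) generations ago. The abstract does not say which estimator the authors used, so the formula choice, the function names and the input values below are assumptions for illustration only.

```python
def ne_from_ld(mean_r2: float, c_morgans: float) -> float:
    """Effective population size from mean r^2 at genetic distance c (Morgans).

    Assumes Sved's approximation E[r^2] = 1 / (1 + 4 * Ne * c), solved for Ne.
    No correction for sample size or mutation is applied (an assumption).
    """
    if not 0.0 < mean_r2 <= 1.0:
        raise ValueError("mean_r2 must be in (0, 1]")
    if c_morgans <= 0.0:
        raise ValueError("c_morgans must be positive")
    return (1.0 / mean_r2 - 1.0) / (4.0 * c_morgans)


def generations_ago(c_morgans: float) -> float:
    """Approximate number of generations in the past that the estimate reflects."""
    return 1.0 / (2.0 * c_morgans)


# Hypothetical inputs: mean r^2 of 0.2 for marker pairs 0.05 Morgans apart.
ne = ne_from_ld(0.2, 0.05)
t = generations_ago(0.05)
print(f"Ne ~ {ne:.0f} about {t:.0f} generations ago")
```

    Under this approximation, higher mean r² at a fixed distance implies a smaller Ne, which matches the direction of the contrast reported between the endangered breeds and Brown Hanwoo.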

  5. Genome-wide association and linkage identify modifier loci of lung disease severity in cystic fibrosis at 11p13 and 20q13.2

    PubMed Central

    Wright, Fred A.; Strug, Lisa J.; Doshi, Vishal K.; Commander, Clayton W.; Blackman, Scott M.; Sun, Lei; Berthiaume, Yves; Cutler, David; Cojocaru, Andreea; Collaco, J. Michael; Corey, Mary; Dorfman, Ruslan; Goddard, Katrina; Green, Deanna; Kent, Jack W.; Lange, Ethan M.; Lee, Seunggeun; Li, Weili; Luo, Jingchun; Mayhew, Gregory M.; Naughton, Kathleen M.; Pace, Rhonda G.; Paré, Peter; Rommens, Johanna M.; Sandford, Andrew; Stonebraker, Jaclyn R.; Sun, Wei; Taylor, Chelsea; Vanscoy, Lori L.; Zou, Fei; Blangero, John; Zielenski, Julian; O’Neal, Wanda K.; Drumm, Mitchell L.; Durie, Peter R.; Knowles, Michael R.; Cutting, Garry R.

    2012-01-01

    A combined genome-wide association and linkage study was used to identify loci causing variation in CF lung disease severity. A significant association (P = 3.34 × 10⁻⁸) near EHF and APIP (chr11p13) was identified in F508del homozygotes (n=1,978). The association replicated in F508del homozygotes (P=0.006) from a separate family-based study (n=557), with P = 1.49 × 10⁻⁹ for the three-study joint meta-analysis. Linkage analysis of 486 sibling pairs from the family-based study identified a significant QTL on chromosome 20q13.2 (LOD=5.03). Our findings provide insight into the causes of variation in lung disease severity in CF and suggest new therapeutic targets for this life-limiting disorder. PMID:21602797

  6. Genome-wide linkage and copy number variation analysis reveals 710 kb duplication on chromosome 1p31.3 responsible for autosomal dominant omphalocele

    PubMed Central

    Radhakrishna, Uppala; Nath, Swapan K; McElreavey, Ken; Ratnamala, Uppala; Sun, Celi; Maiti, Amit K; Gagnebin, Maryline; Béna, Frédérique; Newkirk, Heather L; Sharp, Andrew J; Everman, David B; Murray, Jeffrey C; Schwartz, Charles E; Antonarakis, Stylianos E; Butler, Merlin G

    2017-01-01

    Background Omphalocele is a congenital birth defect characterised by the presence of internal organs located outside of the ventral abdominal wall. The purpose of this study was to identify the underlying genetic mechanisms of a large autosomal dominant Caucasian family with omphalocele. Methods and findings A genetic linkage study was conducted in a large family with an autosomal dominant transmission of an omphalocele using a genome-wide single nucleotide polymorphism (SNP) array. The analysis revealed significant evidence of linkage (non-parametric NPL = 6.93, p=0.0001; parametric logarithm of odds (LOD) = 2.70 under a fully penetrant dominant model) at chromosome band 1p31.3. Haplotype analysis narrowed the locus to a 2.74 Mb region between markers rs2886770 (63014807 bp) and rs1343981 (65757349 bp). Molecular characterisation of this interval using array comparative genomic hybridisation followed by quantitative microsphere hybridisation analysis revealed a 710 kb duplication located at 63.5–64.2 Mb. All affected individuals who had an omphalocele and shared the haplotype were positive for this duplicated region, while the duplication was absent from all normal individuals of this family. Multipoint linkage analysis using the duplication as a marker yielded a maximum LOD score of 3.2 at 1p31.3 under a dominant model. The 710 kb duplication at 1p31.3 band contains seven known genes including FOXD3, ALG6, ITGB3BP, KIAA1799, DLEU2L, PGM1, and the proximal portion of ROR1. Importantly, this duplication is absent from the database of genomic variants. Conclusions The present study suggests that development of an omphalocele in this family is controlled by overexpression of one or more genes in the duplicated region. To the authors’ knowledge, this is the first reported association of an inherited omphalocele condition with a chromosomal rearrangement. PMID:22499347
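
    A parametric LOD score such as those quoted above is the base-10 logarithm of the likelihood of the data at a recombination fraction θ divided by the likelihood under free recombination (θ = 0.5). A minimal sketch follows, assuming the simplest textbook case of phase-known, fully informative meioses; it is not the pedigree likelihood that dedicated linkage software would compute for this family, and the function name and counts are hypothetical.

```python
import math


def lod_score(recombinants: int, meioses: int, theta: float) -> float:
    """Two-point LOD score for phase-known, fully informative meioses.

    LOD(theta) = log10( theta^k * (1 - theta)^(n - k) / 0.5^n ),
    with k recombinants out of n meioses and 0 <= theta <= 0.5.
    """
    k, n = recombinants, meioses
    if not 0 <= k <= n:
        raise ValueError("recombinants must be between 0 and meioses")
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must be in [0, 0.5]")
    if theta == 0.0 and k > 0:
        return float("-inf")  # any recombinant excludes complete linkage
    log_l_theta = (n - k) * math.log10(1.0 - theta)
    if k > 0:
        log_l_theta += k * math.log10(theta)
    log_l_null = n * math.log10(0.5)
    return log_l_theta - log_l_null


# Hypothetical family: 10 informative meioses, no recombinants.
# At theta = 0 each meiosis contributes log10(2), about 0.30, to the LOD.
print(round(lod_score(0, 10, 0.0), 2))
```

    In this simplified setting each non-recombinant meiosis adds about 0.30 to the LOD at θ = 0, which is why a single family needs roughly ten informative meioses to reach the conventional threshold of 3.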

  7. Genome-Wide Search for Quantitative Trait Loci Controlling Important Plant and Flower Traits in Petunia Using an Interspecific Recombinant Inbred Population of Petunia axillaris and Petunia exserta.

    PubMed

    Cao, Zhe; Guo, Yufang; Yang, Qian; He, Yanhong; Fetouh, Mohammed; Warner, Ryan M; Deng, Zhanao

    2018-05-15

    A major bottleneck in plant breeding has been the very limited genetic base and greatly reduced genetic diversity in domesticated, cultivated germplasm. Identification and utilization of favorable gene loci or alleles from wild or progenitor species can serve as an effective approach to increasing genetic diversity and breaking this bottleneck in plant breeding. This study was conducted to identify quantitative trait loci (QTL) in wild or progenitor petunia species that can be used to improve important horticultural traits in garden petunia. An F7 recombinant inbred population derived from a cross between Petunia axillaris and P. exserta was phenotyped for plant height, plant spread, plant size, flower counts, flower diameter, flower length, and days to anthesis, in Florida in two consecutive years. Transgressive segregation was observed for all seven traits in both years. The broad-sense heritability estimates for the traits ranged from 0.20 (days to anthesis) to 0.62 (flower length). A genome-wide genetic linkage map consisting of 368 single nucleotide polymorphism bins and extending over 277 cM was searched to identify QTL for these traits. Nineteen QTL were identified and localized to five linkage groups. Eleven of the loci were identified consistently in both years; several loci explained up to 34.0% and 24.1% of the phenotypic variance for flower length and flower diameter, respectively. Multiple loci controlling different traits are co-localized in four intervals in four linkage groups. These intervals contain desirable alleles that can be introgressed into commercial petunia germplasm to expand the genetic base and improve plant performance and flower characteristics in petunia. Copyright © 2018, G3: Genes, Genomes, Genetics.

  8. Genome-wide linkage analysis of congenital heart defects using MOD score analysis identifies two novel loci

    PubMed Central

    2013-01-01

    Background Congenital heart defects (CHD) are the most common cause of death from a congenital structural abnormality in newborns and are often associated with fetal loss. There are many types of CHD. Human genetic studies have identified genes that are responsible for the inheritance of a particular type of CHD and for some types of CHD previously thought to be sporadic. However, occasionally different members of the same family might have anatomically distinct defects, for instance one member with atrial septal defect, one with tetralogy of Fallot, and one with ventricular septal defect. Our objective is to identify susceptibility loci for CHD in families affected by distinct defects. The occurrence of these apparently discordant clinical phenotypes within one family might hint at a genetic framework common to most types of CHD. Results We performed a genome-wide linkage analysis using MOD score analysis in families with diverse CHD. Significant linkage was obtained in two regions, at chromosome 15 (15q26.3, empirical P = 0.0004) and at chromosome 18 (18q21.2, empirical P = 0.0005). Conclusions In these two novel regions four candidate genes are located: SELS, SNRPA1, and PCSK6 on 15q26.3, and TCF4 on 18q21.2. The new loci reported here have not previously been described in connection with CHD. Although further studies in other cohorts are needed to confirm these findings, the results presented here together with recent insight into how the heart normally develops will improve the understanding of CHD. PMID:23705960

  9. An Autosomal Genetic Linkage Map of the Sheep Genome

    PubMed Central

    Crawford, A. M.; Dodds, K. G.; Ede, A. J.; Pierson, C. A.; Montgomery, G. W.; Garmonsway, H. G.; Beattie, A. E.; Davies, K.; Maddox, J. F.; Kappes, S. W.; Stone, R. T.; Nguyen, T. C.; Penty, J. M.; Lord, E. A.; Broom, J. E.; Buitkamp, J.; Schwaiger, W.; Epplen, J. T.; Matthew, P.; Matthews, M. E.; Hulme, D. J.; Beh, K. J.; McGraw, R. A.; Beattie, C. W.

    1995-01-01

    We report the first extensive ovine genetic linkage map covering 2070 cM of the sheep genome. The map was generated from the linkage analysis of 246 polymorphic markers, in nine three-generation fullsib pedigrees, which make up the AgResearch International Mapping Flock. We have exploited many markers from cattle so that valuable comparisons between these two ruminant linkage maps can be made. The markers, used in the segregation analyses, comprised 86 anonymous microsatellite markers derived from the sheep genome, 126 anonymous microsatellites from cattle, one from deer, and 33 polymorphic markers of various types associated with known genes. The maximum number of informative meioses within the mapping flock was 222. The average number of informative meioses per marker was 140 (range 18-209). Linkage groups have been assigned to all 26 sheep autosomes. PMID:7498748

  10. Genome-wide Linkage and Positional Association Study of Blood Pressure Response to Dietary Sodium Intervention

    PubMed Central

    Mei, Hao; Gu, Dongfeng; Hixson, James E.; Rice, Treva K.; Chen, Jing; Shimmin, Lawrence C.; Schwander, Karen; Kelly, Tanika N.; Liu, De-Pei; Chen, Shufeng; Huang, Jian-feng; Jaquish, Cashell E.; Rao, Dabeeru C.; He, Jiang

    2012-01-01

    The authors conducted a genome-wide linkage scan and positional association analysis to identify the genetic determinants of salt sensitivity of blood pressure (BP) in a large family-based, dietary-feeding study. The dietary intervention was conducted among 1,906 participants in rural China (2003–2005). A 7-day low-sodium intervention was followed by a 7-day high-sodium intervention. Salt sensitivity was defined as BP responses to low- and high-sodium interventions. Linkage signals with a base-10 logarithm of the odds (LOD) score ≥ 3 were detected at 33–42 centimorgans of chromosome 2 (2p24.3-2p24.1), with a maximum LOD score of 3.33 for diastolic blood pressure responses to high-sodium intervention. In this region, LOD scores were 2.35–2.91 for mean arterial pressure (MAP) responses and 0.80–1.49 for systolic blood pressure responses. Correcting for multiple tests, single nucleotide polymorphism (SNP) rs11674786 (2.7 kilobases upstream of the family with sequence similarity 84, member A, gene (FAM84A)) in the linkage region was significantly associated with diastolic blood pressure (P = 0.0007) and MAP responses (P = 0.0007), and SNP rs16983422 (2.8 kilobases upstream of the visinin-like 1 gene (VSNL1)) was marginally associated with diastolic blood pressure (P = 0.005) and MAP responses (P = 0.005). An additive interaction between SNPs rs11674786 and rs16983422 was observed, with P = 7.00 × 10⁻⁵ and P = 7.23 × 10⁻⁵ for diastolic blood pressure and MAP responses, respectively. The authors concluded that genetic region 2p24.3-2p24.1 might harbor functional variants for the salt sensitivity of BP. PMID:22865701
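
    As a rough illustration of what a LOD threshold of 3 means, the sketch below computes a two-point LOD score for the simplest textbook case of phase-known, fully informative meioses. It is not the multipoint quantitative-trait method used in the study; the function name and the example counts are hypothetical.

```python
import math


def two_point_lod(recombinants: int, meioses: int, theta: float) -> float:
    """LOD score at recombination fraction theta versus free recombination (0.5).

    Assumes phase-known, fully informative meioses (a simplification).
    """
    if not 0.0 < theta <= 0.5:
        raise ValueError("theta must be in (0, 0.5]")
    if not 0 <= recombinants <= meioses:
        raise ValueError("recombinants must be between 0 and meioses")
    non_recombinants = meioses - recombinants
    # log10 likelihood under linkage at theta
    log_l_linked = (recombinants * math.log10(theta)
                    + non_recombinants * math.log10(1.0 - theta))
    # log10 likelihood under no linkage (theta = 0.5)
    log_l_unlinked = meioses * math.log10(0.5)
    return log_l_linked - log_l_unlinked


# Hypothetical data: 2 recombinants observed in 20 meioses, evaluated at theta = 0.1
example_lod = two_point_lod(recombinants=2, meioses=20, theta=0.1)
print(f"LOD = {example_lod:.2f}")  # about 3.20
```

    A LOD of 3 corresponds to odds of 1000:1 in favour of linkage at the tested recombination fraction.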

  11. A Genetic Linkage Map of the Male Goat Genome

    PubMed Central

    Vaiman, D.; Schibler, L.; Bourgeois, F.; Oustry, A.; Amigues, Y.; Cribiu, E. P.

    1996-01-01

    This paper presents a first genetic linkage map of the goat genome. Primers derived from the flanking sequences of 612 bovine, ovine and goat microsatellite markers were gathered and tested for amplification with goat DNA under standardized PCR conditions. This screen made it possible to choose a set of 55 polymorphic markers that can be used in the three species and to define a panel of 223 microsatellites suitable for the goat. Twelve half-sib paternal goat families were then used to build a linkage map of the goat genome. The linkage analysis made it possible to construct a meiotic map covering 2300 cM, i.e., >80% of the total estimated length of the goat genome. Moreover, eight cosmids containing microsatellites were mapped by fluorescence in situ hybridization in goat and sheep. Together with 11 microsatellite-containing cosmids previously mapped in cattle (and supposing conservation of the banding pattern between this species and the goat) and data from the sheep map, these results made the orientation of 15 linkage groups possible. Furthermore, 12 coding sequences were mapped either genetically or physically, providing useful data for comparative mapping. PMID:8878693
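
    Map lengths in centimorgans are obtained by converting estimated recombination fractions with a mapping function. The abstract does not say which function was used, so the sketch below shows the two standard choices (Haldane and Kosambi) purely as an assumption; the function names are illustrative.

```python
import math


def haldane_cm(r: float) -> float:
    """Haldane map distance in cM (assumes no crossover interference)."""
    if not 0.0 <= r < 0.5:
        raise ValueError("recombination fraction must be in [0, 0.5)")
    return -50.0 * math.log(1.0 - 2.0 * r)


def kosambi_cm(r: float) -> float:
    """Kosambi map distance in cM (allows for some interference)."""
    if not 0.0 <= r < 0.5:
        raise ValueError("recombination fraction must be in [0, 0.5)")
    return 25.0 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))


# Hypothetical recombination fraction between two adjacent markers
r = 0.10
print(f"Haldane: {haldane_cm(r):.2f} cM, Kosambi: {kosambi_cm(r):.2f} cM")
```

    For small recombination fractions both functions give distances close to 100 × r; they diverge as r approaches 0.5, which is one reason total map lengths from different studies are not always directly comparable.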

  12. Genome-wide linkage analysis of human auditory cortical activation suggests distinct loci on chromosomes 2, 3, and 8.

    PubMed

    Renvall, Hanna; Salmela, Elina; Vihla, Minna; Illman, Mia; Leinonen, Eira; Kere, Juha; Salmelin, Riitta

    2012-10-17

    Neural processes are explored through macroscopic neuroimaging and microscopic molecular measures, but the two levels remain primarily detached. The identification of direct links between the levels would facilitate use of imaging signals as probes of genetic function and, vice versa, access to molecular correlates of imaging measures. Neuroimaging patterns have been mapped for a few isolated genes, chosen based on their connection with a clinical disorder. Here we propose an approach that allows an unrestricted discovery of the genetic basis of a neuroimaging phenotype in the normal human brain. The essential components are a subject population that is composed of relatives and selection of a neuroimaging phenotype that is reproducible within an individual and similar between relatives but markedly variable across a population. Our present combined magnetoencephalography and genome-wide linkage study in 212 healthy siblings demonstrates that auditory cortical activation strength is highly heritable and, specifically in the right hemisphere, regulated oligogenically with linkages to chromosomes 2q37, 3p12, and 8q24. The identified regions delimit as candidate genes TRAPPC9, operating in neuronal differentiation, and ROBO1, regulating projections of thalamocortical axons. Identification of normal genetic variation underlying neurophysiological phenotypes offers a non-invasive platform for an in-depth, concerted capitalization of molecular and neuroimaging levels in exploring neural function.

  13. An autosomal genetic linkage map of the sheep genome

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Crawford, A.M.; Ede, A.J.; Pierson, C.A.

    1995-06-01

    We report the first extensive ovine genetic linkage map covering 2070 cM of the sheep genome. The map was generated from the linkage analysis of 246 polymorphic markers, in nine three-generation full-sib pedigrees, which make up the AgResearch International Mapping Flock. We have exploited many markers from cattle so that valuable comparisons between these two ruminant linkage maps can be made. The markers, used in the segregation analyses, comprised 86 anonymous microsatellite markers derived from the sheep genome, 126 anonymous microsatellites from cattle, one from deer, and 33 polymorphic markers of various types associated with known genes. The maximum number of informative meioses within the mapping flock was 222. The average number of informative meioses per marker was 140 (range 18-209). Linkage groups have been assigned to all 26 sheep autosomes. 102 refs., 8 figs., 5 tabs.

  14. Stochastic model search with binary outcomes for genome-wide association studies.

    PubMed

    Russu, Alberto; Malovini, Alberto; Puca, Annibale A; Bellazzi, Riccardo

    2012-06-01

    The spread of case-control genome-wide association studies (GWASs) has stimulated the development of new variable selection methods and predictive models. We introduce a novel Bayesian model search algorithm, Binary Outcome Stochastic Search (BOSS), which addresses the model selection problem when the number of predictors far exceeds the number of binary responses. Our method is based on a latent variable model that links the observed outcomes to the underlying genetic variables. A Markov Chain Monte Carlo approach is used for model search and to evaluate the posterior probability of each predictor. BOSS is compared with three established methods (stepwise regression, logistic lasso, and elastic net) in a simulated benchmark. Two real case studies are also investigated: a GWAS on the genetic bases of longevity, and the type 2 diabetes study from the Wellcome Trust Case Control Consortium. Simulations show that BOSS achieves higher precisions than the reference methods while preserving good recall rates. In both experimental studies, BOSS successfully detects genetic polymorphisms previously reported to be associated with the analyzed phenotypes. BOSS outperforms the other methods in terms of F-measure on simulated data. In the two real studies, BOSS successfully detects biologically relevant features, some of which are missed by univariate analysis and the three reference techniques. The proposed algorithm is an advance in the methodology for model selection with a large number of features. Our simulated and experimental results showed that BOSS proves effective in detecting relevant markers while providing a parsimonious model.
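
    The comparison above is reported in terms of precision, recall and F-measure. A minimal sketch of how these are computed for a set of selected markers follows; it assumes the balanced F1 form of the F-measure (the abstract does not specify the weighting), and the SNP identifiers are made up.

```python
def precision_recall_f1(selected, causal):
    """Return (precision, recall, F1) for selected markers versus true causal markers."""
    selected, causal = set(selected), set(causal)
    true_positives = len(selected & causal)
    precision = true_positives / len(selected) if selected else 0.0
    recall = true_positives / len(causal) if causal else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f1 = 2.0 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical simulation: three causal SNPs, four SNPs selected by a model
causal_snps = {"snp_a", "snp_b", "snp_e"}
selected_snps = {"snp_a", "snp_b", "snp_c", "snp_d"}
precision, recall, f1 = precision_recall_f1(selected_snps, causal_snps)
print(f"precision={precision:.2f} recall={recall:.2f} F1={f1:.2f}")
```

    A parsimonious model that selects few false markers raises precision; F1 only stays high if recall is preserved as well.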

  15. Genome-wide divergence, haplotype distribution and population demographic histories for Gossypium hirsutum and Gossypium barbadense as revealed by genome-anchored SNPs

    PubMed Central

    Reddy, Umesh K.; Nimmakayala, Padma; Abburi, Venkata Lakshmi; Reddy, C. V. C. M.; Saminathan, Thangasamy; Percy, Richard G.; Yu, John Z.; Frelichowski, James; Udall, Joshua A.; Page, Justin T.; Zhang, Dong; Shehzad, Tariq; Paterson, Andrew H.

    2017-01-01

    Use of 10,129 singleton SNPs of known genomic location in tetraploid cotton provided unique opportunities to characterize genome-wide diversity among 440 Gossypium hirsutum and 219 G. barbadense cultivars and landrace accessions of widespread origin. Using the SNPs distributed genome-wide, we examined genetic diversity, haplotype distribution and linkage disequilibrium patterns in the G. hirsutum and G. barbadense genomes to clarify population demographic history. Diversity and identity-by-state analyses have revealed little sharing of alleles between the two cultivated allotetraploid genomes, with a few exceptions that indicated sporadic gene flow. We found a high number of new alleles, representing increased nucleotide diversity, on chromosomes 1 and 2 in cultivated G. hirsutum as compared with low nucleotide diversity on these chromosomes in landrace G. hirsutum. In contrast, G. barbadense chromosomes showed negative Tajima’s D on several chromosomes for both cultivated and landrace types, which indicate that speciation of G. barbadense itself, might have occurred with relatively narrow genetic diversity. The presence of conserved linkage disequilibrium (LD) blocks and haplotypes between G. hirsutum and G. barbadense provides strong evidence for comparable patterns of evolution in their domestication processes. Our study illustrates the potential use of population genetic techniques to identify genomic regions for domestication. PMID:28128280
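
    Tajima's D, mentioned above, compares two estimates of diversity: the mean number of pairwise differences and the number of segregating sites scaled by sample size. The sketch below implements the standard formula; the function name and input values are illustrative and are not taken from the cotton data.

```python
import math


def tajimas_d(n_sequences: int, segregating_sites: int, pi: float) -> float:
    """Tajima's D from sample size, number of segregating sites and mean pairwise differences."""
    n, s = n_sequences, segregating_sites
    if n < 4 or s < 1:
        raise ValueError("need at least 4 sequences and 1 segregating site")
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    # Difference between the two diversity estimates, scaled by its standard deviation
    return (pi - s / a1) / math.sqrt(e1 * s + e2 * s * (s - 1))


# Illustrative inputs: 10 sequences, 16 segregating sites, mean pairwise difference 3.888889
d = tajimas_d(10, 16, 3.888889)
print(f"Tajima's D = {d:.3f}")  # about -1.446
```

    A negative value arises when pairwise diversity is low relative to the number of segregating sites, that is, when there is an excess of rare variants.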

  16. Genome-Wide Linkage and Regional Association Study of Blood Pressure Response to the Cold Pressor Test in Han Chinese: The GenSalt Study

    PubMed Central

    Yang, Xueli; Gu, Dongfeng; He, Jiang; Hixson, James E.; Rao, Dabeeru C.; Lu, Fanghong; Mu, Jianjun; Jaquish, Cashell E.; Chen, Jing; Huang, Jianfeng; Shimmin, Lawrence C.; Rice, Treva K.; Chen, Jichun; Wu, Xigui; Liu, Depei; Kelly, Tanika N.

    2014-01-01

    Background Blood pressure (BP) response to cold pressor test (CPT) is associated with increased risk of cardiovascular disease. We performed a genome-wide linkage scan and regional association analysis to identify genetic determinants of BP response to CPT. Methods and Results A total of 1,961 Chinese participants completed the CPT. Multipoint quantitative trait linkage analysis was performed, followed by single-marker and gene-based analyses of variants in promising linkage regions (logarithm of odds, LOD ≥ 2). A suggestive linkage signal was identified for systolic BP (SBP) response to CPT at 20p13-20p12.3, with a maximum multipoint LOD score of 2.37. Based on regional association analysis with 1,351 SNPs in the linkage region, we found that marker rs2326373 at 20p13 was significantly associated with mean arterial pressure (MAP) responses to CPT (P = 8.8 × 10⁻⁶) after FDR adjustment for multiple comparisons. A similar trend was also observed for SBP response (P = 0.03) and DBP response (P = 4.6 × 10⁻⁵). Results of gene-based analyses showed that variants in genes MCM8 and SLC23A2 were associated with SBP response to CPT (P = 4.0 × 10⁻⁵ and 2.7 × 10⁻⁴, respectively), and variants in genes MCM8 and STK35 were associated with MAP response to CPT (P = 1.5 × 10⁻⁵ and 5.0 × 10⁻⁵, respectively). Conclusions Within a suggestive linkage region on chromosome 20, we identified a novel variant associated with BP responses to CPT. We also found gene-based associations of MCM8, SLC23A2 and STK35 in this region. Further work is warranted to confirm these findings. Clinical Trial Registration URL: http://www.clinicaltrials.gov; Unique identifier: NCT00721721. PMID:25028485
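
    The abstract reports FDR adjustment for multiple comparisons without naming the procedure. Assuming the commonly used Benjamini-Hochberg step-up method, adjusted P values could be computed as in this sketch; the input P values are invented for illustration.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the original order."""
    m = len(p_values)
    # Indices of the P values sorted from smallest to largest
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Hypothetical P values for four SNPs in a linkage region
raw = [0.01, 0.04, 0.03, 0.005]
adj = benjamini_hochberg(raw)
print([round(p, 3) for p in adj])
```

    A marker is then declared significant when its adjusted value falls below the chosen FDR level, for example 0.05.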

  17. Construction of high-quality recombination maps with low-coverage genomic sequencing for joint linkage analysis in maize

    USDA-ARS's Scientific Manuscript database

    A genome-wide association study (GWAS) is the foremost strategy used for finding genes that control human diseases and agriculturally important traits, but it often reports false positives. In contrast, its complementary method, linkage analysis, provides direct genetic confirmation, but with limite...

  18. Stochastic model search with binary outcomes for genome-wide association studies

    PubMed Central

    Russu, Alberto; Malovini, Alberto; Puca, Annibale A; Bellazzi, Riccardo

    2012-01-01

    Objective The spread of case–control genome-wide association studies (GWASs) has stimulated the development of new variable selection methods and predictive models. We introduce a novel Bayesian model search algorithm, Binary Outcome Stochastic Search (BOSS), which addresses the model selection problem when the number of predictors far exceeds the number of binary responses. Materials and methods Our method is based on a latent variable model that links the observed outcomes to the underlying genetic variables. A Markov Chain Monte Carlo approach is used for model search and to evaluate the posterior probability of each predictor. Results BOSS is compared with three established methods (stepwise regression, logistic lasso, and elastic net) in a simulated benchmark. Two real case studies are also investigated: a GWAS on the genetic bases of longevity, and the type 2 diabetes study from the Wellcome Trust Case Control Consortium. Simulations show that BOSS achieves higher precisions than the reference methods while preserving good recall rates. In both experimental studies, BOSS successfully detects genetic polymorphisms previously reported to be associated with the analyzed phenotypes. Discussion BOSS outperforms the other methods in terms of F-measure on simulated data. In the two real studies, BOSS successfully detects biologically relevant features, some of which are missed by univariate analysis and the three reference techniques. Conclusion The proposed algorithm is an advance in the methodology for model selection with a large number of features. Our simulated and experimental results showed that BOSS proves effective in detecting relevant markers while providing a parsimonious model. PMID:22534080

  19. Genome-Wide Association Studies of Drug-Resistance Determinants.

    PubMed

    Volkman, Sarah K; Herman, Jonathan; Lukens, Amanda K; Hartl, Daniel L

    2017-03-01

    Population genetic strategies that leverage association, selection, and linkage have identified drug-resistant loci. However, challenges and limitations persist in identifying drug-resistance loci in malaria. In this review we discuss the genetic basis of drug resistance and the use of genome-wide association studies, complemented by selection and linkage studies, to identify and understand mechanisms of drug resistance and response. We also discuss the implications of nongenetic mechanisms of drug resistance recently reported in the literature, and present models of the interplay between nongenetic and genetic processes that contribute to the emergence of drug resistance. Throughout, we examine artemisinin resistance as an example to emphasize challenges in identifying phenotypes suitable for population genetic studies as well as complications due to multiple-factor drug resistance. Copyright © 2016. Published by Elsevier Ltd.

  20. A linkage map for the B-genome of Arachis (Fabaceae) and its synteny to the A-genome

    PubMed Central

    Moretzsohn, Márcio C; Barbosa, Andrea VG; Alves-Freitas, Dione MT; Teixeira, Cristiane; Leal-Bertioli, Soraya CM; Guimarães, Patrícia M; Pereira, Rinaldo W; Lopes, Catalina R; Cavallari, Marcelo M; Valls, José FM; Bertioli, David J; Gimenes, Marcos A

    2009-01-01

    Background Arachis hypogaea (peanut) is an important crop worldwide, being mostly used for edible oil production, direct consumption and animal feed. Cultivated peanut is an allotetraploid species with two different genome components, A and B. Genetic linkage maps can greatly assist molecular breeding and genomic studies. However, the development of linkage maps for A. hypogaea is difficult because it has very low levels of polymorphism. This can be overcome by the utilization of wild species of Arachis, which present the A- and B-genomes in the diploid state, and show high levels of genetic variability. Results In this work, we constructed a B-genome linkage map, which will complement the previously published map for the A-genome of Arachis, and produced an entire framework for the tetraploid genome. This map is based on an F2 population of 93 individuals obtained from the cross between the diploid A. ipaënsis (K30076) and the closely related A. magna (K30097), the former species being the most probable B genome donor to cultivated peanut. In spite of being classified as different species, the parents showed high crossability and relatively low polymorphism (22.3%), compared to other interspecific crosses. The map has 10 linkage groups, with 149 loci spanning a total map distance of 1,294 cM. The microsatellite markers utilized, developed for other Arachis species, showed high transferability (81.7%). Segregation distortion was 21.5%. This B-genome map was compared to the A-genome map using 51 common markers, revealing a high degree of synteny between both genomes. Conclusion The development of genetic maps for Arachis diploid wild species with A- and B-genomes effectively provides a genetic map for the tetraploid cultivated peanut in two separate diploid components and is a significant advance towards the construction of a transferable reference map for Arachis. Additionally, we were able to identify affinities of some Arachis linkage groups with Medicago
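
    Segregation distortion in an F2 population is usually assessed marker by marker with a chi-square goodness-of-fit test. The sketch below assumes a codominant marker with an expected 1:2:1 genotype ratio; the abstract does not describe the exact test used, and the genotype counts are hypothetical (they only share the population size of 93).

```python
import math


def f2_segregation_test(n_aa: int, n_ab: int, n_bb: int):
    """Chi-square test of a 1:2:1 ratio; returns (chi_square, p_value) with 2 degrees of freedom."""
    total = n_aa + n_ab + n_bb
    observed = (n_aa, n_ab, n_bb)
    expected = (0.25 * total, 0.50 * total, 0.25 * total)
    chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # For 2 degrees of freedom the chi-square survival function is exp(-x / 2)
    p_value = math.exp(-chi_square / 2.0)
    return chi_square, p_value


# Hypothetical genotype counts for one marker in 93 F2 individuals
chi_square, p_value = f2_segregation_test(30, 45, 18)
print(f"chi-square = {chi_square:.2f}, P = {p_value:.3f}")
```

    A marker would be counted as distorted when its P value falls below the chosen threshold (0.05 is a common choice).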

  1. Genome-wide genetic investigation of serological measures of common infections

    PubMed Central

    Rubicz, Rohina; Yolken, Robert; Drigalenko, Eugene; Carless, Melanie A; Dyer, Thomas D; Kent Jr, Jack; Curran, Joanne E; Johnson, Matthew P; Cole, Shelley A; Fowler, Sharon P; Arya, Rector; Puppala, Sobha; Almasy, Laura; Moses, Eric K; Kraig, Ellen; Duggirala, Ravindranath; Blangero, John; Leach, Charles T; Göring, Harald HH

    2015-01-01

    Populations and individuals differ in susceptibility to infections because of a number of factors, including host genetic variation. We previously demonstrated that differences in antibody titer, which reflect infection history, are significantly heritable. Here we attempt to identify the genetic factors influencing variation in these serological phenotypes. Blood samples from >1300 Mexican Americans were quantified for IgG antibody level against 12 common infections, selected on the basis of their reported role in cardiovascular disease risk: Chlamydia pneumoniae; Helicobacter pylori; Toxoplasma gondii; cytomegalovirus; herpes simplex I virus; herpes simplex II virus; human herpesvirus 6 (HHV6); human herpesvirus 8 (HHV8); varicella zoster virus; hepatitis A virus (HAV); influenza A virus; and influenza B virus. Pathogen-specific quantitative antibody levels were analyzed, as were three measures of pathogen burden. Genome-wide linkage and joint linkage and association analyses were performed using ~1 million SNPs. Significant linkage (lod scores >3.0) was obtained for HHV6 (on chromosome 7), HHV8 (on chromosome 6), and HAV (on chromosome 13). SNP rs4812712 on chromosome 20 was significantly associated with C. pneumoniae (P = 5.3 × 10⁻⁸). However, no genome-wide significant loci were obtained for the other investigated antibodies. We conclude that it is possible to localize host genetic factors influencing some of these antibody traits, but that further larger-scale investigations will be required to elucidate the genetic mechanisms contributing to variation in antibody levels. PMID:25758998

  2. Genome-wide linkage mapping of QTL for black point reaction in bread wheat (Triticum aestivum L.).

    PubMed

    Liu, Jindong; He, Zhonghu; Wu, Ling; Bai, Bin; Wen, Weie; Xie, Chaojie; Xia, Xianchun

    2016-11-01

    Nine QTL for black point resistance in wheat were identified using a RIL population derived from a Linmai 2/Zhong 892 cross and 90K SNP assay. Black point, discoloration of the embryo end of the grain, downgrades wheat grain quality leading to significant economic losses to the wheat industry. The availability of molecular markers will accelerate improvement of black point resistance in wheat breeding. The aims of this study were to identify quantitative trait loci (QTL) for black point resistance and tightly linked molecular markers, and to search for candidate genes using a high-density genetic linkage map of wheat. A recombinant inbred line (RIL) population derived from the cross Linmai 2/Zhong 892 was evaluated for black point reaction during the 2011-2012, 2012-2013 and 2013-2014 cropping seasons, providing data for seven environments. A high-density linkage map was constructed by genotyping the RILs with the wheat 90K single nucleotide polymorphism (SNP) chip. Composite interval mapping detected nine QTL on chromosomes 2AL, 2BL, 3AL, 3BL, 5AS, 6A, 7AL (2) and 7BS, designated as QBp.caas-2AL, QBp.caas-2BL, QBp.caas-3AL, QBp.caas-3BL, QBp.caas-5AS, QBp.caas-6A, QBp.caas-7AL.1, QBp.caas-7AL.2 and QBp.caas-7BS, respectively. All resistance alleles, except for QBp.caas-7AL.1 from Linmai 2, were contributed by Zhong 892. QBp.caas-3BL, QBp.caas-5AS, QBp.caas-7AL.1, QBp.caas-7AL.2 and QBp.caas-7BS probably represent new loci for black point resistance. Sequences of tightly linked SNPs were used to survey wheat and related cereal genomes identifying three candidate genes for black point resistance. The tightly linked SNP markers can be used in marker-assisted breeding in combination with the kompetitive allele specific PCR technique to improve black point resistance.

  3. Genetic linkage map and comparative genome analysis for the estuarine Atlantic killifish (Fundulus heteroclitus)

    EPA Pesticide Factsheets

    Genetic linkage maps are valuable tools in evolutionary biology; however, their availability for wild populations is extremely limited. Fundulus heteroclitus (Atlantic killifish) is a non-migratory estuarine fish that exhibits high allelic and phenotypic diversity partitioned among subpopulations that reside in disparate environmental conditions. Although F. heteroclitus is an ideal candidate model organism for studying gene-environment interactions, its molecular toolbox is limited. We identified hundreds of novel microsatellites which, when combined with existing microsatellites and single nucleotide polymorphisms (SNPs), were used to construct the first genetic linkage map for this species. By integrating independent linkage maps from three genetic crosses, we developed a consensus map containing 24 linkage groups, consistent with the number of chromosomes reported for this species. These linkage groups span 2300 centimorgans (cM) of recombinant genomic space, intermediate in size relative to the current linkage maps for the teleosts, medaka and zebrafish. Comparisons between fish genomes support a high degree of synteny between the consensus F. heteroclitus linkage map and the medaka and (to a lesser extent) zebrafish physical genome assemblies. This dataset is associated with the following publication: Waits, E., J. Martinson, B. Rinner, S. Morris, D. Proestou, D. Champlin, and D. Nacci. Genetic linkage map and comparative genome analysis for the estuarine Atlanti

  4. Genetic linkage map of a wild genome: genomic structure, recombination and sexual dimorphism in bighorn sheep

    PubMed Central

    2010-01-01

    Background The construction of genetic linkage maps in free-living populations is a promising tool for the study of evolution. However, such maps are rare because it is difficult to develop both wild pedigrees and corresponding sets of molecular markers that are sufficiently large. We took advantage of two long-term field studies of pedigreed individuals and genomic resources originally developed for domestic sheep (Ovis aries) to construct a linkage map for bighorn sheep, Ovis canadensis. We then assessed variability in genomic structure and recombination rates between bighorn sheep populations and sheep species. Results Bighorn sheep population-specific maps differed slightly in contiguity but were otherwise very similar in terms of genomic structure and recombination rates. The joint analysis of the two pedigrees resulted in a highly contiguous map composed of 247 microsatellite markers distributed along all 26 autosomes and the X chromosome. The map is estimated to cover about 84% of the bighorn sheep genome and contains 240 unique positions spanning a sex-averaged distance of 3051 cM with an average inter-marker distance of 14.3 cM. Marker synteny, order, sex-averaged interval lengths and sex-averaged total map lengths were all very similar between sheep species. However, in contrast to domestic sheep, but consistent with the usual pattern for a placental mammal, recombination rates in bighorn sheep were significantly greater in females than in males (~12% difference), resulting in an autosomal female map of 3166 cM and an autosomal male map of 2831 cM. Despite differing genome-wide patterns of heterochiasmy between the sheep species, sexual dimorphism in recombination rates was correlated between orthologous intervals. Conclusions We have developed a first-generation bighorn sheep linkage map that will facilitate future studies of the genetic architecture of trait variation in this species. While domestication has been hypothesized to be responsible for the
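
    The summary figures in this abstract can be checked with simple arithmetic. The sketch below assumes one linkage group per chromosome (26 autosomes plus X), so that the number of intervals equals the number of unique positions minus the number of linkage groups; that assumption is ours, not stated in the abstract, and the function names are illustrative.

```python
def mean_interval_cm(map_length_cm: float, unique_positions: int, linkage_groups: int) -> float:
    """Average distance between adjacent positions, given one chain of intervals per linkage group."""
    intervals = unique_positions - linkage_groups
    return map_length_cm / intervals


def relative_excess(female_cm: float, male_cm: float) -> float:
    """Fractional excess of the female map length over the male map length."""
    return female_cm / male_cm - 1.0


# Values reported in the abstract
spacing = mean_interval_cm(3051, unique_positions=240, linkage_groups=27)
excess = relative_excess(3166, 2831)
print(f"mean spacing = {spacing:.1f} cM")      # about 14.3 cM
print(f"female excess = {excess * 100:.1f}%")  # about 11.8%, i.e. roughly 12%
```

    Under that assumption the computed spacing agrees with the reported 14.3 cM, and the female-to-male ratio agrees with the reported difference of about 12%.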

  5. FGWAS: Functional genome wide association analysis.

    PubMed

    Huang, Chao; Thompson, Paul; Wang, Yalin; Yu, Yang; Zhang, Jingwen; Kong, Dehan; Colen, Rivka R; Knickmeyer, Rebecca C; Zhu, Hongtu

    2017-10-01

    Functional phenotypes (e.g., subcortical surface representation), which commonly arise in imaging genetic studies, have been used to detect putative genes for complexly inherited neuropsychiatric and neurodegenerative disorders. However, existing statistical methods largely ignore the functional features (e.g., functional smoothness and correlation). The aim of this paper is to develop a functional genome-wide association analysis (FGWAS) framework to efficiently carry out whole-genome analyses of functional phenotypes. FGWAS consists of three components: a multivariate varying coefficient model, a global sure independence screening procedure, and a test procedure. Compared with the standard multivariate regression model, the multivariate varying coefficient model explicitly models the functional features of functional phenotypes through the integration of smooth coefficient functions and functional principal component analysis. Statistically, compared with existing methods for genome-wide association studies (GWAS), FGWAS can substantially boost the detection power for discovering important genetic variants influencing brain structure and function. Simulation studies show that FGWAS outperforms existing GWAS methods for searching sparse signals in an extremely large search space, while controlling for the family-wise error rate. We have successfully applied FGWAS to large-scale analysis of data from the Alzheimer's Disease Neuroimaging Initiative for 708 subjects, 30,000 vertices on the left and right hippocampal surfaces, and 501,584 SNPs. Copyright © 2017 Elsevier Inc. All rights reserved.

  6. A genome-wide SNP scan accelerates trait-regulatory genomic loci identification in chickpea

    PubMed Central

    Kujur, Alice; Bajaj, Deepak; Upadhyaya, Hari D.; Das, Shouvik; Ranjan, Rajeev; Shree, Tanima; Saxena, Maneesha S.; Badoni, Saurabh; Kumar, Vinod; Tripathi, Shailesh; Gowda, C.L.L.; Sharma, Shivali; Singh, Sube; Tyagi, Akhilesh K.; Parida, Swarup K.

    2015-01-01

    We identified 44844 high-quality SNPs by sequencing 92 diverse chickpea accessions belonging to a seed and pod trait-specific association panel using reference genome- and de novo-based GBS (genotyping-by-sequencing) assays. A GWAS (genome-wide association study) in an association panel of 211, including the 92 sequenced accessions, identified 22 major genomic loci showing significant association (explaining 23–47% phenotypic variation) with pod and seed number/plant and 100-seed weight. Eighteen trait-regulatory major genomic loci underlying 13 robust QTLs were validated and mapped on an intra-specific genetic linkage map by QTL mapping. A combinatorial approach of GWAS, QTL mapping and gene haplotype-specific LD mapping and transcript profiling uncovered one superior haplotype and favourable natural allelic variants in the upstream regulatory region of a CesA-type cellulose synthase (Ca_Kabuli_CesA3) gene regulating high pod and seed number/plant (explaining 47% phenotypic variation) in chickpea. The up-regulation of this superior gene haplotype correlated with increased transcript expression of Ca_Kabuli_CesA3 gene in the pollen and pod of high pod/seed number accession, resulting in higher cellulose accumulation for normal pollen and pollen tube growth. A rapid combinatorial genome-wide SNP genotyping-based approach has potential to dissect complex quantitative agronomic traits and delineate trait-regulatory genomic loci (candidate genes) for genetic enhancement in crop plants, including chickpea. PMID:26058368

  7. Genome Wide Search for Biomarkers to Diagnose Yersinia Infections.

    PubMed

    Kalia, Vipin Chandra; Kumar, Prasun

    2015-12-01

    Bacterial identification on the basis of the highly conserved 16S rRNA (rrs) gene is limited by its presence in multiple copies and a very high level of similarity among them. Other genes with unique characteristics are therefore needed as biomarkers. Fifty-one sequenced genomes belonging to 10 different Yersinia species were used for searching genes common to all the genomes. Out of 304 common genes, 34 genes of sizes varying from 0.11 to 4.42 kb were selected and subjected to in silico digestion with 10 different restriction endonucleases (REs) (4-6 base cutters). Yersinia species have 6-7 copies of rrs per genome, which are difficult to distinguish by multiple sequence alignments or their RE digestion patterns. However, certain unique combinations of other common gene sequences (carB, fadJ, gluM, gltX, ileS, malE, nusA, ribD, and rlmL) and their RE digestion patterns can be used as markers for identifying 21 strains belonging to 10 Yersinia species: Y. aldovae, Y. enterocolitica, Y. frederiksenii, Y. intermedia, Y. kristensenii, Y. pestis, Y. pseudotuberculosis, Y. rohdei, Y. ruckeri, and Y. similis. This approach can be applied for rapid diagnostic applications.
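
    In silico digestion amounts to locating each recognition site in a sequence and cutting at a fixed offset within it. The sketch below digests a linear sequence with a single enzyme; EcoRI (site GAATTC, cut after the first G) is used only as a familiar example, since the abstract does not list the enzymes, and the input sequence is invented.

```python
def digest(sequence: str, site: str, cut_offset: int):
    """Cut a linear DNA sequence at every occurrence of site; return the fragments in order."""
    sequence = sequence.upper()
    site = site.upper()
    fragments = []
    start = 0
    position = sequence.find(site)
    while position != -1:
        cut = position + cut_offset
        fragments.append(sequence[start:cut])
        start = cut
        # Continue searching one base further so overlapping sites are not missed
        position = sequence.find(site, position + 1)
    fragments.append(sequence[start:])
    return fragments


# Hypothetical 18-base sequence containing two EcoRI sites
fragments = digest("AAGAATTCTTGAATTCCC", site="GAATTC", cut_offset=1)
print([len(f) for f in fragments])  # [3, 8, 7]
```

    Comparing the resulting fragment-length patterns across genomes is what allows a gene to serve as a marker when the patterns differ between species or strains.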

  8. Construction of high resolution genetic linkage maps to improve the soybean genome sequence assembly Glyma1.01

    USDA-ARS's Scientific Manuscript database

    A landmark in soybean research, Glyma1.01, the first whole genome sequence of variety Williams 82 (Glycine max L. Merr.) was completed in 2010 and is widely used. However, because the assembly was primarily built based on the linkage maps constructed with a limited number of markers and recombinant...

  9. Genome-wide search and comparative genomic analysis of the trypsin inhibitor-like cysteine-rich domain-containing peptides.

    PubMed

    Zeng, Xian-Chun; Liu, Yichen; Shi, Wanxia; Zhang, Lei; Luo, Xuesong; Nie, Yao; Yang, Ye

    2014-03-01

    It was shown that peptides containing trypsin inhibitor-like cysteine-rich (TIL) domain are able to inhibit proteinase activities, and thus play important roles in various biological processes, such as immune response and anticoagulation. However, only a limited number of the TIL peptides have been identified and characterized so far, and little is known about the evolutionary relationships of the genes encoding the TIL peptides. BmKAPi is a TIL domain-containing peptide that was identified from Mesobuthus martensii Karsch. Here, we conducted genome-wide searches for new peptides that are homologous to BmKAPi or possess a cysteine pattern similar to that of BmKAPi. As a result, we identified a total of 80 different TIL peptides from 34 species of arthropods. We found that these peptides can be classified into seven evolutionarily distinct groups. Furthermore, we cloned the genomic sequence of BmKAPi; the genomic sequences of the majority of other TIL peptides were also identified from the GenBank database using bioinformatical approaches. Through phylogenetic and comparative genomic analysis, we found 26 intron gain events in the genes of the TIL peptides; however, no instances of intron loss were observed. Moreover, we found that alternative splicing contributes to the diversification of the TIL peptides. It is interesting to see that four genes of the TIL domain-containing peptides overlap in a DNA region located on the chromosome LG B15 of Bombus terrestris. These data suggest that the evolution of the TIL peptide genes is dynamic and has been dominated by intron gain. Copyright © 2013 Elsevier Inc. All rights reserved.

  10. Fast and Accurate Approximation to Significance Tests in Genome-Wide Association Studies

    PubMed Central

    Zhang, Yu; Liu, Jun S.

    2011-01-01

    Genome-wide association studies commonly involve simultaneous tests of millions of single nucleotide polymorphisms (SNP) for disease association. The SNPs in nearby genomic regions, however, are often highly correlated due to linkage disequilibrium (LD, a genetic term for correlation). Simple Bonferroni correction for multiple comparisons is therefore too conservative. Permutation tests, which are often employed in practice, are both computationally expensive for genome-wide studies and limited in scope. We present an accurate and computationally efficient method, based on Poisson de-clumping heuristics, for approximating genome-wide significance of SNP associations. Compared with permutation tests and other multiple comparison adjustment approaches, our method computes the most accurate and robust p-value adjustments for millions of correlated comparisons within seconds. We demonstrate analytically that the accuracy and the efficiency of our method are nearly independent of the sample size, the number of SNPs, and the scale of p-values to be adjusted. In addition, our method can be easily adapted to estimate false discovery rate. When applied to genome-wide SNP datasets, we observed highly variable p-value adjustment results evaluated from different genomic regions. The variation in adjustments along the genome, however, is well conserved between the European and the African populations. The p-value adjustments are significantly correlated with LD among SNPs, recombination rates, and SNP densities. Given the large variability of sequence features in the genome, we further discuss a novel approach of using SNP-specific (local) thresholds to detect genome-wide significant associations. This article has supplementary material online. PMID:22140288

  11. A Targeted Capture Linkage Map Anchors the Genome of the Schistosomiasis Vector Snail, Biomphalaria glabrata.

    PubMed

    Tennessen, Jacob A; Bollmann, Stephanie R; Blouin, Michael S

    2017-07-05

    The aquatic planorbid snail Biomphalaria glabrata is one of the most intensively-studied mollusks due to its role in the transmission of schistosomiasis. Its 916 Mb genome has recently been sequenced and annotated, but it remains poorly assembled. Here, we used targeted capture markers to map over 10,000 B. glabrata scaffolds in a linkage cross of 94 F1 offspring, generating 24 linkage groups (LGs). We added additional scaffolds to these LGs based on linkage disequilibrium (LD) analysis of targeted capture and whole-genome sequences of 96 unrelated snails. Our final linkage map consists of 18,613 scaffolds comprising 515 Mb, representing 56% of the genome and 75% of genic and nonrepetitive regions. There are 18 large (> 10 Mb) LGs, likely representing the expected 18 haploid chromosomes, and > 50% of the genome has been assigned to LGs of at least 17 Mb. Comparisons with other gastropod genomes reveal patterns of synteny and chromosomal rearrangements. Linkage relationships of key immune-relevant genes may help clarify snail-schistosome interactions. By focusing on linkage among genic and nonrepetitive regions, we have generated a useful resource for associating snail phenotypes with causal genes, even in the absence of a complete genome assembly. A similar approach could potentially improve numerous poorly-assembled genomes in other taxa. This map will facilitate future work on this host of a serious human parasite. Copyright © 2017 Tennessen et al.

  12. Evo-Devo-EpiR: a genome-wide search platform for epistatic control on the evolution of development.

    PubMed

    Jiang, Libo; Zhang, Miaomiao; Sang, Mengmeng; Ye, Meixia; Wu, Rongling

    2017-09-01

    Evo-devo is a theory proposed to study how phenotypes evolve by comparing the developmental processes of different organisms or the same organism experiencing changing environments. It has been recognized that nonallelic interactions at different genes or quantitative trait loci, known as epistasis, may play a pivotal role in the evolution of development, but it has proven difficult to quantify and elucidate this role into a coherent picture. We implement a high-dimensional genome-wide association study model into the evo-devo paradigm and pack it into the R-based Evo-Devo-EpiR, aimed at facilitating the genome-wide landscaping of epistasis for the diversification of phenotypic development. By analyzing a high-throughput assay of DNA markers and their pairs simultaneously, Evo-Devo-EpiR is equipped with a capacity to systematically characterize various epistatic interactions that impact on the pattern and timing of development and its evolution. Enabling a global search for all possible genetic interactions for developmental processes throughout the whole genome, Evo-Devo-EpiR provides a computational tool to illustrate a precise genotype-phenotype map at interface between epistasis, development and evolution. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

  13. Linkage of Bardet-Biedl syndrome to chromosome 16q and evidence for non-allelic genetic heterogeneity.

    PubMed

    Kwitek-Black, A E; Carmi, R; Duyk, G M; Buetow, K H; Elbedour, K; Parvari, R; Yandava, C N; Stone, E M; Sheffield, V C

    1993-12-01

    Bardet-Biedl syndrome is an autosomal recessive disorder characterized by mental retardation, obesity, retinitis pigmentosa, polydactyly and hypogonadism. Other findings include hypertension, diabetes mellitus and renal and cardiovascular anomalies. We have performed a genome-wide search for linkage in a large inbred Bedouin family. Pairwise analysis established linkage with the locus D16S408 with no recombination and a lod score of 4.2. A multilocus lod score of 5.3 was observed. By demonstrating homozygosity, in all affected individuals, for the same allele of marker D16S408, further support for linkage is found, and the utility of homozygosity mapping using inbred families is demonstrated. In a second family, linkage was excluded at this locus, suggesting non-allelic genetic heterogeneity in this disorder.
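    The pairwise lod score quoted above can be illustrated with a minimal sketch, assuming fully informative, phase-known meioses (a simplification; the abstract does not describe the pedigree structure or the software used). The function name and arguments are hypothetical. It computes the textbook two-point lod score, log10 of the likelihood at recombination fraction theta over the likelihood at theta = 0.5.

```python
import math


def _log10_pow(p: float, count: int) -> float:
    """Return log10(p ** count), treating 0 ** 0 as 1 and 0 ** k as 0."""
    if count == 0:
        return 0.0
    if p == 0.0:
        return float("-inf")
    return count * math.log10(p)


def lod_score(recombinants: int, meioses: int, theta: float) -> float:
    """Two-point lod score for phase-known, fully informative meioses.

    Illustrative assumption: each meiosis is independently recombinant
    with probability theta (0 <= theta <= 0.5).
    """
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie in [0, 0.5]")
    if not 0 <= recombinants <= meioses:
        raise ValueError("recombinants must lie in [0, meioses]")
    non_recombinants = meioses - recombinants
    log_l_linked = (_log10_pow(theta, recombinants)
                    + _log10_pow(1.0 - theta, non_recombinants))
    log_l_unlinked = meioses * math.log10(0.5)
    return log_l_linked - log_l_unlinked


if __name__ == "__main__":
    # With no recombinants, each informative meiosis adds log10(2) at theta = 0.
    print(round(lod_score(0, 14, 0.0), 2))
```

    Under these assumptions, each non-recombinant meiosis contributes log10(2), about 0.30, at theta = 0. A score near 4.2 therefore corresponds to roughly 14 such meioses. This is only an order-of-magnitude reading, not a reconstruction of the family data.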

  14. A genome-wide linkage scan for dietary energy and nutrient intakes: the Health, Risk Factors, Exercise Training, and Genetics (HERITAGE) Family Study.

    PubMed

    Collaku, Agron; Rankinen, Tuomo; Rice, Treva; Leon, Arthur S; Rao, D C; Skinner, James S; Wilmore, Jack H; Bouchard, Claude

    2004-05-01

    A poor diet is a risk factor for chronic diseases such as obesity, cardiovascular disease, hypertension, and some cancers. Twin and family studies suggest that genetic factors potentially influence energy and nutrient intakes. We sought to identify genomic regions harboring genes affecting total energy, carbohydrate, protein, and fat intakes. We performed a genomic scan in 347 white sibling pairs and 99 black sibling pairs. Dietary energy and nutrient intakes were assessed by using Willett's food-frequency questionnaire. Single-point and multipoint Haseman-Elston regression techniques were used to test for linkage. These subjects were part of the Health, Risk Factors, Exercise Training, and Genetics (HERITAGE) Family Study, a multicenter project undertaken by 5 laboratories. In the whites, the strongest evidence of linkage appeared for dietary energy and nutrient intakes on chromosomes 1p21.2 (P = 0.0002) and 20q13.13 (P = 0.00007), and that for fat intake appeared on chromosome 12q14.1 (P = 0.0013). The linkage evidence on chromosomes 1 and 20 related to total energy intake rather than to the intake of specific macronutrients. In the blacks, promising linkages for macronutrient intakes occurred on chromosomes 12q23-q24.21, 1q32.1, and 7q11.1. Several potential candidate genes are encoded in and around the linkage regions on chromosomes 1p21.2, 12q14.1, and 20q13.13. These are the first reported human quantitative trait loci for dietary energy and macronutrient intakes. Further study may refine these quantitative trait loci to identify potential candidate genes for energy and specific macronutrient intakes that would be amenable to more detailed molecular studies.
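    The Haseman-Elston approach named above can be sketched in its classical form: the squared trait difference of each sibling pair is regressed on the estimated proportion of alleles the pair shares identical by descent (IBD) at a marker, and a negative slope suggests linkage. The sketch below is an assumption-laden illustration. The abstract does not state which Haseman-Elston variant was used, the function name and inputs are hypothetical, and the one-sided significance test on the slope is omitted.

```python
from statistics import mean


def haseman_elston(ibd_share, trait_sib1, trait_sib2):
    """Classical Haseman-Elston regression for sibling pairs.

    ibd_share  : estimated proportion of alleles shared IBD per pair (0 to 1)
    trait_sib1 : trait value of the first sibling in each pair
    trait_sib2 : trait value of the second sibling in each pair
    Returns (intercept, slope) of the ordinary least-squares fit of the
    squared trait difference on IBD sharing.
    """
    if not (len(ibd_share) == len(trait_sib1) == len(trait_sib2)):
        raise ValueError("all inputs must have the same length")
    squared_diff = [(a - b) ** 2 for a, b in zip(trait_sib1, trait_sib2)]
    x_bar = mean(ibd_share)
    y_bar = mean(squared_diff)
    sxx = sum((x - x_bar) ** 2 for x in ibd_share)
    if sxx == 0:
        raise ValueError("IBD sharing must vary across pairs")
    sxy = sum((x - x_bar) * (y - y_bar)
              for x, y in zip(ibd_share, squared_diff))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


if __name__ == "__main__":
    # Toy data: pairs sharing more alleles IBD have more similar trait values.
    intercept, slope = haseman_elston([0.0, 0.5, 1.0],
                                      [3.0, 2.0, 1.0],
                                      [0.0, 0.0, 0.0])
    print(intercept, slope)
```

    In the toy data the slope is negative, the pattern expected when a trait locus lies near the marker. Real analyses would use many pairs and a formal test of the slope.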

  15. Genome-wide linkage scans for type 2 diabetes mellitus in four ethnically diverse populations-significant evidence for linkage on chromosome 4q in African Americans: the Family Investigation of Nephropathy and Diabetes Research Group.

    PubMed

    Malhotra, Alka; Igo, Robert P; Thameem, Farook; Kao, W H Linda; Abboud, Hanna E; Adler, Sharon G; Arar, Nedal H; Bowden, Donald W; Duggirala, Ravindranath; Freedman, Barry I; Goddard, Katrina A B; Ipp, Eli; Iyengar, Sudha K; Kimmel, Paul L; Knowler, William C; Kohn, Orly; Leehey, David; Meoni, Lucy A; Nelson, Robert G; Nicholas, Susanne B; Parekh, Rulan S; Rich, Stephen S; Chen, Yii-Der I; Saad, Mohammed F; Scavini, Marina; Schelling, Jeffrey R; Sedor, John R; Shah, Vallabh O; Taylor, Kent D; Thornley-Brown, Denyse; Zager, Philip G; Horvath, Amanda; Hanson, Robert L

    2009-11-01

    Previous studies have shown that in addition to environmental influences, type 2 diabetes mellitus (T2DM) has a strong genetic component. The goal of the current study is to identify regions of linkage for T2DM in ethnically diverse populations. Phenotypic and genotypic data were obtained from African American (AA; total number of individuals [N] = 1004), American Indian (AI; N = 883), European American (EA; N = 537), and Mexican American (MA; N = 1634) individuals from the Family Investigation of Nephropathy and Diabetes. Non-parametric linkage analysis, using an average of 4404 SNPs, was performed in relative pairs affected with T2DM in each ethnic group. In addition, family-based tests were performed to detect association with T2DM. Statistically significant evidence for linkage was observed on chromosome 4q21.1 (LOD = 3.13; genome-wide p = 0.04) in AA. In addition, a total of 11 regions showed suggestive evidence for linkage (estimated at LOD > 1.71), with the highest LOD scores on chromosomes 12q21.31 (LOD = 2.02) and 22q12.3 (LOD = 2.38) in AA, 2p11.1 (LOD = 2.23) in AI, 6p12.3 (LOD = 2.77) in EA, and 13q21.1 (LOD = 2.24) in MA. While no region overlapped across all ethnic groups, at least five loci showing LOD > 1.71 have been identified in previously published studies. The results from this study provide evidence for the presence of genes affecting T2DM on chromosomes 4q, 12q, and 22q in AA; 6p in EA; 2p in AI; and 13q in MA. The strong evidence for linkage on chromosome 4q in AA provides important information given the paucity of diabetes genetic studies in this population.

  16. ON MODEL SELECTION STRATEGIES TO IDENTIFY GENES UNDERLYING BINARY TRAITS USING GENOME-WIDE ASSOCIATION DATA.

    PubMed

    Wu, Zheyang; Zhao, Hongyu

    2012-01-01

    For more fruitful discoveries of genetic variants associated with diseases in genome-wide association studies, it is important to know whether joint analysis of multiple markers is more powerful than the commonly used single-marker analysis, especially in the presence of gene-gene interactions. This article provides a statistical framework to rigorously address this question through analytical power calculations for common model search strategies to detect binary trait loci: marginal search, exhaustive search, forward search, and two-stage screening search. Our approach incorporates linkage disequilibrium, random genotypes, and correlations among score test statistics of logistic regressions. We derive analytical results under two power definitions: the power of finding all the associated markers and the power of finding at least one associated marker. We also consider two types of error controls: the discovery number control and the Bonferroni type I error rate control. After demonstrating the accuracy of our analytical results by simulations, we apply them to consider a broad genetic model space to investigate the relative performances of different model search strategies. Our analytical study provides rapid computation as well as insights into the statistical mechanism of capturing genetic signals under different genetic models including gene-gene interactions. Even though we focus on genetic association analysis, our results on the power of model selection procedures are clearly very general and applicable to other studies.

  17. ON MODEL SELECTION STRATEGIES TO IDENTIFY GENES UNDERLYING BINARY TRAITS USING GENOME-WIDE ASSOCIATION DATA

    PubMed Central

    Wu, Zheyang; Zhao, Hongyu

    2013-01-01

    For more fruitful discoveries of genetic variants associated with diseases in genome-wide association studies, it is important to know whether joint analysis of multiple markers is more powerful than the commonly used single-marker analysis, especially in the presence of gene-gene interactions. This article provides a statistical framework to rigorously address this question through analytical power calculations for common model search strategies to detect binary trait loci: marginal search, exhaustive search, forward search, and two-stage screening search. Our approach incorporates linkage disequilibrium, random genotypes, and correlations among score test statistics of logistic regressions. We derive analytical results under two power definitions: the power of finding all the associated markers and the power of finding at least one associated marker. We also consider two types of error controls: the discovery number control and the Bonferroni type I error rate control. After demonstrating the accuracy of our analytical results by simulations, we apply them to consider a broad genetic model space to investigate the relative performances of different model search strategies. Our analytical study provides rapid computation as well as insights into the statistical mechanism of capturing genetic signals under different genetic models including gene-gene interactions. Even though we focus on genetic association analysis, our results on the power of model selection procedures are clearly very general and applicable to other studies. PMID:23956610

  18. Privacy-preserving genome-wide association studies on cloud environment using fully homomorphic encryption

    PubMed Central

    2015-01-01

    Objective: Developed sequencing techniques are yielding large-scale genomic data at low cost. A genome-wide association study (GWAS) targeting genetic variations that are significantly associated with a particular disease offers great potential for medical improvement. However, subjects who volunteer their genomic data expose themselves to the risk of privacy invasion; these privacy concerns prevent efficient genomic data sharing. Our goal is to present a cryptographic solution to this problem. Methods: To maintain the privacy of subjects, we propose encryption of all genotype and phenotype data. To allow the cloud to perform meaningful computation in relation to the encrypted data, we use a fully homomorphic encryption scheme. Noting that we can evaluate typical statistics for GWAS from a frequency table, our solution evaluates frequency tables with encrypted genomic and clinical data as input. We propose to use a packing technique for efficient evaluation of these frequency tables. Results: Our solution supports evaluation of the D′ measure of linkage disequilibrium, the Hardy-Weinberg Equilibrium, the χ2 test, etc. In this paper, we take the χ2 test and linkage disequilibrium as examples and demonstrate how we can conduct these algorithms securely and efficiently in an outsourcing setting. We demonstrate with experimentation that secure outsourcing computation of one χ2 test with 10,000 subjects requires about 35 ms and evaluation of one linkage disequilibrium with 10,000 subjects requires about 80 ms. Conclusions: With appropriate encoding and packing technique, cryptographic solutions based on fully homomorphic encryption for secure computations of GWAS can be practical. PMID:26732892

  19. Privacy-preserving genome-wide association studies on cloud environment using fully homomorphic encryption.

    PubMed

    Lu, Wen-Jie; Yamada, Yoshiji; Sakuma, Jun

    2015-01-01

    Developed sequencing techniques are yielding large-scale genomic data at low cost. A genome-wide association study (GWAS) targeting genetic variations that are significantly associated with a particular disease offers great potential for medical improvement. However, subjects who volunteer their genomic data expose themselves to the risk of privacy invasion; these privacy concerns prevent efficient genomic data sharing. Our goal is to present a cryptographic solution to this problem. To maintain the privacy of subjects, we propose encryption of all genotype and phenotype data. To allow the cloud to perform meaningful computation in relation to the encrypted data, we use a fully homomorphic encryption scheme. Noting that we can evaluate typical statistics for GWAS from a frequency table, our solution evaluates frequency tables with encrypted genomic and clinical data as input. We propose to use a packing technique for efficient evaluation of these frequency tables. Our solution supports evaluation of the D' measure of linkage disequilibrium, the Hardy-Weinberg Equilibrium, the χ2 test, etc. In this paper, we take the χ2 test and linkage disequilibrium as examples and demonstrate how we can conduct these algorithms securely and efficiently in an outsourcing setting. We demonstrate with experimentation that secure outsourcing computation of one χ2 test with 10,000 subjects requires about 35 ms and evaluation of one linkage disequilibrium with 10,000 subjects requires about 80 ms. With appropriate encoding and packing technique, cryptographic solutions based on fully homomorphic encryption for secure computations of GWAS can be practical.
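    The observation that typical GWAS statistics follow from a frequency table can be illustrated in plaintext. The sketch below is not the authors' encrypted protocol and involves no homomorphic encryption. It only shows, under stated assumptions, how a χ2 statistic and the D' measure are derived from counts. The function names are hypothetical, the χ2 test is the standard 2x2 Pearson form, and D' is computed from phased haplotype counts (an assumption; the abstract does not say how the tables are laid out).

```python
def chi_square_2x2(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-square statistic for a 2x2 table.

    Illustrative layout: rows are cases/controls, columns are the two
    alleles, so the table is [[a, b], [c, d]].
    """
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        raise ValueError("every row and column must have a nonzero total")
    return n * (a * d - b * c) ** 2 / denominator


def d_prime(n_ab: int, n_aB: int, n_Ab: int, n_AB: int) -> float:
    """|D'| between two biallelic loci from phased haplotype counts.

    n_AB counts haplotypes carrying allele A at locus 1 and B at locus 2,
    and so on for the other three combinations.
    """
    n = n_ab + n_aB + n_Ab + n_AB
    if n == 0:
        raise ValueError("no haplotypes supplied")
    p_a_upper = (n_AB + n_Ab) / n   # frequency of allele A
    p_b_upper = (n_AB + n_aB) / n   # frequency of allele B
    d = n_AB / n - p_a_upper * p_b_upper
    if d == 0:
        return 0.0
    if d > 0:
        d_max = min(p_a_upper * (1 - p_b_upper), (1 - p_a_upper) * p_b_upper)
    else:
        d_max = min(p_a_upper * p_b_upper, (1 - p_a_upper) * (1 - p_b_upper))
    if d_max == 0:
        raise ValueError("D' is undefined when a locus is monomorphic")
    return abs(d) / d_max


if __name__ == "__main__":
    print(chi_square_2x2(10, 20, 30, 40))
    print(d_prime(n_ab=50, n_aB=0, n_Ab=0, n_AB=50))
```

    Both statistics need only sums and products of counts, which is consistent with the abstract's point that frequency tables are the natural unit to evaluate. How the divisions and comparisons are handled under encryption is not described in the abstract and is not modeled here.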

  20. Identifying Specific Genes Controlling Complex Traits Through A Genome-Wide Screen For cis-Acting Regulatory Elements - An Example Using Marek's Disease

    USDA-ARS's Scientific Manuscript database

    The identification of specific genes underlying phenotypic variation of complex traits remains one of the greatest challenges in biology despite having genome sequences and more powerful tools. Most genome-wide screens lack sufficient resolving power as they typically depend on linkage. One altern...

  1. Genomic prediction in contrast to a genome-wide association study in explaining heritable variation of complex growth traits in breeding populations of Eucalyptus.

    PubMed

    Müller, Bárbara S F; Neves, Leandro G; de Almeida Filho, Janeo E; Resende, Márcio F R; Muñoz, Patricio R; Dos Santos, Paulo E T; Filho, Estefano Paludzyszyn; Kirst, Matias; Grattapaglia, Dario

    2017-07-11

    The advent of high-throughput genotyping technologies coupled to genomic prediction methods established a new paradigm to integrate genomics and breeding. We carried out whole-genome prediction and contrasted it to a genome-wide association study (GWAS) for growth traits in breeding populations of Eucalyptus benthamii (n = 505) and Eucalyptus pellita (n = 732). Both species are of increasing commercial interest for the development of germplasm adapted to environmental stresses. Predictive ability reached 0.16 in E. benthamii and 0.44 in E. pellita for diameter growth. Predictive abilities using either Genomic BLUP or different Bayesian methods were similar, suggesting that growth adequately fits the infinitesimal model. Genomic prediction models using ~5000-10,000 SNPs provided predictive abilities equivalent to using all 13,787 and 19,506 SNPs genotyped in the E. benthamii and E. pellita populations, respectively. No difference was detected in predictive ability when different sets of SNPs were utilized, based on position (equidistantly genome-wide, inside genes, linkage disequilibrium pruned or on single chromosomes), as long as the total number of SNPs used was above ~5000. Predictive abilities obtained by removing relatedness between training and validation sets fell near zero for E. benthamii and were halved for E. pellita. These results corroborate the current view that relatedness is the main driver of genomic prediction, although some short-range historical linkage disequilibrium (LD) was likely captured for E. pellita. A GWAS identified only one significant association for volume growth in E. pellita, illustrating the fact that while genome-wide regression is able to account for large proportions of the heritability, very little or none of it is captured into significant associations using GWAS in breeding populations of the size evaluated in this study. This study provides further experimental data supporting positive prospects of using genome-wide data to

  2. Genome-Wide Discovery of Drug-Dependent Human Liver Regulatory Elements

    PubMed Central

    Morrissey, Kari M.; Luizon, Marcelo R.; Hoffmann, Thomas J.; Sun, Xuefeng; Jones, Stacy L.; Force Aldred, Shelley; Ramamoorthy, Anuradha; Desta, Zeruesenay; Liu, Yunlong; Skaar, Todd C.; Trinklein, Nathan D.; Giacomini, Kathleen M.; Ahituv, Nadav

    2014-01-01

    Inter-individual variation in gene regulatory elements is hypothesized to play a causative role in adverse drug reactions and reduced drug activity. However, relatively little is known about the location and function of drug-dependent elements. To uncover drug-associated elements in a genome-wide manner, we performed RNA-seq and ChIP-seq using antibodies against the pregnane X receptor (PXR) and three active regulatory marks (p300, H3K4me1, H3K27ac) on primary human hepatocytes treated with rifampin or vehicle control. Rifampin and PXR were chosen since they are part of the CYP3A4 pathway, which is known to account for the metabolism of more than 50% of all prescribed drugs. We selected 227 proximal promoters for genes with rifampin-dependent expression or nearby PXR/p300 occupancy sites and assayed their ability to induce luciferase in rifampin-treated HepG2 cells, finding only 10 (4.4%) that exhibited drug-dependent activity. As this result suggested a role for distal enhancer modules, we searched more broadly to identify 1,297 genomic regions bearing a conditional PXR occupancy as well as all three active regulatory marks. These regions are enriched near genes that function in the metabolism of xenobiotics, specifically members of the cytochrome P450 family. We performed enhancer assays in rifampin-treated HepG2 cells for 42 of these sequences as well as 7 sequences that overlap linkage-disequilibrium blocks defined by lead SNPs from pharmacogenomic GWAS studies, revealing 15/42 and 4/7 to be functional enhancers, respectively. A common African haplotype in one of these enhancers in the GSTA locus was found to exhibit potential rifampin hypersensitivity. Combined, our results further suggest that enhancers are the predominant targets of rifampin-induced PXR activation, provide a genome-wide catalog of PXR targets and serve as a model for the identification of drug-responsive regulatory elements. PMID:25275310

  3. GeNemo: a search engine for web-based functional genomic data.

    PubMed

    Zhang, Yongqing; Cao, Xiaoyi; Zhong, Sheng

    2016-07-08

    A set of new data types emerged from functional genomic assays, including ChIP-seq, DNase-seq, FAIRE-seq and others. The results are typically stored as genome-wide intensities (WIG/bigWig files) or functional genomic regions (peak/BED files). These data types present new challenges to big data science. Here, we present GeNemo, a web-based search engine for functional genomic data. GeNemo searches user-input data against online functional genomic datasets, including the entire collection of ENCODE and mouse ENCODE datasets. Unlike text-based search engines, GeNemo's searches are based on pattern matching of functional genomic regions. This distinguishes GeNemo from text or DNA sequence searches. The user can input any complete or partial functional genomic dataset, for example, a binding intensity file (bigWig) or a peak file. GeNemo reports any genomic regions, ranging from hundred bases to hundred thousand bases, from any of the online ENCODE datasets that share similar functional (binding, modification, accessibility) patterns. This is enabled by a Markov Chain Monte Carlo-based maximization process, executed on up to 24 parallel computing threads. By clicking on a search result, the user can visually compare her/his data with the found datasets and navigate the identified genomic regions. GeNemo is available at www.genemo.org. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

  4. Genome-wide population structure and evolutionary history of the Frizarta dairy sheep.

    PubMed

    Kominakis, A; Hager-Theodorides, A L; Saridaki, A; Antonakos, G; Tsiamis, G

    2017-10-01

    In the present study, we used genomic data, generated with a medium density single nucleotide polymorphisms (SNP) array, to acquire more information on the population structure and evolutionary history of the synthetic Frizarta dairy sheep. First, two typical measures of linkage disequilibrium (LD) were estimated at various physical distances that were then used to make inferences on the effective population size at key past time points. Population structure was also assessed by both multidimensional scaling analysis and k-means clustering on the distance matrix obtained from the animals' genomic relationships. The Wright's fixation FST index was also employed to assess herds' genetic homogeneity and to indirectly estimate past migration rates. The Wright's fixation FIS index and genomic inbreeding coefficients based on the genomic relationship matrix as well as on runs of homozygosity were also estimated. The Frizarta breed displays relatively low LD levels with r2 and |D'| equal to 0.18 and 0.50, respectively, at an average inter-marker distance of 31 kb. Linkage disequilibrium decayed rapidly by distance and persisted over just a few thousand base pairs. Rate of LD decay (β) varied widely among the 26 autosomes with larger values estimated for shorter chromosomes (e.g. β=0.057, for OAR6) and smaller values for longer ones (e.g. β=0.022, for OAR2). The inferred effective population size at the beginning of the breed's formation was as high as 549, was then reduced to 463 in 1981 (end of the breed's formation) and further declined to 187, one generation ago. Multidimensional scaling analysis and k-means clustering suggested a genetically homogenous population, FST estimates indicated relatively low genetic differentiation between herds, whereas a heat map of the animals' genomic kinship relationships revealed a stratified population, at a herd level. Estimates of genomic inbreeding coefficients suggested that most recent parental relatedness may have been a

  5. Combined approach for finding susceptibility genes in DISH/chondrocalcinosis families: whole-genome-wide linkage and IBS/IBD studies.

    PubMed

    Couto, Ana Rita; Parreira, Bruna; Thomson, Russell; Soares, Marta; Power, Deborah M; Stankovich, Jim; Armas, Jácome Bruges; Brown, Matthew A

    2017-01-01

    Twelve families with exuberant and early-onset calcium pyrophosphate dehydrate chondrocalcinosis (CC) and diffuse idiopathic skeletal hyperostosis (DISH), hereafter designated DISH/CC, were identified in Terceira Island, the Azores, Portugal. Ninety-two (92) individuals from these families were selected for whole-genome-wide linkage analysis. An identity-by-descent (IBD) analysis was performed in 10 individuals from 5 of the investigated pedigrees. The chromosome area with the maximal logarithm of the odds score (1.32; P=0.007) was not identified using the IBD/identity-by-state (IBS) analysis; therefore, it was not investigated further. From the IBD/IBS analysis, two candidate genes, LEMD3 and RSPO4, were identified and sequenced. Nine genetic variants were identified in the RSPO4 gene; one regulatory variant (rs146447064) was significantly more frequent in control individuals than in DISH/CC patients (P=0.03). Four variants were identified in LEMD3, and the rs201930700 variant was further investigated using segregation analysis. None of the genetic variants in RSPO4 or LEMD3 segregated within the studied families. Therefore, although a major genetic effect was shown to determine DISH/CC occurrence within these families, the specific genetic variants involved were not identified.

  6. Combined approach for finding susceptibility genes in DISH/chondrocalcinosis families: whole-genome-wide linkage and IBS/IBD studies

    PubMed Central

    Couto, Ana Rita; Parreira, Bruna; Thomson, Russell; Soares, Marta; Power, Deborah M; Stankovich, Jim; Armas, Jácome Bruges; Brown, Matthew A

    2017-01-01

    Twelve families with exuberant and early-onset calcium pyrophosphate dehydrate chondrocalcinosis (CC) and diffuse idiopathic skeletal hyperostosis (DISH), hereafter designated DISH/CC, were identified in Terceira Island, the Azores, Portugal. Ninety-two (92) individuals from these families were selected for whole-genome-wide linkage analysis. An identity-by-descent (IBD) analysis was performed in 10 individuals from 5 of the investigated pedigrees. The chromosome area with the maximal logarithm of the odds score (1.32; P=0.007) was not identified using the IBD/identity-by-state (IBS) analysis; therefore, it was not investigated further. From the IBD/IBS analysis, two candidate genes, LEMD3 and RSPO4, were identified and sequenced. Nine genetic variants were identified in the RSPO4 gene; one regulatory variant (rs146447064) was significantly more frequent in control individuals than in DISH/CC patients (P=0.03). Four variants were identified in LEMD3, and the rs201930700 variant was further investigated using segregation analysis. None of the genetic variants in RSPO4 or LEMD3 segregated within the studied families. Therefore, although a major genetic effect was shown to determine DISH/CC occurrence within these families, the specific genetic variants involved were not identified. PMID:29104755

  7. GWFASTA: server for FASTA search in eukaryotic and microbial genomes.

    PubMed

    Issac, Biju; Raghava, G P S

    2002-09-01

    Similarity searches are a powerful method for solving important biological problems such as database scanning, evolutionary studies, gene prediction, and protein structure prediction. FASTA is a widely used sequence comparison tool for rapid database scanning. Here we describe the GWFASTA server that was developed to assist the FASTA user in similarity searches against partially and/or completely sequenced genomes. GWFASTA consists of more than 60 microbial genomes, eight eukaryote genomes, and proteomes of annotated genomes. In fact, it provides the maximum number of databases for similarity searching from a single platform. GWFASTA allows the submission of more than one sequence as a single query for a FASTA search. It also provides integrated post-processing of FASTA output, including compositional analysis of proteins, multiple sequences alignment, and phylogenetic analysis. Furthermore, it summarizes the search results organism-wise for prokaryotes and chromosome-wise for eukaryotes. Thus, the integration of different tools for sequence analyses makes GWFASTA a powerful tool for biologists.

  8. Identification of Major Quantitative Trait Loci for Seed Oil Content in Soybeans by Combining Linkage and Genome-Wide Association Mapping.

    PubMed

    Cao, Yongce; Li, Shuguang; Wang, Zili; Chang, Fangguo; Kong, Jiejie; Gai, Junyi; Zhao, Tuanjie

    2017-01-01

    Soybean oil is the most widely produced vegetable oil in the world and its content in soybean seed is an important quality trait in breeding programs. More than 100 quantitative trait loci (QTLs) for soybean oil content have been identified. However, most of them are genotype specific and/or environment sensitive. Here, we used both a linkage and association mapping methodology to dissect the genetic basis of seed oil content of Chinese soybean cultivars in various environments in the Jiang-Huai River Valley. One recombinant inbred line (RIL) population (NJMN-RIL), with 104 lines developed from a cross between M8108 and NN1138-2, was planted in five environments to investigate phenotypic data, and a new genetic map with 2,062 specific-locus amplified fragment markers was constructed to map oil content QTLs. A derived F2 population between MN-5 (a line of NJMN-RIL) and NN1138-2 was also developed to confirm one major QTL. A soybean breeding germplasm population (279 lines) was established to perform a genome-wide association study (GWAS) using 59,845 high-quality single nucleotide polymorphism markers. In the NJMN-RIL population, 8 QTLs were found that explained a range of phenotypic variance from 6.3 to 26.3% in certain planting environments. Among them, qOil-5-1, qOil-10-1, and qOil-14-1 were detected in different environments, and qOil-5-1 was further confirmed using the secondary F2 population. Three loci located on chromosomes 5 and 20 were detected in a 2-year long GWAS, and one locus that overlapped with qOil-5-1 was found repeatedly and treated as the same locus. qOil-5-1 was further localized to a linkage disequilibrium block region of approximately 440 kb. These results will not only increase our understanding of the genetic control of seed oil content in soybean, but will also be helpful in marker-assisted selection for breeding high seed oil content soybean and gene cloning to elucidate the mechanisms of seed oil content.

  9. GStream: Improving SNP and CNV Coverage on Genome-Wide Association Studies

    PubMed Central

    Alonso, Arnald; Marsal, Sara; Tortosa, Raül; Canela-Xandri, Oriol; Julià, Antonio

    2013-01-01

    We present GStream, a method that combines genome-wide SNP and CNV genotyping in the Illumina microarray platform with unprecedented accuracy. This new method outperforms previous well-established SNP genotyping software. More importantly, the CNV calling algorithm of GStream dramatically improves the results obtained by previous state-of-the-art methods and yields an accuracy that is close to that obtained by purely CNV-oriented technologies like Comparative Genomic Hybridization (CGH). We demonstrate the superior performance of GStream using microarray data generated from HapMap samples. Using the reference CNV calls generated by the 1000 Genomes Project (1KGP) and well-known studies on whole genome CNV characterization based either on CGH or genotyping microarray technologies, we show that GStream can increase the number of reliably detected variants up to 25% compared to previously developed methods. Furthermore, the increased genome coverage provided by GStream allows the discovery of CNVs in close linkage disequilibrium with SNPs, previously associated with disease risk in published Genome-Wide Association Studies (GWAS). These results could provide important insights into the biological mechanism underlying the detected disease risk association. With GStream, large-scale GWAS will not only benefit from the combined genotyping of SNPs and CNVs at an unprecedented accuracy, but will also take advantage of the computational efficiency of the method. PMID:23844243

  10. A Comprehensive Linkage Map of the Dog Genome

    PubMed Central

    Wong, Aaron K.; Ruhe, Alison L.; Dumont, Beth L.; Robertson, Kathryn R.; Guerrero, Giovanna; Shull, Sheila M.; Ziegle, Janet S.; Millon, Lee V.; Broman, Karl W.; Payseur, Bret A.; Neff, Mark W.

    2010-01-01

    We have leveraged the reference sequence of a boxer to construct the first complete linkage map for the domestic dog. The new map improves access to the dog's unique biology, from human disease counterparts to fascinating evolutionary adaptations. The map was constructed with ∼3000 microsatellite markers developed from the reference sequence. Familial resources afforded 450 mostly phase-known meioses for map assembly. The genotype data supported a framework map with ∼1500 loci. An additional ∼1500 markers served as map validators, contributing modestly to estimates of recombination rate but supporting the framework content. Data from ∼22,000 SNPs informing on a subset of meioses supported map integrity. The sex-averaged map extended 21 M and revealed marked region- and sex-specific differences in recombination rate. The map will enable empiric coverage estimates and multipoint linkage analysis. Knowledge of the variation in recombination rate will also inform on genomewide patterns of linkage disequilibrium (LD), and thus benefit association, selective sweep, and phylogenetic mapping approaches. The computational and wet-bench strategies can be applied to the reference genome of any nonmodel organism to assemble a de novo linkage map. PMID:19966068

  11. Exploring a Nonmodel Teleost Genome Through RAD Sequencing—Linkage Mapping in Common Pandora, Pagellus erythrinus and Comparative Genomic Analysis

    PubMed Central

    Manousaki, Tereza; Tsakogiannis, Alexandros; Taggart, John B.; Palaiokostas, Christos; Tsaparis, Dimitris; Lagnel, Jacques; Chatziplis, Dimitrios; Magoulas, Antonios; Papandroulakis, Nikos; Mylonas, Constantinos C.; Tsigenopoulos, Costas S.

    2015-01-01

    Common pandora (Pagellus erythrinus) is a benthopelagic marine fish belonging to the teleost family Sparidae, and a newly recruited species in Mediterranean aquaculture. The paucity of genetic information relating to sparids, despite their growing economic value for aquaculture, provides the impetus for exploring the genomics of this fish group. Genomic tool development, such as genetic linkage maps provision, lays the groundwork for linking genotype to phenotype, allowing fine-mapping of loci responsible for beneficial traits. In this study, we applied ddRAD methodology to identify polymorphic markers in a full-sib family of common pandora. Employing the Illumina MiSeq platform, we sampled and sequenced a size-selected genomic fraction of 99 individuals, which led to the identification of 920 polymorphic loci. Downstream mapping analysis resulted in the construction of 24 robust linkage groups, corresponding to the karyotype of the species. The common pandora linkage map showed varying degrees of conserved synteny with four other teleost genomes, namely the European seabass (Dicentrarchus labrax), Nile tilapia (Oreochromis niloticus), stickleback (Gasterosteus aculeatus), and medaka (Oryzias latipes), suggesting a conserved genomic evolution in Sparidae. Our work exploits the possibilities of genotyping by sequencing to gain novel insights into genome structure and evolution. Such information will boost the study of cultured species and will set the foundation for a deeper understanding of the complex evolutionary history of teleosts. PMID:26715088

  12. Exploring a Nonmodel Teleost Genome Through RAD Sequencing-Linkage Mapping in Common Pandora, Pagellus erythrinus and Comparative Genomic Analysis.

    PubMed

    Manousaki, Tereza; Tsakogiannis, Alexandros; Taggart, John B; Palaiokostas, Christos; Tsaparis, Dimitris; Lagnel, Jacques; Chatziplis, Dimitrios; Magoulas, Antonios; Papandroulakis, Nikos; Mylonas, Constantinos C; Tsigenopoulos, Costas S

    2015-12-29

    Common pandora (Pagellus erythrinus) is a benthopelagic marine fish belonging to the teleost family Sparidae, and a newly recruited species in Mediterranean aquaculture. The paucity of genetic information relating to sparids, despite their growing economic value for aquaculture, provides the impetus for exploring the genomics of this fish group. Genomic tool development, such as genetic linkage maps provision, lays the groundwork for linking genotype to phenotype, allowing fine-mapping of loci responsible for beneficial traits. In this study, we applied ddRAD methodology to identify polymorphic markers in a full-sib family of common pandora. Employing the Illumina MiSeq platform, we sampled and sequenced a size-selected genomic fraction of 99 individuals, which led to the identification of 920 polymorphic loci. Downstream mapping analysis resulted in the construction of 24 robust linkage groups, corresponding to the karyotype of the species. The common pandora linkage map showed varying degrees of conserved synteny with four other teleost genomes, namely the European seabass (Dicentrarchus labrax), Nile tilapia (Oreochromis niloticus), stickleback (Gasterosteus aculeatus), and medaka (Oryzias latipes), suggesting a conserved genomic evolution in Sparidae. Our work exploits the possibilities of genotyping by sequencing to gain novel insights into genome structure and evolution. Such information will boost the study of cultured species and will set the foundation for a deeper understanding of the complex evolutionary history of teleosts. Copyright © 2016 Manousaki et al.

  13. Genome-wide association study of alcohol dependence

    PubMed Central

    Treutlein, Jens; Cichon, Sven; Ridinger, Monika; Wodarz, Norbert; Soyka, Michael; Zill, Peter; Maier, Wolfgang; Moessner, Rainald; Gaebel, Wolfgang; Dahmen, Norbert; Fehr, Christoph; Scherbaum, Norbert; Steffens, Michael; Ludwig, Kerstin U.; Frank, Josef; Wichmann, H.- Erich; Schreiber, Stefan; Dragano, Nico; Sommer, Wolfgang; Leonardi-Essmann, Fernando; Lourdusamy, Anbarasu; Gebicke-Haerter, Peter; Wienker, Thomas F.; Sullivan, Patrick F.; Nöthen, Markus M.; Kiefer, Falk; Spanagel, Rainer; Mann, Karl; Rietschel, Marcella

    2014-01-01

    Context: Identification of genes contributing to alcohol dependence will improve our understanding of the mechanisms underlying this disorder. Objective: To identify susceptibility genes for alcohol dependence through a genome-wide association study (GWAS) and follow-up study in a population of German male inpatients with an early age at onset. Design: The GWAS included 487 male inpatients with DSM-IV alcohol dependence with an age at onset below 28 years and 1,358 population based control individuals. The follow-up study included 1,024 male inpatients and 996 age-matched male controls. All subjects were of German descent. The GWAS tested 524,396 single nucleotide polymorphisms (SNPs). All SNPs with p<10⁻⁴ were subjected to the follow-up study. In addition, nominally significant SNPs from those genes that had also shown expression changes in rat brains after chronic alcohol consumption were selected for the follow-up step. Results: The GWAS produced 121 SNPs with nominal p<10⁻⁴. These, together with 19 additional SNPs from homologs of rat genes showing differential expression, were genotyped in the follow-up sample. Fifteen SNPs showed significant association with the same allele as in the GWAS. In the combined analysis, two closely linked intergenic SNPs met genome-wide significance (rs7590720 p=9.72×10⁻⁹; rs1344694 p=1.69×10⁻⁸). They are located on chromosome 2q35, a region which has been implicated in linkage studies for alcohol phenotypes. Nine SNPs were located in genes, including CDH13 and ADH1C genes which have been reported to be associated with alcohol dependence. Conclusion: This is the first GWAS and follow-up study to identify a genome-wide significant association in alcohol dependence. Further independent studies are required to confirm these findings. PMID:19581569
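
    The abstract does not say which genome-wide significance threshold was applied. As a rough illustration only, the sketch below assumes a Bonferroni correction at a family-wise alpha of 0.05 (an assumption, not taken from the record) over the 524,396 SNPs tested, and checks the two reported combined-analysis p-values against it. The function name is hypothetical.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests

# Number of SNPs tested, as reported in the abstract.
N_SNPS = 524_396

# alpha = 0.05 is an assumed family-wise error rate, not stated in the record.
threshold = bonferroni_threshold(0.05, N_SNPS)

# Combined-analysis p-values as reported in the abstract.
p_values = {"rs7590720": 9.72e-9, "rs1344694": 1.69e-8}

# True for each SNP whose p-value falls below the assumed threshold.
passing = {snp: p < threshold for snp, p in p_values.items()}
```

    Under these assumptions both SNPs fall below the per-test threshold; a stricter or differently derived threshold would need to be checked separately.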

  14. Genome-wide association studies and resting heart rate.

    PubMed

    Kilpeläinen, Tuomas O

    Genome-wide association studies (GWASs) have revolutionized the search for genetic variants regulating resting heart rate. In the last 10 years, GWASs have led to the identification of at least 21 novel heart rate loci. These discoveries have provided valuable insights into the mechanisms and pathways that regulate heart rate and link heart rate to cardiovascular morbidity and mortality. GWASs capture the majority of genetic variation in a population sample by utilizing high-throughput genotyping chips measuring genotypes for up to several million SNPs across the genome in thousands of individuals. This allows the identification of the strongest heart rate-associated signals at the genome-wide level. While GWASs provide robust statistical evidence of the association of a given genetic locus with heart rate, they are only the starting point for detailed follow-up studies to locate the causal variants and genes and gain further insights into the biological mechanisms underlying the observed associations. Copyright © 2016 Elsevier Inc. All rights reserved.

  15. Genomic selection and complex trait prediction using a fast EM algorithm applied to genome-wide markers

    PubMed Central

    2010-01-01

    Background: The information provided by dense genome-wide markers using high throughput technology is of considerable potential in human disease studies and livestock breeding programs. Genome-wide association studies relate individual single nucleotide polymorphisms (SNP) from dense SNP panels to individual measurements of complex traits, with the underlying assumption being that any association is caused by linkage disequilibrium (LD) between SNP and quantitative trait loci (QTL) affecting the trait. Often SNP are in genomic regions of no trait variation. Whole genome Bayesian models are an effective way of incorporating this and other important prior information into modelling. However a full Bayesian analysis is often not feasible due to the large computational time involved. Results: This article proposes an expectation-maximization (EM) algorithm called emBayesB which allows only a proportion of SNP to be in LD with QTL and incorporates prior information about the distribution of SNP effects. The posterior probability of being in LD with at least one QTL is calculated for each SNP along with estimates of the hyperparameters for the mixture prior. A simulated example of genomic selection from an international workshop is used to demonstrate the features of the EM algorithm. The accuracy of prediction is comparable to a full Bayesian analysis but the EM algorithm is considerably faster. The EM algorithm was accurate in locating QTL which explained more than 1% of the total genetic variation. A computational algorithm for very large SNP panels is described. Conclusions: emBayesB is a fast and accurate EM algorithm for implementing genomic selection and predicting complex traits by mapping QTL in genome-wide dense SNP marker data. Its accuracy is similar to Bayesian methods but it takes only a fraction of the time. PMID:20969788

  16. Construction and Annotation of a High Density SNP Linkage Map of the Atlantic Salmon (Salmo salar) Genome.

    PubMed

    Tsai, Hsin Y; Robledo, Diego; Lowe, Natalie R; Bekaert, Michael; Taggart, John B; Bron, James E; Houston, Ross D

    2016-07-07

    High density linkage maps are useful tools for fine-scale mapping of quantitative trait loci, and characterization of the recombination landscape of a species' genome. Genomic resources for Atlantic salmon (Salmo salar) include a well-assembled reference genome, and high density single nucleotide polymorphism (SNP) arrays. Our aim was to create a high density linkage map, and to align it with the reference genome assembly. Over 96,000 SNPs were mapped and ordered on the 29 salmon linkage groups using a pedigreed population comprising 622 fish from 60 nuclear families, all genotyped with the 'ssalar01' high density SNP array. The number of SNPs per group showed a high positive correlation with physical chromosome length (r = 0.95). While the order of markers on the genetic and physical maps was generally consistent, areas of discrepancy were identified. Approximately 6.5% of the previously unmapped reference genome sequence was assigned to chromosomes using the linkage map. Male recombination rate was lower than females across the vast majority of the genome, but with a notable peak in subtelomeric regions. Finally, using RNA-Seq data to annotate the reference genome, the mapped SNPs were categorized according to their predicted function, including annotation of ∼2500 putative nonsynonymous variants. The highest density SNP linkage map for any salmonid species has been created, annotated, and integrated with the Atlantic salmon reference genome assembly. This map highlights the marked heterochiasmy of salmon, and provides a useful resource for salmonid genetics and genomics research. Copyright © 2016 Tsai et al.

  17. A genome-wide linkage and association study of musical aptitude identifies loci containing genes related to inner ear development and neurocognitive functions

    PubMed Central

    Oikkonen, J.; Huang, Y.; Onkamo, P.; Ukkola-Vuoti, L.; Raijas, P.; Karma, K.; Vieland, V. J.; Järvelä, I.

    2014-01-01

    Humans have developed the perception, production and processing of sounds into the art of music. A genetic contribution to these skills of musical aptitude has long been suggested. We performed a genome-wide scan in 76 pedigrees (767 individuals) characterized for the ability to discriminate pitch (SP), duration (ST) and sound patterns (KMT), which are primary capacities for music perception. Using the Bayesian linkage and association approach implemented in program package KELVIN, especially designed for complex pedigrees, several SNPs near genes affecting the functions of the auditory pathway and neurocognitive processes were identified. The strongest association was found at 3q21.3 (rs9854612) with combined SP, ST and KMT test scores (COMB). This region is located a few dozen kilobases upstream of the GATA binding protein 2 (GATA2) gene. GATA2 regulates the development of cochlear hair cells and the inferior colliculus (IC), which are important in tonotopic mapping. The highest probability of linkage was obtained for phenotype SP at 4p14, located next to the region harboring the protocadherin 7 gene, PCDH7. Two SNPs rs13146789 and rs13109270 of PCDH7 showed strong association. PCDH7 has been suggested to play a role in cochlear and amygdaloid complexes. Functional class analysis showed that inner ear and schizophrenia related genes were enriched inside the linked regions. This study is the first to show the importance of auditory pathway genes in musical aptitude. PMID:24614497

  18. A genome-wide linkage and association study of musical aptitude identifies loci containing genes related to inner ear development and neurocognitive functions.

    PubMed

    Oikkonen, J; Huang, Y; Onkamo, P; Ukkola-Vuoti, L; Raijas, P; Karma, K; Vieland, V J; Järvelä, I

    2015-02-01

    Humans have developed the perception, production and processing of sounds into the art of music. A genetic contribution to these skills of musical aptitude has long been suggested. We performed a genome-wide scan in 76 pedigrees (767 individuals) characterized for the ability to discriminate pitch (SP), duration (ST) and sound patterns (KMT), which are primary capacities for music perception. Using the Bayesian linkage and association approach implemented in program package KELVIN, especially designed for complex pedigrees, several single nucleotide polymorphisms (SNPs) near genes affecting the functions of the auditory pathway and neurocognitive processes were identified. The strongest association was found at 3q21.3 (rs9854612) with combined SP, ST and KMT test scores (COMB). This region is located a few dozen kilobases upstream of the GATA binding protein 2 (GATA2) gene. GATA2 regulates the development of cochlear hair cells and the inferior colliculus (IC), which are important in tonotopic mapping. The highest probability of linkage was obtained for phenotype SP at 4p14, located next to the region harboring the protocadherin 7 gene, PCDH7. Two SNPs rs13146789 and rs13109270 of PCDH7 showed strong association. PCDH7 has been suggested to play a role in cochlear and amygdaloid complexes. Functional class analysis showed that inner ear and schizophrenia-related genes were enriched inside the linked regions. This study is the first to show the importance of auditory pathway genes in musical aptitude.

  19. Genomic rearrangements and signatures of breeding in the allo-octoploid strawberry as revealed through an allele dose based SSR linkage map

    PubMed Central

    2014-01-01

    Background: Breeders in the allo-octoploid strawberry currently make little use of molecular marker tools. As a first step of a QTL discovery project on fruit quality traits and resistance to soil-borne pathogens such as Phytophthora cactorum and Verticillium we built a genome-wide SSR linkage map for the cross Holiday x Korona. We used the previously published MADCE method to obtain full haplotype information for both of the parental cultivars, facilitating in-depth studies on their genomic organisation. Results: The linkage map incorporates 508 segregating loci and represents each of the 28 chromosome pairs of octoploid strawberry, spanning an estimated length of 2050 cM. The sub-genomes are denoted according to their sequence divergence from F. vesca as revealed by marker performance. The map revealed high overall synteny between the sub-genomes, but also revealed two large inversions on LG2C and LG2D, of which the latter was confirmed using a separate mapping population. We discovered interesting breeding features within the parental cultivars by in-depth analysis of our haplotype data. The linkage map-derived homozygosity level of Holiday was similar to the pedigree-derived inbreeding level (33% and 29%, respectively). For Korona we found that the observed homozygosity level was over three times higher than expected from the pedigree (13% versus 3.6%). This could indicate selection pressure on genes that have favourable effects in homozygous states. The level of kinship between Holiday and Korona derived from our linkage map was 2.5 times higher than the pedigree-derived value. This large difference could be evidence of selection pressure enacted by strawberry breeders towards specific haplotypes. Conclusion: The obtained SSR linkage map provides a good base for QTL discovery. It also provides the first biologically relevant basis for the discernment and notation of sub-genomes. For the first time, we revealed genomic rearrangements that were verified in a

  20. Genomic rearrangements and signatures of breeding in the allo-octoploid strawberry as revealed through an allele dose based SSR linkage map.

    PubMed

    van Dijk, Thijs; Pagliarani, Giulia; Pikunova, Anna; Noordijk, Yolanda; Yilmaz-Temel, Hulya; Meulenbroek, Bert; Visser, Richard G F; van de Weg, Eric

    2014-03-01

    Breeders in the allo-octoploid strawberry currently make little use of molecular marker tools. As a first step of a QTL discovery project on fruit quality traits and resistance to soil-borne pathogens such as Phytophthora cactorum and Verticillium we built a genome-wide SSR linkage map for the cross Holiday x Korona. We used the previously published MADCE method to obtain full haplotype information for both of the parental cultivars, facilitating in-depth studies on their genomic organisation. The linkage map incorporates 508 segregating loci and represents each of the 28 chromosome pairs of octoploid strawberry, spanning an estimated length of 2050 cM. The sub-genomes are denoted according to their sequence divergence from F. vesca as revealed by marker performance. The map revealed high overall synteny between the sub-genomes, but also revealed two large inversions on LG2C and LG2D, of which the latter was confirmed using a separate mapping population. We discovered interesting breeding features within the parental cultivars by in-depth analysis of our haplotype data. The linkage map-derived homozygosity level of Holiday was similar to the pedigree-derived inbreeding level (33% and 29%, respectively). For Korona we found that the observed homozygosity level was over three times higher than expected from the pedigree (13% versus 3.6%). This could indicate selection pressure on genes that have favourable effects in homozygous states. The level of kinship between Holiday and Korona derived from our linkage map was 2.5 times higher than the pedigree-derived value. This large difference could be evidence of selection pressure enacted by strawberry breeders towards specific haplotypes. The obtained SSR linkage map provides a good base for QTL discovery. It also provides the first biologically relevant basis for the discernment and notation of sub-genomes. For the first time, we revealed genomic rearrangements that were verified in a separate mapping population. We

  1. An Enhanced Linkage Map of the Sheep Genome Comprising More Than 1000 Loci

    PubMed Central

    Maddox, Jillian F.; Davies, Kizanne P.; Crawford, Allan M.; Hulme, Dennis J.; Vaiman, Daniel; Cribiu, Edmond P.; Freking, Bradley A.; Beh, Ken J.; Cockett, Noelle E.; Kang, Nina; Riffkin, Christopher D.; Drinkwater, Roger; Moore, Stephen S.; Dodds, Ken G.; Lumsden, Joanne M.; van Stijn, Tracey C.; Phua, Sin H.; Adelson, David L.; Burkin, Heather R.; Broom, Judith E.; Buitkamp, Johannes; Cambridge, Lisa; Cushwa, William T.; Gerard, Emily; Galloway, Susan M.; Harrison, Blair; Hawken, Rachel J.; Hiendleder, Stefan; Henry, Hannah M.; Medrano, Juan F.; Paterson, Korena A.; Schibler, Laurent; Stone, Roger T.; van Hest, Beryl

    2001-01-01

    A medium-density linkage map of the ovine genome has been developed. Marker data for 550 new loci were generated and merged with the previous sheep linkage map. The new map comprises 1093 markers representing 1062 unique loci (941 anonymous loci, 121 genes) and spans 3500 cM (sex-averaged) for the autosomes and 132 cM (female) on the X chromosome. There is an average spacing of 3.4 cM between autosomal loci and 8.3 cM between highly polymorphic [polymorphic information content (PIC) ≥ 0.7] autosomal loci. The largest gap between markers is 32.5 cM, and the number of gaps of >20 cM between loci, or regions where loci are missing from chromosome ends, has been reduced from 40 in the previous map to 6. Five hundred and seventy-three of the loci can be ordered on a framework map with odds of >1000 : 1. The sheep linkage map contains strong links to both the cattle and goat maps. Five hundred and seventy-two of the loci positioned on the sheep linkage map have also been mapped by linkage analysis in cattle, and 209 of the loci mapped on the sheep linkage map have also been placed on the goat linkage map. Inspection of ruminant linkage maps indicates that the genomic coverage by the current sheep linkage map is comparable to that of the available cattle maps. The sheep map provides a valuable resource to the international sheep, cattle, and goat gene mapping community. PMID:11435411

  2. Construction of Ultradense Linkage Maps with Lep-MAP2: Stickleback F2 Recombinant Crosses as an Example

    PubMed Central

    Rastas, Pasi; Calboli, Federico C. F.; Guo, Baocheng; Shikano, Takahito; Merilä, Juha

    2016-01-01

    High-density linkage maps are important tools for genome biology and evolutionary genetics by quantifying the extent of recombination, linkage disequilibrium, and chromosomal rearrangements across chromosomes, sexes, and populations. They provide one of the best ways to validate and refine de novo genome assemblies, with the power to identify errors in assemblies increasing with marker density. However, assembly of high-density linkage maps is still challenging due to software limitations. We describe Lep-MAP2, a software for ultradense genome-wide linkage map construction. Lep-MAP2 can handle various family structures and can account for achiasmatic meiosis to gain linkage map accuracy. Simulations show that Lep-MAP2 outperforms other available mapping software both in computational efficiency and accuracy. When applied to two large F2-generation recombinant crosses between two nine-spined stickleback (Pungitius pungitius) populations, it produced two high-density (∼6 markers/cM) linkage maps containing 18,691 and 20,054 single nucleotide polymorphisms. The two maps showed a high degree of synteny, but female maps were 1.5–2 times longer than male maps in all linkage groups, suggesting genome-wide recombination suppression in males. Comparison with the genome sequence of the three-spined stickleback (Gasterosteus aculeatus) revealed a high degree of interspecific synteny with a low frequency (<5%) of interchromosomal rearrangements. However, a fairly large (ca. 10 Mb) translocation from autosome to sex chromosome was detected in both maps. These results illustrate the utility and novel features of Lep-MAP2 in assembling high-density linkage maps, and their usefulness in revealing evolutionarily interesting properties of genomes, such as strong genome-wide sex bias in recombination rates. PMID:26668116

  3. Genome-wide search followed by replication reveals genetic interaction of CD80 and ALOX5AP associated with systemic lupus erythematosus in Asian populations.

    PubMed

    Zhang, Yan; Yang, Jing; Zhang, Jing; Sun, Liangdan; Hirankarn, Nattiya; Pan, Hai-Feng; Lau, Chak Sing; Chan, Tak Mao; Lee, Tsz Leung; Leung, Alexander Moon Ho; Mok, Chi Chiu; Zhang, Lu; Wang, Yongfei; Shen, Jiangshan Jane; Wong, Sik Nin; Lee, Ka Wing; Ho, Marco Hok Kung; Lee, Pamela Pui Wah; Chung, Brian Hon-Yin; Chong, Chun Yin; Wong, Raymond Woon Sing; Mok, Mo Yin; Wong, Wilfred Hing Sang; Tong, Kwok Lung; Tse, Niko Kei Chiu; Li, Xiang-Pei; Avihingsanon, Yingyos; Rianthavorn, Pornpimol; Deekajorndej, Thavatchai; Suphapeetiporn, Kanya; Shotelersuk, Vorasuk; Ying, Shirley King Yee; Fung, Samuel Ka Shun; Lai, Wai Ming; Wong, Chun-Ming; Ng, Irene Oi Lin; Garcia-Barcelo, Maria-Merce; Cherny, Stacey S; Cui, Yong; Sham, Pak Chung; Yang, Sen; Ye, Dong-Qing; Zhang, Xue-Jun; Lau, Yu Lung; Yang, Wanling

    2016-05-01

    Genetic interaction has been considered as a hallmark of the genetic architecture of systemic lupus erythematosus (SLE). Based on two independent genome-wide association studies (GWAS) on Chinese populations, we performed a genome-wide search for genetic interactions contributing to SLE susceptibility. The study involved a total of 1 659 cases and 3 398 controls in the discovery stage and 2 612 cases and 3 441 controls in three cohorts for replication. Logistic regression and multifactor dimensionality reduction were used to search for genetic interaction. Interaction of CD80 (rs2222631) and ALOX5AP (rs12876893) was found to be significantly associated with SLE (OR_int=1.16, P_int_all=7.7E-04 at false discovery rate<0.05). Single nucleotide polymorphism rs2222631 was found associated with SLE with genome-wide significance (P_all=4.5E-08, OR=0.86) and is independent of rs6804441 in CD80, whose association was reported previously. Significant correlation was observed between expression of these two genes in healthy controls and SLE cases, together with differential expression of these genes between cases and controls, observed from individuals from the Hong Kong cohort. Genetic interactions between BLK (rs13277113) and DDX6 (rs4639966), and between TNFSF4 (rs844648) and PXK (rs6445975) were also observed in both GWAS data sets. Our study represents the first genome-wide evaluation of epistasis interactions on SLE and the findings suggest interactions and independent variants may help partially explain missing heritability for complex diseases. Published by the BMJ Publishing Group Limited.

  4. Gigwa-Genotype investigator for genome-wide analyses.

    PubMed

    Sempéré, Guilhem; Philippe, Florian; Dereeper, Alexis; Ruiz, Manuel; Sarah, Gautier; Larmande, Pierre

    2016-06-06

    Exploring the structure of genomes and analyzing their evolution is essential to understanding the ecological adaptation of organisms. However, with the large amounts of data being produced by next-generation sequencing, computational challenges arise in terms of storage, search, sharing, analysis and visualization. This is particularly true with regards to studies of genomic variation, which are currently lacking scalable and user-friendly data exploration solutions. Here we present Gigwa, a web-based tool that provides an easy and intuitive way to explore large amounts of genotyping data by filtering it not only on the basis of variant features, including functional annotations, but also on genotype patterns. The data storage relies on MongoDB, which offers good scalability properties. Gigwa can handle multiple databases and may be deployed in either single- or multi-user mode. In addition, it provides a wide range of popular export formats. The Gigwa application is suitable for managing large amounts of genomic variation data. Its user-friendly web interface makes such processing widely accessible. It can either be simply deployed on a workstation or be used to provide a shared data portal for a given community of researchers.

  5. LD Score Regression Distinguishes Confounding from Polygenicity in Genome-Wide Association Studies

    PubMed Central

    Bulik-Sullivan, Brendan K.; Loh, Po-Ru; Finucane, Hilary; Ripke, Stephan; Yang, Jian; Patterson, Nick; Daly, Mark J.; Price, Alkes L.; Neale, Benjamin M.

    2015-01-01

    Both polygenicity (i.e., many small genetic effects) and confounding biases, such as cryptic relatedness and population stratification, can yield an inflated distribution of test statistics in genome-wide association studies (GWAS). However, current methods cannot distinguish between inflation from true polygenic signal and bias. We have developed an approach, LD Score regression, that quantifies the contribution of each by examining the relationship between test statistics and linkage disequilibrium (LD). The LD Score regression intercept can be used to estimate a more powerful and accurate correction factor than genomic control. We find strong evidence that polygenicity accounts for the majority of test statistic inflation in many GWAS of large sample size. PMID:25642630
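
    The relationship the abstract describes can be sketched with the standard LD Score regression model, in which the expected chi-square statistic of SNP j is N·h²·ℓ_j/M + N·a + 1, where ℓ_j is the SNP's LD Score, N the sample size, M the number of SNPs, h² the SNP heritability and a the contribution of confounding. The abstract itself does not give this equation. The example below is a minimal illustration that fits an unweighted ordinary least-squares line with NumPy; the published method uses a weighted regression with resampling-based standard errors, which this sketch does not attempt. The function name and the synthetic inputs are hypothetical.

```python
import numpy as np

def fit_ld_score_regression(chi2, ld_scores, n_samples, n_snps):
    """Unweighted OLS fit of chi-square statistics on LD Scores.

    Returns (intercept, h2). The intercept estimates 1 + N*a, so values
    above 1 suggest confounding; the slope N*h2/M is rescaled to h2.
    This is a simplification of the published weighted procedure.
    """
    # np.polyfit returns coefficients highest degree first: [slope, intercept].
    slope, intercept = np.polyfit(ld_scores, chi2, deg=1)
    h2 = slope * n_snps / n_samples
    return intercept, h2

# Synthetic, noise-free inputs (illustrative values only, not real data).
ld_scores = np.array([1.0, 5.0, 10.0, 50.0, 100.0])
N, M = 10_000, 1_000
true_h2, true_intercept = 0.3, 1.05
chi2 = N * true_h2 / M * ld_scores + true_intercept

intercept, h2 = fit_ld_score_regression(chi2, ld_scores, N, M)
```

    With noise-free inputs the fit recovers the intercept and heritability used to generate the data; on real summary statistics the estimates would carry sampling error that this sketch does not quantify.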

  6. Genome-wide scans for loci under selection in humans

    PubMed Central

    2005-01-01

    Natural selection, which can be defined as the differential contribution of genetic variants to future generations, is the driving force of Darwinian evolution. Identifying regions of the human genome that have been targets of natural selection is an important step in clarifying human evolutionary history and understanding how genetic variation results in phenotypic diversity; it may also facilitate the search for complex disease genes. Technological advances in high-throughput DNA sequencing and single nucleotide polymorphism genotyping have enabled several genome-wide scans of natural selection to be undertaken. Here, some of the observations that are beginning to emerge from these studies will be reviewed, including evidence for geographically restricted selective pressures (i.e., local adaptation) and a relationship between genes subject to natural selection and human disease. In addition, the paper will highlight several important problems that need to be addressed in future genome-wide studies of natural selection. PMID:16004726

  7. A genome wide search for alcoholism susceptibility genes.

    PubMed

    Hill, Shirley Y; Shen, Sa; Zezza, Nicholas; Hoffman, Eric K; Perlin, Mark; Allan, William

    2004-07-01

    Alcoholism is currently one of the most serious public health problems in the US. Lifetime prevalence rates are relatively high, with one in five men and one in 12 women meeting criteria for this condition. Identification of genetic loci conferring an increased susceptibility to developing alcohol dependence could strengthen prevention efforts by informing individuals of their risk before abusive drinking ensues. Families identified through a double proband methodology have provided an exceptional opportunity for gene-finding because of the increased recurrence risks seen in these sibships. A total of 360 markers for 22 autosomes were spaced at an average distance of 9.4 cM, and genotyping was performed for 330 members of these multiplex families. Extensive clinical data, personality variation, and event-related potential characteristics were available for reducing heterogeneity and detecting robust linkage signals. Multipoint linkage analysis using different analytic strategies gives strong support for loci on chromosomes 1, 2, 6, 7, 10, 12, 14, 16, and 17. Copyright 2004 Wiley-Liss, Inc.

  8. Linkage Analysis of a Model Quantitative Trait in Humans: Finger Ridge Count Shows Significant Multivariate Linkage to 5q14.1

    PubMed Central

    Medland, Sarah E; Loesch, Danuta Z; Mdzewski, Bogdan; Zhu, Gu; Montgomery, Grant W; Martin, Nicholas G

    2007-01-01

    The finger ridge count (a measure of pattern size) is one of the most heritable complex traits studied in humans and has been considered a model human polygenic trait in quantitative genetic analysis. Here, we report the results of the first genome-wide linkage scan for finger ridge count in a sample of 2,114 offspring from 922 nuclear families. Both univariate linkage to the absolute ridge count (a sum of all the ridge counts on all ten fingers), and multivariate linkage analyses of the counts on individual fingers, were conducted. The multivariate analyses yielded significant linkage to 5q14.1 (Logarithm of odds [LOD] = 3.34, pointwise-empirical p-value = 0.00025) that was predominantly driven by linkage to the ring, index, and middle fingers. The strongest univariate linkage was to 1q42.2 (LOD = 2.04, point-wise p-value = 0.002, genome-wide p-value = 0.29). In summary, the combination of univariate and multivariate results was more informative than simple univariate analyses alone. Patterns of quantitative trait loci factor loadings consistent with developmental fields were observed, and the simple pleiotropic model underlying the absolute ridge count was not sufficient to characterize the interrelationships between the ridge counts of individual fingers. PMID:17907812

  9. Meta-Analysis of Genome-Wide Scans Provides Evidence for Sex- and Site-Specific Regulation of Bone Mass

    PubMed Central

    Sham, Pak C; Zintzaras, Elias; Lewis, Cathryn M; Deng, Hong-Wen; Econs, Michael J; Karasik, David; Devoto, Marcella; Kammerer, Candace M; Spector, Tim; Andrew, Toby; Cupples, L Adrienne; Duncan, Emma L; Foroud, Tatiana; Kiel, Douglas P; Koller, Daniel; Langdahl, Bente; Mitchell, Braxton D; Peacock, Munro; Recker, Robert; Shen, Hui; Sol-Church, Katia; Spotila, Loretta D; Uitterlinden, Andre G; Wilson, Scott G; Kung, Annie WC; Ralston, Stuart H

    2014-01-01

    Several genome-wide scans have been performed to detect loci that regulate BMD, but these have yielded inconsistent results, with limited replication of linkage peaks in different studies. In an effort to improve statistical power for detection of these loci, we performed a meta-analysis of genome-wide scans in which spine or hip BMD were studied. Evidence was gained to suggest that several chromosomal loci regulate BMD in a site-specific and sex-specific manner. Introduction BMD is a heritable trait and an important predictor of osteoporotic fracture risk. Several genome-wide scans have been performed in an attempt to detect loci that regulate BMD, but there has been limited replication of linkage peaks between studies. In an attempt to resolve these inconsistencies, we conducted a collaborative meta-analysis of genome-wide linkage scans in which femoral neck BMD (FN-BMD) or lumbar spine BMD (LS-BMD) had been studied. Materials and Methods Data were accumulated from nine genome-wide scans involving 11,842 subjects. Data were analyzed separately for LS-BMD and FN-BMD and by sex. For each study, genomic bins of 30 cM were defined and ranked according to the maximum LOD score they contained. While various densitometers were used in different studies, the ranking approach that we used means that the results are not confounded by the fact that different measurement devices were used. Significance for high average rank and heterogeneity was obtained through Monte Carlo testing. Results For LS-BMD, the quantitative trait locus (QTL) with greatest significance was on chromosome 1p13.3-q23.3 (p = 0.004), but this exhibited high heterogeneity and the effect was specific for women. Other significant LS-BMD QTLs were on chromosomes 12q24.31-qter, 3p25.3-p22.1, 11p12-q13.3, and 1q32-q42.3, including one on 18p11-q12.3 that had not been detected by individual studies. For FN-BMD, the strongest QTL was on chromosome 9q31.1-q33.3 (p = 0.002). Other significant QTLs were
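
    The rank-based procedure described in the Materials and Methods of this abstract (rank bins within each study by maximum LOD, average the ranks across studies, assess significance by Monte Carlo testing) can be sketched as follows. This is a simplified illustration, not the consortium's code: studies are weighted equally, ties are broken by bin order, the null distribution is generated by shuffling ranks within each study, and the heterogeneity test is omitted. All function names and the toy data are hypothetical.

```python
import random


def rank_bins(max_lods):
    """Rank the bins of one study; the largest maximum LOD gets the top rank."""
    order = sorted(range(len(max_lods)), key=lambda b: max_lods[b])
    ranks = [0] * len(max_lods)
    for rank, b in enumerate(order, start=1):
        ranks[b] = rank
    return ranks


def gsma(studies, n_sim=2000, seed=0):
    """Average rank per bin and a Monte Carlo p-value for a high average rank.

    `studies` is a list of lists: one list of per-bin maximum LOD scores
    per study, with the same bins in the same order in every study.
    """
    n_studies = len(studies)
    n_bins = len(studies[0])
    ranked = [rank_bins(s) for s in studies]
    observed_sum = [sum(r[b] for r in ranked) for b in range(n_bins)]

    rng = random.Random(seed)
    exceed = [0] * n_bins
    shuffled = list(range(1, n_bins + 1))
    for _sim in range(n_sim):
        totals = [0] * n_bins
        for _study in range(n_studies):
            # Null hypothesis: ranks fall on bins at random within a study.
            rng.shuffle(shuffled)
            for b, r in enumerate(shuffled):
                totals[b] += r
        for b in range(n_bins):
            if totals[b] >= observed_sum[b]:
                exceed[b] += 1

    average_rank = [s / n_studies for s in observed_sum]
    p_values = [(e + 1) / (n_sim + 1) for e in exceed]
    return average_rank, p_values


if __name__ == "__main__":
    rng = random.Random(42)
    # Toy data: 9 studies, 20 bins, with bin 3 carrying signal everywhere.
    toy = [[rng.uniform(0.0, 3.0) for _ in range(20)] for _ in range(9)]
    for study in toy:
        study[3] = 10.0
    avg, p = gsma(toy)
    print(f"bin 3: average rank {avg[3]:.1f}, p = {p[3]:.4f}")
```

    Because only within-study ranks enter the statistic, differences in measurement scale between studies do not affect the result, which matches the abstract's remark about different densitometers.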

  10. Identification of Major Quantitative Trait Loci for Seed Oil Content in Soybeans by Combining Linkage and Genome-Wide Association Mapping

    PubMed Central

    Cao, Yongce; Li, Shuguang; Wang, Zili; Chang, Fangguo; Kong, Jiejie; Gai, Junyi; Zhao, Tuanjie

    2017-01-01

    Soybean oil is the most widely produced vegetable oil in the world and its content in soybean seed is an important quality trait in breeding programs. More than 100 quantitative trait loci (QTLs) for soybean oil content have been identified. However, most of them are genotype specific and/or environment sensitive. Here, we used both a linkage and association mapping methodology to dissect the genetic basis of seed oil content of Chinese soybean cultivars in various environments in the Jiang-Huai River Valley. One recombinant inbred line (RIL) population (NJMN-RIL), with 104 lines developed from a cross between M8108 and NN1138-2, was planted in five environments to investigate phenotypic data, and a new genetic map with 2,062 specific-locus amplified fragment markers was constructed to map oil content QTLs. A derived F2 population between MN-5 (a line of NJMN-RIL) and NN1138-2 was also developed to confirm one major QTL. A soybean breeding germplasm population (279 lines) was established to perform a genome-wide association study (GWAS) using 59,845 high-quality single nucleotide polymorphism markers. In the NJMN-RIL population, 8 QTLs were found that explained a range of phenotypic variance from 6.3 to 26.3% in certain planting environments. Among them, qOil-5-1, qOil-10-1, and qOil-14-1 were detected in different environments, and qOil-5-1 was further confirmed using the secondary F2 population. Three loci located on chromosomes 5 and 20 were detected in a 2-year long GWAS, and one locus that overlapped with qOil-5-1 was found repeatedly and treated as the same locus. qOil-5-1 was further localized to a linkage disequilibrium block region of approximately 440 kb. These results will not only increase our understanding of the genetic control of seed oil content in soybean, but will also be helpful in marker-assisted selection for breeding high seed oil content soybean and gene cloning to elucidate the mechanisms of seed oil content. PMID:28747922

  11. FHSA-SED: Two-Locus Model Detection for Genome-Wide Association Study with Harmony Search Algorithm.

    PubMed

    Tuo, Shouheng; Zhang, Junying; Yuan, Xiguo; Zhang, Yuanyuan; Liu, Zhaowen

    2016-01-01

    The two-locus model is a typical and important disease model to be identified in genome-wide association studies (GWAS). Owing to the intensive computational burden and the diversity of disease models, existing methods suffer from low detection power, high computation cost, and a preference for some types of disease models. In this study, two scoring functions (the Bayesian-network-based K2-score and the Gini-score) are used to characterize a pair of SNP loci as a candidate model; the two criteria are adopted simultaneously to improve identification power and to tackle the preference for particular disease models. The harmony search algorithm (HSA) is improved to quickly find the most likely candidate models among all two-locus models, and a local search algorithm with a two-dimensional tabu table is presented to avoid repeatedly evaluating disease models that have a strong marginal effect. Finally, the G-test statistic is used to further test the candidate models. We evaluate our method, named FHSA-SED, on 82 simulated datasets and a real AMD dataset, and compare it with two typical, recently developed methods (MACOED and CSE) that are based on swarm intelligence search algorithms. The results of the simulation experiments indicate that our method outperforms the two compared algorithms in terms of detection power, computation time, evaluation times, sensitivity (TPR), specificity (SPC), positive predictive value (PPV) and accuracy (ACC). Our method identified two SNPs (rs3775652 and rs10511467) that may also be associated with disease in the AMD dataset.

  12. FHSA-SED: Two-Locus Model Detection for Genome-Wide Association Study with Harmony Search Algorithm

    PubMed Central

    Tuo, Shouheng; Zhang, Junying; Yuan, Xiguo; Zhang, Yuanyuan; Liu, Zhaowen

    2016-01-01

    Motivation The two-locus model is a typical and important disease model to be identified in genome-wide association studies (GWAS). Owing to the intensive computational burden and the diversity of disease models, existing methods suffer from low detection power, high computation cost, and a preference for some types of disease models. Method In this study, two scoring functions (the Bayesian-network-based K2-score and the Gini-score) are used to characterize a pair of SNP loci as a candidate model; the two criteria are adopted simultaneously to improve identification power and to tackle the preference for particular disease models. The harmony search algorithm (HSA) is improved to quickly find the most likely candidate models among all two-locus models, and a local search algorithm with a two-dimensional tabu table is presented to avoid repeatedly evaluating disease models that have a strong marginal effect. Finally, the G-test statistic is used to further test the candidate models. Results We evaluate our method, named FHSA-SED, on 82 simulated datasets and a real AMD dataset, and compare it with two typical, recently developed methods (MACOED and CSE) that are based on swarm intelligence search algorithms. The results of the simulation experiments indicate that our method outperforms the two compared algorithms in terms of detection power, computation time, evaluation times, sensitivity (TPR), specificity (SPC), positive predictive value (PPV) and accuracy (ACC). Our method identified two SNPs (rs3775652 and rs10511467) that may also be associated with disease in the AMD dataset. PMID:27014873
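
    The final step named in this abstract, a G-test on a candidate two-locus model, can be illustrated with the standard log-likelihood-ratio statistic G = 2 * sum(O * ln(O / E)). The sketch below is not taken from FHSA-SED: the table layout (nine two-locus genotype combinations by case/control status) and the counts are assumptions made for illustration, and the p-value helper uses a closed form that is valid only for an even number of degrees of freedom.

```python
import math


def g_test(table):
    """G statistic and degrees of freedom for an r x c contingency table."""
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    total = sum(row_tot)
    g = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            if obs > 0:  # empty cells contribute nothing to the sum
                expected = row_tot[i] * col_tot[j] / total
                g += 2.0 * obs * math.log(obs / expected)
    df = (len(table) - 1) * (len(table[0]) - 1)
    return g, df


def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability, closed form for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires a positive, even df")
    half = x / 2.0
    term = 1.0
    total = 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total


if __name__ == "__main__":
    # Hypothetical counts: rows are the 9 genotype combinations of two SNPs,
    # columns are (cases, controls).
    counts = [
        [30, 10], [20, 20], [25, 15],
        [10, 30], [20, 20], [15, 25],
        [22, 18], [18, 22], [20, 20],
    ]
    g, df = g_test(counts)
    print(f"G = {g:.2f}, df = {df}, p = {chi2_sf_even_df(g, df):.4g}")
```

    A 9 x 2 table has 8 degrees of freedom, so the even-df shortcut applies here; a general implementation would use a full chi-square survival function instead.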

  13. Genome-wide variation within and between wild and domestic yak.

    PubMed

    Wang, Kun; Hu, Quanjun; Ma, Hui; Wang, Lizhong; Yang, Yongzhi; Luo, Wenchun; Qiu, Qiang

    2014-07-01

    The yak is one of the few animals that can thrive in the harsh environment of the Qinghai-Tibetan Plateau and adjacent Alpine regions. Yak provides essential resources allowing Tibetans to live at high altitudes. However, genetic variation within and between wild and domestic yak remains unknown. Here, we present a genome-wide study of the genetic variation within and between wild and domestic yak. Using next-generation sequencing technology, we resequenced three wild and three domestic yak with a mean of fivefold coverage using our published domestic yak genome as a reference. We identified a total of 8.38 million SNPs (7.14 million novel), 383,241 InDels and 126,352 structural variants between the six yak. We observed higher linkage disequilibrium in domestic yak than in wild yak and a modest but distinct genetic divergence between these two groups. We further identified more than a thousand potential selected regions (PSRs) for the three domestic yak by scanning the whole genome. These genomic resources can be further used to study genetic diversity and select superior breeds of yak and other bovid species. © 2014 John Wiley & Sons Ltd.

  14. Genotyping-by-sequencing enables linkage mapping in three octoploid cultivated strawberry families

    PubMed Central

    Salinas, Natalia; Tennessen, Jacob A.; Zurn, Jason D.; Sargent, Daniel James; Hancock, James; Bassil, Nahla V.

    2017-01-01

    Genotyping-by-sequencing (GBS) was used to survey genome-wide single-nucleotide polymorphisms (SNPs) in three biparental strawberry (Fragaria × ananassa) populations with the goal of evaluating this technique in a species with a complex octoploid genome. GBS sequence data were aligned to the F. vesca ‘Fvb’ reference genome in order to call SNPs. Numbers of polymorphic SNPs per population ranged from 1,163 to 3,190. Linkage maps consisting of 30–65 linkage groups were produced from the SNP sets derived from each parent. The linkage groups covered 99% of the Fvb reference genome, with three to seven linkage groups from a given parent aligned to any particular chromosome. A phylogenetic analysis performed using the POLiMAPS pipeline revealed linkage groups that were most similar to ancestral species F. vesca for each chromosome. Linkage groups that were most similar to a second ancestral species, F. iinumae, were only resolved for Fvb 4. The quantity of missing data and heterogeneity in genome coverage inherent in GBS complicated the analysis, but POLiMAPS resolved F. × ananassa chromosomal regions derived from diploid ancestor F. vesca. PMID:28875078

  15. Wide-cross whole-genome radiation hybrid mapping of cotton (Gossypium hirsutum L.).

    PubMed Central

    Gao, Wenxiang; Chen, Z Jeffrey; Yu, John Z; Raska, Dwaine; Kohel, Russell J; Womack, James E; Stelly, David M

    2004-01-01

    We report the development and characterization of a "wide-cross whole-genome radiation hybrid" (WWRH) panel from cotton (Gossypium hirsutum L.). Chromosomes were segmented by gamma-irradiation of G. hirsutum (n = 26) pollen, and segmented chromosomes were rescued after in vivo fertilization of G. barbadense egg cells (n = 26). A 5-krad gamma-ray WWRH mapping panel (N = 93) was constructed and genotyped at 102 SSR loci. SSR marker retention frequencies were higher than those for animal systems and marker retention patterns were informative. Using the program RHMAP, 52 of 102 SSR markers were mapped into 16 syntenic groups. Linkage group 9 (LG 9) SSR markers BNL0625 and BNL2805 had been colocalized by linkage analysis, but their order was resolved by differential retention among WWRH plants. Two linkage groups, LG 13 and LG 9, were combined into one syntenic group, and the chromosome 1 linkage group marker BNL4053 was reassigned to chromosome 9. Analyses of cytogenetic stocks supported synteny of LG 9 and LG 13 and localized them to the short arm of chromosome 17. They also supported reassignment of marker BNL4053 to the long arm of chromosome 9. A WWRH map of the syntenic group composed of linkage groups 9 and 13 was constructed by maximum-likelihood analysis under the general retention model. The results demonstrate not only the feasibility of WWRH panel construction and mapping, but also complementarity to traditional linkage mapping and cytogenetic methods. PMID:15280245

  16. Genome-Wide Association Mapping of Crown Rust Resistance in Oat Elite Germplasm.

    PubMed

    Klos, Kathy Esvelt; Yimer, Belayneh A; Babiker, Ebrahiem M; Beattie, Aaron D; Bonman, J Michael; Carson, Martin L; Chong, James; Harrison, Stephen A; Ibrahim, Amir M H; Kolb, Frederic L; McCartney, Curt A; McMullen, Michael; Fetch, Jennifer Mitchell; Mohammadi, Mohsen; Murphy, J Paul; Tinker, Nicholas A

    2017-07-01

    Oat crown rust, caused by Puccinia coronata f. sp. avenae, is a major constraint to oat (Avena sativa L.) production in many parts of the world. In this first comprehensive multienvironment genome-wide association map of oat crown rust, we used 2972 single-nucleotide polymorphisms (SNPs) genotyped on 631 oat lines for association mapping of quantitative trait loci (QTL). Seedling reaction to crown rust in these lines was assessed as infection type (IT) with each of 10 crown rust isolates. Adult plant reaction was assessed in the field in a total of 10 location-years as percentage severity (SV) and as infection reaction (IR) on a 0-to-1 scale. Overall, 29 SNPs on 12 linkage groups were predictive of crown rust reaction in at least one experiment at a genome-wide level of statistical significance. The QTL identified here include those in regions previously shown to be linked with seedling resistance genes , , , , , and and also with adult-plant resistance and adaptation-related QTL. In addition, QTL on linkage groups Mrg03, Mrg08, and Mrg23 were identified in regions not previously associated with crown rust resistance. Evaluation of marker genotypes in a set of crown rust differential lines supported as the identity of . The SNPs with rare alleles associated with lower disease scores may be suitable for use in marker-assisted selection of oat lines for crown rust resistance. Copyright © 2017 Crop Science Society of America.

  17. Improving the detection of pathways in genome-wide association studies by combined effects of SNPs from Linkage Disequilibrium blocks.

    PubMed

    Zhao, Huiying; Nyholt, Dale R; Yang, Yuanhao; Wang, Jihua; Yang, Yuedong

    2017-06-14

    Genome-wide association studies (GWAS) have successfully identified single variants associated with diseases. To increase the power of GWAS, gene-based and pathway-based tests are commonly employed to detect more risk factors. However, the gene- and pathway-based association tests may be biased towards genes or pathways containing a large number of single-nucleotide polymorphisms (SNPs) with small P-values caused by high linkage disequilibrium (LD) correlations. To address such bias, numerous pathway-based methods have been developed. Here we propose a novel method, DGAT-path, which divides all SNPs assigned to genes in each pathway into LD blocks and sums the chi-square statistics of the LD blocks to assess the significance of the pathway by permutation tests. The method proved robust, with a type I error rate more than 1.6 times lower than that of other methods. Meanwhile, the method displays a higher power and is not biased by the pathway size. The applications to the GWAS summary statistics for schizophrenia and breast cancer indicate that the detected top pathways contain more genes close to associated SNPs than those detected by other methods. As a result, the method identified 17 and 12 significant pathways containing 20 and 21 novel associated genes, respectively, for the two diseases. The method is available online at http://sparks-lab.org/server/DGAT-path.

  18. Investigation of common, low-frequency and rare genome-wide variation in anorexia nervosa.

    PubMed

    Huckins, L M; Hatzikotoulas, K; Southam, L; Thornton, L M; Steinberg, J; Aguilera-McKay, F; Treasure, J; Schmidt, U; Gunasinghe, C; Romero, A; Curtis, C; Rhodes, D; Moens, J; Kalsi, G; Dempster, D; Leung, R; Keohane, A; Burghardt, R; Ehrlich, S; Hebebrand, J; Hinney, A; Ludolph, A; Walton, E; Deloukas, P; Hofman, A; Palotie, A; Palta, P; van Rooij, F J A; Stirrups, K; Adan, R; Boni, C; Cone, R; Dedoussis, G; van Furth, E; Gonidakis, F; Gorwood, P; Hudson, J; Kaprio, J; Kas, M; Keski-Rahonen, A; Kiezebrink, K; Knudsen, G-P; Slof-Op 't Landt, M C T; Maj, M; Monteleone, A M; Monteleone, P; Raevuori, A H; Reichborn-Kjennerud, T; Tozzi, F; Tsitsika, A; van Elburg, A; Collier, D A; Sullivan, P F; Breen, G; Bulik, C M; Zeggini, E

    2018-05-01

    Anorexia nervosa (AN) is a complex neuropsychiatric disorder presenting with dangerously low body weight, and a deep and persistent fear of gaining weight. To date, only one genome-wide significant locus associated with AN has been identified. We performed an exome-chip based genome-wide association study (GWAS) in 2158 cases from nine populations of European origin and 15 485 ancestrally matched controls. Unlike previous studies, this GWAS also probed association in low-frequency and rare variants. Sixteen independent variants were taken forward for in silico and de novo replication (11 common and 5 rare). No findings reached genome-wide significance. Two notable common variants were identified: rs10791286, an intronic variant in OPCML (P=9.89 × 10−6), and rs7700147, an intergenic variant (P=2.93 × 10−5). No low-frequency variant associations were identified at genome-wide significance, although the study was well-powered to detect low-frequency variants with large effect sizes, suggesting that there may be no AN loci in this genomic search space with large effect sizes.

  19. Investigation of common, low-frequency and rare genome-wide variation in anorexia nervosa

    PubMed Central

    Huckins, L M; Hatzikotoulas, K; Southam, L; Thornton, L M; Steinberg, J; Aguilera-McKay, F; Treasure, J; Schmidt, U; Gunasinghe, C; Romero, A; Curtis, C; Rhodes, D; Moens, J; Kalsi, G; Dempster, D; Leung, R; Keohane, A; Burghardt, R; Ehrlich, S; Hebebrand, J; Hinney, A; Ludolph, A; Walton, E; Deloukas, P; Hofman, A; Palotie, A; Palta, P; van Rooij, F J A; Stirrups, K; Adan, R; Boni, C; Cone, R; Dedoussis, G; van Furth, E; Gonidakis, F; Gorwood, P; Hudson, J; Kaprio, J; Kas, M; Keski-Rahonen, A; Kiezebrink, K; Knudsen, G-P; Slof-Op 't Landt, M C T; Maj, M; Monteleone, A M; Monteleone, P; Raevuori, A H; Reichborn-Kjennerud, T; Tozzi, F; Tsitsika, A; van Elburg, A; Adan, R A H; Alfredsson, L; Ando, T; Andreassen, O A; Aschauer, H; Baker, J H; Barrett, J C; Bencko, V; Bergen, A W; Berrettini, W H; Birgegard, A; Boni, C; Boraska Perica, V; Brandt, H; Breen, G; Bulik, C M; Carlberg, L; Cassina, M; Cichon, S; Clementi, M; Cohen-Woods, S; Coleman, J; Cone, R D; Courtet, P; Crawford, S; Crow, S; Crowley, J; Danner, U N; Davis, O S P; de Zwaan, M; Dedoussis, G; Degortes, D; DeSocio, J E; Dick, D M; Dikeos, D; Dina, C; Ding, B; Dmitrzak-Weglarz, M; Docampo, E; Duncan, L; Egberts, K; Ehrlich, S; Escaramís, G; Esko, T; Espeseth, T; Estivill, X; Favaro, A; Fernández-Aranda, F; Fichter, M M; Finan, C; Fischer, K; Floyd, J A B; Foretova, L; Forzan, M; Franklin, C S; Gallinger, S; Gambaro, G; Gaspar, H A; Giegling, I; Gonidakis, F; Gorwood, P; Gratacos, M; Guillaume, S; Guo, Y; Hakonarson, H; Halmi, K A; Hatzikotoulas, K; Hauser, J; Hebebrand, J; Helder, S; Herms, S; Herpertz-Dahlmann, B; Herzog, W; Hilliard, C E; Hinney, A; Hübel, C; Huckins, L M; Hudson, J I; Huemer, J; Inoko, H; Janout, V; Jiménez-Murcia, S; Johnson, C; Julià, A; Juréus, A; Kalsi, G; Kaminska, D; Kaplan, A S; Kaprio, J; Karhunen, L; Karwautz, A; Kas, M J H; Kaye, W; Kennedy, J L; Keski-Rahkonen, A; Kiezebrink, K; Klareskog, L; Klump, K L; Knudsen, G P S; Koeleman, B P C; Koubek, D; La Via, M C; Landén, M; Le Hellard, S; Levitan, R D; Li, D; Lichtenstein, P; Lilenfeld, L; Lissowska, J; Lundervold, A; Magistretti, P; Maj, M; Mannik, K; Marsal, S; Martin, N; Mattingsdal, M; McDevitt, S; McGuffin, P; Merl, E; Metspalu, A; Meulenbelt, I; Micali, N; Mitchell, J; Mitchell, K; Monteleone, P; Monteleone, A M; Mortensen, P; Munn-Chernoff, M A; Navratilova, M; Nilsson, I; Norring, C; Ntalla, I; Ophoff, R A; O'Toole, J K; Palotie, A; Pante, J; Papezova, H; Pinto, D; Rabionet, R; Raevuori, A; Rajewski, A; Ramoz, N; Rayner, N W; Reichborn-Kjennerud, T; Ripatti, S; Roberts, M; Rotondo, A; Rujescu, D; Rybakowski, F; Santonastaso, P; Scherag, A; Scherer, S W; Schmidt, U; Schork, N J; Schosser, A; Slachtova, L; Sladek, R; Slagboom, P E; Slof-Op 't Landt, M C T; Slopien, A; Soranzo, N; Southam, L; Steen, V M; Strengman, E; Strober, M; Sullivan, P F; Szatkiewicz, J P; Szeszenia-Dabrowska, N; Tachmazidou, I; Tenconi, E; Thornton, L M; Tortorella, A; Tozzi, F; Treasure, J; Tsitsika, A; Tziouvas, K; van Elburg, A A; van Furth, E F; Wagner, G; Walton, E; Watson, H; Wichmann, H-E; Widen, E; Woodside, D B; Yanovski, J; Yao, S; Yilmaz, Z; Zeggini, E; Zerwas, S; Zipfel, S; Collier, D A; Sullivan, P F; Breen, G; Bulik, C M; Zeggini, E

    2018-01-01

    Anorexia nervosa (AN) is a complex neuropsychiatric disorder presenting with dangerously low body weight, and a deep and persistent fear of gaining weight. To date, only one genome-wide significant locus associated with AN has been identified. We performed an exome-chip based genome-wide association study (GWAS) in 2158 cases from nine populations of European origin and 15 485 ancestrally matched controls. Unlike previous studies, this GWAS also probed association in low-frequency and rare variants. Sixteen independent variants were taken forward for in silico and de novo replication (11 common and 5 rare). No findings reached genome-wide significance. Two notable common variants were identified: rs10791286, an intronic variant in OPCML (P=9.89 × 10−6), and rs7700147, an intergenic variant (P=2.93 × 10−5). No low-frequency variant associations were identified at genome-wide significance, although the study was well-powered to detect low-frequency variants with large effect sizes, suggesting that there may be no AN loci in this genomic search space with large effect sizes. PMID:29155802

  20. Integrated genome sequence and linkage map of physic nut (Jatropha curcas L.), a biodiesel plant.

    PubMed

    Wu, Pingzhi; Zhou, Changpin; Cheng, Shifeng; Wu, Zhenying; Lu, Wenjia; Han, Jinli; Chen, Yanbo; Chen, Yan; Ni, Peixiang; Wang, Ying; Xu, Xun; Huang, Ying; Song, Chi; Wang, Zhiwen; Shi, Nan; Zhang, Xudong; Fang, Xiaohua; Yang, Qing; Jiang, Huawu; Chen, Yaping; Li, Meiru; Wang, Ying; Chen, Fan; Wang, Jun; Wu, Guojiang

    2015-03-01

    The family Euphorbiaceae includes some of the most efficient biomass accumulators. Whole genome sequencing and the development of genetic maps of these species are important components in molecular breeding and genetic improvement. Here we report the draft genome of physic nut (Jatropha curcas L.), a biodiesel plant. The assembled genome has a total length of 320.5 Mbp and contains 27,172 putative protein-coding genes. We established a linkage map containing 1208 markers and anchored the genome assembly (81.7%) to this map to produce 11 pseudochromosomes. After gene family clustering, 15,268 families were identified, of which 13,887 existed in the castor bean genome. Analysis of the genome highlighted specific expansion and contraction of a number of gene families during the evolution of this species, including the ribosome-inactivating proteins and oil biosynthesis pathway enzymes. The genomic sequence and linkage map provide a valuable resource not only for fundamental and applied research on physic nut but also for evolutionary and comparative genomics analysis, particularly in the Euphorbiaceae. © 2015 The Authors The Plant Journal © 2015 John Wiley & Sons Ltd.

  1. Genome-wide association studies on HIV susceptibility, pathogenesis and pharmacogenomics

    PubMed Central

    2012-01-01

    Susceptibility to HIV-1 and the clinical course after infection show substantial heterogeneity between individuals. Part of this variability can be attributed to host genetic variation. Initial candidate gene studies have revealed interesting host factors that influence HIV infection, replication and pathogenesis. Recently, genome-wide association studies (GWAS) were utilized for unbiased searches at a genome-wide level to discover novel genetic factors and pathways involved in HIV-1 infection. This review gives an overview of findings from the GWAS performed on HIV infection, within different cohorts, with variable patient and phenotype selection. Furthermore, novel techniques and strategies in research that might contribute to the complete understanding of virus-host interactions and their role in the pathogenesis of HIV infection are discussed. PMID:22920050

  2. A hierarchical and modular approach to the discovery of robust associations in genome-wide association studies from pooled DNA samples

    PubMed Central

    Sebastiani, Paola; Zhao, Zhenming; Abad-Grau, Maria M; Riva, Alberto; Hartley, Stephen W; Sedgewick, Amanda E; Doria, Alessandro; Montano, Monty; Melista, Efthymia; Terry, Dellara; Perls, Thomas T; Steinberg, Martin H; Baldwin, Clinton T

    2008-01-01

    Background One of the challenges of the analysis of pooling-based genome wide association studies is to identify authentic associations among potentially thousands of false positive associations. Results We present a hierarchical and modular approach to the analysis of genome wide genotype data that incorporates quality control, linkage disequilibrium, physical distance and gene ontology to identify authentic associations among those found by statistical association tests. The method is developed for the allelic association analysis of pooled DNA samples, but it can be easily generalized to the analysis of individually genotyped samples. We evaluate the approach using data sets from diverse genome wide association studies including fetal hemoglobin levels in sickle cell anemia and a sample of centenarians and show that the approach is highly reproducible and allows for discovery at different levels of synthesis. Conclusion Results from the integration of Bayesian tests and other machine learning techniques with linkage disequilibrium data suggest that we do not need to use too stringent thresholds to reduce the number of false positive associations. This method yields increased power even with relatively small samples. In fact, our evaluation shows that the method can reach almost 70% sensitivity with samples of only 100 subjects. PMID:18194558

  3. ARG-based genome-wide analysis of cacao cultivars.

    PubMed

    Utro, Filippo; Cornejo, Omar Eduardo; Livingstone, Donald; Motamayor, Juan Carlos; Parida, Laxmi

    2012-01-01

    Ancestral recombinations graph (ARG) is a topological structure that captures the relationship between the extant genomic sequences in terms of genetic events including recombinations. IRiS is a system that estimates the ARG on sequences of individuals, at genomic scales, capturing the relationship between these individuals of the species. Recently, this system was used to estimate the ARG of the recombining X Chromosome of a collection of human populations using relatively dense, bi-allelic SNP data. While the ARG is a natural model for capturing the inter-relationship between a single chromosome of the individuals of a species, it is not immediately apparent how the model can utilize whole-genome (across chromosomes) diploid data. Also, the sheer complexity of an ARG structure presents a challenge to graph visualization techniques. In this paper we examine the ARG reconstruction for (1) genome-wide or multiple chromosomes, (2) multi-allelic and (3) extremely sparse data. To aid in the visualization of the results of the reconstructed ARG, we additionally construct a much simplified topology, a classification tree, suggested by the ARG. As the test case, we study the problem of extracting the relationship between populations of Theobroma cacao. The chocolate tree is an outcrossing species in the wild, due to self-incompatibility mechanisms at play. Thus a principled approach to understanding the inter-relationships between the different populations must take the shuffling of the genomic segments into account. The polymorphisms in the test data are short tandem repeats (STR) and are multi-allelic (sometimes as high as 30 distinct possible values at a locus). Each is at a genomic location that is bilaterally transmitted, hence the ARG is a natural model for this data. Another characteristic of this plant data set is that while it is genome-wide, across 10 linkage groups or chromosomes, it is very sparse, i.e., only 96 loci from a genome of approximately 400 megabases

  4. ARG-based genome-wide analysis of cacao cultivars

    PubMed Central

    2012-01-01

    Background Ancestral recombinations graph (ARG) is a topological structure that captures the relationship between the extant genomic sequences in terms of genetic events including recombinations. IRiS is a system that estimates the ARG on sequences of individuals, at genomic scales, capturing the relationship between these individuals of the species. Recently, this system was used to estimate the ARG of the recombining X Chromosome of a collection of human populations using relatively dense, bi-allelic SNP data. Results While the ARG is a natural model for capturing the inter-relationship between a single chromosome of the individuals of a species, it is not immediately apparent how the model can utilize whole-genome (across chromosomes) diploid data. Also, the sheer complexity of an ARG structure presents a challenge to graph visualization techniques. In this paper we examine the ARG reconstruction for (1) genome-wide or multiple chromosomes, (2) multi-allelic and (3) extremely sparse data. To aid in the visualization of the results of the reconstructed ARG, we additionally construct a much simplified topology, a classification tree, suggested by the ARG. As the test case, we study the problem of extracting the relationship between populations of Theobroma cacao. The chocolate tree is an outcrossing species in the wild, due to self-incompatibility mechanisms at play. Thus a principled approach to understanding the inter-relationships between the different populations must take the shuffling of the genomic segments into account. The polymorphisms in the test data are short tandem repeats (STR) and are multi-allelic (sometimes as high as 30 distinct possible values at a locus). Each is at a genomic location that is bilaterally transmitted, hence the ARG is a natural model for this data. Another characteristic of this plant data set is that while it is genome-wide, across 10 linkage groups or chromosomes, it is very sparse, i.e., only 96 loci from a genome of

  5. A Genome-wide Admixture Scan for Ancestry-linked Genes Predisposing to Sarcoidosis in African Americans

    PubMed Central

    Rybicki, Benjamin A.; Levin, Albert M.; McKeigue, Paul; Datta, Indrani; Gray-McGuire, Courtney; Colombo, Marco; Reich, David; Burke, Robert R.; Iannuzzi, Michael C.

    2010-01-01

    Genome-wide linkage and association studies have uncovered variants associated with sarcoidosis, a multi-organ granulomatous inflammatory disease. African ancestry may influence disease pathogenesis since African Americans are more commonly affected by sarcoidosis. Therefore, we conducted the first sarcoidosis genome-wide ancestry scan using a map of 1,384 highly ancestry informative single nucleotide polymorphisms genotyped on 1,357 sarcoidosis cases and 703 unaffected controls self-identified as African American. The most significant ancestry association was at marker rs11966463 on chromosome 6p22.3 (ancestry association risk ratio (aRR)= 1.90; p=0.0002). When we restricted the analysis to biopsy-confirmed cases, the aRR for this marker increased to 2.01; p=0.00007. Among the eight other markers that demonstrated suggestive ancestry associations with sarcoidosis were rs1462906 on chromosome 8p12 which had the most significant association with European ancestry (aRR=0.65; p=0.002), and markers on chromosomes 5p13 (aRR=1.46; p=0.005) and 5q31 (aRR=0.67; p=0.005), which correspond to regions we previously identified through sib pair linkage analyses. Overall, the most significant ancestry association for Scadding stage IV cases was to marker rs7919137 on chromosome 10p11.22 (aRR=0.27; p=2×10−5), a region not associated with disease susceptibility. In summary, through admixture mapping of sarcoidosis we have confirmed previous genetic linkages and identified several novel putative candidate loci for sarcoidosis. PMID:21179114

  6. Development and application of a novel genome-wide SNP array reveals domestication history in soybean

    PubMed Central

    Wang, Jiao; Chu, Shanshan; Zhang, Huairen; Zhu, Ying; Cheng, Hao; Yu, Deyue

    2016-01-01

    Domestication of soybeans occurred under the intense human-directed selections aimed at developing high-yielding lines. Tracing the domestication history and identifying the genes underlying soybean domestication require further exploration. Here, we developed a high-throughput NJAU 355 K SoySNP array and used this array to study the genetic variation patterns in 367 soybean accessions, including 105 wild soybeans and 262 cultivated soybeans. The population genetic analysis suggests that cultivated soybeans have tended to originate from northern and central China, from where they spread to other regions, accompanied with a gradual increase in seed weight. Genome-wide scanning for evidence of artificial selection revealed signs of selective sweeps involving genes controlling domestication-related agronomic traits including seed weight. To further identify genomic regions related to seed weight, a genome-wide association study (GWAS) was conducted across multiple environments in wild and cultivated soybeans. As a result, a strong linkage disequilibrium region on chromosome 20 was found to be significantly correlated with seed weight in cultivated soybeans. Collectively, these findings should provide an important basis for genomic-enabled breeding and advance the study of functional genomics in soybean. PMID:26856884

  7. Development and application of a novel genome-wide SNP array reveals domestication history in soybean.

    PubMed

    Wang, Jiao; Chu, Shanshan; Zhang, Huairen; Zhu, Ying; Cheng, Hao; Yu, Deyue

    2016-02-09

    Domestication of soybeans occurred under the intense human-directed selections aimed at developing high-yielding lines. Tracing the domestication history and identifying the genes underlying soybean domestication require further exploration. Here, we developed a high-throughput NJAU 355 K SoySNP array and used this array to study the genetic variation patterns in 367 soybean accessions, including 105 wild soybeans and 262 cultivated soybeans. The population genetic analysis suggests that cultivated soybeans have tended to originate from northern and central China, from where they spread to other regions, accompanied with a gradual increase in seed weight. Genome-wide scanning for evidence of artificial selection revealed signs of selective sweeps involving genes controlling domestication-related agronomic traits including seed weight. To further identify genomic regions related to seed weight, a genome-wide association study (GWAS) was conducted across multiple environments in wild and cultivated soybeans. As a result, a strong linkage disequilibrium region on chromosome 20 was found to be significantly correlated with seed weight in cultivated soybeans. Collectively, these findings should provide an important basis for genomic-enabled breeding and advance the study of functional genomics in soybean.

  8. Genome-wide investigation of genetic changes during modern breeding of Brassica napus.

    PubMed

    Wang, Nian; Li, Feng; Chen, Biyun; Xu, Kun; Yan, Guixin; Qiao, Jiangwei; Li, Jun; Gao, Guizhen; Bancroft, Ian; Meng, Jingling; King, Graham J; Wu, Xiaoming

    2014-08-01

    Considerable genome variation had been incorporated within rapeseed breeding programs over past decades. In past decades, there have been substantial changes in phenotypic properties of rapeseed as a result of extensive breeding effort. Uncovering the underlying patterns of allelic variation in the context of genome organisation would provide knowledge to guide future genetic improvement. We assessed genome-wide genetic changes, including population structure, genetic relatedness, the extent of linkage disequilibrium, nucleotide diversity and genetic differentiation based on FST outlier detection, for a panel of 472 Brassica napus inbred accessions using a 60 k Brassica Infinium® SNP array. We found that genetic diversity varied among sub-groups. Moreover, the genetic diversity increased from 1950 to 1980 and then remained at a similar level in China and Europe. We also found that ~6-10% of genomic regions showed high FST values. Some QTLs previously associated with important agronomic traits overlapped with these regions. Overall, the B. napus C genome was found to have more high FST signals than the A genome, and we concluded that the C genome may contribute more valuable alleles to generate elite traits. The results of this study indicate that considerable genome variation had been incorporated within rapeseed breeding programs over past decades. These results also contribute to understanding the impact of rapeseed improvement on available genome variation and the potential for dissecting complex agronomic traits.

  9. A High-Resolution SNP Array-Based Linkage Map Anchors a New Domestic Cat Draft Genome Assembly and Provides Detailed Patterns of Recombination.

    PubMed

    Li, Gang; Hillier, LaDeana W; Grahn, Robert A; Zimin, Aleksey V; David, Victor A; Menotti-Raymond, Marilyn; Middleton, Rondo; Hannah, Steven; Hendrickson, Sher; Makunin, Alex; O'Brien, Stephen J; Minx, Pat; Wilson, Richard K; Lyons, Leslie A; Warren, Wesley C; Murphy, William J

    2016-06-01

    High-resolution genetic and physical maps are invaluable tools for building accurate genome assemblies, and interpreting results of genome-wide association studies (GWAS). Previous genetic and physical maps anchored good quality draft assemblies of the domestic cat genome, enabling the discovery of numerous genes underlying hereditary disease and phenotypes of interest to the biomedical science and breeding communities. However, these maps lacked sufficient marker density to order thousands of shorter scaffolds in earlier assemblies, which instead relied heavily on comparative mapping with related species. A high-resolution map would aid in validating and ordering chromosome scaffolds from existing and new genome assemblies. Here, we describe a high-resolution genetic linkage map of the domestic cat genome based on genotyping 453 domestic cats from several multi-generational pedigrees on the Illumina 63K SNP array. The final maps include 58,055 SNP markers placed relative to 6637 markers with unique positions, distributed across all autosomes and the X chromosome. Our final sex-averaged maps span a total autosomal length of 4464 cM, the longest described linkage map for any mammal, confirming length estimates from a previous microsatellite-based map. The linkage map was used to order and orient the scaffolds from a substantially more contiguous domestic cat genome assembly (Felis catus v8.0), which incorporated ∼20 × coverage of Illumina fragment reads. The new genome assembly shows substantial improvements in contiguity, with a nearly fourfold increase in N50 scaffold size to 18 Mb. We use this map to report probable structural errors in previous maps and assemblies, and to describe features of the recombination landscape, including a massive (∼50 Mb) recombination desert (of virtually zero recombination) on the X chromosome that parallels a similar desert on the porcine X chromosome in both size and physical location. Copyright © 2016 Li et al.

  10. Anonymization of electronic medical records for validating genome-wide association studies

    PubMed Central

    Loukides, Grigorios; Gkoulalas-Divanis, Aris; Malin, Bradley

    2010-01-01

    Genome-wide association studies (GWAS) facilitate the discovery of genotype–phenotype relations from population-based sequence databases, which is an integral facet of personalized medicine. The increasing adoption of electronic medical records allows large amounts of patients’ standardized clinical features to be combined with the genomic sequences of these patients and shared to support validation of GWAS findings and to enable novel discoveries. However, disseminating these data “as is” may lead to patient reidentification when genomic sequences are linked to resources that contain the corresponding patients’ identity information based on standardized clinical features. This work proposes an approach that provably prevents this type of data linkage and furnishes a result that helps support GWAS. Our approach automatically extracts potentially linkable clinical features and modifies them in a way that they can no longer be used to link a genomic sequence to a small number of patients, while preserving the associations between genomic sequences and specific sets of clinical features corresponding to GWAS-related diseases. Extensive experiments with real patient data derived from the Vanderbilt University Medical Center verify that our approach generates data that eliminate the threat of individual reidentification, while supporting GWAS validation and clinical case analysis tasks. PMID:20385806

  11. Genome-wide selection components analysis in a fish with male pregnancy.

    PubMed

    Flanagan, Sarah P; Jones, Adam G

    2017-04-01

    A major goal of evolutionary biology is to identify the genome-level targets of natural and sexual selection. With the advent of next-generation sequencing, whole-genome selection components analysis provides a promising avenue in the search for loci affected by selection in nature. Here, we implement a genome-wide selection components analysis in the sex role reversed Gulf pipefish, Syngnathus scovelli. Our approach involves a double-digest restriction-site associated DNA sequencing (ddRAD-seq) technique, applied to adult females, nonpregnant males, pregnant males, and their offspring. An FST comparison of allele frequencies among these groups reveals 47 genomic regions putatively experiencing sexual selection, as well as 468 regions showing a signature of differential viability selection between males and females. A complementary likelihood ratio test identifies similar patterns in the data as the FST analysis. Sexual selection and viability selection both tend to favor the rare alleles in the population. Ultimately, we conclude that genome-wide selection components analysis can be a useful tool to complement other approaches in the effort to pinpoint genome-level targets of selection in the wild. © 2017 The Author(s). Evolution © 2017 The Society for the Study of Evolution.
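
    As a rough illustration of what an FST comparison of allele frequencies between two groups involves, the sketch below computes Wright's FST for a single biallelic locus as (HT - HS) / HT. It is a minimal sketch under stated assumptions: the function name, the equal weighting of the two groups, and the example frequencies are all hypothetical, and this is not the estimator or the data used in the study above.

```python
def fst_biallelic(p1: float, p2: float) -> float:
    """Wright's F_ST for one biallelic locus in two equally weighted groups.

    p1 and p2 are the frequencies of the same allele in group 1 and group 2.
    Assumption: no sample-size correction is applied (illustrative only).
    """
    p_bar = (p1 + p2) / 2.0
    # Expected heterozygosity in the pooled population.
    h_total = 2.0 * p_bar * (1.0 - p_bar)
    # Mean expected heterozygosity within the two groups.
    h_within = (2.0 * p1 * (1.0 - p1) + 2.0 * p2 * (1.0 - p2)) / 2.0
    if h_total == 0.0:
        # Locus is monomorphic overall, so there is no differentiation to measure.
        return 0.0
    return (h_total - h_within) / h_total


# Hypothetical allele frequencies (group 1, group 2) for two loci.
example_freqs = {"locus_a": (0.50, 0.50), "locus_b": (0.20, 0.80)}
fst_by_locus = {name: fst_biallelic(p1, p2) for name, (p1, p2) in example_freqs.items()}
```

    Loci whose FST is unusually high relative to the genome-wide distribution are the ones an outlier scan would flag; how the threshold is chosen is a separate decision not covered here.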

  12. A genome-wide association study identifies a genomic region for the polycerate phenotype in sheep (Ovis aries).

    PubMed

    Ren, Xue; Yang, Guang-Li; Peng, Wei-Feng; Zhao, Yong-Xin; Zhang, Min; Chen, Ze-Hui; Wu, Fu-An; Kantanen, Juha; Shen, Min; Li, Meng-Hua

    2016-02-17

    Horns are a cranial appendage found exclusively in Bovidae, and play important roles in accessing resources and mates. In sheep (Ovis aries), horns vary from polled to six-horned, and humans have been selecting polled animals in farming and breeding. Here, we conducted a genome-wide association study on 24 two-horned versus 22 four-horned phenotypes in a native Chinese breed of Sishui Fur sheep. Together with linkage disequilibrium (LD) analyses and haplotype-based association tests, we identified a genomic region comprising 132.0-133.1 Mb on chromosome 2 that contained the top 10 SNPs (including 4 significant SNPs) and 5 most significant haplotypes associated with the polycerate phenotype. In humans and mice, this genomic region contains the HOXD gene cluster and adjacent functional genes EVX2 and KIAA1715, which have a close association with the formation of limbs and genital buds. Our results provide new insights into the genetic basis underlying variable numbers of horns and represent a new resource for use in sheep genetics and breeding.

  13. Population-Specific Patterns of Linkage Disequilibrium and SNP Variation in Spring and Winter Polyploid Wheat

    USDA-ARS's Scientific Manuscript database

    Single nucleotide polymorphisms (SNPs) are ideally suited for the construction of high-resolution genetic maps, studying population evolutionary history and performing genome-wide association mapping experiments. Here we used a genome-wide set of 1536 SNPs to study linkage disequilibrium (LD) and po...

  14. Genome-wide association study of acute post-surgical pain in humans

    PubMed Central

    Kim, Hyungsuk; Ramsay, Edward; Lee, Hyewon; Wahl, Sharon; Dionne, Raymond A

    2009-01-01

    Aims Testing a relatively small genomic region with a few hundred SNPs provides limited information. Genome-wide association studies (GWAS) provide an opportunity to overcome the limitation of candidate gene association studies. Here, we report the results of a GWAS for the responses to an NSAID analgesic. Materials & methods European Americans (60 females and 52 males) undergoing oral surgery were genotyped with Affymetrix 500K SNP assay. Additional SNP genotyping was performed from the gene in linkage disequilibrium with the candidate SNP revealed by the GWAS. Results GWAS revealed a candidate SNP (rs2562456) associated with analgesic onset, which is in linkage disequilibrium with a gene encoding a zinc finger protein. Additional SNP genotyping of ZNF429 confirmed the association with analgesic onset in humans (p = 1.8 × 10−10, degrees of freedom = 103, F = 28.3). We also found candidate loci for the maximum post-operative pain rating (rs17122021, p = 6.9 × 10−7) and post-operative pain onset time (rs6693882, p = 2.1 × 10−6), however, correcting for multiple comparisons did not sustain these genetic associations. Conclusion GWAS for acute clinical pain followed by additional SNP genotyping of a neighboring gene suggests that genetic variations in or near the loci encoding DNA binding proteins play a role in the individual variations in responses to analgesic drugs. PMID:19207018

  15. A SSR-based composite genetic linkage map for the cultivated peanut (Arachis hypogaea L.) genome

    PubMed Central

    2010-01-01

    Background The construction of genetic linkage maps for cultivated peanut (Arachis hypogaea L.) has been and continues to be an important research goal to facilitate quantitative trait locus (QTL) analysis and gene tagging for use in marker-assisted selection in breeding. Even though a few maps have been developed, they were constructed using diploid or interspecific tetraploid populations. The most recently published intra-specific map was constructed from a cross of cultivated peanuts, in which only 135 simple sequence repeat (SSR) markers were sparsely populated in 22 linkage groups. A more detailed linkage map with sufficient markers is necessary for QTL identification and marker-assisted selection to be feasible. The objective of this study was to construct a genetic linkage map of cultivated peanut using simple sequence repeat (SSR) markers derived primarily from peanut genomic sequences, expressed sequence tags (ESTs), and by "data mining" sequences released in GenBank. Results Three recombinant inbred line (RIL) populations were constructed from three crosses with one common female parental line Yueyou 13, a high yielding Spanish market type. The four parents were screened with 1044 primer pairs designed to amplify SSRs and 901 primer pairs produced clear PCR products. Of the 901 primer pairs, 146, 124 and 64 primer pairs (markers) were polymorphic in these populations, respectively, and used in genotyping these RIL populations. Individual linkage maps were constructed from each of the three populations and a composite map based on 93 common loci was created using JoinMap. The composite linkage map consists of 22 composite linkage groups (LG) with 175 SSR markers (including 47 SSRs on the published AA genome maps), representing the 20 chromosomes of A. hypogaea. The total composite map length is 885.4 cM, with an average marker density of 5.8 cM. Segregation distortion in the 3 populations was 23.0%, 13.5% and 7.8% of the markers, respectively. These
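
    The reported average marker density can be checked with simple arithmetic. The sketch below assumes that the figure is the mean interval between adjacent markers, so that each linkage group with k markers contributes k - 1 intervals; that reading is an assumption, not something the abstract states, and the function name is hypothetical. Under it, 885.4 cM over 175 - 22 = 153 intervals rounds to the 5.8 cM quoted above, whereas dividing by all 175 markers would give about 5.1 cM.

```python
def mean_interval_cm(total_cm: float, n_markers: int, n_groups: int) -> float:
    """Mean distance between adjacent markers on a linkage map.

    Assumption: every linkage group holds at least two markers, so the map
    has n_markers - n_groups intervals in total.
    """
    n_intervals = n_markers - n_groups
    if n_intervals <= 0:
        raise ValueError("need more markers than linkage groups")
    return total_cm / n_intervals


# Values taken from the abstract: 885.4 cM, 175 SSR markers, 22 linkage groups.
peanut_density = mean_interval_cm(885.4, 175, 22)
# Naive alternative for comparison: total length divided by marker count.
peanut_naive = 885.4 / 175
```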

  16. Linkage disequilibrium matches forensic genetic records to disjoint genomic marker sets.

    PubMed

    Edge, Michael D; Algee-Hewitt, Bridget F B; Pemberton, Trevor J; Li, Jun Z; Rosenberg, Noah A

    2017-05-30

    Combining genotypes across datasets is central in facilitating advances in genetics. Data aggregation efforts often face the challenge of record matching: the identification of dataset entries that represent the same individual. We show that records can be matched across genotype datasets that have no shared markers based on linkage disequilibrium between loci appearing in different datasets. Using two datasets for the same 872 people (one with 642,563 genome-wide SNPs and the other with 13 short tandem repeats (STRs) used in forensic applications), we find that 90-98% of forensic STR records can be connected to corresponding SNP records and vice versa. Accuracy increases to 99-100% when ∼30 STRs are used. Our method expands the potential of data aggregation, but it also suggests privacy risks intrinsic in maintenance of databases containing even small numbers of markers, including databases of forensic significance.

  17. SNP identification from RNA sequencing and linkage map construction of rubber tree for anchoring the draft genome.

    PubMed

    Shearman, Jeremy R; Sangsrakru, Duangjai; Jomchai, Nukoon; Ruang-Areerate, Panthita; Sonthirod, Chutima; Naktang, Chaiwat; Theerawattanasuk, Kanikar; Tragoonrung, Somvong; Tangphatsornruang, Sithichoke

    2015-01-01

    Hevea brasiliensis, or rubber tree, is an important crop species that accounts for the majority of natural latex production. The rubber tree nuclear genome consists of 18 chromosomes and is roughly 2.15 Gb. The current rubber tree reference genome assembly consists of 1,150,326 scaffolds ranging from 200 to 531,465 bp and totalling 1.1 Gb. Only 143 scaffolds, totalling 7.6 Mb, have been placed into linkage groups. We have performed RNA-seq on 6 varieties of rubber tree to identify SNPs and InDels and used this information to perform target sequence enrichment and high throughput sequencing to genotype a set of SNPs in 149 rubber tree offspring from a cross between RRIM 600 and RRII 105 rubber tree varieties. We used this information to generate a linkage map allowing for the anchoring of 24,424 contigs from 3,009 scaffolds, totalling 115 Mb or 10.4% of the published sequence, into 18 linkage groups. Each linkage group contains between 319 and 1367 SNPs, or 60 to 194 non-redundant marker positions, and ranges from 156 to 336 cM in length. This linkage map includes 20,143 of the 69,300 predicted genes from rubber tree and will be useful for mapping studies and improving the reference genome assembly.

  18. Genome-Wide Association Study Reveals a New QTL for Salinity Tolerance in Barley (Hordeum vulgare L.)

    PubMed Central

    Fan, Yun; Zhou, Gaofeng; Shabala, Sergey; Chen, Zhong-Hua; Cai, Shengguan; Li, Chengdao; Zhou, Meixue

    2016-01-01

    Salinity stress is one of the most severe abiotic stresses that affect agricultural production. Genome wide association study (GWAS) has been widely used to detect genetic variations in extensive natural accessions with more recombination and higher resolution. In this study, 206 barley accessions collected worldwide were genotyped with 408 Diversity Arrays Technology (DArT) markers and evaluated for salinity stress tolerance using salinity tolerance score, a reliable trait developed in our previous work. GWAS for salinity tolerance was conducted through a general linkage model and a mixed linkage model based on population structure and kinship. A total of 24 significant marker-trait associations were identified. A QTL on 4H with the nearest marker of bPb-9668 was consistently detected by all methods. This QTL has not been reported before and is worth confirming further with bi-parental populations. PMID:27446173

  19. Linkage analysis of quantitative refraction and refractive errors in the Beaver Dam Eye Study.

    PubMed

    Klein, Alison P; Duggal, Priya; Lee, Kristine E; Cheng, Ching-Yu; Klein, Ronald; Bailey-Wilson, Joan E; Klein, Barbara E K

    2011-07-13

    Refraction, as measured by spherical equivalent, is the need for an external lens to focus images on the retina. While genetic factors play an important role in the development of refractive errors, few susceptibility genes have been identified. However, several regions of linkage have been reported for myopia (2q, 4q, 7q, 12q, 17q, 18p, 22q, and Xq) and for quantitative refraction (1p, 3q, 4q, 7p, 8p, and 11p). To replicate previously identified linkage peaks and to identify novel loci that influence quantitative refraction and refractive errors, linkage analysis of spherical equivalent, myopia, and hyperopia in the Beaver Dam Eye Study was performed. Nonparametric, sibling-pair, genome-wide linkage analyses of refraction (spherical equivalent adjusted for age, education, and nuclear sclerosis), myopia and hyperopia in 834 sibling pairs within 486 extended pedigrees were performed. Suggestive evidence of linkage was found for hyperopia on chromosome 3, region q26 (empiric P = 5.34 × 10(-4)), a region that had shown significant genome-wide evidence of linkage to refraction and some evidence of linkage to hyperopia. In addition, the analysis replicated previously reported genome-wide significant linkages to 22q11 of adjusted refraction and myopia (empiric P = 4.43 × 10(-3) and 1.48 × 10(-3), respectively) and to 7p15 of refraction (empiric P = 9.43 × 10(-4)). Evidence was also found of linkage to refraction on 7q36 (empiric P = 2.32 × 10(-3)), a region previously linked to high myopia. The findings provide further evidence that genes controlling refractive errors are located on 3q26, 7p15, 7q36, and 22q11.

  20. Linkage Analysis of Quantitative Refraction and Refractive Errors in the Beaver Dam Eye Study

    PubMed Central

    Duggal, Priya; Lee, Kristine E.; Cheng, Ching-Yu; Klein, Ronald; Bailey-Wilson, Joan E.; Klein, Barbara E. K.

    2011-01-01

    Purpose. Refraction, as measured by spherical equivalent, is the need for an external lens to focus images on the retina. While genetic factors play an important role in the development of refractive errors, few susceptibility genes have been identified. However, several regions of linkage have been reported for myopia (2q, 4q, 7q, 12q, 17q, 18p, 22q, and Xq) and for quantitative refraction (1p, 3q, 4q, 7p, 8p, and 11p). To replicate previously identified linkage peaks and to identify novel loci that influence quantitative refraction and refractive errors, linkage analysis of spherical equivalent, myopia, and hyperopia in the Beaver Dam Eye Study was performed. Methods. Nonparametric, sibling-pair, genome-wide linkage analyses of refraction (spherical equivalent adjusted for age, education, and nuclear sclerosis), myopia and hyperopia in 834 sibling pairs within 486 extended pedigrees were performed. Results. Suggestive evidence of linkage was found for hyperopia on chromosome 3, region q26 (empiric P = 5.34 × 10−4), a region that had shown significant genome-wide evidence of linkage to refraction and some evidence of linkage to hyperopia. In addition, the analysis replicated previously reported genome-wide significant linkages to 22q11 of adjusted refraction and myopia (empiric P = 4.43 × 10−3 and 1.48 × 10−3, respectively) and to 7p15 of refraction (empiric P = 9.43 × 10−4). Evidence was also found of linkage to refraction on 7q36 (empiric P = 2.32 × 10−3), a region previously linked to high myopia. Conclusions. The findings provide further evidence that genes controlling refractive errors are located on 3q26, 7p15, 7q36, and 22q11. PMID:21571680

  1. A Genome-Wide Association Study of Depressive Symptoms

    PubMed Central

    Cornelis, Marilyn C.; Amin, Najaf; Bakshis, Erin; Baumert, Jens; Ding, Jingzhong; Liu, Yongmei; Marciante, Kristin; Meirelles, Osorio; Nalls, Michael A.; Sun, Yan V.; Vogelzangs, Nicole; Yu, Lei; Bandinelli, Stefania; Benjamin, Emelia J.; Bennett, David A.; Boomsma, Dorret; Cannas, Alessandra; Coker, Laura H.; de Geus, Eco; De Jager, Philip L.; Diez-Roux, Ana V.; Purcell, Shaun; Hu, Frank B.; Rimma, Eric B.; Hunter, David J.; Jensen, Majken K.; Curhan, Gary; Rice, Kenneth; Penman, Alan D.; Rotter, Jerome I.; Sotoodehnia, Nona; Emeny, Rebecca; Eriksson, Johan G.; Evans, Denis A.; Ferrucci, Luigi; Fornage, Myriam; Gudnason, Vilmundur; Hofman, Albert; Illig, Thomas; Kardia, Sharon; Kelly-Hayes, Margaret; Koenen, Karestan; Kraft, Peter; Kuningas, Maris; Massaro, Joseph M.; Melzer, David; Mulas, Antonella; Mulder, Cornelis L.; Murray, Anna; Oostra, Ben A.; Palotie, Aarno; Penninx, Brenda; Petersmann, Astrid; Pilling, Luke C.; Psaty, Bruce; Rawal, Rajesh; Reiman, Eric M.; Schulz, Andrea; Shulman, Joshua M.; Singleton, Andrew B.; Smith, Albert V.; Sutin, Angelina R.; Uitterlinden, André G.; Völzke, Henry; Widen, Elisabeth; Yaffe, Kristine; Zonderman, Alan B.; Cucca, Francesco; Harris, Tamara; Ladwig, Karl-Heinz; Llewellyn, David J.; Räikkönen, Katri; Tanaka, Toshiko

    2013-01-01

    Background Depression is a heritable trait that exists on a continuum of varying severity and duration. Yet, the search for genetic variants associated with depression has had few successes. We exploit the entire continuum of depression to find common variants for depressive symptoms. Methods In this genome-wide association study, we combined the results of 17 population-based studies assessing depressive symptoms with the Center for Epidemiological Studies Depression Scale. Replication of the independent top hits (p < 1 × 10−5) was performed in five studies assessing depressive symptoms with other instruments. In addition, we performed a combined meta-analysis of all 22 discovery and replication studies. Results The discovery sample comprised 34,549 individuals (mean age of 66.5) and no loci reached genome-wide significance (lowest p = 1.05 × 10−7). Seven independent single nucleotide polymorphisms were considered for replication. In the replication set (n = 16,709), we found suggestive association of one single nucleotide polymorphism with depressive symptoms (rs161645, 5q21, p = 9.19 × 10−3). This 5q21 region reached genome-wide significance (p = 4.78 × 10−8) in the overall meta-analysis combining discovery and replication studies (n = 51,258). Conclusions The results suggest that only a large sample comprising more than 50,000 subjects may be sufficiently powered to detect genes for depressive symptoms. PMID:23290196

  2. A meiotic linkage map of the silver fox, aligned and compared to the canine genome.

    PubMed

    Kukekova, Anna V; Trut, Lyudmila N; Oskina, Irina N; Johnson, Jennifer L; Temnykh, Svetlana V; Kharlamova, Anastasiya V; Shepeleva, Darya V; Gulievich, Rimma G; Shikhevich, Svetlana G; Graphodatsky, Alexander S; Aguirre, Gustavo D; Acland, Gregory M

    2007-03-01

    A meiotic linkage map is essential for mapping traits of interest and is often the first step toward understanding a cryptic genome. Specific strains of silver fox (a variant of the red fox, Vulpes vulpes), which segregate behavioral and morphological phenotypes, create a need for such a map. One such strain, selected for docility, exhibits friendly dog-like responses to humans, in contrast to another strain selected for aggression. Development of a fox map is facilitated by the known cytogenetic homologies between the dog and fox, and by the availability of high resolution canine genome maps and sequence data. Furthermore, the high genomic sequence identity between dog and fox allows adaptation of canine microsatellites for genotyping and meiotic mapping in foxes. Using 320 such markers, we have constructed the first meiotic linkage map of the fox genome. The resulting sex-averaged map covers 16 fox autosomes and the X chromosome with an average inter-marker distance of 7.5 cM. The total map length corresponds to 1480.2 cM. From comparison of sex-averaged meiotic linkage maps of the fox and dog genomes, suppression of recombination in pericentromeric regions of the metacentric fox chromosomes was apparent, relative to the corresponding segments of acrocentric dog chromosomes. Alignment of the fox meiotic map against the 7.6x canine genome sequence revealed high conservation of marker order between homologous regions of the two species. The fox meiotic map provides a critical tool for genetic studies in foxes and identification of genetic loci and genes implicated in fox domestication.

  3. Sex-specific Linkage Scans in Opioid Dependence

    PubMed Central

    Yang, Bao-Zhu; Han, Shizhong; Kranzler, Henry R.; Palmer, Abraham A.; Gelernter, Joel

    2017-01-01

    Sex influences risk for opioid dependence (OD). We hypothesized that sex might interact with genetic loci that influence the risk for OD. Therefore, we performed an analysis to identify sex-specific genomic susceptibility regions for OD using linkage. Over 6000 single nucleotide polymorphism (SNP) markers were genotyped for 1758 African- and European-American (AA and EA) individuals from 739 families, ascertained via affected sib-pairs with OD and/or cocaine dependence. Autosome-wide non-parametric linkage scans, stratified by sex and population, were performed. We identified one significant linkage region, segregating with OD in EA men, at 71.1 cM on chromosome 4 (LOD=3.29; point-wise p=0.00005; empirical autosome-wide p=0.042), which significantly differed from the linkage signal at the same location in EA women (empirical p=0.002). Three suggestive linkage signals were identified at 181.3 cM on chromosome 7 (LOD=2.18), 104 cM on chromosome 11 (LOD=1.85), and 60.9 cM on chromosome 16 (LOD=1.93) in EA women. In AA men, four suggestive linkage signals were detected at 201.1 cM on chromosome 3 (LOD=2.32), 152.9 cM on chromosome 6 (LOD=1.86), 16.8 cM on chromosome 7 (LOD=1.95), and 36.1 cM on chromosome 17 (LOD=1.99). The significant region, mapping to 4q12-4q13.1, harbors several OD candidate genes with interconnected functionality, including VEGFR, CLOCK, PDCL2, NMU, NRSF, and IGFBP7. In conclusion, these results provide evidence for the existence of sex-specific and population-specific differences in OD. Furthermore, these results provide positional information that will facilitate the use of targeted next-generation sequencing to search for genes that contribute to sex-specific differences in OD. PMID:27762075
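
    The abstract above pairs LOD = 3.29 with a point-wise p of 0.00005 but does not say how the p-value was obtained. A minimal sketch of the usual asymptotic conversion follows, assuming the LOD behaves as a one-sided test whose statistic 2 ln(10) x LOD is chi-square distributed with 1 degree of freedom. The function name is hypothetical.

```python
import math


def lod_to_pointwise_p(lod):
    """Approximate one-sided point-wise p-value for a LOD score.

    Assumes 2 * ln(10) * LOD follows a chi-square distribution with
    1 degree of freedom, halved because the linkage test is one-sided.
    """
    if lod < 0:
        raise ValueError("this approximation expects a non-negative LOD")
    chi2 = 2.0 * math.log(10.0) * lod
    # Upper tail of chi-square(1 df) equals erfc(sqrt(x / 2)).
    return 0.5 * math.erfc(math.sqrt(chi2 / 2.0))


if __name__ == "__main__":
    for lod in (1.85, 2.18, 3.29):
        print(lod, lod_to_pointwise_p(lod))
```

    Under these assumptions a LOD of 3.29 maps to roughly 5e-5, which agrees with the point-wise value quoted in the abstract. The empirical autosome-wide p-value comes from simulation and cannot be reproduced this way.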

  4. Heritability and linkage analysis of personality in bipolar disorder.

    PubMed

    Greenwood, Tiffany A; Badner, Judith A; Byerley, William; Keck, Paul E; McElroy, Susan L; Remick, Ronald A; Dessa Sadovnick, A; Kelsoe, John R

    2013-11-01

    The many attempts that have been made to identify genes for bipolar disorder (BD) have met with limited success, which may reflect an inadequacy of diagnosis as an informative and biologically relevant phenotype for genetic studies. Here we have explored aspects of personality as quantitative phenotypes for bipolar disorder through the use of the Temperament and Character Inventory (TCI), which assesses personality in seven dimensions. Four temperament dimensions are assessed: novelty seeking (NS), harm avoidance (HA), reward dependence (RD), and persistence (PS). Three character dimensions are also included: self-directedness (SD), cooperativeness (CO), and self-transcendence (ST). We compared personality scores between diagnostic groups and assessed heritability in a sample of 101 families collected for genetic studies of BD. A genome-wide SNP linkage analysis was then performed in the subset of 51 families for which genetic data was available. Significant group differences were observed between BD subjects, their first-degree relatives, and independent controls for all but RD and PS, and all but HA and RD were found to be significantly heritable in this sample. Linkage analysis of the heritable dimensions produced several suggestive linkage peaks for NS (chromosomes 7q21 and 10p15), PS (chromosomes 6q16, 12p13, and 19p13), and SD (chromosomes 4q35, 8q24, and 18q12). The relatively small size of our linkage sample likely limited our ability to reach genome-wide significance in this study. While not genome-wide significant, these results suggest that aspects of personality may prove useful in the identification of genes underlying BD susceptibility. © 2013 Elsevier B.V. All rights reserved.

  5. SNP Identification from RNA Sequencing and Linkage Map Construction of Rubber Tree for Anchoring the Draft Genome

    PubMed Central

    Shearman, Jeremy R.; Sangsrakru, Duangjai; Jomchai, Nukoon; Ruang-areerate, Panthita; Sonthirod, Chutima; Naktang, Chaiwat; Theerawattanasuk, Kanikar; Tragoonrung, Somvong; Tangphatsornruang, Sithichoke

    2015-01-01

    Hevea brasiliensis, or rubber tree, is an important crop species that accounts for the majority of natural latex production. The rubber tree nuclear genome consists of 18 chromosomes and is roughly 2.15 Gb. The current rubber tree reference genome assembly consists of 1,150,326 scaffolds ranging from 200 to 531,465 bp and totalling 1.1 Gb. Only 143 scaffolds, totalling 7.6 Mb, have been placed into linkage groups. We have performed RNA-seq on 6 varieties of rubber tree to identify SNPs and InDels and used this information to perform target sequence enrichment and high throughput sequencing to genotype a set of SNPs in 149 rubber tree offspring from a cross between RRIM 600 and RRII 105 rubber tree varieties. We used this information to generate a linkage map allowing for the anchoring of 24,424 contigs from 3,009 scaffolds, totalling 115 Mb or 10.4% of the published sequence, into 18 linkage groups. Each linkage group contains between 319 and 1367 SNPs, or 60 to 194 non-redundant marker positions, and ranges from 156 to 336 cM in length. This linkage map includes 20,143 of the 69,300 predicted genes from rubber tree and will be useful for mapping studies and improving the reference genome assembly. PMID:25831195

  6. Genome-Wide Association Mapping of Acid Soil Resistance in Barley (Hordeum vulgare L.)

    PubMed Central

    Zhou, Gaofeng; Broughton, Sue; Zhang, Xiao-Qi; Ma, Yanling; Zhou, Meixue; Li, Chengdao

    2016-01-01

    Genome-wide association studies (GWAS) based on linkage disequilibrium (LD) have been used to detect QTLs underlying complex traits in major crops. In this study, we collected 218 barley (Hordeum vulgare L.) lines including wild barley and cultivated barley from China, Canada, Australia, and Europe. A total of 408 polymorphic markers were used for population structure and LD analysis. GWAS for acid soil resistance were performed on the population using a general linear model (GLM) and a mixed linear model (MLM), respectively. A total of 22 QTLs (quantitative trait loci) were detected with the GLM and MLM analyses. Two QTLs, close to markers bPb-1959 (133.1 cM) and bPb-8013 (86.7 cM), localized on chromosome 1H and 4H respectively, were consistently detected in two different trials with both the GLM and MLM analyses. Furthermore, bPb-8013, the closest marker to the major Al3+ resistance gene HvAACT1 in barley, was identified to be QTL5. The QTLs could be used in marker-assisted selection to identify and pyramid different loci for improved acid soil resistance in barley. PMID:27064793

  7. Comprehensive Identification Of Specific Genes Controlling Complex Traits Through A Genome-Wide Screen for Cis-Acting Regulatory Elements - An Example Using Marek's Disease

    USDA-ARS's Scientific Manuscript database

    The comprehensive identification of genes underlying phenotypic variation of complex traits remains a major challenge. Most genome-wide screens lack sufficient resolving power as they typically depend on linkage. An alternate method is to screen for allele-specific expression (ASE), a simple yet pow...

  8. Candidate genes for obesity-susceptibility show enriched association within a large genome-wide association study for BMI.

    PubMed

    Vimaleswaran, Karani S; Tachmazidou, Ioanna; Zhao, Jing Hua; Hirschhorn, Joel N; Dudbridge, Frank; Loos, Ruth J F

    2012-10-15

    Before the advent of genome-wide association studies (GWASs), hundreds of candidate genes for obesity-susceptibility had been identified through a variety of approaches. We examined whether those obesity candidate genes are enriched for associations with body mass index (BMI) compared with non-candidate genes by using data from a large-scale GWAS. A thorough literature search identified 547 candidate genes for obesity-susceptibility based on evidence from animal studies, Mendelian syndromes, linkage studies, genetic association studies and expression studies. Genomic regions were defined to include the genes ±10 kb of flanking sequence around candidate and non-candidate genes. We used summary statistics publicly available from the discovery stage of the genome-wide meta-analysis for BMI performed by the genetic investigation of anthropometric traits consortium in 123 564 individuals. Hypergeometric, rank tail-strength and gene-set enrichment analysis tests were used to test for the enrichment of association in candidate compared with non-candidate genes. The hypergeometric test of enrichment was not significant at the 5% P-value quantile (P = 0.35), but was nominally significant at the 25% quantile (P = 0.015). The rank tail-strength and gene-set enrichment tests were nominally significant for the full set of genes and borderline significant for the subset without SNPs at P < 10(-7). Taken together, the observed evidence for enrichment suggests that the candidate gene approach retains some value. However, the degree of enrichment is small despite the extensive number of candidate genes and the large sample size. Studies that focus on candidate genes have only slightly increased chances of detecting associations, and are likely to miss many true effects in non-candidate genes, at least for obesity-related traits.
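
    The hypergeometric enrichment test named in the abstract above asks whether candidate genes are over-represented among the most strongly associated genes. A minimal sketch of that tail probability follows. Only the count of 547 candidate genes comes from the abstract; the total gene count, the tail size, and the number of candidate hits in the example are hypothetical.

```python
from math import comb


def hypergeom_upper_tail(hits, population, successes, draws):
    """P(X >= hits) for X ~ Hypergeometric(population, successes, draws).

    population: total number of genes tested
    successes:  number of candidate genes among them
    draws:      number of genes in the top P-value quantile
    hits:       candidate genes observed inside that quantile
    """
    if not (0 <= successes <= population and 0 <= draws <= population):
        raise ValueError("successes and draws must lie within the population")
    upper = min(successes, draws)
    total = comb(population, draws)
    favourable = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(max(hits, 0), upper + 1)
    )
    return favourable / total


if __name__ == "__main__":
    # Hypothetical: 20,000 genes, the top 5% quantile holds 1,000 genes,
    # and 40 of the 547 candidate genes fall inside that quantile.
    print(hypergeom_upper_tail(40, 20000, 547, 1000))
```

    Exact integer arithmetic via `math.comb` avoids floating-point underflow in the individual terms, at the cost of some speed for very large gene sets.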

  9. A Genome-Wide Association Study of Circulating Galectin-3

    PubMed Central

    van Veldhuisen, Dirk J.; Westra, Harm-Jan; Bakker, Stephan J. L.; Gansevoort, Ron T.; Muller Kobold, Anneke C.; van Gilst, Wiek H.; Franke, Lude

    2012-01-01

    Galectin-3 is a lectin involved in fibrosis, inflammation and proliferation. Increased circulating levels of galectin-3 have been associated with various diseases, including cancer, immunological disorders, and cardiovascular disease. To enhance our knowledge on galectin-3 biology we performed the first genome-wide association study (GWAS) using the Illumina HumanCytoSNP-12 array imputed with the HapMap 2 CEU panel on plasma galectin-3 levels in 3,776 subjects and follow-up genotyping in an additional 3,516 subjects. We identified 2 genome wide significant loci associated with plasma galectin-3 levels. One locus harbours the LGALS3 gene (rs2274273; P = 2.35×10−188) and the other locus the ABO gene (rs644234; P = 3.65×10−47). The variance explained by the LGALS3 locus was 25.6% and by the ABO locus 3.8% and jointly they explained 29.2%. Rs2274273 lies in high linkage disequilibrium with two non-synonymous SNPs (rs4644; r2 = 1.0, and rs4652; r2 = 0.91) and wet lab follow-up genotyping revealed that both are strongly associated with galectin-3 levels (rs4644; P = 4.97×10−465 and rs4652 P = 1.50×10−421) and were also associated with LGALS3 gene-expression. The origins of our associations should be further validated by means of functional experiments. PMID:23056639

  10. Significant Locus and Metabolic Genetic Correlations Revealed in Genome-Wide Association Study of Anorexia Nervosa.

    PubMed

    Duncan, Laramie; Yilmaz, Zeynep; Gaspar, Helena; Walters, Raymond; Goldstein, Jackie; Anttila, Verneri; Bulik-Sullivan, Brendan; Ripke, Stephan; Thornton, Laura; Hinney, Anke; Daly, Mark; Sullivan, Patrick F; Zeggini, Eleftheria; Breen, Gerome; Bulik, Cynthia M

    2017-09-01

    The authors conducted a genome-wide association study of anorexia nervosa and calculated genetic correlations with a series of psychiatric, educational, and metabolic phenotypes. Following uniform quality control and imputation procedures using the 1000 Genomes Project (phase 3) in 12 case-control cohorts comprising 3,495 anorexia nervosa cases and 10,982 controls, the authors performed standard association analysis followed by a meta-analysis across cohorts. Linkage disequilibrium score regression was used to calculate genome-wide common variant heritability (single-nucleotide polymorphism [SNP]-based heritability [h2SNP]), partitioned heritability, and genetic correlations (rg) between anorexia nervosa and 159 other phenotypes. Results were obtained for 10,641,224 SNPs and insertion-deletion variants with minor allele frequencies >1% and imputation quality scores >0.6. The h2SNP of anorexia nervosa was 0.20 (SE=0.02), suggesting that a substantial fraction of the twin-based heritability arises from common genetic variation. The authors identified one genome-wide significant locus on chromosome 12 (rs4622308) in a region harboring a previously reported type 1 diabetes and autoimmune disorder locus. Significant positive genetic correlations were observed between anorexia nervosa and schizophrenia, neuroticism, educational attainment, and high-density lipoprotein cholesterol, and significant negative genetic correlations were observed between anorexia nervosa and body mass index, insulin, glucose, and lipid phenotypes. Anorexia nervosa is a complex heritable phenotype for which this study has uncovered the first genome-wide significant locus. Anorexia nervosa also has large and significant genetic correlations with both psychiatric phenotypes and metabolic traits. The study results encourage a reconceptualization of this frequently lethal disorder as one with both psychiatric and metabolic etiology.

  11. Genome-Wide Association Mapping of Flowering and Ripening Periods in Apple

    PubMed Central

    Urrestarazu, Jorge; Muranty, Hélène; Denancé, Caroline; Leforestier, Diane; Ravon, Elisa; Guyader, Arnaud; Guisnel, Rémi; Feugey, Laurence; Aubourg, Sébastien; Celton, Jean-Marc; Daccord, Nicolas; Dondini, Luca; Gregori, Roberto; Lateur, Marc; Houben, Patrick; Ordidge, Matthew; Paprstein, Frantisek; Sedlak, Jiri; Nybom, Hilde; Garkava-Gustavsson, Larisa; Troggio, Michela; Bianco, Luca; Velasco, Riccardo; Poncet, Charles; Théron, Anthony; Moriya, Shigeki; Bink, Marco C. A. M.; Laurens, François; Tartarini, Stefano; Durel, Charles-Eric

    2017-01-01

    Deciphering the genetic control of flowering and ripening periods in apple is essential for breeding cultivars adapted to their growing environments. We implemented a large Genome-Wide Association Study (GWAS) at the European level using an association panel of 1,168 different apple genotypes distributed over six locations and phenotyped for these phenological traits. The panel was genotyped at a high-density of SNPs using the Axiom®Apple 480 K SNP array. We ran GWAS with a multi-locus mixed model (MLMM), which handles the putatively confounding effect of significant SNPs elsewhere on the genome. Genomic regions were further investigated to reveal candidate genes responsible for the phenotypic variation. At the whole population level, GWAS retained two SNPs as cofactors on chromosome 9 for flowering period, and six for ripening period (four on chromosome 3, one on chromosome 10 and one on chromosome 16) which, together accounted for 8.9 and 17.2% of the phenotypic variance, respectively. For both traits, SNPs in weak linkage disequilibrium were detected nearby, thus suggesting the existence of allelic heterogeneity. The geographic origins and relationships of apple cultivars accounted for large parts of the phenotypic variation. Variation in genotypic frequency of the SNPs associated with the two traits was connected to the geographic origin of the genotypes (grouped as North+East, West and South Europe), and indicated differential selection in different growing environments. Genes encoding transcription factors containing either NAC or MADS domains were identified as major candidates within the small confidence intervals computed for the associated genomic regions. A strong microsynteny between apple and peach was revealed in all the four confidence interval regions. This study shows how association genetics can unravel the genetic control of important horticultural traits in apple, as well as reduce the confidence intervals of the associated regions identified

  12. Genome-Wide Association Mapping of Flowering and Ripening Periods in Apple.

    PubMed

    Urrestarazu, Jorge; Muranty, Hélène; Denancé, Caroline; Leforestier, Diane; Ravon, Elisa; Guyader, Arnaud; Guisnel, Rémi; Feugey, Laurence; Aubourg, Sébastien; Celton, Jean-Marc; Daccord, Nicolas; Dondini, Luca; Gregori, Roberto; Lateur, Marc; Houben, Patrick; Ordidge, Matthew; Paprstein, Frantisek; Sedlak, Jiri; Nybom, Hilde; Garkava-Gustavsson, Larisa; Troggio, Michela; Bianco, Luca; Velasco, Riccardo; Poncet, Charles; Théron, Anthony; Moriya, Shigeki; Bink, Marco C A M; Laurens, François; Tartarini, Stefano; Durel, Charles-Eric

    2017-01-01

    Deciphering the genetic control of flowering and ripening periods in apple is essential for breeding cultivars adapted to their growing environments. We implemented a large Genome-Wide Association Study (GWAS) at the European level using an association panel of 1,168 different apple genotypes distributed over six locations and phenotyped for these phenological traits. The panel was genotyped at a high-density of SNPs using the Axiom®Apple 480 K SNP array. We ran GWAS with a multi-locus mixed model (MLMM), which handles the putatively confounding effect of significant SNPs elsewhere on the genome. Genomic regions were further investigated to reveal candidate genes responsible for the phenotypic variation. At the whole population level, GWAS retained two SNPs as cofactors on chromosome 9 for flowering period, and six for ripening period (four on chromosome 3, one on chromosome 10 and one on chromosome 16) which, together accounted for 8.9 and 17.2% of the phenotypic variance, respectively. For both traits, SNPs in weak linkage disequilibrium were detected nearby, thus suggesting the existence of allelic heterogeneity. The geographic origins and relationships of apple cultivars accounted for large parts of the phenotypic variation. Variation in genotypic frequency of the SNPs associated with the two traits was connected to the geographic origin of the genotypes (grouped as North+East, West and South Europe), and indicated differential selection in different growing environments. Genes encoding transcription factors containing either NAC or MADS domains were identified as major candidates within the small confidence intervals computed for the associated genomic regions. A strong microsynteny between apple and peach was revealed in all the four confidence interval regions. This study shows how association genetics can unravel the genetic control of important horticultural traits in apple, as well as reduce the confidence intervals of the associated regions identified

  13. Sample Reproducibility of Genetic Association Using Different Multimarker TDTs in Genome-Wide Association Studies: Characterization and a New Approach

    PubMed Central

    Abad-Grau, Mara M.; Medina-Medina, Nuria; Montes-Soldado, Rosana; Matesanz, Fuencisla; Bafna, Vineet

    2012-01-01

    Multimarker Transmission/Disequilibrium Tests (TDTs) are association tests that are very robust to population admixture and structure and may be used to identify susceptibility loci in genome-wide association studies. Multimarker TDTs using several markers may increase power by capturing high-degree associations. However, there is also a risk of spurious associations and power reduction due to the increase in degrees of freedom. In this study we show that associations found by tests built on simple null hypotheses are highly reproducible in a second independent data set regardless of the number of markers. As a test exhibiting this feature to its maximum, we introduce the multimarker 2-Groups TDT, a test which, under the hypothesis of no linkage, asymptotically follows a chi-square distribution with one degree of freedom regardless of the number of markers. The statistic requires the division of parental haplotypes into two groups: disease susceptibility and disease protective haplotype groups. We assessed the test behavior by performing an extensive simulation study as well as a real-data study using several data sets of two complex diseases. We show that the test is highly efficient and achieves the highest power among all the tests used, even when the null hypothesis is tested in a second independent data set. Therefore, it turns out to be a very promising multimarker TDT to perform genome-wide searches for disease susceptibility loci that may be used as a preprocessing step in the construction of more accurate genetic models to predict individual susceptibility to complex diseases. PMID:22363405
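
    For orientation, the classic single-marker TDT that multimarker variants build on can be sketched as below. This is not the new multimarker test introduced in the abstract above. It is the standard McNemar-type statistic computed from transmissions by heterozygous parents, which is asymptotically chi-square with 1 degree of freedom under no linkage. The function name and the counts in the example are hypothetical.

```python
import math


def single_marker_tdt(transmitted, untransmitted):
    """Classic TDT for one biallelic marker.

    transmitted:   times heterozygous parents passed the test allele
                   to an affected child
    untransmitted: times they passed the other allele instead
    Returns (chi_square_statistic, p_value) with 1 degree of freedom.
    """
    informative = transmitted + untransmitted
    if transmitted < 0 or untransmitted < 0 or informative == 0:
        raise ValueError("need non-negative counts and at least one transmission")
    statistic = (transmitted - untransmitted) ** 2 / informative
    # Upper tail of chi-square(1 df) equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


if __name__ == "__main__":
    # Hypothetical counts from 100 informative transmissions.
    print(single_marker_tdt(60, 40))
```

    Because only transmissions within families are compared, the statistic is insensitive to population stratification, which is the robustness property the abstract refers to.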

  14. Genomic scan for genes predisposing to schizophrenia

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Coon, H.; Jensen, S.; Holik, J.

    1994-03-15

    We initiated a genome-wide search for genes predisposing to schizophrenia by ascertaining 9 families, each containing three to five cases of schizophrenia. The 9 pedigrees were initially genotyped with 329 polymorphic DNA loci distributed throughout the genome. Assuming either autosomal dominant or recessive inheritance, 254 DNA loci yielded lod scores less than -2.0 at θ = 0.0, 101 DNA markers gave lod scores less than -2.0 at θ = 0.05, while 5 DNA loci produced maximum lod scores greater than 1: D4S35, D14S17, D15S1, D22S84, and D22S55. Of the DNA markers yielding lod scores greater than 1, D4S35 and D22S55 also were suggestive of linkage when the Affected-Pedigree-Member method was used. The families were then genotyped with four highly polymorphic simple sequence repeat markers; possible linkage diminished with DNA markers mapping nearby D4S35, while suggestive evidence of linkage remained with loci in the region of D22S55. Although follow-up investigation of these chromosomal regions may be warranted, our linkage results should be viewed as preliminary observations, as 35 unaffected persons are not past the age of risk. 90 refs., 3 tabs.
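
    The lod scores at fixed recombination fractions (theta) quoted in the abstract above come from pedigree likelihoods computed under a disease model. As a much simpler illustration, assuming fully informative, phase-known meioses (which real pedigrees rarely provide), a two-point lod score reduces to the formula sketched below. The function name and the counts in the example are hypothetical.

```python
import math


def two_point_lod(recombinants, meioses, theta):
    """Two-point lod score for phase-known, fully informative meioses.

    lod(theta) = log10( theta^r * (1 - theta)^(n - r) / 0.5^n )
    with r recombinants among n meioses and 0 <= theta <= 0.5.
    """
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie between 0 and 0.5")
    if not 0 <= recombinants <= meioses:
        raise ValueError("recombinants must lie between 0 and meioses")

    null_term = meioses * math.log10(2.0)  # equals -log10(0.5 ** n)
    if theta == 0.0:
        # A single recombinant rules out complete linkage.
        return -math.inf if recombinants > 0 else null_term
    non_recombinants = meioses - recombinants
    return (
        recombinants * math.log10(theta)
        + non_recombinants * math.log10(1.0 - theta)
        + null_term
    )


if __name__ == "__main__":
    # Hypothetical: 2 recombinants observed in 20 meioses.
    for theta in (0.0, 0.05, 0.1, 0.2, 0.5):
        print(theta, two_point_lod(2, 20, theta))
```

    A lod score below -2 at a given theta is the conventional criterion for excluding linkage at that distance, which is how the counts of excluded loci in the abstract should be read.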

  15. Linkage of Type 2 Diabetes on Chromosome 9p24 in Mexican Americans: Additional Evidence from the Veterans Administration Genetic Epidemiology Study (VAGES)

    PubMed Central

    Farook, Vidya S.; Coletta, Dawn K.; Puppala, Sobha; Schneider, Jennifer; Chittoor, Geetha; Hu, Shirley L.; Winnier, Deidre A.; Norton, Luke; Dyer, Thomas D.; Arya, Rector; Cole, Shelley A.; Carless, Melanie; Göring, Harald H.; Almasy, Laura; Mahaney, Michael C.; Comuzzie, Anthony G.; Curran, Joanne E.; Blangero, John; Duggirala, Ravindranath; Lehman, Donna M.; Jenkinson, Christopher P.; DeFronzo, Ralph A.

    2014-01-01

    Objective Type 2 diabetes (T2DM) is a complex metabolic disease and is more prevalent in certain ethnic groups such as Mexican Americans. The goal of our study was to perform a genome-wide linkage analysis to localize T2DM susceptibility loci in Mexican Americans. Methods We used the phenotypic and genotypic data from 1,122 Mexican American individuals (307 families) who participated in the Veterans Administration Genetic Epidemiology Study (VAGES). Genome-wide linkage analysis was performed, using the variance components approach. Data from two additional Mexican American family studies, the San Antonio Family Heart Study (SAFHS) and the San Antonio Family Diabetes/Gallbladder Study (SAFDGS), were combined with the VAGES data to test for improved linkage evidence. Results After adjusting for covariate effects, T2DM was found to be under significant genetic influences (h2 = 0.62, P = 2.7 × 10−6). The strongest evidence for linkage of T2DM occurred between markers D9S1871 and D9S2169 on chromosome 9p24.2-p24.1 (LOD = 1.8). Because we had previously reported suggestive evidence for linkage of T2DM at this region in SAFDGS as well, we performed genome-wide linkage analysis of the VAGES data combined with SAFHS and SAFDGS data and found significant, increased linkage evidence (LOD = 4.3, empirical P = 1.0 × 10−5, genome-wide P = 1.6 × 10−3) for T2DM at the same chromosomal region. Conclusion Significant T2DM linkage evidence was found on chromosome 9p24 in Mexican Americans. Importantly, the chromosomal region of interest in this study overlaps with several recent genome-wide association studies (GWASs) involving T2DM related traits. Given its overlap with such findings and our own initial T2DM association findings in the 9p24 chromosomal region, high throughput sequencing of the linked chromosomal region could identify the potential causal T2DM genes. PMID:24060607

  16. A genome-wide association search for type 2 diabetes genes in African Americans.

    PubMed

    Palmer, Nicholette D; McDonough, Caitrin W; Hicks, Pamela J; Roh, Bong H; Wing, Maria R; An, S Sandy; Hester, Jessica M; Cooke, Jessica N; Bostrom, Meredith A; Rudock, Megan E; Talbert, Matthew E; Lewis, Joshua P; Ferrara, Assiamira; Lu, Lingyi; Ziegler, Julie T; Sale, Michele M; Divers, Jasmin; Shriner, Daniel; Adeyemo, Adebowale; Rotimi, Charles N; Ng, Maggie C Y; Langefeld, Carl D; Freedman, Barry I; Bowden, Donald W; Voight, Benjamin F; Scott, Laura J; Steinthorsdottir, Valgerdur; Morris, Andrew P; Dina, Christian; Welch, Ryan P; Zeggini, Eleftheria; Huth, Cornelia; Aulchenko, Yurii S; Thorleifsson, Gudmar; McCulloch, Laura J; Ferreira, Teresa; Grallert, Harald; Amin, Najaf; Wu, Guanming; Willer, Cristen J; Raychaudhuri, Soumya; McCarroll, Steve A; Langenberg, Claudia; Hofmann, Oliver M; Dupuis, Josée; Qi, Lu; Segrè, Ayellet V; van Hoek, Mandy; Navarro, Pau; Ardlie, Kristin; Balkau, Beverley; Benediktsson, Rafn; Bennett, Amanda J; Blagieva, Roza; Boerwinkle, Eric; Bonnycastle, Lori L; Boström, Kristina Bengtsson; Bravenboer, Bert; Bumpstead, Suzannah; Burtt, Noël P; Charpentier, Guillaume; Chines, Peter S; Cornelis, Marilyn; Couper, David J; Crawford, Gabe; Doney, Alex S F; Elliott, Katherine S; Elliott, Amanda L; Erdos, Michael R; Fox, Caroline S; Franklin, Christopher S; Ganser, Martha; Gieger, Christian; Grarup, Niels; Green, Todd; Griffin, Simon; Groves, Christopher J; Guiducci, Candace; Hadjadj, Samy; Hassanali, Neelam; Herder, Christian; Isomaa, Bo; Jackson, Anne U; Johnson, Paul R V; Jørgensen, Torben; Kao, Wen H L; Klopp, Norman; Kong, Augustine; Kraft, Peter; Kuusisto, Johanna; Lauritzen, Torsten; Li, Man; Lieverse, Aloysius; Lindgren, Cecilia M; Lyssenko, Valeriya; Marre, Michel; Meitinger, Thomas; Midthjell, Kristian; Morken, Mario A; Narisu, Narisu; Nilsson, Peter; Owen, Katharine R; Payne, Felicity; Perry, John R B; Petersen, Ann-Kristin; Platou, Carl; Proença, Christine; Prokopenko, Inga; Rathmann, Wolfgang; Rayner, N William; Robertson, Neil 
R; Rocheleau, Ghislain; Roden, Michael; Sampson, Michael J; Saxena, Richa; Shields, Beverley M; Shrader, Peter; Sigurdsson, Gunnar; Sparsø, Thomas; Strassburger, Klaus; Stringham, Heather M; Sun, Qi; Swift, Amy J; Thorand, Barbara; Tichet, Jean; Tuomi, Tiinamaija; van Dam, Rob M; van Haeften, Timon W; van Herpt, Thijs; van Vliet-Ostaptchouk, Jana V; Walters, G Bragi; Weedon, Michael N; Wijmenga, Cisca; Witteman, Jacqueline; Bergman, Richard N; Cauchi, Stephane; Collins, Francis S; Gloyn, Anna L; Gyllensten, Ulf; Hansen, Torben; Hide, Winston A; Hitman, Graham A; Hofman, Albert; Hunter, David J; Hveem, Kristian; Laakso, Markku; Mohlke, Karen L; Morris, Andrew D; Palmer, Colin N A; Pramstaller, Peter P; Rudan, Igor; Sijbrands, Eric; Stein, Lincoln D; Tuomilehto, Jaakko; Uitterlinden, Andre; Walker, Mark; Wareham, Nicholas J; Watanabe, Richard M; Abecasis, Goncalo R; Boehm, Bernhard O; Campbell, Harry; Daly, Mark J; Hattersley, Andrew T; Hu, Frank B; Meigs, James B; Pankow, James S; Pedersen, Oluf; Wichmann, H-Erich; Barroso, Inês; Florez, Jose C; Frayling, Timothy M; Groop, Leif; Sladek, Rob; Thorsteinsdottir, Unnur; Wilson, James F; Illig, Thomas; Froguel, Philippe; van Duijn, Cornelia M; Stefansson, Kari; Altshuler, David; Boehnke, Michael; McCarthy, Mark I; Soranzo, Nicole; Wheeler, Eleanor; Glazer, Nicole L; Bouatia-Naji, Nabila; Mägi, Reedik; Randall, Joshua; Johnson, Toby; Elliott, Paul; Rybin, Denis; Henneman, Peter; Dehghan, Abbas; Hottenga, Jouke Jan; Song, Kijoung; Goel, Anuj; Egan, Josephine M; Lajunen, Taina; Doney, Alex; Kanoni, Stavroula; Cavalcanti-Proença, Christine; Kumari, Meena; Timpson, Nicholas J; Zabena, Carina; Ingelsson, Erik; An, Ping; O'Connell, Jeffrey; Luan, Jian'an; Elliott, Amanda; McCarroll, Steven A; Roccasecca, Rosa Maria; Pattou, François; Sethupathy, Praveen; Ariyurek, Yavuz; Barter, Philip; Beilby, John P; Ben-Shlomo, Yoav; Bergmann, Sven; Bochud, Murielle; Bonnefond, Amélie; Borch-Johnsen, Knut; Böttcher, Yvonne; Brunner, Eric; 
Bumpstead, Suzannah J; Chen, Yii-Der Ida; Chines, Peter; Clarke, Robert; Coin, Lachlan J M; Cooper, Matthew N; Crisponi, Laura; Day, Ian N M; de Geus, Eco J C; Delplanque, Jerome; Fedson, Annette C; Fischer-Rosinsky, Antje; Forouhi, Nita G; Frants, Rune; Franzosi, Maria Grazia; Galan, Pilar; Goodarzi, Mark O; Graessler, Jürgen; Grundy, Scott; Gwilliam, Rhian; Hallmans, Göran; Hammond, Naomi; Han, Xijing; Hartikainen, Anna-Liisa; Hayward, Caroline; Heath, Simon C; Hercberg, Serge; Hicks, Andrew A; Hillman, David R; Hingorani, Aroon D; Hui, Jennie; Hung, Joe; Jula, Antti; Kaakinen, Marika; Kaprio, Jaakko; Kesaniemi, Y Antero; Kivimaki, Mika; Knight, Beatrice; Koskinen, Seppo; Kovacs, Peter; Kyvik, Kirsten Ohm; Lathrop, G Mark; Lawlor, Debbie A; Le Bacquer, Olivier; Lecoeur, Cécile; Li, Yun; Mahley, Robert; Mangino, Massimo; Manning, Alisa K; Martínez-Larrad, María Teresa; McAteer, Jarred B; McPherson, Ruth; Meisinger, Christa; Melzer, David; Meyre, David; Mitchell, Braxton D; Mukherjee, Sutapa; Naitza, Silvia; Neville, Matthew J; Oostra, Ben A; Orrù, Marco; Pakyz, Ruth; Paolisso, Giuseppe; Pattaro, Cristian; Pearson, Daniel; Peden, John F; Pedersen, Nancy L; Perola, Markus; Pfeiffer, Andreas F H; Pichler, Irene; Polasek, Ozren; Posthuma, Danielle; Potter, Simon C; Pouta, Anneli; Province, Michael A; Psaty, Bruce M; Rayner, Nigel W; Rice, Kenneth; Ripatti, Samuli; Rivadeneira, Fernando; Rolandsson, Olov; Sandbaek, Annelli; Sandhu, Manjinder; Sanna, Serena; Sayer, Avan Aihie; Scheet, Paul; Seedorf, Udo; Sharp, Stephen J; Shields, Beverley; Sijbrands, Eric J G; Silveira, Angela; Simpson, Laila; Singleton, Andrew; Smith, Nicholas L; Sovio, Ulla; Swift, Amy; Syddall, Holly; Syvänen, Ann-Christine; Tanaka, Toshiko; Tönjes, Anke; Uitterlinden, André G; van Dijk, Ko Willems; Varma, Dhiraj; Visvikis-Siest, Sophie; Vitart, Veronique; Vogelzangs, Nicole; Waeber, Gérard; Wagner, Peter J; Walley, Andrew; Ward, Kim L; Watkins, Hugh; Wild, Sarah H; Willemsen, Gonneke; Witteman, 
Jaqueline C M; Yarnell, John W G; Zelenika, Diana; Zethelius, Björn; Zhai, Guangju; Zhao, Jing Hua; Zillikens, M Carola; Borecki, Ingrid B; Loos, Ruth J F; Meneton, Pierre; Magnusson, Patrik K E; Nathan, David M; Williams, Gordon H; Silander, Kaisa; Salomaa, Veikko; Smith, George Davey; Bornstein, Stefan R; Schwarz, Peter; Spranger, Joachim; Karpe, Fredrik; Shuldiner, Alan R; Cooper, Cyrus; Dedoussis, George V; Serrano-Ríos, Manuel; Lind, Lars; Palmer, Lyle J; Franks, Paul W; Ebrahim, Shah; Marmot, Michael; Kao, W H Linda; Pramstaller, Peter Paul; Wright, Alan F; Stumvoll, Michael; Hamsten, Anders; Buchanan, Thomas A; Valle, Timo T; Rotter, Jerome I; Siscovick, David S; Penninx, Brenda W J H; Boomsma, Dorret I; Deloukas, Panos; Spector, Timothy D; Ferrucci, Luigi; Cao, Antonio; Scuteri, Angelo; Schlessinger, David; Uda, Manuela; Ruokonen, Aimo; Jarvelin, Marjo-Riitta; Waterworth, Dawn M; Vollenweider, Peter; Peltonen, Leena; Mooser, Vincent; Sladek, Robert

    2012-01-01

    African Americans are disproportionately affected by type 2 diabetes (T2DM) yet few studies have examined T2DM using genome-wide association approaches in this ethnicity. The aim of this study was to identify genes associated with T2DM in the African American population. We performed a Genome Wide Association Study (GWAS) using the Affymetrix 6.0 array in 965 African-American cases with T2DM and end-stage renal disease (T2DM-ESRD) and 1029 population-based controls. The most significant SNPs (n = 550 independent loci) were genotyped in a replication cohort and 122 SNPs (n = 98 independent loci) were further tested through genotyping three additional validation cohorts followed by meta-analysis in all five cohorts totaling 3,132 cases and 3,317 controls. Twelve SNPs had evidence of association in the GWAS (P<0.0071), were directionally consistent in the Replication cohort and were associated with T2DM in subjects without nephropathy (P<0.05). Meta-analysis in all cases and controls revealed a single SNP reaching genome-wide significance (P<2.5×10(-8)). SNP rs7560163 (P = 7.0×10(-9), OR (95% CI) = 0.75 (0.67-0.84)) is located intergenically between RND3 and RBM43. Four additional loci (rs7542900, rs4659485, rs2722769 and rs7107217) were associated with T2DM (P<0.05) and reached more nominal levels of significance (P<2.5×10(-5)) in the overall analysis and may represent novel loci that contribute to T2DM. We have identified novel T2DM-susceptibility variants in the African-American population. Notably, T2DM risk was associated with the major allele and implies an interesting genetic architecture in this population. These results suggest that multiple loci underlie T2DM susceptibility in the African-American population and that these loci are distinct from those identified in other ethnic populations.

  17. A Genome-Wide Association Search for Type 2 Diabetes Genes in African Americans

    PubMed Central

    Palmer, Nicholette D.; McDonough, Caitrin W.; Hicks, Pamela J.; Roh, Bong H.; Wing, Maria R.; An, S. Sandy; Hester, Jessica M.; Cooke, Jessica N.; Bostrom, Meredith A.; Rudock, Megan E.; Talbert, Matthew E.; Lewis, Joshua P.; Ferrara, Assiamira; Lu, Lingyi; Ziegler, Julie T.; Sale, Michele M.; Divers, Jasmin; Shriner, Daniel; Adeyemo, Adebowale; Rotimi, Charles N.; Ng, Maggie C. Y.; Langefeld, Carl D.; Freedman, Barry I.; Bowden, Donald W.

    2012-01-01

    African Americans are disproportionately affected by type 2 diabetes (T2DM) yet few studies have examined T2DM using genome-wide association approaches in this ethnicity. The aim of this study was to identify genes associated with T2DM in the African American population. We performed a Genome Wide Association Study (GWAS) using the Affymetrix 6.0 array in 965 African-American cases with T2DM and end-stage renal disease (T2DM-ESRD) and 1029 population-based controls. The most significant SNPs (n = 550 independent loci) were genotyped in a replication cohort and 122 SNPs (n = 98 independent loci) were further tested through genotyping three additional validation cohorts followed by meta-analysis in all five cohorts totaling 3,132 cases and 3,317 controls. Twelve SNPs had evidence of association in the GWAS (P<0.0071), were directionally consistent in the Replication cohort and were associated with T2DM in subjects without nephropathy (P<0.05). Meta-analysis in all cases and controls revealed a single SNP reaching genome-wide significance (P<2.5×10−8). SNP rs7560163 (P = 7.0×10−9, OR (95% CI) = 0.75 (0.67–0.84)) is located intergenically between RND3 and RBM43. Four additional loci (rs7542900, rs4659485, rs2722769 and rs7107217) were associated with T2DM (P<0.05) and reached more nominal levels of significance (P<2.5×10−5) in the overall analysis and may represent novel loci that contribute to T2DM. We have identified novel T2DM-susceptibility variants in the African-American population. Notably, T2DM risk was associated with the major allele and implies an interesting genetic architecture in this population. These results suggest that multiple loci underlie T2DM susceptibility in the African-American population and that these loci are distinct from those identified in other ethnic populations. PMID:22238593

  18. Toward allotetraploid cotton genome assembly: integration of a high-density molecular genetic linkage map with DNA sequence information

    PubMed Central

    2012-01-01

    Background Cotton is the world’s most important natural textile fiber and a significant oilseed crop. Decoding cotton genomes will provide the ultimate reference and resource for research and utilization of the species. Integration of high-density genetic maps with genomic sequence information will largely accelerate the process of whole-genome assembly in cotton. Results In this paper, we update a high-density interspecific genetic linkage map of allotetraploid cultivated cotton. An additional 1,167 marker loci have been added to our previously published map of 2,247 loci. Three new marker types, InDel (insertion-deletion) and SNP (single nucleotide polymorphism) developed from gene information, and REMAP (retrotransposon-microsatellite amplified polymorphism), were used to increase map density. The updated map consists of 3,414 loci in 26 linkage groups covering 3,667.62 cM with an average inter-locus distance of 1.08 cM. Furthermore, genome-wide sequence analysis was finished using 3,324 informative sequence-based markers and publicly-available Gossypium DNA sequence information. A total of 413,113 EST and 195 BAC sequences were physically anchored and clustered by 3,324 sequence-based markers. Of these, 14,243 ESTs and 188 BACs from different species of Gossypium were clustered and specifically anchored to the high-density genetic map. A total of 2,748 candidate unigenes from 2,111 ESTs clusters and 63 BACs were mined for functional annotation and classification. The 337 ESTs/genes related to fiber quality traits were integrated with 132 previously reported cotton fiber quality quantitative trait loci, which demonstrated the important roles in fiber quality of these genes. Higher-level sequence conservation between different cotton species and between the A- and D-subgenomes in tetraploid cotton was found, indicating a common evolutionary origin for orthologous and paralogous loci in Gossypium. Conclusion This study will serve as a valuable genomic resource
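
    The map statistics quoted in this abstract can be checked with a few lines of arithmetic. The sketch below is only an illustration: the abstract does not say how the average inter-locus distance was computed, so the code assumes it is the total map length divided by the number of intervals between adjacent loci, with each linkage group of k loci contributing k - 1 intervals. The variable names are made up for this example.

```python
# Figures quoted in the abstract above.
previous_loci = 2247      # loci on the previously published map
added_loci = 1167         # marker loci added in the update
linkage_groups = 26
map_length_cm = 3667.62   # total map length in centimorgans

total_loci = previous_loci + added_loci

# Assumption (not stated in the abstract): the average is taken over the
# intervals between adjacent loci, and a linkage group with k loci has
# k - 1 such intervals, so the map has total_loci - linkage_groups intervals.
intervals = total_loci - linkage_groups
mean_spacing_cm = map_length_cm / intervals

print(total_loci)
print(round(mean_spacing_cm, 2))
```

    Under that assumption the numbers agree with the reported 3,414 loci and 1.08 cM average spacing.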

  19. SNP Assay Development for Linkage Map Construction, Anchoring Whole-Genome Sequence, and Other Genetic and Genomic Applications in Common Bean

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Song, Qijian; Jia, Gaofeng; Hyten, David L.

    A total of 992,682 single-nucleotide polymorphisms (SNPs) was identified as ideal for Illumina Infinium II BeadChip design after sequencing a diverse set of 17 common bean (Phaseolus vulgaris L) varieties with the aid of next-generation sequencing technology. From these, two BeadChips each with >5000 SNPs were designed. The BARCBean6K_1 BeadChip was selected for the purpose of optimizing polymorphism among market classes and, when possible, SNPs were targeted to sequence scaffolds in the Phaseolus vulgaris 14× genome assembly with sequence lengths >10 kb. The BARCBean6K_2 BeadChip was designed with the objective of anchoring additional scaffolds and to facilitate orientation of large scaffolds. Analysis of 267 F2 plants from a cross of varieties Stampede × Red Hawk with the two BeadChips resulted in linkage maps with a total of 7040 markers including 7015 SNPs. With the linkage map, a total of 432.3 Mb of sequence from 2766 scaffolds was anchored to create the Phaseolus vulgaris v1.0 assembly, which accounted for approximately 89% of the 487 Mb of available sequence scaffolds of the Phaseolus vulgaris v0.9 assembly. A core set of 6000 SNPs (BARCBean6K_3 BeadChip) with high genotyping quality and polymorphism was selected based on the genotyping of 365 dry bean and 134 snap bean accessions with the BARCBean6K_1 and BARCBean6K_2 BeadChips. The BARCBean6K_3 BeadChip is a useful tool for genetics and genomics research and it is widely used by breeders and geneticists in the United States and abroad.

  20. SNP Assay Development for Linkage Map Construction, Anchoring Whole-Genome Sequence, and Other Genetic and Genomic Applications in Common Bean.

    PubMed

    Song, Qijian; Jia, Gaofeng; Hyten, David L; Jenkins, Jerry; Hwang, Eun-Young; Schroeder, Steven G; Osorno, Juan M; Schmutz, Jeremy; Jackson, Scott A; McClean, Phillip E; Cregan, Perry B

    2015-08-28

    A total of 992,682 single-nucleotide polymorphisms (SNPs) was identified as ideal for Illumina Infinium II BeadChip design after sequencing a diverse set of 17 common bean (Phaseolus vulgaris L) varieties with the aid of next-generation sequencing technology. From these, two BeadChips each with >5000 SNPs were designed. The BARCBean6K_1 BeadChip was selected for the purpose of optimizing polymorphism among market classes and, when possible, SNPs were targeted to sequence scaffolds in the Phaseolus vulgaris 14× genome assembly with sequence lengths >10 kb. The BARCBean6K_2 BeadChip was designed with the objective of anchoring additional scaffolds and to facilitate orientation of large scaffolds. Analysis of 267 F2 plants from a cross of varieties Stampede × Red Hawk with the two BeadChips resulted in linkage maps with a total of 7040 markers including 7015 SNPs. With the linkage map, a total of 432.3 Mb of sequence from 2766 scaffolds was anchored to create the Phaseolus vulgaris v1.0 assembly, which accounted for approximately 89% of the 487 Mb of available sequence scaffolds of the Phaseolus vulgaris v0.9 assembly. A core set of 6000 SNPs (BARCBean6K_3 BeadChip) with high genotyping quality and polymorphism was selected based on the genotyping of 365 dry bean and 134 snap bean accessions with the BARCBean6K_1 and BARCBean6K_2 BeadChips. The BARCBean6K_3 BeadChip is a useful tool for genetics and genomics research and it is widely used by breeders and geneticists in the United States and abroad. Copyright © 2015 Song et al.

  1. SNP Assay Development for Linkage Map Construction, Anchoring Whole-Genome Sequence, and Other Genetic and Genomic Applications in Common Bean

    DOE PAGES

    Song, Qijian; Jia, Gaofeng; Hyten, David L.; ...

    2015-08-28

    A total of 992,682 single-nucleotide polymorphisms (SNPs) was identified as ideal for Illumina Infinium II BeadChip design after sequencing a diverse set of 17 common bean (Phaseolus vulgaris L) varieties with the aid of next-generation sequencing technology. From these, two BeadChips each with >5000 SNPs were designed. The BARCBean6K_1 BeadChip was selected for the purpose of optimizing polymorphism among market classes and, when possible, SNPs were targeted to sequence scaffolds in the Phaseolus vulgaris 14× genome assembly with sequence lengths >10 kb. The BARCBean6K_2 BeadChip was designed with the objective of anchoring additional scaffolds and to facilitate orientation of large scaffolds. Analysis of 267 F2 plants from a cross of varieties Stampede × Red Hawk with the two BeadChips resulted in linkage maps with a total of 7040 markers including 7015 SNPs. With the linkage map, a total of 432.3 Mb of sequence from 2766 scaffolds was anchored to create the Phaseolus vulgaris v1.0 assembly, which accounted for approximately 89% of the 487 Mb of available sequence scaffolds of the Phaseolus vulgaris v0.9 assembly. A core set of 6000 SNPs (BARCBean6K_3 BeadChip) with high genotyping quality and polymorphism was selected based on the genotyping of 365 dry bean and 134 snap bean accessions with the BARCBean6K_1 and BARCBean6K_2 BeadChips. The BARCBean6K_3 BeadChip is a useful tool for genetics and genomics research and it is widely used by breeders and geneticists in the United States and abroad.

  2. High-density linkage mapping aided by transcriptomics documents ZW sex determination system in the Chinese mitten crab Eriocheir sinensis

    PubMed Central

    Cui, Z; Hui, M; Liu, Y; Song, C; Li, X; Li, Y; Liu, L; Shi, G; Wang, S; Li, F; Zhang, X; Liu, C; Xiang, J; Chu, K H

    2015-01-01

    The sex determination system in crabs is believed to be XY-XX from karyotypy, but centromeres could not be identified in some chromosomes and their morphology is not completely clear. Using quantitative trait locus mapping of the gender phenotype, we revealed a ZW-ZZ sex determination system in Eriocheir sinensis and presented a high-density linkage map covering ~98.5% of the genome, with 73 linkage groups corresponding to the haploid chromosome number. All sex-linked markers in the family we used were located on a single linkage group, LG60, and sex linkage was confirmed by genome-wide association studies (GWAS). Forty-six markers detected by GWAS were heterozygous and segregated only in the female parent. The female LG60 was thus the putative W chromosome, with the homologous male LG60 as the Z chromosome. The putative Z and W sex chromosomes were identical in size and carried many homologous loci. Sex ratio (5:1) skewing towards females in induced triploids using unrelated animals also supported a ZW-ZZ system. Transcriptome data were used to search for candidate sex-determining loci, but only one LG60 gene was identified as an ankyrin-2 gene. Double sex- and mab3-related transcription factor 1 (Dmrt1), a Z-linked gene in birds, was located on a putative autosome. With complete genome sequencing and transcriptomic data, more genes on putative sex chromosomes will be characterised, thus leading towards a comprehensive understanding of the sex determination and differentiation mechanisms of E. sinensis, and decapod crustaceans in general. PMID:25873149

  3. Quantitative trait locus linkage analysis in a large Amish pedigree identifies novel candidate loci for erythrocyte traits

    PubMed Central

    Hinckley, Jesse D; Abbott, Diana; Burns, Trudy L; Heiman, Meadow; Shapiro, Amy D; Wang, Kai; Di Paola, Jorge

    2013-01-01

    We characterized a large Amish pedigree and, in 384 pedigree members, analyzed the genetic variance components with covariate screen as well as genome-wide quantitative trait locus (QTL) linkage analysis of red blood cell count (RBC), hemoglobin (HB), hematocrit (HCT), mean corpuscular volume (MCV), mean corpuscular hemoglobin (MCH), mean corpuscular hemoglobin concentration (MCHC), red cell distribution width (RDW), platelet count (PLT), and white blood cell count (WBC) using SOLAR. Age and gender were found to be significant covariates in many CBC traits. We obtained significant heritability estimates for RBC, MCV, MCH, MCHC, RDW, PLT, and WBC. We report four candidate loci with Logarithm of the odds (LOD) scores above 2.0: 6q25 (MCH), 9q33 (WBC), 10p12 (RDW), and 20q13 (MCV). We also report eleven candidate loci with LOD scores between 1.5 and <2.0. Bivariate linkage analysis of MCV and MCH on chromosome 20 resulted in a higher maximum LOD score of 3.14. Linkage signals on chromosomes 4q28, 6p22, 6q25, and 20q13 are concomitant with previously reported QTL. All other linkage signals reported herein represent novel evidence of candidate QTL. Interestingly rs1800562, the most common causal variant of hereditary hemochromatosis in HFE (6p22) was associated with MCH and MCHC in this family. Linkage studies like the one presented here will allow investigators to focus the search for rare variants amidst the noise encountered in the large amounts of data generated by whole-genome sequencing. PMID:24058921

  4. Genome-wide linkage and positional candidate gene study of blood pressure response to dietary potassium intervention: the genetic epidemiology network of salt sensitivity study.

    PubMed

    Kelly, Tanika N; Hixson, James E; Rao, Dabeeru C; Mei, Hao; Rice, Treva K; Jaquish, Cashell E; Shimmin, Lawrence C; Schwander, Karen; Chen, Chung-Shuian; Liu, Depei; Chen, Jichun; Bormans, Concetta; Shukla, Pramila; Farhana, Naveed; Stuart, Colin; Whelton, Paul K; He, Jiang; Gu, Dongfeng

    2010-12-01

    Genetic determinants of blood pressure (BP) response to potassium, or potassium sensitivity, are largely unknown. We conducted a genome-wide linkage scan and positional candidate gene analysis to identify genetic determinants of potassium sensitivity. A total of 1906 Han Chinese participants took part in a 7-day high-sodium diet followed by a 7-day high-sodium plus potassium dietary intervention. BP measurements were obtained at baseline and after each intervention using a random-zero sphygmomanometer. Significant linkage signals (logarithm of odds [LOD] score, >3) for BP responses to potassium were detected at chromosomal regions 3q24-q26.1, 3q28, and 11q22.3-q24.3. Maximum multipoint LOD scores of 3.09 at 3q25.2 and 3.41 at 11q23.3 were observed for absolute diastolic BP (DBP) and mean arterial pressure (MAP) responses, respectively. Linkage peaks of 3.56 at 3q25.1 and 3.01 at 11q23.3 for percent DBP response and 3.22 at 3q25.2, 3.01 at 3q28, and 4.48 at 11q23.3 for percent MAP response also were identified. Angiotensin II receptor, type 1 (AGTR1), single-nucleotide polymorphism rs16860760 in the 3q24-q26.1 region was significantly associated with absolute and percent systolic BP responses to potassium (P=0.0008 and P=0.0006, respectively). Absolute (95% CI) systolic BP responses for genotypes C/C, C/T, and T/T were -3.71 (-4.02 to -3.40), -2.62 (-3.38 to -1.85), and 1.03 (-3.73 to 5.79) mm Hg, respectively, and percent responses (95% CI) were -3.07 (-3.33 to -2.80), -2.07 (-2.74 to -1.41), and 0.90 (-3.20 to 4.99), respectively. Similar trends were observed for DBP and MAP responses. Genetic regions on chromosomes 3 and 11 may harbor important susceptibility loci for potassium sensitivity. Furthermore, the AGTR1 gene was a significant predictor of BP responses to potassium intake.

  5. Chromosome 9p21 in Amyotrophic Lateral Sclerosis in Finland: A Genome-Wide Association Study

    PubMed Central

    Laaksovirta, Hannu; Peuralinna, Terhi; Schymick, Jennifer C.; Scholz, Sonja W.; Lai, Shaoi-Lin; Myllykangas, Liisa; Sulkava, Raimo; Jansson, Lilja; Hernandez, Dena G.; Gibbs, J. Raphael; Nalls, Michael A.; Heckerman, David; Tienari, Pentti J.; Traynor, Bryan J.

    2010-01-01

    Introduction The genetic etiology of amyotrophic lateral sclerosis (ALS) is not well understood. Finland is a well-suited location for a genome-wide association study of ALS, as the incidence of the disease is one of the highest in the world, and because the genetic homogeneity of the Finnish population enhances the ability to detect risk loci. Methods We performed a genome-wide association study of 442 Finnish patients diagnosed with ALS, and 521 Finnish control subjects using Illumina genome-wide genotyping arrays. DNA was collected from patients attending an ALS specialty clinic that receives referrals from neurologists throughout Finland, whereas the control samples were obtained from a population-based study of elderly Finnish individuals. Individuals known to carry D90A alleles of the SOD1 gene (n = 40) were included in the final analysis as positive controls to determine if our GWAS was able to detect an association signal at this locus. Findings We identified two association peaks that exceeded genome-wide significance. One of these was located on chromosome 21q22 (rs13048019, p = 2·58×10−8) that corresponded to the known autosomal recessive D90A allele of the SOD1 gene. The other was detected in a 232kb block of linkage disequilibrium (rs3849942, p = 9·11×10−11) in a region of chromosome 9p that has been previously identified by linkage studies of ALS families. Within this region, we defined a 42-SNP haplotype that significantly increased risk of developing ALS (p = 4·2×10−33 among familial cases, odds ratio = 21·0, 95% CI = 11·2–39·1), and which overlapped with an association locus recently reported for fronto-temporal dementia (FTD). Based on the 93 familial ALS cases included in the analysis, population attributable risk percent for the chromosome 9p21 locus was 37.9% (95% CI, 27·7 – 48·1%), and for D90A homozygosity was 25·5% (95% CI, 16·9 – 34·1%). Interpretation In summary, we present evidence that the chromosome 9p21 ALS

  6. Genome-wide analysis of genetic susceptibility to language impairment in an isolated Chilean population

    PubMed Central

    Villanueva, Pia; Newbury, Dianne F; Jara, Lilian; De Barbieri, Zulema; Mirza, Ghazala; Palomino, Hernán M; Fernández, María Angélica; Cazier, Jean-Baptiste; Monaco, Anthony P; Palomino, Hernán

    2011-01-01

    Specific language impairment (SLI) is an unexpected deficit in the acquisition of language skills and affects between 5 and 8% of pre-school children. Despite its prevalence and high heritability, our understanding of the aetiology of this disorder is only emerging. In this paper, we apply genome-wide techniques to investigate an isolated Chilean population who exhibit an increased frequency of SLI. Loss of heterozygosity (LOH) mapping and parametric and non-parametric linkage analyses indicate that complex genetic factors are likely to underlie susceptibility to SLI in this population. Across all analyses performed, the most consistently implicated locus was on chromosome 7q. This locus achieved highly significant linkage under all three non-parametric models (max NPL=6.73, P=4.0 × 10−11). In addition, it yielded a HLOD of 1.24 in the recessive parametric linkage analyses and contained a segment that was homozygous in two affected individuals. Further, investigation of this region identified a two-SNP haplotype that occurs at an increased frequency in language-impaired individuals (P=0.008). We hypothesise that the linkage regions identified here, in particular that on chromosome 7, may contain variants that underlie the high prevalence of SLI observed in this isolated population and may be of relevance to other populations affected by language impairments. PMID:21248734

  7. Memory management in genome-wide association studies

    PubMed Central

    2009-01-01

    Genome-wide association is a powerful tool for the identification of genes that underlie common diseases. Genome-wide association studies generate billions of genotypes and pose significant computational challenges for most users including limited computer memory. We applied a recently developed memory management tool to two analyses of North American Rheumatoid Arthritis Consortium studies and measured the performance in terms of central processing unit and memory usage. We conclude that our memory management approach is simple, efficient, and effective for genome-wide association studies. PMID:20018047

  8. Quantitative Linkage for Autism Spectrum Disorders Symptoms in Attention-Deficit/Hyperactivity Disorder: Significant Locus on Chromosome 7q11

    ERIC Educational Resources Information Center

    Nijmeijer, Judith S.; Arias-Vásquez, Alejandro; Rommelse, Nanda N.; Altink, Marieke E.; Buschgens, Cathelijne J.; Fliers, Ellen A.; Franke, Barbara; Minderaa, Ruud B.; Sergeant, Joseph A.; Buitelaar, Jan K.; Hoekstra, Pieter J.; Hartman, Catharina A.

    2014-01-01

    We studied 261 ADHD probands and 354 of their siblings to assess quantitative trait loci associated with autism spectrum disorder symptoms (as measured by the Children's Social Behavior Questionnaire (CSBQ)) using a genome-wide linkage approach, followed by locus-wide association analysis. A genome-wide significant locus for the CSBQ subscale…

  9. Genome-wide Association Study Identifies African-Specific Susceptibility Loci in African Americans with Inflammatory Bowel Disease

    PubMed Central

    Brant, Steven R.; Okou, David T.; Simpson, Claire L.; Cutler, David J.; Haritunians, Talin; Bradfield, Jonathan P.; Chopra, Pankaj; Prince, Jarod; Begum, Ferdouse; Kumar, Archana; Huang, Chengrui; Venkateswaran, Suresh; Datta, Lisa W.; Wei, Zhi; Thomas, Kelly; Herrinton, Lisa J.; Klapproth, Jan-Micheal A.; Quiros, Antonio J.; Seminerio, Jenifer; Liu, Zhenqiu; Alexander, Jonathan S.; Baldassano, Robert N.; Dudley-Brown, Sharon; Cross, Raymond K.; Dassopoulos, Themistocles; Denson, Lee A.; Dhere, Tanvi A.; Dryden, Gerald W.; Hanson, John S.; Hou, Jason K.; Hussain, Sunny Z.; Hyams, Jeffrey S.; Isaacs, Kim L.; Kader, Howard; Kappelman, Michael D.; Katz, Jeffry; Kellermayer, Richard; Kirschner, Barbara S.; Kuemmerle, John F.; Kwon, John H.; Lazarev, Mark; Li, Ellen; Mack, David; Mannon, Peter; Moulton, Dedrick E.; Newberry, Rodney D.; Osuntokun, Bankole O.; Patel, Ashish S.; Saeed, Shehzad A.; Targan, Stephan R.; Valentine, John F.; Wang, Ming-Hsi; Zonca, Martin; Rioux, John D.; Duerr, Richard H.; Silverberg, Mark S.; Cho, Judy H.; Hakonarson, Hakon; Zwick, Michael E.; McGovern, Dermot P.B.; Kugathasan, Subra

    2016-01-01

    Background & Aims The inflammatory bowel diseases (IBD) ulcerative colitis (UC) and Crohn’s disease (CD) cause significant morbidity and are increasing in prevalence among all populations, including African Americans. More than 200 susceptibility loci have been identified in populations of predominantly European ancestry, but few loci have been associated with IBD in other ethnicities. Methods We performed 2 high-density, genome-wide scans comprising 2345 cases of African Americans with IBD (1646 with CD, 583 with UC, and 116 inflammatory bowel disease unclassified [IBD-U]) and 5002 individuals without IBD (controls, identified from the Health Retirement Study and Kaiser Permanente database). Single-nucleotide polymorphisms (SNPs) associated at P<5.0×10−8 in meta-analysis with nominal evidence (P<.05) in each scan were considered to have genome-wide significance. Results We detected SNPs at HLA-DRB1, and African-specific SNPs at ZNF649 and LSAMP, with associations of genome-wide significance for UC. We detected SNPs at USP25 with associations of genome-wide significance for IBD. No associations of genome-wide significance were detected for CD. In addition, 9 genes previously associated with IBD contained SNPs with significant evidence for replication (P<1.6×10−6): ADCY3, CXCR6, HLA-DRB1 to HLA-DQA1 (genome-wide significance on conditioning), IL12B, PTGER4, and TNC for IBD; IL23R, PTGER4, and SNX20 (in strong linkage disequilibrium with NOD2) for CD; and KCNQ2 (near TNFRSF6B) for UC. Several of these genes, such as TNC (near TNFSF15), CXCR6, and genes associated with IBD at the HLA locus, contained SNPs with unique association patterns with African-specific alleles. Conclusions We performed a genome-wide association study of African Americans with IBD and identified loci associated with CD and UC in only this population; we also replicated loci identified in European populations. The detection of variants associated with IBD risk in only

  10. Genome-Wide Association Study Identifies African-Specific Susceptibility Loci in African Americans With Inflammatory Bowel Disease.

    PubMed

    Brant, Steven R; Okou, David T; Simpson, Claire L; Cutler, David J; Haritunians, Talin; Bradfield, Jonathan P; Chopra, Pankaj; Prince, Jarod; Begum, Ferdouse; Kumar, Archana; Huang, Chengrui; Venkateswaran, Suresh; Datta, Lisa W; Wei, Zhi; Thomas, Kelly; Herrinton, Lisa J; Klapproth, Jan-Micheal A; Quiros, Antonio J; Seminerio, Jenifer; Liu, Zhenqiu; Alexander, Jonathan S; Baldassano, Robert N; Dudley-Brown, Sharon; Cross, Raymond K; Dassopoulos, Themistocles; Denson, Lee A; Dhere, Tanvi A; Dryden, Gerald W; Hanson, John S; Hou, Jason K; Hussain, Sunny Z; Hyams, Jeffrey S; Isaacs, Kim L; Kader, Howard; Kappelman, Michael D; Katz, Jeffry; Kellermayer, Richard; Kirschner, Barbara S; Kuemmerle, John F; Kwon, John H; Lazarev, Mark; Li, Ellen; Mack, David; Mannon, Peter; Moulton, Dedrick E; Newberry, Rodney D; Osuntokun, Bankole O; Patel, Ashish S; Saeed, Shehzad A; Targan, Stephan R; Valentine, John F; Wang, Ming-Hsi; Zonca, Martin; Rioux, John D; Duerr, Richard H; Silverberg, Mark S; Cho, Judy H; Hakonarson, Hakon; Zwick, Michael E; McGovern, Dermot P B; Kugathasan, Subra

    2017-01-01

    The inflammatory bowel diseases (IBD) ulcerative colitis (UC) and Crohn's disease (CD) cause significant morbidity and are increasing in prevalence among all populations, including African Americans. More than 200 susceptibility loci have been identified in populations of predominantly European ancestry, but few loci have been associated with IBD in other ethnicities. We performed 2 high-density, genome-wide scans comprising 2345 cases of African Americans with IBD (1646 with CD, 583 with UC, and 116 inflammatory bowel disease unclassified) and 5002 individuals without IBD (controls, identified from the Health Retirement Study and Kaiser Permanente database). Single-nucleotide polymorphisms (SNPs) associated at P < 5.0 × 10−8 in meta-analysis with nominal evidence (P < .05) in each scan were considered to have genome-wide significance. We detected SNPs at HLA-DRB1, and African-specific SNPs at ZNF649 and LSAMP, with associations of genome-wide significance for UC. We detected SNPs at USP25 with associations of genome-wide significance for IBD. No associations of genome-wide significance were detected for CD. In addition, 9 genes previously associated with IBD contained SNPs with significant evidence for replication (P < 1.6 × 10−6): ADCY3, CXCR6, HLA-DRB1 to HLA-DQA1 (genome-wide significance on conditioning), IL12B, PTGER4, and TNC for IBD; IL23R, PTGER4, and SNX20 (in strong linkage disequilibrium with NOD2) for CD; and KCNQ2 (near TNFRSF6B) for UC. Several of these genes, such as TNC (near TNFSF15), CXCR6, and genes associated with IBD at the HLA locus, contained SNPs with unique association patterns with African-specific alleles. We performed a genome-wide association study of African Americans with IBD and identified loci associated with UC in only this population; we also replicated IBD, CD, and UC loci identified in European populations. The detection of variants associated with IBD risk in only people of African descent demonstrates the
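
    The two-part significance rule described in this abstract (a meta-analysis P below 5.0 × 10−8 together with nominal evidence, P below .05, in each scan) can be written as a small predicate. This is a minimal sketch, not the authors' pipeline: the function name, its parameters, and the p-values in the usage lines are all hypothetical and chosen only to illustrate the rule.

```python
from typing import Iterable


def is_genome_wide_significant(
    p_meta: float,
    scan_p_values: Iterable[float],
    meta_threshold: float = 5.0e-8,   # threshold quoted in the abstract
    scan_threshold: float = 0.05,     # nominal threshold quoted in the abstract
) -> bool:
    """Return True if a SNP meets both parts of the rule.

    The SNP must pass the meta-analysis threshold AND show nominal
    evidence in every individual scan.
    """
    return p_meta < meta_threshold and all(
        p < scan_threshold for p in scan_p_values
    )


# Hypothetical p-values, for illustration only (not taken from the study).
print(is_genome_wide_significant(3.0e-9, [1.0e-4, 2.0e-3]))  # both scans nominal
print(is_genome_wide_significant(3.0e-9, [1.0e-4, 0.20]))    # one scan fails
print(is_genome_wide_significant(1.0e-6, [0.01, 0.01]))      # meta-analysis fails
```

    Requiring nominal evidence in each scan guards against a meta-analysis signal that is driven by a single data set.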

  11. A Saturated Genetic Linkage Map of Autotetraploid Alfalfa (Medicago sativa L.) Developed Using Genotyping-by-Sequencing Is Highly Syntenous with the Medicago truncatula Genome

    PubMed Central

    Li, Xuehui; Wei, Yanling; Acharya, Ananta; Jiang, Qingzhen; Kang, Junmei; Brummer, E. Charles

    2014-01-01

    A genetic linkage map is a valuable tool for quantitative trait locus mapping, map-based gene cloning, comparative mapping, and whole-genome assembly. Alfalfa, one of the most important forage crops in the world, is autotetraploid, allogamous, and highly heterozygous, characteristics that have impeded the construction of a high-density linkage map using traditional genetic marker systems. Using genotyping-by-sequencing (GBS), we constructed low-cost, reasonably high-density linkage maps for both maternal and paternal parental genomes of an autotetraploid alfalfa F1 population. The resulting maps contain 3591 single-nucleotide polymorphism markers on 64 linkage groups across both parents, with an average density of one marker per 1.5 and 1.0 cM for the maternal and paternal haplotype maps, respectively. Chromosome assignments were made based on homology of markers to the M. truncatula genome. Four linkage groups representing the four haplotypes of each alfalfa chromosome were assigned to each of the eight Medicago chromosomes in both the maternal and paternal parents. The alfalfa linkage groups were highly syntenous with M. truncatula, and clearly identified the known translocation between Chromosomes 4 and 8. In addition, a small inversion on Chromosome 1 was identified between M. truncatula and M. sativa. GBS enabled us to develop a saturated linkage map for alfalfa that greatly improved genome coverage relative to previous maps and that will facilitate investigation of genome structure. GBS could be used in breeding populations to accelerate molecular breeding in alfalfa. PMID:25147192

  12. Family genome browser: visualizing genomes with pedigree information.

    PubMed

    Juan, Liran; Liu, Yongzhuang; Wang, Yongtian; Teng, Mingxiang; Zang, Tianyi; Wang, Yadong

    2015-07-15

    Families with inherited diseases are widely used in Mendelian/complex disease studies. Owing to the advances in high-throughput sequencing technologies, family genome sequencing is becoming more and more prevalent. Visualizing family genomes can greatly facilitate human genetics studies and personalized medicine. However, due to the complex genetic relationships and high similarities among genomes of consanguineous family members, family genomes are difficult to visualize in traditional genome visualization frameworks. How to visualize the family genome variants and their functions with integrated pedigree information remains a critical challenge. We developed the Family Genome Browser (FGB) to provide comprehensive analysis and visualization for family genomes. The FGB can visualize family genomes at both the individual level and the variant level effectively, through integrating genome data with pedigree information. Family genome analysis, including determination of parental origin of the variants, detection of de novo mutations, identification of potential recombination events and identical-by-descent segments, etc., can be performed flexibly. Diverse annotations for the family genome variants, such as dbSNP memberships, linkage disequilibriums, genes, variant effects, potential phenotypes, etc., are illustrated as well. Moreover, the FGB can automatically search for de novo mutations and compound heterozygous variants for a selected individual, and guide investigators to find high-risk genes with flexible navigation options. These features enable users to investigate and understand family genomes intuitively and systematically. The FGB is available at http://mlg.hit.edu.cn/FGB/. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  13. Sex-specific linkage scans in opioid dependence.

    PubMed

    Yang, Bao-Zhu; Han, Shizhong; Kranzler, Henry R; Palmer, Abraham A; Gelernter, Joel

    2017-04-01

    Sex influences risk for opioid dependence (OD). We hypothesized that sex might interact with genetic loci that influence the risk for OD. Therefore we performed an analysis to identify sex-specific genomic susceptibility regions for OD using linkage. Over 6,000 single nucleotide polymorphism (SNP) markers were genotyped for 1,758 African- and European-American (AA and EA) individuals from 739 families, ascertained via affected sib-pairs with OD and/or cocaine dependence. Autosome-wide non-parametric linkage scans, stratified by sex and population, were performed. We identified one significant linkage region, segregating with OD in EA men, at 71.1 cM on chromosome 4 (LOD = 3.29; point-wise P = 0.00005; empirical autosome-wide P = 0.042), which significantly differed from the linkage signal at the same location in EA women (empirical P = 0.002). Three suggestive linkage signals were identified at 181.3 cM on chromosome 7 (LOD = 2.18), 104 cM on chromosome 11 (LOD = 1.85), and 60.9 cM on chromosome 16 (LOD = 1.93) in EA women. In AA men, four suggestive linkage signals were detected at 201.1 cM on chromosome 3 (LOD = 2.32), 152.9 cM on chromosome 6 (LOD = 1.86), 16.8 cM on chromosome 7 (LOD = 1.95), and 36.1 cM on chromosome 17 (LOD = 1.99). The significant region, mapping to 4q12-4q13.1, harbors several OD candidate genes with interconnected functionality, including VEGFR, CLOCK, PDCL2, NMU, NRSF, and IGFBP7. In conclusion, these results provide evidence for the existence of sex-specific and population-specific differences in OD. Furthermore, these results provide positional information that will facilitate the use of targeted next-generation sequencing to search for genes that contribute to sex-specific differences in OD. © 2016 Wiley Periodicals, Inc.
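
    The relationship between the reported LOD score and its point-wise P value can be illustrated with a short calculation. This is a minimal sketch, not the authors' procedure: it assumes the common large-sample approximation in which 2·ln(10)·LOD follows a 50:50 mixture of zero and a chi-square distribution with 1 degree of freedom under no linkage. The function name is hypothetical.

```python
import math


def lod_to_pointwise_p(lod: float) -> float:
    """Approximate one-sided point-wise p-value for a LOD score.

    Assumption (not stated in the abstract): under no linkage,
    2*ln(10)*LOD is a 50:50 mixture of 0 and chi-square with 1 df.
    """
    chi2 = 2.0 * math.log(10.0) * lod
    # For 1 df, P(chi2_1 > x) = erfc(sqrt(x / 2)); halve it for the mixture.
    return 0.5 * math.erfc(math.sqrt(chi2 / 2.0))


# LOD reported for the chromosome 4 region in EA men.
p = lod_to_pointwise_p(3.29)
print(f"LOD 3.29 -> approximate point-wise p = {p:.1e}")
```

    Under this assumption the result is close to 5e-5, in line with the point-wise P = 0.00005 quoted above. The empirical autosome-wide P value cannot be reproduced this way, because it depends on simulation across the whole scan.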

  14. Linkage disequilibrium and signatures of positive selection around LINE-1 retrotransposons in the human genome.

    PubMed

    Kuhn, Alexandre; Ong, Yao Min; Cheng, Ching-Yu; Wong, Tien Yin; Quake, Stephen R; Burkholder, William F

    2014-06-03

    Insertions of the human-specific subfamily of LINE-1 (L1) retrotransposon are highly polymorphic across individuals and can critically influence the human transcriptome. We hypothesized that L1 insertions could represent genetic variants determining important human phenotypic traits, and performed an integrated analysis of L1 elements and single nucleotide polymorphisms (SNPs) in several human populations. We found that a large fraction of L1s were in high linkage disequilibrium with their surrounding genomic regions and that they were well tagged by SNPs. However, L1 variants were only partially captured by SNPs on standard SNP arrays, so that their potential phenotypic impact would be frequently missed by SNP array-based genome-wide association studies. We next identified potential phenotypic effects of L1s by looking for signatures of natural selection linked to L1 insertions; significant extended haplotype homozygosity was detected around several L1 insertions. This finding suggests that some of these L1 insertions may have been the target of recent positive selection.
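
    To make the notion of an L1 insertion being "well tagged" by a SNP concrete, the sketch below computes the standard pairwise linkage disequilibrium measure r² from phased haplotypes. It is an illustrative calculation only: the 0/1 coding, the function name and the toy haplotypes are assumptions, not data or code from the study.

```python
def ld_r2(hap_a, hap_b):
    """r^2 between two biallelic loci from phased haplotypes coded 0/1.

    Here 1 could mean "L1 insertion present" at the first locus and
    "alternate SNP allele" at the second (hypothetical coding).
    """
    n = len(hap_a)
    if n == 0 or n != len(hap_b):
        raise ValueError("haplotype vectors must be non-empty and equal length")
    p_a = sum(hap_a) / n                      # frequency of allele 1 at locus A
    p_b = sum(hap_b) / n                      # frequency of allele 1 at locus B
    p_ab = sum(a & b for a, b in zip(hap_a, hap_b)) / n  # joint frequency
    d = p_ab - p_a * p_b                      # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined for a monomorphic locus")
    return d * d / denom


# Toy data: eight haplotypes, made up for illustration.
l1_presence = [1, 1, 0, 0, 1, 0, 0, 0]
snp_allele = [1, 1, 0, 0, 1, 0, 0, 1]
print(round(ld_r2(l1_presence, snp_allele), 3))
```

    A SNP would typically be called a good tag for the insertion when r² exceeds some chosen cutoff; the cutoff itself is a design choice that the abstract does not specify.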

  15. Haplotype-Based Genome-Wide Prediction Models Exploit Local Epistatic Interactions Among Markers

    PubMed Central

    Jiang, Yong; Schmidt, Renate H.; Reif, Jochen C.

    2018-01-01

    Genome-wide prediction approaches represent versatile tools for the analysis and prediction of complex traits. Mostly they rely on marker-based information, but scenarios have been reported in which models capitalizing on closely-linked markers that were combined into haplotypes outperformed marker-based models. Detailed comparisons were undertaken to reveal under which circumstances haplotype-based genome-wide prediction models are superior to marker-based models. Specifically, it was of interest to analyze whether and how haplotype-based models may take local epistatic effects between markers into account. Assuming that populations consisted of fully homozygous individuals, a marker-based model in which local epistatic effects inside haplotype blocks were exploited (LEGBLUP) was linearly transformable into a haplotype-based model (HGBLUP). This theoretical derivation formally revealed that haplotype-based genome-wide prediction models capitalize on local epistatic effects among markers. Simulation studies corroborated this finding. Due to its computational efficiency the HGBLUP model promises to be an interesting tool for studies in which ultra-high-density SNP data sets are studied. Applying the HGBLUP model to empirical data sets revealed higher prediction accuracies than for marker-based models for both traits studied using a mouse panel. In contrast, only a small subset of the traits analyzed in crop populations showed such a benefit. Cases in which higher prediction accuracies are observed for HGBLUP than for marker-based models are expected to be of immediate relevance for breeders: due to the tight linkage, a beneficial haplotype will be preserved for many generations. In this respect, the inheritance of local epistatic effects closely resembles that of additive effects. PMID:29549092
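
    The idea that a haplotype-based model carries information about interactions among markers inside a block can be sketched with a simple recoding step. The code below is not the authors' HGBLUP implementation; it only shows, under the abstract's assumption of fully homozygous individuals, how a 0/1 marker matrix might be converted into one indicator column per distinct haplotype per block. The block boundaries, function name and toy data are hypothetical.

```python
def haplotype_design(markers, blocks):
    """Recode markers into haplotype indicator columns.

    markers: one list of 0/1 alleles per (fully homozygous) individual.
    blocks:  (start, end) marker index ranges, end exclusive (assumed given).
    Returns (labels, rows): labels are (block_index, haplotype) pairs and
    rows hold one 0/1 indicator per label for each individual.
    """
    labels = []
    for b, (start, end) in enumerate(blocks):
        distinct = sorted({tuple(ind[start:end]) for ind in markers})
        labels.extend((b, hap) for hap in distinct)
    column = {label: j for j, label in enumerate(labels)}

    rows = []
    for ind in markers:
        row = [0] * len(labels)
        for b, (start, end) in enumerate(blocks):
            # Each individual carries exactly one haplotype per block.
            row[column[(b, tuple(ind[start:end]))]] = 1
        rows.append(row)
    return labels, rows


# Toy example: four inbred lines, four markers, two blocks of two markers.
toy_markers = [[0, 0, 1, 1], [0, 1, 1, 1], [0, 0, 0, 1], [1, 1, 1, 1]]
toy_blocks = [(0, 2), (2, 4)]
labels, rows = haplotype_design(toy_markers, toy_blocks)
for label, col in zip(labels, zip(*rows)):
    print(label, col)
```

    Because each haplotype indicator is equivalent to a product of marker-allele indicators within the block, an effect assigned to a haplotype column can absorb both additive and within-block interaction contributions, which is the intuition behind the equivalence described in the abstract.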

  16. Haplotype-Based Genome-Wide Prediction Models Exploit Local Epistatic Interactions Among Markers.

    PubMed

    Jiang, Yong; Schmidt, Renate H; Reif, Jochen C

    2018-05-04

    Genome-wide prediction approaches represent versatile tools for the analysis and prediction of complex traits. Mostly they rely on marker-based information, but scenarios have been reported in which models capitalizing on closely-linked markers that were combined into haplotypes outperformed marker-based models. Detailed comparisons were undertaken to reveal under which circumstances haplotype-based genome-wide prediction models are superior to marker-based models. Specifically, it was of interest to analyze whether and how haplotype-based models may take local epistatic effects between markers into account. Assuming that populations consisted of fully homozygous individuals, a marker-based model in which local epistatic effects inside haplotype blocks were exploited (LEGBLUP) was linearly transformable into a haplotype-based model (HGBLUP). This theoretical derivation formally revealed that haplotype-based genome-wide prediction models capitalize on local epistatic effects among markers. Simulation studies corroborated this finding. Due to its computational efficiency the HGBLUP model promises to be an interesting tool for studies in which ultra-high-density SNP data sets are studied. Applying the HGBLUP model to empirical data sets revealed higher prediction accuracies than for marker-based models for both traits studied using a mouse panel. In contrast, only a small subset of the traits analyzed in crop populations showed such a benefit. Cases in which higher prediction accuracies are observed for HGBLUP than for marker-based models are expected to be of immediate relevance for breeders: due to the tight linkage, a beneficial haplotype will be preserved for many generations. In this respect, the inheritance of local epistatic effects closely resembles that of additive effects. Copyright © 2018 Jiang et al.

  17. Linkage mapping of beta 2 EEG waves via non-parametric regression.

    PubMed

    Ghosh, Saurabh; Begleiter, Henri; Porjesz, Bernice; Chorlian, David B; Edenberg, Howard J; Foroud, Tatiana; Goate, Alison; Reich, Theodore

    2003-04-01

    Parametric linkage methods for analyzing quantitative trait loci are sensitive to violations in trait distributional assumptions. Non-parametric methods are relatively more robust. In this article, we modify the non-parametric regression procedure proposed by Ghosh and Majumder [2000: Am J Hum Genet 66:1046-1061] to map Beta 2 EEG waves using genome-wide data generated in the COGA project. Significant linkage findings are obtained on chromosomes 1, 4, 5, and 15 with findings at multiple regions on chromosomes 4 and 15. We analyze the data both with and without incorporating alcoholism as a covariate. We also test for epistatic interactions between regions of the genome exhibiting significant linkage with the EEG phenotypes and find evidence of epistatic interactions between a region each on chromosome 1 and chromosome 4 with one region on chromosome 15. While regressing out the effect of alcoholism does not affect the linkage findings, the epistatic interactions become statistically insignificant. Copyright 2003 Wiley-Liss, Inc.

  18. Genome-wide association study reveals novel variants for growth and egg traits in Dongxiang blue-shelled and White Leghorn chickens.

    PubMed

    Liao, R; Zhang, X; Chen, Q; Wang, Z; Wang, Q; Yang, C; Pan, Y

    2016-10-01

    This study was designed to investigate the genetic basis of growth and egg traits in Dongxiang blue-shelled chickens and White Leghorn chickens. In this study, we employed a reduced representation sequencing approach called genotyping by genome reducing and sequencing to detect genome-wide SNPs in 252 Dongxiang blue-shelled chickens and 252 White Leghorn chickens. The Dongxiang blue-shelled chicken breed has many specific traits and is characterized by blue-shelled eggs, black plumage, black skin, black bone and black organs. The White Leghorn chicken is an egg-type breed with high productivity. As multibreed genome-wide association studies (GWASs) can improve precision due to less linkage disequilibrium across breeds, a multibreed GWAS was performed with 156 575 SNPs to identify the associated variants underlying growth and egg traits within the two chicken breeds. The analysis revealed 32 SNPs exhibiting a significant genome-wide association with growth and egg traits. Some of the significant SNPs are located in genes that are known to impact growth and egg traits, but nearly half of the significant SNPs are located in genes with unclear functions in chickens. To our knowledge, this is the first multibreed genome-wide report for the genetics of growth and egg traits in the Dongxiang blue-shelled and White Leghorn chickens. © 2016 Stichting International Foundation for Animal Genetics.

  19. Genome-Wide Identification of Molecular Mimicry Candidates in Parasites

    PubMed Central

    Ludin, Philipp; Nilsson, Daniel; Mäser, Pascal

    2011-01-01

    Among the many strategies employed by parasites for immune evasion and host manipulation, one of the most fascinating is molecular mimicry. With genome sequences available for host and parasite, mimicry of linear amino acid epitopes can be investigated by comparative genomics. Here we developed an in silico pipeline for genome-wide identification of molecular mimicry candidate proteins or epitopes. The predicted proteome of a given parasite was broken down into overlapping fragments, each of which was screened for close hits in the human proteome. Control searches were carried out against unrelated, free-living eukaryotes to eliminate the generally conserved proteins, and with randomized versions of the parasite proteins to get an estimate of statistical significance. This simple but computation-intensive approach yielded interesting candidates from human-pathogenic parasites. From Plasmodium falciparum, it returned a 14 amino acid motif in several of the PfEMP1 variants identical to part of the heparin-binding domain in the immunosuppressive serum protein vitronectin. In Brugia malayi, fragments were detected that matched periphilin-1, a protein of cell-cell junctions involved in barrier formation. All the results are publicly available by means of mimicDB, a searchable online database for molecular mimicry candidates from pathogens. To our knowledge, this is the first genome-wide survey for molecular mimicry proteins in parasites. The strategy can be adapted to any pair of host and pathogen, once appropriate negative control organisms are chosen. MimicDB provides a host of new starting points to gain insights into the molecular nature of host-pathogen interactions. PMID:21408160
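
    The fragment-and-screen strategy described above can be outlined in a few lines. This is a deliberately simplified sketch rather than the published pipeline: it uses exact k-mer matching where the study screened for "close hits", it omits the randomized-sequence significance control, and every sequence, name and parameter below is made up for illustration.

```python
def overlapping_fragments(protein, k, step=1):
    """Yield (start, fragment) for every length-k window of a protein."""
    for start in range(0, len(protein) - k + 1, step):
        yield start, protein[start:start + k]


def kmer_set(proteome, k):
    """All length-k fragments found anywhere in a proteome (name -> sequence)."""
    return {frag for seq in proteome.values()
            for _, frag in overlapping_fragments(seq, k)}


def mimicry_candidates(parasite, host, controls, k):
    """Parasite fragments present in the host but absent from controls.

    Exact matching is an assumption made to keep the sketch short.
    """
    host_kmers = kmer_set(host, k)
    control_kmers = kmer_set(controls, k)
    hits = []
    for name, seq in parasite.items():
        for start, frag in overlapping_fragments(seq, k):
            if frag in host_kmers and frag not in control_kmers:
                hits.append((name, start, frag))
    return hits


# Hypothetical toy proteomes (not real sequences).
parasite = {"parasite_prot_1": "MKTAYIAKQRQISFVKSHFSRQ"}
host = {"host_prot_1": "GGSAYIAKQRQISFVKAA"}
controls = {"control_prot_1": "PPKQRQISFVKPP"}

hits = mimicry_candidates(parasite, host, controls, k=8)
for name, start, frag in hits:
    print(name, start, frag)
```

    Subtracting fragments that also occur in free-living control organisms is what removes generally conserved stretches, so that only host-specific similarities remain as candidates.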

  20. Educational Attainment: A Genome Wide Association Study in 9538 Australians

    PubMed Central

    Martin, Nicolas W.; Medland, Sarah E.; Verweij, Karin J. H.; Lee, S. Hong; Nyholt, Dale R.; Madden, Pamela A.; Heath, Andrew C.; Montgomery, Grant W.; Wright, Margaret J.; Martin, Nicholas G.

    2011-01-01

    Background: Correlations between Educational Attainment (EA) and measures of cognitive performance are as high as 0.8. This makes EA an attractive alternative phenotype for studies wishing to map genes affecting cognition due to the ease of collecting EA data compared to other cognitive phenotypes such as IQ. Methodology: In an Australian family sample of 9538 individuals we performed a genome-wide association scan (GWAS) using the imputed genotypes of ∼2.4 million single nucleotide polymorphisms (SNPs) for a 6-point scale measure of EA. Top hits were checked for replication in an independent sample of 968 individuals. A gene-based test of association was then applied to the GWAS results. Additionally we performed prediction analyses using the GWAS results from our discovery sample to assess the percentage of EA and full scale IQ variance explained by the predicted scores. Results: The best SNP fell short of having a genome-wide significant p-value (p = 9.77×10⁻⁷). In our independent replication sample six SNPs among the top 50 hits pruned for linkage disequilibrium (r² < 0.8) had a p-value < 0.05 but only one of these SNPs survived correction for multiple testing: rs7106258 (p = 9.7×10⁻⁴), located in an intergenic region of chromosome 11q14.1. The gene-based test results were non-significant and our prediction analyses show that the predicted scores explained little variance in EA in our replication sample. Conclusion: While we have identified a polymorphism on chromosome 11q14.1 associated with EA, further replication is warranted. Overall, the absence of genome-wide significant p-values in our large discovery sample confirmed the highly polygenic architecture of EA. Only the assembly of large samples or meta-analytic efforts will be able to assess the implication of common DNA polymorphisms in the etiology of EA. PMID:21694764
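
    The two significance statements in this abstract can be checked with simple arithmetic. The sketch below assumes a Bonferroni correction over the 50 LD-pruned top hits and the conventional genome-wide threshold of 5×10⁻⁸; the abstract names neither, so both are labeled assumptions, and the variable names are hypothetical.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


# Assumption: conventional genome-wide significance level for GWAS.
GENOME_WIDE_ALPHA = 5e-8
best_discovery_p = 9.77e-7          # best SNP in the discovery sample
discovery_significant = best_discovery_p < GENOME_WIDE_ALPHA

# Assumption: Bonferroni correction across the 50 replicated top hits.
replication_threshold = bonferroni_threshold(0.05, 50)
p_rs7106258 = 9.7e-4                # replication p-value for rs7106258
replication_survives = p_rs7106258 < replication_threshold

print("discovery genome-wide significant:", discovery_significant)
print("replication threshold:", replication_threshold)
print("rs7106258 survives correction:", replication_survives)
```

    Under these assumptions the replication threshold is 0.05 / 50 = 0.001, which rs7106258 narrowly passes, while the best discovery p-value is roughly twenty times larger than the genome-wide level. Both outcomes agree with the wording of the Results.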

  1. A saturated genetic linkage map of autotetraploid alfalfa (Medicago sativa L.) developed using genotyping-by-sequencing is highly syntenous with the Medicago truncatula genome.

    PubMed

    Li, Xuehui; Wei, Yanling; Acharya, Ananta; Jiang, Qingzhen; Kang, Junmei; Brummer, E Charles

    2014-08-21

    A genetic linkage map is a valuable tool for quantitative trait locus mapping, map-based gene cloning, comparative mapping, and whole-genome assembly. Alfalfa, one of the most important forage crops in the world, is autotetraploid, allogamous, and highly heterozygous, characteristics that have impeded the construction of a high-density linkage map using traditional genetic marker systems. Using genotyping-by-sequencing (GBS), we constructed low-cost, reasonably high-density linkage maps for both maternal and paternal parental genomes of an autotetraploid alfalfa F1 population. The resulting maps contain 3591 single-nucleotide polymorphism markers on 64 linkage groups across both parents, with an average density of one marker per 1.5 and 1.0 cM for the maternal and paternal haplotype maps, respectively. Chromosome assignments were made based on homology of markers to the M. truncatula genome. Four linkage groups representing the four haplotypes of each alfalfa chromosome were assigned to each of the eight Medicago chromosomes in both the maternal and paternal parents. The alfalfa linkage groups were highly syntenous with M. truncatula, and clearly identified the known translocation between Chromosomes 4 and 8. In addition, a small inversion on Chromosome 1 was identified between M. truncatula and M. sativa. GBS enabled us to develop a saturated linkage map for alfalfa that greatly improved genome coverage relative to previous maps and that will facilitate investigation of genome structure. GBS could be used in breeding populations to accelerate molecular breeding in alfalfa. Copyright © 2014 Li et al.

  2. Sample reproducibility of genetic association using different multimarker TDTs in genome-wide association studies: characterization and a new approach.

    PubMed

    Abad-Grau, Mara M; Medina-Medina, Nuria; Montes-Soldado, Rosana; Matesanz, Fuencisla; Bafna, Vineet

    2012-01-01

    Multimarker Transmission/Disequilibrium Tests (TDTs) are association tests that are very robust to population admixture and structure and may be used to identify susceptibility loci in genome-wide association studies. Multimarker TDTs using several markers may increase power by capturing high-degree associations. However, there is also a risk of spurious associations and power reduction due to the increase in degrees of freedom. In this study we show that associations found by tests built on simple null hypotheses are highly reproducible in a second independent data set regardless of the number of markers. As a test exhibiting this feature to its maximum, we introduce the multimarker 2-Groups TDT (mTDT(2G)), a test which, under the hypothesis of no linkage, asymptotically follows a χ² distribution with 1 degree of freedom regardless of the number of markers. The statistic requires the division of parental haplotypes into two groups: disease susceptibility and disease protective haplotype groups. We assessed the test behavior by performing an extensive simulation study as well as a real-data study using several data sets of two complex diseases. We show that the mTDT(2G) test is highly efficient and achieves the highest power among all the tests used, even when the null hypothesis is tested in a second independent data set. Therefore, mTDT(2G) turns out to be a very promising multimarker TDT for genome-wide searches for disease susceptibility loci, and may be used as a preprocessing step in the construction of more accurate genetic models to predict individual susceptibility to complex diseases.
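
    Why a two-group test keeps 1 degree of freedom can be seen from the classical single-marker TDT, which compares two transmission counts. The sketch below implements that classical McNemar-type statistic applied to counts from two haplotype groups; it is not the authors' mTDT(2G) statistic, the grouping of haplotypes into susceptibility and protective sets is assumed to be given, and the counts are invented for illustration.

```python
import math


def tdt_chi2(n_group1_transmitted: int, n_group2_transmitted: int) -> float:
    """Classical TDT statistic (b - c)^2 / (b + c).

    b and c count transmissions, from informative parents, of haplotypes
    in the first and second group respectively (hypothetical grouping).
    """
    total = n_group1_transmitted + n_group2_transmitted
    if total == 0:
        raise ValueError("no informative transmissions")
    return (n_group1_transmitted - n_group2_transmitted) ** 2 / total


def chi2_1df_pvalue(stat: float) -> float:
    """Upper-tail p-value of a chi-square distribution with 1 df."""
    return math.erfc(math.sqrt(stat / 2.0))


# Invented counts: 60 susceptibility-group vs 40 protective-group transmissions.
stat = tdt_chi2(60, 40)
p_value = chi2_1df_pvalue(stat)
print(f"chi-square = {stat:.2f}, p = {p_value:.4f}")
```

    Because any number of markers is collapsed into just two groups before counting, the comparison always involves a single contrast, which is consistent with the 1-degree-of-freedom property described in the abstract.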

  3. Employing genome-wide SNP discovery and genotyping strategy to extrapolate the natural allelic diversity and domestication patterns in chickpea

    PubMed Central

    Kujur, Alice; Bajaj, Deepak; Upadhyaya, Hari D.; Das, Shouvik; Ranjan, Rajeev; Shree, Tanima; Saxena, Maneesha S.; Badoni, Saurabh; Kumar, Vinod; Tripathi, Shailesh; Gowda, C. L. L.; Sharma, Shivali; Singh, Sube; Tyagi, Akhilesh K.; Parida, Swarup K.

    2015-01-01

    The genome-wide discovery and high-throughput genotyping of SNPs in chickpea natural germplasm lines is indispensable to extrapolate their natural allelic diversity, domestication, and linkage disequilibrium (LD) patterns leading to the genetic enhancement of this vital legume crop. We discovered 44,844 high-quality SNPs by sequencing of 93 diverse cultivated desi, kabuli, and wild chickpea accessions using reference genome- and de novo-based GBS (genotyping-by-sequencing) assays that were physically mapped across eight chromosomes of desi and kabuli. Of these, 22,542 SNPs were structurally annotated in different coding and non-coding sequence components of genes. Genes with 3296 non-synonymous and 269 regulatory SNPs could functionally differentiate accessions based on their contrasting agronomic traits. A high experimental validation success rate (92%) and reproducibility (100%) along with strong sensitivity (93–96%) and specificity (99%) of GBS-based SNPs was observed. This infers the robustness of GBS as a high-throughput assay for rapid large-scale mining and genotyping of genome-wide SNPs in chickpea with sub-optimal use of resources. With 23,798 genome-wide SNPs, a relatively high intra-specific polymorphic potential (49.5%) and broader molecular diversity (13–89%)/functional allelic diversity (18–77%) was apparent among 93 chickpea accessions, suggesting their tremendous applicability in rapid selection of desirable diverse accessions/inter-specific hybrids in chickpea crossbred varietal improvement program. The genome-wide SNPs revealed complex admixed domestication pattern, extensive LD estimates (0.54–0.68) and extended LD decay (400–500 kb) in a structured population inclusive of 93 accessions. These findings reflect the utility of our identified SNPs for subsequent genome-wide association study (GWAS) and selective sweep-based domestication trait dissection analysis to identify potential genomic loci (gene-associated targets) specifically

  4. Combining cow and bull reference populations to increase accuracy of genomic prediction and genome-wide association studies.

    PubMed

    Calus, M P L; de Haas, Y; Veerkamp, R F

    2013-10-01

    Genomic selection holds the promise to be particularly beneficial for traits that are difficult or expensive to measure, such that access to phenotypes on large daughter groups of bulls is limited. Instead, cow reference populations can be generated, potentially supplemented with existing information from the same or (highly) correlated traits available on bull reference populations. The objective of this study, therefore, was to develop a model to perform genomic predictions and genome-wide association studies based on a combined cow and bull reference data set, with the accuracy of the phenotypes differing between the cow and bull genomic selection reference populations. The developed bivariate Bayesian stochastic search variable selection model allowed for an unbalanced design by imputing residuals in the residual updating scheme for all missing records. The performance of this model is demonstrated on a real data example, where the analyzed trait, being milk fat or protein yield, was either measured only on a cow or a bull reference population, or recorded on both. Our results were that the developed bivariate Bayesian stochastic search variable selection model was able to analyze 2 traits, even though animals had measurements on only 1 of 2 traits. The Bayesian stochastic search variable selection model yielded consistently higher accuracy for fat yield compared with a model without variable selection, both for the univariate and bivariate analyses, whereas the accuracy of both models was very similar for protein yield. The bivariate model identified several additional quantitative trait loci peaks compared with the single-trait models on either trait. In addition, the bivariate models showed a marginal increase in accuracy of genomic predictions for the cow traits (0.01-0.05), although a greater increase in accuracy is expected as the size of the bull population increases. Our results emphasize that the chosen value of priors in Bayesian genomic prediction

  5. Revealing misassembled segments in the bovine reference genome by high resolution linkage disequilibrium scan

    USDA-ARS's Scientific Manuscript database

    Misassembly signatures, created by shuffling the order of sequences while assembling a genome, can be easily seen by analyzing the unexpected behaviour of the linkage disequilibrium (LD) decay. A heuristic process was proposed to identify those misassembly signatures and presented the ones found in ...

  6. GWAMA: software for genome-wide association meta-analysis.

    PubMed

    Mägi, Reedik; Morris, Andrew P

    2010-05-28

    Despite the recent success of genome-wide association studies in identifying novel loci contributing effects to complex human traits, such as type 2 diabetes and obesity, much of the genetic component of variation in these phenotypes remains unexplained. One way to improve power to detect further novel loci is through meta-analysis of studies from the same population, increasing the sample size over any individual study. Although statistical software analysis packages incorporate routines for meta-analysis, they are ill equipped to meet the challenges of the scale and complexity of data generated in genome-wide association studies. We have developed flexible, open-source software for the meta-analysis of genome-wide association studies. The software incorporates a variety of error trapping facilities, and provides a range of meta-analysis summary statistics. The software is distributed with scripts that allow simple formatting of files containing the results of each association study and generate graphical summaries of genome-wide meta-analysis results. The GWAMA (Genome-Wide Association Meta-Analysis) software has been developed to perform meta-analysis of summary statistics generated from genome-wide association studies of dichotomous phenotypes or quantitative traits. Software with source files, documentation and example data files are freely available online at http://www.well.ox.ac.uk/GWAMA.
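
    As an illustration of what meta-analysis of summary statistics involves for a single variant, the sketch below computes a fixed-effect, inverse-variance weighted estimate from per-study effect sizes and standard errors. It is a generic textbook calculation, not GWAMA's code, options or file format; the function name and the input numbers are hypothetical.

```python
import math


def fixed_effect_meta(betas, ses):
    """Inverse-variance weighted fixed-effect meta-analysis for one variant.

    betas: per-study effect estimates (e.g. log odds ratios).
    ses:   matching standard errors.
    Returns (pooled_beta, pooled_se, z, two_sided_p).
    """
    if not betas or len(betas) != len(ses):
        raise ValueError("need equally many effects and standard errors")
    weights = [1.0 / se ** 2 for se in ses]
    w_sum = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / w_sum
    pooled_se = math.sqrt(1.0 / w_sum)
    z = pooled_beta / pooled_se
    # Two-sided normal p-value: erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled_beta, pooled_se, z, p


# Invented summary statistics for one SNP in two studies.
pooled_beta, pooled_se, z, p = fixed_effect_meta([0.10, 0.20], [0.05, 0.05])
print(f"beta = {pooled_beta:.3f}, se = {pooled_se:.4f}, z = {z:.2f}, p = {p:.1e}")
```

    Pooling shrinks the standard error relative to either study alone, which is the gain in power that motivates meta-analysis. A genome-wide tool would repeat this per variant after aligning alleles and strands across studies, steps that are omitted here.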

  7. Complementary genetic and genomic approaches help characterize the linkage group I seed protein QTL in soybean

    PubMed Central

    2010-01-01

    Background: The nutritional and economic value of many crops is effectively a function of seed protein and oil content. Insight into the genetic and molecular control mechanisms involved in the deposition of these constituents in the developing seed is needed to guide crop improvement. A quantitative trait locus (QTL) on Linkage Group I (LG I) of soybean (Glycine max (L.) Merrill) has a striking effect on seed protein content. Results: A soybean near-isogenic line (NIL) pair contrasting in seed protein and differing in an introgressed genomic segment containing the LG I protein QTL was used as a resource to demarcate the QTL region and to study variation in transcript abundance in developing seed. The LG I QTL region was delineated to less than 8.4 Mbp of genomic sequence on chromosome 20. Using Affymetrix® Soy GeneChip and high-throughput Illumina® whole transcriptome sequencing platforms, 13 genes displaying significant seed transcript accumulation differences between NILs were identified that mapped to the 8.4 Mbp LG I protein QTL region. Conclusions: This study identifies gene candidates at the LG I protein QTL for potential involvement in the regulation of protein content in the soybean seed. The results demonstrate the power of complementary approaches to characterize contrasting NILs and provide genome-wide transcriptome insight towards understanding seed biology and the soybean genome. PMID:20199683

  8. Linkage and related analyses of Barrett's esophagus and its associated adenocarcinomas.

    PubMed

    Sun, Xiangqing; Elston, Robert; Falk, Gary W; Grady, William M; Faulx, Ashley; Mittal, Sumeet K; Canto, Marcia I; Shaheen, Nicholas J; Wang, Jean S; Iyer, Prasad G; Abrams, Julian A; Willis, Joseph E; Guda, Kishore; Markowitz, Sanford; Barnholtz-Sloan, Jill S; Chandar, Apoorva; Brock, Wendy; Chak, Amitabh

    2016-07-01

    Familial aggregation and segregation analysis studies have provided evidence of a genetic basis for esophageal adenocarcinoma (EAC) and its premalignant precursor, Barrett's esophagus (BE). We aim to demonstrate the utility of linkage analysis to identify the genomic regions that might contain the genetic variants that predispose individuals to this complex trait (BE and EAC). We genotyped 144 individuals in 42 multiplex pedigrees chosen from 1000 singly ascertained BE/EAC pedigrees, and performed both model-based and model-free linkage analyses, using S.A.G.E. and other software. Segregation models were fitted, from the data on both the 42 pedigrees and the 1000 pedigrees, to determine parameters for performing model-based linkage analysis. Model-based and model-free linkage analyses were conducted in two sets of pedigrees: the 42 pedigrees and a subset of 18 pedigrees with female affected members that are expected to be more genetically homogeneous. Genome-wide associations were also tested in these families. Linkage analyses on the 42 pedigrees identified several regions consistently suggestive of linkage by different linkage analysis methods on chromosomes 2q31, 12q23, and 4p14. A linkage on 15q26 is the only consistent linkage region identified in the 18 female-affected pedigrees, in which the linkage signal is higher than in the 42 pedigrees. Other tentative linkage signals are also reported. Our linkage study of BE/EAC pedigrees identified linkage regions on chromosomes 2, 4, 12, and 15, with some reported associations located within our linkage peaks. Our linkage results can help prioritize association tests to delineate the genetic determinants underlying susceptibility to BE and EAC.

  9. Genome-wide and fine-resolution association analysis of malaria in West Africa.

    PubMed

    Jallow, Muminatou; Teo, Yik Ying; Small, Kerrin S; Rockett, Kirk A; Deloukas, Panos; Clark, Taane G; Kivinen, Katja; Bojang, Kalifa A; Conway, David J; Pinder, Margaret; Sirugo, Giorgio; Sisay-Joof, Fatou; Usen, Stanley; Auburn, Sarah; Bumpstead, Suzannah J; Campino, Susana; Coffey, Alison; Dunham, Andrew; Fry, Andrew E; Green, Angela; Gwilliam, Rhian; Hunt, Sarah E; Inouye, Michael; Jeffreys, Anna E; Mendy, Alieu; Palotie, Aarno; Potter, Simon; Ragoussis, Jiannis; Rogers, Jane; Rowlands, Kate; Somaskantharajah, Elilan; Whittaker, Pamela; Widden, Claire; Donnelly, Peter; Howie, Bryan; Marchini, Jonathan; Morris, Andrew; SanJoaquin, Miguel; Achidi, Eric Akum; Agbenyega, Tsiri; Allen, Angela; Amodu, Olukemi; Corran, Patrick; Djimde, Abdoulaye; Dolo, Amagana; Doumbo, Ogobara K; Drakeley, Chris; Dunstan, Sarah; Evans, Jennifer; Farrar, Jeremy; Fernando, Deepika; Hien, Tran Tinh; Horstmann, Rolf D; Ibrahim, Muntaser; Karunaweera, Nadira; Kokwaro, Gilbert; Koram, Kwadwo A; Lemnge, Martha; Makani, Julie; Marsh, Kevin; Michon, Pascal; Modiano, David; Molyneux, Malcolm E; Mueller, Ivo; Parker, Michael; Peshu, Norbert; Plowe, Christopher V; Puijalon, Odile; Reeder, John; Reyburn, Hugh; Riley, Eleanor M; Sakuntabhai, Anavaj; Singhasivanon, Pratap; Sirima, Sodiomon; Tall, Adama; Taylor, Terrie E; Thera, Mahamadou; Troye-Blomberg, Marita; Williams, Thomas N; Wilson, Michael; Kwiatkowski, Dominic P

    2009-06-01

    We report a genome-wide association (GWA) study of severe malaria in The Gambia. The initial GWA scan included 2,500 children genotyped on the Affymetrix 500K GeneChip, and a replication study included 3,400 children. We used this to examine the performance of GWA methods in Africa. We found considerable population stratification, and also that signals of association at known malaria resistance loci were greatly attenuated owing to weak linkage disequilibrium (LD). To investigate possible solutions to the problem of low LD, we focused on the HbS locus, sequencing this region of the genome in 62 Gambian individuals and then using these data to conduct multipoint imputation in the GWA samples. This increased the signal of association, from P = 4 × 10⁻⁷ to P = 4 × 10⁻¹⁴, with the peak of the signal located precisely at the HbS causal variant. Our findings provide proof of principle that fine-resolution multipoint imputation, based on population-specific sequencing data, can substantially boost authentic GWA signals and enable fine mapping of causal variants in African populations.

  10. Revealing phenotype-associated functional differences by genome-wide scan of ancient haplotype blocks

    PubMed Central

    Onuki, Ritsuko; Yamaguchi, Rui; Shibuya, Tetsuo; Kanehisa, Minoru; Goto, Susumu

    2017-01-01

    Genome-wide scans for positive selection have become important for genomic medicine, and many studies aim to find genomic regions affected by positive selection that are associated with risk allele variations among populations. Most such studies are designed to detect recent positive selection. However, we hypothesize that ancient positive selection is also important for adaptation to pathogens, and has affected current immune-mediated common diseases. Based on this hypothesis, we developed a novel linkage disequilibrium-based pipeline, which aims to detect regions associated with ancient positive selection across populations from single nucleotide polymorphism (SNP) data. By applying this pipeline to the genotypes in the International HapMap project database, we show that genes in the detected regions are enriched in pathways related to the immune system and infectious diseases. The detected regions also contain SNPs reported to be associated with cancers and metabolic diseases, obesity-related traits, type 2 diabetes, and allergic sensitization. These SNPs were further mapped to biological pathways to determine the associations between phenotypes and molecular functions. Assessments of candidate regions to identify functions associated with variations in incidence rates of these diseases are needed in the future. PMID:28445522

  11. Meta genome-wide network from functional linkages of genes in human gut microbial ecosystems.

    PubMed

    Ji, Yan; Shi, Yixiang; Wang, Chuan; Dai, Jianliang; Li, Yixue

    2013-03-01

    The human gut microbial ecosystem (HGME) exerts an important influence on human health. In recent research, meta-genomics has provided deep insights into the HGME in terms of gene content, metabolic processes and genome constitution of the meta-genome. Here we present a novel methodology to investigate the HGME on the basis of a set of functionally coupled genes, regardless of their genome origins, by considering the co-evolution properties of genes. By analyzing these coupled genes, we showed that some basic properties of the HGME are significantly associated with each other, and further constructed a protein interaction map of the human gut meta-genome to discover functional modules that may relate to essential metabolic processes. Compared with other studies, our method offers a new way to extract basic functional elements from meta-genome systems and to investigate complex microbial environments by associating their biological traits with the co-evolutionary fingerprints encoded in them.

  12. Genome-wide characterization of the WRKY gene family in radish (Raphanus sativus L.) reveals its critical functions under different abiotic stresses.

    PubMed

    Karanja, Bernard Kinuthia; Fan, Lianxue; Xu, Liang; Wang, Yan; Zhu, Xianwen; Tang, Mingjia; Wang, Ronghua; Zhang, Fei; Muleke, Everlyne M'mbone; Liu, Liwang

    2017-11-01

    The radish WRKY gene family was identified genome-wide and plays critical roles in response to multiple abiotic stresses. WRKY is among the largest families of transcription factors (TFs) associated with multiple biological activities for plant survival, including control of response mechanisms against abiotic stresses such as heat, salinity, and heavy metals. Radish is an important root vegetable crop and therefore characterization and expression pattern investigation of WRKY transcription factors in radish is imperative. In the present study, 126 putative WRKY genes were retrieved from the radish genome database. Protein sequence and annotation scrutiny confirmed that RsWRKY proteins possessed highly conserved domains and a zinc finger motif. Based on phylogenetic analysis results, RsWRKY candidate genes were divided into three groups (Group I, II and III) with 31, 74, and 20 members, respectively. Additionally, gene structure analysis revealed that intron-exon patterns of the WRKY genes are highly conserved in radish. Linkage map analysis indicated that RsWRKY genes were distributed with varying densities over nine linkage groups. Further, RT-qPCR analysis illustrated the significant variation of 36 RsWRKY genes under one or more abiotic stress treatments, implying that they might be stress-responsive genes. In total, 126 WRKY TFs were identified from the R. sativus genome, of which 35 showed abiotic stress-induced expression patterns. These results provide a genome-wide characterization of RsWRKY TFs and a baseline for further functional dissection and molecular evolution investigation, specifically for improving abiotic stress resistance with the ultimate goal of increasing the yield and quality of radish.

  13. Genome-Wide Linkage and Association Mapping of Halo Blight Resistance in Common Bean to Race 6 of the Globally Important Bacterial Pathogen

    PubMed Central

    Tock, Andrew J.; Fourie, Deidré; Walley, Peter G.; Holub, Eric B.; Soler, Alvaro; Cichy, Karen A.; Pastor-Corrales, Marcial A.; Song, Qijian; Porch, Timothy G.; Hart, John P.; Vasconcellos, Renato C. C.; Vicente, Joana G.; Barker, Guy C.; Miklas, Phillip N.

    2017-01-01

    Pseudomonas syringae pv. phaseolicola (Psph) Race 6 is a globally prevalent and broadly virulent bacterial pathogen with devastating impact causing halo blight of common bean (Phaseolus vulgaris L.). Common bean lines PI 150414 and CAL 143 are known sources of resistance against this pathogen. We constructed high-resolution linkage maps for three recombinant inbred populations to map resistance to Psph Race 6 derived from the two common bean lines. This was complemented with a genome-wide association study (GWAS) of Race 6 resistance in an Andean Diversity Panel of common bean. Race 6 resistance from PI 150414 maps to a single major-effect quantitative trait locus (QTL; HB4.2) on chromosome Pv04 and confers broad-spectrum resistance to eight other races of the pathogen. Resistance segregating in a Rojo × CAL 143 population maps to five chromosome arms and includes HB4.2. GWAS detected one QTL (HB5.1) on chromosome Pv05 for resistance to Race 6 with significant influence on seed yield. The same HB5.1 QTL, found in both Canadian Wonder × PI 150414 and Rojo × CAL 143 populations, was effective against Race 6 but lacks broad resistance. This study provides evidence for marker-assisted breeding for more durable halo blight control in common bean by combining alleles of race-nonspecific resistance (HB4.2 from PI 150414) and race-specific resistance (HB5.1 from cv. Rojo). PMID:28736566

  14. A saturated SSR/DArT linkage map of Musa acuminata addressing genome rearrangements among bananas.

    PubMed

    Hippolyte, Isabelle; Bakry, Frederic; Seguin, Marc; Gardes, Laetitia; Rivallan, Ronan; Risterucci, Ange-Marie; Jenny, Christophe; Perrier, Xavier; Carreel, Françoise; Argout, Xavier; Piffanelli, Pietro; Khan, Imtiaz A; Miller, Robert N G; Pappas, Georgios J; Mbéguié-A-Mbéguié, Didier; Matsumoto, Takashi; De Bernardinis, Veronique; Huttner, Eric; Kilian, Andrzej; Baurens, Franc-Christophe; D'Hont, Angélique; Cote, François; Courtois, Brigitte; Glaszmann, Jean-Christophe

    2010-04-13

    The genus Musa is a large species complex which includes cultivars at diploid and triploid levels. These sterile and vegetatively propagated cultivars are based on the A genome from Musa acuminata, exclusively for sweet bananas such as Cavendish, or associated with the B genome (Musa balbisiana) in cooking bananas such as Plantain varieties. In M. acuminata cultivars, structural heterozygosity is thought to be one of the main causes of sterility, which is essential for obtaining seedless fruits but hampers breeding. Only partial genetic maps are presently available due to chromosomal rearrangements within the parents of the mapping populations. This causes large segregation distortions inducing pseudo-linkages and difficulties in ordering markers in the linkage groups. The present study aims at producing a saturated linkage map of M. acuminata, taking into account hypotheses on the structural heterozygosity of the parents. An F1 progeny of 180 individuals was obtained from a cross between two genetically distant accessions of M. acuminata, 'Borneo' and 'Pisang Lilin' (P. Lilin). Based on the gametic recombination of each parent, two parental maps composed of SSR and DArT markers were established. A significant proportion of the markers (21.7%) deviated (p < 0.05) from the expected Mendelian ratios. These skewed markers were distributed in different linkage groups for each parent. To solve some complex ordering of the markers on linkage groups, we associated tools such as tree-like graphic representations, recombination frequency statistics and cytogenetical studies to identify structural rearrangements and build parsimonious linkage group order. An illustration of such an approach is given for the P. Lilin parent. We propose a synthetic map with 11 linkage groups containing 489 markers (167 SSRs and 322 DArTs) covering 1197 cM. This first saturated map is proposed as a "reference Musa map" for further analyses. We also propose two complete parental maps with

  15. Prospects of Fine-Mapping Trait-Associated Genomic Regions by Using Summary Statistics from Genome-wide Association Studies.

    PubMed

    Benner, Christian; Havulinna, Aki S; Järvelin, Marjo-Riitta; Salomaa, Veikko; Ripatti, Samuli; Pirinen, Matti

    2017-10-05

    During the past few years, various novel statistical methods have been developed for fine-mapping with the use of summary statistics from genome-wide association studies (GWASs). Although these approaches require information about the linkage disequilibrium (LD) between variants, there has not been a comprehensive evaluation of how estimation of the LD structure from reference genotype panels performs in comparison with that from the original individual-level GWAS data. Using population genotype data from Finland and the UK Biobank, we show here that a reference panel of 1,000 individuals from the target population is adequate for a GWAS cohort of up to 10,000 individuals, whereas smaller panels, such as those from the 1000 Genomes Project, should be avoided. We also show, both theoretically and empirically, that the size of the reference panel needs to scale with the GWAS sample size; this has important consequences for the application of these methods in ongoing GWAS meta-analyses and large biobank studies. We conclude by providing software tools and by recommending practices for sharing LD information to more efficiently exploit summary statistics in genetics research. Copyright © 2017 American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.

  16. Efficient analysis of large-scale genome-wide data with two R packages: bigstatsr and bigsnpr.

    PubMed

    Privé, Florian; Aschard, Hugues; Ziyatdinov, Andrey; Blum, Michael G B

    2017-03-30

    Genome-wide datasets produced for association studies have dramatically increased in size over the past few years, with modern datasets commonly including millions of variants measured in dozens of thousands of individuals. This increase in data size is a major challenge severely slowing down genomic analyses, leading to some software becoming obsolete and researchers having limited access to diverse analysis tools. Here we present two R packages, bigstatsr and bigsnpr, allowing for the analysis of large scale genomic data to be performed within R. To address large data size, the packages use memory-mapping for accessing data matrices stored on disk instead of in RAM. To perform data pre-processing and data analysis, the packages integrate most of the tools that are commonly used, either through transparent system calls to existing software, or through updated or improved implementation of existing methods. In particular, the packages implement fast and accurate computations of principal component analysis and association studies, functions to remove SNPs in linkage disequilibrium and algorithms to learn polygenic risk scores on millions of SNPs. We illustrate applications of the two R packages by analyzing a case-control genomic dataset for celiac disease, performing an association study and computing Polygenic Risk Scores. Finally, we demonstrate the scalability of the R packages by analyzing a simulated genome-wide dataset including 500,000 individuals and 1 million markers on a single desktop computer. https://privefl.github.io/bigstatsr/ & https://privefl.github.io/bigsnpr/. florian.prive@univ-grenoble-alpes.fr & michael.blum@univ-grenoble-alpes.fr. Supplementary materials are available at Bioinformatics online.
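
    The memory-mapping idea described in this abstract can be illustrated with a small sketch. This is not the bigstatsr or bigsnpr implementation (those are R packages); it is a minimal Python analogue that assumes a hypothetical on-disk layout of one byte per genotype (codes 0, 1, 2) stored row by row, with the helper names `write_genotypes` and `allele_frequencies` invented for illustration.

```python
import mmap
import os
import tempfile


def write_genotypes(path, rows):
    """Write a row-major matrix of 0/1/2 genotype codes, one byte per entry."""
    with open(path, "wb") as fh:
        for row in rows:
            fh.write(bytes(row))


def allele_frequencies(path, n_individuals, n_snps):
    """Compute the per-SNP alternate-allele frequency from a memory-mapped file."""
    freqs = []
    with open(path, "rb") as fh:
        with mmap.mmap(fh.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            for j in range(n_snps):
                # Strided slice picks every n_snps-th byte, i.e. column j of the matrix;
                # the operating system pages file contents in on demand.
                column = mm[j::n_snps]
                freqs.append(sum(column) / (2 * n_individuals))
    return freqs


if __name__ == "__main__":
    # Toy data (assumed for illustration): 4 individuals by 3 SNPs.
    toy = [[0, 1, 2], [1, 1, 0], [2, 0, 0], [1, 2, 0]]
    with tempfile.TemporaryDirectory() as tmp:
        matrix_path = os.path.join(tmp, "genotypes.bin")
        write_genotypes(matrix_path, toy)
        print(allele_frequencies(matrix_path, n_individuals=4, n_snps=3))
```

    The design point is that only the bytes touched by each column read need to be resident in RAM, which is the property the abstract relies on for very large matrices.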

  17. Development of a dense SNP-based linkage map of an apple rootstock progeny using the Malus Infinium whole genome genotyping array.

    PubMed

    Antanaviciute, Laima; Fernández-Fernández, Felicidad; Jansen, Johannes; Banchi, Elisa; Evans, Katherine M; Viola, Roberto; Velasco, Riccardo; Dunwell, Jim M; Troggio, Michela; Sargent, Daniel J

    2012-05-25

    A whole-genome genotyping array has previously been developed for Malus using SNP data from 28 Malus genotypes. This array offers the prospect of high throughput genotyping and linkage map development for any given Malus progeny. To test the applicability of the array for mapping in diverse Malus genotypes, we applied the array to the construction of a SNP-based linkage map of an apple rootstock progeny. Of the 7,867 Malus SNP markers on the array, 1,823 (23.2%) were heterozygous in one of the two parents of the progeny, 1,007 (12.8%) were heterozygous in both parental genotypes, whilst just 2.8% of the 921 Pyrus SNPs were heterozygous. A linkage map spanning 1,282.2 cM was produced comprising 2,272 SNP markers, 306 SSR markers and the S-locus. The length of the M432 linkage map was increased by 52.7 cM with the addition of the SNP markers, whilst marker density increased from 3.8 cM/marker to 0.5 cM/marker. Just three regions in excess of 10 cM remain where no markers were mapped. We compared the positions of the mapped SNP markers on the M432 map with their predicted positions on the 'Golden Delicious' genome sequence. A total of 311 markers (13.7% of all mapped markers) mapped to positions that conflicted with their predicted positions on the 'Golden Delicious' pseudo-chromosomes, indicating the presence of paralogous genomic regions or mis-assignments of genome sequence contigs during the assembly and anchoring of the genome sequence. We incorporated data for the 2,272 SNP markers onto the map of the M432 progeny and have presented the most complete and saturated map of the full 17 linkage groups of M. pumila to date. The data were generated rapidly in a high-throughput semi-automated pipeline, permitting significant savings in time and cost over linkage map construction using microsatellites. The application of the array will permit linkage maps to be developed for QTL analyses in a cost-effective manner, and the identification of SNPs that have been

  18. Development of a dense SNP-based linkage map of an apple rootstock progeny using the Malus Infinium whole genome genotyping array

    PubMed Central

    2012-01-01

    Background: A whole-genome genotyping array has previously been developed for Malus using SNP data from 28 Malus genotypes. This array offers the prospect of high throughput genotyping and linkage map development for any given Malus progeny. To test the applicability of the array for mapping in diverse Malus genotypes, we applied the array to the construction of a SNP-based linkage map of an apple rootstock progeny. Results: Of the 7,867 Malus SNP markers on the array, 1,823 (23.2%) were heterozygous in one of the two parents of the progeny, 1,007 (12.8%) were heterozygous in both parental genotypes, whilst just 2.8% of the 921 Pyrus SNPs were heterozygous. A linkage map spanning 1,282.2 cM was produced comprising 2,272 SNP markers, 306 SSR markers and the S-locus. The length of the M432 linkage map was increased by 52.7 cM with the addition of the SNP markers, whilst marker density increased from 3.8 cM/marker to 0.5 cM/marker. Just three regions in excess of 10 cM remain where no markers were mapped. We compared the positions of the mapped SNP markers on the M432 map with their predicted positions on the ‘Golden Delicious’ genome sequence. A total of 311 markers (13.7% of all mapped markers) mapped to positions that conflicted with their predicted positions on the ‘Golden Delicious’ pseudo-chromosomes, indicating the presence of paralogous genomic regions or mis-assignments of genome sequence contigs during the assembly and anchoring of the genome sequence. Conclusions: We incorporated data for the 2,272 SNP markers onto the map of the M432 progeny and have presented the most complete and saturated map of the full 17 linkage groups of M. pumila to date. The data were generated rapidly in a high-throughput semi-automated pipeline, permitting significant savings in time and cost over linkage map construction using microsatellites. The application of the array will permit linkage maps to be developed for QTL analyses in a cost-effective manner, and

  19. Systems genetics of obesity in an F2 pig model by genome-wide association, genetic network, and pathway analyses

    PubMed Central

    Kogelman, Lisette J. A.; Pant, Sameer D.; Fredholm, Merete; Kadarmideen, Haja N.

    2014-01-01

    Obesity is a complex condition with world-wide exponentially rising prevalence rates, linked with severe diseases like Type 2 Diabetes. Economic and welfare consequences have led to a raised interest in a better understanding of the biological and genetic background. To date, whole genome investigations focusing on single genetic variants have achieved limited success, and the importance of including genetic interactions is becoming evident. Here, the aim was to perform an integrative genomic analysis in an F2 pig resource population that was constructed with an aim to maximize genetic variation of obesity-related phenotypes and genotyped using the 60K SNP chip. Firstly, Genome Wide Association (GWA) analysis was performed on the Obesity Index to locate candidate genomic regions that were further validated using combined Linkage Disequilibrium Linkage Analysis and investigated by evaluation of haplotype blocks. We built Weighted Interaction SNP Hub (WISH) and differentially wired (DW) networks using genotypic correlations amongst obesity-associated SNPs resulting from GWA analysis. GWA results and SNP modules detected by WISH and DW analyses were further investigated by functional enrichment analyses. The functional annotation of SNPs revealed several genes associated with obesity, e.g., NPC2 and OR4D10. Moreover, gene enrichment analyses identified several significantly associated pathways, over and above the GWA study results, that may influence obesity and obesity related diseases, e.g., metabolic processes. WISH networks based on genotypic correlations allowed further identification of various gene ontology terms and pathways related to obesity and related traits, which were not identified by the GWA study. In conclusion, this is the first study to develop a (genetic) obesity index and employ systems genetics in a porcine model to provide important insights into the complex genetic architecture associated with obesity and many biological pathways that underlie

  20. Genome-Wide Meta-Analysis of Longitudinal Alcohol Consumption Across Youth and Early Adulthood.

    PubMed

    Adkins, Daniel E; Clark, Shaunna L; Copeland, William E; Kennedy, Martin; Conway, Kevin; Angold, Adrian; Maes, Hermine; Liu, Youfang; Kumar, Gaurav; Erkanli, Alaattin; Patkar, Ashwin A; Silberg, Judy; Brown, Tyson H; Fergusson, David M; Horwood, L John; Eaves, Lindon; van den Oord, Edwin J C G; Sullivan, Patrick F; Costello, E J

    2015-08-01

    The public health burden of alcohol is unevenly distributed across the life course, with levels of use, abuse, and dependence increasing across adolescence and peaking in early adulthood. Here, we leverage this temporal patterning to search for common genetic variants predicting developmental trajectories of alcohol consumption. Comparable psychiatric evaluations measuring alcohol consumption were collected in three longitudinal community samples (N=2,126, obs=12,166). Repeated consumption measurements spanning adolescence and early adulthood were analyzed using linear mixed models, estimating individual consumption trajectories, which were then tested for association with Illumina 660W-Quad genotype data (866,099 SNPs after imputation and QC). Association results were combined across samples using standard meta-analysis methods. Four meta-analysis associations satisfied our pre-determined genome-wide significance criterion (FDR<0.1) and six others met our 'suggestive' criterion (FDR<0.2). Genome-wide significant associations were highly biologically plausible, including associations within GABA transporter 1, SLC6A1 (solute carrier family 6, member 1), and exonic hits in LOC100129340 (mitofusin-1-like). Pathway analyses elaborated single marker results, indicating significantly enriched associations to intuitive biological mechanisms, including neurotransmission, xenobiotic pharmacodynamics, and nuclear hormone receptors (NHR). These findings underscore the value of combining longitudinal behavioral data and genome-wide genotype information in order to study developmental patterns and improve statistical power in genomic studies.
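
    The abstract applies false discovery rate (FDR) thresholds of 0.1 and 0.2 but does not say which FDR procedure was used. As one common possibility, the sketch below shows the Benjamini-Hochberg step-up rule; the function name and the toy p-values are assumptions for illustration and are not taken from the study.

```python
def benjamini_hochberg(p_values, fdr=0.1):
    """Return sorted indices of hypotheses rejected by the Benjamini-Hochberg rule."""
    m = len(p_values)
    # Indices of the tests ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k whose p-value satisfies p_(k) <= (k / m) * fdr.
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            cutoff = rank
    # Reject every hypothesis ranked at or below that cutoff.
    return sorted(order[:cutoff])


if __name__ == "__main__":
    toy_p = [0.001, 0.04, 0.03, 0.5, 0.2]  # hypothetical p-values
    print(benjamini_hochberg(toy_p, fdr=0.1))
```

    Unlike a fixed per-test threshold, the cutoff here adapts to how many small p-values are observed, which is why an FDR criterion can admit associations that a Bonferroni-style bound would not.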

  1. Genome-wide association analysis identifies a meningioma risk locus at 11p15.5.

    PubMed

    Claus, Elizabeth B; Cornish, Alex J; Broderick, Peter; Schildkraut, Joellen M; Dobbins, Sara E; Holroyd, Amy; Calvocoressi, Lisa; Lu, Lingeng; Hansen, Helen M; Smirnov, Ivan; Walsh, Kyle M; Schramm, Johannes; Hoffmann, Per; Nöthen, Markus M; Jöckel, Karl-Heinz; Swerdlow, Anthony; Larsen, Signe Benzon; Johansen, Christoffer; Simon, Matthias; Bondy, Melissa; Wrensch, Margaret; Houlston, Richard; Wiemels, Joseph L

    2018-05-12

    Meningiomas are adult brain tumors originating in the meningeal coverings of the brain and spinal cord, with a significant heritable basis. Genome-wide association studies (GWAS) have previously identified only a single risk locus for meningioma, at 10p12.31. To identify a susceptibility locus for meningioma, we conducted a meta-analysis of two GWAS, imputed using a merged reference panel of 1000 Genomes and UK10K data, with validation in two independent sample series totaling 2,138 cases and 12,081 controls. We identified a new susceptibility locus for meningioma at 11p15.5 (rs2686876, odds ratio = 1.44, P = 9.86 × 10^-9). A number of genes localize to the region of linkage disequilibrium encompassing rs2686876, including RIC8A, which plays a central role in the development of neural crest-derived structures, such as the meninges. This finding advances our understanding of the genetic basis of meningioma development and provides additional support for a polygenic model of meningioma.

  2. Genome-wide signatures of population bottlenecks and diversifying selection in European wolves

    PubMed Central

    Pilot, M; Greco, C; vonHoldt, B M; Jędrzejewska, B; Randi, E; Jędrzejewski, W; Sidorovich, V E; Ostrander, E A; Wayne, R K

    2014-01-01

    Genomic resources developed for domesticated species provide powerful tools for studying the evolutionary history of their wild relatives. Here we use 61K single-nucleotide polymorphisms (SNPs) evenly spaced throughout the canine nuclear genome to analyse evolutionary relationships among the three largest European populations of grey wolves in comparison with other populations worldwide, and investigate genome-wide effects of demographic bottlenecks and signatures of selection. European wolves have a discontinuous range, with large and connected populations in Eastern Europe and relatively smaller, isolated populations in Italy and the Iberian Peninsula. Our results suggest a continuous decline in wolf numbers in Europe since the Late Pleistocene, and long-term isolation and bottlenecks in the Italian and Iberian populations following their divergence from the Eastern European population. The Italian and Iberian populations have low genetic variability and high linkage disequilibrium, but relatively few autozygous segments across the genome. This last characteristic clearly distinguishes them from populations that underwent recent drastic demographic declines or founder events, and implies long-term bottlenecks in these two populations. Although genetic drift due to spatial isolation and bottlenecks seems to be a major evolutionary force diversifying the European populations, we detected 35 loci that are putatively under diversifying selection. Two of these loci flank the canine platelet-derived growth factor gene, which affects bone growth and may influence differences in body size between wolf populations. This study demonstrates the power of population genomics for identifying genetic signals of demographic bottlenecks and detecting signatures of directional selection in bottlenecked populations, despite their low background variability. PMID:24346500

  3. Reconstructing Roma History from Genome-Wide Data

    PubMed Central

    Moorjani, Priya; Patterson, Nick; Loh, Po-Ru; Lipson, Mark; Kisfali, Péter; Melegh, Bela I.; Bonin, Michael; Kádaši, Ľudevít; Rieß, Olaf; Berger, Bonnie; Reich, David; Melegh, Béla

    2013-01-01

    The Roma people, living throughout Europe and West Asia, are a diverse population linked by the Romani language and culture. Previous linguistic and genetic studies have suggested that the Roma migrated into Europe from South Asia about 1,000–1,500 years ago. Genetic inferences about Roma history have mostly focused on the Y chromosome and mitochondrial DNA. To explore what additional information can be learned from genome-wide data, we analyzed data from six Roma groups that we genotyped at hundreds of thousands of single nucleotide polymorphisms (SNPs). We estimate that the Roma harbor about 80% West Eurasian ancestry–derived from a combination of European and South Asian sources–and that the date of admixture of South Asian and European ancestry was about 850 years before present. We provide evidence for Eastern Europe being a major source of European ancestry, and North-west India being a major source of the South Asian ancestry in the Roma. By computing allele sharing as a measure of linkage disequilibrium, we estimate that the migration of Roma out of the Indian subcontinent was accompanied by a severe founder event, which appears to have been followed by a major demographic expansion after the arrival in Europe. PMID:23516520

  4. Genome‐wide linkage analysis of pulmonary function in families of children with asthma in Costa Rica

    PubMed Central

    Hersh, Craig P; Soto‐Quirós, Manuel E; Avila, Lydiana; Lake, Stephen L; Liang, Catherine; Fournier, Eduardo; Spesny, Mitzi; Sylvia, Jody S; Lazarus, Ross; Hudson, Thomas; Verner, Andrei; Klanderman, Barbara J; Freimer, Nelson B; Silverman, Edwin K; Celedón, Juan C

    2007-01-01

    Background: Although asthma is highly prevalent among certain Hispanic subgroups, genetic determinants of asthma and asthma‐related traits have not been conclusively identified in Hispanic populations. A study was undertaken to identify genomic regions containing susceptibility loci for pulmonary function and bronchodilator responsiveness (BDR) in Costa Ricans. Methods: Eight extended pedigrees were ascertained through schoolchildren with asthma in the Central Valley of Costa Rica. Short tandem repeat (STR) markers were genotyped throughout the genome at an average spacing of 8.2 cM. Multipoint variance component linkage analyses of forced expiratory volume in 1 second (FEV1) and FEV1/ forced vital capacity (FVC; both pre‐bronchodilator and post‐bronchodilator) and BDR were performed in these eight families (pre‐bronchodilator spirometry, n = 640; post‐bronchodilator spirometry and BDR, n = 624). Nine additional STR markers were genotyped on chromosome 7. Secondary analyses were repeated after stratification by cigarette smoking. Results: Among all subjects, the highest logarithm of the odds of linkage (LOD) score for FEV1 (post‐bronchodilator) was found on chromosome 7q34–35 (LOD = 2.45, including the additional markers). The highest LOD scores for FEV1/FVC (pre‐bronchodilator) and BDR were found on chromosomes 2q (LOD = 1.53) and 9p (LOD = 1.53), respectively. Among former and current smokers there was near‐significant evidence of linkage to FEV1/FVC (post‐bronchodilator) on chromosome 5p (LOD = 3.27) and suggestive evidence of linkage to FEV1 on chromosomes 3q (pre‐bronchodilator, LOD = 2.74) and 4q (post‐bronchodilator, LOD = 2.66). Conclusions: In eight families of children with asthma in Costa Rica, there is suggestive evidence of linkage to FEV1 on chromosome 7q34–35. In these families, FEV1/FVC may be influenced by an interaction between cigarette smoking and a locus (loci) on chromosome 5p. PMID

  5. Genome-wide scans for microalbuminuria in Mexican Americans: the San Antonio Family Heart Study.

    PubMed

    Arar, Nedal; Nath, Subrata; Thameem, Farook; Bauer, Richard; Voruganti, Saroja; Comuzzie, Anthony; Cole, Shelley; Blangero, John; MacCluer, Jean; Abboud, Hanna

    2007-02-01

    Microalbuminuria, defined as urine albumin-to-creatinine ratio of 0.03 to 0.299 mg/mg, is a major risk factor for cardiovascular disease. Several genetic epidemiological studies have established that microalbuminuria clusters in families, suggesting a genetic predisposition. We estimated heritability of microalbuminuria and performed a genome-wide linkage analysis to identify chromosomal regions influencing urine albumin-to-creatinine ratio in 486 Mexican Americans from 26 multiplex families. Significant heritability was demonstrated for urine albumin-to-creatinine ratio (h^2 = 24%, P < 0.003) after accounting for age, sex, body mass index, triglycerides, and hypertension. Genome scan revealed significant evidence of linkage of urine albumin-to-creatinine ratio to a region on chromosome 20q12 (LOD score of 3.5, P < 0.001) near marker D20S481. This region also exhibited a LOD score of 2.8 with diabetes status as a covariate and 3.0 with hypertension status as a covariate suggesting that the effect of this locus on urine albumin-to-creatinine ratio is largely independent of diabetes and hypertension. Findings indicate that there is a gene or genes located on human chromosome 20q12 that may have functional relevance to albumin excretion in Mexican Americans. Identifying and understanding the role of the genes that determine albumin excretion would lead to the development of novel therapeutic strategies targeted at high-risk individuals in whom intensive preventive measures may be most beneficial.
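
    The relation between a LOD score and a p-value, which this abstract reports side by side, can be sketched as follows. The conversion assumes the usual asymptotic approximation for a one-parameter linkage test, in which 2 ln(10) × LOD is compared with a chi-square distribution on one degree of freedom and the tail is halved because the test is one-sided. The abstract does not state how its P value was obtained, so this is an illustration rather than the authors' calculation, and the function name is hypothetical.

```python
import math


def lod_to_pointwise_p(lod):
    """Approximate one-sided pointwise p-value for a LOD score (1 degree of freedom)."""
    # Convert the base-10 log-odds score to a likelihood-ratio chi-square statistic.
    chi_sq = 2.0 * math.log(10.0) * lod
    # Upper tail of a chi-square with 1 df equals erfc(sqrt(x / 2)).
    two_sided_tail = math.erfc(math.sqrt(chi_sq / 2.0))
    # Halve the tail because the linkage parameter is tested on one side only.
    return 0.5 * two_sided_tail


if __name__ == "__main__":
    for lod in (2.8, 3.0, 3.5):  # LOD scores quoted in the abstract
        print(lod, lod_to_pointwise_p(lod))
```

    Under these assumptions a LOD of 3.5 gives a pointwise p-value well below the reported bound of P < 0.001; genome-wide significance thresholds are stricter because they account for testing many positions.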

  6. A high-resolution genetic linkage map and QTL fine mapping for growth-related traits and sex in the Yangtze River common carp (Cyprinus carpio haematopterus).

    PubMed

    Feng, Xiu; Yu, Xiaomu; Fu, Beide; Wang, Xinhua; Liu, Haiyang; Pang, Meixia; Tong, Jingou

    2018-04-02

    A high-density genetic linkage map is essential for QTL fine mapping, comparative genome analysis, identification of candidate genes and marker-assisted selection for economic traits in aquaculture species. The Yangtze River common carp (Cyprinus carpio haematopterus) is one of the most important aquacultured strains in China. However, quite limited genetics and genomics resources have been developed for genetic improvement of economic traits in this strain. A high-resolution genetic linkage map was constructed by using 7820 2b-RAD (2b-restriction site-associated DNA) and 295 microsatellite markers in an F2 family of the Yangtze River common carp (C. c. haematopterus). The length of the map was 4586.56 cM with an average marker interval of 0.57 cM. Comparative genome mapping revealed that a high proportion (70%) of markers disagreed in chromosome location between C. c. haematopterus and another common carp strain (subspecies), C. c. carpio. A clear 2:1 relationship was observed between C. c. haematopterus linkage groups (LGs) and zebrafish (Danio rerio) chromosomes. Based on the genetic map, 21 QTLs for growth-related traits were detected on 12 LGs, with phenotypic variance explained (PVE) ranging from 16.3 to 38.6% and LOD scores ranging from 4.02 to 11.13. A genome-wide significant QTL (LOD = 10.83) and three chromosome-wide significant QTLs (mean LOD = 4.84) for sex were mapped on LG50 and LG24, respectively. A 1.4 cM confidence interval of QTL for all growth-related traits showed conserved synteny with a 2.06 M segment on chromosome 14 of D. rerio. Five potential candidate genes were identified by blast search in this genomic region, including a well-studied multi-functional growth related gene, Apelin. We mapped a set of suggestive and significant QTLs for growth-related traits and sex based on a high-density genetic linkage map using SNP and microsatellite markers for Yangtze River common carp. Several

  7. Genome-wide analysis of mutations in mutant lineages selected following fast-neutron irradiation mutagenesis of Arabidopsis thaliana

    PubMed Central

    Belfield, Eric J.; Gan, Xiangchao; Mithani, Aziz; Brown, Carly; Jiang, Caifu; Franklin, Keara; Alvey, Elizabeth; Wibowo, Anjar; Jung, Marko; Bailey, Kit; Kalwani, Sharan; Ragoussis, Jiannis; Mott, Richard; Harberd, Nicholas P.

    2012-01-01

    Ionizing radiation has long been known to induce heritable mutagenic change in DNA sequence. However, the genome-wide effect of radiation is not well understood. Here we report the molecular properties and frequency of mutations in phenotypically selected mutant lines isolated following exposure of the genetic model flowering plant Arabidopsis thaliana to fast neutrons (FNs). Previous studies suggested that FNs predominantly induce deletions longer than a kilobase in A. thaliana. However, we found a higher frequency of single base substitution than deletion mutations. While the overall frequency and molecular spectrum of fast-neutron (FN)–induced single base substitutions differed substantially from those of “background” mutations arising spontaneously in laboratory-grown plants, G:C>A:T transitions were favored in both. We found that FN-induced G:C>A:T transitions were concentrated at pyrimidine dinucleotide sites, suggesting that FNs promote the formation of mutational covalent linkages between adjacent pyrimidine residues. In addition, we found that FNs induced more single base than large deletions, and that these single base deletions were possibly caused by replication slippage. Our observations provide an initial picture of the genome-wide molecular profile of mutations induced in A. thaliana by FN irradiation and are particularly informative of the nature and extent of genome-wide mutation in lines selected on the basis of mutant phenotypes from FN-mutagenized A. thaliana populations. PMID:22499668

  8. Linkage disequilibrium between STRPs and SNPs across the human genome.

    PubMed

    Payseur, Bret A; Place, Michael; Weber, James L

    2008-05-01

    Patterns of linkage disequilibrium (LD) reveal the action of evolutionary processes and provide crucial information for association mapping of disease genes. Although recent studies have described the landscape of LD among single nucleotide polymorphisms (SNPs) from across the human genome, associations involving other classes of molecular variation remain poorly understood. In addition to recombination and population history, mutation rate and process are expected to shape LD. To test this idea, we measured associations between short-tandem-repeat polymorphisms (STRPs), which can mutate rapidly and recurrently, and SNPs in 721 regions across the human genome. We directly compared STRP-SNP LD with SNP-SNP LD from the same genomic regions in the human HapMap populations. The intensity of STRP-SNP LD, measured by the average of D', was reduced, consistent with the action of recurrent mutation. Nevertheless, a higher fraction of STRP-SNP pairs than SNP-SNP pairs showed significant LD, on both short (up to 50 kb) and long (cM) scales. These results reveal the substantial effects of mutational processes on LD at STRPs and provide important measures of the potential of STRPs for association mapping of disease genes.
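    The D' measure mentioned in the abstract above can be illustrated with a minimal sketch for the simplest case of two biallelic loci. This is not the authors' pipeline: STRPs are multi-allelic and need a weighted multi-allelic extension of D' that is not shown here. The function name `ld_stats` and its parameters are hypothetical, chosen only for illustration.

```python
def ld_stats(p_ab, p_a, p_b):
    """Return (D, D', r^2) for two biallelic loci.

    p_ab : frequency of the haplotype carrying allele A (locus 1) and B (locus 2)
    p_a  : frequency of allele A at locus 1
    p_b  : frequency of allele B at locus 2
    (All names are illustrative assumptions, not taken from the cited study.)
    """
    # Raw disequilibrium coefficient: observed minus expected haplotype frequency.
    d = p_ab - p_a * p_b

    # Largest |D| attainable given the allele frequencies and the sign of D.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))

    # D' rescales |D| to the range [0, 1].
    d_prime = abs(d) / d_max if d_max > 0 else 0.0

    # r^2 is the squared correlation between alleles at the two loci.
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2


if __name__ == "__main__":
    # Allele A only ever occurs with B, but B is more common than A:
    # D' is at its maximum while r^2 stays well below 1.
    print(ld_stats(0.3, 0.3, 0.6))
```

    The example in the `__main__` guard shows why D' and r2 are reported separately: complete association in the D' sense does not imply that one locus predicts the other well.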

  9. A genome-wide association study of seed protein and oil content in soybean

    PubMed Central

    2014-01-01

    Background: Association analysis is an alternative to conventional family-based methods to detect the location of gene(s) or quantitative trait loci (QTL) and provides relatively high resolution in terms of defining the genome position of a gene or QTL. Seed protein and oil concentration are quantitative traits which are determined by the interaction among many genes with small to moderate genetic effects and their interaction with the environment. In this study, a genome-wide association study (GWAS) was performed to identify quantitative trait loci (QTL) controlling seed protein and oil concentration in 298 soybean germplasm accessions exhibiting a wide range of seed protein and oil content. Results: A total of 55,159 single nucleotide polymorphisms (SNPs) were genotyped using various methods including Illumina Infinium and GoldenGate assays and 31,954 markers with minor allele frequency >0.10 were used to estimate linkage disequilibrium (LD) in heterochromatic and euchromatic regions. In euchromatic regions, the mean LD (r²) rapidly declined to 0.2 within 360 Kbp, whereas the mean LD declined to 0.2 at 9,600 Kbp in heterochromatic regions. The GWAS results identified 40 SNPs in 17 different genomic regions significantly associated with seed protein. Of these, the five SNPs with the highest associations and seven adjacent SNPs were located in the 27.6-30.0 Mbp region of Gm20. A major seed protein QTL has been previously mapped to the same location and potential candidate genes have recently been identified in this region. The GWAS results also detected 25 SNPs in 13 different genomic regions associated with seed oil. Of these markers, seven SNPs had a significant association with both protein and oil. Conclusions: This research indicated that GWAS not only identified most of the previously reported QTL controlling seed protein and oil, but also resulted in narrower genomic regions than the regions reported as containing these QTL.
The narrower GWAS-defined genome

  10. A genome-wide association study of seed protein and oil content in soybean.

    PubMed

    Hwang, Eun-Young; Song, Qijian; Jia, Gaofeng; Specht, James E; Hyten, David L; Costa, Jose; Cregan, Perry B

    2014-01-02

    Association analysis is an alternative to conventional family-based methods to detect the location of gene(s) or quantitative trait loci (QTL) and provides relatively high resolution in terms of defining the genome position of a gene or QTL. Seed protein and oil concentration are quantitative traits which are determined by the interaction among many genes with small to moderate genetic effects and their interaction with the environment. In this study, a genome-wide association study (GWAS) was performed to identify quantitative trait loci (QTL) controlling seed protein and oil concentration in 298 soybean germplasm accessions exhibiting a wide range of seed protein and oil content. A total of 55,159 single nucleotide polymorphisms (SNPs) were genotyped using various methods including Illumina Infinium and GoldenGate assays and 31,954 markers with minor allele frequency >0.10 were used to estimate linkage disequilibrium (LD) in heterochromatic and euchromatic regions. In euchromatic regions, the mean LD (r2) rapidly declined to 0.2 within 360 Kbp, whereas the mean LD declined to 0.2 at 9,600 Kbp in heterochromatic regions. The GWAS results identified 40 SNPs in 17 different genomic regions significantly associated with seed protein. Of these, the five SNPs with the highest associations and seven adjacent SNPs were located in the 27.6-30.0 Mbp region of Gm20. A major seed protein QTL has been previously mapped to the same location and potential candidate genes have recently been identified in this region. The GWAS results also detected 25 SNPs in 13 different genomic regions associated with seed oil. Of these markers, seven SNPs had a significant association with both protein and oil. This research indicated that GWAS not only identified most of the previously reported QTL controlling seed protein and oil, but also resulted in narrower genomic regions than the regions reported as containing these QTL. 
The narrower GWAS-defined genome regions will allow more precise

  11. Linkage of A-to-I RNA Editing in Metazoans and the Impact on Genome Evolution

    PubMed Central

    Duan, Yuange; Dou, Shengqian; Zhang, Hong; Wu, Changcheng; Wu, Mingming

    2018-01-01

    Abstract The adenosine-to-inosine (A-to-I) RNA editomes have been systematically characterized in various metazoan species, and many editing sites were found in clusters. However, it remains unclear whether the clustered editing sites tend to be linked in the same RNA molecules or not. By adopting a method originally designed to detect linkage disequilibrium of DNA mutations, we examined the editomes of ten metazoan species and detected extensive linkage of editing in Drosophila and cephalopods. The prevalent linkages of editing in these two clades, many of which are conserved between closely related species and might be associated with the adaptive proteomic recoding, are maintained by natural selection at the cost of genome evolution. Nevertheless, in worms and humans, we only detected modest proportions of linked editing events, the majority of which were not conserved. Furthermore, the linkage of editing in coding regions of worms and humans might be overall deleterious, which drives the evolution of DNA sites to escape promiscuous editing. Altogether, our results suggest that the linkage landscape of A-to-I editing has evolved during metazoan evolution. This present study also suggests that linkage of editing should be considered in elucidating the functional consequences of RNA editing. PMID:29048557

  12. The score statistic of the LD-lod analysis: detecting linkage adaptive to linkage disequilibrium.

    PubMed

    Huang, J; Jiang, Y

    2001-01-01

    We study the properties of a modified lod score method for testing linkage that incorporates linkage disequilibrium (LD-lod). By examination of its score statistic, we show that the LD-lod score method adaptively combines two sources of information: (a) the IBD sharing score which is informative for linkage regardless of the existence of LD and (b) the contrast between allele-specific IBD sharing scores which is informative for linkage only in the presence of LD. We also consider the connection between the LD-lod score method and the transmission-disequilibrium test (TDT) for triad data and the mean test for affected sib pair (ASP) data. We show that, for triad data, the recessive LD-lod test is asymptotically equivalent to the TDT; and for ASP data, it is an adaptive combination of the TDT and the ASP mean test. We demonstrate that the LD-lod score method has relatively good statistical efficiency in comparison with the ASP mean test and the TDT for a broad range of LD and the genetic models considered in this report. Therefore, the LD-lod score method is an interesting approach for detecting linkage when the extent of LD is unknown, such as in a genome-wide screen with a dense set of genetic markers. Copyright 2001 S. Karger AG, Basel

  13. A Genome-Wide Scan of Selective Sweeps and Association Mapping of Fruit Traits Using Microsatellite Markers in Watermelon

    PubMed Central

    Reddy, Umesh K.; Abburi, Lavanya; Abburi, Venkata Lakshmi; Saminathan, Thangasamy; Cantrell, Robert; Vajja, Venkata Gopinath; Reddy, Rishi; Tomason, Yan R.; Levi, Amnon; Wehner, Todd C.; Nimmakayala, Padma

    2015-01-01

    Our genetic diversity study uses microsatellites of known map position to estimate genome level population structure and linkage disequilibrium, and to identify genomic regions that have undergone selection during watermelon domestication and improvement. Thirty regions that showed evidence of selective sweep were scanned for the presence of candidate genes using the watermelon genome browser (www.icugi.org). We localized selective sweeps in intergenic regions, close to the promoters, and within the exons and introns of various genes. This study provided evidence of convergent evolution for the presence of diverse ecotypes with special reference to American and European ecotypes. Our search for location of linked markers in the whole-genome draft sequence revealed that BVWS00358, a GA repeat microsatellite, is the GAGA type transcription factor located in the 5′ untranslated regions of a structure and insertion element that expresses a Cys2His2 Zinc finger motif, with presumed biological processes related to chitin response and transcriptional regulation. In addition, BVWS01708, an ATT repeat microsatellite, is located in the promoter of a DTW domain-containing protein (Cla002761), and 2 other simple sequence repeats were linked by association mapping to fruit length and rind thickness. PMID:25425675

  14. Use of modern tomato breeding germplasm for deciphering the genetic control of agronomical traits by Genome Wide Association study.

    PubMed

    Bauchet, Guillaume; Grenier, Stéphane; Samson, Nicolas; Bonnet, Julien; Grivet, Laurent; Causse, Mathilde

    2017-05-01

    A panel of 300 tomato accessions including breeding materials was built and characterized with >11,000 SNP. A population structure in six subgroups was identified. Strong heterogeneity in linkage disequilibrium and recombination landscape among groups and chromosomes was shown. GWAS identified several associations for fruit weight, earliness and plant growth. Genome-wide association studies (GWAS) have become a method of choice in quantitative trait dissection. First limited to highly polymorphic and outcrossing species, it is now applied in horticultural crops, notably in tomato. Until now GWAS in tomato has been performed on panels of heirloom and wild accessions. Using modern breeding materials would be of direct interest for breeding purpose. To implement GWAS on a large panel of 300 tomato accessions including 168 breeding lines, this study assessed the genetic diversity and linkage disequilibrium decay and revealed the population structure and performed GWA experiment. Genetic diversity and population structure analyses were based on molecular markers (>11,000 SNP) covering the whole genome. Six genetic subgroups were revealed and associated to traits of agronomical interest, such as fruit weight and disease resistance. Estimates of linkage disequilibrium highlighted the heterogeneity of its decay among genetic subgroups. Haplotype definition allowed a fine characterization of the groups and their recombination landscape revealing the patterns of admixture along the genome. Selection footprints showed results in congruence with introgressions. Taken together, all these elements refined our knowledge of the genetic material included in this panel and allowed the identification of several associations for fruit weight, plant growth and earliness, deciphering the genetic architecture of these complex traits and identifying several new loci useful for tomato breeding.

  15. Building the Infrastructure of Resource Sharing: Union Catalogs, Distributed Search, and Cross-Database Linkage.

    ERIC Educational Resources Information Center

    Lynch, Clifford A.

    1997-01-01

    Union catalogs and distributed search systems are two ways users can locate materials in print and electronic formats. This article examines the advantages and limitations of both approaches and argues that they should be considered complementary rather than competitive. Discusses technologies creating linkage between catalogs and databases and…

  16. Genome-Wide Analysis of Gene-Gene and Gene-Environment Interactions Using Closed-Form Wald Tests.

    PubMed

    Yu, Zhaoxia; Demetriou, Michael; Gillen, Daniel L

    2015-09-01

    Despite the successful discovery of hundreds of variants for complex human traits using genome-wide association studies, the degree to which genes and environmental risk factors jointly affect disease risk is largely unknown. One obstacle toward this goal is that the computational effort required for testing gene-gene and gene-environment interactions is enormous. As a result, numerous computationally efficient tests were recently proposed. However, the validity of these methods often relies on unrealistic assumptions such as additive main effects, main effects at only one variable, no linkage disequilibrium between the two single-nucleotide polymorphisms (SNPs) in a pair or gene-environment independence. Here, we derive closed-form and consistent estimates for interaction parameters and propose to use Wald tests for testing interactions. The Wald tests are asymptotically equivalent to the likelihood ratio tests (LRTs), largely considered to be the gold standard tests but generally too computationally demanding for genome-wide interaction analysis. Simulation studies show that the proposed Wald tests have very similar performances with the LRTs but are much more computationally efficient. Applying the proposed tests to a genome-wide study of multiple sclerosis, we identify interactions within the major histocompatibility complex region. In this application, we find that (1) focusing on pairs where both SNPs are marginally significant leads to more significant interactions when compared to focusing on pairs where at least one SNP is marginally significant; and (2) parsimonious parameterization of interaction effects might decrease, rather than increase, statistical power. © 2015 WILEY PERIODICALS, INC.

  17. A genome-wide association study of resistance to HIV infection in highly exposed uninfected individuals with hemophilia A

    PubMed Central

    Lane, Jérôme; McLaren, Paul J.; Dorrell, Lucy; Shianna, Kevin V.; Stemke, Amanda; Pelak, Kimberly; Moore, Stephen; Oldenburg, Johannes; Alvarez-Roman, Maria Teresa; Angelillo-Scherrer, Anne; Boehlen, Francoise; Bolton-Maggs, Paula H.B.; Brand, Brigit; Brown, Deborah; Chiang, Elaine; Cid-Haro, Ana Rosa; Clotet, Bonaventura; Collins, Peter; Colombo, Sara; Dalmau, Judith; Fogarty, Patrick; Giangrande, Paul; Gringeri, Alessandro; Iyer, Rathi; Katsarou, Olga; Kempton, Christine; Kuriakose, Philip; Lin, Judith; Makris, Mike; Manco-Johnson, Marilyn; Tsakiris, Dimitrios A.; Martinez-Picado, Javier; Mauser-Bunschoten, Evelien; Neff, Anne; Oka, Shinichi; Oyesiku, Lara; Parra, Rafael; Peter-Salonen, Kristiina; Powell, Jerry; Recht, Michael; Shapiro, Amy; Stine, Kimo; Talks, Katherine; Telenti, Amalio; Wilde, Jonathan; Yee, Thynn Thynn; Wolinsky, Steven M.; Martinson, Jeremy; Hussain, Shehnaz K.; Bream, Jay H.; Jacobson, Lisa P.; Carrington, Mary; Goedert, James J.; Haynes, Barton F.; McMichael, Andrew J.; Goldstein, David B.; Fellay, Jacques

    2013-01-01

    Human genetic variation contributes to differences in susceptibility to HIV-1 infection. To search for novel host resistance factors, we performed a genome-wide association study (GWAS) in hemophilia patients highly exposed to potentially contaminated factor VIII infusions. Individuals with hemophilia A and a documented history of factor VIII infusions before the introduction of viral inactivation procedures (1979–1984) were recruited from 36 hemophilia treatment centers (HTCs), and their genome-wide genetic variants were compared with those from matched HIV-infected individuals. Homozygous carriers of known CCR5 resistance mutations were excluded. Single nucleotide polymorphisms (SNPs) and inferred copy number variants (CNVs) were tested using logistic regression. In addition, we performed a pathway enrichment analysis, a heritability analysis, and a search for epistatic interactions with CCR5 Δ32 heterozygosity. A total of 560 HIV-uninfected cases were recruited: 36 (6.4%) were homozygous for CCR5 Δ32 or m303. After quality control and SNP imputation, we tested 1 081 435 SNPs and 3686 CNVs for association with HIV-1 serostatus in 431 cases and 765 HIV-infected controls. No SNP or CNV reached genome-wide significance. The additional analyses did not reveal any strong genetic effect. Highly exposed, yet uninfected hemophiliacs form an ideal study group to investigate host resistance factors. Using a genome-wide approach, we did not detect any significant associations between SNPs and HIV-1 susceptibility, indicating that common genetic variants of major effect are unlikely to explain the observed resistance phenotype in this population. PMID:23372042

  18. A Discovery Genome-Wide Association Study of Entrepreneurship

    ERIC Educational Resources Information Center

    Quaye, Lydia; Nicolaou, Nicos; Shane, Scott; Mangino, Massimo

    2012-01-01

    To identify specific genetic variants influencing the phenotype of entrepreneurship, we conducted a genome-wide association study (GWAS) with 3,933 Caucasian females from the TwinsUK Adult Twin Registry. Following stringent genotype quality control, GWAF (genome-wide association analyses for family data) software was used to assess the association…

  19. A Genome-wide Combinatorial Strategy Dissects Complex Genetic Architecture of Seed Coat Color in Chickpea

    PubMed Central

    Bajaj, Deepak; Das, Shouvik; Upadhyaya, Hari D.; Ranjan, Rajeev; Badoni, Saurabh; Kumar, Vinod; Tripathi, Shailesh; Gowda, C. L. Laxmipathi; Sharma, Shivali; Singh, Sube; Tyagi, Akhilesh K.; Parida, Swarup K.

    2015-01-01

    The study identified 9045 high-quality SNPs employing both genome-wide GBS- and candidate gene-based SNP genotyping assays in 172 chickpea accessions, including 93 cultivated (desi and kabuli) and 79 wild accessions. The GWAS in a structured population of 93 sequenced accessions detected 15 major genomic loci exhibiting significant association with seed coat color. Five seed color-associated major genomic loci underlying robust QTLs mapped on a high-density intra-specific genetic linkage map were validated by QTL mapping. The integration of association and QTL mapping with gene haplotype-specific LD mapping and transcript profiling identified novel allelic variants (non-synonymous SNPs) and haplotypes in a MATE secondary transporter gene regulating light/yellow brown and beige seed coat color differentiation in chickpea. The down-regulation and decreased transcript expression of beige seed coat color-associated MATE gene haplotype was correlated with reduced proanthocyanidins accumulation in the mature seed coats of beige than light/yellow brown seed colored desi and kabuli accessions for their coloration/pigmentation. This seed color-regulating MATE gene revealed strong purifying selection pressure primarily in LB/YB seed colored desi and wild Cicer reticulatum accessions compared with the BE seed colored kabuli accessions. The functionally relevant molecular tags identified have potential to decipher the complex transcriptional regulatory gene function of seed coat coloration and for understanding the selective sweep-based seed color trait evolutionary pattern in cultivated and wild accessions during chickpea domestication. The genome-wide integrated approach employed will expedite marker-assisted genetic enhancement for developing cultivars with desirable seed coat color types in chickpea. PMID:26635822

  20. SuperDCA for genome-wide epistasis analysis.

    PubMed

    Puranen, Santeri; Pesonen, Maiju; Pensar, Johan; Xu, Ying Ying; Lees, John A; Bentley, Stephen D; Croucher, Nicholas J; Corander, Jukka

    2018-05-29

    The potential for genome-wide modelling of epistasis has recently surfaced given the possibility of sequencing densely sampled populations and the emerging families of statistical interaction models. Direct coupling analysis (DCA) has previously been shown to yield valuable predictions for single protein structures, and has recently been extended to genome-wide analysis of bacteria, identifying novel interactions in the co-evolution between resistance, virulence and core genome elements. However, earlier computational DCA methods have not been scalable to enable model fitting simultaneously to 10^4-10^5 polymorphisms, representing the amount of core genomic variation observed in analyses of many bacterial species. Here, we introduce a novel inference method (SuperDCA) that employs a new scoring principle, efficient parallelization, optimization and filtering on phylogenetic information to achieve scalability for up to 10^5 polymorphisms. Using two large population samples of Streptococcus pneumoniae, we demonstrate the ability of SuperDCA to make additional significant biological findings about this major human pathogen. We also show that our method can uncover signals of selection that are not detectable by genome-wide association analysis, even though our analysis does not require phenotypic measurements. SuperDCA, thus, holds considerable potential in building understanding about numerous organisms at a systems biological level.

  1. Identification, characterization, and utilization of genome-wide simple sequence repeats to identify a QTL for acidity in apple

    PubMed Central

    2012-01-01

    Background Apple is an economically important fruit crop worldwide. Developing a genetic linkage map is a critical step towards mapping and cloning of genes responsible for important horticultural traits in apple. To facilitate linkage map construction, we surveyed and characterized the distribution and frequency of perfect microsatellites in assembled contig sequences of the apple genome. Results A total of 28,538 SSRs have been identified in the apple genome, with an overall density of 40.8 SSRs per Mb. Di-nucleotide repeats are the most frequent microsatellites in the apple genome, accounting for 71.9% of all microsatellites. AT/TA repeats are the most frequent in genomic regions, accounting for 38.3% of all the G-SSRs, while AG/GA dimers prevail in transcribed sequences, and account for 59.4% of all EST-SSRs. A total set of 310 SSRs is selected to amplify eight apple genotypes. Of these, 245 (79.0%) are found to be polymorphic among cultivars and wild species tested. AG/GA motifs in genomic regions have detected more alleles and higher PIC values than AT/TA or AC/CA motifs. Moreover, AG/GA repeats are more variable than any other dimers in apple, and should be preferentially selected for studies, such as genetic diversity and linkage map construction. A total of 54 newly developed apple SSRs have been genetically mapped. Interestingly, clustering of markers with distorted segregation is observed on linkage groups 1, 2, 10, 15, and 16. A QTL responsible for malic acid content of apple fruits is detected on linkage group 8, and accounts for ~13.5% of the observed phenotypic variation. Conclusions This study demonstrates that di-nucleotide repeats are prevalent in the apple genome and that AT/TA and AG/GA repeats are the most frequent in genomic and transcribed sequences of apple, respectively. All SSR motifs identified in this study as well as those newly mapped SSRs will serve as valuable resources for pursuing apple genetic studies, aiding the apple breeding

  2. Draft Genome Sequence, and a Sequence-Defined Genetic Linkage Map of the Legume Crop Species Lupinus angustifolius L

    PubMed Central

    Zheng, Zequn; Zhang, Qisen; Zhou, Gaofeng; Sweetingham, Mark W.; Howieson, John G.; Li, Chengdao

    2013-01-01

    Lupin (Lupinus angustifolius L.) is the most recently domesticated crop in major agricultural cultivation. Its seeds are high in protein and dietary fibre, but low in oil and starch. Medical and dietetic studies have shown that consuming lupin-enriched food has significant health benefits. We report the draft assembly from a whole genome shotgun sequencing dataset for this legume species with 26.9x coverage of the genome, which is predicted to contain 57,807 genes. Analysis of the annotated genes with metabolic pathways provided a partial understanding of some key features of lupin, such as the amino acid profile of storage proteins in seeds. Furthermore, we applied the NGS-based RAD-sequencing technology to obtain 8,244 sequence-defined markers for anchoring the genomic sequences. A total of 4,214 scaffolds from the genome sequence assembly were aligned into the genetic map. The combination of the draft assembly and a sequence-defined genetic map made it possible to locate and study functional genes of agronomic interest. The identification of co-segregating SNP markers, scaffold sequences and gene annotation facilitated the identification of a candidate R gene associated with resistance to the major lupin disease anthracnose. We demonstrated that the combination of medium-depth genome sequencing and a high-density genetic linkage map by application of NGS technology is a cost-effective approach to generating genome sequence data and a large number of molecular markers to study the genomics, genetics and functional genes of lupin, and to apply them to molecular plant breeding. This strategy does not require prior genome knowledge, which potentiates its application to a wide range of non-model species. PMID:23734219

  3. Draft genome sequence, and a sequence-defined genetic linkage map of the legume crop species Lupinus angustifolius L.

    PubMed

    Yang, Huaan; Tao, Ye; Zheng, Zequn; Zhang, Qisen; Zhou, Gaofeng; Sweetingham, Mark W; Howieson, John G; Li, Chengdao

    2013-01-01

    Lupin (Lupinus angustifolius L.) is the most recently domesticated crop in major agricultural cultivation. Its seeds are high in protein and dietary fibre, but low in oil and starch. Medical and dietetic studies have shown that consuming lupin-enriched food has significant health benefits. We report the draft assembly from a whole genome shotgun sequencing dataset for this legume species with 26.9x coverage of the genome, which is predicted to contain 57,807 genes. Analysis of the annotated genes with metabolic pathways provided a partial understanding of some key features of lupin, such as the amino acid profile of storage proteins in seeds. Furthermore, we applied the NGS-based RAD-sequencing technology to obtain 8,244 sequence-defined markers for anchoring the genomic sequences. A total of 4,214 scaffolds from the genome sequence assembly were aligned into the genetic map. The combination of the draft assembly and a sequence-defined genetic map made it possible to locate and study functional genes of agronomic interest. The identification of co-segregating SNP markers, scaffold sequences and gene annotation facilitated the identification of a candidate R gene associated with resistance to the major lupin disease anthracnose. We demonstrated that the combination of medium-depth genome sequencing and a high-density genetic linkage map by application of NGS technology is a cost-effective approach to generating genome sequence data and a large number of molecular markers to study the genomics, genetics and functional genes of lupin, and to apply them to molecular plant breeding. This strategy does not require prior genome knowledge, which potentiates its application to a wide range of non-model species.

  4. Search Engines on the World Wide Web.

    ERIC Educational Resources Information Center

    Walster, Dian

    1997-01-01

    Discusses search engines and provides methods for determining what resources are searched, the quality of the information, and the algorithms used that will improve the use of search engines on the World Wide Web, online public access catalogs, and electronic encyclopedias. Lists strategies for conducting searches and for learning about the latest…

  5. Genome-wide association studies of autoimmune vitiligo identify 23 new risk loci and highlight key pathways and regulatory variants

    PubMed Central

    Jin, Ying; Andersen, Genevieve; Yorgov, Daniel; Ferrara, Tracey M; Ben, Songtao; Brownson, Kelly M; Holland, Paulene J; Birlea, Stanca A; Siebert, Janet; Hartmann, Anke; Lienert, Anne; van Geel, Nanja; Lambert, Jo; Luiten, Rosalie M; Wolkerstorfer, Albert; van der Veen, JP Wietze; Bennett, Dorothy C; Taïeb, Alain; Ezzedine, Khaled; Kemp, E Helen; Gawkrodger, David J; Weetman, Anthony P; Kõks, Sulev; Prans, Ele; Kingo, Külli; Karelson, Maire; Wallace, Margaret R; McCormack, Wayne T; Overbeck, Andreas; Moretti, Silvia; Colucci, Roberta; Picardo, Mauro; Silverberg, Nanette B; Olsson, Mats; Valle, Yan; Korobko, Igor; Böhm, Markus; Lim, Henry W.; Hamzavi, Iltefat; Zhou, Li; Mi, Qing-Sheng; Fain, Pamela R.; Santorico, Stephanie A; Spritz, Richard A

    2016-01-01

    Vitiligo is an autoimmune disease in which depigmented skin results from destruction of melanocytes, with epidemiologic association with other autoimmune diseases. In previous linkage and genome-wide association studies (GWAS1, GWAS2), we identified 27 vitiligo susceptibility loci in patients of European (EUR) ancestry. We carried out a third GWAS (GWAS3) in EUR subjects, with augmented GWAS1 and GWAS2 controls, genome-wide imputation, and meta-analysis of all three GWAS, followed by an independent replication. The combined analyses, with 4,680 cases and 39,586 controls, identified 23 new loci and 7 suggestive loci, most encoding immune and apoptotic regulators, some also associated with other autoimmune diseases, as well as several melanocyte regulators. Bioinformatic analyses indicate a predominance of causal regulatory variation, some corresponding to eQTL at these loci. Together, the identified genes provide a framework for vitiligo genetic architecture and pathobiology, highlight relationships to other autoimmune diseases and melanoma, and offer potential targets for treatment. PMID:27723757

  6. Family-based linkage and association mapping reveals novel genes affecting Plum pox virus infection in Arabidopsis thaliana.

    PubMed

    Pagny, Gaëlle; Paulstephenraj, Pauline S; Poque, Sylvain; Sicard, Ophélie; Cosson, Patrick; Eyquard, Jean-Philippe; Caballero, Mélodie; Chague, Aurélie; Gourdon, Germain; Negrel, Lise; Candresse, Thierry; Mariette, Stéphanie; Decroocq, Véronique

    2012-11-01

    Sharka is a devastating viral disease caused by the Plum pox virus (PPV) in stone fruit trees and few sources of resistance are known in its natural hosts. Since any knowledge gained from Arabidopsis on plant virus susceptibility factors is likely to be transferable to crop species, Arabidopsis's natural variation was searched for host factors essential for PPV infection. To locate regions of the genome associated with susceptibility to PPV, linkage analysis was performed on six biparental populations as well as on multiparental lines. To refine quantitative trait locus (QTL) mapping, a genome-wide association analysis was carried out using 147 Arabidopsis accessions. Evidence was found for linkage on chromosomes 1, 3 and 5 with restriction of PPV long-distance movement. The most relevant signals occurred within a region at the bottom of chromosome 3, which comprises seven RTM3-like TRAF domain-containing genes. Since the resistance mechanism analyzed here is recessive and the rtm3 knockout mutant is susceptible to PPV infection, it suggests that other gene(s) present in the small identified region encompassing RTM3 are necessary for PPV long-distance movement. In consequence, we report here the occurrence of host factor(s) that are indispensable for virus long-distance movement. © 2012 INRA. New Phytologist © 2012 New Phytologist Trust.

  7. Linkage of A-to-I RNA Editing in Metazoans and the Impact on Genome Evolution.

    PubMed

    Duan, Yuange; Dou, Shengqian; Zhang, Hong; Wu, Changcheng; Wu, Mingming; Lu, Jian

    2018-01-01

    The adenosine-to-inosine (A-to-I) RNA editomes have been systematically characterized in various metazoan species, and many editing sites were found in clusters. However, it remains unclear whether the clustered editing sites tend to be linked in the same RNA molecules or not. By adopting a method originally designed to detect linkage disequilibrium of DNA mutations, we examined the editomes of ten metazoan species and detected extensive linkage of editing in Drosophila and cephalopods. The prevalent linkages of editing in these two clades, many of which are conserved between closely related species and might be associated with the adaptive proteomic recoding, are maintained by natural selection at the cost of genome evolution. Nevertheless, in worms and humans, we only detected modest proportions of linked editing events, the majority of which were not conserved. Furthermore, the linkage of editing in coding regions of worms and humans might be overall deleterious, which drives the evolution of DNA sites to escape promiscuous editing. Altogether, our results suggest that the linkage landscape of A-to-I editing has evolved during metazoan evolution. This present study also suggests that linkage of editing should be considered in elucidating the functional consequences of RNA editing. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  8. Genome-wide diversity and differentiation in New World populations of the human malaria parasite Plasmodium vivax

    PubMed Central

    de Oliveira, Thais C.; Rodrigues, Priscila T.; Menezes, Maria José; Gonçalves-Lopes, Raquel M.; Bastos, Melissa S.; Lima, Nathália F.; Barbosa, Susana; Gerber, Alexandra L.; Loss de Morais, Guilherme; Berná, Luisa; Phelan, Jody; Robello, Carlos; de Vasconcelos, Ana Tereza R.

    2017-01-01

    Background The Americas were the last continent colonized by humans carrying malaria parasites. Plasmodium falciparum from the New World shows very little genetic diversity and greater linkage disequilibrium, compared with its African counterparts, and is clearly subdivided into local, highly divergent populations. However, limited available data have revealed extensive genetic diversity in American populations of another major human malaria parasite, P. vivax. Methods We used an improved sample preparation strategy and next-generation sequencing to characterize 9 high-quality P. vivax genome sequences from northwestern Brazil. These new data were compared with publicly available sequences from recently sampled clinical P. vivax isolates from Brazil (BRA, total n = 11 sequences), Peru (PER, n = 23), Colombia (COL, n = 31), and Mexico (MEX, n = 19). Principal findings/Conclusions We found that New World populations of P. vivax are as diverse (nucleotide diversity π between 5.2 × 10⁻⁴ and 6.2 × 10⁻⁴) as P. vivax populations from Southeast Asia, where malaria transmission is substantially more intense. They display several non-synonymous nucleotide substitutions (some of them previously undescribed) in genes known or suspected to be involved in antimalarial drug resistance, such as dhfr, dhps, mdr1, mrp1, and mrp-2, but not in the chloroquine resistance transporter ortholog (crt-o) gene. Moreover, P. vivax in the Americas is much less geographically substructured than local P. falciparum populations, with relatively little between-population genome-wide differentiation (pairwise FST values ranging between 0.025 and 0.092). Finally, P. vivax populations show a rapid decline in linkage disequilibrium with increasing distance between pairs of polymorphic sites, consistent with very frequent outcrossing. We hypothesize that the high diversity of present-day P. vivax lineages in the Americas originated from successive migratory waves and subsequent admixture between

  9. Rapid genotyping with DNA micro-arrays for high-density linkage mapping and QTL mapping in common buckwheat (Fagopyrum esculentum Moench)

    PubMed Central

    Yabe, Shiori; Hara, Takashi; Ueno, Mariko; Enoki, Hiroyuki; Kimura, Tatsuro; Nishimura, Satoru; Yasui, Yasuo; Ohsawa, Ryo; Iwata, Hiroyoshi

    2014-01-01

    For genetic studies and genomics-assisted breeding, particularly of minor crops, a genotyping system that does not require a priori genomic information is preferable. Here, we demonstrated the potential of a novel array-based genotyping system for the rapid construction of high-density linkage map and quantitative trait loci (QTL) mapping. By using the system, we successfully constructed an accurate, high-density linkage map for common buckwheat (Fagopyrum esculentum Moench); the map was composed of 756 loci and included 8,884 markers. The number of linkage groups converged to eight, which is the basic number of chromosomes in common buckwheat. The sizes of the linkage groups of the P1 and P2 maps were 773.8 and 800.4 cM, respectively. The average interval between adjacent loci was 2.13 cM. The linkage map constructed here will be useful for the analysis of other common buckwheat populations. We also performed QTL mapping for main stem length and detected four QTL. It took 37 days to process 178 samples from DNA extraction to genotyping, indicating the system enables genotyping of genome-wide markers for a few hundred buckwheat plants before the plants mature. The novel system will be useful for genomics-assisted breeding in minor crops without a priori genomic information. PMID:25914583

  10. Rapid genotyping with DNA micro-arrays for high-density linkage mapping and QTL mapping in common buckwheat (Fagopyrum esculentum Moench).

    PubMed

    Yabe, Shiori; Hara, Takashi; Ueno, Mariko; Enoki, Hiroyuki; Kimura, Tatsuro; Nishimura, Satoru; Yasui, Yasuo; Ohsawa, Ryo; Iwata, Hiroyoshi

    2014-12-01

    For genetic studies and genomics-assisted breeding, particularly of minor crops, a genotyping system that does not require a priori genomic information is preferable. Here, we demonstrated the potential of a novel array-based genotyping system for the rapid construction of high-density linkage map and quantitative trait loci (QTL) mapping. By using the system, we successfully constructed an accurate, high-density linkage map for common buckwheat (Fagopyrum esculentum Moench); the map was composed of 756 loci and included 8,884 markers. The number of linkage groups converged to eight, which is the basic number of chromosomes in common buckwheat. The sizes of the linkage groups of the P1 and P2 maps were 773.8 and 800.4 cM, respectively. The average interval between adjacent loci was 2.13 cM. The linkage map constructed here will be useful for the analysis of other common buckwheat populations. We also performed QTL mapping for main stem length and detected four QTL. It took 37 days to process 178 samples from DNA extraction to genotyping, indicating the system enables genotyping of genome-wide markers for a few hundred buckwheat plants before the plants mature. The novel system will be useful for genomics-assisted breeding in minor crops without a priori genomic information.

  11. Meta-analysis for genome-wide association studies using case-control design: application and practice

    PubMed Central

    2016-01-01

    This review aimed to arrange the process of a systematic review of genome-wide association studies in order to practice and apply a genome-wide meta-analysis (GWMA). The process has a series of five steps: searching and selection, extraction of related information, evaluation of validity, meta-analysis by type of genetic model, and evaluation of heterogeneity. In contrast to intervention meta-analyses, GWMA has to evaluate the Hardy–Weinberg equilibrium (HWE) in the third step and conduct meta-analyses by five potential genetic models, including dominant, recessive, homozygote contrast, heterozygote contrast, and allelic contrast in the fourth step. The ‘genhwcci’ and ‘metan’ commands of STATA software evaluate the HWE and calculate a summary effect size, respectively. A meta-regression using the ‘metareg’ command of STATA should be conducted to evaluate related factors of heterogeneities. PMID:28092928
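
    The Hardy–Weinberg check described in the third step can be illustrated with a minimal sketch. This is not the STATA ‘genhwcci’ command named in the abstract; it is a plain Pearson chi-square test with 1 degree of freedom, and the function name and the example genotype counts are assumptions made for illustration only.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Pearson chi-square test (1 df) for Hardy-Weinberg equilibrium.

    Takes the observed counts of the three genotypes (AA, AB, BB) and
    returns (chi-square statistic, p-value). Hypothetical helper; the
    review itself uses STATA for this step.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("at least one genotype must be observed")
    p = (2 * n_aa + n_ab) / (2 * n)   # frequency of allele A
    q = 1.0 - p                       # frequency of allele B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Upper tail of a chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Made-up control genotype counts, for illustration only.
chi2, p_value = hwe_chi_square(25, 50, 25)
print(f"chi2 = {chi2:.3f}, p = {p_value:.3f}")
```

    In a meta-analysis setting a check of this kind would typically be run on the control group of each included study; a small p-value indicates departure from equilibrium.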

  12. A Genome-Wide Association Meta-Analysis of Attention-Deficit/Hyperactivity Disorder Symptoms in Population-Based Paediatric Cohorts

    PubMed Central

    Groen-Blokhuis, Maria M.; Pourcain, Beate St.; Greven, Corina U.; Pappa, Irene; Tiesler, Carla M.T.; Ang, Wei; Nolte, Ilja M.; Vilor-Tejedor, Natalia; Bacelis, Jonas; Ebejer, Jane L.; Zhao, Huiying; Davies, Gareth E.; Ehli, Erik A.; Evans, David M.; Fedko, Iryna O.; Guxens, Mònica; Hottenga, Jouke-Jan; Hudziak, James J.; Jugessur, Astanand; Kemp, John P.; Krapohl, Eva; Martin, Nicholas G.; Murcia, Mario; Myhre, Ronny; Ormel, Johan; Ring, Susan M.; Standl, Marie; Stergiakouli, Evie; Stoltenberg, Camilla; Thiering, Elisabeth; Timpson, Nicholas J.; Trzaskowski, Maciej; van der Most, Peter J.; Wang, Carol; Nyholt, Dale R.; Medland, Sarah E.; Neale, Benjamin; Jacobsson, Bo; Sunyer, Jordi; Hartman, Catharina A.; Whitehouse, Andrew J.O.; Pennell, Craig E.; Heinrich, Joachim; Plomin, Robert; Smith, George Davey; Tiemeier, Henning; Posthuma, Danielle; Boomsma, Dorret I.

    2016-01-01

    Objective To elucidate the influence of common genetic variants on childhood attention-deficit/hyperactivity disorder (ADHD) symptoms, to identify genetic variants that explain its high heritability, and to investigate the genetic overlap of ADHD symptom scores with ADHD diagnosis. Method Within the EArly Genetics and Lifecourse Epidemiology (EAGLE) consortium, genome-wide single nucleotide polymorphisms (SNPs) and ADHD symptom scores were available for 17,666 children (< 13 years) from nine population-based cohorts. SNP-based heritability was estimated in data from the three largest cohorts. Meta-analysis based on genome-wide association (GWA) analyses with SNPs was followed by gene-based association tests, and the overlap in results with a meta-analysis in the Psychiatric Genomics Consortium (PGC) case-control ADHD study was investigated. Results SNP-based heritability ranged from 5% to 34%, indicating that variation in common genetic variants influences ADHD symptom scores. The meta-analysis did not detect genome-wide significant SNPs, but three genes, lying close to each other with SNPs in high linkage disequilibrium (LD), showed a gene-wide significant association (p values between 1.46×10⁻⁶ and 2.66×10⁻⁶). One gene, WASL, is involved in neuronal development. Both SNP- and gene-based analyses indicated overlap with the PGC meta-analysis results with the genetic correlation estimated at 0.96. Conclusion The SNP-based heritability for ADHD symptom scores indicates a polygenic architecture and genes involved in neurite outgrowth are possibly involved. Continuous and dichotomous measures of ADHD appear to assess a genetically common phenotype. A next step is to combine data from population-based and case-control cohorts in genetic association studies to increase sample size and improve statistical power for identifying genetic variants. PMID:27663945

  13. The sumLINK statistic for genetic linkage analysis in the presence of heterogeneity.

    PubMed

    Christensen, G B; Knight, S; Camp, N J

    2009-11-01

    We present the "sumLINK" statistic--the sum of multipoint LOD scores for the subset of pedigrees with nominally significant linkage evidence at a given locus--as an alternative to common methods to identify susceptibility loci in the presence of heterogeneity. We also suggest the "sumLOD" statistic (the sum of positive multipoint LOD scores) as a companion to the sumLINK. sumLINK analysis identifies genetic regions of extreme consistency across pedigrees without regard to negative evidence from unlinked or uninformative pedigrees. Significance is determined by an innovative permutation procedure based on genome shuffling that randomizes linkage information across pedigrees. This procedure for generating the empirical null distribution may be useful for other linkage-based statistics as well. Using 500 genome-wide analyses of simulated null data, we show that the genome shuffling procedure results in the correct type 1 error rates for both the sumLINK and sumLOD. The power of the statistics was tested using 100 sets of simulated genome-wide data from the alternative hypothesis from GAW13. Finally, we illustrate the statistics in an analysis of 190 aggressive prostate cancer pedigrees from the International Consortium for Prostate Cancer Genetics, where we identified a new susceptibility locus. We propose that the sumLINK and sumLOD are ideal for collaborative projects and meta-analyses, as they do not require any sharing of identifiable data between contributing institutions. Further, loci identified with the sumLINK have good potential for gene localization via statistical recombinant mapping, as, by definition, several linked pedigrees contribute to each peak.
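
    The two statistics defined above can be sketched in a few lines. The abstract does not give the LOD cut-off for "nominally significant" linkage evidence; the value 0.588 used below is an assumption (it corresponds to a one-sided p of 0.05 for a LOD score), and the function names and per-pedigree scores are hypothetical. The genome-shuffling permutation used to assess significance is not reproduced here.

```python
def sum_lod(pedigree_lods):
    """sumLOD: sum of the positive multipoint LOD scores at one locus."""
    return sum(lod for lod in pedigree_lods if lod > 0)


def sum_link(pedigree_lods, threshold=0.588):
    """sumLINK: sum of LOD scores over pedigrees with nominally
    significant linkage evidence at one locus.

    The default threshold is an assumption, not taken from the abstract.
    """
    return sum(lod for lod in pedigree_lods if lod >= threshold)


# Made-up multipoint LOD scores for five pedigrees at a single locus.
lods = [1.2, 0.3, -0.8, 0.6, 0.0]
print(f"sumLOD  = {sum_lod(lods):.2f}")
print(f"sumLINK = {sum_link(lods):.2f}")
```

    Both sums ignore negative scores, which is how the statistics disregard evidence from unlinked or uninformative pedigrees; sumLINK is the stricter of the two because weakly positive pedigrees are also dropped.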

  14. NABIC: A New Access Portal to Search, Visualize, and Share Agricultural Genomics Data.

    PubMed

    Seol, Young-Joo; Lee, Tae-Ho; Park, Dong-Suk; Kim, Chang-Kug

    2016-01-01

    The National Agricultural Biotechnology Information Center developed an access portal to search, visualize, and share agricultural genomics data with a focus on South Korean information and resources. The portal features an agricultural biotechnology database containing a wide range of omics data from public and proprietary sources. We collected 28.4 TB of data from 162 agricultural organisms, with 10 types of omics data comprising next-generation sequencing sequence read archive, genome, gene, nucleotide, DNA chip, expressed sequence tag, interactome, protein structure, molecular marker, and single-nucleotide polymorphism datasets. Our genomic resources contain information on five animals, seven plants, and one fungus, which is accessed through a genome browser. We also developed a data submission and analysis system as a web service, with easy-to-use functions and cutting-edge algorithms, including those for handling next-generation sequencing data.

  15. NABIC: A New Access Portal to Search, Visualize, and Share Agricultural Genomics Data

    PubMed Central

    Seol, Young-Joo; Lee, Tae-Ho; Park, Dong-Suk; Kim, Chang-Kug

    2016-01-01

    The National Agricultural Biotechnology Information Center developed an access portal to search, visualize, and share agricultural genomics data with a focus on South Korean information and resources. The portal features an agricultural biotechnology database containing a wide range of omics data from public and proprietary sources. We collected 28.4 TB of data from 162 agricultural organisms, with 10 types of omics data comprising next-generation sequencing sequence read archive, genome, gene, nucleotide, DNA chip, expressed sequence tag, interactome, protein structure, molecular marker, and single-nucleotide polymorphism datasets. Our genomic resources contain information on five animals, seven plants, and one fungus, which is accessed through a genome browser. We also developed a data submission and analysis system as a web service, with easy-to-use functions and cutting-edge algorithms, including those for handling next-generation sequencing data. PMID:26848255

  16. Genome-wide association study identified three major QTL for carcass weight including the PLAG1-CHCHD7 QTN for stature in Japanese Black cattle

    PubMed Central

    2012-01-01

    Background Significant quantitative trait loci (QTL) for carcass weight were previously mapped on several chromosomes in Japanese Black half-sib families. Two QTL, CW-1 and CW-2, were narrowed down to 1.1-Mb and 591-kb regions, respectively. Recent advances in genomic tools allowed us to perform a genome-wide association study (GWAS) in cattle to detect associations in a general population and estimate their effect size. Here, we performed a GWAS for carcass weight using 1156 Japanese Black steers. Results Bonferroni-corrected genome-wide significant associations were detected in three chromosomal regions on bovine chromosomes (BTA) 6, 8, and 14. The associated single nucleotide polymorphisms (SNP) on BTA 6 were in linkage disequilibrium with the SNP encoding NCAPG Ile442Met, which was previously identified as a candidate quantitative trait nucleotide for CW-2. In contrast, the most highly associated SNP on BTA 14 was located 2.3-Mb centromeric from the previously identified CW-1 region. Linkage disequilibrium mapping led to a revision of the CW-1 region within a 0.9-Mb interval around the associated SNP, and targeted resequencing followed by association analysis highlighted the quantitative trait nucleotides for bovine stature in the PLAG1-CHCHD7 intergenic region. The association on BTA 8 was accounted for by two SNP on the BovineSNP50 BeadChip and corresponded to CW-3, which was simultaneously detected by linkage analyses using half-sib families. The allele substitution effects of CW-1, CW-2, and CW-3 were 28.4, 35.3, and 35.0 kg per allele, respectively. Conclusion The GWAS revealed the genetic architecture underlying carcass weight variation in Japanese Black cattle in which three major QTL accounted for approximately one-third of the genetic variance. PMID:22607022
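
    The Bonferroni-corrected genome-wide threshold mentioned in the results can be illustrated as follows. The abstract does not state how many SNPs were tested, so the SNP names and p-values below are invented and the helper names are hypothetical; this is a sketch of the general correction, not the authors' pipeline.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


def genome_wide_hits(p_values, alpha=0.05):
    """Return the names of SNPs whose p-value passes the corrected threshold.

    p_values maps a SNP name to its association p-value; the number of
    tests is taken to be the number of SNPs supplied.
    """
    threshold = bonferroni_threshold(alpha, len(p_values))
    return sorted(snp for snp, p in p_values.items() if p < threshold)


# Invented p-values for four SNPs; a real scan would test tens of thousands.
example = {"snp_a": 1e-9, "snp_b": 0.02, "snp_c": 0.2, "snp_d": 0.001}
print(bonferroni_threshold(0.05, len(example)))
print(genome_wide_hits(example))
```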

  17. High-Throughput Sequencing and Linkage Mapping of a Clownfish Genome Provide Insights on the Distribution of Molecular Players Involved in Sex Change.

    PubMed

    Casas, Laura; Saenz-Agudelo, Pablo; Irigoien, Xabier

    2018-03-06

    Clownfishes are an excellent model system for investigating the genetic mechanism governing hermaphroditism and socially-controlled sex change in their natural environment because they are broadly distributed and strongly site-attached. Genomic tools, such as genetic linkage maps, allow fine-mapping of loci involved in molecular pathways underlying these reproductive processes. In this study, a high-density genetic map of Amphiprion bicinctus was constructed with 3146 RAD markers in a full-sib family organized in 24 robust linkage groups which correspond to the haploid chromosome number of the species. The length of the map was 4294.71 cM, with an average marker interval of 1.38 cM. The clownfish linkage map showed various levels of conserved synteny and collinearity with the genomes of Asian and European seabass, Nile tilapia and stickleback. The map provided a platform to investigate the genomic position of genes with differential expression during sex change in A. bicinctus. This study aims to bridge the gap of genome-scale information for this iconic group of species to facilitate the study of the main gene regulatory networks governing social sex change and gonadal restructuring in protandrous hermaphrodites.
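
    The reported average marker interval can be checked against the other figures in the abstract. The sketch below assumes that the average is the total map length divided by the number of intervals between adjacent markers, which is the marker count minus the number of linkage groups; the abstract does not say how the authors computed it, and the function name is hypothetical.

```python
def average_marker_interval(map_length_cm, n_markers, n_linkage_groups):
    """Average distance (cM) between adjacent markers on a linkage map.

    Each linkage group with k markers contributes k - 1 intervals, so the
    whole map has n_markers - n_linkage_groups intervals.
    """
    n_intervals = n_markers - n_linkage_groups
    if n_intervals <= 0:
        raise ValueError("need more markers than linkage groups")
    return map_length_cm / n_intervals


# Figures taken from the abstract: 4294.71 cM, 3146 markers, 24 groups.
interval = average_marker_interval(4294.71, 3146, 24)
print(f"{interval:.2f} cM")
```

    Under this assumption the figures in the abstract are mutually consistent; dividing by the raw marker count instead gives a slightly smaller value.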

  18. A reference linkage map for Eucalyptus

    PubMed Central

    2012-01-01

    Background Genetic linkage maps are invaluable resources in plant research. They provide a key tool for many genetic applications including: mapping quantitative trait loci (QTL); comparative mapping; identifying unlinked (i.e. independent) DNA markers for fingerprinting, population genetics and phylogenetics; assisting genome sequence assembly; relating physical and recombination distances along the genome and map-based cloning of genes. Eucalypts are the dominant tree species in most Australian ecosystems and of economic importance globally as plantation trees. The genome sequence of E. grandis has recently been released providing unprecedented opportunities for genetic and genomic research in the genus. A robust reference linkage map containing sequence-based molecular markers is needed to capitalise on this resource. Several high density linkage maps have recently been constructed for the main commercial forestry species in the genus (E. grandis, E. urophylla and E. globulus) using sequenced Diversity Arrays Technology (DArT) and microsatellite markers. To provide a single reference linkage map for eucalypts a composite map was produced through the integration of data from seven independent mapping experiments (1950 individuals) using a marker-merging method. Results The composite map totalled 1107 cM and contained 4101 markers; comprising 3880 DArT, 213 microsatellite and eight candidate genes. Eighty-one DArT markers were mapped to two or more linkage groups, resulting in the 4101 markers being mapped to 4191 map positions. Approximately 13% of DArT markers mapped to identical map positions, thus the composite map contained 3634 unique loci at an average interval of 0.31 cM. Conclusion The composite map represents the most saturated linkage map yet produced in Eucalyptus. As the majority of DArT markers contained on the map have been sequenced, the map provides a direct link to the E. grandis genome sequence and will serve as an important reference for

  19. Schizophrenia: A genome search targets chromosomes 3 and 8 for exploration

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lasseter, V.K.; Pulver, A.E.; Wolyniec, P.

    1994-09-01

    Using a systematically ascertained sample of 57 families, each having 2 or more members with a consensus diagnosis of schizophrenia (DSM-III-R criteria), we have searched approximately 75% of the genome for susceptibility loci for schizophrenia. Genetic linkage studies of 520 loci have been performed using a complex autosomal dominant model incorporating age-at-onset and certainty of the diagnosis. Results were analyzed under the hypothesis of heterogeneity using the A-test and the Liang test. A two-stage strategy based on lod score thresholds from simulation studies of our sample identified regions for further exploration (hot spots). In each region, a dense map of highly informative dinucleotide repeat polymorphisms (heterozygosity greater than .70) is analyzed using linkage studies with dominant, recessive, and affected only models and non-parametric sib pair identity-by-descent methods (SIBPAL, S.A.G.E. 2.1). In no region has a lod score > 3 been observed; however, affected sib pair analyses gave a p-value of .0001 corresponding to a lod score > 3. Current ‘hot spots’ on chromosomes 3p26-p24 and 8p22-p21 will be presented. For 3p26-p24, the maximum two-point lod score is 1.97 (dominant model) and the SIBPAL p-value is .02. For 8p22-p21, the maximum two-point lod score is 2.02 (recessive model) and the SIBPAL p-value is .0001.

  20. Genome-Wide Significant Association between Alcohol Dependence and a Variant in the ADH Gene Cluster

    PubMed Central

    Frank, Josef; Cichon, Sven; Treutlein, Jens; Ridinger, Monika; Mattheisen, Manuel; Hoffmann, Per; Herms, Stefan; Wodarz, Norbert; Soyka, Michael; Zill, Peter; Maier, Wolfgang; Mössner, Rainald; Gaebel, Wolfgang; Dahmen, Norbert; Scherbaum, Norbert; Schmäl, Christine; Steffens, Michael; Lucae, Susanne; Ising, Marcus; Müller-Myhsok, Bertram; Nöthen, Markus M; Mann, Karl; Kiefer, Falk; Rietschel, Marcella

    2011-01-01

    Alcohol dependence (AD) is an important contributory factor to the global burden of disease. The etiology of AD involves both environmental and genetic factors, and the disorder has a heritability of around 50%. The aim of the present study was to identify susceptibility genes for AD by performing a genome-wide association study (GWAS). The sample comprised 1,333 male in-patients with severe DSM-IV AD and 2,168 controls. These included 487 patients and 1,358 controls from a previous GWAS study by our group. All individuals were of German descent. Single marker tests and a polygenic score based analysis to assess the combined contribution of multiple markers with small effects were performed. The SNP rs1789891, which is located between the ADH1B and ADH1C genes, achieved genome-wide significance (p=1.27E–8; OR=1.46). Other markers from this region were also associated with AD, and conditional analyses indicated that these made a partially independent contribution. The SNP rs1789891 is in complete linkage disequilibrium with the functional Arg272Gln variant (p=1.24E–7, OR=1.31) of the ADH1C gene, which has been reported to modify the rate of ethanol oxidation to acetaldehyde in vitro. A polygenic score based approach produced a significant result (p=9.66E–9). This is the first GWAS of AD to provide genome-wide significant support for the role of the ADH gene cluster and to suggest a polygenic component to the etiology of AD. The latter result suggests that many more AD susceptibility genes still await identification. PMID:22004471

  1. Genome-wide nucleotide diversity of hatchery-reared Atlantic and Mediterranean strains of brown trout Salmo trutta compared to wild Mediterranean populations.

    PubMed

    Leitwein, M; Gagnaire, P-A; Desmarais, E; Guendouz, S; Rohmer, M; Berrebi, P; Guinand, B

    2016-12-01

    A genome-wide assessment of diversity is provided for wild Mediterranean brown trout Salmo trutta populations from headwater tributaries of the Orb River and from Atlantic and Mediterranean hatchery-reared strains that have been used for stocking. Double-digest restriction-site-associated DNA sequencing (dd-RADseq) was performed and the efficiency of de novo and reference-mapping approaches to obtain individual genotypes was compared. Large numbers of single nucleotide polymorphism (SNP) markers with similar genome-wide distributions were discovered using both approaches (196 639 v. 121 016 SNPs, respectively), with c. 80% of the loci detected de novo being also found with reference mapping, using the Atlantic salmon Salmo salar genome as a reference. Lower mapping density but larger nucleotide diversity (π) was generally observed near extremities of linkage groups, consistent with regions of residual tetrasomic inheritance observed in salmonids. Genome-wide diversity estimates revealed reduced polymorphism in hatchery strains (π = 0·0040 and π = 0·0029 in Atlantic and Mediterranean strains, respectively) compared to wild populations (π = 0·0049), a pattern that was congruent with allelic richness estimated from microsatellite markers. Finally, pronounced heterozygote deficiency was found in hatchery strains (Atlantic FIS = 0·18; Mediterranean FIS = 0·42), indicating that stocking practices may affect the genetic diversity in wild populations. These new genomic resources will provide important tools to define better conservation strategies in S. trutta. © 2016 The Fisheries Society of the British Isles.

  2. A high-density SNP genetic linkage map for the silver-lipped pearl oyster, Pinctada maxima: a valuable resource for gene localisation and marker-assisted selection.

    PubMed

    Jones, David B; Jerry, Dean R; Khatkar, Mehar S; Raadsma, Herman W; Zenger, Kyall R

    2013-11-20

    The silver-lipped pearl oyster, Pinctada maxima, is an important tropical aquaculture species extensively farmed for the highly sought "South Sea" pearls. Traditional breeding programs have been initiated for this species in order to select for improved pearl quality, but many economic traits under selection are complex, polygenic and confounded with environmental factors, limiting the accuracy of selection. The incorporation of a marker-assisted selection (MAS) breeding approach would greatly benefit pearl breeding programs by allowing the direct selection of genes responsible for pearl quality. However, before MAS can be incorporated, substantial genomic resources such as genetic linkage maps need to be generated. The construction of a high-density genetic linkage map for P. maxima is not only essential for unravelling the genomic architecture of complex pearl quality traits, but also provides indispensable information on the genome structure of pearl oysters. A total of 1,189 informative genome-wide single nucleotide polymorphisms (SNPs) were incorporated into linkage map construction. The final linkage map consisted of 887 SNPs in 14 linkage groups, spans a total genetic distance of 831.7 centimorgans (cM), and covers an estimated 96% of the P. maxima genome. Assessment of sex-specific recombination across all linkage groups revealed limited overall heterochiasmy between the sexes (i.e. 1.15:1 F/M map length ratio). However, there were pronounced localised differences throughout the linkage groups, whereby male recombination was suppressed near the centromeres compared to female recombination, but inflated towards telomeric regions. Mean values of LD for adjacent SNP pairs suggest that a higher density of markers will be required for powerful genome-wide association studies. Finally, numerous nacre biomineralization genes were localised providing novel positional information for these genes. This high-density SNP genetic map is the first comprehensive linkage

  3. Genome-Wide Comparative Gene Family Classification

    PubMed Central

    Frech, Christian; Chen, Nansheng

    2010-01-01

    Correct classification of genes into gene families is important for understanding gene function and evolution. Although gene families of many species have been resolved both computationally and experimentally with high accuracy, gene family classification in most newly sequenced genomes has not been done with the same high standard. This project has been designed to develop a strategy to effectively and accurately classify gene families across genomes. We first examine and compare the performance of computer programs developed for automated gene family classification. We demonstrate that some programs, including the hierarchical average-linkage clustering algorithm MC-UPGMA and the popular Markov clustering algorithm TRIBE-MCL, can reconstruct manual curation of gene families accurately. However, their performance is highly sensitive to parameter setting, i.e. different gene families require different program parameters for correct resolution. To circumvent the problem of parameterization, we have developed a comparative strategy for gene family classification. This strategy takes advantage of existing curated gene families of reference species to find suitable parameters for classifying genes in related genomes. To demonstrate the effectiveness of this novel strategy, we use TRIBE-MCL to classify chemosensory and ABC transporter gene families in C. elegans and its four sister species. We conclude that fully automated programs can establish biologically accurate gene families if parameterized accordingly. Comparative gene family classification finds optimal parameters automatically, thus allowing rapid insights into gene families of newly sequenced species. PMID:20976221

  4. High Quality Genomic Copy Number Data from Archival Formalin-Fixed Paraffin-Embedded Leiomyosarcoma: Optimisation of Universal Linkage System Labelling

    PubMed Central

    Salawu, Abdulazeez; Ul-Hassan, Aliya; Hammond, David; Fernando, Malee; Reed, Malcolm; Sisley, Karen

    2012-01-01

    Most soft tissue sarcomas are characterized by genetic instability and frequent genomic copy number aberrations that are not subtype-specific. Oligonucleotide microarray-based Comparative Genomic Hybridisation (array CGH) is an important technique used to map genome-wide copy number aberrations, but the traditional requirement for high-quality DNA typically obtained from fresh tissue has limited its use in sarcomas. Although large archives of Formalin-fixed Paraffin-embedded (FFPE) tumour samples are available for research, the degradative effects of formalin on DNA from these tissues have made labelling and analysis by array CGH technically challenging. The Universal Linkage System (ULS) may be used for a one-step chemical labelling of such degraded DNA. We have optimised the ULS labelling protocol to perform aCGH on archived FFPE leiomyosarcoma tissues using the 180k Agilent platform. Preservation age of samples ranged from a few months to seventeen years and the DNA showed a wide range of degradation (when visualised on agarose gels). Consistently high DNA labelling efficiency and low microarray probe-to-probe variation (as measured by the derivative log ratio spread) were seen. Comparison of paired fresh and FFPE samples from identical tumours showed good correlation of CNAs detected. Furthermore, the ability to macro-dissect FFPE samples permitted the detection of CNAs that were masked in fresh tissue. Aberrations were visually confirmed using Fluorescence in situ Hybridisation. These results suggest that archival FFPE tissue, with its relative abundance and attendant clinical data, may be used for effective mapping of genomic copy number aberrations in such rare tumours as leiomyosarcoma and potentially unravel clues to tumour origins, progression and ultimately, targeted treatment. PMID:23209738

  5. Search for sarcoidosis candidate genes by integration of data from genomic, transcriptomic and proteomic studies.

    PubMed

    Maver, Ales; Medica, Igor; Peterlin, Borut

    2009-12-01

    The search for gene candidates in multifactorial diseases such as sarcoidosis can be based on the integration of linkage association data, gene expression data, and protein profile data from genomic, transcriptomic and proteomic studies, respectively. In this study we performed a literature-based search for studies reporting such data, followed by integration of collected information. Different databases were examined: Medline, HuGE Navigator, ArrayExpress and Gene Expression Omnibus (GEO). Candidate genes were defined as genes which were reported in at least 2 different types of omics studies. Genes previously investigated in sarcoidosis were excluded from further analyses. We identified 177 genes associated with sarcoidosis as potential new candidate genes. Subsequently, 9 gene candidates identified to overlap in 2 different types of studies (genomic, transcriptomic and/or proteomic) were consistently reported in at least 3 studies: SERPINB1, FABP4, S100A8, HBEGF, IL7R, LRIG1, PTPN23, DPM2 and NUP214. These genes are involved in regulation of immune response, cellular proliferation, apoptosis, inhibition of protease activity, and lipid metabolism. Exact biological functions of HBEGF, LRIG1, PTPN23, DPM2 and NUP214 remain to be completely elucidated. We propose 9 candidate genes: SERPINB1, FABP4, S100A8, HBEGF, IL7R, LRIG1, PTPN23, DPM2 and NUP214, as genes with high potential for association with sarcoidosis.

  6. Toward a framework linkage map of the canine genome.

    PubMed

    Langston, A A; Mellersh, C S; Wiegand, N A; Acland, G M; Ray, K; Aguirre, G D; Ostrander, E A

    1999-01-01

    Selective breeding to maintain specific physical and behavioral traits has made the modern dog one of the most physically diverse species on earth. One unfortunate consequence of the common breeding practices used to develop lines of dogs with the desired traits is amplification and propagation of genetic diseases within distinct breeds. To map disease loci we have constructed a first-generation framework map of the canine genome. We developed large numbers of highly polymorphic markers, constructed a panel of canine-rodent hybrid cell lines, and assigned those markers to chromosome groups using the hybrid cell lines. Finally, we determined the order and spacing of markers on individual canine chromosomes by linkage analysis using a reference panel of 17 outbred pedigrees. This article describes approaches and strategies to accomplish these goals.

  7. Genetic variations and risk of placental abruption: A genome-wide association study and meta-analysis of genome-wide association studies.

    PubMed

    Workalemahu, Tsegaselassie; Enquobahrie, Daniel A; Gelaye, Bizu; Sanchez, Sixto E; Garcia, Pedro J; Tekola-Ayele, Fasil; Hajat, Anjum; Thornton, Timothy A; Ananth, Cande V; Williams, Michelle A

    2018-06-01

    Accumulating epidemiological evidence points to strong genetic susceptibility to placental abruption (PA). However, characterization of genes associated with PA remains incomplete. We conducted a genome-wide association study (GWAS) of PA and a meta-analysis of GWAS. Participants of the Placental Abruption Genetic Epidemiology (PAGE) study, a population based case-control study of PA conducted in Lima, Peru, were genotyped using the Illumina HumanCore-24 BeadChip platform. Genotypes were imputed using the 1000 genomes reference panel, and >4.9 million SNPs that passed quality control were analyzed. We performed a GWAS in PAGE participants (507 PA cases and 1090 controls) and a GWAS meta-analysis in 2512 participants (959 PA cases and 1553 controls) that included PAGE and the previously reported Peruvian Abruptio Placentae Epidemiology (PAPE) study. We fitted population stratification-adjusted logistic regression models and fixed-effects meta-analyses using inverse-variance weighting. Independent loci (linkage-disequilibrium<0.80) suggestively associated with PA (P-value<5e-5) included rs4148646 and rs2074311 in ABCC8, rs7249210, rs7250184, rs7249100 and rs10401828 in ZNF28, rs11133659 in CTNND2, and rs2074314 and rs35271178 near KCNJ11 in the PAGE GWAS. Similarly, independent loci suggestively associated with PA in the GWAS meta-analysis included rs76258369 near IRX1, and rs7094759 and rs12264492 in ADAM12. Functional analyses of these genes showed trophoblast-like cell interaction, as well as networks involved in endocrine system disorders, cardiovascular diseases, and cellular function. We identified several genetic loci and related functions that may play a role in PA risk. Understanding genetic factors underlying pathophysiological mechanisms of PA may facilitate prevention and early diagnostic efforts. Published by Elsevier Ltd.
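
    As a rough illustration of the fixed-effects, inverse-variance-weighted meta-analysis mentioned in this abstract, the Python sketch below pools per-study effect estimates for a single SNP. The function name and the input numbers are hypothetical and are not taken from the PAGE or PAPE data; the two-sided P-value assumes a normal approximation for the pooled estimate.

```python
import math


def fixed_effects_meta(betas, ses):
    """Pool per-study effect sizes (e.g. log odds ratios) for one SNP.

    betas: per-study effect estimates
    ses:   per-study standard errors (all > 0)
    Returns (pooled_beta, pooled_se, z, two_sided_p).
    """
    if not betas or len(betas) != len(ses):
        raise ValueError("betas and ses must be non-empty and of equal length")
    # Each study is weighted by the inverse of its sampling variance.
    weights = [1.0 / (se * se) for se in ses]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled_beta / pooled_se
    # Two-sided P-value under a standard normal null distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled_beta, pooled_se, z, p_value


# Hypothetical example: two studies reporting log odds ratios for one SNP.
beta, se, z, p = fixed_effects_meta([0.2, 0.4], [0.1, 0.2])
```

    The more precise study (smaller standard error) dominates the pooled estimate, which is the defining property of inverse-variance weighting.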

  8. Combined linkage and association analyses identify a novel locus for obesity near PROX1 in Asians.

    PubMed

    Kim, Hyun-Jin; Yoo, Yun Joo; Ju, Young Seok; Lee, Seungbok; Cho, Sung-Il; Sung, Joohon; Kim, Jong-Il; Seo, Jeong-Sun

    2013-11-01

    Although genome-wide association studies (GWAS) have substantially contributed to understanding the genetic architecture, unidentified variants for complex traits remain an issue. One of the efficient approaches is the improvement of the power of GWAS scan by weighting P values with prior linkage signals. Our objective was to identify the novel candidates for obesity in Asian populations by using gene-mapping strategies that combine linkage and association analyses. To obtain linkage information for body mass index (BMI) and waist circumference (WC), we performed a multipoint genome-wide linkage study in an isolated Mongolian sample of 1,049 individuals from 74 families. Next, a family-based GWAS, which integrates within- and between-family components, was performed using the genotype data of 756 individuals of the Mongolian sample, and P values for association were weighted using linkage information obtained previously. For both BMI (LOD = 3.3) and WC (LOD = 2.6), the highest linkage peak was discovered at chromosome 10q11.22. In family-based GWAS combined with linkage information, six single-nucleotide polymorphisms (SNPs) for BMI and five SNPs for WC reached a significant level of association (linkage weighted P < 1 × 10^-5). Of these, only one of the SNPs associated with WC (rs1704198) was replicated in 327 Korean families comprising 1,301 individuals. This SNP was located in the proximity of the prospero-related homeobox 1 (PROX1) gene, the function of which was validated previously in a mouse model. Our powerful strategic analysis enabled the discovery of a novel candidate gene, PROX1, associated with WC in an Asian population. Copyright © 2012 The Obesity Society.

  9. Whole genome sequences are required to fully resolve the linkage disequilibrium structure of human populations.

    PubMed

    Pengelly, Reuben J; Tapper, William; Gibson, Jane; Knut, Marcin; Tearle, Rick; Collins, Andrew; Ennis, Sarah

    2015-09-03

    An understanding of linkage disequilibrium (LD) structures in the human genome underpins much of medical genetics and provides a basis for disease gene mapping and investigating biological mechanisms such as recombination and selection. Whole genome sequencing (WGS) provides the opportunity to determine LD structures at maximal resolution. We compare LD maps constructed from WGS data with LD maps produced from the array-based HapMap dataset, for representative European and African populations. WGS provides up to 5.7-fold greater SNP density than array-based data and achieves much greater resolution of LD structure, allowing for identification of up to 2.8-fold more regions of intense recombination. The absence of ascertainment bias in variant genotyping improves the population representativeness of the WGS maps, and highlights the extent of uncaptured variation using array genotyping methodologies. The complete capture of LD patterns using WGS allows for higher genome-wide association study (GWAS) power compared to array-based GWAS, with WGS also allowing for the analysis of rare variation. The impact of marker ascertainment issues in arrays has been greatest for Sub-Saharan African populations where larger sample sizes and substantially higher marker densities are required to fully resolve the LD structure. WGS provides the best possible resource for LD mapping due to the maximal marker density and lack of ascertainment bias. WGS LD maps provide a rich resource for medical and population genetics studies. The increasing availability of WGS data for large populations will allow for improved research utilising LD, such as GWAS and recombination biology studies.
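
    For readers unfamiliar with how pairwise LD is quantified, the following Python sketch computes the common r² statistic between two SNPs as the squared Pearson correlation of allele dosages (0, 1 or 2 copies per individual). This is an assumption chosen for illustration: it is the genotype-correlation estimator, not the LD map construction used in the study above, and the function name and genotype vectors are hypothetical.

```python
def genotype_r2(g1, g2):
    """Squared Pearson correlation between two allele-dosage vectors.

    g1, g2: per-individual dosages (0, 1 or 2) at two SNPs, same order.
    """
    n = len(g1)
    if n != len(g2) or n < 2:
        raise ValueError("need two dosage vectors of equal length >= 2")
    mean1 = sum(g1) / n
    mean2 = sum(g2) / n
    cov = sum((a - mean1) * (b - mean2) for a, b in zip(g1, g2))
    var1 = sum((a - mean1) ** 2 for a in g1)
    var2 = sum((b - mean2) ** 2 for b in g2)
    if var1 == 0 or var2 == 0:
        # r^2 is undefined when either SNP is monomorphic in the sample.
        raise ValueError("r^2 is undefined for a monomorphic SNP")
    return (cov * cov) / (var1 * var2)


# Hypothetical dosages for four individuals at three SNPs.
snp_a = [0, 0, 2, 2]
snp_b = [0, 2, 0, 2]   # varies independently of snp_a in this sample
snp_c = [2, 2, 0, 0]   # mirror image of snp_a

r2_independent = genotype_r2(snp_a, snp_b)
r2_complete = genotype_r2(snp_a, snp_c)
```

    Because r² is a squared correlation, the mirrored SNP pair is in complete LD even though the counted alleles are opposite.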

  10. Genome-Wide Association Analysis of Aluminum Tolerance in Cultivated and Tibetan Wild Barley

    PubMed Central

    Cai, Shengguan; Wu, Dezhi; Jabeen, Zahra; Huang, Yuqing; Huang, Yechang; Zhang, Guoping

    2013-01-01

    Tibetan wild barley (Hordeum vulgare L. ssp. spontaneum), originating from and grown in the harsh environment of Tibet, is well known for its rich germplasm with high tolerance to abiotic stresses. However, the genetic variation and genes involved in Al tolerance are not totally known for the wild barley. In this study, a genome-wide association analysis (GWAS) was performed by using four root parameters related to Al tolerance and 469 DArT markers on 7 chromosomes within or across 110 Tibetan wild accessions and 56 cultivated cultivars. Population structure and cluster analysis revealed that a wide genetic diversity was present in Tibetan wild barley. Linkage disequilibrium (LD) decayed more rapidly in Tibetan wild barley (9.30 cM) than cultivated barley (11.52 cM), indicating that GWAS may provide higher resolution in the Tibetan group. Two novel Tibetan group-specific loci, bpb-9458 and bpb-8524, were identified, which were associated with relative longest root growth (RLRG), located on chromosomes 2H and 7H of the barley genome, and could explain 12.9% and 9.7% of the phenotypic variation, respectively. Moreover, a common locus bpb-6949, localized 0.8 cM away from a candidate gene HvMATE, was detected in both wild and cultivated barleys, and showed significant association with total root growth (TRG). The present study highlights that Tibetan wild barley could provide elite germplasm and novel genes for improving barley Al tolerance. PMID:23922796

  11. Genome-wide association study of the age of onset of childhood asthma.

    PubMed

    Forno, Erick; Lasky-Su, Jessica; Himes, Blanca; Howrylak, Judie; Ramsey, Clare; Brehm, John; Klanderman, Barbara; Ziniti, John; Melén, Erik; Pershagen, Goran; Wickman, Magnus; Martinez, Fernando; Mauger, Dave; Sorkness, Christine; Tantisira, Kelan; Raby, Benjamin A; Weiss, Scott T; Celedón, Juan C

    2012-07-01

    Childhood asthma is a complex disease with known heritability and phenotypic diversity. Although an earlier onset has been associated with more severe disease, there has been no genome-wide association study of the age of onset of asthma in children. We sought to identify genetic variants associated with earlier onset of childhood asthma. We conducted the first genome-wide association study of the age of onset of childhood asthma among participants in the Childhood Asthma Management Program (CAMP) and used 3 independent cohorts from North America, Costa Rica, and Sweden for replication. Two single nucleotide polymorphisms (SNPs) were associated with earlier onset of asthma in the combined analysis of CAMP and the replication cohorts: rs9815663 (Fisher P = 2.31 × 10^-8) and rs7927044 (P = 6.54 × 10^-9). Of these 2 SNPs, rs9815663 was also significantly associated with earlier asthma onset in an analysis including only the replication cohorts. Ten SNPs in linkage disequilibrium with rs9815663 were also associated with earlier asthma onset (2.24 × 10^-7

  12. ParallABEL: an R library for generalized parallelization of genome-wide association studies.

    PubMed

    Sangket, Unitsa; Mahasirimongkol, Surakameth; Chantratita, Wasun; Tandayya, Pichaya; Aulchenko, Yurii S

    2010-04-29

    Genome-Wide Association (GWA) analysis is a powerful method for identifying loci associated with complex traits and drug response. Parts of GWA analyses, especially those involving thousands of individuals and consuming hours to months, will benefit from parallel computation. Acquiring the necessary programming skills to correctly partition and distribute data, control and monitor tasks on clustered computers, and merge output files is arduous. Most components of GWA analysis can be divided into four groups based on the types of input data and statistical outputs. The first group contains statistics computed for a particular Single Nucleotide Polymorphism (SNP), or trait, such as SNP characterization statistics or association test statistics. The input data of this group includes the SNPs/traits. The second group concerns statistics characterizing an individual in a study, for example, the summary statistics of genotype quality for each sample. The input data of this group includes individuals. The third group consists of pair-wise statistics derived from analyses between each pair of individuals in the study, for example genome-wide identity-by-state or genomic kinship analyses. The input data of this group includes pairs of SNPs/traits. The final group concerns pair-wise statistics derived for pairs of SNPs, such as the linkage disequilibrium characterisation. The input data of this group includes pairs of individuals. We developed the ParallABEL library, which utilizes the Rmpi library, to parallelize these four types of computations. The ParallABEL library is not only aimed at GenABEL, but may also be employed to parallelize various GWA packages in R. The data set from the North American Rheumatoid Arthritis Consortium (NARAC), comprising 2,062 individuals genotyped at 545,080 SNPs, was used to measure ParallABEL performance. Almost perfect speed-up was achieved for many types of analyses. For example, the computing time for the identity-by-state matrix was
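
    The partition-compute-merge pattern described in this abstract can be sketched in a few lines. The Python example below is only an analogy, not ParallABEL's API (ParallABEL is an R library built on Rmpi): all function names and the toy genotype matrix are hypothetical, and a thread pool stands in for the cluster worker processes. It covers a per-SNP statistic (allele frequency) and a per-pair-of-individuals statistic (identity-by-state); per-individual and per-SNP-pair statistics would follow the same shape with a different task list.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations


def split_evenly(items, n_chunks):
    """Split items into n_chunks contiguous lists whose sizes differ by at most one."""
    items = list(items)
    base, extra = divmod(len(items), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        size = base + (1 if i < extra else 0)
        chunks.append(items[start:start + size])
        start += size
    return chunks


def allele_freq(genotypes, snp):
    """Per-SNP statistic: frequency of the counted allele (dosages are 0/1/2)."""
    return sum(row[snp] for row in genotypes) / (2 * len(genotypes))


def ibs(genotypes, i, j):
    """Per-pair statistic: mean identity-by-state sharing (0..1) of individuals i and j."""
    a, b = genotypes[i], genotypes[j]
    return sum(2 - abs(x - y) for x, y in zip(a, b)) / (2 * len(a))


def parallel_map(func, tasks, n_workers):
    """Partition tasks, evaluate each chunk on a worker, merge results in task order."""
    chunks = split_evenly(tasks, n_workers)

    def run_chunk(chunk):
        return [func(task) for task in chunk]

    # Threads are a stand-in here; a real cluster run would use separate processes.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        results = list(pool.map(run_chunk, chunks))
    return [value for chunk in results for value in chunk]


# Toy data: 4 individuals (rows) x 3 SNPs (columns), hypothetical dosages.
genotypes = [[0, 1, 2], [0, 1, 1], [2, 1, 0], [1, 1, 1]]

snp_freqs = parallel_map(lambda s: allele_freq(genotypes, s), range(3), 2)
ibs_pairs = list(combinations(range(4), 2))
ibs_values = parallel_map(lambda p: ibs(genotypes, p[0], p[1]), ibs_pairs, 2)
```

    Keeping the merge step order-preserving means the parallel result can be compared directly against a serial run, which is a convenient correctness check.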

  13. ParallABEL: an R library for generalized parallelization of genome-wide association studies

    PubMed Central

    2010-01-01

    Background Genome-Wide Association (GWA) analysis is a powerful method for identifying loci associated with complex traits and drug response. Parts of GWA analyses, especially those involving thousands of individuals and consuming hours to months, will benefit from parallel computation. Acquiring the necessary programming skills to correctly partition and distribute data, control and monitor tasks on clustered computers, and merge output files is arduous. Results Most components of GWA analysis can be divided into four groups based on the types of input data and statistical outputs. The first group contains statistics computed for a particular Single Nucleotide Polymorphism (SNP), or trait, such as SNP characterization statistics or association test statistics. The input data of this group includes the SNPs/traits. The second group concerns statistics characterizing an individual in a study, for example, the summary statistics of genotype quality for each sample. The input data of this group includes individuals. The third group consists of pair-wise statistics derived from analyses between each pair of individuals in the study, for example genome-wide identity-by-state or genomic kinship analyses. The input data of this group includes pairs of SNPs/traits. The final group concerns pair-wise statistics derived for pairs of SNPs, such as the linkage disequilibrium characterisation. The input data of this group includes pairs of individuals. We developed the ParallABEL library, which utilizes the Rmpi library, to parallelize these four types of computations. The ParallABEL library is not only aimed at GenABEL, but may also be employed to parallelize various GWA packages in R. The data set from the North American Rheumatoid Arthritis Consortium (NARAC), comprising 2,062 individuals genotyped at 545,080 SNPs, was used to measure ParallABEL performance. Almost perfect speed-up was achieved for many types of analyses. For example, the computing time for the identity

  14. Genome-Wide Analysis of Seed Acid Detergent Lignin (ADL) and Hull Content in Rapeseed (Brassica napus L.)

    PubMed Central

    Wei, Lijuan; Qu, Cunmin; Xu, Xinfu; Lu, Kun; Qian, Wei; Li, Jiana; Li, Maoteng; Liu, Liezhao

    2015-01-01

    A stable yellow-seeded variety is the breeding goal for obtaining the ideal rapeseed (Brassica napus L.) plant, and the amount of acid detergent lignin (ADL) in the seeds and the hull content (HC) are often used as yellow-seeded rapeseed screening indices. In this study, a genome-wide association analysis of 520 accessions was performed using the Q + K model with a total of 31,839 single-nucleotide polymorphism (SNP) sites. As a result, three significant associations on the B. napus chromosomes A05, A09, and C05 were detected for seed ADL content. The peak SNPs were within 9.27, 14.22, and 20.86 kb of the key genes BnaA.PAL4, BnaA.CAD2/BnaA.CAD3, and BnaC.CCR1, respectively. Further analyses were performed on the major locus of A05, which was also detected in the seed HC examination. A comparison of our genome-wide association study (GWAS) results and previous linkage mappings revealed a common chromosomal region on A09, which indicates that GWAS can be used as a powerful complementary strategy for dissecting complex traits in B. napus. Genomic selection (GS) utilizing the significant SNP markers based on the GWAS results exhibited increased predictive ability, indicating that the predictive ability of a given model can be substantially improved by using GWAS and GS. PMID:26673885

  15. A genome-wide association study by ImmunoChip reveals potential modifiers in myelodysplastic syndromes.

    PubMed

    Danjou, Fabrice; Fozza, Claudio; Zoledziewska, Magdalena; Mulas, Antonella; Corda, Giovanna; Contini, Salvatore; Dore, Fausto; Galleu, Antonio; Di Tucci, Anna Angela; Caocci, Giovanni; Gaviano, Eleonora; Latte, Giancarlo; Gabbas, Attilio; Casula, Paolo; Delogu, Lucia Gemma; La Nasa, Giorgio; Angelucci, Emanuele; Cucca, Francesco; Longinotti, Maurizio

    2016-11-01

    Because different findings suggest that an immune dysregulation plays a role in the pathogenesis of myelodysplastic syndrome (MDS), we analyzed a large cohort of patients from a homogeneous Sardinian population using ImmunoChip, a genotyping array exploring 147,954 single-nucleotide polymorphisms (SNPs) localized in genomic regions displaying some degree of association with immune-mediated diseases or pathways. The population studied included 133 cases and 3,894 controls, and a total of 153,978 autosomal markers and 971 non-autosomal markers were genotyped. After association analysis, only one variant passed the genome-wide significance threshold: rs71325459 (p = 1.16 × 10^-12), which is situated on chromosome 20. The variant is in high linkage disequilibrium with rs35640778, an untested missense variant situated in the RTEL1 gene, an interesting candidate that encodes for an ATP-dependent DNA helicase implicated in telomere-length regulation, DNA repair, and maintenance of genomic stability. The second most associated signal is composed of five variants that fall slightly below the genome-wide significance threshold but point out another interesting gene candidate. These SNPs, with p values between 2.53 × 10^-6 and 3.34 × 10^-6, are situated in the methylene tetrahydrofolate reductase (MTHFR) gene. The most associated of these variants, rs1537514, presents an increased frequency of the derived C allele in cases, with 11.4% versus 4.4% in controls. MTHFR is the rate-limiting enzyme in the methyl cycle and genetic variations in this gene have been strongly associated with the risk of neoplastic diseases. The current understanding of the MDS biology, which is based on the hypothesis of the sequential development of multiple subclonal molecular lesions, fits very well with the demonstration of a possible role for RTEL1 and MTHFR gene polymorphisms, both of which are related to a variable risk of genomic instability. Copyright © 2016 ISEH - International

  16. Comparison of genome-wide selection strategies to identify furfural tolerance genes in Escherichia coli.

    PubMed

    Glebes, Tirzah Y; Sandoval, Nicholas R; Gillis, Jacob H; Gill, Ryan T

    2015-01-01

    Engineering both feedstock and product tolerance is important for transitioning towards next-generation biofuels derived from renewable sources. Tolerance to chemical inhibitors typically results in complex phenotypes, for which multiple genetic changes must often be made to confer tolerance. Here, we performed a genome-wide search for furfural-tolerant alleles using the TRackable Multiplex Recombineering (TRMR) method (Warner et al. (2010), Nature Biotechnology), which uses chromosomally integrated mutations directed towards increased or decreased expression of virtually every gene in Escherichia coli. We employed various growth selection strategies to assess the role of selection design towards growth enrichments. We also compared genes with increased fitness from our TRMR selection to those from a previously reported genome-wide identification study of furfural tolerance genes using a plasmid-based genomic library approach (Glebes et al. (2014) PLOS ONE). In several cases, growth improvements were observed for the chromosomally integrated promoter/RBS mutations but not for the plasmid-based overexpression constructs. Through this assessment, four novel tolerance genes, ahpC, yhjH, rna, and dicA, were identified and confirmed for their effect on improving growth in the presence of furfural. © 2014 Wiley Periodicals, Inc.

  17. Genome wide approaches to identify protein-DNA interactions.

    PubMed

    Ma, Tao; Ye, Zhenqing; Wang, Liguo

    2018-05-29

    Transcription factors are DNA-binding proteins that play key roles in many fundamental biological processes. Unraveling their interactions with DNA is essential to identify their target genes and understand the regulatory network. Genome-wide identification of their binding sites became feasible thanks to recent progress in experimental and computational approaches. ChIP-chip, ChIP-seq, and ChIP-exo are three widely used techniques to demarcate genome-wide transcription factor binding sites. This review aims to provide an overview of these three techniques including their experimental procedures, computational approaches, and popular analytic tools. ChIP-chip, ChIP-seq, and ChIP-exo have been the major techniques to study genome-wide in vivo protein-DNA interaction. Due to the rapid development of next-generation sequencing technology, array-based ChIP-chip is deprecated and ChIP-seq has become the most widely used technique to identify transcription factor binding sites genome-wide. The newly developed ChIP-exo further improves the spatial resolution to single nucleotide. Numerous tools have been developed to analyze ChIP-chip, ChIP-seq and ChIP-exo data. However, different programs may employ different mechanisms or underlying algorithms, so each inherently carries its own set of statistical assumptions and biases. Choosing the most appropriate analytic program for a given experiment therefore needs careful consideration. Moreover, most programs only have a command line interface, so their installation and usage require basic computational expertise in Unix/Linux. Copyright© Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.

  18. Genome-wide association studies of autoimmune vitiligo identify 23 new risk loci and highlight key pathways and regulatory variants.

    PubMed

    Jin, Ying; Andersen, Genevieve; Yorgov, Daniel; Ferrara, Tracey M; Ben, Songtao; Brownson, Kelly M; Holland, Paulene J; Birlea, Stanca A; Siebert, Janet; Hartmann, Anke; Lienert, Anne; van Geel, Nanja; Lambert, Jo; Luiten, Rosalie M; Wolkerstorfer, Albert; Wietze van der Veen, J P; Bennett, Dorothy C; Taïeb, Alain; Ezzedine, Khaled; Kemp, E Helen; Gawkrodger, David J; Weetman, Anthony P; Kõks, Sulev; Prans, Ele; Kingo, Külli; Karelson, Maire; Wallace, Margaret R; McCormack, Wayne T; Overbeck, Andreas; Moretti, Silvia; Colucci, Roberta; Picardo, Mauro; Silverberg, Nanette B; Olsson, Mats; Valle, Yan; Korobko, Igor; Böhm, Markus; Lim, Henry W; Hamzavi, Iltefat; Zhou, Li; Mi, Qing-Sheng; Fain, Pamela R; Santorico, Stephanie A; Spritz, Richard A

    2016-11-01

    Vitiligo is an autoimmune disease in which depigmented skin results from the destruction of melanocytes, with epidemiological association with other autoimmune diseases. In previous linkage and genome-wide association studies (GWAS1 and GWAS2), we identified 27 vitiligo susceptibility loci in patients of European ancestry. We carried out a third GWAS (GWAS3) in European-ancestry subjects, with augmented GWAS1 and GWAS2 controls, genome-wide imputation, and meta-analysis of all three GWAS, followed by an independent replication. The combined analyses, with 4,680 cases and 39,586 controls, identified 23 new significantly associated loci and 7 suggestive loci. Most encode immune and apoptotic regulators, with some also associated with other autoimmune diseases, as well as several melanocyte regulators. Bioinformatic analyses indicate a predominance of causal regulatory variation, some of which corresponds to expression quantitative trait loci (eQTLs) at these loci. Together, the identified genes provide a framework for the genetic architecture and pathobiology of vitiligo, highlight relationships with other autoimmune diseases and melanoma, and offer potential targets for treatment.

  19. Genome-Wide Association Mapping Uncovers Fw1, a Dominant Gene Conferring Resistance to Fusarium Wilt in Strawberry.

    PubMed

    Pincot, Dominique D A; Poorten, Thomas J; Hardigan, Michael A; Harshman, Julia M; Acharya, Charlotte B; Cole, Glenn S; Gordon, Thomas R; Stueven, Michelle; Edger, Patrick P; Knapp, Steven J

    2018-05-04

    Fusarium wilt, a soil-borne disease caused by the fungal pathogen Fusarium oxysporum f. sp. fragariae, threatens strawberry (Fragaria × ananassa) production worldwide. The spread of the pathogen, coupled with disruptive changes in soil fumigation practices, has greatly increased disease pressure and the importance of developing resistant cultivars. While resistant and susceptible cultivars have been reported, a limited number of germplasm accessions have been analyzed, and contradictory conclusions have been reached in earlier studies to elucidate the underlying genetic basis of resistance. Here, we report the discovery of Fw1, a dominant gene conferring resistance to Fusarium wilt in strawberry. The Fw1 locus was uncovered in a genome-wide association study of 565 historically and commercially important strawberry accessions genotyped with 14,408 SNP markers. Fourteen SNPs in linkage disequilibrium with Fw1 physically mapped to a 2.3 Mb segment on chromosome 2 in a diploid F. vesca reference genome. Fw1 and 11 tightly linked GWAS-significant SNPs mapped to linkage group 2C in octoploid segregating populations. The most significant SNP explained 85% of the phenotypic variability and predicted resistance in 97% of the accessions tested; broad-sense heritability was 0.96. Several disease resistance and defense-related gene homologs, including a small cluster of genes encoding nucleotide-binding leucine-rich-repeat proteins, were identified in the 0.7 Mb genomic segment predicted to harbor Fw1. DNA variants and candidate genes identified in the present study should facilitate the development of high-throughput genotyping assays for accurately predicting Fusarium wilt phenotypes and applying marker-assisted selection. Copyright © 2018 Pincot et al.

  20. Human Xq28 inversion polymorphism: From sex linkage to Genomics--A genetic mother lode.

    PubMed

    Kirby, Cait S; Kolber, Natalie; Salih Almohaidi, Asmaa M; Bierwert, Lou Ann; Saunders, Lori; Williams, Steven; Merritt, Robert

    2016-01-01

    An inversion polymorphism of the filamin and emerin genes at the tip of the long arm of the human X-chromosome serves as the basis of an investigative laboratory in which students learn something new about their own genomes. Long, nearly identical inverted repeats flanking the filamin and emerin genes illustrate how repetitive elements can lead to alterations in genome structure (inversions) through nonallelic homologous recombination. The near identity of the inverted repeats is an example of concerted evolution through gene conversion. While the laboratory in its entirety is designed for college level genetics courses, portions of the laboratory are appropriate for courses at other levels. Because the polymorphism is on the X-chromosome, the laboratory can be used in introductory biology courses to enhance understanding of sex-linkage and to test for Hardy-Weinberg equilibrium in females. More advanced topics, such as chromosome interference, the molecular model for recombination, and inversion heterozygosity suppression of recombination can be explored in upper-level genetics and evolution courses. DNA isolation, restriction digests, ligation, long PCR, and iPCR provide experience with techniques in molecular biology. This investigative laboratory weaves together topics stretching from molecular genetics to cytogenetics and sex-linkage, population genetics and evolutionary genetics. © 2016 The International Union of Biochemistry and Molecular Biology.
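
    The Hardy-Weinberg test in females mentioned in this abstract is usually taught as a chi-square goodness-of-fit test with one degree of freedom. The Python sketch below shows that textbook calculation under stated assumptions: the function name and genotype counts are hypothetical, and the abstract does not say which test statistic the laboratory exercise actually uses.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test of Hardy-Weinberg proportions for a biallelic locus.

    n_aa, n_ab, n_bb: observed genotype counts (e.g. in females for an
    X-linked polymorphism). Returns (chi2, p_value) with 1 degree of freedom.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes supplied")
    # Allele frequency of A estimated from the genotype counts.
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    if min(expected) == 0:
        raise ValueError("test is undefined when one allele is absent")
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical class data: counts of the three female genotypes.
chi2_fit, p_fit = hwe_chi_square(25, 50, 25)      # matches HW proportions exactly
chi2_dev, p_dev = hwe_chi_square(40, 20, 40)      # strong heterozygote deficit
```

    Males carry a single X chromosome and so have no heterozygote class, which is why the test is restricted to females for this locus.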

  1. Africa: continent of genome contrasts with implications for biomedical research and health.

    PubMed

    Ramsay, Michèle

    2012-08-31

    The genomic architecture of African populations is poorly understood and there is considerable variation between ethno-linguistic groups. Genome-wide approaches have been extensively applied to search for genetic associations to complex traits in Europeans, but rarely in Africans. This is largely attributed to lower levels of funding, poor infrastructure and public health systems, and to the small pool of trained scientists. High levels of genetic variation and underlying population structure in Africans present significant challenges, but lower levels of linkage disequilibrium provide an opportunity for more effective localisation of causal variants. High throughput technologies, including dense genotyping arrays, genome sequencing and epigenome studies, together with plummeting costs, are making research more affordable, even for African scientists. Understanding the interactions between genome structure and environmental influences is essential to interpreting their contributions to the increase in infectious diseases and non-communicable diseases, exacerbated by adverse environments and lifestyle choices. The unique genome dynamics in African populations have an important role to play in understanding human health and susceptibility to disease. Copyright © 2012. Published by Elsevier B.V.

  2. Using the gene ontology to scan multilevel gene sets for associations in genome wide association studies.

    PubMed

    Schaid, Daniel J; Sinnwell, Jason P; Jenkins, Gregory D; McDonnell, Shannon K; Ingle, James N; Kubo, Michiaki; Goss, Paul E; Costantino, Joseph P; Wickerham, D Lawrence; Weinshilboum, Richard M

    2012-01-01

    Gene-set analyses have been widely used in gene expression studies, and some of the developed methods have been extended to genome wide association studies (GWAS). Yet, complications due to linkage disequilibrium (LD) among single nucleotide polymorphisms (SNPs), and variable numbers of SNPs per gene and genes per gene-set, have plagued current approaches, often leading to ad hoc "fixes." To overcome some of the current limitations, we developed a general approach to scan GWAS SNP data for both gene-level and gene-set analyses, building on score statistics for generalized linear models, and taking advantage of the directed acyclic graph structure of the gene ontology when creating gene-sets. However, other types of gene-set structures can be used, such as the popular Kyoto Encyclopedia of Genes and Genomes (KEGG). Our approach combines SNPs into genes, and genes into gene-sets, but assures that positive and negative effects of genes on a trait do not cancel. To control for multiple testing of many gene-sets, we use an efficient computational strategy that accounts for LD and provides accurate step-down adjusted P-values for each gene-set. Application of our methods to two different GWAS provides guidance on the potential strengths and weaknesses of our proposed gene-set analyses. © 2011 Wiley Periodicals, Inc.

  3. Inferring Population Size History from Large Samples of Genome-Wide Molecular Data - An Approximate Bayesian Computation Approach

    PubMed Central

    Boitard, Simon; Rodríguez, Willy; Jay, Flora; Mona, Stefano; Austerlitz, Frédéric

    2016-01-01

    Inferring the ancestral dynamics of effective population size is a long-standing question in population genetics, which can now be tackled much more accurately thanks to the massive genomic data available in many species. Several promising methods that take advantage of whole-genome sequences have been recently developed in this context. However, they can only be applied to rather small samples, which limits their ability to estimate recent population size history. Besides, they can be very sensitive to sequencing or phasing errors. Here we introduce a new approximate Bayesian computation approach named PopSizeABC that allows estimating the evolution of the effective population size through time, using a large sample of complete genomes. This sample is summarized using the folded allele frequency spectrum and the average zygotic linkage disequilibrium at different bins of physical distance, two classes of statistics that are widely used in population genetics and can be easily computed from unphased and unpolarized SNP data. Our approach provides accurate estimations of past population sizes, from the very first generations before present back to the expected time to the most recent common ancestor of the sample, as shown by simulations under a wide range of demographic scenarios. When applied to samples of 15 or 25 complete genomes in four cattle breeds (Angus, Fleckvieh, Holstein and Jersey), PopSizeABC revealed a series of population declines, related to historical events such as domestication or modern breed creation. We further highlight that our approach is robust to sequencing errors, provided summary statistics are computed from SNPs with common alleles. PMID:26943927
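
    The abstract above names the folded allele frequency spectrum as one of the two summary statistics computed from unphased, unpolarized SNP data. The sketch below shows how such a spectrum can be tallied, assuming diploid genotypes coded as counts (0, 1 or 2) of an arbitrarily chosen allele; the function name and toy data are hypothetical, and this is not the PopSizeABC implementation.

```python
def folded_afs(genotypes):
    """Folded allele frequency spectrum from unphased diploid genotypes.

    genotypes: list of SNPs, each a list with one allele count (0, 1 or 2)
    per individual. Which allele is counted does not matter, because the
    spectrum is folded onto the minor allele.
    Returns a list where entry k is the number of SNPs whose minor allele
    is carried by k of the 2n sampled chromosomes (k = 0 .. n).
    """
    n_ind = len(genotypes[0])
    n_chrom = 2 * n_ind
    spectrum = [0] * (n_ind + 1)
    for snp in genotypes:
        count = sum(snp)
        minor = min(count, n_chrom - count)  # fold: no ancestral state needed
        spectrum[minor] += 1
    return spectrum

# Toy data: 4 SNPs typed in 3 individuals (hypothetical values).
toy = [[0, 1, 0], [2, 2, 1], [1, 1, 1], [0, 0, 0]]
print(folded_afs(toy))
```

    Folding removes the need to know which allele is ancestral, and summing genotype counts removes the need for phase, which is why this statistic suits unphased and unpolarized data.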

  4. Significance of genome-wide association studies in molecular anthropology.

    PubMed

    Gupta, Vipin; Khadgawat, Rajesh; Sachdeva, Mohinder Pal

    2009-12-01

    The successful advent of a genome-wide approach in association studies raises the hopes of human geneticists for solving a genetic maze of complex traits especially the disorders. This approach, which is replete with the application of cutting-edge technology and supported by big science projects (like Human Genome Project; and even more importantly the International HapMap Project) and various important databases (SNP database, CNV database, etc.), has had unprecedented success in rapidly uncovering many of the genetic determinants of complex disorders. The magnitude of this approach in the genetics of classical anthropological variables like height, skin color, eye color, and other genome diversity projects has certainly expanded the horizons of molecular anthropology. Therefore, in this article we have proposed a genome-wide association approach in molecular anthropological studies by providing lessons from the exemplary study of the Wellcome Trust Case Control Consortium. We have also highlighted the importance and uniqueness of Indian population groups in facilitating the design and finding optimum solutions for other genome-wide association-related challenges.

  5. Linkage analysis of systolic blood pressure: a score statistic and computer implementation

    PubMed Central

    Wang, Kai; Peng, Yingwei

    2003-01-01

    A genome-wide linkage analysis was conducted on systolic blood pressure using a score statistic. The randomly selected Replicate 34 of the simulated data was used. The score statistic was applied to the sibships derived from the general pedigrees. An add-on R program to GENEHUNTER was developed for this analysis and is freely available. PMID:14975145

  6. A genome-wide association study of limb bone length using a Large White × Minzhu intercross population.

    PubMed

    Zhang, Long-Chao; Li, Na; Liu, Xin; Liang, Jing; Yan, Hua; Zhao, Ke-Bin; Pu, Lei; Shi, Hui-Bi; Zhang, Yue-Bo; Wang, Li-Gang; Wang, Li-Xian

    2014-11-04

    In pig, limb bone length influences ham yield and body height to a great extent and has important economic implications for the pig industry. In this study, an intercross population was constructed between the indigenous Chinese Minzhu pig breed and the western commercial Large White pig breed to examine the genetic basis for variation in limb bone length. The aim of this study was to detect potential genetic variants associated with porcine limb bone length. A total of 571 F2 individuals from a Large White and Minzhu intercross population were genotyped using the Illumina PorcineSNP60K Beadchip, and phenotyped for femur length (FL), humerus length (HL), hipbone length (HIPL), scapula length (SL), tibia length (TL), and ulna length (UL). A genome-wide association study was performed by applying the previously reported approach of genome-wide rapid association using mixed model and regression. Statistical significance of the associations was based on Bonferroni-corrected P-values. A total of 39 significant SNPs were mapped to an 11.93 Mb long region on pig chromosome 7 (SSC7). Linkage analysis of these significant SNPs revealed three haplotype blocks of 495 kb, 376 kb and 492 kb, respectively, in the 11.93 Mb region. Annotation based on the pig reference genome identified 15 genes that were located near or contained the significant SNPs in these linkage disequilibrium intervals. Conditioned analysis revealed that four SNPs, one on SSC2 and three on SSC4, showed significant associations with SL and HL, respectively. Analysis of the 15 annotated genes that were identified in these three haplotype blocks indicated that HMGA1 and PPARD, which are expressed in limbs and influence chondrocyte cell growth and differentiation, could be considered as relevant biological candidates for limb bone length in pig, with potential applications in breeding programs. Our results may also be useful for the study of the mechanisms that underlie human limb length and body height.

  7. Genome-wide screening and identification of antigens for rickettsial vaccine development

    USDA-ARS's Scientific Manuscript database

    The capacity to identify immunogens for vaccine development by genome-wide screening has been markedly enhanced by the availability of complete microbial genome sequences coupled to rapid proteomic and bioinformatic analysis. Critical to this genome-wide screening is in vivo testing in the context o...

  8. CisMiner: Genome-Wide In-Silico Cis-Regulatory Module Prediction by Fuzzy Itemset Mining

    PubMed Central

    Navarro, Carmen; Lopez, Francisco J.; Cano, Carlos; Garcia-Alcalde, Fernando; Blanco, Armando

    2014-01-01

    Eukaryotic gene control regions are known to be spread throughout non-coding DNA sequences which may appear distant from the gene promoter. Transcription factors are proteins that coordinately bind to these regions at transcription factor binding sites to regulate gene expression. Several tools allow the detection of significant co-occurrences of closely located binding sites (cis-regulatory modules, CRMs). However, these tools present at least one of the following limitations: 1) scope limited to promoter or conserved regions of the genome; 2) inability to identify combinations involving more than two motifs; 3) a requirement for prior information about target motifs. In this work we present CisMiner, a novel methodology to detect putative CRMs by means of a fuzzy itemset mining approach able to operate at genome-wide scale. CisMiner allows a blind search of CRMs without any prior information about target CRMs or limitation in the number of motifs. CisMiner tackles the combinatorial complexity of genome-wide cis-regulatory module extraction using a natural representation of motif combinations as itemsets and applying the Top-Down Fuzzy Frequent-Pattern Tree algorithm to identify significant itemsets. Fuzzy technology allows CisMiner to better handle the imprecision and noise inherent to regulatory processes. Results obtained for a set of well-known binding sites in the S. cerevisiae genome show that our method yields highly reliable predictions. Furthermore, CisMiner was also applied to putative in-silico predicted transcription factor binding sites to identify significant combinations in S. cerevisiae and D. melanogaster, proving that our approach can be further applied genome-wide to more complex genomes. CisMiner is freely accessible at: http://genome2.ugr.es/cisminer. CisMiner can be queried for the results presented in this work and can also perform a customized cis-regulatory module prediction on a query set of transcription factor binding sites provided by

  9. Genomic linkage of male song and female acoustic preference QTL underlying a rapid species radiation

    PubMed Central

    Shaw, Kerry L.; Lesnick, Sky C.

    2009-01-01

    The genetic coupling hypothesis of signal-preference evolution, whereby the same genes control male signal and female preference for that signal, was first inspired by the evolution of cricket acoustic communication nearly 50 years ago. To examine this hypothesis, we compared the genomic location of quantitative trait loci (QTL) underlying male song and female acoustic preference variation in the Hawaiian cricket genus Laupala. We document a QTL underlying female acoustic preference variation between 2 closely related species (Laupala kohalensis and Laupala paranigra). This preference QTL colocalizes with a song QTL identified previously, providing compelling evidence for a genomic linkage of the genes underlying these traits. We show that both song and preference QTL make small to moderate contributions to the behavioral difference between species, suggesting that divergence in mating behavior among Laupala species is due to the fixation of many genes of minor effect. The diversity of acoustic signaling systems in crickets exemplifies the evolution of elaborate male displays by sexual selection through female choice. Our data reveal genetic conditions that would enable functional coordination between song and acoustic preference divergence during speciation, resulting in a behaviorally coupled mode of signal-preference evolution. Interestingly, Laupala exhibits one of the fastest rates of speciation in animals, concomitant with equally rapid evolution in sexual signaling behaviors. Genomic linkage may facilitate rapid speciation by contributing to genetic correlations between sexual signaling behaviors that eventually cause sexual isolation between diverging populations. PMID:19487670

  10. A genome-wide association study in soybean

    USDA-ARS's Scientific Manuscript database

    A genome-wide association study (GWAS) was performed to estimate the feasibility of identifying genes controlling the quantitative traits, seed protein and oil concentration, in 298 soybean germplasm accessions exhibiting a wide range of seed protein and oil content. A total of 55,159 single nucleo...

  11. Genome-wide analysis of the DNA-binding with one zinc finger (Dof) transcription factor family in bananas.

    PubMed

    Dong, Chen; Hu, Huigang; Xie, Jianghui

    2016-12-01

    DNA-binding with one finger (Dof) domain proteins are a multigene family of plant-specific transcription factors involved in numerous aspects of plant growth and development. In this study, we report a genome-wide search for Musa acuminata Dof (MaDof) genes and their expression profiles at different developmental stages and in response to various abiotic stresses. In addition, a complete overview of the Dof gene family in bananas is presented, including the gene structures, chromosomal locations, cis-regulatory elements, conserved protein domains, and phylogenetic inferences. Based on the genome-wide analysis, we identified 74 full-length protein-coding MaDof genes unevenly distributed on 11 chromosomes. Phylogenetic analysis with Dof members from diverse plant species showed that MaDof genes can be classified into four subgroups (StDof I, II, III, and IV). The detailed genomic information of the MaDof gene homologs in the present study provides opportunities for functional analyses to unravel the exact role of the genes in plant growth and development.

  12. Genome wide association mapping for grain shape traits in indica rice.

    PubMed

    Feng, Yue; Lu, Qing; Zhai, Rongrong; Zhang, Mengchen; Xu, Qun; Yang, Yaolong; Wang, Shan; Yuan, Xiaoping; Yu, Hanyong; Wang, Yiping; Wei, Xinghua

    2016-10-01

    Using genome-wide association mapping, 47 SNPs within 27 significant loci were identified for four grain shape traits, and 424 candidate genes were predicted from a public database. Grain shape is a key determinant of grain yield and quality in rice (Oryza sativa L.). However, our knowledge of genes controlling rice grain shape remains limited. Genome-wide association mapping based on linkage disequilibrium (LD) has recently emerged as an effective approach for identifying genes or quantitative trait loci (QTL) underlying complex traits in plants. In this study, association mapping based on 5291 single nucleotide polymorphisms (SNPs) was conducted to identify significant loci associated with grain shape traits in a global collection of 469 diverse rice accessions. A total of 47 SNPs were located in 27 significant loci for four grain traits, and explained ~44.93-65.90% of the phenotypic variation for each trait. In total, 424 candidate genes within a 200 kb extension region (±100 kb of each locus) of these loci were predicted. Of them, the cloned genes GS3 and qSW5 showed very strong effects on grain length and grain width in our study. Compared with previously reported QTLs for grain shape traits, we found 11 novel loci, including 3, 3, 2 and 3 loci for grain length, grain width, grain length-width ratio and thousand grain weight, respectively. Validation of these new loci will be performed in future studies. These results revealed that besides GS3 and qSW5, multiple novel loci and mechanisms were involved in determining rice grain shape. These findings provide valuable information for understanding the genetic control of grain shape and for marker-assisted selection (MAS) breeding in rice.

  13. Clustering patterns of LOD scores for asthma-related phenotypes revealed by a genome-wide screen in 295 French EGEA families.

    PubMed

    Bouzigon, Emmanuelle; Dizier, Marie-Hélène; Krähenbühl, Christine; Lemainque, Arnaud; Annesi-Maesano, Isabella; Betard, Christine; Bousquet, Jean; Charpin, Denis; Gormand, Frédéric; Guilloud-Bataille, Michel; Just, Jocelyne; Le Moual, Nicole; Maccario, Jean; Matran, Régis; Neukirch, Françoise; Oryszczyn, Marie-Pierre; Paty, Evelyne; Pin, Isabelle; Rosenberg-Bourgin, Myriam; Vervloet, Daniel; Kauffmann, Francine; Lathrop, Mark; Demenais, Florence

    2004-12-15

    A genome-wide scan for asthma phenotypes was conducted in the whole sample of 295 EGEA families selected through at least one asthmatic subject. In addition to asthma, seven phenotypes involved in the main asthma physiopathological pathways were considered: SPT (positive skin prick test response to at least one of 11 allergens), SPTQ score being the number of positive skin test responses to 11 allergens, Phadiatop (positive specific IgE response to a mixture of allergens), total IgE levels, eosinophils, bronchial responsiveness (BR) to methacholine challenge and %predicted FEV(1). Four regions showed evidence for linkage (P... linkage signals (0.001... genome-wide LOD scores. This analysis revealed clustering of LODs for asthma, SPT and Phadiatop on one axis and clustering of LODs for %FEV(1), BR and SPTQ on the other, while LODs for IgE and eosinophils appeared to be independent from all other LODs. These results provide new insights into the potential sharing of genetic determinants by asthma-related phenotypes.

  14. Genome-wide divergence and linkage disequilibrium analyses for Capsicum baccatum revealed by genome-anchored single nucleotide polymorphisms

    USDA-ARS's Scientific Manuscript database

    Principal component analysis (PCA) with 36,621 polymorphic genome-anchored single nucleotide polymorphisms (SNPs) identified collectively for Capsicum annuum and Capsicum baccatum was used to show the distribution of these 2 important incompatible cultivated pepper species. Estimated mean nucleotide...

  15. Genetic linkage map of the interspecific grape rootstock cross Ramsey (Vitis champinii) x Riparia Gloire (Vitis riparia).

    PubMed

    Lowe, K M; Walker, M A

    2006-05-01

    The first genetic linkage map of grape derived from rootstock parents was constructed using 188 progeny from a cross of Ramsey (Vitis champinii) x Riparia Gloire (V. riparia). Of 354 simple sequence repeat markers tested, 205 were polymorphic for at least one parent, and 57.6% were fully informative. Maps of Ramsey, Riparia Gloire, and the F1 population were created using JoinMap software, following a pseudotestcross strategy. The set of 205 SSRs allowed for the identification of all 19 Vitis linkage groups (2n=38), with a total combined map length of 1,304.7 cM, averaging 6.8 cM between markers. The maternal map consists of 172 markers aligned into 19 linkage groups (1,244.9 cM) while 126 markers on the paternal map cover 18 linkage groups (1,095.5 cM). The expected genome coverage is over 92%. Segregation distortion occurred in the Ramsey, Riparia Gloire, and consensus maps for 10, 13, and 16% of the markers, respectively. These distorted markers clustered primarily on the linkage groups 3, 5, 14 and 17. No genome-wide difference in recombination rate was observed between Ramsey and Riparia Gloire based on 315 common marker intervals. Fifty-four new Vitis-EST-derived SSR markers were mapped, and were distributed evenly across the genome on 16 of the 19 linkage groups. These dense linkage maps of two phenotypically diverse North American Vitis species are valuable tools for studying the genetics of many rootstock traits including nematode resistance, lime and salt tolerance, and ability to induce vigor.

  16. Genome-wide ENU mutagenesis for the discovery of novel male fertility regulators.

    PubMed

    Jamsai, Duangporn; O'Bryan, Moira K

    2010-06-01

    The completion of genome sequencing projects has provided an extensive knowledge of the contents of the genomes of human, mouse, and many other organisms. Despite this, the function of most of the estimated 25,000 human genes remains largely unknown. Attention has now turned to elucidating gene function and identifying biological pathways that contribute to human diseases, including male infertility. Our understanding of the genetic regulation of male fertility has been accelerated through the use of genetically modified mouse models including knockout, knock-in, gene-trapped, and transgenic mice. Such reverse genetic approaches, however, require some fore-knowledge of a gene's function and, as such, bias against the discovery of completely novel genes and biological pathways. To facilitate high throughput gene discovery, genome-wide mouse mutagenesis via the use of a potent chemical mutagen, N-ethyl-N-nitrosourea (ENU), has been developed over the past decade. This forward genetic, or phenotype-driven, approach relies upon observing a phenotype first, then subsequently defining the underlying genetic defect. Mutations are randomly introduced into the mouse genome via ENU exposure. Through a controlled breeding scheme, mutations causing a phenotype of interest (e.g., male infertility) are then identified by linkage analysis and candidate gene sequencing. This approach allows for the possibility of revealing comprehensive phenotype-genotype relationships for a range of genes and pathways, i.e. in addition to null alleles, mice containing partial loss-of-function or gain-of-function mutations can be recovered. Such point mutations are likely to be more reflective of those that occur within the human population. Many research groups have successfully used this approach to generate infertile mouse lines and some novel male fertility genes have been revealed. In this review, we focus on the utility of ENU mutagenesis for the discovery of novel male fertility regulators.

  17. The genome-wide structure of two economically important indigenous Sicilian cattle breeds.

    PubMed

    Mastrangelo, S; Saura, M; Tolone, M; Salces-Ortiz, J; Di Gerlando, R; Bertolini, F; Fontanesi, L; Sardina, M T; Serrano, M; Portolano, B

    2014-11-01

    Genomic technologies, such as high-throughput genotyping based on SNP arrays, provided background information concerning genome structure in domestic animals. The aim of this work was to investigate the genetic structure, the genome-wide estimates of inbreeding, coancestry, effective population size (Ne), and the patterns of linkage disequilibrium (LD) in 2 economically important Sicilian local cattle breeds, Cinisara (CIN) and Modicana (MOD), using the Illumina Bovine SNP50K v2 BeadChip. To understand the genetic relationship and to place both Sicilian breeds in a global context, genotypes from 134 other domesticated bovid breeds were used. Principal component analysis showed that the Sicilian cattle breeds were closer to individuals of Bos taurus taurus from Eurasia and formed nonoverlapping clusters with other breeds. Between the Sicilian cattle breeds, MOD was the most differentiated, whereas the animals belonging to the CIN breed showed a lower value of assignment, the presence of substructure, and genetic links with the MOD breed. The average molecular inbreeding and coancestry coefficients were moderately high, and the current estimates of Ne were low in both breeds. These values indicated a low genetic variability. Considering levels of LD between adjacent markers, the average r(2) in the MOD breed was comparable to those reported for other cattle breeds, whereas CIN showed a lower value. Therefore, these results support the need for denser SNP arrays for high-power association mapping and genomic selection efficiency, particularly for the CIN cattle breed. Controlling molecular inbreeding and coancestry would restrict inbreeding depression, the probability of losing beneficial rare alleles, and therefore the risk of extinction. The results generated from this study have important implications for the development of conservation and/or selection breeding programs in these 2 local cattle breeds.

  18. A meta-analysis of genome-wide association studies of asthma in Puerto Ricans.

    PubMed

    Yan, Qi; Brehm, John; Pino-Yanes, Maria; Forno, Erick; Lin, Jerome; Oh, Sam S; Acosta-Perez, Edna; Laurie, Cathy C; Cloutier, Michelle M; Raby, Benjamin A; Stilp, Adrienne M; Sofer, Tamar; Hu, Donglei; Huntsman, Scott; Eng, Celeste S; Conomos, Matthew P; Rastogi, Deepa; Rice, Kenneth; Canino, Glorisa; Chen, Wei; Barr, R Graham; Burchard, Esteban G; Celedón, Juan C

    2017-05-01

    Puerto Ricans are disproportionately affected with asthma in the USA. In this study, we aim to identify genetic variants that confer susceptibility to asthma in Puerto Ricans. We conducted a meta-analysis of genome-wide association studies (GWAS) of asthma in Puerto Ricans, including participants from: the Genetics of Asthma in Latino Americans (GALA) I-II, the Hartford-Puerto Rico Study and the Hispanic Community Health Study. Moreover, we examined whether susceptibility loci identified in previous meta-analyses of GWAS are associated with asthma in Puerto Ricans. The only locus to achieve genome-wide significance was chromosome 17q21, as evidenced by our top single nucleotide polymorphism (SNP), rs907092 (OR 0.71, p=1.2×10^-12) at IKZF3. Similar to results in non-Puerto Ricans, SNPs in genes in the same linkage disequilibrium block as IKZF3 (e.g. ZPBP2, ORMDL3 and GSDMB) were significantly associated with asthma in Puerto Ricans. With regard to results from a meta-analysis in Europeans, we replicated findings for rs2305480 at GSDMB, but not for SNPs in any other genes. On the other hand, we replicated results from a meta-analysis of North American populations for SNPs at IL1RL1, TSLP and GSDMB but not for IL33. Our findings suggest that common variants on chromosome 17q21 have the greatest effects on asthma in Puerto Ricans. Copyright ©ERS 2017.

  19. Comprehensive multi-stage linkage analyses identify a locus for adult height on chromosome 3p in a healthy Caucasian population.

    PubMed

    Ellis, Justine A; Scurrah, Katrina J; Duncan, Anna E; Lamantia, Angela; Byrnes, Graham B; Harrap, Stephen B

    2007-04-01

    There have been a number of genome-wide linkage studies for adult height in recent years. These studies have yielded few well-replicated loci, and none have been further confirmed by the identification of associated gene variants. The inconsistent results may be attributable to the fact that few studies have combined accurate phenotype measures with informative statistical modelling in healthy populations. We have performed a multi-stage genome-wide linkage analysis for height in 275 adult sibling pairs drawn randomly from the Victorian Family Heart Study (VFHS), a healthy population-based Caucasian cohort. Height was carefully measured in a standardised fashion on regularly calibrated equipment. Following genome-wide identification of a peak Z-score of 3.14 on chromosome 3 at 69 cM, we performed a fine-mapping analysis of this region in an extended sample of 392 two-generation families. We used a number of variance components models that incorporated assortative mating and shared environment effects, and we observed a peak LOD score of approximately 3.5 at 78 cM in four of the five models tested. We also demonstrated that the most prevalent model in the literature gave the worst fit, and the lowest LOD score (2.9) demonstrating the importance of appropriate modelling. The region identified in this study replicates the results of other genome-wide scans of height and bone-related phenotypes, strongly suggesting the presence of a gene important in bone growth on chromosome 3p. Association analyses of relevant candidate genes should identify the genetic variants responsible for the chromosome 3p linkage signal in our population.

  20. Brain function in carriers of a genome-wide supported bipolar disorder variant.

    PubMed

    Erk, Susanne; Meyer-Lindenberg, Andreas; Schnell, Knut; Opitz von Boberfeld, Carola; Esslinger, Christine; Kirsch, Peter; Grimm, Oliver; Arnold, Claudia; Haddad, Leila; Witt, Stephanie H; Cichon, Sven; Nöthen, Markus M; Rietschel, Marcella; Walter, Henrik

    2010-08-01

    The neural abnormalities underlying genetic risk for bipolar disorder, a severe, common, and highly heritable psychiatric condition, are largely unknown. An opportunity to define these mechanisms is provided by the recent discovery, through genome-wide association, of a single-nucleotide polymorphism (rs1006737) strongly associated with bipolar disorder within the CACNA1C gene, encoding the alpha subunit of the L-type voltage-dependent calcium channel Ca(v)1.2. To determine whether the genetic risk associated with rs1006737 is mediated through hippocampal function. Functional magnetic resonance imaging study. University hospital. A total of 110 healthy volunteers of both sexes and of German descent in the Hardy-Weinberg equilibrium for rs1006737. Blood oxygen level-dependent signal during an episodic memory task and behavioral and psychopathological measures. Using an intermediate phenotype approach, we show that healthy carriers of the CACNA1C risk variant exhibit a pronounced reduction of bilateral hippocampal activation during episodic memory recall and diminished functional coupling between left and right hippocampal regions. Furthermore, risk allele carriers exhibit activation deficits of the subgenual anterior cingulate cortex, a region repeatedly associated with affective disorders and the mediation of adaptive stress-related responses. The relevance of these findings for affective disorders is supported by significantly higher psychopathology scores for depression, anxiety, obsessive-compulsive thoughts, interpersonal sensitivity, and neuroticism in risk allele carriers, correlating negatively with the observed regional brain activation. Our data demonstrate that rs1006737 or genetic variants in linkage disequilibrium with it are functional in the human brain and provide a neurogenetic risk mechanism for bipolar disorder backed by genome-wide evidence.

  1. A genome-wide association study of atopic dermatitis identifies loci with overlapping effects on asthma and psoriasis.

    PubMed

    Weidinger, Stephan; Willis-Owen, Saffron A G; Kamatani, Yoichiro; Baurecht, Hansjörg; Morar, Nilesh; Liang, Liming; Edser, Pauline; Street, Teresa; Rodriguez, Elke; O'Regan, Grainne M; Beattie, Paula; Fölster-Holst, Regina; Franke, Andre; Novak, Natalija; Fahy, Caoimhe M; Winge, Mårten C G; Kabesch, Michael; Illig, Thomas; Heath, Simon; Söderhäll, Cilla; Melén, Erik; Pershagen, Göran; Kere, Juha; Bradley, Maria; Lieden, Agne; Nordenskjold, Magnus; Harper, John I; McLean, W H Irwin; Brown, Sara J; Cookson, William O C; Lathrop, G Mark; Irvine, Alan D; Moffatt, Miriam F

    2013-12-01

    Atopic dermatitis (AD) is the most common dermatological disease of childhood. Many children with AD have asthma and AD shares regions of genetic linkage with psoriasis, another chronic inflammatory skin disease. We present here a genome-wide association study (GWAS) of childhood-onset AD in 1563 European cases with known asthma status and 4054 European controls. Using Illumina genotyping followed by imputation, we generated 268 034 consensus genotypes and in excess of 2 million single nucleotide polymorphisms (SNPs) for analysis. Association signals were assessed for replication in a second panel of 2286 European cases and 3160 European controls. Four loci achieved genome-wide significance for AD and replicated consistently across all cohorts. These included the epidermal differentiation complex (EDC) on chromosome 1, the genomic region proximal to LRRC32 on chromosome 11, the RAD50/IL13 locus on chromosome 5 and the major histocompatibility complex (MHC) on chromosome 6; reflecting action of classical HLA alleles. We observed variation in the contribution towards co-morbid asthma for these regions of association. We further explored the genetic relationship between AD, asthma and psoriasis by examining previously identified susceptibility SNPs for these diseases. We found considerable overlap between AD and psoriasis together with variable coincidence between allergic rhinitis (AR) and asthma. Our results indicate that the pathogenesis of AD incorporates immune and epidermal barrier defects with combinations of specific and overlapping effects at individual loci.

  2. A genome-wide association study of atopic dermatitis identifies loci with overlapping effects on asthma and psoriasis

    PubMed Central

    Weidinger, Stephan; Willis-Owen, Saffron A.G.; Kamatani, Yoichiro; Baurecht, Hansjörg; Morar, Nilesh; Liang, Liming; Edser, Pauline; Street, Teresa; Rodriguez, Elke; O'Regan, Grainne M.; Beattie, Paula; Fölster-Holst, Regina; Franke, Andre; Novak, Natalija; Fahy, Caoimhe M.; Winge, Mårten C.G.; Kabesch, Michael; Illig, Thomas; Heath, Simon; Söderhäll, Cilla; Melén, Erik; Pershagen, Göran; Kere, Juha; Bradley, Maria; Lieden, Agne; Nordenskjold, Magnus; Harper, John I.; Mclean, W.H. Irwin; Brown, Sara J.; Cookson, William O.C.; Lathrop, G. Mark; Irvine, Alan D.; Moffatt, Miriam F.

    2013-01-01

    Atopic dermatitis (AD) is the most common dermatological disease of childhood. Many children with AD have asthma and AD shares regions of genetic linkage with psoriasis, another chronic inflammatory skin disease. We present here a genome-wide association study (GWAS) of childhood-onset AD in 1563 European cases with known asthma status and 4054 European controls. Using Illumina genotyping followed by imputation, we generated 268 034 consensus genotypes and in excess of 2 million single nucleotide polymorphisms (SNPs) for analysis. Association signals were assessed for replication in a second panel of 2286 European cases and 3160 European controls. Four loci achieved genome-wide significance for AD and replicated consistently across all cohorts. These included the epidermal differentiation complex (EDC) on chromosome 1, the genomic region proximal to LRRC32 on chromosome 11, the RAD50/IL13 locus on chromosome 5 and the major histocompatibility complex (MHC) on chromosome 6; reflecting action of classical HLA alleles. We observed variation in the contribution towards co-morbid asthma for these regions of association. We further explored the genetic relationship between AD, asthma and psoriasis by examining previously identified susceptibility SNPs for these diseases. We found considerable overlap between AD and psoriasis together with variable coincidence between allergic rhinitis (AR) and asthma. Our results indicate that the pathogenesis of AD incorporates immune and epidermal barrier defects with combinations of specific and overlapping effects at individual loci. PMID:23886662

  3. PLINK: A Tool Set for Whole-Genome Association and Population-Based Linkage Analyses

    PubMed Central

    Purcell, Shaun ; Neale, Benjamin ; Todd-Brown, Kathe ; Thomas, Lori ; Ferreira, Manuel A. R. ; Bender, David ; Maller, Julian ; Sklar, Pamela ; de Bakker, Paul I. W. ; Daly, Mark J. ; Sham, Pak C. 

    2007-01-01

    Whole-genome association studies (WGAS) bring new computational, as well as analytic, challenges to researchers. Many existing genetic-analysis tools are not designed to handle such large data sets in a convenient manner and do not necessarily exploit the new opportunities that whole-genome data bring. To address these issues, we developed PLINK, an open-source C/C++ WGAS tool set. With PLINK, large data sets comprising hundreds of thousands of markers genotyped for thousands of individuals can be rapidly manipulated and analyzed in their entirety. As well as providing tools to make the basic analytic steps computationally efficient, PLINK also supports some novel approaches to whole-genome data that take advantage of whole-genome coverage. We introduce PLINK and describe the five main domains of function: data management, summary statistics, population stratification, association analysis, and identity-by-descent estimation. In particular, we focus on the estimation and use of identity-by-state and identity-by-descent information in the context of population-based whole-genome studies. This information can be used to detect and correct for population stratification and to identify extended chromosomal segments that are shared identical by descent between very distantly related individuals. Analysis of the patterns of segmental sharing has the potential to map disease loci that contain multiple rare variants in a population-based linkage analysis. PMID:17701901
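
    As an illustration of the identity-by-state idea mentioned in this abstract, the following is a minimal sketch and not PLINK's own code. It assumes genotypes are coded as minor-allele counts (0, 1 or 2) with None for a missing call; the function name ibs_proportion is hypothetical. For each marker called in both individuals, the number of alleles shared by state is 2 minus the absolute genotype difference, and the result is the mean shared proportion.

```python
from typing import Optional, Sequence


def ibs_proportion(a: Sequence[Optional[int]], b: Sequence[Optional[int]]) -> float:
    """Mean proportion of alleles shared identical by state (IBS).

    Genotypes are minor-allele counts 0, 1 or 2; None marks a missing
    call (this coding is an assumption of the sketch).
    """
    if len(a) != len(b):
        raise ValueError("genotype vectors must have the same length")
    shared = 0      # alleles shared by state, summed over markers
    compared = 0    # markers called in both individuals
    for g1, g2 in zip(a, b):
        if g1 is None or g2 is None:
            continue  # skip markers with a missing genotype
        shared += 2 - abs(g1 - g2)
        compared += 1
    if compared == 0:
        raise ValueError("no marker is called in both individuals")
    return shared / (2 * compared)
```

    Pairs with an unusually high value across the genome are candidates for relatedness or shared ancestry; estimating identity by descent from such counts additionally requires allele frequencies, which this sketch does not attempt.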

  4. Genome Structure of the Legume, Lotus japonicus

    PubMed Central

    Sato, Shusei; Nakamura, Yasukazu; Kaneko, Takakazu; Asamizu, Erika; Kato, Tomohiko; Nakao, Mitsuteru; Sasamoto, Shigemi; Watanabe, Akiko; Ono, Akiko; Kawashima, Kumiko; Fujishiro, Tsunakazu; Katoh, Midori; Kohara, Mitsuyo; Kishida, Yoshie; Minami, Chiharu; Nakayama, Shinobu; Nakazaki, Naomi; Shimizu, Yoshimi; Shinpo, Sayaka; Takahashi, Chika; Wada, Tsuyuko; Yamada, Manabu; Ohmido, Nobuko; Hayashi, Makoto; Fukui, Kiichi; Baba, Tomoya; Nakamichi, Tomoko; Mori, Hirotada; Tabata, Satoshi

    2008-01-01

    The legume Lotus japonicus has been widely used as a model system to investigate the genetic background of legume-specific phenomena such as symbiotic nitrogen fixation. Here, we report structural features of the L. japonicus genome. The 315.1-Mb sequences determined in this and previous studies correspond to 67% of the genome (472 Mb), and are likely to cover 91.3% of the gene space. Linkage mapping anchored 130-Mb sequences onto the six linkage groups. A total of 10 951 complete and 19 848 partial structures of protein-encoding genes were assigned to the genome. Comparative analysis of these genes revealed the expansion of several functional domains and gene families that are characteristic of L. japonicus. Synteny analysis detected traces of whole-genome duplication and the presence of synteny blocks with other plant genomes to various degrees. This study provides the first opportunity to look into the complex and unique genetic system of legumes. PMID:18511435

  5. Genome-Wide Development and Use of Microsatellite Markers for Large-Scale Genotyping Applications in Foxtail Millet [Setaria italica (L.)]

    PubMed Central

    Pandey, Garima; Misra, Gopal; Kumari, Kajal; Gupta, Sarika; Parida, Swarup Kumar; Chattopadhyay, Debasis; Prasad, Manoj

    2013-01-01

    The availability of well-validated, informative co-dominant microsatellite markers and of a saturated genetic linkage map has been limited in foxtail millet (Setaria italica L.). In view of this, we conducted a genome-wide analysis and identified 28 342 microsatellite repeat-motifs spanning 405.3 Mb of the foxtail millet genome. Trinucleotide repeats (∼48%) were more prevalent than dinucleotide repeats (∼46%). Of the 28 342 microsatellites, 21 294 (∼75%) primer pairs were successfully designed, and a total of 15 573 markers were physically mapped on 9 chromosomes of foxtail millet. About 159 markers were validated successfully in 8 accessions of Setaria sp. with ∼67% polymorphic potential. The high percentage (89.3%) of cross-genera transferability across millet and non-millet species, with higher transferability percentage in bioenergy grasses (∼79%, Switchgrass and ∼93%, Pearl millet), signifies their importance in studying the bioenergy grasses. In silico comparative mapping of 15 573 foxtail millet microsatellite markers against the mapping data of sorghum (16.9%), maize (14.5%) and rice (6.4%) indicated syntenic relationships among the chromosomes of foxtail millet and target species. The results thus demonstrate the immense applicability of the developed microsatellite markers in germplasm characterization, phylogenetics, construction of genetic linkage maps for gene/quantitative trait loci discovery, and comparative mapping in foxtail millet, other millets and bioenergy grass species. PMID:23382459

  6. Genome-wide development and use of microsatellite markers for large-scale genotyping applications in foxtail millet [Setaria italica (L.)].

    PubMed

    Pandey, Garima; Misra, Gopal; Kumari, Kajal; Gupta, Sarika; Parida, Swarup Kumar; Chattopadhyay, Debasis; Prasad, Manoj

    2013-04-01

    The availability of well-validated, informative co-dominant microsatellite markers and of a saturated genetic linkage map has been limited in foxtail millet (Setaria italica L.). In view of this, we conducted a genome-wide analysis and identified 28 342 microsatellite repeat-motifs spanning 405.3 Mb of the foxtail millet genome. Trinucleotide repeats (∼48%) were more prevalent than dinucleotide repeats (∼46%). Of the 28 342 microsatellites, 21 294 (∼75%) primer pairs were successfully designed, and a total of 15 573 markers were physically mapped on 9 chromosomes of foxtail millet. About 159 markers were validated successfully in 8 accessions of Setaria sp. with ∼67% polymorphic potential. The high percentage (89.3%) of cross-genera transferability across millet and non-millet species, with higher transferability percentage in bioenergy grasses (∼79%, Switchgrass and ∼93%, Pearl millet), signifies their importance in studying the bioenergy grasses. In silico comparative mapping of 15 573 foxtail millet microsatellite markers against the mapping data of sorghum (16.9%), maize (14.5%) and rice (6.4%) indicated syntenic relationships among the chromosomes of foxtail millet and target species. The results thus demonstrate the immense applicability of the developed microsatellite markers in germplasm characterization, phylogenetics, construction of genetic linkage maps for gene/quantitative trait loci discovery, and comparative mapping in foxtail millet, other millets and bioenergy grass species.

  7. Genome-Wide Detection and Analysis of Multifunctional Genes

    PubMed Central

    Pritykin, Yuri; Ghersi, Dario; Singh, Mona

    2015-01-01

    Many genes can play a role in multiple biological processes or molecular functions. Identifying multifunctional genes at the genome-wide level and studying their properties can shed light upon the complexity of molecular events that underpin cellular functioning, thereby leading to a better understanding of the functional landscape of the cell. However, to date, genome-wide analysis of multifunctional genes (and the proteins they encode) has been limited. Here we introduce a computational approach that uses known functional annotations to extract genes playing a role in at least two distinct biological processes. We leverage functional genomics data sets for three organisms—H. sapiens, D. melanogaster, and S. cerevisiae—and show that, as compared to other annotated genes, genes involved in multiple biological processes possess distinct physicochemical properties, are more broadly expressed, tend to be more central in protein interaction networks, tend to be more evolutionarily conserved, and are more likely to be essential. We also find that multifunctional genes are significantly more likely to be involved in human disorders. These same features also hold when multifunctionality is defined with respect to molecular functions instead of biological processes. Our analysis uncovers key features about multifunctional genes, and is a step towards a better genome-wide understanding of gene multifunctionality. PMID:26436655

  8. Implications of genome wide association studies for addiction: are our a priori assumptions all wrong?

    PubMed

    Hall, F Scott; Drgonova, Jana; Jain, Siddharth; Uhl, George R

    2013-12-01

    Substantial genetic contributions to addiction vulnerability are supported by data from twin studies, linkage studies, candidate gene association studies and, more recently, Genome Wide Association Studies (GWAS). Parallel to this work, animal studies have attempted to identify the genes that may contribute to responses to addictive drugs and addiction liability, initially focusing upon genes for the targets of the major drugs of abuse. These studies identified genes/proteins that affect responses to drugs of abuse; however, this does not necessarily mean that variation in these genes contributes to the genetic component of addiction liability. One of the major problems with initial linkage and candidate gene studies was an a priori focus on the genes thought to be involved in addiction based upon the known contributions of those proteins to drug actions, making the identification of novel genes unlikely. The GWAS approach is systematic and agnostic to such a priori assumptions. From the numerous GWAS now completed several conclusions may be drawn: (1) addiction is highly polygenic; each allelic variant contributing in a small, additive fashion to addiction vulnerability; (2) unexpected, compared to our a priori assumptions, classes of genes are most important in explaining addiction vulnerability; (3) although substantial genetic heterogeneity exists, there is substantial convergence of GWAS signals on particular genes. This review traces the history of this research; from initial transgenic mouse models based upon candidate gene and linkage studies, through the progression of GWAS for addiction and nicotine cessation, to the current human and transgenic mouse studies post-GWAS. © 2013.

  9. Genome-Wide Association Mapping of Correlated Traits in Cassava: Dry Matter and Total Carotenoid Content.

    PubMed

    Rabbi, Ismail Y; Udoh, Lovina I; Wolfe, Marnin; Parkes, Elizabeth Y; Gedil, Melaku A; Dixon, Alfred; Ramu, Punna; Jannink, Jean-Luc; Kulakow, Peter

    2017-11-01

    Cassava is a starchy root crop cultivated in the tropics for fresh consumption and commercial processing. Primary selection objectives in cassava breeding include dry matter content and micronutrient density, particularly provitamin A carotenoids. These traits are negatively correlated in the African germplasm. This study aimed at identifying genetic markers associated with these traits and uncovering whether linkage and/or pleiotropy were responsible for the observed negative correlation. A genome-wide association mapping using 672 clones genotyped at 72,279 single nucleotide polymorphism (SNP) loci was performed. Root yellowness was used indirectly to assess variation in carotenoid content. Two major loci for root yellowness were identified on chromosome 1 at positions 24.1 and 30.5 Mbp. A single locus for dry matter content that colocated with the 24.1 Mbp peak for carotenoids was identified. Haplotypes at these loci explained 70 and 37% of the phenotypic variability for root yellowness and dry matter content, respectively. Evidence of megabase-scale linkage disequilibrium (LD) around the major loci of the two traits and detection of the major dry matter locus in independent analyses for the white- and yellow-root subpopulations suggest that physical linkage rather than pleiotropy is more likely to be the cause of the negative correlation between the target traits. Moreover, candidate genes for carotenoid () and starch biosynthesis ( and ) occurred in the vicinity of the identified locus at 24.1 Mbp. These findings elucidate the genetic architecture of carotenoids and dry matter in cassava and provide an opportunity to accelerate breeding of these traits. Copyright © 2017 Crop Science Society of America.

  10. A Genome-Wide Search for Greek and Jewish Admixture in the Kashmiri Population

    PubMed Central

    Tashi, Tsewang; Lorenzo, Felipe Ramos; Feusier, Julie Ellen; Mir, Hyder

    2016-01-01

    The Kashmiri population is an ethno-linguistic group that resides in the Kashmir Valley in northern India. A longstanding hypothesis is that this population derives ancestry from Jewish and/or Greek sources. There is historical and archaeological evidence of ancient Greek presence in India and Kashmir. Further, some historical accounts suggest ancient Hebrew ancestry as well. To date, it has not been determined whether signatures of Greek or Jewish admixture can be detected in the Kashmiri population. Using genome-wide genotyping and admixture detection methods, we determined there are no significant or substantial signs of Greek or Jewish admixture in modern-day Kashmiris. The ancestry of Kashmiri Tibetans was also determined, which showed signs of admixture with populations from northern India and west Eurasia. These results contribute to our understanding of the existing population structure in northern India and its surrounding geographical areas. PMID:27490348

  11. Statistical correction of the Winner’s Curse explains replication variability in quantitative trait genome-wide association studies

    PubMed Central

    Pe’er, Itsik

    2017-01-01

    Genome-wide association studies (GWAS) have identified hundreds of SNPs responsible for variation in human quantitative traits. However, genome-wide-significant associations often fail to replicate across independent cohorts, in apparent inconsistency with their strong effects in discovery cohorts. This limited success of replication raises pervasive questions about the utility of the GWAS field. We identify all 332 studies of quantitative traits from the NHGRI-EBI GWAS Database with attempted replication. We find that the majority of studies provide insufficient data to evaluate replication rates. The remaining papers replicate significantly worse than expected (p < 10^-14), even when adjusting for the regression to the mean of effect size between discovery and replication cohorts, termed the Winner’s Curse (p < 10^-16). We show this is due in part to misreporting replication cohort size as a maximum number, rather than a per-locus one. In 39 studies accurately reporting per-locus cohort size for attempted replication of 707 loci in samples with similar ancestry, replication rate matched expectation (predicted 458, observed 457, p = 0.94). In contrast, ancestry differences between replication and discovery (13 studies, 385 loci) cause the most highly-powered decile of loci to replicate worse than expected, due to differences in linkage disequilibrium. PMID:28715421
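
    The comparison of predicted and observed replication counts in this abstract can be illustrated with a minimal sketch, which is not the authors' method. It assumes each locus has an effect estimate and a replication standard error, takes the predicted count as the sum of per-locus power of a two-sided test, and applies no Winner's Curse correction itself (effect sizes would need to be shrunk beforehand). The function names and the default alpha are hypothetical.

```python
from statistics import NormalDist
from typing import Iterable, Tuple

_STD_NORMAL = NormalDist()  # standard normal distribution


def replication_power(beta: float, se: float, alpha: float = 0.05) -> float:
    """Power of a two-sided z-test to detect effect `beta` given the
    replication standard error `se` (normal approximation)."""
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    ncp = beta / se  # expected z-score in the replication cohort
    return _STD_NORMAL.cdf(ncp - z_crit) + _STD_NORMAL.cdf(-ncp - z_crit)


def expected_replications(loci: Iterable[Tuple[float, float]],
                          alpha: float = 0.05) -> float:
    """Predicted number of replicating loci: the sum of per-locus power.

    `loci` yields (beta, replication_se) pairs; both are assumed inputs.
    """
    return sum(replication_power(beta, se, alpha) for beta, se in loci)
```

    Because the replication standard error depends on the per-locus sample size, overstating that size (for example by reporting the maximum cohort size for every locus) inflates the predicted count, which is the misreporting effect the abstract describes.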

  12. Genome-wide analysis of adolescent psychotic-like experiences shows genetic overlap with psychiatric disorders.

    PubMed

    Pain, Oliver; Dudbridge, Frank; Cardno, Alastair G; Freeman, Daniel; Lu, Yi; Lundstrom, Sebastian; Lichtenstein, Paul; Ronald, Angelica

    2018-03-31

    This study aimed to test for overlap in genetic influences between psychotic-like experience traits shown by adolescents in the community, and clinically-recognized psychiatric disorders in adulthood, specifically schizophrenia, bipolar disorder, and major depression. The full spectra of psychotic-like experience domains, both in terms of their severity and type (positive, cognitive, and negative), were assessed using self- and parent-ratings in three European community samples aged 15-19 years (Final N incl. siblings = 6,297-10,098). A mega-genome-wide association study (mega-GWAS) for each psychotic-like experience domain was performed. Single nucleotide polymorphism (SNP)-heritability of each psychotic-like experience domain was estimated using genomic-relatedness-based restricted maximum-likelihood (GREML) and linkage disequilibrium- (LD-) score regression. Genetic overlap between specific psychotic-like experience domains and schizophrenia, bipolar disorder, and major depression was assessed using polygenic risk score (PRS) and LD-score regression. GREML returned SNP-heritability estimates of 3-9% for psychotic-like experience trait domains, with higher estimates for less skewed traits (Anhedonia, Cognitive Disorganization) than for more skewed traits (Paranoia and Hallucinations, Parent-rated Negative Symptoms). Mega-GWAS analysis identified one genome-wide significant association for Anhedonia within IDO2 but which did not replicate in an independent sample. PRS analysis revealed that the schizophrenia PRS significantly predicted all adolescent psychotic-like experience trait domains (Paranoia and Hallucinations only in non-zero scorers). The major depression PRS significantly predicted Anhedonia and Parent-rated Negative Symptoms in adolescence. Psychotic-like experiences during adolescence in the community show additive genetic effects and partly share genetic influences with clinically-recognized psychiatric disorders, specifically schizophrenia and

  13. A genome-wide scan for signatures of differential artificial selection in ten cattle breeds.

    PubMed

    Rothammer, Sophie; Seichter, Doris; Förster, Martin; Medugorac, Ivica

    2013-12-21

    Since the times of domestication, cattle have been continually shaped by the influence of humans. Relatively recent history, including breed formation and the still enduring enormous improvement of economically important traits, is expected to have left distinctive footprints of selection within the genome. The purpose of this study was to map genome-wide selection signatures in ten cattle breeds and thus improve the understanding of the genome response to strong artificial selection and support the identification of the underlying genetic variants of favoured phenotypes. We analysed 47,651 single nucleotide polymorphisms (SNP) using Cross Population Extended Haplotype Homozygosity (XP-EHH). We set the significance thresholds using the maximum XP-EHH values of two essentially artificially unselected breeds and found up to 229 selection signatures per breed. Through a confirmation process we verified selection for three distinct phenotypes typical for one breed (polledness in Galloway, double muscling in Blanc-Bleu Belge and red coat colour in Red Holstein cattle). Moreover, we detected six genes strongly associated with known QTL for beef or dairy traits (TG, ABCG2, DGAT1, GH1, GHR and the Casein Cluster) within selection signatures of at least one breed. A literature search for genes lying in outstanding signatures revealed further promising candidate genes. However, in concordance with previous genome-wide studies, we also detected a substantial number of signatures without any yet known gene content. These results show the power of XP-EHH analyses in cattle to discover promising candidate genes and raise the hope of identifying phenotypically important variants in the near future. The finding of plausible functional candidates in some short signatures supports this hope. For instance, MAP2K6 is the only annotated gene of two signatures detected in Galloway and Gelbvieh cattle and is already known to be associated with carcass weight, back fat thickness and

  14. Genome-Wide Association Study (GWAS) and Genome-Wide Environment Interaction Study (GWEIS) of Depressive Symptoms in African American and Hispanic/Latina Women

    PubMed Central

    Dunn, Erin C.; Wiste, Anna; Radmanesh, Farid; Almli, Lynn M.; Gogarten, Stephanie M.; Sofer, Tamar; Faul, Jessica D.; Kardia, Sharon L.R.; Smith, Jennifer A.; Weir, David R.; Zhao, Wei; Soare, Thomas W.; Mirza, Saira S.; Hek, Karin; Tiemeier, Henning W.; Goveas, Joseph S.; Sarto, Gloria E.; Snively, Beverly M.; Cornelis, Marilyn; Koenen, Karestan C.; Kraft, Peter; Purcell, Shaun; Ressler, Kerry J.; Rosand, Jonathan; Wassertheil-Smoller, Sylvia; Smoller, Jordan W.

    2016-01-01

    Background: Genome-wide association studies (GWAS) have been unable to identify variants linked to depression. We hypothesized that examining depressive symptoms and considering gene-environment interaction (G×E) might improve efficiency for gene discovery. We therefore conducted a GWAS and genome-wide environment interaction study (GWEIS) of depressive symptoms. Methods: Using data from the SHARe cohort of the Women’s Health Initiative, comprising African Americans (n=7179) and Hispanics/Latinas (n=3138), we examined genetic main effects and G×E with stressful life events and social support. We also conducted a heritability analysis using genome-wide complex trait analysis (GCTA). Replication was attempted in four independent cohorts. Results: No SNPs achieved genome-wide significance for main effects in either discovery sample. The top signals in African Americans were rs73531535 (located 20kb from GPR139, p=5.75×10^-8) and rs75407252 (intronic to CACNA2D3, p=6.99×10^-7). In Hispanics/Latinas, the top signals were rs2532087 (located 27kb from CD38, p=2.44×10^-7) and rs4542757 (intronic to DCC, p=7.31×10^-7). In the GWEIS with stressful life events, one interaction signal was genome-wide significant in African Americans (rs4652467; p=4.10×10^-10; located 14kb from CEP350). This interaction was not observed in a smaller replication cohort. Although heritability estimates for depressive symptoms and stressful life events were each less than 10%, they were strongly genetically correlated (rG=0.95), suggesting that common variation underlying depressive symptoms and stressful life event exposure, though modest on their own, were highly overlapping in this sample. Conclusions: Our results underscore the need for larger samples, more GWEIS, and greater investigation into genetic and environmental determinants of depressive symptoms in minorities. PMID:27038408

  15. Significant linkage to airway responsiveness on chromosome 12q24 in families of children with asthma in Costa Rica.

    PubMed

    Celedón, Juan C; Soto-Quiros, Manuel E; Avila, Lydiana; Lake, Stephen L; Liang, Catherine; Fournier, Eduardo; Spesny, Mitzi; Hersh, Craig P; Sylvia, Jody S; Hudson, Thomas J; Verner, Andrei; Klanderman, Barbara J; Freimer, Nelson B; Silverman, Edwin K; Weiss, Scott T

    2007-01-01

    Although asthma is a major public health problem in certain Hispanic subgroups in the United States and Latin America, only one genome scan for asthma has included Hispanic individuals. Because of small sample size, that study had limited statistical power to detect linkage to asthma and its intermediate phenotypes in Hispanic participants. To identify genomic regions that contain susceptibility genes for asthma and airway responsiveness in an isolated Hispanic population living in the Central Valley of Costa Rica, we conducted a genome-wide linkage analysis of asthma (n = 638) and airway responsiveness (n = 488) in members of eight large pedigrees of Costa Rican children with asthma. Nonparametric multipoint linkage analysis of asthma was conducted by the NPL-PAIR allele-sharing statistic, and variance component models were used for the multipoint linkage analysis of airway responsiveness as a quantitative phenotype. All linkage analyses were repeated after exclusion of the phenotypic data of former and current smokers. Chromosome 12q showed some evidence of linkage to asthma, particularly in nonsmokers (P < 0.01). Among nonsmokers, there was suggestive evidence of linkage to airway responsiveness on chromosome 12q24.31 (LOD = 2.33 at 146 cM). After genotyping 18 additional short-tandem repeat markers on chromosome 12q, there was significant evidence of linkage to airway responsiveness on chromosome 12q24.31 (LOD = 3.79 at 144 cM), with a relatively narrow 1.5-LOD unit support interval for the observed linkage peak (142-147 cM). Our results suggest that chromosome 12q24.31 contains a locus (or loci) that influence a critical intermediate phenotype of asthma (airway responsiveness) in Costa Ricans.
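
    To put the LOD scores quoted in this abstract on a more familiar scale, the sketch below converts a LOD score into an approximate pointwise p-value. It relies on the standard asymptotic assumption that 2 ln(10) × LOD follows a 50:50 mixture of a point mass at zero and a chi-square distribution with one degree of freedom; it does not account for genome-wide multiple testing and is not taken from the study itself. The function name is hypothetical.

```python
import math


def lod_to_pointwise_p(lod: float) -> float:
    """Approximate pointwise p-value for a LOD score.

    Assumes 2*ln(10)*LOD is asymptotically a 50:50 mixture of a point
    mass at 0 and a chi-square with 1 degree of freedom.
    """
    if lod <= 0:
        return 1.0  # no evidence for linkage
    chi2 = 2.0 * math.log(10.0) * lod
    # For 1 degree of freedom, P(chi-square > x) = erfc(sqrt(x / 2)).
    return 0.5 * math.erfc(math.sqrt(chi2 / 2.0))
```

    Under these assumptions a LOD of 3 corresponds to a pointwise p-value of roughly 1 in 10 000, so the increase from LOD = 2.33 to LOD = 3.79 reported above represents a substantial gain in evidence.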

  16. Genome-wide variation in recombination rate in Eucalyptus.

    PubMed

    Gion, Jean-Marc; Hudson, Corey J; Lesur, Isabelle; Vaillancourt, René E; Potts, Brad M; Freeman, Jules S

    2016-08-09

    Meiotic recombination is a fundamental evolutionary process. It not only generates diversity, but influences the efficacy of natural selection and genome evolution. There can be significant heterogeneity in recombination rates within and between species, however this variation is not well understood outside of a few model taxa, particularly in forest trees. Eucalypts are forest trees of global economic importance, and dominate many Australian ecosystems. We studied recombination rate in Eucalyptus globulus using genetic linkage maps constructed in 10 unrelated individuals, and markers anchored to the Eucalyptus reference genome. This experimental design provided the replication to study whether recombination rate varied between individuals and chromosomes, and allowed us to study the genomic attributes and population genetic parameters correlated with this variation. Recombination rate varied significantly between individuals (range = 2.71 to 3.51 centimorgans/megabase [cM/Mb]), but was not significantly influenced by sex or cross type (F1 vs. F2). Significant differences in recombination rate between chromosomes were also evident (range = 1.98 to 3.81 cM/Mb), beyond those which were due to variation in chromosome size. Variation in chromosomal recombination rate was significantly correlated with gene density (r = 0.94), GC content (r = 0.90), and the number of tandem duplicated genes (r = -0.72) per chromosome. Notably, chromosome level recombination rate was also negatively correlated with the average genetic diversity across six species from an independent set of samples (r = -0.75). The correlations with genomic attributes are consistent with findings in other taxa, however, the direction of the correlation between diversity and recombination rate is opposite to that commonly observed. We argue this is likely to reflect the interaction of selection and specific genome architecture of Eucalyptus. Interestingly, the differences amongst

  17. Trans-ethnic follow-up of breast cancer GWAS hits using the preferential linkage disequilibrium approach

    PubMed Central

    Zhu, Qianqian; Shepherd, Lori; Lunetta, Kathryn L.; Yao, Song; Liu, Qian; Hu, Qiang; Haddad, Stephen A.; Sucheston-Campbell, Lara; Bensen, Jeannette T.; Bandera, Elisa V.; Rosenberg, Lynn; Liu, Song; Haiman, Christopher A.; Olshan, Andrew F.; Palmer, Julie R.; Ambrosone, Christine B.

    2016-01-01

    Leveraging population-distinct linkage disequilibrium (LD) patterns, trans-ethnic follow-up of variants discovered from genome-wide association studies (GWAS) has proved to be useful in facilitating the identification of bona fide causal variants. We previously developed the preferential LD approach, a novel method that successfully identified causal variants driving the GWAS signals within European-descent populations even when the causal variants were only weakly linked with the GWAS-discovered variants. To evaluate the performance of our approach in a trans-ethnic setting, we applied it to follow up breast cancer GWAS hits identified mostly from populations of European ancestry in African Americans (AA). We evaluated 74 breast cancer GWAS variants in 8,315 AA women from the African American Breast Cancer Epidemiology and Risk (AMBER) consortium. Only 27% of them were associated with breast cancer risk at significance level α=0.05, suggesting race-specificity of the identified breast cancer risk loci. We followed up on those replicated GWAS hits in the AMBER consortium utilizing the preferential LD approach, to search for causal variants or better breast cancer markers from the 1000 Genomes variant catalog. Our approach identified stronger breast cancer markers for 80% of the GWAS hits with at least nominal breast cancer association, and in 81% of these cases, the marker identified was among the top 10 of all 1000 Genomes variants in the corresponding locus. The results support trans-ethnic application of the preferential LD approach in search for candidate causal variants, and may have implications for future genetic research of breast cancer in AA women. PMID:27825120

  18. Phylogenomics of plant genomes: a methodology for genome-wide searches for orthologs in plants

    PubMed Central

    Conte, Matthieu G; Gaillard, Sylvain; Droc, Gaetan; Perin, Christophe

    2008-01-01

    Background Gene ortholog identification is now a major objective for mining the increasing amount of sequence data generated by complete or partial genome sequencing projects. Comparative and functional genomics urgently need a method for ortholog detection to reduce gene function inference and to aid in the identification of conserved or divergent genetic pathways between several species. As gene functions change during evolution, reconstructing the evolutionary history of genes should be a more accurate way to differentiate orthologs from paralogs. Phylogenomics takes into account phylogenetic information from high-throughput genome annotation and is the most straightforward way to infer orthologs. However, procedures for automatic detection of orthologs are still scarce and suffer from several limitations. Results We developed a procedure for ortholog prediction between Oryza sativa and Arabidopsis thaliana. Firstly, we established an efficient method to cluster A. thaliana and O. sativa full proteomes into gene families. Then, we developed an optimized phylogenomics pipeline for ortholog inference. We validated the full procedure using test sets of orthologs and paralogs to demonstrate that our method outperforms pairwise methods for ortholog predictions. Conclusion Our procedure achieved a high level of accuracy in predicting ortholog and paralog relationships. Phylogenomic predictions for all validated gene families in both species were easily achieved and we can conclude that our methodology outperforms similarly based methods. PMID:18426584

  19. Landscape genomics in Atlantic salmon (Salmo salar): searching for gene-environment interactions driving local adaptation.

    PubMed

    Vincent, Bourret; Dionne, Mélanie; Kent, Matthew P; Lien, Sigbjørn; Bernatchez, Louis

    2013-12-01

    A growing number of studies are examining the factors driving historical and contemporary evolution in wild populations. By combining surveys of genomic variation with a comprehensive assessment of environmental parameters, such studies can increase our understanding of the genomic and geographical extent of local adaptation in wild populations. We used a large-scale landscape genomics approach to examine adaptive and neutral differentiation across 54 North American populations of Atlantic salmon representing seven previously defined genetically distinct regional groups. Over 5500 genome-wide single nucleotide polymorphisms were genotyped in 641 individuals and 28 bulk assays of 25 pooled individuals each. Genome scans, linkage map, and 49 environmental variables were combined to conduct an innovative landscape genomic analysis. Our results provide valuable insight into the links between environmental variation and both neutral and potentially adaptive genetic divergence. In particular, we identified markers potentially under divergent selection, as well as associated selective environmental factors and biological functions with the observed adaptive divergence. Multivariate landscape genetic analysis revealed strong associations of both genetic and environmental structures. We found an enrichment of growth-related functions among outlier markers. Climate (temperature-precipitation) and geological characteristics were significantly associated with both potentially adaptive and neutral genetic divergence and should be considered as candidate loci involved in adaptation at the regional scale in Atlantic salmon. Hence, this study significantly contributes to the improvement of tools used in modern conservation and management schemes of Atlantic salmon wild populations. © 2013 The Author(s). Evolution © 2013 The Society for the Study of Evolution.

  20. Wide Distribution of Mitochondrial Genome Rearrangements in Wild Strains of the Cultivated Basidiomycete Agrocybe aegerita

    PubMed Central

    Barroso, G.; Blesa, S.; Labarere, J.

    1995-01-01

    We used restriction fragment length polymorphisms to examine mitochondrial genome rearrangements in 36 wild strains of the cultivated basidiomycete Agrocybe aegerita, collected from widely distributed locations in Europe. We identified two polymorphic regions within the mitochondrial DNA which varied independently: one carrying the Cox II coding sequence and the other carrying the Cox I, ATP6, and ATP8 coding sequences. Two types of mutations were responsible for the restriction fragment length polymorphisms that we observed and, accordingly, were involved in the A. aegerita mitochondrial genome evolution: (i) point mutations, which resulted in strain-specific mitochondrial markers, and (ii) length mutations due to genome rearrangements, such as deletions, insertions, or duplications. Within each polymorphic region, the length differences defined only two mitochondrial types, suggesting that these length mutations were not randomly generated but resulted from a precise rearrangement mechanism. For each of the two polymorphic regions, the two molecular types were distributed among the 36 strains without obvious correlation with their geographic origin. On the basis of these two polymorphisms, it is possible to define four mitochondrial haplotypes. The four mitochondrial haplotypes could be the result of intermolecular recombination between allelic forms present in the population long enough to reach linkage equilibrium. All of the 36 dikaryotic strains contained only a single mitochondrial type, confirming the previously described mitochondrial sorting out after cytoplasmic mixing in basidiomycetes. PMID:16534984

  1. Dosage Transmission Disequilibrium Test (dTDT) for Linkage and Association Detection

    PubMed Central

    Zhang, Zhehao; Wang, Jen-Chyong; Howells, William; Lin, Peng; Agrawal, Arpana; Edenberg, Howard J.; Tischfield, Jay A.; Schuckit, Marc A.; Bierut, Laura J.; Goate, Alison; Rice, John P.

    2013-01-01

    Both linkage and association studies have been successfully applied to identify disease susceptibility genes with genetic markers such as microsatellites and Single Nucleotide Polymorphisms (SNPs). As one of the traditional family-based studies, the Transmission/Disequilibrium Test (TDT) measures the over-transmission of an allele in a trio from its heterozygous parents to the affected offspring and can be potentially useful to identify genetic determinants for complex disorders. However, there is reduced information when complete trio information is unavailable. In this study, we developed a novel approach to “infer” the transmission of SNPs by combining both the linkage and association data, which uses microsatellite markers from families informative for linkage together with SNP markers from the offspring who are genotyped for both linkage and a Genome-Wide Association Study (GWAS). We generalized the traditional TDT to process these inferred dosage probabilities, which we name as the dosage-TDT (dTDT). For evaluation purposes, we developed a simulation procedure to assess its operating characteristics. We applied the dTDT to the simulated data and documented the power of the dTDT under a number of different realistic scenarios. Finally, we applied our methods to a family study of alcohol dependence (COGA) and performed individual genotyping on complete families for the top signals. One SNP (rs4903712 on chromosome 14) remained significant after correcting for multiple testing. Methods developed in this study can be adapted to other platforms and will have widespread applicability in genomic research when case-control GWAS data are collected in families with existing linkage data. PMID:23691058
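
    To make the over-transmission idea in this record concrete, here is a minimal sketch of the classical TDT statistic (the McNemar-type test on transmissions from heterozygous parents), not the dosage-TDT described in the abstract. The function name and the example counts are hypothetical and chosen only for illustration.

```python
import math


def tdt_statistic(transmitted: int, untransmitted: int) -> tuple[float, float]:
    """Classical TDT for one biallelic marker.

    transmitted   -- times heterozygous parents passed the test allele
                     to an affected child
    untransmitted -- times heterozygous parents passed the other allele
    Returns (chi-square statistic with 1 df, two-sided p-value).
    """
    total = transmitted + untransmitted
    if total == 0:
        raise ValueError("no informative (heterozygous) parents")
    chi2 = (transmitted - untransmitted) ** 2 / total
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts: 60 transmissions versus 40 non-transmissions.
chi2, p_value = tdt_statistic(60, 40)
print(f"chi2 = {chi2:.2f}, p = {p_value:.4f}")
```

    Under the null hypothesis of no linkage or no association, each allele of a heterozygous parent is transmitted with probability 1/2, so the two counts are expected to be roughly equal.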

  2. A Genome-Wide Breast Cancer Scan in African Americans

    DTIC Science & Technology

    2010-06-01

    SNPs from the African American breast cancer scan to COGs , a European collaborative study which is has designed a SNP array with that will be genotyped...Award Number: W81XWH-08-1-0383 TITLE: A Genome-wide Breast Cancer Scan in African Americans PRINCIPAL INVESTIGATOR: Christopher A...SUBTITLE A Genome-wide Breast Cancer Scan in African Americans 5a. CONTRACT NUMBER 5b. GRANT NUMBER W81XWH-08-1-0383 5c. PROGRAM

  3. A Genome-Wide Survey of the Microsatellite Content of the Globe Artichoke Genome and the Development of a Web-Based Database

    PubMed Central

    Portis, Ezio; Portis, Flavio; Valente, Luisa; Moglia, Andrea; Barchi, Lorenzo; Lanteri, Sergio; Acquadro, Alberto

    2016-01-01

    The recently acquired genome sequence of globe artichoke (Cynara cardunculus var. scolymus) has been used to catalog the genome’s content of simple sequence repeat (SSR) markers. More than 177,000 perfect SSRs were revealed, equivalent to an overall density across the genome of 244.5 SSRs/Mbp, but some 224,000 imperfect SSRs were also identified. About 21% of these SSRs were complex (two stretches of repeats separated by <100 nt). Some 73% of the SSRs were composed of dinucleotide motifs. The SSRs were categorized for the numbers of repeats present, their overall length and were allocated to their linkage group. A total of 4,761 perfect and 6,583 imperfect SSRs were present in 3,781 genes (14.11% of the total), corresponding to an overall density across the gene space of 32.5 and 44.9 SSRs/Mbp for perfect and imperfect motifs, respectively. A putative function has been assigned, using the gene ontology approach, to the set of genes harboring at least one SSR. The same search parameters were applied to reveal the SSR content of 14 other plant species for which genome sequence is available. Certain species-specific SSR motifs were identified, along with a hexa-nucleotide motif shared only with the other two Compositae species (sunflower (Helianthus annuus) and horseweed (Conyza canadensis)) included in the study. Finally, a database, called “Cynara cardunculus MicroSatellite DataBase” (CyMSatDB) was developed to provide a searchable interface to the SSR data. CyMSatDB facilitates the retrieval of SSR markers, as well as suggested forward and reverse primers, on the basis of genomic location, genomic vs genic context, perfect vs imperfect repeat, motif type, motif sequence and repeat number. The SSR markers were validated via an in silico based PCR analysis adopting two available assembled transcriptomes, derived from contrasting globe artichoke accessions, as templates. PMID:27648830
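
    As a rough illustration of what a perfect-SSR scan and a density figure such as "SSRs/Mbp" involve, the sketch below finds uninterrupted tandem repeats with regular expressions. It is not the tool used by the authors; the minimum repeat counts in MIN_REPEATS and the toy sequence are assumptions made up for this example.

```python
import re

# Hypothetical thresholds: motif length -> minimum number of tandem copies.
MIN_REPEATS = {2: 6, 3: 5, 4: 4, 5: 4, 6: 4}


def is_primitive(motif: str) -> bool:
    """True unless the motif is itself a repeat of a shorter unit (e.g. ATAT)."""
    n = len(motif)
    return not any(
        n % k == 0 and motif == motif[:k] * (n // k) for k in range(1, n)
    )


def find_perfect_ssrs(seq: str) -> list[tuple[int, str, int]]:
    """Return sorted (start, motif, copies) for each perfect SSR in seq."""
    seq = seq.upper()
    hits = []
    for k, min_rep in MIN_REPEATS.items():
        # One captured motif of length k, then at least min_rep - 1 more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if is_primitive(motif):
                hits.append((match.start(), motif, len(match.group(0)) // k))
    return sorted(hits)


def density_per_mbp(n_ssrs: int, length_bp: int) -> float:
    """SSR count normalised to one million base pairs of sequence."""
    return n_ssrs / (length_bp / 1_000_000)


toy = "GGC" + "AT" * 8 + "CCGTC" + "GAA" * 5 + "TTC"
for start, motif, copies in find_perfect_ssrs(toy):
    print(start, motif, copies)
```

    The primitive-motif check keeps a run such as (AT)8 from being reported a second time as (ATAT)4. Imperfect and complex SSRs, which the abstract also counts, would need mismatch-tolerant matching that this sketch does not attempt.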

  4. Improved Statistics for Genome-Wide Interaction Analysis

    PubMed Central

    Ueki, Masao; Cordell, Heather J.

    2012-01-01

    Recently, Wu and colleagues [1] proposed two novel statistics for genome-wide interaction analysis using case/control or case-only data. In computer simulations, their proposed case/control statistic outperformed competing approaches, including the fast-epistasis option in PLINK and logistic regression analysis under the correct model; however, reasons for its superior performance were not fully explored. Here we investigate the theoretical properties and performance of Wu et al.'s proposed statistics and explain why, in some circumstances, they outperform competing approaches. Unfortunately, we find minor errors in the formulae for their statistics, resulting in tests that have higher than nominal type 1 error. We also find minor errors in PLINK's fast-epistasis and case-only statistics, although theory and simulations suggest that these errors have only negligible effect on type 1 error. We propose adjusted versions of all four statistics that, both theoretically and in computer simulations, maintain correct type 1 error rates under the null hypothesis. We also investigate statistics based on correlation coefficients that maintain similar control of type 1 error. Although designed to test specifically for interaction, we show that some of these previously-proposed statistics can, in fact, be sensitive to main effects at one or both loci, particularly in the presence of linkage disequilibrium. We propose two new “joint effects” statistics that, provided the disease is rare, are sensitive only to genuine interaction effects. In computer simulations we find, in most situations considered, that highest power is achieved by analysis under the correct genetic model. Such an analysis is unachievable in practice, as we do not know this model. However, generally high power over a wide range of scenarios is exhibited by our joint effects and adjusted Wu statistics. We recommend use of these alternative or adjusted statistics and urge caution when using Wu et al

  5. Genome-Wide Fine-Scale Recombination Rate Variation in Drosophila melanogaster

    PubMed Central

    Song, Yun S.

    2012-01-01

    Estimating fine-scale recombination maps of Drosophila from population genomic data is a challenging problem, in particular because of the high background recombination rate. In this paper, a new computational method is developed to address this challenge. Through an extensive simulation study, it is demonstrated that the method allows more accurate inference, and exhibits greater robustness to the effects of natural selection and noise, compared to a well-used previous method developed for studying fine-scale recombination rate variation in the human genome. As an application, a genome-wide analysis of genetic variation data is performed for two Drosophila melanogaster populations, one from North America (Raleigh, USA) and the other from Africa (Gikongoro, Rwanda). It is shown that fine-scale recombination rate variation is widespread throughout the D. melanogaster genome, across all chromosomes and in both populations. At the fine-scale, a conservative, systematic search for evidence of recombination hotspots suggests the existence of a handful of putative hotspots each with at least a tenfold increase in intensity over the background rate. A wavelet analysis is carried out to compare the estimated recombination maps in the two populations and to quantify the extent to which recombination rates are conserved. In general, similarity is observed at very broad scales, but substantial differences are seen at fine scales. The average recombination rate of the X chromosome appears to be higher than that of the autosomes in both populations, and this pattern is much more pronounced in the African population than the North American population. The correlation between various genomic features—including recombination rates, diversity, divergence, GC content, gene content, and sequence quality—is examined using the wavelet analysis, and it is shown that the most notable difference between D. melanogaster and humans is in the correlation between recombination and

  6. Use of a Drosophila Genome-Wide Conserved Sequence Database to Identify Functionally Related cis-Regulatory Enhancers

    PubMed Central

    Brody, Thomas; Yavatkar, Amarendra S; Kuzin, Alexander; Kundu, Mukta; Tyson, Leonard J; Ross, Jermaine; Lin, Tzu-Yang; Lee, Chi-Hon; Awasaki, Takeshi; Lee, Tzumin; Odenwald, Ward F

    2012-01-01

    Background: Phylogenetic footprinting has revealed that cis-regulatory enhancers consist of conserved DNA sequence clusters (CSCs). Currently, there is no systematic approach for enhancer discovery and analysis that takes full-advantage of the sequence information within enhancer CSCs. Results: We have generated a Drosophila genome-wide database of conserved DNA consisting of >100,000 CSCs derived from EvoPrints spanning over 90% of the genome. cis-Decoder database search and alignment algorithms enable the discovery of functionally related enhancers. The program first identifies conserved repeat elements within an input enhancer and then searches the database for CSCs that score highly against the input CSC. Scoring is based on shared repeats as well as uniquely shared matches, and includes measures of the balance of shared elements, a diagnostic that has proven to be useful in predicting cis-regulatory function. To demonstrate the utility of these tools, a temporally-restricted CNS neuroblast enhancer was used to identify other functionally related enhancers and analyze their structural organization. Conclusions: cis-Decoder reveals that co-regulating enhancers consist of combinations of overlapping shared sequence elements, providing insights into the mode of integration of multiple regulating transcription factors. The database and accompanying algorithms should prove useful in the discovery and analysis of enhancers involved in any developmental process. Developmental Dynamics 241:169–189, 2012. © 2011 Wiley Periodicals, Inc. Key findings A genome-wide catalog of Drosophila conserved DNA sequence clusters. cis-Decoder discovers functionally related enhancers. Functionally related enhancers share balanced sequence element copy numbers. Many enhancers function during multiple phases of development. PMID:22174086

  7. Genome-Wide Association Mapping for Yield and Other Agronomic Traits in an Elite Breeding Population of Tropical Rice (Oryza sativa)

    PubMed Central

    Lalusin, Antonio; Borromeo, Teresita; Gregorio, Glenn; Hernandez, Jose; Virk, Parminder; Collard, Bertrand; McCouch, Susan R.

    2015-01-01

    Genome-wide association mapping studies (GWAS) are frequently used to detect QTL in diverse collections of crop germplasm, based on historic recombination events and linkage disequilibrium across the genome. Generally, diversity panels genotyped with high density SNP panels are utilized in order to assay a wide range of alleles and haplotypes and to monitor recombination breakpoints across the genome. By contrast, GWAS have not generally been performed in breeding populations. In this study we performed association mapping for 19 agronomic traits including yield and yield components in a breeding population of elite irrigated tropical rice breeding lines so that the results would be more directly applicable to breeding than those from a diversity panel. The population was genotyped with 71,710 SNPs using genotyping-by-sequencing (GBS), and GWAS performed with the explicit goal of expediting selection in the breeding program. Using this breeding panel we identified 52 QTL for 11 agronomic traits, including large effect QTLs for flowering time and grain length/grain width/grain-length-breadth ratio. We also identified haplotypes that can be used to select plants in our population for short stature (plant height), early flowering time, and high yield, and thus demonstrate the utility of association mapping in breeding populations for informing breeding decisions. We conclude by exploring how the newly identified significant SNPs and insights into the genetic architecture of these quantitative traits can be leveraged to build genomic-assisted selection models. PMID:25785447

  8. High-resolution genetic map for understanding the effect of genome-wide recombination rate, selection sweep and linkage disequilibrium on nucleotide diversity in watermelon

    USDA-ARS's Scientific Manuscript database

    Genotyping by sequencing (GBS) technology was used to identify a set of 9,933 single nucleotide polymorphism (SNP) markers for constructing a high-resolution genetic map of 1,087 cM for watermelon. The genome-wide variation of recombination rate (GWRR) across the map was evaluated and a positive co...

  9. Genome-Wide Association Analysis of Young-Onset Stroke Identifies a Locus on Chromosome 10q25 Near HABP2.

    PubMed

    Cheng, Yu-Ching; Stanne, Tara M; Giese, Anne-Katrin; Ho, Weang Kee; Traylor, Matthew; Amouyel, Philippe; Holliday, Elizabeth G; Malik, Rainer; Xu, Huichun; Kittner, Steven J; Cole, John W; O'Connell, Jeffrey R; Danesh, John; Rasheed, Asif; Zhao, Wei; Engelter, Stefan; Grond-Ginsbach, Caspar; Kamatani, Yoichiro; Lathrop, Mark; Leys, Didier; Thijs, Vincent; Metso, Tiina M; Tatlisumak, Turgut; Pezzini, Alessandro; Parati, Eugenio A; Norrving, Bo; Bevan, Steve; Rothwell, Peter M; Sudlow, Cathie; Slowik, Agnieszka; Lindgren, Arne; Walters, Matthew R; Jannes, Jim; Shen, Jess; Crosslin, David; Doheny, Kimberly; Laurie, Cathy C; Kanse, Sandip M; Bis, Joshua C; Fornage, Myriam; Mosley, Thomas H; Hopewell, Jemma C; Strauch, Konstantin; Müller-Nurasyid, Martina; Gieger, Christian; Waldenberger, Melanie; Peters, Annette; Meisinger, Christine; Ikram, M Arfan; Longstreth, W T; Meschia, James F; Seshadri, Sudha; Sharma, Pankaj; Worrall, Bradford; Jern, Christina; Levi, Christopher; Dichgans, Martin; Boncoraglio, Giorgio B; Markus, Hugh S; Debette, Stephanie; Rolfs, Arndt; Saleheen, Danish; Mitchell, Braxton D

    2016-02-01

    Although a genetic contribution to ischemic stroke is well recognized, only a handful of stroke loci have been identified by large-scale genetic association studies to date. Hypothesizing that genetic effects might be stronger for early- versus late-onset stroke, we conducted a 2-stage meta-analysis of genome-wide association studies, focusing on stroke cases with an age of onset <60 years. The discovery stage of our genome-wide association studies included 4505 cases and 21 968 controls of European, South-Asian, and African ancestry, drawn from 6 studies. In Stage 2, we selected the lead genetic variants at loci with association P<5×10⁻⁶ and performed in silico association analyses in an independent sample of ≤1003 cases and 7745 controls. One stroke susceptibility locus at 10q25 reached genome-wide significance in the combined analysis of all samples from the discovery and follow-up stages (rs11196288; odds ratio =1.41; P=9.5×10⁻⁹). The associated locus is in an intergenic region between TCF7L2 and HABP2. In a further analysis in an independent sample, we found that 2 single nucleotide polymorphisms in high linkage disequilibrium with rs11196288 were significantly associated with total plasma factor VII-activating protease levels, a product of HABP2. HABP2, which encodes an extracellular serine protease involved in coagulation, fibrinolysis, and inflammatory pathways, may be a genetic susceptibility locus for early-onset stroke. © 2016 American Heart Association, Inc.

  10. COPS: Detecting Co-Occurrence and Spatial Arrangement of Transcription Factor Binding Motifs in Genome-Wide Datasets

    PubMed Central

    Lohmann, Ingrid

    2012-01-01

    In multi-cellular organisms, spatiotemporal activity of cis-regulatory DNA elements depends on their occupancy by different transcription factors (TFs). In recent years, genome-wide ChIP-on-Chip, ChIP-Seq and DamID assays have been extensively used to unravel the combinatorial interaction of TFs with cis-regulatory modules (CRMs) in the genome. Even though genome-wide binding profiles are increasingly becoming available for different TFs, single TF binding profiles are in most cases not sufficient for dissecting complex regulatory networks. Thus, potent computational tools detecting statistically significant and biologically relevant TF-motif co-occurrences in genome-wide datasets are essential for analyzing context-dependent transcriptional regulation. We have developed COPS (Co-Occurrence Pattern Search), a new bioinformatics tool based on a combination of association rules and Markov chain models, which detects co-occurring TF binding sites (BSs) on genomic regions of interest. COPS scans DNA sequences for frequent motif patterns using a Frequent-Pattern tree based data mining approach, which allows efficient performance of the software with respect to both data structure and implementation speed, in particular when mining large datasets. Since transcriptional gene regulation very often relies on the formation of regulatory protein complexes mediated by closely adjoining TF binding sites on CRMs, COPS additionally detects preferred short distance between co-occurring TF motifs. The performance of our software with respect to biological significance was evaluated using three published datasets containing genomic regions that are independently bound by several TFs involved in a defined biological process. In sum, COPS is a fast, efficient and user-friendly tool mining statistically and biologically significant TFBS co-occurrences and therefore allows the identification of TFs that combinatorially regulate gene expression. PMID:23272209

  11. Layers of epistasis: genome-wide regulatory networks and network approaches to genome-wide association studies.

    PubMed

    Cowper-Sal lari, Richard; Cole, Michael D; Karagas, Margaret R; Lupien, Mathieu; Moore, Jason H

    2011-01-01

    The conceptual foundation of the genome-wide association study (GWAS) has advanced unchecked since its conception. A revision might seem premature as the potential of GWAS has not been fully realized. Multiple technical and practical limitations need to be overcome before GWAS can be fairly criticized. But with the completion of hundreds of studies and a deeper understanding of the genetic architecture of disease, warnings are being raised. The results compiled to date indicate that risk-associated variants lie predominantly in noncoding regions of the genome. Additionally, alternative methodologies are uncovering large and heterogeneous sets of rare variants underlying disease. The fear is that, even in its fulfillment, the current GWAS paradigm might be incapable of dissecting all kinds of phenotypes. In the following text, we review several initiatives that aim to overcome these limitations. The overarching theme of these studies is the inclusion of biological knowledge to both the analysis and interpretation of genotyping data. GWAS is uninformed of biology by design and although there is some virtue in its simplicity, it is also its most conspicuous deficiency. We propose a framework in which to integrate these novel approaches, both empirical and theoretical, in the form of a genome-wide regulatory network (GWRN). By processing experimental data into networks, emerging data types based on chromatin immunoprecipitation are made computationally tractable. This will give GWAS re-analysis efforts the most current and relevant substrates, and root them firmly on our knowledge of human disease. Copyright © 2010 John Wiley & Sons, Inc.

  12. Genome-wide association mapping of partial resistance to Aphanomyces euteiches in pea.

    PubMed

    Desgroux, Aurore; L'Anthoëne, Virginie; Roux-Duparque, Martine; Rivière, Jean-Philippe; Aubert, Grégoire; Tayeh, Nadim; Moussart, Anne; Mangin, Pierre; Vetel, Pierrick; Piriou, Christophe; McGee, Rebecca J; Coyne, Clarice J; Burstin, Judith; Baranger, Alain; Manzanares-Dauleux, Maria; Bourion, Virginie; Pilet-Nayel, Marie-Laure

    2016-02-20

    Genome-wide association (GWA) mapping has recently emerged as a valuable approach for refining the genetic basis of polygenic resistance to plant diseases, which are increasingly used in integrated strategies for durable crop protection. Aphanomyces euteiches is a soil-borne pathogen of pea and other legumes worldwide, which causes yield-damaging root rot. Linkage mapping studies reported quantitative trait loci (QTL) controlling resistance to A. euteiches in pea. However the confidence intervals (CIs) of these QTL remained large and were often linked to undesirable alleles, which limited their application in breeding. The aim of this study was to use a GWA approach to validate and refine CIs of the previously reported Aphanomyces resistance QTL, as well as identify new resistance loci. A pea-Aphanomyces collection of 175 pea lines, enriched in germplasm derived from previously studied resistant sources, was evaluated for resistance to A. euteiches in field infested nurseries in nine environments and with two strains in climatic chambers. The collection was genotyped using 13,204 SNPs from the recently developed GenoPea Infinium® BeadChip. GWA analysis detected a total of 52 QTL of small size-intervals associated with resistance to A. euteiches, using the recently developed Multi-Locus Mixed Model. The analysis validated six of the seven previously reported main Aphanomyces resistance QTL and detected novel resistance loci. It also provided marker haplotypes at 14 consistent QTL regions associated with increased resistance and highlighted accumulation of favourable haplotypes in the most resistant lines. Previous linkages between resistance alleles and undesired late-flowering alleles for dry pea breeding were mostly confirmed, but the linkage between loci controlling resistance and coloured flowers was broken due to the high resolution of the analysis. A high proportion of the putative candidate genes underlying resistance loci encoded stress-related proteins and

  13. Landscape genomics reveals altered genome wide diversity within revegetated stands of Eucalyptus microcarpa (Grey Box).

    PubMed

    Jordan, Rebecca; Dillon, Shannon K; Prober, Suzanne M; Hoffmann, Ary A

    2016-12-01

    In order to contribute to evolutionary resilience and adaptive potential in highly modified landscapes, revegetated areas should ideally reflect levels of genetic diversity within and across natural stands. Landscape genomic analyses enable such diversity patterns to be characterized at genome and chromosomal levels. Landscape-wide patterns of genomic diversity were assessed in Eucalyptus microcarpa, a dominant tree species widely used in revegetation in Southeastern Australia. Trees from small and large patches within large remnants, small isolated remnants and revegetation sites were assessed across the now highly fragmented distribution of this species using the DArTseq genomic approach. Genomic diversity was similar within all three types of remnant patches analysed, although often significantly but only slightly lower in revegetation sites compared with natural remnants. Differences in diversity between stand types varied across chromosomes. Genomic differentiation was higher between small, isolated remnants, and among revegetated sites compared with natural stands. We conclude that small remnants and revegetated sites of our E. microcarpa samples largely but not completely capture patterns in genomic diversity across the landscape. Genomic approaches provide a powerful tool for assessing restoration efforts across the landscape. © 2016 The Authors. New Phytologist © 2016 New Phytologist Trust.

  14. A two-stage genome-wide association study of sporadic amyotrophic lateral sclerosis.

    PubMed

    Chiò, Adriano; Schymick, Jennifer C; Restagno, Gabriella; Scholz, Sonja W; Lombardo, Federica; Lai, Shiao-Lin; Mora, Gabriele; Fung, Hon-Chung; Britton, Angela; Arepalli, Sampath; Gibbs, J Raphael; Nalls, Michael; Berger, Stephen; Kwee, Lydia Coulter; Oddone, Eugene Z; Ding, Jinhui; Crews, Cynthia; Rafferty, Ian; Washecka, Nicole; Hernandez, Dena; Ferrucci, Luigi; Bandinelli, Stefania; Guralnik, Jack; Macciardi, Fabio; Torri, Federica; Lupoli, Sara; Chanock, Stephen J; Thomas, Gilles; Hunter, David J; Gieger, Christian; Wichmann, H Erich; Calvo, Andrea; Mutani, Roberto; Battistini, Stefania; Giannini, Fabio; Caponnetto, Claudia; Mancardi, Giovanni Luigi; La Bella, Vincenzo; Valentino, Francesca; Monsurrò, Maria Rosaria; Tedeschi, Gioacchino; Marinou, Kalliopi; Sabatelli, Mario; Conte, Amelia; Mandrioli, Jessica; Sola, Patrizia; Salvi, Fabrizio; Bartolomei, Ilaria; Siciliano, Gabriele; Carlesi, Cecilia; Orrell, Richard W; Talbot, Kevin; Simmons, Zachary; Connor, James; Pioro, Erik P; Dunkley, Travis; Stephan, Dietrich A; Kasperaviciute, Dalia; Fisher, Elizabeth M; Jabonka, Sibylle; Sendtner, Michael; Beck, Marcus; Bruijn, Lucie; Rothstein, Jeffrey; Schmidt, Silke; Singleton, Andrew; Hardy, John; Traynor, Bryan J

    2009-04-15

    The cause of sporadic amyotrophic lateral sclerosis (ALS) is largely unknown, but genetic factors are thought to play a significant role in determining susceptibility to motor neuron degeneration. To identify genetic variants altering risk of ALS, we undertook a two-stage genome-wide association study (GWAS): we followed our initial GWAS of 545 066 SNPs in 553 individuals with ALS and 2338 controls by testing the 7600 most associated SNPs from the first stage in three independent cohorts consisting of 2160 cases and 3008 controls. None of the SNPs selected for replication exceeded the Bonferroni threshold for significance. The two most significantly associated SNPs, rs2708909 and rs2708851 [odds ratio (OR) = 1.17 and 1.18, and P-values = 6.98 × 10⁻⁷ and 1.16 × 10⁻⁶], were located on chromosome 7p13.3 within a 175 kb linkage disequilibrium block containing the SUNC1, HUS1 and C7orf57 genes. These associations did not achieve genome-wide significance in the original cohort and failed to replicate in an additional independent cohort of 989 US cases and 327 controls (OR = 1.18 and 1.19, P-values = 0.08 and 0.06, respectively). Thus, we chose to cautiously interpret our data as hypothesis-generating requiring additional confirmation, especially as all previously reported loci for ALS have failed to replicate successfully. Indeed, the three loci (FGGY, ITPR2 and DPP6) identified in previous GWAS of sporadic ALS were not significantly associated with disease in our study. Our findings suggest that ALS is more genetically and clinically heterogeneous than previously recognized. Genotype data from our study have been made available online to facilitate such future endeavors.
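
    The Bonferroni comparison mentioned in this record can be reproduced with simple arithmetic. The sketch below assumes a family-wise error rate of 0.05 corrected over all 545 066 first-stage SNPs; the abstract does not state the exact threshold the authors used, so both the alpha value and the choice of denominator are assumptions.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance level that keeps the family-wise error at alpha."""
    return alpha / n_tests


N_SNPS = 545_066   # first-stage SNP count quoted in the abstract
ALPHA = 0.05       # assumed family-wise error rate

threshold = bonferroni_threshold(ALPHA, N_SNPS)

# P-values quoted in the abstract for the two top SNPs.
reported = {"rs2708909": 6.98e-7, "rs2708851": 1.16e-6}
passes = {snp: p < threshold for snp, p in reported.items()}

print(f"threshold = {threshold:.3g}")
for snp, ok in passes.items():
    print(snp, "passes" if ok else "does not pass")
```

    Under these assumptions the per-SNP threshold is roughly 9.2 × 10⁻⁸, so neither quoted P-value clears it, in line with the abstract's statement that the associations did not reach genome-wide significance.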

  15. A two-stage genome-wide association study of sporadic amyotrophic lateral sclerosis

    PubMed Central

    Chiò, Adriano; Schymick, Jennifer C.; Restagno, Gabriella; Scholz, Sonja W.; Lombardo, Federica; Lai, Shiao-Lin; Mora, Gabriele; Fung, Hon-Chung; Britton, Angela; Arepalli, Sampath; Gibbs, J. Raphael; Nalls, Michael; Berger, Stephen; Kwee, Lydia Coulter; Oddone, Eugene Z.; Ding, Jinhui; Crews, Cynthia; Rafferty, Ian; Washecka, Nicole; Hernandez, Dena; Ferrucci, Luigi; Bandinelli, Stefania; Guralnik, Jack; Macciardi, Fabio; Torri, Federica; Lupoli, Sara; Chanock, Stephen J.; Thomas, Gilles; Hunter, David J.; Gieger, Christian; Wichmann, H. Erich; Calvo, Andrea; Mutani, Roberto; Battistini, Stefania; Giannini, Fabio; Caponnetto, Claudia; Mancardi, Giovanni Luigi; La Bella, Vincenzo; Valentino, Francesca; Monsurrò, Maria Rosaria; Tedeschi, Gioacchino; Marinou, Kalliopi; Sabatelli, Mario; Conte, Amelia; Mandrioli, Jessica; Sola, Patrizia; Salvi, Fabrizio; Bartolomei, Ilaria; Siciliano, Gabriele; Carlesi, Cecilia; Orrell, Richard W.; Talbot, Kevin; Simmons, Zachary; Connor, James; Pioro, Erik P.; Dunkley, Travis; Stephan, Dietrich A.; Kasperaviciute, Dalia; Fisher, Elizabeth M.; Jabonka, Sibylle; Sendtner, Michael; Beck, Marcus; Bruijn, Lucie; Rothstein, Jeffrey; Schmidt, Silke; Singleton, Andrew; Hardy, John; Traynor, Bryan J.

    2009-01-01

    The cause of sporadic amyotrophic lateral sclerosis (ALS) is largely unknown, but genetic factors are thought to play a significant role in determining susceptibility to motor neuron degeneration. To identify genetic variants altering risk of ALS, we undertook a two-stage genome-wide association study (GWAS): we followed our initial GWAS of 545 066 SNPs in 553 individuals with ALS and 2338 controls by testing the 7600 most associated SNPs from the first stage in three independent cohorts consisting of 2160 cases and 3008 controls. None of the SNPs selected for replication exceeded the Bonferroni threshold for significance. The two most significantly associated SNPs, rs2708909 and rs2708851 [odds ratio (OR) = 1.17 and 1.18, and P-values = 6.98 × 10−7 and 1.16 × 10−6], were located on chromosome 7p13.3 within a 175 kb linkage disequilibrium block containing the SUNC1, HUS1 and C7orf57 genes. These associations did not achieve genome-wide significance in the original cohort and failed to replicate in an additional independent cohort of 989 US cases and 327 controls (OR = 1.18 and 1.19, P-values = 0.08 and 0.06, respectively). Thus, we chose to cautiously interpret our data as hypothesis-generating requiring additional confirmation, especially as all previously reported loci for ALS have failed to replicate successfully. Indeed, the three loci (FGGY, ITPR2 and DPP6) identified in previous GWAS of sporadic ALS were not significantly associated with disease in our study. Our findings suggest that ALS is more genetically and clinically heterogeneous than previously recognized. Genotype data from our study have been made available online to facilitate such future endeavors. PMID:19193627

  16. A Genome-Wide Association Study Identifies Multiple Regions Associated with Head Size in Catfish

    PubMed Central

    Geng, Xin; Liu, Shikai; Yao, Jun; Bao, Lisui; Zhang, Jiaren; Li, Chao; Wang, Ruijia; Sha, Jin; Zeng, Peng; Zhi, Degui; Liu, Zhanjiang

    2016-01-01

    Skull morphology is fundamental to evolution and the biological adaptation of species to their environments. With aquaculture fish species, head size is also important for economic reasons because it has a direct impact on fillet yield. However, little is known about the underlying genetic basis of head size. Catfish is the primary aquaculture species in the United States. In this study, we performed a genome-wide association study using the catfish 250K SNP array with backcross hybrid catfish to map the QTL for head size (head length, head width, and head depth). One significantly associated region on linkage group (LG) 7 was identified for head length. In addition, LGs 7, 9, and 16 contain suggestively associated regions for head length. For head width, significantly associated regions were found on LG9, and additional suggestively associated regions were identified on LGs 5 and 7. No region was found associated with head depth. Head size genetic loci were mapped in catfish to genomic regions with candidate genes involved in bone development. Comparative analysis indicated that homologs of several candidate genes are also involved in skull morphology in various other species ranging from amphibian to mammalian species, suggesting possible evolutionary conservation of those genes in the control of skull morphologies. PMID:27558670

  17. Probabilistic record linkage

    PubMed Central

    Sayers, Adrian; Ben-Shlomo, Yoav; Blom, Ashley W; Steele, Fiona

    2016-01-01

    Studies involving the use of probabilistic record linkage are becoming increasingly common. However, the methods underpinning probabilistic record linkage are not widely taught or understood, and therefore these studies can appear to be a ‘black box’ research tool. In this article, we aim to describe the process of probabilistic record linkage through a simple exemplar. We first introduce the concept of deterministic linkage and contrast this with probabilistic linkage. We illustrate each step of the process using a simple exemplar and describe the data structure required to perform a probabilistic linkage. We describe the process of calculating and interpreting matched weights and how to convert matched weights into posterior probabilities of a match using Bayes' theorem. We conclude this article with a brief discussion of some of the computational demands of record linkage, how you might assess the quality of your linkage algorithm, and how epidemiologists can maximize the value of their record-linked research using robust record linkage methods. PMID:26686842
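
    The weight calculation and Bayes conversion described in this abstract can be sketched as follows. This is a minimal illustration in the Fellegi-Sunter style, not the article's own code: the field names, the m- and u-probabilities, and the prior are hypothetical values chosen only for the example, and fields are assumed to be conditionally independent.

```python
import math

# Hypothetical per-field probabilities (assumptions, not from the article):
#   m = P(field agrees | the two records are a true match)
#   u = P(field agrees | the two records are not a match)
FIELDS = {
    "surname": (0.95, 0.01),
    "birth_year": (0.98, 0.05),
    "postcode": (0.90, 0.02),
}


def match_weight(m: float, u: float, agrees: bool) -> float:
    """Log2 likelihood ratio contributed by one field."""
    if agrees:
        return math.log2(m / u)
    return math.log2((1.0 - m) / (1.0 - u))


def total_weight(agreement: dict) -> float:
    """Sum the field weights, assuming fields are conditionally independent."""
    return sum(
        match_weight(FIELDS[name][0], FIELDS[name][1], agrees)
        for name, agrees in agreement.items()
    )


def posterior_match_probability(weight: float, prior: float) -> float:
    """Bayes' theorem in odds form: posterior odds = prior odds * 2**weight."""
    prior_odds = prior / (1.0 - prior)
    posterior_odds = prior_odds * 2.0 ** weight
    return posterior_odds / (1.0 + posterior_odds)


if __name__ == "__main__":
    pattern = {"surname": True, "birth_year": True, "postcode": False}
    w = total_weight(pattern)
    # The prior of 1 in 1000 candidate pairs being a match is also hypothetical.
    print(w, posterior_match_probability(w, prior=0.001))
```

    A weight of zero leaves the prior unchanged, which is a convenient sanity check when choosing thresholds on the summed weight.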

  18. Linkage analyses of cannabis dependence, craving, and withdrawal in the San Francisco family study.

    PubMed

    Ehlers, Cindy L; Gizer, Ian R; Vieten, Cassandra; Wilhelmsen, Kirk C

    2010-04-05

    Cannabis is the most widely used illicit drug in the United States. There is ample evidence that cannabis use has a heritable component, yet the genes underlying cannabis use disorders are yet to be completely identified. This study's aims were to map susceptibility loci for cannabis use and dependence and two narrower cannabis-related phenotypes of "craving" and "withdrawal" using a family study design. Participants were 2,524 adults participating in the University of California San Francisco (UCSF) Family Alcoholism Study. DSM-IV diagnoses of cannabis dependence, as well as indices of cannabis craving and withdrawal, were obtained using a modified version of the Semi-Structured Assessment for the Genetics of Alcoholism (SSAGA). Genotypes were determined for a panel of 791 microsatellite polymorphisms. Multipoint variance component LOD scores were obtained using SOLAR. Genome-wide significance for linkage (LOD > 3.0) was not found for the DSM-IV cannabis dependence diagnosis; however, linkage analyses of cannabis "craving" and the cannabis withdrawal symptom of "nervous, tense, restless, or irritable" revealed five sites with LOD scores over 3.0 on chromosomes 1, 3, 6, 7, and 9. These results identify new regions of the genome associated with cannabis use phenotypes as well as corroborate the importance of several chromosome regions highlighted in previous linkage analyses for other substance dependence phenotypes.

  19. Linkage analyses of cannabis dependence, craving, and withdrawal in the San Francisco Family Study

    PubMed Central

    Ehlers, Cindy L.; Gizer, Ian R.; Vieten, Cassandra; Wilhelmsen, Kirk C.

    2010-01-01

    Cannabis is the most widely used illicit drug in the United States. There is ample evidence that cannabis use has a heritable component, yet the genes underlying cannabis use disorders are yet to be completely identified. This study's aims were to map susceptibility loci for cannabis use and dependence and two narrower cannabis-related phenotypes of “craving” and “withdrawal” using a family study design. Participants were 2524 adults participating in the University of California San Francisco (UCSF) Family Alcoholism Study. DSM-IV diagnoses of cannabis dependence, as well as indices of cannabis craving and withdrawal, were obtained using a modified version of the Semi-Structured Assessment for the Genetics of Alcoholism (SSAGA). Genotypes were determined for a panel of 791 microsatellite polymorphisms. Multipoint variance component LOD scores were obtained using SOLAR. Genome-wide significance for linkage (LOD > 3.0) was not found for the DSM-IV cannabis dependence diagnosis; however, linkage analyses of cannabis “craving” and the cannabis withdrawal symptom of “nervous, tense, restless or irritable” revealed five sites with LOD scores over 3.0 on chromosomes 1, 3, 6, 7, and 9. These results identify new regions of the genome associated with cannabis use phenotypes as well as corroborate the importance of several chromosome regions highlighted in previous linkage analyses for other substance dependence phenotypes. PMID:19937978

  20. GENOME-WIDE ASSOCIATION STUDY (GWAS) AND GENOME-WIDE BY ENVIRONMENT INTERACTION STUDY (GWEIS) OF DEPRESSIVE SYMPTOMS IN AFRICAN AMERICAN AND HISPANIC/LATINA WOMEN.

    PubMed

    Dunn, Erin C; Wiste, Anna; Radmanesh, Farid; Almli, Lynn M; Gogarten, Stephanie M; Sofer, Tamar; Faul, Jessica D; Kardia, Sharon L R; Smith, Jennifer A; Weir, David R; Zhao, Wei; Soare, Thomas W; Mirza, Saira S; Hek, Karin; Tiemeier, Henning; Goveas, Joseph S; Sarto, Gloria E; Snively, Beverly M; Cornelis, Marilyn; Koenen, Karestan C; Kraft, Peter; Purcell, Shaun; Ressler, Kerry J; Rosand, Jonathan; Wassertheil-Smoller, Sylvia; Smoller, Jordan W

    2016-04-01

    Genome-wide association studies (GWAS) have made little progress in identifying variants linked to depression. We hypothesized that examining depressive symptoms and considering gene-environment interaction (GxE) might improve efficiency for gene discovery. We therefore conducted a GWAS and genome-wide by environment interaction study (GWEIS) of depressive symptoms. Using data from the SHARe cohort of the Women's Health Initiative, comprising African Americans (n = 7,179) and Hispanics/Latinas (n = 3,138), we examined genetic main effects and GxE with stressful life events and social support. We also conducted a heritability analysis using genome-wide complex trait analysis (GCTA). Replication was attempted in four independent cohorts. No SNPs achieved genome-wide significance for main effects in either discovery sample. The top signals in African Americans were rs73531535 (located 20 kb from GPR139, P = 5.75 × 10(-8)) and rs75407252 (intronic to CACNA2D3, P = 6.99 × 10(-7)). In Hispanics/Latinas, the top signals were rs2532087 (located 27 kb from CD38, P = 2.44 × 10(-7)) and rs4542757 (intronic to DCC, P = 7.31 × 10(-7)). In the GWEIS with stressful life events, one interaction signal was genome-wide significant in African Americans (rs4652467; P = 4.10 × 10(-10); located 14 kb from CEP350). This interaction was not observed in a smaller replication cohort. Although heritability estimates for depressive symptoms and stressful life events were each less than 10%, they were strongly genetically correlated (rG = 0.95), suggesting that common variation underlying self-reported depressive symptoms and stressful life event exposure, though modest on their own, were highly overlapping in this sample. Our results underscore the need for larger samples, more GWEIS, and greater investigation into genetic and environmental determinants of depressive symptoms in minorities. © 2016 Wiley Periodicals, Inc.

  1. Genome-Wide Meta-Analysis of Myopia and Hyperopia Provides Evidence for Replication of 11 Loci

    PubMed Central

    Simpson, Claire L.; Wojciechowski, Robert; Oexle, Konrad; Murgia, Federico; Portas, Laura; Li, Xiaohui; Verhoeven, Virginie J. M.; Vitart, Veronique; Schache, Maria; Hosseini, S. Mohsen; Hysi, Pirro G.; Raffel, Leslie J.; Cotch, Mary Frances; Chew, Emily; Klein, Barbara E. K.; Klein, Ronald; Wong, Tien Yin; van Duijn, Cornelia M.; Mitchell, Paul; Saw, Seang Mei; Fossarello, Maurizio; Wang, Jie Jin; Polašek, Ozren; Campbell, Harry; Rudan, Igor; Oostra, Ben A.; Uitterlinden, André G.; Hofman, Albert; Rivadeneira, Fernando; Amin, Najaf; Karssen, Lennart C.; Vingerling, Johannes R.; Döring, Angela; Bettecken, Thomas; Bencic, Goran; Gieger, Christian; Wichmann, H.-Erich; Wilson, James F.; Venturini, Cristina; Fleck, Brian; Cumberland, Phillippa M.; Rahi, Jugnoo S.; Hammond, Chris J.; Hayward, Caroline; Wright, Alan F.; Paterson, Andrew D.; Baird, Paul N.; Klaver, Caroline C. W.; Rotter, Jerome I.; Pirastu, Mario; Meitinger, Thomas; Bailey-Wilson, Joan E.; Stambolian, Dwight

    2014-01-01

    Refractive error (RE) is a complex, multifactorial disorder characterized by a mismatch between the optical power of the eye and its axial length that causes object images to be focused off the retina. The two major subtypes of RE are myopia (nearsightedness) and hyperopia (farsightedness), which represent opposite ends of the distribution of the quantitative measure of spherical refraction. We performed a fixed effects meta-analysis of genome-wide association results of myopia and hyperopia from 9 studies of European-derived populations: AREDS, KORA, FES, OGP-Talana, MESA, RSI, RSII, RSIII and ERF. One genome-wide significant region was observed for myopia, corresponding to a previously identified myopia locus on 8q12 (p = 1.25×10−8), which has been reported by Kiefer et al. as significantly associated with myopia age at onset and Verhoeven et al. as significantly associated to mean spherical-equivalent (MSE) refractive error. We observed two genome-wide significant associations with hyperopia. These regions overlapped with loci on 15q14 (minimum p value = 9.11×10−11) and 8q12 (minimum p value 1.82×10−11) previously reported for MSE and myopia age at onset. We also used an intermarker linkage-disequilibrium-based method for calculating the effective number of tests in targeted regional replication analyses. We analyzed myopia (which represents the closest phenotype in our data to the one used by Kiefer et al.) and showed replication of 10 additional loci associated with myopia previously reported by Kiefer et al. This is the first replication of these loci using myopia as the trait under analysis. “Replication-level” association was also seen between hyperopia and 12 of Kiefer et al.'s published loci. For the loci that show evidence of association to both myopia and hyperopia, the estimated effects of the risk alleles were in opposite directions for the two traits. This suggests that these loci are important contributors to variation of

  2. Genome-wide meta-analysis of myopia and hyperopia provides evidence for replication of 11 loci.

    PubMed

    Simpson, Claire L; Wojciechowski, Robert; Oexle, Konrad; Murgia, Federico; Portas, Laura; Li, Xiaohui; Verhoeven, Virginie J M; Vitart, Veronique; Schache, Maria; Hosseini, S Mohsen; Hysi, Pirro G; Raffel, Leslie J; Cotch, Mary Frances; Chew, Emily; Klein, Barbara E K; Klein, Ronald; Wong, Tien Yin; van Duijn, Cornelia M; Mitchell, Paul; Saw, Seang Mei; Fossarello, Maurizio; Wang, Jie Jin; Polašek, Ozren; Campbell, Harry; Rudan, Igor; Oostra, Ben A; Uitterlinden, André G; Hofman, Albert; Rivadeneira, Fernando; Amin, Najaf; Karssen, Lennart C; Vingerling, Johannes R; Döring, Angela; Bettecken, Thomas; Bencic, Goran; Gieger, Christian; Wichmann, H-Erich; Wilson, James F; Venturini, Cristina; Fleck, Brian; Cumberland, Phillippa M; Rahi, Jugnoo S; Hammond, Chris J; Hayward, Caroline; Wright, Alan F; Paterson, Andrew D; Baird, Paul N; Klaver, Caroline C W; Rotter, Jerome I; Pirastu, Mario; Meitinger, Thomas; Bailey-Wilson, Joan E; Stambolian, Dwight

    2014-01-01

    Refractive error (RE) is a complex, multifactorial disorder characterized by a mismatch between the optical power of the eye and its axial length that causes object images to be focused off the retina. The two major subtypes of RE are myopia (nearsightedness) and hyperopia (farsightedness), which represent opposite ends of the distribution of the quantitative measure of spherical refraction. We performed a fixed effects meta-analysis of genome-wide association results of myopia and hyperopia from 9 studies of European-derived populations: AREDS, KORA, FES, OGP-Talana, MESA, RSI, RSII, RSIII and ERF. One genome-wide significant region was observed for myopia, corresponding to a previously identified myopia locus on 8q12 (p = 1.25×10(-8)), which has been reported by Kiefer et al. as significantly associated with myopia age at onset and Verhoeven et al. as significantly associated to mean spherical-equivalent (MSE) refractive error. We observed two genome-wide significant associations with hyperopia. These regions overlapped with loci on 15q14 (minimum p value = 9.11×10(-11)) and 8q12 (minimum p value 1.82×10(-11)) previously reported for MSE and myopia age at onset. We also used an intermarker linkage-disequilibrium-based method for calculating the effective number of tests in targeted regional replication analyses. We analyzed myopia (which represents the closest phenotype in our data to the one used by Kiefer et al.) and showed replication of 10 additional loci associated with myopia previously reported by Kiefer et al. This is the first replication of these loci using myopia as the trait under analysis. "Replication-level" association was also seen between hyperopia and 12 of Kiefer et al.'s published loci. For the loci that show evidence of association to both myopia and hyperopia, the estimated effects of the risk alleles were in opposite directions for the two traits. This suggests that these loci are important contributors to variation of refractive

  3. Tree decomposition based fast search of RNA structures including pseudoknots in genomes.

    PubMed

    Song, Yinglei; Liu, Chunmei; Malmberg, Russell; Pan, Fangfang; Cai, Liming

    2005-01-01

    Searching genomes for RNA secondary structure with computational methods has become an important approach to the annotation of non-coding RNAs. However, due to the lack of efficient algorithms for accurate RNA structure-sequence alignment, computer programs capable of searching genomes for RNA secondary structures quickly and effectively have not been available. In this paper, a novel RNA structure profiling model is introduced based on the notion of a conformational graph to specify the consensus structure of an RNA family. Tree decomposition yields a small tree width t for such conformational graphs (e.g., t = 2 for stem loops and only a slight increase for pseudoknots). Within this modelling framework, the optimal alignment of a sequence to the structure model corresponds to finding a maximum valued isomorphic subgraph and consequently can be accomplished through dynamic programming on the tree decomposition of the conformational graph in time O(k^t N^2), where k is a small parameter and N is the size of the profiled RNA structure. Experiments show that the application of the alignment algorithm to search in genomes yields the same search accuracy as methods based on a covariance model with a significant reduction in computation time. In particular, very accurate searches of tmRNAs in bacteria genomes and of telomerase RNAs in yeast genomes can be accomplished in days, as opposed to months required by other methods. The tree decomposition based searching tool is free upon request and can be downloaded at our site http://w.uga.edu/RNA-informatics/software/index.php.

  4. Segment-Wise Genome-Wide Association Analysis Identifies a Candidate Region Associated with Schizophrenia in Three Independent Samples

    PubMed Central

    Rietschel, Marcella; Mattheisen, Manuel; Breuer, René; Schulze, Thomas G.; Nöthen, Markus M.; Levinson, Douglas; Shi, Jianxin; Gejman, Pablo V.; Cichon, Sven; Ophoff, Roel A.

    2012-01-01

    Recent studies suggest that variation in complex disorders (e.g., schizophrenia) is explained by a large number of genetic variants with small effect size (Odds Ratio∼1.05–1.1). The statistical power to detect these genetic variants in Genome Wide Association (GWA) studies with large numbers of cases and controls (∼15,000) is still low. As it will be difficult to further increase sample size, we decided to explore an alternative method for analyzing GWA data in a study of schizophrenia, dramatically reducing the number of statistical tests. The underlying hypothesis was that at least some of the genetic variants related to a common outcome are collocated in segments of chromosomes at a wider scale than single genes. Our approach was therefore to study the association between relatively large segments of DNA and disease status. An association test was performed for each SNP and the number of nominally significant tests in a segment was counted. We then performed a permutation-based binomial test to determine whether this region contained significantly more nominally significant SNPs than expected under the null hypothesis of no association, taking linkage into account. Genome Wide Association data of three independent schizophrenia case/control cohorts with European ancestry (Dutch, German, and US) using segments of DNA with variable length (2 to 32 Mbp) was analyzed. Using this approach we identified a region at chromosome 5q23.3-q31.3 (128–160 Mbp) that was significantly enriched with nominally associated SNPs in three independent case-control samples. We conclude that considering relatively wide segments of chromosomes may reveal reliable relationships between the genome and schizophrenia, suggesting novel methodological possibilities as well as raising theoretical questions. PMID:22723893
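
    The segment-level test described in this abstract can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' method: it counts nominally significant SNPs in one segment and computes a plain binomial upper-tail probability, treating SNPs as independent, whereas the study used a permutation-based version that takes linkage into account. The function names and the example p-values are hypothetical.

```python
from math import exp, lgamma, log


def binomial_upper_tail(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p), with 0 < p < 1.

    Terms are computed in log space so that large n does not overflow.
    """
    def log_pmf(i: int) -> float:
        return (
            lgamma(n + 1) - lgamma(i + 1) - lgamma(n - i + 1)
            + i * log(p) + (n - i) * log(1.0 - p)
        )

    return sum(exp(log_pmf(i)) for i in range(k, n + 1))


def segment_enrichment(p_values, alpha: float = 0.05):
    """Count SNPs with p < alpha in a segment and test for an excess.

    Assumes independent SNPs (an assumption of this sketch only); under the
    null each SNP is nominally significant with probability alpha.
    """
    n = len(p_values)
    hits = sum(1 for p in p_values if p < alpha)
    return hits, binomial_upper_tail(hits, n, alpha)


if __name__ == "__main__":
    # Hypothetical per-SNP association p-values for one segment.
    segment = [0.01, 0.2, 0.03, 0.5]
    print(segment_enrichment(segment))
```

    Because neighbouring SNPs are correlated, the independence assumption makes this tail probability too small in real data; replacing the analytic tail with one estimated from phenotype permutations is the natural correction.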

  5. Extensive genome-wide autozygosity in the population isolates of Daghestan.

    PubMed

    Karafet, Tatiana M; Bulayeva, Kazima B; Bulayev, Oleg A; Gurgenova, Farida; Omarova, Jamilia; Yepiskoposyan, Levon; Savina, Olga V; Veeramah, Krishna R; Hammer, Michael F

    2015-10-01

    Isolated populations are valuable resources for mapping disease genes, as inbreeding increases genome-wide homozygosity and enhances the ability to map disease alleles on a genetically uniform background within a relatively homogenous environment. The populations of Daghestan are thought to have resided in the Caucasus Mountains for hundreds of generations and are characterized by a high prevalence of certain complex diseases. To explore the extent to which their unique population history led to increased levels of inbreeding, we genotyped >550 000 autosomal single-nucleotide polymorphisms (SNPs) in a set of 14 population isolates speaking Nakh-Daghestanian (ND) languages. The ND-speaking populations showed greatly elevated coefficients of inbreeding, very high numbers and long lengths of Runs of Homozygosity, and elevated linkage disequilibrium compared with surrounding groups from the Caucasus, the Near East, Europe, Central and South Asia. These results are consistent with the hypothesis that most ND-speaking groups descend from a common ancestral population that fragmented into a series of genetic isolates in the Daghestanian highlands. They have subsequently maintained a long-term small effective population size as a result of constant inbreeding and very low levels of gene flow. Given these findings, Daghestanian population isolates are likely to be useful for mapping genes associated with complex diseases.

  6. Discrimination of candidate subgenome-specific loci by linkage map construction with an S1 population of octoploid strawberry (Fragaria × ananassa).

    PubMed

    Nagano, Soichiro; Shirasawa, Kenta; Hirakawa, Hideki; Maeda, Fumi; Ishikawa, Masami; Isobe, Sachiko N

    2017-05-12

    The strawberry, Fragaria × ananassa, is an allo-octoploid (2n = 8x = 56) and outcrossing species. Although it is the most widely consumed berry crop in the world, its complex genome structure has hindered its genetic and genomic analysis, and thus discrimination of subgenome-specific loci among the homoeologous chromosomes is needed. In the present study, we identified candidate subgenome-specific single nucleotide polymorphism (SNP) and simple sequence repeat (SSR) loci, and constructed a linkage map using an S1 mapping population of the cultivar 'Reikou' with an IStraw90 Axiom® SNP array and previously published SSR markers. The 'Reikou' linkage map consisted of 11,574 loci (11,002 SNPs and 572 SSR loci) spanning 2816.5 cM of 31 linkage groups. The 11,574 loci were located on 4738 unique positions (bins) on the linkage map. Of the mapped loci, 8999 (8588 SNPs and 411 SSR loci) showed a 1:2:1 segregation ratio of AA:AB:BB allele, which suggested the possibility of deriving loci from candidate subgenome-specific sequences. In addition, 2575 loci (2414 SNPs and 161 SSR loci) showed a 3:1 segregation of AB:BB allele, indicating they were derived from homoeologous genomic sequences. Comparative analysis of the homoeologous linkage groups revealed differences in genome structure among the subgenomes. Our results suggest that candidate subgenome-specific loci are randomly located across the genomes, and that there are small- to large-scale structural variations among the subgenomes. The mapped SNPs and SSR loci on the linkage map are expected to be seed points for the construction of pseudomolecules in the octoploid strawberry.
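
    Classifying a locus by its segregation ratio, as this abstract does for 1:2:1 and 3:1 loci, is commonly done with a chi-square goodness-of-fit test. The sketch below is an assumption about how such a check could look, not the authors' pipeline; the genotype counts are hypothetical, and the critical values are the standard 5% chi-square thresholds for 1 and 2 degrees of freedom.

```python
# Standard 5% critical values of the chi-square distribution.
CHI2_CRITICAL_5PCT = {1: 3.841, 2: 5.991}


def chi_square_statistic(observed, ratio):
    """Goodness-of-fit statistic for observed counts against an expected ratio."""
    total = sum(observed)
    ratio_total = sum(ratio)
    expected = [total * r / ratio_total for r in ratio]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def fits_ratio(observed, ratio) -> bool:
    """True if the counts are consistent with the ratio at the 5% level."""
    df = len(observed) - 1
    return chi_square_statistic(observed, ratio) < CHI2_CRITICAL_5PCT[df]


if __name__ == "__main__":
    # Hypothetical genotype counts at two loci in a selfed (S1) population.
    print(fits_ratio((30, 40, 30), (1, 2, 1)))  # AA:AB:BB classes
    print(fits_ratio((75, 25), (3, 1)))         # two scorable classes
```

    In a map with thousands of loci, the per-locus threshold would normally be adjusted for multiple testing; that step is omitted here.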

  7. Lessons from ten years of genome-wide association studies of asthma

    PubMed Central

    Vicente, Cristina T; Revez, Joana A; Ferreira, Manuel A R

    2017-01-01

    Twenty-five genome-wide association studies (GWAS) of asthma were published between 2007 and 2016, the largest with a sample size of 157242 individuals. Across these studies, 39 genetic variants in low linkage disequilibrium (LD) with each other were reported to associate with disease risk at a significance threshold of P<5 × 10−8, including 31 in populations of European ancestry. Results from analyses of the UK Biobank data (n=380 503) indicate that at least 28 of the 31 associations reported in Europeans represent true-positive findings, collectively explaining 2.5% of the variation in disease liability (median of 0.06% per variant). We identified 49 transcripts as likely target genes of the published asthma risk variants, mostly based on LD with expression quantitative trait loci (eQTL). Of these genes, 16 were previously implicated in disease pathophysiology by functional studies, including TSLP, TNFSF4, ADORA1, CHIT1 and USF1. In contrast, at present, there is limited or no functional evidence directly implicating the remaining 33 likely target genes in asthma pathophysiology. Some of these genes have a known function that is relevant to allergic disease, including F11R, CD247, PGAP3, AAGAB, CAMK4 and PEX14, and so could be prioritized for functional follow-up. We conclude by highlighting three areas of research that are essential to help translate GWAS findings into clinical research or practice, namely validation of target gene predictions, understanding target gene function and their role in disease pathophysiology and genomics-guided prioritization of targets for drug development. PMID:29333270

  8. Genome-wide mapping of virulence in brown planthopper identifies loci that break down host plant resistance.

    PubMed

    Jing, Shengli; Zhang, Lei; Ma, Yinhua; Liu, Bingfang; Zhao, Yan; Yu, Hangjin; Zhou, Xi; Qin, Rui; Zhu, Lili; He, Guangcun

    2014-01-01

    Insects and plants have coexisted for over 350 million years and their interactions have affected ecosystems and agricultural practices worldwide. Variation in herbivorous insects' virulence to circumvent host resistance has been extensively documented. However, despite decades of investigation, the genetic foundations of virulence are currently unknown. The brown planthopper (Nilaparvata lugens) is the most destructive rice (Oryza sativa) pest in the world. The identification of the resistance gene Bph1 and its introduction in commercial rice varieties prompted the emergence of a new virulent brown planthopper biotype that was able to break the resistance conferred by Bph1. In this study, we aimed to construct a high density linkage map for the brown planthopper and identify the loci responsible for its virulence in order to determine their genetic architecture. Based on genotyping data for hundreds of molecular markers in three mapping populations, we constructed the most comprehensive linkage map available for this species, covering 96.6% of its genome. Fifteen chromosomes were anchored with 124 gene-specific markers. Using genome-wide scanning and interval mapping, the Qhp7 locus that governs preference for Bph1 plants was mapped to a 0.1 cM region of chromosome 7. In addition, two major QTLs that govern the rate of insect growth on resistant rice plants were identified on chromosomes 5 (Qgr5) and 14 (Qgr14). This is the first study to successfully locate virulence in the genome of this important agricultural insect by marker-based genetic mapping. Our results show that the virulence which overcomes the resistance conferred by Bph1 is controlled by a few major genes and that the components of virulence originate from independent genetic characters. 
    The isolation of these loci will enable the elucidation of the molecular mechanisms underpinning the rice-brown planthopper interaction and facilitate the development of durable approaches for controlling this most

  9. Genome-Wide Mapping of Virulence in Brown Planthopper Identifies Loci That Break Down Host Plant Resistance

    PubMed Central

    Jing, Shengli; Zhang, Lei; Ma, Yinhua; Liu, Bingfang; Zhao, Yan; Yu, Hangjin; Zhou, Xi; Qin, Rui; Zhu, Lili; He, Guangcun

    2014-01-01

    Insects and plants have coexisted for over 350 million years and their interactions have affected ecosystems and agricultural practices worldwide. Variation in herbivorous insects' virulence to circumvent host resistance has been extensively documented. However, despite decades of investigation, the genetic foundations of virulence are currently unknown. The brown planthopper (Nilaparvata lugens) is the most destructive rice (Oryza sativa) pest in the world. The identification of the resistance gene Bph1 and its introduction in commercial rice varieties prompted the emergence of a new virulent brown planthopper biotype that was able to break the resistance conferred by Bph1. In this study, we aimed to construct a high density linkage map for the brown planthopper and identify the loci responsible for its virulence in order to determine their genetic architecture. Based on genotyping data for hundreds of molecular markers in three mapping populations, we constructed the most comprehensive linkage map available for this species, covering 96.6% of its genome. Fifteen chromosomes were anchored with 124 gene-specific markers. Using genome-wide scanning and interval mapping, the Qhp7 locus that governs preference for Bph1 plants was mapped to a 0.1 cM region of chromosome 7. In addition, two major QTLs that govern the rate of insect growth on resistant rice plants were identified on chromosomes 5 (Qgr5) and 14 (Qgr14). This is the first study to successfully locate virulence in the genome of this important agricultural insect by marker-based genetic mapping. Our results show that the virulence which overcomes the resistance conferred by Bph1 is controlled by a few major genes and that the components of virulence originate from independent genetic characters. 
    The isolation of these loci will enable the elucidation of the molecular mechanisms underpinning the rice-brown planthopper interaction and facilitate the development of durable approaches for controlling this most

  10. Meta-analysis of genome-wide association from genomic prediction models

    USDA-ARS's Scientific Manuscript database

    A limitation of many genome-wide association studies (GWA) in animal breeding is that there are many loci with small effect sizes; thus, larger sample sizes (N) are required to guarantee suitable power of detection. To increase sample size, results from different GWA can be combined in a meta-analys...

  11. Pervasive, Genome-Wide Transcription in the Organelle Genomes of Diverse Plastid-Bearing Protists.

    PubMed

    Sanitá Lima, Matheus; Smith, David Roy

    2017-11-06

    Organelle genomes are among the most sequenced kinds of chromosome. This is largely because they are small and widely used in molecular studies, but also because next-generation sequencing technologies made sequencing easier, faster, and cheaper. However, studies of organelle RNA have not kept pace with those of DNA, despite huge amounts of freely available eukaryotic RNA-sequencing (RNA-seq) data. Little is known about organelle transcription in nonmodel species, and most of the available eukaryotic RNA-seq data have not been mined for organelle transcripts. Here, we use publicly available RNA-seq experiments to investigate organelle transcription in 30 diverse plastid-bearing protists with varying organelle genomic architectures. Mapping RNA-seq data to organelle genomes revealed pervasive, genome-wide transcription, regardless of the taxonomic grouping, gene organization, or noncoding content. For every species analyzed, transcripts covered ≥85% of the mitochondrial and/or plastid genomes (all of which were ≤105 kb), indicating that most of the organelle DNA, coding and noncoding, is transcriptionally active. These results follow earlier studies of model species showing that organellar transcription is coupled and ubiquitous across the genome, requiring significant downstream processing of polycistronic transcripts. Our findings suggest that noncoding organelle DNA can be transcriptionally active, raising questions about the underlying function of these transcripts and underscoring the utility of publicly available RNA-seq data for recovering complete genome sequences. If pervasive transcription is also found in bigger organelle genomes (>105 kb) and across a broader range of eukaryotes, this could indicate that noncoding organelle RNAs are regulating fundamental processes within eukaryotic cells. Copyright © 2017 Sanitá Lima and Smith.

  12. Structure, evolution, and comparative genomics of tetraploid cotton based on a high-density genetic linkage map

    PubMed Central

    Li, Ximei; Jin, Xin; Wang, Hantao; Zhang, Xianlong; Lin, Zhongxu

    2016-01-01

    A high-density linkage map was constructed using 1,885 newly obtained loci and 3,747 previously published loci, which included 5,152 loci with 4696.03 cM in total length and 0.91 cM in mean distance. Homology analysis in the cotton genome further confirmed the 13 expected homologous chromosome pairs and revealed an obvious inversion on Chr10 or Chr20 and repeated inversions on Chr07 or Chr16. In addition, two reciprocal translocations between Chr02 and Chr03 and between Chr04 and Chr05 were confirmed. Comparative genomics between the tetraploid cotton and the diploid cottons showed that no major structural changes exist between DT and D chromosomes but rather between AT and A chromosomes. Blast analysis between the tetraploid cotton genome and the mixed genome of two diploid cottons showed that most AD chromosomes, regardless of whether it is from the AT or DT genome, preferentially matched with the corresponding homologous chromosome in the diploid A genome, and then the corresponding homologous chromosome in the diploid D genome, indicating that the diploid D genome underwent converted evolution by the diploid A genome to form the DT genome during polyploidization. In addition, the results reflected that a series of chromosomal translocations occurred among Chr01/Chr15, Chr02/Chr14, Chr03/Chr17, Chr04/Chr22, and Chr05/Chr19. PMID:27084896
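
    The mean marker distance quoted above can be checked with simple arithmetic. The sketch below assumes the figure is the total map length divided by the number of loci, which the abstract does not state explicitly; the function name is illustrative only.

```python
def mean_marker_spacing(total_length_cm: float, n_loci: int) -> float:
    """Average map distance per locus, in cM.

    Assumption: spacing is reported as total length / number of loci,
    not as the mean of the individual inter-marker intervals.
    """
    if n_loci <= 0:
        raise ValueError("n_loci must be positive")
    return total_length_cm / n_loci


# Figures from the cotton map abstract: 5,152 loci over 4696.03 cM.
spacing = mean_marker_spacing(4696.03, 5152)
print(f"{spacing:.2f} cM per locus")
```

    Under that assumption the result rounds to the 0.91 cM given in the abstract.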

  13. Construction of an Integrated High Density Simple Sequence Repeat Linkage Map in Cultivated Strawberry (Fragaria × ananassa) and its Applicability

    PubMed Central

    Isobe, Sachiko N.; Hirakawa, Hideki; Sato, Shusei; Maeda, Fumi; Ishikawa, Masami; Mori, Toshiki; Yamamoto, Yuko; Shirasawa, Kenta; Kimura, Mitsuhiro; Fukami, Masanobu; Hashizume, Fujio; Tsuji, Tomoko; Sasamoto, Shigemi; Kato, Midori; Nanri, Keiko; Tsuruoka, Hisano; Minami, Chiharu; Takahashi, Chika; Wada, Tsuyuko; Ono, Akiko; Kawashima, Kumiko; Nakazaki, Naomi; Kishida, Yoshie; Kohara, Mitsuyo; Nakayama, Shinobu; Yamada, Manabu; Fujishiro, Tsunakazu; Watanabe, Akiko; Tabata, Satoshi

    2013-01-01

    The cultivated strawberry (Fragaria × ananassa) is an octoploid (2n = 8x = 56) of the Rosaceae family whose genomic architecture is still controversial. Several recent studies support the AAA′A′BBB′B′ model, but its complexity has hindered genetic and genomic analysis of this important crop. To overcome this difficulty and to assist genome-wide analysis of F. × ananassa, we constructed an integrated linkage map by organizing a total of 4474 simple sequence repeat (SSR) markers collected from published Fragaria sequences, including 3746 SSR markers [Fragaria vesca expressed sequence tag (EST)-derived SSR markers] derived from F. vesca ESTs, 603 markers (F. × ananassa EST-derived SSR markers) from F. × ananassa ESTs, and 125 markers (F. × ananassa transcriptome-derived SSR markers) from F. × ananassa transcripts. Along with the previously published SSR markers, these markers were mapped onto five parent-specific linkage maps derived from three mapping populations, which were then assembled into an integrated linkage map. The constructed map consists of 1856 loci in 28 linkage groups (LGs) that total 2364.1 cM in length. Macrosynteny at the chromosome level was observed between the LGs of F. × ananassa and the genome of F. vesca. Variety distinction on 129 F. × ananassa lines was demonstrated using 45 selected SSR markers. PMID:23248204

  14. Genome-wide generation and use of informative intron-spanning and intron-length polymorphism markers for high-throughput genetic analysis in rice

    PubMed Central

    Badoni, Saurabh; Das, Sweta; Sayal, Yogesh K.; Gopalakrishnan, S.; Singh, Ashok K.; Rao, Atmakuri R.; Agarwal, Pinky; Parida, Swarup K.; Tyagi, Akhilesh K.

    2016-01-01

    We developed genome-wide 84634 ISM (intron-spanning marker) and 16510 InDel-fragment length polymorphism-based ILP (intron-length polymorphism) markers from genes physically mapped on 12 rice chromosomes. These genic markers revealed much higher amplification-efficiency (80%) and polymorphic-potential (66%) among rice accessions even by a cost-effective agarose gel-based assay. A wider level of functional molecular diversity (17–79%) and well-defined precise admixed genetic structure was assayed by 3052 genome-wide markers in a structured population of indica, japonica, aromatic and wild rice. Six major grain weight QTLs (11.9–21.6% phenotypic variation explained) were mapped on five rice chromosomes of a high-density (inter-marker distance: 0.98 cM) genetic linkage map (IR 64 x Sonasal) anchored with 2785 known/candidate gene-derived ISM and ILP markers. The designing of multiple ISM and ILP markers (2 to 4 markers/gene) in an individual gene will broaden the user-preference to select suitable primer combination for efficient assaying of functional allelic variation/diversity and realistic estimation of differential gene expression profiles among rice accessions. The genomic information generated in our study is made publicly accessible through a user-friendly web-resource, “Oryza ISM-ILP marker” database. The known/candidate gene-derived ISM and ILP markers can be enormously deployed to identify functionally relevant trait-associated molecular tags by optimal-resource expenses, leading towards genomics-assisted crop improvement in rice. PMID:27032371

  15. Family-Based Genome-Wide Association Scan of Attention-Deficit/Hyperactivity Disorder

    ERIC Educational Resources Information Center

    Mick, Eric; Todorov, Alexandre; Smalley, Susan; Hu, Xiaolan; Loo, Sandra; Todd, Richard D.; Biederman, Joseph; Byrne, Deirdre; Dechairo, Bryan; Guiney, Allan; McCracken, James; McGough, James; Nelson, Stanley F.; Reiersen, Angela M.; Wilens, Timothy E.; Wozniak, Janet; Neale, Benjamin M.; Faraone, Stephen V.

    2010-01-01

    Objective: Genes likely play a substantial role in the etiology of attention-deficit/hyperactivity disorder (ADHD). However, the genetic architecture of the disorder is unknown, and prior genome-wide association studies (GWAS) have not identified a genome-wide significant association. We have conducted a third, independent, multisite GWAS of…

  16. Saturated linkage map construction in Rubus idaeus using genotyping by sequencing and genome-independent imputation

    PubMed Central

    2013-01-01

    Background: Rapid development of highly saturated genetic maps aids molecular breeding, which can accelerate gain per breeding cycle in woody perennial plants such as Rubus idaeus (red raspberry). Recently, robust genotyping methods based on high-throughput sequencing were developed, which provide high marker density, but result in some genotype errors and a large number of missing genotype values. Imputation can reduce the number of missing values and can correct genotyping errors, but current methods of imputation require a reference genome and thus are not an option for most species. Results: Genotyping by Sequencing (GBS) was used to produce highly saturated maps for a R. idaeus pseudo-testcross progeny. While low coverage and high variance in sequencing resulted in a large number of missing values for some individuals, a novel method of imputation based on maximum likelihood marker ordering from initial marker segregation overcame the challenge of missing values, and made map construction computationally tractable. The two resulting parental maps contained 4521 and 2391 molecular markers spanning 462.7 and 376.6 cM respectively over seven linkage groups. Detection of precise genomic regions with segregation distortion was possible because of map saturation. Microsatellites (SSRs) linked these results to published maps for cross-validation and map comparison. Conclusions: GBS together with genome-independent imputation provides a rapid method for genetic map construction in any pseudo-testcross progeny. Our method of imputation estimates the correct genotype call of missing values and corrects genotyping errors that lead to inflated map size and reduced precision in marker placement. Comparison of SSRs to published R. idaeus maps showed that the linkage maps constructed with GBS and our method of imputation were robust, and marker positioning reliable. The high marker density allowed identification of genomic regions with segregation distortion in R. idaeus, which

  17. A systematic review of genome-wide research on psychotic experiences and negative symptom traits: New revelations and implications for psychiatry.

    PubMed

    Ronald, Angelica; Pain, Oliver

    2018-05-08

    We present a systematic review of genome-wide research on psychotic experience and negative symptom traits (PENS) in the community. We integrate these new findings, most of which have emerged over the last four years, with more established behaviour genetic and epidemiological research. The review includes the first genome-wide association studies of PENS, including a recent meta-analysis, and the first SNP heritability estimates. Sample sizes of < 10,000 participants mean that no genome-wide significant variants have yet been replicated. Importantly, however, in the most recent and well-powered studies, polygenic risk score prediction and linkage disequilibrium (LD) score regression analyses show that all types of PENS share genetic influences with diagnosed schizophrenia and that negative symptom traits also share genetic influences with major depression. These genetic findings corroborate other evidence in supporting a link between PENS in the community and psychiatric conditions. Beyond the systematic review, we highlight recent work on gene-environment correlation, which appears to be a relevant process for psychotic experiences. Genes that influence risk factors such as tobacco use and stressful life events are likely to be harbouring 'hits' that also influence PENS. We argue for the acceptance of PENS within the mainstream, as heritable traits in the same vein as other subclinical psychopathology and personality styles such as neuroticism. While acknowledging some mixed findings, new evidence shows genetic overlap between PENS and psychiatric conditions. In sum, normal variations in adolescent and adult thinking styles, such as feeling paranoid, are heritable and show genetic associations with schizophrenia and major depression.

  18. Transferability and Fine-Mapping of Genome-Wide Associated Loci for Adult Height across Human Populations

    PubMed Central

    Shriner, Daniel; Adeyemo, Adebowale; Gerry, Norman P.; Herbert, Alan; Chen, Guanjie; Doumatey, Ayo; Huang, Hanxia; Zhou, Jie; Christman, Michael F.; Rotimi, Charles N.

    2009-01-01

    Human height is the prototypical polygenic quantitative trait. Recently, several genetic variants influencing adult height were identified, primarily in individuals of East Asian (Chinese Han or Korean) or European ancestry. Here, we examined 152 genetic variants representing 107 independent loci previously associated with adult height for transferability in a well-powered sample of 1,016 unrelated African Americans. When we tested just the reported variants originally identified as associated with adult height in individuals of East Asian or European ancestry, only 8.3% of these loci transferred (p-values ≤ 0.05 under an additive genetic model with directionally consistent effects) to our African American sample. However, when we comprehensively evaluated all HapMap variants in linkage disequilibrium (r² ≥ 0.3) with the reported variants, the transferability rate increased to 54.1%. The transferability rate was 70.8% for associations originally reported as genome-wide significant and 38.0% for associations originally reported as suggestive. An additional 23 loci were significantly associated but failed to transfer because of directionally inconsistent effects. Six loci were associated with adult height in all three groups. Using differences in linkage disequilibrium patterns between HapMap CEU or CHB reference data and our African American sample, we fine-mapped these six loci, improving both the localization and the annotation of these transferable associations. PMID:20027299
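
    The transferability criterion described above (a p-value at or below 0.05 together with a directionally consistent effect) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the `Association` record, the function names, and the four example loci are hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Association:
    locus: str               # hypothetical locus label
    beta_original: float     # effect estimate in the discovery population
    beta_replication: float  # effect estimate in the new sample
    p_replication: float     # p-value in the new sample


def transfers(a: Association, alpha: float = 0.05) -> bool:
    """True when the locus is nominally significant AND the effect
    points in the same direction as originally reported."""
    same_direction = a.beta_original * a.beta_replication > 0
    return a.p_replication <= alpha and same_direction


def transferability_rate(assocs: Sequence[Association], alpha: float = 0.05) -> float:
    """Fraction of loci that satisfy the transfer criterion."""
    if not assocs:
        raise ValueError("need at least one association")
    return sum(transfers(a, alpha) for a in assocs) / len(assocs)


# Made-up values, for illustration only.
example = [
    Association("locus_A", 0.04, 0.03, 0.01),    # transfers
    Association("locus_B", 0.05, -0.04, 0.002),  # significant, wrong direction
    Association("locus_C", -0.03, -0.01, 0.40),  # right direction, not significant
    Association("locus_D", -0.02, -0.03, 0.05),  # transfers (p equals alpha)
]
print(transferability_rate(example))
```

    A locus such as the hypothetical locus_B corresponds to the abstract's category of loci that were significantly associated but failed to transfer because of directionally inconsistent effects.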

  19. Linkage of osteoporosis to chromosome 20p12 and association to BMP2.

    PubMed

    Styrkarsdottir, Unnur; Cazier, Jean-Baptiste; Kong, Augustine; Rolfsson, Ottar; Larsen, Helene; Bjarnadottir, Emma; Johannsdottir, Vala D; Sigurdardottir, Margret S; Bagger, Yu; Christiansen, Claus; Reynisdottir, Inga; Grant, Struan F A; Jonasson, Kristjan; Frigge, Michael L; Gulcher, Jeffrey R; Sigurdsson, Gunnar; Stefansson, Kari

    2003-12-01

    Osteoporotic fractures are a major cause of morbidity and mortality in ageing populations. Osteoporosis, defined as low bone mineral density (BMD) and associated fractures, has significant genetic components that are largely unknown. Linkage analysis in a large number of extended osteoporosis families in Iceland, using a phenotype that combines osteoporotic fractures and BMD measurements, showed linkage to Chromosome 20p12.3 (multipoint allele-sharing LOD, 5.10; p value, 6.3 × 10^-7), results that are statistically significant after adjusting for the number of phenotypes tested and the genome-wide search. A follow-up association analysis using closely spaced polymorphic markers was performed. Three variants in the bone morphogenetic protein 2 (BMP2) gene, a missense polymorphism and two anonymous single nucleotide polymorphism haplotypes, were determined to be associated with osteoporosis in the Icelandic patients. The association is seen with many definitions of an osteoporotic phenotype, including osteoporotic fractures as well as low BMD, both before and after menopause. A replication study with a Danish cohort of postmenopausal women was conducted to confirm the contribution of the three identified variants. In conclusion, we find that a region on the short arm of Chromosome 20 contains a gene or genes that appear to be a major risk factor for osteoporosis and osteoporotic fractures, and our evidence supports the view that BMP2 is at least one of these genes.
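
    The LOD score and p value quoted above can be related with a short calculation. The sketch below assumes the usual large-sample approximation for a one-parameter allele-sharing model, in which 2 ln(10) × LOD behaves like a chi-square with one degree of freedom tested one-sided; the abstract does not say how its p value was obtained, so this is an illustration rather than the authors' method, and the function name is hypothetical.

```python
import math


def allele_sharing_lod_to_p(lod: float) -> float:
    """Approximate one-sided p-value for an allele-sharing LOD score.

    Assumption: 2*ln(10)*LOD is asymptotically chi-square (1 df) with a
    one-sided alternative, so p is the upper tail of a standard normal
    evaluated at z = sqrt(2*ln(10)*LOD).
    """
    if lod < 0:
        raise ValueError("LOD must be non-negative")
    z = math.sqrt(2.0 * math.log(10.0) * lod)
    # Upper normal tail: P(Z > z) = 0.5 * erfc(z / sqrt(2))
    return 0.5 * math.erfc(z / math.sqrt(2.0))


p = allele_sharing_lod_to_p(5.10)
print(f"LOD 5.10 -> p = {p:.2e}")
```

    Under this assumption a LOD of 5.10 gives a p value of roughly 6.3 × 10^-7, in line with the figure reported in the abstract.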

  20. Case-Control Genome-Wide Association Study of Attention-Deficit/Hyperactivity Disorder

    ERIC Educational Resources Information Center

    Neale, Benjamin M.; Medland, Sarah; Ripke, Stephan; Anney, Richard J. L.; Asherson, Philip; Buitelaar, Jan; Franke, Barbara; Gill, Michael; Kent, Lindsey; Holmans, Peter; Middleton, Frank; Thapar, Anita; Lesch, Klaus-Peter; Faraone, Stephen V.; Daly, Mark; Nguyen, Thuy Trang; Schafer, Helmut; Steinhausen, Hans-Christoph; Reif, Andreas; Renner, Tobias J.; Romanos, Marcel; Romanos, Jasmin; Warnke, Andreas; Walitza, Susanne; Freitag, Christine; Meyer, Jobst; Palmason, Haukur; Rothenberger, Aribert; Hawi, Ziarih; Sergeant, Joseph; Roeyers, Herbert; Mick, Eric; Biederman, Joseph

    2010-01-01

    Objective: Although twin and family studies have shown attention-deficit/hyperactivity disorder (ADHD) to be highly heritable, genetic variants influencing the trait at a genome-wide significant level have yet to be identified. Thus additional genome-wide association studies (GWAS) are needed. Method: We used case-control analyses of 896 cases…

  1. Linkage analysis of high myopia susceptibility locus in 26 families.

    PubMed

    Paget, Sandrine; Julia, Sophie; Vitezica, Zulma G; Soler, Vincent; Malecaze, François; Calvas, Patrick

    2008-01-01

    We conducted a linkage analysis in high myopia families to replicate suggestive results from chromosome 7q36 using a model of autosomal dominant inheritance and genetic heterogeneity. We also performed a genome-wide scan to identify novel loci. Twenty-six families, with at least two high-myopic subjects (i.e., refractive value in the less affected eye of -5 diopters) in each family, were included. Phenotypic examination included standard autorefractometry, ultrasonographic eye length measurement, and clinical confirmation of the non-syndromic character of the refractive disorder. Nine families were collected de novo including 136 available members of whom 34 were highly myopic subjects. Twenty new subjects were added in 5 of the 17 remaining families. A total of 233 subjects were submitted to a genome scan using ABI linkage mapping set LMSv2-MD-10; additional markers were used in all regions where preliminary LOD scores were greater than 1.5. Multipoint parametric and non-parametric analyses were conducted with the software packages Genehunter 2.0 and Merlin 1.0.1. Two autosomal recessive, two autosomal dominant, and four autosomal additive models were used in the parametric linkage analyses. No linkage was found using the subset of nine newly collected families. Study of the entire population of 26 families with a parametric model did not yield a significant LOD score (>3), even for the previously suggestive locus on 7q36. A non-parametric model demonstrated significant linkage to chromosome 7p15 in the entire population (Z-NPL=4.07, p=0.00002). The interval is 7.81 centiMorgans (cM) between markers D7S2458 and D7S2515. The significant interval reported here needs confirmation in other cohorts. Among possible susceptibility genes in the interval, certain candidates are likely to be involved in eye growth and development.

  2. Stratified Whole Genome Linkage Analysis of Chiari Type I Malformation Implicates Known Klippel-Feil Syndrome Genes as Putative Disease Candidates

    PubMed Central

    Markunas, Christina A.; Soldano, Karen; Dunlap, Kaitlyn; Cope, Heidi; Asiimwe, Edgar; Stajich, Jeffrey; Enterline, David; Grant, Gerald; Fuchs, Herbert

    2013-01-01

    Chiari Type I Malformation (CMI) is characterized by displacement of the cerebellar tonsils below the base of the skull, resulting in significant neurologic morbidity. Although multiple lines of evidence support a genetic contribution to disease, no genes have been identified. We therefore conducted the largest whole genome linkage screen to date using 367 individuals from 66 families with at least two individuals presenting with nonsyndromic CMI with or without syringomyelia. Initial findings across all 66 families showed minimal evidence for linkage due to suspected genetic heterogeneity. In order to improve power to localize susceptibility genes, stratified linkage analyses were performed using clinical criteria to differentiate families based on etiologic factors. Families were stratified on the presence or absence of clinical features associated with connective tissue disorders (CTDs) since CMI and CTDs frequently co-occur and it has been proposed that CMI patients with CTDs represent a distinct class of patients with a different underlying disease mechanism. Stratified linkage analyses resulted in a marked increase in evidence of linkage to multiple genomic regions consistent with reduced genetic heterogeneity. Of particular interest were two regions (Chr8, Max LOD = 3.04; Chr12, Max LOD = 2.09) identified within the subset of “CTD-negative” families, both of which harbor growth differentiation factors (GDF6, GDF3) implicated in the development of Klippel-Feil syndrome (KFS). Interestingly, roughly 3–5% of CMI patients are diagnosed with KFS. In order to investigate the possibility that CMI and KFS are allelic, GDF3 and GDF6 were sequenced leading to the identification of a previously known KFS missense mutation and potential regulatory variants in GDF6. This study has demonstrated the value of reducing genetic heterogeneity by clinical stratification implicating several convincing biological candidates and further supporting the hypothesis that

  3. Application of Genome Wide Association and Genomic Prediction for Improvement of Cacao Productivity and Resistance to Black and Frosty Pod Diseases

    PubMed Central

    Romero Navarro, J. Alberto; Phillips-Mora, Wilbert; Arciniegas-Leal, Adriana; Mata-Quirós, Allan; Haiminen, Niina; Mustiga, Guiliana; Livingstone III, Donald; van Bakel, Harm; Kuhn, David N.; Parida, Laxmi; Kasarskis, Andrew; Motamayor, Juan C.

    2017-01-01

    Chocolate is a highly valued and palatable confectionery product. Chocolate is primarily made from the processed seeds of the tree species Theobroma cacao. Cacao cultivation is highly relevant for small-holder farmers throughout the tropics, yet its productivity remains limited by low yields and widespread pathogens. A panel of 148 improved cacao clones was assembled based on productivity and disease resistance, and phenotypic single-tree replicated clonal evaluation was performed for 8 years. Using high-density markers, the diversity of clones was expressed relative to 10 known ancestral cacao populations, and significant effects of ancestry were observed in productivity and disease resistance. Genome-wide association (GWA) was performed, and six markers were significantly associated with frosty pod disease resistance. In addition, genomic selection was performed, and consistent with the observed extensive linkage disequilibrium, high predictive ability was observed at low marker densities for all traits. Finally, quantitative trait locus mapping and differential expression analysis of two cultivars with contrasting disease phenotypes were performed to identify genes underlying frosty pod disease resistance, identifying a significant quantitative trait locus and 35 differentially expressed genes using two independent differential expression analyses. These results indicate that in breeding populations of heterozygous and recently admixed individuals, mapping approaches can be used for low complexity traits like pod color in cacao, or single gene disease resistance in other species; however, genomic selection for quantitative traits remains highly effective relative to mapping. Our results can help guide the breeding process for sustainable improved cacao productivity. PMID:29184558

  4. Efficient privacy-preserving string search and an application in genomics.

    PubMed

    Shimizu, Kana; Nuida, Koji; Rätsch, Gunnar

    2016-06-01

    Personal genomes carry inherent privacy risks and protecting privacy poses major social and technological challenges. We consider the case where a user searches for genetic information (e.g. an allele) on a server that stores a large genomic database and aims to receive allele-associated information. The user would like to keep the query and result private and the server the database. We propose a novel approach that combines efficient string data structures such as the Burrows-Wheeler transform with cryptographic techniques based on additive homomorphic encryption. We assume that the sequence data is searchable in efficient iterative query operations over a large indexed dictionary, for instance, from large genome collections and employing the (positional) Burrows-Wheeler transform. We use a technique called oblivious transfer that is based on additive homomorphic encryption to conceal the sequence query and the genomic region of interest in positional queries. We designed and implemented an efficient algorithm for searching sequences of SNPs in large genome databases. During search, the user can only identify the longest match while the server does not learn which sequence of SNPs the user queried. In an experiment based on 2184 aligned haploid genomes from the 1000 Genomes Project, our algorithm was able to perform typical queries within ≈ 4.6 s and ≈ 10.8 s for client and server side, respectively, on laptop computers. The presented algorithm is at least one order of magnitude faster than an exhaustive baseline algorithm. Availability and implementation: https://github.com/iskana/PBWT-sec and https://github.com/ratschlab/PBWT-sec. Contact: shimizu-kana@aist.go.jp or Gunnar.Ratsch@ratschlab.org. Supplementary information: Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.

  5. Efficient privacy-preserving string search and an application in genomics

    PubMed Central

    Shimizu, Kana; Nuida, Koji; Rätsch, Gunnar

    2016-01-01

    Motivation: Personal genomes carry inherent privacy risks and protecting privacy poses major social and technological challenges. We consider the case where a user searches for genetic information (e.g. an allele) on a server that stores a large genomic database and aims to receive allele-associated information. The user would like to keep the query and result private and the server the database. Approach: We propose a novel approach that combines efficient string data structures such as the Burrows–Wheeler transform with cryptographic techniques based on additive homomorphic encryption. We assume that the sequence data is searchable in efficient iterative query operations over a large indexed dictionary, for instance, from large genome collections and employing the (positional) Burrows–Wheeler transform. We use a technique called oblivious transfer that is based on additive homomorphic encryption to conceal the sequence query and the genomic region of interest in positional queries. Results: We designed and implemented an efficient algorithm for searching sequences of SNPs in large genome databases. During search, the user can only identify the longest match while the server does not learn which sequence of SNPs the user queried. In an experiment based on 2184 aligned haploid genomes from the 1000 Genomes Project, our algorithm was able to perform typical queries within ≈ 4.6 s and ≈ 10.8 s for client and server side, respectively, on laptop computers. The presented algorithm is at least one order of magnitude faster than an exhaustive baseline algorithm. Availability and implementation: https://github.com/iskana/PBWT-sec and https://github.com/ratschlab/PBWT-sec. Contacts: shimizu-kana@aist.go.jp or Gunnar.Ratsch@ratschlab.org Supplementary information: Supplementary data are available at Bioinformatics online. PMID:27153731
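
    To make the string data structure mentioned above more concrete, here is a minimal sketch of the plain Burrows-Wheeler transform with backward search for counting pattern occurrences. It is an unencrypted, textbook illustration only: it is not the positional BWT, it uses no homomorphic encryption or oblivious transfer, and it is not the PBWT-sec code. The function names and the quadratic-time construction are choices made for readability.

```python
def bwt(text: str, sentinel: str = "$") -> str:
    """Burrows-Wheeler transform via sorted rotations (fine for short strings).

    Assumes the sentinel does not occur in text and sorts before every
    other character, which holds for '$' versus letters and digits.
    """
    if sentinel in text:
        raise ValueError("sentinel must not occur in text")
    s = text + sentinel
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rotation[-1] for rotation in rotations)


def count_occurrences(bwt_str: str, pattern: str) -> int:
    """Count occurrences of pattern in the original text by backward search."""
    # first[c]: index of the first row starting with c in the sorted rotations.
    first = {}
    for i, c in enumerate(sorted(bwt_str)):
        first.setdefault(c, i)

    def occ(c: str, i: int) -> int:
        # Number of occurrences of c in bwt_str[:i] (naive rank query).
        return bwt_str[:i].count(c)

    lo, hi = 0, len(bwt_str)  # current half-open range of matching rows
    for c in reversed(pattern):
        if c not in first:
            return 0
        lo = first[c] + occ(c, lo)
        hi = first[c] + occ(c, hi)
        if lo >= hi:
            return 0
    return hi - lo


transformed = bwt("banana")
print(transformed)                            # annb$aa
print(count_occurrences(transformed, "ana"))  # 2
```

    Each step of the backward search needs only rank lookups on the transformed string, which is the kind of iterative query over an indexed dictionary that the abstract describes concealing from the server.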

  6. BlueSNP: R package for highly scalable genome-wide association studies using Hadoop clusters.

    PubMed

    Huang, Hailiang; Tata, Sandeep; Prill, Robert J

    2013-01-01

    Computational workloads for genome-wide association studies (GWAS) are growing in scale and complexity outpacing the capabilities of single-threaded software designed for personal computers. The BlueSNP R package implements GWAS statistical tests in the R programming language and executes the calculations across computer clusters configured with Apache Hadoop, a de facto standard framework for distributed data processing using the MapReduce formalism. BlueSNP makes computationally intensive analyses, such as estimating empirical p-values via data permutation, and searching for expression quantitative trait loci over thousands of genes, feasible for large genotype-phenotype datasets. http://github.com/ibm-bioinformatics/bluesnp
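
    The empirical p-value estimation mentioned above can be illustrated on a single machine. The sketch below is written in Python rather than R, does not use Hadoop or the BlueSNP API, and picks an arbitrary test statistic (absolute difference in mean phenotype between carriers and non-carriers); all names and the toy data are hypothetical.

```python
import random
from statistics import mean
from typing import Sequence


def permutation_p_value(genotypes: Sequence[int],
                        phenotypes: Sequence[float],
                        n_perm: int = 999,
                        seed: int = 0) -> float:
    """Empirical p-value for one SNP by shuffling phenotype labels.

    genotypes: minor-allele counts (0, 1, 2); carriers are those with > 0.
    """
    if len(genotypes) != len(phenotypes):
        raise ValueError("genotypes and phenotypes must have equal length")
    if all(g > 0 for g in genotypes) or all(g == 0 for g in genotypes):
        raise ValueError("need both carriers and non-carriers")

    def statistic(phen: Sequence[float]) -> float:
        carriers = [y for g, y in zip(genotypes, phen) if g > 0]
        others = [y for g, y in zip(genotypes, phen) if g == 0]
        return abs(mean(carriers) - mean(others))

    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = statistic(phenotypes)
    shuffled = list(phenotypes)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if statistic(shuffled) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)


# Toy data: carriers have clearly higher phenotype values.
toy_genotypes = [0] * 10 + [1] * 10
toy_phenotypes = [0.0] * 10 + [1.0] * 10
print(permutation_p_value(toy_genotypes, toy_phenotypes))
```

    Because every SNP needs its own set of permutations, the cost grows with both the number of markers and the number of shuffles, which is the workload the abstract proposes to distribute across a cluster.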

  7. snpGeneSets: An R Package for Genome-Wide Study Annotation

    PubMed Central

    Mei, Hao; Li, Lianna; Jiang, Fan; Simino, Jeannette; Griswold, Michael; Mosley, Thomas; Liu, Shijian

    2016-01-01

    Genome-wide studies (GWS) of SNP associations and differential gene expressions have generated abundant results; next-generation sequencing technology has further boosted the number of variants and genes identified. Effective interpretation requires massive annotation and downstream analysis of these genome-wide results, a computationally challenging task. We developed the snpGeneSets package to simplify annotation and analysis of GWS results. Our package integrates local copies of knowledge bases for SNPs, genes, and gene sets, and implements wrapper functions in the R language to enable transparent access to low-level databases for efficient annotation of large genomic data. The package contains functions that execute three types of annotations: (1) genomic mapping annotation for SNPs and genes and functional annotation for gene sets; (2) bidirectional mapping between SNPs and genes, and genes and gene sets; and (3) calculation of gene effect measures from SNP associations and performance of gene set enrichment analyses to identify functional pathways. We applied snpGeneSets to type 2 diabetes (T2D) results from the NHGRI genome-wide association study (GWAS) catalog, a Finnish GWAS, and a genome-wide expression study (GWES). These studies demonstrate the usefulness of snpGeneSets for annotating and performing enrichment analysis of GWS results. The package is open-source, free, and can be downloaded at: https://www.umc.edu/biostats_software/. PMID:27807048

  8. An integrated approach to exploit linkage disequilibrium for ultra high dimensional genome-wide data

    USDA-ARS's Scientific Manuscript database

    With the advent of recent DNA sequencing methods (determining molecule order) that quickly produce millions of DNA sequences, variation among sequences in a genome (all the DNA contained in chromosomes of an organism) can be tested for association with traits of economic interest on a relatively lar...

  9. Evolutionary Origins and Dynamics of Octoploid Strawberry Subgenomes Revealed by Dense Targeted Capture Linkage Maps

    PubMed Central

    Tennessen, Jacob A.; Govindarajulu, Rajanikanth; Ashman, Tia-Lynn; Liston, Aaron

    2014-01-01

    Whole-genome duplications are radical evolutionary events that have driven speciation and adaptation in many taxa. Higher-order polyploids have complex histories often including interspecific hybridization and dynamic genomic changes. This chromosomal reshuffling is poorly understood for most polyploid species, despite their evolutionary and agricultural importance, due to the challenge of distinguishing homologous sequences from each other. Here, we use dense linkage maps generated with targeted sequence capture to improve the diploid strawberry (Fragaria vesca) reference genome and to disentangle the subgenomes of the wild octoploid progenitors of cultivated strawberry, Fragaria virginiana and Fragaria chiloensis. Our novel approach, POLiMAPS (Phylogenetics Of Linkage-Map-Anchored Polyploid Subgenomes), leverages sequence reads to associate informative interhomeolog phylogenetic markers with linkage groups and reference genome positions. In contrast to a widely accepted model, we find that one of the four subgenomes originates with the diploid cytoplasm donor F. vesca, one with the diploid Fragaria iinumae, and two with an unknown ancestor close to F. iinumae. Extensive unidirectional introgression has converted F. iinumae-like subgenomes to be more F. vesca-like, but never the reverse, due either to homoploid hybridization in the F. iinumae-like diploid ancestors or else strong selection spreading F. vesca-like sequence among subgenomes through homeologous exchange. In addition, divergence between homeologous chromosomes has been substantially augmented by interchromosomal rearrangements. Our phylogenetic approach reveals novel aspects of the complicated web of genetic exchanges that occur during polyploid evolution and suggests a path forward for unraveling other agriculturally and ecologically important polyploid genomes. PMID:25477420

  10. Identification of new susceptibility loci for osteoarthritis (arcOGEN): a genome-wide association study.

    PubMed

    Zeggini, Eleftheria; Panoutsopoulou, Kalliope; Southam, Lorraine; Rayner, Nigel W; Day-Williams, Aaron G; Lopes, Margarida C; Boraska, Vesna; Esko, Tonu; Evangelou, Evangelos; Hoffman, Albert; Houwing-Duistermaat, Jeanine J; Ingvarsson, Thorvaldur; Jonsdottir, Ingileif; Jonnson, Helgi; Kerkhof, Hanneke J; Kloppenburg, Margreet; Bos, Steffan D; Mangino, Massimo; Metrustry, Sarah; Slagboom, P Eline; Thorleifsson, Gudmar; Raine, Emma V A; Ratnayake, Madhushika; Ricketts, Michelle; Beazley, Claude; Blackburn, Hannah; Bumpstead, Suzannah; Elliott, Katherine S; Hunt, Sarah E; Potter, Simon C; Shin, So-Youn; Yadav, Vijay K; Zhai, Guangju; Sherburn, Kate; Dixon, Kate; Arden, Elizabeth; Aslam, Nadim; Battley, Phillippa-kate; Carluke, Ian; Doherty, Sally; Gordon, Andrew; Joseph, John; Keen, Richard; Koller, Nicola C; Mitchell, Sheryl; O'Neill, Fiona; Paling, Ellen; Reed, Mike R; Rivadeneira, Fernando; Swift, Diane; Walker, Kirsten; Watkins, Bridget; Wheeler, Maggie; Birrell, Fraser; Ioannidis, John P A; Meulenbelt, Ingrid; Metspalu, Andres; Rai, Ashok; Salter, Donald; Stefansson, Kari; Stykarsdottir, Unnur; Uitterlinden, André G; van Meurs, Joyce B J; Chapman, Kay; Deloukas, Panos; Ollier, William E R; Wallis, Gillian A; Arden, Nigel; Carr, Andrew; Doherty, Michael; McCaskie, Andrew; Willkinson, J Mark; Ralston, Stuart H; Valdes, Ana M; Spector, Tim D; Loughlin, John

    2012-09-01

    Osteoarthritis is the most common form of arthritis worldwide and is a major cause of pain and disability in elderly people. The health economic burden of osteoarthritis is increasing commensurate with obesity prevalence and longevity. Osteoarthritis has a strong genetic component but the success of previous genetic studies has been restricted due to insufficient sample sizes and phenotype heterogeneity. We undertook a large genome-wide association study (GWAS) in 7410 unrelated and retrospectively and prospectively selected patients with severe osteoarthritis in the arcOGEN study, 80% of whom had undergone total joint replacement, and 11,009 unrelated controls from the UK. We replicated the most promising signals in an independent set of up to 7473 cases and 42,938 controls, from studies in Iceland, Estonia, the Netherlands, and the UK. All patients and controls were of European descent. We identified five genome-wide significant loci (binomial test p≤5·0×10^-8) for association with osteoarthritis and three loci just below this threshold. The strongest association was on chromosome 3 with rs6976 (odds ratio 1·12 [95% CI 1·08-1·16]; p=7·24×10^-11), which is in perfect linkage disequilibrium with rs11177. This SNP encodes a missense polymorphism within the nucleostemin-encoding gene GNL3. Levels of nucleostemin were raised in chondrocytes from patients with osteoarthritis in functional studies. Other significant loci were on chromosome 9 close to ASTN2, chromosome 6 between FILIP1 and SENP6, chromosome 12 close to KLHDC5 and PTHLH, and in another region of chromosome 12 close to CHST11. One of the signals close to genome-wide significance was within the FTO gene, which is involved in regulation of bodyweight, a strong risk factor for osteoarthritis. All risk variants were common in frequency and exerted small effects. Our findings provide insight into the genetics of arthritis and identify new pathways that might be amenable to future therapeutic

  11. GIGGLE: a search engine for large-scale integrated genome analysis.

    PubMed

    Layer, Ryan M; Pedersen, Brent S; DiSera, Tonya; Marth, Gabor T; Gertz, Jason; Quinlan, Aaron R

    2018-02-01

    GIGGLE is a genomics search engine that identifies and ranks the significance of genomic loci shared between query features and thousands of genome interval files. GIGGLE (https://github.com/ryanlayer/giggle) scales to billions of intervals and is over three orders of magnitude faster than existing methods. Its speed extends the accessibility and utility of resources such as ENCODE, Roadmap Epigenomics, and GTEx by facilitating data integration and hypothesis generation.
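
    The core query described above (how many intervals in each file overlap a set of query features) can be illustrated with a small sketch. This is not GIGGLE's index or its significance statistics; it is a naive stand-in that assumes half-open (start, end) intervals on a single chromosome, and the names count_overlaps, rank_files, file_a and file_b are hypothetical.

```python
from bisect import bisect_left, bisect_right

def count_overlaps(queries, intervals):
    """Count (query, interval) pairs that overlap, for half-open [start, end) intervals."""
    starts = sorted(s for s, _ in intervals)
    ends = sorted(e for _, e in intervals)
    total = 0
    for q_start, q_end in queries:
        # Intervals that start before the query ends ...
        began_before_query_end = bisect_left(starts, q_end)
        # ... minus those that already ended at or before the query start.
        ended_before_query_start = bisect_right(ends, q_start)
        total += began_before_query_end - ended_before_query_start
    return total

def rank_files(queries, files):
    """Rank interval files by raw overlap count with the query set (highest first)."""
    counts = {name: count_overlaps(queries, ivs) for name, ivs in files.items()}
    return sorted(counts.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical toy data: two "interval files" and one query feature.
files = {"file_a": [(10, 20), (30, 40)], "file_b": [(100, 200)]}
ranking = rank_files([(15, 35)], files)
print(ranking)
```

    Ranking by raw count is the simplest possible choice; a real search engine would also weigh the overlap against what is expected by chance.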

  12. Genome survey and high-density genetic map construction provide genomic and genetic resources for the Pacific White Shrimp Litopenaeus vannamei

    PubMed Central

    Yu, Yang; Zhang, Xiaojun; Yuan, Jianbo; Li, Fuhua; Chen, Xiaohan; Zhao, Yongzhen; Huang, Long; Zheng, Hongkun; Xiang, Jianhai

    2015-01-01

    The Pacific white shrimp Litopenaeus vannamei is the dominant crustacean species in global seafood mariculture. Understanding the genome and genetic architecture is useful for deciphering complex traits and accelerating the breeding program in shrimp. In this study, a genome survey was conducted and a high-density linkage map was constructed using a next-generation sequencing approach. The genome survey was used to identify preliminary genome characteristics and to generate a rough reference for linkage map construction. De novo SNP discovery resulted in 25,140 polymorphic markers. A total of 6,359 high-quality markers were selected for linkage map construction based on marker coverage among individuals and read depths. For the linkage map, a total of 6,146 markers spanning 4,271.43 cM were mapped to 44 sex-averaged linkage groups, with an average marker distance of 0.7 cM. An integration analysis linked 5,885 genome scaffolds and 1,504 BAC clones to the linkage map. Based on the high-density linkage map, several QTLs for body weight and body length were detected. This high-density genetic linkage map reveals basic genomic architecture and will be useful for comparative genomics research, genome assembly and genetic improvement of L. vannamei and other penaeid shrimp species. PMID:26503227
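
    The reported average marker distance can be checked from the figures in the abstract. The sketch below is only an arithmetic illustration; the function name and the two conventions it supports (dividing by the marker count, or by the number of intervals between markers within linkage groups) are assumptions, since the abstract does not say which one the authors used.

```python
def mean_marker_spacing(total_cm, n_markers, n_groups=0):
    """Mean map distance between markers, in cM.

    With n_groups == 0 the total length is divided by the marker count.
    With n_groups > 0 it is divided by the number of intervals instead
    (a linkage group with k markers has k - 1 intervals).
    """
    divisor = n_markers - n_groups
    if divisor <= 0:
        raise ValueError("need more markers than linkage groups")
    return total_cm / divisor

# Figures quoted in the abstract: 6,146 markers, 4,271.43 cM, 44 linkage groups.
per_marker = mean_marker_spacing(4271.43, 6146)
per_interval = mean_marker_spacing(4271.43, 6146, n_groups=44)
print(f"{per_marker:.3f} cM per marker, {per_interval:.3f} cM per interval")
```

    Under either convention the result rounds to the 0.7 cM stated in the abstract.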

  13. Genetic Dissection of Clonally Inherited Genomes of Poeciliopsis. I. Linkage Analysis and Preliminary Assessment of Deleterious Gene Loads

    PubMed Central

    Leslie, James F.; Vrijenhoek, Robert C.

    1978-01-01

    Theoretical considerations suggest that a high load of deleterious mutations should accumulate in asexual genomes. An ideal system for testing this hypothesis occurs in the hybrid all-female fish Poeciliopsis monacha-lucida. The hybrid genotype is retained between generations by an oogenetic process that transmits only a nonrecombinant haploid monacha genome to their ova. The hybrid genotype is re-established in nature by fertilization of these monacha eggs with sperm from a sexual species, P. lucida. The unique reproductive mechanism of these hybrids allows the genetic dissection of the clonal monacha genome by forced matings with males of P. monacha. The resultant F1 hybrids and their backcross progeny were examined to determine the amount and kinds of genetic changes that might have occurred in two clonal monacha genomes. Using six allozyme markers, four similar linkage groups were identified in each clonal genome. Segregation and assortment at these loci revealed no apparent differences between monacha genomes from sexually and clonally reproducing species. Mortality of F1 and backcross progeny revealed differences between the two clonal genomes, suggesting that deleterious genes may accumulate in genomes sheltered from recombination. PMID:17248875

  14. GCView: the genomic context viewer for protein homology searches

    PubMed Central

    Grin, Iwan; Linke, Dirk

    2011-01-01

    Genomic neighborhood can provide important insights into evolution and function of a protein or gene. When looking at operons, changes in operon structure and composition can only be revealed by looking at the operon as a whole. To facilitate the analysis of the genomic context of a query in multiple organisms we have developed Genomic Context Viewer (GCView). GCView accepts results from one or multiple protein homology searches such as BLASTp as input. For each hit, the neighboring protein-coding genes are extracted, the regions of homology are labeled for each input and the results are presented as a clear, interactive graphical output. It is also possible to add more searches to iteratively refine the output. GCView groups outputs by the hits for different proteins. This allows for easy comparison of different operon compositions and structures. The tool is embedded in the framework of the Bioinformatics Toolkit of the Max-Planck Institute for Developmental Biology (MPI Toolkit). Job results from the homology search tools inside the MPI Toolkit can be forwarded to GCView and results can be subsequently analyzed by sequence analysis tools. Results are stored online, allowing for later reinspection. GCView is freely available at http://toolkit.tuebingen.mpg.de/gcview. PMID:21609955

  15. A High-Density Linkage Map for Astyanax mexicanus Using Genotyping-by-Sequencing Technology

    PubMed Central

    Carlson, Brian M.; Onusko, Samuel W.; Gross, Joshua B.

    2014-01-01

    The Mexican tetra, Astyanax mexicanus, is a unique model system consisting of cave-adapted and surface-dwelling morphotypes that diverged >1 million years (My) ago. This remarkable natural experiment has enabled powerful genetic analyses of cave adaptation. Here, we describe the application of next-generation sequencing technology to the creation of a high-density linkage map. Our map comprises more than 2200 markers populating 25 linkage groups constructed from genotypic data generated from a single genotyping-by-sequencing project. We leveraged emergent genomic and transcriptomic resources to anchor hundreds of anonymous Astyanax markers to the genome of the zebrafish (Danio rerio), the most closely related model organism to our study species. This facilitated the identification of 784 distinct connections between our linkage map and the Danio rerio genome, highlighting several regions of conserved genomic architecture between the two species despite ∼150 My of divergence. Using a Mendelian cave-associated trait as a proof-of-principle, we successfully recovered the genomic position of the albinism locus near the gene Oca2. Further, our map successfully informed the positions of unplaced Astyanax genomic scaffolds within particular linkage groups. This ability to identify the relative location, orientation, and linear order of unaligned genomic scaffolds will facilitate ongoing efforts to improve on the current early draft and assemble future versions of the Astyanax physical genome. Moreover, this improved linkage map will enable higher-resolution genetic analyses and catalyze the discovery of the genetic basis for cave-associated phenotypes. PMID:25520037

  16. Genome-Wide Protein Interaction Screens Reveal Functional Networks Involving Sm-Like Proteins

    PubMed Central

    Fromont-Racine, Micheline; Mayes, Andrew E.; Brunet-Simon, Adeline; Rain, Jean-Christophe; Colley, Alan; Dix, Ian; Decourty, Laurence; Joly, Nicolas; Ricard, Florence; Beggs, Jean D.

    2000-01-01

    A set of seven structurally related Sm proteins forms the core of the snRNP particles containing the spliceosomal U1, U2, U4 and U5 snRNAs. A search of the genomic sequence of Saccharomyces cerevisiae has identified a number of open reading frames that potentially encode structurally similar proteins termed Lsm (Like Sm) proteins. With the aim of analysing all possible interactions between the Lsm proteins and any protein encoded in the yeast genome, we performed exhaustive and iterative genomic two-hybrid screens, starting with the Lsm proteins as baits. Indeed, extensive interactions amongst eight Lsm proteins were found that suggest the existence of a Lsm complex or complexes. These Lsm interactions apparently involve the conserved Sm domain that also mediates interactions between the Sm proteins. The screens also reveal functionally significant interactions with splicing factors, in particular with Prp4 and Prp24, compatible with genetic studies and with the reported association of Lsm proteins with spliceosomal U6 and U4/U6 particles. In addition, interactions with proteins involved in mRNA turnover, such as Mrt1, Dcp1, Dcp2 and Xrn1, point to roles for Lsm complexes in distinct RNA metabolic processes, that are confirmed in independent functional studies. These results provide compelling evidence that two-hybrid screens yield functionally meaningful information about protein–protein interactions and can suggest functions for uncharacterized proteins, especially when they are performed on a genome-wide scale. PMID:10900456

  17. Identification of New Resistance Loci to African Stem Rust Race TTKSK in Tetraploid Wheats Based on Linkage and Genome-Wide Association Mapping.

    PubMed

    Laidò, Giovanni; Panio, Giosuè; Marone, Daniela; Russo, Maria A; Ficco, Donatella B M; Giovanniello, Valentina; Cattivelli, Luigi; Steffenson, Brian; de Vita, Pasquale; Mastrangelo, Anna M

    2015-01-01

    Stem rust, caused by Puccinia graminis Pers. f. sp. tritici Eriks. and E. Henn. (Pgt), is one of the most destructive diseases of wheat. Races of the pathogen in the "Ug99 lineage" are of international concern due to their virulence for widely used stem rust resistance genes and their spread throughout Africa. Disease-resistant cultivars provide one of the best means for controlling stem rust. To identify quantitative trait loci (QTL) conferring resistance to African stem rust race TTKSK at the seedling stage, we evaluated an association mapping (AM) panel consisting of 230 tetraploid wheat accessions under greenhouse conditions. A high level of phenotypic variation was observed in response to race TTKSK in the AM panel, allowing for genome-wide association mapping of resistance QTL in wild, landrace, and cultivated tetraploid wheats. Thirty-five resistance QTL were identified on all chromosomes, and seventeen are of particular interest as identified by multiple associations. Many of the identified resistance loci were coincident with previously identified rust resistance genes; however, nine on chromosomes 1AL, 2AL, 4AL, 5BL, and 7BS may be novel. To validate AM results, a biparental population of 146 recombinant inbred lines was also considered, derived from a cross between the resistant cultivar "Cirillo" and the susceptible cultivar "Neodur." The stem rust resistance of Cirillo was conferred by a single gene on the distal region of chromosome arm 6AL in an interval map coincident with the resistance gene Sr13, and confirmed one of the resistance loci identified by AM. A search for candidate resistance genes was carried out in the regions where QTL were identified, and many of them corresponded to NBS-LRR genes and protein kinases with LRR domains. The results obtained in the present study are of great interest as a high level of genetic variability for resistance to race TTKSK was described in a germplasm panel comprising most of the tetraploid wheat sub-species.

  18. Recombination patterns reveal information about centromere location on linkage maps.

    PubMed

    Limborg, Morten T; McKinney, Garrett J; Seeb, Lisa W; Seeb, James E

    2016-05-01

    Linkage mapping is often used to identify genes associated with phenotypic traits and for aiding genome assemblies. Still, many emerging maps do not locate centromeres - an essential component of the genomic landscape. Here, we demonstrate that for genomes with strong chiasma interference, approximate centromere placement is possible by phasing the same data used to generate linkage maps. Assuming one obligate crossover per chromosome arm, information about centromere location can be revealed by tracking the accumulated recombination frequency along linkage groups, similar to half-tetrad analyses. We validate the method on a linkage map for sockeye salmon (Oncorhynchus nerka) with known centromeric regions. Further tests suggest that the method will work well in other salmonids and other eukaryotes. However, the method performed weakly when applied to a male linkage map (rainbow trout; O. mykiss) characterized by low and unevenly distributed recombination - a general feature of male meiosis in many species. Further, a high frequency of double crossovers along chromosome arms in barley reduced resolution for locating centromeric regions on most linkage groups. Despite these limitations, our method should work well for high-density maps in species with strong recombination interference and will enrich many existing and future mapping resources. © 2015 The Authors. Molecular Ecology Resources published by John Wiley & Sons Ltd.

  19. Accounting for selection and correlation in the analysis of two-stage genome-wide association studies.

    PubMed

    Robertson, David S; Prevost, A Toby; Bowden, Jack

    2016-10-01

    The problem of selection bias has long been recognized in the analysis of two-stage trials, where promising candidates are selected in stage 1 for confirmatory analysis in stage 2. To efficiently correct for bias, uniformly minimum variance conditionally unbiased estimators (UMVCUEs) have been proposed for a wide variety of trial settings, but where the population parameter estimates are assumed to be independent. We relax this assumption and derive the UMVCUE in the multivariate normal setting with an arbitrary known covariance structure. One area of application is the estimation of odds ratios (ORs) when combining a genome-wide scan with a replication study. Our framework explicitly accounts for correlated single nucleotide polymorphisms, as might occur due to linkage disequilibrium. We illustrate our approach on the measurement of the association between 11 genetic variants and the risk of Crohn's disease, as reported in Parkes and others (2007. Sequence variants in the autophagy gene IRGM and multiple other replicating loci contribute to Crohn's disease susceptibility. Nat. Gen. 39: (7), 830-832.), and show that the estimated ORs can vary substantially if both selection and correlation are taken into account. © The Author 2016. Published by Oxford University Press.

  20. OrthoVenn: a web server for genome wide comparison and annotation of orthologous clusters across multiple species.

    PubMed

    Wang, Yi; Coleman-Derr, Devin; Chen, Guoping; Gu, Yong Q

    2015-07-01

    Genome wide analysis of orthologous clusters is an important component of comparative genomics studies. Identifying the overlap among orthologous clusters can enable us to elucidate the function and evolution of proteins across multiple species. Here, we report a web platform named OrthoVenn that is useful for genome wide comparisons and visualization of orthologous clusters. OrthoVenn provides coverage of vertebrates, metazoa, protists, fungi, plants and bacteria for the comparison of orthologous clusters and also supports uploading of customized protein sequences from user-defined species. An interactive Venn diagram, summary counts, and functional summaries of the disjunction and intersection of clusters shared between species are displayed as part of the OrthoVenn result. OrthoVenn also includes in-depth views of the clusters using various sequence analysis tools. Furthermore, OrthoVenn identifies orthologous clusters of single copy genes and allows for a customized search of clusters of specific genes through key words or BLAST. OrthoVenn is an efficient and user-friendly web server freely accessible at http://probes.pw.usda.gov/OrthoVenn or http://aegilops.wheat.ucdavis.edu/OrthoVenn. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

  1. Genome-wide Diversity and Association Mapping for Capsaicinoids and Fruit Weight in Capsicum annuum L

    PubMed Central

    Nimmakayala, Padma; Abburi, Venkata L.; Saminathan, Thangasamy; Alaparthi, Suresh B.; Almeida, Aldo; Davenport, Brittany; Nadimi, Marjan; Davidson, Joshua; Tonapi, Krittika; Yadav, Lav; Malkaram, Sridhar; Vajja, Gopinath; Hankins, Gerald; Harris, Robert; Park, Minkyu; Choi, Doil; Stommel, John; Reddy, Umesh K.

    2016-01-01

    Accumulated capsaicinoid content and increased fruit size are traits resulting from Capsicum annuum domestication. In this study, we used a diverse collection of C. annuum to generate 66,960 SNPs using genotyping by sequencing. The study identified 1189 haplotypes containing 3413 SNPs. Length of individual linkage disequilibrium (LD) blocks varied along chromosomes, with regions of high and low LD interspersed with an average LD of 139 kb. Principal component analysis (PCA), Bayesian model-based population structure analysis and a Euclidean tree built from identity by state (IBS) indices revealed that the clustering pattern of diverse accessions is in agreement with capsaicin content (CA) and fruit weight (FW) classifications, indicating the importance of these traits in shaping the modern pepper genome. PCA and IBS were used in a mixed linear model of capsaicin and dihydrocapsaicin content and fruit weight to reduce spurious associations caused by the confounding effects of subpopulations in the genome-wide association study (GWAS). Our GWAS results showed that SNPs in Ankyrin-like protein, IKI3 family protein, ABC transporter G family and pentatricopeptide repeat protein are the major markers for capsaicinoids, and that of 16 SNPs strongly associated with FW in both years of the study, 7 are located in known fruit weight controlling genes. PMID:27901114

  2. Genome-wide Diversity and Association Mapping for Capsaicinoids and Fruit Weight in Capsicum annuum L.

    PubMed

    Nimmakayala, Padma; Abburi, Venkata L; Saminathan, Thangasamy; Alaparthi, Suresh B; Almeida, Aldo; Davenport, Brittany; Nadimi, Marjan; Davidson, Joshua; Tonapi, Krittika; Yadav, Lav; Malkaram, Sridhar; Vajja, Gopinath; Hankins, Gerald; Harris, Robert; Park, Minkyu; Choi, Doil; Stommel, John; Reddy, Umesh K

    2016-11-30

    Accumulated capsaicinoid content and increased fruit size are traits resulting from Capsicum annuum domestication. In this study, we used a diverse collection of C. annuum to generate 66,960 SNPs using genotyping by sequencing. The study identified 1189 haplotypes containing 3413 SNPs. Length of individual linkage disequilibrium (LD) blocks varied along chromosomes, with regions of high and low LD interspersed with an average LD of 139 kb. Principal component analysis (PCA), Bayesian model-based population structure analysis and a Euclidean tree built from identity by state (IBS) indices revealed that the clustering pattern of diverse accessions is in agreement with capsaicin content (CA) and fruit weight (FW) classifications, indicating the importance of these traits in shaping the modern pepper genome. PCA and IBS were used in a mixed linear model of capsaicin and dihydrocapsaicin content and fruit weight to reduce spurious associations caused by the confounding effects of subpopulations in the genome-wide association study (GWAS). Our GWAS results showed that SNPs in Ankyrin-like protein, IKI3 family protein, ABC transporter G family and pentatricopeptide repeat protein are the major markers for capsaicinoids, and that of 16 SNPs strongly associated with FW in both years of the study, 7 are located in known fruit weight controlling genes.

  3. Genome-Wide Association Studies of 11 Agronomic Traits in Cassava (Manihot esculenta Crantz)

    PubMed Central

    Zhang, Shengkui; Chen, Xin; Lu, Cheng; Ye, Jianqiu; Zou, Meiling; Lu, Kundian; Feng, Subin; Pei, Jinli; Liu, Chen; Zhou, Xincheng; Ma, Ping’an; Li, Zhaogui; Liu, Cuijuan; Liao, Qi; Xia, Zhiqiang; Wang, Wenquan

    2018-01-01

    Cassava (Manihot esculenta Crantz) is a major tuberous crop produced worldwide. In this study, we sequenced 158 diverse cassava varieties and identified 349,827 single-nucleotide polymorphisms (SNPs) and indels. In each chromosome, the number of SNPs and the physical length of the respective chromosome were in agreement. Population structure analysis indicated that this panel can be divided into three subgroups. Genetic diversity analysis indicated that the average nucleotide diversity of the panel was 1.21 × 10^-4 for all sampled landraces. This average nucleotide diversity was 1.97 × 10^-4, 1.01 × 10^-4, and 1.89 × 10^-4 for subgroups 1, 2, and 3, respectively. Genome-wide linkage disequilibrium (LD) analysis demonstrated that the average LD was about 8 kb. We evaluated 158 cassava varieties under 11 different environments. Finally, we identified 36 loci that were related to 11 agronomic traits by genome-wide association analyses. Four loci were associated with two traits, and 62 candidate genes were identified in the peak SNP sites. We found that 40 of these genes showed different expression profiles in different tissues. Of the candidate genes related to storage roots, Manes.13G023300, Manes.16G000800, Manes.02G154700, Manes.02G192500, and Manes.09G099100 had higher expression levels in storage roots than in leaf and stem; on the other hand, of the candidate genes related to leaves, Manes.05G164500, Manes.05G164600, Manes.04G057300, Manes.01G202000, and Manes.03G186500 had higher expression levels in leaves than in storage roots and stem. This study provides a basis for research on genetics and the genetic improvement of cassava. PMID:29725343

  4. Genome-Wide Association Studies of 11 Agronomic Traits in Cassava (Manihot esculenta Crantz).

    PubMed

    Zhang, Shengkui; Chen, Xin; Lu, Cheng; Ye, Jianqiu; Zou, Meiling; Lu, Kundian; Feng, Subin; Pei, Jinli; Liu, Chen; Zhou, Xincheng; Ma, Ping'an; Li, Zhaogui; Liu, Cuijuan; Liao, Qi; Xia, Zhiqiang; Wang, Wenquan

    2018-01-01

    Cassava (Manihot esculenta Crantz) is a major tuberous crop produced worldwide. In this study, we sequenced 158 diverse cassava varieties and identified 349,827 single-nucleotide polymorphisms (SNPs) and indels. In each chromosome, the number of SNPs and the physical length of the respective chromosome were in agreement. Population structure analysis indicated that this panel can be divided into three subgroups. Genetic diversity analysis indicated that the average nucleotide diversity of the panel was 1.21 × 10^-4 for all sampled landraces. This average nucleotide diversity was 1.97 × 10^-4, 1.01 × 10^-4, and 1.89 × 10^-4 for subgroups 1, 2, and 3, respectively. Genome-wide linkage disequilibrium (LD) analysis demonstrated that the average LD was about 8 kb. We evaluated 158 cassava varieties under 11 different environments. Finally, we identified 36 loci that were related to 11 agronomic traits by genome-wide association analyses. Four loci were associated with two traits, and 62 candidate genes were identified in the peak SNP sites. We found that 40 of these genes showed different expression profiles in different tissues. Of the candidate genes related to storage roots, Manes.13G023300, Manes.16G000800, Manes.02G154700, Manes.02G192500, and Manes.09G099100 had higher expression levels in storage roots than in leaf and stem; on the other hand, of the candidate genes related to leaves, Manes.05G164500, Manes.05G164600, Manes.04G057300, Manes.01G202000, and Manes.03G186500 had higher expression levels in leaves than in storage roots and stem. This study provides a basis for research on genetics and the genetic improvement of cassava.

  5. Genome-wide association study of interferon-related cytopenia in chronic hepatitis C patients

    PubMed Central

    Thompson, Alexander J.; Clark, Paul J.; Singh, Abanish; Ge, Dongliang; Fellay, Jacques; Zhu, Mingfu; Zhu, Qianqian; Urban, Thomas J.; Patel, Keyur; Tillmann, Hans L.; Naggie, Susanna; Afdhal, Nezam H.; Jacobson, Ira M.; Esteban, Rafael; Poordad, Fred; Lawitz, Eric J.; McCone, Jonathan; Shiffman, Mitchell L.; Galler, Greg W.; King, John W.; Kwo, Paul Y.; Shianna, Kevin V.; Noviello, Stephanie; Pedicone, Lisa D.; Brass, Clifford A.; Albrecht, Janice K.; Sulkowski, Mark S.; Goldstein, David B.; McHutchison, John G.; Muir, Andrew J.

    2012-01-01

    Background & Aims: Interferon-alfa (IFN)-related cytopenias are common and may be dose-limiting. We performed a genome-wide association study on a well-characterized genotype 1 HCV cohort to identify genetic determinants of peginterferon-α (peg-IFN)-related thrombocytopenia, neutropenia, and leukopenia. Methods: 1604/3070 patients in the IDEAL study consented to genetic testing. Trial inclusion criteria included a platelet (Pl) count ≥80 × 10^9/L and an absolute neutrophil count (ANC) ≥1500/mm^3. Samples were genotyped using the Illumina Human610-quad BeadChip. The primary analyses focused on the genetic determinants of quantitative change in cell counts (Pl, ANC, lymphocytes, monocytes, eosinophils, and basophils) at week 4 in patients >80% adherent to therapy (n = 1294). Results: 6 SNPs on chromosome 20 were positively associated with Pl reduction (top SNP rs965469, p = 10^-10). These tag SNPs are in high linkage disequilibrium with 2 functional variants in the ITPA gene, rs1127354 and rs7270101, that cause ITPase deficiency and protect against ribavirin (RBV)-induced hemolytic anemia (HA). rs1127354 and rs7270101 showed strong independent associations with Pl reduction (p = 10^-12, p = 10^-7) and entirely explained the genome-wide significant associations. We believe this is an example of an indirect genetic association due to a reactive thrombocytosis to RBV-induced anemia: Hb decline was inversely correlated with Pl reduction (r = −0.28, p = 10^-17) and Hb change largely attenuated the association between the ITPA variants and Pl reduction in regression models. No common genetic variants were associated with pegIFN-induced neutropenia or leukopenia. Conclusions: Two ITPA variants were associated with thrombocytopenia; this was largely explained by a thrombocytotic response to RBV-induced HA attenuating IFN-related thrombocytopenia. No genetic determinants of pegIFN-induced neutropenia were identified. PMID:21703177

  6. Molecular Characterization of the Lipid Genome-Wide Association Study Signal on Chromosome 18q11.2 Implicates HNF4A-Mediated Regulation of the TMEM241 Gene.

    PubMed

    Rodríguez, Alejandra; Gonzalez, Luis; Ko, Arthur; Alvarez, Marcus; Miao, Zong; Bhagat, Yash; Nikkola, Elina; Cruz-Bautista, Ivette; Arellano-Campos, Olimpia; Muñoz-Hernández, Linda L; Ordóñez-Sánchez, Maria-Luisa; Rodriguez-Guillen, Rosario; Mohlke, Karen L; Laakso, Markku; Tusie-Luna, Teresa; Aguilar-Salinas, Carlos A; Pajukanta, Päivi

    2016-07-01

    We recently identified a locus on chromosome 18q11.2 for high serum triglycerides in Mexicans. We hypothesize that the lead genome-wide association study single-nucleotide polymorphism rs9949617, or its linkage disequilibrium proxies, regulates 1 of the 5 genes in the triglyceride-associated region. We performed a linkage disequilibrium analysis and found 9 additional variants in linkage disequilibrium (r^2 > 0.7) with the lead single-nucleotide polymorphism. To select the variants for functional analyses, we annotated the 10 variants using DNase I hypersensitive sites, transcription factor and chromatin states and identified rs17259126 as the lead candidate variant for functional in vitro validation. Using a luciferase transcriptional reporter assay in liver HepG2 cells, we found that the G allele exhibits a significantly lower effect on transcription (P<0.05). The electrophoretic mobility shift and ChIP-qPCR (chromatin immunoprecipitation coupled with quantitative polymerase chain reaction) assays confirmed that the minor G allele of rs17259126 disrupts a hepatocyte nuclear factor 4 α-binding site. To find the regional candidate gene, we performed a local expression quantitative trait locus analysis and found that rs17259126 and its linkage disequilibrium proxies alter expression of the regional transmembrane protein 241 (TMEM241) gene in 795 adipose RNAs from the Metabolic Syndrome In Men (METSIM) cohort (P = 6.11×10^-7 to 5.80×10^-4). These results were replicated in expression profiles of TMEM241 from the Multiple Tissue Human Expression Resource (MuTHER; n=856). The Mexican genome-wide association study signal for high serum triglycerides on chromosome 18q11.2 harbors a regulatory single-nucleotide polymorphism, rs17259126, which disrupts normal hepatocyte nuclear factor 4 α binding and decreases the expression of the regional TMEM241 gene. Our data suggest that decreased transcript levels of TMEM241 contribute to increased triglyceride levels in Mexicans.
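
    The proxy-selection step above rests on the pairwise r-squared measure of linkage disequilibrium, with a threshold of 0.7. As a minimal sketch of that measure, assuming phased haplotype frequencies are already known: D = p_AB - p_A * p_B and r^2 = D^2 / (p_A (1 - p_A) p_B (1 - p_B)). The function names, variant labels and frequencies below are hypothetical and are not taken from the study.

```python
def ld_r2(p_ab, p_a, p_b):
    """Squared correlation between alleles A and B at two biallelic loci.

    p_ab: frequency of the haplotype carrying both A and B
    p_a, p_b: frequencies of allele A and allele B
    """
    d = p_ab - p_a * p_b  # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom <= 0:
        raise ValueError("both loci must be polymorphic")
    return d * d / denom

def proxies(lead_freq, candidates, threshold=0.7):
    """Names of candidate variants whose r^2 with the lead variant exceeds the threshold."""
    return [
        name
        for name, (p_ab, p_b) in candidates.items()
        if ld_r2(p_ab, lead_freq, p_b) > threshold
    ]

# Hypothetical frequencies: (haplotype frequency with lead allele, allele frequency).
candidates = {"variant_x": (0.28, 0.30), "variant_y": (0.12, 0.40)}
print(proxies(0.30, candidates))
```

    In practice r^2 is usually estimated from genotype data by dedicated software; the closed form here only shows what the threshold is applied to.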

  7. Structure, evolution, and comparative genomics of tetraploid cotton based on a high-density genetic linkage map.

    PubMed

    Li, Ximei; Jin, Xin; Wang, Hantao; Zhang, Xianlong; Lin, Zhongxu

    2016-06-01

    A high-density linkage map was constructed using 1,885 newly obtained loci and 3,747 previously published loci, which included 5,152 loci with 4,696.03 cM in total length and 0.91 cM in mean distance. Homology analysis in the cotton genome further confirmed the 13 expected homologous chromosome pairs and revealed an obvious inversion on Chr10 or Chr20 and repeated inversions on Chr07 or Chr16. In addition, two reciprocal translocations between Chr02 and Chr03 and between Chr04 and Chr05 were confirmed. Comparative genomics between the tetraploid cotton and the diploid cottons showed that no major structural changes exist between DT and D chromosomes but rather between AT and A chromosomes. Blast analysis between the tetraploid cotton genome and the mixed genome of two diploid cottons showed that most AD chromosomes, regardless of whether they are from the AT or DT genome, preferentially matched with the corresponding homologous chromosome in the diploid A genome, and then the corresponding homologous chromosome in the diploid D genome, indicating that the diploid D genome underwent converted evolution by the diploid A genome to form the DT genome during polyploidization. In addition, the results reflected that a series of chromosomal translocations occurred among Chr01/Chr15, Chr02/Chr14, Chr03/Chr17, Chr04/Chr22, and Chr05/Chr19. © The Author 2016. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.

  8. GIGGLE: a search engine for large-scale integrated genome analysis

    PubMed Central

    Layer, Ryan M; Pedersen, Brent S; DiSera, Tonya; Marth, Gabor T; Gertz, Jason; Quinlan, Aaron R

    2018-01-01

    GIGGLE is a genomics search engine that identifies and ranks the significance of genomic loci shared between query features and thousands of genome interval files. GIGGLE (https://github.com/ryanlayer/giggle) scales to billions of intervals and is over three orders of magnitude faster than existing methods. Its speed extends the accessibility and utility of resources such as ENCODE, Roadmap Epigenomics, and GTEx by facilitating data integration and hypothesis generation. PMID:29309061
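
    The core query described here, finding which interval files share loci with a set of query features, can be sketched by brute force. This is only an illustration of the overlap-and-rank idea under assumed conventions (half-open `(chrom, start, end)` tuples, ranking by raw overlap count); it is not GIGGLE's indexing scheme or its significance statistics, and the names `overlaps` and `rank_files` are hypothetical.

```python
def overlaps(a, b):
    """True if two half-open intervals (chrom, start, end) share at least one base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def rank_files(query, files):
    """Count query/file interval overlaps and rank files by that count.

    `files` maps a file name to its list of intervals. This scan is quadratic;
    a real search engine would use a prebuilt index instead.
    """
    counts = {
        name: sum(1 for q in query for iv in intervals if overlaps(q, iv))
        for name, intervals in files.items()
    }
    # Highest count first; ties broken by name for a stable order.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


query = [("chr1", 100, 200), ("chr2", 50, 60)]
files = {
    "a": [("chr1", 150, 160), ("chr1", 190, 300), ("chr2", 60, 70)],
    "b": [("chr1", 0, 100)],
}
ranking = rank_files(query, files)
```

    Because the intervals are half-open, features that merely touch at a boundary (for example `chr1:0-100` and `chr1:100-200`) are not counted as overlapping.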

  9. Multibreed genome wide association can improve precision of mapping causative variants underlying milk production in dairy cattle

    PubMed Central

    2014-01-01

    Background: Genome wide association studies (GWAS) in most cattle breeds result in large genomic intervals of significant associations making it difficult to identify causal mutations. This is due to the extensive, low-level linkage disequilibrium within a cattle breed. As there is less linkage disequilibrium across breeds, multibreed GWAS may improve precision of causal variant mapping. Here we test this hypothesis in a Holstein and Jersey cattle data set with 17,925 individuals with records for production and functional traits and 632,003 SNP markers. Results: By using a cross validation strategy within the Holstein and Jersey data sets, we were able to identify and confirm a large number of QTL. As expected, the precision of mapping these QTL within the breeds was limited. In the multibreed analysis, we found that many loci were not segregating in both breeds. This was partly an artefact of power of the experiments, with the number of QTL shared between the breeds generally increasing with trait heritability. False discovery rates suggest that the multibreed analysis was less powerful than between breed analyses, in terms of how much genetic variance was explained by the detected QTL. However, the multibreed analysis could more accurately pinpoint the location of the well-described mutations affecting milk production such as DGAT1. Further, the significant SNP in the multibreed analysis were significantly enriched in gene regions, to a considerably greater extent than was observed in the single breed analyses. In addition, we have refined QTL on BTA5 and BTA19 to very small intervals and identified a small number of potential candidate genes in these, as well as in a number of other regions. Conclusion: Where QTL are segregating across breed, multibreed GWAS can refine these to reasonably small genomic intervals. However, such QTL appear to represent only a fraction of the genetic variation. Our results suggest a significant proportion of QTL affecting milk

  10. Genome-wide comparative analysis of four Indian Drosophila species.

    PubMed

    Mohanty, Sujata; Khanna, Radhika

    2017-12-01

    Comparative analysis of multiple genomes of closely or distantly related Drosophila species undoubtedly creates excitement among evolutionary biologists in exploring the genomic changes with an ecology and evolutionary perspective. We present herewith the de novo assembled whole genome sequences of four Drosophila species, D. bipectinata, D. takahashii, D. biarmipes and D. nasuta of Indian origin using Next Generation Sequencing technology on an Illumina platform along with their detailed assembly statistics. The comparative genomics analysis, e.g. gene predictions and annotations, functional and orthogroup analysis of coding sequences and genome wide SNP distribution were performed. The whole genome of Zaprionus indianus of Indian origin published earlier by us and the genome sequences of previously sequenced 12 Drosophila species available in the NCBI database were included in the analysis. The present work is a part of our ongoing genomics project of Indian Drosophila species.

  11. Genome-wide association studies and epigenome-wide association studies go together in cancer control

    PubMed Central

    Verma, Mukesh

    2016-01-01

    Completion of the human genome a decade ago laid the foundation for: using genetic information in assessing risk to identify individuals and populations that are likely to develop cancer, and designing treatments based on a person's genetic profiling (precision medicine). Genome-wide association studies (GWAS) completed during the past few years have identified risk-associated single nucleotide polymorphisms that can be used as screening tools in epidemiologic studies of a variety of tumor types. This led to the conduct of epigenome-wide association studies (EWAS). This article discusses the current status, challenges and research opportunities in GWAS and EWAS. Information gained from GWAS and EWAS has potential applications in cancer control and treatment. PMID:27079684

  12. A genome-wide association study reveals novel elite allelic variations in seed oil content of Brassica napus.

    PubMed

    Liu, Sheng; Fan, Chuchuan; Li, Jiana; Cai, Guangqin; Yang, Qingyong; Wu, Jian; Yi, Xinqi; Zhang, Chunyu; Zhou, Yongming

    2016-06-01

    A set of additive loci for seed oil content were identified using association mapping and one of the novel loci on the chromosome A5 was validated by linkage mapping. Increasing seed oil content is one of the most important goals in the breeding of oilseed crops including Brassica napus, yet the genetic basis for variations in this important trait remains unclear. By genome-wide association study of seed oil content using 521 B. napus accessions genotyped with the Brassica 60K SNP array, we identified 50 loci significantly associated with seed oil content using three statistical models, the general linear model, the mixed linear model and the Anderson-Darling test. Together, the identified loci could explain approximately 80 % of the total phenotypic variance, and 29 of these loci have not been reported previously. Furthermore, a novel locus on the chromosome A5 that could increase 1.5-1.7 % of seed oil content was validated in an independent bi-parental linkage population. Haplotype analysis showed that the favorable alleles for seed oil content exhibit cumulative effects. Our results thus provide valuable information for understanding the genetic control of seed oil content in B. napus and may facilitate marker-based breeding for a higher seed oil content in this important oil crop.

  13. Using relational databases for improved sequence similarity searching and large-scale genomic analyses.

    PubMed

    Mackey, Aaron J; Pearson, William R

    2004-10-01

    Relational databases are designed to integrate diverse types of information and manage large sets of search results, greatly simplifying genome-scale analyses. Relational databases are essential for management and analysis of large-scale sequence analyses, and can also be used to improve the statistical significance of similarity searches by focusing on subsets of sequence libraries most likely to contain homologs. This unit describes using relational databases to improve the efficiency of sequence similarity searching and to demonstrate various large-scale genomic analyses of homology-related data. This unit describes the installation and use of a simple protein sequence database, seqdb_demo, which is used as a basis for the other protocols. These include basic use of the database to generate a novel sequence library subset, how to extend and use seqdb_demo for the storage of sequence similarity search results and making use of various kinds of stored search results to address aspects of comparative genomic analysis.
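
    The two uses named in this abstract, generating a sequence library subset and storing similarity search results for later queries, can be sketched with Python's built-in `sqlite3` module. The schema below (tables `protein` and `hit`) and the toy rows are assumptions made for illustration; they are not the actual layout of seqdb_demo.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical two-table schema."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE protein (acc TEXT PRIMARY KEY, taxon TEXT, seq TEXT);
        CREATE TABLE hit (query TEXT, target TEXT, evalue REAL);
    """)
    con.executemany("INSERT INTO protein VALUES (?, ?, ?)", [
        ("P1", "human", "MKT"), ("P2", "human", "MKS"), ("P3", "mouse", "MKV"),
    ])
    con.executemany("INSERT INTO hit VALUES (?, ?, ?)", [
        ("P1", "P3", 1e-30), ("P1", "P2", 1e-5), ("P2", "P3", 0.5),
    ])
    return con


def fasta_subset(con, taxon):
    """Export the sequences of one taxon as FASTA text (a library subset)."""
    rows = con.execute(
        "SELECT acc, seq FROM protein WHERE taxon = ? ORDER BY acc", (taxon,))
    return "".join(f">{acc}\n{seq}\n" for acc, seq in rows)


def best_hits(con, max_evalue):
    """Return {query: (target, evalue)} for the best stored hit under a cutoff."""
    best = {}
    rows = con.execute(
        "SELECT query, target, evalue FROM hit WHERE evalue <= ? "
        "ORDER BY query, evalue", (max_evalue,))
    for query, target, evalue in rows:
        best.setdefault(query, (target, evalue))  # first row per query is the best
    return best


con = build_demo_db()
mouse_library = fasta_subset(con, "mouse")
significant = best_hits(con, 1e-3)
```

    Searching a smaller, targeted subset reduces the number of comparisons, which is the mechanism by which the abstract says statistical significance can improve.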

  14. Bombyx mori Transcription Factors: Genome-Wide Identification, Expression Profiles and Response to Pathogens by Microarray Analysis

    PubMed Central

    Huang, Lulin; Cheng, Tingcai; Xu, Pingzhen; Fang, Ting; Xia, Qingyou

    2012-01-01

    Transcription factors are present in all living organisms, and play vital roles in a wide range of biological processes. Studies of transcription factors will help reveal the complex regulation mechanism of organisms. So far, hundreds of domains have been identified that show transcription factor activity. Here, 281 reported transcription factor domains were used as seeds to search the transcription factors in genomes of Bombyx mori L. (Lepidoptera: Bombycidae) and four other model insects. Overall, 666 transcription factors including 36 basal factors and 630 other factors were identified in B. mori genome, which accounted for 4.56% of its genome. The silkworm transcription factors' expression profiles were investigated in relation to multiple tissues, developmental stages, sexual dimorphism, and responses to oral infection by pathogens and direct bacterial injection. These all provided rich clues for revealing the transcriptional regulation mechanism of silkworm organ differentiation, growth and development, sexual dimorphism, and response to pathogen infection. PMID:22943524

  15. A genome-wide search for type 2 diabetes susceptibility genes in an extended Arab family.

    PubMed

    Al Safar, Habiba S; Cordell, Heather J; Jafer, Osman; Anderson, Denise; Jamieson, Sarra E; Fakiola, Michaela; Khazanehdari, Kamal; Tay, Guan K; Blackwell, Jenefer M

    2013-11-01

    Twenty percent of people aged 20 to 79 have type 2 diabetes (T2D) in the United Arab Emirates (UAE). Genome-wide association studies (GWAS) to identify genes for T2D have not been reported for Arab countries. We performed a discovery GWAS in an extended UAE family (N=178; 66 diabetic; 112 healthy) genotyped on the Illumina Human 660 Quad Beadchip, with independent replication of top hits in 116 cases and 199 controls. Power to achieve genome-wide significance (commonly P=5×10⁻⁸) was therefore limited. Nevertheless, transmission disequilibrium testing in FBAT identified top hits at chromosome 4p12-p13 (KCTD8: rs4407541, P=9.70×10⁻⁶; GABRB1: rs10517178/rs1372491, P=4.19×10⁻⁶) and 14q13 (PRKD1: rs10144903, P=3.92×10⁻⁶), supported by analysis using a linear mixed model approximation in GenABEL (4p12-p13 GABRG1/GABRA2: rs7662743, Padj-agesex=2.06×10⁻⁵; KCTD8: rs4407541, Padj-agesex=1.42×10⁻⁴; GABRB1: rs10517178/rs1372491, Padj-agesex=0.027; 14q13 PRKD1: rs10144903, Padj-agesex=6.95×10⁻⁵). SNPs across GABRG1/GABRA2 did not replicate, whereas more proximal SNPs rs7679715 (Padj-agesex=0.030) and rs2055942 (Padj-agesex=0.022) at COX7B2/GABRA4 did, in addition to a trend distally at KCTD8 (rs4695718: Padj-agesex=0.096). Modelling of discovery and replication data supports independent signals at GABRA4 (rs2055942: Padj-agesex-combined=3×10⁻⁴) and at KCTD8 (rs4695718: Padj-agesex-combined=2×10⁻⁴). Replication was observed for PRKD1 rs1953722 (proxy for rs10144903; Padj-agesex=0.031; Padj-agesex-combined=2×10⁻⁴). These genes may provide important functional leads in understanding disease pathogenesis in this population. © 2013 John Wiley & Sons Ltd/University College London.

  16. Genomic diversity and population structure of three autochthonous Greek sheep breeds assessed with genome-wide DNA arrays.

    PubMed

    Michailidou, S; Tsangaris, G; Fthenakis, G C; Tzora, A; Skoufos, I; Karkabounas, S C; Banos, G; Argiriou, A; Arsenos, G

    2018-06-01

    In the present study, genome-wide genotyping was applied to characterize the genetic diversity and population structure of three autochthonous Greek breeds: Boutsko, Karagouniko and Chios. Dairy sheep are among the most significant livestock species in Greece, numbering approximately 9 million animals, which are characterized by large phenotypic variation and reared under various farming systems. A total of 96 animals were genotyped with the Illumina OvineSNP50K microarray beadchip, to study the population structure of the breeds and develop a specialized panel of single-nucleotide polymorphisms (SNPs), which could distinguish one breed from the others. Quality control on the dataset resulted in 46,125 SNPs, which were used to evaluate the genetic structure of the breeds. Population structure was assessed through principal component analysis (PCA) and admixture analysis, whereas inbreeding was estimated based on runs of homozygosity (ROHs) coefficients, genomic relationship matrix inbreeding coefficients (F_GRM) and patterns of linkage disequilibrium (LD). Associations between SNPs and breeds were analyzed with different inheritance models, to identify SNPs that distinguish among the breeds. Results showed high levels of genetic heterogeneity in the three breeds. Genetic distances among breeds were modest, despite their different ancestries. Chios and Karagouniko breeds were more genetically related to each other compared to Boutsko. Analysis revealed 3802 candidate SNPs that can be used to identify two-breed crosses and purebred animals. The present study provides, for the first time, data on the genetic background of three Greek indigenous dairy sheep breeds as well as a specialized marker panel that can be applied for traceability purposes as well as targeted genetic improvement schemes and conservation programs.

  17. Microfluidics for genome-wide studies involving next generation sequencing

    PubMed Central

    Murphy, Travis W.; Lu, Chang

    2017-01-01

    Next-generation sequencing (NGS) has revolutionized how molecular biology studies are conducted. Its decreasing cost and increasing throughput permit profiling of genomic, transcriptomic, and epigenomic features for a wide range of applications. Microfluidics has been proven to be highly complementary to NGS technology with its unique capabilities for handling small volumes of samples and providing platforms for automation, integration, and multiplexing. In this article, we review recent progress on applying microfluidics to facilitate genome-wide studies. We emphasize several technical aspects of NGS and how they benefit from coupling with microfluidic technology. We also summarize recent efforts on developing microfluidic technology for genomic, transcriptomic, and epigenomic studies, with emphasis on single cell analysis. We envision rapid growth in these directions, driven by the need to test scarce primary cell samples from patients in the context of precision medicine. PMID:28396707

  18. Genome-wide analysis links NFATC2 with asparaginase hypersensitivity

    PubMed Central

    Fernandez, Christian A.; Smith, Colton; Yang, Wenjian; Mullighan, Charles G.; Qu, Chunxu; Larsen, Eric; Bowman, W. Paul; Liu, Chengcheng; Ramsey, Laura B.; Chang, Tamara; Karol, Seth E.; Loh, Mignon L.; Raetz, Elizabeth A.; Winick, Naomi J.; Hunger, Stephen P.; Carroll, William L.; Jeha, Sima; Pui, Ching-Hon; Evans, William E.; Devidas, Meenakshi

    2015-01-01

    Asparaginase is used to treat acute lymphoblastic leukemia (ALL); however, hypersensitivity reactions can lead to suboptimal asparaginase exposure. Our objective was to use a genome-wide approach to identify loci associated with asparaginase hypersensitivity in children with ALL enrolled on St. Jude Children’s Research Hospital (SJCRH) protocols Total XIIIA (n = 154), Total XV (n = 498), and Total XVI (n = 271), or Children’s Oncology Group protocols POG 9906 (n = 222) and AALL0232 (n = 2163). Germline DNA was genotyped using the Affymetrix 500K, Affymetrix 6.0, or the Illumina Exome BeadChip array. In multivariate logistic regression, the intronic rs6021191 variant in nuclear factor of activated T cells 2 (NFATC2) had the strongest association with hypersensitivity (P = 4.1 × 10⁻⁸; odds ratio [OR] = 3.11). RNA-seq data available from 65 SJCRH ALL tumor samples and 52 Yoruba HapMap samples showed that samples carrying the rs6021191 variant had higher NFATC2 expression compared with noncarriers (P = 1.1 × 10⁻³ and 0.03, respectively). The top ranked nonsynonymous polymorphism was rs17885382 in HLA-DRB1 (P = 3.2 × 10⁻⁶; OR = 1.63), which is in near complete linkage disequilibrium with the HLA-DRB1*07:01 allele we previously observed in a candidate gene study. The strongest risk factors for asparaginase allergy are variants within genes regulating the immune response. PMID:25987655

  19. The Glyphosate-Based Herbicide Roundup Does not Elevate Genome-Wide Mutagenesis of Escherichia coli.

    PubMed

    Tincher, Clayton; Long, Hongan; Behringer, Megan; Walker, Noah; Lynch, Michael

    2017-10-05

    Mutations induced by pollutants may promote pathogen evolution, for example by accelerating mutations conferring antibiotic resistance. Generally, evaluating the genome-wide mutagenic effects of long-term sublethal pollutant exposure at single-nucleotide resolution is extremely difficult. To overcome this technical barrier, we use the mutation accumulation/whole-genome sequencing (MA/WGS) method as a mutagenicity test, to quantitatively evaluate genome-wide mutagenesis of Escherichia coli after long-term exposure to a wide gradient of the glyphosate-based herbicide (GBH) Roundup Concentrate Plus. The genome-wide mutation rate decreases as GBH concentration increases, suggesting that even long-term GBH exposure does not compromise the genome stability of bacteria. Copyright © 2017 Tincher et al.

  20. Linkage Analysis in Autoimmune Addison's Disease: NFATC1 as a Potential Novel Susceptibility Locus.

    PubMed

    Mitchell, Anna L; Bøe Wolff, Anette; MacArthur, Katie; Weaver, Jolanta U; Vaidya, Bijay; Erichsen, Martina M; Darlay, Rebecca; Husebye, Eystein S; Cordell, Heather J; Pearce, Simon H S

    2015-01-01

    Autoimmune Addison's disease (AAD) is a rare, highly heritable autoimmune endocrinopathy. It is possible that there may be some highly penetrant variants which confer disease susceptibility that have yet to be discovered. DNA samples from 23 multiplex AAD pedigrees from the UK and Norway (50 cases, 67 controls) were genotyped on the Affymetrix SNP 6.0 array. Linkage analysis was performed using Merlin. EMMAX was used to carry out a genome-wide association analysis comparing the familial AAD cases to 2706 UK WTCCC controls. To explore some of the linkage findings further, a replication study was performed by genotyping 64 SNPs in two of the four linked regions (chromosomes 7 and 18), on the Sequenom iPlex platform in three European AAD case-control cohorts (1097 cases, 1117 controls). The data were analysed using a meta-analysis approach. In a parametric analysis, applying a rare dominant model, loci on chromosomes 7, 9 and 18 had LOD scores >2.8. In a non-parametric analysis, a locus corresponding to the HLA region on chromosome 6, known to be associated with AAD, had a LOD score >3.0. In the genome-wide association analysis, a SNP cluster on chromosome 2 and a pair of SNPs on chromosome 6 were associated with AAD (P < 5×10⁻⁷). A meta-analysis of the replication study data demonstrated that three chromosome 18 SNPs were associated with AAD, including a non-synonymous variant in the NFATC1 gene. This linkage study has implicated a number of novel chromosomal regions in the pathogenesis of AAD in multiplex AAD families and adds further support to the role of HLA in AAD. The genome-wide association analysis has also identified a region of interest on chromosome 2. A replication study has demonstrated that the NFATC1 gene is worthy of future investigation; however, each of the regions identified requires further, systematic analysis.

  1. A linkage disequilibrium perspective on the genetic mosaic of speciation in two hybridizing Mediterranean white oaks

    PubMed Central

    Goicoechea, P G; Herrán, A; Durand, J; Bodénès, C; Plomion, C; Kremer, A

    2015-01-01

    We analyzed the genetic mosaic of speciation in two hybridizing Mediterranean white oaks from the Iberian Peninsula (Quercus faginea Lamb. and Quercus pyrenaica Willd.). The two species show ecological divergence in flowering phenology, leaf morphology and composition, and in their basic or acidic soil preferences. Ninety expressed sequence tag-simple sequence repeats (EST-SSRs) and eight nuclear SSRs were genotyped in 96 trees from each species. Genotyping was designed in two steps. First, we used 69 markers evenly distributed over the 12 linkage groups (LGs) of the oak linkage map to confirm the species genetic identity of the sampled genotypes, and searched for differentiation outliers. Then, we genotyped 29 additional markers from the chromosome bins containing the outliers and repeated the multilocus scans. We found one or two additional outliers within four saturated bins, thus confirming that outliers are organized into clusters. Linkage disequilibrium (LD) was extensive, even for loosely linked and for independent markers. Consequently, score tests for association between two-marker haplotypes and the ‘species trait’ showed a broad genomic divergence, although substantial variation across the genome and within LGs was also observed. We discuss the influence of several confounding effects on neutrality tests and review the evolutionary processes leading to extensive LD. Finally, we examine how LD analyses within regions that contain outlier clusters and quantitative trait loci can help to identify regions of divergence and/or genomic hitchhiking in the light of predictions from ecological speciation theory. PMID:25515016

  2. SOPanG: online text searching over a pan-genome.

    PubMed

    Cislak, Aleksander; Grabowski, Szymon; Holub, Jan

    2018-06-22

    The many thousands of high-quality genomes available nowadays imply a shift from single-genome to pan-genomic analyses. A basic algorithmic building block for such a scenario is online search over a collection of similar texts, a problem with surprisingly few solutions presented so far. We present SOPanG, a simple tool for exact pattern matching over an elastic-degenerate string, a recently proposed simplified model for the pan-genome. Thanks to bit-parallelism, it achieves pattern matching speeds above 400MB/s, more than an order of magnitude higher than that of other software. SOPanG is available for free from: https://github.com/MrAlexSee/sopang. Supplementary data are available at Bioinformatics online.
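
    The bit-parallelism mentioned here can be illustrated with the classic Shift-And algorithm for exact matching over an ordinary string. This is a swapped-in, simpler technique shown only to convey the idea of tracking all partial matches in one machine word; it does not handle elastic-degenerate strings and is not SOPanG's own algorithm. The function name `shift_and` is hypothetical.

```python
def shift_and(text, pattern):
    """Return the start positions of all exact occurrences of `pattern` in `text`.

    Bit i of `state` is set when pattern[0..i] matches the text ending at the
    current position, so one shift and one AND update every partial match.
    """
    m = len(pattern)
    if m == 0:
        return []
    # masks[ch] has bit i set wherever pattern[i] == ch.
    masks = {}
    for i, ch in enumerate(pattern):
        masks[ch] = masks.get(ch, 0) | (1 << i)
    accept = 1 << (m - 1)      # bit that signals a full match
    state = 0
    starts = []
    for pos, ch in enumerate(text):
        # Extend every partial match by one character and start a new one.
        state = ((state << 1) | 1) & masks.get(ch, 0)
        if state & accept:
            starts.append(pos - m + 1)
    return starts


hits = shift_and("ACGTACGT", "CGT")
```

    Python integers are unbounded, so the pattern length is not limited by a word size here; a compiled implementation would typically restrict or split patterns to fit machine words.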

  3. Exploiting Genome Structure in Association Analysis

    PubMed Central

    Kim, Seyoung

    2014-01-01

    A genome-wide association study involves examining a large number of single-nucleotide polymorphisms (SNPs) to identify SNPs that are significantly associated with the given phenotype, while trying to reduce the false positive rate. Although haplotype-based association methods have been proposed to accommodate correlation information across nearby SNPs that are in linkage disequilibrium, none of these methods directly incorporated structural information such as recombination events along the chromosome. In this paper, we propose a new approach called stochastic block lasso for association mapping that exploits prior knowledge on linkage disequilibrium structure in the genome such as recombination rates and distances between adjacent SNPs in order to increase the power of detecting true associations while reducing false positives. Following a typical linear regression framework with the genotypes as inputs and the phenotype as output, our proposed method employs a sparsity-enforcing Laplacian prior for the regression coefficients, augmented by a first-order Markov process along the sequence of SNPs that incorporates the prior information on the linkage disequilibrium structure. The Markov-chain prior models the structural dependencies between a pair of adjacent SNPs, and allows us to look for association SNPs in a coupled manner, combining strength from multiple nearby SNPs. Our results on HapMap-simulated datasets and mouse datasets show that there is a significant advantage in incorporating the prior knowledge on linkage disequilibrium structure for marker identification under whole-genome association. PMID:21548809

  4. A genome-wide study of common SNPs and CNVs in cognitive performance in the CANTAB

    PubMed Central

    Need, Anna C.; Attix, Deborah K.; McEvoy, Jill M.; Cirulli, Elizabeth T.; Linney, Kristen L.; Hunt, Priscilla; Ge, Dongliang; Heinzen, Erin L.; Maia, Jessica M.; Shianna, Kevin V.; Weale, Michael E.; Cherkas, Lynn F.; Clement, Gail; Spector, Tim D.; Gibson, Greg; Goldstein, David B.

    2009-01-01

    Psychiatric disorders such as schizophrenia are commonly accompanied by cognitive impairments that are treatment resistant and crucial to functional outcome. There has been great interest in studying cognitive measures as endophenotypes for psychiatric disorders, with the hope that their genetic basis will be clearer. To investigate this, we performed a genome-wide association study involving 11 cognitive phenotypes from the Cambridge Neuropsychological Test Automated Battery. We showed these measures to be heritable by comparing the correlation in 100 monozygotic and 100 dizygotic twin pairs. The full battery was tested in ∼750 subjects, and for spatial and verbal recognition memory, we investigated a further 500 individuals to search for smaller genetic effects. We were unable to find any genome-wide significant associations with either SNPs or common copy number variants. Nor could we formally replicate any polymorphism that has been previously associated with cognition, although we found a weak signal of lower than expected P-values for variants in a set of 10 candidate genes. We additionally investigated SNPs in genomic loci that have been shown to harbor rare variants that associate with neuropsychiatric disorders, to see if they showed any suggestion of association when considered as a separate set. Only NRXN1 showed evidence of significant association with cognition. These results suggest that common genetic variation does not strongly influence cognition in healthy subjects and that cognitive measures do not represent a more tractable genetic trait than clinical endpoints such as schizophrenia. We discuss a possible role for rare variation in cognitive genomics. PMID:19734545

  5. A genome-wide study of common SNPs and CNVs in cognitive performance in the CANTAB.

    PubMed

    Need, Anna C; Attix, Deborah K; McEvoy, Jill M; Cirulli, Elizabeth T; Linney, Kristen L; Hunt, Priscilla; Ge, Dongliang; Heinzen, Erin L; Maia, Jessica M; Shianna, Kevin V; Weale, Michael E; Cherkas, Lynn F; Clement, Gail; Spector, Tim D; Gibson, Greg; Goldstein, David B

    2009-12-01

    Psychiatric disorders such as schizophrenia are commonly accompanied by cognitive impairments that are treatment resistant and crucial to functional outcome. There has been great interest in studying cognitive measures as endophenotypes for psychiatric disorders, with the hope that their genetic basis will be clearer. To investigate this, we performed a genome-wide association study involving 11 cognitive phenotypes from the Cambridge Neuropsychological Test Automated Battery. We showed these measures to be heritable by comparing the correlation in 100 monozygotic and 100 dizygotic twin pairs. The full battery was tested in approximately 750 subjects, and for spatial and verbal recognition memory, we investigated a further 500 individuals to search for smaller genetic effects. We were unable to find any genome-wide significant associations with either SNPs or common copy number variants. Nor could we formally replicate any polymorphism that has been previously associated with cognition, although we found a weak signal of lower than expected P-values for variants in a set of 10 candidate genes. We additionally investigated SNPs in genomic loci that have been shown to harbor rare variants that associate with neuropsychiatric disorders, to see if they showed any suggestion of association when considered as a separate set. Only NRXN1 showed evidence of significant association with cognition. These results suggest that common genetic variation does not strongly influence cognition in healthy subjects and that cognitive measures do not represent a more tractable genetic trait than clinical endpoints such as schizophrenia. We discuss a possible role for rare variation in cognitive genomics.

  6. Inferring genome-wide interplay landscape between DNA methylation and transcriptional regulation.

    PubMed

    Tang, Binhua; Wang, Xin

    2015-01-01

    DNA methylation and transcriptional regulation play important roles in cancer cell development and differentiation processes. Based on the currently available cell line profiling information from the ENCODE Consortium, we propose a Bayesian inference model to infer and construct a genome-wide interaction landscape between DNA methylation and transcriptional regulation, which sheds light on the underlying complex functional mechanisms important within the human cancer and disease context. For the first time, we select all the currently available cell lines (≥20) and transcription factors (≥80) profiling information from the ENCODE Consortium portal. Through the integration of those genome-wide profiling sources, our genome-wide analysis detects multiple functional loci of interest, and indicates that DNA methylation is cell- and region-specific, due to the interplay mechanisms with transcription regulatory activities. We validate our analysis results with the corresponding RNA-sequencing technique for those detected genomic loci. Our results provide novel and meaningful insights for the interplay mechanisms of transcriptional regulation and gene expression for the human cancer and disease studies.

  7. Multiancestry genome-wide association study of 520,000 subjects identifies 32 loci associated with stroke and stroke subtypes.

    PubMed

    Malik, Rainer; Chauhan, Ganesh; Traylor, Matthew; Sargurupremraj, Muralidharan; Okada, Yukinori; Mishra, Aniket; Rutten-Jacobs, Loes; Giese, Anne-Katrin; van der Laan, Sander W; Gretarsdottir, Solveig; Anderson, Christopher D; Chong, Michael; Adams, Hieab H H; Ago, Tetsuro; Almgren, Peter; Amouyel, Philippe; Ay, Hakan; Bartz, Traci M; Benavente, Oscar R; Bevan, Steve; Boncoraglio, Giorgio B; Brown, Robert D; Butterworth, Adam S; Carrera, Caty; Carty, Cara L; Chasman, Daniel I; Chen, Wei-Min; Cole, John W; Correa, Adolfo; Cotlarciuc, Ioana; Cruchaga, Carlos; Danesh, John; de Bakker, Paul I W; DeStefano, Anita L; den Hoed, Marcel; Duan, Qing; Engelter, Stefan T; Falcone, Guido J; Gottesman, Rebecca F; Grewal, Raji P; Gudnason, Vilmundur; Gustafsson, Stefan; Haessler, Jeffrey; Harris, Tamara B; Hassan, Ahamad; Havulinna, Aki S; Heckbert, Susan R; Holliday, Elizabeth G; Howard, George; Hsu, Fang-Chi; Hyacinth, Hyacinth I; Ikram, M Arfan; Ingelsson, Erik; Irvin, Marguerite R; Jian, Xueqiu; Jiménez-Conde, Jordi; Johnson, Julie A; Jukema, J Wouter; Kanai, Masahiro; Keene, Keith L; Kissela, Brett M; Kleindorfer, Dawn O; Kooperberg, Charles; Kubo, Michiaki; Lange, Leslie A; Langefeld, Carl D; Langenberg, Claudia; Launer, Lenore J; Lee, Jin-Moo; Lemmens, Robin; Leys, Didier; Lewis, Cathryn M; Lin, Wei-Yu; Lindgren, Arne G; Lorentzen, Erik; Magnusson, Patrik K; Maguire, Jane; Manichaikul, Ani; McArdle, Patrick F; Meschia, James F; Mitchell, Braxton D; Mosley, Thomas H; Nalls, Michael A; Ninomiya, Toshiharu; O'Donnell, Martin J; Psaty, Bruce M; Pulit, Sara L; Rannikmäe, Kristiina; Reiner, Alexander P; Rexrode, Kathryn M; Rice, Kenneth; Rich, Stephen S; Ridker, Paul M; Rost, Natalia S; Rothwell, Peter M; Rotter, Jerome I; Rundek, Tatjana; Sacco, Ralph L; Sakaue, Saori; Sale, Michele M; Salomaa, Veikko; Sapkota, Bishwa R; Schmidt, Reinhold; Schmidt, Carsten O; Schminke, Ulf; Sharma, Pankaj; Slowik, Agnieszka; Sudlow, Cathie L M; Tanislav, Christian; Tatlisumak, Turgut; 
Taylor, Kent D; Thijs, Vincent N S; Thorleifsson, Gudmar; Thorsteinsdottir, Unnur; Tiedt, Steffen; Trompet, Stella; Tzourio, Christophe; van Duijn, Cornelia M; Walters, Matthew; Wareham, Nicholas J; Wassertheil-Smoller, Sylvia; Wilson, James G; Wiggins, Kerri L; Yang, Qiong; Yusuf, Salim; Bis, Joshua C; Pastinen, Tomi; Ruusalepp, Arno; Schadt, Eric E; Koplev, Simon; Björkegren, Johan L M; Codoni, Veronica; Civelek, Mete; Smith, Nicholas L; Trégouët, David A; Christophersen, Ingrid E; Roselli, Carolina; Lubitz, Steven A; Ellinor, Patrick T; Tai, E Shyong; Kooner, Jaspal S; Kato, Norihiro; He, Jiang; van der Harst, Pim; Elliott, Paul; Chambers, John C; Takeuchi, Fumihiko; Johnson, Andrew D; Sanghera, Dharambir K; Melander, Olle; Jern, Christina; Strbian, Daniel; Fernandez-Cadenas, Israel; Longstreth, W T; Rolfs, Arndt; Hata, Jun; Woo, Daniel; Rosand, Jonathan; Pare, Guillaume; Hopewell, Jemma C; Saleheen, Danish; Stefansson, Kari; Worrall, Bradford B; Kittner, Steven J; Seshadri, Sudha; Fornage, Myriam; Markus, Hugh S; Howson, Joanna M M; Kamatani, Yoichiro; Debette, Stephanie; Dichgans, Martin; Amin, Najaf; Aparicio, Hugo S; Arnett, Donna K; Attia, John; Beiser, Alexa S; Berr, Claudine; Buring, Julie E; Bustamante, Mariana; Caso, Valeria; Cheng, Yu-Ching; Choi, Seung Hoan; Chowhan, Ayesha; Cullell, Natalia; Dartigues, Jean-François; Delavaran, Hossein; Delgado, Pilar; Dörr, Marcus; Engström, Gunnar; Ford, Ian; Gurpreet, Wander S; Hamsten, Anders; Heitsch, Laura; Hozawa, Atsushi; Ibanez, Laura; Ilinca, Andreea; Ingelsson, Martin; Iwasaki, Motoki; Jackson, Rebecca D; Jood, Katarina; Jousilahti, Pekka; Kaffashian, Sara; Kalra, Lalit; Kamouchi, Masahiro; Kitazono, Takanari; Kjartansson, Olafur; Kloss, Manja; Koudstaal, Peter J; Krupinski, Jerzy; Labovitz, Daniel L; Laurie, Cathy C; Levi, Christopher R; Li, Linxin; Lind, Lars; Lindgren, Cecilia M; Lioutas, Vasileios; Liu, Yong Mei; Lopez, Oscar L; Makoto, Hirata; Martinez-Majander, Nicolas; Matsuda, Koichi; Minegishi, Naoko; Montaner, Joan; Morris, Andrew P; Muiño, Elena; Müller-Nurasyid, Martina; Norrving, Bo; Ogishima, Soichi; Parati, Eugenio A; Peddareddygari, Leema Reddy; Pedersen, Nancy L; Pera, Joanna; Perola, Markus; Pezzini, Alessandro; Pileggi, Silvana; Rabionet, Raquel; Riba-Llena, Iolanda; Ribasés, Marta; Romero, Jose R; Roquer, Jaume; Rudd, Anthony G; Sarin, Antti-Pekka; Sarju, Ralhan; Sarnowski, Chloe; Sasaki, Makoto; Satizabal, Claudia L; Satoh, Mamoru; Sattar, Naveed; Sawada, Norie; Sibolt, Gerli; Sigurdsson, Ásgeir; Smith, Albert; Sobue, Kenji; Soriano-Tárraga, Carolina; Stanne, Tara; Stine, O Colin; Stott, David J; Strauch, Konstantin; Takai, Takako; Tanaka, Hideo; Tanno, Kozo; Teumer, Alexander; Tomppo, Liisa; Torres-Aguila, Nuria P; Touze, Emmanuel; Tsugane, Shoichiro; Uitterlinden, Andre G; Valdimarsson, Einar M; van der Lee, Sven J; Völzke, Henry; Wakai, Kenji; Weir, David; Williams, Stephen R; Wolfe, Charles D A; Wong, Quenna; Xu, Huichun; Yamaji, Taiki

    2018-04-01

    Stroke has multiple etiologies, but the underlying genes and pathways are largely unknown. We conducted a multiancestry genome-wide-association meta-analysis in 521,612 individuals (67,162 cases and 454,450 controls) and discovered 22 new stroke risk loci, bringing the total to 32. We further found shared genetic variation with related vascular traits, including blood pressure, cardiac traits, and venous thromboembolism, at individual loci (n = 18), and using genetic risk scores and linkage-disequilibrium-score regression. Several loci exhibited distinct association and pleiotropy patterns for etiological stroke subtypes. Eleven new susceptibility loci indicate mechanisms not previously implicated in stroke pathophysiology, with prioritization of risk variants and genes accomplished through bioinformatics analyses using extensive functional datasets. Stroke risk loci were significantly enriched in drug targets for antithrombotic therapy.

  8. Multiancestry genome-wide association study of 520,000 subjects identifies 32 loci associated with stroke and stroke subtypes

    PubMed Central

    Malik, Rainer; Chauhan, Ganesh; Traylor, Matthew; Sargurupremraj, Muralidharan; Okada, Yukinori; Mishra, Aniket; Rutten-Jacobs, Loes; Giese, Anne-Katrin; van der Laan, Sander W.; Gretarsdottir, Solveig; Anderson, Christopher D.; Chong, Michael; Adams, Hieab H. H.; Ago, Tetsuro; Almgren, Peter; Amouyel, Philippe; Ay, Hakan; Bartz, Traci M.; Benavente, Oscar R.; Bevan, Steve; Boncoraglio, Giorgio B.; Brown, Robert D.; Butterworth, Adam S.; Carrera, Caty; Carty, Cara L.; Chasman, Daniel I.; Chen, Wei-Min; Cole, John W.; Correa, Adolfo; Cotlarciuc, Ioana; Cruchaga, Carlos; Danesh, John; de Bakker, Paul I. W.; DeStefano, Anita L.; den Hoed, Marcel; Duan, Qing; Engelter, Stefan T.; Falcone, Guido J.; Gottesman, Rebecca F.; Grewal, Raji P.; Gudnason, Vilmundur; Gustafsson, Stefan; Haessler, Jeffrey; Harris, Tamara B.; Hassan, Ahamad; Havulinna, Aki S.; Heckbert, Susan R.; Holliday, Elizabeth G.; Howard, George; Hsu, Fang-Chi; Hyacinth, Hyacinth I.; Ikram, M. Arfan; Ingelsson, Erik; Irvin, Marguerite R.; Jian, Xueqiu; Jimenez-Conde, Jordi; Johnson, Julie A.; Jukema, J. Wouter; Kanai, Masahiro; Keene, Keith L.; Kissela, Brett M.; Kleindorfer, Dawn O.; Kooperberg, Charles; Kubo, Michiaki; Lange, Leslie A.; Langefeld, Carl D.; Langenberg, Claudia; Launer, Lenore J.; Lee, Jin-Moo; Lemmens, Robin; Leys, Didier; Lewis, Cathryn M.; Lin, Wei-Yu; Lindgren, Arne G.; Lorentzen, Erik; Magnusson, Patrik K.; Maguire, Jane; Manichaikul, Ani; McArdle, Patrick F.; Meschia, James F.; Mitchell, Braxton D.; Mosley, Thomas H.; Nalls, Michael A.; Ninomiya, Toshiharu; O’Donnell, Martin J.; Psaty, Bruce M.; Pulit, Sara L.; Rannikmäe, Kristiina; Reiner, Alexander P.; Rexrode, Kathryn M.; Rice, Kenneth; Rich, Stephen S.; Ridker, Paul M.; Rost, Natalia S.; Rothwell, Peter M.; Rotter, Jerome I.; Rundek, Tatjana; Sacco, Ralph L.; Sakaue, Saori; Sale, Michele M.; Salomaa, Veikko; Sapkota, Bishwa R.; Schmidt, Reinhold; Schmidt, Carsten O.; Schminke, Ulf; Sharma, Pankaj; Slowik, Agnieszka; Sudlow, Cathie L. M.; Tanislav, Christian; Tatlisumak, Turgut; Taylor, Kent D.; Thijs, Vincent N. S.; Thorleifsson, Gudmar; Thorsteinsdottir, Unnur; Tiedt, Steffen; Trompet, Stella; Tzourio, Christophe; van Duijn, Cornelia M.; Walters, Matthew; Wareham, Nicholas J.; Wassertheil-Smoller, Sylvia; Wilson, James G.; Wiggins, Kerri L.; Yang, Qiong; Yusuf, Salim; Bis, Joshua C.; Pastinen, Tomi; Ruusalepp, Arno; Schadt, Eric E.; Koplev, Simon; Björkegren, Johan L. M.; Codoni, Veronica; Civelek, Mete; Smith, Nicholas L.; Tregouet, David A.; Christophersen, Ingrid E.; Roselli, Carolina; Lubitz, Steven A.; Ellinor, Patrick T.; Tai, E. Shyong; Kooner, Jaspal S.; Kato, Norihiro; He, Jiang; van der Harst, Pim; Elliott, Paul; Chambers, John C.; Takeuchi, Fumihiko; Johnson, Andrew D.; Sanghera, Dharambir K.; Melander, Olle; Jern, Christina; Strbian, Daniel; Fernandez-Cadenas, Israel; Longstreth, W. T.; Rolfs, Arndt; Hata, Jun; Woo, Daniel; Rosand, Jonathan; Pare, Guillaume; Hopewell, Jemma C.; Saleheen, Danish; Stefansson, Kari; Worrall, Bradford B.; Kittner, Steven J.; Seshadri, Sudha; Fornage, Myriam; Markus, Hugh S.; Howson, Joanna M. M.; Kamatani, Yoichiro; Debette, Stephanie; Dichgans, Martin

    2018-01-01

    Stroke has multiple etiologies, but the underlying genes and pathways are largely unknown. We conducted a multiancestry genome-wide-association meta-analysis in 521,612 individuals (67,162 cases and 454,450 controls) and discovered 22 new stroke risk loci, bringing the total to 32. We further found shared genetic variation with related vascular traits, including blood pressure, cardiac traits, and venous thromboembolism, at individual loci (n = 18), and using genetic risk scores and linkage-disequilibrium-score regression. Several loci exhibited distinct association and pleiotropy patterns for etiological stroke subtypes. Eleven new susceptibility loci indicate mechanisms not previously implicated in stroke pathophysiology, with prioritization of risk variants and genes accomplished through bioinformatics analyses using extensive functional datasets. Stroke risk loci were significantly enriched in drug targets for antithrombotic therapy. PMID:29531354

  9. Genome-wide association and genomic prediction of resistance to viral nervous necrosis in European sea bass (Dicentrarchus labrax) using RAD sequencing.

    PubMed

    Palaiokostas, Christos; Cariou, Sophie; Bestin, Anastasia; Bruant, Jean-Sebastien; Haffray, Pierrick; Morin, Thierry; Cabon, Joëlle; Allal, François; Vandeputte, Marc; Houston, Ross D

    2018-06-08

    European sea bass (Dicentrarchus labrax) is one of the most important species for European aquaculture. Viral nervous necrosis (VNN), commonly caused by the redspotted grouper nervous necrosis virus (RGNNV), can result in high levels of morbidity and mortality, mainly during the larval and juvenile stages of cultured sea bass. In the absence of efficient therapeutic treatments, selective breeding for host resistance offers a promising strategy to control this disease. Our study aimed at investigating genetic resistance to VNN and genomic-based approaches to improve disease resistance by selective breeding. A population of 1538 sea bass juveniles from a factorial cross between 48 sires and 17 dams was challenged with RGNNV with mortalities and survivors being recorded and sampled for genotyping by the RAD sequencing approach. We used genome-wide genotype data from 9195 single nucleotide polymorphisms (SNPs) for downstream analysis. Estimates of heritability of survival on the underlying scale for the pedigree and genomic relationship matrices were 0.27 (HPD interval 95%: 0.14-0.40) and 0.43 (0.29-0.57), respectively. Classical genome-wide association analysis detected genome-wide significant quantitative trait loci (QTL) for resistance to VNN on chromosomes (unassigned scaffolds in the case of 'chromosome' 25) 3, 20 and 25 (P < 1e-06). Weighted genomic best linear unbiased predictor provided additional support for the QTL on chromosome 3 and suggested that it explained 4% of the additive genetic variation. Genomic prediction approaches were tested to investigate the potential of using genome-wide SNP data to estimate breeding values for resistance to VNN and showed that genomic prediction resulted in a 13% increase in successful classification of resistant and susceptible animals compared to pedigree-based methods, with Bayes A and Bayes B giving the highest predictive ability. Genome-wide significant QTL were identified but each with relatively small effects on

  10. Linkage Analysis of Urine Arsenic Species Patterns in the Strong Heart Family Study

    PubMed Central

    Gribble, Matthew O.; Voruganti, Venkata Saroja; Cole, Shelley A.; Haack, Karin; Balakrishnan, Poojitha; Laston, Sandra L.; Tellez-Plaza, Maria; Francesconi, Kevin A.; Goessler, Walter; Umans, Jason G.; Thomas, Duncan C.; Gilliland, Frank; North, Kari E.; Franceschini, Nora; Navas-Acien, Ana

    2015-01-01

    Arsenic toxicokinetics are important for disease risks in exposed populations, but genetic determinants are not fully understood. We examined urine arsenic species patterns measured by HPLC-ICPMS among 2189 Strong Heart Study participants 18 years of age and older with data on ∼400 genome-wide microsatellite markers spaced ∼10 cM and arsenic speciation (683 participants from Arizona, 684 from Oklahoma, and 822 from North and South Dakota). We logit-transformed % arsenic species (% inorganic arsenic, %MMA, and %DMA) and also conducted principal component analyses of the logit % arsenic species. We used inverse-normalized residuals from multivariable-adjusted polygenic heritability analysis for multipoint variance components linkage analysis. We also examined the contribution of polymorphisms in the arsenic metabolism gene AS3MT via conditional linkage analysis. We localized a quantitative trait locus (QTL) on chromosome 10 (LOD 4.12 for %MMA, 4.65 for %DMA, and 4.84 for the first principal component of logit % arsenic species). This peak was partially but not fully explained by measured AS3MT variants. We also localized a QTL for the second principal component of logit % arsenic species on chromosome 5 (LOD 4.21) that was not evident from considering % arsenic species individually. Some other loci were suggestive or significant for 1 geographical area but not overall across all areas, indicating possible locus heterogeneity. This genome-wide linkage scan suggests genetic determinants of arsenic toxicokinetics to be identified by future fine-mapping, and illustrates the utility of principal component analysis as a novel approach that considers % arsenic species jointly. PMID:26209557
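
    The logit transformation applied to the % arsenic species in the abstract above can be illustrated with a minimal sketch. The function name and the example profile are hypothetical and are not taken from the study; the sketch only shows how a percentage bounded between 0 and 100 is mapped onto an unbounded scale before further modelling.

```python
import math

def logit_percent(pct):
    """Return the logit (log-odds) of a percentage strictly between 0 and 100."""
    p = pct / 100.0
    if not 0.0 < p < 1.0:
        raise ValueError("percentage must be strictly between 0 and 100")
    return math.log(p / (1.0 - p))

# Hypothetical urine arsenic profile (illustrative values, not study data).
profile = {"%iAs": 10.0, "%MMA": 15.0, "%DMA": 75.0}
logits = {name: logit_percent(value) for name, value in profile.items()}
```

    A value of 50% maps to 0, values above 50% map to positive numbers, and values below 50% map to negative numbers, which removes the floor and ceiling that raw percentages have.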

  11. A ddRAD Based Linkage Map of the Cultivated Strawberry, Fragaria ×ananassa

    PubMed Central

    Davik, Jahn; Sargent, Daniel James; Brurberg, May Bente; Lien, Sigbjørn; Kent, Matthew; Alsheikh, Muath

    2015-01-01

    The cultivated strawberry (Fragaria ×ananassa Duch.) is an allo-octoploid considered difficult to disentangle genetically due to its four relatively similar sub-genomic chromosome sets. This has been alleviated by the recent release of the strawberry IStraw90 whole genome genotyping array. However, array resolution relies on the genotypes used in the array construction and may be of limited general use. SNP detection based on reduced genomic sequencing approaches has the potential of providing better coverage in cases where the studied genotypes are only distantly related to the SNP array’s construction foundation. Here we have used double digest restriction-associated DNA sequencing (ddRAD) to identify SNPs in a 145 seedling F1 hybrid population raised from the cross between the cultivars Sonata (♀) and Babette (♂). A linkage map containing 907 markers was constructed, spanning 1,581.5 cM across 31 linkage groups representing the 28 chromosomes of the species. Comparing the physical span of the SNP markers with the F. vesca genome sequence, the linkage groups resolved covered 79% of the estimated 830 Mb of the F. ×ananassa genome. Here, we have developed the first linkage map for F. ×ananassa using ddRAD and show that this technique and other related techniques are useful tools for linkage map development and downstream genetic studies in the octoploid strawberry. PMID:26398886

  12. Sex linkage, sex-specific selection, and the role of recombination in the evolution of sexually dimorphic gene expression.

    PubMed

    Connallon, Tim; Clark, Andrew G

    2010-12-01

    Sex-biased genes (genes that are differentially expressed within males and females) are nonrandomly distributed across animal genomes, with sex chromosomes and autosomes often carrying markedly different concentrations of male- and female-biased genes. These linkage patterns are often gene- and lineage-dependent, differing between functional genetic categories and between species. Although sex-specific selection is often hypothesized to shape the evolution of sex-linked and autosomal gene content, population genetics theory has yet to account for many of the gene- and lineage-specific idiosyncrasies emerging from the empirical literature. With the goal of improving the connection between evolutionary theory and a rapidly growing body of genome-wide empirical studies, we extend previous population genetics theory of sex-specific selection by developing and analyzing a biologically informed model that incorporates sex linkage, pleiotropy, recombination, and epistasis, factors that are likely to vary between genes and between species. Our results demonstrate that sex-specific selection and sex-specific recombination rates can generate, and are compatible with, the gene- and species-specific linkage patterns reported in the genomics literature. The theory suggests that sexual selection may strongly influence the architectures of animal genomes, as well as the chromosomal distribution of fixed substitutions underlying sexually dimorphic traits. © 2010 The Author(s). Evolution © 2010 The Society for the Study of Evolution.

  13. Population genomics of intrapatient HIV-1 evolution

    PubMed Central

    Zanini, Fabio; Brodin, Johanna; Thebo, Lina; Lanz, Christa; Bratt, Göran; Albert, Jan; Neher, Richard A

    2015-01-01

    Many microbial populations rapidly adapt to changing environments with multiple variants competing for survival. To quantify such complex evolutionary dynamics in vivo, time-resolved and genome-wide data including rare variants are essential. We performed whole-genome deep sequencing of HIV-1 populations in 9 untreated patients, with 6-12 longitudinal samples per patient spanning 5-8 years of infection. The data can be accessed and explored via an interactive web application. We show that patterns of minor diversity are reproducible between patients and mirror global HIV-1 diversity, suggesting a universal landscape of fitness costs that control diversity. Reversions towards the ancestral HIV-1 sequence are observed throughout infection and account for almost one third of all sequence changes. Reversion rates depend strongly on conservation. Frequent recombination limits linkage disequilibrium to about 100 bp in most of the genome, but strong hitch-hiking due to short-range linkage limits diversity. DOI: http://dx.doi.org/10.7554/eLife.11282.001 PMID:26652000

  14. TEAM: efficient two-locus epistasis tests in human genome-wide association study.

    PubMed

    Zhang, Xiang; Huang, Shunping; Zou, Fei; Wang, Wei

    2010-06-15

    As a promising tool for identifying genetic markers underlying phenotypic differences, the genome-wide association study (GWAS) has been extensively investigated in recent years. In GWAS, detecting epistasis (or gene-gene interaction) is preferable to single-locus study, since many diseases are known to be complex traits. A brute-force search is infeasible for epistasis detection at the genome-wide scale because of the intensive computational burden. Existing epistasis detection algorithms are designed for datasets consisting of homozygous markers and small sample sizes. In human studies, however, the genotype may be heterozygous, and the number of individuals can be up to thousands. Thus, existing methods are not readily applicable to human datasets. In this article, we propose an efficient algorithm, TEAM, which significantly speeds up epistasis detection for human GWAS. Our algorithm is exhaustive, i.e. it does not ignore any epistatic interaction. Utilizing the minimum spanning tree structure, the algorithm incrementally updates the contingency tables for epistatic tests without scanning all individuals. Our algorithm has broader applicability and is more efficient than existing methods for large-sample studies. It supports any statistical test that is based on contingency tables, and enables control of both the family-wise error rate and the false discovery rate. Extensive experiments show that our algorithm only needs to examine a small portion of the individuals to update the contingency tables, and it achieves at least an order of magnitude speedup over the brute-force approach.
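
    To make the contingency-table idea in the abstract above concrete, the following is a minimal sketch of the brute-force baseline for a single SNP pair: every individual is scanned once to fill a genotype-by-phenotype table, and a Pearson chi-square statistic is computed from it. This is not the TEAM algorithm and does not use its minimum spanning tree update; the function names and the 0/1/2 genotype coding are assumptions made for illustration.

```python
from itertools import product

def two_locus_table(geno_a, geno_b, phenotype):
    """Count individuals per (genotype A, genotype B, phenotype) cell.

    Assumed coding: genotypes are 0, 1 or 2; phenotype is 0 (control) or 1 (case).
    """
    table = {cell: 0 for cell in product(range(3), range(3), range(2))}
    for a, b, y in zip(geno_a, geno_b, phenotype):
        table[(a, b, y)] += 1
    return table

def chi_square(table):
    """Pearson chi-square of the 9 x 2 (genotype combination x phenotype) table."""
    n = sum(table.values())
    # Row totals: one row per two-locus genotype combination.
    row = {(a, b): table[(a, b, 0)] + table[(a, b, 1)]
           for a, b in product(range(3), range(3))}
    # Column totals: controls and cases.
    col = {y: sum(table[(a, b, y)] for a, b in row) for y in range(2)}
    stat = 0.0
    for (a, b), r in row.items():
        for y in range(2):
            expected = r * col[y] / n
            if expected > 0:  # skip empty genotype combinations
                stat += (table[(a, b, y)] - expected) ** 2 / expected
    return stat
```

    Repeating this scan for every SNP pair is what makes the exhaustive search expensive: the cost grows with the number of pairs times the number of individuals, which is the burden the abstract describes.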

  15. Enhancing genomic prediction with genome-wide association studies in multiparental maize populations

    USDA-ARS's Scientific Manuscript database

    Genome-wide association mapping using dense marker sets has identified some nucleotide variants affecting complex traits which have been validated with fine-mapping and functional analysis. Many sequence variants associated with complex traits in maize have small effects and low repeatability, howev...

  16. Novel efficient genome-wide SNP panels for the conservation of the highly endangered Iberian lynx.

    PubMed

    Kleinman-Ruiz, Daniel; Martínez-Cruz, Begoña; Soriano, Laura; Lucena-Perez, Maria; Cruz, Fernando; Villanueva, Beatriz; Fernández, Jesús; Godoy, José A

    2017-07-21

    The Iberian lynx (Lynx pardinus) has been acknowledged as the most endangered felid species in the world. An intense contraction and fragmentation during the twentieth century left fewer than 100 individuals split into two isolated and genetically eroded populations by 2002. Genetic monitoring and management so far have been based on 36 STRs, but their limited variability and the more complex situation of current populations demand more efficient molecular markers. The recent characterization of the Iberian lynx genome identified more than 1.6 million SNPs, of which 1536 were selected and genotyped in an extended Iberian lynx sample. We validated 1492 SNPs and analysed their heterozygosity, Hardy-Weinberg equilibrium, and linkage disequilibrium. We then selected a panel of 343 minimally linked autosomal SNPs from which we extracted subsets optimized for four different typical tasks in conservation applications: individual identification, parentage assignment, relatedness estimation, and admixture classification, and compared their power to currently used STR panels. We ascribed 21 SNPs to chromosome X based on their segregation patterns, and identified one additional marker that showed significant differentiation between sexes. For all applications considered, panels of autosomal SNPs showed higher power than the currently used STR set with only a very modest increase in the number of markers. These novel panels of highly informative genome-wide SNPs provide more powerful, efficient, and flexible tools for the genetic management and non-invasive monitoring of Iberian lynx populations. This example highlights an important outcome of whole-genome studies in genetically threatened species.
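
    The per-SNP checks named in the abstract above (heterozygosity and Hardy-Weinberg equilibrium) can be sketched for one biallelic marker as follows. This is a generic textbook calculation under assumed genotype counts, not the pipeline used in the study; the function name and the returned keys are hypothetical.

```python
def hwe_summary(n_aa, n_ab, n_bb):
    """Observed/expected heterozygosity and HWE chi-square for a biallelic SNP.

    Arguments are the counts of the three genotypes AA, AB and BB.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes supplied")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele B
    expected = {"AA": p * p * n, "AB": 2 * p * q * n, "BB": q * q * n}
    observed = {"AA": n_aa, "AB": n_ab, "BB": n_bb}
    # Pearson chi-square comparing observed counts with HWE expectations.
    chi2 = sum((observed[g] - e) ** 2 / e for g, e in expected.items() if e > 0)
    return {"obs_het": n_ab / n, "exp_het": 2 * p * q, "chi2": chi2}
```

    A marker whose genotype counts match the expected proportions gives a chi-square of zero, while a marker with no heterozygotes at intermediate allele frequency gives a large value and would typically be flagged.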

  17. Meta-Analysis of Genome-Wide Association Studies of Attention-Deficit/Hyperactivity Disorder

    ERIC Educational Resources Information Center

    Neale, Benjamin M.; Medland, Sarah E.; Ripke, Stephan; Asherson, Philip; Franke, Barbara; Lesch, Klaus-Peter; Faraone, Stephen V.; Nguyen, Thuy Trang; Schafer, Helmut; Holmans, Peter; Daly, Mark; Steinhausen, Hans-Christoph; Freitag, Christine; Reif, Andreas; Renner, Tobias J.; Romanos, Marcel; Romanos, Jasmin; Walitza, Susanne; Warnke, Andreas; Meyer, Jobst; Palmason, Haukur; Buitelaar, Jan; Vasquez, Alejandro Arias; Lambregts-Rommelse, Nanda; Gill, Michael; Anney, Richard J. L.; Langely, Kate; O'Donovan, Michael; Williams, Nigel; Owen, Michael; Thapar, Anita; Kent, Lindsey; Sergeant, Joseph; Roeyers, Herbert; Mick, Eric; Biederman, Joseph; Doyle, Alysa; Smalley, Susan; Loo, Sandra; Hakonarson, Hakon; Elia, Josephine; Todorov, Alexandre; Miranda, Ana; Mulas, Fernando; Ebstein, Richard P.; Rothenberger, Aribert; Banaschewski, Tobias; Oades, Robert D.; Sonuga-Barke, Edmund; McGough, James; Nisenbaum, Laura; Middleton, Frank; Hu, Xiaolan; Nelson, Stan

    2010-01-01

    Objective: Although twin and family studies have shown attention-deficit/hyperactivity disorder (ADHD) to be highly heritable, genetic variants influencing the trait at a genome-wide significant level have yet to be identified. As prior genome-wide association studies (GWAS) have not yielded significant results, we conducted a meta-analysis of…

  18. Genome-wide association study of Tourette Syndrome

    PubMed Central

    Scharf, Jeremiah M.; Yu, Dongmei; Mathews, Carol A.; Neale, Benjamin M.; Stewart, S. Evelyn; Fagerness, Jesen A; Evans, Patrick; Gamazon, Eric; Edlund, Christopher K.; Service, Susan; Tikhomirov, Anna; Osiecki, Lisa; Illmann, Cornelia; Pluzhnikov, Anna; Konkashbaev, Anuar; Davis, Lea K; Han, Buhm; Crane, Jacquelyn; Moorjani, Priya; Crenshaw, Andrew T.; Parkin, Melissa A.; Reus, Victor I.; Lowe, Thomas L.; Rangel-Lugo, Martha; Chouinard, Sylvain; Dion, Yves; Girard, Simon; Cath, Danielle C; Smit, Jan H; King, Robert A.; Fernandez, Thomas; Leckman, James F.; Kidd, Kenneth K.; Kidd, Judith R.; Pakstis, Andrew J.; State, Matthew; Herrera, Luis Diego; Romero, Roxana; Fournier, Eduardo; Sandor, Paul; Barr, Cathy L; Phan, Nam; Gross-Tsur, Varda; Benarroch, Fortu; Pollak, Yehuda; Budman, Cathy L.; Bruun, Ruth D.; Erenberg, Gerald; Naarden, Allan L; Lee, Paul C; Weiss, Nicholas; Kremeyer, Barbara; Berrío, Gabriel Bedoya; Campbell, Desmond; Silgado, Julio C. Cardona; Ochoa, William Cornejo; Restrepo, Sandra C. Mesa; Muller, Heike; Duarte, Ana V. Valencia; Lyon, Gholson J; Leppert, Mark; Morgan, Jubel; Weiss, Robert; Grados, Marco A.; Anderson, Kelley; Davarya, Sarah; Singer, Harvey; Walkup, John; Jankovic, Joseph; Tischfield, Jay A.; Heiman, Gary A.; Gilbert, Donald L.; Hoekstra, Pieter J.; Robertson, Mary M.; Kurlan, Roger; Liu, Chunyu; Gibbs, J. Raphael; Singleton, Andrew; Hardy, John; Strengman, Eric; Ophoff, Roel; Wagner, Michael; Moessner, Rainald; Mirel, Daniel B.; Posthuma, Danielle; Sabatti, Chiara; Eskin, Eleazar; Conti, David V.; Knowles, James A.; Ruiz-Linares, Andres; Rouleau, Guy A.; Purcell, Shaun; Heutink, Peter; Oostra, Ben A.; McMahon, William; Freimer, Nelson; Cox, Nancy J.; Pauls, David L.

    2012-01-01

    Tourette Syndrome (TS) is a developmental disorder that has one of the highest familial recurrence rates among neuropsychiatric diseases with complex inheritance. However, the identification of definitive TS susceptibility genes remains elusive. Here, we report the first genome-wide association study (GWAS) of TS in 1285 cases and 4964 ancestry-matched controls of European ancestry, including two European-derived population isolates, Ashkenazi Jews from North America and Israel, and French Canadians from Quebec, Canada. In a primary meta-analysis of GWAS data from these European ancestry samples, no markers achieved a genome-wide threshold of significance (p<5 × 10−8); the top signal was found in rs7868992 on chromosome 9q32 within COL27A1 (p=1.85 × 10−6). A secondary analysis including an additional 211 cases and 285 controls from two closely-related Latin-American population isolates from the Central Valley of Costa Rica and Antioquia, Colombia also identified rs7868992 as the top signal (p=3.6 × 10−7 for the combined sample of 1496 cases and 5249 controls following imputation with 1000 Genomes data). This study lays the groundwork for the eventual identification of common TS susceptibility variants in larger cohorts and helps to provide a more complete understanding of the full genetic architecture of this disorder. PMID:22889924
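
    The genome-wide threshold quoted in the abstract above (p < 5 × 10−8) is commonly motivated as a Bonferroni correction of a 0.05 error rate for roughly one million independent tests. The abstract does not state this derivation, so the sketch below is an assumption about the convention rather than the study's own procedure; the function names are hypothetical.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests

# Assumed convention: 0.05 spread over about one million independent tests.
GENOME_WIDE = bonferroni_threshold(0.05, 1_000_000)

def is_genome_wide_significant(p_value, threshold=GENOME_WIDE):
    """True if the p-value falls below the genome-wide threshold."""
    return p_value < threshold
```

    Under this threshold, the two top-signal p-values reported in the abstract (1.85 × 10−6 and 3.6 × 10−7) do not qualify, consistent with the statement that no marker reached genome-wide significance.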

  19. ICSNPathway: identify candidate causal SNPs and pathways from genome-wide association study by one analytical framework.

    PubMed

    Zhang, Kunlin; Chang, Suhua; Cui, Sijia; Guo, Liyuan; Zhang, Liuyan; Wang, Jing

    2011-07-01

    Genome-wide association study (GWAS) is widely utilized to identify genes involved in human complex disease or some other trait. One key challenge for GWAS data interpretation is to identify causal SNPs and provide strong evidence on how they affect the trait. Current research focuses on identifying candidate causal variants from the most significant SNPs of GWAS, while there is a lack of support on biological mechanisms as represented by pathways. Although pathway-based analysis (PBA) has been designed to identify disease-related pathways by analyzing the full list of SNPs from GWAS, it does not emphasize interpreting causal SNPs. To our knowledge, so far there is no web server available to solve the challenge for GWAS data interpretation within one analytical framework. ICSNPathway is developed to identify candidate causal SNPs and their corresponding candidate causal pathways from GWAS by integrating linkage disequilibrium (LD) analysis, functional SNP annotation and PBA. ICSNPathway provides a feasible solution to bridge the gap between GWAS and disease mechanism study by generating hypotheses of the form SNP → gene → pathway(s). The ICSNPathway server is freely available at http://icsnpathway.psych.ac.cn/.

  20. SNP marker discovery, linkage map construction and identification of QTLs for enhanced salinity tolerance in field pea (Pisum sativum L.)

    PubMed Central

    2013-01-01

    Background: Field pea (Pisum sativum L.) is a self-pollinating, diploid, cool-season food legume. Crop production is constrained by multiple biotic and abiotic stress factors, including salinity, that cause reduced growth and yield. Recent advances in genomics have permitted the development of low-cost high-throughput genotyping systems, allowing the construction of saturated genetic linkage maps for identification of quantitative trait loci (QTLs) associated with traits of interest. Genetic markers in close linkage with the relevant genomic regions may then be implemented in varietal improvement programs. Results: In this study, single nucleotide polymorphism (SNP) markers associated with expressed sequence tags (ESTs) were developed and used to generate comprehensive linkage maps for field pea. From a set of 36,188 variant nucleotide positions detected through in silico analysis, 768 were selected for genotyping of a recombinant inbred line (RIL) population. A total of 705 SNPs (91.7%) successfully detected segregating polymorphisms. In addition to SNPs, genomic and EST-derived simple sequence repeats (SSRs) were assigned to the genetic map in order to obtain an evenly distributed genome-wide coverage. Sequences associated with the mapped molecular markers were used for comparative genomic analysis with other legume species. Higher levels of conserved synteny were observed with the genomes of Medicago truncatula Gaertn. and chickpea (Cicer arietinum L.) than with soybean (Glycine max [L.] Merr.), Lotus japonicus L. and pigeon pea (Cajanus cajan [L.] Millsp.). Parents and RIL progeny were screened at the seedling growth stage for responses to salinity stress, imposed by addition of NaCl in the watering solution at a concentration of 18 dS m-1. Salinity-induced symptoms showed normal distribution, and the severity of the symptoms increased over time. QTLs for salinity tolerance were identified on linkage groups Ps III and VII, with flanking SNP markers suitable for

  1. SNP marker discovery, linkage map construction and identification of QTLs for enhanced salinity tolerance in field pea (Pisum sativum L.).

    PubMed

    Leonforte, Antonio; Sudheesh, Shimna; Cogan, Noel O I; Salisbury, Philip A; Nicolas, Marc E; Materne, Michael; Forster, John W; Kaur, Sukhjiwan

    2013-10-17

    Field pea (Pisum sativum L.) is a self-pollinating, diploid, cool-season food legume. Crop production is constrained by multiple biotic and abiotic stress factors, including salinity, that cause reduced growth and yield. Recent advances in genomics have permitted the development of low-cost high-throughput genotyping systems, allowing the construction of saturated genetic linkage maps for identification of quantitative trait loci (QTLs) associated with traits of interest. Genetic markers in close linkage with the relevant genomic regions may then be implemented in varietal improvement programs. In this study, single nucleotide polymorphism (SNP) markers associated with expressed sequence tags (ESTs) were developed and used to generate comprehensive linkage maps for field pea. From a set of 36,188 variant nucleotide positions detected through in silico analysis, 768 were selected for genotyping of a recombinant inbred line (RIL) population. A total of 705 SNPs (91.7%) successfully detected segregating polymorphisms. In addition to SNPs, genomic and EST-derived simple sequence repeats (SSRs) were assigned to the genetic map in order to obtain an evenly distributed genome-wide coverage. Sequences associated with the mapped molecular markers were used for comparative genomic analysis with other legume species. Higher levels of conserved synteny were observed with the genomes of Medicago truncatula Gaertn. and chickpea (Cicer arietinum L.) than with soybean (Glycine max [L.] Merr.), Lotus japonicus L. and pigeon pea (Cajanus cajan [L.] Millsp.). Parents and RIL progeny were screened at the seedling growth stage for responses to salinity stress, imposed by addition of NaCl in the watering solution at a concentration of 18 dS m⁻¹. Salinity-induced symptoms showed normal distribution, and the severity of the symptoms increased over time. QTLs for salinity tolerance were identified on linkage groups Ps III and VII, with flanking SNP markers suitable for selection of

  2. Genome-wide association study of the four-constitution medicine.

    PubMed

    Yin, Chang Shik; Park, Hi Joon; Chung, Joo-Ho; Lee, Hye-Jung; Lee, Byung-Cheol

    2009-12-01

    Four-constitution medicine (FCM), also known as Sasang constitutional medicine, is a heritage of the long tradition of individualized acupuncture medicine and one of the holistic, traditional systems of constitution used to appraise and categorize individual differences into four major types. This study reports the first genome-wide association study of FCM, to explore the genetic basis of FCM and facilitate the integration of FCM with conventional individual differences research. Healthy individuals of the Korean population were classified into the four constitutional types (FCTs). A total of 353,202 single nucleotide polymorphisms (SNPs) were typed using whole genome amplified samples, and six-way comparison of FCM types provided lists of significantly differential SNPs. In one-to-one FCT comparisons, 15,944 SNPs were significantly differential, and 5 SNPs were commonly significant in all of the three comparisons. In one-to-two FCT comparisons, 22,616 SNPs were significantly differential, and 20 SNPs were commonly significant in all of the three comparison groups. This study presents the association between genome-wide SNP profiles and the categorization of the FCM, and it could further provide a starting point for genome-based identification and research of the constitutions of FCM.

  3. Genome wide association study (GWAS) for grain yield in rice cultivated under water deficit.

    PubMed

    Pantalião, Gabriel Feresin; Narciso, Marcelo; Guimarães, Cléber; Castro, Adriano; Colombari, José Manoel; Breseghello, Flavio; Rodrigues, Luana; Vianello, Rosana Pereira; Borba, Tereza Oliveira; Brondani, Claudio

    2016-12-01

    The identification of rice drought tolerant materials is crucial for the development of best performing cultivars for the upland cultivation system. This study aimed to identify markers and candidate genes associated with drought tolerance by Genome Wide Association Study analysis, in order to develop tools for use in rice breeding programs. This analysis was made with 175 upland rice accessions (Oryza sativa), evaluated in experiments with and without water restriction, and 150,325 SNPs. Thirteen SNP markers associated with yield under drought conditions were identified. Through stepwise regression analysis, eight SNP markers were selected and validated in silico, and when tested by PCR, two out of the eight SNP markers were able to identify a group of rice genotypes with higher productivity under drought. These results are encouraging for deriving markers for the routine analysis of marker assisted selection. From the drought experiment, including the genes inherited in linkage blocks, 50 genes were identified, from which 30 were annotated, and 10 were previously related to drought and/or abiotic stress tolerance, such as the transcription factors WRKY and Apetala2, and protein kinases.

  4. Genome-wide association mapping identifies multiple loci for a canine SLE-related disease complex.

    PubMed

    Wilbe, Maria; Jokinen, Päivi; Truvé, Katarina; Seppala, Eija H; Karlsson, Elinor K; Biagi, Tara; Hughes, Angela; Bannasch, Danika; Andersson, Göran; Hansson-Hamlin, Helene; Lohi, Hannes; Lindblad-Toh, Kerstin

    2010-03-01

    The unique canine breed structure makes dogs an excellent model for studying genetic diseases. Within a dog breed, linkage disequilibrium is extensive, enabling genome-wide association (GWA) with only around 15,000 SNPs and fewer individuals than in human studies. Incidences of specific diseases are elevated in different breeds, indicating that a few genetic risk factors might have accumulated through drift or selective breeding. In this study, a GWA study with 81 affected dogs (cases) and 57 controls from the Nova Scotia duck tolling retriever breed identified five loci associated with a canine systemic lupus erythematosus (SLE)-related disease complex that includes both antinuclear antibody (ANA)-positive immune-mediated rheumatic disease (IMRD) and steroid-responsive meningitis-arteritis (SRMA). Fine mapping with twice as many dogs validated these loci. Our results indicate that the homogeneity of strong genetic risk factors within dog breeds allows multigenic disorders to be mapped with fewer than 100 cases and 100 controls, making dogs an excellent model in which to identify pathways involved in human complex diseases.

  5. SSR-enriched genetic linkage maps of bermudagrass (Cynodon dactylon × transvaalensis), and their comparison with allied plant genomes.

    PubMed

    Khanal, Sameer; Kim, Changsoo; Auckland, Susan A; Rainville, Lisa K; Adhikari, Jeevan; Schwartz, Brian M; Paterson, Andrew H

    2017-04-01

    We report SSR-enriched genetic maps of bermudagrass that: (1) reveal partial residual polysomic inheritance in the tetraploid species, and (2) provide insights into the evolution of chloridoid genomes. This study describes genetic linkage maps of two bermudagrass species, Cynodon dactylon (T89) and Cynodon transvaalensis (T574), that integrate heterologous microsatellite markers from sugarcane into frameworks built with single-dose restriction fragments (SDRFs). A maximum likelihood approach was used to construct two separate parental maps from a population of 110 F1 progeny of a cross between the two parents. The T89 map is based on 291 loci on 34 cosegregating groups (CGs), with an average marker spacing of 12.5 cM. The T574 map is based on 125 loci on 14 CGs, with an average marker spacing of 10.7 cM. Six T89 and one T574 CG(s) deviated from disomic inheritance. Furthermore, marker segregation data and linkage phase analysis revealed partial residual polysomic inheritance in T89, suggesting that common bermudagrass is undergoing diploidization following whole genome duplication (WGD). Twenty-six T89 CGs were coalesced into 9 homo(eo)logous linkage groups (LGs), while 12 T574 CGs were assembled into 9 LGs, both putatively representing the basic chromosome complement (x = 9) of the species. Eight T89 and two T574 CGs remain unassigned. The marker composition of bermudagrass ancestral chromosomes was inferred by aligning T89 and T574 homologs, and used in comparisons to sorghum and rice genome sequences based on 108 and 91 significant blast hits, respectively. Two nested chromosome fusions (NCFs) shared by two other chloridoids (i.e., zoysiagrass and finger millet) and at least three independent translocation events were evident during chromosome number reduction from 14 in the polyploid common ancestor of Poaceae to 9 in Cynodon.

  6. Annotation-based genome-wide SNP discovery in the large and complex Aegilops tauschii genome using next-generation sequencing without a reference genome sequence

    PubMed Central

    2011-01-01

    Background: Many plants have large and complex genomes with an abundance of repeated sequences. Many plants are also polyploid. Both of these attributes typify the genome architecture in the tribe Triticeae, whose members include economically important wheat, rye and barley. Large genome sizes, an abundance of repeated sequences, and polyploidy present challenges to genome-wide SNP discovery using next-generation sequencing (NGS) of total genomic DNA by making alignment and clustering of short reads generated by the NGS platforms difficult, particularly in the absence of a reference genome sequence. Results: An annotation-based, genome-wide SNP discovery pipeline is reported using NGS data for large and complex genomes without a reference genome sequence. Roche 454 shotgun reads with low genome coverage of one genotype are annotated in order to distinguish single-copy sequences and repeat junctions from repetitive sequences and sequences shared by paralogous genes. Multiple genome equivalents of shotgun reads of another genotype generated with SOLiD or Solexa are then mapped to the annotated Roche 454 reads to identify putative SNPs. A pipeline program package, AGSNP, was developed and used for genome-wide SNP discovery in Aegilops tauschii, the diploid source of the wheat D genome, with a genome size of 4.02 Gb, of which 90% is repetitive sequences. Genomic DNA of Ae. tauschii accession AL8/78 was sequenced with the Roche 454 NGS platform. Genomic DNA and cDNA of Ae. tauschii accession AS75 were sequenced primarily with SOLiD, although some Solexa and Roche 454 genomic sequences were also generated. A total of 195,631 putative SNPs were discovered in gene sequences, 155,580 putative SNPs were discovered in uncharacterized single-copy regions, and another 145,907 putative SNPs were discovered in repeat junctions. These SNPs were dispersed across the entire Ae. tauschii genome. To assess the false positive SNP discovery rate, DNA containing putative SNPs was

  7. Extent of genome-wide linkage disequilibrium in Australian Holstein-Friesian cattle based on a high-density SNP panel.

    PubMed

    Khatkar, Mehar S; Nicholas, Frank W; Collins, Andrew R; Zenger, Kyall R; Cavanagh, Julie A L; Barris, Wes; Schnabel, Robert D; Taylor, Jeremy F; Raadsma, Herman W

    2008-04-24

    The extent of linkage disequilibrium (LD) within a population determines the number of markers that will be required for successful association mapping and marker-assisted selection. Most studies on LD in cattle reported to date are based on microsatellite markers or small numbers of single nucleotide polymorphisms (SNPs) covering one or only a few chromosomes. This is the first comprehensive study on the extent of LD in cattle by analyzing data on 1,546 Holstein-Friesian bulls genotyped for 15,036 SNP markers covering all regions of all autosomes. Furthermore, most studies in cattle have used relatively small sample sizes and, consequently, may have had biased estimates of measures commonly used to describe LD. We examine minimum sample sizes required to estimate LD without bias and loss in accuracy. Finally, relatively little information is available on comparative LD structures including other mammalian species such as human and mouse, and we compare LD structure in cattle with public-domain data from both human and mouse. We computed three LD estimates, D', Dvol and r², for 1,566,890 syntenic SNP pairs and a sample of 365,400 non-syntenic pairs. Mean D' is 0.189 among syntenic SNPs, and 0.105 among non-syntenic SNPs; mean r² is 0.024 among syntenic SNPs and 0.0032 among non-syntenic SNPs. All three measures of LD for syntenic pairs decline with distance; the decline is much steeper for r² than for D' and Dvol. The values of D' and Dvol are quite similar. Significant LD in cattle extends to 40 kb (when estimated as r²) and 8.2 Mb (when estimated as D'). The mean values for LD at large physical distances are close to those for non-syntenic SNPs. Minor allelic frequency threshold affects the distribution and extent of LD. For unbiased and accurate estimates of LD across marker intervals spanning < 1 kb to > 50 Mb, minimum sample sizes of 400 (for D') and 75 (for r²) are required. The bias due to small sample sizes increases with inter-marker interval. LD in cattle
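
The difference between D' and r² reported in this record can be illustrated with a minimal sketch that uses the standard textbook definitions for two biallelic loci. The function name `ld_measures` and its inputs (a haplotype frequency and two allele frequencies) are hypothetical, chosen only for illustration; the study's own estimators, including Dvol, are not reproduced here.

```python
def ld_measures(p_ab: float, p_a: float, p_b: float) -> tuple[float, float, float]:
    """Return (D, |D'|, r^2) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a, p_b: frequencies of allele A (locus 1) and allele B (locus 2)
    """
    # Raw disequilibrium coefficient
    d = p_ab - p_a * p_b

    # D' scales |D| by the largest value it could take given the allele frequencies
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0

    # r^2 is the squared correlation between the two loci
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2


if __name__ == "__main__":
    # A rare allele that always occurs on the B background:
    # D' is at its maximum while r^2 stays small.
    d, d_prime, r2 = ld_measures(p_ab=0.1, p_a=0.1, p_b=0.5)
    print(f"D={d:.3f}  D'={d_prime:.3f}  r2={r2:.3f}")
```

With unequal allele frequencies D' can reach its maximum while r² remains low, which is one reason the two measures can suggest very different extents of LD.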

  8. Ontology-Based Search of Genomic Metadata.

    PubMed

    Fernandez, Javier D; Lenzerini, Maurizio; Masseroli, Marco; Venco, Francesco; Ceri, Stefano

    2016-01-01

    The Encyclopedia of DNA Elements (ENCODE) is a huge and still expanding public repository of more than 4,000 experiments and 25,000 data files, assembled by a large international consortium since 2007; unknown biological knowledge can be extracted from these huge and largely unexplored data, leading to data-driven genomic, transcriptomic, and epigenomic discoveries. Yet, search of relevant datasets for knowledge discovery is limitedly supported: metadata describing ENCODE datasets are quite simple and incomplete, and not described by a coherent underlying ontology. Here, we show how to overcome this limitation, by adopting an ENCODE metadata searching approach which uses high-quality ontological knowledge and state-of-the-art indexing technologies. Specifically, we developed S.O.S. GeM (http://www.bioinformatics.deib.polimi.it/SOSGeM/), a system supporting effective semantic search and retrieval of ENCODE datasets. First, we constructed a Semantic Knowledge Base by starting with concepts extracted from ENCODE metadata, matched to and expanded on biomedical ontologies integrated in the well-established Unified Medical Language System. We prove that this inference method is sound and complete. Then, we leveraged the Semantic Knowledge Base to semantically search ENCODE data from arbitrary biologists' queries. This allows correctly finding more datasets than those extracted by a purely syntactic search, as supported by the other available systems. We empirically show the relevance of found datasets to the biologists' queries.

  9. Genomic prediction and genome-wide association analysis of female longevity in a composite beef cattle breed.

    PubMed

    Hamidi Hay, E; Roberts, A

    2017-04-01

    Longevity is a highly important trait to the efficiency of beef cattle production. The objective of this study was to evaluate the genomic prediction of longevity and identify genomic regions associated with this trait. The data used in this study consisted of 547 Composite Gene Combination cows (1/2 Red Angus, 1/4 Charolais, 1/4 Tarentaise) born from 2002 to 2011 genotyped with Illumina BovineSNP50 BeadChip. Three models were used to assess genomic prediction: Bayes A, Bayes B and GBLUP using a genomic relationship matrix. To identify genomic regions associated with longevity, 2 approaches were adopted: single marker genome wide association and a Bayesian approach using GenSel software. The genomic prediction accuracy was low: 0.28, 0.25, and 0.22 for Bayes A, Bayes B and GBLUP, respectively. The single-marker genome wide association study (GWAS) identified 5 loci with P-value less than 0.05 after false discovery correction: UA-IFASA-7571 on chromosome 19 (58.03 Mb), ARS-BFGL-BAC-15059 on BTA 1 (28.8 Mb), ARS-BFGL-NGS-104159 on BTA3 (29.4 Mb), ARS-BFGL-NGS-32882 on BTA9 (104.07 Mb) and ARS-BFGL-NGS-32883 on BTA25 (33.77 Mb). The Bayesian GWAS yielded 4 genomic regions overlapping with the single marker GWAS results. The region with the highest percentage of genomic variance (3.73%) was detected on chromosome 19. Both GWAS approaches adopted in this study showed evidence for association with various chromosomal locations.
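
The abstract does not name the false discovery correction it applied. As an illustration only, the sketch below assumes the widely used Benjamini-Hochberg procedure and computes adjusted P-values for a list of single-marker tests. The function name `benjamini_hochberg` and the example P-values are hypothetical and are not taken from the study.

```python
def benjamini_hochberg(p_values: list[float]) -> list[float]:
    """Return Benjamini-Hochberg adjusted P-values in the input order."""
    m = len(p_values)
    # Indices of the P-values sorted from smallest to largest
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, so each adjusted value
    # is never larger than the one ranked above it (monotonicity).
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    raw = [0.01, 0.04, 0.03, 0.005]  # made-up marker P-values
    for p, q in zip(raw, benjamini_hochberg(raw)):
        flag = "significant" if q < 0.05 else "not significant"
        print(f"raw={p:.3f}  adjusted={q:.3f}  {flag} at FDR 0.05")
```

A marker would then be reported when its adjusted value falls below the chosen level, 0.05 in this record.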

  10. Genomic polymorphism, recombination, and linkage disequilibrium in human major histocompatibility complex-encoded antigen-processing genes.

    PubMed Central

    van Endert, P M; Lopez, M T; Patel, S D; Monaco, J J; McDevitt, H O

    1992-01-01

    Recently, two subunits of a large cytosolic protease and two putative peptide transporter proteins were found to be encoded by genes within the class II region of the major histocompatibility complex (MHC). These genes have been suggested to be involved in the processing of antigenic proteins for presentation by MHC class I molecules. Because of the high degree of polymorphism in MHC genes, and previous evidence for both functional and polypeptide sequence polymorphism in the proteins encoded by the antigen-processing genes, we tested DNA from 27 consanguineous human cell lines for genomic polymorphism by restriction fragment length polymorphism (RFLP) analysis. These studies demonstrate a strong linkage disequilibrium between TAP1 and LMP2 RFLPs. Moreover, RFLPs, as well as a polymorphic stop codon in the telomeric TAP2 gene, appear to be in linkage disequilibrium with HLA-DR alleles and RFLPs in the HLA-DO gene. A high rate of recombination, however, seems to occur in the center of the complex, between the TAP1 and TAP2 genes. PMID:1360671

  11. Genome-wide association study (GWAS) for growth rate and age at sexual maturation in Atlantic salmon (Salmo salar).

    PubMed

    Gutierrez, Alejandro P; Yáñez, José M; Fukui, Steve; Swift, Bruce; Davidson, William S

    2015-01-01

    Early sexual maturation is considered a serious drawback for Atlantic salmon aquaculture as it retards growth, increases production times and affects flesh quality. Although both growth and sexual maturation are thought to be complex processes controlled by several genetic and environmental factors, selection for these traits has been continuously accomplished since the beginning of Atlantic salmon selective breeding programs. In this genome-wide association study (GWAS) we used a 6.5K single-nucleotide polymorphism (SNP) array to genotype ∼ 480 individuals from the Cermaq Canada broodstock program and search for SNPs associated with growth and age at sexual maturation. Using a mixed model approach we identified markers showing a significant association with growth, grilsing (early sexual maturation) and late sexual maturation. The most significant associations were found for grilsing, with markers located in Ssa10, Ssa02, Ssa13, Ssa25 and Ssa12, and for late maturation with markers located in Ssa28, Ssa01 and Ssa21. A lower level of association was detected with growth on Ssa13. Candidate genes, which were linked to these genetic markers, were identified and some of them show a direct relationship with developmental processes, especially for those in association with sexual maturation. However, the relatively low power to detect genetic markers associated with growth (days to 5 kg) in this GWAS indicates the need to use a higher density SNP array in order to overcome the low levels of linkage disequilibrium observed in Atlantic salmon before the information can be incorporated into a selective breeding program.

  12. Empirical estimation of genome-wide significance thresholds based on the 1000 Genomes Project data set.

    PubMed

    Kanai, Masahiro; Tanaka, Toshihiro; Okada, Yukinori

    2016-10-01

    To assess the statistical significance of associations between variants and traits, genome-wide association studies (GWAS) should employ an appropriate threshold that accounts for the massive burden of multiple testing in the study. Although most studies in the current literature commonly set a genome-wide significance threshold at the level of P = 5.0 × 10⁻⁸, the adequacy of this value for respective populations has not been fully investigated. To empirically estimate thresholds for different ancestral populations, we conducted GWAS simulations using the 1000 Genomes Phase 3 data set for Africans (AFR), Europeans (EUR), Admixed Americans (AMR), East Asians (EAS) and South Asians (SAS). The estimated empirical genome-wide significance thresholds were P_sig = 3.24 × 10⁻⁸ (AFR), 9.26 × 10⁻⁸ (EUR), 1.83 × 10⁻⁷ (AMR), 1.61 × 10⁻⁷ (EAS) and 9.46 × 10⁻⁸ (SAS). We additionally conducted trans-ethnic meta-analyses across all populations (ALL) and all populations except for AFR (ΔAFR), which yielded P_sig = 3.25 × 10⁻⁸ (ALL) and 4.20 × 10⁻⁸ (ΔAFR). Our results indicate that the current threshold (P = 5.0 × 10⁻⁸) is overly stringent for all ancestral populations except for Africans; however, we should employ a more stringent threshold when conducting a meta-analysis, regardless of the presence of African samples.
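
One way to read these thresholds is through a Bonferroni-style lens: if a family-wise error rate of 0.05 is assumed (an assumption made here for illustration, not a statement of the authors' simulation design), each threshold corresponds to an implied number of effectively independent tests, 0.05 divided by the threshold. The sketch below applies that arithmetic to the values quoted in the abstract; the names `ALPHA`, `THRESHOLDS` and `implied_independent_tests` are hypothetical.

```python
ALPHA = 0.05  # assumed family-wise error rate (not stated in the abstract)

# Thresholds quoted in the abstract, plus the conventional value
THRESHOLDS = {
    "conventional": 5.0e-8,
    "AFR": 3.24e-8,
    "EUR": 9.26e-8,
    "AMR": 1.83e-7,
    "EAS": 1.61e-7,
    "SAS": 9.46e-8,
}


def implied_independent_tests(p_threshold: float, alpha: float = ALPHA) -> float:
    """Number of independent tests for which alpha / n equals p_threshold."""
    return alpha / p_threshold


if __name__ == "__main__":
    for label, p in THRESHOLDS.items():
        n = implied_independent_tests(p)
        print(f"{label:>12}: threshold={p:.2e}  implied tests ~ {n:,.0f}")
```

Under this reading the conventional threshold corresponds to about one million independent tests, and a lower threshold, as estimated for AFR, corresponds to a larger implied number.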

  13. Genome Wide Association Study of Sepsis in Extremely Premature Infants

    PubMed Central

    Srinivasan, Lakshmi; Page, Grier; Kirpalani, Haresh; Murray, Jeffrey C.; Das, Abhik; Higgins, Rosemary D.; Carlo, Waldemar A.; Bell, Edward F.; Goldberg, Ronald N.; Schibler, Kurt; Sood, Beena G.; Stevenson, David K.; Stoll, Barbara J.; Van Meurs, Krisa P.; Johnson, Karen J.; Levy, Joshua; McDonald, Scott A.; Zaterka-Baxter, Kristin M.; Kennedy, Kathleen A.; Sánchez, Pablo J.; Duara, Shahnaz; Walsh, Michele C.; Shankaran, Seetha; Wynn, James L.; Cotten, C. Michael

    2017-01-01

    Objective: To identify genetic variants associated with sepsis (early and late-onset) using a genome wide association (GWA) analysis in a cohort of extremely premature infants. Study Design: Previously generated GWA data from the Neonatal Research Network’s anonymized genomic database biorepository of extremely premature infants were used for this study. Sepsis was defined as culture-positive early-onset or late-onset sepsis or culture-proven meningitis. Genomic and whole genome amplified DNA was genotyped for 1.2 million single nucleotide polymorphisms (SNPs); 91% of SNPs were successfully genotyped. We imputed 7.2 million additional SNPs. P values and false discovery rates were calculated from multivariate logistic regression analysis adjusting for gender, gestational age and ancestry. Target statistical value was p < 10⁻⁵. Secondary analyses assessed associations of SNPs with pathogen type. Pathway analyses were also run on primary and secondary end points. Results: Data from 757 extremely premature infants were included: 351 infants with sepsis and 406 infants without sepsis. No SNPs reached genome-wide significance levels (5 × 10⁻⁸); two SNPs in proximity to FOXC2 and FOXL1 genes achieved target levels of significance. In secondary analyses, SNPs for ELMO1, IRAK2 (Gram positive sepsis), RALA, IMMP2L (Gram negative sepsis) and PIEZO2 (fungal sepsis) met target significance levels. Pathways associated with sepsis and Gram negative sepsis included gap junctions, fibroblast growth factor receptors, regulators of cell division and Interleukin-1 associated receptor kinase 2 (p values < 0.001 and FDR < 20%). Conclusions: No SNPs met genome-wide significance in this cohort of ELBW infants; however, areas of potential association and pathways meriting further study were identified. PMID:28283553

  14. First genetic linkage map of Taraxacum koksaghyz Rodin based on AFLP, SSR, COS and EST-SSR markers.

    PubMed

    Arias, Marina; Hernandez, Monica; Remondegui, Naroa; Huvenaars, Koen; van Dijk, Peter; Ritter, Enrique

    2016-08-04

    Taraxacum koksaghyz Rodin (TKS) has been studied on many occasions as a possible alternative source for production of good-quality natural rubber and for inulin production. Some tire companies are already testing TKS tire prototypes. There are also many investigations on the production of bio-fuels from inulin and on inulin applications for health improvement and in the food industry. Only a limited amount of genomic resources exists for TKS and, in particular, no genetic linkage map is available for this species. We have constructed the first TKS genetic linkage map based on AFLP, COS, SSR and EST-SSR markers. The integrated linkage map with eight linkage groups (LG), representing the eight chromosomes of Russian dandelion, has 185 individual AFLP markers from parent 1, 188 individual AFLP markers from parent 2, 75 common AFLP markers and 6 COS, 1 SSR and 63 EST-SSR loci. Blasting the EST-SSR sequences against known sequences from lettuce allowed a partial alignment of our TKS map with a lettuce map. Blast searches against plant gene databases revealed some homologies with useful genes for downstream applications in the future.

  15. Application of RAD-sequencing to linkage mapping in citrus

    USDA-ARS's Scientific Manuscript database

    High density linkage maps can be developed for modest cost using high-throughput DNA sequencing to genotype a defined fraction (representation) of the genome. We developed linkage maps in two citrus populations using the RAD (Restriction site Associated DNA) genotyping method which involves restrict...

  16. Construction of a BAC library and mapping BAC clones to the linkage map of Barramundi, Lates calcarifer.

    PubMed

    Wang, Chun Ming; Lo, Loong Chueng; Feng, Felicia; Gong, Ping; Li, Jian; Zhu, Ze Yuan; Lin, Grace; Yue, Gen Hua

    2008-03-25

    Barramundi (Lates calcarifer) is an important farmed marine food fish species. Its first generation linkage map has been applied to map QTL for growth traits. To identify genes located in QTL responsible for specific traits, genomic large insert libraries are of crucial importance. We report herein a bacterial artificial chromosome (BAC) library and the mapping of BAC clones to the linkage map. This BAC library consisted of 49,152 clones with an average insert size of 98 kb, representing 6.9-fold haploid genome coverage. Screening the library with 24 microsatellites and 15 ESTs/genes demonstrated that the library had good genome coverage. In addition, 62 novel microsatellites each isolated from 62 BAC clones were mapped onto the first generation linkage map. A total of 86 BAC clones were anchored on the linkage map with at least one BAC clone on each linkage group. We have constructed the first BAC library for L. calcarifer and mapped 86 BAC clones to the first generation linkage map. This BAC library and the improved linkage map with 302 DNA markers not only supply an indispensable tool for the integration of physical and linkage maps, the fine mapping of QTL and the map-based cloning of genes located in QTL of commercial importance, but also contribute to comparative genomic studies and eventually whole genome sequencing.
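
The coverage figure in this record follows from simple arithmetic: fold coverage is the number of clones times the mean insert size, divided by the haploid genome size. The abstract does not state the genome size, so the sketch below back-calculates the value implied by the quoted numbers. It also applies the common approximation that the probability of a given sequence being present is 1 - exp(-coverage), which is an added illustration and not a claim from the abstract. All function names are hypothetical.

```python
import math


def fold_coverage(n_clones: int, mean_insert_kb: float, genome_size_kb: float) -> float:
    """Fold coverage of a clone library over a haploid genome."""
    return n_clones * mean_insert_kb / genome_size_kb


def implied_genome_size_kb(n_clones: int, mean_insert_kb: float, coverage: float) -> float:
    """Haploid genome size (kb) implied by a stated fold coverage."""
    return n_clones * mean_insert_kb / coverage


def prob_sequence_present(coverage: float) -> float:
    """Approximate chance that any given sequence is in the library,
    assuming clones are sampled uniformly at random (an idealization)."""
    return 1.0 - math.exp(-coverage)


if __name__ == "__main__":
    clones, insert_kb, coverage = 49_152, 98.0, 6.9  # values quoted in the abstract
    genome_kb = implied_genome_size_kb(clones, insert_kb, coverage)
    print(f"total cloned DNA: {clones * insert_kb:,.0f} kb")
    print(f"implied haploid genome size: {genome_kb / 1000:,.0f} Mb")
    print(f"P(sequence present): {prob_sequence_present(coverage):.4f}")
```

The implied genome size of roughly 700 Mb is a derived figure and should be checked against the full paper before being relied on.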

  17. A High-Density Linkage Map Reveals Sexual Dimorphism in Recombination Landscapes in Red Deer (Cervus elaphus)

    PubMed Central

    Johnston, Susan E.; Huisman, Jisca; Ellis, Philip A.; Pemberton, Josephine M.

    2017-01-01

    High-density linkage maps are an important tool to gain insight into the genetic architecture of traits of evolutionary and economic interest, and provide a resource to characterize variation in recombination landscapes. Here, we used information from the cattle genome and the 50 K Cervine Illumina BeadChip to inform and refine a high-density linkage map in a wild population of red deer (Cervus elaphus). We constructed a predicted linkage map of 38,038 SNPs and a skeleton map of 10,835 SNPs across 34 linkage groups. We identified several chromosomal rearrangements in the deer lineage relative to sheep and cattle, including six chromosome fissions, one fusion, and two large inversions. Otherwise, our findings showed strong concordance with map orders in the cattle genome. The sex-averaged linkage map length was 2739.7 cM and the genome-wide autosomal recombination rate was 1.04 cM/Mb. The female autosomal map length was 1.21 times that of males (2767.4 cM vs. 2280.8 cM, respectively). Sex differences in map length were driven by high female recombination rates in peri-centromeric regions, a pattern that is unusual relative to other mammal species. This effect was more pronounced in fission chromosomes that would have had to produce new centromeres. We propose two hypotheses to explain this effect: (1) that this mechanism may have evolved to counteract centromeric drive associated with meiotic asymmetry in oocyte production; and/or (2) that sequence and structural characteristics suppressing recombination in close proximity to the centromere may not have evolved at neo-centromeres. Our study provides insight into how recombination landscapes vary and evolve in mammals, and will provide a valuable resource for studies of evolution, genetic improvement, and population management in red deer and related species. PMID:28667018
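
The sex difference quoted in this record is a ratio of the two autosomal map lengths. A minimal check of that arithmetic, using only the two lengths given in the abstract, might look like the following; the function name `map_length_ratio` is hypothetical.

```python
def map_length_ratio(female_cm: float, male_cm: float) -> float:
    """Ratio of female to male autosomal linkage map length."""
    return female_cm / male_cm


if __name__ == "__main__":
    female_cm = 2767.4  # female autosomal map length from the abstract
    male_cm = 2280.8    # male autosomal map length from the abstract
    ratio = map_length_ratio(female_cm, male_cm)
    print(f"female/male map length ratio: {ratio:.2f}")  # prints 1.21
```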

  18. Transport genes and chemotaxis in Laribacter hongkongensis: a genome-wide analysis

    PubMed Central

    2011-01-01

    Background: Laribacter hongkongensis is a Gram-negative, sea gull-shaped rod associated with community-acquired gastroenteritis. The bacterium has been found in diverse freshwater environments including fish, frogs and drinking water reservoirs. Using the complete genome sequence data of L. hongkongensis, we performed a comprehensive analysis of putative transport-related genes and genes related to chemotaxis, motility and quorum sensing, which may help the bacterium adapt to the changing environments and combat harmful substances. Results: A genome-wide analysis using Transport Classification Database TCDB, similarity and keyword searches revealed the presence of a large diversity of transporters (n = 457) and genes related to chemotaxis (n = 52) and flagellar biosynthesis (n = 40) in the L. hongkongensis genome. The transporters included those from all seven major transporter categories, which may allow the uptake of essential nutrients or ions, and extrusion of metabolic end products and hazardous substances. L. hongkongensis is unique among closely related members of the Neisseriaceae family in possessing a higher number of proteins related to transport of ammonium, urea and dicarboxylate, which may reflect the importance of nitrogen and dicarboxylate metabolism in this asaccharolytic bacterium. Structural modeling of two C4-dicarboxylate transporters showed that they possessed similar structures to the determined structures of other DctP-TRAP transporters, with one having an unusual disulfide bond. Diverse mechanisms for iron transport, including hemin transporters for iron acquisition from host proteins, were also identified. In addition to the chemotaxis and flagella-related genes, the L. hongkongensis genome also contained two copies of qseB/qseC homologues of the AI-3 quorum sensing system. Conclusions: The large number of diverse transporters and genes involved in chemotaxis, motility and quorum sensing suggested that the bacterium may utilize a complex system to

  19. Transcription facilitated genome-wide recruitment of topoisomerase I and DNA gyrase.

    PubMed

    Ahmed, Wareed; Sala, Claudia; Hegde, Shubhada R; Jha, Rajiv Kumar; Cole, Stewart T; Nagaraja, Valakunja

    2017-05-01

    Movement of the transcription machinery along a template alters DNA topology resulting in the accumulation of supercoils in DNA. The positive supercoils generated ahead of transcribing RNA polymerase (RNAP) and the negative supercoils accumulating behind impose severe topological constraints impeding transcription process. Previous studies have implied the role of topoisomerases in the removal of torsional stress and the maintenance of template topology but the in vivo interaction of functionally distinct topoisomerases with heterogeneous chromosomal territories is not deciphered. Moreover, how the transcription-induced supercoils influence the genome-wide recruitment of DNA topoisomerases remains to be explored in bacteria. Using ChIP-Seq, we show the genome-wide occupancy profile of both topoisomerase I and DNA gyrase in conjunction with RNAP in Mycobacterium tuberculosis taking advantage of minimal topoisomerase representation in the organism. The study unveils the first in vivo genome-wide interaction of both the topoisomerases with the genomic regions and establishes that transcription-induced supercoils govern their recruitment at genomic sites. Distribution profiles revealed co-localization of RNAP and the two topoisomerases on the active transcriptional units (TUs). At a given locus, topoisomerase I and DNA gyrase were localized behind and ahead of RNAP, respectively, correlating with the twin-supercoiled domains generated. The recruitment of topoisomerases was higher at the genomic loci with higher transcriptional activity and/or at regions under high torsional stress compared to silent genomic loci. Importantly, the occupancy of DNA gyrase, sole type II topoisomerase in Mtb, near the Ter domain of the Mtb chromosome validates its function as a decatenase.

  20. Transcription facilitated genome-wide recruitment of topoisomerase I and DNA gyrase

    PubMed Central

    Ahmed, Wareed; Sala, Claudia; Hegde, Shubhada R.; Jha, Rajiv Kumar

    2017-01-01

    Movement of the transcription machinery along a template alters DNA topology resulting in the accumulation of supercoils in DNA. The positive supercoils generated ahead of transcribing RNA polymerase (RNAP) and the negative supercoils accumulating behind impose severe topological constraints impeding transcription process. Previous studies have implied the role of topoisomerases in the removal of torsional stress and the maintenance of template topology but the in vivo interaction of functionally distinct topoisomerases with heterogeneous chromosomal territories is not deciphered. Moreover, how the transcription-induced supercoils influence the genome-wide recruitment of DNA topoisomerases remains to be explored in bacteria. Using ChIP-Seq, we show the genome-wide occupancy profile of both topoisomerase I and DNA gyrase in conjunction with RNAP in Mycobacterium tuberculosis taking advantage of minimal topoisomerase representation in the organism. The study unveils the first in vivo genome-wide interaction of both the topoisomerases with the genomic regions and establishes that transcription-induced supercoils govern their recruitment at genomic sites. Distribution profiles revealed co-localization of RNAP and the two topoisomerases on the active transcriptional units (TUs). At a given locus, topoisomerase I and DNA gyrase were localized behind and ahead of RNAP, respectively, correlating with the twin-supercoiled domains generated. The recruitment of topoisomerases was higher at the genomic loci with higher transcriptional activity and/or at regions under high torsional stress compared to silent genomic loci. Importantly, the occupancy of DNA gyrase, sole type II topoisomerase in Mtb, near the Ter domain of the Mtb chromosome validates its function as a decatenase. PMID:28463980

  1. Genome-wide analysis of tandem repeats in plants and green algae

    Treesearch

    Zhixin Zhao; Cheng Guo; Sreeskandarajan Sutharzan; Pei Li; Craig Echt; Jie Zhang; Chun Liang

    2014-01-01

    Tandem repeats (TRs) extensively exist in the genomes of prokaryotes and eukaryotes. Based on the sequenced genomes and gene annotations of 31 plant and algal species in Phytozome version 8.0 (http://www.phytozome.net/), we examined TRs in a genome-wide scale, characterized their distributions and motif features, and explored their putative biological functions. Among...

  2. Genome-wide characterization of microsatellites and marker development in the carcinogenic liver fluke Clonorchis sinensis

    PubMed Central

    Nguyen, Thao T.B.; Arimatsu, Yuji; Hong, Sung-Jong; Brindley, Paul J.; Blair, David; Laha, Thewarach; Sripa, Banchob

    2015-01-01

    Clonorchis sinensis is an important carcinogenic human liver fluke endemic in East and Southeast Asia. Several conventional molecular markers have been used for identification and genetic diversity studies; however, no information about microsatellites of this liver fluke has been published so far. We here report microsatellite characterization and marker development for genetic diversity study in C. sinensis using a genome-wide bioinformatics approach. Based on our search criteria, a total of 256,990 microsatellites (≥ 12 base pairs) were identified from the genome database of C. sinensis, with the hexa-nucleotide motif being the most abundant (51%) followed by penta-nucleotide (18.3%) and tri-nucleotide (12.7%). The tetra-nucleotide, di-nucleotide and mononucleotide motifs accounted for 9.75%, 7.63% and 0.14%, respectively. The total length of all microsatellites accounts for 0.72% of the 547 Mb whole genome size, and the frequency of microsatellites was found to be one microsatellite in every 2.13 kb of DNA. For the di-, tri- and tetra-nucleotide motifs, the repeat numbers redundant are six (28%), four (45%) and three (76%), respectively. The ATC repeat is the most abundant microsatellite, followed by AT, AAT and AC, respectively. Of the 40 microsatellite loci developed, 24 microsatellite markers showed potential to differentiate between C. sinensis and O. viverrini. Seven out of 24 loci were heterozygous, with observed heterozygosity ranging from 0.467 to 1. Four primer sets could amplify both C. sinensis and O. viverrini DNA with different sizes. This study provides basic information on C. sinensis microsatellites, and the genome-wide markers developed may be a useful tool for genetic study of C. sinensis. PMID:25782682

  3. Genome-wide characterization of microsatellites and marker development in the carcinogenic liver fluke Clonorchis sinensis.

    PubMed

    Nguyen, Thao T B; Arimatsu, Yuji; Hong, Sung-Jong; Brindley, Paul J; Blair, David; Laha, Thewarach; Sripa, Banchob

    2015-06-01

    Clonorchis sinensis is an important carcinogenic human liver fluke endemic in East and Southeast Asia. There are several conventional molecular markers that have been used for identification and genetic diversity; however, no information about microsatellites of this liver fluke is published so far. We here report microsatellite characterization and marker development for a genetic diversity study in C. sinensis, using a genome-wide bioinformatics approach. Based on our search criteria, a total of 256,990 microsatellites (≥12 base pairs) were identified from a genome database of C. sinensis, with hexanucleotide motif being the most abundant (51%) followed by pentanucleotide (18.3%) and trinucleotide (12.7%). The tetranucleotide, dinucleotide, and mononucleotide motifs accounted for 9.75, 7.63, and 0.14%, respectively. The total length of all microsatellites accounts for 0.72% of 547 Mb of the whole genome size, and the frequency of microsatellites was found to be one microsatellite in every 2.13 kb of DNA. For the di-, tri-, and tetranucleotide, the repeat numbers redundant are six (28%), four (45%), and three (76%), respectively. The ATC repeat is the most abundant microsatellite followed by AT, AAT, and AC, respectively. Within 40 microsatellite loci developed, 24 microsatellite markers showed potential to differentiate between C. sinensis and Opisthorchis viverrini. Seven out of 24 loci showed to be heterozygous with observed heterozygosity that ranged from 0.467 to 1. Four primer sets could amplify both C. sinensis and O. viverrini DNA with different sizes. This study provides basic information of C. sinensis microsatellites, and the genome-wide markers developed may be a useful tool for the genetic study of C. sinensis.
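
    The kind of genome-wide scan this abstract describes can be illustrated with a small Python sketch. It is not the authors' pipeline: it assumes a microsatellite is a perfect tandem repeat of a 1-6 bp motif with a total length of at least 12 bp, and the function names, the regular-expression approach and the toy sequence are all hypothetical. The final line re-derives the reported density from the abstract's own figures (547 Mb and 256,990 microsatellites).

```python
import re


def is_primitive(motif):
    """True if the motif is not a shorter motif repeated (e.g. 'ATAT' is 'AT' x 2)."""
    n = len(motif)
    return not any(
        n % k == 0 and motif == motif[:k] * (n // k) for k in range(1, n)
    )


def find_microsatellites(seq, max_motif=6, min_length=12):
    """Return sorted (start, motif, length) tuples for perfect tandem repeats.

    Assumption: a microsatellite is a perfect repeat of a 1-6 bp motif whose
    total length is at least min_length bp.
    """
    seq = seq.upper()
    hits = []
    for k in range(1, max_motif + 1):
        # A k-bp motif followed by one or more exact copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1+" % k)
        for match in pattern.finditer(seq):
            motif, length = match.group(1), len(match.group(0))
            if length >= min_length and is_primitive(motif):
                hits.append((match.start(), motif, length))
    return sorted(hits)


# Toy sequence (hypothetical): an (AT)6 and an (ATC)4 repeat, plus a short A run.
toy = "GGC" + "AT" * 6 + "CCG" + "ATC" * 4 + "GTA" + "A" * 5
hits = find_microsatellites(toy)

# Density check using the figures reported in the abstract.
kb_per_microsatellite = 547_000_000 / 256_990 / 1000
```

    The primitive-motif check stops a dinucleotide repeat such as (AT)6 from being counted a second time as a tetra- or hexanucleotide repeat.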

  4. The new NHGRI-EBI Catalog of published genome-wide association studies (GWAS Catalog)

    PubMed Central

    MacArthur, Jacqueline; Bowler, Emily; Cerezo, Maria; Gil, Laurent; Hall, Peggy; Hastings, Emma; Junkins, Heather; McMahon, Aoife; Milano, Annalisa; Morales, Joannella; Pendlington, Zoe May; Welter, Danielle; Burdett, Tony; Hindorff, Lucia; Flicek, Paul; Cunningham, Fiona; Parkinson, Helen

    2017-01-01

    The NHGRI-EBI GWAS Catalog has provided data from published genome-wide association studies since 2008. In 2015, the database was redesigned and relocated to EMBL-EBI. The new infrastructure includes a new graphical user interface (www.ebi.ac.uk/gwas/), ontology supported search functionality and an improved curation interface. These developments have improved the data release frequency by increasing automation of curation and providing scaling improvements. The range of available Catalog data has also been extended with structured ancestry and recruitment information added for all studies. The infrastructure improvements also support scaling for larger arrays, exome and sequencing studies, allowing the Catalog to adapt to the needs of evolving study design, genotyping technologies and user needs in the future. PMID:27899670

  5. Genome-wide association study of rice grain width variation.

    PubMed

    Zheng, Xiao-Ming; Gong, Tingting; Ou, Hong-Ling; Xue, Dayuan; Qiao, Weihua; Wang, Junrui; Liu, Sha; Yang, Qingwen; Olsen, Kenneth M

    2018-04-01

    Seed size is variable within many plant species, and understanding the underlying genetic factors can provide insights into mechanisms of local environmental adaptation. Here we make use of the abundant genomic and germplasm resources available for rice (Oryza sativa) to perform a large-scale genome-wide association study (GWAS) of grain width. Grain width varies widely within the crop and is also known to show climate-associated variation across populations of its wild progenitor. Using a filtered dataset of >1.9 million genome-wide SNPs in a sample of 570 cultivated and wild rice accessions, we performed GWAS with two complementary models, GLM and MLM. The models yielded 10 and 33 significant associations, respectively, and jointly yielded seven candidate locus regions, two of which have been previously identified. Analyses of nucleotide diversity and haplotype distributions at these loci revealed signatures of selection and patterns consistent with adaptive introgression of grain width alleles across rice variety groups. The results provide a 50% increase in the total number of rice grain width loci mapped to date and support a polygenic model whereby grain width is shaped by gene-by-environment interactions. These loci can potentially serve as candidates for studies of adaptive seed size variation in wild grass species.

  6. Analysis of Genome-Wide Association Studies with Multiple Outcomes Using Penalization

    PubMed Central

    Liu, Jin; Huang, Jian; Ma, Shuangge

    2012-01-01

    Genome-wide association studies have been extensively conducted, searching for markers for biologically meaningful outcomes and phenotypes. Penalization methods have been adopted in the analysis of the joint effects of a large number of SNPs (single nucleotide polymorphisms) and marker identification. This study is partly motivated by the analysis of heterogeneous stock mice dataset, in which multiple correlated phenotypes and a large number of SNPs are available. Existing penalization methods designed to analyze a single response variable cannot accommodate the correlation among multiple response variables. With multiple response variables sharing the same set of markers, joint modeling is first employed to accommodate the correlation. The group Lasso approach is adopted to select markers associated with all the outcome variables. An efficient computational algorithm is developed. Simulation study and analysis of the heterogeneous stock mice dataset show that the proposed method can outperform existing penalization methods. PMID:23272092

  7. SNP selection and classification of genome-wide SNP data using stratified sampling random forests.

    PubMed

    Wu, Qingyao; Ye, Yunming; Liu, Yang; Ng, Michael K

    2012-09-01

    High-dimensional genome-wide association (GWA) case-control data for complex diseases usually contain a large proportion of single-nucleotide polymorphisms (SNPs) that are irrelevant to the disease. Simple random sampling in a random forest, using the default mtry parameter to choose the feature subspace, will select too many subspaces without informative SNPs. An exhaustive search for an optimal mtry is often required in order to include useful and relevant SNPs and discard the vast number of non-informative SNPs. However, such a search is too time-consuming to be practical for high-dimensional GWA data. The main aim of this paper is to propose a stratified sampling method for feature subspace selection to generate decision trees in a random forest for high-dimensional GWA data. Our idea is to design an equal-width discretization scheme for informativeness to divide SNPs into multiple groups. In feature subspace selection, we randomly select the same number of SNPs from each group and combine them to form a subspace to generate a decision tree. This stratified sampling procedure ensures that each subspace contains enough useful SNPs, avoids the very high computational cost of an exhaustive search for an optimal mtry, and maintains the randomness of a random forest. We employ two genome-wide SNP data sets (Parkinson case-control data comprising 408 803 SNPs and Alzheimer case-control data comprising 380 157 SNPs) to demonstrate that the proposed stratified sampling method is effective, and that it can generate a better random forest, with higher accuracy and a lower error bound, than Breiman's random forest generation method. For the Parkinson data, we also show some interesting genes identified by the method, which may be associated with neurological disorders and merit further biological investigation.
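
    The subspace-selection step can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes each SNP already has a numeric informativeness score (the abstract does not say how the score is computed), and the function names, the toy scores and the choice of three groups are hypothetical.

```python
import random


def equal_width_groups(scores, n_groups):
    """Assign each feature index to one of n_groups equal-width score bins."""
    lo, hi = min(scores), max(scores)
    width = (hi - lo) / n_groups or 1.0  # guard against identical scores
    groups = [[] for _ in range(n_groups)]
    for idx, score in enumerate(scores):
        # The maximum score falls on the top edge, so clamp it into the last bin.
        bin_index = min(int((score - lo) / width), n_groups - 1)
        groups[bin_index].append(idx)
    return groups


def stratified_subspace(groups, per_group, rng):
    """Draw up to per_group feature indices from every group."""
    subspace = []
    for members in groups:
        k = min(per_group, len(members))
        subspace.extend(rng.sample(members, k))
    return subspace


# Hypothetical informativeness scores for eight SNPs.
scores = [0.01, 0.02, 0.05, 0.40, 0.45, 0.90, 0.95, 0.03]
groups = equal_width_groups(scores, n_groups=3)
subspace = stratified_subspace(groups, per_group=1, rng=random.Random(0))
```

    Each decision tree would then be grown on its own freshly drawn subspace, so every tree sees SNPs from both the high-scoring and the low-scoring groups.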

  8. Joint genome-wide association study for milk fatty acid traits in Chinese and Danish Holstein populations.

    PubMed

    Li, X; Buitenhuis, A J; Lund, M S; Li, C; Sun, D; Zhang, Q; Poulsen, N A; Su, G

    2015-11-01

    The identification of causal genes or genomic regions associated with fatty acids (FA) will enhance our understanding of the pathways underlying FA synthesis and provide opportunities for changing milk fat composition through a genetic approach. The linkage disequilibrium between adjacent markers is highly consistent between the Chinese and Danish Holstein populations, such that a joint genome-wide association study (GWAS) can be performed. In this study, a joint GWAS was performed for 16 milk FA traits based on data of 784 Chinese and 371 Danish Holstein cows genotyped by a high-density bovine single nucleotide polymorphism (SNP) array. A total of 486,464 SNP markers on 29 bovine autosomes were used. Bonferroni corrections were applied to adjust the significance thresholds for multiple testing at the genome- and chromosome-wide levels. According to the analysis of either the Chinese or Danish data individually, the total numbers of overlapping SNP that were significant at the chromosome level were 94 for C14:1, 208 for the C14 index, and 1 for C18:0. Joint analysis using the combined data of the 2 populations detected greater numbers of significant SNP compared with either of the individual populations alone for 7 and 10 traits at the genome- and chromosome-wide significance levels, respectively. Greater numbers of significant SNP were detected for C18:0 and the C18 index in the Chinese population compared with the joint analysis. Sixty-five significant SNP across all traits had significantly different effects in the 2 populations. Ten FA were influenced by a quantitative trait locus (QTL) region including DGAT1. Both C14:1 and the C14 index were influenced by a QTL region including SCD1 in the combined population. Other QTL regions also showed significant associations with the studied FA. A large region (14.9-24.9 Mbp) in BTA26 significantly influenced C14:1 and the C14 index in both populations, most likely due to the SNP in SCD1. 
A QTL region (69.97-73.69 Mbp

  9. AID/APOBEC cytosine deaminase induces genome-wide kataegis

    PubMed Central

    2012-01-01

    Clusters of localized hypermutation in human breast cancer genomes, named “kataegis” (from the Greek for thunderstorm), are hypothesized to result from multiple cytosine deaminations catalyzed by AID/APOBEC proteins. However, a direct link between APOBECs and kataegis is still lacking. We have sequenced the genomes of yeast mutants induced in diploids by expression of the gene for PmCDA1, a hypermutagenic deaminase from sea lamprey. Analysis of the distribution of 5,138 induced mutations revealed localized clusters very similar to those found in tumors. Our data provide evidence that unleashed cytosine deaminase activity is an evolutionary conserved, prominent source of genome-wide kataegis events. Reviewers This article was reviewed by: Professor Sandor Pongor, Professor Shamil R. Sunyaev, and Dr Vladimir Kuznetsov. PMID:23249472

  10. Arabidopsis transcription factors: genome-wide comparative analysis among eukaryotes.

    PubMed

    Riechmann, J L; Heard, J; Martin, G; Reuber, L; Jiang, C; Keddie, J; Adam, L; Pineda, O; Ratcliffe, O J; Samaha, R R; Creelman, R; Pilgrim, M; Broun, P; Zhang, J Z; Ghandehari, D; Sherman, B K; Yu, G

    2000-12-15

    The completion of the Arabidopsis thaliana genome sequence allows a comparative analysis of transcriptional regulators across the three eukaryotic kingdoms. Arabidopsis dedicates over 5% of its genome to code for more than 1500 transcription factors, about 45% of which are from families specific to plants. Arabidopsis transcription factors that belong to families common to all eukaryotes do not share significant similarity with those of the other kingdoms beyond the conserved DNA binding domains, many of which have been arranged in combinations specific to each lineage. The genome-wide comparison reveals the evolutionary generation of diversity in the regulation of transcription.

  11. Genotypic variants at 2q33 and risk of esophageal squamous cell carcinoma in China: a meta-analysis of genome-wide association studies.

    PubMed

    Abnet, Christian C; Wang, Zhaoming; Song, Xin; Hu, Nan; Zhou, Fu-You; Freedman, Neal D; Li, Xue-Min; Yu, Kai; Shu, Xiao-Ou; Yuan, Jian-Min; Zheng, Wei; Dawsey, Sanford M; Liao, Linda M; Lee, Maxwell P; Ding, Ti; Qiao, You-Lin; Gao, Yu-Tang; Koh, Woon-Puay; Xiang, Yong-Bing; Tang, Ze-Zhong; Fan, Jin-Hu; Chung, Charles C; Wang, Chaoyu; Wheeler, William; Yeager, Meredith; Yuenger, Jeff; Hutchinson, Amy; Jacobs, Kevin B; Giffen, Carol A; Burdett, Laurie; Fraumeni, Joseph F; Tucker, Margaret A; Chow, Wong-Ho; Zhao, Xue-Ke; Li, Jiang-Man; Li, Ai-Li; Sun, Liang-Dan; Wei, Wu; Li, Ji-Lin; Zhang, Peng; Li, Hong-Lei; Cui, Wen-Yan; Wang, Wei-Peng; Liu, Zhi-Cai; Yang, Xia; Fu, Wen-Jing; Cui, Ji-Li; Lin, Hong-Li; Zhu, Wen-Liang; Liu, Min; Chen, Xi; Chen, Jie; Guo, Li; Han, Jing-Jing; Zhou, Sheng-Li; Huang, Jia; Wu, Yue; Yuan, Chao; Huang, Jing; Ji, Ai-Fang; Kul, Jian-Wei; Fan, Zhong-Min; Wang, Jian-Po; Zhang, Dong-Yun; Zhang, Lian-Qun; Zhang, Wei; Chen, Yuan-Fang; Ren, Jing-Li; Li, Xiu-Min; Dong, Jin-Cheng; Xing, Guo-Lan; Guo, Zhi-Gang; Yang, Jian-Xue; Mao, Yi-Ming; Yuan, Yuan; Guo, Er-Tao; Zhang, Wei; Hou, Zhi-Chao; Liu, Jing; Li, Yan; Tang, Sa; Chang, Jia; Peng, Xiu-Qin; Han, Min; Yin, Wan-Li; Liu, Ya-Li; Hu, Yan-Long; Liu, Yu; Yang, Liu-Qin; Zhu, Fu-Guo; Yang, Xiu-Feng; Feng, Xiao-Shan; Wang, Zhou; Li, Yin; Gao, She-Gan; Liu, Hai-Lin; Yuan, Ling; Jin, Yan; Zhang, Yan-Rui; Sheyhidin, Ilyar; Li, Feng; Chen, Bao-Ping; Ren, Shu-Wei; Liu, Bin; Li, Dan; Zhang, Gao-Fu; Yue, Wen-Bin; Feng, Chang-Wei; Qige, Qirenwang; Zhao, Jian-Ting; Yang, Wen-Jun; Lei, Guang-Yan; Chen, Long-Qi; Li, En-Min; Xu, Li-Yan; Wu, Zhi-Yong; Bao, Zhi-Qin; Chen, Ji-Li; Li, Xian-Chang; Zhuang, Xiang; Zhou, Ying-Fa; Zuo, Xian-Bo; Dong, Zi-Ming; Wang, Lu-Wen; Fan, Xue-Pin; Wang, Jin; Zhou, Qi; Ma, Guo-Shun; Zhang, Qin-Xian; Liu, Hai; Jian, Xin-Ying; Lian, Sin-Yong; Wang, Jin-Sheng; Chang, Fu-Bao; Lu, Chang-Dong; Miao, Jian-Jun; Chen, Zhi-Guo; Wang, Ran; Guo, Ming; Fan, Zeng-Lin; Tao, Ping; Liu, Tai-Jing; Wei, 
Jin-Chang; Kong, Qing-Peng; Fan, Lei; Wang, Xian-Zeng; Gao, Fu-Sheng; Wang, Tian-Yun; Xie, Dong; Wang, Li; Chen, Shu-Qing; Yang, Wan-Cai; Hong, Jun-Yan; Wang, Liang; Qiu, Song-Liang; Goldstein, Alisa M; Yuan, Zhi-Qing; Chanock, Stephen J; Zhang, Xue-Jun; Taylor, Philip R; Wang, Li-Dong

    2012-05-01

    Genome-wide association studies have identified susceptibility loci for esophageal squamous cell carcinoma (ESCC). We conducted a meta-analysis of all single-nucleotide polymorphisms (SNPs) that showed nominally significant P-values in two previously published genome-wide scans that included a total of 2961 ESCC cases and 3400 controls. The meta-analysis revealed five SNPs at 2q33 with P < 5 × 10^-8, and the strongest signal was rs13016963, with a combined odds ratio (95% confidence interval) of 1.29 (1.19-1.40) and P = 7.63 × 10^-10. An imputation analysis of 4304 SNPs at 2q33 suggested a single association signal, and the strongest imputed SNP associations were similar to those from the genotyped SNPs. We conducted an ancestral recombination graph analysis with 53 SNPs to identify one or more haplotypes that harbor the variants directly responsible for the detected association signal. This showed that the five SNPs exist in a single haplotype along with 45 imputed SNPs in strong linkage disequilibrium, and the strongest candidate was rs10201587, one of the genotyped SNPs. Our meta-analysis found genome-wide significant SNPs at 2q33 that map to the CASP8/ALS2CR12/TRAK2 gene region. Variants in CASP8 have been extensively studied across a spectrum of cancers with mixed results. The locus we identified appears to be distinct from the widely studied rs3834129 and rs1045485 SNPs in CASP8. Future studies of esophageal and other cancers should focus on comprehensive sequencing of this 2q33 locus and functional analysis of rs13016963 and rs10201587 and other strongly correlated variants.
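
    As a rough consistency check on the figures reported for rs13016963, a P-value can be approximated from the odds ratio and its 95% confidence interval. The Python sketch below assumes a Wald test on the log odds ratio with a normal approximation; the abstract does not state which test was used, the function name is hypothetical, and rounding of the interval limits makes the result approximate.

```python
import math


def p_from_or_ci(odds_ratio, ci_low, ci_high, z_crit=1.96):
    """Approximate two-sided P-value from an odds ratio and its 95% CI.

    Assumes a Wald test on the log odds ratio (normal approximation).
    """
    # The CI spans 2 * z_crit standard errors on the log scale.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = math.log(odds_ratio) / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return z, p


z, p = p_from_or_ci(1.29, 1.19, 1.40)
print(f"z = {z:.2f}, P = {p:.1e}")
```

    With the rounded values from the abstract, the approximation lands in the same order of magnitude as the reported P-value and well below the genome-wide threshold of 5 × 10^-8.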

  12. Genome-wide association studies dissect the genetic networks underlying agronomical traits in soybean.

    PubMed

    Fang, Chao; Ma, Yanming; Wu, Shiwen; Liu, Zhi; Wang, Zheng; Yang, Rui; Hu, Guanghui; Zhou, Zhengkui; Yu, Hong; Zhang, Min; Pan, Yi; Zhou, Guoan; Ren, Haixiang; Du, Weiguang; Yan, Hongrui; Wang, Yanping; Han, Dezhi; Shen, Yanting; Liu, Shulin; Liu, Tengfei; Zhang, Jixiang; Qin, Hao; Yuan, Jia; Yuan, Xiaohui; Kong, Fanjiang; Liu, Baohui; Li, Jiayang; Zhang, Zhiwu; Wang, Guodong; Zhu, Baoge; Tian, Zhixi

    2017-08-24

    Soybean (Glycine max [L.] Merr.) is one of the most important oil and protein crops. Ever-increasing soybean consumption necessitates the improvement of varieties for more efficient production. However, both correlations among different traits and genetic interactions among genes that affect a single trait pose a challenge to soybean breeding. To understand the genetic networks underlying phenotypic correlations, we collected 809 soybean accessions worldwide and phenotyped them for two years at three locations for 84 agronomic traits. Genome-wide association studies identified 245 significant genetic loci, among which 95 genetically interacted with other loci. We determined that 14 oil synthesis-related genes are responsible for fatty acid accumulation in soybean and function in line with an additive model. Network analyses demonstrated that 51 traits could be linked through the linkage disequilibrium of 115 associated loci and these links reflect phenotypic correlations. We revealed that 23 loci, including the known Dt1, E2, E1, Ln, Dt2, Fan, and Fap loci, as well as 16 undefined associated loci, have pleiotropic effects on different traits. This study provides insights into the genetic correlation among complex traits and will facilitate future soybean functional studies and breeding through molecular design.

  13. Genome-Wide Architecture of Disease Resistance Genes in Lettuce

    PubMed Central

    Christopoulou, Marilena; Wo, Sebastian Reyes-Chin; Kozik, Alex; McHale, Leah K.; Truco, Maria-Jose; Wroblewski, Tadeusz; Michelmore, Richard W.

    2015-01-01

    Genome-wide motif searches identified 1134 genes in the lettuce reference genome of cv. Salinas that are potentially involved in pathogen recognition, of which 385 were predicted to encode nucleotide binding-leucine rich repeat receptor (NLR) proteins. Using a maximum-likelihood approach, we grouped the NLRs into 25 multigene families and 17 singletons. Forty-one percent of these NLR-encoding genes belong to three families, the largest being RGC16 with 62 genes in cv. Salinas. The majority of NLR-encoding genes are located in five major resistance clusters (MRCs) on chromosomes 1, 2, 3, 4, and 8 and cosegregate with multiple disease resistance phenotypes. Most MRCs contain primarily members of a single NLR gene family but a few are more complex. MRC2 spans 73 Mb and contains 61 NLRs of six different gene families that cosegregate with nine disease resistance phenotypes. MRC3, which is 25 Mb, contains 22 RGC21 genes and colocates with Dm13. A library of 33 transgenic RNA interference tester stocks was generated for functional analysis of NLR-encoding genes that cosegregated with disease resistance phenotypes in each of the MRCs. Members of four NLR-encoding families, RGC1, RGC2, RGC21, and RGC12 were shown to be required for 16 disease resistance phenotypes in lettuce. The general composition of MRCs is conserved across different genotypes; however, the specific repertoire of NLR-encoding genes varied particularly of the rapidly evolving Type I genes. These tester stocks are valuable resources for future analyses of additional resistance phenotypes. PMID:26449254

  14. A SNP based high-density linkage map of Apis cerana reveals a high recombination rate similar to Apis mellifera.

    PubMed

    Shi, Yuan Yuan; Sun, Liang Xian; Huang, Zachary Y; Wu, Xiao Bo; Zhu, Yong Qiang; Zheng, Hua Jun; Zeng, Zhi Jiang

    2013-01-01

    The Eastern honey bee, Apis cerana Fabricius, is distributed in southern and eastern Asia, from India and China to Korea and Japan and southeast to the Moluccas. Besides Apis mellifera, this species is also widely kept for honey production. Apis cerana is also a model organism for studying social behavior, caste determination, mating biology, sexual selection, and host-parasite interactions. Few resources are available for molecular research in this species, and a linkage map has never been constructed. A linkage map is a prerequisite for quantitative trait loci mapping and for analyzing genome structure. We used the Chinese honey bee, Apis cerana cerana, to construct the first linkage map in the Eastern honey bee. F2 workers (N = 103) were genotyped for 126,990 single nucleotide polymorphisms (SNPs). After filtering out low-quality SNPs and those not passing the Mendel test, we obtained 3,000 SNPs; 1,535 of these were informative and were used to construct a linkage map. The preliminary map contains 19 linkage groups; we then mapped the 19 linkage groups to 16 chromosomes by comparing the markers to the genome of A. mellifera. The final map contains 16 linkage groups with a total of 1,535 markers. The total genetic distance is 3,942.7 centimorgans (cM) with the largest linkage group (180 loci) measuring 574.5 cM. Average marker interval for all markers across the 16 linkage groups is 2.6 cM. We constructed a high density linkage map for A. c. cerana with 1,535 markers. Because the map is based on SNP markers, it will enable easier and faster genotyping assays than randomly amplified polymorphic DNA or microsatellite based maps used in A. mellifera.
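
    The reported average marker interval can be reproduced from the other figures in the abstract. The short Python check below assumes the average is the total map length divided by the number of intervals between adjacent markers, with each linkage group of m markers contributing m - 1 intervals; the abstract does not spell out the calculation, so this is one plausible reading.

```python
total_cm = 3942.7   # total genetic distance reported in the abstract
n_markers = 1535    # markers on the final map
n_groups = 16       # linkage groups on the final map

# Assumption: a linkage group with m markers has m - 1 intervals.
n_intervals = n_markers - n_groups
avg_interval_cm = total_cm / n_intervals
print(round(avg_interval_cm, 1))  # prints 2.6
```

    Dividing by the marker count instead of the interval count gives the same value after rounding to one decimal place.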

  15. Citalopram and escitalopram plasma drug and metabolite concentrations: genome-wide associations

    PubMed Central

    Ji, Yuan; Schaid, Daniel J; Desta, Zeruesenay; Kubo, Michiaki; Batzler, Anthony J; Snyder, Karen; Mushiroda, Taisei; Kamatani, Naoyuki; Ogburn, Evan; Hall-Flavin, Daniel; Flockhart, David; Nakamura, Yusuke; Mrazek, David A; Weinshilboum, Richard M

    2014-01-01

    Aims Citalopram (CT) and escitalopram (S-CT) are among the most widely prescribed selective serotonin reuptake inhibitors used to treat major depressive disorder (MDD). We applied a genome-wide association study to identify genetic factors that contribute to variation in plasma concentrations of CT or S-CT and their metabolites in MDD patients treated with CT or S-CT. Methods Our genome-wide association study was performed using samples from 435 MDD patients. Linear mixed models were used to account for within-subject correlations of longitudinal measures of plasma drug/metabolite concentrations (4 and 8 weeks after the initiation of drug therapy), and single-nucleotide polymorphisms (SNPs) were modelled as additive allelic effects. Results Genome-wide significant associations were observed for S-CT concentration with SNPs in or near the CYP2C19 gene on chromosome 10 (rs1074145, P = 4.1 × 10^-9) and with S-didesmethylcitalopram concentration for SNPs near the CYP2D6 locus on chromosome 22 (rs1065852, P = 2.0 × 10^-16), supporting the important role of these cytochrome P450 (CYP) enzymes in biotransformation of citalopram. After adjustment for the effect of CYP2C19 functional alleles, the analyses also identified novel loci that will require future replication and functional validation. Conclusions In vitro and in vivo studies have suggested that the biotransformation of CT to monodesmethylcitalopram and didesmethylcitalopram is mediated by CYP isozymes. The results of our genome-wide association study performed in MDD patients treated with CT or S-CT have confirmed those observations but also identified novel genomic loci that might play a role in variation in plasma levels of CT or its metabolites during the treatment of MDD patients with these selective serotonin reuptake inhibitors. PMID:24528284

  16. Citalopram and escitalopram plasma drug and metabolite concentrations: genome-wide associations.

    PubMed

    Ji, Yuan; Schaid, Daniel J; Desta, Zeruesenay; Kubo, Michiaki; Batzler, Anthony J; Snyder, Karen; Mushiroda, Taisei; Kamatani, Naoyuki; Ogburn, Evan; Hall-Flavin, Daniel; Flockhart, David; Nakamura, Yusuke; Mrazek, David A; Weinshilboum, Richard M

    2014-08-01

    Citalopram (CT) and escitalopram (S-CT) are among the most widely prescribed selective serotonin reuptake inhibitors used to treat major depressive disorder (MDD). We applied a genome-wide association study to identify genetic factors that contribute to variation in plasma concentrations of CT or S-CT and their metabolites in MDD patients treated with CT or S-CT. Our genome-wide association study was performed using samples from 435 MDD patients. Linear mixed models were used to account for within-subject correlations of longitudinal measures of plasma drug/metabolite concentrations (4 and 8 weeks after the initiation of drug therapy), and single-nucleotide polymorphisms (SNPs) were modelled as additive allelic effects. Genome-wide significant associations were observed for S-CT concentration with SNPs in or near the CYP2C19 gene on chromosome 10 (rs1074145, P = 4.1 × 10^-9) and with S-didesmethylcitalopram concentration for SNPs near the CYP2D6 locus on chromosome 22 (rs1065852, P = 2.0 × 10^-16), supporting the important role of these cytochrome P450 (CYP) enzymes in biotransformation of citalopram. After adjustment for the effect of CYP2C19 functional alleles, the analyses also identified novel loci that will require future replication and functional validation. In vitro and in vivo studies have suggested that the biotransformation of CT to monodesmethylcitalopram and didesmethylcitalopram is mediated by CYP isozymes. The results of our genome-wide association study performed in MDD patients treated with CT or S-CT have confirmed those observations but also identified novel genomic loci that might play a role in variation in plasma levels of CT or its metabolites during the treatment of MDD patients with these selective serotonin reuptake inhibitors. © 2014 The British Pharmacological Society.

  17. A genome-wide approach to children's aggressive behavior: The EAGLE consortium.

    PubMed

    Pappa, Irene; St Pourcain, Beate; Benke, Kelly; Cavadino, Alana; Hakulinen, Christian; Nivard, Michel G; Nolte, Ilja M; Tiesler, Carla M T; Bakermans-Kranenburg, Marian J; Davies, Gareth E; Evans, David M; Geoffroy, Marie-Claude; Grallert, Harald; Groen-Blokhuis, Maria M; Hudziak, James J; Kemp, John P; Keltikangas-Järvinen, Liisa; McMahon, George; Mileva-Seitz, Viara R; Motazedi, Ehsan; Power, Christine; Raitakari, Olli T; Ring, Susan M; Rivadeneira, Fernando; Rodriguez, Alina; Scheet, Paul A; Seppälä, Ilkka; Snieder, Harold; Standl, Marie; Thiering, Elisabeth; Timpson, Nicholas J; Veenstra, René; Velders, Fleur P; Whitehouse, Andrew J O; Smith, George Davey; Heinrich, Joachim; Hypponen, Elina; Lehtimäki, Terho; Middeldorp, Christel M; Oldehinkel, Albertine J; Pennell, Craig E; Boomsma, Dorret I; Tiemeier, Henning

    2016-07-01

    Individual differences in aggressive behavior emerge in early childhood and predict persisting behavioral problems and disorders. Studies of antisocial and severe aggression in adulthood indicate substantial underlying biology. However, little attention has been given to genome-wide approaches to aggressive behavior in children. We analyzed data from nine population-based studies and assessed aggressive behavior using well-validated parent-reported questionnaires. This is the largest sample exploring children's aggressive behavior to date (N = 18,988), with measures in two developmental stages (N = 15,668 early childhood and N = 16,311 middle childhood/early adolescence). First, we estimated the additive genetic variance of children's aggressive behavior based on genome-wide SNP information, using genome-wide complex trait analysis (GCTA). Second, genetic associations within each study were assessed using a quasi-Poisson regression approach, capturing the highly right-skewed distribution of aggressive behavior. Third, we performed meta-analyses of genome-wide associations for both the total age-mixed sample and the two developmental stages. Finally, we performed a gene-based test using the summary statistics of the total sample. GCTA quantified variance tagged by common SNPs (10-54%). The meta-analysis of the total sample identified one region in chromosome 2 (2p12) at near genome-wide significance (top SNP rs11126630, P = 5.30 × 10^-8). The separate meta-analyses of the two developmental stages revealed suggestive evidence of association at the same locus. The gene-based analysis indicated association of variation within AVPR1A with aggressive behavior. We conclude that common variants at 2p12 show suggestive evidence for association with childhood aggression. Replication of these initial findings is needed, and further studies should clarify its biological meaning. © 2015 Wiley Periodicals, Inc.
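
    The wording "near genome-wide significance" in the record above can be illustrated with a small check. This is only a sketch: it assumes the conventional threshold of P < 5e-8 for common-variant GWAS, which the abstract does not state explicitly, and the function name is hypothetical.

```python
# Conventional genome-wide significance threshold for common-variant GWAS.
# Assumption: the abstract does not state which threshold the authors applied.
GENOME_WIDE_ALPHA = 5e-8


def is_genome_wide_significant(p_value, alpha=GENOME_WIDE_ALPHA):
    """Return True when a P value falls below the chosen threshold."""
    return p_value < alpha


# Top SNP reported in the abstract (rs11126630).
top_snp_p = 5.30e-8
print(is_genome_wide_significant(top_snp_p))   # just above the threshold
print(top_snp_p / GENOME_WIDE_ALPHA)           # how far above, as a ratio
```

    Under this assumption the reported P value sits slightly above the cut-off, which is consistent with the abstract describing the signal as suggestive rather than significant.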

  18. Genome-wide comparisons of phylogenetic similarities between partial genomic regions and the full-length genome in Hepatitis E virus genotyping.

    PubMed

    Wang, Shuai; Wei, Wei; Luo, Xuenong; Cai, Xuepeng

    2014-01-01

    Besides the complete genome, different partial genomic sequences of Hepatitis E virus (HEV) have been used in genotyping studies, making it difficult to compare the results based on them. No commonly agreed partial region for HEV genotyping has been determined. In this study, we used a statistical method to evaluate the phylogenetic performance of each partial genomic sequence on a genome-wide scale, comparing evolutionary distances between genomic regions and the full-length genomes of 101 HEV isolates to identify short genomic regions that can reproduce HEV genotype assignments based on full-length genomes. Several genomic regions, especially one genomic region at the 3'-terminal of the papain-like cysteine protease domain, were detected to have relatively high phylogenetic correlations with the full-length genome. Phylogenetic analyses confirmed the identical performances between these regions and the full-length genome in genotyping, in which the HEV isolates involved could be divided into reasonable genotypes. This analysis may be of value in developing a partial sequence-based consensus classification of HEV species.

  19. Fine mapping of the chromosome 10q11-q21 linkage region in Alzheimer's disease cases and controls.

    PubMed

    Fallin, Margaret Daniele; Szymanski, Megan; Wang, Ruihua; Gherman, Adrian; Bassett, Susan S; Avramopoulos, Dimitrios

    2010-07-01

    We have previously reported strong linkage on chromosome 10q in pedigrees transmitting Alzheimer's disease through the mother, overlapping with many significant linkage reports including the largest reported study. Here, we report the most comprehensive fine mapping of this region to date. In a sample of 638 late-onset Alzheimer's disease (LOAD) cases and controls including 104 maternal LOAD cases, we genotyped 3,884 single nucleotide polymorphisms (SNPs) covering 15.2 Mb. We then used imputations and publicly available data to generate an extended dataset including 4,329 SNPs for 1,209 AD cases and 839 controls in the same region. Further, we screened eight genes in this region for rare alleles in 283 individuals by nucleotide sequencing, and we tested for possible monoallelic expression as it might underlie our maternal parent of origin linkage. We excluded the possibility of multiple rare coding risk variants for these genes and monoallelic expression when we could test for it. One SNP, rs10824310 in the PRKG1 gene, showed study-wide significant association without a parent of origin effect, but the effect size estimate is not of sufficient magnitude to explain the linkage, and no association is observed in an independent genome-wide association study (GWAS) report. Further, no causative variants were identified through sequencing. Analysis of cases with maternal disease origin pointed to a few regions of interest that included the genes PRKG1 and PCDH15 and an intergenic interval of 200 Kb. It is likely that non-transcribed rare variants or other mechanisms involving these genomic regions underlie the observed linkage and parent of origin effect. Acquiring additional support and clarifying the mechanisms of such involvement is important for AD and other complex disorder genetics research.

  20. GUIDE-Seq enables genome-wide profiling of off-target cleavage by CRISPR-Cas nucleases

    PubMed Central

    Nguyen, Nhu T.; Liebers, Matthew; Topkar, Ved V.; Thapar, Vishal; Wyvekens, Nicolas; Khayter, Cyd; Iafrate, A. John; Le, Long P.; Aryee, Martin J.; Joung, J. Keith

    2014-01-01

    CRISPR RNA-guided nucleases (RGNs) are widely used genome-editing reagents, but methods to delineate their genome-wide off-target cleavage activities have been lacking. Here we describe an approach for global detection of DNA double-stranded breaks (DSBs) introduced by RGNs and potentially other nucleases. This method, called Genome-wide Unbiased Identification of DSBs Enabled by Sequencing (GUIDE-Seq), relies on capture of double-stranded oligodeoxynucleotides into breaks. Application of GUIDE-Seq to thirteen RGNs in two human cell lines revealed wide variability in RGN off-target activities and unappreciated characteristics of off-target sequences. The majority of identified sites were not detected by existing computational methods or ChIP-Seq. GUIDE-Seq also identified RGN-independent genomic breakpoint ‘hotspots’. Finally, GUIDE-Seq revealed that truncated guide RNAs exhibit substantially reduced RGN-induced off-target DSBs. Our experiments define the most rigorous framework for genome-wide identification of RGN off-target effects to date and provide a method for evaluating the safety of these nucleases prior to clinical use. PMID:25513782

  1. First High-Density Linkage Map and Single Nucleotide Polymorphisms Significantly Associated With Traits of Economic Importance in Yellowtail Kingfish Seriola lalandi.

    PubMed

    Nguyen, Nguyen H; Rastas, Pasi M A; Premachandra, H K A; Knibb, Wayne

    2018-01-01

    The genetic resources available for the commercially important fish species Yellowtail kingfish (YTK) (Seriola lalandi) are relatively sparse. To overcome this, we aimed (1) to develop a linkage map for this species, and (2) to identify markers/variants associated with economically important traits in kingfish (with an emphasis on body weight). Genetic and genomic analyses were conducted using 13,898 single nucleotide polymorphisms (SNPs) generated from a new high-throughput genotyping by sequencing platform, Diversity Arrays Technology (DArTseq™), in a pedigreed population comprising 752 animals. The linkage analysis mapped about 4,000 markers to 24 linkage groups (LGs), with an average density of 3.4 SNPs per cM. The linkage map was integrated into a genome-wide association study (GWAS), which identified six variants/SNPs associated with body weight (P < 5e-8) when a multi-locus mixed model was used. Two of the six significant markers were mapped to LGs 17 and 23, and collectively they explained 5.8% of the total genetic variance. It is concluded that the newly developed linkage map and the markers significantly associated with body weight provide fundamental information to characterize the genetic architecture of growth-related traits in this population of YTK S. lalandi.

  2. Breed-Specific Ancestry Studies and Genome-Wide Association Analysis Highlight an Association Between the MYH9 Gene and Heat Tolerance in Alaskan Sprint Racing Sled Dogs

    PubMed Central

    Huson, Heather J.; vonHoldt, Bridgett M.; Rimbault, Maud; Byers, Alexandra M.; Runstadler, Jonathan A.; Parker, Heidi G.; Ostrander, Elaine A.

    2012-01-01

    Alaskan sled dogs are a genetically distinct population shaped by generations of selective interbreeding with purebred dogs to create a group of high performance athletes. As a result of selective breeding strategies, sled dogs present a unique opportunity to employ admixture-mapping techniques to investigate how breed composition and trait selection impact genomic structure. We used admixture mapping to investigate genetic ancestry across the genomes of two classes of sled dogs, sprint and long distance racers, and combined that with genome wide association studies (GWAS) to identify regions correlating with performance enhancing traits. The sled dog genome is enhanced by differential contributions from four non-admixed breeds (Alaskan Malamute, Siberian Husky, German Shorthaired Pointer, and Borzoi). A principal components analysis (PCA) of 115,000 genome-wide SNPs clearly resolved the sprint and distance populations as distinct genetic groups, with longer blocks of linkage disequilibrium (LD) observed in the distance versus sprint dogs (7.5–10 and 2.5–3.75 kb, respectively). Further, we identified eight regions with the genomic signal either from a selective sweep or an association analysis, corroborated by an excess of ancestry when comparing sprint and distance dogs. A comparison of elite and poor performing sled dogs identified a single region significantly associated with heat tolerance. Within the region we identified seven SNPs within the myosin heavy chain 9 gene (MYH9) that were significantly associated with heat tolerance in sprint dogs, two of which correspond to conserved promoter and enhancer regions in the human ortholog. PMID:22105876

  3. Breed-specific ancestry studies and genome-wide association analysis highlight an association between the MYH9 gene and heat tolerance in Alaskan sprint racing sled dogs.

    PubMed

    Huson, Heather J; vonHoldt, Bridgett M; Rimbault, Maud; Byers, Alexandra M; Runstadler, Jonathan A; Parker, Heidi G; Ostrander, Elaine A

    2012-02-01

    Alaskan sled dogs are a genetically distinct population shaped by generations of selective interbreeding with purebred dogs to create a group of high-performance athletes. As a result of selective breeding strategies, sled dogs present a unique opportunity to employ admixture-mapping techniques to investigate how breed composition and trait selection impact genomic structure. We used admixture mapping to investigate genetic ancestry across the genomes of two classes of sled dogs, sprint and long-distance racers, and combined that with genome-wide association studies (GWAS) to identify regions that correlate with performance-enhancing traits. The sled dog genome is enhanced by differential contributions from four non-admixed breeds (Alaskan Malamute, Siberian Husky, German Shorthaired Pointer, and Borzoi). A principal components analysis (PCA) of 115,000 genome-wide SNPs clearly resolved the sprint and distance populations as distinct genetic groups, with longer blocks of linkage disequilibrium (LD) observed in the distance versus sprint dogs (7.5-10 and 2.5-3.75 kb, respectively). Furthermore, we identified eight regions with the genomic signal from either a selective sweep or an association analysis, corroborated by an excess of ancestry when comparing sprint and distance dogs. A comparison of elite and poor-performing sled dogs identified a single region significantly associated with heat tolerance. Within the region we identified seven SNPs within the myosin heavy chain 9 gene (MYH9) that were significantly associated with heat tolerance in sprint dogs, two of which correspond to conserved promoter and enhancer regions in the human ortholog.

  4. Genome-wide association as a means to understanding the mammary gland

    USDA-ARS's Scientific Manuscript database

    Next-generation sequencing and related technologies have facilitated the creation of enormous public databases that catalogue genomic variation. These databases have facilitated a variety of approaches to discover new genes that regulate normal biology as well as disease. Genome wide association (...

  5. Genome-wide association analysis of milk yield traits in Nordic Red Cattle using imputed whole genome sequence variants.

    PubMed

    Iso-Touru, T; Sahana, G; Guldbrandtsen, B; Lund, M S; Vilkki, J

    2016-03-22

    The Nordic Red Cattle, consisting of three different populations from Finland, Sweden and Denmark, are under a joint breeding value estimation system. The long history of recording of production and health traits offers a great opportunity to study production traits and identify causal variants behind them. In this study, we used whole genome sequence level data from 4280 progeny tested Nordic Red Cattle bulls to scan the genome for loci affecting milk, fat and protein yields. Using a genome-wise significance threshold, regions on Bos taurus chromosomes 5, 14, 23, 25 and 26 were associated with fat yield. Regions on chromosomes 5, 14, 16, 19, 20 and 25 were associated with milk yield and chromosomes 5, 14 and 25 had regions associated with protein yield. Significantly associated variations were found in 227 genes for fat yield, 72 genes for milk yield and 30 genes for protein yield. Ingenuity Pathway Analysis was used to identify networks connecting these genes displaying significant hits. When compared to previously mapped genomic regions associated with fertility, significantly associated variations were found in 5 genes common for fat yield and fertility, thus linking these two traits via biological networks. This is the first time that whole genome sequence data have been used to study genomic regions affecting milk production in the Nordic Red Cattle population. Sequence level data offers the possibility to study quantitative traits in detail but still cannot unambiguously reveal which of the associated variations is causative. Linkage disequilibrium makes it difficult to pinpoint the causative genes and variations. One solution to overcome these difficulties is the identification of the functional gene networks and pathways to reveal important interacting genes as candidates for the observed effects. This information on target genomic regions may be exploited to improve genomic prediction.

  6. Genomewide linkage scan of resting blood pressure: HERITAGE Family Study. Health, Risk Factors, Exercise Training, and Genetics.

    PubMed

    Rice, Treva; Rankinen, Tuomo; Chagnon, Yvon C; Province, Michael A; Pérusse, Louis; Leon, Arthur S; Skinner, James S; Wilmore, Jack H; Bouchard, Claude; Rao, Dabeeru C

    2002-06-01

    The purpose of this study was to search for genomic regions influencing resting systolic (SBP) and diastolic (DBP) blood pressure (BP) in sedentary families (baseline), and for resting BP responses (changes) resulting from a 20-week exercise training intervention (post-training-baseline) in the Health, Risk Factors, Exercise Training, and Genetics (HERITAGE) Family Study. A genome-wide scan was conducted on 317 black individuals from 114 families and 519 white individuals from 99 families using a multipoint variance-components linkage model and a panel of 509 markers. Promising results were primarily, but not exclusively, found in the black families. Linkage evidence (P<0.0023) with baseline BP replicated other studies within a 1-logarithm of odds (LOD) interval on 2p14, 3p26.3, and 12q21.33, and provided new evidence on 3q28, 11q21, and 19p12. Results for several known hypertension genes were less compelling. For response BP, results were not very strong, although markers on 13q11 were mildly suggestive (P<0.01). In conclusion, these HERITAGE data, in conjunction with results from previous genomewide scans, provide a basis for planning future investigations. The major areas warranting further study involve fine mapping to narrow down 3 regions on 2q, 3p, and 12q that may contain "novel" hypertension genes, additional typing of some biological candidate genes to determine whether they are the sources of these and other signals, multilocus investigations to understand how and to what extent some of these candidates may interact, and multivariate studies to characterize any pleiotropy.
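
    The relation between a LOD score and a nominal P value in a variance-components linkage test can be sketched as follows. The sketch assumes the usual asymptotic result that 2·ln(10)·LOD follows a 50:50 mixture of a point mass at zero and a chi-square with 1 degree of freedom; the abstract does not say how its P values were obtained, and the LOD of 1.75 used below is an illustrative value that is not taken from the record. The function name is hypothetical.

```python
import math


def lod_to_p_value(lod):
    """Approximate one-sided P value for a variance-components LOD score.

    Assumption: 2*ln(10)*LOD is distributed as a 50:50 mixture of 0 and
    chi-square(1), so the chi-square(1) upper tail is halved.
    """
    chi_square = 2.0 * math.log(10.0) * lod
    # Upper tail of chi-square with 1 df equals erfc(sqrt(x / 2)).
    upper_tail = math.erfc(math.sqrt(chi_square / 2.0))
    return 0.5 * upper_tail


# Illustrative LOD (not from the abstract) whose P value lies near 0.0023.
print(lod_to_p_value(1.75))
# A LOD of 3, the classical linkage threshold, for comparison.
print(lod_to_p_value(3.0))
```

    Under these assumptions a LOD of about 1.75 gives a P value just under the P<0.0023 cut-off quoted in the abstract, which suggests, but does not confirm, that the cut-off corresponds to that LOD.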

  7. BRAD, the genetics and genomics database for Brassica plants.

    PubMed

    Cheng, Feng; Liu, Shengyi; Wu, Jian; Fang, Lu; Sun, Silong; Liu, Bo; Li, Pingxia; Hua, Wei; Wang, Xiaowu

    2011-10-13

    Brassica species include both vegetable and oilseed crops, which are very important to people's daily lives. The Brassica species also represent an excellent system for studying numerous aspects of plant biology, specifically the analysis of genome evolution following polyploidy, so they are also very important for scientific research. Now that the genome of Brassica rapa has been assembled, it is time for deep mining of the genome data. BRAD, the Brassica database, is a web-based resource focusing on genome scale genetic and genomic data for important Brassica crops. BRAD was built based on the first whole genome sequence and on further data analysis of the Brassica A genome species, Brassica rapa (Chiifu-401-42). It provides datasets, such as the complete genome sequence of B. rapa, which was de novo assembled from Illumina GA II short reads and from BAC clone sequences, predicted genes and associated annotations, non coding RNAs, transposable elements (TE), B. rapa genes orthologous to those in A. thaliana, as well as genetic markers and linkage maps. BRAD offers useful searching and data mining tools, including search across annotation datasets, search for syntenic or non-syntenic orthologs, and search of the flanking regions of a certain target, as well as the tools of BLAST and Gbrowse. BRAD allows users to enter almost any kind of information, such as a B. rapa or A. thaliana gene ID, physical position or genetic marker. BRAD, a new database which focuses on the genetics and genomics of the Brassica plants, has been developed; it aims at helping scientists and breeders to fully and efficiently use the information of genome data of Brassica plants. BRAD will be continuously updated and can be accessed through http://brassicadb.org.

  8. A Genome-Wide Breast Cancer Scan in African Americans

    DTIC Science & Technology

    2011-06-01

    cancer in women of African ancestry. 13 References 1. Easton DF, P.K., Dunning AM, Pharoah PDP, Thompson D, Ballinger DG, et al . Genome...M, Hankinson, SE, et al . A genome-wide association study identifies alleles in FGFR2 associated with risk of sporadic postmenopausal breast cancer...Millikan, R.C. Race, breast cancer subtypes, and survival in the Carolina Breast Cancer Study. Jama 295, 2492-502 ( 2006 ). 16 17. Huo, D., Ikpatt

  9. A second-generation anchored genetic linkage map of the tammar wallaby (Macropus eugenii)

    PubMed Central

    2011-01-01

    Background The tammar wallaby, Macropus eugenii, a small kangaroo used for decades for studies of reproduction and metabolism, is the model Australian marsupial for genome sequencing and genetic investigations. The production of a more comprehensive cytogenetically-anchored genetic linkage map will significantly contribute to the deciphering of the tammar wallaby genome. It has great value as a resource to identify novel genes and for comparative studies, and is vital for the ongoing genome sequence assembly and gene ordering in this species. Results A second-generation anchored tammar wallaby genetic linkage map has been constructed based on a total of 148 loci. The linkage map contains the original 64 loci included in the first-generation map, plus an additional 84 microsatellite loci that were chosen specifically to increase coverage and assist with the anchoring and orientation of linkage groups to chromosomes. These additional loci were derived from (a) sequenced BAC clones that had been previously mapped to tammar wallaby chromosomes by fluorescence in situ hybridization (FISH), (b) End sequence from BACs subsequently FISH-mapped to tammar wallaby chromosomes, and (c) tammar wallaby genes orthologous to opossum genes predicted to fill gaps in the tammar wallaby linkage map as well as three X-linked markers from a published study. Based on these 148 loci, eight linkage groups were formed. These linkage groups were assigned (via FISH-mapped markers) to all seven autosomes and the X chromosome. The sex-pooled map size is 1402.4 cM, which is estimated to provide 82.6% total coverage of the genome, with an average interval distance of 10.9 cM between adjacent markers. The overall ratio of female/male map length is 0.84, which is comparable to the ratio of 0.78 obtained for the first-generation map. Conclusions Construction of this second-generation genetic linkage map is a significant step towards complete coverage of the tammar wallaby genome and considerably

  10. A second-generation anchored genetic linkage map of the tammar wallaby (Macropus eugenii).

    PubMed

    Wang, Chenwei; Webley, Lee; Wei, Ke-jun; Wakefield, Matthew J; Patel, Hardip R; Deakin, Janine E; Alsop, Amber; Marshall Graves, Jennifer A; Cooper, Desmond W; Nicholas, Frank W; Zenger, Kyall R

    2011-08-19

    The tammar wallaby, Macropus eugenii, a small kangaroo used for decades for studies of reproduction and metabolism, is the model Australian marsupial for genome sequencing and genetic investigations. The production of a more comprehensive cytogenetically-anchored genetic linkage map will significantly contribute to the deciphering of the tammar wallaby genome. It has great value as a resource to identify novel genes and for comparative studies, and is vital for the ongoing genome sequence assembly and gene ordering in this species. A second-generation anchored tammar wallaby genetic linkage map has been constructed based on a total of 148 loci. The linkage map contains the original 64 loci included in the first-generation map, plus an additional 84 microsatellite loci that were chosen specifically to increase coverage and assist with the anchoring and orientation of linkage groups to chromosomes. These additional loci were derived from (a) sequenced BAC clones that had been previously mapped to tammar wallaby chromosomes by fluorescence in situ hybridization (FISH), (b) End sequence from BACs subsequently FISH-mapped to tammar wallaby chromosomes, and (c) tammar wallaby genes orthologous to opossum genes predicted to fill gaps in the tammar wallaby linkage map as well as three X-linked markers from a published study. Based on these 148 loci, eight linkage groups were formed. These linkage groups were assigned (via FISH-mapped markers) to all seven autosomes and the X chromosome. The sex-pooled map size is 1402.4 cM, which is estimated to provide 82.6% total coverage of the genome, with an average interval distance of 10.9 cM between adjacent markers. The overall ratio of female/male map length is 0.84, which is comparable to the ratio of 0.78 obtained for the first-generation map. Construction of this second-generation genetic linkage map is a significant step towards complete coverage of the tammar wallaby genome and considerably extends that of the first

  11. Genome-wide expression profiling in pediatric septic shock

    PubMed Central

    Wong, Hector R.

    2013-01-01

    For nearly a decade, our research group has had the privilege of developing and mining a multi-center, microarray-based, genome-wide expression database of critically ill children (≤ 10 years of age) with septic shock. Using bioinformatic and systems biology approaches, the expression data generated through this discovery-oriented, exploratory approach have been leveraged for a variety of objectives, which will be reviewed. Fundamental observations include widespread repression of gene programs corresponding to the adaptive immune system, and biologically significant differential patterns of gene expression across developmental age groups. The data have also identified gene expression-based subclasses of pediatric septic shock having clinically relevant phenotypic differences. The data have also been leveraged for the discovery of novel therapeutic targets, and for the discovery and development of novel stratification and diagnostic biomarkers. Almost a decade of genome-wide expression profiling in pediatric septic shock is now demonstrating tangible results. The studies have progressed from an initial discovery-oriented and exploratory phase, to a new phase where the data are being translated and applied to address several areas of clinical need. PMID:23329198

  12. Construction and analysis of a high-density genetic linkage map in cabbage (Brassica oleracea L. var. capitata)

    PubMed Central

    2012-01-01

    Background Brassica oleracea encompasses a family of vegetables and cabbage that are among the most widely cultivated crops. In 2009, the B. oleracea Genome Sequencing Project was launched using next generation sequencing technology. None of the available maps were detailed enough to anchor the sequence scaffolds for the Genome Sequencing Project. This report describes the development of a large number of SSR and SNP markers from the whole genome shotgun sequence data of B. oleracea, and the construction of a high-density genetic linkage map using a double haploid mapping population. Results The B. oleracea high-density genetic linkage map that was constructed includes 1,227 markers in nine linkage groups spanning a total of 1197.9 cM with an average of 0.98 cM between adjacent loci. There were 602 SSR markers and 625 SNP markers on the map. The chromosome with the highest number of markers (186) was C03, and the chromosome with the smallest number of markers (99) was C09. Conclusions This first high-density map allowed the assembled scaffolds to be anchored to pseudochromosomes. The map also provides useful information for positional cloning, molecular breeding, and integration of information of genes and traits in B. oleracea. All the markers on the map will be transferable and could be used for the construction of other genetic maps. PMID:23033896
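
    The map statistics quoted in the record above can be cross-checked with simple arithmetic. The sketch below uses only figures from the abstract; the two averaging conventions (per marker, and per interval within linkage groups) are assumptions, since the abstract does not say which one was used, and the variable names are hypothetical.

```python
# Figures taken from the abstract.
total_map_length_cm = 1197.9
ssr_markers = 602
snp_markers = 625
linkage_groups = 9

total_markers = ssr_markers + snp_markers  # should match the reported 1,227

# Convention A (assumed): map length divided by the number of markers.
spacing_per_marker = total_map_length_cm / total_markers

# Convention B (assumed): map length divided by the number of intervals,
# where each linkage group with n markers has n - 1 intervals.
intervals = total_markers - linkage_groups
spacing_per_interval = total_map_length_cm / intervals

print(total_markers)
print(round(spacing_per_marker, 2), round(spacing_per_interval, 2))
```

    Both conventions round to the 0.98 cM average reported in the abstract, so the marker counts, the map length and the stated spacing are mutually consistent. Co-located markers would reduce the number of distinct loci, which this sketch does not account for.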

  13. Genome-wide analysis of WRKY gene family in Cucumis sativus

    PubMed Central

    2011-01-01

    Background WRKY proteins are a large family of transcriptional regulators in higher plants. They are involved in many biological processes, such as plant development, metabolism, and responses to biotic and abiotic stresses. Prior to the present study, only one full-length cucumber WRKY protein had been reported. The recent publication of the draft genome sequence of cucumber allowed us to conduct a genome-wide search for cucumber WRKY proteins, and to compare these positively identified proteins with their homologs in model plants, such as Arabidopsis. Results We identified a total of 55 WRKY genes in the cucumber genome. According to structural features of their encoded proteins, the cucumber WRKY (CsWRKY) genes were classified into three groups (group 1-3). Analysis of expression profiles of CsWRKY genes indicated that 48 WRKY genes display differential expression either in their transcript abundance or in their expression patterns under normal growth conditions, and 23 WRKY genes were differentially expressed in response to at least one abiotic stress (cold, drought or salinity). The expression profiles of stress-inducible CsWRKY genes were correlated with those of their putative Arabidopsis WRKY (AtWRKY) orthologs, except for the group 3 WRKY genes. Interestingly, duplicated group 3 AtWRKY genes appear to have been under positive selection pressure during evolution. In contrast, there was no evidence of recent gene duplication or positive selection pressure among CsWRKY group 3 genes, which may have led to the expressional divergence of group 3 orthologs. Conclusions Fifty-five WRKY genes were identified in cucumber and the structure of their encoded proteins, their expression, and their evolution were examined. Considering that there has been extensive expansion of group 3 WRKY genes in angiosperms, the occurrence of different evolutionary events could explain the functional divergence of these genes. PMID:21955985

  14. Genome-wide analysis of WRKY gene family in Cucumis sativus.

    PubMed

    Ling, Jian; Jiang, Weijie; Zhang, Ying; Yu, Hongjun; Mao, Zhenchuan; Gu, Xingfang; Huang, Sanwen; Xie, Bingyan

    2011-09-28

    WRKY proteins are a large family of transcriptional regulators in higher plants. They are involved in many biological processes, such as plant development, metabolism, and responses to biotic and abiotic stresses. Prior to the present study, only one full-length cucumber WRKY protein had been reported. The recent publication of the draft genome sequence of cucumber allowed us to conduct a genome-wide search for cucumber WRKY proteins, and to compare these positively identified proteins with their homologs in model plants, such as Arabidopsis. We identified a total of 55 WRKY genes in the cucumber genome. According to structural features of their encoded proteins, the cucumber WRKY (CsWRKY) genes were classified into three groups (group 1-3). Analysis of expression profiles of CsWRKY genes indicated that 48 WRKY genes display differential expression either in their transcript abundance or in their expression patterns under normal growth conditions, and 23 WRKY genes were differentially expressed in response to at least one abiotic stress (cold, drought or salinity). The expression profiles of stress-inducible CsWRKY genes were correlated with those of their putative Arabidopsis WRKY (AtWRKY) orthologs, except for the group 3 WRKY genes. Interestingly, duplicated group 3 AtWRKY genes appear to have been under positive selection pressure during evolution. In contrast, there was no evidence of recent gene duplication or positive selection pressure among CsWRKY group 3 genes, which may have led to the expressional divergence of group 3 orthologs. Fifty-five WRKY genes were identified in cucumber and the structure of their encoded proteins, their expression, and their evolution were examined. Considering that there has been extensive expansion of group 3 WRKY genes in angiosperms, the occurrence of different evolutionary events could explain the functional divergence of these genes.

  15. A Genome-Wide Association Study for Regulators of Micronucleus Formation in Mice.

    PubMed

    McIntyre, Rebecca E; Nicod, Jérôme; Robles-Espinoza, Carla Daniela; Maciejowski, John; Cai, Na; Hill, Jennifer; Verstraten, Ruth; Iyer, Vivek; Rust, Alistair G; Balmus, Gabriel; Mott, Richard; Flint, Jonathan; Adams, David J

    2016-08-09

    In mammals the regulation of genomic instability plays a key role in tumor suppression and also controls genome plasticity, which is important for recombination during the processes of immunity and meiosis. Most studies to identify regulators of genomic instability have been performed in cells in culture or in systems that report on gross rearrangements of the genome, yet subtle differences in the level of genomic instability can contribute to whole organism phenotypes such as tumor predisposition. Here we performed a genome-wide association study in a population of 1379 outbred Crl:CFW(SW)-US_P08 mice to dissect the genetic landscape of micronucleus formation, a biomarker of chromosomal breaks, whole chromosome loss, and extranuclear DNA. Variation in micronucleus levels is a complex trait with a genome-wide heritability of 53.1%. We identify seven loci influencing micronucleus formation (false discovery rate <5%), and define candidate genes at each locus. Intriguingly at several loci we find evidence for sexual dimorphism in micronucleus formation, with a locus on chromosome 11 being specific to males. Copyright © 2016 McIntyre et al.

  16. Discovering susceptibility genes for allergic rhinitis and allergy using a genome-wide association study strategy.

    PubMed

    Li, Jingyun; Zhang, Yuan; Zhang, Luo

    2015-02-01

    Allergic rhinitis and allergy are complex conditions, in which both genetic and environmental factors contribute to the pathogenesis. Genome-wide association studies (GWASs) employing common single-nucleotide polymorphisms have accelerated the search for novel and interesting genes, and also confirmed the role of some previously described genes which may be involved in the cause of allergic rhinitis and allergy. The aim of this review is to provide an overview of the genetic basis of allergic rhinitis and the associated allergic phenotypes, with particular focus on GWASs. The last decade has been marked by the publication of more than 20 GWASs of allergic rhinitis and the associated allergic phenotypes. Allergic diseases and traits have been shown to share a large number of genetic susceptibility loci, of which IL33/IL1RL1, IL-13-RAD50 and C11orf30/LRRC32 appear to be important for more than two allergic phenotypes. GWASs have further reflected the genetic heterogeneity underlying allergic phenotypes. Large-scale genome-wide association strategies are underway to discover new susceptibility variants for allergic rhinitis and allergic phenotypes. Characterization of the underlying genetics provides us with an insight into the potential targets for future studies and the corresponding interventions.

  17. StereoGene: rapid estimation of genome-wide correlation of continuous or interval feature data.

    PubMed

    Stavrovskaya, Elena D; Niranjan, Tejasvi; Fertig, Elana J; Wheelan, Sarah J; Favorov, Alexander V; Mironov, Andrey A

    2017-10-15

    Genomic features with similar genome-wide distributions are generally hypothesized to be functionally related; for example, colocalization of histones and transcription start sites indicates chromatin regulation of transcription factor activity. Therefore, statistical algorithms to perform spatial, genome-wide correlation among genomic features are required. Here, we propose a method, StereoGene, that rapidly estimates genome-wide correlation among pairs of genomic features. These features may represent high-throughput data mapped to a reference genome or sets of genomic annotations in that reference genome. StereoGene enables correlation of continuous data directly, avoiding data binarization and the subsequent data loss. Correlations are computed among neighboring genomic positions using kernel correlation. Representing the correlation as a function of the genome position, StereoGene outputs the local correlation track as part of the analysis. StereoGene also accounts for confounders such as input DNA by partial correlation. We apply our method to numerous comparisons of ChIP-Seq datasets from the Human Epigenome Atlas and FANTOM CAGE to demonstrate its wide applicability. We observe changes in the correlation between epigenomic features across developmental trajectories of several tissue types consistent with known biology, and find a novel spatial correlation of CAGE clusters with donor splice sites and with poly(A) sites. These analyses provide examples for the broad applicability of StereoGene for regulatory genomics. The StereoGene C++ source code, program documentation, Galaxy integration scripts and examples are available from the project homepage http://stereogene.bioinf.fbb.msu.ru/. Contact: favorov@sensi.org. Supplementary data are available at Bioinformatics online. © The Author 2017. Published by Oxford University Press.

  18. Genetic linkage map and QTL identification for adventitious rooting traits in red gum eucalypts.

    PubMed

    Sumathi, Murugan; Bachpai, Vijaya Kumar Waman; Mayavel, A; Dasgupta, Modhumita Ghosh; Nagarajan, Binai; Rajasugunasekar, D; Sivakumar, Veerasamy; Yasodha, Ramasamy

    2018-05-01

    The eucalypt species Eucalyptus tereticornis and Eucalyptus camaldulensis show tolerance to drought and salinity conditions, respectively, and are widely cultivated in arid and semiarid regions of tropical countries. In this study, a genetic linkage map was developed for the interspecific cross E. tereticornis × E. camaldulensis using a pseudo-testcross strategy with simple sequence repeat (SSR), intersimple sequence repeat (ISSR), and sequence-related amplified polymorphism (SRAP) markers. The consensus genetic map comprised a total of 283 markers (84 SSRs, 94 ISSRs, and 105 SRAP markers) on 11 linkage groups spanning a genetic distance of 1163.4 cM. Blasting the SSR sequences against E. grandis sequences allowed an alignment of 64%, and the average ratio of genetic-to-physical distance was 1.7 Mbp/cM, which strengthens the evidence that a high degree of synteny and colinearity exists among eucalypt genomes. Blast searches also revealed that 37% of SSRs had homologies with genes, which could potentially be used in a variety of downstream applications including candidate gene polymorphism. Quantitative trait loci (QTL) analysis for adventitious rooting traits revealed six QTL for rooting percent and root length on five chromosomes with interval and composite interval mapping. All the QTL explained 12.0-14.7% of the phenotypic variance, showing the involvement of major-effect QTL in adventitious rooting traits. Increasing the density of markers would facilitate the detection of more small-effect QTL and help pinpoint the genes involved in the rooting process.

  19. Genome-wide mapping of mutations at single-nucleotide resolution for protein, metabolic and genome engineering.

    PubMed

    Garst, Andrew D; Bassalo, Marcelo C; Pines, Gur; Lynch, Sean A; Halweg-Edwards, Andrea L; Liu, Rongming; Liang, Liya; Wang, Zhiwen; Zeitoun, Ramsey; Alexander, William G; Gill, Ryan T

    2017-01-01

    Improvements in DNA synthesis and sequencing have underpinned comprehensive assessment of gene function in bacteria and eukaryotes. Genome-wide analyses require high-throughput methods to generate mutations and analyze their phenotypes, but approaches to date have been unable to efficiently link the effects of mutations in coding regions or promoter elements in a highly parallel fashion. We report that CRISPR-Cas9 gene editing in combination with massively parallel oligomer synthesis can enable trackable editing on a genome-wide scale. Our method, CRISPR-enabled trackable genome engineering (CREATE), links each guide RNA to homologous repair cassettes that both edit loci and function as barcodes to track genotype-phenotype relationships. We apply CREATE to site saturation mutagenesis for protein engineering, reconstruction of adaptive laboratory evolution experiments, and identification of stress tolerance and antibiotic resistance genes in bacteria. We provide preliminary evidence that CREATE will work in yeast. We also provide a webtool to design multiplex CREATE libraries.

  20. Genome-Wide Landscapes of Human Local Adaptation in Asia

    PubMed Central

    Lu, Dongsheng; Xu, Shuhua

    2013-01-01

    Genetic studies of human local adaptation have been facilitated greatly by recent advances in high-throughput genotyping and sequencing technologies. However, few studies have investigated local adaptation in Asian populations on a genome-wide scale and with a high geographic resolution. In this study, taking advantage of the dense population coverage in Southeast Asia, which is the part of the world least studied in terms of natural selection, we depicted genome-wide landscapes of local adaptations in 63 Asian populations representing the majority of linguistic and ethnic groups in Asia. Using genome-wide data analysis, we discovered many genes showing signs of local adaptation or natural selection. Notable examples, such as FOXQ1, MAST2, and CDH4, were found to play a role in hair follicle development and human cancer, signal transduction, and tumor repression, respectively. These showed strong indications of natural selection in Philippine Negritos, a group of aboriginal hunter-gatherers living in the Philippines. MTTP, which has associations with metabolic syndrome, body mass index, and insulin regulation, showed a strong signature of selection in Southeast Asians, including Indonesians. Functional annotation analysis revealed that genes and genetic variants underlying natural selection were generally enriched in the functional category of alternative splicing. Specifically, many genes showing significant differences in allele frequency between northern and southern Asian populations were found to be associated with human height and growth and various immune pathways. In summary, this study contributes to the overall understanding of human local adaptation in Asia and has identified both known and novel signatures of natural selection in the human genome. PMID:23349834

  1. Genetic dissection of ozone tolerance in rice (Oryza sativa L.) by a genome-wide association study

    PubMed Central

    Ueda, Yoshiaki; Frimpong, Felix; Qi, Yitao; Matthus, Elsa; Wu, Linbo; Höller, Stefanie; Kraska, Thorsten; Frei, Michael

    2015-01-01

    Tropospheric ozone causes various negative effects on plants and affects the yield and quality of agricultural crops. Here, we report a genome-wide association study (GWAS) in rice (Oryza sativa L.) to determine candidate loci associated with ozone tolerance. A diversity panel consisting of 328 accessions representing all subgroups of O. sativa was exposed to ozone stress at 60 nl l–1 for 7 h every day throughout the growth season, or to control conditions. Averaged over all genotypes, ozone significantly affected biomass-related traits (plant height –1.0%, shoot dry weight –15.9%, tiller number –8.3%, grain weight –9.3%, total panicle weight –19.7%, single panicle weight –5.5%) and biochemical/physiological traits (symptom formation, SPAD value –4.4%, foliar lignin content +3.4%). A wide range of genotypic variance in response to ozone stress was observed in all phenotypes. Association mapping based on more than 30 000 single-nucleotide polymorphism (SNP) markers yielded 16 significant markers throughout the genome by applying a significance threshold of P<0.0001. Furthermore, by determining linkage disequilibrium blocks associated with significant SNPs, we identified a total of 195 candidate genes for these traits. Subsequent sequence analysis revealed a number of novel polymorphisms in two candidate genes for the formation of visible leaf symptoms, a RING and an EREBP gene, both of which are involved in cell death and stress defence reactions. This study demonstrated substantial natural variation of responses to ozone in rice and the possibility of using GWAS in elucidating the genetic factors underlying ozone tolerance. PMID:25371505

  2. Detecting DNA double-stranded breaks in mammalian genomes by linear amplification-mediated high-throughput genome-wide translocation sequencing.

    PubMed

    Hu, Jiazhi; Meyers, Robin M; Dong, Junchao; Panchakshari, Rohit A; Alt, Frederick W; Frock, Richard L

    2016-05-01

    Unbiased, high-throughput assays for detecting and quantifying DNA double-stranded breaks (DSBs) across the genome in mammalian cells will facilitate basic studies of the mechanisms that generate and repair endogenous DSBs. They will also enable more applied studies, such as those to evaluate the on- and off-target activities of engineered nucleases. Here we describe a linear amplification-mediated high-throughput genome-wide sequencing (LAM-HTGTS) method for the detection of genome-wide 'prey' DSBs via their translocation in cultured mammalian cells to a fixed 'bait' DSB. Bait-prey junctions are cloned directly from isolated genomic DNA using LAM-PCR and unidirectionally ligated to bridge adapters; subsequent PCR steps amplify the single-stranded DNA junction library in preparation for Illumina Miseq paired-end sequencing. A custom bioinformatics pipeline identifies prey sequences that contribute to junctions and maps them across the genome. LAM-HTGTS differs from related approaches because it detects a wide range of broken end structures with nucleotide-level resolution. Familiarity with nucleic acid methods and next-generation sequencing analysis is necessary for library generation and data interpretation. LAM-HTGTS assays are sensitive, reproducible, relatively inexpensive, scalable and straightforward to implement with a turnaround time of <1 week.

  3. Sequential strategy to identify a susceptibility gene for schizophrenia: Report of potential linkage on chromosome 22q12-q13.1: Part 1

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Pulver, A.E.; Wolyniec, P.S.; Lasseter, V.K.

    To identify genes responsible for the susceptibility for schizophrenia, and to test the hypothesis that schizophrenia is etiologically heterogeneous, we have studied 39 multiplex families from a systematic sample of schizophrenic patients. Using a complex autosomal dominant model, which considers only those with a diagnosis of schizophrenia or schizoaffective disorder as affected, a random search of the genome for detection of linkage was undertaken. Pairwise linkage analyses suggest a potential linkage (LRH = 34.7 or maximum lod score = 1.54) for one region (22q12-q13.1). Reanalyses, varying parameters in the dominant model, maximized the LRH at 660.7 (maximum lod score 2.82). This finding is of sufficient interest to warrant further investigation through collaborative studies. 72 refs., 5 tabs.
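
    The two LRH/lod pairs quoted in this record are consistent with the lod score being the base-10 logarithm of the likelihood ratio. The minimal sketch below checks that arithmetic; it assumes LRH denotes a likelihood ratio, and the function name is hypothetical rather than taken from the study.

```python
import math


def lod_from_likelihood_ratio(likelihood_ratio: float) -> float:
    """Return the lod score, defined as log10 of a likelihood ratio."""
    if likelihood_ratio <= 0:
        raise ValueError("likelihood ratio must be positive")
    return math.log10(likelihood_ratio)


# Values quoted in the abstract (assumed here to be likelihood ratios).
lod_initial = round(lod_from_likelihood_ratio(34.7), 2)
lod_reanalysis = round(lod_from_likelihood_ratio(660.7), 2)
print(lod_initial, lod_reanalysis)
```

    Rounded to two decimals, both values agree with the lod scores reported in the abstract.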

  4. Genome-wide introgression among distantly related Heliconius butterfly species.

    PubMed

    Zhang, Wei; Dasmahapatra, Kanchon K; Mallet, James; Moreira, Gilson R P; Kronforst, Marcus R

    2016-02-27

    Although hybridization is thought to be relatively rare in animals, the raw genetic material introduced via introgression may play an important role in fueling adaptation and adaptive radiation. The butterfly genus Heliconius is an excellent system to study hybridization and introgression but most studies have focused on closely related species such as H. cydno and H. melpomene. Here we characterize genome-wide patterns of introgression between H. besckei, the only species with a red and yellow banded 'postman' wing pattern in the tiger-striped silvaniform clade, and co-mimetic H. melpomene nanna. We find a pronounced signature of putative introgression from H. melpomene into H. besckei in the genomic region upstream of the gene optix, known to control red wing patterning, suggesting adaptive introgression of wing pattern mimicry between these two distantly related species. At least 39 additional genomic regions show signals of introgression as strong or stronger than this mimicry locus. Gene flow has been on-going, with evidence of gene exchange at multiple time points, and bidirectional, moving from the melpomene to the silvaniform clade and vice versa. The history of gene exchange has also been complex, with contributions from multiple silvaniform species in addition to H. besckei. We also detect a signature of ancient introgression of the entire Z chromosome between the silvaniform and melpomene/cydno clades. Our study provides a genome-wide portrait of introgression between distantly related butterfly species. We further propose a comprehensive and efficient workflow for gene flow identification in genomic data sets.

  5. Natural Allelic Diversity, Genetic Structure and Linkage Disequilibrium Pattern in Wild Chickpea

    PubMed Central

    Kujur, Alice; Das, Shouvik; Badoni, Saurabh; Kumar, Vinod; Singh, Mohar; Bansal, Kailash C.; Tyagi, Akhilesh K.; Parida, Swarup K.

    2014-01-01

    Characterization of natural allelic diversity and understanding the genetic structure and linkage disequilibrium (LD) pattern in wild germplasm accessions by large-scale genotyping of informative microsatellite and single nucleotide polymorphism (SNP) markers is requisite to facilitate chickpea genetic improvement. Large-scale validation and high-throughput genotyping of genome-wide physically mapped 478 genic and genomic microsatellite markers and 380 transcription factor gene-derived SNP markers using gel-based assay, fluorescent dye-labelled automated fragment analyser and matrix-assisted laser desorption ionization-time of flight (MALDI-TOF) mass array have been performed. Outcome revealed their high genotyping success rate (97.5%) and existence of a high level of natural allelic diversity among 94 wild and cultivated Cicer accessions. High intra- and inter-specific polymorphic potential and wider molecular diversity (11–94%) along with a broader genetic base (13–78%) specifically in the functional genic regions of wild accessions was assayed by mapped markers. It suggested their utility in monitoring introgression and transferring target trait-specific genomic (gene) regions from wild to cultivated gene pool for the genetic enhancement. Distinct species/gene pool-wise differentiation, admixed domestication pattern, and differential genome-wide recombination and LD estimates/decay observed in a six structured population of wild and cultivated accessions using mapped markers further signifies their usefulness in chickpea genetics, genomics and breeding. PMID:25222488

  6. A Genetic Linkage Map for Cattle

    PubMed Central

    Bishop, M. D.; Kappes, S. M.; Keele, J. W.; Stone, R. T.; Sunden, SLF.; Hawkins, G. A.; Toldo, S. S.; Fries, R.; Grosz, M. D.; Yoo, J.; Beattie, C. W.

    1994-01-01

    We report the most extensive physically anchored linkage map for cattle produced to date. Three-hundred thirteen genetic markers ordered in 30 linkage groups, anchored to 24 autosomal chromosomes (n = 29), the X and Y chromosomes, four unanchored syntenic groups and two unassigned linkage groups spanning 2464 cM of the bovine genome are summarized. The map also assigns 19 type I loci to specific chromosomes and/or syntenic groups and four cosmid clones containing informative microsatellites to chromosomes 13, 25 and 29 anchoring syntenic groups U11, U7 and U8, respectively. This map provides the skeletal framework prerequisite to development of a comprehensive genetic map for cattle and analysis of economic trait loci (ETL). PMID:7908653

  7. Genome-wide Association Analysis of Kernel Weight in Hard Winter Wheat

    USDA-ARS's Scientific Manuscript database

    Wheat kernel weight is an important and heritable component of wheat grain yield and a key predictor of flour extraction. Genome-wide association analysis was conducted to identify genomic regions associated with kernel weight and kernel weight environmental response in 8 trials of 299 hard winter ...

  8. The perennial ryegrass GenomeZipper: targeted use of genome resources for comparative grass genomics.

    PubMed

    Pfeifer, Matthias; Martis, Mihaela; Asp, Torben; Mayer, Klaus F X; Lübberstedt, Thomas; Byrne, Stephen; Frei, Ursula; Studer, Bruno

    2013-02-01

    Whole-genome sequences established for model and major crop species constitute a key resource for advanced genomic research. For outbreeding forage and turf grass species like ryegrasses (Lolium spp.), such resources have yet to be developed. Here, we present a model of the perennial ryegrass (Lolium perenne) genome on the basis of conserved synteny to barley (Hordeum vulgare) and the model grass genome Brachypodium (Brachypodium distachyon) as well as rice (Oryza sativa) and sorghum (Sorghum bicolor). A transcriptome-based genetic linkage map of perennial ryegrass served as a scaffold to establish the chromosomal arrangement of syntenic genes from model grass species. This scaffold revealed a high degree of synteny and macrocollinearity and was then utilized to anchor a collection of perennial ryegrass genes in silico to their predicted genome positions. This resulted in the unambiguous assignment of 3,315 out of 8,876 previously unmapped genes to the respective chromosomes. In total, the GenomeZipper incorporates 4,035 conserved grass gene loci, which were used for the first genome-wide sequence divergence analysis between perennial ryegrass, barley, Brachypodium, rice, and sorghum. The perennial ryegrass GenomeZipper is an ordered, information-rich genome scaffold, facilitating map-based cloning and genome assembly in perennial ryegrass and closely related Poaceae species. It also represents a milestone in describing synteny between perennial ryegrass and fully sequenced model grass genomes, thereby increasing our understanding of genome organization and evolution in the most important temperate forage and turf grass species.

  9. In Search of Genes Associated with Risk for Psychopathic Tendencies in Children: A Two-Stage Genome-Wide Association Study of Pooled DNA

    ERIC Educational Resources Information Center

    Viding, Essi; Hanscombe, Ken B.; Curtis, Charles J. C.; Davis, Oliver S. P.; Meaburn, Emma L.; Plomin, Robert

    2010-01-01

    Background: Quantitative genetic data from our group indicates that antisocial behaviour (AB) is strongly heritable when coupled with psychopathic, callous-unemotional (CU) personality traits. We have also demonstrated that the genetic influences for AB and CU overlap considerably. We conducted a genome-wide association scan that capitalises on…

  10. Genome-wide cross-amplification of domestic sheep microsatellites in bighorn sheep and mountain goats.

    PubMed

    Poissant, J; Shafer, A B A; Davis, C S; Mainguy, J; Hogg, J T; Côté, S D; Coltman, D W

    2009-07-01

    We tested for cross-species amplification of microsatellite loci located throughout the domestic sheep (Ovis aries) genome in two North American mountain ungulates (bighorn sheep, Ovis canadensis, and mountain goats, Oreamnos americanus). We identified 247 new polymorphic markers in bighorn sheep (≥ 3 alleles in one of two study populations) and 149 in mountain goats (≥ 2 alleles in a single study population) using 648 and 576 primer pairs, respectively. Our efforts increased the number of available polymorphic microsatellite markers to 327 for bighorn sheep and 180 for mountain goats. The average distance between successive polymorphic bighorn sheep and mountain goat markers inferred from the Australian domestic sheep genome linkage map (mean ± 1 SD) was 11.9 ± 9.2 and 15.8 ± 13.8 centimorgans, respectively. The development of genomic resources in these wildlife species enables future studies of the genetic architecture of trait variation. © 2009 Blackwell Publishing Ltd.

  11. A Consensus Genetic Map for Pinus taeda and Pinus elliottii and Extent of Linkage Disequilibrium in Two Genotype-Phenotype Discovery Populations of Pinus taeda

    PubMed Central

    Westbrook, Jared W.; Chhatre, Vikram E.; Wu, Le-Shin; Chamala, Srikar; Neves, Leandro Gomide; Muñoz, Patricio; Martínez-García, Pedro J.; Neale, David B.; Kirst, Matias; Mockaitis, Keithanne; Nelson, C. Dana; Peter, Gary F.; Echt, Craig S.

    2015-01-01

    A consensus genetic map for Pinus taeda (loblolly pine) and Pinus elliottii (slash pine) was constructed by merging three previously published P. taeda maps with a map from a pseudo-backcross between P. elliottii and P. taeda. The consensus map positioned 3856 markers via genotyping of 1251 individuals from four pedigrees. It is the densest linkage map for a conifer to date. Average marker spacing was 0.6 cM and total map length was 2305 cM. Functional predictions of mapped genes were improved by aligning expressed sequence tags used for marker discovery to full-length P. taeda transcripts. Alignments to the P. taeda genome mapped 3305 scaffold sequences onto 12 linkage groups (LGs). The consensus genetic map was used to compare the genome-wide linkage disequilibrium (LD) in a population of distantly related P. taeda individuals (ADEPT2) used for association genetic studies and a multiple-family pedigree used for genomic selection (CCLONES). The prevalence and extent of LD was greater in CCLONES as compared to ADEPT2; however, extended LD within LGs or between LGs was rare in both populations. The average squared correlations, r², between SNP alleles less than 1 cM apart were less than 0.05 in both populations and r² did not decay substantially with genetic distance. The consensus map and analysis of linkage disequilibrium establish a foundation for comparative association mapping and genomic selection in P. taeda and P. elliottii. PMID:26068575
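
    The squared allelic correlation used above as the LD measure can be illustrated for a single pair of biallelic loci. This is a minimal sketch of the textbook haplotype-frequency definition; the haplotype counts and the function name are hypothetical, and the study may have estimated the statistic differently (for example, from unphased genotypes).

```python
def ld_r_squared(n_AB: int, n_Ab: int, n_aB: int, n_ab: int) -> float:
    """Squared correlation between alleles at two biallelic loci.

    Arguments are counts of the four haplotypes (AB, Ab, aB, ab).
    """
    total = n_AB + n_Ab + n_aB + n_ab
    if total == 0:
        raise ValueError("at least one haplotype is required")
    p_AB = n_AB / total            # frequency of the AB haplotype
    p_A = (n_AB + n_Ab) / total    # frequency of allele A at locus 1
    p_B = (n_AB + n_aB) / total    # frequency of allele B at locus 2
    denominator = p_A * (1 - p_A) * p_B * (1 - p_B)
    if denominator == 0:
        raise ValueError("both loci must be polymorphic")
    d = p_AB - p_A * p_B           # disequilibrium coefficient D
    return d * d / denominator


# Hypothetical counts: complete LD, no LD, and an intermediate case.
r2_complete = ld_r_squared(50, 0, 0, 50)
r2_none = ld_r_squared(25, 25, 25, 25)
r2_partial = ld_r_squared(40, 10, 10, 40)
print(r2_complete, r2_none, round(r2_partial, 2))
```

    Values near 0, such as the averages below 0.05 reported in this record, indicate that alleles at the two loci are close to statistically independent.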

  12. Comparison of linkage analysis methods for genome-wide scanning of extended pedigrees, with application to the TG/HDL-C ratio in the Framingham Heart Study.

    PubMed

    Horne, Benjamin D; Malhotra, Alka; Camp, Nicola J

    2003-12-31

    High triglycerides (TG) and low high-density lipoprotein cholesterol (HDL-C) jointly increase coronary disease risk. We performed linkage analysis for TG/HDL-C ratio in the Framingham Heart Study data as a quantitative trait, using methods implemented in LINKAGE, GENEHUNTER (GH), MCLINK, and SOLAR. Results were compared to each other and to those from a previous evaluation using SOLAR for TG/HDL-C ratio on this sample. We also investigated linked pedigrees in each region using by-pedigree analysis. Fourteen regions with at least suggestive linkage evidence were identified, including some that may increase and some that may decrease coronary risk. Ten of the 14 regions were identified by more than one analysis, and several of these regions were not previously detected. The best regions identified for each method were on chromosomes 2 (LOD = 2.29, MCLINK), 5 (LOD = 2.65, GH), 7 (LOD = 2.67, SOLAR), and 22 (LOD = 3.37, LINKAGE). By-pedigree multi-point LOD values in MCLINK showed linked pedigrees for all five regions, ranging from 3 linked pedigrees (chromosome 5) to 14 linked pedigrees (chromosome 7), and suggested localizations of between 9 cM and 27 cM in size. Reasonable concordance was found across analysis methods. No single method identified all regions, either by full sample LOD or with by-pedigree analysis. Concordance across methods appeared better at the pedigree level, with many regions showing by-pedigree support in MCLINK when no evidence was observed in the full sample. Thus, investigating by-pedigree linkage evidence may provide a useful tool for evaluating linkage regions.

  13. Meta-Analysis in Genome-Wide Association Datasets: Strategies and Application in Parkinson Disease

    PubMed Central

    Evangelou, Evangelos; Maraganore, Demetrius M.; Ioannidis, John P.A.

    2007-01-01

    Background: Genome-wide association studies hold substantial promise for identifying common genetic variants that regulate susceptibility to complex diseases. However, for the detection of small genetic effects, single studies may be underpowered. Power may be improved by combining genome-wide datasets with meta-analytic techniques. Methodology/Principal Findings: Both single and two-stage genome-wide data may be combined and there are several possible strategies. In the two-stage framework, we considered the options of (1) enhancement of replication data and (2) enhancement of first-stage data, and then, we also considered (3) joint meta-analyses including all first-stage and second-stage data. These strategies were examined empirically using data from two genome-wide association studies (three datasets) on Parkinson disease. In the three strategies, we derived 12, 5, and 49 single nucleotide polymorphisms that show significant associations at conventional levels of statistical significance. None of these remained significant after conservative adjustment for the number of performed analyses in each strategy. However, some may warrant further consideration: 6 SNPs were identified with at least 2 of the 3 strategies and 3 SNPs [rs1000291 on chromosome 3, rs2241743 on chromosome 4 and rs3018626 on chromosome 11] were identified with all 3 strategies and had no or minimal between-dataset heterogeneity (I² = 0, 0 and 15%, respectively). Analyses were primarily limited by the suboptimal overlap of tested polymorphisms across different datasets (e.g., only 31,192 shared polymorphisms between the two tier 1 datasets). Conclusions/Significance: Meta-analysis may be used to improve the power and examine the between-dataset heterogeneity of genome-wide association studies. Prospective designs may be most efficient, if they try to maximize the overlap of genotyping platforms and anticipate the combination of data across many genome-wide association studies. PMID:17332845
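
    The between-dataset heterogeneity statistic quoted above is conventionally derived from Cochran's Q under inverse-variance weighting. The sketch below shows that standard calculation; the effect sizes and standard errors are hypothetical illustrations, not data from the Parkinson disease datasets, and the function name is an assumption.

```python
def i_squared(effects, std_errors):
    """Return I^2 (in percent) from per-dataset effects and standard errors.

    Uses fixed-effect inverse-variance weights to compute Cochran's Q,
    then I^2 = max(0, (Q - df) / Q) * 100 with df = k - 1.
    """
    if len(effects) != len(std_errors) or len(effects) < 2:
        raise ValueError("need matching lists with at least two datasets")
    weights = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))
    df = len(effects) - 1
    if q <= 0:
        return 0.0  # identical estimates: no observed heterogeneity
    return max(0.0, (q - df) / q) * 100.0


# Hypothetical log odds ratios from three datasets.
homogeneous = i_squared([0.20, 0.20, 0.20], [0.05, 0.08, 0.10])
heterogeneous = i_squared([0.0, 1.0], [0.1, 0.1])
print(homogeneous, round(heterogeneous, 1))
```

    An I² of 0% means the observed spread between datasets is no larger than expected from sampling error alone, which is the situation reported for two of the three SNPs.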

  14. A microsatellite-based consensus linkage map for species of Eucalyptus and a novel set of 230 microsatellite markers for the genus

    PubMed Central

    Brondani, Rosana PV; Williams, Emlyn R; Brondani, Claudio; Grattapaglia, Dario

    2006-01-01

    Background: Eucalypts are the most widely planted hardwood trees in the world, occupying globally more than 18 million hectares as an important source of carbon neutral renewable energy and raw material for pulp, paper and solid wood. Quantitative Trait Loci (QTLs) in Eucalyptus have been localized on pedigree-specific RAPD or AFLP maps, seriously limiting the value of such QTL mapping efforts for molecular breeding. The availability of a genus-wide genetic map with transferable microsatellite markers has become a must for the effective advancement of genomic undertakings. This report describes the development of a novel set of 230 EMBRA microsatellites, the construction of the first comprehensive microsatellite-based consensus linkage map for Eucalyptus and the consolidation of existing linkage information for other microsatellites and candidate genes mapped in other species of the genus. Results: The consensus map covers ~90% of the recombining genome of Eucalyptus, involves 234 mapped EMBRA loci on 11 linkage groups, an observed length of 1,568 cM and a mean distance between markers of 8.4 cM. A compilation of all microsatellite linkage information published in Eucalyptus allowed us to establish the homology among linkage groups between this consensus map and other maps published for E. globulus. Comparative mapping analyses also resulted in the linkage group assignment of 41 other microsatellites derived from other Eucalyptus species as well as candidate genes and QTLs for wood and flowering traits published in the literature. This report significantly increases the availability of microsatellite markers and mapping information for species of Eucalyptus and corroborates the high conservation of microsatellite flanking sequences and locus ordering between species of the genus. Conclusion: This work represents an important step forward for Eucalyptus comparative genomics, opening stimulating perspectives for evolutionary studies and molecular breeding applications.

  15. A genome-wide association study of chronic obstructive pulmonary disease in Hispanics.

    PubMed

    Chen, Wei; Brehm, John M; Manichaikul, Ani; Cho, Michael H; Boutaoui, Nadia; Yan, Qi; Burkart, Kristin M; Enright, Paul L; Rotter, Jerome I; Petersen, Hans; Leng, Shuguang; Obeidat, Ma'en; Bossé, Yohan; Brandsma, Corry-Anke; Hao, Ke; Rich, Stephen S; Powell, Rhea; Avila, Lydiana; Soto-Quiros, Manuel; Silverman, Edwin K; Tesfaigzi, Yohannes; Barr, R Graham; Celedón, Juan C

    2015-03-01

    Genome-wide association studies (GWAS) of chronic obstructive pulmonary disease (COPD) have identified disease-susceptibility loci, mostly in subjects of European descent. We hypothesized that by studying Hispanic populations we would be able to identify unique loci that contribute to COPD pathogenesis in Hispanics but remain undetected in GWAS of non-Hispanic populations. We conducted a metaanalysis of two GWAS of COPD in independent cohorts of Hispanics in Costa Rica and the United States (Multi-Ethnic Study of Atherosclerosis [MESA]). We performed a replication study of the top single-nucleotide polymorphisms in an independent Hispanic cohort in New Mexico (the Lovelace Smokers Cohort). We also attempted to replicate prior findings from genome-wide studies in non-Hispanic populations in Hispanic cohorts. We found no genome-wide significant association with COPD in our metaanalysis of Costa Rica and MESA. After combining the top results from this metaanalysis with those from our replication study in the Lovelace Smokers Cohort, we identified two single-nucleotide polymorphisms approaching genome-wide significance for an association with COPD. The first (rs858249, combined P value = 6.1 × 10^-8) is near the genes KLHL7 and NUPL2 on chromosome 7. The second (rs286499, combined P value = 8.4 × 10^-8) is located in an intron of DLG2. The two most significant single-nucleotide polymorphisms in FAM13A from a previous genome-wide study in non-Hispanics were associated with COPD in Hispanics. We have identified two novel loci (in or near the genes KLHL7/NUPL2 and DLG2) that may play a role in COPD pathogenesis in Hispanic populations.

  16. Improved Statistical Methods Enable Greater Sensitivity in Rhythm Detection for Genome-Wide Data

    PubMed Central

    Hutchison, Alan L.; Maienschein-Cline, Mark; Chiang, Andrew H.; Tabei, S. M. Ali; Gudjonson, Herman; Bahroos, Neil; Allada, Ravi; Dinner, Aaron R.

    2015-01-01

    Robust methods for identifying patterns of expression in genome-wide data are important for generating hypotheses regarding gene function. To this end, several analytic methods have been developed for detecting periodic patterns. We improve one such method, JTK_CYCLE, by explicitly calculating the null distribution such that it accounts for multiple hypothesis testing and by including non-sinusoidal reference waveforms. We term this method empirical JTK_CYCLE with asymmetry search, and we compare its performance to JTK_CYCLE with Bonferroni and Benjamini-Hochberg multiple hypothesis testing correction, as well as to five other methods: cyclohedron test, address reduction, stable persistence, ANOVA, and F24. We find that ANOVA, F24, and JTK_CYCLE consistently outperform the other three methods when data are limited and noisy; empirical JTK_CYCLE with asymmetry search gives the greatest sensitivity while controlling for the false discovery rate. Our analysis also provides insight into experimental design and we find that, for a fixed number of samples, better sensitivity and specificity are achieved with higher numbers of replicates than with higher sampling density. Application of the methods to detecting circadian rhythms in a metadataset of microarrays that quantify time-dependent gene expression in whole heads of Drosophila melanogaster reveals annotations that are enriched among genes with highly asymmetric waveforms. These include a wide range of oxidation reduction and metabolic genes, as well as genes with transcripts that have multiple splice forms. PMID:25793520
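
    The abstract above compares JTK_CYCLE under Bonferroni and Benjamini-Hochberg multiple hypothesis testing correction. As a minimal sketch of what those two corrections do to a list of p-values (the function names and the example p-values are illustrative assumptions, not taken from the paper, and this is not the empirical null calculation the authors propose):

```python
def bonferroni(p_values):
    """Bonferroni adjustment: multiply each p-value by the number of tests."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so the adjusted values
    # stay monotone in the rank of the raw p-value.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical p-values for five genes (illustration only).
raw_p = [0.001, 0.008, 0.039, 0.041, 0.60]
print("Bonferroni:", bonferroni(raw_p))
print("Benjamini-Hochberg:", benjamini_hochberg(raw_p))
```

    Bonferroni controls the family-wise error rate and is the stricter of the two; Benjamini-Hochberg controls the false discovery rate and usually retains more genes at the same nominal threshold.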

  17. Genotypic variants at 2q33 and risk of esophageal squamous cell carcinoma in China: a meta-analysis of genome-wide association studies

    PubMed Central

    Abnet, Christian C.; Wang, Zhaoming; Song, Xin; Hu, Nan; Zhou, Fu-You; Freedman, Neal D.; Li, Xue-Min; Yu, Kai; Shu, Xiao-Ou; Yuan, Jian-Min; Zheng, Wei; Dawsey, Sanford M.; Liao, Linda M.; Lee, Maxwell P.; Ding, Ti; Qiao, You-Lin; Gao, Yu-Tang; Koh, Woon-Puay; Xiang, Yong-Bing; Tang, Ze-Zhong; Fan, Jin-Hu; Chung, Charles C.; Wang, Chaoyu; Wheeler, William; Yeager, Meredith; Yuenger, Jeff; Hutchinson, Amy; Jacobs, Kevin B.; Giffen, Carol A.; Burdett, Laurie; Fraumeni, Joseph F.; Tucker, Margaret A.; Chow, Wong-Ho; Zhao, Xue-Ke; Li, Jiang-Man; Li, Ai-Li; Sun, Liang-Dan; Wei, Wu; Li, Ji-Lin; Zhang, Peng; Li, Hong-Lei; Cui, Wen-Yan; Wang, Wei-Peng; Liu, Zhi-Cai; Yang, Xia; Fu, Wen-Jing; Cui, Ji-Li; Lin, Hong-Li; Zhu, Wen-Liang; Liu, Min; Chen, Xi; Chen, Jie; Guo, Li; Han, Jing-Jing; Zhou, Sheng-Li; Huang, Jia; Wu, Yue; Yuan, Chao; Huang, Jing; Ji, Ai-Fang; Kul, Jian-Wei; Fan, Zhong-Min; Wang, Jian-Po; Zhang, Dong-Yun; Zhang, Lian-Qun; Zhang, Wei; Chen, Yuan-Fang; Ren, Jing-Li; Li, Xiu-Min; Dong, Jin-Cheng; Xing, Guo-Lan; Guo, Zhi-Gang; Yang, Jian-Xue; Mao, Yi-Ming; Yuan, Yuan; Guo, Er-Tao; Zhang, Wei; Hou, Zhi-Chao; Liu, Jing; Li, Yan; Tang, Sa; Chang, Jia; Peng, Xiu-Qin; Han, Min; Yin, Wan-Li; Liu, Ya-Li; Hu, Yan-Long; Liu, Yu; Yang, Liu-Qin; Zhu, Fu-Guo; Yang, Xiu-Feng; Feng, Xiao-Shan; Wang, Zhou; Li, Yin; Gao, She-Gan; Liu, Hai-Lin; Yuan, Ling; Jin, Yan; Zhang, Yan-Rui; Sheyhidin, Ilyar; Li, Feng; Chen, Bao-Ping; Ren, Shu-Wei; Liu, Bin; Li, Dan; Zhang, Gao-Fu; Yue, Wen-Bin; Feng, Chang-Wei; Qige, Qirenwang; Zhao, Jian-Ting; Yang, Wen-Jun; Lei, Guang-Yan; Chen, Long-Qi; Li, En-Min; Xu, Li-Yan; Wu, Zhi-Yong; Bao, Zhi-Qin; Chen, Ji-Li; Li, Xian-Chang; Zhuang, Xiang; Zhou, Ying-Fa; Zuo, Xian-Bo; Dong, Zi-Ming; Wang, Lu-Wen; Fan, Xue-Pin; Wang, Jin; Zhou, Qi; Ma, Guo-Shun; Zhang, Qin-Xian; Liu, Hai; Jian, Xin-Ying; Lian, Sin-Yong; Wang, Jin-Sheng; Chang, Fu-Bao; Lu, Chang-Dong; Miao, Jian-Jun; Chen, Zhi-Guo; Wang, Ran; Guo, Ming; Fan, Zeng-Lin; Tao, Ping; Liu, Tai-Jing; Wei, Jin-Chang; Kong, Qing-Peng; Fan, Lei; Wang, Xian-Zeng; Gao, Fu-Sheng; Wang, Tian-Yun; Xie, Dong; Wang, Li; Chen, Shu-Qing; Yang, Wan-Cai; Hong, Jun-Yan; Wang, Liang; Qiu, Song-Liang; Goldstein, Alisa M.; Yuan, Zhi-Qing; Chanock, Stephen J.; Zhang, Xue-Jun; Taylor, Philip R.; Wang, Li-Dong

    2012-01-01

    Genome-wide association studies have identified susceptibility loci for esophageal squamous cell carcinoma (ESCC). We conducted a meta-analysis of all single-nucleotide polymorphisms (SNPs) that showed nominally significant P-values in two previously published genome-wide scans that included a total of 2961 ESCC cases and 3400 controls. The meta-analysis revealed five SNPs at 2q33 with P < 5 × 10^-8, and the strongest signal was rs13016963, with a combined odds ratio (95% confidence interval) of 1.29 (1.19–1.40) and P = 7.63 × 10^-10. An imputation analysis of 4304 SNPs at 2q33 suggested a single association signal, and the strongest imputed SNP associations were similar to those from the genotyped SNPs. We conducted an ancestral recombination graph analysis with 53 SNPs to identify one or more haplotypes that harbor the variants directly responsible for the detected association signal. This showed that the five SNPs exist in a single haplotype along with 45 imputed SNPs in strong linkage disequilibrium, and the strongest candidate was rs10201587, one of the genotyped SNPs. Our meta-analysis found genome-wide significant SNPs at 2q33 that map to the CASP8/ALS2CR12/TRAK2 gene region. Variants in CASP8 have been extensively studied across a spectrum of cancers with mixed results. The locus we identified appears to be distinct from the widely studied rs3834129 and rs1045485 SNPs in CASP8. Future studies of esophageal and other cancers should focus on comprehensive sequencing of this 2q33 locus and functional analysis of rs13016963 and rs10201587 and other strongly correlated variants. PMID:22323360
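
    The abstract above reports an odds ratio with a 95% confidence interval and a P value for rs13016963. The following is a rough sketch of how the three quantities relate under a normal approximation on the log odds ratio scale; the function name is hypothetical and the approach is an assumption for illustration, not the authors' stated method. Because the published odds ratio and interval are rounded, the result can only approximate the reported P value.

```python
import math


def z_and_p_from_or_ci(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Approximate z statistic and two-sided P from an OR and its 95% CI.

    Assumes the interval is symmetric on the log scale (Wald interval).
    """
    log_or = math.log(odds_ratio)
    # Half-width of the log-scale interval equals z_crit * standard error.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = log_or / se
    # Two-sided tail probability of the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


# Values as printed in the abstract for rs13016963.
z, p = z_and_p_from_or_ci(1.29, 1.19, 1.40)
print(f"z = {z:.2f}, two-sided P = {p:.2e}")
print("below 5e-8 threshold:", p < 5e-8)
```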

  18. Accuracy of Genomic Prediction in Switchgrass (Panicum virgatum L.) Improved by Accounting for Linkage Disequilibrium

    PubMed Central

    Ramstein, Guillaume P.; Evans, Joseph; Kaeppler, Shawn M.; Mitchell, Robert B.; Vogel, Kenneth P.; Buell, C. Robin; Casler, Michael D.

    2016-01-01

    Switchgrass is a relatively high-yielding and environmentally sustainable biomass crop, but further genetic gains in biomass yield must be achieved to make it an economically viable bioenergy feedstock. Genomic selection (GS) is an attractive technology to generate rapid genetic gains in switchgrass and to meet the goal of a substantial displacement of petroleum use with biofuels in the near future. In this study, we empirically assessed prediction procedures for genomic selection in two different populations, consisting of 137 and 110 half-sib families of switchgrass, tested in two locations in the United States for three agronomic traits: dry matter yield, plant height, and heading date. Marker data were produced for the families’ parents by exome capture sequencing, generating up to 141,030 polymorphic markers with available genomic-location and annotation information. We evaluated prediction procedures that varied not only by learning schemes and prediction models, but also by the way the data were preprocessed to account for redundancy in marker information. More complex genomic prediction procedures were generally not significantly more accurate than the simplest procedure, likely due to limited population sizes. Nevertheless, a highly significant gain in prediction accuracy was achieved by transforming the marker data through a marker correlation matrix. Our results suggest that marker-data transformations and, more generally, accounting for linkage disequilibrium among markers, offer valuable opportunities for improving prediction procedures in GS. Some of the achieved prediction accuracies should motivate implementation of GS in switchgrass breeding programs. PMID:26869619

  19. SNP Discovery and Linkage Map Construction in Cultivated Tomato

    PubMed Central

    Shirasawa, Kenta; Isobe, Sachiko; Hirakawa, Hideki; Asamizu, Erika; Fukuoka, Hiroyuki; Just, Daniel; Rothan, Christophe; Sasamoto, Shigemi; Fujishiro, Tsunakazu; Kishida, Yoshie; Kohara, Mitsuyo; Tsuruoka, Hisano; Wada, Tsuyuko; Nakamura, Yasukazu; Sato, Shusei; Tabata, Satoshi

    2010-01-01

    Few intraspecific genetic linkage maps have been reported for cultivated tomato, mainly because genetic diversity within Solanum lycopersicum is much lower than that between tomato species. Single nucleotide polymorphisms (SNPs), the most abundant source of genomic variation, are the most promising source of polymorphisms for the construction of linkage maps for closely related intraspecific lines. In this study, we developed SNP markers based on expressed sequence tags for the construction of intraspecific linkage maps in tomato. Out of the 5607 SNP positions detected through in silico analysis, 1536 were selected for high-throughput genotyping of two mapping populations derived from crosses between ‘Micro-Tom’ and either ‘Ailsa Craig’ or ‘M82’. A total of 1137 markers, including 793 out of the 1338 successfully genotyped SNPs, along with 344 simple sequence repeat and intronic polymorphism markers, were mapped onto two linkage maps, which covered 1467.8 and 1422.7 cM, respectively. The SNP markers developed were then screened against cultivated tomato lines in order to estimate the transferability of these SNPs to other breeding materials. The molecular markers and linkage maps represent a milestone in tomato genomics and genetics, and are the first step toward molecular breeding of cultivated tomato. Information on the DNA markers, linkage maps, and SNP genotypes for these tomato lines is available at http://www.kazusa.or.jp/tomato/. PMID:21044984

  20. Searching the world wide Web

    PubMed

    Lawrence; Giles

    1998-04-03

    The coverage and recency of the major World Wide Web search engines were analyzed, yielding some surprising results. The coverage of any one engine is significantly limited: no single engine indexes more than about one-third of the "indexable Web," the coverage of the six engines investigated varies by an order of magnitude, and combining the results of the six engines yields about 3.5 times as many documents on average as compared with the results from only one engine. Analysis of the overlap between pairs of engines gives an estimated lower bound on the size of the indexable Web of 320 million pages.
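
    The abstract above derives a size estimate from the overlap between pairs of engines. One common way to turn pairwise overlap into a population estimate is the capture-recapture (Lincoln-Petersen) estimator, sketched below. Whether the authors used exactly this form is an assumption, and the function name and numbers are hypothetical.

```python
def estimate_population(size_a, size_b, overlap):
    """Lincoln-Petersen estimate of total population size.

    size_a, size_b: number of items found by each of two samplers
    overlap: number of items found by both
    """
    if overlap <= 0:
        raise ValueError("overlap must be positive to form an estimate")
    if overlap > min(size_a, size_b):
        raise ValueError("overlap cannot exceed either sample size")
    # If the samplers were independent, overlap / size_b would estimate
    # the fraction of the whole population that sampler A covers.
    return size_a * size_b / overlap


# Hypothetical counts: engine A returns 100 pages, engine B returns 80,
# and 40 pages appear in both result sets.
print(estimate_population(100, 80, 40))
```

    Engines that tend to index the same popular pages overlap more than independent samplers would, which pushes such an estimate downward. That is consistent with the abstract describing its figure as a lower bound.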