Ekine, Chinyere C; Rowe, Suzanne J; Bishop, Stephen C; de Koning, Dirk-Jan
In animal breeding, the genetic potential of an animal is summarized as its estimated breeding value, which is derived from its own performance as well as the performance of related individuals. Here, we illustrate why estimated breeding values are not suitable as a phenotype for genome-wide association studies. We simulated human-type and pig-type pedigrees with a range of quantitative trait loci (QTL) effects (0.5-3% of phenotypic variance) and heritabilities (0.3-0.8). We analyzed 1000 replicates of each scenario with four models: (a) a full mixed model including a polygenic effect, (b) a regression analysis using the residual of a mixed model as a trait score (the so-called GRAMMAR approach), (c) a regression analysis using the estimated breeding value as a trait score, and (d) a regression analysis that uses the raw phenotype as a trait score. We show that using breeding values as a trait score gives very high false-positive rates (up to 14% in human pedigrees and >60% in pig pedigrees). Simulations based on a real pedigree show that additional generations of pedigree increase the type I error. Including the family relationship as a random effect provides the greatest power to detect QTL while controlling for type I error at the desired level and providing the most accurate estimates of the QTL effect. Both the use of residuals and the use of breeding values result in deflated estimates of the QTL effect. We derive the contributions of QTL effects to the breeding value and residual and show how this affects the estimates.
Zhang, Lifan; Zhou, Xiang; Michal, Jennifer J; Ding, Bo; Li, Rui; Jiang, Zhihua
Birth weight is an economically important trait in pig production because it directly impacts piglet growth and survival rate. In the present study, we performed a genome-wide survey of candidate genes and pathways associated with individual birth weight (IBW) using the Illumina PorcineSNP60 BeadChip on 24 high (HEBV) and 24 low estimated breeding value (LEBV) animals. These animals were selected from a reference population of 522 individuals produced by three sires and six dam lines, which were crossbreds with multiple breeds. After quality control, 43,257 SNPs (single nucleotide polymorphisms), including 42,243 autosomal SNPs and 1,014 SNPs on chromosome X, were used in the data analysis. A total of 27 differentially selected regions (DSRs), including 1 on Sus scrofa chromosome 1 (SSC1), 1 on SSC4, 2 on SSC5, 4 on SSC6, 2 on SSC7, 5 on SSC8, 3 on SSC9, 1 on SSC14, 3 on SSC18, and 5 on SSCX, were identified that showed genome-wide separation between the HEBV and LEBV groups for IBW in piglets. The DSR with the largest number of significant SNPs (including 7 top 0.1% and 31 top 5% SNPs) was located on SSC6, while another DSR with the largest genetic differences in FST was found on SSC18. These regions harbor known functionally important genes involved in growth and development, such as TNFRSF9 (tumor necrosis factor receptor superfamily member 9), CA6 (carbonic anhydrase VI) and MDFIC (MyoD family inhibitor domain containing). A DSR rich in imprinting genes appeared on SSC9, which included PEG10 (paternally expressed 10), SGCE (sarcoglycan, epsilon), PPP1R9A (protein phosphatase 1, regulatory subunit 9A) and ASB4 (ankyrin repeat and SOCS box containing 4). More importantly, our present study provided evidence to support six quantitative trait loci (QTL) regions for pig birth weight, six QTL regions for average birth weight (ABW) and three QTL regions for litter birth weight (LBW) reported previously by other groups. Furthermore, gene ontology analysis with 183 genes
Gois, I B; Borém, A; Cristofani-Yaly, M; de Resende, M D V; Azevedo, C F; Bastianel, M; Novelli, V M; Machado, M A
Genome wide selection (GWS) is essential for the genetic improvement of perennial species such as Citrus because of its ability to increase gain per unit time and to enable the efficient selection of characteristics with low heritability. This study assessed GWS efficiency in a population of Citrus and compared it with selection based on phenotypic data. A total of 180 individual trees from a cross between Pera sweet orange (Citrus sinensis Osbeck) and Murcott tangor (Citrus sinensis Osbeck x Citrus reticulata Blanco) were evaluated for 10 characteristics related to fruit quality. The hybrids were genotyped using 5287 DArT_seq(TM) (diversity arrays technology) molecular markers and their effects on phenotypes were predicted using the random regression - best linear unbiased predictor (rr-BLUP) method. The predictive ability, prediction bias, and accuracy of GWS were estimated to verify its effectiveness for phenotype prediction. The proportion of genetic variance explained by the markers was also computed. The heritability of the traits, as determined by markers, was 16-28%. The predictive ability of these markers ranged from 0.53 to 0.64, and the regression coefficients between predicted and observed phenotypes were close to unity. Over 35% of the genetic variance was accounted for by the markers. Accuracy estimates with GWS were lower than those obtained by phenotypic analysis; however, GWS was superior in terms of genetic gain per unit time. Thus, GWS may be useful for Citrus breeding as it can predict phenotypes early and accurately, and reduce the length of the selection cycle. This study demonstrates the feasibility of genomic selection in Citrus.
Lee, Young-Sup; Jeong, Hyeonsoo; Taye, Mengistie; Kim, Hyeon Jeong; Ka, Sojeong; Ryu, Youn-Chul; Cho, Seoae
Missing heritability has been a major problem in analyses based on best linear unbiased prediction (BLUP). We introduced a traditional genome-wide association study (GWAS) into BLUP to improve heritability estimation. We analyzed eight pork quality traits of the Berkshire breed using GWAS and BLUP. GWAS detects putative quantitative trait loci regions for the given traits. Single nucleotide polymorphisms (SNPs) were selected from the GWAS results at p value <0.01. BLUP with the significant SNPs was much more accurate than BLUP with all genotyped SNPs in terms of narrow-sense heritability. This implies that genomic estimated breeding values (GEBVs) of pork quality traits can be calculated by BLUP via GWAS. The GWAS model was linear regression using PLINK, and the BLUP models were G-BLUP and SNP-GBLUP. SNP-GBLUP uses a SNP-SNP relationship matrix. BLUP analysis with GWAS preprocessing may be one way of addressing the missing heritability problem, and it offers an alternative BLUP method that can yield more accurate GEBVs.
Hayes, Ben; Goddard, Mike
Results from genome-wide association studies in livestock and humans have led to the conclusion that the effects of individual quantitative trait loci (QTL) on complex traits, such as yield, are likely to be small; therefore, a large number of QTL are necessary to explain genetic variation in these traits. Given this genetic architecture, gains from marker-assisted selection (MAS) programs using only a small number of DNA markers to trace a limited number of QTL are likely to be small. This has led to the development of an alternative technology for using the available dense single nucleotide polymorphism (SNP) information, called genomic selection. Genomic selection uses a genome-wide panel of dense markers so that all QTL are likely to be in linkage disequilibrium with at least one SNP. The genomic breeding values are predicted to be the sum of the effects of these SNPs across the entire genome. In dairy cattle breeding, the accuracy of genomic estimated breeding values (GEBV) that can be achieved and the fact that these are available early in life have led to rapid adoption of the technology. Here, we discuss the design of experiments necessary to achieve accurate prediction of GEBV in future generations in terms of the number of markers necessary and the size of the reference population where marker effects are estimated. We also present a simple method for implementing genomic selection using a genomic relationship matrix. Future challenges discussed include using whole genome sequence data to improve the accuracy of genomic selection and management of inbreeding through genomic relationships.
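The abstract above mentions a genomic relationship matrix without giving its form. As a minimal sketch only, the block below builds one from 0/1/2 allele counts using the widely used centering-and-scaling construction G = ZZ'/(2 Σ p(1-p)). The choice of this particular construction, the function name and the toy genotypes are all assumptions for illustration, not the authors' stated method.

```python
def genomic_relationship_matrix(genotypes):
    """Build a genomic relationship matrix from allele counts.

    genotypes: list of rows (one per individual), each a list of
    0/1/2 counts of the reference allele at every SNP.
    Assumption: G = Z Z' / (2 * sum_j p_j (1 - p_j)), where Z holds
    allele counts centered by twice the observed allele frequency.
    """
    n = len(genotypes)
    m = len(genotypes[0])
    # Observed allele frequency at each SNP.
    freqs = [sum(row[j] for row in genotypes) / (2.0 * n) for j in range(m)]
    # Scaling term: expected variance of allele counts summed over SNPs.
    scale = 2.0 * sum(p * (1.0 - p) for p in freqs)
    if scale == 0.0:
        raise ValueError("all SNPs are monomorphic; G is undefined")
    # Center each genotype by its expected value 2p.
    centered = [[row[j] - 2.0 * freqs[j] for j in range(m)] for row in genotypes]
    return [
        [sum(a * b for a, b in zip(centered[i], centered[k])) / scale
         for k in range(n)]
        for i in range(n)
    ]


# Hypothetical toy data: three individuals, three SNPs.
toy = [[0, 1, 2], [2, 1, 0], [1, 1, 1]]
G = genomic_relationship_matrix(toy)
```

Because frequencies are estimated from the same sample, each row of G sums to zero, so relationships are expressed relative to the sample average rather than to a pedigree base population.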
Cao, Jiaxve; Wu, Mingming; Ma, Xiaomeng; Liu, Zhen; Liu, Ruizao; Zhao, Fuping; Wei, Caihong; Du, Lixin
Background Commercial sheep raised for mutton grow faster than traditional Chinese sheep breeds. Here, we aimed to evaluate genetic selection among three different types of sheep breed: two well-known commercial mutton breeds and one indigenous Chinese breed. Results We first combined locus-specific branch lengths and di statistical methods to detect candidate regions targeted by selection in the three different populations. The results showed that the genetic distances reached at least medium divergence for each pairwise combination. We found that these two methods were highly correlated, and identified many growth-related candidate genes undergoing artificial selection. For production traits, APOBR and FTO are associated with body mass index. For meat traits, ALDOA, STK32B and FAM190A are related to marbling. For reproduction traits, CCNB2 and SLC8A3 affect oocyte development. We also found that two well-known genes, GHR (which affects meat production and quality) and EDAR (associated with hair thickness), were associated with German mutton merino sheep. Furthermore, four genes (POL, RPL7, MSL1 and SHISA9) were associated with pre-weaning gain in our previous genome-wide association study. Conclusions Our results indicated that combining locus-specific branch lengths and di statistical approaches can narrow the search range for specific selection. We also obtained many credible candidate genes, which not only confirm the results of previous reports but also provide a suite of novel candidate genes in defined breeds to guide hybridization breeding. PMID:26083354
Lystig, Theodore C
Genome-wide scans for quantitative trait loci (QTL) have traditionally been summarized with plots of logarithm of odds (LOD) scores. A valuable modification is to supplement such plots with an additional vertical axis displaying quantiles of adjusted P values and labeling local maxima of the LOD scores with location-specific adjusted P values. This provides a visible gradation of genome-wide significance for the LOD score curve, instead of the stark dichotomy that a single threshold yields. Adjusted P values give genome-wide significance of individual LOD scores and are obtained through a straightforward modification of the familiar algorithm for generating permutation-based thresholds. PMID:12930772
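The permutation scheme described above can be sketched as follows. This is an illustrative outline under stated assumptions, not the paper's implementation: the LOD score is taken from a single-marker regression (LOD = -(n/2) log10(1 - r²)), and the function names and default settings are hypothetical. The part that follows the abstract is the last step: keep the genome-wide maximum LOD from each permutation, then report for every observed LOD the fraction of permutation maxima that reach it.

```python
import math
import random


def lod_score(pheno, geno):
    """Single-marker regression LOD (assumed form): -(n/2) * log10(1 - r^2)."""
    n = len(pheno)
    mean_y = sum(pheno) / n
    mean_g = sum(geno) / n
    sxy = sum((y - mean_y) * (g - mean_g) for y, g in zip(pheno, geno))
    syy = sum((y - mean_y) ** 2 for y in pheno)
    sgg = sum((g - mean_g) ** 2 for g in geno)
    if syy == 0.0 or sgg == 0.0:
        return 0.0
    # Cap r^2 just below 1 so the logarithm stays finite.
    r2 = min(sxy * sxy / (syy * sgg), 1.0 - 1e-12)
    return -(n / 2.0) * math.log10(1.0 - r2)


def adjusted_p_values(pheno, markers, n_perm=1000, seed=0):
    """Return (observed LODs, genome-wide adjusted P values).

    markers: list of genotype vectors, one per marker.
    Each permutation shuffles phenotypes against genotypes and records
    the maximum LOD over all markers; the adjusted P value of an
    observed LOD is the share of those maxima that are >= it.
    """
    rng = random.Random(seed)
    observed = [lod_score(pheno, g) for g in markers]
    shuffled = list(pheno)
    maxima = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        maxima.append(max(lod_score(shuffled, g) for g in markers))
    adjusted = [sum(1 for m in maxima if m >= lod) / n_perm for lod in observed]
    return observed, adjusted
```

The same list of permutation maxima that yields a single threshold (for example its 95th percentile) is reused here to grade every point on the LOD curve, which is why the extra cost over the usual threshold calculation is small.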
Mastrangelo, S; Saura, M; Tolone, M; Salces-Ortiz, J; Di Gerlando, R; Bertolini, F; Fontanesi, L; Sardina, M T; Serrano, M; Portolano, B
Genomic technologies, such as high-throughput genotyping based on SNP arrays, provided background information concerning genome structure in domestic animals. The aim of this work was to investigate the genetic structure, the genome-wide estimates of inbreeding, coancestry, effective population size (Ne), and the patterns of linkage disequilibrium (LD) in 2 economically important Sicilian local cattle breeds, Cinisara (CIN) and Modicana (MOD), using the Illumina Bovine SNP50K v2 BeadChip. To understand the genetic relationship and to place both Sicilian breeds in a global context, genotypes from 134 other domesticated bovid breeds were used. Principal component analysis showed that the Sicilian cattle breeds were closer to individuals of Bos taurus taurus from Eurasia and formed nonoverlapping clusters with other breeds. Between the Sicilian cattle breeds, MOD was the most differentiated, whereas the animals belonging to the CIN breed showed a lower value of assignment, the presence of substructure, and genetic links with the MOD breed. The average molecular inbreeding and coancestry coefficients were moderately high, and the current estimates of Ne were low in both breeds. These values indicated a low genetic variability. Considering levels of LD between adjacent markers, the average r² in the MOD breed was comparable to those reported for other cattle breeds, whereas CIN showed a lower value. Therefore, these results support the need for denser SNP arrays for high-power association mapping and genomic selection efficiency, particularly for the CIN cattle breed. Controlling molecular inbreeding and coancestry would restrict inbreeding depression, the probability of losing beneficial rare alleles, and therefore the risk of extinction. The results generated from this study have important implications for the development of conservation and/or selection breeding programs in these 2 local cattle breeds.
Background Since the times of domestication, cattle have been continually shaped by the influence of humans. Relatively recent history, including breed formation and the still enduring enormous improvement of economically important traits, is expected to have left distinctive footprints of selection within the genome. The purpose of this study was to map genome-wide selection signatures in ten cattle breeds and thus improve the understanding of the genome response to strong artificial selection and support the identification of the underlying genetic variants of favoured phenotypes. We analysed 47,651 single nucleotide polymorphisms (SNP) using Cross Population Extended Haplotype Homozygosity (XP-EHH). Results We set the significance thresholds using the maximum XP-EHH values of two essentially artificially unselected breeds and found up to 229 selection signatures per breed. Through a confirmation process we verified selection for three distinct phenotypes typical for one breed (polledness in Galloway, double muscling in Blanc-Bleu Belge and red coat colour in Red Holstein cattle). Moreover, we detected six genes strongly associated with known QTL for beef or dairy traits (TG, ABCG2, DGAT1, GH1, GHR and the Casein Cluster) within selection signatures of at least one breed. A literature search for genes lying in outstanding signatures revealed further promising candidate genes. However, in concordance with previous genome-wide studies, we also detected a substantial number of signatures without any yet known gene content. Conclusions These results show the power of XP-EHH analyses in cattle to discover promising candidate genes and raise the hope of identifying phenotypically important variants in the near future. The finding of plausible functional candidates in some short signatures supports this hope. For instance, MAP2K6 is the only annotated gene of two signatures detected in Galloway and Gelbvieh cattle and is already known to be associated with carcass
This study leverages the breeding data of 1,862 breeding lines evaluated in 97 field trials for genome-wide association study of malting quality traits in barley. The breeding lines were six-row and two-row barley advanced breeding lines from eight barley breeding populations established at six pub...
Kijas, James W; Townley, David; Dalrymple, Brian P; Heaton, Michael P; Maddox, Jillian F; McGrath, Annette; Wilson, Peter; Ingersoll, Roxann G; McCulloch, Russell; McWilliam, Sean; Tang, Dave; McEwan, John; Cockett, Noelle; Oddy, V Hutton; Nicholas, Frank W; Raadsma, Herman
The genetic structure of sheep reflects their domestication and subsequent formation into discrete breeds. Understanding genetic structure is essential for achieving genetic improvement through genome-wide association studies, genomic selection and the dissection of quantitative traits. After identifying the first genome-wide set of SNP for sheep, we report on levels of genetic variability both within and between a diverse sample of ovine populations. Then, using cluster analysis and the partitioning of genetic variation, we demonstrate sheep are characterised by weak phylogeographic structure, overlapping genetic similarity and generally low differentiation which is consistent with their short evolutionary history. The degree of population substructure was, however, sufficient to cluster individuals based on geographic origin and known breed history. Specifically, African and Asian populations clustered separately from breeds of European origin sampled from Australia, New Zealand, Europe and North America. Furthermore, we demonstrate the presence of stratification within some, but not all, ovine breeds. The results emphasize that careful documentation of genetic structure will be an essential prerequisite when mapping the genetic basis of complex traits. Furthermore, the identification of a subset of SNP able to assign individuals into broad groupings demonstrates even a small panel of markers may be suitable for applications such as traceability.
Background Modern breeding and artificial selection play critical roles in pig domestication and shape the genetic variation of different breeds. China has many indigenous pig breeds with various characteristics in morphology and production performance that differ from those of foreign commercial pig breeds. However, the signatures of selection on genes implicated in economic traits between Chinese indigenous and commercial pigs have been poorly understood. Results We identified footprints of positive selection at the whole genome level, comprising 44,652 SNPs genotyped in six Chinese indigenous pig breeds, one developed breed and two commercial breeds. An empirical genome-wide distribution of Fst (F-statistics) was constructed based on estimations of Fst for each SNP across these nine breeds. We detected selection at the genome level using the High-Fst outlier method and found that 81 candidate genes show strong evidence of positive selection. Furthermore, the results of network analyses showed that the genes that displayed evidence of positive selection were mainly involved in the development of tissues and organs, and the immune response. In addition, we calculated the pairwise Fst between Chinese indigenous and commercial breeds (CHN VS EURO) and between Northern and Southern Chinese indigenous breeds (Northern VS Southern). The IGF1R and ESR1 genes showed evidence of positive selection in the CHN VS EURO and Northern VS Southern groups, respectively. Conclusions In this study, we first identified the genomic regions that showed evidence of selection between Chinese indigenous and commercial pig breeds using the High-Fst outlier method. These regions were found to be involved in the development of tissues and organs, the immune response, growth and litter size. The results of this study provide new insights into understanding the genetic variation and domestication in pigs. PMID:24422716
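The outlier idea in this abstract (estimate Fst per SNP, form the empirical genome-wide distribution, keep the upper tail) can be illustrated with a small sketch. The estimator below is the simple heterozygosity-based form Fst = (Ht - Hs)/Ht with equally weighted populations. That estimator, the function names and the default 1% tail are assumptions chosen for illustration, since the abstract does not state which estimator or cut-off was used.

```python
def snp_fst(freqs):
    """Fst for one SNP from per-population allele frequencies.

    Assumption: Fst = (Ht - Hs) / Ht, with populations weighted equally,
    Ht the expected heterozygosity at the mean frequency and Hs the mean
    within-population expected heterozygosity.
    """
    k = len(freqs)
    p_bar = sum(freqs) / k
    h_total = 2.0 * p_bar * (1.0 - p_bar)
    if h_total == 0.0:
        return 0.0  # SNP is fixed everywhere; no differentiation to measure
    h_within = sum(2.0 * p * (1.0 - p) for p in freqs) / k
    return (h_total - h_within) / h_total


def high_fst_outliers(freq_table, top_fraction=0.01):
    """Rank SNPs by Fst and return (all Fst values, top-fraction SNP ids).

    freq_table: dict mapping SNP id -> list of allele frequencies,
    one per breed. top_fraction is a hypothetical tail cut-off.
    """
    fst = {snp: snp_fst(freqs) for snp, freqs in freq_table.items()}
    ranked = sorted(fst, key=fst.get, reverse=True)
    n_keep = max(1, int(round(top_fraction * len(ranked))))
    return fst, ranked[:n_keep]
```

An empirical tail cut-off of this kind identifies the most differentiated SNPs relative to the rest of the genome; it does not by itself provide a formal test against a neutral model.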
Edea, Z; Bhuiyan, M S A; Dessie, T; Rothschild, M F; Dadi, H; Kim, K S
Knowledge about genetic diversity and population structure is useful for designing effective strategies to improve the production, management and conservation of farm animal genetic resources. Here, we present a comprehensive genome-wide analysis of genetic diversity, population structure and admixture based on 244 animals sampled from 10 cattle populations in Asia and Africa and genotyped for 69,903 autosomal single-nucleotide polymorphisms (SNPs) mainly derived from the indicine breed. Principal component analysis, STRUCTURE and distance analysis from high-density SNP data clearly revealed that the largest genetic difference occurred between the two domestic lineages (taurine and indicine), whereas Ethiopian cattle populations represent a mosaic of the humped zebu and taurine. Estimation of the genetic influence of zebu and taurine revealed that Ethiopian cattle were characterized by considerable levels of introgression from South Asian zebu, whereas Bangladeshi populations shared very low taurine ancestry. The relationships among Ethiopian cattle populations reflect their history of origin and admixture rather than phenotype-based distinctions. The high within-individual genetic variability observed in Ethiopian cattle represents an untapped opportunity for adaptation to changing environments and for implementation of within-breed genetic improvement schemes. Our results provide a basis for future applications of genome-wide SNP data to exploit the unique genetic makeup of indigenous cattle breeds and to facilitate their improvement and conservation.
Yang, Yalan; Zhou, Rong; Mu, Yulian; Hou, Xinhua; Tang, Zhonglin; Li, Kui
DNA methylation is a crucial epigenetic modification involved in diverse biological processes. There is significant phenotypic variance between Chinese indigenous and western pig breeds. Here, we surveyed the genome-wide DNA methylation profiles of blood leukocytes from three pig breeds (Tongcheng, Landrace, and Wuzhishan) by methylated DNA immunoprecipitation sequencing. The results showed that DNA methylation was enriched in gene body regions and repetitive sequences. LINE/L1 and SINE/tRNA-Glu were the predominant methylated repeats in pigs. The methylation level in the gene body regions was higher than in the 5′ and 3′ flanking regions of genes. About 15% of CpG islands were methylated in the pig genomes. Additionally, 2,807, 2,969, and 5,547 differentially methylated genes (DMGs) were identified in the Tongcheng vs. Landrace, Tongcheng vs. Wuzhishan, and Landrace vs. Wuzhishan comparisons, respectively. A total of 868 DMGs were shared by the three contrasts. The DMGs were significantly enriched in development- and metabolism-related biological processes and pathways. Finally, we identified 32 candidate DMGs associated with phenotype variance in pigs. Our research provides a DNA methylome resource for pigs and furthers understanding of epigenetically regulated phenotype variance in mammals. PMID:27444743
Sudrajad, P; Seo, D W; Choi, T J; Park, B H; Roh, S H; Jung, W Y; Lee, S S; Lee, J H; Kim, S; Lee, S H
The routine collection and use of genomic data are useful for effectively managing breeding programs for endangered populations. Linkage disequilibrium (LD) using high-density DNA markers has been widely used to determine population structures and predict the genomic regions that are associated with economic traits in beef cattle. The extent of LD also provides information about historical events, including past effective population size (Ne), and it allows inferences on the genetic diversity of breeds. The objective of this study was to estimate the LD and Ne in three Korean cattle breeds that are genetically similar but have different coat colors (Brown, Brindle and Jeju Black Hanwoo). Brindle and Jeju Black are endangered breeds with small populations, whereas Brown Hanwoo is the main breeding population in Korea. DNA samples from these cattle breeds were genotyped using the Illumina BovineSNP50 Bead Chip. We examined 13 cattle breeds, including European taurines, African taurines and indicines, and hybrids to compare their LD values. Brown Hanwoo consistently had the lowest mean LD compared to Jeju Black, Brindle and the other 13 cattle breeds (0.13, 0.19, 0.21 and 0.15-0.22 respectively). The high LD values of Brindle and Jeju Black contributed to small Ne values (53 and 60 respectively), which were distinct from that of Brown Hanwoo (531) at 11 generations ago. The differences in LD and Ne for each breed reflect the breeding strategy applied. The Ne for these endangered cattle breeds remains low; thus, effort is needed to bring them back to a sustainable track.
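The link between LD and past effective population size mentioned above is often expressed through the approximation E[r²] ≈ 1/(1 + 4·Ne·c), where c is the recombination distance in Morgans, with markers at distance c taken to reflect Ne roughly 1/(2c) generations ago. The sketch below simply inverts that relation. Whether the study used exactly this formula, and with which corrections, is not stated in the abstract, so both the formula choice and the function names are assumptions.

```python
def ne_from_ld(mean_r2, c_morgans):
    """Effective population size from mean r^2 at recombination distance c.

    Assumption: E[r^2] = 1 / (1 + 4 * Ne * c), without any sample-size
    or mutation correction, so Ne = (1 / r^2 - 1) / (4 * c).
    """
    if not 0.0 < mean_r2 < 1.0:
        raise ValueError("mean_r2 must lie strictly between 0 and 1")
    if c_morgans <= 0.0:
        raise ValueError("c_morgans must be positive")
    return (1.0 / mean_r2 - 1.0) / (4.0 * c_morgans)


def generations_ago(c_morgans):
    """Approximate number of generations in the past reflected by distance c."""
    return 1.0 / (2.0 * c_morgans)


# Hypothetical illustration: r^2 = 0.2 between markers 0.01 Morgans apart.
example_ne = ne_from_ld(0.2, 0.01)
example_t = generations_ago(0.01)
```

Under this relation, higher LD at a given distance maps to a smaller Ne, which matches the direction of the contrast the abstract reports between the endangered breeds and Brown Hanwoo.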
Sallam, Ahmed M; Zare, Yalda; Alpay, Fazli; Shook, George E; Collins, Michael T; Alsheikh, Samir; Sharaby, Mahmoud; Kirkpatrick, Brian W
Paratuberculosis is a chronic disease of ruminants caused by Mycobacterium avium subspecies paratuberculosis (MAP). It occurs worldwide and causes a significant loss in the animal production industry. There is no cure for MAP infection and vaccination is problematic. Identification of genetics of susceptibility could be a useful adjunct for programs that focus on management, testing and culling of diseased animals. A case-control, genome-wide association study (GWAS) was conducted using Holstein and Jersey cattle in a combined analysis in order to identify markers and chromosomal regions associated with susceptibility to MAP infection across breeds. A mixed-model method (GRAMMAR-GC) implemented in the GenABEL R package and a Bayes C analysis implemented in GenSel software were used as alternative approaches to conduct GWAS analysis focused on single SNPs and chromosomal segments, respectively. After conducting quality control, 22 406 SNPs from 2157 individuals were available for the GRAMMAR-GC analysis and 45 640 SNPs from 2199 individuals were available for the Bayes C analysis. One SNP located on BTA27 (8.6 Mb) was identified as moderately associated (P < 5 × 10⁻⁵, FDR = 0.44) in the GRAMMAR-GC analysis of the combined breed data. Nine 1 Mb windows located on BTA 2, 3 (3 windows), 6, 8, 25, 27 and 29 each explained ≥1% of the total proportion of genetic variance in the Bayes C analysis. In an analysis ignoring differences in linkage phase, two moderately significantly associated SNPs were identified: ARS-BFGL-NGS-19381 on BTA23 (32 Mb) and Hapmap40994-BTA-46361 on BTA19 (61 Mb). New common genomic regions and candidate genes have been identified from the across-breed analysis that might be involved in the immune response and susceptibility to MAP infection.
A genome-wide SNP resource was developed for rice using the GoldenGate assay and used to genotype 400 landrace accessions of O. sativa. SNPs were originally discovered using Perlegen re-sequencing technology in 20 diverse landraces of O. sativa as part of OryzaSNP project (http://irfgc.irri.org). An...
The imprints of domestication and breed development on the genomes of livestock likely differ from those of companion animals. A deep draft sequence assembly of shotgun reads from a single Hereford female and comparative sequences sampled from six additional breeds were used to develop probes to interrogate 37,470 single-nucleotide polymorphisms (SNPs) in 497 cattle from 19 geographically and biologically diverse breeds. These data show that cattle have undergone a rapid recent decrease in effective population size from a very large ancestral population, possibly due to bottlenecks associated with domestication, selection, and breed formation. Domestication and artificial selection appear to have left detectable signatures of selection within the cattle genome, yet the current levels of diversity within breeds are at least as great as exists within humans. PMID:19390050
He, Yuna; Ma, Junwu; Zhang, Feng; Hou, Lijuan; Chen, Hao; Guo, Yuanmei; Zhang, Zhiyan
Numerous quantitative trait loci (QTL) for loin eye area have been identified by linkage mapping studies, but the lack of precise positions hinders their application in the pig breeding industry. To map QTL for loin eye area to a precise genomic region, we conducted a genome-wide association study (GWAS) using the Illumina 60 K PorcineSNP60 Beadchip in four swine populations: 819 F2 pigs, 273 Laiwu pigs, 434 Sutai pigs, and 326 Erhualian pigs. In total, 26 single nucleotide polymorphisms (SNPs) distributed on seven chromosomes and associated with loin eye area were identified, 11 of which surpassed the genome-wide significance threshold; of the 11 SNPs, seven were located on SSC2 in F2 pigs and four were located on SSC12 and SSC18 in Laiwu pigs. Of note, all of the identified QTL were breed specific and no common QTL was identified across the four populations in our study. These findings not only confirmed a previous QTL on SSC2 harboring the candidate gene insulin-like growth factor 2 (IGF2), but also identified some novel candidate genes: far upstream element binding protein 3 (FUBP3), the myosin heavy chain (MYH) family, and leucine-rich repeats and guanylate kinase domain containing (LRGUK). Our study will contribute to the further identification of the causal mutation underlying these QTL and improve our knowledge of the complex genetic architecture for loin eye area in pigs.
Petersen, Jessica L.; Mickelson, James R.; Rendahl, Aaron K.; Valberg, Stephanie J.; Andersson, Lisa S.; Axelsson, Jeanette; Bailey, Ernie; Bannasch, Danika; Binns, Matthew M.; Borges, Alexandre S.; Brama, Pieter; da Câmara Machado, Artur; Capomaccio, Stefano; Cappelli, Katia; Cothran, E. Gus; Distl, Ottmar; Fox-Clipsham, Laura; Graves, Kathryn T.; Guérin, Gérard; Haase, Bianca; Hasegawa, Telhisa; Hemmann, Karin; Hill, Emmeline W.; Leeb, Tosso; Lindgren, Gabriella; Lohi, Hannes; Lopes, Maria Susana; McGivney, Beatrice A.; Mikko, Sofia; Orr, Nicholas; Penedo, M. Cecilia T.; Piercy, Richard J.; Raekallio, Marja; Rieder, Stefan; Røed, Knut H.; Swinburne, June; Tozaki, Teruaki; Vaudin, Mark; Wade, Claire M.; McCue, Molly E.
Intense selective pressures applied over short evolutionary time have resulted in homogeneity within, but substantial variation among, horse breeds. Utilizing this population structure, 744 individuals from 33 breeds, and a 54,000 SNP genotyping array, breed-specific targets of selection were identified using an FST-based statistic calculated in 500-kb windows across the genome. A 5.5-Mb region of ECA18, in which the myostatin (MSTN) gene was centered, contained the highest signature of selection in both the Paint and Quarter Horse. Gene sequencing and histological analysis of gluteal muscle biopsies showed a promoter variant and intronic SNP of MSTN were each significantly associated with higher Type 2B and lower Type 1 muscle fiber proportions in the Quarter Horse, demonstrating a functional consequence of selection at this locus. Signatures of selection on ECA23 in all gaited breeds in the sample led to the identification of a shared, 186-kb haplotype including two doublesex related mab transcription factor genes (DMRT2 and 3). The recent identification of a DMRT3 mutation within this haplotype, which appears necessary for the ability to perform alternative gaits, provides further evidence for selection at this locus. Finally, putative loci for the determination of size were identified in the draft breeds and the Miniature horse on ECA11, as well as when signatures of selection surrounding candidate genes at other loci were examined. This work provides further evidence of the importance of MSTN in racing breeds, provides strong evidence for selection upon gait and size, and illustrates the potential for population-based techniques to find genomic regions driving important phenotypes in the modern horse. PMID:23349635
Gao, Liangliang; Turner, M. Kathryn; Chao, Shiaoman; Kolmer, James; Anderson, James A.
Leaf rust is an important disease, threatening wheat production annually. Identification of resistance genes or QTLs for effective field resistance could greatly enhance our ability to breed durably resistant varieties. We applied a genome wide association study (GWAS) approach to identify resistance genes or QTLs in 338 spring wheat breeding lines from public and private sectors that were predominately developed in the Americas. A total of 46 QTLs were identified for field and seedling traits and approximately 20–30 confer field resistance in varying degrees. The 10 QTLs accounting for the most variation in field resistance explained 26–30% of the total variation (depending on traits: percent severity, coefficient of infection or response type). Similarly, the 10 QTLs accounting for most of the variation in seedling resistance to different races explained 24–34% of the variation, after correcting for population structure. Two potentially novel QTLs (QLr.umn-1AL, QLr.umn-4AS) were identified. Identification of novel genes or QTLs and validation of previously identified genes or QTLs for seedling and especially adult plant resistance will enhance understanding of leaf rust resistance and assist breeding for resistant wheat varieties. We also developed computer programs to automate field and seedling rust phenotype data conversions. This is the first GWAS study of leaf rust resistance in elite wheat breeding lines genotyped with high density 90K SNP arrays. PMID:26849364
Mdladla, K; Dzomba, E F; Huson, H J; Muchadeyi, F C
The sustainability of goat farming in marginal areas of southern Africa depends on local breeds that are adapted to specific agro-ecological conditions. Unimproved non-descript goats are the main genetic resources used for the development of commercial meat-type breeds of South Africa. Little is known about genetic diversity and the genetics of adaptation of these indigenous goat populations. This study investigated the genetic diversity, population structure and breed relations, linkage disequilibrium, effective population size and persistence of gametic phase in goat populations of South Africa. Three locally developed meat-type breeds of the Boer (n = 33), Savanna (n = 31), Kalahari Red (n = 40), a feral breed of Tankwa (n = 25) and unimproved non-descript village ecotypes (n = 110) from four goat-producing provinces of the Eastern Cape, KwaZulu-Natal, Limpopo and North West were assessed using the Illumina Goat 50K SNP Bead Chip assay. The proportion of SNPs with minor allele frequencies >0.05 ranged from 84.22% in the Tankwa to 97.58% in the Xhosa ecotype, with a mean of 0.32 ± 0.13 across populations. Principal components analysis, admixture and pairwise FST identified Tankwa as a genetically distinct population and supported clustering of the populations according to their historical origins. Genome-wide FST identified 101 markers potentially under positive selection in the Tankwa. Average linkage disequilibrium was highest in the Tankwa (r² = 0.25 ± 0.26) and lowest in the village ecotypes (r² range = 0.09 ± 0.12 to 0.11 ± 0.14). We observed an effective population size of <150 for all populations 13 generations ago. The estimated correlations for all breed pairs were lower than 0.80 at marker distances >100 kb with the exception of those in Savanna and Tswana populations. This study highlights the high level of genetic diversity in South African indigenous goats as well as the utility of the genome-wide SNP marker panels in
Wang, Wei; Wang, Shenyuan; Hou, Chenglin; Xing, Yanping; Cao, Junwei; Wu, Kaifeng; Liu, Chunxia; Zhang, Dong; Zhang, Li; Zhang, Yanru; Zhou, Huanmin
Recent studies have found that copy number variations (CNVs) are widespread in human and animal genomes. CNVs are a significant source of genetic variation and have been shown to be associated with phenotypic diversity. However, the effect of CNVs on genetic variation in horses is not well understood. In the present study, CNVs in mares of 6 breeds, the Mongolia horse, Abaga horse, Hequ horse and Kazakh horse (all plateau breeds) plus the Debao pony and Thoroughbred, were determined using array comparative genomic hybridization (aCGH). In total, 700 CNVs were identified, ranging in size from 6.1 kb to 0.57 Mb across all autosomes, with an average size of 43.08 kb and a median size of 15.11 kb. By merging overlapping CNVs, we found a total of 353 CNV regions (CNVRs). The length of the CNVRs ranged from 6.1 kb to 1.45 Mb, with average and median sizes of 38.49 kb and 13.1 kb, respectively. Collectively, 13.59 Mb of copy number variation was identified among the horses investigated, accounting for approximately 0.61% of the horse genome sequence. A total of 518 annotated genes were affected by CNVs, corresponding to about 2.26% of all horse genes. Through gene ontology (GO) analysis, genetic pathway analysis and comparison of CNV genes among breeds, we found evidence that CNVs involving 7 genes may be related to the adaptation of these plateau horses to a severe environment. This study is the first report of copy number variations in Chinese horses and indicates that CNVs are ubiquitous in the horse genome and influence many biological processes of the horse. These results will be helpful not only in mapping whole-genome CNVs in the horse, but also for further research on the adaptation of plateau horses to a severe high-altitude environment. PMID:24497987
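The step of merging overlapping CNVs into CNV regions can be pictured as a standard interval merge. The following is a minimal sketch under stated assumptions: the function name `merge_cnvs`, the `(chrom, start, end)` tuple layout, and the rule that calls sharing at least one coordinate count as overlapping are all illustrative choices, not details taken from the study.

```python
def merge_cnvs(cnvs):
    """Merge overlapping CNV calls into CNV regions (CNVRs).

    cnvs: iterable of (chrom, start, end) tuples with start <= end.
    Returns a list of merged (chrom, start, end) tuples sorted by chromosome
    name and start position.
    """
    regions = []
    for chrom, start, end in sorted(cnvs):
        if regions and regions[-1][0] == chrom and start <= regions[-1][2]:
            # Overlaps the region being built: extend its end if needed.
            prev_chrom, prev_start, prev_end = regions[-1]
            regions[-1] = (prev_chrom, prev_start, max(prev_end, end))
        else:
            # No overlap (or a new chromosome): start a new region.
            regions.append((chrom, start, end))
    return regions
```

Because several calls can chain together, a merged region can be longer than any single CNV that contributed to it, which is consistent with the maximum CNVR length exceeding the maximum CNV length reported above.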
Pilot, Małgorzata; Malewski, Tadeusz; Moura, Andre E.; Grzybowski, Tomasz; Oleński, Kamil; Kamiński, Stanisław; Fadel, Fernanda Ruiz; Alagaili, Abdulaziz N.; Mohammed, Osama B.; Bogdanowicz, Wiesław
Domesticated species are often composed of distinct populations differing in the character and strength of artificial and natural selection pressures, providing a valuable model to study adaptation. In contrast to pure-breed dogs that constitute artificially maintained inbred lines, free-ranging dogs are typically free-breeding, i.e., unrestrained in mate choice. Many traits in free-breeding dogs (FBDs) may be under similar natural and sexual selection conditions to wild canids, while relaxation of sexual selection is expected in pure-breed dogs. We used a Bayesian approach with strict false-positive control criteria to identify FST-outlier SNPs between FBDs and either European or East Asian breeds, based on 167,989 autosomal SNPs. By identifying outlier SNPs located within coding genes, we found four candidate genes under diversifying selection shared by these two comparisons. Three of them are associated with the Hedgehog (HH) signaling pathway regulating vertebrate morphogenesis. A comparison between FBDs and East Asian breeds also revealed diversifying selection on the BBS6 gene, which was earlier shown to cause snout shortening and dental crowding via disrupted HH signaling. Our results suggest that relaxation of natural and sexual selection in pure-breed dogs as opposed to FBDs could have led to mild changes in regulation of the HH signaling pathway. HH inhibits adhesion and the migration of neural crest cells from the neural tube, and minor deficits of these cells during embryonic development have been proposed as the underlying cause of “domestication syndrome.” This suggests that the process of breed formation involved the same genetic and developmental pathways as the process of domestication. PMID:27233669
Zaykin, Dmitri V; Kozbur, Damian O
An appealing genome-wide association study design compares one large control group against several disease samples. A pioneering study by the Wellcome Trust Case Control Consortium that employed such a design has identified multiple susceptibility regions, many of which have been independently replicated. While reusing a control sample provides effective utilization of data, it also creates correlation between association statistics across diseases. An observation of a large association statistic for one of the diseases may greatly increase the chance of observing a spuriously large association for a different disease. Accounting for the correlation is also particularly important when screening for SNPs that might be involved in a set of diseases with overlapping etiology. We describe methods that correct association statistics for dependency due to shared controls, and we describe ways to obtain a measure of overall evidence and to combine association signals across multiple diseases. The methods we describe require no access to individual subject data; instead, they efficiently use information contained in P-values for association reported for individual diseases. P-value based combined tests for association are flexible and essentially as powerful as the approach based on aggregating the individual subject data.
Kwan, Johnny S H; Li, Miao-Xin; Deng, Jia-En; Sham, Pak C
Imputing individual-level genotypes (or genotype imputation) is now a standard procedure in genome-wide association studies (GWAS) to examine disease associations at untyped common genetic variants. Meta-analysis of publicly available GWAS summary statistics can allow more disease-associated loci to be discovered, but these data are usually provided for various variant sets. Thus imputing these summary statistics of different variant sets into a common reference panel for meta-analyses is impossible using traditional genotype imputation methods. Here we develop a fast and accurate P-value imputation (FAPI) method that utilizes summary statistics of common variants only. Its computational cost is linear in the number of untyped variants, and its accuracy is similar to that of IMPUTE2 with prephasing, one of the leading methods in genotype imputation. In addition, based on the FAPI idea, we develop a metric to detect abnormal association at a variant and show that it has significantly greater power than LD-PAC, a method that quantifies the evidence of spurious associations based on a likelihood ratio. Our method is implemented in a user-friendly software tool, which is available at http://statgenpro.psychiatry.hku.hk/fapi.
We conducted a genome-wide scan for visceral leishmaniasis in mixed-breed dogs from a highly endemic area in Brazil using 149,648 single nucleotide polymorphism (SNP) markers genotyped in 20 cases and 28 controls. Using a mixed model approach, we found two candidate loci on canine autosomes 1 and 2....
Begum, Hasina; Spindel, Jennifer E; Lalusin, Antonio; Borromeo, Teresita; Gregorio, Glenn; Hernandez, Jose; Virk, Parminder; Collard, Bertrand; McCouch, Susan R
Genome-wide association mapping studies (GWAS) are frequently used to detect QTL in diverse collections of crop germplasm, based on historic recombination events and linkage disequilibrium across the genome. Generally, diversity panels genotyped with high density SNP panels are utilized in order to assay a wide range of alleles and haplotypes and to monitor recombination breakpoints across the genome. By contrast, GWAS have not generally been performed in breeding populations. In this study we performed association mapping for 19 agronomic traits including yield and yield components in a breeding population of elite irrigated tropical rice breeding lines so that the results would be more directly applicable to breeding than those from a diversity panel. The population was genotyped with 71,710 SNPs using genotyping-by-sequencing (GBS), and GWAS performed with the explicit goal of expediting selection in the breeding program. Using this breeding panel we identified 52 QTL for 11 agronomic traits, including large effect QTLs for flowering time and grain length/grain width/grain-length-breadth ratio. We also identified haplotypes that can be used to select plants in our population for short stature (plant height), early flowering time, and high yield, and thus demonstrate the utility of association mapping in breeding populations for informing breeding decisions. We conclude by exploring how the newly identified significant SNPs and insights into the genetic architecture of these quantitative traits can be leveraged to build genomic-assisted selection models.
Chen, Shaoxia; Lin, Zechuan; Zhou, Degui; Wang, Chongrong; Li, Hong; Yu, Renbo; Deng, Hanchao; Tang, Xiaoyan; Zhou, Shaochuan; Wang Deng, Xing; He, Hang
Improving breeding has been widely used in crop breeding and has contributed to yield and quality improvement, yet few studies have comprehensively analyzed the genetic architecture underlying breeding improvement. Here, we collected genotype and phenotype data of 99 cultivars from the complete pedigree including Huanghuazhan, an elite, high-quality, conventional indica rice that has been grown over 4.5 million hectares in southern China and from which more than 20 excellent cultivars have been derived. We identified 1,313 selective sweeps (SSWs) revealing four stage-specific selection patterns corresponding to improvement preference during 65 years, and 1,113 conserved Huanghuazhan traceable blocks (cHTBs), introduced from different donors and conserved in >3 breeding generations, which were the core genomic regions for the superior performance of Huanghuazhan. Based on 151 quantitative trait loci (QTLs) identified for 13 improved traits in the pedigree, we reproduced their improvement process in silico, highlighting that improving breeding works well for traits controlled by major or major + minor effect QTLs, but is inefficient for traits controlled by QTLs with complex interactions or explaining low levels of phenotypic variation. These results indicate that long-term breeding improvement is efficient in constructing a superior genetic architecture for elite performance, yet molecular breeding with designed QTL genotypes can facilitate the improvement of complex traits. PMID:28374863
The Bayes factor is a summary measure that provides an alternative to the P-value for the ranking of associations, or the flagging of associations as "significant". We describe an approximate Bayes factor that is straightforward to use and is appropriate when sample sizes are large. We consider various choices of the prior on the effect size, including those that allow effect size to vary with the minor allele frequency (MAF) of the marker. An important contribution is the description of a specific prior that gives identical rankings between Bayes factors and P-values, providing a link between the two approaches, and allowing the implications of the use of P-values to be more easily understood. As a summary measure of noteworthiness P-values are difficult to calibrate since their interpretation depends on MAF and, crucially, on sample size. A consequence is that a consistent decision-making procedure using P-values requires a threshold for significance that reduces with sample size, contrary to common practice.
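To make the idea of an approximate Bayes factor concrete, the sketch below evaluates one commonly cited large-sample form: the ratio of a N(0, V) density to a N(0, V + W) density at the estimated effect, where V is the squared standard error and W is the prior variance of the effect. Whether this matches the exact expression and priors in the abstract above is an assumption; the function name `approx_bayes_factor` and its parameters are illustrative.

```python
import math


def approx_bayes_factor(beta_hat, se, prior_sd):
    """Approximate Bayes factor for H0 (no effect) versus H1 (effect ~ N(0, W)).

    beta_hat: estimated effect (e.g. a log odds ratio)
    se:       its standard error, so V = se**2
    prior_sd: prior standard deviation of the effect, so W = prior_sd**2
    Values below 1 favour an association; values above 1 favour the null.
    """
    v = se ** 2
    w = prior_sd ** 2
    z_squared = (beta_hat / se) ** 2
    shrinkage = w / (v + w)
    return math.sqrt((v + w) / v) * math.exp(-0.5 * z_squared * shrinkage)
```

Under this form, holding the z-score (and hence the P-value) fixed while shrinking the standard error, as happens with larger samples, moves the Bayes factor towards the null. That is one way to see why a fixed P-value threshold does not correspond to a fixed level of evidence across sample sizes.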
Huson, Heather J; vonHoldt, Bridgett M; Rimbault, Maud; Byers, Alexandra M; Runstadler, Jonathan A; Parker, Heidi G; Ostrander, Elaine A
Alaskan sled dogs are a genetically distinct population shaped by generations of selective interbreeding with purebred dogs to create a group of high-performance athletes. As a result of selective breeding strategies, sled dogs present a unique opportunity to employ admixture-mapping techniques to investigate how breed composition and trait selection impact genomic structure. We used admixture mapping to investigate genetic ancestry across the genomes of two classes of sled dogs, sprint and long-distance racers, and combined that with genome-wide association studies (GWAS) to identify regions that correlate with performance-enhancing traits. The sled dog genome is enhanced by differential contributions from four non-admixed breeds (Alaskan Malamute, Siberian Husky, German Shorthaired Pointer, and Borzoi). A principal components analysis (PCA) of 115,000 genome-wide SNPs clearly resolved the sprint and distance populations as distinct genetic groups, with longer blocks of linkage disequilibrium (LD) observed in the distance versus sprint dogs (7.5-10 and 2.5-3.75 kb, respectively). Furthermore, we identified eight regions with the genomic signal from either a selective sweep or an association analysis, corroborated by an excess of ancestry when comparing sprint and distance dogs. A comparison of elite and poor-performing sled dogs identified a single region significantly associated with heat tolerance. Within the region we identified seven SNPs within the myosin heavy chain 9 gene (MYH9) that were significantly associated with heat tolerance in sprint dogs, two of which correspond to conserved promoter and enhancer regions in the human ortholog.
Massey, Jonathan; Dietschi, Elisabeth; Kierczak, Marcin; Lund-Ziener, Martine; Sundberg, Katarina; Thoresen, Stein Istre; Kämpe, Olle; Andersson, Göran; Ollier, William E. R.; Hedhammar, Åke; Leeb, Tosso; Lindblad-Toh, Kerstin; Kennedy, Lorna J.; Lingaas, Frode; Rosengren Pielberg, Gerli
Hypothyroidism is a complex clinical condition found in both humans and dogs, thought to be caused by a combination of genetic and environmental factors. In this study we present a multi-breed analysis of predisposing genetic risk factors for hypothyroidism in dogs using three high-risk breeds: the Gordon Setter, Hovawart and the Rhodesian Ridgeback. Using a genome-wide association approach and meta-analysis, we identified a major hypothyroidism risk locus shared by these breeds on chromosome 12 (p = 2.1 × 10⁻¹¹). Further characterisation of the candidate region revealed a shared ~167 kb risk haplotype (4,915,018–5,081,823 bp), tagged by two SNPs in almost complete linkage disequilibrium. This breed-shared risk haplotype includes three genes (LHFPL5, SRPK1 and SLC26A8) and does not extend to the dog leukocyte antigen (DLA) class II gene cluster located in the vicinity. These three genes have not been identified as candidate genes for hypothyroid disease previously, but have functions that could potentially contribute to the development of the disease. Our results implicate the potential involvement of novel genes and pathways for the development of canine hypothyroidism, raising new possibilities for screening, breeding programmes and treatments in dogs. This study may also contribute to our understanding of the genetic etiology of human hypothyroid disease, which is one of the most common endocrine disorders in humans. PMID:26261983
Breed utilization, genetic improvement, and industry consolidation are predicted to have major impacts on the genetic composition of commercial chickens. Consequently, the question arises as to whether sufficient genetic diversity remains within industry stocks to address future needs. With the ch...
Velie, Brandon D.; Shrestha, Merina; François, Liesbeth; Schurink, Anouk; Tesfayonas, Yohannes G.; Stinckens, Anneleen; Blott, Sarah; Ducro, Bart J.; Mikko, Sofia; Thomas, Ruth; Swinburne, June E.; Sundqvist, Marie; Eriksson, Susanne; Buys, Nadine; Lindgren, Gabriella
While susceptibility to hypersensitive reactions is a common problem amongst humans and animals alike, the population structure of certain animal species and breeds provides a more advantageous route to better understanding the biology underpinning these conditions. The current study uses Exmoor ponies, a highly inbred breed of horse known to frequently suffer from insect bite hypersensitivity, to identify genomic regions associated with a type I and type IV hypersensitive reaction. A total of 110 cases and 170 controls were genotyped on the 670K Axiom Equine Genotyping Array. Quality control resulted in 452,457 SNPs and 268 individuals being tested for association. Genome-wide association analyses were performed using the GenABEL package in R and resulted in the identification of two regions of interest on Chromosome 8. The first region contained the most significant SNP identified, which was located in an intron of the DCC netrin 1 receptor gene. The second region identified contained multiple top SNPs and encompassed the PIGN, KIAA1468, TNFRSF11A, ZCCHC2, and PHLPP1 genes. Although additional studies will be needed to validate the importance of these regions in horses and the relevance of these regions in other species, the knowledge gained from the current study has the potential to be a step forward in unraveling the complex nature of hypersensitive reactions. PMID:27070818
Boussaha, Mekki; Esquerré, Diane; Barbieri, Johanna; Djari, Anis; Pinton, Alain; Letaief, Rabia; Salin, Gérald; Escudié, Frédéric; Roulet, Alain; Fritz, Sébastien; Samson, Franck; Grohs, Cécile; Bernard, Maria; Klopp, Christophe; Boichard, Didier; Rocha, Dominique
High-throughput sequencing technologies have offered in recent years new opportunities to study genome variations. These studies have mostly focused on single nucleotide polymorphisms, small insertions or deletions and on copy number variants. Other structural variants, such as large insertions or deletions, tandem duplications, translocations, and inversions are less well-studied, despite that some have an important impact on phenotypes. In the present study, we performed a large-scale survey of structural variants in cattle. We report the identification of 6,426 putative structural variants in cattle extracted from whole-genome sequence data of 62 bulls representing the three major French dairy breeds. These genomic variants affect DNA segments greater than 50 base pairs and correspond to deletions, inversions and tandem duplications. Out of these, we identified a total of 547 deletions and 410 tandem duplications which could potentially code for CNVs. Experimental validation was carried out on 331 structural variants using a novel high-throughput genotyping method. Out of these, 255 structural variants (77%) generated good quality genotypes and 191 (75%) of them were validated. Gene content analyses in structural variant regions revealed 941 large deletions removing completely one or several genes, including 10 single-copy genes. In addition, some of the structural variants are located within quantitative trait loci for dairy traits. This study is a pan-genome assessment of genomic variations in cattle and may provide a new glimpse into the bovine genome architecture. Our results may also help to study the effects of structural variants on gene expression and consequently their effect on certain phenotypes of interest. PMID:26317361
Ceballos, Hernán; Pérez, Juan C.; Joaqui Barandica, Orlando; Lenis, Jorge I.; Morante, Nelson; Calle, Fernando; Pino, Lizbeth; Hershey, Clair H.
Breeding cassava relies on several selection stages (single row trial, SRT; preliminary; advanced; and uniform yield trial, UYT). This study uses data from 14 years of evaluations. Of more than 20,000 genotypes initially evaluated, only 114 reached the last stage. The objective was to assess how the data at SRT could be used to predict the probability of genotypes reaching the UYT. Phenotypic data from each genotype at SRT were integrated into the selection index (SIN) used by the cassava breeding program. The average SIN across all progenies derived from each progenitor was then obtained. Average SIN is an approximation of the breeding value of each progenitor. The data clearly suggested that some genotypes were better progenitors than others (e.g., a high number of their progenies reached the UYT), suggesting important variation in the breeding values of progenitors. However, regression of the average SIN of each parental genotype on the number of its progenies reaching the UYT resulted in a negligible coefficient of determination (r² = 0.05). Breeding value (e.g., average SIN) at SRT was not efficient at predicting which genotypes were more likely to reach the UYT stage. The number of families and progenies derived from a given progenitor was more efficient at predicting the probability of the progeny from a given parent reaching the UYT stage. Large within-family genetic variation tends to mask the true breeding value of each progenitor. The use of partially inbred progenitors (e.g., S1 or S2 genotypes) would reduce the within-family genetic variation, thus making the assessment of breeding value more accurate. Moreover, partial inbreeding of progenitors can improve the breeding value of the original (S0) parental material and sharply accelerate genetic gains. For instance, homozygous S1 genotypes for the dominant resistance to cassava mosaic disease (CMD) could be generated and selected. All gametes from these selected S1 genotypes would carry the desirable allele and
Sebastiani, Paola; Solovieff, Nadia
The availability of high throughput technology for parallel genotyping has opened the field of genetics to genome-wide association studies (GWAS). These studies generate massive amount of genetic data that challenge investigators with issues related to data management, statistical analysis of large data sets, visualization, and annotation of results. We will review the common approach to analysis of GWAS data and then discuss options to learn more from these data.
Background: Combining information from different studies is an important and useful practice in bioinformatics, including genome-wide association studies, rare variant data analysis and other set-based analyses. Many statistical methods have been proposed to combine p-values from independent studies. However, it is known that there is no uniformly most powerful test under all conditions; therefore, finding a powerful test for a specific situation is important and desirable. Results: In this paper, we propose a new statistical approach to combining p-values based on the gamma distribution, which uses the inverse of the p-value as the shape parameter in the gamma distribution. Conclusions: A simulation study and a real data application demonstrate that the proposed method performs well in some situations. PMID:25559433
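For orientation, the sketch below implements Fisher's method, a classical baseline for combining p-values from independent studies. It is not the gamma-distribution approach proposed in the abstract above. The function name `fisher_combined_pvalue` is an illustrative choice, and the closed-form chi-square tail used here holds only because the degrees of freedom (2k) are even.

```python
import math


def fisher_combined_pvalue(pvalues):
    """Combine independent p-values with Fisher's method.

    The statistic X = -2 * sum(ln p_i) follows a chi-square distribution with
    2k degrees of freedom under the null, where k is the number of p-values.
    For even degrees of freedom the upper tail has the closed form
    exp(-x/2) * sum_{i=0}^{k-1} (x/2)**i / i!.
    """
    k = len(pvalues)
    if k == 0:
        raise ValueError("at least one p-value is required")
    if any(not (0.0 < p <= 1.0) for p in pvalues):
        raise ValueError("p-values must lie in (0, 1]")
    half_stat = -sum(math.log(p) for p in pvalues)   # this is X / 2
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half_stat / i                        # (x/2)**i / i!
        total += term
    return math.exp(-half_stat) * total
```

With a single input the function returns that p-value unchanged, which is a convenient sanity check. Because no combination rule is uniformly most powerful, a baseline like this is mainly useful as a reference point when comparing alternatives.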
Deng, Li Xin; He, Cong
In this study, the complete mitochondrial genome sequence of the Tibetan Mastiff is reported. The total length of the mitogenome is 16,729 bp. It has the typical structure, comprising 13 protein-coding genes, 22 transfer RNA genes, 2 ribosomal RNA genes and 1 control region, in line with other canines. We further identified genome-wide variations among different canine mitochondrial genomes and found that the D-loop region harbors the most sequence variation. These data provide sequence variation information for the protection and utilization of the Tibetan Mastiff germplasm resource.
Li, Yao; Li, Jialian; Fang, Chengchi; Shi, Liang; Tan, Jiajian; Xiong, Yuanzhu; Bin Fan; Li, Changchun
Documented evidence has shown that small RNAs (sRNAs) and their target genes are involved in mammalian testicular development and spermatogenesis. However, their detailed molecular regulatory mechanisms remain largely unknown. In this study, we obtained a total of 10,716 mRNAs, 67 miRNAs and 16,953 piRNAs that were differentially expressed between LC and LW pig breeds or between the two sexual maturity stages. Of these, we identified 16 miRNAs and 28 target genes possibly related to spermatogenesis, and 14 miRNAs and 18 target genes probably associated with cell adhesion-related testis development. We also annotated 579 piRNAs that could potentially regulate cell death, nucleosome organization and other basic biological processes, which implies that those piRNAs might be involved in differences in sexual maturation. The integrated network analysis suggested that some differentially expressed genes are involved in spermatogenesis through the ECM–receptor interaction, focal adhesion, Wnt and PI3K–Akt signaling pathways, that some particular miRNAs have negative regulatory roles, and that some piRNAs have both positive and negative regulatory roles in testicular development. Our data provide novel insights into the similarities and differences in molecular expression and regulation of spermatogenesis and testicular development in different pig breeds at different stages of sexual maturity. PMID:27229484
Přibyl, J; Bauer, J; Čermák, V; Pešek, P; Přibylová, J; Šplíchal, J; Vostrá-Vydrová, H; Vostrý, L; Zavadilová, L
Estimated breeding values (EBVs) and genomic enhanced breeding values (GEBVs) for milk production of young genotyped Holstein bulls were predicted using a conventional BLUP - Animal Model, a method fitting regression coefficients for loci (RRBLUP), a method utilizing the realized genomic relationship matrix (GBLUP), by a single-step procedure (ssGBLUP) and by a one-step blending procedure. Information sources for prediction were the nation-wide database of domestic Czech production records in the first lactation combined with deregressed proofs (DRP) from Interbull files (August 2013) and domestic test-day (TD) records for the first three lactations. Data from 2627 genotyped bulls were used, of which 2189 were already proven under domestic conditions. Analyses were run that used Interbull values for genotyped bulls only or that used Interbull values for all available sires. Resultant predictions were compared with GEBV of 96 young foreign bulls evaluated abroad and whose proofs were from Interbull method GMACE (August 2013) on the Czech scale. Correlations of predictions with GMACE values of foreign bulls ranged from 0.33 to 0.75. Combining domestic data with Interbull EBVs improved prediction of both EBV and GEBV. Predictions by Animal Model (traditional EBV) using only domestic first lactation records and GMACE values were correlated by only 0.33. Combining the nation-wide domestic database with all available DRP for genotyped and un-genotyped sires from Interbull resulted in an EBV correlation of 0.60, compared with 0.47 when only Interbull data were used. In all cases, GEBVs had higher correlations than traditional EBVs, and the highest correlations were for predictions from the ssGBLUP procedure using combined data (0.75), or with all available DRP from Interbull records only (one-step blending approach, 0.69). The ssGBLUP predictions using the first three domestic lactation records in the TD model were correlated with GMACE predictions by 0.69, 0.64 and 0
Iwata, Hiroyoshi; Hayashi, Takeshi; Terakami, Shingo; Takada, Norio; Sawamura, Yutaka; Yamamoto, Toshiya
Although the potential of marker-assisted selection (MAS) in fruit tree breeding has been reported, the need for bi-parental QTL mapping before MAS has hindered the introduction of MAS to fruit tree breeding programs. Genome-wide association studies (GWAS) are an alternative to bi-parental QTL mapping in long-lived perennials. Selection based on genomic predictions of breeding values (genomic selection: GS) is another alternative to MAS. This study examined the potential of GWAS and GS in pear breeding with 76 Japanese pear cultivars to detect significant associations of 162 markers with nine agronomic traits. We applied multilocus Bayesian models accounting for ordinal categorical phenotypes for GWAS and GS model training. Significant associations were detected for harvest time, black spot resistance and the number of spurs, and two of the associations were closely linked to known loci. Genome-wide predictions for GS were most accurate (0.75) for harvest time, moderately accurate (0.38-0.61) for resistance to black spot, firmness of flesh, fruit shape in longitudinal section, fruit size, acid content and number of spurs, and of low accuracy (<0.2) for soluble solid content and vigor of tree. Results suggest the potential of GWAS and GS for use in future breeding programs in Japanese pear.
Gottschalk, Maren; Metzger, Julia; Martinsson, Gunilla; Sieme, Harald; Distl, Ottmar
We performed a genome-wide association study for semen quality traits in 139 German Warmblood stallions. Stallions were genotyped using the Illumina equine SNP50 Beadchip. Traits analysed were de-regressed estimated breeding values (EBVs) for gel-free volume, sperm concentration, total number of sperm, progressive motility and the total number of progressively motile sperm. The GWAS revealed 29 SNPs on 12 different chromosomes as genome-wide significantly associated with semen quality traits. For ten genomic regions we could retrieve candidate genes influencing stallion fertility. Among the candidate genes, we could find the genes encoding cysteine-rich secretory proteins (CRISP1, CRISP2 and CRISP3). This was the first GWAS in horses performed for semen quality traits.
Shin, Donghyun; Lee, Chul; Park, Kyoung-Do; Kim, Heebal; Cho, Kwang-hyeon
Objective Holsteins are known as the world’s highest milk-producing dairy cattle. The purpose of this study was to identify genetic regions strongly associated with milk traits (milk production, fat, and protein) using Korean Holstein data. Methods This study was performed using single nucleotide polymorphism (SNP) chip data (Illumina BovineSNP50 Beadchip) of 911 Korean Holstein individuals. We inferred genomic estimated breeding values based on best linear unbiased prediction (BLUP) and ridge regression using BLUPF90 and R. We then performed a genome-wide association study and identified genetic regions related to milk traits. Results We identified 9, 6, and 17 significant genetic regions related to milk production, fat and protein, respectively. These genes are newly reported as associated with milk traits in Holsteins. Conclusion This study complements recent Holstein genome-wide association studies that identified other SNPs and genes as the most significant variants. These results will help to expand the knowledge of the polygenic nature of milk production in Holsteins. PMID:26954162
Rolf, M M; Taylor, J F; Schnabel, R D; McKay, S D; McClure, M C; Northcutt, S L; Kerley, M S; Weaber, R L
Estimated breeding values for average daily feed intake (AFI; kg/day), residual feed intake (RFI; kg/day) and average daily gain (ADG; kg/day) were generated using a mixed linear model incorporating genomic relationships for 698 Angus steers genotyped with the Illumina BovineSNP50 assay. Association analyses of estimated breeding values (EBVs) were performed for 41 028 single nucleotide polymorphisms (SNPs), and permutation analysis was used to empirically establish the genome-wide significance threshold (P < 0.05) for each trait. SNPs significantly associated with each trait were used in a forward selection algorithm to identify genomic regions putatively harbouring genes with effects on each trait. A total of 53, 66 and 68 SNPs explained 54.12% (24.10%), 62.69% (29.85%) and 55.13% (26.54%) of the additive genetic variation (when accounting for the genomic relationships) in steer breeding values for AFI, RFI and ADG, respectively, within this population. Evaluation by pathway analysis revealed that many of these SNPs are in genomic regions that harbour genes with metabolic functions. The presence of genetic correlations between traits resulted in 13.2% of SNPs selected for AFI and 4.5% of SNPs selected for RFI also being selected for ADG in the analysis of breeding values. While our study identifies panels of SNPs significant for efficiency traits in our population, validation of all SNPs in independent populations will be necessary before commercialization. PMID:22497295
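One common way to set an empirical genome-wide threshold by permutation is the max-statistic approach, sketched below. This is an assumption about the general technique, not a reconstruction of the analysis in the abstract above: the test statistic (absolute Pearson correlation), the function names and the default parameters are all illustrative.

```python
import math
import random


def abs_correlation(x, y):
    """Absolute Pearson correlation; returns 0.0 if either input is constant."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return abs(sxy) / math.sqrt(sxx * syy)


def permutation_threshold(genotypes, trait, n_perm=1000, alpha=0.05, seed=0):
    """Empirical genome-wide threshold from the null distribution of the
    maximum statistic across all SNPs (hypothetical helper).

    genotypes: list of per-SNP genotype vectors (one value per animal)
    trait:     list of trait scores (one value per animal)
    """
    rng = random.Random(seed)
    shuffled = list(trait)
    max_stats = []
    for _ in range(n_perm):
        # Shuffling the trait breaks any SNP-trait association while
        # preserving the correlation structure among SNPs.
        rng.shuffle(shuffled)
        max_stats.append(max(abs_correlation(snp, shuffled) for snp in genotypes))
    max_stats.sort()
    # Take the empirical (1 - alpha) quantile of the maximum statistic.
    index = math.ceil((1 - alpha) * n_perm) - 1
    index = max(0, min(n_perm - 1, index))
    return max_stats[index]
```

A SNP whose observed statistic exceeds the returned value would be declared significant at the chosen genome-wide level. Using the maximum over all SNPs in each permutation is what makes the threshold genome-wide rather than per-SNP.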
As high-throughput genetic marker screening systems are essential for a range of genetics studies and plant breeding applications, the International RosBREED SNP Consortium (IRSC) has utilized the Illumina Infinium® II system to develop a medium- to high-throughput SNP screening tool for genome-wide...
Silva, Vinicius Henrique da; Regitano, Luciana Correia de Almeida; Geistlinger, Ludwig; Pértille, Fábio; Giachetto, Poliana Fernanda; Brassaloti, Ricardo Augusto; Morosini, Natália Silva; Zimmer, Ralf; Coutinho, Luiz Lehmann
Brazil is one of the largest beef producers and exporters in the world with the Nelore breed representing the vast majority of Brazilian cattle (Bos taurus indicus). Despite the great adaptability of the Nelore breed to tropical climate, meat tenderness (MT) remains to be improved. Several factors including genetic composition can influence MT. In this article, we report a genome-wide analysis of copy number variation (CNV) inferred from Illumina® High Density SNP-chip data for a Nelore population of 723 males. We detected >2,600 CNV regions (CNVRs) representing ≈6.5% of the genome. Comparing our results with previous studies revealed an overlap in ≈1400 CNVRs (>50%). A total of 1,155 CNVRs (43.6%) overlapped 2,750 genes. They were enriched for processes involving guanosine triphosphate (GTP), previously reported to influence skeletal muscle physiology and morphology. Nelore CNVRs also overlapped QTLs for MT reported in other breeds (8.9%, 236 CNVRs) and from a previous study with this population (4.1%, 109 CNVRs). Two CNVRs were also proximal to glutathione metabolism genes that were previously associated with MT. Genome-wide association study of CN state with estimated breeding values derived from meat shear force identified 6 regions, including a region on BTA3 that contains genes of the cAMP and cGMP pathway. Ten CNVRs that overlapped regions associated with MT were successfully validated by qPCR. Our results represent the first comprehensive CNV study in Bos taurus indicus cattle and identify regions in which copy number changes are potentially of importance for the MT phenotype.
PMID:27348523
Gaouar, S B S; Lafri, M; Djaout, A; El-Bouyahiaoui, R; Bouri, A; Bouchatal, A; Maftah, A; Ciani, E; Da Silva, A B
Algeria represents a reservoir of genetic diversity with local sheep breeds adapted to a large range of environments and showing specific features necessary to deal with harsh conditions. This remarkable diversity results from the traditional management of dryland by pastoralists over centuries. Most of these breeds are poorly productive, and economic pressure leads farmers to carry out anarchic cross-breeding (that is, not carried out in the framework of selection plans) in the hope of improving animal conformation. In this study, eight of the nine local Algerian sheep breeds (D'men, Hamra, Ouled-Djellal, Rembi, Sidaoun, Tazegzawt, Berber and Barbarine) were investigated for the first time by genome-wide single-nucleotide polymorphism genotyping. At an international scale, Algerian sheep occupied an original position shaped by relations with African and European (particularly Italian) breeds. The strong genetic proximity with Caribbean and Brazilian breeds confirmed that the genetic make-up of these American breeds was largely influenced by the Atlantic slave trade. At a national scale, an alarming genetic dilution of the Berber (a primitive breed) and the Rembi was observed, as a consequence of uncontrolled mating practices with Ouled-Djellal. A similar, though less pronounced, phenomenon was also detected for the Barbarine, another ancestral breed. Genetic originality appeared to be better preserved in Tazegzawt, Hamra, D'men and Sidaoun. These breeds should be given high priority in the establishment of conservation plans to halt their progressive loss. For Berber and Barbarine, which also occur in neighboring countries, urgent concerted transnational actions are needed.
Weber, K L; Thallman, R M; Keele, J W; Snelling, W M; Bennett, G L; Smith, T P L; McDaneld, T G; Allan, M F; Van Eenennaam, A L; Kuehn, L A
Genomic selection involves the assessment of genetic merit through prediction equations that allocate genetic variation with dense marker genotypes. It has the potential to provide accurate breeding values for selection candidates at an early age and facilitate selection for expensive or difficult to measure traits. Accurate across-breed prediction would allow genomic selection to be applied on a larger scale in the beef industry, but the limited availability of large populations for the development of prediction equations has delayed researchers from providing genomic predictions that are accurate across multiple beef breeds. In this study, the accuracy of genomic predictions for 6 growth and carcass traits was derived and evaluated using 2 multibreed beef cattle populations: 3,358 crossbred cattle of the U.S. Meat Animal Research Center Germplasm Evaluation Program (USMARC_GPE) and 1,834 high accuracy bull sires of the 2,000 Bull Project (2000_BULL) representing influential breeds in the U.S. beef cattle industry. The 2000_BULL EPD were deregressed, scaled, and weighted to adjust for between- and within-breed heterogeneous variance before use in training and validation. Molecular breeding values (MBV) trained in each multibreed population and in Angus and Hereford purebred sires of 2000_BULL were derived using the GenSel BayesCπ function (Fernando and Garrick, 2009) and cross-validated. Fewer than 10% of large effect loci were shared between prediction equations trained on USMARC_GPE relative to 2000_BULL, although locus effects were moderately to highly correlated for most traits and the traits themselves were highly correlated between populations. Prediction of MBV accuracy was low and variable between populations. For growth traits, MBV accounted for up to 18% of genetic variation in a pooled, multibreed analysis and up to 28% in single breeds. For carcass traits, MBV explained up to 8% of genetic variation in a pooled, multibreed analysis and up to 42% in
Petersen, Jessica L.; Mickelson, James R.; Cothran, E. Gus; Andersson, Lisa S.; Axelsson, Jeanette; Bailey, Ernie; Bannasch, Danika; Binns, Matthew M.; Borges, Alexandre S.; Brama, Pieter; da Câmara Machado, Artur; Distl, Ottmar; Felicetti, Michela; Fox-Clipsham, Laura; Graves, Kathryn T.; Guérin, Gérard; Haase, Bianca; Hasegawa, Telhisa; Hemmann, Karin; Hill, Emmeline W.; Leeb, Tosso; Lindgren, Gabriella; Lohi, Hannes; Lopes, Maria Susana; McGivney, Beatrice A.; Mikko, Sofia; Orr, Nicholas; Penedo, M. Cecilia T; Piercy, Richard J.; Raekallio, Marja; Rieder, Stefan; Røed, Knut H.; Silvestrelli, Maurizio; Swinburne, June; Tozaki, Teruaki; Vaudin, Mark; M. Wade, Claire; McCue, Molly E.
Horses were domesticated from the Eurasian steppes 5,000–6,000 years ago. Since then, the use of horses for transportation, warfare, and agriculture, as well as selection for desired traits and fitness, has resulted in diverse populations distributed across the world, many of which have become or are in the process of becoming formally organized into closed, breeding populations (breeds). This report describes the use of a genome-wide set of autosomal SNPs and 814 horses from 36 breeds to provide the first detailed description of equine breed diversity. FST calculations, parsimony, and distance analysis demonstrated relationships among the breeds that largely reflect geographic origins and known breed histories. Low levels of population divergence were observed between breeds that are relatively early on in the process of breed development, and between those with high levels of within-breed diversity, whether due to large population size, ongoing outcrossing, or large within-breed phenotypic diversity. Populations with low within-breed diversity included those which have experienced population bottlenecks, have been under intense selective pressure, or are closed populations with long breed histories. These results provide new insights into the relationships among and the diversity within breeds of horses. In addition these results will facilitate future genome-wide association studies and investigations into genomic targets of selection. PMID:23383025
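As a rough illustration of what an FST calculation measures, the sketch below computes a simple per-locus FST for a biallelic SNP from subpopulation allele frequencies, using the heterozygosity form FST = (HT - HS) / HT with equal weight per population. The estimator actually used in the study above is not stated in the abstract, so this is an assumed textbook version, and `fst_biallelic` is a hypothetical name.

```python
def fst_biallelic(freqs):
    """Per-locus F_ST for one biallelic SNP.

    freqs: allele frequency of the same allele in each subpopulation
           (equal weights; no sample-size correction is applied).
    """
    if len(freqs) < 2:
        raise ValueError("need at least two subpopulations")
    if any(not (0.0 <= p <= 1.0) for p in freqs):
        raise ValueError("allele frequencies must lie in [0, 1]")
    p_bar = sum(freqs) / len(freqs)
    # Expected heterozygosity in the pooled (total) population.
    h_total = 2.0 * p_bar * (1.0 - p_bar)
    if h_total == 0.0:
        return 0.0  # locus is monomorphic everywhere: no differentiation
    # Mean expected heterozygosity within subpopulations.
    h_within = sum(2.0 * p * (1.0 - p) for p in freqs) / len(freqs)
    return (h_total - h_within) / h_total
```

Identical frequencies in all breeds give 0, and fixation of alternative alleles in two breeds gives 1, which matches the interpretation of low divergence between recently formed breeds and high divergence between long-closed populations.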
Quilez, Javier; Martínez, Verónica; Woolliams, John A.; Sanchez, Armand; Pong-Wong, Ricardo; Kennedy, Lorna J.; Quinnell, Rupert J.; Ollier, William E. R.; Roura, Xavier; Ferrer, Lluís; Altet, Laura; Francino, Olga
Background The current disease model for leishmaniasis suggests that only a proportion of infected individuals develop clinical disease, while others are asymptomatically infected due to immune control of infection. The factors that determine whether individuals progress to clinical disease following Leishmania infection are unclear, although previous studies suggest a role for host genetics. Our hypothesis was that canine leishmaniasis is a complex disease with multiple loci responsible for the progression of the disease from Leishmania infection. Methodology/Principal Findings Genome-wide association and genomic selection approaches were applied to a population-based case-control dataset of 219 dogs from a single breed (Boxer) genotyped for ∼170,000 SNPs. Firstly, we aimed to identify individual disease loci; secondly, we quantified the genetic component of the observed phenotypic variance; and thirdly, we tested whether genome-wide SNP data could accurately predict the disease. Conclusions/Significance We estimated that a substantial proportion of the genome is affecting the trait and that its heritability could be as high as 60%. Using the genome-wide association approach, the strongest associations were on chromosomes 1, 4 and 20, although none of these were statistically significant at a genome-wide level and after correcting for genetic stratification and lifestyle. Amongst these associations, chromosome 4: 61.2–76.9 Mb maps to a locus that has previously been associated with host susceptibility to human and murine leishmaniasis, and genomic selection estimated markers in this region to have the greatest effect on the phenotype. We therefore propose these regions as candidates for replication studies. An important finding of this study was the significant predictive value from using the genomic information. We found that the phenotype could be predicted with an accuracy of ∼0.29 in new samples and that the affection status was correctly predicted in 60
Wineinger, Nathan E.; Fu, Dong-Jing; Libiger, Ondrej; Alphs, Larry; Savitz, Adam; Gopal, Srihari; Cohen, Nadine; Schork, Nicholas J.
Objective Clinical response to the atypical antipsychotic paliperidone is known to vary among schizophrenic patients. We carried out a genome-wide association study to identify common genetic variants predictive of paliperidone efficacy. Methods We leveraged a collection of 1390 samples from individuals of European ancestry enrolled in 12 clinical studies investigating the efficacy of the extended-release tablet paliperidone ER (n₁=490) and the once-monthly injection paliperidone palmitate (n₂=550 and n₃=350). We carried out a genome-wide association study using a general linear model (GLM) analysis on three separate cohorts, followed by meta-analysis and using a mixed linear model analysis on all samples. The variations in response explained by each single nucleotide polymorphism (h²SNP) were estimated. Results No SNP passed genome-wide significance in the GLM-based analyses with suggestive signals from rs56240334 [P=7.97×10⁻⁸ for change in the Clinical Global Impression Scale-Severity (CGI-S); P=8.72×10⁻⁷ for change in the total Positive and Negative Syndrome Scale (PANSS)] in the intron of ADCK1. The mixed linear model-based association P-values for rs56240334 were consistent with the results from GLM-based analyses and the association with change in CGI-S (P=4.26×10⁻⁸) reached genome-wide significance (i.e. P<5×10⁻⁸). We also found suggestive evidence for a polygenic contribution toward paliperidone treatment response with estimates of heritability, h²SNP, ranging from 0.31 to 0.43 for change in the total PANSS score, the PANSS positive Marder factor score, and CGI-S. Conclusion Genetic variations in the ADCK1 gene may differentially predict paliperidone efficacy in schizophrenic patients. However, this finding should be replicated in additional samples. PMID:27846195
A limitation of many genome-wide association studies (GWA) in animal breeding is that there are many loci with small effect sizes; thus, larger sample sizes (N) are required to guarantee suitable power of detection. To increase sample size, results from different GWA can be combined in a meta-analys...
A limitation of many genome-wide association studies (GWA) in animal breeding is that there are many loci with small effect sizes; thus, larger sample sizes (N) are required to guarantee suitable power of detection. For increasing N, results from different GWA can be combined in a meta-analysis (MA-...
Rodríguez-Ramilo, Silvia Teresa; Fernández, Jesús; Toro, Miguel Angel; Hernández, Delfino; Villanueva, Beatriz
Estimates of effective population size in the Holstein cattle breed have usually been low despite the large number of animals that constitute this breed. Effective population size is inversely related to the rates at which coancestry and inbreeding increase, and these rates have been high as a consequence of intense and accurate selection. Traditionally, coancestry and inbreeding coefficients have been calculated from pedigree data. However, the development of genome-wide single nucleotide polymorphisms has increased interest in calculating these coefficients from molecular data in order to improve their accuracy. In this study, genomic estimates of coancestry, inbreeding and effective population size were obtained in the Spanish Holstein population and then compared with pedigree-based estimates. A total of 11,135 animals genotyped with the Illumina BovineSNP50 BeadChip were available for the study. After applying filtering criteria, the final genomic dataset included 36,693 autosomal SNPs and 10,569 animals. Pedigree data from those genotyped animals included 31,203 animals. These individuals represented only the last five generations in order to homogenise the amount of pedigree information across animals. Genomic estimates of coancestry and inbreeding were obtained from identity by descent segments (coancestry) or runs of homozygosity (inbreeding). The results indicate that the percentage of variance of pedigree-based coancestry estimates explained by genomic coancestry estimates was higher than that for inbreeding. Estimates of effective population size obtained from genome-wide and pedigree information were consistent and ranged from about 66 to 79. These low values emphasize the need to control the rate of increase of coancestry and inbreeding in Holstein selection programmes.
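The inverse relationship mentioned above is usually written as Ne = 1 / (2ΔF), where ΔF is the per-generation rate of increase in inbreeding (or coancestry). The sketch below applies this standard textbook formula; it is an illustration only, not the estimation procedure of the study, and the function names are assumptions.

```python
def rate_of_increase(f_previous, f_current):
    """Per-generation rate of increase in inbreeding (or coancestry):
    delta_F = (F_t - F_{t-1}) / (1 - F_{t-1})."""
    if not (0.0 <= f_previous < 1.0):
        raise ValueError("previous coefficient must lie in [0, 1)")
    return (f_current - f_previous) / (1.0 - f_previous)


def effective_population_size(delta_f):
    """Effective population size from the rate of increase: Ne = 1 / (2 * delta_F)."""
    if delta_f <= 0.0:
        raise ValueError("rate of increase must be positive")
    return 1.0 / (2.0 * delta_f)
```

Under this formula, an effective population size of about 66 to 79 corresponds to a rate of increase of roughly 0.6% to 0.8% per generation.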
Spindel, J E; Begum, H; Akdemir, D; Collard, B; Redoña, E; Jannink, J-L; McCouch, S
To address the multiple challenges to food security posed by global climate change, population growth and rising incomes, plant breeders are developing new crop varieties that can enhance both agricultural productivity and environmental sustainability. Current breeding practices, however, are unable to keep pace with demand. Genomic selection (GS) is a new technique that helps accelerate the rate of genetic gain in breeding by using whole-genome data to predict the breeding value of offspring. Here, we describe a new GS model that combines RR-BLUP with markers fit as fixed effects selected from the results of a genome-wide-association study (GWAS) on the RR-BLUP training data. We term this model GS + de novo GWAS. In a breeding population of tropical rice, GS + de novo GWAS outperformed six other models for a variety of traits and in multiple environments. On the basis of these results, we propose an extended, two-part breeding design that can be used to efficiently integrate novel variation into elite breeding populations, thus expanding genetic diversity and enhancing the potential for sustainable productivity gains. PMID:26860200
Genomic selection & association mapping in rice: effect of trait genetic architecture, training population composition, marker number & statistical model on accuracy of rice genomic selection in elite, tropical rice breeding
Genomic Selection (GS) is a new breeding method in which genome-wide markers are used to predict the breeding value of individuals in a breeding population. GS has been shown to improve breeding efficiency in dairy cattle and several crop plant species, and here we evaluate for the first time its ef...
Yong, Wai-Shin; Hsu, Fei-Man; Chen, Pao-Yang
DNA methylation is an epigenetic modification that plays an important role in regulating gene expression and therefore a broad range of biological processes and diseases. DNA methylation is tissue-specific, dynamic, sequence-context-dependent and trans-generationally heritable, and these complex patterns of methylation highlight the significance of profiling DNA methylation to answer biological questions. In this review, we surveyed major methylation assays, along with comparisons and biological examples, to provide an overview of DNA methylation profiling techniques. The advances in microarray and sequencing technologies make genome-wide profiling possible at a single-nucleotide or even a single-cell resolution. These profiling approaches vary in many aspects, such as DNA input, resolution, genomic region coverage, and bioinformatics analysis, and selecting a feasible method requires knowledge of these methods. We first introduce the biological background of DNA methylation and its pattern in plants, animals and fungi. We present an overview of major experimental approaches to profiling genome-wide DNA methylation and hydroxymethylation and then extend to the single-cell methylome. To evaluate these methods, we outline their strengths and weaknesses and perform comparisons across the different platforms. Due to the increasing need to compute high-throughput epigenomic data, we interrogate the computational pipeline for bisulfite sequencing data and also discuss the concept of identifying differentially methylated regions (DMRs). This review summarizes the experimental and computational concepts for profiling genome-wide DNA methylation, followed by biological examples. Overall, this review provides researchers useful guidance for the selection of a profiling method suited to specific research questions.
Duan, Jubao; Sanders, Alan R.; Gejman, Pablo V.
Schizophrenia (SZ) is a common and severe psychiatric disorder with both environmental and genetic risk factors, and a high heritability. After over 20 years of molecular genetics research, new molecular strategies, primarily genome-wide association studies (GWAS), have generated major tangible progress. These new data provide evidence for: 1) A number of chromosomal regions with common polymorphisms showing genome-wide association with SZ (the major histocompatibility complex, MHC, region at 6p22-p21; 18q21.2; and 2q32.1). The associated alleles present small odds ratios (the odds of a risk variant being present in cases versus controls) and suggest causative involvement of gene regulatory mechanisms in SZ. 2) Polygenic inheritance. 3) Involvement of rare (<1%) and large (>100kb) copy number variants (CNVs). 4) A genetic overlap of SZ with autism and with bipolar disorder (BP) challenging the classical clinical classifications. Most new SZ findings (chromosomal regions and genes) have generated new biological leads. These new findings, however, still need to be translated into a better understanding of the underlying biology and into causal mechanisms. Furthermore, a considerable amount of heritability still remains unexplained (missing heritability). Deep resequencing for rare variants and systems biology approaches (e.g., integrating DNA sequence and functional data) are expected to further improve our understanding of the genetic architecture of SZ and its underlying biology. PMID:20433910
Jeong, Seok Won; Chung, Myungguen; Park, Soo-Jung; Cho, Seong Beom
Metabolic syndrome (METS) is a disorder of energy utilization and storage and increases the risk of developing cardiovascular disease and diabetes. To identify the genetic risk factors of METS, we carried out a genome-wide association study (GWAS) for 2,657 cases and 5,917 controls in Korean populations. As a result, we could identify 2 single nucleotide polymorphisms (SNPs) with genome-wide significance level p-values (<5 × 10⁻⁸), 8 SNPs with genome-wide suggestive p-values (5 × 10⁻⁸ ≤ p < 1 × 10⁻⁵), and 2 SNPs of more functional variants with borderline p-values (5 × 10⁻⁵ ≤ p < 1 × 10⁻⁴). On the other hand, the multiple correction criteria of conventional GWASs exclude false-positive loci, but simultaneously, they discard many true-positive loci. To reconsider the discarded true-positive loci, we attempted to include the functional variants (nonsynonymous SNPs [nsSNPs] and expression quantitative trait loci [eQTL]) among the top 5,000 SNPs based on the proportion of phenotypic variance explained by genotypic variance. In total, 159 eQTLs and 18 nsSNPs were present among the top 5,000 SNPs. Although they should be replicated in other independent populations, 6 eQTLs and 2 nsSNP loci were located in the molecular pathways of LPL, APOA5, and CHRM2, which were the significant or suggestive loci in the METS GWAS. In conclusion, our approach using the conventional GWAS, reconsidering functional variants and pathway-based interpretation, suggests a useful method to understand the GWAS results of complex traits and can be extended to other genome-wide association studies. PMID:25705157
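The p-value bands quoted in this abstract can be expressed as a small classifier, sketched below. The cut-offs are taken from the abstract as written; the function name, the labels and the handling of p-values that fall outside the three quoted bands (for example between 1 × 10⁻⁵ and 5 × 10⁻⁵) are illustrative assumptions.

```python
GENOME_WIDE = 5e-8        # p < 5e-8: genome-wide significant
SUGGESTIVE_UPPER = 1e-5   # 5e-8 <= p < 1e-5: genome-wide suggestive
BORDERLINE_LOWER = 5e-5   # 5e-5 <= p < 1e-4: borderline
BORDERLINE_UPPER = 1e-4


def classify_pvalue(p):
    """Assign a SNP p-value to one of the bands quoted in the abstract."""
    if not (0.0 <= p <= 1.0):
        raise ValueError("p-value must lie in [0, 1]")
    if p < GENOME_WIDE:
        return "genome-wide significant"
    if p < SUGGESTIVE_UPPER:
        return "suggestive"
    if BORDERLINE_LOWER <= p < BORDERLINE_UPPER:
        return "borderline"
    # Assumption: anything outside the quoted bands is left unclassified.
    return "unclassified"
```

Keeping the thresholds as named constants makes it easy to see that the quoted bands are half-open intervals, so a p-value of exactly 5 × 10⁻⁸ counts as suggestive rather than genome-wide significant.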
Stewart, S Evelyn; Yu, Dongmei; Scharf, Jeremiah M; Neale, Benjamin M; Fagerness, Jesen A; Mathews, Carol A; Arnold, Paul D; Evans, Patrick D; Gamazon, Eric R; Osiecki, Lisa; McGrath, Lauren; Haddad, Stephen; Crane, Jacquelyn; Hezel, Dianne; Illman, Cornelia; Mayerfeld, Catherine; Konkashbaev, Anuar; Liu, Chunyu; Pluzhnikov, Anna; Tikhomirov, Anna; Edlund, Christopher K; Rauch, Scott L; Moessner, Rainald; Falkai, Peter; Maier, Wolfgang; Ruhrmann, Stephan; Grabe, Hans-Jörgen; Lennertz, Leonard; Wagner, Michael; Bellodi, Laura; Cavallini, Maria Cristina; Richter, Margaret A; Cook, Edwin H; Kennedy, James L; Rosenberg, David; Stein, Dan J; Hemmings, Sian MJ; Lochner, Christine; Azzam, Amin; Chavira, Denise A; Fournier, Eduardo; Garrido, Helena; Sheppard, Brooke; Umaña, Paul; Murphy, Dennis L; Wendland, Jens R; Veenstra-VanderWeele, Jeremy; Denys, Damiaan; Blom, Rianne; Deforce, Dieter; Van Nieuwerburgh, Filip; Westenberg, Herman GM; Walitza, Susanne; Egberts, Karin; Renner, Tobias; Miguel, Euripedes Constantino; Cappi, Carolina; Hounie, Ana G; Conceição do Rosário, Maria; Sampaio, Aline S; Vallada, Homero; Nicolini, Humberto; Lanzagorta, Nuria; Camarena, Beatriz; Delorme, Richard; Leboyer, Marion; Pato, Carlos N; Pato, Michele T; Voyiaziakis, Emanuel; Heutink, Peter; Cath, Danielle C; Posthuma, Danielle; Smit, Jan H; Samuels, Jack; Bienvenu, O Joseph; Cullen, Bernadette; Fyer, Abby J; Grados, Marco A; Greenberg, Benjamin D; McCracken, James T; Riddle, Mark A; Wang, Ying; Coric, Vladimir; Leckman, James F; Bloch, Michael; Pittenger, Christopher; Eapen, Valsamma; Black, Donald W; Ophoff, Roel A; Strengman, Eric; Cusi, Daniele; Turiel, Maurizio; Frau, Francesca; Macciardi, Fabio; Gibbs, J Raphael; Cookson, Mark R; Singleton, Andrew; Hardy, John; Crenshaw, Andrew T; Parkin, Melissa A; Mirel, Daniel B; Conti, David V; Purcell, Shaun; Nestadt, Gerald; Hanna, Gregory L; Jenike, Michael A; Knowles, James A; Cox, Nancy; Pauls, David L
Obsessive-compulsive disorder (OCD) is a common, debilitating neuropsychiatric illness with complex genetic etiology. The International OCD Foundation Genetics Collaborative (IOCDF-GC) is a multi-national collaboration established to discover the genetic variation predisposing to OCD. A set of individuals affected with DSM-IV OCD, a subset of their parents, and unselected controls, were genotyped with several different Illumina SNP microarrays. After extensive data cleaning, 1,465 cases, 5,557 ancestry-matched controls and 400 complete trios remained, with a common set of 469,410 autosomal and 9,657 X-chromosome SNPs. Ancestry-stratified case-control association analyses were conducted for three genetically-defined subpopulations and combined in two meta-analyses, with and without the trio-based analysis. In the case-control analysis, the lowest two p-values were located within DLGAP1 (p=2.49×10^-6 and p=3.44×10^-6), a member of the neuronal postsynaptic density complex. In the trio analysis, rs6131295, near BTBD3, exceeded the genome-wide significance threshold with a p-value=3.84×10^-8. However, when trios were meta-analyzed with the combined case-control samples, the p-value for this variant was 3.62×10^-5, losing genome-wide significance. Although no SNPs were identified to be associated with OCD at a genome-wide significant level in the combined trio-case-control sample, a significant enrichment of methylation-QTLs (p<0.001) and frontal lobe eQTLs (p=0.001) was observed within the top-ranked SNPs (p<0.01) from the trio-case-control analysis, suggesting these top signals may have a broad role in gene expression in the brain, and possibly in the etiology of OCD. PMID:22889921
Aschebrook-Kilfoy, Briseis; Argos, Maria; Pierce, Brandon L; Tong, Lin; Jasmine, Farzana; Roy, Shantanu; Parvez, Faruque; Ahmed, Alauddin; Islam, Tariqul; Kibriya, Muhammad G; Ahsan, Habibul
Human fertility is a complex trait determined by gene-environment interactions in which genetic factors represent a significant component. To better understand inter-individual variability in fertility, we performed one of the first genome-wide association studies (GWAS) of common fertility phenotypes, lifetime number of pregnancies and number of children in a developing country population. The fertility phenotype data and DNA samples were obtained at baseline recruitment from individuals participating in a large prospective cohort study in Bangladesh. GWAS analyses of fertility phenotypes were conducted among 1,686 married women. One SNP on chromosome 4 was non-significantly associated with number of children at P < 10^-7 and number of pregnancies at P < 10^-6. This SNP is located in a region without a gene within 1 Mb. One SNP on chromosome 6 was non-significantly associated with extreme number of children at P < 10^-6. The closest gene to this SNP is HDGFL1, a hepatoma-derived growth factor. When we excluded hormonal contraceptive users, a SNP on chromosome 5 was non-significantly associated at P < 10^-5 for number of children and number of pregnancies. This SNP is located near C5orf64, an open reading frame, and ZSWIM6, a zinc ion binding gene. We also estimated the heritability of these phenotypes from our genotype data using GCTA (Genome-wide Complex Trait Analysis) for number of children (hg2 = 0.149, SE = 0.24, p-value = 0.265) and number of pregnancies (hg2 = 0.007, SE = 0.22, p-value = 0.487). Our genome-wide association study and heritability estimates of number of pregnancies and number of children in Bangladesh did not confer strong evidence of common variants for parity variation. However, our results suggest that future studies may want to consider the role of 3 notable SNPs in their analysis.
Stadler, Zsofia K.; Thom, Peter; Robson, Mark E.; Weitzel, Jeffrey N.; Kauff, Noah D.; Hurley, Karen E.; Devlin, Vincent; Gold, Bert; Klein, Robert J.; Offit, Kenneth
Knowledge of the inherited risk for cancer is an important component of preventive oncology. In addition to well-established syndromes of cancer predisposition, much remains to be discovered about the genetic variation underlying susceptibility to common malignancies. Increased knowledge about the human genome and advances in genotyping technology have made possible genome-wide association studies (GWAS) of human diseases. These studies have identified many important regions of genetic variation associated with an increased risk for human traits and diseases including cancer. Understanding the principles, major findings, and limitations of GWAS is becoming increasingly important for oncologists as dissemination of genomic risk tests directly to consumers is already occurring through commercial companies. GWAS have contributed to our understanding of the genetic basis of cancer and will shed light on biologic pathways and possible new strategies for targeted prevention. To date, however, the clinical utility of GWAS-derived risk markers remains limited. PMID:20585100
Zou, Fei; Fine, Jason P.; Hu, Jianhua; Lin, D. Y.
Assessing genome-wide statistical significance is an important and difficult problem in multipoint linkage analysis. Due to multiple tests on the same genome, the usual pointwise significance level based on the chi-square approximation is inappropriate. Permutation is widely used to determine genome-wide significance. Theoretical approximations are available for simple experimental crosses. In this article, we propose a resampling procedure to assess the significance of genome-wide QTL mapping for experimental crosses. The proposed method is computationally much less intensive than the permutation procedure (on the order of 10^2 or higher) and is applicable to complex breeding designs and sophisticated genetic models that cannot be handled by the permutation and theoretical methods. The usefulness of the proposed method is demonstrated through simulation studies and an application to a Drosophila backcross. PMID:15611194
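For orientation, the permutation baseline that the abstract above compares against can be sketched as follows. This is not the resampling procedure proposed in the article; it is a generic max-statistic permutation test, and the function names, the use of absolute correlation as the marker statistic, and the toy data are all assumptions made for illustration.

```python
import random


def abs_correlation(x, y):
    """Absolute Pearson correlation between a marker and the phenotype."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant marker or phenotype carries no signal
    return abs(sxy) / (sxx * syy) ** 0.5


def max_statistic(genotypes, phenotype):
    """Largest marker statistic across the whole genome scan."""
    return max(abs_correlation(marker, phenotype) for marker in genotypes)


def permutation_threshold(genotypes, phenotype, n_perm=200, alpha=0.05, seed=0):
    """Approximate genome-wide threshold: upper-alpha quantile of the permuted maxima."""
    rng = random.Random(seed)
    shuffled = list(phenotype)
    null_maxima = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # breaks any marker-phenotype association
        null_maxima.append(max_statistic(genotypes, shuffled))
    null_maxima.sort()
    index = min(n_perm - 1, int((1 - alpha) * n_perm))
    return null_maxima[index]


# Toy backcross-like data: 20 markers coded 0/1 on 30 individuals (hypothetical).
data_rng = random.Random(1)
genotypes = [[data_rng.choice([0, 1]) for _ in range(30)] for _ in range(20)]
phenotype = [data_rng.gauss(0, 1) for _ in range(30)]
print(permutation_threshold(genotypes, phenotype, n_perm=100, seed=42))
```

Every permutation repeats the full genome scan, which is the computational cost the abstract says its resampling method avoids.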
Zhou, Xiang; Stephens, Matthew
Multivariate linear mixed models (mvLMMs) are powerful tools for testing associations between single-nucleotide polymorphisms and multiple correlated phenotypes while controlling for population stratification in genome-wide association studies. We present efficient algorithms in the genome-wide efficient mixed model association (GEMMA) software for fitting mvLMMs and computing likelihood ratio tests. These algorithms offer improved computation speed, power and P-value calibration over existing methods, and can deal with more than two phenotypes.
Coleman, Jonathan R. I.; Ducci, Francesca; Aliev, Fazil; Newhouse, Stephen J.; Liu, Xiehe; Ma, Xiaohong; Wang, Yingcheng; Collier, David A.; Asherson, Philip; Li, Tao; Breen, Gerome
Drug addiction is a costly and recurring healthcare problem, necessitating a need to understand risk factors and mechanisms of addiction, and to identify new biomarkers. To date, genome-wide association studies (GWAS) for heroin addiction have been limited; moreover they have been restricted to examining samples of European and African-American origin due to difficulty of recruiting samples from other populations. This is the first study to test a Han Chinese population; we performed a GWAS on a homogeneous sample of 370 Han Chinese subjects diagnosed with heroin dependence using the DSM-IV criteria and 134 ethnically matched controls. Analysis using the diagnostic criteria of heroin dependence yielded suggestive evidence for association between variants in the genes CCDC42 (coiled coil domain 42; p = 2.8 × 10^-7) and BRSK2 (BR serine/threonine 2; p = 4.1 × 10^-6). In addition, we found evidence for risk variants within the ARHGEF10 (Rho guanine nucleotide exchange factor 10) gene on chromosome 8 and variants in a region on chromosome 20q13, which is gene-poor but has a concentration of mRNAs and predicted miRNAs. Gene-based association analysis identified genome-wide significant association between variants in CCDC42 and heroin addiction. Additionally, when we investigated shared risk variants between heroin addiction and risk of other addiction-related and psychiatric phenotypes using polygenic risk scores, we found a suggestive relationship with variants predicting tobacco addiction, and a significant relationship with variants predicting schizophrenia. Our genome wide association study of heroin dependence provides data in a novel sample, with functionally plausible results and evidence of genetic data of value to the field. PMID:27936112
Ma, Meng; Dou, Taocun; Lu, Jian; Guo, Jun; Hu, Yuping; Yi, Guoqiang; Yuan, Jingwei; Sun, Congjiao; Wang, Kehua; Yang, Ning
The comb, as a secondary sexual character, is an important trait in chicken. Indicators of comb length (CL), comb height (CH), and comb weight (CW) are often selected in production. DNA-based marker-assisted selection could help chicken breeders to accelerate genetic improvement for comb or related economic characters by early selection. Although a number of quantitative trait loci (QTL) and candidate genes have been identified with advances in molecular genetics, candidate genes underlying comb traits are limited. The aim of the study was to use genome-wide association (GWA) studies by 600 K Affymetrix chicken SNP arrays to detect genes that are related to comb, using an F2 resource population. For all comb characters, comb exhibited high SNP-based heritability estimates (0.61–0.69). Chromosome 1 explained 20.80% genetic variance, while chromosome 4 explained 6.89%. Independent univariate genome-wide screens for each character identified 127, 197, and 268 novel significant SNPs with CL, CH, and CW, respectively. Three candidate genes, VPS36, AR, and WNT11B, were determined to have a plausible function in all comb characters. These genes are important to the initiation of follicle development, gonadal growth, and dermal development, respectively. The current study provides the first GWA analysis for comb traits. Identification of the genetic basis as well as promising candidate genes will help us understand the underlying genetic architecture of comb development and has practical significance in breeding programs for the selection of comb as an index for sexual maturity or reproduction. PMID:27427764
Müller, M-P; Rothammer, S; Seichter, D; Russ, I; Hinrichs, D; Tetens, J; Thaller, G; Medugorac, I
Over the last decades, a dramatic decrease in reproductive performance has been observed in Holstein cattle and fertility problems have become the most common reason for a cow to leave the herd. The premature removal of animals with high breeding values results in both economic and breeding losses. For efficient future Holstein breeding, the identification of loci associated with low fertility is of major interest and thus constitutes the aim of this study. To reach this aim, a genome-wide combined linkage disequilibrium and linkage analysis (cLDLA) was conducted using data on the following 10 calving and fertility traits in the form of estimated breeding values: days from first service to conception of heifers and cows, nonreturn rate on d 56 of heifers and cows, days from calving to first insemination, days open, paternal and maternal calving ease, paternal and maternal stillbirth. The animal data set contained 2,527 daughter-proven Holstein bulls from Germany that were genotyped with Illumina's BovineSNP50 BeadChip (Illumina Inc., San Diego, CA). For the cLDLA, 41,635 sliding windows of 40 adjacent single nucleotide polymorphisms (SNP) were used. At each window midpoint, a variance component analysis was executed using ASReml. The underlying mixed linear model included random quantitative trait locus (QTL) and polygenic effects. We identified 50 genome-wide significant QTL. The most significant peak was detected for direct calving ease at 59,179,424 bp on chromosome 18 (BTA18). Next, a mixed-linear model association (MLMA) analysis was conducted. A comparison of the cLDLA and MLMA results with special regard to BTA18 showed that the genome-wide most significant SNP from the MLMA was associated with the same trait and located on the same chromosome at 57,589,121 bp (i.e., about 1.5 Mb apart from the cLDLA peak). The results of 5 different cLDLA and 2 MLMA models, which included the fixed effects of either SNP or haplotypes, suggested that the cLDLA method
Peura, J; Kempe, R; Strandén, I; Rydhmer, L
The profit and production of an average Finnish blue fox farm was simulated using a deterministic bio-economic farm model. Risk was included using the Arrow-Pratt absolute risk aversion coefficient and profit variance. Risk-rated economic values were calculated for pregnancy rate, litter loss, litter size, pelt size, pelt quality, pelt colour clarity, feed efficiency and eye infection. With high absolute risk aversion, economic values were lower than with low absolute risk aversion. Economic values were highest for litter loss (18.16 and 26.42 EUR), litter size (13.27 and 19.40 EUR), pregnancy (11.99 and 18.39 EUR) and eye infection (12.39 and 13.81 EUR). Sensitivity analysis showed that selection pressure for improved eye health depended strongly on the proportion of culled animals among infected animals and much less on the proportion of infected animals. The economic value of feed efficiency was lower than expected (6.06 and 8.03 EUR). However, it was almost the same magnitude as pelt quality (7.30 and 7.30 EUR) and higher than the economic value of pelt size (3.37 and 5.26 EUR). Risk factors should be considered in blue fox breeding schemes because they change the relative importance of traits.
Irano, Natalia; de Camargo, Gregório Miguel Ferreira; Costa, Raphael Bermal; Terakado, Ana Paula Nascimento; Magalhães, Ana Fabrícia Braga; Silva, Rafael Medeiros de Oliveira; Dias, Marina Mortati; Bignardi, Annaiza Braga; Baldi, Fernando; Carvalheiro, Roberto; de Oliveira, Henrique Nunes; de Albuquerque, Lucia Galvão
The objective of this study was to perform a genome-wide association study (GWAS) to detect chromosome regions associated with indicator traits of sexual precocity in Nellore cattle. Data from Nellore animals belonging to farms that participate in the DeltaGen® and Paint® animal breeding programs were used. The traits used in this study were the occurrence of early pregnancy (EP) and scrotal circumference (SC). Data from 72,675 females and 83,911 males with phenotypes were used; of these, 1,770 females and 1,680 males were genotyped. The SNP effects were estimated with a single-step procedure (WssGBLUP) and the observed phenotypes were used as dependent variables. All animals with available genotypes and phenotypes, in addition to those with only phenotypic information, were used. A single-trait animal model was applied to predict breeding values and the solutions of SNP effects were obtained from these breeding values. The results of GWAS are reported as the proportion of variance explained by windows with 150 adjacent SNPs. The 10 windows that explained the highest proportion of variance were identified. The results of this study indicate the polygenic nature of EP and SC, demonstrating that the indicator traits of sexual precocity studied here are probably controlled by many genes, including some of moderate effect. The 10 windows with large effects obtained for EP are located on chromosomes 5, 6, 7, 14, 18, 21 and 27, and together explained 7.91% of the total genetic variance. For SC, these windows are located on chromosomes 4, 8, 11, 13, 14, 19, 22 and 23, explaining 6.78% of total variance. The GWAS made it possible to identify chromosome regions associated with EP and SC. The identification of these regions contributes to a better understanding and evaluation of these traits, and allows candidate genes to be indicated for future investigation of causal mutations. PMID:27494397
Genomic selection (GS) uses genome-wide molecular marker data to predict the genetic value of selection candidates in breeding programs. In plant breeding, the ability to produce large numbers of progeny per cross allows GS to be conducted within each family. However, this approach requires phenotyp...
Govindaraj, Periyasamy; Nizamuddin, Sheikh; Sharath, Anugula; Jyothi, Vuskamalla; Rotti, Harish; Raval, Ritu; Nayak, Jayakrishna; Bhat, Balakrishna K.; Prasanna, B. V.; Shintre, Pooja; Sule, Mayura; Joshi, Kalpana S.; Dedge, Amrish P.; Bharadwaj, Ramachandra; Gangadharan, G. G.; Nair, Sreekumaran; Gopinath, Puthiya M.; Patwardhan, Bhushan; Kondaiah, Paturu; Satyamoorthy, Kapaettu; Valiathan, Marthanda Varma Sankaran; Thangaraj, Kumarasamy
The practice of Ayurveda, the traditional medicine of India, is based on the concept of three major constitutional types (Vata, Pitta and Kapha) defined as “Prakriti”. To the best of our knowledge, no study has convincingly correlated genomic variations with the classification of Prakriti. In the present study, we performed genome-wide SNP (single nucleotide polymorphism) analysis (Affymetrix, 6.0) of 262 well-classified male individuals (after screening 3416 subjects) belonging to three Prakritis. We found 52 SNPs (p ≤ 1 × 10^-5) were significantly different between Prakritis, without any confounding effect of stratification, after 10^6 permutations. Principal component analysis (PCA) of these SNPs classified 262 individuals into their respective groups (Vata, Pitta and Kapha) irrespective of their ancestry, which represent its power in categorization. We further validated our finding with 297 Indian population samples with known ancestry. Subsequently, we found that PGM1 correlates with phenotype of Pitta as described in the ancient text of Caraka Samhita, suggesting that the phenotypic classification of India’s traditional medicine has a genetic basis; and its Prakriti-based practice in vogue for many centuries resonates with personalized medicine. PMID:26511157
Warrier, Varun; Chakrabarti, Bhismadev; Murphy, Laura; Chan, Allen; Craig, Ian; Mallya, Uma; Lakatošová, Silvia; Rehnstrom, Karola; Peltonen, Leena; Wheelwright, Sally; Allison, Carrie; Fisher, Simon E; Baron-Cohen, Simon
Asperger Syndrome (AS) is a neurodevelopmental condition characterized by impairments in social interaction and communication, alongside the presence of unusually repetitive, restricted interests and stereotyped behaviour. Individuals with AS have no delay in cognitive and language development. It is a subset of Autism Spectrum Conditions (ASC), which are highly heritable and have a population prevalence of approximately 1%. Few studies have investigated the genetic basis of AS. To address this gap in the literature, we performed a genome-wide pooled DNA association study to identify candidate loci in 612 individuals (294 cases and 318 controls) of Caucasian ancestry, using the Affymetrix GeneChip Human Mapping version 6.0 array. We identified 11 SNPs that had a p-value below 1 × 10^-5. These SNPs were independently genotyped in the same sample. Three of the SNPs (rs1268055, rs7785891 and rs2782448) were nominally significant, though none remained significant after Bonferroni correction. Two of our top three SNPs (rs7785891 and rs2782448) lie in loci previously implicated in ASC. However, investigation of the three SNPs in the ASC genome-wide association dataset from the Psychiatric Genomics Consortium indicated that these three SNPs were not significantly associated with ASC. The effect sizes of the variants were modest, indicating that our study was not sufficiently powered to identify causal variants with precision.
Warrier, Varun; Chakrabarti, Bhismadev; Murphy, Laura; Chan, Allen; Craig, Ian; Mallya, Uma; Lakatošová, Silvia; Rehnstrom, Karola; Wheelwright, Sally; Allison, Carrie; Fisher, Simon E.; Baron-Cohen, Simon
Asperger Syndrome (AS) is a neurodevelopmental condition characterized by impairments in social interaction and communication, alongside the presence of unusually repetitive, restricted interests and stereotyped behaviour. Individuals with AS have no delay in cognitive and language development. It is a subset of Autism Spectrum Conditions (ASC), which are highly heritable and have a population prevalence of approximately 1%. Few studies have investigated the genetic basis of AS. To address this gap in the literature, we performed a genome-wide pooled DNA association study to identify candidate loci in 612 individuals (294 cases and 318 controls) of Caucasian ancestry, using the Affymetrix GeneChip Human Mapping version 6.0 array. We identified 11 SNPs that had a p-value below 1 × 10^-5. These SNPs were independently genotyped in the same sample. Three of the SNPs (rs1268055, rs7785891 and rs2782448) were nominally significant, though none remained significant after Bonferroni correction. Two of our top three SNPs (rs7785891 and rs2782448) lie in loci previously implicated in ASC. However, investigation of the three SNPs in the ASC genome-wide association dataset from the Psychiatric Genomics Consortium indicated that these three SNPs were not significantly associated with ASC. The effect sizes of the variants were modest, indicating that our study was not sufficiently powered to identify causal variants with precision. PMID:26176695
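The distinction the abstract above draws between nominal significance and significance after Bonferroni correction can be illustrated with a short sketch. The p-values below are hypothetical, and treating the 11 follow-up SNPs as the number of tests is an assumption; the abstract does not state the correction factor that was used.

```python
def bonferroni(p_values, alpha=0.05):
    """Return the Bonferroni threshold and a pass/fail flag for each p-value."""
    threshold = alpha / len(p_values)
    return threshold, [p <= threshold for p in p_values]


# Hypothetical p-values for 11 SNPs: three fall below 0.05, none below 0.05 / 11.
p_values = [0.01, 0.03, 0.04] + [0.2] * 8
threshold, flags = bonferroni(p_values)
nominal = sum(p < 0.05 for p in p_values)
print(f"threshold={threshold:.5f}, nominal={nominal}, corrected={sum(flags)}")
```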
Mullapudi, Nandita; Ye, Bin; Suzuki, Masako; Fazzari, Melissa; Han, Weiguo; Shi, Miao K; Marquardt, Gaby; Lin, Juan; Wang, Tao; Keller, Steven; Zhu, Changcheng; Locker, Joseph D; Spivack, Simon D
Aberrant cytosine 5-methylation underlies many deregulated elements of cancer. Among paired non-small cell lung cancers (NSCLC), we sought to profile DNA 5-methyl-cytosine features which may underlie genome-wide deregulation. In one of the more dense interrogations of the methylome, we sampled 1.2 million CpG sites from twenty-four NSCLC tumor (T)-non-tumor (NT) pairs using a methylation-sensitive restriction enzyme-based HELP-microarray assay. We found 225,350 differentially methylated (DM) sites in adenocarcinomas versus adjacent non-tumor tissue that vary in frequency across genomic compartment, particularly notable in gene bodies (GB; p<2.2E-16). Further, when DM was coupled to differential transcriptome (DE) in the same samples, 37,056 differential loci in adenocarcinoma emerged. Approximately 90% of the DM-DE relationships were non-canonical; for example, promoter DM associated with DE in the same direction. Of the canonical changes noted, promoter (PR) DM loci with reciprocal changes in expression in adenocarcinomas included HBEGF, AGER, PTPRM, DPT, CST1, MELK; DM GB loci with concordant changes in expression included FOXM1, FERMT1, SLC7A5, and FAP genes. IPA analyses showed adenocarcinoma-specific promoter DMxDE overlay identified familiar lung cancer nodes [tP53, Akt] as well as less familiar nodes [HBEGF, NQO1, GRK5, VWF, HPGD, CDH5, CTNNAL1, PTPN13, DACH1, SMAD6, LAMA3, AR]. The unique findings from this study include the discovery of numerous candidate methylation sites in both PR and GB regions not previously identified in NSCLC, and many non-canonical relationships to gene expression. These DNA methylation features could potentially be developed as risk or diagnostic biomarkers, or as candidate targets for newer methylation locus-targeted preventive or therapeutic agents.
Suzuki, Masako; Fazzari, Melissa; Han, Weiguo; Shi, Miao K.; Marquardt, Gaby; Lin, Juan; Wang, Tao; Keller, Steven; Zhu, Changcheng; Locker, Joseph D.; Spivack, Simon D.
Aberrant cytosine 5-methylation underlies many deregulated elements of cancer. Among paired non-small cell lung cancers (NSCLC), we sought to profile DNA 5-methyl-cytosine features which may underlie genome-wide deregulation. In one of the more dense interrogations of the methylome, we sampled 1.2 million CpG sites from twenty-four NSCLC tumor (T)–non-tumor (NT) pairs using a methylation-sensitive restriction enzyme-based HELP-microarray assay. We found 225,350 differentially methylated (DM) sites in adenocarcinomas versus adjacent non-tumor tissue that vary in frequency across genomic compartment, particularly notable in gene bodies (GB; p<2.2E-16). Further, when DM was coupled to differential transcriptome (DE) in the same samples, 37,056 differential loci in adenocarcinoma emerged. Approximately 90% of the DM-DE relationships were non-canonical; for example, promoter DM associated with DE in the same direction. Of the canonical changes noted, promoter (PR) DM loci with reciprocal changes in expression in adenocarcinomas included HBEGF, AGER, PTPRM, DPT, CST1, MELK; DM GB loci with concordant changes in expression included FOXM1, FERMT1, SLC7A5, and FAP genes. IPA analyses showed adenocarcinoma-specific promoter DMxDE overlay identified familiar lung cancer nodes [tP53, Akt] as well as less familiar nodes [HBEGF, NQO1, GRK5, VWF, HPGD, CDH5, CTNNAL1, PTPN13, DACH1, SMAD6, LAMA3, AR]. The unique findings from this study include the discovery of numerous candidate methylation sites in both PR and GB regions not previously identified in NSCLC, and many non-canonical relationships to gene expression. These DNA methylation features could potentially be developed as risk or diagnostic biomarkers, or as candidate targets for newer methylation locus-targeted preventive or therapeutic agents. PMID:26683690
Kang, Yang Jae; Bae, Ahra; Shim, Sangrea; Lee, Taeyoung; Lee, Jayern; Satyawan, Dani; Kim, Moon Young; Lee, Suk-Ha
DNA methylation on cytosine residues is known to affect gene expression and is potentially responsible for the phenotypic variations among different crop cultivars. Here, we present the whole-genome DNA methylation profiles and assess the potential effects of single nucleotide polymorphisms (SNPs) for two mungbean cultivars, Sunhwanogdu (VC1973A) and Kyunggijaerae#5 (V2984). By measuring the DNA methylation levels in leaf tissue with the bisulfite sequencing (BSseq) approach, we show both the frequencies of the various types of DNA methylation and the distribution of weighted gene methylation levels. SNPs that cause nucleotide changes from/to CHH – where C is cytosine and H is any other nucleotide – were found to affect DNA methylation status in VC1973A and V2984. In order to better understand the correlation between gene expression and DNA methylation levels, we surveyed gene expression in leaf tissues of VC1973A and V2984 using RNAseq. Transcript expressions of paralogous genes were controlled by DNA methylation within the VC1973A genome. Moreover, genes that were differentially expressed between the two cultivars showed distinct DNA methylation patterns. Our mungbean genome-wide methylation profiles will be valuable resources for understanding the phenotypic variations between different cultivars, as well as for molecular breeding. PMID:28084412
Jafarzadeh, Jafar; Bonnett, David; Jannink, Jean-Luc; Akdemir, Deniz; Dreisigacker, Susanne; Sorrells, Mark E.
To introduce new genetic diversity into the bread wheat gene pool from its progenitor, Aegilops tauschii (Coss.) Schmalh, 33 primary synthetic hexaploid wheat genotypes (SYN) were crossed to 20 spring bread wheat (BW) cultivars at the International Wheat and Maize Improvement Center. Modified single seed descent was used to develop 97 populations with 50 individuals per population using first back-cross, biparental, and three-way crosses. Individuals from each cross were selected for short stature, early heading, flowering and maturity, minimal lodging, and free threshing. Yield trials were conducted under irrigated, drought, and heat-stress conditions from 2011 to 2014 in Ciudad Obregon, Mexico. Genomic estimated breeding values (GEBVs) of parents and synthetic derived lines (SDLs) were estimated using a genomic best linear unbiased prediction (GBLUP) model with markers in each trial. In each environment, there were SDLs that had higher GEBVs than their recurrent BW parent for yield. The GEBVs of BW parents for yield ranged from -0.32 in heat to 1.40 in irrigated trials. The range of the SYN parent GEBVs for yield was from -2.69 in the irrigated to 0.26 in the heat trials and were mostly negative across environments. The contribution of the SYN parents to improved grain yield of the SDLs was highest under heat stress, with an average GEBV for the top 10% of the SDLs of 0.55 while the weighted average GEBV of their corresponding recurrent BW parents was 0.26. Using the pedigree-based model, the accuracy of genomic prediction for yield was 0.42, 0.43, and 0.49 in the drought, heat and irrigated trials, respectively, while for the marker-based model these values were 0.43, 0.44, and 0.55. The SYN parents introduced novel diversity into the wheat gene pool. Higher GEBVs of progenies were due to introgression and retention of some positive alleles from SYN parents. PMID:27656893
Genomic selection is a method to improve quantitative traits in crops and livestock by estimating breeding values of selection candidates using phenotype and genome-wide marker data sets. Prediction accuracy has been evaluated through simulation and cross-validation, however validation based on prog...
Jin, Eun-Heui; Zhang, Enji; Ko, Youngkwon; Sim, Woo Seog; Moon, Dong Eon; Yoon, Keon Jung; Hong, Jang Hee; Lee, Won Hyung
Complex regional pain syndrome (CRPS) is a chronic, progressive, and devastating pain syndrome characterized by spontaneous pain, hyperalgesia, allodynia, altered skin temperature, and motor dysfunction. Although previous gene expression profiling studies have been conducted in animal pain models, genome-wide expression profiling in the whole blood of CRPS patients has not yet been reported. Here, we successfully identified certain pain-related genes through genome-wide expression profiling in the blood from CRPS patients. We found that 80 genes were differentially expressed between 4 CRPS patients (2 CRPS I and 2 CRPS II) and 5 controls (cut-off value: 1.5-fold change and p<0.05). Most of those genes were associated with signal transduction, developmental processes, cell structure and motility, and immunity and defense. The expression levels of major histocompatibility complex class I A subtype (HLA-A29.1), matrix metalloproteinase 9 (MMP9), alanine aminopeptidase N (ANPEP), l-histidine decarboxylase (HDC), granulocyte colony-stimulating factor 3 receptor (G-CSF3R), and signal transducer and activator of transcription 3 (STAT3) genes selected from the microarray were confirmed in 24 CRPS patients and 18 controls by quantitative reverse transcription-polymerase chain reaction (qRT-PCR). We focused on the MMP9 gene that, by qRT-PCR, showed a statistically significant difference in expression in CRPS patients compared to controls with the highest relative fold change (4.0±1.23 times and p = 1.4×10^-4). The up-regulation of the MMP9 gene in the blood may be related to the pain progression in CRPS patients. Our findings, which offer a valuable contribution to the understanding of the differential gene expression in CRPS, may help in the understanding of the pathophysiology of CRPS pain progression. PMID:24244504
Watters, James W; Kraja, Aldi; Meucci, Melissa A; Province, Michael A; McLeod, Howard L
Little is known about the heritability of chemotherapy activity or the identity of genes that may enable the individualization of cancer chemotherapy. Although numerous genes are likely to influence chemotherapy response, current candidate gene-based pharmacogenetics approaches require a priori knowledge and the selection of a small number of candidate genes for hypothesis testing. In this study, an ex vivo familial genetics strategy using lymphoblastoid cells derived from Centre d'Etude du Polymorphisme Humain reference pedigrees was used to discover genetic determinants of chemotherapy cytotoxicity. Cytotoxicity to the mechanistically distinct chemotherapy agents 5-fluorouracil and docetaxel were shown to be heritable traits, with heritability values ranging from 0.26 to 0.65 for 5-fluorouracil and 0.21 to 0.70 for docetaxel, varying with dose. Genome-wide linkage analysis was also used to map a quantitative trait locus influencing the cellular effects of 5-fluorouracil to chromosome 9q13-q22 [logarithm of odds (LOD) = 3.44], and two quantitative trait loci influencing the cellular effects of docetaxel to chromosomes 5q11-21 (LOD = 2.21) and 9q13-q22 (LOD = 2.73). Finally, 5-fluorouracil and docetaxel were shown to cause apoptotic cell death involving caspase-3 cleavage in Centre d'Etude du Polymorphisme Humain lymphoblastoid cells. This study identifies genomic regions likely to harbor genes important for chemotherapy cytotoxicity using genome-wide linkage analysis in human pedigrees and provides a widely applicable strategy for pharmacogenomic discovery without the requirement for a priori candidate gene selection.
Howard, Jeremy T.; Kachman, Stephen D.; Snelling, Warren M.; Pollak, E. John; Ciobanu, Daniel C.; Kuehn, Larry A.; Spangler, Matthew L.
Cattle are reared in diverse environments and collecting phenotypic body temperature (BT) measurements to characterize BT variation across diverse environments is difficult and expensive. To better understand the genetic basis of BT regulation, a genome-wide association study was conducted utilizing crossbred steers and heifers totaling 239 animals of unknown pedigree and breed fraction. During predicted extreme heat and cold stress events, hourly tympanic and vaginal BT devices were placed in steers and heifers, respectively. Individuals were genotyped with the BovineSNP50K_v2 assay and data analyzed using Bayesian models for area under the curve (AUC), a measure of BT over time, using hourly BT observations summed across 5-days (AUC summer 5-day (AUCS5D) and AUC winter 5-day (AUCW5D)). Posterior heritability estimates were moderate to high and were estimated to be 0.68 and 0.21 for AUCS5D and AUCW5D, respectively. Moderately positive correlations between direct genomic values for AUCS5D and AUCW5D (0.40) were found, although a small percentage of the top 5 % 1-Mb windows were in common. Different sets of genes were associated with BT during winter and summer, thus simultaneous selection for animals tolerant to both heat and cold appears possible.
Spötter, Andreas; Gupta, Pooja; Mayer, Manfred; Reinsch, Norbert; Bienefeld, Kaspar
Honey bees are exposed to many damaging pathogens and parasites. The most devastating is Varroa destructor, which mainly affects the brood. A promising approach for preventing its spread is to breed Varroa-resistant honey bees. One trait that has been shown to provide significant resistance against the Varroa mite is hygienic behavior, which is a behavioral response of honeybee workers to brood diseases in general. Here, we report the use of an Affymetrix 44K SNP array to analyze SNPs associated with detection and uncapping of Varroa-parasitized brood by individual worker bees (Apis mellifera). For this study, 22 000 individually labeled bees were video-monitored and a sample of 122 cases and 122 controls was collected and analyzed to determine the dependence/independence of SNP genotypes from hygienic and nonhygienic behavior on a genome-wide scale. After false-discovery rate correction of the P values, 6 SNP markers had highly significant associations with the trait investigated (α < 0.01). Inspection of the genomic regions around these SNPs led to the discovery of putative candidate genes.
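The abstract above reports a false-discovery rate correction of the P values but does not name the procedure. As a hedged illustration only, the sketch below assumes the widely used Benjamini-Hochberg step-up rule; the function name and the toy p-values are hypothetical and are not taken from the study.

```python
def benjamini_hochberg(p_values, alpha=0.01):
    """Return indices of tests declared significant at FDR level alpha.

    Implements the Benjamini-Hochberg step-up rule (an assumption; the
    abstract does not say which FDR procedure was used).
    """
    m = len(p_values)
    # Sort test indices by ascending p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k with p_(k) <= (k / m) * alpha.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            cutoff_rank = rank
    # Reject every hypothesis ranked at or below that cutoff.
    return sorted(order[:cutoff_rank])


# Hypothetical per-SNP p-values from a case-control association test.
toy_p = [0.0001, 0.2, 0.004, 0.03, 0.0005, 0.9]
significant = benjamini_hochberg(toy_p, alpha=0.01)
print(significant)
```

With a real 44K array the same function would be applied to the full vector of per-SNP p-values; the step-up rule is less conservative than a Bonferroni correction when many markers are tested.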
A genome wide association study (GWAS) investigating red blood cell (RBC) phenotypes was performed with over 500 domestic sheep (Ovis aries) from three economically important breeds in the US (Columbia, Polypay, and Rambouillet). A single nucleotide polymorphism (SNP, hereafter the discovery SNP) sh...
Background: A genome-wide set of single nucleotide polymorphisms (SNPs) is a valuable resource in genetic research and breeding and is usually developed by re-sequencing a genome. If a genome sequence is not available, an alternative strategy must be used. We previously reported the development of a...
Fusarium head blight (FHB) is one of the most important wheat diseases worldwide and host resistance displays complex genetic control. A genome-wide association study (GWAS) was performed on 273 winter wheat breeding lines from the mid-western and eastern regions of the US to identify chromosomal re...
Byrne, Stephen; Czaban, Adrian; Studer, Bruno; Panitz, Frank; Bendixen, Christian; Asp, Torben
Genotyping-by-Sequencing (GBS) is an excellent tool for characterising genetic variation between plant genomes. To date, its use has been reported only for genotyping of single individuals. However, there are many applications where resolving allele frequencies within populations on a genome-wide scale would be very powerful; examples include the breeding of outbreeding species, varietal protection in outbreeding species, and monitoring changes in population allele frequencies. This motivated us to test the potential to use GBS to evaluate allele frequencies within populations. Perennial ryegrass is an outbreeding species, and breeding programs are based upon selection on populations. We tested two restriction enzymes for their efficiency in complexity reduction of the perennial ryegrass genome. The resulting profiles have been termed Genome Wide Allele Frequency Fingerprints (GWAFFs), and we have shown how these fingerprints can be used to distinguish between plant populations. Even at current costs and throughput, using sequencing to directly evaluate populations on a genome-wide scale is viable. GWAFFs should find many applications, from varietal development in outbreeding species right through to playing a role in protecting plant breeders’ rights. PMID:23469194
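The idea of an allele frequency fingerprint can be sketched as follows. This is a simplified illustration, not the authors' pipeline: it assumes pooled sequencing reads have already been counted per site as (reference reads, alternative reads), and the minimum-depth filter, the distance measure, the site names, and the counts are all hypothetical choices.

```python
def allele_frequencies(read_counts, min_depth=10):
    """Estimate reference-allele frequency per site from pooled read counts.

    read_counts maps a site name to (ref_reads, alt_reads).
    Sites with fewer than min_depth total reads are skipped.
    """
    freqs = {}
    for site, (ref, alt) in read_counts.items():
        depth = ref + alt
        if depth < min_depth:
            continue  # too few reads for a usable frequency estimate
        freqs[site] = ref / depth
    return freqs


def fingerprint_distance(freq_a, freq_b):
    """Mean absolute frequency difference over sites present in both pools."""
    shared = sorted(set(freq_a) & set(freq_b))
    if not shared:
        raise ValueError("no shared sites between the two fingerprints")
    return sum(abs(freq_a[s] - freq_b[s]) for s in shared) / len(shared)


# Hypothetical read counts for two population pools.
pop1 = allele_frequencies({"site1": (30, 10), "site2": (5, 15), "site3": (2, 1)})
pop2 = allele_frequencies({"site1": (10, 30), "site2": (10, 10), "site3": (20, 20)})
distance = fingerprint_distance(pop1, pop2)
print(pop1, distance)
```

Here site3 is dropped from the first pool because its depth is below the threshold, so the distance is computed over the two sites that both pools cover.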
Buzanskas, Marcos E.; Grossi, Daniela A.; Ventura, Ricardo V.; Schenkel, Flávio S.; Sargolzaei, Mehdi; Meirelles, Sarah L. C.; Mokry, Fabiana B.; Higa, Roberto H.; Mudadu, Maurício A.; da Silva, Marcos V. G. Barbosa.; Niciura, Simone C. M.; Júnior, Roberto A. A. Torres.; Alencar, Maurício M.; Regitano, Luciana C. A.; Munari, Danísio P.
Studies are being conducted on the applicability of genomic data to improve the accuracy of the selection process in livestock, and genome-wide association studies (GWAS) provide valuable information to enhance the understanding of the genetics of complex traits. The aim of this study was to identify genomic regions and genes that play roles in birth weight (BW), weaning weight adjusted for 210 days of age (WW), and long-yearling weight adjusted for 420 days of age (LYW) in Canchim cattle. GWAS were performed by means of the Generalized Quasi-Likelihood Score (GQLS) method using genotypes from the BovineHD BeadChip and estimated breeding values for BW, WW, and LYW. Data consisted of 285 animals from the Canchim breed and 114 from the MA genetic group (derived from crossings between Charolais sires and ½ Canchim + ½ Zebu dams). After applying a false discovery rate correction at a 10% significance level, a total of 4, 12, and 10 SNPs were significantly associated with BW, WW, and LYW, respectively. These SNPs were surveyed to their corresponding genes or to surrounding genes within a distance of 250 kb. The genes DPP6 (dipeptidyl-peptidase 6) and CLEC3B (C-type lectin domain family 3 member B) were highlighted, considering their functions in the development of the brain and skeletal system, respectively. The GQLS method identified regions on chromosome associated with birth weight, weaning weight, and long-yearling weight in Canchim and MA animals. New candidate regions for body weight traits were detected and some of them have interesting biological functions, of which most have not been previously reported. The observation of QTL reports for body weight traits covering areas surrounding the genes (SNPs) herein identified provides more evidence for these associations. Future studies targeting these areas could provide further knowledge to uncover the genetic architecture underlying growth traits in Canchim cattle. PMID:24733441
Martin, Pauline Marie; Palhière, Isabelle; Ricard, Anne; Tosser-Klopp, Gwenola; Rupp, Rachel
This paper reports a quantitative genetics and genomic analysis of undesirable coat color patterns in goats. Two undesirable coat colors have routinely been recorded for the past 15 years in French Saanen goats. One fifth of Saanen females have been phenotyped “pink” (8.0%) or “pink neck” (11.5%) and consequently have not been included in the breeding program as elite animals. Heritability of the binary “pink” and “pink neck” phenotypes, estimated from 103,443 females, was 0.26 for “pink” and 0.21 for “pink neck”. Genome wide association studies (using haplotypes or single SNPs) were implemented using a daughter design of 810 Saanen goats sired by 9 Artificial Insemination bucks genotyped with the goatSNP50 chip. A highly significant signal (-log10 p value = 10.2) was associated with the “pink neck” phenotype on chromosome 11, suggesting the presence of a major gene. Highly significant signals for the “pink” phenotype were found on chromosomes 5 and 13 (-log10 p values of 7.2 and 7.7, respectively). The most significant SNP on chromosome 13 was in the ASIP gene region, well known for its association with coat color phenotypes. Nine significant signals were also found for both traits. The highest signal for each trait was detected by both single SNP and haplotype approaches, whereas the smaller signals were not consistently detected by the two methods. Altogether these results demonstrated a strong genetic control of the “pink” and “pink neck” phenotypes in French Saanen goats, suggesting that SNP information could be used to identify and remove undesired colored animals from the breeding program. PMID:27030980
Akanno, Everestus C; Plastow, Graham; Fitzsimmons, Carolyn; Miller, Stephen P; Baron, Vern; Ominski, Kimberly; Basarab, John A
The aim of this study was to identify SNP markers that associate with variation in beef heifer reproduction and performance of their calves. A genome-wide association study was performed by means of the generalized quasi-likelihood score (GQLS) method using heifer genotypes from the BovineSNP50 BeadChip and estimated breeding values for pre-breeding body weight (PBW), pregnancy rate (PR), calving difficulty (CD), age at first calving (AFC), calf birth weight (BWT), calf weaning weight (WWT), and calf pre-weaning average daily gain (ADG). Data consisted of 785 replacement heifers from three Canadian research herds, namely Brandon Research Centre, Brandon, Manitoba; University of Alberta Roy Berg Kinsella Ranch, Kinsella, Alberta; and Lacombe Research Centre, Lacombe, Alberta. After applying a false discovery rate correction at a 5% significance level, a total of 4, 3, 3, 9, 6, 2, and 1 SNPs were significantly associated with PBW, PR, CD, AFC, BWT, WWT, and ADG, respectively. These SNPs were located on chromosomes 1, 5-7, 9, 13-16, 19-21, 24, 25, and 27-29. Chromosomes 1, 5, and 24 had SNPs with pleiotropic effects. New significant SNPs that impact functional traits were detected, many of which have not been previously reported. The results of this study support quantitative genetic studies related to the inheritance of these traits, and provide new knowledge regarding beef cattle quantitative trait loci effects. The identification of these SNPs provides a starting point to identify genes affecting heifer reproduction traits and performance of their calves (BWT, WWT, and ADG). They also contribute to a better understanding of the biology underlying these traits and will be potentially useful in marker- and genome-assisted selection and management.
Blum, Meike; Distl, Ottmar
In the present study, breeding values for canine congenital sensorineural deafness, the presence of blue eyes and patches have been predicted using multivariate animal models to test the reliability of the breeding values for planned matings. The dataset consisted of 6669 German Dalmatian dogs born between 1988 and 2009. Data were provided by the Dalmatian kennel clubs which are members of the German Association for Dog Breeding and Husbandry (VDH). The hearing status for all dogs was evaluated using brainstem auditory evoked potentials. The reliability using the prediction error variance of breeding values and the realized reliability of the prediction of the phenotype of future progeny born in each one year between 2006 and 2009 were used as parameters to evaluate the goodness of prediction through breeding values. All animals from the previous birth years were used for prediction of the breeding values of the progeny in each of the up-coming birth years. The breeding values based on pedigree records achieved an average reliability of 0.19 for the future 1951 progeny. The predictive accuracy (R2) for the hearing status of single future progeny was at 1.3%. Combining breeding values for littermates increased the predictive accuracy to 3.5%. Corresponding values for maternal and paternal half-sib groups were at 3.2 and 7.3%. The use of breeding values for planned matings increases the phenotypic selection response over mass selection. The breeding values of sires may be used for planned matings because reliabilities and predictive accuracies for future paternal progeny groups were highest.
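The abstract above evaluates breeding values by their reliability derived from the prediction error variance (PEV). A minimal sketch of the standard textbook relationship, reliability = 1 - PEV / additive genetic variance, is given below. It ignores inbreeding (an assumption), and the numeric inputs are hypothetical values chosen only so that the result matches the order of magnitude of the average reliability of 0.19 quoted in the abstract; they are not taken from the study.

```python
from math import sqrt


def reliability_from_pev(pev, additive_variance):
    """Reliability of a predicted breeding value: 1 - PEV / sigma_a^2.

    Assumes a non-inbred animal; with inbreeding the denominator would
    be scaled by (1 + F), which this sketch leaves out.
    """
    if additive_variance <= 0:
        raise ValueError("additive genetic variance must be positive")
    return 1.0 - pev / additive_variance


def accuracy_from_reliability(reliability):
    """Accuracy is the correlation between true and predicted breeding value."""
    return sqrt(reliability)


# Hypothetical values: PEV of 0.81 on a trait with additive variance 1.0.
rel = reliability_from_pev(pev=0.81, additive_variance=1.0)
acc = accuracy_from_reliability(rel)
print(round(rel, 2), round(acc, 3))
```

A low reliability means the PEV is close to the additive variance, so the predicted breeding value carries little more information than the population mean.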
Wacholder, Sholom; Rotunno, Melissa
Investigators planning studies within cohorts have many options for choosing an efficient sampling design for genome-wide association and other molecular epidemiology studies. Consideration of person-year and proportional hazards analyses of full cohorts may add further insight into ramifications of different designs. Empirical evidence from genome-wide association studies can supplement intuition and simulations in comparing properties of various case-control designs within cohorts. Additional theoretical and empirical work, justification of sampling choice in publications, and consideration of context and scientific aims can improve designs and, thereby, increase the scientific value and cost effectiveness of future studies.
Muqaddasi, Quddoos H.; Lohwasser, Ulrike; Nagel, Manuela; Börner, Andreas; Pillen, Klaus; Röder, Marion S.
In a number of crop species, hybrids are able to outperform line varieties. The anthers of the autogamous bread wheat plant are normally extruded post anthesis, a trait which is unfavourable for the production of F1 hybrid grain. Higher anther extrusion (AE) promotes cross fertilization for more efficient hybrid seed production. Therefore, this study aimed at the genetic dissection of AE by genome wide association studies (GWAS) and determination of the main effect QTL. We applied a GWAS approach to identify DArT markers potentially linked to AE to unfold its genetic basis in a panel of spring wheat accessions. Phenotypic data were collected for three years and best linear unbiased estimate (BLUE) values were calculated across all years. The extent of the AE correlation between growing years and BLUE values ranged from r = +0.56 (2013 vs 2015) to 0.91 (2014 vs BLUE values). The broad sense heritability was 0.84 across all years. Six accessions displayed stable AE >80% across all the years. Genotyping data included 2,575 DArT markers (with a minimum minor allele frequency of 0.05 applied). AE was influenced both by genotype and by the growing environment. In all, 131 significant marker trait associations (MTAs) (|log10 (P)| >FDR) were established for AE. AE behaved as a quantitative trait, with five consistently significant markers (significant across at least two years with a significant BLUE value) contributing a minor to modest proportion (4.29% to 8.61%) of the phenotypic variance and affecting the trait either positively or negatively. For this reason, there is potential for breeding for improved AE by gene pyramiding. The consistently significant markers linked to AE could be helpful for marker assisted selection to transfer AE to high yielding varieties, thereby promoting the exploitation of hybrid heterosis in the key crop wheat. PMID:27191600
... historical) Genome-Wide Scan Reveals Mutation Associated with Melanoma A team of international researchers supported by the ... when they divide and grow uncontrollably, develop into melanoma. Also, MITF activity is known to be amplified ...
Shin, Dong-Hyun; Lee, Jin Woo; Park, Jong-Eun; Choi, Ik-Young; Oh, Hee-Seok; Kim, Hyeon Jeong; Kim, Heebal
Thoroughbred, a relatively recent horse breed, is best known for its use in horse racing. Although myostatin (MSTN) variants have been reported to be highly associated with horse racing performance, the trait is more likely to be polygenic in nature. The purpose of this study was to identify genetic variants strongly associated with racing performance by using estimated breeding value (EBV) for race time as a phenotype. We conducted a two-stage genome-wide association study to search for genetic variants associated with the EBV. In the first stage of the genome-wide association study, a relatively large number of markers (~54,000 single-nucleotide polymorphisms, SNPs) were evaluated in a small number of samples (240 horses). In the second stage, a relatively small number of markers identified to have large effects (170 SNPs) were evaluated in a much larger number of samples (1,156 horses). We also validated the SNPs related to MSTN known to have large effects on racing performance and found significant associations in the stage two analysis, but not in stage one. We identified 28 significant SNPs related to 17 genes. Among these, six genes have a function related to myogenesis and five genes are involved in muscle maintenance. To our knowledge, these genes are newly reported for the genetic association with racing performance of Thoroughbreds. This work complements a recent horse genome-wide association study of racing performance that identified other SNPs and genes as the most significant variants. These results will help to expand our knowledge of the polygenic nature of racing performance in Thoroughbreds.
Jawasreh, K; Boettcher, P J; Stella, A
Hereditary underdevelopment of the ear, a condition also known as microtia, has been observed in several sheep breeds as well as in humans and other species. Its genetic basis in sheep is unknown. The Awassi sheep, a breed native to southwest Asia, carries this phenotype and was targeted for molecular characterization via a genome-wide association study. DNA samples were collected from sheep in Jordan. Eight affected and 12 normal individuals were genotyped with the Illumina OvineSNP50(®) chip. Multilocus analyses failed to identify any genotypic association. In contrast, a single-locus analysis revealed a statistically significant association (P = 0.012, genome-wide) with a SNP at basepair 34 647 499 on OAR23. This marker is adjacent to the gene encoding transcription factor GATA-6, which has been shown to play a role in many developmental processes, including chondrogenesis. The lack of extended homozygosity in this region suggests a fairly ancient mutation, and the time of occurrence was estimated to be approximately 3000 years ago. Many of the earless sheep breeds may thus share the causative mutation, especially within the subgroup of fat-tailed, wool sheep.
Amyotte, Beatrice; Bowen, Amy J.; Banks, Travis; Rajcan, Istvan; Somers, Daryl J.
Breeding apples is a long-term endeavour and it is imperative that new cultivars are selected to have outstanding consumer appeal. This study has taken the approach of merging sensory science with genome wide association analyses in order to map the human perception of apple flavour and texture onto the apple genome. The goal was to identify genomic associations that could be used in breeding apples for improved fruit quality. A collection of 85 apple cultivars was examined over two years through descriptive sensory evaluation by a trained sensory panel. The trained sensory panel scored randomized sliced samples of each apple cultivar for seventeen taste, flavour and texture attributes using controlled sensory evaluation practices. In addition, the apple collection was subjected to genotyping by sequencing for marker discovery. A genome wide association analysis suggested significant genomic associations for several sensory traits including juiciness, crispness, mealiness and fresh green apple flavour. The findings include previously unreported genomic regions that could be used in apple breeding and suggest that similar sensory association mapping methods could be applied in other plants. PMID:28231290
Niu, Yao-Fang; Ye, Chengyin; He, Ji; Han, Fang; Guo, Long-Biao; Zheng, Hou-Feng; Chen, Guo-Bo
In line with open-source genetics, we report a novel linear regression technique for genome-wide association studies (GWAS), called Open GWAS algoriTHm (OATH). When individual-level data are not available, OATH can not only completely reproduce reported results from an experimental model, but also recover underreported results from other alternative models with a different combination of nuisance parameters using naïve summary statistics (NSS). OATH can also reliably evaluate all reported results in-depth (e.g., p-value variance analysis), as demonstrated for 42 Arabidopsis phenotypes under three magnesium (Mg) conditions. In addition, OATH can be used for consortium-driven genome-wide association meta-analyses (GWAMA), and can greatly improve the flexibility of GWAMA. A prototype of OATH is available in the Genetic Analysis Repository (https://github.com/gc5k/GEAR). PMID:28122950
Verkouteren, Joris A. C.; Hofman, Albert; Uitterlinden, André G.; Kraft, Peter; Turman, Constance; Han, Jiali; Cho, Eunyoung; Murabito, Joanne M.; Levy, Daniel; Qureshi, Abrar A.; Nijsten, Tamar
There is strong evidence for a role of environmental risk factors in susceptibility to develop multiple keratinocyte cancers (mKCs), but whether genes are also involved in mKCs susceptibility has not been thoroughly investigated. We investigated whether single nucleotide polymorphisms (SNPs) are associated with susceptibility for mKCs. A genome-wide association study (GWAS) of 1,666 cases with mKCs and 1,950 cases with single KC (sKCs; controls) from Harvard cohorts (the Nurses' Health Study [NHS], NHS II, and the Health Professionals Follow-Up Study) and the Framingham Heart Study was carried out using over 8 million SNPs (stage-1). We sought to replicate the most significant statistical associations (p-value ≤ 5.5×10−6) in an independent cohort of 574 mKCs and 872 sKCs from the Rotterdam Study. In the discovery stage, 40 SNPs with suggestive associations (p-value ≤ 5.5×10−6) were identified, with eight independent SNPs tagging all 40 SNPs. The most significant SNP was located at chromosome 9 (rs7468390; p-value = 3.92×10−7). In stage-2, none of these SNPs replicated and only two of them were associated with mKCs in the same direction in the combined meta-analysis. We tested the associations for 19 previously reported basal cell carcinoma-related SNPs (candidate gene association analysis), and found that rs1805007 (MC1R locus) was significantly associated with risk of mKCs (p-value = 2.80×10−4). Although the suggestive SNPs with susceptibility for mKCs were not replicated, we found that previously identified BCC variants may also be associated with mKC, with the most significant association (rs1805007) located at the MC1R gene. PMID:28081215
Fikse, W F
The purpose of this investigation was to compare accuracy and precision of variance components and breeding values for international genetic evaluations based on national breeding values or animal performance records. A conventional progeny test scheme was simulated for 3 countries. True breeding values and observations were generated specific to production environments. Two production environments were considered, and both balanced and unbalanced distribution of production environments over countries were considered. True breeding values for both production environments were generated as bivariate normal deviates, and low (0.70) and high (0.90) genetic correlations between performance in production environments were considered. Each cow had an observation in one country only. Performance records were generated as the sum of the true breeding value, a contemporary group effect, and a random residual. Eight generations of data were simulated, and the entire simulated data set was used to compare 3 methods for international genetic evaluation: 1) multiple-trait across-country evaluation based on national predicted breeding values of bulls (Mace), 2) international genetic evaluation across country using performance records, and 3) international genetic evaluation across production environment using performance records. Estimated genetic parameters were biased for all models in this study. Genetic correlations between countries were generally more biased for Mace than for the across-country analyses using performance records. Bias in within-country genetic variances was smaller for Mace. Even genetic parameters obtained with the international evaluation across production environment using performance records were biased, despite the fact that this model was closest to the true, simulated model. The root mean square error of predicted breeding values was similar between models for most of the situations considered. The difference between models was largest when the
Wu, Xuesen; Dong, Hua; Luo, Li; Zhu, Yun; Peng, Gang; Reveille, John D; Xiong, Momiao
Although great progress in genome-wide association studies (GWAS) has been made, the significant SNP associations identified by GWAS account for only a few percent of the genetic variance, leading many to question where and how we can find the missing heritability. There is increasing interest in genome-wide interaction analysis as a possible source of finding heritability unexplained by current GWAS. However, the existing statistics for testing interaction have low power for genome-wide interaction analysis. To meet challenges raised by genome-wide interactional analysis, we have developed a novel statistic for testing interaction between two loci (either linked or unlinked). The null distribution and the type I error rates of the new statistic for testing interaction are validated using simulations. Extensive power studies show that the developed statistic has much higher power to detect interaction than classical logistic regression. The results identified 44 and 211 pairs of SNPs showing significant evidence of interactions with FDR<0.001 and 0.001
Migault, Vincent; Pallas, Benoît; Costes, Evelyne
In crops, optimizing target traits in breeding programs can be fostered by selecting appropriate combinations of architectural traits which determine light interception and carbon acquisition. In apple tree, architectural traits were observed to be under genetic control. However, architectural traits also result from many organogenetic and morphological processes interacting with the environment. The present study aimed at combining a FSPM built for apple tree, MAppleT, with genetic determinisms of architectural traits, previously described in a bi-parental population. We focused on parameters related to organogenesis (phyllochron and immediate branching) and morphogenesis processes (internode length and leaf area) during the first year of tree growth. Two independent datasets collected in 2004 and 2007 on 116 genotypes, issued from a ‘Starkrimson’ × ‘Granny Smith’ cross, were used. The phyllochron was estimated as a function of thermal time and sylleptic branching was modeled subsequently depending on phyllochron. From a genetic map built with SNPs, marker effects were estimated on four MAppleT parameters with rrBLUP, using 2007 data. These effects were then considered in MAppleT to simulate tree development in the two climatic conditions. The genome wide prediction model gave consistent estimations of parameter values with correlation coefficients between observed values and estimated values from SNP markers ranging from 0.79 to 0.96. However, the accuracy of the prediction model following cross validation schemas was lower. Three integrative traits (the number of leaves, trunk length, and number of sylleptic laterals) were considered for validating MAppleT simulations. In 2007 climatic conditions, simulated values were close to observations, highlighting the correct simulation of genetic variability. However, in 2004 conditions which were not used for model calibration, the simulations differed from observations. This study demonstrates the possibility
Buzdugan, Laura; Kalisch, Markus; Navarro, Arcadi; Schunk, Daniel; Fehr, Ernst; Bühlmann, Peter
Motivation: Although Genome Wide Association Studies (GWAS) genotype a very large number of single nucleotide polymorphisms (SNPs), the data are often analyzed one SNP at a time. The low predictive power of single SNPs, coupled with the high significance threshold needed to correct for multiple testing, greatly decreases the power of GWAS. Results: We propose a procedure in which all the SNPs are analyzed in a multiple generalized linear model, and we show its use for extremely high-dimensional datasets. Our method yields P-values for assessing significance of single SNPs or groups of SNPs while controlling for all other SNPs and the family wise error rate (FWER). Thus, our method tests whether or not a SNP carries any additional information about the phenotype beyond that available by all the other SNPs. This rules out spurious correlations between phenotypes and SNPs that can arise from marginal methods because the ‘spuriously correlated’ SNP merely happens to be correlated with the ‘truly causal’ SNP. In addition, the method offers a data driven approach to identifying and refining groups of SNPs that jointly contain informative signals about the phenotype. We demonstrate the value of our method by applying it to the seven diseases analyzed by the Wellcome Trust Case Control Consortium (WTCCC). We show, in particular, that our method is also capable of finding significant SNPs that were not identified in the original WTCCC study, but were replicated in other independent studies. Availability and implementation: Reproducibility of our research is supported by the open-source Bioconductor package hierGWAS. Supplementary information: Supplementary data are available at Bioinformatics online. PMID:27153677
Bartels, Meike; Saviouk, Viatcheslav; de Moor, Marleen H M; Willemsen, Gonneke; van Beijsterveldt, Toos C E M; Hottenga, Jouke-Jan; de Geus, Eco J C; Boomsma, Dorret I
Causes of individual differences in happiness, as assessed with the Subjective Happiness Scale, are investigated in a large sample of twins and siblings from the Netherlands Twin Register. Over 12,000 twins and siblings, average age 24.7 years (range 12 to 88), took part in the study. A genetic model with an age by sex design was fitted to the data with structural equation modeling in Mx. The heritability of happiness was estimated at 22% in males and 41% in females. No effect of age was observed. To identify the genomic regions contributing to this heritability, a genome-wide linkage study for happiness was conducted in sibling pairs. A subsample of 1157 offspring from 441 families was genotyped with an average of 371 micro-satellite markers per individual. Phenotype and genotype data were analyzed in MERLIN with multipoint variance component linkage analysis and age and sex as covariates. A linkage signal (logarithm of odds score 2.73, empirical p value 0.095) was obtained at the end of the long arm of chromosome 19 for marker D19S254 at 110 cM. A second suggestive linkage peak was found at the short arm of chromosome 1 (LOD of 2.37) at 153 cM, marker D1S534 (empirical p value of .209). These two regions of interest do not overlap with the regions found for contrasting phenotypes (such as depression, which is negatively associated with happiness). Further linkage and future association studies are warranted.
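As a reading aid for the LOD scores in the abstract above, the following minimal sketch converts a LOD score to a likelihood-ratio chi-square and an asymptotic pointwise p value. It assumes the usual large-sample result for testing a single variance component (a 50:50 mixture of a point mass at zero and a chi-square with 1 degree of freedom); the function names are illustrative and not taken from MERLIN. Pointwise values of this kind are not comparable to the empirical p values reported in the abstract, which account for the genome-wide search.

```python
import math

def lod_to_chi2(lod):
    """Convert a LOD score (log10 likelihood ratio) to a likelihood-ratio chi-square."""
    return 2.0 * math.log(10.0) * lod

def pointwise_p(lod):
    """Asymptotic pointwise p value, assuming a 50:50 mixture of 0 and chi-square(1)."""
    chi2 = lod_to_chi2(lod)
    # Upper tail of a chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2))
    return 0.5 * math.erfc(math.sqrt(chi2 / 2.0))

# LOD scores quoted in the abstract
for lod in (2.73, 2.37):
    print(f"LOD {lod}: chi2 = {lod_to_chi2(lod):.2f}, pointwise p = {pointwise_p(lod):.1e}")
```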
Ferfouri, F; Boitrelle, F; Ghout, I; Albert, M; Molina Gomes, D; Wainer, R; Bailly, M; Selva, J; Vialard, F
The objective of this study was to assess genome-wide DNA methylation in testicular tissue from azoospermic patients. A total of 94 azoospermic patients were recruited and classified into three groups: 29 patients presented obstructive azoospermia (OA), 26 displayed non-obstructive azoospermia (NOA) and successful retrieval of spermatozoa by testicular sperm extraction (TESE+) and 39 displayed NOA and failure to retrieve spermatozoa by TESE (TESE-). An Illumina Infinium Human Methylation27 BeadChip DNA methylation array was used to establish a testicular DNA methylation pattern for each type of azoospermic patient. The OA and NOA groups were compared in terms of the relative M-value (the log2 ratio between methylated and non-methylated probe intensities) for each CpG site. We observed significantly different DNA methylation profiles for the NOA and OA groups, with differences at over 9000 of the 27 578 CpG sites; 212 CpG sites had a relative M-value >3. The results highlighted 14 testis-specific genes. Patient clustering with respect to these 212 CpG sites corresponded closely to the clinical classification. The DNA methylation patterns showed that in the NOA group, 78 of the 212 CpG sites were hypomethylated and 134 were hypermethylated (relative to the OA group). On the basis of these DNA methylation profiles, azoospermic patients could be classified as OA or NOA by considering the 212 CpG sites with the greatest methylation differences. Furthermore, we identified genes that may provide insight into the mechanism of idiopathic NOA.
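The M-value defined in the abstract above (the log2 ratio between methylated and non-methylated probe intensities) can be sketched in a few lines. The intensities below are hypothetical, and reading the "relative M-value" as a difference in group means is an assumption made for illustration, not a detail confirmed by the abstract.

```python
import math

def m_value(methylated, unmethylated):
    """log2 ratio of methylated to unmethylated probe intensity for one CpG site."""
    return math.log2(methylated / unmethylated)

def relative_m(m_group_a, m_group_b):
    """Difference in mean M-value between two groups (assumed meaning of 'relative')."""
    mean_a = sum(m_group_a) / len(m_group_a)
    mean_b = sum(m_group_b) / len(m_group_b)
    return mean_a - mean_b

# Hypothetical (methylated, unmethylated) intensities for one CpG site
noa = [m_value(m, u) for m, u in [(8000, 1000), (6400, 800)]]
oa = [m_value(m, u) for m, u in [(1000, 1000), (1200, 600)]]
print(relative_m(noa, oa))
```

A positive value here would correspond to hypermethylation in the first group relative to the second, and a negative value to hypomethylation.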
Lu, Xia; Luan, Sheng; Kong, Jie; Hu, Longyang; Mao, Yong; Zhong, Shengping
The kuruma prawn, Marsupenaeus japonicus, is one of the most cultivated and consumed species of shrimp. However, very few molecular genetic/genomic resources are publicly available for it. Thus, the characterization and distribution of simple sequence repeats (SSRs) remain unclear, and the use of SSR markers in genomic studies and marker-assisted selection is limited. The goal of this study is to characterize and develop genome-wide SSR markers in M. japonicus by genome survey sequencing for application in comparative genomics and breeding. A total of 326 945 perfect SSRs were identified, among which dinucleotide repeats were the most frequent class (44.08%), followed by mononucleotides (29.67%), trinucleotides (18.96%), tetranucleotides (5.66%), hexanucleotides (1.07%), and pentanucleotides (0.56%). In total, 151 541 SSR loci primers were successfully designed. A subset of 30 SSR primer pairs were synthesized and tested in 42 individuals from a wild population, of which 27 loci (90.0%) were successfully amplified with specific products and 24 (80.0%) were polymorphic. For the amplified polymorphic loci, the number of alleles ranged from 5 to 17 (with an average of 9.63), and the average PIC value was 0.796. A total of 58 256 SSR-containing sequences had significant Gene Ontology annotation; these are good functional molecular marker candidates for association studies and comparative genomic analysis. The newly identified SSRs significantly contribute to the M. japonicus genomic resources and will facilitate a number of genetic and genomic studies, including high density linkage mapping, genome-wide association analysis, marker-aided selection, comparative genomics analysis, population genetics, and evolution.
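The PIC (polymorphism information content) value quoted in the abstract above is computed from allele frequencies at a locus. A minimal sketch follows, assuming the widely used formulation of Botstein et al.; the abstract does not state which formula or software was used, and the allele frequencies below are hypothetical.

```python
from itertools import combinations

def pic(freqs):
    """Polymorphism information content from the allele frequencies at one locus."""
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    # Expected homozygosity
    homozygosity = sum(p * p for p in freqs)
    # Correction term summed over all unordered pairs of distinct alleles
    cross = sum(2.0 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - cross

# Hypothetical locus with ten equally frequent alleles
print(round(pic([0.1] * 10), 3))
```

PIC rises with both the number of alleles and the evenness of their frequencies, which is why multi-allelic SSR loci tend to score higher than biallelic markers.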
Tarrant, K J; Dey, S; Kinney, R; Anthony, N B; Rhoads, D D
Ascites is a multi-faceted disease commonly observed in fast growing broilers, which is initiated when the body is insufficiently oxygenated. A series of events follow, including an increase in pulmonary artery pressure, right ventricle hypertrophy, and accumulation of fluid in the abdominal cavity and pericardium. Advances in management practices along with improved selection programs have decreased ascites incidence in modern broilers. However, ascites syndrome remains an economically important disease throughout the world, causing estimated losses of $100 million per year. In this study, a 60 K Illumina SNP BeadChip was used to perform a series of genome wide association studies (GWAS) on the 16th and 18th generation of our relaxed (REL) line descended from a commercial elite broiler line beginning in 1995. Regions significantly associated with ascites incidence were identified on chromosome 2 around 70 megabase pairs (Mbp) and on chromosome Z around 60 Mbp. Five candidate single nucleotide polymorphisms (SNP) were evaluated as indicators for these 2 regions in order to identify association with ascites and right ventricle to total ventricle weight (RVTV) ratios. Chromosome 2 SNP showed an association with RVTV ratios in males phenotyped as ascites resistant and ascites susceptible (P = 0.02 and P = 0.03, respectively). The chromosome Z region also indicates an association with resistant female RVTV values (P = 0.02). Regions of significance identified on chromosomes 2 and Z described in this study will be used as proposed candidate regions for further investigation into the genetics of ascites. This information will lead to a better understanding of the underlying genetics and gene networks contributing to ascites, and thus advances in ascites reduction through commercial breeding schemes.
Schulz, Dietmar F.; Schott, Rena T.; Voorrips, Roeland E.; Smulders, Marinus J. M.; Linde, Marcus; Debener, Thomas
Petal color is one of the key characteristics determining the attractiveness and therefore the commercial value of an ornamental crop. Here, we present the first genome-wide association study for the important ornamental crop rose, focusing on the anthocyanin and carotenoid contents in petals of 96 diverse tetraploid garden rose genotypes. Cultivated roses display a vast phenotypic and genetic diversity and are therefore ideal targets for association genetics. For marker analysis, we used a recently designed Axiom SNP chip comprising 68,000 SNPs with additionally 281 SSRs, 400 AFLPs and 246 markers from candidate genes. An analysis of the structure of the rose population revealed three subpopulations with most of the genetic variation between individual genotypes rather than between clusters and with a high average proportion of heterozygous loci. The mapping of markers significantly associated with anthocyanin and carotenoid content to the related Fragaria and Prunus genomes revealed clusters of associated markers indicating five genomic regions associated with the total anthocyanin content and two large clusters associated with the carotenoid content. Among the marker clusters associated with the phenotypes, we found several candidate genes with known functions in either the anthocyanin or the carotenoid biosynthesis pathways. Among others, we identified a glutathione-S-transferase, 4CL, an auxin response factor and F3'H as candidate genes affecting anthocyanin concentration, and CCD4 and Zeaxanthine epoxidase as candidates affecting the concentration of carotenoids. These markers are starting points for future validation experiments in independent populations as well as for functional genomic studies to identify the causal factors for the observed color phenotypes. Furthermore, validated markers may be interesting tools for marker-assisted selection in commercial breeding programmes in that they provide the tools to identify superior parental combinations that
Edwards, Todd L.; Scott, William K.; Almonte, Cherylyn; Burt, Amber; Powell, Eric H.; Beecham, Gary W.; Wang, Liyong; Züchner, Stephan; Konidari, Ioanna; Wang, Gaofeng; Singer, Carlos; Nahab, Fatta; Scott, Burton; Stajich, Jeffrey M.; Pericak-Vance, Margaret; Haines, Jonathan; Vance, Jeffery M.; Martin, Eden R.
Parkinson disease (PD) is a chronic neurodegenerative disorder with a cumulative prevalence of greater than one per thousand. To date three independent genome-wide association studies (GWAS) have investigated the genetic susceptibility to PD. These studies have also implicated several genes as PD risk loci with strong, but not genome-wide significant, associations. In this study, we combined data from two previously published GWAS of Caucasian subjects with our GWAS of 604 cases and 619 controls for a joint analysis with a combined sample size of 1752 cases and 1745 controls. SNPs in SNCA (rs2736990, p-value = 6.7×10−8; genome-wide adjusted p = 0.0109, odds ratio (OR) = 1.29 [95% CI: 1.17–1.42] G vs. A allele, population attributable risk percent (PAR%) = 12%) and the MAPT region (rs11012, p-value = 5.6×10−8; genome-wide adjusted p = 0.0079, OR = 0.70 [95% CI: 0.62–0.79] T vs. C allele, PAR% = 8%) were genome-wide significant. No other SNPs were genome-wide significant in this analysis. This study confirms that SNCA and the MAPT region are major genes whose common variants are influencing risk of PD. PMID:20070850
Mastrangelo, S; Portolano, B; Di Gerlando, R; Ciampolini, R; Tolone, M; Sardina, M T
Analysis of genomic data is becoming increasingly common in the livestock industry and the findings have been an invaluable resource for effective management of breeding programs in small and endangered populations. In this paper, with the goal of highlighting the potential of genomic analysis for small and endangered populations, genome-wide levels of linkage disequilibrium, measured as the squared correlation coefficient of allele frequencies at a pair of loci, effective population size, runs of homozygosity (ROH) and genetic diversity parameters, were estimated in Barbaresca sheep using Illumina OvineSNP50K array data. Moreover, the breed's genetic structure and its relationship with other breeds were investigated. Levels of pairwise linkage disequilibrium decreased with increasing distance between single nucleotide polymorphisms. An average correlation coefficient <0.25 was found for markers located up to 50 kb apart. Therefore, these results support the need to use denser single nucleotide polymorphism panels for high power association mapping and genomic selection efficiency in future breeding programs. The estimate of past effective population size ranged from 747 animals 250 generations ago to 28 animals five generations ago, whereas the contemporary effective population size was 25 animals. A total of 637 ROH were identified, most of which were short (67%) and ranged from 1 to 10 Mb. The genetic analyses revealed that the Barbaresca breed tended to display lower variability than other Sicilian breeds. Recent inbreeding was evident, according to the ROH analysis. All the investigated parameters showed a comparatively narrow genetic base and indicated an endangered status for Barbaresca. Multidimensional scaling, model-based clustering, measurement of population differentiation, neighbor networks and haplotype sharing distinguished Barbaresca from other breeds, showed a low level of admixture with the other breeds considered in this study, and indicated
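The linkage disequilibrium measure named in the abstract above, the squared correlation between alleles at a pair of loci, can be sketched from haplotype and allele frequencies. This is the standard textbook definition for two biallelic loci; the argument names and the example frequencies are hypothetical, and the abstract does not say how r² was computed in practice (for example, from phased haplotypes or from genotype counts).

```python
def ld_r2(p_ab, p_a, p_b):
    """r^2 between two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A at locus 1 and allele B at locus 2
    p_a, p_b: frequencies of allele A and allele B
    """
    d = p_ab - p_a * p_b  # disequilibrium coefficient D
    denom = p_a * (1.0 - p_a) * p_b * (1.0 - p_b)
    if denom == 0.0:
        raise ValueError("both loci must be polymorphic")
    return d * d / denom

# Hypothetical frequencies: moderate LD between two SNPs
print(ld_r2(0.35, 0.5, 0.5))
```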
Otowa, Takeshi; Kawamura, Yoshiya; Tsutsumi, Akizumi; Kawakami, Norito; Kan, Chiemi; Shimada, Takafumi; Umekage, Tadashi; Kasai, Kiyoto; Tokunaga, Katsushi; Sasaki, Tsukasa
Stressful events have been identified as a risk factor for depression. Although gene–environment (G × E) interaction in a limited number of candidate genes has been explored, no genome-wide search has been reported. The aim of the present study is to identify genes that influence the association of stressful events with depression. Therefore, we performed a genome-wide G × E interaction analysis in the Japanese population. A genome-wide screen with 320 subjects was performed using the Affymetrix Genome-Wide Human Array 6.0. Stressful life events were assessed using the Social Readjustment Rating Scale (SRRS) and depression symptoms were assessed with self-rating questionnaires using the Center for Epidemiologic Studies Depression (CES-D) scale. The p values for interactions between single nucleotide polymorphisms (SNPs) and stressful events were calculated using a linear regression model adjusted for sex and age. After quality control of genotype data, a total of 534,848 SNPs on autosomal chromosomes were further analyzed. Although none surpassed the level of genome-wide significance, a marginally significant association of the interaction between SRRS and rs10510057 with depression was found (p = 4.5 × 10−8). The SNP is located on 10q26 near Regulators of G-protein signaling 10 (RGS10), which encodes a regulatory molecule involved in stress response. When we investigated a similar G × E interaction between depression (K6 scale) and work-related stress in an independent sample (n = 439), a significant G × E effect on depression was observed (p = 0.015). Our findings suggest that rs10510057, interacting with stressors, may be involved in depression risk. Incorporating G × E interaction into GWAS can help to find susceptibility loci that are potentially missed by conventional GWAS. PMID:27529621
Yamada, Kazuo; Iwayama, Yoshimi; Hattori, Eiji; Iwamoto, Kazuya; Toyota, Tomoko; Ohnishi, Tetsuo; Ohba, Hisako; Maekawa, Motoko; Kato, Tadafumi; Yoshikawa, Takeo
Schizophrenia is a devastating neuropsychiatric disorder with genetically complex traits. Genetic variants should explain a considerable portion of the risk for schizophrenia, and genome-wide association study (GWAS) is a potentially powerful tool for identifying the risk variants that underlie the disease. Here, we report the results of a three-stage analysis of three independent cohorts consisting of a total of 2,535 samples from Japanese and Chinese populations to search for schizophrenia susceptibility genes using a GWAS approach. Firstly, we examined 115,770 single nucleotide polymorphisms (SNPs) in 120 patient-parents trio samples from Japanese schizophrenia pedigrees. In stage II, we evaluated 1,632 SNPs (1,159 SNPs of p<0.01 and 473 SNPs of p<0.05 that were located in previously reported linkage regions). The second sample consisted of 1,012 case-control samples of Japanese origin. The most significant p value was obtained for the SNP in the ELAVL2 [(embryonic lethal, abnormal vision, Drosophila)-like 2] gene located on 9p21.3 (p = 0.00087). In stage III, we scrutinized the ELAVL2 gene by genotyping gene-centric tagSNPs in the third sample set of 293 family samples (1,163 individuals) of Chinese descent, and the SNP in the gene showed a nominal association with schizophrenia in the Chinese population (p = 0.026). The current data in Asian populations would be helpful for deciphering ethnic diversity of schizophrenia etiology. PMID:21674006
Hughes, Austin L; Welch, Robert; Puri, Vinita; Matthews, Casey; Haque, Kashif; Chanock, Stephen J; Yeager, Meredith
Single-nucleotide polymorphism (SNP) arrays have become a popular technology for disease-association studies, but they also have potential for studying the genetic differentiation of human populations. Application of the Affymetrix GeneChip Human Mapping 500K Array Set to a population of 102 individuals representing the major ethnic groups in the United States (African, Asian, European, and Hispanic) revealed patterns of gene diversity and genetic distance that reflected population history. We analyzed allelic frequencies at 388,654 autosomal SNP sites that showed some variation in our study population and 10% or fewer missing values. Despite the small size (23-31 individuals) of each subpopulation, there were no fixed differences at any site between any two subpopulations. As expected from the African origin of modern humans, greater gene diversity was seen in Africans than in either Asians or Europeans, and the genetic distance between the Asian and the European populations was significantly lower than that between either of these two populations and Africans. Principal components analysis applied to a correlation matrix among individuals was able to separate completely the major continental groups of humans (Africans, Asians, and Europeans), while Hispanics overlapped all three of these groups. Genes containing two or more markers with extraordinarily high genetic distance between subpopulations were identified as candidate genes for health differences between subpopulations. The results show that, even with modest sample sizes, genome-wide SNP genotyping technologies have great promise for capturing signatures of gene frequency difference between human subpopulations, with applications in areas as diverse as forensics and the study of ethnic health disparities.
Genome-wide association study (GWAS) has appeared as a widespread strategy in decoding genotype-phenotype associations in many species thanks to technical advances in next-generation sequencing (NGS) applications. Maize is an ideal crop for GWAS and significant progress has been made in the last dec...
Genome-Wide Association Studies shed light on the identification of genes underlying human diseases and agriculturally important traits. This potential has been shadowed by false positive findings. The Mixed Linear Model (MLM) method is flexible enough to simultaneously incorporate population struct...
We examined the role of common genetic variation in schizophrenia in a genome-wide association study of substantial size: a stage 1 discovery sample of 21,856 individuals of European ancestry and a stage 2 replication sample of 29,839 independent subjects. The combined stage 1 and 2 analysis yielded genome-wide significant associations with schizophrenia for seven loci, five of which are new (1p21.3, 2q32.3, 8p23.2, 8q21.3 and 10q24.32-q24.33) and two of which have been previously implicated (6p21.32-p22.1 and 18q21.2). The strongest new finding (P = 1.6 × 10−11) was with rs1625579 within an intron of a putative primary transcript for MIR137 (microRNA 137), a known regulator of neuronal development. Four other schizophrenia loci achieving genome-wide significance contain predicted targets of MIR137, suggesting MIR137-mediated dysregulation as a previously unknown etiologic mechanism in schizophrenia. In a joint analysis with a bipolar disorder sample (16,374 affected individuals and 14,044 controls), three loci reached genome-wide significance: CACNA1C (rs4765905, P = 7.0 × 10−9), ANK3 (rs10994359, P = 2.5 × 10−8) and the ITIH3-ITIH4 region (rs2239547, P = 7.8 × 10−9). PMID:21926974
Soybean aphid is the most damaging insect pest of soybean in the Upper Midwest and is primarily controlled by insecticides. Soybean aphid resistance (i.e., Rag genes) has been documented in some soybean lines at chromosomes 6, 7, 13, and 16, but more sources of resistance are needed. Genome-wide ass...
Pryce, J E; Gonzalez-Recio, O; Nieuwhof, G; Wales, W J; Coffey, M P; Hayes, B J; Goddard, M E
A new breeding value that combines the amount of feed saved through improved metabolic efficiency with predicted maintenance requirements is described. The breeding value includes a genomic component for residual feed intake (RFI) combined with maintenance requirements calculated from either a genomic or pedigree estimated breeding value (EBV) for body weight (BW) predicted using conformation traits. Residual feed intake is only available for genotyped Holsteins; however, BW is available for all breeds. The RFI component of the "feed saved" EBV has 2 parts: Australian calf RFI and Australian lactating cow RFI. Genomic breeding values for RFI were estimated from a reference population of 2,036 individuals in a multi-trait analysis including Australian calf RFI (n=843), Australian lactating cow RFI (n=234), and UK and Dutch lactating cow RFI (n=958). In all cases, the RFI phenotypes were deviations from a mean of 0, calculated by correcting dry matter intake for BW, growth, and milk yield (in the case of lactating cows). Single nucleotide polymorphism effects were calculated from the output of genomic BLUP and used to predict breeding values of 4,106 Holstein sires that were genotyped but did not have RFI phenotypes themselves. These bulls already had BW breeding values calculated from type traits, from which maintenance requirements in kilograms of feed per year were inferred. Finally, RFI and the feed required for maintenance (through BW) were used to calculate a feed saved breeding value and expressed as the predicted amount of feed saved per year. Animals that were 1 standard deviation above the mean were predicted to eat 66 kg dry matter less per year at the same level of milk production. In a data set of genotyped Holstein sires, the mean reliability of the feed saved breeding value was 0.37. For Holsteins that are not genotyped and for breeds other than Holsteins, feed saved is calculated using BW only. From April 2015, feed saved has been included as part of
Labitzke, D; Sieme, H; Martinsson, G; Distl, O
The objective of this study was to show whether semen traits of 30 Hanoverian stallions regularly used in AI may be useful for breeding purposes. Semen characteristics were studied using 15 149 ejaculates from 30 Hanoverian stallions of the State Stud Celle of Lower Saxony. Semen samples were collected between 2005 and 2009. Traits analysed were gel-free volume, sperm concentration, total and motile sperm number and progressive motility. A linear multivariate animal model was employed to estimate heritabilities and permanent environmental variances for stallions. The same model was used to predict breeding values for all traits simultaneously. Heritabilities were high for gel-free volume (h(2) = 0.43) and moderate for total number of sperm (h(2) = 0.29) and progressive motility (h(2) = 0.20). Gel-free volume, sperm concentration and total number of sperm were genetically negatively correlated with progressive motility. The effect of the permanent environment for stallions accounted for 9-55% of the trait variance. The total variance among stallions explained 37-69% of the trait variance. The average reliabilities of the breeding values were 0.43-0.76 for the 30 Hanoverian stallions. In conclusion, the study demonstrated large effects of stallions, routinely employed in a breeding programme, on the semen characteristics analysed here. We could demonstrate that estimated breeding values (EBV) with sufficiently high reliabilities can be predicted using data from these stallions and that these EBV are useful in horse breeding programmes to achieve genetic improvement in semen quality.
Evans, Daniel S.; Cailotto, Frederic; Parimi, Neeta; Valdes, Ana M.; Castaño-Betancourt, Martha C.; Liu, Youfang; Kaplan, Robert C.; Bidlingmaier, Martin; Vasan, Ramachandran S.; Teumer, Alexander; Tranah, Gregory J.; Nevitt, Michael C.; Cummings, Steven R.; Orwoll, Eric S.; Barrett-Connor, Elizabeth; Renner, Jordan B.; Jordan, Joanne M.; Doherty, Michael; Doherty, Sally A.; Uitterlinden, Andre G.; van Meurs, Joyce B.J.; Spector, Tim D.; Lories, Rik J.; Lane, Nancy E.
Objectives: To identify genetic associations with hip osteoarthritis (HOA), we performed a meta-analysis of genome-wide association studies (GWAS) of HOA. Methods: The GWAS meta-analysis included approximately 2.5 million imputed HapMap single nucleotide polymorphisms (SNPs). HOA cases and controls defined radiographically and by total hip replacement were selected from the Osteoporotic Fractures in Men (MrOS) Study and the Study of Osteoporotic Fractures (SOF) (654 cases and 4697 controls, combined). Replication of genome-wide significant SNP associations (P-value ≤ 5x10−8) was examined in five studies (3243 cases and 6891 controls, combined). Functional studies were performed using in vitro models of chondrogenesis and osteogenesis. Results: The A allele of rs788748, located 65 kb upstream of the IGFBP3 gene, was associated with lower HOA odds at the genome-wide significance level in the discovery stage (OR = 0.71, P-value = 2x10−8). The association replicated in five studies (OR = 0.92, P-value = 0.020), but the joint analysis of discovery and replication results was not genome-wide significant (P-value = 1x10−6). In separate study populations, the rs788748 A allele was also associated with lower circulating IGFBP3 protein levels (P-value = 4x10−13), suggesting that this SNP or a variant in linkage disequilibrium (LD) could be an IGFBP3 regulatory variant. Results from functional studies were consistent with association results. Chondrocyte hypertrophy, a deleterious event in OA pathogenesis, was largely prevented upon IGFBP3 knockdown in chondrocytes. Furthermore, IGFBP3 overexpression induced cartilage catabolism and osteogenic differentiation. Conclusions: Results from GWAS and functional studies provided suggestive links between IGFBP3 and HOA. PMID:24928840
Chen, L; Schenkel, F; Vinsky, M; Crews, D H; Li, C
In beef cattle, phenotypic data that are difficult and/or costly to measure, such as feed efficiency, and DNA marker genotypes are usually available on a small number of animals of different breeds or populations. To achieve a maximal accuracy of genomic prediction using the phenotype and genotype data, strategies for forming a training population to predict genomic breeding values (GEBV) of the selection candidates need to be evaluated. In this study, we examined the accuracy of predicting GEBV for residual feed intake (RFI) based on 522 Angus and 395 Charolais steers genotyped on SNP with the Illumina Bovine SNP50 Beadchip for 3 training population forming strategies: within breed, across breed, and by pooling data from the 2 breeds (i.e., combined). Two other scenarios with the training and validation data split by birth year and by sire family within a breed were also investigated to assess the impact of genetic relationships on the accuracy of genomic prediction. Three statistical methods including the best linear unbiased prediction with the relationship matrix defined based on the pedigree (PBLUP), based on the SNP genotypes (GBLUP), and a Bayesian method (BayesB) were used to predict the GEBV. The results showed that the accuracy of the GEBV prediction was the highest when the prediction was within breed and when the validation population had greater genetic relationships with the training population, with a maximum of 0.58 for Angus and 0.64 for Charolais. The within-breed prediction accuracies dropped to 0.29 and 0.38, respectively, when the validation populations had a minimal pedigree link with the training population. When the training population of a different breed was used to predict the GEBV of the validation population, that is, across-breed genomic prediction, the accuracies were further reduced to 0.10 to 0.22, depending on the prediction method used. Pooling data from the 2 breeds to form the training population resulted in accuracies increased
Hall, Jacob B; Bush, William S
Most analyses of genome-wide association data consider each variant independently without considering or adjusting for the genetic background present in the rest of the genome. New approaches to genome analysis use representations of genomic sharing to better account for confounding factors like population stratification or to directly approximate heritability through the estimated sharing of individuals in a dataset. These approaches use mixed linear models, which relate genotypic sharing to phenotypic sharing, and rely on the efficient computation of genetic sharing among individuals in a dataset. This unit describes the principles and practical application of mixed models for the analysis of genome-wide association study data. © 2016 by John Wiley & Sons, Inc.
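The "efficient computation of genetic sharing among individuals" mentioned in the abstract above usually means building a genomic relationship matrix from SNP genotypes. The sketch below is a small pure-Python illustration using one common construction (VanRaden's first method: centre allele counts by twice the allele frequency, then scale by the summed heterozygosity). Which estimator the unit itself describes is not stated in the abstract, so this choice is an assumption, and the genotypes are hypothetical.

```python
def grm(genotypes):
    """Genomic relationship matrix from an individuals-by-SNPs list of 0/1/2 allele counts."""
    n = len(genotypes)
    m = len(genotypes[0])
    # Frequency of the counted allele at each SNP
    freqs = [sum(row[j] for row in genotypes) / (2.0 * n) for j in range(m)]
    # Scaling factor: 2 * sum of p * (1 - p) over SNPs
    scale = 2.0 * sum(p * (1.0 - p) for p in freqs)
    if scale == 0.0:
        raise ValueError("no polymorphic SNPs")
    # Centre each allele count by its expected value 2p
    centered = [[row[j] - 2.0 * freqs[j] for j in range(m)] for row in genotypes]
    return [[sum(a * b for a, b in zip(centered[i], centered[k])) / scale
             for k in range(n)] for i in range(n)]

# Hypothetical genotypes: 3 individuals, 2 SNPs
for row in grm([[0, 2], [2, 0], [1, 1]]):
    print(row)
```

In a mixed linear model, a matrix of this kind serves as the covariance structure of the random polygenic effect, which is how genotypic sharing is related to phenotypic sharing. Real datasets would use a vectorized implementation rather than nested lists.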
Ronald, James; Akey, Joshua M
Natural selection, which can be defined as the differential contribution of genetic variants to future generations, is the driving force of Darwinian evolution. Identifying regions of the human genome that have been targets of natural selection is an important step in clarifying human evolutionary history and understanding how genetic variation results in phenotypic diversity; it may also facilitate the search for complex disease genes. Technological advances in high-throughput DNA sequencing and single nucleotide polymorphism genotyping have enabled several genome-wide scans of natural selection to be undertaken. Here, some of the observations that are beginning to emerge from these studies will be reviewed, including evidence for geographically restricted selective pressures (i.e., local adaptation) and a relationship between genes subject to natural selection and human disease. In addition, the paper will highlight several important problems that need to be addressed in future genome-wide studies of natural selection.
Motaung, Thabiso E; Ells, Ruan; Pohl, Carolina H; Albertyn, Jacobus; Tsilo, Toi J
Candida albicans is an important etiological agent of superficial and life-threatening infections in individuals with compromised immune systems. To date, we know of several overlapping genetic networks that govern virulence attributes in this fungal pathogen. Classical use of deletion mutants has led to the discovery of numerous virulence factors over the years, and genome-wide functional analysis has propelled gene discovery at an even faster pace. Indeed, a number of recent studies using large-scale genetic screens followed by genome-wide functional analysis have allowed for the unbiased discovery of many new genes involved in C. albicans biology. Here we share our perspectives on the role of these studies in analyzing fundamental aspects of C. albicans virulence properties.
Mathieson, Iain; Lazaridis, Iosif; Rohland, Nadin; Mallick, Swapan; Patterson, Nick; Roodenberg, Songül Alpaslan; Harney, Eadaoin; Stewardson, Kristin; Fernandes, Daniel; Novak, Mario; Sirak, Kendra; Gamba, Cristina; Jones, Eppie R; Llamas, Bastien; Dryomov, Stanislav; Pickrell, Joseph; Arsuaga, Juan Luís; de Castro, José María Bermúdez; Carbonell, Eudald; Gerritsen, Fokke; Khokhlov, Aleksandr; Kuznetsov, Pavel; Lozano, Marina; Meller, Harald; Mochalov, Oleg; Moiseyev, Vyacheslav; Guerra, Manuel A Rojo; Roodenberg, Jacob; Vergès, Josep Maria; Krause, Johannes; Cooper, Alan; Alt, Kurt W; Brown, Dorcas; Anthony, David; Lalueza-Fox, Carles; Haak, Wolfgang; Pinhasi, Ron; Reich, David
Ancient DNA makes it possible to observe natural selection directly by analysing samples from populations before, during and after adaptation events. Here we report a genome-wide scan for selection using ancient DNA, capitalizing on the largest ancient DNA data set yet assembled: 230 West Eurasians who lived between 6500 and 300 bc, including 163 with newly reported data. The new samples include, to our knowledge, the first genome-wide ancient DNA from Anatolian Neolithic farmers, whose genetic material we obtained by extracting from petrous bones, and who we show were members of the population that was the source of Europe's first farmers. We also report a transect of the steppe region in Samara between 5600 and 300 bc, which allows us to identify admixture into the steppe from at least two external sources. We detect selection at loci associated with diet, pigmentation and immunity, and two independent episodes of selection on height.
Mathieson, Iain; Lazaridis, Iosif; Rohland, Nadin; Mallick, Swapan; Patterson, Nick; Roodenberg, Songül Alpaslan; Harney, Eadaoin; Stewardson, Kristin; Fernandes, Daniel; Novak, Mario; Sirak, Kendra; Gamba, Cristina; Jones, Eppie R.; Llamas, Bastien; Dryomov, Stanislav; Pickrell, Joseph; Arsuaga, Juan Luís; de Castro, José María Bermúdez; Carbonell, Eudald; Gerritsen, Fokke; Khokhlov, Aleksandr; Kuznetsov, Pavel; Lozano, Marina; Meller, Harald; Mochalov, Oleg; Moiseyev, Vyacheslav; Rojo Guerra, Manuel A.; Roodenberg, Jacob; Vergès, Josep Maria; Krause, Johannes; Cooper, Alan; Alt, Kurt W.; Brown, Dorcas; Anthony, David; Lalueza-Fox, Carles; Haak, Wolfgang; Pinhasi, Ron; Reich, David
Ancient DNA makes it possible to directly witness natural selection by analyzing samples from populations before, during and after adaptation events. Here we report the first scan for selection using ancient DNA, capitalizing on the largest genome-wide dataset yet assembled: 230 West Eurasians dating to between 6500 and 1000 BCE, including 163 with newly reported data. The new samples include the first genome-wide data from the Anatolian Neolithic culture whose genetic material we extracted from the DNA-rich petrous bone and who we show were members of the population that was the source of Europe’s first farmers. We also report a complete transect of the steppe region in Samara between 5500 and 1200 BCE that allows us to recognize admixture from at least two external sources into steppe populations during this period. We detect selection at loci associated with diet, pigmentation and immunity, and two independent episodes of selection on height. PMID:26595274
Fall, Tove; Ingelsson, Erik
Until just a few years ago, the genetic determinants of obesity and metabolic syndrome were largely unknown, with the exception of a few forms of monogenic extreme obesity. Since genome-wide association studies (GWAS) became available, large advances have been made. The first single nucleotide polymorphism robustly associated with increased body mass index (BMI) was mapped in 2007 to a gene whose function was unknown at the time. This gene, now known as fat mass and obesity associated (FTO), has been repeatedly replicated in several ethnicities and affects obesity by regulating appetite. Since the first report from a GWAS of obesity, an increasing number of markers have been shown to be associated with BMI, other measures of obesity or fat distribution and metabolic syndrome. This systematic review of obesity GWAS will summarize genome-wide significant findings for obesity and metabolic syndrome and briefly give a few suggestions of what is to be expected in the next few years.
Oud, Bart; van Maris, Antonius J A; Daran, Jean-Marc; Pronk, Jack T
Successful reverse engineering of mutants that have been obtained by nontargeted strain improvement has long presented a major challenge in yeast biotechnology. This paper reviews the use of genome-wide approaches for analysis of Saccharomyces cerevisiae strains originating from evolutionary engineering or random mutagenesis. On the basis of an evaluation of the strengths and weaknesses of different methods, we conclude that for the initial identification of relevant genetic changes, whole genome sequencing is superior to other analytical techniques, such as transcriptome, metabolome, proteome, or array-based genome analysis. Key advantages of this technique over gene expression analysis include the independency of genome sequences on experimental context and the possibility to directly and precisely reproduce the identified changes in naive strains. The predictive value of genome-wide analysis of strains with industrially relevant characteristics can be further improved by classical genetics or simultaneous analysis of strains derived from parallel, independent strain improvement lineages. PMID:22152095
Berthier, David; Peylhard, Moana; Dayo, Guiguigbaza-Kossigan; Flori, Laurence; Sylla, Souleymane; Bolly, Seydou; Sakande, Hassane; Chantal, Isabelle; Thevenon, Sophie
Background Animal African Trypanosomosis particularly affects cattle and dramatically impairs livestock development in sub-Saharan Africa. African Zebu (AFZ) or European taurine breeds usually die of the disease in the absence of treatment, whereas West African taurine breeds (AFT), considered trypanotolerant, are able to control the pathogenic effects of trypanosomosis. Up to now, only one AFT breed, the longhorn N’Dama (NDA), has been largely studied and is considered as the reference trypanotolerant breed. Shorthorn taurine trypanotolerance has never been properly assessed and compared to NDA and AFZ breeds. Methodology/Principal Findings This study compared the trypanotolerant/susceptible phenotype of five West African local breeds that differ in their demographic history. Thirty-six individuals belonging to the longhorn taurine NDA breed, two shorthorn taurine Lagune (LAG) and Baoulé (BAO) breeds, the Zebu Fulani (ZFU) and the Borgou (BOR), an admixed breed between AFT and AFZ, were infected by Trypanosoma congolense IL1180. All the cattle were genetically characterized using dense SNP markers, and parameters linked to parasitaemia, anaemia and leukocytes were analysed using synthetic variables and mixed models. We showed that LAG, followed by NDA and BAO, displayed the best control of anaemia. ZFU showed the greatest anaemia and the BOR breed had an intermediate value, as expected from its admixed origin. Large differences in leukocyte counts were also observed, with higher leukocytosis for AFT. Nevertheless, no differences in parasitaemia were found, except a tendency to take longer to display detectable parasites in ZFU. Conclusions We demonstrated that LAG and BAO are as trypanotolerant as NDA. This study highlights the value of shorthorn taurine breeds, which display strong local adaptation to trypanosomosis. Thanks to further analyses based on comparisons of the genome or transcriptome of the breeds, these results open up the way for better knowledge
Gao, Xiangwei; Wan, Ji; Qian, Shu-Bing
Regulation of translation initiation is a central control point in protein synthesis. Variations of start codon selection contribute to protein diversity and complexity. Systemic mapping of start codon positions and precise measurement of the corresponding initiation rate would transform our understanding of translational control. Here we describe a ribosome profiling approach that enables identification of translation initiation sites on a genome-wide scale. By capturing initiating ribosomes using lactimidomycin, this approach permits qualitative and quantitative analysis of alternative translation initiation.
Speliotes, Elizabeth K.
Sequencing of the human genome has opened up many opportunities to learn about our own genetic susceptibilities to disease. In this Foreword to this issue of Seminars in Liver Disease, I provide some required background to understanding genome-wide association analyses in general, including a list of terms (Table 1) often used in such studies. Five areas of particular significance are then reviewed in detail in the articles that follow. PMID:26676811
Pecetti, Luciano; Brummer, E. Charles; Palmonari, Alberto; Tava, Aldo
Genetic progress for forage quality has been poor in alfalfa (Medicago sativa L.), the most-grown forage legume worldwide. This study aimed at exploring opportunities for marker-assisted selection (MAS) and genomic selection of forage quality traits based on breeding values of parent plants. Some 154 genotypes from a broadly-based reference population were genotyped by genotyping-by-sequencing (GBS), and phenotyped for leaf-to-stem ratio, leaf and stem contents of protein, neutral detergent fiber (NDF) and acid detergent lignin (ADL), and leaf and stem NDF digestibility after 24 hours (NDFD), of their dense-planted half-sib progenies in three growing conditions (summer harvest, full irrigation; summer harvest, suspended irrigation; autumn harvest). Trait-marker analyses were performed on progeny values averaged over conditions, owing to modest germplasm × condition interaction. Genomic selection exploited 11,450 polymorphic SNP markers, whereas a subset of 8,494 M. truncatula-aligned markers were used for a genome-wide association study (GWAS). GWAS confirmed the polygenic control of quality traits and, in agreement with phenotypic correlations, indicated substantially different genetic control of a given trait in stems and leaves. It detected several SNPs in different annotated genes that were highly linked to stem protein content. Also, it identified a small genomic region on chromosome 8 with high concentration of annotated genes associated with leaf ADL, including one gene probably involved in the lignin pathway. Three genomic selection models, i.e., Ridge-regression BLUP, Bayes B and Bayesian Lasso, displayed similar prediction accuracy, whereas SVR-lin was less accurate. Accuracy values were moderate (0.3–0.4) for stem NDFD and leaf protein content, modest for leaf ADL and NDFD, and low to very low for the other traits. Along with previous results for the same germplasm set, this study indicates that GBS data can be exploited to improve both quality traits
Pasanen, Anu; Karjalainen, Minna K.; Bont, Louis; Piippo-Savolainen, Eija; Ruotsalainen, Marja; Goksör, Emma; Kumawat, Kuldeep; Hodemaekers, Hennie; Nuolivirta, Kirsi; Jartti, Tuomas; Wennergren, Göran; Hallman, Mikko; Rämet, Mika; Korppi, Matti
Bronchiolitis is a major cause of hospitalization among infants. Severe bronchiolitis is associated with later asthma, suggesting a common genetic predisposition. Genetic background of bronchiolitis is not well characterized. To identify polymorphisms associated with bronchiolitis, we conducted a genome-wide association study (GWAS) in which 5,300,000 single nucleotide polymorphisms (SNPs) were tested for association in a Finnish–Swedish population of 217 children hospitalized for bronchiolitis and 778 controls. The most promising SNPs (n = 77) were genotyped in a Dutch replication population of 416 cases and 432 controls. Finally, we used a set of 202 Finnish bronchiolitis cases to further investigate candidate SNPs. We did not detect genome-wide significant associations, but several suggestive association signals (p < 10^-5) were observed in the GWAS. In the replication population, three SNPs were nominally associated (p < 0.05). Of them, rs269094 was an expression quantitative trait locus (eQTL) for KCND3, previously shown to be associated with occupational asthma. In the additional set of Finnish cases, the association for another SNP (rs9591920) within a noncoding RNA locus was further strengthened. Our results provide a first genome-wide examination of the genetics underlying bronchiolitis. These preliminary findings require further validation in a larger sample size. PMID:28139761
Hutter, Stephan; Vilella, Albert J; Rozas, Julio
Background DNA sequence polymorphism analysis can provide valuable information on the evolutionary forces shaping nucleotide variation, and provides an insight into the functional significance of genomic regions. The recent ongoing genome projects will radically improve our capabilities to detect specific genomic regions shaped by natural selection. Currently available methods and software, however, are unsatisfactory for such genome-wide analysis. Results We have developed methods for the analysis of DNA sequence polymorphisms at the genome-wide scale. These methods, which have been tested on coalescent-simulated and actual data files from mouse and human, have been implemented in the VariScan software package version 2.0. Additionally, we have also incorporated a graphical-user interface. The main features of this software are: i) exhaustive population-genetic analyses including those based on the coalescent theory; ii) analysis adapted to the shallow data generated by the high-throughput genome projects; iii) use of genome annotations to conduct comprehensive analyses separately for different functional regions; iv) identification of relevant genomic regions by the sliding-window and wavelet-multiresolution approaches; v) visualization of the results integrated with current genome annotations in commonly available genome browsers. Conclusion VariScan is a powerful and flexible suite of software for the analysis of DNA polymorphisms. The current version implements new algorithms, methods, and capabilities, providing an important tool for an exhaustive exploratory analysis of genome-wide DNA polymorphism data. PMID:16968531
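The sliding-window approach mentioned in this abstract can be illustrated with a minimal sketch. It computes nucleotide diversity (the average number of pairwise differences per site) in fixed windows along a toy alignment. The function names, the window and step sizes, and the toy sequences are all hypothetical and chosen only for illustration; this is not VariScan's implementation, and it ignores gaps and missing data.

```python
from itertools import combinations


def pairwise_diversity(seqs):
    """Average number of pairwise differences per site among aligned sequences."""
    n_sites = len(seqs[0])
    pairs = list(combinations(seqs, 2))
    diffs = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return diffs / (len(pairs) * n_sites)


def sliding_window_pi(seqs, window, step):
    """Return a list of (window_start, diversity) along the alignment."""
    length = len(seqs[0])
    results = []
    for start in range(0, length - window + 1, step):
        chunk = [s[start:start + window] for s in seqs]
        results.append((start, pairwise_diversity(chunk)))
    return results


# Toy alignment of four equal-length sequences (hypothetical data).
alignment = ["ACGTACGTAC", "ACGTACGTAC", "ACGAACGTTC", "ACGAACGTAC"]
windows = sliding_window_pi(alignment, window=5, step=5)
for start, pi in windows:
    print(start, round(pi, 4))
```

Windows with unusually low or high diversity relative to the rest of the genome are the kind of region such a scan would flag for closer inspection.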
Stein, Jason L; Hua, Xue; Lee, Suh; Ho, April J; Leow, Alex D; Toga, Arthur W; Saykin, Andrew J; Shen, Li; Foroud, Tatiana; Pankratz, Nathan; Huentelman, Matthew J; Craig, David W; Gerber, Jill D; Allen, April N; Corneveaux, Jason J; Dechairo, Bryan M; Potkin, Steven G; Weiner, Michael W; Thompson, Paul
The structure of the human brain is highly heritable, and is thought to be influenced by many common genetic variants, many of which are currently unknown. Recent advances in neuroimaging and genetics have allowed collection of both highly detailed structural brain scans and genome-wide genotype information. This wealth of information presents a new opportunity to find the genes influencing brain structure. Here we explore the relation between 448,293 single nucleotide polymorphisms in each of 31,622 voxels of the entire brain across 740 elderly subjects (mean age ± s.d.: 75.52 ± 6.82 years; 438 male) including subjects with Alzheimer's disease, Mild Cognitive Impairment, and healthy elderly controls from the Alzheimer's Disease Neuroimaging Initiative (ADNI). We used tensor-based morphometry to measure individual differences in brain structure at the voxel level relative to a study-specific template based on healthy elderly subjects. We then conducted a genome-wide association at each voxel to identify genetic variants of interest. By studying only the most associated variant at each voxel, we developed a novel method to address the multiple comparisons problem and computational burden associated with the unprecedented amount of data. No variant survived the strict significance criterion, but several genes worthy of further exploration were identified, including CSMD2 and CADPS2. These genes have high relevance to brain structure. This is the first voxelwise genome wide association study to our knowledge, and offers a novel method to discover genetic influences on brain structure.
Domingue, Benjamin W.; Wedow, Robbee; Conley, Dalton; McQueen, Matt; Hoffmann, Thomas J.; Boardman, Jason D.
An increasing number of studies that are widely used in the demographic research community have collected genome-wide data from their respondents. It is therefore important that demographers have a proper understanding of some of the methodological tools needed to analyze such data. Our paper details the underlying methodology behind one of the most common techniques for analyzing genome-wide data, Genome-Wide Complex Trait Analysis (GCTA). GCTA models provide heritability estimates for health, health behaviors, or indicators of attainment using data from unrelated persons. Our goal is to describe this model, to highlight the utility of the model for biodemographic research, and to demonstrate the performance of this approach under modifications of the underlying assumptions. The first set of modifications involves changing the nature of the genetic data used to compute genetic similarities between individuals (the genetic relationship matrix). We then explore the sensitivity of the model to heteroscedastic errors. In general, GCTA estimates are robust to the modifications proposed here but we also highlight potential limitations of GCTA estimates. PMID:27050030
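As a rough illustration of the genetic relationship matrix referred to in this abstract, the sketch below averages, over SNPs, the product of two individuals' centred allele counts scaled by 2p(1 - p), where p is the sample allele frequency. This is one commonly used estimator; the function name and toy genotypes are hypothetical, the same expression is applied to diagonal and off-diagonal entries for simplicity, and the result may differ in detail from what the GCTA software computes.

```python
def genetic_relationship_matrix(genotypes):
    """genotypes[j][i] is the allele count (0, 1 or 2) of individual j at SNP i."""
    n_ind = len(genotypes)
    n_snp = len(genotypes[0])
    # Sample allele frequency at each SNP.
    freqs = [sum(g[i] for g in genotypes) / (2 * n_ind) for i in range(n_snp)]
    # Monomorphic SNPs carry no information and would give a zero denominator.
    usable = [i for i, p in enumerate(freqs) if 0.0 < p < 1.0]
    if not usable:
        raise ValueError("no polymorphic SNPs available")
    grm = [[0.0] * n_ind for _ in range(n_ind)]
    for j in range(n_ind):
        for k in range(j, n_ind):
            total = 0.0
            for i in usable:
                p = freqs[i]
                total += ((genotypes[j][i] - 2 * p) * (genotypes[k][i] - 2 * p)
                          / (2 * p * (1 - p)))
            grm[j][k] = grm[k][j] = total / len(usable)
    return grm


# Toy genotypes: individuals 0 and 1 are identical (hypothetical data).
toy = [[0, 1, 2, 1], [0, 1, 2, 1], [2, 1, 0, 1], [1, 2, 1, 0]]
grm = genetic_relationship_matrix(toy)
for row in grm:
    print([round(x, 3) for x in row])
```

In a variance-component analysis, this matrix would play the role of the covariance structure for the genetic random effect; fitting that model is beyond this sketch.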
Verhulst, Brad; Maes, Hermine H; Neale, Michael C
Improving the accuracy of phenotyping through the use of advanced psychometric tools will increase the power to find significant associations with genetic variants and expand the range of possible hypotheses that can be tested on a genome-wide scale. Multivariate methods, such as structural equation modeling (SEM), are valuable in the phenotypic analysis of psychiatric and substance use phenotypes, but these methods have not been integrated into standard genome-wide association analyses because fitting a SEM at each single nucleotide polymorphism (SNP) along the genome was hitherto considered to be too computationally demanding. By developing a method that can efficiently fit SEMs, it is possible to expand the set of models that can be tested. This is particularly necessary in psychiatric and behavioral genetics, where the statistical methods are often handicapped by phenotypes with large components of stochastic variance. Due to the enormous amount of data that genome-wide scans produce, the statistical methods used to analyze the data are relatively elementary and do not directly correspond with the rich theoretical development, and lack the potential to test more complex hypotheses about the measurement of, and interaction between, comorbid traits. In this paper, we present a method to test the association of a SNP with multiple phenotypes or a latent construct on a genome-wide basis using a diagonally weighted least squares (DWLS) estimator for four common SEMs: a one-factor model, a one-factor residuals model, a two-factor model, and a latent growth model. We demonstrate that the DWLS parameters and p-values strongly correspond with the more traditional full information maximum likelihood parameters and p-values. We also present the timing of simulations and power analyses and a comparison with an existing multivariate GWAS software package.
Wei, Sheng; Wang, Li-E; McHugh, Michelle K; Han, Younghun; Xiong, Momiao; Amos, Christopher I; Spitz, Margaret R; Wei, Qingyi
Asbestos exposure is a known risk factor for lung cancer. Although recent genome-wide association studies (GWASs) have identified some novel loci for lung cancer risk, few addressed genome-wide gene-environment interactions. To determine gene-asbestos interactions in lung cancer risk, we conducted genome-wide gene-environment interaction analyses at levels of single nucleotide polymorphisms (SNPs), genes and pathways, using our published Texas lung cancer GWAS dataset. This dataset included 317 498 SNPs from 1154 lung cancer cases and 1137 cancer-free controls. The initial SNP-level P-values for interactions between genetic variants and self-reported asbestos exposure were estimated by unconditional logistic regression models with adjustment for age, sex, smoking status and pack-years. The P-value for the most significant SNP, rs13383928, was 2.17 × 10^-6, which did not reach genome-wide statistical significance. Using a versatile gene-based test approach, we found that the top significant gene was C7orf54, located on 7q32.1 (P = 8.90 × 10^-5). Interestingly, most of the other significant genes were located on 11q13. When we used an improved gene-set-enrichment analysis approach, we found that the Fas signaling pathway and the antigen processing and presentation pathway were most significant (nominal P < 0.001; false discovery rate < 0.05) among 250 pathways containing 17 572 genes. We believe that our analysis is a pilot study that first describes the gene-asbestos interaction in lung cancer risk at levels of SNPs, genes and pathways. Our findings suggest that immune function regulation-related pathways may be mechanistically involved in asbestos-associated lung cancer risk.
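A short worked calculation can make the "did not reach genome-wide significance" statement concrete. The abstract does not say which threshold the authors applied, so the sketch below simply assumes a Bonferroni correction at a family-wise alpha of 0.05 over the 317 498 SNPs reported; both the correction method and the alpha level are assumptions made for illustration.

```python
n_snps = 317_498   # number of SNPs reported in the abstract
alpha = 0.05       # assumed family-wise error rate
top_p = 2.17e-6    # P-value of the top SNP reported in the abstract

# Bonferroni: divide the family-wise alpha by the number of tests.
bonferroni_threshold = alpha / n_snps
passes = top_p < bonferroni_threshold

print(f"Bonferroni threshold: {bonferroni_threshold:.3e}")
print(f"Top SNP passes: {passes}")
```

Under these assumptions the threshold is roughly 1.6 × 10^-7, about an order of magnitude below the top SNP's P-value, which is consistent with the abstract's conclusion.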
Background Linkage disequilibrium (LD) between genes at linked or independent loci can occur at gametic and zygotic levels, known as gametic LD and zygotic LD, respectively. Gametic LD is well known for its roles in fine-scale mapping of quantitative trait loci, genomic selection and evolutionary inference. Less well studied is zygotic LD and its components, which can also be estimated directly from unphased SNPs. Results This study was set up to investigate the genome-wide extent and patterns of zygotic LD and its components in a crossbred cattle population using the genomic data from the Illumina BovineSNP50 beadchip. The animal population arose from repeated crossbreeding of multiple breeds and selection for growth and cow reproduction. The study showed that similar genomic structures in gametic and zygotic LD were observed, with zygotic LD decaying faster than gametic LD over marker distance. The trigenic and quadrigenic disequilibria were generally two- to three-fold smaller than the usual digenic disequilibria (gametic or composite LD). There was less power of testing for these high-order genic disequilibria than for the digenic disequilibria. The power estimates decreased with the distance between markers, though the decay trend is more obvious for the digenic disequilibria than for high-order disequilibria. Conclusions This study is the first major genome-wide survey of all non-allelic associations between pairs of SNPs in a cattle population. Such analysis allows us to assess the relative importance of gametic LD vs. all other non-allelic genic LDs regardless of whether or not the population is in HWE. The observed predominance of digenic LD (gametic or composite LD) coupled with insignificant high-order trigenic and quadrigenic disequilibria supports the current intensive focus on the use of high-density SNP markers for genome-wide association studies and genomic selection activities in the cattle population. PMID:22827586
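For readers unfamiliar with the digenic (gametic) disequilibrium that this abstract uses as its baseline, here is a minimal sketch of the standard two-locus quantities D = p_AB - p_A p_B and r^2 = D^2 / (p_A(1 - p_A) p_B(1 - p_B)), computed from phased haplotypes. The function name and toy haplotypes are hypothetical. The sketch does not cover the zygotic, trigenic or quadrigenic components studied in the paper, nor the estimation from unphased genotypes that the authors describe.

```python
def gametic_ld(haplotypes):
    """haplotypes: list of (a, b) pairs, where 1 marks the tracked allele at locus A / B."""
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n
    p_b = sum(b for _, b in haplotypes) / n
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    # r^2 is undefined when either locus is monomorphic.
    r2 = d * d / denom if denom > 0 else float("nan")
    return d, r2


# Ten toy haplotypes (hypothetical data): mostly AB and ab, few recombinant types.
haps = [(1, 1)] * 4 + [(1, 0)] + [(0, 1)] + [(0, 0)] * 4
d, r2 = gametic_ld(haps)
print(round(d, 4), round(r2, 4))
```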
von Rönn, Jan A C; Shafer, Aaron B A; Wolf, Jochen B W
Transcontinental migration is a fascinating example of how animals can respond to climatic oscillation. Yet, quantitative data on fitness components are scarce, and the resulting population genetic consequences are poorly understood. Migratory divides, hybrid zones with a transition in migratory behaviour, provide a natural setting to investigate the micro-evolutionary dynamics induced by migration under sympatric conditions. Here, we studied the effects of migratory programme on survival, trait evolution and genome-wide patterns of population differentiation in a migratory divide of European barn swallows. We sampled a total of 824 individuals from both allopatric European populations wintering in central and southern Africa, respectively, along with two mixed populations from within the migratory divide. While most morphological characters varied by latitude consistent with Bergmann's rule, wing length co-varied with distance to wintering grounds. Survival data collected during a 5-year period provided strong evidence that this covariance is repeatedly generated by disruptive selection against intermediate phenotypes. Yet, selection-induced divergence did not translate into genome-wide genetic differentiation as assessed by microsatellites, mtDNA and >20 000 genome-wide SNP markers; nor did we find evidence of local genomic selection between migratory types. Among breeding populations, a single outlier locus mapped to the BUB1 gene with a role in mitotic and meiotic organization. Overall, this study provides evidence for an adaptive response to variation in migration behaviour continuously eroded by gene flow under current conditions of nonassortative mating. It supports the theoretical prediction that population differentiation is difficult to achieve under conditions of gene flow despite measurable disruptive selection.
Chen, Huan; Gu, Xiao-hong; Zhou, Yuxi; Ge, Zeng; Wang, Bin; Siok, Wai Ting; Wang, Guoqing; Huen, Michael; Jiang, Yuyang; Tan, Li-Hai; Sun, Yimin
Mathematics ability is a complex cognitive trait with polygenic heritability. Genome-wide association study (GWAS) has been an effective approach to investigate genetic components underlying mathematics ability. Although previous studies reported several candidate genetic variants, none of them exceeded the genome-wide significance threshold in general populations. Herein, we performed GWAS in Chinese elementary school students to identify potential genetic variants associated with mathematics ability. The discovery stage included 494 and 504 individuals from two independent cohorts respectively. The replication stage included another cohort of 599 individuals. In total, 28 of 81 candidate SNPs that met validation criteria were further replicated. Combined meta-analysis of three cohorts identified four SNPs (rs1012694, rs11743006, rs17778739 and rs17777541) of the SPOCK1 gene showing association with mathematics ability (minimum p value 5.67 × 10^-10, maximum β −2.43). The SPOCK1 gene is located on chromosome 5q31.2 and encodes a highly conserved glycoprotein, testican-1, which was associated with tumor progression and prognosis as well as neurogenesis. This is the first study to report genome-wide significant association of individual SNPs with mathematics ability in general populations. Our preliminary results further supported the role of SPOCK1 during neurodevelopment. The genetic complexities underlying mathematics ability might help explain the basis of human cognition and intelligence at the genetic level. PMID:28155865
Kanai, Masahiro; Tanaka, Toshihiro; Okada, Yukinori
To assess the statistical significance of associations between variants and traits, genome-wide association studies (GWAS) should employ an appropriate threshold that accounts for the massive burden of multiple testing in the study. Although most studies in the current literature commonly set a genome-wide significance threshold at the level of P=5.0 × 10^-8, the adequacy of this value for respective populations has not been fully investigated. To empirically estimate thresholds for different ancestral populations, we conducted GWAS simulations using the 1000 Genomes Phase 3 data set for Africans (AFR), Europeans (EUR), Admixed Americans (AMR), East Asians (EAS) and South Asians (SAS). The estimated empirical genome-wide significance thresholds were Psig=3.24 × 10^-8 (AFR), 9.26 × 10^-8 (EUR), 1.83 × 10^-7 (AMR), 1.61 × 10^-7 (EAS) and 9.46 × 10^-8 (SAS). We additionally conducted trans-ethnic meta-analyses across all populations (ALL) and all populations except for AFR (ΔAFR), which yielded Psig=3.25 × 10^-8 (ALL) and 4.20 × 10^-8 (ΔAFR). Our results indicate that the current threshold (P=5.0 × 10^-8) is overly stringent for all ancestral populations except for Africans; however, we should employ a more stringent threshold when conducting a meta-analysis, regardless of the presence of African samples. PMID:27305981
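One common way to estimate an empirical genome-wide threshold is to simulate many null scans, record the smallest P-value from each, and take the alpha-quantile of those minima. The sketch below illustrates that idea only. It assumes independent tests with uniform null P-values, whereas real SNPs are correlated through linkage disequilibrium, and it does not use genotype data. The function name, test count and replicate count are hypothetical, and the abstract does not confirm that the authors used exactly this procedure.

```python
import random


def empirical_threshold(n_tests, n_replicates, alpha=0.05, seed=1):
    """Alpha-quantile of the minimum P-value across simulated null scans."""
    rng = random.Random(seed)
    # Each replicate is one null "scan": n_tests independent Uniform(0, 1) P-values.
    min_ps = sorted(
        min(rng.random() for _ in range(n_tests)) for _ in range(n_replicates)
    )
    # Pick the order statistic closest to the alpha-quantile of the minima.
    index = max(int(alpha * n_replicates) - 1, 0)
    return min_ps[index]


threshold = empirical_threshold(n_tests=500, n_replicates=1000)
print(f"Empirical threshold: {threshold:.2e}")
print(f"Bonferroni for comparison: {0.05 / 500:.2e}")
```

With independent tests the empirical value lands close to the Bonferroni value. Correlation among real markers reduces the effective number of tests, which is why empirical thresholds can be less stringent than a naive correction.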
Adkins, Daniel E; Clark, Shaunna L; Copeland, William E; Kennedy, Martin; Conway, Kevin; Angold, Adrian; Maes, Hermine; Liu, Youfang; Kumar, Gaurav; Erkanli, Alaattin; Patkar, Ashwin A; Silberg, Judy; Brown, Tyson H; Fergusson, David M; Horwood, L John; Eaves, Lindon; van den Oord, Edwin J C G; Sullivan, Patrick F; Costello, E J
The public health burden of alcohol is unevenly distributed across the life course, with levels of use, abuse, and dependence increasing across adolescence and peaking in early adulthood. Here, we leverage this temporal patterning to search for common genetic variants predicting developmental trajectories of alcohol consumption. Comparable psychiatric evaluations measuring alcohol consumption were collected in three longitudinal community samples (N=2,126, obs=12,166). Repeated consumption measurements spanning adolescence and early adulthood were analyzed using linear mixed models, estimating individual consumption trajectories, which were then tested for association with Illumina 660W-Quad genotype data (866,099 SNPs after imputation and QC). Association results were combined across samples using standard meta-analysis methods. Four meta-analysis associations satisfied our pre-determined genome-wide significance criterion (FDR<0.1) and six others met our 'suggestive' criterion (FDR<0.2). Genome-wide significant associations were highly biologically plausible, including associations within GABA transporter 1, SLC6A1 (solute carrier family 6, member 1), and exonic hits in LOC100129340 (mitofusin-1-like). Pathway analyses elaborated single marker results, indicating significantly enriched associations to intuitive biological mechanisms, including neurotransmission, xenobiotic pharmacodynamics, and nuclear hormone receptors (NHR). These findings underscore the value of combining longitudinal behavioral data and genome-wide genotype information in order to study developmental patterns and improve statistical power in genomic studies.
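The two FDR criteria above (0.1 for significance, 0.2 for 'suggestive') can be illustrated with a small sketch. The abstract does not say which FDR procedure was used; the code below assumes the Benjamini-Hochberg step-up rule, and the p-values are made-up toy numbers, not study results.

```python
def benjamini_hochberg(p_values, q):
    """Return sorted indices of hypotheses rejected at FDR level q.

    Benjamini-Hochberg step-up: find the largest rank k such that the
    k-th smallest p-value is <= k * q / m, then reject the k smallest.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * q / m:
            cutoff_rank = rank
    return sorted(order[:cutoff_rank])

# Toy p-values (hypothetical, for illustration only).
toy_p = [0.001, 0.03, 0.07, 0.12, 0.6]

significant = benjamini_hochberg(toy_p, q=0.1)   # stricter criterion
suggestive = benjamini_hochberg(toy_p, q=0.2)    # looser criterion
print(significant, suggestive)
```

With these toy values the looser level admits two extra markers, which mirrors how a 'suggestive' tier extends a significant one.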
Laurie, Cathy C.; Doheny, Kimberly F.; Mirel, Daniel B.; Pugh, Elizabeth W.; Bierut, Laura J.; Bhangale, Tushar; Boehm, Frederick; Caporaso, Neil E.; Cornelis, Marilyn C.; Edenberg, Howard J.; Gabriel, Stacy B.; Harris, Emily L.; Hu, Frank B.; Jacobs, Kevin; Kraft, Peter; Landi, Maria Teresa; Lumley, Thomas; Manolio, Teri A.; McHugh, Caitlin; Painter, Ian; Paschall, Justin; Rice, John P.; Rice, Kenneth M.; Zheng, Xiuwen; Weir, Bruce S.
Genome-wide scans of nucleotide variation in human subjects are providing an increasing number of replicated associations with complex disease traits. Most of the variants detected have small effects and, collectively, they account for a small fraction of the total genetic variance. Very large sample sizes are required to identify and validate findings. In this situation, even small sources of systematic or random error can cause spurious results or obscure real effects. The need for careful attention to data quality has been appreciated for some time in this field, and a number of strategies for quality control and quality assurance (QC/QA) have been developed. Here we extend these methods and describe a system of QC/QA for genotypic data in genome-wide association studies. This system includes some new approaches that (1) combine analysis of allelic probe intensities and called genotypes to distinguish gender misidentification from sex chromosome aberrations, (2) detect autosomal chromosome aberrations that may affect genotype calling accuracy, (3) infer DNA sample quality from relatedness and allelic intensities, (4) use duplicate concordance to infer SNP quality, (5) detect genotyping artifacts from dependence of Hardy-Weinberg equilibrium (HWE) test p-values on allelic frequency, and (6) demonstrate sensitivity of principal components analysis (PCA) to SNP selection. The methods are illustrated with examples from the ‘Gene Environment Association Studies’ (GENEVA) program. The results suggest several recommendations for QC/QA in the design and execution of genome-wide association studies. PMID:20718045
Tsai, Kate L.; Noorai, Rooksana E.; Starr-Moss, Alison N.; Quignon, Pascale; Rinz, Caitlin J.; Ostrander, Elaine A.; Steiner, Jörg M.; Murphy, Keith E.
The German Shepherd Dog (GSD) is a popular working and companion breed for which over 50 hereditary diseases have been documented. Herein, SNP profiles for 197 GSDs were generated using the Affymetrix v2 canine SNP array for a genome-wide association study to identify loci associated with four diseases: pituitary dwarfism, degenerative myelopathy (DM), congenital megaesophagus (ME), and pancreatic acinar atrophy (PAA). A locus on Chr 9 is strongly associated with pituitary dwarfism and is proximal to a plausible candidate gene, LHX3. Results for DM confirm a major locus encompassing SOD1, in which an associated point mutation was previously identified, but do not suggest modifier loci. Several SNPs on Chr 12 are associated with ME and a 4.7 Mb haplotype block is present in affected dogs. Analysis of additional ME cases for a SNP within the haplotype provides further support for this association. Results for PAA indicate more complex genetic underpinnings. Several regions on multiple chromosomes reach genome-wide significance. However, no major locus is apparent and only two associated haplotype blocks, on Chrs 7 and 12, are observed. These data suggest that PAA may be governed by multiple loci with small effects, or it may be a heterogeneous disorder. PMID:22105877
Shi, Jiaqin; Huang, Shunmou; Zhan, Jiepeng; Yu, Jingyin; Wang, Xinfa; Hua, Wei; Liu, Shengyi; Liu, Guihua; Wang, Hanzhong
Although much research has been conducted, the pattern of microsatellite distribution has remained ambiguous, and the development/utilization of microsatellite markers has still been limited/inefficient in Brassica, due to the lack of genome sequences. In view of this, we conducted genome-wide microsatellite characterization and marker development in three recently sequenced Brassica crops: Brassica rapa, Brassica oleracea and Brassica napus. The analysed microsatellite characteristics of these Brassica species were highly similar or almost identical, which suggests that the pattern of microsatellite distribution is likely conservative in Brassica. The genomic distribution of microsatellites was highly non-uniform and positively or negatively correlated with genes or transposable elements, respectively. Of the total of 115 869, 185 662 and 356 522 simple sequence repeat (SSR) markers developed with high frequencies (408.2, 343.8 and 356.2 per Mb or one every 2.45, 2.91 and 2.81 kb, respectively), most represented new SSR markers, the majority had determined physical positions, and a large number were genic or putative single-locus SSR markers. We also constructed a comprehensive database for the newly developed SSR markers, which was integrated with public Brassica SSR markers and annotated genome components. The genome-wide SSR markers developed in this study provide a useful tool to extend the annotated genome resources of sequenced Brassica species to genetic study/breeding in different Brassica species.
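The marker densities and spacings quoted above are two views of the same quantity, since 1 Mb = 1000 kb. A short sketch of the conversion, using only the figures reported in the abstract (the function name `spacing_kb` is illustrative):

```python
# (markers per Mb, reported spacing in kb), in the order given in the abstract.
reported = [(408.2, 2.45), (343.8, 2.91), (356.2, 2.81)]

def spacing_kb(markers_per_mb):
    """Average distance between markers in kb (1 Mb = 1000 kb)."""
    return 1000.0 / markers_per_mb

for density, stated in reported:
    computed = spacing_kb(density)
    print(f"{density} per Mb -> one every {computed:.2f} kb (stated: {stated})")
```

The computed spacings agree with the stated values to two decimal places.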
Deitz, Kevin C.; Athrey, Giridhar A.; Jawara, Musa; Overgaard, Hans J.; Matias, Abrahan; Slotman, Michel A.
Anopheles melas is a member of the recently diverged An. gambiae species complex, a model for speciation studies, and is a locally important malaria vector along the West-African coast where it breeds in brackish water. A recent population genetic study of An. melas revealed species-level genetic differentiation between three population clusters. An. melas West extends from The Gambia to the village of Tiko, Cameroon. The other mainland cluster, An. melas South, extends from the southern Cameroonian village of Ipono to Angola. Bioko Island, Equatorial Guinea An. melas populations are genetically isolated from mainland populations. To examine how genetic differentiation between these An. melas forms is distributed across their genomes, we conducted a genome-wide analysis of genetic differentiation and selection using whole genome sequencing data of pooled individuals (Pool-seq) from a representative population of each cluster. The An. melas forms exhibit high levels of genetic differentiation throughout their genomes, including the presence of numerous fixed differences between clusters. Although the level of divergence between the clusters is on a par with that of other species within the An. gambiae complex, patterns of genome-wide divergence and diversity do not provide evidence for the presence of pre- and/or postmating isolating mechanisms in the form of speciation islands. These results are consistent with an allopatric divergence process with little or no introgression. PMID:27466271
Qin, Peng; Lin, Yu; Hu, Yaodong; Liu, Kun; Mao, Shuangshuang; Li, Zhanyi; Wang, Jirui; Liu, Yaxi; Wei, Yuming; Zheng, Youliang
The D-genome progenitor of wheat (Triticum aestivum), Aegilops tauschii, possesses numerous genes for resistance to abiotic stresses, including drought. Therefore, information on the genetic architecture of A. tauschii can aid the development of drought-resistant wheat varieties. Here, we evaluated 13 traits in 373 A. tauschii accessions grown under normal and polyethylene glycol-simulated drought stress conditions and performed a genome-wide association study using 7,185 single nucleotide polymorphism (SNP) markers. We identified 208 and 28 SNPs associated with all traits using the general linear model and mixed linear model, respectively, while both models detected 25 significant SNPs with genome-wide distribution. Public database searches revealed several candidate/flanking genes related to drought resistance that were grouped into three categories according to the type of encoded protein (enzyme, storage protein, and drought-induced protein). This study provided essential information for SNPs and genes related to drought resistance in A. tauschii and wheat, and represents a foundation for breeding drought-resistant wheat cultivars using marker-assisted selection. PMID:27560650
Wang, Jiao; Chu, Shanshan; Zhang, Huairen; Zhu, Ying; Cheng, Hao; Yu, Deyue
Domestication of soybeans occurred under intense human-directed selection aimed at developing high-yielding lines. Tracing the domestication history and identifying the genes underlying soybean domestication require further exploration. Here, we developed a high-throughput NJAU 355 K SoySNP array and used this array to study the genetic variation patterns in 367 soybean accessions, including 105 wild soybeans and 262 cultivated soybeans. The population genetic analysis suggests that cultivated soybeans have tended to originate from northern and central China, from where they spread to other regions, accompanied by a gradual increase in seed weight. Genome-wide scanning for evidence of artificial selection revealed signs of selective sweeps involving genes controlling domestication-related agronomic traits including seed weight. To further identify genomic regions related to seed weight, a genome-wide association study (GWAS) was conducted across multiple environments in wild and cultivated soybeans. As a result, a strong linkage disequilibrium region on chromosome 20 was found to be significantly correlated with seed weight in cultivated soybeans. Collectively, these findings should provide an important basis for genomic-enabled breeding and advance the study of functional genomics in soybean.
Peñagaricano, F; Weigel, K A; Khatib, H
The decline in the reproductive efficiency of dairy cattle has become a challenging problem worldwide. Female fertility is now taken into account in breeding goals while generally less attention is given to male fertility. The objective of this study was to perform a genome-wide association study in Holstein bulls to identify genetic variants significantly related to sire conception rate (SCR), a new phenotypic evaluation of bull fertility. The analysis included 1755 sires with SCR data and 38,650 single nucleotide polymorphisms (SNPs) spanning the entire bovine genome. Associations between SNPs and SCR were analyzed using a mixed linear model that included a random polygenic effect and SNP genotype either as a linear covariate or as a categorical variable. A multiple testing correction approach was used to account for the correlation between SNPs because of linkage disequilibrium. After genome-wide correction, eight SNPs showed significant association with SCR. Some of these SNPs are located close to or in the middle of genes with functions related to male fertility, such as the sperm acrosome reaction, chromatin remodeling during spermatogenesis, and the meiotic process during male germ cell maturation. Some SNPs showed marked dominance effects, which provide more evidence for the relevance of non-additive effects in traits closely related to fitness such as fertility. The results could contribute to the identification of genes and pathways associated with male fertility in dairy cattle.
Zhang, Wanchang; Yang, Bin; Zhang, Junjie; Cui, Leilei; Ma, Junwu; Chen, Congying; Ai, Huashui; Xiao, Shijun; Ren, Jun; Huang, Lusheng
Fatty acid composition profiles are important indicators of meat quality and flavor. Metabolic indices of fatty acids more faithfully reflect meat nutrition and public acceptance. To investigate the genetic mechanism of fatty acid metabolic indices in pork, we conducted genome-wide association studies (GWAS) for 33 fatty acid metabolic traits in five pig populations. We identified a total of 865 single nucleotide polymorphisms (SNPs), corresponding to 11 genome-wide significant loci on nine chromosomes and 12 suggestive loci on nine chromosomes. Our findings not only confirmed seven previously reported QTL with stronger association strength, but also revealed four novel population-specific loci, showing that investigations on intermediate phenotypes like the metabolic traits of fatty acids can increase the statistical power of GWAS for end-point phenotypes. We proposed a list of candidate genes at the identified loci, including three novel genes (FADS2, SREBF1 and PLA2G7). Further, we constructed the functional networks involving these candidate genes and deduced the potential fatty acid metabolic pathway. These findings advance our understanding of the genetic basis of fatty acid composition in pigs. The results from European hybrid commercial pigs can be immediately translated into breeding practice for beneficial fatty acid composition. PMID:27097669
Markt, Sarah C; Nuttall, Elizabeth; Turman, Constance; Sinnott, Jennifer; Rimm, Eric B; Ecsedy, Ethan; Unger, Robert H; Fall, Katja; Finn, Stephen; Jensen, Majken K; Rider, Jennifer R; Kraft, Peter
Objective: To determine the inherited factors associated with the ability to smell asparagus metabolites in urine. Design: Genome wide association study. Setting: Nurses’ Health Study and Health Professionals Follow-up Study cohorts. Participants: 6909 men and women of European-American descent with available genetic data from genome wide association studies. Main outcome measure: Participants were characterized as asparagus smellers if they strongly agreed with the prompt “after eating asparagus, you notice a strong characteristic odor in your urine,” and anosmic if otherwise. We calculated per-allele estimates of asparagus anosmia for about nine million single nucleotide polymorphisms using logistic regression. P values <5×10−8 were considered as genome wide significant. Results: 58.0% of men (n=1449/2500) and 61.5% of women (n=2712/4409) had anosmia. 871 single nucleotide polymorphisms reached genome wide significance for asparagus anosmia, all in a region on chromosome 1 (1q44: 248139851-248595299) containing multiple genes in the olfactory receptor 2 (OR2) family. Conditional analyses revealed three independent markers associated with asparagus anosmia: rs13373863, rs71538191, and rs6689553. Conclusion: A large proportion of people have asparagus anosmia. Genetic variation near multiple olfactory receptor genes is associated with the ability of an individual to smell the metabolites of asparagus in urine. Future replication studies are necessary before considering targeted therapies to help anosmic people discover what they are missing. PMID:27965198
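The reported prevalences can be checked directly from the counts in the abstract. The short sketch below recomputes the two percentages and confirms that the group sizes add up to the stated number of participants; the variable and function names are illustrative.

```python
# (anosmic count, group size) as reported in the abstract.
counts = {"men": (1449, 2500), "women": (2712, 4409)}

def percent(numerator, denominator):
    """Return numerator / denominator as a percentage."""
    return 100.0 * numerator / denominator

for group, (anosmic, total) in counts.items():
    print(f"{group}: {percent(anosmic, total):.1f}% anosmic")

# Group sizes should sum to the 6909 participants stated in the abstract.
total_participants = sum(total for _, total in counts.values())
print(total_participants)
```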
Webb, Bradley T; Guo, An-Yuan; Maher, Brion S; Zhao, Zhongming; van den Oord, Edwin J; Kendler, Kenneth S; Riley, Brien P; Gillespie, Nathan A; Prescott, Carol A; Middeldorp, Christel M; Willemsen, Gonneke; de Geus, Eco JC; Hottenga, Jouke-Jan; Boomsma, Dorret I; Slagboom, Eline P; Wray, Naomi R; Montgomery, Grant W; Martin, Nicholas G; Wright, Margie J; Heath, Andrew C; Madden, Pamela A; Gelernter, Joel; Knowles, James A; Hamilton, Steven P; Weissman, Myrna M; Fyer, Abby J; Huezo-Diaz, Patricia; McGuffin, Peter; Farmer, Anne; Craig, Ian W; Lewis, Cathryn; Sham, Pak; Crowe, Raymond R; Flint, Jonathan; Hettema, John M
Genetic factors underlying trait neuroticism, reflecting a tendency towards negative affective states, may overlap genetic susceptibility for anxiety disorders and help explain the extensive comorbidity amongst internalizing disorders. Genome-wide linkage (GWL) data from several studies of neuroticism and anxiety disorders have been published, providing an opportunity to test such hypotheses and identify genomic regions that harbor genes common to these phenotypes. In all, 11 independent GWL studies of either neuroticism (n=8) or anxiety disorders (n=3) were collected, which comprised 5341 families with 15 529 individuals. The rank-based genome scan meta-analysis (GSMA) approach was used to analyze each trait separately and combined, and global correlations between results were examined. False discovery rate (FDR) analysis was performed to test for enrichment of significant effects. Using 10 cM intervals, bins nominally significant for both GSMA statistics, PSR and POR, were found on chromosomes 9, 11, 12, and 14 for neuroticism and on chromosomes 1, 5, 15, and 16 for anxiety disorders. Genome-wide, the results for the two phenotypes were significantly correlated, and a combined analysis identified additional nominally significant bins. Although none reached genome-wide significance, an excess of significant PSR P-values was observed, with 12 bins falling under an FDR threshold of 0.50. As demonstrated by our identification of multiple, consistent signals across the genome, meta-analytically combining existing GWL data is a valuable approach to narrowing down regions relevant for anxiety-related phenotypes. This may prove useful for prioritizing emerging genome-wide association data for anxiety disorders. PMID:22473089
Phipps, Amanda I.; Passarelli, Michael N.; Chan, Andrew T.; Harrison, Tabitha A.; Jeon, Jihyoun; Hutter, Carolyn M.; Berndt, Sonja I.; Brenner, Hermann; Caan, Bette J.; Campbell, Peter T.; Chang-Claude, Jenny; Chanock, Stephen J.; Cheadle, Jeremy P.; Curtis, Keith R.; Duggan, David; Fisher, David; Fuchs, Charles S.; Gala, Manish; Giovannucci, Edward L.; Hayes, Richard B.; Hoffmeister, Michael; Hsu, Li; Jacobs, Eric J.; Jansen, Lina; Kaplan, Richard; Kap, Elisabeth J.; Maughan, Timothy S.; Potter, John D.; Schoen, Robert E.; Seminara, Daniela; Slattery, Martha L.; West, Hannah; White, Emily; Peters, Ulrike; Newcomb, Polly A.
Genome-wide association studies have identified several germline single nucleotide polymorphisms (SNPs) significantly associated with colorectal cancer (CRC) incidence. Common germline genetic variation may also be related to CRC survival. We used a discovery-based approach to identify SNPs related to survival outcomes after CRC diagnosis. Genome-wide genotyping arrays were conducted for 3494 individuals with invasive CRC enrolled in six prospective cohort studies (median study-specific follow-up = 4.2–8.1 years). In pooled analyses, we used Cox regression to assess SNP-specific associations with CRC-specific and overall survival, with additional analyses stratified by stage at diagnosis. Top findings were followed up in independent studies. A P value threshold of P < 5×10−8 in analyses combining discovery and follow-up studies was required for genome-wide significance. Among individuals with distant-metastatic CRC, several SNPs at 6p12.1, nearest the ELOVL5 gene, were statistically significantly associated with poorer survival, with the strongest associations noted for rs209489 [hazard ratio (HR) = 1.8, P = 7.6×10−10 and HR = 1.8, P = 3.7×10−9 for CRC-specific and overall survival, respectively]. No SNPs were statistically significantly associated with survival among all cases combined or in cases without distant-metastases. SNPs in 6p12.1/ELOVL5 were associated with survival outcomes in individuals with distant-metastatic CRC, and merit further follow-up for functional significance. Findings from this genome-wide association study highlight the potential importance of genetic variation in CRC prognosis and provide clues to genomic regions of potential interest. PMID:26586795
Background Several studies have examined the accuracy of genomic selection both within and across purebred beef or dairy populations. However, the accuracy of direct genomic breeding values (DGVs) has been less well studied in crossbred or admixed cattle populations. We used a population of 3,240 cr...
This research elucidated genetic relationships between carcass traits, ultrasound indicator traits, and their respective molecular breeding values (MBV). Animals whose MBV data were used to estimate (co)variance components were not previously used in development of the MBV. Results are presented fo...
The growing interest in using open-pollinated varieties (OPVs) and varietal hybrids (OPVhs) of corn (Zea mays L.) especially in breeding programs for organic and low-input farming reflects the value of large plasticity levels available in their plant, ear, and kernel traits. We estimated and partiti...
Liao, R; Zhang, X; Chen, Q; Wang, Z; Wang, Q; Yang, C; Pan, Y
This study was designed to investigate the genetic basis of growth and egg traits in Dongxiang blue-shelled chickens and White Leghorn chickens. In this study, we employed a reduced representation sequencing approach called genotyping by genome reducing and sequencing to detect genome-wide SNPs in 252 Dongxiang blue-shelled chickens and 252 White Leghorn chickens. The Dongxiang blue-shelled chicken breed has many specific traits and is characterized by blue-shelled eggs, black plumage, black skin, black bone and black organs. The White Leghorn chicken is an egg-type breed with high productivity. As multibreed genome-wide association studies (GWASs) can improve precision due to less linkage disequilibrium across breeds, a multibreed GWAS was performed with 156 575 SNPs to identify the associated variants underlying growth and egg traits within the two chicken breeds. The analysis revealed 32 SNPs exhibiting a significant genome-wide association with growth and egg traits. Some of the significant SNPs are located in genes that are known to impact growth and egg traits, but nearly half of the significant SNPs are located in genes with unclear functions in chickens. To our knowledge, this is the first multibreed genome-wide report for the genetics of growth and egg traits in the Dongxiang blue-shelled and White Leghorn chickens.
Metzger, Julia; Ohnesorge, Bernhard; Distl, Ottmar
Equine guttural pouch tympany (GPT) is a hereditary condition affecting foals in their first months of life. Complex segregation analyses in Arabian and German warmblood horses showed the involvement of a major gene as very likely. Genome-wide linkage and association analyses including a high density marker set of single nucleotide polymorphisms (SNPs) were performed to map the genomic region harbouring the potential major gene for GPT. A total of 85 Arabian and 373 German warmblood horses were genotyped on the Illumina equine SNP50 beadchip. Non-parametric multipoint linkage analyses showed genome-wide significance on horse chromosomes (ECA) 3 for German warmblood at 16-26 Mb and 34-55 Mb and for Arabian on ECA15 at 64-65 Mb. Genome-wide association analyses confirmed the linked regions for both breeds. In Arabian, genome-wide association was detected at 64 Mb within the region with the highest linkage peak on ECA15. For German warmblood, signals for genome-wide association were close to the peak region of linkage at 52 Mb on ECA3. The odds ratio for the SNP with the highest genome-wide association was 0.12 for the Arabian. In conclusion, the refinement of the regions with the Illumina equine SNP50 beadchip is an important step to unravel the responsible mutations for GPT.
Joerg, H; Meili, C; Ruprecht, O; Bangerter, E; Burren, A; Bigler, A
Supernumerary teats represent a common abnormality of the bovine udder. A genome-wide association study was performed based on the proportion of the occurrence of supernumerary teats in the daughters of 1097 Holstein bulls. The heritability of caudal supernumerary teats without mammary gland in this study was 0.604. The largest proportion of the heritability was attributable to BTA 20. The strongest evidence for association was with five SNPs on chromosome 20, referred to as a QTL. The mode of inheritance at this QTL was dominant. These findings reveal that the occurrence of caudal supernumerary teats without mammary gland in Holstein cattle is influenced by a QTL on chromosome 20 and a polygenic part. The data support the high potential of the SNPs in the QTL region as markers for breeding against caudal supernumerary teats.
Wu, Xiaoyun; Wang, Kun; Ding, Xuezhi; Wang, Mingcheng; Chu, Min; Xie, Xiuyue; Qiu, Qiang; Yan, Ping
The absence of horns, known as the polled phenotype, is an economically important trait in modern yak husbandry, but the genomic structure and genetic basis of this phenotype have yet to be discovered. Here, we conducted a genome-wide association study with a panel of 10 horned and 10 polled yaks using whole genome sequencing. We mapped the POLLED locus to a 200-kb interval, which comprises three protein-coding genes. Further characterization of the candidate region showed recent artificial selection signals resulting from the breeding process. We suggest that expressional variations rather than structural variations in protein probably contribute to the polled phenotype. Our results not only represent the first and important step in establishing the genomic structure of the polled region in yak, but also add to our understanding of the polled trait in bovid species. PMID:27389700
The development of the dorsal vessel in Drosophila is one of the first systems in which key mechanisms regulating cardiogenesis have been defined in great detail at the genetic and molecular level. Due to evolutionary conservation, these findings have also provided major inputs into studies of cardiogenesis in vertebrates. Many of the major components that control Drosophila cardiogenesis were discovered based on candidate gene approaches and their functions were defined by employing the outstanding genetic tools and molecular techniques available in this system. More recently, approaches have been taken that aim to interrogate the entire genome in order to identify novel components and describe genomic features that are pertinent to the regulation of heart development. Apart from classical forward genetic screens, the availability of the thoroughly annotated Drosophila genome sequence made new genome-wide approaches possible, which include the generation of massive numbers of RNA interference (RNAi) reagents that were used in forward genetic screens, as well as studies of the transcriptomes and proteomes of the developing heart under normal and experimentally manipulated conditions. Moreover, genome-wide chromatin immunoprecipitation experiments have been performed with the aim to define the full set of genomic binding sites of the major cardiogenic transcription factors, their relevant target genes, and a more complete picture of the regulatory network that drives cardiogenesis. This review will give an overview on these genome-wide approaches to Drosophila heart development and on computational analyses of the obtained information that ultimately aim to provide a description of this process at the systems level. PMID:27294102
Ioannidis, John P A; Thomas, Gilles; Daly, Mark J
Studies using genome-wide platforms have yielded an unprecedented number of promising signals of association between genomic variants and human traits. This Review addresses the steps required to validate, augment and refine such signals to identify underlying causal variants for well-defined phenotypes. These steps include: large-scale exact replication across both similar and diverse populations; fine mapping and resequencing; determination of the most informative markers and multiple independent informative loci; incorporation of functional information; and improved phenotype mapping of the implicated genetic effects. Even in cases for which replication proves that an effect exists, confident localization of the causal variant often remains elusive.
Munroe, Patricia B.
The study of family pedigrees with rare monogenic cardiovascular disorders has revealed new molecular players in physiological processes. Genome-wide association studies of complex traits with a heritable component may afford a similar and potentially intellectually richer opportunity. In this review we focus on the interpretation of genetic associations and the issue of causality in relation to known and potentially new physiology. We mainly discuss cardiometabolic traits as it reflects our personal interests, but the issues pertain broadly in many other disciplines. We also describe some of the resources that are now available that may expedite follow up of genetic association signals into observations on causal mechanisms and pathophysiology. PMID:26106147
Fonseca, Gregory J; Seidman, Jason S; Glass, Christopher K
Macrophages play essential roles in the response to injury and infection and contribute to the development and/or homeostasis of the various tissues they reside in. Conversely, macrophages also influence the pathogenesis of metabolic, neurodegenerative, and neoplastic diseases. Mechanisms that contribute to the phenotypic diversity of macrophages in health and disease remain poorly understood. Here we review the recent application of genome-wide approaches to characterize the transcriptomes and epigenetic landscapes of tissue-resident macrophages. These studies are beginning to provide insights into how distinct tissue environments are interpreted by transcriptional regulatory elements to drive specialized programs of gene expression. PMID:28087927
Riechmann, J L; Heard, J; Martin, G; Reuber, L; Jiang, C; Keddie, J; Adam, L; Pineda, O; Ratcliffe, O J; Samaha, R R; Creelman, R; Pilgrim, M; Broun, P; Zhang, J Z; Ghandehari, D; Sherman, B K; Yu, G
The completion of the Arabidopsis thaliana genome sequence allows a comparative analysis of transcriptional regulators across the three eukaryotic kingdoms. Arabidopsis dedicates over 5% of its genome to code for more than 1500 transcription factors, about 45% of which are from families specific to plants. Arabidopsis transcription factors that belong to families common to all eukaryotes do not share significant similarity with those of the other kingdoms beyond the conserved DNA binding domains, many of which have been arranged in combinations specific to each lineage. The genome-wide comparison reveals the evolutionary generation of diversity in the regulation of transcription.
The number of obese patients is increasing in Japan due to the westernization of lifestyle. Obesity, especially visceral fat obesity, is important in the development of metabolic syndrome. Genetic as well as environmental factors are important in the development of obesity, and the importance of genetic factors in fat distribution has also been reported. Recent genome-wide association studies (GWASs) have revealed polymorphisms related to obesity and fat distribution. GWASs should lead to a better understanding of the molecular mechanisms underlying the regulation of obesity and body fat distribution.
Ogura, Yoji; Kou, Ikuyo; Scoliosis, Japan; Matsumoto, Morio; Watanabe, Kota; Ikegawa, Shiro
Adolescent idiopathic scoliosis (AIS) is a polygenic disease. Genome-wide association studies (GWASs) have been performed for many polygenic diseases. For AIS, we conducted a GWAS and identified the first AIS locus near LBX1. Since that discovery, we have extended our study by increasing the numbers of subjects and SNPs. In total, our Japanese GWAS has identified four susceptibility genes. GWASs for AIS have also been performed in the USA and China, which identified one and three susceptibility genes, respectively. Here we review GWASs in Japan and abroad, as well as functional analyses aimed at clarifying the pathomechanism of AIS.
Hinney, Anke; Scherag, André; Jarick, Ivonne; Albayrak, Özgür; Pütter, Carolin; Pechlivanis, Sonali; Dauvermann, Maria R; Beck, Sebastian; Weber, Heike; Scherag, Susann; Nguyen, Trang T; Volckmar, Anna-Lena; Knoll, Nadja; Faraone, Stephen V; Neale, Benjamin M; Franke, Barbara; Cichon, Sven; Hoffmann, Per; Nöthen, Markus M; Schreiber, Stefan; Jöckel, Karl-Heinz; Wichmann, H-Erich; Freitag, Christine; Lempp, Thomas; Meyer, Jobst; Gilsbach, Susanne; Herpertz-Dahlmann, Beate; Sinzig, Judith; Lehmkuhl, Gerd; Renner, Tobias J; Warnke, Andreas; Romanos, Marcel; Lesch, Klaus-Peter; Reif, Andreas; Schimmelmann, Benno G; Hebebrand, Johannes
The heritability of attention deficit hyperactivity disorder (ADHD) is approximately 0.8. Despite several larger scale attempts, genome-wide association studies (GWAS) have not led to the identification of significant results. We performed a GWAS based on 495 German young patients with ADHD (according to DSM-IV criteria; Human660W-Quadv1; Illumina, San Diego, CA) and on 1,300 population-based adult controls (HumanHap550v3; Illumina). Some genes neighboring the single nucleotide polymorphisms (SNPs) with the lowest P-values (best P-value: 8.38 × 10⁻⁷) have potential relevance for ADHD (e.g., glutamate receptor, metabotropic 5 gene, GRM5). After quality control, the 30 independent SNPs with the lowest P-values (P-values ≤ 7.57 × 10⁻⁵) were chosen for confirmation. Genotyping of these SNPs in up to 320 independent German families comprising at least one child with ADHD revealed directionally consistent effect-size point estimates for 19 (10 not consistent) of the SNPs. In silico analyses of the 30 SNPs in the largest meta-analysis so far (2,064 trios, 896 cases, and 2,455 controls) revealed directionally consistent effect-size point estimates for 16 SNPs (11 not consistent). None of the combined analyses revealed a genome-wide significant result. SNPs in previously described autosomal candidate genes did not show significantly lower P-values compared to SNPs within random sets of genes of the same size. We did not find genome-wide significant results in a GWAS of German children with ADHD compared to controls. The second best SNP is located in an intron of GRM5, a gene located within a recently described region with an infrequent copy number variation in patients with ADHD.
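The directional-consistency counts quoted in the abstract above (19 consistent versus 10 not, and 16 versus 11) can be read against a chance expectation of 50%. The following is a minimal sketch of a one-sided exact sign test, assuming each SNP's direction is independent and equally likely to agree or disagree under the null. The function name `sign_test_p` is hypothetical, and the abstract does not state that the authors ran this test.

```python
from math import comb


def sign_test_p(consistent: int, inconsistent: int) -> float:
    """One-sided exact binomial (sign) test.

    Returns P(X >= consistent) for X ~ Binomial(n, 0.5),
    where n = consistent + inconsistent.
    """
    n = consistent + inconsistent
    # Sum the upper tail of the binomial distribution with p = 0.5.
    tail = sum(comb(n, k) for k in range(consistent, n + 1))
    return tail / 2 ** n


# Counts taken from the abstract; the test itself is an illustrative assumption.
family_p = sign_test_p(19, 10)  # family-based confirmation sample
meta_p = sign_test_p(16, 11)    # in silico meta-analysis lookup

print(f"family sample: p = {family_p:.3f}")
print(f"meta-analysis: p = {meta_p:.3f}")
```

A small p-value here would only suggest more directional agreement than chance; it says nothing about whether any individual SNP is associated.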
Ganna, Andrea; Ortega-Alonso, Alfredo; Havulinna, Aki; Salomaa, Veikko; Kaprio, Jaakko; Pedersen, Nancy L; Sullivan, Patrick F; Ingelsson, Erik; Hultman, Christina M; Magnusson, Patrik K E
Twin registries around the globe have collected DNA samples from large numbers of monozygotic and dizygotic twins. The twin sample collections are frequently used as controls in disease-specific studies together with non-twins. This approach is unbiased under the hypothesis that twins and singletons are comparable in terms of allele frequencies; i.e. there are no genetic variants associated with being a twin per se. To test this hypothesis we performed a genome-wide association study comparing the allele frequency of 572,352 single nucleotide polymorphisms (SNPs) in 1,413 monozygotic (MZ) and 5,451 dizygotic (DZ) twins with 3,720 healthy singletons. Twins and singletons have been genotyped using the same platform. SNPs showing association with being a twin at P-value < 1 × 10⁻⁵ were selected for replication analysis in 1,492 twins (463 MZ and 1,029 DZ) and 1,880 singletons from Finland. No SNPs reached genome-wide significance (P-value < 5 × 10⁻⁸) in the main analysis combining MZ and DZ twins. In a secondary analysis including only DZ twins two SNPs (rs2033541 close to ADAMTSL1 and rs4149283 close to ABCA1) were genome-wide significant after meta-analysis with the Finnish population. The estimated proportion of variance on the liability scale explained by all SNPs was 0.08 (P-value=0.003) when MZ and DZ were considered together and smaller for MZ (0.06, P-value=0.10) compared to DZ (0.09, P-value=0.003) when analyzed separately. In conclusion, twins and singletons can be used in genetic studies together with general population samples without introducing large bias. Further research is needed to explore genetic variances associated with DZ twinning.
Wei, Caihong; Wang, Huihua; Liu, Gang; Zhao, Fuping; Kijas, James W.; Ma, Youji; Lu, Jian; Zhang, Li; Cao, Jiaxue; Wu, Mingming; Wang, Guangkai; Liu, Ruizao; Liu, Zhen; Zhang, Shuzhen; Liu, Chousheng; Du, Lixin
Tibetan sheep have lived on the Tibetan Plateau for thousands of years; however, the process and consequences of adaptation to this extreme environment have not been elucidated for important livestock such as sheep. Here, seven sheep breeds, representing both highland and lowland breeds from different areas of China, were genotyped for a genome-wide collection of single-nucleotide polymorphisms (SNPs). The FST and XP-EHH approaches were used to identify regions harbouring local positive selection between these highland and lowland breeds, and 236 genes were identified. We detected selection events spanning genes involved in angiogenesis, energy production and erythropoiesis. In particular, several candidate genes were associated with high-altitude hypoxia, including EPAS1, CRYAA, LONP1, NF1, DPP4, SOD1, PPARG and SOCS2. EPAS1 plays a crucial role in hypoxia adaption; therefore, we investigated the exon sequences of EPAS1 and identified 12 mutations. Analysis of the relationship between blood-related phenotypes and EPAS1 genotypes in additional highland sheep revealed that a homozygous mutation at a relatively conserved site in the EPAS1 3′ untranslated region was associated with increased mean corpuscular haemoglobin concentration and mean corpuscular volume. Taken together, our results provide evidence of the genetic diversity of highland sheep and indicate potential high-altitude hypoxia adaptation mechanisms, including the role of EPAS1 in adaptation. PMID:27230812
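The FST approach mentioned in the abstract above scores each SNP by how strongly allele frequencies differ between groups. The following is a minimal sketch of the textbook heterozygosity-based definition, FST = (H_T - H_S) / H_T, for one biallelic SNP and two populations weighted equally. The function name and the example frequencies are hypothetical, and the abstract does not say which FST estimator the authors used.

```python
def fst_two_pops(p1: float, p2: float) -> float:
    """Wright's FST at one biallelic SNP for two equally weighted populations.

    p1, p2 are the frequencies of the same allele in each population.
    """
    # Mean expected heterozygosity within populations.
    h_s = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    # Expected heterozygosity of the pooled population.
    p_bar = (p1 + p2) / 2
    h_t = 2 * p_bar * (1 - p_bar)
    if h_t == 0:
        # Allele fixed (or absent) in both populations: no differentiation.
        return 0.0
    return (h_t - h_s) / h_t


# Hypothetical allele frequencies in a highland and a lowland group.
print(fst_two_pops(0.8, 0.2))  # strongly differentiated SNP
print(fst_two_pops(0.5, 0.5))  # identical frequencies, FST = 0
```

In a genome scan, values like these are typically computed per SNP or per window and the extreme tail is taken as candidate regions; published analyses usually apply sample-size corrections that this sketch omits.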
vonHoldt, Bridgett M.; Pollinger, John P.; Lohmueller, Kirk E.; Han, Eunjung; Parker, Heidi G.; Quignon, Pascale; Degenhardt, Jeremiah D.; Boyko, Adam R.; Earl, Dent A.; Auton, Adam; Reynolds, Andy; Bryc, Kasia; Brisbin, Abra; Knowles, James C.; Mosher, Dana S.; Spady, Tyrone C.; Elkahloun, Abdel; Geffen, Eli; Pilot, Malgorzata; Jedrzejewski, Wlodzimierz; Greco, Claudia; Randi, Ettore; Bannasch, Danika; Wilton, Alan; Shearman, Jeremy; Musiani, Marco; Cargill, Michelle; Jones, Paul G.; Qian, Zuwei; Huang, Wei; Ding, Zhao-Li; Zhang, Ya-ping; Bustamante, Carlos D.; Ostrander, Elaine A.; Novembre, John; Wayne, Robert K.
Advances in genome technology have facilitated a new understanding of the historical and genetic processes crucial to rapid phenotypic evolution under domestication. To understand the process of dog diversification better, we conducted an extensive genome-wide survey of more than 48,000 single nucleotide polymorphisms in dogs and their wild progenitor, the grey wolf. Here we show that dog breeds share a higher proportion of multi-locus haplotypes unique to grey wolves from the Middle East, indicating that they are a dominant source of genetic diversity for dogs rather than wolves from east Asia, as suggested by mitochondrial DNA sequence data. Furthermore, we find a surprising correspondence between genetic and phenotypic/functional breed groupings but there are exceptions that suggest phenotypic diversification depended in part on the repeated crossing of individuals with novel phenotypes. Our results show that Middle Eastern wolves were a critical source of genome diversity, although interbreeding with local wolf populations clearly occurred elsewhere in the early history of specific lineages. More recently, the evolution of modern dog breeds seems to have been an iterative process that drew on a limited genetic toolkit to create remarkable phenotypic diversity. PMID:20237475
Gandolfi, Barbara; Gruffydd-Jones, Timothy J.; Malik, Richard; Cortes, Alejandro; Jones, Boyd R.; Helps, Chris R.; Prinzenberg, Eva M.; Erhardt, George; Lyons, Leslie A.
Burmese is an old and popular cat breed; however, several health concerns, such as hypokalemia and a craniofacial defect, are prevalent, endangering the general health of the breed. Hypokalemia, a subnormal serum potassium ion concentration ([K+]), most often occurs as a secondary problem but can occur as a primary problem, such as hypokalaemic periodic paralysis in humans, and as feline hypokalaemic periodic polymyopathy primarily in Burmese. The most characteristic clinical sign of hypokalemia in Burmese is a skeletal muscle weakness that is frequently episodic in nature, either generalized, or sometimes localized to the cervical and thoracic limb girdle muscles. Burmese hypokalemia is suspected to be a single locus autosomal recessive trait. A genome-wide case-control study of 35 cases and 25 controls from the Burmese breed, performed using the Illumina Infinium Feline 63K iSelect DNA array, identified a locus on chromosome E1 associated with hypokalemia. Within approximately 1.2 Mb of the highest associated SNP, two candidate genes were identified, KCNH4 and WNK4. Direct sequencing of the genes revealed a nonsense mutation, producing a premature stop codon within WNK4 (c.2899C>T), leading to a truncated protein that lacks the C-terminal coiled-coil domain and the highly conserved Akt1/SGK phosphorylation site. All cases were homozygous for the mutation. Although the exact mechanism causing hypokalemia has not been determined, extrapolation from the homologous human and mouse genes suggests the mechanism may involve a potassium-losing nephropathy. A genetic test to screen for the genetic defect within the active breeding population has been developed, which should lead to eradication of the mutation and improved general health within the breed. Moreover, the identified mutation may help clarify the role of the protein in K+ regulation, and the cat represents the first animal model for WNK4-associated hypokalemia. PMID:23285264
Loukola, Anu; Buchwald, Jadwiga; Gupta, Richa; Palviainen, Teemu; Hällfors, Jenni; Tikkanen, Emmi; Korhonen, Tellervo; Ollikainen, Miina; Sarin, Antti-Pekka; Ripatti, Samuli; Lehtimäki, Terho; Raitakari, Olli; Salomaa, Veikko; Rose, Richard J.; Tyndale, Rachel F.; Kaprio, Jaakko
Individuals with fast nicotine metabolism typically smoke more and thus have a greater risk for smoking-induced diseases. Further, the efficacy of smoking cessation pharmacotherapy is dependent on the rate of nicotine metabolism. Our objective was to use nicotine metabolite ratio (NMR), an established biomarker of nicotine metabolism rate, in a genome-wide association study (GWAS) to identify novel genetic variants influencing nicotine metabolism. A heritability estimate of 0.81 (95% CI 0.70–0.88) was obtained for NMR using monozygotic and dizygotic twins of the FinnTwin cohort. We performed a GWAS in cotinine-verified current smokers of three Finnish cohorts (FinnTwin, Young Finns Study, FINRISK2007), followed by a meta-analysis of 1518 subjects, and annotated the genome-wide significant SNPs with methylation quantitative trait loci (meQTL) analyses. We detected association on 19q13 with 719 SNPs exceeding genome-wide significance within a 4.2 Mb region. The strongest evidence for association emerged for CYP2A6 (min p = 5.77E-86, in intron 4), the main metabolic enzyme for nicotine. Other interesting genes with genome-wide significant signals included CYP2B6, CYP2A7, EGLN2, and NUMBL. Conditional analyses revealed three independent signals on 19q13, all located within or in the immediate vicinity of CYP2A6. A genetic risk score constructed using the independent signals showed association with smoking quantity (p = 0.0019) in two independent Finnish samples. Our meQTL results showed that methylation values of 16 CpG sites within the region are affected by genotypes of the genome-wide significant SNPs, and according to a causal inference test, for some of the SNPs the effect on NMR is mediated through methylation. To our knowledge, this is the first GWAS on NMR. Our results include three independent novel signals on 19q13.2. The detected CYP2A6 variants explain a strikingly large fraction of variance (up to 31%) in NMR in these study samples. Further, we provide evidence
Leduc, Frédéric; Faucher, David; Bikond Nkoma, Geneviève; Grégoire, Marie-Chantal; Arguin, Mélina; Wellinger, Raymund J; Boissonneault, Guylain
Determination of cellular DNA damage has so far been limited to global assessment of genome integrity whereas nucleotide-level mapping has been restricted to specific loci by the use of specific primers. Therefore, only limited DNA sequences can be studied and novel regions of genomic instability can hardly be discovered. Using a well-characterized yeast model, we describe a straightforward strategy to map genome-wide DNA strand breaks without compromising nucleotide-level resolution. This technique, termed "damaged DNA immunoprecipitation" (dDIP), uses immunoprecipitation and the terminal deoxynucleotidyl transferase-mediated dUTP-biotin end-labeling (TUNEL) to capture DNA at break sites. When used in combination with microarray or next-generation sequencing technologies, dDIP will allow researchers to map genome-wide DNA strand breaks as well as other types of DNA damage and to establish a clear profiling of altered genes and/or intergenic sequences in various experimental conditions. This mapping technique could find several applications for instance in the study of aging, genotoxic drug screening, cancer, meiosis, radiation and oxidative DNA damage.
Srivastava, Prashant Kumar; Bagnati, Marta; Delahaye-Duriez, Andree; Ko, Jeong-Hun; Rotival, Maxime; Langley, Sarah R.; Shkura, Kirill; Mazzuferi, Manuela; Danis, Bénédicte; van Eyll, Jonathan; Foerch, Patrik; Behmoaras, Jacques; Kaminski, Rafal M.; Petretto, Enrico; Johnson, Michael R.
The recoding of genetic information through RNA editing contributes to proteomic diversity, but the extent and significance of RNA editing in disease is poorly understood. In particular, few studies have investigated the relationship between RNA editing and disease at a genome-wide level. Here, we developed a framework for the genome-wide detection of RNA sites that are differentially edited in disease. Using RNA-sequencing data from 100 hippocampi from mice with epilepsy (pilocarpine–temporal lobe epilepsy model) and 100 healthy control hippocampi, we identified 256 RNA sites (overlapping with 87 genes) that were significantly differentially edited between epileptic cases and controls. The degree of differential RNA editing in epileptic mice correlated with frequency of seizures, and the set of genes differentially RNA-edited between case and control mice were enriched for functional terms highly relevant to epilepsy, including “neuron projection” and “seizures.” Genes with differential RNA editing were preferentially enriched for genes with a genetic association to epilepsy. Indeed, we found that they are significantly enriched for genes that harbor nonsynonymous de novo mutations in patients with epileptic encephalopathy and for common susceptibility variants associated with generalized epilepsy. These analyses reveal a functional convergence between genes that are differentially RNA-edited in acquired symptomatic epilepsy and those that contribute risk for genetic epilepsy. Taken together, our results suggest a potential role for RNA editing in the epileptic hippocampus in the occurrence and severity of epileptic seizures. PMID:28250018
Scharf, Jeremiah M.; Yu, Dongmei; Mathews, Carol A.; Neale, Benjamin M.; Stewart, S. Evelyn; Fagerness, Jesen A; Evans, Patrick; Gamazon, Eric; Edlund, Christopher K.; Service, Susan; Tikhomirov, Anna; Osiecki, Lisa; Illmann, Cornelia; Pluzhnikov, Anna; Konkashbaev, Anuar; Davis, Lea K; Han, Buhm; Crane, Jacquelyn; Moorjani, Priya; Crenshaw, Andrew T.; Parkin, Melissa A.; Reus, Victor I.; Lowe, Thomas L.; Rangel-Lugo, Martha; Chouinard, Sylvain; Dion, Yves; Girard, Simon; Cath, Danielle C; Smit, Jan H; King, Robert A.; Fernandez, Thomas; Leckman, James F.; Kidd, Kenneth K.; Kidd, Judith R.; Pakstis, Andrew J.; State, Matthew; Herrera, Luis Diego; Romero, Roxana; Fournier, Eduardo; Sandor, Paul; Barr, Cathy L; Phan, Nam; Gross-Tsur, Varda; Benarroch, Fortu; Pollak, Yehuda; Budman, Cathy L.; Bruun, Ruth D.; Erenberg, Gerald; Naarden, Allan L; Lee, Paul C; Weiss, Nicholas; Kremeyer, Barbara; Berrío, Gabriel Bedoya; Campbell, Desmond; Silgado, Julio C. Cardona; Ochoa, William Cornejo; Restrepo, Sandra C. Mesa; Muller, Heike; Duarte, Ana V. Valencia; Lyon, Gholson J; Leppert, Mark; Morgan, Jubel; Weiss, Robert; Grados, Marco A.; Anderson, Kelley; Davarya, Sarah; Singer, Harvey; Walkup, John; Jankovic, Joseph; Tischfield, Jay A.; Heiman, Gary A.; Gilbert, Donald L.; Hoekstra, Pieter J.; Robertson, Mary M.; Kurlan, Roger; Liu, Chunyu; Gibbs, J. Raphael; Singleton, Andrew; Hardy, John; Strengman, Eric; Ophoff, Roel; Wagner, Michael; Moessner, Rainald; Mirel, Daniel B.; Posthuma, Danielle; Sabatti, Chiara; Eskin, Eleazar; Conti, David V.; Knowles, James A.; Ruiz-Linares, Andres; Rouleau, Guy A.; Purcell, Shaun; Heutink, Peter; Oostra, Ben A.; McMahon, William; Freimer, Nelson; Cox, Nancy J.; Pauls, David L.
Tourette Syndrome (TS) is a developmental disorder that has one of the highest familial recurrence rates among neuropsychiatric diseases with complex inheritance. However, the identification of definitive TS susceptibility genes remains elusive. Here, we report the first genome-wide association study (GWAS) of TS in 1285 cases and 4964 ancestry-matched controls of European ancestry, including two European-derived population isolates, Ashkenazi Jews from North America and Israel, and French Canadians from Quebec, Canada. In a primary meta-analysis of GWAS data from these European ancestry samples, no markers achieved a genome-wide threshold of significance (p < 5 × 10⁻⁸); the top signal was found in rs7868992 on chromosome 9q32 within COL27A1 (p = 1.85 × 10⁻⁶). A secondary analysis including an additional 211 cases and 285 controls from two closely-related Latin-American population isolates from the Central Valley of Costa Rica and Antioquia, Colombia also identified rs7868992 as the top signal (p = 3.6 × 10⁻⁷ for the combined sample of 1496 cases and 5249 controls following imputation with 1000 Genomes data). This study lays the groundwork for the eventual identification of common TS susceptibility variants in larger cohorts and helps to provide a more complete understanding of the full genetic architecture of this disorder. PMID:22889924
Walter, Stefan; Atzmon, Gil; Demerath, Ellen W.; Garcia, Melissa E.; Kaplan, Robert C.; Kumari, Meena; Lunetta, Kathryn L.; Milaneschi, Yuri; Tanaka, Toshiko; Tranah, Gregory J.; Völker, Uwe; Yu, Lei; Arnold, Alice; Benjamin, Emelia J.; Biffar, Reiner; Buchman, Aron S.; Boerwinkle, Eric; Couper, David; De Jager, Philip L.; Evans, Denis A.; Harris, Tamara B.; Hoffmann, Wolfgang; Hofman, Albert; Karasik, David; Kiel, Douglas P.; Kocher, Thomas; Kuningas, Maris; Launer, Lenore J.; Lohman, Kurt K.; Lutsey, Pamela L.; Mackenbach, Johan; Marciante, Kristin; Psaty, Bruce M.; Reiman, Eric M.; Rotter, Jerome I.; Seshadri, Sudha; Shardell, Michelle D.; Smith, Albert V.; van Duijn, Cornelia; Walston, Jeremy; Zillikens, M. Carola; Bandinelli, Stefania; Baumeister, Sebastian E.; Bennett, David A.; Ferrucci, Luigi; Gudnason, Vilmundur; Kivimaki, Mika; Liu, Yongmei; Murabito, Joanne M.; Newman, Anne B.; Tiemeier, Henning; Franceschini, Nora
Human longevity and healthy aging show moderate heritability (20–50%). We conducted a meta-analysis of genome-wide association studies from nine studies from the Cohorts for Heart and Aging Research in Genomic Epidemiology Consortium for two outcomes: a) all-cause mortality and b) survival free of major disease or death. No single nucleotide polymorphism (SNP) was a genome-wide significant predictor of either outcome (p < 5 × 10⁻⁸). We found fourteen independent SNPs that predicted risk of death, and eight SNPs that predicted event-free survival (p < 10⁻⁵). These SNPs are in or near genes that are highly expressed in the brain (HECW2, HIP1, BIN2, GRIA1), genes involved in neural development and function (KCNQ4, LMO4, GRIA1, NETO1) and autophagy (ATG4C), and genes that are associated with risk of various diseases including cancer and Alzheimer’s disease. In addition to considerable overlap between the traits, pathway and network analysis corroborated these findings. These findings indicate that variation in genes involved in neurological processes may be an important factor in regulating aging free of major disease and achieving longevity. PMID:21782286
Preston, Jessica L.; Randel, Melissa A.; Johnson, Eric A.
Here we present a genome-wide method for de novo identification of enhancer regions. This approach enables massively parallel empirical investigation of DNA sequences that mediate transcriptional activation and provides a platform for discovery of regulatory modules capable of driving context-specific gene expression. The method links fragmented genomic DNA to the transcription of randomer molecule identifiers and measures the functional enhancer activity of the library by massively parallel sequencing. We transfected a Drosophila melanogaster library into S2 cells in normoxia and hypoxia, and assayed 4,599,881 genomic DNA fragments in parallel. The locations of the enhancer regions strongly correlate with genes up-regulated after hypoxia and previously described enhancers. Novel enhancer regions were identified and integrated with RNAseq data and transcription factor motifs to describe the hypoxic response on a genome-wide basis as a complex regulatory network involving multiple stress-response pathways. This work provides a novel method for high-throughput assay of enhancer activity and the genome-scale identification of 31 hypoxia-activated enhancers in Drosophila. PMID:26713262
Gusareva, Elena S.; Carrasquillo, Minerva M.; Bellenguez, Céline; Cuyvers, Elise; Colon, Samuel; Graff-Radford, Neill R.; Petersen, Ronald C.; Dickson, Dennis W.; Mahachie Johna, Jestinah M.; Bessonov, Kyrylo; Van Broeckhoven, Christine; Williams, Julie; Amouyel, Philippe; Sleegers, Kristel; Ertekin-Taner, Nilüfer; Lambert, Jean-Charles; Van Steen, Kristel
We propose a minimal protocol for exhaustive genome-wide association interaction analysis that involves screening for epistasis over large-scale genomic data, combining the strengths of different methods and statistical tools. The different steps of this protocol are illustrated on a real-life data application for Alzheimer's disease (AD) (2259 patients and 6017 controls from France). In particular, in the exhaustive genome-wide epistasis screening we identified an AD-associated interacting SNP pair from chromosomes 6q11.1 (rs6455128, the KHDRBS2 gene) and 13q12.11 (rs7989332, the CRYL1 gene) (p = 0.006, corrected for multiple testing). A replication analysis in the independent AD cohort from Germany (555 patients and 824 controls) confirmed the discovered epistasis signal (p = 0.036). This signal was also supported by a meta-analysis approach in 5 independent AD cohorts that was applied in the context of epistasis for the first time. Transcriptome analysis revealed negative correlation between expression levels of KHDRBS2 and CRYL1 in both the temporal cortex (β = −0.19, p = 0.0006) and cerebellum (β = −0.23, p < 0.0001) brain regions. This is the first time a replicable epistasis associated with AD was identified using a hypothesis-free screening approach. PMID:24958192
Walter, Stefan; Atzmon, Gil; Demerath, Ellen W; Garcia, Melissa E; Kaplan, Robert C; Kumari, Meena; Lunetta, Kathryn L; Milaneschi, Yuri; Tanaka, Toshiko; Tranah, Gregory J; Völker, Uwe; Yu, Lei; Arnold, Alice; Benjamin, Emelia J; Biffar, Reiner; Buchman, Aron S; Boerwinkle, Eric; Couper, David; De Jager, Philip L; Evans, Denis A; Harris, Tamara B; Hoffmann, Wolfgang; Hofman, Albert; Karasik, David; Kiel, Douglas P; Kocher, Thomas; Kuningas, Maris; Launer, Lenore J; Lohman, Kurt K; Lutsey, Pamela L; Mackenbach, Johan; Marciante, Kristin; Psaty, Bruce M; Reiman, Eric M; Rotter, Jerome I; Seshadri, Sudha; Shardell, Michelle D; Smith, Albert V; van Duijn, Cornelia; Walston, Jeremy; Zillikens, M Carola; Bandinelli, Stefania; Baumeister, Sebastian E; Bennett, David A; Ferrucci, Luigi; Gudnason, Vilmundur; Kivimaki, Mika; Liu, Yongmei; Murabito, Joanne M; Newman, Anne B; Tiemeier, Henning; Franceschini, Nora
Human longevity and healthy aging show moderate heritability (20%-50%). We conducted a meta-analysis of genome-wide association studies from 9 studies from the Cohorts for Heart and Aging Research in Genomic Epidemiology Consortium for 2 outcomes: (1) all-cause mortality, and (2) survival free of major disease or death. No single nucleotide polymorphism (SNP) was a genome-wide significant predictor of either outcome (p < 5 × 10−8). We found 14 independent SNPs that predicted risk of death, and 8 SNPs that predicted event-free survival (p < 10−5). These SNPs are in or near genes that are highly expressed in the brain (HECW2, HIP1, BIN2, GRIA1), genes involved in neural development and function (KCNQ4, LMO4, GRIA1, NETO1) and autophagy (ATG4C), and genes that are associated with risk of various diseases including cancer and Alzheimer's disease. In addition to considerable overlap between the traits, pathway and network analysis corroborated these findings. These findings indicate that variation in genes involved in neurological processes may be an important factor in regulating aging free of major disease and achieving longevity.
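The two cut-offs quoted in this abstract (p < 5 × 10−8 for genome-wide significance and p < 10−5 for the weaker, suggestive hits) can be applied mechanically to a list of association results. The following is a minimal Python sketch of that tiering; the function name, the tier labels and the example p-values are hypothetical and are not taken from the study.

```python
# Thresholds as quoted in the abstract; everything else is illustrative.
GENOME_WIDE = 5e-8   # conventional genome-wide significance cut-off
SUGGESTIVE = 1e-5    # looser cut-off used for the reported SNPs


def classify_p_value(p: float) -> str:
    """Assign a p-value to one of three tiers (labels are hypothetical)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p-value must lie in [0, 1]")
    if p < GENOME_WIDE:
        return "genome-wide significant"
    if p < SUGGESTIVE:
        return "suggestive"
    return "not significant"


if __name__ == "__main__":
    # Made-up p-values, one per tier.
    for p in (1e-9, 3e-6, 0.02):
        print(p, classify_p_value(p))
```

Under this scheme, a study in which no SNP reaches the first tier but several reach the second would report only suggestive associations, which is the situation the abstract describes.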
Dadousis, C; Biffani, S; Cipolat-Gotet, C; Nicolazzi, E L; Rosa, G J M; Gianola, D; Rossoni, A; Santus, E; Bittante, G; Cecchinato, A
Cheese production and consumption are increasing in many countries worldwide. As a result, interest has increased in strategies for genetic selection of individuals for technological traits of milk related to cheese yield (CY) in dairy cattle breeding. However, little is known about the genetic background of a cow's ability to produce cheese. Recently, a relatively large panel (1,264 cows) of different measures of individual cow CY and milk nutrient and energy recoveries in the cheese (REC) became available. Genetic analyses showed considerable variation for CY and for aptitude to retain high proportions of fat, protein, and water in the coagulum. For the dairy industry, these characteristics are of major economic importance. Nevertheless, use of this knowledge in dairy breeding is hampered by high costs, intense labor requirement, and lack of appropriate technology. However, in the era of genomics, new possibilities are available for animal breeding and genetic improvement. For example, identification of genomic regions involved in cow CY might provide potential for marker-assisted selection. The objective of this study was to perform genome-wide association studies on different CY and REC measures. Milk and DNA samples from 1,152 Italian Brown Swiss cows were used. Three CY traits expressing the weight (wt) of fresh curd (%CYCURD), curd solids (%CYSOLIDS), and curd moisture (%CYWATER) as a percentage of weight of milk processed, and 4 REC (RECFAT, RECPROTEIN, RECSOLIDS, and RECENERGY, calculated as the % ratio between the nutrient in curd and the corresponding nutrient in processed milk) were analyzed. Animals were genotyped with the Illumina BovineSNP50 Bead Chip v.2. Single marker regressions were fitted using the GenABEL R package (genome-wide association using mixed model and regression-genomic control). In total, 103 significant associations (88 single nucleotide polymorphisms) were identified in 10 chromosomes (2, 6, 9, 11, 12, 14, 18, 19, 27, 28). For
Gunia, M; Mandonnet, N; Arquet, R; Alexandre, G; Gourdine, J-L; Naves, M; Angeon, V; Phocas, F
A specific breeding goal definition was developed for Creole goats in Guadeloupe. This local breed is used for meat production. To ensure a balanced selection outcome, the breeding objective included two production traits, live weight (BW11) and dressing percentage (DP) at 11 months (the mating or selling age), one reproduction trait, fertility (FER), and two traits to assess animal response to parasite infection: packed cell volume (PCV), a resilience trait, and faecal worm egg count (FEC), a resistance trait. A deterministic bio-economic model was developed to calculate the economic values based on the description of the profit of a Guadeloupean goat farm. The farm income came from the sale of animals for meat or as reproducers. The main costs were feeding and treatments against gastro-intestinal parasites. The economic values were 7.69€ per kg for BW11, 1.38€ per % for FER, 3.53€ per % for DP and 3 × 10−4€ per % for PCV. The economic value for FEC was derived by comparing the expected profit and average FEC in a normal situation and in an extreme situation where parasites had developed resistance to anthelmintics. This method yielded a maximum weighting for FEC, which was −18.85€ per log(eggs per gram). Alternative scenarios were tested to assess the robustness of the economic values to variations in the economic and environmental context. The economic values of PCV and DP were the most stable. Issues involved in paving the way for selective breeding on resistance or resilience to parasites are discussed.
Bae, Harold T; Sebastiani, Paola; Sun, Jenny X; Andersen, Stacy L; Daw, E Warwick; Terracciano, Antonio; Ferrucci, Luigi; Perls, Thomas T
Personality traits have been shown to be associated with longevity and healthy aging. In order to discover novel genetic modifiers associated with personality traits in relation to longevity, we performed a genome-wide association study (GWAS) on personality factors assessed by the NEO Five-Factor Inventory in individuals enrolled in the Long Life Family Study (LLFS), a study of 583 families (N up to 4595) with clustering for longevity in the United States and Denmark. Three SNPs, in almost perfect LD, associated with agreeableness reached genome-wide significance (p < 10−8) and replicated in an additional sample of 1279 LLFS subjects, although one (rs9650241) failed to replicate and the other two were not available in two independent replication cohorts, the Baltimore Longitudinal Study of Aging and the New England Centenarian Study. Based on 10,000,000 permutations, an empirical p-value of 2 × 10−7 was observed for the genome-wide significant SNPs. Seventeen SNPs that reached marginal statistical significance in the two previous GWASs (p-value <10−4 and 10−5) were also marginally significantly associated in this study (p-value <0.05), although none of the associations passed the Bonferroni correction. In addition, we tested age-by-SNP interactions and found some significant associations. Since scores of personality traits in LLFS subjects change in the oldest ages, and genetic factors outweigh environmental factors to achieve extreme ages, these age-by-SNP interactions could be a proxy for complex gene-gene interactions affecting personality traits and longevity.
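The abstract above relies on two standard calculations: an empirical p-value obtained from permutations and a Bonferroni correction over the seventeen previously reported SNPs. The Python sketch below shows one simple form of each. It is an illustration only: the abstract does not state which permutation estimator was used or how many permutations were as extreme as the observed statistic, so the plain-fraction estimator and the count of 2 used in the example are assumptions, as are the function names.

```python
def empirical_p_value(n_as_extreme: int, n_permutations: int) -> float:
    """Plain-fraction permutation p-value (an assumed estimator).

    n_as_extreme is the number of permuted data sets whose test statistic
    is at least as extreme as the one observed in the real data.
    """
    if n_permutations <= 0 or not 0 <= n_as_extreme <= n_permutations:
        raise ValueError("need 0 <= n_as_extreme <= n_permutations, n_permutations > 0")
    return n_as_extreme / n_permutations


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance cut-off that keeps the family-wise error at alpha."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


if __name__ == "__main__":
    # Hypothetical count: 2 of 10,000,000 permutations as extreme as observed.
    print(empirical_p_value(2, 10_000_000))
    # Cut-off for 17 candidate SNPs at a family-wise alpha of 0.05.
    print(bonferroni_threshold(0.05, 17))
```

With seventeen tests the per-SNP cut-off is 0.05/17, roughly 0.003, which is why associations that are nominally significant at p < 0.05 can still fail the correction.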
Slavov, Gancho T; Nipper, Rick; Robson, Paul; Farrar, Kerrie; Allison, Gordon G; Bosch, Maurice; Clifton-Brown, John C; Donnison, Iain S; Jensen, Elaine
• Increasing demands for food and energy require a step change in the effectiveness, speed and flexibility of crop breeding. Therefore, the aim of this study was to assess the potential of genome-wide association studies (GWASs) and genomic selection (i.e. phenotype prediction from a genome-wide set of markers) to guide fundamental plant science and to accelerate breeding in the energy grass Miscanthus. • We generated over 100,000 single-nucleotide variants (SNVs) by sequencing restriction site-associated DNA (RAD) tags in 138 Miscanthus sinensis genotypes, and related SNVs to phenotypic data for 17 traits measured in a field trial. • Confounding by population structure and relatedness was severe in naïve GWAS analyses, but mixed-linear models robustly controlled for these effects and allowed us to detect multiple associations that reached genome-wide significance. Genome-wide prediction accuracies tended to be moderate to high (average of 0.57), but varied dramatically across traits. As expected, predictive abilities increased linearly with the size of the mapping population, but reached a plateau when the number of markers used for prediction exceeded 10,000-20,000, and tended to decline, but remain significant, when cross-validations were performed across subpopulations. • Our results suggest that the immediate implementation of genomic selection in Miscanthus breeding programs may be feasible.
Tomkins, Joseph L; Penrose, Marissa A; Greeff, Johan; LeBas, Natasha R
The mutation-selection-balance model predicts most additive genetic variation to arise from numerous mildly deleterious mutations of small effect. Correspondingly, "good genes" models of sexual selection and recent models for the evolution of sex are built on the assumption that mutational loads and breeding values for fitness-related traits are correlated. In support of this concept, inbreeding depression was negatively genetically correlated with breeding values for traits under natural and sexual selection in the weevil Callosobruchus maculatus. The correlations were stronger in males and strongest for condition. These results confirm the role of existing, partially recessive mutations in maintaining additive genetic variation in outbred populations, reveal the nature of good genes under sexual selection, and show how sexual selection can offset the cost of sex.
Nicolazzi, E L; Forabosco, F; Fikse, W F
International genetic evaluations are a valuable source of information for decisions about the importation of (the semen of) foreign bulls. This study analyzed data from 6 countries (Australia, Canada, Italy, France, the Netherlands, and the United States) and compared international evaluations for production traits of foreign bulls (i.e., when no national daughter information was available) to their national breeding values in August 2009, which were based only on domestic daughters' data. A total of 821 bulls with highly reliable estimated breeding values (EBV) for milk, fat, and protein yield were analyzed. No evidence of systematic over- or underestimation was found in most of the countries analyzed. Observed correlations between national and international evaluations were close to 0.9 and, for most countries, generally close to their expected values (calculated from national and international EBV reliabilities). In Italy, however, higher differences between observed and expected correlations and significant mean differences between EBV for more than one trait were observed in bulls progeny-tested in the United States and in other European countries (with differences up to 33.1% of the genetic standard deviation). These results were probably induced by a relatively recent change in the model for national evaluation. The findings in this study reflect a conservative estimate of the real value of international evaluations, as changes in methodologies in either the national or the international evaluations decreased the ability of past international evaluations to predict current national evaluations. Nevertheless, our results indicate that international evaluations based on foreign information for Holstein bulls were reasonably accurate predictors of the future national breeding values based only upon domestic daughters.
Rautiainen, M-R; Paunio, T; Repo-Tiihonen, E; Virkkunen, M; Ollila, H M; Sulkava, S; Jolanki, O; Palotie, A; Tiihonen, J
The pathophysiology of antisocial personality disorder (ASPD) remains unclear. Although the most consistent biological finding is reduced grey matter volume in the frontal cortex, about 50% of the total liability to developing ASPD has been attributed to genetic factors. The contributing genes remain largely unknown. Therefore, we sought to study the genetic background of ASPD. We conducted a genome-wide association study (GWAS) and a replication analysis of Finnish criminal offenders fulfilling DSM-IV criteria for ASPD (N=370, N=5850 for controls, GWAS; N=173, N=3766 for controls and replication sample). The GWAS resulted in suggestive associations of two clusters of single-nucleotide polymorphisms at 6p21.2 and at 6p21.32 at the human leukocyte antigen (HLA) region. Imputation of HLA alleles revealed an independent association with DRB1*01:01 (odds ratio (OR)=2.19 (1.53–3.14), P=1.9 × 10−5). Two polymorphisms at 6p21.2 LINC00951–LRFN2 gene region were replicated in a separate data set, and rs4714329 reached genome-wide significance (OR=1.59 (1.37–1.85), P=1.6 × 10−9) in the meta-analysis. The risk allele also associated with antisocial features in the general population conditioned for severe problems in childhood family (β=0.68, P=0.012). Functional analysis in brain tissue in open access GTEx and Braineac databases revealed eQTL associations of rs4714329 with LINC00951 and LRFN2 in cerebellum. In humans, LINC00951 and LRFN2 are both expressed in the brain, especially in the frontal cortex, which is intriguing considering the role of the frontal cortex in behavior and the neuroanatomical findings of reduced gray matter volume in ASPD. To our knowledge, this is the first study showing genome-wide significant and replicable findings on genetic variants associated with any personality disorder. PMID:27598967
Amaral, Andreia J.; Ferretti, Luca; Megens, Hendrik-Jan; Crooijmans, Richard P. M. A.; Nie, Haisheng; Ramos-Onsins, Sebastian E.; Perez-Enciso, Miguel; Schook, Lawrence B.; Groenen, Martien A. M.
Background: Artificial selection has caused rapid evolution in domesticated species. The identification of selection footprints across domesticated genomes can help uncover the genetic basis of phenotypic diversity. Methodology/Main Findings: Genome-wide footprints of pig domestication and selection were identified using massive parallel sequencing of pooled reduced representation libraries (RRL) representing ∼2% of the genome from wild boar and four domestic pig breeds (Large White, Landrace, Duroc and Pietrain) which have been under strong selection for muscle development, growth, behavior and coat color. Using specifically developed statistical methods that account for DNA pooling, low mean sequencing depth, and sequencing errors, we provide genome-wide estimates of nucleotide diversity and genetic differentiation in pig. Widespread signals suggestive of positive and balancing selection were found and the strongest signals were observed in Pietrain, one of the breeds most intensively selected for muscle development. Most signals were population-specific but affected genomic regions which harbored genes for common biological categories including coat color, brain development, muscle development, growth, metabolism, olfaction and immunity. Genetic differentiation in regions harboring genes related to muscle development and growth was higher between breeds than between a given breed and the wild boar. Conclusions/Significance: These results suggest that although domesticated breeds have experienced similar selective pressures, selection has acted upon different genes. This might reflect the multiple domestication events of European breeds or could be the result of subsequent introgression of Asian alleles. Overall, it was estimated that approximately 7% of the porcine genome has been affected by selection events. This study illustrates that the massive parallel sequencing of genomic pools is a cost-effective approach to identify footprints of selection
Gilles, Annick; Van Camp, Guy; Van de Heyning, Paul; Fransen, Erik
Tinnitus, the perception of an auditory phantom sound in the form of ringing, buzzing, roaring, or hissing in the absence of an external sound source, is perceived by ~15% of the population, and 2.5% experience severely bothersome tinnitus. The contribution of genes to the development of tinnitus is still under debate. The current manuscript reports a pilot Genome Wide Association Study (GWAS) into tinnitus, in a small cohort of 167 independent tinnitus subjects, and 749 non-tinnitus controls, who were collected as part of a cross-sectional study. After genotyping, imputation, and quality checking, the association between the tinnitus phenotype and 4,000,000 single-nucleotide polymorphisms (SNPs) was tested followed by gene set enrichment analysis. None of the SNPs reached the threshold for genome-wide significance (p < 5.0e–8), with the most significant SNPs, situated outside coding genes, reaching a p-value of 3.4e–7. By using the Genetic Analysis of Complex Traits (GACT) software, the percentage of the variance explained by all SNPs in the GWAS was estimated to be 3.2%, indicating that additive genetic effects explain only a small fraction of the tinnitus phenotype. Despite the lack of genome-wide significant SNPs, which is, at least in part, due to the limited sample size of the current study, evidence was found for a genetic involvement in tinnitus. Gene set enrichment analysis showed several metabolic pathways to be significantly enriched with SNPs having a low p-value in the GWAS. These pathways are involved in oxidative stress, endoplasmic reticulum (ER) stress, and serotonin reception mediated signaling. These results are a promising basis for further research into the genetic basis of tinnitus, including GWAS with larger sample sizes and considering tinnitus subtypes for which a greater genetic contribution is more likely. PMID:28303087
Roh, Tae-young; Ngau, Wing Chi; Cui, Kairong; Landsman, David; Zhao, Keji
The expression patterns of eukaryotic genomes are controlled by their chromatin structure, consisting of nucleosome subunits in which DNA of approximately 146 bp is wrapped around a core of 8 histone molecules. Post-translational histone modifications play an essential role in modifying chromatin structure. Here we apply a combination of SAGE and chromatin immunoprecipitation (ChIP) protocols to determine the distribution of hyperacetylated histones H3 and H4 in the Saccharomyces cerevisiae genome. We call this approach genome-wide mapping technique (GMAT). Using GMAT, we find that the highest acetylation levels are detected in the 5' end of a gene's coding region, but not in the promoter. Furthermore, we show that the histone acetyltransferase, GCN5p, regulates H3 acetylation in the promoter and 5' end of the coding regions. These findings indicate that GMAT should find valuable applications in mapping target sites of chromatin-modifying enzymes.
Murphy, Travis W.; Lu, Chang
Next-generation sequencing (NGS) has revolutionized how molecular biology studies are conducted. Its decreasing cost and increasing throughput permit profiling of genomic, transcriptomic, and epigenomic features for a wide range of applications. Microfluidics has been proven to be highly complementary to NGS technology with its unique capabilities for handling small volumes of samples and providing platforms for automation, integration, and multiplexing. In this article, we review recent progress on applying microfluidics to facilitate genome-wide studies. We emphasize several technical aspects of NGS and how they benefit from coupling with microfluidic technology. We also summarize recent efforts on developing microfluidic technology for genomic, transcriptomic, and epigenomic studies, with emphasis on single cell analysis. We envision rapid growth in these directions, driven by the need to test scarce primary cell samples from patients in the context of precision medicine.
Patel, Jai N; McLeod, Howard L; Innocenti, Federico
Genome wide association studies (GWAS) provide an agnostic approach to identifying potential genetic variants associated with disease susceptibility, prognosis of survival and/or predictive of drug response. Although these techniques are costly and interpretation of study results is challenging, they do allow for a more unbiased interrogation of the entire genome, resulting in the discovery of novel genes and understanding of novel biological associations. This review will focus on the implications of GWAS in cancer therapy, in particular germ-line mutations, including findings from major GWAS which have identified predictive genetic loci for clinical outcome and/or toxicity. Lessons and challenges in cancer GWAS are also discussed, including the need for functional analysis and replication, as well as future perspectives for biological and clinical utility. Given the large heterogeneity in response to cancer therapeutics, novel methods of identifying mechanisms and biology of variable drug response and ultimately treatment individualization will be indispensable.
Diseases related to tobacco smoking are the second leading cause of death in the world. Despite increasing evidence of genetic determination, the susceptibility genes and loci underlying various aspects of smoking behavior are largely unknown. Genome-wide association studies (GWASs) provided a new conceptual framework in the search for variants underlying common traits/disorders. A massive scan of the genome and a "hypothesis-free" approach enable discovery of new aspects of the genetics of complex traits. In this paper, the results of GWASs and GWAS meta-analyses of cigarette smoking behavior and nicotine dependence are reviewed, with particular attention to smoking cessation success and replacement therapy. The results of these studies are discussed in the context of the results of the candidate gene association studies. Studies on the role of the genomic regions identified in GWASs in the development of smoking-related diseases are also discussed.
Wan, Yue; Qu, Kun; Ouyang, Zhengqing; Kertesz, Michael; Li, Jun; Tibshirani, Robert; Makino, Debora L; Nutter, Robert C; Segal, Eran; Chang, Howard Y
RNA structural transitions are important in the function and regulation of RNAs. Here, we reveal a layer of transcriptome organization in the form of RNA folding energies. By probing yeast RNA structures at different temperatures, we obtained relative melting temperatures (Tm) for RNA structures in over 4000 transcripts. Specific signatures of RNA Tm demarcated the polarity of mRNA open reading frames and highlighted numerous candidate regulatory RNA motifs in 3' untranslated regions. RNA Tm distinguished noncoding versus coding RNAs and identified mRNAs with distinct cellular functions. We identified thousands of putative RNA thermometers, and their presence is predictive of the pattern of RNA decay in vivo during heat shock. The exosome complex recognizes unpaired bases during heat shock to degrade these RNAs, coupling intrinsic structural stabilities to gene regulation. Thus, genome-wide structural dynamics of RNA can parse functional elements of the transcriptome and reveal diverse biological insights.
Harari, Yaniv; Kupiec, Martin
Telomeres are specialized DNA-protein structures at the ends of eukaryotic chromosomes. Telomeres are essential for chromosomal stability and integrity, as they prevent chromosome ends from being recognized as double strand breaks. In rapidly proliferating cells, telomeric DNA is synthesized by the enzyme telomerase, which copies a short template sequence within its own RNA moiety, thus helping to solve the “end-replication problem”, in which information is lost at the ends of chromosomes with each DNA replication cycle. The basic mechanisms of telomere length, structure and function maintenance are conserved among eukaryotes. Studies in the yeast Saccharomyces cerevisiae have been instrumental in deciphering the basic aspects of telomere biology. In the last decade, technical advances, such as the availability of mutant collections, have allowed carrying out systematic genome-wide screens for mutants affecting various aspects of telomere biology. In this review we summarize these efforts, and the insights that this Systems Biology approach has produced so far.
Ben-Yakar, Adela; Bourgeois, Frederic
The use of ultrafast laser pulses in surgery has allowed for unprecedented precision with minimal collateral damage to surrounding tissues. For these reasons, ultrafast laser nanosurgery, as an injury model, has gained tremendous momentum in experimental biology ranging from in-vitro manipulations of subcellular structures to in-vivo studies in whole living organisms. For example, femtosecond laser nanosurgery on model organisms such as the nematode Caenorhabditis elegans (C. elegans) has opened new opportunities for in-vivo nerve regeneration studies. Meanwhile, the development of novel microfluidic devices has brought control of the experimental environment to the level required for precise nanosurgery in various animal models. Merging microfluidics and laser nanosurgery has recently improved the specificity and increased the speed of laser surgeries, enabling fast genome-wide screenings that can more readily decode the genetic map of various biological processes. PMID:19278850
Lin, Eugene; Lane, Hsien-Yuan
Major depressive disorder (MDD) is one of the most common psychiatric disorders worldwide. Doctors must prescribe antidepressants based on educated guesses because it is not yet possible to predict the effectiveness of any particular antidepressant in an individual patient. With recent advances in high-throughput genotyping technologies, the genome-wide association study (GWAS) is now extensively employed to analyze hundreds of thousands of single nucleotide polymorphisms. In addition to the candidate-gene approach, the GWAS approach has recently been utilized to investigate the determinants of antidepressant response to therapy. In this study, we reviewed GWASs, their limitations, and future directions with respect to the pharmacogenomics of antidepressants in MDD.
Boraska, Vesna; Franklin, Christopher S; Floyd, James AB; Thornton, Laura M; Huckins, Laura M; Southam, Lorraine; Rayner, N William; Tachmazidou, Ioanna; Klump, Kelly L; Treasure, Janet; Lewis, Cathryn M; Schmidt, Ulrike; Tozzi, Federica; Kiezebrink, Kirsty; Hebebrand, Johannes; Gorwood, Philip; Adan, Roger AH; Kas, Martien JH; Favaro, Angela; Santonastaso, Paolo; Fernández-Aranda, Fernando; Gratacos, Monica; Rybakowski, Filip; Dmitrzak-Weglarz, Monika; Kaprio, Jaakko; Keski-Rahkonen, Anna; Raevuori, Anu; Van Furth, Eric F; Slof-Op t Landt, Margarita CT; Hudson, James I; Reichborn-Kjennerud, Ted; Knudsen, Gun Peggy S; Monteleone, Palmiero; Kaplan, Allan S; Karwautz, Andreas; Hakonarson, Hakon; Berrettini, Wade H; Guo, Yiran; Li, Dong; Schork, Nicholas J.; Komaki, Gen; Ando, Tetsuya; Inoko, Hidetoshi; Esko, Tõnu; Fischer, Krista; Männik, Katrin; Metspalu, Andres; Baker, Jessica H; Cone, Roger D; Dackor, Jennifer; DeSocio, Janiece E; Hilliard, Christopher E; O’Toole, Julie K; Pantel, Jacques; Szatkiewicz, Jin P; Taico, Chrysecolla; Zerwas, Stephanie; Trace, Sara E; Davis, Oliver SP; Helder, Sietske; Bühren, Katharina; Burghardt, Roland; de Zwaan, Martina; Egberts, Karin; Ehrlich, Stefan; Herpertz-Dahlmann, Beate; Herzog, Wolfgang; Imgart, Hartmut; Scherag, André; Scherag, Susann; Zipfel, Stephan; Boni, Claudette; Ramoz, Nicolas; Versini, Audrey; Brandys, Marek K; Danner, Unna N; de Kovel, Carolien; Hendriks, Judith; Koeleman, Bobby PC; Ophoff, Roel A; Strengman, Eric; van Elburg, Annemarie A; Bruson, Alice; Clementi, Maurizio; Degortes, Daniela; Forzan, Monica; Tenconi, Elena; Docampo, Elisa; Escaramís, Geòrgia; Jiménez-Murcia, Susana; Lissowska, Jolanta; Rajewski, Andrzej; Szeszenia-Dabrowska, Neonila; Slopien, Agnieszka; Hauser, Joanna; Karhunen, Leila; Meulenbelt, Ingrid; Slagboom, P Eline; Tortorella, Alfonso; Maj, Mario; Dedoussis, George; Dikeos, Dimitris; Gonidakis, Fragiskos; Tziouvas, Konstantinos; Tsitsika, Artemis; Papezova, Hana; Slachtova, Lenka; 
Martaskova, Debora; Kennedy, James L.; Levitan, Robert D.; Yilmaz, Zeynep; Huemer, Julia; Koubek, Doris; Merl, Elisabeth; Wagner, Gudrun; Lichtenstein, Paul; Breen, Gerome; Cohen-Woods, Sarah; Farmer, Anne; McGuffin, Peter; Cichon, Sven; Giegling, Ina; Herms, Stefan; Rujescu, Dan; Schreiber, Stefan; Wichmann, H-Erich; Dina, Christian; Sladek, Rob; Gambaro, Giovanni; Soranzo, Nicole; Julia, Antonio; Marsal, Sara; Rabionet, Raquel; Gaborieau, Valerie; Dick, Danielle M; Palotie, Aarno; Ripatti, Samuli; Widén, Elisabeth; Andreassen, Ole A; Espeseth, Thomas; Lundervold, Astri; Reinvang, Ivar; Steen, Vidar M; Le Hellard, Stephanie; Mattingsdal, Morten; Ntalla, Ioanna; Bencko, Vladimir; Foretova, Lenka; Janout, Vladimir; Navratilova, Marie; Gallinger, Steven; Pinto, Dalila; Scherer, Stephen; Aschauer, Harald; Carlberg, Laura; Schosser, Alexandra; Alfredsson, Lars; Ding, Bo; Klareskog, Lars; Padyukov, Leonid; Finan, Chris; Kalsi, Gursharan; Roberts, Marion; Logan, Darren W; Peltonen, Leena; Ritchie, Graham RS; Barrett, Jeffrey C; Estivill, Xavier; Hinney, Anke; Sullivan, Patrick F; Collier, David A; Zeggini, Eleftheria; Bulik, Cynthia M
Anorexia nervosa (AN) is a complex and heritable eating disorder characterized by dangerously low body weight. Neither candidate gene studies nor an initial genome wide association study (GWAS) have yielded significant and replicated results. We performed a GWAS in 2,907 cases with AN from 14 countries (15 sites) and 14,860 ancestrally matched controls as part of the Genetic Consortium for AN (GCAN) and the Wellcome Trust Case Control Consortium 3 (WTCCC3). Individual association analyses were conducted in each stratum and meta-analyzed across all 15 discovery datasets. Seventy-six (72 independent) SNPs were taken forward for in silico (two datasets) or de novo (13 datasets) replication genotyping in 2,677 independent AN cases and 8,629 European ancestry controls along with 458 AN cases and 421 controls from Japan. The final global meta-analysis across discovery and replication datasets comprised 5,551 AN cases and 21,080 controls. AN subtype analyses (1,606 AN restricting; 1,445 AN binge-purge) were performed. No findings reached genome-wide significance. Two intronic variants were suggestively associated: rs9839776 (P=3.01×10−7) in SOX2OT and rs17030795 (P=5.84×10−6) in PPP3CA. Two additional signals were specific to Europeans: rs1523921 (P=5.76×10−6) between CUL3 and FAM124B and rs1886797 (P=8.05×10−6) near SPATA13. Comparing discovery to replication results, 76% of the effects were in the same direction, an observation highly unlikely to be due to chance (P= 4×10−6), strongly suggesting that true findings exist but that our sample, the largest yet reported, was underpowered for their detection. The accrual of large genotyped AN case-control samples should be an immediate priority for the field. PMID:21079607
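The direction-of-effect argument in the abstract above (76% of effects agreeing between discovery and replication, reported as highly unlikely under chance) is the kind of claim a one-sided binomial sign test addresses: if there were no true signal, each replicated effect would match the discovery direction with probability 0.5. The Python sketch below shows that calculation. The abstract does not state which test or which exact counts produced its P value, so the function name and the example count (55 of 72 independent SNPs, about 76%) are illustrative assumptions.

```python
from math import comb


def sign_test_p(n_same: int, n_total: int) -> float:
    """One-sided binomial sign test.

    Returns P(X >= n_same) for X ~ Binomial(n_total, 0.5), i.e. the chance
    of seeing at least n_same concordant directions if direction were random.
    """
    if n_total <= 0 or not 0 <= n_same <= n_total:
        raise ValueError("need 0 <= n_same <= n_total and n_total > 0")
    # Count the outcomes with at least n_same concordant signs exactly,
    # then divide by the 2**n_total equally likely sign patterns.
    tail = sum(comb(n_total, k) for k in range(n_same, n_total + 1))
    return tail / 2 ** n_total


if __name__ == "__main__":
    # Hypothetical counts chosen only to match the quoted 76% proportion.
    print(sign_test_p(55, 72))
```

Because the tail is summed with exact integers before the final division, the result does not suffer from floating-point underflow even for large numbers of SNPs.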
Boraska, V; Franklin, C S; Floyd, J A B; Thornton, L M; Huckins, L M; Southam, L; Rayner, N W; Tachmazidou, I; Klump, K L; Treasure, J; Lewis, C M; Schmidt, U; Tozzi, F; Kiezebrink, K; Hebebrand, J; Gorwood, P; Adan, R A H; Kas, M J H; Favaro, A; Santonastaso, P; Fernández-Aranda, F; Gratacos, M; Rybakowski, F; Dmitrzak-Weglarz, M; Kaprio, J; Keski-Rahkonen, A; Raevuori, A; Van Furth, E F; Slof-Op 't Landt, M C T; Hudson, J I; Reichborn-Kjennerud, T; Knudsen, G P S; Monteleone, P; Kaplan, A S; Karwautz, A; Hakonarson, H; Berrettini, W H; Guo, Y; Li, D; Schork, N J; Komaki, G; Ando, T; Inoko, H; Esko, T; Fischer, K; Männik, K; Metspalu, A; Baker, J H; Cone, R D; Dackor, J; DeSocio, J E; Hilliard, C E; O'Toole, J K; Pantel, J; Szatkiewicz, J P; Taico, C; Zerwas, S; Trace, S E; Davis, O S P; Helder, S; Bühren, K; Burghardt, R; de Zwaan, M; Egberts, K; Ehrlich, S; Herpertz-Dahlmann, B; Herzog, W; Imgart, H; Scherag, A; Scherag, S; Zipfel, S; Boni, C; Ramoz, N; Versini, A; Brandys, M K; Danner, U N; de Kovel, C; Hendriks, J; Koeleman, B P C; Ophoff, R A; Strengman, E; van Elburg, A A; Bruson, A; Clementi, M; Degortes, D; Forzan, M; Tenconi, E; Docampo, E; Escaramís, G; Jiménez-Murcia, S; Lissowska, J; Rajewski, A; Szeszenia-Dabrowska, N; Slopien, A; Hauser, J; Karhunen, L; Meulenbelt, I; Slagboom, P E; Tortorella, A; Maj, M; Dedoussis, G; Dikeos, D; Gonidakis, F; Tziouvas, K; Tsitsika, A; Papezova, H; Slachtova, L; Martaskova, D; Kennedy, J L; Levitan, R D; Yilmaz, Z; Huemer, J; Koubek, D; Merl, E; Wagner, G; Lichtenstein, P; Breen, G; Cohen-Woods, S; Farmer, A; McGuffin, P; Cichon, S; Giegling, I; Herms, S; Rujescu, D; Schreiber, S; Wichmann, H-E; Dina, C; Sladek, R; Gambaro, G; Soranzo, N; Julia, A; Marsal, S; Rabionet, R; Gaborieau, V; Dick, D M; Palotie, A; Ripatti, S; Widén, E; Andreassen, O A; Espeseth, T; Lundervold, A; Reinvang, I; Steen, V M; Le Hellard, S; Mattingsdal, M; Ntalla, I; Bencko, V; Foretova, L; Janout, V; Navratilova, M; Gallinger, S; Pinto, D; 
Scherer, S W; Aschauer, H; Carlberg, L; Schosser, A; Alfredsson, L; Ding, B; Klareskog, L; Padyukov, L; Courtet, P; Guillaume, S; Jaussent, I; Finan, C; Kalsi, G; Roberts, M; Logan, D W; Peltonen, L; Ritchie, G R S; Barrett, J C; Estivill, X; Hinney, A; Sullivan, P F; Collier, D A; Zeggini, E; Bulik, C M
Anorexia nervosa (AN) is a complex and heritable eating disorder characterized by dangerously low body weight. Neither candidate gene studies nor an initial genome-wide association study (GWAS) have yielded significant and replicated results. We performed a GWAS in 2907 cases with AN from 14 countries (15 sites) and 14 860 ancestrally matched controls as part of the Genetic Consortium for AN (GCAN) and the Wellcome Trust Case Control Consortium 3 (WTCCC3). Individual association analyses were conducted in each stratum and meta-analyzed across all 15 discovery data sets. Seventy-six (72 independent) single nucleotide polymorphisms were taken forward for in silico (two data sets) or de novo (13 data sets) replication genotyping in 2677 independent AN cases and 8629 European ancestry controls along with 458 AN cases and 421 controls from Japan. The final global meta-analysis across discovery and replication data sets comprised 5551 AN cases and 21 080 controls. AN subtype analyses (1606 AN restricting; 1445 AN binge-purge) were performed. No findings reached genome-wide significance. Two intronic variants were suggestively associated: rs9839776 (P=3.01 × 10^-7) in SOX2OT and rs17030795 (P=5.84 × 10^-6) in PPP3CA. Two additional signals were specific to Europeans: rs1523921 (P=5.76 × 10^-6) between CUL3 and FAM124B and rs1886797 (P=8.05 × 10^-6) near SPATA13. Comparing discovery with replication results, 76% of the effects were in the same direction, an observation highly unlikely to be due to chance (P=4 × 10^-6), strongly suggesting that true findings exist but our sample, the largest yet reported, was underpowered for their detection. The accrual of large genotyped AN case-control samples should be an immediate priority for the field.
Boraska, Vesna; Franklin, Christopher S; Floyd, James AB; Thornton, Laura M; Huckins, Laura M; Southam, Lorraine; Rayner, N William; Tachmazidou, Ioanna; Klump, Kelly L; Treasure, Janet; Lewis, Cathryn M; Schmidt, Ulrike; Tozzi, Federica; Kiezebrink, Kirsty; Hebebrand, Johannes; Gorwood, Philip; Adan, Roger AH; Kas, Martien JH; Favaro, Angela; Santonastaso, Paolo; Fernández-Aranda, Fernando; Gratacos, Monica; Rybakowski, Filip; Dmitrzak-Weglarz, Monika; Kaprio, Jaakko; Keski-Rahkonen, Anna; Raevuori, Anu; Van Furth, Eric F; Landt, Margarita CT Slof-Op t; Hudson, James I; Reichborn-Kjennerud, Ted; Knudsen, Gun Peggy S; Monteleone, Palmiero; Kaplan, Allan S; Karwautz, Andreas; Hakonarson, Hakon; Berrettini, Wade H; Guo, Yiran; Li, Dong; Schork, Nicholas J.; Komaki, Gen; Ando, Tetsuya; Inoko, Hidetoshi; Esko, Tõnu; Fischer, Krista; Männik, Katrin; Metspalu, Andres; Baker, Jessica H; Cone, Roger D; Dackor, Jennifer; DeSocio, Janiece E; Hilliard, Christopher E; O'Toole, Julie K; Pantel, Jacques; Szatkiewicz, Jin P; Taico, Chrysecolla; Zerwas, Stephanie; Trace, Sara E; Davis, Oliver SP; Helder, Sietske; Bühren, Katharina; Burghardt, Roland; de Zwaan, Martina; Egberts, Karin; Ehrlich, Stefan; Herpertz-Dahlmann, Beate; Herzog, Wolfgang; Imgart, Hartmut; Scherag, André; Scherag, Susann; Zipfel, Stephan; Boni, Claudette; Ramoz, Nicolas; Versini, Audrey; Brandys, Marek K; Danner, Unna N; de Kovel, Carolien; Hendriks, Judith; Koeleman, Bobby PC; Ophoff, Roel A; Strengman, Eric; van Elburg, Annemarie A; Bruson, Alice; Clementi, Maurizio; Degortes, Daniela; Forzan, Monica; Tenconi, Elena; Docampo, Elisa; Escaramís, Geòrgia; Jiménez-Murcia, Susana; Lissowska, Jolanta; Rajewski, Andrzej; Szeszenia-Dabrowska, Neonila; Slopien, Agnieszka; Hauser, Joanna; Karhunen, Leila; Meulenbelt, Ingrid; Slagboom, P Eline; Tortorella, Alfonso; Maj, Mario; Dedoussis, George; Dikeos, Dimitris; Gonidakis, Fragiskos; Tziouvas, Konstantinos; Tsitsika, Artemis; Papezova, Hana; Slachtova, Lenka; 
Martaskova, Debora; Kennedy, James L.; Levitan, Robert D.; Yilmaz, Zeynep; Huemer, Julia; Koubek, Doris; Merl, Elisabeth; Wagner, Gudrun; Lichtenstein, Paul; Breen, Gerome; Cohen-Woods, Sarah; Farmer, Anne; McGuffin, Peter; Cichon, Sven; Giegling, Ina; Herms, Stefan; Rujescu, Dan; Schreiber, Stefan; Wichmann, H-Erich; Dina, Christian; Sladek, Rob; Gambaro, Giovanni; Soranzo, Nicole; Julia, Antonio; Marsal, Sara; Rabionet, Raquel; Gaborieau, Valerie; Dick, Danielle M; Palotie, Aarno; Ripatti, Samuli; Widén, Elisabeth; Andreassen, Ole A; Espeseth, Thomas; Lundervold, Astri; Reinvang, Ivar; Steen, Vidar M; Le Hellard, Stephanie; Mattingsdal, Morten; Ntalla, Ioanna; Bencko, Vladimir; Foretova, Lenka; Janout, Vladimir; Navratilova, Marie; Gallinger, Steven; Pinto, Dalila; Scherer, Stephen; Aschauer, Harald; Carlberg, Laura; Schosser, Alexandra; Alfredsson, Lars; Ding, Bo; Klareskog, Lars; Padyukov, Leonid; Finan, Chris; Kalsi, Gursharan; Roberts, Marion; Logan, Darren W; Peltonen, Leena; Ritchie, Graham RS; Barrett, Jeffrey C; Estivill, Xavier; Hinney, Anke; Sullivan, Patrick F; Collier, David A; Zeggini, Eleftheria; Bulik, Cynthia M
Anorexia nervosa (AN) is a complex and heritable eating disorder characterized by dangerously low body weight. Neither candidate gene studies nor an initial genome wide association study (GWAS) have yielded significant and replicated results. We performed a GWAS in 2,907 cases with AN from 14 countries (15 sites) and 14,860 ancestrally matched controls as part of the Genetic Consortium for AN (GCAN) and the Wellcome Trust Case Control Consortium 3 (WTCCC3). Individual association analyses were conducted in each stratum and meta-analyzed across all 15 discovery datasets. Seventy-six (72 independent) SNPs were taken forward for in silico (two datasets) or de novo (13 datasets) replication genotyping in 2,677 independent AN cases and 8,629 European ancestry controls along with 458 AN cases and 421 controls from Japan. The final global meta-analysis across discovery and replication datasets comprised 5,551 AN cases and 21,080 controls. AN subtype analyses (1,606 AN restricting; 1,445 AN binge-purge) were performed. No findings reached genome-wide significance. Two intronic variants were suggestively associated: rs9839776 (P=3.01×10^-7) in SOX2OT and rs17030795 (P=5.84×10^-6) in PPP3CA. Two additional signals were specific to Europeans: rs1523921 (P=5.76×10^-6) between CUL3 and FAM124B and rs1886797 (P=8.05×10^-6) near SPATA13. Comparing discovery to replication results, 76% of the effects were in the same direction, an observation highly unlikely to be due to chance (P=4×10^-6), strongly suggesting that true findings exist but that our sample, the largest yet reported, was underpowered for their detection. The accrual of large genotyped AN case-control samples should be an immediate priority for the field. PMID:24514567
Begum, Ferdouse; Chowdhury, Reshmi; Cheung, Vivian G.; Sherman, Stephanie L.; Feingold, Eleanor
Meiotic recombination is an essential step in gametogenesis, and is one that also generates genetic diversity. Genome-wide association studies (GWAS) and molecular studies have identified genes that influence human meiotic recombination. RNF212 is associated with total or average number of recombination events, and PRDM9 is associated with the locations of hotspots, or sequences where crossing over appears to cluster. In addition, a common inversion on chromosome 17 is strongly associated with recombination. Other genes have been identified by GWAS, but those results have not been replicated. In this study, using new datasets, we characterized additional recombination phenotypes to uncover novel candidates and further dissect the role of already known loci. We used three datasets totaling 1562 two-generation families, including 3108 parents with 4304 children. We estimated five different recombination phenotypes including two novel phenotypes (average recombination counts within recombination hotspots and outside of hotspots) using dense SNP array genotype data. We then performed gender-specific and combined-sex genome-wide association studies (GWAS) meta-analyses. We replicated associations for several previously reported recombination genes, including RNF212 and PRDM9. By looking specifically at recombination events outside of hotspots, we showed for the first time that PRDM9 has different effects in males and females. We identified several new candidate loci, particularly for recombination events outside of hotspots. These include regions near the genes SPINK6, EVC2, ARHGAP25, and DLGAP2. This study expands our understanding of human meiotic recombination by characterizing additional features that vary across individuals, and identifying regulatory variants influencing the numbers and locations of recombination events. PMID:27733454
Begum, Ferdouse; Chowdhury, Reshmi; Cheung, Vivian G; Sherman, Stephanie L; Feingold, Eleanor
Meiotic recombination is an essential step in gametogenesis, and is one that also generates genetic diversity. Genome-wide association studies (GWAS) and molecular studies have identified genes that influence human meiotic recombination. RNF212 is associated with total or average number of recombination events, and PRDM9 is associated with the locations of hotspots, or sequences where crossing over appears to cluster. In addition, a common inversion on chromosome 17 is strongly associated with recombination. Other genes have been identified by GWAS, but those results have not been replicated. In this study, using new datasets, we characterized additional recombination phenotypes to uncover novel candidates and further dissect the role of already known loci. We used three datasets totaling 1562 two-generation families, including 3108 parents with 4304 children. We estimated five different recombination phenotypes including two novel phenotypes (average recombination counts within recombination hotspots and outside of hotspots) using dense SNP array genotype data. We then performed gender-specific and combined-sex genome-wide association studies (GWAS) meta-analyses. We replicated associations for several previously reported recombination genes, including RNF212 and PRDM9. By looking specifically at recombination events outside of hotspots, we showed for the first time that PRDM9 has different effects in males and females. We identified several new candidate loci, particularly for recombination events outside of hotspots. These include regions near the genes SPINK6, EVC2, ARHGAP25, and DLGAP2. This study expands our understanding of human meiotic recombination by characterizing additional features that vary across individuals, and identifying regulatory variants influencing the numbers and locations of recombination events.
Kanazawa, Tetsufumi; Ikeda, Masashi; Glatt, Stephen J; Tsutsumi, Atsushi; Kikuyama, Hiroki; Kawamura, Yoshiya; Nishida, Nao; Miyagawa, Taku; Hashimoto, Ryota; Takeda, Masatoshi; Sasaki, Tsukasa; Tokunaga, Katsushi; Koh, Jun; Iwata, Nakao; Yoneda, Hiroshi
Atypical psychosis with a periodic course of exacerbation and features of major psychiatric disorders [schizophrenia (SZ) and bipolar disorder (BD)] has a long history in clinical psychiatry in Japan. Based upon the new criteria of atypical psychosis, a Genome-Wide Association Study (GWAS) was conducted to identify the risk gene or variants. The relationships between atypical psychosis, SZ and BD were then assessed using independent GWAS data. Forty-seven patients with solid criteria of atypical psychosis and 882 normal controls (NCs) were scanned using an Affymetrix 6.0 chip. GWAS SZ data (560 SZ cases and 548 NCs) and GWAS BD data (107 cases with BD type 1 and 107 NCs) were compared using gene-based analysis. The most significant SNPs were detected around the CHN2/CPVL genes (rs245914, P = 1.6 × 10^-7), COL21A1 gene (rs12196860, P = 2.45 × 10^-7), and PYGL/TRIM9 genes (rs1959536, P = 7.73 × 10^-7), although none of the single-nucleotide polymorphisms exhibited genome-wide significance (P = 5 × 10^-8). One of the highest peaks was detected on the major histocompatibility complex region, where large SZ GWASs have previously disclosed an association. The gene-based analysis suggested significant enrichment between SZ and atypical psychosis (P = 0.01), but not BD. This study provides clues about the types of patient whose diagnosis lies between SZ and BD. Studies with larger samples are required to determine the causal variant.
Ayers, Stephen; Switnicki, Michal Piotr; Angajala, Anusha; Lammel, Jan; Arumanayagam, Anithachristy S.; Webb, Paul
Thyroid hormone (TH) receptors (TRs) play central roles in metabolism and are major targets for pharmaceutical intervention. Presently, however, there is limited information about genome wide localizations of TR binding sites. Thus, complexities of TR genomic distribution and links between TRβ binding events and gene regulation are not fully appreciated. Here, we employ a BioChIP approach to capture TR genome-wide binding events in a liver cell line (HepG2). Like other NRs, TRβ appears widely distributed throughout the genome. Nevertheless, there is striking enrichment of TRβ binding sites immediately 5′ and 3′ of transcribed genes and TRβ can be detected near 50% of T3 induced genes. In contrast, no significant enrichment of TRβ is seen at negatively regulated genes or genes that respond to unliganded TRs in this system. Canonical TRE half-sites are present in more than 90% of TRβ peaks and classical TREs are also greatly enriched, but individual TRE organization appears highly variable with diverse half-site orientation and spacing. There is also significant enrichment of binding sites for TR associated transcription factors, including AP-1 and CTCF, near TR peaks. We conclude that T3-dependent gene induction commonly involves proximal TRβ binding events but that far-distant binding events are needed for T3 induction of some genes and that distinct, indirect, mechanisms are often at play in negative regulation and unliganded TR actions. Better understanding of genomic context of TR binding sites will help us determine why TR regulates genes in different ways and determine possibilities for selective modulation of TR action. PMID:24558356
Soo, M; Worth, Aj
Canine hip dysplasia (CHD) is a developmental orthopaedic disease of the coxofemoral joints with a multifactorial mode of inheritance. Multiple gene effects are influenced by environmental factors; therefore, it is unlikely that a simple genetic screening test with which to identify susceptible individuals will be developed in the near future. In the absence of feasible methods for objectively quantifying clinical CHD, radiographic techniques have been developed and widely used to identify dogs for breeding which are less affected by the disease. A hip-extended ventrodorsal view of the pelvis has been traditionally used to identify dogs with subluxation and/or osteoarthritis of the coxofemoral joints. More recently, there has been emphasis on the role of coxofemoral joint laxity as a determinant of CHD and methods have been developed to measure passive hip laxity. Though well-established worldwide, the effectiveness of traditional phenotypic scoring schemes in reducing the prevalence of CHD has been variable. The most successful implementation of traditional CHD scoring has occurred in countries or breeding colonies with mandatory scoring and open registries with access to pedigree records. Several commentators have recommended that for quantitative traits like CHD, selection of breeding stock should be based on estimated breeding values (EBV) rather than individual hip score/grade. The EBV is a reflection of the genetic superiority of an animal compared to its counterparts and is calculated from the phenotype of an individual and its relatives and their pedigree relationship. Selecting breeding stock on the basis of a dog's genetic merit, ideally based on a highly predictive phenotype, will confer the breeder with greater selection power, accelerate genetic improvement towards better hip conformation and thus more likely decrease the prevalence of CHD.
Background: Through social interactions, individuals affect one another’s phenotype. In such cases, an individual’s phenotype is affected by the direct (genetic) effect of the individual itself and the indirect (genetic) effects of the group mates. Using data on individual phenotypes, direct and indirect genetic (co)variances can be estimated. Together, they compose the total genetic variance that determines a population’s potential to respond to selection. However, it can be difficult or expensive to obtain individual phenotypes. Phenotypes on traits such as egg production and feed intake are, therefore, often collected on group level. In this study, we investigated whether direct, indirect and total genetic variances, and breeding values can be estimated from pooled data (pooled by group). In addition, we determined the optimal group composition, i.e. the optimal number of families represented in a group to minimise the standard error of the estimates. Methods: This study was performed in three steps. First, all research questions were answered by theoretical derivations. Second, a simulation study was conducted to investigate the estimation of variance components and optimal group composition. Third, individual and pooled survival records on 12 944 purebred laying hens were analysed to investigate the estimation of breeding values and response to selection. Results: Through theoretical derivations and simulations, we showed that the total genetic variance can be estimated from pooled data, but the underlying direct and indirect genetic (co)variances cannot. Moreover, we showed that the most accurate estimates are obtained when group members belong to the same family. Additional theoretical derivations and data analyses on survival records showed that the total genetic variance and breeding values can be estimated from pooled data. Moreover, the correlation between the estimated total breeding values obtained from individual and pooled data was surprisingly
Use of 10,129 singleton SNPs of known genomic location in tetraploid cotton provided unique opportunities to characterize genome-wide diversity among 440 Gossypium hirsutum and 219 G. barbadense cultivars and landrace accessions of widespread origin. Using genome-wide distributed SNPs, we examined ...
Baurley, James W.; Edlund, Christopher K.; Pardamean, Carissa I.; Conti, David V.; Krasnow, Ruth; Javitz, Harold S.; Hops, Hyman; Swan, Gary E.; Benowitz, Neal L.
Introduction: Metabolic enzyme variation and other patient and environmental characteristics influence smoking behaviors, treatment success, and risk of related disease. Population-specific variation in metabolic genes contributes to challenges in developing and optimizing pharmacogenetic interventions. We applied a custom genome-wide genotyping array for addiction research (Smokescreen), to three laboratory-based studies of nicotine metabolism with oral or venous administration of labeled nicotine and cotinine, to model nicotine metabolism in multiple populations. The trans-3′-hydroxycotinine/cotinine ratio, the nicotine metabolite ratio (NMR), was the nicotine metabolism measure analyzed. Methods: Three hundred twelve individuals of self-identified European, African, and Asian American ancestry were genotyped and included in ancestry-specific genome-wide association scans (GWAS) and a meta-GWAS analysis of the NMR. We modeled natural-log transformed NMR with covariates: principal components of genetic ancestry, age, sex, body mass index, and smoking status. Results: African and Asian American NMRs were statistically significantly (P values ≤ 5E-5) lower than European American NMRs. Meta-GWAS analysis identified 36 genome-wide significant variants over a 43 kilobase pair region at CYP2A6 with minimum P = 2.46E-18 at rs12459249, proximal to CYP2A6. Additional minima were located in intron 4 (rs56113850, P = 6.61E-18) and in the CYP2A6-CYP2A7 intergenic region (rs34226463, P = 1.45E-12). Most (34/36) genome-wide significant variants suggested reduced CYP2A6 activity; functional mechanisms were identified and tested in knowledge-bases. Conditional analysis resulted in intergenic variants of possible interest (P values < 5E-5). Conclusions: This meta-GWAS of the NMR identifies CYP2A6 variants, replicates the top-ranked single nucleotide polymorphism from a recent Finnish meta-GWAS of the NMR, identifies functional mechanisms, and provides pan
Pértille, Fábio; Moreira, Gabriel Costa Monteiro; Zanella, Ricardo; Nunes, José de Ribamar da Silva; Boschiero, Clarissa; Rovadoscki, Gregori Alberto; Mourão, Gerson Barreto; Ledur, Mônica Corrêa; Coutinho, Luiz Lehmann
Performance traits are economically important and are targets for selection in breeding programs, especially in the poultry industry. To identify regions of the chicken genome associated with performance traits, different genomic approaches have been applied in recent years. The aim of this study was to apply the CornellGBS approach (134,528 SNPs generated using the PstI restriction enzyme) to Genome-Wide Association Studies (GWAS) in an outbred F2 chicken population. We validated 91.7% of these 134,528 SNPs after imputation of missing genotypes. Out of those, 20 SNPs were associated with feed conversion, one was associated with body weight at 35 days of age (P < 7.86E-07) and 93 were suggestively associated with a variety of performance traits (P < 1.57E-05). The majority of these SNPs (86.2%) overlapped with previously mapped QTL for the same performance traits and some of the SNPs also showed novel potential QTL regions. The results obtained in this study suggest future searches for candidate genes and QTL refinements as well as potential use of the SNPs described here in breeding programs. PMID:28181508
Dong, Yan; Liu, Jindong; Zhang, Yan; Geng, Hongwei; Rasheed, Awais; Xiao, Yonggui; Cao, Shuanghe; Fu, Luping; Yan, Jun; Wen, Weie; Zhang, Yong; Jing, Ruilian; Xia, Xianchun; He, Zhonghu
Water soluble carbohydrates (WSC) in stems play an important role in buffering grain yield in wheat against biotic and abiotic stresses; however, knowledge of genes controlling WSC is very limited. We conducted a genome-wide association study (GWAS) using a high-density 90K SNP array to better understand the genetic basis underlying WSC, and to explore marker-based breeding approaches. WSC was evaluated in an association panel comprising 166 Chinese bread wheat cultivars planted in four environments. Fifty-two marker-trait associations (MTAs) distributed across 23 loci were identified for phenotypic best linear unbiased estimates (BLUEs), and 11 MTAs were identified in two or more environments. Linear regression showed a clear dependence of WSC BLUE scores on numbers of favorable (increasing WSC content) and unfavorable alleles (decreasing WSC), indicating that genotypes with higher numbers of favorable or lower numbers of unfavorable alleles had higher WSC content. In silico analysis of flanking sequences of trait-associated SNPs revealed eight candidate genes related to WSC content grouped into two categories based on the type of encoded proteins, namely, defense response proteins and proteins triggered by environmental stresses. The identified SNPs and candidate genes related to WSC provide opportunities for breeding higher WSC wheat cultivars. PMID:27802269
Kujur, Alice; Bajaj, Deepak; Upadhyaya, Hari D; Das, Shouvik; Ranjan, Rajeev; Shree, Tanima; Saxena, Maneesha S; Badoni, Saurabh; Kumar, Vinod; Tripathi, Shailesh; Gowda, C L L; Sharma, Shivali; Singh, Sube; Tyagi, Akhilesh K; Parida, Swarup K
important complex quantitative agronomic traits in chickpea. The numerous informative genome-wide SNPs, natural allelic diversity-led domestication pattern, and LD-based information generated in our study have got multidimensional applicability with respect to chickpea genomics-assisted breeding.
Sun, Yanfa; Liu, Ranran; Zhao, Guiping; Zheng, Maiqing; Sun, Yan; Yu, Xiaoqiong; Li, Peng; Wen, Jie
Physical appearance traits, such as feather-crested head, comb size and type, beard, wattles size, and feathered feet, are used to distinguish between breeds of chicken and also may be associated with economic traits. In this study, a genome-wide linkage analysis was used to identify candidate regions and genes for physical appearance traits and to potentially provide further knowledge of the molecular mechanisms that underlie these traits. The linkage analysis was conducted with an F2 population derived from Beijing-You chickens and a commercial broiler line. Single-nucleotide polymorphisms were analyzed using the Illumina 60K Chicken SNP Beadchip. The data were used to map quantitative trait loci and genes for six physical appearance traits. A 10-cM/0.51-Mb region (0.0-10.0 cM/0.00-0.51 Mb) with 1% genome-wide significant level on LGE22C19W28_E50C23 linkage group (LGE22) for crest trait was identified, which is likely very closely linked to the HOXC8. A QTL with 5% chromosome-wide significant level for comb weight, which partly overlaps with a region identified in a previous study, was identified at 74 cM/25.55 Mb on chicken (Gallus gallus; GG) chromosome 3 (i.e., GGA3). For beard and wattles traits, an identical region 11 cM/2.23 Mb (0.0-11.0 cM/0.00-2.23 Mb) including WNT3 and GH genes on GGA27 was identified. Two QTL with 1% genome-wide significant level for feathered feet trait, one 9-cM/2.80-Mb (48.0-57.0/13.40-16.20 Mb) region on GGA13, and another 12-cM/1.45-Mb (41.0-53.0 cM/11.37-12.82 Mb) region on GGA15 were identified. These candidate regions and genes provide important genetic information for the physical appearance traits in chicken.
Sun, Yanfa; Liu, Ranran; Zhao, Guiping; Zheng, Maiqing; Sun, Yan; Yu, Xiaoqiong; Li, Peng; Wen, Jie
Physical appearance traits, such as feather-crested head, comb size and type, beard, wattles size, and feathered feet, are used to distinguish between breeds of chicken and also may be associated with economic traits. In this study, a genome-wide linkage analysis was used to identify candidate regions and genes for physical appearance traits and to potentially provide further knowledge of the molecular mechanisms that underlie these traits. The linkage analysis was conducted with an F2 population derived from Beijing-You chickens and a commercial broiler line. Single-nucleotide polymorphisms were analyzed using the Illumina 60K Chicken SNP Beadchip. The data were used to map quantitative trait loci and genes for six physical appearance traits. A 10-cM/0.51-Mb region (0.0−10.0 cM/0.00−0.51 Mb) with 1% genome-wide significant level on LGE22C19W28_E50C23 linkage group (LGE22) for crest trait was identified, which is likely very closely linked to the HOXC8. A QTL with 5% chromosome-wide significant level for comb weight, which partly overlaps with a region identified in a previous study, was identified at 74 cM/25.55 Mb on chicken (Gallus gallus; GG) chromosome 3 (i.e., GGA3). For beard and wattles traits, an identical region 11 cM/2.23 Mb (0.0−11.0 cM/0.00−2.23 Mb) including WNT3 and GH genes on GGA27 was identified. Two QTL with 1% genome-wide significant level for feathered feet trait, one 9-cM/2.80-Mb (48.0-57.0/13.40-16.20 Mb) region on GGA13, and another 12-cM/1.45-Mb (41.0−53.0 cM/11.37−12.82 Mb) region on GGA15 were identified. These candidate regions and genes provide important genetic information for the physical appearance traits in chicken. PMID:26248982
Willing, Eva-Maria; Bentzen, Paul; van Oosterhout, Cock; Hoffmann, Margarete; Cable, Joanne; Breden, Felix; Weigel, Detlef; Dreyer, Christine
Adaptation of guppies (Poecilia reticulata) to contrasting upland and lowland habitats has been extensively studied with respect to behaviour, morphology and life history traits. Yet population history has not been studied at the whole-genome level. Although single nucleotide polymorphisms (SNPs) are the most abundant form of variation in many genomes and consequently very informative for a genome-wide picture of standing natural variation in populations, genome-wide SNP data are rarely available for wild vertebrates. Here we use genetically mapped SNP markers to comprehensively survey genetic variation within and among naturally occurring guppy populations from a wide geographic range in Trinidad and Venezuela. Results from three different clustering methods, Neighbor-net, principal component analysis (PCA) and Bayesian analysis show that the population substructure agrees with geographic separation and largely with previously hypothesized patterns of historical colonization. Within major drainages (Caroni, Oropouche and Northern), populations are genetically similar, but those in different geographic regions are highly divergent from one another, with some indications of ancient shared polymorphisms. Clear genomic signatures of a previous introduction experiment were seen, and we detected additional potential admixture events. Headwater populations were significantly less heterozygous than downstream populations. Pairwise F(ST) values revealed marked differences in allele frequencies among populations from different regions, and also among populations within the same region. F(ST) outlier methods indicated some regions of the genome as being under directional selection. Overall, this study demonstrates the power of a genome-wide SNP data set to inform for studies on natural variation, adaptation and evolution of wild populations.
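As a rough illustration of what a pairwise F(ST) value measures, the sketch below computes Wright's F(ST) for a single biallelic SNP from the allele frequencies in two populations, weighting the populations equally. The abstract does not say which F(ST) estimator was used, so this textbook form, the function name `fst_biallelic` and the example frequencies are all assumptions.

```python
def fst_biallelic(p1: float, p2: float) -> float:
    """Wright's F_ST = (H_T - H_S) / H_T for one biallelic SNP.

    p1, p2: frequency of the same allele in population 1 and 2.
    H_T is the expected heterozygosity of the pooled population and
    H_S the mean expected heterozygosity within populations.
    """
    p_bar = (p1 + p2) / 2
    h_t = 2 * p_bar * (1 - p_bar)
    h_s = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    if h_t == 0:
        # Locus fixed for the same allele in both populations.
        return 0.0
    return (h_t - h_s) / h_t


# Hypothetical allele frequencies for two populations.
print(fst_biallelic(0.2, 0.8))
```

Identical frequencies give 0 and alternate fixation gives 1. Sample-size-corrected estimators behave differently at small sample sizes, which this sketch ignores.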
Sun, Yanfa; Liu, Ranran; Zhao, Guiping; Zheng, Maiqing; Sun, Yan; Yu, Xiaoqiong; Li, Peng; Wen, Jie
Polydactyly occurs in some chicken breeds, but the molecular mechanism remains incompletely understood. Combined genome-wide linkage analysis and association study (GWAS) for chicken polydactyly helps identify loci or candidate genes for the trait and potentially provides further mechanistic understanding of this phenotype in chickens and perhaps other species. The linkage analysis and GWAS for polydactyly were conducted using an F2 population derived from Beijing-You chickens and commercial broilers. The results identified two QTLs through linkage analysis and seven single-nucleotide polymorphisms (SNPs) through GWAS, associated with the polydactyly trait. One QTL located at 35 cM on GGA2 was significant at the 1% genome-wide level and another QTL at the 1% chromosome-wide significance level was detected at 39 cM on GGA19. A total of seven SNPs, four of 5% genome-wide significance (P < 2.98 × 10^-6) and three of suggestive significance (P < 5.96 × 10^-5), were identified, including two SNPs (GGaluGA132178 and Gga_rs14135036) in the QTL on GGA2. Of the identified SNPs, the eight nearest genes were sonic hedgehog (SHH), limb region 1 homolog (mouse) (LMBR1), dipeptidyl-peptidase 6, transcript variant 3 (DPP6), thyroid-stimulating hormone, beta (TSHB), sal-like 4 (Drosophila) (SALL4), par-6 partitioning defective 6 homolog beta (Caenorhabditis elegans) (PARD6B), coenzyme Q5 (COQ5), and tyrosine 3-monooxygenase/tryptophan 5-monooxygenase activation protein, eta polypeptide (YWHAH). The GWAS supports earlier reports of the importance of SHH and LMBR1 as regulating genes for polydactyly in chickens and other species, and identified others, most of which have not previously been associated with limb development. The genes and associated SNPs revealed here provide detailed information for further exploring the molecular and developmental mechanisms underlying polydactyly.
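The two significance thresholds quoted above are consistent with a common Bonferroni-style convention: 0.05/N for genome-wide significance and 1/N for suggestive significance, where N is the number of independent tests. The abstract does not state N or confirm that this convention was used. The value 16,779 below is back-calculated from the reported thresholds and is an assumption, as is the function name.

```python
def bonferroni_thresholds(n_tests: int, alpha: float = 0.05) -> tuple[float, float]:
    """Return (genome-wide, suggestive) P-value thresholds.

    genome-wide: alpha / n_tests (Bonferroni correction)
    suggestive:  1 / n_tests (one expected false positive per scan)
    """
    return alpha / n_tests, 1.0 / n_tests


# Assumed number of independent tests, inferred from the thresholds.
genome_wide, suggestive = bonferroni_thresholds(16_779)
print(f"genome-wide: {genome_wide:.2e}, suggestive: {suggestive:.2e}")
```

The ratio of the two reported thresholds is close to 1/20, which is what this convention predicts for alpha = 0.05.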
Jia, Wei-Hua; Zhang, Ben; Matsuo, Keitaro; Shin, Aesun; Xiang, Yong-Bing; Jee, Sun Ha; Kim, Dong-Hyun; Ren, Zefang; Cai, Qiuyin; Long, Jirong; Shi, Jiajun; Wen, Wanqing; Yang, Gong; Delahanty, Ryan J.; Ji, Bu-Tian; Pan, Zhi-Zhong; Matsuda, Fumihiko; Gao, Yu-Tang; Oh, Jae Hwan; Ahn, Yoon-Ok; Park, Eun Jung; Li, Hong-Lan; Park, Ji Won; Jo, Jaeseong; Jeong, Jin-Young; Hosono, Satoyo; Casey, Graham; Peters, Ulrike; Shu, Xiao-Ou; Zeng, Yi-Xin; Zheng, Wei
To identify novel genetic factors for colorectal cancer (CRC), we conducted a genome-wide association study in East Asians. By analyzing genome-wide data in 2,098 cases and 5,749 controls, we selected 64 promising SNPs for replication in an independent set of samples including up to 5,358 cases and 5,922 controls. We identified four SNPs with a P-value of 8.58 × 10−7 to 3.77 × 10−10 in the combined analysis of all East Asian samples. Three of the four SNPs were replicated in a study conducted among 26,060 European descendants with a combined P-value of 1.22 × 10−10 for rs647161 (5q31.1), 6.64 × 10−9 for rs2423279 (20p12.3), and 3.06 × 10−8 for rs10774214 (12p13.32 near the CCND2 gene), respectively, derived from the meta-analysis of data from both East Asian and European populations. This study identified three new CRC susceptibility loci and provides additional insight into the genetics and biology of CRC. PMID:23263487
Stewart, W C L; Hager, V R
In the analysis of DNA sequences on related individuals, most methods strive to incorporate as much information as possible, with little or no attention paid to the issue of statistical significance. For example, a modern workstation can easily handle the computations needed to perform a large-scale genome-wide inheritance-by-descent (IBD) scan, but accurate assessment of the significance of that scan is often hindered by inaccurate approximations and computationally intensive simulation. To address these issues, we developed gLOD—a test of co-segregation that, for large samples, models chromosome-specific IBD statistics as a collection of stationary Gaussian processes. With this simple model, the parametric bootstrap yields an accurate and rapid assessment of significance—the genome-wide corrected P-value. Furthermore, we show that (i) under the null hypothesis, the limiting distribution of the gLOD is the standard Gumbel distribution; (ii) our parametric bootstrap simulator is approximately 40 000 times faster than gene-dropping methods, and it is more powerful than methods that approximate the adjusted P-value; and, (iii) the gLOD has the same statistical power as the widely used maximum Kong and Cox LOD. Thus, our approach gives researchers the ability to determine quickly and accurately the significance of most large-scale IBD scans, which may contain multiple traits, thousands of families and tens of thousands of DNA sequences. PMID:27245422
Mousel, Michelle R.; Reynolds, James O.; White, Stephen N.
Entropion is an inward rolling of the eyelid allowing contact between the eyelashes and cornea that may lead to blindness if not corrected. Although many mammalian species, including humans and dogs, are afflicted by congenital entropion, no specific genes or gene regions related to development of entropion have been reported in any mammalian species to date. Entropion in domestic sheep is known to have a genetic component; therefore, we used domestic sheep as a model system to identify genomic regions containing genes associated with entropion. A genome-wide association was conducted with congenital entropion in 998 Columbia, Polypay, and Rambouillet sheep genotyped with 50,000 SNP markers. Prevalence of entropion was 6.01%, with all breeds represented. Logistic regression was performed in PLINK with additive allelic, recessive, dominant, and genotypic inheritance models. Two genome-wide significant (empirical P < 0.05) SNP were identified, specifically markers in SLC2A9 (empirical P = 0.007; genotypic model) and near NLN (empirical P = 0.026; dominance model). Six additional genome-wide suggestive SNP (nominal P < 1 × 10−5) were identified including markers in or near PIK3CB (P = 2.22 × 10−6; additive model), KCNB1 (P = 2.93 × 10−6; dominance model), ZC3H12C (P = 3.25 × 10−6; genotypic model), JPH1 (P = 4.68 × 10−6; genotypic model), and MYO3B (P = 5.74 × 10−6; recessive model). This is the first report of specific gene regions associated with congenital entropion in any mammalian species, to our knowledge. Further, none of these genes have previously been associated with any eyelid traits. These results represent the first genome-wide analysis of gene regions associated with entropion and provide target regions for the development of sheep genetic markers for marker-assisted selection. PMID:26098909
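The four inheritance models named in this abstract differ only in how a SNP genotype is turned into predictor variables for the logistic regression. The sketch below shows one common way to encode a genotype, given as a minor-allele count of 0, 1 or 2, under each model. It is a generic illustration with a hypothetical helper `encode_genotype`; it is not PLINK's internal coding, and the two-indicator form used for the genotypic model is an assumption.

```python
from typing import Tuple, Union


def encode_genotype(minor_allele_count: int, model: str) -> Union[int, Tuple[int, int]]:
    """Encode a biallelic genotype (0, 1 or 2 minor alleles) for a given model."""
    if minor_allele_count not in (0, 1, 2):
        raise ValueError("minor_allele_count must be 0, 1 or 2")
    if model == "additive":
        # Each extra copy of the minor allele shifts the log-odds by the same amount.
        return minor_allele_count
    if model == "dominant":
        # One copy is enough to show the effect.
        return int(minor_allele_count >= 1)
    if model == "recessive":
        # Two copies are needed to show the effect.
        return int(minor_allele_count == 2)
    if model == "genotypic":
        # Separate indicators for heterozygotes and minor homozygotes (2 df).
        return (int(minor_allele_count == 1), int(minor_allele_count == 2))
    raise ValueError(f"unknown model: {model}")


if __name__ == "__main__":
    for model in ("additive", "dominant", "recessive", "genotypic"):
        print(model, [encode_genotype(g, model) for g in (0, 1, 2)])
```

Because the models impose different constraints on the heterozygote, the same SNP can reach significance under one model and not another, as in the results listed above.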
Wragg, D; Mwacharo, J M; Alcalde, J A; Hocking, P M; Hanotte, O
Extensive phenotypic variation is a common feature among village chickens found throughout much of the developing world, and in traditional chicken breeds that have been artificially selected for traits such as plumage variety. We present here an assessment of traditional and village chicken populations for fine mapping of Mendelian traits using genome-wide single-nucleotide polymorphism (SNP) genotyping, while providing information on their genetic structure and diversity. Bayesian clustering analysis reveals two main genetic backgrounds in traditional breeds, Kenyan, Ethiopian and Chilean village chickens. Analysis of linkage disequilibrium (LD) reveals useful LD (r2 ≥ 0.3) in both traditional and village chickens at pairwise marker distances of ∼10 Kb, while haplotype block analysis indicates a median block size of 11–12 Kb. Association mapping yielded refined mapping intervals for duplex comb (Gga 2:38.55–38.89 Mb) and rose comb (Gga 7:18.41–22.09 Mb) phenotypes in traditional breeds. Combined mapping information from traditional breeds and Chilean village chicken allows the oocyan phenotype to be fine mapped to two small regions (Gga 1:67.25–67.28 Mb, Gga 1:67.28–67.32 Mb) totalling ∼75 Kb. Mapping the unmapped earlobe pigmentation phenotype supports previous findings that the trait is sex-linked and polygenic. A critical assessment of the number of SNPs required to map simple traits indicates that between 90 and 110K SNPs are required for full genome-wide analysis of haplotype block structure/ancestry, and for association mapping in both traditional and village chickens. Our results demonstrate the importance and uniqueness of phenotypic diversity and genetic structure of traditional chicken breeds for fine-scale mapping of Mendelian traits in the species, with village chicken populations providing further opportunities to enhance mapping resolutions. PMID:22395157
Amuzu-Aweh, E N; Bijma, P; Kinghorn, B P; Vereijken, A; Visscher, J; van Arendonk, J Am; Bovenhuis, H
Prediction of heterosis has a long history with mixed success, partly due to low numbers of genetic markers and/or small data sets. We investigated the prediction of heterosis for egg number, egg weight and survival days in domestic white Leghorns, using ∼400 000 individuals from 47 crosses and allele frequencies on ∼53 000 genome-wide single nucleotide polymorphisms (SNPs). When heterosis is due to dominance, and dominance effects are independent of allele frequencies, heterosis is proportional to the squared difference in allele frequency (SDAF) between parental pure lines (not necessarily homozygous). Under these assumptions, a linear model including regression on SDAF partitions crossbred phenotypes into pure-line values and heterosis, even without pure-line phenotypes. We therefore used models where phenotypes of crossbreds were regressed on the SDAF between parental lines. Accuracy of prediction was determined using leave-one-out cross-validation. SDAF predicted heterosis for egg number and weight with an accuracy of ∼0.5, but did not predict heterosis for survival days. Heterosis predictions allowed preselection of pure lines before field-testing, saving ∼50% of field-testing cost with only 4% loss in heterosis. Accuracies from cross-validation were lower than from the model-fit, suggesting that accuracies previously reported in literature are overestimated. Cross-validation also indicated that dominance cannot fully explain heterosis. Nevertheless, the dominance model had considerable accuracy, clearly greater than that of a general/specific combining ability model. This work also showed that heterosis can be modelled even when pure-line phenotypes are unavailable. We concluded that SDAF is a useful predictor of heterosis in commercial layer breeding.
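The proportionality between heterosis and the squared difference in allele frequency follows from a short identity: at one locus with allele frequencies p and q in the two parental lines, the cross has heterozygosity p(1 − q) + q(1 − p), the parental mean is p(1 − p) + q(1 − q), and the difference is exactly (p − q)². The sketch below checks that identity numerically and computes a genome-wide SDAF. Averaging over loci and the function names are assumptions for illustration, since the abstract does not specify whether the authors summed or averaged.

```python
from typing import Sequence


def excess_heterozygosity(p: float, q: float) -> float:
    """Heterozygosity of the cross minus the mean heterozygosity of the parental lines."""
    crossbred = p * (1 - q) + q * (1 - p)
    parental_mean = p * (1 - p) + q * (1 - q)
    return crossbred - parental_mean


def sdaf(freq_line_a: Sequence[float], freq_line_b: Sequence[float]) -> float:
    """Mean squared difference in allele frequency between two lines across loci."""
    if len(freq_line_a) != len(freq_line_b):
        raise ValueError("both lines need frequencies for the same loci")
    if not freq_line_a:
        raise ValueError("at least one locus is required")
    squared = [(p - q) ** 2 for p, q in zip(freq_line_a, freq_line_b)]
    return sum(squared) / len(squared)


if __name__ == "__main__":
    # Toy allele frequencies at two loci (made-up values, not from the study).
    line_a = [0.1, 0.9]
    line_b = [0.3, 0.5]
    print(sdaf(line_a, line_b))
    print(excess_heterozygosity(0.1, 0.3), (0.1 - 0.3) ** 2)
```

Under the dominance model described in the abstract, the expected heterosis at a locus is the dominance effect multiplied by this excess heterozygosity, which is why a regression on SDAF can separate heterosis from pure-line values.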
Ostrander, Elaine A; Beale, Holly C
Because of dogs' unique population structure, human-like disease biology, and advantageous genomic features, the canine system has risen dramatically in popularity as a tool for discovering disease alleles that have been difficult to find by studying human families or populations. To date, disease studies in dogs have primarily employed either linkage analysis, leveraging the typically large family size, or genome-wide association, which requires only modest-sized case and control groups in dogs. Both have been successful but, like most techniques, each requires a specific combination of time and money, and there are inherent problems associated with each. Here we review the first report of mRNA-Seq in the dog, a study that provides insights into the potential value of applying high-throughput sequencing to the study of genetic diseases in dogs. Forman and colleagues apply high-throughput sequencing to a single case of canine neonatal cerebellar cortical degeneration. This implementation of whole genome mRNA sequencing, the first reported in dog, is additionally unusual due to the analysis: the data was used not to examine transcript levels or annotate genes, but as a form of target capture that revealed the sequence of transcripts of genes associated with ataxia in humans. This approach entails risks. It would fail if, for example, the relevant transcripts were not sufficiently expressed for genotyping or were not associated with ataxia in humans. But here it pays off handsomely, identifying a single frameshift mutation that segregates with the disease. This work sets the stage for similar studies that take advantage of recent advances in genomics while exploiting the historical background of dog breeds to identify disease-causing mutations.
Kim, Jin Il; Park, Sehee; Lee, Ilseob; Park, Kwang Sook; Kwak, Eun Jung; Moon, Kwang Mee; Lee, Chang Kyu; Bae, Joon-Yong; Park, Man-Seong; Song, Ki-Joon
Human metapneumovirus (HMPV) has been described as an important etiologic agent of upper and lower respiratory tract infections, especially in young children and the elderly. Most school-aged children might be exposed to HMPVs, and exacerbation with other viral or bacterial super-infection is common. However, our understanding of the molecular evolution of HMPVs remains limited. To address the comprehensive evolutionary dynamics of HMPVs, we report a genome-wide analysis of the eight genes (N, P, M, F, M2, SH, G, and L) using 103 complete genome sequences. Phylogenetic reconstruction revealed that the eight genes from one HMPV strain grouped into the same genetic group among the five distinct lineages (A1, A2a, A2b, B1, and B2). A few exceptions of phylogenetic incongruence might suggest past recombination events, and we detected possible recombination breakpoints in the F, SH, and G coding regions. The five genetic lineages of HMPVs shared quite remote common ancestors, ranging from more than 220 to 470 years of age, with the most recent origin for the A2b sublineage. Purifying selection was common, but most protein genes except the F and M2-2 coding regions also appeared to experience episodic diversifying selection. Taken together, these results suggest that the five lineages of HMPVs maintain their individual evolutionary dynamics and that recombination and selection forces might work on shaping the genetic diversity of HMPVs. PMID:27046055
Mondul, Alison M; Yu, Kai; Wheeler, William; Zhang, Hong; Weinstein, Stephanie J; Major, Jacqueline M; Cornelis, Marilyn C; Männistö, Satu; Hazra, Aditi; Hsing, Ann W; Jacobs, Kevin B; Eliassen, Heather; Tanaka, Toshiko; Reding, Douglas J; Hendrickson, Sara; Ferrucci, Luigi; Virtamo, Jarmo; Hunter, David J; Chanock, Stephen J; Kraft, Peter; Albanes, Demetrius
Retinol is one of the most biologically active forms of vitamin A and is hypothesized to influence a wide range of human diseases including asthma, cardiovascular disease, infectious diseases and cancer. We conducted a genome-wide association study of 5006 Caucasian individuals drawn from two cohorts of men: the Alpha-Tocopherol, Beta-Carotene Cancer Prevention (ATBC) Study and the Prostate, Lung, Colorectal, and Ovarian (PLCO) Cancer Screening Trial. We identified two independent single-nucleotide polymorphisms associated with circulating retinol levels, which are located near the transthyretin (TTR) and retinol binding protein 4 (RBP4) genes, which encode major carrier proteins of retinol: rs1667255 (P = 2.30 × 10−17) and rs10882272 (P = 6.04 × 10−12). We replicated the association with rs10882272 in RBP4 in independent samples from the Nurses' Health Study and the Invecchiare in Chianti Study (InCHIANTI) that included 3792 women and 504 men (P = 9.49 × 10−5), but found no association for retinol with rs1667255 in TTR among women, thus suggesting evidence for gender dimorphism (P-interaction = 1.31 × 10−5). Discovery of common genetic variants associated with serum retinol levels may provide further insight into the contribution of retinol and other vitamin A compounds to the development of cancer and other complex diseases.
Fernandez, Christian A.; Smith, Colton; Yang, Wenjian; Mullighan, Charles G.; Qu, Chunxu; Larsen, Eric; Bowman, W. Paul; Liu, Chengcheng; Ramsey, Laura B.; Chang, Tamara; Karol, Seth E.; Loh, Mignon L.; Raetz, Elizabeth A.; Winick, Naomi J.; Hunger, Stephen P.; Carroll, William L.; Jeha, Sima; Pui, Ching-Hon; Evans, William E.; Devidas, Meenakshi
Asparaginase is used to treat acute lymphoblastic leukemia (ALL); however, hypersensitivity reactions can lead to suboptimal asparaginase exposure. Our objective was to use a genome-wide approach to identify loci associated with asparaginase hypersensitivity in children with ALL enrolled on St. Jude Children’s Research Hospital (SJCRH) protocols Total XIIIA (n = 154), Total XV (n = 498), and Total XVI (n = 271), or Children’s Oncology Group protocols POG 9906 (n = 222) and AALL0232 (n = 2163). Germline DNA was genotyped using the Affymetrix 500K, Affymetrix 6.0, or the Illumina Exome BeadChip array. In multivariate logistic regression, the intronic rs6021191 variant in nuclear factor of activated T cells 2 (NFATC2) had the strongest association with hypersensitivity (P = 4.1 × 10−8; odds ratio [OR] = 3.11). RNA-seq data available from 65 SJCRH ALL tumor samples and 52 Yoruba HapMap samples showed that samples carrying the rs6021191 variant had higher NFATC2 expression compared with noncarriers (P = 1.1 × 10−3 and 0.03, respectively). The top ranked nonsynonymous polymorphism was rs17885382 in HLA-DRB1 (P = 3.2 × 10−6; OR = 1.63), which is in near complete linkage disequilibrium with the HLA-DRB1*07:01 allele we previously observed in a candidate gene study. The strongest risk factors for asparaginase allergy are variants within genes regulating the immune response. PMID:25987655
Moorjani, Priya; Patterson, Nick; Loh, Po-Ru; Lipson, Mark; Kisfali, Péter; Melegh, Bela I.; Bonin, Michael; Kádaši, Ľudevít; Rieß, Olaf; Berger, Bonnie; Reich, David; Melegh, Béla
The Roma people, living throughout Europe and West Asia, are a diverse population linked by the Romani language and culture. Previous linguistic and genetic studies have suggested that the Roma migrated into Europe from South Asia about 1,000–1,500 years ago. Genetic inferences about Roma history have mostly focused on the Y chromosome and mitochondrial DNA. To explore what additional information can be learned from genome-wide data, we analyzed data from six Roma groups that we genotyped at hundreds of thousands of single nucleotide polymorphisms (SNPs). We estimate that the Roma harbor about 80% West Eurasian ancestry, derived from a combination of European and South Asian sources, and that the date of admixture of South Asian and European ancestry was about 850 years before present. We provide evidence for Eastern Europe being a major source of European ancestry, and North-west India being a major source of the South Asian ancestry in the Roma. By computing allele sharing as a measure of linkage disequilibrium, we estimate that the migration of Roma out of the Indian subcontinent was accompanied by a severe founder event, which appears to have been followed by a major demographic expansion after the arrival in Europe. PMID:23516520
Quintales, Luis; Vázquez, Enrique; Antequera, Francisco
Nucleosomes contribute to compacting the genome into the nucleus and regulate the physical access of regulatory proteins to DNA either directly or through the epigenetic modifications of the histone tails. Precise mapping of nucleosome positioning across the genome is, therefore, essential to understanding the genome regulation. In recent years, several experimental protocols have been developed for this purpose that include the enzymatic digestion, chemical cleavage or immunoprecipitation of chromatin followed by next-generation sequencing of the resulting DNA fragments. Here, we compare the performance and resolution of these methods from the initial biochemical steps through the alignment of the millions of short-sequence reads to a reference genome to the final computational analysis to generate genome-wide maps of nucleosome occupancy. Because of the lack of a unified protocol to process data sets obtained through the different approaches, we have developed a new computational tool (NUCwave), which facilitates their analysis, comparison and assessment and will enable researchers to choose the most suitable method for any particular purpose. NUCwave is freely available at http://nucleosome.usal.es/nucwave along with a step-by-step protocol for its use.
Larsson, Ola; Sonenberg, Nahum; Nadon, Robert
Regulation of gene expression through translational control is a fundamental mechanism implicated in many biological processes ranging from memory formation to innate immunity, and its dysregulation contributes to human diseases. Genome-wide analyses of translational control strive to identify differential translation independent of cytosolic mRNA levels. For this reason, most studies measure genes' translation levels as log ratios (translation levels divided by corresponding cytosolic mRNA levels obtained in parallel). Counterintuitively, as a matter of mathematical necessity, these log ratios tend to be highly correlated with the cytosolic mRNA levels. Accordingly, they do not effectively correct for cytosolic mRNA level and generate substantial numbers of biological false positives and false negatives. We show that analysis of partial variance, which produces estimates of translational activity that are independent of cytosolic mRNA levels, is a superior alternative. When combined with a variance shrinkage method for estimating error variance, analysis of partial variance has the additional benefits of greater statistical power and of identifying fewer genes as translationally regulated merely because of unrealistically low variance estimates rather than large changes in translational activity. In contrast to log ratios, this formal analytical approach estimates translation effects in a statistically rigorous manner, eliminates the need for inefficient and error-prone heuristics, and produces results that agree with biological function. The method is applicable to datasets obtained from both the commonly used polysome microarray method and the sequencing-based ribosome profiling method.
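A small simulation can make the log-ratio problem concrete. In the toy model below, translated and cytosolic log signals share the same true level and each carries independent measurement noise, so there is no real translational regulation. The log ratio nevertheless correlates negatively with the cytosolic signal, because the cytosolic noise appears in both quantities, whereas the residual from regressing the translated signal on the cytosolic signal is uncorrelated with it by construction. This is only a sketch of the general idea: the noise model, sample size and helper names are assumptions, and it is not the authors' analysis of partial variance procedure.

```python
import random
from typing import List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def ols_residuals(y: Sequence[float], x: Sequence[float]) -> List[float]:
    """Residuals from a simple least-squares regression of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    intercept = my - slope * mx
    return [b - (intercept + slope * a) for a, b in zip(x, y)]


rng = random.Random(0)  # fixed seed so the toy example is reproducible
true_level = [rng.gauss(8.0, 1.0) for _ in range(2000)]   # shared log expression
cyto = [t + rng.gauss(0.0, 0.5) for t in true_level]      # cytosolic mRNA, log scale
poly = [t + rng.gauss(0.0, 0.5) for t in true_level]      # translated mRNA, log scale

log_ratio = [p - c for p, c in zip(poly, cyto)]
residual = ols_residuals(poly, cyto)

r_ratio = pearson(log_ratio, cyto)
r_resid = pearson(residual, cyto)

if __name__ == "__main__":
    print(f"corr(log ratio, cytosolic) = {r_ratio:.3f}")
    print(f"corr(residual, cytosolic)  = {r_resid:.3e}")
```

In this setup the spurious correlation of the log ratio arises purely from measurement noise, which is one way to read the abstract's statement that the correlation is a mathematical necessity.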
Hashimoto, Tatsunori; Sherwood, Richard I.; Kang, Daniel D.; Rajagopal, Nisha; Barkal, Amira A.; Zeng, Haoyang; Emons, Bart J.M.; Srinivasan, Sharanya; Jaakkola, Tommi; Gifford, David K.
Enhancers and promoters commonly occur in accessible chromatin characterized by depleted nucleosome contact; however, it is unclear how chromatin accessibility is governed. We show that log-additive cis-acting DNA sequence features can predict chromatin accessibility at high spatial resolution. We develop a new type of high-dimensional machine learning model, the Synergistic Chromatin Model (SCM), which, when trained with DNase-seq data for a cell type, is capable of predicting expected read counts of genome-wide chromatin accessibility at every base from DNA sequence alone, with the highest accuracy at hypersensitive sites shared across cell types. We confirm that an SCM accurately predicts chromatin accessibility for thousands of synthetic DNA sequences using a novel CRISPR-based method of highly efficient site-specific DNA library integration. SCMs are directly interpretable and reveal that a logic based on local, nonspecific synergistic effects, largely among pioneer TFs, is sufficient to predict a large fraction of cellular chromatin accessibility in a wide variety of cell types. PMID:27456004
Ott, Felix; Weigel, Detlef; Bowman, John L.; Heisler, Marcus G.; Wenkel, Stephan
Plant organ development and polarity establishment are mediated by the action of several transcription factors. Among these, the KANADI (KAN) subclade of the GARP protein family plays important roles in polarity-associated processes during embryo, shoot and root patterning. In this study, we have identified a set of potential direct target genes of KAN1 through a combination of chromatin immunoprecipitation/DNA sequencing (ChIP-Seq) and genome-wide transcriptional profiling using tiling arrays. Target genes are over-represented for genes involved in the regulation of organ development as well as in the response to auxin. KAN1 directly affects the expression of several genes previously shown to be important in the establishment of polarity during lateral organ and vascular tissue development. We also show that KAN1 controls, through its target genes, auxin effects on organ development at different levels: transport and its regulation, and signaling. In addition, KAN1 regulates genes involved in the response to abscisic acid, jasmonic acid, brassinosteroids, ethylene, cytokinins and gibberellins. The role of KAN1 in organ polarity is antagonized by HD-ZIPIII transcription factors, including REVOLUTA (REV). A comparison of their target genes reveals that the REV/KAN1 module acts in organ patterning through opposite regulation of shared targets. Evidence of mutual repression between closely related family members is also shown. PMID:24155946
Liu, Yong-Jun; Zhang, Lei; Papasian, Christopher J.
In the past few years, the bone field has witnessed great advances in genome-wide association studies (GWASs) of osteoporosis, with a number of promising genes identified. In particular, meta-analyses of GWASs, aimed at increasing the power of studies by combining the results from different study populations, have led to the identification of novel associations that would not otherwise have been identified in individual GWASs. Recently, the first whole genome sequencing study for osteoporosis and fractures was published, reporting a novel rare nonsense mutation. This review summarizes the important and representative findings published by December 2013. Comments are made on the notable findings and representative studies for their potential influence and implications on our present understanding of the genetics of osteoporosis. Potential limitations of GWASs and their meta-analyses are evaluated, with an emphasis on understanding the reasons for inconsistent results between different studies and on clarifying misinterpretations of GWAS meta-analysis results. Implications and challenges of GWAS are also discussed, including the need for multi- and inter-disciplinary studies. PMID:25006567
Patra, Biranchi; Kon, Yoshiko; Yadav, Gitanjali; Sevold, Anthony W.; Frumkin, Jesse P.; Vallabhajosyula, Ravishankar R.; Hintze, Arend; Østman, Bjørn; Schossau, Jory; Bhan, Ashish; Marzolf, Bruz; Tamashiro, Jenna K.; Kaur, Amardeep; Baliga, Nitin S.; Grayhack, Elizabeth J.; Adami, Christoph; Galas, David J.; Raval, Alpan; Phizicky, Eric M.; Ray, Animesh
Genomic robustness is the extent to which an organism has evolved to withstand the effects of deleterious mutations. We explored the extent of genomic robustness in budding yeast by genome-wide dosage suppressor analysis of 53 conditional lethal mutations in cell division cycle and RNA synthesis related genes, revealing 660 suppressor interactions of which 642 are novel. This collection has several distinctive features, including high co-occurrence of mutant-suppressor pairs within protein modules, highly correlated functions between the pairs and higher diversity of functions among the co-suppressors than previously observed. Dosage suppression of essential genes encoding RNA polymerase subunits and chromosome cohesion complex suggests a surprising degree of functional plasticity of macromolecular complexes, and the existence of numerous degenerate pathways for circumventing the effects of potentially lethal mutations. These results imply that organisms and cancer are likely able to exploit the genomic robustness properties, due to the persistence of cryptic gene and pathway functions, to generate variation and adapt to selective pressures. PMID:27899637
Sanchez, Robersy; Mackenzie, Sally A.
Cytosine DNA methylation (CDM) is a highly abundant, heritable but reversible chemical modification to the genome. Herein, a machine learning approach was applied to analyze the accumulation of epigenetic marks in methylomes of 152 ecotypes and 85 silencing mutants of Arabidopsis thaliana. In an information-thermodynamics framework, two measurements were used: (1) the amount of information gained/lost with the CDM changes (IR) and (2) the uncertainty of not observing a SNP (LCR). We hypothesize that epigenetic marks are chromosomal footprints accounting for different ontogenetic and phylogenetic histories of individual populations. A machine learning approach is proposed to verify this hypothesis. Results support the hypothesis by the existence of discriminatory information (DI) patterns of CDM able to discriminate between individuals and between individual subpopulations. The statistical analyses revealed a strong association between the topologies of the structured population of Arabidopsis ecotypes based on IR and on LCR, respectively. A statistical-physical relationship between IR and LCR was also found. Results to date imply that the genome-wide distribution of CDM changes is not only part of the biological signal created by the methylation regulatory machinery, but ensures the stability of the DNA molecule, preserving the integrity of the genetic message under continuous stress from thermal fluctuations in the cell environment. PMID:27322251
Tchurikov, Nickolai A.; Kretova, Olga V.; Sosin, Dmitri V.; Zykov, Ivan A.; Zhimulev, Igor F.; Kravatsky, Yuri V.
Forum domains are stretches of chromosomal DNA that are excised from eukaryotic chromosomes during their spontaneous non-random fragmentation. Most forum domains are 50–200 kb in length. We mapped forum domain termini using FISH on polytene chromosomes and we performed genome-wide mapping using a Drosophila melanogaster genomic tiling microarray consisting of overlapping 3 kb fragments. We found that forum termini very often correspond to regions of intercalary heterochromatin and regions of late replication in polytene chromosomes. We found that forum domains contain clusters of several or many genes. The largest forum domains correspond to the main clusters of homeotic genes inside BX-C and ANTP-C, cluster of histone genes and clusters of piRNAs. PRE/TRE and transcription factor binding sites often reside inside domains and do not overlap with forum domain termini. We also found that about 20% of forum domain termini correspond to small chromosomal regions where Ago1, Ago2, small RNAs and repressive chromatin structures are detected. Our results indicate that forum domains correspond to big multi-gene chromosomal units, some of which could be coordinately expressed. The data on the global mapping of forum domains revealed a strong correlation between fragmentation sites in chromosomes, particular sets of mobile elements and regions of intercalary heterochromatin. PMID:21247882
Kelemen, Linda E.; Lawrenson, Kate; Tyrer, Jonathan; Li, Qiyuan; M. Lee, Janet; Seo, Ji-Heui; Phelan, Catherine M.; Beesley, Jonathan; Chen, Xiaoqin; Spindler, Tassja J.; Aben, Katja K.H.; Anton-Culver, Hoda; Antonenkova, Natalia; Baker, Helen; Bandera, Elisa V.; Bean, Yukie; Beckmann, Matthias W.; Bisogna, Maria; Bjorge, Line; Bogdanova, Natalia; Brinton, Louise A.; Brooks-Wilson, Angela; Bruinsma, Fiona; Butzow, Ralf; Campbell, Ian G.; Carty, Karen; Chang-Claude, Jenny; Chen, Y. Ann; Chen, Zhihua; Cook, Linda S.; Cramer, Daniel W.; Cunningham, Julie M.; Cybulski, Cezary; Dansonka-Mieszkowska, Agnieszka; Dennis, Joe; Dicks, Ed; Doherty, Jennifer A.; Dörk, Thilo; du Bois, Andreas; Dürst, Matthias; Eccles, Diana; Easton, Douglas T.; Edwards, Robert P.; Eilber, Ursula; Ekici, Arif B.; Engelholm, Svend Aage; Fasching, Peter A.; Fridley, Brooke L.; Gao, Yu-Tang; Gentry-Maharaj, Aleksandra; Giles, Graham G.; Glasspool, Rosalind; Goode, Ellen L.; Goodman, Marc T.; Grownwald, Jacek; Harrington, Patricia; Harter, Philipp; Hasmad, Hanis Nazihah; Hein, Alexander; Heitz, Florian; Hildebrandt, Michelle A.T.; Hillemanns, Peter; Hogdall, Estrid; Hogdall, Claus; Hosono, Satoyo; Iversen, Edwin S.; Jakubowska, Anna; Jensen, Allan; Ji, Bu-Tian; Karlan, Beth Y; Kellar, Melissa; Kelley, Joseph L.; Kiemeney, Lambertus A.; Krakstad, Camilla; Kjaer, Susanne K.; Kupryjanczyk, Jolanta; Lambrechts, Diether; Lambrechts, Sandrina; Le, Nhu D.; Lee, Alice W.; Lele, Shashi; Leminen, Arto; Lester, Jenny; Levine, Douglas A.; Liang, Dong; Lissowska, Jolanta; Lu, Karen; Lubinski, Jan; Lundvall, Lene; Massuger, Leon F.A.G.; Matsuo, Keitaro; McGuire, Valerie; McLaughlin, John R.; McNeish, Iain; Menon, Usha; Modugno, Francesmary; Moes-Sosnowska, Joanna; Moysich, Kirsten B.; Narod, Steven A.; Nedergaard, Lotte; Ness, Roberta B.; Nevanlinna, Heli; Azmi, Mat Adenan Noor; Odunsi, Kunle; Olson, Sara H.; Orlow, Irene; Orsulic, Sandra; Weber, Rachel Palmieri; Paul, James; Pearce, Celeste Leigh; Pejovic, 
Tanja; Pelttari, Liisa M.; Permuth-Wey, Jennifer; Pike, Malcolm C.; Poole, Elizabeth M.; Ramus, Susan J.; Risch, Harvey A.; Rosen, Barry; Rossing, Mary Anne; Rothstein, Joseph H.; Rudolph, Anja; Runnebaum, Ingo B.; Rzepecka, Iwona K.; Salvesen, Helga B.; Schildkraut, Joellen M.; Schwaab, Ira; Shu, Xiao-Ou; Shvetsov, Yurii B; Siddiqui, Nadeem; Sieh, Weiva; Song, Honglin; Southey, Melissa C.; Sucheston, Lara; Tangen, Ingvild L.; Teo, Soo-Hwang; Terry, Kathryn L.; Thompson, Pamela J; Tworoger, Shelley S.; van Altena, Anne M.; Van Nieuwenhuysen, Els; Vergote, Ignace; Vierkant, Robert A.; Wang-Gohrke, Shan; Walsh, Christine; Wentzensen, Nicolas; Whittemore, Alice S.; Wicklund, Kristine G.; Wilkens, Lynne R.; Wlodzimierz, Sawicki; Woo, Yin-Ling; Wu, Xifeng; Wu, Anna H.; Yang, Hannah; Zheng, Wei; Ziogas, Argyrios; Sellers, Thomas A.; Freedman, Matthew L.; Chenevix-Trench, Georgia; Pharoah, Paul D.; Gayther, Simon A.; Berchuck, Andrew
Genome-wide association studies have identified several risk associations for ovarian carcinomas (OC) but not for mucinous ovarian carcinomas (MOC). Genotypes from OC cases and controls were imputed into the 1000 Genomes Project reference panel. Analysis of 1,644 MOC cases and 21,693 controls identified three novel risk associations: rs752590 at 2q13 (P = 3.3 × 10−8), rs711830 at 2q31.1 (P = 7.5 × 10−12) and rs688187 at 19q13.2 (P = 6.8 × 10−13). Expression Quantitative Trait Locus (eQTL) analysis in ovarian and colorectal tumors (which are histologically similar to MOC) identified significant eQTL associations for HOXD9 at 2q31.1 in ovarian (P = 4.95 × 10−4, FDR = 0.003) and colorectal (P = 0.01, FDR = 0.09) tumors, and for PAX8 at 2q13 in colorectal tumors (P = 0.03, FDR = 0.09). Chromosome conformation capture analysis identified interactions between the HOXD9 promoter and risk SNPs at 2q31.1. Overexpressing HOXD9 in MOC cells augmented the neoplastic phenotype. These findings provide the first evidence for MOC susceptibility variants and insights into the underlying biology of the disease. PMID:26075790
Lakshmanan, Vairavan; Bansal, Dhiru; Kulkarni, Jahnavi; Poduval, Deepak; Krishna, Srikar; Sasidharan, Vidyanand; Anand, Praveen; Seshasayee, Aswin; Palakodeti, Dasaradhi
In eukaryotes, 3′ untranslated regions (UTRs) play important roles in regulating posttranscriptional gene expression. The 3′UTR is defined by regulated cleavage/polyadenylation of the pre-mRNA. The advent of next-generation sequencing technology has now enabled us to identify these events on a genome-wide scale. In this study, we used poly(A)-position profiling by sequencing (3P-Seq) to capture all poly(A) sites across the genome of the freshwater planarian, Schmidtea mediterranea, an ideal model system for exploring the process of regeneration and stem cell function. We identified the 3′UTRs for ∼14,000 transcripts and thus improved the existing gene annotations. We found 97 transcripts, which are polyadenylated within an internal exon, resulting in the shrinking of the ORF and loss of a predicted protein domain. Around 40% of the transcripts in planaria were alternatively polyadenylated (ApA), resulting either in an altered 3′UTR or a change in coding sequence. We identified specific ApA transcript isoforms that were subjected to miRNA mediated gene regulation using degradome sequencing. In this study, we also confirmed a tissue-specific expression pattern for alternate polyadenylated transcripts. The insights from this study highlight the potential role of ApA in regulating the gene expression essential for planarian regeneration. PMID:27489207
Li, Zhenhui; Zheng, Ming; Abdalla, Bahareldin Ali; Zhang, Zhe; Xu, Zhenqiang; Ye, Qiao; Xu, Haiping; Luo, Wei; Nie, Qinghua; Zhang, Xiquan
In the poultry industry, aggressive behaviour is a major animal welfare issue all over the world. To date, little is known about the underlying genetics of aggressive behaviour. Here, we performed a genome-wide association study (GWAS) to explore the genetic mechanism associated with aggressive behaviour in chickens. The GWAS results showed that a total of 33 SNPs were associated with aggressive behaviour traits (P < 4.6E-6). rs312463697 on chromosome 4 was significantly associated with aggression (P = 2.10905E-07), and it was in the intron region of the sortilin-related VPS10 domain containing receptor 2 (SORCS2) gene. In addition, biological function analysis of the nearest 26 genes around the significant SNPs was performed with Ingenuity Pathway Analysis. An interaction network containing 17 genes was obtained; SORCS2 was involved in this network and interacted with nerve growth factor (NGF), nerve growth factor receptor (NGFR), dopa decarboxylase (L-dopa) and dopamine. After knockdown of SORCS2, the mRNA levels of NGF, L-dopa and dopamine receptor genes DRD1, DRD2, DRD3 and DRD4 were significantly decreased (P < 0.05). In summary, our data indicated that SORCS2 might play an important role in chicken aggressive behaviour through the regulation of dopaminergic pathways and NGF. PMID:27485826
Dai, Hui; Zhao, Yang; Qian, Cheng; Cai, Min; Zhang, Ruyang; Chu, Minjie; Dai, Juncheng; Hu, Zhibin; Shen, Hongbing; Chen, Feng
Genome-wide association studies (GWAS) are popular for identifying genetic variants that are associated with disease risk. Many approaches have been proposed to test multiple single nucleotide polymorphisms (SNPs) in a region simultaneously, to address the disadvantages of single-locus association analysis. Kernel machine based SNP set analysis, which borrows information from SNPs correlated with causal or tag SNPs, is more powerful than single locus analysis. Four types of kernel machine functions and a principal component based approach (PCA) were also compared. However, given the loss of power caused by low minor allele frequencies (MAF), we extended PCA with a new method called weighted PCA (wPCA). Comparative analysis was performed for weighted principal component analysis (wPCA), the logistic kernel machine based test (LKM) and principal component analysis (PCA) based on SNP sets under different minor allele frequencies (MAF) and linkage disequilibrium (LD) structures. We also applied the three methods to analyze two SNP sets extracted from a real GWAS dataset of non-small cell lung cancer in a Han Chinese population. Simulation results show that when the MAF of the causal SNP is low, weighted principal component and weighted IBS are more powerful than PCA and other kernel machine functions at different LD structures and different numbers of causal SNPs. Application of the three methods to a real GWAS dataset indicates that wPCA and wIBS have better performance than the linear kernel, IBS kernel and PCA.
Kikuchi, Shinji; Bheemanahalli, Raju; Jagadish, Krishna S V; Kumagai, Etsushi; Masuya, Yusuke; Kuroda, Eiki; Raghavan, Chitra; Dingkuhn, Michael; Abe, Akira; Shimono, Hiroyuki
Phenotypic plasticity of plants in response to environmental changes is important for adapting to changing climate. Less attention has been paid to exploring the advantages of phenotypic plasticity in resource-rich environments to enhance the productivity of agricultural crops. Here, we examined genetic variation in phenotypic plasticity in indica rice (Oryza sativa L.) across two diverse panels: (i) a Phenomics of Rice Adaptation and Yield (PRAY) population comprising 301 accessions and (ii) a Multi-parent-Advanced-Generation-Inter-Cross (MAGIC) indica population comprising 151 accessions. Altered planting density was used as a proxy for elevated atmospheric CO2 response. Low planting density significantly increased panicle weight per plant compared with normal density, and the magnitude of the increase ranged from 1.10 to 2.78 times among accessions for the PRAY population and from 1.05 to 2.45 times for the MAGIC population. Genome-wide-association studies revealed three Environmental Responsiveness (ER) candidate alleles (qER1-3) that were associated with relative response of panicle weight to low density. Two of these alleles were tested in 13 genotypes to clarify their biomass responses during vegetative growth under elevated CO2 in Japan. Our study provides evidence for polymorphisms that control rice phenotypic plasticity in environments that are rich in resources such as light and CO2.
Sailer, Anna; Nalls, Michael A.; Schulte, Claudia; Federoff, Monica; Price, T. Ryan; Lees, Andrew; Ross, Owen A.; Dickson, Dennis W.; Mok, Kin; Mencacci, Niccolo E.; Schottlaender, Lucia; Chelban, Viorica; Ling, Helen; O'Sullivan, Sean S.; Wood, Nicholas W.; Traynor, Bryan J.; Ferrucci, Luigi; Federoff, Howard J.; Mhyre, Timothy R.; Morris, Huw R.; Deuschl, Günther; Quinn, Niall; Widner, Hakan; Albanese, Alberto; Infante, Jon; Bhatia, Kailash P.; Poewe, Werner; Oertel, Wolfgang; Höglinger, Günter U.; Wüllner, Ullrich; Goldwurm, Stefano; Pellecchia, Maria Teresa; Ferreira, Joaquim; Tolosa, Eduardo; Bloem, Bastiaan R.; Rascol, Olivier; Meissner, Wassilios G.; Hardy, John A.; Revesz, Tamas; Holton, Janice L.; Gasser, Thomas; Wenning, Gregor K.; Singleton, Andrew B.
Objective: To identify genetic variants that play a role in the pathogenesis of multiple system atrophy (MSA), we undertook a genome-wide association study (GWAS). Methods: We performed a GWAS with >5 million genotyped and imputed single nucleotide polymorphisms (SNPs) in 918 patients with MSA of European ancestry and 3,864 controls. MSA cases were collected from North American and European centers, one third of which were neuropathologically confirmed. Results: We found no significant loci after stringent multiple testing correction. A number of regions emerged as potentially interesting for follow-up at p < 1 × 10−6, including SNPs in the genes FBXO47, ELOVL7, EDN1, and MAPT. Contrary to previous reports, we found no association of the genes SNCA and COQ2 with MSA. Conclusions: We present a GWAS in MSA. We have identified several potentially interesting gene loci, including the MAPT locus, whose significance will have to be evaluated in a larger sample set. Common genetic variation in SNCA and COQ2 does not seem to be associated with MSA. In the future, additional samples of well-characterized patients with MSA will need to be collected to perform a larger MSA GWAS, but this initial study forms the basis for these next steps. PMID:27629089
Haltaufderhyde, Kirk D.; Oancea, Elena
Because human epidermal melanocytes (HEMs) provide critical protection against skin cancer, sunburn, and photoaging, a genome-wide perspective of gene expression in these cells is vital to understanding human skin physiology. In this study we performed high throughput sequencing of HEMs to obtain a complete data set of transcript sizes, abundances, and splicing. As expected, we found that melanocyte specific genes that function in pigmentation were among the highest expressed genes. We analyzed receptor, ion channel and transcription factor gene families to get a better understanding of the cell signalling pathways used by melanocytes. We also performed a comparative transcriptomic analysis of lightly versus darkly pigmented HEMs and found 16 genes differentially expressed in the two pigmentation phenotypes; of those, only one putative melanosomal transporter (SLC45A2) has known function in pigmentation. In addition, we found 166 genes with splice isoforms expressed exclusively in one pigmentation phenotype, 17 of which are genes involved in signal transduction. Our melanocyte transcriptome study provides a comprehensive view and may help identify novel pigmentation genes and potential pharmacological targets. PMID:25451175
Kim, Jinsil; Pitlick, Mitchell M.; Christine, Paul J.; Schaefer, Amanda R.; Saleme, Cesar; Comas, Belén; Cosentino, Viviana; Gadow, Enrique; Murray, Jeffrey C.
The amnion is a specialized tissue in contact with the amniotic fluid, which is in a constantly changing state. To investigate the importance of epigenetic events in this tissue in the physiology and pathophysiology of pregnancy, we performed genome-wide DNA methylation profiling of human amnion from term (with and without labor) and preterm deliveries. Using the Illumina Infinium HumanMethylation27 BeadChip, we identified genes exhibiting differential methylation associated with normal labor and preterm birth. Functional analysis of the differentially methylated genes revealed biologically relevant enriched gene sets. Bisulfite sequencing analysis of the promoter region of the oxytocin receptor (OXTR) gene detected two CpG dinucleotides showing significant methylation differences among the three groups of samples. Hypermethylation of the CpG island of the solute carrier family 30 member 3 (SLC30A3) gene in preterm amnion was confirmed by methylation-specific PCR. This work provides preliminary evidence that DNA methylation changes in the amnion may be at least partially involved in the physiological process of labor and the etiology of preterm birth and suggests that DNA methylation profiles, in combination with other biological data, may provide valuable insight into the mechanisms underlying normal and pathological pregnancies. PMID:23533356
Fransen, Erik; Bonneux, Sarah; Corneveaux, Jason J; Schrauwen, Isabelle; Di Berardino, Federica; White, Cory H; Ohmen, Jeffrey D; Van de Heyning, Paul; Ambrosetti, Umberto; Huentelman, Matthew J; Van Camp, Guy; Friedman, Rick A
We performed a genome-wide association study (GWAS) to identify the genes responsible for age-related hearing impairment (ARHI), the most common form of hearing impairment in the elderly. Analysis of common variants, with and without adjustment for stratification and environmental covariates, rare variants and interactions, as well as gene-set enrichment analysis, showed no variants with genome-wide significance. No evidence for replication of any previously reported genes was found. A study of the genetic architecture indicates for the first time that ARHI is highly polygenic in nature, with probably no major genes involved. The phenotype depends on the aggregated effect of a large number of SNPs, of which the individual effects are undetectable in a modestly powered GWAS. We estimated that 22% of the variance in our data set can be explained by the collective effect of all genotyped SNPs. A score analysis showed a modest enrichment in causative SNPs among the SNPs with a P-value below 0.01. PMID:24939585
Cao, Zong-Fu; Ma, Chuan-Xiang; Wang, Lei; Cai, Bin
Since population genetic structure can increase the false-positive rate in genome-wide association studies (GWAS) for complex diseases, the effect of population stratification should be taken into account in GWAS. However, the effect of randomly selected SNPs in population stratification analysis is undetermined. In this study, based on the genotype data generated on the Genome-Wide Human SNP Array 6.0 from unrelated individuals of HapMap Phase2, we randomly selected SNPs that were evenly distributed across the whole genome, and acquired Ancestry Informative Markers (AIMs) by the method of f value and allelic Fisher exact test. F-statistics and STRUCTURE analysis based on the different selected sets of SNPs were used to evaluate their effectiveness in distinguishing the populations from HapMap Phase3. We found that randomly selected SNPs that were evenly distributed across the whole genome could be used to identify the population structure. This study further indicated that more than 3,000 randomly selected SNPs that were evenly distributed across the whole genome could be substituted for AIMs in population stratification analysis when no AIMs were available for specific populations.
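The F-statistics comparison described in this abstract can be illustrated with a small sketch. The abstract does not say which estimator was used, so the function below assumes the textbook two-population Wright's F_ST for a single biallelic SNP with equally weighted populations; the function name and the example allele frequencies are hypothetical.

```python
def wright_fst(p1: float, p2: float) -> float:
    """Wright's F_ST for one biallelic SNP in two equally weighted populations.

    p1 and p2 are the frequencies of the same allele in each population.
    This is an illustrative textbook estimator, not the study's own method.
    """
    p_bar = (p1 + p2) / 2.0
    # Expected heterozygosity in the pooled (total) population.
    h_total = 2.0 * p_bar * (1.0 - p_bar)
    if h_total == 0.0:
        # The SNP is monomorphic in both populations: no differentiation.
        return 0.0
    # Mean expected heterozygosity within the two subpopulations.
    h_within = (2.0 * p1 * (1.0 - p1) + 2.0 * p2 * (1.0 - p2)) / 2.0
    return (h_total - h_within) / h_total


# Hypothetical frequencies: a strongly differentiated SNP and an uninformative one.
fst_informative = wright_fst(0.1, 0.9)
fst_uninformative = wright_fst(0.3, 0.3)
```

Under these assumptions, a SNP whose allele frequencies differ sharply between populations yields a high F_ST and behaves like an ancestry informative marker, whereas a SNP with equal frequencies yields zero.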
Sauvage, Christopher; Segura, Vincent; Bauchet, Guillaume; Stevens, Rebecca; Do, Phuc Thi; Nikoloski, Zoran; Fernie, Alisdair R.; Causse, Mathilde
Genome-wide association studies have been successful in identifying genes involved in polygenic traits and are valuable for crop improvement. Tomato (Solanum lycopersicum) is a major crop and is highly appreciated worldwide for its health value. We used a core collection of 163 tomato accessions composed of S. lycopersicum, S. lycopersicum var cerasiforme, and Solanum pimpinellifolium to map loci controlling variation in fruit metabolites. Fruits were phenotyped for a broad range of metabolites, including amino acids, sugars, and ascorbate. In parallel, the accessions were genotyped with 5,995 single-nucleotide polymorphism markers spread over the whole genome. Genome-wide association analysis was conducted on a large set of metabolic traits that were stable over 2 years using a multilocus mixed model as a general method for mapping complex traits in structured populations and applied to tomato. We detected a total of 44 loci that were significantly associated with a total of 19 traits, including sucrose, ascorbate, malate, and citrate levels. These results not only provide a list of candidate loci to be functionally validated but also a powerful analytical approach for finding genetic variants that can be directly used for crop improvement and deciphering the genetic architecture of complex traits. PMID:24894148
Adhikari, Kaustubh; Reales, Guillermo; Smith, Andrew J. P.; Konka, Esra; Palmen, Jutta; Quinto-Sanchez, Mirsha; Acuña-Alonzo, Victor; Jaramillo, Claudia; Arias, William; Fuentes, Macarena; Pizarro, María; Barquera Lozano, Rodrigo; Macín Pérez, Gastón; Gómez-Valdés, Jorge; Villamil-Ramírez, Hugo; Hunemeier, Tábita; Ramallo, Virginia; Silva de Cerqueira, Caio C.; Hurtado, Malena; Villegas, Valeria; Granja, Vanessa; Gallo, Carla; Poletti, Giovanni; Schuler-Faccini, Lavinia; Salzano, Francisco M.; Bortolini, Maria-Cátira; Canizales-Quinteros, Samuel; Rothhammer, Francisco; Bedoya, Gabriel; Calderón, Rosario; Rosique, Javier; Cheeseman, Michael; Bhutta, Mahmood F.; Humphries, Steve E.; Gonzalez-José, Rolando; Headon, Denis; Balding, David; Ruiz-Linares, Andrés
Here we report a genome-wide association study for non-pathological pinna morphology in over 5,000 Latin Americans. We find genome-wide significant association at seven genomic regions affecting: lobe size and attachment, folding of antihelix, helix rolling, ear protrusion and antitragus size (linear regression P values 2 × 10−8 to 3 × 10−14). Four traits are associated with a functional variant in the Ectodysplasin A receptor (EDAR) gene, a key regulator of embryonic skin appendage development. We confirm expression of Edar in the developing mouse ear and that Edar-deficient mice have an abnormally shaped pinna. Two traits are associated with SNPs in a region overlapping the T-Box Protein 15 (TBX15) gene, a major determinant of mouse skeletal development. Strongest association in this region is observed for SNP rs17023457 located in an evolutionarily conserved binding site for the transcription factor Cartilage paired-class homeoprotein 1 (CART1), and we confirm that rs17023457 alters in vitro binding of CART1. PMID:26105758
Vongpaisarnsin, Kornkiat; Listman, Jennifer Beth; Malison, Robert T; Gelernter, Joel
The main purpose of this work was to identify a set of AIMs that stratify the genetic structure and diversity of the Thai population from a high-throughput autosomal genome-wide association study. In this study, more than one million SNPs from the international HapMap database and the Thai depression genome-wide association study have been examined to identify ancestry informative markers (AIMs) that distinguish between Thai populations. An efficient strategy is proposed to identify and characterize such SNPs and to test high-resolution SNP data from international HapMap populations. The best AIMs are identified to stratify the population and to infer genetic ancestry structure. A total of 124 AIMs were clearly clustered geographically across the continent, whereas only 89 AIMs stratified the Thai population from East Asian populations. Finally, a set of 273 AIMs was able to distinguish northern from southern Thai subpopulations. These markers will be of particular value in identifying the ethnic origins in regions where matching by self-reports is unavailable or unreliable, which usually occurs in real forensic cases.
Gadelha, Ary; Coleman, Jonathan; Breen, Gerome; Mazzoti, Diego Robles; Yonamine, Camila M; Pellegrino, Renata; Ota, Vanessa Kiyomi; Belangero, Sintia Iole; Glessner, Joseph; Sleiman, Patrick; Hakonarson, Hakon; Hayashi, Mirian A F; Bressan, Rodrigo A
Ndel1 is a DISC1-interacting oligopeptidase that cleaves neuropeptides such as neurotensin and bradykinin in vitro, and which has been associated with both neuronal migration and neurite outgrowth. We previously reported that plasma Ndel1 enzyme activity is lower in patients with schizophrenia (SCZ) compared to healthy controls (HCs). To our knowledge, no previous study has investigated the genetic factors associated with the plasma Ndel1 enzyme activity. In the current analyses, samples from 83 SCZ patients and 92 control subjects that were assayed for plasma Ndel1 enzyme activity were genotyped on Illumina Omni Express arrays. A genetic relationship matrix using genome-wide information was then used for ancestry correction, and association statistics were calculated genome-wide. Ndel1 enzyme activity was significantly lower in patients with SCZ (t=4.9; p<0.001) and was found to be associated with CAMK1D, MAGI2, CCDC25, and GABGR3, at a level of suggestive significance (p<10(-6)), independent of the clinical status. We then fitted a model to investigate the observed differences for case/control measures. Two SNPs at region 1p22.2 reached the p<10(-7) level. ZFPM2 and MAD1L1 were the only two genes with more than one hit at the 10(-6) order of p value. Therefore, Ndel1 enzyme activity is a complex trait influenced by many different genetic variants that may contribute to SCZ physiopathology.
Athanasiu, Lavinia; Smorr, Lisa-Lena H; Tesli, Martin; Røssberg, Jan I; Sønderby, Ida E; Spigset, Olav; Djurovic, Srdjan; Andreassen, Ole A
Individual variation in pharmacokinetics of psychotropic drugs, particularly metabolism, is an important factor to consider in pharmacological treatment in psychiatry. A large proportion of this variance is still not accounted for, but evidence so far suggests the involvement of genetic factors. We performed a genome-wide association study (GWAS) with concentration dose ratio (CDR) as sub-phenotype to assess metabolism rate of psychotropic drugs in a homogeneous Norwegian sample of 1334 individuals diagnosed with a severe mental disorder. The GWAS revealed one genome-wide significant marker (rs16935279, p-value=3.95×10(-10), pperm=7.5×10(-4)) located in an intronic region of the lncRNA LOC100505718. Carriers of the minor allele have a lower metabolism rate of antiepileptic drugs compared to major allele carriers. In addition, several nominally significant associations between single nucleotide polymorphisms (SNPs) and CDR for antipsychotic, antidepressant and antiepileptic drugs were disclosed. We consider standardised CDR to be a useful measure of the metabolism rate of a drug. The present findings indicate that common gene variants could affect the metabolism of psychotropic drugs. This warrants further investigations into the functional mechanisms involved as it may lead to identification of predictive markers as well as novel drug targets.
Brandl, E J; Tiwari, A K; Zai, C C; Nurmi, E L; Chowdhury, N I; Arenovich, T; Sanches, M; Goncalves, V F; Shen, J J; Lieberman, J A; Meltzer, H Y; Kennedy, J L; Müller, D J
Antipsychotic-induced weight gain (AIWG) is a common side effect with a high genetic contribution. We reanalyzed genome-wide association study (GWAS) data from the Clinical Antipsychotic Trials of Intervention Effectiveness (CATIE) selecting a refined subset of patients most suitable for AIWG studies. The final GWAS was conducted in N=189 individuals. The top polymorphisms were analyzed in a second cohort of N=86 patients. None of the single-nucleotide polymorphisms was significant at the genome-wide threshold of 5 × 10(-8). We observed interesting trends for rs9346455 (P=6.49 × 10(-6)) upstream of OGFRL1, the intergenic variants rs7336345 (P=1.31 × 10(-5)) and rs1012650 (P=1.47 × 10(-5)), and rs1059778 (P=1.49 × 10(-5)) in IBA57. In the second cohort, rs9346455 showed significant association with AIWG (P=0.005). The combined meta-analysis P-value for rs9346455 was 1.09 × 10(-7). Our reanalysis of the CATIE GWAS data revealed interesting new variants associated with AIWG. As the functional relevance of these polymorphisms is yet to be determined, further studies are needed. The Pharmacogenomics Journal advance online publication, 1 September 2015; doi:10.1038/tpj.2015.59.
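As a rough illustration of the thresholding reported in this abstract, the sketch below compares the stated P-values with the genome-wide threshold of 5 × 10(-8). That threshold is conventionally motivated as a Bonferroni correction of 0.05 over roughly one million independent tests; this rationale is an assumption here, since the abstract does not state it. The variable and function names are illustrative only.

```python
# Genome-wide significance threshold quoted in the abstract.
GENOME_WIDE_ALPHA = 5e-8

# Conventional rationale (assumed, not stated in the abstract):
# Bonferroni correction of alpha = 0.05 over ~1,000,000 independent tests.
bonferroni_alpha = 0.05 / 1_000_000

# P-values exactly as reported in the abstract.
reported_p = {
    "rs9346455": 6.49e-6,
    "rs7336345": 1.31e-5,
    "rs1012650": 1.47e-5,
    "rs1059778": 1.49e-5,
    "rs9346455 (combined meta-analysis)": 1.09e-7,
}


def genome_wide_hits(p_values, alpha=GENOME_WIDE_ALPHA):
    """Return the names of variants whose P-value is below the threshold."""
    return [name for name, p in p_values.items() if p < alpha]


hits = genome_wide_hits(reported_p)
```

With these inputs the list of hits is empty, which matches the abstract's statement that no polymorphism reached genome-wide significance, including the combined meta-analysis result for rs9346455.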
Peters, Ulrike; Hutter, Carolyn M.; Hsu, Li; Schumacher, Fredrick R.; Conti, David V.; Carlson, Christopher S.; Edlund, Christopher K.; Haile, Robert W.; Gallinger, Steven; Zanke, Brent W.; Lemire, Mathieu; Rangrej, Jagadish; Vijayaraghavan, Raakhee; Chan, Andrew T.; Hazra, Aditi; Hunter, David J.; Ma, Jing; Fuchs, Charles S.; Giovannucci, Edward L.; Kraft, Peter; Liu, Yan; Chen, Lin; Jiao, Shuo; Makar, Karen W.; Taverna, Darin; Gruber, Stephen B.; Rennert, Gad; Moreno, Victor; Ulrich, Cornelia M.; Woods, Michael O.; Green, Roger C.; Parfrey, Patrick S.; Prentice, Ross L.; Kooperberg, Charles; Jackson, Rebecca D.; LaCroix, Andrea Z.; Caan, Bette J.; Hayes, Richard B.; Berndt, Sonja I.; Chanock, Stephen J.; Schoen, Robert E.; Chang-Claude, Jenny; Hoffmeister, Michael; Brenner, Hermann; Frank, Bernd; Bézieau, Stéphane; Küry, Sébastien; Slattery, Martha L.; Hopper, John L.; Jenkins, Mark A.; Le Marchand, Loic; Lindor, Noralane M.; Newcomb, Polly A.; Seminara, Daniela; Hudson, Thomas J.; Duggan, David J.; Potter, John D.; Casey, Graham
Colorectal cancer is the second leading cause of cancer death in developed countries. Genome-wide association studies (GWAS) have successfully identified novel susceptibility loci for colorectal cancer. To follow up on these findings, and try to identify novel colorectal cancer susceptibility loci, we present results for GWAS of colorectal cancer (2,906 cases, 3,416 controls) whose main associations have not previously been published. Specifically, we calculated odds ratios (ORs) and 95% confidence intervals (CIs) using log-additive models for each study. In order to improve our power to detect novel colorectal cancer susceptibility loci, we performed a meta-analysis combining the results across studies. We selected the most statistically significant single nucleotide polymorphisms (SNPs) for replication using 10 independent studies (8,161 cases and 9,101 controls). We again used a meta-analysis to summarize results for the replication studies alone, and for a combined analysis of GWAS and replication studies. We measured 10 SNPs previously identified in colorectal cancer susceptibility loci and found eight to be associated with colorectal cancer (p-value range: 0.02 to 1.8 × 10^-8). When we excluded studies that have previously published on these SNPs, five SNPs remained significant at p<0.05 in the combined analysis. No novel susceptibility loci were significant in the replication study after adjustment for multiple testing, and none reached genome-wide significance from a combined analysis of GWAS and replication. We observed marginally significant evidence for a second independent SNP in the BMP2 region at chromosomal location 20p12 (rs4813802; replication p-value 0.03; combined p-value 7.3 × 10^-5). In a region on 5p15.33, which includes the coding regions of the TERT-CLPTM1L genes and has been identified in GWAS to be associated with susceptibility to at least seven other cancers, we observed a marginally significant
Newcomb, Polly A.; Campbell, Peter T.; Baron, John A.; Berndt, Sonja I.; Bezieau, Stephane; Brenner, Hermann; Casey, Graham; Chan, Andrew T.; Chang-Claude, Jenny; Du, Mengmeng; Figueiredo, Jane C.; Gallinger, Steven; Giovannucci, Edward L.; Haile, Robert W.; Harrison, Tabitha A.; Hayes, Richard B.; Hoffmeister, Michael; Hopper, John L.; Hudson, Thomas J.; Jeon, Jihyoun; Jenkins, Mark A.; Küry, Sébastien; Le Marchand, Loic; Lin, Yi; Lindor, Noralane M.; Nishihara, Reiko; Ogino, Shuji; Potter, John D.; Rudolph, Anja; Schoen, Robert E.; Seminara, Daniela; Slattery, Martha L.; Thibodeau, Stephen N.; Thornquist, Mark; Toth, Reka; Wallace, Robert; White, Emily; Jiao, Shuo; Lemire, Mathieu; Hsu, Li; Peters, Ulrike
Genome-wide association studies (GWAS) have identified many genetic susceptibility loci for colorectal cancer (CRC). However, variants in these loci explain only a small proportion of familial aggregation, and there are likely additional variants that are associated with CRC susceptibility. Genome-wide studies of gene-environment interactions may identify variants that are not detected in GWAS of marginal gene effects. To study this, we conducted a genome-wide analysis for interaction between genetic variants and alcohol consumption and cigarette smoking using data from the Colon Cancer Family Registry (CCFR) and the Genetics and Epidemiology of Colorectal Cancer Consortium (GECCO). Interactions were tested using logistic regression. We identified an interaction between alcohol consumption and variants in the 9q22.32/HIATL1 region for CRC risk (P_interaction = 1.76 × 10^-8; permuted p-value 3.51 × 10^-8). Compared to non-/occasional drinking, light to moderate alcohol consumption was associated with a lower risk of colorectal cancer among individuals with the rs9409565 CT genotype (OR, 0.82 [95% CI, 0.74–0.91]; P = 2.1 × 10^-4) and TT genotype (OR, 0.62 [95% CI, 0.51–0.75]; P = 1.3 × 10^-6) but was not associated with risk among those with the CC genotype (p = 0.059). No genome-wide statistically significant interactions were observed for smoking. If replicated, our suggestive finding of a genome-wide significant interaction between genetic variants and alcohol consumption might contribute to understanding colorectal cancer etiology and identifying subpopulations with differential susceptibility to the effect of alcohol on CRC risk. PMID:27723779
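The genotype-specific odds ratios, confidence intervals and P-values quoted in this abstract can be related to one another with a simple Wald calculation on the log-odds scale. The sketch below is only a rough consistency check, not the authors' analysis: it assumes the intervals are symmetric Wald intervals around log(OR), and the function name wald_from_ci is hypothetical.

```python
import math

def wald_from_ci(or_point, ci_low, ci_high, z_crit=1.959964):
    """Back out log-odds, standard error, z and two-sided P from an OR and its 95% CI.

    Assumes a Wald interval: log(OR) +/- z_crit * SE (an assumption, not stated in the abstract).
    """
    beta = math.log(or_point)                                   # log odds ratio
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z_crit)  # half-width / z_crit
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2.0))                      # two-sided normal P-value
    return beta, se, z, p

# Values as quoted for rs9409565 (light to moderate vs non-/occasional drinking).
ct = wald_from_ci(0.82, 0.74, 0.91)
tt = wald_from_ci(0.62, 0.51, 0.75)

for label, (beta, se, z, p) in (("CT", ct), ("TT", tt)):
    print(f"{label}: beta={beta:.3f} se={se:.3f} z={z:.2f} p={p:.2e}")
```

Under that assumption the recovered P-values fall in the same order of magnitude as the reported ones; small differences are expected because the published OR and CI are rounded to two decimals.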
Martin-Collado, D; Diaz, C; Drucker, A G; Carabaño, M J; Zander, K K
Livestock breed-related public good functions are often used to justify support for endangered breed conservation despite the fact that little is known about such non-market values. We show how stated preference techniques can be used to assess the non-market values that people place on livestock breeds. Through the application of a case study choice experiment survey in Zamora province, Spain, the total economic value (TEV) of the threatened Alistana-Sanabresa (AS) cattle breed was investigated. An analysis of the relative importance of the non-market components of its TEV and an assessment of the socio-economic variables that influence people's valuation of such components are used to inform conservation strategy design. Overall, the findings reveal that the AS breed had significant non-market values associated with it and that the value that respondents placed on each specific public good function also varied significantly. Functions related to indirect-use cultural and existence values were much more highly valued than landscape maintenance values. These high cultural and existence values (totalling over 80% of TEV) suggest that an AS in situ conservation strategy will be required to secure such values. As part of such a strategy, incentive mechanisms will be needed to permit farmers to capture some of these public good values and thus be able to afford to maintain breed population numbers at socially desirable levels. One such mechanism could be related to the development of breed-related agritourism initiatives, with a view to enhancing private good values and providing an important addition to continued direct support. Where linked with cultural dimensions, niche product market development, including through improving AS breed-related product quality and brand recognition, may also have a role to play as part of such an overall conservation and use strategy. We conclude that livestock breed conservation strategies with the highest potential to maximise
Mbuthia, Jackson Mwenda; Rewe, Thomas Odiwuor; Kahi, Alexander Kigunzu
This study estimated economic values for production traits (dressing percentage (DP), %; live weight for growers (LWg), kg; live weight for sows (LWs), kg) and functional traits (feed intake for growers (FEEDg), feed intake for sows (FEEDs), preweaning survival rate (PrSR), %; postweaning survival (PoSR), %; sow survival rate (SoSR), %, total number of piglets born (TNB) and farrowing interval (FI), days) under different smallholder pig production systems in Kenya. Economic values were estimated considering two production circumstances: fixed-herd and fixed-feed. Under the fixed-herd scenario, economic values were estimated assuming a situation where the herd cannot be increased due to other constraints apart from feed resources. The fixed-feed input scenario assumed that the herd size is restricted by limitation of feed resources available. In addition to the traditional profit model, a risk-rated bio-economic model was used to derive risk-rated economic values. This model accounted for imperfect knowledge concerning risk attitude of farmers and variance of input and output prices. Positive economic values obtained for traits DP, LWg, LWs, PoSR, PrSR, SoSR and TNB indicate that targeting them in improvement would positively impact profitability in pig breeding programmes. Under the fixed-feed basis, the risk-rated economic values for DP, LWg, LWs and SoSR were similar to those obtained under the fixed-herd situation. Accounting for risks in the EVs did not yield errors greater than ±50% in any of the production systems or bases of evaluation, meaning there would be relatively little effect on the real genetic gain of a selection index. Therefore, both traditional and risk-rated models can be satisfactorily used to predict profitability in pig breeding programmes.
With the introduction of new sequencing technologies, single nucleotide polymorphisms (SNPs) are rapidly replacing simple sequence repeats (SSRs) as the DNA marker of choice for applications in plant breeding and genetics because they are more abundant, stable, amenable to automation, efficient, and...
Herrero-Medrano, J M; Mathur, P K; ten Napel, J; Rashidi, H; Alexandri, P; Knol, E F; Mulder, H A
Robustness is an important issue in the pig production industry. Since pigs from international breeding organizations have to withstand a variety of environmental challenges, selection of pigs with the inherent ability to sustain their productivity in diverse environments may be an economically feasible approach in the livestock industry. The objective of this study was to estimate genetic parameters and breeding values across different levels of environmental challenge load. The challenge load (CL) was estimated as the reduction in reproductive performance during different weeks of a year using 925,711 farrowing records from farms distributed worldwide. A wide range of levels of challenge, from favorable to unfavorable environments, was observed among farms with high CL values being associated with confirmed situations of unfavorable environment. Genetic parameters and breeding values were estimated in high- and low-challenge environments using a bivariate analysis, as well as across increasing levels of challenge with a random regression model using Legendre polynomials. Although heritability estimates of number of pigs born alive were slightly higher in environments with extreme CL than in those with intermediate levels of CL, the heritabilities of number of piglet losses increased progressively as CL increased. Genetic correlations among environments with different levels of CL suggest that selection in environments with extremes of low or high CL would result in low response to selection. Therefore, selection programs of breeding organizations that are commonly conducted under favorable environments could have low response to selection in commercial farms that have unfavorable environmental conditions. Sows that had experienced high levels of challenge at least once during their productive life were ranked according to their EBV. The selection of pigs using EBV ignoring environmental challenges or on the basis of records from only favorable environments
Sanchez-Juan, Pascual; Bishop, Matthew T.; Kovacs, Gabor G.; Calero, Miguel; Aulchenko, Yurii S.; Ladogana, Anna; Boyd, Alison; Lewis, Victoria; Ponto, Claudia; Calero, Olga; Poleggi, Anna; Carracedo, Ángel; van der Lee, Sven J.; Ströbel, Thomas; Rivadeneira, Fernando; Hofman, Albert; Haïk, Stéphane; Combarros, Onofre; Berciano, José; Uitterlinden, Andre G.; Collins, Steven J.; Budka, Herbert; Brandel, Jean-Philippe; Laplanche, Jean Louis; Pocchiari, Maurizio; Zerr, Inga; Knight, Richard S. G.; Will, Robert G.; van Duijn, Cornelia M.
We performed a genome-wide association (GWA) study in 434 sporadic Creutzfeldt-Jakob disease (sCJD) patients and 1939 controls from the United Kingdom, Germany and The Netherlands. The findings were replicated in an independent sample of 1109 sCJD and 2264 controls provided by a multinational consortium. From the initial GWA analysis we selected 23 SNPs for further genotyping in 1109 sCJD cases from seven different countries. Five SNPs were significantly associated with sCJD after correction for multiple testing. Subsequently these five SNPs were genotyped in 2264 controls. The pooled analysis, including 1543 sCJD cases and 4203 controls, yielded two genome-wide significant results: rs6107516 (p-value=7.62 × 10^-9), a variant tagging the prion protein gene (PRNP), and rs6951643 (p-value=1.66 × 10^-8), tagging the Glutamate Receptor Metabotropic 8 gene (GRM8). Next we analysed the data stratifying by country of origin, combining samples from the pooled analysis with genotypes from the 1000 Genomes Project and imputed genotypes from the Rotterdam Study (total n=12967). The meta-analysis of the results showed that rs6107516 (p-value=3.00 × 10^-8) and rs6951643 (p-value=3.91 × 10^-5) remained the two most significantly associated SNPs. Rs6951643 is located in an intronic region of GRM8, a gene that was additionally tagged by a cluster of 12 SNPs within our top 100 ranked results. GRM8 encodes mGluR8, a protein which belongs to the metabotropic glutamate receptor family, recently shown to be involved in the transduction of cellular signals triggered by the prion protein. Pathway enrichment analyses performed with both Ingenuity Pathway Analysis and ALIGATOR postulate glutamate receptor signalling as one of the main pathways associated with sCJD. In summary, we have detected GRM8 as a novel, non-PRNP, genome-wide significant marker associated with heightened disease risk, providing additional evidence supporting a role of glutamate receptors in sCJD pathogenesis. PMID:25918841
Fodor, Agota; Segura, Vincent; Denis, Marie; Neuenschwander, Samuel; Fournier-Level, Alexandre; Chatelet, Philippe; Homa, Félix Abdel Aziz; Lacombe, Thierry; This, Patrice; Le Cunff, Loic
Nowadays, genome-wide association studies (GWAS) and genomic selection (GS) methods, which use genome-wide marker data for phenotype prediction, are of much potential interest in plant breeding. However, to our knowledge, no studies have been performed yet on the predictive ability of these methods for structured traits when using training populations with high levels of genetic diversity. Grapevine is one such highly heterozygous, perennial species. The present study compares the accuracy of models based on GWAS or GS alone, or in combination, for predicting simple or complex traits, linked or not with population structure. In order to explore the relevance of these methods in this context, we performed simulations using approximately 90,000 SNPs on a population of 3,000 individuals structured into three groups and corresponding to published diversity grapevine data. To estimate the parameters of the prediction models, we defined four training populations of 1,000 individuals, corresponding to these three groups and a core collection. Finally, to estimate the accuracy of the models, we also simulated four breeding populations of 200 individuals. Although prediction accuracy was low when breeding populations were too distant from the training populations, high accuracy levels were obtained using the core collection alone as the training population. The highest prediction accuracy (up to 0.9) was obtained using the combined GWAS-GS model. We thus recommend using the combined prediction model and a core collection as training population for grapevine breeding or for other economically important crops with the same characteristics. PMID:25365338
Crossa, José; Campos, Gustavo de los; Pérez, Paulino; Gianola, Daniel; Burgueño, Juan; Araus, José Luis; Makumbi, Dan; Singh, Ravi P.; Dreisigacker, Susanne; Yan, Jianbing; Arief, Vivi; Banziger, Marianne; Braun, Hans-Joachim
The availability of dense molecular markers has made possible the use of genomic selection (GS) for plant breeding. However, the evaluation of models for GS in real plant populations is very limited. This article evaluates the performance of parametric and semiparametric models for GS using wheat (Triticum aestivum L.) and maize (Zea mays) data in which different traits were measured in several environmental conditions. The findings, based on extensive cross-validations, indicate that models including marker information had higher predictive ability than pedigree-based models. In the wheat data set, and relative to a pedigree model, gains in predictive ability due to inclusion of markers ranged from 7.7 to 35.7%. Correlation between observed and predicted values in the maize data set achieved values up to 0.79. Estimates of marker effects were different across environmental conditions, indicating that genotype × environment interaction is an important component of genetic variability. These results indicate that GS in plant breeding can be an effective strategy for selecting among lines whose phenotypes have yet to be observed. PMID:20813882
Darcy, Diana; Atwal, Paldeep Singh; Angell, Cathy; Gadi, Inder; Wallerstein, Robert
We report on a 6-month-old girl with two apparent cell lines; one with trisomy 21, and the other with paternal genome-wide uniparental isodisomy (GWUPiD), identified using single nucleotide polymorphism (SNP) based microarray and microsatellite analysis of polymorphic loci. The patient has Beckwith-Wiedemann syndrome (BWS) due to paternal uniparental disomy (UPD) at chromosome location 11p15 (UPD 11p15), which was confirmed through methylation analysis. Hyperinsulinemic hypoglycemia is present, which is associated with paternal UPD 11p15.5; and she likely has medullary nephrocalcinosis, which is associated with paternal UPD 20, although this was not biochemically confirmed. Angelman syndrome (AS) analysis was negative but this testing is not completely informative; she has no specific features of AS. Clinical features of this patient include: dysmorphic features consistent with trisomy 21, tetralogy of Fallot, hemihypertrophy, swirled skin hyperpigmentation, hepatoblastoma, and Wilms tumor. Her karyotype is 47,XX,+21/46,XX, and microarray results suggest that the cell line with trisomy 21 is biparentally inherited and represents 40-50% of the genomic material in the tested specimen. The difference in the level of cytogenetically detected mosaicism versus the level of mosaicism observed via microarray analysis is likely caused by differences in the test methodologies. While a handful of cases of mosaic paternal GWUPiD have been reported, this patient is the only reported case that also involves trisomy 21. Other GWUPiD patients have presented with features associated with multiple imprinted regions, as does our patient.
Abrantes, Patrícia; Francisco, Vânia; Teixeira, Gilberto; Monteiro, Marta; Neves, João; Norte, Ana; Robalo Cordeiro, Carlos; Moura e Sá, João; Reis, Ernestina; Santos, Patrícia; Oliveira, Manuela; Sousa, Susana; Fradinho, Marta; Malheiro, Filipa; Negrão, Luís
Despite elevated incidence and recurrence rates for Primary Spontaneous Pneumothorax (PSP), little is known about its etiology, and the genetics of idiopathic PSP remains unexplored. To identify genetic variants contributing to sporadic PSP risk, we conducted the first PSP genome-wide association study. Two replicate pools of 92 Portuguese PSP cases and of 129 age- and sex-matched controls were allelotyped in triplicate on the Affymetrix Human SNP Array 6.0 arrays. Markers passing quality control were ranked by relative allele score difference between cases and controls (|RASdiff|), by a novel cluster method and by a combined Z-test. 101 single nucleotide polymorphisms (SNPs) were selected using these three approaches for technical validation by individual genotyping in the discovery dataset. 87 out of 94 successfully tested SNPs were nominally associated in the discovery dataset. Replication of the 87 technically validated SNPs was then carried out in an independent replication dataset of 100 Portuguese cases and 425 controls. The intergenic rs4733649 SNP in chromosome 8 (between LINC00824 and LINC00977) was associated with PSP in the discovery (P = 4.07E-03, ORC[95% CI] = 1.88[1.22–2.89]), replication (P = 1.50E-02, ORC[95% CI] = 1.50[1.08–2.09]) and combined datasets (P = 8.61E-05, ORC[95% CI] = 1.65[1.29–2.13]). This study identified for the first time one genetic risk factor for sporadic PSP, but future studies are warranted to further confirm this finding in other populations and uncover its functional role in PSP pathogenesis. PMID:27203581
Lai, Rose K; Chen, Yanwen; Guan, Xiaowei; Nousome, Darryl; Sharma, Charu; Canoll, Peter; Bruce, Jeffrey; Sloan, Andrew E; Cortes, Etty; Vonsattel, Jean-Paul; Su, Tao; Delgado-Cruzata, Lissette; Gurvich, Irina; Santella, Regina M; Ostrom, Quinn; Lee, Annette; Gregersen, Peter; Barnholtz-Sloan, Jill
Few studies have investigated genome-wide methylation in glioblastoma multiforme (GBM). Our goals were to study differential methylation across the genome in gene promoters using an array-based method, as well as repetitive elements using surrogate global methylation markers. The discovery sample set for this study consisted of 54 GBM from Columbia University and Case Western Reserve University, and 24 brain controls from the New York Brain Bank. We assembled a validation dataset using methylation data of 162 TCGA GBM and 140 brain controls from dbGAP. HumanMethylation27 Analysis Bead-Chips (Illumina) were used to interrogate 26,486 informative CpG sites in both the discovery and validation datasets. Global methylation levels were assessed by analysis of L1 retrotransposon (LINE1), 5 methyl-deoxycytidine (5m-dC) and 5 hydroxylmethyl-deoxycytidine (5hm-dC) in the discovery dataset. We validated a total of 1548 CpG sites (1307 genes) that were differentially methylated in GBM compared to controls. There were more than twice as many hypomethylated genes as hypermethylated ones. Both the discovery and validation datasets found 5 tumor methylation classes. Pathway analyses showed that the top ten pathways in hypomethylated genes were all related to functions of innate and acquired immunities. Among hypermethylated pathways, transcriptional regulatory network in embryonic stem cells was the most significant. In the study of global methylation markers, 5m-dC level was the best discriminant among methylation classes, whereas in survival analyses, high level of LINE1 methylation was an independent, favorable prognostic factor in the discovery dataset. Based on a pathway approach, hypermethylation in genes that control stem cell differentiation was a significant, poor prognostic factor of overall survival in both the discovery and validation datasets. Approaches that target these methylated genes may be a future therapeutic goal.
Christopoulou, Marilena; Wo, Sebastian Reyes-Chin; Kozik, Alex; McHale, Leah K; Truco, Maria-Jose; Wroblewski, Tadeusz; Michelmore, Richard W
Genome-wide motif searches identified 1134 genes in the lettuce reference genome of cv. Salinas that are potentially involved in pathogen recognition, of which 385 were predicted to encode nucleotide binding-leucine rich repeat receptor (NLR) proteins. Using a maximum-likelihood approach, we grouped the NLRs into 25 multigene families and 17 singletons. Forty-one percent of these NLR-encoding genes belong to three families, the largest being RGC16 with 62 genes in cv. Salinas. The majority of NLR-encoding genes are located in five major resistance clusters (MRCs) on chromosomes 1, 2, 3, 4, and 8 and cosegregate with multiple disease resistance phenotypes. Most MRCs contain primarily members of a single NLR gene family but a few are more complex. MRC2 spans 73 Mb and contains 61 NLRs of six different gene families that cosegregate with nine disease resistance phenotypes. MRC3, which is 25 Mb, contains 22 RGC21 genes and colocates with Dm13. A library of 33 transgenic RNA interference tester stocks was generated for functional analysis of NLR-encoding genes that cosegregated with disease resistance phenotypes in each of the MRCs. Members of four NLR-encoding families, RGC1, RGC2, RGC21, and RGC12, were shown to be required for 16 disease resistance phenotypes in lettuce. The general composition of MRCs is conserved across different genotypes; however, the specific repertoire of NLR-encoding genes varied, particularly for the rapidly evolving Type I genes. These tester stocks are valuable resources for future analyses of additional resistance phenotypes.
Yuen, Ryan K C; Merico, Daniele; Cao, Hongzhi; Pellecchia, Giovanna; Alipanahi, Babak; Thiruvahindrapuram, Bhooma; Tong, Xin; Sun, Yuhui; Cao, Dandan; Zhang, Tao; Wu, Xueli; Jin, Xin; Zhou, Ze; Liu, Xiaomin; Nalpathamkalam, Thomas; Walker, Susan; Howe, Jennifer L.; Wang, Zhuozhi; MacDonald, Jeffrey R.; Chan, Ada; D’Abate, Lia; Deneault, Eric; Siu, Michelle T.; Tammimies, Kristiina; Uddin, Mohammed; Zarrei, Mehdi; Wang, Mingbang; Li, Yingrui; Wang, Jun; Wang, Jian; Yang, Huanming; Bookman, Matt; Bingham, Jonathan; Gross, Samuel S.; Loy, Dion; Pletcher, Mathew; Marshall, Christian R.; Anagnostou, Evdokia; Zwaigenbaum, Lonnie; Weksberg, Rosanna; Fernandez, Bridget A; Roberts, Wendy; Szatmari, Peter; Glazer, David; Frey, Brendan J.; Ring, Robert H.; Xu, Xun; Scherer, Stephen W.
De novo mutations (DNMs) are important in Autism Spectrum Disorder (ASD), but so far analyses have mainly been on the ~1.5% of the genome encoding genes. Here, we performed whole genome sequencing (WGS) of 200 ASD parent-child trios and characterized germline and somatic DNMs. We confirmed that the majority of germline DNMs (75.6%) originated from the father, and these increased significantly with paternal age only (p=4.2 × 10^-10). However, when clustered DNMs (those within 20kb) were found in ASD, not only did they mostly originate from the mother (p=7.7 × 10^-13), but they could also be found adjacent to de novo copy number variations (CNVs) where the mutation rate was significantly elevated (p=2.4 × 10^-24). By comparing DNMs detected in controls, we found a significant enrichment of predicted damaging DNMs in ASD cases (p=8.0 × 10^-9; OR=1.84), of which 15.6% (p=4.3 × 10^-3) and 22.5% (p=7.0 × 10^-5) were in non-coding or genic non-coding regions, respectively. The non-coding elements most enriched for DNM were untranslated regions of genes, boundaries involved in exon-skipping and DNase I hypersensitive regions. Using microarrays and a novel outlier detection test, we also found aberrant methylation profiles in 2/185 (1.1%) of ASD cases. These same individuals carried independently identified DNMs in the ASD risk and epigenetic genes DNMT3A and ADNP. Our data begin to characterize different genome-wide DNMs and highlight the contribution of non-coding variants to the etiology of ASD. PMID:27525107
Background: Sleep is a highly conserved behavior, yet its duration and pattern vary extensively among species and between individuals within species. The genetic basis of natural variation in sleep remains unknown. Results: We used the Drosophila Genetic Reference Panel (DGRP) to perform a genome-wide association (GWA) study of sleep in D. melanogaster. We identified candidate single nucleotide polymorphisms (SNPs) associated with differences in the mean as well as the environmental sensitivity of sleep traits; these SNPs typically had sex-specific or sex-biased effects, and were generally located in non-coding regions. The majority of SNPs (80.3%) affecting sleep were at low frequency and had moderately large effects. Additive models incorporating multiple SNPs explained as much as 55% of the genetic variance for sleep in males and females. Many of these loci are known to interact physically and/or genetically, enabling us to place them in candidate genetic networks. We confirmed the role of seven novel loci on sleep using insertional mutagenesis and RNA interference. Conclusions: We identified many SNPs in novel loci that are potentially associated with natural variation in sleep, as well as SNPs within genes previously known to affect Drosophila sleep. Several of the candidate genes have human homologues that were identified in studies of human sleep, suggesting that genes affecting variation in sleep are conserved across species. Our discovery of genetic variants that influence environmental sensitivity to sleep may have a wider application to all GWA studies, because individuals with highly plastic genotypes will not have consistent phenotypes. PMID:23617951
Legarra, A; Misztal, I
Genome-wide genetic evaluation might involve the computation of BLUP-like estimations, potentially including thousands of covariates (i.e., single-nucleotide polymorphism markers) for each record. This implies dense Henderson's mixed-model equations and considerable computing resources in time and storage, even for a few thousand records. Possible computing options include the type of storage and the solving algorithm. This work evaluated several computing options, including half-stored Cholesky decomposition, Gauss-Seidel, and 3 matrix-free strategies: Gauss-Seidel, Gauss-Seidel with residuals update, and preconditioned conjugate gradients. Matrix-free Gauss-Seidel with residuals update adjusts the residuals after computing the solution for each effect. This avoids adjusting the left-hand side of the equations by all other effects at every step of the algorithm and saves considerable computing time. Any Gauss-Seidel algorithm can easily be extended for variance component estimation by Markov chain-Monte Carlo. Let m and n be the number of records and markers, respectively. Computing time for Cholesky decomposition is proportional to n^3. Computing times per round are proportional to mn^2 in matrix-free Gauss-Seidel, to n^2 for half-stored Gauss-Seidel, and to n and m for the rest of the algorithms. Algorithms were tested on a real mouse data set, which included 1,928 records and 10,946 single-nucleotide polymorphism markers. Computing times were in the order of a few minutes for Gauss-Seidel with residuals update and preconditioned conjugate gradients, more than 1 h for half-stored Gauss-Seidel, 2 h for Cholesky decomposition, and 4 d for matrix-free Gauss-Seidel. Preconditioned conjugate gradients was the fastest. Gauss-Seidel with residuals update would be the method of choice for variance component estimation as well as solving.
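The residuals-update idea described in this abstract can be illustrated with a small pure-Python sketch. It is a toy under stated assumptions, not the authors' implementation: the model contains only random marker effects (no mean or other fixed effects), the variance ratio lam = residual variance / marker variance is taken as known, genotypes are coded -1/0/1, and the names gsru, Z, y and lam are hypothetical. The equations being solved are (Z'Z + lam*I) a = Z'y, and the matrix Z'Z is never formed.

```python
import random

def gsru(Z, y, lam, n_rounds=500, tol=1e-12):
    """Solve (Z'Z + lam*I) a = Z'y by Gauss-Seidel with residuals update."""
    m, n = len(Z), len(Z[0])
    a = [0.0] * n
    e = list(y)  # current residuals y - Z a (a starts at zero)
    # Diagonal elements z_j'z_j, computed once up front.
    zz = [sum(Z[i][j] ** 2 for i in range(m)) for j in range(n)]
    for _ in range(n_rounds):
        max_change = 0.0
        for j in range(n):
            # Right-hand side for marker j built from the residuals,
            # instead of adjusting for all other effects explicitly.
            rhs = sum(Z[i][j] * e[i] for i in range(m)) + zz[j] * a[j]
            new_aj = rhs / (zz[j] + lam)
            delta = new_aj - a[j]
            # Update the residuals right away so the next marker sees the change.
            for i in range(m):
                e[i] -= Z[i][j] * delta
            a[j] = new_aj
            max_change = max(max_change, abs(delta))
        if max_change < tol:
            break
    return a, e

# Toy data (simulated here purely for illustration).
random.seed(0)
m, n, lam = 40, 6, 5.0
Z = [[random.choice([-1, 0, 1]) for _ in range(n)] for _ in range(m)]
true_a = [random.gauss(0.0, 1.0) for _ in range(n)]
y = [sum(Z[i][j] * true_a[j] for j in range(n)) + random.gauss(0.0, 1.0)
     for i in range(m)]

a_hat, resid = gsru(Z, y, lam)
print([round(v, 3) for v in a_hat])
```

In this sketch each round touches every genotype a fixed number of times, which is why the work per round grows with the number of records and markers rather than with the square of the number of markers. Convergence can be checked by confirming that Z'(y - Z a) - lam*a is close to zero for every marker.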
Background: Condensins are multi-subunit protein complexes that are essential for chromosome condensation during mitosis and meiosis, and play key roles in transcription regulation during interphase. Metazoans contain two condensins, I and II, which perform different functions and localize to different chromosomal regions. Caenorhabditis elegans contains a third condensin, IDC, that is targeted to and represses transcription of the X chromosome for dosage compensation. Results: To understand condensin binding and function, we performed ChIP-seq analysis of C. elegans condensins in mixed developmental stage embryos, which contain predominantly interphase nuclei. Condensins bind to a subset of active promoters, tRNA genes and putative enhancers. Expression analysis in kle-2-mutant larvae suggests that the primary effect of condensin II on transcription is repression. A DNA sequence motif, GCGC, is enriched at condensin II binding sites. A sequence extension of this core motif, AGGG, creates the condensin IDC motif. In addition to differences in recruitment that result in X-enrichment of condensin IDC and condensin II binding to all chromosomes, we provide evidence for a shared recruitment mechanism, as condensin IDC recruiter SDC-2 also recruits condensin II to the condensin IDC recruitment sites on the X. In addition, we found that condensin sites overlap extensively with the cohesin loader SCC-2, and that SDC-2 also recruits SCC-2 to the condensin IDC recruitment sites. Conclusions: Our results provide the first genome-wide view of metazoan condensin II binding in interphase, define putative recruitment motifs, and illustrate shared loading mechanisms for condensin IDC and condensin II. PMID:24125077
Background: Even before having its genome sequence published in 2004, Kluyveromyces lactis had long been considered a model organism for studies in genetics and physiology. Research on Kluyveromyces lactis is quite advanced and this yeast species is one of the few with which it is possible to perform formal genetic analysis. Nevertheless, until now, no complete metabolic functional annotation has been performed for the proteins encoded in the Kluyveromyces lactis genome. Results: In this work, a new metabolic genome-wide functional re-annotation of the proteins encoded in the Kluyveromyces lactis genome was performed, resulting in the annotation of 1759 genes with metabolic functions, and the development of a methodology supported by merlin (software developed in-house). The new annotation includes novelties, such as the assignment of transporter superfamily numbers to genes identified as transporter proteins. Thus, the genes annotated with metabolic functions could be exclusively enzymatic (1410 genes), transporter protein-encoding genes (301 genes) or have both metabolic activities (48 genes). The new annotation produced by this work largely surpassed the currently available Kluyveromyces lactis annotations. A comparison with KEGG’s annotation revealed a match with 844 (~90%) of the genes annotated by KEGG, while adding 850 new gene annotations. Moreover, there are 32 genes with annotations different from KEGG. Conclusions: The methodology developed throughout this work can be used to re-annotate any yeast or, with a little tweak of the reference organism, the proteins encoded in any sequenced genome. The new annotation provided by this study offers basic knowledge which might be useful for the scientific community working on this model yeast, because new functions have been identified for the so-called metabolic genes. Furthermore, it served as the basis for the reconstruction of a compartmentalized, genome-scale metabolic model of Kluyveromyces lactis, which is
Shen, Xun; Collier, John Michael; Hlaing, Myint; Zhang, Leanne; Delshad, Elizabeth H.; Bristow, James; Bernstein, Harold S.
Skeletal and cardiac myocytes cease division within weeks of birth. Although skeletal muscle retains limited capacity for regeneration through recruitment of satellite cells, resident populations of adult myocardial stem cells have not been identified. Because cell cycle withdrawal accompanies myocyte differentiation, we hypothesized that C2C12 cells, a mouse myoblast cell line previously used to characterize myocyte differentiation, also would provide a model for studying cell cycle withdrawal during differentiation. C2C12 cells were differentiated in culture medium containing horse serum and harvested at various time points to characterize the expression profiles of known cell cycle and myogenic regulatory factors by immunoblot analysis. BrdU incorporation decreased dramatically in confluent cultures 48 hr after addition of horse serum, as cells started to form myotubes. This finding was preceded by up-regulation of MyoD, followed by myogenin, and activation of Bcl-2. Cyclin D1 was expressed in proliferating cultures and became undetectable in cultures containing 40 percent fused myotubes, as levels of p21(WAF1/Cip1) increased and alpha-actin became detectable. Because C2C12 myoblasts withdraw from the cell cycle during myocyte differentiation following a course that recapitulates this process in vivo, we performed a genome-wide screen to identify other gene products involved in this process. Using microarrays containing approximately 10,000 minimally redundant mouse sequences that map to the UniGene database of the National Center for Biotechnology Information, we compared gene expression profiles between proliferating, differentiating, and differentiated C2C12 cells and verified candidate genes demonstrating differential expression by RT-PCR. Cluster analysis of differentially expressed genes revealed groups of gene products involved in cell cycle withdrawal, muscle differentiation, and apoptosis. In addition, we identified several genes, including DDAH2 and Ly
Bergfelder-Drüing, Sarah; Grosse-Brinkhaus, Christine; Lind, Bianca; Erbe, Malena; Schellander, Karl; Simianer, Henner; Tholen, Ernst
The number of piglets born alive (NBA) per litter is one of the most important traits in pig breeding due to its influence on production efficiency. It is difficult to improve NBA because the heritability of the trait is low and it is governed by a high number of loci with low to moderate effects. To clarify the biological and genetic background of NBA, genome-wide association studies (GWAS) were performed using 4,012 Large White and Landrace pigs from herdbook and commercial breeding companies in Germany (3), Austria (1) and Switzerland (1). The animals were genotyped with the Illumina PorcineSNP60 BeadChip. Because of population stratifications within and between breeds, clusters were formed using the genetic distances between the populations. Five clusters for each breed were formed and analysed by GWAS approaches. In total, 17 different significant markers affecting NBA were found in regions with known effects on female reproduction. No overlapping significant chromosome areas or QTL between Large White and Landrace breed were detected.
Gurgul, A; Szmatoła, T; Ropka-Molik, K; Jasielczuk, I; Pawlina, K; Semik, E; Bugno-Poniewierska, M
The study aimed to identify selection footprints within the genome of Limousin cattle. Using the Extended Haplotype Homozygosity test, supplemented with a correction for variation in recombination rates across the genome, we created a map of selection footprints and detected 173 significant (p < 0.01) core haplotypes potentially under positive selection. Within these regions, a number of candidate genes associated inter alia with skeletal muscle growth (GDF15, BMP7, BMP4 and TGFB3) or postmortem proteolysis and meat maturation (CAPN1 and CAPN5) were annotated. Noticeable clusters of selection footprints were detected on chromosomes 1, 4, 8 and 14, which are known to carry several quantitative trait loci for growth traits and meat quality. The study provides information about the genes and metabolic pathways potentially modified under the influence of directional selection aimed at improving beef production characteristics in Limousin cattle.
Entropion is an inversion of the eyelid margin causing lashes or external hairs to rub against the ocular surface. If uncorrected, discomfort, ocular damage, increased eye infection rates, and potential blindness can occur. Entropion affects many mammalian species, can be expressed in both upper and...
A genome scan was conducted to identify QTL affecting milk yield in a Brazilian Gyr population of progeny test bulls (N=319). Data used in this study were derived from traditional genetic evaluation records computed by the Embrapa Dairy Cattle and released in May/2009 (http://www.cnpgl.embrapa.br/nova...
Zhang, Heping; Baldwin, Don A; Bukowski, Radek K; Parry, Samuel; Xu, Yaji; Song, Chi; Andrews, William W; Saade, George R; Esplin, M Sean; Sadovsky, Yoel; Reddy, Uma M; Ilekis, John; Varner, Michael; Biggio, Joseph R
Preterm birth is the leading cause of infant morbidity and mortality. Despite extensive research, the genetic contributions to spontaneous preterm birth (SPTB) are not well understood. Term controls were matched with cases by race/ethnicity, maternal age, and parity prior to recruitment. Genotyping was performed using Affymetrix SNP Array 6.0 assays. Statistical analyses utilized PLINK to compare allele occurrence rates between case and control groups, and incorporated quality control and multiple-testing adjustments. We analyzed DNA samples from mother-infant pairs from early SPTB cases (20 0/7 to 33 6/7 weeks, 959 women and 979 neonates) and term delivery controls (39 0/7 to 41 6/7 weeks, 960 women and 985 neonates). For validation purposes, we included an independent validation cohort consisting of early SPTB cases (293 mothers and 243 infants) and term controls (200 mothers and 149 infants). Clustering analysis revealed no population stratification. Multiple maternal SNPs were identified with association P-values between 10×10^-5 and 10×10^-6. The most significant maternal SNP was rs17053026 on chromosome 3 with an odds ratio (OR) of 0.44 and a P-value of 1.0×10^-6. Two neonatal SNPs reached the genome-wide significance threshold, including rs17527054 on chromosome 6p22 with a P-value of 2.7×10^-12 and rs3777722 on chromosome 6q27 with a P-value of 1.4×10^-10. However, we could not replicate these findings after adjusting for multiple comparisons in a validation cohort. This is the first report of a genome-wide case-control study to identify single nucleotide polymorphisms (SNPs) that correlate with SPTB.
Chen, Gengshen; Wang, Xiaoming; Hao, Junjie; Yan, Jianbing; Ding, Junqiang
Maize rough dwarf disease (MRDD) is a destructive viral disease in China, which causes maize yield losses of 20-30% in affected areas and as high as 100% in severely infected fields. Understanding the genetic basis of resistance will provide important insights for maize breeding programs. In this study, a diverse maize population comprising 527 inbred lines was evaluated in four environments and a genome-wide association study (GWAS) was undertaken with over 556,000 SNP markers. Fifteen candidate genes associated with MRDD resistance were identified, including ten genes with annotated protein-encoding functions. The homologs of nine candidate genes were predicted to relate to plant defense in different species based on published results. A significant correlation (R² = 0.79) between MRDD severity and the number of resistance alleles was observed. Consequently, we have broadened the germplasm resistant to MRDD and identified a number of resistance alleles by GWAS. The results of the present study also imply that the candidate genes in the defense pathway play an important role in resistance to MRDD in maize.
Diao, Wei-Ping; Snyder, John C.; Wang, Shu-Bin; Liu, Jin-Bing; Pan, Bao-Gui; Guo, Guang-Jun; Wei, Ge
The WRKY family of transcription factors is one of the most important families of plant transcriptional regulators, with members regulating multiple biological processes, especially defense against biotic and abiotic stresses. However, little information is available about WRKYs in pepper (Capsicum annuum L.). The recent release of completely assembled genome sequences of pepper allowed us to perform a genome-wide investigation of pepper WRKY proteins. In the present study, a total of 71 WRKY genes were identified in the pepper genome. According to structural features of their encoded proteins, the pepper WRKY genes (CaWRKY) were classified into three main groups, with the second group further divided into five subgroups. Genome mapping analysis revealed that CaWRKY genes were enriched on four chromosomes, especially on chromosome 1, and 15.5% of the family members were tandemly duplicated genes. A phylogenetic tree was constructed based on WRKY domain sequences derived from pepper and Arabidopsis. The expression of 21 selected CaWRKY genes in response to seven different biotic and abiotic stresses (salt, heat shock, drought, Phytophthora capsici, SA, MeJA, and ABA) was evaluated by quantitative RT-PCR; some CaWRKYs were highly expressed and up-regulated by stress treatment. Our results will provide a platform for functional identification and molecular breeding studies of WRKY genes in pepper. PMID:26941768
Röder, Marion S.; van Eeuwijk, Fred
Malting quality is an important trait in breeding barley (Hordeum vulgare L.). It requires elaborate, expensive phenotyping, which involves micro-malting experiments. Although there is abundant historical information available for different cultivars in different years and trials, that historical information is not often used in genetic analyses. This study aimed to exploit historical records to assist in identifying genomic regions that affect malting and kernel quality traits in barley. This genome-wide association study utilized information on grain yield and 18 quality traits accumulated over 25 years on 174 European spring and winter barley cultivars combined with diversity array technology markers. Marker-trait associations were tested with a mixed linear model. This model took into account the genetic relatedness between cultivars based on principal components scores obtained from marker information. We detected 140 marker-trait associations. Some of these associations confirmed previously known quantitative trait loci for malting quality (on chromosomes 1H, 2H, and 5H). Other associations were reported for the first time in this study. The genetic correlations between traits are discussed in relation to the chromosomal regions associated with the different traits. This approach is expected to be particularly useful when designing strategies for multiple trait improvements. PMID:25372869
Wang, Da-Wei; Li, Da; Wang, Junjun; Zhao, Yue; Wang, Zhaojun; Yue, Guidong; Liu, Xin; Qin, Huanju; Zhang, Kunpu; Dong, Lingli; Wang, Daowen
Gliadins, specified by six compound chromosomal loci (Gli-A1/B1/D1 and Gli-A2/B2/D2) in hexaploid bread wheat, are the dominant carriers of celiac disease (CD) epitopes. Because of their complexity, genome-wide characterization of gliadins is a strong challenge. Here, we approached this challenge by combining transcriptomic, proteomic and bioinformatic investigations. Through third-generation RNA sequencing, full-length transcripts were identified for 52 gliadin genes in the bread wheat cultivar Xiaoyan 81. Of them, 42 were active and predicted to encode 25 α-, 11 γ-, one δ- and five ω-gliadins. Comparative proteomic analysis between Xiaoyan 81 and six newly-developed mutants each lacking one Gli locus indicated the accumulation of 38 gliadins in the mature grains. A novel group of α-gliadins (the CSTT group) was recognized to contain very few or no CD epitopes. The δ-gliadins identified here or previously did not carry CD epitopes. Finally, the mutant lacking Gli-D2 showed significant reductions in the most celiac-toxic α-gliadins and derivative CD epitopes. The insights and resources generated here should aid further studies on gliadin functions in CD and the breeding of healthier wheat. PMID:28300172
Pausch, Hubert; Jung, Simone; Edel, Christian; Emmerling, Reiner; Krogmeier, Dieter; Götz, Kay-Uwe; Fries, Ruedi
Supernumerary teats (hyperthelia, SNTs) are a common abnormality of the bovine udder with a medium to high heritability and a postulated oligogenic or polygenic inheritance pattern. SNTs not only negatively affect machine milking ability but also act as a reservoir for bacteria. A genome-wide association study was carried out to identify genes involved in the development of SNTs in the dual-purpose Fleckvieh breed. A total of 2467 progeny-tested bulls were genotyped at 43 698 single nucleotide polymorphisms, and daughter yield deviations (DYDs) for 'udder clearness' (UC) were used as high-heritability phenotypes. Massive structuring of the study population was accounted for by principal components analysis-based and mixed model-based approaches. Four loci on BTA5, BTA6, BTA11 and BTA17 were significantly associated with the UC DYD. Three associated regions contain genes of the highly conserved Wnt signalling pathway. The four QTL together account for 10.7% of the variance of the UC DYD, whereas the major fraction of the DYD variance is attributable to chromosomes with no identified QTL. Our results support both an oligogenic and a polygenic inheritance pattern of SNTs in cattle. The identified candidate genes permit insights into the genetic architecture of teat malformations in cattle and provide clues to unravel the molecular mechanisms of mammary gland alterations in cattle and other species.
Pantalião, Gabriel Feresin; Narciso, Marcelo; Guimarães, Cléber; Castro, Adriano; Colombari, José Manoel; Breseghello, Flavio; Rodrigues, Luana; Vianello, Rosana Pereira; Borba, Tereza Oliveira; Brondani, Claudio
The identification of drought-tolerant rice materials is crucial for the development of the best-performing cultivars for the upland cultivation system. This study aimed to identify markers and candidate genes associated with drought tolerance by genome-wide association study (GWAS) analysis, in order to develop tools for use in rice breeding programs. The analysis used 175 upland rice accessions (Oryza sativa), evaluated in experiments with and without water restriction, and 150,325 SNPs. Thirteen SNP markers associated with yield under drought conditions were identified. Through stepwise regression analysis, eight SNP markers were selected and validated in silico, and when tested by PCR, two of the eight SNP markers were able to identify a group of rice genotypes with higher productivity under drought. These results are encouraging for deriving markers for routine use in marker-assisted selection. From the drought experiment, including the genes inherited in linkage blocks, 50 genes were identified, of which 30 were annotated, and 10 were previously related to drought and/or abiotic stress tolerance, such as the transcription factors WRKY and Apetala2, and protein kinases.
Roos, Thomas R.; Roos, Andrew K.; Kleimeyer, John P.; Ahmed, Marwa A.; Goodlin, Gabrielle T.; Fredericson, Michael; Ioannidis, John P. A.; Avins, Andrew L.; Dragoo, Jason L.
Achilles tendinopathy or rupture and anterior cruciate ligament (ACL) rupture are substantial injuries affecting athletes, associated with delayed recovery or inability to return to competition. To identify genetic markers that might be used to predict risk for these injuries, we performed genome-wide association screens for these injuries using data from the Genetic Epidemiology Research on Adult Health and Aging (GERA) cohort consisting of 102,979 individuals. We did not find any single nucleotide polymorphisms (SNPs) associated with either of these injuries with a p-value that was genome-wide significant (p < 5×10^-8). We found, however, four and three polymorphisms with p-values that were borderline significant (p < 10^-6) for Achilles tendon injury and ACL rupture, respectively. We then tested SNPs previously reported to be associated with either Achilles tendon injury or ACL rupture. None showed an association in our cohort with a false discovery rate of less than 5%. We obtained, however, moderate to weak evidence for replication in one case; specifically, rs4919510 in MIR608 had a p-value of 5.1×10^-3 for association with Achilles tendon injury, corresponding to a 7% chance of false replication. Finally, we tested 2855 SNPs in 90 candidate genes for musculoskeletal injury, but did not find any that showed a significant association below a false discovery rate of 5%. We provide data containing summary statistics for the entire genome, which will be useful for future genetic studies on these injuries. PMID:28358823
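The abstract above reports results at a false discovery rate of 5% but does not say which procedure was used. As an illustration only, the following is a minimal sketch of the Benjamini-Hochberg step-up procedure, one common way to control the FDR. The function name and the example p-values are hypothetical and are not taken from the study.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Return a list of booleans, True where the hypothesis is rejected.

    Benjamini-Hochberg step-up rule: sort the m p-values, find the largest
    rank k such that p_(k) <= (k / m) * fdr, and reject the k smallest.
    NOTE: assumed procedure for illustration; the abstract does not name one.
    """
    m = len(p_values)
    # Indices of the p-values in ascending order of p-value.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Largest rank whose p-value falls under its rank-specific threshold.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            max_rank = rank

    # Reject every hypothesis ranked at or below max_rank.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            rejected[idx] = True
    return rejected


# Hypothetical p-values for four candidate SNPs (not from the study).
example = [0.039, 0.001, 0.2, 0.008]
print(benjamini_hochberg(example, fdr=0.05))
```

Because the rule is step-up, a p-value that misses its own threshold can still be rejected when a larger p-value further down the ranking meets its threshold.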
Figueroa, Jonine D.; Han, Summer S.; Garcia-Closas, Montserrat; Baris, Dalsu; Jacobs, Eric J.; Kogevinas, Manolis; Schwenn, Molly; Malats, Nuria; Johnson, Alison; Purdue, Mark P.; Caporaso, Neil; Landi, Maria Teresa; Prokunina-Olsson, Ludmila; Wang, Zhaoming; Hutchinson, Amy; Burdette, Laurie; Wheeler, William; Vineis, Paolo; Siddiq, Afshan; Cortessis, Victoria K.; Kooperberg, Charles; Cussenot, Olivier; Benhamou, Simone; Prescott, Jennifer; Porru, Stefano; Bueno-de-Mesquita, H.Bas; Trichopoulos, Dimitrios; Ljungberg, Börje; Clavel-Chapelon, Françoise; Weiderpass, Elisabete; Krogh, Vittorio; Dorronsoro, Miren; Travis, Ruth; Tjønneland, Anne; Brenan, Paul; Chang-Claude, Jenny; Riboli, Elio; Conti, David; Gago-Dominguez, Manuela; Stern, Mariana C.; Pike, Malcolm C.; Van Den Berg, David; Yuan, Jian-Min; Hohensee, Chancellor; Rodabough, Rebecca; Cancel-Tassin, Geraldine; Roupret, Morgan; Comperat, Eva; Chen, Constance; De Vivo, Immaculata; Giovannucci, Edward; Hunter, David J.; Kraft, Peter; Lindstrom, Sara; Carta, Angela; Pavanello, Sofia; Arici, Cecilia; Mastrangelo, Giuseppe; Karagas, Margaret R.; Schned, Alan; Armenti, Karla R.; Hosain, G.M.Monawar; Haiman, Chris A.; Fraumeni, Joseph F.; Chanock, Stephen J.; Chatterjee, Nilanjan; Rothman, Nathaniel; Silverman, Debra T.
Bladder cancer is a complex disease with known environmental and genetic risk factors. We performed a genome-wide interaction study (GWAS) of smoking and bladder cancer risk based on primary scan data from 3002 cases and 4411 controls from the National Cancer Institute Bladder Cancer GWAS. Alternative methods were used to evaluate both additive and multiplicative interactions between individual single nucleotide polymorphisms (SNPs) and smoking exposure. SNPs with interaction P values < 5 × 10^-5 were evaluated further in an independent dataset of 2422 bladder cancer cases and 5751 controls. We identified 10 SNPs that showed association in a consistent manner with the initial dataset and in the combined dataset, providing evidence of interaction with tobacco use. Further, two of these novel SNPs showed strong evidence of association with bladder cancer in tobacco use subgroups that approached genome-wide significance. Specifically, rs1711973 (FOXF2) on 6p25.3 was a susceptibility SNP for never smokers [combined odds ratio (OR) = 1.34, 95% confidence interval (CI) = 1.20–1.50, P value = 5.18 × 10^-7]; and rs12216499 (RSPH3-TAGAP-EZR) on 6q25.3 was a susceptibility SNP for ever smokers (combined OR = 0.75, 95% CI = 0.67–0.84, P value = 6.35 × 10^-7). In our analysis of smoking and bladder cancer, the tests for multiplicative interaction seemed to more commonly identify susceptibility loci with associations in never smokers, whereas the additive interaction analysis identified more loci with associations among smokers, including the known smoking and NAT2 acetylation interaction. Our findings provide additional evidence of gene–environment interactions for tobacco and bladder cancer. PMID:24662972
Körber, Niklas; Bus, Anja; Li, Jinquan; Parkin, Isobel A. P.; Wittkop, Benjamin; Snowdon, Rod J.; Stich, Benjamin
In Brassica napus breeding, traits related to commercial success are of highest importance for plant breeders. However, such traits can only be assessed in an advanced developmental stage. Molecular markers genetically linked to such traits have the potential to accelerate the breeding process of B. napus by marker-assisted selection. Therefore, the objectives of this study were to identify (i) genome regions associated with the examined agronomic and seed quality traits, (ii) the interrelationship of population structure and the detected associations, and (iii) candidate genes for the revealed associations. The diversity set used in this study consisted of 405 B. napus inbred lines which were genotyped using a 6K single nucleotide polymorphism (SNP) array and phenotyped for agronomic and seed quality traits in field trials. In a genome-wide association study, we detected a total of 112 associations between SNPs and the seed quality traits as well as 46 SNP-trait associations for the agronomic traits with a P < 1.28e-05 (Bonferroni correction of α = 0.05) for the inbreds of the spring and winter trial. For the seed quality traits, a single SNP-sulfur concentration in seeds (SUL) association explained up to 67.3% of the phenotypic variance, whereas for the agronomic traits, a single SNP-blossom color (BLC) association explained up to 30.2% of the phenotypic variance. In a basic local alignment search tool (BLAST) search within a distance of 2.5 Mbp around these SNP-trait associations, 62 hits of potential candidate genes with a BLAST-score of ≥100 and a sequence identity of ≥70% to A. thaliana or B. rapa could be found for the agronomic SNP-trait associations and 187 hits of potential candidate genes for the seed quality SNP-trait associations. PMID:27066036
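The threshold quoted in the abstract above (P < 1.28e-05 as the Bonferroni correction of α = 0.05) follows from dividing α by the number of tests. The short sketch below shows that arithmetic. The abstract does not state the number of tests; the figure of roughly 3906 is only what the two reported numbers imply, and the function names are illustrative.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold that keeps the family-wise error rate at alpha."""
    if n_tests <= 0:
        raise ValueError("n_tests must be a positive integer")
    return alpha / n_tests


def implied_number_of_tests(alpha: float, threshold: float) -> float:
    """Invert the Bonferroni rule: how many tests does a given threshold imply?"""
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    return alpha / threshold


# Values reported in the abstract: alpha = 0.05, threshold = 1.28e-05.
n_implied = implied_number_of_tests(0.05, 1.28e-05)
print(round(n_implied))                      # inferred test count, not stated in the source
print(bonferroni_threshold(0.05, round(n_implied)))
```

The implied count is lower than the nominal size of a 6K array, which would be consistent with marker filtering before the analysis, although the abstract does not say so.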
Upadhyaya, Hari D.; Bajaj, Deepak; Narnoliya, Laxmi; Das, Shouvik; Kumar, Vinod; Gowda, C. L. L.; Sharma, Shivali; Tyagi, Akhilesh K.; Parida, Swarup K.
Identification of potential genes/alleles governing complex seed-protein content (SPC) is essential in marker-assisted breeding for quality trait improvement of chickpea. Henceforth, the present study utilized an integrated genomics-assisted breeding strategy encompassing trait association analysis, selective genotyping in traditional bi-parental mapping population and differential expression profiling for the first-time to understand the complex genetic architecture of quantitative SPC trait in chickpea. For GWAS (genome-wide association study), high-throughput genotyping information of 16376 genome-based SNPs (single nucleotide polymorphism) discovered from a structured population of 336 sequenced desi and kabuli accessions [with 150–200 kb LD (linkage disequilibrium) decay] was utilized. This led to identification of seven most effective genomic loci (genes) associated [10–20% with 41% combined PVE (phenotypic variation explained)] with SPC trait in chickpea. Regardless of the diverse desi and kabuli genetic backgrounds, a comparable level of association potential of the identified seven genomic loci with SPC trait was observed. Five SPC-associated genes were validated successfully in parental accessions and homozygous individuals of an intra-specific desi RIL (recombinant inbred line) mapping population (ICC 12299 × ICC 4958) by selective genotyping. The seed-specific expression, including differential up-regulation (>four fold) of six SPC-associated genes particularly in accessions, parents and homozygous individuals of the aforementioned mapping population with a high level of contrasting SPC (21–22%) was evident. Collectively, the integrated genomic approach delineated diverse naturally occurring novel functional SNP allelic variants in six potential candidate genes regulating SPC trait in chickpea. Of these, a non-synonymous SNP allele-carrying zinc finger transcription factor gene exhibiting strong association with SPC trait was found to be the most
Background DNA methylation (DNAm) has important regulatory roles in many biological processes and diseases. It is the only epigenetic mark with a clear mechanism of mitotic inheritance and the only one easily available on a genome scale. Aberrant cytosine-phosphate-guanine (CpG) methylation has been discussed in the context of disease aetiology, especially cancer. CpG hypermethylation of promoter regions is often associated with silencing of tumour suppressor genes and hypomethylation with activation of oncogenes. Supervised principal component analysis (SPCA) is a popular machine learning method. However, in a recent application to phenotype prediction from DNAm data SPCA was inferior to the specific method EVORA. Results We present Model-Selection-SPCA (MS-SPCA), an enhanced version of SPCA. MS-SPCA applies several models that perform well in the training data to the test data and selects the very best models for final prediction based on parameters of the test data. We have applied MS-SPCA for phenotype prediction from genome-wide DNAm data. CpGs used for prediction are selected based on the quantification of three features of their methylation (average methylation difference, methylation variation difference and methylation-age-correlation). We analysed four independent case–control datasets that correspond to different stages of cervical cancer: (i) cases currently cytologically normal, but will later develop neoplastic transformations, (ii, iii) cases showing neoplastic transformations and (iv) cases with confirmed cancer. The first dataset was split into several smaller case–control datasets (samples either Human Papilloma Virus (HPV) positive or negative). We demonstrate that cytology normal HPV+ and HPV- samples contain DNAm patterns which are associated with later neoplastic transformations. We present evidence that DNAm patterns exist in cytology normal HPV- samples that (i) predispose to neoplastic transformations after HPV infection and (ii
Panagiotou, Orestis A; Travis, Ruth C; Campa, Daniele; Berndt, Sonja I.; Lindstrom, Sara; Kraft, Peter; Schumacher, Fredrick R.; Siddiq, Afshan; Papatheodorou, Stefania I.; Stanford, Janet L.; Albanes, Demetrius; Virtamo, Jarmo; Weinstein, Stephanie J.; Diver, W. Ryan; Gapstur, Susan M.; Stevens, Victoria L.; Boeing, Heiner; Bueno-de-Mesquita, H. Bas; Gurrea, Aurelio Barricarte; Kaaks, Rudolf; Khaw, Kay-Tee; Krogh, Vittorio; Overvad, Kim; Riboli, Elio; Trichopoulos, Dimitrios; Giovannucci, Edward; Stampfer, Meir; Haiman, Christopher; Henderson, Brian; Le Marchand, Loic; Gaziano, J. Michael; Hunter, David J.; Koutros, Stella; Yeager, Meredith; Hoover, Robert N.; Chanock, Stephen J.; Wacholder, Sholom; Key, Timothy J.; Tsilidis, Konstantinos K
Background No single-nucleotide polymorphisms (SNPs) specific for aggressive prostate cancer have been identified in genome-wide association studies (GWAS). Objective To test if SNPs associated with other traits may also affect the risk of aggressive prostate cancer. Design, setting, and participants SNPs implicated in any phenotype other than prostate cancer (p ≤ 10^-7) were identified through the catalog of published GWAS and tested in 2891 aggressive prostate cancer cases and 4592 controls from the Breast and Prostate Cancer Cohort Consortium (BPC3). The 40 most significant SNPs were followed up in 4872 aggressive prostate cancer cases and 24 534 controls from the Prostate Cancer Association Group to Investigate Cancer Associated Alterations in the Genome (PRACTICAL) consortium. Outcome measurements and statistical analysis Odds ratios (ORs) and 95% confidence intervals (CIs) for aggressive prostate cancer were estimated. Results and limitations A total of 4666 SNPs were evaluated by the BPC3. Two signals were seen in regions already reported for prostate cancer risk. rs7014346 at 8q24.21 was marginally associated with aggressive prostate cancer in the BPC3 trial (p = 1.6 × 10^-6), whereas after meta-analysis by PRACTICAL the summary OR was 1.21 (95% CI 1.16–1.27; p = 3.22 × 10^-18). rs9900242 at 17q24.3 was also marginally associated with aggressive disease in the meta-analysis (OR 0.90, 95% CI 0.86–0.94; p = 2.5 × 10^-6). Neither of these SNPs remained statistically significant when conditioning on correlated known prostate cancer SNPs. The meta-analysis by BPC3 and PRACTICAL identified a third promising signal, marked by rs16844874 at 2q34, independent of known prostate cancer loci (OR 1.12, 95% CI 1.06–1.19; p = 4.67 × 10^-5); it has been shown that SNPs correlated with this signal affect glycine concentrations. The main limitation is the heterogeneity in the definition of aggressive prostate cancer between BPC3 and PRACTICAL. Conclusions We did
Bhatti, Parveen; Zhang, Yuzheng; Song, Xiaoling; Makar, Karen W; Sather, Cassandra L; Kelsey, Karl T; Houseman, E Andres; Wang, Pei
The negative health effects of shift work, including carcinogenesis, may be mediated by changes in DNA methylation, particularly in the circadian genes. Using the Infinium HumanMethylation450 Bead Array (Illumina, San Diego, CA), we compared genome-wide methylation between 65 actively working dayshift workers and 59 actively working nightshift workers in the healthcare industry. A total of 473 800 loci, including 391 loci across the 12 core circadian genes, were analyzed to identify methylation markers associated with shift work status using linear regression models adjusted for gender, age, body mass index, race, smoking status and leukocyte cell profile as measured by flow cytometry. Analyses at the level of gene, CpG island and gene region were also conducted. To account for multiple comparisons, we controlled the false discovery rate (FDR ≤0.05). Significant differences between nightshift and dayshift workers were found at 16 135 of 473 800 loci, across 3769 of 20 164 genes, across 7173 of 22 721 CpG islands and across 5508 of 51 843 gene regions. For each significant locus, gene, CpG island or gene region, average methylation was consistently found to be decreased among nightshift workers compared to dayshift workers. Twenty-one loci located in the circadian genes were also found to be significantly hypomethylated among nightshift workers. The largest differences were observed for three loci located in the gene body of PER3. A total of nine significant loci were found in the CSNK1E gene, most of which were located in a CpG island and near the transcription start site of the gene. Methylation changes in these circadian genes may lead to altered expression of these genes, which has been associated with cancer in previous studies. Gene ontology enrichment analysis revealed that among the significantly hypomethylated genes, processes related to host defense and immunity were represented. Our results indicate that the health effects of shift work may be
Johnson, Daniel H.; Venuto, Charles; Ritchie, Marylyn D.; Morse, Gene D.; Daar, Eric S.; McLaren, Paul J.; Haas, David W.
Background Atazanavir-associated hyperbilirubinemia can cause premature discontinuation of atazanavir and avoidance of its initial prescribing. We used genome-wide genotyping and clinical data to characterize determinants of atazanavir pharmacokinetics and hyperbilirubinemia in AIDS Clinical Trials Group protocol A5202. Methods Plasma atazanavir pharmacokinetics and indirect bilirubin concentrations were characterized in HIV-1-infected subjects randomized to atazanavir/ritonavir-containing regimens. A subset had genome-wide genotype data available. Results Genome-wide assay data were available from 542 subjects, of whom 475 also had estimated atazanavir clearance and relevant covariate data available. Peak bilirubin concentration and relevant covariates were available for 443 participants. By multivariate analysis, higher peak on-treatment bilirubin was associated with UGT1A1 rs887829 T allele (P=6.4×10−12), higher baseline hemoglobin (P=4.9×10−13), higher baseline bilirubin (P=6.7×10−12), and slower plasma atazanavir clearance (P=8.6×10−11). For peak bilirubin >3.0 mg/dL, the positive predictive value of baseline bilirubin ≥0.5 mg/dL with hemoglobin ≥14 g/dL was 0.51, which increased to 0.85 with rs887829 TT homozygosity. For peak bilirubin ≤3.0 mg/dL, the positive predictive value of baseline bilirubin <0.5 mg/dL with hemoglobin <14 g/dL was 0.91, which increased to 0.96 with rs887829 CC homozygosity. No polymorphism predicted atazanavir pharmacokinetics at genome-wide significance. Conclusions Atazanavir-associated hyperbilirubinemia is best predicted by considering UGT1A1 genotype, baseline bilirubin, and baseline hemoglobin values in combination. Use of ritonavir as a pharmacokinetic enhancer may have abrogated genetic associations with atazanavir pharmacokinetics. PMID:24557078
Ritter, McKenzie L.; Guo, Wei; Samuels, Jack F.; Wang, Ying; Nestadt, Paul S.; Krasnow, Janice; Greenberg, Benjamin D.; Fyer, Abby J.; McCracken, James T.; Geller, Daniel A.; Murphy, Dennis L.; Knowles, James A.; Grados, Marco A.; Riddle, Mark A.; Rasmussen, Steven A.; McLaughlin, Nicole C.; Nurmi, Erika L.; Askland, Kathleen D.; Cullen, Bernadette; Piacentini, John; Pauls, David L.; Bienvenu, Joseph; Stewart, Evelyn; Goes, Fernando S.; Maher, Brion; Pulver, Ann E.; Mattheisen, Manuel; Qian, Ji; Nestadt, Gerald; Shugart, Yin Yao
Objective: The aim of this study was to identify any potential genetic overlap between attention deficit hyperactivity disorder (ADHD) and obsessive compulsive disorder (OCD). We hypothesized that since these disorders share a sub-phenotype, they may share common risk alleles. In this manuscript, we report the overlap found between these two disorders. Methods: A meta-analysis was conducted between ADHD and OCD, and polygenic risk scores (PRS) were calculated for both disorders. In addition, a protein-protein analysis was completed in order to examine the interactions between proteins; p-values for the protein-protein interaction analysis were calculated using permutation. Conclusion: None of the single nucleotide polymorphisms (SNPs) reached genome-wide significance and there was little evidence of genetic overlap between ADHD and OCD. PMID:28386217
Karlas, Alexander; Berre, Stefano; Couderc, Thérèse; Varjak, Margus; Braun, Peter; Meyer, Michael; Gangneux, Nicolas; Karo-Astover, Liis; Weege, Friderike; Raftery, Martin; Schönrich, Günther; Klemm, Uwe; Wurzlbauer, Anne; Bracher, Franz; Merits, Andres; Meyer, Thomas F.; Lecuit, Marc
Chikungunya virus (CHIKV) is a globally spreading alphavirus against which there is no commercially available vaccine or therapy. Here we use a genome-wide siRNA screen to identify 156 proviral and 41 antiviral host factors affecting CHIKV replication. We analyse the cellular pathways in which human proviral genes are involved and identify druggable targets. Twenty-one small-molecule inhibitors, some of which are FDA approved, targeting six proviral factors or pathways, have high antiviral activity in vitro, with low toxicity. Three identified inhibitors have prophylactic antiviral effects in mouse models of chikungunya infection. Two of them, the calmodulin inhibitor pimozide and the fatty acid synthesis inhibitor TOFA, have a therapeutic effect in vivo when combined. These results demonstrate the value of loss-of-function screening and pathway analysis for the rational identification of small molecules with therapeutic potential and pave the way for the development of new, host-directed, antiviral agents. PMID:27177310
Schumacher, Fredrick R.; Schmit, Stephanie L.; Jiao, Shuo; Edlund, Christopher K.; Wang, Hansong; Zhang, Ben; Hsu, Li; Huang, Shu-Chen; Fischer, Christopher P.; Harju, John F.; Idos, Gregory E.; Lejbkowicz, Flavio; Manion, Frank J.; McDonnell, Kevin; McNeil, Caroline E.; Melas, Marilena; Rennert, Hedy S.; Shi, Wei; Thomas, Duncan C.; Van Den Berg, David J.; Hutter, Carolyn M.; Aragaki, Aaron K.; Butterbach, Katja; Caan, Bette J.; Carlson, Christopher S.; Chanock, Stephen J.; Curtis, Keith R.; Fuchs, Charles S.; Gala, Manish; Giovannucci, Edward L.; Gogarten, Stephanie M.; Hayes, Richard B.; Henderson, Brian; Hunter, David J.; Jackson, Rebecca D.; Kolonel, Laurence N.; Kooperberg, Charles; Küry, Sébastien; LaCroix, Andrea; Laurie, Cathy C.; Laurie, Cecelia A.; Lemire, Mathieu; Levine, David; Ma, Jing; Makar, Karen W.; Qu, Conghui; Taverna, Darin; Ulrich, Cornelia M.; Wu, Kana; Kono, Suminori; West, Dee W.; Berndt, Sonja I.; Bezieau, Stéphane; Brenner, Hermann; Campbell, Peter T.; Chan, Andrew T.; Chang-Claude, Jenny; Coetzee, Gerhard A.; Conti, David V.; Duggan, David; Figueiredo, Jane C.; Fortini, Barbara K.; Gallinger, Steven J.; Gauderman, W. James; Giles, Graham; Green, Roger; Haile, Robert; Harrison, Tabitha A.; Hoffmeister, Michael; Hopper, John L.; Hudson, Thomas J.; Jacobs, Eric; Iwasaki, Motoki; Jee, Sun Ha; Jenkins, Mark; Jia, Wei-Hua; Joshi, Amit; Li, Li; Lindor, Noralene M.; Matsuo, Keitaro; Moreno, Victor; Mukherjee, Bhramar; Newcomb, Polly A.; Potter, John D.; Raskin, Leon; Rennert, Gad; Rosse, Stephanie; Severi, Gianluca; Schoen, Robert E.; Seminara, Daniela; Shu, Xiao-Ou; Slattery, Martha L.; Tsugane, Shoichiro; White, Emily; Xiang, Yong-Bing; Zanke, Brent W.; Zheng, Wei; Le Marchand, Loic; Casey, Graham; Gruber, Stephen B.; Peters, Ulrike
Genetic susceptibility to colorectal cancer is caused by rare pathogenic mutations and common genetic variants that contribute to familial risk. Here we report the results of a two-stage association study with 18,299 cases of colorectal cancer and 19,656 controls, with follow-up of the most statistically significant genetic loci in 4,725 cases and 9,969 controls from two Asian consortia. We describe six new susceptibility loci reaching a genome-wide threshold of P<5.0E-08. These findings provide additional insight into the underlying biological mechanisms of colorectal cancer and demonstrate the scientific value of large consortia-based genetic epidemiology studies. PMID:26151821
Background A popular objective of many high-throughput genome projects is to discover various genomic markers associated with traits and develop statistical models to predict traits of future patients based on marker values. Results In this paper, we present a prediction method for time-to-event traits using genome-wide single-nucleotide polymorphisms (SNPs). We also propose a MaxTest for association between a time-to-event trait and a SNP that accounts for the SNP's possible genetic models. The proposed MaxTest can help screen out nonprognostic SNPs and identify genetic models of prognostic SNPs. The performance of the proposed method is evaluated through simulations. Conclusions In conjunction with the MaxTest, the proposed method provides more parsimonious prediction models but includes more prognostic SNPs than some naive prediction methods. The proposed method is demonstrated with real GWAS data. PMID:23418752
Hillenmeyer, Sara; Davis, Lea K.; Gamazon, Eric R.; Cook, Edwin H.; Cox, Nancy J.; Altman, Russ B.
Motivation: Analyzing genome wide association data in the context of biological pathways helps us understand how genetic variation influences phenotype and increases power to find associations. However, the utility of pathway-based analysis tools is hampered by undercuration and reliance on a distribution of signal across all of the genes in a pathway. Methods that combine genome wide association results with genetic networks to infer the key phenotype-modulating subnetworks combat these issues, but have primarily been limited to network definitions with yes/no labels for gene-gene interactions. A recent method (EW_dmGWAS) incorporates a biological network with weighted edge probability by requiring a secondary phenotype-specific expression dataset. In this article, we combine an algorithm for weighted-edge module searching and a probabilistic interaction network in order to develop a method, STAMS, for recovering modules of genes with strong associations to the phenotype and probable biologic coherence. Our method builds on EW_dmGWAS but does not require a secondary expression dataset and performs better in six test cases. Results: We show that our algorithm improves over EW_dmGWAS and standard gene-based analysis by measuring precision and recall of each method on separately identified associations. In the Wellcome Trust Rheumatoid Arthritis study, STAMS-identified modules were more enriched for separately identified associations than EW_dmGWAS (STAMS P-value 3.0 × 10−4; EW_dmGWAS- P-value = 0.8). We demonstrate that the area under the Precision-Recall curve is 5.9 times higher with STAMS than EW_dmGWAS run on the Wellcome Trust Type 1 Diabetes data. Availability and Implementation: STAMS is implemented as an R package and is freely available at https://simtk.org/projects/stams. Contact: email@example.com Supplementary information: Supplementary data are available at Bioinformatics online. PMID:27542772
Mirkov, Maša Umiċeviċ; Cui, Jing; Vermeulen, Sita H; Stahl, Eli A.; Toonen, Erik JM; Makkinje, Remco R; Lee, Annette T; Huizinga, Tom WJ; Allaart, Renee; Barton, Anne; Mariette, Xavier; Miceli-Richard, Corinne; Criswell, Lindsey A; Tak, Paul P; de Vries, Niek; Saevarsdottir, Saedis; Padyukov, Leonid; Bridges, S. Louis; van Schaardenburg, Dirk-Jan; Jansen, Tim; Dutmer, Ellen AJ; van de Laar, Mart; Barrera, Pilar; Radstake, Timothy RDJ; van Riel, Piet LCM; Scheffer, Hans; Franke, Barbara; Brunner, Han G; Plenge, Robert M; Gregersen, Peter K; Guchelaar, Henk-Jan; Coenen, Marieke JH
Background Treatment strategies blocking tumor necrosis factor (anti-TNF) have proven very successful in patients with rheumatoid arthritis (RA). However, a significant subset of patients does not respond for unknown reasons. Currently there are no means of identifying these patients prior to treatment. This study was aimed at identifying genetic factors predicting anti-TNF treatment outcome in patients with RA using a genome-wide association approach. Methods We conducted a multi-stage, genome-wide association study with a primary analysis of 2,557,253 single nucleotide polymorphisms (SNPs) in 882 RA patients receiving anti-TNF therapy included through the Dutch Rheumatoid Arthritis Monitoring (DREAM) registry and the database of Apotheekzorg. Linear regression analysis of changes in the Disease Activity Score in 28 joints after 14 weeks of treatment was performed using an additive model. Markers with a p-value < 10−3 were selected for replication in 1,821 RA patients from three independent cohorts. Pathway analysis including all SNPs with a p-value < 10−3 was performed using Ingenuity. Results Seven hundred seventy-two markers demonstrated evidence of association with treatment outcome in the initial stage. Eight genetic loci showed an improved p-value in the overall meta-analysis compared to the first stage, three of which (rs1568885, rs1813443 and rs4411591) showed directional consistency over all four studied cohorts. We were unable to replicate markers previously reported to be associated with anti-TNF outcome. Network analysis indicated strong involvement of biological processes underlying inflammatory response and cell morphology. Conclusion Using a multi-stage strategy, we have identified 8 genetic loci associated with response to anti-TNF treatment. Further studies are required to validate these findings in additional patient collections. PMID:23233654
Felix, Janine F.; Bradfield, Jonathan P.; Monnereau, Claire; van der Valk, Ralf J.P.; Stergiakouli, Evie; Chesi, Alessandra; Gaillard, Romy; Feenstra, Bjarke; Thiering, Elisabeth; Kreiner-Møller, Eskil; Mahajan, Anubha; Pitkänen, Niina; Joro, Raimo; Cavadino, Alana; Huikari, Ville; Franks, Steve; Groen-Blokhuis, Maria M.; Cousminer, Diana L.; Marsh, Julie A.; Lehtimäki, Terho; Curtin, John A.; Vioque, Jesus; Ahluwalia, Tarunveer S.; Myhre, Ronny; Price, Thomas S.; Vilor-Tejedor, Natalia; Yengo, Loïc; Grarup, Niels; Ntalla, Ioanna; Ang, Wei; Atalay, Mustafa; Bisgaard, Hans; Blakemore, Alexandra I.; Bonnefond, Amelie; Carstensen, Lisbeth; Eriksson, Johan; Flexeder, Claudia; Franke, Lude; Geller, Frank; Geserick, Mandy; Hartikainen, Anna-Liisa; Haworth, Claire M.A.; Hirschhorn, Joel N.; Hofman, Albert; Holm, Jens-Christian; Horikoshi, Momoko; Hottenga, Jouke Jan; Huang, Jinyan; Kadarmideen, Haja N.; Kähönen, Mika; Kiess, Wieland; Lakka, Hanna-Maaria; Lakka, Timo A.; Lewin, Alexandra M.; Liang, Liming; Lyytikäinen, Leo-Pekka; Ma, Baoshan; Magnus, Per; McCormack, Shana E.; McMahon, George; Mentch, Frank D.; Middeldorp, Christel M.; Murray, Clare S.; Pahkala, Katja; Pers, Tune H.; Pfäffle, Roland; Postma, Dirkje S.; Power, Christine; Simpson, Angela; Sengpiel, Verena; Tiesler, Carla M.T.; Torrent, Maties; Uitterlinden, André G.; van Meurs, Joyce B.; Vinding, Rebecca; Waage, Johannes; Wardle, Jane; Zeggini, Eleftheria; Zemel, Babette S.; Dedoussis, George V.; Pedersen, Oluf; Froguel, Philippe; Sunyer, Jordi; Plomin, Robert; Jacobsson, Bo; Hansen, Torben; Gonzalez, Juan R.; Custovic, Adnan; Raitakari, Olli T.; Pennell, Craig E.; Widén, Elisabeth; Boomsma, Dorret I.; Koppelman, Gerard H.; Sebert, Sylvain; Järvelin, Marjo-Riitta; Hyppönen, Elina; McCarthy, Mark I.; Lindi, Virpi; Harri, Niinikoski; Körner, Antje; Bønnelykke, Klaus; Heinrich, Joachim; Melbye, Mads; Rivadeneira, Fernando; Hakonarson, Hakon; Ring, Susan M.; Smith, George Davey; Sørensen, Thorkild I.A.; Timpson, Nicholas J.; Grant, Struan F.A.; Jaddoe, Vincent W.V.
A large number of genetic loci are associated with adult body mass index. However, the genetics of childhood body mass index are largely unknown. We performed a meta-analysis of genome-wide association studies of childhood body mass index, using sex- and age-adjusted standard deviation scores. We included 35 668 children from 20 studies in the discovery phase and 11 873 children from 13 studies in the replication phase. In total, 15 loci reached genome-wide significance (P-value < 5 × 10−8) in the joint discovery and replication analysis, of which 12 are previously identified loci in or close to ADCY3, GNPDA2, TMEM18, SEC16B, FAIM2, FTO, TFAP2B, TNNI3K, MC4R, GPR61, LMX1B and OLFM4 associated with adult body mass index or childhood obesity. We identified three novel loci: rs13253111 near ELP3, rs8092503 near RAB27B and rs13387838 near ADAM23. Per additional risk allele, body mass index increased 0.04 Standard Deviation Score (SDS) [Standard Error (SE) 0.007], 0.05 SDS (SE 0.008) and 0.14 SDS (SE 0.025), for rs13253111, rs8092503 and rs13387838, respectively. A genetic risk score combining all 15 SNPs showed that each additional average risk allele was associated with a 0.073 SDS (SE 0.011, P-value = 3.12 × 10−10) increase in childhood body mass index in a population of 1955 children. This risk score explained 2% of the variance in childhood body mass index. This study highlights the shared genetic background between childhood and adult body mass index and adds three novel loci. These loci likely represent age-related differences in strength of the associations with body mass index. PMID:26604143
Ng, Esther; Lind, P. Monica; Lindgren, Cecilia; Ingelsson, Erik; Mahajan, Anubha; Morris, Andrew; Lind, Lars
The accumulation of toxic metals in the human body is influenced by exposure and mechanisms involved in metabolism, some of which may be under genetic control. This is the first genome-wide association study to investigate variants associated with whole blood levels of a range of toxic metals. Eleven toxic metals and trace elements (aluminium, cadmium, cobalt, copper, chromium, mercury, manganese, molybdenum, nickel, lead and zinc) were assayed in a cohort of 949 individuals using mass spectrometry. DNA samples were genotyped on the Infinium Omni Express bead microarray and imputed up to reference panels from the 1000 Genomes Project. Analyses revealed two regions associated with manganese level at genome-wide significance, mapping to 4q24 and 1q41. The lead single nucleotide polymorphism (SNP) in the 4q24 locus was rs13107325 (P-value = 5.1 × 10−11, β = −0.77), located in an exon of SLC39A8, which encodes a protein involved in manganese and zinc transport. The lead SNP in the 1q41 locus is rs1776029 (P-value = 2.2 × 10−14, β = −0.46). The SNP lies within the intronic region of SLC30A10, another transporter protein. Among other metals, the loci 6q14.1 and 3q26.32 were associated with cadmium and mercury levels (P = 1.4 × 10−10, β = −1.2 and P = 1.8 × 10−9, β = −1.8, respectively). Whole blood measurements of toxic metals are associated with genetic variants in metal transporter genes and others. This is relevant in inferring metabolic pathways of metals and identifying subsets of individuals who may be more susceptible to metal toxicity. PMID:26025379
Kuo, Po-Hsiu; Chuang, Li-Chung; Su, Mei-Hsin; Chen, Chia-Hsiang; Chen, Chien-Hsiun; Wu, Jer-Yuarn; Yen, Chung-Jen; Wu, Yu-Yu; Liu, Shih-Kai; Chou, Miao-Chun; Chou, Wen-Jiun; Chiu, Yen-Nan; Tsai, Wen-Che; Gau, Susan Shur-Fen
Background Autism spectrum disorder (ASD) is a neurodevelopmental disorder with strong genetic components. Several recent genome-wide association (GWA) studies in Caucasian samples have reported a number of gene regions and loci correlated with the risk of ASD, albeit with very little consensus across studies. Methods A two-stage GWA study was employed to identify common genetic variants for ASD in the Taiwanese Han population. The discovery stage included 315 patients with ASD and 1,115 healthy controls, using the Affymetrix SNP array 6.0 platform for genotyping. Several gene regions were then selected for fine-mapping and top markers were examined in extended samples. Single marker, haplotype, gene-based, and pathway analyses were conducted for associations. Results Seven SNPs had p-values ranging from 3.4 × 10−6 to 9.9 × 10−6, but none reached the genome-wide significance level. Five of them were mapped to three known genes (OR2M4, STYK1, and MNT) with significant empirical gene-based p-values in OR2M4 (p = 3.4 × 10−5) and MNT (p = 0.0008). Results of the fine-mapping study showed single-marker associations in the GLIS1 (rs12082358 and rs12080993) and NAALADL2 (rs3914502 and rs2222447) genes, and gene-based associations for the OR2M3-OR2T5 (olfactory receptor genes, p = 0.02) and GLIPR1/KRR1 gene regions (p = 0.015). Pathway analyses revealed important pathways for ASD, such as olfactory and G protein–coupled receptor signaling pathways. Conclusions We reported Taiwanese Han specific susceptibility genes and variants for ASD. However, further replication in other Asian populations is warranted to validate our findings. Investigation of the biological functions of our reported genetic variants might also allow for better understanding of the underlying pathogenesis of autism. PMID:26398136
Hofer, Edith; Cavalieri, Margherita; Bis, Joshua C; DeCarli, Charles; Fornage, Myriam; Sigurdsson, Sigurdur; Srikanth, Velandai; Trompet, Stella; Verhaaren, Benjamin FJ; Wolf, Christiane; Yang, Qiong; Adams, Hieab HH; Amouyel, Philippe; Beiser, Alexa; Buckley, Brendan M; Callisaya, Michele; Chauhan, Ganesh; de Craen, Anton JM; Dufouil, Carole; van Duijn, Cornelia M; Ford, Ian; Freudenberger, Paul; Gottesman, Rebecca F; Gudnason, Vilmundur; Heiss, Gerardo; Hofman, Albert; Lumley, Thomas; Martinez, Oliver; Mazoyer, Bernard; Moran, Chris; Niessen, Wiro J.; Phan, Thanh; Psaty, Bruce M; Satizabal, Claudia L; Sattar, Naveed; Schilling, Sabrina; Shibata, Dean K; Slagboom, P Eline; Smith, Albert; Stott, David J; Taylor, Kent D; Thomson, Russell; Töglhofer, Anna M; Tzourio, Christophe; van Buchem, Mark; Wang, Jing; Westendorp, Rudi GJ; Windham, B Gwen; Vernooij, Meike W; Zijdenbos, Alex; Beare, Richard; Debette, Stéphanie; Ikram, M Arfan; Jukema, J Wouter; Launer, Lenore J; Longstreth, W T; Mosley, Thomas H; Seshadri, Sudha; Schmidt, Helena; Schmidt, Reinhold
Background and Purpose White matter lesion (WML) progression on magnetic resonance imaging (MRI) is related to cognitive decline and stroke, but its determinants besides baseline WML burden are largely unknown. Here, we estimated heritability of WML progression, and sought common genetic variants associated with WML progression in elderly participants from the Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) consortium. Methods Heritability of WML progression was calculated in the Framingham Heart Study. The genome-wide association study included 7773 elderly participants from 10 cohorts. To assess the relative contribution of genetic factors to progression of WML, we compared in seven cohorts risk models including demographics, vascular risk factors plus single nucleotide polymorphisms (SNPs) that have been shown to be associated cross-sectionally with WML in the current and previous association studies. Results A total of 1085 subjects showed WML progression. The heritability estimate for WML progression was low at 6.5%, and no SNPs achieved genome-wide significance (p-value < 5×10−8). Four loci were suggestive (p-value < 1×10−5) of an association with WML progression: 10q24.32 (rs10883817, p=1.46×10−6); 12q13.13 (rs4761974, p=8.71×10−7); 20p12.1 (rs6135309, p=3.69×10−6); and 4p15.31 (rs7664442, p=2.26×10−6). Variants that have been previously related to WML explained only 0.8% to 11.7% more of the variance in WML progression than age, vascular risk factors and baseline WML burden. Conclusions Common genetic factors contribute little to the progression of age-related WML in middle-aged and older adults. Future research on determinants of WML progression should focus more on environmental, life-style or host-related biological factors. PMID:26451028
van der Westhuizen, R R; van der Westhuizen, J
It is generally accepted that feed intake and growth (gain) are the most important economic components when calculating profitability in a growth test or feedlot. We developed a single post-weaning growth (feedlot) index based on the economic values of different components. Variance components, heritabilities and genetic correlations for and between initial weight (IW), final weight (FW), feed intake (FI), and shoulder height (SHD) were estimated by multitrait restricted maximum likelihood procedures. The estimated breeding values (EBVs) and the economic values for IW, FW and FI were used in a selection index to estimate a post-weaning or feedlot profitability value. Heritabilities for IW, FW, FI, and SHD were 0.41, 0.40, 0.33, and 0.51, respectively. The highest genetic correlations were 0.78 (between IW and FW) and 0.70 (between FI and FW). EBVs were used in a selection index to calculate a single economical value for each animal. This economic value is an indication of the gross profitability value or the gross test value (GTV) of the animal in a post-weaning growth test. GTVs varied between -R192.17 and R231.38 with an average of R9.31 and a standard deviation of R39.96. The Pearson correlations between EBVs (for production and efficiency traits) and GTV ranged from -0.51 to 0.68. The lowest correlation (closest to zero) was 0.26 between the Kleiber ratio and GTV. Correlations of 0.68 and -0.51 were estimated between average daily gain and GTV and feed conversion ratio and GTV, respectively. These results showed that it is possible to select for GTV. The selection index can benefit feedlotting in selecting offspring of bulls with high GTVs to maximize profitability.
Lee, Sang Hong; Wray, Naomi R.
Genome-wide association studies (GWAS) are routinely conducted for both quantitative and binary (disease) traits. We present two analytical tools for use in the experimental design of GWAS. Firstly, we present power calculations quantifying power in a unified framework for a range of scenarios. In this context we consider the utility of quantitative scores (e.g. endophenotypes) that may be available on cases only or both cases and controls. Secondly, we consider the accuracy of prediction of genetic risk from genome-wide SNPs and derive an expression for genomic prediction accuracy using a liability threshold model for disease traits in a case-control design. The expected values based on our derived equations for both power and prediction accuracy agree well with observed estimates from simulations. PMID:23977056
Fernandez-Rozadilla, C; Cazier, J B; Moreno, V; Crous-Bou, M; Guinó, E; Durán, G; Lamas, M J; López, R; Candamio, S; Gallardo, E; Paré, L; Baiget, M; Páez, D; López-Fernández, L A; Cortejoso, L; García, M I; Bujanda, L; González, D; Gonzalo, V; Rodrigo, L; Reñé, J M; Jover, R; Brea-Fernández, A; Andreu, M; Bessa, X; Llor, X; Xicola, R; Palles, C; Tomlinson, I; Castellví-Bel, S; Castells, A; Ruiz-Ponte, C; Carracedo, A
The development of genotyping technologies has allowed for wider screening for inherited causes of variable outcomes following drug administration. We have performed a genome-wide association study (GWAS) on 221 colorectal cancer (CRC) patients that had been treated with 5-fluorouracil (5-FU), either alone or in combination with oxaliplatin (FOLFOX). A validation set of 791 patients was also studied. Seven SNPs (rs16857540, rs2465403, rs10876844, rs10784749, rs17626122, rs7325568 and rs4243761) showed evidence of association (pooled P-values 0.020, 9.426E-03, 0.010, 0.017, 0.042, 2.302E-04, 2.803E-03) with adverse drug reactions (ADRs). This is the first study to explore the genetic basis of inter-individual variation in toxicity responses to the administration of 5-FU or FOLFOX in CRC patients on a genome-wide scale.
Wang, Hansong; Burnett, Terrilea; Kono, Suminori; Haiman, Christopher A.; Iwasaki, Motoki; Wilkens, Lynne R.; Loo, Lenora W.M.; Berg, David Van Den; Kolonel, Laurence N.; Henderson, Brian E.; Keku, Temitope O.; Sandler, Robert S.; Signorello, Lisa B.; Blot, William J.; Newcomb, Polly A.; Pande, Mala; Amos, Christopher I.; West, Dee W.; Bézieau, Stéphane; Berndt, Sonja I.; Zanke, Brent W.; Hsu, Li; Lindor, Noralane M.; Haile, Robert W.; Hopper, John L.; Jenkins, Mark A.; Gallinger, Steven; Casey, Graham; Stenzel, Stephanie L.; Schumacher, Fredrick R.; Peters, Ulrike; Gruber, Stephen B.; Tsugane, Shoichiro; Stram, Daniel O.; Marchand, Loïc Le
The genetic basis of sporadic colorectal cancer (CRC) is not well explained by known risk polymorphisms. Here we perform a meta-analysis of two genome-wide association studies in 2,627 cases and 3,797 controls of Japanese ancestry and 1,894 cases and 4,703 controls of African ancestry, to identify genetic variants that contribute to CRC susceptibility. We replicate genome-wide statistically significant associations (P < 5×10−8) in 16,823 cases and 18,211 controls of European ancestry. This study reveals a new pan-ethnic CRC risk locus at 10q25 (rs12241008, intronic to VTI1A; P=1.4×10−9), providing additional insight into the etiology of CRC and highlighting the value of association mapping in diverse populations. PMID:25105248
Adhikari, Kaustubh; Fontanil, Tania; Cal, Santiago; Mendoza-Revilla, Javier; Fuentes-Guajardo, Macarena; Chacón-Duque, Juan-Camilo; Al-Saadi, Farah; Johansson, Jeanette A.; Quinto-Sanchez, Mirsha; Acuña-Alonzo, Victor; Jaramillo, Claudia; Arias, William; Barquera Lozano, Rodrigo; Macín Pérez, Gastón; Gómez-Valdés, Jorge; Villamil-Ramírez, Hugo; Hunemeier, Tábita; Ramallo, Virginia; Silva de Cerqueira, Caio C.; Hurtado, Malena; Villegas, Valeria; Granja, Vanessa; Gallo, Carla; Poletti, Giovanni; Schuler-Faccini, Lavinia; Salzano, Francisco M.; Bortolini, Maria-Cátira; Canizales-Quinteros, Samuel; Rothhammer, Francisco; Bedoya, Gabriel; Gonzalez-José, Rolando; Headon, Denis; López-Otín, Carlos; Tobin, Desmond J.; Balding, David; Ruiz-Linares, Andrés
We report a genome-wide association scan in over 6,000 Latin Americans for features of scalp hair (shape, colour, greying, balding) and facial hair (beard thickness, monobrow, eyebrow thickness). We found 18 signals of association reaching genome-wide significance (P values 5 × 10−8 to 3 × 10−119), including 10 novel associations. These include novel loci for scalp hair shape and balding, and the first reported loci for hair greying, monobrow, eyebrow and beard thickness. A newly identified locus influencing hair shape includes a Q30R substitution in the Protease Serine S1 family member 53 (PRSS53). We demonstrate that this enzyme is highly expressed in the hair follicle, especially the inner root sheath, and that the Q30R substitution affects enzyme processing and secretion. The genome regions associated with hair features are enriched for signals of selection, consistent with proposals regarding the evolution of human hair. PMID:26926045
Neale, Benjamin M.; Medland, Sarah E.; Ripke, Stephan; Asherson, Philip; Franke, Barbara; Lesch, Klaus-Peter; Faraone, Stephen V.; Nguyen, Thuy Trang; Schafer, Helmut; Holmans, Peter; Daly, Mark; Steinhausen, Hans-Christoph; Freitag, Christine; Reif, Andreas; Renner, Tobias J.; Romanos, Marcel; Romanos, Jasmin; Walitza, Susanne; Warnke, Andreas; Meyer, Jobst; Palmason, Haukur; Buitelaar, Jan; Vasquez, Alejandro Arias; Lambregts-Rommelse, Nanda; Gill, Michael; Anney, Richard J. L.; Langely, Kate; O'Donovan, Michael; Williams, Nigel; Owen, Michael; Thapar, Anita; Kent, Lindsey; Sergeant, Joseph; Roeyers, Herbert; Mick, Eric; Biederman, Joseph; Doyle, Alysa; Smalley, Susan; Loo, Sandra; Hakonarson, Hakon; Elia, Josephine; Todorov, Alexandre; Miranda, Ana; Mulas, Fernando; Ebstein, Richard P.; Rothenberger, Aribert; Banaschewski, Tobias; Oades, Robert D.; Sonuga-Barke, Edmund; McGough, James; Nisenbaum, Laura; Middleton, Frank; Hu, Xiaolan; Nelson, Stan
Objective: Although twin and family studies have shown attention-deficit/hyperactivity disorder (ADHD) to be highly heritable, genetic variants influencing the trait at a genome-wide significant level have yet to be identified. As prior genome-wide association studies (GWAS) have not yielded significant results, we conducted a meta-analysis of…
The capacity to identify immunogens for vaccine development by genome-wide screening has been markedly enhanced by the availability of complete microbial genome sequences coupled to rapid proteomic and bioinformatic analysis. Critical to this genome-wide screening is in vivo testing in the context o...
Neale, Benjamin M.; Medland, Sarah; Ripke, Stephan; Anney, Richard J. L.; Asherson, Philip; Buitelaar, Jan; Franke, Barbara; Gill, Michael; Kent, Lindsey; Holmans, Peter; Middleton, Frank; Thapar, Anita; Lesch, Klaus-Peter; Faraone, Stephen V.; Daly, Mark; Nguyen, Thuy Trang; Schafer, Helmut; Steinhausen, Hans-Christoph; Reif, Andreas; Renner, Tobias J.; Romanos, Marcel; Romanos, Jasmin; Warnke, Andreas; Walitza, Susanne; Freitag, Christine; Meyer, Jobst; Palmason, Haukur; Rothenberger, Aribert; Hawi, Ziarih; Sergeant, Joseph; Roeyers, Herbert; Mick, Eric; Biederman, Joseph
Objective: Although twin and family studies have shown attention-deficit/hyperactivity disorder (ADHD) to be highly heritable, genetic variants influencing the trait at a genome-wide significant level have yet to be identified. Thus additional genome-wide association studies (GWAS) are needed. Method: We used case-control analyses of 896 cases…
Mick, Eric; Todorov, Alexandre; Smalley, Susan; Hu, Xiaolan; Loo, Sandra; Todd, Richard D.; Biederman, Joseph; Byrne, Deirdre; Dechairo, Bryan; Guiney, Allan; McCracken, James; McGough, James; Nelson, Stanley F.; Reiersen, Angela M.; Wilens, Timothy E.; Wozniak, Janet; Neale, Benjamin M.; Faraone, Stephen V.
Objective: Genes likely play a substantial role in the etiology of attention-deficit/hyperactivity disorder (ADHD). However, the genetic architecture of the disorder is unknown, and prior genome-wide association studies (GWAS) have not identified a genome-wide significant association. We have conducted a third, independent, multisite GWAS of…
Peters, Ulrike; Jiao, Shuo; Schumacher, Fredrick R.; Hutter, Carolyn M.; Aragaki, Aaron K.; Baron, John A.; Berndt, Sonja I.; Bézieau, Stéphane; Brenner, Hermann; Butterbach, Katja; Caan, Bette J.; Campbell, Peter T.; Carlson, Christopher S.; Casey, Graham; Chan, Andrew T.; Chang-Claude, Jenny; Chanock, Stephen J.; Chen, Lin S.; Coetzee, Gerhard A.; Coetzee, Simon G.; Conti, David V.; Curtis, Keith R.; Duggan, David; Edwards, Todd; Fuchs, Charles S.; Gallinger, Steven; Giovannucci, Edward L.; Gogarten, Stephanie M.; Gruber, Stephen B.; Haile, Robert W.; Harrison, Tabitha A.; Hayes, Richard B.; Henderson, Brian E.; Hoffmeister, Michael; Hopper, John L.; Hudson, Thomas J.; Hunter, David J.; Jackson, Rebecca D.; Jee, Sun Ha; Jenkins, Mark A.; Jia, Wei-Hua; Kolonel, Laurence N.; Kooperberg, Charles; Küry, Sébastien; Lacroix, Andrea Z.; Laurie, Cathy C.; Laurie, Cecelia A.; Le Marchand, Loic; Lemire, Mathieu; Levine, David; Lindor, Noralane M.; Liu, Yan; Ma, Jing; Makar, Karen W.; Matsuo, Keitaro; Newcomb, Polly A.; Potter, John D.; Prentice, Ross L.; Qu, Conghui; Rohan, Thomas; Rosse, Stephanie A.; Schoen, Robert E.; Seminara, Daniela; Shrubsole, Martha; Shu, Xiao-Ou; Slattery, Martha L.; Taverna, Darin; Thibodeau, Stephen N.; Ulrich, Cornelia M.; White, Emily; Xiang, Yongbing; Zanke, Brent W.; Zeng, Yi-Xin; Zhang, Ben; Zheng, Wei; Hsu, Li
BACKGROUND & AIMS Heritable factors contribute to the development of colorectal cancer. Identifying the genetic loci associated with colorectal tumor formation could elucidate the mechanisms of pathogenesis. METHODS We conducted a genome-wide association study that included 14 studies, 12,696 cases of colorectal tumors (11,870 cancer, 826 adenoma), and 15,113 controls of European descent. The 10 most statistically significant, previously unreported findings were followed up in 6 studies; these included 3056 colorectal tumor cases (2098 cancer, 958 adenoma) and 6658 controls of European and Asian descent. RESULTS Based on the combined analysis, we identified a locus that reached the conventional genome-wide significance level at less than 5.0 × 10−8: an intergenic region on chromosome 2q32.3, close to nucleic acid binding protein 1 (most significant single nucleotide polymorphism: rs11903757; odds ratio [OR], 1.15 per risk allele; P = 3.7 × 10−8). We also found evidence for 3 additional loci with P values less than 5.0 × 10−7: a locus within the laminin gamma 1 gene on chromosome 1q25.3 (rs10911251; OR, 1.10 per risk allele; P = 9.5 × 10−8), a locus within the cyclin D2 gene on chromosome 12p13.32 (rs3217810; OR, 0.84 per risk allele; P = 5.9 × 10−8), and a locus in the T-box 3 gene on chromosome 12q24.21 (rs59336; OR, 0.91 per risk allele; P = 3.7 × 10−7). CONCLUSIONS In a large genome-wide association study, we associated polymorphisms close to nucleic acid binding protein 1 (which encodes a DNA-binding protein involved in DNA repair) with colorectal tumor risk. We also provided evidence for an association between colorectal tumor risk and polymorphisms in laminin gamma 1 (this is the second gene in the laminin family to be associated with colorectal cancers), cyclin D2 (which encodes for cyclin D2), and T-box 3 (which encodes a T-box transcription factor and is a target of Wnt signaling to β-catenin). The roles of these genes and their products
Jiang, Long; Liu, Lu; Cheng, Yuyan; Lin, Yan; Shen, Changbing; Zhu, Caihong; Yang, Sen; Yin, Xianyong; Zhang, Xuejun
Missing heritability is a common problem in genome-wide association studies of complex diseases/traits. To obtain an unbiased heritability estimate, we applied phenotype correlation-genotype correlation regression to psoriasis genome-wide association data in Han Chinese, comprising 1139 cases and 1132 controls. We estimated that 45.7% of the heritability of psoriasis in Han Chinese was captured by common variants (s.e.=12.5%), which reinforced that the majority of psoriasis heritability can be covered by common variants in genome-wide association data (68.2%). The results provided evidence that the heritability covered by psoriasis genome-wide genotyping data was probably underestimated by the previous restricted maximum likelihood method. Our study highlights the broad role of common variants in the etiology of psoriasis and sheds light on the possibility of identifying more common variants of small effect by increasing the sample size in psoriasis genome-wide association studies.
Dumitrescu, Logan; Ritchie, Marylyn D.; Denny, Joshua C.; El Rouby, Nihal M.; McDonough, Caitrin W.; Bradford, Yuki; Ramirez, Andrea H.; Bielinski, Suzette J.; Basford, Melissa A.; Chai, High Seng; Peissig, Peggy; Carrell, David; Pathak, Jyotishman; Rasmussen, Luke V.; Wang, Xiaoming; Pacheco, Jennifer A.; Kho, Abel N.; Hayes, M. Geoffrey; Matsumoto, Martha; Smith, Maureen E.; Li, Rongling; Cooper-DeHoff, Rhonda M.; Kullo, Iftikhar J.; Chute, Christopher G.; Chisholm, Rex L.; Jarvik, Gail P.; Larson, Eric B.; Carey, David; McCarty, Catherine A.; Williams, Marc S.; Roden, Dan M.; Bottinger, Erwin; Johnson, Julie A.; de Andrade, Mariza; Crawford, Dana C.
Resistant hypertension is defined as high blood pressure that remains above treatment goals in spite of the concurrent use of three antihypertensive agents from different classes. Despite the important health consequences of resistant hypertension, few studies of resistant hypertension have been conducted. To perform a genome-wide association study for resistant hypertension, we defined and identified cases of resistant hypertension and hypertensives with treated, controlled hypertension among >47,500 adults residing in the US linked to electronic health records (EHRs) and genotyped as part of the electronic MEdical Records & GEnomics (eMERGE) Network. Electronic selection logic using billing codes, laboratory values, text queries, and medication records was used to identify resistant hypertension cases and controls at each site, and a total of 3,006 cases of resistant hypertension and 876 controlled hypertensives were identified among eMERGE Phase I and II sites. After imputation and quality control, a total of 2,530,150 SNPs were tested for an association among 2,830 multi-ethnic cases of resistant hypertension and 876 controlled hypertensives. No test of association was genome-wide significant in the full dataset or in the dataset limited to European American cases (n = 1,719) and controls (n = 708). The most significant finding was CLNK rs13144136 at p = 1.00 × 10−6 (odds ratio = 0.68; 95% CI = 0.58–0.80) in the full dataset with similar results in the European American only dataset. We also examined whether SNPs known to influence blood pressure or hypertension also influenced resistant hypertension. None was significant after correction for multiple testing. These data highlight both the difficulties and the potential utility of EHR-linked genomic data to study clinically-relevant traits such as resistant hypertension. PMID:28222112
Demerath, Ellen W.; Liu, Ching-Ti; Franceschini, Nora; Chen, Gary; Palmer, Julie R.; Smith, Erin N.; Chen, Christina T.L.; Ambrosone, Christine B.; Arnold, Alice M.; Bandera, Elisa V.; Berenson, Gerald S.; Bernstein, Leslie; Britton, Angela; Cappola, Anne R.; Carlson, Christopher S.; Chanock, Stephen J.; Chen, Wei; Chen, Zhao; Deming, Sandra L.; Elks, Cathy E.; Evans, Michelle K.; Gajdos, Zofia; Henderson, Brian E.; Hu, Jennifer J.; Ingles, Sue; John, Esther M.; Kerr, Kathleen F.; Kolonel, Laurence N.; Le Marchand, Loic; Lu, Xiaoning; Millikan, Robert C.; Musani, Solomon K.; Nock, Nora L.; North, Kari; Nyante, Sarah; Press, Michael F.; Rodriquez-Gil, Jorge L.; Ruiz-Narvaez, Edward A.; Schork, Nicholas J.; Srinivasan, Sathanur R.; Woods, Nancy F.; Zheng, Wei; Ziegler, Regina G.; Zonderman, Alan; Heiss, Gerardo; Gwen Windham, B.; Wellons, Melissa; Murray, Sarah S.; Nalls, Michael; Pastinen, Tomi; Rajkovic, Aleksandar; Hirschhorn, Joel; Adrienne Cupples, L.; Kooperberg, Charles; Murabito, Joanne M.; Haiman, Christopher A.
African-American (AA) women have earlier menarche on average than women of European ancestry (EA), and earlier menarche is a risk factor for obesity and type 2 diabetes among other chronic diseases. Identification of common genetic variants associated with age at menarche has a potential value in pointing to the genetic pathways underlying chronic disease risk, yet comprehensive genome-wide studies of age at menarche are lacking for AA women. In this study, we tested the genome-wide association of self-reported age at menarche with common single-nucleotide polymorphisms (SNPs) in a total of 18 089 AA women in 15 studies using an additive genetic linear regression model, adjusting for year of birth and population stratification, followed by inverse-variance weighted meta-analysis (Stage 1). Top meta-analysis results were then tested in an independent sample of 2850 women (Stage 2). First, while no SNP passed the pre-specified P < 5 × 10−8 threshold for significance in Stage 1, suggestive associations were found for variants near FLRT2 and PIK3R1, and conditional analysis identified two independent SNPs (rs339978 and rs980000) in or near RORA, strengthening the support for this suggestive locus identified in EA women. Secondly, an investigation of SNPs in 42 previously identified menarche loci in EA women demonstrated that 25 (60%) of them contained variants significantly associated with menarche in AA women. The findings provide the first evidence of cross-ethnic generalization of menarche loci identified to date, and suggest a number of novel biological links to menarche timing in AA women. PMID:23599027
Fox, Caroline S; White, Charles C; Lohman, Kurt; Heard-Costa, Nancy; Cohen, Paul; Zhang, Yingying; Johnson, Andrew D; Emilsson, Valur; Liu, Ching-Ti; Chen, Y-D Ida; Taylor, Kent D; Allison, Matthew; Budoff, Matthew; Rotter, Jerome I; Carr, J Jeffrey; Hoffmann, Udo; Ding, Jingzhong; Cupples, L Adrienne; Liu, Yongmei
Pericardial fat is a localized fat depot associated with coronary artery calcium and myocardial infarction. We hypothesized that genetic loci would be associated with pericardial fat independent of other body fat depots. Pericardial fat was quantified in 5,487 individuals of European ancestry from the Framingham Heart Study (FHS) and the Multi-Ethnic Study of Atherosclerosis (MESA). Genotyping was performed using standard arrays and imputed to ~2.5 million HapMap SNPs. Each study performed a genome-wide association analysis of pericardial fat adjusted for age, sex, weight, and height. A weighted z-score meta-analysis was conducted, and validation was obtained in an additional 3,602 multi-ethnic individuals from the MESA study. We identified a genome-wide significant signal in our primary meta-analysis at rs10198628 near TRIB2 (MAF 0.49, p = 2.7 × 10−8). This SNP was not associated with visceral fat (p = 0.17) or body mass index (p = 0.38), although we observed direction-consistent, nominal significance with visceral fat adjusted for BMI (p = 0.01) in the Framingham Heart Study. Our findings were robust among African ancestry (n = 1,442, p = 0.001), Hispanic (n = 1,399, p = 0.004), and Chinese (n = 761, p = 0.007) participants from the MESA study, with a combined p-value of 5.4 × 10−14. We observed TRIB2 gene expression in the pericardial fat of mice. rs10198628 near TRIB2 is associated with pericardial fat but not measures of generalized or visceral adiposity, reinforcing the concept that there are unique genetic underpinnings to ectopic fat distribution.
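The abstract above mentions a weighted z-score meta-analysis without giving its form. A minimal sketch of one common variant follows, assuming square-root-of-sample-size weights; the weighting scheme, the function names and the example inputs are illustrative assumptions, not details taken from the study.

```python
import math


def weighted_z_meta(z_scores, sample_sizes):
    """Combine per-study z-scores into one z-score.

    Assumption: each study is weighted by sqrt(sample size); the
    abstract does not say which weights the authors used.
    """
    if not z_scores or len(z_scores) != len(sample_sizes):
        raise ValueError("need one sample size per z-score")
    weights = [math.sqrt(n) for n in sample_sizes]
    numerator = sum(w * z for w, z in zip(weights, z_scores))
    denominator = math.sqrt(sum(w * w for w in weights))
    return numerator / denominator


def two_sided_p(z):
    """Two-sided p-value of a standard normal z-score."""
    return math.erfc(abs(z) / math.sqrt(2))


if __name__ == "__main__":
    # Hypothetical per-study z-scores and sample sizes for one SNP.
    z = weighted_z_meta([3.1, 2.4], [3000, 2487])
    print(z, two_sided_p(z))
```

Under this weighting, larger cohorts pull the combined statistic towards their own result, and the signs of the z-scores must be aligned to the same reference allele before combining.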
Hawken, R J; Zhang, Y D; Fortes, M R S; Collis, E; Barris, W C; Corbet, N J; Williams, P J; Fordyce, G; Holroyd, R G; Walkley, J R W; Barendse, W; Johnston, D J; Prayaga, K C; Tier, B; Reverter, A; Lehnert, S A
The genetics of reproduction is poorly understood because the heritabilities of traits currently recorded are low. To elucidate the genetics underlying reproduction in beef cattle, we performed a genome-wide association study using the bovine SNP50 chip in 2 tropically adapted beef cattle breeds, Brahman and Tropical Composite. Here we present the results for 3 female reproduction traits: 1) age at puberty, defined as age in days at first observed corpus luteum (CL) after frequent ovarian ultrasound scans (AGECL); 2) the postpartum anestrous interval, measured as the number of days from calving to first ovulation postpartum (first rebreeding interval, PPAI); and 3) the occurrence of the first postpartum ovulation before weaning in the first rebreeding period (PW), defined from PPAI. In addition, correlated traits such as BW, height, serum IGF1 concentration, condition score, and fatness were also examined. In the Brahman and Tropical Composite cattle, 169 [false positive rate (FPR) = 0.262] and 84 (FPR = 0.581) SNP, respectively, were significant (P < 0.001) for AGECL. In Brahman, 41% of these significant markers mapped to a single chromosomal region on BTA14. In Tropical Composites, 16% of these significant markers were located on BTA5. For PPAI, 66 (FPR = 0.67) and 113 (FPR = 0.432) SNP were significant (P < 0.001) in Brahman and Tropical Composite, respectively, whereas for PW, 68 (FPR = 0.64) and 113 (FPR = 0.432) SNP were significant (P < 0.01). In Tropical Composites, the largest concentration of PPAI markers were located on BTA5 [19% (PPAI) and 23% (PW)], and BTA16 [17% (PPAI) and 18% (PW)]. In Brahman cattle, the largest concentration of markers for postpartum anestrus was located on BTA3 (14% for PPAI and PW) and BTA14 (17% PPAI). Very few of the significant markers for female reproduction traits for the Brahman and Tropical Composite breeds were located in the same chromosomal regions. However, fatness and BW traits as well as serum IGF1 concentration
Genome-Wide Association Mapping for Intelligence in Military Working Dogs: Canine Cohort, Canine Intelligence Assessment Regimen, Genome-Wide Single Nucleotide Polymorphism (SNP) Typing, and Unsupervised Classification Algorithm for Genome-Wide Association Data Analysis
were down-selected and successfully genotyped for whole genome (WG) single nucleotide polymorphism (SNP) markers by means of the Affymetrix Canine... Subject terms: military working dog, genome-wide association study, genetic marker, intelligence, Canine Intelligence Testing Protocol, classification technique, clustering analysis. Technical Report: September 2011
Zheng, Xianhu; Kuang, Youyi; Lv, Weihua; Cao, Dingchen; Sun, Zhipeng; Sun, Xiaowen
Muscle fat content is an important phenotypic trait in fish, as it affects the nutritional, technical and sensory qualities of flesh. To identify loci and candidate genes associated with muscle fat content and abdominal fat traits, we performed a genome-wide association study (GWAS) using the common carp 250 K SNP assay in a common carp F2 resource population. A total of 18 loci surpassing the genome-wide suggestive significance level were detected for 4 traits: fat content in dorsal muscle (MFdo), fat content in abdominal muscle (MFab), abdominal fat weight (AbFW), and AbFW as a percentage of eviscerated weight (AbFP). Among them, one SNP (carp089419) affecting both AbFW and AbFP reached the genome-wide significance level. Ten of those loci were harbored in or near known genes. Furthermore, relative expressions of 5 genes related to MFdo were compared using dorsal muscle samples with high and low phenotypic values. The results showed that 4 genes were differentially expressed between the high and low phenotypic groups. These genes are, therefore, prospective candidate genes for muscle fat content: ankyrin repeat domain 10a (ankrd10a); tetratricopeptide repeat, ankyrin repeat and coiled-coil containing 2 (tanc2); four jointed box 1 (fjx1); and choline kinase alpha (chka). These results offer valuable insights into the complex genetic basis of fat metabolism and deposition. PMID:28030623
Parkhomenko, Elena; Tritchler, David; Lemire, Mathieu; Hu, Pingzhao; Beyene, Joseph
In high-dimensional studies such as genome-wide association studies, the correction for multiple testing in order to control total type I error results in decreased power to detect modest effects. We present a new analytical approach based on the higher criticism statistic that allows identification of the presence of modest effects. We apply our method to the genome-wide study of rheumatoid arthritis provided in the Genetic Analysis Workshop 16 Problem 1 data set. There is evidence for unknown bias in this study that could be explained by the presence of undetected modest effects. We compared the asymptotic and empirical thresholds for the higher criticism statistic. Using the asymptotic threshold we detected the presence of modest effects genome-wide. We also detected modest effects using 90th percentile of the empirical null distribution as a threshold; however, there is no such evidence when the 95th and 99th percentiles were used. While the higher criticism method suggests that there is some evidence for modest effects, interpreting individual single-nucleotide polymorphisms with significant higher criticism statistics is of undermined value. The goal of higher criticism is to alert the researcher that genetic effects remain to be discovered and to promote the use of more targeted and powerful studies to detect the remaining effects.
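The abstract above relies on the higher criticism statistic but does not state it. As a rough illustration, the sketch below uses the standard Donoho-Jin form computed from sorted p-values; the function name, the `alpha0` search fraction and the toy inputs are assumptions for illustration and may differ from the variant the authors applied.

```python
import math


def higher_criticism(p_values, alpha0=0.5):
    """Higher criticism statistic over the smallest alpha0 fraction of p-values.

    For sorted p-values p_(1) <= ... <= p_(n) this returns the maximum of
    sqrt(n) * (i/n - p_(i)) / sqrt(p_(i) * (1 - p_(i))) over i <= alpha0 * n.
    Returns -inf if no index qualifies.
    """
    if not p_values:
        raise ValueError("p_values must not be empty")
    n = len(p_values)
    best = float("-inf")
    for i, p in enumerate(sorted(p_values), start=1):
        if i > alpha0 * n:
            break
        if p <= 0.0 or p >= 1.0:
            continue  # the standardised term is undefined at 0 and 1
        hc = math.sqrt(n) * (i / n - p) / math.sqrt(p * (1.0 - p))
        best = max(best, hc)
    return best


if __name__ == "__main__":
    # Toy data: ten evenly spread p-values, then the same set with one tiny p-value.
    null_like = [(i - 0.5) / 10 for i in range(1, 11)]
    with_signal = [1e-6] + null_like[1:]
    print(higher_criticism(null_like), higher_criticism(with_signal))
```

A large value indicates more small p-values than a uniform null would produce. As the abstract notes, the statistic is compared against an asymptotic or empirical null threshold and flags the presence of effects rather than identifying individual SNPs.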
de Oliveira, Marco Antônio Rott; Higashi, Wilson; Scapim, Carlos Alberto; Schuster, Ivan
Mapping quantitative trait loci through the use of linkage disequilibrium (LD) in populations of unrelated individuals provides a valuable approach for dissecting the genetic basis of complex traits in soybean (Glycine max). The haplotype-based genome-wide association study (GWAS) has now been proposed as a complementary approach that makes fuller use of LD to assess the genetic determinants of agronomic traits. In this study a GWAS was undertaken to identify genomic regions that control 100-seed weight (SW), plant height (PH) and seed yield (SY) in a soybean association mapping panel using single nucleotide polymorphism (SNP) markers and haplotype information. The soybean cultivars (N = 169) were field-evaluated across four locations in southern Brazil. The genome-wide haplotype association analysis (941 haplotypes) identified eleven, seventeen and fifty-nine SNP-based haplotypes significantly associated with SY, SW and PH, respectively. Although most marker-trait associations were environment and trait specific, stable haplotype associations were identified for SY and SW across environments (e.g., haplotype Gm12_Hap12). The haplotype block 42 on Chr19 (Gm19_Hap42) was confirmed to be associated with PH in two environments. These findings enable us to refine the breeding strategy for tropical soybean and confirm that haplotype-based GWAS can provide new insights into genetic determinants that are not captured by the single-marker approach. PMID:28152092
Friedrich, Juliane; Brand, Bodo; Ponsuksili, Siriluck; Graunke, Katharina L; Langbein, Jan; Knaust, Jacqueline; Kühn, Christa; Schwerin, Manfred
Behaviour traits of cattle have been reported to affect important production traits, such as meat quality and milk performance as well as reproduction and health. Genetic predisposition is, together with environmental stimuli, undoubtedly involved in the development of behaviour phenotypes. Underlying molecular mechanisms affecting behaviour in general and behaviour and production traits in particular still have to be studied in detail. Therefore, we performed a genome-wide association study in an F2 Charolais × German Holstein cross-breed population to identify genetic variants that affect behaviour-related traits assessed in an open-field and novel-object test and analysed their putative impact on milk performance. Of 37,201 tested single nucleotide polymorphisms (SNPs), four showed a genome-wide and 37 a chromosome-wide significant association with behaviour traits assessed in both tests. Nine of the SNPs that were associated with behaviour traits likewise showed a nominally significant association with milk performance traits. On chromosomes 14 and 29, six SNPs were identified to be associated with exploratory behaviour and inactivity during the novel-object test as well as with milk yield traits. Least squares means for behaviour and milk performance traits for these SNPs revealed that genotypes associated with higher inactivity and less exploratory behaviour promote higher milk yields. Whether these results are due to molecular mechanisms simultaneously affecting behaviour and milk performance or due to a behaviour predisposition, which causes indirect effects on milk performance by influencing individual reactivity, needs further investigation.
Wang, Jia; Jian, Hongju; Wei, Lijuan; Qu, Cunmin; Xu, Xinfu; Lu, Kun; Qian, Wei; Li, Jiana; Li, Maoteng; Liu, Liezhao
A stable yellow-seeded variety is the breeding goal for obtaining the ideal rapeseed (Brassica napus L.) plant, and the amount of acid detergent lignin (ADL) in the seeds and the hull content (HC) are often used as yellow-seeded rapeseed screening indices. In this study, a genome-wide association analysis of 520 accessions was performed using the Q + K model with a total of 31,839 single-nucleotide polymorphism (SNP) sites. As a result, three significant associations on the B. napus chromosomes A05, A09, and C05 were detected for seed ADL content. The peak SNPs were within 9.27, 14.22, and 20.86 kb of the key genes BnaA.PAL4, BnaA.CAD2/BnaA.CAD3, and BnaC.CCR1, respectively. Further analyses were performed on the major locus of A05, which was also detected in the seed HC examination. A comparison of our genome-wide association study (GWAS) results and previous linkage mappings revealed a common chromosomal region on A09, which indicates that GWAS can be used as a powerful complementary strategy for dissecting complex traits in B. napus. Genomic selection (GS) utilizing the significant SNP markers based on the GWAS results exhibited increased predictive ability, indicating that the predictive ability of a given model can be substantially improved by using GWAS and GS.
Nimmakayala, Padma; Abburi, Venkata L.; Saminathan, Thangasamy; Almeida, Aldo; Davenport, Brittany; Davidson, Joshua; Reddy, C. V. Chandra Mohan; Hankins, Gerald; Ebert, Andreas; Choi, Doil; Stommel, John; Reddy, Umesh K.
Principal component analysis (PCA) with 36,621 polymorphic genome-anchored single nucleotide polymorphisms (SNPs) identified collectively for Capsicum annuum and Capsicum baccatum was used to characterize population structure and species domestication of these two important incompatible cultivated pepper species. Estimated mean nucleotide diversity (π) and Tajima's D across various chromosomes revealed biased distribution toward negative values on all chromosomes (except for chromosome 4) in cultivated C. baccatum, indicating a population bottleneck during domestication of C. baccatum. In contrast, C. annuum chromosomes showed positive π and Tajima's D on all chromosomes except chromosome 8, which may be because of domestication at multiple sites contributing to wider genetic diversity. For C. baccatum, 13,129 SNPs were available, with minor allele frequency (MAF) ≥0.05; PCA of the SNPs revealed 283 C. baccatum accessions grouped into 3 distinct clusters, for strong population structure. The fixation index (FST) between domesticated C. annuum and C. baccatum was 0.78, which indicates genome-wide divergence. We conducted extensive linkage disequilibrium (LD) analysis of C. baccatum var. pendulum cultivars on all adjacent SNP pairs within a chromosome to identify regions of high and low LD interspersed with a genome-wide average LD block size of 99.1 kb. We characterized 1742 haplotypes containing 4420 SNPs (range 9–2 SNPs per haplotype). Genome-wide association study (GWAS) of peduncle length, a trait that differentiates wild and domesticated C. baccatum types, revealed 36 significantly associated genome-wide SNPs. Population structure, identity by state (IBS) and LD patterns across the genome will be of potential use for future GWAS of economically important traits in C. baccatum peppers. PMID:27857720
Mapholi, N O; Maiwashe, A; Matika, O; Riggio, V; Bishop, S C; MacNeil, M D; Banga, C; Taylor, J F; Dzama, K
Ticks and tick-borne diseases are among the main causes of economic loss in the South African cattle industry through high morbidity and mortality rates. Concerns of the general public regarding chemical residues may tarnish their perceptions of food safety and environmental health when the husbandry of cattle includes frequent use of acaricides to manage ticks. The primary objective of this study was to identify single nucleotide polymorphism (SNP) markers associated with host resistance to ticks in South African Nguni cattle. Tick count data were collected monthly from 586 Nguni cattle reared in four herds under natural grazing conditions over a period of two years. The counts were recorded for six species of ticks attached in eight anatomical locations on the animals and were summed by species and anatomical location. This gave rise to 63 measured phenotypes or traits, with results for 12 of these traits being reported here. Tick count (x) data were transformed using log10(x+1) and the resulting values were examined for normality. DNA was extracted from hair and blood samples and was genotyped using the Illumina BovineSNP50 assay. After quality control (call rate >90%, minor allele frequency >0.02), 40,436 SNPs were retained for analysis. Genetic parameters were estimated and association analysis for tick resistance was carried out using two approaches: a genome-wide association (GWA) analysis using the GenABEL package and a regional heritability mapping (RHM) analysis. The Bonferroni genome-wide (P<0.05) corrected significance threshold was 1.24×10(-6), with 2.47×10(-5) as the suggestive significance threshold (P<0.10) (i.e., one false positive per genome scan) in the GWA analysis. Likelihood ratio test (LRT) thresholds for genome-wide and suggestive significance were 13.5 and 9.15 for the RHM analysis. Six ixodid tick species were identified, with Amblyomma hebraeum (the vector for Heartwater disease) being the dominant species. Heritability estimates (h(2
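The two significance thresholds quoted in this abstract can be checked with simple arithmetic. The sketch below assumes the genome-wide threshold is the Bonferroni value 0.05 divided by the 40,436 retained SNPs, and that the suggestive threshold corresponds to one expected false positive per genome scan (1 divided by the number of SNPs); the function and variable names are illustrative and not taken from the study.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance level that keeps the family-wise error at alpha."""
    return alpha / n_tests


n_snps = 40_436  # SNPs retained after quality control, as stated in the abstract

# Genome-wide threshold: 0.05 spread over all tested SNPs.
genome_wide = bonferroni_threshold(0.05, n_snps)

# Suggestive threshold: one expected false positive per genome scan.
suggestive = 1 / n_snps

print(f"genome-wide: {genome_wide:.2e}")  # 1.24e-06
print(f"suggestive:  {suggestive:.2e}")   # 2.47e-05
```

Both values agree with the thresholds reported in the abstract (1.24×10(-6) and 2.47×10(-5)).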
Ros-Freixedes, Roger; Gol, Sofia; Pena, Ramona N.; Tor, Marc; Ibáñez-Escriche, Noelia; Dekkers, Jack C. M.; Estany, Joan
Intramuscular fat (IMF) content and fatty acid composition affect the organoleptic quality and nutritional value of pork. A genome-wide association study was performed on 138 Duroc pigs genotyped with a 60k SNP chip to detect biologically relevant genomic variants influencing fat content and composition. Despite the limited sample size, the genome-wide association study was powerful enough to detect the association between fatty acid composition and a known haplotypic variant in SCD (SSC14) and to reveal an association of IMF and fatty acid composition in the LEPR region (SSC6). The association of LEPR was later validated with an independent set of 853 pigs using a candidate quantitative trait nucleotide. The SCD gene is responsible for the biosynthesis of oleic acid (C18:1) from stearic acid. This locus affected the stearic to oleic desaturation index (C18:1/C18:0), C18:1, and saturated (SFA) and monounsaturated (MUFA) fatty acids content. These effects were consistently detected in gluteus medius, longissimus dorsi, and subcutaneous fat. The association of LEPR with fatty acid composition was detected only in muscle and was, at least in part, a consequence of its effect on IMF content, with increased IMF resulting in more SFA, less polyunsaturated fatty acids (PUFA), and greater SFA/PUFA ratio. Marker substitution effects estimated with a subset of 65 animals were used to predict the genomic estimated breeding values of 70 animals born 7 years later. Although predictions with the whole SNP chip information were in relatively high correlation with observed SFA, MUFA, and C18:1/C18:0 (0.48–0.60), IMF content and composition were in general better predicted by using only SNPs at the SCD and LEPR loci, in which case the correlation between predicted and observed values was in the range of 0.36 to 0.54 for all traits. Results indicate that markers in the SCD and LEPR genes can be useful to select for optimum fatty acid profiles of pork. PMID:27023885
Pszczola, M; Veerkamp, R F; de Haas, Y; Wall, E; Strabel, T; Calus, M P L
The genomic breeding value accuracy of scarcely recorded traits is low because of the limited number of phenotypic observations. One solution to increase the breeding value accuracy is to use predictor traits. This study investigated the impact of recording additional phenotypic observations for predictor traits on reference and evaluated animals on the genomic breeding value accuracy for a scarcely recorded trait. The scarcely recorded trait was dry matter intake (DMI, n = 869) and the predictor traits were fat-protein-corrected milk (FPCM, n = 1520) and live weight (LW, n = 1309). All phenotyped animals were genotyped and originated from research farms in Ireland, the United Kingdom and the Netherlands. Multi-trait REML was used to simultaneously estimate variance components and breeding values for DMI using available predictors. In addition, analyses using only pedigree relationships were performed. Breeding value accuracy was assessed through cross-validation (CV) and prediction error variance (PEV). CV groups (n = 7) were defined by splitting animals across genetic lines and management groups within country. With no additional traits recorded for the evaluated animals, both CV- and PEV-based accuracies for DMI were substantially higher for genomic than for pedigree analyses (CV: max. 0.26 for pedigree and 0.33 for genomic analyses; PEV: max. 0.45 and 0.52, respectively). With additional traits available, the differences between pedigree and genomic accuracies diminished. With additional recording for FPCM, pedigree accuracies increased from 0.26 to 0.47 for CV and from 0.45 to 0.48 for PEV. Genomic accuracies increased from 0.33 to 0.50 for CV and from 0.52 to 0.53 for PEV. With additional recording for LW instead of FPCM, pedigree accuracies increased to 0.54 for CV and to 0.61 for PEV. Genomic accuracies increased to 0.57 for CV and to 0.60 for PEV. With both FPCM and LW available for evaluated animals, accuracy was highest (0.62 for CV and 0.61 for PEV in
Genotyping breeding materials is now relatively inexpensive but phenotyping costs have remained the same. One method to increase gene mapping power is to use genome-wide genetic markers to combine existing phenotype data for multiple populations into a unified analysis. We combined data from 15 bipa...
Background: Switchgrass (Panicum virgatum) is a herbaceous crop for cellulosic biofuel feedstock development in the USA and Europe. As switchgrass is a naturally outcrossing species, accurate identification of selfed progeny is important to producing inbreds, which can be used in the production of heterotic hybrids. Development of a technically reliable, time-saving and easily used marker system is needed to quantify and characterize breeding origin of progeny plants of targeted parents. Results: Genome-wide screening of 915 mapped microsatellite (simple sequence repeat, SSR) markers was conducted, and 842 (92.0%) produced clear and scorable bands on a pooled DNA sample of eight switchgrass varieties. A total of 166 primer pairs were selected on the basis of their relatively even distribution in the switchgrass genome and PCR amplification quality on 16 tetraploid genotypes. Mean polymorphic information content value for the 166 markers was 0.810, ranging from 0.116 to 0.959. From them, a core set of 48 loci, which had been mapped on 17 linkage groups, was further tested and optimized to develop 24 sets of duplex markers. Most (up to 87.5%) of the targeted but non-allelic amplicons within each duplex were separated by more than 10 bp. Using the established duplex PCR protocol, selfing ratio (i.e., selfed/all progeny × 100%) was identified as 0% for a randomly selected open-pollinated ‘Kanlow’ genotype grown in the field, 15.4% for 22 field-grown plants of bagged inflorescences, and 77.3% for a selected plant grown in a growth chamber. Conclusions: The study developed a duplex SSR-based PCR protocol consisting of 48 markers, providing ample choices of non-tightly-linked loci in the switchgrass whole genome, and representing a powerful, time-saving and easily used method for the identification of selfed progeny in switchgrass. The protocol should be a valuable tool in switchgrass breeding efforts. PMID:23031617
Schrimpf, Rahel; Dierks, Claudia; Martinsson, Gunilla; Sieme, Harald; Distl, Ottmar
A consistently high level of stallion fertility plays an economically important role in modern horse breeding. We performed a genome-wide association study for estimated breeding values of the paternal component of the pregnancy rate per estrus cycle (EBV-PAT) in Hanoverian stallions. A total of 228 Hanoverian stallions were genotyped using the Equine SNP50 Beadchip. The most significant association was found on horse chromosome 6 for a single nucleotide polymorphism (SNP) within phospholipase C zeta 1 (PLCz1). In the close neighbourhood to PLCz1 is located CAPZA3 (capping protein (actin filament) muscle Z-line, alpha 3). The gene PLCz1 encodes a protein essential for spermatogenesis and oocyte activation through sperm induced Ca2+-oscillation during fertilization. We derived equine gene models for PLCz1 and CAPZA3 based on cDNA and genomic DNA sequences. The equine PLCz1 had four different transcripts of which two contained a premature termination codon. Sequencing all exons and their flanking sequences using genomic DNA samples from 19 Hanoverian stallions revealed 47 polymorphisms within PLCz1 and one SNP within CAPZA3. Validation of these 48 polymorphisms in 237 Hanoverian stallions identified three intronic SNPs within PLCz1 as significantly associated with EBV-PAT. Bioinformatic analysis suggested regulatory effects for these SNPs via transcription factor binding sites or microRNAs. In conclusion, non-coding polymorphisms within PLCz1 were identified as conferring stallion fertility and PLCz1 as candidate locus for male fertility in Hanoverian warmblood. CAPZA3 could be eliminated as candidate gene for fertility in Hanoverian stallions. PMID:25354211
Jayakody, Lahiru N; Hayashi, Nobuyuki; Kitagaki, Hiroshi
Degradation of lignocellulose with pressurised hot water is an efficient method of bioethanol production. However, the resultant solution inhibits ethanol fermentation by Saccharomyces cerevisiae. Here, we first report that glycolaldehyde, which is formed when lignocellulose is treated with pressurised hot water, inhibits ethanol fermentation. The final concentration of glycolaldehyde formed by the treatment of lignocellulose with pressurised hot water ranges from 1 to 24 mM, and 1-10 mM glycolaldehyde was sufficient to inhibit fermentation. This result indicates that glycolaldehyde is one of the main substances responsible for inhibiting fermentation after pressurised hot water degradation of lignocellulose. Genome-wide screening of S. cerevisiae revealed that genes encoding alcohol dehydrogenase, methylglyoxal reductase, polysomes, and the ubiquitin ligase complex are required for glycolaldehyde tolerance. These novel findings will provide new perspectives on breeding yeast for bioethanol production from biomass treated with pressurised hot water.
Boerner, V; Johnston, D; Wu, X-L; Bauck, S
Genomically estimated breeding values (GEBV) for Angus beef cattle are available from at least 2 commercial suppliers (Igenity [http://www.igenity.com] and Zoetis [http://www.zoetis.com]). The utility of these GEBV for improving genetic evaluation depends on their accuracies, which can be estimated by the genetic correlation with phenotypic target traits. Genomically estimated breeding values of 1,032 Angus bulls calculated from prediction equations (PE) derived by 2 different procedures in the U.S. Angus population were supplied by Igenity. Both procedures were based on Illumina BovineSNP50 BeadChip genotypes. In procedure sg, GEBV were calculated from PE that used subsets of only 392 SNP, where these subsets were individually selected for each trait by BayesCπ. In procedure rg, GEBV were calculated from PE derived in a ridge regression approach using all available SNP. Because the total set of 1,032 bulls with GEBV contained 732 individuals used in the Igenity training population, GEBV subsets were formed characterized by a decreasing average relationship between individuals in the subsets and individuals in the training population. Accuracies of GEBV were estimated as genetic correlations between GEBV and their phenotypic target traits modeling GEBV as trait observations in a bivariate REML approach, in which phenotypic observations were those recorded in the commercial Australian Angus seed stock sector. Using results from the GEBV subset excluding all training individuals as a reference, estimated accuracies were generally in agreement with those already published, with both types of GEBV (sg and rg) yielding similar results. Accuracies for growth traits ranged from 0.29 to 0.45, for reproductive traits from 0.11 to 0.53, and for carcass traits from 0.3 to 0.75. Accuracies generally decreased with an increasing genetic distance between the training and the validation population. However, for some carcass traits characterized by a low number of phenotypic
In population genomics studies, accounting for the neutral covariance structure across population allele frequencies is critical to improve the robustness of genome-wide scan approaches. Elaborating on the BayEnv model, this study investigates several modeling extensions (i) to improve the estimation accuracy of the population covariance matrix and all the related measures, (ii) to identify significantly overly differentiated SNPs based on a calibration procedure of the XtX statistics, and (iii) to consider alternative covariate models for analyses of association with population-specific covariables. In particular, the auxiliary variable model allows one to deal with multiple testing issues and, providing the relative marker positions are available, to capture some linkage disequilibrium information. A comprehensive simulation study was carried out to evaluate the performances of these different models. Also, when compared in terms of power, robustness, and computational efficiency to five other state-of-the-art genome-scan methods (BayEnv2, BayScEnv, BayScan, flk, and lfmm), the proposed approaches proved highly effective. For illustration purposes, genotyping data on 18 French cattle breeds were analyzed, leading to the identification of 13 strong signatures of selection. Among these, four (surrounding the KITLG, KIT, EDN3, and ALB genes) contained SNPs strongly associated with the piebald coloration pattern while a fifth (surrounding PLAG1) could be associated to morphological differences across the populations. Finally, analysis of Pool-Seq data from 12 populations of Littorina saxatilis living in two different ecotypes illustrates how the proposed framework might help in addressing relevant ecological issues in nonmodel species. Overall, the proposed methods define a robust Bayesian framework to characterize adaptive genetic differentiation across populations. 
The BayPass program implementing the different models is available at http://www1.montpellier.inra.fr/CBGP/software/baypass/.
Li, Kexin; Hong, Wei; Jiao, Hengwu; Wang, Guo-Dong; Rodriguez, Karl A.; Buffenstein, Rochelle; Zhao, Yang; Nevo, Eviatar; Zhao, Huabin
Sympatric speciation (SS), i.e., speciation within a freely breeding population or in contiguous populations, was first proposed by Darwin [Darwin C (1859) On the Origins of Species by Means of Natural Selection] and is still controversial despite theoretical support [Gavrilets S (2004) Fitness Landscapes and the Origin of Species (MPB-41)] and mounting empirical evidence. Speciation of subterranean mammals generally, including the genus Spalax, was considered hitherto allopatric, whereby new species arise primarily through geographic isolation. Here we show in Spalax a case of genome-wide divergence analysis in mammals, demonstrating that SS in continuous populations, with gene flow, encompasses multiple widespread genomic adaptive complexes, associated with the sharply divergent ecologies. The two abutting soil populations of S. galili in northern Israel habituate the ancestral Senonian chalk population and abutting derivative Plio-Pleistocene basalt population. Population divergence originated ∼0.2–0.4 Mya based on both nuclear and mitochondrial genome analyses. Population structure analysis displayed two distinctly divergent clusters of chalk and basalt populations. Natural selection has acted on 300+ genes across the genome, diverging Spalax chalk and basalt soil populations. Gene ontology enrichment analysis highlights strong but differential soil population adaptive complexes: in basalt, sensory perception, musculature, metabolism, and energetics, and in chalk, nutrition and neurogenetics are outstanding. Population differentiation of chemoreceptor genes suggests intersoil population's mate and habitat choice substantiating SS. Importantly, distinctions in protein degradation may also contribute to SS. Natural selection and natural genetic engineering [Shapiro JA (2011) Evolution: A View From the 21st Century] overrule gene flow, evolving divergent ecological adaptive complexes. Sharp ecological divergences abound in nature; therefore, SS appears to be
Huang, Shengxiong; Gao, Yongfeng; Liu, Jikai; Peng, Xiaoli; Niu, Xiangli; Fei, Zhangjun; Cao, Shuqing; Liu, Yongsheng
The WRKY transcription factors have been implicated in multiple biological processes in plants, especially in regulating defense against biotic and abiotic stresses. However, little information is available about the WRKYs in tomato (Solanum lycopersicum). The recent release of the whole-genome sequence of tomato allowed us to perform a genome-wide investigation for tomato WRKY proteins, and to compare these positively identified proteins with their orthologs in model plants, such as Arabidopsis and rice. In the present study, based on the recently released tomato whole-genome sequences, we identified 81 SlWRKY genes that were classified into three main groups, with the second group further divided into five subgroups. Depending on WRKY domains' sequences derived from tomato, Arabidopsis and rice, construction of a phylogenetic tree demonstrated distinct clustering and unique gene expansion of WRKY genes among the three species. Genome mapping analysis revealed that tomato WRKY genes were enriched on several chromosomes, especially on chromosome 5, and 16 % of the family members were tandemly duplicated genes. The tomato WRKYs from each group were shown to share similar motif compositions. Furthermore, tomato WRKY genes showed distinct temporal and spatial expression patterns in different developmental processes and in response to various biotic and abiotic stresses. The expression of 18 selected tomato WRKY genes in response to drought and salt stresses and Pseudomonas syringae invasion, respectively, was validated by quantitative RT-PCR. Our results will provide a platform for functional identification and molecular breeding study of WRKY genes in tomato and probably other Solanaceae plants.
Guo, Mei; Rupe, Mary A; Yang, Xiaofeng; Crasta, Oswald; Zinselmeier, Christopher; Smith, Oscar S; Bowen, Ben
Heterosis, or hybrid vigor, has been widely exploited in plant breeding for many decades, but the molecular mechanisms underlying the phenomenon remain unknown. In this study, we applied genome-wide transcript profiling to gain a global picture of the ways in which a large proportion of genes are expressed in the immature ear tissues of a series of 16 maize hybrids that vary in their degree of heterosis. Key observations include: (1) the proportion of allelic additively expressed genes is positively associated with hybrid yield and heterosis; (2) the proportion of genes that exhibit a bias towards the expression level of the paternal parent is negatively correlated with hybrid yield and heterosis; and (3) there is no correlation between the over- or under-expression of specific genes in maize hybrids with either yield or heterosis. The relationship of the expression patterns with hybrid performance is substantiated by analysis of a genetically improved modern hybrid (Pioneer hybrid 3394) versus a less improved older hybrid (Pioneer hybrid 3306) grown at different levels of plant density stress. The proportion of allelic additively expressed genes is positively associated with the modern high yielding hybrid, heterosis and high yielding environments, whereas the converse is true for the paternally biased gene expression. The dynamic changes of gene expression in hybrids responding to genotype and environment may result from differential regulation of the two parental alleles. Our findings suggest that differential allele regulation may play an important role in hybrid yield or heterosis, and provide a new insight to the molecular understanding of the underlying mechanisms of heterosis.
Jevsinek Skok, D; Godnic, I; Zorc, M; Horvat, S; Dovc, P; Kovac, M; Kunej, T
MicroRNAs are a class of non-coding RNAs that post-transcriptionally regulate target gene expression. Previous studies have shown that microRNA gene variability can interfere with its function, resulting in phenotypic variation. Polymorphisms within microRNA genes present a source of novel biomarkers for phenotypic traits in animal breeding. However, little is known about microRNA genetic variability in livestock species, partly because of incomplete data in genomic resource databases. Therefore, the aim of this study was to perform a genome-wide in silico screening of genomic sources and determine the genetic variability of microRNA genes in livestock species using miRNA SNiPer 3.0 (http://www.integratomics-time.com/miRNA-SNiPer/), a new version of our previously developed tool. By examining Ensembl and miRBase genome builds, it was possible to design a tool-generated search of 16 genomes, including four livestock species: pig, horse, cattle and chicken. The analysis revealed 65 polymorphisms located within mature microRNA regions in these four species, including 28% within the seed region in cattle and chicken. Polymorphic microRNA genes in cattle and chicken were further examined for mapping to quantitative trait loci regions associated with production and health traits. The developed bioinformatics tool enables the analysis of polymorphic microRNA genes and prioritization of potential regulatory polymorphisms and therefore contributes to the development of microRNA-based biomarkers in livestock species. The assembled catalog and the developed tool can serve the animal science community to efficiently select microRNA SNPs for further quantitative and molecular genetic evaluations of their phenotypic effects and causal associations with livestock production traits.
Green, Pamela J.
MicroRNAs (miRNAs) contribute to the control of numerous biological processes through the regulation of specific target mRNAs. Although the identities of these targets are essential to elucidate miRNA function, the targets are much more difficult to identify than the small RNAs themselves. Before this work, we pioneered the genome-wide identification of the targets of Arabidopsis miRNAs using an approach called PARE (German et al., Nature Biotech. 2008; Nature Protocols, 2009). Under this project, we applied PARE to Brachypodium distachyon (Brachypodium), a model plant in the Poaceae family, which includes the major food grain and bioenergy crops. Through in-depth global analysis and examination of specific examples, this research greatly expanded our knowledge of miRNAs and target RNAs of Brachypodium. New regulation in response to environmental stress or tissue type was found, and many new miRNAs were discovered. More than 260 targets of new and known miRNAs with PARE sequences at the precise sites of miRNA-guided cleavage were identified and characterized. Combining PARE data with the small RNA data also identified the miRNAs responsible for initiating approximately 500 phased loci, including one of the novel miRNAs. PARE analysis also revealed that differentially expressed miRNAs in the same family guide specific target RNA cleavage in a correspondingly tissue-preferential manner. The project included generation of small RNA and PARE resources for bioenergy crops, to facilitate ongoing discovery of conserved miRNA-target RNA regulation. By associating specific miRNA-target RNA pairs with known physiological functions, the research provides insights about gene regulation in different tissues and in response to environmental stress. This, and release of new PARE and small RNA data sets should contribute basic knowledge to enhance breeding and may suggest new strategies for improvement of biomass energy crops.
Jin, Y; Zhou, T; Geng, X; Liu, S; Chen, A; Yao, J; Jiang, C; Tan, S; Su, B; Liu, Z
Heat tolerance is a complex and economically important trait for catfish genetic breeding programs. With global climate change, it is becoming an increasingly important trait. To better understand the molecular basis of heat stress, a genome-wide association study (GWAS) was carried out using the 250 K catfish SNP array with interspecific backcross progenies, which were derived from crossing female channel catfish with male F1 hybrid catfish (female channel catfish × male blue catfish). Three significantly associated SNPs were detected by performing an EMMAX approach for GWAS. The SNP located on linkage group 14 explained 12.1% of phenotypic variation. The other two SNPs, located on linkage group 16, explained 11.3 and 11.5% of phenotypic variation, respectively. A total of 14 genes with heat stress related functions were detected within the significantly associated regions. Among them, five genes (TRAF2, FBXW5, ANAPC2, UBR1 and KLHL29) have known functions in the protein degradation process through the ubiquitination pathway. Other genes related to heat stress include genes involved in protein biosynthesis (PRPF4 and SYNCRIP), protein folding (DNAJC25), molecule and iron transport (SLC25A46 and CLIC5), cytoskeletal reorganization (COL12A1) and energy metabolism (COX7A2, PLCB1 and PLCB4) processes. The results provide fundamental information about genes and pathways that is useful for further investigation into the molecular mechanisms of heat stress. The associated SNPs could be promising candidates for selecting heat-tolerant catfish lines after validating their effects on larger and various catfish populations.
Farfan, Ivan D. Barrero; De La Fuente, Gerald N.; Murray, Seth C.; Isakeit, Thomas; Huang, Pei-Cheng; Warburton, Marilyn; Williams, Paul; Windham, Gary L.; Kolomiets, Mike
The primary maize (Zea mays L.) production areas are in temperate regions throughout the world and this is where most maize breeding is focused. Important but lower yielding maize growing regions such as the sub-tropics experience unique challenges, the greatest of which are drought stress and aflatoxin contamination. Here we used a diversity panel consisting of 346 maize inbred lines originating in temperate, sub-tropical and tropical areas testcrossed to stiff-stalk line Tx714 to investigate these traits. Testcross hybrids were evaluated under irrigated and non-irrigated trials for yield, plant height, ear height, days to anthesis, days to silking and other agronomic traits. Irrigated trials were also inoculated with Aspergillus flavus and evaluated for aflatoxin content. Diverse maize testcrosses out-yielded commercial checks in most trials, which indicated the potential for genetic diversity to improve sub-tropical breeding programs. To identify genomic regions associated with yield, aflatoxin resistance and other important agronomic traits, a genome wide association analysis was performed. Using 60,000 SNPs, this study found 10 quantitative trait variants for grain yield, plant and ear height, and flowering time after stringent multiple test corrections, and after fitting different models. Three of these variants explained 5–10% of the variation in grain yield under both water conditions. Multiple identified SNPs co-localized with previously reported QTL, which narrows the possible location of causal polymorphisms. Novel significant SNPs were also identified. This study demonstrated the potential to use genome wide association studies to identify major variants of quantitative and complex traits such as yield under drought that are still segregating between elite inbred lines. PMID:25714370
Miao, Xiangyang; Luo, Qingmiao; Qin, Xiaoyu
Goats are widely kept as livestock throughout the world. Two excellent domestic breeds in China, the Laiwu Black and Jining Grey goats, differ in fecundity and prolificacy. Although the goat genome sequence has been resolved recently, little is known about gene regulation at the transcriptional level in goats. To understand the molecular and genetic mechanisms related to fecundity and prolificacy, we performed genome-wide sequencing of the mRNAs from the two goat breeds using next-generation RNA-Seq technology and used functional annotation to identify pathways of interest. Digital gene expression analysis showed that 338 genes were up-regulated in the Jining Grey goats and 404 were up-regulated in the Laiwu Black goats. Quantitative real-time PCR verified the reliability of the RNA-Seq data. This study suggests that multiple genes responsible for various biological functions and signaling pathways are differentially expressed in the two goat breeds, and these genes might be involved in the regulation of goat fecundity and prolificacy. Taken together, our study provides insight into transcriptional regulation in the ovaries of the two goat breeds, which might serve as a key resource for understanding goat fecundity, prolificacy and genetic diversity between breeds.
Yu, Long-Xi; Zheng, Ping; Zhang, Tiejun; Rodringuez, Jonas; Main, Dorrie
Verticillium wilt (VW) is a fungal disease that causes severe yield losses in alfalfa. The most effective method to control the disease is through the development and use of resistant varieties. The identification of marker loci linked to VW resistance can facilitate breeding for disease-resistant alfalfa. In the present investigation, we applied an integrated framework of genome-wide association with genotyping-by-sequencing (GBS) to identify VW resistance loci in a panel of elite alfalfa breeding lines. Phenotyping was performed by manual inoculation of healthy seedlings with the pathogen, and scoring for disease resistance was carried out according to the standard test of the North American Alfalfa Improvement Conference (NAAIC). Marker-trait association by linkage disequilibrium identified 10 single nucleotide polymorphism (SNP) markers significantly associated with VW resistance. Alignment of the SNP marker sequences to the M. truncatula genome revealed multiple quantitative trait loci (QTLs). Three, two, one and five markers were located on chromosomes 5, 6, 7 and 8, respectively. Resistance loci found on chromosomes 7 and 8 in the present study co-localized with the QTLs reported previously. A pairwise alignment (blastn) using the flanking sequences of the resistance loci against the M. truncatula genome identified potential candidate genes with putative disease resistance function. With further investigation, these markers may be implemented into breeding programmes using marker-assisted selection, ultimately leading to improved VW resistance in alfalfa.
We generated 13,789 single nucleotide polymorphism (SNP) markers from 97 melon accessions using genotyping by sequencing and anchored them to chromosomes to understand genome-wide fixation index between various melon morphotypes and linkage disequilibrium (LD) decay for inodorus and cantalupensis, th...
Fulton, Janet E
Poultry breeding companies are facing a new paradigm. Since 2004, extensive resources have been developed to increase understanding of the fundamental biology of the chicken. The chicken genome has been sequenced and revised twice, millions of novel DNA variants have been identified, and new tools have been created that allow rapid and inexpensive detection of these DNA variations. These developments have led to the establishment of molecular-based breeding programs within major poultry breeding companies that are revolutionizing the primary poultry breeding industries. Costs of sequencing continue to drop and are predicted to eventually reach the point where it is feasible to sequence the entire genome of elite birds before selection. There are multiple challenges to be resolved before this information can be fully incorporated into a breeding program. These include handling and analyzing the extremely large data sets generated, understanding which genes, variants, or both are relevant for commercial production traits, development of new bio-informatic tools, and integration of molecular information with traditional breeding programs. The novel variation identified within elite commercial lines will lead to enhancements in commercial breeding programs. Applications of this information include whole genomic selection, parentage identification, trait association studies, and quality control.
Wang, Chao; Wang, Hongyang; Zhang, Yu; Tang, Zhonglin; Li, Kui; Liu, Bang
Pigs from Asia and Europe were independently domesticated from c. 9000 years ago. During this period, strong artificial selection has led to dramatic phenotypic changes in domestic pigs. However, the genetic basis underlying these morphological and behavioural adaptations is relatively unknown, particularly for indigenous Chinese pigs. Here, we performed a genome-wide analysis to screen 196 regions with selective sweep signals in Tongcheng pigs, which are a typical indigenous Chinese breed. Genes located in these regions have been found to be involved in lipid metabolism, melanocyte differentiation, neural development and other biological processes, which coincide with the evolutionary phenotypic changes in this breed. A synonymous substitution, c.669T>C, in ESR1, which colocalizes with a major quantitative trait locus for litter size, shows extreme differences in allele frequency between Tongcheng pigs and wild boars. Notably, the variant C allele in this locus exhibits high allele frequency in most Chinese populations, suggesting a consequence of positive selection. Five genes (PRM1, PRM2, TNP2, GPR149 and JMJD1C) related to reproductive traits were found to have high haplotype similarity in Chinese breeds. Two selected genes, MITF and EDNRB, are implied to shape the two-end black colour trait in Tongcheng pig. Subsequent SNP microarray studies of five Chinese white-spotted breeds displayed a concordant signature at both loci, suggesting that these two genes are responsible for colour variations in Chinese breeds. Utilizing massively parallel sequencing, we characterized the candidate sites that adapt to artificial and environmental selections during the Chinese pig domestication. This study provides fundamental proof for further research on the evolutionary adaptation of Chinese pigs.
Background In contrast to international pig breeds, the Iberian breed has not been admixed with Asian germplasm. This makes it an important model to study both domestication and relevance of Asian genes in the pig. Besides, Iberian pigs exhibit high meat quality as well as appetite and propensity to obesity. Here we provide a genome wide analysis of nucleotide and structural diversity in a reduced representation library from a pool (n = 9 sows) and shotgun genomic sequence from a single sow of the highly inbred Guadyerbas strain. In the pool, we applied newly developed tools to account for the peculiarities of these data. Results A total of 254,106 SNPs in the pool (79.6 Mb covered) and 643,783 in the Guadyerbas sow (1.47 Gb covered) were called. The nucleotide diversity (1.31 × 10−3 per bp in autosomes) is very similar to that reported in wild boar. A much lower than expected diversity in the X chromosome was confirmed (1.79 × 10−4 per bp in the individual and 5.83 × 10−4 per bp in the pool). A strong (0.70) correlation between recombination and variability was observed, but not with gene density or GC content. Multicopy regions affected about 4% of annotated pig genes in their entirety, and 2% of the genes partially. Genes within the lowest variability windows comprised interferon genes and, in chromosome X, genes involved in behavior like HTR2C or MECP2. A modified Hudson-Kreitman-Aguadé test for pools also indicated an accelerated evolution in genes involved in behavior, as well as in spermatogenesis and in lipid metabolism. Conclusions This work illustrates the strength of current sequencing technologies to picture a comprehensive landscape of variability in livestock species, and to pinpoint regions containing genes potentially under selection. Among those genes, we report genes involved in behavior, including feeding behavior, and lipid metabolism. The pig X chromosome is an outlier in terms of nucleotide diversity, which suggests selective constraints. Our data
Red meat from Bos taurus and Bos indicus breeds is an important source of nutrients for humans, and intramuscular fat (IMF) influences its flavor and nutritional value and impacts human health. Human consumption of fat that contains high levels of monounsaturated fatty acids (MUFA) can reduce the conce...
Slovak, Radka; Göschl, Christian; Seren, Ümit; Busch, Wolfgang
Genome-wide association (GWA) mapping is a powerful technique to address the molecular basis of genotype to phenotype relationships and to map regulators of biological processes. This chapter presents a protocol for genome-wide association mapping in Arabidopsis thaliana using the user-friendly internet application GWAPP, and provides a specific protocol for acquiring root trait data suitable for GWA studies using the semi-automated, high-throughput phenotyping pipeline BRAT for early root growth.
Kertai, Miklos D; Li, Yi-Ju; Li, Yen-Wei; Ji, Yunqi; Alexander, John; Newman, Mark F; Smith, Peter K; Joseph, Diane; Mathew, Joseph P
Objectives Identifying patient subpopulations susceptible to myocardial infarction (MI) or, conversely, those that display intrinsic cardioprotective phenotypes or respond strongly to protective interventions remains a high-priority knowledge gap. We sought to identify novel common genetic variants associated with perioperative MI in patients undergoing coronary artery bypass grafting using genome-wide association methodology. Setting 107 secondary and tertiary cardiac surgery centres across the USA. Participants We conducted a stage I genome-wide association study (GWAS) in 1433 ethnically diverse patients of both genders (112 cases/1321 controls) from the Genetics of Myocardial Adverse Outcomes and Graft Failure (GeneMAGIC) study, and a stage II analysis in an expanded population of 2055 patients (225 cases/1830 controls) combined from the GeneMAGIC and Duke Perioperative Genetics and Safety Outcomes (PEGASUS) studies. Patients undergoing primary non-emergent coronary bypass grafting were included. Primary and secondary outcome measures The primary outcome variable was perioperative MI, defined as creatine kinase MB isoenzyme (CK-MB) values ≥10× upper limit of normal during the first postoperative day, and not attributable to preoperative MI. Secondary outcomes included postoperative CK-MB as a quantitative trait, or a dichotomised phenotype based on extreme quartiles of the CK-MB distribution. Results Following quality control and adjustment for clinical covariates, we identified 521 single nucleotide polymorphisms in the stage I GWAS analysis. Among these, 8 common variants in 3 genes or intergenic regions met p<10−5 in stage II. A secondary analysis using CK-MB as a quantitative trait (minimum p=1.26×10−3 for rs609418), or a dichotomised phenotype based on extreme CK-MB values (minimum p=7.72×10−6 for rs4834703) supported these findings. Pathway analysis revealed that genes harbouring top-scoring variants cluster in pathways of
Zhang, Lei; Choi, Hyung Jin; Estrada, Karol; Leo, Paul J.; Li, Jian; Pei, Yu-Fang; Zhang, Yinping; Lin, Yong; Shen, Hui; Liu, Yao-Zhong; Liu, Yongjun; Zhao, Yingchun; Zhang, Ji-Gang; Tian, Qing; Wang, Yu-ping; Han, Yingying; Ran, Shu; Hai, Rong; Zhu, Xue-Zhen; Wu, Shuyan; Yan, Han; Liu, Xiaogang; Yang, Tie-Lin; Guo, Yan; Zhang, Feng; Guo, Yan-fang; Chen, Yuan; Chen, Xiangding; Tan, Lijun; Zhang, Lishu; Deng, Fei-Yan; Deng, Hongyi; Rivadeneira, Fernando; Duncan, Emma L; Lee, Jong Young; Han, Bok Ghee; Cho, Nam H.; Nicholson, Geoffrey C.; McCloskey, Eugene; Eastell, Richard; Prince, Richard L.; Eisman, John A.; Jones, Graeme; Reid, Ian R.; Sambrook, Philip N.; Dennison, Elaine M.; Danoy, Patrick; Yerges-Armstrong, Laura M.; Streeten, Elizabeth A.; Hu, Tian; Xiang, Shuanglin; Papasian, Christopher J.; Brown, Matthew A.; Shin, Chan Soo; Uitterlinden, André G.; Deng, Hong-Wen
Aiming to identify novel genetic variants and to confirm previously identified genetic variants associated with bone mineral density (BMD), we conducted a three-stage genome-wide association (GWA) meta-analysis in 27 061 study subjects. Stage 1 meta-analyzed seven GWA samples and 11 140 subjects for BMDs at the lumbar spine, hip and femoral neck, followed by a Stage 2 in silico replication of 33 SNPs in 9258 subjects, and by a Stage 3 de novo validation of three SNPs in 6663 subjects. Combining evidence from all the stages, we have identified two novel loci that have not been reported previously at the genome-wide significance (GWS; 5.0 × 10−8) level: 14q24.2 (rs227425, P-value 3.98 × 10−13, SMOC1) in the combined sample of males and females and 21q22.13 (rs170183, P-value 4.15 × 10−9, CLDN14) in the female-specific sample. The two newly identified SNPs were also significant in the GEnetic Factors for OSteoporosis consortium (GEFOS, n = 32 960) summary results. We have also independently confirmed 13 previously reported loci at the GWS level: 1p36.12 (ZBTB40), 1p31.3 (GPR177), 4p16.3 (FGFRL1), 4q22.1 (MEPE), 5q14.3 (MEF2C), 6q25.1 (C6orf97, ESR1), 7q21.3 (FLJ42280, SHFM1), 7q31.31 (FAM3C, WNT16), 8q24.12 (TNFRSF11B), 11p15.3 (SOX6), 11q13.4 (LRP5), 13q14.11 (AKAP11) and 16q24 (FOXL1). Gene expression analysis in osteogenic cells implied potential functional association of the two candidate genes (SMOC1 and CLDN14) in bone metabolism. Our findings independently confirm previously identified biological pathways underlying bone metabolism and contribute to the discovery of novel pathways, thus providing valuable insights into the intervention and treatment of osteoporosis. PMID:24249740
Shim, Unjin; Kim, Han-Na; Lee, Hyejin; Oh, Jee-Young
Background Polycystic ovary syndrome (PCOS) is one of the most common endocrine disorders in women of reproductive age, and it is affected by both environmental and genetic factors. Although the genetic component of PCOS is evident, studies aiming to identify susceptibility genes have shown controversial results. This study conducted a pathway-based analysis using a dataset obtained through a genome-wide association study (GWAS) to elucidate the biological pathways that contribute to PCOS susceptibility and the associated genes. Methods We used GWAS data on 636,797 autosomal single nucleotide polymorphisms (SNPs) from 1,221 individuals (432 PCOS patients and 789 controls) for analysis. A pathway analysis was conducted using meta-analysis gene-set enrichment of variant associations (MAGENTA). Top-ranking pathways or gene sets associated with PCOS were identified, and significant genes within the pathways were analyzed. Results The pathway analysis of the GWAS dataset identified significant pathways related to oocyte meiosis and the regulation of insulin secretion by acetylcholine and free fatty acids (all nominal gene-set enrichment analysis (GSEA) P-values < 0.05). In addition, INS, GNAQ, STXBP1, PLCB3, PLCB2, SMC3 and PLCZ1 were significant genes observed within the biological pathways (all gene P-values < 0.05). Conclusions By applying MAGENTA pathway analysis to PCOS GWAS data, we identified significant pathways and candidate genes involved in PCOS. Our findings may provide new leads for understanding the mechanisms underlying the development of PCOS. PMID:26308735
van Leeuwen, Elisabeth M.; Smouter, Françoise A. S.; Kam-Thong, Tony; Karbalai, Nazanin; Smith, Albert V.; Harris, Tamara B.; Launer, Lenore J.; Sitlani, Colleen M.; Li, Guo; Brody, Jennifer A.; Bis, Joshua C.; White, Charles C.; Jaiswal, Alok; Oostra, Ben A.; Hofman, Albert; Rivadeneira, Fernando; Uitterlinden, Andre G.; Boerwinkle, Eric; Ballantyne, Christie M.; Gudnason, Vilmundur; Psaty, Bruce M.; Cupples, L. Adrienne; Järvelin, Marjo-Riitta; Ripatti, Samuli; Isaacs, Aaron; Müller-Myhsok, Bertram; Karssen, Lennart C.; van Duijn, Cornelia M.
Genome-wide association studies (GWAS) have revealed 74 single nucleotide polymorphisms (SNPs) associated with high-density lipoprotein cholesterol (HDL) blood levels. This study is, to our knowledge, the first genome-wide interaction study (GWIS) to identify SNP×SNP interactions associated with HDL levels. We performed a GWIS in the Rotterdam Study (RS) cohort I (RS-I) using the GLIDE tool, which leverages the massively parallel computing power of Graphics Processing Units (GPUs) to perform linear regression on all genome-wide pairs of SNPs. By performing a meta-analysis together with Rotterdam Study cohorts II and III (RS-II and RS-III), we were able to filter 181 interaction terms with a p-value < 1 × 10−8 that replicated in the two independent cohorts. We were not able to replicate any of these interaction terms in the AGES, ARIC, CHS, ERF, FHS and NFBC-66 cohorts (Ntotal = 30,011) when adjusting for multiple testing. Our GWIS resulted in the consistent finding of a possible interaction between rs774801 in ARMC8 (ENSG00000114098) and rs12442098 in SPATA8 (ENSG00000185594) being associated with HDL levels. However, the p-values do not reach the preset Bonferroni-corrected threshold. Our study suggests that even for highly genetically determined traits such as HDL the sample sizes needed to detect SNP×SNP interactions are large and the 2-step filtering approaches do not yield a solution. Here we present our analysis plan and our reservations concerning GWIS. PMID:25329471
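The multiple-testing burden that this abstract refers to can be illustrated with a minimal sketch. It counts all unordered SNP pairs and derives the corresponding Bonferroni threshold. The SNP count (500,000) and the family-wise alpha (0.05) are hypothetical values chosen for illustration; the abstract does not report the number of SNPs tested or the threshold used, and the function name is ours, not part of GLIDE.

```python
from math import comb


def pairwise_bonferroni(n_snps: int, alpha: float = 0.05) -> tuple[int, float]:
    """Return (number of unordered SNP pairs, Bonferroni per-test threshold).

    n_snps and alpha are illustrative assumptions, not values from the study.
    """
    if n_snps < 2:
        raise ValueError("need at least two SNPs to form a pair")
    n_pairs = comb(n_snps, 2)       # n * (n - 1) / 2 unordered pairs
    return n_pairs, alpha / n_pairs  # Bonferroni: divide alpha by number of tests


# Hypothetical genome-wide panel of 500,000 SNPs.
n_pairs, threshold = pairwise_bonferroni(500_000)
print(f"{n_pairs:,} pairwise tests, per-test threshold {threshold:.2e}")
```

Under these assumed inputs the number of tests grows quadratically with the panel size, which is consistent with the abstract's remark that very large samples are needed to detect SNP×SNP interactions.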
The ability to predict individual breeding values in natural populations with known pedigrees has provided a powerful tool to separate phenotypic values into their genetic and environmental components in a nonexperimental setting. This has allowed sophisticated analyses of selection, as well as powerful tests of evolutionary change and differentiation. To date, there has, however, been no evaluation of the reliability or potential limitations of the approach. In this article, I address these gaps. In particular, I emphasize the differences between true and predicted breeding values (PBVs), which as yet have largely been ignored. These differences do, however, have important implications for the interpretation of, firstly, the relationship between PBVs and fitness, and secondly, patterns in PBVs over time. I subsequently present guidelines I believe to be essential in the formulation of the questions addressed in studies using PBVs, and I discuss possibilities for future research.
Background Subtilisin/kexin-like proprotein convertase (PCSK) enzymes have important regulatory functions in a wide variety of biological processes. PCSKs proteolytically process at a target sequence that contains basic amino acids arginine and lysine, which results in functional maturation of the target protein. In vitro assays have shown significant biochemical redundancy between the seven family members, but the phenotypes of PCSK deficient mice and patients carrying an inactive PCSK allele argue for a specific biological function. Modeling the structures of individual PCSK enzymes has offered little insight into the specificity determinants. However, previous studies have shown that there can be a coordinated expression between a PCSK and its target molecule. Here, we have surveyed the putative PCSK target proteins using genome-wide expression correlation analysis and cleavage site prediction algorithms. Results We first performed a gene expression correlation analysis over the whole genome for all PCSK enzymes. PCSKs were found to cluster differently based on the strength of correlations. The screen for putative PCSK target proteins showed a significant enrichment (p-values from 1.2e-4 to < 1.0e-10) of putative targets among the most positively correlating genes for most PCSKs. Interestingly, there was no enrichment in putative targets among the genes that correlated positively with the biologically redundant PCSK7, whereas PCSK5 showed an inverse correlation. PCSKs also showed a highly variable degree of shared target genes that were identified by expression correlation and cleavage site prediction. Multiple alignments were used to evaluate the putative targets to pinpoint the important residues for the substrate recognition. Finally, we validated our approach and identified biochemically PAPPA1 and ADAMTS6 as novel targets for FURIN proteolytic activity. Conclusions Most PCSK enzymes display strong positive expression correlation with predicted target
Aulchenko, Yurii S.; Kirichenko, Anatoly V.; Janssens, A. Cecile J. W.; Jansen, Ritsert C.; Gnewuch, Carsten; Domingues, Francisco S.; Pattaro, Cristian; Wild, Sarah H.; Jonasson, Inger; Polasek, Ozren; Zorkoltseva, Irina V.; Hofman, Albert; Karssen, Lennart C.; Struchalin, Maksim; Floyd, James; Igl, Wilmar; Biloglav, Zrinka; Broer, Linda; Pfeufer, Arne; Pichler, Irene; Campbell, Susan; Zaboli, Ghazal; Kolcic, Ivana; Rivadeneira, Fernando; Huffman, Jennifer; Hastie, Nicholas D.; Uitterlinden, Andre; Franke, Lude; Franklin, Christopher S.; Vitart, Veronique; Nelson, Christopher P.; Preuss, Michael; Bis, Joshua C.; O'Donnell, Christopher J.; Franceschini, Nora; Witteman, Jacqueline C. M.; Axenovich, Tatiana; Oostra, Ben A.; Meitinger, Thomas; Hicks, Andrew A.; Hayward, Caroline; Wright, Alan F.; Gyllensten, Ulf; Campbell, Harry; Schmitz, Gerd
Phospho- and sphingolipids are crucial cellular and intracellular compounds. These lipids are required for active transport, a number of enzymatic processes, membrane formation, and cell signalling. Disruption of their metabolism leads to several diseases, with diverse neurological, psychiatric, and metabolic consequences. A large number of phospholipid and sphingolipid species can be detected and measured in human plasma. We conducted a meta-analysis of five European family-based genome-wide association studies (N = 4034) on plasma levels of 24 sphingomyelins (SPM), 9 ceramides (CER), 57 phosphatidylcholines (PC), 20 lysophosphatidylcholines (LPC), 27 phosphatidylethanolamines (PE), and 16 PE-based plasmalogens (PLPE), as well as their proportions in each major class. This effort yielded 25 genome-wide significant loci for phospholipids (smallest P-value = 9.88×10−204) and 10 loci for sphingolipids (smallest P-value = 3.10×10−57). After a correction for multiple comparisons (P-value<2.2×10−9), we observed four novel loci significantly associated with phospholipids (PAQR9, AGPAT1, PKD2L1, PDXDC1) and two with sphingolipids (PLD2 and APOE) explaining up to 3.1% of the variance. Further analysis of the top findings with respect to within class molar proportions uncovered three additional loci for phospholipids (PNLIPRP2, PCDH20, and ABDH3) suggesting their involvement in either fatty acid elongation/saturation processes or fatty acid specific turnover mechanisms. Among those, 14 loci (KCNH7, AGPAT1, PNLIPRP2, SYT9, FADS1-2-3, DLG2, APOA1, ELOVL2, CDK17, LIPC, PDXDC1, PLD2, LASS4, and APOE) mapped into the glycerophospholipid and 12 loci (ILKAP, ITGA9, AGPAT1, FADS1-2-3, APOA1, PCDH20, LIPC, PDXDC1, SGPP1, APOE, LASS4, and PLD2) to the sphingolipid pathways. In large meta-analyses, associations between FADS1-2-3 and carotid intima media thickness, AGPAT1 and type 2 diabetes, and APOA1 and coronary artery disease were observed. In conclusion, our
Lange, Leslie; Demerath, Ellen W.; Palmas, Walter; Wojczynski, Mary K.; Ellis, Jaclyn C.; Vitolins, Mara Z.; Liu, Simin; Papanicolaou, George J.; Irvin, Marguerite R.; Xue, Luting; Griffin, Paula J.; Nalls, Michael A.; Adeyemo, Adebowale; Liu, Jiankang; Li, Guo; Ruiz-Narvaez, Edward A.; Chen, Wei-Min; Chen, Fang; Henderson, Brian E.; Millikan, Robert C.; Ambrosone, Christine B.; Strom, Sara S.; Guo, Xiuqing; Andrews, Jeanette S.; Sun, Yan V.; Mosley, Thomas H.; Yanek, Lisa R.; Shriner, Daniel; Haritunians, Talin; Rotter, Jerome I.; Speliotes, Elizabeth K.; Smith, Megan; Rosenberg, Lynn; Mychaleckyj, Josyf; Nayak, Uma; Spruill, Ida; Garvey, W. Timothy; Pettaway, Curtis; Nyante, Sarah; Bandera, Elisa V.; Britton, Angela F.; Zonderman, Alan B.; Rasmussen-Torvik, Laura J.; Chen, Yii-Der Ida; Ding, Jingzhong; Lohman, Kurt; Kritchevsky, Stephen B.; Zhao, Wei; Peyser, Patricia A.; Kardia, Sharon L. R.; Kabagambe, Edmond; Broeckel, Ulrich; Chen, Guanjie; Zhou, Jie; Wassertheil-Smoller, Sylvia; Neuhouser, Marian L.; Rampersaud, Evadnie; Psaty, Bruce; Kooperberg, Charles; Manson, JoAnn E.; Kuller, Lewis H.; Ochs-Balcom, Heather M.; Johnson, Karen C.; Sucheston, Lara; Ordovas, Jose M.; Palmer, Julie R.; Haiman, Christopher A.; McKnight, Barbara; Howard, Barbara V.; Becker, Diane M.; Bielak, Lawrence F.; Liu, Yongmei; Allison, Matthew A.; Grant, Struan F. A.; Burke, Gregory L.; Patel, Sanjay R.; Schreiner, Pamela J.; Borecki, Ingrid B.; Evans, Michele K.; Taylor, Herman; Sale, Michele M.; Howard, Virginia; Carlson, Christopher S.; Rotimi, Charles N.; Cushman, Mary; Harris, Tamara B.; Reiner, Alexander P.; Cupples, L. Adrienne; North, Kari E.; Fox, Caroline S.
Central obesity, measured by waist circumference (WC) or waist-hip ratio (WHR), is a marker of body fat distribution. Although obesity disproportionately affects minority populations, few studies have conducted genome-wide association studies (GWAS) of fat distribution among those of predominantly African ancestry (AA). We performed GWAS of WC and WHR, adjusted and unadjusted for BMI, in up to 33,591 and 27,350 AA individuals, respectively. We identified loci associated with fat distribution in AA individuals using meta-analyses of GWA results for WC and WHR (stage 1). Overall, 25 SNPs with single genomic control (GC)-corrected p-values < 5.0×10−6 were followed up (stage 2) in AA with WC and with WHR. Additionally, we interrogated genomic regions of previously identified European ancestry (EA) WHR loci among AA. In joint analysis of association results including both stage 1 and stage 2 cohorts, 2 SNPs demonstrated association: rs2075064 at LHX2, p = 2.24×10−8 for WC-adjusted-for-BMI, and rs6931262 at RREB1, p = 2.48×10−8 for WHR-adjusted-for-BMI. However, neither signal was genome-wide significant after double GC-correction (LHX2: p = 6.5×10−8; RREB1: p = 5.7×10−8). Six of fourteen previously reported loci for waist in EA populations were significant (p<0.05 divided by the number of independent SNPs within the region) in AA studied here (TBX15-WARS2, GRB14, ADAMTS9, LY86, RSPO3, ITPR2-SSPN). Further, we observed associations with metabolic traits: rs13389219 at GRB14 associated with HDL-cholesterol, triglycerides, and fasting insulin, and rs13060013 at ADAMTS9 with HDL-cholesterol and fasting insulin. Finally, we observed nominal evidence for sexual dimorphism, with stronger results in AA women at the GRB14 locus (p for interaction = 0.02). In conclusion, we identified two suggestive loci associated with fat distribution in AA populations in addition to confirming 6 loci previously identified in populations of EA. These findings reinforce
Background Recent genome-wide association studies (GWAS) for asthma have been successful in identifying novel associations which have been well replicated. The aim of this study is to identify the genetic variants that influence predisposition towards asthma in an ethnic Chinese population in Singapore using a GWAS approach. Methods A two-stage GWAS was performed in case samples with allergic asthma, and in control samples without asthma and atopy. In the discovery stage, 490 case and 490 control samples were analysed by pooled genotyping. Significant associations from the first stage were evaluated in a replication cohort of 521 case and 524 control samples in the second stage. The same 980 samples used in the discovery phase were also individually genotyped for purposes of a combined analysis. An additional 1445 non-asthmatic atopic control samples were also genotyped. Results 19 promising SNPs which passed our genome-wide P value threshold of 5.52 × 10−8 were individually genotyped. In the combined analysis of 1011 case and 1014 control samples, SNP rs2941504 in PERLD1 on chromosome 17q12 was found to be significantly associated with asthma at the genotypic level (P = 1.48 × 10−6, ORAG = 0.526 (0.369-0.700), ORAA = 0.480 (0.361-0.639)) and at the allelic level (P = 9.56 × 10−6, OR = 0.745 (0.654-0.848)). These findings were replicated in 3 other asthma GWAS, thus validating our own results. Analysis against the atopy control samples suggested that the SNP was associated with allergic asthma and not with either the asthma or allergy components. Genotyping of additional SNPs in 100 kb flanking rs2941504 further confirmed that the association was indeed to PERLD1. PERLD1 is involved in the modification of the glycosylphosphatidylinositol anchors for cell surface markers such as CD48 and CD59 which are known to play multiple roles in T-cell activation and proliferation. Conclusions These findings reveal the association of a PERLD1 as a novel
Wang, Shengqin; Xu, Yuming; Lu, Zuhong
Growing evidence indicates that miRNA genes exist in the archaeal genome, though the functional role of such noncoding RNA remains unclear. Here, we integrated the phylogenetic information of available archaeal genomes to predict miRNA seeds (typically defined as nucleotides 2-8 of mature miRNAs) on the genomic scale. We found 2649 candidate seeds with a significant conservation signal. Eleven of 29 unique seeds from a previous study support our result (P value <0.01), which demonstrates that the pipeline is suitable for predicting experimentally detectable miRNA seeds. The statistical significance of the overlap between the detected archaeal seeds and known eukaryotic seeds suggests that miRNAs may have evolved before the divergence of these two domains of cellular life. In addition, miRNA targets are enriched for genes involved in transcriptional regulation, which is consistent with the situation in eukaryotes. Our research will enhance the regulatory network analysis in Archaea.
Leduc, Magalie S; Lyons, Malcolm; Darvishi, Katayoon; Walsh, Kenneth; Sheehan, Susan; Amend, Sarah; Cox, Allison; Orho-Melander, Marju; Kathiresan, Sekar; Paigen, Beverly; Korstanje, Ron
Genome-wide association (GWA) studies represent a powerful strategy for identifying susceptibility genes for complex diseases in human populations but results must be confirmed and replicated. Because of the close homology between mouse and human genomes, the mouse can be used to add evidence to genes suggested by human studies. We used the mouse quantitative trait loci (QTL) map to interpret results from a GWA study for genes associated with plasma HDL cholesterol levels. We first positioned single nucleotide polymorphisms (SNPs) from a human GWA study on the genomic map for mouse HDL QTL. We then used mouse bioinformatics, sequencing, and expression studies to add evidence for one well-known HDL gene (Abca1) and three newly identified genes (Galnt2, Wwox, and Cdh13), thus supporting the results of the human study. For GWA peaks that occur in human haplotype blocks with multiple genes, we examined the homologous regions in the mouse to prioritize the genes using expression, sequencing, and bioinformatics from the mouse model, showing that some genes were unlikely candidates and adding evidence for candidate genes Mvk and Mmab in one haplotype block and Fads1 and Fads2 in the second haplotype block. Our study highlights the value of mouse genetics for evaluating genes found in human GWA studies.
Gara, Sudheer Kumar; Wang, Yonghong; Patel, Dhaval; Liu-Chittenden, Yi; Jain, Meenu; Boufraqech, Myriem; Zhang, Lisa; Meltzer, Paul S; Kebebew, Electron
To gain insight into the pathogenesis of adrenocortical carcinoma (ACC) and whether there is progression from normal-to-adenoma-to-carcinoma, we performed genome-wide gene expression, gene methylation, microRNA expression and comparative genomic hybridization (CGH) analysis in human adrenocortical tissue (normal, adrenocortical adenomas and ACC) samples. A pairwise comparison of normal, adrenocortical adenoma and ACC gene expression profiles, requiring more than four-fold expression differences and an adjusted P-value < 0.05, revealed no major differences in normal versus adrenocortical adenoma, whereas there were 808 and 1085 dysregulated genes, respectively, in ACC versus adrenocortical adenoma and ACC versus normal. The majority of the dysregulated genes in ACC were downregulated. By integrating the CGH, gene methylation and expression profiles of potential miRNAs with the gene expression of dysregulated genes, we found more alterations in ACC versus normal than in ACC versus adrenocortical adenoma. Importantly, we identified several novel molecular pathways that are associated with dysregulated genes and further experimentally validated that oncostatin M signaling induces caspase 3-dependent apoptosis and suppresses cell proliferation. Finally, we propose that there is a higher number of genomic changes from normal-to-adenoma-to-carcinoma and identified oncostatin M signaling as a plausible druggable pathway for therapeutics.
Black, James R. M.; Clark, Simon J.
In recent years, genome-wide association studies (GWAS), which are able to analyze the contribution to disease of genetic variations that are common within a population, have attracted considerable investment. Despite identifying genetic variants for many conditions, they have been criticized for yielding data with minimal clinical utility. However, in this regard, age-related macular degeneration (AMD), the most common form of blindness in the Western world, is a striking exception. Through GWAS, common genetic variants at a number of loci have been discovered. Two loci in particular, including genes of the complement cascade on chromosome 1 and the ARMS2/HTRA1 genes on chromosome 10, have been shown to convey significantly increased susceptibility to developing AMD. Today, although it is possible to screen individuals for a genetic predisposition to the disease, effective interventional strategies for those at risk of developing AMD are scarce. Ongoing research in this area is nonetheless promising. After providing brief overviews of AMD and common disease genetics, we outline the main recent advances in the understanding of AMD, particularly those made through GWAS. Finally, the true merit of these findings and their current and potential translational value is examined. Genet Med 18 4, 283–289. PMID:26020418
Liu, Yuwei; Xie, Shaojun; Yu, Jingjuan
Lysine is one of the most limiting essential amino acids for humans and livestock. The nutritional value of maize (Zea mays L.) is reduced by its poor lysine content. To better understand the lysine biosynthesis pathway in maize seed, we conducted a genome-wide analysis of the genes involved in lysine biosynthesis. We identified lysine biosynthesis pathway genes (LBPGs) and investigated whether a diaminopimelate pathway variant exists in maize. We analyzed two genes encoding the key enzyme dihydrodipicolinate synthase, and determined that they contribute differently to lysine synthesis during maize seed development. A coexpression network of LBPGs was constructed using RNA-sequencing data from 21 developmental stages of B73 maize seed. We found a large set of genes encoding ribosomal proteins, elongation factors and zein proteins that were coexpressed with LBPGs. The coexpressed genes were enriched in cellular metabolism terms and protein related terms. A phylogenetic analysis of the LBPGs from different plant species revealed different relationships. Additionally, six transcription factor (TF) families containing 13 TFs were identified as the Hub TFs of the LBPGs modules. Several expression quantitative trait loci of LBPGs were also identified. Our results should help to elucidate the lysine biosynthesis pathway network in maize seed. PMID:26829553
Bradfield, Jonathan P; Taal, H Rob; Timpson, Nicholas J; Scherag, André; Lecoeur, Cecile; Warrington, Nicole M; Hypponen, Elina; Holst, Claus; Valcarcel, Beatriz; Thiering, Elisabeth; Salem, Rany M; Schumacher, Fredrick R; Cousminer, Diana L; Sleiman, Patrick M A; Zhao, Jianhua; Berkowitz, Robert I; Vimaleswaran, Karani S; Jarick, Ivonne; Pennell, Craig E; Evans, David M; St Pourcain, Beate; Berry, Diane J; Mook-Kanamori, Dennis O; Hofman, Albert; Rivadeneira, Fernando; Uitterlinden, André G; van Duijn, Cornelia M; van der Valk, Ralf J P; de Jongste, Johan C; Postma, Dirkje S; Boomsma, Dorret I; Gauderman, W James; Hassanein, Mohamed T; Lindgren, Cecilia M; Mägi, Reedik; Boreham, Colin A G; Neville, Charlotte E; Moreno, Luis A; Elliott, Paul; Pouta, Anneli; Hartikainen, Anna-Liisa; Li, Mingyao; Raitakari, Olli; Lehtimäki, Terho; Eriksson, Johan G; Palotie, Aarno; Dallongeville, Jean; Das, Shikta; Deloukas, Panos; McMahon, George; Ring, Susan M; Kemp, John P; Buxton, Jessica L; Blakemore, Alexandra I F; Bustamante, Mariona; Guxens, Mònica; Hirschhorn, Joel N; Gillman, Matthew W; Kreiner-Møller, Eskil; Bisgaard, Hans; Gilliland, Frank D; Heinrich, Joachim; Wheeler, Eleanor; Barroso, Inês; O'Rahilly, Stephen; Meirhaeghe, Aline; Sørensen, Thorkild I A; Power, Chris; Palmer, Lyle J; Hinney, Anke; Widen, Elisabeth; Farooqi, I Sadaf; McCarthy, Mark I; Froguel, Philippe; Meyre, David; Hebebrand, Johannes; Jarvelin, Marjo-Riitta; Jaddoe, Vincent W V; Smith, George Davey; Hakonarson, Hakon; Grant, Struan F A
Multiple genetic variants have been associated with adult obesity and a few with severe obesity in childhood; however, less progress has been made in establishing genetic influences on common early-onset obesity. We performed a North American, Australian and European collaborative meta-analysis of 14 studies consisting of 5,530 cases (≥95th percentile of body mass index (BMI)) and 8,318 controls (<50th percentile of BMI) of European ancestry. Taking forward the eight newly discovered signals yielding association with P < 5 × 10−6 in nine independent data sets (2,818 cases and 4,083 controls), we observed two loci that yielded genome-wide significant combined P values near OLFM4 at 13q14 (rs9568856; P = 1.82 × 10−9; odds ratio (OR) = 1.22) and within HOXB5 at 17q21 (rs9299; P = 3.54 × 10−9; OR = 1.14). Both loci continued to show association when two extreme childhood obesity cohorts were included (2,214 cases and 2,674 controls). These two loci also yielded directionally consistent associations in a previous meta-analysis of adult BMI(1).
Fan, Huizhong; Wu, Yang; Zhou, Xiaojing; Xia, Jiangwei; Zhang, Wengang; Song, Yuxin; Liu, Fei; Chen, Yan; Zhang, Lupei; Gao, Xue; Gao, Huijiang; Li, Junya
Most single nucleotide polymorphisms (SNPs) detected by genome-wide association studies (GWAS) explain only a small fraction of phenotypic variation. Pathway-based GWAS were proposed to improve the proportion of genes for some human complex traits that could be explained by enriching a mass of SNPs within genetic groups. However, few attempts have been made to describe the quantitative traits in domestic animals. In this study, we used a dataset with approximately 7,700,000 SNPs from 807 Simmental cattle and analyzed live weight and longissimus muscle area using a modified pathway-based GWAS method that orthogonalises the highly linked SNPs within each gene using principal component analysis (PCA). Of the 262 biological pathways of cattle collected from the KEGG database, the gamma aminobutyric acid (GABA)ergic synapse pathway and the non-alcoholic fatty liver disease (NAFLD) pathway were significantly associated with the two traits analyzed. The GABAergic synapse pathway was biologically applicable to the traits analyzed because of its roles in feed intake and weight gain. The proposed method had high statistical power and a low false discovery rate compared with the smallest P-value and SNP set enrichment analysis methods. PMID:26672757
Mitha, Faheem; Herodotou, Herodotos; Borisov, Nedyalko; Jiang, Chen; Yoder, Josh; Owzar, Kouros
Background We describe SNPpy, a hybrid script database system using the Python SQLAlchemy library coupled with the PostgreSQL database to manage genotype data from Genome-Wide Association Studies (GWAS). This system makes it possible to merge study data with HapMap data and merge across studies for meta-analyses, including data filtering based on the values of phenotype and Single-Nucleotide Polymorphism (SNP) data. SNPpy and its dependencies are open source software. Results The current version of SNPpy offers utility functions to import genotype and annotation data from two commercial platforms. We use these to import data from two GWAS studies and the HapMap Project. We then export these individual datasets to standard data format files that can be imported into statistical software for downstream analyses. Conclusions By leveraging the power of relational databases, SNPpy offers integrated management and manipulation of genotype and phenotype data from GWAS studies. The analysis of these studies requires merging across GWAS datasets as well as patient and marker selection. To this end, SNPpy enables the user to filter the data and output the results as standardized GWAS file formats. It does low level and flexible data validation, including validation of patient data. SNPpy is a practical and extensible solution for investigators who seek to deploy central management of their GWAS data. PMID:22039405
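The kind of relational merge and filter this abstract describes can be sketched as follows. This is not SNPpy's schema or API: it uses Python's built-in sqlite3 module as a stand-in for SQLAlchemy with PostgreSQL, and the table names, column names, SNP identifiers and the helper `calls_for_snp` are all hypothetical.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with hypothetical subject and genotype tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE subject (id INTEGER PRIMARY KEY, study TEXT, phenotype REAL);
        CREATE TABLE genotype (subject_id INTEGER REFERENCES subject(id),
                               snp TEXT, alleles TEXT);
        """
    )
    # Two subjects from a study plus one reference-panel sample without a phenotype.
    conn.executemany(
        "INSERT INTO subject VALUES (?, ?, ?)",
        [(1, "study_a", 1.0), (2, "study_a", 0.0), (3, "reference_panel", None)],
    )
    conn.executemany(
        "INSERT INTO genotype VALUES (?, ?, ?)",
        [(1, "snp_1", "AG"), (2, "snp_1", "GG"), (3, "snp_1", "AA"), (1, "snp_2", "CC")],
    )
    return conn


def calls_for_snp(conn, snp, min_phenotype):
    """Join subjects to genotypes, filtering on both SNP and phenotype value."""
    cursor = conn.execute(
        "SELECT s.id, s.study, g.alleles FROM subject AS s "
        "JOIN genotype AS g ON g.subject_id = s.id "
        "WHERE g.snp = ? AND s.phenotype >= ? ORDER BY s.id",
        (snp, min_phenotype),
    )
    return cursor.fetchall()


conn = build_demo_db()
print(calls_for_snp(conn, "snp_1", 0.5))
```

Keeping subjects and genotype calls in separate tables means that samples from different sources share one key, so patient and marker selection reduces to a single query. Rows with a missing phenotype drop out of the comparison automatically.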
Braggio, Esteban; Van Wier, Scott; Ojha, Juhi; McPhail, Ellen; Asmann, Yan W.; Egan, Jan; da Silva, Jackline Ayres; Schiff, David; Lopes, M Beatriz; Decker, Paul A; Valdez, Riccardo; Tibes, Raoul; Eckloff, Bruce; Witzig, Thomas E.; Stewart, A Keith; Fonseca, Rafael; O’Neill, Brian Patrick
Purpose Primary central nervous system lymphoma (PCNSL) is an aggressive non-Hodgkin lymphoma confined to the CNS. Whether there is a PCNSL-specific genomic signature and, if so, how it differs from systemic diffuse large B-cell lymphoma (DLBCL) is uncertain. Experimental design We performed a comprehensive genomic study of tumor samples from 19 immunocompetent PCNSL patients. Testing comprised array-comparative genomic hybridization and whole exome sequencing. Results Biallelic inactivation of TOX and PRKCD was recurrently found in PCNSL but not in systemic DLBCL, suggesting a specific role in PCNSL pathogenesis. Additionally, we found a high prevalence of MYD88 mutations (79%) and CDKN2A biallelic loss (60%). Several genes recurrently affected in PCNSL were common with systemic DLBCL, including loss of TNFAIP3, PRDM1, GNA13, TMEM30A, TBL1XR1, B2M, CD58, activating mutations of CD79B, CARD11 and translocations IgH-BCL6. Overall, BCR/TLR/NF-κB pathways were altered in >90% of PCNSL, highlighting their value for targeted therapeutic approaches. Furthermore, integrated analysis showed enrichment of pathways associated with immune response, proliferation, apoptosis, and lymphocyte differentiation. Conclusions In summary, genome-wide analysis uncovered novel recurrent alterations, including TOX and PRKCD, helping to differentiate PCNSL from systemic DLBCL and related lymphomas. PMID:25991819
Gacek, Katarzyna; Bayer, Philipp E.; Bartkowiak-Broda, Iwona; Szala, Laurencja; Bocianowski, Jan; Edwards, David; Batley, Jacqueline
Fatty acids and their composition in seeds determine oil value for nutritional or industrial purposes and also affect seed germination as well as seedling establishment. To better understand the genetic basis of seed fatty acid biosynthesis in oilseed rape (Brassica napus L.) we applied a genome-wide association study, using 91,205 single nucleotide polymorphisms (SNPs) characterized across a mapping population with high-resolution skim genotyping by sequencing (SkimGBS). We identified a cluster of loci on chromosome A05 associated with oleic and linoleic seed fatty acids. The delineated genomic region contained orthologs of the Arabidopsis thaliana genes known to play a role in regulation of seed fatty acid biosynthesis such as Fatty acyl-ACP thioesterase B (FATB) and Fatty Acid Desaturase (FAD5). This approach allowed us to identify potential functional genes regulating fatty acid composition in this important oil producing crop and demonstrates that this approach can be used as a powerful tool for dissecting complex traits for B. napus improvement programs. PMID:28163710
Bradfield, Jonathan P.; Taal, H. Rob; Timpson, Nicholas J.; Scherag, André; Lecoeur, Cecile; Warrington, Nicole M.; Hypponen, Elina; Holst, Claus; Valcarcel, Beatriz; Thiering, Elisabeth; Salem, Rany M.; Schumacher, Fredrick R.; Cousminer, Diana L.; Sleiman, Patrick M.A.; Zhao, Jianhua; Berkowitz, Robert I.; Vimaleswaran, Karani S.; Jarick, Ivonne; Pennell, Craig E.; Evans, David M.; St. Pourcain, Beate; Berry, Diane J.; Mook-Kanamori, Dennis O.; Hofman, Albert; Rivadeneira, Fernando; Uitterlinden, André G.; van Duijn, Cornelia M.; van der Valk, Ralf J.P.; de Jongste, Johan C.; Postma, Dirkje S.; Boomsma, Dorret I.; Gauderman, William J.; Hassanein, Mohamed T.; Lindgren, Cecilia M.; Mägi, Reedik; Boreham, Colin A.G.; Neville, Charlotte E.; Moreno, Luis A.; Elliott, Paul; Pouta, Anneli; Hartikainen, Anna-Liisa; Li, Mingyao; Raitakari, Olli; Lehtimäki, Terho; Eriksson, Johan G.; Palotie, Aarno; Dallongeville, Jean; Das, Shikta; Deloukas, Panos; McMahon, George; Ring, Susan M.; Kemp, John P.; Buxton, Jessica L.; Blakemore, Alexandra I.F.; Bustamante, Mariona; Guxens, Mònica; Hirschhorn, Joel N.; Gillman, Matthew W.; Kreiner-Møller, Eskil; Bisgaard, Hans; Gilliland, Frank D.; Heinrich, Joachim; Wheeler, Eleanor; Barroso, Inês; O'Rahilly, Stephen; Meirhaeghe, Aline; Sørensen, Thorkild I.A.; Power, Chris; Palmer, Lyle J.; Hinney, Anke; Widen, Elisabeth; Farooqi, I. Sadaf; McCarthy, Mark I.; Froguel, Philippe; Meyre, David; Hebebrand, Johannes; Jarvelin, Marjo-Riitta; Jaddoe, Vincent W.V.; Smith, George Davey; Hakonarson, Hakon; Grant, Struan F.A.
Multiple genetic variants have been associated with adult obesity and a few with severe obesity in childhood; however, less progress has been made to establish genetic influences on common early-onset obesity. We performed a North American-Australian-European collaborative meta-analysis of fourteen studies consisting of 5,530 cases (≥95th percentile of body mass index (BMI)) and 8,318 controls (<50th percentile of BMI) of European ancestry. Taking forward the eight novel signals yielding association with P < 5×10−6 into nine independent datasets (n = 2,818 cases and 4,083 controls), we observed two loci that yielded a genome-wide significant combined P-value, namely near OLFM4 on 13q14 (rs9568856; P = 1.82×10−9; OR = 1.22) and within HOXB5 on 17q21 (rs9299; P = 3.54×10−9; OR = 1.14). Both loci continued to show association when including two extreme childhood obesity cohorts (n = 2,214 cases and 2,674 controls). Finally, these two loci yielded directionally consistent associations in the GIANT meta-analysis of adult BMI (ref. 1). PMID:22484627
Thomassen, Mads; Skov, Vibe; Eiriksdottir, Freyja; Tan, Qihua; Jochumsen, Kirsten; Fritzner, Niels; Brusgaard, Klaus; Dahlgaard, Jesper; Kruse, Torben A.
The quality of DNA microarray based gene expression data relies on the reproducibility of several steps in a microarray experiment. We have developed a spotted genome wide microarray chip with oligonucleotides printed in duplicate in order to minimise undesirable biases, thereby optimising detection of true differential expression. The validation study design consisted of an assessment of the microarray chip performance using the MessageAmp and FairPlay labelling kits. Intraclass correlation coefficient (ICC) was used to demonstrate that MessageAmp was significantly more reproducible than FairPlay. Further examinations with MessageAmp revealed the applicability of the system. The linear range of the chips was three orders of magnitude, the precision was high, as 95% of measurements deviated less than 1.24-fold from the expected value, and the coefficient of variation for relative expression was 13.6%. Relative quantitation was more reproducible than absolute quantitation and substantial reduction of variance was attained with duplicate spotting. An analysis of variance (ANOVA) demonstrated no significant day-to-day variation.
Wu, Chen; Jin, Guangfu; Dai, Juncheng; Wang, Cheng; Hu, Lingmin; Gou, Jianwei; Qian, Chen; Bai, Jianling; Wu, Tangchun; Hu, Zhibin; Lin, Dongxin; Shen, Hongbing; Chen, Feng
Genome-wide association studies (GWAS) have identified a number of genetic variants associated with lung cancer risk. However, these loci explain only a small fraction of lung cancer heritability, and other variants with weak effects may be lost in the GWAS approach due to the stringent significance level after multiple comparison correction. In this study, in order to identify important pathways involved in lung carcinogenesis, we performed a two-stage pathway analysis in a GWAS of lung cancer in Han Chinese using the gene set enrichment analysis (GSEA) method. Predefined pathways from the BioCarta and KEGG databases were systematically evaluated in the Nanjing study (Discovery stage: 1,473 cases and 1,962 controls) and the suggestive pathways were further validated in the Beijing study (Replication stage: 858 cases and 1,115 controls). We found that four pathways (achPathway, metPathway, At1rPathway and rac1Pathway) were consistently significant in both studies, and the P values for the combined dataset were 0.012, 0.010, 0.022 and 0.005, respectively. These results were stable after sensitivity analysis based on gene definition and gene overlaps between pathways. These findings may provide new insights into the etiology of lung cancer. PMID:23469231
Ponsuksili, Siriluck; Reyer, Henry; Trakooljul, Nares; Murani, Eduard
Haematological traits are important traits that show associations with immune and metabolic status, as well as diseases in humans and animals. Mapping genome regions that affect the blood cell traits can contribute to the identification of genomic features useable as biomarkers for immune, disease and metabolic status. A genome-wide association study (GWAS) was conducted using PorcineSNP60 BeadChips. Single-marker and Bayesian multi-marker approaches were integrated to identify genomic regions and corresponding genes overlapping for both methods. GWAS was performed for haematological traits of 591 German Landrace pigs. Heritability estimates for haematological traits were medium to high. In total, 252 single SNPs associated with 12 haematological traits were identified (−log10 of p-value > 5). The Bayesian multi-marker approach revealed 102 QTL regions across the genome, indicated by 1-Mb windows with contribution to additive genetic variance above 0.5%. The integration of both methods resulted in 24 overlapping QTL regions. This study identified overlapping QTL regions from single- and multi-marker approaches for haematological traits. Identifying candidate genes that affect blood cell traits provides the first step towards the understanding of the molecular basis of haematological phenotypes. PMID:27434032
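The single-marker cut-off described above (−log10 of p-value > 5) and the use of 1-Mb windows can be illustrated with a minimal Python sketch. The function names and the SNP records are hypothetical and are not taken from the study. Counting significant SNPs per window is an illustrative simplification: the study's Bayesian approach ranks windows by their contribution to additive genetic variance, which is not reproduced here.

```python
import math
from collections import Counter

def neg_log10(p_value):
    """Return -log10(p), the scale used for the significance cut-off."""
    return -math.log10(p_value)

def significant_snps(snps, threshold=5.0):
    """Keep SNPs whose -log10(p) exceeds the threshold.

    snps: list of (chromosome, position_bp, p_value) tuples (hypothetical data).
    """
    return [s for s in snps if neg_log10(s[2]) > threshold]

def window_counts(snps, window_bp=1_000_000):
    """Count SNPs per non-overlapping window of window_bp base pairs."""
    return Counter((chrom, pos // window_bp) for chrom, pos, _ in snps)

# Hypothetical example records, not results from the study.
example = [
    ("SSC1", 1_500_000, 1e-6),
    ("SSC1", 1_900_000, 1e-7),
    ("SSC1", 2_100_000, 1e-4),
    ("SSC7", 2_100_000, 1e-8),
]
hits = significant_snps(example)
counts = window_counts(hits)
```

Windows that contain single-marker hits could then be compared with the windows flagged by a multi-marker method to find overlapping regions.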
Xiao, Zhengtao; Zou, Qin; Liu, Yu; Yang, Xuerui
The closely regulated process of mRNA translation is crucial for precise control of protein abundance and quality. Ribosome profiling, a combination of ribosome foot-printing and RNA deep sequencing, has been used in a large variety of studies to quantify genome-wide mRNA translation. Here, we developed Xtail, an analysis pipeline tailored for ribosome profiling data that comprehensively and accurately identifies differentially translated genes in pairwise comparisons. Applied to simulated and real datasets, Xtail exhibits high sensitivity with minimal false-positive rates, outperforming existing methods in the accuracy of quantifying differential translations. With published ribosome profiling datasets, Xtail not only reveals differentially translated genes that make biological sense, but also uncovers new events of differential translation in human cancer cells on mTOR signalling perturbation and in human primary macrophages on interferon gamma (IFN-γ) treatment. This demonstrates the value of Xtail in providing novel insights into the molecular mechanisms that involve translational dysregulations.
Suratanee, Apichat; Schaefer, Martin H.; Betts, Matthew J.; Soons, Zita; Mannsperger, Heiko; Harder, Nathalie; Oswald, Marcus; Gipp, Markus; Ramminger, Ellen; Marcus, Guillermo; Männer, Reinhard; Rohr, Karl; Wanker, Erich; Russell, Robert B.; Andrade-Navarro, Miguel A.; Eils, Roland; König, Rainer
Characterizing the activating and inhibiting effect of protein-protein interactions (PPI) is fundamental to gain insight into the complex signaling system of a human cell. A plethora of methods has been suggested to infer PPI from data on a large scale, but none of them is able to characterize the effect of this interaction. Here, we present a novel computational development that employs mitotic phenotypes of a genome-wide RNAi knockdown screen and enables identifying the activating and inhibiting effects of PPIs. As an example, we applied our technique to a knockdown screen of HeLa cells cultivated at standard conditions. Using a machine learning approach, we obtained high accuracy (82% AUC of the receiver operating characteristics) by cross-validation using 6,870 known activating and inhibiting PPIs as gold standard. We predicted de novo unknown activating and inhibiting effects for 1,954 PPIs in HeLa cells covering the ten major signaling pathways of the Kyoto Encyclopedia of Genes and Genomes, and made these predictions publicly available in a database. We finally demonstrate that the predicted effects can be used to cluster knockdown genes of similar biological processes in coherent subgroups. The characterization of the activating or inhibiting effect of individual PPIs opens up new perspectives for the interpretation of large datasets of PPIs and thus considerably increases the value of PPIs as an integrated resource for studying the detailed function of signaling pathways of the cellular system of interest. PMID:25255318
Legge, Sophie E; Hamshere, Marian L; Ripke, Stephan; Pardinas, Antonio F; Goldstein, Jacqueline I; Rees, Elliott; Richards, Alexander L; Leonenko, Ganna; Jorskog, L Fredrik; Chambert, Kimberly D; Collier, David A; Genovese, Giulio; Giegling, Ina; Holmans, Peter; Jonasdottir, Adalbjorg; Kirov, George; McCarroll, Steven A; MacCabe, James H; Mantripragada, Kiran; Moran, Jennifer L; Neale, Benjamin M; Stefansson, Hreinn; Rujescu, Dan; Daly, Mark J; Sullivan, Patrick F; Owen, Michael J; O’Donovan, Michael C; Walters, James T R
The antipsychotic clozapine is uniquely effective in the management of schizophrenia, but its use is limited by its potential to induce agranulocytosis. The causes of this, and of its precursor neutropenia, are largely unknown although genetic factors play an important role. We sought risk alleles for clozapine-associated neutropenia in a sample of 66 cases and 5583 clozapine-treated controls, through a genome-wide association study (GWAS), imputed HLA alleles, exome array, and copy number variation analyses. We then combined associated variants in a meta-analysis with data from the Clozapine-Induced Agranulocytosis Consortium (up to 163 cases and 7970 controls). In the largest combined sample to date, we identified a novel association with rs149104283 (OR=4.32, P=1.79×10−8), intronic to transcripts of SLCO1B3 and SLCO1B7, members of a family of hepatic transporter genes previously implicated in adverse drug reactions including simvastatin-induced myopathy and docetaxel-induced neutropenia. Exome array analysis identified gene-wide associations of uncommon non-synonymous variants within UBAP2 and STARD9. We additionally provide independent replication of a previously identified variant in HLA-DQB1 (OR=15.6, P = 0.015, positive predictive value = 35.1%). These results implicate biological pathways through which clozapine may act to cause this serious adverse effect. PMID:27400856
Suhre, Karsten; Wallaschofski, Henri; Raffler, Johannes; Friedrich, Nele; Haring, Robin; Michael, Kathrin; Wasner, Christina; Krebs, Alexander; Kronenberg, Florian; Chang, David; Meisinger, Christa; Wichmann, H-Erich; Hoffmann, Wolfgang; Völzke, Henry; Völker, Uwe; Teumer, Alexander; Biffar, Reiner; Kocher, Thomas; Felix, Stephan B; Illig, Thomas; Kroemer, Heyo K; Gieger, Christian; Römisch-Margl, Werner; Nauck, Matthias
We present a genome-wide association study of metabolic traits in human urine, designed to investigate the detoxification capacity of the human body. Using NMR spectroscopy, we tested for associations between 59 metabolites in urine from 862 male participants in the population-based SHIP study. We replicated the results using 1,039 additional samples of the same study, including a 5-year follow-up, and 992 samples from the independent KORA study. We report five loci with joint P values of association from 3.2 × 10−19 to 2.1 × 10−182. Variants at three of these loci have previously been linked with important clinical outcomes: SLC7A9 is a risk locus for chronic kidney disease, NAT2 for coronary artery disease and genotype-dependent response to drug toxicity, and SLC6A20 for iminoglycinuria. Moreover, we identify rs37369 in AGXT2 as the genetic basis of hyper-β-aminoisobutyric aciduria.
Himes, Blanca E.; Hunninghake, Gary M.; Baurley, James W.; Rafaels, Nicholas M.; Sleiman, Patrick; Strachan, David P.; Wilk, Jemma B.; Willis-Owen, Saffron A.G.; Klanderman, Barbara; Lasky-Su, Jessica; Lazarus, Ross; Murphy, Amy J.; Soto-Quiros, Manuel E.; Avila, Lydiana; Beaty, Terri; Mathias, Rasika A.; Ruczinski, Ingo; Barnes, Kathleen C.; Celedón, Juan C.; Cookson, William O.C.; Gauderman, W. James; Gilliland, Frank D.; Hakonarson, Hakon; Lange, Christoph; Moffatt, Miriam F.; O'Connor, George T.; Raby, Benjamin A.; Silverman, Edwin K.; Weiss, Scott T.
Asthma, a chronic airway disease with known heritability, affects more than 300 million people around the world. A genome-wide association (GWA) study of asthma with 359 cases from the Childhood Asthma Management Program (CAMP) and 846 genetically matched controls from the Illumina ICONdb public resource was performed. The strongest region of association seen was on chromosome 5q12 in PDE4D. The phosphodiesterase 4D, cAMP-specific (phosphodiesterase E3 dunce homolog, Drosophila) gene (PDE4D) is a regulator of airway smooth-muscle contractility, and PDE4 inhibitors have been developed as medications for asthma. Allelic p values for top SNPs in this region were 4.3 × 10−07 for rs1588265 and 9.7 × 10−07 for rs1544791. Replications were investigated in ten independent populations with different ethnicities, study designs, and definitions of asthma. In seven white and Hispanic replication populations, two PDE4D SNPs had significant results with p values less than 0.05, and five had results in the same direction as the original population but had p values greater than 0.05. Combined p values for 18,891 white and Hispanic individuals (4,342 cases) in our replication populations were 4.1 × 10−04 for rs1588265 and 9.2 × 10−04 for rs1544791. In three black replication populations, which had different linkage disequilibrium patterns than the other populations, original findings were not replicated. Further study of PDE4D variants might lead to improved understanding of the role of PDE4D in asthma pathophysiology and the efficacy of PDE4 inhibitor medications. PMID:19426955
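The abstract above reports combined p values across replication populations but does not say how they were computed. As a hedged illustration only, the sketch below uses Fisher's method, one common way to combine independent p-values; it is an assumption and not necessarily the approach the authors used. The function name and the inputs are hypothetical.

```python
import math

def fisher_combined_p(p_values):
    """Combine independent p-values with Fisher's method.

    The statistic X = -2 * sum(ln p_i) follows a chi-square distribution
    with 2k degrees of freedom under the null. For even degrees of freedom
    the survival function has a closed form, so no SciPy is needed:
    P(X > x) = exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    """
    k = len(p_values)
    half_stat = -sum(math.log(p) for p in p_values)  # this is X / 2
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half_stat / i  # builds (x/2)^i / i! incrementally
        total += term
    return math.exp(-half_stat) * total

# Hypothetical per-population p-values, not values from the study.
combined = fisher_combined_p([0.04, 0.20, 0.03])
```

Fisher's method ignores effect direction and sample size, so a weighted approach such as an inverse-variance meta-analysis is often preferred when effect estimates are available.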
Sun, Chengming; Wang, Benqi; Yan, Lei; Hu, Kaining; Liu, Sheng; Zhou, Yongming; Guan, Chunyun; Zhang, Zhenqian; Li, Jiana; Zhang, Jiefu; Chen, Song; Wen, Jing; Ma, Chaozhi; Tu, Jinxing; Shen, Jinxiong; Fu, Tingdong; Yi, Bin
Plant height is a key morphological trait of rapeseed. In this study, we measured plant height of a rapeseed population across six environments. This population contains 476 inbred lines representing the major Chinese rapeseed genepool and 44 lines from other countries. The 60K Brassica Infinium® SNP array was utilized to genotype the association panel. A genome-wide association study (GWAS) was performed via three methods, including a robust, novel, nonparametric Anderson–Darling (A–D) test. Consequently, 68 loci were identified as significantly associated with plant height (P < 5.22 × 10−5), and more than 70% of the loci (48) overlapped the confidence intervals of reported QTLs from nine mapping populations. Moreover, 24 GWAS loci were detected with selective sweep signals, which reflected the signatures of historical semi-dwarf breeding. In the linkage disequilibrium (LD) decay range up- and downstream of 65 loci (r2 > 0.1), we found plausible candidates orthologous to the documented Arabidopsis genes involved in height regulation. One significant association found by GWAS colocalized with the established height locus BnRGA in rapeseed. Our results provide insights into the genetic basis of plant height in rapeseed and may facilitate marker-based breeding. PMID:27512396
Hu, Haixiao; Schrag, Tobias A; Peis, Regina; Unterseer, Sandra; Schipprack, Wolfgang; Chen, Shaojiang; Lai, Jinsheng; Yan, Jianbing; Prasanna, Boddupalli M; Nair, Sudha K; Chaikam, Vijay; Rotarenco, Valeriu; Shatskaya, Olga A; Zavalishina, Alexandra; Scholten, Stefan; Schön, Chris-Carolin; Melchinger, Albrecht E
In vivo haploid induction (HI) triggered by pollination with special intraspecific genotypes, called inducers, is unique to Zea mays L. within the plant kingdom and has revolutionized maize breeding during the last decade. However, the molecular mechanisms underlying HI in maize are still unclear. To investigate the genetic basis of HI, we developed a new approach for genome-wide association studies (GWAS), termed conditional haplotype extension (CHE) test, that allows detection of selective sweeps even under almost perfect confounding of population structure and trait expression. Here, we applied this test to identify genomic regions required for HI expression and dissected the combined support interval (50.34 Mb) of the QTL qhir1, detected in a previous study, into two closely linked genomic segments relevant for HI expression. The first, termed qhir11 (0.54 Mb), comprises an already fine-mapped region but was not diagnostic for differentiating inducers and noninducers. The second segment, termed qhir12 (3.97 Mb), had a haplotype allele common to all 53 inducer lines but not found in any of the 1482 noninducers. By comparing resequencing data of one inducer with 14 noninducers, we detected in the qhir12 region three candidate genes involved in DNA or amino acid binding; however, none were detected for qhir11. We propose that the CHE test can be utilized in introgression breeding and different fields of genetics to detect selective sweeps in heterogeneous genetic backgrounds.
Zila, Charles T.; Samayoa, L. Fernando; Santiago, Rogelio; Butrón, Ana; Holland, James B.
Fusarium ear rot is a common disease of maize that affects food and feed quality globally. Resistance to the disease is highly quantitative, and maize breeders have difficulty incorporating polygenic resistance alleles from unadapted donor sources into elite breeding populations without having a negative impact on agronomic performance. Identification of specific allele variants contributing to improved resistance may be useful to breeders by allowing selection of resistance alleles in coupling phase linkage with favorable agronomic characteristics. We report the results of a genome-wide association study to detect allele variants associated with increased resistance to Fusarium ear rot in a maize core diversity panel of 267 inbred lines evaluated in two sets of environments. We performed association tests with 47,445 single-nucleotide polymorphisms (SNPs) while controlling for background genomic relationships with a mixed model and identified three marker loci significantly associated with disease resistance in at least one subset of environments. Each associated SNP locus had a relatively small additive effect on disease resistance (±1.1% on a 0–100% scale), but nevertheless was associated with 3 to 12% of the genotypic variation within or across environment subsets. Two of the three identified SNPs colocalized with genes that have been implicated in programmed cell death. An analysis of associated allele frequencies within the major maize subpopulations revealed enrichment for resistance alleles in the tropical/subtropical and popcorn subpopulations compared with other temperate breeding pools. PMID:24048647
Wei, Yunxie; Hu, Wei; Xia, Feiyu; Zeng, Hongqiu; Li, Xiaolin; Yan, Yu; He, Chaozu; Shi, Haitao
Banana (Musa acuminata) is one of the most popular fresh fruits. However, the rapid spread of the fungal pathogen Fusarium oxysporum f. sp. cubense (Foc) in tropical areas has severely affected banana growth and production. Thus, it is very important to identify candidate genes involved in the banana response to abiotic stress and pathogen infection, as well as their molecular mechanisms and possible utilization for genetic breeding. Heat stress transcription factors (Hsfs) are widely known for their common involvement in various abiotic stresses and plant-pathogen interactions. However, no MaHsf has yet been identified in banana, and the possible roles of these genes are unknown. In this study, genome-wide identification and further analyses of evolution, gene structure and conserved motifs showed that MaHsfs within each subgroup are closely related. The comprehensive expression profiles of MaHsfs revealed tissue- and developmental stage-specific or -dependent expression, as well as abiotic and biotic stress-responsive expression. The common regulation of several MaHsfs by abiotic and biotic stress indicated their possible roles in plant stress responses. Taken together, this study extended our understanding of the MaHsf gene family and identified some candidate MaHsfs with specific expression profiles, which may be used as potential candidates for genetic breeding in banana. PMID:27857174
Chasman, Daniel I.; Fuchsberger, Christian; Pattaro, Cristian; Teumer, Alexander; Böger, Carsten A.; Endlich, Karlhans; Olden, Matthias; Chen, Ming-Huei; Tin, Adrienne; Taliun, Daniel; Li, Man; Gao, Xiaoyi; Gorski, Mathias; Yang, Qiong; Hundertmark, Claudia; Foster, Meredith C.; O'Seaghdha, Conall M.; Glazer, Nicole; Isaacs, Aaron; Liu, Ching-Ti; Smith, Albert V.; O'Connell, Jeffrey R.; Struchalin, Maksim; Tanaka, Toshiko; Li, Guo; Johnson, Andrew D.; Gierman, Hinco J.; Feitosa, Mary F.; Hwang, Shih-Jen; Atkinson, Elizabeth J.; Lohman, Kurt; Cornelis, Marilyn C.; Johansson, Åsa; Tönjes, Anke; Dehghan, Abbas; Lambert, Jean-Charles; Holliday, Elizabeth G.; Sorice, Rossella; Kutalik, Zoltan; Lehtimäki, Terho; Esko, Tõnu; Deshmukh, Harshal; Ulivi, Sheila; Chu, Audrey Y.; Murgia, Federico; Trompet, Stella; Imboden, Medea; Coassin, Stefan; Pistis, Giorgio; Harris, Tamara B.; Launer, Lenore J.; Aspelund, Thor; Eiriksdottir, Gudny; Mitchell, Braxton D.; Boerwinkle, Eric; Schmidt, Helena; Cavalieri, Margherita; Rao, Madhumathi; Hu, Frank; Demirkan, Ayse; Oostra, Ben A.; de Andrade, Mariza; Turner, Stephen T.; Ding, Jingzhong; Andrews, Jeanette S.; Freedman, Barry I.; Giulianini, Franco; Koenig, Wolfgang; Illig, Thomas; Meisinger, Christa; Gieger, Christian; Zgaga, Lina; Zemunik, Tatijana; Boban, Mladen; Minelli, Cosetta; Wheeler, Heather E.; Igl, Wilmar; Zaboli, Ghazal; Wild, Sarah H.; Wright, Alan F.; Campbell, Harry; Ellinghaus, David; Nöthlings, Ute; Jacobs, Gunnar; Biffar, Reiner; Ernst, Florian; Homuth, Georg; Kroemer, Heyo K.; Nauck, Matthias; Stracke, Sylvia; Völker, Uwe; Völzke, Henry; Kovacs, Peter; Stumvoll, Michael; Mägi, Reedik; Hofman, Albert; Uitterlinden, Andre G.; Rivadeneira, Fernando; Aulchenko, Yurii S.; Polasek, Ozren; Hastie, Nick; Vitart, Veronique; Helmer, Catherine; Wang, Jie Jin; Stengel, Bénédicte; Ruggiero, Daniela; Bergmann, Sven; Kähönen, Mika; Viikari, Jorma; Nikopensius, Tiit; Province, Michael; Ketkar, Shamika; Colhoun, Helen; Doney, Alex; 
Robino, Antonietta; Krämer, Bernhard K.; Portas, Laura; Ford, Ian; Buckley, Brendan M.; Adam, Martin; Thun, Gian-Andri; Paulweber, Bernhard; Haun, Margot; Sala, Cinzia; Mitchell, Paul; Ciullo, Marina; Kim, Stuart K.; Vollenweider, Peter; Raitakari, Olli; Metspalu, Andres; Palmer, Colin; Gasparini, Paolo; Pirastu, Mario; Jukema, J. Wouter; Probst-Hensch, Nicole M.; Kronenberg, Florian; Toniolo, Daniela; Gudnason, Vilmundur; Shuldiner, Alan R.; Coresh, Josef; Schmidt, Reinhold; Ferrucci, Luigi; Siscovick, David S.; van Duijn, Cornelia M.; Borecki, Ingrid B.; Kardia, Sharon L.R.; Liu, Yongmei; Curhan, Gary C.; Rudan, Igor; Gyllensten, Ulf; Wilson, James F.; Franke, Andre; Pramstaller, Peter P.; Rettig, Rainer; Prokopenko, Inga; Witteman, Jacqueline; Hayward, Caroline; Ridker, Paul M; Parsa, Afshin; Bochud, Murielle; Heid, Iris M.; Kao, W.H. Linda; Fox, Caroline S.; Köttgen, Anna
In conducting genome-wide association studies (GWAS), analytical approaches leveraging biological information may further understanding of the pathophysiology of clinical traits. To discover novel associations with estimated glomerular filtration rate (eGFR), a measure of kidney function, we developed a strategy for integrating prior biological knowledge into the existing GWAS data for eGFR from the CKDGen Consortium. Our strategy focuses on single nucleotide polymorphisms (SNPs) in genes that are connected by functional evidence, determined by literature mining and gene ontology (GO) hierarchies, to genes near previously validated eGFR associations. It then requires association thresholds consistent with multiple testing, and finally evaluates novel candidates by independent replication. Among the samples of European ancestry, we identified a genome-wide significant SNP in FBXL20 (P = 5.6 × 10−9) in meta-analysis of all available data, and additional SNPs at the INHBC, LRP2, PLEKHA1, SLC3A2 and SLC7A6 genes meeting multiple-testing corrected significance for replication and overall P-values of 4.5 × 10−4–2.2 × 10−7. Neither the novel PLEKHA1 nor FBXL20 associations, both further supported by association with eGFR among African Americans and with transcript abundance, would have been implicated by eGFR candidate gene approaches. LRP2, encoding the megalin receptor, was identified through connection with the previously known eGFR gene DAB2 and extends understanding of the megalin system in kidney function. These findings highlight integration of existing genome-wide association data with independent biological knowledge to uncover novel candidate eGFR associations, including candidates lacking known connections to kidney-specific pathways. The strategy may also be applicable to other clinical phenotypes, although more testing will be needed to assess its potential for discovery in general. PMID:22962313
Perseguini, Juliana Morini Küpper Cardoso; Oblessuc, Paula Rodrigues; Rosa, João Ricardo Bachega Feijó; Gomes, Kleber Alves; Chiorato, Alisson Fernando; Carbonell, Sérgio Augusto Morais; Garcia, Antonio Augusto Franco; Vianello, Rosana Pereira; Benchimol-Reis, Luciana Lasry
The common bean (Phaseolus vulgaris L.) is the world’s most important legume for human consumption. Anthracnose (ANT; Colletotrichum lindemuthianum) and angular leaf spot (ALS; Pseudocercospora griseola) are complex diseases that cause major yield losses in common bean. Depending on the cultivar and environmental conditions, anthracnose and angular leaf spot infections can reduce crop yield drastically. This study aimed to estimate linkage disequilibrium levels and identify quantitative resistance loci (QRL) controlling resistance to both ANT and ALS diseases in 180 accessions of common bean using genome-wide association analysis. A randomized complete block design with four replicates was used for the ANT and ALS experiments, with four plants per genotype in each replicate. Association mapping analyses were performed for ANT and ALS using a mixed linear model approach implemented in TASSEL. A total of 17 and 11 statistically significant associations involving SSRs were detected for ANT and ALS resistance loci, respectively. Using SNPs, 21 and 17 statistically significant associations were obtained for ANT and ALS, respectively, providing more associations with this marker. The SSR-IAC167 and PvM95 markers, both located on chromosome Pv03, and the SNP scaffold00021_89379, were associated with both diseases. The other markers were distributed across the entire common bean genome, with chromosomes Pv03 and Pv08 showing the greatest number of loci associated with ANT resistance. The chromosome Pv04 was the most saturated one, with six markers associated with ALS resistance. The telomeric region of this chromosome showed four markers located between approximately 2.5 Mb and 4.4 Mb. Our results demonstrate the great potential of genome-wide association studies to identify QRLs related to ANT and ALS in common bean. The results indicate a quantitative and complex inheritance pattern for both diseases in common bean. Our findings will contribute to more
Li, Cong; Sun, Dongxiao; Zhang, Shengli; Wang, Sheng; Wu, Xiaoping; Zhang, Qin; Liu, Lin; Li, Yanhua; Qiao, Lv
Detecting genes associated with milk fat composition could provide valuable insights into the complex genetic networks of genes underlying variation in fatty acid synthesis and point towards opportunities for changing milk fat composition via selective breeding. In this study, we conducted a genome-wide association study (GWAS) for 22 milk fatty acids in 784 Chinese Holstein cows with the PLINK software. Genotypes were obtained with the Illumina BovineSNP50 BeadChip and a total of 40,604 informative, high-quality single nucleotide polymorphisms (SNPs) were used. In total, 83 genome-wide significant SNPs and 314 suggestive significant SNPs associated with 18 milk fatty acid traits were detected. Chromosome regions that affect milk fatty acid traits were mainly observed on BTA1, 2, 5, 6, 7, 9, 13, 14, 18, 19, 20, 21, 23, 26 and 27. Of these, 146 SNPs were associated with more than one milk fatty acid trait; most of the studied fatty acid traits were significantly associated with multiple SNPs, especially C18:0 (105 SNPs), C18 index (93 SNPs), and C14 index (84 SNPs). Several SNPs are close to or within the DGAT1, SCD1 and FASN genes, which are well known to affect milk composition traits of dairy cattle. Combined with the previously reported QTL regions and the biological functions of the genes, 20 novel promising candidates for C10:0, C12:0, C14:0, C14:1, C14 index, C18:0, C18:1n9c, C18 index, SFA, UFA and SFA/UFA were found, namely HTR1B, CPM, PRKG1, MINPP1, LIPJ, LIPK, EHHADH, MOGAT1, ECHS1, STAT1, SORBS1, NFKB2, AGPAT3, CHUK, OSBPL8, PRLR, IGF1R, ACSL3, GHR and OXCT1. Our findings provide a groundwork for unraveling the key genes and causal mutations affecting milk fatty acid traits in dairy cattle.
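The abstract above distinguishes genome-wide significant from suggestive SNPs without stating the thresholds. A common convention, assumed here purely for illustration, is a Bonferroni cut-off of 0.05/N for genome-wide significance and 1/N for suggestive significance, where N is the number of SNPs tested. The function names below are hypothetical, and the thresholds are not confirmed by the source.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test p-value cut-off that keeps the family-wise error rate at alpha."""
    return alpha / n_tests

def suggestive_threshold(n_tests):
    """Cut-off allowing one expected false positive per genome scan (assumed convention)."""
    return 1.0 / n_tests

def classify(p_value, n_tests, alpha=0.05):
    """Label a p-value under the assumed Bonferroni and suggestive cut-offs."""
    if p_value < bonferroni_threshold(alpha, n_tests):
        return "genome-wide significant"
    if p_value < suggestive_threshold(n_tests):
        return "suggestive"
    return "not significant"

N_SNPS = 40_604  # number of SNPs reported in the abstract
genome_wide = bonferroni_threshold(0.05, N_SNPS)  # roughly 1.23e-6
suggestive = suggestive_threshold(N_SNPS)         # roughly 2.46e-5
```

Bonferroni correction treats all SNPs as independent, so it is conservative when markers are in linkage disequilibrium.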
Zhang, Fan; Wu, Zhi-Chao; Wang, Ming-Ming; Zhang, Fan; Dingkuhn, Michael; Xu, Jian-Long; Zhou, Yong-Li; Li, Zhi-Kang
Bacterial blight, which is caused by Xanthomonas oryzae pv. oryzae (Xoo), is one of the most devastating rice diseases worldwide. The development and use of disease-resistant cultivars have been the most effective strategy to control bacterial blight. Identifying the genes mediating bacterial blight resistance is a prerequisite for breeding cultivars with broad-spectrum and durable resistance. We herein describe a genome-wide association study involving 172 diverse Oryza sativa ssp. indica accessions to identify loci influencing the resistance to representative strains of six Xoo races. Twelve resistance loci containing 121 significantly associated signals were identified using 317,894 single nucleotide polymorphisms, which explained 13.3-59.9% of the variability in lesion length caused by Xoo races P1, P6, and P9a. Two hotspot regions (L11 and L12) were located within or nearby two cloned R genes (xa25 and Xa26) and one fine-mapped R gene (Xa4). Our results confirmed the relatively high resolution of genome-wide association studies. Moreover, we detected novel significant associations on chromosomes 2, 3, and 6-10. Haplotype analyses of xa25, the Xa26 paralog (MRKc; LOC_Os11g47290), and a Xa4 candidate gene (LOC_11g46870) revealed differences in bacterial blight resistance among indica subgroups. These differences were responsible for the observed variations in lesion lengths resulting from infections by Xoo races P1 and P9a. Our findings may be relevant for future studies involving bacterial blight resistance gene cloning, and provide insights into the genetic basis for bacterial blight resistance in indica rice, which may be useful for knowledge-based crop improvement.
Mwadzingeni, Learnmore; Shimelis, Hussein; Rees, D. Jasper G.; Tsilo, Toi J.
This study determined the population structure and genome-wide marker-trait association of agronomic traits of wheat for drought-tolerance breeding. Ninety-three diverse bread wheat genotypes were genotyped using the Diversity Arrays Technology sequencing (DArTseq) protocol. The number of days-to-heading (DTH), number of days-to-maturity (DTM), plant height (PHT), spike length (SPL), number of kernels per spike (KPS), thousand kernel weight (TKW) and grain yield (GYLD), assessed under drought-stressed and non-stressed conditions, were considered for the study. Population structure analysis and genome-wide association mapping were undertaken based on 16,383 silico DArTs loci with < 10% missing data. The population evaluated was grouped into nine distinct genetic structures. Inter-chromosomal linkage disequilibrium showed the existence of linkage decay as physical distance increased. A total of 62 significant (P < 0.001) marker-trait associations (MTAs) were detected explaining more than 20% of the phenotypic variation observed under both drought-stressed and non-stressed conditions. Significant (P < 0.001) MTA event(s) were observed for DTH, PHT, SPL, SPS, and KPS; under both stressed and non-stressed conditions, while additional significant (P < 0.05) associations were observed for TKW, DTM and GYLD under non-stressed condition. The MTAs reported in this population could be useful to initiate marker-assisted selection (MAS) and targeted trait introgression of wheat under drought-stressed and non-stressed conditions, and for fine mapping and cloning of the underlying genes and QTL. PMID:28234945
Wang, Ming-Ming; Zhang, Fan; Dingkuhn, Michael; Xu, Jian-Long; Zhou, Yong-Li; Li, Zhi-Kang
Bacterial blight, which is caused by Xanthomonas oryzae pv. oryzae (Xoo), is one of the most devastating rice diseases worldwide. The development and use of disease-resistant cultivars have been the most effective strategy to control bacterial blight. Identifying the genes mediating bacterial blight resistance is a prerequisite for breeding cultivars with broad-spectrum and durable resistance. We herein describe a genome-wide association study involving 172 diverse Oryza sativa ssp. indica accessions to identify loci influencing the resistance to representative strains of six Xoo races. Twelve resistance loci containing 121 significantly associated signals were identified using 317,894 single nucleotide polymorphisms, which explained 13.3–59.9% of the variability in lesion length caused by Xoo races P1, P6, and P9a. Two hotspot regions (L11 and L12) were located within or near two cloned R genes (xa25 and Xa26) and one fine-mapped R gene (Xa4). Our results confirmed the relatively high resolution of genome-wide association studies. Moreover, we detected novel significant associations on chromosomes 2, 3, and 6–10. Haplotype analyses of xa25, the Xa26 paralog (MRKc; LOC_Os11g47290), and a Xa4 candidate gene (LOC_11g46870) revealed differences in bacterial blight resistance among indica subgroups. These differences were responsible for the observed variations in lesion lengths resulting from infections by Xoo races P1 and P9a. Our findings may be relevant for future studies involving bacterial blight resistance gene cloning, and provide insights into the genetic basis for bacterial blight resistance in indica rice, which may be useful for knowledge-based crop improvement. PMID:28355306
Estimating genetic parameters is an essential step in breeding by recurrent selection to maximize genetic gains over time. This study evaluated the effects of selection on genetic variation across two successive generations (Cycle 1 [C1] and Cycle 2 [C2]) of a Summer x Kanlow switchgrass (Panicum vi...
Amin, N; Schuur, M; Gusareva, E S; Isaacs, A; Aulchenko, Y S; Kirichenko, A V; Zorkoltseva, I V; Axenovich, T I; Oostra, B A; Janssens, A C J W; van Duijn, C M
The NEO-Five-Factor Inventory divides human personality traits into five dimensions: neuroticism, extraversion, openness, conscientiousness and agreeableness. In this study, we sought to identify regions harboring genes with large effects on the five NEO personality traits by performing genome-wide linkage analysis of individuals scoring in the extremes of these traits (>90th percentile). Affected-only linkage analysis was performed using an Illumina 6K linkage array in a family-based study, the Erasmus Rucphen Family study. We subsequently determined whether distinct, segregating haplotypes found with linkage analysis were associated with the trait of interest in the population. Finally, a dense single-nucleotide polymorphism genotyping array (Illumina 318K) was used to search for copy number variations (CNVs) in the associated regions. In the families with extreme phenotype scores, we found significant evidence of linkage for conscientiousness to 20p13 (rs1434789, log of odds (LOD)=5.86) and suggestive evidence of linkage (LOD >2.8) for neuroticism to 19q, 21q and 22q, extraversion to 1p, 1q, 9p and 12q, openness to 12q and 19q, and agreeableness to 2p, 6q, 17q and 21q. Further analysis determined haplotypes in 21q22 for neuroticism (P-values