Note: This page contains sample records for the topic snp frequency haplotype.
These samples are representative, but they are neither comprehensive nor the most current set.
Last update: November 12, 2013.

Malaria haplotype frequency estimation.  


We present a Bayesian approach for estimating the relative frequencies of multi-single nucleotide polymorphism (SNP) haplotypes in populations of the malaria parasite Plasmodium falciparum by using microarray SNP data from human blood samples. Each sample comes from a malaria patient and contains one or several parasite clones that may genetically differ. Samples containing multiple parasite clones with different genetic markers pose a special challenge. The situation is comparable with a polyploid organism. The data from each blood sample indicates whether the parasites in the blood carry a mutant or a wildtype allele at various selected genomic positions. If both mutant and wildtype alleles are detected at a given position in a multiply infected sample, the data indicates the presence of both alleles, but the ratio is unknown. Thus, the data only partially reveals which specific combinations of genetic markers (i.e. haplotypes across the examined SNPs) occur in distinct parasite clones. In addition, SNP data may contain errors at non-negligible rates. We use a multinomial mixture model with partially missing observations to represent this data and a Markov chain Monte Carlo method to estimate the haplotype frequencies in a population. Our approach addresses both challenges, multiple infections and data errors. Copyright © 2013 John Wiley & Sons, Ltd. PMID:23609602

Wigger, Leonore; Vogt, Julia E; Roth, Volker
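The ambiguity described in the abstract above can be illustrated with a minimal sketch. It is not the authors' multinomial mixture model or their MCMC sampler. It only enumerates which sets of parasite haplotypes are consistent with a per-SNP observation of wildtype, mutant, or both. The function names, the `W`/`M`/`B` coding, and the `max_clones` cap are assumptions made for this example.

```python
from itertools import combinations, product


def observe(clones):
    """Per-SNP observation for a set of clone haplotypes.

    Each haplotype is a tuple of 0 (wildtype) or 1 (mutant).
    Returns 'W', 'M', or 'B' (both alleles detected) for every SNP.
    """
    observation = []
    for alleles in zip(*clones):
        present = set(alleles)
        if len(present) == 2:
            observation.append("B")
        elif 1 in present:
            observation.append("M")
        else:
            observation.append("W")
    return tuple(observation)


def compatible_sets(observation, max_clones=3):
    """Enumerate all sets of distinct haplotypes that yield the observation."""
    n_snps = len(observation)
    haplotypes = list(product((0, 1), repeat=n_snps))
    result = []
    for k in range(1, max_clones + 1):
        for clones in combinations(haplotypes, k):
            if observe(clones) == observation:
                result.append(clones)
    return result


# Both alleles seen at both SNPs, assuming at most two clones in the sample.
double_mixed = compatible_sets(("B", "B"), max_clones=2)
# Wildtype at SNP 1 and mutant at SNP 2: a single haplotype explains this.
unambiguous = compatible_sets(("W", "M"), max_clones=2)
```

With two SNPs that both show mixed alleles, more than one haplotype set fits the data, which is why a statistical model is needed to estimate population frequencies.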



Inferring combined CNV/SNP haplotypes from genotype data  

PubMed Central

Motivation: Copy number variations (CNVs) are increasingly recognized as a substantial source of individual genetic variation, and hence there is a growing interest in investigating the evolutionary history of CNVs as well as their impact on complex disease susceptibility. CNV/SNP haplotypes are critical for this research, but although many methods have been proposed for inferring integer copy number, few have been designed for inferring CNV haplotypic phase and none of these are applicable at genome-wide scale. Here, we present a method for inferring missing CNV genotypes, predicting CNV allelic configuration and for inferring CNV haplotypic phase from SNP/CNV genotype data. Our method, implemented in the software polyHap v2.0, is based on a hidden Markov model, which models the joint haplotype structure between CNVs and SNPs. Thus, the haplotypic phases of CNVs and SNPs are inferred simultaneously. A sampling algorithm is employed to obtain a measure of confidence/credibility of each estimate. Results: We generated diploid phase-known CNV–SNP genotype datasets by pairing male X chromosome CNV–SNP haplotypes. We show that polyHap provides accurate estimates of missing CNV genotypes, allelic configuration and CNV haplotypic phase on these datasets. We applied our method to a non-simulated dataset: a region on Chromosome 2 encompassing a short deletion. The results confirm that polyHap's accuracy extends to real-life datasets. Availability: Our method is implemented in version 2.0 of the polyHap software package and can be downloaded from Contact: Supplementary information: Supplementary data are available at Bioinformatics online.

Su, Shu-Yi; Asher, Julian E.; Jarvelin, Marjo-Riita; Froguel, Phillipe; Blakemore, Alexandra I.F.; Balding, David J.; Coin, Lachlan J.M.



SNP Haplotype Mapping in a Small ALS Family  

PubMed Central

The identification of genes for monogenic disorders has proven to be highly effective for understanding disease mechanisms, pathways and gene function in humans. Nevertheless, while thousands of Mendelian disorders have not yet been mapped there has been a trend away from studying single-gene disorders. In part, this is due to the fact that many of the remaining single-gene families are not large enough to map the disease locus to a single site in the genome. New tools and approaches are needed to allow researchers to effectively tap into this genetic gold-mine. Towards this goal, we have used haploid cell lines to experimentally validate the use of high-density single nucleotide polymorphism (SNP) arrays to define genome-wide haplotypes and candidate regions, using a small amyotrophic lateral sclerosis (ALS) family as a prototype. Specifically, we used haploid-cell lines to determine if high-density SNP arrays accurately predict haplotypes across entire chromosomes and show that haplotype information significantly enhances the genetic information in small families. Panels of haploid-cell lines were generated and a 5 centimorgan (cM) short tandem repeat polymorphism (STRP) genome scan was performed. Experimentally derived haplotypes for entire chromosomes were used to directly identify regions of the genome identical-by-descent in 5 affected individuals. Comparisons between experimentally determined and in silico haplotypes predicted from SNP arrays demonstrate that SNP analysis of diploid DNA accurately predicted chromosomal haplotypes. These methods precisely identified 12 candidate intervals, which are shared by all 5 affected individuals. Our study illustrates how genetic information can be maximized using readily available tools as a first step in mapping single-gene disorders in small families.

Krueger, Katherine A. Dick; Tsuji, Shoji; Fukuda, Yoko; Takahashi, Yuji; Goto, Jun; Mitsui, Jun; Ishiura, Hiroyuki; Dalton, Joline C.; Miller, Michael B.; Day, John W.; Ranum, Laura P. W.



High-density SNP haplotyping suggests altered regulation of tau gene expression in progressive supranuclear palsy.  


Two extended haplotypes exist across the tau gene (H1 and H2), with H1 consistently associated with increased risk of progressive supranuclear palsy (PSP). Using 15 haplotype tagging SNPs (htSNPs), capturing >95% of MAPT haplotype diversity, we performed association analysis in a US sample of 274 predominantly pathologically confirmed PSP patients and 424 matched control individuals. We found that PSP risk is associated with one of two major ancestral H1 haplotypes, H1B, increasing from 14% in control individuals to 22% in PSP patients (P<0.001). In young PSP patients, the H1B risk could be localized to a 22 kb regulatory region in intron 0 (P<0.001) and could be fully explained by one SNP, htSNP167, creating an LBP-1c/LSF/CP2 site, shown to regulate the expression of genes in other neurodegenerative disorders. Luciferase reporter data indicated that the 182 bp conserved regulatory region, in which htSNP167 is located, is transcriptionally active with both alleles differentially influencing expression. Further, we replicated the htSNP167 association in a second, independently ascertained US PSP patient-control sample. However, the htSNP association showed that H1 risk alone could not explain the overall differences in H1 and H2 frequencies in PSP patients and control individuals. Thus, risk variants on different H1 htSNP haplotypes and protective variants on H2 contribute to population risk for PSP. PMID:16195395

Rademakers, Rosa; Melquist, Stacey; Cruts, Marc; Theuns, Jessie; Del-Favero, Jurgen; Poorkaj, Parvoneh; Baker, Matt; Sleegers, Kristel; Crook, Richard; De Pooter, Tim; Bel Kacem, Samira; Adamson, Jennifer; Van den Bossche, Dirk; Van den Broeck, Marleen; Gass, Jennifer; Corsmit, Ellen; De Rijk, Peter; Thomas, Natalie; Engelborghs, Sebastiaan; Heckman, Michael; Litvan, Irene; Crook, Julia; De Deyn, Peter P; Dickson, Dennis; Schellenberg, Gerard D; Van Broeckhoven, Christine; Hutton, Michael L



Grouping preprocess for haplotype inference from SNP and CNV data  

NASA Astrophysics Data System (ADS)

The method of statistical haplotype inference is an indispensable technique in the field of medical science. The authors previously reported Hardy-Weinberg equilibrium-based haplotype inference that could manage single nucleotide polymorphism (SNP) data. We recently extended the method to cover copy number variation (CNV) data. Haplotype inference from mixed data is important because SNPs and CNVs are occasionally in linkage disequilibrium. The idea underlying the proposed method is simple, but the algorithm for it needs to be quite elaborate to reduce the calculation cost. Consequently, we have focused on the details of the algorithm in this study. Although the main advantage of the method is accuracy, in that it does not use any approximation, its main disadvantage is still the calculation cost, which is sometimes intractable for large data sets with missing values.

Shindo, Hiroyuki; Chigira, Hiroshi; Nagaoka, Tomoyo; Kamatani, Naoyuki; Inoue, Masato



Defining multiple common "completely" conserved major histocompatibility complex SNP haplotypes  

PubMed Central

The availability of both HLA data and genotypes for thousands of SNPs across the major histocompatibility complex (MHC) in 1240 complete families of the Type 1 Diabetes Genetics Consortium allowed us to analyze the occurrence and extent of megabase contiguous identity for founder chromosomes from unrelated individuals. We identified 82 HLA-defined haplotype groups, and within these groups, megabase regions of SNP identity were readily apparent. The conserved chromosomes within the 82 haplotype groups comprise approximately one third of the founder chromosomes. It is currently unknown whether such frequent conservation for groups of unrelated individuals is specific to the MHC, or if initial binning by highly polymorphic HLA alleles facilitated detection of a more general phenomenon within the MHC. Such common identity, specifically across the MHC, impacts type 1 diabetes susceptibility and may impact transplantation between unrelated individuals.

Baschal, Erin E.; Aly, Theresa A.; Jasinski, Jean M.; Steck, Andrea K.; Noble, Janelle A.; Erlich, Henry A.; Eisenbarth, George S.



Estimating haplotype frequencies and standard errors for multiple single nucleotide polymorphisms  

Microsoft Academic Search

SUMMARY Estimating haplotype frequencies becomes increasingly important in the mapping of complex disease genes, as millions of single nucleotide polymorphisms (SNPs) are being identified and genotyped. When genotypes at multiple SNP loci are gathered from unrelated individuals, haplotype frequencies can be accurately estimated using expectation-maximization (EM) algorithms (Excoffier and Slatkin, 1995; Hawley and Kidd, 1995; Long et al., 1995), with
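The expectation-maximization idea mentioned in this record can be sketched as follows. This is a minimal illustration, not the procedure of the cited papers: it assumes biallelic SNPs, unrelated diploid individuals, Hardy-Weinberg proportions, and no missing data. Genotypes are coded as counts of allele 1 (0, 1, or 2) per SNP, and all function names are hypothetical.

```python
from collections import defaultdict
from itertools import product


def compatible_pairs(genotype):
    """All ordered haplotype pairs (h1, h2) with h1[i] + h2[i] == genotype[i]."""
    per_snp = {0: [(0, 0)], 1: [(0, 1), (1, 0)], 2: [(1, 1)]}
    pairs = []
    for combo in product(*(per_snp[g] for g in genotype)):
        h1 = tuple(a for a, _ in combo)
        h2 = tuple(b for _, b in combo)
        pairs.append((h1, h2))
    return pairs


def em_haplotype_frequencies(genotypes, n_iter=200):
    """Estimate haplotype frequencies from unphased genotypes by EM."""
    n_snps = len(genotypes[0])
    haplotypes = list(product((0, 1), repeat=n_snps))
    freq = {h: 1.0 / len(haplotypes) for h in haplotypes}  # uniform start
    for _ in range(n_iter):
        expected = defaultdict(float)
        # E-step: share each individual over its compatible phasings,
        # weighted by the current haplotype frequencies.
        for g in genotypes:
            pairs = compatible_pairs(g)
            weights = [freq[h1] * freq[h2] for h1, h2 in pairs]
            total = sum(weights)
            for (h1, h2), w in zip(pairs, weights):
                expected[h1] += w / total
                expected[h2] += w / total
        # M-step: renormalise expected counts over 2N chromosomes.
        freq = {h: expected[h] / (2 * len(genotypes)) for h in haplotypes}
    return freq


# Two 00/00 homozygotes, one 11/11 homozygote, one double heterozygote.
example = em_haplotype_frequencies([(0, 0), (0, 0), (2, 2), (1, 1)])
```

In this toy data set the double heterozygote is ambiguous on its own, and the homozygous individuals pull its phasing toward the two haplotypes already seen.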




Haplotype inference from unphased SNP data in heterozygous polyploids based on SAT  

PubMed Central

Background Haplotype inference based on unphased SNP markers is an important task in population genetics. Although there are different approaches to the inference of haplotypes in diploid species, the existing software is not suitable for inferring haplotypes from unphased SNP data in polyploid species, such as the cultivated potato (Solanum tuberosum). Potato species are tetraploid and highly heterozygous. Results Here we present the software SATlotyper which is able to handle polyploid and polyallelic data. SATlotyper uses the Boolean satisfiability problem to formulate Haplotype Inference by Pure Parsimony. The software excludes existing haplotype inferences, thus allowing for calculation of alternative inferences. As it is not known which of the multiple haplotype inferences are best supported by the given unphased data set, we use a bootstrapping procedure that allows for scoring of alternative inferences. Finally, by means of the bootstrapping scores, it is possible to optimise the phased genotypes belonging to a given haplotype inference. The program is evaluated with simulated and experimental SNP data generated for heterozygous tetraploid populations of potato. We show that, instead of taking the first haplotype inference reported by the program, we can significantly improve the quality of the final result by applying additional methods that include scoring of the alternative haplotype inferences and genotype optimisation. For a sub-population of nineteen individuals, the predicted results computed by SATlotyper were directly compared with results obtained by experimental haplotype inference via sequencing of cloned amplicons. Prediction and experiment gave similar results regarding the inferred haplotypes and phased genotypes. Conclusion Our results suggest that Haplotype Inference by Pure Parsimony can be solved efficiently by the SAT approach, even for data sets of unphased SNP from heterozygous polyploids.
SATlotyper is freeware and is distributed as a Java JAR file. The software can be downloaded from the webpage of the GABI Primary Database at . The application of SATlotyper will provide haplotype information, which can be used in haplotype association mapping studies of polyploid plants.

Neigenfind, Jost; Gyetvai, Gabor; Basekow, Rico; Diehl, Svenja; Achenbach, Ute; Gebhardt, Christiane; Selbig, Joachim; Kersten, Birgit
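The pure parsimony objective named in this abstract asks for the smallest set of haplotypes that can explain every observed genotype. The sketch below shows that objective by exhaustive search, under strong simplifying assumptions: diploid individuals and biallelic SNPs coded 0/1/2. It is not the SAT encoding used by SATlotyper, it does not handle polyploid or polyallelic data, and it is only feasible for a handful of SNPs. All names are hypothetical.

```python
from itertools import combinations, product


def explains(genotype, haplotypes):
    """True if some pair from `haplotypes` (repeats allowed) sums to the genotype."""
    return any(
        all(a + b == g for a, b, g in zip(h1, h2, genotype))
        for h1 in haplotypes
        for h2 in haplotypes
    )


def pure_parsimony(genotypes):
    """Return a smallest haplotype set that explains all genotypes (brute force)."""
    n_snps = len(genotypes[0])
    candidates = list(product((0, 1), repeat=n_snps))
    # Try set sizes in increasing order, so the first hit is minimal.
    for k in range(1, len(candidates) + 1):
        for haplotypes in combinations(candidates, k):
            if all(explains(g, haplotypes) for g in genotypes):
                return haplotypes
    return None


# Two homozygotes and one double heterozygote.
smallest = pure_parsimony([(0, 0), (2, 2), (1, 1)])
```

Several minimal sets can exist for the same data. The search above returns only the first one it finds, which is the situation the abstract addresses by scoring alternative inferences.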



Estimating Haplotype Frequency and Coverage of Databases  

PubMed Central

A variety of forensic, population, and disease studies are based on haploid DNA (e.g. mitochondrial DNA or Y-chromosome data). For any set of genetic markers databases of conventional size will normally contain only a fraction of all haplotypes. For several applications, reliable estimates of haplotype frequencies, the total number of haplotypes and coverage of the database (the probability that the next random haplotype is contained in the database) will be useful. We propose different approaches to the problem based on classical methods as well as new applications of Principal Component Analysis (PCA). We also discuss previous proposals based on saturation curves. Several conclusions can be inferred from simulated and real data. First, classical estimates of the fraction of unseen haplotypes can be seriously biased. Second, there is no obvious way to decide on required sample size based on traditional approaches. Methods based on testing of hypotheses or length of confidence intervals may appear artificial since no single test or parameter stands out as particularly relevant. Rather the coverage may be more relevant since it indicates the percentage of different haplotypes that are contained in a database; if the coverage is low, there is a considerable chance that the next haplotype to be observed does not appear in the database and this indicates that the database needs to be expanded. Finally, freeware and example data sets accompany the methods discussed in this paper:

Egeland, Thore; Salas, Antonio
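Coverage, as defined in the abstract above, is the probability that the next random haplotype is already in the database. One classical estimator of this quantity is the Good-Turing estimate, which uses the share of haplotypes seen exactly once. The abstract does not say which classical methods the authors evaluated, so the choice of Good-Turing here is an assumption, and the function name is hypothetical.

```python
from collections import Counter


def good_turing_coverage(haplotypes):
    """Estimate database coverage as 1 - (singletons / sample size)."""
    if not haplotypes:
        raise ValueError("at least one haplotype is required")
    counts = Counter(haplotypes)
    singletons = sum(1 for c in counts.values() if c == 1)
    return 1.0 - singletons / len(haplotypes)


# Toy database of seven haplotype labels, two of which occur only once.
database = ["A", "A", "B", "C", "C", "C", "D"]
coverage = good_turing_coverage(database)
```

A low value suggests that many haplotypes remain unseen. The abstract cautions that classical estimates of the unseen fraction can be seriously biased, so a figure like this is a rough guide at best.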



Detecting genome wide haplotype sharing using SNP or microsatellite haplotype data  

Microsoft Academic Search

Genome wide association studies using high throughput technology are already being conducted despite the significant hurdles that need to be overcome (Nat Rev Genet 6:95–108, 2005; Nat Rev Genet 6:109–118, 2005). Methods for detecting haplotype association signals in genome wide haplotype datasets are as yet very limited. Much methodological research has already been devoted to linkage disequilibrium (LD) fine mapping

Melanie Bahlo; Jim Stankovich; Terence P. Speed; Justin P. Rubio; Rachel K. Burfoot; Simon J. Foote



A common SNP haplotype provides molecular proof of a founder effect of Huntington disease linking two South African populations  

Microsoft Academic Search

This study involved the detailed investigation of the region surrounding the huntingtin gene in families with a history of Huntington Disease (HD) in South Africa. The primary aim was to investigate the origins of the HD mutation in South Africa by constructing a single-nucleotide polymorphism (SNP) haplotype around the HD gene and to determine how many haplotypes there are in

Janine Scholefield; Jacquie Greenberg



A new SNP haplotype associated with blue disease resistance gene in cotton (Gossypium hirsutum L.).  


Resistance to cotton blue disease (CBD) was evaluated in 364 F(2.3) families of three populations derived from resistant variety 'Delta Opal'. The CBD resistance in 'Delta Opal' was controlled by one single dominant gene designated Cbd. Two simple sequence repeat (SSR) markers were identified as linked to Cbd by bulked segregant analysis. Cbd resides at the telomere region of chromosome 10. SSR marker DC20027 was 0.75 cM away from Cbd. DC20027 marker fragments amplified from 3 diploid species and 13 cotton varieties whose CBD resistance was known were cloned and sequenced. One single nucleotide polymorphism (SNP) was identified at the 136th position by sequence alignment analysis. Screening SNP markers previously mapped on chromosome 10 identified an additional 3 SNP markers that were associated with Cbd. A strong association between a haplotype based on four SNP markers and Cbd was developed. This demonstrates one of the first examples in cotton where SNP markers were used to effectively tag a trait enabling marker-assisted selection for high levels of CBD resistance in breeding programs. PMID:19960336

Fang, David D; Xiao, Jinhua; Canci, Paulo C; Cantrell, Roy G



Common SNP-Based Haplotype Analysis of the 4p16.3 Huntington Disease Gene Region  

PubMed Central

Age at the onset of motor symptoms in Huntington disease (HD) is determined largely by the length of a CAG repeat expansion in HTT but is also influenced by other genetic factors. We tested whether common genetic variation near the mutation site is associated with differences in the distribution of expanded CAG alleles or age at the onset of motor symptoms. To define disease-associated single-nucleotide polymorphisms (SNPs), we compared 4p16.3 SNPs in HD subjects with population controls in a case:control strategy, which revealed that the strongest signals occurred at a great distance from the HD mutation as a result of “synthetic association” with SNP alleles that are of low frequency in population controls. Detailed analysis delineated a prominent ancestral haplotype that accounted for ~50% of HD chromosomes and extended to at least 938 kb on about half of these. Together, the seven most abundant haplotypes accounted for ~83% of HD chromosomes. Neither the extended shared haplotype nor the individual local HTT haplotypes were associated with altered CAG-repeat length distribution or residual age at the onset of motor symptoms, arguing against modification of these disease features by common cis-regulatory elements. Similarly, the 11 most frequent control haplotypes showed no trans-modifier effect on age at the onset of motor symptoms. Our results argue against common local regulatory variation as a factor influencing HD pathogenesis, suggesting that genetic modifiers be sought elsewhere in the genome. They also indicate that genome-wide association analysis with a small number of cases can be effective for regional localization of genetic defects, even when a founder effect accounts for only a fraction of the disorder.

Lee, Jong-Min; Gillis, Tammy; Mysore, Jayalakshmi Srinidhi; Ramos, Eliana Marisa; Myers, Richard H.; Hayden, Michael R.; Morrison, Patrick J.; Nance, Martha; Ross, Christopher A.; Margolis, Russell L.; Squitieri, Ferdinando; Griguoli, Annamaria; Di Donato, Stefano; Gomez-Tortosa, Estrella; Ayuso, Carmen; Suchowersky, Oksana; Trent, Ronald J.; McCusker, Elizabeth; Novelletto, Andrea; Frontali, Marina; Jones, Randi; Ashizawa, Tetsuo; Frank, Samuel; Saint-Hilaire, Marie-Helene; Hersch, Steven M.; Rosas, Herminia D.; Lucente, Diane; Harrison, Madaline B.; Zanko, Andrea; Abramson, Ruth K.; Marder, Karen; Sequeiros, Jorge; MacDonald, Marcy E.; Gusella, James F.



Single Nucleotide Differences (SNDs) in the dbSNP Database May Lead to Errors in Genotyping and Haplotyping Studies  

PubMed Central

The creation of single-nucleotide polymorphism (SNP) databases (such as NCBI dbSNP) has facilitated scientific research in many fields. SNP discovery and detection has improved to the extent that there are over 17 million human reference (rs) SNPs reported to date (Build 129 of dbSNP). SNP databases are unfortunately not always complete and/or accurate. In fact, half of the reported SNPs are still only candidate SNPs and are not validated in a population. We describe the identification of SNDs (Single Nucleotide Differences) in humans, that may contaminate the dbSNP database. These SNDs, reported as real SNPs in the database, do not exist as such, but are merely artifacts due to the presence of a paralogue (highly similar duplicated) sequence in the genome. Using sequencing we showed how SNDs could originate in two paralogous genes and evaluated samples from a population of 100 individuals for the presence/absence of SNPs. Moreover using bioinformatics, we predicted as many as 8.32% of the biallelic, coding SNPs in the dbSNP database to be SNDs. Our identification of SNDs in the database will allow researchers to not only select truly informative SNPs for association studies, but also aid in determining accurate SNP genotypes and haplotypes.

Musumeci, Lucia; Arthur, Jonathan W; Cheung, Florence SG; Hoque, Ashraful; Lippman, Scott; Reichardt, Juergen KV



Linked region detection using high-density SNP genotype data via the minimum recombinant model of pedigree haplotype inference  

PubMed Central

Background With the rapid development of high-throughput genotyping technologies, efficient methods for identifying linked regions using high-density SNP genotype data have become more and more important. Recently, a deterministic method that works very well on SNP genotyping data has been developed (Lin et al. Bioinformatics 2008, 24(1): 86–93). However, that program can only work on a limited number of family structures. In particular, the results (if any) will be poor when the genotype data for the whole chromosome of one of the parents in a nuclear family is missing. Results We have developed a software package (LIden) for identifying linked regions using high-density SNP genotype data. We focus on handling the case where the genotype data for the whole chromosome of one of the parents in a nuclear family is missing. We use the minimum recombinant model for haplotype inference in pedigrees. Several local optimization algorithms are used to infer the haplotype of each individual and determine the linked regions based on the inferred haplotype data. We have developed a more flexible method to combine nuclear families to further refine (reduce the length of) the linked regions. Conclusion Our new package (LIden) is efficient software for linked region detection using high-density SNP genotype data. LIden can handle some important cases where the existing programs do not work well. In particular, the new package can handle many cases where the genotype data of one of the two parents is missing for the entire chromosome. The running time of the program is O(mn), where m is the number of members in the family and n is the number of SNP sites in the chromosome. LIden is particularly suitable for handling large families. This research also demonstrates another practical use of the minimum recombinant model for haplotype inference in pedigrees. The software package can be downloaded at .

Wang, Lusheng; Wang, Zhanyong; Yang, Wanling



Molecular evidence of founder effects of fatal familial insomnia through SNP haplotypes around the D178N mutation  

Microsoft Academic Search

This work presents a detailed investigation of the genomic region surrounding the PRNP gene in a sample of patients diagnosed with fatal familial insomnia (FFI) from several European countries, notably Spain. The main focus of the study was to explore the origins of the chromosomes carrying the D178N mutation by designing a single-nucleotide polymorphism (SNP) haplotype around the PRNP gene.

Ana B. Rodríguez-Martínez; Miguel A. Alfonso-Sánchez; José A. Peña; Raquel Sánchez-Valle; Inga Zerr; Sabina Capellari; Miguel Calero; Juan J. Zarranz; Marian M. de Pancorbo



Molecular evidence of founder effects of fatal familial insomnia through SNP haplotypes around the D178N mutation.  


This work presents a detailed investigation of the genomic region surrounding the PRNP gene in a sample of patients diagnosed with fatal familial insomnia (FFI) from several European countries, notably Spain. The main focus of the study was to explore the origins of the chromosomes carrying the D178N mutation by designing a single-nucleotide polymorphism (SNP) haplotype around the PRNP gene. Haplotypes were constructed by genotyping six SNPs (rs2756271, rs13040327, rs6037932, rs13045348, rs6116474, and rs6116475) in 25 FFI patients from all over Spain. To augment the geographical scope of our study, 13 further FFI cases from Germany (9) and Italy (4) were also examined. Genotyping of SNPs in conjunction with the analysis of genealogical data for a group of FFI patients revealed the existence of two distinct haplotypes potentially associated with the D178N mutation. Of them, GCATTA-M proved to be the common haplotype of Spanish patients, whereas ACATTA-M was typical of the German cases. It is interesting to note that both haplotypes were identified in the Italian samples: GCATTA-M in a family from the Tuscany region and ACATTA-M in a family from the Veneto region. Our findings suggest the occurrence of two independent D178N-129M mutational events in Europe, preserved and transmitted from one generation to the next until nowadays. Likewise, results based on the analysis of SNP data indicate that previous hypotheses postulating that the D178N mutation had independent origins for each family and that its global distribution was determined by recurrent mutational events must be regarded with caution. PMID:18347820

Rodríguez-Martínez, Ana B; Alfonso-Sánchez, Miguel A; Peña, José A; Sánchez-Valle, Raquel; Zerr, Inga; Capellari, Sabina; Calero, Miguel; Zarranz, Juan J; de Pancorbo, Marian M



High-density SNP haplotyping suggests altered regulation of tau gene expression in progressive supranuclear palsy  

Microsoft Academic Search

Two extended haplotypes exist across the tau gene—H1 and H2—with H1 consistently associated with increased risk of progressive supranuclear palsy (PSP). Using 15 haplotype tagging SNPs (htSNPs), capturing >95% of MAPT haplotype diversity, we performed association analysis in a US sample of 274 predominantly pathologically confirmed PSP patients and 424 matched control individuals. We found that PSP risk is associated

Rosa Rademakers; Stacey Melquist; Marc Cruts; Jessie Theuns; Jurgen Del-Favero; Parvoneh Poorkaj; Matt Baker; Kristel Sleegers; Richard Crook; Tim De Pooter; Samira Bel Kacem; Jennifer Adamson; Dirk Van den Bossche; Marleen Van den Broeck; Jennifer Gass; Ellen Corsmit; Peter De Rijk; Natalie Thomas; Sebastiaan Engelborghs; Michael Heckman; Irene Litvan; Julia Crook; Peter P. De Deyn; Dennis Dickson; Gerard D. Schellenberg; Christine Van Broeckhoven; Michael L. Hutton



Rapid gene-based SNP and haplotype marker development in non-model eukaryotes using 3'UTR sequencing  

PubMed Central

Background Sweet cherry (Prunus avium L.), a non-model crop with narrow genetic diversity, is an important member of sub-family Amygdoloideae within Rosaceae. Compared to other important members like peach and apple, sweet cherry lacks in genetic and genomic information, impeding understanding of important biological processes and development of efficient breeding approaches. Availability of single nucleotide polymorphism (SNP)-based molecular markers can greatly benefit breeding efforts in such non-model species. RNA-seq approaches employing second generation sequencing platforms offer a unique avenue to rapidly identify gene-based SNPs. Additionally, haplotype markers can be rapidly generated from transcript-based SNPs since they have been found to be extremely utile in identification of genetic variants related to health, disease and response to environment as highlighted by the human HapMap project. Results RNA-seq was performed on two sweet cherry cultivars, Bing and Rainier using a 3' untranslated region (UTR) sequencing method yielding 43,396 assembled contigs. In order to test our approach of rapid identification of SNPs without any reference genome information, over 25% (10,100) of the contigs were screened for the SNPs. A total of 207 contigs from this set were identified to contain high quality SNPs. A set of 223 primer pairs were designed to amplify SNP containing regions from these contigs and high resolution melting (HRM) analysis was performed with eight important parental sweet cherry cultivars. Six of the parent cultivars were distantly related to Bing and Rainier, the cultivars used for initial SNP discovery. Further, HRM analysis was also performed on 13 seedlings derived from a cross between two of the parents. Our analysis resulted in the identification of 84 (38.7%) primer sets that demonstrated variation among the tested germplasm. 
Reassembly of the raw 3'UTR sequences using upgraded transcriptome assembly software yielded 34,620 contigs containing 2243 putative SNPs in 887 contigs after stringent filtering. Contigs with multiple SNPs were visually parsed to identify 685 putative haplotypes at 335 loci in 301 contigs. Conclusions This approach, which leverages the advantages of RNA-seq approaches, enabled rapid generation of gene-linked SNP and haplotype markers. The general approach presented in this study can be easily applied to other non-model eukaryotes irrespective of the ploidy level to identify gene-linked polymorphisms that are expected to facilitate efficient Gene Assisted Breeding (GAB), genotyping and population genetics studies. The identified SNP haplotypes reveal some of the allelic differences in the two sweet cherry cultivars analyzed. The identification of these SNP and haplotype markers is expected to significantly improve the genomic resources for sweet cherry and facilitate efficient GAB in this non-model crop.



The linkage method: a novel approach for SNP detection and haplotype reconstruction from a single diploid individual using next-generation sequence data.  


When we sequence a diploid individual, the output actually comprises two genomes: one from the paternal parent and the other from the maternal parent. In this study, we introduce a novel heuristic algorithm for distinguishing single-nucleotide polymorphisms (SNPs) from the two parents and phasing them into haplotypes. The algorithm is unique because it simultaneously performs SNP calling and haplotype phasing. This approach can exploit the linkage information of nearby SNPs, which facilitates the efficient removal of haplotypes that originate from incorrectly mapped short reads. Using simulated data we demonstrated that our approach increased the accuracy of SNP calls. The haplotype reconstruction performance depended largely on the density of SNPs. Using current next-generation sequence technology with a relatively short read length, reasonable performance is expected when this approach is applied to species with an average of five heterozygous sites per 1 kb. The algorithm was implemented as the program "linkSNPs." PMID:23728796

Sasaki, Eriko; Sugino, Ryuichi P; Innan, Hideki



SNP/haplotype associations in cytokine and cytokine receptor genes and immunity to rubella vaccine  

PubMed Central

An effective immune response to vaccination is, in part, a complex interaction of alleles of multiple genes regulating cytokine networks. We conducted a genotyping study of Th1/Th2/inflammatory cytokines/cytokine receptors in healthy children (n=738, 11–19 years) to determine associations between individual single-nucleotide polymorphisms (SNPs)/haplotypes and immune outcomes after two doses of rubella vaccine. SNPs (n=501) were selected using the ldSelect-approach and genotyped using Illumina GoldenGate™ and TaqMan assays. Rubella-IgG levels were measured by immunoassay and secreted cytokines by ELISA. Linear regression and post hoc haplotype analyses were used to determine associations between single SNPs/haplotypes and immune outcomes. Increased carriage of minor alleles for the promoter SNPs (rs2844482 and rs2857708) of the TNFA gene were associated with dose-related increases in rubella antibodies. IL-6 secretion was co-directionally associated (p≤0.01) with five intronic SNPs in the TNFRSF1B gene in an allele dose-related manner, while five promoter/intronic SNPs in the IL12B gene were associated with variations in IL-6 secretion. TNFA haplotype AAACGGGGC (t-statistic=3.32) and IL12B promoter haplotype TAG (t-statistic=2.66) were associated with higher levels of (p≤0.01) rubella-IgG and IL-6 secretion, respectively. We identified individual SNPs/haplotypes in TNFA/TNFRSF1B and IL12B genes that appear to modulate immunity to rubella vaccination. Identification of such “genetic fingerprints” may predict the outcome of vaccine response and inform new vaccine strategies.

Dhiman, Neelam; Haralambieva, Iana H.; Kennedy, Richard B.; Vierkant, Robert A.; O'Byrne, Megan M.; Ovsyannikova, Inna G.; Jacobson, Robert M.



SNP and haplotype analysis reveals new HFE variants associated with iron overload trait.  


Hereditary hemochromatosis is a common autosomal recessive disease characterized by progressive iron overload, and its prevalence correlates with the c.845G>A (p.C282Y) mutation of the HFE gene. Two other variants, c.187C>G and c.193A>T, are associated with a mild iron overload phenotype. Correlation studies have revealed incomplete penetrance of the HFE mutations, as well as the lack of mutation on some chromosomes from patients. We screened for SNPs before examining allele and haplotype association with elevated iron parameters. We confirmed that the c.845G>A mutation is in complete linkage disequilibrium with a unique haplotype, whereas two haplotypes, differing only at the g.4694C>G variation, accounted for 79.8% and 20.2% of the c.187G chromosomes. A greater prevalence of the g.4694G allele was observed among patients' chromosomes than among controls. In addition, among non-mutant chromosomes the analyses revealed a risk haplotype and a protective haplotype, and the g.4694G and the c.1007-47A alleles were associated with a higher risk of elevated iron parameters. We determined that the g.4694C allele was located within a putative hypoxia-response element; protein binding was observed and was reduced by the g.4694C>G change. In addition, IVS4 was spliced less efficiently in the c.1007-47A allele than in the c.1007-47G allele. PMID:21412944

Yang, Yizhen; Férec, Claude; Mura, Catherine



Experimental Generation of SNP Haplotype Signatures in Patients with Sickle Cell Anaemia  

Microsoft Academic Search

Background: Sickle cell anemia is caused by a single type of mutation, a homozygous A→T substitution in the β-globin gene. Clinical severity is diverse, partially due to additional, disease-modifying genetic factors. We are studying one such modifier locus, HMIP (HBS1L-MYB intergenic polymorphism, chromosome 6q23.3). Working with a genetically admixed patient population, we have encountered the necessity to generate haplotype signatures

Stephan Menzel; Jian Qin; Nisha Vasavda; Swee Lay Thein; Ramesh Ramakrishnan



SNP analyses of growth factor genes EGF, TGFβ-1, and HGF reveal haplotypic association of EGF with autism  

SciTech Connect

Autism is a pervasive neurodevelopmental disorder diagnosed in early childhood. Growth factors have been found to play a key role in the cellular differentiation and proliferation of the central and peripheral nervous systems. Epidermal growth factor (EGF) is detected in several regions of the developing and adult brain, where it enhances the differentiation, maturation, and survival of a variety of neurons. Transforming growth factor-β (TGFβ) isoforms play an important role in neuronal survival, and the hepatocyte growth factor (HGF) has been shown to exhibit neurotrophic activity. We examined the association of EGF, TGFβ1, and HGF genes with autism, in a trio association study, using DNA samples from families recruited to the Autism Genetic Resource Exchange; 252 trios with a male offspring scored for autism were selected for the study. Transmission disequilibrium test revealed significant haplotypic association of EGF with autism. No significant SNP or haplotypic associations were observed for TGFβ1 or HGF. Given the role of EGF in brain and neuronal development, we suggest a possible role of EGF in the pathogenesis of autism.

Toyoda, Takao; Thanseem, Ismail; Kawai, Masayoshi; Sekine, Yoshimoto [Department of Psychiatry and Neurology, Hamamatsu University School of Medicine, Hamamatsu 431-3192 (Japan)]; Nakamura, Kazuhiko; Anitha, Ayyappan; Suda, Shiro [Department of Psychiatry and Neurology, Hamamatsu University School of Medicine, Hamamatsu 431-3192 (Japan)]; Yamada, Kazuo [Laboratory of Molecular Psychiatry, RIKEN Brain Science Institute, Saitama (Japan)]; Tsujii, Masatsugu [Faculty of Sociology, Chukyo University, Toyota, Aichi (Japan); The Osaka-Hamamatsu Joint Research Center for Child Mental Development, Hamamatsu University School of Medicine, Hamamatsu (Japan)]; Iwayama, Yoshimi; Hattori, Eiji; Toyota, Tomoko; Yoshikawa, Takeo [Laboratory of Molecular Psychiatry, RIKEN Brain Science Institute, Saitama (Japan)]; Miyachi, Taishi; Tsuchiya, Kenji; Sugihara, Gen-ichi; Matsuzaki, Hideo [The Osaka-Hamamatsu Joint Research Center for Child Mental Development, Hamamatsu University School of Medicine, Hamamatsu (Japan)]; Iwata, Yasuhide; Suzuki, Katsuaki [Department of Psychiatry and Neurology, Hamamatsu University School of Medicine, Hamamatsu 431-3192 (Japan)]; Mori, Norio [Department of Psychiatry and Neurology, Hamamatsu University School of Medicine, Hamamatsu 431-3192 (Japan); The Osaka-Hamamatsu Joint Research Center for Child Mental Development, Graduate School of Medicine, Osaka University (Japan)]; Ouchi, Yasuomi [The Osaka-Hamamatsu Joint Research Center for Child Mental Development, Hamamatsu University School of Medicine, Hamamatsu (Japan); The Positron Medical Center, Hamamatsu Medical Center, Hamamatsu (Japan)]; Sugiyama, Toshiro [Aichi Children's Health and Medical Center, Obu, Aichi (Japan)]; Takei, Nori [The Osaka-Hamamatsu Joint Research Center for Child Mental Development, Hamamatsu University School of Medicine, Hamamatsu (Japan)]



An efficient haplotyping method with DNA pools  

PubMed Central

Determination of haplotype frequencies (the joint distribution of genetic markers) in large population samples is a powerful tool for association studies. This is due to their greater extent of polymorphism since any two bi-allelic single nucleotide polymorphisms (SNPs) generate a potential four-allele genetic marker. Therefore, a haplotype may capture a given functional polymorphism with higher statistical power than its SNP components. The statistical estimation of haplotype frequencies, usually employed in linkage disequilibrium studies, requires individual genotyping for each SNP in the haplotype, thus making it an expensive process. In this study, we describe a new method for direct measurement of haplotype frequencies in DNA pools by allele-specific, long-range haplotype amplification. The proposed method allows the efficient determination of haplotypes composed of two SNPs in close vicinity (up to 20 kb).

Inbar, Ester; Yakir, Benjamin; Darvasi, Ariel
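
The statement that two bi-allelic SNPs form a potential four-allele marker can be illustrated with a minimal Python sketch. The allele labels and the pool of haplotype read-outs below are made up for illustration; the sketch only shows the counting step and says nothing about the allele-specific amplification chemistry itself.

```python
from collections import Counter
from itertools import product


def possible_haplotypes(alleles_per_snp):
    """Enumerate every allele combination across the given SNPs."""
    return ["".join(combo) for combo in product(*alleles_per_snp)]


def haplotype_frequencies(observed):
    """Relative frequency of each haplotype among the observed chromosomes."""
    counts = Counter(observed)
    total = sum(counts.values())
    return {hap: n / total for hap, n in counts.items()}


# Hypothetical SNPs: SNP1 with alleles A/G, SNP2 with alleles C/T.
haps = possible_haplotypes([("A", "G"), ("C", "T")])

# Made-up haplotype read-outs for a pool of 10 chromosomes.
pool = ["AC"] * 5 + ["GT"] * 3 + ["AT"] * 2
freqs = haplotype_frequencies(pool)

print(haps)
print(freqs)
```

With k bi-allelic SNPs the same enumeration yields 2^k possible haplotypes, which is why a haplotype can behave like a multi-allelic marker.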



SNP/haplotype associations in cytokine and cytokine receptor genes and immunity to rubella vaccine  

Microsoft Academic Search

An effective immune response to vaccination is, in part, a complex interaction of alleles of multiple genes regulating cytokine networks. We conducted a genotyping study of Th1/Th2/inflammatory cytokines/cytokine receptors in healthy children (n = 738, 11–19 years) to determine associations between individual single-nucleotide polymorphisms (SNPs)/haplotypes and immune outcomes after two doses of rubella vaccine. SNPs (n = 501) were selected using the ldSelect-approach and genotyped

Neelam Dhiman; Iana H. Haralambieva; Richard B. Kennedy; Robert A. Vierkant; Megan M. O’Byrne; Inna G. Ovsyannikova; Robert M. Jacobson; Gregory A. Poland



Genome-wide SNP and haplotype analyses reveal a rich history underlying dog domestication.  


Advances in genome technology have facilitated a new understanding of the historical and genetic processes crucial to rapid phenotypic evolution under domestication. To understand the process of dog diversification better, we conducted an extensive genome-wide survey of more than 48,000 single nucleotide polymorphisms in dogs and their wild progenitor, the grey wolf. Here we show that dog breeds share a higher proportion of multi-locus haplotypes unique to grey wolves from the Middle East, indicating that they are a dominant source of genetic diversity for dogs rather than wolves from east Asia, as suggested by mitochondrial DNA sequence data. Furthermore, we find a surprising correspondence between genetic and phenotypic/functional breed groupings but there are exceptions that suggest phenotypic diversification depended in part on the repeated crossing of individuals with novel phenotypes. Our results show that Middle Eastern wolves were a critical source of genome diversity, although interbreeding with local wolf populations clearly occurred elsewhere in the early history of specific lineages. More recently, the evolution of modern dog breeds seems to have been an iterative process that drew on a limited genetic toolkit to create remarkable phenotypic diversity. PMID:20237475

Vonholdt, Bridgett M; Pollinger, John P; Lohmueller, Kirk E; Han, Eunjung; Parker, Heidi G; Quignon, Pascale; Degenhardt, Jeremiah D; Boyko, Adam R; Earl, Dent A; Auton, Adam; Reynolds, Andy; Bryc, Kasia; Brisbin, Abra; Knowles, James C; Mosher, Dana S; Spady, Tyrone C; Elkahloun, Abdel; Geffen, Eli; Pilot, Malgorzata; Jedrzejewski, Wlodzimierz; Greco, Claudia; Randi, Ettore; Bannasch, Danika; Wilton, Alan; Shearman, Jeremy; Musiani, Marco; Cargill, Michelle; Jones, Paul G; Qian, Zuwei; Huang, Wei; Ding, Zhao-Li; Zhang, Ya-Ping; Bustamante, Carlos D; Ostrander, Elaine A; Novembre, John; Wayne, Robert K



Linked region detection using high-density SNP genotype data via the minimum recombinant model of pedigree haplotype inference  

Microsoft Academic Search

BACKGROUND: With the rapid development of high-throughput genotyping technologies, efficient methods for identifying linked regions using high-density SNP genotype data have become more and more important. Recently, a deterministic method that works very well on SNP genotyping data has been developed (Lin et al. Bioinformatics 2008, 24(1): 86–93). However, that program can only work on a limited number of family

Lusheng Wang; Zhanyong Wang; Wanling Yang



Chromosome X centromere region—Haplotype frequencies for different populations  

Microsoft Academic Search

Searching for suitable and closely linked STRs on the X-chromosome (ChrX) we evaluated several polymorphic markers located within the human ChrX centromere region. Stable haplotypes can be expected in this region because of low recombination rates. The five markers investigated here show a tetranucleotide or pentanucleotide structure and exhibit high or medium polymorphic information content. We validated a pentaplex PCR

Jeanett Edelmann; Sandra Hering; Christa Augustin; Uta-Dorothee Immel; Reinhard Szibor



Estimation of haplotype frequencies, linkage-disequilibrium measures, and combination of haplotype copies in each pool by use of pooled DNA data.  


Inference of haplotypes is important for many genetic approaches, including the process of assigning a phenotype to a genetic region. Usually, the population frequencies of haplotypes, as well as the diplotype configuration of each subject, are estimated from a set of genotypes of the subjects in a sample from the population. We have developed an algorithm to infer haplotype frequencies and the combination of haplotype copies in each pool by using pooled DNA data. The input data are the genotypes in pooled DNA samples, each of which contains the quantitative genotype data from one to six subjects. The algorithm infers by the maximum-likelihood method both frequencies of the haplotypes in the population and the combination of haplotype copies in each pool by an expectation-maximization algorithm. The algorithm was implemented in the computer program LDPooled. We also used the bootstrap method to calculate the standard errors of the estimated haplotype frequencies. Using this program, we analyzed the published genotype data for the SAA (n=156), MTHFR (n=80), and NAT2 (n=116) genes, as well as the smoothelin gene (n=102). Our study has shown that the frequencies of major (frequency >0.1 in a population) haplotypes can be inferred rather accurately from the pooled DNA data by the maximum-likelihood method, although with some limitations. The estimated D and D' values had large variations except when the |D| values were >0.1. The estimated linkage-disequilibrium measure ρ² for 36 linked loci of the smoothelin gene when one- and two-subject pool protocols were used suggested that the gross pattern of the distribution of the measure can be reproduced using the two-subject pool data. PMID:12533787

Ito, Toshikazu; Chiku, Suenori; Inoue, Eisuke; Tomita, Makoto; Morisaki, Takayuki; Morisaki, Hiroko; Kamatani, Naoyuki
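
To make the expectation-maximization idea concrete, here is a minimal Python sketch for the simplest case only: two bi-allelic SNPs and one-subject pools (ordinary unphased diploid genotypes). It is not the LDPooled implementation; the function name `em_two_snp`, the 0/1/2 genotype coding, and the example data are assumptions made for illustration. Only double heterozygotes are phase-ambiguous, so the E-step splits them between the two possible haplotype pairs.

```python
def em_two_snp(genotypes, n_iter=200, tol=1e-10):
    """Estimate frequencies of haplotypes 00, 01, 10, 11 by EM.

    genotypes: list of (g1, g2), each the count (0, 1, 2) of allele "1"
    at SNP1 and SNP2 in one diploid subject (hypothetical coding).
    """
    base = {"00": 0.0, "01": 0.0, "10": 0.0, "11": 0.0}
    n_double_het = 0
    for g1, g2 in genotypes:
        if g1 == 1 and g2 == 1:
            n_double_het += 1  # phase unknown: 00/11 or 01/10
            continue
        # At least one SNP is homozygous, so the haplotype pair is fixed.
        a1 = [1] * g1 + [0] * (2 - g1)
        a2 = [1] * g2 + [0] * (2 - g2)
        for x, y in zip(a1, a2):
            base[f"{x}{y}"] += 1

    n_chrom = 2 * len(genotypes)
    freqs = {h: 0.25 for h in base}  # uniform starting point
    for _ in range(n_iter):
        # E-step: probability that a double heterozygote is 00/11.
        cis = freqs["00"] * freqs["11"]
        trans = freqs["01"] * freqs["10"]
        w = cis / (cis + trans) if cis + trans > 0 else 0.5
        # M-step: expected haplotype counts divided by chromosome count.
        counts = dict(base)
        counts["00"] += w * n_double_het
        counts["11"] += w * n_double_het
        counts["01"] += (1 - w) * n_double_het
        counts["10"] += (1 - w) * n_double_het
        new_freqs = {h: c / n_chrom for h, c in counts.items()}
        delta = max(abs(new_freqs[h] - freqs[h]) for h in base)
        freqs = new_freqs
        if delta < tol:
            break
    return freqs


# Made-up sample of 10 subjects.
sample = [(0, 0)] * 4 + [(2, 2)] * 3 + [(1, 1)] * 2 + [(1, 0)]
em_freqs = em_two_snp(sample)
print(em_freqs)
```

Pools of several subjects enlarge the set of compatible haplotype combinations per observation, which is the harder case the abstract addresses; the sketch does not attempt it.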



Novel Quantitative Real-Time LCR for the Sensitive Detection of SNP Frequencies in Pooled DNA: Method Development, Evaluation and Application  

Microsoft Academic Search

Background: Single nucleotide polymorphisms (SNPs) have proven to be powerful genetic markers for genetic applications in medicine, life science and agriculture. A variety of methods exist for SNP detection but few can quantify SNP frequencies when the mutated DNA molecules correspond to a small fraction of the wild-type DNA. Furthermore, there is no generally accepted gold standard for SNP quantification, and,

Androniki Psifidi; Chrysostomos Dovas; Georgios Banos; Katy C. Kao



HLA gene and haplotype frequencies in bone marrow donors worldwide registries  

Microsoft Academic Search

To calculate reliable HLA gene and haplotype frequencies of bone marrow donors in various regions in the world, we have analyzed the HLA-A, -B, and -DR phenotype frequencies of 18 bone marrow donor registries with a total of more than 300,000 HLA-A, -B-typed donors. These registries were included in the 22nd edition of Bone Marrow Donors Worldwide. Maximum likelihood gene

R. F. Schipper; J. D'Amaro; J. T. Bakker; J. J. van Rood; M. Oudshoorn



An evaluation of the performance of HapMap SNP data in a Shanghai Chinese population: Analyses of allele frequency, linkage disequilibrium pattern and tagging SNPs transferability on chromosome 1q21-q25  

PubMed Central

Background: The HapMap project aimed to catalog millions of common single nucleotide polymorphisms (SNPs) in the human genome in four major populations, in order to facilitate association studies of complex diseases. To examine the transferability of the HapMap data for Han Chinese in Beijing to the Southern Han Chinese in Shanghai, we performed comparative analyses of genotypes for over 4,500 SNPs in a 21 Mb region on chromosome 1q21-q25 in 80 unrelated Shanghai Chinese and 45 HapMap Chinese samples. Results: Three thousand and forty-two SNPs were analyzed after removal of SNPs that failed quality control and those not in the HapMap panel. We compared the allele frequency distributions, linkage disequilibrium patterns, haplotype frequency distributions and transferability of tagging SNP sets between the HapMap populations and the Shanghai Chinese population. Among the four HapMap populations, Beijing Chinese showed the best correlation with the Shanghai population on allele frequencies, linkage disequilibrium and haplotype frequencies. Tagging SNP sets selected from the four HapMap populations at different thresholds were evaluated in the Shanghai sample. Under an r² threshold of 0.8 or 0.5, both HapMap Chinese and Japanese data showed better coverage and tagging efficiency than Caucasian and African data. Conclusion: Our study supported the applicability of HapMap Beijing Chinese SNP data to the study of complex diseases in the southern Chinese population.

Hu, Cheng; Jia, Weiping; Zhang, Weihua; Wang, Congrong; Zhang, Rong; Wang, Jie; Ma, Xiaojing; Xiang, Kunsan
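
The r² thresholds mentioned above refer to the standard pairwise linkage disequilibrium statistic. As a rough sketch, assuming haplotype and allele frequencies are already known, the usual D, |D'| and r² definitions can be computed as follows; the function name and the example frequencies are hypothetical and not taken from the study.

```python
def ld_measures(p_ab, p_a, p_b):
    """Pairwise LD between two bi-allelic SNPs.

    p_ab: frequency of the haplotype carrying allele A at SNP1 and B at SNP2
    p_a, p_b: frequencies of allele A and allele B
    Returns (D, |D'|, r^2).
    """
    d = p_ab - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2


# Made-up frequencies for two SNP pairs.
d1, dp1, r2_1 = ld_measures(p_ab=0.4, p_a=0.5, p_b=0.5)
d2, dp2, r2_2 = ld_measures(p_ab=0.3, p_a=0.3, p_b=0.3)

# A SNP is commonly said to "tag" another when r^2 meets the threshold.
THRESHOLD = 0.8
print(r2_1 >= THRESHOLD, r2_2 >= THRESHOLD)
```

In the first pair the SNPs are only moderately correlated, so neither would tag the other at a threshold of 0.8; in the second pair the two SNPs are perfectly correlated.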



Association of AKT1 haplotype with the risk of schizophrenia in Iranian population.  


AKT-glycogen synthase kinase 3beta (GSK3beta) signaling is a target of lithium and has been implicated in the pathogenesis of mood disorders and schizophrenia. AKT1 protein level is decreased in the peripheral lymphocytes and brains of schizophrenic patients. The SNP2/3/4 TCG haplotype of AKT1 was associated with schizophrenia in patients of Northern European origin. In the present study, we genotyped five single nucleotide polymorphisms (SNP1-5) of the AKT1 gene, following the original study, in Iranians comprising 321 schizophrenic patients and 383 controls, all residing in Mashhad city, Northeastern Iran. Haplotype analysis showed that the frequency of a five-SNP haplotype (AGCAG) was significantly higher in schizophrenic patients (0.068) than in controls (0.034) (P = 0.03 after Bonferroni correction, OR = 2.04, CI = 1.2-3.4). In analysis stratified by schizophrenia subtype, the frequency of the same haplotype was significantly higher in the disorganized subtype (n = 78, frequency of haplotype = 0.081) than in normal controls (P = 0.04 after Bonferroni correction, OR = 2.59, CI = 1.3-5.2). Our findings did not confirm the association of the AKT1 SNP2/3/4 TCG haplotype with the risk of schizophrenia reported in the original study, but showed evidence of association of a different haplotype, the AKT1 five-SNP AGCAG haplotype, with the risk of schizophrenia in the Iranian population. PMID:16583435

Bajestan, Sepideh N; Sabouri, Amir H; Nakamura, Masayuki; Takashima, Hiroshi; Keikhaee, Mohammad R; Behdani, Fatemeh; Fayyazi, Mohammad R; Sargolzaee, Mohammad R; Bajestan, Mahboobeh N; Sabouri, Zahra; Khayami, Esmaeil; Haghighi, Sima; Hashemi, Susan B; Eiraku, Nobutaka; Tufani, Hamid; Najmabadi, Hossein; Arimura, Kimiyoshi; Sano, Akira; Osame, Mitsuhiro
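
As a rough illustration of how an odds ratio relates to the two haplotype frequencies quoted in the abstract, the following Python sketch applies the textbook definition to the rounded values 0.068 and 0.034. The function name is hypothetical, and the abstract does not give the underlying chromosome counts, so the result is not expected to reproduce the reported OR of 2.04 exactly.

```python
def odds_ratio(freq_cases, freq_controls):
    """Odds of carrying the haplotype in cases divided by odds in controls."""
    odds_cases = freq_cases / (1 - freq_cases)
    odds_controls = freq_controls / (1 - freq_controls)
    return odds_cases / odds_controls


# Rounded haplotype frequencies quoted in the abstract.
or_agcag = odds_ratio(0.068, 0.034)
print(round(or_agcag, 2))
```

The small gap between this value and the published estimate is presumably due to rounding of the frequencies or to the estimation method used by the authors.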



SNP frequency, haplotype structure and linkage disequilibrium in elite maize inbred lines  

Microsoft Academic Search

BACKGROUND: Recent studies of ancestral maize populations indicate that linkage disequilibrium tends to dissipate rapidly, sometimes within 100 bp. We set out to examine the linkage disequilibrium and diversity in maize elite inbred lines, which have been subject to population bottlenecks and intense selection by breeders. Such population events are expected to increase the amount of linkage disequilibrium, but reduce

Ada Ching; Katherine S Caldwell; Mark Jung; Maurine Dolan; Scott Tingey; Michele Morgante; Antoni J Rafalski



SNP500Cancer: a public resource for sequence validation, assay development, and frequency analysis for genetic variation in candidate genes.  


The SNP500Cancer database provides sequence and genotype assay information for candidate SNPs useful in mapping complex diseases, such as cancer. The database is an integral component of the NCI Cancer Genome Anatomy Project. SNP500Cancer reports sequence analysis of anonymized control DNA samples (n = 102 Coriell samples representing four self-described ethnic groups: African/African-American, Caucasian, Hispanic and Pacific Rim). The website is searchable by gene, chromosome, gene ontology pathway, dbSNP ID and SNP500Cancer SNP ID. As of October 2005, the database contains >13 400 SNPs, 9124 of which have been sequenced in the SNP500Cancer population. For each analysed SNP, gene location and >200 bp of surrounding annotated sequence (including nearby SNPs) are provided, with frequency information in total and per subpopulation as well as calculation of Hardy-Weinberg equilibrium for each subpopulation. The website provides the conditions for validated sequencing and genotyping assays, as well as genotype results for the 102 samples, in both viewable and downloadable formats. A subset of sequence validated SNPs with minor allele frequency >5% are entered into a high-throughput pipeline for genotyping analysis to determine concordance for the same 102 samples. In addition, the results of genotype analysis for select validated SNP assays (defined as 100% concordance between sequence analysis and genotype results) are posted for an additional 280 samples drawn from the Human Diversity Panel (HDP). SNP500Cancer provides an invaluable resource for investigators to select SNPs for analysis, design genotyping assays using validated sequence data, choose selected assays already validated on one or more genotyping platforms, and select reference standards for genotyping assays. The SNP500Cancer database is freely accessible via its web page. PMID:16381944

Packer, Bernice R; Yeager, Meredith; Burdett, Laura; Welch, Robert; Beerman, Michael; Qi, Liqun; Sicotte, Hugues; Staats, Brian; Acharya, Mekhala; Crenshaw, Andrew; Eckert, Andrew; Puri, Vinita; Gerhard, Daniela S; Chanock, Stephen J
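
The abstract mentions a Hardy-Weinberg equilibrium calculation per subpopulation without describing it. One common way to do this is a chi-square goodness-of-fit test on genotype counts; the Python sketch below assumes that approach, and its function name and genotype counts are made up rather than taken from the database.

```python
def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square statistic for departure from Hardy-Weinberg proportions.

    Returns (frequency of allele A, chi-square statistic with 1 d.f.).
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    return p, chi2


# Made-up genotype counts for 102 samples.
p_a, chi2 = hwe_chi_square(50, 42, 10)

# 3.841 is the 5% critical value of chi-square with one degree of freedom.
print(p_a, chi2, chi2 > 3.841)
```

With small subpopulation samples an exact test is often preferred over the chi-square approximation; the sketch only shows the basic calculation.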



Haplotype frequencies at the DRD2 locus in populations of the East European Plain  

PubMed Central

Background: It was demonstrated previously that the three-locus RFLP haplotype, TaqI B-TaqI D-TaqI A (B-D-A), at the DRD2 locus constitutes a powerful genetic marker and probably reflects the most ancient dispersal of anatomically modern humans. Results: We investigated TaqI B, BclI, MboI, TaqI D, and TaqI A RFLPs in 17 contemporary populations of the East European Plain and Siberia. Most of these populations belong to the Indo-European or Uralic language families. We identified three common haplotypes, which occurred in more than 90% of chromosomes investigated. The frequencies of the haplotypes differed according to linguistic and geographical affiliation. Conclusion: Populations in the northwestern (Byelorussians from Mjadel'), northern (Russians from Mezen' and Oshevensk), and eastern (Russians from Puchezh) parts of the East European Plain had relatively high frequencies of haplotype B2-D2-A2, which may reflect admixture with Uralic-speaking populations that inhabited all of these regions in the Early Middle Ages.

Flegontova, Olga V; Khrunin, Andrey V; Lylova, Olga I; Tarskaia, Larisa A; Spitsyn, Victor A; Mikulich, Alexey I; Limborska, Svetlana A



High-Throughput SNP Allele-Frequency Determination in Pooled DNA Samples by Kinetic PCR  

PubMed Central

We have developed an accurate, yet inexpensive and high-throughput, method for determining the allele frequency of biallelic polymorphisms in pools of DNA samples. The assay combines kinetic (real-time quantitative) PCR with allele-specific amplification and requires no post-PCR processing. The relative amounts of each allele in a sample are quantified. This is performed by dividing equal aliquots of the pooled DNA between two separate PCR reactions, each of which contains a primer pair specific to one or the other allelic SNP variant. For pools with equal amounts of the two alleles, the two amplifications should reach a detectable level of fluorescence at the same cycle number. For pools that contain unequal ratios of the two alleles, the difference in cycle number between the two amplification reactions can be used to calculate the relative allele amounts. We demonstrate the accuracy and reliability of the assay on samples with known predetermined SNP allele frequencies from 5% to 95%, including pools of both human and mouse DNAs using eight different SNPs altogether. The accuracy of measuring known allele frequencies is very high, with the correlation between measured and known frequencies having r² = 0.997. The loss of sensitivity as a result of measurement error is typically minimal, compared with that due to sampling error alone, for population samples up to 1000. We believe that by providing a means for SNP genotyping up to thousands of samples simultaneously, inexpensively, and reproducibly, this method is a powerful strategy for detecting meaningful polymorphic differences in candidate gene association studies and genome-wide linkage disequilibrium scans.

Germer, Søren; Holland, Michael J.; Higuchi, Russell
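
The conversion from a cycle-number difference to an allele frequency can be sketched in Python as follows. The sketch assumes ideal amplification (product doubles every cycle) and equal efficiency of the two allele-specific reactions; the function name and cycle numbers are hypothetical.

```python
def allele_frequency_from_ct(ct_allele1, ct_allele2):
    """Frequency of allele 1 in a pool from two allele-specific reactions.

    ct_allele1, ct_allele2: cycle numbers at which each reaction reaches
    the fluorescence threshold. Assumes perfect doubling per cycle, so the
    starting ratio allele2/allele1 equals 2 ** (ct_allele1 - ct_allele2).
    """
    delta_ct = ct_allele1 - ct_allele2
    return 1.0 / (2.0 ** delta_ct + 1.0)


# Equal amounts of both alleles: same threshold cycle, frequency 0.5.
f_equal = allele_frequency_from_ct(25.0, 25.0)

# Allele 1 crosses the threshold one cycle later: it is half as abundant.
f_late = allele_frequency_from_ct(26.0, 25.0)
print(f_equal, f_late)
```

Real assays amplify with less than perfect efficiency and allele-specific primers may cross-react slightly, so measured cycle differences generally need calibration against pools of known composition.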



Antigen, allele, and haplotype frequencies report of the ASHI minority antigens workshops: part 1, African-Americans.  


HLA typing was performed on 977 African Americans residing throughout most of the United States. Class I and class II antigens and class II alleles were defined for all individuals and class I alleles were determined for a subset of individuals. The occurrence of 854 of the individuals in family groups permitted direct counting of allele and haplotype frequencies. The data were analyzed for antigen, allele, and haplotype frequencies; recombination frequencies; segregation distortion; distribution of haplotype frequencies; linkage disequilibria; and geographic distribution of DR antigens. Tables of the antigen, allele, the most common two and three point haplotypes, and 88 extended haplotypes that include class I and class II alleles are presented. Notable findings include a lower than expected frequency of recombination between the B and DR loci (theta= 0.0013), lower than expected frequency of inheritance (44.5% vs 54.5%) of the DRB1*1503; DQB1*0602 haplotype, lower than anticipated linkage disequilibrium values for DR; DQ haplotypes, and a skewed geographic distribution of DR antigens. PMID:11600220

Zachary, A A; Bias, W B; Johnson, A; Rose, S M; Leffell, M S



The discrete Laplace exponential family and estimation of Y-STR haplotype frequencies.  


Estimating haplotype frequencies is important in e.g. forensic genetics, where the frequencies are needed to calculate the likelihood ratio for the evidential weight of a DNA profile found at a crime scene. Estimation is naturally based on a population model, motivating the investigation of the Fisher-Wright model of evolution for haploid lineage DNA markers. An exponential family (a class of probability distributions that is well understood in probability theory such that inference is easily made by using existing software) called the 'discrete Laplace distribution' is described. We illustrate how well the discrete Laplace distribution approximates a more complicated distribution that arises by investigating the well-known population genetic Fisher-Wright model of evolution by a single-step mutation process. It was shown how the discrete Laplace distribution can be used to estimate haplotype frequencies for haploid lineage DNA markers (such as Y-chromosomal short tandem repeats), which in turn can be used to assess the evidential weight of a DNA profile found at a crime scene. This was done by making inference in a mixture of multivariate, marginally independent, discrete Laplace distributions using the EM algorithm to estimate the probabilities of membership of a set of unobserved subpopulations. The discrete Laplace distribution can be used to estimate haplotype frequencies with lower prediction error than other existing estimators. Furthermore, the calculations could be performed on a normal computer. This method was implemented in the freely available open source software R that is supported on Linux, MacOS and MS Windows. PMID:23524164

Andersen, Mikkel Meyer; Eriksen, Poul Svante; Morling, Niels
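
For readers unfamiliar with the distribution, the discrete Laplace probability mass function with dispersion parameter p (0 < p < 1) and location μ is P(X = x) = ((1 − p)/(1 + p)) · p^|x − μ| over the integers. The Python sketch below evaluates it and combines loci in a mixture of subpopulations with loci treated as independent within each subpopulation, as the abstract describes. All parameter values are made up, and the EM step that estimates them is not shown.

```python
def discrete_laplace_pmf(x, p, mu):
    """P(X = x) for a discrete Laplace variable with dispersion p, location mu."""
    return (1 - p) / (1 + p) * p ** abs(x - mu)


def haplotype_probability(haplotype, weights, centers, dispersions):
    """Mixture probability of a Y-STR haplotype (tuple of repeat counts).

    weights: prior probability of each subpopulation
    centers: per-subpopulation tuple of central repeat counts, one per locus
    dispersions: per-subpopulation tuple of dispersion parameters, one per locus
    """
    total = 0.0
    for w, mus, ps in zip(weights, centers, dispersions):
        prob = w
        for x, mu, p in zip(haplotype, mus, ps):
            prob *= discrete_laplace_pmf(x, p, mu)
        total += prob
    return total


# Hypothetical two-locus model with two subpopulations.
weights = [0.6, 0.4]
centers = [(14, 30), (15, 29)]
dispersions = [(0.3, 0.2), (0.25, 0.3)]
prob = haplotype_probability((14, 30), weights, centers, dispersions)
print(prob)
```

Because every integer repeat count receives positive probability, haplotypes never seen in the reference database still get a non-zero frequency estimate.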



Haplotypic Background of a Private Allele at High Frequency in the Americas  

PubMed Central

Recently, the observation of a high-frequency private allele, the 9-repeat allele at microsatellite D9S1120, in all sampled Native American and Western Beringian populations has been interpreted as evidence that all modern Native Americans descend primarily from a single founding population. However, this inference assumed that all copies of the 9-repeat allele were identical by descent and that the geographic distribution of this allele had not been influenced by natural selection. To investigate whether these assumptions are satisfied, we genotyped 34 single nucleotide polymorphisms across approximately 500 kilobases (kb) around D9S1120 in 21 Native American and Western Beringian populations and 54 other worldwide populations. All chromosomes with the 9-repeat allele share the same haplotypic background in the vicinity of D9S1120, suggesting that all sampled copies of the 9-repeat allele are identical by descent. Ninety-one percent of these chromosomes share the same 76.26 kb haplotype, which we call the “American Modal Haplotype” (AMH). Three observations lead us to conclude that the high frequency and widespread distribution of the 9-repeat allele are unlikely to be the result of positive selection: 1) aside from its association with the 9-repeat allele, the AMH does not have a high frequency in the Americas, 2) the AMH is not unusually long for its frequency compared with other haplotypes in the Americas, and 3) in Latin American mestizo populations, the proportion of Native American ancestry at D9S1120 is not unusual compared with that observed at other genomewide microsatellites. Using a new method for estimating the time to the most recent common ancestor (MRCA) of all sampled copies of an allele on the basis of an estimate of the length of the genealogy descended from the MRCA, we calculate the mean time to the MRCA of the 9-repeat allele to be between 7,325 and 39,900 years, depending on the demographic model used. 
The results support the hypothesis that all modern Native Americans and Western Beringians trace a large portion of their ancestry to a single founding population that may have been isolated from other Asian populations prior to expanding into the Americas.

Schroeder, Kari B.; Jakobsson, Mattias; Crawford, Michael H.; Schurr, Theodore G.; Boca, Simina M.; Conrad, Donald F.; Tito, Raul Y.; Osipova, Ludmilla P.; Tarskaia, Larissa A.; Zhadanov, Sergey I.; Wall, Jeffrey D.; Pritchard, Jonathan K.; Malhi, Ripan S.; Smith, David G.; Rosenberg, Noah A.



Haplotypic background of a private allele at high frequency in the Americas.  


Recently, the observation of a high-frequency private allele, the 9-repeat allele at microsatellite D9S1120, in all sampled Native American and Western Beringian populations has been interpreted as evidence that all modern Native Americans descend primarily from a single founding population. However, this inference assumed that all copies of the 9-repeat allele were identical by descent and that the geographic distribution of this allele had not been influenced by natural selection. To investigate whether these assumptions are satisfied, we genotyped 34 single nucleotide polymorphisms across approximately 500 kilobases (kb) around D9S1120 in 21 Native American and Western Beringian populations and 54 other worldwide populations. All chromosomes with the 9-repeat allele share the same haplotypic background in the vicinity of D9S1120, suggesting that all sampled copies of the 9-repeat allele are identical by descent. Ninety-one percent of these chromosomes share the same 76.26 kb haplotype, which we call the "American Modal Haplotype" (AMH). Three observations lead us to conclude that the high frequency and widespread distribution of the 9-repeat allele are unlikely to be the result of positive selection: 1) aside from its association with the 9-repeat allele, the AMH does not have a high frequency in the Americas, 2) the AMH is not unusually long for its frequency compared with other haplotypes in the Americas, and 3) in Latin American mestizo populations, the proportion of Native American ancestry at D9S1120 is not unusual compared with that observed at other genomewide microsatellites. Using a new method for estimating the time to the most recent common ancestor (MRCA) of all sampled copies of an allele on the basis of an estimate of the length of the genealogy descended from the MRCA, we calculate the mean time to the MRCA of the 9-repeat allele to be between 7,325 and 39,900 years, depending on the demographic model used. 
The results support the hypothesis that all modern Native Americans and Western Beringians trace a large portion of their ancestry to a single founding population that may have been isolated from other Asian populations prior to expanding into the Americas. PMID:19221006

Schroeder, Kari B; Jakobsson, Mattias; Crawford, Michael H; Schurr, Theodore G; Boca, Simina M; Conrad, Donald F; Tito, Raul Y; Osipova, Ludmilla P; Tarskaia, Larissa A; Zhadanov, Sergey I; Wall, Jeffrey D; Pritchard, Jonathan K; Malhi, Ripan S; Smith, David G; Rosenberg, Noah A



Allele frequencies and haplotypes of 12 Y-STR loci for the local Chinese population in Hong Kong  

Microsoft Academic Search

Haplotype frequencies were established for 12 Y-chromosome STR loci, including all loci recommended by the Scientific Working Group on DNA Analysis Methods Y-STR Subcommittee (DYS391, DYS389I/II, DYS439, DYS393, DYS390, DYS385a/b, DYS438, DYS19 and DYS392) plus DYS437, in the local Chinese population in Hong Kong. In a sample of 481 unrelated males, it was possible to define 424 different haplotypes of which

S. M. Yeung; L. M. Wong; B. K. K. Cheung; K. Y. To



Allele frequencies for 40 autosomal SNP loci typed for US population samples using electrospray ionization mass spectrometry  

PubMed Central

Aim: To type a set of 194 US African American, Caucasian, and Hispanic samples (self-declared ancestry) for 40 autosomal single nucleotide polymorphism (SNP) markers intended for human identification purposes. Methods: Genotyping was performed on an automated commercial electrospray ionization time-of-flight mass spectrometer, the PLEX-ID. The 40 SNP markers were amplified in eight unique 5-plex PCRs, desalted, and resolved based on amplicon mass. For each of the three US sample groups, statistical analyses were performed on the resulting genotypes. Results: The assay was found to be robust and capable of genotyping the 40 SNP markers while consuming approximately 4 nanograms of template per sample. The combined random match probabilities for the 40 SNP assay ranged from 10^-16 to 10^-21. Conclusion: The multiplex PLEX-ID SNP-40 assay is the first fully automated genotyping method capable of typing a panel of 40 forensically relevant autosomal SNP markers on a mass spectrometry platform. The data produced provided the first allele frequency estimates for these 40 SNPs in a National Institute of Standards and Technology US population sample set. No population bias was detected, although one locus deviated from its expected level of heterozygosity.

Kiesler, Kevin M.; Vallone, Peter M.
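
The combined random match probability quoted above can be illustrated with a minimal sketch. It assumes biallelic SNPs in Hardy-Weinberg proportions and independence between loci. The function names and the 40-locus panel with allele frequency 0.5 are hypothetical and are not taken from the study.

```python
from math import prod


def locus_rmp(p: float) -> float:
    """Random match probability at one biallelic SNP.

    Assumes Hardy-Weinberg proportions: the RMP is the sum of the
    squared genotype frequencies (p^2, 2pq, q^2).
    """
    q = 1.0 - p
    genotype_freqs = (p * p, 2.0 * p * q, q * q)
    return sum(g * g for g in genotype_freqs)


def combined_rmp(allele_freqs) -> float:
    """Product of per-locus RMPs, assuming the loci are independent."""
    return prod(locus_rmp(p) for p in allele_freqs)


# Hypothetical panel: 40 SNPs, each with allele frequency 0.5 (illustrative only).
panel = [0.5] * 40
print(f"per-locus RMP: {locus_rmp(0.5):.3f}")
print(f"combined RMP:  {combined_rmp(panel):.2e}")
```

With every locus at its most informative frequency (0.5), the per-locus value is 0.375 and the 40-locus product is just under 10^-17, which is inside the range reported in the abstract.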



Molecular analysis of human leukocyte antigen class I and class II allele frequencies and haplotype distribution in Pakistani population  

PubMed Central

AIM: Distribution of HLA class I and II alleles and haplotypes was studied in the Pakistani population and compared with the data reported for Caucasoid, African, Oriental and Arab populations. MATERIALS AND METHODS: HLA class I and II polymorphisms in 1000 unrelated Pakistani individuals were studied using a polymerase chain reaction assay with sequence-specific primers. RESULTS: The most frequent class I alleles observed were A*02, B*35 and Cw*07, with frequencies of 19.2, 13.7 and 20%, respectively. Fifteen distinct HLA-DRB1 alleles and eight HLA-DQB1 alleles were recognized. The most frequently observed DRB1 alleles, which together represented more than 60% of the subjects, were DRB1*03, *07, *11 and *15. The rare DRB1 alleles detected in this study were HLA-DRB1*08 and *09, with frequencies of 0.9 and 1.7%, respectively. In addition, at the DRB1-DQB1 loci there were 179 different haplotypes and 285 unique genotypes, and the most common haplotype was DRB1*15-DQB1*06, which represented 17% of the total DRB1-DQB1 haplotypes. In our population, haplotype A*33-B*58-Cw*03 comprised 2.8% of the total class I haplotypes observed. This haplotype was seen only in Oriental populations and has not been reported in African or European Caucasoid populations. CONCLUSION: Our study showed a close similarity of HLA class I and II alleles with those of European Caucasoid and Oriental populations. In the Pakistani population, two rare loci and three haplotypes were identified, whereas haplotypes characteristic of Caucasians, Africans and Orientals were also found, suggesting an admixture of different races due to migration to and from this region.

Moatter, T.; Aban, M.; Tabassum, S.; Shaikh, U.; Pervez, S.



Genotype Variability and Haplotype Frequency of MDR1 (ABCB1) Gene Polymorphism in Morocco.  


The multidrug resistance gene (MDR1) plays an important role in the transport of a wide range of drugs and the elimination of xenobiotics from the body. Identification of polymorphisms and haplotypes in the MDR1 gene might not only help in understanding the pharmacokinetics and pharmacodynamics of drugs, but could also help in predicting drug responses, toxicity, and side effects, especially in the era of personalized medicine. We analyzed the genotypic and haplotypic frequencies of the three most common single-nucleotide polymorphisms in the MDR1 gene in a sample of 100 unrelated healthy Moroccan subjects by polymerase chain reaction-restriction fragment length polymorphism. The observed genotype frequencies were 43% for 1236CC, 49% for 1236CT, and 8% for 1236TT in exon 12; 49% for 2677GG, 47% for 2677GT, and 4% for 2677TT in exon 21; and 39% for 3435CC, 51% for 3435CT, and 10% for 3435TT in exon 26. We found that all polymorphisms were in Hardy-Weinberg equilibrium. Moderate linkage disequilibrium (LD) was observed between the three polymorphisms; the strongest LD was observed between C1236T and G2677T (D'=0.76; r^2=0.45). We identified eight haplotypes, the most frequent being 1236C-2677G-3435C (53%), 1236T-2677T-3435T (21%), and 1236C-2677G-3435T (10%). Our findings might facilitate future studies on the pharmacokinetics of P-glycoprotein substrate drugs and on interindividual variability in drug response in Moroccan patients. PMID:23930592

Kassogue, Yaya; Dehbi, Hind; Nassereddine, Sanaa; Quachouh, Meryem; Nadifi, Sellama
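
The Hardy-Weinberg statement in this record can be checked with a short sketch. It applies the standard one-degree-of-freedom chi-square goodness-of-fit test to the reported 1236C>T genotype percentages, which equal counts because the sample size is 100. The function name is hypothetical, and the abstract does not say which test the authors actually used.

```python
def hwe_chi_square(n_hom_ref: int, n_het: int, n_hom_alt: int) -> float:
    """Chi-square statistic comparing observed genotype counts with
    Hardy-Weinberg expectations for one biallelic locus."""
    n = n_hom_ref + n_het + n_hom_alt
    # Allele frequency of the reference allele, counted from genotypes.
    p = (2 * n_hom_ref + n_het) / (2 * n)
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_hom_ref, n_het, n_hom_alt)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Critical value of the chi-square distribution, 1 degree of freedom, alpha = 0.05.
CRITICAL_1DF_5PCT = 3.841

# 1236CC, 1236CT, 1236TT counts as reported for the 100 subjects.
chi2 = hwe_chi_square(43, 49, 8)
print(f"chi-square = {chi2:.3f}")
print("consistent with HWE" if chi2 < CRITICAL_1DF_5PCT else "deviates from HWE")
```

For these counts the statistic is about 1.36, below the 3.841 threshold, which agrees with the abstract's statement for this locus.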



Haplotype Association between Haptoglobin (Hp2) and Hp Promoter SNP (A-61C) May Explain Previous Controversy of Haptoglobin and Malaria Protection  

PubMed Central

Background: Malaria is one of the strongest recent selective pressures on the human genome, as evidenced by the high levels of varying haemoglobinopathies in human populations, despite the increased risk of mortality in the homozygous states. Previously, functional polymorphisms of Hp, coded by the co-dominant alleles Hp1 and Hp2, have been variously associated with several infectious diseases, including malaria susceptibility. Methodology/Principal Findings: Risk of a clinical malarial episode over the course of a malarial transmission season was assessed using active surveillance in a cohort of Gambian children aged 10–72 months. We report for the first time that the major haplotype for the A-61C mutant allele in the promoter of haptoglobin (Hp), an acute phase protein that clears haemoglobin released from haemolysis of red cells, is associated with protection from malarial infection in older children (children aged ≥36 months, >500 parasites/µl and temperature >37.5°C; OR = 0.42; [95% CI 0.24–0.73] p = 0.002) (lr test for interaction, <36 vs ≥36 months, p = 0.014). Protection was also observed using two other definitions, including temperature >37.5°C, dipstick positive, plus clinical judgement of malaria blinded to dipstick result (all ages, OR = 0.48, [95% CI 0.30–0.78] p = 0.003; ≥36 months, OR = 0.31, [95% CI 0.15–0.62] p = 0.001). A similar level of protection was observed for the known protective genetic variant, sickle cell trait (HbAS). Conclusions/Significance: We propose that previous conflicting results between Hp phenotypes/genotypes and malaria susceptibility may be explained by differing prevalence of the A-61C SNP in the populations studied, which we found to be highly associated with the Hp2 allele. We report the -61C allele to be associated with decreased Hp protein levels (independent of Hp phenotype), confirming in vitro studies. 
Decreased Hp expression may lead to increased oxidant stress and increased red cell turnover, and facilitate the development of acquired immunity, similar to a mechanism suggested for sickle cell trait.

Cox, Sharon E.; Doherty, Conor; Atkinson, Sarah H.; Nweneka, Chidi V.; Fulford, Anthony J.C.; Ghattas, Hala; Rockett, Kirk A.; Kwiatkowski, Dominic P.; Prentice, Andrew M.



[Gene frequency and haplotypes of the HLA system in the Panamanian population].  


The authors determined the frequency of genes and haplotypes of the HLA system in 965 unrelated Panamanian men and women between 6 and 65 years of age. The HLA-A locus genes with the highest frequency (f) were A2, with f 0.1763; A24, f 0.1584; A30, f 0.1340; A23, f 0.1069; A3, f 0.0774. The other 20 genes each had a frequency below 0.07. The genes with the highest frequency for locus HLA-B were B35, f 0.1946; B44, f 0.0904; B7, f 0.0774; B60 and B61, f 0.0582. For locus HLA-C, the most frequent genes were Cw3 with f 0.1549 and Cw4, f 0.1444. For locus HLA-DR, the most frequent genes were DR2 with f 0.1283; DR3, f 0.0620; DR7, f 0.0409. The most frequent haplotypes in the Panamanian population were A2-B35 with f 0.0382; A3-B35, f 0.0191; A24-B35, f 0.0287; A24-B61, f 0.0239; A29-B44, f 0.0287; A30-B42, f 0.0239; A23-B44, f 0.0191; A1-B8, f 0.0143. The authors conclude that the Panamanian population exhibits a high degree of polymorphism for loci HLA-A, B and C, while for locus HLA-DR the frequency is the median when compared with that in caucasian, negro and oriental groups; and that, according to locus, predominant genes originating from these groups are found, corroborating the multiracial origin of the Panamanian population. PMID:8668821

Vernaza-Kwiers, A A; de Gómez, I J; Díaz-Isaacs, M; Cuero, C J; Pérez Guardia, E; Moreno Saavedra, M



Rainfall-driven sex-ratio genes in African buffalo suggested by correlations between Y-chromosomal haplotype frequencies and foetal sex ratio  

Microsoft Academic Search

BACKGROUND: The Y-chromosomal diversity in the African buffalo (Syncerus caffer) population of Kruger National Park (KNP) is characterized by rainfall-driven haplotype frequency shifts between year cohorts. Stable Y-chromosomal polymorphism is difficult to reconcile with haplotype frequency variations without assuming frequency-dependent selection or specific interactions in the population dynamics of X- and Y-chromosomal genes, since otherwise the fittest haplotype would inevitably

Pim van Hooft; Herbert HT Prins; Wayne M Getz; Anna E Jolles; Sipke E van Wieren; Barend J Greyling; Paul D van Helden; Armanda DS Bastos



Allele frequencies of a SNP and a 27-bp deletion that are the determinant of earwax type in the ABCC11 gene  

Microsoft Academic Search

Allele frequencies for a SNP (rs17822931) and a 27-bp deletion that determine earwax type in the ABCC11 gene were investigated in seven Japanese, one Korean, and one German population. The SNP will be useful as an ancestry-informative marker, because it showed a marked difference in frequencies between Asian and European populations.

Takashi Kitano; Isao Yuasa; Kentaro Yamazaki; Nori Nakayashiki; Aya Miyoshi; Kyung Sook Park; Kazuo Umetsu



Analysis of concordance of different haplotype block partitioning algorithms  

PubMed Central

Background: Different classes of haplotype block algorithms exist and the ideal dataset to assess their performance would be to comprehensively re-sequence a large genomic region in a large population. Such data sets are expensive to collect. Alternatively, we performed coalescent simulations to generate haplotypes with a high marker density and compared block partitioning results from diversity based, LD based, and information theoretic algorithms under different values of SNP density and allele frequency. Results: We simulated 1000 haplotypes using the standard coalescent for three world populations – European, African American, and East Asian – and applied three classes of block partitioning algorithms – diversity based, LD based, and information theoretic. We assessed algorithm differences in number, size, and coverage of blocks inferred under different conditions of SNP density, allele frequency, and sample size. Each algorithm inferred blocks differing in number, size, and coverage under different density and allele frequency conditions. Different partitions had few if any matching block boundaries. However, they still overlapped and a high percentage of the total chromosomal region was common to all methods. This percentage was generally higher with a higher density of SNPs and when rarer markers were included. Conclusion: A gold standard definition of a haplotype block is difficult to achieve, but collecting haplotypes covered with a high density of SNPs, partitioning them with a variety of block algorithms, and identifying regions common to all methods may be the best way to identify genomic regions that harbor SNP variants that cause disease.

Indap, Amit R; Marth, Gabor T; Struble, Craig A; Tonellato, Peter; Olivier, Michael



HLA antigen, allele and haplotype frequencies and their use in virtual panel reactive antigen calculations in the Finnish population.  


The human leukocyte antigen (HLA) antigen, allele and haplotype frequencies of the Finnish population are quite unique because of a rather restricted and homogeneous gene pool. This has a strong influence on finding suitable donors for transplant patients; hence knowledge about the HLA frequencies of the patient population is essential. Here we report the HLA antigen frequencies for a large population sample and show high resolution HLA allele frequencies for 11 loci, including the rarely typed DPA1 and DQA1 loci. Furthermore, the most common Finnish high resolution haplotypes are presented for five HLA loci. The study shows that there are fewer HLA haplotypes in the Finnish population compared with mixed populations, and the common Finnish HLA haplotypes are more frequent. Using HLA antibody identification and panel reactive antibody calculations we show that a virtual population-specific panel, combined with single antigen testing, gives a more accurate and reliable estimate of the reactivity of the recipient serum against potential solid organ donors within the Finnish population. The results can be directly used to improve donor search for patients waiting for stem cell transplantation and to allocate highly immunised patients accurately to acceptable mismatch programs. PMID:23216287

Haimila, K; Peräsaari, J; Linjama, T; Koskela, S; Saarinenl, T; Lauronen, J; Auvinen, M-K; Jaatinen, T



Haplotype frequencies of the PowerPlex ® Y system in a Mexican-Mestizo population sample from Mexico City  

Microsoft Academic Search

The PowerPlex® Y system including 11 Y-STRs (DYS19, DYS389I/II, DYS390, DYS391, DYS392, DYS393, DYS385, DYS437, DYS438 and DYS439) was analyzed by capillary electrophoresis in 357 males from Mexico City. Haplotype frequency for this system was reported. The haplotype diversity was 99.56±0.04%, and gene diversity ranged from 51.4% for DYS393 to 92.5% for DYS385. AMOVA tests including previous reports from Mexico

A. Luna-Vázquez; G. Vilchis-Dorantes; M. O. Aguilar-Ruiz; A. Bautista-Rivas; A. Pérez-García; R. Orea-Ochoa; D. Villanueva-Hernández; J. F. Muñoz-Valle; H. Rangel-Villalobos
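
Haplotype diversity figures such as the one reported above are commonly computed with Nei's unbiased estimator, h = n/(n-1) × (1 − Σ p_i²). The sketch below assumes that estimator, since the truncated abstract does not state which formula was used. The function name and the toy sample are hypothetical.

```python
from collections import Counter


def haplotype_diversity(haplotypes) -> float:
    """Nei's unbiased haplotype diversity for a sample of haplotypes.

    Each haplotype can be any hashable value, for example a tuple of
    repeat numbers across the typed Y-STR loci.
    """
    n = len(haplotypes)
    if n < 2:
        raise ValueError("at least two haplotypes are required")
    counts = Counter(haplotypes)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


# Toy sample (illustrative only): four males, three distinct haplotypes.
sample = [(14, 13, 29), (14, 13, 29), (15, 13, 30), (13, 14, 29)]
print(f"haplotype diversity = {haplotype_diversity(sample):.4f}")
```

A sample in which every haplotype is distinct gives a diversity of exactly 1, which is why values close to 100% indicate that most sampled males carry a unique haplotype.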



Application of site and haplotype-frequency based approaches for detecting selection signatures in cattle  

PubMed Central

Background: 'Selection signatures' delimit regions of the genome that are, or have been, functionally important and have therefore been under either natural or artificial selection. In this study, two different and complementary methods, the integrated Haplotype Homozygosity Score (|iHS|) and the population differentiation index (FST), were applied to identify traces of decades of intensive artificial selection for traits of economic importance in modern cattle. Results: We scanned the genome of a diverse set of dairy and beef breeds from Germany, Canada and Australia genotyped with a 50 K SNP panel. Across breeds, a total of 109 extreme |iHS| values exceeded the empirical threshold level of 5% with 19, 27, 9, 10 and 17 outliers in Holstein, Brown Swiss, Australian Angus, Hereford and Simmental, respectively. Annotating the regions harboring clustered |iHS| signals revealed a panel of interesting candidate genes like SPATA17, MGAT1, PGRMC2 and ACTC1, COL23A1, MATN2, respectively, in the context of reproduction and muscle formation. In a further step, a new Bayesian FST-based approach was applied with a set of geographically separated populations including Holstein, Brown Swiss, Simmental, North American Angus and Piedmontese for detecting differentiated loci. In total, 127 regions exceeding the 2.5% threshold of the empirical posterior distribution were identified as extremely differentiated. In a substantial number of cases (56 out of 127), the extreme FST values were positioned in gene-poor regions, which deviated significantly (p < 0.05) from the expectation assuming a random distribution. However, significant FST values were found in regions of some relevant genes such as SMCP and FGF1. Conclusions: Overall, 236 regions putatively subject to recent positive selection in the cattle genome were detected. Both |iHS| and FST suggested selection in the vicinity of the sialic acid binding Ig-like lectin 5 gene on BTA18. 
This region was recently reported to be a major QTL with strong effects on productive life and fertility traits in Holstein cattle. We conclude that high-resolution genome scans of selection signatures can be used to identify genomic regions contributing to within- and inter-breed phenotypic variation.



Haplotype frequencies and population data of nine Y-chromosomal STR polymorphisms in a German and a Chinese population  

Microsoft Academic Search

Y-chromosomal STR loci are of increasing interest in paternity testing, forensic casework, anthropological and evolutionary studies. We participate in a cooperation to establish an international reference database of at least nine Y-chromosomal STR loci to be used for biostatistic calculations. We present frequency distributions of nine Y-chromosome specific STR polymorphisms and frequencies of compound haplotypes in two populations. We chose

M Hidding; C Schmitt



Human leukocyte antigen alleles, genotypes and haplotypes frequencies in renal transplant donors and recipients from West Central India  

PubMed Central

BACKGROUND: Human leukocyte antigen (HLA) comprises a highly polymorphic set of genes which determines the histocompatibility of organ transplantation. The present study was undertaken to identify HLA class I and class II allele, genotype and haplotype frequencies in renal transplant recipients and donors from West Central India. MATERIALS AND METHODS: HLA typing was carried out using Polymerase Chain Reaction-Sequence Specific Primer in 552 live related and unrelated renal transplant recipients and donors. RESULTS: The most frequent HLA class I and class II alleles and their frequencies in recipients were HLA-A*01 (0.1685) and A*02 (0.1649), HLA-B*35 (0.1322), and HLA-DR beta 1 (DRB1)*15 (0.2192), whereas in donors, these were HLA-A*02 (0.1848) and A*01 (0.1667), HLA-B*35 (0.1359), and HLA-DRB1*15 (0.2409). The two-locus haplotype statistical analysis revealed HLA-A*02-B61 as the most common haplotype, with frequencies of 0.0487 and 0.0510 in recipients and donors, respectively. Further, among the three-locus haplotypes, HLA-A*33-B*44-DRB1*07 and HLA-A*02-B*61-DRB1*15 were the most common, with frequencies of 0.0362 and 0.0326, respectively, in recipients and 0.0236 and 0.0323, respectively, in donors. Genotype frequency revealed a higher prevalence of genotype HLA-A*02/A*24 in recipients (0.058) than in donors (0.0109) and a lower prevalence of HLA-A*01/A*02 in recipients (0.0435) than in donors (0.0797). The phylogenetic and principal component analysis of HLA allele and haplotype frequency distribution revealed genetic similarities of various ethnic groups. Further, case control analysis provides preliminary evidence of association of HLA-A genotype (P < 0.05) with renal failure. CONCLUSION: This study will be helpful in suitable donor search besides providing valuable information for population genetics and HLA disease association analysis.

Patel, Jaina S.; Patel, Manisha M.; Koringa, Prakash G.; Shah, Tejas M.; Patel, Amrutlal K.; Tripathi, Ajai K.; Mathew, Anila; Rajapurkar, Mohan M.; Joshi, Chaitanya G.



FMR1 haplotype analyses among Indians: a weak founder effect and other findings  

Microsoft Academic Search

This study on allelic/haplotypic fragile X associations evaluated using STR (DXS548, FRAXAC1, FRAXAC2) and SNP (ATL1) markers flanking the (CGG)n locus of FMR1 is the first report from the large ethnically complex Indian population. Results have been compared with allele/haplotype distributions reported for other major ethnic groups, including White Caucasians, Africans, and Pacific Asians. Though overall allele frequency distributions at

Deepti Sharma; Meena Gupta; B. K. Thelma



SNP discovery, validation, haplotype structure and linkage disequilibrium in full-length herbage nutritive quality genes of perennial ryegrass ( Lolium perenne L.)  

Microsoft Academic Search

Development of accurate high-throughput molecular marker systems such as SNPs permits evaluation and selection of favourable gene variants to accelerate elite varietal production. SNP discovery in perennial ryegrass has been based on PCR amplification and sequencing of multiple amplicons designed to scan all components of the transcriptional unit. Full-length genes (with complete intron–exon structure and promoter information) corresponding to well-defined

Rebecca C. Ponting; Michelle C. Drayton; Noel O. I. Cogan; Mark P. Dobrowolski; Germán C. Spangenberg; Kevin F. Smith; John W. Forster



Estimating the frequency of Asian cytochrome B haplotypes in standard European and local Spanish pig breeds  

Microsoft Academic Search

Mitochondrial DNA has been widely used to perform phylogenetic studies in different animal species. In pigs, genetic variability at the cytochrome B gene and the D-loop region has been used as a tool to dissect the genetic relationships between different breeds and populations. In this work, we analysed four SNPs at the cytochrome B gene to infer the Asian (A1

Alex Clop; Marcel Amills; José Luís Noguera; Ana Fernández; Juan Capote; Misericòrdia Maria Ramón; Lucía Kelly; James MH Kijas; Leif Andersson; Armand Sànchez



HLA-A, -B, -C, -DRB1 Allele and Haplotype Frequencies Distinguish Eastern European Americans from the General European American Population  

PubMed Central

Sequence based typing was used to identify HLA-A,B,C,DRB1 alleles from 558 consecutively recruited U.S. volunteers with Eastern European ancestry for an unrelated hematopoietic stem cell registry. Four of the 31 HLA-A alleles, 29 -C alleles, 59 -B alleles, and 42 -DRB1 alleles identified (A*0325, B*440204, Cw*0332, and *0732N) are novel. The HLA-A*02010101g allele was observed at a frequency of 0.28. Two-, three- and four-locus haplotypes were estimated using the expectation maximization algorithm. The highest-frequency extended haplotypes (A*010101g-Cw*070101g-B*0801g-DRB1*0301 and A*03010101g-Cw*0702-B*0702-DRB1*1501) were observed at frequencies of 0.04 and 0.03, respectively. Linkage disequilibrium values (D’ij) of the constituent 2-locus haplotypes were highly significant for both extended haplotypes (p-values were less than 8 × 10^-10), but were consistently higher for the more frequent haplotype. Balancing selection was inferred to be acting on all four loci, with the strongest evidence of balancing selection observed for the HLA-C locus. Comparisons of the A-C-B haplotype and DRB1 frequencies in this population to those for African, European and western Asian populations revealed high degrees of identity with Czech, Polish, and Slovenian populations and significant differences from the general European American population.

Mack, Steven J.; Tu, Bin; Lazaro, Ana; Yang, Ruyan; Lancaster, Alex K.; Cao, Kai; Ng, Jennifer; Hurley, Carolyn Katovich
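
The expectation-maximization step mentioned in this record can be sketched in a reduced setting. The code below is an illustration only: it handles two biallelic markers rather than the multi-allelic HLA loci of the study, and the function name, genotype coding (0, 1 or 2 copies of the alternate allele) and toy data are assumptions. Only double heterozygotes are phase-ambiguous, so the E-step splits each of them between the two possible haplotype pairs in proportion to the current frequency estimates.

```python
def em_two_locus(genotypes, n_iter: int = 100):
    """Estimate haplotype frequencies [00, 01, 10, 11] from unphased
    two-locus genotypes with a simple EM algorithm."""
    freqs = [0.25] * 4  # start from equal frequencies
    n_chrom = 2 * len(genotypes)
    for _ in range(n_iter):
        counts = [0.0] * 4
        for g1, g2 in genotypes:
            if g1 == 1 and g2 == 1:
                # E-step: double heterozygote is either 00/11 or 01/10.
                cis = freqs[0] * freqs[3]
                trans = freqs[1] * freqs[2]
                total = cis + trans
                w = cis / total if total > 0 else 0.5
                counts[0] += w
                counts[3] += w
                counts[1] += 1.0 - w
                counts[2] += 1.0 - w
            else:
                # Phase is unambiguous: read off the two haplotypes directly.
                a = (g1 // 2, (g1 + 1) // 2)
                b = (g2 // 2, (g2 + 1) // 2)
                for x, y in zip(a, b):
                    counts[2 * x + y] += 1.0
        # M-step: expected haplotype counts become the new frequencies.
        freqs = [c / n_chrom for c in counts]
    return freqs


# Toy data (illustrative only): 4 x 00/00, 4 x 11/11, 2 double heterozygotes.
toy = [(0, 0)] * 4 + [(2, 2)] * 4 + [(1, 1)] * 2
print([round(f, 4) for f in em_two_locus(toy)])
```

In the toy data the unambiguous individuals carry only the 00 and 11 haplotypes, so the iterations push the double heterozygotes toward the 00/11 phase and the estimates approach 0.5 for each of those two haplotypes.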



Linear reduction methods for tag SNP selection.  


It is widely hoped that constructing a complete human haplotype map will help to associate complex diseases with certain SNPs. Unfortunately, the number of SNPs is huge and it is very costly to sequence many individuals. Therefore, it is desirable to reduce the number of SNPs that must be sequenced to a considerably smaller number of informative representatives, so-called tag SNPs. In this paper, we propose a new linear-algebra-based method for selecting and using tag SNPs. Our method is purely combinatorial and can be combined with linkage disequilibrium (LD) and block-based methods. We measure the quality of our tag SNP selection algorithm by comparing actual SNPs with SNPs linearly predicted from linearly chosen tag SNPs. We obtain extremely good compression and prediction rates. For example, for long haplotypes (>25000 SNPs), knowing only 0.4% of all SNPs we predict the entire unknown haplotype with 2% accuracy while the prediction method is based on a 10% sample of the population. PMID:17270869

He, Jingwu; Zelikovsky, Alex



Genome-Wide SNP Detection, Validation, and Development of an 8K SNP Array for Apple  

PubMed Central

As high-throughput genetic marker screening systems are essential for a range of genetics studies and plant breeding applications, the International RosBREED SNP Consortium (IRSC) has utilized the Illumina Infinium® II system to develop a medium- to high-throughput SNP screening tool for genome-wide evaluation of allelic variation in apple (Malus×domestica) breeding germplasm. For genome-wide SNP discovery, 27 apple cultivars were chosen to represent worldwide breeding germplasm and re-sequenced at low coverage with the Illumina Genome Analyzer II. Following alignment of these sequences to the whole genome sequence of ‘Golden Delicious’, SNPs were identified using SoapSNP. A total of 2,113,120 SNPs were detected, corresponding to one SNP to every 288 bp of the genome. The Illumina GoldenGate® assay was then used to validate a subset of 144 SNPs with a range of characteristics, using a set of 160 apple accessions. This validation assay enabled fine-tuning of the final subset of SNPs for the Illumina Infinium® II system. The set of stringent filtering criteria developed allowed choice of a set of SNPs that not only exhibited an even distribution across the apple genome and a range of minor allele frequencies to ensure utility across germplasm, but also were located in putative exonic regions to maximize genotyping success rate. A total of 7867 apple SNPs was established for the IRSC apple 8K SNP array v1, of which 5554 were polymorphic after evaluation in segregating families and a germplasm collection. This publicly available genomics resource will provide an unprecedented resolution of SNP haplotypes, which will enable marker-locus-trait association discovery, description of the genetic architecture of quantitative traits, investigation of genetic variation (neutral and functional), and genomic selection in apple.

Chagne, David; Crowhurst, Ross N.; Troggio, Michela; Davey, Mark W.; Gilmore, Barbara; Lawley, Cindy; Vanderzande, Stijn; Hellens, Roger P.; Kumar, Satish; Cestaro, Alessandro; Velasco, Riccardo; Main, Dorrie; Rees, Jasper D.; Iezzoni, Amy; Mockler, Todd; Wilhelm, Larry; Van de Weg, Eric; Gardiner, Susan E.; Bassil, Nahla; Peace, Cameron



Application of site and haplotype-frequency based approaches for detecting selection signatures in cattle  

Microsoft Academic Search

Background: 'Selection signatures' delimit regions of the genome that are, or have been, functionally important and have therefore been under either natural or artificial selection. In this study, two different and complementary methods, the integrated Haplotype Homozygosity Score (|iHS|) and the population differentiation index (FST), were applied to identify traces of decades of intensive artificial selection for traits of economic importance in modern cattle. Results: We

Saber Qanbari; Daniel Gianola; Ben Hayes; Flavio Schenkel; Steve Miller; Stephen Moore; Georg Thaller; Henner Simianer



[Gene and haplotype frequencies for the loci HLA-A, B and DRB1 in 11755 north Chinese Han bone marrow registry donors].  


The study aimed to investigate the human leukocyte antigen (HLA)-A, B and DRB1 allele and haplotype frequencies and the characteristics of linkage disequilibrium in north Chinese Han bone marrow donors. HLA phenotype data of 11 755 north Chinese Han bone marrow donors were identified by PCR-SSP and PCR-SSO. HLA-A, B and DRB1 allele and haplotype frequencies were calculated with the Arlequin software, which is based on the Expectation-Maximization (EM) algorithm. The population of 11 755 unrelated donors was tested for Hardy-Weinberg equilibrium, and 18, 42 and 15 specificities of HLA alleles were identified at the HLA-A, B and DRB1 loci respectively, including HLA-A25, B42, B53, B73 and DR3, which have rarely been reported in the Han population. HLA-A36, A43, A80, B78, B82 and DR18 were not detected in this study. The most frequent alleles, with a frequency of over 0.05, were HLA-A*02, A*11, A*24, A*33, A*30, A*01, A*03, A*13, B62, B*51, B*46, B60, B61, B*35, B*44, DRB1*15, DRB1*09, DRB1*04, DRB1*07, DRB1*12, DRB1*11, DRB1*14, DRB1*08, DRB1*13. A total of 2 026 HLA-A-B-DR haplotypes (with a frequency above 10^-6) were obtained. Each of 26 three-locus haplotypes, including HLA-A30-B13-DR7, A2-B46-DR9 and A33-B58-DR17, had a frequency higher than 0.005. A30-B13-DR7 was the most frequent haplotype in the north Chinese Han population. A total of 538 HLA-A-B, 227 A-DR and 522 B-DR haplotypes were obtained, of which 409, 195 and 423 respectively had a frequency higher than 10^-6. There were 28 HLA-A-B haplotypes (including A30-B13, A2-B46 and A33-B58), 26 HLA-A-DR haplotypes (including A2-DR9, A2-DR15 and A30-DR7) and 24 HLA-B-DR haplotypes (including B13-DR7, B46-DR9 and B13-DR12) with a frequency higher than 0.01. 
296 (72%) HLA-A-B, 130 (67%) A-DR and 308 (73%) B-DR haplotypes were in statistically significant linkage disequilibrium. HLA-A30-B13, A33-B58, A1-B37, A30-DR7, A33-DR13, A1-DR10, B37-DR10, B8-DR17, B13-DR7 and B58-DR17 showed significant positive linkage disequilibrium. It is concluded that these HLA-A, B and DRB1 gene and haplotype frequency and linkage disequilibrium data, based on the largest sample size to date, are unique for the north Chinese Han population. The study will be helpful in finding matched donors for patients and lays an important foundation for further study of transplantation immunity, HLA-related diseases and population genetics in this area. PMID:17493347

Wu, Qiang-Ju; Liu, Meng-Li; Qi, Jun; Liu, Sheng; Zhang, Yan; Wei, Xiao-Qian



COL1A1 haplotypes and hip fracture.  


Fragility fractures resulting from low-trauma events such as a fall from standing height are associated with osteoporosis and are very common in older people, especially women. Three single nucleotide polymorphisms (SNPs) at the COL1A1 gene (rs1107946, rs11327935, and rs1800012) have been widely studied and previously associated with bone mineral density (BMD) and fracture. A rare haplotype (T-delT-T) of these three SNPs was found to be greatly overrepresented in fractured individuals compared with nonfractured controls, thus becoming a good candidate for predicting increased fracture risk. The aim of our study was to assess the association of this haplotype with fracture risk in Spanish individuals. We recruited two independent groups of ≈100 patients with hip fracture (a total of 203 individuals) and compared the genotype and haplotype distributions of the three SNPs in the fractured patients with those of 397 control individuals from the BARCOS Spanish cohort. We found no association with risk of fracture at the genotype level for any of the SNPs, and no differences in the SNP frequencies between the two groups. At the haplotype level, we found no association between the T-delT-T haplotype and fracture. However, we observed a small but significant (p = 0.03) association with another rare haplotype, G-insT-T, which was slightly overrepresented in the patient group. PMID:22190259

Urreizti, Roser; Garcia-Giralt, Natàlia; Riancho, José A; González-Macías, Jesús; Civit, Sergi; Güerri, Roberto; Yoskovitz, Guy; Sarrion, Patricia; Mellivobsky, Leonardo; Díez-Pérez, Adolfo; Nogués, Xavier; Balcells, Susana; Grinberg, Daniel



Haplotyping Problem, A Clustering Approach  

NASA Astrophysics Data System (ADS)

Construction of two haplotypes from a set of Single Nucleotide Polymorphism (SNP) fragments is called the haplotype reconstruction problem. One of the most popular computational models for this problem is Minimum Error Correction (MEC). Since MEC is an NP-hard problem, here we propose a novel heuristic algorithm based on clustering analysis in data mining for the haplotype reconstruction problem. Based on Hamming distance and similarity between two fragments, our iterative algorithm produces two clusters of fragments; then, in each iteration, the algorithm assigns a fragment to one of the clusters. Our results suggest that the algorithm has a lower reconstruction error rate than other algorithms.

Eslahchi, Changiz; Sadeghi, Mehdi; Pezeshk, Hamid; Kargar, Mehdi; Poormohammadi, Hadi
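
As a rough illustration of the clustering idea in the haplotype reconstruction record above, the following Python sketch splits SNP fragments into two clusters by Hamming distance and takes a per-site majority vote as each cluster's haplotype. It is not the authors' algorithm: the seeding rule, the gap symbol "-", the tie-breaking, and all function names are assumptions made for illustration only.

```python
from typing import List, Tuple

GAP = "-"  # assumed symbol for a site that a fragment does not cover


def hamming(a: str, b: str) -> int:
    """Count mismatches over the sites that both strings cover."""
    return sum(1 for x, y in zip(a, b) if x != GAP and y != GAP and x != y)


def consensus(frags: List[str], length: int) -> str:
    """Per-site majority vote over a cluster (ties and empty columns give '0')."""
    out = []
    for i in range(length):
        ones = sum(1 for f in frags if f[i] == "1")
        zeros = sum(1 for f in frags if f[i] == "0")
        out.append("1" if ones > zeros else "0")
    return "".join(out)


def cluster_fragments(frags: List[str], max_iter: int = 20) -> Tuple[str, str, int]:
    """Two-cluster heuristic for MEC; returns (hap1, hap2, corrections needed)."""
    n = len(frags[0])
    # Seed the two clusters with the pair of fragments that disagree the most.
    pairs = [(hamming(a, b), i, j)
             for i, a in enumerate(frags)
             for j, b in enumerate(frags) if i < j]
    _, s1, s2 = max(pairs)
    h1 = frags[s1].replace(GAP, "0")
    h2 = frags[s2].replace(GAP, "0")
    for _ in range(max_iter):
        # Assign each fragment to the closer haplotype, then re-vote.
        c1 = [f for f in frags if hamming(f, h1) <= hamming(f, h2)]
        c2 = [f for f in frags if hamming(f, h1) > hamming(f, h2)]
        new1, new2 = consensus(c1, n), consensus(c2, n)
        if (new1, new2) == (h1, h2):
            break
        h1, h2 = new1, new2
    # MEC cost: entries that must be flipped so every fragment fits a haplotype.
    cost = sum(min(hamming(f, h1), hamming(f, h2)) for f in frags)
    return h1, h2, cost
```

The returned cost is the quantity that MEC seeks to minimize; a heuristic of this kind gives no guarantee of reaching the minimum.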



Direct determination of molecular haplotypes by chromosome microdissection  

PubMed Central

Direct observation of haplotypes is still technically challenging. Here we report a method for the determination of haplotypes through chromosome microdissection. We determine human haplotypes with more than 98.85% accuracy at 24,245 heterozygous single-nucleotide polymorphism (SNP) loci in genome-wide chromosome-range phasing distance.

Ma, Li; Xiao, Yan; Huang, Hui; Wang, Qingwei; Rao, Weinian; Feng, Yue; Zhang, Kui; Song, Qing



A fast collapsed data method for estimating haplotype frequencies from pooled genotype data with applications to the study of rare variants.  


Haplotype information could lead to more powerful tests of genetic association than single-locus analyses but it is not easy to estimate haplotype frequencies from genotype data due to phase ambiguity. The challenge is compounded when individuals are pooled together to save costs or to increase sample size, which is crucial in the study of rare variants. Existing expectation-maximization type algorithms are slow and cannot cope with large pool size or long haplotypes. We show that by collapsing the total allele frequencies of each pool suitably, the maximum likelihood estimates of haplotype frequencies based on the collapsed data can be calculated very quickly regardless of pool size and haplotype length. We provide a running time analysis to demonstrate the considerable savings in time that the collapsed data method can bring. The method is particularly well suited to estimating certain union probabilities useful in the study of rare variants. We provide theoretical and empirical evidence to suggest that the proposed estimation method will not suffer much loss in efficiency if the variants are rare. We use the method to analyze re-sequencing data collected from a case control study involving 148 obese persons and 150 controls. Focusing on a region containing 25 rare variants around the MGLL gene, our method selects three rare variants as potentially causal. This is more parsimonious than the 12 variants selected by a recently proposed covering method. From another set of 32 rare variants around the FAAH gene, we discover an interesting potential interaction between two of them. PMID:22855289

Kuk, Anthony Y C; Li, Xiang; Xu, Jinfeng
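
One simple way to see why collapsing pooled data can make estimation fast is the union probability mentioned in the record above, that is, the probability that a haplotype carries at least one rare variant. The Python sketch below is a simplified illustration, not the authors' estimator. It assumes that every pool holds the same number of diploid individuals, that the haplotypes in a pool are independent draws, and that each pool is collapsed to a single indicator of whether any variant allele was seen. Under those assumptions a pool of k individuals shows no variant with probability (1 - theta)^(2k), which gives a closed-form maximum likelihood estimate. The function name and arguments are hypothetical.

```python
def union_probability(pools_without_variant: int, n_pools: int, pool_size: int) -> float:
    """Estimate theta = P(a haplotype carries at least one variant).

    pools_without_variant: pools in which no variant allele was observed
    n_pools: total number of pools
    pool_size: diploid individuals per pool (2 * pool_size haplotypes)
    """
    if n_pools <= 0 or pool_size <= 0:
        raise ValueError("n_pools and pool_size must be positive")
    if not 0 <= pools_without_variant <= n_pools:
        raise ValueError("pools_without_variant must lie in [0, n_pools]")
    clean_fraction = pools_without_variant / n_pools
    # Solve (1 - theta) ** (2 * pool_size) = clean_fraction for theta.
    return 1.0 - clean_fraction ** (1.0 / (2 * pool_size))
```

Because the estimate needs only one count per data set, its cost does not grow with pool size or haplotype length, which is the property the abstract emphasizes for the collapsed data method in general.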



Discovery of high frequencies of the Gly-Ile haplotype of TLR4 in Indian populations requires reformulation of the evolutionary model of its maintenance.  


The Out-of-Africa migration of modern humans has led to the evolution of immunity genes in general, particularly those related to direct host-pathogen interactions. The Toll-like receptor 4 (TLR4) is one such cell-surface pattern recognition receptor that has been associated with susceptibility and resistance to Gram-negative infections. In this report, we have studied the genetic variation in the TLR4 gene across pre- and post-agricultural populations in India. Two non-synonymous SNPs at the loci Asp299Gly and Thr399Ile are genotyped in 266 individuals from these populations. Previous studies have shown that specific alleles at these two loci are associated with inflammatory response and also claimed the complete absence of the Gly-Ile (double-mutated) haplotype in populations from Asia and America due to some evolutionary disadvantage owing to septic shock. Contrary to such claims, our study reports, for the first time, high (10%) to moderate (3-6%) frequencies of the Gly-Ile haplotype in one non-tribal and two tribal populations of India respectively. The presence of this haplotype in ancient tribal populations of India indicates the possibility of its important role in pathogen recognition or susceptibility to infections. Therefore, natural selection, not merely genetic drift, may have played an important role in shaping the frequency distribution of haplotypes at these two loci in TLR4. For a more global perspective, we have also estimated the frequency of this haplotype in all the 14 continental populations included in the 1000 Genomes Project. Our study provides direct evidence for the reformulation of existing models of evolutionary maintenance of these polymorphisms in the TLR4 gene. PMID:23892373

Mukherjee, Souvik; Ganguli, Debdutta; Majumder, Partha P



Pure parsimony xor haplotyping.  


The haplotype resolution from xor-genotype data has been recently formulated as a new model for genetic studies. The xor-genotype data is a cheaply obtainable type of data distinguishing heterozygous from homozygous sites without identifying the homozygous alleles. In this paper, we propose a formulation based on a well-known model used in haplotype inference: pure parsimony. We exhibit exact solutions of the problem by providing polynomial time algorithms for some restricted cases and a fixed-parameter algorithm for the general case. These results are based on some interesting combinatorial properties of a graph representation of the solutions. Furthermore, we show that the problem has a polynomial time k-approximation, where k is the maximum number of xor-genotypes containing a given single nucleotide polymorphism (SNP). Finally, we propose a heuristic and produce an experimental analysis showing that it scales to real-world large instances taken from the HapMap project. PMID:20498511

Bonizzoni, Paola; Della Vedova, Gianluca; Dondi, Riccardo; Pirola, Yuri; Rizzi, Romeo
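
To make the data type in the record above concrete, the following Python sketch computes a xor-genotype as the set of sites at which two haplotypes differ, and checks whether a candidate set of haplotypes explains a list of xor-genotypes. It only illustrates the problem statement; it is not one of the paper's algorithms, and the function names and the 0/1 string encoding are assumptions.

```python
from itertools import combinations_with_replacement
from typing import FrozenSet, Iterable, List


def xor_genotype(h1: str, h2: str) -> FrozenSet[int]:
    """Return the heterozygous sites of the individual carrying h1 and h2."""
    if len(h1) != len(h2):
        raise ValueError("haplotypes must have the same length")
    return frozenset(i for i, (a, b) in enumerate(zip(h1, h2)) if a != b)


def resolves(haplotypes: List[str], xor_genotypes: Iterable[Iterable[int]]) -> bool:
    """True if every xor-genotype is produced by some pair of the haplotypes."""
    produced = {xor_genotype(a, b)
                for a, b in combinations_with_replacement(haplotypes, 2)}
    return all(frozenset(g) in produced for g in xor_genotypes)
```

Pure parsimony asks for the smallest haplotype set for which `resolves` returns True. Note that a xor-genotype cannot tell a haplotype pair from its bitwise complement: `xor_genotype("0000", "0101")` and `xor_genotype("1111", "1010")` are the same set.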


APC Yin-Yang haplotype associated with colorectal cancer risk  

PubMed Central

The Yin-Yang haplotype is defined as two mismatched haplotypes (Yin and Yang) representing the majority of the existing haplotypes in a particular genomic region. The human adenomatous polyposis coli (APC) gene shows a Yin-Yang haplotype pattern accounting for 84% of all of the haplotypes existing in the Spanish population. Several association studies have been published regarding APC gene variants (SNPs and haplotypes) and colorectal cancer (CRC) risk. However, no studies concerning diplotype structure and CRC risk have been conducted. The aim of the present study was to investigate whether the APC Yin-Yang homozygote diplotype is over-represented in patients with sporadic CRC when compared to its distribution in controls, and its association with CRC risk. TaqMan® assays were used to genotype three tagSNPs selected across the APC Yin-Yang region. Frequencies of the APC Yin-Yang tagSNP alleles, haplotype and diplotype of 378 CRC cases and 642 controls were compared. Two Spanish CRC group samples were included [Hospital Clínico San Carlos in Madrid (HCSC) and Instituto Catalán de Oncología in Barcelona (ICO)]. Analysis of 157 consecutive CRC patients and 405 control subjects from HCSC showed a significant effect on the risk of CRC (OR=1.93; 95% CI 1.32–2.81; P=0.001). However, this effect was not confirmed in 221 CRC patients and 237 control subjects from ICO (OR=0.89; 95% CI 0.61–1.28; P=0.521). We found a significant association between the APC homozygote Yin-Yang diplotype and the risk of colorectal cancer in the HCSC samples. However, we did not observe this association in the ICO samples. These observations suggest that a study with a larger Spanish cohort is necessary to confirm the effects of the APC Yin-Yang diplotype on the risk of CRC.
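
Odds ratios with 95% confidence intervals, as reported in the record above, are commonly computed from a 2x2 table of carriers and non-carriers among cases and controls. The Python sketch below uses the standard log-odds (Woolf) interval. The abstract does not give the underlying diplotype counts, so the function name and any counts passed to it are hypothetical, and the authors may have used a different interval method.

```python
import math
from typing import Tuple


def odds_ratio_ci(case_carriers: int, case_noncarriers: int,
                  control_carriers: int, control_noncarriers: int,
                  z: float = 1.96) -> Tuple[float, float, float]:
    """Return (odds ratio, lower bound, upper bound) using the Woolf interval.

    All four counts must be positive; z = 1.96 gives an approximate 95% CI.
    """
    a, b, c, d = case_carriers, case_noncarriers, control_carriers, control_noncarriers
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) is the root of the summed reciprocal counts.
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper
```

An interval that excludes 1, as in the HCSC sample, corresponds to a significant association at roughly the 5% level; an interval that contains 1, as in the ICO sample, does not.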




FMR1 haplotype analyses among Indians: a weak founder effect and other findings.  


This study on allelic/haplotypic fragile X associations evaluated using STR (DXS548, FRAXAC1, FRAXAC2) and SNP (ATL1) markers flanking the (CGG)(n) locus of FMR1 is the first report from the large ethnically complex Indian population. Results have been compared with allele/haplotype distributions reported for other major ethnic groups, including White Caucasians, Africans, and Pacific Asians. Though overall allele frequency distributions at the individual loci are more similar to Western Caucasians compared with others, significant differences are observed in haplotypic associations with the mutated X. The striking findings are: (1) high diversity and heterozygosity of haplotypes among fragile X chromosomes (n=40) and controls (n=262), including four haplotypes found exclusively in this study sample; (2) weak association of DXS548-FRAXAC1-FRAXAC2 haplotypes, 2-1-3, 6-3-3+ and 7-4-6+ with the disorder, and absence of White Caucasian fragile X haplotypes 6-4-4 and 6-4-5; (3) weak founder effect for the fragile X expansion mutation in the Indians; (4) lack of a continuum of haplotype-based FMR1 alleles between intermediate (CGG)(n) size ranges and expanded alleles; (5) exclusion of ATL1 as a candidate genetic indicator of FMR1 instability. The high STR-based haplotype diversity observed among fragile X lineages, irrespective of ethnic alliances, strongly suggests the inappropriateness of using STR haplotypes to infer predisposition to instability among ethnically separated fragile X pedigrees and may reiterate the need for identifying newer SNPs from this region to not only determine true founder effects for the fragile X mutation, but also decipher possible mechanisms leading to CGG instability. PMID:12596051

Sharma, Deepti; Gupta, Meena; Thelma, B K



Haplotype variation, recombination, and gene conversion within the turkey MHC-B locus.  


The major histocompatibility complex (MHC) is a gene dense region with profound effects on the disease phenotype. In many species, characterizations of MHC polymorphisms have focused on identifying allelic haplotypes of the highly polymorphic class I and class II loci through direct immunological approaches such as monoclonal antibodies specific for the major antigens or indirectly through DNA sequence-based approaches. Invariably, these studies fail to assess the broader range of variation at the other loci within the MHC. This study examines variation in the turkey MHC by resequencing 15 interspersed amplicons (approximately 14 kb) spaced across the MHC-B locus in a representative sampling of 52 commercial birds. Over 200 single nucleotide polymorphisms (SNPs) were identified with high levels of polymorphism (1 SNP/70 bp) and heterozygosity (average minor allele frequency of 0.15). SNP genotypes were used to identify the major haplotypes segregating in the commercial lines. Sequencing of the peptide binding region (PBR, exon 2) of the class IIB loci of select individuals identified 10 PBR alleles/isotypes among the major MHC haplotypes. Examination of pedigreed families provides direct evidence of gene conversion and recombination within the B locus. Results of this study demonstrate the MHC diversity available in commercial flocks and provide genomic resources for studying the effect of this diversity (alleles and/or haplotypes) on disease susceptibility and resistance. PMID:20461369

Chaves, Lee D; Faile, Gretchen M; Krueth, Stacy B; Hendrickson, Julie A; Reed, Kent M



Haplotype block structures show significant variation among populations  

Microsoft Academic Search

Recent studies suggest that haplotypes tend to have block-like structures throughout the human genome. Several methods were proposed for haplotype block partitioning and for tagging single-nucleotide polymorphism (SNP) identification. In population genetics studies, several research groups compared block structures across human populations. However, the measures used to quantify population similarity are either less than satisfactory or nonexistent. In this article,

Nianjun Liu; Sarah L. Sawyer; Namita Mukherjee; Andrew J. Pakstis; Judith R. Kidd; Kenneth K. Kidd; Anthony J. Brookes; Hongyu Zhao



Association between two μ-opioid receptor gene (OPRM1) haplotype blocks and drug or alcohol dependence  

PubMed Central

We examined 13 single nucleotide polymorphisms (SNPs) spanning the coding region of the μ-opioid receptor gene (OPRM1), among 382 European Americans (EAs) affected with substance dependence [alcohol dependence (AD) and/or drug dependence (DD)] and 338 EA healthy controls. These SNPs delineated two haplotype blocks. Genotype distributions for all SNPs were in Hardy–Weinberg equilibrium (HWE) in controls, but in cases, four SNPs in Block I and three SNPs in Block II showed deviation from HWE. Significant differences were found between cases and controls in allele and/or genotype frequencies for six SNPs in Block I and two SNPs in Block II. Association of SNP4 in Block I with DD (allele: P = 0.004), SNP5 in Block I with AD and DD (allele: P ≤ 0.005 for both) and two SNPs in Block II with AD (SNP11 genotype: P = 0.002; SNP12 genotype: P = 0.001) were significant after correction for multiple testing. Frequency distributions of haplotypes (constructed by five tag SNPs) differed significantly for cases and controls (P < 0.001 for both AD and DD). Logistic regression analyses confirmed the association between OPRM1 variants and substance dependence, when sex and age of subjects and alleles, genotypes, haplotypes or diplotypes of five tag SNPs were considered. Population structure analyses excluded population stratification artifact. Additional supporting evidence for association between OPRM1 and AD was obtained in a smaller Russian sample (247 cases and 100 controls). These findings suggest that OPRM1 intronic variants play a role in susceptibility to AD and DD in populations of European ancestry.

Zhang, Huiping; Luo, Xingguang; Kranzler, Henry R.; Lappalainen, Jaakko; Yang, Bao-Zhu; Krupitsky, Evgeny; Zvartau, Edwin; Gelernter, Joel
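
The Hardy-Weinberg equilibrium check mentioned in the record above can be illustrated with a goodness-of-fit test on the three genotype counts of a biallelic SNP. The Python sketch below is a generic textbook version, not the procedure used in the study; the function name and the example counts are hypothetical.

```python
from typing import Tuple


def hwe_chi_square(n_aa: int, n_ab: int, n_bb: int) -> Tuple[float, float]:
    """Return (frequency of allele A, chi-square statistic) for one SNP.

    Expected counts under HWE are n*p^2, 2*n*p*q and n*q^2. The statistic
    has 1 degree of freedom, so values above 3.84 indicate deviation at the
    5% level.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("at least one genotype is required")
    p = (2 * n_aa + n_ab) / (2 * n)  # allele A frequency by gene counting
    q = 1.0 - p
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    # Skip cells with zero expectation (a monomorphic SNP).
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    return p, chi2
```

For small samples or rare alleles an exact test is usually preferred over this chi-square approximation.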



A method for calling copy number polymorphism using haplotypes  

PubMed Central

Single nucleotide polymorphism (SNP) and copy number variation (CNV) are both widespread characteristics of the human genome, but are often called separately on common genotyping platforms. To capture integrated SNP and CNV information, methods have been developed for calling allele-specific copy numbers, or so-called copy number polymorphism (CNP), using limited inter-marker correlation. In this paper, we proposed a haplotype-based maximum likelihood method to call CNP, which takes advantage of the valuable multi-locus linkage disequilibrium (LD) information in the population. We also developed a computationally efficient algorithm to estimate haplotype frequencies and optimize individual CNP calls iteratively, even in the presence of missing data. Through simulations, we demonstrated that our model is more sensitive and accurate in detecting various CNV regions, compared with commonly used CNV calling methods including PennCNV, another hidden Markov model (HMM) using CNP, a scan statistic, segCNV, and cnvHap. Our method often performs better in regions with higher LD, in longer CNV regions, and in common CNVs than in the opposite cases. We implemented our method on the genotypes of 90 HapMap CEU samples and 23 patients with acute lung injury (ALI). For each ALI patient the genotyping was performed twice. The CNPs from our method show good consistency and accuracy comparable to others.

Ho Jang, Gun; Christie, Jason D.; Feng, Rui



Whole-genome resequencing of two elite sires for the detection of haplotypes under selection in dairy cattle  

PubMed Central

Using a combination of whole-genome resequencing and high-density genotyping arrays, genome-wide haplotypes were reconstructed for two of the most important bulls in the history of the dairy cattle industry, Pawnee Farm Arlinda Chief (“Chief”) and his son Walkway Chief Mark (“Mark”), each accounting for ~7% of all current genomes. We aligned 20.5 Gbp (~7.3× coverage) and 37.9 Gbp (~13.5× coverage) of the Chief and Mark genomic sequences, respectively. More than 1.3 million high-quality SNPs were detected in Chief and Mark sequences. The genome-wide haplotypes inherited by Mark from Chief were reconstructed using ~1 million informative SNPs. Comparison of a set of 15,826 SNPs that overlapped in the sequence-based and BovineSNP50 SNPs showed the accuracy of the sequence-based haplotype reconstruction to be as high as 97%. By using the BovineSNP50 genotypes, the frequencies of Chief alleles on his two haplotypes then were determined in 1,149 of his descendants, and the distribution was compared with the frequencies that would be expected assuming no selection. We identified 49 chromosomal segments in which Chief alleles showed strong evidence of selection. Candidate polymorphisms for traits that have been under selection in the dairy cattle population then were identified by referencing Chief’s DNA sequence within these selected chromosome blocks. Eleven candidate genes were identified with functions related to milk-production, fertility, and disease-resistance traits. These data demonstrate that haplotype reconstruction of an ancestral proband by whole-genome resequencing in combination with high-density SNP genotyping of descendants can be used for rapid, genome-wide identification of the ancestor’s alleles that have been subjected to artificial selection.

Larkin, Denis M.; Daetwyler, Hans D.; Hernandez, Alvaro G.; Wright, Chris L.; Hetrick, Lorie A.; Boucek, Lisa; Bachman, Sharon L.; Band, Mark R.; Akraiko, Tatsiana V.; Cohen-Zinder, Miri; Thimmapuram, Jyothi; Macleod, Iona M.; Harkins, Timothy T.; McCague, Jennifer E.; Goddard, Michael E.; Hayes, Ben J.; Lewin, Harris A.



A functional haplotype in EIF2AK3, an ER stress sensor, is associated with lower bone mineral density.  


EIF2AK3 is a type I transmembrane protein that functions as an endoplasmic reticulum (ER) stress sensor to regulate global protein synthesis. Rare mutations in EIF2AK3 cause Wolcott-Rallison syndrome (OMIM 226980), an autosomal recessive disorder characterized by diabetes, epiphyseal dysplasia, osteoporosis, and growth retardation. To investigate the role of common genetic variation in EIF2AK3 as a determinant of bone mineral density (BMD) and osteoporosis, we sequenced all exons and flanking regions, then genotyped six potentially functional single nucleotide polymorphisms (SNPs) in this gene in 997 Amish subjects for association analysis, and attempted replication in 887 Mexican Americans. We found that the minor allele of a nonsynonymous SNP rs13045 had borderline associations with decreased forearm BMD in both discovery and replication cohorts (unadjusted p = 0.036 and β = -0.007 for the Amish; unadjusted p = 0.031 and β = -0.008 for Mexican Americans). A meta-analysis indicated this association achieved statistical significance in the combined sample (unadjusted p = 0.003; Bonferroni corrected p = 0.009). Rs13045 and three other potentially functional SNPs, a promoter SNP (rs6547787) and two nonsynonymous SNPs (rs867529 and rs1805165), formed two haplotypes: a low-BMD associated haplotype, denoted haplotype B [minor allele frequency (MAF) = 0.311] and a common haplotype A (MAF = 0.676). There were no differences in mRNA expression in lymphoblastoid cell lines between the two haplotypes. However, after treating lymphoblastoid cell lines with thapsigargin to induce ER stress, cell lines with haplotype B showed increased sensitivity to ER stress (p = 0.014) compared with cell lines with haplotype A. Taken together, our results suggest that common nonsynonymous sequence variants in EIF2AK3 have a modest effect on ER stress response and may contribute to the risk for low BMD through this mechanism. PMID:22028037

Liu, Jie; Hoppman, Nicole; O'Connell, Jeffrey R; Wang, Hong; Streeten, Elizabeth A; McLenithan, John C; Mitchell, Braxton D; Shuldiner, Alan R



Human Leukocyte Antigens-A, -B, -C, -DRB1 allele and haplotype frequencies in Americans originating from Southern Europe: Contrasting patterns of population differentiation between Italian and Spanish Americans  

PubMed Central

High resolution DNA sequencing was used to identify the HLA-A, -B, -C, and -DRB1 alleles found in 552 individuals from the United States indicating Southern European (Italian or Spanish) heritage. A total of 46 HLA-A, 80 HLA-B, 32 HLA-C, and 50 DRB1 alleles were identified. Frequent alleles included A*02:01:01G (allele frequency = 0.26 in Italian Americans; 0.22 in Spanish Americans); B*07:02:01G (Italian Americans allele frequency = 0.11); B*44:03 (Spanish Americans allele frequency = 0.07); C*04:01:01G and C*07:01:01G (allele frequency = 0.13 and 0.16, respectively, in Italian Americans; 0.15 and 0.12, respectively, in Spanish Americans); and DRB1*07:01:01 (allele frequency = 0.12 in each population). The action of balancing selection was inferred at the HLA-B and -C loci in both populations. The A*01:01:01G-C*07:01:01G-B*08:01:01G-DRB1*03:01:01 haplotype was the most frequent A-C-B-DRB1 haplotype in Italian Americans (haplotype frequency = 0.049), and was the second most frequent haplotype in Spanish Americans (haplotype frequency = 0.021). A*29:02:01-C*16:01:01-B*44:03-DRB1*07:01:01 was the most frequent A-C-B-DRB1 haplotype in Spanish Americans (haplotype frequency = 0.023), and was observed at a frequency of 0.015 in Italian Americans. Pairwise F’st values measuring the degree of differentiation between these Southern European-American populations and European and European-American populations suggest that Spanish Americans constitute a distinct subset of the European-American population, most similar to Mexican Americans, whereas Italian Americans cannot be distinguished from the larger European-American population.

Mack, Steven J.; Tu, Bin; Yang, Ruyan; Masaberg, Carly; Ng, Jennifer; Hurley, Carolyn Katovich



High frequency of the IL-2 -330 T/HLA-DRB1*1501 haplotype in patients with multiple sclerosis.  


We have evaluated the role of the HLA-DRB1*1501 allele and the IL-2 -330 T/G polymorphism and their interaction in susceptibility to multiple sclerosis (MS) in 360 patients and 426 matched healthy individuals. We used the SSP-PCR method to determine the alleles. Fisher's exact test was used for the analyses. We observed a significant increase in the T allele at the IL-2 -330 position in patients (OR=1.34, P<0.05), and the T/T and T/G genotypes were more frequent among patients than controls. The HLA-DRB1*1501 allele was overrepresented in patients as compared to the control group (OR=1.7, P=0.0006). The two-locus analysis of the interaction between the IL-2 promoter polymorphism and the HLA-DRB1 allele showed that the HLA-DRB1*1501/T haplotype was more frequent in patients than controls (OR=16, P<0.0001). Our findings support previous findings about the role of the HLA-DRB1*1501 allele in susceptibility to MS. This work also provides new findings about the importance of gene-gene interactions in the development of MS. PMID:20594918

Shahbazi, Majid; Roshandel, Danial; Ebadi, Hamid; Fathi, Davood; Zamani, Mahdi; Boghaee, Mojdeh; Mohammadhoseeeni, Mana; Rshaidbaghan, Azam; Bakhshandeh, Azam; Shahbazi, Saleh



Particle swarm optimization algorithm for analyzing SNP-SNP interaction of renin-angiotensin system genes against hypertension.  


Most individually non-significant single nucleotide polymorphisms (SNPs) have gone undiscovered in hypertension association studies. Their possible SNP-SNP interactions were usually ignored, which contributed to missing heritability. In the present study, we proposed a particle swarm optimization (PSO) algorithm to analyze the SNP-SNP interactions associated with hypertension. A genotype dataset of eight SNPs of renin-angiotensin system genes for 130 non-hypertension and 313 hypertension subjects was included. Without SNP-SNP interaction, most individual SNPs showed no significant difference between the hypertension and non-hypertension groups. For SNP-SNP interaction, PSO can select the SNP combinations involving different SNP numbers, namely the best SNP barcodes, to show the maximum frequency difference between non-hypertension and hypertension groups. After computation, the best PSO-generated SNP barcodes were dominant in non-hypertension in terms of the occurrences of frequency differences between non-hypertension and hypertension groups. The OR values of the best SNP barcodes involving 2-8 SNPs were 0.705-0.334, suggesting that these SNP barcodes were protective against hypertension. In conclusion, this study demonstrated that non-significant SNPs may generate a joint effect in association studies. Our proposed PSO algorithm is effective in identifying the best protective SNP barcodes against hypertension. PMID:23695493

Wu, Shyh-Jong; Chuang, Li-Yeh; Lin, Yu-Da; Ho, Wen-Hsien; Chiang, Fu-Tien; Yang, Cheng-Hong; Chang, Hsueh-Wei



A haplotype inference method based on sparsely connected multi-body ising model  

NASA Astrophysics Data System (ADS)

Statistical haplotype inference is an indispensable technique in the field of medical science. The method usually has two steps: inference of haplotype frequencies and inference of diplotype for each subject. The first step can be done by using the expectation-maximization (EM) algorithm, but it incurs an unreasonably large calculation cost when the number of single-nucleotide polymorphism (SNP) loci of concern is large. In this article, we describe an approximate probabilistic model of haplotype frequencies. The model is constructed by using several distributions of nearby local SNPs. This approximation seems good because SNPs are generally more strongly correlated when they are close to one another on a chromosome. To implement this approach, we use a log linear model, the Walsh-Hadamard transform, and a combinatorial optimization method. Artificial data suggested that the overall haplotype inference of our method is good if there are nine or more local consecutive SNPs. Some minor problems should be dealt with before this method can be applied to real data.

Kato, Masashi; Gao, Qian Ji; Chigira, Hiroshi; Shindo, Hiroyuki; Inoue, Masato
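
The record above names the expectation-maximization (EM) algorithm as the usual first step for inferring haplotype frequencies. As a point of reference, the Python sketch below runs EM for the smallest non-trivial case, two biallelic SNPs, where only the double heterozygote has ambiguous phase. It is the textbook EM, not the authors' Ising-model approximation; the genotype encoding (count of allele "1" at each SNP), the fixed iteration count, and the function name are assumptions.

```python
from collections import Counter
from typing import Dict, List, Tuple

HAPS = ("00", "01", "10", "11")


def em_two_snp(genotypes: List[Tuple[int, int]], n_iter: int = 200) -> Dict[str, float]:
    """Estimate two-SNP haplotype frequencies from unphased genotypes.

    Each genotype is (g1, g2), the number of copies (0, 1 or 2) of allele
    "1" at SNP 1 and SNP 2.
    """
    freqs = {h: 0.25 for h in HAPS}
    counts = Counter(genotypes)
    n_hap = 2 * len(genotypes)
    for _ in range(n_iter):
        expected = dict.fromkeys(HAPS, 0.0)
        for (g1, g2), c in counts.items():
            if g1 == 1 and g2 == 1:
                # E-step: split double heterozygotes between 00/11 and 01/10.
                cis = freqs["00"] * freqs["11"]
                trans = freqs["01"] * freqs["10"]
                w = cis / (cis + trans) if cis + trans > 0 else 0.5
                expected["00"] += c * w
                expected["11"] += c * w
                expected["01"] += c * (1 - w)
                expected["10"] += c * (1 - w)
            else:
                # Phase is unambiguous when at most one SNP is heterozygous.
                a = [0] * (2 - g1) + [1] * g1
                b = [0] * (2 - g2) + [1] * g2
                for x, y in zip(a, b):
                    expected[f"{x}{y}"] += c
        # M-step: renormalise expected haplotype counts.
        freqs = {h: expected[h] / n_hap for h in HAPS}
    return freqs
```

With L SNPs the number of possible haplotypes grows as 2^L, which is the calculation cost the abstract refers to and the motivation for approximating the frequency model from local groups of nearby SNPs.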



Population diversity and distinct haplotype frequencies associated with ACHE and BCHE genes of Israeli Jews from trans-caucasian Georgia and from Europe  

SciTech Connect

Variant alleles of the butyrylcholinesterase gene, BCHE, have often been used to trace the genetic histories of populations. The D70G substitution in BCHE causes prolonged postanesthesia apnea ("atypical" phenotype); H322N substitution in the closely related acetylcholinesterase gene, ACHE, is the basis of the mutually incompatible YT blood groups. In both genes, additional point mutations were reported to be linked to these phenotypically evident ones. To examine whether the intragenic linkage reported for the ACHE and BCHE mutations in Americans is universal, the authors studied frequencies of these mutations in trans-Caucasian Georgian Jews, a population that has remained relatively isolated for 1500 years. To this end they employed PCR amplification followed by DNA sequencing and enzymatic restriction and compared the frequencies found to corresponding reported phenotype data. Georgian Jews' N322 ACHE frequency was a rather low 7.0% and was totally linked to a P446 mutation, in agreement with a recent report. In BCHE, however, the G70 frequency was a relatively high 5.8%, and the V497 and T539 mutations were not found, either in Georgian or in Ashkenazi Jews, in contrast to reported findings in Americans. The findings reveal distinct displays of ACHE and BCHE haplotypes in Georgian Jews and suggest different founder effects, genetic drifts, and/or selection pressures in the evolution of each of these genes. 29 refs., 3 figs., 2 tabs.

Ehrlich, G.; Ginzberg, D.; Loewenstein, Y. [Hebrew Univ., Jerusalem (Israel)] [and others]



Mapping a new spontaneous preterm birth susceptibility gene, IGF1R, using linkage, haplotype sharing, and association analysis.  


Preterm birth is the major cause of neonatal death and serious morbidity. Most preterm births are due to spontaneous onset of labor without a known cause or effective prevention. Both maternal and fetal genomes influence the predisposition to spontaneous preterm birth (SPTB), but the susceptibility loci remain to be defined. We utilized a combination of unique population structures, family-based linkage analysis, and subsequent case-control association to identify a susceptibility haplotype for SPTB. Clinically well-characterized SPTB families from northern Finland, a subisolate founded by a relatively small founder population that has subsequently experienced a number of bottlenecks, were selected for the initial discovery sample. Genome-wide linkage analysis using a high-density single-nucleotide polymorphism (SNP) array in seven large northern Finnish non-consanguineous families identified a locus on 15q26.3 (HLOD 4.68). This region contains the IGF1R gene, which encodes the type 1 insulin-like growth factor receptor IGF-1R. Haplotype segregation analysis revealed that a 55 kb 12-SNP core segment within the IGF1R gene was shared identical-by-state (IBS) in five families. A follow-up case-control study in an independent sample representing the more general Finnish population showed an association of a 6-SNP IGF1R haplotype with SPTB in the fetuses, providing further evidence for IGF1R as an SPTB predisposition gene (frequency in cases versus controls 0.11 versus 0.05, P = 0.001, odds ratio 2.3). This study demonstrates the identification of a predisposing, low-frequency haplotype in a multifactorial trait using a well-characterized population and a combination of family and case-control designs. Our findings support the identification of the novel susceptibility gene IGF1R for predisposition by the fetal genome to being born preterm. PMID:21304894

Haataja, Ritva; Karjalainen, Minna K; Luukkonen, Aino; Teramo, Kari; Puttonen, Hilkka; Ojaniemi, Marja; Varilo, Teppo; Chaudhari, Bimal P; Plunkett, Jevon; Murray, Jeffrey C; McCarroll, Steven A; Peltonen, Leena; Muglia, Louis J; Palotie, Aarno; Hallman, Mikko



Association of MDR1 Gene SNPs and Haplotypes with the Tacrolimus Dose Requirements in Han Chinese Liver Transplant Recipients  

PubMed Central

Background: This work seeks to evaluate the association between the C/D ratios (plasma concentration of tacrolimus divided by daily dose of tacrolimus per body weight) of tacrolimus and the haplotypes of the MDR1 gene combined by C1236T (rs1128503), G2677A/T (rs2032582) and C3435T (rs1045642), and to further determine the functional significance of haplotypes in the clinical pharmacokinetics of oral tacrolimus in Han Chinese liver transplant recipients. Methodology/Principal Findings: The tacrolimus blood concentrations were continuously recorded for one month after initial administration, and the peripheral blood DNA from a total of 62 liver transplant recipients was extracted. Genotyping of C1236T, G2677A/T and C3435T was performed, and SNP frequency, Hardy-Weinberg equilibrium, linkage disequilibrium, haplotype analysis and multiple testing were carried out with the software PLINK. C/D ratios of different SNP groups or haplotype groups were compared, with a p value < 0.05 considered statistically significant. Linkage studies revealed that C1236T, G2677A/T and C3435T are genetically associated with each other. Patients carrying the T-T haplotype combined by C1236T and G2677A/T, and an additional T/T homozygote at either position, would require a higher dose of tacrolimus. Tacrolimus C/D ratios of liver transplant recipients varied significantly among different haplotype groups of the MDR1 gene. Conclusions: Our studies suggest that the genetic polymorphism could be used as a valuable molecular marker for the prediction of tacrolimus C/D ratios of liver transplant recipients.

Yu, Xiaobo; Xie, Haiyang; Wei, Bajin; Zhang, Min; Wang, Weilin; Wu, Jian; Yan, Sheng; Zheng, Shusen; Zhou, Lin
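
The Hardy-Weinberg equilibrium check mentioned in this abstract was run in PLINK. As a rough illustration of what such a check computes, here is a minimal chi-square sketch for one biallelic SNP. The genotype counts are hypothetical (chosen only to total 62, the cohort size), the function name is an assumption, and PLINK's own test may differ (for example by using an exact test).

```python
def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square statistic (1 degree of freedom) for deviation from Hardy-Weinberg proportions."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_aa, n_ab, n_bb)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)


# Hypothetical genotype counts for 62 recipients (not data from the study).
chi2 = hwe_chi_square(20, 30, 12)
# With 1 degree of freedom, values below about 3.84 are not significant at the 0.05 level.
print(chi2 < 3.84)
```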



MDM2 promoter SNP285 and SNP309; phylogeny and impact on cancer risk  

PubMed Central

MDM2 plays a key role in physiological processes such as growth arrest, senescence and apoptosis. It binds to and inhibits key proteins like p53 and the RB protein, and MDM2 amplification as well as protein overexpression without amplification is seen in many solid tumors. An MDM2 promoter polymorphism (SNP309T>G) has been found associated with enhanced Sp1 transcription factor binding and elevated MDM2 transcription. While 309G has been found associated with elevated cancer risk and young age at diagnosis of different cancers, results in Caucasians have been at variance. Recently, we reported a second polymorphism (SNP285G>C) located on the 309G allele. The 285C/309G haplotype accounts for about 12% of all 309G alleles among Norwegian, Dutch and British inhabitants. Assessing Sp1 binding to the MDM2 promoter using surface plasmon resonance technology, we found SNP309G to enhance Sp1 binding by 22% while SNP285C reduced Sp1 binding by 51%. SNP285C reduced the risk of breast cancer and ovarian cancer among 309TG/309GG carriers by 21 and 26%, respectively, but in particular the risk of ovarian cancer among 309TG heterozygotes (reduction by 37%). The fact that the 285C/309G haplotype accounted for only 1.9% of all 309G alleles among Finns and was absent in Chinese indicates that 285C is a young polymorphism.

Knappskog, Stian; Lønning, Per E.



Comparison of linkage disequilibrium and haplotype diversity on macro- and microchromosomes in chicken  

PubMed Central

Background The chicken (Gallus gallus), like most avian species, has a very distinct karyotype consisting of many micro- and a few macrochromosomes. While it is known that recombination frequencies are much higher for micro- as compared to macrochromosomes, there is limited information on differences in linkage disequilibrium (LD) and haplotype diversity between these two classes of chromosomes. In this study, LD and haplotype diversity were systematically characterized in 371 birds from eight chicken populations (commercial lines, fancy breeds, and red jungle fowl) across macro- and microchromosomes. To this end we sampled four regions of ~1 cM each on macrochromosomes (GGA1 and GGA2), and four 1.5 -2 cM regions on microchromosomes (GGA26 and GGA27) at a high density of 1 SNP every 2 kb (total of 889 SNPs). Results At a similar physical distance, LD, haplotype homozygosity, haploblock structure, and haplotype sharing were all lower for the micro- as compared to the macrochromosomes. These differences were consistent across populations. Heterozygosity, genetic differentiation, and derived allele frequencies were also higher for the microchromosomes. Differences in LD, haplotype variation, and haplotype sharing between populations were largely in line with known demographic history of the commercial chicken. Despite very low levels of LD, as measured by r2 for most populations, some haploblock structure was observed, particularly in the macrochromosomes, but the haploblock sizes were typically less than 10 kb. Conclusion Differences in LD between micro- and macrochromosomes were almost completely explained by differences in recombination rate. Differences in haplotype diversity and haplotype sharing between micro- and macrochromosomes were explained by differences in recombination rate and genotype variation. Haploblock structure was consistent with demography of the chicken populations, and differences in recombination rates between micro- and macrochromosomes. 
The limited haploblock structure and LD suggests that future whole-genome marker assays will need 100+K SNPs to exploit haplotype information. Interpretation and transferability of genetic parameters will need to take into account the size of chromosomes in chicken, and, since most birds have microchromosomes, in other avian species as well.
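
For reference, the pairwise LD measure r2 used in this abstract is conventionally computed from one haplotype frequency and the two allele frequencies. The sketch below shows that textbook definition only; the function and argument names are illustrative, and the study's own pipeline is not described here.

```python
def ld_r2(p_ab, p_a, p_b):
    """Squared correlation r^2 between two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A at locus 1 and allele B at locus 2
    p_a, p_b: frequencies of allele A and allele B
    """
    d = p_ab - p_a * p_b  # disequilibrium coefficient D
    denom = p_a * (1.0 - p_a) * p_b * (1.0 - p_b)
    if denom == 0:
        raise ValueError("both loci must be polymorphic")
    return d * d / denom


# Complete LD: allele A always occurs together with allele B.
print(ld_r2(0.3, 0.3, 0.3))
# Linkage equilibrium: haplotype frequency equals the product of allele frequencies.
print(ld_r2(0.06, 0.2, 0.3))
```

A higher recombination rate erodes D faster between markers at the same physical distance, which is consistent with the lower LD reported for microchromosomes.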



Improved branch and bound algorithm for detecting SNP-SNP interactions in breast cancer  

PubMed Central

Background Single nucleotide polymorphisms (SNPs) in genes derived from distinct pathways are associated with a breast cancer risk. Identifying possible SNP-SNP interactions in genome-wide case–control studies is an important task when investigating genetic factors that influence common complex traits; the effects of SNP-SNP interaction need to be characterized. Furthermore, observations of the complex interplay (interactions) between SNPs for high-dimensional combinations are still computationally and methodologically challenging. An improved branch and bound algorithm with feature selection (IBBFS) is introduced to identify SNP combinations with a maximal difference of allele frequencies between the case and control groups in breast cancer, i.e., the high/low risk combinations of SNPs. Results A total of 220 real case and 334 real control breast cancer data are used to test IBBFS and identify significant SNP combinations. We used the odds ratio (OR) as a quantitative measure to estimate the associated cancer risk of multiple SNP combinations to identify the complex biological relationships underlying the progression of breast cancer, i.e., the most likely SNP combinations. Experimental results show the estimated odds ratio of the best SNP combination with genotypes is significantly smaller than 1 (between 0.165 and 0.657) for specific SNP combinations of the tested SNPs in the low risk groups. In the high risk groups, predicted SNP combinations with genotypes are significantly greater than 1 (between 2.384 and 6.167) for specific SNP combinations of the tested SNPs. Conclusions This study proposes an effective high-speed method to analyze SNP-SNP interactions in breast cancer association studies. A number of important SNPs are found to be significant for the high/low risk group. They can thus be considered a potential predictor for breast cancer association.



A SNP in the ACT gene associated with astrocytosis and rapid cognitive decline in AD  

Microsoft Academic Search

There is biochemical and animal model evidence supporting a pathological role of the ACT gene in AD. However, direct genetic evidence remains controversial and has been mostly limited to individual single nucleotide polymorphism (SNP) analysis. To resolve this apparent conflict we have used a high-density ACT SNP map, constructed haplotypes and explored correlations with phenotype. SNPs were identified by sequencing

O. Belbin; J. L. Dunn; S. Chappell; A. E. Ritchie; Y. Ling; L. Morgan; A. Pritchard; D. R. Warden; C. L. Lendon; D. J. Lehmann; D. M. A. Mann; A. D. Smith; N. Kalsheker; K. Morgan



HLA-A, HLA-B, HLA-DRB1 allele and haplotype frequencies in 6384 umbilical cord blood units and transplantation matching and engraftment statistics in the Zhejiang cord blood bank of China.  


Umbilical cord blood (UCB) is a widely accepted source of progenitor cells, and many cord blood banks have now been established. Here, we analysed the HLA-A, HLA-B and HLA-DRB1 allele and haplotype frequencies, HLA matching possibilities for searching potential donors and outcome of UCB transplantations in the Zhejiang cord blood bank of China. A total of 6384 UCB units were characterized for 17 HLA-A, 30 HLA-B and 13 HLA-DRB1 alleles at the first field resolution level. Additionally, B*14, B*15 and B*40 were typed to the second field level. A total of 1372 distinct A-B-DRB1 haplotypes were identified. The frequencies of 7 haplotypes were more than 1%, and 439 haplotypes were <0.01%. A*02-B*46-DRB1*09, A*33-B*58-DRB1*03 and A*30-B*13-DRB1*07 were the most common haplotypes, with frequencies of 4.4%, 3.3%, and 2.9%, respectively. Linkage disequilibrium (LD) analysis showed that there were 83 A-B, 106 B-DRB1, 54 A-DRB1 haplotypes with positive LD, in which 51 A-B, 60 B-DRB1, 32 A-DRB1 haplotypes exhibited a significant LD (P < 0.05). In 682 search requests, 12.9%, 40.0% and 42.7% of patients were found to have 6 of 6, 5 of 6 and 4 of 6 HLA-A, HLA-B and HLA-DRB1 matching donors, respectively. A total of 30 UCB units were transplanted to 24 patients (3 patients not evaluated due to early death); 14 of 21 patients (66.7%) engrafted. This study reveals the HLA distribution and its transplantation application in the cord blood bank of Zhejiang province. These data can help to select potential UCB donors for transplantation and can be used to assess the scale of new cord blood banking endeavours. PMID:23731569

Wang, F; He, J; Chen, S; Qin, F; Dai, B; Zhang, W; Zhu, Fm; Lv, Hj



A candidate CpG SNP approach identifies a breast cancer associated ESR1-SNP.  


Altered DNA methylation is often seen in malignant cells, potentially contributing to carcinogenesis by suppressing gene expression. We hypothesized that heritable methylation potential might be a risk factor for breast cancer and evaluated possible association with breast cancer for single nucleotide polymorphisms (SNPs) either involving CpG sequences in extended 5'-regulatory regions of candidate genes (ESR1, ESR2, PGR, and SHBG) or CpG and missense coding SNPs in genes involved in methylation (MBD1, MECP2, DNMT1, MGMT, MTHFR, MTR, MTRR, MTHFD1, MTHFD2, BHMT, DCTD, and SLC19A1). Genome-wide searches for genetic risk factors for breast cancers have in general not investigated these SNPs, because of low minor allele frequency or weak haplotype associations. Genotyping was performed using Mass spectrometry-Maldi-Tof in a screening panel of 538 cases and 1,067 controls. Potential association to breast cancer was identified for 15 SNPs and one of these SNPs (rs7766585 in ESR1) was found to associate strongly with breast cancer, OR 1.30 (95% CI 1.17-1.45; p-value 2.1 × 10^-6), when tested in a verification panel consisting of 3,211 unique breast cancer cases and 4,223 unique controls from five European biobank cohorts. In conclusion, a candidate gene search strategy focusing on methylation-related SNPs did identify a SNP that associated with breast cancer at high significance. PMID:21105050

Harlid, Sophia; Ivarsson, Malin I L; Butt, Salma; Hussain, Shehnaz; Grzybowska, Ewa; Eyfjörd, Jorunn Erla; Lenner, Per; Försti, Asta; Hemminki, Kari; Manjer, Jonas; Dillner, Joakim; Carlson, Joyce



The Role of Haplotypes in Candidate Gene Studies  

Microsoft Academic Search

Human geneticists working on systems for which it is possible to make a strong case for a set of candidate genes face the problem of whether it is necessary to consider the variation in those genes as phased haplotypes, or whether the one-SNP-at-a-time approach might perform as well. There are three reasons why the phased haplotype route should be

Andrew G. Clark


Genomic breeding value estimation using genetic markers, inferred ancestral haplotypes, and the genomic relationship matrix.  


With the introduction of new single nucleotide polymorphism (SNP) chips of various densities, more and more genotype data sets will include animals genotyped for only a subset of the SNP. Imputation techniques based on unobserved ancestral haplotypes may be used to infer missing genotypes. These ancestral haplotypes may also be used in the genomic prediction model, instead of using the SNP. This may increase the reliability of predictions because the ancestral haplotype may capture more linkage disequilibrium with quantitative trait loci than SNP. The aim of this paper was to study whether using unobserved ancestral haplotypes in a genomic prediction model would provide more reliable genomic predictions than using SNP, and to determine how many loci in the genomic prediction model would be redundant. Genotypes of 8,960 bulls and cows for 39,557 SNP were analyzed with a hidden Markov model to associate each individual at each locus to 2 ancestral haplotypes. The number of ancestral haplotypes per locus was fixed at 10, 15, or 20. Subsequently, a validation study was performed in which the phenotypes of 3,251 progeny-tested bulls for 16 traits were used in a genomic prediction model to predict the estimated breeding values of at least 753 validation bulls. The squared correlation between genomic prediction and deregressed daughter performance estimated breeding value, when averaged across traits, was slightly higher when 15 or 20 ancestral haplotypes per locus were used in the prediction model instead of the SNP genotypes, whereas the prediction model using a genomic relationship matrix gave the lowest squared correlations. The number of redundant loci [i.e., loci that had less than 18 jumps (0.1%) from one ancestral haplotype to another ancestral haplotype at the next locus], was 18,793 (48%), which means that only 20,764 loci would need to be included in the genomic prediction model. 
This provides opportunities for greatly decreasing computer requirements of genomic evaluations with very large numbers of markers. PMID:21854945

de Roos, A P W; Schrooten, C; Druet, T
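
The redundancy criterion in this abstract (fewer than 18 jumps, or 0.1%, between ancestral haplotypes from one locus to the next) can be sketched as a simple count over haplotype state paths. The reading that 0.1% refers to the 2 × 8,960 = 17,920 haplotypes is an assumption, as are the function name, the data layout, and the toy data below; this is not the authors' implementation.

```python
def redundant_loci(states, max_jump_fraction=0.001):
    """Return indices of loci with too few ancestral-haplotype jumps to the next locus.

    states: one list per haplotype, holding the ancestral-haplotype label at each locus.
    A locus is flagged when fewer than max_jump_fraction * n_haplotypes paths
    switch label between that locus and the following one.
    """
    n_hap = len(states)
    n_loci = len(states[0])
    threshold = max_jump_fraction * n_hap
    flagged = []
    for locus in range(n_loci - 1):
        jumps = sum(1 for path in states if path[locus] != path[locus + 1])
        if jumps < threshold:
            flagged.append(locus)
    return flagged


# Toy example: 4 haplotype paths over 4 loci, with a deliberately loose 50% cutoff.
toy_states = [
    [0, 0, 1, 1],
    [2, 2, 2, 3],
    [1, 1, 0, 0],
    [3, 3, 3, 3],
]
print(redundant_loci(toy_states, max_jump_fraction=0.5))  # [0, 2]
```

Under the stated assumption, the default cutoff for 17,920 haplotypes is 17.92 jumps, which matches the "less than 18 jumps" wording.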



Association of CAPN10 SNPs and Haplotypes with Polycystic Ovary Syndrome among South Indian Women  

PubMed Central

Polycystic Ovary Syndrome (PCOS) is known to be characterized by metabolic disorder in which hyperinsulinemia and peripheral insulin resistance are central features. Given the physiological overlap between PCOS and type-2 diabetes (T2DM), and calpain 10 gene (CAPN10) being a strong candidate for T2DM, a number of studies have analyzed CAPN10 SNPs among PCOS women yielding contradictory results. Our study is the first of its kind to investigate the association pattern of CAPN10 polymorphisms (UCSNP-44, 43, 56, 19 and 63) with PCOS among Indian women. 250 PCOS cases and 299 controls from Southern India were recruited for this study. Allele and genotype frequencies of the SNPs were determined and compared between the cases and controls. Results show significant association of UCSNP-44 genotype CC with PCOS (p = 0.007) with highly significant odds ratio when compared to TC (OR = 2.51, p = 0.003, 95% CI = 1.37–4.61) as well as TT (OR = 1.94, p = 0.016, 95% CI = 1.13–3.34). While the haplotype carrying the SNP-44 and SNP-19 variants (21121) exhibited a 2-fold increase in the risk for PCOS (OR = 2.37, p = 0.03), the haplotype containing SNP-56 and SNP-19 variants (11221) seems to have a protective role against PCOS (OR = 0.20, p = 0.004). Our results support the earlier evidence for a possible role of UCSNP-44 of the CAPN10 gene in the manifestation of PCOS.

Dasgupta, Shilpi; Sirisha, Pisapati V. S.; Neelaveni, Kudugunti; Anuradha, Katragadda; Reddy, B. Mohan
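
Odds ratios with 95% confidence intervals, as reported in this abstract, are commonly derived from a 2×2 table of genotype counts. The sketch below uses the standard Woolf (log-based) interval; the counts are hypothetical because the abstract does not give them, and the study may have used a different interval method.

```python
from math import exp, log, sqrt


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Woolf confidence interval from a 2x2 table.

    a: cases with the genotype      b: cases without it
    c: controls with the genotype   d: controls without it
    """
    or_est = (a * d) / (b * c)
    se_log_or = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = exp(log(or_est) - z * se_log_or)
    upper = exp(log(or_est) + z * se_log_or)
    return or_est, lower, upper


# Hypothetical counts, not taken from the study.
or_est, lower, upper = odds_ratio_ci(20, 80, 10, 90)
print(or_est, lower, upper)
```

An interval that excludes 1, as for the UCSNP-44 comparisons above, corresponds to a significant association at the 0.05 level.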



WinHAP: An Efficient Haplotype Phasing Algorithm Based on Scalable Sliding Windows  

PubMed Central

Haplotype phasing represents an essential step in studying the association of genomic polymorphisms with complex genetic diseases, and in determining targets for drug design. In recent years, huge amounts of genotype data have been produced by rapidly evolving high-throughput sequencing technologies, and the data volume challenges the community to develop more efficient haplotype phasing algorithms, in terms of both running time and overall accuracy. 2SNP is one of the fastest haplotype phasing algorithms, with error rates comparably low to those of the other algorithms. The most time-consuming step of 2SNP is the construction of a maximum spanning tree (MST) among all the heterozygous SNP pairs. We simplified this step by replacing the MST with the initial haplotypes of adjacent heterozygous SNP pairs. The multi-SNP haplotypes were estimated within a sliding window along the chromosomes. The comparative studies on four different-scale genotype datasets suggest that our algorithm WinHAP outperforms 2SNP and most of the other haplotype phasing algorithms in terms of both running speed and overall accuracy. To facilitate WinHAP's application to more practical biological datasets, we released the software for free at:

Xu, Yun; Cheng, Wenhua; Nie, Pengyu; Zhou, Fengfeng



Nonparametric disequilibrium mapping of functional sites using haplotypes of multiple tightly linked single-nucleotide polymorphism markers.  

PubMed Central

As the speed and efficiency of genotyping single-nucleotide polymorphisms (SNPs) increase, using the SNP map, it becomes possible to evaluate the extent to which a common haplotype contributes to the risk of disease. In this study we propose a new procedure for mapping functional sites or regions of a candidate gene of interest using multiple linked SNPs. Based on a case-parent trio family design, we use expectation-maximization (EM) algorithm-derived haplotype frequency estimates of multiple tightly linked SNPs from both unambiguous and ambiguous families to construct a contingency statistic S for linkage disequilibrium (LD) analysis. In the procedure, a moving-window scan for functional SNP sites or regions can cover an unlimited number of loci except for the limitation of computer storage. Within a window, all possible widths of haplotypes are utilized to find the maximum statistic S* for each site (or locus). Furthermore, this method can be applied to regional or genome-wide scanning for determining linkage disequilibrium using SNPs. The sensitivity of the proposed procedure was examined on the simulated data set from the Genetic Analysis Workshop (GAW) 12. Compared with the conventional and generalized TDT methods, our procedure is more flexible and powerful.

Cheng, Rong; Ma, Jennie Z; Wright, Fred A; Lin, Shili; Gao, Xin; Wang, Daolong; Elston, Robert C; Li, Ming D
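
The EM-derived haplotype frequency estimates mentioned in this abstract can be illustrated with the simplest case: two biallelic SNPs typed in unrelated individuals, where only the double heterozygote has ambiguous phase. This is a simplified sketch under that assumption. The study itself works with case-parent trios and more than two SNPs, and the function name and toy data here are hypothetical.

```python
def em_two_snp(genotypes, n_iter=100):
    """Estimate frequencies of haplotypes 00, 01, 10, 11 from unphased genotypes.

    Each genotype is (g1, g2): the count (0, 1 or 2) of allele 1 at each SNP.
    """
    freqs = {"00": 0.25, "01": 0.25, "10": 0.25, "11": 0.25}
    n_chrom = 2 * len(genotypes)
    for _ in range(n_iter):
        counts = dict.fromkeys(freqs, 0.0)
        for g1, g2 in genotypes:
            if g1 == 1 and g2 == 1:
                # E-step: split the double heterozygote between its two possible phasings.
                cis = freqs["00"] * freqs["11"]
                trans = freqs["01"] * freqs["10"]
                total = cis + trans
                w = cis / total if total > 0 else 0.5
                counts["00"] += w
                counts["11"] += w
                counts["01"] += 1.0 - w
                counts["10"] += 1.0 - w
            else:
                # At least one SNP is homozygous, so the two haplotypes are unambiguous.
                a1 = (int(g1 >= 1), int(g1 == 2))
                a2 = (int(g2 >= 1), int(g2 == 2))
                for x, y in zip(a1, a2):
                    counts[f"{x}{y}"] += 1.0
        # M-step: update frequencies from the expected haplotype counts.
        freqs = {h: c / n_chrom for h, c in counts.items()}
    return freqs


# Toy data: two unambiguous individuals and one double heterozygote.
toy_genotypes = [(0, 0), (2, 2), (1, 1)]
estimates = em_two_snp(toy_genotypes)
print(estimates)
```

In the toy data the unambiguous individuals carry only 00 and 11 haplotypes, so the iterations push the double heterozygote toward the 00/11 phasing.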



Haplotype structure and association to Crohn's disease of CARD15 mutations in two ethnically divergent populations  

Microsoft Academic Search

Current debate focuses on the relevance of linkage disequilibrium (LD), ethnicity and underlying haplotype structure to the search for genes involved in complex disorders. The recently described association between single nucleotide polymorphisms (SNPs) of the CARD15 (NOD2) gene and Crohn's disease (CD) in populations of north-European descent provides a test case that we have subjected to detailed SNP haplotype based

Peter J P Croucher; Silvia Mascheretti; Jochen Hampe; Klaus Huse; Henning Frenzel; Monika Stoll; Tim Lu; Susanna Nikolaus; Suk-Kyun Yang; Michael Krawczak; Won Ho Kim; Stefan Schreiber



No association between polymorphisms/haplotypes of the vascular endothelial growth factor gene and preeclampsia  

PubMed Central

Background Preeclampsia (PE) is the leading worldwide cause of death in pregnant women, intra-uterine growth retardation, and fetal prematurity. Some vascular endothelial growth factor gene (VEGF) polymorphisms have been associated with PE and other pregnancy disturbances. We evaluated the associations between VEGF genotypes/haplotypes and PE in Mexican women. Methods 164 pregnant women were enrolled in a case-control study (78 cases and 86 normotensive pregnant controls). The rs699947 (-2578C/A), rs1570360 (-1154G/A), rs2010963 (+405G/C), and rs25648 (-7C/T) VEGF variants were discriminated using Polymerase Chain Reaction - Restriction Fragment Length Polymorphism (PCR-RFLP) methods or Taqman single nucleotide polymorphism (SNP) assays. Results The proportions of the minor allele for rs699947, rs1570360, rs2010963, and rs25648 VEGF SNPs were 0.33, 0.2, 0.39, and 0.17 in controls, and 0.39, 0.23, 0.41, and 0.15 in cases, respectively (P values > 0.05). The most frequent haplotypes of rs699947, rs1570360, rs2010963, and rs25648 VEGF SNPs were C-G-C-C and C-G-G-C, with frequencies of 0.39 and 0.21 in cases and 0.37 and 0.25 in controls, respectively (P values > 0.05). Conclusion There was no evidence of an association between VEGF allele, genotype, or haplotype frequencies and PE in our study.



HLA class-I and class-II allele frequencies and two-locus haplotypes in Melanesians of Vanuatu and New Caledonia.  


HLA class-I and class-II allele frequencies and two-locus haplotypes were examined in 367 unrelated Melanesians living on the islands of Vanuatu and New Caledonia. Diversity at all HLA class-I and class-II loci was relatively limited. In class-I loci, three HLA-A allelic groups (HLA-A*24, HLA-A*34 and HLA-A*11), seven HLA-B alleles or allelic groups (HLA-B*1506, HLA-B*5602, HLA-B*13, HLA-B*5601, HLA-B*4001, HLA-B*4002 and HLA-B*2704) and four HLA-C alleles or allelic groups (HLA-Cw*04, HLA-Cw*01, HLA-Cw*0702 and HLA-Cw*15) constituted more than 90% of the alleles observed. In the class-II loci, four HLA-DRB1 alleles (HLA-DRB1*15, HLA-DRB1*11, HLA-DRB1*04 and HLA-DRB1*16), three HLA-DRB3-5 alleles (HLA-DRB3*02, HLA-DRB4*01 and HLA-DRB5*01/02) and five HLA-DQB1 alleles (HLA-DQB1*0301, HLA-DQB1*04, HLA-DQB1*05, HLA-DQB1*0601 and HLA-DQB1*0602) constituted over 93, 97 and 98% of the alleles observed, respectively. Homozygosity showed significant departures from expected levels for neutrality based on allele frequency (i.e. excess diversity) at the HLA-B, HLA-Cw, HLA-DQB1 and HLA-DRB3/5 loci on some islands. The locus with the strongest departure from neutrality was HLA-DQB1, homozygosity being significantly lower than expected on all islands except New Caledonia. No consistent pattern was demonstrated for any HLA locus in relation to malaria endemicity. PMID:15546341

Maitland, K; Bunce, M; Harding, R M; Barnardo, M C N M; Clegg, J B; Welsh, K; Bowden, D K; Williams, T N



High SNP density in the blacklegged tick, Ixodes scapularis, the principal vector of Lyme disease spirochetes.  


Single-nucleotide polymorphisms (SNPs) are the most widespread type of sequence variation in genomes. SNP density and distribution vary among different organisms and genes. Here, we report the first estimates of SNP distribution and density in the genome of the blacklegged tick (Ixodes scapularis), an important vector of the pathogens causing Lyme disease, human granulocytic anaplasmosis and human babesiosis in North America. We sampled 10 individuals from each of 4 collections from New Jersey, Virginia, Georgia, and Mississippi and analyzed the sequences of 9 nuclear genes and the mitochondrial 16S gene. SNPs are extremely abundant (one SNP per every 14 bases). This is the second highest density so far reported in any eukaryotic organism. Population genetic analyses based either on haplotype frequencies or the 372 SNPs in these 9 genes showed that the 40 ticks formed 3 genetic groups. In agreement with earlier population genetic studies, northern ticks from New Jersey and Virginia formed a homogeneous group with low genetic diversity, whereas southern ticks from Georgia and Mississippi consisted of 2 separate groups, each with high genetic diversity. PMID:23219364

Van Zee, Janice; Black, William C; Levin, Michael; Goddard, Jerome; Smith, Joshua; Piesman, Joseph



Testing Haplotype-Environment Interactions Using Case-Parent Triads  

Microsoft Academic Search

Objective: Joint analysis of multiple SNP markers can be informative, but studying joint effects of haplotypes and environmental exposures is challenging. Population structure can involve both genes and exposures and a case-control study is susceptible to bias from either source of stratification. We propose a procedure that uses case-parent triad data and, though not fully robust, resists bias from population

Min Shi; David M. Umbach; Clarice R. Weinberg



Whole-genome molecular haplotyping of single cells  

Microsoft Academic Search

Conventional experimental methods of studying the human genome are limited by the inability to independently study the combination of alleles, or haplotype, on each of the homologous copies of the chromosomes. We developed a microfluidic device capable of separating and amplifying homologous copies of each chromosome from a single human metaphase cell. Single-nucleotide polymorphism (SNP) array analysis of amplified DNA

H Christina Fan; Jianbin Wang; Anastasia Potanina; Stephen R Quake



Model, properties and imputation method of missing SNP genotype data utilizing mutual information  

NASA Astrophysics Data System (ADS)

Mutual information can be used as a measure for the association of a genetic marker or a combination of markers with the phenotype. In this paper, we study the imputation of missing genotype data. We first utilize joint mutual information to compute the dependence between SNP sites, then construct a mathematical model in order to find the two SNP sites having maximal dependence with missing SNP sites, and further study the properties of this model. Finally, an extension method to haplotype-based imputation is proposed to impute the missing values in genotype data. To verify our method, extensive experiments have been performed, and numerical results show that our method is superior to haplotype-based imputation methods. At the same time, numerical results also prove joint mutual information can better measure the dependence between SNP sites. According to experimental results, we also conclude that the dependence between the adjacent SNP sites is not necessarily strongest.

Wang, Ying; Wan, Weiming; Wang, Rui-Sheng; Feng, Enmin
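
As background for the dependence measure discussed in this abstract, the sketch below computes plain pairwise mutual information between two SNP sites from genotype vectors. The paper's model uses joint mutual information over pairs of sites, which this does not reproduce; the function name and toy genotypes are assumptions for illustration.

```python
from collections import Counter
from math import log2


def mutual_information(x, y):
    """Mutual information (in bits) between two equal-length genotype vectors."""
    n = len(x)
    count_x = Counter(x)
    count_y = Counter(y)
    count_xy = Counter(zip(x, y))
    mi = 0.0
    for (a, b), c in count_xy.items():
        p_ab = c / n
        p_a = count_x[a] / n
        p_b = count_y[b] / n
        mi += p_ab * log2(p_ab / (p_a * p_b))
    return mi


# Genotypes coded as 0, 1, 2 (copies of the minor allele); toy data only.
site_a = [0, 1, 2, 0, 1, 2]
site_b = [0, 1, 2, 0, 1, 2]  # identical to site_a: maximal dependence
site_c = [0, 0, 1, 1]
site_d = [0, 1, 0, 1]        # independent of site_c
print(mutual_information(site_a, site_b))
print(mutual_information(site_c, site_d))
```

Ranking candidate sites by such a score, rather than by physical adjacency, is in line with the abstract's observation that neighbouring SNP sites are not necessarily the most strongly dependent.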



Combinatorial Problems Arising in SNP and Haplotype Analysis  

Microsoft Academic Search

It is widely anticipated that the study of variation in the human genome will provide a means of predicting risk of a variety of complex diseases. This paper presents a number of algorithmic and combinatorial problems that arise when studying a very common form of genomic variation, single nucleotide polymorphisms (SNPs). We review recent results and present challenging open problems.

Bjarni V. Halldórsson; Vineet Bafna; Nathan Edwards; Ross Lippert; Shibu Yooseph; Sorin Istrail



Imputation of microsatellite alleles from dense SNP genotypes for parentage verification across multiple Bos taurus and Bos indicus breeds.  


To help cattle producers transition from microsatellite (MS) to single nucleotide polymorphism (SNP) genotyping for parental verification we previously devised an effective and inexpensive method to impute MS alleles from SNP haplotypes. While the reported method was verified with only a limited data set (N = 479) from Brown Swiss, Guernsey, Holstein, and Jersey cattle, some of the MS-SNP haplotype associations were concordant across these phylogenetically diverse breeds. This implied that some haplotypes predate modern breed formation and remain in strong linkage disequilibrium. To expand the utility of MS allele imputation across breeds, MS and SNP data from more than 8000 animals representing 39 breeds (Bos taurus and B. indicus) were used to predict 9410 SNP haplotypes, incorporating an average of 73 SNPs per haplotype, for which alleles from 12 MS markers could be accurately imputed. Approximately 25% of the MS-SNP haplotypes were present in multiple breeds (N = 2 to 36 breeds). These shared haplotypes allowed for MS imputation in breeds that were not represented in the reference population with only a small increase in Mendelian inheritance inconsistencies. Our reported reference haplotypes can be used for any cattle breed and the reported methods can be applied to any species to aid the transition from MS to SNP genetic markers. While ~91% of the animals with imputed alleles for 12 MS markers had ≤1 Mendelian inheritance conflicts with their parents' reported MS genotypes, this figure was 96% for our reference animals, indicating potential errors in the reported MS genotypes. The workflow we suggest autocorrects for genotyping errors and rare haplotypes, by MS genotyping animals whose imputed MS alleles fail parentage verification, and then incorporating those animals into the reference dataset. PMID:24065982

McClure, Matthew C; Sonstegard, Tad S; Wiggans, George R; Van Eenennaam, Alison L; Weber, Kristina L; Penedo, Cecilia T; Berry, Donagh P; Flynn, John; Garcia, Jose F; Carmo, Adriana S; Regitano, Luciana C A; Albuquerque, Milla; Silva, Marcos V G B; Machado, Marco A; Coffey, Mike; Moore, Kirsty; Boscher, Marie-Yvonne; Genestout, Lucie; Mazza, Raffaele; Taylor, Jeremy F; Schnabel, Robert D; Simpson, Barry; Marques, Elisa; McEwan, John C; Cromie, Andrew; Coutinho, Luiz L; Kuehn, Larry A; Keele, John W; Piper, Emily K; Cook, Jim; Williams, Robert; Van Tassell, Curtis P
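
The parentage check described in this abstract rests on counting Mendelian inheritance conflicts across the MS markers. A minimal sketch of such a count follows, assuming each genotype is a pair of allele sizes and both parents are typed; the function names, allele values, and the pass rule of at most one conflict are illustrative assumptions, not the authors' workflow.

```python
def is_compatible(child, sire, dam):
    """True if the child's two alleles can be explained by one allele from each parent."""
    a, b = child
    return (a in sire and b in dam) or (b in sire and a in dam)


def count_conflicts(child_gts, sire_gts, dam_gts):
    """Number of markers at which the trio violates Mendelian inheritance."""
    return sum(
        not is_compatible(c, s, d)
        for c, s, d in zip(child_gts, sire_gts, dam_gts)
    )


# Toy trio over three MS markers (allele sizes are hypothetical).
child = [(120, 124), (98, 98), (150, 156)]
sire = [(120, 122), (98, 100), (152, 154)]
dam = [(124, 126), (98, 102), (150, 158)]

n_conflicts = count_conflicts(child, sire, dam)
print(n_conflicts, n_conflicts <= 1)
```

In the toy trio only the third marker conflicts, since neither sire allele matches the child's 156 or 150.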





Haplotype-based search for SNPs associated with differential type 1 diabetes risk among chromosomes carrying a specific HLA DRB1-DQA1-DQB1 haplotype  

PubMed Central

SUMMARY Aim: To test chromosomes carrying the same DRB1-DQA1-DQB1 haplotype for SNPs in the major histocompatibility complex (MHC) that might mark subgroups of the haplotype with different risks for type 1 diabetes (T1D). Methods: Chromosomes from T1D children, their parents, and non-diabetic siblings in families of the Type 1 Diabetes Genetics Consortium (T1DGC) were analyzed by two haplotype-based methods: (1) logistic regression analysis restricted to phased chromosomes carrying the same DRB1-DQA1-DQB1 haplotype but differentiated by the two alleles at MHC SNPs which were individually tested for association with T1D; (2) homozygous parent TDT (hpTDT) testing for biased transmission of a SNP allele to diabetic children from parents who are heterozygous at the SNP but homozygous for the specific DRB1-DQA1-DQB1 haplotype being evaluated. Results: A number of SNPs gave nominally significant (p<0.05) evidence of marking two subsets of the 301-501-201 haplotype that might differ with respect to their diabetogenic potency. However, none of the SNPs achieved experiment-wide significance and hence may be false-positive associations. Conclusions: We discuss limitations and possible deficiencies of our study suggesting further work which might yield more robust SNP associations marking two subgroups of a DRB1-DQA1-DQB1 haplotype with different T1D risks.

McGinnis, R; McLaren, W; Ranganath, V; Whittaker, P; Hunt, S; Deloukas, P
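
The abstract above does not spell out the test statistic used for the hpTDT. As a minimal sketch, assuming it follows the classical transmission disequilibrium test (a McNemar-type chi-square on transmissions from heterozygous parents), the calculation could look like the following. The function name and the counts in the usage line are hypothetical and are not taken from the study.

```python
import math


def tdt_statistic(transmitted: int, untransmitted: int) -> tuple[float, float]:
    """Classical TDT chi-square (1 df) and its p-value.

    transmitted   -- times heterozygous parents passed the test allele
                     to an affected child
    untransmitted -- times they passed the other allele instead
    """
    total = transmitted + untransmitted
    if total == 0:
        raise ValueError("no informative transmissions")
    chi2 = (transmitted - untransmitted) ** 2 / total
    # Survival function of a chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts, for illustration only
chi2, p_value = tdt_statistic(60, 40)
```

In an hpTDT-style analysis the counts would presumably be restricted to parents who are heterozygous at the SNP and homozygous for the DRB1-DQA1-DQB1 haplotype of interest; that filtering step is assumed to happen before the counts are passed in.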



Temporal stability and spatial divergence of mitochondrial DNA haplotype frequencies in red drum (Sciaenops ocellatus) from coastal regions of the western Atlantic Ocean and Gulf of Mexico  

Microsoft Academic Search

Restriction-site variation in mitochondrial (mt) DNA was assayed among 1675 red drum (Sciaenops ocellatus Linnaeus) sampled from 20 localities along the southeastern coast of the USA (western Atlantic) and the Gulf of Mexico (Gulf). Up to four consecutive year-classes (cohorts) were sampled at most localities. Nucleotide-sequence divergence among 170 mtDNA haplotypes identified ranged (in percentage) from 0.184 to 1.913, with

J. R. Gold; L. R. Richardson; T. F. Turner



CaSNP: a database for interrogating copy number alterations of cancer genome from SNP array data  

PubMed Central

Cancer is known to have abundant copy number alterations (CNAs) that greatly contribute to its pathogenesis and progression. Investigation of CNA regions could potentially help identify oncogenes and tumor suppressor genes and infer cancer mechanisms. Although single-nucleotide polymorphism (SNP) arrays have strengthened our ability to identify CNAs with unprecedented resolution, a comprehensive collection of CNA information from SNP array data is still lacking. We developed a web-based CaSNP database for storing and interrogating quantitative CNA data, which curated ~11,500 SNP arrays on 34 different cancer types in 104 studies. With a user input of region or gene of interest, CaSNP will return the CNA information summarizing the frequencies of gain/loss and averaged copy number for each study, and provide links to download the data or visualize it in UCSC Genome Browser. CaSNP also displays the heatmap showing copy numbers estimated at each SNP marker around the query region across all studies for a more comprehensive visualization. Finally, we used CaSNP to study the CNA of protein-coding genes as well as LincRNA genes across all cancer SNP arrays, and found putative regions harboring novel oncogenes and tumor suppressors. In summary, CaSNP is a useful tool for cancer CNA association studies, with the potential to facilitate both basic science and translational research on cancer.

Cao, Qingyi; Zhou, Meng; Wang, Xujun; Meyer, Cliff A.; Zhang, Yong; Chen, Zhi; Li, Cheng; Liu, X. Shirley



SNP discovery by transcriptome pyrosequencing.  


Single nucleotide polymorphisms (SNPs) are single base differences between haplotypes. SNPs are abundant in many species and valuable as markers for genetic map construction, modern molecular breeding programs, and quantitative genetic studies. SNPs are readily mined from genomic DNA or cDNA sequence obtained from individuals having two or more distinct genotypes. While automated Sanger sequencing has become less expensive over time, it is still costly to acquire deep Sanger sequence from several genotypes. "Next-generation" DNA sequencing technologies that utilize new chemistries and massively parallel approaches have enabled DNA sequences to be acquired at extremely high depths of coverage faster and for less cost than traditional sequencing. One such method is represented by the Roche/454 Life Sciences GS-FLX Titanium Series, which currently uses pyrosequencing to produce up to 400-600 million bases of DNA sequence/run (>1 million reads, ~400 bp/read). This chapter discusses the use of high-throughput pyrosequencing for SNP discovery by focusing on 454 sequencing of maize cDNA, the development of a computational pipeline for polymorphism detection, and the subsequent identification of over 7,000 putative SNPs between Mo17 and B73 maize. In addition, alternative alignment and polymorphism detection strategies that implement Illumina short reads, data processing and visualization tools, and reduced representation techniques that reduce the sequencing of repeat DNA, thus enabling efficient analysis of genome sequence, are discussed. PMID:21365494

Barbazuk, W Brad; Schnable, Patrick S



Y-chromosome DNA haplotypes in Jews: comparisons with Lebanese and Palestinians.  


One Y-specific DNA polymorphism (p49/Taq I) was studied in 54 Lebanese and 69 Palestinian males, and compared with the results found in 693 Jews from three communities (Oriental, Sephardic, and Ashkenazic). Lebanese, Palestinian, and Sephardic Jews seem to be similar in their Y-haplotype patterns, both with regard to the haplotype distributions and the ancestral haplotype VIII frequencies. The haplotype distribution in Oriental Jews is characterized by a significantly higher frequency of haplotype VIII. These results confirm similarities in the Y-haplotype frequencies in Lebanese, Palestinian, and Sephardic Jewish men, three Near-Eastern populations sharing a common geographic origin. PMID:12820706

Lucotte, Gérard; Mercier, Géraldine



Identity by Descent Mapping of Founder Mutations in Cancer Using High-Resolution Tumor SNP Data  

PubMed Central

Dense genotype data can be used to detect chromosome fragments inherited from a common ancestor in apparently unrelated individuals. A disease-causing mutation inherited from a common founder may thus be detected by searching for a common haplotype signature in a sample population of patients. We present here FounderTracker, a computational method for the genome-wide detection of founder mutations in cancer using dense tumor SNP profiles. Our method is based on two assumptions. First, the wild-type allele frequently undergoes loss of heterozygosity (LOH) in the tumors of germline mutation carriers. Second, the overlap between the ancestral chromosome fragments inherited from a common founder will define a minimal haplotype conserved in each patient carrying the founder mutation. Our approach thus relies on the detection of haplotypes with significant identity by descent (IBD) sharing within recurrent regions of LOH to highlight genomic loci likely to harbor a founder mutation. We validated this approach by analyzing two real cancer data sets in which we successfully identified founder mutations of well-characterized tumor suppressor genes. We then used simulated data to evaluate the ability of our method to detect IBD tracts as a function of their size and frequency. We show that FounderTracker can detect haplotypes of low prevalence with high power and specificity, significantly outperforming existing methods. FounderTracker is thus a powerful tool for discovering unknown founder mutations that may explain part of the “missing” heritability in cancer. This method is freely available and can be used online at the FounderTracker website.

Letouze, Eric; Sow, Aliou; Petel, Fabien; Rosati, Roberto; Figueiredo, Bonald C.; Burnichon, Nelly; Gimenez-Roqueplo, Anne-Paule



Genome-wide association studies using single-nucleotide polymorphisms versus haplotypes: an empirical comparison with data from the North American Rheumatoid Arthritis Consortium  

PubMed Central

The high genomic density of the single-nucleotide polymorphism (SNP) sets that are typically surveyed in genome-wide association studies (GWAS) now allows the application of haplotype-based methods. Although the choice of haplotype-based vs. individual-SNP approaches is expected to affect the results of association studies, few empirical comparisons of method performance have been reported on the genome-wide scale in the same set of individuals. To measure the relative ability of the two strategies to detect associations, we used a large dataset from the North American Rheumatoid Arthritis Consortium to: 1) partition the genome into haplotype blocks, 2) associate haplotypes with disease, and 3) compare the results with individual-SNP association mapping. Although some associations were shared across methods, each approach uniquely identified several strong candidate regions. Our results suggest that the application of both haplotype-based and individual-SNP testing to GWAS should be adopted as a routine procedure.



Genome Patterns of Selection and Introgression of Haplotypes in Natural Populations of the House Mouse (Mus musculus)  

PubMed Central

General parameters of selection, such as the frequency and strength of positive selection in natural populations or the role of introgression, are still insufficiently understood. The house mouse (Mus musculus) is a particularly well-suited model system to approach such questions, since it has a defined history of splits into subspecies and populations and since extensive genome information is available. We have used high-density single-nucleotide polymorphism (SNP) typing arrays to assess genomic patterns of positive selection and introgression of alleles in two natural populations of each of the subspecies M. m. domesticus and M. m. musculus. Applying different statistical procedures, we find a large number of regions subject to apparent selective sweeps, indicating frequent positive selection on rare alleles or novel mutations. Genes in the regions include well-studied imprinted loci (e.g. Plagl1/Zac1), homologues of human genes involved in adaptations (e.g. alpha-amylase genes) or in genetic diseases (e.g. Huntingtin and Parkin). Haplotype matching between the two subspecies reveals a large number of haplotypes that show patterns of introgression from specific populations of the respective other subspecies, with at least 10% of the genome being affected by partial or full introgression. Using neutral simulations for comparison, we find that the size and the fraction of introgressed haplotypes are not compatible with a pure migration or incomplete lineage sorting model. Hence, it appears that introgressed haplotypes can rise in frequency due to positive selection and thus can contribute to the adaptive genomic landscape of natural populations. Our data support the notion that natural genomes are subject to complex adaptive processes, including the introgression of haplotypes from other differentiated populations or species at a larger scale than previously assumed for animals. 
This implies that some of the admixture found in inbred strains of mice may also have a natural origin.

Staubach, Fabian; Lorenc, Anna; Messer, Philipp W.; Tang, Kun; Petrov, Dmitri A.; Tautz, Diethard



PPC: an algorithm for accurate estimation of SNP allele frequencies in small equimolar pools of DNA using data from high density microarrays  

Microsoft Academic Search

Robust estimation of allele frequencies in pools of DNA has the potential to reduce genotyping costs and/or increase the number of individuals contributing to a study where hundreds of thousands of genetic markers need to be genotyped in very large population sample sets, such as genome wide association studies. In order to make accurate allele frequency estimations from

Jesper Brohede; Rob Dunne; James D. McKay; Garry N. Hannan



Haplotype-based linkage disequilibrium mapping via direct data mining  

Microsoft Academic Search

Motivation: With the availability of large-scale, high-density single-nucleotide polymorphism markers and information on haplotype structures and frequencies, a great challenge is how to take advantage of haplotype information in the association mapping of complex diseases in case-control studies. Results: We present a novel approach for association mapping based on directly mining haplotypes (i.e. phased genotype pairs) produced from

Jing Li; Tao Jiang



PanSNPdb: The Pan-Asian SNP Genotyping Database  

PubMed Central

The HUGO Pan-Asian SNP consortium conducted the largest survey to date of human genetic diversity among Asians by sampling 1,719 unrelated individuals among 71 populations from China, India, Indonesia, Japan, Malaysia, the Philippines, Singapore, South Korea, Taiwan, and Thailand. We have constructed a database (PanSNPdb), which contains these data and various new analyses of them. PanSNPdb is a research resource in the analysis of the population structure of Asian peoples, including linkage disequilibrium patterns, haplotype distributions, and copy number variations. Furthermore, PanSNPdb provides an interactive comparison with other SNP and CNV databases, including HapMap3, JSNP, dbSNP and DGV and thus provides a comprehensive resource of human genetic diversity. The information is accessible via a widely accepted graphical interface used in many genetic variation databases. Unrestricted access to PanSNPdb and any associated files is available at:

Ngamphiw, Chumpol; Assawamakin, Anunchai; Xu, Shuhua; Shaw, Philip J.; Yang, Jin Ok; Ghang, Ho; Bhak, Jong; Liu, Edison; Tongsima, Sissades



Apolipoprotein A1/C3/A5 haplotypes and serum lipid levels  

PubMed Central

Background The association of single nucleotide polymorphisms (SNPs) in the apolipoprotein (Apo) A1/C3/A4/A5 gene cluster and serum lipid profiles is inconsistent. The present study was undertaken to detect the association between the ApoA1/C3/A5 gene polymorphisms and their haplotypes with serum lipid levels in the general Chinese population. Methods A total of 1030 unrelated subjects (492 males and 538 females) aged 15-89 were randomly selected from our previous stratified randomized cluster samples. Genotyping of the ApoA1 -75 bp G>A, ApoC3 3238C>G, ApoA5 -1131T>C, ApoA5 c.553G>T and ApoA5 c.457G>A was performed by polymerase chain reaction and restriction fragment length polymorphism combined with gel electrophoresis, and then confirmed by direct sequencing. Pair-wise linkage disequilibria and haplotype analysis among the five SNPs were estimated. Results The levels of high-density lipoprotein cholesterol (HDL-C) and ApoA1 were lower in males than in females (P < 0.05 for each). The allelic and genotypic frequencies of the SNPs did not differ significantly between males and females, except for ApoC3 3238C>G. There were 11 haplotypes with a frequency >1% identified in the cluster in our population. At the global level, the haplotypes comprising all five SNPs were significantly associated with all seven lipid traits. In particular, haplotype G-G-C-C-A (6%; in the order of ApoA5 c.553G>T, ApoA5 c.457G>A, ApoA5 -1131T>C, ApoC3 3238C>G, and ApoA1 -75bp G>A) and G-A-T-C-G (4%) showed consistent association with total cholesterol (TC), low-density lipoprotein cholesterol (LDL-C), ApoA1, ApoB, and the ApoA1/ApoB ratio. In addition, carriers of haplotype G-G-T-C-G (26%) had increased serum concentration of HDL-C and ApoA1, whereas carriers of G-G-C-G-G (15%) had high concentrations of TC, triglyceride (TG) and ApoB. We also found that haplotypes with five SNPs explain much more serum lipid variation than any single SNP alone, especially for TG (4.4% for haplotype vs. 2.4% for -1131T>C max based on R-square) and HDL-C (5.1% for haplotype vs. 0.9% for c.553G>T based on R-square). Serum lipid parameters were also correlated with genotypes and several environmental factors. Conclusions Several common SNPs and their haplotypes in the ApoA1/C3/A5 gene cluster are closely associated with modifications of serum lipid parameters in the general Chinese population.
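
The abstract above reports pairwise linkage disequilibrium among the five SNPs without stating which measures were used. As a rough sketch, assuming the standard D, |D'| and r-squared definitions for two biallelic loci, the calculation from haplotype and allele frequencies could be written as below. The function name and the input frequencies are hypothetical and do not come from the study.

```python
def pairwise_ld(p_ab: float, p_a: float, p_b: float) -> dict[str, float]:
    """Return D, |D'| and r^2 for two biallelic loci.

    p_ab -- frequency of the haplotype carrying allele A (locus 1) and B (locus 2)
    p_a  -- frequency of allele A at locus 1
    p_b  -- frequency of allele B at locus 2
    """
    d = p_ab - p_a * p_b
    # The largest |D| attainable depends on the sign of D
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return {"D": d, "D_prime": d_prime, "r2": r2}


# Hypothetical frequencies, for illustration only
ld = pairwise_ld(p_ab=0.4, p_a=0.5, p_b=0.5)
```

The sketch assumes phased haplotype frequencies are already available; with unphased genotypes they would first have to be estimated.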



SNP Data Consulting Program  

Microsoft Academic Search

In the post genome era, considerable effort has been put into genetic association study with single nucleotide polymorphisms (SNPs) to investigate genes affecting traits, for example diseases and response to drugs. Although various software tools for SNP association study read plain text files as input data, their formats are not standardized. Manual data conversion may cause incorrect input. In addition,

Toshiko Matsumoto; Yasuyuki Nozaki; Ryo Nakashige



A novel tool for individual haplotype inference using mixed data  

PubMed Central

Background In many studies, researchers may recruit samples consisting of independent trios and unrelated individuals. However, most of the currently available haplotype inference methods do not cope well with these kinds of mixed data sets. Methods We propose a general and simple methodology using a mixture of weighted multinomial (MIXMUL) approach that combines separate haplotype information from unrelated individuals and independent trios for haplotype inference to the individual level. Results The new MIXMUL procedure improves over existing methods in that it can accurately estimate haplotype frequencies from mixed data sets and output probable haplotype pairs in optimized reconstruction outcomes for all subjects that have contributed to estimation. Simulation results showed that this new MIXMUL procedure competes well with the EM-based method, i.e. FAMHAP, under a few assumed scenarios. Conclusion The results showed that MIXMUL can provide accurate estimates similar to those haplotype frequencies obtained from FAMHAP and output the probable haplotype pairs in the most optimal reconstruction outcome for all subjects that have contributed to estimation. If available data consist of combinations of unrelated individuals and independent trios, the MIXMUL procedure can be used to estimate the haplotype frequencies accurately and output the most likely reconstructed haplotype pairs of each subject in the estimation.

Lin, Chen-Pang; Fann, Cathy SJ
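
The abstract above compares MIXMUL with an EM-based method but describes neither algorithm in detail. As a minimal sketch of the general idea behind EM haplotype-frequency estimation, and not of MIXMUL or FAMHAP themselves, the following handles the simplest case: two biallelic SNPs genotyped in unrelated individuals, where only the double heterozygote has ambiguous phase. All names and the example data are hypothetical.

```python
from collections import Counter

HAPS = ("00", "01", "10", "11")


def em_two_snp(genotypes, n_iter=100, tol=1e-10):
    """Estimate haplotype frequencies for two biallelic SNPs by EM.

    genotypes -- iterable of (g1, g2); each g is the count (0, 1 or 2)
                 of allele '1' at that SNP in one unrelated individual
    """
    counts = Counter(genotypes)
    n_chrom = 2 * sum(counts.values())
    if n_chrom == 0:
        raise ValueError("no genotypes supplied")

    # Haplotype counts contributed by phase-unambiguous genotypes
    fixed = dict.fromkeys(HAPS, 0.0)
    for (g1, g2), n in counts.items():
        if (g1, g2) == (1, 1):
            continue  # double heterozygote: phase unknown, handled in E-step
        hap_a = f"{int(g1 >= 1)}{int(g2 >= 1)}"
        hap_b = f"{int(g1 == 2)}{int(g2 == 2)}"
        fixed[hap_a] += n
        fixed[hap_b] += n

    n_dh = counts.get((1, 1), 0)
    freqs = dict.fromkeys(HAPS, 0.25)
    for _ in range(n_iter):
        # E-step: split double heterozygotes between 00/11 and 01/10 phasings
        w_cis = freqs["00"] * freqs["11"]
        w_trans = freqs["01"] * freqs["10"]
        share = w_cis / (w_cis + w_trans) if (w_cis + w_trans) > 0 else 0.5
        expected = dict(fixed)
        expected["00"] += n_dh * share
        expected["11"] += n_dh * share
        expected["01"] += n_dh * (1 - share)
        expected["10"] += n_dh * (1 - share)
        # M-step: renormalise expected counts to frequencies
        new = {h: expected[h] / n_chrom for h in HAPS}
        delta = max(abs(new[h] - freqs[h]) for h in HAPS)
        freqs = new
        if delta < tol:
            break
    return freqs


# Hypothetical genotypes, for illustration only
estimated = em_two_snp([(0, 0), (2, 2), (1, 1)])
```

Handling trios, as the abstract describes, would add family constraints on the possible phasings; that extension is outside this sketch.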



The effect of using genealogy-based haplotypes for genomic prediction  

PubMed Central

Background Genomic prediction uses two sources of information: linkage disequilibrium between markers and quantitative trait loci, and additive genetic relationships between individuals. One way to increase the accuracy of genomic prediction is to capture more linkage disequilibrium by regression on haplotypes instead of regression on individual markers. The aim of this study was to investigate the accuracy of genomic prediction using haplotypes based on local genealogy information. Methods A total of 4429 Danish Holstein bulls were genotyped with the 50K SNP chip. Haplotypes were constructed using local genealogical trees. Effects of haplotype covariates were estimated with two types of prediction models: (1) assuming that effects had the same distribution for all haplotype covariates, i.e. the GBLUP method and (2) assuming that a large proportion (π) of the haplotype covariates had zero effect, i.e. a Bayesian mixture method. Results About 7.5 times more covariate effects were estimated when fitting haplotypes based on local genealogical trees compared to fitting individual markers. Genealogy-based haplotype clustering slightly increased the accuracy of genomic prediction and, in some cases, decreased the bias of prediction. With the Bayesian method, accuracy of prediction was less sensitive to parameter π when fitting haplotypes compared to fitting markers. Conclusions Use of haplotypes based on genealogy can slightly increase the accuracy of genomic prediction. Improved methods to cluster the haplotypes constructed from local genealogy could lead to additional gains in accuracy.



Haplotypes in SLC24A5 Gene as Ancestry Informative Markers in Different Populations.  


Ancestry informative markers (AIMs) are human polymorphisms that exhibit substantial allele frequency differences among populations. These markers can be useful to provide information about ancestry of samples which may be useful in predicting a perpetrator's ethnic origin to aid criminal investigations. Variations in human pigmentation are the most obvious phenotypes to distinguish individuals. It has been recently shown that the variation of a G in an A allele of the coding single-nucleotide polymorphism (SNP) rs1426654 within SLC24A5 gene varies in frequency among several population samples according to skin pigmentation. Because of these observations, the SLC24A5 locus has been evaluated as Ancestry Informative Region (AIR) by typing rs1426654 together with two additional intragenic markers (rs2555364 and rs16960620) in 471 unrelated individuals originating from three different continents (Africa, Asia and Europe). This study further supports the role of human SLC24A5 gene in skin pigmentation suggesting that variations in SLC24A5 haplotypes can correlate with human migration and ancestry. Furthermore, our data do reveal the utility of haplotype and combined unphased genotype analysis of SLC24A5 in predicting ancestry and provide a good example of usefulness of genetic characterization of larger regions, in addition to single polymorphisms, as candidates for population-specific sweeps in the ancestral population. PMID:19440451

Giardina, Emiliano; Pietrangeli, Ilenia; Martínez-Labarga, Cristina; Martone, Claudia; de Angelis, Flavio; Spinella, Aldo; De Stefano, Gianfranco; Rickards, Olga; Novelli, Giuseppe



β-Globin Gene Haplotype Characteristics of Colombian Amerinds in South America  

Microsoft Academic Search

Haplotypes and subhaplotypes in the β-globin gene cluster were identified in 146 and 156 chromosomes, respectively, of three tribes of Colombian Amerinds. Subhaplotype [+––––] was a major one in Colombian Amerinds as in most human ethnic groups except Africans. A major subhaplotype [––––+] in Africans was observed in only one chromosome. The framework 2 frequencies were very low (0.018–0.067). Haplotype

Koji Shimizu; Toyoko Hashimoto; Shinji Harihara; Kazuo Tajima; Shunro Sonoda; Vladimir Zaninovic



Functional characterisation of bovine interleukin 8 promoter haplotypes in vitro.  


Interleukin 8 (IL-8) is a major mediator of the innate immune response and polymorphisms in this gene are associated with susceptibility to inflammatory disease in humans. The aim of this study was to characterise the promoter region of the bovine IL8 gene towards understanding its regulation and the effect of promoter polymorphisms on gene expression levels. Twenty-nine polymorphic sites were identified across a 2.1 kb upstream promoter region of the IL8 gene including two insertion/deletion polymorphisms. Sequence analysis and SNP genotyping identified two distinct promoter haplotypes (IL8-h1 and IL8-h2), which were present at significantly different frequencies in two divergently selected cattle breeds - Holstein-Friesian and Norwegian Red (IL8-h1 at 48% and 80% respectively). IL8-h1 was functionally less responsive in unstimulated mammary epithelial cells and in response to stimulation with LPS or bovine TNF. Serial deletion analysis and in silico transcription-factor binding site analysis indicated that allele specific binding of the transcriptional repressor Oct-1 may account for the reduced sensitivity of IL8-h1. Our finding of genetic variation in the bovine IL8 promoter that differentially regulates its expression has significant functional implications for IL8 expression in vitro and may impact on susceptibility to bovine infectious disease and inflammation. PMID:22244152

Meade, Kieran G; O'Gorman, Grace M; Narciandi, Fernando; Machugh, David E; O'Farrelly, Cliona



A validated genome-wide association study in 2 dairy cattle breeds for milk production and fertility traits using variable length haplotypes.  


Genome-wide association studies (GWAS) were used to discover genomic regions explaining variation in dairy production and fertility traits. Associations were detected with either single nucleotide polymorphism (SNP) markers or haplotypes of SNP alleles. An across-breed validation strategy was used to narrow the genomic interval containing causative mutations. There were 39,048 SNP tested in a discovery population of 780 Holstein sires and validated in 386 Holsteins and 364 Jersey sires. Previously identified mutations affecting milk production traits were confirmed. In addition, several novel regions were identified, including a putative quantitative trait locus for fertility on chromosome 18 that was detected only using haplotypes greater than 3 SNP long. It was found that the precision of quantitative trait loci mapping increased with haplotype length as did the number of validated haplotypes discovered, especially across breed. Promising candidate genes have been identified in several of the validated regions. PMID:20630249

Pryce, J E; Bolormaa, S; Chamberlain, A J; Bowman, P J; Savin, K; Goddard, M E; Hayes, B J



SORL1 haplotypes modulate risk of Alzheimer's disease in Chinese.  


Genetic variants of the neuronal sortilin-related receptor (SORL1) have been demonstrated to modulate the risk of Alzheimer's disease (AD) in different American and European populations [Rogaeva, E., Meng, Y., Lee, J.H., Gu, Y., Kawarai, T., Zou, F., Katayama, T., Baldwin, C.T., Cheng, R., Hasegawa, H., Chen, F., Shibata, N., Lunetta, K.L., Pardossi-Piquard, R., Bohm, C., Wakutani, Y., Cupples, L.A., Cuenco, K.T., Green, R.C., Pinessi, L., Rainero, I., Sorbi, S., Bruni, A., Duara, R., Friedland, R.P., Inzelberg, R., Hampe, W., Bujo, H., Song, Y.Q., Andersen, O.M., Willnow, T.E., Graff-Radford, N., Petersen, R.C., Dickson, D., Der, S.D., Fraser, P.E., Schmitt-Ulms, G., Younkin, S., Mayeux, R., Farrer, L.A., St George-Hyslop, P., 2007. The neuronal sortilin-related receptor SORL1 is genetically associated with Alzheimer disease. Nat. Genet. 39 (2), 168-177]. We conducted haplotype analysis involving two genetic clusters of SORL1 in AD and controls among Han Chinese. rs3824968 (SNP 23) was associated with an increased risk of AD, and there was a trend towards association for rs1699102 (SNP 22) and rs2282649 (SNP 24). More robust associations were found for three-loci haplotypes. In particular, the GCA haplotype at SNPs 19-22-23 was associated with an increased risk (odds ratio 1.4), and CTC haplotype at SNPs 19-22-23 and TCT at SNPs 22-23-24 a decreased risk (odds ratio 0.67) of AD. The complete absence of some at-risk North European haplotypes in our Chinese study subjects was likely due to different ancestral origins, with allelic heterogeneity among races. However, our study suggests that certain SORL1 haplotypes at SNPs 19-24 modulated risk of AD in our Chinese population. PMID:18063222

Tan, E K; Lee, J; Chen, C P; Teo, Y Y; Zhao, Y; Lee, W L
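
The abstract above reports haplotype odds ratios but not the underlying counts or the interval method. As an illustrative sketch, assuming a simple 2x2 table of haplotype carriers versus non-carriers in cases and controls and a Woolf (log-based) 95% confidence interval, the calculation could look like this. The function name and all counts are hypothetical and are not taken from the study.

```python
import math


def odds_ratio(case_carrier: int, case_non: int,
               ctrl_carrier: int, ctrl_non: int,
               z: float = 1.96) -> tuple[float, float, float]:
    """Odds ratio with a Woolf confidence interval from a 2x2 table.

    Returns (odds_ratio, lower_bound, upper_bound).
    All four cells must be positive for the log-based interval to exist.
    """
    cells = (case_carrier, case_non, ctrl_carrier, ctrl_non)
    if min(cells) <= 0:
        raise ValueError("all four cell counts must be positive")
    or_value = (case_carrier * ctrl_non) / (case_non * ctrl_carrier)
    # Standard error of log(OR) under Woolf's approximation
    se_log = math.sqrt(sum(1.0 / c for c in cells))
    lower = math.exp(math.log(or_value) - z * se_log)
    upper = math.exp(math.log(or_value) + z * se_log)
    return or_value, lower, upper


# Hypothetical counts, for illustration only
or_value, lower, upper = odds_ratio(20, 10, 10, 20)
```

An interval that excludes 1 corresponds to a nominally significant association at roughly the 5% level; haplotype-specific analyses would normally also account for phase uncertainty, which this sketch ignores.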



Haplotype reconstruction using perfect phylogeny and sequence data  

PubMed Central

Haplotype phasing is a well studied problem in the context of genotype data. With the recent developments in high-throughput sequencing, new algorithms are needed for haplotype phasing, when the number of samples sequenced is low and when the sequencing coverage is low. High-throughput sequencing technologies enable new possibilities for the inference of haplotypes. Since each read originates from a single chromosome, all the variant sites it covers must derive from the same haplotype. Moreover, the sequencing process yields much higher SNP density than previous methods, resulting in a higher correlation between neighboring SNPs. We offer a new approach for haplotype phasing, which leverages these two properties. Our suggested algorithm, called Perfect Phylogeny Haplotypes from Sequencing (PPHS), uses a perfect phylogeny model and models the sequencing errors explicitly. We evaluated our method on real and simulated data, and we demonstrate that the algorithm outperforms previous methods when the sequencing error rate is high or when coverage is low.
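
The abstract above does not describe how PPHS checks or enforces the perfect phylogeny model. As a minimal sketch of the underlying constraint only, and not of the PPHS algorithm, the following applies the standard four-gamete test: a set of binary haplotypes fits a perfect phylogeny only if no pair of sites shows all four allele combinations. The function name and the example haplotypes are hypothetical.

```python
from itertools import combinations


def four_gamete_violations(haplotypes):
    """Return pairs of site indices at which all four gametes occur.

    haplotypes -- list of equal-length strings over the alphabet '0'/'1'
    An empty result means the haplotypes are compatible with a
    perfect phylogeny under the four-gamete condition.
    """
    if not haplotypes:
        return []
    n_sites = len(haplotypes[0])
    if any(len(h) != n_sites for h in haplotypes):
        raise ValueError("haplotypes must all have the same length")
    violations = []
    for i, j in combinations(range(n_sites), 2):
        # Distinct allele combinations observed at sites i and j
        gametes = {(h[i], h[j]) for h in haplotypes}
        if len(gametes) == 4:
            violations.append((i, j))
    return violations


# Hypothetical haplotypes, for illustration only
compatible = four_gamete_violations(["000", "011", "110"])
conflicting = four_gamete_violations(["000", "011", "110", "101"])
```

A phasing method built on this model would search for haplotype assignments that keep the violation list empty while also explaining the observed reads; how sequencing errors enter that search is specific to the published method and is not reproduced here.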



Identification and genetic effect of haplotype in the bovine BMP7 gene.  


Bone morphogenetic proteins (BMPs) are peptide growth factors belonging to the transforming growth factor-beta (TGF-β) superfamily, and some members of the BMP family support white adipocyte differentiation. In this study, we focused on the BMP7 which singularly promotes the differentiation of brown preadipocytes. Haplotypes involving 5 single nucleotide polymorphism (SNP) sites in the bovine BMP7 gene were identified and their effect on body weight was analyzed. 16 haplotypes and 18 combined haplotypes were revealed and the linkage disequilibrium was assessed in the cattle population with 602 individuals representing three main cattle breeds from China. The results showed that haplotypes 3, 10 and 14 were predominant and accounted for 75.64%, 69.85%, and 83.36% in Nanyang, Qinchuan and Jiaxian cattle breeds, respectively. The statistical analyses indicated that the SNP 1, 4, and 5 are associated with the body weight, body length, and heart girth at 12 and 24 months in Nanyang cattle population (P<0.05), whereas there is no significant association between their 16 haplotypes and 18 combined haplotypes. Our results provide evidence that some SNPs and haplotypes in BMP7 are associated with growth traits, and may be utilized as a genetic marker in marker-assisted selection for beef cattle breeding programs. PMID:23500594

Huang, Yong-Zhen; Wang, Xin-Lei; He, Hua; Lan, Xian-Yong; Lei, Chu-Zhao; Zhang, Chun-Lei; Chen, Hong



Online resources for SNP analysis  

Microsoft Academic Search

The major online single nucleotide polymorphism (SNP) databases freely available as research tools for genetic analysis are explained, reviewed, and compared. An outline is given of the search strategies that can be used with the most extensive current SNP databases: National Centre for Biotechnology Information (NCBI) dbSNP and HapMap to help the user secure the most appropriate data for the

Christopher Phillips



High frequency of human leukocyte antigen class II DRB1*1602 haplotype in Greek patients with myelodysplastic syndrome and of DRB1*1501 in the low-risk subgroup.  


Myelodysplastic syndromes (MDS) comprise a heterogenous group of clonal hematopoietic disorders in which the immune-mediated pathogenetic mechanisms are under investigation. Overrepresentation of human leukocyte antigen (HLA)-DR2 and its serologic split HLA-DR15 has been associated with low-risk MDS in certain ethnic groups and has been proposed as a predictive factor for a favorable response to immunomodulatory treatment. Because the HLA-DRB1*15 haplotype does not predominate in the Greek population, we investigated the frequency of HLA-DRB1 alleles among 114 patients of Greek origin suffering from various types of MDS: 36 refractory anemia (RA), 24 refractory anemia with ringed sideroblasts (RARS), 19 refractory anemia with excess of blasts (RAEB), 14 refractory anemia with excess of blasts in transformation (RAEB-t), 14 chronic myelomonocytic leukemia, and 7 hypoplastic MDS patients. HLA-DRB1 molecular typing was performed with polymerase chain reaction-sequence specific oligonucleotides and results were compared with that from a previously reported control Greek population. HLA-DRB1*1602 was the only allele that was significantly overrepresented in Greek MDS patients as a whole, whereas HLA-DRB1*1501 allele frequency was significantly higher in Greek patients with low-risk myelodysplasia. Our results suggest the possible value of HLA-DR15 and HLA-DR16 as determinants for immunomodulatory interventions, at least for Greek patients with low-risk MDS. PMID:22244918

Kritikou-Griva, Elpiniki; Spyropoulou-Vlachou, Maria; Tsagarakis, Nikolaos J; Goumakou, Eleni; Vrani, Vasiliki; Galanopoulos, Athanasios; Papadhimitriou, Stefanos I; Androutsos, George; Paterakis, George; Stavropoulos-Giokas, Catherine



Asian population frequencies and haplotype distribution of killer cell immunoglobulin-like receptor ( KIR ) genes among Chinese, Malay, and Indian in Singapore  

Microsoft Academic Search

Killer cell immunoglobulin-like receptors (KIR) gene frequencies have been shown to be distinctly different between populations and contribute to functional variation in the immune response. We have investigated KIR gene frequencies in 370 individuals representing three Asian populations in Singapore and report here the distribution of 14 KIR genes (2DL1, 2DL2, 2DL3, 2DL4, 2DL5, 2DS1, 2DS2, 2DS3, 2DS4, 2DS5, 3DL1,

Yi Chuan Lee; Soh Ha Chan; Ee Chee Ren



Haplotype Association Mapping of Acute Lung Injury in Mice Implicates Activin A Receptor, Type 1  

PubMed Central

Rationale: Because acute lung injury is a sporadic disease produced by heterogeneous precipitating factors, previous genetic analyses are mainly limited to candidate gene case-control studies. Objectives: To develop a genome-wide strategy in which single nucleotide polymorphism associations are assessed for functional consequences to survival during acute lung injury in mice. Methods: To identify genes associated with acute lung injury, 40 inbred strains were exposed to acrolein and haplotype association mapping, microarray, and DNA-protein binding were assessed. Measurements and Main Results: The mean survival time varied among mouse strains with polar strains differing approximately 2.5-fold. Associations were identified on chromosomes 1, 2, 4, 11, and 12. Seven genes (Acvr1, Cacnb4, Ccdc148, Galnt13, Rfwd2, Rpap2, and Tgfbr3) had single nucleotide polymorphism (SNP) associations within the gene. Because SNP associations may encompass “blocks” of associated variants, functional assessment was performed in 91 genes within ± 1 Mbp of each SNP association. Using 10% or greater allelic frequency and 10% or greater phenotype explained as threshold criteria, 16 genes were assessed by microarray and reverse real-time polymerase chain reaction. Microarray revealed several enriched pathways including transforming growth factor-β signaling. Transcripts for Acvr1, Arhgap15, Cacybp, Rfwd2, and Tgfbr3 differed between the strains with exposure and contained SNPs that could eliminate putative transcriptional factor recognition sites. Ccdc148, Fancl, and Tnn had sequence differences that could produce an amino acid substitution. Mycn and Mgat4a had a promoter SNP or 3′ untranslated region SNPs, respectively. Several genes were related and encoded receptors (ACVR1, TGFBR3), transcription factors (MYCN, possibly CCDC148), and ubiquitin-proteasome (RFWD2, FANCL, CACYBP) proteins that can modulate cell signaling. An Acvr1 SNP eliminated a putative ELK1 binding site and diminished DNA–protein binding. Conclusions: Assessment of genetic associations can be strengthened using a genetic/genomic approach. This approach identified several candidate genes, including Acvr1, associated with increased susceptibility to acute lung injury in mice.

Leikauf, George D.; Concel, Vincent J.; Liu, Pengyuan; Bein, Kiflai; Berndt, Annerose; Ganguly, Koustav; Jang, An Soo; Brant, Kelly A.; Dietsch, Maggie; Pope-Varsalona, Hannah; Dopico, Richard A.; Di, Y. P. Peter; Li, Qian; Vuga, Louis J.; Medvedovic, Mario; Kaminski, Naftali; You, Ming; Prows, Daniel R.



Maori origins, Y-chromosome haplotypes and implications for human history in the Pacific  

Microsoft Academic Search

For the SNP 2000 Special Issue An assessment of 28 pertinent binary genetic markers on the non-recombining portion of the Y chromosome (NRY) in New Zealand Maori and other relevant populations has revealed a diverse genetic paternal heritage of extant Maori. A maximum parsimony phylogeny was constructed in which nine of the 25 possible binary haplotypes were observed. Although ~40%

Peter A. Underhill; Giuseppe Passarino; Alice A. Lin; Sangkot Marzuki; Peter J. Oefner; L. Luca Cavalli-Sforza; Geoffrey K. Chambers



Polynomial and APX-hard cases of the individual haplotyping problem  

Microsoft Academic Search

SNP haplotyping problems have been the subject of extensive research in the last few years, and are one of the hottest areas of Computational Biology today. In this paper we report on our work of the last two years, whose preliminary results were presented at the European Symposium on Algorithms (Proceedings of the Annual European Symposium on Algorithms (ESA), Vol.

Vineet Bafna; Sorin Istrail; Giuseppe Lancia; Romeo Rizzi



Haplotype Inference in General Pedigrees Using the Cluster Variation Method  

PubMed Central

We present CVMHAPLO, a probabilistic method for haplotyping in general pedigrees with many markers. CVMHAPLO reconstructs the haplotypes by assigning in every iteration a fixed number of the ordered genotypes with the highest marginal probability, conditioned on the marker data and ordered genotypes assigned in previous iterations. CVMHAPLO makes use of the cluster variation method (CVM) to efficiently estimate the marginal probabilities. We focused on single-nucleotide polymorphism (SNP) markers in the evaluation of our approach. In simulated data sets where exact computation was feasible, we found that the accuracy of CVMHAPLO was high and similar to that of maximum-likelihood methods. In simulated data sets where exact computation of the maximum-likelihood haplotype configuration was not feasible, the accuracy of CVMHAPLO was similar to that of state of the art Markov chain Monte Carlo (MCMC) maximum-likelihood approximations when all ordered genotypes were assigned and higher when only a subset of the ordered genotypes was assigned. CVMHAPLO was faster than the MCMC approach and provided more detailed information about the uncertainty in the inferred haplotypes. We conclude that CVMHAPLO is a practical tool for the inference of haplotypes in large complex pedigrees.

Albers, Cornelis A.; Heskes, Tom; Kappen, Hilbert J.



SNPSTR: a database of compound microsatellite-SNP markers  

PubMed Central

There has been widespread and growing interest in genetic markers suitable for drawing population genetic inferences about past demographic events and to detect the effects of selection. In addition to single nucleotide polymorphisms (SNPs), microsatellites (or short tandem repeats, STRs) have received great attention in the analysis of human population history. In the SNPSTR database we catalogue a relatively new type of compound genetic marker called SNPSTR which combines a microsatellite marker (STR) with one or more tightly linked SNPs. Here, the SNP(s) and the microsatellite are less than 250 bp apart so each SNPSTR can be considered a small haplotype with no recombination occurring between the two individual markers. Thus, SNPSTRs have the potential to become a very useful tool in the field of population genetics. The SNPSTR database contains all inferable human SNPSTRs as well as those in mouse, rat, dog and chicken, i.e. all model organisms for which extensive SNP datasets are available.

Agrafioti, I.; Stumpf, M. P. H.
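
The defining criterion in the abstract above (one or more SNPs lying less than 250 bp from a microsatellite) can be illustrated with a simple interval check. This is only a sketch under stated assumptions, not the database's actual pipeline: the function name `find_snpstrs`, the input format, the coordinate convention, and the choice to skip SNPs inside the repeat are all hypothetical.

```python
from bisect import bisect_left, bisect_right

MAX_DISTANCE_BP = 250  # distance threshold quoted in the abstract


def find_snpstrs(strs, snp_positions, max_distance=MAX_DISTANCE_BP):
    """Pair each STR with the SNPs lying less than max_distance bp from it.

    strs: list of (name, start, end) tuples with 1-based inclusive
          coordinates on one chromosome (hypothetical input format).
    snp_positions: SNP coordinates on the same chromosome.
    Returns {str_name: [snp positions]} for STRs with at least one SNP.
    """
    snps = sorted(snp_positions)
    result = {}
    for name, start, end in strs:
        # Keep SNPs strictly closer than max_distance to an STR boundary.
        lo = bisect_right(snps, start - max_distance)
        hi = bisect_left(snps, end + max_distance)
        # Assumption: SNPs inside the repeat itself are not counted.
        nearby = [p for p in snps[lo:hi] if p < start or p > end]
        if nearby:
            result[name] = nearby
    return result


# Toy coordinates, invented for illustration only.
toy_strs = [("STR_A", 1000, 1040), ("STR_B", 5000, 5020)]
toy_snps = [700, 760, 1020, 1200, 1290, 4000]
candidates = find_snpstrs(toy_strs, toy_snps)
```

Sorting the SNP positions once and using binary search keeps the lookup cheap when many STRs are scanned against a large SNP set.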



Patterns of nucleotide and haplotype diversity at ICAM-1 across global human populations with varying levels of malaria exposure.  


Malaria is one of the strongest selective pressures in recent human evolution. African populations have been and continue to be at risk for malarial infections. However, few studies have re-sequenced malaria susceptibility loci across geographically and genetically diverse groups in Africa. We examined nucleotide diversity at Intercellular adhesion molecule-1 (ICAM-1), a malaria susceptibility candidate locus, in a number of human populations with a specific focus on diverse African ethnic groups. We used tests of neutrality to assess whether natural selection has impacted this locus and tested whether SNP variation at ICAM-1 is correlated with malaria endemicity. We observe differing patterns of nucleotide and haplotype variation in global populations and higher levels of diversity in Africa. Although we do not observe a deviation from neutrality based on the allele frequency distribution, we do observe several alleles at ICAM-1, including the ICAM-1 (Kilifi) allele, that are correlated with malaria endemicity. We show that the ICAM-1 (Kilifi) allele, which is common in Africa and Asia, exists on distinct haplotype backgrounds and is likely to have arisen more recently in Asia. Our results suggest that correlation analyses of allele frequencies and malaria endemicity may be useful for identifying candidate functional variants that play a role in malaria resistance and susceptibility. PMID:23609612

Gomez, Felicia; Tomas, Gil; Ko, Wen-Ya; Ranciaro, Alessia; Froment, Alain; Ibrahim, Muntaser; Lema, Godfrey; Nyambo, Thomas B; Omar, Sabah A; Wambebe, Charles; Hirbo, Jibril B; Rocha, Jorge; Tishkoff, Sarah A



Comparative analysis of haplotype association mapping algorithms  

PubMed Central

Background Finding the genetic causes of quantitative traits is a complex and difficult task. Classical methods for mapping quantitative trait loci (QTL) in mice use an F2 cross between two strains with substantially different phenotype and an interval mapping method to compute confidence intervals at each position in the genome. This process requires significant resources for breeding and genotyping, and the data generated are usually only applicable to one phenotype of interest. Recently, we reported the application of a haplotype association mapping method which utilizes dense genotyping data across a diverse panel of inbred mouse strains and a marker association algorithm that is independent of any specific phenotype. As the availability of genotyping data grows in size and density, analysis of these haplotype association mapping methods should be of increasing value to the statistical genetics community. Results We describe a detailed comparative analysis of variations on our marker association method. In particular, we describe the use of inferred haplotypes from adjacent SNPs, parametric and nonparametric statistics, and control of multiple testing error. These results show that nonparametric methods are slightly better in the test cases we study, although the choice of test statistic may often be dependent on the specific phenotype and haplotype structure being studied. The use of multi-SNP windows to infer local haplotype structure is critical to the use of a diverse panel of inbred strains for QTL mapping. Finally, because the marginal effect of any single gene in a complex disease is often relatively small, these methods require the use of sensitive methods for controlling family-wise error. We also report our initial application of this method to phenotypes cataloged in the Mouse Phenome Database. Conclusion The use of inbred strains of mice for QTL mapping has many advantages over traditional methods. 
However, there are also limitations in comparison to the traditional linkage analysis from F2 and RI lines. Application of these methods requires careful consideration of algorithmic choices based on both theoretical and practical factors. Our findings suggest general guidelines, though a complete evaluation of these methods can only be performed as more genetic data in complex diseases becomes available.

McClurg, Phillip; Pletcher, Mathew T; Wiltshire, Tim; Su, Andrew I
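
The idea of inferring local haplotypes from windows of adjacent SNPs and then testing them against a phenotype can be sketched as follows. This is a minimal illustration, not the authors' algorithm: it assumes fully homozygous inbred strains, a fixed window width of three SNPs, and a parametric one-way ANOVA F statistic (the abstract also discusses nonparametric statistics and multiple-testing control, both omitted here). All function names and the toy data are hypothetical.

```python
from collections import defaultdict


def window_haplotypes(genotypes, start, width=3):
    """Return each strain's local haplotype string for one SNP window.

    genotypes: {strain: string of alleles, one character per SNP}.
    Inbred strains are assumed homozygous, so one allele per SNP suffices.
    """
    return {s: g[start:start + width] for s, g in genotypes.items()}


def anova_f(groups):
    """One-way ANOVA F statistic for a list of lists of phenotype values."""
    values = [v for g in groups for v in g]
    n, k = len(values), len(groups)
    if k < 2 or n <= k:
        return None  # undefined without two groups and residual d.f.
    grand = sum(values) / n
    ss_between = sum(len(g) * (sum(g) / len(g) - grand) ** 2 for g in groups)
    ss_within = sum((v - sum(g) / len(g)) ** 2 for g in groups for v in g)
    if ss_within == 0:
        return None  # no within-group variation; F is not informative
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def scan(genotypes, phenotypes, width=3):
    """Compute an F statistic for every sliding window of `width` SNPs."""
    n_snps = len(next(iter(genotypes.values())))
    scores = []
    for start in range(n_snps - width + 1):
        by_hap = defaultdict(list)
        for strain, hap in window_haplotypes(genotypes, start, width).items():
            by_hap[hap].append(phenotypes[strain])
        scores.append((start, anova_f(list(by_hap.values()))))
    return scores


# Toy strains and phenotype values, invented for illustration only.
toy_genotypes = {"s1": "AAGT", "s2": "AAGT", "s3": "CCGT", "s4": "CCGT"}
toy_phenotypes = {"s1": 1.0, "s2": 2.0, "s3": 5.0, "s4": 6.0}
toy_scores = scan(toy_genotypes, toy_phenotypes)
```

Grouping strains by window haplotype rather than by single-SNP allele is what lets more than two groups emerge at a locus, which is the motivation for multi-SNP windows given in the abstract.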



Allelic frequencies for SNP variants in the gene Nramp1 in bovine infected with Brucella abortus or classified by resistance to the pathogen  

Microsoft Academic Search

Natural resistance to brucellosis in cattle has been associated with genetic factors, mainly with certain single nucleotide polymorphisms (SNPs) located within the Nramp1 gene. The current research studied the effect of nucleotide variants found in coding regions, and of another variant located in the 3′ untranslated region of the Nramp1 gene, on the classification of animals as resistant or susceptible,

Esperanza Rueda


16th IHIW: global distribution of extended HLA haplotypes.  


This report describes the project to identify the global distribution of extended HLA haplotypes, a component of 16th International HLA and Immunogenetics Workshop (IHIW), and summarizes the initial analyses of data collected. The project aims to investigate extended HLA haplotypes, compare their distribution among different populations, assess their frequency in hematopoietic stem cell unrelated donor registries and initiate an international family studies database and DNA repository to be made publicly available. HLA haplotypes compiled in immunogenetics laboratories during the evaluation of transplant candidates and related potential donors were analysed. Haplotypes were determined using the pedigree analysis tool publicly available from the National Marrow Donor Program (NMDP) website. Nineteen laboratories from 10 countries (11 laboratories from North America, five from Asia, two from Latin America and one from Australia) contributed data on a total of 1719 families comprised of 7474 individuals. We identified 10393 HLA haplotypes, of which 1682 haplotypes included high-resolution typing at HLA-A, B, C, DRB1 and DQB1 loci. We also present haplotypes containing MICA and other HLA loci and haplotypes containing rare alleles seen in these families. The project will be extended through the 17th IHIW, and investigators interested in joining the project may communicate with the first author. PMID:23302097

Askar, M; Daghstani, J; Thomas, D; Leahy, N; Dunn, P; Claas, F; Doran, S; Saji, H; Kanangat, S; Karoichane, M; Tambur, A; Monos, D; El-Khalifa, M; Turner, V; Kamoun, M; Mustafa, M; Ramon, D; Gandhi, M; Vernaza, A; Gorodezky, C; Wagenknecht, D; Gautreaux, M; Hajeer, A; Kashi, Z; Fernandez-Vina, M



Conserved extended haplotypes of the major histocompatibility complex: further characterization.  


Since the complete sequencing of a human major histocompatibility complex (MHC) haplotype, interest in non-human leucocyte antigen (HLA) genes encoded in the MHC has been growing. Non-HLA genes, which outnumber the HLA genes, may contribute to or account for HLA and disease associations. Most information on non-HLA genes has been obtained in separate studies of individual loci. To comprehensively address polymorphisms of relevant non-HLA genes in 'conserved extended haplotypes' (CEH), we investigated 101 International Histocompatibility Workshop reference cell lines and nine additional anonymous samples representing all 37 unambiguously characterized CEHs at MICA, NFKBIL1, LTA, NCR3, AIF1, HSPA1A, HSPA1B, BF, NOTCH4 and a single nucleotide polymorphism (SNP) at HLA-DQA1 as well as MICA, NOTCH4, HSPA1B and all five tumour necrosis factor short tandem repeat (STR) polymorphisms. This work (1) provides an extensive catalogue of MHC polymorphisms in all CEHs, (2) unravels interrelationships between HLA and non-HLA haplotypical lineages, (3) resolves reported typing ambiguities and (4) describes haplospecific markers for a number of CEHs. Analysis also identified a DQA1 SNP and segments containing MHC class III polymorphisms that corresponded with class II (DRB3 and DRB4) lineages. These results portray the MHC where lineages containing non-HLA and HLA variants in linkage disequilibrium may operate in concert and can guide more thorough design and interpretation of HLA-disease relationships. PMID:16791278

Dorak, M T; Shao, W; Machulla, H K G; Lobashevsky, E S; Tang, J; Park, M H; Kaslow, R A



Haplotypes of the Y chromosome in some populations of west Africa  

Microsoft Academic Search

One Y-specific DNA polymorphism (p49/TaqI) was studied in a sample of 469 African males coming from twelve populations of sub-Saharan Africa. A high frequency (62.5%) of the Y-haplotype IV was observed in these populations, the most elevated percentage of this haplotype being observed in Mossis (from Burkina Faso). The “Arabic” haplotype V is present in these populations at a mean frequency

G. Lucotte; N. Gérard



Distribution of Y-chromosome STR defined haplotypes in Iberia  

Microsoft Academic Search

Seven Y-specific STR loci (DYS19, DYS389I, DYS389II, DYS390, DYS391, DYS392 and DYS393) were studied in five populations from the Iberian Peninsula: Andalusia, Valencia, Basque Country, Galicia and Northern Portugal. Haplotype and allele frequencies of these seven Y-chromosome STRs were estimated. Observed haplotype diversities are in a range between 0.96 (Basque Country) and 0.99 (Valencia and Andalusia). Significant population differentiation was

Annabel González-Neira; Leonor Gusmão; María Brión; María Victoria Lareu; António Amorim; Angel Carracedo
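
Haplotype diversity values such as those quoted above are commonly computed with Nei's unbiased estimator, H = n/(n-1) × (1 − Σ p_i²), where p_i is the frequency of the i-th haplotype among n sampled chromosomes. The truncated abstract does not state which estimator was used, so the sketch below is an assumption, and the toy haplotypes (tuples of repeat counts) are invented.

```python
from collections import Counter


def haplotype_diversity(haplotypes):
    """Nei's unbiased diversity: n/(n-1) * (1 - sum of squared frequencies)."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two sampled haplotypes")
    counts = Counter(haplotypes)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


# Each haplotype is a tuple of repeat counts across the typed STR loci.
# These four toy haplotypes use three loci only and are not real data.
toy_sample = [(14, 13, 29), (14, 13, 29), (15, 13, 30), (14, 12, 29)]
toy_h = haplotype_diversity(toy_sample)
```

The estimator approaches 1 when almost every sampled chromosome carries a distinct haplotype, which is typical for multi-locus Y-STR data.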



North African genes in Iberia studied by Y-chromosome DNA haplotype V.  


Haplotype V at the Y-chromosome specific DNA polymorphism (p49/TaqI) was reported in a study concerning 487 males originating from five different geographic locations in Iberia and North Africa. The highest frequency of haplotype V (68.9%) was previously observed in Berbers from Morocco, and it was previously established that this haplotype is a characteristic Berberian haplotype in North Africa. Percentages of haplotype V geographic distribution reveal a gradient of decreasing frequencies with latitude in Iberia: 40.8% in Andalusia, 36.2% in Portugal, 12.1% in Catalonia, and 11.3% in Basques; such a cline of decreasing haplotype V frequencies from the South to the North in Iberia clearly establishes a North African toward Iberian gene flow. PMID:11543890

Lucotte, G; Gérard, N; Mercier, G



MHC Class II haplotypes of Colombian Amerindian tribes  

PubMed Central

We analyzed 1041 individuals belonging to 17 Amerindian tribes of Colombia for MHC class II haplotypes (HLA-DRB1, DQA1, DQB1): Chimila, Bari and Tunebo (Chibcha linguistic family), Embera, Waunana (Choco linguistic family), Puinave and Nukak (Maku-Puinave linguistic families), Cubeo, Guanano, Tucano, Desano and Piratapuyo (Tukano linguistic family), Guahibo and Guayabero (Guayabero linguistic family), Curripaco and Piapoco (Arawak linguistic family) and Yucpa (Karib linguistic family). Approximately 90% of the MHC class II haplotypes found among these tribes are haplotypes frequently encountered in other Amerindian tribes. Nonetheless, striking differences were observed among Chibcha and non-Chibcha speaking tribes. The DRB1*04:04, DRB1*04:11, DRB1*09:01 carrying haplotypes were frequently found among non-Chibcha speaking tribes, while the DRB1*04:07 haplotype showed significant frequencies among Chibcha speaking tribes, and only marginal frequencies among non-Chibcha speaking tribes. Our results suggest that the differences in MHC class II haplotype frequency found among Chibcha and non-Chibcha speaking tribes could be due to genetic differentiation in Mesoamerica of the ancestral Amerindian population into Chibcha and non-Chibcha speaking populations before they entered into South America.

Yunis, Juan J.; Yunis, Edmond J.; Yunis, Emilio



Analysis of Allele and Haplotype Diversity Across 25 Genomic Regions in Three Eastern European Populations  

Microsoft Academic Search

Objective: Individual population history is the main reason for the variability of linkage disequilibrium (LD) patterns and haplotype frequencies among populations. Such diversity may influence the transferability of tag SNPs from one population to another. Our goal was to compare patterns of pairwise LD and allele and haplotype frequencies in Estonian and Russian populations, to estimate the genetic variation between

Andrey Khrunin; Evelin Mihailov; Tiit Nikopensius; Kaarel Krjutškov; Svetlana Limborska; Andres Metspalu



RNA-Seq Identifies SNP Markers for Growth Traits in Rainbow Trout  

PubMed Central

Fast growth is an important and highly desired trait, which affects the profitability of food animal production, with feed costs accounting for the largest proportion of production costs. Traditional phenotype-based selection is typically used to select for growth traits; however, genetic improvement is slow over generations. Single nucleotide polymorphisms (SNPs) explain 90% of the genetic differences between individuals; therefore, they are most suitable for genetic evaluation and strategies that employ molecular genetics for selective breeding. SNPs found within or near a coding sequence are of particular interest because they are more likely to alter the biological function of a protein. We aimed to use SNPs to identify markers and genes associated with genetic variation in growth. RNA-Seq whole-transcriptome analysis of pooled cDNA samples from a population of rainbow trout selected for improved growth versus unselected genetic cohorts (10 fish from 1 full-sib family each) identified SNP markers associated with growth rate. The allelic imbalances (the ratio between the allele frequencies of the fast growing sample and that of the slow growing sample) were considered at scores >5.0 as an amplification and <0.2 as loss of heterozygosity. A subset of SNPs (n = 54) were validated and evaluated for association with growth traits in 778 individuals of a three-generation parent/offspring panel representing 40 families. Twenty-two SNP markers and one mitochondrial haplotype were significantly associated with growth traits. Polymorphism of 48 of the markers was confirmed in other commercially important aquaculture stocks. Many markers were clustered into genes of metabolic energy production pathways and are suitable candidates for genetic selection. The study demonstrates that RNA-Seq at low sequence coverage of divergent populations is a fast and effective means of identifying SNPs, with allelic imbalances between phenotypes. 
This technique is suitable for marker development in non-model species lacking complete and well-annotated genome reference sequences.

Salem, Mohamed; Vallejo, Roger L.; Leeds, Timothy D.; Palti, Yniv; Liu, Sixin; Sabbagh, Annas; Rexroad, Caird E.; Yao, Jianbo
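
The allelic imbalance score described in the abstract above is a plain ratio of allele frequencies with two cut-offs, which can be sketched as below. The thresholds (>5.0 and <0.2) come from the abstract; the function names, the handling of a zero denominator, and the label for the intermediate range are assumptions made for illustration.

```python
def allelic_imbalance(freq_fast, freq_slow):
    """Ratio of an allele's frequency in the fast- vs slow-growing pool."""
    if freq_slow <= 0:
        # Assumption: an allele absent from the slow pool is not scored.
        raise ValueError("slow-pool frequency must be positive")
    return freq_fast / freq_slow


def classify(score, high=5.0, low=0.2):
    """Apply the two thresholds quoted in the abstract."""
    if score > high:
        return "amplification"
    if score < low:
        return "loss of heterozygosity"
    return "no imbalance call"  # hypothetical label for the middle range


# Toy pooled frequencies, invented for illustration only.
toy_score = allelic_imbalance(0.75, 0.125)
toy_call = classify(toy_score)
```

Because the score is a ratio, the two thresholds are reciprocal (5.0 and 1/5), so swapping the fast and slow pools swaps the two calls.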



Haplotypes of angiotensinogen in essential hypertension.  

PubMed Central

The M235T polymorphism of the angiotensinogen gene (AGT) has been associated with essential and pregnancy-induced hypertension. Generation of haplotypes can help to resolve whether the T235 allele itself predisposes to the development of hypertension or acts as a marker of an unknown causal molecular variant. We identified 10 diallelic polymorphisms at the AGT locus and genotyped both a series of 477 probands of hypertensive families and 364 controls, all French Caucasians, as well as a series of 92 hypertensives and 122 controls from Japan. Despite a large ethnic difference in gene frequency, a significant association of T235 with hypertension was observed both in Caucasians (.46 vs. .38, P = .004) and in Japanese (.91 vs. .76, P = .002). In both groups, the G-->A substitution located at position -6 upstream of the initial transcription site occurred at the same frequency and in complete linkage disequilibrium with the T235 allele. No other polymorphism was found to be consistently associated with hypertension. Five informative haplotypes subdividing the T235 allele were generated. Whereas two of them were associated with hypertension in Caucasians, none of these two haplotypes (H3 and H4) reached statistical significance in Japanese. The analysis of the AGT-GT repeat revealed marked linkage disequilibriums between each of the diallelic polymorphisms and some (GT)n alleles, with similar patterns in the two populations. The strong disequilibrium between M235 and (GT)16 explained the increased frequency of that particular allele in French controls compared with hypertensives (.42 vs. .36, P < .01). The haplotype combining the M235T and G-6A polymorphisms appears as the ancestral allele of the human AGT gene and as the one associated with hypertension.

Jeunemaitre, X; Inoue, I; Williams, C; Charru, A; Tichet, J; Powers, M; Sharma, A M; Gimenez-Roqueplo, A P; Hata, A; Corvol, P; Lalouel, J M



High-resolution haplotype block structure in the cattle genome  

PubMed Central

Background The Bovine HapMap Consortium has generated assay panels to genotype ~30,000 single nucleotide polymorphisms (SNPs) from 501 animals sampled from 19 worldwide taurine and indicine breeds, plus two outgroup species (Anoa and Water Buffalo). Within the larger set of SNPs we targeted 101 high density regions spanning up to 7.6 Mb with an average density of approximately one SNP per 4 kb, and characterized the linkage disequilibrium (LD) and haplotype block structure within individual breeds and groups of breeds in relation to their geographic origin and use. Results From the 101 targeted high-density regions on bovine chromosomes 6, 14, and 25, between 57 and 95% of the SNPs were informative in the individual breeds. The regions of high LD extend up to ~100 kb and the size of haplotype blocks ranges between 30 bases and 75 kb (10.3 kb average). On the scale from 1–100 kb the extent of LD and haplotype block structure in cattle has high similarity to humans. The estimation of effective population sizes over the previous 10,000 generations conforms to two main events in cattle history: the initiation of cattle domestication (~12,000 years ago), and the intensification of population isolation and current population bottleneck that breeds have experienced worldwide within the last ~700 years. Haplotype block density correlation, block boundary discordances, and haplotype sharing analyses were consistent in revealing unexpected similarities between some beef and dairy breeds, making them non-differentiable. Clustering techniques permitted grouping of breeds into different clades given their similarities and dissimilarities in genetic structure. Conclusion This work presents the first high-resolution analysis of haplotype block structure in worldwide cattle samples. Several novel results were obtained. First, cattle and human share a high similarity in LD and haplotype block structure on the scale of 1–100 kb. 
Second, unexpected similarities in haplotype block structure between dairy and beef breeds make them non-differentiable. Finally, our findings suggest that ~30,000 uniformly distributed SNPs would be necessary to construct a complete genome LD map in Bos taurus breeds, and ~580,000 SNPs would be necessary to characterize the haplotype block structure across the complete cattle genome.

Villa-Angulo, Rafael; Matukumalli, Lakshmi K; Gill, Clare A; Choi, Jungwoo; Van Tassell, Curtis P; Grefenstette, John J
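
Pairwise linkage disequilibrium of the kind summarized above is often reported as r², computed from phased haplotypes as D² / (p_A(1 − p_A) p_B(1 − p_B)) with D = p_AB − p_A p_B. The abstract does not say which LD statistic or block definition the authors used, so the following is only an assumed, minimal illustration with invented 0/1 haplotypes.

```python
def ld_r2(haplotypes, i, j):
    """r^2 between biallelic SNPs i and j from phased 0/1 haplotypes."""
    n = len(haplotypes)
    p_a = sum(h[i] for h in haplotypes) / n  # frequency of allele 1 at SNP i
    p_b = sum(h[j] for h in haplotypes) / n  # frequency of allele 1 at SNP j
    p_ab = sum(1 for h in haplotypes if h[i] == 1 and h[j] == 1) / n
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return None  # at least one SNP is monomorphic in this sample
    d = p_ab - p_a * p_b
    return d * d / denom


# Four toy haplotypes over three SNPs, invented for illustration only.
toy_haps = [(0, 0, 0), (0, 0, 1), (1, 1, 0), (1, 1, 1)]
r2_linked = ld_r2(toy_haps, 0, 1)
r2_unlinked = ld_r2(toy_haps, 0, 2)
```

Computing this statistic for every SNP pair within a region and plotting it against physical distance gives the kind of LD decay profile the abstract compares between cattle and humans.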



The Systemic Lupus Erythematosus IRF5 Risk Haplotype Is Associated with Systemic Sclerosis  

PubMed Central

Systemic sclerosis (SSc) is a fibrotic autoimmune disease in which the genetic component plays an important role. One of the strongest SSc association signals outside the human leukocyte antigen (HLA) region corresponds to interferon (IFN) regulatory factor 5 (IRF5), a major regulator of the type I IFN pathway. In this study we aimed to evaluate whether three different haplotypic blocks within this locus, which have been shown to alter the protein function influencing systemic lupus erythematosus (SLE) susceptibility, are involved in SSc susceptibility and clinical phenotypes. For that purpose, we genotyped one representative single-nucleotide polymorphism (SNP) of each block (rs10488631, rs2004640, and rs4728142) in a total of 3,361 SSc patients and 4,012 unaffected controls of Caucasian origin from Spain, Germany, The Netherlands, Italy and United Kingdom. A meta-analysis of the allele frequencies was performed to analyse the overall effect of these IRF5 genetic variants on SSc. Allelic combination and dependency tests were also carried out. The three SNPs showed strong associations with the global disease (rs4728142: P = 1.34×10⁻⁸, OR = 1.22, CI 95% = 1.14–1.30; rs2004640: P = 4.60×10⁻⁷, OR = 0.84, CI 95% = 0.78–0.90; rs10488631: P = 7.53×10⁻²⁰, OR = 1.63, CI 95% = 1.47–1.81). However, the association of rs2004640 with SSc was not independent of rs4728142 (conditioned P = 0.598). The haplotype containing the risk alleles (rs4728142*A-rs2004640*T-rs10488631*C: P = 9.04×10⁻²², OR = 1.75, CI 95% = 1.56–1.97) better explained the observed association (likelihood P-value = 1.48×10⁻⁴), suggesting an additive effect of the three haplotypic blocks. No statistical significance was observed in the comparisons amongst SSc patients with and without the main clinical characteristics. Our data clearly indicate that the SLE risk haplotype also influences SSc predisposition, and that this association is not sub-phenotype-specific.

Beretta, Lorenzo; Simeon, Carmen P.; Carreira, Patricia E.; Callejas, Jose Luis; Fernandez-Castro, Monica; Saez-Comet, Luis; Beltran, Emma; Camps, Maria Teresa; Egurbide, Maria Victoria; Airo, Paolo; Scorza, Raffaella; Lunardi, Claudio; Hunzelmann, Nicolas; Riemekasten, Gabriela; Witte, Torsten; Kreuter, Alexander; Distler, Jorg H. W.; Madhok, Rajan; Shiels, Paul; van Laar, Jacob M.; Fonseca, Carmen; Denton, Christopher; Herrick, Ariane; Worthington, Jane; Schuerwegh, Annemie J.; Vonk, Madelon C.; Voskuyl, Alexandre E.; Radstake, Timothy R. D. J.; Martin, Javier
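
Odds ratios with 95% confidence intervals, as reported in the abstract above, can be derived from a 2×2 table of allele counts. The sketch below uses the standard Woolf (log-based) interval for a single table; it does not reproduce the study's meta-analysis across cohorts, and the allele counts are invented, since the abstract gives none.

```python
import math


def odds_ratio_ci(case_risk, case_other, ctrl_risk, ctrl_other, z=1.96):
    """Allelic odds ratio with a Woolf (log-based) confidence interval.

    Arguments are allele counts: risk and non-risk alleles in cases,
    then risk and non-risk alleles in controls. z=1.96 gives ~95% CI.
    """
    if min(case_risk, case_other, ctrl_risk, ctrl_other) <= 0:
        raise ValueError("all four cell counts must be positive")
    odds_ratio = (case_risk * ctrl_other) / (case_other * ctrl_risk)
    # Standard error of log(OR) is the root of the summed reciprocals.
    se = math.sqrt(
        1 / case_risk + 1 / case_other + 1 / ctrl_risk + 1 / ctrl_other
    )
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


# Hypothetical allele counts, not taken from the study.
toy_or, toy_lower, toy_upper = odds_ratio_ci(30, 70, 20, 80)
```

An interval that excludes 1 corresponds to a nominally significant association at the chosen level, which is how the OR and CI pairs in the abstract can be read.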



A High Density SNP Array for the Domestic Horse and Extant Perissodactyla: Utility for Association Mapping, Genetic Diversity, and Phylogeny Studies  

PubMed Central

An equine SNP genotyping array was developed and evaluated on a panel of samples representing 14 domestic horse breeds and 18 evolutionarily related species. More than 54,000 polymorphic SNPs provided an average inter-SNP spacing of ~43 kb. The mean minor allele frequency across domestic horse breeds was 0.23, and the number of polymorphic SNPs within breeds ranged from 43,287 to 52,085. Genome-wide linkage disequilibrium (LD) in most breeds declined rapidly over the first 50–100 kb and reached background levels within 1–2 Mb. The extent of LD and the level of inbreeding were highest in the Thoroughbred and lowest in the Mongolian and Quarter Horse. Multidimensional scaling (MDS) analyses demonstrated the tight grouping of individuals within most breeds, close proximity of related breeds, and less tight grouping in admixed breeds. The close relationship between the Przewalski's Horse and the domestic horse was demonstrated by pair-wise genetic distance and MDS. Genotyping of other Perissodactyla (zebras, asses, tapirs, and rhinoceros) was variably successful, with call rates and the number of polymorphic loci varying across taxa. Parsimony analysis placed the modern horse as sister taxa to Equus przewalski. The utility of the SNP array in genome-wide association was confirmed by mapping the known recessive chestnut coat color locus (MC1R) and defining a conserved haplotype of ~750 kb across all breeds. These results demonstrate the high quality of this SNP genotyping resource, its usefulness in diverse genome analyses of the horse, and potential use in related species.

McCue, Molly E.; Bannasch, Danika L.; Petersen, Jessica L.; Gurr, Jessica; Bailey, Ernie; Binns, Matthew M.; Distl, Ottmar; Guerin, Gerard; Hasegawa, Telhisa; Hill, Emmeline W.; Leeb, Tosso; Lindgren, Gabriella; Penedo, M. Cecilia T.; Røed, Knut H.; Ryder, Oliver A.; Swinburne, June E.; Tozaki, Teruaki; Valberg, Stephanie J.; Vaudin, Mark; Lindblad-Toh, Kerstin



Dimensional anxiety mediates linkage of GABRA2 haplotypes with alcoholism.  


The GABAAalpha2 receptor gene (GABRA2) modulates anxiety and stress response. Three recent association studies implicate GABRA2 in alcoholism, however in these papers both common, opposite-configuration haplotypes in the region distal to intron3 predict risk. We have now replicated the GABRA2 association with alcoholism in 331 Plains Indian men and women and 461 Finnish Caucasian men. Using a dimensional measure of anxiety, harm avoidance (HA), we also found that the association with alcoholism is mediated, or moderated, by anxiety. Nine SNPs were genotyped revealing two haplotype blocks. Within the previously implicated block 2 region, we identified the two common, opposite-configuration risk haplotypes, A and B. Their frequencies differed markedly in Finns and Plains Indians. In both populations, most block 2 SNPs were significantly associated with alcoholism. The associations were due to increased frequencies of both homozygotes in alcoholics, indicating the possibility of alcoholic subtypes with opposite genotypes. Congruently, there was no significant haplotype association. Using HA as an indicator variable for anxiety, we found haplotype linkage to alcoholism with high and low dimensional anxiety, and to HA itself, in both populations. High HA alcoholics had the highest frequency of the more abundant haplotype (A in Finns, B in Plains Indians); low HA alcoholics had the highest frequency of the less abundant haplotype (B in Finns, A in Plains Indians) (Finns: P = 0.007, OR = 2.1, Plains Indians: P = 0.040, OR = 1.9). Non-alcoholics had intermediate frequencies. Our results suggest that within the distal GABRA2 region is a functional locus or loci that may differ between populations but that alters risk for alcoholism via the mediating action of anxiety. PMID:16874763

Enoch, Mary-Anne; Schwartz, Lori; Albaugh, Bernard; Virkkunen, Matti; Goldman, David



TGFBR1 haplotypes and risk of non-small cell lung cancer  

PubMed Central

Transforming growth factor beta (TGF-β) receptors are centrally involved in TGF-β-mediated cell growth and differentiation and are frequently inactivated in non-small cell lung cancer (NSCLC). Constitutively decreased type I TGF-β receptor (TGFBR1) expression is emerging as a novel tumor-predisposing phenotype. The association of TGFBR1 haplotypes with risk for NSCLC has not yet been studied. We tested the hypothesis that single nucleotide polymorphisms (SNPs) and/or TGFBR1 haplotypes are associated with risk of NSCLC. We genotyped six TGFBR1 haplotype tagging SNPs (htSNPs) by PCR-restriction fragment length polymorphism (PCR-RFLP) assays and one htSNP by PCR-single strand conformation polymorphism (PCR-SSCP) assay in two case-control studies. Case-control study 1 included 102 NSCLC patients and 104 healthy controls from Suzhou. Case-control study 2 included 131 patients with NSCLC and 133 healthy controls from Wuxi. Individuals included in both case-control studies were Han Chinese. Haplotypes were reconstructed according to the genotyping data and linkage disequilibrium (LD) status of these seven htSNPs. None of the htSNPs was associated with NSCLC risk in either study. However, a four-marker haplotype CTGC was significantly more common among controls than among cases in both studies (P=0.014 and P=0.010, respectively) indicating that this haplotype is associated with decreased NSCLC risk (adjusted OR, 0.09; 95% CI, 0.01-0.61 and adjusted OR, 0.11; 95% CI, 0.02-0.59, respectively). Combined analysis of both studies shows a strong association of this four-marker haplotype with decreased NSCLC risk (adjusted OR, 0.11; 95% CI, 0.03-0.39). This is the first evidence of an association between a TGFBR1 haplotype and risk for NSCLC.

Lei, Zhe; Liu, Reng Yun; Zhao, Jun; Liu, Zeyi; Jiang, Xiefang; You, Weiming; Chen, Xiao Feng; Liu, Xia; Zhang, Kui; Pasche, Boris; Zhang, Hong Tao



Haplotypes that include the integrin alpha 11 gene are associated with tick burden in cattle  

PubMed Central

Background Infestations on cattle by the ectoparasite Boophilus (Rhipicephalus) microplus (cattle tick) impact negatively on animal production systems. Host resistance to tick infestation has a low to moderate heritability in the range 0.13 - 0.64 in Australia. Previous studies identified a QTL on bovine chromosome 10 (BTA10) linked to tick burden in cattle. Results To confirm these associations, we collected genotypes of 17 SNP from BTA10, including three obtained by sequencing part of the ITGA11 (Integrin alpha 11) gene. Initially, we genotyped 1,055 dairy cattle for the 17 SNP, and then genotyped 557 Brahman and 216 Tropical Composite beef cattle for 11 of the 17 SNP. In total, 7 of the SNP were significantly (P < 0.05) associated with tick burden tested in any of the samples. One SNP, ss161109814, was significantly (P < 0.05) associated with tick burden in both the taurine and the Brahman sample, but the favourable allele was different. Haplotypes for three and for 10 SNP were more significantly (P < 0.001) associated with tick burden than SNP analysed individually. Some of the common haplotypes with the largest sample sizes explained between 1.3% and 1.5% of the residual variance in tick burden. Conclusions These analyses confirm the location of a QTL affecting tick burden on BTA10 and position it close to the ITGA11 gene. The presence of a significant association in such widely divergent animals suggests that further SNP discovery in this region to detect causal mutations would be warranted.



Haplotype Analysis Improved Evidence for Candidate Genes for Intramuscular Fat Percentage from a Genome Wide Association Study of Cattle  

PubMed Central

In genome wide association studies (GWAS), haplotype analyses of SNP data are neglected in favour of single point analysis of associations. In a recent GWAS, we found that none of the known candidate genes for intramuscular fat (IMF) had been identified. In this study, data from the GWAS for these candidate genes were re-analysed as haplotypes. First, we confirmed that the methodology would find evidence for association between haplotypes in candidate genes of the calpain-calpastatin complex and musculus longissimus lumborum peak force (LLPF), because these genes had been confirmed through single point analysis in the GWAS. Then, for intramuscular fat percent (IMF), we found significant partial haplotype substitution effects for the genes ADIPOQ and CXCR4, as well as suggestive associations to the genes CEBPA, FASN, and CAPN1. Haplotypes for these genes explained 80% more of the phenotypic variance compared to the best single SNP. For some genes, the analyses suggested that there was more than one causative mutation, or confirmed that some causative mutations are limited to particular subgroups of a species. Fitting the SNPs and their interactions simultaneously explained a similar amount of the phenotypic variance compared to haplotype analyses. Haplotype analysis is a neglected part of the suite of tools used to analyse GWAS data, would be a useful method to extract more information from these data sets, and may contribute to reducing the missing heritability problem.

Barendse, William



A genome-wide scan for breast cancer risk haplotypes among African American women.  


Genome-wide association studies (GWAS) simultaneously investigating hundreds of thousands of single nucleotide polymorphisms (SNP) have become a powerful tool in the investigation of new disease susceptibility loci. Haplotypes are sometimes thought to be superior to SNPs and are promising in genetic association analyses. The application of genome-wide haplotype analysis, however, is hindered by the complexity of haplotypes themselves and sophistication in computation. We systematically analyzed the haplotype effects for breast cancer risk among 5,761 African American women (3,016 cases and 2,745 controls) using a sliding window approach on the genome-wide scale. Three regions on chromosomes 1, 4 and 18 exhibited moderate haplotype effects. Furthermore, among 21 breast cancer susceptibility loci previously established in European populations, 10p15 and 14q24 are likely to harbor novel haplotype effects. We also proposed a heuristic of determining the significance level and the effective number of independent tests by the permutation analysis on chromosome 22 data. It suggests that the effective number was approximately half of the total (7,794 out of 15,645), thus the half number could serve as a quick reference to evaluating genome-wide significance if a similar sliding window approach of haplotype analysis is adopted in similar populations using similar genotype density. PMID:23468962

Song, Chi; Chen, Gary K; Millikan, Robert C; Ambrosone, Christine B; John, Esther M; Bernstein, Leslie; Zheng, Wei; Hu, Jennifer J; Ziegler, Regina G; Nyante, Sarah; Bandera, Elisa V; Ingles, Sue A; Press, Michael F; Deming, Sandra L; Rodriguez-Gil, Jorge L; Chanock, Stephen J; Wan, Peggy; Sheng, Xin; Pooler, Loreall C; Van Den Berg, David J; Le Marchand, Loic; Kolonel, Laurence N; Henderson, Brian E; Haiman, Chris A; Stram, Daniel O
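The sliding window scan and the "half of the total" heuristic described in this abstract can be sketched as follows. This is a minimal illustration in Python, not the authors' pipeline: the function names, the encoding of phased haplotypes as equal-length strings, and the default effective_fraction of 0.5 are assumptions made for the example.

```python
from collections import Counter


def sliding_window_haplotypes(haplotypes, window, step=1):
    """Yield (start, Counter) pairs, one per window of consecutive SNPs.

    `haplotypes` is a list of equal-length strings, one per phased
    chromosome (hypothetical encoding, e.g. "0" = reference, "1" = alternate).
    """
    n_snps = len(haplotypes[0])
    for start in range(0, n_snps - window + 1, step):
        yield start, Counter(h[start:start + window] for h in haplotypes)


def genome_wide_threshold(alpha, n_windows, effective_fraction=0.5):
    """Bonferroni-style threshold using an assumed effective number of tests.

    The abstract reports an effective number close to half the total number
    of windows; the 0.5 default only mirrors that rule of thumb.
    """
    effective_tests = max(1, round(n_windows * effective_fraction))
    return alpha / effective_tests


if __name__ == "__main__":
    phased = ["0101", "0111", "0101"]
    for start, counts in sliding_window_haplotypes(phased, window=2):
        print(start, dict(counts))
    print(genome_wide_threshold(0.05, 100))
```

In a real scan each window's haplotype counts would be compared between cases and controls; the sketch stops at the counting step because the abstract does not specify the association test.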



HLA haplotypes associated with hemochromatosis mutations in the Spanish population  

PubMed Central

Background The present study is an analysis of the frequencies of HLA-A and -B antigens and HLA haplotypes in two groups of individuals homozygous for the two main HFE mutations (C282Y and H63D) and a group heterozygous for the S65C mutation. Methods The study population includes: 1123 healthy individuals, 100 homozygous for the C282Y mutation, 138 homozygous for the H63D mutation and 17 heterozygous for the S65C mutation. HFE and HLA alleles were detected using DNA-based and microlymphocytotoxicity techniques respectively. Results An expected significant association between C282Y and the HLA-A3/B7 haplotype was found, but other HLA haplotypes carrying the -A3 antigen were found: HLA-A3/B62 and HLA-A3/B44. Also, a significant association between H63D mutation and HLA-A29/B44 haplotype was found, and again other HLA haplotypes carrying the HLA-A29 antigen were also found: HLA-A29/B14 and HLA-A29/B62. In addition, the S65C mutation seems to be associated with a HLA haplotype carrying the HLA-A26 antigen. Conclusion These findings clearly suggest that HLA-A3/B7 and HLA-A29/B44 are the ancestral haplotypes from which the C282Y and H63D mutations originated, respectively. The frequencies of these mutations in different populations, their geographical distribution, and the degree of the statistical association to the ancestral haplotypes, suggest that the H63D mutation must have occurred earlier than the C282Y mutation.

Pacho, Arantza; Mancebo, Esther; del Rey, Manuel J; Castro, Maria J; Oliver, Desamparados; Garcia-Berciano, Miguel; Gonzalez, Luis; Morales, Pablo



Sequential sentinel SNP Regional Association Plots (SSS-RAP): an approach for testing independence of SNP association signals using meta-analysis data.  


Genome-Wide Association Studies (GWAS) frequently incorporate meta-analysis within their framework. However, conditional analysis of individual-level data, which is an established approach for fine mapping of causal sites, is often precluded where only group-level summary data are available for analysis. Here, we present a numerical and graphical approach, "sequential sentinel SNP regional association plot" (SSS-RAP), which estimates regression coefficients (beta) with their standard errors using the meta-analysis summary results directly. Under an additive model, typical for genes with small effect, the effect for a sentinel SNP can be transformed to the predicted effect for a possibly dependent SNP through a 2×2 2-SNP haplotypes table. The approach assumes Hardy-Weinberg equilibrium for test SNPs. SSS-RAP is available as a Web-tool. To develop and illustrate SSS-RAP, we analyzed lipid and ECG traits data from the British Women's Heart and Health Study (BWHHS), evaluated a meta-analysis for an ECG trait, and presented several simulations. We compared results with existing approaches such as model selection methods and conditional analysis. Generally, findings were consistent. SSS-RAP represents a tool for testing independence of SNP association signals using meta-analysis data, and is also a convenient approach based on biological principles for fine mapping in group level summary data. PMID:23278391

Zheng, Jie; Gaunt, Tom R; Day, Ian N M
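The 2×2 two-SNP haplotype table mentioned in this abstract is the same table from which the standard linkage disequilibrium measures D and r² are computed. The sketch below shows only those textbook quantities; it is not the SSS-RAP transformation itself, which the abstract does not spell out, and the function name and argument order are assumptions made for the example.

```python
def ld_from_haplotype_table(f11, f10, f01, f00):
    """Return (D, r_squared) from the four two-SNP haplotype frequencies.

    f11 is the frequency of the haplotype carrying allele 1 at both SNPs,
    f10 carries allele 1 at SNP 1 and allele 0 at SNP 2, and so on.
    """
    total = f11 + f10 + f01 + f00
    if abs(total - 1.0) > 1e-9:
        raise ValueError("haplotype frequencies must sum to 1")
    p1 = f11 + f10          # frequency of allele 1 at SNP 1
    q1 = f11 + f01          # frequency of allele 1 at SNP 2
    denom = p1 * (1.0 - p1) * q1 * (1.0 - q1)
    if denom == 0.0:
        raise ValueError("both SNPs must be polymorphic")
    d = f11 - p1 * q1       # deviation from independence
    return d, d * d / denom


if __name__ == "__main__":
    print(ld_from_haplotype_table(0.5, 0.0, 0.0, 0.5))      # complete LD
    print(ld_from_haplotype_table(0.25, 0.25, 0.25, 0.25))  # no LD
```

When r² is close to 1 the two SNPs carry nearly the same information, which is the situation in which a second association signal is likely to be a shadow of the sentinel SNP.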



A novel approach for haplotype-based association analysis using family data  

PubMed Central

Background Haplotype-based approaches have been extensively studied for case-control association mapping in recent years. It has been shown that haplotype methods can provide more consistent results comparing to single-locus based approaches, especially in cases where causal variants are not typed. Improved power has been observed by clustering similar or rare haplotypes into groups to reduce the degrees of freedom of association tests. For family-based association studies, one commonly used strategy is Transmission Disequilibrium Tests (TDT), which examine the imbalanced transmission of alleles/haplotypes to affected and normal children. Many extensions have been developed to deal with general pedigrees and continuous traits. Results In this paper, we propose a new haplotype-based association method for family data that is different from the TDT framework. Our approach (termed F_HapMiner) is based on our previous successful experiences on haplotype inference from pedigree data and haplotype-based association mapping. It first infers diplotype pairs of each individual in each pedigree assuming no recombination within a family. A phenotype score is then defined for each founder haplotype. Finally, F_HapMiner applies a clustering algorithm on those founder haplotypes based on their similarities and identifies haplotype clusters that show significant associations with diseases/traits. We have performed extensive simulations based on realistic assumptions to evaluate the effectiveness of the proposed approach by considering different factors such as allele frequency, linkage disequilibrium (LD) structure, disease model and sample size. Comparisons with single-locus and haplotype-based TDT methods demonstrate that our approach consistently outperforms the TDT-based approaches regardless of disease models, local LD structures or allele/haplotype frequencies. Conclusion We present a novel haplotype-based association approach using family data. 
Experimental results demonstrate that it achieves significantly higher power than TDT-based approaches.
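For readers unfamiliar with the baseline this record compares against, the classical single-marker Transmission Disequilibrium Test can be sketched in a few lines. This is the standard McNemar-type statistic, not F_HapMiner; the function name and the example counts are illustrative assumptions.

```python
def tdt_statistic(transmitted, untransmitted):
    """Classical TDT chi-square statistic (1 degree of freedom).

    `transmitted` counts heterozygous parents who passed the candidate
    allele to an affected child; `untransmitted` counts those who passed
    the other allele.
    """
    informative = transmitted + untransmitted
    if informative == 0:
        raise ValueError("no informative (heterozygous) parents")
    return (transmitted - untransmitted) ** 2 / informative


if __name__ == "__main__":
    # Hypothetical counts: 30 transmissions versus 10 non-transmissions.
    print(tdt_statistic(30, 10))
```

Under the null hypothesis of no linkage or association the statistic follows approximately a chi-square distribution with one degree of freedom, so values above about 3.84 correspond to P < 0.05.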



Genome-wide haplotype changes produced by artificial selection during modern rice breeding in Japan.  


During the last 90 years, the breeding of rice has delivered cultivars with improved agronomic and economic characteristics. Crossing of different lines and successive artificial selection of progeny based on their phenotypes have changed the chromosomal constitution of the ancestors of modern rice; however, the nature of these changes is unclear. The recent accumulation of data for genome-wide single-nucleotide polymorphisms (SNPs) in rice has allowed us to investigate the change in haplotype structure and composition. To assess the impact of these changes during modern breeding, we studied 177 Japanese rice accessions, which were categorized into three groups: landraces, improved cultivars developed from 1931 to 1974 (the early breeding phase), and improved cultivars developed from 1975 to 2005 (the late breeding phase). Phylogenetic tree and structure analysis indicated genetic differentiation between non-irrigated (upland) and irrigated (lowland) rice groups as well as genetic structuring within the irrigated rice group that corresponded to the existence of three subgroups. Pedigree analysis revealed that a limited number of landraces and cultivars was used for breeding at the beginning of the period of systematic breeding and that 11 landraces accounted for 70% of the ancestors of the modern improved cultivars. The values for linkage disequilibrium estimated from SNP alleles and the haplotype diversity determined from consecutive alleles in five-SNP windows indicated that haplotype blocks became less diverse over time as a result of the breeding process. A decrease in haplotype diversity, caused by a reduced number of polymorphisms in the haplotype blocks, was observed in several chromosomal regions. However, our results also indicate that new haplotype polymorphisms have been generated across the genome during the breeding process. 
These findings will facilitate our understanding of the association between particular haplotypes and desirable phenotypes in modern Japanese rice cultivars. PMID:22427922

Yonemaru, Jun-ichi; Yamamoto, Toshio; Ebana, Kaworu; Yamamoto, Eiji; Nagasaki, Hideki; Shibaya, Taeko; Yano, Masahiro
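The haplotype diversity "determined from consecutive alleles in five-SNP windows" can be illustrated with a short sketch. The abstract does not say which estimator was used, so the code assumes Nei's unbiased haplotype diversity; the function names and the string encoding of accessions are likewise assumptions.

```python
from collections import Counter


def haplotype_diversity(haplotypes):
    """Nei's haplotype diversity: n / (n - 1) * (1 - sum of squared frequencies)."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two haplotypes")
    counts = Counter(haplotypes)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


def windowed_diversity(accessions, window=5):
    """Diversity in every window of `window` consecutive SNPs.

    `accessions` is a list of equal-length allele strings, one per
    (inbred, hence effectively haploid) rice accession.
    """
    n_snps = len(accessions[0])
    return [
        haplotype_diversity([a[start:start + window] for a in accessions])
        for start in range(n_snps - window + 1)
    ]


if __name__ == "__main__":
    group = ["AAAAAA", "AAAAAA", "AAAAAA", "AAAAAC"]
    print(windowed_diversity(group))
```

Computing the same profile separately for landraces, early-phase cultivars, and late-phase cultivars would show where along a chromosome diversity fell during breeding.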



Haplotype-based quantitative trait mapping using a clustering algorithm  

PubMed Central

Background With the availability of large-scale, high-density single-nucleotide polymorphism (SNP) markers, substantial effort has been made in identifying disease-causing genes using linkage disequilibrium (LD) mapping by haplotype analysis of unrelated individuals. In addition to complex diseases, many continuously distributed quantitative traits are of primary clinical and health significance. However, the development of association mapping methods using unrelated individuals for quantitative traits has received relatively less attention. Results We recently developed an association mapping method for complex diseases by mining the sharing of haplotype segments (i.e., phased genotype pairs) in affected individuals that are rarely present in normal individuals. In this paper, we extend our previous work to address the problem of quantitative trait mapping from unrelated individuals. The method is non-parametric in nature, and statistical significance can be obtained by a permutation test. It can also be incorporated into the one-way ANCOVA (analysis of covariance) framework so that other factors and covariates can be easily incorporated. The effectiveness of the approach is demonstrated by extensive experimental studies using both simulated and real data sets. The results show that our haplotype-based approach is more robust than two statistical methods based on single markers: a single SNP association test (SSA) and the Mann-Whitney U-test (MWU). The algorithm has been incorporated into our existing software package called HapMiner, which is available from our website. Conclusion For QTL (quantitative trait loci) fine mapping, to identify QTNs (quantitative trait nucleotides) with realistic effects (the contribution of each QTN less than 10% of total variance of the trait), large sample sizes (≥ 500) are needed for all the methods. The overall performance of HapMiner is better than that of the other two methods. 
Its effectiveness further depends on other factors such as recombination rates and the density of typed SNPs. Haplotype-based methods might provide higher power than methods based on a single SNP when using tag SNPs selected from a small number of samples or some other sources (such as HapMap data). Rank-based statistics usually have much lower power, as shown in our study.

Li, Jing; Zhou, Yingyao; Elston, Robert C
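The permutation test mentioned in this abstract can be illustrated with a generic sketch. It assumes a very simple test statistic (the absolute difference in mean trait value between carriers and non-carriers of a haplotype cluster), which is not necessarily the statistic HapMiner uses; the function name, arguments, and example data are hypothetical.

```python
import random


def permutation_p_value(trait, carrier, n_perm=1000, seed=0):
    """Permutation P value for a difference in mean trait value.

    `trait` holds one quantitative value per individual and `carrier`
    holds booleans marking who carries the haplotype cluster of interest.
    Both groups must be non-empty.
    """
    rng = random.Random(seed)

    def mean_difference(labels):
        carriers = [t for t, c in zip(trait, labels) if c]
        others = [t for t, c in zip(trait, labels) if not c]
        return abs(sum(carriers) / len(carriers) - sum(others) / len(others))

    observed = mean_difference(carrier)
    labels = list(carrier)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labels)  # break any trait-haplotype relationship
        if mean_difference(labels) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    trait = [1.0, 1.1, 0.9, 1.2, 1.0, 5.0, 5.1, 4.9, 5.2, 5.0]
    carrier = [False] * 5 + [True] * 5
    print(permutation_p_value(trait, carrier))
```

Because shuffling only reassigns the carrier labels, the group sizes stay fixed and no distributional assumption about the trait is needed, which is what makes the approach non-parametric.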




Technology Transfer Automated Retrieval System (TEKTRAN)

Cow genome sequencing is underway at Baylor College of Medicine (BCM) sequencing center and will be completed in the next few months. The bovine genome sequencing white paper indicated a goal to identify 100,000 SNP for use in identification and mapping of quantitative trait loci (QTL) regions. The ...


Haplotype allelic classes for detecting ongoing positive selection  

PubMed Central

Background Natural selection eliminates detrimental and favors advantageous phenotypes. This process leaves characteristic signatures in underlying genomic segments that can be recognized through deviations in allelic or haplotypic frequency spectra. To provide an identifiable signature of recent positive selection that can be detected by comparison with the background distribution, we introduced a new way of looking at genomic polymorphisms: haplotype allelic classes. Results The model combines segregating sites and haplotypic information in order to reveal useful data characteristics. We developed a summary statistic, Svd, to compare the distribution of the haplotypes carrying the selected allele with the distribution of the remaining ones. Coalescence simulations are used to study the distributions under standard population models assuming neutrality, demographic scenarios and selection models. To test, in practice, haplotype allelic class performance and the derived statistic in capturing deviation from neutrality due to positive selection, we analyzed haplotypic variation in detail in the locus of lactase persistence in the three HapMap Phase II populations. Conclusions We showed that the Svd statistic is less sensitive than other tests to confounding factors such as demography or recombination. Our approach succeeds in identifying candidate loci, such as the lactase-persistence locus, as targets of strong positive selection and provides a new tool complementary to other tests to study natural selection in genomic data.



Gene Mapping by Haplotype Pattern Mining  

Microsoft Academic Search

We describe a new method for linkage disequilibrium mapping, Haplotype Pattern Mining (HPM). The method is based on discovering recurrent patterns, inspired by data mining methods. We define a class of useful haplotype patterns in genetic case-control data, and give an algorithm for finding disease-associated haplotypes. The haplotypes are ordered by their strength of association to the phenotype,

Hannu Toivonen; Paivi Onkamo; Kari Vasko; Vesa Ollikainen; Petteri Sevon; Heikki Mannila; Juha Kere



Low Enzymatic Activity Haplotypes of the Human Catechol-O-Methyltransferase Gene: Enrichment for Marker SNPs  

PubMed Central

Catechol-O-methyltransferase (COMT) is an enzyme that plays a key role in the modulation of catechol-dependent functions such as cognition, cardiovascular function, and pain processing. Three common haplotypes of the human COMT gene, divergent in two synonymous and one nonsynonymous (val158met) position, designated as low (LPS), average (APS), and high pain sensitive (HPS), are associated with experimental pain sensitivity and risk of developing chronic musculoskeletal pain conditions. APS and HPS haplotypes produce significant functional effects, coding for 3- and 20-fold reductions in COMT enzymatic activity, respectively. In the present study, we investigated whether additional minor single nucleotide polymorphisms (SNPs), accruing in 1 to 5% of the population, situated in the COMT transcript region contribute to haplotype-dependent enzymatic activity. Computer analysis of COMT ESTs showed that one synonymous minor SNP (rs769224) is linked to the APS haplotype and three minor SNPs (two synonymous: rs6267, rs740602 and one nonsynonymous: rs8192488) are linked to the HPS haplotype. Results from in silico and in vitro experiments revealed that inclusion of allelic variants of these minor SNPs in APS or HPS haplotypes did not modify COMT function at the level of mRNA folding, RNA transcription, protein translation, or enzymatic activity. These data suggest that neutral variants are carried with APS and HPS haplotypes, while the high activity LPS haplotype displays less linked variation. Thus, both minor synonymous and nonsynonymous SNPs in the coding region are markers of functional APS and HPS haplotypes rather than independent contributors to COMT activity.

Nackley, Andrea G.; Shabalina, Svetlana A.; Lambert, Jason E.; Conrad, Mathew S.; Gibson, Dustin G.; Spiridonov, Alexey N.; Satterfield, Sarah K.; Diatchenko, Luda



Sickle Cell Disease in a Brazilian Population from Sao Paulo: A Study of the βS Haplotypes  

Microsoft Academic Search

In this study we have determined the frequency of βS haplotypes in a Brazilian sickle cell disease population from Sao Paulo, Brazil, by analyzing sequence variations in the immediate 5’ flanking and second intervening sequence (IVSII) regions of the β globin genes. This association between sequence differences and βS haplotype backgrounds was determined by screening genomic DNA samples using dot

M. S. Goncalves; J. F. Nechtman; M. S. Figueiredo; J. Kerbauy; V. R. Armda; M. F. Sonati; S. O. T. Saad; F. F. Costa; T. A. Stoming



Y-chromosomal DNA haplotype differences in control and infertile Italian subpopulations  

Microsoft Academic Search

Y-chromosomal DNA haplotypes were determined in 74 infertile and 216 control Italian males using eight biallelic markers. A significant difference in haplotype frequency was found, but could be explained by the geographical origins of the samples. The Y chromosome is thus a sensitive marker for population substructuring and may be useful for determining whether two population samples come from a

Carlo Previderé; Liborio Stuppia; Valentina Gatta; Paolo Fattorini; Giandomenico Palka; Chris Tyler-Smith



The Evolution of Lethals in the t-Haplotype System of the Mouse  

Microsoft Academic Search

The evolution of lethal haplotypes in the t-haplotype segregation distortion system of Mus is examined by mathematical and computer models. The models assume that there is reproductive compensation for the loss of lethal embryos, such that the net reproductive success of a female is not reduced in proportion to the frequency of lethal offspring which she produces. The initial population

Brian Charlesworth



Characterization of Killer cell immunoglobulin-like receptor (KIR) genotypes and haplotypes in Chinese Han population.  


We performed Killer cell immunoglobulin-like receptor (KIR) genotyping on 1271 individuals of Chinese Han origin including 102 families and 965 unrelated individuals. The families (with one child and both parents) were subjected to haplotype analysis. Forty-one different genotypes were identified. The frequencies of the KIR genotypes found in the family panel were confirmed by those found in the unrelated panel. The family study showed segregation of one A haplotype and at least 15 unique B haplotypes. The most commonly observed haplotypes in group B were B1, B2, and B3, present at a frequency of 10.05%, 6.62%, and 4.90%, respectively. On the basis of the combination of KIR genes, six centromeric and seven telomeric gene motifs have been identified. Motif cB02 was the most frequent haplotype B specific centromeric segment while tB01 was the most frequent haplotype B specific telomeric segment. The distinct distribution of KIR haplotypes in each population may reflect the history of directional and balancing selection of different races. The gene combinations of group A and B1/B2/B3 haplotype were the most frequent genotypes named as Bx1, Bx2, and Bx3, present at a frequency of 13.72%, 7.35%, and 4.41% in the family panel, and at a frequency of 15.86%, 10.15%, and 5.80% in the unrelated panel, respectively. Overall, this study showed the diversity of KIR haplotypes and genotypes in the Chinese Han population and developed a criterion for distinguishing KIR haplotypes/genotypes for the population. KIR genotyping and haplotype analysis should be useful for selection of the most optimum donor grafts with favorable KIR gene content for transplants. PMID:24131019

Bao, X; Wang, M; Zhou, H; Wu, X; Yang, L; Xu, C; Yuan, X; Zhang, J; Li, L; Wu, D; He, J



Dutch myotonic dystrophy type 2 patients and a North-African DM2 family carry the common European founder haplotype  

PubMed Central

Myotonic dystrophy type 2 (DM2) is a progressive multisystem disease with muscle weakness and myotonia as main characteristics. The disease is caused by a repeat expansion in the zinc-finger protein 9 (ZNF9) gene on chromosome 3q21. Several reports show that patients of European ancestry share an identical haplotype surrounding the ZNF9 gene. In this study, we investigated whether the Dutch DM2 population carries the same founder haplotype. In all, 40 Dutch DM2 patients from 16 families were genotyped for eight short tandem repeat markers surrounding the ZNF9 gene. In addition, the single-nucleotide polymorphism (SNP) rs1871922 located in the first intron of DM2 was genotyped. Results were compared with previously published haplotypes from unrelated Caucasian patients. The repeat lengths identified in this study were in agreement with existing literature. In 36 patients of our population, we identified three common haplotypes. One patient showed overlap with the common haplotype for only one marker closest to the ZNF9 gene. The haplotype from a family originating from Morocco showed overlap with that of the patients of European descent for a region of 222 kb. All patients carried at least one C allele of SNP rs1871922, indicating that all patients carry the European founder haplotype. We conclude that DM2 patients from the Netherlands, including a North-African family, harbor a common haplotype surrounding the ZNF9 gene. These data show that the Dutch patients carry the common founder haplotype and strongly suggest that DM2 mutations in Europe and North Africa originate from a single ancestral founder.

Coenen, Marieke J H; Tieleman, Alide A; Schijvenaars, Mascha M V A P; Leferink, Maike; Ranum, Laura P W; Scheffer, Hans; van Engelen, Baziel G M



Common PCSK1 haplotypes are associated with obesity in the Chinese population.  


Prohormone convertase subtilisin/kexin type 1 (PCSK1) genetic polymorphisms have recently been associated with obesity in European populations. This study aimed to examine whether common PCSK1 genetic variation is associated with obesity and related metabolic phenotypes in the Chinese population. We genotyped nine common tag single-nucleotide polymorphisms (tagSNP) of the PCSK1 gene in 1,094 subjects of Chinese origin from the Stanford Asia-Pacific Program for Hypertension and Insulin Resistance (SAPPHIRe) family study. One SNP in the PCSK1 gene (rs155971) was nominally associated with risk of obesity in the SAPPHIRe cohort (P = 0.01). A common protective haplotype was associated with reduced risk of obesity (23.79% vs. 32.89%, P = 0.01) and smaller waist circumference (81.71 +/- 10.22 vs. 84.75 +/- 10.48 cm, P = 0.02). Another common haplotype was significantly associated with increased risk of obesity (37.07% vs. 23.84%, P = 0.005). The global P value for haplotype association with obesity was 0.02. We also identified a suggestive association of another PCSK1 SNP (rs3811951) with fasting glucose, fasting insulin, homeostasis model assessment of insulin resistance (HOMA(IR)), triglycerides, and high-density lipoprotein cholesterol (P = 0.05, 0.003, 0.001, 0.04, and 0.04, respectively). These data indicate that common PCSK1 genetic variants are associated with obesity in the Chinese population. PMID:19875984

Chang, Yi-Cheng; Chiu, Yen-Feng; Shih, Kuang-Chung; Lin, Ming-Wei; Sheu, Wayne Huey-Herng; Donlon, Timothy; Curb, Jess David; Jou, Yuh-Shan; Chang, Tien-Jyun; Li, Hung-Yuan; Chuang, Lee-Ming



Hap-seq: an optimal algorithm for haplotype phasing with imputation using sequencing data.  


Inference of haplotypes, or the sequence of alleles along each chromosome, is a fundamental problem in genetics and is important for many analyses, including admixture mapping, identifying regions of identity by descent, and imputation. Traditionally, haplotypes are inferred from genotype data obtained from microarrays using information on population haplotype frequencies inferred from either a large sample of genotyped individuals or a reference dataset such as the HapMap. Since the availability of large reference datasets, modern approaches for haplotype phasing along these lines are closely related to imputation methods. When applied to data obtained from sequencing studies, a straightforward way to obtain haplotypes is to first infer genotypes from the sequence data and then apply an imputation method. However, this approach does not take into account that alleles on the same sequence read originate from the same chromosome. Haplotype assembly approaches take advantage of this insight and predict haplotypes by assigning the reads to chromosomes in such a way that minimizes the number of conflicts between the reads and the predicted haplotypes. Unfortunately, assembly approaches require very high sequencing coverage and are usually not able to fully reconstruct the haplotypes. In this work, we present a novel approach, Hap-seq, which is simultaneously an imputation and assembly method that combines information from a reference dataset with the information from the reads using a likelihood framework. Our method applies a dynamic programming algorithm to identify the predicted haplotype, which maximizes the joint likelihood of the haplotype with respect to the reference dataset and the haplotype with respect to the observed reads. We show that our method requires only low sequencing coverage and can reconstruct haplotypes containing both common and rare alleles with higher accuracy compared to the state-of-the-art imputation methods. PMID:23383995

He, Dan; Han, Buhm; Eskin, Eleazar
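The read-conflict objective that haplotype assembly methods minimize, as described in this abstract, can be sketched as follows. The code only scores a given pair of candidate haplotypes against a set of reads; it does not implement the Hap-seq dynamic program or its reference-panel likelihood. The function names and the encoding of a read as a mapping from SNP index to observed allele are assumptions made for the example.

```python
def conflicts(read, haplotype):
    """Number of positions where a read disagrees with a haplotype.

    `read` maps SNP index -> observed allele; `haplotype` is a string of
    alleles indexed by SNP position.
    """
    return sum(1 for pos, allele in read.items() if haplotype[pos] != allele)


def assembly_cost(reads, hap_a, hap_b):
    """Total conflicts when each read is assigned to its best-matching haplotype."""
    return sum(min(conflicts(r, hap_a), conflicts(r, hap_b)) for r in reads)


if __name__ == "__main__":
    reads = [
        {0: "0", 1: "1"},  # consistent with hap_a
        {2: "1", 3: "0"},  # consistent with hap_b
        {1: "1", 2: "1"},  # one conflict with either haplotype
    ]
    print(assembly_cost(reads, "0101", "1010"))
```

An assembly method searches over haplotype pairs for the one with the lowest such cost. With low coverage many SNP pairs are never covered by a single read, which is why the abstract combines this evidence with a reference dataset.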



Fetal Haemoglobin and β-globin Gene Cluster Haplotypes among Sickle Cell Patients in Chhattisgarh  

PubMed Central

Background: Foetal Haemoglobin (HbF) is the best-known genetic modulator of sickle cell anaemia, which varies dramatically in concentration in the blood of these patients. The patients with SCA display a remarkable variability in the disease severity. High HbF levels and the β-globin gene cluster haplotypes influence the clinical presentation of sickle cell disease. To identify the genetic modifiers which influence the disease severity, we conducted a β-globin haplotype analysis in the sickle cell disease patients of Chhattisgarh. Aim: The foetal haemoglobin and the β-globin gene haplotypes of the sickle cell trait and the sickle cell disease patients from Chhattisgarh were investigated. Materials and Method: A total of 100 sickle cell patients (SS), 50 sickle cell trait patients (AS) and 50 healthy control individuals were included in the present study. The distribution of the β-globin gene haplotype was done by the PCR-RFLP method. Result: PCR-RFLP showed that the homozygous Arab-Indian haplotype (65%) was the most frequent one, followed by the heterozygous Arab-Indian haplotype (11%) in the sickle cell patients (SS), while the AS patients had a higher frequency of the heterozygous Arab-Indian haplotype (38%) in comparison to homozygous one (32%). Four atypical haplotypes, 3 Benin and 1 Cameroon were also observed, although they were in lower frequencies. In the present study, the HbF levels were higher in the AS and the SS patients, with one or two Arab-Indian haplotypes as compared to the other haplotypes. Conclusion: The presence of the Arab-Indian haplotype as the predominant haplotype might be suggestive of a gene flow to/from Saudi-Arabia or India and it was associated with higher HbF levels and a milder disease severity.

Bhagat, Sanjana; Patra, Pradeep Kumar; Thakur, Amar Singh



Congruence as a measurement of extended haplotype structure across the genome  

PubMed Central

Background Historically, extended haplotypes have been defined using only a few data points, such as alleles for several HLA genes in the MHC. High-density SNP data, and the increasing affordability of whole genome SNP typing, create the opportunity to define higher resolution extended haplotypes. This drives the need for new tools that support quantification and visualization of extended haplotypes as defined by as many as 2000 SNPs. Confronted with high-density SNP data across the major histocompatibility complex (MHC) for 2,300 complete families, compiled by the Type 1 Diabetes Genetics Consortium (T1DGC), we developed software for studying extended haplotypes. Methods The software, called ExHap (Extended Haplotype), uses a similarity measurement we term congruence to identify and quantify long-range allele identity. Using ExHap, we analyzed congruence in both the T1DGC data and family-phased data from the International HapMap Project. Results Congruent chromosomes from the T1DGC data have between 96.5% and 99.9% allele identity over 1,818 SNPs spanning 2.64 megabases of the MHC (HLA-DRB1 to HLA-A). Thirty-three of 132 DQ-DR-B-A defined haplotype groups have > 50% congruent chromosomes in this region. For example, 92% of chromosomes within the DR3-B8-A1 haplotype are congruent from HLA-DRB1 to HLA-A (99.8% allele identity). We also applied ExHap to all 22 autosomes for both CEU and YRI cohorts from the International HapMap Project, identifying multiple candidate extended haplotypes. Conclusions Long-range congruence is not unique to the MHC region. Patterns of allele identity on phased chromosomes provide a simple, straightforward approach to visually and quantitatively inspect complex long-range structural patterns in the genome. Such patterns aid the biologist in appreciating genetic similarities and differences across cohorts, and can lead to hypothesis generation for subsequent studies.
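
The abstract reports congruence in terms of percent allele identity between phased chromosomes but does not give ExHap's exact definition. The sketch below computes plain allele identity between two phased chromosomes. The function names, the handling of missing calls (`None`), and the 0.965 cut-off (borrowed from the lower end of the reported 96.5% to 99.9% range) are assumptions for illustration, not ExHap's actual rule.

```python
def allele_identity(chrom_a, chrom_b):
    """Fraction of jointly typed SNPs at which two phased chromosomes carry the same allele."""
    if len(chrom_a) != len(chrom_b):
        raise ValueError("chromosomes must be typed at the same SNPs")
    compared = 0
    matches = 0
    for a, b in zip(chrom_a, chrom_b):
        if a is None or b is None:
            continue  # skip SNPs with a missing call on either chromosome
        compared += 1
        if a == b:
            matches += 1
    return matches / compared if compared else 0.0


def looks_congruent(chrom_a, chrom_b, threshold=0.965):
    """Hypothetical decision rule: identity at or above an assumed threshold."""
    return allele_identity(chrom_a, chrom_b) >= threshold


identity = allele_identity(list("AACG"), list("AACT"))
```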



Weighted SNP set analysis in genome-wide association study.  


Genome-wide association studies (GWAS) are popular for identifying genetic variants that are associated with disease risk. Given the disadvantages of single-locus association analysis, many approaches have been proposed to test multiple single nucleotide polymorphisms (SNPs) in a region simultaneously. Kernel machine based SNP set analysis, which borrows information from SNPs correlated with causal or tag SNPs, is more powerful than single-locus analysis. Four types of kernel machine functions and a principal component based approach (PCA) were also compared. However, given the loss of power caused by low minor allele frequencies (MAF), we extended PCA into a new method called weighted PCA (wPCA). Comparative analysis was performed for weighted principal component analysis (wPCA), the logistic kernel machine based test (LKM) and principal component analysis (PCA) on SNP sets under different minor allele frequencies (MAF) and linkage disequilibrium (LD) structures. We also applied the three methods to two SNP sets extracted from a real GWAS dataset of non-small cell lung cancer in a Han Chinese population. Simulation results show that when the MAF of the causal SNP is low, weighted principal component and weighted IBS are more powerful than PCA and other kernel machine functions at different LD structures and different numbers of causal SNPs. Application of the three methods to a real GWAS dataset indicates that wPCA and wIBS have better performance than the linear kernel, IBS kernel and PCA. PMID:24098741

Dai, Hui; Zhao, Yang; Qian, Cheng; Cai, Min; Zhang, Ruyang; Chu, Minjie; Dai, Juncheng; Hu, Zhibin; Shen, Hongbing; Chen, Feng
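
The abstract names IBS and weighted IBS kernels without defining them. The following sketch uses the common identity-by-state similarity for genotypes coded as minor-allele counts (0, 1, 2). The optional per-SNP weights and the normalization by their sum are assumptions for this example; the paper's actual weighting scheme is not stated in the abstract.

```python
def ibs_kernel(g1, g2, weights=None):
    """Identity-by-state similarity between two individuals over a SNP set.

    g1, g2: genotypes coded as minor-allele counts (0, 1 or 2) per SNP.
    weights: optional per-SNP weights (hypothetical, e.g. larger for rarer variants).
    Returns a value in [0, 1]; 1 means identical genotypes at every SNP.
    """
    if len(g1) != len(g2):
        raise ValueError("genotype vectors must have the same length")
    if weights is None:
        weights = [1.0] * len(g1)
    # Each SNP contributes the number of alleles shared IBS: 2 - |difference|.
    shared = sum(w * (2 - abs(a - b)) for w, a, b in zip(weights, g1, g2))
    return shared / (2 * sum(weights))


unweighted = ibs_kernel([0, 1], [1, 1])
weighted = ibs_kernel([0, 2], [0, 0], weights=[3, 1])
```

Up-weighting low-MAF SNPs lets rare variants contribute more to the similarity, which is the general motivation the abstract gives for the weighted variants.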



Use of haplotypes to estimate Mendelian sampling effects and selection limits.  


Limits to selection and Mendelian sampling (MS) terms can be calculated using haplotypes by summing the individual additive effects on each chromosome. Haplotypes were imputed for 43 382 single-nucleotide polymorphisms (SNP) in 1455 Brown Swiss, 40 351 Holstein and 4064 Jersey bulls and cows using the Fortran program findhap.f90, which combines population and pedigree haplotyping methods. Lower and upper bounds of MS variance were calculated for daughter pregnancy rate (a measure of fertility), milk yield, lifetime net merit (a measure of profitability) and protein yield assuming either no or complete linkage among SNP on the same chromosome. Calculated selection limits were greater than the largest direct genomic values observed in all breeds studied. The best chromosomal genotypes generally consisted of two copies of the same haplotype even after adjustment for inbreeding. Selection of animals rather than chromosomes may result in slower progress, but limits may be the same because most chromosomes will become homozygous with either strategy. Selection on functions of MS could be used to change variances in later generations. PMID:22059578

Cole, J B; VanRaden, P M
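
The abstract says selection limits can be obtained by summing additive effects per chromosome and notes that the best chromosomal genotypes were generally two copies of the same haplotype. A minimal sketch under those two statements follows. The data layout (dicts keyed by chromosome, haplotypes as 0/1 allele lists) and the function names are assumptions, and the inbreeding adjustment mentioned in the abstract is not modeled.

```python
def haplotype_value(alleles, effects):
    """Additive value of one haplotype: sum of allele effects for the alleles it carries."""
    return sum(a * e for a, e in zip(alleles, effects))


def selection_limit(haplotypes_by_chrom, effects_by_chrom):
    """Upper selection limit assuming the ideal genotype carries two copies of the
    best observed haplotype on every chromosome (no inbreeding adjustment)."""
    total = 0.0
    for chrom, haplotypes in haplotypes_by_chrom.items():
        effects = effects_by_chrom[chrom]
        best = max(haplotype_value(h, effects) for h in haplotypes)
        total += 2 * best
    return total


# Toy data: two chromosomes, two SNPs each, alleles coded 0/1.
haps = {1: [[0, 1], [1, 1]], 2: [[1, 0], [0, 0]]}
effects = {1: [0.5, -0.2], 2: [1.0, 0.3]}
limit = selection_limit(haps, effects)
```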



UASIS: Universal Automatic SNP Identification System  

PubMed Central

Background SNPs (Single Nucleotide Polymorphisms), the most common genetic variations between human beings, are believed to be a promising route towards personalized medicine. As more and more research on SNPs is conducted, non-standard nomenclatures may generate problems. The most serious issue is that researchers cannot perform cross referencing among different SNP databases. As a result, more resources and time are required to track SNPs, which could be detrimental to the entire academic community. Results UASIS (Universal Automated SNP Identification System) is a web-based server for SNP nomenclature standardization and translation at DNA level. Three utilities are available: UASIS Aligner, Universal SNP Name Generator and SNP Name Mapper. UASIS maps SNPs from different databases, including dbSNP, GWAS, HapMap and JSNP etc., into a uniform view efficiently using a proposed universal nomenclature and state-of-the-art alignment algorithms. UASIS is freely available at with no requirement of log-in. Conclusions UASIS is a helpful platform for SNP cross referencing and tracking. By providing an informative, unique and unambiguous nomenclature, which utilizes the unique position of a SNP, we aim to resolve the ambiguity of SNP nomenclatures currently practised. Our universal nomenclature is a good complement to mainstream SNP notations such as rs# and HGVS guidelines. UASIS acts as a bridge to connect heterogeneous representations of SNPs.



Development and mapping of SNP assays in allotetraploid cotton.  


A narrow germplasm base and a complex allotetraploid genome have made the discovery of single nucleotide polymorphism (SNP) markers difficult in cotton (Gossypium hirsutum). To generate sequence for SNP discovery, we conducted a genome reduction experiment (EcoRI, BafI double digest, followed by adapter ligation, biotin-streptavidin purification, and agarose gel separation) on two accessions of G. hirsutum and two accessions of G. barbadense. From the genome reduction experiment, a total of 2.04 million genomic sequence reads were assembled into contigs with an N(50) of 508 bp and analyzed for SNPs. A previously generated assembly of expressed sequence tags (ESTs) provided an additional source for SNP discovery. Using highly conservative parameters (minimum coverage of 8× at each SNP and 20% minor allele frequency), a total of 11,834 and 1,679 non-genic SNPs were identified between accessions of G. hirsutum and G. barbadense in genome reduction assemblies, respectively. An additional 4,327 genic SNPs were also identified between accessions of G. hirsutum in the EST assembly. KBioscience KASPar assays were designed for a portion of the intra-specific G. hirsutum SNPs. From 704 non-genic and 348 genic markers developed, a total of 367 (267 non-genic, 100 genic) mapped in a segregating F(2) population (Acala Maxxa × TX2094) using the Fluidigm EP1 system. A G. hirsutum genetic linkage map of 1,688 cM was constructed based entirely on these new SNP markers. Of the genic-based SNPs, we were able to identify within which genome ('A' or 'D') each SNP resided using diploid species sequence data. Genetic maps generated by these newly identified markers are being used to locate quantitative, economically important regions within the cotton genome. PMID:22252442

Byers, Robert L; Harker, David B; Yourstone, Scott M; Maughan, Peter J; Udall, Joshua A



A 48 SNP set for grapevine cultivar identification  

PubMed Central

Background Rapid and consistent genotyping is an important requirement for cultivar identification in many crop species. Among them grapevine cultivars have been the subject of multiple studies given the large number of synonyms and homonyms generated during many centuries of vegetative multiplication and exchange. Simple sequence repeat (SSR) markers have been preferred until now because of their high level of polymorphism, their codominant nature and their high profile repeatability. However, the rapid application of partial or complete genome sequencing approaches is identifying thousands of single nucleotide polymorphisms (SNP) that can be very useful for such purposes. Although SNP markers are bi-allelic, and therefore not as polymorphic as microsatellites, the high number of loci that can be multiplexed and the possibilities of automation as well as their highly repeatable results under any analytical procedure make them the future markers of choice for any type of genetic identification. Results We analyzed over 300 SNP in the genome of grapevine using a re-sequencing strategy in a selection of 11 genotypes. Among the identified polymorphisms, we selected 48 SNP spread across all grapevine chromosomes with allele frequencies balanced enough as to provide sufficient information content for genetic identification in grapevine allowing for good genotyping success rate. Marker stability was tested in repeated analyses of a selected group of cultivars obtained worldwide to demonstrate their usefulness in genetic identification. Conclusions We have selected a set of 48 stable SNP markers with a high discrimination power and a uniform genome distribution (2-3 markers/chromosome), which is proposed as a standard set for grapevine (Vitis vinifera L.) genotyping. Any previous problems derived from microsatellite allele confusion between labs or the need to run reference cultivars to identify allele sizes disappear using this type of marker. 
Furthermore, because SNP markers are bi-allelic, allele identification and genotype naming are extremely simple, and genotypes obtained with different equipment and by different laboratories are always fully comparable.



SNP-SNP interactions in breast cancer susceptibility  

PubMed Central

Background Breast cancer predisposition genes identified to date (e.g., BRCA1 and BRCA2) are responsible for less than 5% of all breast cancer cases. Many studies have shown that the cancer risks associated with individual commonly occurring single nucleotide polymorphisms (SNPs) are incremental. However, polygenic models suggest that multiple commonly occurring low to modestly penetrant SNPs of cancer related genes might have a greater effect on a disease when considered in combination. Methods In an attempt to identify the breast cancer risk conferred by SNP interactions, we have studied 19 SNPs from genes involved in major cancer related pathways. All SNPs were genotyped by TaqMan 5'nuclease assay. The association between the case-control status and each individual SNP, measured by the odds ratio and its corresponding 95% confidence interval, was estimated using unconditional logistic regression models. At the second stage, two-way interactions were investigated using multivariate logistic models. The robustness of the interactions, which were observed among SNPs with stronger functional evidence, was assessed using a bootstrap approach, and correction for multiple testing based on the false discovery rate (FDR) principle. Results None of these SNPs contributed to breast cancer risk individually. However, we have demonstrated evidence for gene-gene (SNP-SNP) interaction among these SNPs, which were associated with increased breast cancer risk. Our study suggests cross talk between the SNPs of the DNA repair and immune system (XPD-[Lys751Gln] and IL10-[G(-1082)A]), cell cycle and estrogen metabolism (CCND1-[Pro241Pro] and COMT-[Met108/158Val]), cell cycle and DNA repair (BARD1-[Pro24Ser] and XPD-[Lys751Gln]), and within carcinogen metabolism (GSTP1-[Ile105Val] and COMT-[Met108/158Val]) pathways. 
Conclusion The importance of these pathways and their communication in breast cancer predisposition has been emphasized previously, but their biological interactions through SNPs have not been described. The strategy used here has the potential to identify complex biological links among breast cancer genes and processes. This will provide novel biological information, which will ultimately improve breast cancer risk management.

Onay, Venus Ummiye; Briollais, Laurent; Knight, Julia A; Shi, Ellen; Wang, Yuanyuan; Wells, Sean; Li, Hong; Rajendram, Isaac; Andrulis, Irene L; Ozcelik, Hilmi
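
The study estimated each SNP's odds ratio and 95% confidence interval with unconditional logistic regression. For a single binary predictor with no covariates, that estimate coincides with the cross-product ratio of a 2×2 table, so the sketch below computes it directly with a Wald interval. The function name, argument order and the 1.96 normal quantile are choices made for this example; the counts are illustrative and not taken from the study.

```python
import math


def odds_ratio_ci(case_exposed, case_unexposed, ctrl_exposed, ctrl_unexposed, z=1.96):
    """Odds ratio and Wald confidence interval from a 2x2 case-control table.

    All four cell counts must be positive.
    """
    a, b, c, d = case_exposed, case_unexposed, ctrl_exposed, ctrl_unexposed
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) is the root of the summed reciprocal cell counts.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Illustrative counts only.
or_, lo, hi = odds_ratio_ci(20, 80, 10, 90)
```

An interval that contains 1 corresponds to no significant single-SNP association at the 5% level, which is the situation the abstract reports for the individual SNPs.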



VKORC1 polymorphisms, haplotypes and haplotype groups on warfarin dose among African-Americans and European-Americans  

PubMed Central

Background: Although the influence of VKORC1 and CYP2C9 polymorphisms on warfarin response has been studied, variability in dose explained by CYP2C9 and VKORC1 is lower among African–Americans compared with European–Americans. This has led investigators to hypothesize that assessment of VKORC1 haplotypes may help capture a greater proportion of the variability in dose for this under-represented group. However, the inadequate representation of African–Americans and the assessment of only a few VKORC1 polymorphisms have hindered this effort. Methods: To determine if VKORC1 haplotypes or haplotype groups explain a higher variability in warfarin dose, we comprehensively assessed VKORC1 polymorphisms in 273 African–Americans and 302 European–Americans. The influence of VKORC1 polymorphisms, race-specific haplotypes and haplotype groups on warfarin dose was evaluated in race-stratified multivariable analyses after accounting for CYP2C9 (*2, *3, *5, *6 and *11) and clinical covariates. Results: VKORC1 explained 18% (30% with CYP2C9) of the variability in warfarin dose among European–Americans and 5% (8% with CYP2C9) among African–Americans. Four common haplotypes in European–Americans and twelve in African–Americans were identified. In each race VKORC1 haplotypes fell into two groups: low-dose (Group A) and high-dose (Group B). African–Americans had a lower frequency of Group A haplotype (10.6%) compared with European–Americans (35%, p < 0.0001). The variability in dose explained by VKORC1 haplotype or haplotype groups was similar to that of a single informative polymorphism. Conclusions: Our findings support the use of CYP2C9, VKORC1 polymorphisms (rs9934438 or rs9923231) and clinical covariates to predict warfarin dose in both African– and European–Americans. A uniform set of common polymorphisms in CYP2C9 and VKORC1, and limited clinical covariates can be used to improve warfarin dose prediction for a racially diverse population.

Limdi, Nita A; Beasley, T Mark; Crowley, Michael R; Goldstein, Joyce A; Rieder, Mark J; Flockhart, David A; Arnett, Donna K; Acton, Ronald T; Liu, Nianjun



The haplotype runs test: the parent-parent-affected offspring trio design.  


The increasing availability of maps of dense polymorphic markers makes use of haplotype data in family-based association analyses an attractive alternative to single marker association tests. We describe a novel class of statistics designed to test for an association between marker haplotypes and a qualitative trait using the parent-parent-affected-offspring trio design. Our haplotype runs test (HRT) is based on consecutive allele-sharing between pairs of haplotypes. We assign weights according to the relative frequencies of the alleles for which the two haplotypes match. Herein, we compare the HRT to the maximum-identity-length-contrast (MILC) statistic, the single-locus transmission/disequilibrium test (TDT), and the generalized test of transmission disequilibrium for haplotype data, as implemented in the software TRANSMIT, using both simulated data and published haplotype data from the recessive disorder ataxia-telangiectasia. Our simulation results suggest that the HRT outperforms the MILC and that the HRT provides comparable power to the TDT and TRANSMIT when the number of distinct founder haplotypes with a disease susceptibility allele is small but substantially outperforms the TDT and TRANSMIT when the number of distinct founder haplotypes with a disease susceptibility allele is even of modest size. PMID:15305328

Lange, Ethan M; Boehnke, Michael



Automated SNP detection in expressed sequence tags: statistical considerations and application to maritime pine sequences.  


We developed an automated pipeline for the detection of single nucleotide polymorphisms (SNPs) in expressed sequence tag (EST) data sets, by combining three DNA sequence analysis programs: Phred, Phrap and PolyBayes. This application requires access to the individual electrophoregram traces. First, a reference set of 65 SNPs was obtained from the sequencing of 30 gametes in 13 maritime pine (Pinus pinaster Ait.) gene fragments (6671 bp), resulting in a frequency of 1 SNP every 102.6 bp. Second, parameters of the three programs were optimized in order to retrieve as many true SNPs as possible, while keeping the rate of false positives as low as possible. Overall, the efficiency of detection of true SNPs was 83.1%. However, this rate varied largely as a function of the rare SNP allele frequency: down to 41% for rare SNP alleles (frequency < 10%), up to 98% for allele frequencies above 10%. Third, the detection method was applied to the 18498 assembled maritime pine (Pinus pinaster Ait.) ESTs, allowing the identification of a total of 1400 candidate SNPs, in contigs containing between 4 and 20 sequence reads. These genetic resources, described for the first time in a forest tree species, were made available at http://www.pierroton.inra/genetics/Pinesnps. We also derived an analytical expression for the SNP detection probability as a function of the SNP allele frequency, the number of haploid genomes used to generate the EST sequence database, and the sample size of the contigs considered for SNP detection. The frequency of the SNP allele was shown to be the main factor influencing the probability of SNP detection. PMID:15284499

Dantec, Loïck Le; Chagné, David; Pot, David; Cantin, Olivier; Garnier-Géré, Pauline; Bedon, Frank; Frigerio, Jean-Marc; Chaumeil, Philippe; Léger, Patrick; Garcia, Virginie; Laigret, Frédéric; De Daruvar, Antoine; Plomion, Christophe
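
The abstract mentions an analytical expression for SNP detection probability but does not reproduce it. As a rough illustration of why allele frequency dominates, the sketch below uses a much simpler model that is an assumption of this example and not the authors' expression: a SNP can be seen in a contig only if both alleles appear among n independently sampled reads, which gives 1 - p^n - (1 - p)^n for minor allele frequency p.

```python
def both_alleles_sampled(minor_freq, n_reads):
    """Probability that n independent reads include at least one copy of each allele.

    Simplified model (assumption): reads are independent draws from the population,
    and sequencing errors and the finite number of haploid genomes are ignored.
    """
    if not 0.0 <= minor_freq <= 1.0:
        raise ValueError("allele frequency must lie in [0, 1]")
    if n_reads < 0:
        raise ValueError("number of reads must be non-negative")
    p = minor_freq
    return 1.0 - p ** n_reads - (1.0 - p) ** n_reads


# Contig depths of 4 and 20 reads bracket the range used in the abstract.
rare_shallow = both_alleles_sampled(0.1, 4)
rare_deep = both_alleles_sampled(0.1, 20)
```

Even under this crude model, a rare allele is far more likely to be missed in a shallow contig than in a deep one, in line with the lower detection efficiency reported for alleles below 10% frequency.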



Score Tests for Association between Traits and Haplotypes when Linkage Phase Is Ambiguous  

PubMed Central

A key step toward the discovery of a gene related to a trait is the finding of an association between the trait and one or more haplotypes. Haplotype analyses can also provide critical information regarding the function of a gene; however, when unrelated subjects are sampled, haplotypes are often ambiguous because of unknown linkage phase of the measured sites along a chromosome. A popular method of accounting for this ambiguity in case-control studies uses a likelihood that depends on haplotype frequencies, so that the haplotype frequencies can be compared between the cases and controls; however, this traditional method is limited to a binary trait (case vs. control), and it does not provide a method of testing the statistical significance of specific haplotypes. To address these limitations, we developed new methods of testing the statistical association between haplotypes and a wide variety of traits, including binary, ordinal, and quantitative traits. Our methods allow adjustment for nongenetic covariates, which may be critical when analyzing genetically complex traits. Furthermore, our methods provide several different global tests for association, as well as haplotype-specific tests, which give a meaningful advantage in attempts to understand the roles of many different haplotypes. The statistics can be computed rapidly, making it feasible to evaluate the associations between many haplotypes and a trait. To illustrate the use of our new methods, they are applied to a study of the association of haplotypes (composed of genes from the human-leukocyte-antigen complex) with humoral immune response to measles vaccination. Limited simulations are also presented to demonstrate the validity of our methods, as well as to provide guidelines on how our methods could be used.

Schaid, Daniel J.; Rowland, Charles M.; Tines, David E.; Jacobson, Robert M.; Poland, Gregory A.



Haplotype association mapping in mice.  


Haplotype Association Mapping (HAM) is a novel phenotype-driven approach to identify genetic loci and was originally developed for mice. This method, which is similar to Genome-Wide Association (GWA) studies in humans, looks for associations between the phenotype and the haplotypes of mouse inbred strains, treating inbred strains as individuals. Although this approach is still in development, we review the current literature, present the different methods and applications that are in use, and provide a glimpse of what is to come in the near future. PMID:19763930

Tsaih, Shirng-Wern; Korstanje, Ron



Approximation properties of haplotype tagging  

PubMed Central

Background Single nucleotide polymorphisms (SNPs) are locations at which the genomic sequences of population members differ. Since these differences are known to follow patterns, disease association studies are facilitated by identifying SNPs that allow the unique identification of such patterns. This process, known as haplotype tagging, is formulated as a combinatorial optimization problem and analyzed in terms of complexity and approximation properties. Results It is shown that the tagging problem is NP-hard but approximable within 1 + ln((n^2 - n)/2) for n haplotypes but not approximable within (1 - ε) ln(n/2) for any ε > 0 unless NP ⊂ DTIME(n^(log log n)). A simple, very easily implementable algorithm that exhibits the above upper bound on solution quality is presented. This algorithm has running time O((2m - p + 1)) ≤ O(m(n^2 - n)/2) where p ≤ min(n, m) for n haplotypes of size m. As we show that the approximation bound is asymptotically tight, the algorithm presented is optimal with respect to this asymptotic bound. Conclusion The haplotype tagging problem is hard, but approachable with a fast, practical, and surprisingly simple algorithm that cannot be significantly improved upon on a single processor machine. Hence, significant improvement in computational effort expended can only be expected if the computational effort is distributed and done in parallel.

Vinterbo, Staal A; Dreiseitl, Stephan; Ohno-Machado, Lucila
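
The abstract does not spell out its algorithm. One natural reading of a logarithmic approximation bound over the (n^2 - n)/2 haplotype pairs is a greedy set-cover heuristic: repeatedly pick the SNP that separates the most haplotype pairs not yet told apart. The sketch below implements that generic heuristic as an assumption; it is not claimed to be the authors' exact procedure, and the function name is hypothetical.

```python
from itertools import combinations


def greedy_tag_snps(haplotypes):
    """Greedily choose SNP indices that together distinguish all distinct haplotypes.

    haplotypes: equal-length sequences of alleles (strings or lists).
    """
    if not haplotypes:
        return []
    n = len(haplotypes)
    m = len(haplotypes[0])
    # Pairs of haplotypes that still have to be told apart.
    remaining = {
        (i, j) for i, j in combinations(range(n), 2) if haplotypes[i] != haplotypes[j]
    }
    # For every SNP, the set of pairs on which the two haplotypes differ at that SNP.
    separates = [
        {(i, j) for (i, j) in remaining if haplotypes[i][s] != haplotypes[j][s]}
        for s in range(m)
    ]
    chosen = []
    while remaining:
        # Every remaining pair differs at some SNP, so the best SNP separates at least one.
        best = max(range(m), key=lambda s: len(separates[s] & remaining))
        chosen.append(best)
        remaining -= separates[best]
    return chosen


haps = ["000", "011", "101", "110"]
tags = greedy_tag_snps(haps)
```

In this toy example two of the three SNPs are enough to identify each of the four haplotypes uniquely.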



New uses for new haplotypes  

Microsoft Academic Search

Recent discoveries of many new genes have made it clear that there is more to the human Y chromosome than a heap of evolutionary debris, hooked up to a sequence that happens to endow its bearer with testes. Coupled with the recent development of new polymorphic markers on the Y, making it the best-characterized haplotypic system in the genome, this

Mark A. Jobling; Chris Tyler-Smith



Probabilistic Logic Learning from Haplotype Data  

Microsoft Academic Search

The analysis of haplotype data of human populations has received much attention recently. For instance, problems such as Haplotype Reconstruction are important intermediate steps in gene association studies, which seek to uncover the genetic basis of complex diseases. In this chapter, we explore the application of probabilistic logic learning techniques to haplotype data. More specifically, a new haplotype recon-

Niels Landwehr; Taneli Mielikäinen



SNPsyn: detection and exploration of SNP-SNP interactions  

PubMed Central

SNPsyn ( is an interactive software tool for the discovery of synergistic pairs of single nucleotide polymorphisms (SNPs) from large genome-wide case-control association studies (GWAS) data on complex diseases. Synergy among SNPs is estimated using an information-theoretic approach called interaction analysis. SNPsyn is both a stand-alone C++/Flash application and a web server. The computationally intensive part is implemented in C++ and can run in parallel on a dedicated cluster or grid. The graphical user interface is written in Adobe Flash Builder 4 and can run in most web browsers or as a stand-alone application. The SNPsyn web server hosts the Flash application, receives GWAS data submissions, invokes the interaction analysis and serves result files. The user can explore details on identified synergistic pairs of SNPs, perform gene set enrichment analysis and interact with the constructed SNP synergy network.

Curk, Tomaz; Rot, Gregor; Zupan, Blaz



Association of LRP5 haplotypes with osteoporosis in Mexican women.  


Osteoporosis is a common health problem in Mexico, so it is essential to investigate the status of different gene polymorphisms that could serve as genetic susceptibility markers in the Mexican population. Genes with a role in bone metabolism are excellent candidates for association studies. In this study, we determined the allelic and genotypic frequencies of four polymorphic markers (C/T rs3736228, G/A rs4988321, T/C rs627174 and T/C rs901824) in the low-density lipoprotein receptor-related protein 5 gene (LRP5) and their association with osteoporosis in 100 postmenopausal osteoporotic Mexican women and their controls, using real-time PCR and TaqMan probes. Only the G/A polymorphism (rs4988321, Val667Met) showed significant differences (p = 0.039) when genotype frequencies were compared. However, when the haplotypes of these four polymorphisms were analyzed, interesting associations became evident. The CGTT haplotype showed significant association with low risk of osteoporosis (OR 0.629; p = 0.007; [95 % CI, 0.448-0.884]), whereas the TACT haplotype was significantly associated with a higher risk of osteoporosis (OR 7.965; p = 0.006; [95 % CI, 1.557-54.775]). Our results supported the association of LRP5 with osteoporosis and showed the potential value of LRP5 haplotypes to identify risk of osteoporosis in the Mexican population. PMID:23242660

Falcón-Ramírez, Edith; Casas-Avila, Leonora; Cerda-Flores, Ricardo M; Castro-Hernández, Clementina; Rubio-Lightbourn, Julieta; Velázquez-Cruz, Rafael; Diez-G, Pilar; Peñaloza-Espinosa, Rosenda; Valdés-Flores, Margarita



Haplotype Analysis Reveals a Possible Founder Effect of RET Mutation R114H for Hirschsprung's Disease in the Chinese Population  

PubMed Central

Background Hirschsprung's disease (HSCR) is a congenital disorder associated with the lack of intramural ganglion cells in the myenteric and sub-mucosal plexuses along varying segments of the gastrointestinal tract. The RET gene is the major gene implicated in this gastrointestinal disease. A highly recurrent mutation in RET (RETR114H) has recently been identified in ~6–7% of Chinese HSCR patients and, to date, has not been found in Caucasian patients or controls, nor in Chinese controls. Due to the high frequency of RETR114H in this population, we sought to investigate whether this mutation may be a founder HSCR mutation in the Chinese population. Methodology and Principal Findings To test whether all RETR114H mutations originated from a single mutational event, we applied a Bayesian method to RET SNPs genotyped in 430 Chinese HSCR patients (of whom 25 individuals had the mutation) and estimated the age of RETR114H to be between 4 and 23 generations, depending on growth rate. We reasoned that if RETR114H was a founder mutation then those with the mutation would share a haplotype on which the mutation resides. Including SNPs spanning 509.31 kb across RET from a recently obtained 500 K genome-wide dataset for a subset of 181 patients (14 RETR114H patients), we applied haplotype estimation methods to determine whether there were any segments shared between patients with RETR114H that are not present in those without the mutation or controls. Analysis yielded a 250.2 kb (51 SNP) shared segment over the RET gene (and downstream) in only those patients with the mutation, with no similar segments found among other patients. Conclusions This suggests that RETR114H is a founder mutation for HSCR in the Chinese population.

Cornes, Belinda K.; Tang, Clara S.; Leon, Thomas Y. Y.; Hui, Kenneth J. W. S.; So, Man-Ting; Miao, Xiaoping; Cherny, Stacey S.; Sham, Pak C.; Tam, Paul K. H.; Garcia-Barcelo, Maria-Merce



Genetic polymorphisms and haplotype structures of the human CYP2W1 gene in a Japanese population.  


A novel human cytochrome P450, designated CYP2W1, has recently been identified and is found to be present mainly in tumor cells, particularly in colon cancer cells. In the present study, we report the first systematic investigation of polymorphisms in the human CYP2W1 gene. Based on denaturing high performance liquid chromatography analyses of polymerase chain reaction products, we analyzed nine exons and exon-intron junctions of the gene in DNA samples from 200 Japanese subjects and identified six single nucleotide polymorphisms (SNP). Three of the novel nonsynonymous SNPs were as follows: 173A>C (Glu58Ala) in exon 1 and 5432G>A (Val432Ile) and 5584G>C (Gln482His) in exon 9. Two previously known nonsynonymous SNPs, that is, 2008G>A (Ala181Thr) in exon 4 and 5601C>T (Pro488Leu) in exon 9, were also found. On haplotype analyses, in addition to the wild-type CYP2W1*1A (frequency, 0.295) allele, other alleles, namely, CYP2W1*1B (0.318), CYP2W1*2 (0.005), CYP2W1*3 (0.005), CYP2W1*4 (0.008), CYP2W1*5 (0.003), and CYP2W1*6 (0.368), were also characterized. The most common allele, CYP2W1*6, exhibited the amino acid substitution Pro488Leu. These results were in good agreement with the expected genotype distributions that were calculated using the Hardy-Weinberg equation. The data on variant alleles and comprehensive haplotype structures would be useful for predicting the metabolic phenotypes of CYP2W1 substrates in the Japanese population. PMID:17998294

Hanzawa, Yoshiyuki; Sasaki, Takamitsu; Mizugaki, Michinao; Ishikawa, Masaaki; Hiratsuka, Masahiro
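
The CYP2W1 abstract above compares observed genotype distributions with those expected under the Hardy-Weinberg equation. As a minimal sketch of that expectation for a multi-allelic locus (the function name `hwe_expected_genotypes` is hypothetical and this is not the authors' code), homozygotes are expected at frequency p_i^2 and heterozygotes at 2*p_i*p_j:

```python
from itertools import combinations_with_replacement


def hwe_expected_genotypes(allele_freqs):
    """Expected genotype frequencies under Hardy-Weinberg equilibrium.

    allele_freqs: dict mapping allele name -> frequency (should sum to ~1).
    Returns a dict mapping (allele_i, allele_j) -> expected frequency,
    using p_i**2 for homozygotes and 2*p_i*p_j for heterozygotes.
    """
    alleles = sorted(allele_freqs)
    expected = {}
    for a, b in combinations_with_replacement(alleles, 2):
        p, q = allele_freqs[a], allele_freqs[b]
        expected[(a, b)] = p * p if a == b else 2 * p * q
    return expected


# Allele frequencies as reported in the abstract
# (they sum to 1.002 because of rounding in the source).
cyp2w1 = {"*1A": 0.295, "*1B": 0.318, "*2": 0.005, "*3": 0.005,
          "*4": 0.008, "*5": 0.003, "*6": 0.368}
expected = hwe_expected_genotypes(cyp2w1)

# Expected number of *6/*6 homozygotes among the 200 subjects.
n_subjects = 200
print(round(expected[("*6", "*6")] * n_subjects, 1))
```

A goodness-of-fit test (for example chi-square) would then compare these expected counts with the observed ones; the abstract does not say which test was used.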



Haplotype analysis of Apo AI-CIII-AIV gene cluster and lipids level: Tehran Lipid and Glucose Study.  


Iranian populations show an increased tendency for abnormal lipid levels and high risk of coronary artery disease. Considering the important role played by the ApoAI-CIII-AIV gene cluster in the regulation of the level and metabolism of lipids, this study aimed at elucidating the association between five single nucleotide polymorphisms on the Apo11q cluster gene and lipid levels. A cross-sectional study of 823 subjects (340 males and 483 females) from the Tehran lipid and glucose study (TLGS) was conducted. Levels of TG, Chol, HDL-C, Apo AI, Apo AIV, Apo B, and Apo CIII were measured, and the selected segments of the APOAI-CIII-AIV gene cluster were amplified by PCR and the polymorphisms were revealed by RFLP using restriction enzymes. The allele frequencies for each SNP between males and females were not significantly different. The distribution of genotypes and alleles was in Hardy-Weinberg equilibrium except for Apo AI (+83C>T). The results showed a significant association between TG, HDL-C, HDL(2), Apo AI, and Apo B levels and the presence of some alleles in the polymorphisms studied. After haplotype analysis not only did the association between these variables and SNPs remain but also levels of Chol and LDL-C were added. This study demonstrates that the level of lipids such as TG, HDL-C, HDL(2), Apo AI, and Apo B may be regulated partly by genetic factors and their haplotype within the Apo11q gene cluster. PMID:22105741

Daneshpour, Maryam S; Faam, Bita; Mansournia, Mohamad Ali; Hedayati, Mehdi; Halalkhor, Sohrab; Mesbah-Namin, Seyed Alireza; Shojaei, Shahla; Zarkesh, Maryam; Azizi, Fereidoun



A common TPH2 haplotype regulates the neural processing of a cognitive control demand.  


The monoamine neurotransmitter, serotonin, critically regulates the function of the cerebral cortex and is involved in psychiatric disorders. Tryptophan hydroxylase (TPH) is the rate-limiting enzyme in the synthesis of serotonin with the neuron-specific TPH2 isoform present exclusively in the brain and encoded by the TPH2 gene on chromosome 12q21. The haplotype structure of TPH2 was defined for 16 single-nucleotide polymorphisms (SNPs) in a healthy subject population and a haplotype block analysis confirmed the presence of a six SNP haplotype in a yin configuration that has previously been associated with risk for suicidality, depression, and anxiety disorders. Functional magnetic resonance imaging (fMRI) was used to assess the influence of TPH2 variation on brain function related to cognitive control using the Multi-Source Interference Task (MSIT). The MSIT-related blood oxygen level-dependent (BOLD) response was increased with increasing copies of the TPH2 yin haplotype for the dorsal anterior cingulate cortex (dACC), right inferior frontal cortex (IFC), and anterior striatum. A functional connectivity analysis further revealed that increasing numbers of the TPH2 yin haplotype was associated with diminished functional coupling between the dACC and the right IFC, precentral gyrus, parietal cortex and dlPFC. A moderation analysis indicated that the relationship between neural processing networks and cognitive control was significantly modulated by allelic variation for the TPH2 yin haplotype. These findings suggest that the association of risk for psychiatric disorders with a common TPH2 yin haplotype is related to the inefficient functional engagement of cortical areas involved in cognitive control and alterations in the mode of functional connectivity of dACC pathways. PMID:22915309

Kennedy, Ashley P; Binder, Elisabeth B; Bowman, Dubois; Harenski, Keith; Ely, Timothy; Cisler, Josh M; Tripathi, Shanti P; VanNess, Sidney; Kilts, Clinton D



Detecting disease-predisposing variants: The haplotype method  

SciTech Connect

For many HLA-associated diseases, multiple alleles - and, in some cases, multiple loci - have been suggested as the causative agents. The haplotype method for identifying disease-predisposing amino acids in a genetic region is a stratification analysis. We show that, for each haplotype combination containing all the amino acid sites involved in the disease process, the relative frequencies of amino acid variants at sites not involved in disease but in linkage disequilibrium with the disease-predisposing sites are expected to be the same in patients and controls. The haplotype method is robust to mode of inheritance and penetrance of the disease and can be used to determine unequivocally whether all amino acid sites involved in the disease have not been identified. Using a resampling technique, we developed a statistical test that takes account of the nonindependence of the sites sampled. Further, when multiple sites in the genetic region are involved in disease, the test statistic gives a closer fit to the null expectation when some - compared with none - of the true predisposing factors are included in the haplotype analysis. Although the haplotype method cannot distinguish between very highly correlated sites in one population, ethnic comparisons may help identify the true predisposing factors. The haplotype method was applied to insulin-dependent diabetes mellitus (IDDM) HLA class II DQA1-DQB1 data from Caucasian, African, and Japanese populations. Our results indicate that the combination DQA1 No. 52 (Arg predisposing) DQB1 No. 57 (Asp protective), which has been proposed as an important IDDM agent, does not include all the predisposing elements. With rheumatoid arthritis HLA class II DRB1 data, the results were consistent with the shared-epitope hypothesis. 35 refs., 2 figs., 6 tabs.

Valdes, A.M.; Thomson, G. [Univ. of California, Berkeley, CA (United States)]



Effects of Vitamin A and D Receptor Gene Polymorphisms/Haplotypes on Immune Responses to Measles Vaccine  

PubMed Central

OBJECTIVE Vitamin A and D, and their receptors, are important regulators of the immune system, including vaccine immune response. We assessed the association between polymorphisms in the vitamin A (RARA, RARB and RARG) and vitamin D receptor (VDR)/RXRA genes and inter-individual variations in immune responses after two doses of measles vaccine in 745 subjects. METHODS Using a tagSNP approach, we genotyped 745 healthy children for the 391 polymorphisms in vitamin A and D receptor genes. RESULTS The RARB haplotype (rs6800566/rs6550976/rs9834818) was significantly associated with variations in both measles antibody (global p=0.013) and cytokine secretion levels, such as IL-10 (global p=0.006), IFN-α (global p=0.008), and TNF-α (global p=0.039) in the Caucasian subgroup. Specifically, the RARB haplotype AAC was associated with higher (t-statistic 3.27, p=0.001) measles antibody levels. At the other end of the spectrum, haplotype GG for rs6550978/rs6777544 was associated with lower antibody levels (t-statistic −2.32, p=0.020) in the Caucasian subgroup. In a sensitivity analysis, the RARB haplotype CTGGGCAA remained marginally significant (p<0.02) when the single SNP rs12630816 was included in the model for IL-10 secretion levels. A significant association was found between lower measles-specific IFN-γ Elispot responses and haplotypes rs11102986/rs11103473/rs11103482/rs10776909/rs12004589/rs35780541/rs2266677/rs875444 (global p=0.004) and rs6537944/rs3118571 (global p<0.001) in the RXRA gene for Caucasians. We also found associations between multiple RARB, VDR and RXRA SNPs/haplotypes and measles-specific IL-2, IL-6, IL-10, IFN-α, IFN-γ, IFNλ-1, and TNF-α cytokine secretion. CONCLUSION Our results suggest that specific allelic variations and haplotypes in the vitamin A and D receptor genes may influence adaptive immune responses to measles vaccine.

Ovsyannikova, Inna G.; Haralambieva, Iana H.; Vierkant, Robert A.; O'Byrne, Megan M.; Jacobson, Robert M.; Poland, Gregory A.



A GCH1 haplotype and risk of neural tube defects in the National Birth Defects Prevention Study.  


Tetrahydrobiopterin (BH(4)) is an essential cofactor and an important cellular antioxidant. BH(4) deficiency has been associated with diseases whose etiologies stem from excessive oxidative stress. GTP cyclohydrolase I (GCH1) catalyzes the first and rate-limiting step of de novo BH(4) synthesis. A 3-SNP haplotype in GCH1 (rs8007267, rs3783641, and rs10483639) is known to modulate GCH1 gene expression levels and has been suggested as a major determinant of plasma BH(4) bioavailability. As plasma BH(4) bioavailability has been suggested as a mechanism of neural tube defect (NTD) teratogenesis, we evaluated the association between this GCH1 haplotype and the risk of NTDs. Samples were obtained from 760 NTD case-parent triads included in the National Birth Defects Prevention Study (NBDPS). The three SNPs were genotyped using TaqMan® SNP assays. An extension of the log-linear model was used to assess the association between NTDs and both offspring and maternal haplotypes. Offspring carrying two copies of haplotype C-T-C had a significantly increased NTD risk (risk ratio [RR]=3.40, 95% confidence interval [CI]: 1.02-11.50), after adjusting for the effect of the maternal haplotype. Additionally, mothers carrying two copies of haplotype C-T-C had a significantly increased risk of having an NTD-affected offspring (RR=3.46, 95% CI: 1.05-11.00), after adjusting for the effect of the offspring haplotype. These results suggest offspring and maternal variation in the GCH1 gene and altered BH(4) biosynthesis may contribute to NTD risk. PMID:23059057

Lupo, Philip J; Chapa, Claudia; Nousome, Darryl; Duhon, Cody; Canfield, Mark A; Shaw, Gary M; Finnell, Richard H; Zhu, Huiping



Genome-wide SNP and haplotype analyses reveal a rich history underlying dog domestication  

Microsoft Academic Search

Advances in genome technology have facilitated a new understanding of the historical and genetic processes crucial to rapid phenotypic evolution under domestication. To understand the process of dog diversification better, we conducted an extensive genome-wide survey of more than 48,000 single nucleotide polymorphisms in dogs and their wild progenitor, the grey wolf. Here we show that dog breeds share a

Bridgett M. Vonholdt; John P. Pollinger; Kirk E. Lohmueller; Eunjung Han; Heidi G. Parker; Pascale Quignon; Jeremiah D. Degenhardt; Adam R. Boyko; Dent A. Earl; Adam Auton; Andy Reynolds; Kasia Bryc; Abra Brisbin; James C. Knowles; Dana S. Mosher; Tyrone C. Spady; Abdel Elkahloun; Eli Geffen; Malgorzata Pilot; Wlodzimierz Jedrzejewski; Claudia Greco; Ettore Randi; Danika Bannasch; Alan Wilton; Jeremy Shearman; Marco Musiani; Michelle Cargill; Paul G. Jones; Zuwei Qian; Wei Huang; Zhao-Li Ding; Ya-Ping Zhang; Carlos D. Bustamante; Elaine A. Ostrander; John Novembre; Robert K. Wayne



Clarifying haplotype ambiguity of NAT2 in multi-national cohorts.  


N-Acetyltransferase 2 (NAT2) is the key enzyme in aromatic amine metabolism. NAT2 genotyping requires a subsequent determination of the haplotype pairs (formerly: alleles) to derive the acetylation status. The chromosomal phase of the single nucleotide polymorphisms (SNPs) is unclear for about 2/3 of the genotypes. We investigated NAT2 genotypes of 1,234 bladder cancer cases and 2,207 controls from Germany, Hungary, Pakistan and Venezuela plus 696 further German cancer cases. We reconstructed NAT2 haplotypes using PHASE v2.1.1. We analysed if the variability of the NAT2 haplotypes affected the haplotype reconstruction. Furthermore, we compared population haplotype frequencies in three Caucasian control cohorts (German, Hungarian, Spanish), in Pakistanis and Venezuelans and the impact on bladder cancer. We conclude that a common haplotype reconstruction is feasible, enhances precision and reliability. Hungarian controls showed the largest intra-ethnic variability whereas the Pakistanis showed a haplotype distribution typical for Caucasians. The main differences could be observed for the slow haplotypes *5B, *6A and *7B. The association of slow NAT2 genotypes with bladder cancer risk was most prominent in the Venezuelan study group. PMID:23277078

Selinski, Silvia; Blaszkewicz, Meinolf; Agundez, Jose A G; Martinez, Carmen; Garcia-Martin, Elena; Hengstler, Jan G; Golka, Klaus
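
The NAT2 abstract above notes that the chromosomal phase is unclear for most genotypes. A small illustration of why, assuming biallelic SNPs coded as variant-allele counts (0, 1 or 2): a genotype with k heterozygous sites is compatible with 2^(k-1) unordered haplotype pairs when k >= 1. The function below is a hypothetical enumeration helper; it is not how PHASE works, which estimates the most probable pair statistically.

```python
from itertools import product


def compatible_haplotype_pairs(genotype):
    """List the unordered haplotype pairs consistent with an unphased genotype.

    genotype: sequence of variant-allele counts per SNP (0, 1 or 2).
    Each haplotype is returned as a tuple of 0/1 alleles.
    """
    het_sites = [i for i, g in enumerate(genotype) if g == 1]
    # Homozygous sites are fixed on both haplotypes.
    base = [1 if g == 2 else 0 for g in genotype]
    pairs = set()
    for bits in product((0, 1), repeat=len(het_sites)):
        h1, h2 = list(base), list(base)
        for site, bit in zip(het_sites, bits):
            # At a heterozygous site the two haplotypes carry opposite alleles.
            h1[site], h2[site] = bit, 1 - bit
        # Sort so that (h1, h2) and (h2, h1) count as the same pair.
        pairs.add(tuple(sorted((tuple(h1), tuple(h2)))))
    return sorted(pairs)


# Three heterozygous sites and one homozygous-variant site.
for pair in compatible_haplotype_pairs([1, 1, 1, 2]):
    print(pair)
```

Genotypes with zero or one heterozygous site have a single compatible pair, so only those are unambiguous without further inference.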



Recombination mapping using Boolean logic and high-density SNP genotyping for exome sequence filtering  

PubMed Central

Whole genome sequence data for small pedigrees has been shown to provide sufficient information to resolve detailed haplotypes in small pedigrees. Using such information, recombinations can be mapped onto chromosomes, compared with the segregation of a disease of interest and used to filter genome sequence variants. We now show that relatively inexpensive SNP array data from small pedigrees can be used in a similar manner to provide a means of identifying regions of interest in exome sequencing projects. We demonstrate that in those situations where one can assume complete penetrance and parental DNA is available, SNP recombination mapping using Boolean logic identifies chromosomal regions identical to those detected by multipoint linkage using microsatellites but with much less computation. We further show that this approach is successful because the probability of a double crossover between informative SNP loci is negligible. Our observations provide a rationale for using SNP arrays and recombination mapping as a rapid and cost-effective means of incorporating chromosome segregation information into exome sequencing projects intended for disease-gene identification.

Markello, Thomas C.; Han, Ted; Carlson-Donohoe, Hannah; Ahaghotu, Chidi; Harper, Ursula; Jones, MaryPat; Chandrasekharappa, Settara; Anikster, Yair; Adams, David R.; Gahl, William A.; Boerkoel, Cornelius F.
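
The abstract above describes filtering chromosomal regions with Boolean logic under complete penetrance. The sketch below is only an illustration of that idea, not the authors' published procedure. It assumes the transmitted parental haplotype has already been determined for each child at each informative SNP (the `inherited` coding and the function name `candidate_intervals` are hypothetical), and it keeps the intervals in which all affected children share one parental haplotype that no unaffected child carries.

```python
def candidate_intervals(positions, inherited, affected):
    """Return (start, end) intervals compatible with disease segregation.

    positions: sorted SNP coordinates in bp.
    inherited: dict child -> list of bool, one per SNP; True means the child
               received the parent's haplotype "A" there (hypothetical coding).
    affected:  dict child -> bool.
    """
    cases = [c for c in inherited if affected[c]]
    controls = [c for c in inherited if not affected[c]]
    intervals, start, prev = [], None, None
    for i, pos in enumerate(positions):
        case_states = {inherited[c][i] for c in cases}
        # Boolean test: every case agrees AND no control matches the cases.
        ok = len(case_states) == 1 and all(
            inherited[c][i] not in case_states for c in controls)
        if ok and start is None:
            start = pos
        if not ok and start is not None:
            intervals.append((start, prev))
            start = None
        prev = pos
    if start is not None:
        intervals.append((start, prev))
    return intervals


positions = [100, 200, 300, 400, 500]
inherited = {
    "sib1": [True, True, True, True, False],
    "sib2": [False, True, True, True, True],
    "sib3": [False, False, False, True, True],
}
affected = {"sib1": True, "sib2": True, "sib3": False}
print(candidate_intervals(positions, inherited, affected))
```

Each switch between True and False along a child's row marks a recombination; variants from exome sequencing that fall outside the returned intervals could then be filtered out.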



SSCP-SNP in pearl millet--a new marker system for comparative genetics.  


A considerable array of genomic resources are in place in pearl millet, and marker-aided selection is already in use in the public breeding programme at ICRISAT. This paper describes experiments to extend these publicly available resources to a single nucleotide polymorphism (SNP)-based marker system. A new marker system, single-strand conformational polymorphism (SSCP)-SNP, was developed using annotated rice genomic sequences to initially predict the intron-exon borders in millet expressed sequence tags (ESTs) and then to design primers that would amplify across the introns. An adequate supply of millet ESTs was available for us to identify 299 homologues of single-copy rice genes in which the intron positions could be precisely predicted. PCR primers were then designed to amplify approximately 500-bp genomic fragments containing introns. Analysis of these fragments on SSCP gels revealed considerable polymorphism. A detailed DNA sequence analysis of variation at four of the SSCP-SNP loci over a panel of eight inbred genotypes showed complex patterns of variation, with about one SNP or indel (insertion-deletion) every 59 bp in the introns, but considerably fewer in the exons. About two-thirds of the variation was derived from SNPs and one-third from indels. Most haplotypes were detected by SSCP. As a marker system, SSCP-SNP has lower development costs than simple sequence repeats (SSRs), because much of the work is in silico, and similar deployment costs and through-put potential. The rates of polymorphism were lower but useable, with a mean PIC of 0.49 relative to 0.72 for SSRs in our eight inbred genotype panel screen. The major advantage of the system is in comparative applications. Syntenic information can be used to target SSCP-SNP markers to specific chromosomal regions or, conversely, SSCP-SNP markers can be used to unravel detailed syntenic relationships in specific parts of the genome. 
Finally, a preliminary analysis showed that the millet SSCP-SNP primers amplified in other cereals with a success rate of about 50%. There is also considerable potential to promote SSCP-SNP to a COS (conserved orthologous set) marker system for application across species by more specifically designing primers to precisely match the model genome sequence. PMID:15809850

Bertin, I; Zhu, J H; Gale, M D



Survey of the fragile X syndrome CGG repeat and the short-tandem-repeat and single-nucleotide-polymorphism haplotypes in an African American population.  


Previous studies have shown that specific short-tandem-repeat (STR) and single-nucleotide-polymorphism (SNP)-based haplotypes within and among unaffected and fragile X white populations are found to be associated with specific CGG-repeat patterns. It has been hypothesized that these associations result from different mutational mechanisms, possibly influenced by the CGG structure and/or cis-acting factors. Alternatively, haplotype associations may result from the long mutational history of increasing instability. To understand the basis of the mutational process, we examined the CGG-repeat size, three flanking STR markers (DXS548-FRAXAC1-FRAXAC2), and one SNP (ATL1) spanning 150 kb around the CGG repeat in unaffected (n=637) and fragile X (n=63) African American populations and compared them with unaffected (n=721) and fragile X (n=102) white populations. Several important differences were found between the two ethnic groups. First, in contrast to that seen in the white population, no associations were observed among the African American intermediate or "predisposed" alleles (41-60 repeats). Second, two previously undescribed haplotypes accounted for the majority of the African American fragile X population. Third, a putative "protective" haplotype was not found among African Americans, whereas it was found among whites. Fourth, in contrast to that seen in whites, the SNP ATL1 was in linkage equilibrium among African Americans, and it did not add new information to the STR haplotypes. These data indicate that the STR- and SNP-based haplotype associations identified in whites probably reflect the mutational history of the expansion, rather than a mutational mechanism or pathway. PMID:10677308

Crawford, D C; Schwartz, C E; Meadows, K L; Newman, J L; Taft, L F; Gunter, C; Brown, W T; Carpenter, N J; Howard-Peebles, P N; Monaghan, K G; Nolin, S L; Reiss, A L; Feldman, G L; Rohlfs, E M; Warren, S T; Sherman, S L
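
The abstract above reports that the SNP ATL1 was in linkage equilibrium with the STR haplotypes among African Americans. For reference, a minimal sketch of the standard pairwise disequilibrium measures (D, D' and r^2) computed from phased haplotype counts at two biallelic loci; the function name and the counts are illustrative only and are not data from the study.

```python
def ld_stats(n_AB, n_Ab, n_aB, n_ab):
    """Return (D, D', r^2) from the four two-locus haplotype counts."""
    n = n_AB + n_Ab + n_aB + n_ab
    p_AB = n_AB / n
    p_A = (n_AB + n_Ab) / n
    p_B = (n_AB + n_aB) / n
    # D: observed haplotype frequency minus the product expected at equilibrium.
    d = p_AB - p_A * p_B
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    r2 = d * d / denom if denom > 0 else 0.0
    # D' rescales D by its maximum attainable magnitude given the allele frequencies.
    if d >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    d_prime = d / d_max if d_max > 0 else 0.0
    return d, d_prime, r2


# Equal counts of all four haplotypes: the loci are in linkage equilibrium.
print(ld_stats(25, 25, 25, 25))
# Only two complementary haplotypes present: complete disequilibrium.
print(ld_stats(50, 0, 0, 50))
```

D' keeps the sign of D here; some software reports its absolute value instead.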



A TNF region haplotype offers protection from typhoid fever in Vietnamese patients  

PubMed Central

The genomic region surrounding the TNF locus on human chromosome 6 has previously been associated with typhoid fever in Vietnam. We used a haplotypic approach to understand this association further. Eighty single nucleotide polymorphisms (SNPs) spanning a 150 kb region were genotyped in 95 Vietnamese individuals (typhoid case/mother/father trios). A subset of data from 33 SNPs with a minor allele frequency of >4.3% was used to construct haplotypes. Fifteen SNPs, which tagged the 42 constructed haplotypes, were selected. The haplotype tagging SNPs (T1-T15) were genotyped in 380 confirmed typhoid cases and 380 Vietnamese ethnically matched controls. Allelic frequencies of seven SNPs (T1, T2, T3, T5, T6, T7, T8) were significantly different between typhoid cases and controls. Logistic regression results support the hypothesis that there is just one signal associated with disease at this locus. Haplotype-based analysis of the tag SNPs provided positive evidence of association with typhoid (posterior probability 0.821). The analysis highlighted a low-risk cluster of haplotypes that each carry the minor allele of T1 or T7, but not both, and otherwise carry the combination of alleles *12122*1111 at T1-T11, further supporting the one associated signal hypothesis. Finally, individuals that carry the typhoid fever protective haplotype *12122*1111 also produce a relatively low TNF-α response to LPS.



Bayesian Analysis of Haplotypes for Linkage Disequilibrium Mapping  

PubMed Central

Haplotype analysis of disease chromosomes can help identify probable historical recombination events and localize disease mutations. Most available analyses use only marginal and pairwise allele frequency information. We have developed a Bayesian framework that utilizes full haplotype information to overcome various complications such as multiple founders, unphased chromosomes, data contamination, and incomplete marker data. A stochastic model is used to describe the dependence structure among several variables characterizing the observed haplotypes, for example, the ancestral haplotypes and their ages, mutation rate, recombination events, and the location of the disease mutation. An efficient Markov chain Monte Carlo algorithm was developed for computing the estimates of the quantities of interest. The method is shown to perform well in both real data sets (cystic fibrosis data and Friedreich ataxia data) and simulated data sets. The program that implements the proposed method, BLADE, as well as the two real datasets, can be obtained from

Liu, Jun S.; Sabatti, Chiara; Teng, Jun; Keats, Bronya J.B.; Risch, Neil



Genome complexity reduction for SNP genotyping analysis  

PubMed Central

Efficient single nucleotide polymorphism (SNP) genotyping methods are necessary to accomplish many current gene discovery goals. A crucial element in large-scale SNP genotyping is the number of individual biochemical reactions that must be performed. An efficient method that can be used to simultaneously amplify a set of genetic loci across a genome with high reliability can provide a valuable tool for large-scale SNP genotyping studies. In this paper we describe and characterize a method that addresses this goal. We have developed a strategy for reducing genome complexity by using degenerate oligonucleotide primer (DOP)-PCR and applied this strategy to SNP genotyping in three complex eukaryotic genomes; human, mouse, and Arabidopsis thaliana. Using a single DOP-PCR primer, SNP loci spread throughout a genome can be amplified and accurately genotyped directly from a DOP-PCR product mixture. DOP-PCRs are extremely reproducible. The DOP-PCR method is transferable to many species of interest. Finally, we describe an in silico approach that can effectively predict the SNP loci amplified in a given DOP-PCR, permitting the design of an efficient set of reactions for large-scale, genome-wide SNP studies.

Jordan, Barbara; Charest, Alain; Dowd, John F.; Blumenstiel, Justin P.; Yeh, Ru-fang; Osman, Asiah; Housman, David E.; Landers, John E.



Haplotyping for disease association: a combinatorial approach.  


We consider a combinatorial problem derived from haplotyping a population with respect to a genetic disease, either recessive or dominant. Given a set of individuals, partitioned into healthy and diseased, and the corresponding sets of genotypes, we want to infer "bad'' and "good'' haplotypes to account for these genotypes and for the disease. Assume e.g. the disease is recessive. Then, the resolving haplotypes must consist of bad and good haplotypes, so that (i) each genotype belonging to a diseased individual is explained by a pair of bad haplotypes and (ii) each genotype belonging to a healthy individual is explained by a pair of haplotypes of which at least one is good. We prove that the associated decision problem is NP-complete. However, we also prove that there is a simple solution, provided the data satisfy a very weak requirement. PMID:18451433

Lancia, Giuseppe; Ravi, R; Rizzi, Romeo
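
Conditions (i) and (ii) in the abstract above can be written down directly as a check on a proposed solution. The sketch below only verifies a given resolution for the recessive case; it does not search for one, which is the NP-complete part. Genotypes are assumed to be coded as variant-allele counts (0, 1 or 2) per site, and the function names are hypothetical.

```python
def explains(h1, h2, genotype):
    """True if haplotypes h1 and h2 (tuples of 0/1) add up to the genotype."""
    return len(h1) == len(h2) == len(genotype) and all(
        a + b == g for a, b, g in zip(h1, h2, genotype))


def valid_recessive_resolution(resolution, diseased, bad):
    """Check a haplotype resolution against the recessive-disease conditions.

    resolution: dict individual -> (genotype, h1, h2).
    diseased:   set of diseased individuals; everyone else is healthy.
    bad:        set of haplotypes labelled "bad"; all others are "good".
    """
    for person, (genotype, h1, h2) in resolution.items():
        if not explains(h1, h2, genotype):
            return False
        both_bad = h1 in bad and h2 in bad
        # (i) diseased  -> both haplotypes bad
        # (ii) healthy  -> at least one good, i.e. not both bad
        if (person in diseased) != both_bad:
            return False
    return True


bad = {(1, 0, 1)}
resolution = {
    "p1": ((2, 0, 2), (1, 0, 1), (1, 0, 1)),  # two bad haplotypes
    "p2": ((1, 0, 2), (1, 0, 1), (0, 0, 1)),  # one bad, one good
}
print(valid_recessive_resolution(resolution, diseased={"p1"}, bad=bad))
```

For a dominant disease the test would flip: a diseased individual needs at least one bad haplotype and a healthy one needs two good haplotypes.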


Allele and Haplotype Diversity of 26 X-STR Loci in Four Nationality Populations from China  

PubMed Central

Background Haplotype analysis of closely associated markers has proven to be a powerful tool in kinship analysis, especially when short tandem repeats (STR) fail to resolve uncertainty in relationship analysis. STR located on the X chromosome show stronger linkage disequilibrium compared with autosomal STR. So, it is necessary to estimate the haplotype frequencies directly from population studies as linkage disequilibrium is population-specific. Methodology and Findings Twenty-six X-STR loci including six clusters of linked markers DXS6807-DXS8378-DXS9902 (Xp22), DXS7132-DXS10079-DXS10074-DXS10075-DXS981 (Xq12), DXS6801-DXS6809-DXS6789-DXS6799 (Xq21), DXS7424-DXS101-DXS7133 (Xq22), DXS6804-GATA172D05 (Xq23), DXS8377-DXS7423 (Xq28) and the loci DXS6800, DXS6803, DXS9898, GATA165B12, DXS6854, HPRTB and GATA31E08 were typed in four nationality (Han, Uigur, Kazakh and Mongol) samples from China (n = 1522, 876 males and 646 females). Allele and haplotype frequency as well as linkage disequilibrium data for kinship calculation were observed. The allele frequency distribution among different populations was compared. A total of 5–20 alleles for each locus were observed and altogether 289 alleles for all the selected loci were found. Allele frequency distribution for most X-STR loci is different in different populations. A total of 876 male samples were investigated by haplotype analysis and for linkage disequilibrium. A total of 89, 703, 335, 147, 39 and 63 haplotypes were observed. Haplotype diversity was 0.9584, 0.9994, 0.9935, 0.9736, 0.9427 and 0.9571 for cluster I, II, III, IV, V and VI, respectively. Eighty-two percent of the cluster II haplotypes were found only once, and 94% of the cluster III haplotypes showed a frequency of <1%. Conclusions These results indicate that allele frequency distribution for most X-STR loci is population-specific and haplotypes of six clusters provide a powerful tool for kinship testing and relationship investigation. So it is necessary to obtain allele frequency and haplotype data of the linked loci for forensic application.

Quan, Li; Zhao, Hu; Wu, Ye-Da; Huang, Xiao-Ling; Lu, De-Jian
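
The X-STR abstract above reports haplotype diversity per cluster but does not state the formula. A common choice is Nei's unbiased estimator, H = n/(n-1) * (1 - sum of p_i^2), and the sketch below assumes that definition. The function name and the toy haplotype labels are hypothetical.

```python
from collections import Counter


def haplotype_diversity(haplotypes):
    """Nei's unbiased haplotype (gene) diversity for a sample of haplotypes.

    haplotypes: iterable of hashable haplotype labels, one per sampled chromosome.
    """
    counts = Counter(haplotypes)
    n = sum(counts.values())
    if n < 2:
        raise ValueError("need at least two sampled haplotypes")
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1 - sum_sq)


# Toy sample of six male X chromosomes typed at a two-marker cluster.
sample = ["10-12", "10-12", "11-12", "11-13", "12-13", "12-14"]
print(round(haplotype_diversity(sample), 4))
```

When every sampled haplotype is unique the estimator equals 1, which is why clusters dominated by singletons, such as cluster II in the abstract, approach that value.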



Analysis of the French National Registry of unrelated bone marrow donors, using surnames as a tool for improving geographical localisation of HLA haplotypes  

Microsoft Academic Search

The first statistical analysis of the French National Registry of volunteer bone marrow donors estimated the probabilities of haplotype frequencies separately for each of the 20 administrative regions of France. Here we propose to use donors' surnames to increase the accuracy of location of the donor's geographical origin. This approach allows us to estimate haplotype frequencies for administrative entities (90

Anna Degioanni; Pierre Darlu; Colette Raffoux



SNP Set Association Analysis for Genome-Wide Association Studies  

PubMed Central

Genome-wide association study (GWAS) is a promising approach for identifying common genetic variants of the diseases on the basis of millions of single nucleotide polymorphisms (SNPs). In order to avoid the low power caused by excessive correction for multiple comparisons in single-locus association studies, some methods have been proposed that group SNPs into a SNP set based on genomic features and then test the joint effect of the SNP set. We compare the performances of principal component analysis (PCA), supervised principal component analysis (SPCA), kernel principal component analysis (KPCA), and sliced inverse regression (SIR). Simulated SNP sets are generated under scenarios of 0, 1 and ≥2 causal SNPs model. Our simulation results show that all of these methods can control the type I error at the nominal significance level. SPCA is always more powerful than the other methods at different settings of linkage disequilibrium structures and minor allele frequency of the simulated datasets. We also apply these four methods to a real GWAS of non-small cell lung cancer (NSCLC) in Han Chinese population

Cai, Min; Dai, Hui; Qiu, Yongyong; Zhao, Yang; Zhang, Ruyang; Chu, Minjie; Dai, Juncheng; Hu, Zhibin; Shen, Hongbing; Chen, Feng



Multiple Cross and Inbred Strain Haplotype Mapping of Complex-Trait Candidate Genes  

PubMed Central

Identifying complex-trait candidate genes after initial low-resolution mapping has proven to be a difficult and labor-intensive undertaking, usually requiring years to develop and analyze congenic strains. As a result, to date, few complex-trait genes have been discovered. Recently it was suggested that SNP haplotype analysis in inbred strains might be useful for mapping of complex traits. In this study, we have combined medium-resolution haplotype mapping with multiple experimental cross-mapping experiments to reduce the number of potential candidate genes in a complex-trait candidate interval. Coincident mapping of a modifier gene in multiple experimental crosses using different inbred strains is consistent with the common inheritance of a modifier allele. A haplotype map was developed in four inbred strains of mice used in our complex-trait mapping crosses across the proximal 10 cM of proximal Chromosome 19 to identify haplotype blocks that segregate appropriately. Only ~23 out of >400 genes met this criterion. This strategy coupled with tissue and expression arrays, as well as our recently described common pathway analysis to reduce the number of high-priority candidates, may provide a rapid, efficient method to identify and prioritize complex-trait candidate genes without requiring construction of congenic mouse strains.

Park, Yeong-Gwon; Clifford, Robert; Buetow, Kenneth H.; Hunter, Kent W.



A new haplotype in BMP4 implicated in ossification of the posterior longitudinal ligament (OPLL) in a Chinese population.  


Previous genome-wide microarray analysis of candidate genes involved in the ossification of the posterior longitudinal ligament (OPLL) of the spine resulted in the identification of a novel, clinically relevant gene encoding bone morphogenetic protein 4 (BMP4) but was defined only by its expression patterns. The complete genomic BMP4 coding DNA from 450 patients with OPLL and 550 matched controls was sequenced and compared. We identified 18 SNPs, among which the minor alleles of SNP8 (C>T; p [...]), SNP13 (rs17563C>T; p [...]), SNP14 (rs76335800A>T; p [...]) [...] SNP8 (p [...]), SNP13 (p [...]), SNP14 (p [...]) [...] haplotype, TGGGCTT (p [...]) [...] haplotype TGGGCTT appear to contribute to the risk of developing OPLL. Also the severity of OPLL seems to be mediated predominantly by genetic variations in this specific BMP4 gene region, but might be associated with other certain clinical and demographic characteristics in the Chinese population studied. PMID:22052794

Ren, Yuan; Feng, Jie; Liu, Zhi-zhong; Wan, Hong; Li, Jun-hua; Lin, Xin



SNP genotyping by DNA photoligation: application to SNP detection of genes from food crops  

NASA Astrophysics Data System (ADS)

We describe a simple and inexpensive single-nucleotide polymorphism (SNP) typing method, using DNA photoligation with 5-carboxyvinyl-2'-deoxyuridine and two fluorophores. This SNP-typing method facilitates qualitative determination of genes from indica and japonica rice, and showed a high degree of single nucleotide specificity up to 10 000. This method can be used in the SNP typing of actual genomic DNA samples from food crops.

Yoshimura, Yoshinaga; Ohtake, Tomoko; Okada, Hajime; Ami, Takehiro; Tsukaguchi, Tadashi; Fujimoto, Kenzo



Managing large SNP datasets with SNPpy.  


Using relational databases to manage SNP datasets is a very useful technique that has significant advantages over alternative methods, including the ability to leverage the power of relational databases to perform data validation, and the use of the powerful SQL query language to export data. SNPpy is a Python program which uses the PostgreSQL database and the SQLAlchemy Python library to automate SNP data management. This chapter shows how to use SNPpy to store and manage large datasets. PMID:23756888

Mitha, Faheem
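
The abstract above names two advantages of a relational database for SNP data: validation by the database itself and export through SQL. The sketch below illustrates both using Python's built-in `sqlite3` module in place of PostgreSQL and SQLAlchemy. It does not use or reproduce SNPpy's actual schema or API; the tables, column names, identifiers and toy rows are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
CREATE TABLE snp (
    snp_id TEXT PRIMARY KEY,
    chrom  TEXT NOT NULL,
    pos    INTEGER NOT NULL CHECK (pos > 0)
);
CREATE TABLE genotype (
    sample_id     TEXT NOT NULL,
    snp_id        TEXT NOT NULL REFERENCES snp(snp_id),
    genotype_call TEXT NOT NULL CHECK (genotype_call IN ('AA', 'AB', 'BB', 'NC')),
    PRIMARY KEY (sample_id, snp_id)
);
""")
conn.execute("INSERT INTO snp VALUES ('snp_1', '1', 1000)")  # toy marker
conn.commit()


def load_calls(conn, rows):
    """Insert genotype rows and return those rejected by the constraints."""
    rejected = []
    for row in rows:
        try:
            with conn:  # commits on success, rolls back on error
                conn.execute("INSERT INTO genotype VALUES (?, ?, ?)", row)
        except sqlite3.IntegrityError:
            rejected.append(row)
    return rejected


rows = [
    ("s1", "snp_1", "AA"),
    ("s2", "snp_1", "AB"),
    ("s3", "snp_1", "XX"),  # invalid call: fails the CHECK constraint
    ("s1", "snp_9", "AA"),  # unknown marker: fails the foreign key
    ("s1", "snp_1", "BB"),  # duplicate sample/marker: fails the primary key
]
rejected = load_calls(conn, rows)

# Export step: per-marker call counts with plain SQL.
summary = conn.execute(
    "SELECT genotype_call, COUNT(*) FROM genotype "
    "WHERE snp_id = 'snp_1' GROUP BY genotype_call ORDER BY genotype_call"
).fetchall()
print(rejected)
print(summary)
```

Pushing the rules into the schema means malformed input is rejected at load time rather than discovered during analysis.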



Computing the minimum recombinant haplotype configuration from incomplete genotype data on a pedigree by integer linear programming.  


We study the problem of reconstructing haplotype configurations from genotypes on pedigree data with missing alleles under the Mendelian law of inheritance and the minimum-recombination principle, which is important for the construction of haplotype maps and genetic linkage/association analyses. Our previous results show that the problem of finding a minimum-recombinant haplotype configuration (MRHC) is in general NP-hard. This paper presents an effective integer linear programming (ILP) formulation of the MRHC problem with missing data and a branch-and-bound strategy that utilizes a partial order relationship and some other special relationships among variables to decide the branching order. Nontrivial lower and upper bounds on the optimal number of recombinants are introduced at each branching node to effectively prune the search tree. When multiple solutions exist, a best haplotype configuration is selected based on a maximum likelihood approach. The paper also shows for the first time how to incorporate marker interval distance into a rule-based haplotyping algorithm. Our results on simulated data show that the algorithm could recover haplotypes with 50 loci from a pedigree of size 29 in seconds on a Pentium IV computer. Its accuracy is more than 99.8% for data with no missing alleles and 98.3% for data with 20% missing alleles in terms of correctly recovered phase information at each marker locus. A comparison with a statistical approach SimWalk2 on simulated data shows that the ILP algorithm runs much faster than SimWalk2 and reports better or comparable haplotypes on average than the first and second runs of SimWalk2. As an application of the algorithm to real data, we present some test results on reconstructing haplotypes from a genome-scale SNP dataset consisting of 12 pedigrees that have 0.8% to 14.5% missing alleles. PMID:16108713

Li, Jing; Jiang, Tao
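
The minimum-recombination principle in the abstract above can be illustrated on a single meiosis. The sketch below is not the authors' ILP formulation; it is a small dynamic program, under the assumption that both parental haplotypes are already phased and that the child's transmitted haplotype is known. The function name `min_recombinants` and the 0/1 allele coding are hypothetical choices made for this illustration.

```python
from typing import Optional, Sequence


def min_recombinants(child: Sequence[int],
                     parent_haps: tuple[Sequence[int], Sequence[int]]) -> Optional[int]:
    """Fewest crossovers needed to copy `child` from two phased parental haplotypes.

    Returns None if some locus matches neither parental haplotype,
    i.e. the data are inconsistent with Mendelian inheritance.
    """
    inf = float("inf")
    # cost[k] = minimum number of switches so far, given that the current
    # locus is copied from parental haplotype k
    cost = [0, 0]
    for locus, allele in enumerate(child):
        new_cost = []
        for k in (0, 1):
            if parent_haps[k][locus] != allele:
                new_cost.append(inf)  # haplotype k cannot explain this allele
            else:
                # either stay on haplotype k, or switch from the other one
                new_cost.append(min(cost[k], cost[1 - k] + 1))
        cost = new_cost
    best = min(cost)
    return None if best == inf else int(best)


if __name__ == "__main__":
    paternal = ([0, 0, 0, 0], [1, 1, 1, 1])
    print(min_recombinants([0, 0, 1, 1], paternal))
```

A full MRHC solver has to choose the phase of every pedigree member jointly and cope with missing alleles, which is what makes the general problem NP-hard; this toy only scores one already-phased transmission.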


Characterization of human collagen XVIII promoter 2: interaction of Sp1, Sp3 and YY1 with the regulatory region and a SNP that increases transcription in hepatocytes.  


Different levels of Collagen XVIII expression have been associated with several pathological processes such as cancer, liver fibrosis, diabetic retinopathy and Alzheimer's disease. Understanding the transcriptional regulation of Collagen XVIII might elucidate some pathways related to the progression of these diseases. Promoter 2 of the COL18A1 gene is poorly understood and is responsible for the transcription of this gene in several adult tissues such as liver, eyes and brain. This study focused upon characterization of cis-regulatory elements interacting with human COL18A1 promoter 2 and identification of SNPs in this region in different ethnic groups. Our results show that there are five conserved regions (I to V) between human and mouse promoter 2 and that the human COL18A1 core promoter is located between nucleotides -186 and -21. Sp1 and Sp3 bind to conserved regions I and V, while Sp3 and YY1 interact with region II. We have verified that the SNP at position -700 (T>G) is embedded in two common haplotypes, which have different frequencies between European and African descendants. The allele -700G increases transcription and binding for a still unknown transcription factor. SNP -700 affects Sp3 and YY1 interaction with this region, even though it is not part of these transcription factors' predicted binding sites. Therefore, our results show for the first time that Sp3 and YY1 interact with human COL18A1 promoter 2, and that nucleotide -700 is part of a binding motif for a still unknown TF that is involved in the expression of this gene in hepatocytes. In addition, we also confirm the involvement of Sp1 in the regulation of this gene. PMID:16229994

Armelin-Correa, Lucia M; Lin, Chin J; Barbosa, Angela; Bagatini, Kelly; Winnischofer, Sheila M B; Sogayar, Mari C; Passos-Bueno, Maria Rita



Specific haplotypes of the RET proto-oncogene are over-represented in patients with sporadic papillary thyroid carcinoma  

PubMed Central

Background: Papillary thyroid carcinoma (PTC), which may be sporadic (95%) or familial (5%), has a prevalence adjusted for age in the general population of 1:100 000. Somatic rearrangements of the RET proto-oncogene are present in up to 66% of sporadic tumours, while they are rarely found in familial cases. Purpose: In order to determine if some variants of this gene, or a combination of them, might predispose to PTC, we looked for an association of RET haplotype(s) in PTC cases and in controls from four countries matched for sex, age, and population. Methods: Four single nucleotide polymorphisms (SNPs) across the RET coding sequence were typed and haplotype frequencies were estimated. Genotype and haplotype distributions were compared among these cases and controls. Results: Ten haplotypes were observed, the seven most frequent of which have been previously described in sporadic Hirschsprung patients and controls. The single locus analyses suggested association of exon 2 and exon 13 SNPs with sporadic PTC. The haplotype analysis showed over-representation of one haplotype in French and Italian sporadic PTC, whereas a different haplotype was significantly under-represented in French familial PTC. Conclusions: Our data suggest that some variants of RET and some specific haplotypes may act as low penetrance alleles in the predisposition to PTC.

Lesueur, F; Corbex, M; McKay, J; Lima, J; Soares, P; Griseri, P; Burgess, J; Ceccherini, I; Landolfi, S; Papotti, M; Amorim, A; Goldgar, D; Romeo, G



Haplotype analysis in Huntington disease provides insights into mechanisms of CAG repeat expansion  

SciTech Connect

Huntington disease (HD) is one of 7 disorders now known to be caused by expansion of a trinucleotide repeat. The HD mutation is a polymorphic trinucleotide (CAG) repeat in the 5′ region of a novel gene that expands beyond the normal range of 10-35 repeats in persons destined to develop the disease. Haplotype analysis of other dynamic mutation disorders such as myotonic dystrophy and fragile X has suggested that a rare ancestral expansion event on a normal chromosome is followed by subsequent expansion events, resulting in a pool of chromosomes in the premutation range, which is inherently unstable and prone to further multiple expansion events leading to disease range chromosomes. Haplotype analysis of 67 HD and 84 control chromosomes using 5 polymorphic markers, both intragenic and 5′ to the disease mutation, demonstrates that multiple haplotypes underlie HD. However, 94% of the chromosomes can be grouped under two major haplotypes. These two haplotypes are also present in the normal population. A third major haplotype is seen on 38% of normal chromosomes but rarely on HD chromosomes (6%). CAG lengths on the normal chromosomes with the two haplotypes seen in the HD population are higher than those seen on the normal chromosomes with the haplotype rarely seen on HD chromosomes. Furthermore, in populations with a diminished frequency of HD, CAG length on normal chromosomes is significantly less than in other populations with higher prevalence rates for HD. These data suggest that CAG length on normal chromosomes may be a significant factor contributing to repeat instability that eventually leads to chromosomes with CAG repeat lengths in the HD range. Haplotypes on the HD chromosomes are identical to those of normal chromosomes which have CAG lengths in the high range of normal, suggesting that further expansions of this pool of chromosomes lead to chromosomes with CAG repeat sizes within the disease range, consistent with a multistep model.

Andrew, S.E.; Goldberg, Y.P.; Squitieri, F. [Univ. of British Columbia, Vancouver (Canada)] [and others]



Filling in missing genotypes using haplotypes  

Technology Transfer Automated Retrieval System (TEKTRAN)

Unknown genotypes can be made known (imputed) from observed genotypes at the same or nearby loci of relatives using pedigree haplotyping, or from matching allele patterns (regardless of pedigree) using population haplotyping. Fortran program findhap.f90 was designed to combine population and pedigre...


Network analysis of human Y microsatellite haplotypes  

Microsoft Academic Search

To investigate the utility of Y chromosome microsatellites for studying human male-lineage evolution, we typed samples from three populations for five tetranucleotide repeats and an Alu insertion polymorphism. We found very high levels of haplotype diversity and evidence that most mutations involve the gain or loss of only one repeat unit, implying that any given microsatellite haplotype may

Gillian Cooper; William Amos; Dorota Hoffman; David C. Rubinsztein



Multilocus analysis of SNP and metabolic data within a given pathway  

PubMed Central

Background Complex traits, which are under the influence of multiple and possibly interacting genes, have become a subject of new statistical methodological research. One of the greatest challenges facing human geneticists is the identification and characterization of susceptibility genes for common multifactorial diseases and their association to different quantitative phenotypic traits. Results Two types of data from the same metabolic pathway were used in the analysis: categorical measurements of 18 SNPs; and quantitative measurements of plasma levels of several steroids and their precursors. Using the combinatorial partitioning method we tested various thresholds for each metabolic trait and each individual SNP locus. One SNP in CYP19 (3′UTR), two SNPs in CYP1B1 (R48G and A119S) and one in CYP1A1 (T461N) were significantly differently distributed between the high and low level metabolic groups. The leave-one-out cross-validation method showed that 6 SNPs in concert give 65% correct prediction of phenotype. Further, we used pattern recognition, computing the p-value by Monte Carlo simulation, to identify sets of SNPs and physiological characteristics such as age and weight that contribute to a given metabolic level. Since the SNPs detected by both methods reside either in the same gene (CYP1B1) or in 3 different genes in immediate vicinity on chromosome 15 (CYP19, CYP11 and CYP1A1), we investigated the possibility that they form intragenic and intergenic haplotypes, which may jointly account for a higher activity in the pathway. We identified such haplotypes associated with metabolic levels. Conclusion The methods reported here may enable the study of multiple low-penetrance genetic factors that together determine various quantitative phenotypic traits. Our preliminary data suggest that several genes coding for proteins involved in a common pathway, that happen to be located on common chromosomal areas and may form intragenic haplotypes, together account for a higher activity of the whole pathway.

Kristensen, Vessela N; Tsalenko, Anya; Geisler, Jurgen; Faldaas, Anne; Grenaker, Grethe Irene; Lingjaerde, Ole Christian; Fjeldstad, Stale; Yakhini, Zohar; Lønning, Per Eystein; Børresen-Dale, Anne-Lise



Evaluation of breast cancer susceptibility using improved genetic algorithms to generate genotype SNP barcodes.  


Genetic association is a challenging task for the identification and characterization of genes that increase the susceptibility to common complex multifactorial diseases. To fully execute genetic studies of complex diseases, modern geneticists face the challenge of detecting interactions between loci. A genetic algorithm (GA) is developed to detect the association of genotype frequencies of cancer cases and noncancer cases based on statistical analysis. An improved genetic algorithm (IGA) is proposed to improve the reliability of the GA method for high-dimensional SNP-SNP interactions. The strategy offers the top five results to the random population process, in which they guide the GA toward a significant search course. The IGA increases the likelihood of quickly detecting the maximum ratio difference between cancer cases and noncancer cases. The joint effect of 23 SNP combinations of six steroid hormone metabolism and signaling-related genes involved in breast carcinogenesis pathways was systematically evaluated, with the IGA successfully detecting significant ratio differences between breast cancer cases and noncancer cases. The possible breast cancer risks were subsequently analyzed by odds-ratio (OR) and risk-ratio analysis. The estimated OR of the best SNP barcode is significantly higher than 1 (between 1.15 and 7.01) for specific combinations of two to 13 SNPs. Analysis results support that the IGA provides higher ratio difference values than the GA between breast cancer cases and noncancer cases over 3-SNP to 13-SNP interactions. A more specific SNP-SNP interaction profile for the risk of breast cancer is also provided. PMID:23929860

Yang, Cheng-Hong; Lin, Yu-Da; Chuang, Li-Yeh; Chang, Hsueh-Wei


Assessment of two flexible and compatible SNP genotyping platforms: TaqMan® SNP Genotyping Assays and the SNPlex™ Genotyping System  

Microsoft Academic Search

In this review we describe the principles, protocols, and applications of two commercially available SNP genotyping platforms, the TaqMan® SNP Genotyping Assays and the SNPlex™ Genotyping System. Combined, these two technologies meet the requirements of multiple SNP applications in genetics research and pharmacogenetics. We also describe a set of SNP selection tools and validated assay resources which we developed to

Francisco M. De La Vega; Katherine D. Lazaruk; Michael D. Rhodes; Michael H. Wenz



Acute chest syndrome is associated with single nucleotide polymorphism-defined beta globin cluster haplotype in children with sickle cell anaemia.  


Genetic diversity at the human β-globin locus has been implicated as a modifier of sickle cell anaemia (SCA) severity. However, haplotypes defined by restriction fragment length polymorphism sites across the β-globin locus have not been consistently associated with clinical phenotypes. To define the genetic structure at the β-globin locus more thoroughly, we performed high-density single nucleotide polymorphism (SNP) mapping in 820 children who were homozygous for the sickle cell mutation (HbSS). Genotyping results revealed very high linkage disequilibrium across a large region spanning the locus control region and the HBB (β-globin gene) cluster. We identified three predominant haplotypes accounting for 96% of the β(S)-carrying chromosomes in this population that could be distinguished using a minimal set of common SNPs. Consistent with previous studies, fetal haemoglobin level was significantly associated with β(S)-haplotypes. After controlling for covariates, an association was detected between haplotype and rate of hospitalization for acute chest syndrome (ACS) (incidence rate ratio 0·51, 95% confidence interval 0·29-0·89) but not incidence rate of vaso-occlusive pain or presence of silent cerebral infarct (SCI). Our results suggest that these SNP-defined β(S)-haplotypes may be associated with ACS, but not pain or SCI in a study population of children with SCA. PMID:23952145

Bean, Christopher J; Boulet, Sheree L; Yang, Genyan; Payne, Amanda B; Ghaji, Nafisa; Pyle, Meredith E; Hooper, W Craig; Bhatnagar, Pallav; Keefer, Jeffrey; Barron-Casella, Emily A; Casella, James F; Debaun, Michael R



SNP mining porcine ESTs with MAVIANT, a novel tool for SNP evaluation and annotation  

Microsoft Academic Search

Motivation: Single nucleotide polymorphism (SNP) analysis is an important means to study genetic variation. A fast and cost-efficient approach to identify large numbers of novel candidates is the SNP mining of large scale sequencing projects. The increasing availability of sequence trace data in public repositories makes it feasible to evaluate SNP predictions on the DNA chromatogram level. MAVIANT, a platform-independent

Frank Panitz; Henrik Stengaard; Henrik Hornshøj; Jan Gorodkin; Jakob Hedegaard; Susanna Cirera; Bo Thomsen; Lone B. Madsen; Anette Høj; Rikke K. Vingborg; Bujie Zahn; Xuegang Wang; Xuefei Wang; Rasmus Wernersson; Claus B. Jørgensen; Karsten Scheibye-knudsen; Troels Arvin; Steen Lumholdt; Milena Sawera; Trine Green; Bente J. Nielsen; Jakob Hull Havgaard; Søren Brunak; Merete Fredholm; Christian Bendixen



Genome-wide SNP detection, validation, and development of an 8K SNP array for apple  

Technology Transfer Automated Retrieval System (TEKTRAN)

As high-throughput genetic marker screening systems are essential for a range of genetics studies and plant breeding applications, the International RosBREED SNP Consortium (IRSC) has utilized the Illumina Infinium® II system to develop a medium- to high-throughput SNP screening tool for genome-wide...


Multi objective SNP selection using pareto optimality.  


Biomarker discovery is a challenging task of bioinformatics especially when targeting high dimensional problems such as SNP (single nucleotide polymorphism) datasets. Various types of feature selection methods can be applied to accomplish this task. Typically, using features versus class labels of samples in the training dataset, these methods aim at selecting feature subsets with maximal classification accuracies. Although finding such class-discriminative features is crucial, selection of relevant SNPs for maximizing other properties that exist in the nature of population genetics such as the correlation between genetic diversity and geographical distance of ethnic groups can also be equally important. In this work, a methodology using a multi objective optimization technique called Pareto Optimal is utilized for selecting SNP subsets offering both high classification accuracy and correlation between genomic and geographical distances. In this method, discriminatory power of an SNP is determined using mutual information and its contribution to the genomic-geographical correlation is estimated using its loadings on principal components. Combining these objectives, the proposed method identifies SNP subsets that can better discriminate ethnic groups than those obtained with sole mutual information and yield higher correlation than those obtained with sole principal components on the Human Genome Diversity Project (HGDP) SNP dataset. PMID:23318882

Gumus, Ergun; Gormez, Zeliha; Kursun, Olcay
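
The Pareto-optimality idea used in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes each candidate SNP subset has already been scored on two objectives to be maximised (for example a classification accuracy and a genomic-geographical correlation), and the function name `pareto_front` and the sample scores are hypothetical.

```python
def pareto_front(scores):
    """Return indices of candidates that no other candidate dominates.

    `scores` is a list of tuples, one tuple of objective values per
    candidate; every objective is assumed to be maximised.
    Candidate b dominates a if b is at least as good on every objective
    and strictly better on at least one.
    """
    front = []
    for i, a in enumerate(scores):
        dominated = any(
            all(b_k >= a_k for a_k, b_k in zip(a, b))
            and any(b_k > a_k for a_k, b_k in zip(a, b))
            for j, b in enumerate(scores)
            if j != i
        )
        if not dominated:
            front.append(i)
    return front


if __name__ == "__main__":
    # (accuracy, correlation) for four hypothetical SNP subsets
    candidates = [(0.9, 0.1), (0.5, 0.5), (0.4, 0.4), (0.1, 0.9)]
    print(pareto_front(candidates))
```

The quadratic scan is adequate for a few thousand candidates; larger searches would normally use a sort-based or evolutionary multi-objective method instead.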



Fundamental problem of forensic mathematics--the evidential value of a rare haplotype.  


Y-chromosomal and mitochondrial haplotyping offer special advantages for criminal (and other) identification. For different reasons, each of them is sometimes detectable in a crime stain for which autosomal typing fails. But they also present special problems, including a fundamental mathematical one: When a rare haplotype is shared between suspect and crime scene, how strong is the evidence linking the two? Assume a reference population sample is available which contains n-1 haplotypes. The most interesting situation as well as the most common one is that the crime scene haplotype was never observed in the population sample. The traditional tools of product rule and sample frequency are not useful when there are no components to multiply and the sample frequency is zero. A useful statistic is the fraction κ of the population sample that consists of "singletons" - of once-observed types. A simple argument shows that the probability for a random innocent suspect to match a previously unobserved crime scene type is (1-κ)/n - distinctly less than 1/n, likely ten times less. The robust validity of this model is confirmed by testing it against a range of population models. This paper hinges above all on one key insight: probability is not frequency. The common but erroneous "frequency" approach adopts population frequency as a surrogate for matching probability and attempts the intractable problem of guessing how many instances exist of the specific haplotype at a certain crime. Probability, by contrast, depends by definition only on the available data. Hence if different haplotypes but with the same data occur in two different crimes, although the frequencies are different (and are hopelessly elusive), the matching probabilities are the same, and are not hard to find. PMID:20457055

Brenner, Charles H
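
The singleton-based matching probability described in the abstract above, (1-κ)/n, is simple to compute. The sketch below makes one assumption that the abstract does not spell out: the crime scene haplotype is appended to the n-1 reference haplotypes, and both n and the singleton fraction κ are taken from that augmented sample. The function name `kappa_match_probability` and the toy data are hypothetical.

```python
from collections import Counter


def kappa_match_probability(reference, crime_scene_type):
    """Return (kappa, match_probability) for a previously unobserved haplotype.

    Assumption: the reference sample (n-1 haplotypes) is augmented with the
    crime scene type, giving n haplotypes; kappa is the fraction of those n
    that are singletons (types observed exactly once).
    """
    if crime_scene_type in reference:
        raise ValueError("formula applies only to a type unseen in the reference sample")
    augmented = list(reference) + [crime_scene_type]
    n = len(augmented)
    counts = Counter(augmented)
    singletons = sum(1 for c in counts.values() if c == 1)
    kappa = singletons / n
    return kappa, (1 - kappa) / n


if __name__ == "__main__":
    reference = ["A", "A", "B", "C", "C", "D", "E", "F", "G"]  # n-1 = 9 haplotypes
    kappa, p = kappa_match_probability(reference, "Z")
    print(f"kappa={kappa:.2f}  match probability={p:.3f}  (compare 1/n={1/10:.3f})")
```

With many singletons in the sample κ approaches 1, which is why the abstract notes that the matching probability can be about ten times smaller than 1/n.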



Characterization of the Streptomyces sp. Strain C5 snp Locus and Development of snp-Derived Expression Vectors  

PubMed Central

The Streptomyces sp. strain C5 snp locus is comprised of two divergently oriented genes: snpA, a metalloproteinase gene, and snpR, which encodes a LysR-like activator of snpA transcription. The transcriptional start point of snpR is immediately downstream of a strong T-N11-A inverted repeat motif likely to be the SnpR binding site, while the snpA transcriptional start site overlaps the ATG start codon, generating a leaderless snpA transcript. By using the aphII gene of pIJ486 as a reporter, the plasmid-borne snpR-activated snpA promoter was ca. 60-fold more active than either the nonactivated snpA promoter or the melC1 promoter of pIJ702. The snpR-activated snpA promoter produced reporter protein levels comparable to those of the up-mutated ermE* promoter. The SnpR-activated snpA promoter was built into a set of transcriptional and translational fusion expression vectors which have been used for the intracellular expression of numerous daunomycin biosynthesis pathway genes from Streptomyces sp. strain C5 as well as the expression and secretion of soluble recombinant human endostatin.

DeSanti, Charles L.; Strohl, William R.



Identification of a type 1 diabetes-associated CD4 promoter haplotype with high constitutive activity.  


CD4 is a candidate gene in autoimmune diseases, including Type 1 diabetes mellitus (T1DM), because the CD4 receptor is crucial for appropriate antigen responses of CD4(+) T cells. We previously found linkage between a CD4-1188(TTTTC)(5-14) promoter polymorphism and T1DM. In the present study, we screened the human CD4 promoter for mutations and identified three frequent single nucleotide polymorphisms (SNPs): CD4-181C/G, CD4-521C/G and CD4-1050T/C. The SNPs are in strong linkage disequilibrium (LD) and association with the CD4-1188(TTTTC)(5-14) alleles, and we observed nine CD4 promoter haplotypes, of which four are frequent. We genotyped the SNPs in 253 Danish T1DM families (1129 individuals) and found evidence for linkage and association of a CD4 (A4(-1188)T(-1050)G(-521)C(-181)) haplotype to T1DM. In reporter studies, we show that (1) the T1DM-associated CD4 haplotype encodes high constitutive promoter activity and (2) the CD4-181G variant encodes higher stimulated promoter activity than the CD4-181C variant. This difference is in part neutralized in the frequently occurring CD4 promoter haplotypes by the more upstream genetic variants. Thus, we report functional impact of a novel CD4-181C/G SNP on stimulated CD4 promoter activity and the identification of a novel CD4 haplotype with high constitutive promoter activity that is linked and associated with T1DM. PMID:15182254

Kristiansen, O P; Karlsen, A E; Larsen, Z M; Johannesen, J; Pociot, F; Mandrup-Poulsen, T



Strobe sequence design for haplotype assembly  

PubMed Central

Background Humans are diploid, carrying two copies of each chromosome, one from each parent. Separating the paternal and maternal chromosomes is an important component of genetic analyses such as determining genetic association, inferring evolutionary scenarios, computing recombination rates, and detecting cis-regulatory events. As the pair of chromosomes are mostly identical to each other, linking together of alleles at heterozygous sites is sufficient to phase, or separate, the two chromosomes. In Haplotype Assembly, the linking is done by sequenced fragments that overlap two heterozygous sites. While there has been a lot of research on correcting errors to achieve accurate haplotypes via assembly, relatively little work has been done on designing sequencing experiments to get long haplotypes. Here, we describe the different design parameters that can be adjusted with next generation and upcoming sequencing technologies, and study the impact of design choice on the length of the haplotype. Results We show that a number of parameters influence haplotype length, with the most significant one being the advance length (distance between two fragments of a clone). Given technologies like strobe sequencing that allow for large variations in advance lengths, we design and implement a simulated annealing algorithm to sample a large space of distributions over advance lengths. Extensive simulations on individual genomic sequences suggest that a non-trivial distribution over advance lengths results in a 1-2 order of magnitude improvement in median haplotype length. Conclusions Our results suggest that haplotyping of large, biologically important genomic regions is feasible with current technologies.
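
The link between fragment design and haplotype length can be made concrete with a small connectivity sketch. It is an illustration only, not the simulated annealing procedure from the abstract: heterozygous sites are numbered 0..num_sites-1, each sequenced clone is represented by the set of heterozygous sites it covers, and sites joined by a chain of clones fall into one phased block. The function name `haplotype_blocks` and the example clones are hypothetical.

```python
def haplotype_blocks(num_sites, fragments):
    """Group heterozygous sites into blocks that fragments can phase together.

    `fragments` is an iterable of collections of site indices; all sites
    covered by one fragment (clone) are linked. Returns a sorted list of
    blocks, each a sorted list of site indices.
    """
    parent = list(range(num_sites))

    def find(x):
        # union-find lookup with path halving
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for sites in fragments:
        sites = list(sites)
        for s in sites[1:]:
            root_a, root_b = find(sites[0]), find(s)
            if root_a != root_b:
                parent[root_b] = root_a

    blocks = {}
    for s in range(num_sites):
        blocks.setdefault(find(s), []).append(s)
    return sorted(blocks.values())


if __name__ == "__main__":
    short_only = [[0, 1], [1, 2], [4, 5]]
    print(haplotype_blocks(6, short_only))
    # one clone with a long advance bridges two blocks
    print(haplotype_blocks(6, short_only + [[2, 4]]))
```

In this toy a single long-advance clone merges two blocks, which is the intuition behind the reported gain in median haplotype length; a real assembly also has to resolve sequencing errors before trusting each link.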



Haplotype Association Mapping Identifies a Candidate Gene Region in Mice Infected With Staphylococcus aureus.  


Exposure to Staphylococcus aureus has a variety of outcomes, from asymptomatic colonization to fatal infection. Strong evidence suggests that host genetics play an important role in susceptibility, but the specific host genetic factors involved are not known. The availability of genome-wide single nucleotide polymorphism (SNP) data for inbred Mus musculus strains means that haplotype association mapping can be used to identify candidate susceptibility genes. We applied haplotype association mapping to Perlegen SNP data and kidney bacterial counts from Staphylococcus aureus-infected mice from 13 inbred strains and detected an associated block on chromosome 7. Strong experimental evidence supports the result: a separate study demonstrated the presence of a susceptibility locus on chromosome 7 using consomic mice. The associated block contains no genes, but lies within the gene cluster of the 26-member extended kallikrein gene family, whose members have well-recognized roles in the generation of antimicrobial peptides and the regulation of inflammation. Efficient mixed-model association (EMMA) testing of all SNPs with two alleles and located within the gene cluster boundaries finds two significant associations: one of the three polymorphisms defining the associated block and one in the gene closest to the block, Klk1b11. In addition, we find that 7 of the 26 kallikrein genes are differentially expressed between susceptible and resistant mice, including the Klk1b11 gene. These genes represent a promising set of candidate genes influencing susceptibility to Staphylococcus aureus. PMID:22690378

Johnson, Nicole V; Ahn, Sun Hee; Deshmukh, Hitesh; Levin, Mikhail K; Nelson, Charlotte L; Scott, William K; Allen, Andrew; Fowler, Vance G; Cowell, Lindsay G



Genetic stock structure of walleye pollock (Theragra chalcogramma) inferred by PCR-RFLP analysis of the mitochondrial DNA and SNP analysis of nuclear DNA.  


Walleye pollock, Theragra chalcogramma, is one of the most important species in the North Pacific and Bering Sea ecosystems. However, the genetic population structure of walleye pollock is uncertain. In the present study, genetic variation of walleye pollock collected in several spawning areas ranging from the Japan Sea to the Gulf of Alaska was investigated by DNA analysis. Three regions, the spacer control region, the ND5 and ND6 region (ND complex), and the ND1 and 16S rRNA region (rDNA complex), were amplified using the polymerase chain reaction (PCR). Restriction fragment length polymorphism (RFLP) analysis was conducted on these PCR products and composite haplotypes were calculated. Furthermore, several nuclear DNA regions (actin, Calmodulin, S7 ribosomal protein, creatine kinase, and SypI gene) were investigated to study the stock structure of walleye pollock. The Calmodulin gene was considered a good genetic marker; therefore we conducted SNP analysis of the Calmodulin gene using SNaPshot kits. In the RFLP analyses, there were no area-specific fragment patterns in the three mtDNA regions (control region, ND complex and rDNA complex). However, the compositions of the fragment patterns for the three digested sets, control region/HinfI, rDNA complex/MspI and ND complex/MspI, indicated significant differences between the waters around Japan (Sado-Funka Bay-Wakkanai-Rausu) and the Bering Sea (Western Bering Sea-Nabarin-Atka I.-Bogoslof I). Furthermore, the haplotype frequency composition also showed a significant genetic difference between the two areas. Moreover, in the Calmodulin analyses, haplotype compositions changed gradually from the western area to the eastern area, and the results of the AMOVA showed that there are interesting differences between the western Pacific, western Bering Sea, and eastern Bering Sea. Judging from these results, it was considered that there are three populations of walleye pollock in the Northern Ocean. However, an area-specific pattern was not found in some populations in the Northern Ocean. Therefore, we suggest that these populations are related by weak gene flow, and that walleye pollock forms a meta-population around Japan and the Bering Sea. PMID:22897958

Yanagimoto, Takashi; Kitamura, Toru; Kobayashi, Takanori



Efficient mining of haplotype patterns for linkage disequilibrium mapping.  


Effective identification of disease-causing gene locations can have significant impact on patient management decisions that will ultimately increase survival rates and improve the overall quality of health care. Linkage disequilibrium mapping is the process of finding disease gene locations through comparisons of haplotype frequencies between disease chromosomes and normal chromosomes. This work presents a new method for linkage disequilibrium mapping. The main advantage of the proposed algorithm, called LinkageTracker, is its consistency in producing good predictive accuracy under different conditions, including extreme conditions where the occurrence of disease samples with the mutation of interest is very low and there is presence of error or noise. We compared our method with some leading methods in linkage disequilibrium mapping such as HapMiner, Blade, GeneRecon, and Haplotype Pattern Mining (HPM). Experimental results show that for a substantial class of problems, our method has good predictive accuracy while taking reasonably short processing time. Furthermore, LinkageTracker does not require any population ancestry information about the disease and the genealogy of the haplotypes. Therefore, it is useful for linkage disequilibrium mapping when the users do not have such information about their datasets. PMID:21155024

Lin, Li; Wong, Limsoon; Leong, Tze-Yun; Lai, Poh San
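
The abstract above defines linkage disequilibrium mapping as a comparison of haplotype frequencies between disease and normal chromosomes. The sketch below shows only that basic comparison for one haplotype pattern, using the standard Pearson chi-square statistic for a 2x2 table without continuity correction. It is not the LinkageTracker algorithm; the function name `haplotype_chi_square` and the counts are hypothetical.

```python
def haplotype_chi_square(case_with, case_total, control_with, control_total):
    """Pearson chi-square (1 d.f.) for one haplotype pattern.

    case_with / control_with: number of disease / normal chromosomes that
    carry the pattern; *_total: number of chromosomes in each group.
    """
    a, b = case_with, case_total - case_with
    c, d = control_with, control_total - control_with
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("a row or column of the 2x2 table is empty")
    return n * (a * d - b * c) ** 2 / denom


if __name__ == "__main__":
    # pattern seen on 30 of 100 disease and 10 of 100 normal chromosomes
    print(haplotype_chi_square(30, 100, 10, 100))
```

A mapping method would evaluate many candidate patterns along the chromosome and account for the resulting multiple testing; values above about 3.84 correspond to p < 0.05 for a single test with one degree of freedom.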



The Association of the Angiotensinogen Gene with Insulin Sensitivity in Humans: A Tagging SNPs and Haplotype Approach  

PubMed Central

Objective The purpose of this study was to clarify the association of the angiotensinogen gene (AGT) with insulin sensitivity using SNP and haplotype analyses in a Caucasian cohort. Material and Methods A candidate gene association study was conducted in Caucasians with and without hypertension (N=449). Seventeen single nucleotide polymorphisms (SNPs) of the AGT gene and their haplotypes were analyzed for an association with HOMA-IR. Multivariate regression model accounting for age, gender, BMI, hypertension status, study site, and sibling relatedness was used to test the hypothesis. Results Nine of the seventeen SNPs were significantly associated with lower HOMA-IR levels. Homozygous minor allele carriers of the most significant SNP rs2493134 (GG), a surrogate for the gain of function mutation rs699 [AGT p.M268T], had significantly lower HOMA-IR levels (p=0.0001) than heterozygous or homozygous major allele carriers (GC, AA). Direct genotyping of rs699 in a subset of the population showed similar results with minor allele carriers exhibiting significantly decreased HOMA-IR levels (p=0.003). Haplotype analysis demonstrated that haplotypes rs2493137A|rs5050A|rs3789678G|rs2493134A and rs2004776G|rs11122576A|rs699T|rs6687360G were also significantly associated with HOMA-IR (p=0.0009, p=0.02) and these results were driven by rs2493134 and rs699. Conclusion This study confirms an association between the AGT gene and insulin sensitivity in Caucasian humans. Haplotype analysis extends this finding and implicates SNPs rs2493134 and rs699 as the most influential. Thus, AGT gene variants, previously shown to be associated with AGT levels, are also associated with insulin sensitivity; suggesting a relationship between the AGT gene, AGT levels, and insulin sensitivity in humans.

Underwood, Patricia C.; Sun, Bei; Williams, Jonathan S.; Pojoga, Luminita H.; Raby, Benjamin; Lasky-Su, Jessica; Hunt, Steven; Hopkins, Paul N.; Jeunemaitre, Xavier; Adler, Gail K.; Williams, Gordon H.



Frequency of genetic polymorphisms of ADAM33 and their association with allergic rhinitis among Jordanians.  


Allergic rhinitis is a chronic inflammatory disease that is assumed to be due to an interaction between different genetic and/or environmental factors. A disintegrin and metalloprotease domain 33 (ADAM33) has been extensively studied as a susceptibility gene in asthma and has been linked to bronchial hyper-responsiveness. In this study, we investigated the association between ADAM33 single nucleotide polymorphisms and the incidence of allergic rhinitis among the Jordanian population. We conducted a case-control association study on 120 adult individuals diagnosed with allergic rhinitis and 128 normal healthy controls. Eight single-nucleotide polymorphisms in ADAM33 were genotyped using the PCR-RFLP method. No significant differences in the allelic frequencies of any of the SNPs tested were found between AR patients and the control volunteers, although the S2 C/G SNP showed a tendency toward significance with P=0.06. At the genotype level, significant associations were found for the following genotypes: T1 AA, T1 AG, T2 GG, T2 AG, T+1 GG, T+1 AG, V4 CG, S2 CC, S2 CG, Q-1 AA. Seven haplotypes were present only within AR patients and eight haplotypes were completely absent from the AR patients. Three haplotypes exhibited significant association with AR (P≤0.05); two of them were present only in AR patients. In conclusion, the polymorphisms in the ADAM33 gene are associated with susceptibility to AR in the Jordanian population. Furthermore, the haplotypes of the tested SNPs were also associated with the risk of AR. PMID:24035932

Zihlif, Malek; Mahafza, Tareq; Obeidat, Nathir M; Froukh, Tawfiq; Shaban, Mazen; Al-Akhras, Fatima M; Zihlif, Nadwa; Naffa, Randa



The Role of Osteopontin (OPN/SPP1) Haplotypes in the Susceptibility to Crohn's Disease  

PubMed Central

Background Osteopontin represents a multifunctional molecule playing a pivotal role in chronic inflammatory and autoimmune diseases. Its expression is increased in inflammatory bowel disease (IBD). The aim of our study was to analyze the association of osteopontin (OPN/SPP1) gene variants in a large cohort of IBD patients. Methodology/Principal Findings Genomic DNA from 2819 Caucasian individuals (n = 841 patients with Crohn's disease (CD), n = 473 patients with ulcerative colitis (UC), and n = 1505 healthy unrelated controls) was analyzed for nine OPN SNPs (rs2728127, rs2853744, rs11730582, rs11739060, rs28357094, rs4754 = p.Asp80Asp, rs1126616 = p.Ala236Ala, rs1126772 and rs9138). Considering the important role of osteopontin in Th17-mediated diseases, we performed analysis for epistasis with IBD-associated IL23R variants and analyzed serum levels of the Th17 cytokine IL-22. For four OPN SNPs (rs4754, rs1126616, rs1126772 and rs9138), we observed significantly different distributions between male and female CD patients. rs4754 was protective in male CD patients (p = 0.0004, OR = 0.69). None of the other investigated OPN SNPs was associated with CD or UC susceptibility. However, several OPN haplotypes showed significant associations with CD susceptibility. The strongest association was found for a haplotype consisting of the 8 OPN SNPs rs2728127-rs2853744-rs11730582-rs11439060-rs28357094-rs112661-rs1126772-rs9138 (omnibus p-value = 2.07×10^-8). Overall, the mean IL-22 secretion in the combined group of OPN minor allele carriers with CD was significantly lower than that of CD patients with OPN wildtype alleles (p = 3.66×10^-5). There was evidence for weak epistasis between the OPN SNP rs28357094 and the IL23R SNP rs10489629 (p = 4.18×10^-2) and between OPN SNP rs1126616 and IL23R SNP rs2201841 (p = 4.18×10^-2), but none of these associations remained significant after Bonferroni correction.
Conclusions/Significance Our study identified OPN haplotypes as modifiers of CD susceptibility, while the combined effects of certain OPN variants may modulate IL-22 secretion.

Bayrle, Corinna; Wetzke, Martin; Fries, Christoph; Tillack, Cornelia; Olszak, Torsten; Beigel, Florian; Steib, Christian; Friedrich, Matthias; Diegelmann, Julia; Czamara, Darina; Brand, Stephan
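
The Bonferroni correction mentioned in the abstract can be sketched as below. The abstract does not state how many epistasis tests were performed, so the number of tests used here is a hypothetical value chosen only to show why a nominal p-value of 4.18×10^-2 does not survive correction.

```python
def bonferroni(p_values, alpha=0.05):
    """Return (adjusted_p, is_significant) for each p-value in a family of tests.

    The adjusted p-value is min(1, p * m), where m is the number of tests;
    a test is significant when p <= alpha / m.
    """
    m = len(p_values)
    results = []
    for p in p_values:
        adjusted = min(1.0, p * m)
        results.append((adjusted, p <= alpha / m))
    return results


# Assumption for illustration: the reported p-value is one of 10 tests
# (the other nine p-values are made up).
family = [0.0418] + [0.5] * 9
adjusted_p, significant = bonferroni(family)[0]
```

With any family of two or more tests, 0.0418 exceeds the per-test threshold of 0.05 / m, which is consistent with the abstract's statement that the epistasis signals did not remain significant.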



Y chromosome haplotype analysis in Portuguese cattle breeds using SNPs and STRs.  


DNA samples from 307 males of 13 Portuguese native cattle breeds, 57 males of the 3 major exotic breeds in Portugal (Charolais, Friesian, and Limousin), and 5 Brahman (Bos indicus) were tested for 5 single nucleotide polymorphisms, 1 "indel," and 7 microsatellites specific to the Y chromosome. The 13 Y-haplotypes defined included 3 previously described patrilines (Y1, Y2, and Y3) and 10 new haplotypes within Bos taurus. Native cattle contained most of the diversity with 7 haplotypes (H2Y1, H3Y1, H5Y1, H7Y2, H8Y2, H10Y2, and H12Y2) found only in these breeds. H6Y2 and H11Y2 occurred in high frequency across breeds including the exotics. Introgression of Friesian cattle into Ramo Grande was inferred through their sharing of haplotype H4Y1. Among the native breeds, Mertolenga had the highest haplotype diversity (0.68 +/- 0.07), Brava de Lide was the least differentiated. The analyses of molecular variance showed significant (P < 0.0001) differences between breeds with more than 64% of the total genetic variation found among breeds within groups and 33-35% within breeds. The detection of INRA189-104 allele in 8 native breeds suggested influence of African cattle in breeds of the Iberian Peninsula. The presence in Portuguese breeds of Y1 patrilines, also found in aurochs, could represent more ancient local haplotypes. PMID:18832111

Ginja, Catarina; Telo da Gama, Luís; Penedo, Maria Cecilia T
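
Haplotype diversity values such as the 0.68 reported for Mertolenga are commonly computed with Nei's unbiased estimator, h = n/(n-1) × (1 − Σ p_i²). The abstract does not say which estimator the authors used, so the sketch below is an assumption, and the haplotype counts in the example are invented.

```python
from collections import Counter


def haplotype_diversity(haplotypes):
    """Nei's unbiased haplotype (gene) diversity for a sample of haplotype labels."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("at least two sampled haplotypes are required")
    counts = Counter(haplotypes)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


# Hypothetical sample: labels follow the abstract's naming, counts are made up.
sample = ["H6Y2"] * 5 + ["H11Y2"] * 3 + ["H4Y1"] * 2
h = haplotype_diversity(sample)
```

The estimator is 0 when every male carries the same Y-haplotype and 1 when every sampled haplotype is different.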



Evolution of mouse chromosome 17 and the origin of inversions associated with t haplotypes.  

PubMed Central

Mouse t haplotypes are variant forms of chromosome 17 that exist at high frequencies in worldwide populations of several species of house mouse. They are known to differ from wild-type chromosomes with respect to two relative inversions referred to as proximal and distal. An untested assumption has been that these two inversions originated in the chromosomal lineage leading to present-day t haplotypes. To investigate the evolutionary origins of these inversions and the possibility of additional inversions, interspecific crosses were performed between Mus spretus or Mus abbotti and laboratory strains of Mus domesticus that carried wild-type and t haplotype forms of chromosome 17. The results provide evidence for the existence of two additional nonoverlapping inversions--one between the proximal and distal inversions and one between the centromere and the proximal inversion. These four inversions span nearly the entire region of t haplotype recombination suppression. Considering the distribution of these inversions among the species studied as well as the organization of the D17Leh66 family of DNA elements, we infer that the proximal inversion occurred on the lineage leading to the common ancestor of M. domesticus and M. abbotti, and that the other three inversions occurred on the separate lineage leading to present-day t haplotypes. Alternative models for the evolution of t haplotypes are discussed in light of these findings.

Hammer, M F; Schimenti, J; Silver, L M



Y-chromosome polymorphisms and ethnic group - a combined STR and SNP approach in a population sample from northern Italy  

PubMed Central

Aim To find an association between Y chromosome polymorphisms and some ethnic groups. Methods Short tandem repeats (STR) and single-nucleotide polymorphisms (SNP) on the Y chromosome were typed in 311 unrelated men from four different ethnic groups – Italians from northern Italy, Albanians, Africans from the Maghreb region, and Indo-Pakistanis, using the AmpFlSTR® Yfiler PCR Amplification Kit and the SNaPshot Multiplex Kit. Results STRs analysis found 299 different haplotypes and SNPs analysis 11 different haplogroups. Haplotypes and haplogroups were analyzed and compared between different ethnic groups. Significant differences were found among all the population groups, except between Italians and Indo-Pakistanis and between Albanians and Indo-Pakistanis. Conclusions Typing both STRs and SNPs on the Y chromosome could become useful in determining ethnic origin of a potential suspect.

Cortellini, Venusia; Verzeletti, Andrea; Cerri, Nicoletta; Marino, Alberto; De Ferrari, Francesco



Exact coalescent simulation of new haplotype data from existing reference haplotypes  

PubMed Central

Motivation: We introduce a coalescent-based method (RECOAL) for the simulation of new haplotype data from a reference population of haplotypes. A coalescent genealogy for the reference haplotype data is sampled from the appropriate posterior probability distribution, then a coalescent genealogy is simulated which extends the sampled genealogy to include new haplotype data. The new haplotype data will, therefore, contain both some of the existing polymorphic sites and new polymorphisms added based on the structure of the simulated coalescent genealogy. This allows exact coalescent simulation of new haplotype data, compared with other methods which are more approximate in nature. Results: We demonstrate the performance of our method using a variety of data simulated under a coalescent model, before applying it to data from the 1000 Genomes project. Availability: The source code is freely available for download. Supplementary information: Supplementary data are available at Bioinformatics online.

Kang, Chul Joo; Marjoram, Paul



Low Diversity of T Haplotypes in the Eastern Form of the House Mouse, Mus Musculus L  

PubMed Central

In previous studies, 13 different recessive embryonic lethal genes have been associated with t haplotypes in the wild mice of the species Mus domesticus. In this communication we have analyzed five populations of Mus musculus for the presence and identity of t haplotypes. The populations occupy geographically distant regions in the Soviet Union: Altai Mountains, western and eastern Siberia, Azerbaijan and Turkmenistan. No t haplotypes were found in mice from eastern Siberia. In the remaining four populations, t haplotypes occurred with frequencies ranging from 0.07 to 0.21. All the t haplotypes extracted from these populations and analyzed by the genetic complementation test were shown to carry the same lethal gene tcl-w73. In one population (that of western Siberia), another lethal gene (tcl-w5) was found to be present on the same chromosome as tcl-w73. This situation is in striking contrast to that found in the populations of the western form of the house mouse, M. domesticus. In the latter species, tcl-w73 has not been found at all and the different populations are characterized by the presence of several different lethal genes. The low diversity of t haplotypes in M. musculus is consistent with lower genetic variability of other traits and indicates a different origin and speciation mode compared to M. domesticus. Serological typing for H-2 antigenic determinants suggests that most, if not all, of the newly described t haplotypes might have arisen by recombination of t(w73) from M. musculus with t haplotypes from M. domesticus either in the hybrid zone between the two species or in regions where the two species mixed accidentally.

Ruvinsky, A.; Polyakov, A.; Agulnik, A.; Tichy, H.; Figueroa, F.; Klein, J.



Comparing genotyping algorithms for Illumina's Infinium whole-genome SNP BeadChips  

PubMed Central

Background Illumina's Infinium SNP BeadChips are extensively used in both small and large-scale genetic studies. A fundamental step in any analysis is the processing of raw allele A and allele B intensities from each SNP into genotype calls (AA, AB, BB). Various algorithms which make use of different statistical models are available for this task. We compare four methods (GenCall, Illuminus, GenoSNP and CRLMM) on data where the true genotypes are known in advance and data from a recently published genome-wide association study. Results In general, differences in accuracy are relatively small between the methods evaluated, although CRLMM and GenoSNP were found to consistently outperform GenCall. The performance of Illuminus is heavily dependent on sample size, with lower no call rates and improved accuracy as the number of samples available increases. For X chromosome SNPs, methods with sex-dependent models (Illuminus, CRLMM) perform better than methods which ignore gender information (GenCall, GenoSNP). We observe that CRLMM and GenoSNP are more accurate at calling SNPs with low minor allele frequency than GenCall or Illuminus. The sample quality metrics from each of the four methods were found to have a high level of agreement at flagging samples with unusual signal characteristics. Conclusions CRLMM, GenoSNP and GenCall can be applied with confidence in studies of any size, as their performance was shown to be invariant to the number of samples available. Illuminus on the other hand requires a larger number of samples to achieve comparable levels of accuracy and its use in smaller studies (50 or fewer individuals) is not recommended.
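
The step of turning raw allele A and allele B intensities into AA, AB, BB calls can be illustrated with a deliberately naive threshold caller. This is not how GenCall, Illuminus, GenoSNP or CRLMM work (each fits a statistical model); the thresholds, the minimum-signal cutoff and the function name are assumptions made for the sketch.

```python
import math


def call_genotype(a_intensity, b_intensity, low=0.25, high=0.75, min_signal=0.1):
    """Naive genotype call from non-negative allele A and allele B intensities.

    theta = (2 / pi) * atan2(B, A) maps the intensity pair onto [0, 1]:
    values near 0 suggest AA, near 1 suggest BB, and in between AB.
    Samples whose total intensity is below min_signal get a no call ("NC").
    """
    total = a_intensity + b_intensity
    if total < min_signal:
        return "NC"
    theta = (2.0 / math.pi) * math.atan2(b_intensity, a_intensity)
    if theta < low:
        return "AA"
    if theta > high:
        return "BB"
    return "AB"


# Hypothetical intensity pairs for four samples at one SNP.
calls = [call_genotype(a, b) for a, b in [(1.0, 0.05), (0.9, 1.1), (0.02, 1.3), (0.01, 0.02)]]
```

Fixed thresholds like these ignore SNP-specific cluster positions, which is one reason model-based callers are compared in studies such as the one above.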



Ribosomal DNA haplotype distribution of Bursaphelenchus xylophilus in Kyushu and Okinawa islands, Japan.  


Ribosomal DNA region sequences (partial 18S, 28S and complete ITS1, 5.8S, and ITS2) of the pinewood nematode (Bursaphelenchus xylophilus) were obtained from DNA extracted directly from wood pieces collected from wilted pine trees throughout the Kyushu and Okinawa islands, Japan. Either a 2569bp or 2573bp sequence was obtained from 88 of 143 samples. Together with the 45 rDNA sequences of pinewood nematode isolates previously reported, there were eight single nucleotide polymorphisms and two indels of two bases. Based on these mutations, nine haplotypes were estimated. The haplotype frequencies differed among regions in Kyushu island (northwest, northeast and center, southeast, and southwest), and the distribution was consistent with the invasion and spreading routes of the pinewood nematode previously estimated from past records of pine wilt and wood importation. There was no significant difference in haplotype frequencies among the collection sites on Okinawa island. PMID:22736814

Nose, Mine; Shiraishi, Susumu; Miyahara, Fumihiko; Ohira, Mineko; Matsunaga, Koji; Tobase, Masashi; Koyama, Takao; Yoshimoto, Kikuo



Haplotype analysis of Norwegian and Swedish patients with acute intermittent porphyria (AIP): Extreme haplotype heterogeneity for the mutation R116W.  


Acute intermittent porphyria (AIP), the most common of the acute porphyrias, is caused by mutations in the gene encoding hydroxymethylbilane synthase (HMBS) also called porphobilinogen deaminase (PBGD). The mutation spectrum in the HMBS gene is characterized by a majority of family specific mutations. Among the exceptions are R116W and W198X, with high prevalence in both the Dutch and Swedish populations. These two mutations were also detected in unrelated Norwegian patients. Thus, Norwegian and Swedish patients were haplotyped using closely linked flanking microsatellites and intragenic single nucleotide polymorphisms (SNPs) to see if the high frequency of these two mutations is due to a founder effect. Twelve intragenic SNPs were determined by a method based on fluorescent restriction enzyme fingerprinting single-strand conformation polymorphism (F-REF-SSCP). W198X occurred exclusively on one haplotype in both Norwegian and Swedish patients, showing that it has originated from a common gene source. In contrast, R116W was found on three different haplotypes in three Norwegian families, and in five Swedish families on four or five haplotypes. This extreme haplotype heterogeneity indicates that R116W is a recurrent mutation, maybe explained by the high mutability of CpG dinucleotides. This can also explain why it is the only AIP mutation reported to occur in seven different populations (Norway, Sweden, Finland, Netherlands, France, Spain and South Africa). PMID:14757946

Tjensvoll, Kjersti; Bruland, Ove; Floderus, Ylva; Skadberg, Øyvind; Sandberg, Sverre; Apold, Jaran


PCR-based detection of mtDNA haplotypes of native and invading mussels on the northeastern Pacific coast: latitudinal pattern of invasion  

Microsoft Academic Search

Frequencies of mitochondrial haplotypes characteristic of native Mytilus trossulus and introduced M. galloprovincialis were determined in populations along the west coast of North America from San Diego, California, to the Aleutian Islands, Alaska. We also identified the haplotypes of mussels cultured from larvae arriving in Coos Bay, Oregon, during 1988–1990 from sites in Japan in the seawater ballast of ocean-going

J. B. Geller; J. T. Carlton; D. A. Powers



Analysis of the adequate size of a cord blood bank and comparison of HLA haplotype distributions between four populations.  


The number of units and especially the number of different HLA haplotypes present in a cord blood (CB) bank is a crucial determinant of its usefulness. We generated data relevant to the development of our national CB in Finland. The HLA haplotype distribution was examined between specific populations. We developed graphical ways of data presentation that enable easy visualization of differences. First, we estimated the optimal size of a CB bank for Finland and found that approximately 1700 units are needed to provide a 5/6 HLA-matched donor for 80% of Finnish patients. Secondly, we evaluated HLA haplotype distributions between four locations, Finland, Japan, Sweden and Belgium. Our results showed that the Japanese Tokyo Cord Blood Bank differs in both the frequency and distribution of haplotypes from the European banks. The European banks (Finnish Cord Blood Registry, The Swedish National Cord Blood Bank, and Marrow Donor Program-Belgium) have similar frequencies of common haplotypes, but 26% of the haplotypes in the Finnish CB bank are unique, which justifies the existence of a national bank. The tendency to a homogenous HLA haplotype distribution in banks underlines the need for targeting recruitment at the poorly represented minority populations. PMID:23137880

Haimila, Katri; Penttilä, Antti; Arvola, Anne; Auvinen, Marja-Kaisa; Korhonen, Matti



Mapping genes for resistance to Verticillium albo-atrum in tetraploid and diploid potato populations using haplotype association tests and genetic linkage analysis  

Microsoft Academic Search

Verticillium wilt disease of potato is caused predominantly by Verticillium albo-atrum and V. dahliae. StVe1, a putative QTL for resistance against V. dahliae, was previously mapped to potato chromosome 9. To develop allele-specific, SNP-based markers within the locus, the StVe1 fragment from a set of 30 North American potato cultivars was analyzed. Three distinct and highly diverse haplotypes can be

I. Simko; K. G. Haynes; E. E. Ewing; S. Costanzo; B. J. Christ; R. W. Jones



Survey of the Fragile X Syndrome CGG Repeat and the Short-Tandem-Repeat and Single-Nucleotide-Polymorphism Haplotypes in an African American Population  

Microsoft Academic Search

Summary Previous studies have shown that specific short-tandem-repeat (STR) and single-nucleotide-polymorphism (SNP)-based haplotypes within and among unaffected and fragile X white populations are found to be associated with specific CGG-repeat patterns. It has been hypothesized that these associations result from different mutational mechanisms, possibly influenced by the CGG structure and/or cis-acting factors. Alternatively, haplotype associations may result

Dana C. Crawford; Charles E. Schwartz; Kellen L. Meadows; James L. Newman; Lisa F. Taft; Chris Gunter; W. Ted Brown; Nancy J. Carpenter; Patricia N. Howard-Peebles; Kristin G. Monaghan; Sarah L. Nolin; Allan L. Reiss; Gerald L. Feldman; Elizabeth M. Rohlfs; Stephen T. Warren; Stephanie L. Sherman



A Haplotype Framework for Cystic Fibrosis Mutations in Iran  

PubMed Central

This is the first comprehensive profile of cystic fibrosis transmembrane conductance regulator (CFTR) mutations and their corresponding haplotypes in the Iranian population. All of the 27 CFTR exons of 60 unrelated Iranian CF patients were sequenced to identify disease-causing mutations. Eleven core haplotypes of CFTR were identified by genotyping six high-frequency simple nucleotide polymorphisms. The carrier frequency of 2.5 in 100 (1 in 40) was estimated from the frequency of heterozygous patients and suggests that contrary to popular belief, cystic fibrosis may be a common, under-diagnosed disease in Iran. A heterogeneous mutation spectrum was observed at the CFTR locus in 60 cystic fibrosis (CF) patients from Iran. Twenty putative disease-causing mutations were identified on 64 (53%) of the 120 chromosomes. The five most common Iranian mutations together represented 37% of the expected mutated alleles. The most frequent mutation, ΔF508 (p.F508del), represented only 16% of the expected mutated alleles. The next most frequent mutations were c.1677del2 (p.515fs) at 7.5%, c.4041C>G (p.N1303K) at 5.6%, c.2183AA>G (p.684fs) at 5%, and c.3661A>T (p.K1177X) at 2.5%. Three of the five most frequent Iranian mutations are not included in a commonly used panel of CF mutations, underscoring the importance of identifying geographic-specific mutations in this population.

Elahi, Elahe; Khodadad, Ahmad; Kupershmidt, Ilya; Ghasemi, Fereshteh; Alinasab, Babak; Naghizadeh, Ramin; Eason, Robert G.; Amini, Mahshid; Esmaili, Mehran; Esmaeili Dooki, Mohammad R.; Sanati, Mohammad H.; Davis, Ronald W.; Ronaghi, Mostafa; Thorstenson, Yvonne R.



Evidence for a SULT4A1 haplotype correlating with baseline psychopathology and atypical antipsychotic response  

PubMed Central

Aim This study evaluated the impact of SULT4A1 gene variation on psychopathology and antipsychotic drug response in Caucasian subjects from the Clinical Antipsychotic Trials of Intervention Effectiveness (CATIE) study and a replication sample. Patients & methods SULT4A1 haplotypes were determined using SNP data. The relationship to baseline psychopathology was evaluated using linear regression of Positive and Negative Syndrome Scale (PANSS) total score. Drug response was evaluated using Mixed Model Repeat Measures (MMRM) for change in PANSS. Results For the CATIE sample, patients carrying a haplotype designated SULT4A1-1(+) displayed higher baseline PANSS (p = 0.03) and, when treated with olanzapine, demonstrated a significant interaction with time (p = 0.009) in the MMRM. SULT4A1-1(+) patients treated with olanzapine displayed improved response compared with SULT4A1-1(−) patients treated with olanzapine (p = 0.008) or with SULT4A1-1(+) patients treated with risperidone (p = 0.006). In the replication sample, SULT4A1-1(+) patients treated with olanzapine demonstrated greater improvement than SULT4A1-1(−) patients treated with olanzapine (p = 0.05) or than SULT4A1-1(+) patients treated with risperidone (p = 0.05). Conclusion If validated, determination of SULT4A1-1 haplotype status might be useful for identifying patients who show an enhanced response to long-term olanzapine treatment.

Ramsey, Timothy L; Meltzer, Herbert Y; Brock, Guy N; Mehrotra, Bharat; Jayathilake, Karu; Bobo, William V; Brennan, Mark D



Estimation of Pairwise Identity by Descent From Dense Genetic Marker Data in a Population Sample of Haplotypes  

PubMed Central

I present a new approach for calculating probabilities of identity by descent for pairs of haplotypes. The approach is based on a joint hidden Markov model for haplotype frequencies and identity by descent (IBD). This model allows for linkage disequilibrium, and the method can be applied to very dense marker data. The method has high power for detecting IBD tracts of genetic length of 1 cM, with the use of sufficiently dense markers. This enables detection of pairwise IBD between haplotypes from individuals whose most recent common ancestor lived up to 50 generations ago.

Browning, Sharon R.



Bis-SNP: Combined DNA methylation and SNP calling for Bisulfite-seq data  

PubMed Central

Bisulfite treatment of DNA followed by high-throughput sequencing (Bisulfite-seq) is an important method for studying DNA methylation and epigenetic gene regulation, yet current software tools do not adequately address single nucleotide polymorphisms (SNPs). Identifying SNPs is important for accurate quantification of methylation levels and for identification of allele-specific epigenetic events such as imprinting. We have developed a model-based bisulfite SNP caller, Bis-SNP, that results in substantially better SNP calls than existing methods, thereby improving methylation estimates. At an average 30× genomic coverage, Bis-SNP correctly identified 96% of SNPs using the default high-stringency settings. The open-source package is available at



SNP-SNP Interactions Discovered by Logic Regression Explain Crohn's Disease Genetics  

PubMed Central

In genome-wide association studies (GWAS), the association between each single nucleotide polymorphism (SNP) and a phenotype is assessed statistically. To further explore genetic associations in GWAS, we considered two specific forms of biologically plausible SNP-SNP interactions, ‘SNP intersection’ and ‘SNP union,’ and analyzed the Crohn's Disease (CD) GWAS data of the Wellcome Trust Case Control Consortium for these interactions using a limited form of logic regression. We found strong evidence of CD-association for 195 genes, identifying novel susceptibility genes (e.g., ISX, SLCO6A1, TMEM183A) as well as confirming many previously identified susceptibility genes in CD GWAS (e.g., IL23R, NOD2, CYLD, NKX2-3, IL12RB2, ATG16L1). Notably, 37 of the 59 chromosomal locations indicated for CD-association by a meta-analysis of CD GWAS, involving over 22,000 cases and 29,000 controls, were represented in the 195 genes, as well as some chromosomal locations previously indicated only in linkage studies, but not in GWAS. We repeated the analysis with two smaller GWASs from the Database of Genotype and Phenotype (dbGaP): in spite of differences of populations and study power across the three datasets, we observed some consistencies across the three datasets. Notable examples included TMEM183A and SLCO6A1 which exhibited strong evidence consistently in our WTCCC and both of the dbGaP SNP-SNP interaction analyses. Examining these specific forms of SNP interactions could identify additional genetic associations from GWAS. R codes, data examples, and a ReadMe file are available for download from our website:

Liu, Qi; Yanai, Hideki; Sharaf Eldin, Noha; Kreiter, Erin; Wu, Xuan; Jabbari, Shahab; Tokunaga, Katsushi; Yasui, Yutaka
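
The two interaction forms named in the abstract, 'SNP intersection' and 'SNP union', can be read as Boolean AND and OR combinations of per-SNP indicators. The abstract does not define the exact coding, so the sketch below assumes a binary indicator per individual (True if a risk genotype is carried at that SNP); the function names and data are hypothetical, and this is not the authors' R code.

```python
def snp_intersection(indicator_a, indicator_b):
    """AND combination: True only where an individual is flagged at both SNPs."""
    return [a and b for a, b in zip(indicator_a, indicator_b)]


def snp_union(indicator_a, indicator_b):
    """OR combination: True where an individual is flagged at either SNP."""
    return [a or b for a, b in zip(indicator_a, indicator_b)]


# Hypothetical indicators for four individuals at two SNPs.
snp1 = [True, True, False, False]
snp2 = [True, False, True, False]
both = snp_intersection(snp1, snp2)
either = snp_union(snp1, snp2)
```

Each combined indicator can then be tested for association with case/control status in the same way as a single SNP.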



Haplotype-based analysis of MDR1/ABCB1 gene polymorphisms in a Turkish population.  


The three functional single-nucleotide polymorphisms (SNPs) of the MDR1 gene, C1236T, G2677T/A, and C3435T, exhibit an interpopulation difference. In this study, we analyzed the haplotype frequencies of these three SNPs in 107 unrelated healthy Turkish subjects and compared them with those of other reported populations. We found that C1236T, G2677T/A, and C3435T SNPs are expected to be structured in 10 different haplotypes, with 4 prominent haplotypes T-T-T (33.7%), C-G-C (25.0%), T-G-C (10.9%), and C-T-T (8.7%). There was a statistically significant linkage disequilibrium between all C1236T, G2677T/A, and C3435T SNPs (p < 0.0001); however, our results indicated that only loci 2677 and 3435 show relatively strong linkage disequilibrium (Lewontin's coefficient [D'] = 0.74, Pearson's correlation [r²] = 0.47). The haplotype frequency distribution of our study group was found to be significantly different from that in Han Chinese, Uygur Chinese, Kazakh Chinese, Indian, Malay, Japanese, Caucasian, and Ashkenazi Jewish populations (p < 0.0001). The results of this study may contribute to population-specific haplotype data on the MDR1 gene and may serve as a basis for studies on response to P-glycoprotein substrate drugs as well as for future association studies of certain diseases in the Turkish population. PMID:20025534

Gümüş-Akay, Güvem; Rüstemoğlu, Aydin; Karadağ, Aynur; Sunguroğlu, Asuman
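
The two linkage disequilibrium measures quoted in the abstract, Lewontin's D' and the squared correlation r², can be computed for a pair of biallelic loci from one haplotype frequency and the two allele frequencies. The sketch below uses the standard textbook definitions; the input frequencies in the example are hypothetical and are not taken from the study.

```python
def ld_stats(p_ab, p_a, p_b):
    """Return (D, D', r^2) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A at locus 1 and allele B at locus 2
    p_a:  frequency of allele A at locus 1
    p_b:  frequency of allele B at locus 2
    """
    d = p_ab - p_a * p_b
    # D is normalised by the largest magnitude it could take given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2


# Hypothetical frequencies for illustration only.
d, d_prime, r2 = ld_stats(p_ab=0.4, p_a=0.5, p_b=0.6)
```

D' can be high while r² stays moderate, as in the reported pair (D' = 0.74, r² = 0.47), because r² also depends on how closely the allele frequencies at the two loci match.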



Gains in power for exhaustive analyses of haplotypes using variable-sized sliding window strategy: a comparison of association-mapping strategies  

PubMed Central

Linkage disequilibrium (LD)-based association mapping is often performed by analyzing either individual SNPs or block-based multi-SNP haplotypes. Sliding windows of several fixed sizes (in terms of SNP numbers) were also applied to a few simulated or real data sets. In comparison, exhaustive testing based on variable-sized sliding windows (VSW) of all possible sizes of SNPs over a genomic region has the best chance to capture the optimum markers (single SNPs or haplotypes) that are most significantly associated with the traits under study. However, the cost is the increased number of multiple tests and computation. Here, a strategy of VSW of all possible sizes is proposed and its power is examined, in comparison with those using only haplotype blocks (BLK) or single SNP loci (SGL) tests. Critical values for statistical significance testing that account for multiple testing are simulated. We demonstrated that, over a wide range of parameters simulated, VSW increased power for the detection of disease variants by ~1–15% over the BLK and SGL approaches. The improved performance was more significant in regions with high recombination rates. In an empirical data set, VSW obtained the most significant signal and identified the LRP5 gene as strongly associated with osteoporosis. With the use of computational techniques such as parallel algorithms and cluster computing, it is feasible to apply VSW to large genomic regions or those regions preliminarily identified by traditional SGL/BLK methods.

Guo, Yanfang; Li, Jian; Bonham, Aaron J; Wang, Yuping; Deng, Hongwen
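
The cost of exhaustive variable-sized sliding windows is easy to see by enumerating them: a region of n SNPs has n(n+1)/2 contiguous windows, each of which is a candidate single-SNP or haplotype test. The sketch below only enumerates windows; the function names are assumptions, and the association test applied to each window is left out.

```python
def variable_sized_windows(n_snps):
    """Yield (start, end) index pairs, end exclusive, for every contiguous SNP window."""
    for size in range(1, n_snps + 1):
        for start in range(0, n_snps - size + 1):
            yield (start, start + size)


def window_count(n_snps):
    """Number of contiguous windows over n_snps markers: n(n+1)/2."""
    return n_snps * (n_snps + 1) // 2


# For a region of 4 SNPs: 4 single-SNP windows, 3 of size 2, 2 of size 3, 1 of size 4.
windows = list(variable_sized_windows(4))
```

Because the count grows quadratically with the number of SNPs, the multiple-testing burden and run time grow quickly, which is why the abstract points to simulated critical values and parallel computation.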



The SNP Consortium website: past, present and future  

Microsoft Academic Search

The SNP Consortium website (http://...) has undergone many changes since its initial conception three years ago. The database back end has been changed from the venerable ACeDB to the more scalable MySQL engine. Users can access the data via gene or single nucleotide polymorphism (SNP) keyword searches and browse or dump SNP data to textfiles. A graphical genome browsing interface

Gudmundur A. Thorisson; Lincoln Stein



A SNP transferability survey within the genus Vitis  

PubMed Central

Background Efforts to sequence the genomes of different organisms continue to increase. The DNA sequence is usually decoded for one individual and its application is for the whole species. The recent sequencing of the highly heterozygous Vitis vinifera L. cultivar Pinot Noir (clone ENTAV 115) genome gave rise to several thousand polymorphisms and offers a good model to study the transferability of its degree of polymorphism to other individuals of the same species and within the genus. Results This study was performed by genotyping 137 SNPs through the SNPlex™ Genotyping System (Applied Biosystems Inc.) and by comparing the SNPlex sequencing results across 35 (of the 137) regions from 69 grape accessions. A heterozygous state transferability of 31.5% across the unrelated cultivars of V. vinifera, of 18.8% across the wild forms of V. vinifera, of 2.3% among non-vinifera Vitis species, and of 0% with Muscadinia rotundifolia was found. In addition, mean allele frequencies were used to evaluate SNP informativeness and develop useful subsets of markers. Conclusion Using SNPlex application and corroboration from the sequencing analysis, the informativeness of SNP markers from the heterozygous grape cultivar Pinot Noir was validated in V. vinifera (including cultivars and wild forms), but had a limited application for non-vinifera Vitis species where a resequencing strategy may be preferred, knowing that homology at priming sites is sufficient. This work will allow future applications such as mapping and diversity studies, accession identification and genomic-research assisted breeding within V. vinifera.

Vezzulli, Silvia; Micheletti, Diego; Riaz, Summaira; Pindo, Massimo; Viola, Roberto; This, Patrice; Walker, M Andrew; Troggio, Michela; Velasco, Riccardo



Polymorphisms of cardiac presynaptic alpha2C adrenergic receptors: Diverse intragenic variability with haplotype-specific functional effects.  


The presynaptic alpha2C adrenergic receptors (AR) act to inhibit norepinephrine release in cardiac and other presynaptic nerves. We have recently shown that a genetic variant in the alpha2CAR coding region (Del322-325), which renders the receptor partially uncoupled from Gi, is a risk factor for heart failure. However, variability of heart failure phenotypes and a dominance of Del322-325 in those of African descent led us to hypothesize that other regions of this gene have functional polymorphisms. In a multiethnic population, we found 20 polymorphisms within 4,625 bp of contiguous sequence of this intronless gene encompassing the promoter, 5' UTR, coding, and 3' UTR. These polymorphisms occur in 24 distinct haplotypes with complex organizations, including multiple 5'-upstream polymorphisms in regions known to direct expression, a 3' UTR substitution polymorphism within an insertion/deletion sequence, and the radical coding polymorphism that deletes four amino acids. Relatively low linkage disequilibrium between many polymorphisms, few cosmopolitan haplotypes, prevalent ethnic-specific haplotypes, and substantial genetic divergence among haplotypes was noted. The dysfunctional Del322-325 allele was partitioned into multiple haplotypes, with frequencies of 48% to 2%. The functional implications of the haplotypes were ascertained by whole-gene transfections of human neuronal cells, where haplotype was significantly related (P < 0.001) to expression levels of receptor transcript and protein. Expression varied by as much as approximately 50% by haplotype, and such studies enabled haplotype clustering by phenotypic, rather than genotypic, similarities. Thus, depending on phenotype, expression-specific haplotypes may amplify, attenuate, or dominate the cardiomyopathic effect attributed to the alpha2CDel322-325 marker. PMID:15319474

Small, Kersten M; Mialet-Perez, Jeanne; Seman, Carrie A; Theiss, Cheryl T; Brown, Kari M; Liggett, Stephen B



Polymorphisms of cardiac presynaptic α2C adrenergic receptors: Diverse intragenic variability with haplotype-specific functional effects  

PubMed Central

The presynaptic α2C adrenergic receptors (AR) act to inhibit norepinephrine release in cardiac and other presynaptic nerves. We have recently shown that a genetic variant in the α2CAR coding region (Del322-325), which renders the receptor partially uncoupled from Gi, is a risk factor for heart failure. However, variability of heart failure phenotypes and a dominance of Del322-325 in those of African descent led us to hypothesize that other regions of this gene have functional polymorphisms. In a multiethnic population, we found 20 polymorphisms within 4,625 bp of contiguous sequence of this intronless gene encompassing the promoter, 5′ UTR, coding, and 3′ UTR. These polymorphisms occur in 24 distinct haplotypes with complex organizations, including multiple 5′-upstream polymorphisms in regions known to direct expression, a 3′ UTR substitution polymorphism within an insertion/deletion sequence, and the radical coding polymorphism that deletes four amino acids. Relatively low linkage disequilibrium between many polymorphisms, few cosmopolitan haplotypes, prevalent ethnic-specific haplotypes, and substantial genetic divergence among haplotypes was noted. The dysfunctional Del322-325 allele was partitioned into multiple haplotypes, with frequencies of 48% to 2%. The functional implications of the haplotypes were ascertained by whole-gene transfections of human neuronal cells, where haplotype was significantly related (P < 0.001) to expression levels of receptor transcript and protein. Expression varied by as much as ~50% by haplotype, and such studies enabled haplotype clustering by phenotypic, rather than genotypic, similarities. Thus, depending on phenotype, expression-specific haplotypes may amplify, attenuate, or dominate the cardiomyopathic effect attributed to the α2CDel322-325 marker.

Small, Kersten M.; Mialet-Perez, Jeanne; Seman, Carrie A.; Theiss, Cheryl T.; Brown, Kari M.; Liggett, Stephen B.



Diversity of 26-locus Y-STR haplotypes in a Nepalese population sample: Isolation and drift in the Himalayas  

Microsoft Academic Search

Twenty-six Y-chromosomal short tandem repeat (STR) loci were amplified in a sample of 769 unrelated males from Nepal, using two multiplex polymerase chain reaction (PCR) assays. The 26 loci gave a discriminating power of 0.997, with 59% unique haplotypes, and the highest frequency haplotype occurring 12 times. We identified novel alleles at four loci, microvariants at a further two, and

Emma J. Parkin; Thirsa Kraayenbrink; Jean Robert M. L. Opgenort; George L. van Driem; Nirmal Man Tuladhar; Peter de Knijff; Mark A. Jobling



Low frequency of mutated methylenetetrahydrofolate reductase 677C-->T and 1298A-->C genetics single nucleotide polymorphisms (SNPs) in Sub-Saharan populations.  


5,10-Methylenetetrahydrofolate reductase (MTHFR) and methionine synthase (MTR) are two of the key enzymes in the folate/vitamin B12-dependent remethylation of homocysteine to methionine. The frequencies of MTHFR single nucleotide polymorphisms (SNPs), 677C-->T, 1298A-->C, 1317T-->C and of MTR, 2756A-->G, have been widely studied in Caucasians, but they have never been reported simultaneously in a large population from Sub-Saharan Africa. Presently, we report the prevalence of these SNPs and their relationship to homocysteine in 240 subjects recruited in West Africa. The frequencies of the mutant genotypes 677TT (0.8%) and 1298CC (2%) were lower than those usually observed in Caucasians, while the frequency of the mutant 1317CC was higher (16%). We found a systematic association of the mutated MTHFR 677C-->T SNP with a 1298A/1317T common haplotype. The MTHFR mutant genotype 677TT was associated with an intermediate hyperhomocysteinemia (92.4 +/- 6.0 micromol/l) higher than that described in Caucasians. The 2756A-->G SNP in the MTR was similarly distributed in Africans compared to Caucasians. In conclusion, the MTHFR 677TT or 1298CC genotypes are much rarer in Africans than in Caucasians. The low frequency of 677TT may be related to the strong effect of this mutation on homocysteine metabolism in the environmental conditions of this African region. PMID:12964809

Adjalla, Charles E; Amouzou, Emile K; Sanni, Ambaliou; Abdelmouttaleb, Idrissia; Chabi, Nicodème W; Namour, Fares; Soussou, Batoma; Guéant, Jean-Louis



Novel Tau Polymorphisms, Tau Haplotypes, and Splicing in Familial and Sporadic Frontotemporal Dementia  

Microsoft Academic Search

Background: A subset of familial cases (FTDP-17) of frontotemporal dementia (FTD) are caused by mutations in the tau gene. The role of tau gene mutations and haplotypes in sporadic FTD and the functional consequences of tau polymorphisms are unknown. Objectives: To investigate (1) the frequency of known FTDP-17 mutations in familial and sporadic FTD and compare these results with previous

Maria-Jesus Sobrido; Bruce L. Miller; Necat Havlioglu; Victoria Zhukareva; Zhihong Jiang; Ziad S. Nasreddine; Virginia M.-Y. Lee; Tiffany W. Chow; Kirk C. Wilhelmsen; Jeffrey L. Cummings; Jane Y. Wu; Daniel H. Geschwind



Haplotypes and Linkage Disequilibrium at the Phenylalanine Hydroxylase Locus, PAH, in a Global Representation of Populations  

Microsoft Academic Search

Because defects in the phenylalanine hydroxylase gene (PAH) cause phenylketonuria (PKU), PAH was studied for normal polymorphisms and linkage disequilibrium soon after the gene was cloned. Studies in the 1980s concentrated on European populations in which PKU was common and showed that haplotype-frequency variation exists between some regions of the world. In European populations, linkage disequilibrium generally was found not

Judith R. Kidd; Andrew J. Pakstis; Hongyu Zhao; Ru-Band Lu; Friday E. Okonofua; Adekunle Odunsi; Elena Grigorenko; Batsheva Bonne-Tamir; Jonathan Friedlaender; Leslie O. Schulz; Josef Parnas; Kenneth K. Kidd



Haplotypes in the CRP Gene Associated with Increased BMI and Levels of CRP in Subjects with Type 2 Diabetes or Obesity from Southwestern Mexico  

PubMed Central

Objective. We evaluated the association between four polymorphisms in the CRP gene with circulating levels of C-reactive protein (CRP), type 2 diabetes (T2D), obesity, and risk score of coronary heart disease. Methods. We studied 402 individuals and classified them into four groups: healthy, obese, T2D obese, and T2D without obesity, from Guerrero, Southwestern Mexico. Blood levels of CRP, glucose, cholesterol, triglycerides, and leukocytes were measured. Genotyping was performed by PCR/RFLP, and the risk score for coronary heart disease was determined by the Framingham's methodology. Results. The TT genotype of SNP rs1130864 was associated with increased body mass index and T2D patients with obesity. We found that the haplotype 2 (TGAG) was associated with increased levels of CRP (β = 0.3; 95%CI: 0.1, 0.5; P = 0.005) and haplotype 7 (TGGG) with higher body mass index (BMI) (β = 0.2; 95%CI: 0.1, 0.3; P < 0.001). The risk score for coronary heart disease was associated with increased levels of CRP, but not with any polymorphism or haplotype. Conclusions. The association between the TT genotype of SNP rs1130864 with obesity and the haplotype 7 with BMI may explain how obesity and genetic predisposition increase the risk of diseases such as T2D in the population of Southwestern Mexico.

Martinez-Calleja, America; Quiroz-Vargas, Irma; Parra-Rojas, Isela; Munoz-Valle, Jose Francisco; Leyva-Vazquez, Marco A.; Fernandez-Tilapa, Gloria; Vences-Velazquez, Amalia; Cruz, Miguel; Salazar-Martinez, Eduardo; Flores-Alfaro, Eugenia



High-throughput and parallel SNP discovery in selected candidate genes in Eucalyptus camaldulensis using Illumina NGS platform.  


Next generation sequencing (NGS) technologies have revolutionized the pace and scale of genomics- and transcriptomics-based SNP discovery across different plant and animal species. Herein, 72-base paired-end Illumina sequencing was employed for high-throughput, parallel and large-scale SNP discovery in 41 growth-related candidate genes in Eucalyptus camaldulensis. Approximately 100 kb of genome from 96 individuals was amplified and sequenced using a hierarchical DNA/PCR pooling strategy and assembled over corresponding E. grandis reference. A total of 1191 SNPs (minimum 5% other allele frequency) were identified with an average frequency of 1 SNP/83.9 bp, whereas in exons and introns, it was 1 SNP/108.4 bp and 1 SNP/65.6 bp, respectively. A total of 75 insertions and 89 deletions were detected, of which approximately 15% were exonic. Transitions (Tr) were in excess of transversions (Tv) (Tr/Tv: 1.89), and the excess was greater in exons (Tr/Tv: 2.73). In exons, synonymous SNPs (Ka) prevailed over the non-synonymous SNPs (Ks; average Ka/Ks ratio: 0.72, range: 0-3.00 across genes). Many of the exonic SNPs/indels had potential to change amino acid sequence of respective genes. Transcription factors appeared more conserved, whereas enzyme coding genes appeared under relaxed control. Further, 541 SNPs were classified into 196 'equal frequency' (EF) blocks with almost similar minor allele frequencies to facilitate selection of one tag-SNP/EF-block. There were 241 (approximately 20%) 'zero-SNP' blocks with absence of SNPs in surrounding ±60 bp windows. The data thus indicated enormous extant and unexplored diversity in E. camaldulensis in the studied genes with potential applications for marker-trait associations. PMID:22607345

Hendre, Prasad S; Kamalakannan, R; Varghese, Mohan
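
As a small illustration of the Tr/Tv statistic reported in the abstract above, the sketch below classifies each SNP as a transition (purine to purine, or pyrimidine to pyrimidine) or a transversion and takes the ratio of the counts. The function names and the (ref, alt) tuple input format are assumptions made for this example; the abstract does not describe how the authors computed the ratio.

```python
from typing import Iterable, Tuple

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def is_transition(ref: str, alt: str) -> bool:
    """True for A<->G or C<->T substitutions, False for transversions."""
    ref, alt = ref.upper(), alt.upper()
    if ref == alt or not {ref, alt} <= (PURINES | PYRIMIDINES):
        raise ValueError(f"not a valid single-base substitution: {ref}>{alt}")
    return {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES


def tr_tv_ratio(snps: Iterable[Tuple[str, str]]) -> float:
    """Ratio of transition count to transversion count for (ref, alt) pairs."""
    snps = list(snps)
    transitions = sum(1 for ref, alt in snps if is_transition(ref, alt))
    transversions = len(snps) - transitions
    if transversions == 0:
        raise ValueError("no transversions observed; ratio is undefined")
    return transitions / transversions


if __name__ == "__main__":
    # Two transitions (A>G, C>T) and one transversion (A>C).
    print(tr_tv_ratio([("A", "G"), ("C", "T"), ("A", "C")]))
```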



A Hidden Markov Model Combining Linkage and Linkage Disequilibrium Information for Haplotype Reconstruction and Quantitative Trait Locus Fine Mapping  

PubMed Central

Faithful reconstruction of haplotypes from diploid marker data (phasing) is important for many kinds of genetic analyses, including mapping of trait loci, prediction of genomic breeding values, and identification of signatures of selection. In human genetics, phasing most often exploits population information (linkage disequilibrium), while in animal genetics the primary source of information is familial (Mendelian segregation and linkage). We herein develop and evaluate a method that simultaneously exploits both sources of information. It builds on hidden Markov models that were initially developed to exploit population information only. We demonstrate that the approach improves the accuracy of allele phasing as well as imputation of missing genotypes. Reconstructed haplotypes are assigned to hidden states that are shown to correspond to clusters of genealogically related chromosomes. We show that these cluster states can directly be used to fine map QTL. The method is computationally effective at handling large data sets based on high-density SNP panels.

Druet, Tom; Georges, Michel



SNP500Cancer: a public resource for sequence validation and assay development for genetic variation in candidate genes.  


The SNP500Cancer Database provides sequence and genotype assay information for candidate single nucleotide polymorphisms (SNPs) useful in mapping complex diseases, such as cancer. The database is an integral component of the NCI's Cancer Genome Anatomy Project. SNP500Cancer provides bi-directional sequencing information on a set of control DNA samples derived from anonymized subjects (102 Coriell samples representing four self-described ethnic groups: African/African-American, Caucasian, Hispanic and Pacific Rim). All SNPs are chosen from public databases and reports, and the choice of genes includes a bias towards non-synonymous and promoter SNPs in genes that have been implicated in one or more cancers. The web site is searchable by gene, chromosome, gene ontology pathway and by known dbSNP ID. As of July 2003, the database contains over 3400 SNPs, 2490 of which have been sequenced in the SNP500Cancer population. For each analyzed SNP, gene location and over 200 bp of surrounding annotated sequence (including nearby SNPs) are provided, with frequency information in total and per subpopulation, and calculation of Hardy-Weinberg Equilibrium (HWE) for each subpopulation. Sequence validated SNPs with minor allele frequency > 5% are entered into a high-throughput pipeline for genotyping analysis to determine concordance for the same 102 samples. The website provides the conditions for validated genotyping assays. SNP500Cancer provides an invaluable resource for investigators to select SNPs for analysis, design genotyping assays using validated sequence data, choose selected assays already validated on one or more genotyping platforms, and select reference standards for genotyping assays. The SNP500Cancer Database is freely accessible via the web page at PMID:14681474

Packer, Bernice R; Yeager, Meredith; Staats, Brian; Welch, Robert; Crenshaw, Andrew; Kiley, Maureen; Eckert, Andrew; Beerman, Michael; Miller, Edward; Bergen, Andrew; Rothman, Nathaniel; Strausberg, Robert; Chanock, Stephen J
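
The abstract above mentions a per-subpopulation Hardy-Weinberg Equilibrium calculation without saying which test is used. The sketch below assumes the textbook chi-square goodness-of-fit test with one degree of freedom for a biallelic SNP; the database may well use a different procedure (for example an exact test), and the function name and return format are hypothetical.

```python
import math
from typing import Tuple


def hwe_chi_square(n_aa: int, n_ab: int, n_bb: int) -> Tuple[float, float, float]:
    """Chi-square HWE test from genotype counts of one biallelic SNP.

    Returns (frequency of allele A, chi-square statistic, p-value).
    Assumes 1 degree of freedom; an illustrative sketch only.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes supplied")
    p = (2 * n_aa + n_ab) / (2 * n)  # allele A frequency
    q = 1.0 - p
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # For 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return p, chi2, p_value


if __name__ == "__main__":
    print(hwe_chi_square(25, 50, 25))  # counts in exact HWE proportions
    print(hwe_chi_square(50, 0, 50))   # complete heterozygote deficit
```

With small samples such as the 102 individuals split over four groups, expected counts can be low and the chi-square approximation becomes unreliable, which is one reason an exact test is often preferred in practice.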



Estimating the effect of SNP genotype on quantitative traits from pooled DNA samples  

PubMed Central

Background Studies to detect associations between DNA markers and traits of interest in humans and livestock benefit from increasing the number of individuals genotyped. Performing association studies on pooled DNA samples can provide greater power for a given cost. For quantitative traits, the effect of an SNP is measured in the units of the trait and here we propose and demonstrate a method to estimate SNP effects on quantitative traits from pooled DNA data. Methods To obtain estimates of SNP effects from pooled DNA samples, we used logistic regression of estimated allele frequencies in pools on phenotype. The method was tested on a simulated dataset, and a beef cattle dataset using a model that included principal components from a genomic correlation matrix derived from the allele frequencies estimated from the pooled samples. The performance of the obtained estimates was evaluated by comparison with estimates obtained using regression of phenotype on genotype from individual samples of DNA. Results For the simulated data, the estimates of SNP effects from pooled DNA are similar but asymptotically different to those from individual DNA data. Error in estimating allele frequencies had a large effect on the accuracy of estimated SNP effects. For the beef cattle dataset, the principal components of the genomic correlation matrix from pooled DNA were consistent with known breed groups, and could be used to account for population stratification. Correctly modeling the contemporary group structure was essential to achieve estimates similar to those from individual DNA data, and pooling DNA from individuals within groups was superior to pooling DNA across groups. For a fixed number of assays, pooled DNA samples produced results that were more correlated with results from individual genotyping data than were results from one random individual assayed from each pool. 
Conclusions Use of logistic regression of allele frequency on phenotype makes it possible to estimate SNP effects on quantitative traits from pooled DNA samples. With pooled DNA samples, genotyping costs are reduced, and in cases where trait records are abundant this approach is promising to obtain SNP associations for marker-assisted selection.



is-rSNP: a novel technique for in silico regulatory SNP detection  

PubMed Central

Motivation: Determining the functional impact of non-coding disease-associated single nucleotide polymorphisms (SNPs) identified by genome-wide association studies (GWAS) is challenging. Many of these SNPs are likely to be regulatory SNPs (rSNPs): variations which affect the ability of a transcription factor (TF) to bind to DNA. However, experimental procedures for identifying rSNPs are expensive and labour intensive. Therefore, in silico methods are required for rSNP prediction. By scoring two alleles with a TF position weight matrix (PWM), it can be determined which SNPs are likely rSNPs. However, predictions in this manner are noisy and no method exists that determines the statistical significance of a nucleotide variation on a PWM score. Results: We have designed an algorithm for in silico rSNP detection called is-rSNP. We employ novel convolution methods to determine the complete distributions of PWM scores and ratios between allele scores, facilitating assignment of statistical significance to rSNP effects. We have tested our method on 41 experimentally verified rSNPs, correctly predicting the disrupted TF in 28 cases. We also analysed 146 disease-associated SNPs with no known functional impact in an attempt to identify candidate rSNPs. Of the 11 significantly predicted disrupted TFs, 9 had previous evidence of being associated with the disease in the literature. These results demonstrate that is-rSNP is suitable for high-throughput screening of SNPs for potential regulatory function. This is a useful and important tool in the interpretation of GWAS. Availability: is-rSNP software is available for use at: Contact:; Supplementary information: Supplementary data are available at Bioinformatics online.

Macintyre, Geoff; Bailey, James; Haviv, Izhak; Kowalczyk, Adam
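
The first step described in the abstract above, scoring both alleles of a SNP against a transcription factor position weight matrix, can be sketched as follows. This is only the raw log-odds scoring step under a uniform background; it does not reproduce the convolution-based significance calculation that is the contribution of is-rSNP. The three-column matrix, the function names and the uniform background are all assumptions for illustration.

```python
import math
from typing import Dict, List

Column = Dict[str, float]


def log_odds_pwm(prob_matrix: List[Column], background: float = 0.25) -> List[Column]:
    """Convert per-position base probabilities into log2 odds scores."""
    return [{b: math.log2(col[b] / background) for b in "ACGT"} for col in prob_matrix]


def pwm_score(pwm: List[Column], seq: str) -> float:
    """Sum the log-odds score of each base of seq at its position."""
    if len(seq) != len(pwm):
        raise ValueError("sequence length must equal PWM width")
    return sum(col[base] for col, base in zip(pwm, seq.upper()))


def allele_score_difference(pwm: List[Column], ref_seq: str, alt_seq: str) -> float:
    """Score of the alternate allele minus score of the reference allele."""
    return pwm_score(pwm, alt_seq) - pwm_score(pwm, ref_seq)


if __name__ == "__main__":
    # Hypothetical 3-bp motif that prefers the sequence ACG.
    probs = [
        {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
        {"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1},
        {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
    ]
    pwm = log_odds_pwm(probs)
    # SNP at the middle position: reference C, alternate T.
    print(allele_score_difference(pwm, "ACG", "ATG"))
```

A negative difference means the alternate allele matches the motif less well than the reference. As the abstract notes, such raw differences are noisy on their own, which motivates assigning them a statistical significance.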



A CRHR1 Haplotype Moderates the Effect of Adverse Childhood Experiences on Lifetime Risk of Major Depressive Episode in African-American Women  

PubMed Central

Background Adverse childhood experiences (ACEs) increase the risk for adult depression and substance dependence, possibly mediated by the corticotropin-releasing hormone type 1 receptor (CRHR1). In some studies, a three-SNP “T-A-T” haplotype in CRHR1, which encodes CRHR1, exerted a protective moderating effect on risk of depression in adults with ACEs. Other studies have shown a main or moderating effect of SNPs in CRHR1 on alcohol consumption. Methods We tested the moderating effects of the three-SNP haplotype on lifetime risk of a major depressive episode (MDE) and alcohol dependence (AD) in 1,211 European Americans (EAs) and 1,869 African Americans (AAs), most of whom had a lifetime substance use disorder. Results There were no significant main or interaction effects of the TAT haplotype on AD. There was a significant interaction of ACE by TAT on risk of depression only in AA women (p=0.005); each copy of the TAT haplotype reduced the odds of MDE by almost 40% (OR = 0.63). In AA women without an ACE and two TAT haplotypes, the risk of MDE was increased (OR=1.51). Conclusion Our findings in relation to the TAT haplotype of CRHR1 extend those obtained in other populations to a largely substance-dependent one. The complex structure of CRHR1 may help to explain why some variants in the gene moderate the effects of an ACE only on depression risk while others moderate the effect of an ACE only on AD risk.

Kranzler, Henry R.; Feinn, Richard; Nelson, Elliot C.; Covault, Jonathan; Anton, Raymond F.; Farrer, Lindsay; Gelernter, Joel



Modeling of Identity-by-Descent Processes Along a Chromosome Between Haplotypes and Their Genotyped Ancestors  

PubMed Central

Identity-by-descent probabilities are important for many applications in genetics. Here we propose a method for modeling the transmission of the haplotypes from the closest genotyped relatives along an entire chromosome. The method relies on a hidden Markov model where hidden states correspond to the set of all possible origins of a haplotype within a given pedigree. Initial state probabilities are estimated from average genetic contribution of each origin to the modeled haplotype while transition probabilities are computed from recombination probabilities and pedigree relationships between the modeled haplotype and the various possible origins. The method was tested on three simulated scenarios based on real data sets from dairy cattle, Arabidopsis thaliana, and maize. The mean identity-by-descent probabilities estimated for the truly inherited parental chromosome ranged from 0.94 to 0.98 according to the design and the marker density. The lowest values were observed in regions close to crossing over or where the method was not able to discriminate between several origins due to their similarity. It is shown that the estimated probabilities were correctly calibrated. For marker imputation (or QTL allele prediction for fine mapping or genomic selection), the method was efficient, with 3.75% allelic imputation error rates on a dairy cattle data set with a low marker density map (1 SNP/Mb). The method should prove useful for situations we are facing now in experimental designs and in plant and animal breeding, where founders are genotyped with relatively high markers densities and last generation(s) genotyped with a lower-density panel.

Druet, Tom; Farnir, Frederic Paul



A Bayesian Framework for SNP Identification  

SciTech Connect

Current proteomics techniques, such as mass spectrometry, focus on protein identification, usually ignoring most types of modifications beyond post-translational modifications, with the assumption that only a small number of peptides have to be matched to a protein for a positive identification. However, not all proteins are being identified with current techniques and improved methods to locate points of mutation are becoming a necessity. In the case when single-nucleotide polymorphisms (SNPs) are observed, brute force is the most common method to locate them, quickly becoming computationally unattractive as the size of the database associated with the model organism grows. We have developed a Bayesian model for SNPs, BSNP, incorporating evolutionary information at both the nucleotide and amino acid levels. Formulating SNPs as a Bayesian inference problem allows probabilities of interest to be easily obtained, for example the probability of a specific SNP or specific type of mutation over a gene or entire genome. Three SNP databases were observed in the evaluation of the BSNP model; the first SNP database is a disease specific gene in human, hemoglobin, the second is also a disease specific gene in human, p53, and the third is a more general SNP database for multiple genes in mouse. We validate that the BSNP model assigns higher posterior probabilities to the SNPs defined in all three separate databases than can be attributed to chance under specific evolutionary information, for example the amino acid model described by Majewski and Ott in conjunction with either the four-parameter nucleotide model by Bulmer or seven-parameter nucleotide model by Majewski and Ott.

Webb-Robertson, Bobbie-Jo M.; Havre, Susan L.; Payne, Deborah A.



SNP alleles in human disease and evolution  

Microsoft Academic Search

In two randomly selected human genomes, 99.9% of the DNA sequence is identical. The remaining 0.1% of DNA contains sequence variations. The most common type of such variation is called a single-nucleotide polymorphism, or SNP. SNPs are highly abundant, stable, and distributed throughout the genome. These variations are associated with diversity in the population, individuality, susceptibility to diseases, and

Barkur S Shastry



Analyzing cancer samples with SNP arrays.  


Single nucleotide polymorphism (SNP) arrays are powerful tools to delineate genomic aberrations in cancer genomes. However, the analysis of these SNP array data of cancer samples is complicated by three phenomena: (a) aneuploidy: due to massive aberrations, the total DNA content of a cancer cell can differ significantly from its normal two copies; (b) nonaberrant cell admixture: samples from solid tumors do not exclusively contain aberrant tumor cells, but always contain some portion of nonaberrant cells; (c) intratumor heterogeneity: different cells in the tumor sample may have different aberrations. We describe here how these phenomena impact the SNP array profile, and how these can be accounted for in the analysis. In an extended practical example, we apply our recently developed and further improved ASCAT (allele-specific copy number analysis of tumors) suite of tools to analyze SNP array data using data from a series of breast carcinomas as an example. We first describe the structure of the data, how it can be plotted and interpreted, and how it can be segmented. The core ASCAT algorithm next determines the fraction of nonaberrant cells and the tumor ploidy (the average number of DNA copies), and calculates an ASCAT profile. We describe how these ASCAT profiles visualize both copy number aberrations as well as copy-number-neutral events. Finally, we touch upon regions showing intratumor heterogeneity, and how they can be detected in ASCAT profiles. All source code and data described here can be found at our ASCAT Web site ( PMID:22130873

Van Loo, Peter; Nilsen, Gro; Nordgard, Silje H; Vollan, Hans Kristian Moen; Børresen-Dale, Anne-Lise; Kristensen, Vessela N; Lingjærde, Ole Christian



The extent of linkage disequilibrium and haplotype sharing around a polymorphic site.  

PubMed Central

Various expressions related to the length of a conserved haplotype around a polymorphism of known frequency are derived. We obtain exact expressions for the probability that no recombination has occurred in a sample or subsample. We obtain an approximation for the probability that no recombination that could give rise to a detectable recombination event (through the four-gamete test) has occurred. The probabilities can be used to obtain approximate distributions for the length of variously defined haplotypes around a polymorphic site. The implications of our results for data analysis, and in particular for detecting selection, are discussed.

Innan, Hideki; Nordborg, Magnus
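
The four-gamete test referred to in the abstract above has a simple operational form: under an infinite-sites model without recurrent mutation, observing all four allele combinations at a pair of biallelic sites implies at least one recombination event between them. The sketch below implements only that detection rule, not the probability expressions derived in the paper; the function names and the string encoding of haplotypes are assumptions.

```python
from itertools import combinations
from typing import List, Sequence, Tuple


def four_gametes_present(haplotypes: Sequence[str], i: int, j: int) -> bool:
    """True if all four allele combinations occur at biallelic sites i and j."""
    gametes = {(h[i], h[j]) for h in haplotypes}
    return len(gametes) == 4


def incompatible_site_pairs(haplotypes: Sequence[str]) -> List[Tuple[int, int]]:
    """List every pair of sites that fails the four-gamete test."""
    n_sites = len(haplotypes[0])
    return [
        (i, j)
        for i, j in combinations(range(n_sites), 2)
        if four_gametes_present(haplotypes, i, j)
    ]


if __name__ == "__main__":
    # Haplotypes coded 0/1 per site; sites 0 and 1 show all four gametes.
    sample = ["000", "010", "100", "110"]
    print(incompatible_site_pairs(sample))
```

Recombination events that do not produce a fourth gamete go undetected by this rule, which is why the abstract distinguishes "no recombination" from "no detectable recombination".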



A Tale of Two Haplotypes: The EDA2R/AR Intergenic Region Is the Most Divergent Genomic Segment between Africans and East Asians in the Human Genome.  


Single nucleotide polymorphisms (SNPs) with large allele frequency differences between human populations are relatively rare. The longest run of SNPs with an allele frequency difference of one between the Yoruba of Nigeria and the Han Chinese is found on the long arm of the X chromosome in the intergenic region separating the EDA2R and AR genes. It has been proposed that the unusual allele frequency distributions of these SNPs are the result of a selective sweep affecting African populations that occurred after the out-of-Africa migration. To investigate the evolutionary history of the EDA2R/AR intergenic region, we characterized the haplotype structure of 52 of its highly differentiated SNPs. Using a publicly available data set of 3,000 X chromosomes from 65 human populations, we found that nearly all human X chromosomes carry one of two modal haplotypes for these 52 SNPs. The predominance of two highly divergent haplotypes at this locus was confirmed by use of a subset of individuals sequenced to high coverage. The first of these haplotypes, the α-haplotype, is at high frequencies in most of the African populations surveyed and likely arose before the separation of African populations into distinct genetic entities. The second, the β-haplotype, is frequent or fixed in all non-African populations and likely arose in East Africa before the out-of-Africa migration. We also observed a small group of rare haplotypes with no clear relationship to the α- and β-haplotypes. These haplotypes occur at relatively high frequencies in African hunter-gatherer populations, such as the San and Mbuti Pygmies. Our analysis indicates that these haplotypes are part of a pool of diverse, ancestral haplotypes that have now been almost entirely replaced by the α- and β-haplotypes.
We suggest that the rise of the α- and β-haplotypes was the result of the demographic forces that human populations experienced during the formation of modern African populations and the out-of-Africa migration. However, we also present evidence that this region is the target of selection in the form of positive selection on the α- and β-haplotypes and of purifying selection against α/β recombinants. PMID:23959643

Casto, Amanda M; Henn, Brenna M; Kidd, Jeffery M; Bustamante, Carlos D; Feldman, Marcus W



Powerful haplotype-based Hardy-Weinberg equilibrium tests for tightly linked loci.  


Recently, many case-control studies have been proposed to test for association between haplotypes and disease, and these require the Hardy-Weinberg equilibrium (HWE) assumption for haplotype frequencies. As such, haplotype inference from unphased genotypes and the development of haplotype-based HWE tests are crucial prior to fine mapping. The goodness-of-fit test is a frequently used method to test for HWE at multiple tightly linked loci. However, its degrees of freedom increase dramatically with the number of loci, which can reduce the power of the test. Therefore, in this paper, to improve the test power for haplotype-based HWE, we first write out two likelihood functions of the observed data based on Niu's model (NM) and the inbreeding model (IM), respectively, each of which can cause departure from HWE. Then, we use two expectation-maximization algorithms and one expectation-conditional-maximization algorithm to estimate the model parameters under the HWE, IM and NM models, respectively. Finally, we propose the likelihood ratio tests LRT[Formula: see text] and LRT[Formula: see text] for haplotype-based HWE under the NM and IM models, respectively. We simulate the HWE, Niu's, inbreeding and population stratification models to assess the validity and compare the performance of these two LRT tests. The simulation results show that both tests control the type I error rates well in testing for haplotype-based HWE. If the NM model is true, then LRT[Formula: see text] is more powerful. If the true model is the IM model, then LRT[Formula: see text] has better power. Under the population stratification model, LRT[Formula: see text] is still more powerful. Overall, LRT[Formula: see text] is generally recommended. Application of the proposed methods to a rheumatoid arthritis data set further illustrates their utility for real data analysis. PMID:24167573

Mao, Wei-Gao; He, Hai-Qiang; Xu, Yan; Chen, Ping-Yan; Zhou, Ji-Yuan
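
The abstract above notes that the degrees of freedom of the goodness-of-fit test grow quickly with the number of loci. The abstract gives no formula, so the following is only a minimal sketch under stated assumptions: all loci are biallelic, every one of the 2^L haplotypes is treated as a distinct allele, and diplotypes are phase-known. Under those assumptions the usual multi-allelic count applies (diplotype categories minus one, minus the number of free haplotype frequencies). The function name is hypothetical.

```python
def hwe_gof_degrees_of_freedom(n_loci: int) -> int:
    """Degrees of freedom of a haplotype-level HWE goodness-of-fit test.

    Assumes n_loci biallelic loci, so there are 2**n_loci possible haplotypes,
    each treated like an allele of one multi-allelic locus (an assumption made
    for illustration, not taken from the abstract).
    """
    n_hap = 2 ** n_loci
    # Unordered pairs of haplotypes, including homozygous pairs.
    n_diplotypes = n_hap * (n_hap + 1) // 2
    # (categories - 1) minus the (n_hap - 1) haplotype frequencies estimated.
    return (n_diplotypes - 1) - (n_hap - 1)


for n_loci in range(1, 6):
    print(n_loci, hwe_gof_degrees_of_freedom(n_loci))
```

With one locus this reduces to the familiar single degree of freedom; the count then grows roughly with the square of the number of haplotypes, which is the motivation the abstract gives for lower-dimensional likelihood ratio tests.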



Association between MT-CO3 haplotypes and high-altitude adaptation in Tibetan chicken.  


Genetic mutation in the cytochrome c oxidase subunit III gene (MT-CO3) could influence the kinetics of cytochrome c oxidase (COX), which catalyzes oxygen transport capacity in oxidative phosphorylation. However, the potential relationship between MT-CO3 variants and high-altitude adaptation remains poorly understood in Tibetan chicken. Here, we sequenced the MT-CO3 gene of 125 Tibetan chickens and 144 Chinese domestic chickens from low-elevation areas (below 1000 m). Eight single nucleotide polymorphisms (SNPs) were detected, and five of them (m.10081A>G, m.10115G>A, m.10270G>A, m.10336A>G and m.10447C>T) were shared by Tibetan and lowland chickens, with significant differences in their respective allele frequencies. Nine haplotypes (H1-H9) were finally defined. Among them, haplotype H4 was positively associated with high-altitude adaptation, whereas haplotypes H6, H7 and H8 were negatively associated with it. The median-joining profile suggested that haplotype H5 was ancestral to the other haplotypes but had no significant relationship with high-altitude adaptation. However, haplotypes H4 and H5 differed only by the m.10081A>G mutation. Results also suggested that chickens carrying the A allele at m.10081A>G were over 2.6 times as likely to adapt to hypoxia as those carrying the G allele. This suggests that the synonymous mutation m.10081A>G may be a prerequisite for shaping high-altitude adaptation-specific haplotypes. PMID:23850731

Sun, Jing; Zhong, Hang; Chen, Shi-Yi; Yao, Yong-Gang; Liu, Yi-Ping



Genotype - phenotype analysis of angiotensinogen polymorphisms and essential hypertension: the importance of haplotypes  

PubMed Central

Objectives: To better understand the relationship between angiotensinogen (AGT) genetic variation and essential hypertension, AGT genotypes and haplotypes were tested for association with hypertensive endophenotypes and essential hypertension. Methods: 256 HyperPATH/SCOR cases and 126 controls were genotyped for 24 SNPs in the AGT gene. SNPs and AGT haplotypes were tested for association with plasma AGT, renal plasma flow, and essential hypertension. Results: New associations between essential hypertension, plasma AGT, and renal plasma flow (RPF) are reported for alleles -1178G, 6066A, 6152A, 6233C, and 12822C. The maximum odds ratio for association of hypertension and AGT genetic variation was 2.3 (95% CI 1.5 – 3.8; p < 0.0003) for allele 6233C. Previous associations for -1074T, -532T, -217A, -6A, and 4072C are confirmed (p < 0.05). Sodium depletion enhances associations between AGT SNPs and plasma AGT. Most individually associated SNPs, including -6A and 4072C, are found on a common complete AGT haplotype, H4 (frequency = 0.09). Individuals with haplotype H4 have significantly higher plasma AGT and reduced renal plasma flow (p < 0.003 and p < 0.0002, respectively). Other common haplotypes are not associated with plasma AGT levels in this data set despite the presence of the -6A and 4072C alleles, suggesting that AGT haplotype H4 is more predictive of elevated plasma AGT than is -6A or 4072C. Conclusions: This study demonstrates the importance of analyzing haplotypes in addition to single genotypes in association studies. By demonstrating the dependence of AGT associations on sodium depletion status, it helps to explain previous conflicting association results.

Watkins, W. Scott; Hunt, Steven C.; Williams, Gordon H.; Tolpinrud, Whitney; Jeunemaitre, Xavier; Lalouel, Jean-Marc



Analysis of polymorphisms and haplotype structure of the human thymidylate synthase genetic region: a tool for pharmacogenetic studies.  


5-Fluorouracil (5FU), a widely used chemotherapeutic drug, inhibits the DNA replicative enzyme thymidylate synthase (Tyms). Prior studies implicated a VNTR (variable number of tandem repeats) polymorphism in the 5'-untranslated region (5'-UTR) of the TYMS gene as a determinant of Tyms expression in tumors and normal tissues and proposed that these VNTR genotypes could help decide fluoropyrimidine dosing. Clinical associations between 5FU-related toxicity and the TYMS VNTR were reported; however, results were inconsistent, suggesting that additional genetic variation in the TYMS gene might influence Tyms expression. We thus conducted a detailed genetic analysis of this region, defining new polymorphisms in this gene, including mononucleotide (poly A:T) repeats and novel single nucleotide polymorphisms (SNPs) flanking the VNTR in the TYMS genetic region. Our haplotype analysis of this region used data from both established and novel genetic variants and found nine SNP haplotypes accounting for more than 90% of the studied population. We observed non-exclusive relationships between the VNTR and adjacent SNP haplotypes, such that each type of VNTR commonly occurred on several haplotype backgrounds. Our results confirmed the expectation that the VNTR alleles exhibit homoplasy and lack the common ancestry required for a reliable marker of a linked adjacent locus that might govern toxicity. We propose that it may be necessary in a clinical trial to assay multiple types of genetic polymorphisms in the TYMS region to meaningfully model linkage of genetic markers to 5FU-related toxicity. The presence of multiple long (up to 26 nt), polymorphic monothymidine repeats in the promoter region of the sole human thymidylate synthetic enzyme is intriguing. PMID:22496803

Ghosh, Soma; Hossain, M Zulfiquer; Borges, Michael; Goggins, Michael G; Ingersoll, Roxann G; Eshleman, James R; Klein, Alison P; Kern, Scott E



BDNF tagging polymorphisms and haplotype analysis in sporadic Parkinson's disease in diverse ethnic groups.  


Experimental and clinical data suggest that genetic variations in brain-derived neurotrophic factor (BDNF) gene may affect risk for Parkinson's disease (PD). We performed a case-control association analysis of BDNF in three independent Caucasian cohorts (Greek, North American, and Finnish) of PD using eight tagging SNPs and five constructed haplotypes. No statistically significant differences in genotype and allele frequencies were found between cases and controls in all series. A relatively rare BDNF haplotype showed a trend towards association in the Greek (p=0.02) and the Finnish (p=0.03) series (this haplotype was not detected in the North American series). However, given the large number of comparisons these associations are considered non-significant. In conclusion, our results do not provide statistically significant evidence that common genetic variability in BDNF would associate with the risk for PD in the Caucasian populations studied here. PMID:17229524

Xiromerisiou, G; Hadjigeorgiou, G M; Eerola, J; Fernandez, H H; Tsimourtou, V; Mandel, R; Hellström, O; Gwinn-Hardy, K; Okun, M S; Tienari, P J; Singleton, A B



QualitySNPng: a user-friendly SNP detection and visualization tool.  


QualitySNPng is a new software tool for the detection and interactive visualization of single-nucleotide polymorphisms (SNPs). It uses a haplotype-based strategy to identify reliable SNPs; it is optimized for the analysis of current RNA-seq data; but it can also be used on genomic DNA sequences derived from next-generation sequencing experiments. QualitySNPng does not require a sequenced reference genome and delivers reliable SNPs for di- as well as polyploid species. The tool features a user-friendly interface, multiple filtering options to handle typical sequencing errors, support for SAM and ACE files and interactive visualization. QualitySNPng produces high-quality SNP information that can be used directly in genotyping by sequencing approaches for application in QTL and genome-wide association mapping as well as to populate SNP arrays. The software can be used as a stand-alone application with a graphical user interface or as part of a pipeline system like Galaxy. Versions for Windows, Mac OS X and Linux, as well as the source code, are available from PMID:23632165

Nijveen, Harm; van Kaauwen, Martijn; Esselink, Danny G; Hoegen, Brechtje; Vosman, Ben



A high-density SNP genome-wide linkage scan in a large autism extended pedigree.  


We performed a high-density, single nucleotide polymorphism (SNP), genome-wide scan on a six-generation pedigree from Utah with seven affected males diagnosed with autism spectrum disorder. Using a two-stage linkage design, we first performed a nonparametric analysis on the entire genome using a 10K SNP chip to identify potential regions of interest. To confirm potentially interesting regions, we eliminated SNPs in high linkage disequilibrium (LD) using a principal components analysis (PCA) method and repeated the linkage results. Three regions met genome-wide significance criteria after controlling for LD: 3q13.2-q13.31 (nonparametric linkage (NPL), 5.58), 3q26.31-q27.3 (NPL, 4.85) and 20q11.21-q13.12 (NPL, 5.56). Two regions met suggestive criteria for significance: 7p14.1-p11.22 (NPL, 3.18) and 9p24.3 (NPL, 3.44). All five chromosomal regions are consistent with other published findings. Haplotype sharing results showed that five of the affected subjects shared more than a single chromosomal region of interest with other affected subjects. Although no common autism susceptibility genes were found for all seven autism cases, these results suggest that multiple genetic loci within these regions may contribute to the autism phenotype in this family, and further follow-up of these chromosomal regions is warranted. PMID:18283277

Allen-Brady, K; Miller, J; Matsunami, N; Stevens, J; Block, H; Farley, M; Krasny, L; Pingree, C; Lainhart, J; Leppert, M; McMahon, W M; Coon, H



Beta-thalassemia genes in French-Canadians: haplotype and mutation analysis of Portneuf chromosomes.  

PubMed Central

beta-Thalassemia minor occurs at approximately 1% frequency in French-Canadians--in families residing in Portneuf County (population approximately 40,000) of Quebec province. We found eight different RFLP haplotypes at the beta-globin gene cluster in 37 normal persons and in 12 beta-thalassemia heterozygotes from six families. beta-Thalassemia genes in these families associated with two haplotypes only: Mediterranean I and Mediterranean II. There were two different beta-thalassemia mutations segregating in the Portneuf population: an RNA processing mutation (beta(+)IVS-1,nt110) on haplotype I (five families) and a point mutation leading to chain termination (beta(0) nonsense codon 39) on haplotype II (one family). The distribution of 5' haplotypes on normal beta A Portneuf chromosomes compared with other European populations was most similar to that in British subjects (data for French subjects have not yet been reported). Genealogical reconstructions traced the ancestry of carrier couples to settlers emigrating from several different regions of France to New France in the 17th century. These findings indicate genetic diversity of a greater degree among French-Canadians than recognized heretofore.

Kaplan, F; Kokotsis, G; DeBraekeleer, M; Morgan, K; Scriver, C R



Familial Mediterranean Fever (FMF) in Moroccan Jews: Demonstration of a founder effect by extended haplotype analysis  

SciTech Connect

Familial Mediterranean fever (FMF) is an autosomal recessive disease causing attacks of fever and serositis. The FMF gene (designated MEF) is on 16p, with the gene order 16 cen-D16S80-MEF-D16S94-D16S283-D16S291-16pter. Here the authors report the association of FMF susceptibility with alleles at D16S94, D16S283, and D16S291 among 31 non-Ashkenazi Jewish families, 14 of them Moroccan. For the non-Moroccans, only the allelic association at D16S94 approached statistical significance. Haplotype analysis showed that 18/25 Moroccan FMF chromosomes, versus 0/21 noncarrier chromosomes, bore a specific haplotype for D16S94-D16S283-D16S291. Among non-Moroccans this haplotype was present in 6/26 FMF chromosomes versus 1/28 controls. Both groups of families are largely descended from Jews who fled the Spanish Inquisition. The strong haplotype association seen among the Moroccans is most likely a founder effect, given the recent origin and genetic isolation of the Moroccan Jewish community. The lower haplotype frequency among non-Moroccan carriers may reflect differences both in history and in population genetics. 28 refs., 1 fig., 3 tabs.

Aksentijevich, I.; Pras, E.; Helling, S.; Prosen, L.; Kastner, D.L.; Gruberg, L.; Pras, M. (Heller Institute for Medical Research, Tel-Hashomer (Israel))



An EM algorithm and testing strategy for multiple-locus haplotypes  

Microsoft Academic Search

This paper gives an expectation maximization (EM) algorithm to obtain allele frequencies, haplotype frequencies, and gametic disequilibrium coefficients for multiple-locus systems. It permits high polymorphism and null alleles at all loci. This approach effectively deals with the primary estimation problems associated with such systems; that is, there is not a one-to-one correspondence between phenotypic and genotypic categories, and sample sizes

J. C. Long; R. C. Williams; M. Urbanek
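
The expectation maximization idea summarized in the abstract above can be illustrated with a much simpler case than the one the paper treats. The following is a minimal sketch, assuming two biallelic loci, unphased genotypes coded as counts of allele 1, and no null alleles; it is not the authors' multiple-locus algorithm, and the function name and data layout are hypothetical. Only double heterozygotes have ambiguous phase, so the E-step splits them between the two possible haplotype pairs in proportion to the current frequency estimates, and the M-step recounts haplotypes.

```python
from collections import Counter


def em_two_locus(genotypes, max_iter=500, tol=1e-12):
    """Estimate two-locus haplotype frequencies from unphased genotypes.

    genotypes: list of (g1, g2), each the count (0, 1 or 2) of allele 1
    at locus 1 and locus 2. Returns a dict keyed by (allele1, allele2).
    """
    haps = [(0, 0), (0, 1), (1, 0), (1, 1)]
    base = Counter()  # haplotype counts from phase-unambiguous individuals
    n_double_het = 0
    for g1, g2 in genotypes:
        if g1 == 1 and g2 == 1:
            n_double_het += 1  # phase unknown: 11/00 or 10/01
            continue
        # With at most one heterozygous locus the two haplotypes are determined.
        base[(int(g1 >= 1), int(g2 >= 1))] += 1
        base[(int(g1 == 2), int(g2 == 2))] += 1

    total = 2 * len(genotypes)
    freq = {h: 0.25 for h in haps}
    for _ in range(max_iter):
        # E-step: expected share of double heterozygotes carrying 11/00.
        cis = freq[(1, 1)] * freq[(0, 0)]
        trans = freq[(1, 0)] * freq[(0, 1)]
        w = cis / (cis + trans) if cis + trans > 0 else 0.5
        # M-step: recount haplotypes using the expected phase assignments.
        counts = {h: float(base[h]) for h in haps}
        counts[(1, 1)] += n_double_het * w
        counts[(0, 0)] += n_double_het * w
        counts[(1, 0)] += n_double_het * (1 - w)
        counts[(0, 1)] += n_double_het * (1 - w)
        new_freq = {h: counts[h] / total for h in haps}
        converged = max(abs(new_freq[h] - freq[h]) for h in haps) < tol
        freq = new_freq
        if converged:
            break
    return freq


sample = [(2, 2)] * 3 + [(0, 0)] * 3 + [(1, 1)] * 4
estimate = em_two_locus(sample)
print(estimate)
```

In the sample data the unambiguous individuals carry only the 1-1 and 0-0 haplotypes, so the iterations push the ambiguous double heterozygotes toward the same phase. Extending this to many loci, multiple alleles, or null alleles, as the paper describes, requires enumerating all compatible haplotype pairs per individual.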



Accuracy of Haplotype Reconstruction from Haplotype-Tagging Single-Nucleotide Polymorphisms  

PubMed Central

Many investigators are now using haplotype-tagging single-nucleotide polymorphisms (htSNPs) as a way of screening regions of the genome for association with disease. A common approach is to genotype htSNPs in a study population and to use this information to draw inferences about each individual’s haplotypic makeup, including SNPs that were not directly genotyped. To test the validity of this approach, we simulated the exercise of typing htSNPs in a large sample of individuals and compared the true and inferred haplotypes. The accuracy of haplotype inference varied, depending on the method of selecting htSNPs, the linkage-disequilibrium structure of the region, and the amount of missing data. At the stage of selection of htSNPs, haplotype-block–based methods required a larger number of htSNPs than did unstructured methods but gave lower levels of error in haplotype inference, particularly when there was a significant amount of missing data. We present a Web-based utility that allows investigators to compare the likely error rates of different sets of htSNPs and to arrive at an economical set of htSNPs that provides acceptable levels of accuracy in haplotype inference.

Forton, Julian; Kwiatkowski, Dominic; Rockett, Kirk; Luoni, Gaia; Kimber, Martin; Hull, Jeremy



Haplotypic polymorphisms of the TNFB gene  

Microsoft Academic Search

The TNFB genes from two major histocompatibility complex (MHC) ancestral haplotypes have been compared. The genes carried by the ancestral haplotypes 8.1 (A1,B8,BfS,C4AQ0, C4B1,DR3) and 57.1 (A1,B57, BfS,C4A6,C4B1,DR7) were cloned and sequenced to determine the degree of polymorphism. In this report we show that the respective TNF genes are allelic and have unique nucleotide sequences. The data demonstrate

Lawrence J. Abraham; Daisy Chin Du; Kamyar Zahedi; Roger L. Dawkins; Alexander S. Whitehead



SNP sets and reading ability: testing confirmation of a 10-SNP set in a population sample.  


A set of 10 SNPs associated with reading ability in 7-year-olds was reported based on initial pooled analyses of 100K SNP chip data, with follow-up testing stages using pooling and individual testing. Here we examine this association in an adolescent population sample of Australian twins and siblings (N = 1177) aged 12 to 25 years. One (rs1842129) of the 10 SNPs approached significance (P = .05) but no support was found for the remaining 9 SNPs or the SNP set itself. Results indicate that these SNPs are not associated with reading ability in an Australian population. The results are interpreted as supporting use of much larger SNP sets in common disorders where effects are small. PMID:21623652

Luciano, Michelle; Montgomery, Grant W; Martin, Nicholas G; Wright, Margaret J; Bates, Timothy C



Precise genetic mapping and haplotype analysis of the familial dysautonomia gene on human chromosome 9q31.  

PubMed Central

Familial dysautonomia (FD) is an autosomal recessive disorder characterized by developmental arrest in the sensory and autonomic nervous systems and by Ashkenazi Jewish ancestry. We previously had mapped the defective gene (DYS) to an 11-cM segment of chromosome 9q31-33, flanked by D9S53 and D9S105. By using 11 new polymorphic loci, we now have narrowed the location of DYS to <0.5 cM between the markers 43B1GAGT and 157A3. Two markers in this interval, 164D1 and D9S1677, show no recombination with the disease. Haplotype analysis confirmed this candidate region and revealed a major haplotype shared by 435 of 441 FD chromosomes, indicating a striking founder effect. Three other haplotypes, found on the remaining 6 FD chromosomes, might represent independent mutations. The frequency of the major FD haplotype in the Ashkenazim (5 in 324 control chromosomes) was consistent with the estimated DYS carrier frequency of 1 in 32, and none of the four haplotypes associated with FD was observed on 492 non-FD chromosomes from obligatory carriers. It is now possible to provide accurate genetic testing both for families with FD and for carriers, on the basis of close flanking markers and the capacity to identify >98% of FD chromosomes by their haplotype.

Blumenfeld, A; Slaugenhaupt, S A; Liebert, C B; Temper, V; Maayan, C; Gill, S; Lucente, D E; Idelson, M; MacCormack, K; Monahan, M A; Mull, J; Leyne, M; Mendillo, M; Schiripo, T; Mishori, E; Breakefield, X; Axelrod, F B; Gusella, J F
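
The abstract above states that 5 major-haplotype chromosomes among 324 controls is consistent with a carrier frequency of 1 in 32. A short calculation shows why. This is a sketch under an assumption the abstract does not state explicitly, namely Hardy-Weinberg proportions, so that the carrier frequency equals 2q(1 - q) for disease-allele frequency q. The variable and function names are hypothetical; the numbers come from the abstract.

```python
import math


def allele_freq_from_carrier_freq(carrier_freq):
    """Solve 2q(1 - q) = carrier_freq for the smaller root q (assumes HWE)."""
    return (1.0 - math.sqrt(1.0 - 2.0 * carrier_freq)) / 2.0


carrier_freq = 1 / 32        # estimated DYS carrier frequency
major_share = 435 / 441      # FD chromosomes bearing the major haplotype
control_chromosomes = 324

q = allele_freq_from_carrier_freq(carrier_freq)
# Expected number of control chromosomes carrying the major FD haplotype.
expected_major = q * major_share * control_chromosomes

print(f"disease allele frequency q = {q:.4f}")
print(f"major haplotype share of FD chromosomes = {major_share:.3f}")
print(f"expected major-haplotype chromosomes in controls = {expected_major:.2f}")
```

The expected count comes out close to the 5 chromosomes observed, and the major haplotype's share of FD chromosomes matches the ">98%" figure quoted for haplotype-based identification.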



Caprine CSN1S1 haplotype effect on gene expression and milk composition measured by Fourier transform infrared spectroscopy.  


The Norwegian dairy goat population has a high frequency of a CSN1S1 (alphaS1-casein) haplotype with negative effects on protein and fat content. It is characterized by a single point deletion in exon 12 of CSN1S1, leading to a truncated protein and hence a low content of alphaS1-casein in the milk. This haplotype together with another haplotype with a deletion in exon 9 are called "weak" haplotypes. "Strong" haplotypes, on the other hand, have positive effects on important milk production traits. We show that expression of CSN1S1 in the mammary gland of lactating goats is significantly lower in animals with 2 weak haplotypes. Moreover, the effects of defective alleles were not detected in animals having 1 strong and 1 weak haplotype. Expression levels of other genes in the casein cluster were not affected by the CSN1S1 haplotypes investigated. Milk samples from goats with 2 weak haplotypes could be distinguished from the other milk samples using Fourier transform infrared (FTIR) spectroscopy and partial least squares discriminant analysis (PLS-DA). The PLS-DA components were related to spectra of pure caseins and whey proteins, hence FTIR has a potential for identifying milk samples with low alphaS1-casein content and different protein composition. The results indicate that FTIR-based measurements can be incorporated into breeding plans, or for selection of milk samples with high casein content, which in turn may improve cheese-making properties of the milk. PMID:20723707

Berget, I; Martens, H; Kohler, A; Sjurseth, S K; Afseth, N K; Narum, B; Adnøy, T; Lien, S



Population structure of Y chromosome SNP haplogroups in the United States and forensic implications for constructing Y chromosome STR databases.  


A set of 61 Y chromosome single-nucleotide-polymorphisms (Y-SNPs) is typed in a sample of 2517 individuals from 38 populations to infer the geographic origins of Y chromosomes in the United States and to test for paternal admixture among African-, European-, Hispanic-, Asian-, and Native-Americans. All of the samples were previously typed with the 11 core U.S. Y chromosome short tandem repeats (Y-STRs) recommended by SWGDAM, which revealed high levels of among-ethnic-group variation and low levels of among-population-within-ethnic-group variation. Admixture estimates vary greatly among populations and ethnic groups. The frequencies of non-European (3.4%) and non-Asian (4.5%) Y chromosomes are generally low in European-American and Asian-American populations, respectively. The frequencies of European Y chromosomes in Native-American populations range widely (i.e., 7-89%) and follow a West to East gradient, whereas they are relatively consistent in African-American populations (26.4±8.9%) from different locations. The European (77.8±9.3%) and Native-American (13.7±7.4%) components of the Hispanic paternal gene pool are also relatively constant among geographic regions; however, the African contribution is much higher in the Northeast (10.5±6.4%) than in the Southwest (1.5±0.9%) or Midwest (0%). To test for the effects of inter-ethnic admixture on the structure of Y-STR diversity in the U.S., we perform subtraction analyses in which Y chromosomes inferred to be admixed by Y-SNP analysis are removed from the database and pairwise population differentiation tests are implemented on the remaining Y-STR haplotypes. Results show that low levels of heterogeneity previously observed between pairs of Hispanic-American populations disappear when African-derived chromosomes are removed from the analysis. 
This is not the case for an unusual sample of European-Americans from New York City when its African-derived chromosomes are removed, or for Native-American populations when European-derived chromosomes are removed. We infer that both inter-ethnic admixture and population structure in ancestral source populations may contribute to fine scale Y-STR heterogeneity within U.S. ethnic groups. PMID:16337103

Hammer, Michael F; Chamberlain, Veronica F; Kearney, Veronica F; Stover, Daryn; Zhang, Gina; Karafet, Tatiana; Walsh, Bruce; Redd, Alan J



Multi-ethnic distribution of clinically relevant CYP2C genotypes and haplotypes.  


To determine CYP2C19 and CYP2C8 allele frequencies, 28 coding and/or functional variants were genotyped in 1250 African-American, Asian, Caucasian, Hispanic and Ashkenazi Jewish (AJ) individuals. The combined CYP2C19 variant allele frequencies ranged from ~0.30 to 0.41; however, the CYP2C8 frequencies were much lower (~0.04-0.13). After incorporating previously reported CYP2C9 genotyping results from these populations (36 total CYP2C variants), 16 multi-ethnic CYP2C haplotypes were inferred with frequencies >0.5%. Notably, the 2C19*17-2C9*1-2C8*2 haplotype was identified among African-Americans (8%) and Hispanics (2%), indicating that CYP2C19*17 does not always tag a CYP2C haplotype that encodes efficient CYP2C-substrate metabolism. The 2C19*1-2C9*2-2C8*3 haplotype was identified in all populations except African-Americans and additional novel haplotypes were identified in selected populations (for example, 2C19*2-2C9*1-2C8*4 and 2C19*4B-2C9*1-2C8*1), together indicating that both CYP2C19*17 and *2 can be linked with other CYP2C loss-of-function alleles. These results have important implications for pharmacogenomic association studies involving the CYP2C locus and are clinically relevant when administering CYP2C-substrate medications. PMID:22491019

Martis, S; Peter, I; Hulot, J-S; Kornreich, R; Desnick, R J; Scott, S A



Brief communication: Y-chromosome haplotypes in Egypt.  


We analyzed Y-chromosome haplotypes in the Nile River Valley in Egypt in 274 unrelated males, using the p49a,f TaqI polymorphism. These individuals were born in three regions along the river: in Alexandria (the Delta and Lower Egypt), in Upper Egypt, and in Lower Nubia. Fifteen different p49a,f TaqI haplotypes are present in Egypt, the three most common being haplotype V (39.4%), haplotype XI (18.9%), and haplotype IV (13.9%). Haplotype V is a characteristic Arab haplotype, with a northern geographic distribution in Egypt in the Nile River Valley. Haplotype IV, characteristic of sub-Saharan populations, shows a southern geographic distribution in Egypt. PMID:12687584

Lucotte, G; Mercier, G



A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff  

PubMed Central

We describe a new computer program, SnpEff, for rapidly categorizing the effects of variants in genome sequences. Once a genome is sequenced, SnpEff annotates variants based on their genomic locations and predicts coding effects. Annotated genomic locations include intronic, untranslated region, upstream, downstream, splice site, or intergenic regions. Coding effects such as synonymous or non-synonymous amino acid replacement, start codon gains or losses, stop codon gains or losses, or frame shifts can be predicted. Here the use of SnpEff is illustrated by annotating ~356,660 candidate SNPs in ~117 Mb unique sequences, representing a substitution rate of ~1/305 nucleotides, between the Drosophila melanogaster w1118; iso-2; iso-3 strain and the reference y1; cn1 bw1 sp1 strain. We show that ~15,842 SNPs are synonymous and ~4,467 SNPs are non-synonymous (N/S ~0.28). The remaining SNPs are in other categories, such as stop codon gains (38 SNPs), stop codon losses (8 SNPs), and start codon gains (297 SNPs) in the 5′UTR. We found, as expected, that the SNP frequency is proportional to the recombination frequency (i.e., highest in the middle of chromosome arms). We also found that start-gain or stop-lost SNPs in Drosophila melanogaster often result in additions of N-terminal or C-terminal amino acids that are conserved in other Drosophila species. It appears that the 5′ and 3′ UTRs are reservoirs for genetic variations that change the termini of proteins during evolution of the Drosophila genus. As genome sequencing is becoming inexpensive and routine, SnpEff enables rapid analyses of whole-genome sequencing data to be performed by an individual laboratory.

Cingolani, Pablo; Platts, Adrian; Wang, Le Lily; Coon, Melissa; Nguyen, Tung; Wang, Luan; Land, Susan J.; Lu, Xiangyi; Ruden, Douglas M.



Geostatistical inference of main Y-STR-haplotype groups in Europe.  


We examined the multifarious genetic heterogeneity of Europe and neighboring regions from a geographical perspective. We created composite maps outlining the estimated geographical distribution of major groups of genetically similar individuals on the basis of forensic Y-chromosomal markers. We analyzed Y-chromosomal haplotypes composed of 7 highly polymorphic STR loci, genotyped for 33,010 samples, collected at 249 sites in Europe, Western Asia and North Africa, deposited in the YHRD database. The data set comprised 4176 different haplotypes, which we grouped into 20 clusters. For each cluster, the frequency per site was calculated. All geostatistical analysis was performed with the geographic information system GRASS-GIS. We interpolated frequency values across the study area separately for each cluster. Juxtaposing all 20 interpolated surfaces, we screened point by point for the highest cluster frequency and stored it together with the respective cluster label. We combined these two types of data in a composite map. We repeated this procedure for the second highest frequencies in Europe. Major groups were assigned to Northern, Western and Eastern Europe. North Africa formed a separate region; Southeastern Europe, Turkey and the Near East were divided into several regions. The spatial distribution of the groups accounting for the second highest frequencies in Europe overlapped with the territories of the largest countries. The genetic structure presented in the composite maps fits major historical geopolitical regions and is in agreement with previous studies of genetic frequencies, validating our approach. Our genetic geostatistical approach provides, on the basis of two composite maps, detailed evidence of the geographical distribution and relative frequencies of the most predominant groups of the extant male European population, examined on the basis of forensic Y-STR haplotypes. 
The existence of considerable genetic differences among geographic subgroups in Europe has important consequences for the statistical inference in forensic Y-STR haplotype analyses. PMID:20970399

Diaz-Lacava, Amalia; Walier, Maja; Willuweit, Sascha; Wienker, Thomas F; Fimmers, Rolf; Baur, Max P; Roewer, Lutz
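
The composite-map step in the record above (pick, at every grid point, the cluster with the highest or second highest interpolated frequency) can be sketched as follows. This is a minimal illustration only: the function name `rank_map`, the nested-list grid layout and the toy frequencies are assumptions made here, and the authors performed the actual analysis in GRASS-GIS, not in Python.

```python
def rank_map(surfaces, rank=0):
    """Return (label_grid, freq_grid) for the cluster ranked `rank` at each point.

    surfaces: dict mapping a cluster label to a 2D list of interpolated
              frequencies; all grids are assumed to share one shape.
    rank:     0 for the highest frequency, 1 for the second highest, and so on.
    """
    labels = sorted(surfaces)
    rows = len(surfaces[labels[0]])
    cols = len(surfaces[labels[0]][0])
    label_grid, freq_grid = [], []
    for r in range(rows):
        label_row, freq_row = [], []
        for c in range(cols):
            # Order clusters by their frequency at this grid point, highest first.
            ranked = sorted(labels, key=lambda k: surfaces[k][r][c], reverse=True)
            winner = ranked[rank]
            label_row.append(winner)
            freq_row.append(surfaces[winner][r][c])
        label_grid.append(label_row)
        freq_grid.append(freq_row)
    return label_grid, freq_grid


# Toy 2x2 surfaces for three hypothetical clusters (not data from the study).
toy_surfaces = {
    "C1": [[0.5, 0.2], [0.1, 0.3]],
    "C2": [[0.3, 0.6], [0.2, 0.25]],
    "C3": [[0.2, 0.1], [0.7, 0.4]],
}
top_labels, top_freqs = rank_map(toy_surfaces, rank=0)
second_labels, second_freqs = rank_map(toy_surfaces, rank=1)
```

Storing the label grid and the frequency grid side by side mirrors the description of keeping the highest frequency "together with the respective cluster label"; calling the function with `rank=1` gives the second composite map.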



PlatinumCNV: a Bayesian Gaussian mixture model for genotyping copy number polymorphisms using SNP array signal intensity data.  


We present a statistical model for allele-specific patterns of copy number polymorphisms (CNPs) in commercial single nucleotide polymorphism (SNP) array data. This model is based on the observation that fluorescent signal intensities tend to cluster into clouds of similar allele-specific copy number (ASCN) genotypes at each SNP locus. Because instrumental errors blur these clusters, our model allows cluster memberships to overlap, according to a Bayesian Gaussian mixture model (GMM). This approach is flexible, allowing for both absolute scale differences and X/Y scale imbalances of fluorescent signal intensities. The resulting model is also robust toward unobserved ASCN genotypes, which can be problematic for ordinary GMMs. We illustrated the utility of the model by applying it to commercial SNP array intensity data obtained from the Illumina HumanHap 610K platform. We retrieved more than 4,000 allele-specific CNPs, though 99% of them showed rather simple allele-specific CNP patterns with only a single aneuploid haplotype among the normal haplotypes. The genotyping accuracy was assessed by two approaches, quantitative PCR and replicated subjects. The results of both of these approaches demonstrated mean genotyping error rates of 1%. We also carried out a preliminary genome-wide association study of three hematological traits; the results suggest that the model could form the foundation for new, more effective statistical methods for the mapping of both disease genes and quantitative trait loci with genome-wide CNPs. The methods described in this work are implemented in a software package, PlatinumCNV, available on the Internet. PMID:22125222

Kumasaka, Natsuhiko; Fujisawa, Hironori; Hosono, Naoya; Okada, Yukinori; Takahashi, Atsushi; Nakamura, Yusuke; Kubo, Michiaki; Kamatani, Naoyuki



Vitamin D receptor gene polymorphisms and haplotypes (Apa I, Bsm I, Fok I, Taq I) in Turkish psoriasis patients  

PubMed Central

Summary Background Psoriasis is an inflammatory disease characterized by increased squamous cell proliferation and impaired differentiation. Vitamin D, calcitriol, and its analogues are successfully used for psoriasis therapy. However, it is unknown why some psoriasis patients are resistant to vitamin D therapy. Vitamin D mediates its activity through a nuclear receptor. It has been suggested that polymorphisms and haplotypes in the VDR gene may explain the differences in response to vitamin D therapy. Material/Methods In this study, 102 psoriasis patients and 102 healthy controls were studied for VDR gene polymorphisms. The Fok I, Bsm I, Apa I and Taq I polymorphisms were examined by PCR-RFLP, and 50 subjects received vitamin D therapy to evaluate the association between VDR gene polymorphisms and response to vitamin D therapy. The presence of a cutting site is denoted by a capital letter and its absence by a lower-case letter. The haplotypes were analysed with CHAPLIN. Results There was a significant difference in the allele frequency of T and the genotype frequency of Tt between cases and controls (p values 0.038 and 0.04, respectively). The Aa and bb genotypes were significantly more frequent in early onset than in late onset psoriasis (p values 0.008 and 0.04, respectively). The frequencies of genotypes Ff, ff and TT differed significantly between vitamin D3 therapy responders and non-responders (p values 0.04, 0.0001 and 0.009, respectively). To the best of our knowledge, this is the first report showing the importance of VDR gene haplotypes in psoriasis. The significance of the Wald and LR (likelihood ratio) statistics (p=0.0042) suggests that FfBbAatt is a disease-susceptibility haplotype. Conclusions Haplotype analysis is a recent and commonly used method in genetic association studies. Our results reveal a previously unidentified susceptibility haplotype and indicate that certain haplotypes are important in the resistance to vitamin D3 therapy and the onset of psoriasis. Haplotypes can provide valuable information where genotypes cannot.

Acikbas, Ibrahim; Sanlı, Berna; Tepeli, Emre; Ergin, Seniz; Aktan, Sebnem; Bagci, Huseyin



Y-chromosome STR-haplotype typing in El Salvador  

Microsoft Academic Search

Eight Y-chromosome STRs were investigated in a male population sample from El Salvador. Complete Y-chromosomal STR haplotypes were obtained in 121 individuals, among which 107 different haplotypes were observed. The two most common haplotypes were shared by ~4% of the sample, while 100 haplotypes were unique. The gene diversity was 0.9883 and the discrimination capacity was 0.8926. The combined Y-chromosome

José Saul; Manuel Fondevila; Antonio Salas; María Brión; María Victoria Lareu; Ángel Carracedo
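
The two summary statistics quoted in the record above can be illustrated with a short sketch. It assumes the definitions commonly used in forensic Y-STR work, which the truncated abstract does not spell out: gene diversity as Nei's unbiased estimator, n/(n-1) × (1 − Σ p_i²), and discrimination capacity as the number of distinct haplotypes divided by the sample size. The function names and the toy haplotype labels are hypothetical.

```python
from collections import Counter


def gene_diversity(haplotypes):
    """Nei's unbiased gene (haplotype) diversity: n/(n-1) * (1 - sum p_i^2)."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two observed haplotypes")
    counts = Counter(haplotypes)
    sum_sq = sum((count / n) ** 2 for count in counts.values())
    return n / (n - 1) * (1 - sum_sq)


def discrimination_capacity(haplotypes):
    """Number of distinct haplotypes divided by the number of individuals."""
    return len(set(haplotypes)) / len(haplotypes)


# Toy sample of five individuals carrying four distinct haplotypes.
toy_sample = ["H1", "H1", "H2", "H3", "H4"]
gd = gene_diversity(toy_sample)            # 5/4 * (1 - 7/25)
dc = discrimination_capacity(toy_sample)   # 4 distinct / 5 individuals
```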



A Parsimonious Tree-Grow Method for Haplotype Inference  

Microsoft Academic Search

Motivation: Haplotype information has become increasingly important in analyzing fine-scale molecular genetics data, for example in disease gene mapping and drug design. Parsimony haplotyping is a haplotyping problem that belongs to the NP-hard class. Results: In this paper, we aim to develop a novel algorithm for the haplotype inference problem with the parsimony criterion, based on a parsimonious tree-grow method (PTG). PTG

Zhenping Li; Wenfeng Zhou; Xiangsun Zhang; Luonan Chen


Molecular cloning and SNP association analysis of chicken PMCH gene.  


The pre-melanin-concentrating hormone (PMCH) gene is functionally important in the regulation of body fat content, feeding behavior and energy balance. In this study, the full-length cDNA of the chicken PMCH gene was amplified by the SMART RACE method. Single nucleotide polymorphisms (SNPs) in the PMCH gene were screened by comparative sequence analysis. The non-synonymous coding SNP (ncSNP) obtained was genotyped first. Its effects on growth, carcass characteristics and meat quality traits were investigated in the F2 resource population of Gushi chicken crossed with Anak broiler by AluI CRS-PCR-RFLP. Our results indicated that the cDNA of chicken PMCH shared 67.25 and 66.47% homology with that of human and bovine PMCH, respectively. The deduced amino acid sequence of chicken PMCH (163 amino acids) was 52.07 and 50.89% identical to those of human and bovine PMCH, respectively. The PMCH protein sequence is predicted to have several functional domains, including pro-MCH, CSP, IL7, XPGI and some low complexity sequence. It has 8 phosphorylation sites and no signal peptide sequence. Targeting sites for the microRNAs gga-miR-18a, gga-miR-18b and gga-miR-499 were predicted in the 3' untranslated region of chicken PMCH mRNA. In addition, a total of seven SNPs, including an ncSNP and a synonymous coding SNP, were identified in the PMCH gene. The ncSNP c.81 A>T was found to be in a moderately polymorphic state (polymorphic index=0.365), and the frequencies for genotypes AA, AB and BB were 0.3648, 0.4682 and 0.1670, respectively. Significant associations between the locus and shear force of breast and leg were observed. This polymorphic site may serve as a useful target for the marker assisted selection of growth and meat quality traits in chicken. PMID:23670042

Sun, Guirong; Li, Ming; Li, Hong; Tian, Yadong; Chen, Qixin; Bai, Yichun; Kang, Xiangtao
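
The genotype frequencies reported in the record above can be turned into allele frequencies and a polymorphism index with a few lines of arithmetic. The sketch below assumes that the "polymorphic index" is the polymorphism information content (PIC) of a biallelic locus, PIC = 1 − p² − q² − 2p²q²; the abstract does not state the formula, but this definition comes out at about 0.365 for the reported frequencies. The function names are illustrative.

```python
def allele_freqs(f_aa, f_ab, f_bb):
    """Allele frequencies (p for allele A, q for allele B) from genotype frequencies."""
    p = f_aa + f_ab / 2
    q = f_bb + f_ab / 2
    return p, q


def pic_biallelic(p, q):
    """Polymorphism information content for a locus with two alleles."""
    return 1 - p**2 - q**2 - 2 * p**2 * q**2


# Genotype frequencies for AA, AB and BB as given in the abstract.
p, q = allele_freqs(0.3648, 0.4682, 0.1670)
pic = pic_biallelic(p, q)
print(f"p={p:.4f} q={q:.4f} PIC={pic:.3f}")
```

A PIC between 0.25 and 0.5 is conventionally described as moderately polymorphic, which is consistent with the wording of the abstract.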



Haplotype study of intermediate-length alleles at the fragile X (FMR1) gene: ATL1, FMRb, and microsatellite haplotypes differ from those found in common-size FMR1 alleles.  


The CGG repeat within the X-chromosome-linked FMR1 gene, which in hyperexpansion (> 200 copies) results in fragile X syndrome, is highly polymorphic. The mechanism of expansion is not well understood, but CGG repeats called intermediate-length or gray zone alleles (approximately 35-60 repeats) are thought to make up the FMR1 alleles showing initial steps in this expansion process. It has been hypothesized that the background haplotype of these alleles plays a role in their susceptibility to expansion. In this study we investigate whether or not the frequencies of alleles and haplotypes at four marker loci in the FMR1 gene region (microsatellites DXS548 and FRAXAC1 and SNPs ATL1 and FMRb) in 84 intermediate-length male chromosomes differ from those in 94 common-size male alleles. The ATL1*G and FMRB*A alleles were more frequent among intermediate-length alleles than among common alleles. In addition, the DXS548-FRAXAC1 T50-T42 and T40-T42 haplotypes were strongly associated with intermediate-length alleles between 41 and 60 CGG repeats (p < 0.001). Two extended haplotypes, DXS548-FRAXAC1-ATL1-FMRb T50-T42-G-A and T40-T42-G-A, are strongly associated (p < 0.001) with intermediate-length alleles between 41 and 60 CGG repeats, and these haplotypes have also been reported as fragile X associated haplotypes in European populations. These data suggest that these haplotypes are among the most susceptible to further expansion among the intermediate-length alleles. T50-T42-G-A was also much more prevalent in males with 35-40 CGG repeats than in males with common-size alleles. ATL1 did not increase discrimination among intermediate-length alleles beyond that detected by DXS548-FRAXAC1 haplotypes, but the FMRb locus did, particularly for the DXS548-FRAXAC1-ATL1 T50-T42-G and T40-T42-G haplotypes. Comparison with fragile X associated haplotypes from the literature suggests that repeat hyperexpansion occurs most frequently on chromosomes carrying FMRB*A. Within the intermediate-length allele category, however, there were some significant differences in haplotype frequencies between smaller and larger alleles, and this finding has implications for future studies. PMID:16114822

Curlis, Yvette; Zhang, Cuiling; Holden, Jeanette J A; Kirkby, P Ken; Loesch, Danuta; Mitchell, R John



The development and characterization of a 60K SNP chip for chicken  

PubMed Central

Background In livestock species like the chicken, high throughput single nucleotide polymorphism (SNP) genotyping assays are increasingly being used for whole genome association studies and as a tool in breeding (referred to as genomic selection). To be of value in a wide variety of breeds and populations, the success rate of the SNP genotyping assay, the distribution of the SNPs across the genome and the minor allele frequencies (MAF) of the SNPs used are extremely important. Results We describe the design of a moderate density (60k) Illumina SNP BeadChip in chicken consisting of SNPs known to be segregating at high to medium MAF in the two major types of commercial chicken (broilers and layers). This was achieved by the identification of 352,303 SNPs with moderate to high MAF in 2 broiler and 2 layer lines using Illumina sequencing on reduced representation libraries. To further increase the utility of the chip, we also identified SNPs on sequences currently not covered by the chicken genome assembly (Gallus_gallus-2.1). This was achieved by 454 sequencing of the chicken genome at a depth of 12x and the identification of SNPs on 454-derived contigs not covered by the current chicken genome assembly. In total we added 790 SNPs that mapped to 454-derived contigs as well as 421 SNPs with a position on Chr_random of the current assembly. The SNP chip contains 57,636 SNPs of which 54,293 could be genotyped and were shown to be segregating in chicken populations. Our SNP identification procedure appeared to be highly reliable and the overall validation rate of the SNPs on the chip was 94%. We were able to map 328 SNPs derived from the 454 sequence contigs on the chicken genome. The majority of these SNPs map to chromosomes that are already represented in genome build Gallus_gallus-2.1.0. Twenty-eight SNPs were used to construct two new linkage groups most likely representing two micro-chromosomes not covered by the current genome assembly. Conclusions The high success rate of the SNPs on the Illumina chicken 60K Beadchip emphasizes the power of next-generation sequencing (NGS) technology for the SNP identification and selection step. The identification of SNPs from sequence contigs derived from NGS resulted in improved coverage of the chicken genome and the construction of two new linkage groups most likely representing two chicken micro-chromosomes.
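
Because the record above selects SNPs by minor allele frequency, a minimal sketch of how MAF is computed from genotype counts at a biallelic SNP may help. The function name, the genotype counts and the 0.2 cut-off are hypothetical; the abstract does not give the thresholds used for "moderate to high MAF".

```python
def minor_allele_frequency(n_aa, n_ab, n_bb):
    """MAF from counts of the three genotypes at a biallelic SNP."""
    total_alleles = 2 * (n_aa + n_ab + n_bb)
    if total_alleles == 0:
        raise ValueError("no genotyped individuals")
    # Each AA individual carries two A alleles, each AB individual one.
    freq_a = (2 * n_aa + n_ab) / total_alleles
    return min(freq_a, 1 - freq_a)


# Hypothetical counts: 10 AA, 40 AB and 50 BB birds.
maf = minor_allele_frequency(10, 40, 50)

# Hypothetical filter: keep the SNP only if the minor allele is reasonably common.
MAF_THRESHOLD = 0.2  # assumed value, not taken from the study
keep_snp = maf >= MAF_THRESHOLD
```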



Targeted SNP discovery in Atlantic salmon (Salmo salar) genes using a 3'UTR-primed SNP detection approach  

PubMed Central

Background Single nucleotide polymorphisms (SNPs) represent the most widespread type of DNA variation in vertebrates and may be used as genetic markers for a range of applications. This has led to an increased interest in identification of SNP markers in non-model species and farmed animals. The in silico SNP mining method used for discovery of most known SNPs in Atlantic salmon (Salmo salar) has applied a global (genome-wide) approach. In this study we present a targeted 3'UTR-primed SNP discovery strategy that utilizes sequence data from Salmo salar full length sequenced cDNAs (FLIcs). We compare the efficiency of this new strategy to the in silico SNP mining method when using both methods for targeted SNP discovery. Results The SNP discovery efficiency of the two methods was tested in a set of FLIc target genes. The 3'UTR-primed SNP discovery method detected novel SNPs in 35% of the target genes while the in silico SNP mining method detected novel SNPs in 15% of the target genes. Furthermore, the 3'UTR-primed SNP discovery strategy was the less labor intensive one and revealed a higher success rate than the in silico SNP mining method in the initial amplification step. When testing the methods we discovered 112 novel bi-allelic polymorphisms (type I markers) in 88 salmon genes [dbSNP: ss179319972-179320081, ss250608647-250608648], and three of the SNPs discovered were missense substitutions. Conclusions Full length insert cDNAs (FLIcs) are important genomic resources that have been developed in many farmed animals. The 3'UTR-primed SNP discovery strategy successfully utilized FLIc data to detect novel SNPs in the partially tetraploid Atlantic salmon. This strategy may therefore be useful for targeted SNP discovery in several species, and particularly useful in species that, like salmonids, have duplicated genomes.



Determination of haplotypes at structurally complex regions using emulsion haplotype fusion PCR  

PubMed Central

Background Genotyping and massively-parallel sequencing projects result in a vast amount of diploid data that is only rarely resolved into its constituent haplotypes. It is nevertheless this phased information that is transmitted from one generation to the next and is most directly associated with biological function and the genetic causes of biological effects. Despite progress made in genome-wide sequencing and phasing algorithms and methods, problems assembling (and reconstructing linear haplotypes in) regions of repetitive DNA and structural variation remain. These dynamic and structurally complex regions are often poorly understood from a sequence point of view. Regions such as these that are highly similar in their sequence tend to be collapsed onto the genome assembly. This in turn means that downstream determination of the true sequence haplotype in these regions poses a particular challenge. For structurally complex regions, a more focussed approach to assembling haplotypes may be required. Results In order to investigate reconstruction of spatial information at structurally complex regions, we have used an emulsion haplotype fusion PCR approach to reproducibly link sequences of up to 1kb in length to allow phasing of multiple variants from neighbouring loci, using allele-specific PCR and sequencing to detect the phase. Using emulsion systems that link flanking regions to amplicons within the CNV, we reconstructed a 59kb haplotype across the DEFA1A3 CNV in HapMap individuals. Conclusion This study has demonstrated a novel use for emulsion haplotype fusion PCR in addressing the issue of reconstructing structural haplotypes at multiallelic copy variable regions, using the DEFA1A3 locus as an example.



Melting Curve SNP (McSNP) Genotyping: a Useful Approach for Diallelic Genotyping in Forensic Science  

Microsoft Academic Search

The increasing availability of Single Nucleotide Polymorphisms (SNPs) and Deletion/Insertion Polymorphisms (DIPs), as well as the outstanding progress in SNP genotyping technologies, will impact forensics profoundly. We have developed a new method for genotyping SNPs and DIPs, which is based on the determination of melting curve profiles of amplified DNA in solution. We have termed this method Melting

Jian Ye; Esteban J. Parra; Donna M. Sosnoski; Kevin Hiester; P. A. Underhill; Mark D. Shriver


Haplotype structure in Ashkenazi Jewish BRCA1 and BRCA2 mutation carriers.  


Three founder mutations in BRCA1 and BRCA2 contribute to the risk of hereditary breast and ovarian cancer in Ashkenazi Jews (AJ). They are observed at increased frequency in the AJ compared to other BRCA mutations in Caucasian non-Jews (CNJ). Several authors have proposed that elevated allele frequencies in the surrounding genomic regions reflect adaptive or balancing selection. Such proposals predict long-range linkage disequilibrium (LD) resulting from a selective sweep, although genetic drift in a founder population may also act to create long-distance LD. To date, few studies have used the tools of statistical genomics to examine the likelihood of long-range LD at a deleterious locus in a population that faced a genetic bottleneck. We studied the genotypes of hundreds of women from a large international consortium of BRCA1 and BRCA2 mutation carriers and found that AJ women exhibited long-range haplotypes compared to CNJ women. More than 50% of the AJ chromosomes with the BRCA1 185delAG mutation share an identical 2.1 Mb haplotype and nearly 16% of AJ chromosomes carrying the BRCA2 6174delT mutation share a 1.4 Mb haplotype. Simulations based on the best inference of Ashkenazi population demography indicate that long-range haplotypes are expected in the context of a genome-wide survey. Our results are consistent with the hypothesis that a local bottleneck effect from population size constriction events could by chance have resulted in the large haplotype blocks observed at high frequency in the BRCA1 and BRCA2 regions of Ashkenazi Jews. PMID:21597964

Im, Kate M; Kirchhoff, Tomas; Wang, Xianshu; Green, Todd; Chow, Clement Y; Vijai, Joseph; Korn, Joshua; Gaudet, Mia M; Fredericksen, Zachary; Shane Pankratz, V; Guiducci, Candace; Crenshaw, Andrew; McGuffog, Lesley; Kartsonaki, Christiana; Morrison, Jonathan; Healey, Sue; Sinilnikova, Olga M; Mai, Phuong L; Greene, Mark H; Piedmonte, Marion; Rubinstein, Wendy S; Hogervorst, Frans B; Rookus, Matti A; Collée, J Margriet; Hoogerbrugge, Nicoline; van Asperen, Christi J; Meijers-Heijboer, Hanne E J; Van Roozendaal, Cees E; Caldes, Trinidad; Perez-Segura, Pedro; Jakubowska, Anna; Lubinski, Jan; Huzarski, Tomasz; Blecharz, Paweł; Nevanlinna, Heli; Aittomäki, Kristiina; Lazaro, Conxi; Blanco, Ignacio; Barkardottir, Rosa B; Montagna, Marco; D'Andrea, Emma; Devilee, Peter; Olopade, Olufunmilayo I; Neuhausen, Susan L; Peissel, Bernard; Bonanni, Bernardo; Peterlongo, Paolo; Singer, Christian F; Rennert, Gad; Lejbkowicz, Flavio; Andrulis, Irene L; Glendon, Gord; Ozcelik, Hilmi; Toland, Amanda Ewart; Caligo, Maria Adelaide; Beattie, Mary S; Chan, Salina; Domchek, Susan M; Nathanson, Katherine L; Rebbeck, Timothy R; Phelan, Catherine; Narod, Steven; John, Esther M; Hopper, John L; Buys, Saundra S; Daly, Mary B; Southey, Melissa C; Terry, Mary-Beth; Tung, Nadine; Hansen, Thomas V O; Osorio, Ana; Benitez, Javier; Durán, Mercedes; Weitzel, Jeffrey N; Garber, Judy; Hamann, Ute; Peock, Susan; Cook, Margaret; Oliver, Clare T; Frost, Debra; Platte, Radka; Evans, D Gareth; Eeles, Ros; Izatt, Louise; Paterson, Joan; Brewer, Carole; Hodgson, Shirley; Morrison, Patrick J; Porteous, Mary; Walker, Lisa; Rogers, Mark T; Side, Lucy E; Godwin, Andrew K; Schmutzler, Rita K; Wappenschmidt, Barbara; Laitman, Yael; Meindl, Alfons; Deissler, Helmut; Varon-Mateeva, Raymonda; Preisler-Adams, Sabine; Kast, Karin; Venat-Bouvet, Laurence; Stoppa-Lyonnet, Dominique; Chenevix-Trench, Georgia; Easton, Douglas F; Klein, Robert J; Daly, Mark J; Friedman, Eitan; Dean, Michael; Clark, Andrew G; Altshuler, David M; Antoniou, Antonis C; Couch, Fergus J; Offit, Kenneth; Gold, Bert



Inferring the history of population size change from genome-wide SNP data.  


Dense, genome-wide single-nucleotide polymorphism (SNP) data can be used to reconstruct the demographic history of human populations. However, demographic inferences from such data are complicated by recombination and ascertainment bias. We introduce two new statistics, allele frequency-identity by descent (AF-IBD) and allele frequency-identity by state (AF-IBS), that make use of linkage disequilibrium information and show defined relationships to the time of coalescence. These statistics, when conditioned on the derived allele frequency, are able to infer complex population size changes. Moreover, the AF-IBS statistic, which is based on genome-wide SNP data, is robust to varying ascertainment conditions. We constructed an efficient approximate Bayesian computation (ABC) pipeline based on AF-IBD and AF-IBS that can accurately estimate demographic parameters, even for fairly complex models. Finally, we applied this ABC approach to genome-wide SNP data and inferred the demographic histories of two human populations, Yoruba and French. Our results suggest a rather stable ancestral population size with a mild recent expansion for Yoruba, whereas the French seemingly experienced a long-lasting severe bottleneck followed by a drastic population growth. This approach should prove useful for new insights into populations, especially those with complex demographic histories. PMID:22787284

Theunert, Christoph; Tang, Kun; Lachmann, Michael; Hu, Sile; Stoneking, Mark



Bayesian epistasis association mapping via SNP imputation  

PubMed Central

Genetic mutations may interact to increase the risk of human complex diseases. Mapping of multiple interacting disease loci in the human genome has recently shown promise in detecting genes with small main effects. The power of interaction association mapping, however, can be greatly influenced by the set of single nucleotide polymorphisms (SNPs) genotyped in a case–control study. Previous imputation methods focus only on imputing individual SNPs, without considering the joint distribution of their possible interactions. We present a new method that simultaneously detects multilocus interaction associations and imputes missing SNPs from a full Bayesian model. Our method treats both the case–control sample and the reference data as random observations. The output of our method is the posterior probabilities of SNPs for their marginal and interacting associations with the disease. Using simulations, we show that the method produces accurate and robust imputation with little overfitting. We further show that, with the type I error rate maintained at a common level, SNP imputation can consistently and sometimes substantially improve the power of detecting disease interaction associations. We use a data set of inflammatory bowel disease to demonstrate the application of our method.



Vitamin D receptor gene haplotypes and polymorphisms and risk of breast cancer: a nested case-control study  

PubMed Central

Background Observational and experimental studies suggest that vitamin D may influence breast cancer etiology. Most known effects of vitamin D are mediated via the vitamin D receptor (VDR). Few polymorphisms in the VDR gene have been well studied in relation to breast cancer risk and results have been inconsistent. Methods We investigated VDR polymorphisms and haplotypes in relation to breast cancer risk by genotyping 26 single nucleotide polymorphisms (SNPs) that i) had known/suspected impact on VDR function, ii) were tagging SNPs for the three VDR haplotype blocks among whites, or iii) were previously associated with breast cancer risk. We estimated odds ratios (OR) and 95% confidence intervals (CI) in relation to breast cancer risk among 270 incident cases and 554 matched controls within the Agricultural Health Study cohort. Results In individual SNP analyses, homozygous carriers of the minor allele for rs2544038 had significantly increased breast cancer risk (OR=1.5; 95% CI: 1.0, 2.5) and homozygous carriers of the minor allele for rs11168287 had significantly decreased risk (OR=0.6; 95% CI: 0.4, 1.0). Carriers of the minor allele for rs2239181 exhibited marginally significant association with risk (OR=1.4; 95% CI: 0.9, 2.0). Haplotype analyses revealed three haplotype groups (blocks “A”, “B”, and “C”). Haplotype GTCATTTCCTA in block B was significantly associated with reduced risk (OR=0.5; 95% CI: 0.3, 0.9). Conclusions These results suggest that variation in VDR may be associated with breast cancer risk. Impact Our findings may help guide future research needed to define the role of vitamin D in breast cancer prevention.

Engel, Lawrence S.; Orlow, Irene; Sima, Camelia S.; Satagopan, Jaya; Mujumdar, Urvi; Roy, Pampa; Yoo, Sarah; Sandler, Dale P.; Alavanja, Michael C.
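
The odds ratios and 95% confidence intervals quoted in the record above follow a standard pattern that can be sketched from a 2x2 table. The example below is an unadjusted Wald interval on hypothetical counts; the study used matched controls, so its estimates were presumably obtained from a model that accounts for the matching, and the numbers here are not taken from the paper.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval from a 2x2 table.

    a: exposed cases      b: unexposed cases
    c: exposed controls   d: unexposed controls
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the usual large-sample approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts: carriers vs non-carriers among cases and controls.
or_est, ci_low, ci_high = odds_ratio_ci(a=30, b=70, c=20, d=80)
print(f"OR={or_est:.2f} (95% CI: {ci_low:.2f}, {ci_high:.2f})")
```

An interval whose lower bound touches 1.0, as for several SNPs in the abstract, sits at the boundary of nominal significance.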



Genome-wide haplotype association study identifies the FRMD4A gene as a risk locus for Alzheimer's disease.  


Recently, several genome-wide association studies (GWASs) have led to the discovery of nine new loci of genetic susceptibility in Alzheimer's disease (AD). However, the landscape of AD genetic susceptibility is far from complete, and in addition to single-SNP (single-nucleotide polymorphism) analyses as performed in conventional GWAS, complementary strategies need to be applied to overcome limitations inherent to this type of approach. We performed a genome-wide haplotype association (GWHA) study in the EADI1 study (n=2025 AD cases and 5328 controls) by applying a sliding-windows approach. After exclusion of loci already known to be involved in AD (APOE, BIN1 and CR1), 91 regions with suggestive haplotype effects were identified. In a second step, we attempted to replicate the best suggestive haplotype associations in the GERAD1 consortium (2820 AD cases and 6356 controls) and observed that 9 of them showed nominal association. In a third step, we tested relevant haplotype associations in a combined analysis of five additional case-control studies (5093 AD cases and 4061 controls). We consistently replicated the association of a haplotype within FRMD4A on Chr.10p13 in all the data sets analyzed (OR: 1.68; 95% CI: (1.43-1.96); P=1.1 × 10^-10). We finally searched for association between SNPs within the FRMD4A locus and Aβ plasma concentrations in three independent non-demented populations (n=2579). We found that polymorphisms were associated with the plasma Aβ42/Aβ40 ratio (best signal, P=5.4 × 10^-7). In conclusion, by combining a GWHA study with a conservative three-stage replication approach, we characterised FRMD4A as a new genetic risk factor of AD. PMID:22430674

Lambert, J-C; Grenier-Boley, B; Harold, D; Zelenika, D; Chouraki, V; Kamatani, Y; Sleegers, K; Ikram, M A; Hiltunen, M; Reitz, C; Mateo, I; Feulner, T; Bullido, M; Galimberti, D; Concari, L; Alvarez, V; Sims, R; Gerrish, A; Chapman, J; Deniz-Naranjo, C; Solfrizzi, V; Sorbi, S; Arosio, B; Spalletta, G; Siciliano, G; Epelbaum, J; Hannequin, D; Dartigues, J-F; Tzourio, C; Berr, C; Schrijvers, E M C; Rogers, R; Tosto, G; Pasquier, F; Bettens, K; Van Cauwenberghe, C; Fratiglioni, L; Graff, C; Delepine, M; Ferri, R; Reynolds, C A; Lannfelt, L; Ingelsson, M; Prince, J A; Chillotti, C; Pilotto, A; Seripa, D; Boland, A; Mancuso, M; Bossù, P; Annoni, G; Nacmias, B; Bosco, P; Panza, F; Sanchez-Garcia, F; Del Zompo, M; Coto, E; Owen, M; O'Donovan, M; Valdivieso, F; Caffarra, P; Caffara, P; Scarpini, E; Combarros, O; Buée, L; Campion, D; Soininen, H; Breteler, M; Riemenschneider, M; Van Broeckhoven, C; Alpérovitch, A; Lathrop, M; Trégouët, D-A; Williams, J; Amouyel, P
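
The sliding-windows idea mentioned in the record above can be sketched as enumerating overlapping runs of consecutive SNPs, each of which would then be tested as a haplotype block. The window size, step and SNP identifiers below are hypothetical, since the abstract does not report the window settings used.

```python
def sliding_windows(snps, size, step=1):
    """Return overlapping windows of `size` consecutive SNPs, moving by `step`."""
    if size <= 0 or step <= 0:
        raise ValueError("size and step must be positive")
    return [snps[i:i + size] for i in range(0, len(snps) - size + 1, step)]


# Hypothetical ordered SNP identifiers along one chromosome.
ordered_snps = ["snp1", "snp2", "snp3", "snp4", "snp5"]
windows = sliding_windows(ordered_snps, size=3)
# Each window would be passed to a haplotype association test (not shown here).
```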



DNA sequence and haplotype variation in two candidate genes for dilated cardiomyopathy in the turkey Meleagris gallopavo.  


Determining variation in genes is fundamental to understanding their function in the disease state. Cardiac troponin T (cTnT) and phospholamban (PLN) genes have been implicated in dilated cardiomyopathy (DCM) in human and model species. To investigate the role of these 2 candidate genes in DCM in the turkey Meleagris gallopavo, understanding sequence variants and map position distribution is necessary. To this end, a total of 1854 and 1771 bp of cTnT and PLN gene sequences, respectively, were scanned for single nucleotide polymorphisms (SNPs) in a randomly bred population. A total of 15 SNPs was identified in the cTnT and PLN genomic sequences. Nine haplotypes, 5 in cTnT and 4 in PLN, were identified. Observed heterozygosities (0.02-0.39) in the turkey population were low for both genes. Within each gene, 1 SNP corresponding to a restriction enzyme site was identified and used to develop a PCR-restriction fragment length polymorphism (RFLP) genotyping assay. The PLN gene was genetically mapped to turkey chromosome 2, equivalent to Gallus gallus chromosome 3, and cTnT mapped to a turkey microchromosome. Although limited because of the relatively small sample size of 55 birds, the data from this SNP analysis of PLN and cTnT provide a foundation from which to evaluate the function of cTnT and PLN in the turkey. Information about the distribution of the SNPs and haplotypes will facilitate future association and linkage studies. PMID:17612615

Lin, Kuan-chin; Xu, Jun; Kamara, Davida; Geng, Tuoyu; Gyenai, Kwaku; Reed, Kent M; Smith, Edward J
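
The observed heterozygosities quoted in the abstract above can be illustrated with a short calculation. The following is a minimal Python sketch, not the authors' method: the function name `heterozygosity` and the genotype counts are hypothetical, and the expected value assumes Hardy-Weinberg proportions without a small-sample correction.

```python
from collections import Counter


def heterozygosity(genotypes):
    """Return (observed, expected) heterozygosity for one SNP.

    genotypes: list of (allele1, allele2) tuples, one per individual.
    Expected heterozygosity is 1 - sum(p_i^2) over allele frequencies p_i
    (Hardy-Weinberg assumption, no small-sample correction).
    """
    n = len(genotypes)
    # Observed: fraction of individuals carrying two different alleles.
    observed = sum(1 for a, b in genotypes if a != b) / n
    # Allele frequencies from the 2n sampled chromosomes.
    allele_counts = Counter(allele for g in genotypes for allele in g)
    total = 2 * n
    expected = 1.0 - sum((c / total) ** 2 for c in allele_counts.values())
    return observed, expected


# Hypothetical genotype counts for 55 birds (illustrative only).
sample = [("A", "A")] * 40 + [("A", "G")] * 10 + [("G", "G")] * 5
h_obs, h_exp = heterozygosity(sample)
```

Comparing the observed value with the expected one gives a quick indication of whether heterozygotes are under-represented at a locus.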



Genetic variation within the HLA class III influences T1D susceptibility conferred by high risk HLA haplotypes  

PubMed Central

HLA class II DRB1 and DQB1 represent the major type 1 diabetes (T1D) genetic susceptibility loci; however, other genes in the HLA region are also involved in T1D risk. We analyzed 1411 pedigrees (2865 affected individuals) from the type 1 diabetes genetics consortium (T1DGC) genotyped for HLA classical loci and for 12 SNPs in the class III region previously shown to be associated with T1D in a subset of 886 pedigrees. Using the transmission disequilibrium test, we compared the proportion of SNP alleles transmitted from within the high risk DR3 and DR4 haplotypes to affected offspring. Markers rs4151659 (mapping to CFB) and rs7762619 (mapping 5′ of LTA) were the most strongly associated with T1D on DR3 (p = 1.2 × 10⁻⁹ and p = 2 × 10⁻¹², respectively) and DR4 (p = 4 × 10⁻¹⁵ and p = 8 × 10⁻⁸, respectively) haplotypes. They remained significantly associated after stratifying individuals in analyses for B*1801, A*0101-B*0801, DPB1*0301, DPB1*0202, DPB1*0401 or DPB1*0402. Rs7762619 and rs4151659 are in strong linkage disequilibrium (LD) (r² = 0.82) with each other, but a joint analysis showed that the association for each SNP was not solely due to LD. Our data support a role for more than one locus in the class III region contributing to risk of T1D.

Valdes, Ana M; Thomson, Glenys; Barcellos, Lisa F
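
The transmission disequilibrium test named in the abstract above can be sketched in a few lines. This is a minimal Python illustration of the classical McNemar-type TDT statistic, not the stratified haplotype analysis the authors performed; the function name `tdt` and the transmission counts are hypothetical.

```python
import math


def tdt(transmitted, untransmitted):
    """Classical TDT for one allele.

    transmitted / untransmitted: number of times heterozygous parents
    passed on (or did not pass on) the allele to an affected offspring.
    Returns (chi_square, p_value) with 1 degree of freedom.
    """
    b, c = transmitted, untransmitted
    chi_square = (b - c) ** 2 / (b + c)
    # Upper tail of a chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi_square / 2.0))
    return chi_square, p_value


# Hypothetical counts: allele transmitted 60 times, not transmitted 40 times.
stat, p = tdt(60, 40)
```

Under the null hypothesis of no linkage or association, heterozygous parents transmit each allele with probability 0.5, so a large imbalance between the two counts yields a small p-value.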



Complex haplotypes derived from noncoding polymorphisms of the intronless α2A-adrenergic gene diversify receptor expression  

PubMed Central

α2A-adrenergic receptors (α2AAR) regulate multiple central nervous system, cardiovascular, and metabolic processes including neurotransmitter release, platelet aggregation, blood pressure, insulin secretion, and lipolysis. Complex diseases associated with α2AAR dysfunction display familial clustering, phenotypic heterogeneity, and interindividual variability in response to therapy targeted to α2AARs, suggesting common, functional polymorphisms. In a multiethnic discovery cohort we identified 16 single-nucleotide polymorphisms (SNPs) in the α2AAR gene organized into 17 haplotypes of two major phylogenetic clades. In contrast to other adrenergic genes, variability of the α2AAR was primarily due to SNPs in the promoter, 5′ UTR and 3′ UTR, as opposed to the coding block. Marked ethnic variability in the frequency of SNPs and haplotypes was observed: one haplotype represented 70% of Caucasians, whereas Africans and Asians had a wide distribution of less common haplotypes, with the highest haplotype frequencies being 16% and 35%, respectively. Despite the compact nature of this intronless gene, local linkage disequilibrium between a number of SNPs was low and ethnic-dependent. Whole-gene transfections into BE(2)-C human neuronal cells using vectors containing the entire ~5.3-kb gene without exogenous promoters were used to ascertain the effects of haplotypes on α2AAR expression. Substantial differences (P < 0.001) in transcript and cell-surface protein expression, by as much as ~5-fold, were observed between haplotypes, including those with common frequencies. Thus, signaling by this virtually ubiquitous receptor is under major genetic influence, which may be the basis for highly divergent phenotypes in complex diseases such as systemic and pulmonary hypertension, heart failure, diabetes, and obesity.

Small, Kersten M.; Brown, Kari M.; Seman, Carrie A.; Theiss, Cheryl T.; Liggett, Stephen B.
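
Pairwise linkage disequilibrium between SNPs, which the abstract above reports as low, is commonly summarized by the r² statistic. The following Python sketch assumes phased two-SNP haplotypes are already available; the function name `ld_r_squared` and the example haplotypes are hypothetical and are not taken from the study.

```python
def ld_r_squared(haplotypes, allele_a, allele_b):
    """r^2 between two biallelic SNPs from phased haplotypes.

    haplotypes: list of 2-character strings, first character = allele at
    SNP 1, second character = allele at SNP 2.
    allele_a / allele_b: the reference allele at SNP 1 / SNP 2.
    """
    n = len(haplotypes)
    p_a = sum(1 for h in haplotypes if h[0] == allele_a) / n
    p_b = sum(1 for h in haplotypes if h[1] == allele_b) / n
    p_ab = sum(1 for h in haplotypes if h == allele_a + allele_b) / n
    d = p_ab - p_a * p_b  # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both SNPs must be polymorphic in the sample")
    return d * d / denom


# Hypothetical samples: complete LD versus linkage equilibrium.
complete = ["AB"] * 50 + ["ab"] * 50
independent = ["AB", "Ab", "aB", "ab"] * 25
```

For these inputs, `ld_r_squared(complete, "A", "B")` evaluates to 1.0 and `ld_r_squared(independent, "A", "B")` to 0.0, the two extremes of the statistic.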



Haplotype diversity generated by ancient recombination-like events in the MHC of Indian rhesus macaques.  


The Mamu-A, Mamu-B, and Mamu-DRB genes of the rhesus macaque show several levels of complexity such as allelic heterogeneity (polymorphism), copy number variation, differential segregation of genes/alleles present on a haplotype (diversity) and transcription level differences. A combination of techniques was implemented to screen a large panel of pedigreed Indian rhesus macaques (1,384 individuals representing the offspring of 137 founding animals) for haplotype diversity in an efficient and inexpensive manner. This approach allowed the definition of 140 haplotypes that display a relatively low degree of region variation as reflected by the presence of only 17 A, 18 B and 22 DRB types, respectively, exhibiting a global linkage disequilibrium comparable to that in humans. This finding contrasts with the situation observed in rhesus macaques from other geographic origins and in cynomolgus monkeys from Indonesia. In these latter populations, nearly every haplotype appears to be characterised by a unique A, B and DRB region. In the Indian population, however, a reshuffling of existing segments generated "new" haplotypes. Since the recombination frequency within the core MHC of the Indian rhesus macaques is relatively low, the various haplotypes were most probably produced by recombination events that accumulated over a long evolutionary time span. This idea is in accord with the notion that Indian rhesus macaques experienced a severe reduction in population during the Pleistocene due to a bottleneck caused by geographic changes. Thus, recombination-like processes appear to be a way to expand a diminished genetic repertoire in an isolated and relatively small founder population. PMID:23715823

Doxiadis, Gaby G M; de Groot, Nanine; Otting, Nel; de Vos-Rouweler, Annemiek J M; Bolijn, Maria J; Heijmans, Corrine M C; de Groot, Natasja G; van der Wiel, Marit K H; Remarque, Edmond J; Vangenot, Christelle; Nunes, José M; Sanchez-Mazas, Alicia; Bontrop, Ronald E



Allelic mRNA expression imbalance in C-type lectins reveals a frequent regulatory SNP in the human surfactant protein A (SP-A) gene.  


Genetic variation in C-type lectins influences infectious disease susceptibility but remains poorly understood. We used allelic mRNA expression imbalance (AEI) technology for surfactant protein (SP)-A1, SP-A2, SP-D, dendritic cell-specific ICAM-3-grabbing non-integrin (DC-SIGN), macrophage mannose receptor (MRC1) and Dectin-1, expressed in human macrophages and/or lung tissues. Frequent AEI, an indicator of regulatory polymorphisms, was observed in SP-A2, SP-D and DC-SIGN. AEI was measured for SP-A2 in 38 lung tissues using four marker single-nucleotide polymorphisms (SNPs) and was confirmed by next-generation sequencing of one lung RNA sample. Genomic DNA at the SP-A2 DNA locus was sequenced by Ion Torrent technology in 16 samples. Correlation analysis of genotypes with AEI identified a haplotype block and, specifically, the intronic SNP rs1650232 (30% minor allele frequency), the only variant consistently associated with an approximately twofold change in mRNA allelic expression. Previously shown to alter a NAGNAG splice acceptor site with likely effects on SP-A2 expression, rs1650232 generates an alternative splice variant with three additional bases at the start of exon 3. Validated as a regulatory variant, rs1650232 is in partial linkage disequilibrium with known SP-A2 marker SNPs previously associated with risk for respiratory diseases including tuberculosis. Applying functional DNA variants in clinical association studies, rather than marker SNPs, will advance our understanding of genetic susceptibility to infectious diseases. PMID:23328842

Azad, A K; Curtis, A; Papp, A; Webb, A; Knoell, D; Sadee, W; Schlesinger, L S



ITGA1 polymorphisms and haplotypes are associated with gastric cancer risk in a Korean population  

PubMed Central

AIM: To evaluate the association between the genetic polymorphisms and haplotypes of the ITGA1 gene and the risk of gastric cancer. METHODS: The study subjects were 477 age- and sex-matched case-control pairs. Genotyping was performed for 15 single nucleotide polymorphisms (SNPs) in ITGA1. The associations between gastric cancer and these SNPs and haplotypes were analyzed with multivariate conditional logistic regression models. Multiple testing corrections were carried out following methodology for controlling the false discovery rate. Gene-based association tests were performed using the versatile gene-based association study (VEGAS) method. RESULTS: In the codominant model, the ORs for SNPs rs2432143 (1.517; 95%CI: 1.144-2.011) and rs2447867 (1.258; 95%CI: 1.051-1.505) were statistically significant. In the dominant model, polymorphisms of rs1862610 and rs2447867 were found to be significant risk factors, with ORs of 1.337 (95%CI: 1.029-1.737) and 1.412 (95%CI: 1.061-1.881), respectively. In the recessive model, only the rs2432143 polymorphism was significant (OR = 1.559, 95%CI: 1.150-2.114). The C-C type of ITGA1 haplotype block 2 was a significant protective factor against gastric cancer in both the codominant model (OR = 0.602, 95%CI: 0.212-0.709, P = 0.021) and the dominant model (OR = 0.653, 95%CI: 0.483-0.884). The ITGA1 gene showed a significant gene-based association with gastric cancer in the VEGAS test. In the dominant model, the A-T type of ITGA1 haplotype block 2 was a significant risk factor (OR = 1.341, 95%CI: 1.034-1.741). SNP rs2447867 might be related to the severity of gastric epithelial injury due to inflammation and, thus, to the risk of developing gastric cancer. CONCLUSION: ITGA1 gene SNPs rs1862610, rs2432143, and rs2447867 and the ITGA1 haplotype block that includes SNPs rs1862610 and rs2432143 were significantly associated with gastric cancer.

Yim, Dong-Hyuk; Zhang, Yan-Wei; Eom, Sang-Yong; Moon, Sun In; Yun, Hyo-Yung; Song, Young-Jin; Youn, Sei-Jin; Hyun, Taisun; Park, Joo-Seung; Kim, Byung Sik; Lee, Jong-Young; Kim, Yong-Dae; Kim, Heon



Y chromosome polymorphisms and haplotypes in west Saxony (Germany).  


In order to apply a set of useful and highly polymorphic Y-STRs in paternity testing, we performed a population genetic study in Saxony. The allele distributions of the systems DYS19, DYS385, DYS389I/II and DYS390 were investigated in a sample of 250 unrelated males from the area of Leipzig. PCR products were detected using native polyacrylamide gel electrophoresis as well as capillary electrophoresis and GenScan Software on the ABI Prism 310 DNA sequencer. Haplotype frequency data of 164 different types were obtained which show that these four systems are very useful for special cases of paternity and forensic stain analysis. In addition, several confirmed father-son pairs were examined using the paternity cases of the institute. One mutation was found in the system DYS390 and sequencing data are presented. PMID:9646169

Lessig, R; Edelmann, J



An integrative variant analysis pipeline for accurate genotype/haplotype inference in population NGS data  

PubMed Central

Next-generation sequencing is a powerful approach for discovering genetic variation. Sensitive variant calling and haplotype inference from population sequencing data remain challenging. We describe methods for high-quality discovery, genotyping, and phasing of SNPs for low-coverage (approximately 5×) sequencing of populations, implemented in a pipeline called SNPTools. Our pipeline contains several innovations that specifically address challenges caused by low-coverage population sequencing: (1) effective base depth (EBD), a nonparametric statistic that enables more accurate statistical modeling of sequencing data; (2) variance ratio scoring, a variance-based statistic that discovers polymorphic loci with high sensitivity and specificity; and (3) BAM-specific binomial mixture modeling (BBMM), a clustering algorithm that generates robust genotype likelihoods from heterogeneous sequencing data. Last, we develop an imputation engine that refines raw genotype likelihoods to produce high-quality phased genotypes/haplotypes. Designed for large population studies, SNPTools' input/output (I/O) and storage aware design leads to improved computing performance on large sequencing data sets. We apply SNPTools to the International 1000 Genomes Project (1000G) Phase 1 low-coverage data set and obtain genotyping accuracy comparable to that of SNP microarray.

Wang, Yi; Lu, James; Yu, Jin; Gibbs, Richard A.; Yu, Fuli



Genetic susceptibility to tuberculosis associated with cathepsin Z haplotype in a Ugandan household contact study.  


Tuberculosis (TB), caused by Mycobacterium tuberculosis (Mtb), causes 9 million new cases worldwide and 2 million deaths annually. Genetic linkage and association analyses have suggested several chromosomal regions and candidate genes involved in TB susceptibility. This study examines the association of TB disease susceptibility with a selection of biologically relevant genes on regions on chromosomes 7 (IL6 and CARD11) and 20 (CTSZ and MC3R) and fine mapping of the chromosome 7p22-p21 region identified through our genome scan. We analyzed 565 individuals from Kampala, Uganda, who were previously included in our genome-wide linkage scan. Association analyses were conducted for 1,417 single-nucleotide polymorphisms (SNP) that passed quality control. None of the candidate gene or fine mapping SNPs was significantly associated with TB susceptibility (p > 0.10). When we restricted the analysis to HIV-negative individuals, 2 SNPs on chromosome 7 were significantly associated with TB susceptibility (p < 0.05). Haplotype analyses identified a significant risk haplotype in cathepsin X (CTSZ; p = 0.0281, odds ratio = 1.5493, 95% confidence interval [1.039, 2.320]). PMID:21354459

Baker, Allison R; Zalwango, Sarah; Malone, LaShaunda L; Igo, Robert P; Qiu, Feiyou; Nsereko, Mary; Adams, Mark D; Supelak, Pamela; Mayanja-Kizza, Harriet; Boom, W Henry; Stein, Catherine M



A Common Haplotype within the PON1 Promoter Region is Associated with Sporadic ALS  

PubMed Central

Amyotrophic lateral sclerosis (ALS) is a progressive, neurodegenerative disorder of upper and lower motor neurons. Genetic variants in the paraoxonase gene cluster have been associated with susceptibility to sporadic ALS. Because these studies have yielded conflicting results, we have further investigated this association in a larger data set. Twenty SNPs spanning the paraoxonase gene cluster were genotyped on a panel of 835 case and 924 control samples and tested for association with risk of sporadic ALS and with ALS sub-phenotypes. Our study revealed 2 SNPs, rs2074351 and rs705382, within the paraoxonase gene cluster that are associated with susceptibility to sporadic ALS (uncorrected p=0.0016 and 0.0022, respectively). None of the 20 SNPs displayed significant associations with age of onset, site of onset or disease survival. Using a sliding window approach, we have also identified a 5-SNP haplotype that is significantly associated with risk of sporadic ALS (p=2.42E-04). We conclude that a common haplotype within the PON1 promoter region is associated with susceptibility to sporadic ALS.

Landers, John E.; Shi, Lijia; Cho, Ting-Jan; Glass, Jonathan D.; Shaw, Christopher E.; Leigh, P. Nigel; Diekstra, Frank; Polak, Meraida; Rodriguez-Leyza, Ildefonso; Niemann, Stephan; Traynor, Bryan J.; McKenna-Yasek, Diane; Sapp, Peter C.; Al-Chalabi, Ammar; Wills, Anne-Marie A.; Brown, Robert H.
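
The sliding window approach mentioned in the abstract above can be pictured as enumerating every run of adjacent SNPs of a fixed width and tabulating the haplotypes seen in each run. The Python sketch below covers only that enumeration step and assumes phased haplotypes given as equal-length strings; the function names and example data are hypothetical, and the case-control association test applied to each window is not shown.

```python
from collections import Counter


def sliding_windows(n_snps, width):
    """Return (start, end) index pairs for every window of adjacent SNPs."""
    return [(start, start + width) for start in range(n_snps - width + 1)]


def window_haplotype_counts(haplotypes, width):
    """Count the haplotypes observed in each window.

    haplotypes: list of equal-length strings, one character per SNP.
    Returns a dict mapping (start, end) to a Counter of sub-haplotypes.
    """
    n_snps = len(haplotypes[0])
    counts = {}
    for start, end in sliding_windows(n_snps, width):
        counts[(start, end)] = Counter(h[start:end] for h in haplotypes)
    return counts


# Twenty SNPs scanned with a 5-SNP window give 16 overlapping windows.
windows = sliding_windows(20, 5)

# Hypothetical phased haplotypes over 5 SNPs, scanned with a 3-SNP window.
example = window_haplotype_counts(["ACGTA", "ACGTT", "TCGTA"], 3)
```

In a full analysis, the per-window counts would be computed separately for cases and controls and compared with an appropriate test, with correction for the number of windows examined.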



Haplotypes of NOS3 gene polymorphisms in dilated cardiomyopathy.  


Dilated Cardiomyopathy (DCM) is characterized by systolic dysfunction, followed by heart failure necessitating cardiac transplantation. The genetic basis is well established by the identification of mutations in sarcomere and cytoskeleton genes. Modifier genes and environmental factors are also considered to play a significant role in the variable expression of the disease; hence various mechanisms are implicated, and one such mechanism is oxidative stress. Nitric Oxide (NO), a primary physiological transmitter derived from endothelium, seems to play a composite role with diverse anti-atherogenic effects as a vasodilator. Three functional polymorphisms of the endothelial nitric oxide synthase (NOS3) gene, viz., T-786C of the 5' flanking region, 27bp VNTR in intron 4 and G894T of exon 7, were genotyped to identify their role in DCM. A total of 115 DCM samples and 454 controls were included. Genotyping was carried out by the PCR-RFLP method. Allelic and genotypic frequencies were computed in both control and patient groups and appropriate statistical tests were employed. A significant association of the TC genotype (T-786C) with an odds ratio of 1.74 (95% CI 1.14-2.67, p = 0.01) was observed in DCM. Likewise, the GT genotypic frequency of the G894T polymorphism was found to be statistically significant (OR 2.10, 95% CI 1.34-3.27, p = 0.0011), with the recessive allele T being significantly associated with DCM (OR 1.64, 95% CI 1.18-2.30, p = 0.003). The haplotype carrying the recessive alleles of G894T and T-786C, C4bT, was found to exhibit a 7-fold increased risk for DCM compared to the controls. Hence the C4bT haplotype could be the risk haplotype for DCM. Our findings suggest the possible implication of the NOS3 gene in the disease phenotype, wherein NOS3 may be synergistically functioning in DCM-associated heart failure via the excessive production of NO in cardiomyocytes, resulting in decreased myocardial contractility and systolic dysfunction, a common feature of the DCM phenotype. PMID:23923002

Matsa, Lova Satyanarayana; Rangaraju, Advithi; Vengaldas, Viswamitra; Latifi, Mona; Jahromi, Hossein Mehraban; Ananthapur, Venkateshwari; Nallari, Pratibha
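
Odds ratios with 95% confidence intervals, as reported in the abstract above, are often derived from a 2×2 table of carrier counts. The following Python sketch uses the standard Wald interval on the log odds ratio; the function name `odds_ratio_ci` and the counts are hypothetical, and the abstract does not state which interval method the authors used.

```python
import math


def odds_ratio_ci(case_exposed, case_unexposed, ctrl_exposed, ctrl_unexposed,
                  z=1.96):
    """Odds ratio and Wald confidence interval from a 2x2 table.

    z = 1.96 corresponds to a two-sided 95% interval.
    All four cell counts must be non-zero.
    """
    a, b, c, d = case_exposed, case_unexposed, ctrl_exposed, ctrl_unexposed
    odds_ratio = (a * d) / (b * c)
    # Standard error of ln(OR).
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


# Hypothetical counts: 20 of 100 cases and 10 of 100 controls carry the genotype.
or_value, ci_low, ci_high = odds_ratio_ci(20, 80, 10, 90)
```

An interval that excludes 1 corresponds to a result that is significant at the matching level; with the hypothetical counts above the odds ratio is 2.25 but the interval narrowly includes 1.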



Whole genome sequencing of peach (Prunus persica L.) for SNP identification and selection  

PubMed Central

Background: The application of next generation sequencing technologies and bioinformatic scripts to identify high frequency SNPs distributed throughout the peach genome is described. Three peach genomes were sequenced using Roche 454 and Illumina/Solexa technologies to obtain long contigs for alignment to the draft 'Lovell' peach sequence as well as sufficient depth of coverage for 'in silico' SNP discovery. Description: The sequences were aligned to the 'Lovell' peach genome released April 01, 2010 by the International Peach Genome Initiative (IPGI). 'Dr. Davis', 'F8, 1-42' and 'Georgia Belle' were sequenced to add SNPs segregating in two breeding populations, Pop DF ('Dr. Davis' × 'F8, 1-42') and Pop DG ('Dr. Davis' × 'Georgia Belle'). Roche 454 sequencing produced 980,000 total reads with 236 Mb sequence for 'Dr. Davis' and 735,000 total reads with 172 Mb sequence for 'F8, 1-42'. 84 bp × 84 bp paired end Illumina/Solexa sequences yielded 25.5, 21.4, 25.5 million sequences for 'Dr. Davis', 'F8, 1-42' and 'Georgia Belle', respectively. BWA/SAMtools were used for alignment of raw reads and SNP detection, with custom PERL scripts for SNP filtering. Velvet's Columbus module was used for sequence assembly. Comparison of aligned and overlapping sequences from both Roche 454 and Illumina/Solexa resulted in the selection of 6654 high quality SNPs for 'Dr. Davis' vs. 'F8, 1-42' and 'Georgia Belle', distributed on eight major peach genome scaffolds as defined from the 'Lovell' assembly. Conclusion: The eight scaffolds contained about 215-225 Mb of peach genomic sequences with one SNP/~40,000 bases. All sequences from Roche 454 and Illumina/Solexa have been submitted to NCBI for public use in the Short Read Archive database. SNPs have been deposited in the NCBI SNP database.



Development and Characterization of a High Density SNP Genotyping Assay for Cattle  

PubMed Central

The success of genome-wide association (GWA) studies for the detection of sequence variation affecting complex traits in human has spurred interest in the use of large-scale high-density single nucleotide polymorphism (SNP) genotyping for the identification of quantitative trait loci (QTL) and for marker-assisted selection in model and agricultural species. A cost-effective and efficient approach for the development of a custom genotyping assay interrogating 54,001 SNP loci to support GWA applications in cattle is described. A novel algorithm for achieving a compressed inter-marker interval distribution proved remarkably successful, with median interval of 37 kb and maximum predicted gap of <350 kb. The assay was tested on a panel of 576 animals from 21 cattle breeds and six outgroup species and revealed that from 39,765 to 46,492 SNP are polymorphic within individual breeds (average minor allele frequency (MAF) ranging from 0.24 to 0.27). The assay also identified 79 putative copy number variants in cattle. Utility for GWA was demonstrated by localizing known variation for coat color and the presence/absence of horns to their correct genomic locations. The combination of SNP selection and the novel spacing algorithm allows an efficient approach for the development of high-density genotyping platforms in species having full or even moderate quality draft sequence. Aspects of the approach can be exploited in species which lack an available genome sequence. The BovineSNP50 assay described here is commercially available from Illumina and provides a robust platform for mapping disease genes and QTL in cattle.

Matukumalli, Lakshmi K.; Lawley, Cynthia T.; Schnabel, Robert D.; Taylor, Jeremy F.; Allan, Mark F.; Heaton, Michael P.; O'Connell, Jeff; Moore, Stephen S.; Smith, Timothy P. L.; Sonstegard, Tad S.; Van Tassell, Curtis P.



SNP analyses of growth factor genes EGF, TGFβ-1, and HGF reveal haplotypic association of EGF with autism  

Microsoft Academic Search

Autism is a pervasive neurodevelopmental disorder diagnosed in early childhood. Growth factors have been found to play a key role in the cellular differentiation and proliferation of the central and peripheral nervous systems. Epidermal growth factor (EGF) is detected in several regions of the developing and adult brain, where it enhances the differentiation, maturation, and survival of a variety of

Takao Toyoda; Kazuhiko Nakamura; Kazuo Yamada; Ismail Thanseem; Ayyappan Anitha; Shiro Suda; Masatsugu Tsujii; Yoshimi Iwayama; Eiji Hattori; Tomoko Toyota; Taishi Miyachi; Yasuhide Iwata; Katsuaki Suzuki; Hideo Matsuzaki; Masayoshi Kawai; Yoshimoto Sekine; Kenji Tsuchiya; Gen-ichi Sugihara; Yasuomi Ouchi; Toshiro Sugiyama; Nori Takei; Takeo Yoshikawa; Norio Mori



SNP and Haplotype Analysis of the Tryptophan Hydroxylase 2 Gene in Alcohol-Dependent Patients and Alcohol-Related Suicide  

Microsoft Academic Search

Several lines of evidence indicate that disturbances of the central serotonergic system are involved in the pathophysiology of alcohol dependence and suicidal behavior. Recent studies have indicated that a newly identified second isoform of the tryptophan hydroxylase gene (TPH2) is preferentially involved in the rate limiting synthesis of neuronal serotonin. Genetic variations in the TPH2 gene have been associated with

Peter Zill; Ulrich W Preuss; Gabrielle Koller; Brigitta Bondy; Michael Soyka



A SNP Haplotype Associated with a gene resistant to Xanthomonas axonopodis pv. malvacearum in Upland Cotton (Gossypium hirsutum L.)  

Technology Transfer Automated Retrieval System (TEKTRAN)

An F5 population of 285 families, each tracing back to a different F2 plant, derived from a cotton bacterial blight resistant line ‘DeltaOpal’ and a susceptible line ‘DP388’, was artificially inoculated with bacterial blight race 18 (Xanthomonas campestris pv. malvacearum) to assay their resist...


HLA Class II Profile and Distribution of HLA-DRB1 and HLA-DQB1 Alleles and Haplotypes among Lebanese and Bahraini Arabs  

Microsoft Academic Search

The gene frequencies of HLA class II alleles were studied in 95 healthy Lebanese Arab and 72 healthy Bahraini Arab subjects. Our aim was to establish the genetic relationship between Bahraini and Lebanese Arabs in terms of HLA class II gene and haplotype frequencies and to compare these results with frequencies for other countries with populations of Caucasian and non-Caucasian

Wassim Y. Almawi; Marc Busson; Hala Tamim; Einas M. Al-Harbi; Ramzi R. Finan; Saria F. Wakim-Ghorayeb; Ayesha A. Motala



Functional haplotypes of PADI4: relevance for rheumatoid arthritis specific synovial intracellular citrullinated proteins and anticitrullinated protein antibodies  

PubMed Central

Background: Haplotypes of PADI4, encoding for a citrullinating enzyme, were associated with rheumatoid arthritis in a Japanese population. It was suggested they were related to the presence of anticitrullinated protein antibodies (ACPA). Objective: To explore the relation between PADI4 haplotypes, the presence of rheumatoid arthritis specific intracellular citrullinated proteins in synovial membrane, and serum ACPA titres. Methods: Synovial biopsies and peripheral blood samples were obtained in 59 patients with rheumatoid arthritis. Synovial intracellular citrullinated proteins were detected by immunohistochemistry. Serum ACPA titres were measured by anti-CCP2 ELISA. PADI4 haplotypes were determined by direct sequencing of the four exonic PADI4 single nucleotide polymorphisms. Results: PADI4 haplotype frequencies and the presence of synovial intracellular citrullinated proteins and ACPA were comparable with previous studies. There was no significant association between PADI4 haplotype 1 or 2 and the presence of synovial intracellular citrullinated proteins, although these proteins were associated with higher serum ACPA. There was no correlation between PADI4 haplotypes and serum ACPA, either by continuous analysis using the titres or by dichotomous analysis using the diagnostic cut off. Further analyses in homozygotes for haplotype 1 or 2 or in heterozygotes (1/2) also failed to show an association between PADI4 polymorphisms and ACPA. This contrasted with the clear association between ACPA levels and HLA-DR shared epitope. Conclusions: The link between synovial intracellular citrullinated proteins and ACPA emphasises the role of deimination of synovial proteins in rheumatoid arthritis, but the biological relevance of the PADI4 haplotypes for this autoimmune process is questionable, at least in a European population.

Cantaert, T; Coucke, P; De Rycke, L; Veys, E; De Keyser, F; Baeten, D



Digital genotyping and haplotyping with polymerase colonies  

PubMed Central

Polymerase colony (polony) technology amplifies multiple individual DNA molecules within a thin acrylamide gel attached to a microscope slide. Each DNA molecule included in the reaction produces an immobilized colony of double-stranded DNA. We genotype these polonies by performing single base extensions with dye-labeled nucleotides, and we demonstrate the accurate quantitation of two allelic variants. We also show that polony technology can determine the phase, or haplotype, of two single-nucleotide polymorphisms (SNPs) by coamplifying distally located targets on a single chromosomal fragment. We correctly determine the genotype and phase of three different pairs of SNPs. In one case, the distance between the two SNPs is 45 kb, the largest distance achieved to date without separating the chromosomes by cloning or somatic cell fusion. The results indicate that polony genotyping and haplotyping may play an important role in understanding the structure of genetic variation.

Mitra, Robi D.; Butty, Vincent L.; Shendure, Jay; Williams, Benjamin R.; Housman, David E.; Church, George M.



FMR1 Haplotype Analysis among Indian Communities  

Microsoft Academic Search

Objective: To analyse fragile X mental retardation 1 haplotypes among non-fragile X males from different Indian caste-based communities in order to enable inter-community comparisons to be made and permit wider comparisons with other ethnic groups. Subjects and Methods: Males (n = 124) from four major Hindu castes (Brahmins, Kshatriyas, Vaishyas and Shudras) and the Indian Muslim community were typed using

B. K. Thelma; D. Sharma



βS haplotypes in various world populations  

Microsoft Academic Search

We have determined the βS haplotypes in 709 patients with sickle cell anemia, 30 with SC disease, 91 with S-β-thalassemia, and in 322 Hb S heterozygotes from different countries. The methodology concerned the detection of mutations in the promoter sequences of the Gγ- and Aγ-globin genes through dot blot analysis of amplified DNA with 32P-labeled probes, and an analysis of

Cihan Öner; Aleksandar J. Dimovski; Nancy F. Olivieri; Gino Schiliro; John F. Codrington; Sladdehine Fattoum; Adekunle D. Adekile; Reyhan Öner; Gunes T. Yüregir; C. Altay; A. Gurgey; Rashik B. Gupta; Vinod B. Jogessar; Michael N. Kitundu; Dimitris Loukopoulos; Gabriel P. Tamagnini; M. Leticia S. Ribeiro; Ferdane Kutlar; Li-Hao Gu; Kenneth D. Lanclos; Titus H. J. Huisman



The Wilson disease gene: Haplotypes and mutations  

SciTech Connect

Wilson disease (WND) is an autosomal recessive defect of copper transport. The gene involved in WND, located on chromosome 13, has recently been shown to be a putative copper transporting P-type ATPase, designated ATP7B. The gene is highly similar to ATP7A, located on the X chromosome, which is defective in Menkes disease, another disorder of copper transport. We have available for study WND families from Canada (34 families), the United Kingdom (32 families), Japan (4 families), Iceland (3 families) and Hong Kong (2 families). We have utilized four highly polymorphic CA repeat markers (D13S296, D13S301, D13S314 and D13S316) surrounding the ATP7B locus to construct haplotypes in these families. Analysis indicates that there are many unique WND haplotypes not present on normal chromosomes and that there may be a large number of different WND mutations. We have screened the WND patients for mutations in the ATP7B gene. Fifty-six patients, representing all of the identified haplotypes, have been screened using single strand conformational polymorphism (SSCP), followed by selective sequencing. To date, 19 mutations and 12 polymorphisms have been identified. All of the changes are nucleotide substitutions or small insertions/deletions and there is no evidence for larger deletions as seen in the similar gene on the X chromosome, ATP7A. Haplotypes of close markers and the ability to detect some of the mutations present in the gene allow for more reliable molecular diagnosis of presymptomatic sibs of WND patients. A reassessment of individuals previously diagnosed in the presymptomatic phase is now required, as we have identified some heterozygotes who are biochemically indistinguishable from affected homozygotes. The identification of specific mutations will soon allow direct diagnosis of WND patients with a high level of certainty.

Thomas, G.R.; Roberts, E.A.; Cox, D.W. [Hospital for Sick Children, Toronto (Canada)]; Walshe, J.M. [Middlesex Hospital, London (United Kingdom)]



Atomic Force Microscopy for DNA SNP Identification  

NASA Astrophysics Data System (ADS)

The knowledge of the effects of single-nucleotide polymorphisms (SNPs) in the human genome greatly contributes to better comprehension of the relation between genetic factors and diseases. Sequence analysis of genomic DNA in different individuals reveals positions where variations that involve individual base substitutions can occur. Single-nucleotide polymorphisms are highly abundant and can have different consequences at phenotypic level. Several attempts were made to apply atomic force microscopy (AFM) to detect and map SNP sites in DNA strands. The most promising approach is the study of DNA mutations producing heteroduplex DNA strands and identifying the mismatches by means of a protein that labels the mismatches. MutS is a protein that is part of a well-known complex of mismatch repair, which initiates the process of repairing when the MutS binds to the mismatched DNA filament. The position of MutS on the DNA filament can be easily recorded by means of AFM imaging.

Valbusa, Ugo; Ierardi, Vincenzo


Identification of single nucleotide polymorphisms within the mtDNA genome of the domestic dog to discriminate individuals with common HVI haplotypes.  


We sequenced the entire ~16 kb canine mitochondrial genome (mtGenome) of 100 unrelated domestic dogs (Canis lupus familiaris) and compared these to 246 published sequences to assess hypervariable region I (HVI) haplotype frequencies. We then used all available sequences to identify informative single nucleotide polymorphisms (SNPs) outside of the control region for use in further resolving mtDNA haplotypes corresponding to common HVI haplotypes. Haplotype frequencies in our data set were highly correlated with previous ones (e.g., F(ST)=0.02, r=0.90), suggesting the total data set reasonably reflected the broader dog population. A total of 128 HVI haplotypes was represented. The 10 most common HVI haplotypes (n=184 dogs) represented 53.3% of the sample. We identified a total of 71 SNPs in the mtGenomes (external to the control region) that resolved the 10 most common HVI haplotypes into 63 mtGenome subhaplotypes. The random match probability of the dataset based solely on the HVI sequence was 4%, whereas the random match probability of the mtGenome subhaplotypes was <1%. Thus, the panel of 71 SNPs identified in this study represents a useful forensic tool to further resolve the identity of individual dogs from mitochondrial DNA (mtDNA). PMID:22436122

Imes, Donna L; Wictum, Elizabeth J; Allard, Marc W; Sacks, Benjamin N
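
A random match probability of the kind quoted in this record is commonly estimated as the sum of squared haplotype frequencies. The sketch below assumes that estimator (the abstract does not say which one the authors used), and the haplotype labels are invented for illustration, not taken from the study.

```python
from collections import Counter

def random_match_probability(haplotypes):
    """Probability that two individuals drawn at random share a haplotype.

    Assumes the common estimator: sum of squared haplotype frequencies.
    """
    counts = Counter(haplotypes)
    n = sum(counts.values())
    return sum((c / n) ** 2 for c in counts.values())

# Hypothetical sample: HVI alone lumps dogs together, extra SNPs split them.
hvi_only = ["A11", "A11", "A11", "A11", "B1", "B1", "A17", "A2"]
with_snps = ["A11-a", "A11-b", "A11-c", "A11-a", "B1-a", "B1-b", "A17", "A2"]

rmp_hvi = random_match_probability(hvi_only)
rmp_snps = random_match_probability(with_snps)
```

Splitting a common haplotype into subhaplotypes can only lower the sum of squares, which is why adding informative SNPs reduces the match probability.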



Accelerating genetic improvement with SNP chips and DNA sequencing  

Technology Transfer Automated Retrieval System (TEKTRAN)

The development of high-density single nucleotide polymorphism (SNP) assays is expected to have a profound impact on genetic progress in the U.S. dairy industry. In the 16 months since its initial availability, the Illumina BovineSNP50 BeadChip has been used to genotype nearly 20,000 Holsteins. Thes...


SNP, identity and citizenship: Re-imagining state and nation  

Microsoft Academic Search

The Scottish National Party (SNP) has been strongly critical of attempts to resuscitate British national identity and has sought to present an alternative Scottish cultural and political identity that is projected as ‘wholly civic’. However, questions persist as to how the SNP understand concepts such as citizenship and nationality and the extent to which their civic nationalism is reflected empirically

Andrew Mycock



A 34-plex autosomal SNP single base extension assay for ancestry investigations.  


Ancestry inference based on autosomal markers remains a niche approach in forensic analysis: most laboratories feel more secure with a review of the cumulative STR profile frequencies in a range of relevant populations with the possible additional analysis of mitochondrial and/or Y-chromosome variability. However, a proportion of autosomal single nucleotide polymorphisms (SNPs) show very well-differentiated allele frequencies among global population-groups. Furthermore, such ancestry informative marker SNPs (AIM-SNPs) lend themselves to relatively straightforward typing with short-amplicon PCR and multiplexed single base extension reactions using the same capillary electrophoresis detectors required for the sequencing and STR genotyping of mainstream forensic markers. In this chapter, we describe a 34 AIM-SNP multiplex that is robust enough for the analysis of challenging, often highly degraded DNA typical of much of routine forensic casework. We also outline in detail the in-silico procedures necessary for collecting parental population reference data from the SPSmart SNP databases and performing ancestry inference of single AIM-SNP profiles or large-scale population data using the companion ancestry analysis website of Snipper. Two casework examples are described that show, in both cases, that an inference of likely ancestry using AIM-SNPs helped the identification of highly degraded skeletal material. PMID:22139656

Phillips, C; Fondevila, M; Lareau, Maria Victoria
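
Ancestry inference from a single AIM-SNP profile can be pictured as a likelihood comparison across reference populations. The sketch below is a naive illustration that assumes Hardy-Weinberg proportions and independent loci. It is not the Snipper implementation, and the population names and allele frequencies are hypothetical.

```python
import math

def genotype_likelihood(genotype, p):
    """Hardy-Weinberg probability of a genotype given as the count (0, 1, 2)
    of the reference allele, whose population frequency is p."""
    q = 1.0 - p
    return (q * q, 2 * p * q, p * p)[genotype]

def log_likelihoods(profile, ref_freqs):
    """Log-likelihood of a multi-SNP profile in each reference population,
    assuming independent loci."""
    return {
        pop: sum(math.log(genotype_likelihood(g, p)) for g, p in zip(profile, freqs))
        for pop, freqs in ref_freqs.items()
    }

# Hypothetical reference-allele frequencies at three AIM-SNPs (illustrative only).
ref_freqs = {
    "pop_A": [0.95, 0.10, 0.80],
    "pop_B": [0.20, 0.85, 0.30],
}
profile = [2, 0, 1]  # reference-allele counts at the three SNPs

scores = log_likelihoods(profile, ref_freqs)
best = max(scores, key=scores.get)
```

Frequencies of exactly 0 or 1 would make the logarithm undefined, so a real analysis would need some form of frequency adjustment. That step is left out here.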



Haplotyping, linkage mapping and expression analysis of barley genes regulated by terminal drought stress influencing seed quality  

PubMed Central

Background The increasingly narrow genetic background characteristic of modern crop germplasm presents a challenge for the breeding of cultivars that require adaptation to the anticipated change in climate. Thus, high priority research aims at the identification of relevant allelic variation present both in the crop itself as well as in its progenitors. This study is based on the characterization of genetic variation in barley, with a view to enhancing its response to terminal drought stress. Results The expression patterns of drought regulated genes were monitored during plant ontogeny, mapped and the location of these genes was incorporated into a comprehensive barley SNP linkage map. Haplotypes within a set of 17 starch biosynthesis/degradation genes were defined, and a particularly high level of haplotype variation was uncovered in the genes encoding sucrose synthase (types I and II) and starch synthase. The ability of a panel of 50 barley accessions to maintain grain starch content under terminal drought conditions was explored. Conclusion The linkage/expression map is an informative resource in the context of characterizing the response of barley to drought stress. The high level of haplotype variation among starch biosynthesis/degradation genes in the progenitors of cultivated barley shows that domestication and breeding have greatly eroded their allelic diversity in current elite cultivars. Prospective association analysis based on core drought-regulated genes may simplify the process of identifying favourable alleles, and help to understand the genetic basis of the response to terminal drought.



HLA DPA1, DPB1 Alleles and Haplotypes Contribute to the Risk Associated With Type 1 Diabetes  

PubMed Central

OBJECTIVE To determine the relative risk associated with DPA1 and DPB1 alleles and haplotypes in type 1 diabetes. RESEARCH DESIGN AND METHODS The frequency of DPA1 and DPB1 alleles and haplotypes in type 1 diabetic patients was compared to the family based control frequency in 1,771 families directly and conditional on HLA (B)-DRB1-DQA1-DQB1 linkage disequilibrium. A relative predispositional analysis (RPA) was performed in the presence or absence of the primary HLA DR-DQ associations and the contribution of DP haplotype to individual DR-DQ haplotype risks examined. RESULTS Eight DPA1 and thirty-eight DPB1 alleles forming seventy-four DPA1-DPB1 haplotypes were observed; nineteen DPB1 alleles were associated with multiple DPA1 alleles. Following both analyses, type 1 diabetes susceptibility was significantly associated with DPB1*0301 (DPA1*0103-DPB1*0301) and protection with DPB1*0402 (DPA1*0103-DPB1*0402) and DPA1*0103-DPB1*0101 but not DPA1*0201-DPB1*0101. In addition, DPB1*0202 (DPA1*0103-DPB1*0202) and DPB1*0201 (DPA1*0103-DPB1*0201) were significantly associated with susceptibility in the presence of the high risk and protective DR-DQ haplotypes. Three associations (DPB1*0301, *0402, and *0202) remained statistically significant when only the extended HLA-A1-B8-DR3 haplotype was considered, suggesting that DPB1 alone may delineate the risk associated with this otherwise conserved haplotype. CONCLUSIONS HLA DP allelic and haplotypic diversity contributes significantly to the risk for type 1 diabetes; DPB1*0301 (DPA1*0103-DPB1*0301) is associated with susceptibility and DPB1*0402 (DPA1*0103-DPB1*0402) and DPA1*0103-DPB1*0101 with protection. Additional evidence is presented for the susceptibility association of DPB1*0202 (DPA1*0103-DPB1*0202) and for a contributory role of individual amino acids and DPA1 or a gene in linkage disequilibrium in DR3-DPB1*0101 positive haplotypes.

Varney, Michael D.; Valdes, Ana Maria; Carlson, Joyce A.; Noble, Janelle A.; Tait, Brian D.; Bonella, Persia; Lavant, Eva; Fear, Anna Lisa; Louey, Anthony; Moonsamy, Priscilla; Mychaleckyj, Josyf C.; Erlich, Henry



HLA class II linkage disequilibrium and haplotype evolution in the Cayapa Indians of Ecuador  

SciTech Connect

DNA-based typing of the HLA class II loci in a sample of the Cayapa Indians of Ecuador reveals several lines of evidence that selection has operated to maintain and to diversify the existing level of polymorphism in the class II region. As has been noticed for other Native American groups, the overall level of polymorphism at the DRB1, DQA1, DQB1, and DPB1 loci is reduced relative to that found in other human populations. Nonetheless, the relative evenness in the distribution of allele frequencies at each of the four loci points to the role of balancing selection in the maintenance of the polymorphism. The DQA1 and DQB1 loci, in particular, have near-maximum departures from the neutrality model, which suggests that balancing selection has been especially strong in these cases. Several novel DQA1-DQB1 haplotypes and the discovery of a new DRB1 allele demonstrate an evolutionary tendency favoring the diversification of class II alleles and haplotypes. The recombination interval between the centromeric DPB1 locus and the other class II loci will, in the absence of other forces such as selection, reduce disequilibrium across this region. However, nearly all common alleles were found to be part of DR-DP haplotypes in strong disequilibrium, consistent with the recent action of selection acting on these haplotypes in the Cayapa. 50 refs., 3 figs., 3 tabs.

Trachtenberg, E.A.; Erlich, H.A. [Roche Molecular Systems, Alameda, CA (United States)]; Klitz, W. [Univ. of California, Berkeley, CA (United States)] [and others]
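
Disequilibrium between two loci, such as the DR-DP haplotypes in this record, is usually summarised by D, D' and r². The following is a minimal sketch of those standard pairwise measures. The frequencies are hypothetical, and both loci are assumed to be polymorphic so that the denominators are non-zero.

```python
def linkage_disequilibrium(p_ab, p_a, p_b):
    """Return (D, D', r^2) for alleles A and B at two loci.

    p_ab: frequency of the A-B haplotype; p_a, p_b: allele frequencies.
    """
    d = p_ab - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = d / d_max if d_max > 0 else 0.0
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, d_prime, r2

# Hypothetical DR-DP example: the two alleles almost always travel together.
d, d_prime, r2 = linkage_disequilibrium(p_ab=0.28, p_a=0.30, p_b=0.30)
```

With free recombination and no selection, D decays toward zero each generation, so a D' near 1 across a recombining interval is the kind of pattern the abstract interprets as evidence of recent selection.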



Haplotype data for 23 Y-chromosome markers in a reference sample from Bosnia and Herzegovina  

PubMed Central

Aim To detect polymorphisms of 23 Y-chromosomal short tandem repeat (STR) loci, including 6 new loci, in a reference database of male population of Bosnia and Herzegovina, as well as to assess the importance of increasing the number of Y-STR loci utilized in forensic DNA analysis. Methods The reference sample consisted of 100 healthy, unrelated men originating from Bosnia and Herzegovina. Sample collection using buccal swabs was performed in all geographical regions of Bosnia and Herzegovina in the period from 2010 to 2011. DNA samples were typed for 23 Y STR loci, including 6 new loci: DYS576, DYS481, DYS549, DYS533, DYS570, and DYS643, which are included in the new PowerPlex® Y 23 amplification kit. Results The absolute frequency of generated haplotypes was calculated and results showed that 98 samples had unique Y 23 haplotypes, and that only two samples shared the same haplotype. The most polymorphic locus was DYS418, with 14 detected alleles and the least polymorphic loci were DYS389I, DYS391, DYS437, and DYS393. Conclusion This study showed that increasing the number of highly polymorphic Y STR markers, to include those tested in our analysis, leads to a reduction of repeating haplotypes, which is very important in the application of forensic DNA analysis.

Kovacevic, Lejla; Fatur-Ceric, Vera; Hadzic, Negra; Cakar, Jasmina; Primorac, Dragan; Marjanovic, Damir
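
The haplotype counts reported in this record (98 singletons plus one haplotype shared by two men) can be turned into two standard forensic summaries. The abstract does not state which statistics the authors computed, so the sketch below is only an illustration using the usual definitions of discrimination capacity and Nei's unbiased haplotype diversity.

```python
def discrimination_capacity(counts):
    """Number of distinct haplotypes divided by sample size."""
    return len(counts) / sum(counts)

def haplotype_diversity(counts):
    """Nei's unbiased haplotype diversity: n/(n-1) * (1 - sum p_i^2)."""
    n = sum(counts)
    return n / (n - 1) * (1 - sum((c / n) ** 2 for c in counts))

# Counts taken from the abstract: 98 singletons plus one haplotype seen twice.
counts = [1] * 98 + [2]

dc = discrimination_capacity(counts)
hd = haplotype_diversity(counts)
```

Here `dc` is 99/100 and `hd` equals 1 - 2/9900, so almost every pair of men in the sample is distinguishable.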



Phylogenetic nomenclature and evolution of mannose-binding lectin (MBL2) haplotypes  

PubMed Central

Background Polymorphisms of the mannose-binding lectin gene (MBL2) affect the concentration and functional efficiency of the protein. We recently used haplotype-specific sequencing to identify 23 MBL2 haplotypes, associated with enhanced susceptibility to several diseases. Results In this work, we applied the same method in 288 and 470 chromosomes from Gabonese and European adults, respectively, and found three new haplotypes in the last group. We propose a phylogenetic nomenclature to standardize MBL2 studies and found two major phylogenetic branches due to six strongly linked polymorphisms associated with high MBL production. They presented high Fst values and were embedded in regions with high nucleotide diversity and significant Tajima's D values. Compared to others using small sample sizes and unphased genotypic data, we found differences in haplotyping, frequency estimation, Fu and Li's D* and Fst results. Conclusion Using extensive testing for selective neutrality, we confirmed that stochastic evolutionary factors have had a major role in shaping this polymorphic gene worldwide.
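
Fst values like those mentioned in this record measure how far allele frequencies differ between populations. The sketch below assumes the simplest single-locus form of Wright's Fst, (H_T - H_S) / H_T, with populations weighted equally. The estimator the authors actually used is not given in the abstract, and the frequencies here are hypothetical.

```python
def fst_biallelic(freqs):
    """Wright's F_ST for one biallelic locus from per-population allele
    frequencies, weighting populations equally: (H_T - H_S) / H_T."""
    p_bar = sum(freqs) / len(freqs)
    h_t = 2 * p_bar * (1 - p_bar)
    h_s = sum(2 * p * (1 - p) for p in freqs) / len(freqs)
    return (h_t - h_s) / h_t if h_t > 0 else 0.0

fst = fst_biallelic([0.10, 0.50])   # two populations with different frequencies
same = fst_biallelic([0.30, 0.30])  # identical frequencies give no differentiation
```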



Haplotype test reveals departure from neutrality in a segment of the white gene of Drosophila melanogaster  

SciTech Connect

Restriction map studies previously revealed extensive linkage disequilibria in the transcriptional unit of the white locus in natural Drosophila melanogaster populations. To understand the causes of these disequilibria, we sequenced a 4722-bp region of the white gene from 15 lines of D. melanogaster and 1 line of Drosophila simulans. Statistical tests applied to the entire 4722-bp region do not reject neutrality. In contrast, a test for high-frequency haplotypes ("Haplotype test") revealed an 834-bp segment, encompassing the 3′ end of intron 1 to the 3′ end of intron 2, in which the structure of variation deviates significantly from the predictions of a neutral equilibrium model. The variants in this 834-bp segment segregate as single haplotype blocks. We propose that these unusually large haplotype blocks are due to positive selection on polymorphisms within the white gene, including a replacement polymorphism, Arg→Leu, within this segment. 45 refs., 4 figs., 1 tab.

Kirby, D.A.; Stephan, W. [Univ. of Maryland, College Park, MD (United States)]



Patterns of haplotypes for 92 cystic fibrosis mutations: Variability, association and recurrence  

SciTech Connect

Most CFTR mutations are very uncommon among the cystic fibrosis population, with frequencies of less than 1%, and many are found only in specific areas. We have analyzed 92 CF mutations for several markers (4 microsatellites and 3 other polymorphisms) scattered in the CFTR gene. Haplotypes associated with these mutations can be used as a framework in the screening of chromosomes carrying unknown mutations. The association between mutation and haplotype reduces the number of mutations it is necessary to search for to a maximum of 16 for the same haplotype. Only mutations ΔF508, G542X and N1303K are associated with more than one haplotype as a result of slippage at more than one microsatellite locus, suggesting that these three are the most ancient CF mutations. Recurrence has been found for at least 7 mutations: H199Y, R347P, L558S, R553X, 2184insA, 3272-26A→G, 3849+10kbC→T and R1162X. Also, microsatellite analysis of chromosomes of several ethnic origins (Czech, Italian, Russian, Slovak and Spanish) suggested the possibility of three or more independent origins for mutations R334W, R347P, R1162X, and 3849+10kbC→T, which was confirmed by analysis of markers flanking these mutations.

Morral, N.; Llevadot, R.; Estivill, X. [I.R.O., Barcelona (Spain)] [and others]



Green turtles (Chelonia mydas) foraging at Arvoredo Island in Southern Brazil: genetic characterization and mixed stock analysis through mtDNA control region haplotypes  

Microsoft Academic Search

We analyzed mtDNA control region sequences of green turtles (Chelonia mydas) from Arvoredo Island, a foraging ground in southern Brazil, and identified eight haplotypes. Of these, CM-A8 (64%) and CM-A5 (22%) were dominant, the remainder presenting low frequencies (< 5%). Haplotype (h) and nucleotide (π) diversities were 0.5570 ± 0.0697 and 0.0021 ± 0.0016, respectively. Exact tests of differentiation and AMOVA ST

Maíra Carneiro Proietti; Paula Lara-Ruiz; Júlia Wiener Reisser; Luciano da Silva Pinto; Odir Antonio Dellagostin; Luis Fernando Marins
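
Nucleotide diversity of the kind reported in this record is the average proportion of sites that differ between two sequences drawn from the sample. The sketch below assumes aligned sequences of equal length with no gaps, and the sequences themselves are invented for illustration.

```python
from itertools import combinations

def nucleotide_diversity(seqs):
    """Average proportion of differing sites over all pairs of aligned,
    equal-length sequences (Nei's pi, sample version)."""
    length = len(seqs[0])
    pairs = list(combinations(seqs, 2))
    diffs = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return diffs / (len(pairs) * length)

# Four hypothetical 10-bp control-region fragments.
seqs = ["ACGTACGTAC", "ACGTACGTAC", "ACGTACGTAT", "ACGAACGTAC"]
pi = nucleotide_diversity(seqs)
```

The six pairwise comparisons here differ at 0, 1, 1, 1, 1 and 2 sites, giving 6 differences over 60 compared sites.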



Haplotype structure, adaptive history and associations with exploratory behaviour of the DRD4 gene region in four great tit (Parus major) populations.  


The assessment of genetic architecture and selection history in genes for behavioural traits is fundamental to our understanding of how these traits evolve. The dopamine receptor D4 (DRD4) gene is a prime candidate for explaining genetic variation in novelty seeking behaviour, a commonly assayed personality trait in animals. Previously, we showed that a single nucleotide polymorphism in exon 3 of this gene is associated with exploratory behaviour in at least one of four Western European great tit (Parus major) populations. These heterogeneous association results were explained by potential variable linkage disequilibrium (LD) patterns between this marker and the causal variant or by other genetic or environmental differences among the populations. Different adaptive histories are further hypothesized to have contributed to these population differences. Here, we genotyped 98 polymorphisms of the complete DRD4 gene including the flanking regions for 595 individuals of the four populations. We show that the LD structure, specifically around the original exon 3 SNP is conserved across the four populations and does not explain the heterogeneous association results. Study-wide significant associations with exploratory behaviour were detected in more than one haplotype block around exon 2, 3 and 4 in two of the four tested populations with different allele effect models. This indicates genetic heterogeneity in the association between multiple DRD4 polymorphisms and exploratory behaviour across populations. The association signals were in or close to regions with signatures of positive selection. We therefore hypothesize that variation in exploratory and other dopamine-related behaviour evolves locally by occasional adaptive shifts in the frequency of underlying genetic variants. PMID:23506506

Mueller, Jakob C; Korsten, Peter; Hermannstaedter, Christine; Feulner, Thomas; Dingemanse, Niels J; Matthysen, Erik; van Oers, Kees; van Overveld, Thijs; Patrick, Samantha C; Quinn, John L; Riemenschneider, Matthias; Tinbergen, Joost M; Kempenaers, Bart



[Correlations between haplogroup membership and Y-STR haplotype as a potential measure of quality control in forensic examinations].  


A correlation between particular Y-STR alleles from the so-called "minimal haplotype" and haplogroup membership of the Y chromosome was tested. We collected 146 Y chromosomes from haplogroups R1*, R1a1* and 1* and estimated the frequency of Y-STR alleles in each haplogroup. We then used different algorithms to assign a haplogroup to a haplotype, and tested their accuracy. Generally, a method based on calculation of haplotype similarity using the highest allele frequencies as modal values and assigning a score to each locus based on a ratio of allele frequencies turned out to give the most precise matches. However, using the same rules for Y chromosomes from other populations did not allow for precise estimation of their Y chromosome haplogroup frequencies. Possible explanations for this failure include interpopulation differences in haplotypes correlated with particular haplogroups, as well as a relatively small number of chromosomes analyzed. Potential uses for the presented method in forensics were also described. PMID:17131759

Woźniak, Marcin; Grzybowski, Tomasz; Starzyński, Jarosław; Papuga, Marta; Stopińska, Katarzyna; Luczak, Sylwia
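
The scoring approach described in this record (modal alleles plus a per-locus ratio of allele frequencies) could be read in several ways. The sketch below shows one assumed reading: each locus scores the observed allele's frequency divided by the modal allele's frequency within a haplogroup, and the haplotype is assigned to the haplogroup with the highest total. The frequencies are hypothetical and the scoring rule is an interpretation, not the authors' published algorithm.

```python
def haplogroup_scores(haplotype, allele_freqs):
    """Score a Y-STR haplotype against each haplogroup.

    Per-locus score (an assumed reading of the abstract): frequency of the
    observed allele divided by the frequency of the modal allele.
    """
    scores = {}
    for group, loci in allele_freqs.items():
        total = 0.0
        for locus, allele in haplotype.items():
            freqs = loci[locus]
            total += freqs.get(allele, 0.0) / max(freqs.values())
        scores[group] = total
    return scores

# Hypothetical allele frequencies at two loci in two haplogroups.
allele_freqs = {
    "R1a1": {"DYS19": {15: 0.2, 16: 0.7, 17: 0.1}, "DYS391": {10: 0.3, 11: 0.7}},
    "R1":   {"DYS19": {14: 0.8, 15: 0.2},          "DYS391": {10: 0.4, 11: 0.6}},
}
haplotype = {"DYS19": 16, "DYS391": 11}

scores = haplogroup_scores(haplotype, allele_freqs)
assigned = max(scores, key=scores.get)
```

Because the reference frequencies come from one population, a scheme like this would be expected to transfer poorly to populations with different allele distributions, which matches the limitation the abstract reports.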


Hereditary tyrosinemia type I: strong association with haplotype 6 in French Canadians permits simple carrier detection and prenatal diagnosis.  

PubMed Central

Hereditary tyrosinemia type 1 (HT1), a severe inborn error of tyrosine catabolism, is caused by deficiency of the terminal enzyme, fumarylacetoacetate hydrolase (FAH). The highest reported frequency of HT1 is in the French Canadian population, especially in the Saguenay-Lac-St-Jean region. Using human FAH cDNA probes, we have identified 10 haplotypes with TaqI, KpnI, RsaI, BglII, and MspI RFLPs in 118 normal chromosomes from the French Canadian population. Interestingly, in 29 HT1 children, a prevalent haplotype, haplotype 6, was found to be strongly associated with the disease, at a frequency of 90% of alleles, as compared with approximately 18% in 35 control individuals. This increased to 96% in the 24 patients originating from Saguenay-Lac-St-Jean. These results suggest that one or only a few prevailing mutations are responsible for most of the HT1 cases in Saguenay-Lac-St-Jean. Since most patients were found to be homozygous for a specific haplotype in this population, FAH RFLPs have permitted simple carrier detection in nine different informative HT1 families, with a confidence level of 99.9%. Heterozygosity rate values obtained from 52 carriers indicated that approximately 88% of families at risk from Saguenay-Lac-St-Jean are fully or partially informative. Prenatal diagnosis was also achieved in an American family. Analysis of 24 HT1 patients from nine countries gave a frequency of approximately 52% for haplotype 6, suggesting a relatively high association, worldwide, of HT1 with this haplotype.

Demers, S. I.; Phaneuf, D.; Tanguay, R. M.
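
The strength of the haplotype 6 association in this record can be illustrated with an allele-level odds ratio. The sketch below uses only the rounded percentages quoted in the abstract (90% of patient alleles versus approximately 18% of control alleles), so the result is a rough illustration rather than a published estimate.

```python
def odds_ratio(freq_cases, freq_controls):
    """Odds ratio for carrying a haplotype, from its frequency among case
    and control chromosomes."""
    odds_cases = freq_cases / (1 - freq_cases)
    odds_controls = freq_controls / (1 - freq_controls)
    return odds_cases / odds_controls

# Frequencies quoted in the abstract: 90% of patient alleles vs. ~18% of controls.
or_h6 = odds_ratio(0.90, 0.18)
```

With these inputs the odds are 9 among patient chromosomes and about 0.22 among control chromosomes, a ratio of roughly 41.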



Hereditary tyrosinemia type I: strong association with haplotype 6 in French Canadians permits simple carrier detection and prenatal diagnosis.  


Hereditary tyrosinemia type 1 (HT1), a severe inborn error of tyrosine catabolism, is caused by deficiency of the terminal enzyme, fumarylacetoacetate hydrolase (FAH). The highest reported frequency of HT1 is in the French Canadian population, especially in the Saguenay-Lac-St-Jean region. Using human FAH cDNA probes, we have identified 10 haplotypes with TaqI, KpnI, RsaI, BglII, and MspI RFLPs in 118 normal chromosomes from the French Canadian population. Interestingly, in 29 HT1 children, a prevalent haplotype, haplotype 6, was found to be strongly associated with the disease, at a frequency of 90% of alleles, as compared with approximately 18% in 35 control individuals. This increased to 96% in the 24 patients originating from Saguenay-Lac-St-Jean. These results suggest that one or only a few prevailing mutations are responsible for most of the HT1 cases in Saguenay-Lac-St-Jean. Since most patients were found to be homozygous for a specific haplotype in this population, FAH RFLPs have permitted simple carrier detection in nine different informative HT1 families, with a confidence level of 99.9%. Heterozygosity rate values obtained from 52 carriers indicated that approximately 88% of families at risk from Saguenay-Lac-St-Jean are fully or partially informative. Prenatal diagnosis was also achieved in an American family. Analysis of 24 HT1 patients from nine countries gave a frequency of approximately 52% for haplotype 6, suggesting a relatively high association, worldwide, of HT1 with this haplotype. PMID:7913582

Demers, S I; Phaneuf, D; Tanguay, R M



Y-chromosomal STR haplotypes in Pakistani populations  

Microsoft Academic Search

16 Y-specific STR loci have been analysed in 711 males from 12 populations in Pakistan. Individual loci showed between 4 and 10 alleles, and diversities ranged from 0.07 to 0.77. A total of 527 different haplotypes were found and the haplotype diversity ranged from 0.92 to 0.99 for the different populations. 446 haplotypes occurred in single individuals, and only 19

Aisha Mohyuddin; Qasim Ayub; Raheel Qamar; Tatiana Zerjal; Agnar Helgason; S. Qasim Mehdi; Chris Tyler-Smith



Racial or ethnic differences in allele frequencies of single-nucleotide polymorphisms in the methylenetetrahydrofolate reductase gene and their influence on response to methotrexate in rheumatoid arthritis  

PubMed Central

Background The anti-folate drug methotrexate (MTX) is commonly used to treat rheumatoid arthritis. Objective To determine the allele frequencies of five common coding single-nucleotide p