Science.gov

Sample records for 124-plex snp typing

  1. SNP typing reveals similarity in Mycobacterium tuberculosis genetic diversity between Portugal and Northeast Brazil.

    PubMed

    Lopes, Joao S; Marques, Isabel; Soares, Patricia; Nebenzahl-Guimaraes, Hanna; Costa, Joao; Miranda, Anabela; Duarte, Raquel; Alves, Adriana; Macedo, Rita; Duarte, Tonya A; Barbosa, Theolis; Oliveira, Martha; Nery, Joilda S; Boechat, Neio; Pereira, Susan M; Barreto, Mauricio L; Pereira-Leal, Jose; Gomes, Maria Gabriela Miranda; Penha-Goncalves, Carlos

    2013-08-01

    Human tuberculosis is an infectious disease caused by bacteria from the Mycobacterium tuberculosis complex (MTBC). Although spoligotyping and MIRU-VNTR are standard methodologies in MTBC genetic epidemiology, recent studies suggest that Single Nucleotide Polymorphisms (SNP) are advantageous in phylogenetics and strain group/lineages identification. In this work we use a set of 79 SNPs to characterize 1987 MTBC isolates from Portugal and 141 from Northeast Brazil. All Brazilian samples were further characterized using spoligotyping. Phylogenetic analysis against a reference set revealed that about 95% of the isolates in both populations are singly attributed to bacterial lineage 4. Within this lineage, the most frequent strain groups in both Portugal and Brazil are LAM, followed by Haarlem and X. Contrary to these groups, strain group T showed a very different prevalence between Portugal (10%) and Brazil (1.5%). Spoligotype identification shows about 10% of mismatches compared to the use of SNPs and leaves a little more than 1% of strains unidentifiable. The mismatches are observed in the most represented groups of our sample set (i.e., LAM and Haarlem) in almost the same proportion. Besides being more accurate in identifying strain groups/lineages, SNP-typing can also provide phylogenetic relationships between strain groups/lineages and, thus, indicate cases showing phylogenetic incongruence. Overall, the use of SNP-typing revealed striking similarities between MTBC populations from Portugal and Brazil.

  2. Evaluation of the Ion Torrent™ HID SNP 169-plex: A SNP typing assay developed for human identification by second generation sequencing.

    PubMed

    Børsting, Claus; Fordyce, Sarah L; Olofsson, Jill; Mogensen, Helle Smidt; Morling, Niels

    2014-09-01

    The Ion Torrent™ HID SNP assay amplified 136 autosomal SNPs and 33 Y-chromosome markers in one PCR and the markers were subsequently typed using the Ion PGM™ second generation sequencing platform. A total of 51 of the autosomal SNPs were selected from the SNPforID panel that is routinely used in our ISO 17025 accredited laboratory. Concordance between the Ion Torrent™ HID SNP assay and the SNPforID assay was tested by typing 44 Iraqis twice with the Ion Torrent™ HID SNP assay. The same samples were previously typed with the SNPforID assay and the Y-chromosome haplogroups of the individuals were previously identified by typing 45 Y-chromosome SNPs. Full concordance between the assays was obtained except for the SNP genotypes of two SNPs. These SNPs were among the eight SNPs (rs2399332, rs1029047, rs10776839, rs4530059, rs8037429, rs430046, rs1031825 and rs1523537) with inconsistent allele balance among samples. These SNPs should be excluded from the panel. The optimal amount of DNA in the PCR seemed to be ≥0.5 ng. Allele drop-outs were rare and only seen in experiments with <0.5 ng input DNA and with a coverage of <50 reads. No allele drop-in was observed. The great majority of the heterozygote allele balances were between 0.6 and 1.6, which is comparable to the heterozygote balances of STRs typed with PCR-CE. The number of reads with base calls that differed from the genotype call was typically less than five. This allowed detection of 1:100 mixtures with a high degree of certainty in experiments with a high total depth of coverage. In conclusion, the Ion PGM™ is a very promising platform for forensic genetics. However, the secondary sequence analysis software made wrong genotype calls from correctly sequenced alleles. These types of errors must be corrected before the platform can be used in case work. Furthermore, the sequence analysis software should be further developed and include quality settings for each SNP based on validation studies.
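
The figures quoted in this abstract (heterozygote allele balances mostly between 0.6 and 1.6, drop-outs seen only below 50 reads, and typically fewer than five discordant reads) can be read as a simple decision rule. The Python sketch below is only an illustration under that assumption. It is not the logic of the Ion Torrent analysis software, and the function name `call_genotype` and all of its parameters are hypothetical.

```python
def call_genotype(reads_a, reads_b, min_coverage=50,
                  max_noise_reads=5, balance_range=(0.6, 1.6)):
    """Call a bi-allelic SNP genotype from per-allele read counts.

    The default thresholds are taken from the numbers quoted in the
    abstract; using them as hard cut-offs is an assumption of this sketch.
    """
    total = reads_a + reads_b
    if total < min_coverage:
        return "no call"              # too few reads, drop-out risk
    if reads_b < max_noise_reads:
        return "AA"                   # allele B reads look like noise
    if reads_a < max_noise_reads:
        return "BB"                   # allele A reads look like noise
    balance = reads_a / reads_b       # heterozygote allele balance
    low, high = balance_range
    if low <= balance <= high:
        return "AB"
    return "inconclusive"             # imbalanced: possible mixture or artefact


for counts in [(120, 2), (60, 55), (20, 10), (200, 30)]:
    print(counts, call_genotype(*counts))
```

A per-SNP `balance_range` would be one way to express the quality settings "for each SNP" that the authors recommend, but that extension is not shown here.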

  3. A review on SNP and other types of molecular markers and their use in animal genetics

    PubMed Central

    Vignal, Alain; Milan, Denis; SanCristobal, Magali; Eggen, André

    2002-01-01

    During the last ten years, the use of molecular markers, revealing polymorphism at the DNA level, has been playing an increasing part in animal genetics studies. Amongst others, the microsatellite DNA marker has been the most widely used, due to its easy use by simple PCR, followed by a denaturing gel electrophoresis for allele size determination, and to the high degree of information provided by its large number of alleles per locus. Despite this, a new marker type, named SNP, for Single Nucleotide Polymorphism, is now on the scene and has gained high popularity, even though it is only a bi-allelic type of marker. In this review, we will discuss the reasons for this apparent step backwards, and the pertinence of the use of SNPs in animal genetics, in comparison with other marker types. PMID:12081799

  4. Allele frequencies for 40 autosomal SNP loci typed for US population samples using electrospray ionization mass spectrometry

    PubMed Central

    Kiesler, Kevin M.; Vallone, Peter M.

    2013-01-01

    Aim To type a set of 194 US African American, Caucasian, and Hispanic samples (self-declared ancestry) for 40 autosomal single nucleotide polymorphism (SNP) markers intended for human identification purposes. Methods Genotyping was performed on an automated commercial electrospray ionization time-of-flight mass spectrometer, the PLEX-ID. The 40 SNP markers were amplified in eight unique 5-plex PCRs, desalted, and resolved based on amplicon mass. For each of the three US sample groups statistical analyses were performed on the resulting genotypes. Results The assay was found to be robust and capable of genotyping the 40 SNP markers consuming approximately 4 nanograms of template per sample. The combined random match probabilities for the 40 SNP assay ranged from 10^-16 to 10^-21. Conclusion The multiplex PLEX-ID SNP-40 assay is the first fully automated genotyping method capable of typing a panel of 40 forensically relevant autosomal SNP markers on a mass spectrometry platform. The data produced provided the first allele frequency estimates for these 40 SNPs in a National Institute of Standards and Technology US population sample set. No population bias was detected although one locus deviated from its expected level of heterozygosity. PMID:23771752
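
To give a feel for how a combined random match probability of this magnitude arises, the Python sketch below multiplies per-locus match probabilities across 40 bi-allelic SNPs. It assumes Hardy-Weinberg proportions and independent loci, and it uses a hypothetical allele frequency of 0.5 at every locus; none of these values come from the study's data, and the function names are illustrative.

```python
import math


def locus_match_probability(p):
    """Probability that two unrelated people share a genotype at one
    bi-allelic SNP with allele frequency p, assuming Hardy-Weinberg
    proportions (sum of squared genotype frequencies)."""
    q = 1.0 - p
    genotype_freqs = (p * p, 2.0 * p * q, q * q)
    return sum(f * f for f in genotype_freqs)


def combined_match_probability(allele_freqs):
    """Product of per-locus match probabilities, assuming the loci
    are independent."""
    return math.prod(locus_match_probability(p) for p in allele_freqs)


# Hypothetical panel: 40 SNPs, each with allele frequency 0.5.
hypothetical_freqs = [0.5] * 40
rmp = combined_match_probability(hypothetical_freqs)
print(f"per-locus: {locus_match_probability(0.5):.3f}, combined: {rmp:.2e}")
```

With these idealized frequencies each locus contributes 0.375, and the product over 40 loci is on the order of 10^-17, which sits inside the range reported in the abstract.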

  5. SNP-VISTA

    SciTech Connect

    Shah, Nameeta; Teplitsky, Michael; Minovitsky, Simon; Dubchak, Inna

    2005-11-07

    SNP-VISTA aids in analyses of the following types of data: A. Large-scale re-sequence data of disease-related genes for discovery of associated and/or causative alleles (GeneSNP-VISTA). B. Massive amounts of ecogenomics data for studying homologous recombination in microbial populations (EcoSNP-VISTA). The main features and capabilities of SNP-VISTA are: 1) Mapping of SNPs to gene structure; 2) classification of SNPs, based on their location in the gene, frequency of occurrence in samples and allele composition; 3) clustering, based on user-defined subsets of SNPs, highlighting haplotypes as well as recombinant sequences; 4) integration of protein conservation visualization; and 5) display of automatically calculated recombination points that are user-editable. The main strength of SNP-VISTA is its graphical interface and use of visual representations, which support interactive exploration and hence better understanding of large-scale SNPs data.

  6. Two New Rapid SNP-Typing Methods for Classifying Mycobacterium tuberculosis Complex into the Main Phylogenetic Lineages

    PubMed Central

    Stucki, David; Malla, Bijaya; Hostettler, Simon; Huna, Thembela; Feldmann, Julia; Yeboah-Manu, Dorothy; Borrell, Sonia; Fenner, Lukas; Comas, Iñaki; Coscollà, Mireia; Gagneux, Sebastien

    2012-01-01

    There is increasing evidence that strain variation in Mycobacterium tuberculosis complex (MTBC) might influence the outcome of tuberculosis infection and disease. To assess genotype-phenotype associations, phylogenetically robust molecular markers and appropriate genotyping tools are required. Most current genotyping methods for MTBC are based on mobile or repetitive DNA elements. Because these elements are prone to convergent evolution, the corresponding genotyping techniques are suboptimal for phylogenetic studies and strain classification. By contrast, single nucleotide polymorphisms (SNP) are ideal markers for classifying MTBC into phylogenetic lineages, as they exhibit very low degrees of homoplasy. In this study, we developed two complementary SNP-based genotyping methods to classify strains into the six main human-associated lineages of MTBC, the “Beijing” sublineage, and the clade comprising Mycobacterium bovis and Mycobacterium caprae. Phylogenetically informative SNPs were obtained from 22 MTBC whole-genome sequences. The first assay, referred to as MOL-PCR, is a ligation-dependent PCR with signal detection by fluorescent microspheres and a Luminex flow cytometer, which simultaneously interrogates eight SNPs. The second assay is based on six individual TaqMan real-time PCR assays for singleplex SNP-typing. We compared MOL-PCR and TaqMan results in two panels of clinical MTBC isolates. Both methods agreed fully when assigning 36 well-characterized strains into the main phylogenetic lineages. The sensitivity in allele-calling was 98.6% and 98.8% for MOL-PCR and TaqMan, respectively. Typing of an additional panel of 78 unknown clinical isolates revealed 99.2% and 100% sensitivity in allele-calling, respectively, and 100% agreement in lineage assignment between both methods. While MOL-PCR and TaqMan are both highly sensitive and specific, MOL-PCR is ideal for classification of isolates with no previous information, whereas TaqMan is faster for

  7. SNP-VISTA

    2005-11-07

    SNP-VISTA aids in analyses of the following types of data: A. Large-scale re-sequence data of disease-related genes for discovery of associated and/or causative alleles (GeneSNP-VISTA). B. Massive amounts of ecogenomics data for studying homologous recombination in microbial populations (EcoSNP-VISTA). The main features and capabilities of SNP-VISTA are: 1) Mapping of SNPs to gene structure; 2) classification of SNPs, based on their location in the gene, frequency of occurrence in samples and allele composition; 3) clustering, based on user-defined subsets of SNPs, highlighting haplotypes as well as recombinant sequences; 4) integration of protein conservation visualization; and 5) display of automatically calculated recombination points that are user-editable. The main strength of SNP-VISTA is its graphical interface and use of visual representations, which support interactive exploration and hence better understanding of large-scale SNPs data.

  8. Association of rs5888 SNP in the scavenger receptor class B type 1 gene and serum lipid levels

    PubMed Central

    2012-01-01

    Background Bai Ku Yao is a special subgroup of the Yao minority in China. The present study was undertaken to detect the association of rs5888 single nucleotide polymorphism (SNP) in the scavenger receptor class B type 1 (SCARB1) gene and several environmental factors with serum lipid levels in the Guangxi Bai Ku Yao and Han populations. Methods A total of 598 subjects of Bai Ku Yao and 585 subjects of Han Chinese were randomly selected from our stratified randomized cluster samples. Genotypes of the SCARB1 rs5888 SNP were determined by polymerase chain reaction and restriction fragment length polymorphism combined with gel electrophoresis, and then confirmed by direct sequencing. Results The levels of total cholesterol (TC), high-density lipoprotein cholesterol (HDL-C), low-density lipoprotein cholesterol (LDL-C), and apolipoprotein (Apo) AI were lower but ApoB was higher in Bai Ku Yao than in Han (P < 0.05-0.001). The frequencies of C and T alleles were 78.3% and 21.7% in Bai Ku Yao, and 73.7% and 26.3% in Han (P < 0.01), respectively. The frequencies of CC, CT and TT genotypes were 60.0%, 36.6% and 3.4% in Bai Ku Yao, and 54.2%, 39.0% and 6.8% in Han (P < 0.01), respectively. The subjects with TT genotype in both ethnic groups had lower HDL-C and ApoAI levels than the subjects with CC or CT genotype (P < 0.05 for all). Subgroup analyses showed that the subjects with TT genotype in Bai Ku Yao had lower HDL-C and ApoAI levels in males than the subjects with CC or CT genotype (P < 0.05 for all), and the T allele carriers had higher TC, LDL-C and ApoB levels in females than the T allele noncarriers (P < 0.05 for all). The participants with TT genotype in Han also tended to have lower HDL-C and ApoAI levels in males than the participants with CC or CT genotype, but the difference did not reach statistical significance (P = 0.063 and P = 0.086, respectively). The association of serum HDL-C and ApoAI levels and genotypes was confirmed by
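
The allele frequencies quoted in this abstract follow directly from the genotype frequencies: each CC individual carries two C alleles and each CT individual carries one. The short Python sketch below reproduces that arithmetic from the percentages given in the abstract; the function name `allele_frequencies` is illustrative and not taken from the paper.

```python
def allele_frequencies(cc, ct, tt):
    """Return (freq_C, freq_T) from genotype counts or percentages.

    Homozygotes contribute two copies of their allele and
    heterozygotes contribute one copy of each.
    """
    total = cc + ct + tt
    freq_c = (cc + ct / 2.0) / total
    return freq_c, 1.0 - freq_c


# Genotype percentages (CC, CT, TT) as quoted in the abstract.
bai_ku_yao = allele_frequencies(60.0, 36.6, 3.4)
han = allele_frequencies(54.2, 39.0, 6.8)

print(f"Bai Ku Yao: C = {bai_ku_yao[0]:.3f}, T = {bai_ku_yao[1]:.3f}")
print(f"Han:        C = {han[0]:.3f}, T = {han[1]:.3f}")
```

The results match the reported values: C = 78.3% and T = 21.7% in Bai Ku Yao, and C = 73.7% and T = 26.3% in Han.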

  9. Eight New Genomes and Synthetic Controls Increase the Accessibility of Rapid Melt-MAMA SNP Typing of Coxiella burnetii

    PubMed Central

    Byström, Mona; Forsman, Mats; Frangoulidis, Dimitrios; Janse, Ingmar; Larsson, Pär; Lindgren, Petter; Öhrman, Caroline; van Rotterdam, Bart; Sjödin, Andreas; Myrtennäs, Kerstin

    2014-01-01

    The case rate of Q fever in Europe has increased dramatically in recent years, mainly because of an epidemic in the Netherlands in 2009. Consequently, there is a need for more extensive genetic characterization of the disease agent Coxiella burnetii in order to better understand the epidemiology and spread of this disease. Genome reference data are essential for this purpose, but only thirteen genome sequences are currently available. Current methods for typing C. burnetii are criticized for having problems in comparing results across laboratories, require the use of genomic control DNA, and/or rely on markers in highly variable regions. We developed in this work a method for single nucleotide polymorphism (SNP) typing of C. burnetii isolates and tissue samples based on new assays targeting ten phylogenetically stable synonymous canonical SNPs (canSNPs). These canSNPs represent previously known phylogenetic branches and were here identified from sequence comparisons of twenty-one C. burnetii genomes, eight of which were sequenced in this work. Importantly, synthetic control templates were developed, to make the method useful to laboratories lacking genomic control DNA. An analysis of twenty-one C. burnetii genomes confirmed that the species exhibits high sequence identity. Most of its SNPs (7,493/7,559 shared by >1 genome) follow a clonal inheritance pattern and are therefore stable phylogenetic typing markers. The assays were validated using twenty-six genetically diverse C. burnetii isolates and three tissue samples from small ruminants infected during the epidemic in the Netherlands. Each sample was assigned to a clade. Synthetic controls (vector and PCR amplified) gave identical results compared to the corresponding genomic controls and are viable alternatives to genomic DNA. The results from the described method indicate that it could be useful for cheap and rapid disease source tracking at non-specialized laboratories, which requires accurate genotyping

  10. Eight new genomes and synthetic controls increase the accessibility of rapid melt-MAMA SNP typing of Coxiella burnetii.

    PubMed

    Karlsson, Edvin; Macellaro, Anna; Byström, Mona; Forsman, Mats; Frangoulidis, Dimitrios; Janse, Ingmar; Larsson, Pär; Lindgren, Petter; Öhrman, Caroline; van Rotterdam, Bart; Sjödin, Andreas; Myrtennäs, Kerstin

    2014-01-01

    The case rate of Q fever in Europe has increased dramatically in recent years, mainly because of an epidemic in the Netherlands in 2009. Consequently, there is a need for more extensive genetic characterization of the disease agent Coxiella burnetii in order to better understand the epidemiology and spread of this disease. Genome reference data are essential for this purpose, but only thirteen genome sequences are currently available. Current methods for typing C. burnetii are criticized for having problems in comparing results across laboratories, require the use of genomic control DNA, and/or rely on markers in highly variable regions. We developed in this work a method for single nucleotide polymorphism (SNP) typing of C. burnetii isolates and tissue samples based on new assays targeting ten phylogenetically stable synonymous canonical SNPs (canSNPs). These canSNPs represent previously known phylogenetic branches and were here identified from sequence comparisons of twenty-one C. burnetii genomes, eight of which were sequenced in this work. Importantly, synthetic control templates were developed, to make the method useful to laboratories lacking genomic control DNA. An analysis of twenty-one C. burnetii genomes confirmed that the species exhibits high sequence identity. Most of its SNPs (7,493/7,559 shared by >1 genome) follow a clonal inheritance pattern and are therefore stable phylogenetic typing markers. The assays were validated using twenty-six genetically diverse C. burnetii isolates and three tissue samples from small ruminants infected during the epidemic in the Netherlands. Each sample was assigned to a clade. Synthetic controls (vector and PCR amplified) gave identical results compared to the corresponding genomic controls and are viable alternatives to genomic DNA. The results from the described method indicate that it could be useful for cheap and rapid disease source tracking at non-specialized laboratories, which requires accurate genotyping

  11. Comparative analysis of type 2 diabetes-associated SNP alleles identifies allele-specific DNA-binding proteins for the KCNQ1 locus.

    PubMed

    Hiramoto, Masaki; Udagawa, Haruhide; Watanabe, Atsushi; Miyazawa, Keisuke; Ishibashi, Naoko; Kawaguchi, Miho; Uebanso, Takashi; Nishimura, Wataru; Nammo, Takao; Yasuda, Kazuki

    2015-07-01

    Although recent genome-wide association studies (GWAS) have been extremely successful, it remains a big challenge to functionally annotate disease-associated single nucleotide polymorphisms (SNPs), as the majority of these SNPs are located in non-coding regions of the genome. In this study, we described a novel strategy for identifying the proteins that bind to the SNP-containing locus in an allele-specific manner and successfully applied this method to SNPs in the type 2 diabetes mellitus susceptibility gene, potassium voltage-gated channel, KQT-like subfamily Q, member 1 (KCNQ1). DNA fragments encompassing SNPs, and risk or non-risk alleles were immobilized onto the novel nanobeads and DNA-binding proteins were purified from the nuclear extracts of pancreatic β cells using these DNA-immobilized nanobeads. Comparative analysis of the allele-specific DNA-binding proteins indicated that the affinities of several proteins for the examined SNPs differed between the alleles. Nuclear transcription factor Y (NF-Y) specifically bound the non-risk allele of the SNP rs2074196 region and stimulated the transcriptional activity of an artificial promoter containing SNP rs2074196 in an allele-specific manner. These results suggest that SNP rs2074196 modulates the affinity of the locus for NF-Y and possibly induces subsequent changes in gene expression. The findings of this study indicate that our comparative method using novel nanobeads is effective for the identification of allele-specific DNA-binding proteins, which may provide important clues for the functional impact of disease-associated non-coding SNPs.

  12. SNP-RFLPing 2: an updated and integrated PCR-RFLP tool for SNP genotyping

    PubMed Central

    2010-01-01

    Background PCR-restriction fragment length polymorphism (RFLP) assay is a cost-effective method for SNP genotyping and mutation detection, but the manual mining for restriction enzyme sites is challenging and cumbersome. Three years after we constructed SNP-RFLPing, a freely accessible database and analysis tool for restriction enzyme mining of SNPs, significant improvements over the 2006 version have been made and incorporated into the latest version, SNP-RFLPing 2. Results The primary aim of SNP-RFLPing 2 is to provide comprehensive PCR-RFLP information about SNPs through multiple functions, such as SNP retrieval for multiple species, different polymorphism types (bi-allelic, tri-allelic, tetra-allelic or indels), gene-centric searching, HapMap tagSNPs, gene ontology-based searching, miRNAs, and SNP500Cancer. The RFLP restriction enzymes and the corresponding PCR primers for the natural and mutagenic types of each SNP are simultaneously analyzed. All the RFLP restriction enzyme prices are also provided to aid selection. Furthermore, the previously encountered updating problems for most SNP-related databases are resolved by an on-line retrieval system. Conclusions The user interfaces for functional SNP analyses have been substantially improved and integrated. SNP-RFLPing 2 offers a new and user-friendly interface for RFLP genotyping that can be used in association studies and is freely available at http://bio.kuas.edu.tw/snp-rflping2. PMID:20377871
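
The core idea behind restriction enzyme mining for a SNP can be sketched in a few lines: an enzyme is useful for PCR-RFLP genotyping when its recognition site is present with one allele and absent with the other. The Python example below is a minimal illustration of that idea only, not the SNP-RFLPing 2 implementation. The flanking sequences are made up, and the function names are hypothetical; the three recognition sites (EcoRI GAATTC, BamHI GGATCC, HindIII AAGCTT) are the standard ones for those enzymes.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def has_site(seq, site):
    """True if the recognition site occurs on either strand."""
    return site in seq or reverse_complement(site) in seq


def distinguishing_enzymes(left_flank, alleles, right_flank, enzymes):
    """Return enzymes whose site is present for some alleles but not others."""
    result = {}
    for name, site in enzymes.items():
        cuts = {a: has_site(left_flank + a + right_flank, site) for a in alleles}
        if len(set(cuts.values())) > 1:   # site differs between alleles
            result[name] = cuts
    return result


enzymes = {"EcoRI": "GAATTC", "BamHI": "GGATCC", "HindIII": "AAGCTT"}

# Hypothetical T/C SNP: the T allele completes an EcoRI site (GAAT-T-C).
hits = distinguishing_enzymes("ACGTGAAT", ["T", "C"], "CAGGT", enzymes)
print(hits)
```

Here only EcoRI separates the two alleles, so digesting the PCR product with it would cut the T allele and leave the C allele intact. A real tool would also have to handle degenerate recognition sequences, cut positions, and fragment sizes, which this sketch ignores.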

  13. Association of Type 2 Diabetes Mellitus related SNP genotypes with altered serum adipokine levels and metabolic syndrome phenotypes

    PubMed Central

    Al-Daghri, Nasser M; Al-Attas, Omar S; Krishnaswamy, Soundararajan; Mohammed, Abdul Khader; Alenad, Amal M; Chrousos, George P; Alokail, Majed S

    2015-01-01

    The pathogenesis of T2DM involves secretion of several pro-inflammatory molecules by adipocytes, which are dramatically increased in both number and size, and by the associated macrophages of adipose tissue. Since T2DM is usually preceded by obesity and chronic systemic inflammation, the objective of this study was to explore any association between genetic variants of 36 previously established T2DM-associated SNPs and altered serum adipocytokine levels and metabolic syndrome phenotypes. The study consisted of 566 subjects (284 males and 282 females), of whom 147 were T2DM patients and 419 were healthy controls. Study subjects were genotyped for 36 T2DM-linked single nucleotide polymorphisms (SNPs) using the KASPar SNP Genotyping System and grouped into different genotypes for each SNP. Various anthropometric and biochemical parameters were measured following standard procedures. The mean values of serum levels of individual adipocytokines and the presence/absence of metabolic syndrome phenotypes corresponding to various genotypes were compared by determining the odds ratios. Genotypic variants of five and seven of the 36 T2DM-related SNPs were significantly associated with altered serum levels of adiponectin and aPAI, respectively. Six variants of the 36 SNPs were associated with metabolic syndrome manifestations. This study identified positive associations between genotypic variants of five and seven of the 36 T2DM-related SNPs and altered serum levels of adiponectin and aPAI, respectively. Six of 36 SNPs were also associated with metabolic syndrome in the studied population. The relation between specific SNPs and individual phenotypic traits may be useful in explaining the causal mechanisms of the hereditary component of T2DM. PMID:26064370

  14. SNP-VISTA: An interactive SNP visualization tool

    PubMed Central

    Shah, Nameeta; Teplitsky, Michael V; Minovitsky, Simon; Pennacchio, Len A; Hugenholtz, Philip; Hamann, Bernd; Dubchak, Inna L

    2005-01-01

    Background Recent advances in sequencing technologies promise to provide a better understanding of the genetics of human disease as well as the evolution of microbial populations. Single Nucleotide Polymorphisms (SNPs) are established genetic markers that aid in the identification of loci affecting quantitative traits and/or disease in a wide variety of eukaryotic species. With today's technological capabilities, it has become possible to re-sequence a large set of appropriate candidate genes in individuals with a given disease in an attempt to identify causative mutations. In addition, SNPs have been used extensively in efforts to study the evolution of microbial populations, and the recent application of random shotgun sequencing to environmental samples enables more extensive SNP analysis of co-occurring and co-evolving microbial populations. The program is available at [1]. Results We have developed and present two modifications of an interactive visualization tool, SNP-VISTA, to aid in the analyses of the following types of data: A. Large-scale re-sequence data of disease-related genes for discovery of associated and/or causative alleles (GeneSNP-VISTA). B. Massive amounts of ecogenomics data for studying homologous recombination in microbial populations (EcoSNP-VISTA). The main features and capabilities of SNP-VISTA are: 1) mapping of SNPs to gene structure; 2) classification of SNPs, based on their location in the gene, frequency of occurrence in samples and allele composition; 3) clustering, based on user-defined subsets of SNPs, highlighting haplotypes as well as recombinant sequences; 4) integration of protein evolutionary conservation visualization; and 5) display of automatically calculated recombination points that are user-editable. Conclusion The main strength of SNP-VISTA is its graphical interface and use of visual representations, which support interactive exploration and hence better understanding of large-scale SNP data by the user.

  15. Electrochemical detection of type 2 diabetes mellitus-related SNP via DNA-mediated growth of silver nanoparticles on single walled carbon nanotubes.

    PubMed

    Tao, Jia; Zhao, Peng; Zheng, Jing; Wu, Cuichen; Shi, Muling; Li, Jishan; Li, Yinhui; Yang, Ronghua

    2015-11-01

    Herein, we proposed a new electrochemical sensing strategy for T2DM-related SNP detection via DNA-mediated growth of AgNPs on a SWCNT-modified electrode. Coupled with RNase HII enzyme assisted amplification, this approach could realize T2DM-related SNP assay and be applied in crude extracts of carcinoma pancreatic β-cell lines.

  16. Electrochemical detection of type 2 diabetes mellitus-related SNP via DNA-mediated growth of silver nanoparticles on single walled carbon nanotubes.

    PubMed

    Tao, Jia; Zhao, Peng; Zheng, Jing; Wu, Cuichen; Shi, Muling; Li, Jishan; Li, Yinhui; Yang, Ronghua

    2015-11-01

    Herein, we proposed a new electrochemical sensing strategy for T2DM-related SNP detection via DNA-mediated growth of AgNPs on a SWCNT-modified electrode. Coupled with RNase HII enzyme assisted amplification, this approach could realize T2DM-related SNP assay and be applied in crude extracts of carcinoma pancreatic β-cell lines. PMID:26365891

  17. Genome-wide detection of CNVs in Chinese indigenous sheep with different types of tails using ovine high-density 600K SNP arrays.

    PubMed

    Zhu, Caiye; Fan, Hongying; Yuan, Zehu; Hu, Shijin; Ma, Xiaomeng; Xuan, Junli; Wang, Hongwei; Zhang, Li; Wei, Caihong; Zhang, Qin; Zhao, Fuping; Du, Lixin

    2016-01-01

    Chinese indigenous sheep can be classified into three types based on tail morphology: fat-tailed, fat-rumped, and thin-tailed sheep, of which the typical breeds are large-tailed Han sheep, Altay sheep, and Tibetan sheep, respectively. To unravel the genetic mechanisms underlying the phenotypic differences among Chinese indigenous sheep with tails of three different types, we used ovine high-density 600K SNP arrays to detect genome-wide copy number variation (CNV). In large-tailed Han sheep, Altay sheep, and Tibetan sheep, 371, 301, and 66 CNV regions (CNVRs) with lengths of 71.35 Mb, 51.65 Mb, and 10.56 Mb, respectively, were identified on autosomal chromosomes. Ten CNVRs were randomly chosen for confirmation, of which eight were successfully validated. The detected CNVRs harboured 3130 genes, including genes associated with fat deposition, such as PPARA, RXRA, KLF11, ADD1, FASN, PPP1CA, PDGFA, and PEX6. Moreover, multilevel bioinformatics analyses of the detected candidate genes were significantly enriched for involvement in fat deposition, GTPase regulator, and peptide receptor activities. This is the first high-resolution sheep CNV map for Chinese indigenous sheep breeds with three types of tails. Our results provide valuable information that will support investigations of genomic structural variation underlying traits of interest in sheep. PMID:27282145

  18. Genome-wide detection of CNVs in Chinese indigenous sheep with different types of tails using ovine high-density 600K SNP arrays

    PubMed Central

    Zhu, Caiye; Fan, Hongying; Yuan, Zehu; Hu, Shijin; Ma, Xiaomeng; Xuan, Junli; Wang, Hongwei; Zhang, Li; Wei, Caihong; Zhang, Qin; Zhao, Fuping; Du, Lixin

    2016-01-01

    Chinese indigenous sheep can be classified into three types based on tail morphology: fat-tailed, fat-rumped, and thin-tailed sheep, of which the typical breeds are large-tailed Han sheep, Altay sheep, and Tibetan sheep, respectively. To unravel the genetic mechanisms underlying the phenotypic differences among Chinese indigenous sheep with tails of three different types, we used ovine high-density 600K SNP arrays to detect genome-wide copy number variation (CNV). In large-tailed Han sheep, Altay sheep, and Tibetan sheep, 371, 301, and 66 CNV regions (CNVRs) with lengths of 71.35 Mb, 51.65 Mb, and 10.56 Mb, respectively, were identified on autosomal chromosomes. Ten CNVRs were randomly chosen for confirmation, of which eight were successfully validated. The detected CNVRs harboured 3130 genes, including genes associated with fat deposition, such as PPARA, RXRA, KLF11, ADD1, FASN, PPP1CA, PDGFA, and PEX6. Moreover, multilevel bioinformatics analyses of the detected candidate genes were significantly enriched for involvement in fat deposition, GTPase regulator, and peptide receptor activities. This is the first high-resolution sheep CNV map for Chinese indigenous sheep breeds with three types of tails. Our results provide valuable information that will support investigations of genomic structural variation underlying traits of interest in sheep. PMID:27282145

  19. Preferential access to genetic information from endogenous hominin ancient DNA and accurate quantitative SNP-typing via SPEX

    PubMed Central

    Brotherton, Paul; Sanchez, Juan J.; Cooper, Alan; Endicott, Phillip

    2010-01-01

    The analysis of targeted genetic loci from ancient, forensic and clinical samples is usually built upon polymerase chain reaction (PCR)-generated sequence data. However, many studies have shown that PCR amplification from poor-quality DNA templates can create sequence artefacts at significant levels. With hominin (human and other hominid) samples, the pervasive presence of highly PCR-amplifiable human DNA contaminants in the vast majority of samples can lead to the creation of recombinant hybrids and other non-authentic artefacts. The resulting PCR-generated sequences can then be difficult, if not impossible, to authenticate. In contrast, single primer extension (SPEX)-based approaches can genotype single nucleotide polymorphisms from ancient fragments of DNA as accurately as modern DNA. A single SPEX-type assay can amplify just one of the duplex DNA strands at target loci and generate a multi-fold depth-of-coverage, with non-authentic recombinant hybrids reduced to undetectable levels. Crucially, SPEX-type approaches can preferentially access genetic information from damaged and degraded endogenous ancient DNA templates over modern human DNA contaminants. The development of SPEX-type assays offers the potential for highly accurate, quantitative genotyping from ancient hominin samples. PMID:19864251

  20. A simple and accurate SNP scoring strategy based on typeIIS restriction endonuclease cleavage and matrix-assisted laser desorption/ionization mass spectrometry

    PubMed Central

    Hong, Sun Pyo; Ji, Seung Il; Rhee, Hwanseok; Shin, Soo Kyeong; Hwang, Sun Young; Lee, Seung Hwan; Lee, Soong Deok; Oh, Heung-Bum; Yoo, Wangdon; Kim, Soo-Ok

    2008-01-01

    Background: We describe the development of a novel matrix-assisted laser desorption ionization time-of-flight (MALDI-TOF)-based single nucleotide polymorphism (SNP) scoring strategy, termed Restriction Fragment Mass Polymorphism (RFMP), that is suitable for genotyping variations in a simple, accurate, and high-throughput manner. The assay is based on polymerase chain reaction (PCR) amplification and mass measurement of oligonucleotides containing a polymorphic base, into which a typeIIS restriction endonuclease recognition site was introduced by PCR amplification. Enzymatic cleavage of the products leads to excision of oligonucleotide fragments representing base variation of the polymorphic site, whose masses were determined by MALDI-TOF MS. Results: The assay represents an improvement over previous methods because it relies on the direct mass determination of PCR products rather than on an indirect analysis, where a base-extended or fluorescent reporter tag is interpreted. The RFMP strategy is simple and straightforward, requiring one restriction digestion reaction following target amplification in a single vessel. With this technology, genotypes are generated with a high call rate (99.6%) and high accuracy (99.8%) as determined by independent sequencing. Conclusion: The simplicity, accuracy and amenability to high-throughput screening analysis should make the RFMP assay suitable for large-scale genotype association studies as well as clinical genotyping in laboratories. PMID:18538037

  1. SNP ID-info: SNP ID searching and visualization platform.

    PubMed

    Yang, Cheng-Hong; Chuang, Li-Yeh; Cheng, Yu-Huei; Wen, Cheng-Hao; Chang, Phei-Lang; Chang, Hsueh-Wei

    2008-09-01

    Many association studies report relationships between single nucleotide polymorphisms (SNPs) and diseases or cancers without giving a SNP ID. Here, we developed the SNP ID-info freeware to provide SNP IDs from input genetic and physical information of genomes. The program provides an "SNP-ePCR" function to generate the full sequence from primer and template inputs. In "SNPosition," sequence from SNP-ePCR or direct input is matched to SNP IDs from SNP fasta sequences. The "SNP search" and "SNP fasta" functions accept cytogenetic band, contig position, and keyword inputs. Finally, the neighboring environment of the SNP IDs for the inputs is visualized in order of contig position and marked with SNP and flanking hits. The SNP identification problems inherent in NCBI SNP BLAST are also avoided. In conclusion, SNP ID-info provides a visualized SNP ID environment for multiple inputs and assists systematic SNP association studies. The server and user manual are available at http://bio.kuas.edu.tw/snpid-info.

  2. Peopling of the North Circumpolar Region--insights from Y chromosome STR and SNP typing of Greenlanders.

    PubMed

    Olofsson, Jill Katharina; Pereira, Vania; Børsting, Claus; Morling, Niels

    2015-01-01

    The human population in Greenland is characterized by migration events of Paleo- and Neo-Eskimos, as well as admixture with Europeans. In this study, the Y-chromosomal variation in male Greenlanders was investigated in detail by typing 73 Y-chromosomal single nucleotide polymorphisms (Y-SNPs) and 17 Y-chromosomal short tandem repeats (Y-STRs). Approximately 40% of the analyzed Greenlandic Y chromosomes were of European origin (I-M170, R1a-M513 and R1b-M343). Y chromosomes of European origin were mainly found in individuals from the west and south coasts of Greenland, which is in agreement with the historic records of the geographic placements of European settlements in Greenland. Two Inuit Y-chromosomal lineages, Q-M3 (xM19, M194, L663, SA01 and L766) and Q-NWT01 (xM265) were found in 23% and 31% of the male Greenlanders, respectively. The time to the most recent common ancestor (TMRCA) of the Q-M3 lineage of the Greenlanders was estimated to be between 4,400 and 10,900 years ago (y. a.) using two different methods. This is in agreement with the theory that the North Circumpolar Region was populated via a second expansion of humans in the North American continent. The TMRCA of the Q-NWT01 (xM265) lineage in Greenland was estimated to be between 7,000 and 14,300 y. a. using two different methods, which is older than the previously reported TMRCA of this lineage in other Inuit populations. Our results indicate that Inuit individuals carrying the Q-NWT01 (xM265) lineage may have their origin in the northeastern parts of North America and could be descendants of the Dorset culture. This in turn points to the possibility that the current Inuit population in Greenland is comprised of individuals of both Thule and Dorset descent.

  3. Validation of a single nucleotide polymorphism (SNP) typing assay with 49 SNPs for forensic genetic testing in a laboratory accredited according to the ISO 17025 standard.

    PubMed

    Børsting, Claus; Rockenbauer, Eszter; Morling, Niels

    2009-12-01

    A multiplex assay with 49 autosomal single nucleotide polymorphisms (SNPs) developed for human identification was validated for forensic genetic casework and accredited according to the ISO 17025 standard. The multiplex assay was based on the SNPforID 52plex SNP assay [J.J. Sanchez, C. Phillips, C. Børsting, K. Balogh, M. Bogus, M. Fondevila, C.D. Harrison, E. Musgrave-Brown, A. Salas, D. Syndercombe-Court, P.M. Schneider, A. Carracedo, N. Morling, A multiplex assay with 52 single nucleotide polymorphisms for human identification, Electrophoresis 27 (2006) 1713-1724], where 52 fragments were amplified in one PCR reaction. The SNPs were analysed by single base extension (SBE) and capillary electrophoresis. Twenty-three of the original SBE primers were altered to improve the overall robustness of the assay and to simplify the analysis of the SBE results. A total of 216 samples from 50 paternity cases and 33 twin cases were typed at least twice for the 49 SNPs. All electropherograms were analysed independently by two expert analysts prior to approval. Based on these results, detailed guidelines for analysis of the SBE products were developed. With these guidelines, the peak height ratio of a heterozygous allele call or the signal to noise ratio of a homozygous allele call is compared with previously obtained ratios. A laboratory protocol for analysis of SBE products was developed where allele calls with unusual ratios were highlighted to facilitate the analysis of difficult allele calls. The guidelines for allele calling proved to be highly efficient for the detection of DNA mixtures and contaminated DNA preparations. DNA from two individuals was mixed in seven different ratios ranging from 1:1 to 1:10; all mixtures were easily identified as mixtures. PMID:19948332

  4. Temple syndrome: A patient with maternal hetero-UPD14, mixed iso- and hetero-disomy detected by SNP microarray typing of patient-father duos.

    PubMed

    Shin, Eun-Hye; Cho, Eunhae; Lee, Cha Gon

    2016-08-01

    Temple syndrome (TS, MIM 616222) is an imprinting disorder involving genes within the imprinted region of chromosome 14q32. TS is a genetically complex disorder, which is associated with maternal uniparental disomy of chromosome 14 (UPD14), paternal deletions on chromosome 14, or loss of methylation at the intergenic differentially methylated region (IG-DMR). Here, we describe the case of a patient with maternal hetero-UPD14 arising from a mixed iso-/hetero-disomy mechanism, identified by single nucleotide polymorphism (SNP) array analysis in a patient-father duo study. The phenotype of our case shows similarities to Prader-Willi syndrome (PWS) during infancy and to Russell-Silver syndrome (RSS) during childhood. This SNP array appears to be an effective initial screening tool for patients with nonspecific clinical features suggestive of chromosomal disorders. PMID:26867509

  5. [SNP-19 genotypic variants of CAPN10 gene and its relation to diabetes mellitus type 2 in a population of Ciudad Juarez, Mexico].

    PubMed

    Loya Méndez, Yolanda; Reyes Leal, Gilberto; Sánchez González, Adriana; Portillo Reyes, Verónica; Reyes Ruvalcaba, David; Bojórquez Rangel, Guillermo

    2014-09-28

    Introduction: Type 2 diabetes mellitus (DM) is a common disease of multifactorial origin whose exact genetic basis is still unknown; several studies suggest that single nucleotide polymorphisms (SNPs) in the CAPN10 gene (locus 2q37.3) may participate in its development, including the insertion/deletion polymorphism SNP-19 (2R→3R). Objective: To determine the relationship between the SNP-19 polymorphism and the presence of type 2 DM in a population of Ciudad Juárez. Methods: A total of 107 individuals were selected: 43 type 2 diabetics (cases) and 64 non-diabetics with no first-degree family history of type 2 DM (controls). An anthropometric study and a biochemical profile of lipids, lipoproteins and serum glucose were performed. DNA was extracted from peripheral blood lymphocytes and amplified by polymerase chain reaction (PCR). Genotypes of the SNP-19 polymorphism of the CAPN10 gene were analyzed by electrophoresis in agarose gels. Genotype and allele frequencies were calculated and Hardy-Weinberg equilibrium tests were performed (GenAlEx 6.4). Results: The χ² test identified differences in genotypes between cases and controls, with a higher frequency of the homozygous 3R genotype of SNP-19 in the case group (0.418) than in the control group (0.265). The 2R/3R genotype was associated with elevated values of weight, body mass index, and waist and hip circumferences, but only in the diabetic group (P ≤ 0.05). Conclusion: The results of this study suggest that SNP-19 of the CAPN10 gene participates in the development of type 2 DM in the studied population.
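
    The Hardy-Weinberg equilibrium check mentioned in this abstract can be illustrated with a short sketch. It is a generic chi-square goodness-of-fit calculation for a bi-allelic locus, not the GenAlEx implementation; the function name is hypothetical, and the counts in the note below are made up rather than taken from the study.

```python
def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square statistic (1 degree of freedom) for Hardy-Weinberg proportions.

    n_aa, n_ab, n_bb are observed genotype counts at a bi-allelic locus
    (for example 2R/2R, 2R/3R and 3R/3R). Hypothetical helper for illustration.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("at least one genotype must be observed")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of the first allele
    q = 1.0 - p                      # frequency of the second allele
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_aa, n_ab, n_bb)
    # Classes with zero expectation (monomorphic locus) are skipped
    # to avoid division by zero.
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
```

    With made-up counts of 30, 40 and 30, the expected counts are 25, 50 and 25, so the statistic is 4.0, which exceeds the 3.84 critical value at the 5% level for one degree of freedom.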

  6. Multi objective SNP selection using pareto optimality.

    PubMed

    Gumus, Ergun; Gormez, Zeliha; Kursun, Olcay

    2013-04-01

    Biomarker discovery is a challenging task of bioinformatics especially when targeting high dimensional problems such as SNP (single nucleotide polymorphism) datasets. Various types of feature selection methods can be applied to accomplish this task. Typically, using features versus class labels of samples in the training dataset, these methods aim at selecting feature subsets with maximal classification accuracies. Although finding such class-discriminative features is crucial, selection of relevant SNPs for maximizing other properties that exist in the nature of population genetics such as the correlation between genetic diversity and geographical distance of ethnic groups can also be equally important. In this work, a methodology using a multi objective optimization technique called Pareto Optimal is utilized for selecting SNP subsets offering both high classification accuracy and correlation between genomic and geographical distances. In this method, discriminatory power of an SNP is determined using mutual information and its contribution to the genomic-geographical correlation is estimated using its loadings on principal components. Combining these objectives, the proposed method identifies SNP subsets that can better discriminate ethnic groups than those obtained with sole mutual information and yield higher correlation than those obtained with sole principal components on the Human Genome Diversity Project (HGDP) SNP dataset.
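
    The idea of Pareto optimality over two objectives can be sketched in a few lines. This is a generic non-dominated filter, not the authors' selection procedure; the function name is hypothetical, and the two scores per SNP are assumed to stand for something like a mutual-information score and a principal-component loading score, both to be maximized.

```python
def pareto_front(scores):
    """Return indices of non-dominated items when both objectives are maximized.

    scores is a list of (objective_1, objective_2) tuples, one per SNP.
    An item is dominated if another item is at least as good on both
    objectives and strictly better on at least one.
    """
    front = []
    for i, (a1, a2) in enumerate(scores):
        dominated = any(
            b1 >= a1 and b2 >= a2 and (b1 > a1 or b2 > a2)
            for j, (b1, b2) in enumerate(scores)
            if j != i
        )
        if not dominated:
            front.append(i)
    return front
```

    The quadratic scan is adequate for illustration; for very large SNP sets a sort-based sweep would be a more practical choice.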

  7. [Study of association of the SNP19 polymorphism of calpain 10 gene with type 2 diabetes in ethnic sub-groups of the Tunisian population: gene-environment interaction].

    PubMed

    Ouederni, T Baroudi; Sanchez-Corona, J; Skhiri, H Aounallah; Maiz, H Ben; Abid, H Kammoun; Benammar-Elgaaied, A

    2009-01-01

    Calpain 10 (CAPN10) is the first diabetes gene to be identified through a genome scan followed by positional cloning. It encodes a ubiquitously expressed cysteine protease implicated in the two fundamental pathophysiological aspects of T2DM: insulin resistance and insulin secretion. Many investigators, but not all, have subsequently found an association between calpain 10 polymorphisms and type 2 diabetes (T2DM) as well as insulin action and insulin secretion. The aim of this study was to determine whether there is an association between the SNP19 polymorphism in the CAPN10 gene and T2DM in two ethnic groups from Djerba Island. Overall, 162 patients with type 2 diabetes and 110 healthy volunteers with no family history of diabetes, who served as controls for genetic characterization, were included in the present study. They consisted of 159 women and 113 men. Their mean ± SD age was 56.47 ± 11.86 years. All subjects were genotyped for the SNP19 polymorphism in the CAPN10 gene by PCR to perform a case-control study. After adjusting for gender and age, we found an association with a high risk of T2DM in Djerba Island only in the Arab sub-group.

  9. SNPConvert: SNP Array Standardization and Integration in Livestock Species

    PubMed Central

    Nicolazzi, Ezequiel Luis; Marras, Gabriele; Stella, Alessandra

    2016-01-01

    One of the main advantages of single nucleotide polymorphism (SNP) array technology is providing genotype calls for a specific number of SNP markers at a relatively low cost. Since its first application in animal genetics, the number of available SNP arrays for each species has been constantly increasing. However, in contrast to what is observed in whole genome sequence data analysis, SNP array data does not have a common set of file formats or coding conventions for allele calling. Therefore, the standardization and integration of SNP array data from multiple sources have become an obstacle, especially for users with basic or no programming skills. Here, we describe the difficulties related to handling SNP array data, focusing on file formats, SNP allele coding, and mapping. We also present the SNPConvert suite, a multi-platform, open-source, and user-friendly set of tools to overcome these issues. This tool, which can be integrated with open-source and open-access tools already available, is a first step towards an integrated system to standardize and integrate any type of raw SNP array data. The tool is available at: https://github.com/nicolazzie/SNPConvert.git. PMID:27600083

  10. SNPConvert: SNP Array Standardization and Integration in Livestock Species.

    PubMed

    Nicolazzi, Ezequiel Luis; Marras, Gabriele; Stella, Alessandra

    2016-01-01

    One of the main advantages of single nucleotide polymorphism (SNP) array technology is providing genotype calls for a specific number of SNP markers at a relatively low cost. Since its first application in animal genetics, the number of available SNP arrays for each species has been constantly increasing. However, in contrast to what is observed in whole genome sequence data analysis, SNP array data does not have a common set of file formats or coding conventions for allele calling. Therefore, the standardization and integration of SNP array data from multiple sources have become an obstacle, especially for users with basic or no programming skills. Here, we describe the difficulties related to handling SNP array data, focusing on file formats, SNP allele coding, and mapping. We also present the SNPConvert suite, a multi-platform, open-source, and user-friendly set of tools to overcome these issues. This tool, which can be integrated with open-source and open-access tools already available, is a first step towards an integrated system to standardize and integrate any type of raw SNP array data. The tool is available at: https://github.com/nicolazzie/SNPConvert.git. PMID:27600083

  11. Exhaustive Genome-Wide Search for SNP-SNP Interactions Across 10 Human Diseases

    PubMed Central

    Murk, William; DeWan, Andrew T.

    2016-01-01

    The identification of statistical SNP-SNP interactions may help explain the genetic etiology of many human diseases, but exhaustive genome-wide searches for these interactions have been difficult, due to a lack of power in most datasets. We aimed to use data from the Resource for Genetic Epidemiology Research on Adult Health and Aging (GERA) study to search for SNP-SNP interactions associated with 10 common diseases. FastEpistasis and BOOST were used to evaluate all pairwise interactions among approximately N = 300,000 single nucleotide polymorphisms (SNPs) with minor allele frequency (MAF) ≥ 0.15, for the dichotomous outcomes of allergic rhinitis, asthma, cardiac disease, depression, dermatophytosis, type 2 diabetes, dyslipidemia, hemorrhoids, hypertensive disease, and osteoarthritis. A total of N = 45,171 subjects were included after quality control steps were applied. These data were divided into discovery and replication subsets; the discovery subset had > 80% power, under selected models, to detect genome-wide significant interactions (P < 10^-12). Interactions were also evaluated for enrichment in particular SNP features, including functionality, prior disease relevancy, and marginal effects. No interaction in any disease was significant in both the discovery and replication subsets. Enrichment analysis suggested that, for some outcomes, interactions involving SNPs with marginal effects were more likely to be nominally replicated, compared to interactions without marginal effects. If SNP-SNP interactions play a role in the etiology of the studied conditions, they likely have weak effect sizes, involve lower-frequency variants, and/or involve complex models of interaction that are not captured well by the methods that were utilized. PMID:27185397
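
    The scale of an exhaustive pairwise search can be made concrete with a small calculation. The sketch below counts the SNP pairs among roughly 300,000 SNPs and shows what a Bonferroni correction at a family-wise level of 0.05 would give. That the genome-wide significance threshold quoted in the abstract was derived this way is an assumption, and the function names are hypothetical.

```python
from math import comb


def pairwise_tests(n_snps):
    """Number of unordered SNP pairs, i.e. n choose 2."""
    return comb(n_snps, 2)


def bonferroni_threshold(n_snps, alpha=0.05):
    """Per-test significance level after Bonferroni correction over all pairs."""
    return alpha / pairwise_tests(n_snps)


n_pairs = pairwise_tests(300_000)           # 44,999,850,000 pairs
threshold = bonferroni_threshold(300_000)   # about 1.1e-12
```

    Roughly 4.5 × 10^10 tests per disease outcome lead to a per-test level near 10^-12, which is of the same order as the significance threshold stated in the abstract.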

  13. Exhaustive Genome-Wide Search for SNP-SNP Interactions Across 10 Human Diseases.

    PubMed

    Murk, William; DeWan, Andrew T

    2016-01-01

    The identification of statistical SNP-SNP interactions may help explain the genetic etiology of many human diseases, but exhaustive genome-wide searches for these interactions have been difficult, due to a lack of power in most datasets. We aimed to use data from the Resource for Genetic Epidemiology Research on Adult Health and Aging (GERA) study to search for SNP-SNP interactions associated with 10 common diseases. FastEpistasis and BOOST were used to evaluate all pairwise interactions among approximately N = 300,000 single nucleotide polymorphisms (SNPs) with minor allele frequency (MAF) ≥ 0.15, for the dichotomous outcomes of allergic rhinitis, asthma, cardiac disease, depression, dermatophytosis, type 2 diabetes, dyslipidemia, hemorrhoids, hypertensive disease, and osteoarthritis. A total of N = 45,171 subjects were included after quality control steps were applied. These data were divided into discovery and replication subsets; the discovery subset had > 80% power, under selected models, to detect genome-wide significant interactions (P < 10^-12). Interactions were also evaluated for enrichment in particular SNP features, including functionality, prior disease relevancy, and marginal effects. No interaction in any disease was significant in both the discovery and replication subsets. Enrichment analysis suggested that, for some outcomes, interactions involving SNPs with marginal effects were more likely to be nominally replicated, compared to interactions without marginal effects. If SNP-SNP interactions play a role in the etiology of the studied conditions, they likely have weak effect sizes, involve lower-frequency variants, and/or involve complex models of interaction that are not captured well by the methods that were utilized. PMID:27185397

  14. Evaluation of the iPLEX® Sample ID Plus Panel designed for the Sequenom MassARRAY® system. A SNP typing assay developed for human identification and sample tracking based on the SNPforID panel.

    PubMed

    Johansen, P; Andersen, J D; Børsting, C; Morling, N

    2013-09-01

    Sequenom launched the first commercial SNP typing kit for human identification, named the iPLEX® Sample ID Plus Panel. The kit amplifies 47 of the 52 SNPs in the SNPforID panel, amelogenin and two Y-chromosome SNPs in one multiplex PCR. The SNPs were analyzed by single base extension (SBE) and Matrix Assisted Laser Desorption/Ionization-Time of Flight Mass Spectrometry (MALDI-TOF MS). In this study, we evaluated the accuracy and sensitivity of the iPLEX® Sample ID Plus Panel by comparing the typing results of the iPLEX® Sample ID Plus Panel with those obtained with our ISO 17025 accredited SNPforID assay. The average call rate for duplicate typing of any one SNP in the panel was 90.0% when the mass spectra were analyzed automatically with the MassARRAY® TYPER 4.0 genotyping software in real time. Two reproducible inconsistencies were observed (error rate: 0.05%) at two different SNP loci. In addition, four inconsistencies were observed once. The optimal amount of template DNA in the PCR was ≥10 ng. There was a relatively high risk of allele and locus drop-outs when ≤1 ng template DNA was used. We developed an R script with a stringent set of "forensic analysis parameters" based on the peak height and the signal to noise data exported from the TYPER 4.0 software. With the forensic analysis parameters, all inconsistencies were eliminated in reactions with ≥10 ng DNA. However, the average call rate decreased to 69.9%. The iPLEX® Sample ID Plus Panel was tested on 10 degraded samples from forensic case-work. Two samples could not be typed, presumably because the samples contained PCR and SBE inhibitors. The average call rate was generally lower for degraded DNA samples and the number of inconsistencies higher than for pristine DNA. However, none of the inconsistencies were reproduced and the highest match probability for the degraded samples typed with the panel was 1.7E-9 using the stringent forensic analysis parameters. Although the relatively low

  15. A Bayesian Framework for SNP Identification

    SciTech Connect

    Webb-Robertson, Bobbie-Jo M.; Havre, Susan L.; Payne, Deborah A.

    2005-07-01

    Current proteomics techniques, such as mass spectrometry, focus on protein identification, usually ignoring most types of modifications beyond post-translational modifications, with the assumption that only a small number of peptides have to be matched to a protein for a positive identification. However, not all proteins are being identified with current techniques and improved methods to locate points of mutation are becoming a necessity. In the case when single-nucleotide polymorphisms (SNPs) are observed, brute force is the most common method to locate them, quickly becoming computationally unattractive as the size of the database associated with the model organism grows. We have developed a Bayesian model for SNPs, BSNP, incorporating evolutionary information at both the nucleotide and amino acid levels. Formulating SNPs as a Bayesian inference problem allows probabilities of interest to be easily obtained, for example the probability of a specific SNP or specific type of mutation over a gene or entire genome. Three SNP databases were observed in the evaluation of the BSNP model; the first SNP database is a disease specific gene in human, hemoglobin, the second is also a disease specific gene in human, p53, and the third is a more general SNP database for multiple genes in mouse. We validate that the BSNP model assigns higher posterior probabilities to the SNPs defined in all three separate databases than can be attributed to chance under specific evolutionary information, for example the amino acid model described by Majewski and Ott in conjunction with either the four-parameter nucleotide model by Bulmer or seven-parameter nucleotide model by Majewski and Ott.

  16. pfSNP: An integrated potentially functional SNP resource that facilitates hypotheses generation through knowledge syntheses.

    PubMed

    Wang, Jingbo; Ronaghi, Mostafa; Chong, Samuel S; Lee, Caroline G L

    2011-01-01

    Currently, >14,000,000 single nucleotide polymorphisms (SNPs) are reported. Identifying phenotype-affecting SNPs among these many SNPs poses significant challenges. Although several Web resources are available that can inform about the functionality of SNPs, these resources are mainly annotation databases and are not very comprehensive. In this article, we present a comprehensive, well-annotated, integrated pfSNP (potentially functional SNPs) Web resource (http://pfs.nus.edu.sg/), which aims to facilitate better hypothesis generation through knowledge syntheses mediated by better data integration and a user-friendly Web interface. pfSNP integrates >40 different algorithms/resources to interrogate >14,000,000 SNPs from the dbSNP database for SNPs of potential functional significance based on previously published reports, potential functionality inferred from genetic approaches, and potential functionality predicted from sequence motifs. Its query interface has a user-friendly "auto-complete, prompt-as-you-type" feature and is highly customizable, allowing different combinations of queries using Boolean logic. Additionally, to facilitate better understanding of the results and aid in hypothesis generation, gene/pathway-level information with text clouds highlighting enriched tissues/pathways, as well as detailed related information, is also provided on the results page. Hence, the pfSNP resource will be of great interest to scientists focusing on association studies as well as those interested in experimentally addressing the functionality of SNPs.

  17. SNP genotyping by heteroduplex analysis.

    PubMed

    Paniego, Norma; Fusari, Corina; Lia, Verónica; Puebla, Andrea

    2015-01-01

    Heteroduplex-based genotyping methods have proven to be technologically effective and economically efficient for low- to medium-range throughput single-nucleotide polymorphism (SNP) determination. In this chapter we describe two protocols that were successfully applied for SNP detection and haplotype analysis of candidate genes in association studies. The protocols involve (1) enzymatic mismatch cleavage with endonuclease CEL1 from celery, associated with fragment separation using capillary electrophoresis (CEL1 cleavage), and (2) differential retention of the homo/heteroduplex DNA molecules under partial denaturing conditions on ion pair reversed-phase liquid chromatography (dHPLC). Both methods are complementary since dHPLC is more versatile than CEL1 cleavage for identifying multiple SNP per target region, and the latter is easily optimized for sequences with fewer SNPs or small insertion/deletion polymorphisms. Besides, CEL1 cleavage is a powerful method to localize the position of the mutation when fragment resolution is done using capillary electrophoresis.

  18. The utility of high-resolution melting analysis of SNP nucleated PCR amplicons--an MLST based Staphylococcus aureus typing scheme.

    PubMed

    Lilliebridge, Rachael A; Tong, Steven Y C; Giffard, Philip M; Holt, Deborah C

    2011-01-01

    High resolution melting (HRM) analysis is gaining prominence as a method for discriminating DNA sequence variants. Its advantage is that it is performed in a real-time PCR device, and the PCR amplification and HRM analysis are closed tube, and effectively single step. We have developed an HRM-based method for Staphylococcus aureus genotyping. Eight single nucleotide polymorphisms (SNPs) were derived from the S. aureus multi-locus sequence typing (MLST) database on the basis of maximized Simpson's Index of Diversity. Only G↔A, G↔T, C↔A, C↔T SNPs were considered for inclusion, to facilitate allele discrimination by HRM. In silico experiments revealed that DNA fragments incorporating the SNPs give much higher resolving power than randomly selected fragments. It was shown that the predicted optimum fragment size for HRM analysis was 200 bp, and that other SNPs within the fragments contribute to the resolving power. Six DNA fragments ranging from 83 bp to 219 bp, incorporating the resolution optimized SNPs were designed. HRM analysis of these fragments using 94 diverse S. aureus isolates of known sequence type or clonal complex (CC) revealed that sequence variants are resolved largely in accordance with G+C content. A combination of experimental results and in silico prediction indicates that HRM analysis resolves S. aureus into 268 "melt types" (MelTs), and provides a Simpson's Index of Diversity of 0.978 with respect to MLST. There is a high concordance between HRM analysis and the MLST defined CCs. We have generated a Microsoft Excel key which facilitates data interpretation and translation between MelT and MLST data. The potential of this approach for genotyping other bacterial pathogens was investigated using a computerized approach to estimate the densities of SNPs with unlinked allelic states. The MLST databases for all species tested contained abundant unlinked SNPs, thus suggesting that high resolving power is not dependent upon large numbers of SNPs.
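
    Simpson's Index of Diversity, used in this abstract to quantify resolving power, can be computed with a short function. The sketch assumes the unbiased (Hunter-Gaston) form that is commonly used for typing schemes; the abstract does not state which form was applied, and the function name and example labels are hypothetical.

```python
from collections import Counter


def simpsons_index_of_diversity(type_labels):
    """Probability that two isolates drawn without replacement differ in type.

    type_labels is a sequence with one type label per isolate
    (for example a melt type or a sequence type).
    D = 1 - sum(n_j * (n_j - 1)) / (N * (N - 1))
    """
    n_total = len(type_labels)
    if n_total < 2:
        raise ValueError("need at least two isolates")
    counts = Counter(type_labels)
    same_type_pairs = sum(c * (c - 1) for c in counts.values())
    return 1.0 - same_type_pairs / (n_total * (n_total - 1))
```

    A value of 1.0 means every isolate has a distinct type and 0.0 means all isolates share one type, so a reported value such as 0.978 indicates that two randomly chosen isolates are very likely to be distinguished.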

  19. RASSF1A and the rs2073498 Cancer Associated SNP

    PubMed Central

    Donninger, Howard; Barnoud, Thibaut; Nelson, Nick; Kassler, Suzanna; Clark, Jennifer; Cummins, Timothy D.; Powell, David W.; Nyante, Sarah; Millikan, Robert C.; Clark, Geoffrey J.

    2011-01-01

    RASSF1A is one of the most frequently inactivated tumor suppressors yet identified in human cancer. It is pro-apoptotic and appears to function as a scaffolding protein that interacts with a variety of other tumor suppressors to modulate their function. It can also complex with the Ras oncoprotein and may serve to integrate pro-growth and pro-death signaling pathways. A SNP has been identified that is present in approximately 29% of European populations [rs2073498, A(133)S]. Several studies have now presented evidence that this SNP is associated with an enhanced risk of developing breast cancer. We have used a proteomics based approach to identify multiple differences in the pattern of protein/protein interactions mediated by the wild type compared to the SNP variant protein. We have also identified a significant difference in biological activity between wild type and SNP variant protein. However, we have found only a very modest association of the SNP with breast cancer predisposition. PMID:22649770

  20. Linkage mapping bovine EST-based SNP

    PubMed Central

    Snelling, Warren M; Casas, Eduardo; Stone, Roger T; Keele, John W; Harhay, Gregory P; Bennett, Gary L; Smith, Timothy PL

    2005-01-01

    Background Existing linkage maps of the bovine genome primarily contain anonymous microsatellite markers. These maps have proved valuable for mapping quantitative trait loci (QTL) to broad regions of the genome, but more closely spaced markers are needed to fine-map QTL, and markers associated with genes and annotated sequence are needed to identify genes and sequence variation that may explain QTL. Results Bovine expressed sequence tag (EST) and bacterial artificial chromosome (BAC) sequence data were used to develop 918 single nucleotide polymorphism (SNP) markers to map genes on the bovine linkage map. DNA of sires from the MARC reference population was used to detect SNPs, and progeny and mates of heterozygous sires were genotyped. Chromosome assignments for 861 SNPs were determined by two-point analysis, and positions for 735 SNPs were established by multipoint analyses. Linkage maps of bovine autosomes with these SNPs represent 4585 markers in 2475 positions spanning 3058 cM. Markers include 3612 microsatellites, 913 SNPs and 60 other markers. Mean separation between marker positions is 1.2 cM. New SNP markers appear in 511 positions, with mean separation of 4.7 cM. Multi-allelic markers, mostly microsatellites, had a mean (maximum) of 216 (366) informative meioses, and a mean 3-lod confidence interval of 3.6 cM. Bi-allelic markers, including SNP and other marker types, had a mean (maximum) of 55 (191) informative meioses, and were placed within a mean 8.5 cM 3-lod confidence interval. Homologous human sequences were identified for 1159 markers, including 582 newly developed and mapped SNP. Conclusion Addition of these EST- and BAC-based SNPs to the bovine linkage map not only increases marker density, but provides connections to gene-rich physical maps, including annotated human sequence. The map provides a resource for fine-mapping quantitative trait loci and identification of positional candidate genes, and can be integrated with other data to guide and

  1. A 48 SNP set for grapevine cultivar identification

    PubMed Central

    2011-01-01

    Background Rapid and consistent genotyping is an important requirement for cultivar identification in many crop species. Among them, grapevine cultivars have been the subject of multiple studies given the large number of synonyms and homonyms generated during many centuries of vegetative multiplication and exchange. Simple sequence repeat (SSR) markers have been preferred until now because of their high level of polymorphism, their codominant nature and their high profile repeatability. However, the rapid application of partial or complete genome sequencing approaches is identifying thousands of single nucleotide polymorphisms (SNP) that can be very useful for such purposes. Although SNP markers are bi-allelic, and therefore not as polymorphic as microsatellites, the high number of loci that can be multiplexed and the possibilities of automation as well as their highly repeatable results under any analytical procedure make them the future markers of choice for any type of genetic identification. Results We analyzed over 300 SNP in the genome of grapevine using a re-sequencing strategy in a selection of 11 genotypes. Among the identified polymorphisms, we selected 48 SNP spread across all grapevine chromosomes with allele frequencies balanced enough to provide sufficient information content for genetic identification in grapevine, allowing for a good genotyping success rate. Marker stability was tested in repeated analyses of a selected group of cultivars obtained worldwide to demonstrate their usefulness in genetic identification. Conclusions We have selected a set of 48 stable SNP markers with a high discrimination power and a uniform genome distribution (2-3 markers/chromosome), which is proposed as a standard set for grapevine (Vitis vinifera L.) genotyping. Any previous problems derived from microsatellite allele confusion between labs or the need to run reference cultivars to identify allele sizes disappear using this type of marker. Furthermore, because SNP

  2. Approaches for identifying multiple-SNP haplotype blocks for use in human identification.

    PubMed

    Hiroaki, Nakahara; Koji, Fujii; Tetsushi, Kitayama; Kazumasa, Sekiguchi; Hiroaki, Nakanishi; Kazuyuki, Saito

    2015-09-01

    Single nucleotide polymorphism (SNP) discrimination effectiveness is low due to the bi-allelic nature of SNPs, and large numbers of loci must be analyzed for human identification in forensic casework. To resolve these issues, the authors support the use of multiple SNP haplotypes that will generate many haplotypes based on the combination of SNP alleles. First, 27 regions were selected from the JSNP database (http://snp.ims.u-tokyo.ac.jp) according to the following criteria: (1) 3 or more SNP loci within 100 bp; (2) on-intron or out-of-gene location; and (3) frequency of more than 40% for each SNP allele. PCR amplification and high-resolution melting curve (HRM) analysis were then carried out for all selected regions to determine variation in the haplotypes of each. HRM analysis indicated that 7 regions (1q25, 1q42.2, 3p24, 10p13, 11p15.1, 14q12-q13, and 20q12) containing 3 SNP loci had more than 2 haplotypes. The frequencies of the haplotypes for each region were observed via direct sequencing of more than 100 individuals. Haplotyping not only increases the effectiveness of individual identification, but the analyzed region is also shorter than in common short tandem repeat analysis, representing a further advantage for fragmented DNA samples in SNP typing.
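
    The three selection criteria above lend themselves to a simple filter-and-window scan. The following is a minimal sketch, not the authors' procedure: the `Snp` record, the `in_exon` flag (one reading of criterion 2) and the example coordinates are all hypothetical, and "within 100 bp" is interpreted as a coordinate difference of less than 100.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snp:
    position: int             # base-pair coordinate on a single chromosome
    minor_allele_freq: float  # both alleles exceed 40% when this exceeds 0.40
    in_exon: bool             # hypothetical flag standing in for criterion (2)


def candidate_regions(snps, window_bp=100, min_snps=3, min_freq=0.40):
    """Return non-overlapping groups of eligible SNPs that fit in one window."""
    eligible = sorted(
        (s for s in snps if s.minor_allele_freq > min_freq and not s.in_exon),
        key=lambda s: s.position,
    )
    regions = []
    start = 0
    while start < len(eligible):
        end = start
        # Extend while the next SNP is still within window_bp of the first one.
        while (end + 1 < len(eligible)
               and eligible[end + 1].position - eligible[start].position < window_bp):
            end += 1
        if end - start + 1 >= min_snps:
            regions.append(eligible[start:end + 1])
            start = end + 1  # move past this region so hits do not overlap
        else:
            start += 1
    return regions


# Hypothetical SNPs: three eligible loci close together, one low-frequency
# locus that is filtered out, and two isolated loci.
demo_snps = [
    Snp(1000, 0.45, False), Snp(1040, 0.42, False), Snp(1090, 0.48, False),
    Snp(1060, 0.10, False), Snp(5000, 0.45, False), Snp(5300, 0.41, False),
]
regions = candidate_regions(demo_snps)
region_positions = [[s.position for s in region] for region in regions]
# Upper bound on distinct haplotypes for three bi-allelic loci.
max_haplotypes = 2 ** 3
```

    Three bi-allelic loci allow at most 2^3 = 8 haplotypes, which is why a region is only informative if more than 2 haplotypes are actually observed.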

  3. BM-SNP: A Bayesian Model for SNP Calling Using High Throughput Sequencing Data.

    PubMed

    Xu, Yanxun; Zheng, Xiaofeng; Yuan, Yuan; Estecio, Marcos R; Issa, Jean-Pierre; Qiu, Peng; Ji, Yuan; Liang, Shoudan

    2014-01-01

    A single-nucleotide polymorphism (SNP) is a sole base change in the DNA sequence and is the most common polymorphism. Detection and annotation of SNPs are among the central topics in biomedical research as SNPs are believed to play important roles in the manifestation of phenotypic events, such as disease susceptibility. To take full advantage of the next-generation sequencing (NGS) technology, we propose a Bayesian approach, BM-SNP, to identify SNPs based on the posterior inference using NGS data. In particular, BM-SNP computes the posterior probability of nucleotide variation at each covered genomic position using the contents and frequency of the mapped short reads. The position with a high posterior probability of nucleotide variation is flagged as a potential SNP. We apply BM-SNP to two cell-line NGS data sets, and the results show a high ratio of overlap (>95 percent) with the dbSNP database. Compared with MAQ, BM-SNP identifies more SNPs that are in dbSNP, with higher quality. The SNPs that are called only by BM-SNP but not in dbSNP may serve as new discoveries. The proposed BM-SNP method integrates information from multiple aspects of NGS data, and therefore achieves high detection power. BM-SNP is fast, capable of processing whole genome data at 20-fold average coverage in a short amount of time. PMID:26357041
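
    The idea of flagging a position by its posterior probability of variation can be illustrated with a toy calculation. This is not the BM-SNP model, which the abstract does not specify; it is a two-hypothesis binomial sketch in which the prior, the error rate and the function name are assumptions chosen for illustration.

```python
from math import comb


def variant_posterior(n_reads, n_alt, prior_variant=1e-3, error_rate=0.01):
    """Posterior probability of a heterozygous variant at one position.

    Toy model (an assumption, not BM-SNP): with no variant, each read shows
    the alternate base with probability error_rate; with a heterozygous
    variant, with probability 0.5.
    """
    def binom_pmf(k, n, p):
        return comb(n, k) * p ** k * (1 - p) ** (n - k)

    like_ref = binom_pmf(n_alt, n_reads, error_rate)
    like_het = binom_pmf(n_alt, n_reads, 0.5)
    evidence = prior_variant * like_het + (1 - prior_variant) * like_ref
    return prior_variant * like_het / evidence


# 20-fold coverage: half the reads carrying the alternate base versus none.
posterior_balanced = variant_posterior(20, 10)
posterior_no_alt = variant_posterior(20, 0)
```

    Under these assumed settings a position where half of 20 reads carry the alternate base receives a posterior very close to 1, while a position with no alternate reads receives a posterior close to 0.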

  4. SNP genotyping using single-tube fluorescent bidirectional PCR.

    PubMed

    Waterfall, Christy M; Cobb, Benjamin D

    2002-07-01

    SNP genotyping is a well-populated field with a large number of assay formats offering accurate allelic discrimination. However, there remains a discord between the ultimate goal of rapid, inexpensive assays that do not require complex design considerations and involved optimization strategies. We describe the first integration of bidirectional allele-specific amplification, SYBR Green I, and rapid-cycle PCR to provide a homogeneous SNP-typing assay. Wild-type, mutant, and heterozygous alleles were easily discriminated in a single tube using melt curve profiling of PCR products alone. We demonstrate the effectiveness and reliability of this assay with a blinded trial using clinical samples from individuals with sickle cell anemia, sickle cell trait, or unaffected individuals. The tests were completed in less than 30 min without expensive fluorogenic probes, prohibitive design rules, or lengthy downstream processing for product analysis.

  5. Detecting Susceptibility to Breast Cancer with SNP-SNP Interaction Using BPSOHS and Emotional Neural Networks.

    PubMed

    Wang, Xiao; Peng, Qinke; Fan, Yue

    2016-01-01

    Studies of the association between diseases and informative single nucleotide polymorphisms (SNPs) have received great attention. However, most of them just use the whole set of useful SNPs and fail to consider the SNP-SNP interactions, while these interactions have already been proven in biological experiments. In this paper, we use a binary particle swarm optimization with hierarchical structure (BPSOHS) algorithm to improve the effectiveness of PSO for the identification of the SNP-SNP interactions. Furthermore, in order to use these SNP interactions in the susceptibility analysis, we propose an emotional neural network (ENN) to treat SNP interactions as emotional tendency. Unlike the normal architecture, and like the emotional brain, this architecture provides a specific path to treat the emotional value, by which the SNP interactions can be considered more quickly and directly. The ENN helps us use the prior knowledge about the SNP interactions and other influence factors together. Finally, the experimental results prove that the proposed BPSOHS_ENN algorithm can detect the informative SNP-SNP interaction and predict the breast cancer risk with a much higher accuracy than existing methods. PMID:27294121

  6. Detecting Susceptibility to Breast Cancer with SNP-SNP Interaction Using BPSOHS and Emotional Neural Networks

    PubMed Central

    Wang, Xiao; Fan, Yue

    2016-01-01

    Studies of the association between diseases and informative single nucleotide polymorphisms (SNPs) have received great attention. However, most of them just use the whole set of useful SNPs and fail to consider the SNP-SNP interactions, while these interactions have already been proven in biological experiments. In this paper, we use a binary particle swarm optimization with hierarchical structure (BPSOHS) algorithm to improve the effectiveness of PSO for the identification of the SNP-SNP interactions. Furthermore, in order to use these SNP interactions in the susceptibility analysis, we propose an emotional neural network (ENN) to treat SNP interactions as emotional tendency. Unlike the normal architecture, and like the emotional brain, this architecture provides a specific path to treat the emotional value, by which the SNP interactions can be considered more quickly and directly. The ENN helps us use the prior knowledge about the SNP interactions and other influence factors together. Finally, the experimental results prove that the proposed BPSOHS_ENN algorithm can detect the informative SNP-SNP interaction and predict the breast cancer risk with a much higher accuracy than existing methods. PMID:27294121

  7. Managing large SNP datasets with SNPpy.

    PubMed

    Mitha, Faheem

    2013-01-01

    Using relational databases to manage SNP datasets is a very useful technique that has significant advantages over alternative methods, including the ability to leverage the power of relational databases to perform data validation, and the use of the powerful SQL query language to export data. SNPpy is a Python program which uses the PostgreSQL database and the SQLAlchemy Python library to automate SNP data management. This chapter shows how to use SNPpy to store and manage large datasets.
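
    The validation and export advantages described above can be sketched with a small relational schema. This example does not use SNPpy, PostgreSQL or SQLAlchemy; it is a stand-in built on Python's standard `sqlite3` module, and the table names, column names and genotype codes are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
CREATE TABLE snp (
    snp_id     TEXT PRIMARY KEY,
    chromosome TEXT NOT NULL,
    position   INTEGER NOT NULL CHECK (position > 0)
);
CREATE TABLE genotype (
    sample_id     TEXT NOT NULL,
    snp_id        TEXT NOT NULL REFERENCES snp(snp_id),
    genotype_call TEXT NOT NULL CHECK (genotype_call IN ('AA', 'AB', 'BB', 'NC')),
    PRIMARY KEY (sample_id, snp_id)
);
""")

# Hypothetical marker and genotype rows.
conn.executemany("INSERT INTO snp VALUES (?, ?, ?)",
                 [("rs_demo1", "1", 12345), ("rs_demo2", "2", 67890)])
conn.executemany("INSERT INTO genotype VALUES (?, ?, ?)",
                 [("s1", "rs_demo1", "AA"), ("s2", "rs_demo1", "AB"),
                  ("s1", "rs_demo2", "BB")])


def try_insert(row):
    """Return True if the database accepts the genotype row, False if it is rejected."""
    try:
        conn.execute("INSERT INTO genotype VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


bad_call_accepted = try_insert(("s3", "rs_demo1", "XY"))       # fails the CHECK constraint
unknown_snp_accepted = try_insert(("s3", "rs_missing", "AA"))  # fails the foreign key

# Export: one row per genotype joined with its marker position.
export = conn.execute("""
    SELECT s.snp_id, s.chromosome, g.sample_id, g.genotype_call
    FROM genotype AS g JOIN snp AS s ON s.snp_id = g.snp_id
    ORDER BY s.snp_id, g.sample_id
""").fetchall()
```

    Declaring the constraints in the schema means malformed genotype calls and references to unknown markers are rejected by the database itself rather than by ad hoc checks in loading scripts.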

  8. SNP Arrays for Species Identification in Salmonids.

    PubMed

    Wenne, Roman; Drywa, Agata; Kent, Matthew; Sundsaasen, Kristil Kindem; Lien, Sigbjørn

    2016-01-01

    The use of SNP genotyping microarrays, developed in one species to analyze a closely related species for which genomic sequence information is scarce, enables the rapid development of a genomic resource (SNP information) without the need to develop new species-specific markers. Using large numbers of microarray SNPs offers the best chance to detect informative markers in nontarget species, markers that can very often be assayed using a lower throughput platform as is described in this paper. PMID:27460372

  9. Development of an automated SNP analysis method using a paramagnetic beads handling robot.

    PubMed

    Hagiwara, Hiroko; Sawakami-Kobayashi, Kazumi; Yamamoto, Midori; Iwasaki, Shoji; Sugiura, Mika; Abe, Hatsumi; Kunihiro-Ohashi, Sumiko; Takase, Kumiko; Yamane, Noriko; Kato, Kaoru; Son, Renkon; Nakamura, Michihiro; Segawa, Osamu; Yoshida, Mamiko; Yohda, Masafumi; Tajima, Hideji; Kobori, Masato; Takahama, Yousuke; Itakura, Mitsuo; Machida, Masayuki

    2007-10-01

    The biological and medical importance of the single nucleotide polymorphism (SNP) has led to the development of a wide variety of methods for SNP typing. Aiming to establish highly reliable and fully automated SNP typing, we have developed the adapter ligation method in combination with the paramagnetic beads handling technology, Magtration(R). The method utilizes sequence specific ligation between the fluorescently labeled adapter and the sample DNAs at the cohesive end produced by a type IIS restriction enzyme. Evaluation of the method using human genomic DNA showed clear discrimination of the three genotypes without ambiguity using the same reaction condition for any SNPs examined. The operations following PCR amplification were automatically performed by the Magtration(R)-based robot that we have previously developed. Multiplex typing of two SNPs in a single reaction by using four fluorescent dyes was successfully performed at almost the same sensitivity and reliability as the single typing. These results demonstrate that the automated paramagnetic beads handling technology, Magtration(R), is highly adaptable to the automated SNP analysis and that our method is well suited to automated in-house SNP typing for laboratory and medical uses.

  10. SNPMeta: SNP annotation and SNP metadata collection without a reference genome

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The increase in availability of resequencing data is greatly accelerating SNP discovery and has facilitated the development of SNP genotyping assays. This, in turn, is increasing interest in annotation of individual SNPs. Currently, these data are only available through curation, or comparison to a ...

  11. Genome-wide SNP detection, validation, and development of an 8K SNP array for apple

    Technology Transfer Automated Retrieval System (TEKTRAN)

    As high-throughput genetic marker screening systems are essential for a range of genetics studies and plant breeding applications, the International RosBREED SNP Consortium (IRSC) has utilized the Illumina Infinium® II system to develop a medium- to high-throughput SNP screening tool for genome-wide...

  12. SNP Discovery Using Next Generation Transcriptomic Sequencing.

    PubMed

    De Wit, Pierre

    2016-01-01

    In this chapter, I will guide the user through methods to find new SNP markers from expressed sequence (RNA-Seq) data, focusing on the sample preparation and also on the bioinformatic analyses needed to sort through the immense flood of data from high-throughput sequencing machines. The general steps included are as follows: sample preparation, sequencing, quality control of data, assembly, mapping, SNP discovery, filtering, validation. The first few steps are traditional laboratory protocols, whereas steps following the sequencing are of bioinformatic nature. The bioinformatics described herein are by no means exhaustive, rather they serve as one example of a simple way of analyzing high-throughput sequence data to find SNP markers. Ideally, one would like to run through this protocol several times with a new dataset, while varying software parameters slightly, in order to determine the robustness of the results. The final validation step, although not described in much detail here, is also quite critical as that will be the final test of the accuracy of the assumptions made in silico. There is a plethora of downstream applications of a SNP dataset, not covered in this chapter. For an example of a more thorough protocol also including differential gene expression and functional enrichment analyses, BLAST annotation and downstream applications of SNP markers, a good starting point could be the "Simple Fool's Guide to population genomics via RNA-Seq," which is available at http://sfg.stanford.edu . PMID:27460371

  13. SNP-SNP Interaction Analysis on Soybean Oil Content under Multi-Environments

    PubMed Central

    Yin, Zhengong; Leng, Yue; Yu, Hongxiao; Jia, Huiying; Jiang, Shanshan; Ni, Zhongqiu; Jiang, Hongwei; Han, Xue; Liu, Chunyan; Hu, Zhenbang; Wu, Xiaoxia; Hu, Guohua; Xin, Dawei; Qi, Zhaoming

    2016-01-01

    Soybean oil content is one of the main quality traits. In this study, we used the multifactor dimensionality reduction (MDR) method and a soybean high-density genetic map including 5,308 markers to identify stable single nucleotide polymorphism (SNP)-SNP interactions controlling oil content in soybean across 23 environments. In total, 36,442,756 SNP-SNP interaction pairs were detected, and 1865 interaction pairs associated with soybean oil content were identified under multiple environments using the Bonferroni correction with p < 3.55×10^-11. Two and 1863 SNP-SNP interaction pairs were detected as stable across 12 and 11 environments, respectively, which account for around 50% of the total environments. Epistasis values and contribution rates of stable interaction pairs (SNP interaction pairs detected in more than 2 environments) were estimated by the two-way ANOVA test; for the available interaction pairs they ranged from 0.01 to 0.89 and from 0.01 to 0.85, respectively. One member of some interaction pairs had been identified in previous research as a major QTL without epistasis effects. The results of this study provide insights into the genetic architecture of soybean oil content and can serve as a basis for marker-assisted selection breeding. PMID:27668866

  14. SNP discovery by amplicon sequencing and multiplex SNP genotyping in the allopolyploid species Brassica napus.

    PubMed

    Durstewitz, G; Polley, A; Plieske, J; Luerssen, H; Graner, E M; Wieseke, R; Ganal, M W

    2010-11-01

    Oilseed rape (Brassica napus) is an allotetraploid species consisting of two genomes, derived from B. rapa (A genome) and B. oleracea (C genome). The presence of these two genomes makes single nucleotide polymorphism (SNP) marker identification and SNP analysis more challenging than in diploid species, as for a given locus usually two versions of a DNA sequence (based on the two ancestral genomes) have to be analyzed simultaneously during SNP identification and analysis. One hundred amplicons derived from expressed sequence tags (ESTs) were analyzed to identify SNPs in a panel of oilseed rape varieties and within two sister species representing the ancestral genomes. A total of 604 SNPs were identified, averaging one SNP in every 42 bp. It was possible to clearly discriminate SNPs that are polymorphic between different plant varieties from SNPs differentiating the two ancestral genomes. To validate the identified SNPs for their use in genetic analysis, we have developed Illumina GoldenGate assays for some of the identified SNPs. Through the analysis of a number of oilseed rape varieties and mapping populations with GoldenGate assays, we were able to identify a number of different segregation patterns in allotetraploid oilseed rape. The majority of the identified SNP markers can be readily used for genetic mapping, showing that amplicon sequencing and Illumina GoldenGate assays can be used to reliably identify SNP markers in tetraploid oilseed rape and to convert them into successful SNP assays that can be used for genetic analysis.

  15. SNP Array in Hematopoietic Neoplasms: A Review

    PubMed Central

    Song, Jinming; Shao, Haipeng

    2015-01-01

    Cytogenetic analysis is essential for the diagnosis and prognosis of hematopoietic neoplasms in current clinical practice. Many hematopoietic malignancies are characterized by structural chromosomal abnormalities such as specific translocations, inversions, deletions and/or numerical abnormalities that can be identified by karyotype analysis or fluorescence in situ hybridization (FISH) studies. Single nucleotide polymorphism (SNP) arrays offer high-resolution identification of copy number variants (CNVs) and acquired copy-neutral loss of heterozygosity (LOH)/uniparental disomy (UPD) that are usually not identifiable by conventional cytogenetic analysis and FISH studies. As a result, SNP arrays have been increasingly applied to hematopoietic neoplasms to search for clinically significant genetic abnormalities. A large number of CNVs and UPDs have been identified in a variety of hematopoietic neoplasms. CNVs detected by SNP array in some hematopoietic neoplasms are of prognostic significance. A few specific genes in the affected regions have been implicated in the pathogenesis and may be the targets for specific therapeutic agents in the future. In this review, we summarize the current findings of application of SNP arrays in a variety of hematopoietic malignancies with an emphasis on the clinically significant genetic variants. PMID:27600067

  17. Compression and fast retrieval of SNP data

    PubMed Central

    Sambo, Francesco; Di Camillo, Barbara; Toffolo, Gianna; Cobelli, Claudio

    2014-01-01

    Motivation: The increasing interest in rare genetic variants and epistatic genetic effects on complex phenotypic traits is currently pushing genome-wide association study design towards datasets of increasing size, both in the number of studied subjects and in the number of genotyped single nucleotide polymorphisms (SNPs). This, in turn, is leading to a compelling need for new methods for compression and fast retrieval of SNP data. Results: We present a novel algorithm and file format for compressing and retrieving SNP data, specifically designed for large-scale association studies. Our algorithm is based on two main ideas: (i) compress linkage disequilibrium blocks in terms of differences with a reference SNP and (ii) compress reference SNPs exploiting information on their call rate and minor allele frequency. Tested on two SNP datasets and compared with several state-of-the-art software tools, our compression algorithm is shown to be competitive in terms of compression rate and to outperform all tools in terms of time to load compressed data. Availability and implementation: Our compression and decompression algorithms are implemented in a C++ library, are released under the GNU General Public License and are freely downloadable from http://www.dei.unipd.it/~sambofra/snpack.html. Contact: sambofra@dei.unipd.it or cobelli@dei.unipd.it. PMID:25064564
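
    Idea (i) above, storing the SNPs of a linkage disequilibrium block as differences from a reference SNP, can be sketched in a few lines. This is not the authors' file format or C++ library; the function names, the 0/1/2 genotype coding and the example block are assumptions made for illustration.

```python
def encode_block(reference, others):
    """Encode each SNP as the (subject index, genotype) pairs that differ from the reference."""
    encoded = []
    for genotypes in others:
        if len(genotypes) != len(reference):
            raise ValueError("all SNPs must cover the same subjects")
        diffs = [(i, g) for i, (r, g) in enumerate(zip(reference, genotypes)) if r != g]
        encoded.append(diffs)
    return encoded


def decode_block(reference, encoded):
    """Rebuild every SNP by applying its stored differences to a copy of the reference."""
    decoded = []
    for diffs in encoded:
        genotypes = list(reference)
        for index, value in diffs:
            genotypes[index] = value
        decoded.append(genotypes)
    return decoded


# Hypothetical block: genotypes coded 0/1/2 for six subjects.
reference_snp = [0, 1, 2, 0, 1, 0]
block = [
    [0, 1, 2, 0, 1, 1],  # differs from the reference in one subject
    [0, 1, 2, 0, 1, 0],  # identical to the reference
    [1, 1, 2, 0, 1, 0],  # differs from the reference in one subject
]
encoded_block = encode_block(reference_snp, block)
restored_block = decode_block(reference_snp, encoded_block)
```

    Because SNPs in strong linkage disequilibrium are highly correlated, most difference lists are short, which is where the saving over storing every genotype comes from in this sketch.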

  18. Mycobacterium leprae in Colombia described by SNP7614 in gyrA, two minisatellites and geography

    PubMed Central

    Cardona-Castro, Nora; Beltrán-Alzate, Juan Camilo; Romero-Montoya, Irma Marcela; Li, Wei; Brennan, Patrick J; Vissa, Varalakshmi

    2013-01-01

    New cases of leprosy are still being detected in Colombia after the country declared achievement of the WHO-defined ‘elimination’ status. To study the ecology of leprosy in endemic regions, a combination of geographic and molecular tools was applied to a group of 201 multibacillary patients including six multi-case families from eleven departments. The locations (latitude and longitude) of patient residences were mapped. Slit skin smears and/or skin biopsies were collected and DNA was extracted. Standard agarose gel electrophoresis following a multiplex PCR was developed for rapid and inexpensive strain typing of M. leprae based on copy numbers of two VNTR minisatellite loci 27-5 and 12-5. A SNP (C/T) in gyrA (SNP7614) was mapped by introducing a novel PCR-RFLP into an ongoing drug resistance surveillance effort. Multiple genotypes were detected combining the three molecular markers. The two frequent genotypes in Colombia were SNP7614(C)/27-5(5)/12-5(4) [C54] predominantly distributed in the Atlantic departments and SNP7614(T)/27-5(4)/12-5(5) [T45] associated with the Andean departments. A novel genotype SNP7614(C)/27-5(6)/12-5(4) [C64] was detected in cities along the Magdalena river which separates the Andean from Atlantic departments; a subset was further characterized showing association with a rare allele of minisatellite 23-3 and the SNP type 1 of M. leprae. The genotypes within intra-family cases were conserved. Overall, this is the first large scale study that utilized simple and rapid assay formats for identification of major strain types and their distribution in Colombia. It provides the framework for further strain type discrimination and geographic information systems as tools for tracing transmission of leprosy. PMID:23291420

  19. Mycobacterium leprae in Colombia described by SNP7614 in gyrA, two minisatellites and geography.

    PubMed

    Cardona-Castro, Nora; Beltrán-Alzate, Juan Camilo; Romero-Montoya, Irma Marcela; Li, Wei; Brennan, Patrick J; Vissa, Varalakshmi

    2013-03-01

    New cases of leprosy are still being detected in Colombia after the country declared achievement of the WHO-defined 'elimination' status. To study the ecology of leprosy in endemic regions, a combination of geographic and molecular tools was applied to a group of 201 multibacillary patients including six multi-case families from eleven departments. The locations (latitude and longitude) of patient residences were mapped. Slit skin smears and/or skin biopsies were collected and DNA was extracted. Standard agarose gel electrophoresis following a multiplex PCR was developed for rapid and inexpensive strain typing of Mycobacterium leprae based on copy numbers of two VNTR minisatellite loci 27-5 and 12-5. A SNP (C/T) in gyrA (SNP7614) was mapped by introducing a novel PCR-RFLP into an ongoing drug resistance surveillance effort. Multiple genotypes were detected combining the three molecular markers. The two frequent genotypes in Colombia were SNP7614(C)/27-5(5)/12-5(4) [C54] predominantly distributed in the Atlantic departments and SNP7614(T)/27-5(4)/12-5(5) [T45] associated with the Andean departments. A novel genotype SNP7614(C)/27-5(6)/12-5(4) [C64] was detected in cities along the Magdalena river which separates the Andean from Atlantic departments; a subset was further characterized showing association with a rare allele of minisatellite 23-3 and the SNP type 1 of M. leprae. The genotypes within intra-family cases were conserved. Overall, this is the first large scale study that utilized simple and rapid assay formats for identification of major strain types and their distribution in Colombia. It provides the framework for further strain type discrimination and geographic information systems as tools for tracing transmission of leprosy. PMID:23291420

  20. Does replication groups scoring reduce false positive rate in SNP interaction discovery?

    PubMed Central

    2010-01-01

    Background Computational methods that infer single nucleotide polymorphism (SNP) interactions from phenotype data may uncover new biological mechanisms in non-Mendelian diseases. However, practical aspects of such analysis face many problems. Present experimental studies typically use SNP arrays with hundreds of thousands of SNPs but record only hundreds of samples. Candidate SNP pairs inferred by interaction analysis may include a high proportion of false positives. Recently, Gayan et al. (2008) proposed to reduce the number of false positives by combining results of interaction analysis performed on subsets of data (replication groups), rather than analyzing the entire data set directly. If performing as hypothesized, replication groups scoring could improve interaction analysis and also any type of feature ranking and selection procedure in systems biology. Because Gayan et al. do not compare their approach to the standard interaction analysis techniques, we here investigate whether replication groups indeed reduce the number of reported false positive interactions. Results A set of simulated and false interaction-imputed experimental SNP data sets was used to compare the inference of SNP-SNP interactions by means of replication groups to the standard approach where the entire data set was directly used to score all candidate SNP pairs. In all our experiments, the inference of interactions from the entire data set (i.e. without using the replication groups) reported fewer false positives. Conclusions Compared with the direct scoring approach, the use of replication groups does not reduce false positive rates, and may, depending on the data set, often perform worse. PMID:20092660

  1. Genome-wide SNP detection, validation, and development of an 8K SNP array for apple.

    PubMed

    Chagné, David; Crowhurst, Ross N; Troggio, Michela; Davey, Mark W; Gilmore, Barbara; Lawley, Cindy; Vanderzande, Stijn; Hellens, Roger P; Kumar, Satish; Cestaro, Alessandro; Velasco, Riccardo; Main, Dorrie; Rees, Jasper D; Iezzoni, Amy; Mockler, Todd; Wilhelm, Larry; Van de Weg, Eric; Gardiner, Susan E; Bassil, Nahla; Peace, Cameron

    2012-01-01

    As high-throughput genetic marker screening systems are essential for a range of genetics studies and plant breeding applications, the International RosBREED SNP Consortium (IRSC) has utilized the Illumina Infinium® II system to develop a medium- to high-throughput SNP screening tool for genome-wide evaluation of allelic variation in apple (Malus×domestica) breeding germplasm. For genome-wide SNP discovery, 27 apple cultivars were chosen to represent worldwide breeding germplasm and re-sequenced at low coverage with the Illumina Genome Analyzer II. Following alignment of these sequences to the whole genome sequence of 'Golden Delicious', SNPs were identified using SoapSNP. A total of 2,113,120 SNPs were detected, corresponding to one SNP to every 288 bp of the genome. The Illumina GoldenGate® assay was then used to validate a subset of 144 SNPs with a range of characteristics, using a set of 160 apple accessions. This validation assay enabled fine-tuning of the final subset of SNPs for the Illumina Infinium® II system. The set of stringent filtering criteria developed allowed choice of a set of SNPs that not only exhibited an even distribution across the apple genome and a range of minor allele frequencies to ensure utility across germplasm, but also were located in putative exonic regions to maximize genotyping success rate. A total of 7867 apple SNPs was established for the IRSC apple 8K SNP array v1, of which 5554 were polymorphic after evaluation in segregating families and a germplasm collection. This publicly available genomics resource will provide an unprecedented resolution of SNP haplotypes, which will enable marker-locus-trait association discovery, description of the genetic architecture of quantitative traits, investigation of genetic variation (neutral and functional), and genomic selection in apple.
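
    The figures quoted in this abstract can be cross-checked with simple arithmetic. The sketch below is only an illustration: the "implied surveyed length" is an inference from the reported SNP count and spacing, not a number stated in the abstract, and the variable names are chosen here for readability.

```python
# Figures quoted in the abstract above.
total_snps = 2_113_120   # SNPs detected across the 27 re-sequenced cultivars
bp_per_snp = 288         # reported average spacing: one SNP per 288 bp

# Implied length of sequence surveyed (an inference, not a figure from the abstract).
implied_bp = total_snps * bp_per_snp
implied_mb = implied_bp / 1e6

# Array-level yield: polymorphic SNPs as a share of the SNPs on the 8K array.
array_snps = 7867
polymorphic_snps = 5554
polymorphic_rate = polymorphic_snps / array_snps

print(f"Implied surveyed length: {implied_mb:.1f} Mb")
print(f"Polymorphic fraction of array SNPs: {polymorphic_rate:.1%}")
```

    Under these assumptions the reported density corresponds to roughly 609 Mb of sequence, and about 71% of the array SNPs were polymorphic in the evaluated material.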

  2. A SNP-Based Molecular Barcode for Characterization of Common Wheat

    PubMed Central

    Gao, LiFeng; Jia, JiZeng; Kong, XiuYing

    2016-01-01

    Wheat is grown as a staple crop worldwide. It is important to develop an effective genotyping tool for this cereal grain both to identify germplasm diversity and to protect the rights of breeders. Single-nucleotide polymorphism (SNP) genotyping provides a means for developing a practical, rapid, inexpensive and high-throughput assay. Here, we investigated SNPs as robust markers of genetic variation for typing wheat cultivars. We identified SNPs from an array of 9000 across a collection of 429 well-known wheat cultivars grown in China, of which 43 SNP markers with high minor allele frequency and variations discriminated the selected wheat varieties and their wild ancestors. This SNP-based barcode will allow for the rapid and precise identification of wheat germplasm resources and newly released varieties and will further assist in the wheat breeding program. PMID:26985664

  3. Developing single nucleotide polymorphism (SNP) markers from transcriptome sequences for identification of longan (Dimocarpus longan) germplasm

    PubMed Central

    Wang, Boyi; Tan, Hua-Wei; Fang, Wanping; Meinhardt, Lyndel W; Mischke, Sue; Matsumoto, Tracie; Zhang, Dapeng

    2015-01-01

    Longan (Dimocarpus longan Lour.) is an important tropical fruit tree crop. Accurate varietal identification is essential for germplasm management and breeding. Using longan transcriptome sequences from public databases, we developed single nucleotide polymorphism (SNP) markers; validated 60 SNPs in 50 longan germplasm accessions, including cultivated varieties and wild germplasm; and designated 25 SNP markers that unambiguously identified all tested longan varieties with high statistical rigor (P<0.0001). Multiple trees from the same clone were verified and off-type trees were identified. Diversity analysis revealed genetic relationships among analyzed accessions. Cultivated varieties differed significantly from wild populations (Fst=0.300; P<0.001), demonstrating untapped genetic diversity for germplasm conservation and utilization. Within cultivated varieties, apparent differences between varieties from China and those from Thailand and Hawaii indicated geographic patterns of genetic differentiation. These SNP markers provide a powerful tool to manage longan genetic resources and breeding, with accurate and efficient genotype identification. PMID:26504559

  4. Applying SNP marker technology in the cacao breeding program at the Cocoa Research Institute of Ghana

    Technology Transfer Automated Retrieval System (TEKTRAN)

    In this investigation 45 parental cacao plants and five progeny derived from the parental stock studied were genotyped using six SNP markers to determine off-types or mislabeled clones and to authenticate crosses made in the Cocoa Research Institute of Ghana (CRIG) breeding program. Investigation wa...

  5. High-throughput SNP genotyping in Cucurbita pepo for map construction and quantitative trait loci mapping

    PubMed Central

    2012-01-01

    Background Cucurbita pepo is a member of the Cucurbitaceae family, the second-most important horticultural family in terms of economic importance after Solanaceae. The "summer squash" types, including Zucchini and Scallop, rank among the highest-valued vegetables worldwide. There are few genomic tools available for this species. The first Cucurbita transcriptome, along with a large collection of Single Nucleotide Polymorphisms (SNP), was recently generated using massive sequencing. A set of 384 SNP was selected to generate an Illumina GoldenGate assay in order to construct the first SNP-based genetic map of Cucurbita and map quantitative trait loci (QTL). Results We herein present the construction of the first SNP-based genetic map of Cucurbita pepo using a population derived from the cross of two varieties with contrasting phenotypes, representing the main cultivar groups of the species' two subspecies: Zucchini (subsp. pepo) × Scallop (subsp. ovifera). The mapping population was genotyped with 384 SNP, a set of selected EST-SNP identified in silico after massive sequencing of the transcriptomes of both parents, using the Illumina GoldenGate platform. The global success rate of the assay was higher than 85%. In total, 304 SNP were mapped, along with 11 SSR from a previous map, giving a map density of 5.56 cM/marker. This map was used to infer syntenic relationships between C. pepo and cucumber and to successfully map QTL that control plant, flowering and fruit traits that are of benefit to squash breeding. The QTL effects were validated in backcross populations. Conclusion Our results show that massive sequencing in different genotypes is an excellent tool for SNP discovery, and that the Illumina GoldenGate platform can be successfully applied to constructing genetic maps and performing QTL analysis in Cucurbita.
This is the first SNP-based genetic map in the Cucurbita genus and is an invaluable new tool for biological research, especially considering that most

  6. RNASEL and MIR146A SNP-SNP Interaction as a Susceptibility Factor for Non-Melanoma Skin Cancer

    PubMed Central

    Farzan, Shohreh F.; Karagas, Margaret R.; Christensen, Brock C.; Li, Zhongze; Kuriger, Jacquelyn K.; Nelson, Heather H.

    2014-01-01

    Immunity and inflammatory pathways are important in the genesis of non-melanoma skin cancers (NMSC). Functional genetic variation in immune modulators has the potential to affect disease etiology. We investigated associations between common variants in two key regulators, MIR146A and RNASEL, and their relation to NMSCs. Using a large population-based case-control study of basal cell (BCC) and squamous cell carcinoma (SCC), we investigated the impact of MIR146A SNP rs2910164 on cancer risk, and interaction with a SNP in one of its putative targets (RNASEL, rs486907). To examine associations between genotype and BCC and SCC occurrence, odds ratios (OR) and 95% confidence intervals (95%CI) were calculated using unconditional logistic regression, accounting for multiple confounding factors. We did not observe an overall change in the odds ratios for SCC or BCC among individuals carrying either of the RNASEL or MIR146A variants compared with those who were wild type at these loci. However, there was a sex-specific association between BCC and MIR146A in women (OR_GC = 0.73, [95%CI = 0.52–1.03]; OR_CC = 0.29, [95%CI = 0.14–0.61], p-trend<0.001), and a reduction in risk, albeit not statistically significant, associated with RNASEL and SCC in men (OR_AG = 0.88, [95%CI = 0.65–1.19]; OR_AA = 0.68, [95%CI = 0.43–1.08], p-trend = 0.10). Most striking was the strong interaction between the two genes. Among individuals carrying variant alleles of both rs2910164 and rs486907, we observed inverse relationships with SCC (OR_SCC = 0.56, [95%CI = 0.38–0.81], p-interaction = 0.012) and BCC (OR_BCC = 0.57, [95%CI = 0.40–0.80], p-interaction = 0.005). Our results suggest that genetic variation in immune and inflammatory regulators may influence susceptibility to NMSC, and point to a novel SNP-SNP interaction between a microRNA and its target. These data suggest that RNASEL, an enzyme involved in RNA turnover, is controlled by miR-146a

  7. eSNPO: An eQTL-based SNP Ontology and SNP functional enrichment analysis platform

    PubMed Central

    Li, Jin; Wang, Limei; Jiang, Tao; Wang, Jizhe; Li, Xue; Liu, Xiaoyan; Wang, Chunyu; Teng, Zhixia; Zhang, Ruijie; Lv, Hongchao; Guo, Maozu

    2016-01-01

    Genome-wide association studies (GWASs) have identified many common genetic variants associated with complex human traits such as diseases. Functional annotation and enrichment analysis of the significant SNPs are important follow-up tasks. Classic methods are based on the physical positions of SNPs and genes. Expression quantitative trait loci (eQTLs) are genomic loci that contribute to variation in gene expression levels and have proven effective for connecting SNPs and genes. In this work, we integrated the eQTL data and Gene Ontology (GO), constructed associations between SNPs and GO terms, then performed functional enrichment analysis. Finally, we constructed an eQTL-based SNP Ontology and SNP functional enrichment analysis platform. In an example using Parkinson Disease (PD), the proposed platform and method are efficient. We believe eSNPO will be a useful resource for SNP functional annotation and enrichment analysis once significant disease-related SNPs have been obtained. PMID:27470167

  8. eSNPO: An eQTL-based SNP Ontology and SNP functional enrichment analysis platform.

    PubMed

    Li, Jin; Wang, Limei; Jiang, Tao; Wang, Jizhe; Li, Xue; Liu, Xiaoyan; Wang, Chunyu; Teng, Zhixia; Zhang, Ruijie; Lv, Hongchao; Guo, Maozu

    2016-01-01

    Genome-wide association studies (GWASs) have identified many common genetic variants associated with complex human traits such as diseases. Functional annotation and enrichment analysis of the significant SNPs are important follow-up tasks. Classic methods are based on the physical positions of SNPs and genes. Expression quantitative trait loci (eQTLs) are genomic loci that contribute to variation in gene expression levels and have proven effective for connecting SNPs and genes. In this work, we integrated the eQTL data and Gene Ontology (GO), constructed associations between SNPs and GO terms, then performed functional enrichment analysis. Finally, we constructed an eQTL-based SNP Ontology and SNP functional enrichment analysis platform. In an example using Parkinson Disease (PD), the proposed platform and method are efficient. We believe eSNPO will be a useful resource for SNP functional annotation and enrichment analysis once significant disease-related SNPs have been obtained. PMID:27470167

  9. Linkage Analysis and QTL Mapping Using SNP Dosage Data in a Tetraploid Potato Mapping Population

    PubMed Central

    Hackett, Christine A.; McLean, Karen; Bryan, Glenn J.

    2013-01-01

    New sequencing and genotyping technologies have enabled researchers to generate high density SNP genotype data for mapping populations. In polyploid species, SNP data usually contain a new type of information, the allele dosage, which is not used by current methodologies for linkage analysis and QTL mapping. Here we extend existing methodology to use dosage data on SNPs in an autotetraploid mapping population. The SNP dosages are inferred from allele intensity ratios using normal mixture models. The steps of the linkage analysis (testing for distorted segregation, clustering SNPs, calculation of recombination fractions and LOD scores, ordering of SNPs and inference of parental phase) are extended to use the dosage information. For QTL analysis, the probability of each possible offspring genotype is inferred at a grid of locations along the chromosome from the ordered parental genotypes and phases and the offspring dosages. A normal mixture model is then used to relate trait values to the offspring genotypes and to identify the most likely locations for QTLs. These methods are applied to analyse a tetraploid potato mapping population of parents and 190 offspring, genotyped using an Infinium 8300 Potato SNP Array. Linkage maps for each of the 12 chromosomes are constructed. The allele intensity ratios are mapped as quantitative traits to check that their position and phase agrees with that of the corresponding SNP. This analysis confirms most SNP positions, and eliminates some problem SNPs to give high-density maps for each chromosome, with between 74 and 152 SNPs mapped and between 100 and 300 further SNPs allocated to approximate bins. Low numbers of double reduction products were detected. Overall 3839 of the 5378 polymorphic SNPs can be assigned putative genetic locations. This methodology can be applied to construct high-density linkage maps in any autotetraploid species, and could also be extended to higher autopolyploids. PMID:23704960

  10. COL18A1 is highly expressed during human adipocyte differentiation and the SNP c.1136C > T in its "frizzled" motif is associated with obesity in diabetes type 2 patients.

    PubMed

    Errera, Flavia I V; Canani, Luís H; Yeh, Erika; Kague, Erika; Armelin-Corrêa, Lucia M; Suzuki, Oscar T; Tschiedel, Balduíno; Silva, Maria Elizabeth R; Sertié, Andréa L; Passos-Bueno, Maria Rita

    2008-03-01

    Collagen XVIII can generate two fragments: NC11-728, containing a frizzled motif that possibly acts in Wnt signaling, and Endostatin, which is cleaved from the NC1 and is a potent inhibitor of angiogenesis. Collagen XVIII and Wnt signaling have recently been associated with adipogenic differentiation and obesity in some animal models, but not in humans. In the present report, we have shown that COL18A1 expression increases during human adipogenic differentiation. We also tested whether polymorphisms in the Frizzled (c.1136C>T; Thr379Met) and Endostatin (c.4349G>A; Asp1437Asn) regions contribute towards susceptibility to obesity in patients with type 2 diabetes (113 obese, BMI ≥ 30; 232 non-obese, BMI < 30) of European ancestry. No evidence of association was observed between the allele c.4349G>A and obesity, but we observed a significantly higher frequency of homozygotes c.1136TT in obese (19.5%) than in non-obese individuals (10.9%) [P = 0.02; OR = 2.0 (95%CI: 1.07-3.73)], suggesting that the allele c.1136T is associated with obesity in a recessive model. This genotype, after controlling for cholesterol, LDL cholesterol, and triglycerides, was independently associated with obesity (P = 0.048), and increases the chance of obesity 2.8-fold. Therefore, our data suggest the involvement of collagen XVIII in human adipogenesis and susceptibility to obesity.
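
    The reported odds ratio can be approximately reproduced from the two genotype frequencies alone. The sketch below is illustrative only: it uses the rounded percentages from the abstract rather than the underlying genotype counts (which the abstract does not give), so the result matches the reported value only to rounding, and the confidence interval is not recomputed. The function name `odds_ratio` is chosen here.

```python
def odds_ratio(p_cases: float, p_controls: float) -> float:
    """Odds ratio from the proportion carrying a genotype in each group."""
    odds_cases = p_cases / (1.0 - p_cases)
    odds_controls = p_controls / (1.0 - p_controls)
    return odds_cases / odds_controls


# Reported frequencies of c.1136TT homozygotes (recessive model: TT vs. all other genotypes).
p_obese = 0.195      # 19.5% of obese patients
p_non_obese = 0.109  # 10.9% of non-obese patients

or_tt = odds_ratio(p_obese, p_non_obese)
print(f"OR for c.1136TT, obese vs. non-obese: {or_tt:.2f}")
```

    The value obtained this way is close to the reported OR of 2.0; the small difference is expected because the percentages are rounded.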

  11. SNP marker detection and genotyping in tilapia.

    PubMed

    Van Bers, N E M; Crooijmans, R P M A; Groenen, M A M; Dibbits, B W; Komen, J

    2012-09-01

    We have generated a unique resource consisting of nearly 175 000 short contig sequences and 3569 SNP markers from the widely cultured GIFT (Genetically Improved Farmed Tilapia) strain of Nile tilapia (Oreochromis niloticus). In total, 384 SNPs were selected to monitor the wider applicability of the SNPs by genotyping tilapia individuals from different strains and different geographical locations. In all strains and species tested (O. niloticus, O. aureus and O. mossambicus), the genotyping assay was working for a similar number of SNPs (288-305 SNPs). The actual number of polymorphic SNPs was, as expected, highest for individuals from the GIFT population (255 SNPs). In the individuals from an Egyptian strain and in individuals caught in the wild in the basin of the river Volta, 197 and 163 SNPs were polymorphic, respectively. A pairwise calculation of Nei's genetic distance allowed the discrimination of the individual strains and species based on the genotypes determined with the SNP set. We expect that this set will be widely applicable for use in tilapia aquaculture, e.g. for pedigree reconstruction. In addition, this set is currently used for assaying the genetic diversity of native Nile tilapia in areas where tilapia is, or will be, introduced in aquaculture projects. This allows the tracing of escapees from aquaculture and the monitoring of effects of introgression and hybridization. PMID:22524158
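
    The abstract does not say which of Nei's distance measures was used. As a minimal sketch, assuming Nei's (1972) standard genetic distance and biallelic SNPs, the pairwise calculation could look like the following; the allele frequencies are hypothetical and serve only to exercise the function.

```python
import math


def nei_standard_distance(freqs_x, freqs_y):
    """Nei's (1972) standard genetic distance for biallelic SNPs.

    freqs_x, freqs_y: per-locus frequency of one chosen allele in each
    population, listed in the same locus order.
    """
    if len(freqs_x) != len(freqs_y) or not freqs_x:
        raise ValueError("need the same non-zero number of loci in both populations")
    jx = jy = jxy = 0.0
    for p, q in zip(freqs_x, freqs_y):
        jx += p * p + (1 - p) * (1 - p)    # gene identity within population X
        jy += q * q + (1 - q) * (1 - q)    # gene identity within population Y
        jxy += p * q + (1 - p) * (1 - q)   # gene identity between X and Y
    if jxy == 0.0:
        return math.inf                    # no shared alleles at any locus
    identity = jxy / math.sqrt(jx * jy)
    return -math.log(identity)


# Hypothetical allele frequencies at four SNPs in two strains.
strain_a = [0.10, 0.50, 0.90, 0.30]
strain_b = [0.20, 0.40, 0.70, 0.80]
print(f"D(a, b) = {nei_standard_distance(strain_a, strain_b):.4f}")
```

    The distance is zero for populations with identical allele frequencies and grows as the frequencies diverge, which is what allows strains and species to be separated.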

  12. SNP marker detection and genotyping in tilapia.

    PubMed

    Van Bers, N E M; Crooijmans, R P M A; Groenen, M A M; Dibbits, B W; Komen, J

    2012-09-01

    We have generated a unique resource consisting of nearly 175 000 short contig sequences and 3569 SNP markers from the widely cultured GIFT (Genetically Improved Farmed Tilapia) strain of Nile tilapia (Oreochromis niloticus). In total, 384 SNPs were selected to monitor the wider applicability of the SNPs by genotyping tilapia individuals from different strains and different geographical locations. In all strains and species tested (O. niloticus, O. aureus and O. mossambicus), the genotyping assay was working for a similar number of SNPs (288-305 SNPs). The actual number of polymorphic SNPs was, as expected, highest for individuals from the GIFT population (255 SNPs). In the individuals from an Egyptian strain and in individuals caught in the wild in the basin of the river Volta, 197 and 163 SNPs were polymorphic, respectively. A pairwise calculation of Nei's genetic distance allowed the discrimination of the individual strains and species based on the genotypes determined with the SNP set. We expect that this set will be widely applicable for use in tilapia aquaculture, e.g. for pedigree reconstruction. In addition, this set is currently used for assaying the genetic diversity of native Nile tilapia in areas where tilapia is, or will be, introduced in aquaculture projects. This allows the tracing of escapees from aquaculture and the monitoring of effects of introgression and hybridization.

  13. Longevity and plasticity of CFTR provide an argument for noncanonical SNP organization in hominid DNA.

    PubMed

    Hill, Aubrey E; Plyler, Zackery E; Tiwari, Hemant; Patki, Amit; Tully, Joel P; McAtee, Christopher W; Moseley, Leah A; Sorscher, Eric J

    2014-01-01

    Like many other ancient genes, the cystic fibrosis transmembrane conductance regulator (CFTR) has survived for hundreds of millions of years. In this report, we consider whether such prodigious longevity of an individual gene--as opposed to an entire genome or species--should be considered surprising in the face of eons of relentless DNA replication errors, mutagenesis, and other causes of sequence polymorphism. The conventions that modern human SNP patterns result either from purifying selection or random (neutral) drift were not well supported, since extant models account rather poorly for the known plasticity and function (or the established SNP distributions) found in a multitude of genes such as CFTR. Instead, our analysis can be taken as a polemic indicating that SNPs in CFTR and many other mammalian genes may have been generated--and continue to accrue--in a fundamentally more organized manner than would otherwise have been expected. The resulting viewpoint contradicts earlier claims of 'directional' or 'intelligent design-type' SNP formation, and has important implications regarding the pace of DNA adaptation, the genesis of conserved non-coding DNA, and the extent to which eukaryotic SNP formation should be viewed as adaptive. PMID:25350658

  14. SNP-microarrays can accurately identify the presence of an individual in complex forensic DNA mixtures.

    PubMed

    Voskoboinik, Lev; Ayers, Sheri B; LeFebvre, Aaron K; Darvasi, Ariel

    2015-05-01

    Common forensic and mass disaster scenarios present DNA evidence that comprises a mixture of several contributors. Identifying the presence of an individual in such mixtures has proven difficult. In the current study, we evaluate the practical usefulness of currently available "off-the-shelf" SNP microarrays for such purposes. We found that a set of 3000 SNPs specifically selected for this purpose can accurately identify the presence of an individual in complex DNA mixtures of various compositions. For example, individuals contributing as little as 5% to a complex DNA mixture can be robustly identified even if the starting DNA amount was as little as 5.0 ng and had undergone whole-genome amplification (WGA) prior to SNP analysis. The work presented in this study represents proof-of-principle that our previously proposed approach can work with real "forensic-type" samples. Furthermore, in the absence of a low-density focused forensic SNP microarray, standard, currently available high-density SNP microarrays can be used in the same way and can even increase statistical power due to the larger amount of available information.

  15. Changes in variance explained by top SNP windows over generations for three traits in broiler chicken.

    PubMed

    Fragomeni, Breno de Oliveira; Misztal, Ignacy; Lourenco, Daniela Lino; Aguilar, Ignacio; Okimoto, Ronald; Muir, William M

    2014-01-01

    The purpose of this study was to determine if the set of genomic regions inferred as accounting for the majority of genetic variation in quantitative traits remains stable over multiple generations of selection. The data set contained phenotypes for five generations of broiler chicken for body weight, breast meat, and leg score. The population consisted of 294,632 animals over five generations and also included genotypes of 41,036 single nucleotide polymorphisms (SNPs) for 4,866 animals, after quality control. The SNP effects were calculated by a GWAS-type analysis using a single-step genomic BLUP approach for generations 1-3, 2-4, 3-5, and 1-5. Variances were calculated for windows of 20 SNPs. The top ten windows for each trait that explained the largest fraction of the genetic variance across generations were examined. Across generations, the top 10 windows explained more than 0.5% but less than 1% of the total variance. Also, the pattern of the windows was not consistent across generations. The windows that explained the greatest variance changed greatly among the combinations of generations, with a few exceptions. In many cases, a window identified as top for one combination explained less than 0.1% for the other combinations. We conclude that identification of top SNP windows for a population may have little predictive power for genetic selection in the following generations for the traits evaluated here.
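
    The abstract does not spell out how the window variances were computed. One common way to express such a quantity, sketched below under stated assumptions, is the variance across animals of the summed SNP effects within a window, divided by the variance of the summed effects over all SNPs. The non-overlapping windows, the 0/1/2 genotype coding, the function name, and the toy data are all assumptions made here for illustration.

```python
from statistics import pvariance


def window_variance_shares(genotypes, effects, window_size=20):
    """Share of total genomic variance attributed to each window of adjacent SNPs.

    genotypes: one list per animal holding 0/1/2 allele counts per SNP.
    effects:   estimated effect per SNP, in the same order as the genotype columns.
    """
    # Genomic value of each animal summed over all SNPs.
    totals = [sum(g * u for g, u in zip(animal, effects)) for animal in genotypes]
    total_var = pvariance(totals)
    if total_var == 0:
        raise ValueError("total genomic variance is zero")

    shares = []
    for start in range(0, len(effects), window_size):
        stop = min(start + window_size, len(effects))
        # Contribution of this window alone to each animal's genomic value.
        local = [
            sum(animal[j] * effects[j] for j in range(start, stop))
            for animal in genotypes
        ]
        shares.append(pvariance(local) / total_var)
    return shares


# Toy data: 4 animals, 4 SNPs, windows of 2 SNPs (all values hypothetical).
toy_genotypes = [[0, 1, 2, 0], [1, 1, 0, 2], [2, 0, 1, 1], [0, 2, 2, 1]]
toy_effects = [0.5, -0.2, 0.1, 0.3]
toy_shares = window_variance_shares(toy_genotypes, toy_effects, window_size=2)
print([round(s, 3) for s in toy_shares])
```

    Because windows can covary, the shares computed this way need not sum to one.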

  16. Changes in variance explained by top SNP windows over generations for three traits in broiler chicken

    PubMed Central

    Fragomeni, Breno de Oliveira; Misztal, Ignacy; Lourenco, Daniela Lino; Aguilar, Ignacio; Okimoto, Ronald; Muir, William M.

    2014-01-01

    The purpose of this study was to determine if the set of genomic regions inferred as accounting for the majority of genetic variation in quantitative traits remains stable over multiple generations of selection. The data set contained phenotypes for five generations of broiler chicken for body weight, breast meat, and leg score. The population consisted of 294,632 animals over five generations and also included genotypes of 41,036 single nucleotide polymorphisms (SNPs) for 4,866 animals, after quality control. The SNP effects were calculated by a GWAS-type analysis using a single-step genomic BLUP approach for generations 1–3, 2–4, 3–5, and 1–5. Variances were calculated for windows of 20 SNPs. The top ten windows for each trait that explained the largest fraction of the genetic variance across generations were examined. Across generations, the top 10 windows explained more than 0.5% but less than 1% of the total variance. Also, the pattern of the windows was not consistent across generations. The windows that explained the greatest variance changed greatly among the combinations of generations, with a few exceptions. In many cases, a window identified as top for one combination explained less than 0.1% for the other combinations. We conclude that identification of top SNP windows for a population may have little predictive power for genetic selection in the following generations for the traits evaluated here. PMID:25324857

  17. Atomic Force Microscopy for DNA SNP Identification

    NASA Astrophysics Data System (ADS)

    Valbusa, Ugo; Ierardi, Vincenzo

    The knowledge of the effects of single-nucleotide polymorphisms (SNPs) in the human genome greatly contributes to better comprehension of the relation between genetic factors and diseases. Sequence analysis of genomic DNA in different individuals reveals positions where variations that involve individual base substitutions can occur. Single-nucleotide polymorphisms are highly abundant and can have different consequences at phenotypic level. Several attempts were made to apply atomic force microscopy (AFM) to detect and map SNP sites in DNA strands. The most promising approach is the study of DNA mutations producing heteroduplex DNA strands and identifying the mismatches by means of a protein that labels the mismatches. MutS is a protein that is part of a well-known complex of mismatch repair, which initiates the process of repairing when the MutS binds to the mismatched DNA filament. The position of MutS on the DNA filament can be easily recorded by means of AFM imaging.

  18. Effects of the MDM2 promoter SNP285 and SNP309 on Sp1 transcription factor binding and cancer risk.

    PubMed

    Knappskog, Stian; Lønning, Per E

    2011-01-01

    The proto-oncogene MDM2 inhibits p53 and plays a key role in cell growth control and apoptosis. Identification of two antagonizing MDM2 polymorphisms, SNP285 and SNP309, affecting cancer risk through modulation of Sp1 transcription factor binding, shed new light on the biological activity and phylogeny of this gene.

  19. Gradient Boosting as a SNP Filter: an Evaluation Using Simulated and Hair Morphology Data.

    PubMed

    Lubke, Gh; Laurin, C; Walters, R; Eriksson, N; Hysi, P; Spector, Td; Montgomery, Gw; Martin, Ng; Medland, Se; Boomsma, DI

    2013-10-20

    Typically, genome-wide association studies consist of regressing the phenotype on each SNP separately using an additive genetic model. Although statistical models for recessive, dominant, SNP-SNP, or SNP-environment interactions exist, the testing burden makes an evaluation of all possible effects impractical for genome-wide data. We advocate a two-step approach where the first step consists of a filter that is sensitive to different types of SNP main and interaction effects. The aim is to substantially reduce the number of SNPs such that more specific modeling becomes feasible in a second step. We provide an evaluation of a statistical learning method called "gradient boosting machine" (GBM) that can be used as a filter. GBM does not require an a priori specification of a genetic model, and permits inclusion of large numbers of covariates. GBM can therefore be used to explore multiple GxE interactions, which would not be feasible within the parametric framework used in GWAS. We show in a simulation that GBM performs well even under conditions favorable to the standard additive regression model commonly used in GWAS, and is sensitive to the detection of interaction effects even if one of the interacting variables has a zero main effect. The latter would not be detected in GWAS. Our evaluation is accompanied by an analysis of empirical data concerning hair morphology. We estimate the phenotypic variance explained by increasing numbers of highest ranked SNPs, and show that it is sufficient to select 10K-20K SNPs in the first step of a two-step approach. PMID:24404405

  20. Deriving Gene Networks from SNP Associated with Triacylglycerol and Phospholipid Fatty Acid Fractions from Ribeyes of Angus Cattle

    PubMed Central

    Buchanan, Justin W.; Reecy, James M.; Garrick, Dorian J.; Duan, Qing; Beitz, Don C.; Koltes, James E.; Saatchi, Mahdi; Koesterke, Lars; Mateescu, Raluca G.

    2016-01-01

    The fatty acid profile of beef is a complex trait that can benefit from gene-interaction network analysis to understand relationships among loci that contribute to phenotypic variation. Phenotypic measures of fatty acid profile from triacylglycerol and phospholipid fractions of longissimus muscle, pedigree information, and Illumina 54 k bovine SNP genotypes were utilized to derive an annotated gene network associated with fatty acid composition in 1,833 Angus beef cattle. The Bayes-B statistical model was utilized to perform a genome wide association study to estimate associations between 54 k SNP genotypes and 39 individual fatty acid phenotypes within each fraction. Posterior means of the effects were estimated for each of the 54 k SNP and for the collective effects of all the SNP in every 1-Mb genomic window in terms of the proportion of genetic variance explained by the window. Windows that explained the largest proportions of genetic variance for individual lipids were found in the triacylglycerol fraction. There was almost no overlap in the genomic regions explaining variance between the triacylglycerol and phospholipid fractions. Partial correlations were used to identify correlated regions of the genome for the set of largest 1 Mb windows that explained up to 35% genetic variation in either fatty acid fraction. SNP were allocated to windows based on the bovine UMD3.1 assembly. Gene network clusters were generated utilizing a partial correlation and information theory algorithm. Results were used in conjunction with network scoring and visualization software to analyze correlated SNP across 39 fatty acid phenotypes to identify SNP of significance. Significant pathways implicated in fatty acid metabolism through GO term enrichment analysis included homeostasis of number of cells, homeostatic process, coenzyme/cofactor activity, and immunoglobulin. 
These results suggest different metabolic pathways regulate the development of different types of lipids found in

  1. Deriving Gene Networks from SNP Associated with Triacylglycerol and Phospholipid Fatty Acid Fractions from Ribeyes of Angus Cattle.

    PubMed

    Buchanan, Justin W; Reecy, James M; Garrick, Dorian J; Duan, Qing; Beitz, Don C; Koltes, James E; Saatchi, Mahdi; Koesterke, Lars; Mateescu, Raluca G

    2016-01-01

    The fatty acid profile of beef is a complex trait that can benefit from gene-interaction network analysis to understand relationships among loci that contribute to phenotypic variation. Phenotypic measures of fatty acid profile from triacylglycerol and phospholipid fractions of longissimus muscle, pedigree information, and Illumina 54 k bovine SNP genotypes were utilized to derive an annotated gene network associated with fatty acid composition in 1,833 Angus beef cattle. The Bayes-B statistical model was utilized to perform a genome wide association study to estimate associations between 54 k SNP genotypes and 39 individual fatty acid phenotypes within each fraction. Posterior means of the effects were estimated for each of the 54 k SNP and for the collective effects of all the SNP in every 1-Mb genomic window in terms of the proportion of genetic variance explained by the window. Windows that explained the largest proportions of genetic variance for individual lipids were found in the triacylglycerol fraction. There was almost no overlap in the genomic regions explaining variance between the triacylglycerol and phospholipid fractions. Partial correlations were used to identify correlated regions of the genome for the set of largest 1 Mb windows that explained up to 35% genetic variation in either fatty acid fraction. SNP were allocated to windows based on the bovine UMD3.1 assembly. Gene network clusters were generated utilizing a partial correlation and information theory algorithm. Results were used in conjunction with network scoring and visualization software to analyze correlated SNP across 39 fatty acid phenotypes to identify SNP of significance. Significant pathways implicated in fatty acid metabolism through GO term enrichment analysis included homeostasis of number of cells, homeostatic process, coenzyme/cofactor activity, and immunoglobulin. These results suggest different metabolic pathways regulate the development of different types of lipids found in

  2. Epistatic effects on abdominal fat content in chickens: results from a genome-wide SNP-SNP interaction analysis.

    PubMed

    Li, Fangge; Hu, Guo; Zhang, Hui; Wang, Shouzhi; Wang, Zhipeng; Li, Hui

    2013-01-01

    We performed a pairwise epistatic interaction test using the chicken 60 K single nucleotide polymorphism (SNP) chip for the 11th generation of the Northeast Agricultural University broiler lines divergently selected for abdominal fat content. A linear mixed model was used to test two dimensions of SNP interactions affecting abdominal fat weight. With a threshold of P < 1.2×10^-11 by a Bonferroni 5% correction, 52 pairs of SNPs were detected, comprising 45 pairs showing an Additive×Additive and seven pairs showing an Additive×Dominance epistatic effect. The contribution rates of significant epistatic interactive SNPs ranged from 0.62% to 1.54%, with 47 pairs contributing more than 1%. The SNP-SNP network affecting abdominal fat weight constructed using the significant SNP pairs was analyzed, estimated and annotated. On the basis of the network's features, SNPs Gga_rs14303341 and Gga_rs14988623 at the center of the subnet should be important nodes, and an interaction between GGAZ and GGA8 was suggested. Twenty-two quantitative trait loci, 97 genes (including nine non-coding genes), and 50 pathways were annotated on the epistatic interactive SNP-SNP network. The results of the present study provide insights into the genetic architecture underlying broiler chicken abdominal fat weight.
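
    The Bonferroni threshold quoted in this abstract can be related to the number of tests it implies. The following is a minimal Python sketch of that arithmetic only; the function names are illustrative and are not taken from the study, and the abstract does not state how many tests were actually run.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


def implied_number_of_tests(alpha: float, threshold: float) -> float:
    """Invert the correction: how many tests does a given threshold imply?"""
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    return alpha / threshold


# Values reported in the abstract: 5% family-wise level, P < 1.2e-11.
implied = implied_number_of_tests(0.05, 1.2e-11)
print(f"Implied number of tests: {implied:.2e}")
```

    Under these figures the threshold corresponds to roughly four billion tests, which is the order of magnitude expected when every SNP pair on a 60 K chip is tested in more than one interaction dimension.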

  3. SNP-SNP interaction analysis of NF-κB signaling pathway on breast cancer survival

    PubMed Central

    Jamshidi, Maral; Fagerholm, Rainer; Khan, Sofia; Aittomäki, Kristiina; Czene, Kamila; Darabi, Hatef; Li, Jingmei; Andrulis, Irene L.; Chang-Claude, Jenny; Devilee, Peter; Fasching, Peter A.; Michailidou, Kyriaki; Bolla, Manjeet K.; Dennis, Joe; Wang, Qin; Guo, Qi; Rhenius, Valerie; Cornelissen, Sten; Rudolph, Anja; Knight, Julia A.; Loehberg, Christian R.; Burwinkel, Barbara; Marme, Frederik; Hopper, John L.; Southey, Melissa C.; Bojesen, Stig E.; Flyger, Henrik; Brenner, Hermann; Holleczek, Bernd; Margolin, Sara; Mannermaa, Arto; Kosma, Veli-Matti; Dyck, Laurien Van; Nevelsteen, Ines; Couch, Fergus J.; Olson, Janet E.; Giles, Graham G.; McLean, Catriona; Haiman, Christopher A.; Henderson, Brian E.; Winqvist, Robert; Pylkäs, Katri; Tollenaar, Rob A.E.M.; García-Closas, Montserrat; Figueroa, Jonine; Hooning, Maartje J.; Martens, John W.M.; Cox, Angela; Cross, Simon S.; Simard, Jacques; Dunning, Alison M.; Easton, Douglas F.; Pharoah, Paul D.P.; Hall, Per; Blomqvist, Carl; Schmidt, Marjanka K.; Nevanlinna, Heli

    2015-01-01

    In breast cancer, constitutive activation of NF-κB has been reported; however, the impact of genetic variation of the pathway on patient prognosis has been little studied. Furthermore, a combination of genetic variants, rather than single polymorphisms, may affect disease prognosis. Here, in an extensive dataset (n = 30,431) from the Breast Cancer Association Consortium, we investigated the association of 917 SNPs in 75 genes in the NF-κB pathway with breast cancer prognosis. We explored SNP-SNP interactions on survival using the likelihood-ratio test comparing multivariate Cox’ regression models of SNP pairs without and with an interaction term. We found two interacting pairs associating with prognosis: patients simultaneously homozygous for the rare alleles of rs5996080 and rs7973914 had worse survival (HRinteraction 6.98, 95% CI=3.3-14.4, P = 1.42E-07), and patients carrying at least one rare allele for rs17243893 and rs57890595 had better survival (HRinteraction 0.51, 95% CI=0.3-0.6, P = 2.19E-05). Based on in silico functional analyses and literature, we speculate that the rs5996080 and rs7973914 loci may affect the BAFFR and TNFR1/TNFR3 receptors and breast cancer survival, possibly by disturbing both the canonical and non-canonical NF-κB pathways or their dynamics, whereas the rs17243893-rs57890595 interaction on survival may be mediated through TRAF2-TRAIL-R4 interplay. These results warrant further validation and functional analyses. PMID:26317411
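
    The likelihood-ratio comparison of nested models mentioned in this abstract can be sketched generically in Python. This is not the consortium's analysis code: it assumes the interaction term adds exactly one parameter (so the statistic is referred to a chi-square distribution with 1 degree of freedom), and the log-likelihood values are hypothetical.

```python
import math


def likelihood_ratio_test(loglik_reduced: float, loglik_full: float) -> tuple[float, float]:
    """Return (statistic, p_value) for nested models differing by one parameter.

    The statistic 2 * (loglik_full - loglik_reduced) is compared with a
    chi-square distribution with 1 degree of freedom, whose survival
    function equals erfc(sqrt(x / 2)).
    """
    statistic = 2.0 * (loglik_full - loglik_reduced)
    statistic = max(statistic, 0.0)  # guard against tiny negative values from rounding
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Hypothetical partial log-likelihoods of Cox models without and with
# the SNP-SNP interaction term (illustration only, not study data).
stat, p = likelihood_ratio_test(loglik_reduced=-1520.4, loglik_full=-1512.1)
print(f"LR statistic = {stat:.2f}, p = {p:.2e}")
```

    If the interaction were coded with more than one parameter (for example, genotype categories rather than an additive product), the degrees of freedom would change and a general chi-square survival function would be needed instead.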

  4. MALDI-TOF mass spectrometry-based SNP genotyping.

    PubMed

    Pusch, Wolfgang; Wurmbach, Jan-Henner; Thiele, Herbert; Kostrzewa, Markus

    2002-07-01

    In recent years a growing demand for simple and robust SNP genotyping platforms has arisen from the widespread use of SNPs in industrial and public research. The resulting knowledge about genotype/phenotype correlations is of special interest for the identification of potential new drug targets and in the field of pharmacogenomics. However, full exploitation of the available genomic information requires vast numbers of SNP analyses, as large cohorts of patients have to be screened for a large number of markers. Only very few of the current SNP genotyping techniques can cope with the resulting demands concerning sample throughput, automation, accuracy and cost-effectiveness. MALDI-TOF mass spectrometry has the potential to develop into a 'Gold Standard' for high-throughput SNP genotyping - if it has not already done so. This review will focus on the latest developments of this technology.

  5. Rapid Detection of Rare Deleterious Variants by Next Generation Sequencing with Optional Microarray SNP Genotype Data

    PubMed Central

    Watson, Christopher M.; Crinnion, Laura A.; Gurgel‐Gianetti, Juliana; Harrison, Sally M.; Daly, Catherine; Antanavicuite, Agne; Lascelles, Carolina; Markham, Alexander F.; Pena, Sergio D. J.; Bonthron, David T.

    2015-01-01

    Autozygosity mapping is a powerful technique for the identification of rare, autosomal recessive, disease‐causing genes. The ease with which this category of disease gene can be identified has greatly increased through the availability of genome‐wide SNP genotyping microarrays and subsequently of exome sequencing. Although these methods have simplified the generation of experimental data, its analysis, particularly when disparate data types must be integrated, remains time consuming. Moreover, the huge volume of sequence variant data generated from next generation sequencing experiments opens up the possibility of using these data instead of microarray genotype data to identify disease loci. To allow these two types of data to be used in an integrated fashion, we have developed AgileVCFMapper, a program that performs both the mapping of disease loci by SNP genotyping and the analysis of potentially deleterious variants using exome sequence variant data, in a single step. This method does not require microarray SNP genotype data, although analysis with a combination of microarray and exome genotype data enables more precise delineation of disease loci, due to superior marker density and distribution. PMID:26037133

  6. Genomic position mapping discrepancies of commercial SNP chips.

    PubMed

    Fadista, João; Bendixen, Christian

    2012-01-01

    The field of genetics has come to rely heavily on commercial genotyping arrays and accompanying annotations for insights into genotype-phenotype associations. However, in order to avoid errors and false leads, it is imperative that the annotation of SNP chromosomal positions is accurate and unambiguous. We report on genomic positional discrepancies of various SNP chips for human, cattle and mouse species, and discuss their causes and consequences.

  7. Detection of selective sweeps in cattle using genome-wide SNP data

    PubMed Central

    2013-01-01

    Background The domestication and subsequent selection by humans to create breeds and biological types of cattle undoubtedly altered the patterning of variation within their genomes. Strong selection to fix advantageous large-effect mutations underlying domesticability, breed characteristics or productivity created selective sweeps in which variation was lost in the chromosomal region flanking the selected allele. Selective sweeps have now been identified in the genomes of many animal species including humans, dogs, horses, and chickens. Here, we attempt to identify and characterise regions of the bovine genome that have been subjected to selective sweeps. Results Two datasets were used for the discovery and validation of selective sweeps via the fixation of alleles at a series of contiguous SNP loci. BovineSNP50 data were used to identify 28 putative sweep regions among 14 diverse cattle breeds. Affymetrix BOS 1 prescreening assay data for five breeds were used to identify 85 regions and validate 5 regions identified using the BovineSNP50 data. Many genes are located within these regions and the lack of sequence data for the analysed breeds precludes the nomination of selected genes or variants and limits the prediction of the selected phenotypes. However, phenotypes that we predict to have historically been under strong selection include horned-polled, coat colour, stature, ear morphology, and behaviour. Conclusions The bias towards common SNPs in the design of the BovineSNP50 assay led to the identification of recent selective sweeps associated with breed formation and common to only a small number of breeds rather than ancient events associated with domestication which could potentially be common to all European taurines. The limited SNP density, or marker resolution, of the BovineSNP50 assay significantly impacted the rate of false discovery of selective sweeps, however, we found sweeps in common between breeds which were confirmed using an ultra

  8. Meta-analysis diagnostic accuracy of SNP-based pathogenicity detection tools: a case of UTG1A1 gene mutations

    PubMed Central

    Galehdari, Hamid; Saki, Najmaldin; Mohammadi-asl, Javad; Rahim, Fakher

    2013-01-01

    Crigler-Najjar syndrome (CNS) type I and type II are usually inherited as autosomal recessive conditions that result from mutations in the UGT1A1 gene. The main objective of the present review is to summarize all available evidence on the accuracy of SNP-based pathogenicity detection tools, compared with published clinical results, for predicting disease-causing nsSNPs using prediction performance measures. A comprehensive search was performed to find all mutations related to CNS. Database searches included dbSNP, SNPdbe, HGMD, Swissvar, Ensembl, and OMIM. All mutations related to CNS were extracted. Pathogenicity was predicted using SNP-based pathogenicity detection tools including SIFT, PHD-SNP, PolyPhen2, fathmm, Provean, and Mutpred. Overall, 59 different SNPs related to missense mutations in the UGT1A1 gene were reviewed. Comparing the diagnostic odds ratio (OR), PolyPhen2 and Mutpred had the highest value, 4.983 (95% CI: 1.24 – 20.02) for both, followed by SIFT (diagnostic OR: 3.25, 95% CI: 1.07 – 9.83). The highest Matthews correlation coefficient (MCC) among the tools belonged to SIFT (34.19%), followed by Provean, PolyPhen2, and Mutpred (29.99%, 29.89%, and 29.89%, respectively). Likewise, the highest accuracy (ACC) belonged to SIFT (62.71%), followed by PolyPhen2 and Mutpred (61.02% for both). Our results suggest that some of the well-established SNP-based pathogenicity detection tools can appropriately reflect the role of a disease-associated SNP in both local and global structures. PMID:23875061
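
    The three performance measures compared in this abstract (ACC, MCC and the diagnostic OR) are all derived from a 2×2 confusion matrix. The Python sketch below shows the standard definitions; the counts are hypothetical and are not taken from the review, which does not report its confusion matrices in the abstract.

```python
import math


def diagnostic_metrics(tp: int, fp: int, tn: int, fn: int) -> dict[str, float]:
    """Accuracy, Matthews correlation coefficient and diagnostic odds ratio."""
    total = tp + fp + tn + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    acc = (tp + tn) / total
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is undefined when any marginal is zero; report 0.0 by convention.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    # The odds ratio is unbounded when there are no false calls of one kind.
    dor = (tp * tn) / (fp * fn) if fp and fn else math.inf
    return {"ACC": acc, "MCC": mcc, "DOR": dor}


# Hypothetical counts for one prediction tool (illustration only).
example = diagnostic_metrics(tp=30, fp=10, tn=12, fn=7)
print(example)
```

    Because the three measures weight errors differently, a tool can rank first on one and not on another, which is consistent with the mixed ranking reported above.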

  9. Evaluation of Y chromosomal SNP haplogrouping in the HID-Ion AmpliSeq™ Identity Panel.

    PubMed

    Ochiai, Eriko; Minaguchi, Kiyoshi; Nambiar, Phrabhakaran; Kakimoto, Yu; Satoh, Fumiko; Nakatome, Masato; Miyashita, Keiko; Osawa, Motoki

    2016-09-01

    The Y chromosomal haplogroup determined from single nucleotide polymorphism (SNP) combinations is a valuable genetic marker to study ancestral male lineage and ethnic distribution. Next-generation sequencing has been developed for widely diverse genetics fields. For this study, we demonstrate 34 Y-SNP typing employing the Ion PGM™ system to perform haplogrouping. DNA libraries were constructed using the HID-Ion AmpliSeq™ Identity Panel. Emulsion PCR was performed, then DNA sequences were analyzed on the Ion 314 and 316 Chip Kit v2. Some difficulties became apparent during the analytic processes. No-call was reported at rs2032599 and M479 in six samples, in which the least coverage was observed at M479. A minor misreading occurred at rs2032631 and M479. A real time PCR experiment using other pairs of oligonucleotide primers showed that these events might result from the flanking sequence. Finally, Y haplogroup was determined completely for 81 unrelated males including Japanese (n=59) and Malay (n=22) subjects. The allelic divergence differed between the two populations. In comparison with the conventional Sanger method, next-generation sequencing provides a comprehensive SNP analysis with convenient procedures, but further system improvement is necessary. PMID:27591541

  11. SNP markers identify widely distributed clonal lineages of Phytophthora colocasiae in Vietnam, Hawaii and Hainan Island, China.

    PubMed

    Shrestha, Sandesh; Hu, Jian; Fryxell, Rebecca Trout; Mudge, Joann; Lamour, Kurt

    2014-01-01

    Taro (Colocasia esculenta) is an important food crop, and taro leaf blight caused by Phytophthora colocasiae can significantly affect production. Our objectives were to develop single nucleotide polymorphism (SNP) markers for P. colocasiae and characterize populations in Hawaii (HI), Vietnam (VN) and Hainan Island, China (HIC). In total, 379 isolates were analyzed for mating type and multilocus SNP profiles including 214 from HI, 97 from VN and 68 from HIC. A total of 1152 single nucleotide variant (SNV) sites were identified via restriction site-associated DNA (RAD) sequencing of two field isolates. Genotyping with 27 SNPs revealed 41 multilocus SNP genotypes grouped into seven clonal lineages containing 2-232 members. Three clonal lineages were shared among countries. In addition, five SNP markers had a low incidence of loss of heterozygosity (LOH) during asexual laboratory growth. For HI and VN, >95% of isolates were the A2 mating type. On HIC, isolates within single clonal lineages had A1, A2 and A0 (neuter) isolates. The implications for the wide dispersal of clonal lineages are discussed. PMID:24895424

  12. DoGSD: the dog and wolf genome SNP database

    PubMed Central

    Bai, Bing; Zhao, Wen-Ming; Tang, Bi-Xia; Wang, Yan-Qing; Wang, Lu; Zhang, Zhang; Yang, He-Chuan; Liu, Yan-Hu; Zhu, Jun-Wei; Irwin, David M.; Wang, Guo-Dong; Zhang, Ya-Ping

    2015-01-01

    The rapid advancement of next-generation sequencing technology has generated a deluge of genomic data from domesticated dogs and their wild ancestor, grey wolves, which have simultaneously broadened our understanding of domestication and diseases that are shared by humans and dogs. To address the scarcity of single nucleotide polymorphism (SNP) data provided by authorized databases and to make SNP data more easily/friendly usable and available, we propose DoGSD (http://dogsd.big.ac.cn), the first canidae-specific database which focuses on whole genome SNP data from domesticated dogs and grey wolves. The DoGSD is a web-based, open-access resource comprising ∼19 million high-quality whole-genome SNPs. In addition to the dbSNP data set (build 139), DoGSD incorporates a comprehensive collection of SNPs from two newly sequenced samples (1 wolf and 1 dog) and collected SNPs from three latest dog/wolf genetic studies (7 wolves and 68 dogs), which were taken together for analysis with the population genetic statistics, Fst. In addition, DoGSD integrates some closely related information including SNP annotation, summary lists of SNPs located in genes, synonymous and non-synonymous SNPs, sampling location and breed information. All these features make DoGSD a useful resource for in-depth analysis in dog-/wolf-related studies. PMID:25404132
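
    For readers unfamiliar with the Fst statistic named in this abstract, the following Python sketch computes a basic heterozygosity-based Fst for one biallelic SNP in two equally weighted populations. This is a textbook form chosen for illustration; the abstract does not say which estimator DoGSD uses, and the function name and frequencies are hypothetical.

```python
def fst_two_populations(p1: float, p2: float) -> float:
    """Fst = (H_T - H_S) / H_T for one biallelic locus, two equally weighted populations.

    p1 and p2 are the frequencies of the same allele in each population.
    """
    for p in (p1, p2):
        if not 0.0 <= p <= 1.0:
            raise ValueError("allele frequencies must lie in [0, 1]")
    p_bar = (p1 + p2) / 2.0
    # Mean expected heterozygosity within populations.
    h_s = (2.0 * p1 * (1.0 - p1) + 2.0 * p2 * (1.0 - p2)) / 2.0
    # Expected heterozygosity in the pooled population.
    h_t = 2.0 * p_bar * (1.0 - p_bar)
    if h_t == 0.0:
        return 0.0  # locus is monomorphic overall; no differentiation to measure
    return (h_t - h_s) / h_t


# Hypothetical allele frequencies in a dog sample and a wolf sample.
print(fst_two_populations(0.9, 0.1))
```

    Values near 0 indicate similar allele frequencies in the two groups, and values near 1 indicate that the groups are close to fixed for different alleles.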

  15. SNP genotyping of animal and human derived isolates of Mycobacterium avium subsp. paratuberculosis.

    PubMed

    Wynne, James W; Beller, Christie; Boyd, Victoria; Francis, Barry; Gwoźdź, Jacek; Carajias, Marios; Heine, Hans G; Wagner, Josef; Kirkwood, Carl D; Michalski, Wojtek P

    2014-08-27

    Mycobacterium avium subsp. paratuberculosis (MAP) is the aetiological agent of Johne's disease (JD), a chronic granulomatous enteritis that affects ruminants worldwide. While the ability of MAP to cause disease in animals is clear, the role of this bacterium in human inflammatory bowel diseases remains unresolved. Previous whole genome sequencing of MAP isolates derived from human and three animal hosts showed that human isolates were genetically similar and showed a close phylogenetic relationship to one bovine isolate. In contrast, other animal derived isolates were more genetically diverse. The present study aimed to investigate the frequency of this human strain across 52 wild-type MAP isolates, collected predominantly from Australia. A Luminex based SNP genotyping approach was utilised to genotype SNPs that had previously been shown to be specific to the human, bovine or ovine isolate types. Fourteen SNPs were initially evaluated across a reference panel of isolates with known genotypes. A subset of seven SNPs was chosen for analysis within the wild-type collection. Of the seven SNPs, three were found to be unique to paediatric human isolates. No wild-type isolates contain these SNP alleles. Interestingly, and in contrast to the paediatric isolates, three additional adult human isolates (derived from adult Crohn's disease patients) also did not contain these SNP alleles. Furthermore we identified two SNPs, which demonstrate extensive polymorphism within the animal-derived MAP isolates. One of which appears unique to ovine and a single camel isolate. From this study we suggest the existence of genetic heterogeneity between human derived MAP isolates, some of which are highly similar to those derived from bovine hosts, but others of which are more divergent.

  16. Population distribution and ancestry of the cancer protective MDM2 SNP285 (rs117039649).

    PubMed

    Knappskog, Stian; Gansmo, Liv B; Dibirova, Khadizha; Metspalu, Andres; Cybulski, Cezary; Peterlongo, Paolo; Aaltonen, Lauri; Vatten, Lars; Romundstad, Pål; Hveem, Kristian; Devilee, Peter; Evans, Gareth D; Lin, Dongxin; Van Camp, Guy; Manolopoulos, Vangelis G; Osorio, Ana; Milani, Lili; Ozcelik, Tayfun; Zalloua, Pierre; Mouzaya, Francis; Bliznetz, Elena; Balanovska, Elena; Pocheshkova, Elvira; Kučinskas, Vaidutis; Atramentova, Lubov; Nymadawa, Pagbajabyn; Titov, Konstantin; Lavryashina, Maria; Yusupov, Yuldash; Bogdanova, Natalia; Koshel, Sergey; Zamora, Jorge; Wedge, David C; Charlesworth, Deborah; Dörk, Thilo; Balanovsky, Oleg; Lønning, Per E

    2014-09-30

    The MDM2 promoter SNP285C is located on the SNP309G allele. While SNP309G enhances Sp1 transcription factor binding and MDM2 transcription, SNP285C antagonizes Sp1 binding and reduces the risk of breast, ovarian and endometrial cancer. Assessing SNP285 and 309 genotypes across 25 different ethnic populations (>10,000 individuals), the incidence of SNP285C was 6-8% across European populations except for Finns (1.2%) and Saami (0.3%). The incidence decreased towards the Middle-East and Eastern Russia, and SNP285C was absent among Han Chinese, Mongolians and African Americans. Interhaplotype variation analyses estimated SNP285C to have originated about 14,700 years ago (95% CI: 8,300 - 33,300). Both this estimate and the geographical distribution suggest SNP285C to have arisen after the separation between Caucasians and modern day East Asians (17,000 - 40,000 years ago). We observed a strong inverse correlation (r = -0.805; p < 0.001) between the percentage of SNP309G alleles harboring SNP285C and the MAF for SNP309G itself across different populations suggesting selection and environmental adaptation with respect to MDM2 expression in recent human evolution. In conclusion, we found SNP285C to be a pan-Caucasian variant. Ethnic variation regarding distribution of SNP285C needs to be taken into account when assessing the impact of MDM2 SNPs on cancer risk.

  17. Whole genome SNP scanning of snow sheep (Ovis nivicola).

    PubMed

    Deniskova, T E; Okhlopkov, I M; Sermyagin, A A; Gladyr', E A; Bagirov, V A; Sölkner, J; Mamaev, N V; Brem, G; Zinov'eva, N A

    2016-07-01

    This is the first report of whole genome SNP scanning of snow sheep (Ovis nivicola). Samples of snow sheep (n = 18) were collected in six different regions of the Republic of Sakha (Yakutia) from 64° to 71° N. For SNP genotyping, we applied the Ovine 50K SNP BeadChip (Illumina, United States), designed for domestic sheep. The total number of genotyped SNPs (call rate 90%) was 47796 (88.1% of total SNPs), of which 1006 SNPs were polymorphic (2.1%). Principal component analysis (PCA) showed clear differentiation within the species O. nivicola: the studied individuals were distributed among five distinct arrays corresponding to the geographical locations of sampling points. Our results demonstrate that the DNA chip designed for domestic sheep can be successfully used to study the allele pool and the genetic structure of snow sheep populations. PMID:27599514
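
    The percentages in this abstract can be cross-checked from the reported counts. The short Python sketch below does only that arithmetic; the implied chip size is a back-calculation from the rounded 88.1% figure, not a number stated in the abstract.

```python
def percent(part: float, whole: float) -> float:
    """Express part as a percentage of whole."""
    return 100.0 * part / whole


genotyped = 47796    # SNPs passing the call-rate filter (from the abstract)
polymorphic = 1006   # polymorphic SNPs among them (from the abstract)

poly_pct = percent(polymorphic, genotyped)
# Back-calculate the total number of SNPs from "88.1% of total SNPs".
implied_total = genotyped / 0.881

print(f"Polymorphic share of genotyped SNPs: {poly_pct:.1f}%")
print(f"Implied total SNPs on the chip: about {implied_total:.0f}")
```

    The polymorphic share agrees with the reported 2.1%, and the implied total is a little over 54,000 SNPs.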

  18. Exhaustive search of the SNP-SNP interactome identifies epistatic effects on brain volume in two cohorts

    PubMed Central

    Hibar, Derrek P.; Stein, Jason L.; Jahanshad, Neda; Kohannim, Omid; Toga, Arthur W.; McMahon, Katie L.; de Zubicaray, Greig I.; Montgomery, Grant W.; Martin, Nicholas G.; Wright, Margaret J.; Weiner, Michael W.; Thompson, Paul M.

    2014-01-01

    The SNP-SNP interactome has rarely been explored in the context of neuroimaging genetics mainly due to the complexity of conducting ∼10^11 pairwise statistical tests. However, recent advances in machine learning, specifically the iterative sure independence screening (SIS) method, have enabled the analysis of datasets where the number of predictors is much larger than the number of observations. Using an implementation of the SIS algorithm (called EPISIS), we used exhaustive search of the genome-wide, SNP-SNP interactome to identify and prioritize SNPs for interaction analysis. We identified a significant SNP pair, rs1345203 and rs1213205, associated with temporal lobe volume. We further examined the full-brain, voxelwise effects of the interaction in the ADNI dataset and separately in an independent dataset of healthy twins (QTIM). We found that each additional loading in the epistatic effect was associated with ∼5% greater regional brain volume (a protective effect) in both the ADNI and QTIM samples. PMID:24505811
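
    The scale of an exhaustive pairwise search follows directly from the number of SNPs. As a minimal Python sketch, the pair count is "n choose 2"; the SNP count used here is a hypothetical round number chosen to land near the order of magnitude cited above, not a figure from the study.

```python
import math


def n_pairwise_tests(n_snps: int) -> int:
    """Number of unordered SNP pairs, i.e. n choose 2."""
    return math.comb(n_snps, 2)


# Hypothetical genome-wide panel size (assumption for illustration).
n_snps = 450_000
pairs = n_pairwise_tests(n_snps)
print(f"{n_snps} SNPs give {pairs:.2e} pairwise tests")
```

    Because the count grows with the square of the panel size, doubling the number of SNPs roughly quadruples the number of tests, which is why screening methods such as SIS are attractive at this scale.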

  19. Sniper: improved SNP discovery by multiply mapping deep sequenced reads.

    PubMed

    Simola, Daniel F; Kim, Junhyong

    2011-06-20

    SNP (single nucleotide polymorphism) discovery using next-generation sequencing data remains difficult primarily because of redundant genomic regions, such as interspersed repetitive elements and paralogous genes, present in all eukaryotic genomes. To address this problem, we developed Sniper, a novel multi-locus Bayesian probabilistic model and a computationally efficient algorithm that explicitly incorporates sequence reads that map to multiple genomic loci. Our model fully accounts for sequencing error, template bias, and multi-locus SNP combinations, maintaining high sensitivity and specificity under a broad range of conditions. An implementation of Sniper is freely available at http://kim.bio.upenn.edu/software/sniper.shtml.

  20. Identification of novel single nucleotide polymorphisms (SNPs) in deer (Odocoileus spp.) using the BovineSNP50 BeadChip.

    PubMed

    Haynes, Gwilym D; Latch, Emily K

    2012-01-01

    Single nucleotide polymorphisms (SNPs) are growing in popularity as a genetic marker for investigating evolutionary processes. A panel of SNPs is often developed by comparing large quantities of DNA sequence data across multiple individuals to identify polymorphic sites. For non-model species, this is particularly difficult, as performing the necessary large-scale genomic sequencing often exceeds the resources available for the project. In this study, we trial the Bovine SNP50 BeadChip developed in cattle (Bos taurus) for identifying polymorphic SNPs in cervids Odocoileus hemionus (mule deer and black-tailed deer) and O. virginianus (white-tailed deer) in the Pacific Northwest. We found that 38.7% of loci could be genotyped, of which 5% (n = 1068) were polymorphic. Of these 1068 polymorphic SNPs, a mixture of putatively neutral loci (n = 878) and loci under selection (n = 190) were identified with the F(ST)-outlier method. A range of population genetic analyses were implemented using these SNPs and a panel of 10 microsatellite loci. The three types of deer could readily be distinguished with both the SNP and microsatellite datasets. This study demonstrates that commercially developed SNP chips are a viable means of SNP discovery for non-model organisms, even when used between very distantly related species (the Bovidae and Cervidae families diverged some 25.1-30.1 million years before present).

  1. SNP marker diversity in common bean (Phaseolus vulgaris L.).

    PubMed

    Cortés, Andrés J; Chavarro, Martha C; Blair, Matthew W

    2011-09-01

    Single nucleotide polymorphism (SNP) markers have become a genetic technology of choice because of their automation and high precision of allele calls. In this study, our goal was to develop 94 SNPs and test them across well-chosen common bean (Phaseolus vulgaris L.) germplasm. We validated and assessed SNP diversity at 84 gene-based and 10 non-genic loci using KASPar technology in a panel of 70 genotypes that have been used as parents of mapping populations and have been previously evaluated for SSRs. SNPs exhibited high levels of genetic diversity, an excess of middle frequency polymorphism, and a within-genepool mismatch distribution as expected for populations affected by sudden demographic expansions after domestication bottlenecks. This set of markers was useful for distinguishing Andean and Mesoamerican genotypes but less useful for distinguishing within each gene pool. In summary, slightly greater polymorphism and race structure was found within the Andean gene pool than within the Mesoamerican gene pool but polymorphism rate between genotypes was consistent with genepool and race identity. Our survey results represent a baseline for the choice of SNP markers for future applications because gene-associated SNPs could themselves be causative SNPs for traits. Finally, we discuss that the ideal genetic marker combination with which to carry out diversity, mapping and association studies in common bean should consider a mix of both SNP and SSR markers.

  2. Do you really know where this SNP goes?

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The release of build 10.2 of the swine genome was a marked improvement over previous builds and has proven extremely useful. However, as most know, there are regions of the genome that this particular build does not accurately represent. For instance, nearly 25% of the 62,162 SNP on the Illumina Por...

  3. Software solutions for the livestock genomics SNP array revolution.

    PubMed

    Nicolazzi, E L; Biffani, S; Biscarini, F; Orozco Ter Wengel, P; Caprera, A; Nazzicari, N; Stella, A

    2015-08-01

    Since the beginning of the genomic era, the number of available single nucleotide polymorphism (SNP) arrays has grown considerably. In the bovine species alone, 11 SNP chips not completely covered by intellectual property are currently available, and the number is growing. Genomic/genotype data are not standardized, and this hampers their exchange and integration. In addition, software used for the analyses of these data usually requires non-standard (i.e. case-specific) input files which, considering the large amount of data to be handled, require at least some programming skills to produce. In this work, we describe a software toolkit for SNP array data management, imputation, genome-wide association studies, population genetics and genomic selection. However, this toolkit does not solve the critical need for standardization of the genotypic data and software input files. It only highlights the chaotic situation each researcher has to face on a daily basis and gives some helpful advice on the currently available tools in order to navigate the SNP array data complexity. PMID:25907889

  4. SNP Discovery through Next-Generation Sequencing and Its Applications

    PubMed Central

    Kumar, Santosh; Banks, Travis W.; Cloutier, Sylvie

    2012-01-01

    The decreasing cost along with rapid progress in next-generation sequencing and related bioinformatics computing resources has facilitated large-scale discovery of SNPs in various model and nonmodel plant species. Large numbers and genome-wide availability of SNPs make them the marker of choice in partially or completely sequenced genomes. Although excellent reviews have been published on next-generation sequencing, its associated bioinformatics challenges, and the applications of SNPs in genetic studies, a comprehensive review connecting these three intertwined research areas is needed. This paper touches upon various aspects of SNP discovery, highlighting key points in availability and selection of appropriate sequencing platforms, bioinformatics pipelines, SNP filtering criteria, and applications of SNPs in genetic analyses. The use of next-generation sequencing methodologies in many non-model crops leading to discovery and implementation of SNPs in various genetic studies is discussed. Development and improvement of bioinformatics software that are open source and freely available have accelerated the SNP discovery while reducing the associated cost. Key considerations for SNP filtering and associated pipelines are discussed in specific topics. A list of commonly used software and their sources is compiled for easy access and reference. PMID:23227038

  6. Ascertainment Biases in SNP Chips Affect Measures of Population Divergence

    PubMed Central

    Albrechtsen, Anders; Nielsen, Finn Cilius; Nielsen, Rasmus

    2010-01-01

    Chip-based high-throughput genotyping has facilitated genome-wide studies of genetic diversity. Many studies have utilized these large data sets to make inferences about the demographic history of human populations using measures of genetic differentiation such as FST or principal component analyses. However, the single nucleotide polymorphism (SNP) chip data suffer from ascertainment biases caused by the SNP discovery process in which a small number of individuals from selected populations are used as discovery panels. In this study, we investigate the effect of the ascertainment bias on inferences regarding genetic differentiation among populations in one of the common genome-wide genotyping platforms. We generate SNP genotyping data for individuals that previously have been subject to partial genome-wide Sanger sequencing and compare inferences based on genotyping data to inferences based on direct sequencing. In addition, we also analyze publicly available genome-wide data. We demonstrate that the ascertainment biases will distort measures of human diversity and possibly change conclusions drawn from these measures in sometimes unexpected ways. We also show that details of the genotype-calling algorithms can have a surprisingly large effect on population genetic inferences. We not only present a correction of the spectrum for the widely used Affymetrix SNP chips but also show that such corrections are difficult to generalize among studies. PMID:20558595
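
    To make the FST measure mentioned above concrete, the following is a minimal sketch of one textbook form of the statistic (heterozygosity-based FST for a single bi-allelic SNP in two equally weighted populations). It is not the estimator or the correction used in the study, and the allele frequencies are made-up illustration values.

```python
def heterozygosity(p: float) -> float:
    """Expected heterozygosity 2p(1-p) at a bi-allelic site with allele frequency p."""
    return 2.0 * p * (1.0 - p)


def fst_two_populations(p1: float, p2: float) -> float:
    """Heterozygosity-based FST = (H_T - H_S) / H_T for two equally weighted populations."""
    h_s = (heterozygosity(p1) + heterozygosity(p2)) / 2.0  # mean within-population value
    h_t = heterozygosity((p1 + p2) / 2.0)                  # value in the pooled population
    if h_t == 0.0:
        return 0.0  # site is monomorphic overall, so there is no differentiation to measure
    return (h_t - h_s) / h_t


# Hypothetical allele frequencies, chosen only for illustration.
print(round(fst_two_populations(0.2, 0.8), 3))
print(round(fst_two_populations(0.5, 0.5), 3))
```

    Because a chip only contains SNPs that were polymorphic in its discovery panel, the set of sites fed into a calculation like this is itself biased, which is the effect the abstract describes.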

  7. SNP diversity within and among Brassica rapa accessions reveals no geographic differentiation.

    PubMed

    Tanhuanpää, P; Erkkilä, M; Tenhola-Roininen, T; Tanskanen, J; Manninen, O

    2016-01-01

    Genetic diversity was studied in a collection of 61 accessions of Brassica rapa, which were mostly oil-type turnip rapes but also included two oil-type subsp. dichotoma and five subsp. trilocularis accessions, as well as three leaf-type subspecies (subsp. japonica, pekinensis, and chinensis) and five turnip cultivars (subsp. rapa). Two-hundred and nine SNP markers, which had been discovered by amplicon resequencing, were used to genotype 893 plants from the B. rapa collection using Illumina BeadXpress. There was great variation in the diversity indices between accessions. With STRUCTURE analysis, the plant collection could be divided into three groups that seemed to correspond to morphotype and flowering habit but not to geography. According to AMOVA analysis, 65% of the variation was due to variation within accessions, 25% among accessions, and 10% among groups. A smaller subset of the plant collection, 12 accessions, was also studied with 5727 GBS-SNPs. Diversity indices obtained with GBS-SNPs correlated well with those obtained with Illumina BeadXpress SNPs. The developed SNP markers have already been used and will be used in future plant breeding programs as well as in mapping and diversity studies.

  8. High-throughput genomics in sorghum: from whole-genome resequencing to a SNP screening array.

    PubMed

    Bekele, Wubishet A; Wieckhorst, Silke; Friedt, Wolfgang; Snowdon, Rod J

    2013-12-01

    With its small, diploid and completely sequenced genome, sorghum (Sorghum bicolor L. Moench) is highly amenable to genomics-based breeding approaches. Here, we describe the development and testing of a robust single-nucleotide polymorphism (SNP) array platform that enables polymorphism screening for genome-wide and trait-linked polymorphisms in genetically diverse S. bicolor populations. Whole-genome sequences with 6× to 12× coverage from five genetically diverse S. bicolor genotypes, including three sweet sorghums and two grain sorghums, were aligned to the sorghum reference genome. From over 1 million high-quality SNPs, we selected 2124 Infinium Type II SNPs that were informative in all six source genomes, gave an optimal Assay Design Tool (ADT) score, had allele frequencies of 50% in the six genotypes and were evenly spaced throughout the S. bicolor genome. Furthermore, by phenotype-based pool sequencing, we selected an additional 876 SNPs with a phenotypic association to early-stage chilling tolerance, a key trait for European sorghum breeding. The 3000 attempted bead types were used to populate half of a dual-species Illumina iSelect SNP array. The array was tested using 564 Sorghum spp. genotypes, including offspring from four unrelated recombinant inbred line (RIL) and F2 populations and a genetic diversity collection. A high call rate of over 80% enabled validation of 2620 robust and polymorphic sorghum SNPs, underlining the efficiency of the array development scheme for whole-genome SNP selection and screening, with diverse applications including genetic mapping, genome-wide association studies and genomic selection.

  9. Amerindians show association to obesity with adiponectin gene SNP45 and SNP276: population genetics of a food intake control and "thrifty" gene.

    PubMed

    Arnaiz-Villena, Antonio; Fernández-Honrado, Mercedes; Rey, Diego; Enríquez-de-Salamanca, Mercedes; Abd-El-Fatah-Khalil, Sedeka; Arribas, Ignacio; Coca, Carmen; Algora, Manuel; Areces, Cristina

    2013-02-01

    Adiponectin gene polymorphisms SNP45 and SNP276 have been related to metabolic syndrome (MS) and related pathologies, including obesity. However, the reported associations are contradictory depending on which population is studied. In the present study, these adiponectin SNPs are studied for the first time in Amerindians. Allele frequencies are obtained and comparisons with obesity and other MS-related parameters are performed. Amerindians were also defined by characteristic HLA genes. Our main results are: (1) SNP276 T is associated with low diastolic blood pressure in Amerindians, (2) SNP45 G allele is correlated with obesity in female but not in male Amerindians, (3) SNP45/SNP276 T/G haplotype in total obese/non-obese subjects tends to show a linkage with non-obese Amerindians, (4) SNP45/SNP276 T/T haplotype is linked to obese Amerindian males. Also, a world population study is carried out, finding that SNP45 T and SNP276 T alleles are the most frequent in African Blacks and are found at significantly lower frequencies in Europeans and Asians. This, together with the fact that there is a linkage of this haplotype to obese Amerindian males, suggests that evolutionary forces related to famine (or population density in relation to available food) may have shaped world population adiponectin polymorphism frequencies. PMID:23108996

  10. Large-Scale SNP Discovery through RNA Sequencing and SNP Genotyping by Targeted Enrichment Sequencing in Cassava (Manihot esculenta Crantz)

    PubMed Central

    Pootakham, Wirulda; Shearman, Jeremy R.; Ruang-areerate, Panthita; Sonthirod, Chutima; Sangsrakru, Duangjai; Jomchai, Nukoon; Yoocha, Thippawan; Triwitayakorn, Kanokporn; Tragoonrung, Somvong; Tangphatsornruang, Sithichoke

    2014-01-01

    Cassava (Manihot esculenta Crantz) is one of the most important crop species being the main source of dietary energy in several countries. Marker-assisted selection has become an essential tool in plant breeding. Single nucleotide polymorphism (SNP) discovery via transcriptome sequencing is an attractive strategy for genome complexity reduction in organisms with large genomes. We sequenced the transcriptome of 16 cassava accessions using the Illumina HiSeq platform and identified 675,559 EST-derived SNP markers. A subset of those markers was subsequently genotyped by capture-based targeted enrichment sequencing in 100 F1 progeny segregating for starch viscosity phenotypes. A total of 2,110 non-redundant SNP markers were used to construct a genetic map. This map encompasses 1,785 cM and consists of 19 linkage groups. A major quantitative trait locus (QTL) controlling starch pasting properties was identified and shown to coincide with the QTL previously reported for this trait. With a high-density SNP-based linkage map presented here, we also uncovered a novel QTL associated with starch pasting time on LG 10. PMID:25551642

  11. Large-scale SNP discovery through RNA sequencing and SNP genotyping by targeted enrichment sequencing in cassava (Manihot esculenta Crantz).

    PubMed

    Pootakham, Wirulda; Shearman, Jeremy R; Ruang-Areerate, Panthita; Sonthirod, Chutima; Sangsrakru, Duangjai; Jomchai, Nukoon; Yoocha, Thippawan; Triwitayakorn, Kanokporn; Tragoonrung, Somvong; Tangphatsornruang, Sithichoke

    2014-01-01

    Cassava (Manihot esculenta Crantz) is one of the most important crop species being the main source of dietary energy in several countries. Marker-assisted selection has become an essential tool in plant breeding. Single nucleotide polymorphism (SNP) discovery via transcriptome sequencing is an attractive strategy for genome complexity reduction in organisms with large genomes. We sequenced the transcriptome of 16 cassava accessions using the Illumina HiSeq platform and identified 675,559 EST-derived SNP markers. A subset of those markers was subsequently genotyped by capture-based targeted enrichment sequencing in 100 F1 progeny segregating for starch viscosity phenotypes. A total of 2,110 non-redundant SNP markers were used to construct a genetic map. This map encompasses 1,785 cM and consists of 19 linkage groups. A major quantitative trait locus (QTL) controlling starch pasting properties was identified and shown to coincide with the QTL previously reported for this trait. With a high-density SNP-based linkage map presented here, we also uncovered a novel QTL associated with starch pasting time on LG 10.

  12. Multiple SNP Set Analysis for Genome-Wide Association Studies Through Bayesian Latent Variable Selection.

    PubMed

    Lu, Zhao-Hua; Zhu, Hongtu; Knickmeyer, Rebecca C; Sullivan, Patrick F; Williams, Stephanie N; Zou, Fei

    2015-12-01

    The power of genome-wide association studies (GWAS) for mapping complex traits with single-SNP analysis (where SNP is single-nucleotide polymorphism) may be undermined by modest SNP effect sizes, unobserved causal SNPs, correlation among adjacent SNPs, and SNP-SNP interactions. Alternative approaches for testing the association between a single SNP set and individual phenotypes have been shown to be promising for improving the power of GWAS. We propose a Bayesian latent variable selection (BLVS) method to simultaneously model the joint association mapping between a large number of SNP sets and complex traits. Compared with single SNP set analysis, such joint association mapping not only accounts for the correlation among SNP sets but also is capable of detecting causal SNP sets that are marginally uncorrelated with traits. The spike-and-slab prior assigned to the effects of SNP sets can greatly reduce the dimension of effective SNP sets, while speeding up computation. An efficient Markov chain Monte Carlo algorithm is developed. Simulations demonstrate that BLVS outperforms several competing variable selection methods in some important scenarios. PMID:26515609

  13. Cancer Gene Prioritization for Targeted Resequencing Using FitSNP Scores

    PubMed Central

    Fieuw, Annelies; De Wilde, Bram; Speleman, Frank

    2012-01-01

    Background: Although the throughput of next generation sequencing is increasing and at the same time the cost is substantially reduced, for the majority of laboratories whole genome sequencing of large cohorts of cancer samples is still not feasible. In addition, the low number of genomes that are being sequenced is often problematic for the downstream interpretation of the significance of the variants. Targeted resequencing can partially circumvent this problem; by focusing on a limited number of candidate cancer genes to sequence, more samples can be included in the screening, hence resulting in substantial improvement of the statistical power. In this study, a successful strategy for prioritizing candidate genes for targeted resequencing of cancer genomes is presented. Results: Four prioritization strategies were evaluated on six different cancer types: genes were ranked using these strategies, and the positive predictive value (PPV) or mutation rate within the top-ranked genes was compared to the baseline mutation rate in each tumor type. Successful strategies generate gene lists in which the top is enriched for known mutated genes, as evidenced by an increase in PPV. A clear example of such an improvement is seen in colon cancer, where the PPV is increased by 2.3 fold compared to the baseline level when 100 top fitSNP genes are sequenced. Conclusions: A gene prioritization strategy based on the fitSNP scores appears to be most successful in identifying mutated cancer genes across different tumor entities, with variance of gene expression levels as a good second best. PMID:22396732

  15. A general SNP-based molecular barcode for Plasmodium falciparum identification and tracking

    PubMed Central

    Daniels, Rachel; Volkman, Sarah K; Milner, Danny A; Mahesh, Nira; Neafsey, Daniel E; Park, Daniel J; Rosen, David; Angelino, Elaine; Sabeti, Pardis C; Wirth, Dyann F; Wiegand, Roger C

    2008-01-01

    Background: Single nucleotide polymorphism (SNP) genotyping provides the means to develop a practical, rapid, inexpensive assay that will uniquely identify any Plasmodium falciparum parasite using a small amount of DNA. Such an assay could be used to distinguish recrudescence from re-infection in drug trials, to monitor the frequency and distribution of specific parasites in a patient population undergoing drug treatment or vaccine challenge, or for tracking samples and determining purity of isolates in the laboratory during culture adaptation and sub-cloning, as well as routine passage. Methods: A panel of twenty-four SNP markers has been identified that exhibit a high minor allele frequency (average MAF > 35%), for which robust TaqMan genotyping assays were constructed. All SNPs were identified through whole genome sequencing and MAF was estimated through Affymetrix array-based genotyping of a worldwide collection of parasites. These assays create a "molecular barcode" to uniquely identify a parasite genome. Results: Using 24 such markers no two parasites known to be of independent origin have yet been found to have the same allele signature. The TaqMan genotyping assays can be performed on a variety of samples including cultured parasites, frozen whole blood, or whole blood spotted onto filter paper with a success rate > 99%. Less than 5 ng of parasite DNA is needed to complete a panel of 24 markers. The ability of this SNP panel to detect and identify parasites was compared to the standard molecular methods, MSP-1 and MSP-2 typing. Conclusion: This work provides a facile field-deployable genotyping tool that can be used without special skills with standard lab equipment, and at reasonable cost that will unambiguously identify and track P. falciparum parasites both from patient samples and in the laboratory. PMID:18959790
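
    The idea of a 24-SNP "molecular barcode" can be illustrated with a short sketch that compares two allele signatures position by position. The encoding is an assumption made for this example (one character per marker, with "X" standing for a failed call), and the barcodes are invented strings, not data from the study.

```python
MISSING = "X"  # assumed placeholder for a marker that failed to genotype


def compare_barcodes(a: str, b: str) -> tuple[int, int]:
    """Return (mismatches, positions_compared) for two equal-length barcodes.

    Positions where either barcode has a missing call are skipped.
    """
    if len(a) != len(b):
        raise ValueError("barcodes must cover the same marker panel")
    mismatches = 0
    compared = 0
    for allele_a, allele_b in zip(a, b):
        if MISSING in (allele_a, allele_b):
            continue  # cannot compare a failed call
        compared += 1
        if allele_a != allele_b:
            mismatches += 1
    return mismatches, compared


# Invented 24-marker barcodes for illustration only.
parasite_1 = "ACGT" * 6
parasite_2 = "ACGT" * 5 + "ACGA"  # differs at the last marker
parasite_3 = "X" + parasite_1[1:]  # first marker failed, otherwise identical

print(compare_barcodes(parasite_1, parasite_2))
print(compare_barcodes(parasite_1, parasite_3))
```

    Two samples with zero mismatches over all compared markers would be candidates for the same parasite genome; how many missing calls to tolerate is a choice this sketch leaves open.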

  16. Role of an SNP in Alternative Splicing of Bovine NCF4 and Mastitis Susceptibility.

    PubMed

    Ju, Zhihua; Wang, Changfa; Wang, Xiuge; Yang, Chunhong; Sun, Yan; Jiang, Qiang; Wang, Fei; Li, Mengjiao; Zhong, Jifeng; Huang, Jinming

    2015-01-01

    Neutrophil cytosolic factor 4 (NCF4) is a component of the nicotinamide adenine dinucleotide phosphate oxidase complex, a key factor in biochemical pathways and innate immune responses. In this study, splice variants and functional single-nucleotide polymorphisms (SNPs) of NCF4 were identified to determine the variability and association of the gene with susceptibility to bovine mastitis characterized by inflammation. A novel splice variant, designated as NCF4-TV and characterized by the retention of a 48 bp sequence in intron 9, was detected in the mammary gland tissues of infected cows. The expression of the NCF4-reference main transcript in the mastitic mammary tissues was higher than that in normal tissues. A novel SNP, g.18174 A>G, was also found in the retained 48 bp region of intron 9. To determine whether NCF4-TV could be due to the g.18174 A>G mutation, we constructed two mini-gene expression vectors with the wild-type or mutant NCF4 g.18174 A>G fragment. The vectors were then transiently transfected into 293T cells, and alternative splicing of NCF4 was analyzed by reverse transcription-PCR and sequencing. Mini-gene splicing assay demonstrated that the aberrantly spliced NCF4-TV with 48 bp retained fragment in intron 9 could be due to g.18174 A>G, which was associated with milk somatic count score and increased risk of mastitis infection in cows. NCF4 expression was also regulated by alternative splicing. This study proposes that NCF4 splice variants generated by functional SNP are important risk factors for mastitis susceptibility in dairy cows. PMID:26600390

  17. Role of an SNP in Alternative Splicing of Bovine NCF4 and Mastitis Susceptibility

    PubMed Central

    Wang, Xiuge; Yang, Chunhong; Sun, Yan; Jiang, Qiang; Wang, Fei; Li, Mengjiao; Zhong, Jifeng; Huang, Jinming

    2015-01-01

    Neutrophil cytosolic factor 4 (NCF4) is a component of the nicotinamide adenine dinucleotide phosphate oxidase complex, a key factor in biochemical pathways and innate immune responses. In this study, splice variants and functional single-nucleotide polymorphisms (SNPs) of NCF4 were identified to determine the variability and association of the gene with susceptibility to bovine mastitis characterized by inflammation. A novel splice variant, designated as NCF4-TV and characterized by the retention of a 48 bp sequence in intron 9, was detected in the mammary gland tissues of infected cows. The expression of the NCF4-reference main transcript in the mastitic mammary tissues was higher than that in normal tissues. A novel SNP, g.18174 A>G, was also found in the retained 48 bp region of intron 9. To determine whether NCF4-TV could be due to the g.18174 A>G mutation, we constructed two mini-gene expression vectors with the wild-type or mutant NCF4 g.18174 A>G fragment. The vectors were then transiently transfected into 293T cells, and alternative splicing of NCF4 was analyzed by reverse transcription-PCR and sequencing. Mini-gene splicing assay demonstrated that the aberrantly spliced NCF4-TV with 48 bp retained fragment in intron 9 could be due to g.18174 A>G, which was associated with milk somatic count score and increased risk of mastitis infection in cows. NCF4 expression was also regulated by alternative splicing. This study proposes that NCF4 splice variants generated by functional SNP are important risk factors for mastitis susceptibility in dairy cows. PMID:26600390

  18. Detection of homologous horizontal gene transfer in SNP data

    2012-07-23

    We study the detection of mutations, sequencing errors, and homologous horizontal gene transfers (HGT) in a set of closely related microbial genomes. We base the model on single nucleotide polymorphisms (SNPs) and break the genomes into blocks to handle the rearrangement problem. Then we apply a dynamic programming algorithm to model whether changes within each block are likely a result of mutations, sequencing errors, or HGT.

  19. SNP Haplotype Mapping in a Small ALS Family

    PubMed Central

    Krueger, Katherine A. Dick; Tsuji, Shoji; Fukuda, Yoko; Takahashi, Yuji; Goto, Jun; Mitsui, Jun; Ishiura, Hiroyuki; Dalton, Joline C.; Miller, Michael B.; Day, John W.; Ranum, Laura P. W.

    2009-01-01

    The identification of genes for monogenic disorders has proven to be highly effective for understanding disease mechanisms, pathways and gene function in humans. Nevertheless, while thousands of Mendelian disorders have not yet been mapped there has been a trend away from studying single-gene disorders. In part, this is due to the fact that many of the remaining single-gene families are not large enough to map the disease locus to a single site in the genome. New tools and approaches are needed to allow researchers to effectively tap into this genetic gold-mine. Towards this goal, we have used haploid cell lines to experimentally validate the use of high-density single nucleotide polymorphism (SNP) arrays to define genome-wide haplotypes and candidate regions, using a small amyotrophic lateral sclerosis (ALS) family as a prototype. Specifically, we used haploid-cell lines to determine if high-density SNP arrays accurately predict haplotypes across entire chromosomes and show that haplotype information significantly enhances the genetic information in small families. Panels of haploid-cell lines were generated and a 5 centimorgan (cM) short tandem repeat polymorphism (STRP) genome scan was performed. Experimentally derived haplotypes for entire chromosomes were used to directly identify regions of the genome identical-by-descent in 5 affected individuals. Comparisons between experimentally determined and in silico haplotypes predicted from SNP arrays demonstrate that SNP analysis of diploid DNA accurately predicted chromosomal haplotypes. These methods precisely identified 12 candidate intervals, which are shared by all 5 affected individuals. Our study illustrates how genetic information can be maximized using readily available tools as a first step in mapping single-gene disorders in small families. PMID:19479031

  20. Introgression browser: high-throughput whole-genome SNP visualization.

    PubMed

    Aflitos, Saulo Alves; Sanchez-Perez, Gabino; de Ridder, Dick; Fransz, Paul; Schranz, Michael E; de Jong, Hans; Peters, Sander A

    2015-04-01

    Breeding by introgressive hybridization is a pivotal strategy to broaden the genetic basis of crops. Usually, the desired traits are monitored in consecutive crossing generations by marker-assisted selection, but their analyses fail in chromosome regions where crossover recombinants are rare or not viable. Here, we present the Introgression Browser (iBrowser), a bioinformatics tool aimed at visualizing introgressions at nucleotide or SNP (Single Nucleotide Polymorphisms) accuracy. The software selects homozygous SNPs from Variant Call Format (VCF) information and filters out heterozygous SNPs, multi-nucleotide polymorphisms (MNPs) and insertion-deletions (InDels). For data analysis iBrowser makes use of sliding windows, but if needed it can generate any desired fragmentation pattern through General Feature Format (GFF) information. In an example of tomato (Solanum lycopersicum) accessions we visualize SNP patterns and elucidate both position and boundaries of the introgressions. We also show that our tool is capable of identifying alien DNA in a panel of the closely related S. pimpinellifolium by examining phylogenetic relationships of the introgressed segments in tomato. In a third example, we demonstrate the power of the iBrowser in a panel of 597 Arabidopsis accessions, detecting the boundaries of a SNP-free region around a polymorphic 1.17 Mbp inverted segment on the short arm of chromosome 4. The architecture and functionality of iBrowser makes the software appropriate for a broad set of analyses including SNP mining, genome structure analysis, and pedigree analysis. Its functionality, together with the capability to process large data sets and efficient visualization of sequence variation, makes iBrowser a valuable breeding tool.
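
    The filtering step described above (keep homozygous SNPs; drop heterozygous SNPs, MNPs and InDels) can be sketched in a few lines. This is not iBrowser's code. It assumes single-sample VCF text with GT as the first FORMAT key, uses fixed non-overlapping windows as a simplification of sliding windows, and the example records are invented.

```python
from collections import Counter


def is_homozygous_snp(vcf_line: str) -> bool:
    """True if a single-sample VCF data line is a homozygous-alternate, bi-allelic SNP."""
    fields = vcf_line.rstrip("\n").split("\t")
    ref, alt = fields[3], fields[4]
    # Exactly one reference base and one alternate base: this rejects
    # MNPs, InDels, multi-allelic sites ("A,T") and missing ALT (".").
    if len(ref) != 1 or len(alt) != 1 or ref not in "ACGT" or alt not in "ACGT":
        return False
    # Assumption: GT is the first FORMAT key, so it is the first sample subfield.
    genotype = fields[9].split(":")[0].replace("|", "/")
    return genotype == "1/1"


def snps_per_window(vcf_lines, window_bp: int = 10_000) -> Counter:
    """Count retained SNPs per (chromosome, window index) using fixed windows."""
    counts = Counter()
    for line in vcf_lines:
        if line.startswith("#") or not is_homozygous_snp(line):
            continue  # skip header lines and filtered-out variants
        chrom, pos = line.split("\t")[:2]
        counts[(chrom, (int(pos) - 1) // window_bp)] += 1  # VCF positions are 1-based
    return counts


# Invented records for illustration only.
example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tsample1",
    "chr1\t100\t.\tA\tG\t50\tPASS\t.\tGT\t1/1",            # homozygous SNP: kept
    "chr1\t250\t.\tC\tT\t50\tPASS\t.\tGT\t0/1",            # heterozygous: dropped
    "chr1\t900\t.\tAT\tA\t50\tPASS\t.\tGT\t1/1",           # deletion: dropped
    "chr1\t1200\t.\tAC\tGT\t50\tPASS\t.\tGT\t1/1",         # MNP: dropped
    "chr1\t15000\t.\tG\tC\t50\tPASS\t.\tGT:DP\t1|1:12",    # phased homozygous SNP: kept
]
print(dict(snps_per_window(example)))
```

    Windows with unusually many or few retained SNPs relative to a reference are the kind of pattern a visualization tool could then display; the window size here is an arbitrary choice.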

  1. Robust demographic inference from genomic and SNP data.

    PubMed

    Excoffier, Laurent; Dupanloup, Isabelle; Huerta-Sánchez, Emilia; Sousa, Vitor C; Foll, Matthieu

    2013-10-01

    We introduce a flexible and robust simulation-based framework to infer demographic parameters from the site frequency spectrum (SFS) computed on large genomic datasets. We show that our composite-likelihood approach allows one to study evolutionary models of arbitrary complexity, which cannot be tackled by other current likelihood-based methods. For simple scenarios, our approach compares favorably in terms of accuracy and speed with ∂a∂i, the current reference in the field, while showing better convergence properties for complex models. We first apply our methodology to non-coding genomic SNP data from four human populations. To infer their demographic history, we compare neutral evolutionary models of increasing complexity, including unsampled populations. We further show the versatility of our framework by extending it to the inference of demographic parameters from SNP chips with known ascertainment, such as that recently released by Affymetrix to study human origins. Whereas previous ways of handling ascertained SNPs were either restricted to a single population or only allowed the inference of divergence time between a pair of populations, our framework can correctly infer parameters of more complex models including the divergence of several populations, bottlenecks and migration. We apply this approach to the reconstruction of African demography using two distinct ascertained human SNP panels studied under two evolutionary models. The two SNP panels lead to globally very similar estimates and confidence intervals, and suggest an ancient divergence (>110 Ky) between Yoruba and San populations. Our methodology appears well suited to the study of complex scenarios from large genomic data sets.
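
    For readers unfamiliar with the site frequency spectrum (SFS) that the framework takes as input, a minimal sketch of how an unfolded SFS can be tallied from phased 0/1 allele data follows. It only shows the summary statistic, not the composite-likelihood inference, and the toy data are invented.

```python
def site_frequency_spectrum(sites, n_chromosomes: int) -> list[int]:
    """Tally the unfolded SFS.

    sites: iterable of per-site allele lists, one 0/1 entry per sampled
           chromosome (1 = derived allele).
    Returns a list where entry k is the number of sites at which exactly
    k chromosomes carry the derived allele (k = 0 .. n_chromosomes).
    """
    sfs = [0] * (n_chromosomes + 1)
    for alleles in sites:
        if len(alleles) != n_chromosomes:
            raise ValueError("every site needs one allele per chromosome")
        sfs[sum(alleles)] += 1
    return sfs


# Invented data: 5 sites observed in 4 sampled chromosomes.
toy_sites = [
    [0, 0, 0, 1],  # singleton
    [1, 0, 0, 0],  # singleton
    [1, 1, 0, 0],  # doubleton
    [1, 1, 1, 0],  # derived allele on 3 of 4 chromosomes
    [0, 0, 0, 0],  # monomorphic for the ancestral allele
]
print(site_frequency_spectrum(toy_sites, n_chromosomes=4))
```

    Entries 0 and n count monomorphic sites; whether they are kept depends on the analysis, and ascertained SNP chips alter the expected shape of this spectrum.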

  2. Robust Demographic Inference from Genomic and SNP Data

    PubMed Central

    Excoffier, Laurent; Dupanloup, Isabelle; Huerta-Sánchez, Emilia; Sousa, Vitor C.; Foll, Matthieu

    2013-01-01

    We introduce a flexible and robust simulation-based framework to infer demographic parameters from the site frequency spectrum (SFS) computed on large genomic datasets. We show that our composite-likelihood approach allows one to study evolutionary models of arbitrary complexity, which cannot be tackled by other current likelihood-based methods. For simple scenarios, our approach compares favorably in terms of accuracy and speed with ∂a∂i, the current reference in the field, while showing better convergence properties for complex models. We first apply our methodology to non-coding genomic SNP data from four human populations. To infer their demographic history, we compare neutral evolutionary models of increasing complexity, including unsampled populations. We further show the versatility of our framework by extending it to the inference of demographic parameters from SNP chips with known ascertainment, such as that recently released by Affymetrix to study human origins. Whereas previous ways of handling ascertained SNPs were either restricted to a single population or only allowed the inference of divergence time between a pair of populations, our framework can correctly infer parameters of more complex models including the divergence of several populations, bottlenecks and migration. We apply this approach to the reconstruction of African demography using two distinct ascertained human SNP panels studied under two evolutionary models. The two SNP panels lead to globally very similar estimates and confidence intervals, and suggest an ancient divergence (>110 Ky) between Yoruba and San populations. Our methodology appears well suited to the study of complex scenarios from large genomic data sets. PMID:24204310

  3. Investigating single nucleotide polymorphism (SNP) density in the human genome and its implications for molecular evolution.

    PubMed

    Zhao, Zhongming; Fu, Yun-Xin; Hewett-Emmett, David; Boerwinkle, Eric

    2003-07-17

    We investigated the single nucleotide polymorphism (SNP) density across the human genome and in different genic categories using two SNP databases: Celera's CgsSNP, which includes SNPs identified by comparing genomic sequences, and Celera's RefSNP, which includes SNPs from a variety of sources and is biased toward disease-associated genes. Based on CgsSNP, the average number of SNPs per 10 kb was 8.33, 8.44, and 8.09 in the human genome, in intergenic regions, and in genic regions, respectively. In genic regions, the SNP density in intronic, exonic and adjoining untranslated regions was 8.21, 5.28, and 7.51 SNPs per 10 kb, respectively. The pattern of SNP density based on RefSNP was different from that based on CgsSNP, emphasizing its utility for genotype-phenotype association studies but not for most population genetic studies. The number of SNPs per chromosome was correlated with chromosome length, but the density of SNPs estimated by CgsSNP was not significantly correlated with the GC content of the chromosome. Based on CgsSNP, the ratio of nonsense to missense mutations (0.027), the ratio of missense to silent mutations (1.15), and the ratio of non-synonymous to synonymous mutations (1.18) were each less than half of that expected in a human protein coding sequence under the neutral mutation theory, reflecting a role for natural selection, especially purifying selection. PMID:12909357
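
    As a worked illustration of the density unit used above, SNPs per 10 kb is the SNP count divided by the region length in base pairs, scaled by 10,000. The counts and lengths below are hypothetical; they are not taken from the paper and were chosen only so that the results match two of the reported figures. The function name `snps_per_10kb` is likewise an assumption.

```python
def snps_per_10kb(snp_count: int, region_length_bp: int) -> float:
    """Return SNP density expressed as SNPs per 10,000 base pairs."""
    if region_length_bp <= 0:
        raise ValueError("region length must be positive")
    return snp_count / region_length_bp * 10_000


# Hypothetical inputs, not data from the study.
genome_wide = snps_per_10kb(2_499, 3_000_000)
exonic = snps_per_10kb(528, 1_000_000)
```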

  4. Population distribution and ancestry of the cancer protective MDM2 SNP285 (rs117039649).

    PubMed

    Knappskog, Stian; Gansmo, Liv B; Dibirova, Khadizha; Metspalu, Andres; Cybulski, Cezary; Peterlongo, Paolo; Aaltonen, Lauri; Vatten, Lars; Romundstad, Pål; Hveem, Kristian; Devilee, Peter; Evans, Gareth D; Lin, Dongxin; Van Camp, Guy; Manolopoulos, Vangelis G; Osorio, Ana; Milani, Lili; Ozcelik, Tayfun; Zalloua, Pierre; Mouzaya, Francis; Bliznetz, Elena; Balanovska, Elena; Pocheshkova, Elvira; Kučinskas, Vaidutis; Atramentova, Lubov; Nymadawa, Pagbajabyn; Titov, Konstantin; Lavryashina, Maria; Yusupov, Yuldash; Bogdanova, Natalia; Koshel, Sergey; Zamora, Jorge; Wedge, David C; Charlesworth, Deborah; Dörk, Thilo; Balanovsky, Oleg; Lønning, Per E

    2014-09-30

    The MDM2 promoter SNP285C is located on the SNP309G allele. While SNP309G enhances Sp1 transcription factor binding and MDM2 transcription, SNP285C antagonizes Sp1 binding and reduces the risk of breast, ovarian and endometrial cancer. Assessing SNP285 and 309 genotypes across 25 different ethnic populations (>10,000 individuals), the incidence of SNP285C was 6-8% across European populations except for Finns (1.2%) and Saami (0.3%). The incidence decreased towards the Middle-East and Eastern Russia, and SNP285C was absent among Han Chinese, Mongolians and African Americans. Interhaplotype variation analyses estimated SNP285C to have originated about 14,700 years ago (95% CI: 8,300 - 33,300). Both this estimate and the geographical distribution suggest SNP285C to have arisen after the separation between Caucasians and modern day East Asians (17,000 - 40,000 years ago). We observed a strong inverse correlation (r = -0.805; p < 0.001) between the percentage of SNP309G alleles harboring SNP285C and the MAF for SNP309G itself across different populations suggesting selection and environmental adaptation with respect to MDM2 expression in recent human evolution. In conclusion, we found SNP285C to be a pan-Caucasian variant. Ethnic variation regarding distribution of SNP285C needs to be taken into account when assessing the impact of MDM2 SNPs on cancer risk. PMID:25327560

  5. Population distribution and ancestry of the cancer protective MDM2 SNP285 (rs117039649)

    PubMed Central

    Knappskog, Stian; Gansmo, Liv B.; Dibirova, Khadizha; Metspalu, Andres; Cybulski, Cezary; Peterlongo, Paolo; Aaltonen, Lauri; Vatten, Lars; Romundstad, Pål; Hveem, Kristian; Devilee, Peter; Evans, Gareth D.; Lin, Dongxin; Van Camp, Guy; Manolopoulos, Vangelis G.; Osorio, Ana; Milani, Lili; Ozcelik, Tayfun; Zalloua, Pierre; Mouzaya, Francis; Bliznetz, Elena; Balanovska, Elena; Pocheshkova, Elvira; Kučinskas, Vaidutis; Atramentova, Lubov; Nymadawa, Pagbajabyn; Titov, Konstantin; Lavryashina, Maria; Yusupov, Yuldash; Bogdanova, Natalia; Koshel, Sergey; Zamora, Jorge; Wedge, David C.; Charlesworth, Deborah; Dörk, Thilo; Balanovsky, Oleg; Lønning, Per E.

    2014-01-01

    The MDM2 promoter SNP285C is located on the SNP309G allele. While SNP309G enhances Sp1 transcription factor binding and MDM2 transcription, SNP285C antagonizes Sp1 binding and reduces the risk of breast, ovarian and endometrial cancer. Assessing SNP285 and 309 genotypes across 25 different ethnic populations (>10,000 individuals), the incidence of SNP285C was 6-8% across European populations except for Finns (1.2%) and Saami (0.3%). The incidence decreased towards the Middle-East and Eastern Russia, and SNP285C was absent among Han Chinese, Mongolians and African Americans. Interhaplotype variation analyses estimated SNP285C to have originated about 14,700 years ago (95% CI: 8,300 – 33,300). Both this estimate and the geographical distribution suggest SNP285C to have arisen after the separation between Caucasians and modern day East Asians (17,000 - 40,000 years ago). We observed a strong inverse correlation (r = -0.805; p < 0.001) between the percentage of SNP309G alleles harboring SNP285C and the MAF for SNP309G itself across different populations suggesting selection and environmental adaptation with respect to MDM2 expression in recent human evolution. In conclusion, we found SNP285C to be a pan-Caucasian variant. Ethnic variation regarding distribution of SNP285C needs to be taken into account when assessing the impact of MDM2 SNPs on cancer risk. PMID:25327560

  6. Fast and Rigorous Computation of Gene and Pathway Scores from SNP-Based Summary Statistics.

    PubMed

    Lamparter, David; Marbach, Daniel; Rueedi, Rico; Kutalik, Zoltán; Bergmann, Sven

    2016-01-01

    Integrating single nucleotide polymorphism (SNP) p-values from genome-wide association studies (GWAS) across genes and pathways is a strategy to improve statistical power and gain biological insight. Here, we present Pascal (Pathway scoring algorithm), a powerful tool for computing gene and pathway scores from SNP-phenotype association summary statistics. For gene score computation, we implemented analytic and efficient numerical solutions to calculate test statistics. We examined in particular the sum and the maximum of chi-squared statistics, which measure the average and the strongest association signals per gene, respectively. For pathway scoring, we use a modified Fisher method, which offers not only significant power improvement over more traditional enrichment strategies, but also eliminates the problem of arbitrary threshold selection inherent in any binary membership based pathway enrichment approach. We demonstrate the marked increase in power by analyzing summary statistics from dozens of large meta-studies for various traits. Our extensive testing indicates that our method not only excels in rigorous type I error control, but also results in more biologically meaningful discoveries.
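
    The two gene-level statistics named above, and the idea of combining gene p-values into a pathway score, can be sketched as follows. This is not the Pascal implementation: it ignores the correlation (linkage disequilibrium) between SNPs that a real gene score must account for, and it uses the classical Fisher method rather than the modified version the abstract refers to. All function names are assumptions.

```python
import math
from statistics import NormalDist
from typing import Dict, List


def chi2_from_p(p: float) -> float:
    """Convert a two-sided p-value in (0, 1] to a 1-df chi-squared statistic."""
    z = NormalDist().inv_cdf(p / 2.0)  # lower-tail quantile; its square is the statistic
    return z * z


def gene_statistics(snp_pvalues: List[float]) -> Dict[str, float]:
    """Sum (average-type signal) and maximum (strongest signal) over a gene's SNPs."""
    stats = [chi2_from_p(p) for p in snp_pvalues]
    return {"sum": sum(stats), "max": max(stats)}


def fisher_combined_p(gene_pvalues: List[float]) -> float:
    """Classical Fisher method for k independent p-values.

    X = -2 * sum(ln p_i) follows a chi-squared distribution with 2k degrees
    of freedom, whose upper tail has the closed form
    exp(-X/2) * sum_{j<k} (X/2)^j / j!.
    """
    k = len(gene_pvalues)
    half_x = -sum(math.log(p) for p in gene_pvalues)
    term, total = 1.0, 1.0
    for j in range(1, k):
        term *= half_x / j
        total += term
    return math.exp(-half_x) * total


# Hypothetical p-values, for illustration only.
gene = gene_statistics([0.05, 1.0])
pathway_p = fisher_combined_p([0.05, 0.05])
```

    Because Fisher's statistic uses every gene p-value directly, no significance threshold has to be chosen to decide which genes count as "hits", which is the property the abstract contrasts with binary enrichment tests.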

  7. Protein-protein interaction and SNP analysis in intraductal papillary mucinous neoplasm.

    PubMed

    Jiang, Pu; Zang, Weidong; Wang, Lishan; Xu, Ying; Liu, Yang; Deng, Shi-Xiong

    2013-01-15

    Intraductal papillary mucinous neoplasm (IPMN) is a type of tumor that grows within the pancreatic ducts. It progresses from hyperplasia to intraductal adenoma (IPMA), to noninvasive carcinoma, and ultimately to invasive carcinoma (IPMC). The objective of this study was to explore the molecular mechanism of the progression from IPMA to IPMC. Using the GSE19650 Affymetrix microarray data available from the Gene Expression Omnibus (GEO) database, we first identified the differentially expressed genes (DEGs) between IPMA and IPMC, followed by protein-protein interaction and single-nucleotide polymorphism (SNP) analysis of the DEGs. Our study identified thousands of DEGs involved in the regulation of cell cycle and apoptosis in this progression from IPMA to IPMC. Protein-protein interaction network construction found that MYC, IL6ST, NR3C1, CREBBP, GATA1 and LRP1 might play an important role in the progression. Furthermore, the SNP analysis confirmed the association between BRCA1 and pancreatic cancer. In conclusion, our data provide a comprehensive bioinformatics analysis of genes and pathways which may be involved in the progression of IPMN from IPMA to IPMC.

  8. Human Y-chromosome SNP characterization by multiplex amplified product-length polymorphism analysis.

    PubMed

    Medina, Laura Smeldy Jurado; Muzzio, Marina; Schwab, Marisol; Costantino, María Leticia Bravi; Barreto, Guillermo; Bailliet, Graciela

    2014-09-01

    We designed an allele-specific amplification protocol to optimize Y-chromosome SNP typing, which is an unavoidable step for defining the phylogenetic status of paternal lineages. It allows the simultaneous highly specific definition of up to six mutations in a single reaction by amplification fragment length polymorphism (AFLP) without the need for specialized equipment, at a considerably lower cost than that based on single-base primer extension (SNaPshot™) technology or PCR-RFLP systems, requiring as little as 0.5 ng DNA and compatible with the small fragments characteristic of low-quality DNA. By designing two primers that recognize the derived and ancestral state for each SNP, which can be differentiated by size by the addition of a noncomplementary nucleotide tail, we could define major Y clades E, F, K, R, Q, and subhaplogroups R1, R1a, R1b, R1b1b, R1b1c, J1, J2, G1, G2, I1, Q1a3, and Q1a3a1 through amplification fragments that ranged between 60 and 158 bp. PMID:24846779

  9. Identification of close relatives in the HUGO Pan-Asian SNP database.

    PubMed

    Yang, Xiong; Xu, Shuhua

    2011-01-01

    The HUGO Pan-Asian SNP Consortium has recently released a genome-wide dataset, which consists of 1,719 DNA samples collected from 71 Asian populations. For studies of human population genetics such as genetic structure and migration history, this provided the most comprehensive large-scale survey of genetic variation to date in East and Southeast Asia. However, although considered in the analysis, close relatives were not clearly reported in the original paper. Here we performed a systematic analysis of genetic relationships among individuals from the Pan-Asian SNP (PASNP) database and identified 3 pairs of monozygotic twins or duplicate samples, 100 pairs of first-degree relatives and 161 pairs of second-degree relatives. Based on the relationships inferred in this study, we suggest three standardized subsets with different levels of unrelated individuals (denoted PASNP1716, PASNP1640 and PASNP1583, respectively) for future applications of the samples in most types of population-genetics studies. In addition, we provide gender information for the PASNP samples, which was not included in the original dataset, based on analysis of X chromosome data.

  10. Fast and Rigorous Computation of Gene and Pathway Scores from SNP-Based Summary Statistics

    PubMed Central

    Lamparter, David; Marbach, Daniel; Rueedi, Rico; Kutalik, Zoltán; Bergmann, Sven

    2016-01-01

    Integrating single nucleotide polymorphism (SNP) p-values from genome-wide association studies (GWAS) across genes and pathways is a strategy to improve statistical power and gain biological insight. Here, we present Pascal (Pathway scoring algorithm), a powerful tool for computing gene and pathway scores from SNP-phenotype association summary statistics. For gene score computation, we implemented analytic and efficient numerical solutions to calculate test statistics. We examined in particular the sum and the maximum of chi-squared statistics, which measure the average and the strongest association signals per gene, respectively. For pathway scoring, we use a modified Fisher method, which offers not only significant power improvement over more traditional enrichment strategies, but also eliminates the problem of arbitrary threshold selection inherent in any binary membership based pathway enrichment approach. We demonstrate the marked increase in power by analyzing summary statistics from dozens of large meta-studies for various traits. Our extensive testing indicates that our method not only excels in rigorous type I error control, but also results in more biologically meaningful discoveries. PMID:26808494

  11. Both a nicotinic single nucleotide polymorphism (SNP) and a noradrenergic SNP modulate working memory performance when attention is manipulated.

    PubMed

    Greenwood, Pamela M; Sundararajan, Ramya; Lin, Ming-Kuan; Kumar, Reshma; Fryxell, Karl J; Parasuraman, Raja

    2009-11-01

    We investigated the relation between the two systems of visuospatial attention and working memory by examining the effect of normal variation in cholinergic and noradrenergic genes on working memory performance under attentional manipulation. We previously reported that working memory for location was impaired following large location precues, indicating the scale of visuospatial attention has a role in forming the mental representation of the target. In one of the first studies to compare effects of two single nucleotide polymorphisms (SNPs) on the same cognitive task, we investigated the neurotransmission systems underlying interactions between attention and memory. Based on our previous report that the CHRNA4 rs#1044396 C/T nicotinic receptor SNP affected visuospatial attention, but not working memory, and the DBH rs#1108580 G/A noradrenergic enzyme SNP affected working memory, but not attention, we predicted that both SNPs would modulate performance when the two systems interacted and working memory was manipulated by attention. We found the scale of visuospatial attention deployed around a target affected memory for location of that target. Memory performance was modulated by the two SNPs. CHRNA4 C/C homozygotes and DBH G allele carriers showed the best memory performance but also the greatest benefit of visuospatial attention on memory. Overall, however, the CHRNA4 SNP exerted a stronger effect than the DBH SNP on memory performance when visuospatial attention was manipulated. This evidence of an integrated cholinergic influence on working memory performance under attentional manipulation is consistent with the view that working memory and visuospatial attention are separate systems which can interact.

  12. Single Nucleotide Polymorphism (SNP)-Based Loss of Heterozygosity (LOH) Testing by Real Time PCR in Patients Suspect of Myeloproliferative Disease

    PubMed Central

    Huijsmans, Cornelis J. J.; Poodt, Jeroen; Damen, Jan; van der Linden, Johannes C.; Savelkoul, Paul H. M.; Pruijt, Johannes F. M.; Hilbink, Mirrian; Hermans, Mirjam H. A.

    2012-01-01

    During tumor development, loss of heterozygosity (LOH) often occurs. When LOH is preceded by an oncogene-activating mutation, the mutant allele may be further potentiated if the wild-type allele is lost or inactivated. In myeloproliferative neoplasms (MPN), somatic acquisition of JAK2V617F may be followed by LOH resulting in loss of the wild-type allele. The occurrence of LOH in MPN and other proliferative diseases may further potentiate the mutant allele and thereby increase morbidity. A real time PCR based SNP profiling assay was developed and validated for LOH detection of the JAK2 region (JAK2LOH). Blood of a cohort of 12 JAK2V617F-positive patients (n = 6 with 25–50% and n = 6 with >50% JAK2V617F) and a cohort of 81 patients suspected of MPN was stored with EDTA and subsequently used for validation. To generate germ-line profiles, non-neoplastic formalin-fixed paraffin-embedded tissue from each patient was analyzed. Results of the SNP assay were compared to those of an established Short Tandem Repeat (STR) assay. Both assays revealed JAK2LOH in 1/6 patients with 25–50% JAK2V617F. In patients with >50% JAK2V617F, JAK2LOH was detected in 6/6 patients by the SNP assay and in 5/6 patients by the STR assay. Of the 81 patients suspected of MPN, 18 patients carried JAK2V617F. Both the SNP and STR assays demonstrated the occurrence of JAK2LOH in 5 of them. In the 63 JAK2V617F-negative patients, no JAK2LOH was observed by SNP and STR analyses. The presented SNP assay reliably detects JAK2LOH and is a fast and easy-to-perform alternative to STR analyses. We therefore regard the SNP approach as a proof of principle for the development of LOH SNP assays for other clinically relevant LOH loci. PMID:22768290

  13. Rapid Diagnosis of Imprinting Disorders Involving Copy Number Variation and Uniparental Disomy Using Genome-Wide SNP Microarrays.

    PubMed

    Liu, Weiqiang; Zhang, Rui; Wei, Jun; Zhang, Huimin; Yu, Guojiu; Li, Zhihua; Chen, Min; Sun, Xiaofang

    2015-01-01

    Imprinting disorders, such as Beckwith-Wiedemann syndrome (BWS), Prader-Willi syndrome (PWS) and Angelman syndrome (AS), can be detected via methylation analysis, methylation-specific multiplex ligation-dependent probe amplification (MS-MLPA), or other methods. In this study, we applied single nucleotide polymorphism (SNP)-based chromosomal microarray analysis to detect copy number variations (CNVs) and uniparental disomy (UPD) events in patients with suspected imprinting disorders. Of 4 patients, 2 had a 5.25-Mb microdeletion in the 15q11.2q13.2 region, 1 had a 38.4-Mb mosaic UPD in the 11p15.4 region, and 1 had a 60-Mb detectable UPD between regions 14q13.2 and 14q32.13. Although the 14q32.2 region was classified as normal by SNP array for the 14q13 UPD patient, it turned out to be a heterodisomic UPD by short tandem repeat marker analysis. MS-MLPA analysis was performed to validate the variations. In conclusion, SNP-based microarray is an efficient alternative method for quickly and precisely diagnosing PWS, AS, BWS, and other imprinted gene-associated disorders when considering aberrations due to CNVs and most types of UPD. PMID:26184742

  14. Population structure and genetic diversity in a commercial maize breeding program assessed with SSR and SNP markers

    PubMed Central

    Van Inghelandt, Delphine; Melchinger, Albrecht E.; Lebreton, Claude

    2010-01-01

    Information about the genetic diversity and population structure in elite breeding material is of fundamental importance for the improvement of crops. The objectives of our study were to (a) examine the population structure and the genetic diversity in elite maize germplasm based on simple sequence repeat (SSR) markers, (b) compare these results with those obtained from single nucleotide polymorphism (SNP) markers, and (c) compare the coancestry coefficient calculated from pedigree records with genetic distance estimates calculated from SSR and SNP markers. Our study was based on 1,537 elite maize inbred lines genotyped with 359 SSR and 8,244 SNP markers. The average number of alleles per locus, of group specific alleles, and the gene diversity (D) were higher for SSRs than for SNPs. Modified Rogers' distance (MRD) estimates and membership probabilities of the STRUCTURE matrices were higher for SSR than for SNP markers but the germplasm organization in four heterotic pools was consistent with STRUCTURE results based on SSRs and SNPs. MRD estimates calculated for the two marker systems were highly correlated (0.87). Our results suggested that the same conclusions regarding the structure and the diversity of heterotic pools could be drawn from both marker types. Furthermore, although our results suggested that the ratio of the number of SSRs and SNPs required to obtain MRD or D estimates with similar precision is not constant across the various precision levels, we propose that between 7 and 11 times more SNPs than SSRs should be used for analyzing population structure and genetic diversity. Electronic supplementary material: The online version of this article (doi:10.1007/s00122-009-1256-2) contains supplementary material, which is available to authorized users. PMID:20063144
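
    The modified Rogers' distance used above is commonly defined, for two lines scored at m loci, as the square root of the summed squared allele-frequency differences divided by 2m. The sketch below assumes that standard definition (the abstract itself does not spell it out); the function name, the dictionary representation of allele frequencies, and the toy genotypes are illustrative assumptions.

```python
import math
from typing import Dict, List

AlleleFreqs = Dict[str, float]  # allele label -> frequency at one locus


def modified_rogers_distance(a: List[AlleleFreqs], b: List[AlleleFreqs]) -> float:
    """MRD = sqrt( sum over loci and alleles of (p - q)^2 / (2 * m) )."""
    if not a or len(a) != len(b):
        raise ValueError("both lines must be scored at the same, non-empty set of loci")
    total = 0.0
    for freqs_a, freqs_b in zip(a, b):
        for allele in set(freqs_a) | set(freqs_b):
            diff = freqs_a.get(allele, 0.0) - freqs_b.get(allele, 0.0)
            total += diff * diff
    return math.sqrt(total / (2 * len(a)))


# Hypothetical inbred lines: each is fixed for one allele at each of 4 SNPs.
line_1 = [{"A": 1.0}, {"C": 1.0}, {"G": 1.0}, {"T": 1.0}]
line_2 = [{"A": 1.0}, {"C": 1.0}, {"G": 1.0}, {"C": 1.0}]  # differs at one locus
mrd = modified_rogers_distance(line_1, line_2)
```

    With fully homozygous lines the distance ranges from 0 (identical at every locus) to 1 (different at every locus), so a biallelic SNP carries less information per locus than a multi-allelic SSR, consistent with the need for more SNPs noted in the abstract.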

  15. [Mechanism of genuineness of Glycyrrhiza uralensis based on SNP of β-Amyrin synthase gene].

    PubMed

    Zang, Yi-mei; Li, Yan-peng; Qiao, Jing; Chen, Hong-hao; Liu, Chun-sheng

    2015-07-01

    β-Amyrin synthase (β-AS) genes of Glycyrrhiza uralensis from 6 different regions were analyzed by PCR-SSCP and sequenced, and the correlation between β-AS SNPs and the regions of origin of Glycyrrhiza uralensis was then determined. According to a single coding SNP at the 94 bp site in the first exon of the β-AS gene, Glycyrrhiza uralensis could be divided into 3 genotypes. Among these genotypes, the percentage of the 94A type was much higher in genuine regions, and it differed significantly from the percentage in non-genuine regions (P < 0.001). The results indicate that differences in β-AS genotype at the 94 bp site between regions may be an important contributor to the genuineness of Glycyrrhiza uralensis. PMID:26552155

  16. PCR amplification of SNP loci from crude DNA for large-scale genotyping of oomycetes.

    PubMed

    Hu, Jian; Lyon, Rebecca; Zhou, Yuxin; Lamour, Kurt

    2014-01-01

    Similar to other eukaryotes, single nucleotide polymorphism (SNP) markers are abundant in many oomycete plant pathogen genomes. High resolution DNA melting analysis (HR-DMA) is a cost-effective method for SNP genotyping, but like many SNP marker technologies, is limited by the amount and quality of template DNA. We describe PCR preamplification of Phytophthora and Peronospora SNP loci from crude DNA extracted from a small amount of mycelium and/or infected plant tissue to produce sufficient template to genotype at least 10 000 SNPs. The approach is fast, inexpensive, requires minimal biological material and should be useful for many organisms in a variety of contexts. PMID:24871597

  17. Accuracy of direct genomic values in Holstein bulls and cows using subsets of SNP markers

    PubMed Central

    2010-01-01

    Background: At the current price, the use of high-density single nucleotide polymorphisms (SNP) genotyping assays in genomic selection of dairy cattle is limited to applications involving elite sires and dams. The objective of this study was to evaluate the use of low-density assays to predict direct genomic value (DGV) on five milk production traits, an overall conformation trait, a survival index, and two profit index traits (APR, ASI). Methods: Dense SNP genotypes were available for 42,576 SNP for 2,114 Holstein bulls and 510 cows. A subset of 1,847 bulls born between 1955 and 2004 was used as a training set to fit models with various sets of pre-selected SNP. A group of 297 bulls born between 2001 and 2004 and all cows born between 1992 and 2004 were used to evaluate the accuracy of DGV prediction. Ridge regression (RR) and partial least squares regression (PLSR) were used to derive prediction equations and to rank SNP based on the absolute value of the regression coefficients. Four alternative strategies were applied to select subsets of SNP, namely: subsets of the highest ranked SNP for each individual trait, or a single subset of evenly spaced SNP, where SNP were selected based on their rank for ASI, APR or minor allele frequency within intervals of approximately equal length. Results: RR and PLSR performed very similarly to predict DGV, with PLSR performing better for low-density assays and RR for higher-density SNP sets. When using all SNP, DGV predictions for production traits, which have a higher heritability, were more accurate (0.52-0.64) than for survival (0.19-0.20), which has a low heritability. The gain in accuracy using subsets that included the highest ranked SNP for each trait was marginal (5-6%) over a common set of evenly spaced SNP when at least 3,000 SNP were used. Subsets containing 3,000 SNP provided more than 90% of the accuracy that could be achieved with a high-density assay for cows, and 80% of the high-density assay for young bulls
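
    The ranking step described in the Methods (fit a ridge regression, then order SNP by the absolute value of their coefficients) can be sketched in a few lines. This is an illustration under stated assumptions, not the study's code: the genotype coding, the penalty value, the toy data and all function names are hypothetical, and a small Gaussian-elimination solver stands in for a numerical library.

```python
from typing import List

Matrix = List[List[float]]


def solve(a: Matrix, b: List[float]) -> List[float]:
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def ridge_coefficients(x: Matrix, y: List[float], lam: float) -> List[float]:
    """Ridge estimate: solve (X'X + lam * I) beta = X'y."""
    p = len(x[0])
    xtx = [[sum(row[i] * row[j] for row in x) + (lam if i == j else 0.0)
            for j in range(p)] for i in range(p)]
    xty = [sum(row[i] * yi for row, yi in zip(x, y)) for i in range(p)]
    return solve(xtx, xty)


def rank_snps(beta: List[float]) -> List[int]:
    """SNP indices ordered from largest to smallest absolute coefficient."""
    return sorted(range(len(beta)), key=lambda i: abs(beta[i]), reverse=True)


# Hypothetical centered genotype codes for 4 animals x 3 SNP, and phenotypes
# built as 2.0 * SNP0 + 0.5 * SNP1 (SNP2 has no effect).
genotypes = [[1, 1, 1], [-1, 1, -1], [1, -1, -1], [-1, -1, 1]]
phenotypes = [2.5, -1.5, 1.5, -2.5]
beta = ridge_coefficients(genotypes, phenotypes, lam=1.0)
ranking = rank_snps(beta)
```

    A low-density panel would then keep only the first few thousand indices of such a ranking; the penalty shrinks every coefficient toward zero but leaves their order intact in this toy example.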

  18. Personalized Medicine Through SNP Testing for Breast Cancer Risk: Clinical Implementation.

    PubMed

    Howe, Rebecca; Miron-Shatz, Talya; Hanoch, Yaniv; Omer, Zehra B; O'Donoghue, Cristina; Ozanne, Elissa M

    2015-10-01

    Single nucleotide polymorphisms (SNPs) have the potential to improve personalized medicine in breast cancer care. As new SNPs are discovered, further enhancing risk classification, SNP testing may serve to complement family history and phenotypic risk factors when assessed in a clinical setting. SNP analysis is particularly relevant to high-risk women who may seek out such information to guide their decision-making around risk-reduction. However, little is known about how high-risk women may respond to SNP testing with regard to clinical decision-making. We examined high-risk women's interest in SNP testing for breast cancer risk through an online survey of hypothetical testing scenarios. Women stated their preferences for sharing test results and selected the most likely follow-up action they would pursue in each of the test result scenarios (above average and below average risk for breast cancer). Four hundred seventy-eight women participated. Most women (89%) did not know what a SNP was prior to the study. Once SNP testing was described, 75% were interested in SNP testing. Participants stated an interest in lifestyle interventions for risk-reduction and wanted to discuss their testing results with their doctor or a genetic counselor. Women are interested in SNP testing and are prepared to make lifestyle changes based on testing results. Women's preference for discussing testing results with a healthcare provider aligns with the current trend towards SNP testing in a clinical setting.

  19. Obesity-related known and candidate SNP markers can significantly change affinity of TATA-binding protein for human gene promoters

    PubMed Central

    2015-01-01

    Background: Obesity affects quality of life and life expectancy and is associated with cardiovascular disorders, cancer, diabetes, reproductive disorders in women, prostate diseases in men, and congenital anomalies in children. The use of single nucleotide polymorphism (SNP) markers of diseases and drug responses (i.e., significant differences of personal genomes of patients from the reference human genome) can help physicians to improve treatment. Clinical research can validate SNP markers via genotyping of patients and demonstration that SNP alleles are significantly more frequent in patients than in healthy people. The search for biomedical SNP markers of interest can be accelerated by computer-based analysis of hundreds of millions of SNPs in the 1000 Genomes project because of selection of the most meaningful candidate SNP markers and elimination of neutral SNPs. Results: We cross-validated the output of two computer-based methods: DNA sequence analysis using Web service SNP_TATA_Comparator and keyword search for articles on comorbidities of obesity. Near the sites binding to TATA-binding protein (TBP) in human gene promoters, we found 22 obesity-related candidate SNP markers, including rs10895068 (male breast cancer in obesity); rs35036378 (reduced risk of obesity after ovariectomy); rs201739205 (reduced risk of obesity-related cancers due to weight loss by diet/exercise in obese postmenopausal women); rs183433761 (obesity resistance during a high-fat diet); rs367732974 and rs549591993 (both: cardiovascular complications in obese patients with type 2 diabetes mellitus); rs200487063 and rs34104384 (both: obesity-caused hypertension); rs35518301, rs72661131, and rs562962093 (all: obesity); and rs397509430, rs33980857, rs34598529, rs33931746, rs33981098, rs34500389, rs63750953, rs281864525, rs35518301, and rs34166473 (all: chronic inflammation in comorbidities of obesity). Using an electrophoretic mobility shift assay under nonequilibrium conditions, we

  20. Fine-scaled human genetic structure revealed by SNP microarrays.

    PubMed

    Xing, Jinchuan; Watkins, W Scott; Witherspoon, David J; Zhang, Yuhua; Guthery, Stephen L; Thara, Rangaswamy; Mowry, Bryan J; Bulayeva, Kazima; Weiss, Robert B; Jorde, Lynn B

    2009-05-01

    We report an analysis of more than 240,000 loci genotyped using the Affymetrix SNP microarray in 554 individuals from 27 worldwide populations in Africa, Asia, and Europe. To provide a more extensive and complete sampling of human genetic variation, we have included caste and tribal samples from two states in South India, Daghestanis from eastern Europe, and the Iban from Malaysia. Consistent with observations made by Charles Darwin, our results highlight shared variation among human populations and demonstrate that much genetic variation is geographically continuous. At the same time, principal components analyses reveal discernible genetic differentiation among almost all identified populations in our sample, and in most cases, individuals can be clearly assigned to defined populations on the basis of SNP genotypes. All individuals are accurately classified into continental groups using a model-based clustering algorithm, but between closely related populations, genetic and self-classifications conflict for some individuals. The 250K data permitted high-level resolution of genetic variation among Indian caste and tribal populations and between highland and lowland Daghestani populations. In particular, upper-caste individuals from Tamil Nadu and Andhra Pradesh form one defined group, lower-caste individuals from these two states form another, and the tribal Irula samples form a third. Our results emphasize the correlation of genetic and geographic distances and highlight other elements, including social factors that have contributed to population structure. PMID:19411602

  2. Structural Architecture of SNP Effects on Complex Traits

    PubMed Central

    Gamazon, Eric R.; Cox, Nancy J.; Davis, Lea K.

    2014-01-01

    Despite the discovery of copy-number variation (CNV) across the genome nearly 10 years ago, current SNP-based analysis methodologies continue to collapse the homozygous (i.e., A/A), hemizygous (i.e., A/0), and duplicative (i.e., A/A/A) genotype states, treating the genotype variable as irreducible or unaltered by other colocalizing forms of genetic (e.g., structural) variation. Our understanding of common, genome-wide CNVs suggests that the canonical genotype construct might belie the enormous complexity of the genome. Here we present multiple analyses of several phenotypes and provide methods supporting a conceptual shift that embraces the structural dimension of genotype. We comprehensively investigate the impact of the structural dimension of genotype on (1) GWAS methods, (2) interpretation of rare LOF variants, (3) characterization of genomic architecture, and (4) implications for mapping loci involved in complex disease. Taken together, these results argue for the inclusion of a structural dimension and suggest that some portion of the “missing” heritability might be recovered through integration of the structural dimension of SNP effects on complex traits. PMID:25307299

  3. New Insights into the Geographic Distribution of Mycobacterium leprae SNP Genotypes Determined for Isolates from Leprosy Cases Diagnosed in Metropolitan France and French Territories

    PubMed Central

    Reibel, Florence; Chauffour, Aurélie; Brossier, Florence; Jarlier, Vincent; Cambau, Emmanuelle; Aubry, Alexandra

    2015-01-01

    Background: Between 20 and 30 bacteriologically confirmed cases of leprosy are diagnosed each year at the French National Reference Center for mycobacteria. Patients are mainly immigrants from various endemic countries or residents of French overseas territories. We aimed to expand the data on the geographical distribution of the SNP genotypes of the M. leprae isolates from these patients. Methodology/Principal findings: Skin biopsies were obtained from 71 leprosy patients diagnosed between January 2009 and December 2013. Data regarding age, sex and place of birth and residence were also collected. Diagnosis of leprosy was confirmed by microscopic detection of acid-fast bacilli and/or amplification by PCR of the M. leprae-specific RLEP region. Single nucleotide polymorphisms (SNPs), present in the M. leprae genome at positions 14 676, 1 642 875 and 2 935 685, were determined with an efficiency of 94% (67/71). Almost all patients were from countries other than France where leprosy is still prevalent (n = 31) or from French overseas territories (n = 36) where leprosy is not totally eradicated, while only a minority (n = 4) were born in metropolitan France but had lived in other countries. SNP type 1 was predominant (n = 33), followed by type 3 (n = 17), type 4 (n = 11) and type 2 (n = 6). SNP types were concordant with those previously reported as prevalent in the patients’ countries of birth. SNP types found in patients born in countries other than France (Comoros, Haiti, Benin, Congo, Sri Lanka) and French overseas territories (French Polynesia, Mayotte and La Réunion) not covered by previous work correlated well with geographical location and history of human settlements. Conclusions/Significance: The phylogenic analysis of M. leprae strains isolated in France strongly suggests that French leprosy cases are caused by SNP types that are (a) concordant with the geographic origin or residence of the patients (non-French countries, French overseas territories

  4. Comparing the efficacy of SNP filtering methods for identifying a single causal SNP in a known association region.

    PubMed

    Spencer, Amy Victoria; Cox, Angela; Walters, Kevin

    2014-01-01

    Genome-wide association studies have successfully identified associations between common diseases and a large number of single nucleotide polymorphisms (SNPs) across the genome. We investigate the effectiveness of several statistics, including p-values, likelihoods, genetic map distance and linkage disequilibrium between SNPs, in filtering SNPs in several disease-associated regions. We use simulated data to compare the efficacy of filters with different sample sizes and for causal SNPs with different minor allele frequencies (MAFs) and effect sizes, focusing on the small effect sizes and MAFs likely to represent the majority of unidentified causal SNPs. In our analyses, of all the methods investigated, filtering on the ranked likelihoods consistently retains the true causal SNP with the highest probability for a given false positive rate. This was the case for all the local linkage disequilibrium patterns investigated. Our results indicate that when using this method to retain only the top 5% of SNPs, even a causal SNP with an odds ratio of 1.1 and MAF of 0.08 can be retained with a probability exceeding 0.9 using an overall sample size of 50,000.
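
    The ranked-likelihood filter described in this abstract can be sketched as follows. This is a minimal illustration only: the function name, the dictionary of per-SNP likelihoods and the SNP identifiers are all hypothetical, and the abstract does not specify how the authors compute or rank their likelihoods.

```python
import math


def retain_top_fraction(likelihoods, fraction=0.05):
    """Return the SNP ids with the highest likelihoods.

    `likelihoods` maps a SNP id to a likelihood score (hypothetical input;
    the source does not describe the exact statistic). `fraction` is the
    share of SNPs to retain, e.g. 0.05 for the top 5%.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    # Rank SNP ids from highest to lowest likelihood.
    ranked = sorted(likelihoods, key=likelihoods.get, reverse=True)
    # Round up so that at least one SNP is always retained.
    keep = max(1, math.ceil(len(ranked) * fraction))
    return ranked[:keep]


# Hypothetical region with 100 SNPs whose scores simply increase with the index.
example_scores = {f"snp{i}": float(i) for i in range(100)}
retained = retain_top_fraction(example_scores, fraction=0.05)
print(retained)
```

    With 100 candidate SNPs and a 5% threshold, five SNPs survive the filter; whether the causal SNP is among them is what the simulations in the abstract evaluate.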

  5. A HapMap leads to a Capsicum annuum SNP infinium array: a new tool for pepper breeding

    PubMed Central

    Hulse-Kemp, Amanda M; Ashrafi, Hamid; Plieske, Joerg; Lemm, Jana; Stoffel, Kevin; Hill, Theresa; Luerssen, Hartmut; Pethiyagoda, Charit L; Lawley, Cindy T; Ganal, Martin W; Van Deynze, Allen

    2016-01-01

    The Capsicum genus (Pepper) is a part of the Solanaceae family. It has been important in many cultures worldwide for its key nutritional components and uses as spices, medicines, ornamentals and vegetables. Worldwide population growth is associated with demand for more nutritionally valuable vegetables while contending with decreasing resources and available land. These conditions require increased efficiency in pepper breeding to deal with these imminent challenges. Through resequencing of inbred lines we have completed a valuable haplotype map (HapMap) for the pepper genome based on single-nucleotide polymorphisms (SNP). The identified SNPs were annotated and classified based on their gene annotation in the pepper draft genome sequence and phenotype of the sequenced inbred lines. A selection of one marker per gene model was utilized to create the PepperSNP16K array, which simultaneously genotyped 16 405 SNPs, of which 90.7% were found to be informative. A set of 84 inbred and hybrid lines and a mapping population of 90 interspecific F2 individuals were utilized to validate the array. Diversity analysis of the inbred lines shows a distinct separation of bell versus chile/hot pepper types and separates them into five distinct germplasm groups. The interspecific population created between Tabasco (C. frutescens chile type) and P4 (C. annuum blocky type) produced a linkage map with 5546 markers separated into 1361 bins on 12 linkage groups representing 1392.3 cM. This publicly available genotyping platform can be used to rapidly assess a large number of markers in a reproducible high-throughput manner for pepper. As a standardized tool for genetic analyses, the PepperSNP16K can be used worldwide to share findings and analyze QTLs for important traits leading to continued improvement of pepper for consumers. Data and information on the array are available through the Solanaceae Genomics Network. PMID:27602231

  9. Integrating fMRI and SNP data for biomarker identification for schizophrenia with a sparse representation based variable selection method

    PubMed Central

    2013-01-01

    Background: In recent years, both single-nucleotide polymorphism (SNP) arrays and functional magnetic resonance imaging (fMRI) have been widely used for the study of schizophrenia (SCZ). In addition, a few studies have been reported integrating both SNP data and fMRI data for comprehensive analysis. Methods: In this study, a novel sparse representation based variable selection (SRVS) method has been proposed and tested on a simulation data set to demonstrate its multi-resolution properties. Then the SRVS method was applied to an integrative analysis of two different SCZ data sets, a single-nucleotide polymorphism (SNP) data set and a functional magnetic resonance imaging (fMRI) data set, including 92 cases and 116 controls. Biomarkers for the disease were identified and validated with a multivariate classification approach followed by leave-one-out (LOO) cross-validation. Then we compared the results with those of a previously reported sparse representation based feature selection method. Results: Results showed that biomarkers from our proposed SRVS method gave significantly higher classification accuracy in discriminating SCZ patients from healthy controls than the previously reported sparse representation method. Furthermore, using biomarkers from both data sets led to better classification accuracy than using a single type of biomarker, which suggests the advantage of integrative analysis of different types of data. Conclusions: The proposed SRVS algorithm is effective in identifying significant biomarkers for complicated diseases such as SCZ. Integrating different types of data (e.g. SNP and fMRI data) may identify complementary biomarkers benefitting the diagnostic accuracy of the disease. PMID:24565219

  10. Molecular cloning and SNP association analysis of chicken PMCH gene.

    PubMed

    Sun, Guirong; Li, Ming; Li, Hong; Tian, Yadong; Chen, Qixin; Bai, Yichun; Kang, Xiangtao

    2013-08-01

    The pre-melanin-concentrating hormone (PMCH) gene is functionally important in the regulation of body fat content, feeding behavior and energy balance. In this study, the full-length cDNA of the chicken PMCH gene was amplified by the SMART RACE method. The single nucleotide polymorphisms (SNPs) in the PMCH gene were screened by comparative sequence analysis. Genotyping was designed first for the non-synonymous coding SNPs (ncSNPs) obtained, and their effects on growth, carcass characteristics and meat quality traits were investigated by AluI CRS-PCR-RFLP in the F2 resource population of Gushi chicken crossed with Anak broiler. Our results indicated that the cDNA of chicken PMCH shared 67.25 and 66.47% homology with that of human and bovine PMCH, respectively. The deduced amino acid sequence of chicken PMCH (163 amino acids) was 52.07 and 50.89% identical to those of human and bovine PMCH, respectively. The PMCH protein sequence is predicted to have several functional domains, including pro-MCH, CSP, IL7, XPGI and some low complexity sequence. It has 8 phosphorylation sites and no signal peptide sequence. Targeting sites for the microRNAs gga-miR-18a, gga-miR-18b and gga-miR-499 were predicted in the 3' untranslated region of chicken PMCH mRNA. In addition, a total of seven SNPs, including an ncSNP and a synonymous coding SNP, were identified in the PMCH gene. The ncSNP c.81 A>T was found to be in a moderately polymorphic state (polymorphic index=0.365), and the frequencies for genotypes AA, AB and BB were 0.3648, 0.4682 and 0.1670, respectively. Significant associations between the locus and shear force of breast and leg were observed. This polymorphic site may serve as a useful target for the marker assisted selection of the growth and meat quality traits in chicken.
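
    As a worked check of the figures in this abstract, the sketch below derives allele frequencies from the reported genotype frequencies and computes the polymorphism information content (PIC) for a biallelic locus. It assumes that the abstract's "polymorphic index" refers to the standard PIC formula, which the abstract does not state; the function names are illustrative. Under that assumption the reported value of 0.365 is reproduced.

```python
def allele_frequencies(f_aa, f_ab, f_bb):
    """Allele frequencies (p for A, q for B) from genotype frequencies."""
    total = f_aa + f_ab + f_bb
    if abs(total - 1.0) > 1e-3:
        raise ValueError("genotype frequencies must sum to 1")
    # Each heterozygote carries one copy of each allele.
    p = f_aa + f_ab / 2
    q = f_bb + f_ab / 2
    return p, q


def pic_biallelic(p, q):
    """Polymorphism information content for a two-allele locus.

    PIC = 1 - (p^2 + q^2) - 2 * p^2 * q^2
    (assumed to be the "polymorphic index" of the abstract).
    """
    return 1 - (p * p + q * q) - 2 * p * p * q * q


# Genotype frequencies for AA, AB and BB as reported in the abstract.
p, q = allele_frequencies(0.3648, 0.4682, 0.1670)
pic = pic_biallelic(p, q)
print(f"p = {p:.4f}, q = {q:.4f}, PIC = {pic:.3f}")
```

    A PIC between 0.25 and 0.5 is conventionally described as moderately polymorphic, which is consistent with the wording of the abstract.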

  11. SNP Array Karyotyping Allows for the Detection of Uniparental Disomy and Cryptic Chromosomal Abnormalities in MDS/MPD-U and MPD

    PubMed Central

    Gondek, Lukasz P.; Dunbar, Andrew J.; Szpurka, Hadrian; McDevitt, Michael A.; Maciejewski, Jaroslaw P.

    2007-01-01

    We applied single nucleotide polymorphism arrays (SNP-A) to study karyotypic abnormalities in patients with atypical myeloproliferative syndromes (MPD), including myeloproliferative/myelodysplastic syndrome overlap both positive and negative for the JAK2 V617F mutation and secondary acute myeloid leukemia (AML). In typical MPD cases (N = 8), which served as a control group, those with a homozygous V617F mutation showed clear uniparental disomy (UPD) of 9p using SNP-A. Consistent with possible genomic instability, in 19/30 MDS/MPD-U patients, we found additional lesions not identified by metaphase cytogenetics. In addition to UPD9p, we also have detected UPD affecting other chromosomes, including 1 (2/30), 11 (4/30), 12 (1/30) and 22 (1/30). Transformation to AML was observed in 8/30 patients. In 5 V617F+ patients who progressed to AML, we show that SNP-A can allow for the detection of two modes of transformation: leukemic blasts evolving from either a wild-type jak2 precursor carrying other acquired chromosomal defects, or from a V617F+ mutant progenitor characterized by UPD9p. SNP-A-based detection of cryptic lesions in MDS/MPD-U may help explain the clinical heterogeneity of this disorder. PMID:18030353

  12. Eurasiaplex: a forensic SNP assay for differentiating European and South Asian ancestries.

    PubMed

    Phillips, C; Freire Aradas, A; Kriegel, A K; Fondevila, M; Bulbul, O; Santos, C; Serrulla Rech, F; Perez Carceles, M D; Carracedo, Á; Schneider, P M; Lareu, M V

    2013-05-01

    We have selected a set of single nucleotide polymorphisms (SNPs) with the specific aim of differentiating European and South Asian ancestries. The SNPs were combined into a 23-plex SNaPshot primer extension assay: Eurasiaplex, designed to complement an existing 34-plex forensic ancestry test with both marker sets occupying well-spaced genomic positions, enabling their combination as single profile submissions to the Bayesian Snipper forensic ancestry inference system. We analyzed the ability of Eurasiaplex plus 34-plex SNPs to assign ancestry to a total of 1648 profiles from 16 European, 7 Middle East, 13 Central-South Asian and 21 East Asian populations. Ancestry assignment likelihoods were estimated from Snipper using training sets of five-group data (three Eurasian groups, East Asian and African genotypes) and four-group data (Middle East genotypes removed). Five-group differentiations gave assignment success of 91% for NW European populations, 72% for Middle East populations and 39% for Central-South Asian populations, indicating Middle East individuals are not reliably differentiated from either Europeans or Central-South Asians. Four-group differentiations provided markedly improved assignment success rates of 97% for most continental Europeans tested (excluding Turkish and Adygei at the far eastern edge of Europe) and 95% for Central-South Asians, despite applying a probability threshold for the highest likelihood ratio above '100 times more likely'. As part of the assessment of the sensitivity of Eurasiaplex to analyze challenging forensic material we detail Eurasiaplex and 34-plex SNP typing to infer ancestry of a cranium recovered from the sea, achieving 82% SNP genotype completeness. Therefore, Eurasiaplex provides an informative and forensically robust approach to the differentiation of European and South Asian ancestries amongst Eurasian populations.

  13. Networks of intergenic long-range enhancers and snpRNAs drive castration-resistant phenotype of prostate cancer and contribute to pathogenesis of multiple common human disorders

    PubMed Central

    Glinskii, Anna B; Ma, Shuang; Ma, Jun; Grant, Denise; Lim, Chang-Uk; Guest, Ian; Sell, Stewart; Buttyan, Ralph

    2011-01-01

    The mechanistic relevance of intergenic disease-associated genetic loci (IDAGL) containing highly statistically significant disease-linked SNPs remains unknown. Here, we present experimental and clinical evidence supporting the importance of the role of IDAGL in human diseases. A targeted RT-PCR screen coupled with sequencing of purified PCR products detects widespread transcription at multiple IDAGL and identifies 96 small noncoding trans-regulatory RNAs of ∼100–300 nt in length containing SNPs (snpRNAs) associated with 21 common disorders. Multiple independent lines of experimental evidence support functionality of snpRNAs by documenting their cell type-specific expression and evolutionary conservation of sequences, genomic coordinates and biological effects. Chromatin state signatures, expression profiling experiments and luciferase reporter assays demonstrate that many IDAGL are Polycomb-regulated long-range enhancers. Expression of snpRNAs in human and mouse cells markedly affects cellular behavior and induces allele-specific clinically relevant phenotypic changes: NLRP1-locus snpRNAs rs2670660 exert regulatory effects on monocyte/macrophage transdifferentiation, induce prostate cancer (PC) susceptibility snpRNAs and transform low-malignancy hormone-dependent human PC cells into highly malignant androgen-independent PC. Q-PCR analysis and luciferase reporter assays demonstrate that snpRNA sequences represent allele-specific “decoy” targets of microRNAs that function as SNP allele-specific modifiers of microRNA expression and activity. We demonstrate that trans-acting RNA molecules facilitating resistance to androgen depletion (RAD) in vitro and castration-resistant phenotype (CRP) in vivo of PC contain intergenic 8q24-locus SNP variants (rs1447295; rs16901979; rs6983267) that were recently linked with increased risk of PC. Q-PCR analysis of clinical samples reveals markedly increased and highly concordant (r = 0.896; p < 0.0001) snpRNA expression

  14. A Coordinated Approach to Peach SNP Discovery in RosBREED

    Technology Transfer Automated Retrieval System (TEKTRAN)

    In the USDA-funded multi-institutional and trans-disciplinary project, “RosBREED”, crop-specific SNP genome scan platforms are being developed for peach, apple, strawberry, and cherry at a resolution of at least one polymorphic SNP marker every 5 cM in any random cross, for use in Pedigree-Based Ana...

  15. Genome-wide copy number variations using SNP genotyping in a mixed breed swine population

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Copy number variations (CNVs) are increasingly understood to affect phenotypic variation. This study uses SNP genotyping of trios of mixed breed swine to add to the catalog of known genotypic variation in an important agricultural animal. Porcine SNP60 BeadChip genotypes were collected from 1802 pi...

  16. Development and Applications of a Bovine 50,000 SNP Chip

    Technology Transfer Automated Retrieval System (TEKTRAN)

    To develop an Illumina iSelect high density single nucleotide polymorphism (SNP) assay for cattle, the collaborative iBMC (Illumina, USDA ARS Beltsville, University of Missouri, USDA ARS Clay Center) Consortium first performed a de novo SNP discovery project in which genomic reduced representation l...

  17. A new SNP panel for evaluating genetic diversity in a composite cattle breed

    Technology Transfer Automated Retrieval System (TEKTRAN)

    A custom 60K SNP panel, extracted from Bovine HD SNP chip was used to evaluate genotypic frequency changes in Braford (BF, a composite breed) when compared to progenitor breeds: Hereford (HF), Brahman (BR), and Nelore (NE). Samples from both the U. S. and Brazil were used. The new panel differentiat...

  18. The development and characterization of a 60K SNP chip for chicken

    Technology Transfer Automated Retrieval System (TEKTRAN)

    In livestock species like the chicken, high throughput SNP genotyping assays are increasingly being used for whole genome association studies and as a tool in breeding (referred to as genomic selection). We describe the design of a moderate density (60K) Illumina SNP BeadChip in chicken consisting o...

  19. SNP discovery and allele frequency estimation by deep sequencing of reduced representation libraries

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Genome projects routinely produce draft sequences for species from diverse evolutionary clades, but generally do not create single nucleotide polymorphism (SNP) resources. We present an approach for de novo SNP discovery based on short-read sequencing of reduced representation libraries (RRL) to ge...

  20. SNP-VISTA: An Interactive SNPs Visualization Tool

    SciTech Connect

    Shah, Nameeta; Teplitsky, Michael V.; Pennacchio, Len A.; Hugenholtz, Philip; Hamann, Bernd; Dubchak, Inna L.

    2005-07-05

    Recent advances in sequencing technologies promise better diagnostics for many diseases as well as better understanding of evolution of microbial populations. Single Nucleotide Polymorphisms (SNPs) are established genetic markers that aid in the identification of loci affecting quantitative traits and/or disease in a wide variety of eukaryotic species. With today's technological capabilities, it is possible to re-sequence a large set of appropriate candidate genes in individuals with a given disease and then screen for causative mutations. In addition, SNPs have been used extensively in efforts to study the evolution of microbial populations, and the recent application of random shotgun sequencing to environmental samples makes possible more extensive SNP analysis of co-occurring and co-evolving microbial populations. The program is available at http://genome.lbl.gov/vista/snpvista.

  1. SNPsyn: detection and exploration of SNP–SNP interactions

    PubMed Central

    Curk, Tomaz; Rot, Gregor; Zupan, Blaz

    2011-01-01

    SNPsyn (http://snpsyn.biolab.si) is an interactive software tool for the discovery of synergistic pairs of single nucleotide polymorphisms (SNPs) from large genome-wide case-control association studies (GWAS) data on complex diseases. Synergy among SNPs is estimated using an information-theoretic approach called interaction analysis. SNPsyn is both a stand-alone C++/Flash application and a web server. The computationally intensive part is implemented in C++ and can run in parallel on a dedicated cluster or grid. The graphical user interface is written in Adobe Flash Builder 4 and can run in most web browsers or as a stand-alone application. The SNPsyn web server hosts the Flash application, receives GWAS data submissions, invokes the interaction analysis and serves result files. The user can explore details on identified synergistic pairs of SNPs, perform gene set enrichment analysis and interact with the constructed SNP synergy network. PMID:21576219
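
    The information-theoretic idea of synergy between two SNPs can be illustrated with a short sketch. It uses one common definition, the information the SNP pair carries about the phenotype beyond what each SNP carries alone; that this matches SNPsyn's exact estimator is an assumption, since the abstract does not give the formula. All function names and the toy genotype data are hypothetical, and nothing here uses SNPsyn's own code.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a list of discrete observations."""
    n = len(values)
    return -sum(c / n * log2(c / n) for c in Counter(values).values())


def mutual_information(xs, ys):
    """I(X;Y) = H(X) + H(Y) - H(X,Y), estimated from paired samples."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))


def synergy(snp_a, snp_b, phenotype):
    """Information gained about the phenotype only by looking at both SNPs jointly.

    Assumed definition: I(A,B;Y) - I(A;Y) - I(B;Y). Positive values
    indicate synergy, negative values indicate redundancy.
    """
    pair = list(zip(snp_a, snp_b))
    return (mutual_information(pair, phenotype)
            - mutual_information(snp_a, phenotype)
            - mutual_information(snp_b, phenotype))


# Toy data: the phenotype is the XOR of the two (binary-coded) genotypes,
# so neither SNP is informative alone but the pair is fully informative.
snp_a = [0, 0, 1, 1]
snp_b = [0, 1, 0, 1]
phenotype = [0, 1, 1, 0]
toy_synergy = synergy(snp_a, snp_b, phenotype)
print(toy_synergy)
```

    Scoring every SNP pair in a genome-wide data set this way is quadratic in the number of SNPs, which is in line with the abstract's note that the computationally intensive part runs in parallel on a cluster or grid.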

  2. Expression and SNP association analysis of porcine FBXL4 gene.

    PubMed

    Li, Y; Yang, S L; Tang, Z L; Cui, W T; Mu, Y L; Chu, M X; Zhao, S H; Wu, Z F; Li, K; Peng, K M

    2010-01-01

    As an E3 ligase, the product of the FBXL4 gene is a member of the FBLs, the largest eukaryotic subfamily of F-box proteins; it can recognize certain substrates through particular protein-protein interaction domains. To investigate its functions, polymorphism and association analyses were performed. A partial cDNA of porcine FBXL4, 2384 bp long, was first cloned; the deduced protein comprises a conserved F-box domain from the 277th to the 332nd amino acid. The phylogenetic tree indicated that porcine FBXL4 is genetically closer to bovine FBXL4 than to that of the other selected animal species. Quantitative RT-PCR analysis showed that the expression level of porcine FBXL4 mRNA varied over a wide range across ten tissues. For the two identified SNPs, genotyping of the TaiI site showed that the TT genotype predominated in the introduced Landrace breed and in the miniature Guizhou and Wuzhishan breeds, whereas the CC genotype was more frequent than the other two genotypes in the miniature Laiwu breed. In the genotyping of the BsaJI site, the CC genotype was clearly more frequent than the other genotypes in two Chinese miniature pig breeds and in the introduced Landrace breed. Furthermore, the association analysis with immune traits and blood parameters revealed that SNP TaiI was significantly associated with the lymphocyte percentage (P = 0.0166) and the antibody levels for pseudorabies virus vaccination (P = 0.0001) of neonate piglets at 0 day. Meanwhile, SNP BsaJI was significantly associated with lymphocyte percentage of individuals at 32 days (P = 0.0351), and with neutrophil percentage (P = 0.0005), the absolute lymphocyte count (P = 0.0458), and the mixed cells (P = 0.0010) of neonate piglets at 0 day. PMID:19768576

  3. TNF-alpha SNP haplotype frequencies in equidae.

    PubMed

    Brown, J J; Ollier, W E R; Thomson, W; Matthews, J B; Carter, S D; Binns, M; Pinchbeck, G; Clegg, P D

    2006-05-01

    Tumour necrosis factor alpha (TNF-alpha) is a pro-inflammatory cytokine that plays a crucial role in the regulation of inflammatory and immune responses. In all vertebrate species the genes encoding TNF-alpha are located within the major histocompatibility complex. In the horse TNF-alpha has been ascribed a role in a variety of important disease processes. Previously two single nucleotide polymorphisms (SNPs) have been reported within the 5' un-translated region of the equine TNF-alpha gene. We have examined the equine TNF-alpha promoter region further for additional SNPs by analysing DNA from 131 horses (Equus caballus), 19 donkeys (E. asinus), 2 Grant's zebras (E. burchellii boehmi) and one onager (E. hemionus). Two further SNPs were identified at nucleotide positions 24 (T/G) and 452 (T/C) relative to the first nucleotide of the 522 bp polymerase chain reaction product. A sequence variant at position 51 was observed between equidae. SNaPSHOT genotyping assays for these and the two previously reported SNPs were performed on 457 horses comprising seven different breeds and 23 donkeys to determine the gene frequencies. SNP frequencies varied considerably between different horse breeds and also between the equine species. In total, nine different TNF-alpha promoter SNP haplotypes and their frequencies were established amongst the various equidae examined, with some haplotypes being found only in horses and others only in donkeys or zebras. The haplotype frequencies observed varied greatly between different horse breeds. Such haplotypes may relate to levels of TNF-alpha production and disease susceptibility and further investigation is required to identify associations between particular haplotypes and altered risk of disease.

  4. Genotyping NAT2 with only two SNPs (rs1041983 and rs1801280) outperforms the tagging SNP rs1495741 and is equivalent to the conventional 7-SNP NAT2 genotype.

    PubMed

    Selinski, Silvia; Blaszkewicz, Meinolf; Lehmann, Marie-Louise; Ovsiannikov, Daniel; Moormann, Oliver; Guballa, Christoph; Kress, Alexander; Truss, Michael C; Gerullis, Holger; Otto, Thomas; Barski, Dimitri; Niegisch, Günter; Albers, Peter; Frees, Sebastian; Brenner, Walburgis; Thüroff, Joachim W; Angeli-Greaves, Miriam; Seidel, Thilo; Roth, Gerhard; Dietrich, Holger; Ebbinghaus, Rainer; Prager, Hans M; Bolt, Hermann M; Falkenstein, Michael; Zimmermann, Anna; Klein, Torsten; Reckwitz, Thomas; Roemer, Hermann C; Löhlein, Dietrich; Weistenhöfer, Wobbeke; Schöps, Wolfgang; Hassan Rizvi, Syed Adibul; Aslam, Muhammad; Bánfi, Gergely; Romics, Imre; Steffens, Michael; Ekici, Arif B; Winterpacht, Andreas; Ickstadt, Katja; Schwender, Holger; Hengstler, Jan G; Golka, Klaus

    2011-10-01

    Genotyping N-acetyltransferase 2 (NAT2) is of high relevance for individualized dosing of antituberculosis drugs and bladder cancer epidemiology. In this study we compared a recently published tagging single nucleotide polymorphism (SNP) (rs1495741) to the conventional 7-SNP genotype (G191A, C282T, T341C, C481T, G590A, A803G and G857A haplotype pairs) and systematically analysed whether novel SNP combinations outperform the latter. For this purpose, we studied 3177 individuals by PCR and phenotyped 344 individuals by the caffeine test. Although the tagSNP and the 7-SNP genotype showed a high degree of correlation (R=0.933, P<0.0001), the 7-SNP genotype nevertheless outperformed the tagging SNP with respect to specificity (1.0 vs. 0.9444, P=0.0065). Considering all possible SNP combinations in a receiver operating characteristic analysis, we identified a 2-SNP genotype (C282T, T341C) that outperformed the tagging SNP and was equivalent to the 7-SNP genotype. The 2-SNP genotype predicted the correct phenotype with a sensitivity of 0.8643 and a specificity of 1.0. In addition, it predicted the 7-SNP genotype with sensitivity and specificity of 0.9993 and 0.9880, respectively. The prediction of the NAT2 genotype by the 2-SNP genotype performed similarly in populations of Caucasian, Venezuelan and Pakistani background. A 2-SNP genotype predicts NAT2 phenotypes with similar sensitivity and specificity as the conventional 7-SNP genotype. This procedure simplifies individualized dosing of NAT2 substrates without losing sensitivity or specificity.
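
    The sensitivity and specificity figures quoted above follow the usual confusion-matrix definitions. A minimal sketch, assuming the "positive" class is one acetylator phenotype and using made-up counts (the abstract does not report the underlying table; the function name is hypothetical):

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts.

    tp/fn: phenotype-positive individuals predicted positive/negative.
    tn/fp: phenotype-negative individuals predicted negative/positive.
    """
    sensitivity = tp / (tp + fn)  # fraction of true positives recovered
    specificity = tn / (tn + fp)  # fraction of true negatives recovered
    return sensitivity, specificity


# Hypothetical counts for illustration only; NOT taken from the study.
sens, spec = sensitivity_specificity(tp=90, fn=10, tn=50, fp=0)
print(f"sensitivity={sens:.4f} specificity={spec:.4f}")
```

    A specificity of 1.0, as reported for the 2-SNP and 7-SNP genotypes, corresponds to zero false positives in this layout.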

  5. Design and characterization of a 52K SNP chip for goats.

    PubMed

    Tosser-Klopp, Gwenola; Bardou, Philippe; Bouchez, Olivier; Cabau, Cédric; Crooijmans, Richard; Dong, Yang; Donnadieu-Tonon, Cécile; Eggen, André; Heuven, Henri C M; Jamli, Saadiah; Jiken, Abdullah Johari; Klopp, Christophe; Lawley, Cynthia T; McEwan, John; Martin, Patrice; Moreno, Carole R; Mulsant, Philippe; Nabihoudine, Ibouniyamine; Pailhoux, Eric; Palhière, Isabelle; Rupp, Rachel; Sarry, Julien; Sayre, Brian L; Tircazes, Aurélie; Wang, Jun; Wang, Wen; Zhang, Wenguang

    2014-01-01

    The success of Genome Wide Association Studies in the discovery of sequence variation linked to complex traits in humans has increased interest in high throughput SNP genotyping assays in livestock species. Primary goals are QTL detection and genomic selection. The purpose here was to design a 50,000-60,000 SNP chip for goats. The success of a moderate density SNP assay depends on reliable bioinformatic SNP detection procedures, the technological success rate of the SNP design, even spacing of SNPs on the genome and selection of Minor Allele Frequencies (MAF) suitable for use in diverse breeds. Through the federation of three SNP discovery projects consolidated as the International Goat Genome Consortium, we have identified approximately twelve million high quality SNP variants in the goat genome, stored in a database together with their biological and technical characteristics. These SNPs were identified within and between six breeds (meat, milk and mixed): Alpine, Boer, Creole, Katjang, Saanen and Savanna, comprising a total of 97 animals. Whole genome and Reduced Representation Library sequences were aligned on >10 kb scaffolds of the de novo goat genome assembly. The 60,000 selected SNPs, evenly spaced on the goat genome, were submitted for oligo manufacturing (Illumina, Inc) and published in dbSNP along with flanking sequences and map position on goat assemblies (i.e. scaffolds and pseudo-chromosomes), sheep genome V2 and cattle UMD3.1 assembly. Ten breeds were then used to validate the SNP content, and 52,295 loci could be successfully genotyped and used to generate a final cluster file. The combined strategy of using mainly whole genome Next Generation Sequencing and mapping on a contig genome assembly, complemented with Illumina design tools, proved to be efficient in producing this GoatSNP50 chip. Advances in use of molecular markers are expected to accelerate goat genomic studies in coming years.

  6. A Customized Pigmentation SNP Array Identifies a Novel SNP Associated with Melanoma Predisposition in the SLC45A2 Gene

    PubMed Central

    Alonso, Santos; Boyano, M. Dolores; Peña-Chilet, Maria; Pita, Guillermo; Aviles, Jose A.; Mayor, Matias; Gomez-Fernandez, Cristina; Casado, Beatriz; Martin-Gonzalez, Manuel; Izagirre, Neskuts; De la Rua, Concepcion; Asumendi, Aintzane; Perez-Yarza, Gorka; Arroyo-Berdugo, Yoana; Boldo, Enrique; Lozoya, Rafael; Torrijos-Aguilar, Arantxa; Pitarch, Ana; Pitarch, Gerard; Sanchez-Motilla, Jose M.; Valcuende-Cavero, Francisca; Tomas-Cabedo, Gloria; Perez-Pastor, Gemma; Diaz-Perez, Jose L.; Gardeazabal, Jesus; de Lizarduy, Iñigo Martinez; Sanchez-Diez, Ana; Valdes, Carlos; Pizarro, Angel; Casado, Mariano; Carretero, Gregorio; Botella-Estrada, Rafael; Nagore, Eduardo; Lazaro, Pablo; Lluch, Ana; Benitez, Javier; Martinez-Cadenas, Conrado; Ribas, Gloria

    2011-01-01

    As the incidence of Malignant Melanoma (MM) reflects an interaction between skin colour and UV exposure, variations in genes implicated in pigmentation and tanning response to UV may be associated with susceptibility to MM. In this study, 363 SNPs in 65 gene regions belonging to the pigmentation pathway have been successfully genotyped using a SNP array. Five hundred and ninety MM cases and 507 controls were analyzed in a discovery phase I. Ten candidate SNPs based on a p-value threshold of 0.01 were identified. Two of them, rs35414 (SLC45A2) and rs2069398 (SILV/CDK2), were statistically significant after conservative Bonferroni correction. The best six SNPs were further tested in an independent Spanish series (624 MM cases and 789 controls). A novel SNP located on the SLC45A2 gene (rs35414) was found to be significantly associated with melanoma in both phase I and phase II (P<0.0001). None of the other five SNPs were replicated in this second phase of the study. However, three SNPs in TYR, SILV/CDK2 and ADAMTS20 genes (rs17793678, rs2069398 and rs1510521 respectively) had an overall p-value<0.05 when considering the whole DNA collection (1214 MM cases and 1296 controls). Both the SLC45A2 and the SILV/CDK2 variants behave as protective alleles, while the TYR and ADAMTS20 variants seem to function as risk alleles. Cumulative effects were detected when these four variants were considered together. Furthermore, individuals carrying two or more mutations in MC1R, a well-known low penetrance melanoma-predisposing gene, had a decreased MM risk if concurrently bearing the SLC45A2 protective variant. To our knowledge, this is the largest study on Spanish sporadic MM cases to date. PMID:21559390
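
    The Bonferroni correction mentioned above divides the family-wise significance level by the number of tests. A minimal sketch, assuming a family-wise level of 0.05 and one test per genotyped SNP (363); the abstract does not state which values the authors actually used, and the function names are hypothetical:

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test p-value threshold that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


def significant_after_bonferroni(p_values, alpha=0.05):
    """Flag each p-value that passes the Bonferroni-corrected threshold."""
    threshold = bonferroni_threshold(alpha, len(p_values))
    return [p <= threshold for p in p_values]


# Assumed inputs: alpha = 0.05 and 363 tests (one per SNP on the array).
threshold = bonferroni_threshold(0.05, 363)
print(f"{threshold:.2e}")  # 1.38e-04
```

    Under these assumptions a SNP would need a p-value of roughly 1.4e-4 or smaller to remain significant, which is far stricter than the 0.01 screening threshold used to pick the ten candidates.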

  7. Exploring the structural and functional effect of pRB by significant nsSNP in the coding region of RB1 gene causing retinoblastoma.

    PubMed

    Rajasekaran, R; Sethumadhavan, Rao

    2010-02-01

    In this study, we identified the most deleterious nsSNP in the RB1 gene through the structural and functional properties of its protein (pRB) and investigated its binding affinity with E2F-2. Out of 956 SNPs, we investigated 12 nsSNPs in the coding region, three of which (SNP ids rs3092895, rs3092903 and rs3092905) were consistently predicted to be damaging by the I-Mutant 2.0, SIFT and PolyPhen programs. We then modeled the mutant pRB proteins based on these deleterious nsSNPs. From a comparison of total energy, stabilizing residues and RMSD of these three mutant proteins with native pRB protein, we identified that the major mutation is from glutamic acid to glycine at residue position 746 of pRB. Further, we compared the binding efficiency of both native and mutant pRB (E746G) with E2F-2. We found that mutant pRB has less binding affinity with E2F-2 than the native type. This is because sixteen hydrogen bonds and two salt bridges exist between the native type and E2F-2, whereas the mutant type makes only thirteen hydrogen bonds and one salt bridge with E2F-2. Based on our investigation, we propose that the SNP with id rs3092905 could be the most deleterious nsSNP in the RB1 gene causing retinoblastoma.

  8. DBDiaSNP: An Open-Source Knowledgebase of Genetic Polymorphisms and Resistance Genes Related to Diarrheal Pathogens

    PubMed Central

    Mehla, Kusum

    2015-01-01

    Diarrhea is a highly common infection among children, responsible for significant morbidity and mortality worldwide. After pneumonia, diarrhea remains the second leading cause of neonatal deaths. Numerous viral, bacterial, and parasitic enteric pathogens are associated with diarrhea. With increasing antibiotic resistance among enteric pathogens, there is an urgent need for global surveillance of the mutations and resistance genes primarily responsible for resistance to antibiotic treatment. Single Nucleotide Polymorphisms are important in this regard as they have a vast potential to be utilized as molecular diagnostics for gene-disease or pharmacogenomics association studies linking genotype to phenotype. DBDiaSNP is a comprehensive repository of mutations and resistance genes among various diarrheal pathogens and hosts to advance breakthroughs that will find applications from development of sequence-based diagnostic tools to drug discovery. It contains information about 946 mutations and 326 resistance genes compiled from literature and various web resources. As of March 2015, it houses various pathogen genes and the mutations responsible for antibiotic resistance. The pathogens include, for example, DEC (Diarrheagenic E. coli), Salmonella spp., Campylobacter spp., Shigella spp., Clostridium difficile, Aeromonas spp., Helicobacter pylori, Entamoeba histolytica, Vibrio cholerae, and viruses. It also includes mutations from hosts (e.g., humans, pigs, others) that render them either susceptible or resistant to a certain type of diarrhea. DBDiaSNP is therefore intended as an integrated open access database for researchers and clinicians working on diarrheal diseases. Additionally, we note that the DBDiaSNP is one of the first antibiotic resistance databases for the diarrheal pathogens covering mutations and resistance genes that have clinical relevance from a broad range of pathogens and hosts. For future translational research involving integrative

  9. Sequential sentinel SNP Regional Association Plots (SSS-RAP): an approach for testing independence of SNP association signals using meta-analysis data.

    PubMed

    Zheng, Jie; Gaunt, Tom R; Day, Ian N M

    2013-01-01

    Genome-Wide Association Studies (GWAS) frequently incorporate meta-analysis within their framework. However, conditional analysis of individual-level data, which is an established approach for fine mapping of causal sites, is often precluded where only group-level summary data are available for analysis. Here, we present a numerical and graphical approach, "sequential sentinel SNP regional association plot" (SSS-RAP), which estimates regression coefficients (beta) with their standard errors using the meta-analysis summary results directly. Under an additive model, typical for genes with small effect, the effect for a sentinel SNP can be transformed to the predicted effect for a possibly dependent SNP through a 2×2 2-SNP haplotypes table. The approach assumes Hardy-Weinberg equilibrium for test SNPs. SSS-RAP is available as a Web-tool (http://apps.biocompute.org.uk/sssrap/sssrap.cgi). To develop and illustrate SSS-RAP we analyzed lipid and ECG traits data from the British Women's Heart and Health Study (BWHHS), evaluated a meta-analysis for ECG trait and presented several simulations. We compared results with existing approaches such as model selection methods and conditional analysis. Generally findings were consistent. SSS-RAP represents a tool for testing independence of SNP association signals using meta-analysis data, and is also a convenient approach based on biological principles for fine mapping in group level summary data.
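
    The 2×2 two-SNP haplotype table referred to above can be illustrated with the textbook parameterisation by allele frequencies and the linkage disequilibrium coefficient D. This is a generic sketch, not the SSS-RAP implementation; the function names and the example frequencies are assumptions for illustration:

```python
def haplotype_table(p_a, p_b, d):
    """2x2 haplotype frequencies for two biallelic SNPs.

    p_a: frequency of allele A at the first SNP (a is the other allele).
    p_b: frequency of allele B at the second SNP (b is the other allele).
    d:   linkage disequilibrium coefficient D = f(AB) - p_a * p_b.
    """
    return {
        ("A", "B"): p_a * p_b + d,
        ("A", "b"): p_a * (1 - p_b) - d,
        ("a", "B"): (1 - p_a) * p_b - d,
        ("a", "b"): (1 - p_a) * (1 - p_b) + d,
    }


def r_squared(p_a, p_b, d):
    """Squared correlation between the two SNPs' alleles."""
    return d ** 2 / (p_a * (1 - p_a) * p_b * (1 - p_b))


# Hypothetical frequencies, for illustration only.
table = haplotype_table(p_a=0.3, p_b=0.4, d=0.05)
for haplotype, freq in table.items():
    print(haplotype, round(freq, 4))
print("r^2 =", round(r_squared(0.3, 0.4, 0.05), 4))
```

    With D = 0 the table reduces to the product of the allele frequencies, i.e. the two SNPs are independent; the further a sentinel SNP and a test SNP are from that case, the more of the test SNP's apparent effect can be explained by the sentinel.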

  10. High-Throughput DNA Array for SNP Detection of KRAS Gene Using a Centrifugal Microfluidic Device.

    PubMed

    Sedighi, Abootaleb; Li, Paul C H

    2016-01-01

    Here, we describe detection of single nucleotide polymorphism (SNP) in genomic DNA samples using a NanoBioArray (NBA) chip. Fast DNA hybridization is achieved in the chip when target DNAs are introduced to the surface-arrayed probes using centrifugal force. Gold nanoparticles (AuNPs) are used to assist SNP detection at room temperature. The parallel setting of sample introduction in the spiral channels of the NBA chip enables multiple analyses on many samples, resulting in a technique appropriate for high-throughput SNP detection. The experimental procedure, including chip fabrication, probe array printing, DNA amplification, hybridization, signal detection, and data analysis, is described in detail.

  11. Sturgeon conservation genomics: SNP discovery and validation using RAD sequencing.

    PubMed

    Ogden, R; Gharbi, K; Mugue, N; Martinsohn, J; Senn, H; Davey, J W; Pourkazemi, M; McEwing, R; Eland, C; Vidotto, M; Sergeev, A; Congiu, L

    2013-06-01

    Caviar-producing sturgeons belonging to the genus Acipenser are considered to be one of the most endangered species groups in the world. Continued overfishing in spite of increasing legislation, zero catch quotas and extensive aquaculture production have led to the collapse of wild stocks across Europe and Asia. The evolutionary relationships among Adriatic, Russian, Persian and Siberian sturgeons are complex because of past introgression events and remain poorly understood. Conservation management, traceability and enforcement suffer a lack of appropriate DNA markers for the genetic identification of sturgeon at the species, population and individual level. This study employed RAD sequencing to discover and characterize single nucleotide polymorphism (SNP) DNA markers for use in sturgeon conservation in these four tetraploid species over three biological levels, using a single sequencing lane. Four population meta-samples and eight individual samples from one family were barcoded separately before sequencing. Analysis of 14.4 Gb of paired-end RAD data focused on the identification of SNPs in the paired-end contig, with subsequent in silico and empirical validation of candidate markers. Thousands of putatively informative markers were identified including, for the first time, SNPs that show population-wide differentiation between Russian and Persian sturgeons, representing an important advance in our ability to manage these cryptic species. The results highlight the challenges of genotyping-by-sequencing in polyploid taxa, while establishing the potential genetic resources for developing a new range of caviar traceability and enforcement tools. PMID:23473098

  12. Porcine colonization of the Americas: a 60k SNP story

    PubMed Central

    Burgos-Paz, W; Souza, C A; Megens, H J; Ramayo-Caldas, Y; Melo, M; Lemús-Flores, C; Caal, E; Soto, H W; Martínez, R; Álvarez, L A; Aguirre, L; Iñiguez, V; Revidatti, M A; Martínez-López, O R; Llambi, S; Esteve-Codina, A; Rodríguez, M C; Crooijmans, R P M A; Paiva, S R; Schook, L B; Groenen, M A M; Pérez-Enciso, M

    2013-01-01

    The pig, Sus scrofa, is a foreign species to the American continent. Although pigs originally introduced in the Americas should be related to those from the Iberian Peninsula and Canary islands, the phylogeny of current creole pigs that now populate the continent is likely to be very complex. Because of the extreme climates that America harbors, these populations also provide a unique example of a fast evolutionary phenomenon of adaptation. Here, we provide a genome wide study of these issues by genotyping, with a 60k SNP chip, 206 village pigs sampled across 14 countries and 183 pigs from outgroup breeds that are potential founders of the American populations, including wild boar, Iberian, international and Chinese breeds. Results show that American village pigs are primarily of European ancestry, although the observed genetic landscape is that of a complex conglomerate. There was no correlation between genetic and geographical distances, neither continent wide nor when analyzing specific areas. Most populations showed a clear admixed structure where the Iberian pig was not necessarily the main component, illustrating how international breeds, but also Chinese pigs, have contributed to extant genetic composition of American village pigs. We also observe that many genes related to the cardiovascular system show an increased differentiation between altiplano and genetically related pigs living near sea level. PMID:23250008

  13. Porcine colonization of the Americas: a 60k SNP story.

    PubMed

    Burgos-Paz, W; Souza, C A; Megens, H J; Ramayo-Caldas, Y; Melo, M; Lemús-Flores, C; Caal, E; Soto, H W; Martínez, R; Alvarez, L A; Aguirre, L; Iñiguez, V; Revidatti, M A; Martínez-López, O R; Llambi, S; Esteve-Codina, A; Rodríguez, M C; Crooijmans, R P M A; Paiva, S R; Schook, L B; Groenen, M A M; Pérez-Enciso, M

    2013-04-01

    The pig, Sus scrofa, is a foreign species to the American continent. Although pigs originally introduced in the Americas should be related to those from the Iberian Peninsula and Canary islands, the phylogeny of current creole pigs that now populate the continent is likely to be very complex. Because of the extreme climates that America harbors, these populations also provide a unique example of a fast evolutionary phenomenon of adaptation. Here, we provide a genome wide study of these issues by genotyping, with a 60k SNP chip, 206 village pigs sampled across 14 countries and 183 pigs from outgroup breeds that are potential founders of the American populations, including wild boar, Iberian, international and Chinese breeds. Results show that American village pigs are primarily of European ancestry, although the observed genetic landscape is that of a complex conglomerate. There was no correlation between genetic and geographical distances, neither continent wide nor when analyzing specific areas. Most populations showed a clear admixed structure where the Iberian pig was not necessarily the main component, illustrating how international breeds, but also Chinese pigs, have contributed to extant genetic composition of American village pigs. We also observe that many genes related to the cardiovascular system show an increased differentiation between altiplano and genetically related pigs living near sea level.

  15. Cluster-localized sparse logistic regression for SNP data.

    PubMed

    Binder, Harald; Müller, Tina; Schwender, Holger; Golka, Klaus; Steffens, Michael; Hengstler, Jan G; Ickstadt, Katja; Schumacher, Martin

    2012-08-14

    The task of analyzing high-dimensional single nucleotide polymorphism (SNP) data in a case-control design using multivariable techniques has only recently been tackled. While many available approaches investigate only main effects in a high-dimensional setting, we propose a more flexible technique, cluster-localized regression (CLR), based on localized logistic regression models, that allows different SNPs to have an effect for different groups of individuals. Separate multivariable regression models are fitted for the different groups of individuals by incorporating weights into componentwise boosting, which provides simultaneous variable selection, hence sparse fits. For model fitting, these groups of individuals are identified using a clustering approach, where each group may be defined via different SNPs. This allows for representing complex interaction patterns, such as compositional epistasis, that might not be detected by a single main effects model. In a simulation study, the CLR approach results in improved prediction performance, compared to the main effects approach, and identification of important SNPs in several scenarios. Improved prediction performance is also obtained for an application example considering urinary bladder cancer. Some of the identified SNPs are predictive for all individuals, while others are only relevant for a specific group. Together with the sets of SNPs that define the groups, potential interaction patterns are uncovered.

  16. Association of the calpain-10 gene with type 2 diabetes mellitus in a Mexican population.

    PubMed

    del Bosque-Plata, Laura; Aguilar-Salinas, Carlos A; Tusié-Luna, María Teresa; Ramírez-Jiménez, Salvador; Rodríguez-Torres, Maribel; Aurón-Gómez, Moisés; Ramírez, Erika; Velasco-Pérez, María Luisa; Ramírez-Silva, Alfredo; Gómez-Pérez, Francisco; Hanis, Craig L; Tsuchiya, Takafumi; Yoshiuchi, Issei; Cox, Nancy J; Bell, Graeme I

    2004-02-01

    Variation in the calpain-10 gene (CAPN10) has been associated with risk of type 2 diabetes in the Mexican American population of Starr County, Texas. We typed five polymorphisms in the calpain-10 gene (SNP-43, -44, -63, and -110 and Indel-19) to test for association with type 2 diabetes in 248 individuals representative of the mestizo population of Mexico City and Orizaba, Mexico, including 134 patients with type 2 diabetes and 114 subjects with normal fasting blood glucose levels. We found a significant difference in SNP-44 allele and genotype frequencies between type 2 diabetic and non-diabetic subjects. The rare allele at SNP-44 was associated with increased risk of type 2 diabetes (odds ratio (OR)=2.72, 95% confidence interval (CI)=1.16-6.35, P=0.017). SNP-110, which is in perfect linkage disequilibrium with SNP-44, was also associated with type 2 diabetes. The SNP-43, Indel-19, and SNP-63 haplogenotype 112/121, which is associated with significantly increased risk (OR=2.16, 95% CI=1.31-3.57) of type 2 diabetes in Mexican Americans, was not associated with significantly increased risk in Mexicans (OR=1.15, 95% CI=0.57-2.34). The results suggest that variation in CAPN10 affects risk of type 2 diabetes in the mestizo population of central Mexico (Mexico City and Orizaba) and in Mexican Americans (Starr County, Texas). PMID:14741193
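
    Odds ratios with 95% confidence intervals such as those quoted above are commonly derived from a 2×2 table of carriers and non-carriers among cases and controls. A minimal sketch using the standard log-scale (Woolf) interval; the abstract does not say which method the authors used, and the counts below are invented for illustration:

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and approximate 95% CI from a 2x2 table (Woolf method).

    a: cases carrying the allele      b: cases not carrying it
    c: controls carrying the allele   d: controls not carrying it
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts; NOT the study's data.
or_, lower, upper = odds_ratio_ci(a=20, b=80, c=10, d=90)
print(f"OR={or_:.2f}, 95% CI={lower:.2f}-{upper:.2f}")

# A log-scale interval is symmetric around log(OR), so the geometric mean
# of the reported bounds should land close to the reported odds ratio.
print(round(math.sqrt(1.16 * 6.35), 2))  # compare with the reported OR of 2.72
```

    The same check applied to the other two intervals in the abstract (1.31-3.57 and 0.57-2.34) gives values close to the reported odds ratios of 2.16 and 1.15, which is consistent with intervals computed on the log scale. An interval that includes 1, as in the last case, indicates no significant association.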

  17. Interim report on updated microarray probes for the LLNL Burkholderia pseudomallei SNP array

    SciTech Connect

    Gardner, S; Jaing, C

    2012-03-27

    The overall goal of this project is to forensically characterize 100 unknown Burkholderia isolates in the US-Australia collaboration. We will identify genome-wide single nucleotide polymorphisms (SNPs) from B. pseudomallei and near neighbor species including B. mallei, B. thailandensis and B. oklahomensis. We will design microarray probes to detect these SNP markers and analyze 100 Burkholderia genomic DNAs extracted from environmental, clinical and near neighbor isolates from Australian collaborators on the Burkholderia SNP microarray. We will analyze the microarray genotyping results to characterize the genetic diversity of these new isolates and triage the samples for whole genome sequencing. In this interim report, we describe the SNP analysis and the microarray probe design for the Burkholderia SNP microarray.

  18. Combined array CGH plus SNP genome analyses in a single assay for optimized clinical testing.

    PubMed

    Wiszniewska, Joanna; Bi, Weimin; Shaw, Chad; Stankiewicz, Pawel; Kang, Sung-Hae L; Pursley, Amber N; Lalani, Seema; Hixson, Patricia; Gambin, Tomasz; Tsai, Chun-hui; Bock, Hans-Georg; Descartes, Maria; Probst, Frank J; Scaglia, Fernando; Beaudet, Arthur L; Lupski, James R; Eng, Christine; Cheung, Sau Wai; Bacino, Carlos; Patel, Ankita

    2014-01-01

    In clinical diagnostics, both array comparative genomic hybridization (array CGH) and single nucleotide polymorphism (SNP) genotyping have proven to be powerful genomic technologies utilized for the evaluation of developmental delay, multiple congenital anomalies, and neuropsychiatric disorders. Differences in the ability to resolve genomic changes between these arrays may constitute an implementation challenge for clinicians: which platform (SNP vs array CGH) might best detect the underlying genetic cause for the disease in the patient? While only SNP arrays enable the detection of copy number neutral regions of absence of heterozygosity (AOH), they have limited ability to detect single-exon copy number variants (CNVs) due to the distribution of SNPs across the genome. To provide comprehensive clinical testing for both CNVs and copy-neutral AOH, we enhanced our custom-designed high-resolution oligonucleotide array that has exon-targeted coverage of 1860 genes with 60,000 SNP probes, referred to as Chromosomal Microarray Analysis - Comprehensive (CMA-COMP). Of the 3240 cases evaluated by this array, clinically significant CNVs were detected in 445 cases including 21 cases with exonic events. In addition, 162 cases (5.0%) showed at least one AOH region >10 Mb. We demonstrate that even though this array has a lower density of SNP probes than other commercially available SNP arrays, it reliably detected AOH events >10 Mb as well as exonic CNVs beyond the detection limitations of SNP genotyping. Thus, combining SNP probes and exon-targeted array CGH into one platform provides clinically useful genetic screening in an efficient manner.

  19. Combined array CGH plus SNP genome analyses in a single assay for optimized clinical testing

    PubMed Central

    Wiszniewska, Joanna; Bi, Weimin; Shaw, Chad; Stankiewicz, Pawel; Kang, Sung-Hae L; Pursley, Amber N; Lalani, Seema; Hixson, Patricia; Gambin, Tomasz; Tsai, Chun-hui; Bock, Hans-Georg; Descartes, Maria; Probst, Frank J; Scaglia, Fernando; Beaudet, Arthur L; Lupski, James R; Eng, Christine; Cheung, Sau Wai; Bacino, Carlos; Patel, Ankita

    2014-01-01

    In clinical diagnostics, both array comparative genomic hybridization (array CGH) and single nucleotide polymorphism (SNP) genotyping have proven to be powerful genomic technologies utilized for the evaluation of developmental delay, multiple congenital anomalies, and neuropsychiatric disorders. Differences in the ability to resolve genomic changes between these arrays may constitute an implementation challenge for clinicians: which platform (SNP vs array CGH) might best detect the underlying genetic cause for the disease in the patient? While only SNP arrays enable the detection of copy number neutral regions of absence of heterozygosity (AOH), they have limited ability to detect single-exon copy number variants (CNVs) due to the distribution of SNPs across the genome. To provide comprehensive clinical testing for both CNVs and copy-neutral AOH, we enhanced our custom-designed high-resolution oligonucleotide array that has exon-targeted coverage of 1860 genes with 60 000 SNP probes, referred to as Chromosomal Microarray Analysis – Comprehensive (CMA-COMP). Of the 3240 cases evaluated by this array, clinically significant CNVs were detected in 445 cases including 21 cases with exonic events. In addition, 162 cases (5.0%) showed at least one AOH region >10 Mb. We demonstrate that even though this array has a lower density of SNP probes than other commercially available SNP arrays, it reliably detected AOH events >10 Mb as well as exonic CNVs beyond the detection limitations of SNP genotyping. Thus, combining SNP probes and exon-targeted array CGH into one platform provides clinically useful genetic screening in an efficient manner. PMID:23695279

  20. Automated SNP detection in expressed sequence tags: statistical considerations and application to maritime pine sequences.

    PubMed

    Dantec, Loïck Le; Chagné, David; Pot, David; Cantin, Olivier; Garnier-Géré, Pauline; Bedon, Frank; Frigerio, Jean-Marc; Chaumeil, Philippe; Léger, Patrick; Garcia, Virginie; Laigret, Frédéric; De Daruvar, Antoine; Plomion, Christophe

    2004-02-01

    We developed an automated pipeline for the detection of single nucleotide polymorphisms (SNPs) in expressed sequence tag (EST) data sets, by combining three DNA sequence analysis programs: Phred, Phrap and PolyBayes. This application requires access to the individual electropherogram traces. First, a reference set of 65 SNPs was obtained from the sequencing of 30 gametes in 13 maritime pine (Pinus pinaster Ait.) gene fragments (6671 bp), resulting in a frequency of 1 SNP every 102.6 bp. Second, parameters of the three programs were optimized in order to retrieve as many true SNPs as possible while keeping the rate of false positives as low as possible. Overall, the efficiency of detection of true SNPs was 83.1%. However, this rate varied widely as a function of the rare SNP allele frequency: from 41% for rare SNP alleles (frequency < 10%) up to 98% for allele frequencies above 10%. Third, the detection method was applied to the 18498 assembled maritime pine (Pinus pinaster Ait.) ESTs, allowing the identification of a total of 1400 candidate SNPs in contigs containing between 4 and 20 sequence reads. These genetic resources, described for the first time in a forest tree species, were made available at http://www.pierroton.inra/genetics/Pinesnps. We also derived an analytical expression for the SNP detection probability as a function of the SNP allele frequency, the number of haploid genomes used to generate the EST sequence database, and the sample size of the contigs considered for SNP detection. The frequency of the SNP allele was shown to be the main factor influencing the probability of SNP detection.
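
    The dependence of detection on allele frequency and contig depth can be illustrated with a simple binomial model. This is NOT the authors' analytical expression (the abstract does not give it, and it also involves the number of haploid genomes, which is ignored here); it is a sketch assuming each read in a contig independently carries the rare allele with probability p and that a SNP is called only when the rare allele is seen at least k times. The function name and the choice k = 2 are assumptions:

```python
from math import comb


def detection_probability(p, n, k=2):
    """P(rare allele appears in at least k of n reads), binomial model.

    p: frequency of the rare SNP allele
    n: number of reads in the contig
    k: minimum number of reads that must show the rare allele (assumed)
    """
    p_fewer_than_k = sum(
        comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k)
    )
    return 1.0 - p_fewer_than_k


# SNP density reported in the abstract: 65 SNPs over 6671 bp.
snp_spacing = 6671 / 65
print(round(snp_spacing, 1))  # 102.6 bp per SNP

# Contig depths of 4 and 20 reads bracket the range used in the study.
for p in (0.05, 0.10, 0.30):
    print(p, [round(detection_probability(p, n), 3) for n in (4, 20)])
```

    Even this simplified model reproduces the qualitative finding: for a fixed contig depth the detection probability rises steeply with the frequency of the rare allele, so rare alleles in shallow contigs are the ones most likely to be missed.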

  1. Next-generation transcriptome sequencing, SNP discovery and validation in four market classes of peanut, Arachis hypogaea L.

    PubMed

    Chopra, Ratan; Burow, Gloria; Farmer, Andrew; Mudge, Joann; Simpson, Charles E; Wilkins, Thea A; Baring, Michael R; Puppala, Naveen; Chamberlin, Kelly D; Burow, Mark D

    2015-06-01

    Single-nucleotide polymorphisms, which can be identified in the thousands or millions from comparisons of transcriptome or genome sequences, are ideally suited for making high-resolution genetic maps, investigating population evolutionary history, and discovering marker-trait linkages. Despite significant results from their use in human genetics, progress in identification and use in plants, and particularly polyploid plants, has lagged. As part of a long-term project to identify and use SNPs suitable for these purposes in cultivated peanut, which is tetraploid, we generated transcriptome sequences of four peanut cultivars, namely OLin, New Mexico Valencia C, Tamrun OL07 and Jupiter, which represent the four major market classes of peanut grown in the world, and which are important economically to the US southwest peanut growing region. Copy DNA (cDNA) libraries of each genotype were used to generate 2 × 54 paired-end reads using an Illumina GAIIx sequencer. Raw reads were mapped to a custom reference consisting of Tifrunner 454 sequences plus peanut ESTs in GenBank, comprising 43,108 contigs; 263,840 SNP and indel variants were identified among four genotypes compared to the reference. A subset of 6 variants was assayed across 24 genotypes representing four market types using KASP chemistry to assess the criteria for SNP selection. Results demonstrated that transcriptome sequencing can identify SNPs usable as selectable DNA-based markers in complex polyploid species such as peanut. Criteria for effective use of SNPs as markers are discussed in this context.

  2. Exploring Germplasm Diversity to Understand the Domestication Process in Cicer spp. Using SNP and DArT Markers

    PubMed Central

    Roorkiwal, Manish; von Wettberg, Eric J.; Upadhyaya, Hari D.; Warschefsky, Emily; Rathore, Abhishek; Varshney, Rajeev K.

    2014-01-01

    To estimate genetic diversity within and between 10 interfertile Cicer species (94 genotypes) from the primary, secondary and tertiary gene pool, we analysed 5,257 DArT markers and 651 KASPar SNP markers. Based on successful allele calling in the tertiary gene pool, 2,763 DArT and 624 SNP markers that are polymorphic between genotypes from the gene pools were analyzed further. STRUCTURE analyses were consistent with 3 cultivated populations, representing kabuli, desi and pea-shaped seed types, with substantial admixture among these groups, while two wild populations were observed using DArT markers. AMOVA was used to partition variance among hierarchical sets of landraces and wild species at both the geographical and species level, with 61% of the variation found between species, and 39% within species. Molecular variance among the wild species was high (39%) compared to the variation present in cultivated material (10%). Observed heterozygosity was higher in wild species than the cultivated species for each linkage group. Our results support the Fertile Crescent both as the center of domestication and diversification of chickpea. The collection used in the present study covers all the three regions of historical chickpea cultivation, with the highest diversity in the Fertile Crescent region. Shared alleles between different gene pools suggest the possibility of gene flow among these species or incomplete lineage sorting and could indicate complicated patterns of divergence and fusion of wild chickpea taxa in the past. PMID:25010059

  3. Construction of a versatile SNP array for pyramiding useful genes of rice.

    PubMed

    Kurokawa, Yusuke; Noda, Tomonori; Yamagata, Yoshiyuki; Angeles-Shim, Rosalyn; Sunohara, Hidehiko; Uehara, Kanako; Furuta, Tomoyuki; Nagai, Keisuke; Jena, Kshirod Kumar; Yasui, Hideshi; Yoshimura, Atsushi; Ashikari, Motoyuki; Doi, Kazuyuki

    2016-01-01

    DNA marker-assisted selection (MAS) has become an indispensable component of breeding. Single nucleotide polymorphisms (SNP) are the most frequent polymorphism in the rice genome. However, SNP markers are not readily employed in MAS because of limitations in genotyping platforms. Here the authors report a Golden Gate SNP array that targets specific genes controlling yield-related traits and biotic stress resistance in rice. As a first step, the SNP genotypes were surveyed in 31 parental varieties using the Affymetrix Rice 44K SNP microarray. The haplotype information for 16 target genes was then converted to the Golden Gate platform with 143-plex markers. Haplotypes for the 14 useful alleles are unique and can discriminate among all other varieties. The genotyping consistency between the Affymetrix microarray and the Golden Gate array was 92.8%, and the accuracy of the Golden Gate array was confirmed in 3 F2 segregating populations. The concept of haplotype-based selection using the constructed SNP array was demonstrated. PMID:26566831

  4. Electrochemical Li Topotactic Reaction in Layered SnP3 for Superior Li-Ion Batteries

    PubMed Central

    Park, Jae-Wan; Park, Cheol-Min

    2016-01-01

    The development of new anode materials having high electrochemical performances and interesting reaction mechanisms is highly required to satisfy the need for long-lasting mobile electronic devices and electric vehicles. Here, we report a layer crystalline structured SnP3 and its unique electrochemical behaviors with Li. The SnP3 was simply synthesized through modification of Sn crystallography by combination with P and its potential as an anode material for LIBs was investigated. During Li insertion reaction, the SnP3 anode showed an interesting two-step electrochemical reaction mechanism comprised of a topotactic transition (0.7–2.0 V) and a conversion (0.0–2.0 V) reaction. When the SnP3-based composite electrode was tested within the topotactic reaction region (0.7–2.0 V) between SnP3 and LixSnP3 (x ≤ 4), it showed excellent electrochemical properties, such as a high volumetric capacity (1st discharge/charge capacity was 840/663 mA h cm−3) with a high initial coulombic efficiency, stable cycle behavior (636 mA h cm−3 over 100 cycles), and fast rate capability (550 mA h cm−3 at 3C). This layered SnP3 anode will be applicable to a new anode material for rechargeable LIBs. PMID:27775090

  5. QTL scanning for rice yield using a whole genome SNP array.

    PubMed

    Tan, Cong; Han, Zhongmin; Yu, Huihui; Zhan, Wei; Xie, Weibo; Chen, Xun; Zhao, Hu; Zhou, Fasong; Xing, Yongzhong

    2013-12-20

    High-throughput SNP genotyping is widely used for plant genetic studies. Recently, a RICE6K SNP array has been developed based on the Illumina Bead Array platform and Infinium SNP assay technology for genome-wide evaluation of allelic variations and breeding applications. In this study, the RICE6K SNP array was used to genotype a recombinant inbred line (RIL) population derived from the cross between the indica variety, Zhenshan 97, and the japonica variety, Xizang 2. A total of 3324 SNP markers of high quality were identified and were grouped into 1495 recombination bins in the RIL population. A high-density linkage map, consisting of the 1495 bins, was developed, covering 1591.2 cM with an average length of 1.1 cM per bin. Segregation distortions were observed in 24 regions of the 11 chromosomes in the RILs. One half of the distorted regions contained fertility genes that had been previously reported. A total of 23 QTLs were identified for yield. Seven QTLs were detected for the first time in this study. The positive alleles from about half of the identified QTLs came from Zhenshan 97, which had lower phenotypic values than Xizang 2. This indicated that favorable alleles for breeding were dispersed in both parents and that pyramiding favorable alleles could develop elite lines. The size of the mapping population for QTL analysis using a high-throughput SNP genotyping platform is also discussed.
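
    The abstract does not say how the 3324 markers were collapsed into 1495 recombination bins. One common way to think about it, sketched here as an assumption rather than as the authors' procedure, is to merge adjacent markers whose genotype pattern across all RILs is identical, since no recombination separates them in this population. The function name `group_into_bins` and the toy genotype strings are hypothetical, and missing calls or genotyping errors are not handled.

```python
def group_into_bins(markers):
    """Merge adjacent markers with identical genotype patterns into bins.

    `markers` is a list of (marker_name, genotype_string) tuples given in
    map order along one chromosome; each character of the string is the
    call for one RIL (for example 'A' or 'B' for the two parents).
    """
    bins = []
    for name, genotypes in markers:
        if bins and bins[-1]["genotypes"] == genotypes:
            # No recombinant line separates this marker from the previous bin.
            bins[-1]["markers"].append(name)
        else:
            bins.append({"genotypes": genotypes, "markers": [name]})
    return bins


if __name__ == "__main__":
    toy = [("m1", "AAB"), ("m2", "AAB"), ("m3", "ABB"), ("m4", "AAB")]
    for b in group_into_bins(toy):
        print(b["markers"], b["genotypes"])
```

    Only adjacent markers are merged, so `m4` forms its own bin even though its pattern matches `m1` and `m2`. As a check on the reported map, 1591.2 cM divided by 1495 bins is about 1.06 cM per bin, consistent with the rounded 1.1 cM in the abstract.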

  6. SNP and mutation data on the web - hidden treasures for uncovering.

    PubMed

    Barnes, Michael R

    2002-01-01

    SNP data has grown exponentially over the last two years. SNP database evolution has matched this growth, as initial development of several independent SNP databases has given way to one central SNP database, dbSNP. Other SNP databases have instead evolved to complement this central database by providing gene-specific focus and an increased level of curation and analysis on subsets of data derived from the central data set. By contrast, human mutation data, which has been collected over many years, is still stored in disparate sources, although moves are afoot to shift to a similar central database. These developments are timely: human mutation and polymorphism data hold complementary keys to a better understanding of how genes function and malfunction in disease. The impending availability of a complete human genome presents us with an ideal framework to integrate both these forms of data; as our understanding of the mechanisms of disease increases, the full genomic context of variation may become increasingly significant.

  7. Species Delimitation using Genome-Wide SNP Data

    PubMed Central

    Leaché, Adam D.; Fujita, Matthew K.; Minin, Vladimir N.; Bouckaert, Remco R.

    2014-01-01

    The multispecies coalescent has provided important progress for evolutionary inferences, including increasing the statistical rigor and objectivity of comparisons among competing species delimitation models. However, Bayesian species delimitation methods typically require brute force integration over gene trees via Markov chain Monte Carlo (MCMC), which introduces a large computation burden and precludes their application to genomic-scale data. Here we combine a recently introduced dynamic programming algorithm for estimating species trees that bypasses MCMC integration over gene trees with sophisticated methods for estimating marginal likelihoods, needed for Bayesian model selection, to provide a rigorous and computationally tractable technique for genome-wide species delimitation. We provide a critical yet simple correction that brings the likelihoods of different species trees, and more importantly their corresponding marginal likelihoods, to the same common denominator, which enables direct and accurate comparisons of competing species delimitation models using Bayes factors. We test this approach, which we call Bayes factor delimitation (*with genomic data; BFD*), using common species delimitation scenarios with computer simulations. Varying the number of loci and the number of samples suggests that the approach can distinguish the true model even with few loci and limited samples per species. Misspecification of the prior for population size θ has little impact on support for the true model. We apply the approach to West African forest geckos (Hemidactylus fasciatus complex) using genome-wide SNP data. This new Bayesian method for species delimitation builds on a growing trend for objective species delimitation methods with explicit model assumptions that are easily tested. [Bayes factor; model testing; phylogeography; RADseq; simulation; speciation.] PMID:24627183
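
    The final comparison step described above, ranking competing delimitation models by Bayes factor, can be sketched as follows. The sketch assumes the marginal log-likelihood of each model has already been estimated elsewhere and uses the conventional 2 ln BF scale; the model names, the numbers and the function name `rank_models` are hypothetical and are not taken from the paper.

```python
def rank_models(marginal_log_likelihoods):
    """Rank models by marginal log-likelihood and report 2*ln(BF) vs the best.

    `marginal_log_likelihoods` maps a model name to its estimated marginal
    log-likelihood. Returns (name, log_likelihood, two_ln_bf) tuples, best
    model first; two_ln_bf is 0.0 for the best model.
    """
    ranked = sorted(
        marginal_log_likelihoods.items(), key=lambda item: item[1], reverse=True
    )
    best = ranked[0][1]
    # 2 * ln(BF) = 2 * (ln m_best - ln m_model)
    return [(name, mll, 2.0 * (best - mll)) for name, mll in ranked]


if __name__ == "__main__":
    # Made-up values, for illustration only.
    estimates = {"lumped": -1100.0, "two_species": -1050.0, "three_species": -1010.0}
    for name, mll, two_ln_bf in rank_models(estimates):
        print(f"{name:15s} {mll:10.1f} {two_ln_bf:8.1f}")
```

    Larger 2 ln BF values indicate stronger support for the best model over the model in that row.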

  8. Comparison of genetic distance measures using human SNP genotype data.

    PubMed

    Libiger, Ondrej; Nievergelt, Caroline M; Schork, Nicholas J

    2009-08-01

    Quantification of the genetic distance between populations is instrumental in many genetic research initiatives, and a large number of formulas for this purpose have been proposed. However, selection of an appropriate measure for assessing genetic distance between real-world human populations that diverged as a result of mechanisms that are not fully known can be a challenging task. We compared results from nine widely used genetic distance measures applied to high-density whole-genome SNP genotype data obtained on individuals from 51 world populations. Using population trees and generalized analysis of molecular variance, we found that contradictory inferences could be drawn from analyses that used different distance measures. We determined the grouping of the distance measures in terms of similarity and consistency of their values using concordance, consistency, and Procrustes analyses. Overall, the Cavalli-Sforza and Edwards distance measure differed the most from the other measures. Wright's F(ST) for diploid data, the Latter and Reynolds distances, and Nei's minimum distance measures each yielded values that were most consistent with the other eight distance measures in terms of ordering populations based on genetic distance. The Cavalli-Sforza and Edwards distance and Nei's geometric distance were least consistent. Simulation studies showed that the Cavalli-Sforza and Edwards distance is relatively more sensitive in distinguishing genetically similar populations and that the Reynolds genetic distance provides the highest sensitivity for highly divergent populations. Finally, our study suggests that using the Cavalli-Sforza and Edwards distance may provide less power for studies concerning human migration history.
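
    To make two of the measures named above concrete, here is a minimal sketch for biallelic SNPs, assuming each population is summarised by the frequency of one allele at each locus. It uses the textbook forms of the Cavalli-Sforza and Edwards chord distance and of Nei's minimum distance; the abstract does not state which exact variants or estimators the authors applied, so this is an illustration rather than their implementation. Function names and the example frequencies are hypothetical.

```python
from math import pi, sqrt


def chord_distance(px, py):
    """Cavalli-Sforza and Edwards chord distance, averaged over biallelic loci.

    px and py hold the frequency of the same allele at each locus in
    populations X and Y.
    """
    total = 0.0
    for p, q in zip(px, py):
        # Cosine of the angle between the square-root frequency vectors.
        cos_theta = sqrt(p * q) + sqrt((1 - p) * (1 - q))
        total += (2 / pi) * sqrt(2 * max(0.0, 1 - cos_theta))
    return total / len(px)


def nei_minimum_distance(px, py):
    """Nei's minimum distance; for biallelic loci it is the mean of (p - q)**2."""
    return sum((p - q) ** 2 for p, q in zip(px, py)) / len(px)


if __name__ == "__main__":
    pop_a = [0.10, 0.50, 0.90]
    pop_b = [0.20, 0.40, 0.70]
    print(chord_distance(pop_a, pop_b), nei_minimum_distance(pop_a, pop_b))
```

    The two functions weight rare-allele differences differently: the chord distance works on square roots of frequencies, while Nei's minimum distance works on squared frequency differences, which is one reason different measures can order populations differently.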

  9. Review of alignment and SNP calling algorithms for next-generation sequencing data.

    PubMed

    Mielczarek, M; Szyda, J

    2016-02-01

    Application of the massive parallel sequencing technology has become one of the most important issues in life sciences. Therefore, it was crucial to develop bioinformatics tools for next-generation sequencing (NGS) data processing. Currently, two of the most significant tasks include alignment to a reference genome and detection of single nucleotide polymorphisms (SNPs). In many types of genomic analyses, great numbers of reads need to be mapped to the reference genome; therefore, selection of the aligner is an essential step in NGS pipelines. Two main algorithms, suffix tries and hash tables, have been introduced for this purpose. Suffix array-based aligners are memory-efficient and work faster than hash-based aligners, but they are less accurate. In contrast, hash table algorithms tend to be slower, but more sensitive. SNP and genotype callers may also be divided into two main approaches: heuristic and probabilistic methods. A variety of software has been developed over the past several years. In this paper, we briefly review the current development of NGS data processing algorithms and present the available software.
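
    As a toy illustration of the hash-table idea mentioned above, the sketch below indexes every k-mer of a reference in a dictionary, uses the first k bases of a read as a seed, and then verifies each candidate position by counting mismatches. It is a teaching sketch under simplifying assumptions (exact seed match, no indels, forward strand only) and does not reproduce any specific aligner covered by the review; all names are hypothetical.

```python
from collections import defaultdict


def build_index(reference, k):
    """Map every k-mer in the reference to the list of positions where it starts."""
    index = defaultdict(list)
    for i in range(len(reference) - k + 1):
        index[reference[i:i + k]].append(i)
    return index


def map_read(read, reference, index, k, max_mismatches=1):
    """Return (position, mismatches) for each seed hit within the mismatch budget.

    Each mismatch is (reference_position, reference_base, read_base), which
    is the raw material a SNP caller would later evaluate.
    """
    hits = []
    for pos in index.get(read[:k], []):
        window = reference[pos:pos + len(read)]
        if len(window) < len(read):
            continue  # the read would run off the end of the reference
        mismatches = [
            (pos + offset, ref_base, read_base)
            for offset, (ref_base, read_base) in enumerate(zip(window, read))
            if ref_base != read_base
        ]
        if len(mismatches) <= max_mismatches:
            hits.append((pos, mismatches))
    return hits


if __name__ == "__main__":
    ref = "ACGTACGTTAGC"
    idx = build_index(ref, 4)
    print(map_read("ACGTTAGA", ref, idx, 4))
```

    The dictionary holds an entry for every k-mer position in the reference, which hints at why hash-based aligners tend to need more memory than compressed suffix-based indexes.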

  10. Family-Based Multi-SNP X Chromosome Analysis Using Parent Information.

    PubMed

    Wise, Alison S; Shi, Min; Weinberg, Clarice R

    2016-01-01

    We propose a method for association analysis of haplotypes on the X chromosome that offers both improved power and robustness to population stratification in studies of affected offspring and their parents if all three have been genotyped. The method makes use of assumed parental haplotype exchangeability (PHE), a weaker assumption than Hardy-Weinberg equilibrium (HWE). PHE requires that in the source population, of the three X chromosome haplotypes carried by the two parents, each is equally likely to be carried by the father. We propose a pseudo-sibling approach that exploits that exchangeability assumption. Our method extends the single-SNP PIX-LRT method to multiple SNPs in a high linkage block. We describe methods for testing the PHE assumption and also for determining how apparent violations can be distinguished from true fetal effects or maternally-mediated effects. We show results of simulations that demonstrate nominal type I error rate and good power. The methods are then applied to dbGaP data on the birth defect oral cleft, using both Asian and Caucasian families with cleft. PMID:26941777

  11. Screening and SNP mapping of copper-resistant mutations in C. elegans.

    PubMed

    Song, Shaojuan; Guo, Yaping; Zhang, Xueyao; Zhang, Jianzhen; Ma, Enbo

    2014-12-01

    Copper plays critical roles in biological systems; however, it is toxic in excess. To identify novel genes involved in copper metabolism, we performed a genome-wide genetic screen in the model organism C. elegans to search for mutants that are resistant to excess copper. Wild-type (N2) L4 worms were mutagenized with ethyl methanesulfonate (EMS), and the F2 progeny were screened on culture medium with excess copper. Two copper-resistant mutants, ms1 and ms2, were recovered from the screening of 100 000 haploid genomes. No obvious developmental defects were observed in ms1 and ms2 mutants, and they were able to grow into adults on the screening medium plates, whereas N2 worms arrested at the L1 stage. Results of the backcross test suggested that the copper-resistant phenotype in ms1 may be controlled by a single recessive gene, but that ms2 probably carries mutations in multiple genes, as no copper-resistant worms could be found in the F2 progeny when ms2 mutants were backcrossed with N2 worms. To determine the position of the ms1 mutation, we employed single nucleotide polymorphism (SNP) mapping. Our mapping results indicated that the ms1 mutation is on chromosome II (LGII). By analysis of 8 SNP markers from -18 to 23 on LGII, we found that the ms1 mutation is at approximately LGII:-6. Further study of ms1 mutants will provide insights into copper metabolism and its regulation.

  12. Sensitive DNA detection and SNP discrimination using ultrabright SERS nanorattles and magnetic beads for malaria diagnostics.

    PubMed

    Ngo, Hoan T; Gandra, Naveen; Fales, Andrew M; Taylor, Steve M; Vo-Dinh, Tuan

    2016-07-15

    One of the major obstacles to implementing nucleic acid-based molecular diagnostics at the point-of-care (POC) and in resource-limited settings is the lack of sensitive and practical DNA detection methods that can be seamlessly integrated into portable platforms. Herein we present a sensitive yet simple DNA detection method using a surface-enhanced Raman scattering (SERS) nanoplatform: the ultrabright SERS nanorattle. The method, referred to as the nanorattle-based method, involves sandwich hybridization of magnetic beads that are loaded with capture probes, target sequences, and ultrabright SERS nanorattles that are loaded with reporter probes. Upon hybridization, a magnet was applied to concentrate the hybridization sandwiches at a detection spot for SERS measurements. The ultrabright SERS nanorattles, composed of a core and a shell with resonance Raman reporters loaded in the gap space between the core and the shell, serve as SERS tags for signal detection. Using this method, a specific DNA sequence of the malaria parasite Plasmodium falciparum could be detected with a detection limit of approximately 100 attomoles. Single nucleotide polymorphism (SNP) discrimination of wild type malaria DNA and mutant malaria DNA, which confers resistance to artemisinin drugs, was also demonstrated. These test models demonstrate the molecular diagnostic potential of the nanorattle-based method to both detect and genotype infectious pathogens. Furthermore, the method's simplicity makes it a suitable candidate for integration into portable platforms for POC applications and for use in resource-limited settings. PMID:26913502

  13. Exciton Primer-mediated SNP detection in SmartAmp2 reactions.

    PubMed

    Lezhava, Alexander; Ishidao, Takefumi; Ishizu, Yuri; Naito, Kana; Hanami, Takeshi; Katayama, Atsuko; Kogo, Yasushi; Soma, Takahiro; Ikeda, Shuji; Murakami, Kayoko; Nogawa, Chihiro; Itoh, Masayoshi; Mitani, Yasumasa; Harbers, Matthias; Okamoto, Akimitsu; Hayashizaki, Yoshihide

    2010-02-01

    Most commonly used intercalating fluorescent dyes in DNA detection lack any sequence specificity, whereas so-called Exciton Primers can overcome this limitation by functioning as "sequence-specific dyes." After hybridization to complementary sequences, the fluorescence of Exciton Primers provides sequence-specific signals for real-time monitoring of amplification reactions. Applied to the SmartAmp2 mutation detection process, Exciton Primers show high signal strength with low background, leading to superior specificity and sensitivity compared to SYBR Green I. Signal strength can be further enhanced by using multiple dyes within one Exciton Primer or multiple Exciton Primers in the same amplification reaction. Here we demonstrate the use of Exciton Primers for genotyping a single nucleotide polymorphism (SNP) in the VKORC1 locus (-1639G>A) relevant for Warfarin dosing as an example of Exciton Primer-mediated genotyping by SmartAmp2. The genotyping assay can use only one labeled Exciton Primer for endpoint detection, or it can detect wild-type and mutant alleles simultaneously by real-time monitoring in a one-tube reaction using two Exciton Primers with different dyes. Working directly from blood samples, Exciton Primer-mediated genotyping by SmartAmp2 offers superior solutions for rapid point-of-care testing.

  14. A genome-wide SNP scan accelerates trait-regulatory genomic loci identification in chickpea.

    PubMed

    Kujur, Alice; Bajaj, Deepak; Upadhyaya, Hari D; Das, Shouvik; Ranjan, Rajeev; Shree, Tanima; Saxena, Maneesha S; Badoni, Saurabh; Kumar, Vinod; Tripathi, Shailesh; Gowda, C L L; Sharma, Shivali; Singh, Sube; Tyagi, Akhilesh K; Parida, Swarup K

    2015-06-10

    We identified 44844 high-quality SNPs by sequencing 92 diverse chickpea accessions belonging to a seed and pod trait-specific association panel using reference genome- and de novo-based GBS (genotyping-by-sequencing) assays. A GWAS (genome-wide association study) in an association panel of 211, including the 92 sequenced accessions, identified 22 major genomic loci showing significant association (explaining 23-47% phenotypic variation) with pod and seed number/plant and 100-seed weight. Eighteen trait-regulatory major genomic loci underlying 13 robust QTLs were validated and mapped on an intra-specific genetic linkage map by QTL mapping. A combinatorial approach of GWAS, QTL mapping and gene haplotype-specific LD mapping and transcript profiling uncovered one superior haplotype and favourable natural allelic variants in the upstream regulatory region of a CesA-type cellulose synthase (Ca_Kabuli_CesA3) gene regulating high pod and seed number/plant (explaining 47% phenotypic variation) in chickpea. The up-regulation of this superior gene haplotype correlated with increased transcript expression of Ca_Kabuli_CesA3 gene in the pollen and pod of high pod/seed number accession, resulting in higher cellulose accumulation for normal pollen and pollen tube growth. A rapid combinatorial genome-wide SNP genotyping-based approach has potential to dissect complex quantitative agronomic traits and delineate trait-regulatory genomic loci (candidate genes) for genetic enhancement in crop plants, including chickpea.

  15. Genome-wide SNP discovery and linkage analysis in barley based on genes responsive to abiotic stress.

    PubMed

    Rostoks, Nils; Mudie, Sharon; Cardle, Linda; Russell, Joanne; Ramsay, Luke; Booth, Allan; Svensson, Jan T; Wanamaker, Steve I; Walia, Harkamal; Rodriguez, Edmundo M; Hedley, Peter E; Liu, Hui; Morris, Jenny; Close, Timothy J; Marshall, David F; Waugh, Robbie

    2005-12-01

    More than 2,000 genome-wide barley single nucleotide polymorphisms (SNPs) were developed by resequencing unigene fragments from eight diverse accessions. The average genome-wide SNP frequency observed in 877 unigenes was 1 SNP per 200 bp. However, SNP frequency was highly variable with the least number of SNP and SNP haplotypes observed within European cultivated germplasm reflecting effects of breeding history on genetic diversity. More than 300 SNP loci were mapped genetically in three experimental mapping populations which allowed the construction of an integrated SNP map incorporating a large number of RFLP, AFLP and SSR markers (1,237 loci in total). The genes used for SNP discovery were selected based on their transcriptional response to a variety of abiotic stresses. A set of known barley abiotic stress QTL was positioned on the linkage map, while the available sequence and gene expression information facilitated the identification of genes potentially associated with these traits. Comparison of the sequenced SNP loci to the rice genome sequence identified several regions of highly conserved gene order providing a framework for marker saturation in barley genomic regions of interest. The integration of genome-wide SNP and expression data with available genetic and phenotypic information will facilitate the identification of gene function in barley and other non-model organisms. PMID:16244872

  16. Identification of Laying-Related SNP Markers in Geese Using RAD Sequencing

    PubMed Central

    Yu, ShiGang; Chu, WeiWei; Zhang, LiFan; Han, HouMing; Zhao, RongXue; Wu, Wei; Zhu, JiangNing; Dodson, Michael V.; Wei, Wei; Liu, HongLin; Chen, Jie

    2015-01-01

    Laying performance is an important economic trait of goose production. As laying performance is of low heritability, it is of significance to develop a marker-assisted selection (MAS) strategy for this trait. Definition of sequence variation related to the target trait is a prerequisite of quantitating MAS, but little is presently known about the goose genome, which greatly hinders the identification of genetic markers for the laying traits of geese. Recently developed restriction site-associated DNA (RAD) sequencing is a possible approach for discerning large-scale single nucleotide polymorphism (SNP) and reducing the complexity of a genome without having reference genomic information available. In the present study, we developed a pooled RAD sequencing strategy for detecting geese laying-related SNP. Two DNA pools were constructed, each consisting of equal amounts of genomic DNA from 10 individuals with either high estimated breeding value (HEBV) or low estimated breeding value (LEBV). A total of 139,013 SNP were obtained from 42,291,356 sequences, of which 18,771,943 were for LEBV and 23,519,413 were for HEBV cohorts. Fifty-five SNP which had different allelic frequencies in the two DNA pools were further validated by individual-based AS-PCR genotyping in the LEBV and HEBV cohorts. Ten out of 55 SNP exhibited distinct allele distributions in these two cohorts. These 10 SNP were further genotyped in a goose population of 492 geese to verify the association with egg numbers. The result showed that 8 of 10 SNP were associated with egg numbers. Additionally, linear regression analysis revealed that SNP Record-111407, 106975 and 112359 were involved in a multiple-gene network affecting laying performance. We used IPCR to extend the unknown regions flanking the candidate RAD tags. The obtained sequences were subjected to BLAST to retrieve the orthologous genes in either ducks or chickens. Five novel genes were cloned for geese which harbored the candidate laying

  17. Optimal Design of Low-Density SNP Arrays for Genomic Prediction: Algorithm and Applications.

    PubMed

    Wu, Xiao-Lin; Xu, Jiaqi; Feng, Guofei; Wiggans, George R; Taylor, Jeremy F; He, Jun; Qian, Changsong; Qiu, Jiansheng; Simpson, Barry; Walker, Jeremy; Bauck, Stewart

    2016-01-01

    Low-density (LD) single nucleotide polymorphism (SNP) arrays provide a cost-effective solution for genomic prediction and selection, but algorithms and computational tools are needed for the optimal design of LD SNP chips. A multiple-objective, local optimization (MOLO) algorithm was developed for design of optimal LD SNP chips that can be imputed accurately to medium-density (MD) or high-density (HD) SNP genotypes for genomic prediction. The objective function facilitates maximization of non-gap map length and system information for the SNP chip, and the latter is computed either as locus-averaged (LASE) or haplotype-averaged Shannon entropy (HASE) and adjusted for uniformity of the SNP distribution. HASE performed better than LASE with ≤1,000 SNPs, but required considerably more computing time. Nevertheless, the differences diminished when >5,000 SNPs were selected. Optimization was accomplished conditionally on the presence of SNPs that were obligated to each chromosome. The frame location of SNPs on a chip can be either uniform (evenly spaced) or non-uniform. For the latter design, a tunable empirical Beta distribution was used to guide location distribution of frame SNPs such that both ends of each chromosome were enriched with SNPs. The SNP distribution on each chromosome was finalized through the objective function that was locally and empirically maximized. This MOLO algorithm was capable of selecting a set of approximately evenly-spaced and highly-informative SNPs, which in turn led to increased imputation accuracy compared with selection solely of evenly-spaced SNPs. Imputation accuracy increased with LD chip size, and imputation error rate was extremely low for chips with ≥3,000 SNPs. Assuming that genotyping or imputation error occurs at random, imputation error rate can be viewed as the upper limit for genomic prediction error. Our results show that about 25% of imputation error rate was propagated to genomic prediction in an Angus population. The

  18. Optimal Design of Low-Density SNP Arrays for Genomic Prediction: Algorithm and Applications

    PubMed Central

    Wu, Xiao-Lin; Xu, Jiaqi; Feng, Guofei; Wiggans, George R.; Taylor, Jeremy F.; He, Jun; Qian, Changsong; Qiu, Jiansheng; Simpson, Barry; Walker, Jeremy; Bauck, Stewart

    2016-01-01

    Low-density (LD) single nucleotide polymorphism (SNP) arrays provide a cost-effective solution for genomic prediction and selection, but algorithms and computational tools are needed for the optimal design of LD SNP chips. A multiple-objective, local optimization (MOLO) algorithm was developed for design of optimal LD SNP chips that can be imputed accurately to medium-density (MD) or high-density (HD) SNP genotypes for genomic prediction. The objective function facilitates maximization of non-gap map length and system information for the SNP chip, and the latter is computed either as locus-averaged (LASE) or haplotype-averaged Shannon entropy (HASE) and adjusted for uniformity of the SNP distribution. HASE performed better than LASE with ≤1,000 SNPs, but required considerably more computing time. Nevertheless, the differences diminished when >5,000 SNPs were selected. Optimization was accomplished conditionally on the presence of SNPs that were obligated to each chromosome. The frame location of SNPs on a chip can be either uniform (evenly spaced) or non-uniform. For the latter design, a tunable empirical Beta distribution was used to guide location distribution of frame SNPs such that both ends of each chromosome were enriched with SNPs. The SNP distribution on each chromosome was finalized through the objective function that was locally and empirically maximized. This MOLO algorithm was capable of selecting a set of approximately evenly-spaced and highly-informative SNPs, which in turn led to increased imputation accuracy compared with selection solely of evenly-spaced SNPs. Imputation accuracy increased with LD chip size, and imputation error rate was extremely low for chips with ≥3,000 SNPs. Assuming that genotyping or imputation error occurs at random, imputation error rate can be viewed as the upper limit for genomic prediction error. Our results show that about 25% of imputation error rate was propagated to genomic prediction in an Angus population. The

  19. Identification of Laying-Related SNP Markers in Geese Using RAD Sequencing.

    PubMed

    Yu, ShiGang; Chu, WeiWei; Zhang, LiFan; Han, HouMing; Zhao, RongXue; Wu, Wei; Zhu, JiangNing; Dodson, Michael V; Wei, Wei; Liu, HongLin; Chen, Jie

    2015-01-01

    Laying performance is an important economic trait of goose production. As laying performance is of low heritability, it is of significance to develop a marker-assisted selection (MAS) strategy for this trait. Definition of sequence variation related to the target trait is a prerequisite of quantitating MAS, but little is presently known about the goose genome, which greatly hinders the identification of genetic markers for the laying traits of geese. Recently developed restriction site-associated DNA (RAD) sequencing is a possible approach for discerning large-scale single nucleotide polymorphism (SNP) and reducing the complexity of a genome without having reference genomic information available. In the present study, we developed a pooled RAD sequencing strategy for detecting geese laying-related SNP. Two DNA pools were constructed, each consisting of equal amounts of genomic DNA from 10 individuals with either high estimated breeding value (HEBV) or low estimated breeding value (LEBV). A total of 139,013 SNP were obtained from 42,291,356 sequences, of which 18,771,943 were for LEBV and 23,519,413 were for HEBV cohorts. Fifty-five SNP which had different allelic frequencies in the two DNA pools were further validated by individual-based AS-PCR genotyping in the LEBV and HEBV cohorts. Ten out of 55 SNP exhibited distinct allele distributions in these two cohorts. These 10 SNP were further genotyped in a goose population of 492 geese to verify the association with egg numbers. The result showed that 8 of 10 SNP were associated with egg numbers. Additionally, linear regression analysis revealed that SNP Record-111407, 106975 and 112359 were involved in a multiple-gene network affecting laying performance. We used IPCR to extend the unknown regions flanking the candidate RAD tags. The obtained sequences were subjected to BLAST to retrieve the orthologous genes in either ducks or chickens. Five novel genes were cloned for geese which harbored the candidate laying

  20. Identification of Laying-Related SNP Markers in Geese Using RAD Sequencing.

    PubMed

    Yu, ShiGang; Chu, WeiWei; Zhang, LiFan; Han, HouMing; Zhao, RongXue; Wu, Wei; Zhu, JiangNing; Dodson, Michael V; Wei, Wei; Liu, HongLin; Chen, Jie

    2015-01-01

    Laying performance is an important economic trait of goose production. Because laying performance has low heritability, it is important to develop a marker-assisted selection (MAS) strategy for this trait. Definition of sequence variation related to the target trait is a prerequisite of quantitating MAS, but little is presently known about the goose genome, which greatly hinders the identification of genetic markers for the laying traits of geese. Recently developed restriction site-associated DNA (RAD) sequencing is a possible approach for discerning large-scale single nucleotide polymorphism (SNP) and reducing the complexity of a genome without having reference genomic information available. In the present study, we developed a pooled RAD sequencing strategy for detecting geese laying-related SNP. Two DNA pools were constructed, each consisting of equal amounts of genomic DNA from 10 individuals with either high estimated breeding value (HEBV) or low estimated breeding value (LEBV). A total of 139,013 SNP were obtained from 42,291,356 sequences, of which 18,771,943 were for LEBV and 23,519,413 were for HEBV cohorts. Fifty-five SNP which had different allelic frequencies in the two DNA pools were further validated by individual-based AS-PCR genotyping in the LEBV and HEBV cohorts. Ten out of 55 SNP exhibited distinct allele distributions in these two cohorts. These 10 SNP were further genotyped in a goose population of 492 geese to verify the association with egg numbers. The result showed that 8 of 10 SNP were associated with egg numbers. Additionally, linear regression analysis revealed that SNP Record-111407, 106975 and 112359 were involved in a multiple-gene network affecting laying performance. We used IPCR to extend the unknown regions flanking the candidate RAD tags. The obtained sequences were subjected to BLAST to retrieve the orthologous genes in either ducks or chickens. Five novel genes were cloned for geese which harbored the candidate laying

  1. Homozygous mdm2 SNP309 cancer cells with compromised transcriptional elongation at p53 target genes are sensitive to induction of p53-independent cell death.

    PubMed

    Rosso, Melissa; Polotskaia, Alla; Bargonetti, Jill

    2015-10-27

    A single nucleotide polymorphism (T to G) in the mdm2 P2 promoter, mdm2 SNP309, leads to MDM2 overexpression, promoting chemotherapy-resistant cancers. Two mdm2 G/G SNP309 cancer cell lines, MANCA and A875, have compromised wild-type p53 that co-localizes with MDM2 on chromatin. We hypothesized that MDM2 in these cells inhibited transcription initiation at the p53 target genes p21 and puma. Surprisingly, following etoposide treatment, transcription initiation occurred at the compromised target genes in MANCA and A875 cells similar to the T/T ML-1 cell line. In all cell lines tested there was equally robust recruitment of total and initiated RNA polymerase II (Pol II). We found that knockdown of MDM2 in G/G cells moderately increased expression of subsets of p53 target genes without increasing p53 stability. Importantly, etoposide and actinomycin D treatments increased histone H3K36 trimethylation in T/T, but not G/G cells, suggesting a G/G correlated inhibition of transcription elongation. We therefore tested a chemotherapeutic agent (8-amino-adenosine) that induces p53-independent cell death for higher clinically relevant cytotoxicity. We demonstrated that T/T and G/G mdm2 SNP309 cells were equally sensitive to 8-amino-adenosine-induced cell death. In conclusion, for cancer cells overexpressing MDM2, targeting MDM2 may be less effective than inducing p53-independent cell death.

  2. Application of Multi-SNP Approaches Bayesian LASSO and AUC-RF to Detect Main Effects of Inflammatory-Gene Variants Associated with Bladder Cancer Risk

    PubMed Central

    Calle, M. Luz; Rothman, Nathaniel; Urrea, Víctor; Kogevinas, Manolis; Petrus, Sandra; Chanock, Stephen J.; Tardón, Adonina; García-Closas, Montserrat; González-Neira, Anna; Vellalta, Gemma; Carrato, Alfredo; Navarro, Arcadi; Lorente-Galdós, Belén; Silverman, Debra T.; Real, Francisco X.; Wu, Xifeng; Malats, Núria

    2013-01-01

    The relationship between inflammation and cancer is well established in several tumor types, including bladder cancer. We performed an association study between 886 inflammatory-gene variants and bladder cancer risk in 1,047 cases and 988 controls from the Spanish Bladder Cancer (SBC)/EPICURO Study. A preliminary exploration with the widely used univariate logistic regression approach did not identify any significant SNP after correcting for multiple testing. We further applied two more comprehensive methods to capture the complexity of bladder cancer genetic susceptibility: Bayesian Threshold LASSO (BTL), a regularized regression method, and AUC-Random Forest (AUC-RF), a machine-learning algorithm. Both approaches explore the joint effect of markers. BTL analysis identified a signature of 37 SNPs in 34 genes showing an association with bladder cancer. AUC-RF detected an optimal predictive subset of 56 SNPs. Thirteen SNPs were identified by both methods in the total population. Using resources from the Texas Bladder Cancer study we were able to replicate 30% of the SNPs assessed. The associations between inflammatory SNPs and bladder cancer were reexamined among non-smokers to eliminate the effect of tobacco, one of the strongest and most prevalent environmental risk factors for this tumor. A 9-SNP signature was detected by BTL. Here we report, for the first time, a set of SNPs in inflammatory genes jointly associated with bladder cancer risk. These results highlight the importance of the complex structure of genetic susceptibility associated with cancer risk. PMID:24391818
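    The multiple-testing step in the abstract above can be illustrated with a short sketch. The abstract does not say which correction was applied, so this assumes a Bonferroni correction at a family-wise alpha of 0.05 over the 886 variants tested. The SNP names and p-values are invented for illustration, and the function names are hypothetical.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


def significant_snps(p_values: dict, n_tests: int, alpha: float = 0.05) -> list:
    """Return the SNP names whose p-value is below the Bonferroni threshold."""
    threshold = bonferroni_threshold(alpha, n_tests)
    return sorted(name for name, p in p_values.items() if p < threshold)


# 886 variants were tested in the study; alpha = 0.05 is an assumption.
N_VARIANTS = 886
THRESHOLD = bonferroni_threshold(0.05, N_VARIANTS)

# Hypothetical univariate p-values, not taken from the paper.
EXAMPLE = {"snp_a": 1e-3, "snp_b": 4e-4, "snp_c": 2e-2}

print(THRESHOLD)
print(significant_snps(EXAMPLE, N_VARIANTS))
```

    With these made-up values no SNP passes the corrected threshold even though all three would pass an uncorrected 0.05 cutoff, which is the kind of outcome that motivates joint-effect methods such as those named in the abstract.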

  3. A SNP in the Immunoregulatory Molecule CTLA-4 Controls mRNA Splicing In Vivo but Does Not Alter Diabetes Susceptibility in the NOD Mouse.

    PubMed

    Jakubczik, Fabian; Jones, Ken; Nichols, Jennifer; Mansfield, William; Cooke, Anne; Holmes, Nick

    2016-01-01

    CTLA-4 is a critical "checkpoint" regulator in autoimmunity. Variation in CTLA-4 isoform expression has been linked to type 1 diabetes development in human and NOD mouse studies. In the NOD mouse, a causative link between increased expression of the minor isoform ligand-independent CTLA-4 (liCTLA4) and a reduction in diabetes has become widely accepted. Altered splicing of CTLA-4 has been attributed to a single nucleotide polymorphism (SNP) in Ctla4 exon2 (e2_77A/G). To investigate this link, we have used NOD embryonic stem (ES) cells to generate a novel NOD transgenic line with the 77A/G SNP. This strain phenocopies the increase in splicing toward the liCTLA4 isoform seen in B10 Idd5.1 mice. Crucially, the SNP does not alter the spontaneous incidence of diabetes, the incidence of cyclophosphamide-induced diabetes, or the activation of diabetogenic T-cell receptor transgenic CD4(+) T cells after adoptive transfer. Our results show that one or more of the many other linked genetic variants between the B10 and NOD genomes are required for the diabetes protection conferred by Idd5.1. With the NOD mouse model closely mimicking the human disease, our data demonstrate that knock-in transgenic mice on the NOD background can test causative mutations relevant in human diabetes.

  4. HapRice, an SNP haplotype database and a web tool for rice.

    PubMed

    Yonemaru, Jun-ichi; Ebana, Kaworu; Yano, Masahiro

    2014-01-01

    Genome-wide single nucleotide polymorphism (SNP) analysis is a promising tool to examine the genetic diversity of rice populations and genetic traits of scientific and economic importance. Next-generation sequencing technology has accelerated the re-sequencing of diverse rice varieties and the discovery of genome-wide SNPs. Notably, validation of these SNPs by a high-throughput genotyping system, such as an SNP array, could provide a manageable and highly accurate SNP set. To enhance the potential utility of genome-wide SNPs for geneticists and breeders, analysis tools need to be developed. Here, we constructed an SNP haplotype database, which allows visualization of the allele frequency of all SNPs in the genome browser. We calculated the allele frequencies of 3,334 SNPs in 76 accessions from the world rice collection and 3,252 SNPs in 177 Japanese rice accessions; all these SNPs have been validated in our previous studies. The SNP haplotypes were defined by the allele frequency in each cultivar group (aus, indica, tropical japonica and temperate japonica) for the world rice accessions, and in non-irrigated and three irrigated groups (three variety registration periods) for Japanese rice accessions. We also developed web tools for finding polymorphic SNPs between any two rice accessions and for the primer design to develop cleaved amplified polymorphic sequence markers at any SNP. The 'HapRice' database and the web tools can be accessed at http://qtaro.abr.affrc.go.jp/index.html. In addition, we established a core SNP set consisting of 768 SNPs uniformly distributed in the rice genome; this set is of a practically appropriate size for use in rice genetic analysis.
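    Two of the operations described above, computing allele frequencies across a group of accessions and finding SNPs that are polymorphic between two accessions, can be sketched as follows. The data layout (a dictionary of accession name to SNP calls), the accession and SNP names, and the function names are all hypothetical; this is not HapRice's schema or code.

```python
from collections import Counter


def allele_frequencies(genotypes: dict, snp_id: str) -> dict:
    """Frequency of each allele at one SNP across all accessions.

    Accessions with a missing call (None or absent) are skipped.
    """
    calls = [g.get(snp_id) for g in genotypes.values()]
    counts = Counter(c for c in calls if c is not None)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()} if total else {}


def polymorphic_snps(genotypes: dict, acc_a: str, acc_b: str) -> list:
    """SNP ids where both accessions are called and carry different alleles."""
    a, b = genotypes[acc_a], genotypes[acc_b]
    shared = set(a) & set(b)
    return sorted(
        s for s in shared
        if a[s] is not None and b[s] is not None and a[s] != b[s]
    )


# Invented inbred-line genotypes, one allele per SNP.
GENOTYPES = {
    "acc1": {"snp1": "A", "snp2": "C", "snp3": "G"},
    "acc2": {"snp1": "A", "snp2": "T", "snp3": None},
    "acc3": {"snp1": "G", "snp2": "T", "snp3": "G"},
    "acc4": {"snp1": "A", "snp2": "T", "snp3": "T"},
}

print(allele_frequencies(GENOTYPES, "snp1"))
print(polymorphic_snps(GENOTYPES, "acc1", "acc2"))
```

    Treating a missing call as "not polymorphic" is a design choice made here for simplicity; a real tool might report such SNPs separately.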

  5. fcGENE: A Versatile Tool for Processing and Transforming SNP Datasets

    PubMed Central

    Roshyara, Nab Raj; Scholz, Markus

    2014-01-01

    Background: Modern analysis of high-dimensional SNP data requires a number of biometrical and statistical methods such as pre-processing, analysis of population structure, association analysis and genotype imputation. Software used for these purposes often relies on specific and incompatible input and output data formats. Therefore, extensive data management including multiple format conversions is necessary during analyses. Methods: In order to support fast and efficient management and bio-statistical quality control of high-dimensional SNP data, we developed the publicly available software fcGENE using the C++ object-oriented programming language. This software simplifies and automates the use of different existing analysis packages, especially during the workflow of genotype imputations and corresponding analyses. Results: fcGENE transforms SNP data and imputation results into different formats required for a large variety of analysis packages such as PLINK, SNPTEST, HAPLOVIEW, EIGENSOFT, GenABEL and tools used for genotype imputation such as MaCH, IMPUTE, BEAGLE and others. Data management tasks like merging, splitting, and extracting SNP and pedigree information can be performed. fcGENE also supports a number of bio-statistical quality control processes and quality-based filtering processes at SNP- and sample-wise level. The tool also generates templates of commands required to run specific software packages, especially those required for genotype imputation. We demonstrate the functionality of fcGENE by example workflows of SNP data analyses and provide a comprehensive manual of commands, options and applications. Conclusions: We have developed a user-friendly open-source software, fcGENE, which comprehensively supports SNP data management, quality control and analysis workflows. Download statistics and corresponding feedback indicate that the software is highly recognised and extensively applied by the scientific community. PMID:25050709

  6. SNP Design from 454 Sequencing of Podosphaera plantaginis Transcriptome Reveals a Genetically Diverse Pathogen Metapopulation with High Levels of Mixed-Genotype Infection

    PubMed Central

    Tollenaere, Charlotte; Susi, Hanna; Nokso-Koivisto, Jussi; Koskinen, Patrik; Tack, Ayco; Auvinen, Petri; Paulin, Lars; Frilander, Mikko J.; Lehtonen, Rainer; Laine, Anna-Liisa

    2012-01-01

    Background: Molecular tools may greatly improve our understanding of pathogen evolution and epidemiology, but technical constraints have hindered the development of genetic resources for parasites compared to free-living organisms. This study aims at developing molecular tools for Podosphaera plantaginis, an obligate fungal pathogen of Plantago lanceolata. This interaction has been intensively studied in the Åland archipelago of Finland with epidemiological data collected from over 4,000 host populations annually since 2001. Principal Findings: A cDNA library of a pooled sample of fungal conidia was sequenced on the 454 GS-FLX platform. Over 549,411 reads were obtained and annotated into 45,245 contigs. Annotation data was acquired for 65.2% of the assembled sequences. The transcriptome assembly was screened for SNP loci, as well as for functionally important genes (mating-type genes and potential effector proteins). A genotyping assay of 27 SNP loci was designed and tested on 380 infected leaf samples from 80 populations within the Åland archipelago. With this panel we identified 85 multilocus genotypes (MLG) with uneven frequencies across the pathogen metapopulation. Approximately half of the sampled populations contain polymorphism. Our genotyping protocol revealed mixed-genotype infection within a single host leaf to be common. Mixed infection has been proposed as one of the main drivers of pathogen evolution, and hence may be an important process in this pathosystem. Significance: The developed SNP panel offers exciting research perspectives for future studies in this well-characterized pathosystem. Also, the transcriptome provides an invaluable novel genomic resource for powdery mildews, which cause significant yield losses on commercially important crops annually. Furthermore, the features that render genetic studies in this system a challenge are shared with the majority of obligate parasitic species, and hence our results provide methodological insights

  7. Is MDM2 SNP309 Variation a Risk Factor for Head and Neck Carcinoma?

    PubMed Central

    Zhuo, Xianlu; Ye, Huiping; Li, Qi; Xiang, Zhaolan; Zhang, Xueyuan

    2016-01-01

    Murine double minute-2 (MDM2) is a negative regulator of P53, and its T309G polymorphism has been suggested as a risk factor for a variety of cancers. Increasing evidence has shown the association of MDM2 T309G polymorphism with head and neck carcinoma (HNC) risk. However, the results are inconsistent. Thus, we performed a meta-analysis to elucidate the association. The meta-analysis retrieved studies published up to August 2015, and essential information was extracted for analysis. Separate analyses on ethnicity, source of controls, sample size, detection method, and cancer types were also conducted. Odds ratios (ORs) and their 95% confidence intervals (CIs) were used to estimate the association. Pooled data from 16 case–control studies including 4625 cases and 6927 controls failed to indicate a significant association. However, in the subgroup analysis of sample sizes, an increased risk was observed in the largest sample size group (>1000) under a recessive model (OR = 1.52; 95% CI = 1.08–2.13). Increased risks were also found for nasopharyngeal cancer in the subgroup analysis of cancer types (GG vs TT: OR = 2.07; 95% CI = 1.38–3.12; dominant model: OR = 1.48; 95% CI = 1.13–1.93; recessive model: OR = 1.76; 95% CI = 1.17–2.65). The results suggest that the homozygous GG genotype of MDM2 SNP309 may be a low-penetrant risk factor for HNC, and the G allele may confer nasopharyngeal cancer susceptibility. PMID:26945408
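    As a reminder of how an odds ratio and its 95% confidence interval are obtained for one study under a recessive model (GG versus TT+TG), here is a minimal sketch using the standard Wald (Woolf) interval on the log odds ratio. It covers a single 2x2 table only and does not reproduce the pooling across studies performed in the meta-analysis; the counts and the function name are invented for illustration.

```python
import math


def odds_ratio_ci(case_exp, case_unexp, ctrl_exp, ctrl_unexp, z=1.96):
    """Odds ratio with a Wald (Woolf) confidence interval from a 2x2 table.

    z = 1.96 gives an approximate 95% interval.
    """
    cells = (case_exp, case_unexp, ctrl_exp, ctrl_unexp)
    if any(c <= 0 for c in cells):
        raise ValueError("all four cells must be positive")
    odds_ratio = (case_exp * ctrl_unexp) / (case_unexp * ctrl_exp)
    # Standard error of log(OR) is the root of the summed reciprocal counts.
    se_log = math.sqrt(sum(1.0 / c for c in cells))
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


# Hypothetical recessive-model counts: GG carriers vs. TT+TG carriers.
CASES_GG, CASES_OTHER = 120, 880
CONTROLS_GG, CONTROLS_OTHER = 80, 920

OR, LOW, HIGH = odds_ratio_ci(CASES_GG, CASES_OTHER, CONTROLS_GG, CONTROLS_OTHER)
print(f"OR = {OR:.2f}; 95% CI = {LOW:.2f}-{HIGH:.2f}")
```

    An interval that excludes 1, as reported for the subgroups in the abstract, is what is read as a statistically significant association at the 5% level.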

  8. MDM2 SNP309 promoter polymorphism confers risk for hereditary melanoma.

    PubMed

    Thunell, Lena K; Bivik, Cecilia; Wäster, Petra; Fredrikson, Mats; Stjernström, Annika; Synnerstad, Ingrid; Rosdahl, Inger; Enerbäck, Charlotta

    2014-06-01

    The p53 pathway regulates stress response, and variations in p53, MDM2, and MDM4 may predispose an individual to tumor development. The aim of this study was to assess the impact of genetic variation on sporadic and hereditary melanoma. We have analyzed a combination of three functionally relevant variants of the p53 pathway in 258 individuals with sporadic malignant melanomas, 50 with hereditary malignant melanomas, and 799 healthy controls. Genotyping was performed by PCR-restriction fragment length polymorphism, pyrosequencing, and allelic discrimination. We found an increased risk for hereditary melanoma in MDM2 GG homozygotes, which was more pronounced among women (P=0.035). In the event of pairwise combinations of the single nucleotide polymorphisms, a risk elevation was shown for MDM2 GG homozygotes/p53 wild-type Arg in hereditary melanoma (P=0.01). Individuals with sporadic melanomas of the superficial spreading type, including melanoma in situ, showed a slightly higher frequency of the MDM2 GG genotype compared with those with nodular melanomas (P=0.04). The dysplastic nevus phenotype, present in the majority of our hereditary melanoma cases and also in some sporadic cases, further enhanced the effect of the MDM2 GG genotype on melanoma risk (P=0.005). In conclusion, the results show an association between MDM2 SNP309 and an increased risk for hereditary melanoma, especially among women. Analysis of sporadic melanoma also shows an association between MDM2 and the superficial spreading melanoma subtype, as well as an association with the presence of dysplastic nevi in sporadic melanoma. PMID:24625390

  9. The extent of linkage disequilibrium in beef cattle breeds using high-density SNP genotypes

    PubMed Central

    2014-01-01

    Background: The extent of linkage disequilibrium (LD) between molecular markers impacts genome-wide association studies and implementation of genomic selection. The availability of high-density single nucleotide polymorphism (SNP) genotyping platforms makes it possible to investigate LD at an unprecedented resolution. In this work, we characterised LD decay in breeds of beef cattle of taurine, indicine and composite origins and explored its variation across autosomes and the X chromosome. Findings: In each breed, LD decayed rapidly and r² was less than 0.2 for marker pairs separated by 50 kb. The LD decay curves clustered into three groups of similar LD decay that distinguished the three main cattle types. At short distances between markers (< 10 kb), taurine breeds showed higher LD (r² = 0.45) than their indicine (r² = 0.25) and composite (r² = 0.32) counterparts. This higher LD in taurine breeds was attributed to a smaller effective population size and a stronger bottleneck during breed formation. Using all SNPs on only the X chromosome, the three cattle types could still be distinguished. However, for taurine breeds, the LD decay on the X chromosome was much faster and the background level much lower than for indicine breeds and composite populations. When using only SNPs that were polymorphic in all breeds, the analysis of the X chromosome mimicked that of the autosomes. Conclusions: The pattern of LD mirrored some aspects of the history of breed populations and showed a sharp decay with increasing physical distance between markers. We conclude that the HD chip can be used to detect association signals that remained hidden when using lower density genotyping platforms, since r² dropped below 0.2 at distances of 50 kb. PMID:24661366
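    For readers unfamiliar with the LD statistic quoted above, the following sketch computes the squared correlation between two biallelic loci from the usual definition, D = p(AB) - p(A)p(B), normalised by p(A)(1 - p(A))p(B)(1 - p(B)). It assumes phased haplotypes coded 0/1 are available, which is a simplification: the abstract does not describe how LD was estimated from the chip genotypes. The function name and the example data are invented.

```python
def ld_r2(haplotypes):
    """Squared correlation (r^2) between two biallelic loci.

    Each haplotype is a pair (a, b) of 0/1 allele codes at locus A and locus B.
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("need at least one haplotype")
    p_a = sum(a for a, _ in haplotypes) / n
    p_b = sum(b for _, b in haplotypes) / n
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both loci must be polymorphic")
    d = p_ab - p_a * p_b  # coefficient of linkage disequilibrium
    return d * d / denom


# Invented examples: complete LD and no LD.
COMPLETE = [(1, 1), (1, 1), (0, 0), (0, 0)]
INDEPENDENT = [(1, 1), (1, 0), (0, 1), (0, 0)]

print(ld_r2(COMPLETE))
print(ld_r2(INDEPENDENT))
```

    Averaging this quantity over marker pairs binned by physical distance gives the kind of decay curve the abstract summarises.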

  10. CsSNP: A Web-Based Tool for the Detecting of Comparative Segments SNPs.

    PubMed

    Wang, Yi; Wang, Shuangshuang; Zhou, Dongjie; Yang, Shuai; Xu, Yongchao; Yang, Chao; Yang, Long

    2016-07-01

    SNP (single nucleotide polymorphism) is a popular tool for the study of genetic diversity, evolution, and other areas. Therefore, it is necessary to develop a convenient, useful, robust, rapid, and open-source SNP detection tool for all researchers. Since the detection of SNPs needs special software and a series of steps including alignment, detection, analysis, and presentation, the study of SNPs is limited for nonprofessional users. CsSNP (Comparative segments SNP, http://biodb.sdau.edu.cn/cssnp/ ) is a freely available web tool based on the Blat, Blast, and Perl programs to detect comparative segments SNPs and to show the detailed information of SNPs. The results are filtered and presented in a statistics figure and a Gbrowse map. This platform contains the reference genomic sequences and coding sequences of 60 plant species, and also provides new opportunities for users to detect SNPs easily. CsSNP provides a convenient tool for nonprofessional users to find comparative segments SNPs in their own sequences, gives users the information and the analysis of SNPs, and displays these data in a dynamic map. It provides a new method to detect SNPs and may accelerate related studies. PMID:27347883

  11. Supervised learning-based tagSNP selection for genome-wide disease classifications

    PubMed Central

    Liu, Qingzhong; Yang, Jack; Chen, Zhongxue; Yang, Mary Qu; Sung, Andrew H; Huang, Xudong

    2008-01-01

    Background: Comprehensive evaluation of common genetic variations through association of single nucleotide polymorphisms (SNPs) with complex human diseases on the genome-wide scale is an active area in human genome research. One of the fundamental questions in a SNP-disease association study is to find an optimal subset of SNPs with predictive power for disease status. To find that subset while reducing study burden in terms of time and costs, one can potentially reconcile information redundancy from associations between SNP markers. Results: We have developed a feature selection method named Supervised Recursive Feature Addition (SRFA). This method combines supervised learning and statistical measures for the chosen candidate features/SNPs to reconcile the redundancy information and, in doing so, improve the classification performance in association studies. Additionally, we have proposed a Support Vector based Recursive Feature Addition (SVRFA) scheme in SNP-disease association analysis. Conclusions: We have proposed using SRFA with different statistical learning classifiers and SVRFA for both SNP selection and disease classification and then applying them to two complex disease data sets. In general, our approaches outperform the well-known feature selection method of Support Vector Machine Recursive Feature Elimination and logic regression-based SNP selection for disease classification in genetic association studies. Our study further indicates that both genetic and environmental variables should be taken into account when doing disease predictions and classifications for the most complex human diseases that have gene-environment interactions. PMID:18366619

  12. Highly specific SNP detection using 2D graphene electronics and DNA strand displacement

    PubMed Central

    Hwang, Michael T.; Landon, Preston B.; Lee, Joon; Choi, Duyoung; Mo, Alexander H.; Glinsky, Gennadi; Lal, Ratnesh

    2016-01-01

    Single-nucleotide polymorphisms (SNPs) in a gene sequence are markers for a variety of human diseases. Detection of SNPs with high specificity and sensitivity is essential for effective practical implementation of personalized medicine. Current DNA sequencing, including SNP detection, primarily uses enzyme-based methods or fluorophore-labeled assays that are time-consuming, need laboratory-scale settings, and are expensive. Previously reported electrical charge-based SNP detectors have insufficient specificity and accuracy, limiting their effectiveness. Here, we demonstrate the use of a DNA strand displacement-based probe on a graphene field effect transistor (FET) for high-specificity, single-nucleotide mismatch detection. The single mismatch was detected by measuring strand displacement-induced resistance (and hence current) change and Dirac point shift in a graphene FET. SNP detection in large double-helix DNA strands (e.g., 47 nt) minimizes false-positive results. Our electrical sensor-based SNP detection technology, without labeling and without apparent cross-hybridization artifacts, would allow fast, sensitive, and portable SNP detection with single-nucleotide resolution. The technology will have a wide range of applications in digital and implantable biosensors and high-throughput DNA genotyping, with transformative implications for personalized medicine. PMID:27298347

  13. Highly specific SNP detection using 2D graphene electronics and DNA strand displacement.

    PubMed

    Hwang, Michael T; Landon, Preston B; Lee, Joon; Choi, Duyoung; Mo, Alexander H; Glinsky, Gennadi; Lal, Ratnesh

    2016-06-28

    Single-nucleotide polymorphisms (SNPs) in a gene sequence are markers for a variety of human diseases. Detection of SNPs with high specificity and sensitivity is essential for effective practical implementation of personalized medicine. Current DNA sequencing, including SNP detection, primarily uses enzyme-based methods or fluorophore-labeled assays that are time-consuming, need laboratory-scale settings, and are expensive. Previously reported electrical charge-based SNP detectors have insufficient specificity and accuracy, limiting their effectiveness. Here, we demonstrate the use of a DNA strand displacement-based probe on a graphene field effect transistor (FET) for high-specificity, single-nucleotide mismatch detection. The single mismatch was detected by measuring strand displacement-induced resistance (and hence current) change and Dirac point shift in a graphene FET. SNP detection in large double-helix DNA strands (e.g., 47 nt) minimizes false-positive results. Our electrical sensor-based SNP detection technology, without labeling and without apparent cross-hybridization artifacts, would allow fast, sensitive, and portable SNP detection with single-nucleotide resolution. The technology will have a wide range of applications in digital and implantable biosensors and high-throughput DNA genotyping, with transformative implications for personalized medicine.

  14. Mining and Analysis of SNP in Response to Salinity Stress in Upland Cotton (Gossypium hirsutum L.)

    PubMed Central

    Wang, Xiaoge; Lu, Xuke; Wang, Junjuan; Wang, Delong; Yin, Zujun; Fan, Weili; Wang, Shuai; Ye, Wuwei

    2016-01-01

    Salinity stress is a major abiotic factor that affects crop output, and because cotton is a pioneer crop in saline and alkaline land, the study of its salt tolerance is particularly important. In our experiment, four salt-tolerant varieties with different salt tolerance indexes, CRI35 (65.04%), Kanghuanwei164 (56.19%), Zhong9807 (55.20%) and CRI44 (50.50%), as well as four salt-sensitive cotton varieties, Hengmian3 (48.21%), GK50 (40.20%), Xinyan96-48 (34.90%) and ZhongS9612 (24.80%), were used as the materials. These materials were divided into a salt-tolerant group (ST) and a salt-sensitive group (SS). The Illumina Cotton SNP 70K Chip was used to detect SNPs in the different cotton varieties. SNPv (SNP variation in the same seedling before and after salt stress) in the different varieties were screened; polymorphic SNP and SNPr (SNP related to salt tolerance) were obtained. Annotation and analysis of these SNPs showed that (1) the induction efficiency of salinity stress on SNPv differed among cotton materials with different salt tolerance indexes, with the induction efficiency on salt-sensitive materials significantly higher than that on salt-tolerant materials. The induction of salt stress on SNPv was obviously biased. (2) SNPv induced by salt stress may be related to the methylation changes under salt stress. (3) SNPr may influence salt tolerance of plants by affecting the expression of salt-tolerance related genes. PMID:27355327

  15. Highly specific SNP detection using 2D graphene electronics and DNA strand displacement.

    PubMed

    Hwang, Michael T; Landon, Preston B; Lee, Joon; Choi, Duyoung; Mo, Alexander H; Glinsky, Gennadi; Lal, Ratnesh

    2016-06-28

    Single-nucleotide polymorphisms (SNPs) in a gene sequence are markers for a variety of human diseases. Detection of SNPs with high specificity and sensitivity is essential for effective practical implementation of personalized medicine. Current DNA sequencing, including SNP detection, primarily uses enzyme-based methods or fluorophore-labeled assays that are time-consuming, need laboratory-scale settings, and are expensive. Previously reported electrical charge-based SNP detectors have insufficient specificity and accuracy, limiting their effectiveness. Here, we demonstrate the use of a DNA strand displacement-based probe on a graphene field effect transistor (FET) for high-specificity, single-nucleotide mismatch detection. The single mismatch was detected by measuring strand displacement-induced resistance (and hence current) change and Dirac point shift in a graphene FET. SNP detection in large double-helix DNA strands (e.g., 47 nt) minimizes false-positive results. Our electrical sensor-based SNP detection technology, without labeling and without apparent cross-hybridization artifacts, would allow fast, sensitive, and portable SNP detection with single-nucleotide resolution. The technology will have a wide range of applications in digital and implantable biosensors and high-throughput DNA genotyping, with transformative implications for personalized medicine. PMID:27298347

  16. Genome-Wide SNP Calling Using Next Generation Sequencing Data in Tomato

    PubMed Central

    Kim, Ji-Eun; Oh, Sang-Keun; Lee, Jeong-Hee; Lee, Bo-Mi; Jo, Sung-Hwan

    2014-01-01

    The tomato (Solanum lycopersicum L.) is a model plant for genome research in Solanaceae, as well as for studying crop breeding. Genome-wide single nucleotide polymorphisms (SNPs) are a valuable resource in genetic research and breeding. However, most methods for discovering genome-wide SNPs require expensive high-depth sequencing. Here, we describe a method for SNP calling using a modified version of SAMtools with improved sensitivity. We analyzed 90 Gb of raw sequence data from next-generation sequencing of two resequencing and seven transcriptome data sets from several tomato accessions. Our study identified 4,812,432 non-redundant SNPs. Moreover, the workflow of SNP calling was improved by aligning the reference genome with its own raw data. Using this approach, 131,785 SNPs were discovered from transcriptome data of seven accessions. In addition, 4,680,647 SNPs were identified from the genome of S. pimpinellifolium, which is 60 times more than the 71,637 from the PI212816 transcriptome. SNP distribution was compared between the whole genome and transcriptome of S. pimpinellifolium. Moreover, we surveyed the location of SNPs within genic and intergenic regions. Our results indicated that sufficient genome-wide SNP markers and a very sensitive SNP calling method allow for application of marker-assisted breeding and genome-wide association studies. PMID:24552708

  17. CsSNP: A Web-Based Tool for the Detecting of Comparative Segments SNPs.

    PubMed

    Wang, Yi; Wang, Shuangshuang; Zhou, Dongjie; Yang, Shuai; Xu, Yongchao; Yang, Chao; Yang, Long

    2016-07-01

    SNP (single nucleotide polymorphism) is a popular tool for the study of genetic diversity, evolution, and other areas. Therefore, it is necessary to develop a convenient, useful, robust, rapid, and open-source SNP detection tool for all researchers. Since the detection of SNPs needs special software and a series of steps including alignment, detection, analysis, and presentation, the study of SNPs is limited for nonprofessional users. CsSNP (Comparative segments SNP, http://biodb.sdau.edu.cn/cssnp/ ) is a freely available web tool based on the Blat, Blast, and Perl programs to detect comparative segments SNPs and to show the detailed information of SNPs. The results are filtered and presented in a statistics figure and a Gbrowse map. This platform contains the reference genomic sequences and coding sequences of 60 plant species, and also provides new opportunities for users to detect SNPs easily. CsSNP provides a convenient tool for nonprofessional users to find comparative segments SNPs in their own sequences, gives users the information and the analysis of SNPs, and displays these data in a dynamic map. It provides a new method to detect SNPs and may accelerate related studies.

  18. Transcriptome sequencing for SNP discovery across Cucumis melo

    PubMed Central

    2012-01-01

    from India and Africa as compared to commercial cultivars, cultigens and landraces from Eastern Europe, Western Asia and the Mediterranean basin is consistent with the evolutionary history proposed for the species. Group-specific SNVs that will be useful in introgression programs were also detected. In a sample of 143 selected putative SNPs, we verified 93% of the polymorphisms in a panel of 78 genotypes. Conclusions: This study provides the first comprehensive resequencing data for wild, exotic, and cultivated (landraces and commercial) melon transcriptomes, yielding the largest melon SNP collection available to date and representing a notable sample of the species diversity. These data provide a valuable resource for creating a catalog of allelic variants of melon genes and will aid future in-depth studies of population genetics, marker-assisted breeding, and gene identification aimed at developing improved varieties. PMID:22726804

  19. The NHGRI GWAS Catalog, a curated resource of SNP-trait associations.

    PubMed

    Welter, Danielle; MacArthur, Jacqueline; Morales, Joannella; Burdett, Tony; Hall, Peggy; Junkins, Heather; Klemm, Alan; Flicek, Paul; Manolio, Teri; Hindorff, Lucia; Parkinson, Helen

    2014-01-01

    The National Human Genome Research Institute (NHGRI) Catalog of Published Genome-Wide Association Studies (GWAS Catalog) provides a publicly available, manually curated collection of published GWAS assaying at least 100,000 single-nucleotide polymorphisms (SNPs) and all SNP-trait associations with P < 1 × 10^-5. The Catalog includes 1751 curated publications of 11,912 SNPs. In addition to the SNP-trait association data, the Catalog also publishes a quarterly diagram of all SNP-trait associations mapped to the SNPs' chromosomal locations. The Catalog can be accessed via a tabular web interface, via a dynamic visualization on the human karyotype, as a downloadable tab-delimited file and as an OWL knowledge base. This article presents a number of recent improvements to the Catalog, including novel ways for users to interact with the Catalog and changes to the curation infrastructure.

  1. Cross-Species Application of SNP Chips is Not Suitable for Identifying Runs of Homozygosity.

    PubMed

    Shafer, Aaron B A; Miller, Joshua M; Kardos, Marty

    2016-03-01

    Cross-species application of single-nucleotide polymorphism (SNP) chips is a valid, relatively cost-effective alternative to the high-throughput sequencing methods generally required to obtain a genome-wide sampling of polymorphisms. Kharzinova et al. (2015) examined the applicability of SNP chips developed in domestic bovids (cattle and sheep) to a semi-wild cervid (reindeer). The ancestors of bovids and cervids diverged between 20 and 30 million years ago (Hassanin and Douzery 2003; Bibi et al. 2013). Empirical work has shown that for a SNP chip developed in a bovid and applied to a cervid species, approximately 50% genotype success with 1% of the loci being polymorphic is expected (Miller et al. 2012). The genotyping of Kharzinova et al. (2015) follows this pattern; however, these data are not appropriate for identifying runs of homozygosity (ROH) and can be problematic for estimating linkage disequilibrium (LD) and we caution readers in this regard. PMID:26774056

  2. k-merSNP discovery: Software for alignment-and reference-free scalable SNP discovery, phylogenetics, and annotation for hundreds of microbial genomes

    SciTech Connect

    2014-11-18

    With the flood of whole genome finished and draft microbial sequences, we need faster, more scalable bioinformatics tools for sequence comparison. An algorithm is described to find single nucleotide polymorphisms (SNPs) in whole genome data. It scales to hundreds of bacterial or viral genomes, and can be used for finished and/or draft genomes available as unassembled contigs or raw, unassembled reads. The method is fast to compute, finding SNPs and building a SNP phylogeny in minutes to hours, depending on the size and diversity of the input sequences. The SNP-based trees that result are consistent with known taxonomy and trees determined in other studies. The approach we describe can handle many gigabases of sequence in a single run. The algorithm is based on k-mer analysis.
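
    The abstract says only that the algorithm is based on k-mer analysis, so the following Python sketch is an illustration of one common reference-free idea, not the authors' implementation: a candidate SNP is a k-mer (odd k) whose flanking bases match across all genomes while the central base differs. The function name `kmer_snps`, the toy sequences, and the simplifications (forward strand only, no handling of sequencing errors) are assumptions made for this example.

```python
from collections import defaultdict


def kmer_snps(genomes, k=5):
    """Find candidate SNPs without alignment or a reference.

    genomes: dict mapping a genome name to its sequence (string).
    Returns a dict mapping (left_flank, right_flank) to
    {genome_name: central_base} for loci whose central base varies.
    Simplified sketch: forward strand only, exact flank matches.
    """
    if k % 2 == 0:
        raise ValueError("k must be odd so the k-mer has a central base")
    half = k // 2

    # flank pair -> genome name -> set of central bases seen with those flanks
    loci = defaultdict(lambda: defaultdict(set))
    for name, seq in genomes.items():
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            flanks = (kmer[:half], kmer[half + 1:])
            loci[flanks][name].add(kmer[half])

    snps = {}
    for flanks, per_genome in loci.items():
        # Require the locus in every genome ...
        if len(per_genome) != len(genomes):
            continue
        # ... with exactly one central base per genome (skip repeats).
        if any(len(bases) != 1 for bases in per_genome.values()):
            continue
        alleles = {name: next(iter(bases)) for name, bases in per_genome.items()}
        if len(set(alleles.values())) > 1:
            snps[flanks] = alleles
    return snps


if __name__ == "__main__":
    toy = {"g1": "AACCGTTGA", "g2": "AACCATTGA"}  # hypothetical sequences
    print(kmer_snps(toy, k=5))
```

    Because the loci are keyed by flanking sequence rather than by coordinate, the same table works for finished genomes, contigs, or raw reads, which is the property the abstract emphasizes.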

  4. An integrated SNP mining and utilization (ISMU) pipeline for next generation sequencing data.

    PubMed

    Azam, Sarwar; Rathore, Abhishek; Shah, Trushar M; Telluri, Mohan; Amindala, BhanuPrakash; Ruperao, Pradeep; Katta, Mohan A V S K; Varshney, Rajeev K

    2014-01-01

    Open source single nucleotide polymorphism (SNP) discovery pipelines for next generation sequencing data commonly require working knowledge of the command line interface, massive computational resources and expertise, which is daunting for biologists. Further, the SNP information generated may not be readily usable for downstream processes such as genotyping. Hence, a comprehensive pipeline called Integrated SNP Mining and Utilization (ISMU) has been developed by integrating several open source next generation sequencing (NGS) tools with a graphical user interface, for SNP discovery and utilization in developing genotyping assays. The pipeline features functionalities such as pre-processing of raw data, integration of open source alignment tools (Bowtie2, BWA, Maq, NovoAlign and SOAP2), SNP prediction (SAMtools/SOAPsnp/CNS2snp and CbCC) methods and interfaces for developing genotyping assays. The pipeline outputs a list of high quality SNPs between all pairwise combinations of genotypes analyzed, in addition to the reference genome/sequence. Visualization tools (Tablet and Flapjack) integrated into the pipeline enable inspection of the alignment and errors, if any. The pipeline also provides a confidence score or polymorphism information content value with flanking sequences for identified SNPs in the standard format required for developing marker genotyping (KASP and Golden Gate) assays. The pipeline enables users to process a range of NGS datasets such as whole genome re-sequencing, restriction site associated DNA sequencing and transcriptome sequencing data at a fast speed. The pipeline is very useful for members of the plant genetics and breeding community with no computational expertise who want to discover SNPs and use them in genomics, genetics and breeding studies. The pipeline has been parallelized to process huge next generation sequencing datasets. It has been developed in Java and is available at http://hpc.icrisat.cgiar.org/ISMU as a standalone

  5. SNP markers-based map construction and genome-wide linkage analysis in Brassica napus.

    PubMed

    Raman, Harsh; Dalton-Morgan, Jessica; Diffey, Simon; Raman, Rosy; Alamery, Salman; Edwards, David; Batley, Jacqueline

    2014-09-01

    An Illumina Infinium array comprising 5306 single nucleotide polymorphism (SNP) markers was used to genotype 175 individuals of a doubled haploid population derived from a cross between Skipton and Ag-Spectrum, two Australian cultivars of rapeseed (Brassica napus L.). A genetic linkage map based on 613 SNP and 228 non-SNP (DArT, SSR, SRAP and candidate gene markers) covering 2514.8 cM was constructed and further utilized to identify loci associated with flowering time and resistance to blackleg, a disease caused by the fungus Leptosphaeria maculans. Comparison between genetic map positions of SNP markers and the sequenced Brassica rapa (A) and Brassica oleracea (C) genome scaffolds showed several genomic rearrangements in the B. napus genome. A major locus controlling resistance to L. maculans was identified at both seedling and adult plant stages on chromosome A07. QTL analyses revealed that up to 40.2% of genetic variation for flowering time was accounted for by loci having quantitative effects. Comparative mapping showed Arabidopsis and Brassica flowering genes such as Phytochrome A/D, Flowering Locus C and agamous-Like MADS box gene AGL1 map within marker intervals associated with flowering time in a DH population from Skipton/Ag-Spectrum. Genomic regions associated with flowering time and resistance to L. maculans had several SNP markers mapped within 10 cM. Our results suggest that SNP markers will be suitable for various applications such as trait introgression, comparative mapping and high-resolution mapping of loci in B. napus.

  6. Vitis Phylogenomics: Hybridization Intensities from a SNP Array Outperform Genotype Calls

    PubMed Central

    Miller, Allison J.; Matasci, Naim; Schwaninger, Heidi; Aradhya, Mallikarjuna K.; Prins, Bernard; Zhong, Gan-Yuan; Simon, Charles; Buckler, Edward S.; Myles, Sean

    2013-01-01

    Understanding relationships among species is a fundamental goal of evolutionary biology. Single nucleotide polymorphisms (SNPs) identified through next generation sequencing and related technologies enable phylogeny reconstruction by providing unprecedented numbers of characters for analysis. One approach to SNP-based phylogeny reconstruction is to identify SNPs in a subset of individuals, and then to compile SNPs on an array that can be used to genotype additional samples at hundreds or thousands of sites simultaneously. Although powerful and efficient, this method is subject to ascertainment bias because applying variation discovered in a representative subset to a larger sample favors identification of SNPs with high minor allele frequencies and introduces bias against rare alleles. Here, we demonstrate that the use of hybridization intensity data, rather than genotype calls, reduces the effects of ascertainment bias. Whereas traditional SNP calls assess known variants based on diversity housed in the discovery panel, hybridization intensity data survey variation in the broader sample pool, regardless of whether those variants are present in the initial SNP discovery process. We apply SNP genotype and hybridization intensity data derived from the Vitis9kSNP array developed for grape to show the effects of ascertainment bias and to reconstruct evolutionary relationships among Vitis species. We demonstrate that phylogenies constructed using hybridization intensities suffer less from the distorting effects of ascertainment bias, and are thus more accurate than phylogenies based on genotype calls. Moreover, we reconstruct the phylogeny of the genus Vitis using hybridization data, show that North American subgenus Vitis species are monophyletic, and resolve several previously poorly known relationships among North American species. This study builds on earlier work that applied the Vitis9kSNP array to evolutionary questions within Vitis vinifera and has general

  7. Pathways of distinction analysis: a new technique for multi-SNP analysis of GWAS data.

    PubMed

    Braun, Rosemary; Buetow, Kenneth

    2011-06-01

    Genome-wide association studies (GWAS) have become increasingly common due to advances in technology and have permitted the identification of differences in single nucleotide polymorphism (SNP) alleles that are associated with diseases. However, while typical GWAS analysis techniques treat markers individually, complex diseases (cancers, diabetes, and Alzheimer's, amongst others) are unlikely to have a single causative gene. Thus, there is a pressing need for multi-SNP analysis methods that can reveal system-level differences in cases and controls. Here, we present a novel multi-SNP GWAS analysis method called Pathways of Distinction Analysis (PoDA). The method uses GWAS data and known pathway-gene and gene-SNP associations to identify pathways that permit, ideally, the distinction of cases from controls. The technique is based upon the hypothesis that, if a pathway is related to disease risk, cases will appear more similar to other cases than to controls (or vice versa) for the SNPs associated with that pathway. By systematically applying the method to all pathways of potential interest, we can identify those for which the hypothesis holds true, i.e., pathways containing SNPs for which the samples exhibit greater within-class similarity than across classes. Importantly, PoDA improves on existing single-SNP and SNP-set enrichment analyses, in that it does not require the SNPs in a pathway to exhibit independent main effects. This permits PoDA to reveal pathways in which epistatic interactions drive risk. In this paper, we detail the PoDA method and apply it to two GWAS: one of breast cancer and the other of liver cancer. The results obtained strongly suggest that there exist pathway-wide genomic differences that contribute to disease susceptibility. PoDA thus provides an analytical tool that is complementary to existing techniques and has the power to enrich our understanding of disease genomics at the systems-level.

  8. ComB: SNP calling and mapping analysis for color and nucleotide space platforms.

    PubMed

    Souaiaia, Tade; Frazier, Zach; Chen, Ting

    2011-06-01

    The determination of single nucleotide polymorphisms (SNPs) has become faster and more cost effective since the advent of short read data from next generation sequencing platforms such as Roche's 454 Sequencer, Illumina's Solexa platform, and Applied Biosystems SOLiD sequencer. The SOLiD sequencing platform, which is capable of producing more than 6 GB of sequence data in a single run, uses a unique encoding scheme where color reads represent transitions between adjacent nucleotides. The determination of SNPs from color reads usually involves the translation of color alignments to likely nucleotide strings to facilitate the use of tools designed for nucleotide reads. This technique results in the loss of significant information in the color read, producing many incorrect SNP calls, especially if regions exist with dense or adjacent polymorphism. Additionally, color reads align ambiguously and incorrectly more often than nucleotide reads making integrated SNP calling a difficult challenge. We have developed ComB, a SNP calling tool which operates directly in color space, using a Bayesian model to incorporate unique and ambiguous reads to iteratively determine SNP identity. ComB is capable of accurately calling short consecutive nucleotide polymorphisms and densely clustered SNPs; both of which other SNP calling tools fail to identify. ComB, which is capable of using billions of short reads to accurately and efficiently perform whole human genome SNP calling in parallel, is also capable of using sequence data or even integrating sequence and color space data sets. We use real and simulated data to demonstrate that ComB's iterative strategy and recalibration of quality scores allow it to discover more true SNPs while calling fewer false positives than tools which use only color alignments as well as tools which translate color reads to nucleotide strings.
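
    To make the color-space encoding mentioned above concrete, here is a small Python sketch of two-base encoding, in which each color is the XOR of the 2-bit codes of adjacent nucleotides (A=0, C=1, G=2, T=3). It does not reproduce ComB's Bayesian model; the function names and toy sequences are assumptions for illustration. It shows why a true SNP changes two adjacent colors, while a single miscalled color corrupts every downstream base after naive translation to nucleotides.

```python
# 2-bit nucleotide codes used by two-base (color-space) encoding.
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
BASE = "ACGT"


def encode(seq):
    """Translate a nucleotide string into colors (one per adjacent pair)."""
    return [CODE[a] ^ CODE[b] for a, b in zip(seq, seq[1:])]


def decode(first_base, colors):
    """Translate colors back to nucleotides, given the known first base."""
    bases = [first_base]
    for color in colors:
        # Each color, combined with the previous base, fixes the next base.
        bases.append(BASE[CODE[bases[-1]] ^ color])
    return "".join(bases)


if __name__ == "__main__":
    ref = "ACGTAC"      # hypothetical reference fragment
    snp = "ACATAC"      # same fragment with a G->A SNP at index 2
    print(encode(ref))  # [1, 3, 1, 3, 1]
    print(encode(snp))  # [1, 1, 3, 3, 1]: two adjacent colors differ

    # One miscalled color (index 2) shifts every base after it:
    print(decode("A", [1, 1, 1, 3, 1]))  # ACACGT
```

    The contrast between the last two cases is the information that is lost when color reads are translated to nucleotide strings before SNP calling.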

  9. RAD tag sequencing as a source of SNP markers in Cynara cardunculus L

    PubMed Central

    2012-01-01

    Background: The globe artichoke (Cynara cardunculus L. var. scolymus) genome is relatively poorly explored, especially compared to those of the other major Asteraceae crops sunflower and lettuce. No SNP markers are in the public domain. We have combined the recently developed restriction-site associated DNA (RAD) approach with the Illumina DNA sequencing platform to effect the rapid and mass discovery of SNP markers for C. cardunculus. Results: RAD tags were sequenced from the genomic DNA of three C. cardunculus mapping population parents, generating 9.7 million reads, corresponding to ~1 Gbp of sequence. An assembly based on paired ends produced ~6.0 Mbp of genomic sequence, separated into ~19,000 contigs (mean length 312 bp), of which ~21% were fragments of putative coding sequence. The shared sequences allowed for the discovery of ~34,000 SNPs and nearly 800 indels, equivalent to a SNP frequency of 5.6 per 1,000 nt, and an indel frequency of 0.2 per 1,000 nt. A sample of heterozygous SNP loci was mapped by CAPS assays and this exercise provided validation of our mining criteria. The repetitive fraction of the genome had a high representation of retrotransposon sequence, followed by simple repeats, AT-low complexity regions and mobile DNA elements. The genomic k-mer distribution and CpG rate of C. cardunculus, compared with data derived from three whole genome-sequenced dicot species, provided further evidence of the random representation of the C. cardunculus genome generated by RAD sampling. Conclusion: The RAD tag sequencing approach is a cost-effective and rapid method to develop SNP markers in a highly heterozygous species. Our approach made it possible to generate a large and robust SNP dataset by the adoption of optimized filtering criteria. PMID:22214349

  10. A technical platform for PCR-based SNP screening in cereals and other crops.

    PubMed

    Wang, Zining

    2014-01-01

    With the rapid development of sequencing technologies and sequenced genomes, single-nucleotide polymorphisms (SNPs) have become a common genomic tool in the study of biological diversity, genome variation, gene mapping, cloning, and marker-assisted selection. In this chapter, PCR-based SNP screening is discussed in detail. This includes preparation of solutions and buffers, designing of tetra-primers, PCR for DNA amplification, gel electrophoresis, and SNP screening. By grasping the techniques and experience from the wet laboratories, researchers can quickly use this genomic tool to tackle problems in their research.

  11. Minimal SNP overlap among multiple panels of ancestry informative markers argues for more international collaboration.

    PubMed

    Soundararajan, Usha; Yun, Libing; Shi, Meisen; Kidd, Kenneth K

    2016-07-01

    The century-old use of genetic markers to determine population relationships has morphed in modern forensics into use of markers to determine the ancestry of an individual from a DNA sample. Researchers have identified sets of SNPs that have frequency differences among populations and many sets of SNPs have been published for the purpose of inferring ancestry. Such inference also requires reference datasets for the particular set of SNPs selected. We have identified 21 largely independent published panels of ancestry informative SNPs (AISNPs) and examined their union of 1397 SNPs. No SNP occurs in more than 6 panels. The 1397 SNPs in 21 panels yield a largely empty matrix that is inhibiting progress on more refined ability to infer ancestry for a forensic sample. The most common set of reference populations is the HGDP set of 52 small population samples totaling a thousand individuals. Only 46 (3%) of the 1397 SNPs occur in three or more panels. We assembled a new dataset for 44 of those SNPs involving 4,559 individuals from 73 populations. Analyses of this dataset provided clear differentiation of only five biogeographic regions: sub-Saharan Africa, Europe and SW Asia, South Asia, East Asia, and the Americas. This is an inadequate level of biogeographic resolution already exceeded by other panels. We conclude that more such AISNP panels are not needed and that the forensic community must collaborate to develop a common set of highly differentiating AISNPs typed on a very large number of population samples. How that can be accomplished will be the subject of future discussion. PMID:26977931
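
    The overlap analysis described above amounts to counting, for each SNP in the union of panels, how many panels include it. The Python sketch below shows that bookkeeping on toy data; the panel names and rs identifiers are hypothetical, and only the final arithmetic line uses figures from the abstract (46 of 1397 SNPs in three or more panels).

```python
from collections import Counter


def panel_counts(panels):
    """Count how many panels each SNP appears in.

    panels: dict mapping a panel name to an iterable of SNP identifiers.
    """
    counts = Counter()
    for snps in panels.values():
        counts.update(set(snps))  # set(): count a SNP once per panel
    return counts


def shared_snps(panels, min_panels=3):
    """Return (sorted SNPs found in >= min_panels panels, fraction of the union)."""
    counts = panel_counts(panels)
    shared = sorted(s for s, n in counts.items() if n >= min_panels)
    return shared, len(shared) / len(counts)


if __name__ == "__main__":
    # Hypothetical toy panels, not the 21 published AISNP panels.
    toy = {
        "panel_A": ["rs1", "rs2", "rs3"],
        "panel_B": ["rs2", "rs3", "rs4"],
        "panel_C": ["rs3", "rs5"],
    }
    print(shared_snps(toy))  # (['rs3'], 0.2)

    # Figures reported in the abstract: 46 of 1397 SNPs in >= 3 panels.
    print(round(46 / 1397 * 100, 1))  # 3.3 (percent)
```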

  12. Single-nucleotide polymorphism discovery and validation in high-density SNP array for genetic analysis in European white oaks.

    PubMed

    Lepoittevin, C; Bodénès, C; Chancerel, E; Villate, L; Lang, T; Lesur, I; Boury, C; Ehrenmann, F; Zelenica, D; Boland, A; Besse, C; Garnier-Géré, P; Plomion, C; Kremer, A

    2015-11-01

    An Illumina Infinium SNP genotyping array was constructed for European white oaks. Six individuals of Quercus petraea and Q. robur were considered for SNP discovery using both previously obtained Sanger sequences across 676 gene regions (1371 in vitro SNPs) and Roche 454 technology sequences from 5112 contigs (6542 putative in silico SNPs). The 7913 SNPs were genotyped across the six parental individuals, full-sib progenies (one within each species and two interspecific crosses between Q. petraea and Q. robur) and three natural populations from south-western France that included two additional interfertile white oak species (Q. pubescens and Q. pyrenaica). The genotyping success rate in mapping populations was 80.4% overall and 72.4% for polymorphic SNPs. In natural populations, these figures were lower (54.8% and 51.9%, respectively). Illumina genotype clusters with compression (shift of clusters on the normalized x-axis) were detected in ~25% of the successfully genotyped SNPs and may be due to the presence of paralogues. Compressed clusters were significantly more frequent for SNPs showing a priori incorrect Illumina genotypes, suggesting that they should be considered with caution or discarded. Altogether, these results show a high experimental error rate for the Infinium array (between 15% and 20% of SNPs potentially unreliable and 10% when excluding all compressed clusters), and recommendations are proposed when applying this type of high-throughput technique. Finally, results on diversity levels and shared polymorphisms across targeted white oaks and more distant species of the Quercus genus are discussed, and perspectives for future comparative studies are proposed.

  13. Mixed model methods for genomic prediction and variance component estimation of additive and dominance effects using SNP markers.

    PubMed

    Da, Yang; Wang, Chunkao; Wang, Shengwen; Hu, Guo

    2014-01-01

    We established a genomic model of quantitative traits with genomic additive and dominance relationships that parallels the traditional quantitative genetics model, which partitions a genotypic value as breeding value plus dominance deviation and calculates additive and dominance relationships using pedigree information. Based on this genomic model, two sets of computationally complementary but mathematically identical mixed model methods were developed for genomic best linear unbiased prediction (GBLUP) and genomic restricted maximum likelihood estimation (GREML) of additive and dominance effects using SNP markers. These two sets are referred to as the CE and QM sets, where the CE set was designed for large numbers of markers and the QM set was designed for large numbers of individuals. GBLUP and associated accuracy formulations for individuals in training and validation data sets were derived for breeding values, dominance deviations and genotypic values. A simulation study showed that GREML and GBLUP generally were able to capture small additive and dominance effects that each accounted for 0.00005-0.0003 of the phenotypic variance and GREML was able to differentiate true additive and dominance heritability levels. GBLUP of the total genetic value as the summation of additive and dominance effects had higher prediction accuracy than either additive or dominance GBLUP, causal variants had the highest accuracy of GREML and GBLUP, and predicted accuracies were in agreement with observed accuracies. Genomic additive and dominance relationship matrices using SNP markers were consistent with theoretical expectations. The GREML and GBLUP methods can be an effective tool for assessing the type and magnitude of genetic effects affecting a phenotype and for predicting the total genetic value at the whole genome level.

  14. Insertion sequence element single nucleotide polymorphism typing provides insights into the population structure and evolution of Mycobacterium ulcerans across Africa.

    PubMed

    Vandelannoote, Koen; Jordaens, Kurt; Bomans, Pieter; Leirs, Herwig; Durnez, Lies; Affolabi, Dissou; Sopoh, Ghislain; Aguiar, Julia; Phanzu, Delphin Mavinga; Kibadi, Kapay; Eyangoh, Sara; Manou, Louis Bayonne; Phillips, Richard Odame; Adjei, Ohene; Ablordey, Anthony; Rigouts, Leen; Portaels, Françoise; Eddyani, Miriam; de Jong, Bouke C

    2014-02-01

    Buruli ulcer is an indolent, slowly progressing necrotizing disease of the skin caused by infection with Mycobacterium ulcerans. In the present study, we applied a redesigned technique to a vast panel of M. ulcerans disease isolates and clinical samples originating from multiple African disease foci in order to (i) gain fundamental insights into the population structure and evolutionary history of the pathogen and (ii) disentangle the phylogeographic relationships within the genetically conserved cluster of African M. ulcerans. Our analyses identified 23 different African insertion sequence element single nucleotide polymorphism (ISE-SNP) types that dominate in different areas where Buruli ulcer is endemic. These ISE-SNP types appear to be the initial stages of clonal diversification from a common, possibly ancestral ISE-SNP type. ISE-SNP types were found unevenly distributed over the greater West African hydrological drainage basins. Our findings suggest that geographical barriers bordering the basins to some extent prevented bacterial gene flow between basins and that this resulted in independent focal transmission clusters associated with the hydrological drainage areas. Different phylogenetic methods yielded two well-supported sister clades within the African ISE-SNP types. The ISE-SNP types from the "pan-African clade" were found to be widespread throughout Africa, while the ISE-SNP types of the "Gabonese/Cameroonian clade" were much rarer and found in a more restricted area, which suggested that the latter clade evolved more recently. Additionally, the Gabonese/Cameroonian clade was found to form a strongly supported monophyletic group with Papua New Guinean ISE-SNP type 8, which is unrelated to other Southeast Asian ISE-SNP types.

  15. Insertion Sequence Element Single Nucleotide Polymorphism Typing Provides Insights into the Population Structure and Evolution of Mycobacterium ulcerans across Africa

    PubMed Central

    Jordaens, Kurt; Bomans, Pieter; Leirs, Herwig; Durnez, Lies; Affolabi, Dissou; Sopoh, Ghislain; Aguiar, Julia; Phanzu, Delphin Mavinga; Kibadi, Kapay; Eyangoh, Sara; Manou, Louis Bayonne; Phillips, Richard Odame; Adjei, Ohene; Ablordey, Anthony; Rigouts, Leen; Portaels, Françoise; Eddyani, Miriam; de Jong, Bouke C.

    2014-01-01

    Buruli ulcer is an indolent, slowly progressing necrotizing disease of the skin caused by infection with Mycobacterium ulcerans. In the present study, we applied a redesigned technique to a vast panel of M. ulcerans disease isolates and clinical samples originating from multiple African disease foci in order to (i) gain fundamental insights into the population structure and evolutionary history of the pathogen and (ii) disentangle the phylogeographic relationships within the genetically conserved cluster of African M. ulcerans. Our analyses identified 23 different African insertion sequence element single nucleotide polymorphism (ISE-SNP) types that dominate in different areas where Buruli ulcer is endemic. These ISE-SNP types appear to be the initial stages of clonal diversification from a common, possibly ancestral ISE-SNP type. ISE-SNP types were found unevenly distributed over the greater West African hydrological drainage basins. Our findings suggest that geographical barriers bordering the basins to some extent prevented bacterial gene flow between basins and that this resulted in independent focal transmission clusters associated with the hydrological drainage areas. Different phylogenetic methods yielded two well-supported sister clades within the African ISE-SNP types. The ISE-SNP types from the “pan-African clade” were found to be widespread throughout Africa, while the ISE-SNP types of the “Gabonese/Cameroonian clade” were much rarer and found in a more restricted area, which suggested that the latter clade evolved more recently. Additionally, the Gabonese/Cameroonian clade was found to form a strongly supported monophyletic group with Papua New Guinean ISE-SNP type 8, which is unrelated to other Southeast Asian ISE-SNP types. PMID:24296504

  17. MDM2 promoter SNP55 (rs2870820) affects risk of colon cancer but not breast-, lung-, or prostate cancer.

    PubMed

    Helwa, Reham; Gansmo, Liv B; Romundstad, Pål; Hveem, Kristian; Vatten, Lars; Ryan, Bríd M; Harris, Curtis C; Lønning, Per E; Knappskog, Stian

    2016-01-01

    Two functional SNPs (SNP285G > C; rs117039649 and SNP309T > G; rs2279744) have previously been reported to modulate Sp1 transcription factor binding to the promoter of the proto-oncogene MDM2, and to influence cancer risk. Recently, a third SNP (SNP55C > T; rs2870820) was also reported to affect Sp1 binding and MDM2 transcription. In this large population based case-control study, we genotyped MDM2 SNP55 in 10,779 Caucasian individuals, previously genotyped for SNP309 and SNP285, including cases of colon (n = 1,524), lung (n = 1,323), breast (n = 1,709) and prostate cancer (n = 2,488) and 3,735 non-cancer controls, as well as 299 healthy African-Americans. Applying the dominant model, we found an elevated risk of colon cancer among individuals harbouring SNP55TT/CT genotypes compared to the SNP55CC genotype (OR = 1.15; 95% CI = 1.01-1.30). The risk was found to be highest for left-sided colon cancer (OR = 1.21; 95% CI = 1.00-1.45) and among females (OR = 1.32; 95% CI = 1.01-1.74). Assessing combined genotypes, we found the highest risk of colon cancer among individuals harbouring the SNP55TT or CT together with the SNP309TG genotype (OR = 1.21; 95% CI = 1.00-1.46). Supporting the conclusions from the risk estimates, we found colon cancer cases carrying the SNP55TT/CT genotypes to be diagnosed at younger age as compared to SNP55CC (p = 0.053), in particular among patients carrying the SNP309TG/TT genotypes (p = 0.009). PMID:27624283
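
    The odds ratios quoted above compare carriers of the SNP55 T allele (TT/CT) with CC homozygotes under a dominant model. The abstract does not say how the intervals were obtained, so the following is only a minimal sketch of one common approach, a Wald interval on the log odds ratio from a 2x2 table. The function name and any counts passed to it are hypothetical and are not taken from the study.

```python
import math


def odds_ratio_wald_ci(case_carrier, case_noncarrier,
                       ctrl_carrier, ctrl_noncarrier, z=1.96):
    """Odds ratio and Wald confidence interval for a 2x2 table.

    Dominant-model coding (an assumption of this sketch): "carrier" means
    TT or CT, "noncarrier" means CC. z=1.96 gives an approximate 95% CI.
    """
    a, b, c, d = case_carrier, case_noncarrier, ctrl_carrier, ctrl_noncarrier
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) for a 2x2 table.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper
```

    An interval whose lower bound sits at or just above 1, as for several estimates in the abstract, indicates a borderline association.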

  18. MDM2 promoter SNP55 (rs2870820) affects risk of colon cancer but not breast-, lung-, or prostate cancer

    PubMed Central

    Helwa, Reham; Gansmo, Liv B.; Romundstad, Pål; Hveem, Kristian; Vatten, Lars; Ryan, Bríd M.; Harris, Curtis C.; Lønning, Per E.; Knappskog, Stian

    2016-01-01

    Two functional SNPs (SNP285G > C; rs117039649 and SNP309T > G; rs2279744) have previously been reported to modulate Sp1 transcription factor binding to the promoter of the proto-oncogene MDM2, and to influence cancer risk. Recently, a third SNP (SNP55C > T; rs2870820) was also reported to affect Sp1 binding and MDM2 transcription. In this large population based case-control study, we genotyped MDM2 SNP55 in 10,779 Caucasian individuals, previously genotyped for SNP309 and SNP285, including cases of colon (n = 1,524), lung (n = 1,323), breast (n = 1,709) and prostate cancer (n = 2,488) and 3,735 non-cancer controls, as well as 299 healthy African-Americans. Applying the dominant model, we found an elevated risk of colon cancer among individuals harbouring SNP55TT/CT genotypes compared to the SNP55CC genotype (OR = 1.15; 95% CI = 1.01–1.30). The risk was found to be highest for left-sided colon cancer (OR = 1.21; 95% CI = 1.00–1.45) and among females (OR = 1.32; 95% CI = 1.01–1.74). Assessing combined genotypes, we found the highest risk of colon cancer among individuals harbouring the SNP55TT or CT together with the SNP309TG genotype (OR = 1.21; 95% CI = 1.00–1.46). Supporting the conclusions from the risk estimates, we found colon cancer cases carrying the SNP55TT/CT genotypes to be diagnosed at younger age as compared to SNP55CC (p = 0.053), in particular among patients carrying the SNP309TG/TT genotypes (p = 0.009). PMID:27624283

  19. Priming of seeds with nitric oxide donor sodium nitroprusside (SNP) alleviates the inhibition on wheat seed germination by salt stress.

    PubMed

    Duan, Pei; Ding, Feng; Wang, Fang; Wang, Bao-Shan

    2007-06-01

    The effect of SNP, an NO donor, on seed germination of wheat (Triticum aestivum L. cv. 'DK961') under salt stress was studied. The results showed that priming of seeds with 0.06 mmol/L SNP for 24 h markedly alleviated the decrease of the germination percentage, germination index, vigor index and imbibition rate of wheat seeds under salt stress. SNP significantly alleviated the decrease of the beta-amylase activity but had almost no effect on the alpha-amylase activity of wheat seeds under salt stress. SNP slightly increased the alpha-amylase isoenzymes (especially isoenzyme 3) and significantly increased the beta-amylase isoenzymes (especially isoenzymes d, e, f and g). SNP pretreatment decreased Na(+) content, but increased the K(+) content, resulting in a marked increase of the K(+)/Na(+) ratio of wheat seedlings under salt stress. These results suggested that NO is involved in promoting wheat seed germination under salt stress by increasing the beta-amylase activity.

  20. An improved consensus linkage map of barley based on flow-sorted chromosomes and SNP markers

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Recent advances in high-throughput genotyping have made it easier to combine information from different mapping populations into consensus genetic maps, which provide increased marker density and genome coverage compared to individual maps. Previously, a SNP-based genotyping platform was developed a...

  1. Identification of a SNP marker associated with WB242 nematode resistance in sugar beet

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The beet-cyst nematode (Heterodera schachtii Schmidt) is one of the major diseases of sugar beet. The identification of molecular markers associated to the nematode resistance would be helpful for developing resistant varieties. The aim of this study was the identification of SNP (Single Nucleotide ...

  2. Use of microsatellite and SNP markers to characterize biotypes in Hessian fly

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Exploration of the biotype structure of Hessian fly, Mayetiola destructor (Say), would improve our knowledge regarding variation in virulence phenotypes and difference in genetic background. The objective of this study was to develop and test a panel of 18 microsatellite and 22 SNP markers to reveal...

  3. Association of Agronomic Traits with SNP Markers in Durum Wheat (Triticum turgidum L. durum (Desf.))

    PubMed Central

    Hu, Xin; Ren, Jing; Ren, Xifeng; Huang, Sisi; Sabiel, Salih A. I.; Luo, Mingcheng; Nevo, Eviatar; Fu, Chunjie; Peng, Junhua; Sun, Dongfa

    2015-01-01

    Association mapping is a powerful approach to detect associations between traits of interest and genetic markers based on linkage disequilibrium (LD) in molecular plant breeding. In this study, 150 accessions of worldwide originated durum wheat germplasm (Triticum turgidum spp. durum) were genotyped using 1,366 SNP markers. The extent of LD on each chromosome was evaluated. Association of single nucleotide polymorphism (SNP) markers with ten agronomic traits measured in four consecutive years was analyzed under a mixed linear model (MLM). Two hundred and one significant association pairs were detected in the four years. Several markers were associated with one trait, and some markers were associated with multiple traits. Some of the associated markers were in agreement with previous quantitative trait loci (QTL) analyses. The function and homology analyses of the corresponding ESTs of some SNP markers could explain many of the associations for plant height, length of main spike, number of spikelets on main spike, grain number per plant, and 1000-grain weight, etc. The SNP associations for the observed traits are generally clustered in specific chromosome regions of the wheat genome, mainly in 2A, 5A, 6A, 7A, 1B, and 6B chromosomes. This study demonstrates that association mapping can complement and enhance previous QTL analyses and provide additional information for marker-assisted selection. PMID:26110423

  4. MAFsnp: A Multi-Sample Accurate and Flexible SNP Caller Using Next-Generation Sequencing Data.

    PubMed

    Hu, Jiyuan; Li, Tengfei; Xiu, Zidi; Zhang, Hong

    2015-01-01

    Most existing statistical methods developed for calling single nucleotide polymorphisms (SNPs) using next-generation sequencing (NGS) data are based on Bayesian frameworks, and there does not exist any SNP caller that produces p-values for calling SNPs in a frequentist framework. To fill in this gap, we develop a new method MAFsnp, a Multiple-sample based Accurate and Flexible algorithm for calling SNPs with NGS data. MAFsnp is based on an estimated likelihood ratio test (eLRT) statistic. In practical situations, the involved parameter is very close to the boundary of the parametric space, so the standard large sample property is not suitable to evaluate the finite-sample distribution of the eLRT statistic. Observing that the distribution of the test statistic is a mixture of zero and a continuous part, we propose to model the test statistic with a novel two-parameter mixture distribution. Once the parameters in the mixture distribution are estimated, p-values can be easily calculated for detecting SNPs, and the multiple-testing corrected p-values can be used to control false discovery rate (FDR) at any pre-specified level. With simulated data, MAFsnp is shown to have much better control of FDR than the existing SNP callers. Through the application to two real datasets, MAFsnp is also shown to outperform the existing SNP callers in terms of calling accuracy. An R package "MAFsnp" implementing the new SNP caller is freely available at http://homepage.fudan.edu.cn/zhangh/softwares/.
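
    The last step described above, turning per-site p-values into calls at a chosen FDR level, can be sketched as follows. The abstract does not name the multiple-testing correction, so the Benjamini-Hochberg procedure used here is an assumption, and the function names are hypothetical. This is not the MAFsnp package's code and does not reproduce its eLRT statistic.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def call_snps(p_values, fdr_level=0.05):
    """Flag candidate sites whose adjusted p-value is at or below fdr_level."""
    return [q <= fdr_level for q in benjamini_hochberg(p_values)]
```

    Each candidate site contributes one p-value, and lowering fdr_level makes the calls more conservative.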

  5. Oligonucleotide array outperforms SNP array on formalin-fixed paraffin-embedded clinical samples.

    PubMed

    Nasri, Soroush; Anjomshoaa, Ahmad; Song, Sarah; Guilford, Parry; McNoe, Les; Black, Michael; Phillips, Vicky; Reeve, Anthony; Humar, Bostjan

    2010-04-01

    Compromised quality of formalin-fixed paraffin-embedded (FFPE)-derived DNA has complicated the use of archival specimens for array-based genomic studies. Recent technological advances have led to first successes in this field; however, there is currently no general agreement on the most suitable platform for the array-based analysis of FFPE DNA. In this study, FFPE and matched fresh-frozen (FF) specimens were separately analyzed with Affymetrix single nucleotide polymorphism (SNP) 6.0 and Agilent 4x44K oligonucleotide arrays to compare the genomic profiles from the two tissue sources and to assess the relative performance of the two platforms on FFPE material. Genomic DNA was extracted from matched FFPE-FF pairs of normal intestinal epithelium from four patients and was applied to the SNP and oligonucleotide platforms according to the manufacturer-recommended protocols. On the Affymetrix platform, a substantial increase in apparent copy number alterations was observed in all FFPE tissues relative to their matched FF counterparts. In contrast, FFPE and matched FF genomic profiles obtained via the Agilent platform were very similar. Both the SNP and the oligonucleotide platform performed comparably on FF material. This study demonstrates that Agilent oligonucleotide array comparative genomic hybridization generates reliable results from FFPE extracted DNA, whereas the Affymetrix SNP-based array seems less suitable for the analysis of FFPE material.

  6. Association mapping of resistance to leaf rust in emmer wheat using high throughput SNP markers

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Emmer wheat (Triticum turgidum L. subsp. dicoccum) is known to be a useful source of genes for many desirable characters for improvement of modern cultivated wheat. Recently, a panel of 181 emmer wheat accessions has been genotyped with wheat 9K SNP (single nucleotide polymorphism) markers and exte...

  7. A novel approach to analyzing fMRI and SNP data via parallel independent component analysis

    NASA Astrophysics Data System (ADS)

    Liu, Jingyu; Pearlson, Godfrey; Calhoun, Vince; Windemuth, Andreas

    2007-03-01

    There is current interest in understanding genetic influences on brain function in both the healthy and the disordered brain. Parallel independent component analysis, a new method for analyzing multimodal data, is proposed in this paper and applied to functional magnetic resonance imaging (fMRI) and a single nucleotide polymorphism (SNP) array. The method aims to identify the independent components of each modality and the relationship between the two modalities. We analyzed 92 participants, including 29 schizophrenia (SZ) patients, 13 unaffected SZ relatives, and 50 healthy controls. We found a correlation of 0.79 between one fMRI component and one SNP component. The fMRI component consists of activations in cingulate gyrus, multiple frontal gyri, and superior temporal gyrus. The related SNP component is contributed to significantly by 9 SNPs located in sets of genes, including those coding for apolipoproteins A-I and C-III, malate dehydrogenase 1 and the gamma-aminobutyric acid alpha-2 receptor. A significant difference in the presence of this SNP component is found between the SZ group (SZ patients and their relatives) and the control group. In summary, we constructed a framework to identify the interactions between brain functional and genetic information; our findings provide new insight into understanding genetic influences on brain function in a common mental disorder.

  8. De Novo sequencing of sunflower genome for SNP discovery using RAD (Restriction site Associated DNA) approach

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Application of Single Nucleotide Polymorphism (SNP) marker technology as a tool in sunflower breeding programs offers enormous potential to improve sunflower genetics, and facilitate faster release of sunflower hybrids to the market place. Through a National Sunflower Association (NSA) funded initia...

  9. Utilization of a whole genome SNP panel for efficient genetic mapping in the mouse

    PubMed Central

    Moran, Jennifer L.; Bolton, Andrew D.; Tran, Pamela V.; Brown, Alison; Dwyer, Noelle D.; Manning, Danielle K.; Bjork, Bryan C.; Li, Cheng; Montgomery, Kate; Siepka, Sandra M.; Vitaterna, Martha Hotz; Takahashi, Joseph S.; Wiltshire, Tim; Kwiatkowski, David J.; Kucherlapati, Raju; Beier, David R.

    2006-01-01

    Phenotype-driven genetics can be used to create mouse models of human disease and birth defects. However, the utility of these mutant models is limited without identification of the causal gene. To facilitate genetic mapping, we developed a fixed single nucleotide polymorphism (SNP) panel of 394 SNPs as an alternative to analyses using simple sequence length polymorphism (SSLP) marker mapping. With the SNP panel, chromosomal locations for 22 monogenic mutants were identified. The average number of affected progeny genotyped for mapped monogenic mutations is nine. Map locations for several mutants have been obtained with as few as four affected progeny. The average size of genetic intervals obtained for these mutants is 43 Mb, with a range of 17–83 Mb. Thus, our SNP panel allows for identification of moderate resolution map position with small numbers of mice in a high-throughput manner. Importantly, the panel is suitable for mapping crosses from many inbred and wild-derived inbred strain combinations. The chromosomal localizations obtained with the SNP panel allow one to quickly distinguish between potentially novel loci or remutations in known genes, and facilitates fine mapping and positional cloning. By using this approach, we identified DNA sequence changes in two ethylnitrosourea-induced mutants. PMID:16461637

  10. Fine mapping of copy number variations on two cattle genome assemblies using high density SNP array

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Btau_4.0 and UMD3.1 are two distinct cattle reference genome assemblies. In our previous study using the low density BovineSNP50 array, we reported a copy number variation (CNV) analysis on Btau_4.0 with 521 animals of 21 cattle breeds, yielding 682 CNV regions with a total length of 139.8 megabases...

  11. Making a chocolate chip: development and evaluation of a 6K SNP array for Theobroma cacao.

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Theobroma cacao, the key ingredient in chocolate production, is one of the world's most important tree fruit crops, with ~4,000,000 metric tons produced across 50 countries. To move towards gene discovery and marker-assisted breeding in cacao, a single-nucleotide polymorphism (SNP) identification pr...

  12. Optimal design of low-density SNP arrays for genomic prediction: algorithm and applications

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Low-density (LD) single nucleotide polymorphism (SNP) arrays provide a cost-effective solution for genomic prediction and selection, but algorithms and computational tools are needed for their optimal design. A multiple-objective, local optimization (MOLO) algorithm was developed for design of optim...

  13. Longevity and Plasticity of CFTR Provide an Argument for Noncanonical SNP Organization in Hominid DNA

    PubMed Central

    Hill, Aubrey E.; Plyler, Zackery E.; Tiwari, Hemant; Patki, Amit; Tully, Joel P.; McAtee, Christopher W.; Moseley, Leah A.; Sorscher, Eric J.

    2014-01-01

    Like many other ancient genes, the cystic fibrosis transmembrane conductance regulator (CFTR) has survived for hundreds of millions of years. In this report, we consider whether such prodigious longevity of an individual gene – as opposed to an entire genome or species – should be considered surprising in the face of eons of relentless DNA replication errors, mutagenesis, and other causes of sequence polymorphism. The conventions that modern human SNP patterns result either from purifying selection or random (neutral) drift were not well supported, since extant models account rather poorly for the known plasticity and function (or the established SNP distributions) found in a multitude of genes such as CFTR. Instead, our analysis can be taken as a polemic indicating that SNPs in CFTR and many other mammalian genes may have been generated—and continue to accrue—in a fundamentally more organized manner than would otherwise have been expected. The resulting viewpoint contradicts earlier claims of ‘directional’ or ‘intelligent design-type’ SNP formation, and has important implications regarding the pace of DNA adaptation, the genesis of conserved non-coding DNA, and the extent to which eukaryotic SNP formation should be viewed as adaptive. PMID:25350658

  14. SNP discovery in complex allotetraploid genomes (Gossypium spp., Malvaceae) using genotyping by sequencing

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Dramatic decreases in the cost of DNA sequencing have enabled the development of very large numbers of markers based on single nucleotide polymorphism (SNP) for phylogenetic studies, population genetics, linkage mapping, marker-assisted breeding and other applications. Using Illumina next-generatio...

  15. Measuring diversity in Gossypium hirsutum using the CottonSNP63K Array

    Technology Transfer Automated Retrieval System (TEKTRAN)

    A CottonSNP63K array and accompanying cluster file has been developed and includes 45,104 intra-specific SNPs and 17,954 inter-specific SNPs for automated genotyping of cotton (Gossypium spp.) samples. Development of the cluster file included genotyping of 1,156 samples, a subset of which were iden...

  16. The use of SNP data for the monitoring of genetic diversity in cattle breeds

    Technology Transfer Automated Retrieval System (TEKTRAN)

    LD between SNPs contains information about effective population size. In this study, we investigate the use of genome-wide SNP data for marker based estimation of effective population size for two taurine cattle breeds of Africa and two local cattle breeds of Switzerland. Estimated recombination rat...

  17. SNP-based high density genetic map and mapping of btwd1 dwarfing gene in barley

    PubMed Central

    Ren, Xifeng; Wang, Jibin; Liu, Lipan; Sun, Genlou; Li, Chengdao; Luo, Hong; Sun, Dongfa

    2016-01-01

    A high-density linkage map is a valuable tool for functional genomics and breeding. A newly developed sequence-based marker technology, restriction site associated DNA (RAD) sequencing, has been proven to be powerful for the rapid discovery and genotyping of genome-wide single nucleotide polymorphism (SNP) markers and for high-density genetic map construction. The objective of this research was to construct a high-density genetic map of barley using RAD sequencing. 1894 high-quality SNP markers were developed and mapped onto all seven chromosomes together with 68 SSR markers. These 1962 markers constituted a total genetic length of 1375.8 cM and an average of 0.7 cM between adjacent loci. The number of markers within each linkage group ranged from 209 to 396. The new recessive dwarfing gene btwd1 in Huaai 11 was mapped onto the high-density linkage map. The result showed that btwd1 is positioned between SNP markers 7HL_6335336 and 7_249275418 with a genetic distance of 0.9 cM and 0.7 cM on chromosome 7H, respectively. The SNP-based high-density genetic map developed and the dwarfing gene btwd1 mapped in this study provide critical information for positional cloning of the btwd1 gene and molecular breeding of barley. PMID:27530597
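
    The marker totals and average spacing reported above can be checked with a few lines of arithmetic. This is only a sketch using the figures in the abstract. Treating each of the seven chromosomes as a single linkage group is an assumption made here to count intervals between adjacent loci.

```python
# Figures quoted in the abstract.
n_snp_markers = 1894
n_ssr_markers = 68
total_length_cm = 1375.8

# Assumption of this sketch: one linkage group per barley chromosome.
n_linkage_groups = 7

n_markers = n_snp_markers + n_ssr_markers

# Map length divided by marker count.
cm_per_marker = total_length_cm / n_markers

# Map length divided by the number of intervals between adjacent loci
# (each linkage group has one fewer interval than it has markers).
cm_per_interval = total_length_cm / (n_markers - n_linkage_groups)

print(n_markers, round(cm_per_marker, 2), round(cm_per_interval, 2))
```

    Both ways of averaging round to the 0.7 cM reported, and the marker sum matches the stated total of 1962.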

  18. SNP-based high density genetic map and mapping of btwd1 dwarfing gene in barley.

    PubMed

    Ren, Xifeng; Wang, Jibin; Liu, Lipan; Sun, Genlou; Li, Chengdao; Luo, Hong; Sun, Dongfa

    2016-01-01

    A high-density linkage map is a valuable tool for functional genomics and breeding. A newly developed sequence-based marker technology, restriction site associated DNA (RAD) sequencing, has been proven to be powerful for the rapid discovery and genotyping of genome-wide single nucleotide polymorphism (SNP) markers and for high-density genetic map construction. The objective of this research was to construct a high-density genetic map of barley using RAD sequencing. 1894 high-quality SNP markers were developed and mapped onto all seven chromosomes together with 68 SSR markers. These 1962 markers constituted a total genetic length of 1375.8 cM and an average of 0.7 cM between adjacent loci. The number of markers within each linkage group ranged from 209 to 396. The new recessive dwarfing gene btwd1 in Huaai 11 was mapped onto the high-density linkage map. The result showed that btwd1 is positioned between SNP markers 7HL_6335336 and 7_249275418 with a genetic distance of 0.9 cM and 0.7 cM on chromosome 7H, respectively. The SNP-based high-density genetic map developed and the dwarfing gene btwd1 mapped in this study provide critical information for positional cloning of the btwd1 gene and molecular breeding of barley.

  19. A web-based genome browser for 'SNP-aware' assay design

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Human and animal genomes contain an abundance of single nucleotide polymorphisms (SNPs) that are useful for genetic testing. However, the relatively large number of SNPs present in diverse populations can pose serious problems when designing assays. It is important to “mask” some SNP positions so ...

  20. The impact of SNP fingerprinting and parentage analysis on the effectiveness of variety recommendations in cacao

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Evidence for the impact of mislabeling and/or pollen contamination on consistency of field performance has been lacking to reinforce the need for strict adherence to quality control protocols in cacao seed garden and germplasm plot management. The present study used SNP fingerprinting at 64 loci to ...

  1. SNP-based high density genetic map and mapping of btwd1 dwarfing gene in barley.

    PubMed

    Ren, Xifeng; Wang, Jibin; Liu, Lipan; Sun, Genlou; Li, Chengdao; Luo, Hong; Sun, Dongfa

    2016-01-01

    A high-density linkage map is a valuable tool for functional genomics and breeding. A newly developed sequence-based marker technology, restriction site associated DNA (RAD) sequencing, has been proven to be powerful for the rapid discovery and genotyping of genome-wide single nucleotide polymorphism (SNP) markers and for high-density genetic map construction. The objective of this research was to construct a high-density genetic map of barley using RAD sequencing. 1894 high-quality SNP markers were developed and mapped onto all seven chromosomes together with 68 SSR markers. These 1962 markers constituted a total genetic length of 1375.8 cM and an average of 0.7 cM between adjacent loci. The number of markers within each linkage group ranged from 209 to 396. The new recessive dwarfing gene btwd1 in Huaai 11 was mapped onto the high-density linkage map. The result showed that btwd1 is positioned between SNP markers 7HL_6335336 and 7_249275418 with a genetic distance of 0.9 cM and 0.7 cM on chromosome 7H, respectively. The SNP-based high-density genetic map developed and the dwarfing gene btwd1 mapped in this study provide critical information for positional cloning of the btwd1 gene and molecular breeding of barley. PMID:27530597

  2. ACNE: a summarization method to estimate allele-specific copy numbers for Affymetrix SNP arrays

    PubMed Central

    Ortiz-Estevez, Maria; Bengtsson, Henrik; Rubio, Angel

    2010-01-01

    Motivation: Current algorithms for estimating DNA copy numbers (CNs) borrow concepts from gene expression analysis methods. However, single nucleotide polymorphism (SNP) arrays have special characteristics that, if taken into account, can improve the overall performance. For example, cross hybridization between alleles occurs in SNP probe pairs. In addition, most of the current CN methods are focused on total CNs, while it has been shown that allele-specific CNs are of paramount importance for some studies. Therefore, we have developed a summarization method that estimates high-quality allele-specific CNs. Results: The proposed method estimates the allele-specific DNA CNs for all Affymetrix SNP arrays dealing directly with the cross hybridization between probes within SNP probesets. This algorithm outperforms (or at least performs as well as) other state-of-the-art algorithms for computing DNA CNs. It better discerns an aberration from a normal state and it also gives more precise allele-specific CNs. Availability: The method is available in the open-source R package ACNE, which also includes an add-on to the aroma.affymetrix framework (http://www.aroma-project.org/). Contact: arubio@ceit.es Supplementary information: Supplementary data are available at Bioinformatics online. PMID:20529889

  3. Verification of genetic identity of introduced cacao germplasm in Ghana using single nucleotide polymorphism (SNP) markers

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Accurate identification of individual genotypes is important for cacao (Theobroma cacao L.) breeding, germplasm conservation and seed propagation. The development of single nucleotide polymorphism (SNP) markers in cacao offers an effective way to use a high-throughput genotyping system for cacao gen...

  4. Estimating the effect of SNP genotype on quantitative traits from pooled DNA samples

    PubMed Central

    2012-01-01

    Background: Studies to detect associations between DNA markers and traits of interest in humans and livestock benefit from increasing the number of individuals genotyped. Performing association studies on pooled DNA samples can provide greater power for a given cost. For quantitative traits, the effect of an SNP is measured in the units of the trait and here we propose and demonstrate a method to estimate SNP effects on quantitative traits from pooled DNA data. Methods: To obtain estimates of SNP effects from pooled DNA samples, we used logistic regression of estimated allele frequencies in pools on phenotype. The method was tested on a simulated dataset, and a beef cattle dataset using a model that included principal components from a genomic correlation matrix derived from the allele frequencies estimated from the pooled samples. The performance of the obtained estimates was evaluated by comparison with estimates obtained using regression of phenotype on genotype from individual samples of DNA. Results: For the simulated data, the estimates of SNP effects from pooled DNA are similar but asymptotically different to those from individual DNA data. Error in estimating allele frequencies had a large effect on the accuracy of estimated SNP effects. For the beef cattle dataset, the principal components of the genomic correlation matrix from pooled DNA were consistent with known breed groups, and could be used to account for population stratification. Correctly modeling the contemporary group structure was essential to achieve estimates similar to those from individual DNA data, and pooling DNA from individuals within groups was superior to pooling DNA across groups. For a fixed number of assays, pooled DNA samples produced results that were more correlated with results from individual genotyping data than were results from one random individual assayed from each pool. Conclusions: Use of logistic regression of allele frequency on phenotype makes it possible to estimate SNP

  5. Development and Characterization of a High Density SNP Genotyping Assay for Cattle

    PubMed Central

    Matukumalli, Lakshmi K.; Lawley, Cynthia T.; Schnabel, Robert D.; Taylor, Jeremy F.; Allan, Mark F.; Heaton, Michael P.; O'Connell, Jeff; Moore, Stephen S.; Smith, Timothy P. L.; Sonstegard, Tad S.; Van Tassell, Curtis P.

    2009-01-01

    The success of genome-wide association (GWA) studies for the detection of sequence variation affecting complex traits in humans has spurred interest in the use of large-scale high-density single nucleotide polymorphism (SNP) genotyping for the identification of quantitative trait loci (QTL) and for marker-assisted selection in model and agricultural species. A cost-effective and efficient approach for the development of a custom genotyping assay interrogating 54,001 SNP loci to support GWA applications in cattle is described. A novel algorithm for achieving a compressed inter-marker interval distribution proved remarkably successful, with a median interval of 37 kb and a maximum predicted gap of <350 kb. The assay was tested on a panel of 576 animals from 21 cattle breeds and six outgroup species and revealed that from 39,765 to 46,492 SNP are polymorphic within individual breeds (average minor allele frequency (MAF) ranging from 0.24 to 0.27). The assay also identified 79 putative copy number variants in cattle. Utility for GWA was demonstrated by localizing known variation for coat color and the presence/absence of horns to their correct genomic locations. The combination of SNP selection and the novel spacing algorithm allows an efficient approach for the development of high-density genotyping platforms in species having full or even moderate quality draft sequence. Aspects of the approach can be exploited in species which lack an available genome sequence. The BovineSNP50 assay described here is commercially available from Illumina and provides a robust platform for mapping disease genes and QTL in cattle. PMID:19390634
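
    The within-breed polymorphism counts and minor allele frequencies (MAF) quoted above rest on a simple per-locus calculation. The sketch below shows that calculation for a biallelic SNP from genotype counts. The function names and the genotype labels AA, AB and BB are illustrative assumptions, not the assay's software or its genotype-calling pipeline.

```python
def minor_allele_frequency(n_aa, n_ab, n_bb):
    """Minor allele frequency of a biallelic SNP from genotype counts."""
    total_alleles = 2 * (n_aa + n_ab + n_bb)
    if total_alleles == 0:
        raise ValueError("no genotyped animals")
    # Each AA animal carries two A alleles, each AB animal carries one.
    freq_a = (2 * n_aa + n_ab) / total_alleles
    return min(freq_a, 1.0 - freq_a)


def is_polymorphic(n_aa, n_ab, n_bb):
    """A locus is polymorphic in a sample if both alleles are observed."""
    return minor_allele_frequency(n_aa, n_ab, n_bb) > 0.0
```

    Averaging this value over the loci that are polymorphic within a breed gives a per-breed figure of the kind the abstract reports.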

  6. SNP discovery in candidate adaptive genes using exon capture in a free-ranging alpine ungulate.

    PubMed

    Roffler, Gretchen H; Amish, Stephen J; Smith, Seth; Cosart, Ted; Kardos, Marty; Schwartz, Michael K; Luikart, Gordon

    2016-09-01

    Identification of genes underlying genomic signatures of natural selection is key to understanding adaptation to local conditions. We used targeted resequencing to identify SNP markers in 5321 candidate adaptive genes associated with known immunological, metabolic and growth functions in ovids and other ungulates. We selectively targeted 8161 exons in protein-coding and nearby 5' and 3' untranslated regions of chosen candidate genes. Targeted sequences were taken from bighorn sheep (Ovis canadensis) exon capture data and directly from the domestic sheep genome (Ovis aries v. 3; oviAri3). The bighorn sheep sequences used in the Dall's sheep (Ovis dalli dalli) exon capture aligned to 2350 genes on the oviAri3 genome with an average of 2 exons each. We developed a microfluidic qPCR-based SNP chip to genotype 476 Dall's sheep from locations across their range and test for patterns of selection. Using multiple corroborating approaches (lositan and bayescan), we detected 28 SNP loci potentially under selection. We additionally identified candidate loci significantly associated with latitude, longitude, precipitation and temperature, suggesting local environmental adaptation. The three methods demonstrated consistent support for natural selection on nine genes with immune and disease-regulating functions (e.g. Ovar-DRA, APC, BATF2, MAGEB18), cell regulation signalling pathways (e.g. KRIT1, PI3K, ORRC3), and respiratory health (CYSLTR1). Characterizing adaptive allele distributions from novel genetic techniques will facilitate investigation of the influence of environmental variation on local adaptation of a northern alpine ungulate throughout its range. This research demonstrated the utility of exon capture for gene-targeted SNP discovery and subsequent SNP chip genotyping using low-quality samples in a nonmodel species. PMID:27327375

  8. SNP discovery in the transcriptome of white Pacific shrimp Litopenaeus vannamei by next generation sequencing.

    PubMed

    Yu, Yang; Wei, Jiankai; Zhang, Xiaojun; Liu, Jingwen; Liu, Chengzhang; Li, Fuhua; Xiang, Jianhai

    2014-01-01

    The application of next generation sequencing technology has greatly facilitated high throughput single nucleotide polymorphism (SNP) discovery and genotyping in genetic research. In the present study, SNPs were discovered based on two transcriptomes of Litopenaeus vannamei (L. vannamei) generated from Illumina sequencing platform HiSeq 2000. One transcriptome of L. vannamei was obtained through sequencing on the RNA from larvae at mysis stage and its reference sequence was de novo assembled. The data from another transcriptome were downloaded from NCBI and the reads of the two transcriptomes were mapped separately to the assembled reference by BWA. SNP calling was performed using SAMtools. A total of 58,717 and 36,277 SNPs with high quality were predicted from the two transcriptomes, respectively. SNP calling was also performed using the reads of two transcriptomes together, and a total of 96,040 SNPs with high quality were predicted. Among these 96,040 SNPs, 5,242 and 29,129 were predicted as non-synonymous and synonymous SNPs respectively. Characterization analysis of the predicted SNPs in L. vannamei showed that the estimated SNP frequency was 0.21% (one SNP per 476 bp) and the estimated ratio for transition to transversion was 2.0. Fifty SNPs were randomly selected for validation by Sanger sequencing after PCR amplification and 76% of SNPs were confirmed, which indicated that the SNPs predicted in this study were reliable. These SNPs will be very useful for genetic study in L. vannamei, especially for the high density linkage map construction and genome-wide association studies.
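
    The two summary statistics quoted in this abstract (SNP frequency and the transition-to-transversion ratio) can be illustrated with a minimal sketch. The function names and the toy allele pairs below are hypothetical and are not the study's pipeline, which used BWA and SAMtools; only the figure of 476 bp per SNP comes from the abstract.

```python
# Illustrative sketch only: helper names and the toy allele pairs are
# assumptions, not code or data from the study.

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def is_transition(ref: str, alt: str) -> bool:
    """A transition swaps purine<->purine or pyrimidine<->pyrimidine."""
    ref, alt = ref.upper(), alt.upper()
    if ref == alt:
        raise ValueError("ref and alt must differ")
    return ({ref, alt} <= PURINES) or ({ref, alt} <= PYRIMIDINES)


def ts_tv_ratio(snps):
    """Ratio of transitions to transversions for a list of (ref, alt) pairs."""
    ts = sum(1 for ref, alt in snps if is_transition(ref, alt))
    tv = len(snps) - ts
    if tv == 0:
        raise ValueError("no transversions; ratio undefined")
    return ts / tv


def snp_frequency_percent(bp_per_snp: float) -> float:
    """Convert 'one SNP per N bp' into a percentage of sites."""
    return 100.0 / bp_per_snp


# Toy data: 4 transitions and 2 transversions.
toy_snps = [("A", "G"), ("C", "T"), ("G", "A"), ("T", "C"), ("A", "C"), ("G", "T")]
print(ts_tv_ratio(toy_snps))
# One SNP per 476 bp, as reported in the abstract.
print(round(snp_frequency_percent(476), 2))
```

    With the abstract's figure of one SNP per 476 bp, the second call reproduces the reported frequency of about 0.21%.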

  9. Estrogen, SNP-Dependent Chemokine Expression and Selective Estrogen Receptor Modulator Regulation.

    PubMed

    Ho, Ming-Fen; Bongartz, Tim; Liu, Mohan; Kalari, Krishna R; Goss, Paul E; Shepherd, Lois E; Goetz, Matthew P; Kubo, Michiaki; Ingle, James N; Wang, Liewei; Weinshilboum, Richard M

    2016-03-01

    We previously reported, on the basis of a genome-wide association study for aromatase inhibitor-induced musculoskeletal symptoms, that single-nucleotide polymorphisms (SNPs) near the T-cell leukemia/lymphoma 1A (TCL1A) gene were associated with aromatase inhibitor-induced musculoskeletal pain and with estradiol (E2)-induced TCL1A expression. Furthermore, variation in TCL1A expression influenced the downstream expression of proinflammatory cytokines and cytokine receptors. Specifically, the top hit genome-wide association study SNP, rs11849538, created a functional estrogen response element (ERE) that displayed estrogen receptor (ER) binding and increased E2 induction of TCL1A expression only for the variant SNP genotype. In the present study, we pursued mechanisms underlying the E2-SNP-dependent regulation of TCL1A expression and, in parallel, our subsequent observations that SNPs at a distance from EREs can regulate ERα binding and that ER antagonists can reverse phenotypes associated with those SNPs. Specifically, we performed a series of functional genomic studies using a large panel of lymphoblastoid cell lines with dense genomic data that demonstrated that TCL1A SNPs at a distance from EREs can modulate ERα binding and expression of TCL1A as well as the expression of downstream immune mediators. Furthermore, 4-hydroxytamoxifen or fulvestrant could reverse these SNP-genotype effects. Similar results were found for SNPs in the IL17A cytokine and CCR6 chemokine receptor genes. These observations greatly expand our previous results and support the existence of a novel molecular mechanism that contributes to the complex interplay between estrogens and immune systems. They also raise the possibility of the pharmacological manipulation of the expression of proinflammatory cytokines and chemokines in a SNP genotype-dependent fashion. PMID:26866883

  11. Development and Validation of a High-Density SNP Genotyping Array for African Oil Palm.

    PubMed

    Kwong, Qi Bin; Teh, Chee Keng; Ong, Ai Ling; Heng, Huey Ying; Lee, Heng Leng; Mohamed, Mohaimi; Low, Joel Zi-Bin; Apparow, Sukganah; Chew, Fook Tim; Mayes, Sean; Kulaveerasingam, Harikrishna; Tammi, Martti; Appleton, David Ross

    2016-08-01

    High-density single nucleotide polymorphism (SNP) genotyping arrays are powerful tools that can measure the level of genetic polymorphism within a population. To develop a whole-genome SNP array for oil palms, SNP discovery was performed using deep resequencing of eight libraries derived from 132 Elaeis guineensis and Elaeis oleifera palms belonging to 59 origins, resulting in the discovery of >3 million putative SNPs. After SNP filtering, the Illumina OP200K custom array was built with 170 860 successful probes. Phenetic clustering analysis revealed that the array could distinguish between palms of different origins in a way consistent with pedigree records. Genome-wide linkage disequilibrium declined more slowly for the commercial populations (ranging from 120 kb at r² = 0.43 to 146 kb at r² = 0.50) when compared with the semi-wild populations (19.5 kb at r² = 0.22). Genetic fixation mapping comparing the semi-wild and commercial population identified 321 selective sweeps. A genome-wide association study (GWAS) detected a significant peak on chromosome 2 associated with the polygenic component of the shell thickness trait (based on the trait shell-to-fruit; S/F %) in tenera palms. Testing of a genomic selection model on the same trait resulted in good prediction accuracy (r = 0.65) with 42% of the S/F % variation explained. The first high-density SNP genotyping array for oil palm has been developed and shown to be robust for use in genetic studies and with potential for developing early trait prediction to shorten the oil palm breeding cycle. PMID:27112659

  12. Allelic association patterns for a dense SNP map.

    PubMed

    Weir, B S; Hill, W G; Cardon, L R

    2004-12-01

    A dense set of 5,000 SNPs on a 10-Mb region of human chromosome 20 has been typed on samples of African Americans, East Asians, and United Kingdom Caucasians. There are departures from Hardy-Weinberg equilibrium beyond the level at which markers are often discarded because of possible genotyping errors. The observation that markers showing such departures are often close together on the chromosome confirms the result that Hardy-Weinberg tests at two loci are correlated to an extent that depends on the linkage disequilibrium between those two markers. Linkage disequilibrium can be described by the composite linkage disequilibrium coefficient, the parameter that determines the behavior of case-control allelic tests of association. A useful preliminary investigation of datasets of this type is provided by counting the numbers of distinct multi-locus genotypes in windows of a few markers.
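
    Two of the checks mentioned in this abstract, a per-marker Hardy-Weinberg test and a count of distinct multi-locus genotypes in small marker windows, can be sketched as follows. This is a minimal illustration under stated assumptions: the function names, the 0/1/2 genotype coding, and the toy data are hypothetical, and the one-degree-of-freedom chi-square statistic is a textbook form rather than necessarily the authors' exact procedure.

```python
# Illustrative sketch only: names, genotype coding and toy data are assumptions.

def hwe_chi_square(n_aa: int, n_ab: int, n_bb: int) -> float:
    """Chi-square goodness-of-fit statistic for Hardy-Weinberg proportions
    at a biallelic marker, given observed genotype counts."""
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes supplied")
    p = (2 * n_aa + n_ab) / (2 * n)   # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)


def distinct_genotypes_per_window(genotypes, window: int):
    """For each sliding window of `window` adjacent markers, count how many
    distinct multi-locus genotypes occur among the individuals.
    `genotypes` holds one sequence of per-marker codes per individual."""
    n_markers = len(genotypes[0])
    return [
        len({tuple(ind[start:start + window]) for ind in genotypes})
        for start in range(n_markers - window + 1)
    ]


print(hwe_chi_square(30, 40, 30))   # heterozygote deficit relative to HWE
toy = [(0, 1, 2), (0, 1, 1), (0, 1, 2)]
print(distinct_genotypes_per_window(toy, 2))
```

    A small number of distinct genotypes in a window, relative to what the single-marker frequencies would allow, is one informal sign of strong linkage disequilibrium in that window.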

  13. One SNP at a Time: Moving beyond GWAS in Psoriasis.

    PubMed

    Ray-Jones, Helen; Eyre, Stephen; Barton, Anne; Warren, Richard B

    2016-03-01

    Although genome-wide association studies have revealed important insights into the global genetic basis of psoriasis, the findings require further investigation. At present, the known genetic risk loci are largely uncharacterized in terms of the variant or gene responsible for the association, the biological pathway involved, and the main cell type driving the pathology. This review primarily focuses on current approaches toward gaining a complete understanding of how these known genetic loci contribute to an increased disease risk in psoriasis. PMID:26811024

  14. The easy road to genome-wide medium density SNP screening in a non-model species: development and application of a 10 K SNP-chip for the house sparrow (Passer domesticus).

    PubMed

    Hagen, Ingerid J; Billing, Anna M; Rønning, Bernt; Pedersen, Sindre A; Pärn, Henrik; Slate, Jon; Jensen, Henrik

    2013-05-01

    With the advent of next generation sequencing, new avenues have opened to study genomics in wild populations of non-model species. Here, we describe a successful approach to a genome-wide medium density Single Nucleotide Polymorphism (SNP) panel in a non-model species, the house sparrow (Passer domesticus), through the development of a 10 K Illumina iSelect HD BeadChip. Genomic DNA and cDNA derived from six individuals were sequenced on a 454 GS FLX system and generated a total of 1.2 million sequences, in which SNPs were detected. As no reference genome exists for the house sparrow, we used the zebra finch (Taeniopygia guttata) reference genome to determine the most likely position of each SNP. The 10 000 SNPs on the SNP-chip were selected to be distributed evenly across 31 chromosomes, giving on average one SNP per 100 000 bp. The SNP-chip was screened across 1968 individual house sparrows from four island populations. Of the original 10 000 SNPs, 7413 were found to be variable, and 99% of these SNPs were successfully called in at least 93% of all individuals. We used the SNP-chip to demonstrate the ability of such genome-wide marker data to detect population sub-division, and compared these results to similar analyses using microsatellites. The SNP-chip will be used to map Quantitative Trait Loci (QTL) for fitness-related phenotypic traits in natural populations.

  15. Identification of differently expressed genes with specific SNP Loci for breast cancer by the integration of SNP and gene expression profiling analyses.

    PubMed

    Yuan, Pengfei; Liu, Dechun; Deng, Miao; Liu, Jiangbo; Wang, Jianguang; Zhang, Like; Liu, Qipeng; Zhang, Ting; Chen, Yanbin; Jin, Gaoyuan

    2015-04-01

    This study aims to explore the relationship between gene polymorphism and breast cancer, and to screen DEGs (differentially expressed genes) with SNPs (single nucleotide polymorphisms) related to breast cancer. The SNPs of 17 patients and the preprocessed SNP profiling GSE 32258 (38 cases of normal breast cells) were combined to identify their correlation with breast cancer using chi-square test. The gene expression profiling batch8_9 (38 cases of patients and 8 cases of normal tissue) was preprocessed with limma package, and the DEGs were filtered out. Then Fisher's method was applied to integrate DEGs and SNPs associated with breast cancer. With NetBox software, TRED (Transcriptional Regulatory Element Database) and UCSC (University of California Santa Cruz) database, genes-associated network and transcriptional regulatory network were constructed using Cytoscape software. Further, GO (Gene Ontology) and KEGG analyses were performed for genes in the networks by using siggenes. In total, 332 DEGs were identified. There were 160 breast cancer-related SNPs related to 106 genes of gene expression profiling (19 were significant DEGs). Finally, 11 co-correlated DEGs were selected. In genes-associated network, 9 significant DEGs were correlated to 23 LINKER genes while, in transcriptional regulatory network, E2F1 had regulatory relationships with 7 DEGs including MTUS1, CD44, CCNB1 and CCND2. KRAS with SNP locus of rs1137282 was involved in 35 KEGG pathways. The genes of MTUS1, CD44, CCNB1, CCND2 and KRAS with specific SNP loci may be used as biomarkers for diagnosis of breast cancer. Besides, E2F1 was recognized as the transcription factor of 7 DEGs including MTUS1, CD44, CCNB1 and CCND2.
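
    Assuming the integration step in this abstract refers to Fisher's combined probability test, the idea can be sketched as below. The function name and the example p-values are hypothetical; the abstract does not say how the per-gene p-values were paired, so this only shows the combining step for independent p-values.

```python
import math

# Illustrative sketch only: the function name and inputs are assumptions.

def fisher_combined_p(p_values):
    """Combine k independent p-values with Fisher's method.

    The statistic X = -2 * sum(ln p_i) follows a chi-square distribution
    with 2k degrees of freedom under the null. For even degrees of freedom
    the upper-tail probability has the closed form
        exp(-X/2) * sum_{i=0}^{k-1} (X/2)^i / i!
    so no external statistics library is needed.
    """
    if not p_values:
        raise ValueError("at least one p-value is required")
    if any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")
    k = len(p_values)
    stat = -2.0 * sum(math.log(p) for p in p_values)
    half = stat / 2.0
    term = 1.0    # (X/2)^0 / 0!
    total = 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return stat, math.exp(-half) * total


# Hypothetical example: one p-value from an expression test, one from a SNP test.
stat, combined = fisher_combined_p([0.05, 0.05])
print(stat, combined)
```

    Two individually borderline p-values combine into a smaller joint p-value, which is why the method suits screening genes supported by both expression and SNP evidence.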

  16. Leaf Transcriptome Sequencing for Identifying Genic-SSR Markers and SNP Heterozygosity in Crossbred Mango Variety ‘Amrapali’ (Mangifera indica L.)

    PubMed Central

    Mahato, Ajay Kumar; Sharma, Nimisha; Singh, Akshay; Srivastav, Manish; Jaiprakash; Singh, Sanjay Kumar; Singh, Anand Kumar; Sharma, Tilak Raj; Singh, Nagendra Kumar

    2016-01-01

    Mango (Mangifera indica L.) is called “king of fruits” due to its sweetness, richness of taste, diversity, large production volume and variety of end uses. Despite its huge economic importance, genomic resources in mango are scarce and the genetics of useful horticultural traits is poorly understood. Here we generated deep coverage leaf RNA sequence data for mango parental varieties ‘Neelam’, ‘Dashehari’ and their hybrid ‘Amrapali’ using next generation sequencing technologies. De-novo sequence assembly generated 27,528, 20,771 and 35,182 transcripts for the three genotypes, respectively. The transcripts were further assembled into a non-redundant set of 70,057 unigenes that were used for SSR and SNP identification and annotation. A total of 5,465 SSR loci were identified in 4,912 unigenes with 288 type I SSR (n ≥ 20 bp). One hundred type I SSR markers were randomly selected of which 43 yielded PCR amplicons of expected size in the first round of validation and were designated as validated genic-SSR markers. Further, 22,306 SNPs were identified by aligning high quality sequence reads of the three mango varieties to the reference unigene set, revealing significantly enhanced SNP heterozygosity in the hybrid Amrapali. The present study on leaf RNA sequencing of mango varieties and their hybrid provides a useful genomic resource for genetic improvement of mango. PMID:27736892

  17. Environmental Response and Genomic Regions Correlated with Rice Root Growth and Yield under Drought in the OryzaSNP Panel across Multiple Study Systems.

    PubMed

    Wade, Len J; Bartolome, Violeta; Mauleon, Ramil; Vasant, Vivek Deshmuck; Prabakar, Sumeet Mankar; Chelliah, Muthukumar; Kameoka, Emi; Nagendra, K; Reddy, K R Kamalnath; Varma, C Mohan Kumar; Patil, Kalmeshwar Gouda; Shrestha, Roshi; Al-Shugeairy, Zaniab; Al-Ogaidi, Faez; Munasinghe, Mayuri; Gowda, Veeresh; Semon, Mande; Suralta, Roel R; Shenoy, Vinay; Vadez, Vincent; Serraj, Rachid; Shashidhar, H E; Yamauchi, Akira; Babu, Ranganathan Chandra; Price, Adam; McNally, Kenneth L; Henry, Amelia

    2015-01-01

    The rapid progress in rice genotyping must be matched by advances in phenotyping. A better understanding of genetic variation in rice for drought response, root traits, and practical methods for studying them are needed. In this study, the OryzaSNP set (20 diverse genotypes that have been genotyped for SNP markers) was phenotyped in a range of field and container studies to study the diversity of rice root growth and response to drought. Of the root traits measured across more than 20 root experiments, root dry weight showed the most stable genotypic performance across studies. The environment (E) component had the strongest effect on yield and root traits. We identified genomic regions correlated with root dry weight, percent deep roots, maximum root depth, and grain yield based on a correlation analysis with the phenotypes and aus, indica, or japonica introgression regions using the SNP data. Two genomic regions were identified as hot spots in which root traits and grain yield were co-located; on chromosome 1 (39.7-40.7 Mb) and on chromosome 8 (20.3-21.9 Mb). Across experiments, the soil type/ growth medium showed more correlations with plant growth than the container dimensions. Although the correlations among studies and genetic co-location of root traits from a range of study systems points to their potential utility to represent responses in field studies, the best correlations were observed when the two setups had some similar properties. Due to the co-location of the identified genomic regions (from introgression block analysis) with QTL for a number of previously reported root and drought traits, these regions are good candidates for detailed characterization to contribute to understanding rice improvement for response to drought. This study also highlights the utility of characterizing a small set of 20 genotypes for root growth, drought response, and related genomic regions. PMID:25909711

  18. Environmental Response and Genomic Regions Correlated with Rice Root Growth and Yield under Drought in the OryzaSNP Panel across Multiple Study Systems

    PubMed Central

    Wade, Len J.; Bartolome, Violeta; Mauleon, Ramil; Vasant, Vivek Deshmuck; Prabakar, Sumeet Mankar; Chelliah, Muthukumar; Kameoka, Emi; Nagendra, K.; Reddy, K. R. Kamalnath; Varma, C. Mohan Kumar; Patil, Kalmeshwar Gouda; Shrestha, Roshi; Al-Shugeairy, Zaniab; Al-Ogaidi, Faez; Munasinghe, Mayuri; Gowda, Veeresh; Semon, Mande; Suralta, Roel R.; Shenoy, Vinay; Vadez, Vincent; Serraj, Rachid; Shashidhar, H. E.; Yamauchi, Akira; Babu, Ranganathan Chandra; Price, Adam; McNally, Kenneth L.; Henry, Amelia

    2015-01-01

    The rapid progress in rice genotyping must be matched by advances in phenotyping. A better understanding of genetic variation in rice for drought response, root traits, and practical methods for studying them are needed. In this study, the OryzaSNP set (20 diverse genotypes that have been genotyped for SNP markers) was phenotyped in a range of field and container studies to study the diversity of rice root growth and response to drought. Of the root traits measured across more than 20 root experiments, root dry weight showed the most stable genotypic performance across studies. The environment (E) component had the strongest effect on yield and root traits. We identified genomic regions correlated with root dry weight, percent deep roots, maximum root depth, and grain yield based on a correlation analysis with the phenotypes and aus, indica, or japonica introgression regions using the SNP data. Two genomic regions were identified as hot spots in which root traits and grain yield were co-located; on chromosome 1 (39.7–40.7 Mb) and on chromosome 8 (20.3–21.9 Mb). Across experiments, the soil type/ growth medium showed more correlations with plant growth than the container dimensions. Although the correlations among studies and genetic co-location of root traits from a range of study systems points to their potential utility to represent responses in field studies, the best correlations were observed when the two setups had some similar properties. Due to the co-location of the identified genomic regions (from introgression block analysis) with QTL for a number of previously reported root and drought traits, these regions are good candidates for detailed characterization to contribute to understanding rice improvement for response to drought. This study also highlights the utility of characterizing a small set of 20 genotypes for root growth, drought response, and related genomic regions. PMID:25909711

  20. A robust statistical method to detect null alleles in microsatellite and SNP datasets in both panmictic and inbred populations.

    PubMed

    Girard, Philippe

    2011-01-01

    Null alleles are common technical artifacts in genetic-based analysis. Powerful methods enabling their detection in either panmictic or inbred populations have been proposed. However, none of these methods appears unbiased in both types of mating systems, necessitating a priori knowledge of the inbreeding level of the population under study. To counter this problem, I propose to use the software FDist2 to detect the atypical fixation indices that characterize markers with null alleles. The rationale behind this approach and the parameter settings are explained. The power of the method for various sample sizes, degrees of inbreeding and null allele frequencies is evaluated using simulated microsatellite and SNP datasets and then compared to two other null allele detection methods. The results clearly show the robustness of the method proposed here as well as its greater accuracy in both panmictic and inbred populations for both types of marker. By allowing a proper detection of null alleles for a wide range of mating systems and markers, this new method is particularly appealing for numerous genetic studies using co-dominant loci. PMID:21381434

  1. Population genomic structure and linkage disequilibrium analysis of South African goat breeds using genome-wide SNP data.

    PubMed

    Mdladla, K; Dzomba, E F; Huson, H J; Muchadeyi, F C

    2016-08-01

    The sustainability of goat farming in marginal areas of southern Africa depends on local breeds that are adapted to specific agro-ecological conditions. Unimproved non-descript goats are the main genetic resources used for the development of commercial meat-type breeds of South Africa. Little is known about genetic diversity and the genetics of adaptation of these indigenous goat populations. This study investigated the genetic diversity, population structure and breed relations, linkage disequilibrium, effective population size and persistence of gametic phase in goat populations of South Africa. Three locally developed meat-type breeds of the Boer (n = 33), Savanna (n = 31), Kalahari Red (n = 40), a feral breed of Tankwa (n = 25) and unimproved non-descript village ecotypes (n = 110) from four goat-producing provinces of the Eastern Cape, KwaZulu-Natal, Limpopo and North West were assessed using the Illumina Goat 50K SNP Bead Chip assay. The proportion of SNPs with minor allele frequencies >0.05 ranged from 84.22% in the Tankwa to 97.58% in the Xhosa ecotype, with a mean of 0.32 ± 0.13 across populations. Principal components analysis, admixture and pairwise FST identified Tankwa as a genetically distinct population and supported clustering of the populations according to their historical origins. Genome-wide FST identified 101 markers potentially under positive selection in the Tankwa. Average linkage disequilibrium was highest in the Tankwa (r² = 0.25 ± 0.26) and lowest in the village ecotypes (r² range = 0.09 ± 0.12 to 0.11 ± 0.14). We observed an effective population size of <150 for all populations 13 generations ago. The estimated correlations for all breed pairs were lower than 0.80 at marker distances >100 kb with the exception of those in Savanna and Tswana populations. This study highlights the high level of genetic diversity in South African indigenous goats as well as the utility of the genome-wide SNP marker panels in
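
    The minor allele frequency filter and the pairwise linkage disequilibrium measure referred to in this abstract can be illustrated with a short sketch. The abstract does not name the software used, so this is only one common approach for unphased SNP-chip data: genotypes coded as 0/1/2 copies of one allele, with r² taken as the squared Pearson correlation between two markers. Function names and example genotypes are hypothetical.

```python
# Illustrative sketch only: names, coding and toy genotypes are assumptions.

def minor_allele_frequency(dosages):
    """MAF from 0/1/2 allele counts per individual; None marks a missing call."""
    called = [g for g in dosages if g is not None]
    if not called:
        raise ValueError("no called genotypes")
    alt_freq = sum(called) / (2 * len(called))
    return min(alt_freq, 1.0 - alt_freq)


def genotype_r2(x, y):
    """Squared Pearson correlation between two markers' 0/1/2 genotype codes,
    using only individuals called at both markers."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two individuals called at both markers")
    mean_x = sum(a for a, _ in pairs) / n
    mean_y = sum(b for _, b in pairs) / n
    sxx = sum((a - mean_x) ** 2 for a, _ in pairs)
    syy = sum((b - mean_y) ** 2 for _, b in pairs)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in pairs)
    if sxx == 0 or syy == 0:
        raise ValueError("r2 is undefined for a monomorphic marker")
    return (sxy * sxy) / (sxx * syy)


snp_a = [0, 0, 1, 2]
snp_b = [0, 1, 1, 2]
print(minor_allele_frequency(snp_a))          # passes a MAF > 0.05 filter
print(genotype_r2(snp_a, snp_b))
```

    Averaging such pairwise r² values within distance bins gives the kind of LD decay summary that the abstract reports per breed.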

  3. Evaluation of probabilistic and logical inference for a SNP annotation system.

    PubMed

    Shen, Terry H; Tarczy-Hornoch, Peter; Detwiler, Landon T; Cadag, Eithon; Carlson, Christopher S

    2010-06-01

    Genome wide association studies (GWAS) are an important approach to understanding the genetic mechanisms behind human diseases. Single nucleotide polymorphisms (SNPs) are the predominant markers used in genome wide association studies, and the ability to predict which SNPs are likely to be functional is important for both a priori and a posteriori analyses of GWA studies. This article describes the design, implementation and evaluation of a family of systems for the purpose of identifying SNPs that may cause a change in phenotypic outcomes. The methods described in this article characterize the feasibility of combinations of logical and probabilistic inference with federated data integration for both point and regional SNP annotation and analysis. Evaluations of the methods demonstrate the overall strong predictive value of logical, and logical with probabilistic, inference applied to the domain of SNP annotation.

  4. Nanoparticle-based detection and quantification of DNA with single nucleotide polymorphism (SNP) discrimination selectivity

    PubMed Central

    Qin, Wei Jie; Yung, Lin Yue Lanry

    2007-01-01

    Sequence-specific DNA detection is important in various biomedical applications such as gene expression profiling, disease diagnosis and treatment, drug discovery and forensic analysis. Here we report a gold nanoparticle-based method that allows DNA detection and quantification and is capable of single nucleotide polymorphism (SNP) discrimination. The precise quantification of single-stranded DNA is due to the formation of defined nanoparticle-DNA conjugate groupings in the presence of target/linker DNA. Conjugate groupings were characterized and quantified by gel electrophoresis. A linear correlation between the amount of target DNA and conjugate groupings was found. For SNP detection, single base mismatch discrimination was achieved for both the end- and center-base mismatch. The method described here may be useful for the development of a simple and quantitative DNA detection assay. PMID:17720714

  5. Cultivar origin and admixture detection in Turkish olive oils by SNP-based CAPS assays.

    PubMed

    Uncu, Ali Tevfik; Frary, Anne; Doganlar, Sami

    2015-03-01

    The aim of this study was to establish a DNA-based identification key to ascertain the cultivar origin of Turkish monovarietal olive oils. To reach this aim, we sequenced short fragments from five olive genes for SNP (single nucleotide polymorphism) identification and developed CAPS (cleaved amplified polymorphic sequence) assays for SNPs that alter restriction enzyme recognition motifs. When applied to the oils of 17 olive cultivars, a maximum of five CAPS assays were necessary to discriminate the varietal origin of the samples. We also tested the efficiency and limit of our approach for detecting olive oil admixtures. As a result of the analysis, we were able to detect admixing down to a limit of 20%. The SNP-based CAPS assays developed in this work can be used for testing and verification of the authenticity of Turkish monovarietal olive oils, for olive tree certification, and in germplasm characterization and preservation studies.
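
    The principle behind a CAPS assay, a SNP that creates or destroys a restriction enzyme recognition motif, can be sketched in a few lines. This is only an illustration, not the authors' pipeline: the flanking sequence is made up, the helper names are hypothetical, and EcoRI (recognition site GAATTC) stands in for whichever enzymes the study actually used.

```python
# Illustrative sketch: does a SNP change the number of restriction sites?
# The sequence and helper names are hypothetical, not taken from the study.

def count_sites(sequence: str, motif: str) -> int:
    """Count (possibly overlapping) occurrences of motif in sequence."""
    sequence, motif = sequence.upper(), motif.upper()
    return sum(
        1
        for i in range(len(sequence) - len(motif) + 1)
        if sequence[i:i + len(motif)] == motif
    )

def sites_per_allele(flank5: str, alleles: tuple, flank3: str, motif: str) -> dict:
    """Return the number of recognition sites in the amplicon for each allele."""
    return {a: count_sites(flank5 + a + flank3, motif) for a in alleles}

# GAATTC is palindromic, so scanning one strand is sufficient here.
counts = sites_per_allele("ACGTGAAT", ("T", "C"), "CGGATC", "GAATTC")

# A SNP is usable as a CAPS marker when the alleles give different site counts,
# because the digested PCR products then show different fragment patterns.
is_caps_marker = len(set(counts.values())) > 1
print(counts, is_caps_marker)
```

    In this toy amplicon the T allele completes the GAATTC motif and the C allele does not, so digestion would cut only one of the two alleles.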

  6. Functional analysis of deep intronic SNP rs13438494 in intron 24 of PCLO gene.

    PubMed

    Seo, Seunghee; Takayama, Kanako; Uno, Kyosuke; Ohi, Kazutaka; Hashimoto, Ryota; Nishizawa, Daisuke; Ikeda, Kazutaka; Ozaki, Norio; Nabeshima, Toshitaka; Miyamoto, Yoshiaki; Nitta, Atsumi

    2013-01-01

    The single nucleotide polymorphism (SNP) rs13438494 in intron 24 of PCLO was significantly associated with bipolar disorder in a meta-analysis of genome-wide association studies. In this study, we performed functional minigene analysis and bioinformatics prediction of splicing regulatory sequences to characterize the deep intronic SNP rs13438494. We constructed minigenes with A and C alleles containing exon 24, intron 24, and exon 25 of PCLO to assess the genetic effect of rs13438494 on splicing. We found that the C allele of rs13438494 reduces the splicing efficiency of the PCLO minigene. In addition, prediction analysis of enhancer/silencer motifs using the Human Splice Finder web tool indicated that rs13438494 induces the abrogation or creation of such binding sites. Our results indicate that rs13438494 alters splicing efficiency by creating or disrupting a splicing motif, which functions by binding of splicing regulatory proteins, and may ultimately result in bipolar disorder in affected people.

  7. SNP analysis using a molecular beacon-based operating cooperatively (OC) sensor.

    PubMed

    Cornett, Evan M; Kolpashchikov, Dmitry M

    2013-01-01

    Analysis of single-nucleotide polymorphisms (SNPs) is important for diagnosis of infectious and genetic diseases, for environment and population studies, as well as in forensic applications. Herein we describe in detail how to design an "operating cooperatively" (OC) sensor for highly specific SNP analysis. OC sensors use two unmodified DNA adaptor strands and a molecular beacon probe to detect nucleic acid targets with exceptional specificity towards SNPs. Genotyping can be accomplished at room temperature in a homogeneous assay. The approach is easily adaptable for any nucleic acid target, and has been successfully used for analysis of targets with complex secondary structures. Additionally, OC sensors are an easy-to-design and cost-effective method for SNP analysis and nucleic acid detection.

  8. Genome-wide SNP association-based localization of a dwarfism gene in Friesian dwarf horses.

    PubMed

    Orr, N; Back, W; Gu, J; Leegwater, P; Govindarajan, P; Conroy, J; Ducro, B; Van Arendonk, J A M; MacHugh, D E; Ennis, S; Hill, E W; Brama, P A J

    2010-12-01

    The recent completion of the horse genome and commercial availability of an equine SNP genotyping array have facilitated the mapping of disease genes. We report putative localization of the gene responsible for dwarfism, a trait in Friesian horses that is thought to have a recessive mode of inheritance, to a 2-Mb region of chromosome 14 using just 10 affected animals and 10 controls. We successfully genotyped 34,429 SNPs that were tested for association with dwarfism using chi-square tests. The most significant SNP in our study, BIEC2-239376 (P(2df) = 4.54 × 10⁻⁵, P(rec) = 7.74 × 10⁻⁶), is located close to a gene implicated in human dwarfism. Fine-mapping and resequencing analyses did not aid in further localization of the causative variant, and replication of our findings in independent sample sets will be necessary to confirm these results.
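
    To make the two reported test types concrete, the sketch below computes a 2-df genotypic chi-square test and a 1-df recessive-model test for a single SNP in 10 cases and 10 controls. The genotype counts are invented for illustration and are not data from the study, and the function names are hypothetical. With cell counts this small the chi-square approximation is rough, so the p-values are indicative only.

```python
import math

def chi_square(table):
    """Pearson chi-square statistic for a contingency table given as rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:  # skip empty columns
                stat += (observed - expected) ** 2 / expected
    return stat

def chi_square_p(stat, df):
    """Upper-tail p-value; closed forms exist for 1 and 2 degrees of freedom."""
    if df == 1:
        return math.erfc(math.sqrt(stat / 2))
    if df == 2:
        return math.exp(-stat / 2)
    raise ValueError("only df = 1 or 2 are supported in this sketch")

# Invented genotype counts (AA, AB, BB) for one SNP.
cases = [0, 1, 9]
controls = [5, 4, 1]

# Genotypic test: 2 x 3 table, 2 degrees of freedom.
stat_2df = chi_square([cases, controls])
p_2df = chi_square_p(stat_2df, df=2)

# Recessive model: BB versus (AA + AB), 2 x 2 table, 1 degree of freedom.
recessive = [[c[0] + c[1], c[2]] for c in (cases, controls)]
stat_rec = chi_square(recessive)
p_rec = chi_square_p(stat_rec, df=1)

print(f"2df: chi2={stat_2df:.2f}, p={p_2df:.2e}")
print(f"rec: chi2={stat_rec:.2f}, p={p_rec:.2e}")
```

    Collapsing the heterozygotes with one homozygote class, as the recessive model does, spends one fewer degree of freedom, which is one reason a recessive test can give a smaller p-value than the genotypic test for a truly recessive trait.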

  9. Infinium Assay for Large-scale SNP Genotyping Applications

    PubMed Central

    Adler, Adam J.; Wiley, Graham B.; Gaffney, Patrick M.

    2013-01-01

    Genotyping variants in the human genome has proven to be an efficient method to identify genetic associations with phenotypes. The distribution of variants within families or populations can facilitate identification of the genetic factors of disease. Illumina's panel of genotyping BeadChips allows investigators to genotype thousands or millions of single nucleotide polymorphisms (SNPs) or to analyze other genomic variants, such as copy number, across a large number of DNA samples. These SNPs can be spread throughout the genome or targeted in specific regions in order to maximize potential discovery. The Infinium assay has been optimized to yield high-quality, accurate results quickly. With proper setup, a single technician can process from a few hundred to over a thousand DNA samples per week, depending on the type of array. This assay guides users through every step, starting with genomic DNA and ending with the scanning of the array. Using proprietary reagents, samples are amplified, fragmented, precipitated, resuspended, hybridized to the chip, extended by a single base, stained, and scanned on either an iScan or HiScan high-resolution optical imaging system. One overnight step is required to amplify the DNA. The DNA is denatured and isothermally amplified by whole-genome amplification; therefore, no PCR is required. Samples are hybridized to the arrays during a second overnight step. By the third day, the samples are ready to be scanned and analyzed. Amplified DNA may be stockpiled in large quantities, allowing bead arrays to be processed every day of the week, thereby maximizing throughput. PMID:24300335

  10. Haplotype inference from unphased SNP data in heterozygous polyploids based on SAT

    PubMed Central

    Neigenfind, Jost; Gyetvai, Gabor; Basekow, Rico; Diehl, Svenja; Achenbach, Ute; Gebhardt, Christiane; Selbig, Joachim; Kersten, Birgit

    2008-01-01

    Background Haplotype inference based on unphased SNP markers is an important task in population genetics. Although there are different approaches to the inference of haplotypes in diploid species, the existing software is not suitable for inferring haplotypes from unphased SNP data in polyploid species, such as the cultivated potato (Solanum tuberosum). Potato species are tetraploid and highly heterozygous. Results Here we present the software SATlotyper which is able to handle polyploid and polyallelic data. SATlotyper uses the Boolean satisfiability problem to formulate Haplotype Inference by Pure Parsimony. The software excludes existing haplotype inferences, thus allowing for calculation of alternative inferences. As it is not known which of the multiple haplotype inferences are best supported by the given unphased data set, we use a bootstrapping procedure that allows for scoring of alternative inferences. Finally, by means of the bootstrapping scores, it is possible to optimise the phased genotypes belonging to a given haplotype inference. The program is evaluated with simulated and experimental SNP data generated for heterozygous tetraploid populations of potato. We show that, instead of taking the first haplotype inference reported by the program, we can significantly improve the quality of the final result by applying additional methods that include scoring of the alternative haplotype inferences and genotype optimisation. For a sub-population of nineteen individuals, the predicted results computed by SATlotyper were directly compared with results obtained by experimental haplotype inference via sequencing of cloned amplicons. Prediction and experiment gave similar results regarding the inferred haplotypes and phased genotypes. Conclusion Our results suggest that Haplotype Inference by Pure Parsimony can be solved efficiently by the SAT approach, even for data sets of unphased SNP from heterozygous polyploids.
SATlotyper is freeware and is distributed as

  11. MAFsnp: A Multi-Sample Accurate and Flexible SNP Caller Using Next-Generation Sequencing Data.

    PubMed

    Hu, Jiyuan; Li, Tengfei; Xiu, Zidi; Zhang, Hong

    2015-01-01

    Most existing statistical methods developed for calling single nucleotide polymorphisms (SNPs) using next-generation sequencing (NGS) data are based on Bayesian frameworks, and there does not exist any SNP caller that produces p-values for calling SNPs in a frequentist framework. To fill in this gap, we develop a new method MAFsnp, a Multiple-sample based Accurate and Flexible algorithm for calling SNPs with NGS data. MAFsnp is based on an estimated likelihood ratio test (eLRT) statistic. In practical situations, the involved parameter is very close to the boundary of the parametric space, so the standard large sample property is not suitable to evaluate the finite-sample distribution of the eLRT statistic. Observing that the distribution of the test statistic is a mixture of zero and a continuous part, we propose to model the test statistic with a novel two-parameter mixture distribution. Once the parameters in the mixture distribution are estimated, p-values can be easily calculated for detecting SNPs, and the multiple-testing corrected p-values can be used to control false discovery rate (FDR) at any pre-specified level. With simulated data, MAFsnp is shown to have much better control of FDR than the existing SNP callers. Through the application to two real datasets, MAFsnp is also shown to outperform the existing SNP callers in terms of calling accuracy. An R package "MAFsnp" implementing the new SNP caller is freely available at http://homepage.fudan.edu.cn/zhangh/softwares/. PMID:26309201
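
    The final step described above, turning per-site p-values into calls at a chosen FDR level, can be sketched as follows. The abstract does not say which multiple-testing correction MAFsnp applies; the Benjamini-Hochberg procedure is assumed here purely as a common choice, and the p-values are invented. The eLRT statistic and the mixture distribution themselves are not reproduced.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices, ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone adjusted values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

# Invented per-site p-values for five candidate SNP positions.
p_values = [0.001, 0.008, 0.039, 0.041, 0.6]
adjusted = benjamini_hochberg(p_values)

fdr_level = 0.05
called = [i for i, q in enumerate(adjusted) if q <= fdr_level]
print(adjusted, called)
```

    Sites whose adjusted p-value falls at or below the chosen level are called as SNPs; the level can be changed without recomputing the raw p-values.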

  12. SNP discovery using Next Generation Transcriptomic Sequencing in Atlantic herring (Clupea harengus).

    PubMed

    Helyar, Sarah J; Limborg, Morten T; Bekkevold, Dorte; Babbucci, Massimiliano; van Houdt, Jeroen; Maes, Gregory E; Bargelloni, Luca; Nielsen, Rasmus O; Taylor, Martin I; Ogden, Rob; Cariani, Alessia; Carvalho, Gary R; Panitz, Frank

    2012-01-01

    The introduction of Next Generation Sequencing (NGS) has revolutionised population genetics, providing studies of non-model species with unprecedented genomic coverage, allowing evolutionary biologists to address questions previously far beyond the reach of available resources. Furthermore, the simple mutation model of Single Nucleotide Polymorphisms (SNPs) permits cost-effective high-throughput genotyping in thousands of individuals simultaneously. Genomic resources are scarce for the Atlantic herring (Clupea harengus), a small pelagic species that sustains high revenue fisheries. This paper details the development of 578 SNPs using a combined NGS and high-throughput genotyping approach. Eight individuals covering the species distribution in the eastern Atlantic were bar-coded and multiplexed into a single cDNA library and sequenced using the 454 GS FLX platform. SNP discovery was performed by de novo sequence clustering and contig assembly, followed by the mapping of reads against consensus contig sequences. Selection of candidate SNPs for genotyping was conducted using an in silico approach. SNP validation and genotyping were performed simultaneously using an Illumina 1,536 GoldenGate assay. Although the conversion rate of candidate SNPs in the genotyping assay cannot be predicted in advance, this approach has the potential to maximise cost and time efficiencies by avoiding expensive and time-consuming laboratory stages of SNP validation. Additionally, the in silico approach leads to lower ascertainment bias in the resulting SNP panel as marker selection is based only on the ability to design primers and the predicted presence of intron-exon boundaries. Consequently SNPs with a wider spectrum of minor allele frequencies (MAFs) will be genotyped in the final panel. The genomic resources presented here represent a valuable multi-purpose resource for developing informative marker panels for population discrimination, microarray development and for population

  13. Identification and validation of copy number variants using SNP genotyping arrays from a large clinical cohort

    PubMed Central

    2012-01-01

    Background Genotypes obtained with commercial SNP arrays have been extensively used in many large case-control or population-based cohorts for SNP-based genome-wide association studies for a multitude of traits. Yet, these genotypes capture only a small fraction of the variance of the studied traits. Genomic structural variants (GSV) such as Copy Number Variation (CNV) may account for part of the missing heritability, but their comprehensive detection requires either next-generation arrays or sequencing. Sophisticated algorithms that infer CNVs by combining the intensities from SNP-probes for the two alleles can already be used to extract a partial view of such GSV from existing data sets. Results Here we present several advances to facilitate the latter approach. First, we introduce a novel CNV detection method based on a Gaussian Mixture Model. Second, we propose a new algorithm, PCA merge, for combining copy-number profiles from many individuals into consensus regions. We applied both our new methods as well as existing ones to data from 5612 individuals from the CoLaus study who were genotyped on Affymetrix 500K arrays. We developed a number of procedures in order to evaluate the performance of the different methods. This includes comparison with previously published CNVs as well as using a replication sample of 239 individuals, genotyped with Illumina 550K arrays. We also established a new evaluation procedure that employs the fact that related individuals are expected to share their CNVs more frequently than randomly selected individuals. The ability to detect both rare and common CNVs provides a valuable resource that will facilitate association studies exploring potential phenotypic associations with CNVs. 
Conclusion Our new methodologies for CNV detection and their evaluation will help in extracting additional information from the large amount of SNP-genotyping data on various cohorts and use this to explore structural variants and their impact on complex

  14. SNP Discovery Using Next Generation Transcriptomic Sequencing in Atlantic Herring (Clupea harengus)

    PubMed Central

    Bekkevold, Dorte; Babbucci, Massimiliano; van Houdt, Jeroen; Maes, Gregory E.; Bargelloni, Luca; Nielsen, Rasmus O.; Taylor, Martin I.; Ogden, Rob; Cariani, Alessia; Carvalho, Gary R.; Consortium, FishPopTrace; Panitz, Frank

    2012-01-01

    The introduction of Next Generation Sequencing (NGS) has revolutionised population genetics, providing studies of non-model species with unprecedented genomic coverage, allowing evolutionary biologists to address questions previously far beyond the reach of available resources. Furthermore, the simple mutation model of Single Nucleotide Polymorphisms (SNPs) permits cost-effective high-throughput genotyping in thousands of individuals simultaneously. Genomic resources are scarce for the Atlantic herring (Clupea harengus), a small pelagic species that sustains high revenue fisheries. This paper details the development of 578 SNPs using a combined NGS and high-throughput genotyping approach. Eight individuals covering the species distribution in the eastern Atlantic were bar-coded and multiplexed into a single cDNA library and sequenced using the 454 GS FLX platform. SNP discovery was performed by de novo sequence clustering and contig assembly, followed by the mapping of reads against consensus contig sequences. Selection of candidate SNPs for genotyping was conducted using an in silico approach. SNP validation and genotyping were performed simultaneously using an Illumina 1,536 GoldenGate assay. Although the conversion rate of candidate SNPs in the genotyping assay cannot be predicted in advance, this approach has the potential to maximise cost and time efficiencies by avoiding expensive and time-consuming laboratory stages of SNP validation. Additionally, the in silico approach leads to lower ascertainment bias in the resulting SNP panel as marker selection is based only on the ability to design primers and the predicted presence of intron-exon boundaries. Consequently SNPs with a wider spectrum of minor allele frequencies (MAFs) will be genotyped in the final panel. The genomic resources presented here represent a valuable multi-purpose resource for developing informative marker panels for population discrimination, microarray development and for population

  15. Haplotype Block Partitioning and Tag SNP Selection Using Genotype Data and Their Applications to Association Studies

    PubMed Central

    Zhang, Kui; Qin, Zhaohui S.; Liu, Jun S.; Chen, Ting; Waterman, Michael S.; Sun, Fengzhu

    2004-01-01

    Recent studies have revealed that linkage disequilibrium (LD) patterns vary across the human genome with some regions of high LD interspersed by regions of low LD. A small fraction of SNPs (tag SNPs) is sufficient to capture most of the haplotype structure of the human genome. In this paper, we develop a method to partition haplotypes into blocks and to identify tag SNPs based on genotype data by combining a dynamic programming algorithm for haplotype block partitioning and tag SNP selection based on haplotype data with a variation of the expectation maximization (EM) algorithm for haplotype inference. We assess the effects of using either haplotype or genotype data in haplotype block identification and tag SNP selection as a function of several factors, including sample size, density or number of SNPs studied, allele frequencies, fraction of missing data, and genotyping error rate, using extensive simulations. We find that a modest number of haplotype or genotype samples will result in consistent block partitions and tag SNP selection. The power of association studies based on tag SNPs using genotype data is similar to that using haplotype data. PMID:15078859
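
    The role of the EM step, estimating haplotype frequencies when only unphased genotypes are available, can be illustrated for the smallest case of two biallelic SNPs. This is a textbook two-locus EM under Hardy-Weinberg assumptions, not the authors' algorithm: it omits the dynamic programming block partition entirely, the genotype data are invented, and the function names are hypothetical.

```python
from itertools import product

# Haplotypes over two biallelic SNPs: (allele at SNP 1, allele at SNP 2).
HAPLOTYPES = list(product((0, 1), repeat=2))

def em_haplotype_freqs(genotypes, n_iter=50):
    """Estimate haplotype frequencies from unphased genotypes.

    Each genotype is (copies of allele 1 at SNP 1, copies at SNP 2).
    """
    freqs = {h: 0.25 for h in HAPLOTYPES}
    for _ in range(n_iter):
        counts = {h: 0.0 for h in HAPLOTYPES}
        for g in genotypes:
            # E-step: weight every unordered haplotype pair consistent with g.
            pairs = [
                (h1, h2)
                for h1 in HAPLOTYPES
                for h2 in HAPLOTYPES
                if h1 <= h2 and (h1[0] + h2[0], h1[1] + h2[1]) == g
            ]
            weights = [
                freqs[h1] * freqs[h2] * (1 if h1 == h2 else 2) for h1, h2 in pairs
            ]
            total = sum(weights)
            for (h1, h2), w in zip(pairs, weights):
                counts[h1] += w / total
                counts[h2] += w / total
        # M-step: expected haplotype counts become the new frequencies.
        n_chromosomes = 2 * len(genotypes)
        freqs = {h: c / n_chromosomes for h, c in counts.items()}
    return freqs

def r_squared(freqs):
    """Pairwise LD (r squared) between the two SNPs from haplotype frequencies."""
    p1 = freqs[(1, 0)] + freqs[(1, 1)]
    p2 = freqs[(0, 1)] + freqs[(1, 1)]
    d = freqs[(1, 1)] - p1 * p2
    return d * d / (p1 * (1 - p1) * p2 * (1 - p2))

# Invented sample: only the double heterozygotes (1, 1) have ambiguous phase.
genotypes = [(2, 2)] * 4 + [(0, 0)] * 4 + [(1, 1)] * 2
freqs = em_haplotype_freqs(genotypes)
r2 = r_squared(freqs)
print(freqs, r2)
```

    When r squared between two SNPs is close to 1, as in this toy sample, genotyping one of them captures nearly all the information in the other, which is the intuition behind selecting tag SNPs.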

  16. SNP genotyping in melons: genetic variation, population structure, and linkage disequilibrium.

    PubMed

    Esteras, Cristina; Formisano, Gelsomina; Roig, Cristina; Díaz, Aurora; Blanca, José; Garcia-Mas, Jordi; Gómez-Guillamón, María Luisa; López-Sesé, Ana Isabel; Lázaro, Almudena; Monforte, Antonio J; Picó, Belén

    2013-05-01

    Novel sequencing technologies were recently used to generate sequences from multiple melon (Cucumis melo L.) genotypes, enabling the in silico identification of large single nucleotide polymorphism (SNP) collections. In order to optimize the use of these markers, SNP validation and large-scale genotyping are necessary. In this paper, we present the first validated design for a genotyping array with 768 SNPs that are evenly distributed throughout the melon genome. This customized Illumina GoldenGate assay was used to genotype a collection of 74 accessions, representing most of the botanical groups of the species. Of the assayed loci, 91 % were successfully genotyped. The array provided a large number of polymorphic SNPs within and across accessions. This set of SNPs detected high levels of variation in accessions from this crop's center of origin as well as from several other areas of melon diversification. Allele distribution throughout the genome revealed regions that distinguished between the two main groups of cultivated accessions (inodorus and cantalupensis). Population structure analysis showed a subdivision into five subpopulations, reflecting the history of the crop. A considerably low level of LD was detected, which decayed rapidly within a few kilobases. Our results show that the GoldenGate assay can be used successfully for high-throughput SNP genotyping in melon. Since many of the genotyped accessions are currently being used as the parents of breeding populations in various programs, this set of mapped markers could be used for future mapping and breeding efforts.

  17. SNP Marker Discovery in Pima Cotton (Gossypium barbadense L.) Leaf Transcriptomes

    PubMed Central

    Kottapalli, Pratibha; Ulloa, Mauricio; Kottapalli, Kameswara Rao; Payton, Paxton; Burke, John

    2016-01-01

    The objective of this study was to explore the known narrow genetic diversity and discover single-nucleotide polymorphic (SNP) markers for marker-assisted breeding within Pima cotton (Gossypium barbadense L.) leaf transcriptomes. cDNA from 25-day plants of three diverse cotton genotypes [Pima S6 (PS6), Pima S7 (PS7), and Pima 3-79 (P3-79)] was sequenced on an Illumina sequencing platform. A total of 28.9 million reads (average read length of 138 bp) were generated by sequencing cDNA libraries of these three genotypes. The de novo assembly of reads generated transcriptome sets of 26,369 contigs for PS6, 25,870 contigs for PS7, and 24,796 contigs for P3-79. A Pima leaf reference transcriptome was generated consisting of 42,695 contigs. More than 10,000 single-nucleotide polymorphisms (SNPs) were identified between the genotypes, with 100% SNP frequency and a minimum of eight sequencing reads. The most prevalent SNP substitutions were C—T and A—G in these cotton genotypes. The putative SNPs identified can be utilized for characterizing genetic diversity, genotyping, and eventually in Pima cotton breeding through marker-assisted selection. PMID:27721653

  18. Making a chocolate chip: development and evaluation of a 6K SNP array for Theobroma cacao.

    PubMed

    Livingstone, Donald; Royaert, Stefan; Stack, Conrad; Mockaitis, Keithanne; May, Greg; Farmer, Andrew; Saski, Christopher; Schnell, Ray; Kuhn, David; Motamayor, Juan Carlos

    2015-08-01

    Theobroma cacao, the key ingredient in chocolate production, is one of the world's most important tree fruit crops, with ∼4,000,000 metric tons produced across 50 countries. To move towards gene discovery and marker-assisted breeding in cacao, a single-nucleotide polymorphism (SNP) identification project was undertaken using RNAseq data from 16 diverse cacao cultivars. RNA sequences were aligned to the assembled transcriptome of the cultivar Matina 1-6, and 330,000 SNPs within coding regions were identified. From these SNPs, a subset of 6,000 high-quality SNPs were selected for inclusion on an Illumina Infinium SNP array: the Cacao6kSNP array. Using Cacao6KSNP array data from over 1,000 cacao samples, we demonstrate that our custom array produces a saturated genetic map and can be used to distinguish among even closely related genotypes. Our study enhances and expands the genetic resources available to the cacao research community, and provides the genome-scale set of tools that are critical for advancing breeding with molecular markers in an agricultural species with high genetic diversity.

  19. High-throughput SNP-genotyping analysis of the relationships among Ponto-Caspian sturgeon species

    PubMed Central

    Rastorguev, Sergey M; Nedoluzhko, Artem V; Mazur, Alexander M; Gruzdeva, Natalia M; Volkov, Alexander A; Barmintseva, Anna E; Mugue, Nikolai S; Prokhortchouk, Egor B

    2013-01-01

    Legally certified sturgeon fisheries require population protection and conservation methods, including DNA tests to identify the source of valuable sturgeon roe. However, the available genetic data are insufficient to distinguish between different sturgeon populations, and are even unable to distinguish between some species. We performed high-throughput single-nucleotide polymorphism (SNP)-genotyping analysis on different populations of Russian (Acipenser gueldenstaedtii), Persian (A. persicus), and Siberian (A. baerii) sturgeon species from the Caspian Sea region (Volga and Ural Rivers), the Azov Sea, and two Siberian rivers. We found that Russian sturgeons from the Volga and Ural Rivers were essentially indistinguishable, but they differed from Russian sturgeons in the Azov Sea, and from Persian and Siberian sturgeons. We identified eight SNPs that were sufficient to distinguish these sturgeon populations with 80% confidence, and allowed the development of markers to distinguish sturgeon species. Finally, on the basis of our SNP data, we propose that the A. baerii-like mitochondrial DNA found in some Russian sturgeons from the Caspian Sea arose via an introgression event during the Pleistocene glaciation. In the present study, high-throughput genotyping analysis of several sturgeon populations was performed. SNP markers for species identification were defined. A possible explanation for the presence of the baerii-like mitotype in some Russian sturgeons in the Caspian Sea was suggested. PMID:24567827

  20. A whole-genome SNP array (RICE6K) for genomic breeding in rice.

    PubMed

    Yu, Huihui; Xie, Weibo; Li, Jing; Zhou, Fasong; Zhang, Qifa

    2014-01-01

    The advances in genotyping technology provide an opportunity to use genomic tools in crop breeding. Compared with field selections performed in conventional breeding programmes, genomics-based genotype screening can potentially reduce the number of breeding cycles and more precisely integrate target genes for particular traits into an ideal genetic background. We developed a whole-genome single nucleotide polymorphism (SNP) array, RICE6K, based on Infinium technology, using representative SNPs selected from more than four million SNPs identified from resequencing data of more than 500 rice landraces. RICE6K contains 5102 SNP and insertion-deletion (InDel) markers, about 4500 of which were of high quality in the tested rice lines, producing highly repeatable results. Forty-five functional markers that are located inside 28 characterized genes of important traits can be detected using RICE6K. The SNP markers are evenly distributed on the 12 chromosomes of rice with an average density of 12 SNPs per 1 Mb and can provide information for polymorphisms between indica and japonica subspecies as well as varieties within indica and japonica groups. Application tests of RICE6K showed that the array is suitable for rice germplasm fingerprinting, genotyping bulked segregating pools, seed authenticity checks and genetic background selection. These results suggest that RICE6K provides an efficient and reliable genotyping tool for rice genomic breeding.

  1. Pyrosequencing protocol using a universal biotinylated primer for mutation detection and SNP genotyping.

    PubMed

    Royo, Jose Luis; Hidalgo, Manuel; Ruiz, Agustin

    2007-01-01

    DNA sequencing has markedly changed the nature of biomedical research, identifying millions of polymorphisms along the human genome that now require further analysis to study the genetic basis of human diseases. Among the DNA-sequencing platforms available, Pyrosequencing has become a useful tool for medium-throughput single nucleotide polymorphism (SNP) genotyping, mutation detection, copy-number studies and DNA methylation analysis. Its 96-well genotyping format allows reliable results to be obtained at reasonable costs in a few minutes. However, a specific biotinylated primer is usually required for each SNP under study to allow the capture of single-stranded DNA template for the Pyrosequencing assay. Here, we present an alternative to the standard labeling of PCR products for analysis by Pyrosequencing that circumvents the requirement of specific biotinylated primers for each SNP of interest. This protocol uses a single biotinylated primer that is simultaneously incorporated into all M13-tagged PCR products during the amplification reaction. The protocol covers all steps from the PCR amplification and capture of single-stranded template, its preparation, and the Pyrosequencing assay itself. Once the correct primer stoichiometry has been determined, the assay takes around 2 h for PCR amplification, followed by 15-20 min (per plate) to obtain the genotypes.

  2. Making a chocolate chip: development and evaluation of a 6K SNP array for Theobroma cacao

    PubMed Central

    Livingstone, Donald; Royaert, Stefan; Stack, Conrad; Mockaitis, Keithanne; May, Greg; Farmer, Andrew; Saski, Christopher; Schnell, Ray; Kuhn, David; Motamayor, Juan Carlos

    2015-01-01

    Theobroma cacao, the key ingredient in chocolate production, is one of the world's most important tree fruit crops, with ∼4,000,000 metric tons produced across 50 countries. To move towards gene discovery and marker-assisted breeding in cacao, a single-nucleotide polymorphism (SNP) identification project was undertaken using RNAseq data from 16 diverse cacao cultivars. RNA sequences were aligned to the assembled transcriptome of the cultivar Matina 1-6, and 330,000 SNPs within coding regions were identified. From these SNPs, a subset of 6,000 high-quality SNPs were selected for inclusion on an Illumina Infinium SNP array: the Cacao6kSNP array. Using Cacao6KSNP array data from over 1,000 cacao samples, we demonstrate that our custom array produces a saturated genetic map and can be used to distinguish among even closely related genotypes. Our study enhances and expands the genetic resources available to the cacao research community, and provides the genome-scale set of tools that are critical for advancing breeding with molecular markers in an agricultural species with high genetic diversity. PMID:26070980

  3. PEAS V1.0: a package for elementary analysis of SNP data.

    PubMed

    Xu, Shuhua; Gupta, Sanchit; Jin, Li

    2010-11-01

    We have developed a software package named PEAS to facilitate analyses of large data sets of single nucleotide polymorphisms (SNPs) for population genetics and molecular phylogenetics studies. PEAS reads SNP data in various formats as input and is versatile in data formatting; using PEAS, it is easy to create input files for many popular packages, such as STRUCTURE, frappe, Arlequin, Haploview, LDhat, PLINK, EIGENSOFT, PHASE, fastPHASE, MEGA and PHYLIP. In addition, PEAS fills several analysis gaps in currently available computer programs in population genetics and molecular phylogenetics. Notably, (i) it calculates genetic distance matrices with bootstrapping for both individuals and populations from genome-wide high-density SNP data, and the output can be streamlined to MEGA and PHYLIP programs for further processing; (ii) it calculates genetic distances from STRUCTURE output and generates a MEGA file to reconstruct component trees; (iii) it provides tools to conduct haplotype sharing analysis for phylogenetic studies based on high-density SNP data. To our knowledge, these analyses are not available in any other computer program. PEAS for Windows is freely available for academic users from http://www.picb.ac.cn/~xushua/index.files/Download_PEAS.htm. PMID:21565121

  4. Identification and SNP association analysis of a novel gene in chicken.

    PubMed

    Mei, Xingxing; Kang, Xiangtao; Liu, Xiaojun; Jia, Lijuan; Li, Hong; Li, Zhuanjian; Jiang, Ruirui

    2016-02-01

    A novel gene that was predicted to encode a long noncoding RNA (lncRNA) transcript was identified in a previous study that aimed to detect candidate genes related to growth rate differences between Chinese local breed Gushi chickens and Anka broilers. To characterise the biological function of the lncRNA, we cloned and sequenced the complete open reading frame of the gene. We performed quantitative real-time polymerase chain reaction (qPCR) to analyse the expression patterns of the lncRNA in different tissues of chicken at different development stages. The qPCR data showed that the novel lncRNA gene was expressed extensively, with the highest abundance in spleen and lung and the lowest abundance in pectoralis and leg muscle. Additionally, we identified a single nucleotide polymorphism (SNP) at the 5'-end of the gene and studied the association between the SNP and chicken growth traits using data from an F2 resource population of Gushi chickens and Anka broilers. The association analysis showed that the SNP was significantly (P < 0.05) associated with leg muscle weight, chest breadth, sternal length and body weight in chickens at 1 day, 4 weeks and 6 weeks of age. We concluded that the novel lncRNA gene, which we designated pouBW1, may play an important role in regulating chicken growth.

  5. SNP Discovery and Development of a High-Density Genotyping Array for Sunflower

    PubMed Central

    Bachlava, Eleni; Taylor, Christopher A.; Tang, Shunxue; Bowers, John E.; Mandel, Jennifer R.; Burke, John M.; Knapp, Steven J.

    2012-01-01

    Recent advances in next-generation DNA sequencing technologies have made possible the development of high-throughput SNP genotyping platforms that allow for the simultaneous interrogation of thousands of single-nucleotide polymorphisms (SNPs). Such resources have the potential to facilitate the rapid development of high-density genetic maps, and to enable genome-wide association studies as well as molecular breeding approaches in a variety of taxa. Herein, we describe the development of a SNP genotyping resource for use in sunflower (Helianthus annuus L.). This work involved the development of a reference transcriptome assembly for sunflower, the discovery of thousands of high quality SNPs based on the generation and analysis of ca. 6 Gb of transcriptome re-sequencing data derived from multiple genotypes, the selection of 10,640 SNPs for inclusion in the genotyping array, and the use of the resulting array to screen a diverse panel of sunflower accessions as well as related wild species. The results of this work revealed a high frequency of polymorphic SNPs and relatively high level of cross-species transferability. Indeed, greater than 95% of successful SNP assays revealed polymorphism, and more than 90% of these assays could be successfully transferred to related wild species. Analysis of the polymorphism data revealed patterns of genetic differentiation that were largely congruent with the evolutionary history of sunflower, though the large number of markers allowed for finer resolution than has previously been possible. PMID:22238659

  6. PredictSNP: robust and accurate consensus classifier for prediction of disease-related mutations.

    PubMed

    Bendl, Jaroslav; Stourac, Jan; Salanda, Ondrej; Pavelka, Antonin; Wieben, Eric D; Zendulka, Jaroslav; Brezovsky, Jan; Damborsky, Jiri

    2014-01-01

    Single nucleotide variants represent a prevalent form of genetic variation. Mutations in the coding regions are frequently associated with the development of various genetic diseases. Computational tools for the prediction of the effects of mutations on protein function are very important for analysis of single nucleotide variants and their prioritization for experimental characterization. Many computational tools are already widely employed for this purpose. Unfortunately, their comparison and further improvement are hindered by large overlaps between the training datasets and benchmark datasets, which lead to biased and overly optimistic reported performances. In this study, we have constructed three independent datasets by removing all duplicates, inconsistencies and mutations previously used in the training of evaluated tools. The benchmark dataset containing over 43,000 mutations was employed for the unbiased evaluation of eight established prediction tools: MAPP, nsSNPAnalyzer, PANTHER, PhD-SNP, PolyPhen-1, PolyPhen-2, SIFT and SNAP. The six best performing tools were combined into a consensus classifier, PredictSNP, resulting in significantly improved prediction performance while returning results for all mutations, confirming that consensus prediction represents an accurate and robust alternative to the predictions delivered by individual tools. A user-friendly web interface enables easy access to all eight prediction tools, the consensus classifier PredictSNP and annotations from the Protein Mutant Database and the UniProt database. The web server and the datasets are freely available to the academic community at http://loschmidt.chemi.muni.cz/predictsnp.
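
    The idea of a consensus classifier can be illustrated with a minimal sketch. The abstract does not describe how PredictSNP combines its six tools, so the unweighted majority vote below is an assumption made purely for illustration; the tool names (tool_a, tool_b, ...) and the "undecided" label are hypothetical. The sketch also shows why a consensus can return a result for a mutation even when some individual tools do not.

```python
from collections import Counter


def consensus(predictions):
    """Unweighted majority vote over per-tool calls (illustrative only).

    `predictions` maps a tool name to "deleterious", "neutral", or None,
    where None means the tool returned no prediction for this mutation.
    """
    votes = Counter(call for call in predictions.values() if call is not None)
    if not votes:
        return None  # no tool produced a call
    if votes["deleterious"] == votes["neutral"]:
        return "undecided"  # hypothetical label for a tied vote
    return "deleterious" if votes["deleterious"] > votes["neutral"] else "neutral"


# Hypothetical calls for one mutation; tool_b produced no prediction.
example = {
    "tool_a": "deleterious",
    "tool_b": None,
    "tool_c": "deleterious",
    "tool_d": "neutral",
}
print(consensus(example))
```

    A weighted vote, for example one that scales each tool's call by a confidence score, would be a natural refinement, but the weights would have to come from the published method rather than from this sketch.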

  8. How to Use SNP_TATA_Comparator to Find a Significant Change in Gene Expression Caused by the Regulatory SNP of This Gene's Promoter via a Change in Affinity of the TATA-Binding Protein for This Promoter

    PubMed Central

    Ponomarenko, Mikhail; Rasskazov, Dmitry; Arkova, Olga; Ponomarenko, Petr; Suslov, Valentin; Savinkova, Ludmila; Kolchanov, Nikolay

    2015-01-01

    The use of biomedical SNP markers of diseases can improve the effectiveness of treatment. Genotyping patients and then searching for SNPs that occur more often than in healthy individuals is the only commonly accepted method for identifying SNP markers within the framework of translational research. Bioinformatics applications aimed at the millions of unannotated SNPs of the “1000 Genomes” project can make this search for SNP markers more focused and less expensive. We used our Web service, which applies Fisher's Z-score to candidate SNP markers, to find a significant change in a gene's expression. Here we analyzed the change caused by SNPs in the gene's promoter via a change in affinity of the TATA-binding protein for this promoter. We provide examples and discuss how to use this bioinformatics application in the course of practical analysis of unannotated SNPs from the “1000 Genomes” project. Using known biomedical SNP markers, we identified 17 novel candidate SNP markers nearby: rs549858786 (rheumatoid arthritis); rs72661131 (cardiovascular events in rheumatoid arthritis); rs562962093 (stroke); rs563558831 (cyclophosphamide bioactivation); rs55878706 (malaria resistance, leukopenia); rs572527200 (asthma, systemic sclerosis, and psoriasis); rs371045754 (hemophilia B); rs587745372 (cardiovascular events); rs372329931, rs200209906, rs367732974, and rs549591993 (all four: cancer); rs17231520 and rs569033466 (both: atherosclerosis); rs63750953, rs281864525, and rs34166473 (all three: malaria resistance, thalassemia). PMID:26516624

  10. Efficient SNP Discovery by Combining Microarray and Lab-on-a-Chip Data for Animal Breeding and Selection

    PubMed Central

    Huang, Chao-Wei; Lin, Yu-Tsung; Ding, Shih-Torng; Lo, Ling-Ling; Wang, Pei-Hwa; Lin, En-Chung; Liu, Fang-Wei; Lu, Yen-Wen

    2015-01-01

    Genetic markers associated with economic traits have been widely explored for animal breeding. Among these markers, single-nucleotide polymorphisms (SNPs) are gradually becoming a prevalent and effective evaluation tool. Because SNP genotyping focuses only on the genetic sequences of interest, it reduces evaluation time and cost. Compared to traditional approaches, SNP genotyping techniques incorporate informative genetic background, improve breeding prediction accuracy and support breeding quality on the farm. This article therefore reviews the typical procedures of animal breeding using SNPs and the current status of related techniques. The associated SNP information and genotyping techniques, including microarray and Lab-on-a-Chip based platforms, along with their potential are highlighted. Examples in pig and poultry with different SNP loci linked to high economic trait values are given. The recommendations for utilizing SNP genotyping in animal breeding are summarized. PMID:27600241

  11. Developing Single Nucleotide Polymorphism (SNP) markers from transcriptome sequences for the identification of longan (Dimocarpus longan) germplasm

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Longan (Dimocarpus longan Lour.) is an important tropical fruit tree crop. Accurate varietal identification is essential for germplasm management and breeding. Using longan transcriptome sequences from public databases, we developed single nucleotide polymorphism (SNP) markers; validated 60 SNPs in...

  12. Use of the Illumina GoldenGate assay for single nucleotide polymorphism (SNP) genotyping in cereal crops.

    PubMed

    Chao, Shiaoman; Lawley, Cindy

    2015-01-01

    Highly parallel genotyping assays, such as the GoldenGate assay developed by Illumina, capable of interrogating up to 3,072 single nucleotide polymorphisms (SNPs) simultaneously, have greatly facilitated genome-wide studies, particularly for crops with large and complex genome structures. In this report, we provide detailed information and guidelines regarding genomic DNA preparation, SNP assay design, SNP assay protocols, and genotype calling using Illumina's GenomeStudio software. PMID:25373766

  13. Comparison of SSR and SNP Markers in Estimation of Genetic Diversity and Population Structure of Indian Rice Varieties

    PubMed Central

    Singh, Amit Kumar; Kumar, Sundeep; Srinivasan, Kalyani; Tyagi, R. K.; Singh, N. K.; Singh, Rakesh

    2013-01-01

    Simple sequence repeat (SSR) and single nucleotide polymorphism (SNP) markers, the two most robust marker types for identifying rice varieties, were compared for assessment of genetic diversity and population structure. A total of 375 varieties of rice from various regions of India, archived at the Indian National GeneBank, NBPGR, New Delhi, were analyzed using thirty-six markers each of hypervariable SSR (HvSSR) and SNP, distributed across the 12 rice chromosomes. A total of 80 alleles were amplified with the SSR markers, with an average of 2.22 alleles per locus, whereas 72 alleles were amplified with the SNP markers. Polymorphic information content (PIC) values for HvSSR ranged from 0.04 to 0.5 with an average of 0.25. In the case of SNP markers, PIC values ranged from 0.03 to 0.37 with an average of 0.23. Genetic relatedness among the varieties was studied; using an unrooted tree, all the genotypes were grouped into three major clusters with both SSR and SNP markers. Analysis of molecular variance (AMOVA) indicated that maximum diversity was partitioned between and within individuals but not between populations. Principal coordinate analysis (PCoA) with SSR markers showed that genotypes were uniformly distributed across the two axes with 13.33% of cumulative variation, whereas with SNP markers the varieties fell into three broad groups across two axes with 45.20% of cumulative variation. Population structure was tested using K values from 1 to 20, but no clear population structure emerged; therefore the Ln(PD)-derived Δk was plotted against K to determine the number of populations. With SSR markers the maximum Δk was at K=5, whereas with SNP markers the maximum Δk was at K=15, suggesting that resolution of population was higher with SNP markers, but SSR were more efficient for diversity analysis. PMID:24367635
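
    The PIC values quoted above can be made concrete with a short sketch. The abstract does not say which PIC formula was used; the code below assumes the widely used definition of Botstein et al. (1980), PIC = 1 - sum(p_i^2) - sum over i<j of 2 p_i^2 p_j^2, and the function name `pic` is ours. Under that assumption a biallelic marker cannot exceed 0.375, which is consistent with the reported SNP maximum of 0.37, while multi-allelic SSRs can score higher.

```python
from itertools import combinations


def pic(freqs):
    """Polymorphism information content for one locus.

    `freqs` are the allele frequencies at the locus (they should sum to 1).
    Formula assumed here: Botstein et al. (1980).
    """
    homozygosity = sum(p * p for p in freqs)
    correction = sum(2 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - correction


# A biallelic SNP reaches its maximum PIC at equal allele frequencies.
print(pic([0.5, 0.5]))    # 0.375
# A monomorphic locus carries no information.
print(pic([1.0]))         # 0.0
```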

  14. 250K SNP array karyotyping identifies acquired uniparental disomy and homozygous mutations, including novel missense substitutions of c-Cbl, in myeloid malignancies

    PubMed Central

    Dunbar, Andrew J.; Gondek, Lukasz P.; O’Keefe, Christine L.; Makishima, Hideki; Rataul, Manjot S.; Szpurka, Hadrian; Sekeres, Mikkael A.; Wang, Xiao Fei; McDevitt, Michael A.; Maciejewski, Jaroslaw P.

    2009-01-01

    Two types of acquired loss of heterozygosity are possible in cancer: deletions and copy-neutral uniparental disomy (UPD). Conventionally, copy number losses are identified using metaphase cytogenetics while detection of UPD is accomplished by microsatellite and copy number analysis and, as such, is not often used clinically. Recently, the introduction of single nucleotide polymorphism (SNP) microarrays has allowed for the systematic and sensitive detection of UPD in hematological malignancies and other cancers. In this study, we have applied 250K SNP array technology to detect previously cryptic chromosomal changes, particularly UPD, in a cohort of 301 patients with myelodysplastic syndromes (MDS), overlap MDS/myeloproliferative disorders (MPD), MPD, and acute myeloid leukemia (AML). We show that UPD is a common chromosomal defect in myeloid malignancies, particularly in chronic myelomonocytic leukemia (CMML; 48%) and MDS/MPD-unclassifiable (38%). Furthermore, we demonstrate that mapping minimally overlapping segmental UPD regions can help target the search for both known and unknown pathogenic mutations, including newly identified missense mutations in the proto-oncogene c-Cbl in 7/12 patients with UPD11q. Acquired mutations of c-Cbl E3 ubiquitin ligase may explain the pathogenesis of a clonal process in a subset of MDS/MPD, including CMML. PMID:19074904

  15. Y-chromosome polymorphisms and ethnic group – a combined STR and SNP approach in a population sample from northern Italy

    PubMed Central

    Cortellini, Venusia; Verzeletti, Andrea; Cerri, Nicoletta; Marino, Alberto; De Ferrari, Francesco

    2013-01-01

    Aim: To find an association between Y chromosome polymorphisms and some ethnic groups. Methods: Short tandem repeats (STR) and single-nucleotide polymorphisms (SNP) on the Y chromosome were typed in 311 unrelated men from four different ethnic groups – Italians from northern Italy, Albanians, Africans from the Maghreb region, and Indo-Pakistanis, using the AmpFlSTR® Yfiler PCR Amplification Kit and the SNaPshot Multiplex Kit. Results: STRs analysis found 299 different haplotypes and SNPs analysis 11 different haplogroups. Haplotypes and haplogroups were analyzed and compared between different ethnic groups. Significant differences were found among all the population groups, except between Italians and Indo-Pakistanis and between Albanians and Indo-Pakistanis. Conclusions: Typing both STRs and SNPs on the Y chromosome could become useful in determining ethnic origin of a potential suspect. PMID:23771759

  16. TheSNPpit—A High Performance Database System for Managing Large Scale SNP Data

    PubMed Central

    Groeneveld, Eildert; Lichtenberg, Helmut

    2016-01-01

    The fast development of high throughput genotyping has opened up new possibilities in genetics while at the same time producing considerable data handling issues. TheSNPpit is a database system for managing large amounts of multi panel SNP genotype data from any genotyping platform. With an increasing rate of genotyping in areas like animal and plant breeding as well as human genetics, hundreds of thousands of individuals already need to be managed. While the common database design with one row per SNP can manage hundreds of samples, this approach becomes progressively slower as the size of the data sets increases, until it finally fails completely once tens or even hundreds of thousands of individuals need to be managed. TheSNPpit has implemented three ideas to also accommodate such large scale experiments: highly compressed vector storage in a relational database, set based data manipulation, and a very fast export written in C with Perl as the base for the framework and PostgreSQL as the database backend. Its novel subset system allows the creation of named subsets based on the filtering of SNP (based on major allele frequency, no-calls, and chromosomes) and manually applied sample and SNP lists at negligible storage costs, thus avoiding the issue of proliferating file copies. The named subsets are exported for downstream analysis. PLINK ped and map files are processed as in- and outputs. TheSNPpit allows management of different panel sizes in the same population of individuals when higher density panels replace previous lower density versions, as occurs in animal and plant breeding programs. A completely generalized procedure allows storage of phenotypes. TheSNPpit only occupies 2 bits for storing a single SNP, implying a capacity of 4 mio SNPs per 1MB of disk storage. To investigate performance scaling, a database with more than 18.5 mio samples has been created with 3.4 trillion SNPs from 12 panels ranging from 1000 through 20 mio SNPs resulting in a
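
    The 2-bit figure quoted above can be illustrated with a minimal packing sketch. The abstract does not state how TheSNPpit maps genotype calls to bit patterns, so the code table below (AA, AB, BB, and "--" for a no-call) and the function names are hypothetical. Four calls fit in one byte, which is where the estimate of 4 million SNPs per megabyte comes from.

```python
# Hypothetical 2-bit codes; the real TheSNPpit encoding is not given in the abstract.
CODES = {"AA": 0b00, "AB": 0b01, "BB": 0b10, "--": 0b11}  # "--" = no-call
DECODE = {code: call for call, code in CODES.items()}


def pack_genotypes(genotypes):
    """Pack a list of genotype calls into bytes, four calls per byte."""
    packed = bytearray((len(genotypes) + 3) // 4)
    for i, call in enumerate(genotypes):
        # Each call occupies 2 bits; position within the byte is i % 4.
        packed[i // 4] |= CODES[call] << (2 * (i % 4))
    return bytes(packed)


def unpack_genotypes(packed, n):
    """Recover the first n genotype calls from a packed byte string."""
    return [DECODE[(packed[i // 4] >> (2 * (i % 4))) & 0b11] for i in range(n)]


calls = ["AA", "AB", "BB", "--", "AB"]
blob = pack_genotypes(calls)
print(len(blob))                           # 2 bytes hold the 5 calls
print(unpack_genotypes(blob, len(calls)))  # round-trips to the original list
```

    The number of calls has to be stored alongside the packed vector, because the padding bits in the last byte are indistinguishable from a real "AA" call under this code table.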

  17. Identification of SNP and SSR Markers in Finger Millet Using Next Generation Sequencing Technologies.

    PubMed

    Gimode, Davis; Odeny, Damaris A; de Villiers, Etienne P; Wanyonyi, Solomon; Dida, Mathews M; Mneney, Emmarold E; Muchugi, Alice; Machuka, Jesse; de Villiers, Santie M

    2016-01-01

    Finger millet is an important cereal crop in eastern Africa and southern India with excellent grain storage quality and unique ability to thrive in extreme environmental conditions. Since negligible attention has been paid to improving this crop to date, the current study used Next Generation Sequencing (NGS) technologies to develop both Simple Sequence Repeat (SSR) and Single Nucleotide Polymorphism (SNP) markers. Genomic DNA from cultivated finger millet genotypes KNE755 and KNE796 was sequenced using both Roche 454 and Illumina technologies. Non-organelle sequencing reads were assembled into 207 Mbp representing approximately 13% of the finger millet genome. We identified 10,327 SSRs and 23,285 non-homeologous SNPs and tested 101 of each for polymorphism across a diverse set of wild and cultivated finger millet germplasm. For the 49 polymorphic SSRs, the mean polymorphism information content (PIC) was 0.42, ranging from 0.16 to 0.77. We also validated 92 SNP markers, 80 of which were polymorphic with a mean PIC of 0.29 across 30 wild and 59 cultivated accessions. Seventy-six of the 80 SNPs were polymorphic across 30 wild germplasm with a mean PIC of 0.30 while only 22 of the SNP markers showed polymorphism among the 59 cultivated accessions with an average PIC value of 0.15. Genetic diversity analysis using the polymorphic SNP markers revealed two major clusters; one of wild and another of cultivated accessions. Detailed STRUCTURE analysis confirmed this grouping pattern and further revealed 2 sub-populations within wild E. coracana subsp. africana. Both STRUCTURE and genetic diversity analysis assisted with the correct identification of the new germplasm collections. These polymorphic SSR and SNP markers are a significant addition to the existing 82 published SSRs, especially with regard to the previously reported low polymorphism levels in finger millet. 
Our results also reveal an unexploited finger millet genetic resource that can be included in the regional

  18. Identification of SNP and SSR Markers in Finger Millet Using Next Generation Sequencing Technologies

    PubMed Central

    Gimode, Davis; Odeny, Damaris A.; de Villiers, Etienne P.; Wanyonyi, Solomon; Dida, Mathews M.; Mneney, Emmarold E.; Muchugi, Alice; Machuka, Jesse; de Villiers, Santie M.

    2016-01-01

    Finger millet is an important cereal crop in eastern Africa and southern India with excellent grain storage quality and unique ability to thrive in extreme environmental conditions. Since negligible attention has been paid to improving this crop to date, the current study used Next Generation Sequencing (NGS) technologies to develop both Simple Sequence Repeat (SSR) and Single Nucleotide Polymorphism (SNP) markers. Genomic DNA from cultivated finger millet genotypes KNE755 and KNE796 was sequenced using both Roche 454 and Illumina technologies. Non-organelle sequencing reads were assembled into 207 Mbp representing approximately 13% of the finger millet genome. We identified 10,327 SSRs and 23,285 non-homeologous SNPs and tested 101 of each for polymorphism across a diverse set of wild and cultivated finger millet germplasm. For the 49 polymorphic SSRs, the mean polymorphism information content (PIC) was 0.42, ranging from 0.16 to 0.77. We also validated 92 SNP markers, 80 of which were polymorphic with a mean PIC of 0.29 across 30 wild and 59 cultivated accessions. Seventy-six of the 80 SNPs were polymorphic across 30 wild germplasm with a mean PIC of 0.30 while only 22 of the SNP markers showed polymorphism among the 59 cultivated accessions with an average PIC value of 0.15. Genetic diversity analysis using the polymorphic SNP markers revealed two major clusters; one of wild and another of cultivated accessions. Detailed STRUCTURE analysis confirmed this grouping pattern and further revealed 2 sub-populations within wild E. coracana subsp. africana. Both STRUCTURE and genetic diversity analysis assisted with the correct identification of the new germplasm collections. These polymorphic SSR and SNP markers are a significant addition to the existing 82 published SSRs, especially with regard to the previously reported low polymorphism levels in finger millet. 
Our results also reveal an unexploited finger millet genetic resource that can be included in the regional

  20. SNP genotypes of olfactory receptor genes associated with olfactory ability in German Shepherd dogs.

    PubMed

    Yang, M; Geng, G-J; Zhang, W; Cui, L; Zhang, H-X; Zheng, J-L

    2016-04-01

    To determine the relationship between SNP genotypes of canine olfactory receptor genes and olfactory ability, 28 male and 20 female German Shepherd dogs in police service were scored by odor detection tests and analyzed using the Beckman GenomeLab SNPstream. The representative 22 SNP loci from the exonic regions of 12 olfactory receptor genes were investigated, and three kinds of odor (human, ice drug and trinitrotoluene) were detected. The results showed that the SNP genotypes at the OR10H1-like:c.632C>T, OR10H1-like:c.770A>T, OR2K2-like:c.518G>A, OR4C11-like:c.511T>G and OR4C11-like:c.692G>A loci had a statistically significant effect on the scenting abilities (P < 0.001). The kind of odor influenced the performances of the dogs (P < 0.001). In addition, there were interactions between genotype and the kind of odor at the following loci: OR10H1-like:c.632C>T, OR10H1-like:c.770A>T, OR4C11-like:c.511T>G and OR4C11-like:c.692G>A (P < 0.001). The dogs with genotype CC at the OR10H1-like:c.632C>T, genotype AA at the OR10H1-like:c.770A>T, genotype TT at the OR4C11-like:c.511T>G and genotype GG at the OR4C11-like:c.692G>A loci did better at detecting the ice drug. We concluded that there was linkage between certain SNP genotypes and the olfactory ability of dogs and that SNP genotypes might be useful in determining dogs' scenting potential.

  1. Allelic imbalance analysis by high-density single-nucleotide polymorphic allele (SNP) array with whole genome amplified DNA

    PubMed Central

    Wong, Kwong-Kwok; Tsang, Yvonne T. M.; Shen, Jianhe; Cheng, Rita S.; Chang, Yi-Mieng; Man, Tsz-Kwong; Lau, Ching C.

    2004-01-01

    Besides their use in mRNA expression profiling, oligonucleotide microarrays have also been applied to single-nucleotide polymorphism (SNP) and loss of heterozygosity (LOH) or allelic imbalance studies. In this report, we evaluate the reliability of using whole genome amplified DNA for analysis with an oligonucleotide microarray containing 11 560 SNPs to detect allelic imbalance and chromosomal copy number abnormalities. Whole genome SNP analyses were performed with DNA extracted from osteosarcoma tissues and patient-matched blood. SNP calls were then generated by Affymetrix® GeneChip® DNA Analysis Software. In two osteosarcoma cases, using unamplified DNA, we identified 793 and 1070 SNP loci with allelic imbalance, respectively. In a parallel experiment with amplified DNA, 78% and 83% of these SNP loci with allelic imbalance were detected. The average false-positive rate is 13.8%. Furthermore, using the Affymetrix® GeneChip® Chromosome Copy Number Tool to analyze the SNP array data, we were able to detect identical chromosomal regions with gain or loss in both amplified and unamplified DNA at cytoband resolution. PMID:15148342

  2. Supplementing High-Density SNP Microarrays for Additional Coverage of Disease-Related Genes: Addiction as a Paradigm

    SciTech Connect

    Saccone, Scott F; Chesler, Elissa J; Bierut, Laura J; Kalivas, Peter J; Lerman, Caryn; Saccone, Nancy L; Uhl, George R; Li, Chuan-Yun; Philip, Vivek M; Edenberg, Howard; Sherry, Steven; Feolo, Michael; Moyzis, Robert K; Rutter, Joni L

    2009-01-01

    Commercial SNP microarrays now provide comprehensive and affordable coverage of the human genome. However, some diseases have biologically relevant genomic regions that may require additional coverage. Addiction, for example, is thought to be influenced by complex interactions among many relevant genes and pathways. We have assembled a list of 486 biologically relevant genes nominated by a panel of experts on addiction. We then added 424 genes that showed evidence of association with addiction phenotypes through mouse QTL mappings and gene co-expression analysis. We demonstrate that there are a substantial number of SNPs in these genes that are not well represented by commercial SNP platforms. We address this problem by introducing a publicly available SNP database for addiction. The database is annotated using numeric prioritization scores indicating the extent of biological relevance. The scores incorporate a number of factors such as SNP/gene functional properties (including synonymy and promoter regions), data from mouse systems genetics and measures of human/mouse evolutionary conservation. We then used HapMap genotyping data to determine if a SNP is tagged by a commercial microarray through linkage disequilibrium. This combination of biological prioritization scores and LD tagging annotation will enable addiction researchers to supplement commercial SNP microarrays to ensure comprehensive coverage of biologically relevant regions.
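
    The LD tagging step can be sketched as follows. The abstract does not give the LD measure or cutoff used, so the code assumes the common r-squared statistic computed from phased haplotypes and a conventional threshold of 0.8; the function names and the toy haplotypes are hypothetical.

```python
def r_squared(haplotypes):
    """Pairwise LD (r^2) between two biallelic SNPs.

    `haplotypes` is a list of (allele_at_snp1, allele_at_snp2) pairs,
    with alleles coded 0 or 1, one pair per phased chromosome.
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n            # freq of allele 1 at SNP 1
    p_b = sum(b for _, b in haplotypes) / n            # freq of allele 1 at SNP 2
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return 0.0                                     # a monomorphic SNP tags nothing
    d = p_ab - p_a * p_b                               # disequilibrium coefficient D
    return d * d / denom


def is_tagged(haplotypes, threshold=0.8):
    """Assumed rule: a SNP counts as tagged if r^2 with an array SNP >= threshold."""
    return r_squared(haplotypes) >= threshold


perfect_ld = [(0, 0), (1, 1), (0, 0), (1, 1)]
no_ld = [(0, 0), (0, 1), (1, 0), (1, 1)]
print(r_squared(perfect_ld), is_tagged(perfect_ld))
print(r_squared(no_ld), is_tagged(no_ld))
```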

  3. Replication of obesity and diabetes-related SNP associations in individuals from Yucatán, México

    PubMed Central

    Hernandez-Escalante, Victor M.; Nava-Gonzalez, Edna J.; Voruganti, V. Saroja; Kent, Jack W.; Haack, Karin; Laviada-Molina, Hugo A.; Molina-Segui, Fernanda; Gallegos-Cabriales, Esther C.; Lopez-Alvarenga, Juan Carlos; Cole, Shelley A.; Mezzles, Marguerite J.; Comuzzie, Anthony G.; Bastarrachea, Raul A.

    2014-01-01

    The prevalence of type 2 diabetes (T2D) is rising rapidly and is ~19% in Mexicans. T2D is affected by both environmental and genetic factors. Although specific genes have been implicated in T2D risk, few of these findings are confirmed in studies of Mexican subjects. Our aim was to replicate associations of 39 single nucleotide polymorphisms (SNPs) from 10 genes with T2D-related phenotypes in a community-based Mexican cohort. Unrelated individuals (n = 259) living in southeastern Mexico were enrolled in the study based at the University of Yucatan School of Medicine in Merida. Phenotypes measured included anthropometric measurements, circulating levels of adipose tissue endocrine factors (leptin, adiponectin, pro-inflammatory cytokines), and insulin, glucose, and blood pressure. Association analyses were conducted by measured genotype analysis implemented in SOLAR, adapted for unrelated individuals. SNP minor allele frequencies ranged from 2.2 to 48.6%. Nominal associations were found for CNR1, SLC30A8, GCK, and PCSK1 SNPs with systolic blood pressure, insulin and glucose, and for CNR1, SLC30A8, KCNJ11, and PCSK1 SNPs with adiponectin and leptin (p < 0.05). P-values less than 0.0014 were considered significant. Association of SNPs rs10485170 of CNR1 and rs5215 of KCNJ11 with adiponectin and leptin, respectively, reached near significance (p = 0.002). Significant association (p = 0.001) was observed between plasma leptin and rs5219 of KCNJ11. PMID:25477898

  4. SNP-SNP Interaction between TLR4 and MyD88 in Susceptibility to Coronary Artery Disease in the Chinese Han Population.

    PubMed

    Sun, Dandan; Sun, Liping; Xu, Qian; Gong, Yuehua; Wang, Honghu; Yang, Jun; Yuan, Yuan

    2016-03-01

    The toll-like receptor 4 (TLR4)-myeloid differentiation factor 88 (MyD88)-dependent signaling pathway plays a role in the initiation and progression of coronary artery disease (CAD). We investigated SNP-SNP interactions between the TLR4 and MyD88 genes in CAD susceptibility and assessed whether the effects of such interactions were modified by confounding risk factors (hyperglycemia, hyperlipidemia and Helicobacter pylori (H. pylori) infection). Participants with CAD (n = 424) and controls (n = 424) without CAD were enrolled. Polymerase chain reaction-restriction fragment length polymorphism was performed on genomic DNA to detect polymorphisms in TLR4 (rs10116253, rs10983755, and rs11536889) and MyD88 (rs7744). H. pylori infections were evaluated by enzyme-linked immunosorbent assays, and the cardiovascular risk factors for each subject were evaluated clinically. The significant interaction between TLR4 rs11536889 and MyD88 rs7744 was associated with an increased CAD risk (p value for interaction = 0.024). In conditions of hyperglycemia, the interaction effect was strengthened between TLR4 rs11536889 and MyD88 rs7744 (p value for interaction = 0.004). In hyperlipidemic participants, the interaction strength was also enhanced for TLR4 rs11536889 and MyD88 rs7744 (p value for interaction = 0.006). Thus, the novel interaction between TLR4 rs11536889 and MyD88 rs7744 was associated with an increased risk of CAD, which could be strengthened by the presence of hyperglycemia or hyperlipidemia. PMID:26959040

  5. SNP-SNP Interaction between TLR4 and MyD88 in Susceptibility to Coronary Artery Disease in the Chinese Han Population.

    PubMed

    Sun, Dandan; Sun, Liping; Xu, Qian; Gong, Yuehua; Wang, Honghu; Yang, Jun; Yuan, Yuan

    2016-03-04

    The toll-like receptor 4 (TLR4)-myeloid differentiation factor 88 (MyD88)-dependent signaling pathway plays a role in the initiation and progression of coronary artery disease (CAD). We investigated SNP-SNP interactions between the TLR4 and MyD88 genes in CAD susceptibility and assessed whether the effects of such interactions were modified by confounding risk factors (hyperglycemia, hyperlipidemia and Helicobacter pylori (H. pylori) infection). Participants with CAD (n = 424) and controls (n = 424) without CAD were enrolled. Polymerase chain reaction-restriction fragment length polymorphism was performed on genomic DNA to detect polymorphisms in TLR4 (rs10116253, rs10983755, and rs11536889) and MyD88 (rs7744). H. pylori infections were evaluated by enzyme-linked immunosorbent assays, and the cardiovascular risk factors for each subject were evaluated clinically. The significant interaction between TLR4 rs11536889 and MyD88 rs7744 was associated with an increased CAD risk (p value for interaction = 0.024). In conditions of hyperglycemia, the interaction effect was strengthened between TLR4 rs11536889 and MyD88 rs7744 (p value for interaction = 0.004). In hyperlipidemic participants, the interaction strength was also enhanced for TLR4 rs11536889 and MyD88 rs7744 (p value for interaction = 0.006). Thus, the novel interaction between TLR4 rs11536889 and MyD88 rs7744 was associated with an increased risk of CAD, which could be strengthened by the presence of hyperglycemia or hyperlipidemia.

  6. A GWAS SNP for Schizophrenia Is Linked to the Internal MIR137 Promoter and Supports Differential Allele-Specific Expression

    PubMed Central

    Warburton, Alix; Breen, Gerome; Bubb, Vivien J.; Quinn, John P.

    2016-01-01

    Single nucleotide polymorphisms (SNPs) within the MIR137 gene locus have been shown to confer risk for schizophrenia through genome-wide association studies (GWAS). The expression levels of microRNA-137 (miR-137) and its validated gene targets have also been shown to be disrupted in several neuropsychiatric conditions, including schizophrenia. Regulation of miR-137 expression is thus imperative for normal neuronal functioning. We previously characterized an internal promoter domain within the MIR137 gene that contained a variable number tandem repeat (VNTR) polymorphism and could alter the in vitro levels of miR-137 in a stimulus-induced and allele-specific manner. We now demonstrate that haplotype tagging-SNP analysis linked the rs1625579 GWAS SNP for schizophrenia to this internal MIR137 promoter through a proxy SNP rs2660304 located at this domain. We postulated that the rs2660304 promoter SNP may act as predisposing factor for schizophrenia through altering the levels of miR-137 expression in a genotype-dependent manner. Reporter gene analysis of the internal MIR137 promoter containing the common VNTR variant demonstrated genotype-dependent differences in promoter activity with respect to rs2660304. In line with previous reports, the major allele of the rs2660304 proxy SNP, which has previously been linked with schizophrenia risk through genetic association, resulted in downregulation of reporter gene expression in a tissue culture model. The genetic influence of the rs2660304 proxy SNP on the transcriptional activity of the internal MIR137 promoter, and thus the levels of miR-137 expression, therefore offers a distinct regulatory mechanism to explain the functional significance of the rs1625579 GWAS SNP for schizophrenia risk. PMID:26429811

  7. Association of MDM2 SNP309 and TP53 Arg72Pro polymorphisms with risk of endometrial cancer

    PubMed Central

    YONEDA, TOMOKO; KUBOYAMA, AYUMI; KATO, KIYOKO; OHGAMI, TATSUHIRO; OKAMOTO, KANAKO; SAITO, TOSHIAKI; WAKE, NORIO

    2013-01-01

    The incidence of endometrial cancer, a common gynecological malignancy, is increasing in Japan. We have previously shown that the ER/MDM2/p53/p21 pathway plays an important role in endometrial carcinogenesis. In the present study, we investigated the effects of germline single nucleotide polymorphisms in murine double minute 2 (MDM2) SNP309, TP53 Arg72Pro, ESR1 PvuII and XbaI, and p21 codon 31 on endometrial cancer risk. We evaluated these polymorphisms in DNA samples from 125 endometrial cancer cases and 200 controls using polymerase chain reaction-based restriction fragment length polymorphism. The association of each genetic polymorphism with endometrial cancer was examined by the odds ratio (OR) and 95% confidence interval, which were obtained using logistic regression analysis. The SNP309 GG genotype non-significantly increased the risk of endometrial cancer. The OR for the GG genotype vs. the TT genotype of MDM2 SNP309 was 1.76 (95% confidence interval, 0.93–3.30). Endometrial cancer was not associated with tested SNP genotypes for TP53, ESR1 and p21. The combination of SNP309 GG + TG and TP53 codon 72 Arg/Arg significantly increased endometrial cancer risk. The adjusted OR was 2.53 (95% confidence interval, 1.03–6.21) and P for the interaction was 0.04. This result was supported by in vitro data showing that endometrial cancer cell lines with the SNP309 G allele failed to show growth inhibition by treatment with RITA, which reduces p53-MDM2 binding. The presence of the SNP309 G allele and TP53 codon 72 Arg/Arg genotype is associated with an increased risk of endometrial cancer in Japanese women. PMID:23624782
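
    As a rough illustration of how an odds ratio and its 95% confidence interval relate to case and control counts, the sketch below uses the Wald (Woolf) method on a 2x2 table. This is a simplification and not the study's procedure: the abstract reports logistic regression, and the counts used here are hypothetical because the abstract does not give genotype counts. The function name `odds_ratio_ci` is likewise an assumption made for this example.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio with a Wald (Woolf) confidence interval for a 2x2 table.

    a: cases with the risk genotype      b: cases with the reference genotype
    c: controls with the risk genotype   d: controls with the reference genotype
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Woolf approximation
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts for illustration only; NOT taken from the study.
or_value, ci_low, ci_high = odds_ratio_ci(30, 40, 35, 80)
print(f"OR = {or_value:.2f} (95% CI {ci_low:.2f}-{ci_high:.2f})")
```

    An interval that includes 1, as in this hypothetical table, corresponds to an association that is not significant at the 5% level, which is how an estimate such as 1.76 (0.93–3.30) is read.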

  8. Imputation of microsatellite alleles from dense SNP genotypes for parentage verification across multiple Bos taurus and Bos indicus breeds

    PubMed Central

    McClure, Matthew C.; Sonstegard, Tad S.; Wiggans, George R.; Van Eenennaam, Alison L.; Weber, Kristina L.; Penedo, Cecilia T.; Berry, Donagh P.; Flynn, John; Garcia, Jose F.; Carmo, Adriana S.; Regitano, Luciana C. A.; Albuquerque, Milla; Silva, Marcos V. G. B.; Machado, Marco A.; Coffey, Mike; Moore, Kirsty; Boscher, Marie-Yvonne; Genestout, Lucie; Mazza, Raffaele; Taylor, Jeremy F.; Schnabel, Robert D.; Simpson, Barry; Marques, Elisa; McEwan, John C.; Cromie, Andrew; Coutinho, Luiz L.; Kuehn, Larry A.; Keele, John W.; Piper, Emily K.; Cook, Jim; Williams, Robert; Van Tassell, Curtis P.

    2013-01-01

    To help cattle producers transition from microsatellite (MS) to single nucleotide polymorphism (SNP) genotyping for parental verification, we previously devised an effective and inexpensive method to impute MS alleles from SNP haplotypes. While the reported method was verified with only a limited data set (N = 479) from Brown Swiss, Guernsey, Holstein, and Jersey cattle, some of the MS-SNP haplotype associations were concordant across these phylogenetically diverse breeds. This implied that some haplotypes predate modern breed formation and remain in strong linkage disequilibrium. To expand the utility of MS allele imputation across breeds, MS and SNP data from more than 8000 animals representing 39 breeds (Bos taurus and B. indicus) were used to predict 9410 SNP haplotypes, incorporating an average of 73 SNPs per haplotype, for which alleles from 12 MS markers could be accurately imputed. Approximately 25% of the MS-SNP haplotypes were present in multiple breeds (N = 2 to 36 breeds). These shared haplotypes allowed for MS imputation in breeds that were not represented in the reference population with only a small increase in Mendelian inheritance inconsistencies. Our reported reference haplotypes can be used for any cattle breed and the reported methods can be applied to any species to aid the transition from MS to SNP genetic markers. While ~91% of the animals with imputed alleles for 12 MS markers had ≤1 Mendelian inheritance conflicts with their parents' reported MS genotypes, this figure was 96% for our reference animals, indicating potential errors in the reported MS genotypes. The workflow we suggest autocorrects for genotyping errors and rare haplotypes by MS genotyping animals whose imputed MS alleles fail parentage verification, and then incorporating those animals into the reference dataset. PMID:24065982

  9. Somatic Mutation of the SNP rs11614913 and Its Association with Increased MIR 196A2 Expression in Breast Cancer.

    PubMed

    Zhao, Huanhuan; Xu, Jingman; Zhao, Dan; Geng, Meijuan; Ge, Haize; Fu, Li; Zhu, Zhengmao

    2016-02-01

    Common genetic variants (single-nucleotide polymorphisms [SNPs]) in microRNA genes may alter their maturation or expression, resulting in varied functional consequences. Several studies have evaluated the association between the SNP rs11614913 and cancer risk in diverse populations and in a range of cancers, with contradictory outcomes. In this study, we examined 114 paired samples (tumor and normal tissues) from breast cancer patients to study the genotype distribution and somatic mutation of the SNP in MIR 196A2 (rs11614913 C-T). In addition, we evaluated their influence on the mature MIR 196A2 expression. We found that 14% (16/114) of tumors underwent somatic mutation of the SNP rs11614913. Moreover, the CT heterozygous and the CC homozygous states of SNP rs11614913 were more prone to mutation, while the TT homozygous state appeared to be resistant. We further detected a significant increase (p = 0.002) in mature MIR 196A2 expression in breast cancer. In particular, we found a significant association between the occurrence of SNP rs11614913 mutation and high expression (p = 0.0002). In addition, the mature MIR 196A2 expression level was significantly associated with the higher tumor grade (p = 0.004). Taken together, our results seem to demonstrate that somatic mutation of SNP rs11614913 in MIR 196A2 can have an influence on its expression. In addition, it indicated that an unknown mechanism is responsible for both the mutation of SNP rs11614913 and the dysregulation of mature MIR 196A2 expression.

  10. [Study on relieving effects of exogenous SNP, Spd on Belamcanda chinensis under salt-alkaline stress].

    PubMed

    Xu, Meng-Ping; He, Ping; Duan, Cai-Xu; Yang, Mou

    2014-12-01

    This study aims to provide a theoretical basis for the exploitation and utilization of salt-alkaline soil and for the cultivation of Belamcanda chinensis. We applied the exogenous substances SNP and Spd to relieve the damage caused to B. chinensis seedlings by mixed salt-alkaline stress, which consisted of four salts (NaCl, Na2SO4, NaHCO3 and Na2CO3) in a molar ratio of 9:1:9:1 at a salt concentration of 100 mmol·L(-1). The results showed that high pH is a major factor in salt-alkaline stress, that there is an interaction between time and the concentration of each treatment, and that salt stress and alkali stress act synergistically. Under 100 mmol·L(-1) salt-alkaline stress, the membrane peroxidation indices (MDA, O2-·) and the metabolite contents (soluble protein, soluble sugars, organic acids) of B. chinensis seedling leaves showed an upward trend to varying degrees. The exogenous treatments reduced the contents of MDA and O2-· and raised the levels of metabolites, with SNP at 0.05 mmol·L(-1) and Spd at 0.5 mmol·L(-1) alleviating the damage best. We therefore conclude that SNP and Spd can effectively mitigate the damage caused to B. chinensis seedlings by salt-alkaline stress and improve their resistance, which provides a scientific basis for the utilization of salt-alkaline soil and the cultivation of B. chinensis.

  11. PredictSNP: Robust and Accurate Consensus Classifier for Prediction of Disease-Related Mutations

    PubMed Central

    Bendl, Jaroslav; Stourac, Jan; Salanda, Ondrej; Pavelka, Antonin; Wieben, Eric D.; Zendulka, Jaroslav; Brezovsky, Jan; Damborsky, Jiri

    2014-01-01

    Single nucleotide variants represent a prevalent form of genetic variation. Mutations in the coding regions are frequently associated with the development of various genetic diseases. Computational tools for the prediction of the effects of mutations on protein function are very important for analysis of single nucleotide variants and their prioritization for experimental characterization. Many computational tools are already widely employed for this purpose. Unfortunately, their comparison and further improvement is hindered by large overlaps between the training datasets and benchmark datasets, which lead to biased and overly optimistic reported performances. In this study, we have constructed three independent datasets by removing all duplicates, inconsistencies and mutations previously used in the training of evaluated tools. The benchmark dataset containing over 43,000 mutations was employed for the unbiased evaluation of eight established prediction tools: MAPP, nsSNPAnalyzer, PANTHER, PhD-SNP, PolyPhen-1, PolyPhen-2, SIFT and SNAP. The six best performing tools were combined into a consensus classifier, PredictSNP, which showed significantly improved prediction performance and at the same time returned results for all mutations, confirming that consensus prediction represents an accurate and robust alternative to the predictions delivered by individual tools. A user-friendly web interface enables easy access to all eight prediction tools, the consensus classifier PredictSNP and annotations from the Protein Mutant Database and the UniProt database. The web server and the datasets are freely available to the academic community at http://loschmidt.chemi.muni.cz/predictsnp. PMID:24453961

  12. [Study on relieving effects of exogenous SNP, Spd on Belamcanda chinensis under salt-alkaline stress].

    PubMed

    Xu, Meng-Ping; He, Ping; Duan, Cai-Xu; Yang, Mou

    2014-12-01

    This study aims to provide a theoretical basis for the exploitation and utilization of salt-alkaline soil and for the cultivation of Belamcanda chinensis. We applied the exogenous substances SNP and Spd to relieve the damage caused to B. chinensis seedlings by mixed salt-alkaline stress, which consisted of four salts (NaCl, Na2SO4, NaHCO3 and Na2CO3) in a molar ratio of 9:1:9:1 at a salt concentration of 100 mmol·L(-1). The results showed that high pH is a major factor in salt-alkaline stress, that there is an interaction between time and the concentration of each treatment, and that salt stress and alkali stress act synergistically. Under 100 mmol·L(-1) salt-alkaline stress, the membrane peroxidation indices (MDA, O2-·) and the metabolite contents (soluble protein, soluble sugars, organic acids) of B. chinensis seedling leaves showed an upward trend to varying degrees. The exogenous treatments reduced the contents of MDA and O2-· and raised the levels of metabolites, with SNP at 0.05 mmol·L(-1) and Spd at 0.5 mmol·L(-1) alleviating the damage best. We therefore conclude that SNP and Spd can effectively mitigate the damage caused to B. chinensis seedlings by salt-alkaline stress and improve their resistance, which provides a scientific basis for the utilization of salt-alkaline soil and the cultivation of B. chinensis. PMID:25911800

  13. High-density SNP assay development for genetic analysis in maritime pine (Pinus pinaster).

    PubMed

    Plomion, C; Bartholomé, J; Lesur, I; Boury, C; Rodríguez-Quilón, I; Lagraulet, H; Ehrenmann, F; Bouffier, L; Gion, J M; Grivet, D; de Miguel, M; de María, N; Cervera, M T; Bagnoli, F; Isik, F; Vendramin, G G; González-Martínez, S C

    2016-03-01

    Maritime pine provides essential ecosystem services in the south-western Mediterranean basin, where it covers around 4 million ha. Its scattered distribution over a range of environmental conditions makes it an ideal forest tree species for studies of local adaptation and evolutionary responses to climatic change. Highly multiplexed single nucleotide polymorphism (SNP) genotyping arrays are increasingly used to study genetic variation in living organisms and for practical applications in plant and animal breeding and genetic resource conservation. We developed a 9k Illumina Infinium SNP array and genotyped maritime pine trees from (i) a three-generation inbred (F2) pedigree, (ii) the French breeding population and (iii) natural populations from Portugal and the French Atlantic coast. A large proportion of the exploitable SNPs (2052/8410, i.e. 24.4%) segregated in the mapping population and could be mapped, providing the densest ever gene-based linkage map for this species. Based on 5016 SNPs, natural and breeding populations from the French gene pool exhibited similar level of genetic diversity. Population genetics and structure analyses based on 3981 SNP markers common to the Portuguese and French gene pools revealed high levels of differentiation, leading to the identification of a set of highly differentiated SNPs that could be used for seed provenance certification. Finally, we discuss how the validated SNPs could facilitate the identification of ecologically and economically relevant genes in this species, improving our understanding of the demography and selective forces shaping its natural genetic diversity, and providing support for new breeding strategies. PMID:26358548

  14. Plastid DNA sequencing and nuclear SNP genotyping help resolve the puzzle of central American Platanus

    PubMed Central

    De Castro, Olga; Di Maio, Antonietta; Lozada García, José Armando; Piacenti, Danilo; Vázquez-Torres, Mario; De Luca, Paolo

    2013-01-01

    Background and Aims: Recent research on the history of Platanus reveals that hybridization phenomena occurred in the central American species. This study has two goals: to help resolve the evolutionary puzzle of central American Platanus, and to test the potential of real-time polymerase chain reaction (PCR) for detecting ancient hybridization. Methods: Sequencing of a uniparental plastid DNA marker [psbA-trnH(GUG) intergenic spacer] and qualitative and quantitative single nucleotide polymorphism (SNP) genotyping of biparental nuclear ribosomal DNA (nrDNA) markers [LEAFY intron 2 (LFY-i2) and internal transcribed spacer 2 (ITS2)] were used. Key Results: Based on the SNP genotyping results, several Platanus accessions show the presence of hybridization/introgression, including some accessions of P. rzedowskii and of P. mexicana var. interior and one of P. mexicana var. mexicana from Oaxaca (= P. oaxacana). Based on haplotype analyses of the psbA-trnH spacer, five haplotypes were detected. The most common of these is present in taxa belonging to P. orientalis, P. racemosa sensu lato, some accessions of P. occidentalis sensu stricto (s.s.) from Texas, P. occidentalis var. palmeri, P. mexicana s.s. and P. rzedowskii. This is highly relevant to genetic relationships with the haplotypes present in P. occidentalis s.s. and P. mexicana var. interior. Conclusions: Hybridization and introgression events between lineages ancestral to modern central and eastern North American Platanus species occurred. Plastid haplotypes and qualitative and quantitative SNP genotyping provide information critical for understanding the complex history of Mexican Platanus. Compared with the usual molecular techniques of sub-cloning, sequencing and genotyping, real-time PCR assay is a quick and sensitive technique for analysing complex evolutionary patterns. PMID:23798602

  15. An EST-derived SNP and SSR genetic linkage map of cassava (Manihot esculenta Crantz).

    PubMed

    Rabbi, Ismail Yusuf; Kulembeka, Heneriko Philbert; Masumba, Esther; Marri, Pradeep Reddy; Ferguson, Morag

    2012-07-01

    Cassava (Manihot esculenta Crantz) is one of the most important food security crops in the tropics and is increasingly being adopted for agro-industrial processing. Genetic improvement of cassava can be enhanced through marker-assisted breeding. For this, appropriate genomic tools are required to dissect the genetic architecture of economically important traits. Here, a genome-wide SNP-based genetic map of cassava anchored in SSRs is presented. An outbreeder full-sib (F1) family was genotyped on two independent SNP assay platforms: an array of 1,536 SNPs on Illumina's GoldenGate platform was used to genotype a first batch of 60 F1. Of the 1,358 successfully converted SNPs, 600 were polymorphic in at least one of the parents and were subsequently converted to KBiosciences' KASPar assay platform for genotyping 70 additional F1. High-precision genotyping of 163 informative SSRs using capillary electrophoresis was also carried out. Linkage analysis resulted in a final linkage map of 1,837 centi-Morgans (cM) containing 568 markers (434 SNPs and 134 SSRs) distributed across 19 linkage groups. The average distance between adjacent markers was 3.4 cM. About 94.2% of the mapped SNPs and SSRs have also been localized on scaffolds of version 4.1 assembly of the cassava draft genome sequence. This more saturated genetic linkage map of cassava that combines SSR and SNP markers should find several applications in the improvement of cassava including aligning scaffolds of the cassava genome sequence, genetic analyses of important agro-morphological traits, studying the linkage disequilibrium landscape and comparative genomics.

  16. High-density SNP assay development for genetic analysis in maritime pine (Pinus pinaster).

    PubMed

    Plomion, C; Bartholomé, J; Lesur, I; Boury, C; Rodríguez-Quilón, I; Lagraulet, H; Ehrenmann, F; Bouffier, L; Gion, J M; Grivet, D; de Miguel, M; de María, N; Cervera, M T; Bagnoli, F; Isik, F; Vendramin, G G; González-Martínez, S C

    2016-03-01

    Maritime pine provides essential ecosystem services in the south-western Mediterranean basin, where it covers around 4 million ha. Its scattered distribution over a range of environmental conditions makes it an ideal forest tree species for studies of local adaptation and evolutionary responses to climatic change. Highly multiplexed single nucleotide polymorphism (SNP) genotyping arrays are increasingly used to study genetic variation in living organisms and for practical applications in plant and animal breeding and genetic resource conservation. We developed a 9k Illumina Infinium SNP array and genotyped maritime pine trees from (i) a three-generation inbred (F2) pedigree, (ii) the French breeding population and (iii) natural populations from Portugal and the French Atlantic coast. A large proportion of the exploitable SNPs (2052/8410, i.e. 24.4%) segregated in the mapping population and could be mapped, providing the densest ever gene-based linkage map for this species. Based on 5016 SNPs, natural and breeding populations from the French gene pool exhibited similar level of genetic diversity. Population genetics and structure analyses based on 3981 SNP markers common to the Portuguese and French gene pools revealed high levels of differentiation, leading to the identification of a set of highly differentiated SNPs that could be used for seed provenance certification. Finally, we discuss how the validated SNPs could facilitate the identification of ecologically and economically relevant genes in this species, improving our understanding of the demography and selective forces shaping its natural genetic diversity, and providing support for new breeding strategies.

  17. CLUSTAG & WCLUSTAG: Hierarchical Clustering Algorithms for Efficient Tag-SNP Selection

    NASA Astrophysics Data System (ADS)

    Ao, Sio-Iong

    More than 6 million single nucleotide polymorphisms (SNPs) in the human genome have been genotyped by the HapMap project. Although only a proportion of these SNPs are functional, all can be considered as candidate markers for indirect association studies to detect disease-related genetic variants. The complete screening of a gene or a chromosomal region is nevertheless an expensive undertaking for association studies. A key strategy for improving the efficiency of association studies is to select a subset of informative SNPs, called tag SNPs, for analysis. In the chapter, hierarchical clustering algorithms have been proposed for efficient tag SNP selection.
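
    The idea of choosing tag SNPs by hierarchical clustering can be sketched as follows. This is a minimal illustration and not the CLUSTAG or WCLUSTAG implementation: it assumes a precomputed pairwise r² matrix, uses complete-linkage merging with an r² threshold of 0.8, and picks as tag the SNP whose weakest r² with the other members of its cluster is highest. The function names, the threshold and the toy matrix are all assumptions made for this example.

```python
def complete_linkage_clusters(r2, threshold):
    """Merge clusters while every cross-cluster SNP pair has r2 >= threshold."""
    clusters = [[i] for i in range(len(r2))]

    def weakest_link(c1, c2):
        # Complete linkage: similarity of two clusters is their lowest pairwise r2
        return min(r2[i][j] for i in c1 for j in c2)

    while len(clusters) > 1:
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                s = weakest_link(clusters[x], clusters[y])
                if best is None or s > best[0]:
                    best = (s, x, y)
        s, x, y = best
        if s < threshold:
            break  # no remaining pair of clusters is similar enough to merge
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return clusters


def pick_tags(r2, clusters):
    """Choose, per cluster, the SNP with the highest minimum r2 to its cluster."""
    return [max(c, key=lambda i: min(r2[i][j] for j in c)) for c in clusters]


# Toy r2 matrix for five hypothetical SNPs (two LD blocks); not real data.
r2 = [
    [1.00, 0.90, 0.85, 0.10, 0.05],
    [0.90, 1.00, 0.95, 0.20, 0.10],
    [0.85, 0.95, 1.00, 0.15, 0.10],
    [0.10, 0.20, 0.15, 1.00, 0.90],
    [0.05, 0.10, 0.10, 0.90, 1.00],
]
clusters = complete_linkage_clusters(r2, threshold=0.8)
tags = pick_tags(r2, clusters)
print(clusters, tags)
```

    With complete linkage, every SNP in a cluster has r² at or above the threshold with every other member, so any member, including the chosen tag, represents the rest at that level.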

  18. Rare coding SNP in DZIP1 gene associated with late-onset sporadic Parkinson's disease

    PubMed Central

    Valente, André X. C. N.; Shin, Joo H.; Sarkar, Abhijit; Gao, Yuan

    2012-01-01

    An association between a rare, coding, non-synonymous SNP variant in the gene DZIP1 and Parkinson's disease was found, based on an analysis of the existing NGRC genome-wide association study dataset. The statistical analysis utilized the hypothesis-rich, targeted search unbiased assessment approach, rather than the hypothesis-free, genome-wide agnostic search paradigm. The association of DZIP1 with Parkinson's disease is discussed in the context of a Parkinson's disease stem-cell ageing theory. PMID:22355768

  19. Benefits and burdens of using a SNP array in pregnancies at increased risk for the common aneuploidies.

    PubMed

    Van Opstal, Diane; de Vries, Femke; Govaerts, Lutgarde; Boter, Marjan; Lont, Debora; van Veen, Stefanie; Joosten, Marieke; Diderich, Karin; Galjaard, Robert-Jan; Srebniak, Malgorzata I

    2015-03-01

    We present the nature of pathogenic SNP array findings in pregnancies without ultrasound (US) abnormalities and show the additional diagnostic value of SNP array as compared with rapid aneuploidy detection and karyotyping. 1,330 prenatal samples were investigated with a 0.5-Mb SNP array after the exclusion of the most common aneuploidies. In 2.7% (36/1,330) of the cases, pathogenic chromosome aberrations were found; a microscopically detectable abnormality in 0.7% and a submicroscopic aberration in 2%. Our results show that in addition to the age- or screening-related aneuploidy risk, in pregnancies without US abnormalities, there is a risk of 1:148 (9/1,330) for a (sub)microscopic abnormality associated with an early-onset often severe disease, 1:222 (6/1,330) for a submicroscopic aberration causing an early-onset disease, 1:74 (18/1,330) for carrying a susceptibility locus for a neurodevelopmental disorder, and 1:443 (3/1,330) for a late-onset disorder (hereditary neuropathy with liability to pressure palsies in all three cases). These risk figures are important for adequate pretest counseling so that prospective parents can make informed individualized choices between targeted prenatal testing and broad testing with SNP array. Based on our results, we believe if invasive testing is performed, SNP array should be the preferred cytogenetic technique irrespective of the indication.
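
    The risk figures quoted above follow directly from the reported counts. The short sketch below recomputes them as a check; the dictionary keys are shorthand labels chosen for this example, while the counts and the cohort size of 1,330 are taken from the abstract.

```python
TOTAL_SAMPLES = 1330

# Counts per finding category as reported in the abstract
findings = {
    "(sub)microscopic, early-onset often severe disease": 9,
    "submicroscopic, early-onset disease": 6,
    "susceptibility locus, neurodevelopmental disorder": 18,
    "late-onset disorder": 3,
}


def one_in(count, total=TOTAL_SAMPLES):
    """Express count/total as a '1 in N' risk, rounded to the nearest integer."""
    return round(total / count)


risks = {label: one_in(count) for label, count in findings.items()}
overall_percent = round(100 * sum(findings.values()) / TOTAL_SAMPLES, 1)

for label, n in risks.items():
    print(f"1:{n}  {label}")
print(f"overall: {sum(findings.values())}/{TOTAL_SAMPLES} = {overall_percent}%")
```

    The four categories sum to the 36 pathogenic findings, which matches the overall 2.7% reported for the cohort.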

  20. Single-cell SNP analyses and interpretations based on RNA-Seq data for colon cancer research

    PubMed Central

    Chen, Jiahuan; Zhou, Qian; Wang, Yangfan; Ning, Kang

    2016-01-01

    Single-cell sequencing is useful for illustrating the cellular heterogeneities inherent in many intricate biological systems, particularly in human cancer. However, owing to the difficulties in acquiring, amplifying and analyzing single-cell genetic material, obstacles remain for single-cell diversity assessments such as single nucleotide polymorphism (SNP) analyses, rendering biological interpretations of single-cell omics data elusive. We used RNA-Seq data from single-cell and bulk colon cancer samples to analyze the SNP profiles for both structural and functional comparisons. Colon cancer-related pathways with single-cell level SNP enrichment, including the TGF-β and p53 signaling pathways, were also investigated based on both their SNP enrichment patterns and gene expression. We also detected a certain number of fusion transcripts, which may promote tumorigenesis, at the single-cell level. Based on these results, single-cell analyses not only recapitulated the SNP analysis results from the bulk samples but also detected cell-to-cell and cell-to-bulk variations, thereby aiding in early diagnosis and in identifying the precise mechanisms underlying cancers at the single-cell level. PMID:27677461

  1. SNP annotation-based whole genomic prediction and selection: an application to feed efficiency and its component traits in pigs.

    PubMed

    Do, D N; Janss, L L G; Jensen, J; Kadarmideen, H N

    2015-05-01

    The study investigated genetic architecture and predictive ability using genomic annotation of residual feed intake (RFI) and its component traits (daily feed intake [DFI], ADG, and back fat [BF]). A total of 1,272 Duroc pigs had both genotypic and phenotypic records, and the records were split into a training (968 pigs) and a validation dataset (304 pigs) by assigning records as before and after January 1, 2012, respectively. SNPs were annotated into 14 different classes using Ensembl variant effect prediction. Predictive accuracy and prediction bias were calculated using Bayesian Power LASSO, Bayesian A, B, and Cπ, and genomic BLUP (GBLUP) methods. Predictive accuracy ranged from 0.508 to 0.531, 0.506 to 0.532, 0.276 to 0.357, and 0.308 to 0.362 for DFI, RFI, ADG, and BF, respectively. BayesCπ100.1 increased accuracy slightly compared to the GBLUP model and other methods. The contribution per SNP to total genomic variance was similar among annotated classes across different traits. Predictive performance of SNP classes did not significantly differ from randomized SNP groups. Genomic prediction has accuracy comparable to observed phenotype, and use of genomic prediction can be cost effective by replacing feed intake measurement. Genomic annotation had less impact on predictive accuracy for the traits considered here, but this may be different for other traits. It is the first study to provide useful insights into biological classes of SNP driving the whole genomic prediction for complex traits in pigs.

  2. GStream: improving SNP and CNV coverage on genome-wide association studies.

    PubMed

    Alonso, Arnald; Marsal, Sara; Tortosa, Raül; Canela-Xandri, Oriol; Julià, Antonio

    2013-01-01

    We present GStream, a method that combines genome-wide SNP and CNV genotyping in the Illumina microarray platform with unprecedented accuracy. This new method outperforms previous well-established SNP genotyping software. More importantly, the CNV calling algorithm of GStream dramatically improves the results obtained by previous state-of-the-art methods and yields an accuracy that is close to that obtained by purely CNV-oriented technologies like Comparative Genomic Hybridization (CGH). We demonstrate the superior performance of GStream using microarray data generated from HapMap samples. Using the reference CNV calls generated by the 1000 Genomes Project (1KGP) and well-known studies on whole genome CNV characterization based either on CGH or genotyping microarray technologies, we show that GStream can increase the number of reliably detected variants up to 25% compared to previously developed methods. Furthermore, the increased genome coverage provided by GStream allows the discovery of CNVs in close linkage disequilibrium with SNPs, previously associated with disease risk in published Genome-Wide Association Studies (GWAS). These results could provide important insights into the biological mechanism underlying the detected disease risk association. With GStream, large-scale GWAS will not only benefit from the combined genotyping of SNPs and CNVs at an unprecedented accuracy, but will also take advantage of the computational efficiency of the method.

  3. GStream: Improving SNP and CNV Coverage on Genome-Wide Association Studies

    PubMed Central

    Alonso, Arnald; Marsal, Sara; Tortosa, Raül; Canela-Xandri, Oriol; Julià, Antonio

    2013-01-01

    We present GStream, a method that combines genome-wide SNP and CNV genotyping in the Illumina microarray platform with unprecedented accuracy. This new method outperforms previous well-established SNP genotyping software. More importantly, the CNV calling algorithm of GStream dramatically improves the results obtained by previous state-of-the-art methods and yields an accuracy that is close to that obtained by purely CNV-oriented technologies like Comparative Genomic Hybridization (CGH). We demonstrate the superior performance of GStream using microarray data generated from HapMap samples. Using the reference CNV calls generated by the 1000 Genomes Project (1KGP) and well-known studies on whole genome CNV characterization based either on CGH or genotyping microarray technologies, we show that GStream can increase the number of reliably detected variants up to 25% compared to previously developed methods. Furthermore, the increased genome coverage provided by GStream allows the discovery of CNVs in close linkage disequilibrium with SNPs, previously associated with disease risk in published Genome-Wide Association Studies (GWAS). These results could provide important insights into the biological mechanism underlying the detected disease risk association. With GStream, large-scale GWAS will not only benefit from the combined genotyping of SNPs and CNVs at an unprecedented accuracy, but will also take advantage of the computational efficiency of the method. PMID:23844243

  4. Varietal identification of tea (Camellia sinensis) using nanofluidic array of single nucleotide polymorphism (SNP) markers

    PubMed Central

    Fang, Wan-Ping; Meinhardt, Lyndel W; Tan, Hua-Wei; Zhou, Lin; Mischke, Sue; Zhang, Dapeng

    2014-01-01

    Apart from water, tea is the world’s most widely consumed beverage. Tea is produced in more than 50 countries with an annual production of approximately 4.7 million tons. The market segment for specialty tea has been expanding rapidly owing to increased demand, resulting in higher revenues and profits for tea growers and the industry. Accurate varietal identification is critically important to ensure traceability and authentication of premium tea products, which in turn contribute to on-farm conservation of tea genetic diversity. Using a set of single nucleotide polymorphism (SNP) markers developed from the expressed sequence tag (EST) database of Camellia sinensis, we genotyped deoxyribonucleic acid (DNA) samples extracted from a diverse group of tea varieties, including both fresh and processed commercial loose-leaf teas. The validation led to the designation of 60 SNPs that unambiguously identified all 40 tested tea varieties with high statistical rigor (p<0.0001). Varietal authenticity and genetic relationships among the analyzed cultivars were further characterized by ordination and Bayesian clustering analysis. These SNP markers, in combination with a high-throughput genotyping protocol, effectively established and verified specific DNA fingerprints for all tested tea varieties. This method provides a powerful tool for variety authentication and quality control for the tea industry. It is also highly useful for the management of tea genetic resources and breeding, where accurate and efficient genotype identification is essential. PMID:26504544

  5. Evaluation of genome coverage and fidelity of multiple displacement amplification from single cells by SNP array.

    PubMed

    Ling, Jiawei; Zhuang, Guanglun; Tazon-Vega, Barbara; Zhang, Chenhui; Cao, Baoqiang; Rosenwaks, Zev; Xu, Kangpu

    2009-11-01

    The scarce amount of DNA contained in a single cell is a limiting factor for clinical application of preimplantation genetic diagnosis, mainly due to the risk of misdiagnosis caused by allele dropout and the difficulty in obtaining copy number variations in all 23 pairs of chromosomes. Multiple displacement amplification (MDA) has been reported to generate large quantities of product from small amounts of template. Here, we evaluated the fidelity of whole-genome amplification by MDA from single or a few cells and determined the accuracy of chromosome copy number assessment on these MDA products using an Affymetrix 10K 2.0 SNP Mapping Array. An average coverage rate of 86.2% was obtained from single cells, and the rates increased significantly when five or more cells were used as templates. Higher concordance for chromosome copy number from single cells could be achieved when the MDA amplified product was used as reference (93.1%) than when gDNA was used as reference (82.8%). The present study indicates that satisfactory genome coverage can be obtained from single-cell MDA, which may be used for studies where only a minute amount of genetic material is available. Clinically, MDA coupled with SNP mapping array may provide a reliable and accurate method for chromosome copy number analysis and most likely for the detection of single-gene disorders as well. PMID:19671595

  6. Testing the performance of mtSNP minisequencing in forensic samples.

    PubMed

    Mosquera-Miguel, A; Alvarez-Iglesias, V; Cerezo, M; Lareu, M V; Carracedo, A; Salas, A

    2009-09-01

    There is a growing interest among forensic geneticists in developing efficient protocols for genotyping coding region mitochondrial DNA (mtDNA) SNPs (mtSNPs). Minisequencing is becoming a popular method for SNP genotyping, but it is still used by few forensic laboratories. In part, this is due to the lack of studies testing its efficiency and reproducibility when applied to real and complex forensic samples. Here we tested a minisequencing design that consists of 71 mtSNPs (in three multiplexes) that are diagnostic of known branches of the R0 phylogeny, in real forensic samples, including degraded bones and teeth, hair shafts, and serial dilutions. The short amplicons, coupled with the natural efficiency of the minisequencing technique, allow these assays to perform well with all the samples tested, whether degraded and/or containing low amounts of DNA. We did not observe phylogenetic inconsistencies in the 71 mtSNP haplotypes generated, indicating that the technique is robust against potential artefacts that could arise from unintended contamination and/or spurious amplification of nuclear mtDNA pseudogenes (NUMTs).

  7. Transcriptome sequencing to produce SNP-based genetic maps of onion.

    PubMed

    Duangjit, J; Bohanec, B; Chan, A P; Town, C D; Havey, M J

    2013-08-01

    We used the Roche-454 platform to sequence from normalized cDNA libraries from each of two inbred lines of onion (OH1 and 5225). From approximately 1.6 million reads from each inbred, 27,065 and 33,254 cDNA contigs were assembled from OH1 and 5225, respectively. In total, 3,364 well supported single nucleotide polymorphisms (SNPs) on 1,716 cDNA contigs were identified between these two inbreds. One SNP on each of 1,256 contigs was randomly selected for genotyping. OH1 and 5225 were crossed and 182 gynogenic haploids extracted from hybrid plants were used for SNP mapping. A total of 597 SNPs segregated in the OH1 × 5225 haploid family and a genetic map of ten linkage groups (LOD ≥8) was constructed. Three hundred and thirty-nine of the newly identified SNPs were also mapped using a previously developed segregating family from BYG15-23 × AC43, and 223 common SNPs were used to join the two maps. Because these new SNPs are in expressed regions of the genome and commonly occur among onion germplasms, they will be useful for genetic mapping, gene tagging, marker-aided selection, quality control of seed lots, and fingerprinting of cultivars.

  8. Linkage Disequilibrium Estimation of Chinese Beef Simmental Cattle Using High-density SNP Panels

    PubMed Central

    Zhu, M.; Zhu, B.; Wang, Y. H.; Wu, Y.; Xu, L.; Guo, L. P.; Yuan, Z. R.; Zhang, L. P.; Gao, X.; Gao, H. J.; Xu, S. Z.; Li, J. Y.

    2013-01-01

    Linkage disequilibrium (LD) plays an important role in genomic selection and mapping quantitative trait loci (QTL). In this study, the pattern of LD and effective population size (Ne) were investigated in Chinese beef Simmental cattle. A total of 640 bulls were genotyped with the Illumina BovineSNP50 BeadChip and the Illumina BovineHD BeadChip. We estimated LD for each autosomal chromosome at distances between two random SNPs of 0 to 25 kb, 25 to 50 kb, 50 to 100 kb, 100 to 500 kb, 0.5 to 1 Mb, 1 to 5 Mb and 5 to 10 Mb. The mean values of r² were 0.30, 0.16 and 0.08 for SNPs separated by 0 to 25 kb, 50 to 100 kb and 0.5 to 1 Mb, respectively. The LD estimates decreased as the distance between SNP pairs increased, and increased with increasing minor allele frequency (MAF) and with decreasing sample size. Estimates of effective population size for Chinese beef Simmental cattle decreased in the past generations and Ne was 73 five generations ago. PMID:25049849
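
    The distance-binned LD summary described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes genotypes coded 0/1/2 and uses the squared correlation of genotype dosages as a stand-in for r², and the function names `ld_r2` and `mean_r2_by_distance` are hypothetical. Only the distance bins are taken from the abstract.

```python
# Distance bins in base pairs, as listed in the abstract.
BINS_BP = [
    (0, 25_000), (25_000, 50_000), (50_000, 100_000), (100_000, 500_000),
    (500_000, 1_000_000), (1_000_000, 5_000_000), (5_000_000, 10_000_000),
]


def ld_r2(g1, g2):
    """Squared Pearson correlation of two genotype vectors coded 0/1/2.

    Assumption: genotype-based r2 is used here as an approximation;
    the study may have used phased haplotypes instead.
    Returns None when either SNP is monomorphic (r2 undefined).
    """
    n = len(g1)
    m1, m2 = sum(g1) / n, sum(g2) / n
    cov = sum((a - m1) * (b - m2) for a, b in zip(g1, g2))
    v1 = sum((a - m1) ** 2 for a in g1)
    v2 = sum((b - m2) ** 2 for b in g2)
    if v1 == 0 or v2 == 0:
        return None
    return cov * cov / (v1 * v2)


def mean_r2_by_distance(positions, genotypes):
    """Average r2 per distance bin for all SNP pairs on one chromosome.

    positions: base-pair coordinate of each SNP.
    genotypes: one list of 0/1/2 calls per SNP (same order as positions).
    Returns one mean per bin, or None for bins without any SNP pair.
    """
    sums = [0.0] * len(BINS_BP)
    counts = [0] * len(BINS_BP)
    for i in range(len(positions)):
        for j in range(i + 1, len(positions)):
            r2 = ld_r2(genotypes[i], genotypes[j])
            if r2 is None:
                continue
            dist = abs(positions[j] - positions[i])
            for k, (lo, hi) in enumerate(BINS_BP):
                if lo < dist <= hi:
                    sums[k] += r2
                    counts[k] += 1
                    break
    return [s / c if c else None for s, c in zip(sums, counts)]
```

    The all-pairs loop is quadratic in the number of SNPs, so for high-density panels a real analysis would restrict comparisons to SNPs within the largest bin (10 Mb here) or use a dedicated tool.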

  9. PrimerMapper: high throughput primer design and graphical assembly for PCR and SNP detection

    PubMed Central

    O’Halloran, Damien M.

    2016-01-01

    Primer design represents a widely employed gambit in diverse molecular applications including PCR, sequencing, and probe hybridization. Variations of PCR, including primer walking, allele-specific PCR, and nested PCR provide specialized validation and detection protocols for molecular analyses that often require screening large numbers of DNA fragments. In these cases, automated sequence retrieval and processing become important features, and furthermore, a graphic that provides the user with a visual guide to the distribution of designed primers across targets is most helpful in quickly ascertaining primer coverage. To this end, I describe here, PrimerMapper, which provides a comprehensive graphical user interface that designs robust primers from any number of inputted sequences while providing the user with both, graphical maps of primer distribution for each inputted sequence, and also a global assembled map of all inputted sequences with designed primers. PrimerMapper also enables the visualization of graphical maps within a browser and allows the user to draw new primers directly onto the webpage. Other features of PrimerMapper include allele-specific design features for SNP genotyping, a remote BLAST window to NCBI databases, and remote sequence retrieval from GenBank and dbSNP. PrimerMapper is hosted at GitHub and freely available without restriction. PMID:26853558

  10. SNPsplit: Allele-specific splitting of alignments between genomes with known SNP genotypes

    PubMed Central

    Krueger, Felix; Andrews, Simon R.

    2016-01-01

    Sequencing reads overlapping polymorphic sites in diploid mammalian genomes may be assigned to one allele or the other. This holds the potential to detect gene expression, chromatin modifications, DNA methylation or nuclear interactions in an allele-specific fashion. SNPsplit is an allele-specific alignment sorter designed to read files in SAM/BAM format and determine the allelic origin of reads or read-pairs that cover known single nucleotide polymorphic (SNP) positions. For this to work libraries must have been aligned to a genome in which all known SNP positions were masked with the ambiguity base 'N' and aligned using a suitable mapping program such as Bowtie2, TopHat, STAR, HISAT2, HiCUP or Bismark. SNPsplit also provides an automated solution to generate N-masked reference genomes for hybrid mouse strains based on the variant call information provided by the Mouse Genomes Project. The unique ability of SNPsplit to work with various different kinds of sequencing data including RNA-Seq, ChIP-Seq, Bisulfite-Seq or Hi-C opens new avenues for the integrative exploration of allele-specific data. PMID:27429743
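
    The two ideas in this abstract, N-masking a reference and sorting reads by the base they carry at known SNP positions, can be sketched in a few lines. This is an illustrative toy only and not SNPsplit's code or interface: it assumes ungapped alignments and 0-based coordinates, and the function names and the four labels are hypothetical.

```python
def n_mask(reference, snp_positions):
    """Return the reference with every known SNP position replaced by 'N'."""
    bases = list(reference)
    for pos in snp_positions:
        bases[pos] = "N"
    return "".join(bases)


def assign_allele(read_start, read_seq, snps):
    """Classify a read by the bases it carries at known SNP positions.

    read_start: 0-based reference coordinate of the first read base.
    read_seq:   read bases, assumed to align without gaps (a simplification).
    snps:       {position: (genome1_base, genome2_base)}.
    """
    genome1_hits = 0
    genome2_hits = 0
    for offset, base in enumerate(read_seq):
        alleles = snps.get(read_start + offset)
        if alleles is None:
            continue  # not a known SNP position
        if base == alleles[0]:
            genome1_hits += 1
        elif base == alleles[1]:
            genome2_hits += 1
        # any other base (e.g. a sequencing error) is ignored
    if genome1_hits and genome2_hits:
        return "conflicting"
    if genome1_hits:
        return "genome1"
    if genome2_hits:
        return "genome2"
    return "unassigned"
```

    For example, `assign_allele(100, "ACGT", {101: ("C", "T")})` labels the read `genome1`, because the read base at position 101 matches the first strain's allele.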

  11. Coding region SNP analysis to enhance dog mtDNA discrimination power in forensic casework.

    PubMed

    Verscheure, Sophie; Backeljau, Thierry; Desmyter, Stijn

    2015-01-01

    The high population frequencies of three control region haplotypes contribute to the low discrimination power of the dog mtDNA control region. It also diminishes the evidential power of a match with one of these haplotypes in forensic casework. A mitochondrial genome study of 214 Belgian dogs suggested 26 polymorphic coding region sites that successfully resolved dogs with the three most frequent control region haplotypes. In this study, three SNP assays were developed to determine the identity of the 26 informative sites. The control region of 132 newly sampled dogs was sequenced and added to the study of 214 dogs. The assays were applied to 58 dogs of the haplotypes of interest, which confirmed their suitability for enhancing dog mtDNA discrimination power. In the Belgian population study of 346 dogs, the set of 26 sites divided the dogs into 25 clusters of mtGenome sequences with substantially lower population frequency estimates than their control region sequences. In case of a match with one of the three control region haplotypes, using these three SNP assays in conjunction with control region sequencing would augment the exclusion probability of dog mtDNA analysis from 92.9% to 97.0%.

  12. Uncovering hidden variance: pair-wise SNP analysis accounts for additional variance in nicotine dependence

    PubMed Central

    Culverhouse, Robert C.; Saccone, Nancy L.; Stitzel, Jerry A.; Wang, Jen C.; Steinbach, Joseph H.; Goate, Alison M.; Schwantes-An, Tae-Hwi; Grucza, Richard A.; Stevens, Victoria L.; Bierut, Laura J.

    2010-01-01

    Results from genome-wide association studies of complex traits account for only a modest proportion of the trait variance predicted to be due to genetics. We hypothesize that joint analysis of polymorphisms may account for more variance. We evaluated this hypothesis on a case–control smoking phenotype by examining pairs of nicotinic receptor single-nucleotide polymorphisms (SNPs) using the Restricted Partition Method (RPM) on data from the Collaborative Genetic Study of Nicotine Dependence (COGEND). We found evidence of joint effects that increase explained variance. Four signals identified in COGEND were testable in independent American Cancer Society (ACS) data, and three of the four signals replicated. Our results highlight two important lessons: joint effects that increase the explained variance are not limited to loci displaying substantial main effects, and joint effects need not display a significant interaction term in a logistic regression model. These results suggest that the joint analyses of variants may indeed account for part of the genetic variance left unexplained by single SNP analyses. Methodologies that limit analyses of joint effects to variants that demonstrate association in single SNP analyses, or require a significant interaction term, will likely miss important joint effects. PMID:21079997

  13. Three clinical experiences with SNP array results consistent with parental incest: a narrative with lessons learned.

    PubMed

    Helm, Benjamin M; Langley, Katherine; Spangler, Brooke; Vergano, Samantha

    2014-08-01

    Single nucleotide polymorphism microarrays have the ability to reveal parental consanguinity which may or may not be known to healthcare providers. Consanguinity can have significant implications for the health of patients and for individual and family psychosocial well-being. These results often present ethical and legal dilemmas that can have important ramifications. Unexpected consanguinity can be confounding to healthcare professionals who may be unprepared to handle these results or to communicate them to families or other appropriate representatives. There are few published accounts of experiences with consanguinity and SNP arrays. In this paper we discuss three cases where molecular evidence of parental incest was identified by SNP microarray. We hope to further highlight consanguinity as a potential incidental finding, how the cases were handled by the clinical team, and what resources were found to be most helpful. This paper aims to contribute further to professional discourse on incidental findings with genomic technology and how they were addressed clinically. These experiences may provide some guidance on how others can prepare for these findings and help improve practice. As genetic and genomic testing is utilized more by non-genetics providers, we also hope to inform about the importance of engaging with geneticists and genetic counselors when addressing these findings.

  14. Heritability of Recurrent Exertional Rhabdomyolysis in Standardbred and Thoroughbred Racehorses Derived From SNP Genotyping Data.

    PubMed

    Norton, Elaine M; Mickelson, James R; Binns, Matthew M; Blott, Sarah C; Caputo, Paul; Isgren, Cajsa M; McCoy, Annette M; Moore, Alison; Piercy, Richard J; Swinburne, June E; Vaudin, Mark; McCue, Molly E

    2016-11-01

    Recurrent exertional rhabdomyolysis (RER) in Thoroughbred and Standardbred racehorses is characterized by episodes of muscle rigidity and cell damage that often recur upon strenuous exercise. The objective was to evaluate the importance of genetic factors in RER by obtaining an unbiased estimate of heritability in cohorts of unrelated Thoroughbred and Standardbred racehorses. Four hundred ninety-one Thoroughbred and 196 Standardbred racehorses were genotyped with the 54K or 74K SNP genotyping arrays. Heritability was calculated from genome-wide SNP data with a mixed linear and Bayesian model, utilizing the standard genetic relationship matrix (GRM). Both the mixed linear and Bayesian models estimated heritability of RER in Thoroughbreds to be approximately 0.34 and in Standardbred racehorses to be approximately 0.45 after adjusting for disease prevalence and sex. To account for potential differences in the genetic architecture of the underlying causal variants, heritability estimates were adjusted based on linkage disequilibrium weighted kinship matrix, minor allele frequency and variant effect size, yielding heritability estimates that ranged between 0.41-0.46 (Thoroughbreds) and 0.39-0.49 (Standardbreds). In conclusion, between 34-46% and 39-49% of the variance in RER susceptibility in Thoroughbred and Standardbred racehorses, respectively, can be explained by the SNPs present on these 2 genotyping arrays, indicating that RER is moderately heritable. These data provide further rationale for the investigation of genetic mutations associated with RER susceptibility.

  15. SNP Miniplexes for Individual Identification of Random-Bred Domestic Cats.

    PubMed

    Brooks, Ashley; Creighton, Erica K; Gandolfi, Barbara; Khan, Razib; Grahn, Robert A; Lyons, Leslie A

    2016-05-01

    Phenotypic and genotypic characteristics of the cat can be obtained from single nucleotide polymorphism (SNP) analyses of fur. This study developed miniplexes using SNPs with high discriminating power for random-bred domestic cats, focusing on individual and phenotypic identification. Seventy-eight SNPs were investigated using a multiplex PCR followed by a fluorescently labeled single base extension (SBE) technique (SNaPshot®). The SNP miniplexes were evaluated for reliability, reproducibility, sensitivity, species specificity, detection limitations, and assignment accuracy. Six SNPplexes were developed containing 39 intergenic SNPs and 26 phenotypic SNPs, including a sex identification marker, ZFXY. The combined random match probability (cRMP) was 6.58 × 10⁻¹⁹ across all Western cat populations and the likelihood ratio was 1.52 × 10¹⁸. These SNPplexes can distinguish individual cats and their phenotypic traits, which could provide insight into crime reconstructions. A SNP database of 237 cats from 13 worldwide populations is now available for forensic applications.
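
    The abstract does not say how the cRMP was computed. A common textbook approach, sketched here under the assumptions of Hardy-Weinberg proportions and independent loci, multiplies per-locus genotype match probabilities and takes the likelihood ratio as the reciprocal; the reported figures are consistent with that reciprocal relationship. The function names and allele frequencies are hypothetical.

```python
import math


def locus_match_probability(p):
    """Chance that two unrelated individuals share a genotype at one biallelic SNP.

    Assumes Hardy-Weinberg proportions with allele frequencies p and q = 1 - p:
    genotype frequencies p^2, 2pq, q^2, each squared and summed.
    """
    q = 1.0 - p
    return (p * p) ** 2 + (2 * p * q) ** 2 + (q * q) ** 2


def combined_rmp(allele_freqs):
    """Product of per-locus match probabilities (assumes independent loci)."""
    return math.prod(locus_match_probability(p) for p in allele_freqs)


def likelihood_ratio(crmp):
    """Likelihood ratio taken as the reciprocal of the cRMP."""
    return 1.0 / crmp
```

    Applied to the reported value, `likelihood_ratio(6.58e-19)` gives about 1.52 × 10¹⁸, matching the likelihood ratio stated in the abstract.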

  16. Coding region SNP analysis to enhance dog mtDNA discrimination power in forensic casework.

    PubMed

    Verscheure, Sophie; Backeljau, Thierry; Desmyter, Stijn

    2015-01-01

    The high population frequencies of three control region haplotypes contribute to the low discrimination power of the dog mtDNA control region. It also diminishes the evidential power of a match with one of these haplotypes in forensic casework. A mitochondrial genome study of 214 Belgian dogs suggested 26 polymorphic coding region sites that successfully resolved dogs with the three most frequent control region haplotypes. In this study, three SNP assays were developed to determine the identity of the 26 informative sites. The control region of 132 newly sampled dogs was sequenced and added to the study of 214 dogs. The assays were applied to 58 dogs of the haplotypes of interest, which confirmed their suitability for enhancing dog mtDNA discrimination power. In the Belgian population study of 346 dogs, the set of 26 sites divided the dogs into 25 clusters of mtGenome sequences with substantially lower population frequency estimates than their control region sequences. In case of a match with one of the three control region haplotypes, using these three SNP assays in conjunction with control region sequencing would augment the exclusion probability of dog mtDNA analysis from 92.9% to 97.0%. PMID:25299153

  17. PrimerMapper: high throughput primer design and graphical assembly for PCR and SNP detection.

    PubMed

    O'Halloran, Damien M

    2016-01-01

    Primer design represents a widely employed gambit in diverse molecular applications including PCR, sequencing, and probe hybridization. Variations of PCR, including primer walking, allele-specific PCR, and nested PCR provide specialized validation and detection protocols for molecular analyses that often require screening large numbers of DNA fragments. In these cases, automated sequence retrieval and processing become important features, and furthermore, a graphic that provides the user with a visual guide to the distribution of designed primers across targets is most helpful in quickly ascertaining primer coverage. To this end, I describe here, PrimerMapper, which provides a comprehensive graphical user interface that designs robust primers from any number of inputted sequences while providing the user with both, graphical maps of primer distribution for each inputted sequence, and also a global assembled map of all inputted sequences with designed primers. PrimerMapper also enables the visualization of graphical maps within a browser and allows the user to draw new primers directly onto the webpage. Other features of PrimerMapper include allele-specific design features for SNP genotyping, a remote BLAST window to NCBI databases, and remote sequence retrieval from GenBank and dbSNP. PrimerMapper is hosted at GitHub and freely available without restriction. PMID:26853558

  18. Differentiation of drug and non-drug Cannabis using a single nucleotide polymorphism (SNP) assay.

    PubMed

    Rotherham, D; Harbison, S A

    2011-04-15

    Cannabis sativa is both an illegal drug and a legitimate crop. The differentiation of illegal drug Cannabis from non-drug forms of Cannabis is relevant in the context of the growth of fibre and seed oil varieties of Cannabis for commercial purposes. This differentiation is currently determined based on the levels of tetrahydrocannabinol (THC) in adult plants. DNA based methods have the potential to assay Cannabis material unsuitable for analysis using conventional means including seeds, pollen and severely degraded material. The purpose of this research was to develop a single nucleotide polymorphism (SNP) assay for the differentiation of "drug" and "non-drug" Cannabis plants. An assay was developed based on four polymorphisms within a 399 bp fragment of the tetrahydrocannabinolic acid (THCA) synthase gene, utilising the SNaPshot multiplex kit. This SNP assay was tested on 94 Cannabis plants, which included 10 blind samples, and was able to differentiate between "drug" and "non-drug" Cannabis in all cases, while also differentiating between Cannabis and other species. Non-drug plants were found to be homozygous at the four sites assayed while drug Cannabis plants were either homozygous or heterozygous.

  19. Screening for replication of genome-wide SNP associations in sporadic ALS

    PubMed Central

    Cronin, Simon; Tomik, Barbara; Bradley, Daniel G; Slowik, Agnieszka; Hardiman, Orla

    2009-01-01

    We recently reported a joint analysis of genome-wide association (GWA) data on 958 sporadic amyotrophic lateral sclerosis (ALS) cases and 932 controls from Ireland and the publicly available data sets from the United States and the Netherlands. The strongest pooled association was rs10260404 in the dipeptidyl-peptidase 6 (DPP6) gene. Here, we sought confirmation of joint analysis signals in both an expanded Irish and a Polish ALS cohort. Among 287 522 autosomal single-nucleotide polymorphisms (SNPs), 27 were commonly associated on joint analysis of the Irish, US and Dutch GWAs. These 27 SNPs were genotyped in an expanded Irish cohort (312 patients with SALS; 259 controls) and an additional Polish cohort (218 patients; 356 controls). Eleven SNPs, including rs10260404, reached a final P-value below 0.05 in the Irish cohort. In the Polish cohort, only one SNP, rs6299711, showed nominal association with ALS. Pooling of data for 1267 patients with ALS and 1336 control subjects did not identify any association reaching Bonferroni significance (P<1.74 × 10⁻⁷). The present strategy did not reveal any consistently associated SNP across four populations. The result for DPP6 is surprising, as it has been replicated elsewhere. We discuss the possible interpretations and implications of these findings for future ALS GWA studies both within and between populations. PMID:18987618
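
    The Bonferroni threshold quoted above can be reproduced with a one-line calculation. The sketch assumes a family-wise error rate of 0.05 divided by the 287 522 autosomal SNPs; the abstract does not state the alpha level, but this choice matches the reported cut-off. The function name is hypothetical.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


# Assumption: alpha = 0.05 and one test per autosomal SNP in the joint analysis.
threshold = bonferroni_threshold(0.05, 287_522)
print(f"{threshold:.2e}")  # 1.74e-07
```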

  20. Diversity in 113 cowpea [Vigna unguiculata (L) Walp] accessions assessed with 458 SNP markers.

    PubMed

    Egbadzor, Kenneth F; Ofori, Kwadwo; Yeboah, Martin; Aboagye, Lawrence M; Opoku-Agyeman, Michael O; Danquah, Eric Y; Offei, Samuel K

    2014-01-01

    Single Nucleotide Polymorphism (SNP) markers were used in the characterization of 113 cowpea accessions comprising 108 from Ghana and 5 from abroad. Leaf tissues from plants cultivated at the University of Ghana were genotyped at KBioscience in the United Kingdom. Data was generated for 477 SNPs, out of which 458 revealed polymorphism. The results were used to analyze genetic dissimilarity among the accessions using DARwin 5 software. The markers discriminated among all of the cowpea accessions, and the dissimilarity values, which ranged from 0.006 to 0.63, were used for a factorial plot. Unexpectedly high levels of heterozygosity were observed in some of the accessions. Accessions known to be closely related clustered together in a dendrogram drawn with the WPGMA method. A maximum length sub-tree comprising 48 core accessions was constructed. The software package STRUCTURE was used to separate accessions into three groups, and the programme correctly identified varieties that were known hybrids. The hybrids were those accessions with numerous heterozygous loci. The STRUCTURE plot showed closely related accessions with similar genome patterns. The SNP markers were more efficient in discriminating among the cowpea germplasm than morphological, seed protein polymorphism and simple sequence repeat studies reported earlier on the same collection. PMID:25332852

  1. Adjustment of genomic waves in signal intensities from whole-genome SNP genotyping platforms.

    PubMed

    Diskin, Sharon J; Li, Mingyao; Hou, Cuiping; Yang, Shuzhang; Glessner, Joseph; Hakonarson, Hakon; Bucan, Maja; Maris, John M; Wang, Kai

    2008-11-01

    Whole-genome microarrays with large-insert clones designed to determine DNA copy number often show variation in hybridization intensity that is related to the genomic position of the clones. We found these 'genomic waves' to be present in Illumina and Affymetrix SNP genotyping arrays, confirming that they are not platform-specific. The causes of genomic waves are not well-understood, and they may prevent accurate inference of copy number variations (CNVs). By measuring DNA concentration for 1444 samples and by genotyping the same sample multiple times with varying DNA quantity, we demonstrated that DNA quantity correlates with the magnitude of waves. We further showed that wavy signal patterns correlate best with GC content, among multiple genomic features considered. To measure the magnitude of waves, we proposed a GC-wave factor (GCWF) measure, which is a reliable predictor of DNA quantity (correlation coefficient = 0.994 based on samples with serial dilution). Finally, we developed a computational approach by fitting regression models with GC content included as a predictor variable, and we show that this approach improves the accuracy of CNV detection. With the wide application of whole-genome SNP genotyping techniques, our wave adjustment method will be important for taking full advantage of genotyped samples for CNV analysis.
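
    The regression-based adjustment described above can be illustrated with a simple least-squares fit of signal intensity on local GC content, keeping the residuals as the adjusted signal. This is a simplified sketch and not the authors' published model or their GCWF measure: it assumes a single linear GC term, one GC value per probe, and hypothetical function names.

```python
def fit_line(x, y):
    """Ordinary least-squares slope and intercept for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def adjust_for_gc(signal, gc_content):
    """Remove the GC-predicted component from per-probe signal intensities.

    signal:     intensity per probe (e.g. a log ratio).
    gc_content: GC fraction in a window around each probe (window size is
                an analysis choice not specified here).
    Returns residuals; note that subtracting the fitted intercept also
    centres the adjusted signal on zero.
    """
    slope, intercept = fit_line(gc_content, signal)
    return [s - (intercept + slope * g) for s, g in zip(signal, gc_content)]
```

    A signal that is an exact linear function of GC content is flattened to zero by this adjustment, while copy number changes that are unrelated to GC content remain visible in the residuals.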

  2. SNP genotyping and population genomics from expressed sequences - current advances and future possibilities.

    PubMed

    De Wit, Pierre; Pespeni, Melissa H; Palumbi, Stephen R

    2015-05-01

    With the rapid increase in production of genetic data from new sequencing technologies, a myriad of new ways to study genomic patterns in nonmodel organisms are currently possible. Because genome assembly still remains a complicated procedure, and because the functional role of much of the genome is unclear, focusing on SNP genotyping from expressed sequences provides a cost-effective way to reduce complexity while still retaining functionally relevant information. This review summarizes current methods, identifies ways that using expressed sequence data benefits population genomic inference and explores how current practitioners evaluate and overcome challenges that are commonly encountered. We focus particularly on the additional power of functional analysis provided by expressed sequence data and how these analyses push beyond allele pattern data available from nonfunctional genomic approaches. The massive data sets generated by these approaches create opportunities as well as problems, especially false positives. We discuss methods available to validate results from expressed SNP genotyping assays, new approaches that sidestep use of mRNA and review follow-up experiments that can focus on evolutionary mechanisms acting across the genome.

  3. Heritability of Recurrent Exertional Rhabdomyolysis in Standardbred and Thoroughbred Racehorses Derived From SNP Genotyping Data.

    PubMed

    Norton, Elaine M; Mickelson, James R; Binns, Matthew M; Blott, Sarah C; Caputo, Paul; Isgren, Cajsa M; McCoy, Annette M; Moore, Alison; Piercy, Richard J; Swinburne, June E; Vaudin, Mark; McCue, Molly E

    2016-11-01

    Recurrent exertional rhabdomyolysis (RER) in Thoroughbred and Standardbred racehorses is characterized by episodes of muscle rigidity and cell damage that often recur upon strenuous exercise. The objective was to evaluate the importance of genetic factors in RER by obtaining an unbiased estimate of heritability in cohorts of unrelated Thoroughbred and Standardbred racehorses. Four hundred ninety-one Thoroughbred and 196 Standardbred racehorses were genotyped with the 54K or 74K SNP genotyping arrays. Heritability was calculated from genome-wide SNP data with a mixed linear and Bayesian model, utilizing the standard genetic relationship matrix (GRM). Both the mixed linear and Bayesian models estimated heritability of RER in Thoroughbreds to be approximately 0.34 and in Standardbred racehorses to be approximately 0.45 after adjusting for disease prevalence and sex. To account for potential differences in the genetic architecture of the underlying causal variants, heritability estimates were adjusted based on linkage disequilibrium weighted kinship matrix, minor allele frequency and variant effect size, yielding heritability estimates that ranged between 0.41-0.46 (Thoroughbreds) and 0.39-0.49 (Standardbreds). In conclusion, between 34-46% and 39-49% of the variance in RER susceptibility in Thoroughbred and Standardbred racehorses, respectively, can be explained by the SNPs present on these 2 genotyping arrays, indicating that RER is moderately heritable. These data provide further rationale for the investigation of genetic mutations associated with RER susceptibility. PMID:27489252

  4. Diversity in 113 cowpea [Vigna unguiculata (L) Walp] accessions assessed with 458 SNP markers.

    PubMed

    Egbadzor, Kenneth F; Ofori, Kwadwo; Yeboah, Martin; Aboagye, Lawrence M; Opoku-Agyeman, Michael O; Danquah, Eric Y; Offei, Samuel K

    2014-01-01

    Single nucleotide polymorphism (SNP) markers were used to characterize 113 cowpea accessions, comprising 108 from Ghana and 5 from abroad. Leaf tissues from plants cultivated at the University of Ghana were genotyped at KBioscience in the United Kingdom. Data were generated for 477 SNPs, of which 458 were polymorphic. The results were used to analyze genetic dissimilarity among the accessions using DARwin 5 software. The markers discriminated among all of the cowpea accessions, and the dissimilarity values, which ranged from 0.006 to 0.63, were used for a factorial plot. Unexpectedly high levels of heterozygosity were observed in some of the accessions. Accessions known to be closely related clustered together in a dendrogram drawn with the WPGMA method. A maximum length sub-tree comprising 48 core accessions was constructed. The software package STRUCTURE was used to separate accessions into three groups, and the programme correctly identified varieties that were known hybrids. The hybrids were those accessions with numerous heterozygous loci. The STRUCTURE plot showed closely related accessions with similar genome patterns. The SNP markers were more efficient in discriminating among the cowpea germplasm than the morphological, seed protein polymorphism and simple sequence repeat studies reported earlier on the same collection.

  5. Whole-Genome Analysis of Diversity and SNP-Major Gene Association in Peach Germplasm

    PubMed Central

    Micheletti, Diego; Dettori, Maria Teresa; Micali, Sabrina; Aramini, Valeria; Pacheco, Igor; Da Silva Linge, Cassia; Foschi, Stefano; Banchi, Elisa; Barreneche, Teresa; Quilot-Turion, Bénédicte; Lambert, Patrick; Pascal, Thierry; Iglesias, Ignasi; Carbó, Joaquim; Wang, Li-rong; Ma, Rui-juan; Li, Xiong-wei; Gao, Zhong-shan; Nazzicari, Nelson; Troggio, Michela; Bassi, Daniele; Rossini, Laura; Verde, Ignazio; Laurens, François; Arús, Pere; Aranzana, Maria José

    2015-01-01

    Peach was domesticated in China more than four millennia ago and from there it spread world-wide. Since the middle of the last century, peach breeding programs have been very dynamic, generating hundreds of new commercial varieties; however, in most cases such varieties derive from a limited collection of parental lines (founders). This is one reason for the observed low levels of variability of the commercial gene pool, implying that knowledge of the extent and distribution of genetic variability in peach is critical to allow the choice of adequate parents to confer enhanced productivity, adaptation and quality to improved varieties. With this aim we genotyped 1,580 peach accessions (including a few closely related Prunus species) maintained and phenotyped in five germplasm collections (four European and one Chinese) with the International Peach SNP Consortium 9K SNP peach array. The study of population structure revealed the subdivision of the panel into three main populations, one mainly made up of Occidental varieties from breeding programs (POP1OCB), one of Occidental landraces (POP2OCT) and the third of Oriental accessions (POP3OR). Analysis of linkage disequilibrium (LD) identified differential patterns of genome-wide LD blocks in each of the populations. Phenotypic data for seven monogenic traits were integrated in a genome-wide association study (GWAS). The significantly associated SNPs were always in the regions predicted by linkage analysis, forming haplotypes of markers. These diagnostic haplotypes could be used for marker-assisted selection (MAS) in modern breeding programs. PMID:26352671

  6. SNP discovery in complex allotetraploid genomes (Gossypium spp., Malvaceae) using genotyping by sequencing

    PubMed Central

    Logan-Young, Carla Jo; Yu, John Z.; Verma, Surender K.; Percy, Richard G.; Pepper, Alan E.

    2015-01-01

    Premise of the study: Single-nucleotide polymorphism (SNP) marker discovery in plants with complex allotetraploid genomes is often confounded by the presence of homeologous loci (along with paralogous and orthologous loci). Here we present a strategy to filter for SNPs representing orthologous loci. Methods and Results: Using Illumina next-generation sequencing, 54 million reads were collected from restriction enzyme–digested DNA libraries of a diversity of Gossypium taxa. Loci with one to three SNPs were discovered using the Stacks software package, yielding 25,529 new cotton SNP combinations, including those that are polymorphic at both interspecific and intraspecific levels. Frequencies of predicted dual-homozygous (aa/bb) marker polymorphisms ranged from 6.7–11.6% of total shared fragments in intraspecific comparisons and from 15.0–16.4% in interspecific comparisons. Conclusions: This resource provides dual-homozygous (aa/bb) marker polymorphisms. Both in silico and experimental validation efforts demonstrated that these markers are enriched for single orthologous loci that are homozygous for alternative alleles. PMID:25798340

  7. Lack of Association of the CD247 SNP rs2056626 with Systemic Sclerosis in Han Chinese

    PubMed Central

    Wang, Jiucun; Yi, Lin; Guo, Xinjian; He, Dongyi; Li, Hongyi; Guo, Gang; Wang, Yi; Zou, Hejian; Gu, Yuanhui; Tu, Wenzhen; Wu, Wenyu; Yang, Li; Xiao, Rong; Lai, Syeling; Assassi, Shervin; Mayes, Maureen D; Zhou, Xiaodong

    2014-01-01

    Systemic sclerosis (SSc) is a complex disease involving multiple genetic factors. A recent genome-wide association study (GWAS) indicated that CD247 was strongly associated with SSc, which was subsequently confirmed in a European SSc cohort. However, genetic heterogeneity in different ethnic populations may significantly impact the complex trait of SSc. The studies herein aimed to examine whether the SSc-associated SNP rs2056626 of CD247 identified in Caucasians is also associated with SSc in Han Chinese. A Han Chinese cohort consisting of 387 SSc patients and 523 healthy controls was examined. TaqMan assays were performed to genotype the SNP. Exact p-values were obtained (Fisher’s test) from 2x2 tables of allele counts and disease status. The results showed that there was no association between rs2056626 of CD247 and SSc or any SSc subtype in Han Chinese. The negative results are important in understanding the genetics of SSc in different ethnic populations, and further suggest the complex nature of the genetics of SSc. PMID:25317213
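
    The exact test on a 2x2 table of allele counts by disease status can be sketched as follows. This is a minimal pure-Python illustration of a two-sided Fisher's exact test, not the authors' analysis code; the function name and the example counts are made up for illustration and are not the study's data.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table
        [[a, b],   e.g. cases:    allele 1, allele 2
         [c, d]]        controls: allele 1, allele 2
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denominator = comb(row1 + row2, col1)

    def prob(k):
        # Hypergeometric probability of k in the top-left cell
        # given fixed row and column totals.
        return comb(row1, k) * comb(row2, col1 - k) / denominator

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum all tables at least as unlikely as the observed one
    # (small relative tolerance guards against float ties).
    total = sum(
        p for p in (prob(k) for k in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-7)
    )
    return min(1.0, total)


# Hypothetical allele counts, for illustration only.
p_value = fisher_exact_two_sided(3, 1, 1, 3)
```

    With allele counts, each individual contributes two observations, so a cohort of N people yields a table whose row total is 2N.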

  8. Performance of different SNP panels for parentage testing in two East Asian cattle breeds.

    PubMed

    Strucken, E M; Gudex, B; Ferdosi, M H; Lee, H K; Song, K D; Gibson, J P; Kelly, M; Piper, E K; Porto-Neto, L R; Lee, S H; Gondro, C

    2014-08-01

    The International Society for Animal Genetics (ISAG) proposed a panel of single nucleotide polymorphisms (SNPs) for parentage testing in cattle (a core panel of 100 SNPs and an additional list of 100 SNPs). However, markers specific to East Asian taurine cattle breeds were not included, and no information is available as to whether the ISAG panel performs adequately for these breeds. We tested ISAG's core (100 SNP) and full (200 SNP) panels on two East Asian taurine breeds: the Korean Hanwoo and the Japanese Wagyu, the latter from the Australian herd. Even though the power of exclusion was high at 0.99 for both ISAG panels, the core panel performed poorly with 3.01% false-positive assignments in the Hanwoo population and 3.57% in the Wagyu. The full ISAG panel identified all sire-offspring relations correctly in both populations with 0.02% of relations wrongly excluded in the Hanwoo population. Based on these results, we created and tested two population-specific marker panels: one for the Wagyu population, which showed no false-positive assignments with either 100 or 200 SNPs, and a second panel for the Hanwoo, which still had some false-positive assignments with 100 SNPs but no false positives using 200 SNPs. In conclusion, for parentage assignment in East Asian cattle breeds, only the full ISAG panel is adequate for parentage testing. If fewer markers should be used, it is advisable to use population-specific markers rather than the ISAG panel.
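
    Parentage exclusion with biallelic SNPs is often based on opposing homozygotes: a candidate sire that is homozygous for one allele cannot have an offspring homozygous for the other. The sketch below illustrates that general idea only; the abstract does not describe the assignment procedure in this detail, and the function names, the 0/1/2 coding and the mismatch tolerance are assumptions.

```python
def opposing_homozygotes(parent, offspring):
    """Count loci where parent and offspring are homozygous for
    different alleles. Genotypes are coded 0, 1, 2 (copies of one
    allele); None marks a missing call and is skipped."""
    count = 0
    for p, o in zip(parent, offspring):
        if p is None or o is None:
            continue
        if {p, o} == {0, 2}:
            count += 1
    return count


def is_excluded(parent, offspring, max_mismatches=0):
    """Exclude the candidate parent when Mendelian conflicts exceed
    a tolerance (hypothetical allowance for genotyping error)."""
    return opposing_homozygotes(parent, offspring) > max_mismatches


# Made-up genotypes at five loci.
candidate_sire = [0, 1, 2, 2, None]
calf = [2, 1, 0, 1, 0]
conflicts = opposing_homozygotes(candidate_sire, calf)
```

    A false-positive assignment in this framing is an unrelated animal that shows no conflicts at any tested locus, which becomes less likely as more informative SNPs are added to the panel.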

  9. SNPsplit: Allele-specific splitting of alignments between genomes with known SNP genotypes.

    PubMed

    Krueger, Felix; Andrews, Simon R

    2016-01-01

    Sequencing reads overlapping polymorphic sites in diploid mammalian genomes may be assigned to one allele or the other. This holds the potential to detect gene expression, chromatin modifications, DNA methylation or nuclear interactions in an allele-specific fashion. SNPsplit is an allele-specific alignment sorter designed to read files in SAM/BAM format and determine the allelic origin of reads or read-pairs that cover known single nucleotide polymorphic (SNP) positions. For this to work libraries must have been aligned to a genome in which all known SNP positions were masked with the ambiguity base 'N' and aligned using a suitable mapping program such as Bowtie2, TopHat, STAR, HISAT2, HiCUP or Bismark. SNPsplit also provides an automated solution to generate N-masked reference genomes for hybrid mouse strains based on the variant call information provided by the Mouse Genomes Project. The unique ability of SNPsplit to work with various different kinds of sequencing data including RNA-Seq, ChIP-Seq, Bisulfite-Seq or Hi-C opens new avenues for the integrative exploration of allele-specific data. PMID:27429743
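
    The two ideas in this abstract, N-masking a reference at known SNP positions and assigning a read by the bases it carries at those positions, can be sketched in a few lines. This is an illustrative toy, not SNPsplit's implementation: the function names, the 1-based coordinates, the dictionary inputs and the category labels are all assumptions.

```python
def n_mask(sequence, snp_positions):
    """Replace the base at each 1-based SNP position with 'N'."""
    bases = list(sequence)
    for pos in snp_positions:
        if not 1 <= pos <= len(bases):
            raise ValueError(f"position {pos} outside sequence")
        bases[pos - 1] = "N"
    return "".join(bases)


def assign_read(observed, alleles):
    """Assign a read to a parental genome.

    observed: {position: base seen in the read}
    alleles:  {position: (genome1_base, genome2_base)}
    """
    votes1 = votes2 = 0
    for pos, base in observed.items():
        if pos not in alleles:
            continue
        genome1_base, genome2_base = alleles[pos]
        if base == genome1_base:
            votes1 += 1
        elif base == genome2_base:
            votes2 += 1
    if votes1 and votes2:
        return "conflicting"
    if votes1:
        return "genome1"
    if votes2:
        return "genome2"
    return "unassigned"


masked = n_mask("ACGTACGT", [2, 5])
```

    Masking before alignment matters because it removes the mapping advantage that reads carrying the reference allele would otherwise have.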

  10. Development of genetic markers in abalone through construction of a SNP database.

    PubMed

    Kang, J-H; Appleyard, S A; Elliott, N G; Jee, Y-J; Lee, J B; Kang, S W; Baek, M K; Han, Y S; Choi, T-J; Lee, Y S

    2011-06-01

    In the absence of a reference genome, single-nucleotide polymorphism (SNP) discovery in a group of abalone species was undertaken by random sequence assembly. A web-based interface was constructed, and 11 932 DNA sequences from the genus Haliotis were assembled, with 1321 contigs built. Of these, 118 contigs that consisted of at least ten annotation groups were selected. A total of 1577 putative SNPs were identified from the 118 contigs, with SNPs in several HSP70 gene contigs confirmed by PCR amplification of an 809-bp DNA fragment. SNPs in the HSP70 gene were compared across eight abalone species. A total of 129 polymorphic sites, including heterozygote sites within and among species, were observed. Phylogenetic analysis of the partial HSP70 gene region showed separation of the tested abalone into two groups, one reflecting the southern hemisphere species and the other the northern hemisphere species. Interestingly, Haliotis iris from New Zealand showed a closer relationship to species distributed in the northern Pacific region. Although HSP genes are known to be highly conserved among taxa, the validation of polymorphic SNPs from HSP70 in this mollusc demonstrates the applicability of cross-species SNP markers in abalone and represents the first step towards universal nuclear markers in Haliotis.

  11. PrimerMapper: high throughput primer design and graphical assembly for PCR and SNP detection.

    PubMed

    O'Halloran, Damien M

    2016-01-01

    Primer design represents a widely employed gambit in diverse molecular applications including PCR, sequencing, and probe hybridization. Variations of PCR, including primer walking, allele-specific PCR, and nested PCR, provide specialized validation and detection protocols for molecular analyses that often require screening large numbers of DNA fragments. In these cases, automated sequence retrieval and processing become important features; furthermore, a graphic that gives the user a visual guide to the distribution of designed primers across targets is most helpful in quickly ascertaining primer coverage. To this end, I describe here PrimerMapper, which provides a comprehensive graphical user interface that designs robust primers from any number of inputted sequences while providing the user with both graphical maps of primer distribution for each inputted sequence and a global assembled map of all inputted sequences with designed primers. PrimerMapper also enables the visualization of graphical maps within a browser and allows the user to draw new primers directly onto the webpage. Other features of PrimerMapper include allele-specific design features for SNP genotyping, a remote BLAST window to NCBI databases, and remote sequence retrieval from GenBank and dbSNP. PrimerMapper is hosted at GitHub and freely available without restriction.

  12. Publishing SNP genotypes of human embryonic stem cell lines: policy statement of the International Stem Cell Forum Ethics Working Party.

    PubMed

    Knoppers, Bartha M; Isasi, Rosario; Benvenisty, Nissim; Kim, Ock-Joo; Lomax, Geoffrey; Morris, Clive; Murray, Thomas H; Lee, Eng Hin; Perry, Margery; Richardson, Genevra; Sipp, Douglas; Tanner, Klaus; Wahlström, Jan; de Wert, Guido; Zeng, Fanyi

    2011-09-01

    Novel methods and associated tools permitting individual identification in publicly accessible SNP databases have become a debatable issue. There is growing concern that current technical and ethical safeguards to protect the identities of donors could be insufficient. In the context of human embryonic stem cell research, there are no studies focusing on the probability that an hESC line donor could be identified by analyzing published SNP profiles and associated genotypic and phenotypic information. We present the International Stem Cell Forum (ISCF) Ethics Working Party's Policy Statement on "Publishing SNP Genotypes of Human Embryonic Stem Cell Lines (hESC)". The Statement prospectively addresses issues surrounding the publication of genotypic data and associated annotations of hESC lines in open access databases. It proposes a balanced approach between the goals of open science and data sharing with the respect for fundamental bioethical principles (autonomy, privacy, beneficence, justice and research merit and integrity).

  13. Assignment of SNP allelic configuration in polyploids using competitive allele-specific PCR: application to citrus triploid progeny

    PubMed Central

    Cuenca, José; Aleza, Pablo; Navarro, Luis; Ollitrault, Patrick

    2013-01-01

    Background: Polyploidy is a major component of eukaryote evolution. Estimation of allele copy numbers for molecular markers has long been considered a challenge for polyploid species, while this process is essential for most genetic research. With the increasing availability and whole-genome coverage of single nucleotide polymorphism (SNP) markers, it is essential to implement a versatile SNP genotyping method to assign allelic configuration efficiently in polyploids. Scope: This work evaluates the usefulness of the KASPar method, based on competitive allele-specific PCR, for the assignment of SNP allelic configuration. Citrus was chosen as a model because of its economic importance, the ongoing worldwide polyploidy manipulation projects for cultivar and rootstock breeding, and the increasing availability of SNP markers. Conclusions: Fifteen SNP markers were successfully designed that produced clear allele signals that were in agreement with previous genotyping results at the diploid level. The analysis of DNA mixes between two haploid lines (Clementine and pummelo) at 13 different ratios revealed a very high correlation (average = 0·9796; s.d. = 0·0094) between the allele ratio and two parameters [θ angle = tan−1 (y/x) and y′ = y/(x + y)] derived from the two normalized allele signals (x and y) provided by KASPar. Separated cluster analysis and analysis of variance (ANOVA) from mixed DNA simulating triploid and tetraploid hybrids provided 99·71 % correct allelic configuration. Moreover, triploid populations arising from 2n gametes and interploid crosses were easily genotyped and provided useful genetic information. This work demonstrates that the KASPar SNP genotyping technique is an efficient way to assign heterozygous allelic configurations within polyploid populations. This method is accurate, simple and cost-effective. Moreover, it may be useful for quantitative studies, such as relative allele-specific expression analysis and bulk segregant analysis.
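
    The two signal-derived parameters defined in the abstract, the θ angle and y′, are simple functions of the normalized allele signals x and y. A minimal sketch follows; the function names are illustrative, and atan2 is used in place of tan−1(y/x) only to avoid division by zero when x = 0 (the two agree for positive x).

```python
import math


def theta_angle(x, y):
    """Angle of the (x, y) signal point, in radians.
    Equals atan(y / x) for x > 0; also defined when x == 0."""
    return math.atan2(y, x)


def y_prime(x, y):
    """Fraction of the total normalized signal carried by allele y."""
    total = x + y
    if total == 0:
        raise ValueError("both allele signals are zero")
    return y / total


# Made-up normalized signals for one sample.
theta = theta_angle(1.0, 1.0)
fraction = y_prime(1.0, 3.0)
```

    Both quantities increase with the dose of the y allele, which is why they can separate allelic configurations such as xxy and xyy in a triploid once clusters are calibrated against known mixes.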

  14. Frequency of SNP -336A/G in the promoter region of CD209 in a population from northeastern Brazil.

    PubMed

    Costa, P N; Ferreira-Fernandes, H; de Oliveira, J S; Pereira, A C T C; Pinto, G R; Ferreira, G P

    2015-08-14

    Dendritic cells (DCs) mediate the initiation of the immune response against a variety of pathogens. The DC-SIGN receptor is encoded by the gene CD209 and is expressed on the surface of DCs. It binds to mannose-rich carbohydrates and enables the recognition of bacteria, fungi, parasites, and viruses. SNP -336A/G in the promoter region of CD209 influences the expression of the DC-SIGN receptor. Several studies have associated this SNP with an increased susceptibility to infectious diseases and the development of more severe forms of disease. Therefore, the aim of this study was to determine the prevalence of SNP -336A/G in a population from northeastern Brazil. We analyzed 181 individuals from the general population of Parnaíba, Piauí, Brazil, of which 37% were men and 63% were women. SNP -336A/G was detected by polymerase chain reaction and treatment with the restriction enzyme MscI and visualized by electrophoresis on an 8% polyacrylamide gel stained with silver nitrate. Of the individuals analyzed, 116 (64.1%) were homozygous AA, 57 (31.5%) were heterozygous (AG), and 8 (4.4%) were homozygous GG. The allele frequency of -336G was 20.2%. Genotype frequencies were in Hardy-Weinberg equilibrium. To the best of our knowledge, this is the first report to describe the frequency of the CD209 SNP -336A/G in a population in the State of Piauí. Further studies are needed to determine the relationship between this SNP and the vulnerability of this population to major infectious diseases.
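
    The reported allele frequency and the Hardy-Weinberg statement can be checked directly from the genotype counts given above (116 AA, 57 AG, 8 GG). The sketch below uses a one-degree-of-freedom chi-square goodness-of-fit test, which is an assumption: the abstract does not say which test the authors applied.

```python
def allele_freq_and_hwe(n_aa, n_ag, n_gg):
    """Return (frequency of G, chi-square statistic for HWE)."""
    n = n_aa + n_ag + n_gg
    q = (n_ag + 2 * n_gg) / (2 * n)      # frequency of allele G
    p = 1 - q                            # frequency of allele A
    observed = (n_aa, n_ag, n_gg)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return q, chi2


# Genotype counts reported in the abstract.
freq_g, chi2 = allele_freq_and_hwe(116, 57, 8)
```

    Here freq_g is 73/362, matching the reported 20.2%, and the statistic falls well below 3.84, the 5% critical value for one degree of freedom, which agrees with the statement that genotype frequencies were in Hardy-Weinberg equilibrium.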

  15. Comparison of whole-genome (13X) and capture (87X) resequencing methods for SNP and genotype callings.

    PubMed

    Roux, P F; Marthey, S; Djari, A; Moroldo, M; Esquerré, D; Estellé, J; Klopp, C; Lagarrigue, S; Demeure, O

    2015-02-01

    The number of polymorphisms identified with next-generation sequencing approaches depends directly on the sequencing depth and therefore on the experimental cost. Although higher levels of depth ensure more sensitive and more specific SNP calls, economic constraints limit the increase of depth for whole-genome resequencing (WGS). For this reason, capture resequencing is used for studies focusing on only some specific regions of the genome. However, several biases in capture resequencing are known to have a negative impact on the sensitivity of SNP detection. Within this framework, the aim of this study was to compare the accuracy of WGS and capture resequencing, which differ in terms of both sequencing depth and biases, for SNP detection and genotype calling. We evaluated the SNP calling and genotyping accuracy in a WGS dataset (13X) and in a capture resequencing dataset (87X) obtained from 11 individuals. The percentage of SNPs not identified due to a sevenfold decrease in sequencing depth was estimated at 7.8% using a down-sampling procedure on the capture sequencing dataset. A comparison of the 87X capture sequencing dataset with the WGS dataset revealed that capture-related biases led to the loss of 5.2% of the SNPs detected with WGS. Nevertheless, when considering the SNPs detected by both approaches, capture sequencing appears to achieve far better SNP genotyping, with about 4.4% of the WGS genotypes that can be considered erroneous, and as many as 10% when focusing on heterozygous genotypes. In conclusion, WGS and capture deep sequencing can be considered equivalent strategies for SNP detection, as the rate of SNPs not identified because of the low sequencing depth of the former is quite similar to the rate of SNPs missed because of the method biases of the latter. On the other hand, capture deep sequencing clearly appears better suited to studies requiring great accuracy in genotyping. PMID:25515399

  16. SNP Analysis and Whole Exome Sequencing: Their Application in the Analysis of a Consanguineous Pedigree Segregating Ataxia

    PubMed Central

    Nickerson, Sarah L.; Marquis-Nicholson, Renate; Claxton, Karen; Ashton, Fern; Leong, Ivone U. S.; Prosser, Debra O.; Love, Jennifer M.; George, Alice M.; Taylor, Graham; Wilson, Callum; McKinlay Gardner, R. J.; Love, Donald R.

    2015-01-01

    Autosomal recessive cerebellar ataxia encompasses a large and heterogeneous group of neurodegenerative disorders. We employed single nucleotide polymorphism (SNP) analysis and whole exome sequencing to investigate a consanguineous Maori pedigree segregating ataxia. We identified a novel mutation in exon 10 of the SACS gene: c.7962T>G p.(Tyr2654*), establishing the diagnosis of autosomal recessive spastic ataxia of Charlevoix-Saguenay (ARSACS). Our findings expand both the genetic and phenotypic spectrum of this rare disorder, and highlight the value of high-density SNP analysis and whole exome sequencing as powerful and cost-effective tools in the diagnosis of genetically heterogeneous disorders such as the hereditary ataxias.

  17. SNiPloid: A Utility to Exploit High-Throughput SNP Data Derived from RNA-Seq in Allopolyploid Species

    PubMed Central

    Peralta, Marine; Combes, Marie-Christine; Lashermes, Philippe; Dereeper, Alexis

    2013-01-01

    High-throughput sequencing is a common approach to discover SNP variants, especially in plant species. However, methods to analyze predicted SNPs are often optimized for diploid plant species whereas many crop species are allopolyploids and combine related but divergent subgenomes (homoeologous chromosome sets). We created a software tool, SNiPloid, that exploits and interprets putative SNPs in the context of allopolyploidy by comparing SNPs from an allopolyploid with those obtained in its modern-day diploid progenitors. SNiPloid can compare SNPs obtained from a sample to estimate the subgenome contribution to the transcriptome or SNPs obtained from two polyploid accessions to search for SNP divergence. PMID:24163691

  18. Candidate SNP Associations of Optimism and Resilience in Older Adults: Exploratory Study of 935 Community-Dwelling Adults

    PubMed Central

    Rana, Brinda K.; Darst, Burcu F.; Bloss, Cinnamon; Shih, Pei-an Betty; Depp, Colin; Nievergelt, Caroline M.; Allison, Matthew; Parsons, J. Kellogg; Schork, Nicholas; Jeste, Dilip V.

    2014-01-01

    Objective: Optimism and resilience promote health and well-being in older adults, and previous reports suggest that these traits are heritable. We examined the association of selected single-nucleotide polymorphisms (SNPs) with optimism and resilience in older adults. Design: Candidate gene association study that was a follow-on at the University of California, San Diego sites of two NIH-funded multi-site longitudinal investigations: Women's Health Initiative (WHI) and SELenium and vitamin E Cancer prevention Trial (SELECT). Participants: 426 women from WHI older than age 50, and 509 men older than age 55 (age 50 for African-American men) from SELECT. Measurements: 65 candidate gene SNPs that were judged by consensus, based on a literature review, as being related to predisposition to optimism and resilience, and 31 ancestry informative marker SNPs, genotyped from blood-based DNA samples and self-report scales for trait optimism, resilience, and depressive symptoms. Results: Using a Bonferroni threshold for significant association (p=0.00089), there were no significant associations for individual SNPs with optimism or resilience in single-locus analyses. Exploratory multi-locus polygenic analyses with a p-value of <.05 showed an association of optimism with SNPs in MAO-A, IL10, and FGG genes, and an association of resilience with a SNP in MAO-A gene. Conclusions: Correcting for Type I errors, there were no significant associations of optimism and resilience with specific gene SNPs in single-locus analyses. Positive psychological traits are likely to be genetically complex, with many loci having small effects contributing to phenotypic variation. Our exploratory multi-locus polygenic analyses suggest that larger sample sizes and complementary approaches involving methods such as sequence-based association studies, copy number variation analyses, and pathway-based analyses could be useful for better understanding the genetic basis of these positive psychological traits.

  19. Genome-wide characteristics of copy number variation in Polish Holstein and Polish Red cattle using SNP genotyping assay.

    PubMed

    Gurgul, A; Jasielczuk, I; Szmatoła, T; Pawlina, K; Ząbek, T; Żukowski, K; Bugno-Poniewierska, M

    2015-04-01

    Copy number variation (CNV), which results from deletions or amplifications of large fragments of genomic DNA, is widespread in mammalian genomes and, apart from its potential pathogenic effect, is considered a source of natural genetic diversity. In cattle populations, this kind of genetic variability remains insufficiently elucidated, and studies focusing on the detection of new structural genomic variants in different cattle populations may contribute to a better understanding of cattle breeds' diversity and the genetic basis of production traits. In this study, by using the BovineSNP50 assay and the cnvPartition algorithm, we identified CNVs in two different cattle breeds: Holstein (859 animals) and Polish Red (301). In Holstein cattle we found 648 CNVs which could be reduced to 91 non-redundant variable genomic regions (CNVRs) covering in total 168.6 Mb of the genomic sequence. In Polish Red cattle we detected 62 CNVs, localized in 37 variable regions encompassing 22.3 Mb of the sequence, corresponding to 0.89 % of the autosomal genome. Within the regions we identified 1,192 unique RefSeq genes which are engaged in a variety of biological processes. High concordance of the regions' distribution was found between the studied breeds; however, copy number variants seemed to be more common in Holstein cattle. About 26 % of the regions described in this study could be classified as newly identified. The results of this study will broaden the knowledge of CNVs in genomes of cattle of different breeds and will provide foundations for further research aiming to identify a relationship between this type of genetic variation and phenotypic traits.
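
    Reducing individual CNV calls to non-redundant regions (CNVRs) is commonly done by merging calls that overlap on the same chromosome. The abstract does not state the exact merging rule used, so the sketch below assumes simple overlap-based merging; the function names and coordinates are made up for illustration.

```python
def merge_cnvs(cnvs):
    """Merge overlapping (chromosome, start, end) calls into CNVRs."""
    regions = []
    for chrom, start, end in sorted(cnvs):
        if regions and regions[-1][0] == chrom and start <= regions[-1][2]:
            # Overlaps the current region: extend its end if needed.
            last_chrom, last_start, last_end = regions[-1]
            regions[-1] = (last_chrom, last_start, max(last_end, end))
        else:
            regions.append((chrom, start, end))
    return regions


def total_length(regions):
    """Total sequence covered by the merged regions."""
    return sum(end - start for _, start, end in regions)


# Hypothetical CNV calls pooled across animals.
calls = [("1", 100, 200), ("1", 150, 300), ("1", 400, 500), ("2", 100, 200)]
cnvrs = merge_cnvs(calls)
covered = total_length(cnvrs)
```

    Dividing the covered length by the length of the autosomal genome gives the kind of percentage quoted in the abstract for the Polish Red regions.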

  20. Characterization of genomic imbalances in diffuse large B-cell lymphoma by detailed SNP-chip analysis.

    PubMed

    Scholtysik, René; Kreuz, Markus; Hummel, Michael; Rosolowski, Maciej; Szczepanowski, Monika; Klapper, Wolfram; Loeffler, Markus; Trümper, Lorenz; Siebert, Reiner; Küppers, Ralf

    2015-03-01

    The pathogenesis of diffuse large B-cell lymphomas (DLBCL) is only partly understood. We analyzed 148 DLBCL by single nucleotide polymorphism (SNP)-chips to characterize genomic imbalances. Seventy-nine cases were of the germinal center B-cell like (GCB) type of DLBCL, 49 of the activated B-cell like (ABC) subtype and 20 were unclassified DLBCL. Twenty-four regions of recurrent genomic gains and 38 regions of recurrent genomic losses were identified over the whole cohort, with a median of 25 imbalances per case for ABC-DLBCL and 19 per case for GCB-DLBCL. Several recurrent copy number changes showed differential frequencies in the GCB- and ABC-DLBCL subgroups, including gains of HDAC7A predominantly in GCB-DLBCL (38% of cases) and losses of BACH2 and CASP8AP2 predominantly in ABC-DLBCL (35%), hinting at disparate pathogenetic mechanisms in these entities. Correlating gene expression and copy number revealed a strong gene dosage effect in all tumors, with 34% of probesets showing a concordant expression change in affected regions. Two new potential tumor suppressor genes emerging from the analysis, CASP3 and IL5RA, were sequenced in ten and 16 candidate cases, respectively. However, no mutations were found, pointing to a potential haploinsufficiency effect of these genes, considering their reduced expression in cases with deletions. Our study thus describes differences and similarities in the landscape of genomic aberrations in the DLBCL subgroups in a large collection of cases, confirming already known targets, but also discovering novel copy number changes with possible pathogenetic relevance.

  1. RNA-Seq-Mediated Transcriptome Analysis of a Fiberless Mutant Cotton and Its Possible Origin Based on SNP Markers.

    PubMed

    Ma, Qifeng; Wu, Man; Pei, Wenfeng; Wang, Xiaoyan; Zhai, Honghong; Wang, Wenkui; Li, Xingli; Zhang, Jinfa; Yu, Jiwen; Yu, Shuxun

    2016-01-01

    As the longest known single-celled trichomes, cotton (Gossypium L.) fibers constitute a classic model system to investigate cell initiation and elongation. In this study, we used a high-throughput transcriptome sequencing technology to identify fiber-initiation-related single nucleotide polymorphism (SNP) markers and differentially expressed genes (DEGs) between the wild-type (WT) Upland cotton (G. hirsutum) Xuzhou 142 and its natural fuzzless-lintless mutant Xuzhou 142 fl. Approximately 700 million high-quality cDNA reads representing over 58 Gb of sequences were obtained, resulting in the identification of 28,610 SNPs--of which 17,479 were novel--from 13,960 expressed genes. Of these, 50% of the SNPs in fl were identical to those of G. barbadense, which suggests the likely origin of the fl mutant from an interspecific hybridization between Xuzhou 142 and an unknown G. barbadense genotype. Of all detected SNPs, 15,555, 12,750, and 305 were classified as non-synonymous, synonymous, and pre-terminated, respectively. Moreover, 1,352 insertion/deletion polymorphisms (InDels) were also detected. A total of 865 DEGs were identified between the WT and fl in ovules at -3 and 0 days post-anthesis, with 302 candidate SNPs selected from these DEGs for validation by high-resolution melting analysis and Sanger sequencing in seven cotton genotypes. The number of genotypic pairwise polymorphisms varied from 43 to 302, indicating that the identified SNPs are reliable. These SNPs should serve as good resources for breeding and genetic studies in cotton. PMID:26990639

  2. SNP Regulation of microRNA Expression and Subsequent Colon Cancer Risk

    PubMed Central

    Mullany, Lila E.; Wolff, Roger K.; Herrick, Jennifer S.; Buas, Matthew F.; Slattery, Martha L.

    2015-01-01

    Introduction: MicroRNAs (miRNAs) regulate messenger RNAs (mRNAs) and as such have been implicated in a variety of diseases, including cancer. MiRNAs regulate mRNAs through binding of the miRNA 5’ seed sequence (~7–8 nucleotides) to the mRNA 3’ UTRs; polymorphisms in these regions have the potential to alter miRNA-mRNA target associations. SNPs in miRNA genes as well as miRNA-target genes have been proposed to influence cancer risk through altered miRNA expression levels. Methods: MiRNA-SNPs and miRNA-target gene-SNPs were identified through the literature. We used SNPs from Genome-Wide Association Study (GWAS) data that were matched to individuals with miRNA expression data generated from an Agilent platform for colon tumor and non-tumor paired tissues. These samples were used to evaluate 327 miRNA-SNP pairs for associations between SNPs and miRNA expression levels as well as for SNP associations with colon cancer. Results: Twenty-two miRNAs expressed in non-tumor tissue were significantly different by genotype and 21 SNPs were associated with altered tumor/non-tumor differential miRNA expression across genotypes. Two miRNAs were associated with SNP genotype for both non-tumor and tumor/non-tumor differential expression. Of the 41 miRNAs significantly associated with SNPs, all but seven were significantly differentially expressed in colon tumor tissue. Two of the 41 SNPs significantly associated with miRNA expression levels were associated with colon cancer risk: rs8176318 (BRCA1), OR_AA 1.31 (95% CI 1.01, 1.78), and rs8905 (PRKAR1A), OR_GG 2.31 (95% CI 1.11, 4.77). Conclusion: Of the 327 SNPs identified in the literature as being important because of their potential regulation of miRNA expression levels, 12.5% had statistically significant associations with miRNA expression. However, only two of these SNPs were significantly associated with colon cancer. PMID:26630397

  3. Genes of the RNASE5 pathway contain SNP associated with milk production traits in dairy cattle

    PubMed Central

    2013-01-01

    Background Identification of the processes and mutations responsible for the large genetic variation in milk production among dairy cattle has proved challenging. One approach is to identify a biological process potentially involved in milk production and to determine the genetic influence of all the genes included in the process or pathway. Angiogenin encoded by angiogenin, ribonuclease, RNase A family 5 (RNASE5) is relatively abundant in milk, and has been shown to regulate protein synthesis and act as a growth factor in epithelial cells in vitro. However, little is known about the role of angiogenin in the mammary gland or if the polymorphisms present in the bovine RNASE5 gene are associated with lactation and milk production traits in dairy cattle. Given the high economic value of increased protein in milk, we have tested the hypothesis that RNASE5 or genes in the RNASE5 pathway are associated with milk production traits. First, we constructed a “RNASE5 pathway” based on upstream and downstream interacting genes reported in the literature. We then tested SNP in close proximity to the genes of this pathway for association with milk production traits in a large dairy cattle dataset. Results The constructed RNASE5 pathway consisted of 11 genes. Association analysis between SNP in 1 Mb regions surrounding these genes and milk production traits revealed that more SNP than expected by chance were associated with milk protein percent (P < 0.05 significance). There was no significant association with other traits such as milk fat content or fertility. Conclusions These results support a role for the RNASE5 pathway in milk production, specifically milk protein percent, and indicate that polymorphisms in or near these genes explain a proportion of the variation for this trait. This method provides a novel way of understanding the underlying biology of lactation with implications for milk production and can be applied to any pathway or gene set to test whether

  4. Genome-Wide SNP Discovery from Transcriptome of Four Common Carp Strains

    PubMed Central

    Xu, Jian; Ji, Peifeng; Zhao, Zixia; Zhang, Yan; Feng, Jianxin; Wang, Jian; Li, Jiongtang; Zhang, Xiaofeng; Zhao, Lan; Liu, Guangzan; Xu, Peng; Sun, Xiaowen

    2012-01-01

    Background Single nucleotide polymorphisms (SNPs) have been used as genetic markers for genome-wide association studies in many species. Gene-associated SNPs could offer sufficient coverage in trait related research and furthermore could themselves be causative SNPs for traits. Common carp (Cyprinus carpio) is one of the most important aquaculture species in the world, accounting for nearly 14% of freshwater aquaculture production. There are various strains of common carp with different economic traits; however, the genetic mechanisms underlying the different traits have not been elucidated yet. In this project, we identified a large number of gene-associated SNPs from four strains of common carp using next-generation sequencing. Results Transcriptome sequencing of four strains of common carp (mirror carp, purse red carp, Xingguo red carp, Yellow River carp) was performed with Solexa HiSeq2000 platform. De novo assembled transcriptome was used as reference for alignments, and SNP calling was done through BWA and SAMtools. A total of 712,042 Intra-strain SNPs were discovered in four strains, of which 483,276 SNPs for mirror carp, 486,629 SNPs for purse red carp, 478,028 SNPs for Xingguo red carp and 488,281 SNPs for Yellow River carp were discovered, respectively. Besides, 53,893 inter-SNPs were identified. Strain-specific SNPs of four strains were 53,938, 53,866, 48,701, 40,131 in mirror carp, purse red carp, Xingguo red carp and Yellow River carp, respectively. GO and KEGG pathway analysis were done to reveal strain-specific genes affected by strain-specific non-synonymous SNPs. Validation of selected SNPs revealed that 48% of SNPs (12 of 25) were tested to be true SNPs. Conclusions Transcriptome analysis of common carp using RNA-Seq is a cost-effective way of generating numerous reads for SNP discovery. After validation of identified SNPs, these data will provide a solid base for SNP array designing and genome-wide association studies. PMID:23110192

  5. Genome wide SNP discovery in flax through next generation sequencing of reduced representation libraries

    PubMed Central

    2012-01-01

    Background Flax (Linum usitatissimum L.) is a significant fibre and oilseed crop. Current flax molecular markers, including isozymes, RAPDs, AFLPs and SSRs are of limited use in the construction of high density linkage maps and for association mapping applications due to factors such as low reproducibility, intense labour requirements and/or limited numbers. We report here on the use of a reduced representation library strategy combined with next generation Illumina sequencing for rapid and large scale discovery of SNPs in eight flax genotypes. SNP discovery was performed through in silico analysis of the sequencing data against the whole genome shotgun sequence assembly of flax genotype CDC Bethune. Genotyping-by-sequencing of an F6-derived recombinant inbred line population provided validation of the SNPs. Results Reduced representation libraries of eight flax genotypes were sequenced on the Illumina sequencing platform resulting in sequence coverage ranging from 4.33 to 15.64X (genome equivalents). Depending on the relatedness of the genotypes and the number and length of the reads, between 78% and 93% of the reads mapped onto the CDC Bethune whole genome shotgun sequence assembly. A total of 55,465 SNPs were discovered with the largest number of SNPs belonging to the genotypes with the highest mapping coverage percentage. Approximately 84% of the SNPs discovered were identified in a single genotype, 13% were shared between any two genotypes and the remaining 3% in three or more. Nearly a quarter of the SNPs were found in genic regions. A total of 4,706 out of 4,863 SNPs discovered in Macbeth were validated using genotyping-by-sequencing of 96 F6 individuals from a recombinant inbred line population derived from a cross between CDC Bethune and Macbeth, corresponding to a validation rate of 96.8%. Conclusions Next generation sequencing of reduced representation libraries was successfully implemented for genome-wide SNP discovery from flax. The genotyping

  6. Global Phylogeny of Mycobacterium tuberculosis Based on Single Nucleotide Polymorphism (SNP) Analysis: Insights into Tuberculosis Evolution, Phylogenetic Accuracy of Other DNA Fingerprinting Systems, and Recommendations for a Minimal Standard SNP Set†

    PubMed Central

    Filliol, Ingrid; Motiwala, Alifiya S.; Cavatore, Magali; Qi, Weihong; Hazbón, Manzour Hernando; Bobadilla del Valle, Miriam; Fyfe, Janet; García-García, Lourdes; Rastogi, Nalin; Sola, Christophe; Zozio, Thierry; Guerrero, Marta Inírida; León, Clara Inés; Crabtree, Jonathan; Angiuoli, Sam; Eisenach, Kathleen D.; Durmaz, Riza; Joloba, Moses L.; Rendón, Adrian; Sifuentes-Osornio, José; Ponce de León, Alfredo; Cave, M. Donald; Fleischmann, Robert; Whittam, Thomas S.; Alland, David

    2006-01-01

    We analyzed a global collection of Mycobacterium tuberculosis strains using 212 single nucleotide polymorphism (SNP) markers. SNP nucleotide diversity was high (average across all SNPs, 0.19), and 96% of the SNP locus pairs were in complete linkage disequilibrium. Cluster analyses identified six deeply branching, phylogenetically distinct SNP cluster groups (SCGs) and five subgroups. The SCGs were strongly associated with the geographical origin of the M. tuberculosis samples and the birthplace of the human hosts. The most ancestral cluster (SCG-1) predominated in patients from the Indian subcontinent, while SCG-1 and another ancestral cluster (SCG-2) predominated in patients from East Asia, suggesting that M. tuberculosis first arose in the Indian subcontinent and spread worldwide through East Asia. Restricted SCG diversity and the prevalence of less ancestral SCGs in indigenous populations in Uganda and Mexico suggested a more recent introduction of M. tuberculosis into these regions. The East African Indian and Beijing spoligotypes were concordant with SCG-1 and SCG-2, respectively; X and Central Asian spoligotypes were also associated with one SCG or subgroup combination. Other clades had less consistent associations with SCGs. Mycobacterial interspersed repetitive unit (MIRU) analysis provided less robust phylogenetic information, and only 6 of the 12 MIRU microsatellite loci were highly differentiated between SCGs as measured by GST. Finally, an algorithm was devised to identify two minimal sets of either 45 or 6 SNPs that could be used in future investigations to enable global collaborations for studies on evolution, strain differentiation, and biological differences of M. tuberculosis. PMID:16385065

  7. The human lactase persistence-associated SNP -13910*T enables in vivo functional persistence of lactase promoter-reporter transgene expression.

    PubMed

    Fang, Lin; Ahn, Jong Kun; Wodziak, Dariusz; Sibley, Eric

    2012-07-01

    Lactase is the intestinal enzyme responsible for digestion of the milk sugar lactose. Lactase gene expression declines dramatically upon weaning in mammals and during early childhood in humans (lactase nonpersistence). In various ethnic groups, however, lactase persists in high levels throughout adulthood (lactase persistence). Genetic association studies have identified that lactase persistence in northern Europeans is strongly associated with a single nucleotide polymorphism (SNP) located 14 kb upstream of the lactase gene: -13910*C/T. To determine whether the -13910*T SNP can function in vivo to mediate lactase persistence, we generated transgenic mice harboring human DNA fragments with the -13910*T SNP or the ancestral -13910*C SNP cloned upstream of a 2-kb rat lactase gene promoter in a luciferase reporter construct. We previously reported that the 2-kb rat lactase promoter directs a post-weaning decline of luciferase transgene expression similar to that of the endogenous lactase gene. In the present study, the post-weaning decline directed by the rat lactase promoter is impeded by addition of the -13910*T SNP human DNA fragment, but not by addition of the -13910*C ancestral SNP fragment. Persistence of transgene expression associated with the -13910*T SNP represents the first in vivo data in support of a functional role for the -13910*T SNP in mediating the human lactase persistence phenotype. PMID:22258180

  8. A high-density SNP map of sunflower derived from RAD-sequencing facilitating fine-mapping of the rust resistance gene R12

    Technology Transfer Automated Retrieval System (TEKTRAN)

    A high-resolution genetic map of sunflower was constructed by integrating SNP data from three F2 mapping populations (HA 89/ RHA 464, B-line/ RHA 464, and CR 29/ RHA 468). The consensus map spanned a total length of 1443.84 cM, and consisted of 5,019 SNP markers derived from RAD tag sequencing and 1...

  10. Olive oil DNA fingerprinting by multiplex SNP genotyping on fluorescent microspheres.

    PubMed

    Kalogianni, Despina P; Bazakos, Christos; Boutsika, Lemonia M; Targem, Mehdi Ben; Christopoulos, Theodore K; Kalaitzis, Panagiotis; Ioannou, Penelope C

    2015-04-01

    Olive oil cultivar verification is of primary importance for the competitiveness of the product and the protection of consumers and producers from fraudulence. Single-nucleotide polymorphisms (SNPs) have emerged as excellent DNA markers for authenticity testing. This paper reports the first multiplex SNP genotyping assay for olive oil cultivar identification that is performed on a suspension of fluorescence-encoded microspheres. Up to 100 sets of microspheres, with unique "fluorescence signatures", are available. Allele discrimination was accomplished by primer extension reaction. The reaction products were captured via hybridization on the microspheres and analyzed, within seconds, by a flow cytometer. The "fluorescence signature" of each microsphere is assigned to a specific allele, whereas the signal from a reporter fluorophore denotes the presence of the allele. As a model, a panel of three SNPs was chosen that enabled identification of five common Greek olive cultivars (Adramytini, Chondrolia Chalkidikis, Kalamon, Koroneiki, and Valanolia).

  11. To Cheat or Not To Cheat: Tryptophan Hydroxylase 2 SNP Variants Contribute to Dishonest Behavior

    PubMed Central

    Shen, Qiang; Teo, Meijun; Winter, Eyal; Hart, Einav; Chew, Soo H.; Ebstein, Richard P.

    2016-01-01

    Although lying (bearing false witness) is explicitly prohibited in the Decalogue and a focus of interest in philosophy and theology, more recently the behavioral and neural mechanisms of deception are gaining increasing attention from diverse fields, especially economics, psychology, and neuroscience. Despite the considerable role of heredity in explaining individual differences in deceptive behavior, few studies have investigated which specific genes contribute to the heterogeneity of lying behavior across individuals. Also, little is known concerning which specific neurotransmitter pathways underlie deception. Toward addressing these two key questions, we implemented a neurogenetic strategy and modeled deception by an incentivized die-under-cup task in a laboratory setting. The results of this exploratory study provide provisional evidence that SNP variants across the tryptophan hydroxylase 2 (TPH2) gene, which encodes the rate-limiting enzyme in the biosynthesis of brain serotonin, contribute to individual differences in deceptive behavior. PMID:27199691

  12. Design and synthesis of the superionic conductor Na10SnP2S12

    NASA Astrophysics Data System (ADS)

    Richards, William D.; Tsujimura, Tomoyuki; Miara, Lincoln J.; Wang, Yan; Kim, Jae Chul; Ong, Shyue Ping; Uechi, Ichiro; Suzuki, Naoki; Ceder, Gerbrand

    2016-03-01

    Sodium-ion batteries are emerging as candidates for large-scale energy storage due to their low cost and the wide variety of cathode materials available. As battery size and adoption in critical applications increases, safety concerns are resurfacing due to the inherent flammability of organic electrolytes currently in use in both lithium and sodium battery chemistries. Development of solid-state batteries with ionic electrolytes eliminates this concern, while also allowing novel device architectures and potentially improving cycle life. Here we report the computation-assisted discovery and synthesis of a high-performance solid-state electrolyte material: Na10SnP2S12, with room temperature ionic conductivity of 0.4 mS cm⁻¹, rivalling the conductivity of the best sodium sulfide solid electrolytes to date. We also computationally investigate the variants of this compound where tin is substituted by germanium or silicon and find that the latter may achieve even higher conductivity.

  13. SNP in starch biosynthesis genes associated with nutritional and functional properties of rice

    PubMed Central

    Kharabian-Masouleh, Ardashir; Waters, Daniel L. E.; Reinke, Russell F.; Ward, Rachelle; Henry, Robert J.

    2012-01-01

    Starch is a major component of human diets. The relative contribution of variation in the genes of starch biosynthesis to the nutritional and functional properties of the rice was evaluated in a rice breeding population. Sequencing 18 genes involved in starch synthesis in a population of 233 rice breeding lines discovered 66 functional SNPs in exonic regions. Five genes, AGPS2b, Isoamylase1, SPHOL, SSIIb and SSIVb showed no polymorphism. Association analysis found 31 of the SNP were associated with differences in pasting and cooking quality properties of the rice lines. Two genes appear to be the major loci controlling traits under human selection in rice, GBSSI (waxy gene) and SSIIa. GBSSI influenced amylose content and retrogradation. Other genes contributing to retrogradation were GPT1, SSI, BEI and SSIIIa. SSIIa explained much of the variation in cooking characteristics. Other genes had relatively small effects. PMID:22870386

  14. Individual Genome of the Russian Male: SNP Calling and a de novo Assembly of Unmapped Reads.

    PubMed

    Chekanov, N N; Boulygina, E S; Beletskiy, A V; Prokhortchouk, E B; Skryabin, K G

    2010-07-01

    A somatic cell genome was recently resequenced for a patient with renal cancer. The data were submitted to the NCBI Sequence Read Archive under the accession number SRA012240. Here, we have performed SNP calling for the genome and compared it with several published genomes. We have found 2,921,724 SNPs, including 1,472,679 newly described ones. Among them, 63,462 SNPs have been mapped to the Y chromosome and, based on 18 markers, the genome has been ascribed to the R1a1a haplogroup predominant in Russian males. The mitochondrial haplogroup has been determined as U5a, which is also common in the European part of Russia. Short reads unmapped to the human genome were used for the de novo assembly of DNA sequences. This resulted in genome-specific contigs (more than 100 bp in length) with an overall length of 154 kbp (for GAII) and 4.7 kbp (for SOLiD).

  15. Use of SNP-arrays for ChIP assays: computational aspects.

    PubMed

    Muro, Enrique M; McCann, Jennifer A; Rudnicki, Michael A; Andrade-Navarro, Miguel A

    2009-01-01

    The simultaneous genotyping of thousands of single nucleotide polymorphisms (SNPs) in a genome using SNP-Arrays is a very important tool that is revolutionizing genetics and molecular biology. We expanded the utility of this technique by using it following chromatin immunoprecipitation (ChIP) to assess the multiple genomic locations protected by a protein complex recognized by an antibody. The power of this technique is illustrated through an analysis of the changes in histone H4 acetylation, a marker of open chromatin and transcriptionally active genomic regions, which occur during differentiation of human myoblasts into myotubes. The findings have been validated by the observation of a significant correlation between the detected histone modifications and the expression of the nearby genes, as measured by DNA expression microarrays. This chapter focuses on the computational analysis of the data.

  16. PPLine: An Automated Pipeline for SNP, SAP, and Splice Variant Detection in the Context of Proteogenomics.

    PubMed

    Krasnov, George Sergeevich; Dmitriev, Alexey Alexandrovich; Kudryavtseva, Anna Viktorovna; Shargunov, Alexander Valerievich; Karpov, Dmitry Sergeevich; Uroshlev, Leonid Andreevich; Melnikova, Natalya Vladimirovna; Blinov, Vladimir Mikhailovich; Poverennaya, Ekaterina Vladimirovna; Archakov, Alexander Ivanovich; Lisitsa, Andrey Valerievich; Ponomarenko, Elena Alexandrovna

    2015-09-01

    The fundamental mission of the Chromosome-Centric Human Proteome Project (C-HPP) is the research of human proteome diversity, including rare variants. Liver tissues, HepG2 cells, and plasma were selected as one of the major objects for C-HPP studies. The proteogenomic approach, a recently introduced technique, is a powerful method for predicting and validating proteoforms coming from alternative splicing, mutations, and transcript editing. We developed PPLine, a Python-based proteogenomic pipeline providing automated single-amino-acid polymorphism (SAP), indel, and alternative-spliced-variants discovery based on raw transcriptome and exome sequence data, single-nucleotide polymorphism (SNP) annotation and filtration, and the prediction of proteotypic peptides (available at https://sourceforge.net/projects/ppline). In this work, we performed deep transcriptome sequencing of HepG2 cells and liver tissues using two platforms: Illumina HiSeq and Applied Biosystems SOLiD. Using PPLine, we revealed 7756 SAP and indels for HepG2 cells and liver (including 659 variants nonannotated in dbSNP). We found 17 indels in transcripts associated with the translation of alternate reading frames (ARF) longer than 300 bp. The ARF products of two genes, SLMO1 and TMEM8A, demonstrate signatures of caspase-binding domain and Gcn5-related N-acetyltransferase. Alternative splicing analysis predicted novel proteoforms encoded by 203 (liver) and 475 (HepG2) genes according to both Illumina and SOLiD data. The results of the present work represent a basis for subsequent proteomic studies by the C-HPP consortium. PMID:26147802

  17. RNA-Seq Identifies SNP Markers for Growth Traits in Rainbow Trout

    PubMed Central

    Salem, Mohamed; Vallejo, Roger L.; Leeds, Timothy D.; Palti, Yniv; Liu, Sixin; Sabbagh, Annas; Rexroad, Caird E.; Yao, Jianbo

    2012-01-01

    Fast growth is an important and highly desired trait, which affects the profitability of food animal production, with feed costs accounting for the largest proportion of production costs. Traditional phenotype-based selection is typically used to select for growth traits; however, genetic improvement is slow over generations. Single nucleotide polymorphisms (SNPs) explain 90% of the genetic differences between individuals; therefore, they are most suitable for genetic evaluation and strategies that employ molecular genetics for selective breeding. SNPs found within or near a coding sequence are of particular interest because they are more likely to alter the biological function of a protein. We aimed to use SNPs to identify markers and genes associated with genetic variation in growth. RNA-Seq whole-transcriptome analysis of pooled cDNA samples from a population of rainbow trout selected for improved growth versus unselected genetic cohorts (10 fish from 1 full-sib family each) identified SNP markers associated with growth-rate. The allelic imbalances (the ratio between the allele frequencies of the fast growing sample and that of the slow growing sample) were considered at scores >5.0 as an amplification and <0.2 as loss of heterozygosity. A subset of SNPs (n = 54) were validated and evaluated for association with growth traits in 778 individuals of a three-generation parent/offspring panel representing 40 families. Twenty-two SNP markers and one mitochondrial haplotype were significantly associated with growth traits. Polymorphism of 48 of the markers was confirmed in other commercially important aquaculture stocks. Many markers were clustered into genes of metabolic energy production pathways and are suitable candidates for genetic selection. The study demonstrates that RNA-Seq at low sequence coverage of divergent populations is a fast and effective means of identifying SNPs, with allelic imbalances between phenotypes. This technique is suitable for marker
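
    The allelic-imbalance score described in this abstract (the ratio of an allele's frequency in the fast-growing sample to its frequency in the slow-growing sample, called at >5.0 and <0.2) can be sketched as follows. This is a minimal illustration under stated assumptions: the function names and the example frequencies are hypothetical, and the handling of a zero denominator is our own choice, not something the abstract specifies.

```python
import math


def allelic_imbalance(freq_fast, freq_slow):
    """Ratio of allele frequency in the fast-growing pool to the slow-growing pool."""
    if freq_slow == 0:
        # Assumption: an allele absent from the slow pool but present in the
        # fast pool scores as infinitely imbalanced; 0/0 is left undefined.
        return math.inf if freq_fast > 0 else math.nan
    return freq_fast / freq_slow


def classify(score, high=5.0, low=0.2):
    """Apply the two thresholds quoted in the abstract."""
    if score > high:
        return "amplification"
    if score < low:
        return "loss of heterozygosity"
    return "no call"


# Hypothetical allele frequencies for three SNPs: (fast pool, slow pool).
examples = {"snp_a": (0.90, 0.15), "snp_b": (0.05, 0.50), "snp_c": (0.40, 0.50)}
for name, (fast, slow) in examples.items():
    print(name, classify(allelic_imbalance(fast, slow)))
```

    The two thresholds are reciprocals of each other (5.0 and 1/5), so the call is symmetric with respect to which pool is placed in the numerator.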

  18. Identifying Litchi (Litchi chinensis Sonn.) Cultivars and Their Genetic Relationships Using Single Nucleotide Polymorphism (SNP) Markers

    PubMed Central

    Liu, Wei; Xiao, Zhidan; Bao, Xiuli; Yang, Xiaoyan; Fang, Jing; Xiang, Xu

    2015-01-01

    Litchi is an important fruit tree in tropical and subtropical areas of the world. However, there is widespread confusion regarding litchi cultivar nomenclature, and detailed information on genetic relationships among litchi germplasm is unclear. In the present study, the potential of single nucleotide polymorphism (SNP) for the identification of 96 representative litchi accessions and their genetic relationships in China was evaluated using 155 SNPs that were evenly spaced across the litchi genome. Ninety SNPs with minor allele frequencies above 0.05 and a good genotyping success rate were used for further analysis. A relatively high level of genetic variation was observed among litchi accessions, as quantified by the expected heterozygosity (He = 0.305). The SNP based multilocus matching identified two synonymous groups, ‘Heiye’ and ‘Wuye’, and ‘Chengtuo’ and ‘Baitangli 1’. A subset of 14 SNPs was sufficient to distinguish all the non-redundant litchi genotypes, and these SNPs were proven to be highly stable by repeated analyses of a selected group of cultivars. Unweighted pair-group method of arithmetic averages (UPGMA) cluster analysis divided the litchi accessions analyzed into four main groups, which corresponded to the traits of extremely early-maturing, early-maturing, middle-maturing, and late-maturing, indicating that the fruit maturation period should be considered as the primary criterion for litchi taxonomy. Two subpopulations were detected among litchi accessions by STRUCTURE analysis, and accessions with extremely early- and late-maturing traits showed membership coefficients above 0.99 for Cluster 1 and Cluster 2, respectively. Accessions with early- and middle-maturing traits were identified as admixture forms with varying levels of membership shared between the two clusters, indicating their hybrid origin during litchi domestication. The results of this study will benefit litchi germplasm conservation programs and facilitate maximum
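
    The two summary statistics used in this abstract, minor allele frequency (with the 0.05 filter) and expected heterozygosity, can be sketched for a single locus as below. This is an illustrative sketch only: the genotype encoding, function names and example data are hypothetical, and the abstract does not say which estimator of He was used (the code uses the simple form, one minus the sum of squared allele frequencies, without a small-sample correction).

```python
def allele_frequencies(genotypes):
    """Allele frequencies from diploid genotypes given as strings such as 'AG'."""
    counts = {}
    for genotype in genotypes:
        for allele in genotype:
            counts[allele] = counts.get(allele, 0) + 1
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}


def minor_allele_frequency(freqs):
    """Frequency of the rarer allele; 0.0 for a monomorphic locus."""
    return min(freqs.values()) if len(freqs) > 1 else 0.0


def expected_heterozygosity(freqs):
    """He = 1 - sum(p_i^2) over the alleles at one locus."""
    return 1.0 - sum(p * p for p in freqs.values())


# Hypothetical genotypes of four accessions at one SNP.
freqs = allele_frequencies(["AA", "AA", "AA", "AG"])
passes_filter = minor_allele_frequency(freqs) > 0.05
print(passes_filter, expected_heterozygosity(freqs))
```

    A panel-level value such as the one quoted in the abstract would then be a summary of the per-locus values over the retained SNPs, for example their mean.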

  19. Transcriptome sequencing for high throughput SNP development and genetic mapping in Pea

    PubMed Central

    2014-01-01

    Background Pea has a complex genome of 4.3 Gb for which only limited genomic resources are available to date. Although SNP markers are now highly valuable for research and modern breeding, only a few are described and used in pea for genetic diversity and linkage analysis. Results We developed a large resource by cDNA sequencing of 8 genotypes representative of modern breeding material using the Roche 454 technology, combining both long reads (400 bp) and high coverage (3.8 million reads, reaching a total of 1,369 megabases). Sequencing data were assembled and generated a 68 K unigene set, from which 41 K were annotated from their best blast hit against the model species Medicago truncatula. Annotated contigs showed an even distribution along M. truncatula pseudochromosomes, suggesting a good representation of the pea genome. 10 K pea contigs were found to be polymorphic among the genetic material surveyed, corresponding to 35 K SNPs. We validated a subset of 1538 SNPs through the GoldenGate assay, proving their ability to structure a diversity panel of breeding germplasm. Among them, 1340 were genetically mapped and used to build a new consensus map comprising a total of 2070 markers. Based on blast analysis, we could establish 1252 bridges between our pea consensus map and the pseudochromosomes of M. truncatula, which provides new insight on synteny between the two species. Conclusions Our approach created significant new resources in pea, i.e. the most comprehensive genetic map to date tightly linked to the model species M. truncatula and a large SNP resource for both academic research and breeding. PMID:24521263

  20. SNP microarray-based 24 chromosome aneuploidy screening is significantly more consistent than FISH

    PubMed Central

    Treff, Nathan R.; Levy, Brynn; Su, Jing; Northrop, Lesley E.; Tao, Xin; Scott, Richard T.

    2010-01-01

    Many studies estimate that chromosomal mosaicism within the cleavage-stage human embryo is high. However, comparison of two unique methods of aneuploidy screening of blastomeres within the same embryo has not been conducted and may indicate whether mosaicism has been overestimated due to technical inconsistency rather than the biological phenomena. The present study investigates the prevalence of chromosomal abnormality and mosaicism found with two different single cell aneuploidy screening techniques. Thirteen arrested cleavage-stage embryos were studied. Each was biopsied into individual cells (n = 160). The cells from each embryo were randomized into two groups. Those destined for FISH-based aneuploidy screening (n = 75) were fixed, one cell per slide. Cells for SNP microarray-based aneuploidy screening (n = 85) were put into individual tubes. Microarray was significantly more reliable (96%) than FISH (83%) for providing an interpretable result (P = 0.004). Markedly different results were obtained when comparing microarray and FISH results from individual embryos. Mosaicism was significantly less commonly observed by microarray (31%) than by FISH (100%) (P = 0.0005). Although FISH evaluated fewer chromosomes per cell and fewer cells per embryo, FISH still displayed significantly more unique genetic diagnoses per embryo (3.2 ± 0.2) than microarray (1.3 ± 0.2) (P < 0.0001). This is the first prospective, randomized, blinded and paired comparison between microarray and FISH-based aneuploidy screening. SNP microarray-based 24 chromosome aneuploidy screening provides more complete and consistent results than FISH. These results also suggest that FISH technology may overestimate the contribution of mitotic error to the origin of aneuploidy at the cleavage stage of human embryogenesis. PMID:20484246

  1. JAM: A Scalable Bayesian Framework for Joint Analysis of Marginal SNP Effects

    PubMed Central

    Conti, David V.; Richardson, Sylvia

    2016-01-01

    ABSTRACT Recently, large scale genome‐wide association study (GWAS) meta‐analyses have boosted the number of known signals for some traits into the tens and hundreds. Typically, however, variants are only analysed one‐at‐a‐time. This complicates the ability of fine‐mapping to identify a small set of SNPs for further functional follow‐up. We describe a new and scalable algorithm, joint analysis of marginal summary statistics (JAM), for the re‐analysis of published marginal summary statistics under joint multi‐SNP models. The correlation is accounted for according to estimates from a reference dataset, and models and SNPs that best explain the complete joint pattern of marginal effects are highlighted via an integrated Bayesian penalized regression framework. We provide both enumerated and Reversible Jump MCMC implementations of JAM and present some comparisons of performance. In a series of realistic simulation studies, JAM demonstrated identical performance to various alternatives designed for single region settings. In multi‐region settings, where the only multivariate alternative involves stepwise selection, JAM offered greater power and specificity. We also present an application to real published results from MAGIC (meta‐analysis of glucose and insulin related traits consortium) – a GWAS meta‐analysis of more than 15,000 people. We re‐analysed several genomic regions that produced multiple significant signals with glucose levels 2 hr after oral stimulation. Through joint multivariate modelling, JAM was able to formally rule out many SNPs, and for one gene, ADCY5, suggests that an additional SNP, which transpired to be more biologically plausible, should be followed up with equal priority to the reported index. PMID:27027514
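
    The idea of re-analysing marginal (one-SNP-at-a-time) effects under a joint multi-SNP model using correlation estimates from a reference dataset can be illustrated with a small linear-algebra sketch. It is not the JAM algorithm itself, which according to the abstract uses a Bayesian penalized regression framework with model search; the code only shows the standard least-squares relation for standardized genotypes, marginal = R x joint, solved for the joint effects. The function names, the correlation matrix and the effect sizes are all hypothetical.

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix @ x = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("correlation matrix is singular (perfectly correlated SNPs)")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                for c in range(col, n + 1):
                    aug[r][c] -= factor * aug[col][c]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def marginal_to_joint(ld_matrix, marginal_effects):
    """Joint effects implied by marginal effects and a reference SNP correlation matrix."""
    return solve_linear_system(ld_matrix, marginal_effects)


# Hypothetical region: two SNPs with correlation 0.8. Both look associated
# marginally, but the second signal is fully explained by the first.
ld = [[1.0, 0.8],
      [0.8, 1.0]]
marginal = [0.5, 0.4]
print(marginal_to_joint(ld, marginal))
```

    In this toy case the joint effect of the second SNP is essentially zero, which is the kind of situation in which a joint model can rule out a SNP that appears significant in a marginal analysis.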

  2. PPLine: An Automated Pipeline for SNP, SAP, and Splice Variant Detection in the Context of Proteogenomics.

    PubMed

    Krasnov, George Sergeevich; Dmitriev, Alexey Alexandrovich; Kudryavtseva, Anna Viktorovna; Shargunov, Alexander Valerievich; Karpov, Dmitry Sergeevich; Uroshlev, Leonid Andreevich; Melnikova, Natalya Vladimirovna; Blinov, Vladimir Mikhailovich; Poverennaya, Ekaterina Vladimirovna; Archakov, Alexander Ivanovich; Lisitsa, Andrey Valerievich; Ponomarenko, Elena Alexandrovna

    2015-09-01

    The fundamental mission of the Chromosome-Centric Human Proteome Project (C-HPP) is the research of human proteome diversity, including rare variants. Liver tissues, HepG2 cells, and plasma were selected as one of the major objects for C-HPP studies. The proteogenomic approach, a recently introduced technique, is a powerful method for predicting and validating proteoforms coming from alternative splicing, mutations, and transcript editing. We developed PPLine, a Python-based proteogenomic pipeline providing automated single-amino-acid polymorphism (SAP), indel, and alternative-spliced-variants discovery based on raw transcriptome and exome sequence data, single-nucleotide polymorphism (SNP) annotation and filtration, and the prediction of proteotypic peptides (available at https://sourceforge.net/projects/ppline). In this work, we performed deep transcriptome sequencing of HepG2 cells and liver tissues using two platforms: Illumina HiSeq and Applied Biosystems SOLiD. Using PPLine, we revealed 7756 SAP and indels for HepG2 cells and liver (including 659 variants nonannotated in dbSNP). We found 17 indels in transcripts associated with the translation of alternate reading frames (ARF) longer than 300 bp. The ARF products of two genes, SLMO1 and TMEM8A, demonstrate signatures of caspase-binding domain and Gcn5-related N-acetyltransferase. Alternative splicing analysis predicted novel proteoforms encoded by 203 (liver) and 475 (HepG2) genes according to both Illumina and SOLiD data. The results of the present work represent a basis for subsequent proteomic studies by the C-HPP consortium.

  3. Multi-SNP Analysis of GWAS Data Identifies Pathways Associated with Nonalcoholic Fatty Liver Disease

    PubMed Central

    Chen, Qing-Rong; Braun, Rosemary; Hu, Ying; Yan, Chunhua; Brunt, Elizabeth M.; Meerzaman, Daoud

    2013-01-01

    Non-alcoholic fatty liver disease (NAFLD) is a common liver disease; the histological spectrum of which ranges from steatosis to steatohepatitis. Nonalcoholic steatohepatitis (NASH) often leads to cirrhosis and development of hepatocellular carcinoma. To better understand pathogenesis of NAFLD, we performed the pathway of distinction analysis (PoDA) on a genome-wide association study dataset of 250 non-Hispanic white female adult patients with NAFLD, who were enrolled in the NASH Clinical Research Network (CRN) Database Study, to investigate whether biologic process variation measured through genomic variation of genes within these pathways was related to the development of steatohepatitis or cirrhosis. Pathways such as Recycling of eIF2:GDP, biosynthesis of steroids, Terpenoid biosynthesis and Cholesterol biosynthesis were found to be significantly associated with NASH. SNP variants in Terpenoid synthesis, Cholesterol biosynthesis and biosynthesis of steroids were associated with lobular inflammation and cytologic ballooning while those in Terpenoid synthesis were also associated with fibrosis and cirrhosis. These were also related to the NAFLD activity score (NAS) which is derived from the histological severity of steatosis, inflammation and ballooning degeneration. Eukaryotic protein translation and recycling of eIF2:GDP related SNP variants were associated with ballooning, steatohepatitis and cirrhosis. Il2 signaling events mediated by PI3K, Mitotic metaphase/anaphase transition, and Prostanoid ligand receptors were also significantly associated with cirrhosis. Taken together, the results provide evidence for additional ways, beyond the effects of single SNPs, by which genetic factors might contribute to the susceptibility to develop a particular phenotype of NAFLD and then progress to cirrhosis. Further studies are warranted to explain potential important genetic roles of these biological processes in NAFLD. PMID:23894275

  4. SNP-Based Quantification of Allele-Specific DNA Methylation Patterns by Pyrosequencing®.

    PubMed

    Busato, Florence; Tost, Jörg

    2015-01-01

    The analysis of allele-specific DNA methylation patterns has recently attracted much interest as loci of allele-specific DNA methylation overlap with known risk loci for complex diseases and the analysis might contribute to the fine-mapping and interpretation of non-coding genetic variants associated with complex diseases and improve the understanding of the relationship between genotype and phenotype. In this protocol, we present a method for the analysis of DNA methylation patterns on both alleles separately using heterozygous Single Nucleotide Polymorphisms (SNPs) as anchors for allele-specific PCR amplification followed by analysis of the allele-specific DNA methylation patterns by Pyrosequencing®. Pyrosequencing is an easy-to-handle, quantitative real-time sequencing method that is frequently used for genotyping as well as for the analysis of DNA methylation patterns. The protocol consists of three major steps: (1) identification of individuals heterozygous for a SNP in a region of interest using Pyrosequencing; (2) analysis of the DNA methylation patterns surrounding the SNP on bisulfite-treated DNA to identify regions of potential allele-specific DNA methylation; and (3) the analysis of the DNA methylation patterns associated with each of the two alleles, which are individually amplified using allele-specific PCR. The enrichment of the targeted allele is re-enforced by modification of the allele-specific primers at the allele-discriminating base with Locked Nucleic Acids (LNA). For the proof-of-principle of the developed approach, we provide assay details for three imprinted genes (IGF2, IGF2R, and PEG3) within this chapter. The mean of the DNA methylation patterns derived from the individual alleles corresponds well to the overall DNA methylation patterns and the developed approach proved more reliable compared to other protocols for allele-specific DNA methylation analysis.

  5. SNP discovery and genetic mapping of T-DNA insertional mutants in Fragaria vesca L.

    PubMed

    Ruiz-Rojas, J J; Sargent, D J; Shulaev, V; Dickerman, A W; Pattison, J; Holt, S H; Ciordia, A; Veilleux, Richard E

    2010-08-01

    As part of a program to develop forward and reverse genetics platforms in the diploid strawberry [Fragaria vesca L.; (2n = 2x = 14)] we have generated insertional mutant lines by T-DNA mutagenesis using pCAMBIA vectors. To characterize the T-DNA insertion sites of a population of 108 unique single copy mutants, we utilized thermal asymmetric interlaced PCR (hiTAIL-PCR) to amplify the flanking region surrounding either the left or right border of the T-DNA. Bioinformatics analysis of flanking sequences revealed little preference for insertion site with regard to G/C content; left borders tended to retain more of the plasmid backbone than right borders. Primers were developed from F. vesca flanking sequences to attempt to amplify products from both parents of the reference F. vesca 815 x F. bucharica 601 mapping population. Polymorphism occurred as: presence/absence of an amplification product for 16 primer pairs and different size products for 12 primer pairs. For 46 mutants, where polymorphism was not found by PCR, the amplification products were sequenced to reveal SNP polymorphism. A cleaved amplified polymorphic sequence/derived cleaved amplified polymorphism sequence (CAPS/dCAPS) strategy was then applied to find restriction endonuclease recognition sites in one of the parental lines to map the SNP position of 74 of the T-DNA insertion lines. BLAST search of flanking regions against GenBank revealed that 46 of 108 flanking sequences were close to presumed strawberry genes related to annotated genes from other plants.

  6. EST-derived SNP discovery and selective pressure analysis in Pacific white shrimp (Litopenaeus vannamei)

    NASA Astrophysics Data System (ADS)

    Liu, Chengzhang; Wang, Xia; Xiang, Jianhai; Li, Fuhua

    2012-09-01

    Pacific white shrimp has become a major aquaculture and fishery species worldwide. Although a large scale EST resource has been publicly available since 2008, the data have not yet been widely used for SNP discovery or transcriptome-wide assessment of selective pressure. In this study, a set of 155 411 expressed sequence tags (ESTs) from the NCBI database were computationally analyzed and 17 225 single nucleotide polymorphisms (SNPs) were predicted, including 9 546 transitions, 5 124 transversions and 2 481 indels. Among the 7 298 SNP substitutions located in functionally annotated contigs, 58.4% (4 262) are non-synonymous SNPs capable of introducing amino acid mutations. Two hundred and fifty nonsynonymous SNPs in genes associated with economic traits have been identified as candidates for markers in selective breeding. Diversity estimates among the synonymous nucleotides were on average 3.49 times greater than those in non-synonymous, suggesting negative selection. Distribution of non-synonymous to synonymous substitutions (Ka/Ks) ratio ranges from 0 to 4.01, (average 0.42, median 0.26), suggesting that the majority of the affected genes are under purifying selection. Enrichment analysis identified multiple gene ontology categories under positive or negative selection. Categories involved in innate immune response and male gamete generation are rich in positively selected genes, which is similar to reports in Drosophila and primates. This work is the first transcriptome-wide assessment of selective pressure in a Penaeid shrimp species. The functionally annotated SNPs provide a valuable resource of potential molecular markers for selective breeding.

  7. Genome-wide SNP discovery in mungbean by Illumina HiSeq.

    PubMed

    Van, Kyujung; Kang, Yang Jae; Han, Kwang-Soo; Lee, Yeong-Ho; Gwag, Jae-Gyun; Moon, Jung-Kyung; Lee, Suk-Ha

    2013-08-01

    Mungbean [Vigna radiata (L.) Wilczek], a self-pollinated diploid plant with 2n = 22 chromosomes, is an important legume crop with a high-quality amino acid profile. Sequence variation at the whole-genome level was examined by comparing two mungbean cultivars, Sunhwanokdu and Gyeonggijaerae 5, using Illumina HiSeq sequencing data. More than 40 billion bp from both mungbean cultivars were sequenced to a depth of 72×. After de novo assembly of Sunhwanokdu contigs by ABySS 1.3.2 (N50 = 9,958 bp), those longer than 10 kb were aligned with Gyeonggijaerae 5 reads using the Burrows-Wheeler Aligner. SAMTools was used for retrieving single nucleotide polymorphisms (SNPs) between Sunhwanokdu and Gyeonggijaerae 5, defining the lowest and highest depths as 5 and 100, respectively, and the sequence quality as 100. Of the 305,504 single-base changes identified, 40,503 SNPs were considered heterozygous in Gyeonggijaerae 5. Among the remaining 265,001 SNPs, 65.9 % (174,579 cases) were transitions and 34.1 % (90,422 cases) were transversions. For SNP validation, a total of 42 SNPs were chosen among Sunhwanokdu contigs longer than 10 kb and sharing at least 80 % sequence identity with common bean expressed sequence tags as determined with est2genome. Using seven mungbean cultivars from various origins in addition to Sunhwanokdu and Gyeonggijaerae 5, most of the SNPs identified by bioinformatics tools were confirmed by Sanger sequencing. These genome-wide SNP markers could enrich the current molecular resources and might be of value for the construction of a mungbean genetic map and the investigation of genetic diversity.
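    The transition/transversion split reported above can be reproduced with a short sketch. The classification rule is standard (purine to purine or pyrimidine to pyrimidine is a transition; anything else is a transversion); the function names are hypothetical, and only the two counts are taken from the abstract.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def classify_substitution(ref, alt):
    """Return 'transition' or 'transversion' for a single-base change."""
    ref, alt = ref.upper(), alt.upper()
    bases = PURINES | PYRIMIDINES
    if ref not in bases or alt not in bases or ref == alt:
        raise ValueError(f"not a single-base substitution: {ref}>{alt}")
    # A change within the same chemical class is a transition.
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"

# Counts reported in the abstract for the 265,001 homozygous SNPs.
n_transitions = 174_579
n_transversions = 90_422
total = n_transitions + n_transversions

pct_transitions = round(100 * n_transitions / total, 1)
pct_transversions = round(100 * n_transversions / total, 1)
ti_tv_ratio = n_transitions / n_transversions
```

    With these counts the percentages match the reported 65.9 % and 34.1 %, and the transition/transversion ratio comes out at roughly 1.93.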

  8. SNP Discovery with EST and NextGen Sequencing in Switchgrass (Panicum virgatum L.)

    PubMed Central

    Ersoz, Elhan S.; Wright, Mark H.; Pangilinan, Jasmyn L.; Sheehan, Moira J.; Tobias, Christian; Casler, Michael D.; Buckler, Edward S.; Costich, Denise E.

    2012-01-01

    Although yield trials for switchgrass (Panicum virgatum L.), a potentially high value biofuel feedstock crop, are currently underway throughout North America, the genetic tools for crop improvement in this species are still in the early stages of development. Identification of high-density molecular markers, such as single nucleotide polymorphisms (SNPs), that are amenable to high-throughput genotyping approaches, is the first step in a quantitative genetics study of this model biofuel crop species. We generated and sequenced expressed sequence tag (EST) libraries from thirteen diverse switchgrass cultivars representing both upland and lowland ecotypes, as well as tetraploid and octoploid genomes. We followed this with reduced genomic library preparation and massively parallel sequencing of the same samples using the Illumina Genome Analyzer technology platform. EST libraries were used to generate unigene clusters and establish a gene-space reference sequence, thus providing a framework for assembly of the short sequence reads. SNPs were identified utilizing these scaffolds. We used a custom software program for alignment and SNP detection and identified over 149,000 SNPs across the 13 short-read sequencing libraries (SRSLs). Approximately 25,000 additional SNPs were identified from the entire EST collection available for the species. This sequencing effort generated data that are suitable for marker development and for estimation of population genetic parameters, such as nucleotide diversity and linkage disequilibrium. Based on these data, we assessed the feasibility of genome wide association mapping and genomic selection applications in switchgrass. Overall, the SNP markers discovered in this study will help facilitate quantitative genetics experiments and greatly enhance breeding efforts that target improvement of key biofuel traits and development of new switchgrass cultivars. PMID:23049744

  9. Y-SNP L1034: limited genetic link between Mansi and Hungarian-speaking populations.

    PubMed

    Fehér, T; Németh, E; Vándor, A; Kornienko, I V; Csáji, L K; Pamjav, H

    2015-02-01

    Genetic studies noted that the Hungarian Y-chromosomal gene pool significantly differs from other Uralic-speaking populations. Hungarians show very limited or no presence of haplogroup N-Tat, which is frequent among other Uralic-speaking populations. We proposed that some genetic links need to be observed between the linguistically related Hungarian and Mansi populations. This is the first attempt to divide haplogroup N-Tat into subhaplogroups by testing new downstream SNP markers L708 and L1034. Sixty Northern Mansi samples were collected in Western Siberia and genotyped for Y-chromosomal haplotypes and haplogroups. We found 14 Mansi and 92 N-Tat samples from 7 populations. Comparative results showed that all N-Tat samples carried the N-L708 mutation. Some Hungarian, Sekler, and Uzbek samples were L1034 SNP positive, while all Mongolians, Buryats, Khanty, Finnish, and Roma samples yielded a negative result for this marker. Based on the above, L1034 marker seems to be a subgroup of N-Tat, which is typical for Mansi and Hungarian-speaking ethnic groups so far. Based on our time to most recent common ancestor data, the L1034 marker arose 2,500 years before present. The overall frequency of the L1034 is very low among the analyzed populations, thus it does not necessarily mean that proto-Hungarians and Mansi descend from common ancestors. It does provide, however, a limited genetic link supporting language contact. Both Hungarians and Mansi have much more complex genetic population history than the traditional tree-based linguistic model would suggest. PMID:25258186

  10. Distinct SNP Combinations Confer Susceptibility to Urinary Bladder Cancer in Smokers and Non-Smokers

    PubMed Central

    Blaszkewicz, Meinolf; Marchan, Rosemarie; Ickstadt, Katja

    2012-01-01

    Recently, genome-wide association studies have identified and validated genetic variations associated with urinary bladder cancer (UBC). However, it is still unknown whether the high-risk alleles of several SNPs interact with one another, leading to an even higher disease risk. Additionally, there is no information available on how the UBC risk due to these SNPs compare to the risk of cigarette smoking and to occupational exposure to urinary bladder carcinogens, and whether the same or different SNP combinations are relevant in smokers and non-smokers. To address these questions, we analyzed the genotypes of six SNPs, previously found to be associated with UBC, together with the GSTM1 deletion, in 1,595 UBC cases and 1,760 controls, stratified for smoking habits. We identified the strongest interactions of different orders and tested the stability of their effect by bootstrapping. We found that different SNP combinations were relevant in smokers and non-smokers. In smokers, polymorphisms involved in detoxification of cigarette smoke carcinogens were most relevant (GSTM1, rs11892031), in contrast to those in non-smokers with MYC and APOBEC3A near polymorphisms (rs9642880, rs1014971) being the most influential. Stable combinations of up to three high-risk alleles resulted in higher odds ratios (OR) than the individual SNPs, although the interaction effect was less than additive. The highest stable combination effects resulted in an OR of about 2.0, which is still lower than the ORs of cigarette smoking (here, current smokers' OR: 3.28) and comparable to occupational carcinogen exposure risks which, depending on the workplace, show mostly ORs up to 2.0. PMID:23284801

  11. Distinct SNP combinations confer susceptibility to urinary bladder cancer in smokers and non-smokers.

    PubMed

    Schwender, Holger; Selinski, Silvia; Blaszkewicz, Meinolf; Marchan, Rosemarie; Ickstadt, Katja; Golka, Klaus; Hengstler, Jan G

    2012-01-01

    Recently, genome-wide association studies have identified and validated genetic variations associated with urinary bladder cancer (UBC). However, it is still unknown whether the high-risk alleles of several SNPs interact with one another, leading to an even higher disease risk. Additionally, there is no information available on how the UBC risk due to these SNPs compare to the risk of cigarette smoking and to occupational exposure to urinary bladder carcinogens, and whether the same or different SNP combinations are relevant in smokers and non-smokers. To address these questions, we analyzed the genotypes of six SNPs, previously found to be associated with UBC, together with the GSTM1 deletion, in 1,595 UBC cases and 1,760 controls, stratified for smoking habits. We identified the strongest interactions of different orders and tested the stability of their effect by bootstrapping. We found that different SNP combinations were relevant in smokers and non-smokers. In smokers, polymorphisms involved in detoxification of cigarette smoke carcinogens were most relevant (GSTM1, rs11892031), in contrast to those in non-smokers with MYC and APOBEC3A near polymorphisms (rs9642880, rs1014971) being the most influential. Stable combinations of up to three high-risk alleles resulted in higher odds ratios (OR) than the individual SNPs, although the interaction effect was less than additive. The highest stable combination effects resulted in an OR of about 2.0, which is still lower than the ORs of cigarette smoking (here, current smokers' OR: 3.28) and comparable to occupational carcinogen exposure risks which, depending on the workplace, show mostly ORs up to 2.0.

  12. High-density SNP-based genetic maps for the parents of an outcrossed and a selfed tetraploid garden rose cross, inferred from admixed progeny using the 68k rose SNP array

    PubMed Central

    Vukosavljev, Mirjana; Arens, Paul; Voorrips, Roeland E; van ‘t Westende, Wendy PC; Esselink, GD; Bourke, Peter M; Cox, Peter; van de Weg, W Eric; Visser, Richard GF; Maliepaard, Chris; Smulders, Marinus JM

    2016-01-01

    Dense genetic maps create a base for QTL analysis of important traits and future implementation of marker-assisted breeding. In tetraploid rose, the existing linkage maps include <300 markers to cover 28 linkage groups (4 homologous sets of 7 chromosomes). Here we used the 68k WagRhSNP Axiom single-nucleotide polymorphism (SNP) array for rose, in combination with SNP dosage calling at the tetraploid level, to genotype offspring from the garden rose cultivar ‘Red New Dawn’. The offspring proved to be not from a single bi-parental cross. In rose breeding, crosses with unintended parents occur regularly. We developed a strategy to separate progeny into putative populations, even while one of the parents was unknown, using principal component analysis on pairwise genetic distances based on sets of selected SNP markers that were homozygous, and therefore uninformative for one parent. One of the inferred populations was consistent with self-fertilization of ‘Red New Dawn’. Subsequently, linkage maps were generated for a bi-parental and a self-pollinated population with ‘Red New Dawn’ as the common maternal parent. The densest map, for the selfed parent, had 1929 SNP markers on 25 linkage groups, covering 1765.5 cM at an average marker distance of 0.9 cM. Synteny with the strawberry (Fragaria vesca) genome was extensive. Rose ICM1 corresponded to F. vesca pseudochromosome 7 (Fv7), ICM4 to Fv4, ICM5 to Fv3, ICM6 to Fv2 and ICM7 to Fv5. Rose ICM2 corresponded to parts of F. vesca pseudochromosomes 1 and 6, whereas ICM3 is syntenic to the remainder of Fv6.

  13. An ultra-dense SNP linkage map for the octoploid, cultivated strawberry and its application in genetic research

    Technology Transfer Automated Retrieval System (TEKTRAN)

    We will present an ultra-dense genetic linkage map for the octoploid, cultivated strawberry (Fragaria x ananassa) consisting of over 13K Axiom® based SNP markers and 150 previously mapped reference SSR loci. The high quality of the map is demonstrated by the short sizes of each of the 28 linkage gro...

  14. Development of a high-throughput SNP resource to advance genomic, genetic and breeding research in carrot (Daucus carota L.)

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The rapid advancement in high-throughput SNP genotyping technologies along with next generation sequencing (NGS) platforms has decreased the cost, improved the quality of large-scale genome surveys, and allowed specialty crops with limited genomic resources such as carrot (Daucus carota) to access t...

  15. SEL1L SNP rs12435998, a predictor of glioblastoma survival and response to radio-chemotherapy

    PubMed Central

    Storaci, Alessandra Maria; Annovazzi, Laura; Cassoni, Paola; Melcarne, Antonio; De Blasio, Pasquale; Schiffer, Davide; Biunno, Ida

    2015-01-01

    The suppressor of Lin-12-like (C. elegans) (SEL1L) is involved in the endoplasmic reticulum (ER)-associated degradation pathway, malignant transformation and stem cells. In 412 formalin-fixed and paraffin-embedded brain tumors and 39 Glioblastoma multiforme (GBM) cell lines, we determined the frequency of five SEL1L single nucleotide genetic variants with regulatory and coding functions by a SNaPShot™ assay. We tested their possible association with brain tumor risk, prognosis and therapy. We studied the in vitro cytotoxicity of valproic acid (VPA), temozolomide (TMZ), doxorubicin (DOX) and paclitaxel (PTX), alone or in combination, on 11 GBM cell lines, with respect to the SNP rs12435998 genotype. The SNP rs12435998 was prevalent in anaplastic and malignant gliomas, and in meningiomas of all histologic grades, but unrelated to brain tumor risks. In GBM patients, the SNP rs12435998 was associated with prolonged overall survival (OS) and better response to TMZ-based radio-chemotherapy. GBM stem cells with this SNP showed lower levels of SEL1L expression and enhanced sensitivity to VPA. PMID:25948789

  16. CLOCK 3111 T/C SNP interacts with emotional eating behavior for weight-loss in a Mediterranean population

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The goals of this research were (1) to analyze the role of emotional eating behavior on weight-loss progression during a 30-week weight-loss program in 1,272 individuals from a large Mediterranean population and (2) to test for interaction between CLOCK 3111 T/C SNP and emotional eating behavior on t...

  17. SNP marker development for linkage map construction, anchoring of the common bean whole genome sequence and genetic research

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Our objectives were to identify SNP DNA markers based on a diverse set of common bean cultivars via next generation sequencing technologies; to develop Illumina Infinium BeadChip assays containing SNPs with high polymorphism within and between common bean market classes, to create high density genet...

  18. SNP discovery and chromosome anchoring provide the first physically-anchored hexaploid oat map and reveal synteny with model species

    Technology Transfer Automated Retrieval System (TEKTRAN)

    For the first time in many years a comprehensive genome map for cultivated oat has been constructed using a combination of single nucleotide polymorphism (SNP) markers and validated with a collection of cytogenetically defined germplasm lines. The markers were able to help distinguish the three geno...

  19. Characterizing Associations and SNP-Environment Interactions for GWAS-Identified Prostate Cancer Risk Markers—Results from BPC3

    PubMed Central

    Lindstrom, Sara; Schumacher, Fredrick; Siddiq, Afshan; Travis, Ruth C.; Campa, Daniele; Berndt, Sonja I.; Diver, W. Ryan; Severi, Gianluca; Allen, Naomi; Andriole, Gerald; Bueno-de-Mesquita, Bas; Chanock, Stephen J.; Crawford, David; Gaziano, J. Michael; Giles, Graham G.; Giovannucci, Edward; Guo, Carolyn; Haiman, Christopher A.; Hayes, Richard B.; Halkjaer, Jytte; Hunter, David J.; Johansson, Mattias; Kaaks, Rudolf; Kolonel, Laurence N.; Navarro, Carmen; Riboli, Elio; Sacerdote, Carlotta; Stampfer, Meir; Stram, Daniel O.; Thun, Michael J.; Trichopoulos, Dimitrios; Virtamo, Jarmo; Weinstein, Stephanie J.; Yeager, Meredith; Henderson, Brian; Ma, Jing; Le Marchand, Loic; Albanes, Demetrius; Kraft, Peter

    2011-01-01

    Genome-wide association studies (GWAS) have identified multiple single nucleotide polymorphisms (SNPs) associated with prostate cancer risk. However, whether these associations can be consistently replicated, vary with disease aggressiveness (tumor stage and grade) and/or interact with non-genetic potential risk factors or other SNPs is unknown. We therefore genotyped 39 SNPs from regions identified by several prostate cancer GWAS in 10,501 prostate cancer cases and 10,831 controls from the NCI Breast and Prostate Cancer Cohort Consortium (BPC3). We replicated 36 out of 39 SNPs (P-values ranging from 0.01 to 10^−28). Two SNPs located near KLK3 associated with PSA levels showed differential association with Gleason grade (rs2735839, P = 0.0001 and rs266849, P = 0.0004; case-only test), where the alleles associated with decreasing PSA levels were inversely associated with low-grade (as defined by Gleason grade <8) tumors but positively associated with high-grade tumors. No other SNP showed differential associations according to disease stage or grade. We observed no effect modification by SNP for association with age at diagnosis, family history of prostate cancer, diabetes, BMI, height, smoking or alcohol intake. Moreover, we found no evidence of pair-wise SNP-SNP interactions. While these SNPs represent new independent risk factors for prostate cancer, we saw little evidence for effect modification by other SNPs or by the environmental factors examined. PMID:21390317

  20. Association of STAT2 SNP genotypes and growth phenotypes in heifers from an Angus, Brahman and Romosinuano diallel population

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Components of the growth endocrine axis regulate growth and reproduction traits in cattle. A SNP in the promoter of the signal transducer and activator of transcription 2 (STAT2) has been previously reported to be associated with postpartum rebreeding in a diallel beef population composed of 650 hei...

  1. Translational genomics for abiotic stress in sorghum: transcriptional profiling and validation of SNP markers between germplasm with differential cold tolerance

    Technology Transfer Automated Retrieval System (TEKTRAN)

    One focus of the Sorghum Translational Genomics Lab (part of sorghum CRIS, PSGD, CSRL, USDA-ARS, Lubbock TX) is to utilize nucleotide variation between sorghum germplasm such as those derived from RNA seq for translation and validation of Single Nucleotide Polymorphism (SNP) into easy access DNA m...

  2. Population-standardized genetic risk score: the SNP-based method of choice for inherited risk assessment of prostate cancer

    PubMed Central

    Conran, Carly A; Na, Rong; Chen, Haitao; Jiang, Deke; Lin, Xiaoling; Zheng, S Lilly; Brendler, Charles B; Xu, Jianfeng

    2016-01-01

    Several different approaches are available to clinicians for determining prostate cancer (PCa) risk. The clinical validity of various PCa risk assessment methods utilizing single nucleotide polymorphisms (SNPs) has been established; however, these SNP-based methods have not been compared. The objective of this study was to compare the three most commonly used SNP-based methods for PCa risk assessment. Participants were men (n = 1654) enrolled in a prospective study of PCa development. Genotypes of 59 PCa risk-associated SNPs were available in this cohort. Three methods of calculating SNP-based genetic risk scores (GRSs) were used for the evaluation of individual disease risk: risk allele count (GRS-RAC), weighted risk allele count (GRS-wRAC), and population-standardized genetic risk score (GRS-PS). Mean GRSs were calculated, and performances were compared using area under the receiver operating characteristic curve (AUC) and positive predictive value (PPV). All SNP-based methods were found to be independently associated with PCa (all P < 0.05; hence their clinical validity). The mean GRSs in men with or without PCa using GRS-RAC were 55.15 and 53.46, respectively, using GRS-wRAC were 7.42 and 6.97, respectively, and using GRS-PS were 1.12 and 0.84, respectively (all P < 0.05 for differences between patients with or without PCa). All three SNP-based methods performed similarly in discriminating PCa from non-PCa based on AUC and in predicting PCa risk based on PPV (all P > 0.05 for comparisons between the three methods), and all three SNP-based methods had a significantly higher AUC than family history (all P < 0.05). Results from this study suggest that while the three most commonly used SNP-based methods performed similarly in discriminating PCa from non-PCa at the population level, GRS-PS is the method of choice for risk assessment at the individual level because its value (where 1.0 represents average population risk) can be easily interpreted regardless
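    The three scores named in this abstract can be sketched as follows. The abstract does not give the formulas, so the definitions here are assumptions based on common usage: GRS-RAC as a plain count of risk alleles, GRS-wRAC as a count weighted by the natural log of each SNP's odds ratio, and GRS-PS as the product of per-SNP relative risks, each divided by the population-average risk at that SNP under Hardy-Weinberg equilibrium. All function names, odds ratios and allele frequencies are hypothetical.

```python
import math

def grs_rac(genotypes):
    """Risk allele count: sum of risk-allele dosages (0, 1 or 2 per SNP)."""
    return sum(genotypes)

def grs_wrac(genotypes, odds_ratios):
    """Weighted count: each risk allele weighted by ln(OR) (assumed weighting)."""
    return sum(g * math.log(o) for g, o in zip(genotypes, odds_ratios))

def grs_ps(genotypes, odds_ratios, risk_allele_freqs):
    """Population-standardized score; 1.0 means average population risk.

    Assumes a multiplicative per-allele model and Hardy-Weinberg
    proportions when averaging risk over the population.
    """
    score = 1.0
    for g, o, p in zip(genotypes, odds_ratios, risk_allele_freqs):
        # Average relative risk over genotypes 0, 1, 2 at this SNP.
        avg_risk = (1 - p) ** 2 + 2 * p * (1 - p) * o + p ** 2 * o ** 2
        score *= (o ** g) / avg_risk
    return score

# Hypothetical individual genotyped at three SNPs.
genotypes = [0, 1, 2]
odds_ratios = [1.2, 1.1, 1.3]
freqs = [0.3, 0.5, 0.2]

rac = grs_rac(genotypes)
wrac = grs_wrac(genotypes, odds_ratios)
ps = grs_ps(genotypes, odds_ratios, freqs)
```

    Under these assumptions the population-standardized score averages to exactly 1.0 across the population at each SNP, which is what makes an individual value such as 1.12 or 0.84 directly readable as a risk relative to the population average.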

  3. Single nucleotide polymorphism (SNP) at the GHR gene and its associations with chicken growth and fat deposition traits.

    PubMed

    Ouyang, J H; Xie, L; Nie, Q; Luo, C; Liang, Y; Zeng, H; Zhang, X

    2008-03-01

    1. The growth hormone receptor (GHR) plays crucial roles in chicken growth and metabolism. 2. The full cDNA of the chicken GHR gene was scanned for single nucleotide polymorphisms (SNP) by means of denaturing high-performance liquid chromatography (DHPLC). Three SNP, C6540334T, C6542011T and G6631778A, were genotyped in an F2 designed full-sib resource population to analyse their associations with chicken growth and fat deposition traits. 3. Fifty-five SNP and two other variations were identified in the 8908 bp region of the GHR gene. Among the 55 SNP, 10 were located in coding exons (6 resulted in changes of amino acids) and 45 were in non-coding regions (introns, 5'UTR and 3'UTR). The nucleotide diversity (theta), corrected for sample size of chicken GHR gene, is 1.45 × 10^-3. Fourteen PCR-RFLP markers were developed in the chicken GHR gene. 4. The G6631778A was associated with body weight at 63 d (BW63), dressed weight (DW) and subcutaneous fat thickness (SFT), BW35 and BW49 (P < 0.01) as well as hatch weight (HW) and BW42 in the male population. However, G6631778A was only associated with BW28 in the female population. G rather than A was dominant for chicken growth and fat deposition. Haplotypes based on the three SNP were associated with BW21, BW70, BW77 and SFT, BW7, BW35, BW42, BW49 and BW56 in males, and associated with BW7 and BW14 in females. For growth in males, the H2 and H6 haplotypes had positive and negative effects, respectively; meanwhile H6 was predominant for fat deposition.

  4. A high-throughput SNP marker system for parental polymorphism screening, and diversity analysis in common bean (Phaseolus vulgaris L.).

    PubMed

    Blair, Matthew W; Cortés, Andrés J; Penmetsa, R Varma; Farmer, Andrew; Carrasquilla-Garcia, Noelia; Cook, Doug R

    2013-02-01

    Single nucleotide polymorphism (SNP) detection has become a marker system of choice, because of the high abundance of source polymorphisms and the ease with which allele calls are automated. Various technologies exist for the evaluation of SNP loci and previously we validated two medium throughput technologies. In this study, our goal was to utilize a 768 feature, Illumina GoldenGate assay for common bean (Phaseolus vulgaris L.) developed from conserved legume gene sequences and to use the new technology for (1) the evaluation of parental polymorphisms in a mini-core set of common bean accessions and (2) the analysis of genetic diversity in the crop. A total of 736 SNPs were scored on 236 diverse common bean genotypes with the GoldenGate array. Missing data and heterozygosity levels were low and 94% of the SNPs were scorable. With the evaluation of the parental polymorphism genotypes, we estimated the utility of the SNP markers in mapping for inter-genepool and intra-genepool populations, the latter being of lower polymorphism than the former. When we performed the diversity analysis with the diverse genotypes, we found that the Illumina GoldenGate SNPs provided evaluations equivalent to those of previous gene-based SNP markers, but less fine distinctions than previous microsatellite marker analysis. We did find, however, that the gene-based SNPs in the GoldenGate array had some utility in race structure analysis despite the low polymorphism. Furthermore, the SNPs detected high heterozygosity in wild accessions, which was probably a reflection of ascertainment bias. The Illumina SNPs were shown to be effective in distinguishing between the genepools, and therefore were most useful in saturation of inter-genepool genetic maps. The implications of these results for breeding in common bean are discussed as well as the advantages and disadvantages of the GoldenGate system for SNP detection.

  5. Extensive population structure in San, Khoe, and mixed ancestry populations from southern Africa revealed by 44 short 5-SNP haplotypes.

    PubMed

    Schlebusch, Carina M; Soodyall, Himla

    2012-12-01

    The San and Khoe people currently represent remnant groups of a much larger and widely distributed population of hunter-gatherers and pastoralists who had exclusive occupation of southern Africa before the arrival of Bantu-speaking groups in the past 1,200 years and sea-borne immigrants within the last 350 years. Genetic studies [mitochondrial deoxyribonucleic acid (DNA) and Y-chromosome] conducted on San and Khoe groups revealed that they harbor some of the most divergent lineages found in living peoples throughout the world. Recently, high-density, autosomal, single-nucleotide polymorphism (SNP)-array studies confirmed the early divergence of Khoe-San population groups from all other human populations. The present study made use of 220 autosomal SNP markers (in the format of both haplotypes and genotypes) to examine the population structure of various San and Khoe groups and their relationship to other neighboring groups. Whereas analyses based on the genotypic SNP data only supported the division of the included populations into three main groups-Khoe-San, Bantu-speakers, and non-African populations-haplotype analyses revealed finer structure within Khoe-San populations. By the use of only 44 short SNP haplotypes (compiled from a total of 220 SNPs), most of the Khoe-San groups could be resolved as separate groups by applying STRUCTURE analyses. Therefore, by carefully selecting a few SNPs and combining them into haplotypes, we were able to achieve the same level of population distinction that was achieved previously in high-density SNP studies on the same population groups. Using haplotypes proved to be a very efficient and cost-effective way to study population structure.

  6. SNP Discovery and Chromosome Anchoring Provide the First Physically-Anchored Hexaploid Oat Map and Reveal Synteny with Model Species

    PubMed Central

    Chao, Shiaoman; Jellen, Eric N.; Carson, Martin L.; Rines, Howard W.; Obert, Donald E.; Lutz, Joseph D.; Shackelford, Irene; Korol, Abraham B.; Wight, Charlene P.; Gardner, Kyle M.; Hattori, Jiro; Beattie, Aaron D.; Bjørnstad, Åsmund; Bonman, J. Michael; Jannink, Jean-Luc; Sorrells, Mark E.; Brown-Guedira, Gina L.; Mitchell Fetch, Jennifer W.; Harrison, Stephen A.; Howarth, Catherine J.; Ibrahim, Amir; Kolb, Frederic L.; McMullen, Michael S.; Murphy, J. Paul; Ohm, Herbert W.; Rossnagel, Brian G.; Yan, Weikai; Miclaus, Kelci J.; Hiller, Jordan; Maughan, Peter J.; Redman Hulse, Rachel R.; Anderson, Joseph M.; Islamovic, Emir

    2013-01-01

    A physically anchored consensus map is foundational to modern genomics research; however, construction of such a map in oat (Avena sativa L., 2n = 6x = 42) has been hindered by the size and complexity of the genome, the scarcity of robust molecular markers, and the lack of aneuploid stocks. Resources developed in this study include a modified SNP discovery method for complex genomes, a diverse set of oat SNP markers, and a novel chromosome-deficient SNP anchoring strategy. These resources were applied to build the first complete, physically-anchored consensus map of hexaploid oat. Approximately 11,000 high-confidence in silico SNPs were discovered based on nine million inter-varietal sequence reads of genomic and cDNA origin. GoldenGate genotyping of 3,072 SNP assays yielded 1,311 robust markers, of which 985 were mapped in 390 recombinant-inbred lines from six bi-parental mapping populations ranging in size from 49 to 97 progeny. The consensus map included 985 SNPs and 68 previously-published markers, resolving 21 linkage groups with a total map distance of 1,838.8 cM. Consensus linkage groups were assigned to 21 chromosomes using SNP deletion analysis of chromosome-deficient monosomic hybrid stocks. Alignments with sequenced genomes of rice and Brachypodium provide evidence for extensive conservation of genomic regions, and renewed encouragement for orthology-based genomic discovery in this important hexaploid species. These results also provide a framework for high-resolution genetic analysis in oat, and a model for marker development and map construction in other species with complex genomes and limited resources. PMID:23533580

  7. SNP discovery and chromosome anchoring provide the first physically-anchored hexaploid oat map and reveal synteny with model species.

    PubMed

    Oliver, Rebekah E; Tinker, Nicholas A; Lazo, Gerard R; Chao, Shiaoman; Jellen, Eric N; Carson, Martin L; Rines, Howard W; Obert, Donald E; Lutz, Joseph D; Shackelford, Irene; Korol, Abraham B; Wight, Charlene P; Gardner, Kyle M; Hattori, Jiro; Beattie, Aaron D; Bjørnstad, Åsmund; Bonman, J Michael; Jannink, Jean-Luc; Sorrells, Mark E; Brown-Guedira, Gina L; Mitchell Fetch, Jennifer W; Harrison, Stephen A; Howarth, Catherine J; Ibrahim, Amir; Kolb, Frederic L; McMullen, Michael S; Murphy, J Paul; Ohm, Herbert W; Rossnagel, Brian G; Yan, Weikai; Miclaus, Kelci J; Hiller, Jordan; Maughan, Peter J; Redman Hulse, Rachel R; Anderson, Joseph M; Islamovic, Emir; Jackson, Eric W

    2013-01-01

    A physically anchored consensus map is foundational to modern genomics research; however, construction of such a map in oat (Avena sativa L., 2n = 6x = 42) has been hindered by the size and complexity of the genome, the scarcity of robust molecular markers, and the lack of aneuploid stocks. Resources developed in this study include a modified SNP discovery method for complex genomes, a diverse set of oat SNP markers, and a novel chromosome-deficient SNP anchoring strategy. These resources were applied to build the first complete, physically-anchored consensus map of hexaploid oat. Approximately 11,000 high-confidence in silico SNPs were discovered based on nine million inter-varietal sequence reads of genomic and cDNA origin. GoldenGate genotyping of 3,072 SNP assays yielded 1,311 robust markers, of which 985 were mapped in 390 recombinant-inbred lines from six bi-parental mapping populations ranging in size from 49 to 97 progeny. The consensus map included 985 SNPs and 68 previously-published markers, resolving 21 linkage groups with a total map distance of 1,838.8 cM. Consensus linkage groups were assigned to 21 chromosomes using SNP deletion analysis of chromosome-deficient monosomic hybrid stocks. Alignments with sequenced genomes of rice and Brachypodium provide evidence for extensive conservation of genomic regions, and renewed encouragement for orthology-based genomic discovery in this important hexaploid species. These results also provide a framework for high-resolution genetic analysis in oat, and a model for marker development and map construction in other species with complex genomes and limited resources. PMID:23533580

  8. Assessment of the functionality of genome-wide canine SNP arrays and implications for canine disease association studies.

    PubMed

    Ke, X; Kennedy, L J; Short, A D; Seppälä, E H; Barnes, A; Clements, D N; Wood, S H; Carter, S D; Happ, G M; Lohi, H; Ollier, W E R

    2011-04-01

    Domestic dogs share a wide range of important disease conditions with humans, including cancers, diabetes and epilepsy. Many of these conditions have similar or identical underlying pathologies to their human counterparts and thus dogs represent physiologically relevant natural models of human disorders. Comparative genomic approaches whereby disease genes can be identified in dog diseases and then mapped onto the human genome are now recognized as a valid method and are increasing in popularity. The majority of dog breeds have been created over the past few hundred years and, as a consequence, the dog genome is characterized by extensive linkage disequilibrium (LD), extending usually from hundreds of kilobases to several megabases within a breed, rather than tens of kilobases observed in the human genome. Genome-wide canine SNP arrays have been developed, and increasing success of using these arrays to map disease loci in dogs is emerging. No equivalent of the human HapMap currently exists for different canine breeds, and the LD structure for such breeds is far less understood than for humans. This study is a dedicated large-scale assessment of the functionalities (LD and SNP tagging performance) of canine genome-wide SNP arrays in multiple domestic dog breeds. We have used genotype data from 18 breeds as well as wolves and coyotes genotyped by the Illumina 22K canine SNP array and Affymetrix 50K canine SNP array. As expected, high tagging performance was observed with most of the breeds using both Illumina and Affymetrix arrays when multi-marker tagging was applied. In contrast, however, large differences in population structure, LD coverage and pairwise tagging performance were found between breeds, suggesting that study designs should be carefully assessed for individual breeds before undertaking genome-wide association studies (GWAS).
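    Pairwise LD and tagging performance, as discussed in the record above, are commonly summarised with the r-squared statistic between two biallelic SNPs. The following is a minimal sketch of that calculation from phased haplotype counts; the function name and the count layout are illustrative assumptions, and the abstract does not state which LD measure or software the study used.

```python
def ld_r_squared(n_AB: int, n_Ab: int, n_aB: int, n_ab: int) -> float:
    """r^2 between two biallelic SNPs from the four haplotype counts.

    n_AB counts haplotypes carrying allele A at SNP 1 and allele B at SNP 2, etc.
    """
    total = n_AB + n_Ab + n_aB + n_ab
    if total == 0:
        raise ValueError("no haplotypes supplied")
    p_A = (n_AB + n_Ab) / total      # frequency of allele A at SNP 1
    p_B = (n_AB + n_aB) / total      # frequency of allele B at SNP 2
    d = n_AB / total - p_A * p_B     # disequilibrium coefficient D
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    if denom == 0:
        raise ValueError("r^2 is undefined when either SNP is monomorphic")
    return d * d / denom
```

    A SNP is then usually called "tagged" when its r-squared with some array SNP exceeds a chosen threshold (0.8 is a common convention, not a value taken from this abstract).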

  9. Fine tuning genomic evaluations in dairy cattle through SNP pre-selection with the Elastic-Net algorithm.

    PubMed

    Croiseau, Pascal; Legarra, Andrés; Guillaume, François; Fritz, Sébastien; Baur, Aurélia; Colombani, Carine; Robert-Granié, Christèle; Boichard, Didier; Ducrocq, Vincent

    2011-12-01

    For genomic selection methods, the statistical challenge is to estimate the effect of each of the available single-nucleotide polymorphisms (SNPs). In a context where the number of SNPs (p) is much higher than the number of bulls (n), this task may lead to a poor estimation of these SNP effects if, as for genomic BLUP (gBLUP), all SNPs have a non-null effect. An alternative is to use approaches that have been developed specifically to solve the 'p > n' problem. Among these are variable selection methods, and we focus on the Elastic-Net (EN) algorithm, a penalized regression approach. Performances of EN, gBLUP and pedigree-based BLUP were compared with data from three French dairy cattle breeds, giving very encouraging results for EN. We tried to push further the idea of improving SNP effect estimates by considering fewer of them. This variable selection strategy was considered both in the case of gBLUP and EN by adding an SNP pre-selection step based on quantitative trait locus (QTL) detection. Similar results were observed with or without a pre-selection step, in terms of correlations between direct genomic value (DGV) and observed daughter yield deviation in a validation data set. However, when applied to the EN algorithm, this strategy led to a substantial reduction of the number of SNPs included in the prediction equation. In a context where the number of genotyped animals and the number of SNPs get larger and larger, SNP pre-selection strongly alleviates computing requirements and ensures that national evaluations can be completed within a reasonable time frame.
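    For orientation, one widely used way of writing the Elastic-Net criterion is given below. The parameterisation (mixing weight \alpha, overall penalty \lambda, and the 1/(2n) scaling) follows a common software convention and is an assumption here; the abstract does not state which form the authors used.

```latex
\hat{\beta} \;=\; \arg\min_{\beta}\;
  \frac{1}{2n}\,\lVert y - X\beta \rVert_2^2
  \;+\; \lambda \left( \alpha\,\lVert \beta \rVert_1
  \;+\; \frac{1-\alpha}{2}\,\lVert \beta \rVert_2^2 \right),
\qquad 0 \le \alpha \le 1,\ \lambda \ge 0 .
```

    Here y holds the n phenotypes, X the n x p SNP genotype matrix and \beta the SNP effects. With \alpha = 1 the penalty reduces to the LASSO, which sets many effects exactly to zero; with \alpha = 0 it reduces to ridge regression, which shrinks all effects but keeps every SNP in the model.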

  10. Identification of genes with nonsynonymous SNP in Jeju horse by whole-genome resequencing reveals a functional role for immune response.

    PubMed

    Lee, J-H; Song, K-D; Kim, J-M; Leem, H-K; Park, K-D

    2016-03-01

    Jeju horse (Natural Monument number 347) is a breed of horse that has experienced long-term isolation and domestication in Jeju Island, South Korea. We evaluated genetic features of this breed, including SNP, by whole-genome resequencing using an Illumina HiSeq 2000. A total of 5,986,852 SNP were identified in 4 Jeju horses and were divided into homozygous and heterozygous SNP (2,357,099 and 3,629,753 SNP, respectively). Of these SNP, 63.8% resided in intergenic regions. Immune response genes with nonsynonymous SNP were overrepresented in Jeju horses, as evidenced by Gene Ontology clustering. Among these genes, Toll-like receptors (TLR) were highly enriched. Comparison of TLR genes between Jeju horses and Przewalski's horse showed "possibly damaging" mutations in several regions by analysis with PolyPhen-2. These results provide a framework for further genetic studies of domestication in the Jeju horse. Furthermore, research on functions of SNP-associated genes would aid in understanding the molecular genetic variation of horse breeds.

  11. Genome-wide association study for behavior, type traits, and muscular development in Charolais beef cattle.

    PubMed

    Vallée, A; Daures, J; van Arendonk, J A M; Bovenhuis, H

    2016-06-01

    Behavior, type traits, and muscular development are of interest for beef cattle breeding. Genome-wide association studies (GWAS) enable the identification of candidate genes, which enables gene-based selection and provides insight into the genetic architecture of these traits. The objective of the current study was to perform a GWAS for 3 behavior traits, 12 type traits, and muscular development in Charolais cattle. Behavior traits, including aggressiveness at parturition, aggressiveness during gestation period, and maternal care, were scored by farmers. Type traits, including udder conformation, teat, feet and legs, and locomotion, were scored by trained classifiers. Data used in the GWAS consisted of 3,274 cows with phenotypic records and genotyping information for 44,930 SNP. When SNP had a false discovery rate (FDR) smaller than 0.05, they were referred to as significant. When SNP had an FDR between 0.05 and 0.20, they were referred to as suggestive. Four significant and 12 suggestive regions were detected for aggressiveness during gestation, maternal care, udder balance, teat thinness, teat length, foot angle, foot depth, and locomotion. These 4 significant and 12 suggestive regions were not supported by other significant SNP in close proximity. No SNP with major effects were detected for behavior and type traits, and SNP associations for these traits were spread across the genome, suggesting that behavior and type traits were influenced by many genes, each explaining a small part of genetic variance. The GWAS identified 1 region on chromosome 2 significantly associated with muscular development, which included the myostatin gene, which is known to affect muscularity. No other regions associated with muscular development were found. Results showed that the myostatin region associated with muscular development had pleiotropic effects on udder volume, teat thinness, rear leg, and leg angle. PMID:27285908
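    The significant/suggestive labelling described above can be sketched as follows. The 0.05 and 0.20 cut-offs come from the abstract; the use of the Benjamini-Hochberg procedure to obtain FDR values, the function names, and the handling of a value exactly at 0.20 are assumptions, since the record does not say how the FDR was computed.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted values (q-values), in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone q-values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        q_values[idx] = running_min
    return q_values


def classify_snp(q_value, significant=0.05, suggestive=0.20):
    """Label one SNP using the thresholds quoted in the abstract."""
    if q_value < significant:
        return "significant"
    if q_value <= suggestive:  # boundary handling is an assumption
        return "suggestive"
    return "not significant"
```

    For example, `[classify_snp(q) for q in benjamini_hochberg(p_values)]` labels every SNP in a list of raw association p-values.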

  12. A global view of 54,001 single nucleotide polymorphisms (SNPs) on the Illumina BovineSNP50 BeadChip and their transferability to water buffalo.

    PubMed

    Michelizzi, Vanessa N; Wu, Xiaolin; Dodson, Michael V; Michal, Jennifer J; Zambrano-Varon, Jorge; McLean, Derek J; Jiang, Zhihua

    2010-01-01

    The Illumina BovineSNP50 BeadChip features 54,001 informative single nucleotide polymorphisms (SNPs) that uniformly span the entire bovine genome. Among them, 52,255 SNPs have locations assigned in the current genome assembly (Btau_4.0), including 19,294 (37%) intragenic SNPs (i.e., located within genes) and 32,961 (63%) intergenic SNPs (i.e., located between genes). While the SNPs represented on the Illumina Bovine50K BeadChip are evenly distributed along each bovine chromosome, there are over 14,000 genes that have no SNPs placed on the current BeadChip. Kernel density estimation, a non-parametric method, was used in the present study to identify SNP-poor and SNP-rich regions on each bovine chromosome. With bandwidth = 0.05 Mb, we observed that most regions have SNP densities within 2 standard deviations of the chromosome SNP density mean. The SNP density on chromosome X was the most dynamic, with more than 30 SNP-rich regions and at least 20 regions with no SNPs. Genotyping ten water buffalo using the Illumina BovineSNP50 BeadChip revealed that 41,870 of the 54,001 SNPs are fully scored on all ten water buffalo, but 6,771 SNPs are partially scored on one to nine animals. Both fully scored and partially scored or unscored SNPs are clearly clustered, in clusters of various sizes, on each chromosome. However, among 43,687 bovine SNPs that were successfully genotyped on nine or ten water buffalo, only 1,159 were polymorphic in the species. These results indicate that the SNP sites, but not the polymorphisms, are conserved between the two species. Overall, our present study provides a solid foundation to further characterize the SNP evolutionary process, thus improving understanding of within- and between-species biodiversity, phylogenetics and adaptation to environmental changes.
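    The density scan described in this record can be illustrated with a small kernel density estimate over SNP positions. The 0.05 Mb bandwidth and the 2-standard-deviation rule come from the abstract; the Gaussian kernel, the evaluation grid and the function names are assumptions made for this sketch, as the abstract does not specify them.

```python
import math
from statistics import mean, pstdev


def gaussian_kde(positions_mb, grid_mb, bandwidth_mb=0.05):
    """Gaussian kernel density of SNP positions, evaluated at each grid point."""
    n = len(positions_mb)
    norm = 1.0 / (n * bandwidth_mb * math.sqrt(2.0 * math.pi))
    return [
        norm * sum(math.exp(-0.5 * ((x - p) / bandwidth_mb) ** 2) for p in positions_mb)
        for x in grid_mb
    ]


def flag_regions(densities, n_sd=2.0):
    """Label grid points whose density lies more than n_sd SDs from the mean."""
    mu = mean(densities)
    sd = pstdev(densities)
    labels = []
    for d in densities:
        if d > mu + n_sd * sd:
            labels.append("SNP-rich")
        elif d < mu - n_sd * sd:
            labels.append("SNP-poor")
        else:
            labels.append("typical")
    return labels
```

    In use, `positions_mb` would hold the mapped SNP coordinates of one chromosome and `grid_mb` a regular grid along it; each chromosome is scanned separately so that the mean and SD are chromosome-specific, as in the abstract.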

  13. Genome-Wide SNP Calling from Genotyping by Sequencing (GBS) Data: A Comparison of Seven Pipelines and Two Sequencing Technologies.

    PubMed

    Torkamaneh, Davoud; Laroche, Jérôme; Belzile, François

    2016-01-01

    Next-generation sequencing (NGS) has revolutionized plant and animal research in many ways including new methods of high throughput genotyping. Genotyping-by-sequencing (GBS) has been demonstrated to be a robust and cost-effective genotyping method capable of producing thousands to millions of SNPs across a wide range of species. Undoubtedly, the greatest barrier to its broader use is the challenge of data analysis. Herein we describe a comprehensive comparison of seven GBS bioinformatics pipelines developed to process raw GBS sequence data into SNP genotypes. We compared five pipelines requiring a reference genome (TASSEL-GBS v1 & v2, Stacks, IGST, and Fast-GBS) and two de novo pipelines that do not require a reference genome (UNEAK and Stacks). Using Illumina sequence data from a set of 24 re-sequenced soybean lines, we performed SNP calling with these pipelines and compared the GBS SNP calls with the re-sequencing data to assess their accuracy. The number of SNPs called without a reference genome was lower (13k to 24k) than with a reference genome (25k to 54k SNPs) while accuracy was high (92.3 to 98.7%) for all but one pipeline (TASSEL-GBSv1, 76.1%). Among pipelines offering a high accuracy (>95%), Fast-GBS called the greatest number of polymorphisms (close to 35,000 SNPs + Indels) and yielded the highest accuracy (98.7%). Using Ion Torrent sequence data for the same 24 lines, we compared the performance of Fast-GBS with that of TASSEL-GBSv2. Fast-GBS again called more polymorphisms (25.8K vs 22.9K) and these proved more accurate (95.2 vs 91.1%). Typically, SNP catalogues called from the same sequencing data using different pipelines resulted in highly overlapping SNP catalogues (79-92% overlap). In contrast, overlap between SNP catalogues obtained using the same pipeline but different sequencing technologies was less extensive (~50-70%). PMID:27547936
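    The two comparisons reported above, catalogue overlap between pipelines and genotype accuracy against re-sequencing data, can be sketched with plain set and dictionary operations. The abstract does not define how overlap or accuracy were computed, so the intersection-over-union overlap, the concordance definition, the key layout (chromosome, position, line) and the function names are all assumptions.

```python
def catalogue_overlap(sites_a, sites_b):
    """Shared SNP sites as a fraction of all sites called by either pipeline."""
    a, b = set(sites_a), set(sites_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def genotype_accuracy(gbs_calls, reference_calls):
    """Fraction of GBS genotypes that match the reference genotype.

    Both arguments map (chromosome, position, line) -> genotype string.
    Only keys present in both data sets are compared.
    """
    shared = [key for key in gbs_calls if key in reference_calls]
    if not shared:
        raise ValueError("no genotype calls in common")
    matches = sum(1 for key in shared if gbs_calls[key] == reference_calls[key])
    return matches / len(shared)
```

    Other overlap definitions (for example, shared sites divided by the smaller catalogue) would give different percentages from the same calls, so the denominator should be stated whenever such figures are compared.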

  14. Genome-Wide SNP Calling from Genotyping by Sequencing (GBS) Data: A Comparison of Seven Pipelines and Two Sequencing Technologies

    PubMed Central

    Torkamaneh, Davoud; Laroche, Jérôme; Belzile, François

    2016-01-01

    Next-generation sequencing (NGS) has revolutionized plant and animal research in many ways including new methods of high throughput genotyping. Genotyping-by-sequencing (GBS) has been demonstrated to be a robust and cost-effective genotyping method capable of producing thousands to millions of SNPs across a wide range of species. Undoubtedly, the greatest barrier to its broader use is the challenge of data analysis. Herein we describe a comprehensive comparison of seven GBS bioinformatics pipelines developed to process raw GBS sequence data into SNP genotypes. We compared five pipelines requiring a reference genome (TASSEL-GBS v1 & v2, Stacks, IGST, and Fast-GBS) and two de novo pipelines that do not require a reference genome (UNEAK and Stacks). Using Illumina sequence data from a set of 24 re-sequenced soybean lines, we performed SNP calling with these pipelines and compared the GBS SNP calls with the re-sequencing data to assess their accuracy. The number of SNPs called without a reference genome was lower (13k to 24k) than with a reference genome (25k to 54k SNPs) while accuracy was high (92.3 to 98.7%) for all but one pipeline (TASSEL-GBSv1, 76.1%). Among pipelines offering a high accuracy (>95%), Fast-GBS called the greatest number of polymorphisms (close to 35,000 SNPs + Indels) and yielded the highest accuracy (98.7%). Using Ion Torrent sequence data for the same 24 lines, we compared the performance of Fast-GBS with that of TASSEL-GBSv2. Fast-GBS again called more polymorphisms (25.8K vs 22.9K) and these proved more accurate (95.2 vs 91.1%). Typically, SNP catalogues called from the same sequencing data using different pipelines resulted in highly overlapping SNP catalogues (79–92% overlap). In contrast, overlap between SNP catalogues obtained using the same pipeline but different sequencing technologies was less extensive (~50–70%). PMID:27547936

  15. Association of the ARL15 rs6450176 SNP and serum lipid levels in the Jing and Han populations

    PubMed Central

    Sun, Jia-Qi; Yin, Rui-Xing; Shi, Guang-Yuan; Shen, Shao-Wen; Chen, Xia; Bin, Yuan; Huang, Feng; Wang, Wei; Lin, Wei-Xiong; Pan, Shang-Ling

    2015-01-01

    The association of ADP-ribosylation factor-like 15 (ARL15) rs6450176 single nucleotide polymorphism (SNP) and serum lipid profiles has never been studied in the Chinese population. The present study was undertaken to detect the association of ARL15 rs6450176 SNP and several environmental factors with serum lipid levels in the Jing and Han populations. Genotypes of the SNP were determined in 726 unrelated subjects of Jing nationality and 726 participants of Han nationality. The genotypic and allelic frequencies of the SNP in Jing but not in Han were different between males and females (P < 0.001 and P < 0.05, respectively). The G allele carriers in Han had lower serum total cholesterol (TC), low-density lipoprotein cholesterol (LDL-C) and apolipoprotein (Apo) B levels, and higher ApoA1/ApoB ratio than the G allele non-carriers (P < 0.05-0.01). The G allele carriers in Jing had lower serum TC, high-density lipoprotein cholesterol (HDL-C), ApoA1, ApoB levels and higher ApoA1/ApoB ratio than the G allele non-carriers (P < 0.05 for all). Subgroup analyses showed that the G allele carriers had lower TC and LDL-C levels in Han males; lower LDL-C and ApoB levels in Han females; lower ApoB levels and ApoA1/ApoB ratio in Jing males; and lower LDL-C levels in Jing females than the G allele non-carriers (P < 0.05-0.01). Multiple linear regression analysis showed that serum TC, LDL-C, ApoB levels and the ApoA1/ApoB ratio in Han; and TC, HDL-C and ApoA1 levels in Jing were correlated with the genotypes of the ARL15 rs6450176 SNP (P < 0.05-0.001). Serum lipid parameters were also associated with several environmental factors in both ethnic groups. These findings indicated that there may be a racial/ethnic- and/or sex-specific association of the ARL15 rs6450176 SNP and serum lipid levels. PMID:26722494

  16. Blood Type Influences Pancreatic Cancer Risk | Division of Cancer Prevention

    Cancer.gov

    A variation in the gene that determines ABO blood type influences the risk of pancreatic cancer, according to the results of the first genome-wide association study (GWAS) for this highly lethal disease. The genetic variation, a single nucleotide polymorphism (SNP), was discovered in a region of chromosome 9 that harbors the gene that determines blood type, the researchers reported August 2 online in Nature Genetics.

  17. The impact of a leptin gene SNP on beef calf weaning weights.

    PubMed

    DeVuyst, E A; Bauer, M L; Cheng, F-C; Mitchell, J; Larson, D

    2008-06-01

    Prior research indicates that a SNP at position 305 of exon 2 in the leptin gene affects milk production in dairy cows. Dairy cows with at least one copy of the T allele have been shown to have higher milk production than CC cows. If that effect carries over to beef breeds, it is reasonable to expect that CT and TT beef cows will wean heavier calves than CC beef cows. We tested this hypothesis for a herd of mixed breed cows using ANOVA. Results indicated that both crossbred CT and TT beef cows wean significantly heavier beef calves than CC crossbred beef cows. A lack of observations generally hinders detection of significance in other breeds. However, two other comparisons were found to be significant. The results suggest further investigation into the link between leptin genotype and calf weaning weights. Aside from interest to animal scientists, these results have the potential to alter mating and replacement selection decisions by cow-calf producers, given the importance of weaning weights on profitability.

  18. Reducing bias of allele frequency estimates by modeling SNP genotype data with informative missingness.

    PubMed

    Lin, Wan-Yu; Liu, Nianjun

    2012-01-01

    The presence of missing single-nucleotide polymorphism (SNP) genotypes is common in genetic studies. For studies with low-density SNPs, the most commonly used approach to dealing with genotype missingness is to simply remove the observations with missing genotypes from the analyses. This naïve method is straightforward but is valid only when the missingness is random. However, a given assay often has a different capability in genotyping heterozygotes and homozygotes, causing the phenomenon of "differential dropout" in the sense that the missing rates of heterozygotes and homozygotes are different. In practice, differential dropout among genotypes exists in even carefully designed studies, such as the data from the HapMap project and the Wellcome Trust Case Control Consortium. Under the assumption of Hardy-Weinberg equilibrium and no genotyping error, we here propose a statistical method to model the differential dropout among different genotypes. Compared with the naïve method, our method provides more accurate allele frequency estimates when the differential dropout is present. To demonstrate its practical use, we further apply our method to the HapMap data and a scleroderma data set. PMID:22719749
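    Why differential dropout biases the naive estimate can be seen with a short calculation. The sketch below assumes Hardy-Weinberg proportions (as the abstract does), no genotyping error, and a single missing rate shared by both homozygote classes; the function name and parameters are illustrative and this is not the authors' estimation method, only the bias it is designed to remove.

```python
def naive_allele_frequency(true_freq: float, miss_hom: float, miss_het: float) -> float:
    """Expected allele frequency estimate after dropping missing genotypes.

    true_freq: true frequency p of allele A
    miss_hom:  missing rate for homozygotes (AA and aa), assumed equal
    miss_het:  missing rate for heterozygotes (Aa)
    """
    p, q = true_freq, 1.0 - true_freq
    # Hardy-Weinberg genotype proportions, thinned by the genotype-specific missing rates.
    obs_AA = p * p * (1.0 - miss_hom)
    obs_Aa = 2.0 * p * q * (1.0 - miss_het)
    obs_aa = q * q * (1.0 - miss_hom)
    total = obs_AA + obs_Aa + obs_aa
    if total == 0:
        raise ValueError("all genotypes are missing")
    # Allele counting on the genotypes that remain.
    return (obs_AA + 0.5 * obs_Aa) / total
```

    When the two missing rates are equal the estimate equals the true frequency, which is the "missing at random" case in which the naive method is valid. When heterozygotes drop out more often than homozygotes, a minor allele (p < 0.5) is underestimated, because heterozygotes carry a larger share of the rarer allele.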

  19. The LASSO and sparse least square regression methods for SNP selection in predicting quantitative traits.

    PubMed

    Feng, Zeny Z; Yang, Xiaojian; Subedi, Sanjeena; McNicholas, Paul D

    2012-01-01

    Recent work concerning quantitative traits of interest has focused on selecting a small subset of single nucleotide polymorphisms (SNPs) from amongst the SNPs responsible for the phenotypic variation of the trait. When considered as covariates, the large number of variables (SNPs) and their association with those in close proximity pose challenges for variable selection. The features of sparsity and shrinkage of regression coefficients of the least absolute shrinkage and selection operator (LASSO) method appear attractive for SNP selection. Sparse partial least squares (SPLS) is also appealing as it combines the features of sparsity in subset selection and dimension reduction to handle correlations amongst SNPs. In this paper we investigate application of the LASSO and SPLS methods for selecting SNPs that predict quantitative traits. We evaluate the performance of both methods with different criteria and under different scenarios using simulation studies. Results indicate that these methods can be effective in selecting SNPs that predict quantitative traits but are limited by some conditions. Both methods perform similarly overall but each exhibits advantages over the other in given situations. Both methods are applied to Canadian Holstein cattle data to compare their performance.
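    The sparsity and shrinkage properties mentioned above both come from the soft-thresholding step at the core of the LASSO. A minimal coordinate-descent sketch is given below, minimising (1/(2n)) * ||y - Xb||^2 + lam * ||b||_1 for centred data without an intercept. The function names, the fixed iteration count and the objective scaling are assumptions for illustration; the paper's own software and tuning are not described in the abstract.

```python
def soft_threshold(value: float, lam: float) -> float:
    """Shrink value toward zero by lam; return exactly 0.0 inside [-lam, lam]."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Cyclic coordinate descent for the LASSO on a list-of-rows matrix X."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # equals y - X @ beta while beta is all zeros
    for _ in range(n_iter):
        for j in range(p):
            column = [row[j] for row in X]
            scale = sum(v * v for v in column) / n
            if scale == 0.0:
                continue  # monomorphic SNP: its coefficient stays at zero
            # Correlation of SNP j with the partial residual that excludes SNP j.
            rho = sum(column[i] * (residual[i] + column[i] * beta[j]) for i in range(n)) / n
            updated = soft_threshold(rho, lam) / scale
            delta = updated - beta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= column[i] * delta
                beta[j] = updated
    return beta
```

    SNPs whose partial correlation with the trait stays below `lam` receive a coefficient of exactly zero, which is the selection behaviour the abstract refers to; the remaining coefficients are shrunk toward zero by the same amount.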

  20. Genome-wide SNP analysis explains coral diversity and recovery in the Ryukyu Archipelago.

    PubMed

    Shinzato, Chuya; Mungpakdee, Sutada; Arakaki, Nana; Satoh, Noriyuki

    2015-12-10

    Following a global coral bleaching event in 1998, Acropora corals surrounding most of Okinawa island (OI) were devastated, although they are now gradually recovering. In contrast, the Kerama Islands (KIs) only 30 km west of OI, have continuously hosted a great variety of healthy corals. Taking advantage of the decoded Acropora digitifera genome and using genome-wide SNP analyses, we clarified Acropora population structure in the southern Ryukyu Archipelago (sRA). Despite small genetic distances, we identified distinct clusters corresponding to specific island groups, suggesting infrequent long-distance dispersal within the sRA. Although the KIs were believed to supply coral larvae to OI, admixture analyses showed that such dispersal is much more limited than previously realized, indicating independent recovery of OI coral populations and the necessity of local conservation efforts for each region. We detected strong historical migration from the Yaeyama Islands (YIs) to OI, and suggest that the YIs are the original source of OI corals. In addition, migration edges to the KIs suggest that they are a historical sink population in the sRA, resulting in high diversity. This population genomics study provides the highest resolution data to date regarding coral population structure and history.

  1. Purifying selection shapes the coincident SNP distribution of primate coding sequences.

    PubMed

    Chen, Chia-Ying; Hung, Li-Yuan; Wu, Chan-Shuo; Chuang, Trees-Juen

    2016-01-01

    Genome-wide analysis has observed an excess of coincident single nucleotide polymorphisms (coSNPs) at human-chimpanzee orthologous positions, and suggested that this is due to cryptic variation in the mutation rate. While this phenomenon primarily corresponds with non-coding coSNPs, the situation in coding sequences remains unclear. Here we calculate the observed-to-expected ratio of coSNPs (coSNP(O/E)) to estimate the prevalence of human-chimpanzee coSNPs, and show that the excess of coSNPs is also present in coding regions. Intriguingly, coSNP(O/E) is much higher at zero-fold than at nonzero-fold degenerate sites; such a difference is due to an elevation of coSNP(O/E) at zero-fold degenerate sites, rather than a reduction at nonzero-fold degenerate ones. These trends are independent of chimpanzee subpopulation, population size, or sequencing techniques, and hold in broad generality across primates. We find that this discrepancy cannot be fully explained by sequence contexts, shared ancestral polymorphisms, SNP density, and recombination rate, and that coSNP(O/E) in coding sequences is significantly influenced by purifying selection. We also show that selection and mutation rate affect coSNP(O/E) independently, and coSNPs tend to be less damaging and more correlated with human diseases than non-coSNPs. These results suggest that coSNPs may represent a "signature" during primate protein evolution. PMID:27255481
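    An observed-to-expected ratio of this kind can be sketched under the simplest null model, in which SNPs in the two species fall independently and uniformly across the orthologous sites. That null model, the function name and the inputs are assumptions; the abstract does not give the authors' exact definition of the expectation, which may additionally condition on sequence context or site class.

```python
def cosnp_observed_to_expected(human_snp_sites, chimp_snp_sites, n_orthologous_sites: int) -> float:
    """Observed / expected count of coincident SNPs under independence.

    human_snp_sites, chimp_snp_sites: iterables of orthologous site identifiers
    n_orthologous_sites: total number of aligned sites considered
    """
    human = set(human_snp_sites)
    chimp = set(chimp_snp_sites)
    observed = len(human & chimp)
    # Under independence, P(site is a SNP in both) = (h / N) * (c / N),
    # so the expected number of coincident sites is h * c / N.
    expected = len(human) * len(chimp) / n_orthologous_sites
    if expected == 0:
        raise ValueError("expected count is zero; ratio is undefined")
    return observed / expected
```

    A ratio above 1 indicates more coincident SNPs than independence predicts. Computing it separately for zero-fold and nonzero-fold degenerate sites would mirror the comparison described in the abstract.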

  2. Genome-wide SNP analysis explains coral diversity and recovery in the Ryukyu Archipelago

    PubMed Central

    Shinzato, Chuya; Mungpakdee, Sutada; Arakaki, Nana; Satoh, Noriyuki

    2015-01-01

    Following a global coral bleaching event in 1998, Acropora corals surrounding most of Okinawa island (OI) were devastated, although they are now gradually recovering. In contrast, the Kerama Islands (KIs) only 30 km west of OI, have continuously hosted a great variety of healthy corals. Taking advantage of the decoded Acropora digitifera genome and using genome-wide SNP analyses, we clarified Acropora population structure in the southern Ryukyu Archipelago (sRA). Despite small genetic distances, we identified distinct clusters corresponding to specific island groups, suggesting infrequent long-distance dispersal within the sRA. Although the KIs were believed to supply coral larvae to OI, admixture analyses showed that such dispersal is much more limited than previously realized, indicating independent recovery of OI coral populations and the necessity of local conservation efforts for each region. We detected strong historical migration from the Yaeyama Islands (YIs) to OI, and suggest that the YIs are the original source of OI corals. In addition, migration edges to the KIs suggest that they are a historical sink population in the sRA, resulting in high diversity. This population genomics study provides the highest resolution data to date regarding coral population structure and history. PMID:26656261

  3. Identifying main effects and epistatic interactions from large-scale SNP data via adaptive group Lasso

    PubMed Central

    2010-01-01

    Background: Single nucleotide polymorphism (SNP) based association studies aim at identifying SNPs associated with phenotypes, for example, complex diseases. The associated SNPs may influence the disease risk individually (main effects) or behave jointly (epistatic interactions). For the analysis of high-throughput data, the main difficulty is that the number of SNPs far exceeds the number of samples. This difficulty is amplified when identifying interactions. Results: In this paper, we propose an Adaptive Group Lasso (AGL) model for large-scale association studies. Our model enables us to analyze SNPs and their interactions simultaneously. We achieve this by introducing a sparsity constraint in our model based on the fact that only a small fraction of SNPs is disease-associated. In order to reduce the number of false positive findings, we develop an adaptive reweighting scheme to enhance sparsity. In addition, our method treats SNPs and their interactions as factors, and identifies them in a grouped manner. Thus, it is flexible enough to analyze various disease models, especially for interaction detection. However, because of the intensive computation required when millions of interaction terms need to be searched during model fitting, our method needs to be combined with filtering methods when applied to genome-wide data for detecting interactions. Conclusion: By using a wide range of simulated datasets and a real dataset from WTCCC, we demonstrate the advantages of our method. PMID:20122189
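
    The grouped sparsity idea in this abstract can be sketched with the group soft-thresholding step that underlies group-Lasso-type penalties, using a per-group weight to stand in for adaptive reweighting. This is not the authors' AGL algorithm: the function names, the inverse-norm weighting rule, and the value of `lam` are illustrative assumptions.

```python
import math


def group_soft_threshold(coefs, threshold):
    """Shrink a whole coefficient group; zero it if its norm <= threshold."""
    norm = math.sqrt(sum(c * c for c in coefs))
    if norm <= threshold:
        return [0.0] * len(coefs)
    scale = 1.0 - threshold / norm
    return [scale * c for c in coefs]


def adaptive_weights(initial_groups, eps=1e-8):
    """Assumed reweighting rule: inverse Euclidean norm of an initial estimate.

    Groups with small initial estimates get large weights and are
    penalized more heavily, which encourages sparsity.
    """
    return [1.0 / (math.sqrt(sum(c * c for c in g)) + eps)
            for g in initial_groups]


# Two hypothetical factor groups: a strong one and a weak one.
groups = [[3.0, 4.0], [0.1, 0.0]]
weights = adaptive_weights(groups)
lam = 0.5  # hypothetical penalty strength
shrunk = [group_soft_threshold(g, lam * w) for g, w in zip(groups, weights)]
print(shrunk)
```

    In this toy example the weak group is removed as a whole while the strong group is only slightly shrunk, which is the behavior a grouped, reweighted penalty is meant to produce.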

  4. Development and application of a novel genome-wide SNP array reveals domestication history in soybean.

    PubMed

    Wang, Jiao; Chu, Shanshan; Zhang, Huairen; Zhu, Ying; Cheng, Hao; Yu, Deyue

    2016-02-09

    Domestication of soybeans occurred under intense human-directed selection aimed at developing high-yielding lines. Tracing the domestication history and identifying the genes underlying soybean domestication require further exploration. Here, we developed a high-throughput NJAU 355 K SoySNP array and used this array to study the genetic variation patterns in 367 soybean accessions, including 105 wild soybeans and 262 cultivated soybeans. The population genetic analysis suggests that cultivated soybeans have tended to originate from northern and central China, from where they spread to other regions, accompanied by a gradual increase in seed weight. Genome-wide scanning for evidence of artificial selection revealed signs of selective sweeps involving genes controlling domestication-related agronomic traits including seed weight. To further identify genomic regions related to seed weight, a genome-wide association study (GWAS) was conducted across multiple environments in wild and cultivated soybeans. As a result, a strong linkage disequilibrium region on chromosome 20 was found to be significantly correlated with seed weight in cultivated soybeans. Collectively, these findings should provide an important basis for genomic-enabled breeding and advance the study of functional genomics in soybean.

  5. The RNAsnp web server: predicting SNP effects on local RNA secondary structure.

    PubMed

    Sabarinathan, Radhakrishnan; Tafer, Hakim; Seemann, Stefan E; Hofacker, Ivo L; Stadler, Peter F; Gorodkin, Jan

    2013-07-01

    The function of many non-coding RNA genes and cis-regulatory elements of messenger RNA largely depends on the structure, which is in turn determined by their sequence. Single nucleotide polymorphisms (SNPs) and other mutations may disrupt the RNA structure, interfere with the molecular function and hence cause a phenotypic effect. RNAsnp is an efficient method to predict the effect of SNPs on local RNA secondary structure based on the RNA folding algorithms implemented in the Vienna RNA package. The SNP effects are quantified in terms of empirical P-values, which, for computational efficiency, are derived from extensive pre-computed tables of distributions of substitution effects as a function of gene length and GC content. Here, we present a web service that not only provides an interface for RNAsnp but also features a graphical output representation. In addition, the web server is connected to a local mirror of the UCSC genome browser database that enables the users to select the genomic sequences for analysis and visualize the results directly in the UCSC genome browser. The RNAsnp web server is freely available at: http://rth.dk/resources/rnasnp/.

  6. Comparative analysis of SNP candidates in disparate milk yielding river buffaloes using targeted sequencing

    PubMed Central

    2016-01-01

    River buffalo (Bubalus bubalis) milk plays an important role in the economy and in nutritious diets of several developing countries. However, reliable milk-yield genomic markers and their functional insights remain unexposed. Here, we have used a target capture sequencing approach in three economically important buffalo breeds, namely Banni, Jafrabadi and Mehsani, belonging to either the high or low milk-yield group. Blood samples were collected from the milk-yield/breed balanced group of 12 buffaloes, and whole exome sequencing was performed using a Roche 454 GS-FLX Titanium sequencer. Using an innovative approach, namely MultiCom, we have identified high-quality SNPs specific for high and low milk-yield buffaloes. Almost 70% of the reported genes in QTL regions of milk-yield and milk-fat in cattle were present among the buffalo milk-yield gene candidates. Functional analysis highlighted the transcriptional regulation category in the low milk-yield group, and several new pathways in the two groups. Further, the discovered SNP candidates may account for more than half of mammary transcriptome changes in high versus low milk-yielding cattle. Thus, starting from the design of a reliable strategy, we identified reliable genomic markers specific for high and low milk-yield buffalo breeds and addressed possible downstream effects. PMID:27441113

  7. Ancestry informative marker panels for African Americans based on subsets of commercially available SNP arrays.

    PubMed

    Tandon, Arti; Patterson, Nick; Reich, David

    2011-01-01

    Admixture mapping is a widely used method for localizing disease genes in African Americans. Most current methods for inferring ancestry at each locus in the genome use a few thousand single nucleotide polymorphisms (SNPs) that are very different in frequency between West Africans and European Americans, and that are required to not be in linkage disequilibrium in the ancestral populations. Modern SNP arrays provide data on hundreds of thousands of SNPs per sample, and to use these to infer ancestry, using many of the standard methods, it is necessary to choose subsets of the SNPs for analysis. Here we present panels of about 4,300 ancestry informative markers (AIMs) that are subsets respectively of SNPs on the Illumina 1 M, Illumina 650, Illumina 610, Affymetrix 6.0 and Affymetrix 5.0 arrays. To validate the usefulness of these panels, we applied them to samples that are different from the ones used to select the SNPs. The panels provide about 80% of the maximum information about African or European ancestry, even with up to 10% missing data.

  8. Primers to amplify SNP markers in Epichloë canadensis (Clavicipitaceae)

    PubMed Central

    Sullivan, Terrence J.; Bultman, Thomas L.; Schoolcraft, Jennifer

    2016-01-01

    Premise of the study: Primers were designed to produce short amplicons containing single-nucleotide polymorphisms (SNPs) in β-tubulin (tubB) and translation elongation factor 1-α (tefA) in Epichloë canadensis (Clavicipitaceae), an endophytic fungus of Elymus canadensis (Poaceae). Methods and Results: Primers to amplify regions of tubB and tefA containing suspected SNPs were designed and tested on individuals from six populations. Two tubB alleles were identified that differed by a single SNP, and three tefA alleles were identified that differed by a combination of two SNPs. All six populations tested were polymorphic for the tefA marker, and three of the populations were also polymorphic for the tubB marker. These primers are also predicted to amplify these regions in 11 additional epichloid species. Conclusions: Primers for short amplicons within tubB and tefA genes can be used to successfully genotype E. canadensis, making them useful markers for population genetic or landscape genomic studies. PMID:27011893

  9. Prospective diagnostic analysis of copy number variants using SNP microarrays in individuals with autism spectrum disorders

    PubMed Central

    Nava, Caroline; Keren, Boris; Mignot, Cyril; Rastetter, Agnès; Chantot-Bastaraud, Sandra; Faudet, Anne; Fonteneau, Eric; Amiet, Claire; Laurent, Claudine; Jacquette, Aurélia; Whalen, Sandra; Afenjar, Alexandra; Périsse, Didier; Doummar, Diane; Dorison, Nathalie; Leboyer, Marion; Siffroi, Jean-Pierre; Cohen, David; Brice, Alexis; Héron, Delphine; Depienne, Christel

    2014-01-01

    Copy number variants (CNVs) have repeatedly been found to cause or predispose to autism spectrum disorders (ASDs). For diagnostic purposes, we screened 194 individuals with ASDs for CNVs using Illumina SNP arrays. In several probands, we also analyzed candidate genes located in inherited deletions to unmask autosomal recessive variants. Three CNVs, a de novo triplication of chromosome 15q11–q12 of paternal origin, a deletion on chromosome 9p24 and a de novo 3q29 deletion, were identified as the cause of the disorder in one individual each. An autosomal recessive cause was considered possible in two patients: a homozygous 1p31.1 deletion encompassing PTGER3 and a deletion of the entire DOCK10 gene associated with a rare hemizygous missense variant. We also identified multiple private or recurrent CNVs, the majority of which were inherited from asymptomatic parents. Although highly penetrant CNVs or variants inherited in an autosomal recessive manner were detected in rare cases, our results mainly support the hypothesis that most CNVs contribute to ASDs in association with other CNVs or point variants located elsewhere in the genome. Identification of these genetic interactions in individuals with ASDs constitutes a formidable challenge. PMID:23632794

  10. Prospective diagnostic analysis of copy number variants using SNP microarrays in individuals with autism spectrum disorders.

    PubMed

    Nava, Caroline; Keren, Boris; Mignot, Cyril; Rastetter, Agnès; Chantot-Bastaraud, Sandra; Faudet, Anne; Fonteneau, Eric; Amiet, Claire; Laurent, Claudine; Jacquette, Aurélia; Whalen, Sandra; Afenjar, Alexandra; Périsse, Didier; Doummar, Diane; Dorison, Nathalie; Leboyer, Marion; Siffroi, Jean-Pierre; Cohen, David; Brice, Alexis; Héron, Delphine; Depienne, Christel

    2014-01-01

    Copy number variants (CNVs) have repeatedly been found to cause or predispose to autism spectrum disorders (ASDs). For diagnostic purposes, we screened 194 individuals with ASDs for CNVs using Illumina SNP arrays. In several probands, we also analyzed candidate genes located in inherited deletions to unmask autosomal recessive variants. Three CNVs, a de novo triplication of chromosome 15q11-q12 of paternal origin, a deletion on chromosome 9p24 and a de novo 3q29 deletion, were identified as the cause of the disorder in one individual each. An autosomal recessive cause was considered possible in two patients: a homozygous 1p31.1 deletion encompassing PTGER3 and a deletion of the entire DOCK10 gene associated with a rare hemizygous missense variant. We also identified multiple private or recurrent CNVs, the majority of which were inherited from asymptomatic parents. Although highly penetrant CNVs or variants inherited in an autosomal recessive manner were detected in rare cases, our results mainly support the hypothesis that most CNVs contribute to ASDs in association with other CNVs or point variants located elsewhere in the genome. Identification of these genetic interactions in individuals with ASDs constitutes a formidable challenge. PMID:23632794

  11. Genome-wide SNP analysis explains coral diversity and recovery in the Ryukyu Archipelago.

    PubMed

    Shinzato, Chuya; Mungpakdee, Sutada; Arakaki, Nana; Satoh, Noriyuki

    2015-01-01

    Following a global coral bleaching event in 1998, Acropora corals surrounding most of Okinawa island (OI) were devastated, although they are now gradually recovering. In contrast, the Kerama Islands (KIs) only 30 km west of OI, have continuously hosted a great variety of healthy corals. Taking advantage of the decoded Acropora digitifera genome and using genome-wide SNP analyses, we clarified Acropora population structure in the southern Ryukyu Archipelago (sRA). Despite small genetic distances, we identified distinct clusters corresponding to specific island groups, suggesting infrequent long-distance dispersal within the sRA. Although the KIs were believed to supply coral larvae to OI, admixture analyses showed that such dispersal is much more limited than previously realized, indicating independent recovery of OI coral populations and the necessity of local conservation efforts for each region. We detected strong historical migration from the Yaeyama Islands (YIs) to OI, and suggest that the YIs are the original source of OI corals. In addition, migration edges to the KIs suggest that they are a historical sink population in the sRA, resulting in high diversity. This population genomics study provides the highest resolution data to date regarding coral population structure and history. PMID:26656261

  12. De Novo SNP Discovery in the Scandinavian Brown Bear (Ursus arctos)

    PubMed Central

    Norman, Anita J.; Street, Nathaniel R.; Spong, Göran

    2013-01-01

    Information about relatedness between individuals in wild populations is advantageous when studying evolutionary, behavioural and ecological processes. Genomic data can be used to determine relatedness between individuals either when no prior knowledge exists or to confirm suspected relatedness. Here we present a set of 96 SNPs suitable for inferring relatedness for brown bears (Ursus arctos) within Scandinavia. We sequenced reduced representation libraries from nine individuals throughout the geographic range. With consensus reads containing putative SNPs, we applied strict filtering criteria with the aim of finding only high-quality, highly-informative SNPs. We tested 150 putative SNPs of which 96% were validated on a panel of 68 individuals. Ninety-six of the validated SNPs with the highest minor allele frequency were selected. The final SNP panel includes four mitochondrial markers, two monomorphic Y-chromosome sex-determination markers, three X-chromosome SNPs and 87 autosomal SNPs. From our validation sample panel, we identified two previously known parent-offspring dyads with reasonable accuracy. This panel of SNPs is a promising tool for inferring relatedness in the brown bear population in Scandinavia. PMID:24260529
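
    The counts in this abstract can be cross-checked with a few lines of arithmetic. The sketch below treats the reported 96% validation rate as a rounded figure, so the derived number of validated SNPs is approximate; the variable names are illustrative. Note that "96%" (the validation rate) and "96" (the final panel size) coincide numerically but are different quantities.

```python
# Figures quoted in the abstract above.
putative_snps = 150
validation_rate = 0.96  # reported as a percentage; presumably rounded

# Approximate number of putative SNPs that validated.
validated_snps = round(putative_snps * validation_rate)

# Composition of the final panel as listed in the abstract.
panel = {
    "mitochondrial": 4,
    "Y-chromosome (sex determination)": 2,
    "X-chromosome": 3,
    "autosomal": 87,
}
panel_size = sum(panel.values())

print(validated_snps, panel_size)
```

    Under these figures about 144 SNPs validated, and the four listed marker classes add up to the stated panel of 96.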

  13. Purifying selection shapes the coincident SNP distribution of primate coding sequences

    PubMed Central

    Chen, Chia-Ying; Hung, Li-Yuan; Wu, Chan-Shuo; Chuang, Trees-Juen

    2016-01-01

    Genome-wide analysis has observed an excess of coincident single nucleotide polymorphisms (coSNPs) at human-chimpanzee orthologous positions, and suggested that this is due to cryptic variation in the mutation rate. While this phenomenon primarily corresponds with non-coding coSNPs, the situation in coding sequences remains unclear. Here we calculate the observed-to-expected ratio of coSNPs (coSNPO/E) to estimate the prevalence of human-chimpanzee coSNPs, and show that the excess of coSNPs is also present in coding regions. Intriguingly, coSNPO/E is much higher at zero-fold than at nonzero-fold degenerate sites; such a difference is due to an elevation of coSNPO/E at zero-fold degenerate sites, rather than a reduction at nonzero-fold degenerate ones. These trends are independent of chimpanzee subpopulation, population size, or sequencing techniques, and hold in broad generality across primates. We find that this discrepancy cannot be fully explained by sequence contexts, shared ancestral polymorphisms, SNP density, or recombination rate, and that coSNPO/E in coding sequences is significantly influenced by purifying selection. We also show that selection and mutation rate affect coSNPO/E independently, and coSNPs tend to be less damaging and more correlated with human diseases than non-coSNPs. These results suggest that coSNPs may represent a “signature” during primate protein evolution. PMID:27255481

  14. Genomic relationships computed from either next-generation sequence or array SNP data.

    PubMed

    Pérez-Enciso, M

    2014-04-01

    The use of sequence data in genomic prediction models is a topic of high interest, given the decreasing prices of current 'next'-generation sequencing technologies (NGS) and the theoretical possibility of directly interrogating the genomes for all causal mutations. Here, we compare by simulation how well genetic relationships (G) could be estimated using either NGS or ascertained SNP arrays. DNA sequences were simulated using the coalescence according to two scenarios: a 'cattle' scenario that consisted of a bottleneck followed by a split in two breeds without migration, and a 'pig' model where Chinese introgression into international pig breeds was simulated. We found that introgression results in a large amount of variability across the genome and between individuals, both in differentiation and in diversity. In general, NGS data allowed the most accurate estimates of G, provided enough sequencing depth was available, because shallow NGS (4×) may result in highly distorted estimates of G elements, especially if not standardized by allele frequency. However, high-density genotyping can also result in accurate estimates of G. Given that genotyping is much less noisy than NGS data, it is suggested that specific high-density arrays (~3M SNPs) that minimize the effects of ascertainment could be developed in the population of interest by sequencing the most influential animals and rely on those arrays for implementing genomic selection.
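
    The abstract refers to estimating a genomic relationship matrix G and to standardizing by allele frequency. A minimal sketch of one widely used formulation follows, in which each marker is centered by twice its allele frequency and scaled by its expected variance 2p(1-p). Whether this matches the estimator used in the study is an assumption; the function name, the 0/1/2 genotype coding, and the use of in-sample allele frequencies are illustrative choices.

```python
def genomic_relationship(genotypes):
    """Genomic relationship matrix from 0/1/2 allele counts.

    genotypes: list of individuals, each a list of allele counts per SNP.
    Assumption: allele frequencies are estimated from the same sample,
    and monomorphic markers are skipped because their variance is zero.
    """
    n = len(genotypes)
    m = len(genotypes[0])
    freqs = [sum(ind[k] for ind in genotypes) / (2 * n) for k in range(m)]
    used = [k for k in range(m) if 0.0 < freqs[k] < 1.0]
    if not used:
        raise ValueError("no polymorphic markers")
    G = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            total = 0.0
            for k in used:
                p = freqs[k]
                zi = genotypes[i][k] - 2 * p
                zj = genotypes[j][k] - 2 * p
                total += zi * zj / (2 * p * (1 - p))
            G[i][j] = total / len(used)
    return G


# Three hypothetical individuals genotyped at three SNPs.
geno = [[0, 1, 2], [2, 1, 0], [1, 1, 1]]
G = genomic_relationship(geno)
for row in G:
    print(row)
```

    With noisy genotype calls, as in shallow sequencing, the per-marker scaling can amplify errors at low-frequency markers, which is one intuition for why depth matters in the comparison the abstract describes.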

  15. Design and synthesis of the superionic conductor Na10SnP2S12

    PubMed Central

    Richards, William D.; Tsujimura, Tomoyuki; Miara, Lincoln J.; Wang, Yan; Kim, Jae Chul; Ong, Shyue Ping; Uechi, Ichiro; Suzuki, Naoki; Ceder, Gerbrand

    2016-01-01

    Sodium-ion batteries are emerging as candidates for large-scale energy storage due to their low cost and the wide variety of cathode materials available. As battery size and adoption in critical applications increases, safety concerns are resurfacing due to the inherent flammability of organic electrolytes currently in use in both lithium and sodium battery chemistries. Development of solid-state batteries with ionic electrolytes eliminates this concern, while also allowing novel device architectures and potentially improving cycle life. Here we report the computation-assisted discovery and synthesis of a high-performance solid-state electrolyte material: Na10SnP2S12, with room temperature ionic conductivity of 0.4 mS cm−1 rivalling the conductivity of the best sodium sulfide solid electrolytes to date. We also computationally investigate the variants of this compound where tin is substituted by germanium or silicon and find that the latter may achieve even higher conductivity. PMID:26984102

  16. Design and synthesis of the superionic conductor Na10SnP2S12.

    PubMed

    Richards, William D; Tsujimura, Tomoyuki; Miara, Lincoln J; Wang, Yan; Kim, Jae Chul; Ong, Shyue Ping; Uechi, Ichiro; Suzuki, Naoki; Ceder, Gerbrand

    2016-01-01

    Sodium-ion batteries are emerging as candidates for large-scale energy storage due to their low cost and the wide variety of cathode materials available. As battery size and adoption in critical applications increases, safety concerns are resurfacing due to the inherent flammability of organic electrolytes currently in use in both lithium and sodium battery chemistries. Development of solid-state batteries with ionic electrolytes eliminates this concern, while also allowing novel device architectures and potentially improving cycle life. Here we report the computation-assisted discovery and synthesis of a high-performance solid-state electrolyte material: Na10SnP2S12, with room temperature ionic conductivity of 0.4 mS cm(-1) rivalling the conductivity of the best sodium sulfide solid electrolytes to date. We also computationally investigate the variants of this compound where tin is substituted by germanium or silicon and find that the latter may achieve even higher conductivity. PMID:26984102

  17. Four-copy number intervals in SNP microarray analysis: unique patterns and positions.

    PubMed

    Papenhausen, Peter R; Kelly, Carla A; Zvereff, Val; Schwartz, Stuart

    2014-01-01

    Over the past several years, the utility of microarray technology in delineating copy number changes has become well established. In the past 4 years, we have used the SNP array to detect and analyze allele ratios in 150 cases with 4-copy intervals, confirmed by FISH, offering insight into the underlying mechanisms of formation. These cases may be divided into 5 allele patterns--the first 4 of which involve a single homologue--as detected by the genotyping aspects of the microarray: (1) triplications combining homozygous and heterozygous alleles, with a 3:1 ratio of heterozygotes; (2) triplications with allele patterns combining homozygous and heterozygous alleles, with heterozygote ratios of both 3:1 and 2:2; (3) triplications that have homozygous alleles combined with only 2:2 heterozygous alleles; (4) triplications that are completely homozygous; and (5) homozygous duplications on each homologue with no heterozygous alleles. The implications of copy number variants with diverse allelic segregations are presented in this study. PMID:25401283
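
    The heterozygote ratios named in this abstract (3:1 and 2:2 within a 4-copy interval) correspond to distinct expected allele fractions on a SNP array. The sketch below simply computes those fractions; it assumes an idealized, noise-free signal, and the pattern labels and function name are illustrative rather than taken from the study.

```python
from fractions import Fraction


def b_allele_fraction(a_copies, b_copies):
    """Expected fraction of B alleles given copy counts of alleles A and B."""
    total = a_copies + b_copies
    if total == 0:
        raise ValueError("at least one allele copy is required")
    return Fraction(b_copies, total)


# Possible genotypes at a SNP inside a 4-copy interval.
patterns = {
    "AAAB (3:1)": (3, 1),
    "AABB (2:2)": (2, 2),
    "ABBB (1:3)": (1, 3),
    "BBBB (homozygous)": (0, 4),
}
fractions_by_pattern = {
    name: b_allele_fraction(a, b) for name, (a, b) in patterns.items()
}
for name, frac in fractions_by_pattern.items():
    print(name, float(frac))
```

    Which of these fractions appear together across an interval is what distinguishes the allele patterns listed in the abstract.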

  18. AncestrySNPminer: A bioinformatics tool to retrieve and develop ancestry informative SNP panels

    PubMed Central

    Amirisetty, Sushil; Khurana Hershey, Gurjit K.; Baye, Tesfaye M.

    2012-01-01

    A wealth of genomic information is available in public and private databases. However, this information is underutilized for uncovering population-specific and functionally relevant markers underlying complex human traits. Given the huge amount of SNP data available from the annotation of human genetic variation, data mining is a faster and more cost-effective approach for investigating the number of SNPs that are informative for ancestry. In this study, we present AncestrySNPminer, the first web-based bioinformatics tool specifically designed to retrieve Ancestry Informative Markers (AIMs) from genomic data sets and link these informative markers to genes and ontological annotation classes. The tool includes an automated and simple “scripting at the click of a button” functionality that enables researchers to perform various population genomics statistical analysis methods with user-friendly querying and filtering of data sets across various populations through a single web interface. AncestrySNPminer can be freely accessed at https://research.cchmc.org/mershalab/AncestrySNPminer/login.php. PMID:22584067

  19. Identification of a functional SNP in the 3'-UTR of caprine MTHFR gene that is associated with milk protein levels.

    PubMed

    An, Xiaopeng; Song, Yuxuan; Hou, Jinxing; Wang, Shan; Gao, Kexin; Cao, Binyun

    2016-08-01

    Xinong Saanen (n = 305) and Guanzhong (n = 317) dairy goats were used to detect SNPs in the caprine MTHFR 3'-UTR by DNA sequencing. One novel SNP (c.*2494G>A) was identified in the said region. Individuals with the AA genotype had greater milk protein levels than did those with the GG genotype at the c.*2494G>A locus in both dairy goat breeds (P < 0.05). Functional assays indicated that the MTHFR:c.2494G>A substitution could increase the binding activity of bta-miR-370 with the MTHFR 3'-UTR. In addition, we observed a significant increase in the MTHFR protein level of AA carriers relative to that of GG carriers. These altered levels of MTHFR protein may account for the association of the SNP with milk protein level. PMID:27062401

  20. Comparative Analysis of CNV Calling Algorithms: Literature Survey and a Case Study Using Bovine High-Density SNP Data

    PubMed Central

    Xu, Lingyang; Hou, Yali; Bickhart, Derek M.; Song, Jiuzhou; Liu, George E.

    2013-01-01

    Copy number variations (CNVs) are gains and losses of genomic sequence between two individuals of a species when compared to a reference genome. The data from single nucleotide polymorphism (SNP) microarrays are now routinely used for genotyping, but they also can be utilized for copy number detection. Substantial progress has been made in array design and CNV calling algorithms and at least 10 comparison studies in humans have been published to assess them. In this review, we first survey the literature on existing microarray platforms and CNV calling algorithms. We then examine a number of CNV calling tools to evaluate their impacts using bovine high-density SNP data. Large incongruities in the results from different CNV calling tools highlight the need for standardizing array data collection, quality assessment and experimental validation. Only after careful experimental design and rigorous data filtering can the impacts of CNVs on both normal phenotypic variability and disease susceptibility be fully revealed.

  1. SNP Analysis and Whole Exome Sequencing: Their Application in the Analysis of a Consanguineous Pedigree Segregating Ataxia

    PubMed Central

    Nickerson, Sarah L.; Marquis-Nicholson, Renate; Claxton, Karen; Ashton, Fern; Leong, Ivone U. S.; Prosser, Debra O.; Love, Jennifer M.; George, Alice M.; Taylor, Graham; Wilson, Callum; McKinlay Gardner, R. J.; Love, Donald R.

    2015-01-01

    Autosomal recessive cerebellar ataxia encompasses a large and heterogeneous group of neurodegenerative disorders. We employed single nucleotide polymorphism (SNP) analysis and whole exome sequencing to investigate a consanguineous Maori pedigree segregating ataxia. We identified a novel mutation in exon 10 of the SACS gene: c.7962T>G p.(Tyr2654*), establishing the diagnosis of autosomal recessive spastic ataxia of Charlevoix-Saguenay (ARSACS). Our findings expand both the genetic and phenotypic spectrum of this rare disorder, and highlight the value of high-density SNP analysis and whole exome sequencing as powerful and cost-effective tools in the diagnosis of genetically heterogeneous disorders such as the hereditary ataxias. PMID:27600236

  2. Regulatory Variants and Disease: The E-Cadherin −160C/A SNP as an Example

    PubMed Central

    Li, Gongcheng; Pan, Tiejun; Guo, Dan

    2014-01-01

    Single nucleotide polymorphisms (SNPs) occurring in noncoding sequences have largely been ignored in genome-wide association studies (GWAS). Yet mounting evidence suggests that many noncoding SNPs, especially those in the vicinity of protein-coding genes, play important roles in shaping chromatin structure and regulating gene expression and, as such, are implicated in a wide variety of diseases. One such regulatory SNP (rSNP) is the E-cadherin (CDH1) promoter −160C/A SNP (rs16260), which is known to affect E-cadherin promoter transcription by displacing transcription factor binding and has been extensively scrutinized for its association with several diseases, especially malignancies. Findings from studying this SNP highlight the clinical relevance of rSNPs and justify their inclusion in future GWAS to identify novel disease-causing SNPs. PMID:25276428

  3. Identification of a functional SNP in the 3'-UTR of caprine MTHFR gene that is associated with milk protein levels.

    PubMed

    An, Xiaopeng; Song, Yuxuan; Hou, Jinxing; Wang, Shan; Gao, Kexin; Cao, Binyun

    2016-08-01

    Xinong Saanen (n = 305) and Guanzhong (n = 317) dairy goats were used to detect SNPs in the caprine MTHFR 3'-UTR by DNA sequencing. One novel SNP (c.*2494G>A) was identified in the said region. Individuals with the AA genotype had greater milk protein levels than did those with the GG genotype at the c.*2494G>A locus in both dairy goat breeds (P < 0.05). Functional assays indicated that the MTHFR:c.2494G>A substitution could increase the binding activity of bta-miR-370 with the MTHFR 3'-UTR. In addition, we observed a significant increase in the MTHFR protein level of AA carriers relative to that of GG carriers. These altered levels of MTHFR protein may account for the association of the SNP with milk protein level.

  4. Pedigree- and SNP-Associated Genetics and Recent Environment are the Major Contributors to Anthropometric and Cardiometabolic Trait Variation.

    PubMed

    Xia, Charley; Amador, Carmen; Huffman, Jennifer; Trochet, Holly; Campbell, Archie; Porteous, David; Hastie, Nicholas D; Hayward, Caroline; Vitart, Veronique; Navarro, Pau; Haley, Chris S

    2016-02-01

    Genome-wide association studies have successfully identified thousands of loci for a range of human complex traits and diseases. The proportion of phenotypic variance explained by significant associations is, however, limited. Given the same dense SNP panels, mixed model analyses capture a greater proportion of phenotypic variance than single SNP analyses but the total is generally still less than the genetic variance estimated from pedigree studies. Combining information from pedigree relationships and SNPs, we examined 16 complex anthropometric and cardiometabolic traits in a Scottish family-based cohort comprising up to 20,000 individuals genotyped for ~520,000 common autosomal SNPs. The inclusion of related individuals provides the opportunity to also estimate the genetic variance associated with pedigree as well as the effects of common family environment. Trait variation was partitioned into SNP-associated and pedigree-associated genetic variation, shared nuclear family environment, shared couple (partner) environment and shared full-sibling environment. Results demonstrate that trait heritabilities vary widely but, on average across traits, SNP-associated and pedigree-associated genetic effects each explain around half the genetic variance. For most traits the recently-shared environment of couples is also significant, accounting for ~11% of the phenotypic variance on average. On the other hand, the environment shared largely in the past by members of a nuclear family or by full-siblings, has a more limited impact. Our findings point to appropriate models to use in future studies as pedigree-associated genetic effects and couple environmental effects have seldom been taken into account in genotype-based analyses. Appropriate description of the trait variation could help understand causes of intra-individual variation and in the detection of contributing loci and environmental factors.

  5. Development of high-density SNP genotyping arrays for white spruce (Picea glauca) and transferability to subtropical and nordic congeners.

    PubMed

    Pavy, Nathalie; Gagnon, France; Rigault, Philippe; Blais, Sylvie; Deschênes, Astrid; Boyle, Brian; Pelgas, Betty; Deslauriers, Marie; Clément, Sébastien; Lavigne, Patricia; Lamothe, Manuel; Cooke, Janice E K; Jaramillo-Correa, Juan P; Beaulieu, Jean; Isabel, Nathalie; Mackay, John; Bousquet, Jean

    2013-03-01

    High-density SNP genotyping arrays can be designed for any species given sufficient sequence information of high quality. Two high-density SNP arrays relying on the Infinium iSelect technology (Illumina) were designed for use in the conifer white spruce (Picea glauca). One array contained 7338 segregating SNPs representative of 2814 genes of various molecular functional classes for main uses in genetic association and population genetics studies. The other one contained 9559 segregating SNPs representative of 9543 genes for main uses in population genetics, linkage mapping of the genome and genomic prediction. The SNPs assayed were discovered from various sources of gene resequencing data. SNPs predicted from high-quality sequences derived from genomic DNA reached a genotyping success rate of 64.7%. Nonsingleton in silico SNPs (i.e. a sequence polymorphism present in at least two reads) predicted from expressed sequence tags obtained with the Roche 454 technology and Illumina GAII analyser resulted in a similar genotyping success rate of 71.6% when the deepest alignment was used and the most favourable SNP probe per gene was selected. A variable proportion of these SNPs was shared by other nordic and subtropical spruce species from North America and Europe. The number of shared SNPs was inversely proportional to phylogenetic divergence and standing genetic variation in the recipient species, but positively related to allele frequency in P. glauca natural populations. These validated SNP resources should open up new avenues for population genetics and comparative genetic mapping at a genomic scale in spruce species.
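
    The "nonsingleton" criterion in this abstract (a polymorphism present in at least two reads) can be illustrated with a minimal Python sketch. The function name, the dictionary layout and the SNP identifiers are hypothetical and chosen only for illustration; the abstract does not describe the authors' actual filtering pipeline.

```python
from typing import Dict


def nonsingleton_snps(read_support: Dict[str, int], min_reads: int = 2) -> Dict[str, int]:
    """Keep candidate SNPs whose variant allele is seen in at least `min_reads` reads.

    `read_support` maps a (hypothetical) SNP identifier to the number of
    reads that carry the variant allele.
    """
    return {snp: n for snp, n in read_support.items() if n >= min_reads}


# Hypothetical candidates, not data from the study.
candidates = {"snp_a": 1, "snp_b": 2, "snp_c": 7}
kept = nonsingleton_snps(candidates)
print(sorted(kept))
```

    A candidate supported by a single read is dropped because it cannot be told apart from a sequencing error on read evidence alone.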

  6. Identification of a Sex-Linked SNP Marker in the Salmon Louse (Lepeophtheirus salmonis) Using RAD Sequencing

    PubMed Central

    Taggart, John B.; Christie, Hayden R. L.; Bassett, David I.; Bron, James E.; Skuce, Philip J.; Gharbi, Karim; Skern-Mauritzen, Rasmus; Sturm, Armin

    2013-01-01

    The salmon louse (Lepeophtheirus salmonis (Krøyer, 1837)) is a parasitic copepod that can, if untreated, cause considerable damage to Atlantic salmon (Salmo salar Linnaeus, 1758) and incurs significant costs to the Atlantic salmon mariculture industry. Salmon lice are gonochoristic and normally show sex ratios close to 1:1. While this observation suggests that sex determination in salmon lice is genetic, with only minor environmental influences, the mechanism of sex determination in the salmon louse is unknown. This paper describes the identification of a sex-linked Single Nucleotide Polymorphism (SNP) marker, providing the first evidence for a genetic mechanism of sex determination in the salmon louse. Restriction site-associated DNA sequencing (RAD-seq) was used to isolate SNP markers in a laboratory-maintained salmon louse strain. A total of 85 million raw Illumina 100 base paired-end reads produced 281,838 unique RAD-tags across 24 unrelated individuals. RAD marker Lsa101901 showed complete association with phenotypic sex for all individuals analysed, being heterozygous in females and homozygous in males. Using an allele-specific PCR assay for genotyping, this SNP association pattern was further confirmed for three unrelated salmon louse strains, displaying complete association with phenotypic sex in a total of 96 genotyped individuals. The marker Lsa101901 was located in the coding region of the prohibitin-2 gene, which showed a sex-dependent differential expression, with mRNA levels determined by RT-qPCR about 1.8-fold higher in adult female than adult male salmon lice. This study’s observations of a novel sex-linked SNP marker are consistent with sex determination in the salmon louse being genetic and following a female heterozygous system. Marker Lsa101901 provides a tool to determine the genetic sex of salmon lice, and could be useful in the development of control strategies. PMID:24147087
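
    The association pattern reported for marker Lsa101901 (heterozygous in females, homozygous in males) can be checked with a short Python sketch. The genotype strings, sex codes and function names below are assumptions made for illustration; they are not the authors' data or analysis code.

```python
from typing import Iterable, Tuple


def is_heterozygous(genotype: str) -> bool:
    """Return True if a two-allele genotype string such as 'AG' carries two different alleles."""
    first, second = genotype
    return first != second


def complete_sex_association(samples: Iterable[Tuple[str, str]]) -> bool:
    """Return True only if every female ('F') is heterozygous and every male ('M') is homozygous."""
    return all(is_heterozygous(genotype) == (sex == "F") for sex, genotype in samples)


# Hypothetical genotypes at a single SNP, not taken from the study.
samples = [("F", "AG"), ("F", "GA"), ("M", "AA"), ("M", "AA")]
print(complete_sex_association(samples))
```

    A single homozygous female or heterozygous male makes the check fail, which is what "complete association" requires.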

  7. Pedigree- and SNP-Associated Genetics and Recent Environment are the Major Contributors to Anthropometric and Cardiometabolic Trait Variation

    PubMed Central

    Xia, Charley; Amador, Carmen; Huffman, Jennifer; Trochet, Holly; Campbell, Archie; Porteous, David; Hastie, Nicholas D.; Hayward, Caroline; Vitart, Veronique; Navarro, Pau; Haley, Chris S.

    2016-01-01

    Genome-wide association studies have successfully identified thousands of loci for a range of human complex traits and diseases. The proportion of phenotypic variance explained by significant associations is, however, limited. Given the same dense SNP panels, mixed model analyses capture a greater proportion of phenotypic variance than single SNP analyses, but the total is generally still less than the genetic variance estimated from pedigree studies. Combining information from pedigree relationships and SNPs, we examined 16 complex anthropometric and cardiometabolic traits in a Scottish family-based cohort comprising up to 20,000 individuals genotyped for ~520,000 common autosomal SNPs. The inclusion of related individuals provides the opportunity to also estimate the genetic variance associated with pedigree as well as the effects of common family environment. Trait variation was partitioned into SNP-associated and pedigree-associated genetic variation, shared nuclear family environment, shared couple (partner) environment and shared full-sibling environment. Results demonstrate that trait heritabilities vary widely but, on average across traits, SNP-associated and pedigree-associated genetic effects each explain around half the genetic variance. For most traits the recently shared environment of couples is also significant, accounting for ~11% of the phenotypic variance on average. On the other hand, the environment shared largely in the past by members of a nuclear family or by full siblings has a more limited impact. Our findings point to appropriate models to use in future studies, as pedigree-associated genetic effects and couple environmental effects have seldom been taken into account in genotype-based analyses. Appropriate description of the trait variation could help in understanding the causes of intra-individual variation and in detecting contributing loci and environmental factors. PMID:26836320
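
    The variance partition named in this abstract can be sketched as an additive decomposition. The notation is an assumption introduced here for illustration and is not taken from the paper: \sigma^2_P is the phenotypic variance, \sigma^2_g the SNP-associated and \sigma^2_k the pedigree-associated genetic variance, \sigma^2_f, \sigma^2_c and \sigma^2_s the shared nuclear family, couple and full-sibling environments, and \sigma^2_e a residual term that the abstract does not mention explicitly.

```latex
% Assumed additive partition of phenotypic variance (illustrative notation)
\sigma^2_P = \sigma^2_g + \sigma^2_k + \sigma^2_f + \sigma^2_c + \sigma^2_s + \sigma^2_e,
\qquad
h^2 = \frac{\sigma^2_g + \sigma^2_k}{\sigma^2_P}

% Averages reported in the abstract, restated in this notation
\sigma^2_g \approx \sigma^2_k \approx \tfrac{1}{2}\left(\sigma^2_g + \sigma^2_k\right),
\qquad
\frac{\sigma^2_c}{\sigma^2_P} \approx 0.11
```

    Under this reading, a genotype-only model that omits \sigma^2_k and \sigma^2_c leaves those components unmodelled, which is the gap the abstract points to.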

  8. Incremental impact of breast cancer SNP panel on risk classification in a screening population of white and African American women

    PubMed Central

    McCarthy, Anne Marie; Armstrong, Katrina; Handorf, Elizabeth; Jones, Marisa; Chen, Jinbo; Demeter, Mirar Bristol; McGuire, Erin; Conant, Emily F; Domchek, Susan M

    2014-01-01

    Breast cancer risk prediction remains imperfect, particularly among non-white populations. This study examines the impact of including single nucleotide polymorphism (SNP) alleles in risk prediction for white and African American women undergoing screening mammography. In a prospective cohort study, standard risk information and buccal swabs were collected at the time of screening mammography. A 12-SNP panel was performed by deCODE Genetics. Five-year and lifetime risks incorporating SNPs were calculated by multiplying estimated Breast Cancer Risk Assessment Tool (BCRAT) risk by the total genetic risk ratio. Concordance between the BCRAT and the Combined Model (BCRAT + SNPs) in identifying high-risk women was measured using the kappa statistic. SNP data were available for 813 women (39% African American, 55% white). The mean BCRAT 5-year risk was 1.70% for whites and 1.19% for African Americans. Mean genetic risk ratios were 1.10 in whites and 1.29 in African Americans. Among whites, three SNPs had higher frequencies, and among African Americans, seven SNPs had higher and four had lower high-risk allele frequencies than previously reported. Agreement between the BCRAT and the Combined Model was relatively low for identifying high-risk women (5-year κ=0.53, lifetime κ=0.37). Addition of SNPs had the greatest effect among African Americans, with 13% identified as having high