Science.gov

Sample records for 124-plex snp typing

  1. A robust SNP barcode for typing Mycobacterium tuberculosis complex strains

    PubMed Central

    Coll, Francesc; McNerney, Ruth; Guerra-Assunção, José Afonso; Glynn, Judith R.; Perdigão, João; Viveiros, Miguel; Portugal, Isabel; Pain, Arnab; Martin, Nigel; Clark, Taane G.

    2014-01-01

    Strain-specific genomic diversity in the Mycobacterium tuberculosis complex (MTBC) is an important factor in pathogenesis that may affect virulence, transmissibility, host response and emergence of drug resistance. Several systems have been proposed to classify MTBC strains into distinct lineages and families. Here, we investigate single-nucleotide polymorphisms (SNPs) as robust (stable) markers of genetic variation for phylogenetic analysis. We identify ~92k SNP across a global collection of 1,601 genomes. The SNP-based phylogeny is consistent with the gold-standard regions of difference (RD) classification system. Of the ~7k strain-specific SNPs identified, 62 markers are proposed to discriminate known circulating strains. This SNP-based barcode is the first to cover all main lineages, and classifies a greater number of sublineages than current alternatives. It may be used to classify clinical isolates to evaluate tools to control the disease, including therapeutics and vaccines whose effectiveness may vary by strain type. PMID:25176035

  2. Imputation of KIR Types from SNP Variation Data.

    PubMed

    Vukcevic, Damjan; Traherne, James A; Næss, Sigrid; Ellinghaus, Eva; Kamatani, Yoichiro; Dilthey, Alexander; Lathrop, Mark; Karlsen, Tom H; Franke, Andre; Moffatt, Miriam; Cookson, William; Trowsdale, John; McVean, Gil; Sawcer, Stephen; Leslie, Stephen

    2015-10-01

    Large population studies of immune system genes are essential for characterizing their role in diseases, including autoimmune conditions. Of key interest are a group of genes encoding the killer cell immunoglobulin-like receptors (KIRs), which have known and hypothesized roles in autoimmune diseases, resistance to viruses, reproductive conditions, and cancer. These genes are highly polymorphic, which makes typing expensive and time consuming. Consequently, despite their importance, KIRs have been little studied in large cohorts. Statistical imputation methods developed for other complex loci (e.g., human leukocyte antigen [HLA]) on the basis of SNP data provide an inexpensive high-throughput alternative to direct laboratory typing of these loci and have enabled important findings and insights for many diseases. We present KIR∗IMP, a method for imputation of KIR copy number. We show that KIR∗IMP is highly accurate and thus allows the study of KIRs in large cohorts and enables detailed investigation of the role of KIRs in human disease. PMID:26430804

  3. Imputation of KIR Types from SNP Variation Data

    PubMed Central

    Vukcevic, Damjan; Traherne, James A.; Næss, Sigrid; Ellinghaus, Eva; Kamatani, Yoichiro; Dilthey, Alexander; Lathrop, Mark; Karlsen, Tom H.; Franke, Andre; Moffatt, Miriam; Cookson, William; Trowsdale, John; McVean, Gil; Sawcer, Stephen; Leslie, Stephen

    2015-01-01

    Large population studies of immune system genes are essential for characterizing their role in diseases, including autoimmune conditions. Of key interest are a group of genes encoding the killer cell immunoglobulin-like receptors (KIRs), which have known and hypothesized roles in autoimmune diseases, resistance to viruses, reproductive conditions, and cancer. These genes are highly polymorphic, which makes typing expensive and time consuming. Consequently, despite their importance, KIRs have been little studied in large cohorts. Statistical imputation methods developed for other complex loci (e.g., human leukocyte antigen [HLA]) on the basis of SNP data provide an inexpensive high-throughput alternative to direct laboratory typing of these loci and have enabled important findings and insights for many diseases. We present KIR∗IMP, a method for imputation of KIR copy number. We show that KIR∗IMP is highly accurate and thus allows the study of KIRs in large cohorts and enables detailed investigation of the role of KIRs in human disease. PMID:26430804

  4. Typing SNP based on the near-infrared spectroscopy and artificial neural network

    NASA Astrophysics Data System (ADS)

    Ren, Li; Wang, Wei-Peng; Gao, Yu-Zhen; Yu, Xiao-Wei; Xie, Hong-Ping

    2009-07-01

    Using the near-infrared spectra (NIRS) of measured samples as the discriminant variables for their genotypes, a genotype discriminant model for SNPs was established with a back-propagation artificial neural network (BP-ANN). Taking a SNP (857G > A) of N-acetyltransferase 2 (NAT2) as an example, DNA fragments containing the SNP site were amplified by PCR with a pair of primers to obtain modeling samples of the three genotypes (GG, AA, and GA). The NIR spectra of the amplified samples were measured directly in transmission using a quartz cell. Based on the measured spectra, two BP-ANNs were combined to strengthen the three-genotype classification. One was trained with the resilient back-propagation algorithm to compress the measured NIRS variables; the other, trained with the Levenberg-Marquardt algorithm on the compressed spectra, served as the discriminant model for the three-genotype classification. For the established model, the root mean square errors for the training and prediction sample sets were 0.0135 and 0.0132, respectively. The model correctly predicted the three genotypes (i.e., the accuracy for the prediction samples was up to 100%) and was robust in predicting unknown samples. Since the three genotypes of the SNP could be determined directly from the NIR spectra without any preprocessing of the samples after PCR, this method is simple, rapid and low-cost.

  5. SNP-VISTA

    SciTech Connect

    Shah, Nameeta; Teplitsky, Michael; Minovitsky, Simon; Dubchak, Inna

    2005-11-07

    SNP-VISTA aids in analyses of the following types of data: A. Large-scale re-sequence data of disease-related genes for discovery of associated and/or causative alleles (GeneSNP-VISTA). B. Massive amounts of ecogenomics data for studying homologous recombination in microbial populations (EcoSNP-VISTA). The main features and capabilities of SNP-VISTA are: 1) Mapping of SNPs to gene structure; 2) classification of SNPs, based on their location in the gene, frequency of occurrence in samples and allele composition; 3) clustering, based on user-defined subsets of SNPs, highlighting haplotypes as well as recombinant sequences; 4) integration of protein conservation visualization; and 5) display of automatically calculated recombination points that are user-editable. The main strength of SNP-VISTA is its graphical interface and use of visual representations, which support interactive exploration and hence better understanding of large-scale SNPs data.

  6. High-throughput bacterial SNP typing identifies distinct clusters of Salmonella Typhi causing typhoid in Nepalese children

    PubMed Central

    2010-01-01

    Background Salmonella Typhi (S. Typhi) causes typhoid fever, which remains an important public health issue in many developing countries. Kathmandu, the capital of Nepal, is an area of high incidence and the pediatric population appears to be at high risk of exposure and infection. Methods We recently defined the population structure of S. Typhi, using new sequencing technologies to identify nearly 2,000 single nucleotide polymorphisms (SNPs) that can be used as unequivocal phylogenetic markers. Here we have used the GoldenGate (Illumina) platform to simultaneously type 1,500 of these SNPs in 62 S. Typhi isolates causing severe typhoid in children admitted to Patan Hospital in Kathmandu. Results Eight distinct S. Typhi haplotypes were identified during the 20-month study period, with 68% of isolates belonging to a subclone of the previously defined H58 S. Typhi. This subclone was closely associated with resistance to nalidixic acid, with all isolates from this group demonstrating a resistant phenotype and harbouring the same resistance-associated SNP in GyrA (Phe83). A secondary clone, comprising 19% of isolates, was observed only during the second half of the study. Conclusions Our data demonstrate the utility of SNP typing for monitoring bacterial populations over a defined period in a single endemic setting. We provide evidence for genotype introduction and define a nalidixic acid resistant subclone of S. Typhi, which appears to be the dominant cause of severe pediatric typhoid in Kathmandu during the study period. PMID:20509974

  7. SNP-VISTA

    Energy Science and Technology Software Center (ESTSC)

    2005-11-07

    SNP-VISTA aids in analyses of the following types of data: A. Large-scale re-sequence data of disease-related genes for discovery of associated and/or causative alleles (GeneSNP-VISTA). B. Massive amounts of ecogenomics data for studying homologous recombination in microbial populations (EcoSNP-VISTA). The main features and capabilities of SNP-VISTA are: 1) Mapping of SNPs to gene structure; 2) classification of SNPs, based on their location in the gene, frequency of occurrence in samples and allele composition; 3) clustering, based on user-defined subsets of SNPs, highlighting haplotypes as well as recombinant sequences; 4) integration of protein conservation visualization; and 5) display of automatically calculated recombination points that are user-editable. The main strength of SNP-VISTA is its graphical interface and use of visual representations, which support interactive exploration and hence better understanding of large-scale SNPs data.

  8. Association of rs5888 SNP in the scavenger receptor class B type 1 gene and serum lipid levels

    PubMed Central

    2012-01-01

    Background Bai Ku Yao is a special subgroup of the Yao minority in China. The present study was undertaken to detect the association of the rs5888 single nucleotide polymorphism (SNP) in the scavenger receptor class B type 1 (SCARB1) gene and several environmental factors with serum lipid levels in the Guangxi Bai Ku Yao and Han populations. Methods A total of 598 subjects of Bai Ku Yao and 585 subjects of Han Chinese were randomly selected from our stratified randomized cluster samples. Genotypes of the SCARB1 rs5888 SNP were determined by polymerase chain reaction and restriction fragment length polymorphism combined with gel electrophoresis, and then confirmed by direct sequencing. Results The levels of total cholesterol (TC), high-density lipoprotein cholesterol (HDL-C), low-density lipoprotein cholesterol (LDL-C), and apolipoprotein (Apo) AI were lower but ApoB was higher in Bai Ku Yao than in Han (P < 0.05-0.001). The frequencies of the C and T alleles were 78.3% and 21.7% in Bai Ku Yao, and 73.7% and 26.3% in Han, respectively (P < 0.01). The frequencies of the CC, CT and TT genotypes were 60.0%, 36.6% and 3.4% in Bai Ku Yao, and 54.2%, 39.0% and 6.8% in Han, respectively (P < 0.01). The subjects with the TT genotype in both ethnic groups had lower HDL-C and ApoAI levels than the subjects with the CC or CT genotype (P < 0.05 for all). Subgroup analyses showed that, in Bai Ku Yao, males with the TT genotype had lower HDL-C and ApoAI levels than males with the CC or CT genotype (P < 0.05 for all), and female T allele carriers had higher TC, LDL-C and ApoB levels than female T allele noncarriers (P < 0.05 for all). In Han, males with the TT genotype also tended to have lower HDL-C and ApoAI levels than males with the CC or CT genotype, but the difference did not reach statistical significance (P = 0.063 and P = 0.086, respectively). The association of serum HDL-C and ApoAI levels and genotypes was confirmed by
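
The allele frequencies reported in this record can be cross-checked against its genotype frequencies, since each heterozygote contributes one copy of each allele. The following is a minimal sketch of that arithmetic; the function name `allele_frequencies` is an illustrative assumption and is not taken from the study.

```python
def allele_frequencies(cc, ct, tt):
    """Return (C, T) allele frequencies in percent from genotype percentages.

    Each CT heterozygote carries one C and one T allele, so half of the
    CT frequency is added to each allele.
    """
    total = cc + ct + tt
    c = (cc + ct / 2) / total * 100
    t = (tt + ct / 2) / total * 100
    return round(c, 1), round(t, 1)


# Genotype percentages (CC, CT, TT) as quoted in the abstract.
bai_ku_yao = allele_frequencies(60.0, 36.6, 3.4)
han = allele_frequencies(54.2, 39.0, 6.8)

print("Bai Ku Yao C/T:", bai_ku_yao)
print("Han C/T:", han)
```

Under these inputs the computed values agree with the allele frequencies quoted in the abstract (78.3% and 21.7% in Bai Ku Yao, 73.7% and 26.3% in Han).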

  9. Eight New Genomes and Synthetic Controls Increase the Accessibility of Rapid Melt-MAMA SNP Typing of Coxiella burnetii

    PubMed Central

    Byström, Mona; Forsman, Mats; Frangoulidis, Dimitrios; Janse, Ingmar; Larsson, Pär; Lindgren, Petter; Öhrman, Caroline; van Rotterdam, Bart; Sjödin, Andreas; Myrtennäs, Kerstin

    2014-01-01

    The case rate of Q fever in Europe has increased dramatically in recent years, mainly because of an epidemic in the Netherlands in 2009. Consequently, there is a need for more extensive genetic characterization of the disease agent Coxiella burnetii in order to better understand the epidemiology and spread of this disease. Genome reference data are essential for this purpose, but only thirteen genome sequences are currently available. Current methods for typing C. burnetii are criticized for having problems in comparing results across laboratories, require the use of genomic control DNA, and/or rely on markers in highly variable regions. We developed in this work a method for single nucleotide polymorphism (SNP) typing of C. burnetii isolates and tissue samples based on new assays targeting ten phylogenetically stable synonymous canonical SNPs (canSNPs). These canSNPs represent previously known phylogenetic branches and were here identified from sequence comparisons of twenty-one C. burnetii genomes, eight of which were sequenced in this work. Importantly, synthetic control templates were developed, to make the method useful to laboratories lacking genomic control DNA. An analysis of twenty-one C. burnetii genomes confirmed that the species exhibits high sequence identity. Most of its SNPs (7,493/7,559 shared by >1 genome) follow a clonal inheritance pattern and are therefore stable phylogenetic typing markers. The assays were validated using twenty-six genetically diverse C. burnetii isolates and three tissue samples from small ruminants infected during the epidemic in the Netherlands. Each sample was assigned to a clade. Synthetic controls (vector and PCR amplified) gave identical results compared to the corresponding genomic controls and are viable alternatives to genomic DNA. The results from the described method indicate that it could be useful for cheap and rapid disease source tracking at non-specialized laboratories, which requires accurate genotyping

  10. SNP genotyping by DNA photoligation: application to SNP detection of genes from food crops

    NASA Astrophysics Data System (ADS)

    Yoshimura, Yoshinaga; Ohtake, Tomoko; Okada, Hajime; Ami, Takehiro; Tsukaguchi, Tadashi; Fujimoto, Kenzo

    2009-06-01

    We describe a simple and inexpensive single-nucleotide polymorphism (SNP) typing method, using DNA photoligation with 5-carboxyvinyl-2'-deoxyuridine and two fluorophores. This SNP-typing method facilitates qualitative determination of genes from indica and japonica rice, and showed a high degree of single nucleotide specificity up to 10 000. This method can be used in the SNP typing of actual genomic DNA samples from food crops.

  11. SNP-VISTA: An interactive SNP visualization tool

    PubMed Central

    Shah, Nameeta; Teplitsky, Michael V; Minovitsky, Simon; Pennacchio, Len A; Hugenholtz, Philip; Hamann, Bernd; Dubchak, Inna L

    2005-01-01

    Background Recent advances in sequencing technologies promise to provide a better understanding of the genetics of human disease as well as the evolution of microbial populations. Single Nucleotide Polymorphisms (SNPs) are established genetic markers that aid in the identification of loci affecting quantitative traits and/or disease in a wide variety of eukaryotic species. With today's technological capabilities, it has become possible to re-sequence a large set of appropriate candidate genes in individuals with a given disease in an attempt to identify causative mutations. In addition, SNPs have been used extensively in efforts to study the evolution of microbial populations, and the recent application of random shotgun sequencing to environmental samples enables more extensive SNP analysis of co-occurring and co-evolving microbial populations. The program is available at [1]. Results We have developed and present two modifications of an interactive visualization tool, SNP-VISTA, to aid in the analyses of the following types of data: A. Large-scale re-sequence data of disease-related genes for discovery of associated and/or causative alleles (GeneSNP-VISTA). B. Massive amounts of ecogenomics data for studying homologous recombination in microbial populations (EcoSNP-VISTA). The main features and capabilities of SNP-VISTA are: 1) mapping of SNPs to gene structure; 2) classification of SNPs, based on their location in the gene, frequency of occurrence in samples and allele composition; 3) clustering, based on user-defined subsets of SNPs, highlighting haplotypes as well as recombinant sequences; 4) integration of protein evolutionary conservation visualization; and 5) display of automatically calculated recombination points that are user-editable. Conclusion The main strength of SNP-VISTA is its graphical interface and use of visual representations, which support interactive exploration and hence better understanding of large-scale SNP data by the user. PMID

  12. Electrochemical detection of type 2 diabetes mellitus-related SNP via DNA-mediated growth of silver nanoparticles on single walled carbon nanotubes.

    PubMed

    Tao, Jia; Zhao, Peng; Zheng, Jing; Wu, Cuichen; Shi, Muling; Li, Jishan; Li, Yinhui; Yang, Ronghua

    2015-11-01

    Herein, we proposed a new electrochemical sensing strategy for T2DM-related SNP detection via DNA-mediated growth of AgNPs on a SWCNT-modified electrode. Coupled with RNase HII enzyme assisted amplification, this approach could realize T2DM-related SNP assay and be applied in crude extracts of carcinoma pancreatic β-cell lines. PMID:26365891

  13. Genome-wide detection of CNVs in Chinese indigenous sheep with different types of tails using ovine high-density 600K SNP arrays

    PubMed Central

    Zhu, Caiye; Fan, Hongying; Yuan, Zehu; Hu, Shijin; Ma, Xiaomeng; Xuan, Junli; Wang, Hongwei; Zhang, Li; Wei, Caihong; Zhang, Qin; Zhao, Fuping; Du, Lixin

    2016-01-01

    Chinese indigenous sheep can be classified into three types based on tail morphology: fat-tailed, fat-rumped, and thin-tailed sheep, of which the typical breeds are large-tailed Han sheep, Altay sheep, and Tibetan sheep, respectively. To unravel the genetic mechanisms underlying the phenotypic differences among Chinese indigenous sheep with tails of three different types, we used ovine high-density 600K SNP arrays to detect genome-wide copy number variation (CNV). In large-tailed Han sheep, Altay sheep, and Tibetan sheep, 371, 301, and 66 CNV regions (CNVRs) with lengths of 71.35 Mb, 51.65 Mb, and 10.56 Mb, respectively, were identified on autosomal chromosomes. Ten CNVRs were randomly chosen for confirmation, of which eight were successfully validated. The detected CNVRs harboured 3130 genes, including genes associated with fat deposition, such as PPARA, RXRA, KLF11, ADD1, FASN, PPP1CA, PDGFA, and PEX6. Moreover, multilevel bioinformatics analyses of the detected candidate genes were significantly enriched for involvement in fat deposition, GTPase regulator, and peptide receptor activities. This is the first high-resolution sheep CNV map for Chinese indigenous sheep breeds with three types of tails. Our results provide valuable information that will support investigations of genomic structural variation underlying traits of interest in sheep. PMID:27282145

  14. Genome-wide detection of CNVs in Chinese indigenous sheep with different types of tails using ovine high-density 600K SNP arrays.

    PubMed

    Zhu, Caiye; Fan, Hongying; Yuan, Zehu; Hu, Shijin; Ma, Xiaomeng; Xuan, Junli; Wang, Hongwei; Zhang, Li; Wei, Caihong; Zhang, Qin; Zhao, Fuping; Du, Lixin

    2016-01-01

    Chinese indigenous sheep can be classified into three types based on tail morphology: fat-tailed, fat-rumped, and thin-tailed sheep, of which the typical breeds are large-tailed Han sheep, Altay sheep, and Tibetan sheep, respectively. To unravel the genetic mechanisms underlying the phenotypic differences among Chinese indigenous sheep with tails of three different types, we used ovine high-density 600K SNP arrays to detect genome-wide copy number variation (CNV). In large-tailed Han sheep, Altay sheep, and Tibetan sheep, 371, 301, and 66 CNV regions (CNVRs) with lengths of 71.35 Mb, 51.65 Mb, and 10.56 Mb, respectively, were identified on autosomal chromosomes. Ten CNVRs were randomly chosen for confirmation, of which eight were successfully validated. The detected CNVRs harboured 3130 genes, including genes associated with fat deposition, such as PPARA, RXRA, KLF11, ADD1, FASN, PPP1CA, PDGFA, and PEX6. Moreover, multilevel bioinformatics analyses of the detected candidate genes were significantly enriched for involvement in fat deposition, GTPase regulator, and peptide receptor activities. This is the first high-resolution sheep CNV map for Chinese indigenous sheep breeds with three types of tails. Our results provide valuable information that will support investigations of genomic structural variation underlying traits of interest in sheep. PMID:27282145

  15. Localization of Type 1 Diabetes susceptibility in the ancestral haplotype 18.2 by high density SNP mapping.

    PubMed

    Santiago, Jose Luis; Li, Wentian; Lee, Annette; Martinez, Alfonso; Chandrasekaran, Alamelu; Fernandez-Arquero, Miguel; Khalili, Houman; de la Concha, Emilio G; Urcelay, Elena; Gregersen, Peter K

    2009-10-01

    Previous studies have suggested that the ancestral haplotype 18.2 (AH18.2) carries an additional susceptibility gene for Type 1 Diabetes (T1D) in the Major Histocompatibility Complex (MHC). We analyzed 10 DR3/TNFa1b5 homozygous subjects in order to establish the conservation of AH18.2 and then compared this conserved region with another DR3 haplotype, AH8.1. The Illumina HumanHap550 BeadChip was used to perform extensive genotyping of the MHC region. AH18.2 was highly conserved between the DDR1 and HLA-DQA1 genes; therefore, the second susceptibility gene is most probably located within this region. We can exclude the region centromeric to the HLA-DRA gene and telomeric to the DDR1 gene. A comparison between the AH18.2 and AH8.1 haplotypes showed that 233 SNPs differed in the aforementioned conserved region. These data suggest that the 1.65 Mb MHC region between the DDR1 and HLA-DRA genes is likely to carry additional susceptibility alleles for T1D on the AH18.2 haplotype. PMID:19591919

  16. Peopling of the North Circumpolar Region – Insights from Y Chromosome STR and SNP Typing of Greenlanders

    PubMed Central

    Olofsson, Jill Katharina; Pereira, Vania; Børsting, Claus; Morling, Niels

    2015-01-01

    The human population in Greenland is characterized by migration events of Paleo- and Neo-Eskimos, as well as admixture with Europeans. In this study, the Y-chromosomal variation in male Greenlanders was investigated in detail by typing 73 Y-chromosomal single nucleotide polymorphisms (Y-SNPs) and 17 Y-chromosomal short tandem repeats (Y-STRs). Approximately 40% of the analyzed Greenlandic Y chromosomes were of European origin (I-M170, R1a-M513 and R1b-M343). Y chromosomes of European origin were mainly found in individuals from the west and south coasts of Greenland, which is in agreement with the historic records of the geographic placements of European settlements in Greenland. Two Inuit Y-chromosomal lineages, Q-M3 (xM19, M194, L663, SA01 and L766) and Q-NWT01 (xM265) were found in 23% and 31% of the male Greenlanders, respectively. The time to the most recent common ancestor (TMRCA) of the Q-M3 lineage of the Greenlanders was estimated to be between 4,400 and 10,900 years ago (y. a.) using two different methods. This is in agreement with the theory that the North Circumpolar Region was populated via a second expansion of humans in the North American continent. The TMRCA of the Q-NWT01 (xM265) lineage in Greenland was estimated to be between 7,000 and 14,300 y. a. using two different methods, which is older than the previously reported TMRCA of this lineage in other Inuit populations. Our results indicate that Inuit individuals carrying the Q-NWT01 (xM265) lineage may have their origin in the northeastern parts of North America and could be descendants of the Dorset culture. This in turn points to the possibility that the current Inuit population in Greenland is comprised of individuals of both Thule and Dorset descent. PMID:25635810

  17. Use of SNP array analysis to identify a novel TRIM32 mutation in limb-girdle muscular dystrophy type 2H.

    PubMed

    Cossée, Mireille; Lagier-Tourenne, Clotilde; Seguela, Claire; Mohr, Michel; Leturcq, France; Gundesli, Hulya; Chelly, Jamel; Tranchant, Christine; Koenig, Michel; Mandel, Jean-Louis

    2009-04-01

    Molecular diagnosis of monogenic diseases with high genetic heterogeneity is usually challenging. In the case of limb-girdle muscular dystrophy, multiplex Western blot analysis is a very useful initial step, but that often fails to identify the primarily affected protein. We report how homozygosity analysis using a genome-wide SNP array allowed us to solve the diagnostic enigma in a patient with a moderate form of LGMD, born from consanguineous parents. The genome-wide scan performed on the patient's DNA revealed several regions of homozygosity, that were compared to the location of known LGMD genes. One such region indeed contained the TRIM32 gene. This gene was previously found mutated in families with limb-girdle muscular dystrophy type 2H (LGMD2H), a mild autosomal recessive myopathy described in Hutterite populations and in 4 patients with a diagnosis of sarcotubular myopathy. A single missense mutation was found in all these patients, located in a conserved domain of the C-terminal part of the protein. Another missense mutation affecting the N-terminal part of TRIM32, observed in a single consanguineous Bedouin family, was reported to cause the phenotypically unrelated and genetically heterogeneous Bardet-Biedl syndrome, defining the BBS11 locus. Sequencing of TRIM32 in our patient revealed a distal frameshift mutation, c.1753_1766dup14 (p.Ile590Leu fsX38). Together with two recently reported mutations, this novel mutation confirms that integrity of the C-terminal domain of TRIM32 is necessary for muscle maintenance. PMID:19303295
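
The homozygosity analysis described in this record amounts to finding long stretches of consecutive homozygous SNP calls and comparing them with the positions of known disease genes. The abstract does not describe the algorithm that was used, so the following is only a minimal sketch of the simplest possible scan. The function name, the `min_length` parameter and the genotype list are all hypothetical; real tools work on physical map positions and tolerate occasional genotyping errors.

```python
def runs_of_homozygosity(genotypes, min_length=3):
    """Return inclusive (start, end) index pairs of consecutive homozygous calls.

    `genotypes` is an ordered list of two-letter calls such as "AA" or "AG".
    Runs shorter than `min_length` markers are discarded.
    """
    runs, start = [], None
    for i, gt in enumerate(genotypes):
        homozygous = len(gt) == 2 and gt[0] == gt[1]
        if homozygous and start is None:
            start = i                      # a new run begins here
        elif not homozygous and start is not None:
            if i - start >= min_length:    # keep only sufficiently long runs
                runs.append((start, i - 1))
            start = None
    # Close a run that extends to the last marker.
    if start is not None and len(genotypes) - start >= min_length:
        runs.append((start, len(genotypes) - 1))
    return runs


# Hypothetical calls along one chromosome, ordered by position.
calls = ["AA", "AG", "GG", "CC", "TT", "AA", "CT", "GG", "GG"]
roh = runs_of_homozygosity(calls, min_length=3)
print(roh)
```

In a diagnostic setting, each reported run would then be intersected with the coordinates of candidate genes, which is how a region containing TRIM32 could be singled out for sequencing.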

  18. SNP panels/Imputation

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Participants from thirteen countries discussed services that Interbull can perform or recommendations that Interbull can make to promote harmonization and assist member countries in improving their genomic evaluations in regard to SNP panels and imputation. The panel recommended: A mechanism to shar...

  19. Development of ARMS-PCR assay for genotyping of Pro12Ala SNP of PPARG gene: a cost effective way for case-control studies of type 2 diabetes in developing countries.

    PubMed

    Islam, Mehboob; Awan, Fazli Rabbi; Baig, Shahid Mahmood

    2014-09-01

    Type 2 diabetes (T2D) is a prevalent metabolic disorder across the globe. Research is underway on various aspects, including genetics, to understand and control the global epidemic of diabetes. Recently, several SNPs in various genes have been associated with T2D. These association studies are mainly carried out in developed countries through Genome Wide Association Scans, with follow-up replication/validation studies by high-throughput genotyping techniques (e.g. Taqman Technology). Although similar studies could be conducted in developing countries, the limiting factors are the associated cost and expertise. These factors hamper genetic association and replication studies in low-income countries that aim to determine the role of putatively associated SNPs in diabetes. Although there are several SNP detection methods (e.g. Taqman assay, Dot-blot, PCR-RFLP, DGGE, SSCP), these are either expensive, labor intensive, or less sensitive. Hence, our aim was to develop a low-cost method for the validation of the PPARG (Pro12Ala, CCA>GCA) SNP (rs1801282) for its association with T2D. Here, we developed a cost-effective and rapid amplification refractory mutation specific-PCR (ARMS-PCR) method for detecting this SNP. We successfully genotyped the PPARG SNP (Pro12Ala) in human samples, and the validity of this method was confirmed by DNA sequencing of a few representative samples for the three different genotypes. Furthermore, ARMS-PCR was applied to T2D patients and control samples for the screening of this SNP. PMID:25063576

  20. Temple syndrome: A patient with maternal hetero-UPD14, mixed iso- and hetero-disomy detected by SNP microarray typing of patient-father duos.

    PubMed

    Shin, Eun-Hye; Cho, Eunhae; Lee, Cha Gon

    2016-08-01

    Temple syndrome (TS, MIM 616222) is an imprinting disorder involving genes within the imprinted region of chromosome 14q32. TS is a genetically complex disorder, which is associated with maternal uniparental disomy of chromosome 14 (UPD14), paternal deletions on chromosome 14, or loss of methylation at the intergenic differentially methylated region (IG-DMR). Here, we describe the case of a patient with maternal hetero-UPD14, with a mixed iso-/hetero-disomy mechanism identified by single nucleotide polymorphism (SNP) array analysis in a patient-father duo study. The phenotype of our case is similar to Prader-Willi syndrome (PWS) during infancy and to Russell-Silver syndrome (RSS) during childhood. This SNP array appears to be an effective initial screening tool for patients with nonspecific clinical features suggestive of chromosomal disorders. PMID:26867509

  1. SNPConvert: SNP Array Standardization and Integration in Livestock Species

    PubMed Central

    Nicolazzi, Ezequiel Luis; Marras, Gabriele; Stella, Alessandra

    2016-01-01

    One of the main advantages of single nucleotide polymorphism (SNP) array technology is providing genotype calls for a specific number of SNP markers at a relatively low cost. Since its first application in animal genetics, the number of available SNP arrays for each species has been constantly increasing. However, conversely to that observed in whole genome sequence data analysis, SNP array data does not have a common set of file formats or coding conventions for allele calling. Therefore, the standardization and integration of SNP array data from multiple sources have become an obstacle, especially for users with basic or no programming skills. Here, we describe the difficulties related to handling SNP array data, focusing on file formats, SNP allele coding, and mapping. We also present SNPConvert suite, a multi-platform, open-source, and user-friendly set of tools to overcome these issues. This tool, which can be integrated with open-source and open-access tools already available, is a first step towards an integrated system to standardize and integrate any type of raw SNP array data. The tool is available at: https://github.com/nicolazzie/SNPConvert.git. PMID:27600083

  2. SNPConvert: SNP Array Standardization and Integration in Livestock Species.

    PubMed

    Nicolazzi, Ezequiel Luis; Marras, Gabriele; Stella, Alessandra

    2016-01-01

    One of the main advantages of single nucleotide polymorphism (SNP) array technology is providing genotype calls for a specific number of SNP markers at a relatively low cost. Since its first application in animal genetics, the number of available SNP arrays for each species has been constantly increasing. However, conversely to that observed in whole genome sequence data analysis, SNP array data does not have a common set of file formats or coding conventions for allele calling. Therefore, the standardization and integration of SNP array data from multiple sources have become an obstacle, especially for users with basic or no programming skills. Here, we describe the difficulties related to handling SNP array data, focusing on file formats, SNP allele coding, and mapping. We also present SNPConvert suite, a multi-platform, open-source, and user-friendly set of tools to overcome these issues. This tool, which can be integrated with open-source and open-access tools already available, is a first step towards an integrated system to standardize and integrate any type of raw SNP array data. The tool is available at: https://github.com/nicolazzie/SNPConvert.git. PMID:27600083

  3. Exhaustive Genome-Wide Search for SNP-SNP Interactions Across 10 Human Diseases.

    PubMed

    Murk, William; DeWan, Andrew T

    2016-01-01

    The identification of statistical SNP-SNP interactions may help explain the genetic etiology of many human diseases, but exhaustive genome-wide searches for these interactions have been difficult, due to a lack of power in most datasets. We aimed to use data from the Resource for Genetic Epidemiology Research on Adult Health and Aging (GERA) study to search for SNP-SNP interactions associated with 10 common diseases. FastEpistasis and BOOST were used to evaluate all pairwise interactions among approximately N = 300,000 single nucleotide polymorphisms (SNPs) with minor allele frequency (MAF) ≥ 0.15, for the dichotomous outcomes of allergic rhinitis, asthma, cardiac disease, depression, dermatophytosis, type 2 diabetes, dyslipidemia, hemorrhoids, hypertensive disease, and osteoarthritis. A total of N = 45,171 subjects were included after quality control steps were applied. These data were divided into discovery and replication subsets; the discovery subset had > 80% power, under selected models, to detect genome-wide significant interactions (P < 10^-12). Interactions were also evaluated for enrichment in particular SNP features, including functionality, prior disease relevancy, and marginal effects. No interaction in any disease was significant in both the discovery and replication subsets. Enrichment analysis suggested that, for some outcomes, interactions involving SNPs with marginal effects were more likely to be nominally replicated, compared to interactions without marginal effects. If SNP-SNP interactions play a role in the etiology of the studied conditions, they likely have weak effect sizes, involve lower-frequency variants, and/or involve complex models of interaction that are not captured well by the methods that were utilized. PMID:27185397

  4. Exhaustive Genome-Wide Search for SNP-SNP Interactions Across 10 Human Diseases

    PubMed Central

    Murk, William; DeWan, Andrew T.

    2016-01-01

    The identification of statistical SNP-SNP interactions may help explain the genetic etiology of many human diseases, but exhaustive genome-wide searches for these interactions have been difficult, due to a lack of power in most datasets. We aimed to use data from the Resource for Genetic Epidemiology Research on Adult Health and Aging (GERA) study to search for SNP-SNP interactions associated with 10 common diseases. FastEpistasis and BOOST were used to evaluate all pairwise interactions among approximately N = 300,000 single nucleotide polymorphisms (SNPs) with minor allele frequency (MAF) ≥ 0.15, for the dichotomous outcomes of allergic rhinitis, asthma, cardiac disease, depression, dermatophytosis, type 2 diabetes, dyslipidemia, hemorrhoids, hypertensive disease, and osteoarthritis. A total of N = 45,171 subjects were included after quality control steps were applied. These data were divided into discovery and replication subsets; the discovery subset had > 80% power, under selected models, to detect genome-wide significant interactions (P < 10^-12). Interactions were also evaluated for enrichment in particular SNP features, including functionality, prior disease relevancy, and marginal effects. No interaction in any disease was significant in both the discovery and replication subsets. Enrichment analysis suggested that, for some outcomes, interactions involving SNPs with marginal effects were more likely to be nominally replicated, compared to interactions without marginal effects. If SNP-SNP interactions play a role in the etiology of the studied conditions, they likely have weak effect sizes, involve lower-frequency variants, and/or involve complex models of interaction that are not captured well by the methods that were utilized. PMID:27185397
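    The significance threshold quoted in this abstract is consistent with a Bonferroni-style correction over all SNP pairs. The sketch below is a back-of-the-envelope check only. It assumes exactly 300,000 SNPs and a family-wise error rate of 0.05; both values are illustrative assumptions, since the abstract gives an approximate SNP count and does not say how the threshold was derived.

```python
from math import comb


def pairwise_tests(n_snps: int) -> int:
    """Number of unordered SNP pairs examined in an exhaustive pairwise scan."""
    return comb(n_snps, 2)


def bonferroni_threshold(n_tests: int, alpha: float = 0.05) -> float:
    """Per-test significance threshold that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


# Assumed inputs: 300,000 SNPs (approximate figure from the abstract), alpha = 0.05.
n_pairs = pairwise_tests(300_000)
threshold = bonferroni_threshold(n_pairs)
print(f"{n_pairs:,} pairs, threshold = {threshold:.2e}")
# prints: 44,999,850,000 pairs, threshold = 1.11e-12
```

    Under these assumptions the per-test threshold lands at about 1.1 x 10^-12, the same order of magnitude as the reported cutoff.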

  5. A Bayesian Framework for SNP Identification

    SciTech Connect

    Webb-Robertson, Bobbie-Jo M.; Havre, Susan L.; Payne, Deborah A.

    2005-07-01

    Current proteomics techniques, such as mass spectrometry, focus on protein identification, usually ignoring most types of modifications beyond post-translational modifications, with the assumption that only a small number of peptides have to be matched to a protein for a positive identification. However, not all proteins are being identified with current techniques and improved methods to locate points of mutation are becoming a necessity. In the case when single-nucleotide polymorphisms (SNPs) are observed, brute force is the most common method to locate them, quickly becoming computationally unattractive as the size of the database associated with the model organism grows. We have developed a Bayesian model for SNPs, BSNP, incorporating evolutionary information at both the nucleotide and amino acid levels. Formulating SNPs as a Bayesian inference problem allows probabilities of interest to be easily obtained, for example the probability of a specific SNP or specific type of mutation over a gene or entire genome. Three SNP databases were observed in the evaluation of the BSNP model; the first SNP database is a disease specific gene in human, hemoglobin, the second is also a disease specific gene in human, p53, and the third is a more general SNP database for multiple genes in mouse. We validate that the BSNP model assigns higher posterior probabilities to the SNPs defined in all three separate databases than can be attributed to chance under specific evolutionary information, for example the amino acid model described by Majewski and Ott in conjunction with either the four-parameter nucleotide model by Bulmer or seven-parameter nucleotide model by Majewski and Ott.

  6. pfSNP: An integrated potentially functional SNP resource that facilitates hypotheses generation through knowledge syntheses.

    PubMed

    Wang, Jingbo; Ronaghi, Mostafa; Chong, Samuel S; Lee, Caroline G L

    2011-01-01

    Currently, >14,000,000 single nucleotide polymorphisms (SNPs) are reported. Identifying phenotype-affecting SNPs among these many SNPs poses significant challenges. Although several Web resources are available that can inform about the functionality of SNPs, these resources are mainly annotation databases and are not very comprehensive. In this article, we present a comprehensive, well-annotated, integrated pfSNP (potentially functional SNPs) Web resource (http://pfs.nus.edu.sg/), which is aimed to facilitate better hypothesis generation through knowledge syntheses mediated by better data integration and a user-friendly Web interface. pfSNP integrates >40 different algorithms/resources to interrogate >14,000,000 SNPs from the dbSNP database for SNPs of potential functional significance based on previous published reports, inferred potential functionality from genetic approaches as well as predicted potential functionality from sequence motifs. Its query interface has the user-friendly "auto-complete, prompt-as-you-type" feature and is highly customizable, facilitating different combinations of queries using Boolean logic. Additionally, to facilitate better understanding of the results and aid in hypotheses generation, gene/pathway-level information with text clouds highlighting enriched tissues/pathways as well as detailed related information are also provided on the results page. Hence, the pfSNP resource will be of great interest to scientists focusing on association studies as well as those interested in experimentally addressing the functionality of SNPs. PMID:20672376

  7. SNP genotyping by heteroduplex analysis.

    PubMed

    Paniego, Norma; Fusari, Corina; Lia, Verónica; Puebla, Andrea

    2015-01-01

    Heteroduplex-based genotyping methods have proven to be technologically effective and economically efficient for low- to medium-range throughput single-nucleotide polymorphism (SNP) determination. In this chapter we describe two protocols that were successfully applied for SNP detection and haplotype analysis of candidate genes in association studies. The protocols involve (1) enzymatic mismatch cleavage with endonuclease CEL1 from celery, associated with fragment separation using capillary electrophoresis (CEL1 cleavage), and (2) differential retention of the homo/heteroduplex DNA molecules under partial denaturing conditions on ion pair reversed-phase liquid chromatography (dHPLC). Both methods are complementary since dHPLC is more versatile than CEL1 cleavage for identifying multiple SNP per target region, and the latter is easily optimized for sequences with fewer SNPs or small insertion/deletion polymorphisms. Besides, CEL1 cleavage is a powerful method to localize the position of the mutation when fragment resolution is done using capillary electrophoresis. PMID:25373754

  8. The utility of high-resolution melting analysis of SNP nucleated PCR amplicons--an MLST based Staphylococcus aureus typing scheme.

    PubMed

    Lilliebridge, Rachael A; Tong, Steven Y C; Giffard, Philip M; Holt, Deborah C

    2011-01-01

    High resolution melting (HRM) analysis is gaining prominence as a method for discriminating DNA sequence variants. Its advantage is that it is performed in a real-time PCR device, and the PCR amplification and HRM analysis are closed tube, and effectively single step. We have developed an HRM-based method for Staphylococcus aureus genotyping. Eight single nucleotide polymorphisms (SNPs) were derived from the S. aureus multi-locus sequence typing (MLST) database on the basis of maximized Simpson's Index of Diversity. Only G↔A, G↔T, C↔A, C↔T SNPs were considered for inclusion, to facilitate allele discrimination by HRM. In silico experiments revealed that DNA fragments incorporating the SNPs give much higher resolving power than randomly selected fragments. It was shown that the predicted optimum fragment size for HRM analysis was 200 bp, and that other SNPs within the fragments contribute to the resolving power. Six DNA fragments ranging from 83 bp to 219 bp, incorporating the resolution optimized SNPs were designed. HRM analysis of these fragments using 94 diverse S. aureus isolates of known sequence type or clonal complex (CC) revealed that sequence variants are resolved largely in accordance with G+C content. A combination of experimental results and in silico prediction indicates that HRM analysis resolves S. aureus into 268 "melt types" (MelTs), and provides a Simpson's Index of Diversity of 0.978 with respect to MLST. There is a high concordance between HRM analysis and the MLST defined CCs. We have generated a Microsoft Excel key which facilitates data interpretation and translation between MelT and MLST data. The potential of this approach for genotyping other bacterial pathogens was investigated using a computerized approach to estimate the densities of SNPs with unlinked allelic states. The MLST databases for all species tested contained abundant unlinked SNPs, thus suggesting that high resolving power is not dependent upon large numbers of SNPs
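    Simpson's Index of Diversity, which this abstract uses to select SNPs and to report resolving power, can be computed from the number of isolates assigned to each type. The sketch below uses the unbiased form commonly applied to typing schemes, D = 1 - sum(n_j(n_j - 1)) / (N(N - 1)). The function name and the example melt-type labels are hypothetical, and the abstract does not confirm that the study computed the index in exactly this way.

```python
from collections import Counter
from typing import Hashable, Iterable


def simpsons_index_of_diversity(types: Iterable[Hashable]) -> float:
    """Probability that two isolates drawn without replacement have different types."""
    counts = Counter(types)
    n = sum(counts.values())
    if n < 2:
        raise ValueError("at least two isolates are required")
    # Number of ordered pairs of isolates that share a type.
    same_type_pairs = sum(c * (c - 1) for c in counts.values())
    return 1.0 - same_type_pairs / (n * (n - 1))


# Hypothetical melt-type assignments for six isolates (not data from the study).
melt_types = ["MelT1", "MelT1", "MelT2", "MelT3", "MelT3", "MelT3"]
print(round(simpsons_index_of_diversity(melt_types), 3))  # prints 0.733
```

    A value near 1 means that two randomly chosen isolates are almost always separated by the scheme; a value of 0 means every isolate falls into the same type.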

  9. Linkage mapping bovine EST-based SNP

    PubMed Central

    Snelling, Warren M; Casas, Eduardo; Stone, Roger T; Keele, John W; Harhay, Gregory P; Bennett, Gary L; Smith, Timothy PL

    2005-01-01

    Background Existing linkage maps of the bovine genome primarily contain anonymous microsatellite markers. These maps have proved valuable for mapping quantitative trait loci (QTL) to broad regions of the genome, but more closely spaced markers are needed to fine-map QTL, and markers associated with genes and annotated sequence are needed to identify genes and sequence variation that may explain QTL. Results Bovine expressed sequence tag (EST) and bacterial artificial chromosome (BAC) sequence data were used to develop 918 single nucleotide polymorphism (SNP) markers to map genes on the bovine linkage map. DNA of sires from the MARC reference population was used to detect SNPs, and progeny and mates of heterozygous sires were genotyped. Chromosome assignments for 861 SNPs were determined by two-point analysis, and positions for 735 SNPs were established by multipoint analyses. Linkage maps of bovine autosomes with these SNPs represent 4585 markers in 2475 positions spanning 3058 cM. Markers include 3612 microsatellites, 913 SNPs and 60 other markers. Mean separation between marker positions is 1.2 cM. New SNP markers appear in 511 positions, with mean separation of 4.7 cM. Multi-allelic markers, mostly microsatellites, had a mean (maximum) of 216 (366) informative meioses, and a mean 3-lod confidence interval of 3.6 cM. Bi-allelic markers, including SNP and other marker types, had a mean (maximum) of 55 (191) informative meioses, and were placed within a mean 8.5 cM 3-lod confidence interval. Homologous human sequences were identified for 1159 markers, including 582 newly developed and mapped SNP. Conclusion Addition of these EST- and BAC-based SNPs to the bovine linkage map not only increases marker density, but provides connections to gene-rich physical maps, including annotated human sequence. The map provides a resource for fine-mapping quantitative trait loci and identification of positional candidate genes, and can be integrated with other data to guide and

  10. Detection of copy number variation by SNP-allelotyping.

    PubMed

    Parker, Brett; Alexander, Ryan; Wu, Xingyao; Feely, Shawna; Shy, Michael; Schnetz-Boutaud, Nathalie; Li, Jun

    2015-03-01

    Charcot-Marie-Tooth disease type 1A (CMT1A) is caused by an abnormal copy number variation (CNV) with a trisomy of chromosome 17p12. The increase of the DNA-segment copy number is expected to alter the allele frequency of single nucleotide polymorphism (SNP) within the duplicated region. We tested whether SNP allele frequency determined by a Sequenom MassArray can be used to detect the CMT1A mutation. Our results revealed distinct patterns of SNP allele frequency distribution, which reliably differentiated CMT1A patients from controls. This finding suggests that this technique may serve as an alternative approach to identifying CNV in certain diseases, including CMT1A. PMID:24830919
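    The reasoning in this abstract can be illustrated with simple arithmetic: at a heterozygous SNP, two copies of a segment give an expected allele fraction of 1/2, whereas three copies give 1/3 or 2/3. The sketch below is an illustration of that expectation only, not the authors' analysis pipeline; the function names and the nearest-fraction rule are assumptions introduced here.

```python
from fractions import Fraction
from typing import List, Sequence


def expected_het_fractions(copy_number: int) -> List[Fraction]:
    """Allele fractions possible at a heterozygous site for a given total copy number."""
    return [Fraction(k, copy_number) for k in range(1, copy_number)]


def closest_copy_number(observed_fraction: float, candidates: Sequence[int] = (2, 3)) -> int:
    """Candidate copy number whose expected heterozygous fraction is nearest the observation."""
    def distance(copy_number: int) -> float:
        return min(abs(observed_fraction - float(f)) for f in expected_het_fractions(copy_number))
    return min(candidates, key=distance)


print(expected_het_fractions(2))   # a single expected fraction, 1/2
print(expected_het_fractions(3))   # two expected fractions, 1/3 and 2/3
print(closest_copy_number(0.35))   # prints 3
```

    Real measurements are noisy, so a practical classifier would compare distributions across many SNPs in the region rather than a single observed fraction.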

  11. A 48 SNP set for grapevine cultivar identification

    PubMed Central

    2011-01-01

    Background Rapid and consistent genotyping is an important requirement for cultivar identification in many crop species. Among them grapevine cultivars have been the subject of multiple studies given the large number of synonyms and homonyms generated during many centuries of vegetative multiplication and exchange. Simple sequence repeat (SSR) markers have been preferred until now because of their high level of polymorphism, their codominant nature and their high profile repeatability. However, the rapid application of partial or complete genome sequencing approaches is identifying thousands of single nucleotide polymorphisms (SNP) that can be very useful for such purposes. Although SNP markers are bi-allelic, and therefore not as polymorphic as microsatellites, the high number of loci that can be multiplexed and the possibilities of automation as well as their highly repeatable results under any analytical procedure make them the future markers of choice for any type of genetic identification. Results We analyzed over 300 SNP in the genome of grapevine using a re-sequencing strategy in a selection of 11 genotypes. Among the identified polymorphisms, we selected 48 SNP spread across all grapevine chromosomes with allele frequencies balanced enough as to provide sufficient information content for genetic identification in grapevine allowing for good genotyping success rate. Marker stability was tested in repeated analyses of a selected group of cultivars obtained worldwide to demonstrate their usefulness in genetic identification. Conclusions We have selected a set of 48 stable SNP markers with a high discrimination power and a uniform genome distribution (2-3 markers/chromosome), which is proposed as a standard set for grapevine (Vitis vinifera L.) genotyping. Any previous problems derived from microsatellite allele confusion between labs or the need to run reference cultivars to identify allele sizes disappear using this type of marker. Furthermore, because SNP

  12. SNP-SNP interactions in breast cancer susceptibility

    PubMed Central

    Onay, Venüs Ümmiye; Briollais, Laurent; Knight, Julia A; Shi, Ellen; Wang, Yuanyuan; Wells, Sean; Li, Hong; Rajendram, Isaac; Andrulis, Irene L; Ozcelik, Hilmi

    2006-01-01

    Background Breast cancer predisposition genes identified to date (e.g., BRCA1 and BRCA2) are responsible for less than 5% of all breast cancer cases. Many studies have shown that the cancer risks associated with individual commonly occurring single nucleotide polymorphisms (SNPs) are incremental. However, polygenic models suggest that multiple commonly occurring low to modestly penetrant SNPs of cancer related genes might have a greater effect on a disease when considered in combination. Methods In an attempt to identify the breast cancer risk conferred by SNP interactions, we have studied 19 SNPs from genes involved in major cancer related pathways. All SNPs were genotyped by TaqMan 5'nuclease assay. The association between the case-control status and each individual SNP, measured by the odds ratio and its corresponding 95% confidence interval, was estimated using unconditional logistic regression models. At the second stage, two-way interactions were investigated using multivariate logistic models. The robustness of the interactions, which were observed among SNPs with stronger functional evidence, was assessed using a bootstrap approach, and correction for multiple testing based on the false discovery rate (FDR) principle. Results None of these SNPs contributed to breast cancer risk individually. However, we have demonstrated evidence for gene-gene (SNP-SNP) interaction among these SNPs, which were associated with increased breast cancer risk. Our study suggests cross talk between the SNPs of the DNA repair and immune system (XPD-[Lys751Gln] and IL10-[G(-1082)A]), cell cycle and estrogen metabolism (CCND1-[Pro241Pro] and COMT-[Met108/158Val]), cell cycle and DNA repair (BARD1-[Pro24Ser] and XPD-[Lys751Gln]), and within carcinogen metabolism (GSTP1-[Ile105Val] and COMT-[Met108/158Val]) pathways. Conclusion The importance of these pathways and their communication in breast cancer predisposition has been emphasized previously, but their biological interactions

  13. QuantiSNP: an Objective Bayes Hidden-Markov Model to detect and accurately map copy number variation using SNP genotyping data.

    PubMed

    Colella, Stefano; Yau, Christopher; Taylor, Jennifer M; Mirza, Ghazala; Butler, Helen; Clouston, Penny; Bassett, Anne S; Seller, Anneke; Holmes, Christopher C; Ragoussis, Jiannis

    2007-01-01

    Array-based technologies have been used to detect chromosomal copy number changes (aneuploidies) in the human genome. Recent studies identified numerous copy number variants (CNV) and some are common polymorphisms that may contribute to disease susceptibility. We developed, and experimentally validated, a novel computational framework (QuantiSNP) for detecting regions of copy number variation from BeadArray SNP genotyping data using an Objective Bayes Hidden-Markov Model (OB-HMM). Objective Bayes measures are used to set certain hyperparameters in the priors using a novel re-sampling framework to calibrate the model to a fixed Type I (false positive) error rate. Other parameters are set via maximum marginal likelihood to prior training data of known structure. QuantiSNP provides probabilistic quantification of state classifications and significantly improves the accuracy of segmental aneuploidy identification and mapping, relative to existing analytical tools (Beadstudio, Illumina), as demonstrated by validation of breakpoint boundaries. QuantiSNP identified both novel and validated CNVs. QuantiSNP was developed using BeadArray SNP data but it can be adapted to other platforms and we believe that the OB-HMM framework has widespread applicability in genomic research. In conclusion, QuantiSNP is a novel algorithm for high-resolution CNV/aneuploidy detection with application to clinical genetics, cancer and disease association studies. PMID:17341461

  14. BM-SNP: A Bayesian Model for SNP Calling Using High Throughput Sequencing Data.

    PubMed

    Xu, Yanxun; Zheng, Xiaofeng; Yuan, Yuan; Estecio, Marcos R; Issa, Jean-Pierre; Qiu, Peng; Ji, Yuan; Liang, Shoudan

    2014-01-01

    A single-nucleotide polymorphism (SNP) is a sole base change in the DNA sequence and is the most common polymorphism. Detection and annotation of SNPs are among the central topics in biomedical research as SNPs are believed to play important roles in the manifestation of phenotypic events, such as disease susceptibility. To take full advantage of the next-generation sequencing (NGS) technology, we propose a Bayesian approach, BM-SNP, to identify SNPs based on the posterior inference using NGS data. In particular, BM-SNP computes the posterior probability of nucleotide variation at each covered genomic position using the contents and frequency of the mapped short reads. The position with a high posterior probability of nucleotide variation is flagged as a potential SNP. We apply BM-SNP to two cell-line NGS datasets, and the results show a high ratio of overlap (>95 percent) with the dbSNP database. Compared with MAQ, BM-SNP identifies more SNPs that are in dbSNP, with higher quality. The SNPs that are called only by BM-SNP but not in dbSNP may serve as new discoveries. The proposed BM-SNP method integrates information from multiple aspects of NGS data, and therefore achieves high detection power. BM-SNP is fast, capable of processing whole genome data at 20-fold average coverage in a short amount of time. PMID:26357041
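    To make the idea of a per-position posterior probability of nucleotide variation concrete, the sketch below applies Bayes' rule to read counts under a simple three-genotype binomial model. This is a generic textbook-style model and not the BM-SNP model itself, whose details the abstract does not give. The error rate, the prior, and the equal split of the prior between heterozygous and homozygous-alternate genotypes are all assumptions chosen for illustration.

```python
from math import comb


def variant_posterior(n_reads: int, n_alt: int,
                      error_rate: float = 0.01,
                      prior_variant: float = 0.001) -> float:
    """Posterior probability that a site is non-reference, given alternate-read counts."""
    def binom(p_alt: float) -> float:
        # Binomial likelihood of seeing n_alt alternate reads out of n_reads.
        return comb(n_reads, n_alt) * p_alt ** n_alt * (1 - p_alt) ** (n_reads - n_alt)

    # Chance that a single read shows the alternate base under each genotype.
    like_hom_ref = binom(error_rate)
    like_het = binom(0.5)
    like_hom_alt = binom(1 - error_rate)

    # Assumed priors: the variant prior is split equally between the two variant genotypes.
    prior_het = prior_hom_alt = prior_variant / 2
    prior_ref = 1 - prior_variant

    variant_mass = prior_het * like_het + prior_hom_alt * like_hom_alt
    return variant_mass / (prior_ref * like_hom_ref + variant_mass)


print(variant_posterior(20, 0))    # no alternate reads: posterior close to 0
print(variant_posterior(20, 10))   # half the reads are alternate: posterior close to 1
```

    A position would then be flagged as a candidate SNP when this posterior exceeds a chosen cutoff.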

  15. Detecting Susceptibility to Breast Cancer with SNP-SNP Interaction Using BPSOHS and Emotional Neural Networks.

    PubMed

    Wang, Xiao; Peng, Qinke; Fan, Yue

    2016-01-01

    Studies for the association between diseases and informative single nucleotide polymorphisms (SNPs) have received great attention. However, most of them just use the whole set of useful SNPs and fail to consider the SNP-SNP interactions, while these interactions have already been proven in biology experiments. In this paper, we use a binary particle swarm optimization with hierarchical structure (BPSOHS) algorithm to improve the effectiveness of PSO for the identification of the SNP-SNP interactions. Furthermore, in order to use these SNP interactions in the susceptibility analysis, we propose an emotional neural network (ENN) to treat SNP interactions as emotional tendency. Different from the normal architecture, just as the emotional brain, this architecture provides a specific path to treat the emotional value, by which the SNP interactions can be considered more quickly and directly. The ENN helps us use the prior knowledge about the SNP interactions and other influence factors together. Finally, the experimental results prove that the proposed BPSOHS_ENN algorithm can detect the informative SNP-SNP interaction and predict the breast cancer risk with a much higher accuracy than existing methods. PMID:27294121

  16. Detecting Susceptibility to Breast Cancer with SNP-SNP Interaction Using BPSOHS and Emotional Neural Networks

    PubMed Central

    Wang, Xiao; Fan, Yue

    2016-01-01

    Studies for the association between diseases and informative single nucleotide polymorphisms (SNPs) have received great attention. However, most of them just use the whole set of useful SNPs and fail to consider the SNP-SNP interactions, while these interactions have already been proven in biology experiments. In this paper, we use a binary particle swarm optimization with hierarchical structure (BPSOHS) algorithm to improve the effectiveness of PSO for the identification of the SNP-SNP interactions. Furthermore, in order to use these SNP interactions in the susceptibility analysis, we propose an emotional neural network (ENN) to treat SNP interactions as emotional tendency. Different from the normal architecture, just as the emotional brain, this architecture provides a specific path to treat the emotional value, by which the SNP interactions can be considered more quickly and directly. The ENN helps us use the prior knowledge about the SNP interactions and other influence factors together. Finally, the experimental results prove that the proposed BPSOHS_ENN algorithm can detect the informative SNP-SNP interaction and predict the breast cancer risk with a much higher accuracy than existing methods. PMID:27294121

  17. SNP Arrays for Species Identification in Salmonids.

    PubMed

    Wenne, Roman; Drywa, Agata; Kent, Matthew; Sundsaasen, Kristil Kindem; Lien, Sigbjørn

    2016-01-01

    The use of SNP genotyping microarrays, developed in one species to analyze a closely related species for which genomic sequence information is scarce, enables the rapid development of a genomic resource (SNP information) without the need to develop new species-specific markers. Using large numbers of microarray SNPs offers the best chance to detect informative markers in nontarget species, markers that can very often be assayed using a lower throughput platform as is described in this paper. PMID:27460372

  18. SNP analysis of follistatin gene associated with polycystic ovarian syndrome

    PubMed Central

    Panneerselvam, Palanisamy; Sivakumari, Kanakarajan; Jayaprakash, Ponmani; Srikanth, Ramanathan

    2010-01-01

    Follistatin has been reported as a candidate gene for polycystic ovarian syndrome (PCOS) based on linkage and association studies. In this study, investigation of polymorphisms in the FST gene was done to determine if genetic variation is associated with susceptibility to PCOS. The nucleotide sequence of human follistatin and the protein sequence of human follistatin were retrieved from the NCBI database using Entrez. The follistatin protein of human was retrieved from the Swiss-Prot database. There are 344 amino acids and the molecular weight is 38,007 Da. The ProtParam analysis shows that the isoelectric point is 5.53 and the aliphatic index is 61.25. The hydropathicity is −0.490. The domains in FST protein are as follows: Pfam-B 5005 domain from 1 to 92; EGF-like subdomain from 93 to 116; Kazal 1 domain, occurred in three places, namely, 118–164, 192–239, and 270–316. There are 31 single-nucleotide polymorphisms (SNPs) for this gene. Some are nonsynonymous, some occur in the intron region, and some in an untranslated region. Two nonsynonymous SNPs, namely, rs11745088 and rs1127760, were taken for analysis. In the SNP rs11745088, the change is E152Q. Likewise, in rs1127760, the change is C239S. SIFT (Sorting Intolerant from Tolerant) showed positions of amino acids and the single letter code of amino acids that can be tolerated or deleterious for each position. There were six SNP results and each result had links to it. The dbSNP id, primary database id, and the type of mutation whether silent and if occurring in coding region are given as phenotype alterations. The FASTA format of protein was given to the nsSNP Analyzer tool, and the variation E152Q and C239S were given as inputs in the SNP data field. E152Q change was neutral and C239S causes disease. Using PANTHER for evolutionary analysis of coding SNPs, the protein sequence was given as input and analyzed for the E152Q and C239S SNPs for deleterious effect on protein function. The genetic association

  19. SNP analysis of follistatin gene associated with polycystic ovarian syndrome.

    PubMed

    Panneerselvam, Palanisamy; Sivakumari, Kanakarajan; Jayaprakash, Ponmani; Srikanth, Ramanathan

    2010-01-01

    Follistatin has been reported as a candidate gene for polycystic ovarian syndrome (PCOS) based on linkage and association studies. In this study, investigation of polymorphisms in the FST gene was done to determine if genetic variation is associated with susceptibility to PCOS. The nucleotide sequence of human follistatin and the protein sequence of human follistatin were retrieved from the NCBI database using Entrez. The follistatin protein of human was retrieved from the Swiss-Prot database. There are 344 amino acids and the molecular weight is 38,007 Da. The ProtParam analysis shows that the isoelectric point is 5.53 and the aliphatic index is 61.25. The hydropathicity is -0.490. The domains in FST protein are as follows: Pfam-B 5005 domain from 1 to 92; EGF-like subdomain from 93 to 116; Kazal 1 domain, occurred in three places, namely, 118-164, 192-239, and 270-316. There are 31 single-nucleotide polymorphisms (SNPs) for this gene. Some are nonsynonymous, some occur in the intron region, and some in an untranslated region. Two nonsynonymous SNPs, namely, rs11745088 and rs1127760, were taken for analysis. In the SNP rs11745088, the change is E152Q. Likewise, in rs1127760, the change is C239S. SIFT (Sorting Intolerant from Tolerant) showed positions of amino acids and the single letter code of amino acids that can be tolerated or deleterious for each position. There were six SNP results and each result had links to it. The dbSNP id, primary database id, and the type of mutation whether silent and if occurring in coding region are given as phenotype alterations. The FASTA format of protein was given to the nsSNP Analyzer tool, and the variation E152Q and C239S were given as inputs in the SNP data field. E152Q change was neutral and C239S causes disease. Using PANTHER for evolutionary analysis of coding SNPs, the protein sequence was given as input and analyzed for the E152Q and C239S SNPs for deleterious effect on protein function. The genetic association

  20. Genome-wide SNP detection, validation, and development of an 8K SNP array for apple

    Technology Transfer Automated Retrieval System (TEKTRAN)

    As high-throughput genetic marker screening systems are essential for a range of genetics studies and plant breeding applications, the International RosBREED SNP Consortium (IRSC) has utilized the Illumina Infinium® II system to develop a medium- to high-throughput SNP screening tool for genome-wide...

  1. SNPMeta: SNP annotation and SNP metadata collection without a reference genome

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The increase in availability of resequencing data is greatly accelerating SNP discovery and has facilitated the development of SNP genotyping assays. This, in turn, is increasing interest in annotation of individual SNPs. Currently, these data are only available through curation, or comparison to a ...

  2. Construction of a high-density DArTseq SNP-based genetic map and identification of genomic regions with segregation distortion in a genetic population derived from a cross between feral and cultivated-type watermelon.

    PubMed

    Ren, Runsheng; Ray, Rumiana; Li, Pingfang; Xu, Jinhua; Zhang, Man; Liu, Guang; Yao, Xiefeng; Kilian, Andrzej; Yang, Xingping

    2015-08-01

    Watermelon [Citrullus lanatus (Thunb.) Matsum. & Nakai] is an economically important vegetable crop grown extensively worldwide. To facilitate the identification of agronomically important traits and provide new information for genetic and genomic research on this species, a high-density genetic linkage map of watermelon was constructed using an F2 population derived from a cross between elite watermelon cultivar K3 and wild watermelon germplasm PI 189225. Based on a sliding window approach, a total of 1,161 bin markers representing 3,465 SNP markers were mapped onto 11 linkage groups corresponding to the chromosome pair number of watermelon. The total length of the genetic map is 1,099.2 cM, with an average distance between bins of 1.0 cM. The number of markers in each chromosome varies from 62 in chromosome 07 to 160 in chromosome 05. The length of individual chromosomes ranged between 61.8 cM for chromosome 07 and 140.2 cM for chromosome 05. A total of 616 SNP bin markers showed significant (P < 0.05) segregation distortion across all 11 chromosomes, and 513 (83.3 %) of these distorted loci showed distortion in favor of the elite watermelon cultivar K3 allele and 103 were skewed toward PI 189225. The number of SNPs and InDels per Mb varied considerably across the segregation distorted regions (SDRs) on each chromosome, and a mixture of dense and sparse SNPs and InDel SDRs coexisted on some chromosomes suggesting that SDRs were randomly distributed throughout the genome. Recombination rates varied greatly among each chromosome, from 2.0 to 4.2 centimorgans per megabase (cM/Mb). An inconsistency was found between the genetic and physical positions on the map for a segment on chromosome 11. The high-density genetic map described in the present study will facilitate fine mapping of quantitative trait loci, the identification of candidate genes, map-based cloning, as well as marker-assisted selection (MAS) in watermelon breeding programs. PMID:25702268
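    Segregation distortion in an F2 population is commonly assessed with a chi-square goodness-of-fit test against the 1:2:1 genotype ratio expected for a codominant marker. The sketch below shows that calculation in plain Python. It is an assumption that a test of this kind underlies the reported P < 0.05 cutoff, because the abstract does not name the test; the genotype counts are hypothetical. The last lines check the reported share of loci skewed toward the K3 allele.

```python
from math import exp
from typing import Tuple


def f2_segregation_test(n_aa: int, n_ab: int, n_bb: int) -> Tuple[float, float]:
    """Chi-square statistic and P value for departure from a 1:2:1 F2 ratio."""
    total = n_aa + n_ab + n_bb
    if total == 0:
        raise ValueError("no genotypes supplied")
    observed = (n_aa, n_ab, n_bb)
    expected = (total / 4, total / 2, total / 4)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # With 2 degrees of freedom the chi-square survival function is exactly exp(-x / 2).
    p_value = exp(-chi2 / 2)
    return chi2, p_value


# Hypothetical genotype counts for one bin marker (not data from the study).
chi2, p = f2_segregation_test(40, 50, 10)
print(f"chi2 = {chi2:.1f}, distorted at P < 0.05: {p < 0.05}")

# Share of distorted bin markers favouring the K3 allele, from the abstract's counts.
toward_k3, toward_pi = 513, 103
print(f"{toward_k3 / (toward_k3 + toward_pi):.1%}")  # prints 83.3%
```

    The closed-form P value holds only for 2 degrees of freedom; dominant markers scored 3:1 would need the 1-degree-of-freedom distribution instead.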

  3. SNP Discovery Using Next Generation Transcriptomic Sequencing.

    PubMed

    De Wit, Pierre

    2016-01-01

    In this chapter, I will guide the user through methods to find new SNP markers from expressed sequence (RNA-Seq) data, focusing on the sample preparation and also on the bioinformatic analyses needed to sort through the immense flood of data from high-throughput sequencing machines. The general steps included are as follows: sample preparation, sequencing, quality control of data, assembly, mapping, SNP discovery, filtering, validation. The first few steps are traditional laboratory protocols, whereas steps following the sequencing are of bioinformatic nature. The bioinformatics described herein are by no means exhaustive, rather they serve as one example of a simple way of analyzing high-throughput sequence data to find SNP markers. Ideally, one would like to run through this protocol several times with a new dataset, while varying software parameters slightly, in order to determine the robustness of the results. The final validation step, although not described in much detail here, is also quite critical as that will be the final test of the accuracy of the assumptions made in silico. There is a plethora of downstream applications of a SNP dataset, not covered in this chapter. For an example of a more thorough protocol also including differential gene expression and functional enrichment analyses, BLAST annotation and downstream applications of SNP markers, a good starting point could be the "Simple Fool's Guide to population genomics via RNA-Seq," which is available at http://sfg.stanford.edu. PMID:27460371

  4. Rapid Identification of Ginseng Cultivars (Panax ginseng Meyer) Using Novel SNP-Based Probes

    PubMed Central

    Jo, Ick-Hyun; Bang, Kyong Hwan; Kim, Young-Chang; Lee, Jei-Wan; Seo, A-Yeon; Seong, Bong-Jae; Kim, Hyun-Ho; Kim, Dong-Hwi; Cha, Seon-Woo; Cho, Yong-Gu; Kim, Hong-Sig

    2011-01-01

    In order to develop a novel system for the discrimination of five ginseng cultivars (Panax ginseng Meyer), single nucleotide polymorphism (SNP) genotyping assays with real-time polymerase chain reaction were conducted. Nucleotide substitution in gDNA library clones of P. ginseng cv. Yunpoong was targeted for the SNP genotyping assay. From these SNP sites, a set of modified SNP specific fluorescence probes (PGP74, PGP110, and PGP130) and novel primer sets have been developed to distinguish among five ginseng cultivars. The combination of the SNP type of the five cultivars, Chungpoong, Yunpoong, Gopoong, Kumpoong, and Sunpoong, was identified as ‘ATA’, ‘GCC’, ‘GTA’, ‘GCA’, and ‘ACC’, respectively. This study represents the first report of the identification of ginseng cultivars by fluorescence probes. An SNP genotyping assay using fluorescence probes could prove useful for the identification of ginseng cultivars and ginseng seed management systems and guarantee the purity of ginseng seed. PMID:23717098
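
    The three-letter genotype combinations reported above behave like a lookup key. A minimal sketch of that idea follows; the dictionary and function names are hypothetical, and the order of the three bases is assumed to follow the probe order PGP74, PGP110, PGP130, which the abstract does not state.

```python
# Genotype combinations as reported in the abstract. The base order is an
# assumption (PGP74, PGP110, PGP130); the abstract does not specify it.
CULTIVAR_BY_GENOTYPE = {
    "ATA": "Chungpoong",
    "GCC": "Yunpoong",
    "GTA": "Gopoong",
    "GCA": "Kumpoong",
    "ACC": "Sunpoong",
}


def identify_cultivar(calls):
    """Map three SNP base calls to a cultivar name, or 'unknown'."""
    key = "".join(base.upper() for base in calls)
    return CULTIVAR_BY_GENOTYPE.get(key, "unknown")
```

    Because all five combinations are distinct, three probes are enough to separate the five cultivars; any other combination is reported as unknown rather than guessed.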

  5. SNP Array in Hematopoietic Neoplasms: A Review

    PubMed Central

    Song, Jinming; Shao, Haipeng

    2015-01-01

    Cytogenetic analysis is essential for the diagnosis and prognosis of hematopoietic neoplasms in current clinical practice. Many hematopoietic malignancies are characterized by structural chromosomal abnormalities such as specific translocations, inversions, deletions and/or numerical abnormalities that can be identified by karyotype analysis or fluorescence in situ hybridization (FISH) studies. Single nucleotide polymorphism (SNP) arrays offer high-resolution identification of copy number variants (CNVs) and acquired copy-neutral loss of heterozygosity (LOH)/uniparental disomy (UPD) that are usually not identifiable by conventional cytogenetic analysis and FISH studies. As a result, SNP arrays have been increasingly applied to hematopoietic neoplasms to search for clinically-significant genetic abnormalities. A large number of CNVs and UPDs have been identified in a variety of hematopoietic neoplasms. CNVs detected by SNP array in some hematopoietic neoplasms are of prognostic significance. A few specific genes in the affected regions have been implicated in the pathogenesis and may be the targets for specific therapeutic agents in the future. In this review, we summarize the current findings of application of SNP arrays in a variety of hematopoietic malignancies with an emphasis on the clinically significant genetic variants. PMID:27600067

  6. The SNP Consortium website: past, present and future.

    PubMed

    Thorisson, Gudmundur A; Stein, Lincoln D

    2003-01-01

    The SNP Consortium website (http://snp.cshl.org) has undergone many changes since its initial conception three years ago. The database back end has been changed from the venerable ACeDB to the more scalable MySQL engine. Users can access the data via gene or single nucleotide polymorphism (SNP) keyword searches and browse or dump SNP data to textfiles. A graphical genome browsing interface shows SNPs mapped onto the genome assembly in the context of externally available gene predictions and other features. SNP allele frequency and genotype data are available via FTP-download and on individual SNP report web pages. SNP linkage maps are available for download and for browsing in a comparative map viewer. All software components of the data coordinating center (DCC) website (http://snp.cshl.org) are open source. PMID:12519964

  7. An Improved Opposition-Based Learning Particle Swarm Optimization for the Detection of SNP-SNP Interactions

    PubMed Central

    Shang, Junliang; Sun, Yan; Li, Shengjun; Liu, Jin-Xing; Zheng, Chun-Hou; Zhang, Junying

    2015-01-01

    SNP-SNP interactions have been receiving increasing attention in understanding the mechanism underlying susceptibility to complex diseases. Though many works have been done for the detection of SNP-SNP interactions, the algorithmic development is still ongoing. In this study, an improved opposition-based learning particle swarm optimization (IOBLPSO) is proposed for the detection of SNP-SNP interactions. Highlights of IOBLPSO are the introduction of three strategies, namely, opposition-based learning, dynamic inertia weight, and a postprocedure. Opposition-based learning not only enhances the global explorative ability, but also avoids premature convergence. Dynamic inertia weight allows particles to cover a wider search space when the considered SNP is likely to be a random one and converges on promising regions of the search space while capturing a highly suspected SNP. The postprocedure is used to carry out a deep search in highly suspected SNP sets. Experiments of IOBLPSO are performed on both simulation data sets and a real data set of age-related macular degeneration, results of which demonstrate that IOBLPSO is promising in detecting SNP-SNP interactions. IOBLPSO might be an alternative to existing methods for detecting SNP-SNP interactions. PMID:26236727
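
    Opposition-based learning, the first of the three strategies named above, can be illustrated with the generic definition of an opposite point: for a coordinate x bounded by [lower, upper], the opposite is lower + upper - x. The sketch below shows only that generic step, not the IOBLPSO implementation; the function names and the fitness callable are hypothetical.

```python
def opposite(position, lower, upper):
    """Return the opposite point of `position` within per-dimension bounds."""
    return [lo + hi - x for x, lo, hi in zip(position, lower, upper)]


def keep_better(position, lower, upper, fitness):
    """Keep whichever of a candidate and its opposite scores higher.

    `fitness` is any callable that scores a candidate (higher is better),
    for example an association score for a set of SNP indices.
    """
    candidate = opposite(position, lower, upper)
    return candidate if fitness(candidate) > fitness(position) else position
```

    Evaluating both a candidate and its opposite gives the search a second sample on the far side of the space at the cost of one extra fitness evaluation.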

  8. Mycobacterium leprae in Colombia described by SNP7614 in gyrA, two minisatellites and geography

    PubMed Central

    Cardona-Castro, Nora; Beltrán-Alzate, Juan Camilo; Romero-Montoya, Irma Marcela; Li, Wei; Brennan, Patrick J; Vissa, Varalakshmi

    2013-01-01

    New cases of leprosy are still being detected in Colombia after the country declared achievement of the WHO defined ‘elimination’ status. To study the ecology of leprosy in endemic regions, a combination of geographic and molecular tools were applied for a group of 201 multibacillary patients including six multi-case families from eleven departments. The location (latitude and longitude) of patient residences were mapped. Slit skin smears and/or skin biopsies were collected and DNA was extracted. Standard agarose gel electrophoresis following a multiplex PCR was developed for rapid and inexpensive strain typing of M. leprae based on copy numbers of two VNTR minisatellite loci 27-5 and 12-5. A SNP (C/T) in gyrA (SNP7614) was mapped by introducing a novel PCR-RFLP into an ongoing drug resistance surveillance effort. Multiple genotypes were detected combining the three molecular markers. The two frequent genotypes in Colombia were SNP7614(C)/27-5(5)/12-5(4) [C54] predominantly distributed in the Atlantic departments and SNP7614 (T)/27-5(4)/12-5(5) [T45] associated with the Andean departments. A novel genotype SNP7614 (C)/27-5(6)/12-5(4) [C64] was detected in cities along the Magdalena river which separates the Andean from Atlantic departments; a subset was further characterized showing association with a rare allele of minisatellite 23-3 and the SNP type 1 of M. leprae. The genotypes within intra-family cases were conserved. Overall, this is the first large scale study that utilized simple and rapid assay formats for identification of major strain types and their distribution in Colombia. It provides the framework for further strain type discrimination and geographic information systems as tools for tracing transmission of leprosy. PMID:23291420

  9. Mycobacterium leprae in Colombia described by SNP7614 in gyrA, two minisatellites and geography.

    PubMed

    Cardona-Castro, Nora; Beltrán-Alzate, Juan Camilo; Romero-Montoya, Irma Marcela; Li, Wei; Brennan, Patrick J; Vissa, Varalakshmi

    2013-03-01

    New cases of leprosy are still being detected in Colombia after the country declared achievement of the WHO defined 'elimination' status. To study the ecology of leprosy in endemic regions, a combination of geographic and molecular tools were applied for a group of 201 multibacillary patients including six multi-case families from eleven departments. The location (latitude and longitude) of patient residences were mapped. Slit skin smears and/or skin biopsies were collected and DNA was extracted. Standard agarose gel electrophoresis following a multiplex PCR was developed for rapid and inexpensive strain typing of Mycobacterium leprae based on copy numbers of two VNTR minisatellite loci 27-5 and 12-5. A SNP (C/T) in gyrA (SNP7614) was mapped by introducing a novel PCR-RFLP into an ongoing drug resistance surveillance effort. Multiple genotypes were detected combining the three molecular markers. The two frequent genotypes in Colombia were SNP7614(C)/27-5(5)/12-5(4) [C54] predominantly distributed in the Atlantic departments and SNP7614 (T)/27-5(4)/12-5(5) [T45] associated with the Andean departments. A novel genotype SNP7614 (C)/27-5(6)/12-5(4) [C64] was detected in cities along the Magdalena river which separates the Andean from Atlantic departments; a subset was further characterized showing association with a rare allele of minisatellite 23-3 and the SNP type 1 of M. leprae. The genotypes within intra-family cases were conserved. Overall, this is the first large scale study that utilized simple and rapid assay formats for identification of major strain types and their distribution in Colombia. It provides the framework for further strain type discrimination and geographic information systems as tools for tracing transmission of leprosy. PMID:23291420

  10. Genome-wide SNP discovery in walnut with an AGSNP pipeline updated for SNP discovery in allogamous organisms

    PubMed Central

    2012-01-01

    Background A genome-wide set of single nucleotide polymorphisms (SNPs) is a valuable resource in genetic research and breeding and is usually developed by re-sequencing a genome. If a genome sequence is not available, an alternative strategy must be used. We previously reported the development of a pipeline (AGSNP) for genome-wide SNP discovery in coding sequences and other single-copy DNA without a complete genome sequence in self-pollinating (autogamous) plants. Here we updated this pipeline for SNP discovery in outcrossing (allogamous) species and demonstrated its efficacy in SNP discovery in walnut (Juglans regia L.). Results The first step in the original implementation of the AGSNP pipeline was the construction of a reference sequence and the identification of single-copy sequences in it. To identify single-copy sequences, multiple genome equivalents of short SOLiD reads of another individual were mapped to shallow genome coverage of long Sanger or Roche 454 reads making up the reference sequence. The relative depth of SOLiD reads was used to filter out repeated sequences from single-copy sequences in the reference sequence. The second step was a search for SNPs between SOLiD reads and the reference sequence. Polymorphism within the mapped SOLiD reads would have precluded SNP discovery; hence both individuals had to be homozygous. The AGSNP pipeline was updated here for using SOLiD or other type of short reads of a heterozygous individual for these two principal steps. A total of 32.6X walnut genome equivalents of SOLiD reads of vegetatively propagated walnut scion cultivar ‘Chandler’ were mapped to 48,661 ‘Chandler’ bacterial artificial chromosome (BAC) end sequences (BESs) produced by Sanger sequencing during the construction of a walnut physical map. A total of 22,799 putative SNPs were initially identified. A total of 6,000 Infinium II type SNPs evenly distributed along the walnut physical map were selected for the construction of an Infinium Bead

  11. A SNP-Based Molecular Barcode for Characterization of Common Wheat

    PubMed Central

    Gao, LiFeng; Jia, JiZeng; Kong, XiuYing

    2016-01-01

    Wheat is grown as a staple crop worldwide. It is important to develop an effective genotyping tool for this cereal grain both to identify germplasm diversity and to protect the rights of breeders. Single-nucleotide polymorphism (SNP) genotyping provides a means for developing a practical, rapid, inexpensive and high-throughput assay. Here, we investigated SNPs as robust markers of genetic variation for typing wheat cultivars. We identified SNPs from an array of 9000 across a collection of 429 well-known wheat cultivars grown in China, of which 43 SNP markers with high minor allele frequency and variations discriminated the selected wheat varieties and their wild ancestors. This SNP-based barcode will allow for the rapid and precise identification of wheat germplasm resources and newly released varieties and will further assist in the wheat breeding program. PMID:26985664

  12. A SNP-Based Molecular Barcode for Characterization of Common Wheat.

    PubMed

    Gao, LiFeng; Jia, JiZeng; Kong, XiuYing

    2016-01-01

    Wheat is grown as a staple crop worldwide. It is important to develop an effective genotyping tool for this cereal grain both to identify germplasm diversity and to protect the rights of breeders. Single-nucleotide polymorphism (SNP) genotyping provides a means for developing a practical, rapid, inexpensive and high-throughput assay. Here, we investigated SNPs as robust markers of genetic variation for typing wheat cultivars. We identified SNPs from an array of 9000 across a collection of 429 well-known wheat cultivars grown in China, of which 43 SNP markers with high minor allele frequency and variations discriminated the selected wheat varieties and their wild ancestors. This SNP-based barcode will allow for the rapid and precise identification of wheat germplasm resources and newly released varieties and will further assist in the wheat breeding program. PMID:26985664

  13. Developing single nucleotide polymorphism (SNP) markers from transcriptome sequences for identification of longan (Dimocarpus longan) germplasm

    PubMed Central

    Wang, Boyi; Tan, Hua-Wei; Fang, Wanping; Meinhardt, Lyndel W; Mischke, Sue; Matsumoto, Tracie; Zhang, Dapeng

    2015-01-01

    Longan (Dimocarpus longan Lour.) is an important tropical fruit tree crop. Accurate varietal identification is essential for germplasm management and breeding. Using longan transcriptome sequences from public databases, we developed single nucleotide polymorphism (SNP) markers; validated 60 SNPs in 50 longan germplasm accessions, including cultivated varieties and wild germplasm; and designated 25 SNP markers that unambiguously identified all tested longan varieties with high statistical rigor (P<0.0001). Multiple trees from the same clone were verified and off-type trees were identified. Diversity analysis revealed genetic relationships among analyzed accessions. Cultivated varieties differed significantly from wild populations (Fst=0.300; P<0.001), demonstrating untapped genetic diversity for germplasm conservation and utilization. Within cultivated varieties, apparent differences between varieties from China and those from Thailand and Hawaii indicated geographic patterns of genetic differentiation. These SNP markers provide a powerful tool to manage longan genetic resources and breeding, with accurate and efficient genotype identification. PMID:26504559

  14. Applying SNP marker technology in the cacao breeding program at the Cocoa Research Institute of Ghana

    Technology Transfer Automated Retrieval System (TEKTRAN)

    In this investigation 45 parental cacao plants and five progeny derived from the parental stock studied were genotyped using six SNP markers to determine off-types or mislabeled clones and to authenticate crosses made in the Cocoa Research Institute of Ghana (CRIG) breeding program. Investigation wa...

  15. High-throughput SNP genotyping in Cucurbita pepo for map construction and quantitative trait loci mapping

    PubMed Central

    2012-01-01

    Background Cucurbita pepo is a member of the Cucurbitaceae family, the second-most important horticultural family in terms of economic importance after Solanaceae. The "summer squash" types, including Zucchini and Scallop, rank among the highest-valued vegetables worldwide. There are few genomic tools available for this species. The first Cucurbita transcriptome, along with a large collection of Single Nucleotide Polymorphisms (SNP), was recently generated using massive sequencing. A set of 384 SNP was selected to generate an Illumina GoldenGate assay in order to construct the first SNP-based genetic map of Cucurbita and map quantitative trait loci (QTL). Results We herein present the construction of the first SNP-based genetic map of Cucurbita pepo using a population derived from the cross of two varieties with contrasting phenotypes, representing the main cultivar groups of the species' two subspecies: Zucchini (subsp. pepo) × Scallop (subsp. ovifera). The mapping population was genotyped with 384 SNP, a set of selected EST-SNP identified in silico after massive sequencing of the transcriptomes of both parents, using the Illumina GoldenGate platform. The global success rate of the assay was higher than 85%. In total, 304 SNP were mapped, along with 11 SSR from a previous map, giving a map density of 5.56 cM/marker. This map was used to infer syntenic relationships between C. pepo and cucumber and to successfully map QTL that control plant, flowering and fruit traits that are of benefit to squash breeding. The QTL effects were validated in backcross populations. Conclusion Our results show that massive sequencing in different genotypes is an excellent tool for SNP discovery, and that the Illumina GoldenGate platform can be successfully applied to constructing genetic maps and performing QTL analysis in Cucurbita. This is the first SNP-based genetic map in the Cucurbita genus and is an invaluable new tool for biological research, especially considering that most

  16. METU-SNP: an integrated software system for SNP-complex disease association analysis.

    PubMed

    Ustünkar, Gürkan; Aydın Son, Yeşim

    2011-01-01

    Recently, there has been increasing research to discover genomic biomarkers, haplotypes, and potentially other variables that together contribute to the development of diseases. Single Nucleotide Polymorphisms (SNPs) are the most common form of genomic variations and they can represent an individual’s genetic variability in greatest detail. Genome-wide association studies (GWAS) of SNPs, high-dimensional case-control studies, are among the most promising approaches for identifying disease causing variants. METU-SNP software is a Java based integrated desktop application specifically designed for the prioritization of SNP biomarkers and the discovery of genes and pathways related to diseases via analysis of the GWAS case-control data. Outputs of METU-SNP can easily be utilized for the downstream biomarkers research to allow the prediction and the diagnosis of diseases and other personalized medical approaches. Here, we introduce and describe the system functionality and architecture of the METU-SNP. We believe that the METU-SNP will help researchers with the reliable identification of SNPs that are involved in the etiology of complex diseases, ultimately supporting the development of personalized medicine approaches and targeted drug discoveries. PMID:22156365

  17. RNASEL and MIR146A SNP-SNP Interaction as a Susceptibility Factor for Non-Melanoma Skin Cancer

    PubMed Central

    Farzan, Shohreh F.; Karagas, Margaret R.; Christensen, Brock C.; Li, Zhongze; Kuriger, Jacquelyn K.; Nelson, Heather H.

    2014-01-01

    Immunity and inflammatory pathways are important in the genesis of non-melanoma skin cancers (NMSC). Functional genetic variation in immune modulators has the potential to affect disease etiology. We investigated associations between common variants in two key regulators, MIR146A and RNASEL, and their relation to NMSCs. Using a large population-based case-control study of basal cell (BCC) and squamous cell carcinoma (SCC), we investigated the impact of MIR146A SNP rs2910164 on cancer risk, and interaction with a SNP in one of its putative targets (RNASEL, rs486907). To examine associations between genotype and BCC and SCC, occurrence odds ratios (OR) and 95% confidence intervals (95%CI) were calculated using unconditional logistic regression, accounting for multiple confounding factors. We did not observe an overall change in the odds ratios for SCC or BCC among individuals carrying either of the RNASEL or MIR146A variants compared with those who were wild type at these loci. However, there was a sex-specific association between BCC and MIR146A in women (ORGC = 0.73, [95%CI = 0.52–1.03]; ORCC = 0.29, [95% CI = 0.14–0.61], p-trend<0.001), and a reduction in risk, albeit not statistically significant, associated with RNASEL and SCC in men (ORAG = 0.88, [95%CI = 0.65–1.19]; ORAA = 0.68, [95%CI = 0.43–1.08], p-trend = 0.10). Most striking was the strong interaction between the two genes. Among individuals carrying variant alleles of both rs2910164 and rs486907, we observed inverse relationships with SCC (ORSCC = 0.56, [95%CI = 0.38–0.81], p-interaction = 0.012) and BCC (ORBCC = 0.57, [95%CI = 0.40–0.80], p-interaction = 0.005). Our results suggest that genetic variation in immune and inflammatory regulators may influence susceptibility to NMSC, and novel SNP-SNP interaction for a microRNA and its target. These data suggest that RNASEL, an enzyme involved in RNA turnover, is controlled by miR-146a

  18. RNASEL and MIR146A SNP-SNP interaction as a susceptibility factor for non-melanoma skin cancer.

    PubMed

    Farzan, Shohreh F; Karagas, Margaret R; Christensen, Brock C; Li, Zhongze; Kuriger, Jacquelyn K; Nelson, Heather H

    2014-01-01

    Immunity and inflammatory pathways are important in the genesis of non-melanoma skin cancers (NMSC). Functional genetic variation in immune modulators has the potential to affect disease etiology. We investigated associations between common variants in two key regulators, MIR146A and RNASEL, and their relation to NMSCs. Using a large population-based case-control study of basal cell (BCC) and squamous cell carcinoma (SCC), we investigated the impact of MIR146A SNP rs2910164 on cancer risk, and interaction with a SNP in one of its putative targets (RNASEL, rs486907). To examine associations between genotype and BCC and SCC, occurrence odds ratios (OR) and 95% confidence intervals (95%CI) were calculated using unconditional logistic regression, accounting for multiple confounding factors. We did not observe an overall change in the odds ratios for SCC or BCC among individuals carrying either of the RNASEL or MIR146A variants compared with those who were wild type at these loci. However, there was a sex-specific association between BCC and MIR146A in women (ORGC = 0.73, [95%CI = 0.52-1.03]; ORCC = 0.29, [95% CI = 0.14-0.61], p-trend<0.001), and a reduction in risk, albeit not statistically significant, associated with RNASEL and SCC in men (ORAG = 0.88, [95%CI = 0.65-1.19]; ORAA = 0.68, [95%CI = 0.43-1.08], p-trend = 0.10). Most striking was the strong interaction between the two genes. Among individuals carrying variant alleles of both rs2910164 and rs486907, we observed inverse relationships with SCC (ORSCC = 0.56, [95%CI = 0.38-0.81], p-interaction = 0.012) and BCC (ORBCC = 0.57, [95%CI = 0.40-0.80], p-interaction = 0.005). Our results suggest that genetic variation in immune and inflammatory regulators may influence susceptibility to NMSC, and novel SNP-SNP interaction for a microRNA and its target. These data suggest that RNASEL, an enzyme involved in RNA turnover, is controlled by miR-146a and may be important in NMSC etiology. PMID:24699816

  19. eSNPO: An eQTL-based SNP Ontology and SNP functional enrichment analysis platform

    PubMed Central

    Li, Jin; Wang, Limei; Jiang, Tao; Wang, Jizhe; Li, Xue; Liu, Xiaoyan; Wang, Chunyu; Teng, Zhixia; Zhang, Ruijie; Lv, Hongchao; Guo, Maozu

    2016-01-01

    Genome-wide association studies (GWASs) have mined many common genetic variants associated with human complex traits like diseases. After that, the functional annotation and enrichment analysis of significant SNPs are important tasks. Classic methods are always based on physical positions of SNPs and genes. Expression quantitative trait loci (eQTLs) are genomic loci that contribute to variation in gene expression levels and have been proven efficient to connect SNPs and genes. In this work, we integrated the eQTL data and Gene Ontology (GO), constructed associations between SNPs and GO terms, then performed functional enrichment analysis. Finally, we constructed an eQTL-based SNP Ontology and SNP functional enrichment analysis platform. Taking Parkinson Disease (PD) as an example, the proposed platform and method are efficient. We believe eSNPO will be a useful resource for SNP functional annotation and enrichment analysis after we have got significant disease related SNPs. PMID:27470167

  20. eSNPO: An eQTL-based SNP Ontology and SNP functional enrichment analysis platform.

    PubMed

    Li, Jin; Wang, Limei; Jiang, Tao; Wang, Jizhe; Li, Xue; Liu, Xiaoyan; Wang, Chunyu; Teng, Zhixia; Zhang, Ruijie; Lv, Hongchao; Guo, Maozu

    2016-01-01

    Genome-wide association studies (GWASs) have mined many common genetic variants associated with human complex traits like diseases. After that, the functional annotation and enrichment analysis of significant SNPs are important tasks. Classic methods are always based on physical positions of SNPs and genes. Expression quantitative trait loci (eQTLs) are genomic loci that contribute to variation in gene expression levels and have been proven efficient to connect SNPs and genes. In this work, we integrated the eQTL data and Gene Ontology (GO), constructed associations between SNPs and GO terms, then performed functional enrichment analysis. Finally, we constructed an eQTL-based SNP Ontology and SNP functional enrichment analysis platform. Taking Parkinson Disease (PD) as an example, the proposed platform and method are efficient. We believe eSNPO will be a useful resource for SNP functional annotation and enrichment analysis after we have got significant disease related SNPs. PMID:27470167

  1. SNP sets and reading ability: testing confirmation of a 10-SNP set in a population sample.

    PubMed

    Luciano, Michelle; Montgomery, Grant W; Martin, Nicholas G; Wright, Margaret J; Bates, Timothy C

    2011-06-01

    A set of 10 SNPs associated with reading ability in 7-year-olds was reported based on initial pooled analyses of 100K SNP chip data, with follow-up testing stages using pooling and individual testing. Here we examine this association in an adolescent population sample of Australian twins and siblings (N = 1177) aged 12 to 25 years. One (rs1842129) of the 10 SNPs approached significance (P = .05) but no support was found for the remaining 9 SNPs or the SNP set itself. Results indicate that these SNPs are not associated with reading ability in an Australian population. The results are interpreted as supporting use of much larger SNP sets in common disorders where effects are small. PMID:21623652

  2. Linkage Analysis and QTL Mapping Using SNP Dosage Data in a Tetraploid Potato Mapping Population

    PubMed Central

    Hackett, Christine A.; McLean, Karen; Bryan, Glenn J.

    2013-01-01

    New sequencing and genotyping technologies have enabled researchers to generate high density SNP genotype data for mapping populations. In polyploid species, SNP data usually contain a new type of information, the allele dosage, which is not used by current methodologies for linkage analysis and QTL mapping. Here we extend existing methodology to use dosage data on SNPs in an autotetraploid mapping population. The SNP dosages are inferred from allele intensity ratios using normal mixture models. The steps of the linkage analysis (testing for distorted segregation, clustering SNPs, calculation of recombination fractions and LOD scores, ordering of SNPs and inference of parental phase) are extended to use the dosage information. For QTL analysis, the probability of each possible offspring genotype is inferred at a grid of locations along the chromosome from the ordered parental genotypes and phases and the offspring dosages. A normal mixture model is then used to relate trait values to the offspring genotypes and to identify the most likely locations for QTLs. These methods are applied to analyse a tetraploid potato mapping population of parents and 190 offspring, genotyped using an Infinium 8300 Potato SNP Array. Linkage maps for each of the 12 chromosomes are constructed. The allele intensity ratios are mapped as quantitative traits to check that their position and phase agrees with that of the corresponding SNP. This analysis confirms most SNP positions, and eliminates some problem SNPs to give high-density maps for each chromosome, with between 74 and 152 SNPs mapped and between 100 and 300 further SNPs allocated to approximate bins. Low numbers of double reduction products were detected. Overall 3839 of the 5378 polymorphic SNPs can be assigned putative genetic locations. This methodology can be applied to construct high-density linkage maps in any autotetraploid species, and could also be extended to higher autopolyploids. PMID:23704960
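
    The test for distorted segregation mentioned above needs the expected offspring dosage ratios for each parental dosage pair. A minimal sketch, assuming the textbook model of random chromosome pairing in an autotetraploid with no double reduction (the abstract reports only low numbers of double reduction products); this is not the authors' implementation, and the function names are hypothetical.

```python
from fractions import Fraction
from math import comb


def gamete_distribution(dosage):
    """P(gamete carries k copies) for a parent with `dosage` copies (0-4).

    A gamete receives 2 of the 4 homologous chromosomes at random, so the
    copy count follows a hypergeometric distribution.
    """
    dist = {}
    for k in range(3):
        ways = comb(dosage, k) * comb(4 - dosage, 2 - k)
        if ways:
            dist[k] = Fraction(ways, comb(4, 2))
    return dist


def offspring_distribution(dosage_p1, dosage_p2):
    """Expected offspring dosage distribution for two parental dosages."""
    dist = {}
    for k1, p1 in gamete_distribution(dosage_p1).items():
        for k2, p2 in gamete_distribution(dosage_p2).items():
            dist[k1 + k2] = dist.get(k1 + k2, Fraction(0)) + p1 * p2
    return dist
```

    Under these assumptions a duplex parent (dosage 2) produces gametes with 0, 1 and 2 copies in a 1:4:1 ratio, and a simplex by nulliplex cross (dosages 1 and 0) gives offspring dosages 0 and 1 in a 1:1 ratio.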

  3. Smarter clustering methods for SNP genotype calling

    PubMed Central

    Lin, Yan; Tseng, George C.; Cheong, Soo Yeon; Bean, Lora J. H.; Sherman, Stephanie L.; Feingold, Eleanor

    2008-01-01

    Motivation: Most genotyping technologies for single nucleotide polymorphism (SNP) markers use standard clustering methods to ‘call’ the SNP genotypes. These methods are not always optimal in distinguishing the genotype clusters of a SNP because they do not take advantage of specific features of the genotype calling problem. In particular, when family data are available, pedigree information is ignored. Furthermore, prior information about the distribution of the measurements for each cluster can be used to choose an appropriate model-based clustering method and can significantly improve the genotype calls. One special genotyping problem that has never been discussed in the literature is that of genotyping of trisomic individuals, such as individuals with Down syndrome. Calling trisomic genotypes is a more complicated problem, and the addition of external information becomes very important. Results: In this article, we discuss the impact of incorporating external information into clustering algorithms to call the genotypes for both disomic and trisomic data. We also propose two new methods to call genotypes using family data. One is a modification of the K-means method and uses the pedigree information by updating all members of a family together. The other is a likelihood-based method that combines the Gaussian or beta-mixture model with pedigree information. We compare the performance of these two methods and some other existing methods using simulation studies. We also compare the performance of these methods on a real dataset generated by the Illumina platform (www.illumina.com). Availability: The R code for the family-based genotype calling methods (SNPCaller) is available to be downloaded from the following website: http://watson.hgen.pitt.edu/register. Contact: liny@upmc.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:18826959

  4. KinSNP software for homozygosity mapping of disease genes using SNP microarrays.

    PubMed

    Amir, El-Ad David; Bartal, Ofer; Morad, Efrat; Nagar, Tal; Sheynin, Jony; Parvari, Ruti; Chalifa-Caspi, Vered

    2010-08-01

    Consanguineous families affected with a recessive genetic disease caused by homozygotisation of a mutation offer a unique advantage for positional cloning of rare diseases. Homozygosity mapping of patient genotypes is a powerful technique for the identification of the genomic locus harbouring the causative mutation. This strategy relies on the observation that in these patients a large region spanning the disease locus is also homozygous with high probability. The high marker density in single nucleotide polymorphism (SNP) arrays is extremely advantageous for homozygosity mapping. We present KinSNP, a user-friendly software tool for homozygosity mapping using SNP arrays. The software searches for stretches of SNPs which are homozygous for the same allele in all ascertained affected individuals. User-specified parameters control the number of allowed genotyping 'errors' within homozygous blocks. Candidate disease regions are then reported in a detailed, coloured Excel file, along with genotypes of family members and healthy controls. An interactive genome browser has been included which shows homozygous blocks, individual genotypes, genes and further annotations along the chromosomes, with zooming and scrolling capabilities. The software has been used to identify the location of a mutated gene causing insensitivity to pain in a large Bedouin family. KinSNP is freely available from. PMID:20846928
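
    The search described in this abstract (stretches of SNPs homozygous for the same allele in every affected individual, with a tolerated number of genotyping errors) can be sketched as a single scan over the markers. This is a minimal greedy illustration and not KinSNP's actual algorithm; the genotype encoding ("AA", "AB", "BB"), the function name and the parameters `min_snps` and `max_errors` are assumptions.

```python
def shared_homozygous_blocks(genotypes, min_snps=3, max_errors=0):
    """Return (start, end) SNP index pairs of blocks shared by all individuals.

    genotypes: one list of calls per affected individual, all of equal length.
    """
    n_snps = len(genotypes[0])
    # A SNP is consistent if everyone carries the same homozygous call.
    ok = []
    for j in range(n_snps):
        calls = {ind[j] for ind in genotypes}
        ok.append(len(calls) == 1 and calls <= {"AA", "BB"})

    blocks = []
    start = last_ok = None
    errors = 0
    for j, flag in enumerate(ok):
        if flag:
            if start is None:
                start, errors = j, 0
            last_ok = j
        elif start is not None:
            errors += 1
            if errors > max_errors:
                # Close the block at the last consistent SNP.
                if last_ok - start + 1 >= min_snps:
                    blocks.append((start, last_ok))
                start = None
    if start is not None and last_ok - start + 1 >= min_snps:
        blocks.append((start, last_ok))
    return blocks


# Two hypothetical affected relatives typed at seven SNPs.
ind1 = ["AB", "AA", "AA", "BB", "AA", "AB", "BB"]
ind2 = ["AA", "AA", "AA", "BB", "AA", "AA", "BB"]
strict = shared_homozygous_blocks([ind1, ind2], max_errors=0)
lenient = shared_homozygous_blocks([ind1, ind2], max_errors=1)
```

    Raising `max_errors` lets a block extend across an isolated inconsistent call, which is the trade-off the user-specified error parameter controls.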

  5. SNP marker detection and genotyping in tilapia.

    PubMed

    Van Bers, N E M; Crooijmans, R P M A; Groenen, M A M; Dibbits, B W; Komen, J

    2012-09-01

    We have generated a unique resource consisting of nearly 175 000 short contig sequences and 3569 SNP markers from the widely cultured GIFT (Genetically Improved Farmed Tilapia) strain of Nile tilapia (Oreochromis niloticus). In total, 384 SNPs were selected to monitor the wider applicability of the SNPs by genotyping tilapia individuals from different strains and different geographical locations. In all strains and species tested (O. niloticus, O. aureus and O. mossambicus), the genotyping assay worked for a similar number of SNPs (288-305 SNPs). The actual number of polymorphic SNPs was, as expected, highest for individuals from the GIFT population (255 SNPs). In the individuals from an Egyptian strain and in individuals caught in the wild in the basin of the river Volta, 197 and 163 SNPs were polymorphic, respectively. A pairwise calculation of Nei's genetic distance allowed the discrimination of the individual strains and species based on the genotypes determined with the SNP set. We expect that this set will be widely applicable for use in tilapia aquaculture, e.g. for pedigree reconstruction. In addition, this set is currently used for assaying the genetic diversity of native Nile tilapia in areas where tilapia is, or will be, introduced in aquaculture projects. This allows the tracing of escapees from aquaculture and the monitoring of effects of introgression and hybridization. PMID:22524158
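
    The abstract does not say which of Nei's distance measures was used. Assuming the standard genetic distance, D = -ln(Jxy / sqrt(Jx Jy)), the pairwise calculation can be sketched as follows; the function name, the input layout and the allele frequencies are hypothetical.

```python
import math


def nei_standard_distance(freqs_x, freqs_y):
    """Nei's standard genetic distance between two populations.

    freqs_x, freqs_y: one list of allele frequencies per locus,
    with loci and alleles in the same order in both populations.
    """
    jx = sum(sum(p * p for p in locus) for locus in freqs_x)
    jy = sum(sum(q * q for q in locus) for locus in freqs_y)
    jxy = sum(
        sum(p * q for p, q in zip(lx, ly)) for lx, ly in zip(freqs_x, freqs_y)
    )
    identity = jxy / math.sqrt(jx * jy)  # Nei's normalised genetic identity
    return -math.log(identity)


# Hypothetical biallelic frequencies at a single SNP.
d_same = nei_standard_distance([[0.7, 0.3]], [[0.7, 0.3]])
d_diff = nei_standard_distance([[1.0, 0.0]], [[0.5, 0.5]])
```

    Identical frequency profiles give a distance of zero, and the distance grows as the profiles diverge.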

  6. COL18A1 is highly expressed during human adipocyte differentiation and the SNP c.1136C > T in its "frizzled" motif is associated with obesity in diabetes type 2 patients.

    PubMed

    Errera, Flavia I V; Canani, Luís H; Yeh, Erika; Kague, Erika; Armelin-Corrêa, Lucia M; Suzuki, Oscar T; Tschiedel, Balduíno; Silva, Maria Elizabeth R; Sertié, Andréa L; Passos-Bueno, Maria Rita

    2008-03-01

    Collagen XVIII can generate two fragments, NC11-728 containing a frizzled motif which possibly acts in Wnt signaling and Endostatin, which is cleaved from the NC1 and is a potent inhibitor of angiogenesis. Collagen XVIII and Wnt signaling have recently been associated with adipogenic differentiation and obesity in some animal models, but not in humans. In the present report, we have shown that COL18A1 expression increases during human adipogenic differentiation. We also tested if polymorphisms in the Frizzled (c.1136C>T; Thr379Met) and Endostatin (c.4349G>A; Asp1437Asn) regions contribute towards susceptibility to obesity in patients with type 2 diabetes (113 obese, BMI ≥ 30; 232 non-obese, BMI < 30) of European ancestry. No evidence of association was observed between the allele c.4349G>A and obesity, but we observed a significantly higher frequency of homozygotes c.1136TT in obese (19.5%) than in non-obese individuals (10.9%) [P = 0.02; OR = 2.0 (95%CI: 1.07-3.73)], suggesting that the allele c.1136T is associated with obesity in a recessive model. This genotype, after controlling for cholesterol, LDL cholesterol, and triglycerides, was independently associated with obesity (P = 0.048), and increased the chance of obesity 2.8-fold. Therefore, our data suggest the involvement of collagen XVIII in human adipogenesis and susceptibility to obesity. PMID:18345385
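
    The unadjusted odds ratio quoted in this abstract can be checked approximately with the usual 2x2 calculation and a Wald interval on the log scale. The genotype counts below are not given in the abstract; they are back-calculated from the reported group sizes and percentages (an assumption), so the result should only be expected to agree with the reported figures to within rounding.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval for a 2x2 table.

    a, b: cases with / without the genotype
    c, d: controls with / without the genotype
    """
    or_ = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # SE of log(OR)
    lo = math.exp(math.log(or_) - z * se)
    hi = math.exp(math.log(or_) + z * se)
    return or_, lo, hi


# Assumed counts: 22 of 113 obese and 25 of 232 non-obese carry c.1136TT.
or_, lo, hi = odds_ratio_ci(22, 113 - 22, 25, 232 - 25)
print(f"OR = {or_:.2f}, 95% CI {lo:.2f}-{hi:.2f}")
```

    The 2.8-fold figure in the abstract comes from a model adjusted for lipid measures and cannot be reproduced from these counts alone.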

  7. Single Nucleotide Polymorphism (SNP) in the Adiponectin Gene and Cardiovascular Disease.

    PubMed

    Chirumbolo, Salvatore

    2016-07-01

    Dear Editor, The recent article by Mohammadzadeh et al.[1] in the latest issue of this Journal showed that the T allele of the +276G/T SNP of the ADIPOQ gene is associated with an increased risk of coronary artery disease (CAD) in subjects with type 2 diabetes. Adipocytes were described in myocardial tissue of CAD patients and their role recently discussed[2,3]. Susceptibility to CAD by polymorphism in the Q gene of adiponectin has been reported for 3'-UTR, which harbours some genetic loci associated with metabolic risks and atherosclerosis[4]. Actually, previous studies have shown that the haplotype SNP +276G>T was associated with a decreased risk of CAD, after adjustment for potential confounding factors, so some controversy remains[5]. This evidence should be associated with the role exerted by adipocytes and adiponectin in heart physiology. In particular, in hypertensive disorder complicating pregnancy (HDCP), by investigating the population frequency of alleles, genotypes, and haplotypes of two single nucleotide polymorphisms (SNPs), namely +45T>G (rs2241766) and +276G>T (rs1501299), some authors found that the SNP +276 TT genotype was significantly associated with protection against HDCP, when compared to the pooled G genotypes[6]. Moreover, the same +276G/T SNP haplotype was strongly associated with biliary atresia, an intractable neonatal inflammatory and obliterative cholangiopathy, leading to progressive fibrosis and cirrhosis[7]. CAD is closely related to adiponectin biology. The same isoforms of adiponectin seem not to be associated with CAD severity but with glucose metabolism and its impairment[8]. In the paper by Mohammadzadeh et al.[1], the T allele in the +276G/T SNP haplotype is highly associated with CAD in subjects with type 2 diabetes, but this linkage should be reappraised if it is related much more to diabetes than to CAD. Association of T allele in the indicated SNP with CAD may be an indirect consequence of type 2 diabetes, as reported

  8. Pigment phenotype and biogeographical ancestry from ancient skeletal remains: inferences from multiplexed autosomal SNP analysis.

    PubMed

    Bouakaze, Caroline; Keyser, Christine; Crubézy, Eric; Montagnon, Daniel; Ludes, Bertrand

    2009-07-01

    In the present study, a multiplexed genotyping assay for ten single nucleotide polymorphisms (SNPs) located within six pigmentation candidate genes was developed on modern biological samples and applied to DNA retrieved from 25 archeological human remains from southern central Siberia dating from the Bronze and Iron Ages. SNP genotyping was successful for the majority of ancient samples and revealed that most probably had typical European pigment features, i.e., blue or green eye color, light hair color and skin type, and were likely of European individual ancestry. To our knowledge, this study reports for the first time the multiplexed typing of autosomal SNPs on aged and degraded DNA. By providing valuable information on pigment traits of an individual and allowing individual biogeographical ancestry estimation, autosomal SNP typing can improve ancient DNA studies and aid human identification in some forensic casework situations when used to complement conventional molecular markers. PMID:19415315

  9. Longevity and plasticity of CFTR provide an argument for noncanonical SNP organization in hominid DNA.

    PubMed

    Hill, Aubrey E; Plyler, Zackery E; Tiwari, Hemant; Patki, Amit; Tully, Joel P; McAtee, Christopher W; Moseley, Leah A; Sorscher, Eric J

    2014-01-01

    Like many other ancient genes, the cystic fibrosis transmembrane conductance regulator (CFTR) has survived for hundreds of millions of years. In this report, we consider whether such prodigious longevity of an individual gene--as opposed to an entire genome or species--should be considered surprising in the face of eons of relentless DNA replication errors, mutagenesis, and other causes of sequence polymorphism. The conventions that modern human SNP patterns result either from purifying selection or random (neutral) drift were not well supported, since extant models account rather poorly for the known plasticity and function (or the established SNP distributions) found in a multitude of genes such as CFTR. Instead, our analysis can be taken as a polemic indicating that SNPs in CFTR and many other mammalian genes may have been generated--and continue to accrue--in a fundamentally more organized manner than would otherwise have been expected. The resulting viewpoint contradicts earlier claims of 'directional' or 'intelligent design-type' SNP formation, and has important implications regarding the pace of DNA adaptation, the genesis of conserved non-coding DNA, and the extent to which eukaryotic SNP formation should be viewed as adaptive. PMID:25350658

  10. Changes in variance explained by top SNP windows over generations for three traits in broiler chicken

    PubMed Central

    Fragomeni, Breno de Oliveira; Misztal, Ignacy; Lourenco, Daniela Lino; Aguilar, Ignacio; Okimoto, Ronald; Muir, William M.

    2014-01-01

    The purpose of this study was to determine if the set of genomic regions inferred as accounting for the majority of genetic variation in quantitative traits remain stable over multiple generations of selection. The data set contained phenotypes for five generations of broiler chicken for body weight, breast meat, and leg score. The population consisted of 294,632 animals over five generations and also included genotypes of 41,036 single nucleotide polymorphism (SNP) for 4,866 animals, after quality control. The SNP effects were calculated by a GWAS type analysis using single step genomic BLUP approach for generations 1–3, 2–4, 3–5, and 1–5. Variances were calculated for windows of 20 SNP. The top ten windows for each trait that explained the largest fraction of the genetic variance across generations were examined. Across generations, the top 10 windows explained more than 0.5% but less than 1% of the total variance. Also, the pattern of the windows was not consistent across generations. The windows that explained the greatest variance changed greatly among the combinations of generations, with a few exceptions. In many cases, a window identified as top for one combination, explained less than 0.1% for the other combinations. We conclude that identification of top SNP windows for a population may have little predictive power for genetic selection in the following generations for the traits here evaluated. PMID:25324857

  11. Large-Scale SNP Marker Development and Genotyping in Oat

    Technology Transfer Automated Retrieval System (TEKTRAN)

    In this study, our goals are to develop genome-wide SNP markers using next generation sequencing technologies and to apply a highly parallel SNP genotyping system developed by Illumina for genetics and breeding applications in oat. The large amount of DNA sequence sources generated from cDNAs and Di...

  12. Accelerating genetic improvement with SNP chips and DNA sequencing

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The development of high-density single nucleotide polymorphism (SNP) assays is expected to have a profound impact on genetic progress in the U.S. dairy industry. In the 16 months since its initial availability, the Illumina BovineSNP50 BeadChip has been used to genotype nearly 20,000 Holsteins. Thes...

  13. Atomic Force Microscopy for DNA SNP Identification

    NASA Astrophysics Data System (ADS)

    Valbusa, Ugo; Ierardi, Vincenzo

    The knowledge of the effects of single-nucleotide polymorphisms (SNPs) in the human genome greatly contributes to better comprehension of the relation between genetic factors and diseases. Sequence analysis of genomic DNA in different individuals reveals positions where variations that involve individual base substitutions can occur. Single-nucleotide polymorphisms are highly abundant and can have different consequences at phenotypic level. Several attempts were made to apply atomic force microscopy (AFM) to detect and map SNP sites in DNA strands. The most promising approach is the study of DNA mutations producing heteroduplex DNA strands and identifying the mismatches by means of a protein that labels the mismatches. MutS is a protein that is part of a well-known complex of mismatch repair, which initiates the process of repairing when the MutS binds to the mismatched DNA filament. The position of MutS on the DNA filament can be easily recorded by means of AFM imaging.

  14. Identification of Mendelian inconsistencies between SNP and pedigree information of sibs

    PubMed Central

    2011-01-01

    Background: Using SNP genotypes to apply genomic selection in breeding programs is becoming common practice. Tools to edit and check the quality of genotype data are required. Checking for Mendelian inconsistencies makes it possible to identify animals for which pedigree information and genotype information are not in agreement. Methods: Straightforward tests to detect Mendelian inconsistencies exist that count the number of opposing homozygous marker (e.g. SNP) genotypes between parent and offspring (PAR-OFF). Here, we develop two tests to identify Mendelian inconsistencies between sibs. The first test counts SNP with opposing homozygous genotypes between sib pairs (SIBCOUNT). The second test compares pedigree and SNP-based relationships (SIBREL). All tests iteratively remove animals based on decreasing numbers of inconsistent parents and offspring or sibs. The PAR-OFF test, followed by either SIB test, was applied to a dataset comprising 2,078 genotyped cows and 211 genotyped sires. Theoretical expectations for distributions of test statistics of all three tests were calculated and compared to empirically derived values. Type I and II error rates were calculated after applying the tests to the edited data, while Mendelian inconsistencies were introduced by permuting pedigree against genotype data for various proportions of animals. Results: Both SIB tests identified animal pairs for which pedigree and genomic relationships could be considered as inconsistent by visual inspection of a scatter plot of pairwise pedigree and SNP-based relationships. After removal of 235 animals with the PAR-OFF test, SIBCOUNT (SIBREL) identified 18 (22) additional inconsistent animals. Seventeen animals were identified by both methods. The numbers of incorrectly deleted animals (Type I error) were equally low for both methods, while the numbers of incorrectly non-deleted animals (Type II error) were considerably higher for SIBREL compared to SIBCOUNT. Conclusions: Tests to remove
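
    The counting step shared by the PAR-OFF and SIBCOUNT tests in this abstract, the number of SNPs with opposing homozygous genotypes in a pair of animals, can be sketched as below. The 0/1/2 allele-count encoding, the use of `None` for missing calls and the function name are assumptions; the thresholds and the iterative removal procedure of the paper are not reproduced.

```python
def opposing_homozygotes(geno_a, geno_b):
    """Count SNPs where one animal is homozygous for one allele (0)
    and the other is homozygous for the alternative allele (2).

    Genotypes are coded as copies of one allele (0, 1, 2); None = missing.
    """
    return sum(
        1
        for a, b in zip(geno_a, geno_b)
        if a is not None and b is not None and abs(a - b) == 2
    )


# Hypothetical parent and offspring genotypes at six SNPs.
parent = [0, 2, 1, 0, 2, None]
offspring = [2, 2, 1, 1, 0, 0]
n_opposing = opposing_homozygotes(parent, offspring)
```

    For a true parent-offspring pair the count should be near zero apart from genotyping errors, whereas full sibs can legitimately show some opposing homozygotes, which is why the sib tests need their own expected distributions.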

  15. De novo sequencing of sunflower genome for SNP discovery using RAD (Restriction site Associated DNA) approach

    PubMed Central

    2013-01-01

    Background: Application of Single Nucleotide Polymorphism (SNP) marker technology as a tool in sunflower breeding programs offers enormous potential to improve sunflower genetics, and facilitate faster release of sunflower hybrids to the market place. Through a National Sunflower Association (NSA) funded initiative, we report on the process of SNP discovery through reductive genome sequencing and local assembly of six diverse sunflower inbred lines that represent oil as well as confection types. Results: A combination of Restriction site Associated DNA Sequencing (RAD-Seq) protocols and Illumina paired-end sequencing chemistry generated high quality 89.4 M paired end reads from the six lines which represent 5.3 GB of the sequencing data. Raw reads from the sunflower line, RHA 464 were assembled de novo to serve as a framework reference genome. About 15.2 Mb of sunflower genome distributed over 42,267 contigs were obtained upon assembly of RHA 464 sequencing data, the contig lengths ranged from 200 to 950 bp with an N50 length of 393 bp. SNP calling was performed by aligning sequencing data from the six sunflower lines to the assembled reference RHA 464. On average, 1 SNP was located every 143 bp of the sunflower genome sequence. Based on several filtering criteria, a final set of 16,467 putative sequence variants with characteristics favorable for Illumina Infinium Genotyping Technology (IGT) were mined from the sequence data generated across six diverse sunflower lines. Conclusion: Here we report the molecular and computational methodology involved in SNP development for a complex genome like sunflower lacking reference assembly, offering an attractive tool for molecular breeding purposes in sunflower. PMID:23947483
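
    The N50 statistic quoted for the assembly is the contig length at which contigs of that length or longer contain at least half of the assembled bases. A minimal sketch of the calculation follows; the function name and the contig lengths are hypothetical and are not the sunflower data.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths."""
    total = sum(lengths)
    running = 0
    # Walk from the longest contig down until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("n50() needs at least one contig")


# Hypothetical contig lengths in bp.
example_n50 = n50([200, 300, 400, 500, 600])
```

    An N50 of 393 bp over contigs of 200 to 950 bp indicates a fragmented, short-contig assembly, which is expected for RAD-Seq data that sample only the regions flanking restriction sites.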

  16. Deriving Gene Networks from SNP Associated with Triacylglycerol and Phospholipid Fatty Acid Fractions from Ribeyes of Angus Cattle

    PubMed Central

    Buchanan, Justin W.; Reecy, James M.; Garrick, Dorian J.; Duan, Qing; Beitz, Don C.; Koltes, James E.; Saatchi, Mahdi; Koesterke, Lars; Mateescu, Raluca G.

    2016-01-01

    The fatty acid profile of beef is a complex trait that can benefit from gene-interaction network analysis to understand relationships among loci that contribute to phenotypic variation. Phenotypic measures of fatty acid profile from triacylglycerol and phospholipid fractions of longissimus muscle, pedigree information, and Illumina 54 k bovine SNP genotypes were utilized to derive an annotated gene network associated with fatty acid composition in 1,833 Angus beef cattle. The Bayes-B statistical model was utilized to perform a genome wide association study to estimate associations between 54 k SNP genotypes and 39 individual fatty acid phenotypes within each fraction. Posterior means of the effects were estimated for each of the 54 k SNP and for the collective effects of all the SNP in every 1-Mb genomic window in terms of the proportion of genetic variance explained by the window. Windows that explained the largest proportions of genetic variance for individual lipids were found in the triacylglycerol fraction. There was almost no overlap in the genomic regions explaining variance between the triacylglycerol and phospholipid fractions. Partial correlations were used to identify correlated regions of the genome for the set of largest 1 Mb windows that explained up to 35% genetic variation in either fatty acid fraction. SNP were allocated to windows based on the bovine UMD3.1 assembly. Gene network clusters were generated utilizing a partial correlation and information theory algorithm. Results were used in conjunction with network scoring and visualization software to analyze correlated SNP across 39 fatty acid phenotypes to identify SNP of significance. Significant pathways implicated in fatty acid metabolism through GO term enrichment analysis included homeostasis of number of cells, homeostatic process, coenzyme/cofactor activity, and immunoglobulin. These results suggest different metabolic pathways regulate the development of different types of lipids found in

  17. SNP Markers as Additional Information to Resolve Complex Kinship Cases

    PubMed Central

    Pontes, M. Lurdes; Fondevila, Manuel; Laréu, Maria Victoria; Medeiros, Rui

    2015-01-01

    Background: DNA profiling with sets of highly polymorphic autosomal short tandem repeat (STR) markers has been applied in various aspects of human identification in forensic casework for nearly 20 years. However, in some cases of complex kinship investigation, the information provided by the conventionally used STR markers is not enough, often resulting in low likelihood ratio (LR) calculations. In these cases, it becomes necessary to increase the number of loci under analysis to reach adequate LRs. Recently, it has been proposed that single nucleotide polymorphisms (SNPs) could be used as a supportive tool to STR typing, eventually even replacing the methods/markers now employed. Methods: In this work, we describe the results obtained in 7 revised complex paternity cases when applying a battery of STRs, as well as 52 human identification SNPs (SNPforID 52plex identification panel) using a SNaPshot methodology followed by capillary electrophoresis. Results: Our results show that the analysis of SNPs, as a complement to STR typing in forensic casework applications, would increase total PI values by at least a factor of 4, with a corresponding increase in Essen-Möller's W value. Conclusions: We demonstrated that SNP genotyping could be a key complement to STR information in challenging casework of disputed paternity, such as close relative individualization or complex pedigrees subject to endogamous relations. PMID:26733770
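
    The relationship between the combined paternity index (PI) and Essen-Möller's W can be made concrete with a short sketch. It assumes independent loci, so that per-locus PI values multiply, and a prior probability of paternity of 0.5, under which W = PI / (PI + 1). The per-locus values and function names below are made up for illustration and are not taken from the study.

```python
import math


def combined_pi(locus_pis):
    """Combined paternity index as the product of per-locus values
    (assumes the loci are independent)."""
    return math.prod(locus_pis)


def essen_moller_w(pi, prior=0.5):
    """Posterior probability of paternity for a given PI and prior."""
    odds = pi * prior / (1.0 - prior)
    return odds / (1.0 + odds)


# Hypothetical case: STRs alone, then STRs with SNPs adding a factor of 4.
pi_str = combined_pi([2.0, 5.0, 10.0])
w_str = essen_moller_w(pi_str)
w_str_snp = essen_moller_w(4.0 * pi_str)
```

    Because W is bounded by 1, a four-fold gain in PI produces a smaller absolute gain in W, and the gain matters most when the STR-only PI is low.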

  18. SNP-SNP interaction analysis of NF-κB signaling pathway on breast cancer survival.

    PubMed

    Jamshidi, Maral; Fagerholm, Rainer; Khan, Sofia; Aittomäki, Kristiina; Czene, Kamila; Darabi, Hatef; Li, Jingmei; Andrulis, Irene L; Chang-Claude, Jenny; Devilee, Peter; Fasching, Peter A; Michailidou, Kyriaki; Bolla, Manjeet K; Dennis, Joe; Wang, Qin; Guo, Qi; Rhenius, Valerie; Cornelissen, Sten; Rudolph, Anja; Knight, Julia A; Loehberg, Christian R; Burwinkel, Barbara; Marme, Frederik; Hopper, John L; Southey, Melissa C; Bojesen, Stig E; Flyger, Henrik; Brenner, Hermann; Holleczek, Bernd; Margolin, Sara; Mannermaa, Arto; Kosma, Veli-Matti; Van Dyck, Laurien; Nevelsteen, Ines; Couch, Fergus J; Olson, Janet E; Giles, Graham G; McLean, Catriona; Haiman, Christopher A; Henderson, Brian E; Winqvist, Robert; Pylkäs, Katri; Tollenaar, Rob A E M; García-Closas, Montserrat; Figueroa, Jonine; Hooning, Maartje J; Martens, John W M; Cox, Angela; Cross, Simon S; Simard, Jacques; Dunning, Alison M; Easton, Douglas F; Pharoah, Paul D P; Hall, Per; Blomqvist, Carl; Schmidt, Marjanka K; Nevanlinna, Heli

    2015-11-10

    In breast cancer, constitutive activation of NF-κB has been reported; however, the impact of genetic variation of the pathway on patient prognosis has been little studied. Furthermore, a combination of genetic variants, rather than single polymorphisms, may affect disease prognosis. Here, in an extensive dataset (n = 30,431) from the Breast Cancer Association Consortium, we investigated the association of 917 SNPs in 75 genes in the NF-κB pathway with breast cancer prognosis. We explored SNP-SNP interactions on survival using the likelihood-ratio test comparing multivariate Cox regression models of SNP pairs without and with an interaction term. We found two interacting pairs associating with prognosis: patients simultaneously homozygous for the rare alleles of rs5996080 and rs7973914 had worse survival (HRinteraction 6.98, 95% CI = 3.3-14.4, P = 1.42E-07), and patients carrying at least one rare allele for rs17243893 and rs57890595 had better survival (HRinteraction 0.51, 95% CI = 0.3-0.6, P = 2.19E-05). Based on in silico functional analyses and literature, we speculate that the rs5996080 and rs7973914 loci may affect the BAFFR and TNFR1/TNFR3 receptors and breast cancer survival, possibly by disturbing both the canonical and non-canonical NF-κB pathways or their dynamics, whereas the rs17243893-rs57890595 interaction on survival may be mediated through TRAF2-TRAIL-R4 interplay. These results warrant further validation and functional analyses. PMID:26317411
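
    The likelihood-ratio test used to compare the two nested Cox models can be sketched from their maximised log partial likelihoods. The sketch assumes that the interaction adds exactly one parameter, so the test statistic is referred to a chi-square distribution with 1 degree of freedom; fitting the Cox models themselves is out of scope here, and the log-likelihood values are invented for illustration.

```python
import math


def lrt_pvalue_1df(loglik_reduced, loglik_full):
    """Likelihood-ratio statistic and p-value for one extra parameter.

    For a chi-square variable with 1 df, P(X > s) = erfc(sqrt(s / 2)).
    """
    stat = max(2.0 * (loglik_full - loglik_reduced), 0.0)
    return stat, math.erfc(math.sqrt(stat / 2.0))


# Hypothetical log partial likelihoods without / with the interaction term.
stat, p_value = lrt_pvalue_1df(-1500.00, -1498.0792705)
```

    With 917 SNPs there are several hundred thousand possible pairs, so any per-pair p-value from such a test would need a multiple-testing correction before being interpreted.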

  19. SNP-SNP interaction analysis of NF-κB signaling pathway on breast cancer survival

    PubMed Central

    Jamshidi, Maral; Fagerholm, Rainer; Khan, Sofia; Aittomäki, Kristiina; Czene, Kamila; Darabi, Hatef; Li, Jingmei; Andrulis, Irene L.; Chang-Claude, Jenny; Devilee, Peter; Fasching, Peter A.; Michailidou, Kyriaki; Bolla, Manjeet K.; Dennis, Joe; Wang, Qin; Guo, Qi; Rhenius, Valerie; Cornelissen, Sten; Rudolph, Anja; Knight, Julia A.; Loehberg, Christian R.; Burwinkel, Barbara; Marme, Frederik; Hopper, John L.; Southey, Melissa C.; Bojesen, Stig E.; Flyger, Henrik; Brenner, Hermann; Holleczek, Bernd; Margolin, Sara; Mannermaa, Arto; Kosma, Veli-Matti; Dyck, Laurien Van; Nevelsteen, Ines; Couch, Fergus J.; Olson, Janet E.; Giles, Graham G.; McLean, Catriona; Haiman, Christopher A.; Henderson, Brian E.; Winqvist, Robert; Pylkäs, Katri; Tollenaar, Rob A.E.M.; García-Closas, Montserrat; Figueroa, Jonine; Hooning, Maartje J.; Martens, John W.M.; Cox, Angela; Cross, Simon S.; Simard, Jacques; Dunning, Alison M.; Easton, Douglas F.; Pharoah, Paul D.P.; Hall, Per; Blomqvist, Carl; Schmidt, Marjanka K.; Nevanlinna, Heli

    2015-01-01

    In breast cancer, constitutive activation of NF-κB has been reported; however, the impact of genetic variation of the pathway on patient prognosis has been little studied. Furthermore, a combination of genetic variants, rather than single polymorphisms, may affect disease prognosis. Here, in an extensive dataset (n = 30,431) from the Breast Cancer Association Consortium, we investigated the association of 917 SNPs in 75 genes in the NF-κB pathway with breast cancer prognosis. We explored SNP-SNP interactions on survival using the likelihood-ratio test comparing multivariate Cox regression models of SNP pairs without and with an interaction term. We found two interacting pairs associating with prognosis: patients simultaneously homozygous for the rare alleles of rs5996080 and rs7973914 had worse survival (HRinteraction 6.98, 95% CI = 3.3-14.4, P = 1.42E-07), and patients carrying at least one rare allele for rs17243893 and rs57890595 had better survival (HRinteraction 0.51, 95% CI = 0.3-0.6, P = 2.19E-05). Based on in silico functional analyses and literature, we speculate that the rs5996080 and rs7973914 loci may affect the BAFFR and TNFR1/TNFR3 receptors and breast cancer survival, possibly by disturbing both the canonical and non-canonical NF-κB pathways or their dynamics, whereas the rs17243893-rs57890595 interaction on survival may be mediated through TRAF2-TRAIL-R4 interplay. These results warrant further validation and functional analyses. PMID:26317411

  20. Slider—maximum use of probability information for alignment of short sequence reads and SNP detection

    PubMed Central

    Malhis, Nawar; Butterfield, Yaron S. N.; Ester, Martin; Jones, Steven J. M.

    2009-01-01

    Motivation: A plethora of alignment tools have been created that are designed to best fit different types of alignment conditions. While some of these are made for aligning Illumina Sequence Analyzer reads, none fully utilizes its probability (prb) output. In this article, we introduce a new alignment approach (Slider) that reduces the alignment problem space by utilizing each read base's probabilities given in the prb files. Results: Compared with other aligners, Slider has higher alignment accuracy and efficiency. In addition, given that Slider matches bases with probabilities other than the most probable, it significantly reduces the percentage of base mismatches. The result is that its SNP predictions are more accurate than other SNP prediction approaches used today that start from the most probable sequence, including those using base quality. Contact: nmalhis@bcgsc.ca Supplementary information and availability: http://www.bcgsc.ca/platform/bioinfo/software/slider PMID:18974170

  1. Rapid Detection of Rare Deleterious Variants by Next Generation Sequencing with Optional Microarray SNP Genotype Data

    PubMed Central

    Watson, Christopher M.; Crinnion, Laura A.; Gurgel‐Gianetti, Juliana; Harrison, Sally M.; Daly, Catherine; Antanavicuite, Agne; Lascelles, Carolina; Markham, Alexander F.; Pena, Sergio D. J.; Bonthron, David T.

    2015-01-01

    ABSTRACT Autozygosity mapping is a powerful technique for the identification of rare, autosomal recessive, disease‐causing genes. The ease with which this category of disease gene can be identified has greatly increased through the availability of genome‐wide SNP genotyping microarrays and subsequently of exome sequencing. Although these methods have simplified the generation of experimental data, its analysis, particularly when disparate data types must be integrated, remains time consuming. Moreover, the huge volume of sequence variant data generated from next generation sequencing experiments opens up the possibility of using these data instead of microarray genotype data to identify disease loci. To allow these two types of data to be used in an integrated fashion, we have developed AgileVCFMapper, a program that performs both the mapping of disease loci by SNP genotyping and the analysis of potentially deleterious variants using exome sequence variant data, in a single step. This method does not require microarray SNP genotype data, although analysis with a combination of microarray and exome genotype data enables more precise delineation of disease loci, due to superior marker density and distribution. PMID:26037133

  2. The Perils of SNP Microarray Testing: Uncovering Unexpected Consanguinity

    PubMed Central

    Tarini, Beth A.; Konczal, Laura; Goldenberg, Aaron J.; Goldman, Edward B.; McCandless, Shawn E.

    2013-01-01

    Background: While single nucleotide polymorphism (SNP) chromosomal microarrays identify areas of small genetic deletions/duplications, they can also reveal regions of homozygosity indicative of consanguinity. As more non-geneticists order SNP microarrays, they must prepare for the potential ethical, legal and social issues that result from revelation of unanticipated consanguinity. Patient: An infant with multiple congenital anomalies underwent SNP microarray testing. Results: The results of the SNP microarray revealed several large regions of homozygosity that indicated identity by descent most consistent with a second or third degree relative mating (e.g., uncle/niece, half brother/sister, first cousins). The mother was not aware of the test's potential to reveal consanguinity. When informed of the test results, she reluctantly admitted to being raped by her half-brother around the time of conception. Conclusions: During the pre-testing consent process, providers should inform parents that SNP microarray testing could reveal consanguinity. Providers must also understand the psychological implications, as well as the legal and moral obligations, that accompany SNP microarray results that indicate consanguinity. PMID:23827427

  3. Detection of selective sweeps in cattle using genome-wide SNP data

    PubMed Central

    2013-01-01

    Background: The domestication and subsequent selection by humans to create breeds and biological types of cattle undoubtedly altered the patterning of variation within their genomes. Strong selection to fix advantageous large-effect mutations underlying domesticability, breed characteristics or productivity created selective sweeps in which variation was lost in the chromosomal region flanking the selected allele. Selective sweeps have now been identified in the genomes of many animal species including humans, dogs, horses, and chickens. Here, we attempt to identify and characterise regions of the bovine genome that have been subjected to selective sweeps. Results: Two datasets were used for the discovery and validation of selective sweeps via the fixation of alleles at a series of contiguous SNP loci. BovineSNP50 data were used to identify 28 putative sweep regions among 14 diverse cattle breeds. Affymetrix BOS 1 prescreening assay data for five breeds were used to identify 85 regions and validate 5 regions identified using the BovineSNP50 data. Many genes are located within these regions and the lack of sequence data for the analysed breeds precludes the nomination of selected genes or variants and limits the prediction of the selected phenotypes. However, phenotypes that we predict to have historically been under strong selection include horned-polled, coat colour, stature, ear morphology, and behaviour. Conclusions: The bias towards common SNPs in the design of the BovineSNP50 assay led to the identification of recent selective sweeps associated with breed formation and common to only a small number of breeds rather than ancient events associated with domestication which could potentially be common to all European taurines. The limited SNP density, or marker resolution, of the BovineSNP50 assay significantly impacted the rate of false discovery of selective sweeps; however, we found sweeps in common between breeds which were confirmed using an ultra

  4. An extended Tajima’s D neutrality test incorporating SNP calling and imputation uncertainties

    PubMed Central

    Zhang, Qingrun; Tyler-Smith, Chris; Long, Quan

    2015-01-01

    To identify evolutionary events from the footprints left in the patterns of genetic variation in a population, people use many statistical frameworks, including neutrality tests. In datasets from current high throughput sequencing and genotyping platforms, it is common to have missing data and low-confidence SNP calls at many segregating sites. However, the traditional statistical framework for neutrality tests does not allow for these possibilities; therefore the usual way of treating missing data is to ignore segregating sites with missing/low confidence calls, regardless of the good SNP calls at these sites in other individuals. In this work, we propose a modified neutrality test, Extended Tajima’s D, which incorporates missing data and SNP-calling uncertainties. Because we do not specify any particular error-generating mechanism, this approach is robust and widely applicable. Simulations show that in most cases the power of the new test is better than the original Tajima’s D, given the same type I error. Applications to real data show that it detects fewer outliers associated with low quality data. PMID:26681995
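
    For orientation, the classical Tajima's D that this record extends can be sketched as follows. This is a minimal illustration of the original statistic for complete data only, not the Extended Tajima's D proposed by the authors. The function names and the 0/1 haplotype-string input format are assumptions made for the example.

```python
import math


def tajimas_d_from_summary(n, num_segregating, pi):
    """Classical Tajima's D from sample size n, number of segregating
    sites S, and mean pairwise difference pi (no missing data)."""
    if n < 4 or num_segregating < 1:
        raise ValueError("need n >= 4 sequences and at least one segregating site")
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    s = num_segregating
    # Difference between the two estimators of theta, scaled by its
    # estimated standard deviation.
    return (pi - s / a1) / math.sqrt(e1 * s + e2 * s * (s - 1))


def tajimas_d(haplotypes):
    """Tajima's D for equal-length strings of '0'/'1' alleles."""
    n = len(haplotypes)
    pairs = n * (n - 1) / 2.0
    num_segregating = 0
    pi = 0.0
    for site in zip(*haplotypes):
        ones = sum(1 for allele in site if allele == "1")
        if 0 < ones < n:
            num_segregating += 1
            # Pairs of sequences that differ at this site, averaged over all pairs.
            pi += ones * (n - ones) / pairs
    return tajimas_d_from_summary(n, num_segregating, pi)
```

    A sample in which every variant is a singleton, such as tajimas_d(["1000", "0100", "0010", "0001"]), gives a negative D. Sites with missing or low-confidence calls would have to be dropped before calling this sketch, which is the limitation the record addresses.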

  5. ASSIsT: an automatic SNP scoring tool for in- and outbreeding species

    PubMed Central

    Di Guardo, Mario; Micheletti, Diego; Bianco, Luca; Koehorst-van Putten, Herma J. J.; Longhi, Sara; Costa, Fabrizio; Aranzana, Maria J.; Velasco, Riccardo; Arús, Pere; Troggio, Michela; van de Weg, Eric W.

    2015-01-01

    ASSIsT (Automatic SNP ScorIng Tool) is a user-friendly customized pipeline for efficient calling and filtering of SNPs from Illumina Infinium arrays, specifically devised for custom genotyping arrays. Illumina has developed an integrated software for SNP data visualization and inspection called GenomeStudio® (GS). ASSIsT builds on GS-derived data and identifies those markers that follow a bi-allelic genetic model and show reliable genotype calls. Moreover, ASSIsT re-edits SNP calls with null alleles or additional SNPs in the probe annealing site. ASSIsT can be employed in the analysis of different population types such as full-sib families and mating schemes used in the plant kingdom (backcross, F1, F2), and unrelated individuals. The final result can be directly exported in the format required by the most common software for genetic mapping and marker–trait association analysis. ASSIsT is developed in Python and runs in Windows and Linux. Availability and implementation: The software, example data sets and tutorials are freely available at http://compbiotoolbox.fmach.it/assist/. Contact: eric.vandeweg@wur.nl PMID:26249809

  6. Evaluation of Y chromosomal SNP haplogrouping in the HID-Ion AmpliSeq™ Identity Panel.

    PubMed

    Ochiai, Eriko; Minaguchi, Kiyoshi; Nambiar, Phrabhakaran; Kakimoto, Yu; Satoh, Fumiko; Nakatome, Masato; Miyashita, Keiko; Osawa, Motoki

    2016-09-01

    The Y chromosomal haplogroup determined from single nucleotide polymorphism (SNP) combinations is a valuable genetic marker to study ancestral male lineage and ethnic distribution. Next-generation sequencing has been developed for widely diverse genetics fields. For this study, we demonstrate 34 Y-SNP typing employing the Ion PGM™ system to perform haplogrouping. DNA libraries were constructed using the HID-Ion AmpliSeq™ Identity Panel. Emulsion PCR was performed, then DNA sequences were analyzed on the Ion 314 and 316 Chip Kit v2. Some difficulties became apparent during the analytic processes. No-call was reported at rs2032599 and M479 in six samples, in which the least coverage was observed at M479. A minor misreading occurred at rs2032631 and M479. A real time PCR experiment using other pairs of oligonucleotide primers showed that these events might result from the flanking sequence. Finally, Y haplogroup was determined completely for 81 unrelated males including Japanese (n=59) and Malay (n=22) subjects. The allelic divergence differed between the two populations. In comparison with the conventional Sanger method, next-generation sequencing provides a comprehensive SNP analysis with convenient procedures, but further system improvement is necessary. PMID:27591541

  7. Genome-wide SNP analysis of the Systemic Capillary Leak Syndrome (Clarkson disease)

    PubMed Central

    Xie, Zhihui; Nagarajan, Vijayaraj; Sturdevant, Daniel E; Iwaki, Shoko; Chan, Eunice; Wisch, Laura; Young, Michael; Nelson, Celeste M; Porcella, Stephen F; Druey, Kirk M

    2013-01-01

    The Systemic Capillary Leak Syndrome (SCLS) is an extremely rare, orphan disease that resembles, and is frequently erroneously diagnosed as, systemic anaphylaxis. The disorder is characterized by repeated, transient, and seemingly unprovoked episodes of hypotensive shock and peripheral edema due to transient endothelial hyperpermeability. SCLS is often accompanied by a monoclonal gammopathy of unknown significance (MGUS). Using Affymetrix Single Nucleotide Polymorphism (SNP) microarrays, we performed the first genome-wide SNP analysis of SCLS in a cohort of 12 disease subjects and 18 controls. Exome capture sequencing was performed on genomic DNA from nine of these patients as validation for the SNP-chip discoveries and de novo data generation. We identified candidate susceptibility loci for SCLS, which included a region flanking CAV3 (3p25.3) as well as SNP clusters in PON1 (7q21.3), PSORS1C1 (6p21.3), and CHCHD3 (7q33). Among the most highly ranked discoveries were gene-associated SNPs in the uncharacterized LOC100130480 gene (rs6417039, rs2004296). Top case-associated SNPs were observed in BTRC (rs12355803, rs4436485), ARHGEF18 (rs11668246), CDH13 (rs4782779), and EDG2 (rs12552348), which encode proteins with known or suspected roles in B cell function and/or vascular integrity. 61 SNPs that were significantly associated with SCLS by microarray analysis were also detected and validated by exome deep sequencing. Functional annotation of highly ranked SNPs revealed enrichment of cell projections, cell junctions and adhesion, and molecules containing pleckstrin homology, Ras/Rho regulatory, and immunoglobulin Ig-like C2/fibronectin type III domains, all of which involve mechanistic functions that correlate with the SCLS phenotype. These results highlight SNPs with potential relevance to SCLS. PMID:24808988

  8. A SNP resource for Douglas-fir: de novo transcriptome assembly and SNP detection and validation

    PubMed Central

    2013-01-01

    Background Douglas-fir (Pseudotsuga menziesii), one of the most economically and ecologically important tree species in the world, also has one of the largest tree breeding programs. Although the coastal and interior varieties of Douglas-fir (vars. menziesii and glauca) are native to North America, the coastal variety is also widely planted for timber production in Europe, New Zealand, Australia, and Chile. Our main goal was to develop a SNP resource large enough to facilitate genomic selection in Douglas-fir breeding programs. To accomplish this, we developed a 454-based reference transcriptome for coastal Douglas-fir, annotated and evaluated the quality of the reference, identified putative SNPs, and then validated a sample of those SNPs using the Illumina Infinium genotyping platform. Results We assembled a reference transcriptome consisting of 25,002 isogroups (unique gene models) and 102,623 singletons from 2.76 million 454 and Sanger cDNA sequences from coastal Douglas-fir. We identified 278,979 unique SNPs by mapping the 454 and Sanger sequences to the reference, and by mapping four datasets of Illumina cDNA sequences from multiple seed sources, genotypes, and tissues. The Illumina datasets represented coastal Douglas-fir (64.00 and 13.41 million reads), interior Douglas-fir (80.45 million reads), and a Yakima population similar to interior Douglas-fir (8.99 million reads). We assayed 8067 SNPs on 260 trees using an Illumina Infinium SNP genotyping array. Of these SNPs, 5847 (72.5%) were called successfully and were polymorphic. Conclusions Based on our validation efficiency, our SNP database may contain as many as ~200,000 true SNPs, and as many as ~69,000 SNPs that could be genotyped at ~20,000 gene loci using an Infinium II array, more SNPs than are needed to use genomic selection in tree breeding programs. Ultimately, these genomic resources will enhance Douglas-fir breeding and allow us to better understand landscape-scale patterns of genetic variation
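
    The validation figures in this record can be checked with simple arithmetic. The sketch below assumes the estimate of true SNPs is a plain proportional extrapolation of the array validation rate to the whole candidate set. The abstract does not spell out the authors' exact calculation, so the variable names and the extrapolation rule are illustrative assumptions.

```python
# Counts quoted in the abstract.
candidate_snps = 278_979   # unique putative SNPs in the database
assayed_snps = 8_067       # SNPs placed on the Infinium array
validated_snps = 5_847     # called successfully and polymorphic

# Fraction of assayed SNPs that validated (reported as 72.5%).
validation_rate = validated_snps / assayed_snps

# Assumed proportional extrapolation to all candidate SNPs.
estimated_true_snps = candidate_snps * validation_rate

print(f"validation rate: {validation_rate:.1%}")
print(f"estimated true SNPs: {estimated_true_snps:,.0f}")
```

    Under that assumption the estimate lands slightly above 200,000, in line with the "~200,000 true SNPs" stated in the Conclusions.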

  9. SNP markers identify widely distributed clonal lineages of Phytophthora colocasiae in Vietnam, Hawaii and Hainan Island, China.

    PubMed

    Shrestha, Sandesh; Hu, Jian; Fryxell, Rebecca Trout; Mudge, Joann; Lamour, Kurt

    2014-01-01

    Taro (Colocasia esculenta) is an important food crop, and taro leaf blight caused by Phytophthora colocasiae can significantly affect production. Our objectives were to develop single nucleotide polymorphism (SNP) markers for P. colocasiae and characterize populations in Hawaii (HI), Vietnam (VN) and Hainan Island, China (HIC). In total, 379 isolates were analyzed for mating type and multilocus SNP profiles including 214 from HI, 97 from VN and 68 from HIC. A total of 1152 single nucleotide variant (SNV) sites were identified via restriction site-associated DNA (RAD) sequencing of two field isolates. Genotyping with 27 SNPs revealed 41 multilocus SNP genotypes grouped into seven clonal lineages containing 2-232 members. Three clonal lineages were shared among countries. In addition, five SNP markers had a low incidence of loss of heterozygosity (LOH) during asexual laboratory growth. For HI and VN, >95% of isolates were the A2 mating type. On HIC, isolates within single clonal lineages had A1, A2 and A0 (neuter) isolates. The implications for the wide dispersal of clonal lineages are discussed. PMID:24895424

  10. DoGSD: the dog and wolf genome SNP database.

    PubMed

    Bai, Bing; Zhao, Wen-Ming; Tang, Bi-Xia; Wang, Yan-Qing; Wang, Lu; Zhang, Zhang; Yang, He-Chuan; Liu, Yan-Hu; Zhu, Jun-Wei; Irwin, David M; Wang, Guo-Dong; Zhang, Ya-Ping

    2015-01-01

    The rapid advancement of next-generation sequencing technology has generated a deluge of genomic data from domesticated dogs and their wild ancestor, grey wolves, which have simultaneously broadened our understanding of domestication and diseases that are shared by humans and dogs. To address the scarcity of single nucleotide polymorphism (SNP) data provided by authorized databases and to make SNP data more easily/friendly usable and available, we propose DoGSD (http://dogsd.big.ac.cn), the first canidae-specific database which focuses on whole genome SNP data from domesticated dogs and grey wolves. The DoGSD is a web-based, open-access resource comprising ∼ 19 million high-quality whole-genome SNPs. In addition to the dbSNP data set (build 139), DoGSD incorporates a comprehensive collection of SNPs from two newly sequenced samples (1 wolf and 1 dog) and collected SNPs from three latest dog/wolf genetic studies (7 wolves and 68 dogs), which were taken together for analysis with the population genetic statistics, Fst. In addition, DoGSD integrates some closely related information including SNP annotation, summary lists of SNPs located in genes, synonymous and non-synonymous SNPs, sampling location and breed information. All these features make DoGSD a useful resource for in-depth analysis in dog-/wolf-related studies. PMID:25404132

  11. DoGSD: the dog and wolf genome SNP database

    PubMed Central

    Bai, Bing; Zhao, Wen-Ming; Tang, Bi-Xia; Wang, Yan-Qing; Wang, Lu; Zhang, Zhang; Yang, He-Chuan; Liu, Yan-Hu; Zhu, Jun-Wei; Irwin, David M.; Wang, Guo-Dong; Zhang, Ya-Ping

    2015-01-01

    The rapid advancement of next-generation sequencing technology has generated a deluge of genomic data from domesticated dogs and their wild ancestor, grey wolves, which have simultaneously broadened our understanding of domestication and diseases that are shared by humans and dogs. To address the scarcity of single nucleotide polymorphism (SNP) data provided by authorized databases and to make SNP data more easily/friendly usable and available, we propose DoGSD (http://dogsd.big.ac.cn), the first canidae-specific database which focuses on whole genome SNP data from domesticated dogs and grey wolves. The DoGSD is a web-based, open-access resource comprising ∼19 million high-quality whole-genome SNPs. In addition to the dbSNP data set (build 139), DoGSD incorporates a comprehensive collection of SNPs from two newly sequenced samples (1 wolf and 1 dog) and collected SNPs from three latest dog/wolf genetic studies (7 wolves and 68 dogs), which were taken together for analysis with the population genetic statistics, Fst. In addition, DoGSD integrates some closely related information including SNP annotation, summary lists of SNPs located in genes, synonymous and non-synonymous SNPs, sampling location and breed information. All these features make DoGSD a useful resource for in-depth analysis in dog-/wolf-related studies. PMID:25404132

  12. Substantial SNP-based heritability estimates for working memory performance

    PubMed Central

    Vogler, C; Gschwind, L; Coynel, D; Freytag, V; Milnik, A; Egli, T; Heck, A; de Quervain, D J-F; Papassotiropoulos, A

    2014-01-01

    Working memory (WM) is an important endophenotype in neuropsychiatric research and its use in genetic association studies is thought to be a promising approach to increase our understanding of psychiatric disease. As for any genetically complex trait, demonstration of sufficient heritability within the specific study context is a prerequisite for conducting genetic studies of that trait. Recently developed methods allow estimating trait heritability using sets of common genetic markers from genome-wide association study (GWAS) data in samples of unrelated individuals. Here we present single-nucleotide polymorphism (SNP)-based heritability estimates (h²SNP) for a WM phenotype. A Caucasian sample comprising a total of N=2298 healthy and young individuals was subjected to an N-back WM task. We calculated the genetic relationship between all individuals on the basis of genome-wide SNP data and performed restricted maximum likelihood analyses for variance component estimation to derive the h²SNP estimates. Heritability estimates for three 2-back derived WM performance measures based on all autosomal chromosomes ranged between 31 and 41%, indicating a substantial SNP-based heritability for WM traits. These results indicate that common genetic factors account for a prominent part of the phenotypic variation in WM performance. Hence, the application of GWAS on WM phenotypes is a valid method to identify the molecular underpinnings of WM. PMID:25203169
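
    The "genetic relationship between all individuals" step can be illustrated with a small sketch. It uses one common construction, the average over SNPs of the product of allele counts centred by 2p and scaled by 2p(1-p), where p is the sample allele frequency. The abstract does not state which formula or software the authors used, so this choice, the function name and the 0/1/2 genotype coding are assumptions. The restricted maximum likelihood step is not shown.

```python
def genetic_relationship_matrix(genotypes):
    """Pairwise genetic relationships from allele counts.

    genotypes: one list per individual, holding 0, 1 or 2 copies of the
    reference allele at each SNP (no missing values in this sketch).
    """
    n = len(genotypes)
    m = len(genotypes[0])
    # Sample allele frequency at each SNP.
    freqs = [sum(ind[j] for ind in genotypes) / (2.0 * n) for j in range(m)]
    # Monomorphic SNPs carry no information and would divide by zero.
    used = [j for j in range(m) if 0.0 < freqs[j] < 1.0]
    if not used:
        raise ValueError("no polymorphic SNPs")
    grm = [[0.0] * n for _ in range(n)]
    for a in range(n):
        for b in range(a, n):
            total = 0.0
            for j in used:
                p = freqs[j]
                total += ((genotypes[a][j] - 2 * p) * (genotypes[b][j] - 2 * p)
                          / (2 * p * (1 - p)))
            grm[a][b] = grm[b][a] = total / len(used)
    return grm
```

    For example, genetic_relationship_matrix([[0, 1, 2], [2, 1, 0], [1, 1, 1]]) returns a symmetric 3 x 3 matrix in which the first two individuals, who carry opposite homozygous genotypes, have a negative relationship.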

  13. Whole genome SNP scanning of snow sheep (Ovis nivicola).

    PubMed

    Deniskova, T E; Okhlopkov, I M; Sermyagin, A A; Gladyr', E A; Bagirov, V A; Sölkner, J; Mamaev, N V; Brem, G; Zinov'eva, N A

    2016-07-01

    This is the first report performing the whole genome SNP scanning of snow sheep (Ovis nivicola). Samples of snow sheep (n = 18) were collected in six different regions of the Republic of Sakha (Yakutia) from 64° to 71° N. For SNP genotyping, we applied Ovine 50K SNP BeadChip (Illumina, United States), designed for domestic sheep. The total number of genotyped SNPs (call rate 90%) was 47796 (88.1% of total SNPs), wherein 1006 SNPs were polymorphic (2.1%). Principal component analysis (PCA) showed the clear differentiation within the species O. nivicola: studied individuals were distributed among five distinct arrays corresponding to the geographical locations of sampling points. Our results demonstrate that the DNA chip designed for domestic sheep can be successfully used to study the allele pool and the genetic structure of snow sheep populations. PMID:27599514

  14. Conditions for the validity of SNP-based heritability estimation

    PubMed Central

    2014-01-01

    The heritability of a trait (h²) is the proportion of its population variance caused by genetic differences, and estimates of this parameter are important for interpreting the results of genome-wide association studies (GWAS). In recent years, researchers have adopted a novel method for estimating a lower bound on heritability directly from GWAS data that uses realized genetic similarities between nominally unrelated individuals. The quantity estimated by this method is purported to be the contribution to heritability that could in principle be recovered from association studies employing the given panel of SNPs (h²SNP). Thus far, the validity of this approach has mostly been tested empirically. Here, we provide a mathematical explication and show that the method should remain a robust means of obtaining h²SNP under circumstances wider than those under which it has so far been derived. PMID:24744256

  15. SNP variation in ADRB3 gene reflects the breed difference of sheep populations.

    PubMed

    Wu, Jianliang; Qiao, Liying; Liu, Jianhua; Yuan, Yanan; Liu, Wenzhong

    2012-08-01

    The β3-adrenergic receptor (ADRB3), a G-protein coupled receptor, plays a major role in energy metabolism and regulation of lipolysis and homeostasis. We detected single nucleotide polymorphism (SNP) variation in the full-length sequence of the ovine ADRB3 gene in 12 domestic sheep populations of four types by polymerase chain reaction-single strand conformation polymorphism and sequencing to reveal breed differences. Twenty-two SNPs were detected, 12 in exon 1 and ten in the intron; of these, 12 exonic and four intronic SNPs were new. The most SNPs were found in Shanxi Dam Line and the fewest in Dorset. The average SNP number in both meat and dual purpose for meat and wool breeds was significantly higher than in general and dual purpose breeds for wool and meat. The frequency of each SNP differed among the studied breeds or types. The 18C Del and 1617T Ins occurred mainly in dual purpose breeds for wool and meat. The 25A Del, 119C>G and 130C>T were mostly found in the meat and dual purpose for meat and wool breeds. The 1764C>A occurred more frequently in meat breeds than in other types. The majority of variation came from within the populations, as suggested by analysis of molecular variance. Close relationships were found among the Chinese breeds and among the western breeds, respectively. In conclusion, SNPs of the ovine ADRB3 gene can reflect breed differences, within- and between-population variation and, to a great extent, breed relationships. PMID:22711302

  16. SNP diversity within and among Brassica rapa accessions reveals no geographic differentiation.

    PubMed

    Tanhuanpää, P; Erkkilä, M; Tenhola-Roininen, T; Tanskanen, J; Manninen, O

    2016-01-01

    Genetic diversity was studied in a collection of 61 accessions of Brassica rapa, which were mostly oil-type turnip rapes but also included two oil-type subsp. dichotoma and five subsp. trilocularis accessions, as well as three leaf-type subspecies (subsp. japonica, pekinensis, and chinensis) and five turnip cultivars (subsp. rapa). Two-hundred and nine SNP markers, which had been discovered by amplicon resequencing, were used to genotype 893 plants from the B. rapa collection using Illumina BeadXpress. There was great variation in the diversity indices between accessions. With STRUCTURE analysis, the plant collection could be divided into three groups that seemed to correspond to morphotype and flowering habit but not to geography. According to AMOVA analysis, 65% of the variation was due to variation within accessions, 25% among accessions, and 10% among groups. A smaller subset of the plant collection, 12 accessions, was also studied with 5727 GBS-SNPs. Diversity indices obtained with GBS-SNPs correlated well with those obtained with Illumina BeadXpress SNPs. The developed SNP markers have already been used and will be used in future plant breeding programs as well as in mapping and diversity studies. PMID:26694015

  17. Genetic mapping in grapevine using a SNP microarray: intensity values

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Genotyping microarrays are widely used for genome wide association studies, but in high-diversity organisms, the quality of SNP calls can be diminished by genetic variation near the assayed nucleotide. To address this limitation in grapevine, we developed a simple heuristic that uses hybridization i...

  18. SNP marker diversity in common bean (Phaseolus vulgaris L.).

    PubMed

    Cortés, Andrés J; Chavarro, Martha C; Blair, Matthew W

    2011-09-01

    Single nucleotide polymorphism (SNP) markers have become a genetic technology of choice because of their automation and high precision of allele calls. In this study, our goal was to develop 94 SNPs and test them across well-chosen common bean (Phaseolus vulgaris L.) germplasm. We validated and assessed SNP diversity at 84 gene-based and 10 non-genic loci using KASPar technology in a panel of 70 genotypes that have been used as parents of mapping populations and have been previously evaluated for SSRs. SNPs exhibited high levels of genetic diversity, an excess of middle frequency polymorphism, and a within-genepool mismatch distribution as expected for populations affected by sudden demographic expansions after domestication bottlenecks. This set of markers was useful for distinguishing Andean and Mesoamerican genotypes but less useful for distinguishing within each gene pool. In summary, slightly greater polymorphism and race structure was found within the Andean gene pool than within the Mesoamerican gene pool but polymorphism rate between genotypes was consistent with genepool and race identity. Our survey results represent a baseline for the choice of SNP markers for future applications because gene-associated SNPs could themselves be causative SNPs for traits. Finally, we discuss that the ideal genetic marker combination with which to carry out diversity, mapping and association studies in common bean should consider a mix of both SNP and SSR markers. PMID:21785951

  19. SNP Discovery through Next-Generation Sequencing and Its Applications

    PubMed Central

    Kumar, Santosh; Banks, Travis W.; Cloutier, Sylvie

    2012-01-01

    The decreasing cost along with rapid progress in next-generation sequencing and related bioinformatics computing resources has facilitated large-scale discovery of SNPs in various model and nonmodel plant species. Large numbers and genome-wide availability of SNPs make them the marker of choice in partially or completely sequenced genomes. Although excellent reviews have been published on next-generation sequencing, its associated bioinformatics challenges, and the applications of SNPs in genetic studies, a comprehensive review connecting these three intertwined research areas is needed. This paper touches upon various aspects of SNP discovery, highlighting key points in availability and selection of appropriate sequencing platforms, bioinformatics pipelines, SNP filtering criteria, and applications of SNPs in genetic analyses. The use of next-generation sequencing methodologies in many non-model crops leading to discovery and implementation of SNPs in various genetic studies is discussed. Development and improvement of bioinformatics software that are open source and freely available have accelerated the SNP discovery while reducing the associated cost. Key considerations for SNP filtering and associated pipelines are discussed in specific topics. A list of commonly used software and their sources is compiled for easy access and reference. PMID:23227038

  20. Do you really know where this SNP goes?

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The release of build 10.2 of the swine genome was a marked improvement over previous builds and has proven extremely useful. However, as most know, there are regions of the genome that this particular build does not accurately represent. For instance, nearly 25% of the 62,162 SNP on the Illumina Por...

  1. Analysis of genetic diversity using SNP markers in oat

    Technology Transfer Automated Retrieval System (TEKTRAN)

    A large-scale single nucleotide polymorphism (SNP) discovery was carried out in cultivated oat using Roche 454 sequencing methods. DNA sequences were generated from cDNAs originating from a panel of 20 diverse oat cultivars, and from Diversity Array Technology (DArT) genomic complexity reductions fr...

  2. Software solutions for the livestock genomics SNP array revolution.

    PubMed

    Nicolazzi, E L; Biffani, S; Biscarini, F; Orozco Ter Wengel, P; Caprera, A; Nazzicari, N; Stella, A

    2015-08-01

    Since the beginning of the genomic era, the number of available single nucleotide polymorphism (SNP) arrays has grown considerably. In the bovine species alone, 11 SNP chips not completely covered by intellectual property are currently available, and the number is growing. Genomic/genotype data are not standardized, and this hampers their exchange and integration. In addition, software used for the analyses of these data usually requires non-standard (i.e. case-specific) input files which, considering the large amount of data to be handled, require at least some programming skills in their production. In this work, we describe a software toolkit for SNP array data management, imputation, genome-wide association studies, population genetics and genomic selection. However, this toolkit does not solve the critical need for standardization of the genotypic data and software input files. It only highlights the chaotic situation each researcher has to face on a daily basis and gives some helpful advice on the currently available tools in order to navigate the SNP array data complexity. PMID:25907889

  3. Association of Atherosclerotic Peripheral Arterial Disease with Adiponectin Genes SNP+45 and SNP+276: A Case-Control Study

    PubMed Central

    Gherman, Claudia D.; Bolboacă, Sorana D.

    2013-01-01

    Objectives. We hypothesized that adiponectin gene SNP+45 (rs2241766) and SNP+276 (rs1501299) would be associated with atherosclerotic peripheral arterial disease (PAD). Furthermore, the association between circulating adiponectin levels, fetuin-A, and tumoral necrosis factor-alpha (TNF-α) in patients with atherosclerotic peripheral arterial disease was investigated. Method. Several blood parameters (such as adiponectin, fetuin-A, and TNF-α) were measured in 346 patients, 226 with atherosclerotic peripheral arterial disease (PAD) and 120 without symptomatic PAD (non-PAD). Two common SNPs of the ADIPOQ gene represented by +45T/G 2 and +276G/T were also investigated. Results. Adiponectin concentrations showed lower circulating levels in the PAD patients compared to non-PAD patients (P < 0.001). Decreasing adiponectin concentration was associated with increasing serum levels of fetuin-A in the PAD patients. None of the investigated adiponectin SNPs proved to be associated with the subjects' susceptibility to PAD (P > 0.05). Conclusion. The results of our study demonstrated that neither adiponectin SNP+45 nor SNP+276 is associated with the risk of PAD. PMID:23819115

  4. High throughput SNP discovery and validation in the pig: towards the development of a high density swine SNP chip

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Recent developments in sequencing technology have allowed the generation of millions of short read sequences in a fast and inexpensive way. This enables the cost effective large scale identification of hundreds of thousands of SNPs needed for the development of high density SNP arrays. Currently, a ...

  5. Large-Scale SNP Discovery through RNA Sequencing and SNP Genotyping by Targeted Enrichment Sequencing in Cassava (Manihot esculenta Crantz)

    PubMed Central

    Pootakham, Wirulda; Shearman, Jeremy R.; Ruang-areerate, Panthita; Sonthirod, Chutima; Sangsrakru, Duangjai; Jomchai, Nukoon; Yoocha, Thippawan; Triwitayakorn, Kanokporn; Tragoonrung, Somvong; Tangphatsornruang, Sithichoke

    2014-01-01

    Cassava (Manihot esculenta Crantz) is one of the most important crop species being the main source of dietary energy in several countries. Marker-assisted selection has become an essential tool in plant breeding. Single nucleotide polymorphism (SNP) discovery via transcriptome sequencing is an attractive strategy for genome complexity reduction in organisms with large genomes. We sequenced the transcriptome of 16 cassava accessions using the Illumina HiSeq platform and identified 675,559 EST-derived SNP markers. A subset of those markers was subsequently genotyped by capture-based targeted enrichment sequencing in 100 F1 progeny segregating for starch viscosity phenotypes. A total of 2,110 non-redundant SNP markers were used to construct a genetic map. This map encompasses 1,785 cM and consists of 19 linkage groups. A major quantitative trait locus (QTL) controlling starch pasting properties was identified and shown to coincide with the QTL previously reported for this trait. With a high-density SNP-based linkage map presented here, we also uncovered a novel QTL associated with starch pasting time on LG 10. PMID:25551642

  6. Multiple SNP Set Analysis for Genome-Wide Association Studies Through Bayesian Latent Variable Selection.

    PubMed

    Lu, Zhao-Hua; Zhu, Hongtu; Knickmeyer, Rebecca C; Sullivan, Patrick F; Williams, Stephanie N; Zou, Fei

    2015-12-01

    The power of genome-wide association studies (GWAS) for mapping complex traits with single-SNP analysis (where SNP is single-nucleotide polymorphism) may be undermined by modest SNP effect sizes, unobserved causal SNPs, correlation among adjacent SNPs, and SNP-SNP interactions. Alternative approaches for testing the association between a single SNP set and individual phenotypes have been shown to be promising for improving the power of GWAS. We propose a Bayesian latent variable selection (BLVS) method to simultaneously model the joint association mapping between a large number of SNP sets and complex traits. Compared with single SNP set analysis, such joint association mapping not only accounts for the correlation among SNP sets but also is capable of detecting causal SNP sets that are marginally uncorrelated with traits. The spike-and-slab prior assigned to the effects of SNP sets can greatly reduce the dimension of effective SNP sets, while speeding up computation. An efficient Markov chain Monte Carlo algorithm is developed. Simulations demonstrate that BLVS outperforms several competing variable selection methods in some important scenarios. PMID:26515609

  7. High-throughput SNP genotyping for breeding applications in rice using the BeadXpress platform

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Multiplexed single nucleotide polymorphism (SNP) markers have the potential to increase the speed and cost-effectiveness of genotyping, provided that an optimal SNP density is used for each application. To test the efficiency of multiplexed SNP genotyping for diversity, mapping and breeding applicat...

  8. Development of Single Nucleotide Polymorphism (SNP) Markers for Use in Commercial Maize (Zea Mays L.) Germplasm

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The development of single nucleotide polymorphism (SNP) markers in maize offers the opportunity to utilize DNA markers in many new areas of population genetics, gene discovery, plant breeding, and germplasm identification. However, the steps from sequencing and SNP discovery to SNP marker design and ...

  9. The development and characterization of a 57K SNP array for rainbow trout

    Technology Transfer Automated Retrieval System (TEKTRAN)

    In this paper we describe the development and characterization of the first high density single nucleotide polymorphism (SNP) genotyping array for rainbow trout. The SNP array is publicly available from a commercial vendor. The SNP genotyping quality was high and the validation rate was close to 90%...

  10. Influence of MDM2 SNP309 and SNP285 status on the risk of cancer in the breast, prostate, lung and colon.

    PubMed

    Gansmo, Liv B; Knappskog, Stian; Romundstad, Pål; Hveem, Kristian; Vatten, Lars; Lønning, Per E

    2015-07-01

    MDM2 is a key regulator of the p53 tumor suppressor protein and is overexpressed in many human cancers. Two single nucleotide polymorphisms (SNPs) located in the MDM2 intronic promoter (P2) have been found to exert biological function. The G-allele of SNP309T>G; rs2279744 increases MDM2 transcription and has been linked to increased cancer risk. In contrast, the less frequent SNP285G>C; rs117039649, which is in complete linkage disequilibrium with SNP309 (generating a SNP285C/309G variant haplotype), has been related to reduced MDM2 transcription and to reduced risk of breast, endometrial and ovarian cancer. In this large population-based case-control study, we genotyped SNP309 and SNP285 in 10,830 individuals, including cases with cancer of the breast (n=1,717), colon (n=1,532), lung (n=1,331) and prostate (n=2,501), as well as 3,749 non-cancer controls. We found a slightly reduced risk for lung cancer among individuals harboring the SNP309TG/GG genotypes compared to the SNP309TT genotype (OR= 0.86; CI = 0.67-0.98), but this association was restricted to women (OR = 0.77; CI = 0.63-0.95) and was not present among men (OR = 0.91; CI = 0.77-1.08). Consistent with previous findings, we found a reduced risk for breast cancer among individuals carrying the SNP285GC/309GG genotype versus the SNP285GG/309GG genotype (OR = 0.55; CI = 0.33-0.93). In conclusion, our data support the hypothesis that the effects of both SNP285 and SNP309 status are tissue dependent. PMID:25431177
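
    The odds ratios and confidence intervals reported above can be illustrated with a minimal sketch. It assumes a simple unadjusted 2x2 table and a Wald-type (Woolf) interval on the log scale; the study itself may have used adjusted models, and the counts below are hypothetical, not taken from the paper.

```python
import math


def odds_ratio_ci(exposed_cases, unexposed_cases,
                  exposed_controls, unexposed_controls, z=1.96):
    """Unadjusted odds ratio with a Wald (Woolf) confidence interval.

    All four arguments are cell counts of a 2x2 table and must be > 0.
    z = 1.96 gives an approximate 95% interval.
    """
    odds_ratio = (exposed_cases * unexposed_controls) / (
        unexposed_cases * exposed_controls)
    # Standard error of log(OR) is the root of the summed reciprocal counts.
    se_log_or = math.sqrt(1 / exposed_cases + 1 / unexposed_cases
                          + 1 / exposed_controls + 1 / unexposed_controls)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts: carriers vs. non-carriers among cases and controls.
or_value, ci_low, ci_high = odds_ratio_ci(20, 80, 10, 90)
print(f"OR = {or_value:.2f}; CI = {ci_low:.2f}-{ci_high:.2f}")
```

    An interval that excludes 1 corresponds to an association that is nominally significant at the chosen level.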

  11. Role of an SNP in Alternative Splicing of Bovine NCF4 and Mastitis Susceptibility

    PubMed Central

    Wang, Xiuge; Yang, Chunhong; Sun, Yan; Jiang, Qiang; Wang, Fei; Li, Mengjiao; Zhong, Jifeng; Huang, Jinming

    2015-01-01

    Neutrophil cytosolic factor 4 (NCF4) is a component of the nicotinamide dinucleotide phosphate oxidase complex, a key factor in biochemical pathways and innate immune responses. In this study, splice variants and functional single-nucleotide polymorphisms (SNPs) of NCF4 were identified to determine the variability and association of the gene with susceptibility to bovine mastitis characterized by inflammation. A novel splice variant, designated as NCF4-TV and characterized by the retention of a 48 bp sequence in intron 9, was detected in the mammary gland tissues of infected cows. The expression of the NCF4-reference main transcript in the mastitic mammary tissues was higher than that in normal tissues. A novel SNP, g.18174 A>G, was also found in the retained 48 bp region of intron 9. To determine whether NCF4-TV could be due to the g.18174 A>G mutation, we constructed two mini-gene expression vectors with the wild-type or mutant NCF4 g.18174 A>G fragment. The vectors were then transiently transfected into 293T cells, and alternative splicing of NCF4 was analyzed by reverse transcription-PCR and sequencing. Mini-gene splicing assay demonstrated that the aberrantly spliced NCF4-TV with 48 bp retained fragment in intron 9 could be due to g.18174 A>G, which was associated with milk somatic count score and increased risk of mastitis infection in cows. NCF4 expression was also regulated by alternative splicing. This study proposes that NCF4 splice variants generated by functional SNP are important risk factors for mastitis susceptibility in dairy cows. PMID:26600390

  12. Investigating single nucleotide polymorphism (SNP) density in the human genome and its implications for molecular evolution.

    PubMed

    Zhao, Zhongming; Fu, Yun-Xin; Hewett-Emmett, David; Boerwinkle, Eric

    2003-07-17

    We investigated the single nucleotide polymorphism (SNP) density across the human genome and in different genic categories using two SNP databases: Celera's CgsSNP, which includes SNPs identified by comparing genomic sequences, and Celera's RefSNP, which includes SNPs from a variety of sources and is biased toward disease-associated genes. Based on CgsSNP, the average number of SNPs per 10 kb was 8.33, 8.44, and 8.09 in the human genome, in intergenic regions, and in genic regions, respectively. In genic regions, the SNP density in intronic, exonic and adjoining untranslated regions was 8.21, 5.28, and 7.51 SNPs per 10 kb, respectively. The pattern of SNP density based on RefSNP was different from that based on CgsSNP, emphasizing its utility for genotype-phenotype association studies but not for most population genetic studies. The number of SNPs per chromosome was correlated with chromosome length, but the density of SNPs estimated by CgsSNP was not significantly correlated with the GC content of the chromosome. Based on CgsSNP, the ratio of nonsense to missense mutations (0.027), the ratio of missense to silent mutations (1.15), and the ratio of non-synonymous to synonymous mutations (1.18) were each less than half of that expected in a human protein coding sequence under the neutral mutation theory, reflecting a role for natural selection, especially purifying selection. PMID:12909357
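
    The density figures above are counts normalized to a 10 kb window. A minimal sketch of that arithmetic follows; the function name and the example counts are hypothetical and are not taken from the paper's data.

```python
def snps_per_10kb(snp_count, region_length_bp):
    """Return SNP density as the number of SNPs per 10,000 bp."""
    if region_length_bp <= 0:
        raise ValueError("region_length_bp must be positive")
    return snp_count / region_length_bp * 10_000


# Hypothetical region: 833 SNPs observed across 1 Mb of sequence.
density = snps_per_10kb(833, 1_000_000)
print(f"{density:.2f} SNPs per 10 kb")
```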

  13. Multiple SNP-sets Analysis for Genome-wide Association Studies through Bayesian Latent Variable Selection

    PubMed Central

    Lu, Zhaohua; Zhu, Hongtu; Knickmeyer, Rebecca C; Sullivan, Patrick F.; Stephanie, Williams N.; Zou, Fei

    2015-01-01

    The power of genome-wide association studies (GWAS) for mapping complex traits with single SNP analysis may be undermined by modest SNP effect sizes, unobserved causal SNPs, correlation among adjacent SNPs, and SNP-SNP interactions. Alternative approaches for testing the association between a single SNP-set and individual phenotypes have been shown to be promising for improving the power of GWAS. We propose a Bayesian latent variable selection (BLVS) method to simultaneously model the joint association mapping between a large number of SNP-sets and complex traits. Compared to single SNP-set analysis, such joint association mapping not only accounts for the correlation among SNP-sets, but also is capable of detecting causal SNP-sets that are marginally uncorrelated with traits. The spike-slab prior assigned to the effects of SNP-sets can greatly reduce the dimension of effective SNP-sets, while speeding up computation. An efficient MCMC algorithm is developed. Simulations demonstrate that BLVS outperforms several competing variable selection methods in some important scenarios. PMID:26515609

  14. mrsFAST-Ultra: a compact, SNP-aware mapper for high performance sequencing applications

    PubMed Central

    Hach, Faraz; Sarrafi, Iman; Hormozdiari, Farhad; Alkan, Can; Eichler, Evan E.; Sahinalp, S. Cenk

    2014-01-01

    High throughput sequencing (HTS) platforms generate unprecedented amounts of data that introduce challenges for processing and downstream analysis. While tools that report the ‘best’ mapping location of each read provide a fast way to process HTS data, they are not suitable for many types of downstream analysis such as structural variation detection, where it is important to report multiple mapping loci for each read. For this purpose we introduce mrsFAST-Ultra, a fast, cache oblivious, SNP-aware aligner that can handle the multi-mapping of HTS reads very efficiently. mrsFAST-Ultra improves mrsFAST, our first cache oblivious read aligner capable of handling multi-mapping reads, through new and compact index structures that reduce not only the overall memory usage but also the number of CPU operations per alignment. In fact the size of the index generated by mrsFAST-Ultra is 10 times smaller than that of mrsFAST. As importantly, mrsFAST-Ultra introduces new features such as being able to (i) obtain the best mapping loci for each read, and (ii) return all reads that have at most n mapping loci (within an error threshold), together with these loci, for any user specified n. Furthermore, mrsFAST-Ultra is SNP-aware, i.e. it can map reads to reference genome while discounting the mismatches that occur at common SNP locations provided by db-SNP; this significantly increases the number of reads that can be mapped to the reference genome. Notice that all of the above features are implemented within the index structure and are not simple post-processing steps and thus are performed highly efficiently. Finally, mrsFAST-Ultra utilizes multiple available cores and processors and can be tuned for various memory settings. Our results show that mrsFAST-Ultra is roughly five times faster than its predecessor mrsFAST. In comparison to newly enhanced popular tools such as Bowtie2, it is more sensitive (it can report 10 times or more mappings per read) and much faster (six times
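
    The idea of SNP-aware, multi-mapping alignment described above can be sketched as follows. This is a brute-force illustration only, assuming a plain Hamming-distance scan; it is not the index-based algorithm of mrsFAST-Ultra, and the function names, the reference string, and the SNP table are hypothetical.

```python
def snp_aware_mismatches(read, reference, start, known_snps):
    """Count mismatches between `read` and the reference window at `start`,
    ignoring positions where the read carries a known SNP allele.

    known_snps maps a 0-based reference position to a set of alternate bases.
    """
    mismatches = 0
    for offset, base in enumerate(read):
        pos = start + offset
        if base == reference[pos]:
            continue
        # Discount the mismatch when the read base is a catalogued allele.
        if base in known_snps.get(pos, ()):
            continue
        mismatches += 1
    return mismatches


def map_read(read, reference, known_snps, max_mismatches):
    """Return every start position where the read maps within the threshold."""
    hits = []
    for start in range(len(reference) - len(read) + 1):
        if snp_aware_mismatches(read, reference, start, known_snps) <= max_mismatches:
            hits.append(start)
    return hits


reference = "ACGTACGTTAGC"
known_snps = {5: {"T"}}   # hypothetical: reference C at position 5, alternate T
read = "TATG"             # equals reference[3:7] except at the SNP position

print(map_read(read, reference, known_snps, 0))  # SNP-aware mapping
print(map_read(read, reference, {}, 0))          # same read, no SNP table
```

    With the SNP table the read maps with zero effective mismatches; without it, the same read finds no exact placement, which is the effect the abstract attributes to SNP awareness.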

  15. Genetic heterogeneity in rhabdomyosarcoma revealed by SNP array analysis.

    PubMed

    Walther, Charles; Mayrhofer, Markus; Nilsson, Jenny; Hofvander, Jakob; Jonson, Tord; Mandahl, Nils; Øra, Ingrid; Gisselsson, David; Mertens, Fredrik

    2016-01-01

    Rhabdomyosarcoma (RMS) is the most common soft tissue sarcoma in children and adolescents. Alveolar (ARMS) and embryonal (ERMS) histologies predominate, but rare cases are classified as spindle cell/sclerosing (SRMS). For treatment stratification, RMS is further subclassified as fusion-positive (FP-RMS) or fusion-negative (FN-RMS), depending on whether a gene fusion involving PAX3 or PAX7 is present or not. We investigated 19 cases of pediatric RMS using high resolution single-nucleotide polymorphism (SNP) array. FP-ARMS displayed, on average, more structural rearrangements than ERMS; the single FN-ARMS had a genomic profile similar to ERMS. Apart from previously known amplification (e.g., MYCN, CDK4, and MIR17HG) and deletion (e.g., NF1, CDKN2A, and CDKN2B) targets, amplification of ERBB2 and homozygous loss of ASCC3 or ODZ3 were seen. Combining SNP array with cytogenetic data revealed that most cases were polyploid, with at least one case having started as a near-haploid tumor. Further bioinformatic analysis of the SNP array data disclosed genetic heterogeneity, in the form of subclonal chromosomal imbalances, in five tumors. The outcome was worse for patients with FP-ARMS than ERMS or FN-ARMS (6/8 vs. 1/9 dead of disease), and the only children with ERMS showing intratumor diversity or with MYOD1 mutation-positive SRMS also died of disease. High resolution SNP array can be useful in evaluating genomic imbalances in pediatric RMS. PMID:26482321

  16. Introgression browser: high-throughput whole-genome SNP visualization.

    PubMed

    Aflitos, Saulo Alves; Sanchez-Perez, Gabino; de Ridder, Dick; Fransz, Paul; Schranz, Michael E; de Jong, Hans; Peters, Sander A

    2015-04-01

    Breeding by introgressive hybridization is a pivotal strategy to broaden the genetic basis of crops. Usually, the desired traits are monitored in consecutive crossing generations by marker-assisted selection, but their analyses fail in chromosome regions where crossover recombinants are rare or not viable. Here, we present the Introgression Browser (iBrowser), a bioinformatics tool aimed at visualizing introgressions at nucleotide or SNP (Single Nucleotide Polymorphisms) accuracy. The software selects homozygous SNPs from Variant Call Format (VCF) information and filters out heterozygous SNPs, multi-nucleotide polymorphisms (MNPs) and insertion-deletions (InDels). For data analysis iBrowser makes use of sliding windows, but if needed it can generate any desired fragmentation pattern through General Feature Format (GFF) information. In an example of tomato (Solanum lycopersicum) accessions we visualize SNP patterns and elucidate both position and boundaries of the introgressions. We also show that our tool is capable of identifying alien DNA in a panel of the closely related S. pimpinellifolium by examining phylogenetic relationships of the introgressed segments in tomato. In a third example, we demonstrate the power of the iBrowser in a panel of 597 Arabidopsis accessions, detecting the boundaries of a SNP-free region around a polymorphic 1.17 Mbp inverted segment on the short arm of chromosome 4. The architecture and functionality of iBrowser makes the software appropriate for a broad set of analyses including SNP mining, genome structure analysis, and pedigree analysis. Its functionality, together with the capability to process large data sets and efficient visualization of sequence variation, makes iBrowser a valuable breeding tool. PMID:25704554
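
    The filtering and windowing steps described above can be sketched as follows. This is a simplified illustration, assuming records already parsed into (position, REF, ALT, genotype) tuples and fixed non-overlapping windows; it is not the iBrowser implementation, and all names and records are hypothetical.

```python
def homozygous_snp_positions(records):
    """Keep positions of homozygous alternate SNPs.

    records: iterable of (pos, ref, alt, gt) with 1-based positions and a
    VCF-style genotype string such as "1/1" or "0|1".
    Heterozygous calls, MNPs and InDels are filtered out.
    """
    kept = []
    for pos, ref, alt, gt in records:
        alleles = gt.replace("|", "/").split("/")
        is_snp = len(ref) == 1 and len(alt) == 1
        is_hom_alt = len(set(alleles)) == 1 and alleles[0] not in ("0", ".")
        if is_snp and is_hom_alt:
            kept.append(pos)
    return kept


def windowed_counts(positions, chrom_length, window_size):
    """Count SNPs in consecutive, non-overlapping windows."""
    n_windows = -(-chrom_length // window_size)  # ceiling division
    counts = [0] * n_windows
    for pos in positions:
        counts[(pos - 1) // window_size] += 1    # positions are 1-based
    return counts


records = [
    (5, "A", "G", "1/1"),    # homozygous SNP: kept
    (12, "C", "T", "0/1"),   # heterozygous: dropped
    (18, "G", "GA", "1/1"),  # insertion: dropped
    (27, "T", "C", "1|1"),   # phased homozygous SNP: kept
    (29, "AT", "GC", "1/1"), # MNP: dropped
]
positions = homozygous_snp_positions(records)
print(windowed_counts(positions, chrom_length=30, window_size=10))
```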

  17. Detection of homologous horizontal gene transfer in SNP data

    Energy Science and Technology Software Center (ESTSC)

    2012-07-23

    We study the detection of mutations, sequencing errors, and homologous horizontal gene transfers (HGT) in a set of closely related microbial genomes. We base the model on single nucleotide polymorphisms (SNPs) and break the genomes into blocks to handle the rearrangement problem. Then we apply a dynamic programming algorithm to model whether changes within each block are likely a result of mutations, sequencing errors, or HGT.

  18. Robust Demographic Inference from Genomic and SNP Data

    PubMed Central

    Excoffier, Laurent; Dupanloup, Isabelle; Huerta-Sánchez, Emilia; Sousa, Vitor C.; Foll, Matthieu

    2013-01-01

    We introduce a flexible and robust simulation-based framework to infer demographic parameters from the site frequency spectrum (SFS) computed on large genomic datasets. We show that our composite-likelihood approach allows one to study evolutionary models of arbitrary complexity, which cannot be tackled by other current likelihood-based methods. For simple scenarios, our approach compares favorably in terms of accuracy and speed with ∂a∂i, the current reference in the field, while showing better convergence properties for complex models. We first apply our methodology to non-coding genomic SNP data from four human populations. To infer their demographic history, we compare neutral evolutionary models of increasing complexity, including unsampled populations. We further show the versatility of our framework by extending it to the inference of demographic parameters from SNP chips with known ascertainment, such as that recently released by Affymetrix to study human origins. Whereas previous ways of handling ascertained SNPs were either restricted to a single population or only allowed the inference of divergence time between a pair of populations, our framework can correctly infer parameters of more complex models including the divergence of several populations, bottlenecks and migration. We apply this approach to the reconstruction of African demography using two distinct ascertained human SNP panels studied under two evolutionary models. The two SNP panels lead to globally very similar estimates and confidence intervals, and suggest an ancient divergence (>110 Ky) between Yoruba and San populations. Our methodology appears well suited to the study of complex scenarios from large genomic data sets. PMID:24204310
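
    The site frequency spectrum that serves as the input statistic above can be computed as in the following minimal sketch. It assumes a single population and haplotypes coded 0 (ancestral) and 1 (derived); it shows only how an SFS is tallied, not the composite-likelihood inference itself, and the data are hypothetical.

```python
def site_frequency_spectrum(haplotypes):
    """Unfolded SFS: entry i is the number of sites with i derived alleles.

    haplotypes: list of equal-length lists of 0/1 alleles, one per chromosome.
    """
    n = len(haplotypes)
    n_sites = len(haplotypes[0])
    sfs = [0] * (n + 1)
    for site in range(n_sites):
        derived = sum(h[site] for h in haplotypes)
        sfs[derived] += 1
    return sfs


def fold(sfs):
    """Folded SFS: pool classes i and n - i when the ancestral state is unknown."""
    n = len(sfs) - 1
    folded = [0] * (n // 2 + 1)
    for count, n_sites in enumerate(sfs):
        folded[min(count, n - count)] += n_sites
    return folded


# Hypothetical sample: 4 chromosomes, 5 sites.
haplotypes = [
    [0, 1, 0, 1, 1],
    [0, 1, 0, 0, 1],
    [0, 0, 1, 0, 1],
    [0, 1, 0, 0, 1],
]
sfs = site_frequency_spectrum(haplotypes)
print(sfs, fold(sfs))
```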

  19. PanSNPdb: The Pan-Asian SNP Genotyping Database

    PubMed Central

    Ngamphiw, Chumpol; Assawamakin, Anunchai; Xu, Shuhua; Shaw, Philip J.; Yang, Jin Ok; Ghang, Ho; Bhak, Jong; Liu, Edison; Tongsima, Sissades

    2011-01-01

    The HUGO Pan-Asian SNP consortium conducted the largest survey to date of human genetic diversity among Asians by sampling 1,719 unrelated individuals among 71 populations from China, India, Indonesia, Japan, Malaysia, the Philippines, Singapore, South Korea, Taiwan, and Thailand. We have constructed a database (PanSNPdb), which contains these data and various new analyses of them. PanSNPdb is a research resource in the analysis of the population structure of Asian peoples, including linkage disequilibrium patterns, haplotype distributions, and copy number variations. Furthermore, PanSNPdb provides an interactive comparison with other SNP and CNV databases, including HapMap3, JSNP, dbSNP and DGV and thus provides a comprehensive resource of human genetic diversity. The information is accessible via a widely accepted graphical interface used in many genetic variation databases. Unrestricted access to PanSNPdb and any associated files is available at: http://www4a.biotec.or.th/PASNP. PMID:21731755

  20. How far from the SNP may the causative genes be?

    PubMed

    Brodie, Aharon; Azaria, Johnathan Roy; Ofran, Yanay

    2016-07-27

    While GWAS identify many disease-associated SNPs, using them to decipher disease mechanisms is hindered by the difficulty in mapping SNPs to genes. Most SNPs are in non-coding regions and it is often hard to identify the genes they implicate. To explore how far the SNP may be from the affected genes we used a pathway-based approach. We found that affected genes are often up to 2 Mbp away from the associated SNP, and are not necessarily the closest genes to the SNP. Existing approaches for mapping SNPs to genes leave many SNPs unmapped to genes and reveal only 86 significant phenotype-pathway associations for all known GWAS hits combined. Using the pathway-based approach we propose here allows mapping of virtually all SNPs to genes and reveals 435 statistically significant phenotype-pathway associations. In searching for mechanisms that may explain the relationships between SNPs and distant genes, we found that SNPs that are mapped to distant genes have significantly more large insertions/deletions around them than other SNPs, suggesting that these SNPs may sometimes be markers for large insertions/deletions that may affect large genomic regions. PMID:27269582

  1. Development of a forensic identity SNP panel for Indonesia.

    PubMed

    Augustinus, Daniel; Gahan, Michelle E; McNevin, Dennis

    2015-07-01

    Genetic markers included in forensic identity panels must exhibit Hardy-Weinberg and linkage equilibrium (HWE and LE). "Universal" panels designed for global use can fail these tests in regional jurisdictions exhibiting high levels of genetic differentiation such as the Indonesian archipelago. This is especially the case where a single DNA database is required for allele frequency estimates to calculate random match probabilities (RMPs) and associated likelihood ratios (LRs). A panel of 65 single nucleotide polymorphisms (SNPs) and a reduced set of 52 SNPs have been selected from 15 Indonesian subpopulations in the HUGO Pan Asian SNP database using a SNP selection strategy that could be applied to any panel of forensic identity markers. The strategy consists of four screening steps: (1) application of a G test for HWE; (2) ranking for high heterozygosity; (3) selection for LE; and (4) selection for low inbreeding depression. SNPs in our Indonesian panel perform well in comparison to some other universal SNP and short tandem repeat (STR) panels as measured by Fisher's exact test for HWE and LE and Wright's F statistics. PMID:25104323
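
    The first screening step above, a G test for Hardy-Weinberg equilibrium, can be sketched for a biallelic SNP as follows. This is a minimal illustration assuming one degree of freedom and the asymptotic chi-squared approximation; the genotype counts are hypothetical and the paper's exact procedure may differ.

```python
import math


def hwe_g_test(n_aa, n_ab, n_bb):
    """G test (likelihood-ratio test) for HWE at a biallelic locus.

    Returns (G statistic, p-value) using a chi-squared distribution with
    1 degree of freedom, whose survival function is erfc(sqrt(G / 2)).
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)   # frequency of allele A
    q = 1 - p
    observed = [n_aa, n_ab, n_bb]
    expected = [n * p * p, 2 * n * p * q, n * q * q]
    # Terms with an observed count of zero contribute nothing to G.
    g = 2 * sum(o * math.log(o / e)
                for o, e in zip(observed, expected) if o > 0)
    g = max(g, 0.0)                   # guard against tiny negative rounding
    return g, math.erfc(math.sqrt(g / 2))


print(hwe_g_test(25, 50, 25))   # counts exactly at HWE proportions
print(hwe_g_test(50, 0, 50))    # no heterozygotes: strong departure
```

    A SNP would pass this screening step when its p-value exceeds the chosen threshold; the threshold itself is a design choice not specified here.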

  2. Multiplex Detection and SNP Genotyping in a Single Fluorescence Channel

    PubMed Central

    Fu, Guoliang; Miles, Andrea; Alphey, Luke

    2012-01-01

    Probe-based PCR is widely used for SNP (single nucleotide polymorphism) genotyping and pathogen nucleic acid detection due to its simplicity, sensitivity and cost-effectiveness. However, the multiplex capability of hydrolysis probe-based PCR is normally limited to one target (pathogen or allele) per fluorescence channel. Current fluorescence PCR machines typically have 4–6 channels. We present a strategy permitting the multiplex detection of multiple targets in a single detection channel. The technique is named Multiplex Probe Amplification (MPA). Polymorphisms of the CYP2C9 gene (cytochrome P450, family 2, subfamily C, polypeptide 9, CYP2C9*2) and human papillomavirus sequences HPV16, 18, 31, 52 and 59 were chosen as model targets for testing MPA. The allele status of the CYP2C9*2 determined by MPA was entirely concordant with the reference TaqMan® SNP Genotyping Assays. The four HPV strain sequences could be independently detected in a single fluorescence detection channel. The results validate the multiplex capacity, the simplicity and accuracy of MPA for SNP genotyping and multiplex detection using different probes labeled with the same fluorophore. The technique offers a new way to multiplex in a single detection channel of a closed-tube PCR. PMID:22272339

  3. Multiplex detection and SNP genotyping in a single fluorescence channel.

    PubMed

    Fu, Guoliang; Miles, Andrea; Alphey, Luke

    2012-01-01

    Probe-based PCR is widely used for SNP (single nucleotide polymorphism) genotyping and pathogen nucleic acid detection due to its simplicity, sensitivity and cost-effectiveness. However, the multiplex capability of hydrolysis probe-based PCR is normally limited to one target (pathogen or allele) per fluorescence channel. Current fluorescence PCR machines typically have 4-6 channels. We present a strategy permitting the multiplex detection of multiple targets in a single detection channel. The technique is named Multiplex Probe Amplification (MPA). Polymorphisms of the CYP2C9 gene (cytochrome P450, family 2, subfamily C, polypeptide 9, CYP2C9*2) and human papillomavirus sequences HPV16, 18, 31, 52 and 59 were chosen as model targets for testing MPA. The allele status of the CYP2C9*2 determined by MPA was entirely concordant with the reference TaqMan® SNP Genotyping Assays. The four HPV strain sequences could be independently detected in a single fluorescence detection channel. The results validate the multiplex capacity, the simplicity and accuracy of MPA for SNP genotyping and multiplex detection using different probes labeled with the same fluorophore. The technique offers a new way to multiplex in a single detection channel of a closed-tube PCR. PMID:22272339

  4. Population distribution and ancestry of the cancer protective MDM2 SNP285 (rs117039649).

    PubMed

    Knappskog, Stian; Gansmo, Liv B; Dibirova, Khadizha; Metspalu, Andres; Cybulski, Cezary; Peterlongo, Paolo; Aaltonen, Lauri; Vatten, Lars; Romundstad, Pål; Hveem, Kristian; Devilee, Peter; Evans, Gareth D; Lin, Dongxin; Van Camp, Guy; Manolopoulos, Vangelis G; Osorio, Ana; Milani, Lili; Ozcelik, Tayfun; Zalloua, Pierre; Mouzaya, Francis; Bliznetz, Elena; Balanovska, Elena; Pocheshkova, Elvira; Kučinskas, Vaidutis; Atramentova, Lubov; Nymadawa, Pagbajabyn; Titov, Konstantin; Lavryashina, Maria; Yusupov, Yuldash; Bogdanova, Natalia; Koshel, Sergey; Zamora, Jorge; Wedge, David C; Charlesworth, Deborah; Dörk, Thilo; Balanovsky, Oleg; Lønning, Per E

    2014-09-30

    The MDM2 promoter SNP285C is located on the SNP309G allele. While SNP309G enhances Sp1 transcription factor binding and MDM2 transcription, SNP285C antagonizes Sp1 binding and reduces the risk of breast, ovarian and endometrial cancer. Assessing SNP285 and 309 genotypes across 25 different ethnic populations (>10,000 individuals), the incidence of SNP285C was 6-8% across European populations except for Finns (1.2%) and Saami (0.3%). The incidence decreased towards the Middle-East and Eastern Russia, and SNP285C was absent among Han Chinese, Mongolians and African Americans. Interhaplotype variation analyses estimated SNP285C to have originated about 14,700 years ago (95% CI: 8,300 - 33,300). Both this estimate and the geographical distribution suggest SNP285C to have arisen after the separation between Caucasians and modern day East Asians (17,000 - 40,000 years ago). We observed a strong inverse correlation (r = -0.805; p < 0.001) between the percentage of SNP309G alleles harboring SNP285C and the MAF for SNP309G itself across different populations suggesting selection and environmental adaptation with respect to MDM2 expression in recent human evolution. In conclusion, we found SNP285C to be a pan-Caucasian variant. Ethnic variation regarding distribution of SNP285C needs to be taken into account when assessing the impact of MDM2 SNPs on cancer risk. PMID:25327560

  5. Population distribution and ancestry of the cancer protective MDM2 SNP285 (rs117039649)

    PubMed Central

    Knappskog, Stian; Gansmo, Liv B.; Dibirova, Khadizha; Metspalu, Andres; Cybulski, Cezary; Peterlongo, Paolo; Aaltonen, Lauri; Vatten, Lars; Romundstad, Pål; Hveem, Kristian; Devilee, Peter; Evans, Gareth D.; Lin, Dongxin; Camp, Guy Van; Manolopoulos, Vangelis G.; Osorio, Ana; Milani, Lili; Ozcelik, Tayfun; Zalloua, Pierre; Mouzaya, Francis; Bliznetz, Elena; Balanovska, Elena; Pocheshkova, Elvira; Kučinskas, Vaidutis; Atramentova, Lubov; Nymadawa, Pagbajabyn; Titov, Konstantin; Lavryashina, Maria; Yusupov, Yuldash; Bogdanova, Natalia; Koshel, Sergey; Zamora, Jorge; Wedge, David C.; Charlesworth, Deborah; Dörk, Thilo; Balanovsky, Oleg; Lønning, Per E.

    2014-01-01

    The MDM2 promoter SNP285C is located on the SNP309G allele. While SNP309G enhances Sp1 transcription factor binding and MDM2 transcription, SNP285C antagonizes Sp1 binding and reduces the risk of breast, ovarian and endometrial cancer. Assessing SNP285 and 309 genotypes across 25 different ethnic populations (>10,000 individuals), the incidence of SNP285C was 6-8% across European populations except for Finns (1.2%) and Saami (0.3%). The incidence decreased towards the Middle-East and Eastern Russia, and SNP285C was absent among Han Chinese, Mongolians and African Americans. Interhaplotype variation analyses estimated SNP285C to have originated about 14,700 years ago (95% CI: 8,300 – 33,300). Both this estimate and the geographical distribution suggest SNP285C to have arisen after the separation between Caucasians and modern day East Asians (17,000 - 40,000 years ago). We observed a strong inverse correlation (r = -0.805; p < 0.001) between the percentage of SNP309G alleles harboring SNP285C and the MAF for SNP309G itself across different populations suggesting selection and environmental adaptation with respect to MDM2 expression in recent human evolution. In conclusion, we found SNP285C to be a pan-Caucasian variant. Ethnic variation regarding distribution of SNP285C needs to be taken into account when assessing the impact of MDM2 SNPs on cancer risk. PMID:25327560

  6. Fast and Rigorous Computation of Gene and Pathway Scores from SNP-Based Summary Statistics

    PubMed Central

    Rueedi, Rico; Kutalik, Zoltán; Bergmann, Sven

    2016-01-01

    Integrating single nucleotide polymorphism (SNP) p-values from genome-wide association studies (GWAS) across genes and pathways is a strategy to improve statistical power and gain biological insight. Here, we present Pascal (Pathway scoring algorithm), a powerful tool for computing gene and pathway scores from SNP-phenotype association summary statistics. For gene score computation, we implemented analytic and efficient numerical solutions to calculate test statistics. We examined in particular the sum and the maximum of chi-squared statistics, which measure the average and the strongest association signals per gene, respectively. For pathway scoring, we use a modified Fisher method, which offers not only significant power improvement over more traditional enrichment strategies, but also eliminates the problem of arbitrary threshold selection inherent in any binary membership based pathway enrichment approach. We demonstrate the marked increase in power by analyzing summary statistics from dozens of large meta-studies for various traits. Our extensive testing indicates that our method not only excels in rigorous type I error control, but also results in more biologically meaningful discoveries. PMID:26808494
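
    The pathway-scoring step above builds on Fisher's method for combining p-values. The sketch below shows the classical, unmodified Fisher method for independent p-values; it is not Pascal's modified version, and the function names and inputs are hypothetical.

```python
import math


def chi2_sf_even_dof(x, dof):
    """Survival function of a chi-squared variable with an even number of
    degrees of freedom, using the closed form
    exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!  with k = dof / 2.
    """
    if dof <= 0 or dof % 2:
        raise ValueError("dof must be a positive even integer")
    half = x / 2
    term = 1.0
    total = 1.0
    for i in range(1, dof // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


def fisher_combined_p(p_values):
    """Classical Fisher method: -2 * sum(ln p) follows a chi-squared
    distribution with 2k degrees of freedom for k independent p-values."""
    statistic = -2 * sum(math.log(p) for p in p_values)
    return chi2_sf_even_dof(statistic, 2 * len(p_values))


# Hypothetical gene-level p-values for one pathway.
print(fisher_combined_p([0.01, 0.04, 0.20]))
```

    The classical method assumes independent p-values, so correlated genes in a pathway would need separate handling.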

  7. Use of Microsatellite and SNP Markers for Biotype Characterization in Hessian Fly

    PubMed Central

    Crane, Yan Ma; Cambron, Sue E.; Crane, Charles F.; Shukle, Richard H.

    2015-01-01

    Exploration of the biotype structure of Hessian fly, Mayetiola destructor (Say) (Diptera: Cecidomyiidae), would improve our knowledge regarding variation in virulence phenotypes and difference in genetic background. Microsatellites (simple sequence repeats) and single-nucleotide polymorphisms (SNPs) are highly variable genetic markers that are widely used in population genetic studies. This study developed and tested a panel of 18 microsatellite and 22 SNP markers to investigate the genetic structure of nine Hessian fly biotypes: B, C, D, E, GP, L, O, vH9, and vH13. The simple sequence repeats were more polymorphic than the SNP markers, and their neighbor-joining trees differed in consequence. Microsatellites suggested a simple geographic association of related biotypes that did not progressively gain virulence with increasing genetic distance from a founder type. Use of the k-means clustering algorithm in the STRUCTURE program shows that the nine biotypes comprise six to eight populations that are related to geography or history within laboratory cultures. PMID:26543089

  8. Fast and Rigorous Computation of Gene and Pathway Scores from SNP-Based Summary Statistics.

    PubMed

    Lamparter, David; Marbach, Daniel; Rueedi, Rico; Kutalik, Zoltán; Bergmann, Sven

    2016-01-01

    Integrating single nucleotide polymorphism (SNP) p-values from genome-wide association studies (GWAS) across genes and pathways is a strategy to improve statistical power and gain biological insight. Here, we present Pascal (Pathway scoring algorithm), a powerful tool for computing gene and pathway scores from SNP-phenotype association summary statistics. For gene score computation, we implemented analytic and efficient numerical solutions to calculate test statistics. We examined in particular the sum and the maximum of chi-squared statistics, which measure the average and the strongest association signals per gene, respectively. For pathway scoring, we use a modified Fisher method, which offers not only significant power improvement over more traditional enrichment strategies, but also eliminates the problem of arbitrary threshold selection inherent in any binary membership based pathway enrichment approach. We demonstrate the marked increase in power by analyzing summary statistics from dozens of large meta-studies for various traits. Our extensive testing indicates that our method not only excels in rigorous type I error control, but also results in more biologically meaningful discoveries. PMID:26808494

  9. SNPs3D: Candidate gene and SNP selection for association studies

    PubMed Central

    Yue, Peng; Melamud, Eugene; Moult, John

    2006-01-01

    Background: The relationship between disease susceptibility and genetic variation is complex, and many different types of data are relevant. We describe a web resource and database that provides and integrates as much information as possible on disease/gene relationships at the molecular level. Description: The resource has three primary modules. One module identifies which genes are candidates for involvement in a specified disease. A second module provides information about the relationships between sets of candidate genes. The third module analyzes the likely impact of non-synonymous SNPs on protein function. Disease/candidate gene relationships and gene-gene relationships are derived from the literature using simple but effective text profiling. SNP/protein function relationships are derived by two methods, one using principles of protein structure and stability, the other based on sequence conservation. Entries for each gene include a number of links to other data, such as expression profiles, pathway context, mouse knockout information and papers. Gene-gene interactions are presented in an interactive graphical interface, providing rapid access to the underlying information, as well as convenient navigation through the network. Use of the resource is illustrated with aspects of the inflammatory response and hypertension. Conclusion: The combination of SNP impact analysis, a knowledge based network of gene relationships and candidate genes, and access to a wide range of data and literature allow a user to quickly assimilate available information, and so develop models of gene-pathway-disease interaction. PMID:16551372

  10. Genome-wide prediction of cancer driver genes based on SNP and cancer SNV data.

    PubMed

    He, Quanze; He, Quanyuan; Liu, Xiaohui; Wei, Youheng; Shen, Suqin; Hu, Xiaohui; Li, Qiao; Peng, Xiangwen; Wang, Lin; Yu, Long

    2014-01-01

    Identifying cancer driver genes and exploring their functions are essential and urgent needs in basic cancer research. Developing efficient methods to differentiate between driver and passenger somatic mutations revealed by large-scale cancer genome sequencing data is critical to cancer driver gene discovery. Here, we compared distinct features of SNP and SNV data in detail and found that the weighted ratio of SNV to SNP (termed WVPR) is an excellent indicator for cancer driver genes. The power of WVPR was validated by accurate predictions of known drivers. We ranked most human genes by WVPR and performed functional analyses on the list. The results demonstrate that driver genes are usually highly enriched in chromatin organization related genes/pathways. In addition, some protein complexes, such as histone acetyltransferase, histone methyltransferase, telomerase, centrosome, sin3 and U12-type spliceosomal complexes, are hot spots of driver mutations. Furthermore, this study identified many new potential driver genes (e.g. NTRK3 and ZIC4) and pathways, including the oxidative phosphorylation pathway, which were not identified by previous methods. Taken together, our study not only developed a method to identify cancer driver genes/pathways but also provided new insights into molecular mechanisms of cancer development. PMID:25057442

  11. Single Nucleotide Polymorphism (SNP)-Based Loss of Heterozygosity (LOH) Testing by Real Time PCR in Patients Suspect of Myeloproliferative Disease

    PubMed Central

    Huijsmans, Cornelis J. J.; Poodt, Jeroen; Damen, Jan; van der Linden, Johannes C.; Savelkoul, Paul H. M.; Pruijt, Johannes F. M.; Hilbink, Mirrian; Hermans, Mirjam H. A.

    2012-01-01

    During tumor development, loss of heterozygosity (LOH) often occurs. When LOH is preceded by an oncogene activating mutation, the mutant allele may be further potentiated if the wild-type allele is lost or inactivated. In myeloproliferative neoplasms (MPN), somatic acquisition of JAK2V617F may be followed by LOH resulting in loss of the wild-type allele. The occurrence of LOH in MPN and other proliferative diseases may further potentiate the mutant allele and thereby increase morbidity. A real time PCR based SNP profiling assay was developed and validated for LOH detection of the JAK2 region (JAK2LOH). Blood of a cohort of 12 JAK2V617F-positive patients (n = 6 with 25–50% and n = 6 with >50% JAK2V617F) and a cohort of 81 patients suspected of MPN was stored with EDTA and subsequently used for validation. To generate germ-line profiles, non-neoplastic formalin-fixed paraffin-embedded tissue from each patient was analyzed. Results of the SNP assay were compared to those of an established Short Tandem Repeat (STR) assay. Both assays revealed JAK2LOH in 1/6 patients with 25–50% JAK2V617F. In patients with >50% JAK2V617F, JAK2LOH was detected in 6/6 patients by the SNP assay and 5/6 patients by the STR assay. Of the 81 patients suspected of MPN, 18 patients carried JAK2V617F. Both the SNP and STR assay demonstrated the occurrence of JAK2LOH in 5 of them. In the 63 JAK2V617F-negative patients, no JAK2LOH was observed by SNP and STR analyses. The presented SNP assay reliably detects JAK2LOH and is a fast, easy-to-perform alternative to STR analysis. We therefore regard the SNP approach as a proof of principle for the development of LOH SNP assays for other clinically relevant LOH loci. PMID:22768290

  12. Rapid Diagnosis of Imprinting Disorders Involving Copy Number Variation and Uniparental Disomy Using Genome-Wide SNP Microarrays.

    PubMed

    Liu, Weiqiang; Zhang, Rui; Wei, Jun; Zhang, Huimin; Yu, Guojiu; Li, Zhihua; Chen, Min; Sun, Xiaofang

    2015-01-01

    Imprinting disorders, such as Beckwith-Wiedemann syndrome (BWS), Prader-Willi syndrome (PWS) and Angelman syndrome (AS), can be detected via methylation analysis, methylation-specific multiplex ligation-dependent probe amplification (MS-MLPA), or other methods. In this study, we applied single nucleotide polymorphism (SNP)-based chromosomal microarray analysis to detect copy number variations (CNVs) and uniparental disomy (UPD) events in patients with suspected imprinting disorders. Of 4 patients, 2 had a 5.25-Mb microdeletion in the 15q11.2q13.2 region, 1 had a 38.4-Mb mosaic UPD in the 11p15.4 region, and 1 had a 60-Mb detectable UPD between regions 14q13.2 and 14q32.13. Although the 14q32.2 region was classified as normal by SNP array for the 14q13 UPD patient, it turned out to be a heterodisomic UPD by short tandem repeat marker analysis. MS-MLPA analysis was performed to validate the variations. In conclusion, SNP-based microarray is an efficient alternative method for quickly and precisely diagnosing PWS, AS, BWS, and other imprinted gene-associated disorders when considering aberrations due to CNVs and most types of UPD. PMID:26184742

  13. Exploration of SNP variants affecting hair colour prediction in Europeans.

    PubMed

    Söchtig, Jens; Phillips, Chris; Maroñas, Olalla; Gómez-Tato, Antonio; Cruz, Raquel; Alvarez-Dios, Jose; de Cal, María-Ángeles Casares; Ruiz, Yarimar; Reich, Kristian; Fondevila, Manuel; Carracedo, Ángel; Lareu, María V

    2015-09-01

    DNA profiling is a key tool for forensic analysis; however, current methods identify a suspect either by direct comparison or from DNA database searches. In cases with unidentified suspects, prediction of visible physical traits of the DNA donors, e.g. pigmentation or hair distribution, can provide important probative information. This study aimed to explore single nucleotide polymorphism (SNP) variants for their effect on hair colour prediction. A discovery panel of 63 SNPs, consisting of already established hair colour markers from the HIrisPlex hair colour phenotyping assay as well as additional markers for which associations with human pigmentation traits were previously identified, was used to develop multiplex assays based on SNaPshot single-base extension technology. A genotyping study was performed on a range of European populations (n = 605). Hair colour phenotyping was accomplished by matching donors' hair to a graded colour category system of reference shades and photography. Since multiple SNPs in combination contribute in varying degrees to hair colour predictability in Europeans, we aimed to compile a compact marker set that could provide a reliable hair colour inference from the fewest SNPs. The predictive approach developed uses a naïve Bayes classifier to provide hair colour assignment probabilities for the SNP profiles of the key SNPs and was embedded into the Snipper online SNP classifier ( http://mathgene.usc.es/snipper/ ). Results indicate that red, blond, brown and black hair colours are predictable with informative probabilities in a high proportion of cases. Our study resulted in the identification of the 12 SNPs most strongly associated with hair pigmentation variation, located in six genes. PMID:26162598
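
    The naïve Bayes step mentioned above can be sketched as follows. This is a minimal illustration, not the Snipper implementation: the genotype coding (0, 1 or 2 copies of an allele), the Laplace smoothing constant `alpha` and all function names are assumptions made for the example, and any training profiles passed in would be the user's own data.

```python
import math
from collections import Counter

GENOTYPES = (0, 1, 2)  # assumed coding: copies of one allele at each SNP


def train(profiles, labels, alpha=1.0):
    """Count genotype frequencies per class and per SNP.

    profiles: list of equal-length tuples of genotype codes.
    labels:   list of class labels (e.g. hair colour categories).
    alpha:    Laplace smoothing constant (an assumption of this sketch).
    """
    if not profiles or len(profiles) != len(labels):
        raise ValueError("profiles and labels must be non-empty and aligned")
    n_snps = len(profiles[0])
    class_counts = Counter(labels)
    counts = {c: [Counter() for _ in range(n_snps)] for c in class_counts}
    for profile, label in zip(profiles, labels):
        for j, genotype in enumerate(profile):
            counts[label][j][genotype] += 1
    return {"class_counts": class_counts, "counts": counts,
            "alpha": alpha, "n": len(labels)}


def predict_proba(model, profile):
    """Posterior class probabilities, treating SNPs as independent given the class."""
    log_post = {}
    for c, n_c in model["class_counts"].items():
        lp = math.log(n_c / model["n"])  # class prior
        for j, genotype in enumerate(profile):
            num = model["counts"][c][j][genotype] + model["alpha"]
            den = n_c + model["alpha"] * len(GENOTYPES)
            lp += math.log(num / den)
        log_post[c] = lp
    # Normalise in a numerically stable way.
    top = max(log_post.values())
    unnorm = {c: math.exp(lp - top) for c, lp in log_post.items()}
    total = sum(unnorm.values())
    return {c: v / total for c, v in unnorm.items()}
```

    Working in log space avoids underflow when many SNPs are multiplied together, and the smoothing term keeps a genotype that was never seen in one class from forcing that class's probability to zero.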

  14. Computational tradeoffs in multiplex PCR assay design for SNP genotyping

    PubMed Central

    Rachlin, John; Ding, Chunming; Cantor, Charles; Kasif, Simon

    2005-01-01

    Background Multiplex PCR is a key technology for detecting infectious microorganisms, whole-genome sequencing, forensic analysis, and for enabling flexible yet low-cost genotyping. However, the design of multiplex PCR assays requires the consideration of multiple competing objectives and physical constraints, and extensive computational analysis must be performed in order to identify the possible formation of primer-dimers that can negatively impact product yield. Results This paper examines the computational design limits of multiplex PCR in the context of SNP genotyping and examines tradeoffs associated with several key design factors including multiplexing level (the number of primer pairs per tube), coverage (the % of SNPs whose associated primers are actually assigned to one of several available tubes), and tube-size uniformity. We also examine how design performance depends on the total number of available SNPs from which to choose, and on primer stringency criteria. We show that finding high-multiplexing/high-coverage designs is subject to a computational phase transition, becoming dramatically more difficult when the probability of primer pair interaction exceeds a critical threshold. The precise location of this critical transition point depends on the number of available SNPs and the level of multiplexing required. We also demonstrate how coverage performance is impacted by the number of available SNPs, primer selection criteria, and target multiplexing levels. Conclusion The presence of a phase transition suggests limits to scaling multiplex PCR performance for high-throughput genomics applications. Achieving broad SNP coverage rapidly transitions from being very easy to very hard as the target multiplexing level (# of primer pairs per tube) increases. The onset of a phase transition can be "delayed" by having a larger pool of SNPs, or loosening primer selection constraints so as to increase the number of candidate primer pairs per SNP, though the latter
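
    To make the terms multiplexing level and coverage concrete, the sketch below assigns SNPs to tubes with a simple first-fit greedy rule and reports the resulting coverage. It is only an illustration under stated assumptions: the abstract does not specify the authors' assignment algorithm, each SNP is assumed to have exactly one candidate primer pair, and primer interactions are assumed to be supplied as a precomputed set of incompatible SNP pairs. All names are hypothetical.

```python
def assign_to_tubes(snps, conflicts, n_tubes, max_per_tube):
    """First-fit greedy assignment of SNPs to multiplex PCR tubes.

    snps:         list of SNP identifiers, processed in the given order.
    conflicts:    set of frozenset({a, b}) pairs whose primers interact
                  (e.g. predicted primer-dimers) and must not share a tube.
    n_tubes:      number of available tubes.
    max_per_tube: multiplexing level (primer pairs allowed per tube).

    Returns (tubes, unassigned, coverage), where coverage is the fraction
    of SNPs that were placed in some tube.
    """
    tubes = [[] for _ in range(n_tubes)]
    unassigned = []
    for snp in snps:
        for tube in tubes:
            has_room = len(tube) < max_per_tube
            compatible = all(frozenset((snp, other)) not in conflicts
                             for other in tube)
            if has_room and compatible:
                tube.append(snp)
                break
        else:
            # No tube could take this SNP without a conflict or overflow.
            unassigned.append(snp)
    coverage = (len(snps) - len(unassigned)) / len(snps) if snps else 0.0
    return tubes, unassigned, coverage
```

    A greedy rule like this is cheap but can fall well short of the best possible coverage once conflicts become dense, which is consistent with the difficulty the abstract describes at high multiplexing levels.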

  15. Highly effective SNP-based association mapping and management of recessive defects in livestock.

    PubMed

    Charlier, Carole; Coppieters, Wouter; Rollin, Frédéric; Desmecht, Daniel; Agerholm, Jorgen S; Cambisano, Nadine; Carta, Eloisa; Dardano, Sabrina; Dive, Marc; Fasquelle, Corinne; Frennet, Jean-Claude; Hanset, Roger; Hubin, Xavier; Jorgensen, Claus; Karim, Latifa; Kent, Matthew; Harvey, Kirsten; Pearce, Brian R; Simon, Patricia; Tama, Nico; Nie, Haisheng; Vandeputte, Sébastien; Lien, Sigbjorn; Longeri, Maria; Fredholm, Merete; Harvey, Robert J; Georges, Michel

    2008-04-01

    The widespread use of elite sires by means of artificial insemination in livestock breeding leads to the frequent emergence of recessive genetic defects, which cause significant economic and animal welfare concerns. Here we show that the availability of genome-wide, high-density SNP panels, combined with the typical structure of livestock populations, markedly accelerates the positional identification of genes and mutations that cause inherited defects. We report the fine-scale mapping of five recessive disorders in cattle and the molecular basis for three of these: congenital muscular dystony (CMD) types 1 and 2 in Belgian Blue cattle and ichthyosis fetalis in Italian Chianina cattle. Identification of these causative mutations has an immediate translation into breeding practice, allowing marker assisted selection against the defects through avoidance of at-risk matings. PMID:18344998

  16. Obesity-related known and candidate SNP markers can significantly change affinity of TATA-binding protein for human gene promoters

    PubMed Central

    2015-01-01

    Background Obesity affects quality of life and life expectancy and is associated with cardiovascular disorders, cancer, diabetes, reproductive disorders in women, prostate diseases in men, and congenital anomalies in children. The use of single nucleotide polymorphism (SNP) markers of diseases and drug responses (i.e., significant differences of personal genomes of patients from the reference human genome) can help physicians to improve treatment. Clinical research can validate SNP markers via genotyping of patients and demonstration that SNP alleles are significantly more frequent in patients than in healthy people. The search for biomedical SNP markers of interest can be accelerated by computer-based analysis of hundreds of millions of SNPs in the 1000 Genomes project because of selection of the most meaningful candidate SNP markers and elimination of neutral SNPs. Results We cross-validated the output of two computer-based methods: DNA sequence analysis using Web service SNP_TATA_Comparator and keyword search for articles on comorbidities of obesity. Near the sites binding to TATA-binding protein (TBP) in human gene promoters, we found 22 obesity-related candidate SNP markers, including rs10895068 (male breast cancer in obesity); rs35036378 (reduced risk of obesity after ovariectomy); rs201739205 (reduced risk of obesity-related cancers due to weight loss by diet/exercise in obese postmenopausal women); rs183433761 (obesity resistance during a high-fat diet); rs367732974 and rs549591993 (both: cardiovascular complications in obese patients with type 2 diabetes mellitus); rs200487063 and rs34104384 (both: obesity-caused hypertension); rs35518301, rs72661131, and rs562962093 (all: obesity); and rs397509430, rs33980857, rs34598529, rs33931746, rs33981098, rs34500389, rs63750953, rs281864525, rs35518301, and rs34166473 (all: chronic inflammation in comorbidities of obesity). Using an electrophoretic mobility shift assay under nonequilibrium conditions, we

  17. PCR amplification of SNP loci from crude DNA for large-scale genotyping of oomycetes.

    PubMed

    Hu, Jian; Lyon, Rebecca; Zhou, Yuxin; Lamour, Kurt

    2014-01-01

    As in other eukaryotes, single nucleotide polymorphism (SNP) markers are abundant in many oomycete plant pathogen genomes. High resolution DNA melting analysis (HR-DMA) is a cost-effective method for SNP genotyping but, like many SNP marker technologies, is limited by the amount and quality of template DNA. We describe PCR preamplification of Phytophthora and Peronospora SNP loci from crude DNA extracted from a small amount of mycelium and/or infected plant tissue to produce sufficient template to genotype at least 10 000 SNPs. The approach is fast, inexpensive, requires minimal biological material and should be useful for many organisms in a variety of contexts. PMID:24871597

  18. Accuracy of direct genomic values in Holstein bulls and cows using subsets of SNP markers

    PubMed Central

    2010-01-01

    Background At the current price, the use of high-density single nucleotide polymorphism (SNP) genotyping assays in genomic selection of dairy cattle is limited to applications involving elite sires and dams. The objective of this study was to evaluate the use of low-density assays to predict direct genomic value (DGV) on five milk production traits, an overall conformation trait, a survival index, and two profit index traits (APR, ASI). Methods Dense SNP genotypes were available for 42,576 SNP for 2,114 Holstein bulls and 510 cows. A subset of 1,847 bulls born between 1955 and 2004 was used as a training set to fit models with various sets of pre-selected SNP. A group of 297 bulls born between 2001 and 2004 and all cows born between 1992 and 2004 were used to evaluate the accuracy of DGV prediction. Ridge regression (RR) and partial least squares regression (PLSR) were used to derive prediction equations and to rank SNP based on the absolute value of the regression coefficients. Four alternative strategies were applied to select subsets of SNP, namely: subsets of the highest ranked SNP for each individual trait, or a single subset of evenly spaced SNP, where SNP were selected based on their rank for ASI, APR or minor allele frequency within intervals of approximately equal length. Results RR and PLSR performed very similarly in predicting DGV, with PLSR performing better for low-density assays and RR for higher-density SNP sets. When using all SNP, DGV predictions for production traits, which have a higher heritability, were more accurate (0.52-0.64) than for survival (0.19-0.20), which has a low heritability. The gain in accuracy using subsets that included the highest ranked SNP for each trait was marginal (5-6%) over a common set of evenly spaced SNP when at least 3,000 SNP were used. Subsets containing 3,000 SNP provided more than 90% of the accuracy that could be achieved with a high-density assay for cows, and 80% of the high-density assay for young bulls
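
    The evenly spaced selection strategy described in the Methods can be sketched as picking the best-scoring SNP within each interval of equal length. This is an illustrative reading of the abstract, not the authors' code: the `(name, position, score)` tuple layout is an assumption, and the score stands in for whichever ranking criterion is used (for example, the absolute regression coefficient for a trait, or minor allele frequency).

```python
def evenly_spaced_subset(snps, n_intervals):
    """Pick the highest-scoring SNP in each equal-length position interval.

    snps:        list of (name, position, score) tuples on one chromosome.
    n_intervals: number of equal-length intervals spanning the positions.

    Returns the selected SNP names ordered by interval. Intervals that
    contain no SNP contribute nothing, so the result may be shorter than
    n_intervals.
    """
    if n_intervals <= 0:
        raise ValueError("n_intervals must be positive")
    if not snps:
        return []
    lo = min(pos for _, pos, _ in snps)
    hi = max(pos for _, pos, _ in snps)
    # Guard against zero width when all SNPs share one position.
    width = (hi - lo) / n_intervals or 1.0
    best = {}
    for name, pos, score in snps:
        # The SNP at the maximum position falls into the last interval.
        idx = min(int((pos - lo) / width), n_intervals - 1)
        if idx not in best or score > best[idx][2]:
            best[idx] = (name, pos, score)
    return [best[i][0] for i in sorted(best)]
```

    Selecting one SNP per interval keeps the subset spread along the genome, whereas simply taking the top-ranked SNP overall could cluster the markers in a few regions.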

  19. Lack of association between MDM2 promoter SNP309 and clinical outcome in patients with neuroblastoma.

    PubMed

    Rihani, Ali; Van Maerken, Tom; De Wilde, Bram; Zeka, Fjoralba; Laureys, Geneviève; Norga, Koen; Tonini, Gian Paolo; Coco, Simona; Versteeg, Rogier; Noguera, Rosa; Schulte, Johannes H; Eggert, Angelika; Stallings, Raymond L; Speleman, Frank; Vandesompele, Jo

    2014-10-01

    While a polymorphism located within the promoter region of the MDM2 proto-oncogene, SNP309 (T > G), has previously been associated with increased risk and aggressiveness of neuroblastoma and other tumor entities, a protective effect has also been reported in certain other cancers. In this study, we evaluated the association of MDM2 SNP309 with outcome in 496 patients with neuroblastoma and its effect on MDM2 expression. No significant difference in overall or event-free survival was observed among patients with neuroblastoma with or without MDM2 SNP309. The presence of SNP309 does not affect MDM2 expression in neuroblastoma. PMID:24391119

  20. Mutations in the isocitrate dehydrogenase 2 gene and IDH1 SNP 105C > T have a prognostic value in acute myeloid leukemia

    PubMed Central

    2014-01-01

    Background The isocitrate dehydrogenase (IDH1/IDH2) genes encode metabolic enzymes and are frequently mutated in acute myeloid leukemia (AML). The enzymes acquire neomorphic enzymatic activity when they are mutated. Methods We have investigated the frequency and outcome of the acquired IDH1/IDH2 mutations and the IDH1 SNP 105C > T (rs11554137) in 189 unselected de novo AML patients by polymerase chain reaction amplification followed by direct sequencing. Survival is presented in Kaplan-Meier curves with log rank test. Multivariable survival analysis was conducted using the Cox regression method, taking age, risk group, treatment, IDH1/2 mutations and IDH1 SNP105 genotype into account. Results Overall, IDH1/2 mutations were found in 41/189 (21.7%) of the AML patients. IDH1 codon 132 mutations were present in 7.9%, whereas IDH2 mutations were more frequent, with mutations identified in codons 140 and 172 at frequencies of 11.1% and 2.6%, respectively. The SNP 105C > T was present in 10.5% of the patients, similar to the normal population. A significantly reduced overall survival (OS) for patients carrying an IDH2 codon 140 mutation compared with patients carrying the wild-type IDH2 gene (p < 0.001) was observed in the intermediate risk patient group. IDH1 mutations had no significant effect on OS compared with wild-type IDH1, either in the entire patient group or within the different risk groups. A significant difference in OS between the heterozygous SNP variant and the homozygous wild-type was observed in the intermediate risk FLT3 negative AML patients (p = 0.004). Conclusions Our results indicate that AML patients with IDH2 mutations or the IDH1 SNP 105C > T variant can represent a new subgroup for risk stratification and may indicate new treatment options. PMID:25324972

  1. SNP Markers and Their Impact on Plant Breeding

    PubMed Central

    Mammadov, Jafar; Aggarwal, Rajat; Buyyarapu, Ramesh; Kumpatla, Siva

    2012-01-01

    The use of molecular markers has revolutionized the pace and precision of plant genetic analysis, which in turn has facilitated the implementation of molecular breeding of crops. The last three decades have seen tremendous advances in the evolution of marker systems and the respective detection platforms. Markers based on single nucleotide polymorphisms (SNPs) have rapidly gained the center stage of molecular genetics in recent years due to their abundance in genomes and their amenability to high-throughput detection formats and platforms. Computational approaches dominate SNP discovery methods due to the ever-increasing sequence information in public databases; however, complex genomes pose special challenges in the identification of informative SNPs, warranting alternative strategies in those crops. Many genotyping platforms and chemistries have become available, making the use of SNPs even more attractive and efficient. This paper provides a review of historical and current efforts in the development, validation, and application of SNP markers in QTL/gene discovery and plant breeding by discussing key experimental strategies and cases exemplifying their impact. PMID:23316221

  2. Eigenanalysis of SNP data with an identity by descent interpretation.

    PubMed

    Zheng, Xiuwen; Weir, Bruce S

    2016-02-01

    Principal component analysis (PCA) is widely used in genome-wide association studies (GWAS), and the principal component axes often represent perpendicular gradients in geographic space. The explanation of PCA results is of major interest for geneticists to understand fundamental demographic parameters. Here, we provide an interpretation of PCA based on relatedness measures, which are described by the probability that sets of genes are identical-by-descent (IBD). An approximately linear transformation between ancestral proportions (AP) of individuals with multiple ancestries and their projections onto the principal components is found. In addition, a new method of eigenanalysis "EIGMIX" is proposed to estimate individual ancestries. EIGMIX is a method of moments with computational efficiency suitable for data sets with millions of SNPs, and it is not subject to the assumption of linkage equilibrium. With the assumptions of multiple ancestries and their surrogate ancestral samples, EIGMIX is able to infer ancestral proportions (APs) of individuals. The methods were applied to the SNP data from the HapMap Phase 3 project and the Human Genome Diversity Panel. The APs of individuals inferred by EIGMIX are consistent with the findings of the program ADMIXTURE. In conclusion, EIGMIX can be used to detect population structure and estimate genome-wide ancestral proportions with a relatively high accuracy. PMID:26482676

  3. New Insights into the Geographic Distribution of Mycobacterium leprae SNP Genotypes Determined for Isolates from Leprosy Cases Diagnosed in Metropolitan France and French Territories

    PubMed Central

    Reibel, Florence; Chauffour, Aurélie; Brossier, Florence; Jarlier, Vincent; Cambau, Emmanuelle; Aubry, Alexandra

    2015-01-01

    Background Between 20 and 30 bacteriologically confirmed cases of leprosy are diagnosed each year at the French National Reference Center for mycobacteria. Patients are mainly immigrants from various endemic countries or residents of French overseas territories. We aimed to expand the data regarding the geographical distribution of the SNP genotypes of the M. leprae isolates from these patients. Methodology/Principal findings Skin biopsies were obtained from 71 leprosy patients diagnosed between January 2009 and December 2013. Data regarding age, sex and place of birth and residence were also collected. Diagnosis of leprosy was confirmed by microscopic detection of acid-fast bacilli and/or amplification by PCR of the M. leprae-specific RLEP region. Single nucleotide polymorphisms (SNP), present in the M. leprae genome at positions 14 676, 1 642 875 and 2 935 685, were determined with an efficiency of 94% (67/71). Almost all patients were from countries other than France where leprosy is still prevalent (n = 31) or from French overseas territories (n = 36) where leprosy is not totally eradicated, while only a minority (n = 4) were born in metropolitan France but had lived in other countries. SNP type 1 was predominant (n = 33), followed by type 3 (n = 17), type 4 (n = 11) and type 2 (n = 6). SNP types were concordant with those previously reported as prevalent in the patients’ countries of birth. SNP types found in patients born in countries other than France (Comoros, Haiti, Benin, Congo, Sri Lanka) and French overseas territories (French Polynesia, Mayotte and La Réunion) not covered by previous work correlated well with geographical location and history of human settlements. Conclusions/Significance The phylogenetic analysis of M. leprae strains isolated in France strongly suggests that French leprosy cases are caused by SNP types that are (a) concordant with the geographic origin or residence of the patients (non-French countries, French overseas territories

  4. A HapMap leads to a Capsicum annuum SNP infinium array: a new tool for pepper breeding

    PubMed Central

    Hulse-Kemp, Amanda M; Ashrafi, Hamid; Plieske, Joerg; Lemm, Jana; Stoffel, Kevin; Hill, Theresa; Luerssen, Hartmut; Pethiyagoda, Charit L; Lawley, Cindy T; Ganal, Martin W; Van Deynze, Allen

    2016-01-01

    The Capsicum genus (Pepper) is a part of the Solanaceae family. It has been important in many cultures worldwide for its key nutritional components and uses as spices, medicines, ornamentals and vegetables. Worldwide population growth is associated with demand for more nutritionally valuable vegetables while contending with decreasing resources and available land. These conditions require increased efficiency in pepper breeding to deal with these imminent challenges. Through resequencing of inbred lines we have completed a valuable haplotype map (HapMap) for the pepper genome based on single-nucleotide polymorphisms (SNP). The identified SNPs were annotated and classified based on their gene annotation in the pepper draft genome sequence and phenotype of the sequenced inbred lines. A selection of one marker per gene model was utilized to create the PepperSNP16K array, which simultaneously genotyped 16 405 SNPs, of which 90.7% were found to be informative. A set of 84 inbred and hybrid lines and a mapping population of 90 interspecific F2 individuals were utilized to validate the array. Diversity analysis of the inbred lines shows a distinct separation of bell versus chile/hot pepper types and separates them into five distinct germplasm groups. The interspecific population created between Tabasco (C. frutescens chile type) and P4 (C. annuum blocky type) produced a linkage map with 5546 markers separated into 1361 bins on 12 linkage groups representing 1392.3 cM. This publicly available genotyping platform can be used to rapidly assess a large number of markers in a reproducible high-throughput manner for pepper. As a standardized tool for genetic analyses, the PepperSNP16K can be used worldwide to share findings and analyze QTLs for important traits leading to continued improvement of pepper for consumers. Data and information on the array are available through the Solanaceae Genomics Network. PMID:27602231

  5. A HapMap leads to a Capsicum annuum SNP infinium array: a new tool for pepper breeding.

    PubMed

    Hulse-Kemp, Amanda M; Ashrafi, Hamid; Plieske, Joerg; Lemm, Jana; Stoffel, Kevin; Hill, Theresa; Luerssen, Hartmut; Pethiyagoda, Charit L; Lawley, Cindy T; Ganal, Martin W; Van Deynze, Allen

    2016-01-01

    The Capsicum genus (Pepper) is a part of the Solanaceae family. It has been important in many cultures worldwide for its key nutritional components and uses as spices, medicines, ornamentals and vegetables. Worldwide population growth is associated with demand for more nutritionally valuable vegetables while contending with decreasing resources and available land. These conditions require increased efficiency in pepper breeding to deal with these imminent challenges. Through resequencing of inbred lines we have completed a valuable haplotype map (HapMap) for the pepper genome based on single-nucleotide polymorphisms (SNP). The identified SNPs were annotated and classified based on their gene annotation in the pepper draft genome sequence and phenotype of the sequenced inbred lines. A selection of one marker per gene model was utilized to create the PepperSNP16K array, which simultaneously genotyped 16 405 SNPs, of which 90.7% were found to be informative. A set of 84 inbred and hybrid lines and a mapping population of 90 interspecific F2 individuals were utilized to validate the array. Diversity analysis of the inbred lines shows a distinct separation of bell versus chile/hot pepper types and separates them into five distinct germplasm groups. The interspecific population created between Tabasco (C. frutescens chile type) and P4 (C. annuum blocky type) produced a linkage map with 5546 markers separated into 1361 bins on 12 linkage groups representing 1392.3 cM. This publicly available genotyping platform can be used to rapidly assess a large number of markers in a reproducible high-throughput manner for pepper. As a standardized tool for genetic analyses, the PepperSNP16K can be used worldwide to share findings and analyze QTLs for important traits leading to continued improvement of pepper for consumers. Data and information on the array are available through the Solanaceae Genomics Network. PMID:27602231

  6. FunctSNP: an R package to link SNPs to functional knowledge and dbAutoMaker: a suite of Perl scripts to build SNP databases

    PubMed Central

    2010-01-01

    Background Whole genome association studies using highly dense single nucleotide polymorphisms (SNPs) are a set of methods to identify DNA markers associated with variation in a particular complex trait of interest. One of the main outcomes from these studies is a subset of statistically significant SNPs. Finding the potential biological functions of such SNPs can be an important step towards further use in human and agricultural populations (e.g., for identifying genes related to susceptibility to complex diseases or genes playing key roles in development or performance). The current challenge is that the information holding the clues to SNP functions is distributed across many different databases. Efficient bioinformatics tools are therefore needed to seamlessly integrate up-to-date functional information on SNPs. Many web services have arisen to meet the challenge but most work only within the framework of human medical research. Although we acknowledge the importance of human research, we see a need for SNP annotation tools for other organisms. Description We introduce an R package called FunctSNP, which is the user interface to custom built species-specific databases. The local relational databases contain SNP data together with functional annotations extracted from online resources. FunctSNP provides a unified bioinformatics resource to link SNPs with functional knowledge (e.g., genes, pathways, ontologies). We also introduce dbAutoMaker, a suite of Perl scripts, which can be scheduled to run periodically to automatically create/update the customised SNP databases. We illustrate the use of FunctSNP with a livestock example, but the approach and software tools presented here can also be applied to human and other organisms. Conclusions Finding the potential functional significance of SNPs is important when further using the outcomes from whole genome association studies. FunctSNP is unique in that it is the only R package that links SNPs to

  7. Networks of intergenic long-range enhancers and snpRNAs drive castration-resistant phenotype of prostate cancer and contribute to pathogenesis of multiple common human disorders

    PubMed Central

    Glinskii, Anna B; Ma, Shuang; Ma, Jun; Grant, Denise; Lim, Chang-Uk; Guest, Ian; Sell, Stewart; Buttyan, Ralph

    2011-01-01

    The mechanistic relevance of intergenic disease-associated genetic loci (IDAGL) containing highly statistically significant disease-linked SNPs remains unknown. Here, we present experimental and clinical evidence supporting the important role of IDAGL in human diseases. A targeted RT-PCR screen coupled with sequencing of purified PCR products detects widespread transcription at multiple IDAGL and identifies 96 small noncoding trans-regulatory RNAs of ∼100–300 nt in length containing SNPs (snpRNAs) associated with 21 common disorders. Multiple independent lines of experimental evidence support functionality of snpRNAs by documenting their cell type-specific expression and evolutionary conservation of sequences, genomic coordinates and biological effects. Chromatin state signatures, expression profiling experiments and luciferase reporter assays demonstrate that many IDAGL are Polycomb-regulated long-range enhancers. Expression of snpRNAs in human and mouse cells markedly affects cellular behavior and induces allele-specific clinically relevant phenotypic changes: NLRP1-locus snpRNAs rs2670660 exert regulatory effects on monocyte/macrophage transdifferentiation, induce prostate cancer (PC) susceptibility snpRNAs and transform low-malignancy hormone-dependent human PC cells into highly malignant androgen-independent PC. Q-PCR analysis and luciferase reporter assays demonstrate that snpRNA sequences represent allele-specific “decoy” targets of microRNAs that function as SNP allele-specific modifiers of microRNA expression and activity. We demonstrate that trans-acting RNA molecules facilitating resistance to androgen depletion (RAD) in vitro and castration-resistant phenotype (CRP) in vivo of PC contain intergenic 8q24-locus SNP variants (rs1447295; rs16901979; rs6983267) that were recently linked with increased risk of PC. Q-PCR analysis of clinical samples reveals markedly increased and highly concordant (r = 0.896; p < 0.0001) snpRNA expression

  8. SNP Discovery for mapping alien introgressions in wheat

    PubMed Central

    2014-01-01

    Background Monitoring alien introgressions in crop plants is difficult due to the lack of genetic and molecular mapping information on the wild crop relatives. The tertiary gene pool of wheat is a very important source of genetic variability for wheat improvement against biotic and abiotic stresses. By exploring the 5Mg short arm (5MgS) of Aegilops geniculata, we can apply chromosome genomics for the discovery of SNP markers and their use for monitoring alien introgressions in wheat (Triticum aestivum L). Results The short arm of chromosome 5Mg of Ae. geniculata Roth (syn. Ae. ovata L.; 2n = 4x = 28, UgUgMgMg) was flow-sorted from a wheat line in which it is maintained as a telocentric chromosome. DNA of the sorted arm was amplified and sequenced using an Illumina Hiseq 2000 with ~45x coverage. The sequence data was used for SNP discovery against wheat homoeologous group-5 assemblies. A total of 2,178 unique, 5MgS-specific SNPs were discovered. Randomly selected samples of 59 5MgS-specific SNPs were tested (44 by KASPar assay and 15 by Sanger sequencing) and 84% were validated. Of the selected SNPs, 97% mapped to a chromosome 5Mg addition to wheat (the source of t5MgS), and 94% to 5Mg introgressed from a different accession of Ae. geniculata substituting for chromosome 5D of wheat. The validated SNPs also identified chromosome segments of 5MgS origin in a set of T5D-5Mg translocation lines; eight SNPs (25%) mapped to TA5601 [T5DL · 5DS-5MgS(0.75)] and three (8%) to TA5602 [T5DL · 5DS-5MgS (0.95)]. SNPs (gsnp_5ms83 and gsnp_5ms94), tagging chromosome T5DL · 5DS-5MgS(0.95) with the smallest introgression carrying resistance to leaf rust (Lr57) and stripe rust (Yr40), were validated in two released germplasm lines with Lr57 and Yr40 genes. Conclusion This approach should be widely applicable for the identification of species/genome-specific SNPs. The development of a large number of SNP markers will facilitate the precise introgression and

  9. Identification of SNP barcode biomarkers for genes associated with facial emotion perception using particle swarm optimization algorithm

    PubMed Central

    2014-01-01

    Background Facial emotion perception (FEP) can affect social function. We previously reported that parts of five tested single-nucleotide polymorphisms (SNPs) in the MET and AKT1 genes may individually affect FEP performance. However, the effects of SNP-SNP interactions on FEP performance remain unclear. Methods This study compared patients with high and low FEP performances (n = 89 and 93, respectively). A particle swarm optimization (PSO) algorithm was used to identify the best SNP barcodes (i.e., the SNP combinations and genotypes that revealed the largest differences between the high and low FEP groups). Results The analyses of individual SNPs showed no significant differences between the high and low FEP groups. However, comparisons of multiple SNP-SNP interactions involving different combinations of two to five SNPs showed that the best PSO-generated SNP barcodes were significantly associated with high FEP score. The analyses of the joint effects of the best SNP barcodes for two to five interacting SNPs also showed that the best SNP barcodes had significantly higher odds ratios (2.119 to 3.138; P < 0.05) compared to other SNP barcodes. In conclusion, the proposed PSO algorithm effectively identifies the best SNP barcodes that have the strongest associations with FEP performance. Conclusions This study also proposes a computational methodology for analyzing complex SNP-SNP interactions in social cognition domains such as recognition of facial emotion. PMID:24955105

  10. A new SNP panel for evaluating genetic diversity in a composite cattle breed

    Technology Transfer Automated Retrieval System (TEKTRAN)

    A custom 60K SNP panel, extracted from the Bovine HD SNP chip, was used to evaluate genotypic frequency changes in Braford (BF, a composite breed) when compared to progenitor breeds: Hereford (HF), Brahman (BR), and Nelore (NE). Samples from both the U. S. and Brazil were used. The new panel differentiat...

  11. A Coordinated Approach to Peach SNP Discovery in RosBREED

    Technology Transfer Automated Retrieval System (TEKTRAN)

    In the USDA-funded multi-institutional and trans-disciplinary project, “RosBREED”, crop-specific SNP genome scan platforms are being developed for peach, apple, strawberry, and cherry at a resolution of at least one polymorphic SNP marker every 5 cM in any random cross, for use in Pedigree-Based Ana...

  12. A genome-wide SNP panel for genetic diversity, mapping and breeding studies in rice

    Technology Transfer Automated Retrieval System (TEKTRAN)

    A genome-wide SNP resource was developed for rice using the GoldenGate assay and used to genotype 400 landrace accessions of O. sativa. SNPs were originally discovered using Perlegen re-sequencing technology in 20 diverse landraces of O. sativa as part of OryzaSNP project (http://irfgc.irri.org). An...

  13. Characterization of the Cattle HapMap Population using the Illumina Bovine-50K SNP Chip

    Technology Transfer Automated Retrieval System (TEKTRAN)

    A Bovine 50K Illumina™ iSelect SNP chip (51,386 polymorphic SNP markers) was designed using a combination of publicly available SNPs along with highly informative novel SNPs discovered using a reduced representation and next-generation sequencing technology strategy. A total of 576 animals (426 mal...

  14. Strategies to build high-density linkage maps of the porcine 60k SNP chip

    Technology Transfer Automated Retrieval System (TEKTRAN)

    We present here two different strategies to compute high-density linkage maps based on the porcine 60k SNP chip that was genotyped on 4 different pedigrees with a total of 5600 animals. The first strategy uses the draft sequence as a reference order, the SNP being first mapped to it. The second stra...

  15. Development and Applications of a Bovine 50,000 SNP Chip

    Technology Transfer Automated Retrieval System (TEKTRAN)

    To develop an Illumina iSelect high density single nucleotide polymorphism (SNP) assay for cattle, the collaborative iBMC (Illumina, USDA ARS Beltsville, University of Missouri, USDA ARS Clay Center) Consortium first performed a de novo SNP discovery project in which genomic reduced representation l...

  16. The development and characterization of a 60K SNP chip for chicken

    Technology Transfer Automated Retrieval System (TEKTRAN)

    In livestock species like the chicken, high throughput SNP genotyping assays are increasingly being used for whole genome association studies and as a tool in breeding (referred to as genomic selection). We describe the design of a moderate density (60K) Illumina SNP BeadChip in chicken consisting o...

  17. SNP discovery and allele frequency estimation by deep sequencing of reduced representation libraries

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Genome projects routinely produce draft sequences for species from diverse evolutionary clades, but generally do not create single nucleotide polymorphism (SNP) resources. We present an approach for de novo SNP discovery based on short-read sequencing of reduced representation libraries (RRL) to ge...

  18. SNP-VISTA: An Interactive SNPs Visualization Tool

    SciTech Connect

    Shah, Nameeta; Teplitsky, Michael V.; Pennacchio, Len A.; Hugenholtz, Philip; Hamann, Bernd; Dubchak, Inna L.

    2005-07-05

    Recent advances in sequencing technologies promise better diagnostics for many diseases as well as better understanding of evolution of microbial populations. Single Nucleotide Polymorphisms (SNPs) are established genetic markers that aid in the identification of loci affecting quantitative traits and/or disease in a wide variety of eukaryotic species. With today's technological capabilities, it is possible to re-sequence a large set of appropriate candidate genes in individuals with a given disease and then screen for causative mutations. In addition, SNPs have been used extensively in efforts to study the evolution of microbial populations, and the recent application of random shotgun sequencing to environmental samples makes possible more extensive SNP analysis of co-occurring and co-evolving microbial populations. The program is available at http://genome.lbl.gov/vista/snpvista.

  19. SNP and haplotype mapping for genetic analysis in the rat.

    PubMed

    Saar, Kathrin; Beck, Alfred; Bihoreau, Marie-Thérèse; Birney, Ewan; Brocklebank, Denise; Chen, Yuan; Cuppen, Edwin; Demonchy, Stephanie; Dopazo, Joaquin; Flicek, Paul; Foglio, Mario; Fujiyama, Asao; Gut, Ivo G; Gauguier, Dominique; Guigo, Roderic; Guryev, Victor; Heinig, Matthias; Hummel, Oliver; Jahn, Niels; Klages, Sven; Kren, Vladimir; Kube, Michael; Kuhl, Heiner; Kuramoto, Takashi; Kuroki, Yoko; Lechner, Doris; Lee, Young-Ae; Lopez-Bigas, Nuria; Lathrop, G Mark; Mashimo, Tomoji; Medina, Ignacio; Mott, Richard; Patone, Giannino; Perrier-Cornet, Jeanne-Antide; Platzer, Matthias; Pravenec, Michal; Reinhardt, Richard; Sakaki, Yoshiyuki; Schilhabel, Markus; Schulz, Herbert; Serikawa, Tadao; Shikhagaie, Medya; Tatsumoto, Shouji; Taudien, Stefan; Toyoda, Atsushi; Voigt, Birger; Zelenika, Diana; Zimdahl, Heike; Hubner, Norbert

    2008-05-01

    The laboratory rat is one of the most extensively studied model organisms. Inbred laboratory rat strains originated from limited Rattus norvegicus founder populations, and the inherited genetic variation provides an excellent resource for the correlation of genotype to phenotype. Here, we report a survey of genetic variation based on almost 3 million newly identified SNPs. We obtained accurate and complete genotypes for a subset of 20,238 SNPs across 167 distinct inbred rat strains, two rat recombinant inbred panels and an F2 intercross. Using 81% of these SNPs, we constructed high-density genetic maps, creating a large dataset of fully characterized SNPs for disease gene mapping. Our data characterize the population structure and illustrate the degree of linkage disequilibrium. We provide a detailed SNP map and demonstrate its utility for mapping of quantitative trait loci. This community resource is openly available and augments the genetic tools for this workhorse of physiological studies. PMID:18443594

  20. TcSNP: a database of genetic variation in Trypanosoma cruzi

    PubMed Central

    Ackermann, Alejandro A.; Carmona, Santiago J.; Agüero, Fernán

    2009-01-01

    The TcSNP database (http://snps.tcruzi.org) integrates information on genetic variation (polymorphisms and mutations) for different stocks, strains and isolates of Trypanosoma cruzi, the causative agent of Chagas disease. The database incorporates sequences (genes from the T. cruzi reference genome, mRNAs, ESTs and genomic sequences); multiple sequence alignments obtained from these sequences; and single-nucleotide polymorphisms and small indels identified by scanning these multiple sequence alignments. Information in TcSNP can be readily interrogated to arrive at gene sets, or SNP sets of interest based on a number of attributes. Sequence similarity searches using BLAST are also supported. This first release of TcSNP contains nearly 170 000 high-confidence candidate SNPs, derived from the analysis of annotated coding sequences. As new sequence data become available, TcSNP will incorporate these data, mapping new candidate SNPs onto the reference genome sequences. PMID:18974180

  1. Genotyping NAT2 with only two SNPs (rs1041983 and rs1801280) outperforms the tagging SNP rs1495741 and is equivalent to the conventional 7-SNP NAT2 genotype.

    PubMed

    Selinski, Silvia; Blaszkewicz, Meinolf; Lehmann, Marie-Louise; Ovsiannikov, Daniel; Moormann, Oliver; Guballa, Christoph; Kress, Alexander; Truss, Michael C; Gerullis, Holger; Otto, Thomas; Barski, Dimitri; Niegisch, Günter; Albers, Peter; Frees, Sebastian; Brenner, Walburgis; Thüroff, Joachim W; Angeli-Greaves, Miriam; Seidel, Thilo; Roth, Gerhard; Dietrich, Holger; Ebbinghaus, Rainer; Prager, Hans M; Bolt, Hermann M; Falkenstein, Michael; Zimmermann, Anna; Klein, Torsten; Reckwitz, Thomas; Roemer, Hermann C; Löhlein, Dietrich; Weistenhöfer, Wobbeke; Schöps, Wolfgang; Hassan Rizvi, Syed Adibul; Aslam, Muhammad; Bánfi, Gergely; Romics, Imre; Steffens, Michael; Ekici, Arif B; Winterpacht, Andreas; Ickstadt, Katja; Schwender, Holger; Hengstler, Jan G; Golka, Klaus

    2011-10-01

    Genotyping N-acetyltransferase 2 (NAT2) is of high relevance for individualized dosing of antituberculosis drugs and bladder cancer epidemiology. In this study we compared a recently published tagging single nucleotide polymorphism (SNP) (rs1495741) to the conventional 7-SNP genotype (G191A, C282T, T341C, C481T, G590A, A803G and G857A haplotype pairs) and systematically analysed if novel SNP combinations outperform the latter. For this purpose, we studied 3177 individuals by PCR and phenotyped 344 individuals by the caffeine test. Although the tagSNP and the 7-SNP genotype showed a high degree of correlation (R=0.933, P<0.0001) the 7-SNP genotype nevertheless outperformed the tagging SNP with respect to specificity (1.0 vs. 0.9444, P=0.0065). Considering all possible SNP combinations in a receiver operating characteristic analysis we identified a 2-SNP genotype (C282T, T341C) that outperformed the tagging SNP and was equivalent to the 7-SNP genotype. The 2-SNP genotype predicted the correct phenotype with a sensitivity of 0.8643 and a specificity of 1.0. In addition, it predicted the 7-SNP genotype with sensitivity and specificity of 0.9993 and 0.9880, respectively. The prediction of the NAT2 genotype by the 2-SNP genotype performed similarly in populations of Caucasian, Venezuelan and Pakistani background. A 2-SNP genotype predicts NAT2 phenotypes with similar sensitivity and specificity as the conventional 7-SNP genotype. This procedure represents a facilitation in individualized dosing of NAT2 substrates without losing sensitivity or specificity. PMID:21750470
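
    The sensitivity and specificity figures quoted above follow from a 2x2 table of predicted versus measured phenotype. The sketch below shows the arithmetic only; the function name and the counts are hypothetical and are not taken from the study, and which phenotype is treated as the "positive" class is an assumption left to the reader.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts.

    tp/fn: positives predicted correctly / missed
    tn/fp: negatives predicted correctly / wrongly called positive
    """
    sensitivity = tp / (tp + fn)  # fraction of true positives recovered
    specificity = tn / (tn + fp)  # fraction of true negatives recovered
    return sensitivity, specificity


# Hypothetical counts for illustration only (not data from the abstract).
sens, spec = sensitivity_specificity(tp=90, fn=10, tn=50, fp=0)
print(f"sensitivity={sens:.4f} specificity={spec:.4f}")
```

    A specificity of 1.0, as reported for the 2-SNP and 7-SNP genotypes, corresponds to fp = 0 in this notation.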

  2. Design and Characterization of a 52K SNP Chip for Goats

    PubMed Central

    Tosser-Klopp, Gwenola; Bardou, Philippe; Bouchez, Olivier; Cabau, Cédric; Crooijmans, Richard; Dong, Yang; Donnadieu-Tonon, Cécile; Eggen, André; Heuven, Henri C. M.; Jamli, Saadiah; Jiken, Abdullah Johari; Klopp, Christophe; Lawley, Cynthia T.; McEwan, John; Martin, Patrice; Moreno, Carole R.; Mulsant, Philippe; Nabihoudine, Ibouniyamine; Pailhoux, Eric; Palhière, Isabelle; Rupp, Rachel; Sarry, Julien; Sayre, Brian L.; Tircazes, Aurélie; Wang, Jun; Wang, Wen; Zhang, Wenguang

    2014-01-01

    The success of Genome Wide Association Studies in the discovery of sequence variation linked to complex traits in humans has increased interest in high throughput SNP genotyping assays in livestock species. Primary goals are QTL detection and genomic selection. The purpose here was design of a 50–60,000 SNP chip for goats. The success of a moderate density SNP assay depends on reliable bioinformatic SNP detection procedures, the technological success rate of the SNP design, even spacing of SNPs on the genome and selection of Minor Allele Frequencies (MAF) suitable to use in diverse breeds. Through the federation of three SNP discovery projects consolidated as the International Goat Genome Consortium, we have identified approximately twelve million high quality SNP variants in the goat genome stored in a database together with their biological and technical characteristics. These SNPs were identified within and between six breeds (meat, milk and mixed): Alpine, Boer, Creole, Katjang, Saanen and Savanna, comprising a total of 97 animals. Whole genome and Reduced Representation Library sequences were aligned on >10 kb scaffolds of the de novo goat genome assembly. The 60,000 selected SNPs, evenly spaced on the goat genome, were submitted for oligo manufacturing (Illumina, Inc) and published in dbSNP along with flanking sequences and map position on goat assemblies (i.e. scaffolds and pseudo-chromosomes), sheep genome V2 and cattle UMD3.1 assembly. Ten breeds were then used to validate the SNP content and 52,295 loci could be successfully genotyped and used to generate a final cluster file. The combined strategy of using mainly whole genome Next Generation Sequencing and mapping on a contig genome assembly, complemented with Illumina design tools proved to be efficient in producing this GoatSNP50 chip. Advances in use of molecular markers are expected to accelerate goat genomic studies in coming years. PMID:24465974

  3. miRdSNP: a database of disease-associated SNPs and microRNA target sites on 3'UTRs of human genes

    PubMed Central

    2012-01-01

    Background Single nucleotide polymorphisms (SNPs) can lead to the susceptibility and onset of diseases through their effects on gene expression at the posttranscriptional level. Recent findings indicate that SNPs could create, destroy, or modify the efficiency of miRNA binding to the 3'UTR of a gene, resulting in gene dysregulation. With the rapidly growing number of published disease-associated SNPs (dSNPs), there is a strong need for resources specifically recording dSNPs on the 3'UTRs and their nucleotide distance from miRNA target sites. We present here miRdSNP, a database incorporating three important areas of dSNPs, miRNA target sites, and diseases. Description miRdSNP provides a unique database of dSNPs on the 3'UTRs of human genes manually curated from PubMed. The current release includes 786 dSNP-disease associations for 630 unique dSNPs and 204 disease types. miRdSNP annotates genes with experimentally confirmed targeting by miRNAs and indexes miRNA target sites predicted by TargetScan and PicTar as well as potential miRNA target sites newly generated by dSNPs. A robust web interface and search tools are provided for studying the proximity of miRNA binding sites to dSNPs in relation to human diseases. Searches can be dynamically filtered by gene name, miRBase ID, target prediction algorithm, disease, and any nucleotide distance between dSNPs and miRNA target sites. Results can be viewed at the sequence level showing the annotated locations for miRNA target sites and dSNPs on the entire 3'UTR sequences. The integration of dSNPs with the UCSC Genome browser is also supported. Conclusion miRdSNP provides a comprehensive data source of dSNPs and robust tools for exploring their distance from miRNA target sites on the 3'UTRs of human genes. miRdSNP enables researchers to further explore the molecular mechanism of gene dysregulation for dSNPs at posttranscriptional level. miRdSNP is freely available on the web at http://mirdsnp.ccr.buffalo.edu. PMID:22276777

  4. Identification of Swedish mosquitoes based on molecular barcoding of the COI gene and SNP analysis.

    PubMed

    Engdahl, Cecilia; Larsson, Pär; Näslund, Jonas; Bravo, Mayra; Evander, Magnus; Lundström, Jan O; Ahlm, Clas; Bucht, Göran

    2014-05-01

    Mosquito-borne infectious diseases are emerging in many regions of the world. Consequently, surveillance of mosquitoes and concomitant infectious agents is of great importance for prediction and prevention of mosquito-borne infectious diseases. Currently, morphological identification of mosquitoes is the traditional procedure. However, sequencing of specified genes or standard genomic regions, DNA barcoding, has recently been suggested as a global standard for identification and classification of many different species. Our aim was to develop a genetic method to identify mosquitoes and to study their relationship. Mosquitoes were captured at collection sites in northern Sweden and identified morphologically before the cytochrome c oxidase subunit I (COI) gene sequences of 14 of the most common mosquito species were determined. The sequences obtained were then used for phylogenetic placement, for validation and benchmarking of phenetic classifications and finally to develop a hierarchical PCR-based typing scheme based on single nucleotide polymorphism sites (SNPs) to enable rapid genetic identification, circumventing the need for morphological characterization. The results showed that exact phylogenetic relationships between mosquito taxa were preserved at shorter evolutionary distances, but at deeper levels, they could not be inferred with confidence using COI gene sequence data alone. Fourteen of the most common mosquito species in Sweden were identified by the SNP/PCR-based typing scheme, demonstrating that genetic typing using SNPs of the COI gene is a useful method for identification of mosquitoes with potential for worldwide application. PMID:24215491

  5. A Customized Pigmentation SNP Array Identifies a Novel SNP Associated with Melanoma Predisposition in the SLC45A2 Gene

    PubMed Central

    Alonso, Santos; Boyano, M. Dolores; Peña-Chilet, Maria; Pita, Guillermo; Aviles, Jose A.; Mayor, Matias; Gomez-Fernandez, Cristina; Casado, Beatriz; Martin-Gonzalez, Manuel; Izagirre, Neskuts; De la Rua, Concepcion; Asumendi, Aintzane; Perez-Yarza, Gorka; Arroyo-Berdugo, Yoana; Boldo, Enrique; Lozoya, Rafael; Torrijos-Aguilar, Arantxa; Pitarch, Ana; Pitarch, Gerard; Sanchez-Motilla, Jose M.; Valcuende-Cavero, Francisca; Tomas-Cabedo, Gloria; Perez-Pastor, Gemma; Diaz-Perez, Jose L.; Gardeazabal, Jesus; de Lizarduy, Iñigo Martinez; Sanchez-Diez, Ana; Valdes, Carlos; Pizarro, Angel; Casado, Mariano; Carretero, Gregorio; Botella-Estrada, Rafael; Nagore, Eduardo; Lazaro, Pablo; Lluch, Ana; Benitez, Javier; Martinez-Cadenas, Conrado; Ribas, Gloria

    2011-01-01

    As the incidence of Malignant Melanoma (MM) reflects an interaction between skin colour and UV exposure, variations in genes implicated in pigmentation and tanning response to UV may be associated with susceptibility to MM. In this study, 363 SNPs in 65 gene regions belonging to the pigmentation pathway have been successfully genotyped using a SNP array. Five hundred and ninety MM cases and 507 controls were analyzed in a discovery phase I. Ten candidate SNPs based on a p-value threshold of 0.01 were identified. Two of them, rs35414 (SLC45A2) and rs2069398 (SILV/CDK2), were statistically significant after conservative Bonferroni correction. The best six SNPs were further tested in an independent Spanish series (624 MM cases and 789 controls). A novel SNP located on the SLC45A2 gene (rs35414) was found to be significantly associated with melanoma in both phase I and phase II (P<0.0001). None of the other five SNPs were replicated in this second phase of the study. However, three SNPs in TYR, SILV/CDK2 and ADAMTS20 genes (rs17793678, rs2069398 and rs1510521 respectively) had an overall p-value<0.05 when considering the whole DNA collection (1214 MM cases and 1296 controls). Both the SLC45A2 and the SILV/CDK2 variants behave as protective alleles, while the TYR and ADAMTS20 variants seem to function as risk alleles. Cumulative effects were detected when these four variants were considered together. Furthermore, individuals carrying two or more mutations in MC1R, a well-known low penetrance melanoma-predisposing gene, had a decreased MM risk if concurrently bearing the SLC45A2 protective variant. To our knowledge, this is the largest study on Spanish sporadic MM cases to date. PMID:21559390

  6. Identification, validation and survey of a single nucleotide polymorphism (SNP) associated with pungency in Capsicum spp.

    PubMed

    Garcés-Claver, Ana; Fellman, Shanna Moore; Gil-Ortega, Ramiro; Jahn, Molly; Arnedo-Andrés, María S

    2007-11-01

    A single nucleotide polymorphism (SNP) associated with pungency was detected within an expressed sequence tag (EST) of 307 bp. This fragment was identified after expression analysis of the EST clone SB2-66 in placenta tissue of Capsicum fruits. Sequence alignments corresponding to this new fragment allowed us to identify an SNP between pungent and non-pungent accessions. Two methods were chosen for the development of the SNP marker linked to pungency: tetra-primer amplification refractory mutation system-PCR (tetra-primer ARMS-PCR) and cleaved amplified polymorphic sequence. Results showed that both methods were successful in distinguishing genotypes. Nevertheless, tetra-primer ARMS-PCR was chosen for SNP genotyping because it was more rapid, reliable and cost-effective. The utility of this SNP marker for pungency was demonstrated by the ability to distinguish between 29 pungent and non-pungent cultivars of Capsicum annuum. In addition, the SNP was also associated with phenotypic pungent character in the tested genotypes of C. chinense, C. baccatum, C. frutescens, C. galapagoense, C. eximium, C. tovarii and C. cardenasi. This SNP marker is a faster, cheaper and more reproducible method for identifying pungent peppers than other techniques such as panel tasting, and allows rapid screening of the trait in early growth stages. PMID:17882396

  7. A system for exact and approximate genetic linkage analysis of SNP data in large pedigrees

    PubMed Central

    Silberstein, Mark; Weissbrod, Omer; Otten, Lars; Tzemach, Anna; Anisenia, Andrei; Shtark, Oren; Tuberg, Dvir; Galfrin, Eddie; Gannon, Irena; Shalata, Adel; Borochowitz, Zvi U.; Dechter, Rina; Thompson, Elizabeth; Geiger, Dan

    2013-01-01

    Motivation: The use of dense single nucleotide polymorphism (SNP) data in genetic linkage analysis of large pedigrees is impeded by significant technical, methodological and computational challenges. Here we describe Superlink-Online SNP, a new powerful online system that streamlines the linkage analysis of SNP data. It features a fully integrated flexible processing workflow comprising both well-known and novel data analysis tools, including SNP clustering, erroneous data filtering, exact and approximate LOD calculations and maximum-likelihood haplotyping. The system draws its power from thousands of CPUs, performing data analysis tasks orders of magnitude faster than a single computer. By providing an intuitive interface to sophisticated state-of-the-art analysis tools coupled with high computing capacity, Superlink-Online SNP helps geneticists unleash the potential of SNP data for detecting disease genes. Results: Computations performed by Superlink-Online SNP are automatically parallelized using novel paradigms, and executed on an unlimited number of private or public CPUs. One novel service is large-scale approximate Markov Chain–Monte Carlo (MCMC) analysis. The accuracy of the results is reliably estimated by running the same computation on multiple CPUs and evaluating the Gelman–Rubin Score to set aside unreliable results. Another service within the workflow is a novel parallelized exact algorithm for inferring maximum-likelihood haplotyping. The reported system enables genetic analyses that were previously infeasible. We demonstrate the system capabilities through a study of a large complex pedigree affected with metabolic syndrome. Availability: Superlink-Online SNP is freely available for researchers at http://cbl-hap.cs.technion.ac.il/superlink-snp. The system source code can also be downloaded from the system website. Contact: omerw@cs.technion.ac.il Supplementary information: Supplementary data are available at Bioinformatics online. PMID:23162081
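
    The Gelman–Rubin check mentioned above compares between-chain and within-chain variance across independent MCMC runs; values close to 1 suggest the runs agree. The following is a minimal sketch of the standard potential scale reduction factor for a single scalar quantity. It is not the Superlink-Online SNP implementation, and the function name, the sample chains and the 1.1 cutoff are illustrative assumptions.

```python
import math
from statistics import mean, variance


def gelman_rubin(chains):
    """Potential scale reduction factor for equal-length scalar chains."""
    m = len(chains)
    n = len(chains[0])
    if m < 2 or n < 2 or any(len(c) != n for c in chains):
        raise ValueError("need at least 2 chains of equal length >= 2")
    chain_means = [mean(c) for c in chains]
    grand_mean = mean(chain_means)
    # W: average within-chain sample variance
    w = mean([variance(c) for c in chains])
    # B: between-chain variance, scaled by chain length
    b = n / (m - 1) * sum((cm - grand_mean) ** 2 for cm in chain_means)
    # Pooled estimate of the posterior variance
    var_hat = (n - 1) / n * w + b / n
    return math.sqrt(var_hat / w)  # undefined if every chain is constant


# Hypothetical chains: the first pair agrees, the second pair does not.
agree = [[1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 4.0]]
disagree = [[0.0, 1.0, 0.0, 1.0], [10.0, 11.0, 10.0, 11.0]]
for name, chains in (("agree", agree), ("disagree", disagree)):
    r_hat = gelman_rubin(chains)
    # 1.1 is a commonly used rule-of-thumb cutoff, assumed here
    print(name, round(r_hat, 3), "unreliable" if r_hat > 1.1 else "ok")
```

    Results whose score exceeds the chosen cutoff would be the ones set aside as unreliable.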

  8. DBDiaSNP: An Open-Source Knowledgebase of Genetic Polymorphisms and Resistance Genes Related to Diarrheal Pathogens.

    PubMed

    Mehla, Kusum; Ramana, Jayashree

    2015-06-01

    Diarrhea is a highly common infection among children, responsible for significant morbidity and mortality rate worldwide. After pneumonia, diarrhea remains the second leading cause of neonatal deaths. Numerous viral, bacterial, and parasitic enteric pathogens are associated with diarrhea. With increasing antibiotic resistance among enteric pathogens, there is an urgent need for global surveillance of the mutations and resistance genes primarily responsible for resistance to antibiotic treatment. Single Nucleotide Polymorphisms are important in this regard as they have a vast potential to be utilized as molecular diagnostics for gene-disease or pharmacogenomics association studies linking genotype to phenotype. DBDiaSNP is a comprehensive repository of mutations and resistance genes among various diarrheal pathogens and hosts to advance breakthroughs that will find applications from development of sequence-based diagnostic tools to drug discovery. It contains information about 946 mutations and 326 resistance genes compiled from literature and various web resources. As of March 2015, it houses various pathogen genes and the mutations responsible for antibiotic resistance. The pathogens include, for example, DEC (Diarrheagenic E. coli), Salmonella spp., Campylobacter spp., Shigella spp., Clostridium difficile, Aeromonas spp., Helicobacter pylori, Entamoeba histolytica, Vibrio cholerae, and viruses. It also includes mutations from hosts (e.g., humans, pigs, others) that render them either susceptible or resistant to a certain type of diarrhea. DBDiaSNP is therefore intended as an integrated open access database for researchers and clinicians working on diarrheal diseases. Additionally, we note that the DBDiaSNP is one of the first antibiotic resistance databases for the diarrheal pathogens covering mutations and resistance genes that have clinical relevance from a broad range of pathogens and hosts. For future translational research involving integrative biology and

  9. Genetic analysis of diabetic nephropathy on chromosome 18 in African Americans: linkage analysis and dense SNP mapping.

    PubMed

    McDonough, Caitrin W; Bostrom, Meredith A; Lu, Lingyi; Hicks, Pamela J; Langefeld, Carl D; Divers, Jasmin; Mychaleckyj, Josyf C; Freedman, Barry I; Bowden, Donald W

    2009-12-01

    Genetic studies in Turkish, Native American, European American, and African American (AA) families have linked chromosome 18q21.1-23 to susceptibility for diabetes-associated nephropathy. In this study, we have carried out fine linkage mapping in the 18q region previously linked to diabetic nephropathy in AAs by genotyping both microsatellite and single nucleotide polymorphisms (SNPs) for linkage analysis in an expanded set of 223 AA families multiplexed for type 2 diabetes associated ESRD (T2DM-ESRD). Several approaches were used to evaluate evidence of linkage with the strongest evidence for linkage in ordered subset analysis with an earlier age of T2DM diagnosis compared to the remaining pedigrees (LOD 3.9 at 90.1 cM, ΔP = 0.0161, NPL P value = 0.00002). Overall, the maximum LODs and LOD-1 intervals vary in magnitude and location depending upon analysis. The linkage mapping was followed up by performing a dense SNP map, genotyping 2,814 SNPs in the refined LOD-1 region in 1,029 AA T2DM-ESRD cases and 1,027 AA controls. Of the top 25 most associated SNPs, 10 resided within genic regions. Two candidate genes stood out: NEDD4L and SERPINB7. SNP rs512099, located in intron 1 of NEDD4L, was associated under a dominant model of inheritance [P value = 0.0006; Odds ratio (95% Confidence Interval) OR (95% CI) = 0.70 (0.57-0.86)]. SNP rs1720843, located in intron 2 of SERPINB7, was associated under a recessive model of inheritance [P value = 0.0017; OR (95% CI) = 0.65 (0.50-0.85)]. Collectively, these results suggest that multiple genes in this region may influence diabetic nephropathy susceptibility in AAs. PMID:19690890
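
    The odds ratios with 95% confidence intervals reported above can be illustrated with a 2x2 table of carrier status by case/control status. The sketch below uses the simple Woolf (log-based) interval as a stand-in; the abstract does not say how its intervals were computed, and the function name and counts are hypothetical, not data from the study.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Woolf confidence interval from a 2x2 table.

    a: cases with the genotype      b: cases without it
    c: controls with the genotype   d: controls without it
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Woolf approximation
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts for illustration only.
or_, lo, hi = odds_ratio_ci(a=20, b=80, c=10, d=90)
print(f"OR={or_:.2f} (95% CI {lo:.2f}-{hi:.2f})")
```

    An interval lying entirely below 1, as in the two reported associations, indicates that the genotype is less common among cases than among controls under the stated inheritance model.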

  10. Drug-SNPing: an integrated drug-based, protein interaction-based tagSNP-based pharmacogenomics platform for SNP genotyping.

    PubMed

    Yang, Cheng-Hong; Cheng, Yu-Huei; Chuang, Li-Yeh; Chang, Hsueh-Wei

    2013-03-15

    Many drug or single nucleotide polymorphism (SNP)-related resources and tools have been developed, but connecting and integrating them is still a challenge. Here, we describe a user-friendly web-based software package, named Drug-SNPing, which provides a platform for the integration of drug information (DrugBank and PharmGKB), protein-protein interactions (STRING), tagSNP selection (HapMap) and genotyping information (dbSNP, REBASE and SNP500Cancer). DrugBank-based inputs include the following: (i) common name of the drug, (ii) synonym or drug brand name, (iii) gene name (HUGO) and (iv) keywords. PharmGKB-based inputs include the following: (i) gene name (HUGO), (ii) drug name and (iii) disease-related keywords. The output provides drug-related information, metabolizing enzymes and drug targets, as well as protein-protein interaction data. Importantly, tagSNPs of the selected genes are retrieved for genotyping analyses. All drug-based and protein-protein interaction-based SNP genotyping information is provided with PCR-RFLP (PCR-restriction fragment length polymorphism) and TaqMan probes. Thus, users can enter any drug keywords/brand names to obtain immediate information that is highly relevant to genotyping for pharmacogenomics research. PMID:23418190

  11. Porcine colonization of the Americas: a 60k SNP story

    PubMed Central

    Burgos-Paz, W; Souza, C A; Megens, H J; Ramayo-Caldas, Y; Melo, M; Lemús-Flores, C; Caal, E; Soto, H W; Martínez, R; Álvarez, L A; Aguirre, L; Iñiguez, V; Revidatti, M A; Martínez-López, O R; Llambi, S; Esteve-Codina, A; Rodríguez, M C; Crooijmans, R P M A; Paiva, S R; Schook, L B; Groenen, M A M; Pérez-Enciso, M

    2013-01-01

    The pig, Sus scrofa, is a foreign species to the American continent. Although pigs originally introduced in the Americas should be related to those from the Iberian Peninsula and Canary Islands, the phylogeny of current creole pigs that now populate the continent is likely to be very complex. Because of the extreme climates that America harbors, these populations also provide a unique example of a fast evolutionary phenomenon of adaptation. Here, we provide a genome-wide study of these issues by genotyping, with a 60k SNP chip, 206 village pigs sampled across 14 countries and 183 pigs from outgroup breeds that are potential founders of the American populations, including wild boar, Iberian, international and Chinese breeds. Results show that American village pigs are primarily of European ancestry, although the observed genetic landscape is that of a complex conglomerate. There was no correlation between genetic and geographical distances, either continent-wide or when analyzing specific areas. Most populations showed a clear admixed structure where the Iberian pig was not necessarily the main component, illustrating how international breeds, but also Chinese pigs, have contributed to extant genetic composition of American village pigs. We also observe that many genes related to the cardiovascular system show an increased differentiation between altiplano and genetically related pigs living near sea level. PMID:23250008

  12. Sturgeon conservation genomics: SNP discovery and validation using RAD sequencing.

    PubMed

    Ogden, R; Gharbi, K; Mugue, N; Martinsohn, J; Senn, H; Davey, J W; Pourkazemi, M; McEwing, R; Eland, C; Vidotto, M; Sergeev, A; Congiu, L

    2013-06-01

    Caviar-producing sturgeons belonging to the genus Acipenser are considered to be one of the most endangered species groups in the world. Continued overfishing in spite of increasing legislation, zero catch quotas and extensive aquaculture production have led to the collapse of wild stocks across Europe and Asia. The evolutionary relationships among Adriatic, Russian, Persian and Siberian sturgeons are complex because of past introgression events and remain poorly understood. Conservation management, traceability and enforcement suffer a lack of appropriate DNA markers for the genetic identification of sturgeon at the species, population and individual level. This study employed RAD sequencing to discover and characterize single nucleotide polymorphism (SNP) DNA markers for use in sturgeon conservation in these four tetraploid species over three biological levels, using a single sequencing lane. Four population meta-samples and eight individual samples from one family were barcoded separately before sequencing. Analysis of 14.4 Gb of paired-end RAD data focused on the identification of SNPs in the paired-end contig, with subsequent in silico and empirical validation of candidate markers. Thousands of putatively informative markers were identified including, for the first time, SNPs that show population-wide differentiation between Russian and Persian sturgeons, representing an important advance in our ability to manage these cryptic species. The results highlight the challenges of genotyping-by-sequencing in polyploid taxa, while establishing the potential genetic resources for developing a new range of caviar traceability and enforcement tools. PMID:23473098

  13. Association analysis of the FTO gene with obesity in children of Caucasian and African ancestry reveals a common tagging SNP.

    PubMed

    Grant, Struan F A; Li, Mingyao; Bradfield, Jonathan P; Kim, Cecilia E; Annaiah, Kiran; Santa, Erin; Glessner, Joseph T; Casalunovo, Tracy; Frackelton, Edward C; Otieno, F George; Shaner, Julie L; Smith, Ryan M; Imielinski, Marcin; Eckert, Andrew W; Chiavacci, Rosetta M; Berkowitz, Robert I; Hakonarson, Hakon

    2008-01-01

    Recently an association was demonstrated between the single nucleotide polymorphism (SNP), rs9939609, within the FTO locus and obesity as a consequence of a genome wide association (GWA) study of type 2 diabetes in adults. We examined the effects of two perfect surrogates for this SNP plus 11 other SNPs at this locus with respect to our childhood obesity cohort, consisting of both Caucasians and African Americans (AA). Utilizing data from our ongoing GWA study in our cohort of 418 Caucasian obese children (BMI ≥ 95th percentile), 2,270 Caucasian controls (BMI < 95th percentile), 578 AA obese children and 1,424 AA controls, we investigated the association of the previously reported variation at the FTO locus with the childhood form of this disease in both ethnicities. The minor allele frequencies (MAF) of rs8050136 and rs3751812 (perfect surrogates for rs9939609, i.e., both r² = 1) in the Caucasian cases were 0.448 and 0.443 respectively while they were 0.391 and 0.386 in Caucasian controls respectively, yielding for both an odds ratio (OR) of 1.27 (95% CI 1.08-1.47; P = 0.0022). Furthermore, the MAFs of rs8050136 and rs3751812 in the AA cases were 0.449 and 0.115 respectively while they were 0.436 and 0.090 in AA controls respectively, yielding an OR of 1.05 (95% CI 0.91-1.21; P = 0.49) and of 1.31 (95% CI 1.050-1.643; P = 0.017) respectively. Investigating all 13 SNPs present on the Illumina HumanHap550 BeadChip in this region of linkage disequilibrium, rs3751812 was the only SNP conferring significant risk in AA. We have therefore replicated and refined the association in an AA cohort and distilled a tag-SNP, rs3751812, which captures the ancestral origin of the actual mutation. As such, variants in the FTO gene confer a similar magnitude of risk of obesity to children as to their adult counterparts and appear to have a global impact. PMID:18335027
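
    The odds ratios quoted in this record can be approximately reproduced from the minor allele frequencies alone. The sketch below assumes a simple allelic odds ratio (odds of the minor allele in cases divided by the odds in controls); the function name is hypothetical, and the record does not state which test the authors actually used, so small differences from the published values are expected because the frequencies are rounded.

```python
def allelic_odds_ratio(maf_cases: float, maf_controls: float) -> float:
    """Odds of the minor allele in cases divided by the odds in controls."""
    odds_cases = maf_cases / (1.0 - maf_cases)
    odds_controls = maf_controls / (1.0 - maf_controls)
    return odds_cases / odds_controls


# Minor allele frequencies (cases, controls) as quoted in the abstract.
frequencies = {
    ("Caucasian", "rs8050136"): (0.448, 0.391),
    ("Caucasian", "rs3751812"): (0.443, 0.386),
    ("AA", "rs8050136"): (0.449, 0.436),
    ("AA", "rs3751812"): (0.115, 0.090),
}

for (cohort, snp), (cases, controls) in frequencies.items():
    print(f"{cohort:9s} {snp}: OR ~ {allelic_odds_ratio(cases, controls):.2f}")
```

    Under this assumption the recomputed values fall within about 0.01 of the reported odds ratios (1.27, 1.27, 1.05 and 1.31).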

  14. Genome-Wide Joint Meta-Analysis of SNP and SNP-by-Smoking Interaction Identifies Novel Loci for Pulmonary Function

    PubMed Central

    Imboden, Medea; Koch, Beate; McArdle, Wendy L.; Smith, Albert V.; Smolonska, Joanna; Sood, Akshay; Tang, Wenbo; Wilk, Jemma B.; Zhai, Guangju; Zhao, Jing Hua; Aschard, Hugues; Burkart, Kristin M.; Curjuric, Ivan; Eijgelsheim, Mark; Elliott, Paul; Gu, Xiangjun; Harris, Tamara B.; Janson, Christer; Homuth, Georg; Hysi, Pirro G.; Liu, Jason Z.; Loehr, Laura R.; Lohman, Kurt; Loos, Ruth J. F.; Manning, Alisa K.; Marciante, Kristin D.; Obeidat, Ma'en; Postma, Dirkje S.; Aldrich, Melinda C.; Brusselle, Guy G.; Chen, Ting-hsu; Eiriksdottir, Gudny; Franceschini, Nora; Heinrich, Joachim; Rotter, Jerome I.; Wijmenga, Cisca; Williams, O. Dale; Bentley, Amy R.; Hofman, Albert; Laurie, Cathy C.; Lumley, Thomas; Morrison, Alanna C.; Joubert, Bonnie R.; Rivadeneira, Fernando; Couper, David J.; Kritchevsky, Stephen B.; Liu, Yongmei; Wjst, Matthias; Wain, Louise V.; Vonk, Judith M.; Uitterlinden, André G.; Rochat, Thierry; Rich, Stephen S.; Psaty, Bruce M.; O'Connor, George T.; North, Kari E.; Mirel, Daniel B.; Meibohm, Bernd; Launer, Lenore J.; Khaw, Kay-Tee; Hartikainen, Anna-Liisa; Hammond, Christopher J.; Gläser, Sven; Marchini, Jonathan; Kraft, Peter; Wareham, Nicholas J.; Völzke, Henry; Stricker, Bruno H. C.; Spector, Timothy D.; Probst-Hensch, Nicole M.; Jarvis, Deborah; Jarvelin, Marjo-Riitta; Heckbert, Susan R.; Gudnason, Vilmundur; Boezen, H. Marike; Barr, R. Graham; Cassano, Patricia A.; Strachan, David P.; Fornage, Myriam; Hall, Ian P.; Dupuis, Josée; Tobin, Martin D.; London, Stephanie J.

    2012-01-01

    Genome-wide association studies have identified numerous genetic loci for spirometric measures of pulmonary function, forced expiratory volume in one second (FEV1), and its ratio to forced vital capacity (FEV1/FVC). Given that cigarette smoking adversely affects pulmonary function, we conducted genome-wide joint meta-analyses (JMA) of single nucleotide polymorphism (SNP) and SNP-by-smoking (ever-smoking or pack-years) associations on FEV1 and FEV1/FVC across 19 studies (total N = 50,047). We identified three novel loci not previously associated with pulmonary function. SNPs in or near DNER (smallest P_JMA = 5.00×10⁻¹¹), HLA-DQB1 and HLA-DQA2 (smallest P_JMA = 4.35×10⁻⁹), and KCNJ2 and SOX9 (smallest P_JMA = 1.28×10⁻⁸) were associated with FEV1/FVC or FEV1 in meta-analysis models including SNP main effects, smoking main effects, and SNP-by-smoking (ever-smoking or pack-years) interaction. The HLA region has been widely implicated for autoimmune and lung phenotypes, unlike the other novel loci, which have not been widely implicated. We evaluated DNER, KCNJ2, and SOX9 and found them to be expressed in human lung tissue. DNER and SOX9 further showed evidence of differential expression in human airway epithelium in smokers compared to non-smokers. Our findings demonstrated that joint testing of SNP and SNP-by-environment interaction identified novel loci associated with complex traits that are missed when considering only the genetic main effects. PMID:23284291

  15. Fish scales and SNP chips: SNP genotyping and allele frequency estimation in individual and pooled DNA from historical samples of Atlantic salmon (Salmo salar)

    PubMed Central

    2013-01-01

    Background DNA extracted from historical samples is an important resource for understanding genetic consequences of anthropogenic influences and long-term environmental change. However, such samples generally yield DNA of a lower amount and quality, and the extent to which DNA degradation affects SNP genotyping success and allele frequency estimation is not well understood. We conducted high density SNP genotyping and allele frequency estimation in both individual DNA samples and pooled DNA samples extracted from dried Atlantic salmon (Salmo salar) scales stored at room temperature for up to 35 years, and assessed genotyping success, repeatability and accuracy of allele frequency estimation using a high density SNP genotyping array. Results In individual DNA samples, genotyping success and repeatability were very high (> 0.973 and > 0.998, respectively) in samples stored for up to 35 years; both increased with the proportion of DNA of fragment size > 1000 bp. In pooled DNA samples, allele frequency estimation was highly repeatable (Repeatability = 0.986) and highly correlated with empirical allele frequency measures (Mean Adjusted R² = 0.991); allele frequency could be accurately estimated in > 95% of pooled DNA samples with a reference group of at least 30 individuals. SNPs located in polyploid regions of the genome were more sensitive to DNA degradation: older samples had lower genotyping success at these loci, and a larger reference panel of individuals was required to accurately estimate allele frequencies. Conclusions SNP genotyping was highly successful in degraded DNA samples, paving the way for the use of degraded samples in SNP genotyping projects. DNA pooling provides the potential for large scale population genetic studies with fewer assays, provided enough reference individuals are also genotyped and DNA quality is properly assessed beforehand. We provide recommendations for future studies intending to conduct high-throughput SNP

  16. Identification of SNP Haplotypes and Prospects of Association Mapping in Watermelon

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Watermelon is the fifth most economically important vegetable crop cultivated world-wide. Implementing Single Nucleotide Polymorphism (SNP) marker technology in watermelon breeding and germplasm evaluation programs holds a key to improve horticulturally important traits. Next-generation sequencing...

  17. SNP analysis of AMY2 and CTSL genes in Litopenaeus vannamei and Penaeus monodon shrimp.

    PubMed

    Glenn, K L; Grapes, L; Suwanasopee, T; Harris, D L; Li, Y; Wilson, K; Rothschild, M F

    2005-06-01

    Genetic studies in shrimp have focused on disease, with production traits such as growth left unexamined. Two shrimp species, Litopenaeus vannamei and Penaeus monodon, which represent the majority of US shrimp imports, were selected for single nucleotide polymorphism (SNP) discovery in alpha-amylase (AMY2) and cathepsin-l (CTSL), both candidate genes for growth. In L. vannamei, four SNPs were found in AMY2 and one SNP was found in CTSL. In P. monodon, one SNP was identified in CTSL. The CTSL gene was mapped to linkage group 28 of P. monodon using the female map developed with the Australian P. monodon mapping population. Association analyses for the AMY2 and CTSL genes with body weight (BW) were performed in two L. vannamei populations. While neither gene was found to be significantly associated with BW in these populations, there was a trend in one population towards higher BW for allele G of CTSL SNP C681G. PMID:15932404

  18. Interim report on updated microarray probes for the LLNL Burkholderia pseudomallei SNP array

    SciTech Connect

    Gardner, S; Jaing, C

    2012-03-27

    The overall goal of this project is to forensically characterize 100 unknown Burkholderia isolates in the US-Australia collaboration. We will identify genome-wide single nucleotide polymorphisms (SNPs) from B. pseudomallei and near neighbor species including B. mallei, B. thailandensis and B. oklahomensis. We will design microarray probes to detect these SNP markers and analyze 100 Burkholderia genomic DNAs extracted from environmental, clinical and near neighbor isolates from Australian collaborators on the Burkholderia SNP microarray. We will analyze the microarray genotyping results to characterize the genetic diversity of these new isolates and triage the samples for whole genome sequencing. In this interim report, we described the SNP analysis and the microarray probe design for the Burkholderia SNP microarray.

  19. SNP discovery through de novo deep sequencing using the next generation of DNA sequencers

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The production of high volumes of DNA sequence data using new technologies has permitted more efficient identification of single nucleotide polymorphisms in vertebrate genomes. This chapter presented practical methodology for production and analysis of DNA sequence data for SNP discovery....

  20. Evaluation of approaches for identifying population informative markers from high density SNP Chips

    PubMed Central

    2011-01-01

    Background Genetic markers can be used to identify and verify the origin of individuals. Motivation for the inference of ancestry ranges from conservation genetics to forensic analysis. High density assays featuring Single Nucleotide Polymorphism (SNP) markers can be exploited to create a reduced panel containing the most informative markers for these purposes. The objectives of this study were to evaluate methods of marker selection and determine the minimum number of markers from the BovineSNP50 BeadChip required to verify the origin of individuals in European cattle breeds. Delta, Wright's FST, Weir & Cockerham's FST and PCA methods for population differentiation were compared. The level of informativeness of each SNP was estimated from the breed specific allele frequencies. Individual assignment analysis was performed using the ranked informative markers. Stringency levels were applied by log-likelihood ratio to assess the confidence of the assignment test. Results A 95% assignment success rate for the 384 individually genotyped animals was achieved with < 80, < 100, < 140 and < 200 SNP markers (with increasing stringency threshold levels) across all the examined methods for marker selection. No further gain in power of assignment was achieved by sampling in excess of 200 SNP markers. The marker selection method that required the lowest number of SNP markers to verify the animal's breed origin was Wright's FST (60 to 140 SNPs depending on the chosen degree of confidence). Certain breeds required fewer markers (< 100) to achieve 100% assignment success. In contrast, closely related breeds require more markers (~200) to achieve > 95% assignment success. The power of assignment success, and therefore the number of SNP markers required, is dependent on the levels of genetic heterogeneity and pool of samples considered. Conclusions While all SNP selection methods produced marker panels capable of breed identification, the power of assignment varied markedly among

  1. A meta-analysis strategy for gene prioritization using gene expression, SNP genotype, and eQTL data.

    PubMed

    Che, Jingmin; Shin, Miyoung

    2015-01-01

    In order to understand disease pathogenesis, improve medical diagnosis, or discover effective drug targets, it is important to identify significant genes deeply involved in human disease. For this purpose, many earlier approaches attempted to prioritize candidate genes using gene expression profiles or SNP genotype data, but they often suffer from producing many false-positive results. To address this issue, in this paper, we propose a meta-analysis strategy for gene prioritization that employs three different genetic resources--gene expression data, single nucleotide polymorphism (SNP) genotype data, and expression quantitative trait loci (eQTL) data--in an integrative manner. For integration, we utilized an improved technique for the order of preference by similarity to ideal solution (TOPSIS) to combine scores from distinct resources. This method was evaluated on two publicly available datasets regarding prostate cancer and lung cancer to identify disease-related genes. Consequently, our proposed strategy for gene prioritization showed its superiority to conventional methods in discovering significant disease-related genes with several types of genetic resources, while making good use of potential complementarities among available resources. PMID:25874220
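
    To make the score-combination step concrete, the following is a minimal sketch of classical TOPSIS, not the "improved" variant the abstract refers to (whose details are not given in this record). It assumes every criterion is a benefit criterion (higher is better); the gene names, scores and equal weights are hypothetical illustration values.

```python
import math


def topsis(scores, weights):
    """Return closeness-to-ideal for each alternative (row) over benefit criteria (columns)."""
    n_criteria = len(weights)
    # Vector-normalize each column; fall back to 1.0 for an all-zero column.
    norms = [math.sqrt(sum(row[j] ** 2 for row in scores)) or 1.0
             for j in range(n_criteria)]
    weighted = [[weights[j] * row[j] / norms[j] for j in range(n_criteria)]
                for row in scores]
    # Ideal and anti-ideal points: best and worst weighted value per criterion.
    ideal = [max(row[j] for row in weighted) for j in range(n_criteria)]
    anti = [min(row[j] for row in weighted) for j in range(n_criteria)]
    closeness = []
    for row in weighted:
        d_best = math.dist(row, ideal)
        d_worst = math.dist(row, anti)
        total = d_best + d_worst
        closeness.append(d_worst / total if total else 0.0)
    return closeness


# Hypothetical per-gene evidence scores: expression, SNP genotype, eQTL.
genes = ["GENE_A", "GENE_B", "GENE_C"]
evidence = [[0.9, 0.8, 0.7], [0.2, 0.1, 0.3], [0.5, 0.9, 0.4]]
closeness = topsis(evidence, weights=[1 / 3, 1 / 3, 1 / 3])
ranking = [g for _, g in sorted(zip(closeness, genes), reverse=True)]
print(ranking)
```

    A gene that is strong on every resource sits near the ideal point and ranks first, while a gene that is weakest on every resource coincides with the anti-ideal point and receives a closeness of zero.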

  2. Exploring Germplasm Diversity to Understand the Domestication Process in Cicer spp. Using SNP and DArT Markers

    PubMed Central

    Roorkiwal, Manish; von Wettberg, Eric J.; Upadhyaya, Hari D.; Warschefsky, Emily; Rathore, Abhishek; Varshney, Rajeev K.

    2014-01-01

    To estimate genetic diversity within and between 10 interfertile Cicer species (94 genotypes) from the primary, secondary and tertiary gene pool, we analysed 5,257 DArT markers and 651 KASPar SNP markers. Based on successful allele calling in the tertiary gene pool, 2,763 DArT and 624 SNP markers that are polymorphic between genotypes from the gene pools were analyzed further. STRUCTURE analyses were consistent with 3 cultivated populations, representing kabuli, desi and pea-shaped seed types, with substantial admixture among these groups, while two wild populations were observed using DArT markers. AMOVA was used to partition variance among hierarchical sets of landraces and wild species at both the geographical and species level, with 61% of the variation found between species, and 39% within species. Molecular variance among the wild species was high (39%) compared to the variation present in cultivated material (10%). Observed heterozygosity was higher in wild species than the cultivated species for each linkage group. Our results support the Fertile Crescent both as the center of domestication and diversification of chickpea. The collection used in the present study covers all the three regions of historical chickpea cultivation, with the highest diversity in the Fertile Crescent region. Shared alleles between different gene pools suggest the possibility of gene flow among these species or incomplete lineage sorting and could indicate complicated patterns of divergence and fusion of wild chickpea taxa in the past. PMID:25010059

  3. A Meta-Analysis Strategy for Gene Prioritization Using Gene Expression, SNP Genotype, and eQTL Data

    PubMed Central

    2015-01-01

    In order to understand disease pathogenesis, improve medical diagnosis, or discover effective drug targets, it is important to identify significant genes deeply involved in human disease. For this purpose, many earlier approaches attempted to prioritize candidate genes using gene expression profiles or SNP genotype data, but they often suffer from producing many false-positive results. To address this issue, in this paper, we propose a meta-analysis strategy for gene prioritization that employs three different genetic resources—gene expression data, single nucleotide polymorphism (SNP) genotype data, and expression quantitative trait loci (eQTL) data—in an integrative manner. For integration, we utilized an improved technique for the order of preference by similarity to ideal solution (TOPSIS) to combine scores from distinct resources. This method was evaluated on two publicly available datasets regarding prostate cancer and lung cancer to identify disease-related genes. Consequently, our proposed strategy for gene prioritization showed its superiority to conventional methods in discovering significant disease-related genes with several types of genetic resources, while making good use of potential complementarities among available resources. PMID:25874220

  4. High-resolution SNP arrays in mental retardation diagnostics: how much do we gain?

    PubMed Central

    Bernardini, Laura; Alesi, Viola; Loddo, Sara; Novelli, Antonio; Bottillo, Irene; Battaglia, Agatino; Digilio, Maria Cristina; Zampino, Giuseppe; Ertel, Adam; Fortina, Paolo; Surrey, Saul; Dallapiccola, Bruno

    2010-01-01

    We used Affymetrix 6.0 GeneChip SNP arrays to characterize copy number variations (CNVs) in a cohort of 70 patients previously characterized on lower-density oligonucleotide arrays affected by idiopathic mental retardation and dysmorphic features. The SNP array platform includes ∼900 000 SNP probes and 900 000 non-SNP oligonucleotide probes at an average distance of 0.7 Kb, which facilitates coverage of the whole genome, including coding and noncoding regions. The high density of probes is critical for detecting small CNVs, but it can lead to data interpretation problems. To reduce the number of false positives, parameters were set to consider only imbalances >75 Kb encompassing at least 80 probe sets. The higher resolution of the SNP array platform confirmed the increased ability to detect small CNVs, although more than 80% of these CNVs overlapped with copy number ‘neutral’ polymorphism regions and 4.4% of them did not contain known genes. In our cohort of 70 patients, of the 51 previously evaluated as ‘normal’ on the Agilent 44K array, the SNP array platform disclosed six additional CNV changes, including three in three patients, which may be pathogenic. This suggests that about 6% of individuals classified as ‘normal’ using the lower-density oligonucleotide array could be found to be affected by a genomic disorder when evaluated with the higher-density microarray platforms. PMID:19809473
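
    The false-positive filter described in this record (imbalances larger than 75 Kb that span at least 80 probe sets) can be sketched as below. The `CnvCall` type, its field names and the example calls are hypothetical, and 1 Kb is assumed to mean 1,000 bp; the record does not describe the software actually used.

```python
from dataclasses import dataclass

MIN_LENGTH_BP = 75_000   # assumes 1 Kb = 1,000 bp
MIN_PROBE_SETS = 80


@dataclass
class CnvCall:
    chrom: str
    start: int          # start coordinate in bp
    end: int            # end coordinate in bp
    n_probe_sets: int   # probe sets covered by the call

    @property
    def length_bp(self) -> int:
        return self.end - self.start


def passes_filter(call: CnvCall) -> bool:
    """Keep calls strictly longer than 75 Kb that cover at least 80 probe sets."""
    return call.length_bp > MIN_LENGTH_BP and call.n_probe_sets >= MIN_PROBE_SETS


# Hypothetical calls, for illustration only.
calls = [
    CnvCall("chr1", 1_000_000, 1_120_000, 150),  # long and well covered
    CnvCall("chr2", 5_000_000, 5_050_000, 95),   # too short
    CnvCall("chr7", 2_000_000, 2_200_000, 40),   # too few probe sets
]
kept = [c for c in calls if passes_filter(c)]
print([c.chrom for c in kept])
```

    The "about 6%" figure in the abstract appears consistent with three patients carrying possibly pathogenic CNVs among the 51 previously classified as normal (3/51 ≈ 5.9%).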

  5. Construction of a versatile SNP array for pyramiding useful genes of rice.

    PubMed

    Kurokawa, Yusuke; Noda, Tomonori; Yamagata, Yoshiyuki; Angeles-Shim, Rosalyn; Sunohara, Hidehiko; Uehara, Kanako; Furuta, Tomoyuki; Nagai, Keisuke; Jena, Kshirod Kumar; Yasui, Hideshi; Yoshimura, Atsushi; Ashikari, Motoyuki; Doi, Kazuyuki

    2016-01-01

    DNA marker-assisted selection (MAS) has become an indispensable component of breeding. Single nucleotide polymorphisms (SNPs) are the most frequent polymorphism in the rice genome. However, SNP markers are not readily employed in MAS because of limitations in genotyping platforms. Here the authors report a Golden Gate SNP array that targets specific genes controlling yield-related traits and biotic stress resistance in rice. As a first step, the SNP genotypes were surveyed in 31 parental varieties using the Affymetrix Rice 44K SNP microarray. The haplotype information for 16 target genes was then converted to the Golden Gate platform with 143-plex markers. Haplotypes for the 14 useful alleles are unique and can discriminate among all other varieties. The genotyping consistency between the Affymetrix microarray and the Golden Gate array was 92.8%, and the accuracy of the Golden Gate array was confirmed in 3 F2 segregating populations. The concept of haplotype-based selection using the constructed SNP array was proven. PMID:26566831

  6. SNP Microarray in FISH Negative Clinically Suspected 22q11.2 Microdeletion Syndrome

    PubMed Central

    Halder, Ashutosh; Jain, Manish; Kalsi, Amanpreet Kaur

    2016-01-01

    The present study evaluated the role of SNP microarray in 101 cases of clinically suspected FISH negative (noninformative/normal) 22q11.2 microdeletion syndrome. SNP microarray was carried out using 300 K HumanCytoSNP-12 BeadChip array or CytoScan 750 K array. SNP microarray identified 8 cases of 22q11.2 microdeletions and/or microduplications in addition to cases of chromosomal abnormalities and other pathogenic/likely pathogenic CNVs. Clinically suspected specific deletions (22q11.2) were detectable in approximately 8% of cases by SNP microarray, mostly from FISH noninformative cases. This study also identified several LOH/AOH loci with known and well-defined UPD (uniparental disomy) disorders. In conclusion, this study suggests stricter clinical criteria for FISH analysis. However, if clinical criteria are few or doubtful, particularly in a newborn/neonate in intensive care, SNP microarray should be the first screening test to be ordered. FISH is an ideal test for detecting mosaicism, screening family members, and prenatal diagnosis in proven families. PMID:27051557

  7. Hybridization modeling of oligonucleotide SNP arrays for accurate DNA copy number estimation

    PubMed Central

    Wan, Lin; Sun, Kelian; Ding, Qi; Cui, Yuehua; Li, Ming; Wen, Yalu; Elston, Robert C.; Qian, Minping; Fu, Wenjiang J

    2009-01-01

    Affymetrix SNP arrays have been widely used for single-nucleotide polymorphism (SNP) genotype calling and DNA copy number variation inference. Although numerous methods have achieved high accuracy in these fields, most studies have paid little attention to the modeling of hybridization of probes to off-target allele sequences, which can affect the accuracy greatly. In this study, we address this issue and demonstrate that hybridization with mismatch nucleotides (HWMMN) occurs in all SNP probe-sets and has a critical effect on the estimation of allelic concentrations (ACs). We study sequence binding through binding free energy and then binding affinity, and develop a probe intensity composite representation (PICR) model. The PICR model allows the estimation of ACs at a given SNP through statistical regression. Furthermore, we demonstrate with cell-line data of known true copy numbers that the PICR model can achieve reasonable accuracy in copy number estimation at a single SNP locus, by using the ratio of the estimated AC of each sample to that of the reference sample, and can reveal subtle genotype structure of SNPs at abnormal loci. We also demonstrate with HapMap data that the PICR model yields accurate SNP genotype calls consistently across samples, laboratories and even across array platforms. PMID:19586935

  8. SNP Microarray in FISH Negative Clinically Suspected 22q11.2 Microdeletion Syndrome.

    PubMed

    Halder, Ashutosh; Jain, Manish; Kalsi, Amanpreet Kaur

    2016-01-01

    The present study evaluated the role of SNP microarray in 101 cases of clinically suspected FISH negative (noninformative/normal) 22q11.2 microdeletion syndrome. SNP microarray was carried out using 300 K HumanCytoSNP-12 BeadChip array or CytoScan 750 K array. SNP microarray identified 8 cases of 22q11.2 microdeletions and/or microduplications in addition to cases of chromosomal abnormalities and other pathogenic/likely pathogenic CNVs. Clinically suspected specific deletions (22q11.2) were detectable in approximately 8% of cases by SNP microarray, mostly from FISH noninformative cases. This study also identified several LOH/AOH loci with known and well-defined UPD (uniparental disomy) disorders. In conclusion, this study suggests stricter clinical criteria for FISH analysis. However, if clinical criteria are few or doubtful, particularly in a newborn/neonate in intensive care, SNP microarray should be the first screening test to be ordered. FISH is an ideal test for detecting mosaicism, screening family members, and prenatal diagnosis in proven families. PMID:27051557

  9. Haplotype block partitioning as a tool for dimensionality reduction in SNP association studies

    PubMed Central

    Pattaro, Cristian; Ruczinski, Ingo; Fallin, Danièle M; Parmigiani, Giovanni

    2008-01-01

    Background Identification of disease-related genes in association studies is challenged by the large number of SNPs typed. To address the dilution of power caused by high dimensionality, and to generate results that are biologically interpretable, it is critical to take into consideration spatial correlation of SNPs along the genome. With the goal of identifying true genetic associations, partitioning the genome according to spatial correlation can be a powerful and meaningful way to address this dimensionality problem. Results We developed and validated an MCMC Algorithm To Identify blocks of Linkage DisEquilibrium (MATILDE) for clustering contiguous SNPs, and a statistical testing framework to detect association using partitions as units of analysis. We compared its ability to detect true SNP associations to that of the most commonly used algorithm for block partitioning, as implemented in the Haploview and HapBlock software. Simulations were based on artificially assigning phenotypes to individuals with SNPs corresponding to region 14q11 of the HapMap database. When block partitioning is performed using MATILDE, the ability to correctly identify a disease SNP is higher, especially for small effects, than it is with the alternatives considered. Advantages can be both in terms of true positive findings and limiting the number of false discoveries. Finer partitions provided by LD-based methods or by marker-by-marker analysis are efficient only for detecting big effects, or in presence of large sample sizes. The probabilistic approach we propose offers several additional advantages, including: a) adapting the estimation of blocks to the population, technology, and sample size of the study; b) probabilistic assessment of uncertainty about block boundaries and about whether any two SNPs are in the same block; c) user selection of the probability threshold for assigning SNPs to the same block. 
    Conclusion We demonstrate that, in realistic scenarios, our adaptive, study

  10. Family-Based Multi-SNP X Chromosome Analysis Using Parent Information.

    PubMed

    Wise, Alison S; Shi, Min; Weinberg, Clarice R

    2016-01-01

    We propose a method for association analysis of haplotypes on the X chromosome that offers both improved power and robustness to population stratification in studies of affected offspring and their parents if all three have been genotyped. The method makes use of assumed parental haplotype exchangeability (PHE), a weaker assumption than Hardy-Weinberg equilibrium (HWE). PHE requires that in the source population, of the three X chromosome haplotypes carried by the two parents, each is equally likely to be carried by the father. We propose a pseudo-sibling approach that exploits that exchangeability assumption. Our method extends the single-SNP PIX-LRT method to multiple SNPs in a high linkage block. We describe methods for testing the PHE assumption and also for determining how apparent violations can be distinguished from true fetal effects or maternally-mediated effects. We show results of simulations that demonstrate nominal type I error rate and good power. The methods are then applied to dbGaP data on the birth defect oral cleft, using both Asian and Caucasian families with cleft. PMID:26941777

  11. Sensitive DNA detection and SNP discrimination using ultrabright SERS nanorattles and magnetic beads for malaria diagnostics.

    PubMed

    Ngo, Hoan T; Gandra, Naveen; Fales, Andrew M; Taylor, Steve M; Vo-Dinh, Tuan

    2016-07-15

    One of the major obstacles to implementing nucleic acid-based molecular diagnostics at the point-of-care (POC) and in resource-limited settings is the lack of sensitive and practical DNA detection methods that can be seamlessly integrated into portable platforms. Herein we present a sensitive yet simple DNA detection method using a surface-enhanced Raman scattering (SERS) nanoplatform: the ultrabright SERS nanorattle. The method, referred to as the nanorattle-based method, involves sandwich hybridization of magnetic beads that are loaded with capture probes, target sequences, and ultrabright SERS nanorattles that are loaded with reporter probes. Upon hybridization, a magnet is applied to concentrate the hybridization sandwiches at a detection spot for SERS measurements. The ultrabright SERS nanorattles, composed of a core and a shell with resonance Raman reporters loaded in the gap space between the core and the shell, serve as SERS tags for signal detection. Using this method, a specific DNA sequence of the malaria parasite Plasmodium falciparum could be detected with a detection limit of approximately 100 attomoles. Single nucleotide polymorphism (SNP) discrimination of wild type malaria DNA and mutant malaria DNA, which confers resistance to artemisinin drugs, was also demonstrated. These test models demonstrate the molecular diagnostic potential of the nanorattle-based method to both detect and genotype infectious pathogens. Furthermore, the method's simplicity makes it a suitable candidate for integration into portable platforms for POC applications and for use in resource-limited settings. PMID:26913502

  12. A genome-wide SNP-based phylogenetic analysis distinguishes different biovars of Brucella suis.

    PubMed

    Sankarasubramanian, Jagadesan; Vishnu, Udayakumar S; Gunasekaran, Paramasamy; Rajendhran, Jeyaprakash

    2016-07-01

    Brucellosis is an important zoonotic disease caused by Brucella spp. Brucella suis is the etiological agent of porcine brucellosis. B. suis is the most genetically diverged species within the genus Brucella. We present the first large-scale B. suis phylogenetic analysis based on an alignment-free k-mer approach of gathering polymorphic sites from whole genome sequences. The genome-wide core-SNP-based phylogenetic tree clearly separated the B. suis biovars and the vaccine strain into different clades. A total of 16,756 SNPs were identified from the genome sequences of 54 B. suis strains. Also, biovar-specific SNPs were identified. The vaccine strain B. suis S2-30, which is extensively used in China, was discriminated from all biovars by having accumulated the highest number of SNPs. We have also identified the SNPs between B. suis vaccine strain S2-30 and its closest homolog, B. suis biovar 513UK. The highest number of mutations (22) was observed in the phosphomannomutase (pmm) gene essential for the synthesis of O-antigen. Also, mutations were identified in several virulence genes, including genes coding for the type IV secretion system and the effector proteins, which could be responsible for the attenuated virulence of B. suis S2-30. PMID:27085292

  13. Family-Based Multi-SNP X Chromosome Analysis Using Parent Information

    PubMed Central

    Wise, Alison S.; Shi, Min; Weinberg, Clarice R.

    2016-01-01

    We propose a method for association analysis of haplotypes on the X chromosome that offers both improved power and robustness to population stratification in studies of affected offspring and their parents if all three have been genotyped. The method makes use of assumed parental haplotype exchangeability (PHE), a weaker assumption than Hardy-Weinberg equilibrium (HWE). PHE requires that in the source population, of the three X chromosome haplotypes carried by the two parents, each is equally likely to be carried by the father. We propose a pseudo-sibling approach that exploits that exchangeability assumption. Our method extends the single-SNP PIX-LRT method to multiple SNPs in a high linkage block. We describe methods for testing the PHE assumption and also for determining how apparent violations can be distinguished from true fetal effects or maternally-mediated effects. We show results of simulations that demonstrate nominal type I error rate and good power. The methods are then applied to dbGaP data on the birth defect oral cleft, using both Asian and Caucasian families with cleft. PMID:26941777

  14. On the use of dense SNP marker data for the identification of distant relative pairs.

    PubMed

    Sun, M; Jobling, M A; Taliun, D; Pramstaller, P P; Egeland, T; Sheehan, N A

    2016-02-01

    There has been recent interest in the exploitation of readily available dense genome scan marker data for the identification of relatives. However, there are conflicting findings on how informative these data are in practical situations and, in particular, sets of thinned markers are often used with no concrete justification for the chosen spacing. We explore the potential usefulness of dense single nucleotide polymorphism (SNP) arrays for this application with a focus on inferring distant relative pairs. We distinguish between relationship estimation, as defined by a pedigree connecting the two individuals of interest, and estimation of general relatedness as would be provided by a kinship coefficient or a coefficient of relatedness. Since our primary interest is in the former case, we adopt a pedigree likelihood approach. We consider the effect of additional SNPs and data on an additional typed relative, together with choice of that relative, on relationship inference. We also consider the effect of linkage disequilibrium. When overall relatedness, rather than the specific relationship, would suffice, we propose an approximate approach that is easy to implement and appears to compete well with a popular moment-based estimator and a recent maximum likelihood approach based on chromosomal sharing. We conclude that denser marker data are more informative for distant relatives. However, linkage disequilibrium cannot be ignored and will be the main limiting factor for applications to real data. PMID:26474828

  15. Identification of Laying-Related SNP Markers in Geese Using RAD Sequencing

    PubMed Central

    Yu, ShiGang; Chu, WeiWei; Zhang, LiFan; Han, HouMing; Zhao, RongXue; Wu, Wei; Zhu, JiangNing; Dodson, Michael V.; Wei, Wei; Liu, HongLin; Chen, Jie

    2015-01-01

    Laying performance is an important economic trait of goose production. As laying performance is of low heritability, it is of significance to develop a marker-assisted selection (MAS) strategy for this trait. Definition of sequence variation related to the target trait is a prerequisite of quantitating MAS, but little is presently known about the goose genome, which greatly hinders the identification of genetic markers for the laying traits of geese. Recently developed restriction site-associated DNA (RAD) sequencing is a possible approach for discerning large-scale single nucleotide polymorphism (SNP) and reducing the complexity of a genome without having reference genomic information available. In the present study, we developed a pooled RAD sequencing strategy for detecting geese laying-related SNP. Two DNA pools were constructed, each consisting of equal amounts of genomic DNA from 10 individuals with either high estimated breeding value (HEBV) or low estimated breeding value (LEBV). A total of 139,013 SNP were obtained from 42,291,356 sequences, of which 18,771,943 were for LEBV and 23,519,413 were for HEBV cohorts. Fifty-five SNP which had different allelic frequencies in the two DNA pools were further validated by individual-based AS-PCR genotyping in the LEBV and HEBV cohorts. Ten out of 55 SNP exhibited distinct allele distributions in these two cohorts. These 10 SNP were further genotyped in a goose population of 492 geese to verify the association with egg numbers. The result showed that 8 of 10 SNP were associated with egg numbers. Additionally, linear regression analysis revealed that SNP Record-111407, 106975 and 112359 were involved in a multiple-gene network affecting laying performance. We used IPCR to extend the unknown regions flanking the candidate RAD tags. The obtained sequences were subjected to BLAST to retrieve the orthologous genes in either ducks or chickens. Five novel genes were cloned for geese which harbored the candidate laying

  16. Optimal Design of Low-Density SNP Arrays for Genomic Prediction: Algorithm and Applications.

    PubMed

    Wu, Xiao-Lin; Xu, Jiaqi; Feng, Guofei; Wiggans, George R; Taylor, Jeremy F; He, Jun; Qian, Changsong; Qiu, Jiansheng; Simpson, Barry; Walker, Jeremy; Bauck, Stewart

    2016-01-01

    Low-density (LD) single nucleotide polymorphism (SNP) arrays provide a cost-effective solution for genomic prediction and selection, but algorithms and computational tools are needed for the optimal design of LD SNP chips. A multiple-objective, local optimization (MOLO) algorithm was developed for design of optimal LD SNP chips that can be imputed accurately to medium-density (MD) or high-density (HD) SNP genotypes for genomic prediction. The objective function facilitates maximization of non-gap map length and system information for the SNP chip, and the latter is computed either as locus-averaged (LASE) or haplotype-averaged Shannon entropy (HASE) and adjusted for uniformity of the SNP distribution. HASE performed better than LASE with ≤1,000 SNPs, but required considerably more computing time. Nevertheless, the differences diminished when >5,000 SNPs were selected. Optimization was accomplished conditionally on the presence of SNPs that were obligated to each chromosome. The frame location of SNPs on a chip can be either uniform (evenly spaced) or non-uniform. For the latter design, a tunable empirical Beta distribution was used to guide location distribution of frame SNPs such that both ends of each chromosome were enriched with SNPs. The SNP distribution on each chromosome was finalized through the objective function that was locally and empirically maximized. This MOLO algorithm was capable of selecting a set of approximately evenly-spaced and highly-informative SNPs, which in turn led to increased imputation accuracy compared with selection solely of evenly-spaced SNPs. Imputation accuracy increased with LD chip size, and imputation error rate was extremely low for chips with ≥3,000 SNPs. Assuming that genotyping or imputation error occurs at random, imputation error rate can be viewed as the upper limit for genomic prediction error. Our results show that about 25% of imputation error rate was propagated to genomic prediction in an Angus population. The

  17. Identification of Laying-Related SNP Markers in Geese Using RAD Sequencing.

    PubMed

    Yu, ShiGang; Chu, WeiWei; Zhang, LiFan; Han, HouMing; Zhao, RongXue; Wu, Wei; Zhu, JiangNing; Dodson, Michael V; Wei, Wei; Liu, HongLin; Chen, Jie

    2015-01-01

    Laying performance is an important economic trait of goose production. As laying performance is of low heritability, it is of significance to develop a marker-assisted selection (MAS) strategy for this trait. Definition of sequence variation related to the target trait is a prerequisite of quantitating MAS, but little is presently known about the goose genome, which greatly hinders the identification of genetic markers for the laying traits of geese. Recently developed restriction site-associated DNA (RAD) sequencing is a possible approach for discerning large-scale single nucleotide polymorphism (SNP) and reducing the complexity of a genome without having reference genomic information available. In the present study, we developed a pooled RAD sequencing strategy for detecting geese laying-related SNP. Two DNA pools were constructed, each consisting of equal amounts of genomic DNA from 10 individuals with either high estimated breeding value (HEBV) or low estimated breeding value (LEBV). A total of 139,013 SNP were obtained from 42,291,356 sequences, of which 18,771,943 were for LEBV and 23,519,413 were for HEBV cohorts. Fifty-five SNP which had different allelic frequencies in the two DNA pools were further validated by individual-based AS-PCR genotyping in the LEBV and HEBV cohorts. Ten out of 55 SNP exhibited distinct allele distributions in these two cohorts. These 10 SNP were further genotyped in a goose population of 492 geese to verify the association with egg numbers. The result showed that 8 of 10 SNP were associated with egg numbers. Additionally, linear regression analysis revealed that SNP Record-111407, 106975 and 112359 were involved in a multiple-gene network affecting laying performance. We used IPCR to extend the unknown regions flanking the candidate RAD tags. The obtained sequences were subjected to BLAST to retrieve the orthologous genes in either ducks or chickens. Five novel genes were cloned for geese which harbored the candidate laying

  18. Generation of SNP datasets for orangutan population genomics using improved reduced-representation sequencing and direct comparisons of SNP calling algorithms

    PubMed Central

    2014-01-01

    Background High-throughput sequencing has opened up exciting possibilities in population and conservation genetics by enabling the assessment of genetic variation at genome-wide scales. One approach to reduce genome complexity, i.e. investigating only parts of the genome, is reduced-representation library (RRL) sequencing. Like similar approaches, RRL sequencing reduces ascertainment bias due to simultaneous discovery and genotyping of single-nucleotide polymorphisms (SNPs) and does not require reference genomes. Yet, generating such datasets remains challenging due to laboratory and bioinformatical issues. In the laboratory, current protocols require improvements with regards to sequencing homologous fragments to reduce the number of missing genotypes. From the bioinformatical perspective, the reliance of most studies on a single SNP caller disregards the possibility that different algorithms may produce disparate SNP datasets. Results We present an improved RRL (iRRL) protocol that maximizes the generation of homologous DNA sequences, thus achieving improved genotyping-by-sequencing efficiency. Our modifications facilitate generation of single-sample libraries, enabling individual genotype assignments instead of pooled-sample analysis. We sequenced ~1% of the orangutan genome with 41-fold median coverage in 31 wild-born individuals from two populations. SNPs and genotypes were called using three different algorithms. We obtained substantially different SNP datasets depending on the SNP caller. Genotype validations revealed that the Unified Genotyper of the Genome Analysis Toolkit and SAMtools performed significantly better than a caller from CLC Genomics Workbench (CLC). Of all conflicting genotype calls, CLC was only correct in 17% of the cases. Furthermore, conflicting genotypes between two algorithms showed a systematic bias in that one caller almost exclusively assigned heterozygotes, while the other one almost exclusively assigned homozygotes. Conclusions

  19. Application of Multi-SNP Approaches Bayesian LASSO and AUC-RF to Detect Main Effects of Inflammatory-Gene Variants Associated with Bladder Cancer Risk

    PubMed Central

    Calle, M. Luz; Rothman, Nathaniel; Urrea, Víctor; Kogevinas, Manolis; Petrus, Sandra; Chanock, Stephen J.; Tardón, Adonina; García-Closas, Montserrat; González-Neira, Anna; Vellalta, Gemma; Carrato, Alfredo; Navarro, Arcadi; Lorente-Galdós, Belén; Silverman, Debra T.; Real, Francisco X.; Wu, Xifeng; Malats, Núria

    2013-01-01

    The relationship between inflammation and cancer is well established in several tumor types, including bladder cancer. We performed an association study between 886 inflammatory-gene variants and bladder cancer risk in 1,047 cases and 988 controls from the Spanish Bladder Cancer (SBC)/EPICURO Study. A preliminary exploration with the widely used univariate logistic regression approach did not identify any significant SNP after correcting for multiple testing. We further applied two more comprehensive methods to capture the complexity of bladder cancer genetic susceptibility: Bayesian Threshold LASSO (BTL), a regularized regression method, and AUC-Random Forest, a machine-learning algorithm. Both approaches explore the joint effect of markers. BTL analysis identified a signature of 37 SNPs in 34 genes showing an association with bladder cancer. AUC-RF detected an optimal predictive subset of 56 SNPs. Thirteen SNPs were identified by both methods in the total population. Using resources from the Texas Bladder Cancer study we were able to replicate 30% of the SNPs assessed. The associations between inflammatory SNPs and bladder cancer were reexamined among non-smokers to eliminate the effect of tobacco, one of the strongest and most prevalent environmental risk factors for this tumor. A 9-SNP signature was detected by BTL. Here we report, for the first time, a set of SNPs in inflammatory genes jointly associated with bladder cancer risk. These results highlight the importance of the complex structure of genetic susceptibility associated with cancer risk. PMID:24391818

  20. The Swiss-Prot variant page and the ModSNP database: a resource for sequence and structure information on human protein variants.

    PubMed

    Yip, Yum L; Scheib, Holger; Diemand, Alexander V; Gattiker, Alexandre; Famiglietti, Livia M; Gasteiger, Elisabeth; Bairoch, Amos

    2004-05-01

    Missense mutation leading to single amino acid polymorphism (SAP) is the type of mutation most frequently related to human diseases. The Swiss-Prot protein knowledgebase records information on such mutations in various sections of a protein entry, namely in the "feature," "comment," and "reference" fields. To facilitate users in obtaining the most relevant information about each human SAP recorded in the knowledgebase, the Swiss-Prot Variant web pages were created to provide a summary of available sequence information, as well as additional structural information on each variant. In particular, the ModSNP database was set up to store information related to SAPs and to manage the modeling of SAPs onto protein structures via an automatic homology modeling pipeline. Currently, among the 16,566 human SAPs recorded in the Swiss-Prot knowledgebase (release 42.5, 21 November 2003), more than 25% have corresponding 3D-models. Of these variants, 47% are related to disease, 26% are polymorphisms, and 27% are not yet clearly classified. The ModSNP database is updated and the subsequent model construction pipeline is launched with each weekly Swiss-Prot release. Thus, the ModSNP database represents a valuable resource for the structural analysis of protein variation. The Swiss-Prot variant pages are accessible from the NiceProt view of a Swiss-Prot entry on the ExPASy server (www.expasy.org/), via a hyperlink created for the stable and unique identifier FTId of each human SAP. PMID:15108278

  1. SNP Design from 454 Sequencing of Podosphaera plantaginis Transcriptome Reveals a Genetically Diverse Pathogen Metapopulation with High Levels of Mixed-Genotype Infection

    PubMed Central

    Tollenaere, Charlotte; Susi, Hanna; Nokso-Koivisto, Jussi; Koskinen, Patrik; Tack, Ayco; Auvinen, Petri; Paulin, Lars; Frilander, Mikko J.; Lehtonen, Rainer; Laine, Anna-Liisa

    2012-01-01

    Background Molecular tools may greatly improve our understanding of pathogen evolution and epidemiology but technical constraints have hindered the development of genetic resources for parasites compared to free-living organisms. This study aims at developing molecular tools for Podosphaera plantaginis, an obligate fungal pathogen of Plantago lanceolata. This interaction has been intensively studied in the Åland archipelago of Finland with epidemiological data collected from over 4,000 host populations annually since year 2001. Principal Findings A cDNA library of a pooled sample of fungal conidia was sequenced on the 454 GS-FLX platform. Over 549,411 reads were obtained and annotated into 45,245 contigs. Annotation data was acquired for 65.2% of the assembled sequences. The transcriptome assembly was screened for SNP loci, as well as for functionally important genes (mating-type genes and potential effector proteins). A genotyping assay of 27 SNP loci was designed and tested on 380 infected leaf samples from 80 populations within the Åland archipelago. With this panel we identified 85 multilocus genotypes (MLG) with uneven frequencies across the pathogen metapopulation. Approximately half of the sampled populations contain polymorphism. Our genotyping protocol revealed mixed-genotype infection within a single host leaf to be common. Mixed infection has been proposed as one of the main drivers of pathogen evolution, and hence may be an important process in this pathosystem. Significance The developed SNP panel offers exciting research perspectives for future studies in this well-characterized pathosystem. Also, the transcriptome provides an invaluable novel genomic resource for powdery mildews, which cause significant yield losses on commercially important crops annually. 
    Furthermore, the features that render genetic studies in this system a challenge are shared with the majority of obligate parasitic species, and hence our results provide methodological insights

  2. AMY-tree: an algorithm to use whole genome SNP calling for Y chromosomal phylogenetic applications

    PubMed Central

    2013-01-01

    Background Due to the rapid progress of next-generation sequencing (NGS) facilities, an explosion of human whole genome data will become available in the coming years. These data can be used to optimize and to increase the resolution of the phylogenetic Y chromosomal tree. Moreover, the exponential growth of known Y chromosomal lineages will require an automatic determination of the phylogenetic position of an individual based on whole genome SNP calling data and an up-to-date Y chromosomal tree. Results We present an automated approach, ‘AMY-tree’, which is able to determine the phylogenetic position of a Y chromosome using a whole genome SNP profile, independently of the NGS platform and SNP calling program, whereby mistakes in the SNP calling or phylogenetic Y chromosomal tree are taken into account. Moreover, AMY-tree indicates ambiguities within the present phylogenetic tree and points out new Y-SNPs which may be phylogenetically relevant. The AMY-tree software package was validated successfully on 118 whole genome SNP profiles of 109 males with different origins. Moreover, support was found for an unknown recurrent mutation, wrongly reported mutation conversions and a large number of new, interesting Y-SNPs. Conclusions Therefore, AMY-tree is a useful tool to determine the Y lineage of a sample based on SNP calling, to identify Y-SNPs with yet unknown phylogenetic position and to optimize the Y chromosomal phylogenetic tree in the future. AMY-tree will not add lineages to the existing phylogenetic tree of the Y-chromosome but it is the first step to analyse whole genome SNP profiles in a phylogenetic framework. PMID:23405914

  3. Is MDM2 SNP309 Variation a Risk Factor for Head and Neck Carcinoma?

    PubMed Central

    Zhuo, Xianlu; Ye, Huiping; Li, Qi; Xiang, Zhaolan; Zhang, Xueyuan

    2016-01-01

    Murine double minute-2 (MDM2) is a negative regulator of P53, and its T309G polymorphism has been suggested as a risk factor for a variety of cancers. Increasing evidence has shown the association of MDM2 T309G polymorphism with head and neck carcinoma (HNC) risk. However, the results are inconsistent. Thus, we performed a meta-analysis to elucidate the association. The meta-analysis retrieved studies published up to August 2015, and essential information was extracted for analysis. Separate analyses on ethnicity, source of controls, sample size, detection method, and cancer types were also conducted. Odds ratios (ORs) and their 95% confidence intervals (CIs) were used to estimate the association. Pooled data from 16 case–control studies including 4625 cases and 6927 controls failed to indicate a significant association. However, in the subgroup analysis of sample sizes, an increased risk was observed in the largest sample size group (>1000) under a recessive model (OR = 1.52; 95% CI = 1.08–2.13). Increased risks were also found in the nasopharyngeal cancer in the subgroup analysis of cancer types (GG vs TT: OR = 2.07; 95% CI = 1.38–3.12; dominant model: OR = 1.48; 95% CI = 1.13–1.93; recessive model: OR = 1.76; 95% CI = 1.17–2.65). The results suggest that homozygote GG alleles of MDM2 SNP309 may be a low-penetrant risk factor for HNC, and G allele may confer nasopharyngeal cancer susceptibility. PMID:26945408
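
    The estimates in this record are reported as odds ratios with 95% confidence intervals. As a minimal sketch of how one such estimate can be formed for a single case–control comparison, the snippet below applies the standard log (Woolf) method to a 2×2 table. The function name and the counts are hypothetical and are not taken from the meta-analysis, which additionally pools estimates across its 16 studies.

```python
import math


def odds_ratio_ci(case_exposed, case_unexposed, ctrl_exposed, ctrl_unexposed, z=1.96):
    """Return (OR, lower, upper) for a 2x2 table using the log (Woolf) method.

    z = 1.96 corresponds to a two-sided 95% confidence interval.
    All four cell counts must be positive.
    """
    a, b, c, d = case_exposed, case_unexposed, ctrl_exposed, ctrl_unexposed
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) is the square root of the summed reciprocal counts.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts (illustration only): carriers vs non-carriers of a genotype
# among 100 cases and 100 controls.
or_, lo, hi = odds_ratio_ci(20, 80, 10, 90)
print(f"OR = {or_:.2f}; 95% CI = {lo:.2f}-{hi:.2f}")
```

    An interval that includes 1, as in this made-up table, is conventionally read as no significant association at the 5% level.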

  4. MDM2 SNP309 promoter polymorphism confers risk for hereditary melanoma.

    PubMed

    Thunell, Lena K; Bivik, Cecilia; Wäster, Petra; Fredrikson, Mats; Stjernström, Annika; Synnerstad, Ingrid; Rosdahl, Inger; Enerbäck, Charlotta

    2014-06-01

    The p53 pathway regulates stress response, and variations in p53, MDM2, and MDM4 may predispose an individual to tumor development. The aim of this study was to investigate the impact of genetic variation on sporadic and hereditary melanoma. We have analyzed a combination of three functionally relevant variants of the p53 pathway in 258 individuals with sporadic malignant melanomas, 50 with hereditary malignant melanomas, and 799 healthy controls. Genotyping was performed by PCR-restriction fragment length polymorphism, pyrosequencing, and allelic discrimination. We found an increased risk for hereditary melanoma in MDM2 GG homozygotes, which was more pronounced among women (P=0.035). In the event of pairwise combinations of the single nucleotide polymorphisms, a risk elevation was shown for MDM2 GG homozygotes/p53 wild-type Arg in hereditary melanoma (P=0.01). Individuals with sporadic melanomas of the superficial spreading type, including melanoma in situ, showed a slightly higher frequency of the MDM2 GG genotype compared with those with nodular melanomas (P=0.04). The dysplastic nevus phenotype, present in the majority of our hereditary melanoma cases and also in some sporadic cases, further enhanced the effect of the MDM2 GG genotype on melanoma risk (P=0.005). In conclusion, the results show an association between MDM2 SNP309 and an increased risk for hereditary melanoma, especially among women. Analysis of sporadic melanoma also shows an association between MDM2 and the superficial spreading melanoma subtype, as well as an association with the presence of dysplastic nevi in sporadic melanoma. PMID:24625390

  5. A SNP panel for identity and kinship testing using massive parallel sequencing.

    PubMed

    Grandell, Ida; Samara, Raed; Tillmar, Andreas O

    2016-07-01

    Within forensic genetics, there is still a need for supplementary DNA marker typing in order to increase the power to solve cases for both identity testing and complex kinship issues. One major disadvantage with current capillary electrophoresis (CE) methods is the limitation in DNA marker multiplex capability. By utilizing massive parallel sequencing (MPS) technology, this capability can, however, be increased. We have designed a customized GeneRead DNASeq SNP panel (Qiagen) of 140 previously published autosomal forensically relevant identity SNPs for analysis using MPS. One single amplification step was followed by library preparation using the GeneRead Library Prep workflow (Qiagen). The sequencing was performed on a MiSeq System (Illumina), and the bioinformatic analyses were done using the software Biomedical Genomics Workbench (CLC Bio, Qiagen). Forty-nine individuals from a Swedish population were genotyped in order to establish genotype frequencies and to evaluate the performance of the assay. The analyses showed balanced coverage among the included loci, and the heterozygote balance showed fewer than 0.5% outliers. Analyses of dilution series of the 2800M Control DNA gave reproducible results down to 0.2 ng DNA input. In addition, typing of FTA samples and bone samples was performed with promising results. Further studies and optimizations are, however, required for a more detailed evaluation of the performance of degraded and PCR-inhibited forensic samples. In summary, the assay offers a straightforward sample-to-genotype workflow and could be useful to gain information in forensic casework, for both identity testing and in order to solve complex kinship issues. PMID:26932869

  6. The extent of linkage disequilibrium in beef cattle breeds using high-density SNP genotypes

    PubMed Central

    2014-01-01

    Background The extent of linkage disequilibrium (LD) between molecular markers impacts genome-wide association studies and implementation of genomic selection. The availability of high-density single nucleotide polymorphism (SNP) genotyping platforms makes it possible to investigate LD at an unprecedented resolution. In this work, we characterised LD decay in breeds of beef cattle of taurine, indicine and composite origins and explored its variation across autosomes and the X chromosome. Findings In each breed, LD decayed rapidly and r² was less than 0.2 for marker pairs separated by 50 kb. The LD decay curves clustered into three groups of similar LD decay that distinguished the three main cattle types. At short distances between markers (< 10 kb), taurine breeds showed higher LD (r² = 0.45) than their indicine (r² = 0.25) and composite (r² = 0.32) counterparts. This higher LD in taurine breeds was attributed to a smaller effective population size and a stronger bottleneck during breed formation. Using all SNPs on only the X chromosome, the three cattle types could still be distinguished. However, for taurine breeds, the LD decay on the X chromosome was much faster and the background level much lower than for indicine breeds and composite populations. When using only SNPs that were polymorphic in all breeds, the analysis of the X chromosome mimicked that of the autosomes. Conclusions The pattern of LD mirrored some aspects of the history of breed populations and showed a sharp decay with increasing physical distance between markers. We conclude that the HD chip can be used to detect association signals that remained hidden when using lower density genotyping platforms, since LD dropped below 0.2 at distances of 50 kb. PMID:24661366
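
    The squared-correlation LD measure reported in this record can be illustrated with a short worked example. The sketch below assumes phased, biallelic haplotypes coded 0/1 at two loci; the study itself worked from high-density SNP genotypes and its exact estimation procedure is not given in the abstract, so the function name and the data are hypothetical.

```python
def ld_r_squared(haplotypes):
    """Squared correlation between alleles at two biallelic loci.

    haplotypes: iterable of (allele_at_locus_1, allele_at_locus_2) pairs,
    each allele coded 0 or 1. Raises ValueError if either locus is monomorphic,
    because the correlation is undefined in that case.
    """
    haplotypes = list(haplotypes)
    n = len(haplotypes)
    if n == 0:
        raise ValueError("no haplotypes supplied")
    p_a = sum(a for a, _ in haplotypes) / n        # frequency of allele 1 at locus 1
    p_b = sum(b for _, b in haplotypes) / n        # frequency of allele 1 at locus 2
    p_ab = sum(a * b for a, b in haplotypes) / n   # frequency of the 1-1 haplotype
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r squared is undefined when a locus is monomorphic")
    d = p_ab - p_a * p_b                           # disequilibrium coefficient D
    return d * d / denom


# Hypothetical data: complete LD versus no LD.
print(ld_r_squared([(1, 1), (1, 1), (0, 0), (0, 0)]))  # alleles always co-occur
print(ld_r_squared([(1, 1), (1, 0), (0, 1), (0, 0)]))  # alleles independent
```

    An LD decay curve of the kind described in the record is typically built by computing this quantity for many marker pairs and averaging it within bins of physical distance.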

  7. SSCP-SNP in pearl millet--a new marker system for comparative genetics.

    PubMed

    Bertin, I; Zhu, J H; Gale, M D

    2005-05-01

    A considerable array of genomic resources are in place in pearl millet, and marker-aided selection is already in use in the public breeding programme at ICRISAT. This paper describes experiments to extend these publicly available resources to a single nucleotide polymorphism (SNP)-based marker system. A new marker system, single-strand conformational polymorphism (SSCP)-SNP, was developed using annotated rice genomic sequences to initially predict the intron-exon borders in millet expressed sequence tags (ESTs) and then to design primers that would amplify across the introns. An adequate supply of millet ESTs was available for us to identify 299 homologues of single-copy rice genes in which the intron positions could be precisely predicted. PCR primers were then designed to amplify approximately 500-bp genomic fragments containing introns. Analysis of these fragments on SSCP gels revealed considerable polymorphism. A detailed DNA sequence analysis of variation at four of the SSCP-SNP loci over a panel of eight inbred genotypes showed complex patterns of variation, with about one SNP or indel (insertion-deletion) every 59 bp in the introns, but considerably fewer in the exons. About two-thirds of the variation was derived from SNPs and one-third from indels. Most haplotypes were detected by SSCP. As a marker system, SSCP-SNP has lower development costs than simple sequence repeats (SSRs), because much of the work is in silico, and similar deployment costs and through-put potential. The rates of polymorphism were lower but useable, with a mean PIC of 0.49 relative to 0.72 for SSRs in our eight inbred genotype panel screen. The major advantage of the system is in comparative applications. Syntenic information can be used to target SSCP-SNP markers to specific chromosomal regions or, conversely, SSCP-SNP markers can be used to unravel detailed syntenic relationships in specific parts of the genome. 
Finally, a preliminary analysis showed that the millet SSCP-SNP primers
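
    The abstract above compares marker systems by their mean polymorphism information content (PIC). As a rough illustration only, the sketch below computes PIC from allele (or haplotype) frequencies using the commonly cited Botstein et al. form, PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2. Which PIC definition the authors used is not stated in this record, and the function name `pic` and the example frequencies are assumptions made for illustration.

```python
def pic(allele_freqs):
    """Polymorphism information content for one locus (Botstein-style form).

    allele_freqs: allele or haplotype frequencies that sum to 1.
    """
    if any(p < 0 for p in allele_freqs) or abs(sum(allele_freqs) - 1.0) > 1e-9:
        raise ValueError("frequencies must be non-negative and sum to 1")
    squares = [p * p for p in allele_freqs]
    # Cross term: sum over all pairs i < j of 2 * p_i^2 * p_j^2
    cross = sum(
        2 * squares[i] * squares[j]
        for i in range(len(squares))
        for j in range(i + 1, len(squares))
    )
    return 1.0 - sum(squares) - cross


# Hypothetical loci: a balanced biallelic marker and a four-haplotype marker.
biallelic = pic([0.5, 0.5])
four_haplotypes = pic([0.25, 0.25, 0.25, 0.25])
```

    Under this definition a marker with more, evenly distributed haplotypes scores higher, which is one way a multi-allelic system can report a higher mean PIC than a system dominated by two-allele loci.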

  8. A novel algorithm for simultaneous SNP selection in high-dimensional genome-wide association studies

    PubMed Central

    2012-01-01

    Background Identification of causal SNPs in most genome wide association studies relies on approaches that consider each SNP individually. However, there is a strong correlation structure among SNPs that needs to be taken into account. Hence, increasingly modern computationally expensive regression methods are employed for SNP selection that consider all markers simultaneously and thus incorporate dependencies among SNPs. Results We develop a novel multivariate algorithm for large scale SNP selection using CAR score regression, a promising new approach for prioritizing biomarkers. Specifically, we propose a computationally efficient procedure for shrinkage estimation of CAR scores from high-dimensional data. Subsequently, we conduct a comprehensive comparison study including five advanced regression approaches (boosting, lasso, NEG, MCP, and CAR score) and a univariate approach (marginal correlation) to determine the effectiveness in finding true causal SNPs. Conclusions Simultaneous SNP selection is a challenging task. We demonstrate that our CAR score-based algorithm consistently outperforms all competing approaches, both uni- and multivariate, in terms of correctly recovered causal SNPs and SNP ranking. An R package implementing the approach as well as R code to reproduce the complete study presented here is available from http://strimmerlab.org/software/care/. PMID:23113980

  9. Highly specific SNP detection using 2D graphene electronics and DNA strand displacement

    PubMed Central

    Hwang, Michael T.; Landon, Preston B.; Lee, Joon; Choi, Duyoung; Mo, Alexander H.; Glinsky, Gennadi; Lal, Ratnesh

    2016-01-01

    Single-nucleotide polymorphisms (SNPs) in a gene sequence are markers for a variety of human diseases. Detection of SNPs with high specificity and sensitivity is essential for effective practical implementation of personalized medicine. Current DNA sequencing, including SNP detection, primarily uses enzyme-based methods or fluorophore-labeled assays that are time-consuming, need laboratory-scale settings, and are expensive. Previously reported electrical charge-based SNP detectors have insufficient specificity and accuracy, limiting their effectiveness. Here, we demonstrate the use of a DNA strand displacement-based probe on a graphene field effect transistor (FET) for high-specificity, single-nucleotide mismatch detection. The single mismatch was detected by measuring strand displacement-induced resistance (and hence current) change and Dirac point shift in a graphene FET. SNP detection in large double-helix DNA strands (e.g., 47 nt) minimizes false-positive results. Our electrical sensor-based SNP detection technology, without labeling and without apparent cross-hybridization artifacts, would allow fast, sensitive, and portable SNP detection with single-nucleotide resolution. The technology will have a wide range of applications in digital and implantable biosensors and high-throughput DNA genotyping, with transformative implications for personalized medicine. PMID:27298347

  10. Single Nucleotide Polymorphism (SNP) Arrays and Unexpected Consanguinity: Considerations for Clinicians When Returning Results to Families

    PubMed Central

    Delgado, Fernanda; Tabor, Holly K.; Chow, Penny M.; Conta, Jessie H.; Feldman, Kenneth W.; Tsuchiya, Karen D.; Beck, Anita E.

    2014-01-01

    Purpose The broad use of SNP microarrays has increased identification of unexpected consanguinity. Therefore, guidelines to address reporting of consanguinity have been published for clinical laboratories. Because no such guidelines exist for clinicians, we describe a case and present recommendations for clinicians to disclose unexpected consanguinity to families. Methods In a boy with multiple endocrine abnormalities and structural birth defects, SNP array analysis revealed ~23% autosomal homozygosity suggestive of a 1st-degree parental relationship. We assembled an interdisciplinary healthcare team, planned the most appropriate way to discuss results of the SNP array with the adult mother including the possibility of multiple autosomal recessive disorders in her child, and finally met with her as a team. Results From these discussions, we developed four major considerations for clinicians returning results of unexpected consanguinity, all guided by the child’s best interests: 1) ethical and legal obligations for reporting possible abuse, 2) preservation of the clinical relationship, 3) attention to justice and psychosocial challenges, and 4) utilization of the SNP array results to guide further testing. Conclusion As SNP arrays become a common clinical diagnostic tool, clinicians can use this framework to return results of unexpected consanguinity to families in a supportive and productive manner. PMID:25232848
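
    The step from "~23% autosomal homozygosity" to "suggestive of a 1st-degree parental relationship" rests on the expected inbreeding coefficient F of the child, which is 1/4 when the parents are first-degree relatives, 1/8 for second-degree and 1/16 for third-degree relatives. The sketch below only finds the nearest of these textbook values; the function name and the simple nearest-value rule are assumptions for illustration, not a laboratory reporting method, and an observed percentage cannot by itself identify a specific relationship.

```python
# Expected inbreeding coefficient (F) of a child, by degree of parental relatedness.
EXPECTED_F = {
    "first-degree": 1 / 4,    # parent-child or full siblings
    "second-degree": 1 / 8,   # e.g. half siblings, uncle-niece
    "third-degree": 1 / 16,   # e.g. first cousins
}


def closest_relationship(observed_fraction):
    """Return the degree whose expected F is nearest the observed homozygous fraction."""
    if not 0.0 <= observed_fraction <= 1.0:
        raise ValueError("observed_fraction must be between 0 and 1")
    return min(EXPECTED_F, key=lambda degree: abs(EXPECTED_F[degree] - observed_fraction))


# The ~23% reported in the abstract is closest to the first-degree expectation of 25%.
reported = closest_relationship(0.23)
```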

  11. Mining and Analysis of SNP in Response to Salinity Stress in Upland Cotton (Gossypium hirsutum L.)

    PubMed Central

    Wang, Xiaoge; Lu, Xuke; Wang, Junjuan; Wang, Delong; Yin, Zujun; Fan, Weili; Wang, Shuai; Ye, Wuwei

    2016-01-01

    Salinity stress is a major abiotic factor that affects crop output, and because cotton is a pioneer crop on saline and alkaline land, the study of its salt tolerance is particularly important. In our experiment, four salt-tolerant varieties with different salt tolerance indexes, CRI35 (65.04%), Kanghuanwei164 (56.19%), Zhong9807 (55.20%) and CRI44 (50.50%), and four salt-sensitive cotton varieties, Hengmian3 (48.21%), GK50 (40.20%), Xinyan96-48 (34.90%) and ZhongS9612 (24.80%), were used as the materials. These materials were divided into a salt-tolerant group (ST) and a salt-sensitive group (SS). The Illumina Cotton SNP 70K Chip was used to detect SNPs in the different cotton varieties. SNPv (SNP variation of the same seedling before and after salt stress) in the different varieties were screened; polymorphic SNPs and SNPr (SNPs related to salt tolerance) were obtained. Annotation and analysis of these SNPs showed that (1) the induction efficiency of salinity stress on SNPv differed among cotton materials with different salt tolerance indexes, and was significantly higher in salt-sensitive materials than in salt-tolerant materials. The induction of SNPv by salt stress was obviously biased. (2) SNPv induced by salt stress may be related to the methylation changes under salt stress. (3) SNPr may influence salt tolerance of plants by affecting the expression of salt-tolerance related genes. PMID:27355327

  12. A SNP discovery method to assess variant allele probability from next-generation resequencing data

    PubMed Central

    Shen, Yufeng; Wan, Zhengzheng; Coarfa, Cristian; Drabek, Rafal; Chen, Lei; Ostrowski, Elizabeth A.; Liu, Yue; Weinstock, George M.; Wheeler, David A.; Gibbs, Richard A.; Yu, Fuli

    2010-01-01

    Accurate identification of genetic variants from next-generation sequencing (NGS) data is essential for immediate large-scale genomic endeavors such as the 1000 Genomes Project, and is crucial for further genetic analysis based on the discoveries. The key challenge in single nucleotide polymorphism (SNP) discovery is to distinguish true individual variants (occurring at a low frequency) from sequencing errors (often occurring at frequencies orders of magnitude higher). Therefore, knowledge of the error probabilities of base calls is essential. We have developed Atlas-SNP2, a computational tool that detects and accounts for systematic sequencing errors caused by context-related variables in a logistic regression model learned from training data sets. Subsequently, it estimates the posterior error probability for each substitution through a Bayesian formula that integrates prior knowledge of the overall sequencing error probability and the estimated SNP rate with the results from the logistic regression model for the given substitutions. The estimated posterior SNP probability can be used to distinguish true SNPs from sequencing errors. Validation results show that Atlas-SNP2 achieves a false-positive rate of lower than 10%, with an ∼5% or lower false-negative rate. PMID:20019143
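
    The abstract describes combining a prior SNP rate with model-derived evidence through a Bayesian formula. The sketch below is a generic two-hypothesis application of Bayes' rule (true SNP versus sequencing error), not the Atlas-SNP2 formula itself, which this record does not give. The function name, its parameters, and the example numbers are hypothetical.

```python
def posterior_snp_probability(prior_snp, lik_if_snp, lik_if_error):
    """Posterior probability that an observed substitution is a true SNP.

    prior_snp    : assumed prior probability that a site carries a SNP
    lik_if_snp   : probability of the observed evidence if the SNP is real
    lik_if_error : probability of the same evidence if it is a sequencing error
    """
    for value in (prior_snp, lik_if_snp, lik_if_error):
        if not 0.0 <= value <= 1.0:
            raise ValueError("all inputs must be probabilities in [0, 1]")
    numerator = prior_snp * lik_if_snp
    denominator = numerator + (1.0 - prior_snp) * lik_if_error
    if denominator == 0.0:
        raise ValueError("evidence has zero probability under both hypotheses")
    return numerator / denominator


# Illustrative numbers only: a rare-SNP prior with evidence that favours a real variant.
weak_evidence = posterior_snp_probability(0.001, 0.9, 0.01)
strong_evidence = posterior_snp_probability(0.001, 0.9, 0.00001)
```

    With a low prior SNP rate, even evidence that is 90 times more likely under the SNP hypothesis leaves a modest posterior, which is why accurate error probabilities matter for separating rare variants from sequencing errors.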

  13. Different SNP combinations in the GCH1 gene and use of labor analgesia

    PubMed Central

    2010-01-01

    Background The aim of this study was to investigate whether there is an association between different SNP combinations in the guanosine triphosphate cyclohydrolase (GCH1) gene and a number of pain behavior related outcomes during labor. A population-based sample of pregnant women (n = 814) was recruited at gestational week 18. A plasma sample was collected from each subject. Genotyping was performed and three single nucleotide polymorphisms (SNPs) previously defined as a pain-protective SNP combination of GCH1 were used. Results Homozygous carriers of the pain-protective SNP combination of GCH1 arrived at the delivery ward at a more advanced stage of cervical dilation compared to heterozygous carriers and non-carriers. However, homozygous carriers more often used second line labor analgesia compared to the others. Conclusion The pain-protective SNP combination of GCH1 may be of importance in the limited number of homozygous carriers during the initial dilation of the cervix, but upon arrival at the delivery unit these women are more inclined to use second line labor analgesia. PMID:20633294

  14. Supervised learning-based tagSNP selection for genome-wide disease classifications

    PubMed Central

    Liu, Qingzhong; Yang, Jack; Chen, Zhongxue; Yang, Mary Qu; Sung, Andrew H; Huang, Xudong

    2008-01-01

    Background Comprehensive evaluation of common genetic variations through association of single nucleotide polymorphisms (SNPs) with complex human diseases on the genome-wide scale is an active area in human genome research. One of the fundamental questions in a SNP-disease association study is to find an optimal subset of SNPs with predicting power for disease status. To find that subset while reducing study burden in terms of time and costs, one can potentially reconcile information redundancy from associations between SNP markers. Results We have developed a feature selection method named Supervised Recursive Feature Addition (SRFA). This method combines supervised learning and statistical measures for the chosen candidate features/SNPs to reconcile the redundancy information and, in doing so, improve the classification performance in association studies. Additionally, we have proposed a Support Vector based Recursive Feature Addition (SVRFA) scheme in SNP-disease association analysis. Conclusions We have proposed using SRFA with different statistical learning classifiers and SVRFA for both SNP selection and disease classification and then applying them to two complex disease data sets. In general, our approaches outperform the well-known feature selection method of Support Vector Machine Recursive Feature Elimination and logic regression-based SNP selection for disease classification in genetic association studies. Our study further indicates that both genetic and environmental variables should be taken into account when doing disease predictions and classifications for the most complex human diseases that have gene-environment interactions. PMID:18366619

  15. Highly specific SNP detection using 2D graphene electronics and DNA strand displacement.

    PubMed

    Hwang, Michael T; Landon, Preston B; Lee, Joon; Choi, Duyoung; Mo, Alexander H; Glinsky, Gennadi; Lal, Ratnesh

    2016-06-28

    Single-nucleotide polymorphisms (SNPs) in a gene sequence are markers for a variety of human diseases. Detection of SNPs with high specificity and sensitivity is essential for effective practical implementation of personalized medicine. Current DNA sequencing, including SNP detection, primarily uses enzyme-based methods or fluorophore-labeled assays that are time-consuming, need laboratory-scale settings, and are expensive. Previously reported electrical charge-based SNP detectors have insufficient specificity and accuracy, limiting their effectiveness. Here, we demonstrate the use of a DNA strand displacement-based probe on a graphene field effect transistor (FET) for high-specificity, single-nucleotide mismatch detection. The single mismatch was detected by measuring strand displacement-induced resistance (and hence current) change and Dirac point shift in a graphene FET. SNP detection in large double-helix DNA strands (e.g., 47 nt) minimizes false-positive results. Our electrical sensor-based SNP detection technology, without labeling and without apparent cross-hybridization artifacts, would allow fast, sensitive, and portable SNP detection with single-nucleotide resolution. The technology will have a wide range of applications in digital and implantable biosensors and high-throughput DNA genotyping, with transformative implications for personalized medicine. PMID:27298347

  16. CsSNP: A Web-Based Tool for the Detecting of Comparative Segments SNPs.

    PubMed

    Wang, Yi; Wang, Shuangshuang; Zhou, Dongjie; Yang, Shuai; Xu, Yongchao; Yang, Chao; Yang, Long

    2016-07-01

    SNPs (single nucleotide polymorphisms) are a popular tool for the study of genetic diversity, evolution, and other areas. It is therefore necessary to develop a convenient, useful, robust, rapid, and open-source SNP detection tool for all researchers. Because SNP detection requires special software and a series of steps, including alignment, detection, analysis, and presentation, the study of SNPs is limited for nonprofessional users. CsSNP (Comparative segments SNP, http://biodb.sdau.edu.cn/cssnp/ ) is a freely available web tool based on the Blat, Blast, and Perl programs that detects comparative segments SNPs and shows detailed information about them. The results are filtered and presented in a statistics figure and a Gbrowse map. The platform contains the reference genomic sequences and coding sequences of 60 plant species and gives users new opportunities to detect SNPs easily. CsSNP provides a convenient tool for nonprofessional users to find comparative segments SNPs in their own sequences, gives users the information on and analysis of the SNPs, and displays these data in a dynamic map. It provides a new method to detect SNPs and may accelerate related studies. PMID:27347883

  17. A novel three-round multiplex PCR for SNP genotyping with next generation sequencing.

    PubMed

    Chen, Ke; Zhou, Yu-Xun; Li, Kai; Qi, Li-Xin; Zhang, Qi-Fei; Wang, Mao-Chun; Xiao, Jun-Hua

    2016-06-01

    Owing to its high throughput and low cost, next generation sequencing has attracted much attention from researchers for SNP genotyping. Here, we introduce a new method based on three-round multiplex PCR to precisely genotype SNPs with next generation sequencing. The method consumes, as far as possible, an equivalent amount of each pair of specific primers, which largely eliminates the amplification discrepancy between different loci. After PCR amplification, the products can be directly subjected to a next generation sequencing platform. We simultaneously amplified 37 SNP loci of 757 samples and sequenced all amplicons on the Ion Torrent PGM platform; 90.5% of the target SNP loci were accurately genotyped (at least 15×) and 90.4% of amplicons had uniform coverage with a variation of less than 50-fold. Ligase detection reaction (LDR) was performed to genotype 19 SNP loci (a subset of the 37 SNP loci) in 91 samples randomly selected from the 757 samples, and 99.5% of the genotyping data were consistent with the next generation sequencing results. Our results demonstrate that three-round PCR coupled with next generation sequencing is an efficient and economical genotyping approach. Graphical Abstract The schematic diagram of three-round PCR. PMID:27113460

  18. Breast cancer-associated high-order SNP-SNP interaction of CXCL12/CXCR4-related genes by an improved multifactor dimensionality reduction (MDR-ER).

    PubMed

    Fu, Ou-Yang; Chang, Hsueh-Wei; Lin, Yu-Da; Chuang, Li-Yeh; Hou, Ming-Feng; Yang, Cheng-Hong

    2016-09-01

    In association studies, the combined effects of single nucleotide polymorphism (SNP)-SNP interactions and the problem of imbalanced data between cases and controls are frequently ignored. In the present study, we used an improved multifactor dimensionality reduction (MDR) approach namely MDR-ER to detect the high order SNP‑SNP interaction in an imbalanced breast cancer data set containing seven SNPs of chemokine CXCL12/CXCR4 pathway genes. Most individual SNPs were not significantly associated with breast cancer. After MDR‑ER analysis, six significant SNP‑SNP interaction models with seven genes (highest cross‑validation consistency, 10; classification error rates, 41.3‑21.0; and prediction error rates, 47.4‑55.3) were identified. CD4 and VEGFA genes were associated in a 2‑loci interaction model (classification error rate, 41.3; prediction error rate, 47.5; odds ratio (OR), 2.069; 95% bootstrap CI, 1.40‑2.90; P=1.71E‑04) and it also appeared in all the best 2‑7‑loci models. When the loci number increased, the classification error rates and P‑values decreased. The powers in 2‑7‑loci in all models were >0.9. The minimum classification error rate of the MDR‑ER‑generated model was shown with the 7‑loci interaction model (classification error rate, 21.0; OR=15.282; 95% bootstrap CI, 9.54‑23.87; P=4.03E‑31). In the epistasis network analysis, the overall effect with breast cancer susceptibility was identified and the SNP order of impact on breast cancer was identified as follows: CD4 = VEGFA > KITLG > CXCL12 > CCR7 = MMP2 > CXCR4. In conclusion, the MDR‑ER can effectively and correctly identify the best SNP‑SNP interaction models in an imbalanced data set for breast cancer cases. PMID:27461876

  19. SNP Discovery by Illumina-Based Transcriptome Sequencing of the Olive and the Genetic Characterization of Turkish Olive Genotypes Revealed by AFLP, SSR and SNP Markers

    PubMed Central

    Kaya, Hilal Betul; Cetin, Oznur; Kaya, Hulya; Sahin, Mustafa; Sefer, Filiz; Kahraman, Abdullah; Tanyolac, Bahattin

    2013-01-01

    Background The olive tree (Olea europaea L.) is a diploid (2n = 2x = 46) outcrossing species mainly grown in the Mediterranean area, where it is the most important oil-producing crop. Because of its economic, cultural and ecological importance, various DNA markers have been used in the olive to characterize and elucidate homonyms, synonyms and unknown accessions. However, a comprehensive characterization and a full sequence of its transcriptome are unavailable, leading to the importance of an efficient large-scale single nucleotide polymorphism (SNP) discovery in olive. The objectives of this study were (1) to discover olive SNPs using next-generation sequencing and to identify SNP primers for cultivar identification and (2) to characterize 96 olive genotypes originating from different regions of Turkey. Methodology/Principal Findings Next-generation sequencing technology was used with five distinct olive genotypes and generated cDNA, producing 126,542,413 reads using an Illumina Genome Analyzer IIx. Following quality and size trimming, the high-quality reads were assembled into 22,052 contigs with an average length of 1,321 bases and 45 singletons. The SNPs were filtered and 2,987 high-quality putative SNP primers were identified. The assembled sequences and singletons were subjected to BLAST similarity searches and annotated with a Gene Ontology identifier. To identify the 96 olive genotypes, these SNP primers were applied to the genotypes in combination with amplified fragment length polymorphism (AFLP) and simple sequence repeats (SSR) markers. Conclusions/Significance This study marks the highest number of SNP markers discovered to date from olive genotypes using transcriptome sequencing. The developed SNP markers will provide a useful source for molecular genetic studies, such as genetic diversity and characterization, high density quantitative trait locus (QTL) analysis, association mapping and map-based gene cloning in the olive. High levels of

  20. Transcriptome sequencing for SNP discovery across Cucumis melo

    PubMed Central

    2012-01-01

    from India and Africa as compared to commercial cultivars, cultigens and landraces from Eastern Europe, Western Asia and the Mediterranean basin is consistent with the evolutionary history proposed for the species. Group-specific SNVs that will be useful in introgression programs were also detected. In a sample of 143 selected putative SNPs, we verified 93% of the polymorphisms in a panel of 78 genotypes. Conclusions This study provides the first comprehensive resequencing data for wild, exotic, and cultivated (landraces and commercial) melon transcriptomes, yielding the largest melon SNP collection available to date and representing a notable sample of the species diversity. This data provides a valuable resource for creating a catalog of allelic variants of melon genes and it will aid in future in-depth studies of population genetics, marker-assisted breeding, and gene identification aimed at developing improved varieties. PMID:22726804

  1. SNP Formation Bias in the Murine Genome Provides Evidence for Parallel Evolution.

    PubMed

    Plyler, Zackery E; Hill, Aubrey E; McAtee, Christopher W; Cui, Xiangqin; Moseley, Leah A; Sorscher, Eric J

    2015-09-01

    In this study, we show novel DNA motifs that promote single nucleotide polymorphism (SNP) formation and are conserved among exons, introns, and intergenic DNA from mice (Sanger Mouse Genomes Project), human genes (1000 Genomes), and tumor-specific somatic mutations (data from TCGA). We further characterize SNPs likely to be very recent in origin (i.e., formed in otherwise congenic mice) and show enrichment for both synonymous and parallel DNA variants occurring under circumstances not attributable to purifying selection. The findings provide insight regarding SNP contextual bias and eukaryotic codon usage as strategies that favor long-term exonic stability. The study also furnishes new information concerning rates of murine genomic evolution and features of DNA mutagenesis (at the time of SNP formation) that should be viewed as "adaptive." PMID:26253317

  2. Cross-Species Application of SNP Chips is Not Suitable for Identifying Runs of Homozygosity.

    PubMed

    Shafer, Aaron B A; Miller, Joshua M; Kardos, Marty

    2016-03-01

    Cross-species application of single-nucleotide polymorphism (SNP) chips is a valid, relatively cost-effective alternative to the high-throughput sequencing methods generally required to obtain a genome-wide sampling of polymorphisms. Kharzinova et al. (2015) examined the applicability of SNP chips developed in domestic bovids (cattle and sheep) to a semi-wild cervid (reindeer). The ancestors of bovids and cervids diverged between 20 and 30 million years ago (Hassanin and Douzery 2003; Bibi et al. 2013). Empirical work has shown that for a SNP chip developed in a bovid and applied to a cervid species, approximately 50% genotype success with 1% of the loci being polymorphic is expected (Miller et al. 2012). The genotyping of Kharzinova et al. (2015) follows this pattern; however, these data are not appropriate for identifying runs of homozygosity (ROH) and can be problematic for estimating linkage disequilibrium (LD) and we caution readers in this regard. PMID:26774056

  3. SNP-Seek database of SNPs derived from 3000 rice genomes.

    PubMed

    Alexandrov, Nickolai; Tai, Shuaishuai; Wang, Wensheng; Mansueto, Locedie; Palis, Kevin; Fuentes, Roven Rommel; Ulat, Victor Jun; Chebotarov, Dmytro; Zhang, Gengyun; Li, Zhikang; Mauleon, Ramil; Hamilton, Ruaraidh Sackville; McNally, Kenneth L

    2015-01-01

    We have identified about 20 million rice SNPs by aligning reads from the 3000 rice genomes project with the Nipponbare genome. The SNPs and allele information are organized into a SNP-Seek system (http://www.oryzasnp.org/iric-portal/), which consists of Oracle database having a total number of rows with SNP genotypes close to 60 billion (20 M SNPs × 3 K rice lines) and web interface for convenient querying. The database allows quick retrieving of SNP alleles for all varieties in a given genome region, finding different alleles from predefined varieties and querying basic passport and morphological phenotypic information about sequenced rice lines. SNPs can be visualized together with the gene structures in JBrowse genome browser. Evolutionary relationships between rice varieties can be explored using phylogenetic trees or multidimensional scaling plots. PMID:25429973

  4. Large Scale Association Analysis for Drug Addiction: Results from SNP to Gene

    PubMed Central

    Guo, Xiaobo; Liu, Zhifa; Wang, Xueqin; Zhang, Heping

    2012-01-01

    Many genetic association studies used single nucleotide polymorphisms (SNPs) data to identify genetic variants for complex diseases. Although SNP-based associations are most common in genome-wide association studies (GWAS), gene-based association analysis has received increasing attention in understanding genetic etiologies for complex diseases. While both methods have been used to analyze the same data, few genome-wide association studies compare the results or observe the connection between them. We performed a comprehensive analysis of the data from the Study of Addiction: Genetics and Environment (SAGE) and compared the results from the SNP-based and gene-based analyses. Our results suggest that the gene-based method complements the individual SNP-based analysis, and conceptually they are closely related. In terms of gene findings, our results validate many genes that were either reported from the analysis of the same dataset or based on animal studies for substance dependence. PMID:23365539

  5. Vitis Phylogenomics: Hybridization Intensities from a SNP Array Outperform Genotype Calls

    PubMed Central

    Miller, Allison J.; Matasci, Naim; Schwaninger, Heidi; Aradhya, Mallikarjuna K.; Prins, Bernard; Zhong, Gan-Yuan; Simon, Charles; Buckler, Edward S.; Myles, Sean

    2013-01-01

    Understanding relationships among species is a fundamental goal of evolutionary biology. Single nucleotide polymorphisms (SNPs) identified through next generation sequencing and related technologies enable phylogeny reconstruction by providing unprecedented numbers of characters for analysis. One approach to SNP-based phylogeny reconstruction is to identify SNPs in a subset of individuals, and then to compile SNPs on an array that can be used to genotype additional samples at hundreds or thousands of sites simultaneously. Although powerful and efficient, this method is subject to ascertainment bias because applying variation discovered in a representative subset to a larger sample favors identification of SNPs with high minor allele frequencies and introduces bias against rare alleles. Here, we demonstrate that the use of hybridization intensity data, rather than genotype calls, reduces the effects of ascertainment bias. Whereas traditional SNP calls assess known variants based on diversity housed in the discovery panel, hybridization intensity data survey variation in the broader sample pool, regardless of whether those variants are present in the initial SNP discovery process. We apply SNP genotype and hybridization intensity data derived from the Vitis9kSNP array developed for grape to show the effects of ascertainment bias and to reconstruct evolutionary relationships among Vitis species. We demonstrate that phylogenies constructed using hybridization intensities suffer less from the distorting effects of ascertainment bias, and are thus more accurate than phylogenies based on genotype calls. Moreover, we reconstruct the phylogeny of the genus Vitis using hybridization data, show that North American subgenus Vitis species are monophyletic, and resolve several previously poorly known relationships among North American species. This study builds on earlier work that applied the Vitis9kSNP array to evolutionary questions within Vitis vinifera and has general

  6. Association of MDM2 SNP309, age of onset, and gender in cutaneous melanoma

    PubMed Central

    Firoz, Elnaz F.; Warycha, Melanie; Zakrzewski, Jan; Pollens, Danuta; Wang, Guimin; Shapiro, Richard; Berman, Russell; Pavlick, Anna; Manga, Prashiela; Ostrer, Harry; Celebi, Julide Tok; Kamino, Hideko; Darvishian, Farbod; Rolnitzky, Linda; Goldberg, Judith D.; Osman, Iman; Polsky, David

    2013-01-01

    Purpose In certain cancers, MDM2 SNP309 has been associated with early tumor onset in women. In melanoma, incidence rates are higher in women than in men among individuals less than age 40; however, among those older than age 50, melanoma is more frequent in men than in women. To investigate this difference, we examined the association between MDM2 SNP309, age at diagnosis, and gender among melanoma patients. Experimental Design Prospectively enrolled melanoma patients (N=227) were evaluated for MDM2 SNP309 and the related polymorphism, p53 Arg72Pro. DNA was isolated from patient blood samples and genotypes were analyzed by PCR-RFLP. Associations between MDM2 SNP309, p53 Arg72Pro, age at diagnosis, and clinicopathologic features of melanoma were analyzed. Results The median age at diagnosis was 13 years earlier among women with a SNP309 GG genotype (46 years) compared to women with TG+TT genotypes (59 years; p=0.19). Analyses using age dichotomized at each decade indicated that women with a GG genotype had significantly higher risks of being diagnosed with melanoma at ages less than 50 compared to women 50 and older, but not 60 and older. At ages less than 50, women with a GG genotype had a 3.89 times greater chance of being diagnosed compared to women with TG+TT genotypes (p=0.01). Similar observations were not seen among men. Conclusions Our data suggest that MDM2 may play an important role in the development of melanoma in women. The MDM2 SNP309 genotype may help identify women at risk for developing melanoma at a young age. PMID:19318491

  7. Using Hamming Distance as Information for SNP-Sets Clustering and Testing in Disease Association Studies

    PubMed Central

    Wang, Charlotte; Kao, Wen-Hsin; Hsiao, Chuhsing Kate

    2015-01-01

    The availability of high-throughput genomic data has led to several challenges in recent genetic association studies, including the large number of genetic variants that must be considered and the computational complexity in statistical analyses. Tackling these problems with a marker-set study such as SNP-set analysis can be an efficient solution. To construct SNP-sets, we first propose a clustering algorithm, which employs Hamming distance to measure the similarity between strings of SNP genotypes and evaluates whether the given SNPs or SNP-sets should be clustered. A dendrogram can then be constructed based on such distance measure, and the number of clusters can be determined. With the resulting SNP-sets, we next develop an association test HDAT to examine susceptibility to the disease of interest. This proposed test assesses, based on Hamming distance, whether the similarity between a diseased and a normal individual differs from the similarity between two individuals of the same disease status. In our proposed methodology, only genotype information is needed. No inference of haplotypes is required, and SNPs under consideration do not need to locate in nearby regions. The proposed clustering algorithm and association test are illustrated with applications and simulation studies. As compared with other existing methods, the clustering algorithm is faster and better at identifying sets containing SNPs exerting a similar effect. In addition, the simulation studies demonstrated that the proposed test works well for SNP-sets containing a large proportion of neutral SNPs. Furthermore, employing the clustering algorithm before testing a large set of data improves the knowledge in confining the genetic regions for susceptible genetic markers. PMID:26302001
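
    The distance measure at the core of the abstract above can be sketched in a few lines. The example assumes genotypes are coded as strings of minor-allele counts (0, 1 or 2) per SNP, which is a common convention but is not specified in this record; the sample names and helper functions are hypothetical, and neither the clustering step nor the HDAT test itself is reproduced here.

```python
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two equal-length genotype strings differ."""
    if len(a) != len(b):
        raise ValueError("genotype strings must have the same length")
    return sum(x != y for x, y in zip(a, b))


def pairwise_hamming(genotypes):
    """Hamming distance for every unordered pair of named genotype strings."""
    return {
        (first, second): hamming(genotypes[first], genotypes[second])
        for first, second in combinations(sorted(genotypes), 2)
    }


# Hypothetical individuals genotyped at four SNPs (0/1/2 = minor-allele count).
samples = {"case1": "0120", "case2": "0110", "ctrl1": "2102"}
distances = pairwise_hamming(samples)
```

    In this toy data the two cases differ at one SNP while each case differs from the control at three, the kind of within-group versus between-group contrast a Hamming-distance-based association test would examine.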

  8. An Integrated SNP Mining and Utilization (ISMU) Pipeline for Next Generation Sequencing Data

    PubMed Central

    Azam, Sarwar; Rathore, Abhishek; Shah, Trushar M.; Telluri, Mohan; Amindala, BhanuPrakash; Ruperao, Pradeep; Katta, Mohan A. V. S. K.; Varshney, Rajeev K.

    2014-01-01

    Open source single nucleotide polymorphism (SNP) discovery pipelines for next generation sequencing data commonly require working knowledge of the command line interface, massive computational resources and expertise, which is daunting for biologists. Further, the SNP information generated may not be readily used for downstream processes such as genotyping. Hence, a comprehensive pipeline has been developed by integrating several open source next generation sequencing (NGS) tools along with a graphical user interface called Integrated SNP Mining and Utilization (ISMU) for SNP discovery and their utilization by developing genotyping assays. The pipeline features functionalities such as pre-processing of raw data, integration of open source alignment tools (Bowtie2, BWA, Maq, NovoAlign and SOAP2), SNP prediction (SAMtools/SOAPsnp/CNS2snp and CbCC) methods and interfaces for developing genotyping assays. The pipeline outputs a list of high quality SNPs between all pairwise combinations of genotypes analyzed, in addition to the reference genome/sequence. Visualization tools (Tablet and Flapjack) integrated into the pipeline enable inspection of the alignment and errors, if any. The pipeline also provides a confidence score or polymorphism information content value with flanking sequences for identified SNPs in the standard format required for developing marker genotyping (KASP and Golden Gate) assays. The pipeline enables users to process a range of NGS datasets such as whole genome re-sequencing, restriction site associated DNA sequencing and transcriptome sequencing data at a fast speed. The pipeline is particularly useful for members of the plant genetics and breeding community without computational expertise who wish to discover SNPs and use them in genomics, genetics and breeding studies. The pipeline has been parallelized to process huge datasets of next generation sequencing. It has been developed in the Java language and is available at http://hpc.icrisat.cgiar.org/ISMU as a standalone

  9. k-merSNP discovery: Software for alignment-and reference-free scalable SNP discovery, phylogenetics, and annotation for hundreds of microbial genomes

    SciTech Connect

    2014-11-18

    With the flood of whole genome finished and draft microbial sequences, we need faster, more scalable bioinformatics tools for sequence comparison. An algorithm is described to find single nucleotide polymorphisms (SNPs) in whole genome data. It scales to hundreds of bacterial or viral genomes, and can be used for finished and/or draft genomes available as unassembled contigs or raw, unassembled reads. The method is fast to compute, finding SNPs and building a SNP phylogeny in minutes to hours, depending on the size and diversity of the input sequences. The SNP-based trees that result are consistent with known taxonomy and trees determined in other studies. The approach we describe can handle many gigabases of sequence in a single run. The algorithm is based on k-mer analysis.
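
    The abstract states only that the algorithm is based on k-mer analysis. One common way to find SNPs without alignment is to look for k-mers of odd length that agree between genomes everywhere except at the central base. The sketch below illustrates that general idea under this assumption; the function names and toy sequences are hypothetical, and it is not presented as the software's actual algorithm.

```python
from collections import defaultdict


def kmers(seq, k):
    """Set of all k-mers occurring in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def candidate_snps(genomes, k=5):
    """Find k-mer loci whose flanks match but whose central base varies.

    genomes: dict mapping a genome name to its sequence string.
    Returns a dict mapping (left_flank, right_flank) to
    {genome_name: set of central bases seen in that genome},
    restricted to loci with more than one central base overall.
    """
    if k % 2 == 0:
        raise ValueError("k must be odd so that a central base exists")
    mid = k // 2
    loci = defaultdict(dict)
    for name, seq in genomes.items():
        for kmer in kmers(seq, k):
            key = (kmer[:mid], kmer[mid + 1:])
            loci[key].setdefault(name, set()).add(kmer[mid])
    return {
        key: per_genome
        for key, per_genome in loci.items()
        if len(set().union(*per_genome.values())) > 1
    }


# Two toy sequences differing at a single position (A versus T).
toy = {"g1": "ACGTACGGA", "g2": "ACGTTCGGA"}
snps = candidate_snps(toy, k=5)
```

    Real genomes would need a much larger k, handling of reverse complements and filtering of repeated flanks, all of which are omitted here.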

  10. k-merSNP discovery: Software for alignment-and reference-free scalable SNP discovery, phylogenetics, and annotation for hundreds of microbial genomes

    Energy Science and Technology Software Center (ESTSC)

    2014-11-18

    With the flood of whole genome finished and draft microbial sequences, we need faster, more scalable bioinformatics tools for sequence comparison. An algorithm is described to find single nucleotide polymorphisms (SNPs) in whole genome data. It scales to hundreds of bacterial or viral genomes, and can be used for finished and/or draft genomes available as unassembled contigs or raw, unassembled reads. The method is fast to compute, finding SNPs and building a SNP phylogeny in minutes to hours, depending on the size and diversity of the input sequences. The SNP-based trees that result are consistent with known taxonomy and trees determined in other studies. The approach we describe can handle many gigabases of sequence in a single run. The algorithm is based on k-mer analysis.

  11. A novel SNP in 3' UTR of INS gene: A case report of neonatal diabetes mellitus.

    PubMed

    Bogari, Neda M; Rayes, Husni H; Mostafa, Fakri; Abdel-Latif, Azza M; Ramadan, Abeer; Al-Allaf, Faisal A; Taher, Mohiuddin M; Fawzy, Ahmed

    2015-09-01

    Neonatal diabetes mellitus (NDM) is a rare condition with a prevalence of 1 in 300,000 live births. We have found 3 known SNPs in the 5' UTR and a novel SNP in the 3' UTR of the INS gene. These SNPs were present in a 9-month-old girl from Saudi Arabia and also in her father and mother. The novel SNP we found is not present in the 1000 Genomes Project or other databases. Further, the newly identified 3' UTR mutation in the INS gene may abolish the polyadenylation signal and result in severe RNA instability. PMID:26212367

  12. Minimal SNP overlap among multiple panels of ancestry informative markers argues for more international collaboration.

    PubMed

    Soundararajan, Usha; Yun, Libing; Shi, Meisen; Kidd, Kenneth K

    2016-07-01

    The century-old use of genetic markers to determine population relationships has morphed in modern forensics into use of markers to determine the ancestry of an individual from a DNA sample. Researchers have identified sets of SNPs that have frequency differences among populations and many sets of SNPs have been published for the purpose of inferring ancestry. Such inference also requires reference datasets for the particular set of SNPs selected. We have identified 21 largely independent published panels of ancestry informative SNPs (AISNPs) and examined their union of 1397 SNPs. No SNP occurs in more than 6 panels. The 1397 SNPs in 21 panels yield a largely empty matrix that is inhibiting progress on more refined ability to infer ancestry for a forensic sample. The most common set of reference populations is the HGDP set of 52 small population samples totaling a thousand individuals. Only 46 (3%) of the 1397 SNPs occur in three or more panels. We assembled a new dataset for 44 of those SNPs involving 4,559 individuals from 73 populations. Analyses of this dataset provided clear differentiation of only five biogeographic regions: sub-Saharan Africa, Europe and SW Asia, South Asia, East Asia, and the Americas. This is an inadequate level of biogeographic resolution already exceeded by other panels. We conclude that more such AISNP panels are not needed and that the forensic community must collaborate to develop a common set of highly differentiating AISNPs typed on a very large number of population samples. How that can be accomplished will be the subject of future discussion. PMID:26977931
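
    The overlap figures quoted above (the union of SNPs across panels, and the subset occurring in three or more panels) come from a simple membership count. A minimal sketch follows, assuming each panel is available as a list of SNP identifiers; the panel names and identifiers are made up and do not correspond to the 21 published panels.

```python
from collections import Counter


def panel_membership_counts(panels):
    """Count, for each SNP, the number of panels that include it.

    panels: dict mapping a panel name to an iterable of SNP identifiers.
    Duplicates within a single panel are counted once.
    """
    counts = Counter()
    for snps in panels.values():
        counts.update(set(snps))
    return counts


def shared_snps(panels, min_panels=3):
    """Sorted list of SNPs present in at least `min_panels` panels."""
    counts = panel_membership_counts(panels)
    return sorted(snp for snp, n in counts.items() if n >= min_panels)


# Hypothetical panels with placeholder identifiers.
toy_panels = {
    "panel_1": ["snp1", "snp2", "snp3"],
    "panel_2": ["snp2", "snp3", "snp4"],
    "panel_3": ["snp3", "snp5", "snp5"],
}
union_size = len(panel_membership_counts(toy_panels))
common = shared_snps(toy_panels, min_panels=3)
```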

  13. Insertion Sequence Element Single Nucleotide Polymorphism Typing Provides Insights into the Population Structure and Evolution of Mycobacterium ulcerans across Africa

    PubMed Central

    Jordaens, Kurt; Bomans, Pieter; Leirs, Herwig; Durnez, Lies; Affolabi, Dissou; Sopoh, Ghislain; Aguiar, Julia; Phanzu, Delphin Mavinga; Kibadi, Kapay; Eyangoh, Sara; Manou, Louis Bayonne; Phillips, Richard Odame; Adjei, Ohene; Ablordey, Anthony; Rigouts, Leen; Portaels, Françoise; Eddyani, Miriam; de Jong, Bouke C.

    2014-01-01

    Buruli ulcer is an indolent, slowly progressing necrotizing disease of the skin caused by infection with Mycobacterium ulcerans. In the present study, we applied a redesigned technique to a vast panel of M. ulcerans disease isolates and clinical samples originating from multiple African disease foci in order to (i) gain fundamental insights into the population structure and evolutionary history of the pathogen and (ii) disentangle the phylogeographic relationships within the genetically conserved cluster of African M. ulcerans. Our analyses identified 23 different African insertion sequence element single nucleotide polymorphism (ISE-SNP) types that dominate in different areas where Buruli ulcer is endemic. These ISE-SNP types appear to be the initial stages of clonal diversification from a common, possibly ancestral ISE-SNP type. ISE-SNP types were found unevenly distributed over the greater West African hydrological drainage basins. Our findings suggest that geographical barriers bordering the basins to some extent prevented bacterial gene flow between basins and that this resulted in independent focal transmission clusters associated with the hydrological drainage areas. Different phylogenetic methods yielded two well-supported sister clades within the African ISE-SNP types. The ISE-SNP types from the “pan-African clade” were found to be widespread throughout Africa, while the ISE-SNP types of the “Gabonese/Cameroonian clade” were much rarer and found in a more restricted area, which suggested that the latter clade evolved more recently. Additionally, the Gabonese/Cameroonian clade was found to form a strongly supported monophyletic group with Papua New Guinean ISE-SNP type 8, which is unrelated to other Southeast Asian ISE-SNP types. PMID:24296504

  14. Genome-wide SNP validation and mantle tissue transcriptome analysis in the silver-lipped pearl oyster, Pinctada maxima.

    PubMed

    Jones, David B; Jerry, Dean R; Forêt, Sylvain; Konovalov, Dmitry A; Zenger, Kyall R

    2013-12-01

    Pearl oysters are not only farmed for their gemstone quality pearls worldwide, but they are also becoming important model organisms for investigating genetic mechanisms of biomineralisation. Despite their economic and scientific significance, limited genomic resources are available for this important group of bivalves, hampering investigations into identifying genes that regulate important pearl quality traits and unique biological characteristics (i.e. biomineralisation). The silver-lipped pearl oyster, Pinctada maxima, is one species where there is interest in understanding genes that regulate commercially important pearl traits, but presently, there is a dearth of genomic information. The objective of this study was to develop and validate a large number of type I genome-wide single nucleotide polymorphisms (SNPs) for P. maxima suitable for high-throughput genotyping. In addition, sequence annotations and Gene Ontology terms were assigned to a large mantle tissue 454 expressed sequence tag assembly (96,794 contigs) and information on known bivalve biomineralisation genes was incorporated into SNP discovery. The SNP discovery effort resulted in the de novo identification of 172,625 SNPs, of which 9,108 were identified as high value [minor allele frequency (MAF) ≥ 0.15, read depth ≥ 8]. Validation of 2,782 of these SNPs using Illumina iSelect Infinium genotyping technology returned some of the highest assay conversion (86.6%) and validation (59.9%; mean MAF 0.28) rates observed in aquaculture species to date. Genomic resources presented here will be pivotal to future research investigating the biological mechanisms behind biomineralisation and will form a strong foundation for genetic selective breeding programs in the P. maxima pearling industry. PMID:23715808
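
    The "high value" criterion quoted above (MAF ≥ 0.15 and read depth ≥ 8) can be written as a small filter. The sketch assumes a biallelic SNP whose MAF is estimated from the read counts supporting each allele; that estimation method and the function names are assumptions, since the abstract does not say how the values were computed.

```python
def minor_allele_frequency(ref_count, alt_count):
    """Frequency of the less common allele, estimated from read counts."""
    total = ref_count + alt_count
    if total == 0:
        raise ValueError("no reads cover this site")
    return min(ref_count, alt_count) / total


def is_high_value(ref_count, alt_count, maf_min=0.15, depth_min=8):
    """Apply the two thresholds quoted in the abstract.

    Read depth is taken here as ref_count + alt_count (an assumption).
    """
    depth = ref_count + alt_count
    if depth < depth_min:
        return False
    return minor_allele_frequency(ref_count, alt_count) >= maf_min


# Made-up read counts for three candidate sites.
candidates = {"site_1": (7, 3), "site_2": (19, 1), "site_3": (3, 3)}
kept = sorted(name for name, (r, a) in candidates.items() if is_high_value(r, a))
```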

  15. Meta-Analysis of High-Density SNP Associations for Beef Cattle Production Traits from Three Countries

    Technology Transfer Automated Retrieval System (TEKTRAN)

    About 50,000 SNP were evaluated for associations with growth, carcass, and meat quality traits in three populations of cattle in the United States, Canada, and Australia. Regression coefficients for each SNP were independently estimated within each country. Coefficients for similar traits were stand...

  16. MDM2 promoter SNP55 (rs2870820) affects risk of colon cancer but not breast-, lung-, or prostate cancer.

    PubMed

    Helwa, Reham; Gansmo, Liv B; Romundstad, Pål; Hveem, Kristian; Vatten, Lars; Ryan, Bríd M; Harris, Curtis C; Lønning, Per E; Knappskog, Stian

    2016-01-01

    Two functional SNPs (SNP285G > C; rs117039649 and SNP309T > G; rs2279744) have previously been reported to modulate Sp1 transcription factor binding to the promoter of the proto-oncogene MDM2, and to influence cancer risk. Recently, a third SNP (SNP55C > T; rs2870820) was also reported to affect Sp1 binding and MDM2 transcription. In this large population based case-control study, we genotyped MDM2 SNP55 in 10,779 Caucasian individuals, previously genotyped for SNP309 and SNP285, including cases of colon (n = 1,524), lung (n = 1,323), breast (n = 1,709) and prostate cancer (n = 2,488) and 3,735 non-cancer controls, as well as 299 healthy African-Americans. Applying the dominant model, we found an elevated risk of colon cancer among individuals harbouring SNP55TT/CT genotypes compared to the SNP55CC genotype (OR = 1.15; 95% CI = 1.01-1.30). The risk was found to be highest for left-sided colon cancer (OR = 1.21; 95% CI = 1.00-1.45) and among females (OR = 1.32; 95% CI = 1.01-1.74). Assessing combined genotypes, we found the highest risk of colon cancer among individuals harbouring the SNP55TT or CT together with the SNP309TG genotype (OR = 1.21; 95% CI = 1.00-1.46). Supporting the conclusions from the risk estimates, we found colon cancer cases carrying the SNP55TT/CT genotypes to be diagnosed at younger age as compared to SNP55CC (p = 0.053), in particular among patients carrying the SNP309TG/TT genotypes (p = 0.009). PMID:27624283

  17. The use of SNP data for the monitoring of genetic diversity in cattle breeds

    Technology Transfer Automated Retrieval System (TEKTRAN)

    LD between SNPs contains information about effective population size. In this study, we investigate the use of genome-wide SNP data for marker based estimation of effective population size for two taurine cattle breeds of Africa and two local cattle breeds of Switzerland. Estimated recombination rat...

  18. Application of RAD LongRead sequencing for SNP discovery in sugarcane

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The sugarcane (hybrid Saccharum spp.) genome presents a difficult challenge for SNP discovery and analysis due to its complex polyploid nature. This is compounded further due to the absence of a reference genome sequence. We report the discovery of SNPs in sugarcane through reductive sequencing and ...

  19. SNP-revealed genetic diversity in wild emmer wheat correlates with ecological factors

    PubMed Central

    2013-01-01

    Background Patterns of genetic diversity between and within natural plant populations and their driving forces are of great interest in evolutionary biology. However, few studies have been performed on the genetic structure and population divergence in wild emmer wheat using a large number of EST-related single nucleotide polymorphism (SNP) markers. Results In the present study, twenty-five natural wild emmer wheat populations representing a wide range of ecological conditions in Israel and Turkey were used. Genetic diversity and genetic structure were investigated using over 1,000 SNP markers. A moderate level of genetic diversity was detected due to the biallelic property of SNP markers. Clustering based on Bayesian model showed that grouping pattern is related to the geographical distribution of the wild emmer wheat. However, genetic differentiation between populations was not necessarily dependent on the geographical distances. A total of 33 outlier loci under positive selection were identified using a FST-outlier method. Significant correlations between loci and ecogeographical factors were observed. Conclusions Natural selection appears to play a major role in generating adaptive structures in wild emmer wheat. SNP markers are appropriate for detecting selectively-channeled adaptive genetic diversity in natural populations of wild emmer wheat. This adaptive genetic diversity is significantly associated with ecological factors. PMID:23937410

  20. SNP Discovery in Swine by Reduced Representation and High Throughput Pyrosequencing

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Relatively little information is available for sequence variation in the pig. Because reduced representation reduces the complexity of the genome being sampled by orders of magnitude and samples identical regions dispersed across the genome, it is an ideal strategy for SNP discovery in a species wit...

  1. An integrative segmentation method for detecting germline copy number variations in SNP arrays.

    PubMed

    Shi, Jianxin; Li, Peng

    2012-05-01

    Germline copy number variations (CNVs) are a major source of genetic variation in humans. In large-scale studies of complex diseases, CNVs are usually detected from data generated by single nucleotide polymorphism (SNP) genotyping arrays. In this paper, we develop an integrative segmentation method, SegCNV, for detecting CNVs integrating both log R ratio (LRR) and B allele frequency (BAF). Based on simulation studies, SegCNV had modestly better power to detect deletions and substantially better power to detect duplications compared with circular binary segmentation (CBS) that relies purely on LRRs; and it had better power to detect deletions and a comparable performance to detect duplications compared with PennCNV and QuantiSNP. In two Hapmap subjects with deep sequence data available as a gold standard, SegCNV detected more true short deletions than PennCNV and QuantiSNP. For 21 short duplications validated experimentally in the AGRE dataset, SegCNV, QuantiSNP, and PennCNV detected all of them while CBS detected only three. SegCNV is much faster than the HMM-based (where HMM is hidden Markov model) methods, taking only several seconds to analyze genome-wide data for one subject. PMID:22539397

  2. SNP-based genotyping in lentil: linking sequence information with phenotypes

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Lentil (Lens culinaris) has been late to enter the world of high throughput molecular analysis due to a general lack of genomic resources. Using a 454 sequencing-based approach, SNPs have been identified in genes across the lentil genome. Several hundred have been turned into single SNP KASP assay...

  3. Measuring diversity in Gossypium hirsutum using the CottonSNP63K Array

    Technology Transfer Automated Retrieval System (TEKTRAN)

    A CottonSNP63K array and accompanying cluster file has been developed and includes 45,104 intra-specific SNPs and 17,954 inter-specific SNPs for automated genotyping of cotton (Gossypium spp.) samples. Development of the cluster file included genotyping of 1,156 samples, a subset of which were iden...

  4. Making a chocolate chip: development and evaluation of a 6K SNP array for Theobroma cacao.

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Theobroma cacao, the key ingredient in chocolate production, is one of the world's most important tree fruit crops, with ~4,000,000 metric tons produced across 50 countries. To move towards gene discovery and marker-assisted breeding in cacao, a single-nucleotide polymorphism (SNP) identification pr...

  5. A novel approach to analyzing fMRI and SNP data via parallel independent component analysis

    NASA Astrophysics Data System (ADS)

    Liu, Jingyu; Pearlson, Godfrey; Calhoun, Vince; Windemuth, Andreas

    2007-03-01

    There is current interest in understanding genetic influences on brain function in both the healthy and the disordered brain. Parallel independent component analysis, a new method for analyzing multimodal data, is proposed in this paper and applied to functional magnetic resonance imaging (fMRI) and a single nucleotide polymorphism (SNP) array. The method aims to identify the independent components of each modality and the relationship between the two modalities. We analyzed 92 participants, including 29 schizophrenia (SZ) patients, 13 unaffected SZ relatives, and 50 healthy controls. We found a correlation of 0.79 between one fMRI component and one SNP component. The fMRI component consists of activations in cingulate gyrus, multiple frontal gyri, and superior temporal gyrus. The related SNP component is contributed to significantly by 9 SNPs located in sets of genes, including those coding for apolipoprotein A-I and C-III, malate dehydrogenase 1 and the gamma-aminobutyric acid alpha-2 receptor. A significant difference in the presence of this SNP component is found between the SZ group (SZ patients and their relatives) and the control group. In summary, we constructed a framework to identify the interactions between brain functional and genetic information; our findings provide new insight into understanding genetic influences on brain function in a common mental disorder.

  6. Development of a SNP genotyping panel for genetic monitoring of the laboratory mouse.

    PubMed

    Petkov, Petko M; Cassell, Megan A; Sargent, Evelyn E; Donnelly, Charles J; Robinson, Phil; Crew, Victor; Asquith, Steven; Haar, Raymond Vonder; Wiles, Michael V

    2004-05-01

    We have developed a genotyping system for detecting genetic contamination in the laboratory mouse based on assaying single-nucleotide polymorphism (SNP) markers positioned on all autosomes and the X chromosome. This system provides a fast, reliable, and cost-effective way for genetic monitoring, while maintaining a very high degree of confidence. We describe the allelic distribution of 235 SNPs in 48 mouse strains, thereby creating a database of polymorphisms useful for genotyping purposes. The SNP markers used in this study were chosen from publicly available SNP databases. Four genotyping methods were evaluated, and dynamic two-tube allele-specific PCR assays were developed for each marker and tested on a set of 48 inbred mouse strains. The minimal number of assays sufficient to distinguish groups consisting of different numbers of mouse strains was estimated, and a panel of 28 SNPs sufficient to distinguish virtually all of the inbred strains tested was selected. Amplifluor SNP detection assays were developed for these markers and tested on an extended list of 96 strains. This panel was used as a genetic quality control approach to monitor the genotypes of nearly 300 inbred, wild-derived, congenic, consomic, and recombinant inbred strains maintained at The Jackson Laboratory. We have concluded that this marker panel is sufficient for genetic contamination monitoring in colonies containing a large number of genetically diverse mouse strains and that reduced versions of the panel could be implemented in facilities housing a lower number of strains. PMID:15081119
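
    Selecting a small panel that still tells every strain apart can be approached greedily: repeatedly pick the SNP that separates the most strain pairs not yet distinguished. The sketch below illustrates that heuristic on made-up data. The abstract does not describe how the 28-SNP panel was actually chosen, so the method, names and genotypes here are assumptions, and a greedy choice is not guaranteed to be minimal.

```python
from itertools import combinations


def greedy_panel(genotypes):
    """Greedily choose SNP indices that distinguish all strain pairs.

    genotypes: dict mapping a strain name to a tuple of alleles,
               one entry per SNP, all tuples of equal length.
    Returns (panel, unresolved): the chosen SNP indices in order, and
    the set of strain pairs that no SNP could separate.
    """
    if not genotypes:
        return [], set()
    strains = sorted(genotypes)
    n_snps = len(genotypes[strains[0]])
    unresolved = set(combinations(strains, 2))
    panel = []
    while unresolved:
        best, best_split = None, set()
        for j in range(n_snps):
            if j in panel:
                continue
            # Pairs still unresolved that differ at SNP j.
            split = {(a, b) for a, b in unresolved
                     if genotypes[a][j] != genotypes[b][j]}
            if len(split) > len(best_split):
                best, best_split = j, split
        if best is None:  # no remaining SNP separates any pair
            break
        panel.append(best)
        unresolved -= best_split
    return panel, unresolved


# Four hypothetical strains typed at three SNPs; the third is uninformative.
toy = {"A": (0, 0, 1), "B": (0, 1, 1), "C": (1, 0, 1), "D": (1, 1, 1)}
panel, unresolved = greedy_panel(toy)
```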

  7. Identification of a SNP marker associated with WB242 nematode resistance in sugar beet

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The beet-cyst nematode (Heterodera schachtii Schmidt) is one of the major diseases of sugar beet. The identification of molecular markers associated to the nematode resistance would be helpful for developing resistant varieties. The aim of this study was the identification of SNP (Single Nucleotide ...

  8. Field ionization process of Eu 4f⁷6snp Rydberg states

    NASA Astrophysics Data System (ADS)

    Zhang, Jing; Shen, Li; Dai, Chang-Jian

    2015-11-01

    The field ionization process of the Eu 4f⁷6snp Rydberg states, converging to the first ionization limit, 4f⁷6s ⁹S₄, is systematically investigated. The Eu 4f⁷6snp Rydberg states are populated by three-step laser excitation and detected by the electric field ionization (EFI) method. Two different kinds of EFI pulses are applied after laser excitation to observe the possible impacts on the EFI process. The exact EFI thresholds for the 4f⁷6snp Rydberg states can be determined by observing the corresponding EFI spectra. In particular, some structures above the EFI threshold are found in the EFI spectra, which may be interpreted as an effect of black body radiation (BBR). Finally, a scaling law relating the EFI threshold for the Eu 4f⁷6snp Rydberg states to the effective quantum number is established. Project supported by the National Natural Science Foundation of China (Grant Nos. 11004151 and 11174218).

  9. High-throughput SNP scoring with GAMMArrays: genomic analysis using multiplexed microsphere arrays

    NASA Astrophysics Data System (ADS)

    Green, Lance D.; Cai, Hong; Torney, David C.; Wood, Diane J.; Uribe-Romeo, Francisco J.; Kaderali, Lars; Nolan, John P.; White, P. S.

    2002-06-01

    We have developed a SNP scoring platform, yielding high throughput, inexpensive assays. The basic platform uses fluorescently labeled DNA fragments bound to microspheres, which are analyzed using flow cytometry. SNP scoring is performed using minisequencing primers and fluorescently labeled dideoxynucleotides. Furthermore, multiplexed microspheres make it possible to score hundreds of SNPs simultaneously. Multiplexing, coupled with high throughput rates, allows inexpensive scoring of several million SNPs/day. GAMMArrays use universal tags that consist of computer designed, unique DNA tails. These are incorporated into each primer, and the reverse complement is attached to a discrete population of microspheres in a multiplexed set. This enables simultaneous minisequencing of many SNPs in solution, followed by capture onto the appropriate microsphere for multiplexed analysis by flow cytometry. We present results from multiplexed SNP analyses of bacterial pathogens and human mtDNA variation. Analyses are performed on PCR amplicons, each containing numerous SNPs scored simultaneously. In addition, these assays easily integrate into conventional liquid handling automation, and require no unique instrumentation for setup and analysis. Very high signal-to-noise ratios, ease of setup, flexibility in format and scale, and low cost make these assays extremely versatile and valuable tools for a wide variety of SNP scoring applications.

  10. SNP discovery in swine by reduced representation and high throughput pyrosequencing

    Technology Transfer Automated Retrieval System (TEKTRAN)

    A reduced representation library (RRL) of porcine genomic fragments was used to identify SNP from a pool of DNA isolated from 26 animals (52 chromosomes) relevant to current pork production. Treatment of the pooled DNA with a restriction enzyme, coupled with gel-based size selection of 450 base pair...

  11. Association Analyses of Candidate SNP on Reproductive and Production Traits in Swine

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The ability to identify young females with superior reproduction would have a large economic impact on commercial swine production. Previous studies have discovered SNP associated with economically important traits such as litter size, growth rate, and feed intake. The objective of this study was to...

  12. High-density SNP Scan of Production and Product Quality Traits in Beef Cattle

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Genotypes from the BovineSNP50 BeadChip (50K) were obtained on animals derived from 150 AI sires from seven breeds (22 sires per breed; Angus, Charolais, Gelbvieh, Hereford, Limousin, Red Angus, and Simmental) as either progeny (F1; 590 steers) or grandprogeny (F1 x F1 = F1**2; 1,306 steers and 707 ...

  13. Utilization of a whole genome SNP panel for efficient genetic mapping in the mouse

    PubMed Central

    Moran, Jennifer L.; Bolton, Andrew D.; Tran, Pamela V.; Brown, Alison; Dwyer, Noelle D.; Manning, Danielle K.; Bjork, Bryan C.; Li, Cheng; Montgomery, Kate; Siepka, Sandra M.; Vitaterna, Martha Hotz; Takahashi, Joseph S.; Wiltshire, Tim; Kwiatkowski, David J.; Kucherlapati, Raju; Beier, David R.

    2006-01-01

    Phenotype-driven genetics can be used to create mouse models of human disease and birth defects. However, the utility of these mutant models is limited without identification of the causal gene. To facilitate genetic mapping, we developed a fixed single nucleotide polymorphism (SNP) panel of 394 SNPs as an alternative to analyses using simple sequence length polymorphism (SSLP) marker mapping. With the SNP panel, chromosomal locations for 22 monogenic mutants were identified. The average number of affected progeny genotyped for mapped monogenic mutations is nine. Map locations for several mutants have been obtained with as few as four affected progeny. The average size of genetic intervals obtained for these mutants is 43 Mb, with a range of 17–83 Mb. Thus, our SNP panel allows for identification of moderate resolution map position with small numbers of mice in a high-throughput manner. Importantly, the panel is suitable for mapping crosses from many inbred and wild-derived inbred strain combinations. The chromosomal localizations obtained with the SNP panel allow one to quickly distinguish between potentially novel loci and remutations in known genes, and facilitate fine mapping and positional cloning. By using this approach, we identified DNA sequence changes in two ethylnitrosourea-induced mutants. PMID:16461637

  14. Development and validation of a low-density SNP panel related to prolificacy in sheep

    Technology Transfer Automated Retrieval System (TEKTRAN)

    High-density SNP panels (e.g., 50,000 and 600,000 markers) have been used in exploratory population genetic studies with commercial and minor breeds of sheep. However, routine genetic diversity evaluations of large numbers of samples with large panels are in general cost-prohibitive for gene banks. ...

  15. SNP discovery in complex allotetraploid genomes (Gossypium spp., Malvaceae) using genotyping by sequencing

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Dramatic decreases in the cost of DNA sequencing have enabled the development of very large numbers of markers based on single nucleotide polymorphism (SNP) for phylogenetic studies, population genetics, linkage mapping, marker-assisted breeding and other applications. Using Illumina next-generatio...

  16. Association mapping of resistance to leaf rust in emmer wheat using high throughput SNP markers

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Emmer wheat (Triticum turgidum L. subsp. dicoccum) is known to be a useful source of genes for many desirable characters for improvement of modern cultivated wheat. Recently, a panel of 181 emmer wheat accessions has been genotyped with wheat 9K SNP (single nucleotide polymorphism) markers and exte...

  17. Development of gene-tagged SNP markers for gland morphogenesis in cotton

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Cotton (Gossypium spp.) plants, including cottonseed, have small, pigmented glands containing gossypol and other terpenoid compounds that are toxic to humans and non-ruminant animals. Single nucleotide polymorphism (SNP) markers involved in gland morphogenesis are useful for the discovery of candid...

  18. Association of Agronomic Traits with SNP Markers in Durum Wheat (Triticum turgidum L. durum (Desf.))

    PubMed Central

    Hu, Xin; Ren, Jing; Ren, Xifeng; Huang, Sisi; Sabiel, Salih A. I.; Luo, Mingcheng; Nevo, Eviatar; Fu, Chunjie; Peng, Junhua; Sun, Dongfa

    2015-01-01

    Association mapping is a powerful approach to detect associations between traits of interest and genetic markers based on linkage disequilibrium (LD) in molecular plant breeding. In this study, 150 durum wheat accessions (Triticum turgidum ssp. durum) of worldwide origin were genotyped using 1,366 SNP markers. The extent of LD on each chromosome was evaluated. Association of single nucleotide polymorphism (SNP) markers with ten agronomic traits measured in four consecutive years was analyzed under a mixed linear model (MLM). Two hundred and one significant association pairs were detected in the four years. Several markers were associated with a single trait, and some were associated with multiple traits. Some of the associated markers were in agreement with previous quantitative trait loci (QTL) analyses. The function and homology analyses of the corresponding ESTs of some SNP markers could explain many of the associations for plant height, length of main spike, number of spikelets on main spike, grain number per plant, and 1000-grain weight, etc. The SNP associations for the observed traits are generally clustered in specific chromosome regions of the wheat genome, mainly in 2A, 5A, 6A, 7A, 1B, and 6B chromosomes. This study demonstrates that association mapping can complement and enhance previous QTL analyses and provide additional information for marker-assisted selection. PMID:26110423

  19. New tools and methods for direct programmatic access to the dbSNP relational database

    PubMed Central

    Saccone, Scott F.; Quan, Jiaxi; Mehta, Gaurang; Bolze, Raphael; Thomas, Prasanth; Deelman, Ewa; Tischfield, Jay A.; Rice, John P.

    2011-01-01

    Genome-wide association studies often incorporate information from public biological databases in order to provide a biological reference for interpreting the results. The dbSNP database is an extensive source of information on single nucleotide polymorphisms (SNPs) for many different organisms, including humans. We have developed free software that will download and install a local MySQL implementation of the dbSNP relational database for a specified organism. We have also designed a system for classifying dbSNP tables in terms of common tasks we wish to accomplish using the database. For each task we have designed a small set of custom tables that facilitate task-related queries and provide entity-relationship diagrams for each task composed from the relevant dbSNP tables. In order to expose these concepts and methods to a wider audience we have developed web tools for querying the database and browsing documentation on the tables and columns to clarify the relevant relational structure. All web tools and software are freely available to the public at http://cgsmd.isi.edu/dbsnpq. Resources such as these for programmatically querying biological databases are essential for viably integrating biological information into genetic association experiments on a genome-wide scale. PMID:21037260

  20. High-throughput RAD-SNP genotyping for characterization of sugar beet genotypes

    Technology Transfer Automated Retrieval System (TEKTRAN)

    High-throughput SNP genotyping provides a rapid way of developing a resourceful set of markers for delineating the genetic architecture and for effective species discrimination. In the presented research, we demonstrate a set of 192 SNPs for effective genotyping in sugar beet using high-throughput mar...

  1. Fine mapping of copy number variations on two cattle genome assemblies using high density SNP array

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Btau_4.0 and UMD3.1 are two distinct cattle reference genome assemblies. In our previous study using the low density BovineSNP50 array, we reported a copy number variation (CNV) analysis on Btau_4.0 with 521 animals of 21 cattle breeds, yielding 682 CNV regions with a total length of 139.8 megabases...

  2. SNP-based high density genetic map and mapping of btwd1 dwarfing gene in barley

    PubMed Central

    Ren, Xifeng; Wang, Jibin; Liu, Lipan; Sun, Genlou; Li, Chengdao; Luo, Hong; Sun, Dongfa

    2016-01-01

    A high-density linkage map is a valuable tool for functional genomics and breeding. A newly developed sequence-based marker technology, restriction site associated DNA (RAD) sequencing, has been proven to be powerful for the rapid discovery and genotyping of genome-wide single nucleotide polymorphism (SNP) markers and for high-density genetic map construction. The objective of this research was to construct a high-density genetic map of barley using RAD sequencing. 1894 high-quality SNP markers were developed and mapped onto all seven chromosomes together with 68 SSR markers. These 1962 markers constituted a total genetic length of 1375.8 cM and an average of 0.7 cM between adjacent loci. The number of markers within each linkage group ranged from 209 to 396. The new recessive dwarfing gene btwd1 in Huaai 11 was mapped onto the high-density linkage maps. The results showed that btwd1 is positioned on chromosome 7H between SNP markers 7HL_6335336 and 7_249275418, at genetic distances of 0.9 cM and 0.7 cM, respectively. The SNP-based high-density genetic map developed and the dwarfing gene btwd1 mapped in this study provide critical information for positional cloning of the btwd1 gene and molecular breeding of barley. PMID:27530597
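
    The marker totals and average spacing quoted in this abstract can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function name and the two ways of counting intervals are assumptions, since the abstract does not say how the 0.7 cM average was derived.

```python
# Figures quoted in the abstract above.
snp_markers = 1894
ssr_markers = 68
total_length_cm = 1375.8
linkage_groups = 7  # the seven barley chromosomes mentioned in the abstract


def mean_spacing(total_cm: float, n_markers: int, n_groups: int = 0) -> float:
    """Average distance between adjacent loci (hypothetical helper).

    With n_groups = 0 the map length is divided by the marker count.
    Otherwise each linkage group contributes (markers - 1) intervals,
    so the divisor is n_markers - n_groups.
    """
    intervals = n_markers - n_groups
    return total_cm / intervals


total_markers = snp_markers + ssr_markers
print(total_markers)
print(round(mean_spacing(total_length_cm, total_markers), 2))
print(round(mean_spacing(total_length_cm, total_markers, linkage_groups), 2))
```

    Either way of counting intervals rounds to the 0.7 cM reported in the abstract.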

  3. Imputation of microsatellite allele from dense SNP genotypes for parentage verification

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Microsatellite (MS) markers have recently been used for parental verification and are still the international standard despite higher cost, error rate, and turnaround time compared with Single Nucleotide Polymorphisms (SNP)-based assays. Despite domestic and international interest from producers an...

  4. Optimal design of low-density SNP arrays for genomic prediction: algorithm and applications

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Low-density (LD) single nucleotide polymorphism (SNP) arrays provide a cost-effective solution for genomic prediction and selection, but algorithms and computational tools are needed for their optimal design. A multiple-objective, local optimization (MOLO) algorithm was developed for design of optim...

  5. selectSNP – An R package for selecting SNPs optimal for genetic evaluation

    Technology Transfer Automated Retrieval System (TEKTRAN)

    There has been a huge increase in the number of SNPs in the public repositories. This has made it a challenge to design low and medium density SNP panels, which requires careful selection of available SNPs considering many criteria, such as map position, allelic frequency, possible biological functi...

  6. SNP-based high density genetic map and mapping of btwd1 dwarfing gene in barley.

    PubMed

    Ren, Xifeng; Wang, Jibin; Liu, Lipan; Sun, Genlou; Li, Chengdao; Luo, Hong; Sun, Dongfa

    2016-01-01

    A high-density linkage map is a valuable tool for functional genomics and breeding. A newly developed sequence-based marker technology, restriction site associated DNA (RAD) sequencing, has been proven to be powerful for the rapid discovery and genotyping of genome-wide single nucleotide polymorphism (SNP) markers and for high-density genetic map construction. The objective of this research was to construct a high-density genetic map of barley using RAD sequencing. 1894 high-quality SNP markers were developed and mapped onto all seven chromosomes together with 68 SSR markers. These 1962 markers constituted a total genetic length of 1375.8 cM and an average of 0.7 cM between adjacent loci. The number of markers within each linkage group ranged from 209 to 396. The new recessive dwarfing gene btwd1 in Huaai 11 was mapped onto the high-density linkage maps. The results showed that btwd1 is positioned on chromosome 7H between SNP markers 7HL_6335336 and 7_249275418, at genetic distances of 0.9 cM and 0.7 cM, respectively. The SNP-based high-density genetic map developed and the dwarfing gene btwd1 mapped in this study provide critical information for positional cloning of the btwd1 gene and molecular breeding of barley. PMID:27530597

  7. Development and Characterization of a High Density SNP Genotyping Assay for Cattle

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The success of genomewide association (GWA) studies for the detection of sequence variation affecting complex traits in human has spurred interest in the use of large-scale high-density single nucleotide polymorphism (SNP) genotyping for the identification of quantitative trait loci (QTL) and for ma...

  8. An improved consensus linkage map of barley based on flow-sorted chromosomes and SNP markers

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Recent advances in high-throughput genotyping have made it easier to combine information from different mapping populations into consensus genetic maps, which provide increased marker density and genome coverage compared to individual maps. Previously, a SNP-based genotyping platform was developed a...

  9. The development and characterization of a 57K SNP Chip for rainbow trout

    Technology Transfer Automated Retrieval System (TEKTRAN)

    In this paper we describe the development and characterization of the first high density SNP chip for rainbow trout. The chip included 57,500 putative SNPs, of which 49,500 (86%) were validated as high quality and polymorphic in our validation panel of 960 rainbow trout samples. This array is compa...

  10. Longevity and Plasticity of CFTR Provide an Argument for Noncanonical SNP Organization in Hominid DNA

    PubMed Central

    Hill, Aubrey E.; Plyler, Zackery E.; Tiwari, Hemant; Patki, Amit; Tully, Joel P.; McAtee, Christopher W.; Moseley, Leah A.; Sorscher, Eric J.

    2014-01-01

    Like many other ancient genes, the cystic fibrosis transmembrane conductance regulator (CFTR) has survived for hundreds of millions of years. In this report, we consider whether such prodigious longevity of an individual gene – as opposed to an entire genome or species – should be considered surprising in the face of eons of relentless DNA replication errors, mutagenesis, and other causes of sequence polymorphism. The conventions that modern human SNP patterns result either from purifying selection or random (neutral) drift were not well supported, since extant models account rather poorly for the known plasticity and function (or the established SNP distributions) found in a multitude of genes such as CFTR. Instead, our analysis can be taken as a polemic indicating that SNPs in CFTR and many other mammalian genes may have been generated—and continue to accrue—in a fundamentally more organized manner than would otherwise have been expected. The resulting viewpoint contradicts earlier claims of ‘directional’ or ‘intelligent design-type’ SNP formation, and has important implications regarding the pace of DNA adaptation, the genesis of conserved non-coding DNA, and the extent to which eukaryotic SNP formation should be viewed as adaptive. PMID:25350658

  11. Recombination mapping using Boolean logic and high-density SNP genotyping for exome sequence filtering

    PubMed Central

    Markello, Thomas C.; Han, Ted; Carlson-Donohoe, Hannah; Ahaghotu, Chidi; Harper, Ursula; Jones, MaryPat; Chandrasekharappa, Settara; Anikster, Yair; Adams, David R.; Gahl, William A.; Boerkoel, Cornelius F.

    2012-01-01

    Whole genome sequence data has been shown to provide sufficient information to resolve detailed haplotypes in small pedigrees. Using such information, recombinations can be mapped onto chromosomes, compared with the segregation of a disease of interest and used to filter genome sequence variants. We now show that relatively inexpensive SNP array data from small pedigrees can be used in a similar manner to provide a means of identifying regions of interest in exome sequencing projects. We demonstrate that in those situations where one can assume complete penetrance and parental DNA is available, SNP recombination mapping using Boolean logic identifies chromosomal regions identical to those detected by multipoint linkage using microsatellites but with much less computation. We further show that this approach is successful because the probability of a double crossover between informative SNP loci is negligible. Our observations provide a rationale for using SNP arrays and recombination mapping as a rapid and cost-effective means of incorporating chromosome segregation information into exome sequencing projects intended for disease-gene identification. PMID:22264778
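
    The abstract does not spell out the Boolean rules, so the following is only an illustrative sketch of the general idea and not the authors' implementation. It assumes genotypes are two-letter strings such as "AB", that a SNP is informative when one parent is heterozygous and the other homozygous, and that two siblings are compared by asking whether they received the same allele from the heterozygous parent. All function names are hypothetical.

```python
from typing import List, Optional, Tuple


def transmitted_allele(het_parent: str, hom_parent: str, child: str) -> Optional[str]:
    """Allele passed on by the heterozygous parent, or None if it cannot be told."""
    if len(set(het_parent)) != 2 or len(set(hom_parent)) != 1:
        return None  # SNP is not informative under the assumed rule
    alleles = list(child)
    if hom_parent[0] not in alleles:
        return None  # Mendelian inconsistency, treat as missing
    alleles.remove(hom_parent[0])
    return alleles[0] if alleles[0] in het_parent else None


def sharing_flags(parents: List[Tuple[str, str]], sib1: List[str],
                  sib2: List[str]) -> List[Optional[bool]]:
    """Per-SNP Boolean: did both siblings inherit the same allele?"""
    flags: List[Optional[bool]] = []
    for (het, hom), g1, g2 in zip(parents, sib1, sib2):
        a1 = transmitted_allele(het, hom, g1)
        a2 = transmitted_allele(het, hom, g2)
        flags.append(None if a1 is None or a2 is None else a1 == a2)
    return flags


def shared_runs(flags: List[Optional[bool]]) -> List[Tuple[int, int]]:
    """Index ranges of consecutive informative SNPs where the flag is True."""
    runs, start, last = [], None, None
    for i, flag in enumerate(flags):
        if flag is None:
            continue  # uninformative SNPs do not break a run
        if flag:
            start = i if start is None else start
            last = i
        elif start is not None:
            runs.append((start, last))
            start = None
    if start is not None:
        runs.append((start, last))
    return runs


# Toy data: five SNPs, parent pairs given as (heterozygous, homozygous).
parents = [("AB", "AA"), ("AA", "AA"), ("AB", "BB"), ("AB", "AA"), ("AB", "AA")]
affected_1 = ["AB", "AA", "BB", "AA", "AB"]
affected_2 = ["AB", "AA", "BB", "AB", "AB"]
flags = sharing_flags(parents, affected_1, affected_2)
print(flags)
print(shared_runs(flags))
```

    Under complete penetrance, a region in which two affected siblings stop sharing the same parental allele can be excluded, which is why a switch from True to False marks a recombination boundary in this toy model.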

  12. A web-based genome browser for 'SNP-aware' assay design

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Human and animal genomes contain an abundance of single nucleotide polymorphisms (SNPs) that are useful for genetic testing. However, the relatively large number of SNPs present in diverse populations can pose serious problems when designing assays. It is important to “mask” some SNP positions so ...

  13. The impact of SNP fingerprinting and parentage analysis on the effectiveness of variety recommendations in cacao

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Evidence for the impact of mislabeling and/or pollen contamination on consistency of field performance has been lacking to reinforce the need for strict adherence to quality control protocols in cacao seed garden and germplasm plot management. The present study used SNP fingerprinting at 64 loci to ...

  14. Development of the catfish 250K SNP array for genome-wide association studies

    PubMed Central

    2014-01-01

    Background: Quantitative traits, such as disease resistance, are most often controlled by a set of genes involving a complex array of regulation. Dissecting the genetic basis of quantitative traits requires large numbers of genetic markers with good genome coverage. The application of next-generation sequencing technologies has allowed discovery of over eight million SNPs in catfish, but the challenge remains as to how to efficiently and economically use such SNP resources for genetic analysis. Results: In this work, we developed a catfish 250K SNP array using Affymetrix Axiom genotyping technology. The SNPs were obtained from multiple sources including gene-associated SNPs, anonymous genomic SNPs, and inter-specific SNPs. A set of 640K high-quality SNPs obtained following specific requirements of array design were submitted. A panel of 250,113 SNPs was finalized for inclusion on the array. The performance evaluated by genotyping individuals from wild populations and backcross families suggested the good utility of the catfish 250K SNP array. Conclusions: This is the first high-density SNP array for catfish. The array should be a valuable resource for genome-wide association studies (GWAS), fine QTL mapping, high-density linkage map construction, haplotype analysis, and whole genome-based selection. PMID:24618043

  15. Construction and application of a bovine high-density SNP assay

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Bovine genomics has entered a new era and has been transformed by the availability of the whole genome sequence data. An additional resource currently under development is a 60,000 single nucleotide polymorphism (SNP) array that will soon be made commercially available. Targeted content for this SN...

  16. SNP discovery in candidate adaptive genes using exon capture in a free-ranging alpine ungulate.

    PubMed

    Roffler, Gretchen H; Amish, Stephen J; Smith, Seth; Cosart, Ted; Kardos, Marty; Schwartz, Michael K; Luikart, Gordon

    2016-09-01

    Identification of genes underlying genomic signatures of natural selection is key to understanding adaptation to local conditions. We used targeted resequencing to identify SNP markers in 5321 candidate adaptive genes associated with known immunological, metabolic and growth functions in ovids and other ungulates. We selectively targeted 8161 exons in protein-coding and nearby 5' and 3' untranslated regions of chosen candidate genes. Targeted sequences were taken from bighorn sheep (Ovis canadensis) exon capture data and directly from the domestic sheep genome (Ovis aries v. 3; oviAri3). The bighorn sheep sequences used in the Dall's sheep (Ovis dalli dalli) exon capture aligned to 2350 genes on the oviAri3 genome with an average of 2 exons each. We developed a microfluidic qPCR-based SNP chip to genotype 476 Dall's sheep from locations across their range and test for patterns of selection. Using multiple corroborating approaches (lositan and bayescan), we detected 28 SNP loci potentially under selection. We additionally identified candidate loci significantly associated with latitude, longitude, precipitation and temperature, suggesting local environmental adaptation. The three methods demonstrated consistent support for natural selection on nine genes with immune and disease-regulating functions (e.g. Ovar-DRA, APC, BATF2, MAGEB18), cell regulation signalling pathways (e.g. KRIT1, PI3K, ORRC3), and respiratory health (CYSLTR1). Characterizing adaptive allele distributions from novel genetic techniques will facilitate investigation of the influence of environmental variation on local adaptation of a northern alpine ungulate throughout its range. This research demonstrated the utility of exon capture for gene-targeted SNP discovery and subsequent SNP chip genotyping using low-quality samples in a nonmodel species. PMID:27327375

  17. SNP Discovery in the Transcriptome of White Pacific Shrimp Litopenaeus vannamei by Next Generation Sequencing

    PubMed Central

    Yu, Yang; Wei, Jiankai; Zhang, Xiaojun; Liu, Jingwen; Liu, Chengzhang; Li, Fuhua; Xiang, Jianhai

    2014-01-01

    The application of next generation sequencing technology has greatly facilitated high throughput single nucleotide polymorphism (SNP) discovery and genotyping in genetic research. In the present study, SNPs were discovered based on two transcriptomes of Litopenaeus vannamei (L. vannamei) generated from Illumina sequencing platform HiSeq 2000. One transcriptome of L. vannamei was obtained through sequencing on the RNA from larvae at mysis stage and its reference sequence was de novo assembled. The data from another transcriptome were downloaded from NCBI and the reads of the two transcriptomes were mapped separately to the assembled reference by BWA. SNP calling was performed using SAMtools. A total of 58,717 and 36,277 SNPs with high quality were predicted from the two transcriptomes, respectively. SNP calling was also performed using the reads of two transcriptomes together, and a total of 96,040 SNPs with high quality were predicted. Among these 96,040 SNPs, 5,242 and 29,129 were predicted as non-synonymous and synonymous SNPs respectively. Characterization analysis of the predicted SNPs in L. vannamei showed that the estimated SNP frequency was 0.21% (one SNP per 476 bp) and the estimated ratio for transition to transversion was 2.0. Fifty SNPs were randomly selected for validation by Sanger sequencing after PCR amplification and 76% of SNPs were confirmed, which indicated that the SNPs predicted in this study were reliable. These SNPs will be very useful for genetic study in L. vannamei, especially for the high density linkage map construction and genome-wide association studies. PMID:24498047
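
    The summary statistics in this abstract follow from simple ratios. The sketch below shows how each could be computed. The helper names and the three-SNP example list are hypothetical; only the figures 476 bp, 50 SNPs and 76% come from the abstract.

```python
from typing import Iterable, Tuple

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def snp_frequency_percent(bp_per_snp: float) -> float:
    """Convert 'one SNP per N bp' into a percentage of sites."""
    return 100.0 / bp_per_snp


def is_transition(ref: str, alt: str) -> bool:
    """True for purine<->purine or pyrimidine<->pyrimidine changes."""
    pair = {ref, alt}
    return ref != alt and (pair <= PURINES or pair <= PYRIMIDINES)


def ts_tv_ratio(snps: Iterable[Tuple[str, str]]) -> float:
    """Ratio of transitions to transversions in a list of (ref, alt) pairs."""
    snps = list(snps)
    ts = sum(is_transition(r, a) for r, a in snps)
    tv = len(snps) - ts
    return ts / tv


# One SNP per 476 bp, as reported in the abstract.
print(round(snp_frequency_percent(476), 2))

# 76% of the 50 SNPs sent for Sanger validation.
print(round(50 * 0.76))

# Hypothetical toy list: two transitions and one transversion.
print(ts_tv_ratio([("A", "G"), ("C", "T"), ("A", "C")]))
```

    A transition-to-transversion ratio near 2.0, as reported here, means transitions were about twice as common as transversions among the predicted SNPs.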

  18. Development and Validation of a High-Density SNP Genotyping Array for African Oil Palm.

    PubMed

    Kwong, Qi Bin; Teh, Chee Keng; Ong, Ai Ling; Heng, Huey Ying; Lee, Heng Leng; Mohamed, Mohaimi; Low, Joel Zi-Bin; Apparow, Sukganah; Chew, Fook Tim; Mayes, Sean; Kulaveerasingam, Harikrishna; Tammi, Martti; Appleton, David Ross

    2016-08-01

    High-density single nucleotide polymorphism (SNP) genotyping arrays are powerful tools that can measure the level of genetic polymorphism within a population. To develop a whole-genome SNP array for oil palms, SNP discovery was performed using deep resequencing of eight libraries derived from 132 Elaeis guineensis and Elaeis oleifera palms belonging to 59 origins, resulting in the discovery of >3 million putative SNPs. After SNP filtering, the Illumina OP200K custom array was built with 170 860 successful probes. Phenetic clustering analysis revealed that the array could distinguish between palms of different origins in a way consistent with pedigree records. Genome-wide linkage disequilibrium declined more slowly for the commercial populations (ranging from 120 kb at r² = 0.43 to 146 kb at r² = 0.50) when compared with the semi-wild populations (19.5 kb at r² = 0.22). Genetic fixation mapping comparing the semi-wild and commercial population identified 321 selective sweeps. A genome-wide association study (GWAS) detected a significant peak on chromosome 2 associated with the polygenic component of the shell thickness trait (based on the trait shell-to-fruit; S/F %) in tenera palms. Testing of a genomic selection model on the same trait resulted in good prediction accuracy (r = 0.65) with 42% of the S/F % variation explained. The first high-density SNP genotyping array for oil palm has been developed and shown to be robust for use in genetic studies and with potential for developing early trait prediction to shorten the oil palm breeding cycle. PMID:27112659
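
    The linkage disequilibrium values in this abstract are squared correlations between alleles at two loci. As a reminder of what that statistic measures, the sketch below computes it from haplotype and allele frequencies using the standard definition. The abstract does not state which software or estimator the authors used, so the function name and the example frequencies are assumptions.

```python
def ld_r_squared(p_ab: float, p_a: float, p_b: float) -> float:
    """Squared correlation between two biallelic loci.

    p_ab is the frequency of the haplotype carrying allele A at locus 1
    and allele B at locus 2; p_a and p_b are the allele frequencies.
    """
    d = p_ab - p_a * p_b  # disequilibrium coefficient D
    denom = p_a * (1.0 - p_a) * p_b * (1.0 - p_b)
    if denom == 0.0:
        raise ValueError("both loci must be polymorphic")
    return d * d / denom


# Complete association: allele A always travels with allele B.
print(ld_r_squared(p_ab=0.5, p_a=0.5, p_b=0.5))

# Independence: the haplotype frequency equals the product of allele frequencies.
print(ld_r_squared(p_ab=0.25, p_a=0.5, p_b=0.5))
```

    Plotting this value against the physical distance between marker pairs gives the kind of decay curve summarized in the abstract.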

  19. Estrogen, SNP-Dependent Chemokine Expression and Selective Estrogen Receptor Modulator Regulation.

    PubMed

    Ho, Ming-Fen; Bongartz, Tim; Liu, Mohan; Kalari, Krishna R; Goss, Paul E; Shepherd, Lois E; Goetz, Matthew P; Kubo, Michiaki; Ingle, James N; Wang, Liewei; Weinshilboum, Richard M

    2016-03-01

    We previously reported, on the basis of a genome-wide association study for aromatase inhibitor-induced musculoskeletal symptoms, that single-nucleotide polymorphisms (SNPs) near the T-cell leukemia/lymphoma 1A (TCL1A) gene were associated with aromatase inhibitor-induced musculoskeletal pain and with estradiol (E2)-induced TCL1A expression. Furthermore, variation in TCL1A expression influenced the downstream expression of proinflammatory cytokines and cytokine receptors. Specifically, the top hit genome-wide association study SNP, rs11849538, created a functional estrogen response element (ERE) that displayed estrogen receptor (ER) binding and increased E2 induction of TCL1A expression only for the variant SNP genotype. In the present study, we pursued mechanisms underlying the E2-SNP-dependent regulation of TCL1A expression and, in parallel, our subsequent observations that SNPs at a distance from EREs can regulate ERα binding and that ER antagonists can reverse phenotypes associated with those SNPs. Specifically, we performed a series of functional genomic studies using a large panel of lymphoblastoid cell lines with dense genomic data that demonstrated that TCL1A SNPs at a distance from EREs can modulate ERα binding and expression of TCL1A as well as the expression of downstream immune mediators. Furthermore, 4-hydroxytamoxifen or fulvestrant could reverse these SNP-genotype effects. Similar results were found for SNPs in the IL17A cytokine and CCR6 chemokine receptor genes. These observations greatly expand our previous results and support the existence of a novel molecular mechanism that contributes to the complex interplay between estrogens and immune systems. They also raise the possibility of the pharmacological manipulation of the expression of proinflammatory cytokines and chemokines in a SNP genotype-dependent fashion. PMID:26866883

  20. Development and Characterization of a High Density SNP Genotyping Assay for Cattle

    PubMed Central

    Matukumalli, Lakshmi K.; Lawley, Cynthia T.; Schnabel, Robert D.; Taylor, Jeremy F.; Allan, Mark F.; Heaton, Michael P.; O'Connell, Jeff; Moore, Stephen S.; Smith, Timothy P. L.; Sonstegard, Tad S.; Van Tassell, Curtis P.

    2009-01-01

    The success of genome-wide association (GWA) studies for the detection of sequence variation affecting complex traits in human has spurred interest in the use of large-scale high-density single nucleotide polymorphism (SNP) genotyping for the identification of quantitative trait loci (QTL) and for marker-assisted selection in model and agricultural species. A cost-effective and efficient approach for the development of a custom genotyping assay interrogating 54,001 SNP loci to support GWA applications in cattle is described. A novel algorithm for achieving a compressed inter-marker interval distribution proved remarkably successful, with median interval of 37 kb and maximum predicted gap of <350 kb. The assay was tested on a panel of 576 animals from 21 cattle breeds and six outgroup species and revealed that from 39,765 to 46,492 SNP are polymorphic within individual breeds (average minor allele frequency (MAF) ranging from 0.24 to 0.27). The assay also identified 79 putative copy number variants in cattle. Utility for GWA was demonstrated by localizing known variation for coat color and the presence/absence of horns to their correct genomic locations. The combination of SNP selection and the novel spacing algorithm allows an efficient approach for the development of high-density genotyping platforms in species having full or even moderate quality draft sequence. Aspects of the approach can be exploited in species which lack an available genome sequence. The BovineSNP50 assay described here is commercially available from Illumina and provides a robust platform for mapping disease genes and QTL in cattle. PMID:19390634
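
    Minor allele frequency (MAF) and the polymorphic/monomorphic distinction used in this abstract can be derived from genotype counts at a single SNP. The sketch below is a minimal illustration. The function names and the example counts are hypothetical and are not taken from the study.

```python
def minor_allele_frequency(n_aa: int, n_ab: int, n_bb: int) -> float:
    """MAF at one biallelic SNP from counts of AA, AB and BB genotypes."""
    n_alleles = 2 * (n_aa + n_ab + n_bb)
    if n_alleles == 0:
        raise ValueError("no genotyped animals")
    freq_b = (2 * n_bb + n_ab) / n_alleles
    return min(freq_b, 1.0 - freq_b)


def is_polymorphic(n_aa: int, n_ab: int, n_bb: int) -> bool:
    """A SNP is polymorphic in a sample when both alleles are observed."""
    return minor_allele_frequency(n_aa, n_ab, n_bb) > 0.0


# Hypothetical breed sample of 100 animals.
print(minor_allele_frequency(50, 40, 10))
print(is_polymorphic(50, 40, 10))
print(is_polymorphic(100, 0, 0))
```

    Averaging the per-SNP MAF over all polymorphic loci within a breed would give a figure comparable to the 0.24 to 0.27 range quoted above.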

  1. Environmental Response and Genomic Regions Correlated with Rice Root Growth and Yield under Drought in the OryzaSNP Panel across Multiple Study Systems.

    PubMed

    Wade, Len J; Bartolome, Violeta; Mauleon, Ramil; Vasant, Vivek Deshmuck; Prabakar, Sumeet Mankar; Chelliah, Muthukumar; Kameoka, Emi; Nagendra, K; Reddy, K R Kamalnath; Varma, C Mohan Kumar; Patil, Kalmeshwar Gouda; Shrestha, Roshi; Al-Shugeairy, Zaniab; Al-Ogaidi, Faez; Munasinghe, Mayuri; Gowda, Veeresh; Semon, Mande; Suralta, Roel R; Shenoy, Vinay; Vadez, Vincent; Serraj, Rachid; Shashidhar, H E; Yamauchi, Akira; Babu, Ranganathan Chandra; Price, Adam; McNally, Kenneth L; Henry, Amelia

    2015-01-01

    The rapid progress in rice genotyping must be matched by advances in phenotyping. A better understanding of genetic variation in rice for drought response, root traits, and practical methods for studying them are needed. In this study, the OryzaSNP set (20 diverse genotypes that have been genotyped for SNP markers) was phenotyped in a range of field and container studies to study the diversity of rice root growth and response to drought. Of the root traits measured across more than 20 root experiments, root dry weight showed the most stable genotypic performance across studies. The environment (E) component had the strongest effect on yield and root traits. We identified genomic regions correlated with root dry weight, percent deep roots, maximum root depth, and grain yield based on a correlation analysis with the phenotypes and aus, indica, or japonica introgression regions using the SNP data. Two genomic regions were identified as hot spots in which root traits and grain yield were co-located; on chromosome 1 (39.7-40.7 Mb) and on chromosome 8 (20.3-21.9 Mb). Across experiments, the soil type/ growth medium showed more correlations with plant growth than the container dimensions. Although the correlations among studies and genetic co-location of root traits from a range of study systems points to their potential utility to represent responses in field studies, the best correlations were observed when the two setups had some similar properties. Due to the co-location of the identified genomic regions (from introgression block analysis) with QTL for a number of previously reported root and drought traits, these regions are good candidates for detailed characterization to contribute to understanding rice improvement for response to drought. This study also highlights the utility of characterizing a small set of 20 genotypes for root growth, drought response, and related genomic regions. PMID:25909711

  2. Environmental Response and Genomic Regions Correlated with Rice Root Growth and Yield under Drought in the OryzaSNP Panel across Multiple Study Systems

    PubMed Central

    Wade, Len J.; Bartolome, Violeta; Mauleon, Ramil; Vasant, Vivek Deshmuck; Prabakar, Sumeet Mankar; Chelliah, Muthukumar; Kameoka, Emi; Nagendra, K.; Reddy, K. R. Kamalnath; Varma, C. Mohan Kumar; Patil, Kalmeshwar Gouda; Shrestha, Roshi; Al-Shugeairy, Zaniab; Al-Ogaidi, Faez; Munasinghe, Mayuri; Gowda, Veeresh; Semon, Mande; Suralta, Roel R.; Shenoy, Vinay; Vadez, Vincent; Serraj, Rachid; Shashidhar, H. E.; Yamauchi, Akira; Babu, Ranganathan Chandra; Price, Adam; McNally, Kenneth L.; Henry, Amelia

    2015-01-01

    The rapid progress in rice genotyping must be matched by advances in phenotyping. A better understanding of genetic variation in rice for drought response, root traits, and practical methods for studying them are needed. In this study, the OryzaSNP set (20 diverse genotypes that have been genotyped for SNP markers) was phenotyped in a range of field and container studies to study the diversity of rice root growth and response to drought. Of the root traits measured across more than 20 root experiments, root dry weight showed the most stable genotypic performance across studies. The environment (E) component had the strongest effect on yield and root traits. We identified genomic regions correlated with root dry weight, percent deep roots, maximum root depth, and grain yield based on a correlation analysis with the phenotypes and aus, indica, or japonica introgression regions using the SNP data. Two genomic regions were identified as hot spots in which root traits and grain yield were co-located; on chromosome 1 (39.7–40.7 Mb) and on chromosome 8 (20.3–21.9 Mb). Across experiments, the soil type/ growth medium showed more correlations with plant growth than the container dimensions. Although the correlations among studies and genetic co-location of root traits from a range of study systems points to their potential utility to represent responses in field studies, the best correlations were observed when the two setups had some similar properties. Due to the co-location of the identified genomic regions (from introgression block analysis) with QTL for a number of previously reported root and drought traits, these regions are good candidates for detailed characterization to contribute to understanding rice improvement for response to drought. This study also highlights the utility of characterizing a small set of 20 genotypes for root growth, drought response, and related genomic regions. PMID:25909711

  3. Snat: a SNP annotation tool for bovine by integrating various sources of genomic information

    PubMed Central

    2011-01-01

    Background: Most recently, with the maturing of bovine genome sequencing and high throughput SNP genotyping technologies, a large number of significant SNPs associated with economically important traits can be identified by genome-wide association studies (GWAS). To further determine true association findings in GWAS, the common strategy is to sift out the most promising SNPs for follow-up replication studies. Hence it is crucial to explore the functional significance of the candidate SNPs in order to screen and select the potential functional ones. To systematically prioritize these statistically significant SNPs and facilitate follow-up replication studies, we developed a bovine SNP annotation tool (Snat) based on a web interface. Results: With Snat, various sources of genomic information are integrated and retrieved from several leading online databases, including SNP information from dbSNP, gene information from Entrez Gene, protein features from UniProt, linkage information from AnimalQTLdb, conserved elements from UCSC Genome Browser Database and gene functions from Gene Ontology (GO), KEGG PATHWAY and Online Mendelian Inheritance in Animals (OMIA). Snat provides two different applications, including a CGI-based web utility and a command-line version, to access the integrated database, target any single nucleotide loci of interest and perform multi-level functional annotations. For further validation of the practical significance of our study, SNPs involved in two commercial bovine SNP chips, i.e., the Affymetrix Bovine 10K chip array and the Illumina 50K chip array, have been annotated by Snat, and the corresponding outputs can be directly downloaded from the Snat website. Furthermore, a real dataset involving 20 identified SNPs associated with milk yield in our recent GWAS was employed to demonstrate the practical significance of Snat. Conclusions: To the best of our knowledge, Snat is one of the first tools focusing on SNP annotation for livestock. Snat confers researchers with a

  4. A robust statistical method to detect null alleles in microsatellite and SNP datasets in both panmictic and inbred populations.

    PubMed

    Girard, Philippe

    2011-01-01

    Null alleles are common technical artifacts in genetic-based analysis. Powerful methods enabling their detection in either panmictic or inbred populations have been proposed. However, none of these methods appears unbiased in both types of mating systems, necessitating a priori knowledge of the inbreeding level of the population under study. To counter this problem, I propose to use the software FDist2 to detect the atypical fixation indices that characterize markers with null alleles. The rationale behind this approach and the parameter settings are explained. The power of the method for various sample sizes, degrees of inbreeding and null allele frequencies is evaluated using simulated microsatellite and SNP datasets and then compared to two other null allele detection methods. The results clearly show the robustness of the method proposed here as well as its greater accuracy in both panmictic and inbred populations for both types of marker. By allowing a proper detection of null alleles for a wide range of mating systems and markers, this new method is particularly appealing for numerous genetic studies using co-dominant loci. PMID:21381434

  5. Population genomic structure and linkage disequilibrium analysis of South African goat breeds using genome-wide SNP data.

    PubMed

    Mdladla, K; Dzomba, E F; Huson, H J; Muchadeyi, F C

    2016-08-01

    The sustainability of goat farming in marginal areas of southern Africa depends on local breeds that are adapted to specific agro-ecological conditions. Unimproved non-descript goats are the main genetic resources used for the development of commercial meat-type breeds of South Africa. Little is known about genetic diversity and the genetics of adaptation of these indigenous goat populations. This study investigated the genetic diversity, population structure and breed relations, linkage disequilibrium, effective population size and persistence of gametic phase in goat populations of South Africa. Three locally developed meat-type breeds of the Boer (n = 33), Savanna (n = 31), Kalahari Red (n = 40), a feral breed of Tankwa (n = 25) and unimproved non-descript village ecotypes (n = 110) from four goat-producing provinces of the Eastern Cape, KwaZulu-Natal, Limpopo and North West were assessed using the Illumina Goat 50K SNP Bead Chip assay. The proportion of SNPs with minor allele frequencies >0.05 ranged from 84.22% in the Tankwa to 97.58% in the Xhosa ecotype, with a mean of 0.32 ± 0.13 across populations. Principal components analysis, admixture and pairwise FST identified Tankwa as a genetically distinct population and supported clustering of the populations according to their historical origins. Genome-wide FST identified 101 markers potentially under positive selection in the Tankwa. Average linkage disequilibrium was highest in the Tankwa (r² = 0.25 ± 0.26) and lowest in the village ecotypes (r² range = 0.09 ± 0.12 to 0.11 ± 0.14). We observed an effective population size of <150 for all populations 13 generations ago. The estimated correlations for all breed pairs were lower than 0.80 at marker distances >100 kb with the exception of those in Savanna and Tswana populations. This study highlights the high level of genetic diversity in South African indigenous goats as well as the utility of the genome-wide SNP marker panels in

  6. Genome-wide SNP association-based localization of a dwarfism gene in Friesian dwarf horses.

    PubMed

    Orr, N; Back, W; Gu, J; Leegwater, P; Govindarajan, P; Conroy, J; Ducro, B; Van Arendonk, J A M; MacHugh, D E; Ennis, S; Hill, E W; Brama, P A J

    2010-12-01

    The recent completion of the horse genome and commercial availability of an equine SNP genotyping array have facilitated the mapping of disease genes. We report putative localization of the gene responsible for dwarfism, a trait in Friesian horses that is thought to have a recessive mode of inheritance, to a 2-MB region of chromosome 14 using just 10 affected animals and 10 controls. We successfully genotyped 34,429 SNPs that were tested for association with dwarfism using chi-square tests. The most significant SNP in our study, BIEC2-239376 (P_2df = 4.54 × 10^-5, P_rec = 7.74 × 10^-6), is located close to a gene implicated in human dwarfism. Fine-mapping and resequencing analyses did not aid in further localization of the causative variant, and replication of our findings in independent sample sets will be necessary to confirm these results. PMID:21070269
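
    As an illustration of the chi-square tests named in the abstract above, the following minimal sketch computes a genotypic (2 df) and a recessive (1 df) Pearson chi-square test for a single SNP. The genotype counts are hypothetical and are not data from the study; the function names are assumptions made for this example. With only 10 cases and 10 controls the chi-square approximation is rough, so the sketch shows the arithmetic rather than a recommended analysis.

```python
import math


def chi_square_stat(table):
    """Pearson chi-square statistic for a table of counts (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:  # skip empty columns
                stat += (observed - expected) ** 2 / expected
    return stat


def chi_square_p_value(stat, df):
    """Upper-tail p-value; closed forms exist for 1 and 2 degrees of freedom."""
    if df == 1:
        return math.erfc(math.sqrt(stat / 2))
    if df == 2:
        return math.exp(-stat / 2)
    raise ValueError("this sketch only supports df = 1 or df = 2")


# Hypothetical genotype counts (AA, AB, BB) for one SNP.
cases = [0, 1, 9]
controls = [4, 5, 1]

# Genotypic test: 2 x 3 table, 2 degrees of freedom.
genotypic_stat = chi_square_stat([cases, controls])
genotypic_p = chi_square_p_value(genotypic_stat, df=2)

# Recessive test: collapse to (AA + AB) versus BB, 1 degree of freedom.
recessive_table = [[cases[0] + cases[1], cases[2]],
                   [controls[0] + controls[1], controls[2]]]
recessive_stat = chi_square_stat(recessive_table)
recessive_p = chi_square_p_value(recessive_stat, df=1)

print(f"genotypic: chi2 = {genotypic_stat:.3f}, p = {genotypic_p:.2e}")
print(f"recessive: chi2 = {recessive_stat:.3f}, p = {recessive_p:.2e}")
```

    In a genome-wide scan the same test would be repeated for every genotyped SNP, which is why very small p-values are needed before a locus is reported.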

  7. SNPit: a federated data integration system for the purpose of functional SNP annotation.

    PubMed

    Shen, Terry H; Carlson, Christopher S; Tarczy-Hornoch, Peter

    2009-08-01

    Genome wide association studies can potentially identify the genetic causes behind the majority of human diseases. With the advent of more advanced genotyping techniques, there is now an explosion of data gathered on single nucleotide polymorphisms (SNPs). The need exists for an integrated system that can provide up-to-date functional annotation information on SNPs. We have developed the SNP Integration Tool (SNPit) system to address this need. Built upon a federated data integration system, SNPit provides current information on a comprehensive list of SNP data sources. Additional logical inference analysis was included through an inference engine plug in. The SNPit web servlet is available online for use. SNPit allows users to go to one source for up-to-date information on the functional annotation of SNPs. A tool that can help to integrate and analyze the potential functional significance of SNPs is important for understanding the results from genome wide association studies. PMID:19327864

  8. Applying SNP-Derived Molecular Coancestry Estimates to Captive Breeding Programs.

    PubMed

    Ivy, Jamie A; Putnam, Andrea S; Navarro, Asako Y; Gurr, Jessica; Ryder, Oliver A

    2016-09-01

    Captive breeding programs for wildlife species typically rely on pedigrees to inform genetic management. Although pedigree-based breeding strategies are quite effective at retaining long-term genetic variation, management of zoo-based breeding programs continues to be hampered when pedigrees are poorly known. The objective of this study was to evaluate 2 options for generating single nucleotide polymorphism (SNP) data to resolve unknown relationships within captive breeding programs. We generated SNP data for a zoo-based population of addax (Addax nasomaculatus) using both the Illumina BovineHD BeadChip and double digest restriction site-associated DNA (ddRAD) sequencing. Our results demonstrated that estimates of allele sharing (AS) between pairs of individuals exhibited low variances. Average AS variances were highest when using 50 loci (SNPchipall = 0.00159; ddRADall = 0.0249), but fell below 0.0003 for the SNP chip dataset when sampling ≥250 loci and below 0.0025 for the ddRAD dataset when sampling ≥500 loci. Furthermore, the correlation between the SNPchipall and ddRADall AS datasets was 0.88 (95%CI = 0.84-0.91) when subsampling 500 loci. Collectively, our results indicated that both SNP genotyping methods produced sufficient data for accurately estimating relationships, even within an extremely bottlenecked population. Our results also suggested that analytic assumptions historically integrated into the addax pedigree are not adversely impacting long-term pedigree-based management; kinships calculated from the analytic pedigree were significantly correlated (P < 0.001) with AS estimates. Overall, our conclusions are intended to serve as both a proof of concept and a model for applying molecular data to the genetic management of captive breeding programs. PMID:27208150
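
    The abstract above does not define its allele sharing (AS) estimator, so the sketch below assumes one common identity-by-state definition: genotypes are coded as 0, 1 or 2 copies of a reference allele, and the sharing at a locus is 1 - |g1 - g2| / 2, averaged over loci typed in both individuals. The function name and the example genotypes are hypothetical.

```python
def allele_sharing(genotypes_a, genotypes_b):
    """Mean per-locus allele sharing between two individuals.

    Genotypes are coded 0/1/2 (copies of a reference allele); None marks a
    missing call, and loci missing in either individual are skipped.
    """
    per_locus = []
    for a, b in zip(genotypes_a, genotypes_b):
        if a is None or b is None:
            continue
        per_locus.append(1 - abs(a - b) / 2)
    if not per_locus:
        raise ValueError("no loci typed in both individuals")
    return sum(per_locus) / len(per_locus)


# Hypothetical genotypes at five loci; the last locus is missing in animal_1.
animal_1 = [0, 1, 2, 2, None]
animal_2 = [0, 2, 0, 2, 1]

print(allele_sharing(animal_1, animal_2))
```

    Repeating the calculation on random subsets of loci, as the study describes, would show how the variance of the estimate shrinks as more loci are sampled.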

  9. Genome Rearrangements Detected by SNP Microarrays in Individuals with Intellectual Disability Referred with Possible Williams Syndrome

    PubMed Central

    Pani, Ariel M.; Hobart, Holly H.; Morris, Colleen A.; Mervis, Carolyn B.; Bray-Ward, Patricia; Kimberley, Kendra W.; Rios, Cecilia M.; Clark, Robin C.; Gulbronson, Maricela D.; Gowans, Gordon C.; Gregg, Ronald G.

    2010-01-01

    Background Intellectual disability (ID) affects 2–3% of the population and may occur with or without multiple congenital anomalies (MCA) or other medical conditions. Established genetic syndromes and visible chromosome abnormalities account for a substantial percentage of ID diagnoses, although for ∼50% the molecular etiology is unknown. Individuals with features suggestive of various syndromes but lacking their associated genetic anomalies pose a formidable clinical challenge. With the advent of microarray techniques, submicroscopic genome alterations not associated with known syndromes are emerging as a significant cause of ID and MCA. Methodology/Principal Findings High-density SNP microarrays were used to determine genome wide copy number in 42 individuals: 7 with confirmed alterations in the WS region but atypical clinical phenotypes, 31 with ID and/or MCA, and 4 controls. One individual from the first group had the most telomeric gene in the WS critical region deleted along with 2 Mb of flanking sequence. A second person had the classic WS deletion and a rearrangement on chromosome 5p within the Cri du Chat syndrome (OMIM:123450) region. Six individuals from the ID/MCA group had large rearrangements (3 deletions, 3 duplications), one of whom had a large inversion associated with a deletion that was not detected by the SNP arrays. Conclusions/Significance Combining SNP microarray analyses and qPCR allowed us to clone and sequence 21 deletion breakpoints in individuals with atypical deletions in the WS region and/or ID or MCA. Comparison of these breakpoints to databases of genomic variation revealed that 52% occurred in regions harboring structural variants in the general population. For two probands the genomic alterations were flanked by segmental duplications, which frequently mediate recurrent genome rearrangements; these may represent new genomic disorders. While SNP arrays and related technologies can identify potentially pathogenic deletions and

  10. MAFsnp: A Multi-Sample Accurate and Flexible SNP Caller Using Next-Generation Sequencing Data.

    PubMed

    Hu, Jiyuan; Li, Tengfei; Xiu, Zidi; Zhang, Hong

    2015-01-01

    Most existing statistical methods developed for calling single nucleotide polymorphisms (SNPs) using next-generation sequencing (NGS) data are based on Bayesian frameworks, and there does not exist any SNP caller that produces p-values for calling SNPs in a frequentist framework. To fill in this gap, we develop a new method MAFsnp, a Multiple-sample based Accurate and Flexible algorithm for calling SNPs with NGS data. MAFsnp is based on an estimated likelihood ratio test (eLRT) statistic. In practical situations, the involved parameter is very close to the boundary of the parametric space, so the standard large sample property is not suitable to evaluate the finite-sample distribution of the eLRT statistic. Observing that the distribution of the test statistic is a mixture of zero and a continuous part, we propose to model the test statistic with a novel two-parameter mixture distribution. Once the parameters in the mixture distribution are estimated, p-values can be easily calculated for detecting SNPs, and the multiple-testing corrected p-values can be used to control false discovery rate (FDR) at any pre-specified level. With simulated data, MAFsnp is shown to have much better control of FDR than the existing SNP callers. Through the application to two real datasets, MAFsnp is also shown to outperform the existing SNP callers in terms of calling accuracy. An R package "MAFsnp" implementing the new SNP caller is freely available at http://homepage.fudan.edu.cn/zhangh/softwares/. PMID:26309201
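
    The abstract above states that multiple-testing corrected p-values can be used to control the FDR but does not name the correction. The sketch below assumes the widely used Benjamini-Hochberg procedure; it is not taken from the MAFsnp package, and the function name and p-values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical per-site p-values from a SNP caller.
site_pvalues = [0.01, 0.04, 0.03, 0.005]
adjusted = benjamini_hochberg(site_pvalues)

# Sites whose adjusted p-value is at most 0.03 are called at a 3% FDR level.
called = [i for i, q in enumerate(adjusted) if q <= 0.03]
print(adjusted, called)
```

    Calling every site whose adjusted p-value falls at or below a chosen level keeps the expected proportion of false calls at or below that level, under the usual assumptions of the procedure.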

  11. Efficient Genome-Wide TagSNP Selection Across Populations via the Linkage Disequilibrium Criterion

    PubMed Central

    Wu, Yonghui; Lonardi, Stefano; Jiang, Tao

    2010-01-01

    Abstract In this article, we studied the tag single-nucleotide polymorphism (tagSNP) selection problem on multiple populations using the pairwise r² linkage disequilibrium criterion. We proposed a novel combinatorial optimization model for the tagSNP selection problem, called the minimum common tagSNP selection (MCTS) problem, and presented efficient solutions for MCTS. Our approach consists of the following three main steps: (i) partitioning the SNP markers into small disjoint components, (ii) applying some data reduction rules to simplify the problem, and (iii) applying either a fast greedy algorithm or a Lagrangian relaxation algorithm to solve the remaining (general) MCTS. These algorithms also provide lower bounds on tagging (i.e., the minimum number of tagSNPs needed). The lower bounds allow us to evaluate how far our solution is from the optimum. To the best of our knowledge, it is the first time the tagging lower bounds are discussed in the literature. We assessed the performance of our algorithms on real HapMap data for genome-wide tagging. The experiments demonstrated that our algorithms run 3–4 orders of magnitude faster than the existing single-population tagging programs such as FESTA, LD-Select, and the multiple-population tagging method MultiPop-TagSelect. Our method also greatly reduced the required tagSNPs compared with LD-Select on a single population and MultiPop-TagSelect on multiple populations. Moreover, the numbers of tagSNPs selected by our algorithms are almost optimal because they are very close to the corresponding lower bounds obtained by our method. PMID:20078395
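
    To make the pairwise r² criterion concrete, the sketch below implements a plain greedy tagSNP selection for a single population: a SNP is considered tagged when its r² with a chosen tagSNP reaches a threshold, and the SNP that tags the most untagged SNPs is picked at each step. This is a simplified illustration, not the authors' MCTS algorithm (it omits the partitioning, the reduction rules, the multi-population constraint and the lower bounds). The function names, the 0.8 threshold and the haplotypes are assumptions for the example.

```python
def r_squared(hap_a, hap_b):
    """Pairwise r^2 between two biallelic SNPs from phased 0/1 haplotypes."""
    n = len(hap_a)
    p_a = sum(hap_a) / n
    p_b = sum(hap_b) / n
    p_ab = sum(1 for a, b in zip(hap_a, hap_b) if a == 1 and b == 1) / n
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:  # a monomorphic SNP carries no LD information
        return 0.0
    return (p_ab - p_a * p_b) ** 2 / denom


def greedy_tag_snps(haplotypes, threshold=0.8):
    """Greedy cover: repeatedly pick the SNP that tags the most untagged SNPs."""
    snps = list(haplotypes)
    # Each SNP tags itself plus every SNP in LD with it at or above threshold.
    covers = {
        s: {t for t in snps
            if t == s or r_squared(haplotypes[s], haplotypes[t]) >= threshold}
        for s in snps
    }
    untagged = set(snps)
    tags = []
    while untagged:
        best = max(snps, key=lambda s: len(covers[s] & untagged))
        tags.append(best)
        untagged -= covers[best]
    return tags


# Hypothetical phased haplotypes (eight chromosomes, four SNPs).
haplotypes = {
    "snp1": [0, 0, 1, 1, 0, 1, 0, 1],
    "snp2": [0, 0, 1, 1, 0, 1, 0, 1],  # identical to snp1
    "snp3": [1, 1, 0, 0, 1, 0, 1, 0],  # complement of snp1
    "snp4": [0, 1, 0, 1, 0, 1, 0, 1],  # weak LD with the others
}

print(greedy_tag_snps(haplotypes))
```

    A greedy cover of this kind gives no guarantee of optimality, which is why lower bounds such as those described in the abstract are useful for judging how close a solution is to the minimum.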

  12. Haplotype inference from unphased SNP data in heterozygous polyploids based on SAT

    PubMed Central

    Neigenfind, Jost; Gyetvai, Gabor; Basekow, Rico; Diehl, Svenja; Achenbach, Ute; Gebhardt, Christiane; Selbig, Joachim; Kersten, Birgit

    2008-01-01

    Background Haplotype inference based on unphased SNP markers is an important task in population genetics. Although there are different approaches to the inference of haplotypes in diploid species, the existing software is not suitable for inferring haplotypes from unphased SNP data in polyploid species, such as the cultivated potato (Solanum tuberosum). Potato species are tetraploid and highly heterozygous. Results Here we present the software SATlotyper which is able to handle polyploid and polyallelic data. SATlotyper uses the Boolean satisfiability problem to formulate Haplotype Inference by Pure Parsimony. The software excludes existing haplotype inferences, thus allowing for calculation of alternative inferences. As it is not known which of the multiple haplotype inferences are best supported by the given unphased data set, we use a bootstrapping procedure that allows for scoring of alternative inferences. Finally, by means of the bootstrapping scores, it is possible to optimise the phased genotypes belonging to a given haplotype inference. The program is evaluated with simulated and experimental SNP data generated for heterozygous tetraploid populations of potato. We show that, instead of taking the first haplotype inference reported by the program, we can significantly improve the quality of the final result by applying additional methods that include scoring of the alternative haplotype inferences and genotype optimisation. For a sub-population of nineteen individuals, the predicted results computed by SATlotyper were directly compared with results obtained by experimental haplotype inference via sequencing of cloned amplicons. Prediction and experiment gave similar results regarding the inferred haplotypes and phased genotypes. Conclusion Our results suggest that Haplotype Inference by Pure Parsimony can be solved efficiently by the SAT approach, even for data sets of unphased SNP from heterozygous polyploids. SATlotyper is freeware and is distributed as

  13. SNP Discovery Using Next Generation Transcriptomic Sequencing in Atlantic Herring (Clupea harengus)

    PubMed Central

    Bekkevold, Dorte; Babbucci, Massimiliano; van Houdt, Jeroen; Maes, Gregory E.; Bargelloni, Luca; Nielsen, Rasmus O.; Taylor, Martin I.; Ogden, Rob; Cariani, Alessia; Carvalho, Gary R.; Consortium, FishPopTrace; Panitz, Frank

    2012-01-01

    The introduction of Next Generation Sequencing (NGS) has revolutionised population genetics, providing studies of non-model species with unprecedented genomic coverage, allowing evolutionary biologists to address questions previously far beyond the reach of available resources. Furthermore, the simple mutation model of Single Nucleotide Polymorphisms (SNPs) permits cost-effective high-throughput genotyping in thousands of individuals simultaneously. Genomic resources are scarce for the Atlantic herring (Clupea harengus), a small pelagic species that sustains high revenue fisheries. This paper details the development of 578 SNPs using a combined NGS and high-throughput genotyping approach. Eight individuals covering the species distribution in the eastern Atlantic were bar-coded and multiplexed into a single cDNA library and sequenced using the 454 GS FLX platform. SNP discovery was performed by de novo sequence clustering and contig assembly, followed by the mapping of reads against consensus contig sequences. Selection of candidate SNPs for genotyping was conducted using an in silico approach. SNP validation and genotyping were performed simultaneously using an Illumina 1,536 GoldenGate assay. Although the conversion rate of candidate SNPs in the genotyping assay cannot be predicted in advance, this approach has the potential to maximise cost and time efficiencies by avoiding expensive and time-consuming laboratory stages of SNP validation. Additionally, the in silico approach leads to lower ascertainment bias in the resulting SNP panel as marker selection is based only on the ability to design primers and the predicted presence of intron-exon boundaries. Consequently SNPs with a wider spectrum of minor allele frequencies (MAFs) will be genotyped in the final panel. The genomic resources presented here represent a valuable multi-purpose resource for developing informative marker panels for population discrimination, microarray development and for population

  14. Quadruplex-single nucleotide polymorphisms (Quad-SNP) influence gene expression difference among individuals

    PubMed Central

    Baral, Aradhita; Kumar, Pankaj; Halder, Rashi; Mani, Prithvi; Yadav, Vinod Kumar; Singh, Ankita; Das, Swapan K.; Chowdhury, Shantanu

    2012-01-01

    Non-canonical guanine quadruplex structures are not only predominant but also conserved among bacterial and mammalian promoters. Moreover recent findings directly implicate quadruplex structures in transcription. These argue for an intrinsic role of the structural motif and thereby posit that single nucleotide polymorphisms (SNP) that compromise the quadruplex architecture could influence function. To test this, we analysed SNPs within quadruplex motifs (Quad-SNP) and gene expression in 270 individuals across four populations (HapMap) representing more than 14 500 genotypes. Findings reveal significant association between quadruplex-SNPs and expression of the corresponding gene in individuals (P < 0.0001). Furthermore, analysis of Quad-SNPs obtained from population-scale sequencing of 1000 human genomes showed relative selection bias against alteration of the structural motif. To directly test the quadruplex-SNP-transcription connection, we constructed a reporter system using the RPS3 promoter—remarkable difference in promoter activity in the ‘quadruplex-destabilized’ versus ‘quadruplex-intact’ promoter was noticed. As a further test, we incorporated a quadruplex motif or its disrupted counterpart within a synthetic promoter reporter construct. The quadruplex motif, and not the disrupted-motif, enhanced transcription in human cell lines of different origin. Together, these findings build direct support for quadruplex-mediated transcription and suggest quadruplex-SNPs may play significant role in mechanistically understanding variations in gene expression among individuals. PMID:22238381

  15. Making a chocolate chip: development and evaluation of a 6K SNP array for Theobroma cacao

    PubMed Central

    Livingstone, Donald; Royaert, Stefan; Stack, Conrad; Mockaitis, Keithanne; May, Greg; Farmer, Andrew; Saski, Christopher; Schnell, Ray; Kuhn, David; Motamayor, Juan Carlos

    2015-01-01

    Theobroma cacao, the key ingredient in chocolate production, is one of the world's most important tree fruit crops, with ∼4,000,000 metric tons produced across 50 countries. To move towards gene discovery and marker-assisted breeding in cacao, a single-nucleotide polymorphism (SNP) identification project was undertaken using RNAseq data from 16 diverse cacao cultivars. RNA sequences were aligned to the assembled transcriptome of the cultivar Matina 1-6, and 330,000 SNPs within coding regions were identified. From these SNPs, a subset of 6,000 high-quality SNPs were selected for inclusion on an Illumina Infinium SNP array: the Cacao6kSNP array. Using Cacao6KSNP array data from over 1,000 cacao samples, we demonstrate that our custom array produces a saturated genetic map and can be used to distinguish among even closely related genotypes. Our study enhances and expands the genetic resources available to the cacao research community, and provides the genome-scale set of tools that are critical for advancing breeding with molecular markers in an agricultural species with high genetic diversity. PMID:26070980

  16. Application of LogitBoost Classifier for Traceability Using SNP Chip Data

    PubMed Central

    Kang, Hyunsung; Cho, Seoae; Kim, Heebal; Seo, Kang-Seok

    2015-01-01

    Consumer attention to food safety has increased rapidly due to animal-related diseases; therefore, it is important to identify their places of origin (POO) for safety purposes. However, only a few studies have addressed this issue and focused on machine learning-based approaches. In the present study, classification analyses were performed using a customized SNP chip for POO prediction. To accomplish this, 4,122 pigs originating from 104 farms were genotyped using the SNP chip. Several factors were considered to establish the best prediction model based on these data. We also assessed the applicability of the suggested model using a kinship coefficient-filtering approach. Our results showed that the LogitBoost-based prediction model outperformed other classifiers in terms of classification performance under most conditions. Specifically, a greater level of accuracy was observed when a higher kinship-based cutoff was employed. These results demonstrated the applicability of a machine learning-based approach using SNP chip data for practical traceability. PMID:26436917

  17. Novel approach for deriving genome wide SNP analysis data from archived blood spots

    PubMed Central

    2012-01-01

    Background The ability to transport and store DNA at room temperature in low volumes has the advantage of optimising cost, time and storage space. Blood spots on adapted filter papers are popular for this, with FTA (Flinders Technology Associates) Whatman™ technology being one of the most recent. Plant material, plasmids, viral particles, bacteria and animal blood have been stored and transported successfully using this technology. However, porcine DNA extraction from FTA Whatman™ cards is a relatively new approach, and although it allows nucleic acids to be ready for downstream applications such as PCR, whole genome amplification and sequencing, subsequent application to single nucleotide polymorphism microarrays has hitherto been under-explored. Findings DNA was extracted from FTA Whatman™ cards (following adaptations of the manufacturer’s instructions), whole genome amplified and subsequently analysed to validate the integrity of the DNA for downstream SNP analysis. DNA was successfully extracted from 288/288 samples and amplified by WGA. Allele dropout post WGA was observed in less than 2% of samples and there was no clear evidence of amplification bias nor contamination. Acceptable call rates on porcine SNP chips were also achieved using DNA extracted and amplified in this way. Conclusions DNA extracted from FTA Whatman cards is of a high enough quality and quantity following whole genomic amplification to perform meaningful SNP chip studies. PMID:22974252

  18. High-throughput SNP-genotyping analysis of the relationships among Ponto-Caspian sturgeon species

    PubMed Central

    Rastorguev, Sergey M; Nedoluzhko, Artem V; Mazur, Alexander M; Gruzdeva, Natalia M; Volkov, Alexander A; Barmintseva, Anna E; Mugue, Nikolai S; Prokhortchouk, Egor B

    2013-01-01

    Abstract Legally certified sturgeon fisheries require population protection and conservation methods, including DNA tests to identify the source of valuable sturgeon roe. However, the available genetic data are insufficient to distinguish between different sturgeon populations, and are even unable to distinguish between some species. We performed high-throughput single-nucleotide polymorphism (SNP)-genotyping analysis on different populations of Russian (Acipenser gueldenstaedtii), Persian (A. persicus), and Siberian (A. baerii) sturgeon species from the Caspian Sea region (Volga and Ural Rivers), the Azov Sea, and two Siberian rivers. We found that Russian sturgeons from the Volga and Ural Rivers were essentially indistinguishable, but they differed from Russian sturgeons in the Azov Sea, and from Persian and Siberian sturgeons. We identified eight SNPs that were sufficient to distinguish these sturgeon populations with 80% confidence, and allowed the development of markers to distinguish sturgeon species. Finally, on the basis of our SNP data, we propose that the A. baerii-like mitochondrial DNA found in some Russian sturgeons from the Caspian Sea arose via an introgression event during the Pleistocene glaciation. In the present study, the high-throughput genotyping analysis of several sturgeon populations was performed. SNP markers for species identification were defined. The possible explanation of the baerii-like mitotype presence in some Russian sturgeons in the Caspian Sea was suggested. PMID:24567827

  19. A functional SNP in FLT1 increases risk of coronary artery disease in a Japanese population.

    PubMed

    Konta, Atsuko; Ozaki, Kouichi; Sakata, Yasuhiko; Takahashi, Atsushi; Morizono, Takashi; Suna, Shinichiro; Onouchi, Yoshihiro; Tsunoda, Tatsuhiko; Kubo, Michiaki; Komuro, Issei; Eishi, Yoshinobu; Tanaka, Toshihiro

    2016-05-01

    Coronary artery disease (CAD) including myocardial infarction is one of the leading causes of death in many countries. Similar to other common diseases, its pathogenesis is thought to result from complex interactions among multiple genetic and environmental factors. Recent large-scale genetic association analysis for CAD identified 15 new loci. We examined the reproducibility of these previous association findings with 7990 cases and 6582 controls in a Japanese population. We found a convincing association of rs9319428 in FLT1, encoding fms-related tyrosine kinase 1 (P = 5.98 × 10^-8). Fine mapping using tag single-nucleotide polymorphisms (SNPs) at FLT1 locus revealed that another SNP (rs74412485) showed more profound genetic effect for CAD (P = 2.85 × 10^-12). The SNP, located in intron 1 in FLT1, enhanced the transcriptional level of FLT1. RNA interference experiment against FLT1 showed that the suppression of FLT1 resulted in decreased expression of inflammatory adhesion molecules. Expression of FLT1 was observed in endothelial cells of human coronary artery. Our results indicate that the genetically coded increased expression of FLT1 by a functional SNP implicates activation in an inflammatory cascade that might eventually lead to CAD. PMID:26791355

  20. Making a chocolate chip: development and evaluation of a 6K SNP array for Theobroma cacao.

    PubMed

    Livingstone, Donald; Royaert, Stefan; Stack, Conrad; Mockaitis, Keithanne; May, Greg; Farmer, Andrew; Saski, Christopher; Schnell, Ray; Kuhn, David; Motamayor, Juan Carlos

    2015-08-01

    Theobroma cacao, the key ingredient in chocolate production, is one of the world's most important tree fruit crops, with ∼4,000,000 metric tons produced across 50 countries. To move towards gene discovery and marker-assisted breeding in cacao, a single-nucleotide polymorphism (SNP) identification project was undertaken using RNAseq data from 16 diverse cacao cultivars. RNA sequences were aligned to the assembled transcriptome of the cultivar Matina 1-6, and 330,000 SNPs within coding regions were identified. From these SNPs, a subset of 6,000 high-quality SNPs were selected for inclusion on an Illumina Infinium SNP array: the Cacao6kSNP array. Using Cacao6KSNP array data from over 1,000 cacao samples, we demonstrate that our custom array produces a saturated genetic map and can be used to distinguish among even closely related genotypes. Our study enhances and expands the genetic resources available to the cacao research community, and provides the genome-scale set of tools that are critical for advancing breeding with molecular markers in an agricultural species with high genetic diversity. PMID:26070980

  1. Identification and SNP association analysis of a novel gene in chicken.

    PubMed

    Mei, Xingxing; Kang, Xiangtao; Liu, Xiaojun; Jia, Lijuan; Li, Hong; Li, Zhuanjian; Jiang, Ruirui

    2016-02-01

    A novel gene that was predicted to encode a long noncoding RNA (lncRNA) transcript was identified in a previous study that aimed to detect candidate genes related to growth rate differences between Chinese local breed Gushi chickens and Anka broilers. To characterise the biological function of the lncRNA, we cloned and sequenced the complete open reading frame of the gene. We performed quantitative real-time polymerase chain reaction (qPCR) to analyse the expression patterns of the lncRNA in different tissues of chicken at different development stages. The qPCR data showed that the novel lncRNA gene was expressed extensively, with the highest abundance in spleen and lung and the lowest abundance in pectoralis and leg muscle. Additionally, we identified a single nucleotide polymorphism (SNP) at the 5'-end of the gene and studied the association between the SNP and chicken growth traits using data from an F2 resource population of Gushi chickens and Anka broilers. The association analysis showed that the SNP was significantly (P < 0.05) associated with leg muscle weight, chest breadth, sternal length and body weight in chickens at 1 day, 4 weeks and 6 weeks of age. We concluded that the novel lncRNA gene, which we designated pouBW1, may play an important role in regulating chicken growth. PMID:26643990

  2. PEAS V1.0: a package for elementary analysis of SNP data.

    PubMed

    Xu, Shuhua; Gupta, Sanchit; Jin, Li

    2010-11-01

    We have developed a software package named PEAS to facilitate analyses of large data sets of single nucleotide polymorphisms (SNPs) for population genetics and molecular phylogenetics studies. PEAS reads SNP data in various formats as input and is versatile in data formatting; using PEAS, it is easy to create input files for many popular packages, such as STRUCTURE, frappe, Arlequin, Haploview, LDhat, PLINK, EIGENSOFT, PHASE, fastPHASE, MEGA and PHYLIP. In addition, PEAS fills up several analysis gaps in currently available computer programs in population genetics and molecular phylogenetics. Notably, (i) It calculates genetic distance matrices with bootstrapping for both individuals and populations from genome-wide high-density SNP data, and the output can be streamlined to MEGA and PHYLIP programs for further processing; (ii) It calculates genetic distances from STRUCTURE output and generates MEGA file to reconstruct component trees; (iii) It provides tools to conduct haplotype sharing analysis for phylogenetic studies based on high-density SNP data. To our knowledge, these analyses are not available in any other computer program. PEAS for Windows is freely available for academic users from http://www.picb.ac.cn/~xushua/index.files/Download_PEAS.htm. PMID:21565121

  3. Application of LogitBoost Classifier for Traceability Using SNP Chip Data.

    PubMed

    Kim, Kwondo; Seo, Minseok; Kang, Hyunsung; Cho, Seoae; Kim, Heebal; Seo, Kang-Seok

    2015-01-01

    Consumer attention to food safety has increased rapidly due to animal-related diseases; therefore, it is important to identify their places of origin (POO) for safety purposes. However, only a few studies have addressed this issue and focused on machine learning-based approaches. In the present study, classification analyses were performed using a customized SNP chip for POO prediction. To accomplish this, 4,122 pigs originating from 104 farms were genotyped using the SNP chip. Several factors were considered to establish the best prediction model based on these data. We also assessed the applicability of the suggested model using a kinship coefficient-filtering approach. Our results showed that the LogitBoost-based prediction model outperformed other classifiers in terms of classification performance under most conditions. Specifically, a greater level of accuracy was observed when a higher kinship-based cutoff was employed. These results demonstrated the applicability of a machine learning-based approach using SNP chip data for practical traceability. PMID:26436917

  4. How to Use SNP_TATA_Comparator to Find a Significant Change in Gene Expression Caused by the Regulatory SNP of This Gene's Promoter via a Change in Affinity of the TATA-Binding Protein for This Promoter

    PubMed Central

    Ponomarenko, Mikhail; Rasskazov, Dmitry; Arkova, Olga; Ponomarenko, Petr; Suslov, Valentin; Savinkova, Ludmila; Kolchanov, Nikolay

    2015-01-01

    The use of biomedical SNP markers of diseases can improve the effectiveness of treatment. Genotyping patients and then searching for SNPs that are more frequent in them than in healthy individuals is the only commonly accepted method for identification of SNP markers within the framework of translational research. Bioinformatics applications aimed at the millions of unannotated SNPs of the “1000 Genomes” project can make this search for SNP markers more focused and less expensive. We used our Web service involving Fisher's Z-score for candidate SNP markers to find a significant change in a gene's expression. Here we analyzed the change caused by SNPs in the gene's promoter via a change in affinity of the TATA-binding protein for this promoter. We provide examples and discuss how to use this bioinformatics application in the course of practical analysis of unannotated SNPs from the “1000 Genomes” project. Using known biomedical SNP markers, we identified 17 novel candidate SNP markers nearby: rs549858786 (rheumatoid arthritis); rs72661131 (cardiovascular events in rheumatoid arthritis); rs562962093 (stroke); rs563558831 (cyclophosphamide bioactivation); rs55878706 (malaria resistance, leukopenia); rs572527200 (asthma, systemic sclerosis, and psoriasis); rs371045754 (hemophilia B); rs587745372 (cardiovascular events); rs372329931, rs200209906, rs367732974, and rs549591993 (all four: cancer); rs17231520 and rs569033466 (both: atherosclerosis); rs63750953, rs281864525, and rs34166473 (all three: malaria resistance, thalassemia). PMID:26516624

  5. Efficient SNP Discovery by Combining Microarray and Lab-on-a-Chip Data for Animal Breeding and Selection

    PubMed Central

    Huang, Chao-Wei; Lin, Yu-Tsung; Ding, Shih-Torng; Lo, Ling-Ling; Wang, Pei-Hwa; Lin, En-Chung; Liu, Fang-Wei; Lu, Yen-Wen

    2015-01-01

    The genetic markers associated with economic traits have been widely explored for animal breeding. Among these markers, single-nucleotide polymorphisms (SNPs) are gradually becoming a prevalent and effective evaluation tool. Because SNP genotyping focuses only on the genetic sequences of interest, it reduces the evaluation time and cost. Compared to traditional approaches, SNP genotyping techniques incorporate informative genetic background, improve the breeding prediction accuracy and acquiesce breeding quality on the farm. This article therefore reviews the typical procedures of animal breeding using SNPs and the current status of related techniques. The associated SNP information and genotyping techniques, including microarray and Lab-on-a-Chip based platforms, along with their potential are highlighted. Examples in pig and poultry with different SNP loci linked to high economic trait values are given. The recommendations for utilizing SNP genotyping in animal breeding are summarized. PMID:27600241

  6. Generation of large numbers of SNP in cattle by coupling reduced genome representation with high throughput sequencing

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Whole genome sequencing projects have produced draft sequences for species from diverse evolutionary clades for comparative evolutionary studies. Generally, these projects have not simultaneously created extensive single nucleotide polymorphism (SNP) resources for use in genetics studies within the...

  7. Citrus (Rutaceae) SNP markers based on Competitive Allele-Specific PCR; transferability across the Aurantioideae subfamily

    PubMed Central

    Garcia-Lor, Andres; Ancillo, Gema; Navarro, Luis; Ollitrault, Patrick

    2013-01-01

    • Premise of the study: Single nucleotide polymorphism (SNP) markers based on Competitive Allele-Specific PCR (KASPar) were developed from sequences of three Citrus species. Their transferability was tested in 63 Citrus genotypes and 19 relative genera of the subfamily Aurantioideae to estimate the potential of SNP markers, selected from a limited intrageneric discovery panel, for ongoing broader diversity analysis at the intra- and intergeneric levels and systematic germplasm bank characterization. • Methods and Results: Forty-two SNP markers were developed using KASPar technology. Forty-one were successfully genotyped in all of the Citrus germplasm, where intra- and interspecific polymorphisms were observed. The transferability and diversity decreased with increasing taxonomic distance. • Conclusions: SNP markers based on the KASPar method developed from sequence data of a limited intrageneric discovery panel provide a valuable molecular resource for genetic diversity analysis of germplasm within a genus and should be useful for germplasm fingerprinting at a much broader diversity level. PMID:25202535

  8. Developing Single Nucleotide Polymorphism (SNP) markers from transcriptome sequences for the identification of longan (Dimocarpus longan) germplasm

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Longan (Dimocarpus longan Lour.) is an important tropical fruit tree crop. Accurate varietal identification is essential for germplasm management and breeding. Using longan transcriptome sequences from public databases, we developed single nucleotide polymorphism (SNP) markers; validated 60 SNPs in...

  9. LincSNP: a database of linking disease-associated SNPs to human large intergenic non-coding RNAs

    PubMed Central

    2014-01-01

    Background Genome-wide association studies (GWAS) have successfully identified a large number of single nucleotide polymorphisms (SNPs) that are associated with a wide range of human diseases. However, many of these disease-associated SNPs are located in non-coding regions and have remained largely unexplained. Recent findings indicate that disease-associated SNPs in human large intergenic non-coding RNA (lincRNA) may lead to susceptibility to diseases through their effects on lincRNA expression. There is, therefore, a need to specifically record these SNPs and annotate them as potential candidates for disease. Description We have built LincSNP, an integrated database, to identify and annotate disease-associated SNPs in human lincRNAs. The current release of LincSNP contains approximately 140,000 disease-associated SNPs (or linkage disequilibrium SNPs), which can be mapped to around 5,000 human lincRNAs, together with their comprehensive functional annotations. The database also contains annotated, experimentally supported SNP-lincRNA-disease associations and disease-associated lincRNAs. It provides flexible search options for data extraction and searches can be performed by disease/phenotype name, SNP ID, lincRNA name and chromosome region. In addition, we provide users with a link to download all the data from LincSNP and have developed a web interface for the submission of novel identified SNP-lincRNA-disease associations. Conclusions The LincSNP database aims to integrate disease-associated SNPs and human lincRNAs, which will be an important resource for the investigation of the functions and mechanisms of lincRNAs in human disease. The database is available at http://bioinfo.hrbmu.edu.cn/LincSNP. PMID:24885522

  10. A Multiple-SNP Approach for Genome-Wide Association Study of Milk Production Traits in Chinese Holstein Cattle

    PubMed Central

    Fang, Ming; Fu, Weixuan; Jiang, Dan; Zhang, Qin; Sun, Dongxiao; Ding, Xiangdong; Liu, Jianfeng

    2014-01-01

    Multiple-SNP analysis, in which the effects of multiple SNPs are simultaneously estimated and tested in a multiple linear regression, has been studied by many researchers. Multiple-SNP association analysis usually has higher power and a lower false-positive rate for detecting causative SNP(s) than single marker analysis (SMA). Several methods have been proposed to simultaneously estimate and test multiple SNP effects. In this research, a fast method called MEML (Mixed model based Expectation-Maximization Lasso algorithm) was developed for simultaneous estimation of multiple SNP effects. An improved Lasso prior was assigned to SNP effects, which were estimated by searching for the maximum joint posterior mode. The residual polygenic effect was included in the model to absorb many tiny SNP effects, which is treated as missing data in our EM algorithm. A series of simulation experiments were conducted to validate the proposed method, and the results showed that, compared with SMMA, the new method can dramatically decrease the false-positive rate. The new method was also applied to the 50k SNP-panel dataset for a genome-wide association study of milk production traits in Chinese Holstein cattle. In total, 39 significant SNPs and 25 nearby genes were found. The number of significant SNPs is markedly smaller than that found by SMMA, which identified 105 significant SNPs. Among the 39 significant SNPs, 8 were also found by SMMA, and several well-known QTLs or genes were confirmed again; furthermore, we also identified some positional candidate genes with potential functions affecting milk production traits. These novel findings should be valuable for further investigation. PMID:25148050

  11. Use of the Illumina GoldenGate assay for single nucleotide polymorphism (SNP) genotyping in cereal crops.

    PubMed

    Chao, Shiaoman; Lawley, Cindy

    2015-01-01

    Highly parallel genotyping assays, such as the GoldenGate assay developed by Illumina, capable of interrogating up to 3,072 single nucleotide polymorphisms (SNPs) simultaneously, have greatly facilitated genome-wide studies, particularly for crops with large and complex genome structures. In this report, we provide detailed information and guidelines regarding genomic DNA preparation, SNP assay design, SNP assay protocols, and genotype calling using Illumina's GenomeStudio software. PMID:25373766

  12. Analysis of 10 independent samples provides evidence for association between schizophrenia and a SNP flanking fibroblast growth factor receptor 2 (FGFR2)

    PubMed Central

    O’Donovan, M.C.; Norton, N.; Williams, H.; Peirce, T.; Moskvina, V.; Nikolov, I.; Hamshere, M.; Carroll, L.; Georgieva, L.; Dwyer, S; Holmans, P.; Marchini, J. L.; Spencer, C.C.A.; Howie, B.; Leung, H-T.; Giegling, I.; Hartmann, A.M.; Möller, H.-J.; Morris, D.W.; Shi, Y.; Feng, G.; Hoffmann, P.; Propping, P.; Vasilescu, C.; Maier, W.; Rietschel, M.; Zammit, S.; Schumacher, J.; Quinn, E.M.; Schulze, T.G.; Iwata, N.; Ikeda, M.; Darvasi, A.; Shifman, S.; He, L.; Duan, J.; Sanders, A.R.; Levinson, D.F.; Adolfsson, R.; Ösby, U.; Terenius, Lars; Jönsson, Erik G; Cichon, S.; Nöthen, M. M.; Gill, M.; Corvin, A.P.; Rujescu, D.; Gejman, P.V.; Kirov, G.; Craddock, N.; Williams, N.M.; Owen, M.J.

    2010-01-01

    We and others have previously reported linkage to schizophrenia on chromosome 10q25-q26 but, to date, a susceptibility gene in the region has not been identified. We examined data from 3606 SNPs mapping to 10q25-q26 that had been typed in a genome-wide association study (GWAS) of schizophrenia (479 UK cases/2937 controls). SNPs with p<0.01 (n=40) were genotyped in an additional 163 UK cases and those markers that remained nominally significant at p<0.01 (n=22) were genotyped in replication samples from Ireland, Germany and Bulgaria consisting of a total of 1664 cases with schizophrenia and 3541 controls. Only one SNP, rs17101921, was nominally significant after meta-analyses across the replication samples and this was genotyped in an additional six samples from the US/Australia, Germany, China, Japan, Israel and Sweden (n= 5142 cases/ 6561 controls). Across all replication samples, the allele at rs17101921 that was associated in the GWAS showed evidence for association independent of the original data (OR 1.17 (95% CI 1.06-1.29), p=0.0009). The SNP maps 85kb from the nearest gene encoding fibroblast growth factor receptor 2 (FGFR2) making this a potential susceptibility gene for schizophrenia. PMID:18813210

  13. Y-chromosome polymorphisms and ethnic group – a combined STR and SNP approach in a population sample from northern Italy

    PubMed Central

    Cortellini, Venusia; Verzeletti, Andrea; Cerri, Nicoletta; Marino, Alberto; De Ferrari, Francesco

    2013-01-01

    Aim To find an association between Y chromosome polymorphisms and some ethnic groups. Methods Short tandem repeats (STR) and single-nucleotide polymorphisms (SNP) on the Y chromosome were typed in 311 unrelated men from four different ethnic groups – Italians from northern Italy, Albanians, Africans from the Maghreb region, and Indo-Pakistanis, using the AmpFlSTR® Yfiler PCR Amplification Kit and the SNaPshot Multiplex Kit. Results STRs analysis found 299 different haplotypes and SNPs analysis 11 different haplogroups. Haplotypes and haplogroups were analyzed and compared between different ethnic groups. Significant differences were found among all the population groups, except between Italians and Indo-Pakistanis and between Albanians and Indo-Pakistanis. Conclusions Typing both STRs and SNPs on the Y chromosome could become useful in determining ethnic origin of a potential suspect. PMID:23771759

  14. Identification of SNP and SSR Markers in Finger Millet Using Next Generation Sequencing Technologies

    PubMed Central

    Gimode, Davis; Odeny, Damaris A.; de Villiers, Etienne P.; Wanyonyi, Solomon; Dida, Mathews M.; Mneney, Emmarold E.; Muchugi, Alice; Machuka, Jesse; de Villiers, Santie M.

    2016-01-01

    Finger millet is an important cereal crop in eastern Africa and southern India with excellent grain storage quality and unique ability to thrive in extreme environmental conditions. Since negligible attention has been paid to improving this crop to date, the current study used Next Generation Sequencing (NGS) technologies to develop both Simple Sequence Repeat (SSR) and Single Nucleotide Polymorphism (SNP) markers. Genomic DNA from cultivated finger millet genotypes KNE755 and KNE796 was sequenced using both Roche 454 and Illumina technologies. Non-organelle sequencing reads were assembled into 207 Mbp representing approximately 13% of the finger millet genome. We identified 10,327 SSRs and 23,285 non-homeologous SNPs and tested 101 of each for polymorphism across a diverse set of wild and cultivated finger millet germplasm. For the 49 polymorphic SSRs, the mean polymorphism information content (PIC) was 0.42, ranging from 0.16 to 0.77. We also validated 92 SNP markers, 80 of which were polymorphic with a mean PIC of 0.29 across 30 wild and 59 cultivated accessions. Seventy-six of the 80 SNPs were polymorphic across 30 wild germplasm with a mean PIC of 0.30 while only 22 of the SNP markers showed polymorphism among the 59 cultivated accessions with an average PIC value of 0.15. Genetic diversity analysis using the polymorphic SNP markers revealed two major clusters; one of wild and another of cultivated accessions. Detailed STRUCTURE analysis confirmed this grouping pattern and further revealed 2 sub-populations within wild E. coracana subsp. africana. Both STRUCTURE and genetic diversity analysis assisted with the correct identification of the new germplasm collections. These polymorphic SSR and SNP markers are a significant addition to the existing 82 published SSRs, especially with regard to the previously reported low polymorphism levels in finger millet. 
    Our results also reveal an unexploited finger millet genetic resource that can be included in the regional
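    The polymorphism information content (PIC) values quoted above can be computed from allele frequencies. The sketch below uses the standard formula PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2; the abstract does not state which software or formula variant the authors used, so this is an assumption, and the function name and example frequencies are hypothetical.

```python
from itertools import combinations


def pic(allele_freqs):
    """Polymorphism information content for one marker.

    allele_freqs: frequencies of all alleles at the marker, summing to 1.
    """
    if abs(sum(allele_freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    homozygosity = sum(p * p for p in allele_freqs)
    # Correction for matings in which both parents share the same heterozygous genotype.
    correction = sum(
        2 * (p * p) * (q * q) for p, q in combinations(allele_freqs, 2)
    )
    return 1.0 - homozygosity - correction


biallelic_max = pic([0.5, 0.5])       # a SNP with two equally common alleles
rare_variant = pic([0.9, 0.1])        # a SNP with a 10% minor allele
four_allele_ssr = pic([0.25] * 4)     # a hypothetical SSR with four equal alleles
```

    For a biallelic marker the formula peaks at 0.375 when both alleles have frequency 0.5, which is why SNP PIC values are bounded well below the upper end of the range reported for multi-allelic SSRs.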

  15. SNP genotypes of olfactory receptor genes associated with olfactory ability in German Shepherd dogs.

    PubMed

    Yang, M; Geng, G-J; Zhang, W; Cui, L; Zhang, H-X; Zheng, J-L

    2016-04-01

    To find out the relationship between SNP genotypes of canine olfactory receptor genes and olfactory ability, 28 males and 20 females from German Shepherd dogs in police service were scored by odor detection tests and analyzed using the Beckman GenomeLab SNPstream. The representative 22 SNP loci from the exonic regions of 12 olfactory receptor genes were investigated, and three kinds of odor (human, ice drug and trinitrotoluene) were detected. The results showed that the SNP genotypes at the OR10H1-like:c.632C>T, OR10H1-like:c.770A>T, OR2K2-like:c.518G>A, OR4C11-like:c.511T>G and OR4C11-like:c.692G>A loci had a statistically significant effect on the scenting abilities (P < 0.001). The kind of odor influenced the performances of the dogs (P < 0.001). In addition, there were interactions between genotype and the kind of odor at the following loci: OR10H1-like:c.632C>T, OR10H1-like:c.770A>T, OR4C11-like:c.511T>G and OR4C11-like:c.692G>A (P < 0.001). The dogs with genotype CC at the OR10H1-like:c.632C>T, genotype AA at the OR10H1-like:c.770A>T, genotype TT at the OR4C11-like:c.511T>G and genotype GG at the OR4C11-like:c.692G>A loci did better at detecting the ice drug. We concluded that there was linkage between certain SNP genotypes and the olfactory ability of dogs and that SNP genotypes might be useful in determining dogs' scenting potential. PMID:26582499

  16. Supplementing High-Density SNP Microarrays for Additional Coverage of Disease-Related Genes: Addiction as a Paradigm

    SciTech Connect

    Saccone, Scott F; Chesler, Elissa J; Bierut, Laura J; Kalivas, Peter J; Lerman, Caryn; Saccone, Nancy L; Uhl, George R; Li, Chuan-Yun; Philip, Vivek M; Edenberg, Howard; Sherry, Steven; Feolo, Michael; Moyzis, Robert K; Rutter, Joni L

    2009-01-01

    Commercial SNP microarrays now provide comprehensive and affordable coverage of the human genome. However, some diseases have biologically relevant genomic regions that may require additional coverage. Addiction, for example, is thought to be influenced by complex interactions among many relevant genes and pathways. We have assembled a list of 486 biologically relevant genes nominated by a panel of experts on addiction. We then added 424 genes that showed evidence of association with addiction phenotypes through mouse QTL mappings and gene co-expression analysis. We demonstrate that there are a substantial number of SNPs in these genes that are not well represented by commercial SNP platforms. We address this problem by introducing a publicly available SNP database for addiction. The database is annotated using numeric prioritization scores indicating the extent of biological relevance. The scores incorporate a number of factors such as SNP/gene functional properties (including synonymy and promoter regions), data from mouse systems genetics and measures of human/mouse evolutionary conservation. We then used HapMap genotyping data to determine if a SNP is tagged by a commercial microarray through linkage disequilibrium. This combination of biological prioritization scores and LD tagging annotation will enable addiction researchers to supplement commercial SNP microarrays to ensure comprehensive coverage of biologically relevant regions.

  17. Supplementing High-Density SNP Microarrays for Additional Coverage of Disease-Related Genes: Addiction as a Paradigm

    PubMed Central

    Saccone, Scott F.; Bierut, Laura J.; Chesler, Elissa J.; Kalivas, Peter W.; Lerman, Caryn; Saccone, Nancy L.; Uhl, George R.; Li, Chuan-Yun; Philip, Vivek M.; Edenberg, Howard J.; Sherry, Stephen T.; Feolo, Michael; Moyzis, Robert K.; Rutter, Joni L.

    2009-01-01

    Commercial SNP microarrays now provide comprehensive and affordable coverage of the human genome. However, some diseases have biologically relevant genomic regions that may require additional coverage. Addiction, for example, is thought to be influenced by complex interactions among many relevant genes and pathways. We have assembled a list of 486 biologically relevant genes nominated by a panel of experts on addiction. We then added 424 genes that showed evidence of association with addiction phenotypes through mouse QTL mappings and gene co-expression analysis. We demonstrate that there are a substantial number of SNPs in these genes that are not well represented by commercial SNP platforms. We address this problem by introducing a publicly available SNP database for addiction. The database is annotated using numeric prioritization scores indicating the extent of biological relevance. The scores incorporate a number of factors such as SNP/gene functional properties (including synonymy and promoter regions), data from mouse systems genetics and measures of human/mouse evolutionary conservation. We then used HapMap genotyping data to determine if a SNP is tagged by a commercial microarray through linkage disequilibrium. This combination of biological prioritization scores and LD tagging annotation will enable addiction researchers to supplement commercial SNP microarrays to ensure comprehensive coverage of biologically relevant regions. PMID:19381300

  18. Interactions between SNP alleles at multiple loci contribute to skin color differences between caucasoid and mongoloid subjects.

    PubMed

    Anno, Sumiko; Abe, Takashi; Yamamoto, Takushi

    2008-01-01

    This study aimed to identify single nucleotide polymorphism (SNP) alleles at multiple loci associated with racial differences in skin color using SNP genotyping. A total of 122 Caucasians in Toledo, Ohio and 100 Mongoloids in Japan were genotyped for 20 SNPs in 7 candidate genes, encoding the Agouti signaling protein (ASIP), tyrosinase-related protein 1 (TYRP1), tyrosinase (TYR), melanocortin 1 receptor (MC1R), oculocutaneous albinism II (OCA2), microphthalmia-associated transcription factor (MITF), and myosin VA (MYO5A). Data were used to analyze associations between the 20 SNP alleles using linkage disequilibrium (LD). Combinations of SNP alleles were jointly tested under LD for associations with racial groups by performing a chi-square test for independence. Results showed that SNP alleles at multiple loci can be considered the haplotype that contributes to significant differences between the two population groups and suggest a high probability of LD. Confirmation of these findings requires further study with other ethnic groups to analyze the associations between SNP alleles at multiple loci and skin color variation among races. PMID:18392143
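    A test for independence of the kind named above can be sketched for the simplest case, a 2x2 table of allele counts by group. This is a minimal illustration only: the counts are hypothetical and are not data from the study, the function name is made up for this example, and the study's joint test on allele combinations would involve larger tables than the one shown here.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square test of independence for a 2x2 count table.

    Returns (statistic, p_value). With 1 degree of freedom the upper-tail
    probability of the chi-square distribution is erfc(sqrt(x / 2)).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [table[0][j] + table[1][j] for j in range(2)]
    total = sum(row_totals)
    statistic = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            statistic += (table[i][j] - expected) ** 2 / expected
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Hypothetical allele counts: rows are two groups, columns are two alleles.
counts = [
    [30, 10],
    [12, 28],
]
statistic, p_value = chi_square_2x2(counts)
```

    A small p-value here would indicate that allele frequency differs between the two groups; it says nothing by itself about which phenotype the allele influences.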

  19. Alternative oxidase pathway is involved in the exogenous SNP-elevated tolerance of Medicago truncatula to salt stress.

    PubMed

    Jian, Wei; Zhang, Da-Wei; Zhu, Feng; Wang, Shuo-Xun; Pu, Xiao-Jun; Deng, Xing-Guang; Luo, Shi-Shuai; Lin, Hong-Hui

    2016-04-01

    Exogenous application of sodium nitroprusside (SNP) can enhance the tolerance of plants to stress conditions. Some evidence suggests that nitric oxide (NO) can induce the expression of alternative oxidase (AOX). In this study, Medicago truncatula (Medicago) was chosen to study the role of AOX in the SNP-elevated resistance to salt stress. Our results showed that the expression of AOX genes (especially AOX1 and AOX2b1) and the cyanide-resistant respiration rate (Valt) could be significantly induced by salt stress. Exogenous application of SNP could further enhance the expression of AOX genes and Valt, and could alleviate the oxidative damage and photosynthetic damage caused by salt stress. However, stress resistance was significantly decreased in plants pretreated with n-propyl gallate (nPG). More importantly, the damage in nPG-pretreated plants could not be alleviated by application of SNP. Further study showed that the effects of nPG on the activities of antioxidant enzymes were minor. These results showed that the AOX pathway played an important role in the SNP-elevated resistance of Medicago to salt stress. AOX could contribute to regulating the accumulation of reactive oxygen species (ROS) and to protection of the photosystem, and we propose that all of these depend on the ability to maintain the homeostasis of the redox state. PMID:26962709

  20. Replication of obesity and diabetes-related SNP associations in individuals from Yucatán, México

    PubMed Central

    Hernandez-Escalante, Victor M.; Nava-Gonzalez, Edna J.; Voruganti, V. Saroja; Kent, Jack W.; Haack, Karin; Laviada-Molina, Hugo A.; Molina-Segui, Fernanda; Gallegos-Cabriales, Esther C.; Lopez-Alvarenga, Juan Carlos; Cole, Shelley A.; Mezzles, Marguerite J.; Comuzzie, Anthony G.; Bastarrachea, Raul A.

    2014-01-01

    The prevalence of type 2 diabetes (T2D) is rising rapidly and in Mexicans is ~19%. T2D is affected by both environmental and genetic factors. Although specific genes have been implicated in T2D risk, few of these findings are confirmed in studies of Mexican subjects. Our aim was to replicate associations of 39 single nucleotide polymorphisms (SNPs) from 10 genes with T2D-related phenotypes in a community-based Mexican cohort. Unrelated individuals (n = 259) living in southeastern Mexico were enrolled in the study based at the University of Yucatan School of Medicine in Merida. Phenotypes measured included anthropometric measurements, circulating levels of adipose tissue endocrine factors (leptin, adiponectin, pro-inflammatory cytokines), and insulin, glucose, and blood pressure. Association analyses were conducted by measured genotype analysis implemented in SOLAR, adapted for unrelated individuals. SNP minor allele frequencies ranged from 2.2 to 48.6%. Nominal associations were found for CNR1, SLC30A8, GCK, and PCSK1 SNPs with systolic blood pressure, insulin and glucose, and for CNR1, SLC30A8, KCNJ11, and PCSK1 SNPs with adiponectin and leptin (p < 0.05). P-values less than 0.0014 were considered significant. Association of SNPs rs10485170 of CNR1 and rs5215 of KCNJ11 with adiponectin and leptin, respectively, reached near significance (p = 0.002). Significant association (p = 0.001) was observed between plasma leptin and rs5219 of KCNJ11. PMID:25477898

  1. Multi-SNP analysis of MHC region: remarkable conservation of HLA-A1-B8-DR3 haplotype.

    PubMed

    Aly, Theresa A; Eller, Elise; Ide, Akane; Gowan, Katherine; Babu, Sunanda R; Erlich, Henry A; Rewers, Marian J; Eisenbarth, George S; Fain, Pamela R

    2006-05-01

    Technology has become available to cost-effectively analyze thousands of single nucleotide polymorphisms (SNPs). We recently confirmed by genotyping a small series of class I alleles and microsatellite markers that the extended haplotype HLA-A1-B8-DR3 (8.1 AH) at the major histocompatibility complex (MHC) is a common and conserved haplotype. To further evaluate the region of conservation of the DR3 haplotypes, we genotyped 31 8.1 AHs and 29 other DR3 haplotypes with a panel of 656 SNPs spanning 4.8 Mb in the MHC region. This multi-SNP evaluation revealed a 2.9-Mb region that was essentially invariable for all 31 8.1 AHs. The 31 8.1 AHs were >99.9% identical for 384 consecutive SNPs of the 656 SNPs analyzed. Future association studies of MHC-linked susceptibility to type 1 diabetes will need to account for the extensive conservation of the 8.1 AH, since individuals who carry this haplotype provide no information about the differential effects of the alleles that are present on this haplotype. PMID:16644681

  2. Replication of obesity and diabetes-related SNP associations in individuals from Yucatán, México.

    PubMed

    Hernandez-Escalante, Victor M; Nava-Gonzalez, Edna J; Voruganti, V Saroja; Kent, Jack W; Haack, Karin; Laviada-Molina, Hugo A; Molina-Segui, Fernanda; Gallegos-Cabriales, Esther C; Lopez-Alvarenga, Juan Carlos; Cole, Shelley A; Mezzles, Marguerite J; Comuzzie, Anthony G; Bastarrachea, Raul A

    2014-01-01

    The prevalence of type 2 diabetes (T2D) is rising rapidly and in Mexicans is ~19%. T2D is affected by both environmental and genetic factors. Although specific genes have been implicated in T2D risk, few of these findings are confirmed in studies of Mexican subjects. Our aim was to replicate associations of 39 single nucleotide polymorphisms (SNPs) from 10 genes with T2D-related phenotypes in a community-based Mexican cohort. Unrelated individuals (n = 259) living in southeastern Mexico were enrolled in the study based at the University of Yucatan School of Medicine in Merida. Phenotypes measured included anthropometric measurements, circulating levels of adipose tissue endocrine factors (leptin, adiponectin, pro-inflammatory cytokines), and insulin, glucose, and blood pressure. Association analyses were conducted by measured genotype analysis implemented in SOLAR, adapted for unrelated individuals. SNP minor allele frequencies ranged from 2.2 to 48.6%. Nominal associations were found for CNR1, SLC30A8, GCK, and PCSK1 SNPs with systolic blood pressure, insulin and glucose, and for CNR1, SLC30A8, KCNJ11, and PCSK1 SNPs with adiponectin and leptin (p < 0.05). P-values less than 0.0014 were considered significant. Association of SNPs rs10485170 of CNR1 and rs5215 of KCNJ11 with adiponectin and leptin, respectively, reached near significance (p = 0.002). Significant association (p = 0.001) was observed between plasma leptin and rs5219 of KCNJ11. PMID:25477898

  3. Strategies for single nucleotide polymorphism (SNP) genotyping to enhance genotype imputation in Gyr (Bos indicus) dairy cattle: Comparison of commercially available SNP chips.

    PubMed

    Boison, S A; Santos, D J A; Utsunomiya, A H T; Carvalheiro, R; Neves, H H R; O'Brien, A M Perez; Garcia, J F; Sölkner, J; da Silva, M V G B

    2015-07-01

    Genotype imputation is widely used as a cost-effective strategy in genomic evaluation of cattle. Key determinants of imputation accuracies, such as linkage disequilibrium patterns, marker densities, and ascertainment bias, differ between Bos indicus and Bos taurus breeds. Consequently, there is a need to investigate effectiveness of genotype imputation in indicine breeds. Thus, the objective of the study was to investigate strategies and factors affecting the accuracy of genotype imputation in Gyr (Bos indicus) dairy cattle. Four imputation scenarios were studied using 471 sires and 1,644 dams genotyped on Illumina BovineHD (HD-777K; San Diego, CA) and BovineSNP50 (50K) chips, respectively. Scenarios were based on which reference high-density single nucleotide polymorphism (SNP) panel (HDP) should be adopted [HD-777K, 50K, and GeneSeek GGP-75Ki (Lincoln, NE)]. Depending on the scenario, validation animals had their genotypes masked for one of the lower-density panels: Illumina (3K, 7K, and 50K) and GeneSeek (SGGP-20Ki and GGP-75Ki). We randomly selected 171 sires as reference and 300 as validation for all the scenarios. Additionally, all sires were used as reference and the 1,644 dams were imputed for validation. Genotypes of 98 individuals with 4 and more offspring were completely masked and imputed. Imputation algorithms FImpute and Beagle v3.3 and v4 were used. Imputation accuracies were measured using the correlation and allelic correct rate. FImpute resulted in highest accuracies, whereas Beagle 3.3 gave the least-accurate imputations. Accuracies evaluated as correlation (allelic correct rate) ranged from 0.910 (0.942) to 0.961 (0.974) using 50K as HDP and with 3K (7K) as low-density panels. With GGP-75Ki as HDP, accuracies were moderate for 3K, 7K, and 50K, but high for SGGP-20Ki. The use of HD-777K as HDP resulted in accuracies of 0.888 (3K), 0.941 (7K), 0.980 (SGGP-20Ki), 0.982 (50K), and 0.993 (GGP-75Ki). Ungenotyped individuals were imputed with an
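    The two accuracy measures named in this abstract, correlation and allelic correct rate, can be sketched as follows for genotypes coded as 0, 1 or 2 copies of an allele. This is a minimal sketch under stated assumptions: the abstract does not give the exact definitions used, so the allelic correct rate is taken here as the share of alleles that match between true and imputed genotypes, and the function names and example genotypes are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


def allelic_correct_rate(true_gt, imputed_gt):
    """Share of alleles imputed correctly; each genotype carries two alleles."""
    wrong_alleles = sum(abs(t - i) for t, i in zip(true_gt, imputed_gt))
    return 1.0 - wrong_alleles / (2 * len(true_gt))


# Hypothetical masked (true) and imputed genotypes for eight SNPs in one animal.
true_gt = [0, 1, 2, 2, 1, 0, 1, 2]
imputed_gt = [0, 1, 2, 1, 1, 0, 2, 2]

r = pearson_r(true_gt, imputed_gt)
acr = allelic_correct_rate(true_gt, imputed_gt)
```

    The two measures can disagree: the correct rate is inflated when the minor allele is rare, because guessing the common homozygote is usually right, whereas the correlation is not.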

  4. A GWAS SNP for Schizophrenia Is Linked to the Internal MIR137 Promoter and Supports Differential Allele-Specific Expression

    PubMed Central

    Warburton, Alix; Breen, Gerome; Bubb, Vivien J.; Quinn, John P.

    2016-01-01

    Single nucleotide polymorphisms (SNPs) within the MIR137 gene locus have been shown to confer risk for schizophrenia through genome-wide association studies (GWAS). The expression levels of microRNA-137 (miR-137) and its validated gene targets have also been shown to be disrupted in several neuropsychiatric conditions, including schizophrenia. Regulation of miR-137 expression is thus imperative for normal neuronal functioning. We previously characterized an internal promoter domain within the MIR137 gene that contained a variable number tandem repeat (VNTR) polymorphism and could alter the in vitro levels of miR-137 in a stimulus-induced and allele-specific manner. We now demonstrate that haplotype tagging-SNP analysis linked the rs1625579 GWAS SNP for schizophrenia to this internal MIR137 promoter through a proxy SNP rs2660304 located at this domain. We postulated that the rs2660304 promoter SNP may act as predisposing factor for schizophrenia through altering the levels of miR-137 expression in a genotype-dependent manner. Reporter gene analysis of the internal MIR137 promoter containing the common VNTR variant demonstrated genotype-dependent differences in promoter activity with respect to rs2660304. In line with previous reports, the major allele of the rs2660304 proxy SNP, which has previously been linked with schizophrenia risk through genetic association, resulted in downregulation of reporter gene expression in a tissue culture model. The genetic influence of the rs2660304 proxy SNP on the transcriptional activity of the internal MIR137 promoter, and thus the levels of miR-137 expression, therefore offers a distinct regulatory mechanism to explain the functional significance of the rs1625579 GWAS SNP for schizophrenia risk. PMID:26429811

  5. A GWAS SNP for Schizophrenia Is Linked to the Internal MIR137 Promoter and Supports Differential Allele-Specific Expression.

    PubMed

    Warburton, Alix; Breen, Gerome; Bubb, Vivien J; Quinn, John P

    2016-07-01

    Single nucleotide polymorphisms (SNPs) within the MIR137 gene locus have been shown to confer risk for schizophrenia through genome-wide association studies (GWAS). The expression levels of microRNA-137 (miR-137) and its validated gene targets have also been shown to be disrupted in several neuropsychiatric conditions, including schizophrenia. Regulation of miR-137 expression is thus imperative for normal neuronal functioning. We previously characterized an internal promoter domain within the MIR137 gene that contained a variable number tandem repeat (VNTR) polymorphism and could alter the in vitro levels of miR-137 in a stimulus-induced and allele-specific manner. We now demonstrate that haplotype tagging-SNP analysis linked the rs1625579 GWAS SNP for schizophrenia to this internal MIR137 promoter through a proxy SNP, rs2660304, located at this domain. We postulated that the rs2660304 promoter SNP may act as a predisposing factor for schizophrenia by altering the levels of miR-137 expression in a genotype-dependent manner. Reporter gene analysis of the internal MIR137 promoter containing the common VNTR variant demonstrated genotype-dependent differences in promoter activity with respect to rs2660304. In line with previous reports, the major allele of the rs2660304 proxy SNP, which has previously been linked with schizophrenia risk through genetic association, resulted in downregulation of reporter gene expression in a tissue culture model. The genetic influence of the rs2660304 proxy SNP on the transcriptional activity of the internal MIR137 promoter, and thus the levels of miR-137 expression, therefore offers a distinct regulatory mechanism to explain the functional significance of the rs1625579 GWAS SNP for schizophrenia risk. PMID:26429811

  6. Imputation of microsatellite alleles from dense SNP genotypes for parentage verification across multiple Bos taurus and Bos indicus breeds

    PubMed Central

    McClure, Matthew C.; Sonstegard, Tad S.; Wiggans, George R.; Van Eenennaam, Alison L.; Weber, Kristina L.; Penedo, Cecilia T.; Berry, Donagh P.; Flynn, John; Garcia, Jose F.; Carmo, Adriana S.; Regitano, Luciana C. A.; Albuquerque, Milla; Silva, Marcos V. G. B.; Machado, Marco A.; Coffey, Mike; Moore, Kirsty; Boscher, Marie-Yvonne; Genestout, Lucie; Mazza, Raffaele; Taylor, Jeremy F.; Schnabel, Robert D.; Simpson, Barry; Marques, Elisa; McEwan, John C.; Cromie, Andrew; Coutinho, Luiz L.; Kuehn, Larry A.; Keele, John W.; Piper, Emily K.; Cook, Jim; Williams, Robert; Van Tassell, Curtis P.

    2013-01-01

    To help cattle producers transition from microsatellite (MS) to single nucleotide polymorphism (SNP) genotyping for parental verification, we previously devised an effective and inexpensive method to impute MS alleles from SNP haplotypes. While the reported method was verified with only a limited data set (N = 479) from Brown Swiss, Guernsey, Holstein, and Jersey cattle, some of the MS-SNP haplotype associations were concordant across these phylogenetically diverse breeds. This implied that some haplotypes predate modern breed formation and remain in strong linkage disequilibrium. To expand the utility of MS allele imputation across breeds, MS and SNP data from more than 8000 animals representing 39 breeds (Bos taurus and B. indicus) were used to predict 9410 SNP haplotypes, incorporating an average of 73 SNPs per haplotype, for which alleles from 12 MS markers could be accurately imputed. Approximately 25% of the MS-SNP haplotypes were present in multiple breeds (N = 2 to 36 breeds). These shared haplotypes allowed for MS imputation in breeds that were not represented in the reference population with only a small increase in Mendelian inheritance inconsistencies. Our reported reference haplotypes can be used for any cattle breed and the reported methods can be applied to any species to aid the transition from MS to SNP genetic markers. While ~91% of the animals with imputed alleles for 12 MS markers had ≤1 Mendelian inheritance conflicts with their parents' reported MS genotypes, this figure was 96% for our reference animals, indicating potential errors in the reported MS genotypes. The workflow we suggest autocorrects for genotyping errors and rare haplotypes by MS genotyping animals whose imputed MS alleles fail parentage verification and then incorporating those animals into the reference dataset. PMID:24065982

  7. Association of rs2072183 SNP and serum lipid levels in the Mulao and Han populations

    PubMed Central

    2012-01-01

    Background: Niemann-Pick C1-like 1 (NPC1L1) is a key protein for intestinal cholesterol transport. Common single nucleotide polymorphisms (SNPs) in the NPC1L1 gene have been associated with cholesterol absorption and serum lipid levels. The present study was undertaken to explore the possible association of the NPC1L1 rs2072183 1735 C > G SNP and several environmental factors with serum lipid levels in the Mulao and Han populations. Methods: Genotyping of the rs2072183 SNP was performed in 688 Mulao and 738 Han Chinese participants. The interactions between the NPC1L1 1735 C > G polymorphism and several environmental factors on serum lipid phenotypes were tested using factorial design covariance analysis after controlling for potential confounders. Results: The frequency of the G allele was lower in Mulao than in Han (29.72% vs. 37.26%, P < 0.001). The frequencies of the CC, CG and GG genotypes were 49.85%, 40.84% and 9.31% in Mulao, and 39.30%, 46.88% and 13.82% in Han, respectively (P < 0.001). The levels of low-density lipoprotein cholesterol (LDL-C), apolipoprotein (Apo) B and the ratio of ApoAI/ApoB differed among the three genotypes in Han but not in Mulao (P < 0.05 for all); subjects with GG and CG genotypes had higher LDL-C and ApoB levels and a lower ApoAI/ApoB ratio than subjects with the CC genotype. Subgroup analysis showed that the G allele carriers in Han had higher total cholesterol (TC), LDL-C and ApoB levels in males (P < 0.05) and a lower ApoAI/ApoB ratio in both sexes (P < 0.05) than the G allele noncarriers. The G allele carriers in Mulao had higher TC and LDL-C levels in males (P < 0.05) and lower high-density lipoprotein cholesterol (HDL-C) levels in both sexes (P < 0.05) than the G allele noncarriers. Serum TC, LDL-C, ApoB levels and ApoAI/ApoB ratio were correlated with genotypes in Han males (P < 0.05) but not in females. Serum lipid parameters were also correlated with several environmental
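
    The reported G allele frequencies can be cross-checked against the reported genotype frequencies, since each heterozygote carries one copy of the allele and each homozygote two. The short Python sketch below does that arithmetic; the function name is hypothetical and only the percentages come from the abstract. The Mulao value comes out at 29.73% rather than the reported 29.72%, which is most likely rounding in the published genotype percentages.

```python
def allele_frequency_from_genotypes(het_pct, hom_pct):
    """Allele frequency (%) from heterozygote and homozygote genotype frequencies (%).

    Hypothetical helper for illustration: freq(G) = freq(GG) + freq(CG) / 2.
    """
    return hom_pct + het_pct / 2.0


# Genotype percentages as reported in the abstract (CG = heterozygote, GG = homozygote).
mulao_g = allele_frequency_from_genotypes(het_pct=40.84, hom_pct=9.31)
han_g = allele_frequency_from_genotypes(het_pct=46.88, hom_pct=13.82)

print(f"Mulao G allele: {mulao_g:.2f}%")
print(f"Han G allele:   {han_g:.2f}%")
```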

  8. No association of IFNG+874T/A SNP and NOS2A-954G/C SNP variants with nitric oxide radical serum levels or susceptibility to tuberculosis in a Brazilian population subset.

    PubMed

    Leandro, Ana Cristina C S; Rocha, Márcia Andrade; Lamoglia-Souza, Andreia; VandeBerg, John L; Rolla, Valeria Cavalcanti; Bonecini-Almeida, Maria da Gloria

    2013-01-01

    Tuberculosis (TB) is one of the most common infectious diseases in the world. Mycobacterium tuberculosis infection leads to pulmonary active disease in approximately 5-10% of exposed individuals. Both bacteria- and host-related characteristics influence latent infection and disease. Host genetic predisposition to develop TB may involve multiple genes and their polymorphisms. It was reported previously that interferon gamma (IFN-γ) and nitric oxide synthase 2 (NOS2) are expressed on alveolar macrophages from TB patients and are responsible for bacilli control; thus, we aimed this study at genotyping single nucleotide polymorphisms IFNG+874T/A SNP and NOS2A-954G/C SNP to estimate their role on TB susceptibility and determine whether these polymorphisms influence serum nitrite and NOx(-) production. This case-control study enrolled 172 TB patients and 179 healthy controls. Neither polymorphism was associated with susceptibility to TB. NOS2A-954G/C SNP was not associated with serum levels of nitrite and NOx(-). These results indicate that variants of IFNG+874T/A SNP and NOS2A-954G/C SNP do not influence TB susceptibility or the secretion of nitric oxide radicals in the study population. PMID:24024215

  9. Allele-Specific Amplification in Cancer Revealed by SNP Array Analysis

    PubMed Central

    2005-01-01

    Amplification, deletion, and loss of heterozygosity of genomic DNA are hallmarks of cancer. In recent years a variety of studies have emerged measuring total chromosomal copy number at increasingly high resolution. Similarly, loss-of-heterozygosity events have been finely mapped using high-throughput genotyping technologies. We have developed a probe-level allele-specific quantitation procedure that extracts both copy number and allelotype information from single nucleotide polymorphism (SNP) array data to arrive at allele-specific copy number across the genome. Our approach applies an expectation-maximization algorithm to a model derived from a novel classification of SNP array probes. This method is the first to our knowledge that is able to (a) determine the generalized genotype of aberrant samples at each SNP site (e.g., CCCCT at an amplified site), and (b) infer the copy number of each parental chromosome across the genome. With this method, we are able to determine not just where amplifications and deletions occur, but also the haplotype of the region being amplified or deleted. The merit of our model and general approach is demonstrated by very precise genotyping of normal samples, and our allele-specific copy number inferences are validated using PCR experiments. Applying our method to a collection of lung cancer samples, we are able to conclude that amplification is essentially monoallelic, as would be expected under the mechanisms currently believed responsible for gene amplification. This suggests that a specific parental chromosome may be targeted for amplification, whether because of germ line or somatic variation. An R software package containing the methods described in this paper is freely available at http://genome.dfci.harvard.edu/~tlaframb/PLASQ. PMID:16322765

  10. Identification of SNP and SSR markers in eggplant using RAD tag sequencing

    PubMed Central

    2011-01-01

    Background: The eggplant (Solanum melongena L.) genome is relatively unexplored, especially compared to those of the other major Solanaceae crops tomato and potato. In particular, no SNP markers are publicly available; on the other hand, over 1,000 SSR markers have been developed and are publicly available. We have combined the recently developed Restriction-site Associated DNA (RAD) approach with Illumina DNA sequencing for rapid and mass discovery of both SNP and SSR markers for eggplant. Results: RAD tags were generated from the genomic DNA of a pair of eggplant mapping parents, and sequenced to produce ~17.5 Mb of sequences arrangeable into ~78,000 contigs. The resulting non-redundant genomic sequence dataset consisted of ~45,000 sequences, of which ~29% were putative coding sequences and ~70% were in common between the mapping parents. The shared sequences allowed the discovery of ~10,000 SNPs and nearly 1,000 indels, equivalent to a SNP frequency of 0.8 per Kb and an indel frequency of 0.07 per Kb. Over 2,000 of the SNPs are likely to be mappable via the Illumina GoldenGate assay. A subset of 384 SNPs was used to successfully fingerprint a panel of eggplant germplasm, producing a set of informative diversity data. The RAD sequences also included nearly 2,000 putative SSRs, and primer pairs were designed to amplify 1,155 loci. Conclusion: The high throughput sequencing of the RAD tags allowed the discovery of a large number of DNA markers, which will prove useful for extending our current knowledge of the genome organization of eggplant, for assisting in marker-aided selection and for carrying out comparative genomic analyses within the Solanaceae family. PMID:21663628

  11. A Whole Methylome CpG-SNP Association Study of Psychosis in Blood and Brain Tissue.

    PubMed

    van den Oord, Edwin J C G; Clark, Shaunna L; Xie, Lin Ying; Shabalin, Andrey A; Dozmorov, Mikhail G; Kumar, Gaurav; Vladimirov, Vladimir I; Magnusson, Patrik K E; Aberg, Karolina A

    2016-07-01

    Mutated CpG sites (CpG-SNPs) are potential hotspots for human diseases because in addition to the sequence variation they may show individual differences in DNA methylation. We performed methylome-wide association studies (MWAS) to test whether methylation differences at those sites were associated with schizophrenia. We assayed all common CpG-SNPs with methyl-CpG binding domain protein-enriched genome sequencing (MBD-seq) using DNA extracted from 1408 blood samples and 66 postmortem brain samples (BA10) of schizophrenia cases and controls. Seven CpG-SNPs passed our FDR threshold of 0.1 in the blood MWAS. Of the CpG-SNPs methylated in brain, 94% were also methylated in blood. This significantly exceeded the 46.2% overlap expected by chance (P-value < 1.0×10^-8) and justified replicating findings from blood in brain tissue. CpG-SNP rs3796293 in IL1RAP replicated (P-value = 0.003) with the same direction of effects. This site was further validated through targeted bisulfite pyrosequencing in 736 independent case-control blood samples (P-value < 9.5×10^-4). Our top result in the brain MWAS (P-value = 8.8×10^-7) was CpG-SNP rs16872141 located in the potential promoter of ENC1. Overall, our results suggested that CpG-SNP methylation may reflect effects of environmental insults and can provide biomarkers in blood that could potentially improve disease management. PMID:26656881

  12. PredictSNP: Robust and Accurate Consensus Classifier for Prediction of Disease-Related Mutations

    PubMed Central

    Bendl, Jaroslav; Stourac, Jan; Salanda, Ondrej; Pavelka, Antonin; Wieben, Eric D.; Zendulka, Jaroslav; Brezovsky, Jan; Damborsky, Jiri

    2014-01-01

    Single nucleotide variants represent a prevalent form of genetic variation. Mutations in the coding regions are frequently associated with the development of various genetic diseases. Computational tools for the prediction of the effects of mutations on protein function are very important for analysis of single nucleotide variants and their prioritization for experimental characterization. Many computational tools are already widely employed for this purpose. Unfortunately, their comparison and further improvement is hindered by large overlaps between the training datasets and benchmark datasets, which lead to biased and overly optimistic reported performances. In this study, we have constructed three independent datasets by removing all duplicates, inconsistencies and mutations previously used in the training of evaluated tools. The benchmark dataset containing over 43,000 mutations was employed for the unbiased evaluation of eight established prediction tools: MAPP, nsSNPAnalyzer, PANTHER, PhD-SNP, PolyPhen-1, PolyPhen-2, SIFT and SNAP. The six best performing tools were combined into a consensus classifier, PredictSNP, which yielded significantly improved prediction performance and at the same time returned results for all mutations, confirming that consensus prediction represents an accurate and robust alternative to the predictions delivered by individual tools. A user-friendly web interface enables easy access to all eight prediction tools, the consensus classifier PredictSNP and annotations from the Protein Mutant Database and the UniProt database. The web server and the datasets are freely available to the academic community at http://loschmidt.chemi.muni.cz/predictsnp. PMID:24453961
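
    To illustrate why a consensus can return a result for every mutation even when individual tools fail, here is a minimal Python sketch of an unweighted majority vote. It is an assumption for illustration only: the abstract does not describe how PredictSNP combines its tools, and the function name, the labels and the example calls are hypothetical.

```python
from collections import Counter


def consensus_call(predictions):
    """Unweighted majority vote over per-tool calls.

    predictions maps a tool name to "deleterious", "neutral", or None
    (None meaning the tool returned no result for this mutation).
    Returns the winning label, "tie", or None if no tool answered.
    """
    votes = Counter(call for call in predictions.values() if call is not None)
    if not votes:
        return None
    ranked = votes.most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return "tie"
    return ranked[0][0]


# Hypothetical calls for one mutation; MAPP gives no answer here,
# yet the consensus still produces a result from the remaining tools.
example = {
    "SIFT": "deleterious",
    "PolyPhen-2": "deleterious",
    "MAPP": None,
    "SNAP": "neutral",
}
print(consensus_call(example))
```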

  13. Plastid DNA sequencing and nuclear SNP genotyping help resolve the puzzle of central American Platanus

    PubMed Central

    De Castro, Olga; Di Maio, Antonietta; Lozada García, José Armando; Piacenti, Danilo; Vázquez-Torres, Mario; De Luca, Paolo

    2013-01-01

    Background and Aims: Recent research on the history of Platanus reveals that hybridization phenomena occurred in the central American species. This study has two goals: to help resolve the evolutionary puzzle of central American Platanus, and to test the potential of real-time polymerase chain reaction (PCR) for detecting ancient hybridization. Methods: Sequencing of a uniparental plastid DNA marker [psbA-trnH(GUG) intergenic spacer] and qualitative and quantitative single nucleotide polymorphism (SNP) genotyping of biparental nuclear ribosomal DNA (nrDNA) markers [LEAFY intron 2 (LFY-i2) and internal transcribed spacer 2 (ITS2)] were used. Key Results: Based on the SNP genotyping results, several Platanus accessions show the presence of hybridization/introgression, including some accessions of P. rzedowskii and of P. mexicana var. interior and one of P. mexicana var. mexicana from Oaxaca (= P. oaxacana). Based on haplotype analyses of the psbA-trnH spacer, five haplotypes were detected. The most common of these is present in taxa belonging to P. orientalis, P. racemosa sensu lato, some accessions of P. occidentalis sensu stricto (s.s.) from Texas, P. occidentalis var. palmeri, P. mexicana s.s. and P. rzedowskii. This is highly relevant to genetic relationships with the haplotypes present in P. occidentalis s.s. and P. mexicana var. interior. Conclusions: Hybridization and introgression events between lineages ancestral to modern central and eastern North American Platanus species occurred. Plastid haplotypes and qualitative and quantitative SNP genotyping provide information critical for understanding the complex history of Mexican Platanus. Compared with the usual molecular techniques of sub-cloning, sequencing and genotyping, real-time PCR assay is a quick and sensitive technique for analysing complex evolutionary patterns. PMID:23798602

  14. High-density SNP assay development for genetic analysis in maritime pine (Pinus pinaster).

    PubMed

    Plomion, C; Bartholomé, J; Lesur, I; Boury, C; Rodríguez-Quilón, I; Lagraulet, H; Ehrenmann, F; Bouffier, L; Gion, J M; Grivet, D; de Miguel, M; de María, N; Cervera, M T; Bagnoli, F; Isik, F; Vendramin, G G; González-Martínez, S C

    2016-03-01

    Maritime pine provides essential ecosystem services in the south-western Mediterranean basin, where it covers around 4 million ha. Its scattered distribution over a range of environmental conditions makes it an ideal forest tree species for studies of local adaptation and evolutionary responses to climatic change. Highly multiplexed single nucleotide polymorphism (SNP) genotyping arrays are increasingly used to study genetic variation in living organisms and for practical applications in plant and animal breeding and genetic resource conservation. We developed a 9k Illumina Infinium SNP array and genotyped maritime pine trees from (i) a three-generation inbred (F2) pedigree, (ii) the French breeding population and (iii) natural populations from Portugal and the French Atlantic coast. A large proportion of the exploitable SNPs (2052/8410, i.e. 24.4%) segregated in the mapping population and could be mapped, providing the densest ever gene-based linkage map for this species. Based on 5016 SNPs, natural and breeding populations from the French gene pool exhibited similar level of genetic diversity. Population genetics and structure analyses based on 3981 SNP markers common to the Portuguese and French gene pools revealed high levels of differentiation, leading to the identification of a set of highly differentiated SNPs that could be used for seed provenance certification. Finally, we discuss how the validated SNPs could facilitate the identification of ecologically and economically relevant genes in this species, improving our understanding of the demography and selective forces shaping its natural genetic diversity, and providing support for new breeding strategies. PMID:26358548

  15. CLUSTAG & WCLUSTAG: Hierarchical Clustering Algorithms for Efficient Tag-SNP Selection

    NASA Astrophysics Data System (ADS)

    Ao, Sio-Iong

    More than 6 million single nucleotide polymorphisms (SNPs) in the human genome have been genotyped by the HapMap project. Although only a proportion of these SNPs are functional, all can be considered as candidate markers for indirect association studies to detect disease-related genetic variants. The complete screening of a gene or a chromosomal region is nevertheless an expensive undertaking for association studies. A key strategy for improving the efficiency of association studies is to select a subset of informative SNPs, called tag SNPs, for analysis. In the chapter, hierarchical clustering algorithms have been proposed for efficient tag SNP selection.
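
    As a rough illustration of tag SNP selection by hierarchical clustering, the Python sketch below groups SNPs by complete-linkage agglomerative clustering on pairwise r2 and then picks one representative per cluster. It is a generic sketch under stated assumptions, not the CLUSTAG or WCLUSTAG algorithm itself: the 0/1/2 genotype coding, the r2 threshold of 0.8, the rule for choosing the tag and all names are assumptions made for illustration.

```python
from itertools import combinations


def r_squared(a, b):
    """Squared Pearson correlation between two genotype vectors coded 0/1/2."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a monomorphic SNP carries no LD information
    return cov * cov / (var_a * var_b)


def cluster_tag_snps(genotypes, min_r2=0.8):
    """Complete-linkage clustering on r2, then one tag SNP per cluster."""
    names = list(genotypes)
    r2 = {frozenset(pair): r_squared(genotypes[pair[0]], genotypes[pair[1]])
          for pair in combinations(names, 2)}

    def linkage(c1, c2):
        # Complete linkage: the weakest pairwise r2 between the two clusters.
        return min(r2[frozenset((i, j))] for i in c1 for j in c2)

    clusters = [[name] for name in names]
    while len(clusters) > 1:
        x, y = max(combinations(range(len(clusters)), 2),
                   key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        if linkage(clusters[x], clusters[y]) < min_r2:
            break  # no remaining pair of clusters is similar enough to merge
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]  # safe because x < y

    def weakest_link(snp, cluster):
        return min((r2[frozenset((snp, other))] for other in cluster if other != snp),
                   default=1.0)

    # The tag is the member whose weakest r2 with the rest of its cluster is highest.
    tags = [max(cluster, key=lambda s: weakest_link(s, cluster)) for cluster in clusters]
    return clusters, tags


# Hypothetical genotypes for four SNPs in six individuals.
genotypes = {
    "A": [0, 1, 2, 0, 1, 2],
    "B": [0, 1, 2, 0, 1, 2],
    "C": [2, 1, 0, 2, 1, 0],
    "D": [0, 0, 1, 2, 2, 1],
}
clusters, tags = cluster_tag_snps(genotypes)
print(clusters, tags)
```

    Complete linkage is used here so that every SNP in a cluster is in LD above the threshold with every other member, which means any member can stand in for the rest.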

  16. SNP discovery in non-model organisms using 454 next generation sequencing.

    PubMed

    Wheat, Christopher W

    2012-01-01

    Roche 454 sequencing of the transcriptome has become a standard approach for efficiently obtaining single nucleotide polymorphisms (SNPs) in non-model species. In this chapter, the primary issues facing the development of SNPs from the transcriptome in non-model species are presented: tissue and sampling choices, mRNA preparation, considerations of normalization, pooling and barcoding, how much to sequence, how to assemble the data and assess the assembly, calling transcriptome SNPs, developing these into genomic SNPs, and publishing the work. Discussion also covers the comparison of this approach to RAD-tag sequencing and the potential of using other sequencing platforms for SNP development. PMID:22665274

  17. SNP annotation-based whole genomic prediction and selection: an application to feed efficiency and its component traits in pigs.

    PubMed

    Do, D N; Janss, L L G; Jensen, J; Kadarmideen, H N

    2015-05-01

    The study investigated genetic architecture and predictive ability using genomic annotation of residual feed intake (RFI) and its component traits (daily feed intake [DFI], ADG, and back fat [BF]). A total of 1,272 Duroc pigs had both genotypic and phenotypic records, and the records were split into a training (968 pigs) and a validation dataset (304 pigs) by assigning records as before and after January 1, 2012, respectively. SNP were annotated by 14 different classes using Ensembl variant effect prediction. Predictive accuracy and prediction bias were calculated using Bayesian Power LASSO, Bayesian A, B, and Cπ, and genomic BLUP (GBLUP) methods. Predictive accuracy ranged from 0.508 to 0.531, 0.506 to 0.532, 0.276 to 0.357, and 0.308 to 0.362 for DFI, RFI, ADG, and BF, respectively. BayesCπ100.1 increased accuracy slightly compared to the GBLUP model and other methods. The contribution per SNP to total genomic variance was similar among annotated classes across different traits. Predictive performance of SNP classes did not significantly differ from randomized SNP groups. Genomic prediction has accuracy comparable to observed phenotype, and use of genomic prediction can be cost effective by replacing feed intake measurement. Genomic annotation had less impact on predictive accuracy for the traits considered here, but this may be different for other traits. It is the first study to provide useful insights into biological classes of SNP driving the whole genomic prediction for complex traits in pigs. PMID:26020301

  18. SNPsplit: Allele-specific splitting of alignments between genomes with known SNP genotypes

    PubMed Central

    Krueger, Felix; Andrews, Simon R.

    2016-01-01

    Sequencing reads overlapping polymorphic sites in diploid mammalian genomes may be assigned to one allele or the other. This holds the potential to detect gene expression, chromatin modifications, DNA methylation or nuclear interactions in an allele-specific fashion. SNPsplit is an allele-specific alignment sorter designed to read files in SAM/BAM format and determine the allelic origin of reads or read-pairs that cover known single nucleotide polymorphic (SNP) positions. For this to work libraries must have been aligned to a genome in which all known SNP positions were masked with the ambiguity base 'N' and aligned using a suitable mapping program such as Bowtie2, TopHat, STAR, HISAT2, HiCUP or Bismark. SNPsplit also provides an automated solution to generate N-masked reference genomes for hybrid mouse strains based on the variant call information provided by the Mouse Genomes Project. The unique ability of SNPsplit to work with various different kinds of sequencing data including RNA-Seq, ChIP-Seq, Bisulfite-Seq or Hi-C opens new avenues for the integrative exploration of allele-specific data. PMID:27429743
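
    The two ideas in this abstract, masking known SNP positions with 'N' before alignment and then sorting reads by the base they carry at those positions, can be sketched in a few lines of Python. This is a toy illustration on plain strings, not SNPsplit's implementation (which works on SAM/BAM alignments); the function names, the 0-based coordinates, the ungapped-read assumption and the output labels are all assumptions.

```python
def n_mask(reference, snps):
    """Return the reference with every known SNP position replaced by 'N'.

    snps maps a 0-based position to a (genome1_base, genome2_base) pair.
    """
    seq = list(reference)
    for pos in snps:
        seq[pos] = "N"
    return "".join(seq)


def assign_read(read_seq, read_start, snps):
    """Assign an ungapped read to a parental genome from the SNPs it covers."""
    genome1_hits = genome2_hits = 0
    for pos, (base1, base2) in snps.items():
        offset = pos - read_start
        if 0 <= offset < len(read_seq):
            base = read_seq[offset]
            if base == base1:
                genome1_hits += 1
            elif base == base2:
                genome2_hits += 1
    if genome1_hits and genome2_hits:
        return "conflicting"
    if genome1_hits:
        return "genome1"
    if genome2_hits:
        return "genome2"
    return "unassigned"  # no informative SNP covered


# Hypothetical 10 bp reference with two known SNPs between the parental strains.
reference = "ACGTACGTAC"
snps = {2: ("G", "A"), 7: ("T", "C")}

print(n_mask(reference, snps))
print(assign_read("ATAC", 2, snps))
```

    Masking matters because a read carrying the non-reference allele would otherwise incur a mismatch penalty that the other allele does not, which would bias the alignment toward one parent.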

  19. A Genome Wide Survey of SNP Variation Reveals the Genetic Structure of Sheep Breeds

    PubMed Central

    Kijas, James W.; Townley, David; Dalrymple, Brian P.; Heaton, Michael P.; Maddox, Jillian F.; McGrath, Annette; Wilson, Peter; Ingersoll, Roxann G.; McCulloch, Russell; McWilliam, Sean; Tang, Dave; McEwan, John; Cockett, Noelle; Oddy, V. Hutton; Nicholas, Frank W.; Raadsma, Herman

    2009-01-01

    The genetic structure of sheep reflects their domestication and subsequent formation into discrete breeds. Understanding genetic structure is essential for achieving genetic improvement through genome-wide association studies, genomic selection and the dissection of quantitative traits. After identifying the first genome-wide set of SNP for sheep, we report on levels of genetic variability both within and between a diverse sample of ovine populations. Then, using cluster analysis and the partitioning of genetic variation, we demonstrate sheep are characterised by weak phylogeographic structure, overlapping genetic similarity and generally low differentiation which is consistent with their short evolutionary history. The degree of population substructure was, however, sufficient to cluster individuals based on geographic origin and known breed history. Specifically, African and Asian populations clustered separately from breeds of European origin sampled from Australia, New Zealand, Europe and North America. Furthermore, we demonstrate the presence of stratification within some, but not all, ovine breeds. The results emphasize that careful documentation of genetic structure will be an essential prerequisite when mapping the genetic basis of complex traits. Furthermore, the identification of a subset of SNP able to assign individuals into broad groupings demonstrates even a small panel of markers may be suitable for applications such as traceability. PMID:19270757

  20. GStream: Improving SNP and CNV Coverage on Genome-Wide Association Studies

    PubMed Central

    Alonso, Arnald; Marsal, Sara; Tortosa, Raül; Canela-Xandri, Oriol; Julià, Antonio

    2013-01-01

    We present GStream, a method that combines genome-wide SNP and CNV genotyping in the Illumina microarray platform with unprecedented accuracy. This new method outperforms previous well-established SNP genotyping software. More importantly, the CNV calling algorithm of GStream dramatically improves the results obtained by previous state-of-the-art methods and yields an accuracy that is close to that obtained by purely CNV-oriented technologies like Comparative Genomic Hybridization (CGH). We demonstrate the superior performance of GStream using microarray data generated from HapMap samples. Using the reference CNV calls generated by the 1000 Genomes Project (1KGP) and well-known studies on whole genome CNV characterization based either on CGH or genotyping microarray technologies, we show that GStream can increase the number of reliably detected variants up to 25% compared to previously developed methods. Furthermore, the increased genome coverage provided by GStream allows the discovery of CNVs in close linkage disequilibrium with SNPs, previously associated with disease risk in published Genome-Wide Association Studies (GWAS). These results could provide important insights into the biological mechanism underlying the detected disease risk association. With GStream, large-scale GWAS will not only benefit from the combined genotyping of SNPs and CNVs at an unprecedented accuracy, but will also take advantage of the computational efficiency of the method. PMID:23844243

  1. Linkage Disequilibrium Estimation of Chinese Beef Simmental Cattle Using High-density SNP Panels

    PubMed Central

    Zhu, M.; Zhu, B.; Wang, Y. H.; Wu, Y.; Xu, L.; Guo, L. P.; Yuan, Z. R.; Zhang, L. P.; Gao, X.; Gao, H. J.; Xu, S. Z.; Li, J. Y.

    2013-01-01

    Linkage disequilibrium (LD) plays an important role in genomic selection and mapping quantitative trait loci (QTL). In this study, the pattern of LD and effective population size (Ne) were investigated in Chinese beef Simmental cattle. A total of 640 bulls were genotyped with the Illumina BovineSNP50 BeadChip and the Illumina BovineHD BeadChip. We estimated LD for each autosomal chromosome for SNP pairs separated by 0 to 25 kb, 25 to 50 kb, 50 to 100 kb, 100 to 500 kb, 0.5 to 1 Mb, 1 to 5 Mb and 5 to 10 Mb. The mean values of r2 were 0.30, 0.16 and 0.08 for SNP separations of 0 to 25 kb, 50 to 100 kb and 0.5 to 1 Mb, respectively. The LD estimates decreased as the distance between SNPs in a pair increased, and increased with increasing minor allele frequency (MAF) and with decreasing sample size. Estimates of effective population size for Chinese beef Simmental cattle decreased over the past generations, and Ne was 73 five generations ago. PMID:25049849
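
    For readers unfamiliar with the r2 statistic, the Python sketch below computes it for one pair of biallelic SNPs and places a pair into one of the distance classes listed in the abstract. It assumes phased haplotypes coded 0/1, which the abstract does not state (array data are usually unphased and need an estimation step first); the function names and the choice to make each class upper-inclusive are likewise assumptions.

```python
from bisect import bisect_left

# Distance classes from the abstract, in base pairs (upper edges).
BIN_EDGES_BP = [25_000, 50_000, 100_000, 500_000, 1_000_000, 5_000_000, 10_000_000]
BIN_LABELS = ["0-25 kb", "25-50 kb", "50-100 kb", "100-500 kb",
              "0.5-1 Mb", "1-5 Mb", "5-10 Mb"]


def ld_r2(haplotypes):
    """r2 = D^2 / (pA(1-pA) pB(1-pB)), with D = pAB - pA*pB.

    haplotypes is a list of (allele_at_snp_a, allele_at_snp_b) pairs coded 0/1.
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n
    p_b = sum(b for _, b in haplotypes) / n
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return 0.0  # at least one SNP is monomorphic in this sample
    d = p_ab - p_a * p_b
    return d * d / denom


def distance_bin(distance_bp):
    """Label of the distance class for a SNP pair, or None beyond 10 Mb."""
    idx = bisect_left(BIN_EDGES_BP, distance_bp)
    return BIN_LABELS[idx] if idx < len(BIN_LABELS) else None


# Hypothetical haplotypes: perfect LD, then no LD.
print(ld_r2([(1, 1), (1, 1), (0, 0), (0, 0)]))
print(ld_r2([(1, 1), (1, 0), (0, 1), (0, 0)]))
print(distance_bin(30_000))
```

    Averaging `ld_r2` over all pairs that fall in the same `distance_bin` gives the kind of per-class mean r2 reported above.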

  2. Whole-Genome Analysis of Diversity and SNP-Major Gene Association in Peach Germplasm

    PubMed Central

    Micheletti, Diego; Dettori, Maria Teresa; Micali, Sabrina; Aramini, Valeria; Pacheco, Igor; Da Silva Linge, Cassia; Foschi, Stefano; Banchi, Elisa; Barreneche, Teresa; Quilot-Turion, Bénédicte; Lambert, Patrick; Pascal, Thierry; Iglesias, Ignasi; Carbó, Joaquim; Wang, Li-rong; Ma, Rui-juan; Li, Xiong-wei; Gao, Zhong-shan; Nazzicari, Nelson; Troggio, Michela; Bassi, Daniele; Rossini, Laura; Verde, Ignazio; Laurens, François; Arús, Pere; Aranzana, Maria José

    2015-01-01

    Peach was domesticated in China more than four millennia ago and from there it spread world-wide. Since the middle of the last century, peach breeding programs have been very dynamic generating hundreds of new commercial varieties, however, in most cases such varieties derive from a limited collection of parental lines (founders). This is one reason for the observed low levels of variability of the commercial gene pool, implying that knowledge of the extent and distribution of genetic variability in peach is critical to allow the choice of adequate parents to confer enhanced productivity, adaptation and quality to improved varieties. With this aim we genotyped 1,580 peach accessions (including a few closely related Prunus species) maintained and phenotyped in five germplasm collections (four European and one Chinese) with the International Peach SNP Consortium 9K SNP peach array. The study of population structure revealed the subdivision of the panel in three main populations, one mainly made up of Occidental varieties from breeding programs (POP1OCB), one of Occidental landraces (POP2OCT) and the third of Oriental accessions (POP3OR). Analysis of linkage disequilibrium (LD) identified differential patterns of genome-wide LD blocks in each of the populations. Phenotypic data for seven monogenic traits were integrated in a genome-wide association study (GWAS). The significantly associated SNPs were always in the regions predicted by linkage analysis, forming haplotypes of markers. These diagnostic haplotypes could be used for marker-assisted selection (MAS) in modern breeding programs. PMID:26352671

  3. Performance of different SNP panels for parentage testing in two East Asian cattle breeds.

    PubMed

    Strucken, E M; Gudex, B; Ferdosi, M H; Lee, H K; Song, K D; Gibson, J P; Kelly, M; Piper, E K; Porto-Neto, L R; Lee, S H; Gondro, C

    2014-08-01

    The International Society for Animal Genetics (ISAG) proposed a panel of single nucleotide polymorphisms (SNPs) for parentage testing in cattle (a core panel of 100 SNPs and an additional list of 100 SNPs). However, markers specific to East Asian taurine cattle breeds were not included, and no information is available as to whether the ISAG panel performs adequately for these breeds. We tested ISAG's core (100 SNP) and full (200 SNP) panels on two East Asian taurine breeds: the Korean Hanwoo and the Japanese Wagyu, the latter from the Australian herd. Even though the power of exclusion was high at 0.99 for both ISAG panels, the core panel performed poorly with 3.01% false-positive assignments in the Hanwoo population and 3.57% in the Wagyu. The full ISAG panel identified all sire-offspring relations correctly in both populations with 0.02% of relations wrongly excluded in the Hanwoo population. Based on these results, we created and tested two population-specific marker panels: one for the Wagyu population, which showed no false-positive assignments with either 100 or 200 SNPs, and a second panel for the Hanwoo, which still had some false-positive assignments with 100 SNPs but no false positives using 200 SNPs. In conclusion, for parentage assignment in East Asian cattle breeds, only the full ISAG panel is adequate for parentage testing. If fewer markers should be used, it is advisable to use population-specific markers rather than the ISAG panel. PMID:24730981
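
    The exclusion logic behind SNP-based parentage testing can be sketched as follows. This is a minimal illustration, not the procedure used in the study: the 0/1/2 genotype coding, the function names, the `max_conflicts` tolerance and the toy genotypes are all assumptions made for the example. A candidate parent is excluded when it is homozygous for one allele at loci where the offspring is homozygous for the other.

```python
# Illustrative sketch only: exclusion-based parentage check on biallelic SNPs.
# Genotypes are coded as the count of one allele (0, 1 or 2); None = no call.

def opposing_homozygotes(parent, offspring):
    """Count loci where parent and offspring are homozygous for different alleles."""
    conflicts = 0
    for p, o in zip(parent, offspring):
        if p is None or o is None:
            continue  # skip loci with a missing genotype
        if {p, o} == {0, 2}:
            conflicts += 1  # Mendelian inconsistency for a parent-offspring pair
    return conflicts


def is_excluded(parent, offspring, max_conflicts=1):
    # max_conflicts is a hypothetical tolerance for genotyping errors.
    return opposing_homozygotes(parent, offspring) > max_conflicts


# Toy genotypes (made up for illustration).
sire = [0, 1, 2, 2, 0, None, 1]
calf = [1, 1, 2, 1, 0, 2, 0]
unrelated = [2, 0, 0, 2, 2, 1, 2]

print(opposing_homozygotes(sire, calf), is_excluded(sire, calf))
print(opposing_homozygotes(unrelated, calf), is_excluded(unrelated, calf))
```

    With only a few markers, an unrelated animal can show no conflicts by chance, which is one way false-positive assignments arise; adding markers makes such chance agreement less likely.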

  4. SNP discovery in complex allotetraploid genomes (Gossypium spp., Malvaceae) using genotyping by sequencing

    PubMed Central

    Logan-Young, Carla Jo; Yu, John Z.; Verma, Surender K.; Percy, Richard G.; Pepper, Alan E.

    2015-01-01

    Premise of the study: Single-nucleotide polymorphism (SNP) marker discovery in plants with complex allotetraploid genomes is often confounded by the presence of homeologous loci (along with paralogous and orthologous loci). Here we present a strategy to filter for SNPs representing orthologous loci. Methods and Results: Using Illumina next-generation sequencing, 54 million reads were collected from restriction enzyme–digested DNA libraries of a diversity of Gossypium taxa. Loci with one to three SNPs were discovered using the Stacks software package, yielding 25,529 new cotton SNP combinations, including those that are polymorphic at both interspecific and intraspecific levels. Frequencies of predicted dual-homozygous (aa/bb) marker polymorphisms ranged from 6.7–11.6% of total shared fragments in intraspecific comparisons and from 15.0–16.4% in interspecific comparisons. Conclusions: This resource provides dual-homozygous (aa/bb) marker polymorphisms. Both in silico and experimental validation efforts demonstrated that these markers are enriched for single orthologous loci that are homozygous for alternative alleles. PMID:25798340

  5. Prenatal SNP array testing in 1000 fetuses with ultrasound anomalies: causative, unexpected and susceptibility CNVs.

    PubMed

    Srebniak, Malgorzata I; Diderich, Karin Em; Joosten, Marieke; Govaerts, Lutgarde Cp; Knijnenburg, Jeroen; de Vries, Femke At; Boter, Marjan; Lont, Debora; Knapen, Maarten Fcm; de Wit, Merel C; Go, Attie Tji; Galjaard, Robert-Jan H; Van Opstal, Diane

    2016-05-01

    To evaluate the diagnostic value of single-nucleotide polymorphism (SNP) array testing in 1033 fetuses with ultrasound anomalies we investigated the prevalence and genetic nature of pathogenic findings. We reclassified all pathogenic findings into three categories: causative findings; unexpected diagnoses (UD); and susceptibility loci (SL) for neurodevelopmental disorders. After exclusion of trisomy 13, 18, 21, sex-chromosomal aneuploidy and triploidies, in 76/1033 (7.4%) fetuses a pathogenic chromosome abnormality was detected by genomic SNP array: in 19/1033 cases (1.8%) a microscopically detectable abnormality was found and in 57/1033 (5.5%) fetuses a pathogenic submicroscopic chromosome abnormality was detected. 58% (n=44) of all these pathogenic chromosome abnormalities involved a causative finding, 35% (n=27) a SL for neurodevelopmental disorder, and 6% (n=5) a UD of an early-onset untreatable disease. In 0.3% of parental samples an incidental pathogenic finding was encountered. Our results confirm that a genomic array should be the preferred first-tier technique in fetuses with ultrasound anomalies. All UDs involved early-onset diseases, which is beneficial for the patients to know. It also seems that UDs occur at a comparable frequency among microscopic and submicroscopic pathogenic findings. SL were more often detected than in pregnancies without ultrasound anomalies. PMID:26328504

  6. PrimerMapper: high throughput primer design and graphical assembly for PCR and SNP detection.

    PubMed

    O'Halloran, Damien M

    2016-01-01

    Primer design represents a widely employed gambit in diverse molecular applications including PCR, sequencing, and probe hybridization. Variations of PCR, including primer walking, allele-specific PCR, and nested PCR provide specialized validation and detection protocols for molecular analyses that often require screening large numbers of DNA fragments. In these cases, automated sequence retrieval and processing become important features, and furthermore, a graphic that provides the user with a visual guide to the distribution of designed primers across targets is most helpful in quickly ascertaining primer coverage. To this end, I describe here, PrimerMapper, which provides a comprehensive graphical user interface that designs robust primers from any number of inputted sequences while providing the user with both, graphical maps of primer distribution for each inputted sequence, and also a global assembled map of all inputted sequences with designed primers. PrimerMapper also enables the visualization of graphical maps within a browser and allows the user to draw new primers directly onto the webpage. Other features of PrimerMapper include allele-specific design features for SNP genotyping, a remote BLAST window to NCBI databases, and remote sequence retrieval from GenBank and dbSNP. PrimerMapper is hosted at GitHub and freely available without restriction. PMID:26853558

  7. MultiBLUP: improved SNP-based prediction for complex traits.

    PubMed

    Speed, Doug; Balding, David J

    2014-09-01

    BLUP (best linear unbiased prediction) is widely used to predict complex traits in plant and animal breeding, and increasingly in human genetics. The BLUP mathematical model, which consists of a single random effect term, was adequate when kinships were measured from pedigrees. However, when genome-wide SNPs are used to measure kinships, the BLUP model implicitly assumes that all SNPs have the same effect-size distribution, which is a severe and unnecessary limitation. We propose MultiBLUP, which extends the BLUP model to include multiple random effects, allowing greatly improved prediction when the random effects correspond to classes of SNPs with distinct effect-size variances. The SNP classes can be specified in advance, for example, based on SNP functional annotations, and we also provide an adaptive procedure for determining a suitable partition of SNPs. We apply MultiBLUP to genome-wide association data from the Wellcome Trust Case Control Consortium (seven diseases), and from much larger studies of celiac disease and inflammatory bowel disease, finding that it consistently provides better prediction than alternative methods. Moreover, MultiBLUP is computationally very efficient; for the largest data set, which includes 12,678 individuals and 1.5 M SNPs, the total analysis can be run on a single desktop PC in less than a day and can be parallelized to run even faster. Tools to perform MultiBLUP are freely available in our software LDAK. PMID:24963154
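
    The contrast between the single random effect of BLUP and the multiple random effects of MultiBLUP can be written schematically as below. The notation is an assumption chosen for illustration (fixed effects are omitted, and the symbols are not taken from the abstract): K is a kinship matrix built from all SNPs, and K_m is a kinship matrix built only from the SNPs in class m.

```latex
% BLUP: one random effect, so all SNPs share one effect-size variance
\mathbf{y} = \mathbf{g} + \mathbf{e}, \qquad
\mathbf{g} \sim \mathcal{N}(\mathbf{0},\, \sigma_g^2 \mathbf{K}), \qquad
\mathbf{e} \sim \mathcal{N}(\mathbf{0},\, \sigma_e^2 \mathbf{I})

% MultiBLUP: M random effects, one per SNP class, each with its own variance
\mathbf{y} = \sum_{m=1}^{M} \mathbf{g}_m + \mathbf{e}, \qquad
\mathbf{g}_m \sim \mathcal{N}(\mathbf{0},\, \sigma_m^2 \mathbf{K}_m), \qquad
\mathbf{e} \sim \mathcal{N}(\mathbf{0},\, \sigma_e^2 \mathbf{I})
```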

  8. Coding region SNP analysis to enhance dog mtDNA discrimination power in forensic casework.

    PubMed

    Verscheure, Sophie; Backeljau, Thierry; Desmyter, Stijn

    2015-01-01

    The high population frequencies of three control region haplotypes contribute to the low discrimination power of the dog mtDNA control region. It also diminishes the evidential power of a match with one of these haplotypes in forensic casework. A mitochondrial genome study of 214 Belgian dogs suggested 26 polymorphic coding region sites that successfully resolved dogs with the three most frequent control region haplotypes. In this study, three SNP assays were developed to determine the identity of the 26 informative sites. The control region of 132 newly sampled dogs was sequenced and added to the study of 214 dogs. The assays were applied to 58 dogs of the haplotypes of interest, which confirmed their suitability for enhancing dog mtDNA discrimination power. In the Belgian population study of 346 dogs, the set of 26 sites divided the dogs into 25 clusters of mtGenome sequences with substantially lower population frequency estimates than their control region sequences. In case of a match with one of the three control region haplotypes, using these three SNP assays in conjunction with control region sequencing would augment the exclusion probability of dog mtDNA analysis from 92.9% to 97.0%. PMID:25299153
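
    The effect of splitting frequent haplotypes on discrimination power can be illustrated with one common definition of exclusion probability, one minus the sum of squared haplotype frequencies. This is a sketch under that assumption; the abstract does not state which estimator the authors used, and the haplotype labels and counts below are invented.

```python
from collections import Counter


def exclusion_probability(haplotypes):
    """1 - sum(p_i^2), where p_i is the sample frequency of haplotype i."""
    counts = Counter(haplotypes)
    n = len(haplotypes)
    match_probability = sum((c / n) ** 2 for c in counts.values())
    return 1.0 - match_probability


# Toy sample of 10 dogs typed for the control region only (made-up labels).
control_region = ["A"] * 5 + ["B"] * 3 + ["C"] * 2

# Same dogs after coding-region SNPs split haplotypes A and B into subtypes.
with_coding_snps = ["A1"] * 3 + ["A2"] * 2 + ["B1"] * 2 + ["B2"] + ["C"] * 2

print(round(exclusion_probability(control_region), 2))
print(round(exclusion_probability(with_coding_snps), 2))
```

    Splitting a common haplotype into rarer subtypes lowers the chance that two random individuals match, so the exclusion probability rises.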

  9. Three clinical experiences with SNP array results consistent with parental incest: a narrative with lessons learned.

    PubMed

    Helm, Benjamin M; Langley, Katherine; Spangler, Brooke; Vergano, Samantha

    2014-08-01

    Single nucleotide polymorphism microarrays have the ability to reveal parental consanguinity which may or may not be known to healthcare providers. Consanguinity can have significant implications for the health of patients and for individual and family psychosocial well-being. These results often present ethical and legal dilemmas that can have important ramifications. Unexpected consanguinity can be confounding to healthcare professionals who may be unprepared to handle these results or to communicate them to families or other appropriate representatives. There are few published accounts of experiences with consanguinity and SNP arrays. In this paper we discuss three cases where molecular evidence of parental incest was identified by SNP microarray. We hope to further highlight consanguinity as a potential incidental finding, how the cases were handled by the clinical team, and what resources were found to be most helpful. This paper aims to contribute further to professional discourse on incidental findings with genomic technology and how they were addressed clinically. These experiences may provide some guidance on how others can prepare for these findings and help improve practice. As genetic and genomic testing is utilized more by non-genetics providers, we also hope to inform about the importance of engaging with geneticists and genetic counselors when addressing these findings. PMID:24222483

  10. Differentiation of drug and non-drug Cannabis using a single nucleotide polymorphism (SNP) assay.

    PubMed

    Rotherham, D; Harbison, S A

    2011-04-15

    Cannabis sativa is both an illegal drug and a legitimate crop. The differentiation of illegal drug Cannabis from non-drug forms of Cannabis is relevant in the context of the growth of fibre and seed oil varieties of Cannabis for commercial purposes. This differentiation is currently determined based on the levels of tetrahydrocannabinol (THC) in adult plants. DNA based methods have the potential to assay Cannabis material unsuitable for analysis using conventional means including seeds, pollen and severely degraded material. The purpose of this research was to develop a single nucleotide polymorphism (SNP) assay for the differentiation of "drug" and "non-drug" Cannabis plants. An assay was developed based on four polymorphisms within a 399 bp fragment of the tetrahydrocannabinolic acid (THCA) synthase gene, utilising the SNaPshot multiplex kit. This SNP assay was tested on 94 Cannabis plants, which included 10 blind samples, and was able to differentiate between "drug" and "non-drug" Cannabis in all cases, while also differentiating between Cannabis and other species. Non-drug plants were found to be homozygous at the four sites assayed while drug Cannabis plants were either homozygous or heterozygous. PMID:21036496

  11. SNP Miniplexes for Individual Identification of Random-Bred Domestic Cats.

    PubMed

    Brooks, Ashley; Creighton, Erica K; Gandolfi, Barbara; Khan, Razib; Grahn, Robert A; Lyons, Leslie A

    2016-05-01

    Phenotypic and genotypic characteristics of the cat can be obtained from single nucleotide polymorphisms (SNPs) analyses of fur. This study developed miniplexes using SNPs with high discriminating power for random-bred domestic cats, focusing on individual and phenotypic identification. Seventy-eight SNPs were investigated using a multiplex PCR followed by a fluorescently labeled single base extension (SBE) technique (SNaPshot®). The SNP miniplexes were evaluated for reliability, reproducibility, sensitivity, species specificity, detection limitations, and assignment accuracy. Six SNPplexes were developed containing 39 intergenic SNPs and 26 phenotypic SNPs, including a sex identification marker, ZFXY. The combined random match probability (cRMP) was 6.58 × 10^-19 across all Western cat populations and the likelihood ratio was 1.52 × 10^18. These SNPplexes can distinguish individual cats and their phenotypic traits, which could provide insight into crime reconstructions. A SNP database of 237 cats from 13 worldwide populations is now available for forensic applications. PMID:27122395
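
    The two reported figures are consistent with the usual relationship in which the likelihood ratio for a single-source match is the reciprocal of the combined random match probability. The sketch below assumes that relationship and the product rule across independent loci. The per-locus values are invented for illustration, and only the cRMP of 6.58e-19 comes from the abstract.

```python
import math


def combined_rmp(per_locus_rmp):
    """Product rule: multiply per-locus match probabilities (assumes independent loci)."""
    return math.prod(per_locus_rmp)


def likelihood_ratio(crmp):
    """LR for a single-source match, taken here as 1 / cRMP."""
    return 1.0 / crmp


# Made-up per-locus match probabilities, just to show the product rule.
toy_crmp = combined_rmp([0.4, 0.5, 0.375])
print(toy_crmp)

# Value reported in the abstract.
reported_crmp = 6.58e-19
lr = likelihood_ratio(reported_crmp)
print(f"{lr:.2e}")  # 1.52e+18
```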

  12. SNP Detection in mRNA in Living Cells Using Allele Specific FRET Probes

    PubMed Central

    Dahan, Liya; Huang, Lingyan; Kedmi, Ranit; Behlke, Mark A.; Peer, Dan

    2013-01-01

    Live mRNA detection allows real-time monitoring of specific transcripts and genetic alterations. The main challenge of live genetic detection is overcoming the high background generated by unbound probes and reaching a high level of specificity with minimal off-target effects. The use of Fluorescence Resonance Energy Transfer (FRET) probes allows differentiation between bound and unbound probes, thus decreasing background. Probe specificity can be optimized by adjusting the length and through use of chemical modifications that alter binding affinity. Herein, we report the use of a two-oligonucleotide FRET probe system to detect a single nucleotide polymorphism (SNP) in murine Hras mRNA, which is associated with malignant transformations. The FRET oligonucleotides were modified with phosphorothioate (PS) bonds, 2′OMe RNA and LNA residues to enhance nuclease stability and improve SNP discrimination. Our results show that a point mutation in Hras can be detected in endogenous RNA of living cells. As determined by an Acceptor Photobleaching method, FRET levels were higher in cells transfected with perfect match FRET probes whereas a single mismatch showed decreased FRET signal. This approach promotes in vivo molecular imaging methods and could further be applied in cancer diagnosis and theranostic strategies. PMID:24039756

  13. PrimerMapper: high throughput primer design and graphical assembly for PCR and SNP detection

    PubMed Central

    O’Halloran, Damien M.

    2016-01-01

    Primer design represents a widely employed gambit in diverse molecular applications including PCR, sequencing, and probe hybridization. Variations of PCR, including primer walking, allele-specific PCR, and nested PCR provide specialized validation and detection protocols for molecular analyses that often require screening large numbers of DNA fragments. In these cases, automated sequence retrieval and processing become important features, and furthermore, a graphic that provides the user with a visual guide to the distribution of designed primers across targets is most helpful in quickly ascertaining primer coverage. To this end, I describe here, PrimerMapper, which provides a comprehensive graphical user interface that designs robust primers from any number of inputted sequences while providing the user with both, graphical maps of primer distribution for each inputted sequence, and also a global assembled map of all inputted sequences with designed primers. PrimerMapper also enables the visualization of graphical maps within a browser and allows the user to draw new primers directly onto the webpage. Other features of PrimerMapper include allele-specific design features for SNP genotyping, a remote BLAST window to NCBI databases, and remote sequence retrieval from GenBank and dbSNP. PrimerMapper is hosted at GitHub and freely available without restriction. PMID:26853558

  14. Use of Sequenom Sample ID Plus® SNP Genotyping in Identification of FFPE Tumor Samples

    PubMed Central

    Miller, Jessica K.; Buchner, Nicholas; Timms, Lee; Tam, Shirley; Luo, Xuemei; Brown, Andrew M. K.; Pasternack, Danielle; Bristow, Robert G.; Fraser, Michael; Boutros, Paul C.; McPherson, John D.

    2014-01-01

    Short tandem repeat (STR) analysis, such as the AmpFlSTR® Identifiler® Plus kit, is a standard, PCR-based human genotyping method used in the field of forensics. Misidentification of cell line and tissue DNA can be costly if not detected early; therefore it is necessary to have quality control measures such as STR profiling in place. A major issue in large-scale research studies involving archival formalin-fixed paraffin embedded (FFPE) tissues is that varying levels of DNA degradation can result in failure to correctly identify samples using STR genotyping. PCR amplification of STRs of several hundred base pairs is not always possible when DNA is degraded. The Sample ID Plus® panel from Sequenom allows for human DNA identification and authentication using SNP genotyping. In comparison to lengthy STR amplicons, this multiplexing PCR assay requires amplification of only 76–139 base pairs, and utilizes 47 SNPs to discriminate between individual samples. In this study, we evaluated both STR and SNP genotyping methods of sample identification, with a focus on paired FFPE tumor/normal DNA samples intended for next-generation sequencing (NGS). The ability to successfully validate the identity of FFPE samples can enable cost savings by reducing rework. PMID:24551080

  15. Varietal identification of tea (Camellia sinensis) using nanofluidic array of single nucleotide polymorphism (SNP) markers

    PubMed Central

    Fang, Wan-Ping; Meinhardt, Lyndel W; Tan, Hua-Wei; Zhou, Lin; Mischke, Sue; Zhang, Dapeng

    2014-01-01

    Apart from water, tea is the world’s most widely consumed beverage. Tea is produced in more than 50 countries with an annual production of approximately 4.7 million tons. The market segment for specialty tea has been expanding rapidly owing to increased demand, resulting in higher revenues and profits for tea growers and the industry. Accurate varietal identification is critically important to ensure traceability and authentication of premium tea products, which in turn contribute to on-farm conservation of tea genetic diversity. Using a set of single nucleotide polymorphism (SNP) markers developed from the expressed sequence tag (EST) database of Camellia sinensis, we genotyped deoxyribonucleic acid (DNA) samples extracted from a diverse group of tea varieties, including both fresh and processed commercial loose-leaf teas. The validation led to the designation of 60 SNPs that unambiguously identified all 40 tested tea varieties with high statistical rigor (p<0.0001). Varietal authenticity and genetic relationships among the analyzed cultivars were further characterized by ordination and Bayesian clustering analysis. These SNP markers, in combination with a high-throughput genotyping protocol, effectively established and verified specific DNA fingerprints for all tested tea varieties. This method provides a powerful tool for variety authentication and quality control for the tea industry. It is also highly useful for the management of tea genetic resources and breeding, where accurate and efficient genotype identification is essential. PMID:26504544

  16. Diversity in 113 cowpea [Vigna unguiculata (L) Walp] accessions assessed with 458 SNP markers.

    PubMed

    Egbadzor, Kenneth F; Ofori, Kwadwo; Yeboah, Martin; Aboagye, Lawrence M; Opoku-Agyeman, Michael O; Danquah, Eric Y; Offei, Samuel K

    2014-01-01

    Single Nucleotide Polymorphism (SNP) markers were used in characterization of 113 cowpea accessions comprising 108 from Ghana and 5 from abroad. Leaf tissues from plants cultivated at the University of Ghana were genotyped at KBioscience in the United Kingdom. Data were generated for 477 SNPs, out of which 458 revealed polymorphism. The results were used to analyze genetic dissimilarity among the accessions using DARwin 5 software. The markers discriminated among all of the cowpea accessions, and the dissimilarity values, which ranged from 0.006 to 0.63, were used for a factorial plot. Unexpectedly high levels of heterozygosity were observed in some of the accessions. Accessions known to be closely related clustered together in a dendrogram drawn with the WPGMA method. A maximum length sub-tree which comprised 48 core accessions was constructed. The software package STRUCTURE was used to separate accessions into three groups, and the programme correctly identified varieties that were known hybrids. The hybrids were those accessions with numerous heterozygous loci. The STRUCTURE plot showed closely related accessions with similar genome patterns. The SNP markers were more efficient in discriminating among the cowpea germplasm than morphological, seed protein polymorphism and simple sequence repeat studies reported earlier on the same collection. PMID:25332852

  17. MultiBLUP: improved SNP-based prediction for complex traits

    PubMed Central

    Speed, Doug; Balding, David J.

    2014-01-01

    BLUP (best linear unbiased prediction) is widely used to predict complex traits in plant and animal breeding, and increasingly in human genetics. The BLUP mathematical model, which consists of a single random effect term, was adequate when kinships were measured from pedigrees. However, when genome-wide SNPs are used to measure kinships, the BLUP model implicitly assumes that all SNPs have the same effect-size distribution, which is a severe and unnecessary limitation. We propose MultiBLUP, which extends the BLUP model to include multiple random effects, allowing greatly improved prediction when the random effects correspond to classes of SNPs with distinct effect-size variances. The SNP classes can be specified in advance, for example, based on SNP functional annotations, and we also provide an adaptive procedure for determining a suitable partition of SNPs. We apply MultiBLUP to genome-wide association data from the Wellcome Trust Case Control Consortium (seven diseases), and from much larger studies of celiac disease and inflammatory bowel disease, finding that it consistently provides better prediction than alternative methods. Moreover, MultiBLUP is computationally very efficient; for the largest data set, which includes 12,678 individuals and 1.5 M SNPs, the total analysis can be run on a single desktop PC in less than a day and can be parallelized to run even faster. Tools to perform MultiBLUP are freely available in our software LDAK. PMID:24963154

  18. Informatics Enhanced SNP Microarray Analysis of 30 Miscarriage Samples Compared to Routine Cytogenetics

    PubMed Central

    Lathi, Ruth B.; Loring, Megan; Massie, Jamie A. M.; Demko, Zachary P.; Johnson, David; Sigurjonsson, Styrmir; Gemelos, George; Rabinowitz, Matthew

    2012-01-01

    Purpose: The metaphase karyotype is often used as a diagnostic tool in the setting of early miscarriage; however, this technique has several limitations. We evaluate a new technique for karyotyping that uses single nucleotide polymorphism (SNP) microarrays. This technique was compared in a blinded, prospective fashion to the traditional metaphase karyotype. Methods: Patients undergoing dilation and curettage for first trimester miscarriage between February and August 2010 were enrolled. Samples of chorionic villi were equally divided and sent for microarray testing in parallel with routine cytogenetic testing. Results: Thirty samples were analyzed, with only four discordant results. Discordant results occurred when the entire genome was duplicated or when a balanced rearrangement was present. Cytogenetic karyotyping took an average of 29 days while microarray-based karyotyping took an average of 12 days. Conclusions: Molecular karyotyping of products of conception (POC) after missed abortion using SNP microarray analysis allows detection of maternal cell contamination and provides rapid results with good concordance to standard cytogenetic analysis. PMID:22403611

  19. A Pipeline for Classifying Relationships Using Dense SNP/SNV Data and Putative Pedigree Information.

    PubMed

    Zeng, Zhen; Weeks, Daniel E; Chen, Wei; Mukhopadhyay, Nandita; Feingold, Eleanor

    2016-02-01

    When genome-wide association studies (GWAS) or sequencing studies are performed on family-based datasets, the genotype data can be used to check the structure of putative pedigrees. Even in datasets of putatively unrelated people, close relationships can often be detected using dense single-nucleotide polymorphism/variant (SNP/SNV) data. A number of methods for finding relationships using dense genetic data exist, but they all have certain limitations, including that they typically use average genetic sharing, which is only a subset of the available information. Here, we present a set of approaches for classifying relationships in GWAS datasets or large-scale sequencing datasets. We first propose an empirical method for detecting identity by descent segments in close relative pairs using un-phased dense SNP data and demonstrate how that information can assist in building a relationship classifier. We then develop a strategy to take advantage of putative pedigree information to enhance classification accuracy. Our methods are tested and illustrated with two datasets from two distinct populations. Finally, we propose classification pipelines for checking and identifying relationships in datasets containing a large number of small pedigrees. PMID:26709242

  20. Heritability of Recurrent Exertional Rhabdomyolysis in Standardbred and Thoroughbred Racehorses Derived From SNP Genotyping Data.

    PubMed

    Norton, Elaine M; Mickelson, James R; Binns, Matthew M; Blott, Sarah C; Caputo, Paul; Isgren, Cajsa M; McCoy, Annette M; Moore, Alison; Piercy, Richard J; Swinburne, June E; Vaudin, Mark; McCue, Molly E

    2016-11-01

    Recurrent exertional rhabdomyolysis (RER) in Thoroughbred and Standardbred racehorses is characterized by episodes of muscle rigidity and cell damage that often recur upon strenuous exercise. The objective was to evaluate the importance of genetic factors in RER by obtaining an unbiased estimate of heritability in cohorts of unrelated Thoroughbred and Standardbred racehorses. Four hundred ninety-one Thoroughbred and 196 Standardbred racehorses were genotyped with the 54K or 74K SNP genotyping arrays. Heritability was calculated from genome-wide SNP data with a mixed linear and Bayesian model, utilizing the standard genetic relationship matrix (GRM). Both the mixed linear and Bayesian models estimated heritability of RER in Thoroughbreds to be approximately 0.34 and in Standardbred racehorses to be approximately 0.45 after adjusting for disease prevalence and sex. To account for potential differences in the genetic architecture of the underlying causal variants, heritability estimates were adjusted based on linkage disequilibrium weighted kinship matrix, minor allele frequency and variant effect size, yielding heritability estimates that ranged between 0.41-0.46 (Thoroughbreds) and 0.39-0.49 (Standardbreds). In conclusion, between 34-46% and 39-49% of the variance in RER susceptibility in Thoroughbred and Standardbred racehorses, respectively, can be explained by the SNPs present on these 2 genotyping arrays, indicating that RER is moderately heritable. These data provide further rationale for the investigation of genetic mutations associated with RER susceptibility. PMID:27489252

  1. SNPsplit: Allele-specific splitting of alignments between genomes with known SNP genotypes.

    PubMed

    Krueger, Felix; Andrews, Simon R

    2016-01-01

    Sequencing reads overlapping polymorphic sites in diploid mammalian genomes may be assigned to one allele or the other. This holds the potential to detect gene expression, chromatin modifications, DNA methylation or nuclear interactions in an allele-specific fashion. SNPsplit is an allele-specific alignment sorter designed to read files in SAM/BAM format and determine the allelic origin of reads or read-pairs that cover known single nucleotide polymorphic (SNP) positions. For this to work libraries must have been aligned to a genome in which all known SNP positions were masked with the ambiguity base 'N' and aligned using a suitable mapping program such as Bowtie2, TopHat, STAR, HISAT2, HiCUP or Bismark. SNPsplit also provides an automated solution to generate N-masked reference genomes for hybrid mouse strains based on the variant call information provided by the Mouse Genomes Project. The unique ability of SNPsplit to work with various different kinds of sequencing data including RNA-Seq, ChIP-Seq, Bisulfite-Seq or Hi-C opens new avenues for the integrative exploration of allele-specific data. PMID:27429743
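
    The idea of sorting reads by allelic origin can be sketched in a few lines. This is a simplified illustration and not SNPsplit's implementation: it assumes ungapped alignments (no indels or clipping), 0-based reference coordinates, and a hypothetical `snps` dictionary that maps each known SNP position to its two strain-specific bases. The function name and the returned labels are likewise chosen for the example.

```python
def assign_read(read_start, read_seq, snps):
    """Classify one ungapped read by the alleles it carries at known SNP positions.

    snps maps a 0-based reference position to (genome1_base, genome2_base).
    """
    genome1_hits = genome2_hits = 0
    for offset, base in enumerate(read_seq):
        alleles = snps.get(read_start + offset)
        if alleles is None:
            continue  # not a known SNP position
        if base == alleles[0]:
            genome1_hits += 1
        elif base == alleles[1]:
            genome2_hits += 1
        # any other base (for example N) is uninformative

    if genome1_hits and genome2_hits:
        return "conflicting"
    if genome1_hits:
        return "genome1"
    if genome2_hits:
        return "genome2"
    return "unassigned"


# Toy SNP table (made up): position -> (strain 1 base, strain 2 base).
snps = {10: ("A", "G"), 15: ("C", "T")}

print(assign_read(8, "TTACCCCC", snps))
print(assign_read(8, "TTGCCCCT", snps))
print(assign_read(8, "TTACCCCT", snps))
print(assign_read(0, "ACGT", snps))
```

    Aligning against an N-masked reference, as the abstract describes, means neither strain's allele is favoured during mapping; the read's own base at the masked position is then what decides the assignment.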

  2. Evaluating the Influence of Quality Control Decisions and Software Algorithms on SNP Calling for the Affymetrix 6.0 SNP Array Platform

    PubMed Central

    de Andrade, Mariza; Atkinson, Elizabeth J.; Bamlet, William R.; Matsumoto, Martha E.; Maharjan, Sooraj; Slager, Susan L.; Vachon, Celine M.; Cunningham, Julie M.; Kardia, Sharon L.R.

    2011-01-01

    Objective: Our goal was to evaluate the influence of quality control (QC) decisions using two genotype calling algorithms, CRLMM and Birdseed, designed for the Affymetrix SNP Array 6.0. Methods: Various QC options were tried using the two algorithms and comparisons were made on subject and call rate and on association results using two data sets. Results: For Birdseed, we recommend using the contrast QC instead of QC call rate for sample QC. For CRLMM, we recommend using a signal-to-noise ratio ≥4 for sample QC and a posterior probability of 90% for genotype accuracy. For both algorithms, we recommend calling the genotype separately for each plate, and dropping SNPs with a lower call rate (<95%) before evaluating samples with lower call rates. To investigate whether the genotype calls from the two algorithms impacted the genome-wide association results, we performed association analysis using data from the GENOA cohort; we observed that the number of significant SNPs was similar using either CRLMM or Birdseed. Conclusions: Using our suggested workflow, both algorithms performed similarly; however, fewer samples were removed and CRLMM took half the time to run our 854 study samples (4.2 h) compared to Birdseed (8.4 h). PMID:21734406
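
    The recommended ordering, dropping low-call-rate SNPs before judging samples, can be illustrated with a small sketch. The data layout (one list of calls per sample, with None for a no-call), the function names and the toy matrix are assumptions for the example; only the 95% SNP threshold comes from the abstract.

```python
def call_rate(calls):
    """Fraction of non-missing genotype calls."""
    return sum(c is not None for c in calls) / len(calls)


def filter_snps_then_score_samples(genotypes, snp_threshold=0.95):
    """Drop SNPs below the call-rate threshold, then compute per-sample call rates.

    genotypes: list of per-sample lists of calls; None marks a no-call.
    Returns (indices of kept SNPs, per-sample call rates on the kept SNPs).
    """
    n_snps = len(genotypes[0])
    kept = [
        j for j in range(n_snps)
        if call_rate([sample[j] for sample in genotypes]) >= snp_threshold
    ]
    filtered = [[sample[j] for j in kept] for sample in genotypes]
    return kept, [call_rate(sample) for sample in filtered]


# Toy matrix: 4 samples x 4 SNPs (made up). SNPs 1 and 2 are poorly called.
genotypes = [
    [0, 1, None, 2],
    [1, 1, None, 0],
    [2, None, None, 1],
    [0, 2, 1, 1],
]

print(call_rate(genotypes[2]))  # sample call rate before SNP filtering
kept, sample_rates = filter_snps_then_score_samples(genotypes)
print(kept, sample_rates)
```

    In this toy case the third sample looks poor before filtering only because of two badly performing SNPs; once those are removed, it would no longer be flagged, which is the motivation for filtering SNPs first.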

  3. Population structure of Atlantic mackerel inferred from RAD-seq-derived SNP markers: effects of sequence clustering parameters and hierarchical SNP selection.

    PubMed

    Rodríguez-Ezpeleta, Naiara; Bradbury, Ian R; Mendibil, Iñaki; Álvarez, Paula; Cotano, Unai; Irigoien, Xabier

    2016-07-01

    Restriction-site-associated DNA sequencing (RAD-seq) and related methods are revolutionizing the field of population genomics in nonmodel organisms as they allow generating an unprecedented number of single nucleotide polymorphisms (SNPs) even when no genomic information is available. Yet, RAD-seq data analyses rely on assumptions about the nature and number of nucleotide variants present in a single locus, the choice of which may lead to an under- or overestimated number of SNPs and/or to incorrectly called genotypes. Using the Atlantic mackerel (Scomber scombrus L.) and a close relative, the Atlantic chub mackerel (Scomber colias), as a case study, here we explore the sensitivity of population structure inferences to two crucial aspects in RAD-seq data analysis: the maximum number of mismatches allowed to merge reads into a locus and the relatedness of the individuals used for genotype calling and SNP selection. Our study resolves the population structure of the Atlantic mackerel, but, most importantly, provides insights into the effects of alternative RAD-seq data analysis strategies on population structure inferences that are directly applicable to other species. PMID:26936210

  4. Seq4SNPs: new software for retrieval of multiple, accurately annotated DNA sequences, ready formatted for SNP assay design

    PubMed Central

    Field, Helen I; Scollen, Serena A; Luccarini, Craig; Baynes, Caroline; Morrison, Jonathan; Dunning, Alison M; Easton, Douglas F; Pharoah, Paul DP

    2009-01-01

    Background In moderate-throughput SNP genotyping there was a gap in the workflow, between choosing a set of SNPs and submitting their sequences to proprietary assay design software, which was not met by existing software. Retrieval and formatting of sequences flanking each SNP, prior to assay design, becomes rate-limiting for more than about ten SNPs, especially if annotated for repetitive regions and adjacent variations. We routinely process up to 50 SNPs at once. Implementation We created Seq4SNPs, a web-based, walk-away software that can process one to several hundred SNPs given rs numbers as input. It outputs a file of fully annotated sequences formatted for one of three proprietary design softwares: TaqMan's Primer-By-Design FileBuilder, Sequenom's iPLEX or SNPstream's Autoprimer, as well as unannotated fasta sequences. We found genotyping assays to be inhibited by repetitive sequences or the presence of additional variations flanking the SNP under test, and in multiplexes, repetitive sequence flanking one SNP adversely affects multiple assays. Assay design software programs avoid such regions if the input sequences are appropriately annotated, so we used Seq4SNPs to provide suitably annotated input sequences, and improved our genotyping success rate. Adjacent SNPs can also be avoided, by annotating sequences used as input for primer design. Conclusion The accuracy of annotation by Seq4SNPs is significantly better than manual annotation (P < 1e-5). Using Seq4SNPs to incorporate all annotation for additional SNPs and repetitive elements into sequences, for genotyping assay designer software, minimizes assay failure at the design stage, reducing the cost of genotyping. Seq4SNPs provides a rapid route for replacement of poor test SNP sequences. We routinely use this software for assay sequence preparation. Seq4SNPs is available as a service at and , currently for human SNPs, but easily extended to include any species in dbSNP. PMID:19523221
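
    The annotation step described in this abstract (marking the SNP under test, adjacent variations, and repetitive sequence so that design software can avoid them) might look roughly like the sketch below. The notation is an assumption: square brackets for the test SNP, N for neighbouring variants, and lower case for repeats are common conventions, but the actual input formats of FileBuilder, iPLEX, and Autoprimer are not given here, and `annotate_flank` is a hypothetical helper rather than part of Seq4SNPs.

```python
def annotate_flank(sequence, snp_pos, alleles, other_variants=(), repeats=()):
    """Return an annotated flanking sequence for one SNP (illustrative only).

    sequence        reference sequence, 0-based coordinates
    snp_pos         index of the SNP under test
    alleles         e.g. ("A", "G"), written as [A/G]
    other_variants  indices of adjacent variations, masked as N
    repeats         (start, end) half-open intervals, written in lower case
    """
    out = []
    for i, base in enumerate(sequence.upper()):
        if i == snp_pos:
            out.append("[" + "/".join(alleles) + "]")
        elif i in other_variants:
            out.append("N")
        elif any(start <= i < end for start, end in repeats):
            out.append(base.lower())
        else:
            out.append(base)
    return "".join(out)

example = annotate_flank("ACGTACGTAC", 4, ("A", "G"),
                         other_variants={7}, repeats=[(0, 3)])
print(example)  # acgT[A/G]CGNAC
```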

  5. SNiPloid: A Utility to Exploit High-Throughput SNP Data Derived from RNA-Seq in Allopolyploid Species.

    PubMed

    Peralta, Marine; Combes, Marie-Christine; Cenci, Alberto; Lashermes, Philippe; Dereeper, Alexis

    2013-01-01

    High-throughput sequencing is a common approach to discover SNP variants, especially in plant species. However, methods to analyze predicted SNPs are often optimized for diploid plant species whereas many crop species are allopolyploids and combine related but divergent subgenomes (homoeologous chromosome sets). We created a software tool, SNiPloid, that exploits and interprets putative SNPs in the context of allopolyploidy by comparing SNPs from an allopolyploid with those obtained in its modern-day diploid progenitors. SNiPloid can compare SNPs obtained from a sample to estimate the subgenome contribution to the transcriptome or SNPs obtained from two polyploid accessions to search for SNP divergence. PMID:24163691

  6. SNP Analysis and Whole Exome Sequencing: Their Application in the Analysis of a Consanguineous Pedigree Segregating Ataxia

    PubMed Central

    Nickerson, Sarah L.; Marquis-Nicholson, Renate; Claxton, Karen; Ashton, Fern; Leong, Ivone U. S.; Prosser, Debra O.; Love, Jennifer M.; George, Alice M.; Taylor, Graham; Wilson, Callum; McKinlay Gardner, R. J.; Love, Donald R.

    2015-01-01

    Autosomal recessive cerebellar ataxia encompasses a large and heterogeneous group of neurodegenerative disorders. We employed single nucleotide polymorphism (SNP) analysis and whole exome sequencing to investigate a consanguineous Maori pedigree segregating ataxia. We identified a novel mutation in exon 10 of the SACS gene: c.7962T>G p.(Tyr2654*), establishing the diagnosis of autosomal recessive spastic ataxia of Charlevoix-Saguenay (ARSACS). Our findings expand both the genetic and phenotypic spectrum of this rare disorder, and highlight the value of high-density SNP analysis and whole exome sequencing as powerful and cost-effective tools in the diagnosis of genetically heterogeneous disorders such as the hereditary ataxias.

  7. SNiPloid: A Utility to Exploit High-Throughput SNP Data Derived from RNA-Seq in Allopolyploid Species

    PubMed Central

    Peralta, Marine; Combes, Marie-Christine; Lashermes, Philippe; Dereeper, Alexis

    2013-01-01

    High-throughput sequencing is a common approach to discover SNP variants, especially in plant species. However, methods to analyze predicted SNPs are often optimized for diploid plant species whereas many crop species are allopolyploids and combine related but divergent subgenomes (homoeologous chromosome sets). We created a software tool, SNiPloid, that exploits and interprets putative SNPs in the context of allopolyploidy by comparing SNPs from an allopolyploid with those obtained in its modern-day diploid progenitors. SNiPloid can compare SNPs obtained from a sample to estimate the subgenome contribution to the transcriptome or SNPs obtained from two polyploid accessions to search for SNP divergence. PMID:24163691

  8. Genomic and SNP Analyses Demonstrate a Distant Separation of the Hospital and Community-Associated Clades of Enterococcus faecium

    PubMed Central

    Latorre, Mauricio; Qin, Xiang; Murray, Barbara E.

    2012-01-01

    Recent studies have pointed to the existence of two subpopulations of Enterococcus faecium, one containing primarily commensal/community-associated (CA) strains and one that contains most clinical or hospital-associated (HA) strains, including those classified by multi-locus sequence typing (MLST) as belonging to the CC17 group. The HA subpopulation more frequently has IS16, pathogenicity island(s), and plasmids or genes associated with antibiotic resistance, colonization, and/or virulence. Supporting the two clades concept, we previously found a 3–10% difference between four genes from HA-clade strains vs. CA-clade strains, including 5% difference between pbp5-R of ampicillin-resistant, HA strains and pbp5-S of ampicillin-sensitive, CA strains. To further investigate the core genome of these subpopulations, we studied 100 genes from 21 E. faecium genome sequences; our analyses of concatenated sequences, SNPs, and individual genes all identified two distinct groups. With the concatenated sequence, HA-clade strains differed by 0–1% from one another while CA clade strains differed from each other by 0–1.1%, with 3.5–4.2% difference between the two clades. While many strains had a few genes that grouped in one clade with most of their genes in the other clade, one strain had 28% of its genes in the CA clade and 72% in the HA clade, consistent with the predicted role of recombination in the evolution of E. faecium. Using estimates for Escherichia coli, molecular clock calculations using sSNP analysis indicate that these two clades may have diverged ≥1 million years ago or, using the higher mutation rate for Bacillus anthracis, ∼300,000 years ago. These data confirm the existence of two clades of E. faecium and show that the differences between the HA and CA clades occur at the core genomic level and long preceded the modern antibiotic era. PMID:22291916
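
    The percentage differences quoted in this abstract are pairwise comparisons of aligned sequences. A minimal sketch of such a calculation follows; it assumes the sequences are already aligned to equal length and that gap columns are skipped, neither of which is stated in the abstract.

```python
def percent_difference(seq_a, seq_b):
    """Percentage of differing sites between two aligned sequences.

    Columns with a gap ('-') in either sequence are ignored (an assumption).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x != y:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return 100.0 * differences / compared

print(percent_difference("ACGTACGTAC", "ACGTACGTAA"))  # 10.0
```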

  9. RNA-Seq-Mediated Transcriptome Analysis of a Fiberless Mutant Cotton and Its Possible Origin Based on SNP Markers

    PubMed Central

    Ma, Qifeng; Wu, Man; Pei, Wenfeng; Wang, Xiaoyan; Zhai, Honghong; Wang, Wenkui; Li, Xingli; Zhang, Jinfa; Yu, Jiwen; Yu, Shuxun

    2016-01-01

    As the longest known single-celled trichomes, cotton (Gossypium L.) fibers constitute a classic model system to investigate cell initiation and elongation. In this study, we used a high-throughput transcriptome sequencing technology to identify fiber-initiation-related single nucleotide polymorphism (SNP) markers and differentially expressed genes (DEGs) between the wild-type (WT) Upland cotton (G. hirsutum) Xuzhou 142 and its natural fuzzless-lintless mutant Xuzhou 142 fl. Approximately 700 million high-quality cDNA reads representing over 58 Gb of sequences were obtained, resulting in the identification of 28,610 SNPs—of which 17,479 were novel—from 13,960 expressed genes. Of these SNPs, 50% of SNPs in fl were identical to those of G. barbadense, which suggests the likely origin of the fl mutant from an interspecific hybridization between Xuzhou 142 and an unknown G. barbadense genotype. Of all detected SNPs, 15,555, 12,750, and 305 were classified as non-synonymous, synonymous, and pre-terminated ones, respectively. Moreover, 1,352 insertion/deletion polymorphisms (InDels) were also detected. A total of 865 DEGs were identified between the WT and fl in ovules at −3 and 0 days post-anthesis, with 302 candidate SNPs selected from these DEGs for validation by a high-resolution melting analysis and Sanger sequencing in seven cotton genotypes. The number of genotypic pairwise polymorphisms varied from 43 to 302, indicating that the identified SNPs are reliable. These SNPs should serve as good resources for breeding and genetic studies in cotton. PMID:26990639

  10. RNA-Seq-Mediated Transcriptome Analysis of a Fiberless Mutant Cotton and Its Possible Origin Based on SNP Markers.

    PubMed

    Ma, Qifeng; Wu, Man; Pei, Wenfeng; Wang, Xiaoyan; Zhai, Honghong; Wang, Wenkui; Li, Xingli; Zhang, Jinfa; Yu, Jiwen; Yu, Shuxun

    2016-01-01

    As the longest known single-celled trichomes, cotton (Gossypium L.) fibers constitute a classic model system to investigate cell initiation and elongation. In this study, we used a high-throughput transcriptome sequencing technology to identify fiber-initiation-related single nucleotide polymorphism (SNP) markers and differentially expressed genes (DEGs) between the wild-type (WT) Upland cotton (G. hirsutum) Xuzhou 142 and its natural fuzzless-lintless mutant Xuzhou 142 fl. Approximately 700 million high-quality cDNA reads representing over 58 Gb of sequences were obtained, resulting in the identification of 28,610 SNPs--of which 17,479 were novel--from 13,960 expressed genes. Of these SNPs, 50% of SNPs in fl were identical to those of G. barbadense, which suggests the likely origin of the fl mutant from an interspecific hybridization between Xuzhou 142 and an unknown G. barbadense genotype. Of all detected SNPs, 15,555, 12,750, and 305 were classified as non-synonymous, synonymous, and pre-terminated ones, respectively. Moreover, 1,352 insertion/deletion polymorphisms (InDels) were also detected. A total of 865 DEGs were identified between the WT and fl in ovules at -3 and 0 days post-anthesis, with 302 candidate SNPs selected from these DEGs for validation by a high-resolution melting analysis and Sanger sequencing in seven cotton genotypes. The number of genotypic pairwise polymorphisms varied from 43 to 302, indicating that the identified SNPs are reliable. These SNPs should serve as good resources for breeding and genetic studies in cotton. PMID:26990639

  11. Candidate SNP Associations of Optimism and Resilience in Older Adults: Exploratory Study of 935 Community-Dwelling Adults

    PubMed Central

    Rana, Brinda K.; Darst, Burcu F.; Bloss, Cinnamon; Shih, Pei-an Betty; Depp, Colin; Nievergelt, Caroline M.; Allison, Matthew; Parsons, J. Kellogg; Schork, Nicholas; Jeste, Dilip V.

    2014-01-01

    Objective Optimism and resilience promote health and well-being in older adults, and previous reports suggest that these traits are heritable. We examined the association of selected single-nucleotide polymorphisms (SNPs) with optimism and resilience in older adults. Design Candidate gene association study that was a follow-on at the University of California, San Diego sites of two NIH-funded multi-site longitudinal investigations: Women's Health Initiative (WHI) and SELenium and vitamin E Cancer prevention Trial (SELECT). Participants 426 Women from WHI older than age 50, and 509 men older than age 55 (age 50 for African-American men) from SELECT. Measurements 65 candidate gene SNPs that were judged by consensus, based on a literature review, as being related to predisposition to optimism and resilience, and 31 ancestry informative marker SNPs, genotyped from blood-based DNA samples and self-report scales for trait optimism, resilience, and depressive symptoms. Results Using a Bonferroni threshold for significant association (p=0.00089), there were no significant associations for individual SNPs with optimism or resilience in single-locus analyses. Exploratory multi-locus polygenic analyses with a p-value of <.05, showed an association of optimism with SNPs in MAO-A, IL10, and FGG genes, and an association of resilience with a SNP in MAO-A gene. Conclusions Correcting for Type I errors, there were no significant associations of optimism and resilience with specific gene SNPs in single-locus analyses. Positive psychological traits are likely to be genetically complex, with many loci having small effects contributing to phenotypic variation. Our exploratory multi-locus polygenic analyses suggest that larger sample sizes and complementary approaches involving methods such as sequence-based association studies, copy number variation analyses, and pathway-based analyses could be useful for better understanding the genetic basis of these positive psychological traits
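
    The Bonferroni threshold used in this abstract is the family-wise significance level divided by the number of tests. The sketch below assumes a family-wise level of 0.05. The abstract does not state the number of tests, so the figure of 56 is only an inference from 0.05 / 0.00089.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests

def significant(p_values, alpha=0.05):
    """Return the p-values that pass the Bonferroni-corrected threshold."""
    threshold = bonferroni_threshold(alpha, len(p_values))
    return [p for p in p_values if p < threshold]

# 56 tests is an inferred value, not one reported in the abstract.
threshold = bonferroni_threshold(0.05, 56)
print(round(threshold, 5))  # 0.00089
```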

  12. Mouse SNP Miner: an annotated database of mouse functional single nucleotide polymorphisms

    PubMed Central

    Reuveni, Eli; Ramensky, Vasily E; Gross, Cornelius

    2007-01-01

    Background The mapping of quantitative trait loci in rat and mouse has been extremely successful in identifying chromosomal regions associated with human disease-related phenotypes. However, identifying the specific phenotype-causing DNA sequence variations within a quantitative trait locus has been much more difficult. The recent availability of genomic sequence from several mouse inbred strains (including C57BL/6J, 129X1/SvJ, 129S1/SvImJ, A/J, and DBA/2J) has made it possible to catalog DNA sequence differences within a quantitative trait locus derived from crosses between these strains. However, even for well-defined quantitative trait loci (<10 Mb) the identification of candidate functional DNA sequence changes remains challenging due to the high density of sequence variation between strains. Description To help identify functional DNA sequence variations within quantitative trait loci we have used the Ensembl annotated genome sequence to compile a database of mouse single nucleotide polymorphisms (SNPs) that are predicted to cause missense, nonsense, frameshift, or splice site mutations (available at ). For missense mutations we have used the PolyPhen and PANTHER algorithms to predict whether amino acid changes are likely to disrupt protein function. Conclusion We have developed a database of mouse SNPs predicted to cause missense, nonsense, frameshift, and splice-site mutations. Our analysis revealed that 20% and 14% of missense SNPs are likely to be deleterious according to PolyPhen and PANTHER, respectively, and 6% are considered deleterious by both algorithms. The database also provides gene expression and functional annotations from the Symatlas, Gene Ontology, and OMIM databases to further assess candidate phenotype-causing mutations. To demonstrate its utility, we show that Mouse SNP Miner successfully finds a previously identified candidate SNP in the taste receptor, Tas1r3, that underlies sucrose preference in the C57BL/6J strain. We also use Mouse

  13. Pacifiplex: an ancestry-informative SNP panel centred on Australia and the Pacific region.

    PubMed

    Santos, Carla; Phillips, Christopher; Fondevila, Manuel; Daniel, Runa; van Oorschot, Roland A H; Burchard, Esteban G; Schanfield, Moses S; Souto, Luis; Uacyisrael, Jolame; Via, Marc; Carracedo, Ángel; Lareu, Maria V

    2016-01-01

    The analysis of human population variation is an area of considerable interest in the forensic, medical genetics and anthropological fields. Several forensic single nucleotide polymorphism (SNP) assays provide ancestry-informative genotypes in sensitive tests designed to work with limited DNA samples, including a 34-SNP multiplex differentiating African, European and East Asian ancestries. Although assays capable of differentiating Oceanian ancestry at a global scale have become available, this study describes markers compiled specifically for differentiation of Oceanian populations. A sensitive multiplex assay, termed Pacifiplex, was developed and optimized in a small-scale test applicable to forensic analyses. The Pacifiplex assay comprises 29 ancestry-informative marker SNPs (AIM-SNPs) selected to complement the 34-plex test, that in a combined set distinguish Africans, Europeans, East Asians and Oceanians. Nine Pacific region study populations were genotyped with both SNP assays, then compared to four reference population groups from the HGDP-CEPH human diversity panel. STRUCTURE analyses estimated population cluster membership proportions that aligned with the patterns of variation suggested for each study population's currently inferred demographic histories. Aboriginal Taiwanese and Philippine samples indicated high East Asian ancestry components, Papua New Guinean and Aboriginal Australian samples were predominantly Oceanian, while other populations displayed cluster patterns explained by the distribution of divergence amongst Melanesians, Polynesians and Micronesians. Genotype data from Pacifiplex and 34-plex tests is particularly well suited to analysis of Australian Aboriginal populations and when combined with Y and mitochondrial DNA variation will provide a powerful set of markers for ancestry inference applied to modern Australian demographic profiles. On a broader geographic scale, Pacifiplex adds highly informative data for inferring the ancestry

  14. SNP Regulation of microRNA Expression and Subsequent Colon Cancer Risk

    PubMed Central

    Mullany, Lila E.; Wolff, Roger K.; Herrick, Jennifer S.; Buas, Matthew F.; Slattery, Martha L.

    2015-01-01

    Introduction MicroRNAs (miRNAs) regulate messenger RNAs (mRNAs) and as such have been implicated in a variety of diseases, including cancer. MiRNAs regulate mRNAs through binding of the miRNA 5' seed sequence (~7–8 nucleotides) to the mRNA 3' UTRs; polymorphisms in these regions have the potential to alter miRNA-mRNA target associations. SNPs in miRNA genes as well as miRNA-target genes have been proposed to influence cancer risk through altered miRNA expression levels. Methods MiRNA-SNPs and miRNA-target gene-SNPs were identified through the literature. We used SNPs from Genome-Wide Association Study (GWAS) data that were matched to individuals with miRNA expression data generated from an Agilent platform for colon tumor and non-tumor paired tissues. These samples were used to evaluate 327 miRNA-SNP pairs for associations between SNPs and miRNA expression levels as well as for SNP associations with colon cancer. Results Twenty-two miRNAs expressed in non-tumor tissue were significantly different by genotype and 21 SNPs were associated with altered tumor/non-tumor differential miRNA expression across genotypes. Two miRNAs were associated with SNP genotype for both non-tumor and tumor/non-tumor differential expression. Of the 41 miRNAs significantly associated with SNPs, all but seven were significantly differentially expressed in colon tumor tissue. Two of the 41 SNPs significantly associated with miRNA expression levels were associated with colon cancer risk: rs8176318 (BRCA1), OR (AA genotype) 1.31, 95% CI 1.01, 1.78, and rs8905 (PRKAR1A), OR (GG genotype) 2.31, 95% CI 1.11, 4.77. Conclusion Of the 327 SNPs identified in the literature as being important because of their potential regulation of miRNA expression levels, 12.5% had statistically significant associations with miRNA expression. However, only two of these SNPs were significantly associated with colon cancer. PMID:26630397
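
    Odds ratios with 95% confidence intervals, as reported in this abstract, are commonly derived from a 2x2 table of genotype by case status. The sketch below shows the unadjusted calculation with a Wald (Woolf) interval. The counts are hypothetical, and the abstract does not say how its estimates were obtained or adjusted, so this is an illustration of the general formula only.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio and Wald confidence interval from a 2x2 table.

    a: cases with the genotype      b: controls with the genotype
    c: cases without the genotype   d: controls without the genotype
    z: normal quantile (1.96 for a 95% interval)
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Woolf approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Hypothetical counts, not taken from the study.
estimate, lower, upper = odds_ratio_ci(30, 20, 70, 80)
print(f"OR {estimate:.2f}, 95% CI {lower:.2f}, {upper:.2f}")
```

    An interval whose lower bound is above 1, as for both SNPs reported in the abstract, indicates an association with increased risk at the 5% level.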

  15. The human lactase persistence-associated SNP -13910*T enables in vivo functional persistence of lactase promoter-reporter transgene expression.

    PubMed

    Fang, Lin; Ahn, Jong Kun; Wodziak, Dariusz; Sibley, Eric

    2012-07-01

    Lactase is the intestinal enzyme responsible for digestion of the milk sugar lactose. Lactase gene expression declines dramatically upon weaning in mammals and during early childhood in humans (lactase nonpersistence). In various ethnic groups, however, lactase persists in high levels throughout adulthood (lactase persistence). Genetic association studies have identified that lactase persistence in northern Europeans is strongly associated with a single nucleotide polymorphism (SNP) located 14 kb upstream of the lactase gene: -13910*C/T. To determine whether the -13910*T SNP can function in vivo to mediate lactase persistence, we generated transgenic mice harboring human DNA fragments with the -13910*T SNP or the ancestral -13910*C SNP cloned upstream of a 2-kb rat lactase gene promoter in a luciferase reporter construct. We previously reported that the 2-kb rat lactase promoter directs a post-weaning decline of luciferase transgene expression similar to that of the endogenous lactase gene. In the present study, the post-weaning decline directed by the rat lactase promoter is impeded by addition of the -13910*T SNP human DNA fragment, but not by addition of the -13910*C ancestral SNP fragment. Persistence of transgene expression associated with the -13910*T SNP represents the first in vivo data in support of a functional role for the -13910*T SNP in mediating the human lactase persistence phenotype. PMID:22258180

  16. Design of the Illumina Porcine 50K+ SNP Iselect(TM) Beadchip and Characterization of the Porcine HapMap Population

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Using next generation sequencing technology the International Swine SNP Consortium has identified 500,000 SNPs and used these to design an Illumina Infinium iSelect™ SNP BeadChip with a selection of 60,218 SNPs. The selected SNPs include previously validated SNPs and SNPs identified de novo using se...

  17. Developments of geriatric autopsy database and Internet-based database of Japanese single nucleotide polymorphisms for geriatric research (JG-SNP).

    PubMed

    Sawabe, Motoji; Arai, Tomio; Kasahara, Ichiro; Esaki, Yukiyoshi; Nakahara, Ken-ichi; Hosoi, Takayuki; Orimo, Hajime; Takubo, Kaiyo; Murayama, Shigeo; Tanaka, Noriko

    2004-08-01

    To facilitate geriatric research on the roles of genetic polymorphisms of candidate genes, two databases were developed based on data obtained from autopsy examinations of elderly subjects: the geriatric autopsy database (GEAD) and the Japanese single nucleotide polymorphisms (SNP) database for geriatric research (JG-SNP) which is accessible on the Internet (http://www.tmgh.metro.tokyo.jp/jg-snp/english/E_top.html). The data for the GEAD were derived from 1074 consecutive autopsy cases (565 male and 509 female cases) with an average age of 80 years. The GEAD was installed on a stand-alone Windows 2000 server using Oracle 8i as the database application. The GEAD contains clinical diagnoses of 26 geriatric diseases, histories of smoking and alcohol consumption, pathological findings (720 items), severity of atherosclerosis, genetic polymorphism data, etc. On the JG-SNP website, case distribution corresponding to a specified SNP or disease can be searched or downloaded. Although there are several Internet-based SNP databases such as dbSNP, no databases are available at present on the web that contain both SNP data and phenotypic data. As autopsy studies can provide large amounts of accurate medical information, including the presence of undiagnosed diseases such as latent cancers, the GEAD is a unique and excellent database for research on genetic polymorphisms. PMID:15336912

  18. A high resolution genetic linkage map of soybean based on 357 recombinant inbred lines genotyped with BARCSoySNP6K

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The objective of this study was to construct a high density genetic map of soybean (Glycine max L. Merr) using a high throughput single nucleotide polymorphism (SNP) genotyping on 357 F7 recombinant inbred lines (RILs) from a cross of ‘Wyandot’ × PI 567301B. Of 5,403 SNP loci scored from the Infiniu...

  19. A large maize (Zea Mays L.) SNP genotyping array: development and germplasm genotyping, and genetic mapping to compare with the B73 reference genome

    Technology Transfer Automated Retrieval System (TEKTRAN)

    SNP genotyping arrays have been useful for many applications that require a large number of molecular markers such as high-density genetic mapping, genome-wide association studies (GWAS), and genomic selection for accelerated breeding. We report the establishment of a large SNP array for maize and i...

  20. Incorporation of Personal Single Nucleotide Polymorphism (SNP) Data into a National Level Electronic Health Record for Disease Risk Assessment, Part 2: The Incorporation of SNP into the National Health Information System of Turkey

    PubMed Central

    Beyan, Timur

    2014-01-01

    Background A personalized medicine approach provides opportunities for predictive and preventive medicine. Using genomic, clinical, environmental, and behavioral data, the tracking and management of individual wellness is possible. A productive way to carry this personalized approach into routine practice is to integrate clinical interpretations of genomic variations into electronic medical record (EMR)/electronic health record (EHR) systems. Today, various central EHR infrastructures have been constituted in many countries of the world, including Turkey. Objective As an initial attempt to develop a sophisticated infrastructure, we have concentrated on incorporating the personal single nucleotide polymorphism (SNP) data into the National Health Information System of Turkey (NHIS-T) for disease risk assessment, and evaluated the performance of various predictive models for prostate cancer cases. We present our work as a miniseries containing three parts: (1) an overview of requirements, (2) the incorporation of SNP into the NHIS-T, and (3) an evaluation of SNP data incorporated into the NHIS-T for prostate cancer. Methods For the second article of this miniseries, we have analyzed the existing NHIS-T and proposed the possible extensional architectures. In light of the literature survey and characteristics of NHIS-T, we have proposed and argued opportunities and obstacles for a SNP-incorporated NHIS-T. A prototype with complementary capabilities (knowledge base and end-user applications) for these architectures has been designed and developed. Results In the proposed architectures, the clinically relevant personal SNP (CR-SNP) and clinicogenomic associations are shared between central repositories and end-users via the NHIS-T infrastructure. To produce these files, we need to develop a national level clinicogenomic knowledge base. Regarding clinicogenomic decision support, we planned to complete interpretation of these associations on the end

  1. SNP-SNP Interaction between TLR4 and MyD88 in Susceptibility to Coronary Artery Disease in the Chinese Han Population

    PubMed Central

    Sun, Dandan; Sun, Liping; Xu, Qian; Gong, Yuehua; Wang, Honghu; Yang, Jun; Yuan, Yuan

    2016-01-01

    The toll-like receptor 4 (TLR4)-myeloid differentiation factor 88 (MyD88)-dependent signaling pathway plays a role in the initiation and progression of coronary artery disease (CAD). We investigated SNP–SNP interactions between the TLR4 and MyD88 genes in CAD susceptibility and assessed whether the effects of such interactions were modified by confounding risk factors (hyperglycemia, hyperlipidemia and Helicobacter pylori (H. pylori) infection). Participants with CAD (n = 424) and controls (n = 424) without CAD were enrolled. Polymerase chain reaction-restriction fragment length polymorphism was performed on genomic DNA to detect polymorphisms in TLR4 (rs10116253, rs10983755, and rs11536889) and MyD88 (rs7744). H. pylori infections were evaluated by enzyme-linked immunosorbent assays, and the cardiovascular risk factors for each subject were evaluated clinically. The significant interaction between TLR4 rs11536889 and MyD88 rs7744 was associated with an increased CAD risk (p value for interaction = 0.024). In conditions of hyperglycemia, the interaction effect was strengthened between TLR4 rs11536889 and MyD88 rs7744 (p value for interaction = 0.004). In hyperlipidemic participants, the interaction strength was also enhanced for TLR4 rs11536889 and MyD88 rs7744 (p value for interaction = 0.006). Thus, the novel interaction between TLR4 rs11536889 and MyD88 rs7744 was related to an increased risk of CAD, which could be strengthened by the presence of hyperglycemia or hyperlipidemia. PMID:26959040

  2. SNP in starch biosynthesis genes associated with nutritional and functional properties of rice

    PubMed Central

    Kharabian-Masouleh, Ardashir; Waters, Daniel L. E.; Reinke, Russell F.; Ward, Rachelle; Henry, Robert J.

    2012-01-01

    Starch is a major component of human diets. The relative contribution of variation in the genes of starch biosynthesis to the nutritional and functional properties of the rice was evaluated in a rice breeding population. Sequencing 18 genes involved in starch synthesis in a population of 233 rice breeding lines discovered 66 functional SNPs in exonic regions. Five genes, AGPS2b, Isoamylase1, SPHOL, SSIIb and SSIVb showed no polymorphism. Association analysis found 31 of the SNP were associated with differences in pasting and cooking quality properties of the rice lines. Two genes appear to be the major loci controlling traits under human selection in rice, GBSSI (waxy gene) and SSIIa. GBSSI influenced amylose content and retrogradation. Other genes contributing to retrogradation were GPT1, SSI, BEI and SSIIIa. SSIIa explained much of the variation in cooking characteristics. Other genes had relatively small effects. PMID:22870386

  3. Design and synthesis of the superionic conductor Na10SnP2S12

    NASA Astrophysics Data System (ADS)

    Richards, William D.; Tsujimura, Tomoyuki; Miara, Lincoln J.; Wang, Yan; Kim, Jae Chul; Ong, Shyue Ping; Uechi, Ichiro; Suzuki, Naoki; Ceder, Gerbrand

    2016-03-01

    Sodium-ion batteries are emerging as candidates for large-scale energy storage due to their low cost and the wide variety of cathode materials available. As battery size and adoption in critical applications increase, safety concerns are resurfacing due to the inherent flammability of organic electrolytes currently in use in both lithium and sodium battery chemistries. Development of solid-state batteries with ionic electrolytes eliminates this concern, while also allowing novel device architectures and potentially improving cycle life. Here we report the computation-assisted discovery and synthesis of a high-performance solid-state electrolyte material: Na10SnP2S12, with room temperature ionic conductivity of 0.4 mS cm⁻¹, rivalling the conductivity of the best sodium sulfide solid electrolytes to date. We also computationally investigate the variants of this compound where tin is substituted by germanium or silicon and find that the latter may achieve even higher conductivity.

  4. To Cheat or Not To Cheat: Tryptophan Hydroxylase 2 SNP Variants Contribute to Dishonest Behavior.

    PubMed

    Shen, Qiang; Teo, Meijun; Winter, Eyal; Hart, Einav; Chew, Soo H; Ebstein, Richard P

    2016-01-01

    Although lying (bearing false witness) is explicitly prohibited in the Decalogue and a focus of interest in philosophy and theology, more recently the behavioral and neural mechanisms of deception are gaining increasing attention from diverse fields, especially economics, psychology, and neuroscience. Despite the considerable role of heredity in explaining individual differences in deceptive behavior, few studies have investigated which specific genes contribute to the heterogeneity of lying behavior across individuals. Also, little is known concerning which specific neurotransmitter pathways underlie deception. Toward addressing these two key questions, we implemented a neurogenetic strategy and modeled deception by an incentivized die-under-cup task in a laboratory setting. The results of this exploratory study provide provisional evidence that SNP variants across the tryptophan hydroxylase 2 (TPH2) gene, which encodes the rate-limiting enzyme in the biosynthesis of brain serotonin, contribute to individual differences in deceptive behavior. PMID:27199691

  5. SNP genotyping by combination of 192-well MADGE, ARMS and computerized gel image analysis.

    PubMed

    O'Dell, S D; Gaunt, T R; Day, I N

    2000-09-01

    A new modification of the microplate array diagonal gel electrophoresis (MADGE) system accommodates the dual amplification refractory mutation system (ARMS) products of 96 samples on one 192-well gel. Simultaneous electrophoresis of a number of horizontal ARMS-MADGE gels achieves high throughput. Gels are imaged digitally, here using the FluorImager 595 fluorescent scanning system. Customized software by Phoretix enables rapid computerized calling of band patterns in ARMS-MADGE arrays, in which the two wells receiving a pair of allele-specific assays for a single template are juxtaposed to form one virtual track, with genotype data exported directly into Microsoft Excel for statistical analysis. An ARMS assay of the A/T base change at the -23/HphI RFLP in the insulin gene promoter, which initiates from 2.5 ng template DNA, was used here to demonstrate this improved general approach for population SNP analyses. PMID:10997263
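
    The calling rule implied by the paired wells can be sketched in a few lines. This is a minimal illustration, not the Phoretix software: the function name `call_arms_genotype`, the boolean band inputs, and the default allele labels are assumptions made for the example, and it presumes band presence has already been scored from the gel image.

```python
from typing import Optional, Tuple


def call_arms_genotype(
    band_allele_1: bool,
    band_allele_2: bool,
    alleles: Tuple[str, str] = ("A", "T"),
) -> Optional[str]:
    """Call a biallelic genotype from the two wells of one virtual track.

    Each well holds one allele-specific ARMS reaction for the same template,
    so a band in a well means that allele is present.
    """
    first, second = alleles
    if band_allele_1 and band_allele_2:
        return first + second   # both reactions amplified: heterozygote
    if band_allele_1:
        return first + first    # only the first allele amplified
    if band_allele_2:
        return second + second  # only the second allele amplified
    return None                 # neither well amplified: no call


# One virtual track per sample: (band in well 1, band in well 2).
tracks = {"s1": (True, False), "s2": (True, True), "s3": (False, False)}
calls = {name: call_arms_genotype(*bands) for name, bands in tracks.items()}
```

    Returning `None` for a track with no product keeps failed assays distinct from genuine genotypes when the calls are exported for statistical analysis.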

  6. A high-performance computing toolset for relatedness and principal component analysis of SNP data.

    PubMed

    Zheng, Xiuwen; Levine, David; Shen, Jess; Gogarten, Stephanie M; Laurie, Cathy; Weir, Bruce S

    2012-12-15

    Genome-wide association studies are widely used to investigate the genetic basis of diseases and traits, but they pose many computational challenges. We developed gdsfmt and SNPRelate (R packages for multi-core symmetric multiprocessing computer architectures) to accelerate two key computations on SNP data: principal component analysis (PCA) and relatedness analysis using identity-by-descent measures. The kernels of our algorithms are written in C/C++ and highly optimized. Benchmarks show the uniprocessor implementations of PCA and identity-by-descent are ∼8-50 times faster than the implementations provided in the popular EIGENSTRAT (v3.0) and PLINK (v1.07) programs, respectively, and can be sped up to 30-300-fold by using eight cores. SNPRelate can analyse tens of thousands of samples with millions of SNPs. For example, our package was used to perform PCA on 55 324 subjects from the 'Gene-Environment Association Studies' consortium studies. PMID:23060615
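
    For readers unfamiliar with PCA on genotype data, the underlying computation can be sketched with NumPy. This is not the SNPRelate API or its optimized C/C++ kernel; the function name `snp_pca`, the scaling by sqrt(p(1-p)) (a common standardization for SNP data), and the toy matrix are assumptions chosen for illustration, and missing genotypes are assumed absent.

```python
import numpy as np


def snp_pca(genotypes, n_components=2):
    """Top principal components of a samples-by-SNPs matrix of 0/1/2 allele counts.

    Assumes no missing genotypes.
    """
    g = np.asarray(genotypes, dtype=float)
    p = g.mean(axis=0) / 2.0                    # per-SNP allele frequency
    keep = (p > 0.0) & (p < 1.0)                # monomorphic SNPs carry no signal
    g, p = g[:, keep], p[keep]
    z = (g - 2.0 * p) / np.sqrt(p * (1.0 - p))  # centre and scale each SNP
    cov = z @ z.T / z.shape[1]                  # sample-by-sample covariance
    eigvals, eigvecs = np.linalg.eigh(cov)      # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:n_components]
    return eigvals[order], eigvecs[:, order]


# Toy data: two samples carrying mostly allele count 0, two carrying mostly 2.
toy = [[0, 0, 1, 0],
       [0, 1, 0, 0],
       [2, 2, 1, 2],
       [2, 1, 2, 2]]
values, vectors = snp_pca(toy)
pc1 = vectors[:, 0]
```

    On the toy matrix the first component separates the first two samples from the last two. The sample-by-sample covariance grows quadratically with the number of samples, which is one reason a multi-core implementation matters at the scale the abstract describes.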

  7. Prediction of a time-to-event trait using genome wide SNP data

    PubMed Central

    2013-01-01

    Background: A popular objective of many high-throughput genome projects is to discover various genomic markers associated with traits and develop statistical models to predict traits of future patients based on marker values. Results: In this paper, we present a prediction method for time-to-event traits using genome-wide single-nucleotide polymorphisms (SNPs). We also propose a MaxTest for the association between a time-to-event trait and a SNP that accounts for the SNP's possible genetic models. The proposed MaxTest can help screen out nonprognostic SNPs and identify genetic models of prognostic SNPs. The performance of the proposed method is evaluated through simulations. Conclusions: In conjunction with the MaxTest, the proposed method provides more parsimonious prediction models but includes more prognostic SNPs than some naive prediction methods. The proposed method is demonstrated with real GWAS data. PMID:23418752

  8. A Method for Checking Genomic Integrity in Cultured Cell Lines from SNP Genotyping Data

    PubMed Central

    McCarthy, Shane A.; Durbin, Richard

    2016-01-01

    Genomic screening for chromosomal abnormalities is an important part of quality control when establishing and maintaining stem cell lines. We present a new method for sensitive detection of copy number alterations, aneuploidy, and contamination in cell lines using genome-wide SNP genotyping data. In contrast to other methods designed for identifying copy number variations in a single sample or in a sample composed of a mixture of normal and tumor cells, this new method is tailored for determining differences between cell lines and the starting material from which they were derived, which allows us to distinguish between normal and novel copy number variation. We implemented the method in the freely available BCFtools package and present results based on induced pluripotent stem cell lines obtained in the HipSci project. PMID:27176002

  9. Use of SNP-arrays for ChIP assays: computational aspects.

    PubMed

    Muro, Enrique M; McCann, Jennifer A; Rudnicki, Michael A; Andrade-Navarro, Miguel A

    2009-01-01

    The simultaneous genotyping of thousands of single nucleotide polymorphisms (SNPs) in a genome using SNP-Arrays is a very important tool that is revolutionizing genetics and molecular biology. We expanded the utility of this technique by using it following chromatin immunoprecipitation (ChIP) to assess the multiple genomic locations protected by a protein complex recognized by an antibody. The power of this technique is illustrated through an analysis of the changes in histone H4 acetylation, a marker of open chromatin and transcriptionally active genomic regions, which occur during differentiation of human myoblasts into myotubes. The findings have been validated by the observation of a significant correlation between the detected histone modifications and the expression of the nearby genes, as measured by DNA expression microarrays. This chapter focuses on the computational analysis of the data. PMID:19588091

  10. To Cheat or Not To Cheat: Tryptophan Hydroxylase 2 SNP Variants Contribute to Dishonest Behavior

    PubMed Central

    Shen, Qiang; Teo, Meijun; Winter, Eyal; Hart, Einav; Chew, Soo H.; Ebstein, Richard P.

    2016-01-01

    Although lying (bearing false witness) is explicitly prohibited in the Decalogue and is a focus of interest in philosophy and theology, the behavioral and neural mechanisms of deception have more recently been gaining attention from diverse fields, especially economics, psychology, and neuroscience. Despite the considerable role of heredity in explaining individual differences in deceptive behavior, few studies have investigated which specific genes contribute to the heterogeneity of lying behavior across individuals. Also, little is known concerning which specific neurotransmitter pathways underlie deception. Toward addressing these two key questions, we implemented a neurogenetic strategy and modeled deception by an incentivized die-under-cup task in a laboratory setting. The results of this exploratory study provide provisional evidence that SNP variants across the tryptophan hydroxylase 2 (TPH2) gene, which encodes the rate-limiting enzyme in the biosynthesis of brain serotonin, contribute to individual differences in deceptive behavior. PMID:27199691

  11. Efficient fast heuristic algorithms for minimum error correction haplotyping from SNP fragments.

    PubMed

    Anaraki, Maryam Pourkamali; Sadeghi, Mehdi

    2014-01-01

    Availability of the complete human genome is a crucial factor for genetic studies that explore possible associations between the genome and complex diseases. A haplotype, as a set of single nucleotide polymorphisms (SNPs) on a single chromosome, is believed to contain promising data for disease association studies and for detecting natural positive selection and recombination hotspots. Various computational methods for haplotype reconstruction from aligned fragments of SNPs have already been proposed. This study presents a novel approach to obtain paternal and maternal haplotypes from the SNP fragments under the minimum error correction (MEC) model. Reconstructing haplotypes in the MEC model is an NP-hard problem. Therefore, our proposed methods employ two fast and accurate clustering techniques as the core of their procedure to efficiently solve this ill-defined problem. The assessment of our approaches, compared to conventional methods, on two real benchmark datasets, i.e., ACE and DALY, demonstrates their efficiency and accuracy. PMID:25539847
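
    The MEC objective itself is easy to state in code, even though minimizing it is NP-hard. The sketch below only scores a candidate haplotype; it does not reproduce the clustering heuristics of the study. The names `mismatches` and `mec_score`, the string encoding ('0'/'1' for alleles, '-' for sites a fragment does not cover), and the restriction to heterozygous sites (so the second haplotype is the complement of the first) are assumptions made for illustration.

```python
from typing import Iterable


def mismatches(fragment: str, haplotype: str) -> int:
    """Count covered sites where the fragment disagrees with the haplotype."""
    return sum(1 for f, h in zip(fragment, haplotype) if f != "-" and f != h)


def mec_score(fragments: Iterable[str], haplotype: str) -> int:
    """Minimum number of allele corrections needed to make every fragment
    consistent with one of the two complementary haplotypes."""
    complement = haplotype.translate(str.maketrans("01", "10"))
    return sum(
        min(mismatches(f, haplotype), mismatches(f, complement))
        for f in fragments
    )


fragments = ["010-", "-101", "1010", "0111"]
score = mec_score(fragments, "0101")
```

    Here the first three fragments match one of the two haplotypes exactly and the last needs a single correction, so the score is 1. A heuristic solver would search over candidate haplotypes, or over bipartitions of the fragments, to minimize this score.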

  12. SNP-based pathway enrichment analysis for genome-wide association studies

    PubMed Central

    2011-01-01

    Background: Recently we have witnessed a surge of interest in using genome-wide association studies (GWAS) to discover the genetic basis of complex diseases. Many genetic variations, mostly in the form of single nucleotide polymorphisms (SNPs), have been identified in a wide spectrum of diseases, including diabetes, cancer, and psychiatric diseases. A common theme arising from these studies is that the genetic variations discovered by GWAS can only explain a small fraction of the genetic risks associated with the complex diseases. New strategies and statistical approaches are needed to address this lack of explanation. One such approach is pathway analysis, which considers the genetic variations underlying a biological pathway jointly, rather than separately as in traditional GWAS. A critical challenge in pathway analysis is how to combine evidence of association over multiple SNPs within a gene and multiple genes within a pathway. Most current methods choose the most significant SNP from each gene as a representative, ignoring the joint action of multiple SNPs within a gene. This approach leads to preferential identification of genes with a greater number of SNPs. Results: We describe a SNP-based pathway enrichment method for GWAS. The method consists of the following two main steps: 1) for a given pathway, using an adaptive truncated product statistic to identify all representative (potentially more than one) SNPs of each gene, calculating the average number of representative SNPs for the genes, then re-selecting the representative SNPs of genes in the pathway based on this number; and 2) ranking all selected SNPs by the significance of their statistical association with a trait of interest, and testing if the set of SNPs from a particular pathway is significantly enriched with high ranks using a weighted Kolmogorov-Smirnov test. We applied our method to two large genetically distinct GWAS data sets of schizophrenia, one from European

  13. Mapping of Genetic Abnormalities of Primary Tumours from Metastatic CRC by High-Resolution SNP Arrays

    PubMed Central

    Sayagués, José María; Fontanillo, Celia; Abad, María del Mar; González-González, María; Sarasquete, María Eugenia; del Carmen Chillon, Maria; Garcia, Eva; Bengoechea, Oscar; Fonseca, Emilio; Gonzalez-Diaz, Marcos; De Las Rivas, Javier

    2010-01-01

    Background: For years, the genetics of metastatic colorectal cancer (CRC) have been studied using a variety of techniques. However, most of the approaches employed so far have a relatively limited resolution, which hampers detailed characterization of the common recurrent chromosomal breakpoints as well as the identification of small regions carrying genetic changes and the genes involved in them. Methodology/Principal Findings: Here we applied 500K SNP arrays to map the most common chromosomal lesions present at diagnosis in a series of 23 primary tumours from sporadic CRC patients who had developed liver metastasis. Overall our results confirm that the genetic profile of metastatic CRC is defined by imbalanced gains of chromosomes 7, 8q, 11q, 13q, 20q and X together with losses of the 1p, 8p, 17p and 18q chromosome regions. In addition, SNP-array studies allowed the identification of small (<1.3 Mb) and extensive/large (>1.5 Mb) altered DNA sequences, many of which contain cancer genes known to be involved in CRC and the metastatic process. Detailed characterization of the breakpoint regions for the altered chromosomes showed four recurrent breakpoints at chromosomes 1p12, 8p12, 17p11.2 and 20p12.1; interestingly, the most frequently observed recurrent chromosomal breakpoint was localized at 17p11.2 and systematically targeted the FAM27L gene, whose role in CRC deserves further investigation. Conclusions/Significance: In summary, in the present study we provide a detailed map of the genetic abnormalities of primary tumours from metastatic CRC patients, which confirms and extends previous observations regarding the identification of genes potentially involved in development of CRC and the metastatic process. PMID:21060790

  14. JAM: A Scalable Bayesian Framework for Joint Analysis of Marginal SNP Effects

    PubMed Central

    Conti, David V.; Richardson, Sylvia

    2016-01-01

    Recently, large scale genome-wide association study (GWAS) meta-analyses have boosted the number of known signals for some traits into the tens and hundreds. Typically, however, variants are only analysed one-at-a-time. This complicates the ability of fine-mapping to identify a small set of SNPs for further functional follow-up. We describe a new and scalable algorithm, joint analysis of marginal summary statistics (JAM), for the re-analysis of published marginal summary statistics under joint multi-SNP models. The correlation is accounted for according to estimates from a reference dataset, and models and SNPs that best explain the complete joint pattern of marginal effects are highlighted via an integrated Bayesian penalized regression framework. We provide both enumerated and Reversible Jump MCMC implementations of JAM and present some comparisons of performance. In a series of realistic simulation studies, JAM demonstrated identical performance to various alternatives designed for single region settings. In multi-region settings, where the only multivariate alternative involves stepwise selection, JAM offered greater power and specificity. We also present an application to real published results from MAGIC (meta-analysis of glucose and insulin related traits consortium) - a GWAS meta-analysis of more than 15,000 people. We re-analysed several genomic regions that produced multiple significant signals with glucose levels 2 hr after oral stimulation. Through joint multivariate modelling, JAM was able to formally rule out many SNPs, and for one gene, ADCY5, suggests that an additional SNP, which transpired to be more biologically plausible, should be followed up with equal priority to the reported index. PMID:27027514

  15. No evidence that GATA3 rs570613 SNP modifies breast cancer risk

    PubMed Central

    Johnatty, Sharon E.; Couch, Fergus J.; Fredericksen, Zachary; Tarrell, Robert; Spurdle, Amanda B.; Beesley, Jonathan; Chen, Xiaoqing; Gschwantler-Kaulich, Daphne; Singer, Christian F.; Fuerhauser, Christine; Fink-Retter, Anneliese; Domchek, Susan M.; Nathanson, Katherine L.; Pankratz, Vernon S.; Lindor, Noralane M.; Godwin, Andrew K.; Caligo, Maria A.; Hopper, John; Southey, Melissa C.; Giles, Graham G.; Justenhoven, Christina; Brauch, Hiltrud; Hamann, Ute; Ko, Yon-Dschun; Heikkinen, Tuomas; Aaltonen, Kirsimari; Aittomäki, Kristiina; Blomqvist, Carl; Nevanlinna, Heli; Hall, Per; Czene, Kamila; Liu, Jianjun; Peock, Susan; Cook, Margaret; Platte, Radka; Evans, D. Gareth; Lalloo, Fiona; Eeles, Rosalind; Pichert, Gabriella; Eccles, Diana; Davidson, Rosemarie; Cole, Trevor; Cook, Jackie; Douglas, Fiona; Chu, Carol; Hodgson, Shirley; Paterson, Joan; Hogervorst, Frans B.L.; Rookus, Matti A.; Seynaeve, Caroline; Wijnen, Juul; Vreeswijk, Maaike; Ligtenberg, Marjolijn; Luijt, Rob B. van der; van Os, Theo A.M.; Gille, Hans J.P.; Blok, Marinus J.; Issacs, Claudine; Humphreys, Manjeet K.; McGuffog, Lesley; Healey, Sue; Sinilnikova, Olga; Antoniou, Antonis C.; Easton, Douglas F.; Chenevix-Trench, Georgia

    2009-01-01

    GATA-binding protein 3 (GATA3) is a transcription factor that is crucial to mammary gland morphogenesis and differentiation of progenitor cells, and has been suggested to have a tumor suppressor function. The rs570613 single nucleotide polymorphism (SNP) in intron 4 of GATA3 was previously found to be associated with a reduction in breast cancer risk in the Cancer Genetic Markers of Susceptibility project and in pooled analysis of two case-control studies from Norway and Poland (Ptrend = 0.004), with some evidence for a stronger association with estrogen receptor (ER) negative tumours [1]. We genotyped GATA3 rs570613 in 6,388 cases and 4,995 controls from the Breast Cancer Association Consortium (BCAC) and 5,617 BRCA1 and BRCA2 carriers from the Consortium of Investigators of Modifiers of BRCA1/2 (CIMBA). We found no association between this SNP and breast cancer risk in BCAC cases overall (ORper-allele = 1.00, 95% CI 0.94−1.05), in ER negative BCAC cases (ORper-allele = 1.02, 95% CI 0.91−1.13), in BRCA1 mutation carriers (RRper-allele = 0.99, 95% CI 0.90−1.09) or BRCA2 mutation carriers (RRper-allele = 0.93, 95% CI 0.80−1.07). We conclude that there is no evidence that either GATA3 rs570613, or any variant in strong linkage disequilibrium with it, is associated with breast cancer risk in women. PMID:19082709

  16. RNA-Seq identifies SNP markers for growth traits in rainbow trout.

    PubMed

    Salem, Mohamed; Vallejo, Roger L; Leeds, Timothy D; Palti, Yniv; Liu, Sixin; Sabbagh, Annas; Rexroad, Caird E; Yao, Jianbo

    2012-01-01

    Fast growth is an important and highly desired trait, which affects the profitability of food animal production, with feed costs accounting for the largest proportion of production costs. Traditional phenotype-based selection is typically used to select for growth traits; however, genetic improvement is slow over generations. Single nucleotide polymorphisms (SNPs) explain 90% of the genetic differences between individuals; therefore, they are most suitable for genetic evaluation and strategies that employ molecular genetics for selective breeding. SNPs found within or near a coding sequence are of particular interest because they are more likely to alter the biological function of a protein. We aimed to use SNPs to identify markers and genes associated with genetic variation in growth. RNA-Seq whole-transcriptome analysis of pooled cDNA samples from a population of rainbow trout selected for improved growth versus unselected genetic cohorts (10 fish from 1 full-sib family each) identified SNP markers associated with growth-rate. The allelic imbalances (the ratio between the allele frequencies of the fast growing sample and that of the slow growing sample) were considered at scores >5.0 as an amplification and <0.2 as loss of heterozygosity. A subset of SNPs (n = 54) were validated and evaluated for association with growth traits in 778 individuals of a three-generation parent/offspring panel representing 40 families. Twenty-two SNP markers and one mitochondrial haplotype were significantly associated with growth traits. Polymorphism of 48 of the markers was confirmed in other commercially important aquaculture stocks. Many markers were clustered into genes of metabolic energy production pathways and are suitable candidates for genetic selection. The study demonstrates that RNA-Seq at low sequence coverage of divergent populations is a fast and effective means of identifying SNPs, with allelic imbalances between phenotypes. This technique is suitable for marker

  17. JAM: A Scalable Bayesian Framework for Joint Analysis of Marginal SNP Effects.

    PubMed

    Newcombe, Paul J; Conti, David V; Richardson, Sylvia

    2016-04-01

    Recently, large scale genome-wide association study (GWAS) meta-analyses have boosted the number of known signals for some traits into the tens and hundreds. Typically, however, variants are only analysed one-at-a-time. This complicates the ability of fine-mapping to identify a small set of SNPs for further functional follow-up. We describe a new and scalable algorithm, joint analysis of marginal summary statistics (JAM), for the re-analysis of published marginal summary statistics under joint multi-SNP models. The correlation is accounted for according to estimates from a reference dataset, and models and SNPs that best explain the complete joint pattern of marginal effects are highlighted via an integrated Bayesian penalized regression framework. We provide both enumerated and Reversible Jump MCMC implementations of JAM and present some comparisons of performance. In a series of realistic simulation studies, JAM demonstrated identical performance to various alternatives designed for single region settings. In multi-region settings, where the only multivariate alternative involves stepwise selection, JAM offered greater power and specificity. We also present an application to real published results from MAGIC (meta-analysis of glucose and insulin related traits consortium) - a GWAS meta-analysis of more than 15,000 people. We re-analysed several genomic regions that produced multiple significant signals with glucose levels 2 hr after oral stimulation. Through joint multivariate modelling, JAM was able to formally rule out many SNPs, and for one gene, ADCY5, suggests that an additional SNP, which transpired to be more biologically plausible, should be followed up with equal priority to the reported index. PMID:27027514

  18. Virtual karyotyping with SNP microarrays reduces uncertainty in the diagnosis of renal epithelial tumors

    PubMed Central

    Hagenkord, Jill M; Parwani, Anil V; Lyons-Weiler, Maureen A; Alvarez, Karla; Amato, Robert; Gatalica, Zoran; Gonzalez-Berjon, Jose M; Peterson, Leif; Dhir, Rajiv; Monzon, Federico A

    2008-01-01

    Background: Renal epithelial tumors are morphologically, biologically, and clinically heterogeneous. Different morphologic subtypes require specific management due to markedly different prognosis and response to therapy. Each common subtype has characteristic chromosomal gains and losses, including some with prognostic value. However, copy number information has not been readily accessible for clinical purposes and thus has not been routinely used in the diagnostic evaluation of these tumors. This information can be useful for classification of tumors with complex or challenging morphology. 'Virtual karyotypes' generated using SNP arrays can readily detect characteristic chromosomal lesions in paraffin embedded renal tumors and can be used to correctly categorize the common subtypes with performance characteristics that are amenable for routine clinical use. Methods: To investigate the use of virtual karyotypes for diagnostically challenging renal epithelial tumors, we evaluated 25 archived renal neoplasms where sub-classification could not be definitively rendered based on morphology and other ancillary studies. We generated virtual karyotypes with the Affymetrix 10 K 2.0 mapping array platform and identified the presence of genomic lesions across all 22 autosomes. Results: In 91% of challenging cases the virtual karyotype unambiguously detected the presence or absence of chromosomal aberrations characteristic of one of the common subtypes of renal epithelial tumors, while immunohistochemistry and fluorescent in situ hybridization had no or limited utility in the diagnosis of these tumors. Conclusion: These results show that virtual karyotypes generated by SNP arrays can be used as a practical ancillary study for the classification of renal epithelial tumors with complex or ambiguous morphology. PMID:18990225

  19. SNP Discovery with EST and NextGen Sequencing in Switchgrass (Panicum virgatum L.)

    PubMed Central

    Ersoz, Elhan S.; Wright, Mark H.; Pangilinan, Jasmyn L.; Sheehan, Moira J.; Tobias, Christian; Casler, Michael D.; Buckler, Edward S.; Costich, Denise E.

    2012-01-01

    Although yield trials for switchgrass (Panicum virgatum L.), a potentially high value biofuel feedstock crop, are currently underway throughout North America, the genetic tools for crop improvement in this species are still in the early stages of development. Identification of high-density molecular markers, such as single nucleotide polymorphisms (SNPs), that are amenable to high-throughput genotyping approaches, is the first step in a quantitative genetics study of this model biofuel crop species. We generated and sequenced expressed sequence tag (EST) libraries from thirteen diverse switchgrass cultivars representing both upland and lowland ecotypes, as well as tetraploid and octoploid genomes. We followed this with reduced genomic library preparation and massively parallel sequencing of the same samples using the Illumina Genome Analyzer technology platform. EST libraries were used to generate unigene clusters and establish a gene-space reference sequence, thus providing a framework for assembly of the short sequence reads. SNPs were identified utilizing these scaffolds. We used a custom software program for alignment and SNP detection and identified over 149,000 SNPs across the 13 short-read sequencing libraries (SRSLs). Approximately 25,000 additional SNPs were identified from the entire EST collection available for the species. This sequencing effort generated data that are suitable for marker development and for estimation of population genetic parameters, such as nucleotide diversity and linkage disequilibrium. Based on these data, we assessed the feasibility of genome wide association mapping and genomic selection applications in switchgrass. Overall, the SNP markers discovered in this study will help facilitate quantitative genetics experiments and greatly enhance breeding efforts that target improvement of key biofuel traits and development of new switchgrass cultivars. PMID:23049744

  20. EST-derived SNP discovery and selective pressure analysis in Pacific white shrimp (Litopenaeus vannamei)

    NASA Astrophysics Data System (ADS)

    Liu, Chengzhang; Wang, Xia; Xiang, Jianhai; Li, Fuhua

    2012-09-01

    Pacific white shrimp has become a major aquaculture and fishery species worldwide. Although a large scale EST resource has been publicly available since 2008, the data have not yet been widely used for SNP discovery or transcriptome-wide assessment of selective pressure. In this study, a set of 155 411 expressed sequence tags (ESTs) from the NCBI database were computationally analyzed and 17 225 single nucleotide polymorphisms (SNPs) were predicted, including 9 546 transitions, 5 124 transversions and 2 481 indels. Among the 7 298 SNP substitutions located in functionally annotated contigs, 58.4% (4 262) are non-synonymous SNPs capable of introducing amino acid mutations. Two hundred and fifty non-synonymous SNPs in genes associated with economic traits have been identified as candidates for markers in selective breeding. Diversity estimates among the synonymous nucleotides were on average 3.49 times greater than those among non-synonymous nucleotides, suggesting negative selection. The ratio of non-synonymous to synonymous substitutions (Ka/Ks) ranges from 0 to 4.01 (average 0.42, median 0.26), suggesting that the majority of the affected genes are under purifying selection. Enrichment analysis identified multiple gene ontology categories under positive or negative selection. Categories involved in innate immune response and male gamete generation are rich in positively selected genes, which is similar to reports in Drosophila and primates. This work is the first transcriptome-wide assessment of selective pressure in a Penaeid shrimp species. The functionally annotated SNPs provide a valuable resource of potential molecular markers for selective breeding.
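
    The substitution classes and summary ratios quoted above can be reproduced with a short calculation. The sketch below is illustrative only: the function name `substitution_class` and the variable names are assumptions, the counts are copied from the abstract, and the classification rule is the standard one (purine to purine or pyrimidine to pyrimidine is a transition, anything else a transversion).

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def substitution_class(ref: str, alt: str) -> str:
    """Classify a single-base substitution as a transition or a transversion."""
    ref, alt = ref.upper(), alt.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid or ref == alt:
        raise ValueError(f"not a single-base substitution: {ref}>{alt}")
    if {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES:
        return "transition"
    return "transversion"


# Counts as reported in the abstract.
transitions, transversions = 9546, 5124
annotated_substitutions, non_synonymous = 7298, 4262

ts_tv_ratio = transitions / transversions                     # about 1.86
non_synonymous_share = non_synonymous / annotated_substitutions  # about 0.584
```

    The second ratio matches the 58.4% stated in the abstract. The transition/transversion ratio is not reported there and is derived here only from the quoted counts.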

  1. SNP array mapping of chromosome 20p deletions: genotypes, phenotypes, and copy number variation.

    PubMed

    Kamath, Binita M; Thiel, Brian D; Gai, Xiaowu; Conlin, Laura K; Munoz, Pedro S; Glessner, Joseph; Clark, Dinah; Warthen, Daniel M; Shaikh, Tamim H; Mihci, Ercan; Piccoli, David A; Grant, Struan F A; Hakonarson, Hakon; Krantz, Ian D; Spinner, Nancy B

    2009-03-01

    The use of array technology to define chromosome deletions and duplications is bringing us closer to establishing a genotype/phenotype map of genomic copy number alterations. We studied 21 patients and five relatives with deletions of the short arm of chromosome 20 using the Illumina HumanHap550 SNP array to: 1) more accurately determine the deletion sizes; 2) identify and compare breakpoints; 3) establish genotype/phenotype correlations; and 4) investigate the use of the HumanHap550 platform for analysis of chromosome deletions. Deletions ranged from 95 kb to 14.62 Mb, and all of the breakpoints were unique. Eleven patients had deletions between 95 kb and 4 Mb and these individuals had normal development, with no anomalies outside of those associated with Alagille syndrome (AGS). The proximal and distal boundaries of these 11 deletions constitute a 5.4-Mb region, and we propose that haploinsufficiency for only 1 of the 12 genes in this region causes phenotypic abnormalities. This defines the JAG1-associated critical region, in which deletions do not confer findings other than those associated with AGS. The other 10 patients had deletions between 3.28 Mb and 14.62 Mb, which extended outside the critical region, and, notably, all of these patients had developmental delay. This group had other findings such as autism, scoliosis, and bifid uvula. We identified 47 additional polymorphic genome-wide copy number variants (>20 SNPs), with 0 to 5 variants called per patient. Deletions of the short arm of chromosome 20 are associated with relatively mild and limited clinical anomalies. The use of SNP arrays provides accurate high-resolution definition of genomic abnormalities. PMID:19058200

  2. Y-SNP L1034: limited genetic link between Mansi and Hungarian-speaking populations.

    PubMed

    Fehér, T; Németh, E; Vándor, A; Kornienko, I V; Csáji, L K; Pamjav, H

    2015-02-01

    Genetic studies noted that the Hungarian Y-chromosomal gene pool significantly differs from other Uralic-speaking populations. Hungarians show very limited or no presence of haplogroup N-Tat, which is frequent among other Uralic-speaking populations. We proposed that some genetic links need to be observed between the linguistically related Hungarian and Mansi populations. This is the first attempt to divide haplogroup N-Tat into subhaplogroups by testing new downstream SNP markers L708 and L1034. Sixty Northern Mansi samples were collected in Western Siberia and genotyped for Y-chromosomal haplotypes and haplogroups. We found 14 Mansi and 92 N-Tat samples from 7 populations. Comparative results showed that all N-Tat samples carried the N-L708 mutation. Some Hungarian, Sekler, and Uzbek samples were L1034 SNP positive, while all Mongolians, Buryats, Khanty, Finnish, and Roma samples yielded a negative result for this marker. Based on the above, the L1034 marker seems to define a subgroup of N-Tat, which is so far typical for Mansi and Hungarian-speaking ethnic groups. Based on our time to most recent common ancestor data, the L1034 marker arose 2,500 years before present. The overall frequency of L1034 is very low among the analyzed populations, thus it does not necessarily mean that proto-Hungarians and Mansi descend from common ancestors. It does provide, however, a limited genetic link supporting language contact. Both Hungarians and Mansi have a much more complex genetic population history than the traditional tree-based linguistic model would suggest. PMID:25258186

  3. PPLine: An Automated Pipeline for SNP, SAP, and Splice Variant Detection in the Context of Proteogenomics.

    PubMed

    Krasnov, George Sergeevich; Dmitriev, Alexey Alexandrovich; Kudryavtseva, Anna Viktorovna; Shargunov, Alexander Valerievich; Karpov, Dmitry Sergeevich; Uroshlev, Leonid Andreevich; Melnikova, Natalya Vladimirovna; Blinov, Vladimir Mikhailovich; Poverennaya, Ekaterina Vladimirovna; Archakov, Alexander Ivanovich; Lisitsa, Andrey Valerievich; Ponomarenko, Elena Alexandrovna

    2015-09-01

    The fundamental mission of the Chromosome-Centric Human Proteome Project (C-HPP) is the research of human proteome diversity, including rare variants. Liver tissues, HepG2 cells, and plasma were selected as major objects for C-HPP studies. The proteogenomic approach, a recently introduced technique, is a powerful method for predicting and validating proteoforms coming from alternative splicing, mutations, and transcript editing. We developed PPLine, a Python-based proteogenomic pipeline providing automated single-amino-acid polymorphism (SAP), indel, and alternative-spliced-variants discovery based on raw transcriptome and exome sequence data, single-nucleotide polymorphism (SNP) annotation and filtration, and the prediction of proteotypic peptides (available at https://sourceforge.net/projects/ppline). In this work, we performed deep transcriptome sequencing of HepG2 cells and liver tissues using two platforms: Illumina HiSeq and Applied Biosystems SOLiD. Using PPLine, we revealed 7756 SAPs and indels for HepG2 cells and liver (including 659 variants not annotated in dbSNP). We found 17 indels in transcripts associated with the translation of alternate reading frames (ARF) longer than 300 bp. The ARF products of two genes, SLMO1 and TMEM8A, demonstrate signatures of caspase-binding domain and Gcn5-related N-acetyltransferase. Alternative splicing analysis predicted novel proteoforms encoded by 203 (liver) and 475 (HepG2) genes according to both Illumina and SOLiD data. The results of the present work represent a basis for subsequent proteomic studies by the C-HPP consortium. PMID:26147802

  4. Identifying Litchi (Litchi chinensis Sonn.) Cultivars and Their Genetic Relationships Using Single Nucleotide Polymorphism (SNP) Markers

    PubMed Central

    Liu, Wei; Xiao, Zhidan; Bao, Xiuli; Yang, Xiaoyan; Fang, Jing; Xiang, Xu

    2015-01-01

    Litchi is an important fruit tree in tropical and subtropical areas of the world. However, there is widespread confusion regarding litchi cultivar nomenclature, and detailed information on genetic relationships among litchi germplasm is lacking. In the present study, the potential of single nucleotide polymorphisms (SNPs) for the identification of 96 representative litchi accessions and their genetic relationships in China was evaluated using 155 SNPs that were evenly spaced across the litchi genome. Ninety SNPs with minor allele frequencies above 0.05 and a good genotyping success rate were used for further analysis. A relatively high level of genetic variation was observed among litchi accessions, as quantified by the expected heterozygosity (He = 0.305). SNP-based multilocus matching identified two synonymous groups, ‘Heiye’ and ‘Wuye’, and ‘Chengtuo’ and ‘Baitangli 1’. A subset of 14 SNPs was sufficient to distinguish all the non-redundant litchi genotypes, and these SNPs were proven to be highly stable by repeated analyses of a selected group of cultivars. Unweighted pair-group method of arithmetic averages (UPGMA) cluster analysis divided the litchi accessions analyzed into four main groups, which corresponded to the traits of extremely early-maturing, early-maturing, middle-maturing, and late-maturing, indicating that the fruit maturation period should be considered as the primary criterion for litchi taxonomy. Two subpopulations were detected among litchi accessions by STRUCTURE analysis, and accessions with extremely early- and late-maturing traits showed membership coefficients above 0.99 for Cluster 1 and Cluster 2, respectively. Accessions with early- and middle-maturing traits were identified as admixture forms with varying levels of membership shared between the two clusters, indicating their hybrid origin during litchi domestication. The results of this study will benefit litchi germplasm conservation programs and facilitate maximum
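
    The marker filter and diversity statistic mentioned in this record (minor allele frequency above 0.05, expected heterozygosity He) can be illustrated with a minimal sketch. It assumes biallelic SNPs with genotypes coded as 0, 1, or 2 copies of one allele; the function names and the toy genotypes are hypothetical and are not taken from the study.

```python
from typing import Dict, List, Tuple


def allele_freq(genotypes: List[int]) -> float:
    """Frequency of the counted allele; each genotype is 0, 1 or 2 copies."""
    return sum(genotypes) / (2 * len(genotypes))


def minor_allele_freq(genotypes: List[int]) -> float:
    """Frequency of whichever allele is rarer at this SNP."""
    p = allele_freq(genotypes)
    return min(p, 1 - p)


def expected_heterozygosity(genotypes: List[int]) -> float:
    """He = 1 - sum of squared allele frequencies (2pq for a biallelic SNP)."""
    p = allele_freq(genotypes)
    return 1 - (p ** 2 + (1 - p) ** 2)


def filter_and_summarize(
    snps: Dict[str, List[int]], maf_threshold: float = 0.05
) -> Tuple[Dict[str, List[int]], float]:
    """Keep SNPs with MAF above the threshold and return their mean He."""
    kept = {name: g for name, g in snps.items()
            if minor_allele_freq(g) > maf_threshold}
    if not kept:
        return kept, 0.0
    mean_he = sum(expected_heterozygosity(g) for g in kept.values()) / len(kept)
    return kept, mean_he


if __name__ == "__main__":
    toy = {"snp_a": [0, 1, 2, 1], "snp_b": [0] * 9 + [1]}  # hypothetical data
    kept, mean_he = filter_and_summarize(toy)
    print(sorted(kept), round(mean_he, 3))
```

    Averaging He over the retained SNPs gives one value per panel, which is how a figure such as He = 0.305 is typically summarized; whether the study used exactly this estimator is not stated in the record.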

  5. Identification and characterisation of novel SNP markers in Atlantic cod: Evidence for directional selection

    PubMed Central

    Moen, Thomas; Hayes, Ben; Nilsen, Frank; Delghandi, Madjid; Fjalestad, Kjersti T; Fevolden, Svein-Erik; Berg, Paul R; Lien, Sigbjørn

    2008-01-01

    Background: The Atlantic cod (Gadus morhua) is a groundfish of great economic value in fisheries and an emerging species in aquaculture. Genetic markers are needed to identify wild stocks in order to ensure sustainable management, and for marker-assisted selection and pedigree determination in aquaculture. Here, we report on the development and evaluation of a large number of Single Nucleotide Polymorphism (SNP) markers from the alignment of Expressed Sequence Tag (EST) sequences in Atlantic cod. We also present basic population parameters of the SNPs in samples of North-East Arctic cod and Norwegian coastal cod obtained from three different localities, and test for SNPs that may have been targeted by natural selection. Results: A total of 17,056 EST sequences were used to find 724 putative SNPs, from which 318 segregating SNPs were isolated. The SNPs were tested on Atlantic cod from four different sites, comprising both North-East Arctic cod (NEAC) and Norwegian coastal cod (NCC). The average heterozygosity of the SNPs was 0.25 and the average minor allele frequency was 0.18. FST values were highly variable, with the majority of SNPs displaying very little differentiation while others had FST values as high as 0.83. The FST values of 29 SNPs were found to be larger than expected under a strictly neutral model, suggesting that these loci are, or have been, influenced by natural selection. For the majority of these outlier SNPs, allele frequencies in a northern sample of NCC were intermediate between allele frequencies in a southern sample of NCC and a sample of NEAC, indicating a cline in allele frequencies similar to that found at the Pantophysin I locus. Conclusion: The SNP markers presented here are powerful tools for future genetics work related to management and aquaculture. In particular, some SNPs exhibiting high levels of population divergence have potential to significantly enhance studies on the population structure of Atlantic cod. PMID:18302786
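
    The per-SNP FST values reported in this record measure how strongly allele frequencies differ between populations. The sketch below computes the classical Wright form, FST = (HT - HS) / HT, for one biallelic SNP in two equally weighted populations. This is an assumption made for illustration: the record does not say which estimator the authors used, and the function name and example frequencies are hypothetical.

```python
def fst_two_pops(p1: float, p2: float) -> float:
    """Wright's FST for one biallelic SNP from two population allele frequencies.

    HT is the expected heterozygosity of the pooled population,
    HS the mean expected heterozygosity within populations.
    """
    p_bar = (p1 + p2) / 2
    h_t = 2 * p_bar * (1 - p_bar)
    h_s = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    if h_t == 0:
        return 0.0  # SNP is monomorphic in both populations
    return (h_t - h_s) / h_t


if __name__ == "__main__":
    # Hypothetical allele frequencies in two populations.
    for p1, p2 in [(0.30, 0.30), (0.40, 0.60), (0.10, 0.90)]:
        print(p1, p2, round(fst_two_pops(p1, p2), 3))
```

    Identical frequencies give FST = 0 and fixed differences give FST = 1, so a value such as 0.83 indicates near-complete differentiation at that locus.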

  6. Association of a single nucleotide polymorphism (SNP) of calpain 1 (CAPN1) gene with meat tenderness of yak.

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The association of a single nucleotide polymorphism (SNP) of calpain 1 (CAPN1) gene with shear force of 2.54 cm steaks from M. longissimus dorsi from Gannan yaks (Bos grunniens, n=181) was studied. Yaks were harvested at 2, 3, and 4 yr of age (n=51, 59, and 71, respectively), and samples of each yak...

  7. Association of STAT2 SNP genotypes and growth phenotypes in heifers from an Angus, Brahman and Romosinuano diallel population

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Components of the growth endocrine axis regulate growth and reproduction traits in cattle. A SNP in the promoter of the signal transducer and activator of transcription 2 (STAT2) has been previously reported to be associated with postpartum rebreeding in a diallel beef population composed of 650 hei...

  8. Development of EST-based SNP and InDel markers and their utilization in tetraploid cotton genetic mapping

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Expressed sequence tags (ESTs) were analyzed in silico in order to identify single nucleotide polymorphisms (SNPs) and insertion-deletion polymorphisms (InDels) in cotton. A total of 1349 EST-based SNP and InDel markers were developed by comparing ESTs between Gossypium hirsutum and G. barbadense, m...

  9. Ethnic heterogeneity of IRF6 AP-2α binding site promoter SNP association with nonsyndromic cleft lip and palate

    PubMed Central

    Blanton, Susan H.; Burt, Amber; Garcia, Elizabeth; Mulliken, John B.; Stal, Samuel; Hecht, Jacqueline T.

    2010-01-01

    Objective: The goal of this study was to confirm the reported association between a noncoding SNP (rs642961) in IRF6 and nonsyndromic cleft lip and palate (NSCLP). Design, Setting and Participants: Two SNPs in IRF6 (rs2235371 and rs642961) were genotyped in Hispanic and nonHispanic white multiplex (122) and simplex (308) NSCLP families. Linkage and family-based association analyses were performed on the individual SNPs as well as the 2-SNP haplotype. Results: We find only modest evidence for an association with rs642961 and the 2-SNP haplotype. In contrast, we found strong evidence for association with rs2235371; this was most evident in the nonHispanic white simplex families. Conclusions: While we confirm that variation in IRF6 is associated with NSCLP, our results do not support the reported association with SNP rs642961. Importantly, the association varies between ethnic groups. This finding underscores the need for evaluating additional variations in IRF6 across multiple populations to better determine its role in NSCLP. PMID:21039277

  10. Association of single nucleotide polymorphism (SNP) markers in candidate genes and QTL regions with pork quality traits in commercial pigs

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Numerous reports have described genetic markers or genomic regions (QTL) associated with pork quality and/or palatability. Validation of these associations in other commercial populations is necessary before these markers should be used. Therefore, 156 SNP markers from 45 candidate genes and 8 QTL r...

  11. Association of Single Nucleotide Polymorphism (SNP) Markers in Candidate Genes and QTL Regions with Pork Quality in Commercial Pigs

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Numerous reports have described genetic markers or genomic regions (QTL) associated with pork quality and/or palatability. Validation of these associations in other commercial populations is necessary before these markers should be used. Therefore, we tested 130 SNP markers from 35 candidate genes a...

  12. CLOCK 3111 T/C SNP interacts with emotional eating behavior for weight-loss in a Mediterranean population

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The goals of this research were (1) to analyze the role of emotional eating behavior on weight-loss progression during a 30-week weight-loss program in 1,272 individuals from a large Mediterranean population and (2) to test for interaction between CLOCK 3111 T/C SNP and emotional eating behavior on t...

  13. SNP discovery and chromosome anchoring provide the first physically-anchored hexaploid oat map and reveal synteny with model species

    Technology Transfer Automated Retrieval System (TEKTRAN)

    For the first time in many years a comprehensive genome map for cultivated oat has been constructed using a combination of single nucleotide polymorphism (SNP) markers and validated with a collection of cytogenetically defined germplasm lines. The markers were able to help distinguish the three geno...

  14. Finding Markers That Make a Difference: DNA Pooling and SNP-Arrays Identify Population Informative Markers for Genetic Stock Identification

    PubMed Central

    Ozerov, Mikhail; Vasemägi, Anti; Wennevik, Vidar; Diaz-Fernandez, Rogelio; Kent, Matthew; Gilbey, John; Prusov, Sergey; Niemelä, Eero; Vähä, Juha-Pekka

    2013-01-01

    Genetic stock identification (GSI) using molecular markers is an important tool for management of migratory species. Here, we tested a cost-effective alternative to individual genotyping, known as allelotyping, for identification of highly informative SNPs for accurate genetic stock identification. We estimated allele frequencies of 2880 SNPs from DNA pools of 23 Atlantic salmon populations using Illumina SNP-chip. We evaluated the performance of four common strategies (global FST, pairwise FST, Delta and outlier approach) for selection of the most informative set of SNPs and tested their effectiveness for GSI compared to random sets of SNP and microsatellite markers. For the majority of cases, SNPs selected using the outlier approach performed best followed by pairwise FST and Delta methods. Overall, the selection procedure reduced the number of SNPs required for accurate GSI by up to 53% compared with randomly chosen SNPs. However, GSI accuracy was more affected by populations in the ascertainment group rather than the ranking method itself. We demonstrated for the first time the compatibility of different large-scale SNP datasets by compiling the largest population genetic dataset for Atlantic salmon to date. Finally, we showed an excellent performance of our top SNPs on an independent set of populations covering the main European distribution range of Atlantic salmon. Taken together, we demonstrate how combination of DNA pooling and SNP arrays can be applied for conservation and management of salmonids as well as other species. PMID:24358184
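
    Ranking SNPs by how informative they are for stock identification, as described in this record, can be sketched with the Delta criterion. The sketch assumes Delta is the absolute allele-frequency difference between two populations, averaged over all population pairs, which is a common definition for biallelic markers but is not spelled out in the record. The function names and pooled frequencies are hypothetical.

```python
from itertools import combinations
from typing import Dict, List


def mean_pairwise_delta(freqs: List[float]) -> float:
    """Mean absolute allele-frequency difference over all population pairs."""
    pairs = list(combinations(freqs, 2))
    return sum(abs(a - b) for a, b in pairs) / len(pairs)


def top_snps(pool_freqs: Dict[str, List[float]], k: int) -> List[str]:
    """Return the k SNP names with the largest mean pairwise Delta."""
    ranked = sorted(pool_freqs,
                    key=lambda snp: mean_pairwise_delta(pool_freqs[snp]),
                    reverse=True)
    return ranked[:k]


if __name__ == "__main__":
    # Hypothetical allele frequencies estimated from three DNA pools.
    pools = {
        "snp_1": [0.10, 0.90, 0.50],
        "snp_2": [0.50, 0.50, 0.50],
        "snp_3": [0.20, 0.40, 0.30],
    }
    print(top_snps(pools, 2))
```

    Because the ranking only needs per-population allele frequencies, it can run directly on pooled (allelotyping) estimates without individual genotypes.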

  15. Finding markers that make a difference: DNA pooling and SNP-arrays identify population informative markers for genetic stock identification.

    PubMed

    Ozerov, Mikhail; Vasemägi, Anti; Wennevik, Vidar; Diaz-Fernandez, Rogelio; Kent, Matthew; Gilbey, John; Prusov, Sergey; Niemelä, Eero; Vähä, Juha-Pekka

    2013-01-01

    Genetic stock identification (GSI) using molecular markers is an important tool for management of migratory species. Here, we tested a cost-effective alternative to individual genotyping, known as allelotyping, for identification of highly informative SNPs for accurate genetic stock identification. We estimated allele frequencies of 2880 SNPs from DNA pools of 23 Atlantic salmon populations using Illumina SNP-chip. We evaluated the performance of four common strategies (global FST, pairwise FST, Delta and outlier approach) for selection of the most informative set of SNPs and tested their effectiveness for GSI compared to random sets of SNP and microsatellite markers. For the majority of cases, SNPs selected using the outlier approach performed best followed by pairwise FST and Delta methods. Overall, the selection procedure reduced the number of SNPs required for accurate GSI by up to 53% compared with randomly chosen SNPs. However, GSI accuracy was more affected by populations in the ascertainment group rather than the ranking method itself. We demonstrated for the first time the compatibility of different large-scale SNP datasets by compiling the largest population genetic dataset for Atlantic salmon to date. Finally, we showed an excellent performance of our top SNPs on an independent set of populations covering the main European distribution range of Atlantic salmon. Taken together, we demonstrate how combination of DNA pooling and SNP arrays can be applied for conservation and management of salmonids as well as other species. PMID:24358184

  16. Development of a high-throughput SNP resource to advance genomic, genetic and breeding research in carrot (Daucus carota L.)

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The rapid advancement in high-throughput SNP genotyping technologies along with next generation sequencing (NGS) platforms has decreased the cost, improved the quality of large-scale genome surveys, and allowed specialty crops with limited genomic resources such as carrot (Daucus carota) to access t...

  17. Development and evaluation of a genome-wide 6K SNP array for diploid sweet cherry and tetraploid sour cherry

    Technology Transfer Automated Retrieval System (TEKTRAN)

    High-throughput genome scans are important tools for genetic studies and breeding applications. Here, a 6K SNP array for use with the Illumina Infinium® system was developed for diploid sweet cherry (Prunus avium) and allotetraploid sour cherry (P. cerasus). This effort was led by RosBREED, a commun...

  18. Translational genomics for abiotic stress in sorghum: transcriptional profiling and validation of SNP markers between germplasm with differential cold tolerance

    Technology Transfer Automated Retrieval System (TEKTRAN)

    One focus of the Sorghum Translational Genomics Lab (part of sorghum CRIS, PSGD, CSRL, USDA-ARS, Lubbock TX) is to utilize nucleotide variation between sorghum germplasm such as those derived from RNA seq for translation and validation of Single Nucleotide Polymorphism (SNP) into easy access DNA m...

  19. SNP analysis with duplicated fish genomes: differentiation of SNPs, paralogous sequence variants and multi-site variants

    Technology Transfer Automated Retrieval System (TEKTRAN)

    High-throughput SNP discovery and genotyping have facilitated genome analyses aimed at identifying factors that affect traits of interest. Platforms that multiplex thousands of SNPs are available for some agricultural species but not yet for aquaculture. Ray-finned fish share an additional (3R) roun...

  20. Imputation of microsatellite alleles from dense SNP genotypes for parentage verification across multiple Bos taurus and Bos indicus breeds

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Microsatellite markers (MS) have traditionally been used for parental verification and are still the international standard in spite of their higher cost, error rate, and turnaround time compared with Single Nucleotide Polymorphisms (SNP) -based assays. Despite domestic and international demands fr...

  1. SEL1L SNP rs12435998, a predictor of glioblastoma survival and response to radio-chemotherapy

    PubMed Central

    Storaci, Alessandra Maria; Annovazzi, Laura; Cassoni, Paola; Melcarne, Antonio; De Blasio, Pasquale; Schiffer, Davide; Biunno, Ida

    2015-01-01

    The suppressor of Lin-12-like (C. elegans) (SEL1L) is involved in the endoplasmic reticulum (ER)-associated degradation pathway, malignant transformation and stem cells. In 412 formalin-fixed and paraffin-embedded brain tumors and 39 Glioblastoma multiforme (GBM) cell lines, we determined the frequency of five SEL1L single nucleotide genetic variants with regulatory and coding functions by a SNaPShot™ assay. We tested their possible association with brain tumor risk, prognosis and therapy. We studied the in vitro cytotoxicity of valproic acid (VPA), temozolomide (TMZ), doxorubicin (DOX) and paclitaxel (PTX), alone or in combination, on 11 GBM cell lines, with respect to the SNP rs12435998 genotype. The SNP rs12435998 was prevalent in anaplastic and malignant gliomas, and in meningiomas of all histologic grades, but unrelated to brain tumor risks. In GBM patients, the SNP rs12435998 was associated with prolonged overall survival (OS) and better response to TMZ-based radio-chemotherapy. GBM stem cells with this SNP showed lower levels of SEL1L expression and enhanced sensitivity to VPA. PMID:25948789

  2. The use and economic value of the 3K SNP genomic test for calves on dairy farms

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Dairy producers now have the opportunity to test their females with the low-density 3K SNP genomic test. This test provides an estimate of an animal’s genetic merit for many traits, including milk production and Net Merit (NM$). As of August 2011, approximately 45,000 animals have been tested with t...

  3. Population-standardized genetic risk score: the SNP-based method of choice for inherited risk assessment of prostate cancer.

    PubMed

    Conran, Carly A; Na, Rong; Chen, Haitao; Jiang, Deke; Lin, Xiaoling; Zheng, S Lilly; Brendler, Charles B; Xu, Jianfeng

    2016-01-01

    Several different approaches are available to clinicians for determining prostate cancer (PCa) risk. The clinical validity of various PCa risk assessment methods utilizing single nucleotide polymorphisms (SNPs) has been established; however, these SNP-based methods have not been compared. The objective of this study was to compare the three most commonly used SNP-based methods for PCa risk assessment. Participants were men (n = 1654) enrolled in a prospective study of PCa development. Genotypes of 59 PCa risk-associated SNPs were available in this cohort. Three methods of calculating SNP-based genetic risk scores (GRSs) were used for the evaluation of individual disease risk such as risk allele count (GRS-RAC), weighted risk allele count (GRS-wRAC), and population-standardized genetic risk score (GRS-PS). Mean GRSs were calculated, and performances were compared using area under the receiver operating characteristic curve (AUC) and positive predictive value (PPV). All SNP-based methods were found to be independently associated with PCa (all P < 0.05; hence their clinical validity). The mean GRSs in men with or without PCa using GRS-RAC were 55.15 and 53.46, respectively, using GRS-wRAC were 7.42 and 6.97, respectively, and using GRS-PS were 1.12 and 0.84, respectively (all P < 0.05 for differences between patients with or without PCa). All three SNP-based methods performed similarly in discriminating PCa from non-PCa based on AUC and in predicting PCa risk based on PPV (all P > 0.05 for comparisons between the three methods), and all three SNP-based methods had a significantly higher AUC than family history (all P < 0.05). 
    Results from this study suggest that while the three most commonly used SNP-based methods performed similarly in discriminating PCa from non-PCa at the population level, GRS-PS is the method of choice for risk assessment at the individual level because its value (where 1.0 represents average population risk) can be easily interpreted regardless
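
    The three scores compared in this record can be sketched as follows. The formulas are assumptions based on how such scores are commonly defined, since the record does not give them: GRS-RAC sums risk-allele counts, GRS-wRAC weights each count by the natural log of the SNP's odds ratio, and GRS-PS multiplies per-SNP genotype relative risks (OR raised to the allele count) after dividing each by its population average under Hardy-Weinberg proportions. All function names and input values are hypothetical.

```python
import math
from typing import List


def grs_rac(counts: List[int]) -> int:
    """Risk allele count: total number of risk alleles carried (0-2 per SNP)."""
    return sum(counts)


def grs_wrac(counts: List[int], odds_ratios: List[float]) -> float:
    """Weighted risk allele count: each allele weighted by ln(OR) of its SNP."""
    return sum(g * math.log(o) for g, o in zip(counts, odds_ratios))


def grs_ps(counts: List[int], odds_ratios: List[float],
           risk_freqs: List[float]) -> float:
    """Population-standardized score; 1.0 corresponds to average population risk."""
    score = 1.0
    for g, o, f in zip(counts, odds_ratios, risk_freqs):
        # Average relative risk in the population under Hardy-Weinberg proportions.
        pop_avg = (1 - f) ** 2 + 2 * f * (1 - f) * o + f ** 2 * o ** 2
        score *= (o ** g) / pop_avg
    return score


if __name__ == "__main__":
    counts = [1, 2, 0]            # hypothetical genotypes at three SNPs
    ors = [1.20, 1.10, 1.30]      # hypothetical per-allele odds ratios
    freqs = [0.30, 0.50, 0.20]    # hypothetical risk-allele frequencies
    print(grs_rac(counts), round(grs_wrac(counts, ors), 3),
          round(grs_ps(counts, ors, freqs), 3))
```

    Under this form, the GRS-PS value reads directly as a relative risk against the population mean, which matches the interpretability argument made in the abstract.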

  4. Population-standardized genetic risk score: the SNP-based method of choice for inherited risk assessment of prostate cancer

    PubMed Central

    Conran, Carly A; Na, Rong; Chen, Haitao; Jiang, Deke; Lin, Xiaoling; Zheng, S Lilly; Brendler, Charles B; Xu, Jianfeng

    2016-01-01

    Several different approaches are available to clinicians for determining prostate cancer (PCa) risk. The clinical validity of various PCa risk assessment methods utilizing single nucleotide polymorphisms (SNPs) has been established; however, these SNP-based methods have not been compared. The objective of this study was to compare the three most commonly used SNP-based methods for PCa risk assessment. Participants were men (n = 1654) enrolled in a prospective study of PCa development. Genotypes of 59 PCa risk-associated SNPs were available in this cohort. Three methods of calculating SNP-based genetic risk scores (GRSs) were used for the evaluation of individual disease risk such as risk allele count (GRS-RAC), weighted risk allele count (GRS-wRAC), and population-standardized genetic risk score (GRS-PS). Mean GRSs were calculated, and performances were compared using area under the receiver operating characteristic curve (AUC) and positive predictive value (PPV). All SNP-based methods were found to be independently associated with PCa (all P < 0.05; hence their clinical validity). The mean GRSs in men with or without PCa using GRS-RAC were 55.15 and 53.46, respectively, using GRS-wRAC were 7.42 and 6.97, respectively, and using GRS-PS were 1.12 and 0.84, respectively (all P < 0.05 for differences between patients with or without PCa). All three SNP-based methods performed similarly in discriminating PCa from non-PCa based on AUC and in predicting PCa risk based on PPV (all P > 0.05 for comparisons between the three methods), and all three SNP-based methods had a significantly higher AUC than family history (all P < 0.05). 
    Results from this study suggest that while the three most commonly used SNP-based methods performed similarly in discriminating PCa from non-PCa at the population level, GRS-PS is the method of choice for risk assessment at the individual level because its value (where 1.0 represents average population risk) can be easily interpreted regardless

  5. SNP discovery and chromosome anchoring provide the first physically-anchored hexaploid oat map and reveal synteny with model species.

    PubMed

    Oliver, Rebekah E; Tinker, Nicholas A; Lazo, Gerard R; Chao, Shiaoman; Jellen, Eric N; Carson, Martin L; Rines, Howard W; Obert, Donald E; Lutz, Joseph D; Shackelford, Irene; Korol, Abraham B; Wight, Charlene P; Gardner, Kyle M; Hattori, Jiro; Beattie, Aaron D; Bjørnstad, Åsmund; Bonman, J Michael; Jannink, Jean-Luc; Sorrells, Mark E; Brown-Guedira, Gina L; Mitchell Fetch, Jennifer W; Harrison, Stephen A; Howarth, Catherine J; Ibrahim, Amir; Kolb, Frederic L; McMullen, Michael S; Murphy, J Paul; Ohm, Herbert W; Rossnagel, Brian G; Yan, Weikai; Miclaus, Kelci J; Hiller, Jordan; Maughan, Peter J; Redman Hulse, Rachel R; Anderson, Joseph M; Islamovic, Emir; Jackson, Eric W

    2013-01-01

    A physically anchored consensus map is foundational to modern genomics research; however, construction of such a map in oat (Avena sativa L., 2n = 6x = 42) has been hindered by the size and complexity of the genome, the scarcity of robust molecular markers, and the lack of aneuploid stocks. Resources developed in this study include a modified SNP discovery method for complex genomes, a diverse set of oat SNP markers, and a novel chromosome-deficient SNP anchoring strategy. These resources were applied to build the first complete, physically-anchored consensus map of hexaploid oat. Approximately 11,000 high-confidence in silico SNPs were discovered based on nine million inter-varietal sequence reads of genomic and cDNA origin. GoldenGate genotyping of 3,072 SNP assays yielded 1,311 robust markers, of which 985 were mapped in 390 recombinant-inbred lines from six bi-parental mapping populations ranging in size from 49 to 97 progeny. The consensus map included 985 SNPs and 68 previously-published markers, resolving 21 linkage groups with a total map distance of 1,838.8 cM. Consensus linkage groups were assigned to 21 chromosomes using SNP deletion analysis of chromosome-deficient monosomic hybrid stocks. Alignments with sequenced genomes of rice and Brachypodium provide evidence for extensive conservation of genomic regions, and renewed encouragement for orthology-based genomic discovery in this important hexaploid species. These results also provide a framework for high-resolution genetic analysis in oat, and a model for marker development and map construction in other species with complex genomes and limited resources. PMID:23533580

  6. Genome-wide SNP scan of pooled DNA reveals nonsense mutation in FGF20 in the scaleless line of featherless chickens

    PubMed Central

    2012-01-01

    Background: Scaleless (sc/sc) chickens carry a single recessive mutation that causes a lack of almost all body feathers, as well as foot scales and spurs, due to a failure of skin patterning during embryogenesis. This spontaneous mutant line, first described in the 1950s, has been used extensively to explore the tissue interactions involved in ectodermal appendage formation in embryonic skin. Moreover, the trait is potentially useful in tropical agriculture due to the ability of featherless chickens to tolerate heat, which is at present a major constraint to efficient poultry meat production in hot climates. In the interests of enhancing our understanding of feather placode development, and to provide the poultry industry with a strategy to breed heat-tolerant meat-type chickens (broilers), we mapped and identified the sc mutation. Results: Through a cost-effective and labour-efficient SNP array mapping approach using DNA from sc/sc and sc/+ blood sample pools, we map the sc trait to chromosome 4 and show that a nonsense mutation in FGF20 is completely associated with the sc/sc phenotype. This mutation, common to all sc/sc individuals and absent from wild type, is predicted to lead to loss of a highly conserved region of the FGF20 protein important for FGF signalling. In situ hybridisation and quantitative RT-PCR studies reveal that FGF20 is epidermally expressed during the early stages of feather placode patterning. In addition, we describe a dCAPS genotyping assay based on the mutation, developed to facilitate discrimination between wild type and sc alleles. Conclusions: This work represents the first loss of function genetic evidence supporting a role for FGF ligand signalling in feather development, and suggests FGF20 as a novel central player in the development of vertebrate skin appendages, including hair follicles and exocrine glands. In addition, this is to our knowledge the first report describing the use of the chicken SNP array to map genes based on

  7. Genome-wide association study for behavior, type traits, and muscular development in Charolais beef cattle.

    PubMed

    Vallée, A; Daures, J; van Arendonk, J A M; Bovenhuis, H

    2016-06-01

    Behavior, type traits, and muscular development are of interest for beef cattle breeding. Genome-wide association studies (GWAS) enable the identification of candidate genes, which enables gene-based selection and provides insight in the genetic architecture of these traits. The objective of the current study was to perform a GWAS for 3 behavior traits, 12 type traits, and muscular development in Charolais cattle. Behavior traits, including aggressiveness at parturition, aggressiveness during gestation period, and maternal care, were scored by farmers. Type traits, including udder conformation, teat, feet and legs, and locomotion, were scored by trained classifiers. Data used in the GWAS consisted of 3,274 cows with phenotypic records and genotyping information for 44,930 SNP. When SNP had a false discovery rate (FDR) smaller than 0.05, they were referred to as significant. When SNP had a FDR between 0.05 and 0.20, they were referred to as suggestive. Four significant and 12 suggestive regions were detected for aggressiveness during gestation, maternal care, udder balance, teat thinness, teat length, foot angle, foot depth, and locomotion. These 4 significant and 12 suggestive regions were not supported by other significant SNP in close proximity. No SNP with major effects were detected for behavior and type traits, and SNP associations for these traits were spread across the genome, suggesting that behavior and type traits were influenced by many genes, each explaining a small part of genetic variance. The GWAS identified 1 region on chromosome 2 significantly associated with muscular development, which included the myostatin gene, which is known to affect muscularity. No other regions associated with muscular development were found. Results showed that the myostatin region associated with muscular development had pleiotropic effects on udder volume, teat thinness, rear leg, and leg angle. PMID:27285908
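
    The significant and suggestive labels in this record depend on a false discovery rate computed for each SNP. The sketch below uses the Benjamini-Hochberg adjustment, which is an assumption because the record does not name the FDR method, and then applies the two thresholds stated in the abstract (below 0.05, and 0.05 to 0.20). The function names and p-values are hypothetical.

```python
from typing import List


def bh_adjust(pvalues: List[float]) -> List[float]:
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def classify(fdr: float) -> str:
    """Apply the thresholds quoted in the abstract."""
    if fdr < 0.05:
        return "significant"
    if fdr <= 0.20:
        return "suggestive"
    return "not significant"


if __name__ == "__main__":
    pvals = [0.001, 0.02, 0.04, 0.5]  # hypothetical per-SNP p-values
    for p, q in zip(pvals, bh_adjust(pvals)):
        print(p, round(q, 4), classify(q))
```

    With tens of thousands of SNPs tested per trait, an adjustment of this kind is what keeps the list of flagged regions short.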

  8. SOD2 V16A SNP in the Mitochondrial Targeting Sequence is Associated with Noise Induced Hearing Loss in Chinese Workers

    PubMed Central

    Liu, Yi-Min; Li, Xu-Dong; Guo, Xiao; Liu, Bin; Lin, Ai-Hua; Ding, Yuan-Lin; Rao, Shao-Qi

    2010-01-01

    Objective: To investigate whether single nucleotide polymorphisms (SNPs) in the Mn-superoxide dismutase gene (SOD2) underlie the susceptibility to noise-induced hearing loss (NIHL). Methods: Audiometric data from 2400 Chinese Han workers who were exposed to occupational noise were analyzed. DNA samples were collected from the 10% most susceptible and the 10% most resistant individuals, and five SNPs (SOD2 rs2842980, rs5746136, rs2758331, rs4880 and rs5746092) were genotyped by Taqman SNP Genotyping Kits. The SNP main effects and interactions between noise exposure and SNP were analyzed using logistic regression. Haplotypes were analyzed by using Haploview software. Results: The CT genotype of rs4880 (SOD2 V16A SNP) was associated with a higher risk of NIHL (covariates-adjusted OR, 2.18; 95% CI, 1.34–3.54, P = 0.002). Haplotype analysis revealed that the frequency of AGCCG at the five SNP loci was significantly higher in the susceptible group (P = 0.020). With AGCTG as the reference, the OR (95% CI) was 2.63 (1.14, 6.06). The rs4880 polymorphisms imposed larger effects when the carriers were exposed to higher levels of noise, indicating the interaction between SNP and noise exposure. Conclusions: Our results suggest that SOD2 V16A SNP in the mitochondrial targeting sequence is associated with noise induced hearing loss in Chinese workers, and this effect was enhanced by higher levels of noise exposure. PMID:20534900
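
    An odds ratio with a 95% confidence interval, the form in which this record reports its results, can be illustrated with a short calculation from a 2x2 table. This sketch gives the unadjusted odds ratio with a Wald interval on the log scale. The values in the abstract are covariate-adjusted estimates from logistic regression, so they would not be reproduced by this calculation. The function name and counts are hypothetical.

```python
import math
from typing import Tuple


def odds_ratio_wald(a: int, b: int, c: int, d: int,
                    z: float = 1.96) -> Tuple[float, float, float]:
    """Unadjusted OR and Wald CI from a 2x2 table.

    a: cases with the genotype      b: cases without it
    c: controls with the genotype   d: controls without it
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


if __name__ == "__main__":
    # Hypothetical counts: susceptible vs resistant workers, carriers vs non-carriers.
    or_, lo, hi = odds_ratio_wald(20, 10, 10, 20)
    print(round(or_, 2), round(lo, 2), round(hi, 2))
```

    An interval that excludes 1.0, as in the reported 1.34–3.54, is what supports calling the association statistically significant at the 5% level.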

  9. Identification of genes with nonsynonymous SNP in Jeju horse by whole-genome resequencing reveals a functional role for immune response.

    PubMed

    Lee, J-H; Song, K-D; Kim, J-M; Leem, H-K; Park, K-D

    2016-03-01

    Jeju horse (Natural Monument number 347) is a breed of horse that has experienced long-term isolation and domestication in Jeju Island, South Korea. We evaluated genetic features of this breed, including SNP, by whole-genome resequencing using an Illumina HiSeq 2000. A total of 5,986,852 SNP were identified in 4 Jeju horses and were divided into homozygous and heterozygous SNP (2,357,099 and 3,629,753 SNP, respectively). It revealed that 63.8% of these SNP resided in intergenic regions. Immune response genes with nonsynonymous SNP were overrepresented in Jeju horses as evidenced by Gene Ontology clustering. Among these genes, Toll-like receptors (TLR) are highly enriched. Comparing TLR genes between Jeju horses and the Przewalski's horse, and genes showed "possibly damaging" mutations in several regions by analysis with PolyPhen-2. These results provide a framework for further genetic studies in Jeju horse by domestication. Furthermore, research on functions of SNP-associated genes would aid in understanding the molecular genetic variation of horse breeds. PMID:27065251

  10. Blood Type Influences Pancreatic Cancer Risk | Division of Cancer Prevention

    Cancer.gov

    A variation in the gene that determines ABO blood type influences the risk of pancreatic cancer, according to the results of the first genome-wide association study (GWAS) for this highly lethal disease. The genetic variation, a single nucleotide polymorphism (SNP), was discovered in a region of chromosome 9 that harbors the gene that determines blood type, the researchers reported August 2 online in Nature Genetics.

  11. Rapid Typing of Coxiella burnetii

    PubMed Central

    Georgia, Shalamar M.; Kachur, Sergey; Birdsell, Dawn N.; Hilsabeck, Remy; Gates, Lauren T.; Samuel, James E.; Heinzen, Robert A.; Kersh, Gilbert J.; Keim, Paul; Massung, Robert F.; Pearson, Talima

    2011-01-01

    Coxiella burnetii has the potential to cause serious disease and is highly prevalent in the environment. Despite this, epidemiological data are sparse and isolate collections are typically small, rare, and difficult to share among laboratories as this pathogen is governed by select agent rules and fastidious to culture. With the advent of whole genome sequencing, some of this knowledge gap has been overcome by the development of genotyping schemes; however, many of these methods are cumbersome and not readily transferable between institutions. As comparisons of the few existing collections can dramatically increase our knowledge of the evolution and phylogeography of the species, we aimed to facilitate such comparisons by extracting SNP signatures from past genotyping efforts and then incorporating these signatures into assays that quickly and easily define genotypes and phylogenetic groups. We found 91 polymorphisms (SNPs and indels) among multispacer sequence typing (MST) loci and designed 14 SNP-based assays that could be used to type samples based on previously established phylogenetic groups. These assays are rapid, inexpensive, real-time PCR assays whose results are unambiguous. Data from these assays allowed us to assign 43 previously untyped isolates to established genotypes and genomic groups. Furthermore, genotyping results based on assays from the signatures provided here are easily transferred between institutions, readily interpreted phylogenetically and simple to adapt to new genotyping technologies. PMID:22073151

  12. Genome-Wide SNP Calling from Genotyping by Sequencing (GBS) Data: A Comparison of Seven Pipelines and Two Sequencing Technologies

    PubMed Central

    Torkamaneh, Davoud; Laroche, Jérôme; Belzile, François

    2016-01-01

    Next-generation sequencing (NGS) has revolutionized plant and animal research in many ways including new methods of high throughput genotyping. Genotyping-by-sequencing (GBS) has been demonstrated to be a robust and cost-effective genotyping method capable of producing thousands to millions of SNPs across a wide range of species. Undoubtedly, the greatest barrier to its broader use is the challenge of data analysis. Herein we describe a comprehensive comparison of seven GBS bioinformatics pipelines developed to process raw GBS sequence data into SNP genotypes. We compared five pipelines requiring a reference genome (TASSEL-GBS v1 & v2, Stacks, IGST, and Fast-GBS) and two de novo pipelines that do not require a reference genome (UNEAK and Stacks). Using Illumina sequence data from a set of 24 re-sequenced soybean lines, we performed SNP calling with these pipelines and compared the GBS SNP calls with the re-sequencing data to assess their accuracy. The number of SNPs called without a reference genome was lower (13k to 24k) than with a reference genome (25k to 54k SNPs) while accuracy was high (92.3 to 98.7%) for all but one pipeline (TASSEL-GBSv1, 76.1%). Among pipelines offering a high accuracy (>95%), Fast-GBS called the greatest number of polymorphisms (close to 35,000 SNPs + Indels) and yielded the highest accuracy (98.7%). Using Ion Torrent sequence data for the same 24 lines, we compared the performance of Fast-GBS with that of TASSEL-GBSv2. It again called more polymorphisms (25.8K vs 22.9K) and these proved more accurate (95.2 vs 91.1%). Typically, SNP catalogues called from the same sequencing data using different pipelines resulted in highly overlapping SNP catalogues (79–92% overlap). In contrast, overlap between SNP catalogues obtained using the same pipeline but different sequencing technologies was less extensive (~50–70%). PMID:27547936
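    The abstract reports percentage overlap between SNP catalogues without stating how overlap was defined. As a rough sketch, assuming each catalogue is a set of (chromosome, position) pairs, the overlap can be expressed relative to either catalogue or as a Jaccard index. The function name and the choice of measures are assumptions, not the authors' procedure.

```python
def catalogue_overlap(catalogue_a, catalogue_b):
    """Summarise the overlap between two SNP catalogues.

    Each catalogue is an iterable of hashable SNP keys, for example
    (chromosome, position) tuples. Fractions are 0.0 when undefined.
    """
    a, b = set(catalogue_a), set(catalogue_b)
    shared = a & b
    union = a | b
    return {
        "shared": len(shared),
        "fraction_of_a": len(shared) / len(a) if a else 0.0,
        "fraction_of_b": len(shared) / len(b) if b else 0.0,
        "jaccard": len(shared) / len(union) if union else 0.0,
    }
```

    Which denominator is appropriate depends on the question: the fraction relative to the smaller catalogue answers "how much of this call set is reproduced", while the Jaccard index penalises calls unique to either pipeline.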

  13. Association of the ARL15 rs6450176 SNP and serum lipid levels in the Jing and Han populations

    PubMed Central

    Sun, Jia-Qi; Yin, Rui-Xing; Shi, Guang-Yuan; Shen, Shao-Wen; Chen, Xia; Bin, Yuan; Huang, Feng; Wang, Wei; Lin, Wei-Xiong; Pan, Shang-Ling

    2015-01-01

    The association of ADP-ribosylation factor-like 15 (ARL15) rs6450176 single nucleotide polymorphism (SNP) and serum lipid profiles has never been studied in the Chinese population. The present study was undertaken to detect the association of ARL15 rs6450176 SNP and several environmental factors with serum lipid levels in the Jing and Han populations. Genotypes of the SNP were determined in 726 unrelated subjects of Jing nationality and 726 participants of Han nationality. The genotypic and allelic frequencies of the SNP in Jing but not in Han were different between males and females (P < 0.001 and P < 0.05; respectively). The G allele carriers in Han had lower serum total cholesterol (TC), low-density lipoprotein cholesterol (LDL-C) and apolipoprotein (Apo) B levels, and higher ApoA1/ApoB ratio than the G allele non-carriers (P < 0.05-0.01). The G allele carriers in Jing had lower serum TC, high-density lipoprotein cholesterol (HDL-C), ApoA1, ApoB levels and higher ApoA1/ApoB ratio than the G allele non-carriers (P < 0.05 for all). Subgroup analyses showed that the G allele carriers had lower TC and LDL-C levels in Han males; lower LDL-C and ApoB levels in Han females; lower ApoB levels and ApoA1/ApoB ratio in Jing males; and lower LDL-C levels in Jing females than the G allele non-carriers (P < 0.05-0.01). Multiple linear regression analysis showed that serum TC, LDL-C, ApoB levels and the ApoA1/ApoB ratio in Han; and TC, HDL-C and ApoA1 levels in Jing were correlated with the genotypes of the ARL15 rs6450176 SNP (P < 0.05-0.001). Serum lipid parameters were also associated with several environmental factors in both ethnic groups. These findings indicated that there may be a racial/ethnic- and/or sex-specific association of the ARL15 rs6450176 SNP and serum lipid levels. PMID:26722494

  14. Genome-Wide SNP Calling from Genotyping by Sequencing (GBS) Data: A Comparison of Seven Pipelines and Two Sequencing Technologies.

    PubMed

    Torkamaneh, Davoud; Laroche, Jérôme; Belzile, François

    2016-01-01

    Next-generation sequencing (NGS) has revolutionized plant and animal research in many ways including new methods of high throughput genotyping. Genotyping-by-sequencing (GBS) has been demonstrated to be a robust and cost-effective genotyping method capable of producing thousands to millions of SNPs across a wide range of species. Undoubtedly, the greatest barrier to its broader use is the challenge of data analysis. Herein we describe a comprehensive comparison of seven GBS bioinformatics pipelines developed to process raw GBS sequence data into SNP genotypes. We compared five pipelines requiring a reference genome (TASSEL-GBS v1 & v2, Stacks, IGST, and Fast-GBS) and two de novo pipelines that do not require a reference genome (UNEAK and Stacks). Using Illumina sequence data from a set of 24 re-sequenced soybean lines, we performed SNP calling with these pipelines and compared the GBS SNP calls with the re-sequencing data to assess their accuracy. The number of SNPs called without a reference genome was lower (13k to 24k) than with a reference genome (25k to 54k SNPs) while accuracy was high (92.3 to 98.7%) for all but one pipeline (TASSEL-GBSv1, 76.1%). Among pipelines offering a high accuracy (>95%), Fast-GBS called the greatest number of polymorphisms (close to 35,000 SNPs + Indels) and yielded the highest accuracy (98.7%). Using Ion Torrent sequence data for the same 24 lines, we compared the performance of Fast-GBS with that of TASSEL-GBSv2. It again called more polymorphisms (25.8K vs 22.9K) and these proved more accurate (95.2 vs 91.1%). Typically, SNP catalogues called from the same sequencing data using different pipelines resulted in highly overlapping SNP catalogues (79-92% overlap). In contrast, overlap between SNP catalogues obtained using the same pipeline but different sequencing technologies was less extensive (~50-70%). PMID:27547936

  15. Software comparison for evaluating genomic copy number variation for Affymetrix 6.0 SNP array platform

    PubMed Central

    2011-01-01

    Background: Copy number data are routinely being extracted from genome-wide association study chips using a variety of software. We empirically evaluated and compared four freely-available software packages designed for Affymetrix SNP chips to estimate copy number: Affymetrix Power Tools (APT), Aroma.Affymetrix, PennCNV and CRLMM. Our evaluation used 1,418 GENOA samples that were genotyped on the Affymetrix Genome-Wide Human SNP Array 6.0. We compared bias and variance in the locus-level copy number data, the concordance amongst regions of copy number gains/deletions and the false-positive rate amongst deleted segments. Results: APT had median locus-level copy numbers closest to a value of two, whereas PennCNV and Aroma.Affymetrix had the smallest variability associated with the median copy number. Of those evaluated, only PennCNV provides copy number specific quality-control metrics and identified 136 poor CNV samples. Regions of copy number variation (CNV) were detected using the hidden Markov models provided within PennCNV and CRLMM/VanillaIce. PennCNV detected more CNVs than CRLMM/VanillaIce; the median number of CNVs detected per sample was 39 and 30, respectively. PennCNV detected most of the regions that CRLMM/VanillaIce did as well as additional CNV regions. The median concordance between PennCNV and CRLMM/VanillaIce was 47.9% for duplications and 51.5% for deletions. The estimated false-positive rate associated with deletions was similar for PennCNV and CRLMM/VanillaIce. Conclusions: If the objective is to perform statistical tests on the locus-level copy number data, our empirical results suggest that PennCNV or Aroma.Affymetrix is optimal. If the objective is to perform statistical tests on the summarized segmented data then PennCNV would be preferred over CRLMM/VanillaIce. Specifically, PennCNV allows the analyst to estimate locus-level copy number, perform segmentation and evaluate CNV-specific quality-control metrics within a single software package.

  16. Joint effect of unlinked genotypes: application to type 2 diabetes in the EPIC-Potsdam case-cohort study.

    PubMed

    Knüppel, Sven; Meidtner, Karina; Arregui, Maria; Holzhütter, Hermann-Georg; Boeing, Heiner

    2015-07-01

    Analyzing multiple single nucleotide polymorphisms (SNPs) is a promising approach to finding genetic effects beyond single-locus associations. We proposed the use of multilocus stepwise regression (MSR) to screen for allele combinations as a method to model joint effects, and compared the results with the often used genetic risk score (GRS), conventional stepwise selection, and the shrinkage method LASSO. In contrast to MSR, the GRS, conventional stepwise selection, and LASSO model each genotype by the risk allele doses. We reanalyzed 20 unlinked SNPs related to type 2 diabetes (T2D) in the EPIC-Potsdam case-cohort study (760 cases, 2193 noncases). No SNP-SNP interactions and no nonlinear effects were found. Two SNP combinations selected by MSR (Nagelkerke's R² = 0.050 and 0.048) included eight SNPs with mean allele combination frequency of 2%. GRS and stepwise selection selected nearly the same SNP combinations consisting of 12 and 13 SNPs (Nagelkerke's R² ranged from 0.020 to 0.029). LASSO showed similar results. The MSR method showed the best model fit measured by Nagelkerke's R², suggesting that further improvement may render this method a useful tool in genetic research. However, our comparison suggests that the GRS is a simple way to model genetic effects since it does not consider linkage, SNP-SNP interactions, or non-linear effects. PMID:25907404
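    The genetic risk score mentioned here models each genotype by its risk-allele dose. A minimal sketch of that idea follows, assuming doses are coded 0, 1 or 2 per SNP; the function name and the optional per-SNP weights are illustrative assumptions rather than details from the study.

```python
def genetic_risk_score(risk_allele_doses, weights=None):
    """Sum of risk-allele doses across SNPs, optionally weighted.

    risk_allele_doses: one value per SNP, each 0, 1 or 2.
    weights: optional per-SNP weights (for example log odds ratios);
             defaults to 1.0 for every SNP, giving a simple allele count.
    """
    doses = list(risk_allele_doses)
    if any(d not in (0, 1, 2) for d in doses):
        raise ValueError("each dose must be 0, 1 or 2")
    if weights is None:
        weights = [1.0] * len(doses)
    if len(weights) != len(doses):
        raise ValueError("weights and doses must have the same length")
    return sum(w * d for w, d in zip(weights, doses))
```

    Because the score is a plain sum over loci, it cannot represent interactions between SNPs or dominance effects, which is the simplification the abstract points to.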

  17. Design and synthesis of the superionic conductor Na10SnP2S12

    PubMed Central

    Richards, William D.; Tsujimura, Tomoyuki; Miara, Lincoln J.; Wang, Yan; Kim, Jae Chul; Ong, Shyue Ping; Uechi, Ichiro; Suzuki, Naoki; Ceder, Gerbrand

    2016-01-01

    Sodium-ion batteries are emerging as candidates for large-scale energy storage due to their low cost and the wide variety of cathode materials available. As battery size and adoption in critical applications increases, safety concerns are resurfacing due to the inherent flammability of organic electrolytes currently in use in both lithium and sodium battery chemistries. Development of solid-state batteries with ionic electrolytes eliminates this concern, while also allowing novel device architectures and potentially improving cycle life. Here we report the computation-assisted discovery and synthesis of a high-performance solid-state electrolyte material: Na10SnP2S12, with room temperature ionic conductivity of 0.4 mS cm−1 rivalling the conductivity of the best sodium sulfide solid electrolytes to date. We also computationally investigate the variants of this compound where tin is substituted by germanium or silicon and find that the latter may achieve even higher conductivity. PMID:26984102

  18. De Novo SNP Discovery in the Scandinavian Brown Bear (Ursus arctos)

    PubMed Central

    Norman, Anita J.; Street, Nathaniel R.; Spong, Göran

    2013-01-01

    Information about relatedness between individuals in wild populations is advantageous when studying evolutionary, behavioural and ecological processes. Genomic data can be used to determine relatedness between individuals either when no prior knowledge exists or to confirm suspected relatedness. Here we present a set of 96 SNPs suitable for inferring relatedness for brown bears (Ursus arctos) within Scandinavia. We sequenced reduced representation libraries from nine individuals throughout the geographic range. With consensus reads containing putative SNPs, we applied strict filtering criteria with the aim of finding only high-quality, highly-informative SNPs. We tested 150 putative SNPs of which 96% were validated on a panel of 68 individuals. Ninety-six of the validated SNPs with the highest minor allele frequency were selected. The final SNP panel includes four mitochondrial markers, two monomorphic Y-chromosome sex-determination markers, three X-chromosome SNPs and 87 autosomal SNPs. From our validation sample panel, we identified two previously known parent-offspring dyads with reasonable accuracy. This panel of SNPs is a promising tool for inferring relatedness in the brown bear population in Scandinavia. PMID:24260529

  19. [Correlations between SNP of LALBA gene and economic traits in Inner Mongolian white cashmere goat].

    PubMed

    Lan, Xian-Yong; Chen, Hong; Tian, Zhi-Quan; Liu, Shao-Qing; Zhang, Yong-Bin; Wang, Xin; Fang, Xing-Tang

    2008-02-01

    PCR-SSCP and DNA sequencing methods were conducted to detect single nucleotide polymorphism of alpha-lactalbumin (LALBA) gene in 452 Inner Mongolian white cashmere goats (IMWC). Correlations between SNP of goat LALBA gene and economic traits, e.g., cashmere yield, cashmere thickness, length and weight, were analyzed. The SSCP in P2 primer locus, which was caused by the point mutation M63868:g.1897T>C in the exon 3 of LALBA gene, was detected. At this locus, the genotype TT and allele T were predominant in the IMWC population, which agreed with Hardy-Weinberg equilibrium. Moreover, there was a significant correlation between polymorphism of goat M63868:g.1897 locus and cashmere yield of IMWC (P=0.017). The individuals with genotype TC had higher cashmere yield than those with genotype TT. Hence, genotype TC of LALBA gene can be used as a molecular marker for breeding superior cashmere yield in goat marker-assisted selection. PMID:18244921
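    Agreement with Hardy-Weinberg equilibrium at a biallelic locus is commonly checked with a chi-square goodness-of-fit test on the three genotype counts. The sketch below shows that standard test; the abstract does not say which test was used, and the function name and any counts passed to it are hypothetical.

```python
import math


def hardy_weinberg_chi_square(n_major_hom, n_het, n_minor_hom):
    """Chi-square test (1 degree of freedom) for Hardy-Weinberg equilibrium.

    Takes observed counts of the two homozygotes and the heterozygote
    and returns (chi_square, p_value).
    """
    n = n_major_hom + n_het + n_minor_hom
    if n == 0:
        raise ValueError("at least one genotype must be observed")
    p = (2 * n_major_hom + n_het) / (2 * n)  # frequency of the first allele
    q = 1.0 - p
    if p == 0.0 or q == 0.0:
        return 0.0, 1.0  # monomorphic locus: nothing to test
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_major_hom, n_het, n_minor_hom)
    chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of chi-square with 1 df: P(Z^2 > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi_square / 2.0))
    return chi_square, p_value
```

    A large p-value means the observed genotype counts are compatible with equilibrium. With small expected counts an exact test is usually preferred over this approximation.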

  20. A high-performance computing toolset for relatedness and principal component analysis of SNP data

    PubMed Central

    Zheng, Xiuwen; Levine, David; Shen, Jess; Gogarten, Stephanie M.; Laurie, Cathy; Weir, Bruce S.

    2012-01-01

    Summary: Genome-wide association studies are widely used to investigate the genetic basis of diseases and traits, but they pose many computational challenges. We developed gdsfmt and SNPRelate (R packages for multi-core symmetric multiprocessing computer architectures) to accelerate two key computations on SNP data: principal component analysis (PCA) and relatedness analysis using identity-by-descent measures. The kernels of our algorithms are written in C/C++ and highly optimized. Benchmarks show the uniprocessor implementations of PCA and identity-by-descent are ∼8–50 times faster than the implementations provided in the popular EIGENSTRAT (v3.0) and PLINK (v1.07) programs, respectively, and can be sped up to 30–300-fold by using eight cores. SNPRelate can analyse tens of thousands of samples with millions of SNPs. For example, our package was used to perform PCA on 55 324 subjects from the ‘Gene-Environment Association Studies’ consortium studies. Availability and implementation: gdsfmt and SNPRelate are available from R CRAN (http://cran.r-project.org), including a vignette. A tutorial can be found at https://www.genevastudy.org/Accomplishments/software. Contact: zhengx@u.washington.edu PMID:23060615

  1. Prioritization of Cancer-Related Genomic Variants by SNP Association Network

    PubMed Central

    Liu, Changning; Xuan, Zhenyu

    2015-01-01

    We have developed a general framework to construct an association network of single nucleotide polymorphisms (SNPs) (SNP association network, SAN) based on the functional interactions of genes located in the flanking regions of SNPs. SAN, which was constructed based on protein–protein interactions in the Human Protein Reference Database (HPRD), showed significantly enriched signals in both linkage disequilibrium (LD) and long-range chromatin interaction (Hi-C). We used this network to further develop two methods for predicting and prioritizing disease-associated genes from genome-wide association studies (GWASs). We found that random walk with restart (RWR) using SAN (RWR-SAN) can greatly improve the prediction of lung-cancer-associated genes compared with RWR using the HPRD network (AUC 0.81 vs 0.66). In a reanalysis of the GWAS dataset of age-related macular degeneration (AMD), SAN could identify more potential AMD-associated genes that were previously ranked lower in the GWAS study. The interactions in SAN could facilitate the study of complex diseases. PMID:25995611
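    Random walk with restart scores every node of a network by its steady-state visiting probability when the walker repeatedly jumps back to a set of seed nodes. The sketch below is a generic version on a small adjacency list, not the RWR-SAN implementation; the function name, the restart probability of 0.3 and the toy graph in the usage note are assumptions.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.3,
                             tol=1e-10, max_iter=1000):
    """Steady-state visiting probabilities of a random walk with restart.

    adjacency: dict mapping each node to a list of neighbour nodes; every
               neighbour must also appear as a key.
    seeds: non-empty collection of nodes the walker restarts from.
    restart: probability of jumping back to the seeds at each step.
    """
    nodes = sorted(adjacency)
    seeds = set(seeds)
    if not seeds or not seeds <= set(nodes):
        raise ValueError("seeds must be a non-empty subset of the nodes")
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {n: restart * p0[n] for n in nodes}
        for u in nodes:
            neighbours = adjacency[u]
            if neighbours:
                # Spread the walking share of u's mass evenly over its edges.
                share = (1.0 - restart) * p[u] / len(neighbours)
                for v in neighbours:
                    nxt[v] += share
            else:
                # Isolated node: send its mass back to the seeds.
                for n in nodes:
                    nxt[n] += (1.0 - restart) * p[u] * p0[n]
        change = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if change < tol:
            break
    return p
```

    On a path graph A-B-C-D with A as the only seed, the scores decrease with distance from A, so ranking nodes by score prioritises those closest to the seed set in the network.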

  2. A Functional SNP in BNC2 Is Associated with Adolescent Idiopathic Scoliosis

    PubMed Central

    Ogura, Yoji; Kou, Ikuyo; Miura, Shigenori; Takahashi, Atsushi; Xu, Leilei; Takeda, Kazuki; Takahashi, Yohei; Kono, Katsuki; Kawakami, Noriaki; Uno, Koki; Ito, Manabu; Minami, Shohei; Yonezawa, Ikuho; Yanagida, Haruhisa; Taneichi, Hiroshi; Zhu, Zezhang; Tsuji, Taichi; Suzuki, Teppei; Sudo, Hideki; Kotani, Toshiaki; Watanabe, Kota; Hosogane, Naobumi; Okada, Eijiro; Iida, Aritoshi; Nakajima, Masahiro; Sudo, Akihiro; Chiba, Kazuhiro; Hiraki, Yuji; Toyama, Yoshiaki; Qiu, Yong; Shukunami, Chisa; Kamatani, Yoichiro; Kubo, Michiaki; Matsumoto, Morio; Ikegawa, Shiro

    2015-01-01

    Adolescent idiopathic scoliosis (AIS) is the most common spinal deformity. We previously conducted a genome-wide association study (GWAS) and detected two loci associated with AIS. To identify additional loci, we extended our GWAS by increasing the number of cohorts (2,109 affected subjects and 11,140 control subjects in total) and conducting a whole-genome imputation. Through the extended GWAS and replication studies using independent Japanese and Chinese populations, we identified a susceptibility locus on chromosome 9p22.2 (p = 2.46 × 10⁻¹³; odds ratio = 1.21). The most significantly associated SNPs were in intron 3 of BNC2, which encodes a zinc finger transcription factor, basonuclin-2. Expression quantitative trait loci data suggested that the associated SNPs have the potential to regulate the BNC2 transcriptional activity and that the susceptibility alleles increase BNC2 expression. We identified a functional SNP, rs10738445 in BNC2, whose susceptibility allele showed both higher binding to a transcription factor, YY1 (yin and yang 1), and higher BNC2 enhancer activity than the non-susceptibility allele. BNC2 overexpression produced body curvature in developing zebrafish in a gene-dosage-dependent manner. Our results suggest that increased BNC2 expression is implicated in the etiology of AIS. PMID:26211971

  3. Genomic relationships computed from either next-generation sequence or array SNP data.

    PubMed

    Pérez-Enciso, M

    2014-04-01

    The use of sequence data in genomic prediction models is a topic of high interest, given the decreasing prices of current 'next'-generation sequencing technologies (NGS) and the theoretical possibility of directly interrogating the genomes for all causal mutations. Here, we compare by simulation how well genetic relationships (G) could be estimated using either NGS or ascertained SNP arrays. DNA sequences were simulated using the coalescence according to two scenarios: a 'cattle' scenario that consisted of a bottleneck followed by a split in two breeds without migration, and a 'pig' model where Chinese introgression into international pig breeds was simulated. We found that introgression results in a large amount of variability across the genome and between individuals, both in differentiation and in diversity. In general, NGS data allowed the most accurate estimates of G, provided enough sequencing depth was available, because shallow NGS (4×) may result in highly distorted estimates of G elements, especially if not standardized by allele frequency. However, high-density genotyping can also result in accurate estimates of G. Given that genotyping is much less noisy than NGS data, it is suggested that specific high-density arrays (~3M SNPs) that minimize the effects of ascertainment could be developed in the population of interest by sequencing the most influential animals and rely on those arrays for implementing genomic selection. PMID:24397314

  4. A study of East Timor variability using the SNPforID 52-plex SNP panel.

    PubMed

    Santos, C; Phillips, C; Fondevila, M; Porras-Hurtado, L; Carracedo, A; Souto, L; Lareu, M V

    2011-01-01

    A set of 52 autosomal single nucleotide polymorphism (SNP) loci was analyzed in 46 unrelated individuals from the East Timor population using the forensic assay previously described by Sanchez et al. (2006) [J.J. Sanchez, C. Phillips, C. Børsting, K. Balogh, M. Bogus, M. Fondevila, C.D. Harrison, E. Musgrave-Brown, A. Salas, D. Syndercombe Court, P.M. Schneider, A. Carracedo, N. Morling, A multiplex assay with 52 single nucleotide polymorphisms for human identification, Electrophoresis 27 (2006) 1713-1724]. Allele frequencies are presented for the 52 SNPs with all loci in Hardy-Weinberg equilibrium for the study population. Comparison with African, European, East Asian and Oceanian populations of the CEPH human genome diversity panel (CEPH-HGDP) revealed significant differences in allele frequency distributions between East Timor and each of the above population groups. Statistical parameters measuring forensic informativeness were also calculated and the values obtained reached comparable levels to those previously described for the other global population groups. This is the first study of variability in these SNPs in an Oceanian population outside of the CEPH-HGDP. PMID:20457102

  5. Primers to amplify SNP markers in Epichloë canadensis (Clavicipitaceae)

    PubMed Central

    Sullivan, Terrence J.; Bultman, Thomas L.; Schoolcraft, Jennifer

    2016-01-01

    Premise of the study: Primers were designed to produce short amplicons containing single-nucleotide polymorphisms (SNPs) in β-tubulin (tubB) and translation elongation factor 1-α (tefA) in Epichloë canadensis (Clavicipitaceae), an endophytic fungus of Elymus canadensis (Poaceae). Methods and Results: Primers to amplify regions of tubB and tefA containing suspected SNPs were designed and tested on individuals from six populations. Two tubB alleles were identified that differed by a single SNP, and three tefA alleles were identified that differed by a combination of two SNPs. All six populations tested were polymorphic for the tefA marker, and three of the populations were also polymorphic for the tubB marker. These primers are also predicted to amplify these regions in 11 additional epichloid species. Conclusions: Primers for short amplicons within tubB and tefA genes can be used to successfully genotype E. canadensis, making them useful markers for population genetic or landscape genomic studies. PMID:27011893

  6. SNP Selection in Genome-Wide Association Studies via Penalized Support Vector Machine with MAX Test

    PubMed Central

    Kim, Jinseog; Kim, Dennis (Dong Hwan); Jung, Sin-Ho

    2013-01-01

    One of the main objectives of a genome-wide association study (GWAS) is to develop a prediction model for a binary clinical outcome using single-nucleotide polymorphisms (SNPs) which can be used for diagnostic and prognostic purposes and for better understanding of the relationship between the disease and SNPs. Penalized support vector machine (SVM) methods have been widely used toward this end. However, since investigators often ignore the genetic models of SNPs, a final model results in a loss of efficiency in prediction of the clinical outcome. In order to overcome this problem, we propose a two-stage method such that the genetic models of each SNP are identified using the MAX test and then a prediction model is fitted using a penalized SVM method. We apply the proposed method to various penalized SVMs and compare the performance of SVMs using various penalty functions. The results from simulations and real GWAS data analysis show that the proposed method performs better than the prediction methods ignoring the genetic models in terms of prediction power and selectivity. PMID:24174989

  7. Purifying selection shapes the coincident SNP distribution of primate coding sequences.

    PubMed

    Chen, Chia-Ying; Hung, Li-Yuan; Wu, Chan-Shuo; Chuang, Trees-Juen

    2016-01-01

    Genome-wide analysis has observed an excess of coincident single nucleotide polymorphisms (coSNPs) at human-chimpanzee orthologous positions, and suggested that this is due to cryptic variation in the mutation rate. While this phenomenon primarily corresponds with non-coding coSNPs, the situation in coding sequences remains unclear. Here we calculate the observed-to-expected ratio of coSNPs (coSNPO/E) to estimate the prevalence of human-chimpanzee coSNPs, and show that the excess of coSNPs is also present in coding regions. Intriguingly, coSNPO/E is much higher at zero-fold than at nonzero-fold degenerate sites; such a difference is due to an elevation of coSNPO/E at zero-fold degenerate sites, rather than a reduction at nonzero-fold degenerate ones. These trends are independent of chimpanzee subpopulation, population size, or sequencing techniques; and hold in broad generality across primates. We find that this discrepancy cannot be fully explained by sequence contexts, shared ancestral polymorphisms, SNP density, or recombination rate, and that coSNPO/E in coding sequences is significantly influenced by purifying selection. We also show that selection and mutation rate affect coSNPO/E independently, and coSNPs tend to be less damaging and more correlated with human diseases than non-coSNPs. These findings suggest that coSNPs may represent a "signature" during primate protein evolution. PMID:27255481

  8. Purifying selection shapes the coincident SNP distribution of primate coding sequences

    PubMed Central

    Chen, Chia-Ying; Hung, Li-Yuan; Wu, Chan-Shuo; Chuang, Trees-Juen

    2016-01-01

    Genome-wide analysis has observed an excess of coincident single nucleotide polymorphisms (coSNPs) at human-chimpanzee orthologous positions, and suggested that this is due to cryptic variation in the mutation rate. While this phenomenon primarily corresponds with non-coding coSNPs, the situation in coding sequences remains unclear. Here we calculate the observed-to-expected ratio of coSNPs (coSNPO/E) to estimate the prevalence of human-chimpanzee coSNPs, and show that the excess of coSNPs is also present in coding regions. Intriguingly, coSNPO/E is much higher at zero-fold than at nonzero-fold degenerate sites; such a difference is due to an elevation of coSNPO/E at zero-fold degenerate sites, rather than a reduction at nonzero-fold degenerate ones. These trends are independent of chimpanzee subpopulation, population size, or sequencing techniques; and hold in broad generality across primates. We find that this discrepancy cannot be fully explained by sequence contexts, shared ancestral polymorphisms, SNP density, or recombination rate, and that coSNPO/E in coding sequences is significantly influenced by purifying selection. We also show that selection and mutation rate affect coSNPO/E independently, and coSNPs tend to be less damaging and more correlated with human diseases than non-coSNPs. These findings suggest that coSNPs may represent a “signature” during primate protein evolution. PMID:27255481

  9. SNP-guided identification of monoallelic DNA-methylation events from enrichment-based sequencing data.

    PubMed

    Steyaert, Sandra; Van Criekinge, Wim; De Paepe, Ayla; Denil, Simon; Mensaert, Klaas; Vandepitte, Katrien; Vanden Berghe, Wim; Trooskens, Geert; De Meyer, Tim

    2014-11-10

    Monoallelic gene expression is typically initiated early in the development of an organism. Dysregulation of monoallelic gene expression has already been linked to several non-Mendelian inherited genetic disorders. In humans, DNA-methylation is deemed to be an important regulator of monoallelic gene expression, but only a few examples are known. One important reason is that current, cost-affordable truly genome-wide methods to assess DNA-methylation are based on sequencing post-enrichment. Here, we present a new methodology based on classical population genetic theory, i.e. the Hardy-Weinberg theorem, that combines methylomic data from MethylCap-seq with associated SNP profiles to identify monoallelically methylated loci. Applied to 334 MethylCap-seq samples of very diverse origin, this resulted in the identification of 80 genomic regions characterized by monoallelic DNA-methylation. Of these 80 loci, 49 are located in genic regions of which 25 have already been linked to imprinting. Further analysis revealed statistically significant enrichment of these loci in promoter regions, further establishing the relevance and usefulness of the method. Additional validation was done using both 14 whole-genome bisulfite sequencing data sets and 16 mRNA-seq data sets. Importantly, the developed approach can be easily applied to other enrichment-based sequencing technologies, like the ChIP-seq-based identification of monoallelic histone modifications. PMID:25237057

  10. Development and application of a novel genome-wide SNP array reveals domestication history in soybean

    PubMed Central

    Wang, Jiao; Chu, Shanshan; Zhang, Huairen; Zhu, Ying; Cheng, Hao; Yu, Deyue

    2016-01-01

    Domestication of soybeans occurred under intense human-directed selection aimed at developing high-yielding lines. Tracing the domestication history and identifying the genes underlying soybean domestication require further exploration. Here, we developed a high-throughput NJAU 355 K SoySNP array and used this array to study the genetic variation patterns in 367 soybean accessions, including 105 wild soybeans and 262 cultivated soybeans. The population genetic analysis suggests that cultivated soybeans have tended to originate from northern and central China, from where they spread to other regions, accompanied by a gradual increase in seed weight. Genome-wide scanning for evidence of artificial selection revealed signs of selective sweeps involving genes controlling domestication-related agronomic traits including seed weight. To further identify genomic regions related to seed weight, a genome-wide association study (GWAS) was conducted across multiple environments in wild and cultivated soybeans. As a result, a strong linkage disequilibrium region on chromosome 20 was found to be significantly correlated with seed weight in cultivated soybeans. Collectively, these findings should provide an important basis for genomic-enabled breeding and advance the study of functional genomics in soybean. PMID:26856884

  11. Genome-wide SNP analysis explains coral diversity and recovery in the Ryukyu Archipelago

    PubMed Central

    Shinzato, Chuya; Mungpakdee, Sutada; Arakaki, Nana; Satoh, Noriyuki

    2015-01-01

    Following a global coral bleaching event in 1998, Acropora corals surrounding most of Okinawa island (OI) were devastated, although they are now gradually recovering. In contrast, the Kerama Islands (KIs) only 30 km west of OI, have continuously hosted a great variety of healthy corals. Taking advantage of the decoded Acropora digitifera genome and using genome-wide SNP analyses, we clarified Acropora population structure in the southern Ryukyu Archipelago (sRA). Despite small genetic distances, we identified distinct clusters corresponding to specific island groups, suggesting infrequent long-distance dispersal within the sRA. Although the KIs were believed to supply coral larvae to OI, admixture analyses showed that such dispersal is much more limited than previously realized, indicating independent recovery of OI coral populations and the necessity of local conservation efforts for each region. We detected strong historical migration from the Yaeyama Islands (YIs) to OI, and suggest that the YIs are the original source of OI corals. In addition, migration edges to the KIs suggest that they are a historical sink population in the sRA, resulting in high diversity. This population genomics study provides the highest resolution data to date regarding coral population structure and history. PMID:26656261

  12. Design and synthesis of the superionic conductor Na10SnP2S12.

    PubMed

    Richards, William D; Tsujimura, Tomoyuki; Miara, Lincoln J; Wang, Yan; Kim, Jae Chul; Ong, Shyue Ping; Uechi, Ichiro; Suzuki, Naoki; Ceder, Gerbrand

    2016-01-01

    Sodium-ion batteries are emerging as candidates for large-scale energy storage due to their low cost and the wide variety of cathode materials available. As battery size and adoption in critical applications increase, safety concerns are resurfacing due to the inherent flammability of organic electrolytes currently in use in both lithium and sodium battery chemistries. Development of solid-state batteries with ionic electrolytes eliminates this concern, while also allowing novel device architectures and potentially improving cycle life. Here we report the computation-assisted discovery and synthesis of a high-performance solid-state electrolyte material, Na10SnP2S12, with a room-temperature ionic conductivity of 0.4 mS cm⁻¹, rivalling the conductivity of the best sodium sulfide solid electrolytes to date. We also computationally investigate the variants of this compound where tin is substituted by germanium or silicon and find that the latter may achieve even higher conductivity. PMID:26984102

  13. Development and application of a novel genome-wide SNP array reveals domestication history in soybean.

    PubMed

    Wang, Jiao; Chu, Shanshan; Zhang, Huairen; Zhu, Ying; Cheng, Hao; Yu, Deyue

    2016-01-01

    Domestication of soybeans occurred under intense human-directed selection aimed at developing high-yielding lines. Tracing the domestication history and identifying the genes underlying soybean domestication require further exploration. Here, we developed a high-throughput NJAU 355 K SoySNP array and used this array to study the genetic variation patterns in 367 soybean accessions, including 105 wild soybeans and 262 cultivated soybeans. The population genetic analysis suggests that cultivated soybeans have tended to originate from northern and central China, from where they spread to other regions, accompanied by a gradual increase in seed weight. Genome-wide scanning for evidence of artificial selection revealed signs of selective sweeps involving genes controlling domestication-related agronomic traits, including seed weight. To further identify genomic regions related to seed weight, a genome-wide association study (GWAS) was conducted across multiple environments in wild and cultivated soybeans. As a result, a strong linkage disequilibrium region on chromosome 20 was found to be significantly correlated with seed weight in cultivated soybeans. Collectively, these findings should provide an important basis for genomic-enabled breeding and advance the study of functional genomics in soybean. PMID:26856884

  14. Pharmacogenomics: accessing important alleles by imputation from commercial genome-wide SNP arrays.

    PubMed

    Liboredo, R; Pena, S D J

    2014-01-01

    Personalized medicine is becoming a medical reality, as important genotype-phenotype relationships are being unraveled. The availability of pharmacogenomic data is a key element of individualized care. In this study, we explored genotype imputation as a means to infer important pharmacogenomic alleles from a regular commercially available genome-wide SNP array. Using these arrays as a starting point can reduce testing costs, increasing access to these pharmacogenomic data and still retain a larger amount of genome-wide information. IMPUTE2 and MaCH-Admix were used to perform genotype imputation with a dense reference panel from 1000 Genomes data. We were able to correctly infer genotypes for the warfarin-related loci VKORC1 and CYP2C9 alleles 2, 3, 5, and 11 and also clopidogrel-related CYP2C19 alleles 2 and 17 for a small sample of Brazilian individuals, as well as for HapMap samples. The success of an imputation approach in admixed samples using publicly available reference panels can encourage further imputation initiatives in those populations. PMID:25117329

  15. Comparative analysis of SNP candidates in disparate milk yielding river buffaloes using targeted sequencing

    PubMed Central

    2016-01-01

    River buffalo (Bubalus bubalis) milk plays an important role in the economy and in nutritious diets in several developing countries. However, reliable milk-yield genomic markers and their functional insights remain unexplored. Here, we used a target capture sequencing approach in three economically important buffalo breeds, namely Banni, Jafrabadi and Mehsani, belonging to either a high or a low milk-yield group. Blood samples were collected from a milk-yield/breed-balanced group of 12 buffaloes, and whole exome sequencing was performed using a Roche 454 GS-FLX Titanium sequencer. Using an innovative approach, namely MultiCom, we identified high-quality SNPs specific for high and low milk-yield buffaloes. Almost 70% of the reported genes in QTL regions of milk-yield and milk-fat in cattle were present among the buffalo milk-yield gene candidates. Functional analysis highlighted the transcriptional regulation category in the low milk-yield group, and several new pathways in the two groups. Further, the discovered SNP candidates may account for more than half of mammary transcriptome changes in high versus low milk-yielding cattle. Thus, starting from the design of a reliable strategy, we identified reliable genomic markers specific for high and low milk-yield buffalo breeds and addressed possible downstream effects. PMID:27441113

  16. SNP-guided identification of monoallelic DNA-methylation events from enrichment-based sequencing data

    PubMed Central

    Steyaert, Sandra; Van Criekinge, Wim; De Paepe, Ayla; Denil, Simon; Mensaert, Klaas; Vandepitte, Katrien; Berghe, Wim Vanden; Trooskens, Geert; De Meyer, Tim

    2014-01-01

    Monoallelic gene expression is typically initiated early in the development of an organism. Dysregulation of monoallelic gene expression has already been linked to several non-Mendelian inherited genetic disorders. In humans, DNA-methylation is deemed to be an important regulator of monoallelic gene expression, but only a few examples are known. One important reason is that current, cost-affordable truly genome-wide methods to assess DNA-methylation are based on sequencing post-enrichment. Here, we present a new methodology based on classical population genetic theory, i.e. the Hardy–Weinberg theorem, that combines methylomic data from MethylCap-seq with associated SNP profiles to identify monoallelically methylated loci. Applied to 334 MethylCap-seq samples of very diverse origin, this resulted in the identification of 80 genomic regions characterized by monoallelic DNA-methylation. Of these 80 loci, 49 are located in genic regions, of which 25 have already been linked to imprinting. Further analysis revealed statistically significant enrichment of these loci in promoter regions, further establishing the relevance and usefulness of the method. Additional validation was done using both 14 whole-genome bisulfite sequencing data sets and 16 mRNA-seq data sets. Importantly, the developed approach can be easily applied to other enrichment-based sequencing technologies, like the ChIP-seq-based identification of monoallelic histone modifications. PMID:25237057
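
    A minimal sketch of the Hardy-Weinberg reasoning this abstract refers to; it is not the authors' pipeline. For a biallelic SNP with allele frequencies p and q = 1 - p, the expected genotype frequencies are p², 2pq and q². In enrichment data where only the methylated allele is captured, a monoallelically methylated locus would be expected to show fewer apparent heterozygotes than 2pq predicts. The function names and the genotype counts below are hypothetical, and the chi-square test is one common way to quantify such a deficit, assumed here for illustration.

```python
from math import erfc, sqrt


def hwe_expected(n_aa: int, n_ab: int, n_bb: int) -> tuple[float, float, float]:
    """Expected genotype counts under Hardy-Weinberg equilibrium."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    return (p * p * n, 2 * p * q * n, q * q * n)


def hwe_chi_square(n_aa: int, n_ab: int, n_bb: int) -> tuple[float, float]:
    """Chi-square statistic (1 degree of freedom) and its p-value."""
    observed = (n_aa, n_ab, n_bb)
    expected = hwe_expected(n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of the chi-square distribution with df = 1.
    p_value = erfc(sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts (not from the study): heterozygotes are strongly depleted.
chi2, p_value = hwe_chi_square(n_aa=60, n_ab=5, n_bb=35)
heterozygote_deficit = hwe_expected(60, 5, 35)[1] - 5
print(f"chi2 = {chi2:.1f}, p = {p_value:.2g}, deficit = {heterozygote_deficit:.1f}")
```

    A locus in equilibrium (for example counts of 25, 50 and 25) gives a statistic of zero, so only loci with a marked heterozygote deficit would stand out under this kind of screen.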

  17. Four-copy number intervals in SNP microarray analysis: unique patterns and positions.

    PubMed

    Papenhausen, Peter R; Kelly, Carla A; Zvereff, Val; Schwartz, Stuart

    2014-01-01

    Over the past several years, the utility of microarray technology in delineating copy number changes has become well established. In the past 4 years, we have used the SNP array to detect and analyze allele ratios in 150 cases with 4-copy intervals, confirmed by FISH, offering insight into the underlying mechanisms of formation. These cases may be divided into 5 allele patterns (the first 4 of which involve a single homologue), as detected by the genotyping aspects of the microarray: (1) triplications combining homozygous and heterozygous alleles, with a 3:1 ratio of heterozygotes; (2) triplications with allele patterns combining homozygous and heterozygous alleles, with heterozygote ratios of both 3:1 and 2:2; (3) triplications that have homozygous alleles combined with only 2:2 heterozygous alleles; (4) triplications that are completely homozygous; and (5) homozygous duplications on each homologue with no heterozygous alleles. The implications of copy number variants with diverse allelic segregations are presented in this study. PMID:25401283

  18. Genome-wide SNP analysis explains coral diversity and recovery in the Ryukyu Archipelago.

    PubMed

    Shinzato, Chuya; Mungpakdee, Sutada; Arakaki, Nana; Satoh, Noriyuki

    2015-01-01

    Following a global coral bleaching event in 1998, Acropora corals surrounding most of Okinawa island (OI) were devastated, although they are now gradually recovering. In contrast, the Kerama Islands (KIs) only 30 km west of OI, have continuously hosted a great variety of healthy corals. Taking advantage of the decoded Acropora digitifera genome and using genome-wide SNP analyses, we clarified Acropora population structure in the southern Ryukyu Archipelago (sRA). Despite small genetic distances, we identified distinct clusters corresponding to specific island groups, suggesting infrequent long-distance dispersal within the sRA. Although the KIs were believed to supply coral larvae to OI, admixture analyses showed that such dispersal is much more limited than previously realized, indicating independent recovery of OI coral populations and the necessity of local conservation efforts for each region. We detected strong historical migration from the Yaeyama Islands (YIs) to OI, and suggest that the YIs are the original source of OI corals. In addition, migration edges to the KIs suggest that they are a historical sink population in the sRA, resulting in high diversity. This population genomics study provides the highest resolution data to date regarding coral population structure and history. PMID:26656261

  19. A High-Density SNP and SSR Consensus Map Reveals Segregation Distortion Regions in Wheat

    PubMed Central

    Li, Chunlian; Bai, Guihua; Chao, Shiaoman; Wang, Zhonghua

    2015-01-01

    Segregation distortion is a widespread phenomenon in plant and animal genomes and significantly affects linkage map construction and identification of quantitative trait loci (QTLs). To study segregation distortion in wheat, a high-density consensus map was constructed using single nucleotide polymorphism (SNP) and simple sequence repeat (SSR) markers by merging two genetic maps developed from two recombinant-inbred line (RIL) populations, Ning7840 × Clark and Heyne × Lakin. Chromosome regions with obvious segregation distortion were identified in the map. A total of 3541 SNPs and 145 SSRs were mapped, and the map covered 3258.7 cM in genetic distance with an average interval of 0.88 cM. The number of markers that showed distorted segregation was 490 (18.5%) in the Ning7840 × Clark population and 225 (10.4%) in the Heyne × Lakin population. Most of the distorted markers (630) were mapped in the consensus map, which accounted for 17.1% of mapped markers. The majority of the distorted markers clustered in the segregation distortion regions (SDRs) on chromosomes 1B, 2A, 2B, 3A, 3B, 4B, 5A, 5B, 5D, 6B, 7A, and 7D. All of the markers in a given SDR skewed toward one of the parents, suggesting that gametophytic competition during zygote formation was most likely one of the causes for segregation distortion in the populations. PMID:26601111
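
    The summary figures in this abstract can be checked with simple arithmetic, and distortion at a single marker is commonly assessed with a chi-square test against the 1:1 ratio expected in a RIL population. The abstract does not state which test the authors used, so the test below is an assumption, and the per-marker allele counts are hypothetical.

```python
from math import erfc, sqrt

# Figures quoted in the abstract.
N_SNP, N_SSR = 3541, 145
MAP_LENGTH_CM = 3258.7
N_DISTORTED_MAPPED = 630

n_markers = N_SNP + N_SSR
average_interval_cm = MAP_LENGTH_CM / n_markers
distorted_fraction = N_DISTORTED_MAPPED / n_markers


def segregation_chi_square(n_parent_a: int, n_parent_b: int) -> tuple[float, float]:
    """Chi-square test (df = 1) of one marker against 1:1 segregation."""
    total = n_parent_a + n_parent_b
    expected = total / 2.0
    chi2 = ((n_parent_a - expected) ** 2 + (n_parent_b - expected) ** 2) / expected
    # Survival function of the chi-square distribution with df = 1.
    p_value = erfc(sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical marker: 70 lines carry the parent-A allele, 30 the parent-B allele.
chi2, p_value = segregation_chi_square(70, 30)
print(f"average interval: {average_interval_cm:.2f} cM")
print(f"distorted markers: {distorted_fraction:.1%}")
print(f"chi2 = {chi2:.1f}, p = {p_value:.2g}")
```

    Dividing the map length by the total marker count reproduces the reported average interval and distorted-marker share to the stated precision, which suggests both were computed over all 3686 mapped markers.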

  20. Prospective diagnostic analysis of copy number variants using SNP microarrays in individuals with autism spectrum disorders.

    PubMed

    Nava, Caroline; Keren, Boris; Mignot, Cyril; Rastetter, Agnès; Chantot-Bastaraud, Sandra; Faudet, Anne; Fonteneau, Eric; Amiet, Claire; Laurent, Claudine; Jacquette, Aurélia; Whalen, Sandra; Afenjar, Alexandra; Périsse, Didier; Doummar, Diane; Dorison, Nathalie; Leboyer, Marion; Siffroi, Jean-Pierre; Cohen, David; Brice, Alexis; Héron, Delphine; Depienne, Christel

    2014-01-01

    Copy number variants (CNVs) have repeatedly been found to cause or predispose to autism spectrum disorders (ASDs). For diagnostic purposes, we screened 194 individuals with ASDs for CNVs using Illumina SNP arrays. In several probands, we also analyzed candidate genes located in inherited deletions to unmask autosomal recessive variants. Three CNVs, a de novo triplication of chromosome 15q11-q12 of paternal origin, a deletion on chromosome 9p24 and a de novo 3q29 deletion, were identified as the cause of the disorder in one individual each. An autosomal recessive cause was considered possible in two patients: a homozygous 1p31.1 deletion encompassing PTGER3 and a deletion of the entire DOCK10 gene associated with a rare hemizygous missense variant. We also identified multiple private or recurrent CNVs, the majority of which were inherited from asymptomatic parents. Although highly penetrant CNVs or variants inherited in an autosomal recessive manner were detected in rare cases, our results mainly support the hypothesis that most CNVs contribute to ASDs in association with other CNVs or point variants located elsewhere in the genome. Identification of these genetic interactions in individuals with ASDs constitutes a formidable challenge. PMID:23632794

  1. Prospective diagnostic analysis of copy number variants using SNP microarrays in individuals with autism spectrum disorders

    PubMed Central

    Nava, Caroline; Keren, Boris; Mignot, Cyril; Rastetter, Agnès; Chantot-Bastaraud, Sandra; Faudet, Anne; Fonteneau, Eric; Amiet, Claire; Laurent, Claudine; Jacquette, Aurélia; Whalen, Sandra; Afenjar, Alexandra; Périsse, Didier; Doummar, Diane; Dorison, Nathalie; Leboyer, Marion; Siffroi, Jean-Pierre; Cohen, David; Brice, Alexis; Héron, Delphine; Depienne, Christel

    2014-01-01

    Copy number variants (CNVs) have repeatedly been found to cause or predispose to autism spectrum disorders (ASDs). For diagnostic purposes, we screened 194 individuals with ASDs for CNVs using Illumina SNP arrays. In several probands, we also analyzed candidate genes located in inherited deletions to unmask autosomal recessive variants. Three CNVs, a de novo triplication of chromosome 15q11–q12 of paternal origin, a deletion on chromosome 9p24 and a de novo 3q29 deletion, were identified as the cause of the disorder in one individual each. An autosomal recessive cause was considered possible in two patients: a homozygous 1p31.1 deletion encompassing PTGER3 and a deletion of the entire DOCK10 gene associated with a rare hemizygous missense variant. We also identified multiple private or recurrent CNVs, the majority of which were inherited from asymptomatic parents. Although highly penetrant CNVs or variants inherited in an autosomal recessive manner were detected in rare cases, our results mainly support the hypothesis that most CNVs contribute to ASDs in association with other CNVs or point variants located elsewhere in the genome. Identification of these genetic interactions in individuals with ASDs constitutes a formidable challenge. PMID:23632794

  2. Identification of a functional SNP in the 3'-UTR of caprine MTHFR gene that is associated with milk protein levels.

    PubMed

    An, Xiaopeng; Song, Yuxuan; Hou, Jinxing; Wang, Shan; Gao, Kexin; Cao, Binyun

    2016-08-01

    Xinong Saanen (n = 305) and Guanzhong (n = 317) dairy goats were used to detect SNPs in the caprine MTHFR 3'-UTR by DNA sequencing. One novel SNP (c.*2494G>A) was identified in this region. Individuals with the AA genotype had greater milk protein levels than did those with the GG genotype at the c.*2494G>A locus in both dairy goat breeds (P < 0.05). Functional assays indicated that the MTHFR:c.*2494G>A substitution could increase the binding activity of bta-miR-370 with the MTHFR 3'-UTR. In addition, we observed a significant increase in the MTHFR protein level of AA carriers relative to that of GG carriers. These altered levels of MTHFR protein may account for the association of the SNP with milk protein level. PMID:27062401

  3. Comparative Analysis of CNV Calling Algorithms: Literature Survey and a Case Study Using Bovine High-Density SNP Data

    PubMed Central

    Xu, Lingyang; Hou, Yali; Bickhart, Derek M.; Song, Jiuzhou; Liu, George E.

    2013-01-01

    Copy number variations (CNVs) are gains and losses of genomic sequence between two individuals of a species when compared to a reference genome. The data from single nucleotide polymorphism (SNP) microarrays are now routinely used for genotyping, but they also can be utilized for copy number detection. Substantial progress has been made in array design and CNV calling algorithms and at least 10 comparison studies in humans have been published to assess them. In this review, we first survey the literature on existing microarray platforms and CNV calling algorithms. We then examine a number of CNV calling tools to evaluate their impacts using bovine high-density SNP data. Large incongruities in the results from different CNV calling tools highlight the need for standardizing array data collection, quality assessment and experimental validation. Only after careful experimental design and rigorous data filtering can the impacts of CNVs on both normal phenotypic variability and disease susceptibility be fully revealed.

  4. Regulatory Variants and Disease: The E-Cadherin −160C/A SNP as an Example

    PubMed Central

    Li, Gongcheng; Pan, Tiejun; Guo, Dan

    2014-01-01

    Single nucleotide polymorphisms (SNPs) occurring in noncoding sequences have largely been ignored in genome-wide association studies (GWAS). Yet, mounting evidence suggests that many noncoding SNPs, especially those in the vicinity of protein-coding genes, play important roles in shaping chromatin structure and regulating gene expression and, as such, are implicated in a wide variety of diseases. One such regulatory SNP (rSNP) is the E-cadherin (CDH1) promoter −160C/A SNP (rs16260), which is known to affect E-cadherin promoter transcription by displacing transcription factor binding and has been extensively scrutinized for its association with several diseases, especially malignancies. Findings from studying this SNP highlight the clinical relevance of rSNPs and justify their inclusion in future GWAS to identify novel disease-causing SNPs. PMID:25276428

  5. A single-tube 27-plex SNP assay for estimating individual ancestry and admixture from three continents.

    PubMed

    Wei, Yi-Liang; Wei, Li; Zhao, Lei; Sun, Qi-Fan; Jiang, Li; Zhang, Tao; Liu, Hai-Bo; Chen, Jian-Gang; Ye, Jian; Hu, Lan; Li, Cai-Xia

    2016-01-01

    A single-tube multiplex assay of a small set of ancestry-informative markers (AIMs) for effectively estimating individual ancestry and admixture is an ideal forensic tool to trace the population origin of an unknown DNA sample. We present a newly developed 27-plex single nucleotide polymorphism (SNP) panel with highly robust and balanced differential power to perfectly assign individuals to African, European, and East Asian ancestries. After evaluating 968 previously described intercontinental AIMs from three HapMap population genotyping datasets (Yoruban in Ibadan, Nigeria (YRI); Utah residents with Northern and Western European ancestry from the Centre d'Etude du Polymorphisme Humain (CEPH) collection (CEU); and Han Chinese in Beijing, China (CHB)), we selected the best set of markers on the basis of Hardy-Weinberg equilibrium (p > 0.00001), population-specific allele frequency (two of three δ values > 0.5), linkage disequilibrium (r² < 0.2), and the capability of being multiplexed in one tube and detected by capillary electrophoresis. The 27-SNP panel was first validated by assigning the ancestry of the 11 populations in the HapMap project. Then, we tested the 27-plex SNP assay with 1164 individuals from 17 additional populations. The results demonstrated that the SNP panel was successful for ancestry inference of individuals with African, European, and East Asian ancestry. Furthermore, the system performed well when inferring the admixture of Eurasians (EUR/EAS) after analyzing admixed populations from Xinjiang (Central Asian) as follows: Tajik (68:27), Uyghur (49:46), Kirgiz (40:57), and Kazak (36:60). For individual analyses, we interpreted each sample with a three-ancestry component percentage and a population match probability sequence. This multiplex assay is a convenient and cost-effective tool to assist in criminal investigations, as well as to correct for the effects of population stratification for case-control studies. PMID:25833170
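
    The allele-frequency criterion in this abstract (two of three δ values above 0.5) can be sketched as follows. This assumes δ is the absolute difference in allele frequency between two populations, which is the usual definition but is not spelled out in the abstract. The function names and the frequencies are hypothetical, and the Hardy-Weinberg and linkage disequilibrium filters are not shown.

```python
from itertools import combinations


def pairwise_deltas(freqs: dict[str, float]) -> dict[tuple[str, str], float]:
    """Absolute allele-frequency difference for every pair of populations."""
    return {
        (a, b): abs(freqs[a] - freqs[b])
        for a, b in combinations(sorted(freqs), 2)
    }


def passes_delta_filter(freqs: dict[str, float], threshold: float = 0.5,
                        min_pairs: int = 2) -> bool:
    """True if at least `min_pairs` pairwise deltas exceed `threshold`."""
    deltas = pairwise_deltas(freqs)
    return sum(d > threshold for d in deltas.values()) >= min_pairs


# Hypothetical allele frequencies of two candidate markers in YRI, CEU and CHB.
informative = {"YRI": 0.95, "CEU": 0.10, "CHB": 0.20}
uninformative = {"YRI": 0.55, "CEU": 0.40, "CHB": 0.45}

for name, freqs in (("informative", informative), ("uninformative", uninformative)):
    print(name, passes_delta_filter(freqs))
```

    Requiring two of the three pairwise differences to be large keeps markers that separate one population from the other two, which is the pattern a three-way ancestry panel needs.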

  6. Dominant Genetic Variation and Missing Heritability for Human Complex Traits: Insights from Twin versus Genome-wide Common SNP Models

    PubMed Central

    Chen, Xu; Kuja-Halkola, Ralf; Rahman, Iffat; Arpegård, Johannes; Viktorin, Alexander; Karlsson, Robert; Hägg, Sara; Svensson, Per; Pedersen, Nancy L.; Magnusson, Patrik K.E.

    2015-01-01

    In order to further illuminate the potential role of dominant genetic variation in the “missing heritability” debate, we investigated the additive (narrow-sense heritability, h²) and dominant (δ²) genetic variance for 18 human complex traits. Within the same study base (10,682 Swedish twins), we calculated and compared the estimates from a classic twin-based structural equation model with those from the SNP-based genomic-relatedness-matrix restricted maximum likelihood [GREML(d)] method. Contributions of δ² were evident for 14 traits in twin models (average δ²_twin = 0.25, range 0.14–0.49), two of which also displayed significant δ² in the GREMLd analyses (triglycerides δ²_SNP = 0.28 and waist circumference δ²_SNP = 0.19). On average, the ratio h²_SNP/h²_twin was 70% for ADE-fitted traits (for which the best-fitting model included additive and dominant genetic and unique environmental components) and 31% for AE-fitted traits (for which the best-fitting model included additive genetic and unique environmental components). Independent evidence for contribution from shared environment, also in ADE-fitted traits, was obtained from self-reported within-pair contact frequency and age at separation. We conclude that despite the fact that additive genetics appear to constitute the bulk of genetic influences for most complex traits, dominant genetic variation might often be masked by shared environment in twin and family studies and might therefore have a more prominent role than what family-based estimates often suggest. The risk of erroneously attributing all inherited genetic influences (additive and dominant) to h² in too-small twin studies might also lead to exaggerated “missing heritability” (the proportion of h² that remains unexplained by SNPs). PMID:26544805

  7. Identification of a Sex-Linked SNP Marker in the Salmon Louse (Lepeophtheirus salmonis) Using RAD Sequencing

    PubMed Central

    Taggart, John B.; Christie, Hayden R. L.; Bassett, David I.; Bron, James E.; Skuce, Philip J.; Gharbi, Karim; Skern-Mauritzen, Rasmus; Sturm, Armin

    2013-01-01

    The salmon louse (Lepeophtheirus salmonis (Krøyer, 1837)) is a parasitic copepod that can, if untreated, cause considerable damage to Atlantic salmon (Salmo salar Linnaeus, 1758) and incurs significant costs to the Atlantic salmon mariculture industry. Salmon lice are gonochoristic and normally show sex ratios close to 1:1. While this observation suggests that sex determination in salmon lice is genetic, with only minor environmental influences, the mechanism of sex determination in the salmon louse is unknown. This paper describes the identification of a sex-linked Single Nucleotide Polymorphism (SNP) marker, providing the first evidence for a genetic mechanism of sex determination in the salmon louse. Restriction site-associated DNA sequencing (RAD-seq) was used to isolate SNP markers in a laboratory-maintained salmon louse strain. A total of 85 million raw Illumina 100 base paired-end reads produced 281,838 unique RAD-tags across 24 unrelated individuals. RAD marker Lsa101901 showed complete association with phenotypic sex for all individuals analysed, being heterozygous in females and homozygous in males. Using an allele-specific PCR assay for genotyping, this SNP association pattern was further confirmed for three unrelated salmon louse strains, displaying complete association with phenotypic sex in a total of 96 genotyped individuals. The marker Lsa101901 was located in the coding region of the prohibitin-2 gene, which showed a sex-dependent differential expression, with mRNA levels determined by RT-qPCR about 1.8-fold higher in adult female than adult male salmon lice. This study’s observations of a novel sex-linked SNP marker are consistent with sex determination in the salmon louse being genetic and following a female heterozygous system. Marker Lsa101901 provides a tool to determine the genetic sex of salmon lice, and could be useful in the development of control strategies. PMID:24147087

  8. Pedigree- and SNP-Associated Genetics and Recent Environment are the Major Contributors to Anthropometric and Cardiometabolic Trait Variation

    PubMed Central

    Xia, Charley; Amador, Carmen; Huffman, Jennifer; Trochet, Holly; Campbell, Archie; Porteous, David; Hastie, Nicholas D.; Hayward, Caroline; Vitart, Veronique; Navarro, Pau; Haley, Chris S.

    2016-01-01

    Genome-wide association studies have successfully identified thousands of loci for a range of human complex traits and diseases. The proportion of phenotypic variance explained by significant associations is, however, limited. Given the same dense SNP panels, mixed model analyses capture a greater proportion of phenotypic variance than single SNP analyses but the total is generally still less than the genetic variance estimated from pedigree studies. Combining information from pedigree relationships and SNPs, we examined 16 complex anthropometric and cardiometabolic traits in a Scottish family-based cohort comprising up to 20,000 individuals genotyped for ~520,000 common autosomal SNPs. The inclusion of related individuals provides the opportunity to also estimate the genetic variance associated with pedigree as well as the effects of common family environment. Trait variation was partitioned into SNP-associated and pedigree-associated genetic variation, shared nuclear family environment, shared couple (partner) environment and shared full-sibling environment. Results demonstrate that trait heritabilities vary widely but, on average across traits, SNP-associated and pedigree-associated genetic effects each explain around half the genetic variance. For most traits the recently-shared environment of couples is also significant, accounting for ~11% of the phenotypic variance on average. On the other hand, the environment shared largely in the past by members of a nuclear family or by full-siblings, has a more limited impact. Our findings point to appropriate models to use in future studies as pedigree-associated genetic effects and couple environmental effects have seldom been taken into account in genotype-based analyses. Appropriate description of the trait variation could help understand causes of intra-individual variation and in the detection of contributing loci and environmental factors. PMID:26836320

  9. GACT: a Genome build and Allele definition Conversion Tool for SNP imputation and meta-analysis in genetic association studies

    PubMed Central

    2014-01-01

    Background: Genome-wide association studies (GWAS) have successfully identified genes associated with complex human diseases. Although much of the heritability remains unexplained, combining single nucleotide polymorphism (SNP) genotypes from multiple studies for meta-analysis will increase the statistical power to identify new disease-associated variants. Meta-analysis requires the same allele definition (nomenclature) and genome build among individual studies. Similarly, imputation, commonly used prior to meta-analysis, requires the same consistency. However, the genotypes from various GWAS are generated using different genotyping platforms, arrays or SNP-calling approaches, resulting in use of different genome builds and allele definitions. Incorrect assumptions of identical allele definition among combined GWAS lead to a large portion of discarded genotypes or incorrect association findings. There is no published tool that predicts and converts among all major allele definitions. Results: In this study, we have developed a tool, GACT, which stands for Genome build and Allele definition Conversion Tool, that predicts and inter-converts between any of the common SNP allele definitions and between the major genome builds. In addition, we assessed several factors that may affect imputation quality, and our results indicated that inclusion of singletons in the reference had detrimental effects while ambiguous SNPs had no measurable effect. Unexpectedly, exclusion of genotypes with missing rate > 0.001 (40% of study SNPs) showed no significant decrease of imputation quality (even significantly higher when compared to the imputation with singletons in the reference), especially for rare SNPs. Conclusion: GACT is a new, powerful, and user-friendly tool with both command-line and interactive online versions that can accurately predict, and convert between any of the common allele definitions and between genome builds for genome-wide meta-analysis and imputation of

  10. The rs6983267 SNP and long non-coding RNA CARLo-5 are associated with endometrial carcinoma.

    PubMed

    Zhao, Xiwa; Wei, Xurui; Zhao, Lianmei; Shi, Li; Cheng, Jianxin; Kang, Shan; Zhang, Hui; Zhang, Jun; Li, Li; Zhang, Haibo; Zhao, Wei

    2016-08-01

    The single nucleotide polymorphism (SNP) rs6983267 and cancer-associated region long non-coding RNA (CARLo-5) are associated with various human cancers. This study aimed to investigate the expression of CARLo-5 in endometrial carcinoma (EC) and its relationship with clinicopathological features and patient survival. The association of the rs6983267 SNP with EC risk and its involvement in the regulation of CARLo-5 expression in EC were investigated. The rs6983267 SNP was genotyped by polymerase chain reaction (PCR) and ligase detection reaction in 543 EC patients and 584 controls. The expression of CARLo-5 in 108 EC tissues and 66 normal endometrial tissues (NETs) was determined using quantitative real-time PCR. The genotype and allele distributions of the rs6983267 SNP differed significantly between patients and controls. There was a significant correlation between the rs6983267 genotypes and lymph node metastasis of EC patients (P = 0.026). CARLo-5 expression was significantly higher in EC tissues than in NETs (P < 0.001) and significantly associated with FIGO stage (P = 0.029) and lymph node metastasis (P = 0.030). Patients with high CARLo-5 expression had significantly shorter overall survival than those with low CARLo-5 expression (P = 0.003). The rs6983267 genotype was significantly correlated with CARLo-5 expression (P < 0.05). In conclusion, CARLo-5 was identified as a pro-oncogenic lncRNA that may play an important role in EC progression and represent a prognostic marker for EC. The expression of CARLo-5 was significantly correlated with the rs6983267 genotype associated with increased susceptibility to EC. Environ. Mol. Mutagen. 57:508-515, 2016. © 2016 Wiley Periodicals, Inc. PMID:27432114

  11. Denaturing high performance liquid chromatography: high throughput mutation screening in familial hypertrophic cardiomyopathy and SNP genotyping in motor neurone disease

    PubMed Central

    Yu, B; Sawyer, N A; Caramins, M; Yuan, Z G; Saunderson, R B; Pamphlett, R; Richmond, D R; Jeremy, R W; Trent, R J

    2005-01-01

    Aims: To evaluate the usefulness of denaturing high performance liquid chromatography (DHPLC) as a high throughput tool in: (1) DNA mutation detection in familial hypertrophic cardiomyopathy (FHC), and (2) single nucleotide polymorphism (SNP) discovery and validation in sporadic motor neurone disease (MND). Methods: The coding sequence and intron–exon boundaries of the cardiac β myosin heavy chain gene (MYH7) were screened by DHPLC for mutation identification in 150 unrelated patients diagnosed with FHC. One hundred and forty patients with sporadic MND were genotyped for the A67T SNP in the poliovirus receptor gene. All DHPLC positive signals were confirmed by conventional methods. Results: Mutation screening of MYH7 covered 10 kb with a total of 5700 amplicons, and more than 6750 DHPLC injections were completed within 35 days. The causative mutation was identified in 14% of FHC cases, including seven novel missense mutations (L227V, E328G, K351E, V411I, M435T, E894G, and E927K). Genotyping of the A67T SNP was performed at two different temperatures both in MND cases and 280 controls. This coding SNP was found more frequently in MND cases (13.6%) than in controls (6.8%). Furthermore, 19 and two SNPs were identified in MYH7 and the poliovirus receptor gene, respectively, during DHPLC screening. Conclusions: DHPLC is a high throughput, sensitive, specific, and robust platform for the detection of DNA variants, such as disease causing mutations or SNPs. It enables rapid and accurate screening of large genomic regions. PMID:15858117

  12. Impact of Repetitive Transcranial Magnetic Stimulation on Post-Stroke Dysmnesia and the Role of BDNF Val66Met SNP

    PubMed Central

    Lu, Haitao; Zhang, Tong; Wen, Mei; Sun, Li

    2015-01-01

    Background Little is known about the effects of low-frequency repetitive transcranial magnetic stimulation (rTMS) on dysmnesia and the impact of the brain-derived neurotrophic factor (BDNF) Val66Met single-nucleotide polymorphism (SNP). This study investigated the impact of low-frequency rTMS on post-stroke dysmnesia and the impact of the BDNF Val66Met SNP. Material/Methods Forty patients with post-stroke dysmnesia were prospectively randomized into the rTMS and sham groups. BDNF Val66Met SNP was determined using restriction fragment length polymorphism. Montreal Cognitive Assessment (MoCA), Loewenstein Occupational Therapy of Cognitive Assessment (LOTCA), and Rivermead Behavior Memory Test (RBMT) scores, as well as plasma BDNF concentrations, were measured at baseline and at 3 days and 2 months post-treatment. Results MoCA, LOTCA, and RBMT scores were higher after rTMS. Three days after treatment, BDNF decreased in the rTMS group but increased in the sham group (P<0.05). Two months after treatment, RBMT scores in the rTMS group were higher than in the sham group, but MoCA and LOTCA scores were not. Conclusions Low-frequency rTMS may improve post-stroke memory through various pathways, which may involve polymorphisms and several neural genes, but not through an increase in BDNF levels. PMID:25770310

  13. Genome-wide linkage analysis of a Parkinsonian-pyramidal syndrome pedigree by 500 K SNP arrays.

    PubMed

    Shojaee, Seyedmehdi; Sina, Farzad; Banihosseini, Setareh Sadat; Kazemi, Mohammad Hossein; Kalhor, Reza; Shahidi, Gholam-Ali; Fakhrai-Rad, Hossein; Ronaghi, Mostafa; Elahi, Elahe

    2008-06-01

    Robust SNP genotyping technologies and data analysis programs have encouraged researchers in recent years to use SNPs for linkage studies. Platforms used to date have been 10 K chip arrays, but the possible value of interrogating SNPs at higher densities has been considered. Here, we present a genome-wide linkage analysis by means of a 500 K SNP platform. The analysis was done on a large pedigree affected with Parkinsonian-pyramidal syndrome (PPS), and the results showed linkage to chromosome 22. Sequencing of candidate genes revealed a disease-associated homozygous variation (R378G) in FBXO7. FBXO7 codes for a member of the F-box family of proteins, all of which may have a role in the ubiquitin-proteasome protein-degradation pathway. This pathway has been implicated in various neurodegenerative diseases, and identification of FBXO7 as the causative gene of PPS is expected to shed new light on its role. The performance of the array was assessed and a systematic analysis of the effects of SNP density reduction was performed with the real experimental data. Our results suggest that linkage in our pedigree may have been missed had we used chips containing fewer than 100,000 SNPs across the genome. PMID:18513678

  14. Genome-scale DNA variant analysis and functional validation of a SNP underlying yellow fruit color in wild strawberry.

    PubMed

    Hawkins, Charles; Caruana, Julie; Schiksnis, Erin; Liu, Zhongchi

    2016-01-01

    Fragaria vesca is a species of diploid strawberry being developed as a model for the octoploid garden strawberry. This work sequenced and compared the genomes of three F. vesca accessions: 'Hawaii 4', 'Rügen', and 'Yellow Wonder'. Genome-scale analyses of shared and distinct SNPs among these three accessions have revealed that 'Rügen' and 'Yellow Wonder' are more similar to each other than they are to 'Hawaii 4'. Though all three accessions are inbred seven generations, each accession still possesses extensive heterozygosity, highlighting the inherent differences between individual plants even of the same accession. The identification of the impact of each SNP as well as the large number of Indel markers provides a foundation for locating candidate mutations underlying phenotypic variations among these F. vesca accessions and for mapping new mutations generated through forward genetics screens. Through systematic analysis of SNP variants affecting genes in anthocyanin biosynthesis and regulation, a candidate SNP in FveMYB10 was identified and then functionally confirmed to be responsible for the yellow color fruits made by many F. vesca accessions. As a whole, this study provides further resources for F. vesca and establishes a foundation for linking traits of economic importance to specific genes and variants. PMID:27377763

  15. Integrating Milk Metabolite Profile Information for the Prediction of Traditional Milk Traits Based on SNP Information for Holstein Cows

    PubMed Central

    Melzer, Nina; Wittenburg, Dörte; Repsilber, Dirk

    2013-01-01

    In this study the benefit of metabolome level analysis for the prediction of genetic value of three traditional milk traits was investigated. Our proposed approach consists of three steps: First, milk metabolite profiles are used to predict three traditional milk traits of 1,305 Holstein cows. Two regression methods, both enabling variable selection, are applied to identify important milk metabolites in this step. Second, the prediction of these important milk metabolites from single nucleotide polymorphisms (SNPs) enables the detection of SNPs with significant genetic effects. Finally, these SNPs are used to predict milk traits. The observed precision of predicted genetic values was compared to the results observed for the classical genotype-phenotype prediction using all SNPs or a reduced SNP subset (reduced classical approach). To enable a comparison between SNP subsets, a special invariable evaluation design was implemented. SNPs close to or within known quantitative trait loci (QTL) were determined. This enabled us to determine if detected important SNP subsets were enriched in these regions. The results show that our approach can lead to genetic value prediction, but requires less than 1% of the total number of SNPs (40,317). Significantly more important SNPs in known QTL regions were detected using our approach compared to the reduced classical approach. In conclusion, our approach allows a deeper insight into the associations between the different levels of the genotype-phenotype map (genotype-metabolome, metabolome-phenotype, genotype-phenotype). PMID:23990900

  16. Frequent detection of parental consanguinity in children with developmental disorders by a combined CGH and SNP microarray

    PubMed Central

    2013-01-01

    Background Genomic microarrays have been used as the first-tier cytogenetic diagnostic test for patients with developmental delay/intellectual disability, autism spectrum disorders and/or multiple congenital anomalies. The use of SNP arrays has revealed regions of homozygosity in the genome which can lead to identification of uniparental disomy and parental consanguinity in addition to copy number variations. Consanguinity is associated with an increased risk of birth defects and autosomal recessive disorders. However, the frequency of parental consanguinity in children with developmental disabilities is unknown, and consanguineous couples may not be identified during a doctor’s visit or genetic counseling without microarray. Results We studied 607 proband pediatric patients referred for developmental disorders using a 4 × 180 K array containing both CGH and SNP probes. Using 720, 360, 180, and 90 Mb as the expected sizes of homozygosity for an estimated coefficient of inbreeding (F) of 1/4, 1/8, 1/16, and 1/32, respectively, parental consanguinity was detected in 21 cases (3.46%). Conclusion Parental consanguinity is not uncommon in children with developmental problems in our study population, and can be identified by use of a combined CGH and SNP chromosome microarray. Identification of parental consanguinity in such cases can be important for further diagnostic testing. PMID:24053112
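
    The expected sizes quoted in this abstract are consistent with multiplying the coefficient of inbreeding F by an autosomal genome length of about 2,880 Mb (720 Mb divided by 1/4). The Python sketch below reproduces that arithmetic. The 2,880 Mb constant is inferred from the abstract's own numbers, and the function names are hypothetical; nothing here is taken from the study's actual analysis pipeline.

```python
from fractions import Fraction

# Assumption: autosomal length implied by the abstract (720 Mb at F = 1/4).
AUTOSOMAL_MB = 2880

def expected_homozygosity_mb(f):
    """Expected total homozygous length (Mb) for inbreeding coefficient f."""
    return float(Fraction(f) * AUTOSOMAL_MB)

def estimate_f(observed_roh_mb):
    """Rough estimate of F from the total observed homozygosity in Mb."""
    return observed_roh_mb / AUTOSOMAL_MB

# Reproduce the four thresholds listed in the abstract.
for f in (Fraction(1, 4), Fraction(1, 8), Fraction(1, 16), Fraction(1, 32)):
    print(f, expected_homozygosity_mb(f))

# Detection rate reported in the abstract: 21 of 607 probands.
print(round(21 / 607 * 100, 2))
```

    Under this assumption, a proband with roughly 360 Mb of total homozygosity would correspond to F of about 1/8.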

  17. Genome-scale DNA variant analysis and functional validation of a SNP underlying yellow fruit color in wild strawberry

    PubMed Central

    Hawkins, Charles; Caruana, Julie; Schiksnis, Erin; Liu, Zhongchi

    2016-01-01

    Fragaria vesca is a species of diploid strawberry being developed as a model for the octoploid garden strawberry. This work sequenced and compared the genomes of three F. vesca accessions: ‘Hawaii 4’, ‘Rügen’, and ‘Yellow Wonder’. Genome-scale analyses of shared and distinct SNPs among these three accessions have revealed that ‘Rügen’ and ‘Yellow Wonder’ are more similar to each other than they are to ‘Hawaii 4’. Though all three accessions are inbred seven generations, each accession still possesses extensive heterozygosity, highlighting the inherent differences between individual plants even of the same accession. The identification of the impact of each SNP as well as the large number of Indel markers provides a foundation for locating candidate mutations underlying phenotypic variations among these F. vesca accessions and for mapping new mutations generated through forward genetics screens. Through systematic analysis of SNP variants affecting genes in anthocyanin biosynthesis and regulation, a candidate SNP in FveMYB10 was identified and then functionally confirmed to be responsible for the yellow color fruits made by many F. vesca accessions. As a whole, this study provides further resources for F. vesca and establishes a foundation for linking traits of economic importance to specific genes and variants. PMID:27377763

  18. RAD sequencing yields a high success rate for westslope cutthroat and rainbow trout species-diagnostic SNP assays

    USGS Publications Warehouse

    Stephen J. Amish; Paul A. Hohenlohe; Sally Painter; Robb F. Leary; Muhlfeld, Clint C.; Fred W. Allendorf; Luikart, Gordon

    2012-01-01

    Hybridization with introduced rainbow trout threatens most native westslope cutthroat trout populations. Understanding the genetic effects of hybridization and introgression requires a large set of high-throughput, diagnostic genetic markers to inform conservation and management. Recently, we identified several thousand candidate single-nucleotide polymorphism (SNP) markers based on RAD sequencing of 11 westslope cutthroat trout and 13 rainbow trout individuals. Here, we used flanking sequence for 56 of these candidate SNP markers to design high-throughput genotyping assays. We validated the assays on a total of 92 individuals from 22 populations and seven hatchery strains. Forty-six assays (82%) amplified consistently and allowed easy identification of westslope cutthroat and rainbow trout alleles as well as heterozygote controls. The 46 SNPs will provide high power for early detection of population admixture and improved identification of hybrid and nonhybridized individuals. This technique shows promise as a very low-cost, reliable and relatively rapid method for developing and testing SNP markers for nonmodel organisms with limited genomic resources.

  19. Genetic Variation and Breeding Signature in Mass Selection Lines of the Pacific Oyster (Crassostrea gigas) Assessed by SNP Markers

    PubMed Central

    Zhong, Xiaoxiao; Feng, Dandan; Yu, Hong; Kong, Lingfeng; Li, Qi

    2016-01-01

    In breeding industries, a challenging problem is how to maintain genetic diversity over generations. To investigate genetic variation and identify breeding signatures in mass selected lines of Pacific oyster (Crassostrea gigas), three sixth-generation selected lines and four wild populations were assessed using 103 single nucleotide polymorphism (SNP) markers. The genetic diversity data indicated that the selected lines exhibited a significant reduction in the observed heterozygosity and observed number of alleles per locus compared with the wild populations (P≤0.05), indicating that the selected lines tended to lose genetic diversity relative to the wild populations. The unweighted pair-group method with arithmetic mean (UPGMA) analysis showed that the wild populations and selected lines were not separated into two groups. Using four outlier tests, a total of 17 loci were found to be under selection at two levels. The global outlier detection suggested that 4 common outlier loci were subject to selection using both the hierarchical island model and Bayesian likelihood approaches. At the regional level, 3 SNPs were detected as outliers by at least two outlier tests, and one outlier SNP (CgSNP309) overlapped in the two wild-selected population comparisons. The candidate outlier SNPs provide valuable resources for future association studies in C. gigas. PMID:26954577
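
    The two diversity measures compared in this abstract, observed heterozygosity and observed number of alleles per locus, can be illustrated with a short Python sketch. The function names and the genotype representation are hypothetical, and the genotypes are invented toy data, not values from the study.

```python
def observed_heterozygosity(genotypes):
    """Fraction of called genotypes that are heterozygous at one locus.

    genotypes: list of (allele1, allele2) tuples; None marks a missing call.
    """
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0
    hets = sum(1 for a, b in called if a != b)
    return hets / len(called)

def allele_count(genotypes):
    """Number of distinct alleles observed at one locus."""
    return len({a for g in genotypes if g is not None for a in g})

# Toy data (invented for illustration): one SNP typed in four individuals.
wild = [("A", "G"), ("A", "A"), ("A", "G"), ("G", "G")]
selected = [("A", "A"), ("A", "A"), ("A", "G"), ("A", "A")]
print(observed_heterozygosity(wild), observed_heterozygosity(selected))
```

    In a real comparison these per-locus values would be averaged over all markers and tested for a significant difference between groups.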

  20. MapNext: a software tool for spliced and unspliced alignments and SNP detection of short sequence reads

    PubMed Central

    2009-01-01

    Background Next-generation sequencing technologies provide exciting avenues for studies of transcriptomics and population genomics. There is an increasing need to conduct spliced and unspliced alignments of short transcript reads onto a reference genome and estimate minor allele frequency from sequences of population samples. Results We have designed and implemented MapNext, a software tool for both spliced and unspliced alignments of short sequence reads onto reference sequences, and automated SNP detection using neighbourhood quality standards. MapNext provides four main analyses: (i) unspliced alignment and clustering of reads, (ii) spliced alignment of transcript reads over intron boundaries, (iii) SNP detection and estimation of minor allele frequency from population sequences, and (iv) storage of result data in a database to make it available for more flexible queries and for further analyses. The software tool has been tested using both simulated and real data. Conclusion MapNext is a comprehensive and powerful tool for both spliced and unspliced alignments of short reads and automated SNP detection from population sequences. The simplicity, flexibility and efficiency of MapNext make it a valuable tool for transcriptomic and population genomic research. PMID:19958476
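
    Analysis (iii) in this abstract, estimating minor allele frequency from population sequences, can be illustrated with a short Python sketch. This is not MapNext's implementation: the function name and the input format (a string of base calls observed at one site across samples) are assumptions made for illustration, and the neighbourhood quality filtering mentioned in the abstract is omitted.

```python
from collections import Counter

def minor_allele_frequency(base_calls):
    """Return (minor_allele, frequency) for one site; hypothetical helper.

    base_calls: iterable of base calls, e.g. "AAAGAAGA".
    The minor allele is taken as the second most common base.
    Returns (None, 0.0) when the site is monomorphic or has no valid calls.
    """
    counts = Counter(b.upper() for b in base_calls if b.upper() in "ACGT")
    total = sum(counts.values())
    if total == 0 or len(counts) < 2:
        return None, 0.0
    (_, _), (minor, minor_count) = counts.most_common(2)
    return minor, minor_count / total

# Six A calls and two G calls at one site.
print(minor_allele_frequency("AAAGAAGA"))
```

    Calls other than A, C, G, or T (for example N) are ignored in this sketch rather than counted toward the total.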

  1. Effects of sodium nitroprusside (SNP) pretreatment on UV-B stress tolerance in lettuce (Lactuca sativa L.) seedlings.

    PubMed

    Esringu, Aslıhan; Aksakal, Ozkan; Tabay, Dilruba; Kara, Ayse Aydan

    2016-01-01

    Ultraviolet-B (UV-B) radiation is one of the most important abiotic stress factors that could influence plant growth, development, and productivity. Nitric oxide (NO) is an important plant growth regulator involved in a wide variety of physiological processes. In the present study, the possibility of enhancing UV-B stress tolerance of lettuce seedlings by the exogenous application of sodium nitroprusside (SNP) was investigated. UV-B radiation increased the activities of superoxide dismutase (SOD), catalase (CAT), ascorbate peroxidase (APX), and peroxidase (POD), as well as total phenolic concentrations, antioxidant capacity, and expression of the phenylalanine ammonia lyase (PAL) gene in seedlings, but the combination of SNP pretreatment and UV-B enhanced antioxidant enzyme activities, total phenolic concentrations, antioxidant capacity, and PAL gene expression even more. Moreover, UV-B radiation significantly reduced chlorophyll, carotenoid, gibberellic acid (GA), and indole-3-acetic acid (IAA) contents and increased the contents of abscisic acid (ABA), salicylic acid (SA), malondialdehyde (MDA), hydrogen peroxide (H2O2), and superoxide radical (O2•−) in lettuce seedlings. When SNP pretreatment was combined with UV-B radiation, the reductions in chlorophyll, carotenoid, GA, and IAA contents were alleviated and the contents of ABA, SA, MDA, H2O2, and O2•− decreased in comparison to non-pretreated stressed seedlings. PMID:26330324

  2. Anti-tumor and anti-virus activity of polysaccharides extracted from Sipunculus nudus (SNP) on Hepg2.2.15.

    PubMed

    Su, Jie; Jiang, Linlin; Wu, Jingna; Liu, Zhiyu; Wu, Yuping

    2016-06-01

    Many polysaccharides have biological activities and have been investigated for their antitumor effects. In this study, we investigated the anti-tumor and anti-virus activity of SNP, the water-soluble polysaccharides extracted from Sipunculus nudus, on Hepg2.2.15 cells. Flow cytometry analysis demonstrated that SNP induced dose-dependent apoptosis in Hepg2.2.15 cells. Real-time PCR and Western Blot analysis showed that SNP down-regulated the synthesis of HBsAg and HBV-DNA and enhanced the expression of the pro-apoptosis proteins TNF-α, caspase-3, and Bax, while decreasing the expression of the anti-apoptosis proteins survivin, Bcl-2, and VEGF. These results suggest that SNP suppresses the viability of Hepg2.2.15 cells and could be a novel anti-tumor and anti-HBV agent. PMID:26987430

  3. Association of ATP binding cassette transporter G8 rs4148217 SNP and serum lipid levels in Mulao and Han nationalities

    PubMed Central

    2012-01-01

    Background The association of ATP binding cassette transporter G8 gene (ABCG8) rs4148217 single nucleotide polymorphism (SNP) and serum lipid profiles is still controversial in diverse racial/ethnic groups. Mulao nationality is an isolated minority in China. The aim of this study was to evaluate the association of ABCG8 rs4148217 SNP and several environmental factors with serum lipid levels in the Guangxi Mulao and Han populations. Methods A total of 634 subjects of Mulao nationality and 717 participants of Han nationality were randomly selected from our previous samples. Genotyping of the ABCG8 rs4148217 SNP was performed by polymerase chain reaction and restriction fragment length polymorphism combined with gel electrophoresis, and then confirmed by direct sequencing. Results The genotypic and allelic frequencies of ABCG8 rs4148217 SNP were different between the two nationalities (P < 0.01 for each); the frequency of the A allele was higher in Mulao than in Han. The A allele carriers in Han had lower high-density lipoprotein cholesterol (HDL-C) and apolipoprotein (Apo) A1 levels than the A allele noncarriers (P < 0.05 for each), whereas the A allele carriers in Mulao had lower ApoA1 levels than the A allele noncarriers (P < 0.05). Subgroup analyses showed that the A allele carriers in Han had lower HDL-C and higher triglyceride (TG) levels than the A allele noncarriers in females but not in males (P < 0.05 for each), and the A allele carriers in Mulao had lower ApoA1 levels than the A allele noncarriers in females but not in males (P < 0.05). The levels of TG and HDL-C in Han, and ApoA1 in Mulao were associated with genotypes in females but not in males (P < 0.05-0.01). Serum lipid parameters were also correlated with several environmental factors (P < 0.05-0.001). Conclusions The ABCG8 rs4148217 SNP is associated with serum TG, HDL-C and ApoA1 levels in our study populations, but this association is different between the Mulao and Han

  4. Single nucleotide polymorphism (SNP) variation of wolves (Canis lupus) in Southeast Alaska and comparison with wolves, dogs, and coyotes in North America.

    PubMed

    Cronin, Matthew A; Cánovas, Angela; Bannasch, Danika L; Oberbauer, Anita M; Medrano, Juan F

    2015-01-01

    There is considerable interest in the genetics of wolves (Canis lupus) because of their close relationship to domestic dogs (C. familiaris) and the need for informed conservation and management. This includes wolf populations in Southeast Alaska for which we determined genotypes of 305 wolves at 173662 single nucleotide polymorphism (SNP) loci. After removal of invariant and linked SNP, 123801 SNP were used to quantify genetic differentiation of wolves in Southeast Alaska and wolves, coyotes (C. latrans), and dogs from other areas in North America. There is differentiation of SNP allele frequencies between the species (wolves, coyotes, and dogs), although differentiation is relatively low between some wolf and coyote populations. There are varying levels of differentiation among populations of wolves, including low differentiation of wolves in interior Alaska, British Columbia, and the northern US Rocky Mountains. There is considerable differentiation of SNP allele frequencies of wolves in Southeast Alaska from wolves in other areas. However, wolves in Southeast Alaska are not a genetically homogeneous group and there are comparable levels of genetic differentiation among areas within Southeast Alaska and between Southeast Alaska and other geographic areas. SNP variation and other genetic data are discussed regarding taxonomy and management. PMID:25429025

  5. A Tandem Duplicate of Anti-Müllerian Hormone with a Missense SNP on the Y Chromosome Is Essential for Male Sex Determination in Nile Tilapia, Oreochromis niloticus

    PubMed Central

    Li, Minghui; Sun, Yunlv; Zhao, Jiue; Shi, Hongjuan; Zeng, Sheng; Ye, Kai; Jiang, Dongneng; Zhou, Linyan; Sun, Lina; Tao, Wenjing; Nagahama, Yoshitaka; Kocher, Thomas D.; Wang, Deshou

    2015-01-01

    Variation in the TGF-β signaling pathway is emerging as an important mechanism by which gonadal sex determination is controlled in teleosts. Here we show that amhy, a Y-specific duplicate of the anti-Müllerian hormone (amh) gene, induces male sex determination in Nile tilapia. amhy is a tandem duplicate located immediately downstream of amhΔ-y on the Y chromosome. The coding sequence of amhy was identical to the X-linked amh (amh) except a missense SNP (C/T) which changes an amino acid (Ser/Leu92) in the N-terminal region. amhy lacks 5608 bp of promoter sequence that is found in the X-linked amh homolog. The amhΔ-y contains several insertions and deletions in the promoter region, and even a 5 bp insertion in exonVI that results in a premature stop codon and thus a truncated protein product lacking the TGF-β binding domain. Both amhy and amhΔ-y expression is restricted to XY gonads from 5 days after hatching (dah) onwards. CRISPR/Cas9 knockout of amhy in XY fish resulted in male to female sex reversal, while mutation of amhΔ-y alone could not. In contrast, overexpression of Amhy in XX fish, using a fosmid transgene that carries the amhy/amhΔ-y haplotype or a vector containing amhy ORF under the control of CMV promoter, resulted in female to male sex reversal, while overexpression of AmhΔ-y alone in XX fish could not. Knockout of the anti-Müllerian hormone receptor type II (amhrII) in XY fish also resulted in 100% complete male to female sex reversal. Taken together, these results strongly suggest that the duplicated amhy with a missense SNP is the candidate sex determining gene and amhy/amhrII signal is essential for male sex determination in Nile tilapia. These findings highlight the conserved roles of TGF-β signaling pathway in fish sex determination. PMID:26588702

  6. A Tandem Duplicate of Anti-Müllerian Hormone with a Missense SNP on the Y Chromosome Is Essential for Male Sex Determination in Nile Tilapia, Oreochromis niloticus.

    PubMed

    Li, Minghui; Sun, Yunlv; Zhao, Jiue; Shi, Hongjuan; Zeng, Sheng; Ye, Kai; Jiang, Dongneng; Zhou, Linyan; Sun, Lina; Tao, Wenjing; Nagahama, Yoshitaka; Kocher, Thomas D; Wang, Deshou

    2015-11-01

    Variation in the TGF-β signaling pathway is emerging as an important mechanism by which gonadal sex determination is controlled in teleosts. Here we show that amhy, a Y-specific duplicate of the anti-Müllerian hormone (amh) gene, induces male sex determination in Nile tilapia. amhy is a tandem duplicate located immediately downstream of amhΔ-y on the Y chromosome. The coding sequence of amhy was identical to the X-linked amh (amh) except a missense SNP (C/T) which changes an amino acid (Ser/Leu92) in the N-terminal region. amhy lacks 5608 bp of promoter sequence that is found in the X-linked amh homolog. The amhΔ-y contains several insertions and deletions in the promoter region, and even a 5 bp insertion in exonVI that results in a premature stop codon and thus a truncated protein product lacking the TGF-β binding domain. Both amhy and amhΔ-y expression is restricted to XY gonads from 5 days after hatching (dah) onwards. CRISPR/Cas9 knockout of amhy in XY fish resulted in male to female sex reversal, while mutation of amhΔ-y alone could not. In contrast, overexpression of Amhy in XX fish, using a fosmid transgene that carries the amhy/amhΔ-y haplotype or a vector containing amhy ORF under the control of CMV promoter, resulted in female to male sex reversal, while overexpression of AmhΔ-y alone in XX fish co