Science.gov

Sample records for 124-plex snp typing

  1. SNIT: SNP identification for strain typing

    PubMed Central

    2011-01-01

    With ever-increasing numbers of microbial genomes being sequenced, efficient tools are needed to perform strain-level identification of any newly sequenced genome. Here, we present the SNP identification for strain typing (SNIT) pipeline, a fast and accurate software system that compares a newly sequenced bacterial genome with other genomes of the same species to identify single nucleotide polymorphisms (SNPs) and small insertions/deletions (indels). Based on this information, the pipeline analyzes the polymorphic loci present in all input genomes to identify the genome that has the fewest differences with the newly sequenced genome. Similarly, for each of the other genomes, SNIT identifies the input genome with the fewest differences. Results from five bacterial species show that the SNIT pipeline identifies the correct closest neighbor with 75% to 100% accuracy. The SNIT pipeline is available for download at http://www.bhsai.org/snit.html PMID:21902825
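
    The closest-neighbor step described in this abstract can be illustrated with a minimal sketch. It is not the SNIT implementation: the strain names, locus positions, and allele calls below are hypothetical, and the sketch simply counts differing alleles across shared polymorphic loci and picks the reference genome with the fewest.

```python
def count_differences(query, reference):
    """Count polymorphic loci where the two genomes carry different alleles."""
    return sum(1 for locus, allele in query.items() if reference.get(locus) != allele)


def closest_neighbor(query, references):
    """Return (name, differences) for the reference with the fewest differences."""
    scored = ((name, count_differences(query, alleles)) for name, alleles in references.items())
    return min(scored, key=lambda pair: pair[1])


# Hypothetical allele calls at four polymorphic loci; "-" marks a small deletion.
references = {
    "strain_A": {101: "A", 250: "G", 612: "T", 900: "C"},
    "strain_B": {101: "A", 250: "C", 612: "T", 900: "C"},
    "strain_C": {101: "G", 250: "C", 612: "A", 900: "-"},
}
new_genome = {101: "A", 250: "C", 612: "A", 900: "C"}

best = closest_neighbor(new_genome, references)
print(best)
```

    Ties are resolved here by dictionary order, which is an arbitrary choice made only for this sketch.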

  2. SNP typing on the NanoChip electronic microarray.

    PubMed

    Børsting, Claus; Sanchez, Juan J; Morling, Niels

    2005-01-01

    We describe a single nucleotide polymorphism (SNP) typing protocol developed for the NanoChip electronic microarray. The NanoChip array consists of 100 electrodes covered by a thin hydrogel layer containing streptavidin. An electric current can be applied to one, several, or all electrodes at the same time according to a loading protocol generated by the user. Biotinylated deoxyribonucleic acid (DNA) is directed to the pad(s) via the electronic field(s) and bound to streptavidin in the hydrogel layer. Subsequently, fluorescently labeled reporter oligos and a stabilizer oligo are hybridized to the bound DNA. Base stacking between the short reporter and the longer stabilizer oligo stabilizes the binding of a matching reporter, whereas the binding of a reporter carrying a mismatch in the SNP position will be relatively weak. Thermal stringency is applied to the NanoChip array according to a reader protocol generated by the user and the fluorescent label on the matching reporter is detected.

  3. Imputation of KIR Types from SNP Variation Data

    PubMed Central

    Vukcevic, Damjan; Traherne, James A.; Næss, Sigrid; Ellinghaus, Eva; Kamatani, Yoichiro; Dilthey, Alexander; Lathrop, Mark; Karlsen, Tom H.; Franke, Andre; Moffatt, Miriam; Cookson, William; Trowsdale, John; McVean, Gil; Sawcer, Stephen; Leslie, Stephen

    2015-01-01

    Large population studies of immune system genes are essential for characterizing their role in diseases, including autoimmune conditions. Of key interest are a group of genes encoding the killer cell immunoglobulin-like receptors (KIRs), which have known and hypothesized roles in autoimmune diseases, resistance to viruses, reproductive conditions, and cancer. These genes are highly polymorphic, which makes typing expensive and time consuming. Consequently, despite their importance, KIRs have been little studied in large cohorts. Statistical imputation methods developed for other complex loci (e.g., human leukocyte antigen [HLA]) on the basis of SNP data provide an inexpensive high-throughput alternative to direct laboratory typing of these loci and have enabled important findings and insights for many diseases. We present KIR∗IMP, a method for imputation of KIR copy number. We show that KIR∗IMP is highly accurate and thus allows the study of KIRs in large cohorts and enables detailed investigation of the role of KIRs in human disease. PMID:26430804

  4. [Accurate detection of a case with Angelman syndrome (type 1) using SNP array].

    PubMed

    Shi, Shanshan; Lin, Shaobin; Liao, Yanfen; Li, Weijing

    2016-12-10

    To analyze a case with Angelman syndrome (AS) using single nucleotide polymorphism array (SNP array) and explore its genotype-phenotype correlation. G-banded karyotyping and SNP array were performed on a child featuring congenital malformations, intellectual disability and developmental delay. Mendelian error checking based on the SNP information was used to delineate the parental origin of detected abnormality. Result of the SNP array was validated with fluorescence in situ hybridization (FISH). The SNP array has detected a 6.053 Mb deletion at 15q11.2q13.1 (22,770,421- 28,823,722) which overlapped with the critical region of AS (type 1). The parents of the child showed no abnormal results for G-banded karyotyping, SNP array and FISH analysis, indicating a de novo origin of the deletion. Mendelian error checking based on the SNP information suggested that the 15q11.2q13.1 deletion was of maternal origin. SNP array can accurately define the size, location and parental origin of chromosomal microdeletions, which may facilitate the diagnosis of AS due to 15q11q13 deletion and better understanding of its genotype-phenotype correlation.

  5. Genome-wide SNP typing reveals signatures of population history.

    PubMed

    Hughes, Austin L; Welch, Robert; Puri, Vinita; Matthews, Casey; Haque, Kashif; Chanock, Stephen J; Yeager, Meredith

    2008-07-01

    Single-nucleotide polymorphism (SNP) arrays have become a popular technology for disease-association studies, but they also have potential for studying the genetic differentiation of human populations. Application of the Affymetrix GeneChip Human Mapping 500K Array Set to a population of 102 individuals representing the major ethnic groups in the United States (African, Asian, European, and Hispanic) revealed patterns of gene diversity and genetic distance that reflected population history. We analyzed allelic frequencies at 388,654 autosomal SNP sites that showed some variation in our study population and 10% or fewer missing values. Despite the small size (23-31 individuals) of each subpopulation, there were no fixed differences at any site between any two subpopulations. As expected from the African origin of modern humans, greater gene diversity was seen in Africans than in either Asians or Europeans, and the genetic distance between the Asian and the European populations was significantly lower than that between either of these two populations and Africans. Principal components analysis applied to a correlation matrix among individuals was able to separate completely the major continental groups of humans (Africans, Asians, and Europeans), while Hispanics overlapped all three of these groups. Genes containing two or more markers with extraordinarily high genetic distance between subpopulations were identified as candidate genes for health differences between subpopulations. The results show that, even with modest sample sizes, genome-wide SNP genotyping technologies have great promise for capturing signatures of gene frequency difference between human subpopulations, with applications in areas as diverse as forensics and the study of ethnic health disparities.

  6. Correlation between SNP genotypes and periodontitis in Japanese type II diabetic patients: a preliminary study.

    PubMed

    Damrongrungruang, Teerasak; Ogawa, Hiroshi; Hori-Matsumoto, Sayaka; Minagawa, Kumiko; Hanyu, Osamu; Sone, Hirohito; Miyazaki, Hideo

    2015-05-01

    The present study aims to investigate the correlation between SNP genotype patterns and periodontitis severity in Japanese type II diabetic patients. A cross-sectional study in 43 Japanese diabetic patients with periodontitis was performed. Blood samples were drawn for single nucleotide polymorphism (SNP) analyses and periodontal index (probing pocket depth and clinical attachment level) was subsequently recorded. Twelve functional genes with SNPs that had been shown to be associated with diabetes and/or inflammation were genotyped using a nuclease-mediated SNP-specific ligation method. Subjects with two or more sites with clinical attachment level ≥6 mm and who additionally had one or more sites with pocket depth ≥5 mm were classified as having severe periodontitis. Proportions of risk genotypes/non-risk genotypes between severe and non-severe periodontitis were subsequently compared. A high frequency (21/43 participants, 49%) of adiponectin gene polymorphism (ADIPOQ 45T > G) homozygous risk genotype (TT genotype) was observed in the participants. The frequency of TGF-β1 SNP (29C > T) risk genotype (TT genotype) in severe periodontitis (34%, n = 11) was significantly higher than in non-severe periodontitis (0%, n = 0) (p = 0.04). Our study suggests that TGF-β1 SNPs (29C > T) may be used as one of the risk indicators for severe periodontitis in Japanese diabetic patients.

  7. SNP Arrays

    PubMed Central

    Louhelainen, Jari

    2016-01-01

    The papers published in this Special Issue “SNP arrays” (Single Nucleotide Polymorphism Arrays) focus on several perspectives associated with arrays of this type. The papers range from a case report to reviews, thereby targeting wider audiences working in this field. The research focus of SNP arrays is often human cancers but this Issue expands that focus to include areas such as rare conditions, animal breeding and bioinformatics tools. Given the limited scope, the spectrum of papers is nothing short of remarkable and even from a technical point of view these papers will contribute to the field at a general level. Three of the papers published in this Special Issue focus on the use of various SNP array approaches in the analysis of three different cancer types. Two of the papers concentrate on two very different rare conditions, applying the SNP arrays slightly differently. Finally, two other papers evaluate the use of the SNP arrays in the context of genetic analysis of livestock. The findings reported in these papers help to close gaps in the current literature and also to give guidelines for future applications of SNP arrays. PMID:27792140

  8. Comparative performance of SNP typing and 'Bruce-ladder' in the discrimination of Brucella suis and Brucella canis.

    PubMed

    Koylass, Mark S; King, Amanda C; Edwards-Smallbone, James; Gopaul, Krishna K; Perrett, Lorraine L; Whatmore, Adrian M

    2010-05-19

    Two novel molecular assays, 'Bruce-ladder' and SNP typing, were recently described for differentiating isolates of the genus Brucella, the causative organisms of the significant zoonotic disease brucellosis, at the species level. Differentiation of Brucella canis from Brucella suis by molecular approaches can be difficult, and here we compare the performance of 'Bruce-ladder' and SNP typing in correctly identifying B. canis isolates. Both assays proved easy to perform, but while 'Bruce-ladder' misidentified a substantial proportion of B. canis isolates as B. suis, all B. canis isolates were correctly identified by SNP typing. Crown Copyright 2009. Published by Elsevier B.V. All rights reserved.

  9. SNP typing reveals similarity in Mycobacterium tuberculosis genetic diversity between Portugal and Northeast Brazil.

    PubMed

    Lopes, Joao S; Marques, Isabel; Soares, Patricia; Nebenzahl-Guimaraes, Hanna; Costa, Joao; Miranda, Anabela; Duarte, Raquel; Alves, Adriana; Macedo, Rita; Duarte, Tonya A; Barbosa, Theolis; Oliveira, Martha; Nery, Joilda S; Boechat, Neio; Pereira, Susan M; Barreto, Mauricio L; Pereira-Leal, Jose; Gomes, Maria Gabriela Miranda; Penha-Goncalves, Carlos

    2013-08-01

    Human tuberculosis is an infectious disease caused by bacteria from the Mycobacterium tuberculosis complex (MTBC). Although spoligotyping and MIRU-VNTR are standard methodologies in MTBC genetic epidemiology, recent studies suggest that Single Nucleotide Polymorphisms (SNPs) are advantageous in phylogenetics and strain group/lineage identification. In this work we use a set of 79 SNPs to characterize 1987 MTBC isolates from Portugal and 141 from Northeast Brazil. All Brazilian samples were further characterized using spoligotyping. Phylogenetic analysis against a reference set revealed that about 95% of the isolates in both populations are singly attributed to bacterial lineage 4. Within this lineage, the most frequent strain groups in both Portugal and Brazil are LAM, followed by Haarlem and X. Contrary to these groups, strain group T showed a very different prevalence between Portugal (10%) and Brazil (1.5%). Spoligotype identification shows about 10% of mismatches compared to the use of SNPs, and a little more than 1% of strains could not be identified. The mismatches are observed in the most represented groups of our sample set (i.e., LAM and Haarlem) in almost the same proportion. Besides being more accurate in identifying strain groups/lineages, SNP typing can also provide phylogenetic relationships between strain groups/lineages and, thus, indicate cases showing phylogenetic incongruence. Overall, the use of SNP typing revealed striking similarities between MTBC populations from Portugal and Brazil.

  10. A comparison of two informative SNP-based strategies for typing Pseudomonas aeruginosa isolates from patients with cystic fibrosis

    PubMed Central

    2014-01-01

    Background: Molecular typing is integral for identifying Pseudomonas aeruginosa strains that may be shared between patients with cystic fibrosis (CF). We conducted a side-by-side comparison of two P. aeruginosa genotyping methods utilising informative single nucleotide polymorphism (SNP) methods; one targeting 10 P. aeruginosa SNPs and using real-time polymerase chain reaction technology (HRM10SNP) and the other targeting 20 SNPs and based on the Sequenom MassARRAY platform (iPLEX20SNP). Methods: An in-silico analysis of the 20 SNPs used for the iPLEX20SNP method was initially conducted using sequence type (ST) data on the P. aeruginosa PubMLST website. A total of 506 clinical isolates collected from patients attending 11 CF centres throughout Australia were then tested by both the HRM10SNP and iPLEX20SNP assays. Type-ability and discriminatory power of the methods, as well as their ability to identify commonly shared P. aeruginosa strains, were compared. Results: The in-silico analyses showed that the 1401 STs available on the PubMLST website could be divided into 927 different 20-SNP profiles (D-value = 0.999), and that most STs of national or international importance in CF could be distinguished either individually or as belonging to closely related single- or double-locus variant groups. When applied to the 506 clinical isolates, the iPLEX20SNP provided better discrimination over the HRM10SNP method with 147 different 20-SNP and 92 different 10-SNP profiles observed, respectively. For detecting the three most commonly shared Australian P. aeruginosa strains AUST-01, AUST-02 and AUST-06, the two methods were in agreement for 80/81 (98.8%), 48/49 (97.8%) and 11/12 (91.7%) isolates, respectively. Conclusions: The iPLEX20SNP is a superior new method for broader SNP-based MLST-style investigations of P. aeruginosa. However, because of convenience and availability, the HRM10SNP method remains better suited for clinical microbiology laboratories that only utilise real
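
    The abstract reports discriminatory power as a D-value without naming the index. A common choice for typing methods is the Hunter-Gaston form of Simpson's index of diversity, and the sketch below assumes that definition; the isolate profiles are hypothetical and are not data from the study.

```python
from collections import Counter


def discriminatory_index(profiles):
    """Hunter-Gaston index: probability that two isolates drawn at random
    without replacement have different typing profiles."""
    n = len(profiles)
    if n < 2:
        raise ValueError("at least two isolates are required")
    counts = Counter(profiles)
    same_type_pairs = sum(c * (c - 1) for c in counts.values())
    return 1.0 - same_type_pairs / (n * (n - 1))


# Hypothetical SNP profiles for six isolates (labels are made up).
isolates = ["P1", "P1", "P1", "P2", "P2", "P3"]
d_value = discriminatory_index(isolates)
print(round(d_value, 3))
```

    A value near 1 means almost every pair of isolates is separated by the method; a value of 0 means all isolates share one profile.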

  11. Allele frequencies for 40 autosomal SNP loci typed for US population samples using electrospray ionization mass spectrometry.

    PubMed

    Kiesler, Kevin M; Vallone, Peter M

    2013-06-01

    To type a set of 194 US African American, Caucasian, and Hispanic samples (self-declared ancestry) for 40 autosomal single nucleotide polymorphism (SNP) markers intended for human identification purposes. Genotyping was performed on an automated commercial electrospray ionization time-of-flight mass spectrometer, the PLEX-ID. The 40 SNP markers were amplified in eight unique 5plex PCRs, desalted, and resolved based on amplicon mass. For each of the three US sample groups statistical analyses were performed on the resulting genotypes. The assay was found to be robust and capable of genotyping the 40 SNP markers consuming approximately 4 nanograms of template per sample. The combined random match probabilities for the 40 SNP assay ranged from 10^-16 to 10^-21. The multiplex PLEX-ID SNP-40 assay is the first fully automated genotyping method capable of typing a panel of 40 forensically relevant autosomal SNP markers on a mass spectrometry platform. The data produced provided the first allele frequency estimates for these 40 SNPs in a National Institute of Standards and Technology US population sample set. No population bias was detected although one locus deviated from its expected level of heterozygosity.

  12. Allele frequencies for 40 autosomal SNP loci typed for US population samples using electrospray ionization mass spectrometry

    PubMed Central

    Kiesler, Kevin M.; Vallone, Peter M.

    2013-01-01

    Aim: To type a set of 194 US African American, Caucasian, and Hispanic samples (self-declared ancestry) for 40 autosomal single nucleotide polymorphism (SNP) markers intended for human identification purposes. Methods: Genotyping was performed on an automated commercial electrospray ionization time-of-flight mass spectrometer, the PLEX-ID. The 40 SNP markers were amplified in eight unique 5plex PCRs, desalted, and resolved based on amplicon mass. For each of the three US sample groups statistical analyses were performed on the resulting genotypes. Results: The assay was found to be robust and capable of genotyping the 40 SNP markers consuming approximately 4 nanograms of template per sample. The combined random match probabilities for the 40 SNP assay ranged from 10^-16 to 10^-21. Conclusion: The multiplex PLEX-ID SNP-40 assay is the first fully automated genotyping method capable of typing a panel of 40 forensically relevant autosomal SNP markers on a mass spectrometry platform. The data produced provided the first allele frequency estimates for these 40 SNPs in a National Institute of Standards and Technology US population sample set. No population bias was detected although one locus deviated from its expected level of heterozygosity. PMID:23771752
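
    A combined random match probability of the kind reported here is conventionally obtained by multiplying per-locus match probabilities. The sketch below assumes Hardy-Weinberg proportions and independent loci, which the abstract does not state, and it uses hypothetical allele frequencies rather than the study's estimates.

```python
import math


def locus_match_probability(p):
    """Probability that two unrelated people share a genotype at a biallelic
    SNP with allele frequency p, assuming Hardy-Weinberg proportions."""
    q = 1.0 - p
    genotype_freqs = (p * p, 2.0 * p * q, q * q)
    return sum(f * f for f in genotype_freqs)


def combined_match_probability(allele_freqs):
    """Product of per-locus match probabilities, assuming independent loci."""
    return math.prod(locus_match_probability(p) for p in allele_freqs)


# Hypothetical panel: 40 SNPs, each with allele frequency 0.5 (not study data).
hypothetical_freqs = [0.5] * 40
rmp = combined_match_probability(hypothetical_freqs)
print(f"{rmp:.2e}")
```

    An allele frequency of 0.5 minimizes the per-locus match probability, so real panels with less balanced frequencies give larger combined values than this idealized case.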

  13. SNP/RD typing of Mycobacterium tuberculosis Beijing strains reveals local and worldwide disseminated clonal complexes.

    PubMed

    Schürch, Anita C; Kremer, Kristin; Hendriks, Amber C A; Freyee, Benthe; McEvoy, Christopher R E; van Crevel, Reinout; Boeree, Martin J; van Helden, Paul; Warren, Robin M; Siezen, Roland J; van Soolingen, Dick

    2011-01-01

    The Beijing strain is one of the most successful genotypes of Mycobacterium tuberculosis worldwide and appears to be highly homogenous according to existing genotyping methods. To type Beijing strains reliably we developed a robust typing scheme using single nucleotide polymorphisms (SNPs) and regions of difference (RDs) derived from whole-genome sequencing data of eight Beijing strains. SNP/RD typing of 259 M. tuberculosis isolates originating from 45 countries worldwide discriminated 27 clonal complexes within the Beijing genotype family. A total of 16 Beijing clonal complexes contained more than one isolate of known origin, of which two clonal complexes were strongly associated with South African origin. The remaining 14 clonal complexes encompassed isolates from different countries. Even highly resolved clonal complexes comprised isolates from distinct geographical sites. Our results suggest that Beijing strains spread globally on multiple occasions and that the tuberculosis epidemic caused by the Beijing genotype is at least partially driven by modern migration patterns. The SNPs and RDs presented in this study will facilitate future molecular epidemiological and phylogenetic studies on Beijing strains.

  14. The impact of natural selection on an ABCC11 SNP determining earwax type.

    PubMed

    Ohashi, Jun; Naka, Izumi; Tsuchiya, Naoyuki

    2011-01-01

    A nonsynonymous single nucleotide polymorphism (SNP), rs17822931-G/A (538G>A; Gly180Arg), in the ABCC11 gene determines human earwax type (i.e., wet or dry) and is one of the most differentiated nonsynonymous SNPs between East Asian and African populations. A recent genome-wide scan for positive selection revealed that a genomic region spanning ABCC11, LONP2, and SIAH1 genes has been subjected to a selective sweep in East Asians. Considering the potential functional significance as well as the population differentiation of SNPs located in that region, rs17822931 is the most plausible candidate polymorphism to have undergone geographically restricted positive selection. In this study, we estimated the selection intensity or selection coefficient of rs17822931-A in East Asians by analyzing two microsatellite loci flanking rs17822931 in the African (HapMap-YRI) and East Asian (HapMap-JPT and HapMap-CHB) populations. Assuming a recessive selection model, a coalescent-based simulation approach suggested that the selection coefficient of rs17822931-A had been approximately 0.01 in the East Asian population, and a simulation experiment using a pseudo-sampling variable revealed that the mutation of rs17822931-A occurred 2,006 generations (95% credible interval, 1,023-3,901 generations) ago. In addition, we show that absolute latitude is significantly associated with the allele frequency of rs17822931-A in Asian, Native American, and European populations, implying that the selective advantage of rs17822931-A is related to an adaptation to a cold climate. Our results provide a striking example of how local adaptation has played a significant role in the diversification of human traits.

  15. Human population genetic diversity as a function of SNP type from HapMap data.

    PubMed

    Garte, Seymour

    2010-01-01

    Data from the international HapMap project were mined to determine if the degree of genetic differentiation (Fst) is dependent on single nucleotide polymorphism (SNP) category. The Fst statistic was evaluated across all SNPs for each of 30 genes and for each of five chromosomes. A consistent decrease in diversity between Europeans and Africans was seen for nonsynonymous coding region SNPs compared to the three other SNP categories: synonymous SNPs, UTR, and intronic SNPs. This suggests an effect of balancing selection in reducing interpopulation genetic diversity at sites that would be expected to influence phenotype and therefore be subject to selection. This result is inconsistent with the concept of large population specific genetic differences that could have applications in "racialized medicine."
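
    The abstract does not say which Fst estimator was used. As an illustration only, the sketch below computes the textbook heterozygosity-based Fst for one biallelic SNP in two equally weighted populations, with no sample-size correction; the allele frequencies are hypothetical.

```python
def expected_heterozygosity(p):
    """Expected heterozygosity 2p(1 - p) at a biallelic locus."""
    return 2.0 * p * (1.0 - p)


def fst_two_populations(p1, p2):
    """Fst = (H_T - H_S) / H_T for two equally weighted populations."""
    h_s = (expected_heterozygosity(p1) + expected_heterozygosity(p2)) / 2.0
    p_bar = (p1 + p2) / 2.0
    h_t = expected_heterozygosity(p_bar)
    if h_t == 0.0:
        return 0.0  # locus is monomorphic overall, so no differentiation is defined
    return (h_t - h_s) / h_t


# Hypothetical allele frequencies in two populations (not HapMap values).
fst = fst_two_populations(0.2, 0.8)
print(round(fst, 3))
```

    Averaging such per-SNP values within a category (nonsynonymous, synonymous, UTR, intronic) is one simple way to compare categories, though ratio-of-averages estimators are often preferred in practice.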

  16. SNP-VISTA

    SciTech Connect

    Shah, Nameeta; Teplitsky, Michael; Minovitsky, Simon; Dubchak, Inna

    2005-11-07

    SNP-VISTA aids in analyses of the following types of data: A. Large-scale re-sequence data of disease-related genes for discovery of associated and/or causative alleles (GeneSNP-VISTA). B. Massive amounts of ecogenomics data for studying homologous recombination in microbial populations (EcoSNP-VISTA). The main features and capabilities of SNP-VISTA are: 1) Mapping of SNPs to gene structure; 2) classification of SNPs, based on their location in the gene, frequency of occurrence in samples and allele composition; 3) clustering, based on user-defined subsets of SNPs, highlighting haplotypes as well as recombinant sequences; 4) integration of protein conservation visualization; and 5) display of automatically calculated recombination points that are user-editable. The main strength of SNP-VISTA is its graphical interface and use of visual representations, which support interactive exploration and hence better understanding of large-scale SNPs data.

  17. A novel TCF7L2 type 2 diabetes SNP identified from fine mapping in African American women

    PubMed Central

    Haddad, Stephen A.; Palmer, Julie R.; Lunetta, Kathryn L.; Ng, Maggie C. Y.; Ruiz-Narváez, Edward A.

    2017-01-01

    SNP rs7903146 in the Wnt pathway’s TCF7L2 gene is the variant most significantly associated with type 2 diabetes to date, with associations observed across diverse populations. We sought to determine whether variants in other Wnt pathway genes are also associated with this disease. We evaluated 69 genes involved in the Wnt pathway, including TCF7L2, for associations with type 2 diabetes in 2632 African American cases and 2596 controls from the Black Women’s Health Study. Tag SNPs for each gene region were genotyped on a custom Affymetrix Axiom Array, and imputation was performed to 1000 Genomes Phase 3 data. Gene-based analyses were conducted using the adaptive rank truncated product (ARTP) statistic. The PSMD2 gene was significantly associated with type 2 diabetes after correction for multiple testing (corrected p = 0.016), based on the nine most significant single variants in the +/- 20 kb region surrounding the gene, which includes nearby genes EIF4G1, ECE2, and EIF2B5. Association data on four of the nine variants were available from an independent sample of 8284 African American cases and 15,543 controls; associations were in the same direction, but weak and not statistically significant. TCF7L2 was the only other gene associated with type 2 diabetes at nominal p <0.01 in our data. One of the three variants in the best gene-based model for TCF7L2, rs114770437, was not correlated with the GWAS index SNP rs7903146 and may represent an independent association signal seen only in African ancestry populations. Data on this SNP were not available in the replication sample. PMID:28253288

  18. Forensic genetic SNP typing of low-template DNA and highly degraded DNA from crime case samples.

    PubMed

    Børsting, Claus; Mogensen, Helle Smidt; Morling, Niels

    2013-05-01

    Heterozygote imbalances leading to allele drop-outs and disproportionally large stutters leading to allele drop-ins are known stochastic phenomena related to STR typing of low-template DNA (LtDNA). The large stutters and the many drop-ins in typical STR stutter positions are artifacts from the PCR amplification of tandem repeats. These artifacts may be avoided by typing bi-allelic markers instead of STRs. In this work, the SNPforID multiplex assay was used to type LtDNA. A sensitized SNP typing protocol was introduced that increased signal strengths without increasing noise and without affecting the heterozygote balance. Allele drop-ins were only observed in experiments with 25 pg of DNA and not in experiments with 50 and 100 pg of DNA. The allele drop-in rate in the 25 pg experiments was 0.06% or 100 times lower than what was previously reported for STR typing of LtDNA. A composite model and two different consensus models were used to interpret the SNP data. Correct profiles with 42-49 SNPs were generated from the 50 and 100 pg experiments, whereas a few incorrect genotypes were included in the generated profiles from the 25 pg experiments. With the strict consensus model, between 35 and 48 SNPs were correctly typed in the 25 pg experiments and only one allele drop-out (error rate: 0.07%) was observed in the consensus profiles. A total of 28 crime case samples were selected for typing with the sensitized SNPforID protocol. The samples were previously typed with old STR kits during the crime case investigation and only partial profiles (0-6 STRs) were obtained. Eleven of the samples could not be quantified with the Quantifiler™ Human DNA Quantification kit because of partial or complete inhibition of the PCR. For eight of these samples, SNP typing was only possible when the buffer and DNA polymerase used in the original protocol were replaced with the AmpFℓSTR® SEfiler Plus™ Master Mix, which was developed specifically for challenging forensic samples. All

  19. Development of a rapid SNP-typing assay to differentiate Bifidobacterium animalis ssp. lactis strains used in probiotic-supplemented dairy products.

    PubMed

    Lomonaco, Sara; Furumoto, Emily J; Loquasto, Joseph R; Morra, Patrizia; Grassi, Ausilia; Roberts, Robert F

    2015-02-01

    Identification at the genus, species, and strain levels is desirable when a probiotic microorganism is added to foods. Strains of Bifidobacterium animalis ssp. lactis (BAL) are commonly used worldwide in dairy products supplemented with probiotic strains. However, strain discrimination is difficult because of the high degree of genome identity (99.975%) between different genomes of this subspecies. Typing of monomorphic species can be carried out efficiently by targeting informative single nucleotide polymorphisms (SNP). Findings from a previous study analyzing both reference and commercial strains of BAL identified SNP that could be used to discriminate common strains into 8 groups. This paper describes development of a minisequencing assay based on the primer extension reaction (PER) targeting multiple SNP that can allow strain differentiation of BAL. Based on previous data, 6 informative SNP were selected for further testing, and a multiplex preliminary PCR was optimized to amplify the DNA regions containing the selected SNP. Extension primers (EP) annealing immediately adjacent to the selected SNP were developed and tested in simplex and multiplex PER to evaluate their performance. Twenty-five strains belonging to 9 distinct genomic clusters of B. animalis ssp. lactis were selected and analyzed using the developed minisequencing assay, simultaneously targeting the 6 selected SNP. Fragment analysis was subsequently carried out in duplicate and demonstrated that the assay yielded 8 specific profiles separating the most commonly used commercial strains. This novel multiplex PER approach provides a simple, rapid, flexible SNP-based subtyping method for proper characterization and identification of commercial probiotic strains of BAL from fermented dairy products. To assess the usefulness of this method, DNA was extracted from yogurt manufactured with and without the addition of B. animalis ssp. lactis BB-12. Extracted DNA was then subjected to the minisequencing

  20. Network-based regularization for high dimensional SNP data in the case-control study of Type 2 diabetes.

    PubMed

    Ren, Jie; He, Tao; Li, Ye; Liu, Sai; Du, Yinhao; Jiang, Yu; Wu, Cen

    2017-05-16

    Over the past decades, the prevalence of type 2 diabetes mellitus (T2D) has been steadily increasing around the world. Despite large efforts devoted to better understanding the genetic basis of the disease, the identified susceptibility loci can only account for a small portion of the T2D heritability. Some of the existing approaches proposed for the high dimensional genetic data from the T2D case-control study are limited by analyzing a small number of SNPs at a time from a large pool of SNPs, by ignoring the correlations among SNPs and by adopting inefficient selection techniques. We propose a network constrained regularization method to select important SNPs by taking the linkage disequilibrium into account. To accommodate the case-control study, an iteratively reweighted least squares algorithm has been developed within the coordinate descent framework, where optimization of the regularized logistic loss function is performed with respect to one parameter at a time and cycles iteratively through all the parameters until convergence. In this article, a novel approach is developed to identify important SNPs more effectively through incorporating the interconnections among them in the regularized selection. A coordinate descent based iteratively reweighted least squares (IRLS) algorithm has been proposed. Both the simulation study and the analysis of the Nurses' Health Study, a case-control study of type 2 diabetes data with high dimensional SNP measurements, demonstrate the advantage of the network based approach over the competing alternatives.

  1. Association of SNP rs9939609 in FTO gene with metabolic syndrome in type 2 diabetic subjects, rectruited from a tertiary care unit of Karachi, Pakistan

    PubMed Central

    Fawwad, Asher; Siddiqui, Iftikhar Ahmed; Zeeshan, Nimra Fatima; Shahid, Syed Muhammad; Basit, Abdul

    2015-01-01

    Objective: To determine the association of SNP in FTO gene, rs9939609, with Metabolic Syndrome (MS) in type 2 diabetic subjects at a tertiary care unit of Karachi, Pakistan. Methods: We genotyped FTO rs9939609 SNP in 296 patients with type 2 diabetes from the Out Patient Department (OPD) of Baqai Institute of Diabetology and Endocrinology (BIDE). MS was defined on the basis of International Diabetes Federation (IDF) and National Cholesterol Education program (NCEP) criteria. Association between the rs9939609 SNP and MS was tested through chi-square and Z-tests by using odds ratio (OR) with 95% confidence intervals. Results: The frequency of MS as defined by IDF criteria was significantly higher in female subjects as compared to male subjects (p = 0.006). Carriers of ≥ 1 copy of the rs9939609 A allele were significantly more likely to have MS (69.6%) than non-carriers (30.4%), corresponding to a carrier odds ratio (OR) of 0.52 (95% confidence interval [CI] 0.29-0.93), with a similar trend for the ATP III-defined MS. "A" allele carriers under the dominant model carried all the criteria of MS more significantly than non-carriers. Conclusion: The FTO rs9939609 SNP was associated with an increased risk for Metabolic Syndrome in type 2 diabetic populations at a tertiary care unit of Karachi, Pakistan. PMID:25878631
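    For readers unfamiliar with how an odds ratio and its 95% confidence interval are obtained from a 2x2 table, the following minimal sketch uses the standard log-scale (Woolf) approximation. The counts in the usage line are hypothetical and are not taken from the study, whose underlying table is not given in the abstract.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and approximate 95% CI from a 2x2 table.

    a: carriers with the outcome      b: carriers without the outcome
    c: non-carriers with the outcome  d: non-carriers without the outcome
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Woolf approximation.
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper

# Hypothetical counts, for illustration only.
print(odds_ratio_ci(20, 10, 10, 20))
```

    An interval that excludes 1.0 corresponds to a statistically significant association at roughly the 5% level.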

  2. Association of SNP rs9939609 in FTO gene with metabolic syndrome in type 2 diabetic subjects, recruited from a tertiary care unit of Karachi, Pakistan.

    PubMed

    Fawwad, Asher; Siddiqui, Iftikhar Ahmed; Zeeshan, Nimra Fatima; Shahid, Syed Muhammad; Basit, Abdul

    2015-01-01

    To determine the association of SNP in FTO gene, rs9939609, with Metabolic Syndrome (MS) in type 2 diabetic subjects at a tertiary care unit of Karachi, Pakistan. We genotyped FTO rs9939609 SNP in 296 patients with type 2 diabetes from the Out Patient Department (OPD) of Baqai Institute of Diabetology and Endocrinology (BIDE). MS was defined on the basis of International Diabetes Federation (IDF) and National Cholesterol Education program (NCEP) criteria. Association between the rs9939609 SNP and MS was tested through chi-square and Z-tests by using odds ratio (OR) with 95% confidence intervals. The frequency of MS as defined by IDF criteria was significantly higher in female subjects as compared to male subjects (p = 0.006). Carriers of ≥ 1 copy of the rs9939609 A allele were significantly more likely to have MS (69.6%) than non-carriers (30.4%), corresponding to a carrier odds ratio (OR) of 0.52 (95% confidence interval [CI] 0.29-0.93), with a similar trend for the ATP III-defined MS. "A" allele carriers under the dominant model carried all the criteria of MS more significantly than non-carriers. The FTO rs9939609 SNP was associated with an increased risk for Metabolic Syndrome in type 2 diabetic populations at a tertiary care unit of Karachi, Pakistan.

  3. Multiplexed SNP typing of ancient DNA clarifies the origin of Andaman mtDNA haplogroups amongst South Asian tribal populations.

    PubMed

    Endicott, Phillip; Metspalu, Mait; Stringer, Chris; Macaulay, Vincent; Cooper, Alan; Sanchez, Juan J

    2006-12-20

    The issue of errors in genetic data sets is of growing concern, particularly in population genetics where whole genome mtDNA sequence data is coming under increased scrutiny. Multiplexed PCR reactions, combined with SNP typing, are currently under-exploited in this context, but have the potential to genotype whole populations rapidly and accurately, significantly reducing the amount of errors appearing in published data sets. To show the sensitivity of this technique for screening mtDNA genomic sequence data, 20 historic samples of the enigmatic Andaman Islanders and 12 modern samples from three Indian tribal populations (Chenchu, Lambadi and Lodha) were genotyped for 20 coding region sites after provisional haplogroup assignment with control region sequences. The genotype data from the historic samples significantly revise the topologies for the Andaman M31 and M32 mtDNA lineages by rectifying conflicts in published data sets. The new Indian data extend the distribution of the M31a lineage to South Asia, challenging previous interpretations of mtDNA phylogeography. This genetic connection between the ancestors of the Andamanese and South Asian tribal groups approximately 30 kya has important implications for the debate concerning migration routes and settlement patterns of humans leaving Africa during the late Pleistocene, and indicates the need for more detailed genotyping strategies. The methodology serves as a low-cost, high-throughput model for the production and authentication of data from modern or ancient DNA, and demonstrates the value of museum collections as important records of human genetic diversity.

  4. Eight new genomes and synthetic controls increase the accessibility of rapid melt-MAMA SNP typing of Coxiella burnetii.

    PubMed

    Karlsson, Edvin; Macellaro, Anna; Byström, Mona; Forsman, Mats; Frangoulidis, Dimitrios; Janse, Ingmar; Larsson, Pär; Lindgren, Petter; Ohrman, Caroline; van Rotterdam, Bart; Sjödin, Andreas; Myrtennäs, Kerstin

    2014-01-01

    The case rate of Q fever in Europe has increased dramatically in recent years, mainly because of an epidemic in the Netherlands in 2009. Consequently, there is a need for more extensive genetic characterization of the disease agent Coxiella burnetii in order to better understand the epidemiology and spread of this disease. Genome reference data are essential for this purpose, but only thirteen genome sequences are currently available. Current methods for typing C. burnetii are criticized because their results are difficult to compare across laboratories, because they require the use of genomic control DNA, and/or because they rely on markers in highly variable regions. In this work, we developed a method for single nucleotide polymorphism (SNP) typing of C. burnetii isolates and tissue samples based on new assays targeting ten phylogenetically stable synonymous canonical SNPs (canSNPs). These canSNPs represent previously known phylogenetic branches and were here identified from sequence comparisons of twenty-one C. burnetii genomes, eight of which were sequenced in this work. Importantly, synthetic control templates were developed to make the method useful to laboratories lacking genomic control DNA. An analysis of twenty-one C. burnetii genomes confirmed that the species exhibits high sequence identity. Most of its SNPs (7,493/7,559 shared by >1 genome) follow a clonal inheritance pattern and are therefore stable phylogenetic typing markers. The assays were validated using twenty-six genetically diverse C. burnetii isolates and three tissue samples from small ruminants infected during the epidemic in the Netherlands. Each sample was assigned to a clade. Synthetic controls (vector and PCR amplified) gave identical results compared to the corresponding genomic controls and are viable alternatives to genomic DNA. The results from the described method indicate that it could be useful for cheap and rapid disease source tracking at non-specialized laboratories, which requires accurate genotyping.

  5. Association of the 17-hydroxysteroid dehydrogenase type 5 gene polymorphism (-71A/G HSD17B5 SNP) with hyperandrogenemia in polycystic ovary syndrome (PCOS).

    PubMed

    Marioli, Dimitra J; Saltamavros, Alexandros D; Vervita, Vasiliki; Koika, Vasiliki; Adonakis, George; Decavalas, George; Markou, Kostas B; Georgopoulos, Neoklis A

    2009-08-01

    To evaluate the association of an activating single-nucleotide polymorphism (SNP) at position -71 of the promoter of 17beta-hydroxysteroid dehydrogenase type 5 gene (-71A/G HSD17B5 SNP) and polycystic ovary syndrome (PCOS) in a well characterized cohort of Caucasian PCOS women with biochemical hyperandrogenemia. The PCOS patients and unrelated healthy control subjects were genotyped for the -71A/G HSD17B5 SNP. The acquired genotypic data was tested for association with PCOS and other quantitative phenotypic traits of the syndrome in PCOS patients. Subjects were recruited from the Division of Reproductive Endocrinology, Department of Obstetrics and Gynecology, at the University Hospital of Patras, Greece. Genotyping and biochemical determinations took place at the Laboratory of Molecular Endocrinology, University of Patras Medical School, Rion, Greece. Participants comprised 150 Caucasian Greek PCOS women with biochemical hyperandrogenism and chronic anovulation and polycystic ovarian morphology on ultrasound and 51 healthy control subjects. HSD17B5 genotype, serum testosterone, serum androstenedione. No association of the -71A/G HSD17B5 SNP with PCOS was detected. However, the -71G HSD17B5 variant was associated with increased serum testosterone levels and decreased androstenedione/testosterone ratio. The -71G HSD17B5 variant is not a major component of the molecular pathogenetic mechanisms of PCOS, although it might contribute to the severity of hyperandrogenemia in women with PCOS and biochemical hyperandrogenism.

  6. Expression Level of the DREB2-Type Gene, Identified with Amplifluor SNP Markers, Correlates with Performance, and Tolerance to Dehydration in Bread Wheat Cultivars from Northern Kazakhstan

    PubMed Central

    Shavrukov, Yuri; Zhumalin, Aibek; Serikbay, Dauren; Botayeva, Makpal; Otemisova, Ainur; Absattarova, Aiman; Sereda, Grigoriy; Sereda, Sergey; Shvidchenko, Vladimir; Turbekova, Arysgul; Jatayev, Satyvaldy; Lopato, Sergiy; Soole, Kathleen; Langridge, Peter

    2016-01-01

    A panel of 89 local commercial cultivars of bread wheat was tested in field trials in the dry conditions of Northern Kazakhstan. Two distinct groups of cultivars (six cultivars in each group), which had the highest and the lowest grain yield under drought were selected for further experiments. A dehydration test conducted on detached leaves indicated a strong association between rates of water loss in plants from the first group with highest grain yield production in the dry environment relative to the second group. Modern high-throughput Amplifluor Single Nucleotide Polymorphism (SNP) technology was applied to study allelic variations in a series of drought-responsive genes using 19 SNP markers. Genotyping of an SNP in the TaDREB5 (DREB2-type) gene using the Amplifluor SNP marker KATU48 revealed clear allele distribution across the entire panel of wheat accessions, and distinguished between the two groups of cultivars with high and low yield under drought. Significant differences in expression levels of TaDREB5 were revealed by qRT-PCR. Most wheat plants from the first group of cultivars with high grain yield showed slight up-regulation in the TaDREB5 transcript in dehydrated leaves. In contrast, expression of TaDREB5 in plants from the second group of cultivars with low grain yield was significantly down-regulated. It was found that SNPs did not alter the amino acid sequence of TaDREB5 protein. Thus, a possible explanation is that alternative splicing and up-stream regulation of TaDREB5 may be affected by SNP, but these hypotheses require additional analysis (and will be the focus of future studies). PMID:27917186

  7. SKM-SNP: SNP markers detection method.

    PubMed

    Liu, Yang; Li, Mark; Cheung, Yiu M; Sham, Pak C; Ng, Michael K

    2010-04-01

    SKM-SNP, a SNP marker detection program, is proposed to identify a set of relevant SNPs for the association between a disease and multiple marker genotypes. We employ a subspace categorical clustering algorithm to compute a weight for each SNP in the group of patient samples and the group of normal samples, and use the weights to identify the subsets of relevant SNPs that categorize these two groups. Experiments on both Schizophrenia and Parkinson Disease data sets containing genome-wide SNPs are reported to demonstrate the program. Results indicate that our method can find some relevant SNPs that categorize the disease samples. The online SKM-SNP program is available at http://www.math.hkbu.edu.hk/~mng/SKM-SNP/SKM-SNP.html.

  8. SNP genotyping by DNA photoligation: application to SNP detection of genes from food crops

    NASA Astrophysics Data System (ADS)

    Yoshimura, Yoshinaga; Ohtake, Tomoko; Okada, Hajime; Ami, Takehiro; Tsukaguchi, Tadashi; Fujimoto, Kenzo

    2009-06-01

    We describe a simple and inexpensive single-nucleotide polymorphism (SNP) typing method, using DNA photoligation with 5-carboxyvinyl-2'-deoxyuridine and two fluorophores. This SNP-typing method facilitates qualitative determination of genes from indica and japonica rice, and showed a high degree of single nucleotide specificity up to 10 000. This method can be used in the SNP typing of actual genomic DNA samples from food crops.

  9. Replication Study in a Japanese Population to Evaluate the Association between 10 SNP Loci, Identified in European Genome-Wide Association Studies, and Type 2 Diabetes

    PubMed Central

    Imamura, Minako; Tanaka, Yasushi; Iwata, Minoru; Hirose, Hiroshi; Kaku, Kohei; Maegawa, Hiroshi; Watada, Hirotaka; Tobe, Kazuyuki; Kashiwagi, Atsunori; Kawamori, Ryuzo; Maeda, Shiro

    2015-01-01

    Aim: We performed a replication study in a Japanese population to evaluate the association between type 2 diabetes and 7 susceptibility loci originally identified by a European genome-wide association study (GWAS) in 2012: ZMIZ1, KLHDC5, TLE1, ANKRD55, CILP2, MC4R, and BCAR1. We also examined the association of 3 additional loci: CCND2 and GIPR, identified in sex-differentiated analyses, and LAMA1, which was shown to be associated with non-obese European type 2 diabetes. Methods: We genotyped 6,972 Japanese participants (4,280 type 2 diabetes patients and 2,692 controls) for each of the 10 single nucleotide polymorphisms (SNPs): rs12571751 in ZMIZ1, rs10842994 near KLHDC5, rs2796441 near TLE1, rs459193 near ANKRD55, rs10401969 in CILP2, rs12970134 near MC4R, rs7202877 near BCAR1, rs11063069 near CCND2, rs8108269 near GIPR, and rs8090011 in LAMA1 using a multiplex polymerase chain reaction invader assay. The association of each SNP locus with the disease was evaluated using a logistic regression analysis. Results: All SNPs examined in this study had the same direction of effect (odds ratio > 1.0, p = 9.77 × 10^-4, binomial test), as in the original reports. Among them, rs12571751 in ZMIZ1 was significantly associated with type 2 diabetes [p = 0.0041, odds ratio = 1.123, 95% confidence interval 1.037–1.215, adjusted for sex, age and body mass index (BMI)], but we did not observe significant association of the remaining 9 SNP loci with type 2 diabetes in the present Japanese population (p ≥ 0.005). A genetic risk score, constructed from the sum of risk alleles for the 7 SNP loci identified by un-stratified analyses in the European GWAS meta-analysis, was associated with type 2 diabetes in the present Japanese population (p = 2.3 × 10^-4, adjusted for sex, age and BMI). Conclusions: The ZMIZ1 locus has a significant effect on conferring susceptibility to type 2 diabetes in the Japanese population as well. PMID:25951451
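    Two calculations mentioned in this abstract are easy to reproduce. The sketch below assumes the reported binomial test is a one-sided sign test with success probability 0.5 (an assumption, though it is consistent with the reported value: 10 of 10 SNPs in the same direction gives 0.5^10 ≈ 9.77 × 10^-4), and it computes an unweighted risk score as the sum of risk-allele counts. The function names and the example genotype counts are hypothetical.

```python
from math import comb

def one_sided_sign_test(n_same_direction, n_total):
    """P(X >= n_same_direction) for X ~ Binomial(n_total, 0.5)."""
    tail = sum(comb(n_total, k) for k in range(n_same_direction, n_total + 1))
    return tail / 2 ** n_total

def genetic_risk_score(risk_allele_counts):
    """Unweighted score: total number of risk alleles (0, 1 or 2 per SNP)."""
    if any(count not in (0, 1, 2) for count in risk_allele_counts):
        raise ValueError("each SNP must contribute 0, 1 or 2 risk alleles")
    return sum(risk_allele_counts)

# All 10 replicated SNPs share the original direction of effect.
print(one_sided_sign_test(10, 10))

# Hypothetical participant genotyped at 7 loci.
print(genetic_risk_score([1, 0, 2, 1, 1, 0, 2]))
```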

  10. Association of Type 2 Diabetes Mellitus related SNP genotypes with altered serum adipokine levels and metabolic syndrome phenotypes

    PubMed Central

    Al-Daghri, Nasser M; Al-Attas, Omar S; Krishnaswamy, Soundararajan; Mohammed, Abdul Khader; Alenad, Amal M; Chrousos, George P; Alokail, Majed S

    2015-01-01

    The pathogenesis of T2DM involves secretion of several pro-inflammatory molecules by the dramatically increased adipocytes, both by number and size, and associated macrophages of adipose tissue. Since T2DM is usually preceded by obesity and chronic systemic inflammation, the objective of this study was to explore for any association between genetic variants of previously established 36 T2DM-associated SNPs and altered serum adipocytokine levels and metabolic syndrome phenotypes. Study consisted of 566 subjects (284 males and 282 females) of whom 147 were T2DM patients and 419 healthy controls. Study subjects were genotyped for 36 T2DM-linked single nucleotide polymorphisms (SNPs) using the KASPar SNP Genotyping System and grouped into different genotypes for each SNP. Various anthropometric and biochemical parameters were measured following standard procedures. The mean values of serum levels of individual adipocytokines and the presence/absence of metabolic syndrome phenotypes corresponding to various genotypes were compared by determining the odds ratios. Genotypic variants of five and seven of the 36 T2DM-related SNPs were significantly associated with altered serum levels of adiponectin and aPAI, respectively. Six variants of the 36 SNPs were associated with metabolic syndrome manifestations. This study identified positive associations between genotypic variants of five and seven of the 36 T2DM related SNPs and altered serum levels of adiponectin and aPAI, respectively. Six of 36 SNPs were also associated with metabolic syndrome in the studied population. The relation between specific SNPs and individual phenotypic traits may be useful in explaining the causal mechanisms of hereditary component of T2DM. PMID:26064370

  11. SNP-RFLPing 2: an updated and integrated PCR-RFLP tool for SNP genotyping

    PubMed Central

    2010-01-01

    Background: PCR-restriction fragment length polymorphism (RFLP) assay is a cost-effective method for SNP genotyping and mutation detection, but the manual mining for restriction enzyme sites is challenging and cumbersome. Three years after we constructed SNP-RFLPing, a freely accessible database and analysis tool for restriction enzyme mining of SNPs, significant improvements over the 2006 version have been made and incorporated into the latest version, SNP-RFLPing 2. Results: The primary aim of SNP-RFLPing 2 is to provide comprehensive PCR-RFLP information with multiple functionality about SNPs, such as SNP retrieval for multiple species, different polymorphism types (bi-allelic, tri-allelic, tetra-allelic or indels), gene-centric searching, HapMap tagSNPs, gene ontology-based searching, miRNAs, and SNP500Cancer. The RFLP restriction enzymes and the corresponding PCR primers for the natural and mutagenic types of each SNP are simultaneously analyzed. All the RFLP restriction enzyme prices are also provided to aid selection. Furthermore, the previously encountered updating problems for most SNP related databases are resolved by an on-line retrieval system. Conclusions: The user interfaces for functional SNP analyses have been substantially improved and integrated. SNP-RFLPing 2 offers a new and user-friendly interface for RFLP genotyping that can be used in association studies and is freely available at http://bio.kuas.edu.tw/snp-rflping2. PMID:20377871
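    The core idea behind restriction enzyme mining for a SNP can be illustrated in a few lines: an enzyme is useful for PCR-RFLP genotyping when its recognition site is present in one allele and absent in the other. The sketch below is not the SNP-RFLPing 2 algorithm; it is a simplified illustration that assumes palindromic recognition sites (so only one strand is scanned), and the two flanking sequences are made up for the example.

```python
# Recognition sites of three common palindromic restriction enzymes.
ENZYMES = {
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
}

def has_site(sequence, site):
    """True if the recognition site occurs anywhere in the sequence."""
    return site in sequence.upper()

def distinguishing_enzymes(allele_a, allele_b, enzymes=ENZYMES):
    """Enzymes whose site is present in exactly one of the two alleles."""
    return [
        name
        for name, site in enzymes.items()
        if has_site(allele_a, site) != has_site(allele_b, site)
    ]

# Hypothetical amplicon fragments differing at one position (A vs G).
allele_a = "ACGTGAATTCTTAG"
allele_b = "ACGTGAGTTCTTAG"
print(distinguishing_enzymes(allele_a, allele_b))  # ['EcoRI']
```

    A real tool would also handle non-palindromic and degenerate sites, primer design, and mutagenic primers, which are outside the scope of this sketch.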

  12. A Complete Association of an intronic SNP rs6798742 with Origin of Spinocerebellar Ataxia Type 7-CAG Expansion Loci in the Indian and Mexican Population.

    PubMed

    Faruq, Mohammed; Magaña, Jonathan J; Suroliya, Varun; Narang, Ankita; Murillo-Melo, Nadia M; Hernández-Hernández, Oscar; Srivastava, Achal K; Mukerji, Mitali

    2017-09-01

    Spinocerebellar ataxia type 7 (SCA7) is a rare neurogenetic disorder caused by a highly unstable CAG repeat expansion mutation in the coding region of SCA7. We aimed to understand the effect of diverse ATXN7 cis-elements in correlation with the CAG expansion mutation of SCA7. We initially performed an analysis to identify the haplotype background of CAG expanded alleles using eight bi-allelic single nucleotide polymorphisms (SNPs) flanking an ATXN7-CAG expansion in 32 individuals from nine unrelated Indian SCA7 families and 88 healthy controls. Subsequent validation of the findings was performed in 89 ATXN7-CAG mutation carriers and in 119 unrelated healthy controls of Mexican ancestry. The haplotype analyses showed a shared haplotype background, and the C allele of SNP rs6798742 (approximately 6 kb from the 3'-end of CAG repeats) is in complete association with expanded, premutation, intermediate, and the majority of large normal (≥12) CAG alleles. The C allele (ancestral/chimp allele) association was validated in SCA7 subjects and healthy controls from Mexico, suggesting its substantial association with CAG expanded and expansion-prone chromosomes. Analysis of rs6798742 and other neighboring functional SNPs within 6 kb in experimental datasets (Encyclopedia of DNA Elements; ENCODE) shows functional marks that could affect transcription as well as histone methylation. An allelic association of the CAG region to an intronic SNP in two different ethnic and geographical populations suggests a cis-factor-dependent mechanism in ATXN7 CAG-region expansion. © 2017 John Wiley & Sons Ltd/University College London.

  13. Electrochemical detection of type 2 diabetes mellitus-related SNP via DNA-mediated growth of silver nanoparticles on single walled carbon nanotubes.

    PubMed

    Tao, Jia; Zhao, Peng; Zheng, Jing; Wu, Cuichen; Shi, Muling; Li, Jishan; Li, Yinhui; Yang, Ronghua

    2015-11-07

    Herein, we proposed a new electrochemical sensing strategy for T2DM-related SNP detection via DNA-mediated growth of AgNPs on a SWCNT-modified electrode. Coupled with RNase HII enzyme assisted amplification, this approach could realize T2DM-related SNP assay and be applied in crude extracts of carcinoma pancreatic β-cell lines.

  14. Genome-wide detection of CNVs in Chinese indigenous sheep with different types of tails using ovine high-density 600K SNP arrays

    PubMed Central

    Zhu, Caiye; Fan, Hongying; Yuan, Zehu; Hu, Shijin; Ma, Xiaomeng; Xuan, Junli; Wang, Hongwei; Zhang, Li; Wei, Caihong; Zhang, Qin; Zhao, Fuping; Du, Lixin

    2016-01-01

    Chinese indigenous sheep can be classified into three types based on tail morphology: fat-tailed, fat-rumped, and thin-tailed sheep, of which the typical breeds are large-tailed Han sheep, Altay sheep, and Tibetan sheep, respectively. To unravel the genetic mechanisms underlying the phenotypic differences among Chinese indigenous sheep with tails of three different types, we used ovine high-density 600K SNP arrays to detect genome-wide copy number variation (CNV). In large-tailed Han sheep, Altay sheep, and Tibetan sheep, 371, 301, and 66 CNV regions (CNVRs) with lengths of 71.35 Mb, 51.65 Mb, and 10.56 Mb, respectively, were identified on autosomal chromosomes. Ten CNVRs were randomly chosen for confirmation, of which eight were successfully validated. The detected CNVRs harboured 3130 genes, including genes associated with fat deposition, such as PPARA, RXRA, KLF11, ADD1, FASN, PPP1CA, PDGFA, and PEX6. Moreover, multilevel bioinformatics analyses showed that the detected candidate genes were significantly enriched for involvement in fat deposition, GTPase regulator, and peptide receptor activities. This is the first high-resolution sheep CNV map for Chinese indigenous sheep breeds with three types of tails. Our results provide valuable information that will support investigations of genomic structural variation underlying traits of interest in sheep. PMID:27282145

  15. Preferential access to genetic information from endogenous hominin ancient DNA and accurate quantitative SNP-typing via SPEX

    PubMed Central

    Brotherton, Paul; Sanchez, Juan J.; Cooper, Alan; Endicott, Phillip

    2010-01-01

    The analysis of targeted genetic loci from ancient, forensic and clinical samples is usually built upon polymerase chain reaction (PCR)-generated sequence data. However, many studies have shown that PCR amplification from poor-quality DNA templates can create sequence artefacts at significant levels. With hominin (human and other hominid) samples, the pervasive presence of highly PCR-amplifiable human DNA contaminants in the vast majority of samples can lead to the creation of recombinant hybrids and other non-authentic artefacts. The resulting PCR-generated sequences can then be difficult, if not impossible, to authenticate. In contrast, single primer extension (SPEX)-based approaches can genotype single nucleotide polymorphisms from ancient fragments of DNA as accurately as modern DNA. A single SPEX-type assay can amplify just one of the duplex DNA strands at target loci and generate a multi-fold depth-of-coverage, with non-authentic recombinant hybrids reduced to undetectable levels. Crucially, SPEX-type approaches can preferentially access genetic information from damaged and degraded endogenous ancient DNA templates over modern human DNA contaminants. The development of SPEX-type assays offers the potential for highly accurate, quantitative genotyping from ancient hominin samples. PMID:19864251

  16. STR and mitochondrial DNA SNP typing of a bone marrow transplant recipient after death in a fire.

    PubMed

    Seo, Yasuhisa; Uchiyama, Daisuke; Kuroki, Kohji; Kishida, Tetsuko

    2012-11-01

    Personal identification of a house fire victim is described. About 5 years prior to death, the victim had undergone bone marrow transplantation (BMT) with a graft from an unrelated donor as treatment for acute myelogenous leukemia. Clinically, the victim had been in remission at the time of death. Typing of STRs and sequencing of mitochondrial DNA (mtDNA) were performed using blood from the heart as well as several soft (psoas major muscle, uterine muscle and mucous membrane of the urinary bladder) and hard (costal cartilage and nail) tissues. STR genotypes and amelogenin from each of the tissue samples were successfully typed, and the parentage was identified. The blood STR types demonstrated no relationship with those from other tissues. None of the blood STR loci showed extra peaks arising from those of the recipient. Therefore, the blood stem cells were assumed to have been altered to those of the donor. The genotypes of mtDNA control regions were also examined. The electropherogram of hypervariable region II (nucleotide positions 29-408) obtained from the blood revealed a similar length heteroplasmy, suggesting microchimerism of the blood. Sequence analysis of mtDNA might be applicable as a more sensitive method for determination of chimerisms after BMT.

  17. Genetic Diversity of the Mycobacterium tuberculosis Beijing Family Based on SNP and VNTR Typing Profiles in Asian Countries

    PubMed Central

    Chen, Yih-Yuan; Chang, Jia-Ru; Huang, Wei-Feng; Kuo, Shu-Chen; Su, Ih-Jen; Sun, Jun-Ren; Chiueh, Tzong-Shi; Huang, Tsi-Shu; Chen, Yao-Shen; Dou, Horng-Yunn

    2012-01-01

    The Mycobacterium tuberculosis (MTB) Beijing strain is highly virulent, drug resistant, and endemic over Asia. To explore the genetic diversity of this family in several different regions of eastern Asia, 338 Beijing strains collected in Taiwan (Republic of China) were analyzed by mycobacterial interspersed repetitive unit-variable number tandem repeat (MIRU-VNTR) typing and compared with published MIRU-VNTR profiles and by the Hunter-Gaston diversity index (HGDI) of Beijing strains from Japan and South Korea. The results revealed that VNTR2163b (HGDI>0.6) and five other loci (VNTR424, VNTR4052, VNTR1955, VNTR4156 and VNTR 2996; HGDI>0.3) could be used to discriminate the Beijing strains in a given geographic region. Analysis based on the number of VNTR repeats showed three VNTRs (VNTR424, 3192, and 1955) to be phylogenetically informative loci. In addition, to determine the geographic variation of sequence types in MTB populations, we also compared sequence type (ST) data of our strains with published ST profiles of Beijing strains from Japan and Thailand. ST10, ST22, and ST19 were found to be prevalent in Taiwan (82%) and Thailand (92%). Furthermore, classification of Beijing sublineages as ancient or modern in Taiwan was found to depend on the repeat number of VNTR424. Finally, phylogenetic relationships of MTB isolates in Taiwan, South Korea, and Japan were revealed by a minimum spanning tree based on MIRU-VNTR genotyping. In this topology, the MIRU-VNTR genotypes of the respective clusters were tightly correlated to other genotypic characters. These results are consistent with the hypothesis that clonal evolution of these MTB lineages has occurred. PMID:22808061
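    The Hunter-Gaston diversity index (HGDI) used above to rank VNTR loci is the probability that two isolates drawn at random without replacement have different types: HGDI = 1 - [1 / (N(N - 1))] * sum over types of n_j(n_j - 1), where N is the number of isolates and n_j the number of isolates of type j. A minimal sketch follows; the repeat counts in the usage line are invented for illustration and are not data from the study.

```python
from collections import Counter

def hunter_gaston_index(types):
    """Hunter-Gaston diversity index for a list of per-isolate type labels."""
    n = len(types)
    if n < 2:
        raise ValueError("at least two isolates are required")
    counts = Counter(types)
    same_type_pairs = sum(c * (c - 1) for c in counts.values())
    return 1.0 - same_type_pairs / (n * (n - 1))

# Hypothetical repeat numbers observed at one VNTR locus in six isolates.
print(hunter_gaston_index([2, 2, 3, 3, 4, 2]))
```

    A locus where every isolate carries the same repeat number scores 0, and a locus where every isolate differs scores 1, which is why higher HGDI loci discriminate strains better.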

  18. Two novel type 2 diabetes loci revealed through integration of TCF7L2 DNA occupancy and SNP association data.

    PubMed

    Johnson, Matthew E; Zhao, Jianhua; Schug, Jonathan; Deliard, Sandra; Xia, Qianghua; Guy, Vanessa C; Sainz, Jesus; Kaestner, Klaus H; Wells, Andrew D; Grant, Struan F A

    2014-01-01

    The transcription factor 7-like 2 (TCF7L2) locus is strongly implicated in the pathogenesis of type 2 diabetes (T2D). We previously mapped the genomic regions bound by TCF7L2 using ChIP (chromatin immunoprecipitation)-seq in the colorectal carcinoma cell line, HCT116, revealing an unexpected highly significant over-representation of genome-wide association studies (GWAS) loci associated primarily with endocrine (in particular T2D) and cardiovascular traits. In order to further explore if this observed phenomenon occurs in other cell lines, we carried out ChIP-seq in HepG2 cells and leveraged ENCODE data for five additional cell lines. Given that only a minority of the predicted genetic component to most complex traits has been identified to date, plus our GWAS-related observations with respect to TCF7L2 occupancy, we investigated if restricting association analyses to the genes yielded from this approach, in order to reduce the constraints of multiple testing, could reveal novel T2D loci. We found strong evidence for the continued enrichment of endocrine and cardiovascular GWAS categories, with additional support for cancer. When investigating all the known GWAS loci bound by TCF7L2 in the shortest gene list, derived from HCT116, the coronary artery disease-associated variant, rs46522 at the UBE2Z-GIP-ATP5G1-SNF8 locus, yielded significant association with T2D within DIAGRAM. Furthermore, when we analyzed tag-SNPs (single nucleotide polymorphisms) in genes not previously implicated by GWAS but bound by TCF7L2 within 5 kb, we observed a significant association of rs4780476 within CPPED1 in DIAGRAM. ChIP-seq data generated with this GWAS-implicated transcription factor provided a biologically plausible method to limit multiple testing in the assessment of genome-wide genotyping data to uncover two novel T2D-associated loci.

  19. Peopling of the North Circumpolar Region – Insights from Y Chromosome STR and SNP Typing of Greenlanders

    PubMed Central

    Olofsson, Jill Katharina; Pereira, Vania; Børsting, Claus; Morling, Niels

    2015-01-01

    The human population in Greenland is characterized by migration events of Paleo- and Neo-Eskimos, as well as admixture with Europeans. In this study, the Y-chromosomal variation in male Greenlanders was investigated in detail by typing 73 Y-chromosomal single nucleotide polymorphisms (Y-SNPs) and 17 Y-chromosomal short tandem repeats (Y-STRs). Approximately 40% of the analyzed Greenlandic Y chromosomes were of European origin (I-M170, R1a-M513 and R1b-M343). Y chromosomes of European origin were mainly found in individuals from the west and south coasts of Greenland, which is in agreement with the historic records of the geographic placements of European settlements in Greenland. Two Inuit Y-chromosomal lineages, Q-M3 (xM19, M194, L663, SA01 and L766) and Q-NWT01 (xM265) were found in 23% and 31% of the male Greenlanders, respectively. The time to the most recent common ancestor (TMRCA) of the Q-M3 lineage of the Greenlanders was estimated to be between 4,400 and 10,900 years ago (y. a.) using two different methods. This is in agreement with the theory that the North Circumpolar Region was populated via a second expansion of humans in the North American continent. The TMRCA of the Q-NWT01 (xM265) lineage in Greenland was estimated to be between 7,000 and 14,300 y. a. using two different methods, which is older than the previously reported TMRCA of this lineage in other Inuit populations. Our results indicate that Inuit individuals carrying the Q-NWT01 (xM265) lineage may have their origin in the northeastern parts of North America and could be descendants of the Dorset culture. This in turn points to the possibility that the current Inuit population in Greenland is comprised of individuals of both Thule and Dorset descent. PMID:25635810

  20. 3'-UTR SNP rs2229611 in G6PC1 affects mRNA stability, expression and Glycogen Storage Disease type-Ia risk.

    PubMed

    Karthi, Sellamuthu; Rajeshwari, Mohan; Francis, Amirtharaj; Saravanan, Matheshwaran; Varalakshmi, Perumal; Houlden, Henry; Thangaraj, Kumarasamy; Ashokkumar, Balasubramaniem

    2017-08-01

    The frequency of rs2229611, previously reported in Chinese, Caucasians, Japanese and Hispanics, was investigated for the first time in Indian ethnicity. We analyzed its role in the progression of Glycogen Storage Disease type-Ia (GSD-Ia) and breast cancer. Genotype data on rs2229611 revealed that the risk of GSD-Ia was higher (P=0.0195) with CC compared to TT/TC genotypes, whereas no such correlation was observed with breast cancer cases. We observed strong linkage disequilibrium (LD) between rs2229611 and other disease-causing G6PC1 variants (|D'|=1, r²=1). Functional validation performed in HepG2 cells using luciferase constructs showed a significant (P<0.05) decrease in expression relative to the wild-type 3'-UTR due to curtailed mRNA stability. Furthermore, AU-rich element (ARE)-mediated regulation of G6PC1 expression, characterized using 3'-UTR deletion constructs, showed a prominent decrease in mRNA stability. We then examined whether miRNAs are involved in controlling G6PC1 expression using pmirGLO-UTR constructs, with evidence of more distinct inhibition of the reporter function with rs2229611. These data suggest that rs2229611 is a crucial regulatory SNP which in the homozygous state leads to a more aggressive disease phenotype in GSD-Ia patients. The implication of this result is significant in predicting disease onset, progression and response to disease-modifying treatments in patients with GSD-Ia. Copyright © 2017 Elsevier B.V. All rights reserved.

  1. Identification of SNP-SNP interaction for chronic dialysis patients.

    PubMed

    Yang, Cheng-Hong; Weng, Zi-Jie; Chuang, Li-Yeh; Yang, Cheng-San

    2017-04-01

    Analyses of interactions between single nucleotide polymorphisms (SNPs) have reported significant associations between mitochondrial displacement loops (D-loops) and chronic dialysis diseases. However, the method used to detect potential SNP-SNP interaction still requires improvement. This study proposes an effective algorithm named dynamic center particle swarm optimization k-nearest neighbors (DCPSO-KNN) to detect the SNP-SNP interaction. DCPSO-KNN uses dynamic center particle swarm optimization (DCPSO) to generate SNP combinations with a fitness function designed using the KNN method and statistical verification. A total of 77 SNPs in the mitochondrial D-loop were used to detect the SNP-SNP interactions and the search ability was compared against that of other methods. The detected SNP-SNP interactions were statistically evaluated. Experimental results showed that DCPSO-KNN successfully detects SNP-SNP interactions in two-to-seven-order combinations (positive predictive value (PPV)+negative predictive value (NPV)=1.154 to 1.310; odds ratio (OR)=1.859 to 4.015; 95% confidence interval (95% CI)=1.151 to 4.265; p-value <0.001). DCPSO-KNN can improve the detection ability of SNP-SNP associations between mitochondrial D-loops and chronic dialysis diseases, thus facilitating the development of biomedical applications. Copyright © 2017 Elsevier Ltd. All rights reserved.

  2. SNP panels/Imputation

    USDA-ARS's Scientific Manuscript database

    Participants from thirteen countries discussed services that Interbull can perform or recommendations that Interbull can make to promote harmonization and assist member countries in improving their genomic evaluations in regard to SNP panels and imputation. The panel recommended: A mechanism to shar...

  3. Temple syndrome: A patient with maternal hetero-UPD14, mixed iso- and hetero-disomy detected by SNP microarray typing of patient-father duos.

    PubMed

    Shin, Eun-Hye; Cho, Eunhae; Lee, Cha Gon

    2016-08-01

    Temple syndrome (TS, MIM 616222) is an imprinting disorder involving genes within the imprinted region of chromosome 14q32. TS is a genetically complex disorder, which is associated with maternal uniparental disomy of chromosome 14 (UPD14), paternal deletions on chromosome 14, or loss of methylation at the intergenic differentially methylated region (IG-DMR). Here, we describe the case of a patient with maternal hetero-UPD14 arising from a mixed iso-/hetero-disomy mechanism, identified by single nucleotide polymorphism (SNP) array analysis of a patient-father duo. The phenotype of our case is similar to Prader-Willi syndrome (PWS) during infancy and to Russell-Silver syndrome (RSS) during childhood. This SNP array appears to be an effective initial screening tool for patients with nonspecific clinical features suggestive of chromosomal disorders. Copyright © 2016 The Japanese Society of Child Neurology. Published by Elsevier B.V. All rights reserved.

  4. Development of ARMS-PCR assay for genotyping of Pro12Ala SNP of PPARG gene: a cost effective way for case-control studies of type 2 diabetes in developing countries.

    PubMed

    Islam, Mehboob; Awan, Fazli Rabbi; Baig, Shahid Mahmood

    2014-09-01

    Type 2 diabetes (T2D) is a prevalent metabolic disorder across the globe. Research is underway on various aspects, including genetics, to understand and control the global epidemic of diabetes. Recently, several SNPs in various genes have been associated with T2D. These association studies are mainly carried out in developed countries through Genome Wide Association Scans, with follow-up replication/validation studies by high-throughput genotyping techniques (e.g. Taqman Technology). Although similar studies could be conducted in developing countries, the limiting factors are the associated cost and expertise. These factors hamper genetic association and replication studies in low-income countries aimed at establishing the role of putatively associated SNPs in diabetes. Several SNP detection methods exist (e.g. Taqman assay, Dot-blot, PCR-RFLP, DGGE, SSCP), but they are either expensive, labor intensive or less sensitive. Hence, our aim was to develop a low-cost method for the validation of the PPARG (Pro12Ala, CCA>GCA) SNP (rs1801282) for its association with T2D. Here, we developed a cost-effective and rapid amplification refractory mutation specific-PCR (ARMS-PCR) method for detection of this SNP. We successfully genotyped the PPARG SNP (Pro12Ala) in human samples, and the validity of this method was confirmed by DNA sequencing of a few representative samples for the three different genotypes. Furthermore, ARMS-PCR was applied to T2D patients and control samples for the screening of this SNP.

  5. [SNP-19 genotypic variants of CAPN10 gene and its relation to diabetes mellitus type 2 in a population of Ciudad Juarez, Mexico].

    PubMed

    Loya Méndez, Yolanda; Reyes Leal, Gilberto; Sánchez González, Adriana; Portillo Reyes, Verónica; Reyes Ruvalcaba, David; Bojórquez Rangel, Guillermo

    2014-09-28

    Introduction: Type 2 diabetes mellitus (DM) is a common disease of multifactorial origin whose exact genetic basis is still unknown; several studies suggest that single nucleotide polymorphisms (SNPs) in the CAPN10 gene (locus 2q37.3) could be involved in its development, including the insertion/deletion polymorphism SNP-19 (2R→3R). Objective: To determine the relationship between the SNP-19 polymorphism and the presence of type 2 DM in a population of Ciudad Juárez. Methods: A total of 107 individuals were selected: 43 type 2 diabetics (cases) and 64 non-diabetics with no first-degree family history of type 2 DM (controls). Anthropometric measurements and a biochemical profile of lipids, lipoproteins and serum glucose were obtained. DNA was extracted from peripheral blood lymphocytes and amplified by polymerase chain reaction (PCR). Genotypes of the SNP-19 polymorphism of the CAPN10 gene were analyzed by agarose gel electrophoresis. Genotype and allele frequencies were calculated and Hardy-Weinberg equilibrium tests were performed (GenAlEx 6.4). Results: The χ² test identified differences in genotypes between cases and controls, with a higher frequency of the homozygous 3R genotype of SNP-19 in the case group (0.418) than in the control group (0.265). The 2R/3R genotype was associated with elevated values of weight, body mass index and waist and hip circumferences, but only in the diabetic group (P ≤ 0.05). Conclusion: The results of this study suggest that SNP-19 of the CAPN10 gene is involved in the development of type 2 DM in the population studied.
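
    The Hardy-Weinberg equilibrium test mentioned in the abstract above can be illustrated with a minimal sketch. The study used GenAlEx 6.4; the function below is not that software's implementation but the textbook chi-square goodness-of-fit test for a biallelic locus (such as the 2R/3R alleles of SNP-19). The function name and the genotype counts in the usage note are hypothetical, since the abstract does not report per-genotype counts.

```python
def hardy_weinberg_chi2(n_aa, n_ab, n_bb):
    """Chi-square statistic (1 degree of freedom) for departure from
    Hardy-Weinberg proportions at a biallelic locus.

    n_aa, n_ab, n_bb are observed genotype counts (homozygote A,
    heterozygote, homozygote B). Returns (frequency of allele A, chi2).
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("at least one genotyped individual is required")
    p = (2 * n_aa + n_ab) / (2 * n)  # allele A frequency from genotype counts
    q = 1.0 - p
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_aa, n_ab, n_bb)
    # Skip classes with zero expectation to avoid division by zero.
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    return p, chi2


if __name__ == "__main__":
    # Hypothetical counts, NOT taken from the study.
    freq_a, chi2 = hardy_weinberg_chi2(30, 40, 30)
    print(f"allele frequency = {freq_a:.3f}, chi-square = {chi2:.3f}")
```

    A chi-square value above 3.84 (1 degree of freedom, 5% level) would conventionally be read as a departure from Hardy-Weinberg proportions; that threshold is a general statistical convention, not a figure from the abstract.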

  6. Analysis of consequences of non-synonymous SNP in feed conversion ratio associated TGF-β receptor type 3 gene in chicken

    PubMed Central

    Rasal, Kiran D.; Shah, Tejas M.; Vaidya, Megha; Jakhesara, Subhash J.; Joshi, Chaitanya G.

    2015-01-01

    The recent advances in high throughput sequencing technology accelerate possible ways for the study of genome wide variation in several organisms and associated consequences. In the present study, mutations in TGFBR3 showing significant association with FCR trait in chicken during exome sequencing were further analyzed. Out of four SNPs, one nsSNP p.Val451Leu was found in the coding region of TGFBR3. In silico tools such as SnpSift and PANTHER predicted it as deleterious (0.04) and to be tolerated, respectively, while I-Mutant revealed that protein stability decreased. The TGFBR3 I-TASSER model has a C-score of 0.85, which was validated using PROCHECK. Based on MD simulation, mutant protein structure deviated from native with RMSD 0.08 Å due to change in the H-bonding distances of mutant residue. The docking of TGFBR3 with interacting TGFBR2 inferred that mutant required more global energy. Therefore, the present study will provide useful information about functional SNPs that have an impact on FCR traits. PMID:25941634

  7. MDM2 SNP309 and SNP285 Act as Negative Prognostic Markers for Non-small Cell Lung Cancer Adenocarcinoma Patients

    PubMed Central

    Deben, Christophe; Op de Beeck, Ken; Van den Bossche, Jolien; Jacobs, Julie; Lardon, Filip; Wouters, An; Peeters, Marc; Van Camp, Guy; Rolfo, Christian; Deschoolmeester, Vanessa; Pauwels, Patrick

    2017-01-01

    Objectives: Two functional polymorphisms in the MDM2 promoter region, SNP309T>G and SNP285G>C, have been shown to impact MDM2 expression and cancer risk. Currently available data on the prognostic value of MDM2 SNP309 in non-small cell lung cancer (NSCLC) is contradictory and unavailable for SNP285. The goal of this study was to clarify the role of these MDM2 SNPs in the outcome of NSCLC patients. Materials and Methods: In this study we genotyped SNP309 and SNP285 in 98 NSCLC adenocarcinoma patients and determined MDM2 mRNA and protein levels. In addition, we assessed the prognostic value of these common SNPs on overall and progression free survival, taking into account the TP53 status of the tumor. Results and Conclusion: We found that the SNP285C allele, but not the SNP309G allele, was significantly associated with increased MDM2 mRNA expression levels (p = 0.025). However, we did not observe an association with MDM2 protein levels for SNP285. The SNP309G allele was significantly associated with the presence of wild type TP53 (p = 0.047) and showed a strong trend towards increased MDM2 protein levels (p = 0.068). In addition, patients harboring the SNP309G allele showed a worse overall survival, but only in the presence of wild type TP53. The SNP285C allele was significantly associated with an early age of diagnosis and metastasis. Additionally, the SNP285C allele acted as an independent predictor for worse progression free survival (HR = 3.97; 95% CI = 1.51 - 10.42; p = 0.005). Our data showed that both SNP309 (in the presence of wild type TP53) and SNP285 act as negative prognostic markers for NSCLC patients, implicating a prominent role for these variants in the outcome of these patients. PMID:28819417

  8. A Mutagenic Primer Assay for Genotyping of the CRHR1 Gene Rare Variant rs1876828 (A/G) in Asians: A Cost-Effective SNP Typing.

    PubMed

    Sharma, Neeraj; Awasthi, Shally; Phadke, Shubha R

    2016-03-01

    Genetic and genomic research has entered a new era of high-throughput genotyping technology. However, mutagenic polymerase chain reaction-restriction fragment length polymorphism (PCR-RFLP) is still a genotyping method of choice in molecular epidemiological research. It has been extensively used for the detection of risk alleles when the target SNP has no natural discriminating restriction site. We undertook this study to develop a mutagenic primer assay for a rare CRHR1 gene variant, rs1876828 (A/G), and to determine its allele frequency in north Indian children. The mutagenic primers were designed and assay conditions were optimized to perform mutagenic PCR-RFLP in 550 subjects. The efficiency of the assay and its results were validated by sequencing. This study demonstrated that the mutagenic primer assay is feasible and applicable for discriminating the rare CRHR1 gene variant rs1876828 (A/G), and that the frequency of allele "G" was 100% in north Indian asthmatics as well as normal subjects. This method can be used for both large- and small-scale studies of complex genetic traits in which the CRHR1 gene plays a pivotal role. © 2014 Wiley Periodicals, Inc.

  9. Genetic variation in Pythium myriotylum based on SNP typing and development of a PCR-RFLP detection of isolates recovered from Pythium soft rot ginger.

    PubMed

    Le, D P; Smith, M K; Aitken, E A B

    2017-10-01

    Pythium myriotylum is responsible for severe losses in both capsicum and ginger crops in Australia under different regimes. Intraspecific genomic variation within the pathogen might explain the differences in aggressiveness and pathogenicity on diverse hosts. In this study, whole genome data of four P. myriotylum isolates recovered from three hosts and one Pythium zingiberis isolate were derived and analysed for sequence diversity based on single nucleotide polymorphisms (SNPs). A higher number of true and unique SNPs occurred in P. myriotylum isolates obtained from ginger with symptoms of Pythium soft rot (PSR) in Australia compared to other P. myriotylum isolates. Overall, more SNPs were discovered in the mitochondrial genome than in the nuclear genome. Among the SNPs, a single substitution from cytosine (C) to thymine (T) in the partially sequenced CoxII gene of 14 representatives of PSR P. myriotylum isolates was within a restriction site of the HinP1I enzyme, which was used in the PCR-RFLP for detection and identification of the isolates without sequencing. The PCR-RFLP was also sensitive enough to detect PSR P. myriotylum strains from artificially infected ginger without the need for isolation of pure cultures. This is the first study of intraspecific variants of Pythium myriotylum isolates recovered from different hosts and origins based on single nucleotide polymorphism (SNP) genotyping of multiple genes. The SNPs discovered provide valuable markers for detection and identification of P. myriotylum strains initially isolated from Pythium soft rot (PSR) ginger by using PCR-RFLP of the CoxII locus. The PCR-RFLP was also sensitive enough to detect P. myriotylum directly from PSR ginger sampled from pot trials without the need for isolation of pure cultures. © 2017 The Society for Applied Microbiology.
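
    The PCR-RFLP logic described in the abstract above (a single C/T substitution inside a restriction site changes the fragment pattern) can be sketched as follows. This is only an illustration under stated assumptions: HinP1I is generally catalogued as recognising GCGC and cutting after the first G, which should be verified against an enzyme database before use; the two toy amplicon sequences are invented and are not CoxII sequences; and which allele carries the intact site in the study is not stated in the abstract.

```python
def digest_fragment_lengths(seq, site="GCGC", cut_offset=1):
    """Return the fragment lengths produced by cutting a linear sequence
    at every occurrence of `site`.

    `cut_offset` is the number of bases into the site at which the top
    strand is cut (assumed to be 1 for HinP1I, i.e. G^CGC).
    """
    seq = seq.upper()
    cut_positions = []
    start = 0
    while True:
        idx = seq.find(site, start)
        if idx == -1:
            break
        cut_positions.append(idx + cut_offset)
        start = idx + 1  # allow overlapping occurrences of the site
    bounds = [0] + cut_positions + [len(seq)]
    return [right - left for left, right in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    # Hypothetical amplicons differing by one C/T substitution in the site.
    amplicon_c = "AAAAGCGCTTTT"  # site intact: expected to be cut
    amplicon_t = "AAAAGTGCTTTT"  # site destroyed: expected to stay uncut
    print(digest_fragment_lengths(amplicon_c))
    print(digest_fragment_lengths(amplicon_t))
```

    On a gel, the cut amplicon would show two bands and the uncut amplicon one, which is what allows the isolates to be told apart without sequencing.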

  10. Evaluation of the iPLEX® Sample ID Plus Panel designed for the Sequenom MassARRAY® system. A SNP typing assay developed for human identification and sample tracking based on the SNPforID panel.

    PubMed

    Johansen, P; Andersen, J D; Børsting, C; Morling, N

    2013-09-01

    Sequenom launched the first commercial SNP typing kit for human identification, named the iPLEX® Sample ID Plus Panel. The kit amplifies 47 of the 52 SNPs in the SNPforID panel, amelogenin and two Y-chromosome SNPs in one multiplex PCR. The SNPs were analyzed by single base extension (SBE) and Matrix Assisted Laser Desorption/Ionization-Time of Flight Mass Spectrometry (MALDI-TOF MS). In this study, we evaluated the accuracy and sensitivity of the iPLEX® Sample ID Plus Panel by comparing its typing results with those obtained with our ISO 17025 accredited SNPforID assay. The average call rate for duplicate typing of any one SNP in the panel was 90.0% when the mass spectra were analyzed automatically with the MassARRAY® TYPER 4.0 genotyping software in real time. Two reproducible inconsistencies were observed (error rate: 0.05%) at two different SNP loci. In addition, four inconsistencies were observed once. The optimal amount of template DNA in the PCR was ≥10 ng. There was a relatively high risk of allele and locus drop-outs when ≤1 ng template DNA was used. We developed an R script with a stringent set of "forensic analysis parameters" based on the peak height and the signal-to-noise data exported from the TYPER 4.0 software. With the forensic analysis parameters, all inconsistencies were eliminated in reactions with ≥10 ng DNA. However, the average call rate decreased to 69.9%. The iPLEX® Sample ID Plus Panel was tested on 10 degraded samples from forensic case-work. Two samples could not be typed, presumably because the samples contained PCR and SBE inhibitors. The average call rate was generally lower for degraded DNA samples and the number of inconsistencies higher than for pristine DNA. However, none of the inconsistencies were reproduced and the highest match probability for the degraded samples typed with the panel was 1.7E-9 using the stringent forensic analysis parameters. Although the relatively low

  11. A Bayesian Framework for SNP Identification

    SciTech Connect

    Webb-Robertson, Bobbie-Jo M.; Havre, Susan L.; Payne, Deborah A.

    2005-07-01

    Current proteomics techniques, such as mass spectrometry, focus on protein identification, usually ignoring most types of modifications beyond post-translational modifications, with the assumption that only a small number of peptides have to be matched to a protein for a positive identification. However, not all proteins are being identified with current techniques and improved methods to locate points of mutation are becoming a necessity. In the case when single-nucleotide polymorphisms (SNPs) are observed, brute force is the most common method to locate them, quickly becoming computationally unattractive as the size of the database associated with the model organism grows. We have developed a Bayesian model for SNPs, BSNP, incorporating evolutionary information at both the nucleotide and amino acid levels. Formulating SNPs as a Bayesian inference problem allows probabilities of interest to be easily obtained, for example the probability of a specific SNP or specific type of mutation over a gene or entire genome. Three SNP databases were observed in the evaluation of the BSNP model; the first SNP database is a disease specific gene in human, hemoglobin, the second is also a disease specific gene in human, p53, and the third is a more general SNP database for multiple genes in mouse. We validate that the BSNP model assigns higher posterior probabilities to the SNPs defined in all three separate databases than can be attributed to chance under specific evolutionary information, for example the amino acid model described by Majewski and Ott in conjunction with either the four-parameter nucleotide model by Bulmer or seven-parameter nucleotide model by Majewski and Ott.

  12. SNP genotyping by heteroduplex analysis.

    PubMed

    Paniego, Norma; Fusari, Corina; Lia, Verónica; Puebla, Andrea

    2015-01-01

    Heteroduplex-based genotyping methods have proven to be technologically effective and economically efficient for low- to medium-range throughput single-nucleotide polymorphism (SNP) determination. In this chapter we describe two protocols that were successfully applied for SNP detection and haplotype analysis of candidate genes in association studies. The protocols involve (1) enzymatic mismatch cleavage with endonuclease CEL1 from celery, associated with fragment separation using capillary electrophoresis (CEL1 cleavage), and (2) differential retention of the homo/heteroduplex DNA molecules under partial denaturing conditions on ion pair reversed-phase liquid chromatography (dHPLC). The two methods are complementary: dHPLC is more versatile than CEL1 cleavage for identifying multiple SNPs per target region, whereas CEL1 cleavage is easily optimized for sequences with fewer SNPs or small insertion/deletion polymorphisms. In addition, CEL1 cleavage is a powerful method for localizing the position of the mutation when fragment resolution is done using capillary electrophoresis.

  13. Novel multiplex real-time PCR system using the SNP technology for the simultaneous diagnosis of Chlamydia trachomatis, Ureaplasma parvum and Ureaplasma urealyticum and genetic typing of serovars of C. trachomatis and U. parvum in NGU.

    PubMed

    Tang, Jingfeng; Zhou, Li; Liu, Xiaoying; Zhang, Changming; Zhao, Youyun; Wang, Yefu

    2011-02-01

    To explore the possibilities of a novel multiplex real-time PCR system for rapid diagnosis, genetic typing of serovars and clinical application in NGU, we developed a multiplex real-time PCR system for the simultaneous diagnosis of Chlamydia trachomatis, Ureaplasma parvum and Ureaplasma urealyticum and molecular detection of serovars of C. trachomatis and U. parvum in NGU using the SNP technology and TaqMan-LNA probe. In 57 pathogen-positive clinical specimens, we identified the following C. trachomatis serovars: D (21.05%, 12/57), E (36.84%, 21/57), F (19.30%, 11/57), G (8.77%, 5/57), H (5.26%, 3/57), J (3.51%, 2/57), and K (5.26%, 3/57). In 115 pathogen-positive clinical specimens, we identified the following U. parvum serovars: 1 (1.74%, 2/115), 3 (55.65%, 64/115), 6 (20.87%, 24/115) and 14 (21.74%, 25/115). Our fast pathogen diagnosis and serotyping assay using real-time TaqMan-LNA PCR may improve our ability to study the pathogenesis and epidemiology of NGU.
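
    As a consistency check on the serovar distributions reported in the abstract above, the percentages can be recomputed from the specimen counts. The counts below are the ones given in the abstract; the function and variable names are illustrative only.

```python
def serovar_percentages(counts):
    """Convert a mapping of serovar -> specimen count into percentages
    of the total, rounded to two decimal places."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no specimens")
    return {serovar: round(100.0 * n / total, 2) for serovar, n in counts.items()}


# Counts as listed in the abstract (57 and 115 positive specimens).
C_TRACHOMATIS = {"D": 12, "E": 21, "F": 11, "G": 5, "H": 3, "J": 2, "K": 3}
U_PARVUM = {"1": 2, "3": 64, "6": 24, "14": 25}

if __name__ == "__main__":
    for name, counts in (("C. trachomatis", C_TRACHOMATIS), ("U. parvum", U_PARVUM)):
        print(name, "n =", sum(counts.values()), serovar_percentages(counts))
```

    Recomputing from counts rather than copying percentages makes it easy to confirm that each group sums to its stated total and that every percentage matches its fraction.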

  14. A method for developing high-density SNP maps and its application at the type 1 angiotensin II receptor (AGTR1) locus.

    PubMed

    Antonellis, Anthony; Rogus, John J; Canani, Luis H; Makita, Yuchiro; Pezzolesi, Marcus G; Nam, MoonSuk; Ng, Daniel; Moczulski, Dariusz; Warram, James H; Krolewski, Andrzej S

    2002-03-01

    Evaluating the potential genetic components of complex disease will likely be aided through the use of dense polymorphism maps. Previously, we reported evidence for linkage with diabetic nephropathy on chromosome 3q in a region encompassing the type 1 angiotensin II receptor (AGTR1) gene. To further investigate any role for this gene in disease onset, we set out to design a dense polymorphism map spanning the AGTR1 locus for the purpose of association studies. Toward this goal, we have developed a technique for rapid identification of polymorphisms in long stretches of genomic DNA. This approach uses long-range PCR, DNA pooling, and transposon-based DNA sequencing. Using this technique, we efficiently validated and genotyped 18 polymorphisms spanning the 60.5-kb AGTR1 locus. Our panel of polymorphisms has an average spacing of 3.2 kb and an average minor allele frequency of 24%.

  15. The Utility of High-Resolution Melting Analysis of SNP Nucleated PCR Amplicons—An MLST Based Staphylococcus aureus Typing Scheme

    PubMed Central

    Giffard, Philip M.; Holt, Deborah C.

    2011-01-01

    High resolution melting (HRM) analysis is gaining prominence as a method for discriminating DNA sequence variants. Its advantage is that it is performed in a real-time PCR device, and the PCR amplification and HRM analysis are closed tube, and effectively single step. We have developed an HRM-based method for Staphylococcus aureus genotyping. Eight single nucleotide polymorphisms (SNPs) were derived from the S. aureus multi-locus sequence typing (MLST) database on the basis of maximized Simpson's Index of Diversity. Only G↔A, G↔T, C↔A, C↔T SNPs were considered for inclusion, to facilitate allele discrimination by HRM. In silico experiments revealed that DNA fragments incorporating the SNPs give much higher resolving power than randomly selected fragments. It was shown that the predicted optimum fragment size for HRM analysis was 200 bp, and that other SNPs within the fragments contribute to the resolving power. Six DNA fragments ranging from 83 bp to 219 bp, incorporating the resolution optimized SNPs were designed. HRM analysis of these fragments using 94 diverse S. aureus isolates of known sequence type or clonal complex (CC) revealed that sequence variants are resolved largely in accordance with G+C content. A combination of experimental results and in silico prediction indicates that HRM analysis resolves S. aureus into 268 “melt types” (MelTs), and provides a Simpson's Index of Diversity of 0.978 with respect to MLST. There is a high concordance between HRM analysis and the MLST defined CCs. We have generated a Microsoft Excel key which facilitates data interpretation and translation between MelT and MLST data. The potential of this approach for genotyping other bacterial pathogens was investigated using a computerized approach to estimate the densities of SNPs with unlinked allelic states. The MLST databases for all species tested contained abundant unlinked SNPs, thus suggesting that high resolving power is not dependent upon large numbers of
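
    Simpson's Index of Diversity, used in the abstract above to select SNPs and to score the typing scheme, can be computed as in the sketch below. It assumes the form commonly used for typing methods, D = 1 - Σ n_j(n_j - 1) / (N(N - 1)), where n_j is the number of isolates assigned to type j and N is the total number of isolates; the abstract does not state which estimator the authors applied, and the isolate labels in the usage note are hypothetical.

```python
from collections import Counter


def simpsons_index_of_diversity(type_labels):
    """Simpson's Index of Diversity for a list of type assignments.

    Returns the probability that two isolates drawn at random without
    replacement belong to different types (0 = one type, 1 = all unique).
    """
    counts = Counter(type_labels)
    n_total = sum(counts.values())
    if n_total < 2:
        raise ValueError("at least two isolates are required")
    same_type_pairs = sum(c * (c - 1) for c in counts.values())
    return 1.0 - same_type_pairs / (n_total * (n_total - 1))


if __name__ == "__main__":
    # Hypothetical melt-type assignments for six isolates.
    melt_types = ["MelT1", "MelT1", "MelT2", "MelT3", "MelT3", "MelT4"]
    print(round(simpsons_index_of_diversity(melt_types), 3))
```

    A SNP set chosen to maximise this index splits the isolate collection into many evenly sized groups, which is why it serves as a resolving-power criterion.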

  16. Association of an Exon SNP of SLC2A9 Gene with Hyperuricemia Complicated with Type 2 Diabetes Mellitus in the Chinese Male Han Population.

    PubMed

    Xing, Shi-Chao; Wang, Xu-Fu; Miao, Zhi-Min; Zhang, Xue-Zhi; Zheng, Jun; Yuan, Ying

    2015-04-01

    Several recent genome-wide association studies and follow-up studies have identified that genetic variants of SLC2A9 are associated with hyperuricemia (HUA) and diabetes mellitus (DM). Here, we set out to investigate whether variation in exon 9 of the SLC2A9 gene is associated with HUA complicated with Type 2 DM (T2DM) in the Chinese male Han population. The present study was designed to study the rs2280205 polymorphism in exon 9 of SLC2A9 in 232 Chinese male subjects. The rs2280205 locus was genotyped in 52 T2DM subjects, 65 HUA subjects, 55 subjects with HUA complicated with T2DM, as well as 60 control subjects. DNA from peripheral blood was purified and amplified by polymerase chain reaction (PCR). The PCR products were then digested with the restriction enzyme MspI, and a portion of the PCR products was sequenced and analyzed. There was no significant difference in the levels of cholesterol, creatinine, and urea nitrogen between the control group and the HUA group. There was also no significant difference in levels of cholesterol between the DM group and the control group. No significant difference in cholesterol and uric acid was observed between the HUA group and the HUA accompanied with DM group (P > 0.05). There was also no statistically significant difference in genotype frequency among these groups (P > 0.01). Results of the present study suggest that the exon 9 109C/T polymorphism of the SLC2A9 gene is not associated with HUA and diabetes in the population living in the coastal area of Shandong province, China.

  17. RASSF1A and the rs2073498 Cancer Associated SNP

    PubMed Central

    Donninger, Howard; Barnoud, Thibaut; Nelson, Nick; Kassler, Suzanna; Clark, Jennifer; Cummins, Timothy D.; Powell, David W.; Nyante, Sarah; Millikan, Robert C.; Clark, Geoffrey J.

    2011-01-01

    RASSF1A is one of the most frequently inactivated tumor suppressors yet identified in human cancer. It is pro-apoptotic and appears to function as a scaffolding protein that interacts with a variety of other tumor suppressors to modulate their function. It can also complex with the Ras oncoprotein and may serve to integrate pro-growth and pro-death signaling pathways. A SNP has been identified that is present in approximately 29% of European populations [rs2073498, A(133)S]. Several studies have now presented evidence that this SNP is associated with an enhanced risk of developing breast cancer. We have used a proteomics based approach to identify multiple differences in the pattern of protein/protein interactions mediated by the wild type compared to the SNP variant protein. We have also identified a significant difference in biological activity between wild type and SNP variant protein. However, we have found only a very modest association of the SNP with breast cancer predisposition. PMID:22649770

  18. Dual Effects of a RETN Single Nucleotide Polymorphism (SNP) at -420 on Plasma Resistin: Genotype and DNA Methylation.

    PubMed

    Onuma, Hiroshi; Tabara, Yasuharu; Kawamura, Ryoichi; Ohashi, Jun; Nishida, Wataru; Takata, Yasunori; Ochi, Masaaki; Nishimiya, Tatsuya; Ohyagi, Yasumasa; Kawamoto, Ryuichi; Kohara, Katsuhiko; Miki, Tetsuro; Osawa, Haruhiko

    2017-03-01

    We previously reported that single nucleotide polymorphism (SNP)-420 C>G (rs1862513) in the promoter region of RETN was associated with type 2 diabetes. Plasma resistin was tightly correlated with SNP-420 genotypes. SNP-420 is a CpG-SNP affecting the sequence of cytosine-phosphate-guanine dinucleotides. To examine whether methylation at SNP-420 affects plasma resistin, we analyzed plasma resistin and methylation at RETN SNP-420. Genomic DNA was extracted from peripheral white blood cells in 2078 Japanese subjects. Quantification of the methylation was performed by pyrosequencing after DNA bisulfite conversion. Methylation at SNP-420 was highest in the C/C genotype (36.9 ± 5.7%), followed by C/G (21.4 ± 3.5%) and G/G (2.9 ± 1.4%; P < 0.001). When assessed in each genotype, methylation at SNP-420 was inversely associated with plasma resistin in the C/C (β = -0.134, P < 0.001) or C/G (β = -0.227, P < 0.001) genotype. In THP-1 human monocytes intrinsically having the C/C genotype, a demethylating reagent, 5-aza-dC, decreased the methylation at SNP-420 and increased RETN messenger RNA. SNP+1263 (rs3745369), located in the 3' untranslated region of RETN, was also associated with methylation at SNP-420. In addition, highly sensitive C-reactive protein was inversely associated with methylation at SNP-420 in the C/C genotype, whereas body mass index was positively associated. Plasma resistin was inversely associated with the extent of methylation at SNP-420 mainly dependent on the SNP-420 genotype. The association can also be explained partially independent of SNP-420 genotypes. SNP-420 could have dual, genetic and epigenetic effects on plasma resistin.

  19. CACNA1C SNP rs1006737 associates with bipolar I disorder independent of the Bcl-2 SNP rs956572 variant and its associated effect on intracellular calcium homeostasis.

    PubMed

    Uemura, Takuji; Green, Marty; Warsh, Jerry J

    2016-10-01

    Intracellular calcium (Ca(2+)) dyshomeostasis (ICDH) has been implicated in bipolar disorder (BD) pathophysiology. We previously showed that SNP rs956572 in the B-cell CLL/lymphoma 2 (Bcl-2) gene associates with elevated B lymphoblast (BLCL) intracellular Ca(2+) concentrations ([Ca(2+)]B) differentially in BD-I. Genome-wide association studies strongly support the association between BD and the SNP rs1006737, located within the L-type voltage-dependent Ca(2+) channel α1C subunit gene (CACNA1C). Here we investigated whether this CACNA1C variant also associates with ICDH and interacts with SNP rs956572 on [Ca(2+)]B in BD-I. CACNA1C SNP rs1006737 was genotyped in 150 BD-I, 65 BD-II, 30 major depressive disorder patients, and 70 healthy subjects with available BLCL [Ca(2+)]B and Bcl-2 SNP rs956572 genotype measures. SNP rs1006737 was significantly associated with BD-I. The [Ca(2+)]B was significantly higher in BD-I rs1006737 A compared with healthy A allele carriers and also in healthy GG compared with A allele carriers. There was no significant interaction between SNP rs1006737 and SNP rs956572 on [Ca(2+)]B. Our study further supports the association of SNP rs1006737 with BD-I and suggests that CACNA1C SNP rs1006737 and Bcl-2 SNP rs956572, or specific causal variants in LD with these proxies, act independently to increase risk and ICDH in BD-I.

  20. SNIT: SNP Identification for Strain Typing

    DTIC Science & Technology

    2011-01-01

    Durkin S, Schneewind O, Nierman WC: Genome sequencing and analysis of Yersina pestis KIM D27, an avirulent strain exempt from select agent regulation. PLoS...generated from next-generation sequencing (NGS) data, we selected the recently published Yersinia pestis KIM D27 genome [12]. The Y. pestis D27 strain...is a derivative of Y. pestis KIM 10 strain (accession no. NC_004088). The Y. pestis KIM D27 draft genome (accession no. ADDC00000000) was generated

  1. The Association of CYP1A1 Gene With Cervical Cancer and Additional SNP-SNP Interaction in Chinese Women.

    PubMed

    Li, Shuhong; Li, Guiqin; Kong, Fanqiang; Liu, Zhifen; Li, Ning; Li, Yan; Guo, Xiaojing

    2016-11-01

    The aim of this study was to investigate the association between CYP1A1 gene polymorphism and cervical cancer risk, and the impact of SNP-SNP interaction on cervical cancer risk in Chinese women. A total of 728 females with a mean age of 60.1 ± 14.5 years old were selected, including 360 cervical cancer patients and 368 normal controls. Logistic regression was performed to investigate the association between single-nucleotide polymorphisms (SNPs) and cervical cancer risk. Generalized multifactor dimensionality reduction (GMDR) was used to analyze the SNP-SNP interaction. Logistic analysis showed a significant association between rs4646903 and increased cervical cancer risk. Carriers of the homozygous mutant genotype of rs4646903 had a higher cervical cancer risk than wild-type homozygotes; the OR (95% CI) was 1.45 (1.20-1.95). There was a significant two-locus model (P = 0.0107) involving rs4646903 and rs1048943, indicating a potential SNP-SNP interaction between rs4646903 and rs1048943. Overall, the two-locus models had a cross-validation consistency of 10 of 10 and a testing accuracy of 60.72%. Subjects with the TC or CC genotype of rs4646903 and the AG or GG genotype of rs1048943 had the highest cervical cancer risk compared with subjects with the TT genotype of rs4646903 and the AA genotype of rs1048943; the OR (95% CI) was 2.03 (1.42-2.89). rs4646903 minor alleles and the interaction between rs4646903 and rs1048943 were associated with increased cervical cancer risk. © 2016 Wiley Periodicals, Inc.

  2. Linear reduction methods for tag SNP selection.

    PubMed

    He, Jingwu; Zelikovsky, Alex

    2004-01-01

    It is widely hoped that constructing a complete human haplotype map will help to associate complex diseases with certain SNPs. Unfortunately, the number of SNPs is huge and it is very costly to sequence many individuals. Therefore, it is desirable to reduce the number of SNPs that should be sequenced to a considerably smaller number of informative representatives, so-called tag SNPs. In this paper, we propose a new linear-algebra-based method for selecting and using tag SNPs. Our method is purely combinatorial and can be combined with linkage disequilibrium (LD) and block-based methods. We measure the quality of our tag SNP selection algorithm by comparing actual SNPs with SNPs linearly predicted from linearly chosen tag SNPs. We obtain extremely good compression and prediction rates. For example, for long haplotypes (>25000 SNPs), knowing only 0.4% of all SNPs we predict the entire unknown haplotype with 2% accuracy while the prediction method is based on a 10% sample of the population.
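
    The idea of "linearly chosen" tag SNPs can be illustrated with a minimal sketch. It is not the authors' algorithm: it only assumes that haplotypes are rows of a 0/1 matrix and picks, left to right, the SNP columns that are linearly independent of the columns already picked, so that every remaining column is a linear combination of the tags. The function name and the toy matrix are hypothetical.

```python
from fractions import Fraction


def select_tag_snps(haplotypes):
    """Return indices of SNP columns that form a basis of the column space.

    haplotypes: list of rows, each a list of 0/1 alleles (one row per haplotype).
    Illustrative only; exact rational arithmetic avoids rounding issues.
    """
    n_snps = len(haplotypes[0])
    basis = []  # (pivot_index, reduced_column) for every tag chosen so far
    tags = []
    for j in range(n_snps):
        col = [Fraction(row[j]) for row in haplotypes]
        # Eliminate the components explained by the tags chosen so far.
        for pivot, vec in basis:
            if col[pivot] != 0:
                factor = col[pivot] / vec[pivot]
                col = [c - factor * v for c, v in zip(col, vec)]
        pivot = next((i for i, c in enumerate(col) if c != 0), None)
        if pivot is not None:  # column j adds new information: keep it as a tag
            basis.append((pivot, col))
            tags.append(j)
    return tags


# Hypothetical toy data: column 2 equals column 0 + column 1, so it is not a tag.
toy = [
    [1, 0, 1, 1],
    [0, 1, 1, 0],
    [0, 0, 0, 1],
]
print(select_tag_snps(toy))  # [0, 1, 3]
```

    In this toy case three of the four SNPs are kept; on real haplotype data the compression depends on how much linear redundancy the population sample contains.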

  3. A 48 SNP set for grapevine cultivar identification

    PubMed Central

    2011-01-01

    Background Rapid and consistent genotyping is an important requirement for cultivar identification in many crop species. Among them grapevine cultivars have been the subject of multiple studies given the large number of synonyms and homonyms generated during many centuries of vegetative multiplication and exchange. Simple sequence repeat (SSR) markers have been preferred until now because of their high level of polymorphism, their codominant nature and their high profile repeatability. However, the rapid application of partial or complete genome sequencing approaches is identifying thousands of single nucleotide polymorphisms (SNP) that can be very useful for such purposes. Although SNP markers are bi-allelic, and therefore not as polymorphic as microsatellites, the high number of loci that can be multiplexed and the possibilities of automation as well as their highly repeatable results under any analytical procedure make them the future markers of choice for any type of genetic identification. Results We analyzed over 300 SNP in the genome of grapevine using a re-sequencing strategy in a selection of 11 genotypes. Among the identified polymorphisms, we selected 48 SNP spread across all grapevine chromosomes with allele frequencies balanced enough as to provide sufficient information content for genetic identification in grapevine allowing for good genotyping success rate. Marker stability was tested in repeated analyses of a selected group of cultivars obtained worldwide to demonstrate their usefulness in genetic identification. Conclusions We have selected a set of 48 stable SNP markers with a high discrimination power and a uniform genome distribution (2-3 markers/chromosome), which is proposed as a standard set for grapevine (Vitis vinifera L.) genotyping. Any previous problems derived from microsatellite allele confusion between labs or the need to run reference cultivars to identify allele sizes disappear using this type of marker. Furthermore, because SNP

  4. Kernel machine SNP-set analysis for censored survival outcomes in genome-wide association studies.

    PubMed

    Lin, Xinyi; Cai, Tianxi; Wu, Michael C; Zhou, Qian; Liu, Geoffrey; Christiani, David C; Lin, Xihong

    2011-11-01

    In this article, we develop a powerful test for identifying single nucleotide polymorphism (SNP)-sets that are predictive of survival with data from genome-wide association studies. We first group typed SNPs into SNP-sets based on genomic features and then apply a score test to assess the overall effect of each SNP-set on the survival outcome through a kernel machine Cox regression framework. This approach uses genetic information from all SNPs in the SNP-set simultaneously and accounts for linkage disequilibrium (LD), leading to a powerful test with reduced degrees of freedom when the typed SNPs are in LD with each other. This type of test also has the advantage of capturing the potentially nonlinear effects of the SNPs, SNP-SNP interactions (epistasis), and the joint effects of multiple causal variants. By simulating SNP data based on the LD structure of real genes from the HapMap project, we demonstrate that our proposed test is more powerful than the standard single SNP minimum P-value-based test for association studies with censored survival outcomes. We illustrate the proposed test with a real data application. © 2011 Wiley Periodicals, Inc.

  5. Weighted SNP set analysis in genome-wide association study.

    PubMed

    Dai, Hui; Zhao, Yang; Qian, Cheng; Cai, Min; Zhang, Ruyang; Chu, Minjie; Dai, Juncheng; Hu, Zhibin; Shen, Hongbing; Chen, Feng

    2013-01-01

    Genome-wide association studies (GWAS) are popular for identifying genetic variants which are associated with disease risk. Many approaches have been proposed to test multiple single nucleotide polymorphisms (SNPs) in a region simultaneously, to address the disadvantages of single-locus association analysis. Kernel machine based SNP set analysis is more powerful than single locus analysis, which borrows information from SNPs correlated with causal or tag SNPs. Four types of kernel machine functions and principal component based approach (PCA) were also compared. However, given the loss of power caused by low minor allele frequencies (MAF), we conducted an extension work on PCA and used a new method called weighted PCA (wPCA). Comparative analysis was performed for weighted principal component analysis (wPCA), logistic kernel machine based test (LKM) and principal component analysis (PCA) based on SNP set in the case of different minor allele frequencies (MAF) and linkage disequilibrium (LD) structures. We also applied the three methods to analyze two SNP sets extracted from a real GWAS dataset of non-small cell lung cancer in Han Chinese population. Simulation results show that when the MAF of the causal SNP is low, weighted principal component and weighted IBS are more powerful than PCA and other kernel machine functions at different LD structures and different numbers of causal SNPs. Application of the three methods to a real GWAS dataset indicates that wPCA and wIBS have better performance than the linear kernel, IBS kernel and PCA.
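
    For readers unfamiliar with the IBS (identity-by-state) kernel named above, the following sketch shows one common way to compute it for two individuals whose genotypes are coded as minor-allele counts (0, 1, 2). The function name, the normalization by the sum of weights, and the example weights are assumptions made for illustration; the abstract does not give the exact weighting scheme used for wIBS.

```python
def ibs_kernel(g1, g2, weights=None):
    """Similarity in [0, 1] between two genotype vectors coded 0/1/2.

    Each SNP contributes the number of alleles shared identical by state,
    2 - |g1 - g2|. Optional per-SNP weights give a weighted variant
    (the normalization used here is an assumption).
    """
    if len(g1) != len(g2):
        raise ValueError("genotype vectors must have the same length")
    if weights is None:
        weights = [1.0] * len(g1)  # unweighted IBS
    shared = sum(w * (2 - abs(a - b)) for w, a, b in zip(weights, g1, g2))
    return shared / (2 * sum(weights))


print(ibs_kernel([0, 1, 2], [0, 1, 2]))        # 1.0 (identical genotypes)
print(ibs_kernel([0, 0], [2, 2]))              # 0.0 (no alleles shared)
print(ibs_kernel([0, 0], [2, 0], [3.0, 1.0]))  # 0.25 (mismatching SNP up-weighted)
```

    Up-weighting a SNP makes a mismatch at that SNP count more heavily, which is the general motivation for weighting rare variants in a SNP set.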

  6. SNP genotyping using single-tube fluorescent bidirectional PCR.

    PubMed

    Waterfall, Christy M; Cobb, Benjamin D

    2002-07-01

    SNP genotyping is a well-populated field with a large number of assay formats offering accurate allelic discrimination. However, there remains a discord between the ultimate goal of rapid, inexpensive assays that do not require complex design considerations and involved optimization strategies. We describe the first integration of bidirectional allele-specific amplification, SYBR Green I, and rapid-cycle PCR to provide a homogeneous SNP-typing assay. Wild-type, mutant, and heterozygous alleles were easily discriminated in a single tube using melt curve profiling of PCR products alone. We demonstrate the effectiveness and reliability of this assay with a blinded trial using clinical samples from individuals with sickle cell anemia, sickle cell trait, or unaffected individuals. The tests were completed in less than 30 min without expensive fluorogenic probes, prohibiting design rules, or lengthy downstream processing for product analysis.

  7. Detecting Susceptibility to Breast Cancer with SNP-SNP Interaction Using BPSOHS and Emotional Neural Networks

    PubMed Central

    Wang, Xiao; Fan, Yue

    2016-01-01

    Studies for the association between diseases and informative single nucleotide polymorphisms (SNPs) have received great attention. However, most of them just use the whole set of useful SNPs and fail to consider the SNP-SNP interactions, while these interactions have already been proven in biology experiments. In this paper, we use a binary particle swarm optimization with hierarchical structure (BPSOHS) algorithm to improve the effectiveness of PSO for the identification of the SNP-SNP interactions. Furthermore, in order to use these SNP interactions in the susceptibility analysis, we propose an emotional neural network (ENN) to treat SNP interactions as emotional tendency. Different from the normal architecture, just as the emotional brain, this architecture provides a specific path to treat the emotional value, by which the SNP interactions can be considered more quickly and directly. The ENN helps us use the prior knowledge about the SNP interactions and other influence factors together. Finally, the experimental results prove that the proposed BPSOHS_ENN algorithm can detect the informative SNP-SNP interaction and predict the breast cancer risk with a much higher accuracy than existing methods. PMID:27294121

  8. SNP Cutter: a comprehensive tool for SNP PCR–RFLP assay design

    PubMed Central

    Zhang, Ruifang; Zhu, Zanhua; Zhu, Hongming; Nguyen, Tu; Yao, Fengxia; Xia, Kun; Liang, Desheng; Liu, Chunyu

    2005-01-01

    The polymerase chain reaction–restriction fragment length polymorphism (PCR–RFLP) is a relatively simple and inexpensive method for genotyping single nucleotide polymorphisms (SNPs). It requires minimal investment in instrumentation. Here, we describe a web application, ‘SNP Cutter,’ which designs PCR–RFLP assays on a batch of SNPs from the human genome. NCBI dbSNP rs IDs or formatted SNPs are submitted into the SNP Cutter which then uses restriction enzymes from a pre-selected list to perform enzyme selection. The program is capable of designing primers for either natural PCR–RFLP or mismatch PCR–RFLP, depending on the SNP sequence data. SNP Cutter generates the information needed to evaluate and perform genotyping experiments, including a PCR primers list, sizes of original amplicons and the different allelic fragments after enzyme digestion. Some output data is tab-delimited, therefore suitable for database archiving. The SNP Cutter is available at . PMID:15980518
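
    The enzyme-selection step of a natural PCR–RFLP design can be sketched as follows. This is not SNP Cutter's code: it only assumes a short hypothetical enzyme list (the three recognition sites are the standard ones for these enzymes) and reports enzymes whose site appears with exactly one allele of the SNP. The function name and the example flanking sequences are made up for illustration.

```python
# Recognition sites of three widely used restriction enzymes. All three are
# palindromic, so scanning one strand is sufficient.
ENZYMES = {"EcoRI": "GAATTC", "BamHI": "GGATCC", "HindIII": "AAGCTT"}


def discriminating_enzymes(left_flank, alleles, right_flank, enzymes=ENZYMES):
    """Return {enzyme: allele_that_is_cut} for enzymes that cut only one allele.

    An enzyme is useful for natural PCR-RFLP when its recognition site is
    present in the amplicon carrying one allele but absent from the other.
    """
    result = {}
    for name, site in enzymes.items():
        cut = [a for a in alleles if site in left_flank + a + right_flank]
        if len(cut) == 1:
            result[name] = cut[0]
    return result


# Hypothetical T/C SNP: the T allele completes an EcoRI site (GAAT-T-C).
print(discriminating_enzymes("ACGTGAAT", ("T", "C"), "CAGGT"))  # {'EcoRI': 'T'}
```

    When no enzyme in the list discriminates the alleles, a mismatch PCR–RFLP design (introducing a site through a modified primer) is the alternative described in the abstract; that step is not covered by this sketch.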

  9. The association between MEFV gene polymorphisms and Henoch-Schönlein purpura, and additional SNP-SNP interactions in Chinese Han children.

    PubMed

    Xiong, Shunjun; Xiong, Ying; Huang, Qian; Wang, Jierong; Zhang, Xiaofang

    2017-03-01

    The aim of this study was to investigate the association between single-nucleotide polymorphisms (SNPs) within the MEFV gene and Henoch-Schönlein purpura (HSP) risk, and the impact of SNP-SNP interaction on HSP risk in Chinese children. A total of 662 subjects with a mean age of 7.9 ± 2.4 years old were selected, including 320 HSP patients and 342 normal controls. Logistic regression was performed to investigate the association between SNPs and HSP risk, and generalized multifactor dimensionality reduction (GMDR) was used to analyze the SNP-SNP interaction. Logistic analysis showed a significant association between genotypes of variants in rs3743930 and increased HSP risk. Carriers of the homozygous mutant genotype of rs3743930 had a higher HSP risk than wild-type homozygotes; the OR (95% CI) was 1.55 (1.23-1.85). GMDR analysis suggested a significant two-locus model (p = 0.0107) involving rs3743930 and rs28940580, indicating a potential SNP-SNP interaction between rs3743930 and rs28940580. Overall, the two-locus models had a cross-validation consistency of 10 of 10 and a testing accuracy of 60.72%. Subjects with the rs3743930-GC or CC and rs28940580-GA or AA genotypes had the highest HSP risk compared with subjects with the rs3743930-GG and rs28940580-GG genotypes; the OR (95% CI) was 2.13 (1.52-2.89). The variants in rs3743930 and the interaction between rs3743930 and rs28940580 were associated with increased HSP risk in Chinese children.

  10. [Effect of the Gly972Arg, SNP43 and Pro12Ala polymorphisms of the genes IRS1, CAPN10 and PPARG2 on secondary failure to sulphonylurea and metformin in patients with type 2 diabetes in Yucatán, México].

    PubMed

    García-Escalante, María Guadalupe; Suárez-Solís, Víctor Manuel; López-Avila, María Teresa de Jesús; Pinto-Escalante, Doris del Carmen; Laviada-Molina, Hugo

    2009-03-01

    In Yucatán, 52% of patients with type 2 diabetes (DT2) present secondary failure to treatment associated with sulphonylurea and metformin. A possible explanation is polymorphisms in the genes IRS1, CAPN10, and PPARG2, which are involved in pancreatic beta cell dysfunction and a poor response to the action of insulin. The association of the polymorphisms Gly972Arg, SNP43, and Pro12Ala, of the genes IRS1, CAPN10, and PPARG2, with the risk of failure of sulphonylurea and metformin therapies was determined in patients with DT2 in Yucatán, México. One hundred and thirty-two subjects with DT2 were classified into groups of responders (HbA1c < 8%) and non-responders (HbA1c > 8%) to the treatment, according to the control of hyperglycemia with sulphonylurea and metformin. Demographic, anthropometric and metabolic data were obtained from each subject. The polymorphisms were identified by means of DNA analysis by PCR/RFLP and PCR/OAL. Genotypic and allelic frequencies and the Hardy-Weinberg equilibrium were determined. Statistical analyses consisted of χ² and multiple logistic regression tests (Epi-Info 2000 and SPSS version 12). Obese subjects carrying the AA genotype of SNP43 showed a 4.69 times higher risk of failure to respond to treatment (p = 0.027) when compared with subjects carrying the GA genotype: χ² (OR = 4.69, CI: 1.15-20.59) and multiple logistic regression, p = 0.048 (OR = 3.72, CI: 1.009-13.718). The interaction between genotype AA and BMI > 27 also showed a significant difference (p = 0.009). The findings suggest that the SNP43 polymorphism may influence the response to treatment with sulphonylurea and metformin, the expression being dependent on obesity.
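
    The Hardy-Weinberg equilibrium check mentioned in this abstract is commonly done with a chi-square goodness-of-fit statistic on the three genotype counts. The sketch below shows that calculation under the usual biallelic assumption; the function name and the genotype counts are hypothetical and are not data from the study.

```python
def hardy_weinberg_chi2(n_aa, n_ab, n_bb):
    """Chi-square statistic comparing observed genotype counts with the
    counts expected under Hardy-Weinberg proportions (p^2, 2pq, q^2)."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1 - p                        # frequency of allele B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)


# Hypothetical counts that match Hardy-Weinberg proportions exactly.
print(hardy_weinberg_chi2(25, 50, 25))  # 0.0
# Hypothetical counts with no heterozygotes at all: a strong departure.
print(hardy_weinberg_chi2(50, 0, 50))   # 100.0
```

    With one degree of freedom, a statistic above 3.84 corresponds to p < 0.05, so the second example would be flagged as a departure from equilibrium.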

  11. Medicare Special Needs Plan (SNP)

    MedlinePlus

    ... change plans Types of Medicare health plans Medicare Advantage Plans + Share widget - Select to show Subcategories Getting ... Types of Medicare health plans , current subcategory Medicare Advantage Plans , current page Medicare Medical Savings Account (MSA) ...

  12. Disease-driven detection of differential inherited SNP modules from SNP network.

    PubMed

    Li, Chuanxing; Li, Yongsheng; Xu, Juan; Lv, Junying; Ma, Ye; Shao, Tingting; Gong, Binsheng; Tan, Renjie; Xiao, Yun; Li, Xia

    2011-12-10

    Detection of the synergetic effects between variants, such as single-nucleotide polymorphisms (SNPs), is crucial for understanding the genetic characters of complex diseases. Here, we proposed a two-step approach to detect differentially inherited SNP modules (synergetic SNP units) from a SNP network. First, SNP-SNP interactions are identified based on prior biological knowledge, such as their adjacency on the chromosome or degree of relatedness between the functional relationships of their genes. These interactions form SNP networks. Second, disease-risk SNP modules (or sub-networks) are prioritised by their differentially inherited properties in IBD (Identity by Descent) profiles of affected and unaffected sibpairs. The search process is driven by the disease information and follows the structure of a SNP network. Simulation studies have indicated that this approach achieves high accuracy and a low false-positive rate in the identification of known disease-susceptible SNPs. Applying this method to an alcoholism dataset, we found that flexible patterns of susceptible SNP combinations do play a role in complex diseases, and some known genes were detected through these risk SNP modules. One example is GRM7, a known alcoholism gene successfully detected by a SNP module comprised of two SNPs, but neither of the two SNPs was significantly associated with the disease in single-locus analysis. These identified genes are also enriched in some pathways associated with alcoholism, including the calcium signalling pathway, axon guidance and neuroactive ligand-receptor interaction. The integration of network biology and genetic analysis provides putative functional bridges between genetic variants and candidate genes or pathways, thereby providing new insight into the aetiology of complex diseases. Copyright © 2011 Elsevier B.V. All rights reserved.

  13. Gene-Environment Interaction in the Etiology of Mathematical Ability Using SNP Sets

    PubMed Central

    Kovas, Yulia; Plomin, Robert

    2010-01-01

    Mathematics ability and disability is as heritable as other cognitive abilities and disabilities, however its genetic etiology has received relatively little attention. In our recent genome-wide association study of mathematical ability in 10-year-old children, 10 SNP associations were nominated from scans of pooled DNA and validated in an individually genotyped sample. In this paper, we use a ‘SNP set’ composite of these 10 SNPs to investigate gene-environment (GE) interaction, examining whether the association between the 10-SNP set and mathematical ability differs as a function of ten environmental measures in the home and school in a sample of 1888 children with complete data. We found two significant GE interactions for environmental measures in the home and the school both in the direction of the diathesis-stress type of GE interaction: The 10-SNP set was more strongly associated with mathematical ability in chaotic homes and when parents are negative. PMID:20978832

  14. Genome-wide SNP detection, validation, and development of an 8K SNP array for apple

    USDA-ARS's Scientific Manuscript database

    As high-throughput genetic marker screening systems are essential for a range of genetics studies and plant breeding applications, the International RosBREED SNP Consortium (IRSC) has utilized the Illumina Infinium® II system to develop a medium- to high-throughput SNP screening tool for genome-wide...

  15. SNPMeta: SNP annotation and SNP metadata collection without a reference genome

    USDA-ARS's Scientific Manuscript database

    The increase in availability of resequencing data is greatly accelerating SNP discovery and has facilitated the development of SNP genotyping assays. This, in turn, is increasing interest in annotation of individual SNPs. Currently, these data are only available through curation, or comparison to a ...

  16. Universal SNP genotyping assay with fluorescence polarization detection.

    PubMed

    Hsu, T M; Chen, X; Duan, S; Miller, R D; Kwok, P Y

    2001-09-01

    The degree of fluorescence polarization (FP) of a fluorescent molecule is a reflection of its molecular weight (Mr). FP is therefore a useful detection method for homogeneous assays in which the starting reagents and products differ significantly in Mr. We have previously shown that FP is a good detection method for the single-base extension and the 5'-nuclease assays. In this report, we describe a universal, optimized single-base extension assay for genotyping single nucleotide polymorphisms (SNPs). This assay, which we named the template-directed dye-terminator incorporation assay with fluorescence polarization detection (FP-TDI), uses four spectrally distinct dye terminators to achieve universal assay conditions. Even without optimization, approximately 70% of all SNP markers tested yielded robust assays. The addition of an E. coli ssDNA-binding protein just before the FP reading significantly increased FP values of the products and brought the success rate of FP-TDI assays up to 90%. Increasing the amount of dye terminators and reducing the number of thermal cycles in the single-base extension step of the assay increased the separation of the FP values between the products corresponding to different genotypes and improved the success rate of the assay to 100%. In this study the genomic DNA samples of 90 individuals were typed for a total of 38 FP-TDI assays (using both the sense and antisense TDI primers for 19 SNP markers). With the previously described modifications, the FP-TDI assay gave unambiguous genotyping data for all the samples tested in the 38 FP-TDI assays. When the genotypes determined by the FP-TDI and 5'-nuclease assays were compared, they were in 100% concordance for all experiments (a total of 3420 genotypes). The four-dye-terminator master mixture described here can be used for assaying any SNP marker and greatly simplifies the SNP genotyping assay design.

  17. Characterization of the Streptomyces sp. Strain C5 snp Locus and Development of snp-Derived Expression Vectors

    PubMed Central

    DeSanti, Charles L.; Strohl, William R.

    2003-01-01

    The Streptomyces sp. strain C5 snp locus is comprised of two divergently oriented genes: snpA, a metalloproteinase gene, and snpR, which encodes a LysR-like activator of snpA transcription. The transcriptional start point of snpR is immediately downstream of a strong T-N11-A inverted repeat motif likely to be the SnpR binding site, while the snpA transcriptional start site overlaps the ATG start codon, generating a leaderless snpA transcript. By using the aphII reporter gene of pIJ486 as a reporter, the plasmid-borne snpR-activated snpA promoter was ca. 60-fold more active than either the nonactivated snpA promoter or the melC1 promoter of pIJ702. The snpR-activated snpA promoter produced reporter protein levels comparable to those of the up-mutated ermE∗ promoter. The SnpR-activated snpA promoter was built into a set of transcriptional and translational fusion expression vectors which have been used for the intracellular expression of numerous daunomycin biosynthesis pathway genes from Streptomyces sp. strain C5 as well as the expression and secretion of soluble recombinant human endostatin. PMID:12620855

  18. Chaotic particle swarm optimization for detecting SNP-SNP interactions for CXCL12-related genes in breast cancer prevention.

    PubMed

    Chuang, Li-Yeh; Chang, Hsueh-Wei; Lin, Ming-Cheng; Yang, Cheng-Hong

    2012-07-01

    Genome-wide association studies have revealed that many single nucleotide polymorphisms (SNPs) are associated with breast cancer, and yet the potential SNP-SNP interactions have not been well addressed to date. This study aims to develop a methodology for the selection of SNP-genotype combinations with a maximum difference between case and control groups. We propose a new chaotic particle swarm optimization (CPSO) algorithm that identifies the best SNP combinations for breast cancer association studies containing seven SNPs. Five scoring functions, that is, the percentage correct, sensitivity/specificity, positive predictive value/negative predictive value, risk ratio, and odds ratio, are provided for evaluating SNP interactions in different SNP combinations. The CPSO algorithm identified the best SNP combinations associated with breast cancer protection. Some SNP interactions in specific SNPs and their corresponding genotypes were revealed. These SNP combinations showed a significant association with breast cancer protection (P<0.05). The sensitivity and specificity of the respective best SNP combinations were all higher than 90%. In contrast to the corresponding non-SNP-SNP interaction combinations, the estimated odds ratio and risk ratio of the SNP-SNP interaction in SNP combinations for breast cancer were less than 100%. This suggests that CPSO can successfully identify the best SNP combinations for breast cancer protection. In conclusion, we focus on developing a methodology for the selection of SNP-genotype combinations with a maximum difference between case and control groups. The CPSO method can effectively identify SNP-SNP interactions in complex biological relationships underlying the progression of breast cancer.
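
    The five scoring functions listed in this abstract can all be derived from a 2×2 table that cross-classifies subjects by whether they carry a given SNP-genotype combination and whether they are cases or controls. The sketch below uses the standard textbook definitions; the function name, the dictionary keys, and the counts are hypothetical, and the paper's exact formulas may differ.

```python
def score_combination(a, b, c, d):
    """Score a SNP-genotype combination from a 2x2 table.

    a: cases carrying the combination      b: controls carrying it
    c: cases not carrying the combination  d: controls not carrying it
    """
    return {
        "percent_correct": (a + d) / (a + b + c + d),
        "sensitivity": a / (a + c),
        "specificity": d / (b + d),
        "ppv": a / (a + b),
        "npv": d / (c + d),
        "risk_ratio": (a / (a + b)) / (c / (c + d)),
        "odds_ratio": (a * d) / (b * c),
    }


# Hypothetical counts, for illustration only.
scores = score_combination(a=40, b=10, c=20, d=30)
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

    An odds ratio or risk ratio below 1 indicates that carriers of the combination are less often cases, which is how a combination would be read as protective.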

  19. Rapid Identification of Ginseng Cultivars (Panax ginseng Meyer) Using Novel SNP-Based Probes

    PubMed Central

    Jo, Ick-Hyun; Bang, Kyong Hwan; Kim, Young-Chang; Lee, Jei-Wan; Seo, A-Yeon; Seong, Bong-Jae; Kim, Hyun-Ho; Kim, Dong-Hwi; Cha, Seon-Woo; Cho, Yong-Gu; Kim, Hong-Sig

    2011-01-01

    In order to develop a novel system for the discrimination of five ginseng cultivars (Panax ginseng Meyer), single nucleotide polymorphism (SNP) genotyping assays with real-time polymerase chain reaction were conducted. Nucleotide substitution in gDNA library clones of P. ginseng cv. Yunpoong was targeted for the SNP genotyping assay. From these SNP sites, a set of modified SNP specific fluorescence probes (PGP74, PGP110, and PGP130) and novel primer sets have been developed to distinguish among five ginseng cultivars. The combination of the SNP type of the five cultivars, Chungpoong, Yunpoong, Gopoong, Kumpoong, and Sunpoong, was identified as ‘ATA’, ‘GCC’, ‘GTA’, ‘GCA’, and ‘ACC’, respectively. This study represents the first report of the identification of ginseng cultivars by fluorescence probes. An SNP genotyping assay using fluorescence probes could prove useful for the identification of ginseng cultivars and ginseng seed management systems and guarantee the purity of ginseng seed. PMID:23717098
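
    The three-letter SNP types reported above amount to a lookup table from probe calls to cultivar. A minimal sketch of that lookup follows; the mapping itself is taken from the abstract, while the assumption that the letters are ordered PGP74, PGP110, PGP130, as well as the function and variable names, are illustrative.

```python
# Order of probes in the three-letter code (assumed to follow the order
# in which the probes are listed in the abstract).
PROBES = ("PGP74", "PGP110", "PGP130")

# SNP type of each cultivar as reported in the abstract.
CULTIVAR_BY_CODE = {
    "ATA": "Chungpoong",
    "GCC": "Yunpoong",
    "GTA": "Gopoong",
    "GCA": "Kumpoong",
    "ACC": "Sunpoong",
}


def identify_cultivar(calls):
    """Map a dict of probe -> base call to a cultivar name, or None if unknown."""
    code = "".join(calls[probe] for probe in PROBES)
    return CULTIVAR_BY_CODE.get(code)


print(identify_cultivar({"PGP74": "G", "PGP110": "C", "PGP130": "A"}))  # Kumpoong
print(identify_cultivar({"PGP74": "T", "PGP110": "T", "PGP130": "T"}))  # None
```

    Because the five codes are pairwise distinct, three biallelic probes are enough to separate all five cultivars; a sample returning None would need re-typing or belongs to a cultivar outside this set.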

  20. SNP-SNP Interaction Analysis on Soybean Oil Content under Multi-Environments

    PubMed Central

    Yin, Zhengong; Leng, Yue; Yu, Hongxiao; Jia, Huiying; Jiang, Shanshan; Ni, Zhongqiu; Jiang, Hongwei; Han, Xue; Liu, Chunyan; Hu, Zhenbang; Wu, Xiaoxia; Hu, Guohua; Xin, Dawei; Qi, Zhaoming

    2016-01-01

    Soybean oil content is one of the main quality traits. In this study, we used the multifactor dimensionality reduction (MDR) method and a soybean high-density genetic map including 5,308 markers to identify stable single nucleotide polymorphism (SNP)-SNP interactions controlling oil content in soybean across 23 environments. In total, 36,442,756 SNP-SNP interaction pairs were detected, of which 1865 pairs associated with soybean oil content were identified under multiple environments using the Bonferroni correction with p < 3.55 × 10^-11. Two and 1863 SNP-SNP interaction pairs were detected stably across 12 and 11 environments, respectively, which account for around 50% of the total environments. Epistasis values and contribution rates of stable interaction pairs (SNP interaction pairs detected in more than 2 environments) were determined by two-way ANOVA; for the available interaction pairs they ranged from 0.01 to 0.89 and from 0.01 to 0.85, respectively. For some interaction pairs, one member had been identified in previous research as a major QTL without epistasis effects. The results of this study provide insights into the genetic architecture of soybean oil content and can serve as a basis for marker-assisted selection breeding. PMID:27668866

  1. SNP Array in Hematopoietic Neoplasms: A Review

    PubMed Central

    Song, Jinming; Shao, Haipeng

    2015-01-01

    Cytogenetic analysis is essential for the diagnosis and prognosis of hematopoietic neoplasms in current clinical practice. Many hematopoietic malignancies are characterized by structural chromosomal abnormalities such as specific translocations, inversions, deletions and/or numerical abnormalities that can be identified by karyotype analysis or fluorescence in situ hybridization (FISH) studies. Single nucleotide polymorphism (SNP) arrays offer high-resolution identification of copy number variants (CNVs) and acquired copy-neutral loss of heterozygosity (LOH)/uniparental disomy (UPD) that are usually not identifiable by conventional cytogenetic analysis and FISH studies. As a result, SNP arrays have been increasingly applied to hematopoietic neoplasms to search for clinically-significant genetic abnormalities. A large numbers of CNVs and UPDs have been identified in a variety of hematopoietic neoplasms. CNVs detected by SNP array in some hematopoietic neoplasms are of prognostic significance. A few specific genes in the affected regions have been implicated in the pathogenesis and may be the targets for specific therapeutic agents in the future. In this review, we summarize the current findings of application of SNP arrays in a variety of hematopoietic malignancies with an emphasis on the clinically significant genetic variants. PMID:27600067

  2. Cardiovascular pharmacogenetics in the SNP era.

    PubMed

    Mooser, V; Waterworth, D M; Isenhour, T; Middleton, L

    2003-07-01

    In the past, pharmacological agents have contributed to a significant reduction in age-adjusted incidence of cardiovascular events. However, not all patients treated with these agents respond favorably, and some individuals may develop side-effects. With aging of the population and the growing prevalence of cardiovascular risk factors worldwide, it is expected that the demand for cardiovascular drugs will increase in the future. Accordingly, there is a growing need to identify the 'good' responders as well as the persons at risk for developing adverse events. Evidence is accumulating to indicate that responses to drugs are at least partly under genetic control. As such, pharmacogenetics - the study of variability in drug responses attributed to hereditary factors in different populations - may significantly assist in providing answers toward meeting this challenge. Pharmacogenetics mostly relies on associations between a specific genetic marker like single nucleotide polymorphisms (SNPs), either alone or arranged in a specific linear order on a certain chromosomal region (haplotypes), and a particular response to drugs. Numerous associations have been reported between selected genotypes and specific responses to cardiovascular drugs. Recently, for instance, associations have been reported between specific alleles of the apoE gene and the lipid-lowering response to statins, or the lipid-elevating effect of isotretinoin. Thus far, these types of studies have been mostly limited to a priori selected candidate genes due to restricted genotyping and analytical capacities. Thanks to the large number of SNPs now available in the public domain through the SNP Consortium and the newly developed technologies (high throughput genotyping, bioinformatics software), it is now possible to interrogate more than 200,000 SNPs distributed over the entire human genome. One pharmacogenetic study using this approach has been launched by GlaxoSmithKline to identify the approximately 4% of

  3. Analysis of SNP-SNP interactions and bone quantitative ultrasound parameter in early adulthood.

    PubMed

    Correa-Rodríguez, María; Viatte, Sebastien; Massey, Jonathan; Schmidt-RioValle, Jacqueline; Rueda-Medina, Blanca; Orozco, Gisela

    2017-10-03

    Individual susceptibility to osteoporosis is determined by the interaction of multiple genetic variants and environmental factors. The aim of this study was to conduct SNP-SNP interaction analyses in candidate genes influencing the heel quantitative ultrasound (QUS) parameter in early adulthood to gain novel insights into the mechanism of disease. The study population included 575 healthy subjects (mean age 20.41; SD 2.36). To assess bone mass, QUS was performed to determine broadband ultrasound attenuation (BUA, dB/MHz). A total of 32 SNPs mapping to loci that have been characterized as genetic markers for QUS and/or BMD parameters were selected as genetic markers in this study. The association of all possible SNP pairs with QUS was assessed by linear regression, and a SNP-SNP interaction was defined as a significant departure from additive effects. The pairwise SNP-SNP analysis showed multiple interactions. The interaction comprising SNPs rs9340799 and rs3736228, which map to the ESR1 and LRP5 genes respectively, showed the lowest p-value after adjusting for confounding factors (p = 0.001, β (95% CI) = 14.289 (5.548, 23.029)). In addition, our model reported others such as TMEM135-WNT16 (p = 0.007, β (95% CI) = 9.101 (2.498, 15.704)), ESR1-DKK1 (p = 0.012, β (95% CI) = 13.641 (2.959, 24.322)) and OPG-LRP5 (p = 0.012, β (95% CI) = 8.724 (1.936, 15.512)). However, none of the detected interactions remained significant at the Bonferroni significance threshold for multiple testing (p<0.0001). Our analysis of SNP-SNP interactions in candidate genes of QUS in Caucasian young adults reveals several interactions, especially between the ESR1 and LRP5 genes, that did not reach statistical significance. Although our results do not support a relevant genetic contribution of SNP-SNP epistatic interactions to QUS in young adults, further studies in larger independent populations would be necessary to support these preliminary findings.
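
The Bonferroni threshold quoted in this abstract can be checked with a short calculation. The sketch below assumes a family-wise error rate of 0.05 (the abstract does not state it) and one test per unordered SNP pair. Under those assumptions, 32 SNPs give 496 pairs and a per-test threshold close to the reported p<0.0001, which none of the four reported interactions meets.

```python
from math import comb

n_snps = 32                  # SNPs analysed in the study
n_pairs = comb(n_snps, 2)    # one test per unordered SNP pair
alpha = 0.05                 # ASSUMED family-wise error rate; not stated in the abstract
threshold = alpha / n_pairs  # Bonferroni-corrected per-test threshold

# p-values reported in the abstract for the four strongest interactions
reported = {
    "ESR1-LRP5": 0.001,
    "TMEM135-WNT16": 0.007,
    "ESR1-DKK1": 0.012,
    "OPG-LRP5": 0.012,
}
surviving = [name for name, p in reported.items() if p < threshold]

print(n_pairs, f"{threshold:.6f}", surviving)
```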

  4. An Improved Opposition-Based Learning Particle Swarm Optimization for the Detection of SNP-SNP Interactions.

    PubMed

    Shang, Junliang; Sun, Yan; Li, Shengjun; Liu, Jin-Xing; Zheng, Chun-Hou; Zhang, Junying

    2015-01-01

    SNP-SNP interactions have been receiving increasing attention in understanding the mechanism underlying susceptibility to complex diseases. Though many works have been done for the detection of SNP-SNP interactions, the algorithmic development is still ongoing. In this study, an improved opposition-based learning particle swarm optimization (IOBLPSO) is proposed for the detection of SNP-SNP interactions. Highlights of IOBLPSO are the introduction of three strategies, namely, opposition-based learning, dynamic inertia weight, and a postprocedure. Opposition-based learning not only enhances the global explorative ability, but also avoids premature convergence. Dynamic inertia weight allows particles to cover a wider search space when the considered SNP is likely to be a random one and converges on promising regions of the search space while capturing a highly suspected SNP. The postprocedure is used to carry out a deep search in highly suspected SNP sets. Experiments of IOBLPSO are performed on both simulation data sets and a real data set of age-related macular degeneration, results of which demonstrate that IOBLPSO is promising in detecting SNP-SNP interactions. IOBLPSO might be an alternative to existing methods for detecting SNP-SNP interactions.
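
As a rough illustration of the opposition-based learning idea mentioned in this abstract, the sketch below uses the generic definition of an opposite point (`low + high - x`) to initialise a swarm of candidate SNP sets. The abstract does not give the update rules of IOBLPSO, so this is not the authors' algorithm: `obl_init`, `toy_fitness` and the `CAUSAL` set are hypothetical names introduced only for the example.

```python
import random

def opposite(position, low, high):
    """Mirror each coordinate within [low, high]: x -> low + high - x."""
    return [low + high - x for x in position]

def obl_init(n_particles, k, n_snps, fitness, rng):
    """Build a swarm, keeping the fitter of each random particle and its opposite."""
    swarm = []
    for _ in range(n_particles):
        p = rng.sample(range(n_snps), k)    # k distinct SNP indices
        q = opposite(p, 0, n_snps - 1)      # mirrored indices, still distinct
        swarm.append(max(p, q, key=fitness))
    return swarm

# Hypothetical toy fitness: number of "causal" SNP indices a particle contains.
CAUSAL = {95, 99}

def toy_fitness(particle):
    return len(CAUSAL & set(particle))

swarm = obl_init(n_particles=5, k=2, n_snps=100,
                 fitness=toy_fitness, rng=random.Random(0))
```

Evaluating the mirrored candidate alongside each random one costs one extra fitness call per particle and lets the swarm sample both ends of the index range from the start.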

  5. snpGeneSets: An R Package for Genome-Wide Study Annotation.

    PubMed

    Mei, Hao; Li, Lianna; Jiang, Fan; Simino, Jeannette; Griswold, Michael; Mosley, Thomas; Liu, Shijian

    2016-12-07

    Genome-wide studies (GWS) of SNP associations and differential gene expressions have generated abundant results; next-generation sequencing technology has further boosted the number of variants and genes identified. Effective interpretation requires massive annotation and downstream analysis of these genome-wide results, a computationally challenging task. We developed the snpGeneSets package to simplify annotation and analysis of GWS results. Our package integrates local copies of knowledge bases for SNPs, genes, and gene sets, and implements wrapper functions in the R language to enable transparent access to low-level databases for efficient annotation of large genomic data. The package contains functions that execute three types of annotations: (1) genomic mapping annotation for SNPs and genes and functional annotation for gene sets; (2) bidirectional mapping between SNPs and genes, and genes and gene sets; and (3) calculation of gene effect measures from SNP associations and performance of gene set enrichment analyses to identify functional pathways. We applied snpGeneSets to type 2 diabetes (T2D) results from the NHGRI genome-wide association study (GWAS) catalog, a Finnish GWAS, and a genome-wide expression study (GWES). These studies demonstrate the usefulness of snpGeneSets for annotating and performing enrichment analysis of GWS results. The package is open-source, free, and can be downloaded at: https://www.umc.edu/biostats_software/.

  6. snpGeneSets: An R Package for Genome-Wide Study Annotation

    PubMed Central

    Mei, Hao; Li, Lianna; Jiang, Fan; Simino, Jeannette; Griswold, Michael; Mosley, Thomas; Liu, Shijian

    2016-01-01

    Genome-wide studies (GWS) of SNP associations and differential gene expressions have generated abundant results; next-generation sequencing technology has further boosted the number of variants and genes identified. Effective interpretation requires massive annotation and downstream analysis of these genome-wide results, a computationally challenging task. We developed the snpGeneSets package to simplify annotation and analysis of GWS results. Our package integrates local copies of knowledge bases for SNPs, genes, and gene sets, and implements wrapper functions in the R language to enable transparent access to low-level databases for efficient annotation of large genomic data. The package contains functions that execute three types of annotations: (1) genomic mapping annotation for SNPs and genes and functional annotation for gene sets; (2) bidirectional mapping between SNPs and genes, and genes and gene sets; and (3) calculation of gene effect measures from SNP associations and performance of gene set enrichment analyses to identify functional pathways. We applied snpGeneSets to type 2 diabetes (T2D) results from the NHGRI genome-wide association study (GWAS) catalog, a Finnish GWAS, and a genome-wide expression study (GWES). These studies demonstrate the usefulness of snpGeneSets for annotating and performing enrichment analysis of GWS results. The package is open-source, free, and can be downloaded at: https://www.umc.edu/biostats_software/. PMID:27807048

  7. Developing a new nonbinary SNP fluorescent multiplex detection system for forensic application in China.

    PubMed

    Liu, Yanfang; Liao, Huidan; Liu, Ying; Guo, Juanjuan; Sun, Yi; Fu, Xiaoliang; Xiao, Ding; Cai, Jifeng; Lan, Lingmei; Xie, Pingli; Zha, Lagabaiyila

    2017-02-06

    Nonbinary single-nucleotide polymorphisms (SNPs) are potential forensic genetic markers because their discrimination power is greater than that of normal binary SNPs and because they can be typed in highly degraded samples. We previously developed a nonbinary SNP multiplex typing assay. In this study, we selected an additional 20 nonbinary SNPs from the NCBI SNP database and verified them through pyrosequencing. These 20 nonbinary SNPs were analyzed using the fluorescent-labeled SNaPshot multiplex SNP typing method. The allele frequencies and genetic parameters of these 20 nonbinary SNPs were determined among 314 unrelated individuals from Han populations in China. The total power of discrimination was 0.9999999999994, and the cumulative probability of exclusion was 0.9986. Moreover, combining this 20 nonbinary SNP assay with the 20 nonbinary SNP assay we previously developed showed that the cumulative probability of exclusion of the 40 nonbinary SNPs was 0.999991 and that no significant linkage disequilibrium was observed among the 40 nonbinary SNPs. Thus, we concluded that this new system of 20 new nonbinary SNPs could provide highly informative polymorphic data for use in forensic applications and would serve as a potentially valuable supplement to forensic DNA analysis.
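
Cumulative figures such as those in this abstract are conventionally obtained by combining per-locus values as 1 minus the product of (1 - p_i). That treats the loci as independent, which is why the absence of linkage disequilibrium matters. The sketch below assumes this standard formula; the abstract does not say how the authors computed their totals, and the per-locus values are hypothetical, chosen only to show the arithmetic.

```python
from math import prod

def cumulative(values):
    """Combine per-locus probabilities p_i as 1 - prod(1 - p_i), assuming independent loci."""
    return 1.0 - prod(1.0 - p for p in values)

# HYPOTHETICAL per-locus probabilities of exclusion, for illustration only.
pe_per_locus = [0.30, 0.25, 0.35]
cpe = cumulative(pe_per_locus)

# Adding a further independent locus can only raise the cumulative value.
cpe_with_extra = cumulative(pe_per_locus + [0.20])

print(round(cpe, 5), round(cpe_with_extra, 5))
```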

  8. [Research progress on the phenotype informative SNP in forensic science].

    PubMed

    Liu, Yu-Xuan; Hu, Qing-Qing; Ma, Hong-Du; Huang, Dai-Xin

    2014-10-01

    Single nucleotide polymorphism (SNP) refers to the single base sequence variation in specific location of the human genome. Phenotype informative SNP has gradually become one of the research hot spots in forensic science. In this paper, the forensic research situation and application prospect of phenotype informative SNP in the characteristics of hair, eye and skin color, height, and facial feature are reviewed.

  9. Mycobacterium leprae in Colombia described by SNP7614 in gyrA, two minisatellites and geography

    PubMed Central

    Cardona-Castro, Nora; Beltrán-Alzate, Juan Camilo; Romero-Montoya, Irma Marcela; Li, Wei; Brennan, Patrick J; Vissa, Varalakshmi

    2013-01-01

    New cases of leprosy are still being detected in Colombia after the country declared achievement of the WHO defined ‘elimination’ status. To study the ecology of leprosy in endemic regions, a combination of geographic and molecular tools was applied to a group of 201 multibacillary patients, including six multi-case families, from eleven departments. The locations (latitude and longitude) of patient residences were mapped. Slit skin smears and/or skin biopsies were collected and DNA was extracted. Standard agarose gel electrophoresis following a multiplex PCR was developed for rapid and inexpensive strain typing of M. leprae based on copy numbers of two VNTR minisatellite loci, 27-5 and 12-5. A SNP (C/T) in gyrA (SNP7614) was mapped by introducing a novel PCR-RFLP into an ongoing drug resistance surveillance effort. Multiple genotypes were detected by combining the three molecular markers. The two most frequent genotypes in Colombia were SNP7614(C)/27-5(5)/12-5(4) [C54], predominantly distributed in the Atlantic departments, and SNP7614(T)/27-5(4)/12-5(5) [T45], associated with the Andean departments. A novel genotype, SNP7614(C)/27-5(6)/12-5(4) [C64], was detected in cities along the Magdalena river, which separates the Andean from the Atlantic departments; a subset was further characterized, showing association with a rare allele of minisatellite 23-3 and the SNP type 1 of M. leprae. The genotypes within intra-family cases were conserved. Overall, this is the first large scale study that utilized simple and rapid assay formats for identification of major strain types and their distribution in Colombia. It provides the framework for further strain type discrimination and geographic information systems as tools for tracing transmission of leprosy. PMID:23291420

  10. Etiological yield of SNP microarrays in idiopathic intellectual disability.

    PubMed

    Utine, G Eda; Haliloğlu, Göknur; Volkan-Salancı, Bilge; Çetinkaya, Arda; Kiper, Pelin Ö; Alanay, Yasemin; Aktaş, Dilek; Anlar, Banu; Topçu, Meral; Boduroğlu, Koray; Alikaşifoğlu, Mehmet

    2014-05-01

    Intellectual disability (ID) has a prevalence of 3% and is classified according to its severity. An underlying etiology cannot be determined in 75-80% of mild ID and in 20-50% of severe ID. Since it was shown that copy number variations involving short DNA segments may cause ID, genome-wide SNP microarrays have been used as a tool for detecting submicroscopic copy number changes and uniparental disomy. This study was performed to investigate the presence of copy number changes in patients with ID of unidentified etiology. The Affymetrix® 6.0 SNP microarray platform was used for analysis of 100 patients and their healthy parents, and data were evaluated using various databases and literature. Etiological diagnoses were made in 12 patients (12%). Homozygous deletion in the NRXN1 gene and duplication in the IL1RAPL1 gene were detected for the first time. Two separate patients had deletions in the FOXP2 and UBE2A genes, respectively, for which only a few patients have recently been reported. Interstitial and subtelomeric copy number changes were described in 6 patients, in whom routine cytogenetic tools revealed normal results. In one patient, the uniparental disomy type of Angelman syndrome was diagnosed. SNP microarrays constitute a screening test able to detect very small genomic changes, with a high etiological yield even in patients already evaluated using traditional cytogenetic tools; they offer analysis for uniparental disomy and homozygosity and are thereby helpful in finding novel disease-causing genes. For these reasons they should be considered as a first-tier genetic screening test in the evaluation of patients with ID and autism.

  11. Genome-Wide SNP Detection, Validation, and Development of an 8K SNP Array for Apple

    PubMed Central

    Chagné, David; Crowhurst, Ross N.; Troggio, Michela; Davey, Mark W.; Gilmore, Barbara; Lawley, Cindy; Vanderzande, Stijn; Hellens, Roger P.; Kumar, Satish; Cestaro, Alessandro; Velasco, Riccardo; Main, Dorrie; Rees, Jasper D.; Iezzoni, Amy; Mockler, Todd; Wilhelm, Larry; Van de Weg, Eric; Gardiner, Susan E.; Bassil, Nahla; Peace, Cameron

    2012-01-01

    As high-throughput genetic marker screening systems are essential for a range of genetics studies and plant breeding applications, the International RosBREED SNP Consortium (IRSC) has utilized the Illumina Infinium® II system to develop a medium- to high-throughput SNP screening tool for genome-wide evaluation of allelic variation in apple (Malus×domestica) breeding germplasm. For genome-wide SNP discovery, 27 apple cultivars were chosen to represent worldwide breeding germplasm and re-sequenced at low coverage with the Illumina Genome Analyzer II. Following alignment of these sequences to the whole genome sequence of ‘Golden Delicious’, SNPs were identified using SoapSNP. A total of 2,113,120 SNPs were detected, corresponding to one SNP to every 288 bp of the genome. The Illumina GoldenGate® assay was then used to validate a subset of 144 SNPs with a range of characteristics, using a set of 160 apple accessions. This validation assay enabled fine-tuning of the final subset of SNPs for the Illumina Infinium® II system. The set of stringent filtering criteria developed allowed choice of a set of SNPs that not only exhibited an even distribution across the apple genome and a range of minor allele frequencies to ensure utility across germplasm, but also were located in putative exonic regions to maximize genotyping success rate. A total of 7867 apple SNPs was established for the IRSC apple 8K SNP array v1, of which 5554 were polymorphic after evaluation in segregating families and a germplasm collection. This publicly available genomics resource will provide an unprecedented resolution of SNP haplotypes, which will enable marker-locus-trait association discovery, description of the genetic architecture of quantitative traits, investigation of genetic variation (neutral and functional), and genomic selection in apple. PMID:22363718

  12. is-rSNP: a novel technique for in silico regulatory SNP detection

    PubMed Central

    Macintyre, Geoff; Bailey, James; Haviv, Izhak; Kowalczyk, Adam

    2010-01-01

    Motivation: Determining the functional impact of non-coding disease-associated single nucleotide polymorphisms (SNPs) identified by genome-wide association studies (GWAS) is challenging. Many of these SNPs are likely to be regulatory SNPs (rSNPs): variations which affect the ability of a transcription factor (TF) to bind to DNA. However, experimental procedures for identifying rSNPs are expensive and labour intensive. Therefore, in silico methods are required for rSNP prediction. By scoring two alleles with a TF position weight matrix (PWM), it can be determined which SNPs are likely rSNPs. However, predictions in this manner are noisy and no method exists that determines the statistical significance of a nucleotide variation on a PWM score. Results: We have designed an algorithm for in silico rSNP detection called is-rSNP. We employ novel convolution methods to determine the complete distributions of PWM scores and ratios between allele scores, facilitating assignment of statistical significance to rSNP effects. We have tested our method on 41 experimentally verified rSNPs, correctly predicting the disrupted TF in 28 cases. We also analysed 146 disease-associated SNPs with no known functional impact in an attempt to identify candidate rSNPs. Of the 11 significantly predicted disrupted TFs, 9 had previous evidence of being associated with the disease in the literature. These results demonstrate that is-rSNP is suitable for high-throughput screening of SNPs for potential regulatory function. This is a useful and important tool in the interpretation of GWAS. Availability: is-rSNP software is available for use at: www.genomics.csse.unimelb.edu.au/is-rSNP Contact: gmaci@csse.unimelb.edu.au; adam.kowalczyk@nicta.com.au Supplementary information: Supplementary data are available at Bioinformatics online. PMID:20823317
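
The allele-scoring step described in this abstract (scoring two alleles with a TF position weight matrix) can be sketched as follows. This covers only that basic step and assumes a uniform background base frequency. It does not reproduce the convolution-based significance calculation that is-rSNP adds. The toy PWM, the flanking sequences and the function names are hypothetical.

```python
from math import log2

BACKGROUND = 0.25  # ASSUMED uniform background base frequency

def pwm_score(pwm, site):
    """Log-odds score of a site; pwm is a list of per-position base-probability dicts."""
    return sum(log2(col[base] / BACKGROUND) for col, base in zip(pwm, site))

def best_allele_score(pwm, flank5, allele, flank3):
    """Best PWM score over all windows that cover the SNP position."""
    seq = flank5 + allele + flank3
    snp = len(flank5)   # index of the SNP within seq
    w = len(pwm)        # motif width
    starts = range(max(0, snp - w + 1), min(snp, len(seq) - w) + 1)
    return max(pwm_score(pwm, seq[s:s + w]) for s in starts)

# HYPOTHETICAL 3-column PWM with consensus "ACG".
toy_pwm = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
]

ref = best_allele_score(toy_pwm, "TA", "C", "GT")  # reference allele C
alt = best_allele_score(toy_pwm, "TA", "T", "GT")  # alternate allele T
delta = ref - alt  # positive: the alternate allele weakens the best match
```

A raw difference like `delta` has no significance attached to it, which is the gap the abstract says is-rSNP addresses by deriving the full distributions of PWM scores and allele score ratios.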

  13. Genome-wide SNP discovery in walnut with an AGSNP pipeline updated for SNP discovery in allogamous organisms

    PubMed Central

    2012-01-01

    Background A genome-wide set of single nucleotide polymorphisms (SNPs) is a valuable resource in genetic research and breeding and is usually developed by re-sequencing a genome. If a genome sequence is not available, an alternative strategy must be used. We previously reported the development of a pipeline (AGSNP) for genome-wide SNP discovery in coding sequences and other single-copy DNA without a complete genome sequence in self-pollinating (autogamous) plants. Here we updated this pipeline for SNP discovery in outcrossing (allogamous) species and demonstrated its efficacy in SNP discovery in walnut (Juglans regia L.). Results The first step in the original implementation of the AGSNP pipeline was the construction of a reference sequence and the identification of single-copy sequences in it. To identify single-copy sequences, multiple genome equivalents of short SOLiD reads of another individual were mapped to shallow genome coverage of long Sanger or Roche 454 reads making up the reference sequence. The relative depth of SOLiD reads was used to filter out repeated sequences from single-copy sequences in the reference sequence. The second step was a search for SNPs between SOLiD reads and the reference sequence. Polymorphism within the mapped SOLiD reads would have precluded SNP discovery; hence both individuals had to be homozygous. The AGSNP pipeline was updated here for using SOLiD or other type of short reads of a heterozygous individual for these two principal steps. A total of 32.6X walnut genome equivalents of SOLiD reads of vegetatively propagated walnut scion cultivar ‘Chandler’ were mapped to 48,661 ‘Chandler’ bacterial artificial chromosome (BAC) end sequences (BESs) produced by Sanger sequencing during the construction of a walnut physical map. A total of 22,799 putative SNPs were initially identified. A total of 6,000 Infinium II type SNPs evenly distributed along the walnut physical map were selected for the construction of an Infinium Bead

  14. Genome-wide SNP discovery in walnut with an AGSNP pipeline updated for SNP discovery in allogamous organisms.

    PubMed

    You, Frank M; Deal, Karin R; Wang, Jirui; Britton, Monica T; Fass, Joseph N; Lin, Dawei; Dandekar, Abhaya M; Leslie, Charles A; Aradhya, Mallikarjuna; Luo, Ming-Cheng; Dvorak, Jan

    2012-07-31

    A genome-wide set of single nucleotide polymorphisms (SNPs) is a valuable resource in genetic research and breeding and is usually developed by re-sequencing a genome. If a genome sequence is not available, an alternative strategy must be used. We previously reported the development of a pipeline (AGSNP) for genome-wide SNP discovery in coding sequences and other single-copy DNA without a complete genome sequence in self-pollinating (autogamous) plants. Here we updated this pipeline for SNP discovery in outcrossing (allogamous) species and demonstrated its efficacy in SNP discovery in walnut (Juglans regia L.). The first step in the original implementation of the AGSNP pipeline was the construction of a reference sequence and the identification of single-copy sequences in it. To identify single-copy sequences, multiple genome equivalents of short SOLiD reads of another individual were mapped to shallow genome coverage of long Sanger or Roche 454 reads making up the reference sequence. The relative depth of SOLiD reads was used to filter out repeated sequences from single-copy sequences in the reference sequence. The second step was a search for SNPs between SOLiD reads and the reference sequence. Polymorphism within the mapped SOLiD reads would have precluded SNP discovery; hence both individuals had to be homozygous. The AGSNP pipeline was updated here for using SOLiD or other type of short reads of a heterozygous individual for these two principal steps. A total of 32.6X walnut genome equivalents of SOLiD reads of vegetatively propagated walnut scion cultivar 'Chandler' were mapped to 48,661 'Chandler' bacterial artificial chromosome (BAC) end sequences (BESs) produced by Sanger sequencing during the construction of a walnut physical map. A total of 22,799 putative SNPs were initially identified. A total of 6,000 Infinium II type SNPs evenly distributed along the walnut physical map were selected for the construction of an Infinium BeadChip, which was used to

  15. Analyzing cancer samples with SNP arrays.

    PubMed

    Van Loo, Peter; Nilsen, Gro; Nordgard, Silje H; Vollan, Hans Kristian Moen; Børresen-Dale, Anne-Lise; Kristensen, Vessela N; Lingjærde, Ole Christian

    2012-01-01

    Single nucleotide polymorphism (SNP) arrays are powerful tools to delineate genomic aberrations in cancer genomes. However, the analysis of these SNP array data of cancer samples is complicated by three phenomena: (a) aneuploidy: due to massive aberrations, the total DNA content of a cancer cell can differ significantly from its normal two copies; (b) nonaberrant cell admixture: samples from solid tumors do not exclusively contain aberrant tumor cells, but always contain some portion of nonaberrant cells; (c) intratumor heterogeneity: different cells in the tumor sample may have different aberrations. We describe here how these phenomena impact the SNP array profile, and how these can be accounted for in the analysis. In an extended practical example, we apply our recently developed and further improved ASCAT (allele-specific copy number analysis of tumors) suite of tools to analyze SNP array data using data from a series of breast carcinomas as an example. We first describe the structure of the data, how it can be plotted and interpreted, and how it can be segmented. The core ASCAT algorithm next determines the fraction of nonaberrant cells and the tumor ploidy (the average number of DNA copies), and calculates an ASCAT profile. We describe how these ASCAT profiles visualize both copy number aberrations as well as copy-number-neutral events. Finally, we touch upon regions showing intratumor heterogeneity, and how they can be detected in ASCAT profiles. All source code and data described here can be found at our ASCAT Web site ( http://www.ifi.uio.no/forskning/grupper/bioinf/Projects/ASCAT/).

  16. Integrated Analysis of SNP, CNV and Gene Expression Data in Genetic Association Studies.

    PubMed

    Momtaz, Rana; Ghanem, Nagia M; El-Makky, Nagwa M; Ismail, Mohamed A

    2017-07-07

    Integrative approaches that combine multiple forms of data can more accurately capture pathway associations and so provide a comprehensive understanding of the molecular mechanisms that cause complex diseases. Association analyses based on SNP genotypes, CNV genotypes, and gene expression profiles are the three most common paradigms used for gene set/pathway enrichment analyses. Much work has been done to leverage information from two of these three types of data. However, to the best of our knowledge, no previous work has integrated all three paradigms together. In this paper, we present an integrated analysis that combines SNP, CNV, and gene expression data to generate a single gene list. We present different methods to compare this gene list with the other three possible lists that result from combining the following pairs of data: SNP genotype with gene expression, CNV genotype with gene expression, and SNP genotype with CNV genotype. The comparison is done using three different cancer datasets and two different methods of comparison. Our results show that integrating SNP, CNV, and gene expression data gives better association results than integrating any pair of the three data types.

  17. Variable Selection in Logistic Regression for Detecting SNP-SNP Interactions: the Rheumatoid Arthritis Example

    PubMed Central

    Lin, H. Y.; Desmond, R.; Liu, Y. H.; Bridges, S. L.; Soong, S. J.

    2013-01-01

    Summary Many complex disease traits are observed to be associated with single nucleotide polymorphism (SNP) interactions. In testing small-scale SNP-SNP interactions, variable selection procedures in logistic regressions are commonly used. The empirical evidence of variable selection for testing interactions in logistic regressions is limited. This simulation study was designed to compare nine variable selection procedures in logistic regressions for testing SNP-SNP interactions. Data on 10 SNPs were simulated for 400 and 1000 subjects (case/control ratio=1). The simulated model included one main effect and two 2-way interactions. The variable selection procedures included automatic selection (stepwise, forward and backward), common 2-step selection, AIC- and BIC-based selection. The hierarchical rule effect, in which all main effects and lower order terms of the highest-order interaction term are included in the model regardless of their statistical significance, was also examined. We found that the stepwise variable selection without the hierarchical rule which had reasonably high authentic (true positive) proportion and low noise (false positive) proportion, is a better method compared to other variable selection procedures. The procedure without the hierarchical rule requires fewer terms in testing interactions, so it can accommodate more SNPs than the procedure with the hierarchical rule. For testing interactions, the procedures without the hierarchical rule had higher authentic proportion and lower noise proportion compared with ones with the hierarchical rule. These variable selection procedures were also applied and compared in a rheumatoid arthritis study. PMID:18231122

  18. Developing single nucleotide polymorphism (SNP) markers from transcriptome sequences for identification of longan (Dimocarpus longan) germplasm

    PubMed Central

    Wang, Boyi; Tan, Hua-Wei; Fang, Wanping; Meinhardt, Lyndel W; Mischke, Sue; Matsumoto, Tracie; Zhang, Dapeng

    2015-01-01

    Longan (Dimocarpus longan Lour.) is an important tropical fruit tree crop. Accurate varietal identification is essential for germplasm management and breeding. Using longan transcriptome sequences from public databases, we developed single nucleotide polymorphism (SNP) markers; validated 60 SNPs in 50 longan germplasm accessions, including cultivated varieties and wild germplasm; and designated 25 SNP markers that unambiguously identified all tested longan varieties with high statistical rigor (P<0.0001). Multiple trees from the same clone were verified and off-type trees were identified. Diversity analysis revealed genetic relationships among analyzed accessions. Cultivated varieties differed significantly from wild populations (Fst=0.300; P<0.001), demonstrating untapped genetic diversity for germplasm conservation and utilization. Within cultivated varieties, apparent differences between varieties from China and those from Thailand and Hawaii indicated geographic patterns of genetic differentiation. These SNP markers provide a powerful tool to manage longan genetic resources and breeding, with accurate and efficient genotype identification. PMID:26504559

  19. A SNP-Based Molecular Barcode for Characterization of Common Wheat

    PubMed Central

    Gao, LiFeng; Jia, JiZeng; Kong, XiuYing

    2016-01-01

    Wheat is grown as a staple crop worldwide. It is important to develop an effective genotyping tool for this cereal grain both to identify germplasm diversity and to protect the rights of breeders. Single-nucleotide polymorphism (SNP) genotyping provides a means for developing a practical, rapid, inexpensive and high-throughput assay. Here, we investigated SNPs as robust markers of genetic variation for typing wheat cultivars. We identified SNPs from an array of 9000 across a collection of 429 well-known wheat cultivars grown in China, of which 43 SNP markers with high minor allele frequency and variations discriminated the selected wheat varieties and their wild ancestors. This SNP-based barcode will allow for the rapid and precise identification of wheat germplasm resources and newly released varieties and will further assist in the wheat breeding program. PMID:26985664

  20. Analyzing copy number variation using SNP array data: protocols for calling CNV and association tests.

    PubMed

    Lin, Chiao-Feng; Naj, Adam C; Wang, Li-San

    2013-10-18

    High-density SNP genotyping technology provides a low-cost, effective tool for conducting genome-wide association (GWA) studies. The wide adoption of GWA studies has indeed led to discoveries of disease- or trait-associated SNPs, some of which were subsequently shown to be causal. However, the nearly universal shortcoming of GWA studies, missing heritability, has prompted great interest in searching for other types of genetic variation, such as copy number variation (CNV). Certain CNVs have been reported to alter disease susceptibility. Algorithms and tools have been developed to identify CNVs using SNP array hybridization intensity data. Such an approach provides an additional source of data at almost no extra cost. In this unit, we demonstrate the steps for calling CNVs from Illumina SNP array data using PennCNV and performing association analysis using R and PLINK.

  1. Applying SNP marker technology in the cacao breeding program at the Cocoa Research Institute of Ghana

    USDA-ARS's Scientific Manuscript database

    In this investigation 45 parental cacao plants and five progeny derived from the parental stock studied were genotyped using six SNP markers to determine off-types or mislabeled clones and to authenticate crosses made in the Cocoa Research Institute of Ghana (CRIG) breeding program. Investigation wa...

  2. High-throughput SNP genotyping in Cucurbita pepo for map construction and quantitative trait loci mapping

    PubMed Central

    2012-01-01

    Background Cucurbita pepo is a member of the Cucurbitaceae family, the second-most important horticultural family in terms of economic importance after Solanaceae. The "summer squash" types, including Zucchini and Scallop, rank among the highest-valued vegetables worldwide. There are few genomic tools available for this species. The first Cucurbita transcriptome, along with a large collection of Single Nucleotide Polymorphisms (SNP), was recently generated using massive sequencing. A set of 384 SNP was selected to generate an Illumina GoldenGate assay in order to construct the first SNP-based genetic map of Cucurbita and map quantitative trait loci (QTL). Results We herein present the construction of the first SNP-based genetic map of Cucurbita pepo using a population derived from the cross of two varieties with contrasting phenotypes, representing the main cultivar groups of the species' two subspecies: Zucchini (subsp. pepo) × Scallop (subsp. ovifera). The mapping population was genotyped with 384 SNP, a set of selected EST-SNP identified in silico after massive sequencing of the transcriptomes of both parents, using the Illumina GoldenGate platform. The global success rate of the assay was higher than 85%. In total, 304 SNP were mapped, along with 11 SSR from a previous map, giving a map density of 5.56 cM/marker. This map was used to infer syntenic relationships between C. pepo and cucumber and to successfully map QTL that control plant, flowering and fruit traits that are of benefit to squash breeding. The QTL effects were validated in backcross populations. Conclusion Our results show that massive sequencing in different genotypes is an excellent tool for SNP discovery, and that the Illumina GoldenGate platform can be successfully applied to constructing genetic maps and performing QTL analysis in Cucurbita.
This is the first SNP-based genetic map in the Cucurbita genus and is an invaluable new tool for biological research, especially considering that most

  3. Exploring SNP-SNP interactions and colon cancer risk using polymorphism interaction analysis

    PubMed Central

    Goodman, Julie E.; Mechanic, Leah E.; Luke, Brian T.; Ambs, Stefan; Chanock, Stephen; Harris, Curtis C.

    2006-01-01

    Several single nucleotide polymorphisms (SNPs) in genes derived from distinct pathways are associated with colon cancer risk; however, few studies have examined SNP-SNP interactions concurrently. We explored the association between colon cancer and 94 SNPs, using a novel approach, polymorphism interaction analysis (PIA). We developed PIA to examine all possible SNP combinations, based on the 94 SNPs studied in 216 male colon cancer cases and 255 male controls, employing 2 separate functions that cross-validate and minimize false-positive results in the evaluation of SNP combinations to predict colon cancer risk. PIA identified previously described null polymorphisms in glutathione-S-transferase T1 (GSTT1) as the best predictor of colon cancer among the studied SNPs, and also identified novel polymorphisms in the inflammation and hormone metabolism pathways that singly or jointly predict cancer risk. PIA identified SNPs that may interact with the GSTT1 polymorphism, including coding polymorphisms in TP53 (Arg72Pro in p53) and CASP8 (Asp302His in caspase 8), which may modify the association between this polymorphism and colon cancer. This was confirmed by logistic regression, as the GSTT1 null polymorphism in combination with either the TP53 or the CASP8 polymorphism significantly alters colon cancer risk (p for interaction < 0.02 for both). GSTT1 prevents DNA damage by detoxifying mutagenic compounds, while the p53 protein facilitates repair of DNA damage and induces apoptosis, and caspase 8 is activated in p53-mediated apoptosis. Our results suggest that PIA is a valid method for suggesting SNP-SNP interactions that may be validated in future studies, using more traditional statistical methods on different datasets (Supplementary material can be found on the International Journal of Cancer website at http://www.interscience.wiley.com/jpages/0020-7136/suppmat). PMID:16217767

  4. SNP-RFLPing: restriction enzyme mining for SNPs in genomes.

    PubMed

    Chang, Hsueh-Wei; Yang, Cheng-Hong; Chang, Phei-Lang; Cheng, Yu-Huei; Chuang, Li-Yeh

    2006-02-17

    The restriction fragment length polymorphism (RFLP) is a common laboratory method for the genotyping of single nucleotide polymorphisms (SNPs). Here, we describe a web-based software, named SNP-RFLPing, which provides the restriction enzyme for RFLP assays on a batch of SNPs and genes from the human, rat, and mouse genomes. Three user-friendly inputs are included: 1) NCBI dbSNP "rs" or "ss" IDs; 2) NCBI Entrez gene ID and HUGO gene name; 3) any formats of SNP-in-sequence, are allowed to perform the SNP-RFLPing assay. These inputs are auto-programmed to SNP-containing sequences and their complementary sequences for the selection of restriction enzymes. All SNPs with available RFLP restriction enzymes of each input genes are provided even if many SNPs exist. The SNP-RFLPing analysis provides the SNP contig position, heterozygosity, function, protein residue, and amino acid position for cSNPs, as well as commercial and non-commercial restriction enzymes. This web-based software solves the input format problems in similar softwares and greatly simplifies the procedure for providing the RFLP enzyme. Mixed free forms of input data are friendly to users who perform the SNP-RFLPing assay. SNP-RFLPing offers a time-saving application for association studies in personalized medicine and is freely available at http://bio.kuas.edu.tw/snp-rflp/.
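    The enzyme-selection step described above can be illustrated with a minimal sketch. It is not the SNP-RFLPing implementation: the function names, the six-enzyme panel and the example sequence are assumptions chosen for illustration, and only the presence or absence of a recognition site on either strand is checked.

```python
# Illustrative sketch: find enzymes whose recognition site is present on
# exactly one SNP allele, so that the two alleles give different RFLP patterns.

# Small hand-picked panel of well-known recognition sequences (5'->3').
ENZYMES = {
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
    "TaqI": "TCGA",
    "HaeIII": "GGCC",
    "AluI": "AGCT",
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def has_site(seq, site):
    """True if the recognition site occurs on either strand of seq."""
    return site in seq or site in reverse_complement(seq)


def discriminating_enzymes(left_flank, alleles, right_flank):
    """Map enzyme name -> the single allele whose sequence contains its site."""
    result = {}
    for name, site in ENZYMES.items():
        cut = [a for a in alleles if has_site(left_flank + a + right_flank, site)]
        if len(cut) == 1:  # site present on one allele only
            result[name] = cut[0]
    return result


# Hypothetical SNP-in-sequence input: GCAGAAT[T/C]CGGTA
hits = discriminating_enzymes("GCAGAAT", ("T", "C"), "CGGTA")
print(hits)
```

    A production tool would also need to handle degenerate recognition sequences and sites that occur elsewhere in the amplicon; this sketch ignores both.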

  5. SNP calling by sequencing pooled samples

    PubMed Central

    2012-01-01

    Background Performing high throughput sequencing on samples pooled from different individuals is a strategy to characterize genetic variability at a small fraction of the cost required for individual sequencing. In certain circumstances some variability estimators have even lower variance than those obtained with individual sequencing. SNP calling and estimating the frequency of the minor allele from pooled samples, though, is a subtle exercise for at least three reasons. First, sequencing errors may have a much larger relevance than in individual SNP calling: while their impact in individual sequencing can be reduced by setting a restriction on a minimum number of reads per allele, this would have a strong and undesired effect in pools because it is unlikely that alleles at low frequency in the pool will be read many times. Second, the prior allele frequency for heterozygous sites in individuals is usually 0.5 (assuming one is not analyzing sequences coming from, e.g. cancer tissues), but this is not true in pools: in fact, under the standard neutral model, singletons (i.e. alleles of minimum frequency) are the most common class of variants because P(f) ∝ 1/f and they occur more often as the sample size increases. Third, an allele appearing only once in the reads from a pool does not necessarily correspond to a singleton in the set of individuals making up the pool, and vice versa, there can be more than one read – or, more likely, none – from a true singleton. Results To improve upon existing theory and software packages, we have developed a Bayesian approach for minor allele frequency (MAF) computation and SNP calling in pools (and implemented it in a program called snape): the approach takes into account sequencing errors and allows users to choose different priors. We also set up a pipeline which can simulate the coalescence process giving rise to the SNPs, the pooling procedure and the sequencing. We used it to compare the performance of snape to that
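    Two of the points made in this abstract can be made concrete with a small sketch: the neutral-model spectrum P(f) ∝ 1/f, and the chance that a true singleton in the pool is read zero, one or several times. This is not the snape program; the pool size and read depth are hypothetical, reads are assumed to sample chromosomes uniformly, and sequencing errors are ignored.

```python
from math import comb


def neutral_spectrum(n):
    """Expected share of segregating sites with allele count i = 1..n-1
    among n chromosomes under the standard neutral model, P(i) ∝ 1/i."""
    weights = [1.0 / i for i in range(1, n)]
    total = sum(weights)
    return [w / total for w in weights]


def read_count_prob(k, depth, allele_freq):
    """Binomial probability that exactly k of `depth` reads carry the allele."""
    return comb(depth, k) * allele_freq**k * (1 - allele_freq) ** (depth - k)


n_chrom = 20  # hypothetical pool of 10 diploid individuals
depth = 20    # hypothetical read depth at the site

spectrum = neutral_spectrum(n_chrom)
print(f"expected share of singletons: {spectrum[0]:.3f}")

singleton_freq = 1 / n_chrom
p_none = read_count_prob(0, depth, singleton_freq)
p_once = read_count_prob(1, depth, singleton_freq)
p_more = 1.0 - p_none - p_once
print(f"singleton read 0 times: {p_none:.3f}, once: {p_once:.3f}, more: {p_more:.3f}")
```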

  6. The MDM2 promoter polymorphism SNP309T→G and the risk of uterine leiomyosarcoma, colorectal cancer, and squamous cell carcinoma of the head and neck

    PubMed Central

    Alhopuro, P; Ylisaukko-oja, S; Koskinen, W; Bono, P; Arola, J; Jarvinen, H; Mecklin, J; Atula, T; Kontio, R; Makitie, A; Suominen, S; Leivo, I; Vahteristo, P; Aaltonen, L; Aaltonen, L

    2005-01-01

    Background: MDM2 acts as a principal regulator of the tumour suppressor p53 by targeting its destruction through the ubiquitin pathway. A polymorphism in the MDM2 promoter (SNP309) was recently identified. SNP309 was shown to result, via Sp1, in higher levels of MDM2 RNA and protein, and subsequent attenuation of the p53 pathway. Furthermore, SNP309 was proposed to be associated with accelerated soft tissue sarcoma formation in both hereditary (Li-Fraumeni) and sporadic cases in humans. Methods: We evaluated the possible contribution of SNP309 to three tumour types known to be linked with the MDM2/p53 pathway, using genomic sequencing or restriction fragment length polymorphism as screening methods. Three separate Finnish tumour materials (population based sets of 68 patients with early onset uterine leiomyosarcomas and 1042 patients with colorectal cancer, and a series of 162 patients with squamous cell carcinoma of the head and neck) and a set of 185 healthy Finnish controls were analysed for SNP309. Results: Frequencies of SNP309 were similar in all four cohorts. In the colorectal cancer series, SNP309 was somewhat more frequent in women and in patients with microsatellite stable tumours. Female SNP309 carriers were diagnosed with colorectal cancer approximately 2.7 years earlier than those carrying the wild type gene. However, no statistically significant association of SNP309 with patients' age at disease onset or to any other clinicopathological parameter was found in these three tumour materials. Conclusion: SNP309 had no significant contribution to tumour formation in our materials. Possible associations of SNP309 with microsatellite stable colorectal cancer and with earlier disease onset in female carriers need to be examined in subsequent studies. PMID:16141004

  7. High-density SNP-based genetic map development and linkage disequilibrium assessment in Brassica napus L

    PubMed Central

    2013-01-01

    Background High density genetic maps built with SNP markers that are polymorphic in various genetic backgrounds are very useful for studying the genetics of agronomical traits as well as genome organization and evolution. Simultaneous dense SNP genotyping of segregating populations and variety collections was applied to oilseed rape (Brassica napus L.) to obtain a high density genetic map for this species and to study the linkage disequilibrium pattern. Results We developed an integrated genetic map for oilseed rape by high throughput SNP genotyping of four segregating doubled haploid populations. A very high level of collinearity was observed between the four individual maps and a large number of markers (>59%) was common to more than two maps. The precise integrated map comprises 5764 SNP and 1603 PCR markers. With a total genetic length of 2250 cM, the integrated map contains a density of 3.27 markers (2.56 SNP) per cM. Genotyping of these mapped SNP markers in oilseed rape collections allowed polymorphism level and linkage disequilibrium (LD) to be studied across the different collections (winter vs spring, different seed quality types) and along the linkage groups. Overall, polymorphism level was higher and LD decayed faster in spring than in “00” winter oilseed rape types but this was shown to vary greatly along the linkage groups. Conclusions Our study provides a valuable resource for further genetic studies using linkage or association mapping, for marker assisted breeding and for Brassica napus sequence assembly and genome organization analyses. PMID:23432809
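    The density figures quoted in this abstract follow directly from the marker counts and the map length. A short check, using only the numbers given above (the variable names are illustrative):

```python
# Marker counts and map length as reported in the abstract.
snp_markers = 5764
pcr_markers = 1603
map_length_cm = 2250.0

total_markers = snp_markers + pcr_markers
markers_per_cm = total_markers / map_length_cm
snp_per_cm = snp_markers / map_length_cm
# Average gap between markers if they were evenly spaced (they are not in practice).
mean_spacing_cm = map_length_cm / total_markers

print(f"{total_markers} markers, {markers_per_cm:.2f} markers/cM, "
      f"{snp_per_cm:.2f} SNP/cM, mean spacing {mean_spacing_cm:.2f} cM")
```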

  8. dbSNP: the NCBI database of genetic variation.

    PubMed

    Sherry, S T; Ward, M H; Kholodov, M; Baker, J; Phan, L; Smigielski, E M; Sirotkin, K

    2001-01-01

    In response to a need for a general catalog of genome variation to address the large-scale sampling designs required by association studies, gene mapping and evolutionary biology, the National Center for Biotechnology Information (NCBI) has established the dbSNP database [S.T.Sherry, M.Ward and K. Sirotkin (1999) Genome Res., 9, 677-679]. Submissions to dbSNP will be integrated with other sources of information at NCBI such as GenBank, PubMed, LocusLink and the Human Genome Project data. The complete contents of dbSNP are available to the public at website: http://www.ncbi.nlm.nih.gov/SNP. The complete contents of dbSNP can also be downloaded in multiple formats via anonymous FTP at ftp://ncbi.nlm.nih.gov/snp/.

  9. SNP2CAPS: a SNP and INDEL analysis tool for CAPS marker development.

    PubMed

    Thiel, Thomas; Kota, Raja; Grosse, Ivo; Stein, Nils; Graner, Andreas

    2004-01-02

    With the influx of various SNP genotyping assays in recent years, there has been a need for an assay that is robust, yet cost effective, and could be performed using standard gel-based procedures. In this context, CAPS markers have been shown to meet these criteria. However, converting SNPs to CAPS markers can be a difficult process if done manually. In order to address this problem, we describe a computer program, SNP2CAPS, that facilitates the computational conversion of SNP markers into CAPS markers. 413 multiple aligned sequences derived from barley ESTs were analysed for the presence of polymorphisms in 235 distinct restriction sites. 282 (90%) of 314 alignments that contain sequence variation due to SNPs and InDels revealed at least one polymorphic restriction site. After reducing the number of restriction enzymes from 235 to 10, 31% of the polymorphic sites could still be detected. In order to demonstrate the usefulness of this tool for marker development, we experimentally validated some of the results predicted by SNP2CAPS.

  10. SNP marker detection and genotyping in tilapia.

    PubMed

    Van Bers, N E M; Crooijmans, R P M A; Groenen, M A M; Dibbits, B W; Komen, J

    2012-09-01

    We have generated a unique resource consisting of nearly 175 000 short contig sequences and 3569 SNP markers from the widely cultured GIFT (Genetically Improved Farmed Tilapia) strain of Nile tilapia (Oreochromis niloticus). In total, 384 SNPs were selected to monitor the wider applicability of the SNPs by genotyping tilapia individuals from different strains and different geographical locations. In all strains and species tested (O. niloticus, O. aureus and O. mossambicus), the genotyping assay was working for a similar number of SNPs (288-305 SNPs). The actual number of polymorphic SNPs was, as expected, highest for individuals from the GIFT population (255 SNPs). In the individuals from an Egyptian strain and in individuals caught in the wild in the basin of the river Volta, 197 and 163 SNPs were polymorphic, respectively. A pairwise calculation of Nei's genetic distance allowed the discrimination of the individual strains and species based on the genotypes determined with the SNP set. We expect that this set will be widely applicable for use in tilapia aquaculture, e.g. for pedigree reconstruction. In addition, this set is currently used for assaying the genetic diversity of native Nile tilapia in areas where tilapia is, or will be, introduced in aquaculture projects. This allows the tracing of escapees from aquaculture and the monitoring of effects of introgression and hybridization.
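    The abstract does not say which of Nei's distance measures was used. As a sketch, assuming Nei's standard genetic distance D = -ln(Jxy / sqrt(Jx Jy)) computed from per-locus allele frequencies, a pairwise calculation could look as follows. The allele frequencies are invented for illustration and are not taken from the study.

```python
from math import log, sqrt


def nei_standard_distance(freqs_x, freqs_y):
    """Nei's standard genetic distance between two populations.

    freqs_x, freqs_y: one entry per locus, each a list of allele
    frequencies that sum to 1 (same allele order in both populations).
    """
    n_loci = len(freqs_x)
    jx = sum(sum(p * p for p in locus) for locus in freqs_x) / n_loci
    jy = sum(sum(q * q for q in locus) for locus in freqs_y) / n_loci
    jxy = sum(
        sum(p * q for p, q in zip(lx, ly)) for lx, ly in zip(freqs_x, freqs_y)
    ) / n_loci
    identity = jxy / sqrt(jx * jy)
    return -log(identity)


def biallelic(ref_freqs):
    """Expand reference-allele frequencies of biallelic SNPs to [p, 1 - p]."""
    return [[p, 1.0 - p] for p in ref_freqs]


# Hypothetical reference-allele frequencies at four SNPs in two strains.
strain_a = biallelic([0.9, 0.5, 0.2, 0.7])
strain_b = biallelic([0.6, 0.5, 0.8, 0.1])

d_ab = nei_standard_distance(strain_a, strain_b)
print(f"D(A, B) = {d_ab:.3f}")
```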

  11. SNP-RFLPing: restriction enzyme mining for SNPs in genomes

    PubMed Central

    Chang, Hsueh-Wei; Yang, Cheng-Hong; Chang, Phei-Lang; Cheng, Yu-Huei; Chuang, Li-Yeh

    2006-01-01

    Background The restriction fragment length polymorphism (RFLP) is a common laboratory method for the genotyping of single nucleotide polymorphisms (SNPs). Here, we describe a web-based software, named SNP-RFLPing, which provides the restriction enzyme for RFLP assays on a batch of SNPs and genes from the human, rat, and mouse genomes. Results Three user-friendly inputs are included: 1) NCBI dbSNP "rs" or "ss" IDs; 2) NCBI Entrez gene ID and HUGO gene name; 3) any formats of SNP-in-sequence, are allowed to perform the SNP-RFLPing assay. These inputs are auto-programmed to SNP-containing sequences and their complementary sequences for the selection of restriction enzymes. All SNPs with available RFLP restriction enzymes of each input genes are provided even if many SNPs exist. The SNP-RFLPing analysis provides the SNP contig position, heterozygosity, function, protein residue, and amino acid position for cSNPs, as well as commercial and non-commercial restriction enzymes. Conclusion This web-based software solves the input format problems in similar softwares and greatly simplifies the procedure for providing the RFLP enzyme. Mixed free forms of input data are friendly to users who perform the SNP-RFLPing assay. SNP-RFLPing offers a time-saving application for association studies in personalized medicine and is freely available at . PMID:16503968

  12. The importance of integrating SNP and cheminformatics resources to pharmacogenomics.

    PubMed

    Chang, Hsueh-Wei; Chuang, Li-Yeh; Tsai, Ming-Tz; Yang, Cheng-Hong

    2012-09-01

    Single nucleotide polymorphisms (SNPs) are the most frequent variants in many genes and are promising markers in relation to drug responses in pharmacogenomics studies. In this review, we emphasized the importance of the cheminformatic-related and SNP-related resources and tools and how they can improve pharmacogenomics studies. Currently, many cheminformatic resources are well developed and provide much information on drug metabolism and targeting. In parallel, there are also many well established SNP-related resources that are able to provide the information related to SNP genotyping, tag SNPs and functional classification. However, cheminformatic and SNP resources have not, as yet, been well-integrated to provide a user-friendly platform for pharmacogenomics studies. This paper presents a brief overview of the many available public resources for cheminformatics (DrugBank, PharmGKB and other drug-related databases) and SNPs (dbSNP, HapMap, SNP500Cancer, SNP-RFLPing 2 and other SNP tools) and points out the importance of integrating cheminformatic and SNP resources for the future of pharmacogenomics.

  13. Performance of the SNPforID 52 SNP-plex assay in paternity testing.

    PubMed

    Børsting, Claus; Sanchez, Juan J; Hansen, Hanna E; Hansen, Anders J; Bruun, Hanne Q; Morling, Niels

    2008-09-01

    The performance of a multiplex assay with 52 autosomal single nucleotide polymorphisms (SNPs) developed for human identification was tested on 124 mother-child-father trios. The typical paternity indices (PIs) were 10^5-10^6 for the trios and 10^3-10^4 for the child-father duos. Using the SNP profiles from the randomly selected trios and 700 previously typed individuals, a total of 83,096 comparisons between mother, child and an unrelated man were performed. On average, 9-10 mismatches per comparison were detected. Four mismatches were genetic inconsistencies and 5-6 mismatches were opposite homozygosities. In only two of the 83,096 comparisons did an unrelated man match perfectly to a mother-child duo, and in both cases the PI of the true father was much higher than the PI of the unrelated man. The trios were also typed for 15 short tandem repeats (STRs) and seven variable number of tandem repeats (VNTRs). The typical PIs based on 15 STRs or seven VNTRs were 5-50 times higher than the typical PIs based on 52 SNPs. Six mutations in tandem repeats were detected among the randomly selected trios. In contrast, no mutations were found in the SNP loci. The results showed that the 52 SNP-plex assay is a very useful alternative to currently used methods in relationship testing. The usefulness of SNP markers with low mutation rates in paternity and immigration casework is discussed.
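    The two kinds of mismatch counted in this abstract can be sketched for biallelic SNPs. The definitions used here are assumptions, not taken from the paper: an "opposite homozygosity" is taken to mean that child and man are homozygous for different alleles, and a "genetic inconsistency" is any other locus at which the child's genotype cannot arise from the mother and the man. Genotypes are coded as the number of copies of allele B, and the profiles are invented.

```python
from collections import Counter

# Alleles (0 = A, 1 = B) that a parent with a given genotype can transmit;
# the genotype is coded as the number of B copies (0, 1 or 2).
TRANSMITTABLE = {0: {0}, 1: {0, 1}, 2: {1}}


def opposite_homozygous(child, man):
    """Child and man are homozygous for different alleles (AA vs BB)."""
    return {child, man} == {0, 2}


def trio_consistent(mother, child, man):
    """True if one maternal plus one paternal allele can give the child's genotype."""
    return any(
        m + f == child
        for m in TRANSMITTABLE[mother]
        for f in TRANSMITTABLE[man]
    )


def classify(mother, child, man):
    """Label a single SNP in a mother-child-man comparison."""
    if opposite_homozygous(child, man):
        return "opposite homozygosity"
    if not trio_consistent(mother, child, man):
        return "genetic inconsistency"
    return "match"


# Hypothetical six-SNP profiles (the real assay types 52 SNPs).
mother = [0, 1, 2, 0, 1, 2]
child = [0, 1, 1, 1, 2, 2]
man = [2, 1, 0, 0, 1, 1]

summary = Counter(classify(m, c, f) for m, c, f in zip(mother, child, man))
print(dict(summary))
```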

  14. COL18A1 is highly expressed during human adipocyte differentiation and the SNP c.1136C > T in its "frizzled" motif is associated with obesity in diabetes type 2 patients.

    PubMed

    Errera, Flavia I V; Canani, Luís H; Yeh, Erika; Kague, Erika; Armelin-Corrêa, Lucia M; Suzuki, Oscar T; Tschiedel, Balduíno; Silva, Maria Elizabeth R; Sertié, Andréa L; Passos-Bueno, Maria Rita

    2008-03-01

    Collagen XVIII can generate two fragments, NC11-728 containing a frizzled motif which possibly acts in Wnt signaling and Endostatin, which is cleaved from the NC1 and is a potent inhibitor of angiogenesis. Collagen XVIII and Wnt signaling have recently been associated with adipogenic differentiation and obesity in some animal models, but not in humans. In the present report, we have shown that COL18A1 expression increases during human adipogenic differentiation. We also tested if polymorphisms in the Frizzled (c.1136C>T; Thr379Met) and Endostatin (c.4349G>A; Asp1437Asn) regions contribute towards susceptibility to obesity in patients with type 2 diabetes (113 obese, BMI ≥ 30; 232 non-obese, BMI < 30) of European ancestry. No evidence of association was observed between the allele c.4349G>A and obesity, but we observed a significantly higher frequency of homozygotes c.1136TT in obese (19.5%) than in non-obese individuals (10.9%) [P = 0.02; OR = 2.0 (95%CI: 1.07-3.73)], suggesting that the allele c.1136T is associated with obesity in a recessive model. This genotype, after controlling for cholesterol, LDL cholesterol, and triglycerides, was independently associated with obesity (P = 0.048), and increases the chance of obesity 2.8-fold. Therefore, our data suggest the involvement of collagen XVIII in human adipogenesis and susceptibility to obesity.
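    The reported odds ratio for the recessive model can be approximately reproduced from a 2x2 table. The abstract gives only percentages, so the genotype counts below (22 of 113 obese and 25 of 232 non-obese patients carrying c.1136TT) are back-calculated assumptions, and the log-based (Woolf) confidence interval is one common choice rather than the method the authors state.

```python
from math import exp, log, sqrt


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio with a log-based (Woolf) confidence interval.

    a: cases with the genotype      b: cases without it
    c: controls with the genotype   d: controls without it
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = exp(log(odds_ratio) - z * se_log_or)
    upper = exp(log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Assumed counts, back-calculated from the reported percentages.
tt_obese, other_obese = 22, 113 - 22
tt_nonobese, other_nonobese = 25, 232 - 25

or_tt, ci_low, ci_high = odds_ratio_ci(tt_obese, other_obese, tt_nonobese, other_nonobese)
print(f"OR = {or_tt:.2f} (95% CI {ci_low:.2f}-{ci_high:.2f})")
```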

  15. Evaluation of TP53 Pro72Arg and MDM2 SNP285-SNP309 polymorphisms in an Italian cohort of LFS suggestive patients lacking identifiable TP53 germline mutations.

    PubMed

    Ponti, Francesca; Corsini, Serena; Gnoli, Maria; Pedrini, Elena; Mordenti, Marina; Sangiorgi, Luca

    2016-10-01

    Li-Fraumeni syndrome (LFS) is a rare genetic cancer predisposition disease, partly determined by the presence of a TP53 germline mutation; the lack thereof, in the presence of a typical LFS phenotype, defines a wide group of 'LFS Suggestive' patients. Alternative LFS susceptibility genes have been investigated without promising results, thus suggesting the involvement of other genetic determinants in cancer predisposition. Hence, this study explores the single and combined effects on cancer risk, age of onset and cancer type of three single nucleotide polymorphisms (SNPs)-TP53 Pro72Arg, MDM2 SNP285 and SNP309-already described as modifiers in TP53 mutation carriers but not properly investigated in LFS Suggestive patients. This case-control study examines 34 Italian LFS Suggestive patients lacking germline TP53 mutations and 95 tumour-free subjects. A significant prevalence of homozygous MDM2 SNP309 G in the LFS Suggestive group (p < 0.0005) confirms its contribution to cancer susceptibility, also highlighted in LFS TP53 positive families. Conversely, its anticipating role on tumour onset has not been confirmed, as in our results it was associated with the SNP309 T allele. A strong combined outcome with a 'dosage' effect has also been reported for TP53 P72 and MDM2 SNP309 G allele on cancer susceptibility (p < 0.0005). In contrast, the neutralizing effect of the MDM2 SNP285 C allele on the MDM2 SNP309 G variant is not evident in our population. Although they need further evaluation, the results obtained strengthen the role of MDM2 SNP309 as a genetic factor in hereditary predisposition to cancer, thereby improving the management of LFS Suggestive patients.

  16. Single Nucleotide Polymorphism (SNP) in the Adiponectin Gene and Cardiovascular Disease.

    PubMed

    Chirumbolo, Salvatore

    2016-07-01

    Dear Editor, The recent article by Mohammadzadeh et al.[1] in the latest issue of this Journal showed that the T allele +276G/T SNP of ADIPOQ gene is more associated with the increasing risk of coronary artery disease (CAD) in subjects with type 2 diabetes. Adipocytes were described in myocardial tissue of CAD patients and their role recently discussed[2,3]. Susceptibility to CAD by polymorphism in the Q gene of adiponectin has been reported for 3'-UTR, which harbours some genetic loci associated with metabolic risks and atherosclerosis[4]. Actually, previous studies have shown that the haplotype SNP +276G>T was associated with a decreased risk of CAD, after adjustment for potential confounding factors, therefore some controversial opinion still exists[5]. This evidence should be associated with the role exerted by adipocytes and adiponectin in heart physiology. In particular, in hypertensive disorder complicating pregnancy (HDCP), by investigating the population frequency of alleles, genotypes, and haplotypes of two single nucleotide polymorphisms (SNPs), namely +45T>G (rs2241766) and +276G>T (rs1501299), some authors found that the SNP +276 TT genotype was significantly associated with protection against HDCP, when compared to the pooled G genotypes[6]. Moreover, the same +276G/T SNP haplotype was strongly associated with biliary atresia, an intractable neonatal inflammatory and obliterative cholangiopathy, leading to progressive fibrosis and cirrhosis[7]. CAD is closely related to adiponectin biology. The same isoforms of adiponectin seem to be not associated to CAD severity but to glucose metabolism and its impairment[8]. In the paper by Mohammadzadeh et al.[1], T allele in +276G/T SNP haplotype is highly associated with CAD in subjects with type 2 diabetes, but this linkage should be reappraised if related much more to diabetes rather than CAD. Association of T allele in the indicated SNP with CAD may be an indirect consequence of type 2 diabetes, as reported

  17. Atomic Force Microscopy for DNA SNP Identification

    NASA Astrophysics Data System (ADS)

    Valbusa, Ugo; Ierardi, Vincenzo

    The knowledge of the effects of single-nucleotide polymorphisms (SNPs) in the human genome greatly contributes to better comprehension of the relation between genetic factors and diseases. Sequence analysis of genomic DNA in different individuals reveals positions where variations that involve individual base substitutions can occur. Single-nucleotide polymorphisms are highly abundant and can have different consequences at phenotypic level. Several attempts were made to apply atomic force microscopy (AFM) to detect and map SNP sites in DNA strands. The most promising approach is the study of DNA mutations producing heteroduplex DNA strands and identifying the mismatches by means of a protein that labels the mismatches. MutS is a protein that is part of a well-known complex of mismatch repair, which initiates the process of repairing when the MutS binds to the mismatched DNA filament. The position of MutS on the DNA filament can be easily recorded by means of AFM imaging.

  18. SNP-microarrays can accurately identify the presence of an individual in complex forensic DNA mixtures.

    PubMed

    Voskoboinik, Lev; Ayers, Sheri B; LeFebvre, Aaron K; Darvasi, Ariel

    2015-05-01

    Common forensic and mass disaster scenarios present DNA evidence that comprises a mixture of several contributors. Identifying the presence of an individual in such mixtures has proven difficult. In the current study, we evaluate the practical usefulness of currently available "off-the-shelf" SNP microarrays for such purposes. We found that a set of 3000 SNPs specifically selected for this purpose can accurately identify the presence of an individual in complex DNA mixtures of various compositions. For example, individuals contributing as little as 5% to a complex DNA mixture can be robustly identified even if the starting DNA amount was as little as 5.0 ng and had undergone whole-genome amplification (WGA) prior to SNP analysis. The work presented in this study represents proof-of-principle that our previously proposed approach can work with real "forensic-type" samples. Furthermore, in the absence of a low-density focused forensic SNP microarray, the use of standard, currently available high-density SNP microarrays can be similarly used and even increase statistical power due to the larger amount of available information.

  19. Changes in variance explained by top SNP windows over generations for three traits in broiler chicken

    PubMed Central

    Fragomeni, Breno de Oliveira; Misztal, Ignacy; Lourenco, Daniela Lino; Aguilar, Ignacio; Okimoto, Ronald; Muir, William M.

    2014-01-01

    The purpose of this study was to determine if the set of genomic regions inferred as accounting for the majority of genetic variation in quantitative traits remain stable over multiple generations of selection. The data set contained phenotypes for five generations of broiler chicken for body weight, breast meat, and leg score. The population consisted of 294,632 animals over five generations and also included genotypes of 41,036 single nucleotide polymorphism (SNP) for 4,866 animals, after quality control. The SNP effects were calculated by a GWAS type analysis using single step genomic BLUP approach for generations 1–3, 2–4, 3–5, and 1–5. Variances were calculated for windows of 20 SNP. The top ten windows for each trait that explained the largest fraction of the genetic variance across generations were examined. Across generations, the top 10 windows explained more than 0.5% but less than 1% of the total variance. Also, the pattern of the windows was not consistent across generations. The windows that explained the greatest variance changed greatly among the combinations of generations, with a few exceptions. In many cases, a window identified as top for one combination, explained less than 0.1% for the other combinations. We conclude that identification of top SNP windows for a population may have little predictive power for genetic selection in the following generations for the traits here evaluated. PMID:25324857
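    The idea of "variance explained by a SNP window" can be sketched as the variance of the summed SNP contributions within a window, divided by the variance of the total genomic value. This is a simplified illustration, not the single-step genomic BLUP procedure used in the study: the windows are assumed to be consecutive and non-overlapping, and the genotypes and SNP effects are invented.

```python
from statistics import pvariance


def window_variance_shares(genotypes, effects, window_size):
    """Share of the variance of the total genomic value explained by each window.

    genotypes: one list per animal with allele counts (0, 1 or 2) per SNP.
    effects:   estimated effect of each SNP, in the same order.
    """
    total_values = [
        sum(g * e for g, e in zip(animal, effects)) for animal in genotypes
    ]
    total_var = pvariance(total_values)
    shares = []
    for start in range(0, len(effects), window_size):
        stop = start + window_size
        window_values = [
            sum(g * e for g, e in zip(animal[start:stop], effects[start:stop]))
            for animal in genotypes
        ]
        shares.append(pvariance(window_values) / total_var)
    return shares


# Hypothetical data: 4 animals, 6 SNPs, windows of 2 SNPs (the study used 20).
genotypes = [
    [0, 1, 2, 0, 1, 2],
    [1, 1, 0, 2, 0, 1],
    [2, 0, 1, 1, 2, 0],
    [0, 2, 1, 1, 0, 2],
]
effects = [0.5, -0.2, 0.0, 0.0, 0.1, 0.3]

shares = window_variance_shares(genotypes, effects, window_size=2)
top_window = max(range(len(shares)), key=shares.__getitem__)
print([round(s, 3) for s in shares], "top window:", top_window)
```

    Because windows are correlated with one another, the shares need not sum to one.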

  20. Multiplexed SNP genotyping using the Qbead™ system: a quantum dot-encoded microsphere-based assay

    PubMed Central

    Xu, Hongxia; Sha, Michael Y.; Wong, Edith Y.; Uphoff, Janet; Xu, Yanzhang; Treadway, Joseph A.; Truong, Anh; O’Brien, Eamonn; Asquith, Steven; Stubbins, Michael; Spurr, Nigel K.; Lai, Eric H.; Mahoney, Walt

    2003-01-01

    We have developed a new method using the Qbead™ system for high-throughput genotyping of single nucleotide polymorphisms (SNPs). The Qbead system employs fluorescent Qdot™ semiconductor nanocrystals, also known as quantum dots, to encode microspheres that subsequently can be used as a platform for multiplexed assays. By combining mixtures of quantum dots with distinct emission wavelengths and intensities, unique spectral ‘barcodes’ are created that enable the high levels of multiplexing required for complex genetic analyses. Here, we applied the Qbead system to SNP genotyping by encoding microspheres conjugated to allele-specific oligonucleotides. After hybridization of oligonucleotides to amplicons produced by multiplexed PCR of genomic DNA, individual microspheres are analyzed by flow cytometry and each SNP is distinguished by its unique spectral barcode. Using 10 model SNPs, we validated the Qbead system as an accurate and reliable technique for multiplexed SNP genotyping. By modifying the types of probes conjugated to microspheres, the Qbead system can easily be adapted to other assay chemistries for SNP genotyping as well as to other applications such as analysis of gene expression and protein–protein interactions. With its capability for high-throughput automation, the Qbead system has the potential to be a robust and cost-effective platform for a number of applications. PMID:12682378

  1. Longevity and plasticity of CFTR provide an argument for noncanonical SNP organization in hominid DNA.

    PubMed

    Hill, Aubrey E; Plyler, Zackery E; Tiwari, Hemant; Patki, Amit; Tully, Joel P; McAtee, Christopher W; Moseley, Leah A; Sorscher, Eric J

    2014-01-01

    Like many other ancient genes, the cystic fibrosis transmembrane conductance regulator (CFTR) has survived for hundreds of millions of years. In this report, we consider whether such prodigious longevity of an individual gene--as opposed to an entire genome or species--should be considered surprising in the face of eons of relentless DNA replication errors, mutagenesis, and other causes of sequence polymorphism. The conventions that modern human SNP patterns result either from purifying selection or random (neutral) drift were not well supported, since extant models account rather poorly for the known plasticity and function (or the established SNP distributions) found in a multitude of genes such as CFTR. Instead, our analysis can be taken as a polemic indicating that SNPs in CFTR and many other mammalian genes may have been generated--and continue to accrue--in a fundamentally more organized manner than would otherwise have been expected. The resulting viewpoint contradicts earlier claims of 'directional' or 'intelligent design-type' SNP formation, and has important implications regarding the pace of DNA adaptation, the genesis of conserved non-coding DNA, and the extent to which eukaryotic SNP formation should be viewed as adaptive.

  2. Changes in variance explained by top SNP windows over generations for three traits in broiler chicken.

    PubMed

    Fragomeni, Breno de Oliveira; Misztal, Ignacy; Lourenco, Daniela Lino; Aguilar, Ignacio; Okimoto, Ronald; Muir, William M

    2014-01-01

    The purpose of this study was to determine if the set of genomic regions inferred as accounting for the majority of genetic variation in quantitative traits remain stable over multiple generations of selection. The data set contained phenotypes for five generations of broiler chicken for body weight, breast meat, and leg score. The population consisted of 294,632 animals over five generations and also included genotypes of 41,036 single nucleotide polymorphism (SNP) for 4,866 animals, after quality control. The SNP effects were calculated by a GWAS type analysis using single step genomic BLUP approach for generations 1-3, 2-4, 3-5, and 1-5. Variances were calculated for windows of 20 SNP. The top ten windows for each trait that explained the largest fraction of the genetic variance across generations were examined. Across generations, the top 10 windows explained more than 0.5% but less than 1% of the total variance. Also, the pattern of the windows was not consistent across generations. The windows that explained the greatest variance changed greatly among the combinations of generations, with a few exceptions. In many cases, a window identified as top for one combination, explained less than 0.1% for the other combinations. We conclude that identification of top SNP windows for a population may have little predictive power for genetic selection in the following generations for the traits here evaluated.

  3. SNP-SNP interactions as risk factors for aggressive prostate cancer.

    PubMed

    Vaidyanathan, Venkatesh; Naidu, Vijay; Karunasinghe, Nishi; Jabed, Anower; Pallati, Radha; Marlow, Gareth; R Ferguson, Lynnette

    2017-01-01

    Prostate cancer (PCa) is one of the most significant male health concerns worldwide. Single nucleotide polymorphisms (SNPs) are becoming increasingly strong candidate biomarkers for identifying susceptibility to PCa. We identified a number of SNPs reported in genome-wide association analyses (GWAS) as risk factors for aggressive PCa in various European populations, and then defined SNP-SNP interactions, using PLINK software, with nucleic acid samples from a New Zealand cohort. We used this approach to find a gene x environment marker for aggressive PCa: although gene x environment interactions can be adjusted for statistically, doing so is nearly impossible in practice, and such interactions must therefore be incorporated in the search for a reliable biomarker for PCa. We found two intronic SNPs that interact statistically significantly with each other as a risk for aggressive prostate cancer when compared to healthy controls in a New Zealand population.

  4. SNP-SNP interactions as risk factors for aggressive prostate cancer

    PubMed Central

    Vaidyanathan, Venkatesh; Naidu, Vijay; Karunasinghe, Nishi; Jabed, Anower; Pallati, Radha; Marlow, Gareth; R. Ferguson, Lynnette

    2017-01-01

    Prostate cancer (PCa) is one of the most significant male health concerns worldwide. Single nucleotide polymorphisms (SNPs) are becoming increasingly strong candidate biomarkers for identifying susceptibility to PCa. We identified a number of SNPs reported in genome-wide association analyses (GWAS) as risk factors for aggressive PCa in various European populations, and then defined SNP-SNP interactions, using PLINK software, with nucleic acid samples from a New Zealand cohort. We used this approach to find a gene x environment marker for aggressive PCa: although gene x environment interactions can be adjusted for statistically, doing so is nearly impossible in practice, and such interactions must therefore be incorporated in the search for a reliable biomarker for PCa. We found two intronic SNPs that interact statistically significantly with each other as a risk for aggressive prostate cancer when compared to healthy controls in a New Zealand population. PMID:28580135

  5. Inference of kinship coefficients from Korean SNP genotyping data.

    PubMed

    Park, Seong-Jin; Yang, Jin Ok; Kim, Sang Cheol; Kwon, Jekeun; Lee, Sanghyuk; Lee, Byungwook

    2013-06-01

    The determination of relatedness between individuals in a family is crucial in analysis of common complex diseases. We present a method to infer close inter-familial relationships based on SNP genotyping data and provide the relationship coefficient of kinship in Korean families. We obtained blood samples from 43 Korean individuals in two families. SNP data was obtained using the Affymetrix Genome-wide Human SNP array 6.0 and the Illumina Human 1M-Duo chip. To measure the kinship coefficient with the SNP genotyping data, we considered all possible pairs of individuals in each family. The genetic distance between two individuals in a pair was determined using the allele sharing distance method. The results show that genetic distance is proportional to the kinship coefficient and that a close degree of kinship can be confirmed with SNP genotyping data. This study represents the first attempt to identify the genetic distance between very closely related individuals.
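
    The allele sharing distance mentioned in the abstract above can be illustrated with a minimal sketch. It assumes one common definition (per SNP, one minus half the number of alleles shared identical-by-state, averaged over SNPs called in both individuals); the abstract does not state the exact formula used, and the function name and the 0/1/2 genotype coding are illustrative assumptions.

```python
from typing import Optional, Sequence


def allele_sharing_distance(a: Sequence[Optional[int]],
                            b: Sequence[Optional[int]]) -> float:
    """Mean per-SNP allele sharing distance between two individuals.

    Genotypes are coded as 0, 1 or 2 copies of an arbitrary reference
    allele (an assumed coding); None marks a missing call. Per SNP, the
    number of alleles shared identical-by-state is 2 - |a - b|, and the
    distance contribution is 1 - shared / 2.
    """
    if len(a) != len(b):
        raise ValueError("genotype vectors must have the same length")
    total = 0.0
    used = 0
    for ga, gb in zip(a, b):
        if ga is None or gb is None:
            continue  # skip SNPs not called in both individuals
        shared = 2 - abs(ga - gb)
        total += 1 - shared / 2
        used += 1
    if used == 0:
        raise ValueError("no SNP is called in both individuals")
    return total / used


# Hypothetical toy genotypes, not data from the study.
parent = [0, 1, 2, 2, 1, None]
child = [1, 1, 2, 1, 0, 2]
unrelated = [2, 0, 0, 2, 1, 1]
print(allele_sharing_distance(parent, child))
print(allele_sharing_distance(parent, unrelated))
```

    In this toy input the parent-child pair never differs by two alleles, so its distance is smaller than that of the unrelated pair, which is the qualitative pattern the abstract describes.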

  6. Exercise improves adiponectin concentrations irrespective of the adiponectin gene polymorphisms SNP45 and the SNP276 in obese Korean women.

    PubMed

    Lee, Kyoung-Young; Kang, Hyun-Sik; Shin, Yun-A

    2013-03-10

    The effects of exercise on adiponectin levels have been reported to be variable and may be attributable to an interaction between environmental and genetic factors. The single nucleotide polymorphisms (SNPs) SNP45 (T>G) and SNP276 (G>T) of the adiponectin gene are associated with metabolic risk factors including adiponectin levels. We examined whether SNP45 and SNP276 would differentially influence the effect of exercise training in middle-aged women with uncomplicated obesity. We conducted a prospective study in the general community that included 90 Korean women (age 47.0±5.1 years) with uncomplicated obesity. The intervention was aerobic exercise training for 3 months. Body composition, adiponectin levels, and other metabolic risk factors were measured. Prior to exercise training, only body weight differed among the SNP276 genotypes. Exercise training improved body composition, systolic blood pressure, maximal oxygen consumption, high-density lipoprotein cholesterol, and leptin levels. In addition, exercise improved adiponectin levels irrespective of weight gain or loss. However, after adjustments for age, BMI, body fat (%), and waist circumference, no differences were found in obesity-related characteristics (e.g., adiponectin) following exercise training among the SNP45 and SNP276 genotypes. Our findings suggest that aerobic exercise affects adiponectin levels regardless of weight loss and that this effect is not influenced by SNP45 and SNP276 in the adiponectin gene.

  7. Automated SNP genotype clustering algorithm to improve data completeness in high-throughput SNP genotyping datasets from custom arrays.

    PubMed

    Smith, Edward M; Littrell, Jack; Olivier, Michael

    2007-12-01

    High-throughput SNP genotyping platforms use automated genotype calling algorithms to assign genotypes. While these algorithms work efficiently for individual platforms, they are not compatible with other platforms, and have individual biases that result in missed genotype calls. Here we present data on the use of a second complementary SNP genotype clustering algorithm. The algorithm was originally designed for individual fluorescent SNP genotyping assays, and has been optimized to permit the clustering of large datasets generated from custom-designed Affymetrix SNP panels. In an analysis of data from a 3K array genotyped on 1,560 samples, the additional analysis increased the overall number of genotypes by over 45,000, significantly improving the completeness of the experimental data. This analysis suggests that the use of multiple genotype calling algorithms may be advisable in high-throughput SNP genotyping experiments. The software is written in Perl and is available from the corresponding author.

  8. Identification of Mendelian inconsistencies between SNP and pedigree information of sibs

    PubMed Central

    2011-01-01

    Background Using SNP genotypes to apply genomic selection in breeding programs is becoming common practice. Tools to edit and check the quality of genotype data are required. Checking for Mendelian inconsistencies makes it possible to identify animals for which pedigree information and genotype information are not in agreement. Methods Straightforward tests to detect Mendelian inconsistencies exist that count the number of opposing homozygous marker (e.g. SNP) genotypes between parent and offspring (PAR-OFF). Here, we develop two tests to identify Mendelian inconsistencies between sibs. The first test counts SNP with opposing homozygous genotypes between sib pairs (SIBCOUNT). The second test compares pedigree and SNP-based relationships (SIBREL). All tests iteratively remove animals based on decreasing numbers of inconsistent parents and offspring or sibs. The PAR-OFF test, followed by either SIB test, was applied to a dataset comprising 2,078 genotyped cows and 211 genotyped sires. Theoretical expectations for distributions of test statistics of all three tests were calculated and compared to empirically derived values. Type I and II error rates were calculated after applying the tests to the edited data, while Mendelian inconsistencies were introduced by permuting pedigree against genotype data for various proportions of animals. Results Both SIB tests identified animal pairs for which pedigree and genomic relationships could be considered as inconsistent by visual inspection of a scatter plot of pairwise pedigree and SNP-based relationships. After removal of 235 animals with the PAR-OFF test, SIBCOUNT (SIBREL) identified 18 (22) additional inconsistent animals. Seventeen animals were identified by both methods. The numbers of incorrectly deleted animals (Type I error), were equally low for both methods, while the numbers of incorrectly non-deleted animals (Type II error), were considerably higher for SIBREL compared to SIBCOUNT. Conclusions Tests to remove

  9. Identification of Mendelian inconsistencies between SNP and pedigree information of sibs.

    PubMed

    Calus, Mario P L; Mulder, Han A; Bastiaansen, John W M

    2011-10-11

    Using SNP genotypes to apply genomic selection in breeding programs is becoming common practice. Tools to edit and check the quality of genotype data are required. Checking for Mendelian inconsistencies makes it possible to identify animals for which pedigree information and genotype information are not in agreement. Straightforward tests to detect Mendelian inconsistencies exist that count the number of opposing homozygous marker (e.g. SNP) genotypes between parent and offspring (PAR-OFF). Here, we develop two tests to identify Mendelian inconsistencies between sibs. The first test counts SNP with opposing homozygous genotypes between sib pairs (SIBCOUNT). The second test compares pedigree and SNP-based relationships (SIBREL). All tests iteratively remove animals based on decreasing numbers of inconsistent parents and offspring or sibs. The PAR-OFF test, followed by either SIB test, was applied to a dataset comprising 2,078 genotyped cows and 211 genotyped sires. Theoretical expectations for distributions of test statistics of all three tests were calculated and compared to empirically derived values. Type I and II error rates were calculated after applying the tests to the edited data, while Mendelian inconsistencies were introduced by permuting pedigree against genotype data for various proportions of animals. Both SIB tests identified animal pairs for which pedigree and genomic relationships could be considered as inconsistent by visual inspection of a scatter plot of pairwise pedigree and SNP-based relationships. After removal of 235 animals with the PAR-OFF test, SIBCOUNT (SIBREL) identified 18 (22) additional inconsistent animals. Seventeen animals were identified by both methods. The numbers of incorrectly deleted animals (Type I error) were equally low for both methods, while the numbers of incorrectly non-deleted animals (Type II error) were considerably higher for SIBREL compared to SIBCOUNT. Tests to remove Mendelian inconsistencies between sibs should
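
    The counting step shared by the PAR-OFF and SIBCOUNT tests described above (the number of SNP with opposing homozygous genotypes in a pair of animals) can be sketched as follows. This is only an illustration of the count itself: the function name and the 0/1/2 genotype coding are assumptions, and the thresholds and iterative removal procedure used by the authors are not reproduced here.

```python
from typing import Optional, Sequence


def count_opposing_homozygotes(a: Sequence[Optional[int]],
                               b: Sequence[Optional[int]]) -> int:
    """Count SNP where one animal is homozygous for one allele and the
    other animal is homozygous for the alternative allele.

    Genotypes are coded as 0, 1 or 2 copies of a reference allele
    (an assumed coding); None marks a missing call and is skipped.
    """
    if len(a) != len(b):
        raise ValueError("genotype vectors must have the same length")
    return sum(
        1
        for ga, gb in zip(a, b)
        if ga is not None and gb is not None and abs(ga - gb) == 2
    )


# Hypothetical toy genotypes, not data from the study.
sire = [0, 2, 1, 2, 0, 1]
calf = [1, 2, 1, 1, 0, 2]
stranger = [2, 0, 1, 0, 2, 1]
print(count_opposing_homozygotes(sire, calf))      # 0
print(count_opposing_homozygotes(sire, stranger))  # 4
```

    A true parent-offspring pair should show no opposing homozygotes apart from genotyping errors, whereas full sibs can legitimately show some (when both parents are heterozygous), which is why a sib-based test needs an expected distribution rather than a simple zero count.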

  10. Hypothesis driven single nucleotide polymorphism search (HyDn-SNP-S).

    PubMed

    Swett, Rebecca J; Elias, Angela; Miller, Jeffrey A; Dyson, Gregory E; Andrés Cisneros, G

    2013-09-01

    The advent of complete-genome genotyping across phenotype cohorts has provided a rich source of information for bioinformaticians. However, the search for SNPs from this data is generally performed on a study-by-study basis without any specific hypothesis of the location for SNPs that are predictive for the phenotype. We have designed a method whereby very large SNP lists (several gigabytes in size), combining several genotyping studies at once, can be sorted and traced back to their ultimate consequence in protein structure. Given a working hypothesis, researchers are able to easily search whole genome genotyping data for SNPs that link genetic locations to phenotypes. This allows a targeted search for correlations between phenotypes and potentially relevant systems, rather than utilizing statistical methods only. HyDn-SNP-S returns results that are less data dense, allowing more thorough analysis, including haplotype analysis. We have applied our method to correlate DNA polymerases to cancer phenotypes using four of the available cancer databases in dbGaP. Logistic regression and derived haplotype analysis indicate that ~80 SNPs, previously overlooked, are statistically significant. Derived haplotypes from this work link POLL to breast cancer and POLG to prostate cancer with an increase in incidence of 3.01- and 9.6-fold, respectively. Molecular dynamics simulations on wild-type and one of the SNP mutants from the haplotype of POLL provide insights at the atomic level on the functional impact of this cancer related SNP. Furthermore, HyDn-SNP-S has been designed to allow application to any system. The program is available upon request from the authors. Copyright © 2013 Elsevier B.V. All rights reserved.

  11. Impact of pre-imputation SNP-filtering on genotype imputation results

    PubMed Central

    2014-01-01

    Background Imputation of partially missing or unobserved genotypes is an indispensable tool for SNP data analyses. However, research and understanding of the impact of initial SNP-data quality control on imputation results is still limited. In this paper, we aim to evaluate the effect of different strategies of pre-imputation quality filtering on the performance of the widely used imputation algorithms MaCH and IMPUTE. Results We considered three scenarios: imputation of partially missing genotypes with usage of an external reference panel, without usage of an external reference panel, as well as imputation of completely un-typed SNPs using an external reference panel. We first created various datasets applying different SNP quality filters and masking certain percentages of randomly selected high-quality SNPs. We imputed these SNPs and compared the results between the different filtering scenarios by using established and newly proposed measures of imputation quality. While the established measures assess certainty of imputation results, our newly proposed measures focus on the agreement with true genotypes. These measures showed that pre-imputation SNP-filtering might be detrimental regarding imputation quality. Moreover, the strongest drivers of imputation quality were in general the burden of missingness and the number of SNPs used for imputation. We also found that using a reference panel always improves imputation quality of partially missing genotypes. MaCH performed slightly better than IMPUTE2 in most of our scenarios. Again, these results were more pronounced when using our newly defined measures of imputation quality. Conclusion Even a moderate filtering has a detrimental effect on the imputation quality. Therefore little or no SNP filtering prior to imputation appears to be the best strategy for imputing small to moderately sized datasets. Our results also showed that for these datasets, MaCH performs slightly better than IMPUTE2 in most scenarios at

  12. Gradient Boosting as a SNP Filter: an Evaluation Using Simulated and Hair Morphology Data

    PubMed Central

    Lubke, GH; Laurin, C; Walters, R; Eriksson, N; Hysi, P; Spector, TD; Montgomery, GW; Martin, NG; Medland, SE; Boomsma, DI

    2013-01-01

    Typically, genome-wide association studies consist of regressing the phenotype on each SNP separately using an additive genetic model. Although statistical models for recessive, dominant, SNP-SNP, or SNP-environment interactions exist, the testing burden makes an evaluation of all possible effects impractical for genome-wide data. We advocate a two-step approach where the first step consists of a filter that is sensitive to different types of SNP main and interaction effects. The aim is to substantially reduce the number of SNPs such that more specific modeling becomes feasible in a second step. We provide an evaluation of a statistical learning method called “gradient boosting machine” (GBM) that can be used as a filter. GBM does not require an a priori specification of a genetic model, and permits inclusion of large numbers of covariates. GBM can therefore be used to explore multiple GxE interactions, which would not be feasible within the parametric framework used in GWAS. We show in a simulation that GBM performs well even under conditions favorable to the standard additive regression model commonly used in GWAS, and is sensitive to the detection of interaction effects even if one of the interacting variables has a zero main effect. The latter would not be detected in GWAS. Our evaluation is accompanied by an analysis of empirical data concerning hair morphology. We estimate the phenotypic variance explained by increasing numbers of highest ranked SNPs, and show that it is sufficient to select 10K-20K SNPs in the first step of a two-step approach. PMID:24404405

  13. SNP genotypes of Mycobacterium leprae isolates in Thailand and their combination with rpoT and TTC genotyping for analysis of leprosy distribution and transmission.

    PubMed

    Phetsuksiri, Benjawan; Srisungngam, Sopa; Rudeeaneksin, Janisara; Bunchoo, Supranee; Lukebua, Atchariya; Wongtrungkapun, Ruch; Paitoon, Soontara; Sakamuri, Rama Murthy; Brennan, Patrick J; Vissa, Varalakshmi

    2012-01-01

    Based on the discovery of three single nucleotide polymorphisms (SNPs) in Mycobacterium leprae, it has been previously reported that there are four major SNP types associated with different geographic regions around the world. Another typing system for global differentiation of M. leprae is the analysis of the variable number of short tandem repeats within the rpoT gene. To expand the analysis of geographic distribution of M. leprae, classified by SNP and rpoT gene polymorphisms, we studied 85 clinical isolates from Thai patients and compared the findings with those reported from Asian isolates. SNP genotyping by PCR amplification and sequencing revealed that all strains, like those in Myanmar, were SNP types 1 and 3, with the former being predominant, while in Japan, Korea, and Indonesia, SNP type 3 was found to be more frequent. The pattern of M. leprae distribution in Thailand and Myanmar is quite similar, except that SNP type 2 was not found in Thailand. In addition, the 3-copy hexamer genotype in the rpoT gene is shared among the isolates from these two neighboring countries. On the basis of these two markers, we postulate that M. leprae in leprosy patients from Myanmar and Thailand has a common historical origin. Further differentiation among Thai isolates was possible by assessing copy numbers of the TTC sequence, a more polymorphic microsatellite locus.

  14. Haplotype assembly from aligned weighted SNP fragments.

    PubMed

    Zhao, Yu-Ying; Wu, Ling-Yun; Zhang, Ji-Hong; Wang, Rui-Sheng; Zhang, Xiang-Sun

    2005-08-01

    Given an assembled genome of a diploid organism, the haplotype assembly problem can be formulated as retrieval of a pair of haplotypes from a set of aligned weighted SNP fragments. Known computational formulations (models) of this problem are minimum letter flips (MLF) and the weighted minimum letter flips (WMLF; Greenberg et al. (INFORMS J. Comput. 2004, 14, 211-213)). In this paper we show that the general WMLF model is NP-hard even for the gapless case. However, algorithmic solutions for selected variants of WMLF can exist, and we propose a heuristic algorithm based on a dynamic clustering technique. We also introduce a new formulation of the haplotype assembly problem that we call COMPLETE WMLF (CWMLF). This model and algorithms for its implementation take into account the simultaneous presence of multiple kinds of data errors. Extensive computational experiments indicate that the algorithmic implementations of the CWMLF model achieve higher accuracy of haplotype reconstruction than the WMLF-based algorithms, which in turn appear to be more accurate than those based on MLF.
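
    To make the letter-flip objective concrete, the sketch below scores one candidate haplotype against a set of aligned SNP fragments. It is not the authors' heuristic or the CWMLF model: it only evaluates a weighted flip cost under the assumptions that alleles are coded "0"/"1", gaps are "-", every site is heterozygous (so the second haplotype is the complement of the first), and each fragment is assigned to whichever haplotype is cheaper. All names are illustrative.

```python
from typing import List, Sequence


def flip_cost(fragment: str, haplotype: str, weights: Sequence[float]) -> float:
    """Total weight of the letters in `fragment` that disagree with `haplotype`.

    Gap positions ("-") carry no information and cost nothing.
    """
    return sum(
        w for f, h, w in zip(fragment, haplotype, weights)
        if f != "-" and f != h
    )


def weighted_flip_cost(fragments: List[str],
                       weights: List[Sequence[float]],
                       haplotype: str) -> float:
    """Weighted letter-flip cost of a candidate haplotype pair.

    The pair is (haplotype, complement), an assumption that all sites are
    heterozygous. Each fragment contributes the cheaper of its two costs.
    """
    complement = "".join("1" if c == "0" else "0" for c in haplotype)
    return sum(
        min(flip_cost(f, haplotype, w), flip_cost(f, complement, w))
        for f, w in zip(fragments, weights)
    )


# Hypothetical toy fragments; the last one has a likely error at site 3.
fragments = ["0101", "01-1", "1010", "10-0", "0111"]
unit = [[1.0] * 4 for _ in fragments]
print(weighted_flip_cost(fragments, unit, "0101"))  # 1.0
print(weighted_flip_cost(fragments, unit, "0111"))  # 2.0
```

    With unit weights this reduces to counting flips, as in MLF; lowering the weight of a low-confidence letter makes flipping it cheaper, which is the idea behind the weighted variant.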

  15. Rapid SNP Detection and Genotyping of Bacterial Pathogens by Pyrosequencing.

    PubMed

    Amoako, Kingsley K; Thomas, Matthew C; Janzen, Timothy W; Goji, Noriko

    2017-01-01

    Bacterial identification and typing are fixtures of microbiology laboratories and are vital aspects of our response mechanisms in the event of foodborne outbreaks and bioterrorist events. Whole genome sequencing (WGS) is leading the way in terms of expanding our ability to identify and characterize bacteria through the identification of subtle differences between genomes (e.g. single nucleotide polymorphisms (SNPs) and insertions/deletions). Modern high-throughput technologies such as pyrosequencing can facilitate the typing of bacteria by generating short-read sequence data of informative regions identified by WGS analyses, at a fraction of the cost of WGS. Thus, pyrosequencing systems remain a valuable asset in the laboratory today. Presented in this chapter are two methods developed in the Amoako laboratory that detail the identification and genotyping of bacterial pathogens. The first targets canonical single nucleotide polymorphisms (canSNPs) of evolutionary importance in Bacillus anthracis, the causative agent of Anthrax. The second assay detects Shiga-toxin (stx) genes, which are associated with virulence in Escherichia coli and Shigella spp., and differentiates the subtypes of stx-1 and stx-2 based on SNP loci. These rapid methods provide end users with important information regarding virulence traits as well as the evolutionary and biogeographic origin of isolates.

  16. Deriving Gene Networks from SNP Associated with Triacylglycerol and Phospholipid Fatty Acid Fractions from Ribeyes of Angus Cattle

    PubMed Central

    Buchanan, Justin W.; Reecy, James M.; Garrick, Dorian J.; Duan, Qing; Beitz, Don C.; Koltes, James E.; Saatchi, Mahdi; Koesterke, Lars; Mateescu, Raluca G.

    2016-01-01

    The fatty acid profile of beef is a complex trait that can benefit from gene-interaction network analysis to understand relationships among loci that contribute to phenotypic variation. Phenotypic measures of fatty acid profile from triacylglycerol and phospholipid fractions of longissimus muscle, pedigree information, and Illumina 54 k bovine SNP genotypes were utilized to derive an annotated gene network associated with fatty acid composition in 1,833 Angus beef cattle. The Bayes-B statistical model was utilized to perform a genome wide association study to estimate associations between 54 k SNP genotypes and 39 individual fatty acid phenotypes within each fraction. Posterior means of the effects were estimated for each of the 54 k SNP and for the collective effects of all the SNP in every 1-Mb genomic window in terms of the proportion of genetic variance explained by the window. Windows that explained the largest proportions of genetic variance for individual lipids were found in the triacylglycerol fraction. There was almost no overlap in the genomic regions explaining variance between the triacylglycerol and phospholipid fractions. Partial correlations were used to identify correlated regions of the genome for the set of largest 1 Mb windows that explained up to 35% genetic variation in either fatty acid fraction. SNP were allocated to windows based on the bovine UMD3.1 assembly. Gene network clusters were generated utilizing a partial correlation and information theory algorithm. Results were used in conjunction with network scoring and visualization software to analyze correlated SNP across 39 fatty acid phenotypes to identify SNP of significance. Significant pathways implicated in fatty acid metabolism through GO term enrichment analysis included homeostasis of number of cells, homeostatic process, coenzyme/cofactor activity, and immunoglobulin. These results suggest different metabolic pathways regulate the development of different types of lipids found in

  17. SNP-SNP interaction analysis of NF-κB signaling pathway on breast cancer survival

    PubMed Central

    Jamshidi, Maral; Fagerholm, Rainer; Khan, Sofia; Aittomäki, Kristiina; Czene, Kamila; Darabi, Hatef; Li, Jingmei; Andrulis, Irene L.; Chang-Claude, Jenny; Devilee, Peter; Fasching, Peter A.; Michailidou, Kyriaki; Bolla, Manjeet K.; Dennis, Joe; Wang, Qin; Guo, Qi; Rhenius, Valerie; Cornelissen, Sten; Rudolph, Anja; Knight, Julia A.; Loehberg, Christian R.; Burwinkel, Barbara; Marme, Frederik; Hopper, John L.; Southey, Melissa C.; Bojesen, Stig E.; Flyger, Henrik; Brenner, Hermann; Holleczek, Bernd; Margolin, Sara; Mannermaa, Arto; Kosma, Veli-Matti; Dyck, Laurien Van; Nevelsteen, Ines; Couch, Fergus J.; Olson, Janet E.; Giles, Graham G.; McLean, Catriona; Haiman, Christopher A.; Henderson, Brian E.; Winqvist, Robert; Pylkäs, Katri; Tollenaar, Rob A.E.M.; García-Closas, Montserrat; Figueroa, Jonine; Hooning, Maartje J.; Martens, John W.M.; Cox, Angela; Cross, Simon S.; Simard, Jacques; Dunning, Alison M.; Easton, Douglas F.; Pharoah, Paul D.P.; Hall, Per; Blomqvist, Carl; Schmidt, Marjanka K.; Nevanlinna, Heli

    2015-01-01

    In breast cancer, constitutive activation of NF-κB has been reported, however, the impact of genetic variation of the pathway on patient prognosis has been little studied. Furthermore, a combination of genetic variants, rather than single polymorphisms, may affect disease prognosis. Here, in an extensive dataset (n = 30,431) from the Breast Cancer Association Consortium, we investigated the association of 917 SNPs in 75 genes in the NF-κB pathway with breast cancer prognosis. We explored SNP-SNP interactions on survival using the likelihood-ratio test comparing multivariate Cox’ regression models of SNP pairs without and with an interaction term. We found two interacting pairs associating with prognosis: patients simultaneously homozygous for the rare alleles of rs5996080 and rs7973914 had worse survival (HRinteraction 6.98, 95% CI=3.3-14.4, P = 1.42E-07), and patients carrying at least one rare allele for rs17243893 and rs57890595 had better survival (HRinteraction 0.51, 95% CI=0.3-0.6, P = 2.19E-05). Based on in silico functional analyses and literature, we speculate that the rs5996080 and rs7973914 loci may affect the BAFFR and TNFR1/TNFR3 receptors and breast cancer survival, possibly by disturbing both the canonical and non-canonical NF-κB pathways or their dynamics, whereas, rs17243893-rs57890595 interaction on survival may be mediated through TRAF2-TRAIL-R4 interplay. These results warrant further validation and functional analyses. PMID:26317411

  18. A scan statistic for identifying chromosomal patterns of SNP association.

    PubMed

    Sun, Yan V; Levin, Albert M; Boerwinkle, Eric; Robertson, Henry; Kardia, Sharon L R

    2006-11-01

    We have developed a single nucleotide polymorphism (SNP) association scan statistic that takes into account the complex distribution of the human genome variation in the identification of chromosomal regions with significant SNP associations. This scan statistic has wide applicability for genetic analysis, whether to identify important chromosomal regions associated with common diseases based on whole-genome SNP association studies or to identify disease susceptibility genes based on dense SNP positional candidate studies. To illustrate this method, we analyzed patterns of SNP associations on chromosome 19 in a large cohort study. Among 2,944 SNPs, we found seven regions that contained clusters of significantly associated SNPs. The average width of these regions was 35 kb with a range of 10-72 kb. We compared the scan statistic results to Fisher's product method using a sliding window approach, and detected 22 regions with significant clusters of SNP associations. The average width of these regions was 131 kb with a range of 10.1-615 kb. Given that the distances between SNPs are not taken into consideration in the sliding window approach, it is likely that a large fraction of these regions represents false positives. However, all seven regions detected by the scan statistic were also detected by the sliding window approach. The linkage disequilibrium (LD) patterns within the seven regions were highly variable indicating that the clusters of SNP associations were not due to LD alone. The scan statistic developed here can be used to make gene-based or region-based SNP inferences about disease association.
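
    The comparison method named above, Fisher's product method applied in a sliding window, can be sketched as follows. This is a generic illustration and not the authors' scan statistic: windows are defined by a fixed number of consecutive SNPs (which is exactly why inter-SNP distances are ignored), p-values are assumed independent and strictly positive, and the function names are illustrative.

```python
import math
from typing import List, Sequence


def fisher_combined_p(p_values: Sequence[float]) -> float:
    """Combine independent p-values with Fisher's product method.

    The statistic -2 * sum(ln p) follows a chi-square distribution with
    2k degrees of freedom under the null. For even degrees of freedom the
    upper tail has the closed form exp(-x/2) * sum_{i<k} (x/2)^i / i!.
    """
    k = len(p_values)
    if k == 0:
        raise ValueError("at least one p-value is required")
    half = -sum(math.log(p) for p in p_values)  # statistic divided by 2
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return math.exp(-half) * total


def sliding_window_fisher(p_values: Sequence[float], size: int) -> List[float]:
    """Combined p-value for every window of `size` consecutive SNPs."""
    return [
        fisher_combined_p(p_values[i:i + size])
        for i in range(len(p_values) - size + 1)
    ]


# Hypothetical per-SNP association p-values along a chromosome.
snp_p = [0.5, 0.04, 0.03, 0.6]
windows = sliding_window_fisher(snp_p, size=2)
best = min(range(len(windows)), key=windows.__getitem__)
print(best, windows[best])
```

    Because each window is a fixed count of SNPs, a window in a sparsely genotyped region can span far more sequence than one in a dense region, which is consistent with the wider and more numerous regions reported for the sliding window approach in the abstract.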

  19. A Novel Test for Detecting SNP-SNP Interactions in Case-Only Trio Studies.

    PubMed

    Balliu, Brunilda; Zaitlen, Noah

    2016-04-01

    Epistasis plays a significant role in the genetic architecture of many complex phenotypes in model organisms. To date, there have been very few interactions replicated in human studies due in part to the multiple-hypothesis burden implicit in genome-wide tests of epistasis. Therefore, it is of paramount importance to develop the most powerful tests possible for detecting interactions. In this work we develop a new SNP-SNP interaction test for use in case-only trio studies called the trio correlation (TC) test. The TC test computes the expected joint distribution of marker pairs in offspring conditional on parental genotypes. This distribution is then incorporated into a standard 1 d.f. correlation test of interaction. We show via extensive simulations under a variety of disease models that our test substantially outperforms existing tests of interaction in case-only trio studies. We also demonstrate a bias in a previous case-only trio interaction test and identify its origin. Finally, we show that a previously proposed permutation scheme in trio studies mitigates the known biases of case-only tests in the presence of population stratification. We conclude that the TC test shows improved power to identify interactions in existing, as well as emerging, trio association studies. The method is publicly available at www.github.com/BrunildaBalliu/TrioEpi.

  20. Slider--maximum use of probability information for alignment of short sequence reads and SNP detection.

    PubMed

    Malhis, Nawar; Butterfield, Yaron S N; Ester, Martin; Jones, Steven J M

    2009-01-01

    A plethora of alignment tools have been created that are designed to best fit different types of alignment conditions. While some of these are made for aligning Illumina Sequence Analyzer reads, none of these are fully utilizing its probability (prb) output. In this article, we will introduce a new alignment approach (Slider) that reduces the alignment problem space by utilizing each read base's probabilities given in the prb files. Compared with other aligners, Slider has higher alignment accuracy and efficiency. In addition, given that Slider matches bases with probabilities other than the most probable, it significantly reduces the percentage of base mismatches. The result is that its SNP predictions are more accurate than other SNP prediction approaches used today that start from the most probable sequence, including those using base quality.

  1. Rapid Detection of Rare Deleterious Variants by Next Generation Sequencing with Optional Microarray SNP Genotype Data.

    PubMed

    Watson, Christopher M; Crinnion, Laura A; Gurgel-Gianetti, Juliana; Harrison, Sally M; Daly, Catherine; Antanavicuite, Agne; Lascelles, Carolina; Markham, Alexander F; Pena, Sergio D J; Bonthron, David T; Carr, Ian M

    2015-09-01

    Autozygosity mapping is a powerful technique for the identification of rare, autosomal recessive, disease-causing genes. The ease with which this category of disease gene can be identified has greatly increased through the availability of genome-wide SNP genotyping microarrays and subsequently of exome sequencing. Although these methods have simplified the generation of experimental data, its analysis, particularly when disparate data types must be integrated, remains time consuming. Moreover, the huge volume of sequence variant data generated from next generation sequencing experiments opens up the possibility of using these data instead of microarray genotype data to identify disease loci. To allow these two types of data to be used in an integrated fashion, we have developed AgileVCFMapper, a program that performs both the mapping of disease loci by SNP genotyping and the analysis of potentially deleterious variants using exome sequence variant data, in a single step. This method does not require microarray SNP genotype data, although analysis with a combination of microarray and exome genotype data enables more precise delineation of disease loci, due to superior marker density and distribution.

  2. Development of a SNP set for human identification: A set with high powers of discrimination which yields high genetic information from naturally degraded DNA samples in the Thai population.

    PubMed

    Boonyarit, Hathaichanoke; Mahasirimongkol, Surakameth; Chavalvechakul, Nuttama; Aoki, Masayuki; Amitani, Hanae; Hosono, Naoya; Kamatani, Naoyuki; Kubo, Michiaki; Lertrit, Patcharee

    2014-07-01

    This study describes the development of a SNP typing system for human identification in the Thai population, in particular for extremely degraded DNA samples. A highly informative SNP marker set for forensic identification was identified, and a multiplex PCR-based Invader assay was developed. Fifty-one highly informative autosomal SNP markers and three sex determination SNP markers were amplified in two multiplex PCR reactions and then detected using Invader assay reactions. The average PCR product size was 71 base pairs. The match probability of the 54-SNP marker set in 124 Thai individuals was 1.48×10^-21, higher than that of STR typing, suggesting that this 54-SNP marker set is beneficial for forensic identification in the Thai population. The selected SNP marker set was also evaluated in 90 artificially degraded samples, and in 128 naturally degraded DNA samples from real forensic casework which had shown no profiles or incomplete profiles when examined using a commercial STR typing system. A total of 56 degraded samples (44%) achieved the matching probability (PM) equivalent to STR gold standard analysis (successful genotyping of 44 SNP markers) for human identification. These data indicated that our novel 54-SNP marker set provides a very useful and valuable approach for forensic identification in the Thai population, especially in the case of highly to extremely degraded DNA. In summary, we have developed a set of 54 Thai-specific SNPs for human identification which have higher discrimination power than STR genotyping. The PCRs for these 54 SNP markers were successfully combined into two multiplex reactions and detected with an Invader assay. This novel SNP genotyping system also yields high levels of genetic information from naturally degraded samples, even though these are much more difficult to recover than artificially degraded samples.
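
    A match probability for a SNP panel is conventionally obtained by multiplying per-locus probabilities that two unrelated individuals share a genotype. The sketch below shows that calculation under assumptions not stated in the abstract: Hardy-Weinberg proportions, independent loci, and a hypothetical allele frequency of 0.5 at each of 51 autosomal SNPs. It does not use the study's allele frequencies and is not meant to reproduce its reported value.

```python
from math import prod


def locus_match_probability(p):
    """Probability that two unrelated individuals share a genotype at a
    biallelic SNP with allele frequency p, assuming Hardy-Weinberg proportions."""
    q = 1.0 - p
    genotype_freqs = (p * p, 2 * p * q, q * q)
    return sum(g * g for g in genotype_freqs)


def combined_match_probability(allele_freqs):
    """Product of per-locus match probabilities, assuming independent loci."""
    return prod(locus_match_probability(p) for p in allele_freqs)


# Hypothetical panel: 51 autosomal SNPs, all with allele frequency 0.5,
# the most informative case for a biallelic marker.
pm = combined_match_probability([0.5] * 51)
print(f"{pm:.2e}")
```

    With these idealised frequencies each locus contributes a factor of 0.375, so a panel of about fifty SNPs already gives a very small combined value; real panels have less balanced frequencies and correspondingly larger per-locus factors.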

  3. Mutations of C-reactive protein (CRP) -286 SNP, APC and p53 in colorectal cancer: implication for a CRP-Wnt crosstalk.

    PubMed

    Su, Hai-Xiang; Zhou, Hai-Hong; Wang, Ming-Yu; Cheng, Jin; Zhang, Shi-Chao; Hui, Feng; Chen, Xue-Zhong; Liu, Shan-Hui; Liu, Qin-Jiang; Zhu, Zi-Jiang; Hu, Qing-Rong; Wu, Yi; Ji, Shang-Rong

    2014-01-01

    C-reactive protein (CRP) is an established marker of inflammation with pattern-recognition receptor-like activities. Despite the close association of the serum level of CRP with the risk and prognosis of several types of cancer, it remains elusive whether CRP contributes directly to tumorigenesis or just represents a bystander marker. We have recently identified recurrent mutations at the SNP position -286 (rs3091244) in the promoter of the CRP gene in several tumor types, instead suggesting that locally produced CRP is a potential driver of tumorigenesis. However, it is unknown whether the -286 site is the sole SNP position of the CRP gene targeted for mutation and whether there is any association between CRP SNP mutations and other frequently mutated genes in tumors. Herein, we have examined the genotypes of three common CRP non-coding SNPs (rs7553007, rs1205, rs3093077) in tumor/normal sample pairs of 5 cancer types (n = 141). No recurrent somatic mutations are found at these SNP positions, indicating that the -286 SNP mutations are preferentially selected during the development of cancer. Further analysis reveals that the -286 SNP mutations of CRP tend to co-occur with mutated APC, particularly in rectal cancer (p = 0.04; n = 67). By contrast, mutations of CRP and p53 or K-ras appear to be unrelated. These results thus underscore the functional importance of the -286 mutation of CRP in tumorigenesis and imply an interaction between CRP and the Wnt signaling pathway.

  4. Detection of selective sweeps in cattle using genome-wide SNP data

    PubMed Central

    2013-01-01

    Background The domestication and subsequent selection by humans to create breeds and biological types of cattle undoubtedly altered the patterning of variation within their genomes. Strong selection to fix advantageous large-effect mutations underlying domesticability, breed characteristics or productivity created selective sweeps in which variation was lost in the chromosomal region flanking the selected allele. Selective sweeps have now been identified in the genomes of many animal species including humans, dogs, horses, and chickens. Here, we attempt to identify and characterise regions of the bovine genome that have been subjected to selective sweeps. Results Two datasets were used for the discovery and validation of selective sweeps via the fixation of alleles at a series of contiguous SNP loci. BovineSNP50 data were used to identify 28 putative sweep regions among 14 diverse cattle breeds. Affymetrix BOS 1 prescreening assay data for five breeds were used to identify 85 regions and validate 5 regions identified using the BovineSNP50 data. Many genes are located within these regions and the lack of sequence data for the analysed breeds precludes the nomination of selected genes or variants and limits the prediction of the selected phenotypes. However, phenotypes that we predict to have historically been under strong selection include horned-polled, coat colour, stature, ear morphology, and behaviour. Conclusions The bias towards common SNPs in the design of the BovineSNP50 assay led to the identification of recent selective sweeps associated with breed formation and common to only a small number of breeds rather than ancient events associated with domestication which could potentially be common to all European taurines. The limited SNP density, or marker resolution, of the BovineSNP50 assay significantly impacted the rate of false discovery of selective sweeps, however, we found sweeps in common between breeds which were confirmed using an ultra

  5. A SNP resource for Douglas-fir: de novo transcriptome assembly and SNP detection and validation

    PubMed Central

    2013-01-01

    Background Douglas-fir (Pseudotsuga menziesii), one of the most economically and ecologically important tree species in the world, also has one of the largest tree breeding programs. Although the coastal and interior varieties of Douglas-fir (vars. menziesii and glauca) are native to North America, the coastal variety is also widely planted for timber production in Europe, New Zealand, Australia, and Chile. Our main goal was to develop a SNP resource large enough to facilitate genomic selection in Douglas-fir breeding programs. To accomplish this, we developed a 454-based reference transcriptome for coastal Douglas-fir, annotated and evaluated the quality of the reference, identified putative SNPs, and then validated a sample of those SNPs using the Illumina Infinium genotyping platform. Results We assembled a reference transcriptome consisting of 25,002 isogroups (unique gene models) and 102,623 singletons from 2.76 million 454 and Sanger cDNA sequences from coastal Douglas-fir. We identified 278,979 unique SNPs by mapping the 454 and Sanger sequences to the reference, and by mapping four datasets of Illumina cDNA sequences from multiple seed sources, genotypes, and tissues. The Illumina datasets represented coastal Douglas-fir (64.00 and 13.41 million reads), interior Douglas-fir (80.45 million reads), and a Yakima population similar to interior Douglas-fir (8.99 million reads). We assayed 8067 SNPs on 260 trees using an Illumina Infinium SNP genotyping array. Of these SNPs, 5847 (72.5%) were called successfully and were polymorphic. Conclusions Based on our validation efficiency, our SNP database may contain as many as ~200,000 true SNPs, and as many as ~69,000 SNPs that could be genotyped at ~20,000 gene loci using an Infinium II array—more SNPs than are needed to use genomic selection in tree breeding programs. 
    Ultimately, these genomic resources will enhance Douglas-fir breeding and allow us to better understand landscape-scale patterns of genetic variation
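
    The validation arithmetic quoted in the abstract can be checked directly. The sketch below recomputes the 72.5% validation efficiency and scales the putative SNP count by it; treating the efficiency as uniform across all putative SNPs is an assumption made here to show where an estimate of roughly 200,000 true SNPs could come from, not a statement of the authors' exact method.

```python
# Figures taken from the abstract.
assayed_snps = 8067        # SNPs placed on the Infinium array
validated_snps = 5847      # called successfully and polymorphic
putative_snps = 278_979    # unique SNPs identified in silico

validation_efficiency = validated_snps / assayed_snps

# Assumption: the same efficiency applies to every putative SNP.
estimated_true_snps = putative_snps * validation_efficiency

print(f"validation efficiency: {validation_efficiency:.1%}")
print(f"estimated true SNPs:   {estimated_true_snps:,.0f}")
```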

  6. Heterogeneous computing architecture for fast detection of SNP-SNP interactions.

    PubMed

    Sluga, Davor; Curk, Tomaz; Zupan, Blaz; Lotric, Uros

    2014-06-25

    The extent of data in a typical genome-wide association study (GWAS) poses considerable computational challenges to software tools for gene-gene interaction discovery. Exhaustive evaluation of all interactions among hundreds of thousands to millions of single nucleotide polymorphisms (SNPs) may require weeks or even months of computation. Massively parallel hardware within a modern Graphic Processing Unit (GPU) and Many Integrated Core (MIC) coprocessors can shorten the run time considerably. While the utility of GPU-based implementations in bioinformatics has been well studied, MIC architecture has been introduced only recently and may provide a number of comparative advantages that have yet to be explored and tested. We have developed a heterogeneous, GPU and Intel MIC-accelerated software module for SNP-SNP interaction discovery to replace the previously single-threaded computational core in the interactive web-based data exploration program SNPsyn. We report on differences between these two modern massively parallel architectures and their software environments. Their utility resulted in an order of magnitude shorter execution times when compared to the single-threaded CPU implementation. GPU implementation on a single Nvidia Tesla K20 runs twice as fast as that for the MIC architecture-based Xeon Phi P5110 coprocessor, but also requires considerably more programming effort. General purpose GPUs are a mature platform with large amounts of computing power capable of tackling inherently parallel problems, but can prove demanding for the programmer. On the other hand, the new MIC architecture, albeit lacking in performance, reduces the programming effort and makes up for it with a more general architecture suitable for a wider range of problems.
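
    The scale of an exhaustive pairwise search follows from the number of unordered SNP pairs, n(n-1)/2. The short sketch below evaluates that count for the SNP numbers mentioned in the abstract; it illustrates only the size of the search space and says nothing about the scoring function used in SNPsyn.

```python
from math import comb

# Number of unordered SNP pairs to evaluate for the SNP counts
# mentioned in the abstract ("hundreds of thousands to millions").
pair_counts = {n_snps: comb(n_snps, 2) for n_snps in (100_000, 1_000_000)}

for n_snps, n_pairs in pair_counts.items():
    print(f"{n_snps:>9,} SNPs -> {n_pairs:,} pairs")
```

    Because every pair can be scored independently of every other, the workload divides naturally across many GPU or MIC cores.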

  7. Proper joint analysis of summary association statistics requires the adjustment of heterogeneity in SNP coverage pattern.

    PubMed

    Zhang, Han; Wheeler, William; Song, Lei; Yu, Kai

    2017-07-07

    As meta-analysis results published by consortia of genome-wide association studies (GWASs) become increasingly available, many association summary statistics-based multi-locus tests have been developed to jointly evaluate multiple single-nucleotide polymorphisms (SNPs) to reveal novel genetic architectures of various complex traits. The validity of these approaches relies on the accurate estimate of z-score correlations at considered SNPs, which in turn requires knowledge on the set of SNPs assessed by each study participating in the meta-analysis. However, this exact SNP coverage information is usually unavailable from the meta-analysis results published by GWAS consortia. In the absence of the coverage information, researchers typically estimate the z-score correlations by making oversimplified coverage assumptions. We show through real studies that such a practice can generate highly inflated type I errors, and we demonstrate the proper way to incorporate correct coverage information into multi-locus analyses. We advocate that consortia should make SNP coverage information available when posting their meta-analysis results, and that investigators who develop analytic tools for joint analyses based on summary data should pay attention to the variation in SNP coverage and adjust for it appropriately. Published by Oxford University Press 2017. This work is written by US Government employees and is in the public domain in the US.

  8. A 34-plex autosomal SNP single base extension assay for ancestry investigations.

    PubMed

    Phillips, C; Fondevila, M; Lareau, Maria Victoria

    2012-01-01

    Ancestry inference based on autosomal markers remains a niche approach in forensic analysis: most laboratories feel more secure with a review of the cumulative STR profile frequencies in a range of relevant populations with the possible additional analysis of mitochondrial and/or Y-chromosome variability. However, a proportion of autosomal single nucleotide polymorphisms (SNPs) show very well-differentiated allele frequencies among global population-groups. Furthermore, such ancestry informative marker SNPs (AIM-SNPs) lend themselves to relatively straightforward typing with short-amplicon PCR and multiplexed single base extension reactions using the same capillary electrophoresis detectors required for the sequencing and STR genotyping of mainstream forensic markers. In this chapter, we describe a 34 AIM-SNP multiplex that is robust enough for the analysis of challenging, often highly degraded DNA typical of much of routine forensic casework. We also outline in detail the in-silico procedures necessary for collecting parental population reference data from the SPSmart SNP databases and performing ancestry inference of single AIM-SNP profiles or large-scale population data using the companion ancestry analysis website of Snipper. Two casework examples are described that show, in both cases, that an inference of likely ancestry using AIM-SNPs helped the identification of highly degraded skeletal material.

  9. Quantification of within-sample genetic heterogeneity from SNP-array data.

    PubMed

    Martinez, Pierre; Kimberley, Christopher; BirkBak, Nicolai J; Marquard, Andrea; Szallasi, Zoltan; Graham, Trevor A

    2017-06-12

    Intra-tumour genetic heterogeneity (ITH) fosters drug resistance and is a critical hurdle to clinical treatment. ITH can be well-measured using multi-region sampling but this is costly and challenging to implement. There is therefore a need for tools to estimate ITH in individual samples, using standard genomic data such as SNP-arrays, that could be implemented routinely. We designed two novel scores S and R, respectively based on the Shannon diversity index and Ripley's L statistic of spatial homogeneity, to quantify ITH in single SNP-array samples. We created in-silico and in-vitro mixtures of tumour clones, in which diversity was known for benchmarking purposes. We found significant but highly-variable associations of our scores with diversity in-silico (p < 0.001) and moderate associations in-vitro (p = 0.015 and p = 0.085). Our scores were also correlated to previous ITH estimates from sequencing data but heterogeneity in the fraction of tumour cells present across samples hampered accurate quantification. The prognostic potential of both scores was moderate but significantly predictive of survival in several tumour types (corrected p = 0.03). Our work thus shows how individual SNP-arrays reveal intra-sample clonal diversity with moderate accuracy.
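
    For orientation, the Shannon diversity index on which the S score is based can be computed from clone fractions as below. This is a sketch of the index itself under the assumption that clone fractions are known (as in the benchmarking mixtures); the authors' S score is derived from SNP-array data, and that derivation is not given in the abstract.

```python
from math import log


def shannon_index(fractions):
    """Shannon diversity H = -sum(p_i * ln p_i) over non-zero clone fractions.

    fractions: relative abundances of each clone; they are normalised here,
    so they need not sum to 1.
    """
    total = sum(fractions)
    probs = [f / total for f in fractions if f > 0]
    return sum(-p * log(p) for p in probs)


# Invented mixtures: a single clone, four equal clones, one dominant clone.
single = shannon_index([1.0])
even = shannon_index([0.25, 0.25, 0.25, 0.25])
skewed = shannon_index([0.85, 0.05, 0.05, 0.05])
print(single, even, skewed)
```

    The index is zero for a single clone and reaches its maximum, ln(k), when k clones are equally abundant, which is why an even mixture scores higher than a skewed one.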

  10. Meta-analysis diagnostic accuracy of SNP-based pathogenicity detection tools: a case of UTG1A1 gene mutations

    PubMed Central

    Galehdari, Hamid; Saki, Najmaldin; Mohammadi-asl, Javad; Rahim, Fakher

    2013-01-01

    Crigler-Najjar syndrome (CNS) type I and type II are usually inherited as autosomal recessive conditions that result from mutations in the UGT1A1 gene. The main objective of the present review is to summarize all available evidence on the accuracy of SNP-based pathogenicity detection tools, compared with published clinical results, for predicting disease-causing nsSNPs, using prediction performance measures. A comprehensive search was performed to find all mutations related to CNS. Database searches included dbSNP, SNPdbe, HGMD, Swissvar, Ensembl, and OMIM. All mutations related to CNS were extracted. Pathogenicity was predicted using SNP-based pathogenicity detection tools including SIFT, PHD-SNP, PolyPhen2, fathmm, Provean, and Mutpred. Overall, 59 different SNPs related to missense mutations in the UGT1A1 gene were reviewed. By diagnostic OR, PolyPhen2 and Mutpred had the highest value, 4.983 (95% CI: 1.24 – 20.02) for both, followed by SIFT (diagnostic OR: 3.25, 95% CI: 1.07 – 9.83). The highest MCC among the SNP-based pathogenicity detection tools belonged to SIFT (34.19%), followed by Provean, PolyPhen2, and Mutpred (29.99%, 29.89%, and 29.89%, respectively). The highest ACC likewise belonged to SIFT (62.71%), followed by PolyPhen2 and Mutpred (61.02% for both). Our results suggest that some of the well-established SNP-based pathogenicity detection tools can appropriately reflect the role of a disease-associated SNP in both local and global structures. PMID:23875061
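
    The three performance measures named above (ACC, MCC and diagnostic OR) are all functions of a 2×2 confusion matrix. The sketch below computes them using their standard definitions; the cell counts are invented for illustration and only their total (59) is taken from the abstract, so the MCC and OR printed here do not correspond to any tool in the review.

```python
from math import sqrt


def diagnostic_metrics(tp, fp, fn, tn):
    """Accuracy, Matthews correlation coefficient and diagnostic odds ratio
    from the four cells of a confusion matrix."""
    acc = (tp + tn) / (tp + fp + fn + tn)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    dor = (tp * tn) / (fp * fn) if fp and fn else float("inf")
    return {"acc": acc, "mcc": mcc, "dor": dor}


# Hypothetical counts for one prediction tool over 59 variants.
metrics = diagnostic_metrics(tp=30, fp=10, fn=12, tn=7)
print({name: round(value, 4) for name, value in metrics.items()})
```

    With 59 variants, 37 correct calls give an accuracy of 37/59 ≈ 62.71% and 36 give ≈ 61.02%, which is consistent with the ACC figures quoted above even though the individual cells used here are made up.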

  11. snp-search: simple processing, manipulation and searching of SNPs from high-throughput sequencing

    PubMed Central

    2013-01-01

    Background A typical bacterial pathogen genome mapping project can identify thousands of single nucleotide polymorphisms (SNP). Interpreting SNP data is complex and it is difficult to conceptualise the data contained within the large flat files that are the typical output from most SNP calling algorithms. One solution to this problem is to construct a database that can be queried using simple commands so that SNP interrogation and output is both easy and comprehensible. Results Here we present snp-search, a tool that manages SNP data and allows for manipulation and searching of SNP data. After creation of a SNP database from a VCF file, snp-search can be used to convert the selected SNP data into FASTA sequences, construct phylogenies, look for unique SNPs, and output contextual information about each SNP. The FASTA output from snp-search is particularly useful for the generation of robust phylogenetic trees that are based on SNP differences across the conserved positions in whole genomes. Queries can be designed to answer critical genomic questions such as the association of SNPs with particular phenotypes. Conclusions snp-search is a tool that manages SNP data and outputs useful information which can be used to test important biological hypotheses. PMID:24246037
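
    The SNP-to-FASTA conversion described above amounts to concatenating, for each strain, the base observed at every variable position. The following sketch shows that idea on in-memory data; it is not snp-search's implementation or interface, and the function name, input layout and strain names are hypothetical.

```python
def snps_to_fasta(calls, reference):
    """Build a FASTA string of concatenated SNP alleles, one record per strain.

    reference: {position: reference_base} for every variable position.
    calls:     {strain: {position: alternate_base}}; positions missing from a
               strain's dictionary are assumed to carry the reference base.
    """
    positions = sorted(reference)
    records = []
    for strain in sorted(calls):
        sequence = "".join(calls[strain].get(pos, reference[pos])
                           for pos in positions)
        records.append(f">{strain}\n{sequence}")
    return "\n".join(records) + "\n"


# Invented example: three variable positions, two strains.
reference = {10: "A", 55: "G", 90: "T"}
calls = {"strain1": {10: "C"}, "strain2": {55: "A", 90: "C"}}
fasta = snps_to_fasta(calls, reference)
print(fasta, end="")
```

    An alignment of this kind, restricted to variable positions, is the usual input for SNP-based phylogenetic tree building.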

  12. Genome-wide SNP analysis of the Systemic Capillary Leak Syndrome (Clarkson disease)

    PubMed Central

    Xie, Zhihui; Nagarajan, Vijayaraj; Sturdevant, Daniel E; Iwaki, Shoko; Chan, Eunice; Wisch, Laura; Young, Michael; Nelson, Celeste M; Porcella, Stephen F; Druey, Kirk M

    2013-01-01

    The Systemic Capillary Leak Syndrome (SCLS) is an extremely rare, orphan disease that resembles, and is frequently erroneously diagnosed as, systemic anaphylaxis. The disorder is characterized by repeated, transient, and seemingly unprovoked episodes of hypotensive shock and peripheral edema due to transient endothelial hyperpermeability. SCLS is often accompanied by a monoclonal gammopathy of unknown significance (MGUS). Using Affymetrix Single Nucleotide Polymorphism (SNP) microarrays, we performed the first genome-wide SNP analysis of SCLS in a cohort of 12 disease subjects and 18 controls. Exome capture sequencing was performed on genomic DNA from nine of these patients as validation for the SNP-chip discoveries and de novo data generation. We identified candidate susceptibility loci for SCLS, which included a region flanking CAV3 (3p25.3) as well as SNP clusters in PON1 (7q21.3), PSORS1C1 (6p21.3), and CHCHD3 (7q33). Among the most highly ranked discoveries were gene-associated SNPs in the uncharacterized LOC100130480 gene (rs6417039, rs2004296). Top case-associated SNPs were observed in BTRC (rs12355803, rs4436485), ARHGEF18 (rs11668246), CDH13 (rs4782779), and EDG2 (rs12552348), which encode proteins with known or suspected roles in B cell function and/or vascular integrity. 61 SNPs that were significantly associated with SCLS by microarray analysis were also detected and validated by exome deep sequencing. Functional annotation of highly ranked SNPs revealed enrichment of cell projections, cell junctions and adhesion, and molecules containing pleckstrin homology, Ras/Rho regulatory, and immunoglobulin Ig-like C2/fibronectin type III domains, all of which involve mechanistic functions that correlate with the SCLS phenotype. These results highlight SNPs with potential relevance to SCLS. PMID:24808988

  13. Using Drosophila melanogaster as a Model for Genotoxic Chemical Mutational Studies with a New Program, SnpSift

    PubMed Central

    Cingolani, Pablo; Patel, Viral M.; Coon, Melissa; Nguyen, Tung; Land, Susan J.; Ruden, Douglas M.; Lu, Xiangyi

    2012-01-01

    This paper describes a new program SnpSift for filtering differential DNA sequence variants between two or more experimental genomes after genotoxic chemical exposure. Here, we illustrate how SnpSift can be used to identify candidate phenotype-relevant variants including single nucleotide polymorphisms, multiple nucleotide polymorphisms, insertions, and deletions (InDels) in mutant strains isolated from genome-wide chemical mutagenesis of Drosophila melanogaster. First, the genomes of two independently isolated mutant fly strains that are allelic for a novel recessive male-sterile locus generated by genotoxic chemical exposure were sequenced using the Illumina next-generation DNA sequencer to obtain 20- to 29-fold coverage of the euchromatic sequences. The sequencing reads were processed and variants were called using standard bioinformatic tools. Next, SnpEff was used to annotate all sequence variants and their potential mutational effects on associated genes. Then, SnpSift was used to filter and select differential variants that potentially disrupt a common gene in the two allelic mutant strains. The potential causative DNA lesions were partially validated by capillary sequencing of polymerase chain reaction-amplified DNA in the genetic interval as defined by meiotic mapping and deletions that remove defined regions of the chromosome. Of the five candidate genes located in the genetic interval, the Pka-like gene CG12069 was found to carry a separate pre-mature stop codon mutation in each of the two allelic mutants whereas the other four candidate genes within the interval have wild-type sequences. The Pka-like gene is therefore a strong candidate gene for the male-sterile locus. These results demonstrate that combining SnpEff and SnpSift can expedite the identification of candidate phenotype-causative mutations in chemically mutagenized Drosophila strains. This technique can also be used to characterize the variety of mutations generated by genotoxic chemicals

  14. A multi-SNP association test for complex diseases incorporating an optimal P-value threshold algorithm in nuclear families.

    PubMed

    Wang, Yi-Ting; Sung, Pei-Yuan; Lin, Peng-Lin; Yu, Ya-Wen; Chung, Ren-Hua

    2015-05-15

    Genome-wide association studies (GWAS) have become a common approach to identifying single nucleotide polymorphisms (SNPs) associated with complex diseases. As complex diseases are caused by the joint effects of multiple genes, while the effect of an individual gene or SNP is modest, a method considering the joint effects of multiple SNPs can be more powerful than testing individual SNPs. The multi-SNP analysis aims to test association based on a SNP set, usually defined based on biological knowledge such as gene or pathway, which may contain only a portion of SNPs with effects on the disease. Therefore, a challenge for the multi-SNP analysis is how to effectively select a subset of SNPs with promising association signals from the SNP set. We developed the Optimal P-value Threshold Pedigree Disequilibrium Test (OPTPDT). The OPTPDT uses general nuclear families. A variable p-value threshold algorithm is used to determine an optimal p-value threshold for selecting a subset of SNPs. A permutation procedure is used to assess the significance of the test. We used simulations to verify that the OPTPDT has correct type I error rates. Our power studies showed that the OPTPDT can be more powerful than the set-based test in PLINK, the multi-SNP FBAT test, and the p-value based test GATES. We applied the OPTPDT to a family-based autism GWAS dataset for gene-based association analysis and identified MACROD2-AS1 with genome-wide significance (p-value = 2.5×10^-6). Our simulation results suggested that the OPTPDT is a valid and powerful test. The OPTPDT will be helpful for gene-based or pathway association analysis. The method is ideal for the secondary analysis of existing GWAS datasets, which may identify a set of SNPs with joint effects on the disease.

  15. DoGSD: the dog and wolf genome SNP database.

    PubMed

    Bai, Bing; Zhao, Wen-Ming; Tang, Bi-Xia; Wang, Yan-Qing; Wang, Lu; Zhang, Zhang; Yang, He-Chuan; Liu, Yan-Hu; Zhu, Jun-Wei; Irwin, David M; Wang, Guo-Dong; Zhang, Ya-Ping

    2015-01-01

    The rapid advancement of next-generation sequencing technology has generated a deluge of genomic data from domesticated dogs and their wild ancestor, grey wolves, which has simultaneously broadened our understanding of domestication and diseases that are shared by humans and dogs. To address the scarcity of single nucleotide polymorphism (SNP) data provided by authorized databases and to make SNP data easier to use and more readily available, we propose DoGSD (http://dogsd.big.ac.cn), the first Canidae-specific database which focuses on whole genome SNP data from domesticated dogs and grey wolves. The DoGSD is a web-based, open-access resource comprising ∼ 19 million high-quality whole-genome SNPs. In addition to the dbSNP data set (build 139), DoGSD incorporates a comprehensive collection of SNPs from two newly sequenced samples (1 wolf and 1 dog) and collected SNPs from the three latest dog/wolf genetic studies (7 wolves and 68 dogs), which were taken together for analysis with the population genetic statistic Fst. In addition, DoGSD integrates some closely related information including SNP annotation, summary lists of SNPs located in genes, synonymous and non-synonymous SNPs, sampling location and breed information. All these features make DoGSD a useful resource for in-depth analysis in dog-/wolf-related studies.

  16. Missingness in the T1DGC MHC fine-mapping SNP data: association with HLA genotype and potential influence on genetic association studies.

    PubMed

    James, I; McKinnon, E; Gaudieri, S; Morahan, G

    2009-02-01

    The absence or 'missingness' of single nucleotide polymorphism (SNP) assay values because of genotype or related factors of interest may bias association and other studies. Missingness was determined for the Type 1 Diabetes Genetics Consortium (T1DGC) Major Histocompatibility Complex (MHC) data and was found to vary across the region, ranging up to 11.1% of the non-null proband SNPs, with a median of 0.3%. We consider factors related to missingness in the T1DGC data and briefly assess its possible influence on association studies. We assessed associations of missingness in the SNP assay data with human leucocyte antigen (HLA) genotype of the individual and with SNP genotypes of the parents. Within-cohort analyses were combined (over all cohorts) using (i) Mantel-Haenszel tests for two-by-two tables or (ii) by combining test statistics for larger tables and regression models. Mixed effect regression models were used to assess association of the SNP genotypes with affected status of the offspring after adjustment for parental SNP genotypes, cohort membership and HLA genotypes. Log-linear models were used to assess association of missingness in the unaffected sib assays with SNP genotypes of the probands. Missingness of SNP values near the HLA class I (A, B and C) and class II (DR, DQ and DP) loci is strongly associated with carriage of corresponding HLA genotypes within these groups. Similar associations pertain to missing values among the microsatellite data. In at least some of these cases, regions of missingness coincided with known deletion regions corresponding to the associated HLA haplotype. We conjecture that other regions of associated missingness may point to similar haplotypic deletions. Analysis of association patterns of SNP genotypes with affected status of offspring does not indicate strong informative missingness. However, association of missingness in proband data with parental SNP genotypes may impact transmission disequilibrium test (TDT)-type

  17. SNP markers identify widely distributed clonal lineages of Phytophthora colocasiae in Vietnam, Hawaii and Hainan Island, China.

    PubMed

    Shrestha, Sandesh; Hu, Jian; Fryxell, Rebecca Trout; Mudge, Joann; Lamour, Kurt

    2014-01-01

    Taro (Colocasia esculenta) is an important food crop, and taro leaf blight caused by Phytophthora colocasiae can significantly affect production. Our objectives were to develop single nucleotide polymorphism (SNP) markers for P. colocasiae and characterize populations in Hawaii (HI), Vietnam (VN) and Hainan Island, China (HIC). In total, 379 isolates were analyzed for mating type and multilocus SNP profiles including 214 from HI, 97 from VN and 68 from HIC. A total of 1152 single nucleotide variant (SNV) sites were identified via restriction site-associated DNA (RAD) sequencing of two field isolates. Genotyping with 27 SNPs revealed 41 multilocus SNP genotypes grouped into seven clonal lineages containing 2-232 members. Three clonal lineages were shared among countries. In addition, five SNP markers had a low incidence of loss of heterozygosity (LOH) during asexual laboratory growth. For HI and VN, >95% of isolates were the A2 mating type. On HIC, isolates within single clonal lineages had A1, A2 and A0 (neuter) isolates. The implications for the wide dispersal of clonal lineages are discussed.

  18. SNP genotyping of animal and human derived isolates of Mycobacterium avium subsp. paratuberculosis.

    PubMed

    Wynne, James W; Beller, Christie; Boyd, Victoria; Francis, Barry; Gwoźdź, Jacek; Carajias, Marios; Heine, Hans G; Wagner, Josef; Kirkwood, Carl D; Michalski, Wojtek P

    2014-08-27

    Mycobacterium avium subsp. paratuberculosis (MAP) is the aetiological agent of Johne's disease (JD), a chronic granulomatous enteritis that affects ruminants worldwide. While the ability of MAP to cause disease in animals is clear, the role of this bacterium in human inflammatory bowel diseases remains unresolved. Previous whole genome sequencing of MAP isolates derived from human and three animal hosts showed that human isolates were genetically similar and showed a close phylogenetic relationship to one bovine isolate. In contrast, other animal derived isolates were more genetically diverse. The present study aimed to investigate the frequency of this human strain across 52 wild-type MAP isolates, collected predominantly from Australia. A Luminex based SNP genotyping approach was utilised to genotype SNPs that had previously been shown to be specific to the human, bovine or ovine isolate types. Fourteen SNPs were initially evaluated across a reference panel of isolates with known genotypes. A subset of seven SNPs was chosen for analysis within the wild-type collection. Of the seven SNPs, three were found to be unique to paediatric human isolates. No wild-type isolates contain these SNP alleles. Interestingly, and in contrast to the paediatric isolates, three additional adult human isolates (derived from adult Crohn's disease patients) also did not contain these SNP alleles. Furthermore, we identified two SNPs that demonstrate extensive polymorphism within the animal-derived MAP isolates, one of which appears unique to ovine isolates and a single camel isolate. From this study we suggest the existence of genetic heterogeneity between human derived MAP isolates, some of which are highly similar to those derived from bovine hosts, but others of which are more divergent.

  19. Forensic SNP Genotyping using Nanopore MinION Sequencing

    PubMed Central

    Cornelis, Senne; Gansemans, Yannick; Deleye, Lieselot; Deforce, Dieter; Van Nieuwerburgh, Filip

    2017-01-01

    One of the latest developments in next generation sequencing is the Oxford Nanopore Technologies’ (ONT) MinION nanopore sequencer. We studied the applicability of this system to perform forensic genotyping of the forensic female DNA standard 9947 A using the 52 SNP-plex assay developed by the SNPforID consortium. All but one of the loci were correctly genotyped. Several SNP loci were identified as problematic for correct and robust genotyping using nanopore sequencing. All these loci contained homopolymers in the sequence flanking the forensic SNP and most of them were already reported as problematic in studies using other sequencing technologies. When these problematic loci are avoided, correct forensic genotyping using nanopore sequencing is technically feasible. PMID:28155888

  20. PanSNPdb: the Pan-Asian SNP genotyping database.

    PubMed

    Ngamphiw, Chumpol; Assawamakin, Anunchai; Xu, Shuhua; Shaw, Philip J; Yang, Jin Ok; Ghang, Ho; Bhak, Jong; Liu, Edison; Tongsima, Sissades

    2011-01-01

    The HUGO Pan-Asian SNP consortium conducted the largest survey to date of human genetic diversity among Asians by sampling 1,719 unrelated individuals among 71 populations from China, India, Indonesia, Japan, Malaysia, the Philippines, Singapore, South Korea, Taiwan, and Thailand. We have constructed a database (PanSNPdb), which contains these data and various new analyses of them. PanSNPdb is a research resource in the analysis of the population structure of Asian peoples, including linkage disequilibrium patterns, haplotype distributions, and copy number variations. Furthermore, PanSNPdb provides an interactive comparison with other SNP and CNV databases, including HapMap3, JSNP, dbSNP and DGV and thus provides a comprehensive resource of human genetic diversity. The information is accessible via a widely accepted graphical interface used in many genetic variation databases. Unrestricted access to PanSNPdb and any associated files is available at: http://www4a.biotec.or.th/PASNP.

  1. Forensic SNP Genotyping using Nanopore MinION Sequencing.

    PubMed

    Cornelis, Senne; Gansemans, Yannick; Deleye, Lieselot; Deforce, Dieter; Van Nieuwerburgh, Filip

    2017-02-03

    One of the latest developments in next generation sequencing is the Oxford Nanopore Technologies' (ONT) MinION nanopore sequencer. We studied the applicability of this system to perform forensic genotyping of the forensic female DNA standard 9947 A using the 52 SNP-plex assay developed by the SNPforID consortium. All but one of the loci were correctly genotyped. Several SNP loci were identified as problematic for correct and robust genotyping using nanopore sequencing. All these loci contained homopolymers in the sequence flanking the forensic SNP and most of them were already reported as problematic in studies using other sequencing technologies. When these problematic loci are avoided, correct forensic genotyping using nanopore sequencing is technically feasible.

  2. Population distribution and ancestry of the cancer protective MDM2 SNP285 (rs117039649).

    PubMed

    Knappskog, Stian; Gansmo, Liv B; Dibirova, Khadizha; Metspalu, Andres; Cybulski, Cezary; Peterlongo, Paolo; Aaltonen, Lauri; Vatten, Lars; Romundstad, Pål; Hveem, Kristian; Devilee, Peter; Evans, Gareth D; Lin, Dongxin; Van Camp, Guy; Manolopoulos, Vangelis G; Osorio, Ana; Milani, Lili; Ozcelik, Tayfun; Zalloua, Pierre; Mouzaya, Francis; Bliznetz, Elena; Balanovska, Elena; Pocheshkova, Elvira; Kučinskas, Vaidutis; Atramentova, Lubov; Nymadawa, Pagbajabyn; Titov, Konstantin; Lavryashina, Maria; Yusupov, Yuldash; Bogdanova, Natalia; Koshel, Sergey; Zamora, Jorge; Wedge, David C; Charlesworth, Deborah; Dörk, Thilo; Balanovsky, Oleg; Lønning, Per E

    2014-09-30

    The MDM2 promoter SNP285C is located on the SNP309G allele. While SNP309G enhances Sp1 transcription factor binding and MDM2 transcription, SNP285C antagonizes Sp1 binding and reduces the risk of breast-, ovary- and endometrial cancer. Assessing SNP285 and 309 genotypes across 25 different ethnic populations (>10,000 individuals), the incidence of SNP285C was 6-8% across European populations except for Finns (1.2%) and Saami (0.3%). The incidence decreased towards the Middle-East and Eastern Russia, and SNP285C was absent among Han Chinese, Mongolians and African Americans. Interhaplotype variation analyses estimated SNP285C to have originated about 14,700 years ago (95% CI: 8,300 - 33,300). Both this estimate and the geographical distribution suggest SNP285C to have arisen after the separation between Caucasians and modern day East Asians (17,000 - 40,000 years ago). We observed a strong inverse correlation (r = -0.805; p < 0.001) between the percentage of SNP309G alleles harboring SNP285C and the MAF for SNP309G itself across different populations suggesting selection and environmental adaptation with respect to MDM2 expression in recent human evolution. In conclusion, we found SNP285C to be a pan-Caucasian variant. Ethnic variation regarding distribution of SNP285C needs to be taken into account when assessing the impact of MDM2 SNPs on cancer risk.

  3. Sniper: improved SNP discovery by multiply mapping deep sequenced reads.

    PubMed

    Simola, Daniel F; Kim, Junhyong

    2011-06-20

    SNP (single nucleotide polymorphism) discovery using next-generation sequencing data remains difficult primarily because of redundant genomic regions, such as interspersed repetitive elements and paralogous genes, present in all eukaryotic genomes. To address this problem, we developed Sniper, a novel multi-locus Bayesian probabilistic model and a computationally efficient algorithm that explicitly incorporates sequence reads that map to multiple genomic loci. Our model fully accounts for sequencing error, template bias, and multi-locus SNP combinations, maintaining high sensitivity and specificity under a broad range of conditions. An implementation of Sniper is freely available at http://kim.bio.upenn.edu/software/sniper.shtml.

  4. Genetic algorithm-generated SNP barcodes of the mitochondrial D-loop for chronic dialysis susceptibility.

    PubMed

    Chen, Jin-Bor; Chuang, Li-Yeh; Lin, Yu-Da; Liou, Chia-Wei; Lin, Tsu-Kung; Lee, Wen-Chin; Cheng, Ben-Chung; Chang, Hsueh-Wei; Yang, Cheng-Hong

    2014-06-01

    Single nucleotide polymorphism (SNP) interaction analysis can simultaneously evaluate the complex SNP interactions present in complex diseases. However, it is less commonly applied to evaluate the predisposition of chronic dialysis and its computational analysis remains challenging. In this study, we aimed to improve the analysis of SNP-SNP interactions within the mitochondrial D-loop in chronic dialysis. The SNP-SNP interactions between 77 reported SNPs within the mitochondrial D-loop in chronic dialysis study were evaluated in terms of SNP barcodes (different SNP combinations with their corresponding genotypes). We propose a genetic algorithm (GA) to generate SNP barcodes. The χ² values were then calculated by the occurrences of the specific SNP barcodes and their non-specific combinations between cases and controls. Each SNP barcode (2- to 7-SNP) with the highest value in the χ² test was regarded as the best SNP barcode (11.304 to 23.310; p < 0.001). The best GA-generated SNP barcodes (2- to 7-SNP) were significantly associated with chronic dialysis (odds ratio [OR] = 1.998 to 3.139; p < 0.001). The order of influence for SNPs was the same as the order of their OR values for chronic dialysis in terms of 2- to 7-SNP barcodes. Taken together, we propose an effective algorithm to address the SNP-SNP interactions and demonstrated that many non-significant SNPs within the mitochondrial D-loop may play a role in jointed effects to chronic dialysis susceptibility.
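
    The scoring step described in this abstract (a χ² value and an odds ratio computed from how often a SNP barcode occurs in cases versus controls) can be illustrated with a minimal sketch. It assumes a plain 2x2 table and the Pearson statistic without continuity correction; the abstract does not say which variant the authors used, and the function names and counts below are hypothetical.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table
    [[a, b], [c, d]], where rows are cases/controls and columns are
    barcode present/absent."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("a row or column total is zero")
    return n * (a * d - b * c) ** 2 / denom


def odds_ratio(a, b, c, d):
    """Odds ratio of carrying the barcode in cases versus controls.
    Raises ZeroDivisionError when b * c is zero."""
    return (a * d) / (b * c)


# Hypothetical counts: 60 of 100 cases and 40 of 100 controls carry the barcode.
chi2 = chi_square_2x2(60, 40, 40, 60)
oddsr = odds_ratio(60, 40, 40, 60)
print(chi2, oddsr)
```

    In a barcode search of the kind the abstract describes, a statistic like this would serve as the fitness value that ranks candidate SNP combinations.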

  5. SNP variation in ADRB3 gene reflects the breed difference of sheep populations.

    PubMed

    Wu, Jianliang; Qiao, Liying; Liu, Jianhua; Yuan, Yanan; Liu, Wenzhong

    2012-08-01

    The β3-adrenergic receptor (ADRB3), a G-protein coupled receptor, plays a major role in energy metabolism and regulation of lipolysis and homeostasis. We detect the single nucleotide polymorphism (SNP) variation in full-length sequence of ovine ADRB3 gene in 12 domestic sheep populations within four types by polymerase chain reaction-single strand conformation polymorphism and sequencing to reveal the breed difference. Twenty-two SNPs, 12 of which in the exon 1 and ten in the intron, were detected, and 12 new exonic and four new intronic SNPs were found. Most SNPs presented in Shanxi Dam Line and least ones in Dorset. The average SNP number in both meat and dual purpose for meat and wool breeds was significantly higher than general and dual purpose breeds for wool and meat. Frequency of each SNP in studied breeds or types was different. The 18C Del and 1617T Ins majorly existed in dual purpose breeds for wool and meat. The 25A Del, 119C>G and 130C>T were mostly found in the meat and dual purpose for meat and wool breeds. The 1764C>A more frequently presented in meat than in other types. The majority of variations came from within the populations as suggested by analysis of molecular variance. Close relationship presented among the Chinese and western breeds, respectively. In conclusion, SNPs of ovine ADRB3 gene can reflect the breed difference and within- and between-population variations, and to a great extent, the breed relationship.

  6. New Insights into the Geographic Distribution of Mycobacterium leprae SNP Genotypes Determined for Isolates from Leprosy Cases Diagnosed in Metropolitan France and French Territories.

    PubMed

    Reibel, Florence; Chauffour, Aurélie; Brossier, Florence; Jarlier, Vincent; Cambau, Emmanuelle; Aubry, Alexandra

    2015-01-01

    Between 20 and 30 bacteriologically confirmed cases of leprosy are diagnosed each year at the French National Reference Center for mycobacteria. Patients are mainly immigrants from various endemic countries or living in French overseas territories. We aimed at expanding data regarding the geographical distribution of the SNP genotypes of the M. leprae isolates from these patients. Skin biopsies were obtained from 71 leprosy patients diagnosed between January 2009 and December 2013. Data regarding age, sex and place of birth and residence were also collected. Diagnosis of leprosy was confirmed by microscopic detection of acid-fast bacilli and/or amplification by PCR of the M. leprae-specific RLEP region. Single nucleotide polymorphisms (SNP), present in the M. leprae genome at positions 14 676, 1 642 875 and 2 935 685, were determined with an efficiency of 94% (67/71). Almost all patients were from countries other than France where leprosy is still prevalent (n = 31) or from French overseas territories (n = 36) where leprosy is not totally eradicated, while only a minority (n = 4) was born in metropolitan France but have lived in other countries. SNP type 1 was predominant (n = 33), followed by type 3 (n = 17), type 4 (n = 11) and type 2 (n = 6). SNP types were concordant with those previously reported as prevalent in the patients' countries of birth. SNP types found in patients born in countries other than France (Comoros, Haiti, Benin, Congo, Sri Lanka) and French overseas territories (French Polynesia, Mayotte and La Réunion) not covered by previous work correlated well with geographical location and history of human settlements. The phylogenic analysis of M. leprae strains isolated in France strongly suggests that French leprosy cases are caused by SNP types that are (a) concordant with the geographic origin or residence of the patients (non-French countries, French overseas territories, metropolitan France) or (b) more likely random in regions where diverse

  7. snpTree--a web-server to identify and construct SNP trees from whole genome sequence data.

    PubMed

    Leekitcharoenphon, Pimlapas; Kaas, Rolf S; Thomsen, Martin Christen Frølund; Friis, Carsten; Rasmussen, Simon; Aarestrup, Frank M

    2012-01-01

    The advances in, and decreasing cost of, whole genome sequencing (WGS) will soon make this technology available for routine infectious disease epidemiology. In epidemiological studies, outbreak isolates have very little diversity and require extensive genomic analysis to differentiate and classify isolates. One of the successfully and broadly used methods is analysis of single nucleotide polymorphisms (SNPs). Currently, there are different tools and methods to identify SNPs including various options and cut-off values. Furthermore, all current methods require bioinformatic skills. Thus, we lack a standard and simple automatic tool to determine SNPs and construct a phylogenetic tree from WGS data. Here we introduce snpTree, a server for online-automatic SNPs analysis. This tool is composed of different SNPs analysis suites, perl and python scripts. snpTree can identify SNPs and construct phylogenetic trees from WGS as well as from assembled genomes or contigs. WGS data in fastq format are aligned to reference genomes by BWA while contigs in fasta format are processed by Nucmer. SNPs are concatenated based on position on reference genome and a tree is constructed from concatenated SNPs using FastTree and a perl script. The online server was implemented by HTML, Java and python script. The server was evaluated using four published bacterial WGS data sets (V. cholerae, S. aureus CC398, S. Typhimurium and M. tuberculosis). The evaluation results for the first three cases were consistent and concordant for both raw reads and assembled genomes. In the latter case the original publication involved extensive filtering of SNPs, which could not be repeated using snpTree. The snpTree server is an easy to use option for rapid standardised and automatic SNP analysis in epidemiological studies also for users with limited bioinformatic experience. The web server is freely accessible at http://www.cbs.dtu.dk/services/snpTree-1.0/.
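
    The concatenation step mentioned in this abstract (SNPs joined by their position on the reference genome before tree building) might look roughly like the sketch below. It is an illustration only, not the snpTree code: the function name, the 0-based positions and the rule that an isolate without a call at a position carries the reference base are all assumptions made here.

```python
def concatenate_snps(reference, calls):
    """Build one pseudo-sequence per isolate from per-isolate SNP calls.

    reference: reference genome as a string (0-based positions, an assumption).
    calls: dict mapping isolate name -> {position: called base}.
    Returns the sorted variable positions and a dict of concatenated bases.
    """
    # Union of every position that is variable in at least one isolate.
    positions = sorted({pos for sample in calls.values() for pos in sample})
    # Assumption: isolates with no call at a position carry the reference base.
    sequences = {
        name: "".join(sample.get(pos, reference[pos]) for pos in positions)
        for name, sample in calls.items()
    }
    return positions, sequences


# Hypothetical toy data: three isolates against an 8-base reference.
positions, sequences = concatenate_snps(
    "ACGTACGT",
    {"iso1": {1: "T", 5: "A"}, "iso2": {5: "A"}, "iso3": {}},
)
print(positions, sequences)
```

    The resulting equal-length sequences are the kind of input a tree builder such as FastTree takes as an alignment. A real pipeline would also need to handle positions with insufficient coverage, which this sketch ignores.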

  8. Evidence for SNP-SNP interaction identified through targeted sequencing of cleft case-parent trios.

    PubMed

    Xiao, Yanzi; Taub, Margaret A; Ruczinski, Ingo; Begum, Ferdouse; Hetmanski, Jacqueline B; Schwender, Holger; Leslie, Elizabeth J; Koboldt, Daniel C; Murray, Jeffrey C; Marazita, Mary L; Beaty, Terri H

    2017-04-01

    Nonsyndromic cleft lip with or without cleft palate (NSCL/P) is the most common craniofacial birth defect in humans, affecting 1 in 700 live births. This malformation has a complex etiology where multiple genes and several environmental factors influence risk. At least a dozen different genes have been confirmed to be associated with risk of NSCL/P in previous studies. However, all the known genetic risk factors cannot fully explain the observed heritability of NSCL/P, and several authors have suggested gene-gene (G × G) interaction may be important in the etiology of this complex and heterogeneous malformation. We tested for G × G interactions using common single nucleotide polymorphic (SNP) markers from targeted sequencing in 13 regions identified by previous studies spanning 6.3 Mb of the genome in a study of 1,498 NSCL/P case-parent trios. We used the R-package trio to assess interactions between polymorphic markers in different genes, using a 1 degree of freedom (1df) test for screening, and a 4 degree of freedom (4df) test to assess statistical significance of epistatic interactions. To adjust for multiple comparisons, we performed permutation tests. The most significant interaction was observed between rs6029315 in MAFB and rs6681355 in IRF6 (4df P = 3.8 × 10^-8) in case-parent trios of European ancestry, which remained significant after correcting for multiple comparisons. However, no significant interaction was detected in trios of Asian ancestry.

  9. Tag SNP selection in genotype data for maximizing SNP prediction accuracy.

    PubMed

    Halperin, Eran; Kimmel, Gad; Shamir, Ron

    2005-06-01

    The search for genetic regions associated with complex diseases, such as cancer or Alzheimer's disease, is an important challenge that may lead to better diagnosis and treatment. The existence of millions of DNA variations, primarily single nucleotide polymorphisms (SNPs), may allow the fine dissection of such associations. However, studies seeking disease association are limited by the cost of genotyping SNPs. Therefore, it is essential to find a small subset of informative SNPs (tag SNPs) that may be used as good representatives of the rest of the SNPs. We define a new natural measure for evaluating the prediction accuracy of a set of tag SNPs, and use it to develop a new method for tag SNPs selection. Our method is based on a novel algorithm that predicts the values of the rest of the SNPs given the tag SNPs. In contrast to most previous methods, our prediction algorithm uses the genotype information and not the haplotype information of the tag SNPs. Our method is very efficient, and it does not rely on having a block partition of the genomic region. We compared our method with two state-of-the-art tag SNP selection algorithms on 58 different genotype datasets from four different sources. Our method consistently found tag SNPs with considerably better prediction ability than the other methods. The software is available from the authors on request.
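
    To make the idea of "prediction accuracy of a set of tag SNPs" concrete, the sketch below scores a tag set by leave-one-out prediction of the non-tag SNPs. The predictor is a simple majority vote among individuals sharing the same tag genotypes. It stands in for the authors' algorithm, which the abstract does not specify, and the function name and genotype coding (0/1/2) are assumptions.

```python
from collections import Counter


def prediction_accuracy(genotypes, tags):
    """Leave-one-out accuracy of predicting non-tag SNPs from tag SNPs.

    genotypes: list of equal-length tuples, one per individual, coded 0/1/2.
    tags: indices of the SNPs used as tags.
    """
    tag_set = set(tags)
    targets = [j for j in range(len(genotypes[0])) if j not in tag_set]
    if not targets:
        raise ValueError("every SNP is a tag; nothing to predict")
    correct = total = 0
    for i, held_out in enumerate(genotypes):
        training = genotypes[:i] + genotypes[i + 1:]
        key = tuple(held_out[t] for t in tags)
        for j in targets:
            # Vote among training individuals with identical tag genotypes.
            votes = Counter(
                g[j] for g in training if tuple(g[t] for t in tags) == key
            )
            if not votes:
                # No match: fall back to the most common genotype overall.
                votes = Counter(g[j] for g in training)
            predicted = votes.most_common(1)[0][0]
            correct += predicted == held_out[j]
            total += 1
    return correct / total


# Hypothetical data: SNP 1 is perfectly correlated with tag SNP 0.
perfect = [(0, 0), (0, 0), (1, 1), (1, 1), (2, 2), (2, 2)]
print(prediction_accuracy(perfect, [0]))
```

    A tag-selection method could then search for the smallest tag set whose score under such a measure stays above a chosen threshold.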

  10. Identification of novel single nucleotide polymorphisms (SNPs) in deer (Odocoileus spp.) using the BovineSNP50 BeadChip.

    PubMed

    Haynes, Gwilym D; Latch, Emily K

    2012-01-01

    Single nucleotide polymorphisms (SNPs) are growing in popularity as a genetic marker for investigating evolutionary processes. A panel of SNPs is often developed by comparing large quantities of DNA sequence data across multiple individuals to identify polymorphic sites. For non-model species, this is particularly difficult, as performing the necessary large-scale genomic sequencing often exceeds the resources available for the project. In this study, we trial the Bovine SNP50 BeadChip developed in cattle (Bos taurus) for identifying polymorphic SNPs in cervids Odocoileus hemionus (mule deer and black-tailed deer) and O. virginianus (white-tailed deer) in the Pacific Northwest. We found that 38.7% of loci could be genotyped, of which 5% (n = 1068) were polymorphic. Of these 1068 polymorphic SNPs, a mixture of putatively neutral loci (n = 878) and loci under selection (n = 190) were identified with the F(ST)-outlier method. A range of population genetic analyses were implemented using these SNPs and a panel of 10 microsatellite loci. The three types of deer could readily be distinguished with both the SNP and microsatellite datasets. This study demonstrates that commercially developed SNP chips are a viable means of SNP discovery for non-model organisms, even when used between very distantly related species (the Bovidae and Cervidae families diverged some 25.1-30.1 million years before present).

  11. Identification of Novel Single Nucleotide Polymorphisms (SNPs) in Deer (Odocoileus spp.) Using the BovineSNP50 BeadChip

    PubMed Central

    Haynes, Gwilym D.; Latch, Emily K.

    2012-01-01

    Single nucleotide polymorphisms (SNPs) are growing in popularity as a genetic marker for investigating evolutionary processes. A panel of SNPs is often developed by comparing large quantities of DNA sequence data across multiple individuals to identify polymorphic sites. For non-model species, this is particularly difficult, as performing the necessary large-scale genomic sequencing often exceeds the resources available for the project. In this study, we trial the Bovine SNP50 BeadChip developed in cattle (Bos taurus) for identifying polymorphic SNPs in cervids Odocoileus hemionus (mule deer and black-tailed deer) and O. virginianus (white-tailed deer) in the Pacific Northwest. We found that 38.7% of loci could be genotyped, of which 5% (n = 1068) were polymorphic. Of these 1068 polymorphic SNPs, a mixture of putatively neutral loci (n = 878) and loci under selection (n = 190) were identified with the FST-outlier method. A range of population genetic analyses were implemented using these SNPs and a panel of 10 microsatellite loci. The three types of deer could readily be distinguished with both the SNP and microsatellite datasets. This study demonstrates that commercially developed SNP chips are a viable means of SNP discovery for non-model organisms, even when used between very distantly related species (the Bovidae and Cervidae families diverged some 25.1−30.1 million years before present). PMID:22590559

  12. Influence of mismatched and bulged nucleotides on SNP-preferential RNase H cleavage of RNA-antisense gapmer heteroduplexes.

    PubMed

    Magner, Dorota; Biala, Ewa; Lisowiec-Wachnicka, Jolanta; Kierzek, Ryszard

    2017-10-02

    This study focused on determining design rules for gapmer-type antisense oligonucleotides (ASOs), that can differentiate cleavability of two SNP variants of RNA in the presence of ribonuclease H based on the mismatch type and position in the heteroduplex. We describe the influence of structural motifs formed by several arrangements of multiple mismatches (various types of mismatches and their position within the ASO/target RNA duplex) on RNase H cleavage selectivity of five different SNP types. The targets were mRNA fragments of APP, SCA3, SNCA and SOD1 genes, carrying C-to-G, G-to-C, G-to-A, A-to-G and C-to-U substitutions. The results show that certain arrangements of mismatches enhance discrimination between wild type and mutant SNP alleles of RNA in vitro as well as in HeLa cells. Among the over 120 gapmers tested, we found two gapmers that caused preferential degradation of the mutant allele APP 692 G and one that led to preferential cleavage of the mutant SNCA 53 A allele, both in vitro and in cells. However, several gapmers promoted selective cleavage of mRNA mutant alleles in in vitro experiments only.

  13. Do you really know where this SNP goes?

    USDA-ARS's Scientific Manuscript database

    The release of build 10.2 of the swine genome was a marked improvement over previous builds and has proven extremely useful. However, as most know, there are regions of the genome that this particular build does not accurately represent. For instance, nearly 25% of the 62,162 SNP on the Illumina Por...

  14. Target SNP selection in complex disease association studies

    PubMed Central

    Wjst, Matthias

    2004-01-01

    Background The massive amount of SNP data stored at public internet sites provides unprecedented access to human genetic variation. Selecting target SNP for disease-gene association studies is currently done more or less randomly as decision rules for the selection of functional relevant SNPs are not available. Results We implemented a computational pipeline that retrieves the genomic sequence of target genes, collects information about sequence variation and selects functional motifs containing SNPs. Motifs being considered are gene promoter, exon-intron structure, AU-rich mRNA elements, transcription factor binding motifs, cryptic and enhancer splice sites together with expression in target tissue. As a case study, 396 genes on chromosome 6p21 in the extended HLA region were selected that contributed nearly 20,000 SNPs. By computer annotation ~2,500 SNPs in functional motifs could be identified. Most of these SNPs are disrupting transcription factor binding sites but only those introducing new sites had a significant depressing effect on SNP allele frequency. Other decision rules concern position within motifs, the validity of SNP database entries, the unique occurrence in the genome and conserved sequence context in other mammalian genomes. Conclusion Only 10% of all gene-based SNPs have sequence-predicted functional relevance making them a primary target for genotyping in association studies. PMID:15248903

  15. SNP Discovery and Linkage Map Construction in Cultivated Tomato

    PubMed Central

    Shirasawa, Kenta; Isobe, Sachiko; Hirakawa, Hideki; Asamizu, Erika; Fukuoka, Hiroyuki; Just, Daniel; Rothan, Christophe; Sasamoto, Shigemi; Fujishiro, Tsunakazu; Kishida, Yoshie; Kohara, Mitsuyo; Tsuruoka, Hisano; Wada, Tsuyuko; Nakamura, Yasukazu; Sato, Shusei; Tabata, Satoshi

    2010-01-01

    Few intraspecific genetic linkage maps have been reported for cultivated tomato, mainly because genetic diversity within Solanum lycopersicum is much less than that between tomato species. Single nucleotide polymorphisms (SNPs), the most abundant source of genomic variation, are the most promising source of polymorphisms for the construction of linkage maps for closely related intraspecific lines. In this study, we developed SNP markers based on expressed sequence tags for the construction of intraspecific linkage maps in tomato. Out of the 5607 SNP positions detected through in silico analysis, 1536 were selected for high-throughput genotyping of two mapping populations derived from crosses between ‘Micro-Tom’ and either ‘Ailsa Craig’ or ‘M82’. A total of 1137 markers, including 793 out of the 1338 successfully genotyped SNPs, along with 344 simple sequence repeat and intronic polymorphism markers, were mapped onto two linkage maps, which covered 1467.8 and 1422.7 cM, respectively. The SNP markers developed were then screened against cultivated tomato lines in order to estimate the transferability of these SNPs to other breeding materials. The molecular markers and linkage maps represent a milestone in the genomics and genetics, and are the first step toward molecular breeding of cultivated tomato. Information on the DNA markers, linkage maps, and SNP genotypes for these tomato lines is available at http://www.kazusa.or.jp/tomato/. PMID:21044984

  16. High throughput SNP detection system based on magnetic nanoparticles separation.

    PubMed

    Liu, Bin; Jia, Yingying; Ma, Man; Li, Zhiyang; Liu, Hongna; Li, Song; Deng, Yan; Zhang, Liming; Lu, Zhuoxuan; Wang, Wei; He, Nongyue

    2013-02-01

    A single-nucleotide polymorphism (SNP) is a one-base variation in DNA sequence that can often be helpful in finding gene associations for hereditary disease, communicable disease and so on. We developed a high throughput SNP detection system based on magnetic nanoparticles (MNPs) separation and dual-color hybridization or single base extension. This system includes a magnetic separation unit for sample separation, three high precision robot arms for pipetting and microtiter plate transferring respectively, an accurate temperature control unit for PCR and DNA hybridization and a highly accurate and sensitive optical signal detection unit for fluorescence detection. A SNP genotyping experiment on the cyclooxygenase-2 gene promoter region--65G > C polymorphism locus was carried out for 48 samples from the northern Jiangsu area to verify whether this system can simplify manual operation by researchers, save time and improve efficiency in SNP genotyping experiments. It can perform sample preparation, target sequence amplification, signal detection and data analysis automatically and can be used in clinical molecular diagnosis, high throughput fluorescence immunological detection and so on.

  17. Software solutions for the livestock genomics SNP array revolution.

    PubMed

    Nicolazzi, E L; Biffani, S; Biscarini, F; Orozco Ter Wengel, P; Caprera, A; Nazzicari, N; Stella, A

    2015-08-01

    Since the beginning of the genomic era, the number of available single nucleotide polymorphism (SNP) arrays has grown considerably. In the bovine species alone, 11 SNP chips not completely covered by intellectual property are currently available, and the number is growing. Genomic/genotype data are not standardized, and this hampers its exchange and integration. In addition, software used for the analyses of these data usually requires not standard (i.e. case specific) input files which, considering the large amount of data to be handled, require at least some programming skills in their production. In this work, we describe a software toolkit for SNP array data management, imputation, genome-wide association studies, population genetics and genomic selection. However, this toolkit does not solve the critical need for standardization of the genotypic data and software input files. It only highlights the chaotic situation each researcher has to face on a daily basis and gives some helpful advice on the currently available tools in order to navigate the SNP array data complexity.

  18. Genetic mapping in grapevine using a SNP microarray: intensity values

    USDA-ARS's Scientific Manuscript database

    Genotyping microarrays are widely used for genome wide association studies, but in high-diversity organisms, the quality of SNP calls can be diminished by genetic variation near the assayed nucleotide. To address this limitation in grapevine, we developed a simple heuristic that uses hybridization i...

  19. SNP diversity within and among Brassica rapa accessions reveals no geographic differentiation.

    PubMed

    Tanhuanpää, P; Erkkilä, M; Tenhola-Roininen, T; Tanskanen, J; Manninen, O

    2016-01-01

    Genetic diversity was studied in a collection of 61 accessions of Brassica rapa, which were mostly oil-type turnip rapes but also included two oil-type subsp. dichotoma and five subsp. trilocularis accessions, as well as three leaf-type subspecies (subsp. japonica, pekinensis, and chinensis) and five turnip cultivars (subsp. rapa). Two-hundred and nine SNP markers, which had been discovered by amplicon resequencing, were used to genotype 893 plants from the B. rapa collection using Illumina BeadXpress. There was great variation in the diversity indices between accessions. With STRUCTURE analysis, the plant collection could be divided into three groups that seemed to correspond to morphotype and flowering habit but not to geography. According to AMOVA analysis, 65% of the variation was due to variation within accessions, 25% among accessions, and 10% among groups. A smaller subset of the plant collection, 12 accessions, was also studied with 5727 GBS-SNPs. Diversity indices obtained with GBS-SNPs correlated well with those obtained with Illumina BeadXpress SNPs. The developed SNP markers have already been used and will be used in future plant breeding programs as well as in mapping and diversity studies.

  20. Influence of serum adiponectin level and SNP +45 polymorphism of adiponectin gene on myocardial fibrosis.

    PubMed

    Yan, Cheng-jun; Li, Su-mei; Xiao, Qiang; Liu, Yan; Hou, Jian; Chen, Ai-fang; Xia, Li-ping; Li, Xiu-chang

    2013-08-01

    Adiponectin plays an important role in the development of hypertension, atherosclerosis, and cardiomyocyte hypertrophy, but very little was known about the influence of serum adiponectin or the adiponectin gene polymorphism on myocardial fibrosis. Our study investigates the influence of the SNP +45 polymorphism of the adiponectin gene and serum levels of adiponectin on myocardial fibrosis in patients with essential hypertension. A case-control study was conducted on 165 hypertensive patients and 126 normotensive healthy controls. The genotypes of adiponectin gene polymorphisms were detected by the polymerase chain reaction (PCR) method. Serum concentrations of procollagen were measured by a double antibody sandwich enzyme-linked immunosorbent assay (ELISA) in all subjects. The integrated backscatter score (IBS) was measured in the left ventricular myocardium using echocardiography. The serum levels of adiponectin in hypertensive patients were significantly lower than those in the normal control group ((2.69±1.0) μg/ml vs. (4.21±2.89) μg/ml, respectively, P<0.001). The serum levels of type-I procollagen carboxyl end peptide (PICP) and type-III procollagen ammonia cardinal extremity peptide (PIIINP) in the hypertension group were significantly higher than those in the control group. In the hypertension group, serum levels of adiponectin were significantly and negatively related to the average acoustic intensity and corrected acoustic intensity of the myocardium (r=0.46 and 0.61, respectively, P<0.05 for both). The serum levels of PICP and PIIINP were significantly different among the three genotypes of SNP +45 (P<0.01). Logistic regression analyses showed that sex and genotype (GG+GT) were the major risk factors of myocardial fibrosis in hypertensive patients (OR=5.343 and 3.278, respectively, P<0.05). These data suggest that lower levels of adiponectin and SNP +45 polymorphism of the adiponectin gene are likely to play an important role in myocardial fibrosis in

  1. High-throughput genomics in sorghum: from whole-genome resequencing to a SNP screening array.

    PubMed

    Bekele, Wubishet A; Wieckhorst, Silke; Friedt, Wolfgang; Snowdon, Rod J

    2013-12-01

    With its small, diploid and completely sequenced genome, sorghum (Sorghum bicolor L. Moench) is highly amenable to genomics-based breeding approaches. Here, we describe the development and testing of a robust single-nucleotide polymorphism (SNP) array platform that enables polymorphism screening for genome-wide and trait-linked polymorphisms in genetically diverse S. bicolor populations. Whole-genome sequences with 6× to 12× coverage from five genetically diverse S. bicolor genotypes, including three sweet sorghums and two grain sorghums, were aligned to the sorghum reference genome. From over 1 million high-quality SNPs, we selected 2124 Infinium Type II SNPs that were informative in all six source genomes, gave an optimal Assay Design Tool (ADT) score, had allele frequencies of 50% in the six genotypes and were evenly spaced throughout the S. bicolor genome. Furthermore, by phenotype-based pool sequencing, we selected an additional 876 SNPs with a phenotypic association to early-stage chilling tolerance, a key trait for European sorghum breeding. The 3000 attempted bead types were used to populate half of a dual-species Illumina iSelect SNP array. The array was tested using 564 Sorghum spp. genotypes, including offspring from four unrelated recombinant inbred line (RIL) and F2 populations and a genetic diversity collection. A high call rate of over 80% enabled validation of 2620 robust and polymorphic sorghum SNPs, underlining the efficiency of the array development scheme for whole-genome SNP selection and screening, with diverse applications including genetic mapping, genome-wide association studies and genomic selection. © 2013 Society for Experimental Biology, Association of Applied Biologists and John Wiley & Sons Ltd.

  2. Amerindians show association to obesity with adiponectin gene SNP45 and SNP276: population genetics of a food intake control and "thrifty" gene.

    PubMed

    Arnaiz-Villena, Antonio; Fernández-Honrado, Mercedes; Rey, Diego; Enríquez-de-Salamanca, Mercedes; Abd-El-Fatah-Khalil, Sedeka; Arribas, Ignacio; Coca, Carmen; Algora, Manuel; Areces, Cristina

    2013-02-01

    Adiponectin gene polymorphisms SNP45 and SNP276 have been related to metabolic syndrome (MS) and related pathologies, including obesity. However, reported associations are contradictory and depend on which population is studied. In the present study, these adiponectin SNPs are studied for the first time in Amerindians. Allele frequencies are obtained and comparisons with obesity and other MS-related parameters are performed. Amerindians were also defined by characteristic HLA genes. Our main results are: (1) SNP276 T is associated with low diastolic blood pressure in Amerindians, (2) the SNP45 G allele is correlated with obesity in female but not in male Amerindians, (3) the SNP45/SNP276 T/G haplotype in total obese/non-obese subjects tends to show a linkage with non-obese Amerindians, (4) the SNP45/SNP276 T/T haplotype is linked to obese Amerindian males. Also, a world population study is carried out, finding that SNP45 T and SNP276 T alleles are the most frequent in African Blacks and are found at significantly lower frequencies in Europeans and Asians. This, together with the linkage of this haplotype to obese Amerindian males, suggests that evolutionary forces related to famine (or population density in relation to available food) may have shaped world population adiponectin polymorphism frequencies.

  3. The Usage of an SNP-SNP Relationship Matrix for Best Linear Unbiased Prediction (BLUP) Analysis Using a Community-Based Cohort Study

    PubMed Central

    Lee, Young-Sup; Kim, Hyeon-Jeong; Cho, Seoae

    2014-01-01

    Best linear unbiased prediction (BLUP) has been used to estimate the fixed effects and random effects of complex traits. Traditionally, genomic relationship matrix-based (GRM) and random marker-based BLUP analyses are prevalent to estimate the genetic values of complex traits. We used three methods: GRM-based prediction (G-BLUP), random marker-based prediction using an identity matrix (so-called single-nucleotide polymorphism [SNP]-BLUP), and SNP-SNP variance-covariance matrix (so-called SNP-GBLUP). We used 35,675 SNPs and R package "rrBLUP" for the BLUP analysis. The SNP-SNP relationship matrix was calculated using the GRM and Sherman-Morrison-Woodbury lemma. The SNP-GBLUP result was very similar to G-BLUP in the prediction of genetic values. However, there were many discrepancies between SNP-BLUP and the other two BLUPs. SNP-GBLUP has the merit to be able to predict genetic values through SNP effects. PMID:25705167

  4. SNP selection and classification of genome-wide SNP data using stratified sampling random forests.

    PubMed

    Wu, Qingyao; Ye, Yunming; Liu, Yang; Ng, Michael K

    2012-09-01

    For high-dimensional genome-wide association (GWA) case-control data of complex disease, a large portion of single-nucleotide polymorphisms (SNPs) are usually irrelevant to the disease. A simple random sampling method in random forest, using the default mtry parameter to choose the feature subspace, will select too many subspaces without informative SNPs. An exhaustive search for an optimal mtry is often required in order to include useful and relevant SNPs and discard the vast number of non-informative SNPs. However, this is too time-consuming and not favorable in GWA for high-dimensional data. The main aim of this paper is to propose a stratified sampling method for feature subspace selection to generate decision trees in a random forest for GWA high-dimensional data. Our idea is to design an equal-width discretization scheme for informativeness to divide SNPs into multiple groups. In feature subspace selection, we randomly select the same number of SNPs from each group and combine them to form a subspace to generate a decision tree. This stratified sampling procedure makes sure each subspace contains enough useful SNPs, avoids the very high computational cost of an exhaustive search for an optimal mtry, and maintains the randomness of a random forest. We employ two genome-wide SNP data sets (Parkinson case-control data comprised of 408 803 SNPs and Alzheimer case-control data comprised of 380 157 SNPs) to demonstrate that the proposed stratified sampling method is effective, and that it can generate better random forests with higher accuracy and lower error bound than those produced by Breiman's random forest generation method. For the Parkinson data, we also show some interesting genes identified by the method, which may be associated with neurological disorders, for further biological investigation.
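    The subspace selection described in this abstract (equal-width bins over an informativeness score, then the same number of SNPs drawn from each bin) can be sketched roughly as follows. This is only an illustration, not the authors' implementation: the function name `stratified_subspace`, its parameters, and the form of the `scores` input are assumptions, and the abstract does not say which informativeness measure is used.

```python
import random


def stratified_subspace(scores, n_groups, per_group, rng=None):
    """Draw a feature subspace with equal representation from each score bin.

    scores    -- dict mapping SNP id -> informativeness score (hypothetical input)
    n_groups  -- number of equal-width bins over the score range
    per_group -- number of SNPs to draw from each bin (fewer if a bin is small)
    """
    rng = rng or random.Random()
    lo, hi = min(scores.values()), max(scores.values())
    # Equal-width bins; fall back to width 1.0 when all scores are identical.
    width = (hi - lo) / n_groups or 1.0
    groups = [[] for _ in range(n_groups)]
    for snp, score in scores.items():
        # The maximum score would land in bin n_groups, so clamp it to the last bin.
        idx = min(int((score - lo) / width), n_groups - 1)
        groups[idx].append(snp)
    subspace = []
    for group in groups:
        k = min(per_group, len(group))
        subspace.extend(rng.sample(group, k))
    return subspace


# Ten toy SNPs with scores spread evenly over [0, 1], two bins, two SNPs per bin.
toy_scores = {f"snp{i}": i / 9 for i in range(10)}
picked = stratified_subspace(toy_scores, n_groups=2, per_group=2, rng=random.Random(0))
```

    Each decision tree in the forest would call such a function once to obtain its own subspace, so randomness is kept within every bin while no bin is left out.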

  5. Large-Scale SNP Discovery through RNA Sequencing and SNP Genotyping by Targeted Enrichment Sequencing in Cassava (Manihot esculenta Crantz)

    PubMed Central

    Pootakham, Wirulda; Shearman, Jeremy R.; Ruang-areerate, Panthita; Sonthirod, Chutima; Sangsrakru, Duangjai; Jomchai, Nukoon; Yoocha, Thippawan; Triwitayakorn, Kanokporn; Tragoonrung, Somvong; Tangphatsornruang, Sithichoke

    2014-01-01

    Cassava (Manihot esculenta Crantz) is one of the most important crop species being the main source of dietary energy in several countries. Marker-assisted selection has become an essential tool in plant breeding. Single nucleotide polymorphism (SNP) discovery via transcriptome sequencing is an attractive strategy for genome complexity reduction in organisms with large genomes. We sequenced the transcriptome of 16 cassava accessions using the Illumina HiSeq platform and identified 675,559 EST-derived SNP markers. A subset of those markers was subsequently genotyped by capture-based targeted enrichment sequencing in 100 F1 progeny segregating for starch viscosity phenotypes. A total of 2,110 non-redundant SNP markers were used to construct a genetic map. This map encompasses 1,785 cM and consists of 19 linkage groups. A major quantitative trait locus (QTL) controlling starch pasting properties was identified and shown to coincide with the QTL previously reported for this trait. With a high-density SNP-based linkage map presented here, we also uncovered a novel QTL associated with starch pasting time on LG 10. PMID:25551642

  6. High throughput SNP discovery and validation in the pig: towards the development of a high density swine SNP chip

    USDA-ARS's Scientific Manuscript database

    Recent developments in sequencing technology have allowed the generation of millions of short read sequences in a fast and inexpensive way. This enables the cost effective large scale identification of hundreds of thousands of SNPs needed for the development of high density SNP arrays. Currently, a ...

  7. Large-scale SNP discovery through RNA sequencing and SNP genotyping by targeted enrichment sequencing in cassava (Manihot esculenta Crantz).

    PubMed

    Pootakham, Wirulda; Shearman, Jeremy R; Ruang-Areerate, Panthita; Sonthirod, Chutima; Sangsrakru, Duangjai; Jomchai, Nukoon; Yoocha, Thippawan; Triwitayakorn, Kanokporn; Tragoonrung, Somvong; Tangphatsornruang, Sithichoke

    2014-01-01

    Cassava (Manihot esculenta Crantz) is one of the most important crop species being the main source of dietary energy in several countries. Marker-assisted selection has become an essential tool in plant breeding. Single nucleotide polymorphism (SNP) discovery via transcriptome sequencing is an attractive strategy for genome complexity reduction in organisms with large genomes. We sequenced the transcriptome of 16 cassava accessions using the Illumina HiSeq platform and identified 675,559 EST-derived SNP markers. A subset of those markers was subsequently genotyped by capture-based targeted enrichment sequencing in 100 F1 progeny segregating for starch viscosity phenotypes. A total of 2,110 non-redundant SNP markers were used to construct a genetic map. This map encompasses 1,785 cM and consists of 19 linkage groups. A major quantitative trait locus (QTL) controlling starch pasting properties was identified and shown to coincide with the QTL previously reported for this trait. With a high-density SNP-based linkage map presented here, we also uncovered a novel QTL associated with starch pasting time on LG 10.

  8. Development and validation of a novel single nucleotide polymorphism (SNP) panel for genetic analysis of Blastomyces spp. and association analysis.

    PubMed

    Frost, Holly M; Anderson, Jennifer L; Ivacic, Lynn; Sloss, Brian L; Embil, John; Meece, Jennifer K

    2016-09-23

    Single nucleotide polymorphism (SNP) genotyping is increasingly being utilized for molecular typing of pathogens and is cost-effective, especially for large numbers of isolates. The goals of this study were 1) to develop and validate a SNP assay panel for genetic analysis of Blastomyces spp., 2) ascertain whether microsatellite genotyping and the SNP genotyping with the developed panel resolve identical genetic groups, and 3) explore the utility of SNPs for examining phylogenetic and virulence questions in humans. Three hundred sixty unique Blastomyces spp. isolates previously genotyped with microsatellite markers were genotyped with the MassARRAY® SNP genotyping system (Agena Bioscience™, San Diego, CA), for a custom panel of 28 SNPs. Clinical presentation data was analyzed for association with SNP variants. Three hundred twenty-three Blastomyces spp. isolates (90 %) were successfully genotyped by SNP analysis, with results obtained for at least 27 of 28 assays. For 99.7 % of isolates tested by both genotyping methods, microsatellite genetic group assignment correlated with species assignment based on internal transcribed spacer 2 (ITS2) genotyping, with Group 1 (Gr 1) being equivalent to B. gilchristii and Group 2 (Gr 2) being equivalent to B. dermatitidis. Thirteen isolates were genetic hybrids by one or both methods of genotyping and were difficult to assign to a particular genetic group or species. Fifteen SNP loci showed significantly different alleles in cases of pulmonary vs disseminated disease, at a p-value of <0.01 or less. This study is the largest genotyping study of Blastomyces spp. isolates and presents a new method for genetic analysis with which to further explore the relationship between the genetic diversity in Blastomyces spp. and clinical disease presentation. We demonstrated that microsatellite Gr 1 is equivalent to B. gilchristii and Gr 2 is equivalent to B. dermatitidis. We also discovered potential evidence of infrequent recombination

  9. High-throughput SNP genotyping for breeding applications in rice using the BeadXpress platform

    USDA-ARS's Scientific Manuscript database

    Multiplexed single nucleotide polymorphism (SNP) markers have the potential to increase the speed and cost-effectiveness of genotyping, provided that an optimal SNP density is used for each application. To test the efficiency of multiplexed SNP genotyping for diversity, mapping and breeding applicat...

  10. Development of Single Nucleotide Polymorphism (SNP) Markers for Use in Commercial Maize (Zea Mays L.) Germplasm

    USDA-ARS's Scientific Manuscript database

    The development of single nucleotide polymorphism (SNP) markers in maize offer the opportunity to utilize DNA markers in many new areas of population genetics, gene discovery, plant breeding, and germplasm identification. However, the steps from sequencing and SNP discovery to SNP marker design and ...

  11. Validation of a Cost-Efficient Multi-Purpose SNP Panel for Disease Based Research

    PubMed Central

    Hou, Liping; Phillips, Christopher; Azaro, Marco; Brzustowicz, Linda M.; Bartlett, Christopher W.

    2011-01-01

    Background Here we present convergent methodologies using theoretical calculations, empirical assessment on in-house and publicly available datasets as well as in silico simulations, that validate a panel of SNPs for a variety of necessary tasks in human genetics disease research before resources are committed to larger-scale genotyping studies on those samples. While large-scale well-funded human genetic studies routinely have up to a million SNP genotypes, samples in a human genetics laboratory that are not yet part of such studies may be productively utilized in pilot projects or as part of targeted follow-up work though such smaller scale applications require at least some genome-wide genotype data for quality control purposes such as DNA “barcoding” to detect swaps or contamination issues, determining familial relationships between samples and correcting biases due to population effects such as population stratification in pilot studies. Principal Findings Empirical performance in classification of relative types for any two given DNA samples (e.g., full siblings, parental, etc) indicated that for outbred populations the panel performs sufficiently to classify relationship in extended families and therefore also for smaller structures such as trios and for twin zygosity testing. Additionally, familial relationships do not significantly diminish the (mean match) probability of sharing SNP genotypes in pedigrees, further indicating the uniqueness of the “barcode.” Simulation using these SNPs for an African American case-control disease association study demonstrated that population stratification, even in complex admixed samples, can be adequately corrected under a range of disease models using the SNP panel. Conclusion The panel has been validated for use in a variety of human disease genetics research tasks including sample barcoding, relationship verification, population substructure detection and statistical correction. Given the ease of genotyping

  12. SNP identification and SNAP marker development for a GmNARK gene controlling supernodulation in soybean.

    PubMed

    Kim, M Y; Van, K; Lestari, P; Moon, J-K; Lee, S-H

    2005-04-01

    Supernodulation in soybean (Glycine max L. Merr.) is an important source of nitrogen supply to subterranean ecological systems. Single nucleotide-amplified polymorphism (SNAP) markers for supernodulation should allow rapid screening of the trait in early growth stages, without the need for inoculation and phenotyping. The gene GmNARK (Glycine max nodule autoregulation receptor kinase), controlling autoregulation of nodulation, was found to have a single nucleotide polymorphism (SNP) between the wild-type cultivar Sinpaldalkong 2 and its supernodulating mutant, SS2-2. Transversion of A to T at the 959-bp position of the GmNARK sequence results in a change of lysine (AAG) to a stop codon (TAG), thus terminating its translation in SS2-2. Based on the identified SNP in GmNARK, five primer pairs specific to each allele were designed using the WebSnaper program to develop a SNAP marker for supernodulation. One A-specific primer pair produced a band present in only Sinpaldalkong 2, while two T-specific pairs showed a band in only SS2-2. Both complementary PCRs, each using one allele-specific primer pair, were performed to genotype supernodulation in F2 progeny of Sinpaldalkong 2 x SS2-2. Among 28 individuals with the normal phenotype, eight individuals having only the A-allele-specific band were homozygous and normal, while 20 individuals were found to be heterozygous at the SNP, having both A and T bands. Twelve supernodulating individuals showed only the band specific to the T allele. This SNAP marker for supernodulation could easily be analyzed through simple PCR and agarose gel electrophoresis. Therefore, use of this SNAP marker might be faster, cheaper, and more reproducible than using other genotyping methods, such as a cleaved amplified polymorphic sequence marker, which requires restriction enzymes.

  13. Role of an SNP in Alternative Splicing of Bovine NCF4 and Mastitis Susceptibility.

    PubMed

    Ju, Zhihua; Wang, Changfa; Wang, Xiuge; Yang, Chunhong; Sun, Yan; Jiang, Qiang; Wang, Fei; Li, Mengjiao; Zhong, Jifeng; Huang, Jinming

    2015-01-01

    Neutrophil cytosolic factor 4 (NCF4) is a component of the nicotinamide adenine dinucleotide phosphate oxidase complex, a key factor in biochemical pathways and innate immune responses. In this study, splice variants and functional single-nucleotide polymorphism (SNP) of NCF4 were identified to determine the variability and association of the gene with susceptibility to bovine mastitis characterized by inflammation. A novel splice variant, designated as NCF4-TV and characterized by the retention of a 48 bp sequence in intron 9, was detected in the mammary gland tissues of infected cows. The expression of the NCF4-reference main transcript in the mastitic mammary tissues was higher than that in normal tissues. A novel SNP, g.18174 A>G, was also found in the retained 48 bp region of intron 9. To determine whether NCF4-TV could be due to the g.18174 A>G mutation, we constructed two mini-gene expression vectors with the wild-type or mutant NCF4 g.18174 A>G fragment. The vectors were then transiently transfected into 293T cells, and alternative splicing of NCF4 was analyzed by reverse transcription-PCR and sequencing. Mini-gene splicing assay demonstrated that the aberrantly spliced NCF4-TV with 48 bp retained fragment in intron 9 could be due to g.18174 A>G, which was associated with milk somatic count score and increased risk of mastitis infection in cows. NCF4 expression was also regulated by alternative splicing. This study proposes that NCF4 splice variants generated by functional SNP are important risk factors for mastitis susceptibility in dairy cows.

  14. Role of an SNP in Alternative Splicing of Bovine NCF4 and Mastitis Susceptibility

    PubMed Central

    Wang, Xiuge; Yang, Chunhong; Sun, Yan; Jiang, Qiang; Wang, Fei; Li, Mengjiao; Zhong, Jifeng; Huang, Jinming

    2015-01-01

    Neutrophil cytosolic factor 4 (NCF4) is a component of the nicotinamide adenine dinucleotide phosphate oxidase complex, a key factor in biochemical pathways and innate immune responses. In this study, splice variants and functional single-nucleotide polymorphism (SNP) of NCF4 were identified to determine the variability and association of the gene with susceptibility to bovine mastitis characterized by inflammation. A novel splice variant, designated as NCF4-TV and characterized by the retention of a 48 bp sequence in intron 9, was detected in the mammary gland tissues of infected cows. The expression of the NCF4-reference main transcript in the mastitic mammary tissues was higher than that in normal tissues. A novel SNP, g.18174 A>G, was also found in the retained 48 bp region of intron 9. To determine whether NCF4-TV could be due to the g.18174 A>G mutation, we constructed two mini-gene expression vectors with the wild-type or mutant NCF4 g.18174 A>G fragment. The vectors were then transiently transfected into 293T cells, and alternative splicing of NCF4 was analyzed by reverse transcription-PCR and sequencing. Mini-gene splicing assay demonstrated that the aberrantly spliced NCF4-TV with 48 bp retained fragment in intron 9 could be due to g.18174 A>G, which was associated with milk somatic count score and increased risk of mastitis infection in cows. NCF4 expression was also regulated by alternative splicing. This study proposes that NCF4 splice variants generated by functional SNP are important risk factors for mastitis susceptibility in dairy cows. PMID:26600390

  15. Novel quantitative real-time LCR for the sensitive detection of SNP frequencies in pooled DNA: method development, evaluation and application.

    PubMed

    Psifidi, Androniki; Dovas, Chrysostomos; Banos, Georgios

    2011-01-19

    Single nucleotide polymorphisms (SNP) have proven to be powerful genetic markers for genetic applications in medicine, life science and agriculture. A variety of methods exist for SNP detection but few can quantify SNP frequencies when the mutated DNA molecules correspond to a small fraction of the wild-type DNA. Furthermore, there is no generally accepted gold standard for SNP quantification, and, in general, currently applied methods give inconsistent results in selected cohorts. In the present study we sought to develop a novel method for accurate detection and quantification of SNP in DNA pooled samples. The development and evaluation of a novel Ligase Chain Reaction (LCR) protocol that uses a DNA-specific fluorescent dye to allow quantitative real-time analysis is described. Different reaction components and thermocycling parameters affecting the efficiency and specificity of LCR were examined. Several protocols, including gap-LCR modifications, were evaluated using plasmid standard and genomic DNA pools. A protocol of choice was identified and applied for the quantification of a polymorphism at codon 136 of the ovine PRNP gene that is associated with susceptibility to a transmissible spongiform encephalopathy in sheep. The real-time LCR protocol developed in the present study showed high sensitivity, accuracy, reproducibility and a wide dynamic range of SNP quantification in different DNA pools. The limits of detection and quantification of SNP frequencies were 0.085% and 0.35%, respectively. The proposed real-time LCR protocol is applicable when sensitive detection and accurate quantification of low copy number mutations in DNA pools is needed. Examples include oncogenes and tumour suppressor genes, infectious diseases, pathogenic bacteria, fungal species, viral mutants, drug resistance resulting from point mutations, and genetically modified organisms in food.

  16. Novel Quantitative Real-Time LCR for the Sensitive Detection of SNP Frequencies in Pooled DNA: Method Development, Evaluation and Application

    PubMed Central

    Psifidi, Androniki; Dovas, Chrysostomos; Banos, Georgios

    2011-01-01

    Background Single nucleotide polymorphisms (SNP) have proven to be powerful genetic markers for genetic applications in medicine, life science and agriculture. A variety of methods exist for SNP detection but few can quantify SNP frequencies when the mutated DNA molecules correspond to a small fraction of the wild-type DNA. Furthermore, there is no generally accepted gold standard for SNP quantification, and, in general, currently applied methods give inconsistent results in selected cohorts. In the present study we sought to develop a novel method for accurate detection and quantification of SNP in DNA pooled samples. Methods The development and evaluation of a novel Ligase Chain Reaction (LCR) protocol that uses a DNA-specific fluorescent dye to allow quantitative real-time analysis is described. Different reaction components and thermocycling parameters affecting the efficiency and specificity of LCR were examined. Several protocols, including gap-LCR modifications, were evaluated using plasmid standard and genomic DNA pools. A protocol of choice was identified and applied for the quantification of a polymorphism at codon 136 of the ovine PRNP gene that is associated with susceptibility to a transmissible spongiform encephalopathy in sheep. Conclusions The real-time LCR protocol developed in the present study showed high sensitivity, accuracy, reproducibility and a wide dynamic range of SNP quantification in different DNA pools. The limits of detection and quantification of SNP frequencies were 0.085% and 0.35%, respectively. Significance The proposed real-time LCR protocol is applicable when sensitive detection and accurate quantification of low copy number mutations in DNA pools is needed. Examples include oncogenes and tumour suppressor genes, infectious diseases, pathogenic bacteria, fungal species, viral mutants, drug resistance resulting from point mutations, and genetically modified organisms in food. PMID:21283808

  17. Pyrobayes: an improved base caller for SNP discovery in pyrosequences.

    PubMed

    Quinlan, Aaron R; Stewart, Donald A; Strömberg, Michael P; Marth, Gábor T

    2008-02-01

    Previously reported applications of the 454 Life Sciences pyrosequencing technology have relied on deep sequence coverage for accurate polymorphism discovery because of frequent insertion and deletion sequence errors. Here we report a new base calling program, Pyrobayes, for pyrosequencing reads. Pyrobayes permits accurate single-nucleotide polymorphism (SNP) calling in resequencing applications, even in shallow read coverage, primarily because it produces more confident base calls than the native base calling program.

  18. Introgression browser: high-throughput whole-genome SNP visualization.

    PubMed

    Aflitos, Saulo Alves; Sanchez-Perez, Gabino; de Ridder, Dick; Fransz, Paul; Schranz, Michael E; de Jong, Hans; Peters, Sander A

    2015-04-01

    Breeding by introgressive hybridization is a pivotal strategy to broaden the genetic basis of crops. Usually, the desired traits are monitored in consecutive crossing generations by marker-assisted selection, but their analyses fail in chromosome regions where crossover recombinants are rare or not viable. Here, we present the Introgression Browser (iBrowser), a bioinformatics tool aimed at visualizing introgressions at nucleotide or SNP (Single Nucleotide Polymorphisms) accuracy. The software selects homozygous SNPs from Variant Call Format (VCF) information and filters out heterozygous SNPs, multi-nucleotide polymorphisms (MNPs) and insertion-deletions (InDels). For data analysis iBrowser makes use of sliding windows, but if needed it can generate any desired fragmentation pattern through General Feature Format (GFF) information. In an example of tomato (Solanum lycopersicum) accessions we visualize SNP patterns and elucidate both position and boundaries of the introgressions. We also show that our tool is capable of identifying alien DNA in a panel of the closely related S. pimpinellifolium by examining phylogenetic relationships of the introgressed segments in tomato. In a third example, we demonstrate the power of the iBrowser in a panel of 597 Arabidopsis accessions, detecting the boundaries of a SNP-free region around a polymorphic 1.17 Mbp inverted segment on the short arm of chromosome 4. The architecture and functionality of iBrowser makes the software appropriate for a broad set of analyses including SNP mining, genome structure analysis, and pedigree analysis. Its functionality, together with the capability to process large data sets and efficient visualization of sequence variation, makes iBrowser a valuable breeding tool. © 2015 The Authors The Plant Journal © 2015 John Wiley & Sons Ltd.
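    The filtering step named in this abstract (keep homozygous SNPs from VCF input, drop heterozygous calls, MNPs and InDels) could look roughly like the sketch below. It is not iBrowser's code: the function name `keep_homozygous_snps` and its parameters are hypothetical, and skipping homozygous-reference calls is an assumption, since the abstract does not say how such sites are handled.

```python
def keep_homozygous_snps(vcf_lines, sample_index=0):
    """Yield (chrom, pos, ref, alt) for sites where one sample is a homozygous SNP.

    vcf_lines    -- iterable of tab-separated VCF text lines (headers allowed)
    sample_index -- which sample column to read, counting from 0
    """
    for line in vcf_lines:
        if line.startswith("#"):
            continue  # meta-information and column header lines
        fields = line.rstrip("\n").split("\t")
        chrom, pos, ref, alt, fmt = fields[0], fields[1], fields[3], fields[4], fields[8]
        alleles = [ref] + alt.split(",")
        # MNPs, InDels and symbolic alleles are anything other than one plain base.
        if any(len(a) != 1 or a not in "ACGT" for a in alleles):
            continue
        keys = fmt.split(":")
        if "GT" not in keys:
            continue
        genotype = fields[9 + sample_index].split(":")[keys.index("GT")]
        calls = genotype.replace("|", "/").split("/")
        # Drop missing calls and heterozygous calls (more than one distinct allele).
        if "." in calls or len(set(calls)) != 1:
            continue
        if calls[0] == "0":
            continue  # assumption: homozygous reference is not reported as a SNP
        yield chrom, int(pos), ref, alleles[int(calls[0])]


example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1",
    "chr1\t100\t.\tA\tG\t50\tPASS\t.\tGT\t1/1",       # homozygous SNP: kept
    "chr1\t200\t.\tC\tT\t50\tPASS\t.\tGT\t0/1",       # heterozygous: dropped
    "chr1\t300\t.\tAT\tA\t50\tPASS\t.\tGT\t1/1",      # deletion: dropped
    "chr1\t400\t.\tGC\tTA\t50\tPASS\t.\tGT\t1/1",     # MNP: dropped
    "chr1\t500\t.\tT\tC,G\t50\tPASS\t.\tGT:DP\t2|2:10",  # homozygous for 2nd ALT: kept
]
kept = list(keep_homozygous_snps(example))
```

    In this toy input only the sites at positions 100 and 500 survive. The surviving sites would then be counted per sliding window or per GFF-defined segment, which the sketch leaves out.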

  19. Gene-based SNP discovery and genetic mapping in pea.

    PubMed

    Sindhu, Anoop; Ramsay, Larissa; Sanderson, Lacey-Anne; Stonehouse, Robert; Li, Rong; Condie, Janet; Shunmugam, Arun S K; Liu, Yong; Jha, Ambuj B; Diapari, Marwan; Burstin, Judith; Aubert, Gregoire; Tar'an, Bunyamin; Bett, Kirstin E; Warkentin, Thomas D; Sharpe, Andrew G

    2014-10-01

    Gene-based SNPs were identified and mapped in pea using five recombinant inbred line populations segregating for traits of agronomic importance. Pea (Pisum sativum L.) is one of the world's oldest domesticated crops and has been a model system in plant biology and genetics since the work of Gregor Mendel. Pea is the second most widely grown pulse crop in the world following common bean. The importance of pea as a food crop is growing due to its combination of moderate protein concentration, slowly digestible starch, high dietary fiber concentration, and its richness in micronutrients; however, pea has lagged behind other major crops in harnessing recent advances in molecular biology, genomics and bioinformatics, partly due to its large genome size with a large proportion of repetitive sequence, and to the relatively limited investment in research in this crop globally. The objective of this research was the development of a genome-wide transcriptome-based pea single-nucleotide polymorphism (SNP) marker platform using next-generation sequencing technology. A total of 1,536 polymorphic SNP loci selected from over 20,000 non-redundant SNPs identified using deep transcriptome sequencing of eight diverse Pisum accessions were used for genotyping in five RIL populations using an Illumina GoldenGate assay. The first high-density pea SNP map defining all seven linkage groups was generated by integrating with previously published anchor markers. Syntenic relationships of this map with the model legume Medicago truncatula and lentil (Lens culinaris Medik.) maps were established. The genic SNP map establishes a foundation for future molecular breeding efforts by enabling both the identification and tracking of introgression of genomic regions harbouring QTLs related to agronomic and seed quality traits.

  20. Multi-SNP Haplotype Analysis Methods for Association Analysis.

    PubMed

    Stram, Daniel O

    2017-01-01

    Haplotype analysis forms the basis of much of genetic association analysis using both related and unrelated individuals (we concentrate on unrelated). For example, haplotype analysis indirectly underlies the SNP imputation methods that are used for testing trait associations with known but unmeasured variants and for performing collaborative post-GWAS meta-analysis. This chapter is focused on the direct use of haplotypes in association testing. It reviews the rationale for haplotype-based association testing, discusses statistical issues related to haplotype uncertainty that affect the analysis, then gives practical guidance for testing haplotype-based associations with phenotype or outcome trait, first of candidate gene regions and then for the genome as a whole. Haplotypes are interesting for two reasons, first they may be in closer LD with a causal variant than any single measured SNP, and therefore may enhance the coverage value of the genotypes over single SNP analysis. Second, haplotypes may themselves be the causal variants of interest and some solid examples of this have appeared in the literature. This chapter discusses three possible approaches to incorporation of SNP haplotype analysis into generalized linear regression models: (1) a simple substitution method involving imputed haplotypes, (2) simultaneous maximum likelihood (ML) estimation of all parameters, including haplotype frequencies and regression parameters, and (3) a simplified approximation to full ML for case-control data. Examples of the various approaches for a haplotype analysis of a candidate gene are provided. We compare the behavior of the approximation-based methods and argue that in most instances the simpler methods hold up well in practice. We also describe the practical implementation of haplotype risk estimation genome-wide and discuss several shortcuts that can be used to speed up otherwise potentially very intensive computational requirements.

  1. Development of a forensic identity SNP panel for Indonesia.

    PubMed

    Augustinus, Daniel; Gahan, Michelle E; McNevin, Dennis

    2015-07-01

    Genetic markers included in forensic identity panels must exhibit Hardy-Weinberg and linkage equilibrium (HWE and LE). "Universal" panels designed for global use can fail these tests in regional jurisdictions exhibiting high levels of genetic differentiation such as the Indonesian archipelago. This is especially the case where a single DNA database is required for allele frequency estimates to calculate random match probabilities (RMPs) and associated likelihood ratios (LRs). A panel of 65 single nucleotide polymorphisms (SNPs) and a reduced set of 52 SNPs have been selected from 15 Indonesian subpopulations in the HUGO Pan Asian SNP database using a SNP selection strategy that could be applied to any panel of forensic identity markers. The strategy consists of four screening steps: (1) application of a G test for HWE; (2) ranking for high heterozygosity; (3) selection for LE; and (4) selection for low inbreeding depression. SNPs in our Indonesian panel perform well in comparison to some other universal SNP and short tandem repeat (STR) panels as measured by Fisher's exact test for HWE and LE and Wright's F statistics.
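
    The first screening step named in this record, a G test for HWE, can be illustrated with a minimal sketch. It assumes a biallelic SNP and a 1-degree-of-freedom chi-square reference distribution; the function name and the genotype counts are hypothetical and are not taken from the paper.

```python
import math

def hwe_g_test(n_aa, n_ab, n_bb):
    """Likelihood-ratio (G) test for Hardy-Weinberg equilibrium at one biallelic SNP.

    Takes observed genotype counts and returns (G statistic, p-value).
    The p-value assumes a chi-square distribution with 1 degree of freedom.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("at least one genotype is required")
    p = (2 * n_aa + n_ab) / (2 * n)          # frequency of allele A
    q = 1.0 - p
    observed = [n_aa, n_ab, n_bb]
    expected = [n * p * p, 2 * n * p * q, n * q * q]
    # Cells with zero observations contribute nothing to G.
    g = 2.0 * sum(o * math.log(o / e) for o, e in zip(observed, expected) if o > 0)
    g = max(g, 0.0)                          # guard against tiny negative rounding error
    # Survival function of chi-square with 1 df: P(X > g) = erfc(sqrt(g / 2)).
    p_value = math.erfc(math.sqrt(g / 2.0))
    return g, p_value

# Hypothetical counts: the first set matches HWE exactly, the second has no heterozygotes.
print(hwe_g_test(25, 50, 25))
print(hwe_g_test(50, 0, 50))
```

    A marker would be dropped at this step when its p-value falls below whatever threshold the panel designer chooses; the record does not state the threshold used.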

  2. Development of SNP-genotyping arrays in two shellfish species.

    PubMed

    Lapègue, S; Harrang, E; Heurtebise, S; Flahauw, E; Donnadieu, C; Gayral, P; Ballenghien, M; Genestout, L; Barbotte, L; Mahla, R; Haffray, P; Klopp, C

    2014-07-01

    Use of SNPs has been favoured due to their abundance in plant and animal genomes, accompanied by the falling cost and rising throughput capacity for detection and genotyping. Here, we present in vitro (obtained from targeted sequencing) and in silico discovery of SNPs, and the design of medium-throughput genotyping arrays for two oyster species, the Pacific oyster, Crassostrea gigas, and European flat oyster, Ostrea edulis. Two sets of 384 SNP markers were designed for two Illumina GoldenGate arrays and genotyped on more than 1000 samples for each species. In each case, oyster samples were obtained from wild and selected populations and from three-generation families segregating for traits of interest in aquaculture. The rate of successfully genotyped polymorphic SNPs was about 60% for each species. Effects of SNP origin and quality on genotyping success (Illumina functionality Score) were analysed and compared with other model and nonmodel species. Furthermore, a simulation was made based on a subset of the C. gigas SNP array with a minor allele frequency of 0.3 and typical crosses used in shellfish hatcheries. This simulation indicated that at least 150 markers were needed to perform an accurate parental assignment. Such panels might provide valuable tools to improve our understanding of the connectivity between wild (and selected) populations and could contribute to future selective breeding programmes.

  3. Rapid SNP Discovery and Genetic Mapping Using Sequenced RAD Markers

    PubMed Central

    Atwood, Tressa S.; Currey, Mark C.; Shiver, Anthony L.; Lewis, Zachary A.; Selker, Eric U.; Cresko, William A.; Johnson, Eric A.

    2008-01-01

    Single nucleotide polymorphism (SNP) discovery and genotyping are essential to genetic mapping. There remains a need for a simple, inexpensive platform that allows high-density SNP discovery and genotyping in large populations. Here we describe the sequencing of restriction-site associated DNA (RAD) tags, which identified more than 13,000 SNPs, and mapped three traits in two model organisms, using less than half the capacity of one Illumina sequencing run. We demonstrated that different marker densities can be attained by choice of restriction enzyme. Furthermore, we developed a barcoding system for sample multiplexing and fine mapped the genetic basis of lateral plate armor loss in threespine stickleback by identifying recombinant breakpoints in F2 individuals. Barcoding also facilitated mapping of a second trait, a reduction of pelvic structure, by in silico re-sorting of individuals. To further demonstrate the ease of the RAD sequencing approach we identified polymorphic markers and mapped an induced mutation in Neurospora crassa. Sequencing of RAD markers is an integrated platform for SNP discovery and genotyping. This approach should be widely applicable to genetic mapping in a variety of organisms. PMID:18852878

  4. Mutations of C-Reactive Protein (CRP) -286 SNP, APC and p53 in Colorectal Cancer: Implication for a CRP-Wnt Crosstalk

    PubMed Central

    Cheng, Jin; Zhang, Shi-Chao; Hui, Feng; Chen, Xue-Zhong; Liu, Shan-Hui; Liu, Qin-Jiang; Zhu, Zi-Jiang; Hu, Qing-Rong; Wu, Yi; Ji, Shang-Rong

    2014-01-01

    C-reactive protein (CRP) is an established marker of inflammation with pattern-recognition receptor-like activities. Despite the close association of the serum level of CRP with the risk and prognosis of several types of cancer, it remains elusive whether CRP contributes directly to tumorigenesis or just represents a bystander marker. We have recently identified recurrent mutations at the SNP position -286 (rs3091244) in the promoter of CRP gene in several tumor types, instead suggesting that locally produced CRP is a potential driver of tumorigenesis. However, it is unknown whether the -286 site is the sole SNP position of CRP gene targeted for mutation and whether there is any association between CRP SNP mutations and other frequently mutated genes in tumors. Herein, we have examined the genotypes of three common CRP non-coding SNPs (rs7553007, rs1205, rs3093077) in tumor/normal sample pairs of 5 cancer types (n = 141). No recurrent somatic mutations are found at these SNP positions, indicating that the -286 SNP mutations are preferentially selected during the development of cancer. Further analysis reveals that the -286 SNP mutations of CRP tend to co-occur with mutated APC particularly in rectal cancer (p = 0.04; n = 67). By contrast, mutations of CRP and p53 or K-ras appear to be unrelated. These results thus underscore the functional importance of the -286 mutation of CRP in tumorigenesis and imply an interaction between CRP and Wnt signaling pathway. PMID:25025473

  5. mrsFAST-Ultra: a compact, SNP-aware mapper for high performance sequencing applications.

    PubMed

    Hach, Faraz; Sarrafi, Iman; Hormozdiari, Farhad; Alkan, Can; Eichler, Evan E; Sahinalp, S Cenk

    2014-07-01

    High throughput sequencing (HTS) platforms generate unprecedented amounts of data that introduce challenges for processing and downstream analysis. While tools that report the 'best' mapping location of each read provide a fast way to process HTS data, they are not suitable for many types of downstream analysis such as structural variation detection, where it is important to report multiple mapping loci for each read. For this purpose we introduce mrsFAST-Ultra, a fast, cache oblivious, SNP-aware aligner that can handle the multi-mapping of HTS reads very efficiently. mrsFAST-Ultra improves mrsFAST, our first cache oblivious read aligner capable of handling multi-mapping reads, through new and compact index structures that reduce not only the overall memory usage but also the number of CPU operations per alignment. In fact the size of the index generated by mrsFAST-Ultra is 10 times smaller than that of mrsFAST. As importantly, mrsFAST-Ultra introduces new features such as being able to (i) obtain the best mapping loci for each read, and (ii) return all reads that have at most n mapping loci (within an error threshold), together with these loci, for any user specified n. Furthermore, mrsFAST-Ultra is SNP-aware, i.e. it can map reads to reference genome while discounting the mismatches that occur at common SNP locations provided by db-SNP; this significantly increases the number of reads that can be mapped to the reference genome. Notice that all of the above features are implemented within the index structure and are not simple post-processing steps and thus are performed highly efficiently. Finally, mrsFAST-Ultra utilizes multiple available cores and processors and can be tuned for various memory settings. Our results show that mrsFAST-Ultra is roughly five times faster than its predecessor mrsFAST. In comparison to newly enhanced popular tools such as Bowtie2, it is more sensitive (it can report 10 times or more mappings per read) and much faster (six times or
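
    The idea of SNP-aware mapping described in this record, discounting mismatches that fall on known SNP positions, can be sketched as a simple mismatch counter. This is only an illustration of the concept: the record states that mrsFAST-Ultra implements the feature inside its index structure, which this sketch does not attempt. The function name, the toy sequences and the SNP table are hypothetical.

```python
def snp_aware_mismatches(read, reference, ref_start, known_snps):
    """Count mismatches between a read and the reference, ignoring known SNP alleles.

    known_snps maps a 0-based reference position to the set of alternate
    alleles recorded for that position (for example, taken from a SNP database).
    A read base that differs from the reference but matches a recorded
    alternate allele is not counted as a mismatch.
    """
    window = reference[ref_start:ref_start + len(read)]
    if len(window) < len(read):
        raise ValueError("read runs past the end of the reference")
    mismatches = 0
    for offset, (read_base, ref_base) in enumerate(zip(read, window)):
        if read_base == ref_base:
            continue
        if read_base in known_snps.get(ref_start + offset, ()):
            continue  # difference explained by a known SNP allele
        mismatches += 1
    return mismatches

reference = "ACGTACGT"
read = "ACCTAC"                       # differs from the reference at position 2 (C vs G)
print(snp_aware_mismatches(read, reference, 0, {}))          # plain mismatch count
print(snp_aware_mismatches(read, reference, 0, {2: {"C"}}))  # SNP-aware count
```

    With an error threshold of zero, the read above would be rejected by the plain count and accepted by the SNP-aware count, which is how discounting known SNPs can raise the number of mappable reads.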

  6. Multiple SNP-sets Analysis for Genome-wide Association Studies through Bayesian Latent Variable Selection

    PubMed Central

    Lu, Zhaohua; Zhu, Hongtu; Knickmeyer, Rebecca C; Sullivan, Patrick F.; Stephanie, Williams N.; Zou, Fei

    2015-01-01

    The power of genome-wide association studies (GWAS) for mapping complex traits with single SNP analysis may be undermined by modest SNP effect sizes, unobserved causal SNPs, correlation among adjacent SNPs, and SNP-SNP interactions. Alternative approaches for testing the association between a single SNP-set and individual phenotypes have been shown to be promising for improving the power of GWAS. We propose a Bayesian latent variable selection (BLVS) method to simultaneously model the joint association mapping between a large number of SNP-sets and complex traits. Compared to single SNP-set analysis, such joint association mapping not only accounts for the correlation among SNP-sets, but also is capable of detecting causal SNP-sets that are marginally uncorrelated with traits. The spike-slab prior assigned to the effects of SNP-sets can greatly reduce the dimension of effective SNP-sets, while speeding up computation. An efficient MCMC algorithm is developed. Simulations demonstrate that BLVS outperforms several competing variable selection methods in some important scenarios. PMID:26515609

  7. Detection of single nucleotide polymorphism (SNP) controlling the waxy character in wheat by using a derived cleaved amplified polymorphic sequence (dCAPS) marker.

    PubMed

    Yanagisawa, T; Kiribuchi-Otobe, C; Hirano, H; Suzuki, Y; Fujita, M

    2003-06-01

    We investigated a single nucleotide polymorphism (SNP) in the Wx-D1 gene, which was found in a mutant waxy wheat, and which expressed the Wx-D1 protein (granule-bound starch synthase I) as shown by immunoblot analysis. We also assayed starch synthase activity of granule-bound proteins. Using 22 doubled-haploid (DH) lines and 172 F(5) lines derived from the wild type x the mutant, we detected SNP via a PCR-based (dCAPS) marker. Amplified PCR products from Wx-D1 gene-specific primers, followed by mismatched primers designed for dCAPS analysis, were digested with the appropriate restriction enzyme. The two alleles, and the heterozygote genotype were easily and rapidly discriminated by gel-electrophoresis resolution to reveal SNP. All progeny lines that have the SNP of the mutant allele were waxy. Integrating the results of dCAPS analysis, immunoblot analysis and assays of starch synthase activity of granule-bound proteins indicates that the SNP in the Wx-D1 gene was responsible for its waxy character. This dCAPS marker is therefore useful as a marker to introduce the mutant allele into elite breeding lines.

  8. Population distribution and ancestry of the cancer protective MDM2 SNP285 (rs117039649)

    PubMed Central

    Knappskog, Stian; Gansmo, Liv B.; Dibirova, Khadizha; Metspalu, Andres; Cybulski, Cezary; Peterlongo, Paolo; Aaltonen, Lauri; Vatten, Lars; Romundstad, Pål; Hveem, Kristian; Devilee, Peter; Evans, Gareth D.; Lin, Dongxin; Camp, Guy Van; Manolopoulos, Vangelis G.; Osorio, Ana; Milani, Lili; Ozcelik, Tayfun; Zalloua, Pierre; Mouzaya, Francis; Bliznetz, Elena; Balanovska, Elena; Pocheshkova, Elvira; Kučinskas, Vaidutis; Atramentova, Lubov; Nymadawa, Pagbajabyn; Titov, Konstantin; Lavryashina, Maria; Yusupov, Yuldash; Bogdanova, Natalia; Koshel, Sergey; Zamora, Jorge; Wedge, David C.; Charlesworth, Deborah; Dörk, Thilo; Balanovsky, Oleg; Lønning, Per E.

    2014-01-01

    The MDM2 promoter SNP285C is located on the SNP309G allele. While SNP309G enhances Sp1 transcription factor binding and MDM2 transcription, SNP285C antagonizes Sp1 binding and reduces the risk of breast, ovarian and endometrial cancer. Assessing SNP285 and 309 genotypes across 25 different ethnic populations (>10,000 individuals), the incidence of SNP285C was 6-8% across European populations except for Finns (1.2%) and Saami (0.3%). The incidence decreased towards the Middle-East and Eastern Russia, and SNP285C was absent among Han Chinese, Mongolians and African Americans. Interhaplotype variation analyses estimated SNP285C to have originated about 14,700 years ago (95% CI: 8,300 – 33,300). Both this estimate and the geographical distribution suggest SNP285C to have arisen after the separation between Caucasians and modern day East Asians (17,000 - 40,000 years ago). We observed a strong inverse correlation (r = -0.805; p < 0.001) between the percentage of SNP309G alleles harboring SNP285C and the MAF for SNP309G itself across different populations, suggesting selection and environmental adaptation with respect to MDM2 expression in recent human evolution. In conclusion, we found SNP285C to be a pan-Caucasian variant. Ethnic variation regarding distribution of SNP285C needs to be taken into account when assessing the impact of MDM2 SNPs on cancer risk. PMID:25327560

  9. Fast and Rigorous Computation of Gene and Pathway Scores from SNP-Based Summary Statistics.

    PubMed

    Lamparter, David; Marbach, Daniel; Rueedi, Rico; Kutalik, Zoltán; Bergmann, Sven

    2016-01-01

    Integrating single nucleotide polymorphism (SNP) p-values from genome-wide association studies (GWAS) across genes and pathways is a strategy to improve statistical power and gain biological insight. Here, we present Pascal (Pathway scoring algorithm), a powerful tool for computing gene and pathway scores from SNP-phenotype association summary statistics. For gene score computation, we implemented analytic and efficient numerical solutions to calculate test statistics. We examined in particular the sum and the maximum of chi-squared statistics, which measure the average and the strongest association signals per gene, respectively. For pathway scoring, we use a modified Fisher method, which offers not only significant power improvement over more traditional enrichment strategies, but also eliminates the problem of arbitrary threshold selection inherent in any binary membership based pathway enrichment approach. We demonstrate the marked increase in power by analyzing summary statistics from dozens of large meta-studies for various traits. Our extensive testing indicates that our method not only excels in rigorous type I error control, but also results in more biologically meaningful discoveries.
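
    For orientation, the classical Fisher method for combining p-values can be sketched as follows. This is the textbook version, which assumes independent p-values; the record says Pascal uses a modified Fisher method and does not describe the modification, so the sketch should not be read as Pascal's procedure. The function name and the input values are hypothetical.

```python
import math

def fisher_combined_p(p_values):
    """Combine independent p-values with the classical Fisher method.

    The statistic X = -2 * sum(ln p_i) follows a chi-square distribution with
    2k degrees of freedom under the null. For even degrees of freedom the
    survival function has the closed form
        P(X > x) = exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    so no statistics library is needed.
    """
    if not p_values or any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")
    k = len(p_values)
    half_x = -sum(math.log(p) for p in p_values)   # this is X / 2
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half_x / i                         # (x/2)^i / i!
        total += term
    return math.exp(-half_x) * total

# Hypothetical gene-level p-values for the genes in one pathway.
print(fisher_combined_p([0.04, 0.20, 0.01, 0.50]))
```

    Because every gene contributes to the statistic, no significance cut-off has to be chosen to decide which genes count as pathway members, which is the threshold-free property the record highlights.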

  10. Y-SNP analysis versus Y-haplogroup predictor in the Slovak population.

    PubMed

    Petrejcíková, Eva; Carnogurská, Jana; Hronská, Danica; Bernasovská, Jarmila; Boronová, Iveta; Gabriková, Dana; Bôziková, Alexandra; Maceková, Sona

    2014-01-01

    Human Y-chromosome haplogroups are important markers used mainly in population genetic studies. The haplogroups are defined by several SNPs according to the phylogeny and international nomenclature. The alternative method to estimate the Y-chromosome haplogroups is to predict Y-chromosome haplotypes from a set of Y-STR markers using software for Y-haplogroup prediction. The purpose of this study was to compare the accuracy of three types of Y-haplogroup prediction software and to determine the structure of Slovak population revealed by the Y-chromosome haplogroups. We used a sample of 166 Slovak males in which 12 Y-STR markers were genotyped in our previous study. These results were analyzed by three different software products that predict Y-haplogroups. To estimate the accuracy of these prediction tools, Y-haplogroups were determined in the same sample by genotyping Y-chromosome SNPs. Haplogroups were correctly predicted in 98.80% (Whit Athey's Haplogroup Predictor), 97.59% (Jim Cullen's Haplogroup Predictor) and 98.19% (YPredictor by Vadim Urasin 1.5.0) of individuals. The occurrence of errors in Y-chromosome haplogroup prediction suggests that the validation using SNP analysis is appropriate when high accuracy is required. The results of SNP based haplotype determination indicate that 39.15% of the Slovak population belongs to R1a-M198 lineage, which is one of the main European lineages.
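
    The reported accuracies can be translated back into counts as a quick consistency check. The sketch below assumes that all 166 males were scored by each predictor; the counts it prints are inferred from the percentages and are not stated in the record.

```python
SAMPLE_SIZE = 166  # Slovak males genotyped, as given in the record

# Accuracy percentages as reported in the record.
reported_accuracy = {
    "Whit Athey's Haplogroup Predictor": 98.80,
    "Jim Cullen's Haplogroup Predictor": 97.59,
    "YPredictor 1.5.0": 98.19,
}

def implied_correct(percent, n=SAMPLE_SIZE):
    """Nearest whole number of correct predictions implied by a percentage."""
    return round(percent / 100.0 * n)

for name, percent in reported_accuracy.items():
    correct = implied_correct(percent)
    recomputed = round(100.0 * correct / SAMPLE_SIZE, 2)
    errors = SAMPLE_SIZE - correct
    print(f"{name}: {correct}/{SAMPLE_SIZE} correct, {errors} errors, {recomputed:.2f}%")
```

    Under that assumption the three percentages correspond to 164, 162 and 163 correct predictions, so the tools differ by only one or two individuals in this sample.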

  11. Genome-wide prediction of cancer driver genes based on SNP and cancer SNV data.

    PubMed

    He, Quanze; He, Quanyuan; Liu, Xiaohui; Wei, Youheng; Shen, Suqin; Hu, Xiaohui; Li, Qiao; Peng, Xiangwen; Wang, Lin; Yu, Long

    2014-01-01

    Identifying cancer driver genes and exploring their functions are essential and the most urgent need in basic cancer research. Developing efficient methods to differentiate between driver and passenger somatic mutations revealed from large-scale cancer genome sequencing data is critical to cancer driver gene discovery. Here, we compared distinct features of SNP with SNV data in detail and found that the weighted ratio of SNV to SNP (termed as WVPR) is an excellent indicator for cancer driver genes. The power of WVPR was validated by accurate predictions of known drivers. We ranked most of human genes by WVPR and did functional analyses on the list. The results demonstrate that driver genes are usually highly enriched in chromatin organization related genes/pathways. And some protein complexes, such as histone acetyltransferase, histone methyltransferase, telomerase, centrosome, sin3 and U12-type spliceosomal complexes, are hot spots of driver mutations. Furthermore, this study identified many new potential driver genes (e.g. NTRK3 and ZIC4) and pathways including oxidative phosphorylation pathway, which were not deemed by previous methods. Taken together, our study not only developed a method to identify cancer driver genes/pathways but also provided new insights into molecular mechanisms of cancer development.

  12. Human Y-chromosome SNP characterization by multiplex amplified product-length polymorphism analysis.

    PubMed

    Medina, Laura Smeldy Jurado; Muzzio, Marina; Schwab, Marisol; Costantino, María Leticia Bravi; Barreto, Guillermo; Bailliet, Graciela

    2014-09-01

    We designed an allele-specific amplification protocol to optimize Y-chromosome SNP typing, which is an unavoidable step for defining the phylogenetic status of paternal lineages. It allows the simultaneous highly specific definition of up to six mutations in a single reaction by amplification fragment length polymorphism (AFLP) without the need of specialized equipment, at a considerably lower cost than that based on single-base primer extension (SNaPshot™) technology or PCR-RFLP systems, requiring as little as 0.5 ng DNA and compatible with the small fragments characteristic of low-quality DNA. By designation of two primers recognizing the derived and ancestral state for each SNP, which can be differentiated by size by the addition of a noncomplementary nucleotide tail, we could define major Y clades E, F, K, R, Q, and subhaplogroups R1, R1a, R1b, R1b1b, R1b1c, J1, J2, G1, G2, I1, Q1a3, and Q1a3a1 through amplification fragments that ranged between 60 and 158bp.

  13. Identification of close relatives in the HUGO Pan-Asian SNP database.

    PubMed

    Yang, Xiong; Xu, Shuhua

    2011-01-01

    The HUGO Pan-Asian SNP Consortium has recently released a genome-wide dataset, which consists of 1,719 DNA samples collected from 71 Asian populations. For studies of human population genetics such as genetic structure and migration history, this provided the most comprehensive large-scale survey of genetic variation to date in East and Southeast Asia. However, although considered in the analysis, close relatives were not clearly reported in the original paper. Here we performed a systematic analysis of genetic relationships among individuals from the Pan-Asian SNP (PASNP) database and identified 3 pairs of monozygotic twins or duplicate samples, 100 pairs of first-degree relatives and 161 pairs of second-degree relatives. Three standardized subsets with different levels of unrelated individuals were suggested here for future applications of the samples in most types of population-genetics studies (denoted by PASNP1716, PASNP1640 and PASNP1583 respectively) based on the relationships inferred in this study. In addition, we provided gender information for PASNP samples, which was not included in the original dataset, based on analysis of X chromosome data.

  14. Forensic SNP genotyping with SNaPshot: Technical considerations for the development and optimization of multiplexed SNP assays.

    PubMed

    Fondevila, M; Børsting, C; Phillips, C; de la Puente, M; Consortium, Euroforen-NoE; Carracedo, A; Morling, N; Lareu, M V

    2017-01-01

    This review explores the key factors that influence the optimization, routine use, and profile interpretation of the SNaPshot single-base extension (SBE) system applied to forensic single-nucleotide polymorphism (SNP) genotyping. Despite being a mainly complementary DNA genotyping technique to routine STR profiling, use of SNaPshot is an important part of the development of SNP sets for a wide range of forensic applications with these markers, from genotyping highly degraded DNA with very short amplicons to the introduction of SNPs to ascertain the ancestry and physical characteristics of an unidentified contact trace donor. However, this technology, as resourceful as it is, displays several features that depart from the usual STR genotyping far enough to demand a certain degree of expertise from the forensic analyst before tackling the complex casework on which SNaPshot application provides an advantage. In order to provide the basis for developing such expertise, we cover in this paper the most challenging aspects of the SNaPshot technology, focusing on the steps taken to design primer sets, optimize the PCR and single-base extension chemistries, and the important features of the peak patterns observed in typical forensic SNP profiles using SNaPshot. With that purpose in mind, we provide guidelines and troubleshooting for multiplex-SNaPshot-oriented primer design and the resulting capillary electrophoresis (CE) profile interpretation (covering the most commonly observed artifacts and expected departures from the ideal conditions).

  15. Protective effects of melatonin on the activity of SOD, CAT, GSH-Px and GSH content in organs of mice after administration of SNP.

    PubMed

    Goc, Zofia; Szaroma, Waldemar; Kapusta, Edyta; Dziubek, Karol

    2017-02-28

    Sodium nitroprusside (SNP) is an antihypertensive drug with proven dose-dependent toxic effects attributed mainly to the production of cyanide but also excessive nitric oxide (NO) and derived reactive species. The present study evaluated whether melatonin administration would have time-dependent protective effect against SNP-induced toxicity. Male Swiss mice were used in this study. Control mice were treated with 0.9% NaCl; the second group was injected with 10 mg melatonin (MEL)/kg body weight (b.w.); the third group was given SNP at the dose of 3,6 mg/kg b.w.; the fourth group received both MEL and SNP at the same doses. In homogenates of brain, liver and kidneys, activities of superoxide dismutase (SOD), catalase (CAT) and glutathione peroxidase (GSH-Px) were estimated after 3, 6 and 24 h of drugs administration. The concentration of reduced glutathione (GSH) was also evaluated in the blood, brain, liver and kidneys of mice at the same time intervals. In animals receiving MEL, the highest levels of GSH were observed in all the organs as compared to the control after 3, 6 h. Meanwhile, SNP decreased GSH concentration in the blood, brain, liver and kidneys in all time intervals. Administration of MEL in combination with SNP increased the GSH levels in all organs, as compared to the administration of SNP alone; this effect was observed after 3, 6 and 24 h. The activity of SOD, CAT and GSH-Px in the MEL-treated group increased after 3 h in all the organs, while in liver and kidney the increase was also observed after 6 h. Conversely, the SNP intoxication caused a decrease of the activity of enzymes in the tested organs in all intervals, while administration of MEL + SNP resulted in increased activities of SOD, CAT and GSH-Px in all the organs after 3 h and 6 h. The investigation carried out in the present study provides new data to add to the study of antioxidant properties of MEL and SNP-induced oxidative stress with regard to time-dependent properties in different

  16. Exploration of SNP variants affecting hair colour prediction in Europeans.

    PubMed

    Söchtig, Jens; Phillips, Chris; Maroñas, Olalla; Gómez-Tato, Antonio; Cruz, Raquel; Alvarez-Dios, Jose; de Cal, María-Ángeles Casares; Ruiz, Yarimar; Reich, Kristian; Fondevila, Manuel; Carracedo, Ángel; Lareu, María V

    2015-09-01

    DNA profiling is a key tool for forensic analysis; however, current methods identify a suspect either by direct comparison or from DNA database searches. In cases with unidentified suspects, prediction of visible physical traits e.g. pigmentation or hair distribution of the DNA donors can provide important probative information. This study aimed to explore single nucleotide polymorphism (SNP) variants for their effect on hair colour prediction. A discovery panel of 63 SNPs consisting of already established hair colour markers from the HIrisPlex hair colour phenotyping assay as well as additional markers for which associations to human pigmentation traits were previously identified was used to develop multiplex assays based on SNaPshot single-base extension technology. A genotyping study was performed on a range of European populations (n = 605). Hair colour phenotyping was accomplished by matching donor's hair to a graded colour category system of reference shades and photography. Since multiple SNPs in combination contribute in varying degrees to hair colour predictability in Europeans, we aimed to compile a compact marker set that could provide a reliable hair colour inference from the fewest SNPs. The predictive approach developed uses a naïve Bayes classifier to provide hair colour assignment probabilities for the SNP profiles of the key SNPs and was embedded into the Snipper online SNP classifier ( http://mathgene.usc.es/snipper/ ). Results indicate that red, blond, brown and black hair colours are predictable with informative probabilities in a high proportion of cases. Our study resulted in the identification of 12 most strongly associated SNPs to hair pigmentation variation in six genes.
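
    The classification step named in this record, a naïve Bayes classifier that turns a SNP profile into hair colour assignment probabilities, can be sketched in a few lines. Everything numeric below is invented for illustration: the SNP names, the genotype frequencies and the priors are hypothetical and do not come from the study or from Snipper.

```python
import math

def naive_bayes_posteriors(profile, genotype_freqs, priors):
    """Posterior class probabilities for a SNP profile under naive Bayes.

    profile maps SNP name to observed genotype. genotype_freqs[cls][snp][genotype]
    is the frequency of that genotype within a class. SNPs are treated as
    independent given the class, which is the naive Bayes assumption.
    """
    log_scores = {}
    for cls, prior in priors.items():
        score = math.log(prior)
        for snp, genotype in profile.items():
            score += math.log(genotype_freqs[cls][snp][genotype])
        log_scores[cls] = score
    # Normalise in log space to avoid underflow when many SNPs are used.
    top = max(log_scores.values())
    weights = {cls: math.exp(s - top) for cls, s in log_scores.items()}
    total = sum(weights.values())
    return {cls: w / total for cls, w in weights.items()}

# Hypothetical two-SNP, two-class example.
genotype_freqs = {
    "red":   {"snp_a": {"AA": 0.7, "AG": 0.2, "GG": 0.1},
              "snp_b": {"CC": 0.6, "CT": 0.3, "TT": 0.1}},
    "brown": {"snp_a": {"AA": 0.1, "AG": 0.3, "GG": 0.6},
              "snp_b": {"CC": 0.2, "CT": 0.4, "TT": 0.4}},
}
priors = {"red": 0.5, "brown": 0.5}
print(naive_bayes_posteriors({"snp_a": "AA", "snp_b": "CC"}, genotype_freqs, priors))
```

    A real classifier would estimate the genotype frequencies from training samples and would need a rule for genotypes never seen in a class, since a zero frequency makes the logarithm undefined.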

  17. Computational tradeoffs in multiplex PCR assay design for SNP genotyping

    PubMed Central

    Rachlin, John; Ding, Chunming; Cantor, Charles; Kasif, Simon

    2005-01-01

    Background: Multiplex PCR is a key technology for detecting infectious microorganisms, whole-genome sequencing, forensic analysis, and for enabling flexible yet low-cost genotyping. However, the design of a multiplex PCR assay requires the consideration of multiple competing objectives and physical constraints, and extensive computational analysis must be performed in order to identify the possible formation of primer-dimers that can negatively impact product yield. Results: This paper examines the computational design limits of multiplex PCR in the context of SNP genotyping and examines tradeoffs associated with several key design factors including multiplexing level (the number of primer pairs per tube), coverage (the % of SNPs whose associated primers are actually assigned to one of several available tubes), and tube-size uniformity. We also examine how design performance depends on the total number of available SNPs from which to choose, and primer stringency criteria. We show that finding high-multiplexing/high-coverage designs is subject to a computational phase transition, becoming dramatically more difficult when the probability of primer pair interaction exceeds a critical threshold. The precise location of this critical transition point depends on the number of available SNPs and the level of multiplexing required. We also demonstrate how coverage performance is impacted by the number of available SNPs, primer selection criteria, and target multiplexing levels. Conclusion: The presence of a phase transition suggests limits to scaling multiplex PCR performance for high-throughput genomics applications. Achieving broad SNP coverage rapidly transitions from being very easy to very hard as the target multiplexing level (# of primer pairs per tube) increases. The onset of a phase transition can be "delayed" by having a larger pool of SNPs, or loosening primer selection constraints so as to increase the number of candidate primer pairs per SNP, though the latter
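
    The design problem in this record, placing primer pairs into tubes of a fixed multiplexing level while avoiding interacting pairs, can be illustrated with a greedy first-fit sketch. It is a simplification under stated assumptions: one candidate primer pair per SNP, interactions drawn independently at random with a fixed probability, and a greedy heuristic that is not the authors' algorithm. All names and parameter values are hypothetical.

```python
import random

def random_conflicts(n_snps, interaction_prob, seed=0):
    """Randomly mark pairs of SNP assays whose primers are assumed to interact."""
    rng = random.Random(seed)
    return {(i, j)
            for i in range(n_snps)
            for j in range(i + 1, n_snps)
            if rng.random() < interaction_prob}

def greedy_tube_assignment(n_snps, conflicts, n_tubes, plex):
    """First-fit assignment of SNP assays to tubes.

    Each tube holds at most `plex` assays, and no two assays in a tube may be
    a conflicting pair. Returns the tubes and the coverage, i.e. the fraction
    of SNPs that were placed in some tube.
    """
    tubes = [[] for _ in range(n_tubes)]
    for snp in range(n_snps):
        for tube in tubes:
            compatible = all((min(snp, other), max(snp, other)) not in conflicts
                             for other in tube)
            if len(tube) < plex and compatible:
                tube.append(snp)
                break
    coverage = sum(len(tube) for tube in tubes) / n_snps
    return tubes, coverage

# Coverage as the assumed interaction probability rises (200 SNPs, 10 tubes of 20).
for prob in (0.0, 0.05, 0.1, 0.2, 0.4):
    _, coverage = greedy_tube_assignment(200, random_conflicts(200, prob), 10, 20)
    print(f"interaction probability {prob:.2f}: coverage {coverage:.2f}")
```

    Sweeping the interaction probability in this way is one informal route to seeing the kind of abrupt drop in coverage that the record describes as a phase transition, although a greedy heuristic only gives a lower bound on what an exhaustive design search could achieve.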

  18. Single Nucleotide Polymorphism (SNP)-Based Loss of Heterozygosity (LOH) Testing by Real Time PCR in Patients Suspect of Myeloproliferative Disease

    PubMed Central

    Huijsmans, Cornelis J. J.; Poodt, Jeroen; Damen, Jan; van der Linden, Johannes C.; Savelkoul, Paul H. M.; Pruijt, Johannes F. M.; Hilbink, Mirrian; Hermans, Mirjam H. A.

    2012-01-01

    During tumor development, loss of heterozygosity (LOH) often occurs. When LOH is preceded by an oncogene activating mutation, the mutant allele may be further potentiated if the wild-type allele is lost or inactivated. In myeloproliferative neoplasms (MPN) somatic acquisition of JAK2V617F may be followed by LOH resulting in loss of the wild type allele. The occurrence of LOH in MPN and other proliferative diseases may further potentiate the mutant allele and thereby increase morbidity. A real time PCR based SNP profiling assay was developed and validated for LOH detection of the JAK2 region (JAK2LOH). Blood of a cohort of 12 JAK2V617F-positive patients (n = 6 with 25–50% and n = 6 with >50% JAK2V617F) and a cohort of 81 patients suspected of MPN was stored with EDTA and subsequently used for validation. To generate germ-line profiles, non-neoplastic formalin-fixed paraffin-embedded tissue from each patient was analyzed. Results of the SNP assay were compared to those of an established Short Tandem Repeat (STR) assay. Both assays revealed JAK2LOH in 1/6 patients with 25–50% JAK2V617F. In patients with >50% JAK2V617F, JAK2LOH was detected in 6/6 by the SNP assay and 5/6 patients by the STR assay. Of the 81 patients suspected of MPN, 18 patients carried JAK2V617F. Both the SNP and STR assay demonstrated the occurrence of JAK2LOH in 5 of them. In the 63 JAK2V617F-negative patients, no JAK2LOH was observed by SNP and STR analyses. The presented SNP assay reliably detects JAK2LOH and is a fast and easy to perform alternative for STR analyses. We therefore anticipate the SNP approach as a proof of principle for the development of LOH SNP-assays for other clinically relevant LOH loci. PMID:22768290

  19. Single nucleotide polymorphism (SNP)-based loss of heterozygosity (LOH) testing by real time PCR in patients suspect of myeloproliferative disease.

    PubMed

    Huijsmans, Cornelis J J; Poodt, Jeroen; Damen, Jan; van der Linden, Johannes C; Savelkoul, Paul H M; Pruijt, Johannes F M; Hilbink, Mirrian; Hermans, Mirjam H A

    2012-01-01

    During tumor development, loss of heterozygosity (LOH) often occurs. When LOH is preceded by an oncogene activating mutation, the mutant allele may be further potentiated if the wild-type allele is lost or inactivated. In myeloproliferative neoplasms (MPN) somatic acquisition of JAK2V617F may be followed by LOH resulting in loss of the wild type allele. The occurrence of LOH in MPN and other proliferative diseases may further potentiate the mutant allele and thereby increase morbidity. A real time PCR based SNP profiling assay was developed and validated for LOH detection of the JAK2 region (JAK2LOH). Blood of a cohort of 12 JAK2V617F-positive patients (n=6 with 25-50% and n=6 with >50% JAK2V617F) and a cohort of 81 patients suspected of MPN was stored with EDTA and subsequently used for validation. To generate germ-line profiles, non-neoplastic formalin-fixed paraffin-embedded tissue from each patient was analyzed. Results of the SNP assay were compared to those of an established Short Tandem Repeat (STR) assay. Both assays revealed JAK2LOH in 1/6 patients with 25-50% JAK2V617F. In patients with >50% JAK2V617F, JAK2LOH was detected in 6/6 by the SNP assay and 5/6 patients by the STR assay. Of the 81 patients suspected of MPN, 18 patients carried JAK2V617F. Both the SNP and STR assay demonstrated the occurrence of JAK2LOH in 5 of them. In the 63 JAK2V617F-negative patients, no JAK2LOH was observed by SNP and STR analyses. The presented SNP assay reliably detects JAK2LOH and is a fast and easy to perform alternative for STR analyses. We therefore anticipate the SNP approach as a proof of principle for the development of LOH SNP-assays for other clinically relevant LOH loci.

  20. Correlating observed odds ratios from lung cancer case-control studies to SNP functional scores predicted by bioinformatic tools

    PubMed Central

    Zhu, Yong; Hoffman, Aaron; Wu, Xifeng; Zhang, Heping; Zhang, Yawei; Leaderer, Derek; Zheng, Tongzhang

    2008-01-01

    Bioinformatic tools are widely utilized to predict functional single nucleotide polymorphisms (SNPs) for genotyping in molecular epidemiological studies. However, the extent to which these approaches are mirrored by epidemiological findings has not been fully explored. In this study, we first surveyed SNPs examined in case-control studies of lung cancer, the most extensively-studied cancer type. We then computed SNP functional scores using four popular bioinformatics tools: SIFT, PolyPhen, SNPs3D, and PMut, and determined their predictive potential using the odds ratios (ORs) reported. Spearman’s correlation coefficients (r) between the SNP scores from SIFT, PolyPhen, SNPs3D, and PMut and the summary ORs were r = −0.36 (p = 0.007), r = 0.25 (p = 0.068), r = −0.20 (p = 0.205), and r = −0.12 (p = 0.370), respectively. By creating a combined score using information from all four tools we were able to achieve a correlation coefficient of r = 0.51 (p < 0.001). These results indicate that scores of predicted functionality could explain a certain fraction of the lung cancer risk detected in genetic association studies and more accurate predictions may be obtained by combining information from a variety of tools. Our findings suggest that bioinformatic tools are useful in predicting SNP functionality and may facilitate future genetic epidemiological studies. PMID:18191955
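
    The rank correlation used in the abstract above can be illustrated with a minimal sketch. The scores and odds ratios below are hypothetical values invented for illustration, not data from the study, and the function names are assumptions. Spearman's r is computed as the Pearson correlation of the ranks, with tied values sharing their average rank.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values receive the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_r(x, y):
    """Spearman's rank correlation: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical example: a tool score that falls as the summary OR rises.
tool_scores = [0.01, 0.20, 0.45, 0.80, 0.95]
summary_ors = [1.60, 1.35, 1.20, 1.05, 0.98]
print(spearman_r(tool_scores, summary_ors))
```

    A negative coefficient, as reported for SIFT, means that the tool's score tends to decrease as the observed odds ratio increases.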

  1. A second generation SNP and SSR integrated linkage map and QTL mapping for the Chinese mitten crab Eriocheir sinensis

    PubMed Central

    Qiu, Gao-Feng; Xiong, Liang-Wei; Han, Zhi-Ke; Liu, Zhi-Qiang; Feng, Jian-Bin; Wu, Xu-Gan; Yan, Yin-Long; Shen, Hong; Huang, Long; Chen, Li

    2017-01-01

    The Chinese mitten crab Eriocheir sinensis is the most economically important cultivated crab species in China, and its genome has a high number of chromosomes (2n = 146). To obtain sufficient markers for construction of a dense genetic map for this species, we employed the recently developed specific-locus amplified fragment sequencing (SLAF-seq) method for large-scale SNPs screening and genotyping in a F1 full-sib family of 149 individuals. SLAF-seq generated 127,677 polymorphic SNP markers, of which 20,803 valid markers were assigned into five segregation types and were used together with previous SSR markers for linkage map construction. The final integrated genetic map included 17,680 SNP and 629 SSR markers on the 73 linkage groups (LG), and spanned 14,894.9 cM with an average marker interval of 0.81 cM. QTL mapping localized three significant growth-related QTL to a 1.2 cM region in LG53 as well as 146 sex-linked markers in LG48. Genome-wide QTL-association analysis further identified four growth-related QTL genes named LNX2, PAK2, FMRFamide and octopamine receptors. These genes are involved in a variety of different signaling pathways including cell proliferation and growth. The map and SNP markers described here will be a valuable resource for the E. sinensis genome project and selective breeding programs. PMID:28045132

  2. A high-density, multi-parental SNP genetic map on apple validates a new mapping approach for outcrossing species

    PubMed Central

    Di Pierro, Erica A; Gianfranceschi, Luca; Di Guardo, Mario; Koehorst-van Putten, Herma JJ; Kruisselbrink, Johannes W; Longhi, Sara; Troggio, Michela; Bianco, Luca; Muranty, Hélène; Pagliarani, Giulia; Tartarini, Stefano; Letschka, Thomas; Lozano Luis, Lidia; Garkava-Gustavsson, Larisa; Micheletti, Diego; Bink, Marco CAM; Voorrips, Roeland E; Aziz, Ebrahimi; Velasco, Riccardo; Laurens, François; van de Weg, W Eric

    2016-01-01

    Quantitative trait loci (QTL) mapping approaches rely on the correct ordering of molecular markers along the chromosomes, which can be obtained from genetic linkage maps or a reference genome sequence. For apple (Malus domestica Borkh), the genome sequence v1 and v2 could not meet this need; therefore, a novel approach was devised to develop a dense genetic linkage map, providing the most reliable marker-loci order for the highest possible number of markers. The approach was based on four strategies: (i) the use of multiple full-sib families, (ii) the reduction of missing information through the use of HaploBlocks and alternative calling procedures for single-nucleotide polymorphism (SNP) markers, (iii) the construction of a single backcross-type data set including all families, and (iv) a two-step map generation procedure based on the sequential inclusion of markers. The map comprises 15 417 SNP markers, clustered in 3 K HaploBlock markers spanning 1 267 cM, with an average distance between adjacent markers of 0.37 cM and a maximum distance of 3.29 cM. Moreover, chromosome 5 was oriented according to its homoeologous chromosome 10. This map was useful to improve the apple genome sequence, design the Axiom Apple 480 K SNP array and perform multifamily-based QTL studies. Its collinearity with the genome sequences v1 and v3 is reported. To our knowledge, this is the shortest published SNP map in apple, while including the largest number of markers, families and individuals. This result validates our methodology, proving its value for the construction of integrated linkage maps for any outbreeding species. PMID:27917289

  3. The MDM4 SNP34091 (rs4245739) C-allele is associated with increased risk of ovarian-but not endometrial cancer.

    PubMed

    Gansmo, Liv B; Bjørnslett, Merete; Halle, Mari Kyllesø; Salvesen, Helga B; Dørum, Anne; Birkeland, Einar; Hveem, Kristian; Romundstad, Pål; Vatten, Lars; Lønning, Per Eystein; Knappskog, Stian

    2016-08-01

    The MDM4 protein (also known as MDMX or HDMX) is a negative regulator of p53, not only by direct interaction but also through its interaction with MDM2. Further, MDM4 overexpression and amplification have been observed in several cancer forms. Recently, a single nucleotide polymorphism (SNP) in the 3' untranslated region of the MDM4 gene, SNP34091A > C (rs4245739) was reported to alter MDM4 messenger RNA (mRNA) stability by modulating a microRNA binding site, thereby leading to decreased MDM4 levels. In this case-control study, we aimed to evaluate the possible association between MDM4 SNP34091 status and cancer risk by comparing the genotype frequencies in large hospital-based cohorts of endometrial (n = 1404) and ovarian (n = 1385) cancer patients with healthy female controls (n = 1870). Genotype frequencies were compared by odds ratio (OR) estimates and Fisher exact tests. We found that individuals harboring the MDM4 SNP34091AC/CC genotypes had a significantly elevated risk for serous ovarian cancer (SOC) in general and high-grade serous ovarian cancer (HGSOC) in particular (SOC: OR = 1.18, 95% CI = 1.01-1.39; HGSOC: OR = 1.25, 95% CI = 1.02-1.53). No association between SNP34091 genotypes and endometrial cancer risk was observed. Our data indicate the MDM4 SNP34091AC/CC genotypes to be associated with an elevated risk for SOC and in particular the HGSOC type.
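
    As a rough illustration of how an odds ratio and its 95% confidence interval are obtained from a 2x2 genotype table, the following sketch may help. The counts are hypothetical (the abstract does not report genotype counts), the function name is an assumption, and the Woolf (log) method used for the interval is also an assumption, since the abstract does not state how its intervals were computed.

```python
import math


def odds_ratio_ci(case_carrier, case_noncarrier, ctrl_carrier, ctrl_noncarrier, z=1.96):
    """Odds ratio with a Woolf (log-scale) confidence interval.

    z = 1.96 corresponds to an approximate 95% interval.
    """
    odds_ratio = (case_carrier * ctrl_noncarrier) / (case_noncarrier * ctrl_carrier)
    # Standard error of log(OR) is the root of the summed reciprocal cell counts.
    se_log_or = math.sqrt(
        1 / case_carrier + 1 / case_noncarrier + 1 / ctrl_carrier + 1 / ctrl_noncarrier
    )
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts: carriers (AC/CC) vs non-carriers (AA) in cases and controls.
or_value, ci_low, ci_high = odds_ratio_ci(10, 20, 5, 40)
print(f"OR = {or_value:.2f}, 95% CI = {ci_low:.2f}-{ci_high:.2f}")
```

    An interval that excludes 1, like the SOC and HGSOC intervals quoted in the abstract, is what marks the association as statistically significant at the 5% level.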

  4. A genome-wide search for common SNP x SNP interactions on the risk of venous thrombosis

    PubMed Central

    2013-01-01

    Background Venous Thrombosis (VT) is a common multifactorial disease with an estimated heritability between 35% and 60%. Known genetic polymorphisms identified so far only explain ~5% of the genetic variance of the disease. This study aimed to investigate whether pair-wise interactions between common single nucleotide polymorphisms (SNPs) could exist and modulate the risk of VT. Methods A genome-wide SNP x SNP interaction analysis on VT risk was conducted in a French case–control study and the most significant findings were tested for replication in a second independent French case–control sample. The results obtained in the two studies totaling 1,953 cases and 2,338 healthy subjects were combined into a meta-analysis. Results The smallest observed p-value for interaction was p = 6.00 × 10^-11 but it did not pass the Bonferroni significance threshold of 1.69 × 10^-12 correcting for the number of investigated interactions that was 2.96 × 10^10. Among the 37 suggestive pair-wise interactions with p-value less than 10^-8, one was further shown to involve two SNPs, rs9804128 (IGFS21 locus) and rs4784379 (IRX3 locus) that demonstrated significant interactive effects (p = 4.83 × 10^-5) on the variability of plasma Factor VIII levels, a quantitative biomarker of VT risk, in a sample of 1,091 VT patients. Conclusion This study, the first genome-wide SNP interaction analysis conducted so far on VT risk, suggests that common SNPs are unlikely to exert strong interactive effects on the risk of disease. PMID:23509962
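
    The Bonferroni threshold quoted in the abstract above can be checked with a few lines of arithmetic. This sketch assumes a family-wise significance level of 0.05, which the abstract does not state explicitly but which is consistent with the reported threshold; the function name is hypothetical.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


ALPHA = 0.05             # assumed family-wise error rate (not stated in the abstract)
N_INTERACTIONS = 2.96e10  # number of investigated interactions, from the abstract
SMALLEST_P = 6.00e-11     # smallest observed interaction p-value, from the abstract

threshold = bonferroni_threshold(ALPHA, N_INTERACTIONS)
passes = SMALLEST_P < threshold

print(f"threshold = {threshold:.2e}")
print(f"smallest p-value passes: {passes}")
```

    Under this assumption the computed threshold rounds to the 1.69 × 10^-12 given in the abstract, and the smallest observed p-value is larger than it by more than an order of magnitude.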

  5. Highly effective SNP-based association mapping and management of recessive defects in livestock.

    PubMed

    Charlier, Carole; Coppieters, Wouter; Rollin, Frédéric; Desmecht, Daniel; Agerholm, Jorgen S; Cambisano, Nadine; Carta, Eloisa; Dardano, Sabrina; Dive, Marc; Fasquelle, Corinne; Frennet, Jean-Claude; Hanset, Roger; Hubin, Xavier; Jorgensen, Claus; Karim, Latifa; Kent, Matthew; Harvey, Kirsten; Pearce, Brian R; Simon, Patricia; Tama, Nico; Nie, Haisheng; Vandeputte, Sébastien; Lien, Sigbjorn; Longeri, Maria; Fredholm, Merete; Harvey, Robert J; Georges, Michel

    2008-04-01

    The widespread use of elite sires by means of artificial insemination in livestock breeding leads to the frequent emergence of recessive genetic defects, which cause significant economic and animal welfare concerns. Here we show that the availability of genome-wide, high-density SNP panels, combined with the typical structure of livestock populations, markedly accelerates the positional identification of genes and mutations that cause inherited defects. We report the fine-scale mapping of five recessive disorders in cattle and the molecular basis for three of these: congenital muscular dystony (CMD) types 1 and 2 in Belgian Blue cattle and ichthyosis fetalis in Italian Chianina cattle. Identification of these causative mutations has an immediate translation into breeding practice, allowing marker assisted selection against the defects through avoidance of at-risk matings.

  6. [Mechanism of genuineness of Glycyrrhiza uralensis based on SNP of β-Amyrin synthase gene].

    PubMed

    Zang, Yi-mei; Li, Yan-peng; Qiao, Jing; Chen, Hong-hao; Liu, Chun-sheng

    2015-07-01

    β-Amyrin synthase (β-AS) genes of Glycyrrhiza uralensis from 6 different regions were analyzed by PCR-SSCP and sequenced, and the correlation between the β-AS SNP and the regions of Glycyrrhiza uralensis was then determined. Based on one coding single nucleotide polymorphism at the 94 bp site in the first exon of the β-AS gene, Glycyrrhiza uralensis could be divided into 3 genotypes. Among these genotypes, the percentage of the 94A type in genuine regions was much higher, and it differed significantly from the percentage in non-genuine regions (P < 0.001). The results of the experiment indicate that different β-AS genotypes at the 94 bp site in different regions may be one of the important reasons for the genuineness of Glycyrrhiza uralensis.

  7. Cloning, chromosomal localization, SNP detection and association analysis of the porcine IRS-1 gene.

    PubMed

    Niu, P-X; Huang, Z; Li, C-C; Fan, B; Li, K; Liu, B; Yu, M; Zhao, S-H

    2009-11-01

    The insulin receptor substrate-1 (IRS-1) gene is one member of the insulin receptor substrate (IRS) gene family, which plays an important role in mediating the growth of skeletal muscle and the molecular metabolism of type 2 diabetes. Here, we cloned a 3,573 bp fragment of the partial CDS sequence of the porcine IRS-1 gene by an in silico cloning strategy and RT-PCR. The porcine IRS-1 gene was assigned to SSC15q25 by using IMpRH. Sequencing of PCR products from Duroc and Tibetan pig breeds identified one SNP in exon 1 of the porcine IRS-1 gene (C3257A polymorphism). Association analysis of genotypes with growth traits, anatomy traits, meat quality traits and physiological and biochemical index traits showed that different genotypes at locus 3,257 of IRS-1 differ significantly in carcass straight length in pigs (P = 0.0102 < 0.05).

  8. Analysis of Y-chromosomal SNP haplogroups and STR haplotypes in an Algerian population sample.

    PubMed

    Robino, C; Crobu, F; Di Gaetano, C; Bekada, A; Benhamamouch, S; Cerutti, N; Piazza, A; Inturri, S; Torre, C

    2008-05-01

    The distribution of Y-chromosomal single nucleotide polymorphism (SNP) haplogroups and short tandem repeat (STR) haplotypes was determined in a sample of 102 unrelated men of Arab origin from northwestern Algeria (Oran area). A total of nine different haplogroups were identified by a panel of 22 binary markers. The most common haplogroups observed in the Algerian population were E3b2 (45.1%) and J1 (22.5%). Y-STR typing by a 17-loci multiplex system allowed 93 haplotypes to be defined (88 were unique). Striking differences in the allele distribution and gene diversity of Y-STR markers between haplogroups could be found. In particular, intermediate alleles at locus DYS458 specifically characterized the haplotypes of individuals carrying haplogroup J1. All the intermediate alleles shared a common repeat sequence structure, supporting the hypothesis that the variant originated from a single mutational event.

  9. The Impact of a Common MDM2 SNP on the Sensitivity of Breast Cancer to Treatment

    DTIC Science & Technology

    2012-06-01

    could decrease the effectiveness of treatment. These outcomes are likely due to the increased expression of mdm2 protein in SNP309 individuals, which...expression at the protein level occur in the mdm2 SNP309 cell line. There was no association between the mdm2 SNP309 and clinical outcome of breast cancer...with chemotherapy, hormonal therapy and radiation therapy. Subject terms: mdm2, breast cancer, polymorphisms

  10. A SNP transferability survey within the genus Vitis

    PubMed Central

    Vezzulli, Silvia; Micheletti, Diego; Riaz, Summaira; Pindo, Massimo; Viola, Roberto; This, Patrice; Walker, M Andrew; Troggio, Michela; Velasco, Riccardo

    2008-01-01

    Background Efforts to sequence the genomes of different organisms continue to increase. The DNA sequence is usually decoded for one individual and its application is for the whole species. The recent sequencing of the highly heterozygous Vitis vinifera L. cultivar Pinot Noir (clone ENTAV 115) genome gave rise to several thousand polymorphisms and offers a good model to study the transferability of its degree of polymorphism to other individuals of the same species and within the genus. Results This study was performed by genotyping 137 SNPs through the SNPlex™ Genotyping System (Applied Biosystems Inc.) and by comparing the SNPlex sequencing results across 35 (of the 137) regions from 69 grape accessions. A heterozygous state transferability of 31.5% across the unrelated cultivars of V. vinifera, of 18.8% across the wild forms of V. vinifera, of 2.3% among non-vinifera Vitis species, and of 0% with Muscadinia rotundifolia was found. In addition, mean allele frequencies were used to evaluate SNP informativeness and develop useful subsets of markers. Conclusion Using SNPlex application and corroboration from the sequencing analysis, the informativeness of SNP markers from the heterozygous grape cultivar Pinot Noir was validated in V. vinifera (including cultivars and wild forms), but had a limited application for non-vinifera Vitis species where a resequencing strategy may be preferred, knowing that homology at priming sites is sufficient. This work will allow future applications such as mapping and diversity studies, accession identification and genomic-research assisted breeding within V. vinifera. PMID:19087337

  11. Structural Architecture of SNP Effects on Complex Traits

    PubMed Central

    Gamazon, Eric R.; Cox, Nancy J.; Davis, Lea K.

    2014-01-01

    Despite the discovery of copy-number variation (CNV) across the genome nearly 10 years ago, current SNP-based analysis methodologies continue to collapse the homozygous (i.e., A/A), hemizygous (i.e., A/0), and duplicative (i.e., A/A/A) genotype states, treating the genotype variable as irreducible or unaltered by other colocalizing forms of genetic (e.g., structural) variation. Our understanding of common, genome-wide CNVs suggests that the canonical genotype construct might belie the enormous complexity of the genome. Here we present multiple analyses of several phenotypes and provide methods supporting a conceptual shift that embraces the structural dimension of genotype. We comprehensively investigate the impact of the structural dimension of genotype on (1) GWAS methods, (2) interpretation of rare LOF variants, (3) characterization of genomic architecture, and (4) implications for mapping loci involved in complex disease. Taken together, these results argue for the inclusion of a structural dimension and suggest that some portion of the “missing” heritability might be recovered through integration of the structural dimension of SNP effects on complex traits. PMID:25307299

  12. Eigenanalysis of SNP data with an identity by descent interpretation.

    PubMed

    Zheng, Xiuwen; Weir, Bruce S

    2016-02-01

    Principal component analysis (PCA) is widely used in genome-wide association studies (GWAS), and the principal component axes often represent perpendicular gradients in geographic space. The explanation of PCA results is of major interest for geneticists to understand fundamental demographic parameters. Here, we provide an interpretation of PCA based on relatedness measures, which are described by the probability that sets of genes are identical-by-descent (IBD). An approximately linear transformation between ancestral proportions (AP) of individuals with multiple ancestries and their projections onto the principal components is found. In addition, a new method of eigenanalysis "EIGMIX" is proposed to estimate individual ancestries. EIGMIX is a method of moments with computational efficiency suitable for millions of SNP data, and it is not subject to the assumption of linkage equilibrium. With the assumptions of multiple ancestries and their surrogate ancestral samples, EIGMIX is able to infer ancestral proportions (APs) of individuals. The methods were applied to the SNP data from the HapMap Phase 3 project and the Human Genome Diversity Panel. The APs of individuals inferred by EIGMIX are consistent with the findings of the program ADMIXTURE. In conclusion, EIGMIX can be used to detect population structure and estimate genome-wide ancestral proportions with a relatively high accuracy.

  13. Fine-scaled human genetic structure revealed by SNP microarrays.

    PubMed

    Xing, Jinchuan; Watkins, W Scott; Witherspoon, David J; Zhang, Yuhua; Guthery, Stephen L; Thara, Rangaswamy; Mowry, Bryan J; Bulayeva, Kazima; Weiss, Robert B; Jorde, Lynn B

    2009-05-01

    We report an analysis of more than 240,000 loci genotyped using the Affymetrix SNP microarray in 554 individuals from 27 worldwide populations in Africa, Asia, and Europe. To provide a more extensive and complete sampling of human genetic variation, we have included caste and tribal samples from two states in South India, Daghestanis from eastern Europe, and the Iban from Malaysia. Consistent with observations made by Charles Darwin, our results highlight shared variation among human populations and demonstrate that much genetic variation is geographically continuous. At the same time, principal components analyses reveal discernible genetic differentiation among almost all identified populations in our sample, and in most cases, individuals can be clearly assigned to defined populations on the basis of SNP genotypes. All individuals are accurately classified into continental groups using a model-based clustering algorithm, but between closely related populations, genetic and self-classifications conflict for some individuals. The 250K data permitted high-level resolution of genetic variation among Indian caste and tribal populations and between highland and lowland Daghestani populations. In particular, upper-caste individuals from Tamil Nadu and Andhra Pradesh form one defined group, lower-caste individuals from these two states form another, and the tribal Irula samples form a third. Our results emphasize the correlation of genetic and geographic distances and highlight other elements, including social factors that have contributed to population structure.

  14. SNP Markers and Their Impact on Plant Breeding

    PubMed Central

    Mammadov, Jafar; Aggarwal, Rajat; Buyyarapu, Ramesh; Kumpatla, Siva

    2012-01-01

    The use of molecular markers has revolutionized the pace and precision of plant genetic analysis which in turn facilitated the implementation of molecular breeding of crops. The last three decades have seen tremendous advances in the evolution of marker systems and the respective detection platforms. Markers based on single nucleotide polymorphisms (SNPs) have rapidly gained the center stage of molecular genetics during the recent years due to their abundance in the genomes and their amenability for high-throughput detection formats and platforms. Computational approaches dominate SNP discovery methods due to the ever-increasing sequence information in public databases; however, complex genomes pose special challenges in the identification of informative SNPs warranting alternative strategies in those crops. Many genotyping platforms and chemistries have become available making the use of SNPs even more attractive and efficient. This paper provides a review of historical and current efforts in the development, validation, and application of SNP markers in QTL/gene discovery and plant breeding by discussing key experimental strategies and cases exemplifying their impact. PMID:23316221

  15. Data mining and genetic algorithm based gene/SNP selection.

    PubMed

    Shah, Shital C; Kusiak, Andrew

    2004-07-01

    Genomic studies provide large volumes of data with the number of single nucleotide polymorphisms (SNPs) ranging into thousands. The analysis of SNPs permits determining relationships between genotypic and phenotypic information as well as the identification of SNPs related to a disease. The growing wealth of information and advances in biology call for the development of approaches for discovery of new knowledge. One such area is the identification of gene/SNP patterns impacting cure/drug development for various diseases. A new approach for predicting drug effectiveness is presented. The approach is based on data mining and genetic algorithms. A global search mechanism, weighted decision tree, decision-tree-based wrapper, a correlation-based heuristic, and the identification of intersecting feature sets are employed for selecting significant genes. The feature selection approach has resulted in 85% reduction of number of features. The relative increase in cross-validation accuracy and specificity for the significant gene/SNP set was 10% and 3.2%, respectively. The feature selection approach was successfully applied to data sets for drug and placebo subjects. The number of features has been significantly reduced while the quality of knowledge was enhanced. The feature set intersection approach provided the most significant genes/SNPs. The results reported in the paper discuss associations among SNPs resulting in patient-specific treatment protocols.

  16. New multilocus linkage disequilibrium measure for tag SNP selection.

    PubMed

    Liao, Bo; Wang, Xiangjun; Zhu, Wen; Li, Xiong; Cai, Lijun; Chen, Haowen

    2017-02-01

    Numerous approaches have been proposed for selecting an optimal tag single-nucleotide polymorphism (SNP) set. Most of these approaches are based on linkage disequilibrium (LD). Classical LD measures, such as D' and r^2, are frequently used to quantify the pairwise linkage disequilibrium between two markers. Despite their successful use in many applications, these measures cannot be used to measure the LD among multiple markers. These LD measures need information about the frequencies of alleles collected from a haplotype dataset. In this study, a clustering algorithm is proposed to cluster SNPs according to a multilocus LD measure that is based on information theory. After that, tag SNPs are selected in each cluster, optimized by the number of tag SNPs, prediction accuracy and so on. The experimental results show that this new LD measure can be directly applied to genotype datasets collected from the HapMap project, so that it saves the cost of haplotyping. More importantly, the proposed method significantly improves the efficiency and prediction accuracy of tag SNP selection.
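
    For orientation, the classical pairwise measures mentioned in the abstract above (D' and r^2) can be computed from haplotype and allele frequencies as in the sketch below. This is not the multilocus, information-theoretic measure the authors propose, whose definition the abstract does not give; the function name and example frequencies are hypothetical.

```python
def pairwise_ld(p_ab, p_a, p_b):
    """Classical pairwise LD between two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A at locus 1 and B at locus 2
    p_a, p_b: frequencies of allele A and allele B
    Returns (D, D', r^2).
    """
    d = p_ab - p_a * p_b
    # D is normalised by its largest attainable magnitude given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if d_max == 0 or denom == 0:
        raise ValueError("both loci must be polymorphic")
    return d, abs(d) / d_max, d * d / denom


# Hypothetical frequencies: alleles A and B always travel together.
print(pairwise_ld(0.5, 0.5, 0.5))
# Hypothetical frequencies: the two loci are statistically independent.
print(pairwise_ld(0.25, 0.5, 0.5))
```

    Both measures require the haplotype frequency p_ab, which is why, as the abstract notes, they depend on haplotype data rather than unphased genotypes.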

  17. New generation pharmacogenomic tools: a SNP linkage disequilibrium Map, validated SNP assay resource, and high-throughput instrumentation system for large-scale genetic studies.

    PubMed

    De La Vega, Francisco M; Dailey, David; Ziegle, Janet; Williams, Julie; Madden, Dawn; Gilbert, Dennis A

    2002-06-01

    Since public and private efforts announced the first draft of the human genome last year, researchers have reported great numbers of single nucleotide polymorphisms (SNPs). We believe that the availability of well-mapped, quality SNP markers constitutes the gateway to a revolution in genetics and personalized medicine that will lead to better diagnosis and treatment of common complex disorders. A new generation of tools and public SNP resources for pharmacogenomic and genetic studies--specifically for candidate-gene, candidate-region, and whole-genome association studies--will form part of the new scientific landscape. This will only be possible through the greater accessibility of SNP resources and superior high-throughput instrumentation-assay systems that enable affordable, highly productive large-scale genetic studies. We are contributing to this effort by developing a high-quality linkage disequilibrium SNP marker map and an accompanying set of ready-to-use, validated SNP assays across every gene in the human genome. This effort incorporates both the public sequence and SNP data sources, and Celera Genomics' human genome assembly and enormous resource of physically mapped SNPs (approximately 4,000,000 unique records). This article discusses our approach and methodology for designing the map, choosing quality SNPs, designing and validating these assays, and obtaining population frequency of the polymorphisms. We also discuss an advanced, high-performance SNP assay chemistry--a new generation of the TaqMan probe-based, 5' nuclease assay--and high-throughput instrumentation-software system for large-scale genotyping. We provide the new SNP map and validation information, validated SNP assays and reagents, and instrumentation systems as a novel resource for genetic discoveries.

  18. Predicting HLA alleles from high-resolution SNP data in three Southeast Asian populations.

    PubMed

    Pillai, Nisha Esakimuthu; Okada, Yukinori; Saw, Woei-Yuh; Ong, Rick Twee-Hee; Wang, Xu; Tantoso, Erwin; Xu, Wenting; Peterson, Trevor A; Bielawny, Thomas; Ali, Mohammad; Tay, Koon-Yong; Poh, Wan-Ting; Tan, Linda Wei-Lin; Koo, Seok-Hwee; Lim, Wei-Yen; Soong, Richie; Wenk, Markus; Raychaudhuri, Soumya; Little, Peter; Plummer, Francis A; Lee, Edmund J D; Chia, Kee-Seng; Luo, Ma; De Bakker, Paul I W; Teo, Yik-Ying

    2014-08-15

    The major histocompatibility complex (MHC) containing the classical human leukocyte antigen (HLA) Class I and Class II genes is among the most polymorphic and diverse regions in the human genome. Despite the clinical importance of identifying the HLA types, very few databases jointly characterize densely genotyped single nucleotide polymorphisms (SNPs) and HLA alleles in the same samples. To date, the HapMap presents the only public resource that provides a SNP reference panel for predicting HLA alleles, constructed with four collections of individuals of north-western European, northern Han Chinese, cosmopolitan Japanese and Yoruba Nigerian ancestry. Owing to complex patterns of linkage disequilibrium in this region, it is unclear whether the HapMap reference panels can be appropriately utilized for other populations. Here, we describe a public resource for the Singapore Genome Variation Project with: (i) dense genotyping across ∼ 9000 SNPs in the MHC; (ii) four-digit HLA typing for eight Class I and Class II loci, in 96 southern Han Chinese, 89 Southeast Asian Malays and 83 Tamil Indians. This resource provides population estimates of the frequencies of HLA alleles at these eight loci in the three population groups, particularly for HLA-DPA1 and HLA-DPB1 that were not assayed in HapMap. Comparing between population-specific reference panels and a cosmopolitan panel created from all four HapMap populations, we demonstrate that more accurate imputation is obtained with population-specific panels than with the cosmopolitan panel, especially for the Malays and Indians but even when imputing between northern and southern Han Chinese. As with SNP imputation, common HLA alleles were imputed with greater accuracy than low-frequency variants.

  19. Prognostic impact of SNP array karyotyping in myelodysplastic syndromes and related myeloid malignancies

    PubMed Central

    Tiu, Ramon V.; Gondek, Lukasz P.; O'Keefe, Christine L.; Elson, Paul; Huh, Jungwon; Mohamedali, Azim; Kulasekararaj, Austin; Advani, Anjali S.; Paquette, Ronald; List, Alan F.; Sekeres, Mikkael A.; McDevitt, Michael A.

    2011-01-01

    Single nucleotide polymorphism arrays (SNP-As) have emerged as an important tool in the identification of chromosomal defects undetected by metaphase cytogenetics (MC) in hematologic cancers, offering superior resolution of unbalanced chromosomal defects and acquired copy-neutral loss of heterozygosity. Myelodysplastic syndromes (MDSs) and related cancers share recurrent chromosomal defects and molecular lesions that predict outcomes. We hypothesized that combining SNP-A and MC could improve diagnosis/prognosis and further the molecular characterization of myeloid malignancies. We analyzed MC/SNP-A results from 430 patients (MDS = 250, MDS/myeloproliferative overlap neoplasm = 95, acute myeloid leukemia from MDS = 85). The frequency and clinical significance of genomic aberrations were compared between MC and MC plus SNP-A. Combined MC/SNP-A karyotyping led to a higher diagnostic yield of chromosomal defects (74% vs 44%, P < .0001), compared with MC alone, often through detection of novel lesions in patients with normal/noninformative (54%) and abnormal (62%) MC results. Newly detected SNP-A defects contributed to poorer prognosis for patients stratified by current morphologic and clinical risk schemes. The presence and number of new SNP-A detected lesions are independent predictors of overall and event-free survival. The significant diagnostic and prognostic contributions of SNP-A–detected defects in MDS and related diseases underscore the utility of SNP-A when combined with MC in hematologic malignancies. PMID:21285439

  20. Analysis of high-order SNP barcodes in mitochondrial D-loop for chronic dialysis susceptibility.

    PubMed

    Yang, Cheng-Hong; Lin, Yu-Da; Chuang, Li-Yeh; Chang, Hsueh-Wei

    2016-10-01

    Positively identifying disease-associated single nucleotide polymorphism (SNP) markers in genome-wide studies entails the complex association analysis of a huge number of SNPs. Such large numbers of SNP barcodes (SNP/genotype combinations) continue to pose serious computational challenges, especially for high-dimensional data. We propose a novel method for exploiting SNP barcodes based on differential evolution, termed IDE (improved differential evolution). IDE uses a "top combination strategy" to improve the ability of differential evolution to explore high-order SNP barcodes in high-dimensional data. We simulate disease data and use real chronic dialysis data to test four global optimization algorithms. In 48 simulated disease models, we show that IDE outperforms existing global optimization algorithms in terms of exploring ability and power to detect the specific SNP/genotype combinations with a maximum difference between cases and controls. In real data, we show that IDE can be used to evaluate the relative effects of each individual SNP on disease susceptibility. IDE generated significant SNP barcodes with less computational complexity than the other algorithms, making IDE ideally suited for analysis of high-order SNP barcodes. Copyright © 2016 Elsevier Inc. All rights reserved.

  1. PCR amplification of SNP loci from crude DNA for large-scale genotyping of oomycetes.

    PubMed

    Hu, Jian; Lyon, Rebecca; Zhou, Yuxin; Lamour, Kurt

    2014-01-01

    Similar to other eukaryotes, single nucleotide polymorphism (SNP) markers are abundant in many oomycete plant pathogen genomes. High resolution DNA melting analysis (HR-DMA) is a cost-effective method for SNP genotyping, but like many SNP marker technologies, is limited by the amount and quality of template DNA. We describe PCR preamplification of Phytophthora and Peronospora SNP loci from crude DNA extracted from a small amount of mycelium and/or infected plant tissue to produce sufficient template to genotype at least 10 000 SNPs. The approach is fast, inexpensive, requires minimal biological material and should be useful for many organisms in a variety of contexts.

  2. Accuracy of direct genomic values in Holstein bulls and cows using subsets of SNP markers

    PubMed Central

    2010-01-01

    Background At the current price, the use of high-density single nucleotide polymorphisms (SNP) genotyping assays in genomic selection of dairy cattle is limited to applications involving elite sires and dams. The objective of this study was to evaluate the use of low-density assays to predict direct genomic value (DGV) on five milk production traits, an overall conformation trait, a survival index, and two profit index traits (APR, ASI). Methods Dense SNP genotypes were available for 42,576 SNP for 2,114 Holstein bulls and 510 cows. A subset of 1,847 bulls born between 1955 and 2004 was used as a training set to fit models with various sets of pre-selected SNP. A group of 297 bulls born between 2001 and 2004 and all cows born between 1992 and 2004 were used to evaluate the accuracy of DGV prediction. Ridge regression (RR) and partial least squares regression (PLSR) were used to derive prediction equations and to rank SNP based on the absolute value of the regression coefficients. Four alternative strategies were applied to select subsets of SNP, namely: subsets of the highest ranked SNP for each individual trait, or a single subset of evenly spaced SNP, where SNP were selected based on their rank for ASI, APR or minor allele frequency within intervals of approximately equal length. Results RR and PLSR performed very similarly in predicting DGV, with PLSR performing better for low-density assays and RR for higher-density SNP sets. When using all SNP, DGV predictions for production traits, which have a higher heritability, were more accurate (0.52-0.64) than for survival (0.19-0.20), which has a low heritability. The gain in accuracy using subsets that included the highest ranked SNP for each trait was marginal (5-6%) over a common set of evenly spaced SNP when at least 3,000 SNP were used. Subsets containing 3,000 SNP provided more than 90% of the accuracy that could be achieved with a high-density assay for cows, and 80% of the high-density assay for young bulls
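
    The "evenly spaced" selection strategy described in this abstract can be illustrated with a minimal sketch. It assumes that each SNP carries a map position and a single score (for example its minor allele frequency or its rank-derived weight for ASI or APR), and that the best-scoring SNP is kept in each interval of equal length. The function name, the tuple layout and the toy data are hypothetical and are not taken from the study.

```python
def evenly_spaced_subset(snps, n_intervals):
    """Keep the best-scoring SNP in each of n_intervals equal-length intervals.

    snps: list of (name, position, score) tuples; a higher score is better.
    Returns SNP names ordered by interval. Intervals with no SNP are skipped.
    """
    lo = min(pos for _, pos, _ in snps)
    hi = max(pos for _, pos, _ in snps)
    # Guard against a zero width when all SNPs share one position.
    width = (hi - lo) / n_intervals or 1
    best = {}
    for name, pos, score in snps:
        # The right-most SNP falls exactly on the upper bound; clamp it.
        idx = min(int((pos - lo) / width), n_intervals - 1)
        if idx not in best or score > best[idx][2]:
            best[idx] = (name, pos, score)
    return [best[i][0] for i in sorted(best)]


# Toy data (hypothetical): name, position in bp, minor allele frequency.
toy_snps = [
    ("snp_a", 100, 0.10), ("snp_b", 400, 0.45), ("snp_c", 900, 0.30),
    ("snp_d", 1600, 0.05), ("snp_e", 1900, 0.48), ("snp_f", 2100, 0.20),
]
print(evenly_spaced_subset(toy_snps, 2))  # ['snp_b', 'snp_e']
```

    Choosing one SNP per interval trades some per-trait accuracy for a single panel shared by all traits, which is the comparison the abstract reports.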

  3. Obesity-related known and candidate SNP markers can significantly change affinity of TATA-binding protein for human gene promoters

    PubMed Central

    2015-01-01

    Background Obesity affects quality of life and life expectancy and is associated with cardiovascular disorders, cancer, diabetes, reproductive disorders in women, prostate diseases in men, and congenital anomalies in children. The use of single nucleotide polymorphism (SNP) markers of diseases and drug responses (i.e., significant differences of personal genomes of patients from the reference human genome) can help physicians to improve treatment. Clinical research can validate SNP markers via genotyping of patients and demonstration that SNP alleles are significantly more frequent in patients than in healthy people. The search for biomedical SNP markers of interest can be accelerated by computer-based analysis of hundreds of millions of SNPs in the 1000 Genomes project because of selection of the most meaningful candidate SNP markers and elimination of neutral SNPs. Results We cross-validated the output of two computer-based methods: DNA sequence analysis using Web service SNP_TATA_Comparator and keyword search for articles on comorbidities of obesity. Near the sites binding to TATA-binding protein (TBP) in human gene promoters, we found 22 obesity-related candidate SNP markers, including rs10895068 (male breast cancer in obesity); rs35036378 (reduced risk of obesity after ovariectomy); rs201739205 (reduced risk of obesity-related cancers due to weight loss by diet/exercise in obese postmenopausal women); rs183433761 (obesity resistance during a high-fat diet); rs367732974 and rs549591993 (both: cardiovascular complications in obese patients with type 2 diabetes mellitus); rs200487063 and rs34104384 (both: obesity-caused hypertension); rs35518301, rs72661131, and rs562962093 (all: obesity); and rs397509430, rs33980857, rs34598529, rs33931746, rs33981098, rs34500389, rs63750953, rs281864525, rs35518301, and rs34166473 (all: chronic inflammation in comorbidities of obesity). Using an electrophoretic mobility shift assay under nonequilibrium conditions, we

  4. Obesity-related known and candidate SNP markers can significantly change affinity of TATA-binding protein for human gene promoters.

    PubMed

    Arkova, Olga V; Ponomarenko, Mikhail P; Rasskazov, Dmitry A; Drachkova, Irina A; Arshinova, Tatjana V; Ponomarenko, Petr M; Savinkova, Ludmila K; Kolchanov, Nikolay A

    2015-01-01

    Obesity affects quality of life and life expectancy and is associated with cardiovascular disorders, cancer, diabetes, reproductive disorders in women, prostate diseases in men, and congenital anomalies in children. The use of single nucleotide polymorphism (SNP) markers of diseases and drug responses (i.e., significant differences of personal genomes of patients from the reference human genome) can help physicians to improve treatment. Clinical research can validate SNP markers via genotyping of patients and demonstration that SNP alleles are significantly more frequent in patients than in healthy people. The search for biomedical SNP markers of interest can be accelerated by computer-based analysis of hundreds of millions of SNPs in the 1000 Genomes project because of selection of the most meaningful candidate SNP markers and elimination of neutral SNPs. We cross-validated the output of two computer-based methods: DNA sequence analysis using Web service SNP_TATA_Comparator and keyword search for articles on comorbidities of obesity. Near the sites binding to TATA-binding protein (TBP) in human gene promoters, we found 22 obesity-related candidate SNP markers, including rs10895068 (male breast cancer in obesity); rs35036378 (reduced risk of obesity after ovariectomy); rs201739205 (reduced risk of obesity-related cancers due to weight loss by diet/exercise in obese postmenopausal women); rs183433761 (obesity resistance during a high-fat diet); rs367732974 and rs549591993 (both: cardiovascular complications in obese patients with type 2 diabetes mellitus); rs200487063 and rs34104384 (both: obesity-caused hypertension); rs35518301, rs72661131, and rs562962093 (all: obesity); and rs397509430, rs33980857, rs34598529, rs33931746, rs33981098, rs34500389, rs63750953, rs281864525, rs35518301, and rs34166473 (all: chronic inflammation in comorbidities of obesity). Using an electrophoretic mobility shift assay under nonequilibrium conditions, we empirically validated the

  5. A Genome-Wide Association Study for Agronomic Traits in Soybean Using SNP Markers and SNP-Based Haplotype Analysis

    PubMed Central

    de Oliveira, Marco Antônio Rott; Higashi, Wilson; Scapim, Carlos Alberto; Schuster, Ivan

    2017-01-01

    Mapping quantitative trait loci through the use of linkage disequilibrium (LD) in populations of unrelated individuals provides a valuable approach for dissecting the genetic basis of complex traits in soybean (Glycine max). The haplotype-based genome-wide association study (GWAS) has now been proposed as a complementary approach to intensify benefits from LD, enabling assessment of the genetic determinants of agronomic traits. In this study a GWAS was undertaken to identify genomic regions that control 100-seed weight (SW), plant height (PH) and seed yield (SY) in a soybean association mapping panel using single nucleotide polymorphism (SNP) markers and haplotype information. The soybean cultivars (N = 169) were field-evaluated across four locations of southern Brazil. The genome-wide haplotype association analysis (941 haplotypes) identified eleven, seventeen and fifty-nine SNP-based haplotypes significantly associated with SY, SW and PH, respectively. Although most marker-trait associations were environment and trait specific, stable haplotype associations were identified for SY and SW across environments (i.e., haplotypes Gm12_Hap12). The haplotype block 42 on Chr19 (Gm19_Hap42) was confirmed to be associated with PH in two environments. These findings enable us to refine the breeding strategy for tropical soybean, and confirm that haplotype-based GWAS can provide new insights on the genetic determinants that are not captured by the single-marker approach. PMID:28152092

  6. A Genome-Wide Association Study for Agronomic Traits in Soybean Using SNP Markers and SNP-Based Haplotype Analysis.

    PubMed

    Contreras-Soto, Rodrigo Iván; Mora, Freddy; de Oliveira, Marco Antônio Rott; Higashi, Wilson; Scapim, Carlos Alberto; Schuster, Ivan

    2017-01-01

    Mapping quantitative trait loci through the use of linkage disequilibrium (LD) in populations of unrelated individuals provides a valuable approach for dissecting the genetic basis of complex traits in soybean (Glycine max). The haplotype-based genome-wide association study (GWAS) has now been proposed as a complementary approach to intensify benefits from LD, enabling assessment of the genetic determinants of agronomic traits. In this study a GWAS was undertaken to identify genomic regions that control 100-seed weight (SW), plant height (PH) and seed yield (SY) in a soybean association mapping panel using single nucleotide polymorphism (SNP) markers and haplotype information. The soybean cultivars (N = 169) were field-evaluated across four locations of southern Brazil. The genome-wide haplotype association analysis (941 haplotypes) identified eleven, seventeen and fifty-nine SNP-based haplotypes significantly associated with SY, SW and PH, respectively. Although most marker-trait associations were environment and trait specific, stable haplotype associations were identified for SY and SW across environments (i.e., haplotypes Gm12_Hap12). The haplotype block 42 on Chr19 (Gm19_Hap42) was confirmed to be associated with PH in two environments. These findings enable us to refine the breeding strategy for tropical soybean, and confirm that haplotype-based GWAS can provide new insights on the genetic determinants that are not captured by the single-marker approach.

  7. Comparing the efficacy of SNP filtering methods for identifying a single causal SNP in a known association region.

    PubMed

    Spencer, Amy Victoria; Cox, Angela; Walters, Kevin

    2014-01-01

    Genome-wide association studies have successfully identified associations between common diseases and a large number of single nucleotide polymorphisms (SNPs) across the genome. We investigate the effectiveness of several statistics, including p-values, likelihoods, genetic map distance and linkage disequilibrium between SNPs, in filtering SNPs in several disease-associated regions. We use simulated data to compare the efficacy of filters with different sample sizes and for causal SNPs with different minor allele frequencies (MAFs) and effect sizes, focusing on the small effect sizes and MAFs likely to represent the majority of unidentified causal SNPs. In our analyses, of all the methods investigated, filtering on the ranked likelihoods consistently retains the true causal SNP with the highest probability for a given false positive rate. This was the case for all the local linkage disequilibrium patterns investigated. Our results indicate that when using this method to retain only the top 5% of SNPs, even a causal SNP with an odds ratio of 1.1 and MAF of 0.08 can be retained with a probability exceeding 0.9 using an overall sample size of 50,000. © 2013 John Wiley & Sons Ltd/University College London.

  8. SNP Discovery for mapping alien introgressions in wheat

    PubMed Central

    2014-01-01

    Background Monitoring alien introgressions in crop plants is difficult due to the lack of genetic and molecular mapping information on the wild crop relatives. The tertiary gene pool of wheat is a very important source of genetic variability for wheat improvement against biotic and abiotic stresses. By exploring the 5Mg short arm (5MgS) of Aegilops geniculata, we can apply chromosome genomics for the discovery of SNP markers and their use for monitoring alien introgressions in wheat (Triticum aestivum L). Results The short arm of chromosome 5Mg of Ae. geniculata Roth (syn. Ae. ovata L.; 2n = 4x = 28, UgUgMgMg) was flow-sorted from a wheat line in which it is maintained as a telocentric chromosome. DNA of the sorted arm was amplified and sequenced using an Illumina Hiseq 2000 with ~45x coverage. The sequence data was used for SNP discovery against wheat homoeologous group-5 assemblies. A total of 2,178 unique, 5MgS-specific SNPs were discovered. Randomly selected samples of 59 5MgS-specific SNPs were tested (44 by KASPar assay and 15 by Sanger sequencing) and 84% were validated. Of the selected SNPs, 97% mapped to a chromosome 5Mg addition to wheat (the source of t5MgS), and 94% to 5Mg introgressed from a different accession of Ae. geniculata substituting for chromosome 5D of wheat. The validated SNPs also identified chromosome segments of 5MgS origin in a set of T5D-5Mg translocation lines; eight SNPs (25%) mapped to TA5601 [T5DL · 5DS-5MgS(0.75)] and three (8%) to TA5602 [T5DL · 5DS-5MgS (0.95)]. SNPs (gsnp_5ms83 and gsnp_5ms94), tagging chromosome T5DL · 5DS-5MgS(0.95) with the smallest introgression carrying resistance to leaf rust (Lr57) and stripe rust (Yr40), were validated in two released germplasm lines with Lr57 and Yr40 genes. Conclusion This approach should be widely applicable for the identification of species/genome-specific SNPs. The development of a large number of SNP markers will facilitate the precise introgression and

  9. Molecular cloning and SNP association analysis of chicken PMCH gene.

    PubMed

    Sun, Guirong; Li, Ming; Li, Hong; Tian, Yadong; Chen, Qixin; Bai, Yichun; Kang, Xiangtao

    2013-08-01

    The pre-melanin-concentrating hormone (PMCH) gene is an important gene involved in the regulation of body fat content, feeding behavior and energy balance. In this study, the full-length cDNA of chicken PMCH gene was amplified by SMART RACE method. The single nucleotide polymorphisms (SNPs) in the PMCH gene were screened by comparative sequence analysis. The non-synonymous coding SNP (ncSNP) obtained was genotyped first, and its effects on growth, carcass characteristics and meat quality traits were investigated in the F2 resource population of Gushi chicken crossed with Anak broiler by AluI CRS-PCR-RFLP. Our results indicated that the cDNA of chicken PMCH shared 67.25 and 66.47% homology with that of human and bovine PMCH, respectively. The deduced amino acid sequence of chicken PMCH (163 amino acids) was 52.07 and 50.89% identical to those of human and bovine PMCH, respectively. The PMCH protein sequence is predicted to have several functional domains, including pro-MCH, CSP, IL7, XPGI and some low complexity sequence. It has 8 phosphorylation sites and no signal peptide sequence. Targeting sites for the microRNAs gga-miR-18a, gga-miR-18b and gga-miR-499 were predicted in the 3' untranslated region of chicken PMCH mRNA. In addition, a total of seven SNPs, including an ncSNP and a synonymous coding SNP, were identified in the PMCH gene. The ncSNP c.81 A>T was found to be in a moderately polymorphic state (polymorphic index=0.365), and the frequencies for genotype AA, AB and BB were 0.3648, 0.4682 and 0.1670, respectively. Significant associations between the locus and shear force of breast and leg were observed. This polymorphic site may serve as a useful target for the marker assisted selection of the growth and meat quality traits in chicken.
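
    The reported genotype frequencies and polymorphic index can be related by a short worked calculation. The abstract does not state which formula was used; the sketch below assumes the standard polymorphism information content (PIC) for a biallelic locus, PIC = 1 - p^2 - q^2 - 2 p^2 q^2, with allele frequencies obtained by gene counting from the genotype frequencies. Under that assumption the result agrees with the reported value of 0.365. The function names are illustrative only.

```python
def allele_freqs(f_aa, f_ab, f_bb):
    """Allele frequencies by gene counting: each heterozygote carries one copy of each allele."""
    p = f_aa + f_ab / 2
    q = f_bb + f_ab / 2
    return p, q


def pic_biallelic(p, q):
    """Polymorphism information content for a locus with two alleles (assumed formula)."""
    return 1 - p**2 - q**2 - 2 * p**2 * q**2


# Genotype frequencies for AA, AB and BB as reported in the abstract.
p, q = allele_freqs(0.3648, 0.4682, 0.1670)
pic = pic_biallelic(p, q)
print(round(p, 4), round(q, 4), round(pic, 3))  # 0.5989 0.4011 0.365
```

    A PIC between 0.25 and 0.5 is conventionally described as moderately polymorphic, which matches the wording of the abstract.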

  10. A HapMap leads to a Capsicum annuum SNP infinium array: a new tool for pepper breeding.

    PubMed

    Hulse-Kemp, Amanda M; Ashrafi, Hamid; Plieske, Joerg; Lemm, Jana; Stoffel, Kevin; Hill, Theresa; Luerssen, Hartmut; Pethiyagoda, Charit L; Lawley, Cindy T; Ganal, Martin W; Van Deynze, Allen

    2016-01-01

    The Capsicum genus (Pepper) is a part of the Solanaceae family. It has been important in many cultures worldwide for its key nutritional components and uses as spices, medicines, ornamentals and vegetables. Worldwide population growth is associated with demand for more nutritionally valuable vegetables while contending with decreasing resources and available land. These conditions require increased efficiency in pepper breeding to deal with these imminent challenges. Through resequencing of inbred lines we have completed a valuable haplotype map (HapMap) for the pepper genome based on single-nucleotide polymorphisms (SNP). The identified SNPs were annotated and classified based on their gene annotation in the pepper draft genome sequence and phenotype of the sequenced inbred lines. A selection of one marker per gene model was utilized to create the PepperSNP16K array, which simultaneously genotyped 16 405 SNPs, of which 90.7% were found to be informative. A set of 84 inbred and hybrid lines and a mapping population of 90 interspecific F2 individuals were utilized to validate the array. Diversity analysis of the inbred lines shows a distinct separation of bell versus chile/hot pepper types and separates them into five distinct germplasm groups. The interspecific population created between Tabasco (C. frutescens chile type) and P4 (C. annuum blocky type) produced a linkage map with 5546 markers separated into 1361 bins on 12 linkage groups representing 1392.3 cM. This publicly available genotyping platform can be used to rapidly assess a large number of markers in a reproducible high-throughput manner for pepper. As a standardized tool for genetic analyses, the PepperSNP16K can be used worldwide to share findings and analyze QTLs for important traits leading to continued improvement of pepper for consumers. Data and information on the array are available through the Solanaceae Genomics Network.

  11. A HapMap leads to a Capsicum annuum SNP infinium array: a new tool for pepper breeding

    PubMed Central

    Hulse-Kemp, Amanda M; Ashrafi, Hamid; Plieske, Joerg; Lemm, Jana; Stoffel, Kevin; Hill, Theresa; Luerssen, Hartmut; Pethiyagoda, Charit L; Lawley, Cindy T; Ganal, Martin W; Van Deynze, Allen

    2016-01-01

    The Capsicum genus (Pepper) is a part of the Solanaceae family. It has been important in many cultures worldwide for its key nutritional components and uses as spices, medicines, ornamentals and vegetables. Worldwide population growth is associated with demand for more nutritionally valuable vegetables while contending with decreasing resources and available land. These conditions require increased efficiency in pepper breeding to deal with these imminent challenges. Through resequencing of inbred lines we have completed a valuable haplotype map (HapMap) for the pepper genome based on single-nucleotide polymorphisms (SNP). The identified SNPs were annotated and classified based on their gene annotation in the pepper draft genome sequence and phenotype of the sequenced inbred lines. A selection of one marker per gene model was utilized to create the PepperSNP16K array, which simultaneously genotyped 16 405 SNPs, of which 90.7% were found to be informative. A set of 84 inbred and hybrid lines and a mapping population of 90 interspecific F2 individuals were utilized to validate the array. Diversity analysis of the inbred lines shows a distinct separation of bell versus chile/hot pepper types and separates them into five distinct germplasm groups. The interspecific population created between Tabasco (C. frutescens chile type) and P4 (C. annuum blocky type) produced a linkage map with 5546 markers separated into 1361 bins on 12 linkage groups representing 1392.3 cM. This publicly available genotyping platform can be used to rapidly assess a large number of markers in a reproducible high-throughput manner for pepper. As a standardized tool for genetic analyses, the PepperSNP16K can be used worldwide to share findings and analyze QTLs for important traits leading to continued improvement of pepper for consumers. Data and information on the array are available through the Solanaceae Genomics Network. PMID:27602231

  12. miRNA-Mediated Relationships between Cis-SNP Genotypes and Transcript Intensities in Lymphocyte Cell Lines

    PubMed Central

    Zhang, Wensheng; Edwards, Andrea; Zhu, Dongxiao; Flemington, Erik K.; Deininger, Prescott; Zhang, Kun

    2012-01-01

    In metazoans, miRNAs regulate gene expression primarily through binding to target sites in the 3′ UTRs (untranslated regions) of messenger RNAs (mRNAs). Cis-acting variants within, or close to, a gene are crucial in explaining the variability of gene expression measures. Single nucleotide polymorphisms (SNPs) in the 3′ UTRs of genes can affect the base-pairing between miRNAs and mRNAs, and hence disrupt existing target sites (in the reference sequence) or create novel target sites, suggesting a possible mechanism for cis regulation of gene expression. Moreover, because the alleles of different SNPs within a DNA sequence of limited length tend to be in strong linkage disequilibrium (LD), we hypothesize the variants of miRNA target sites caused by SNPs potentially function as bridges linking the documented cis-SNP markers to the expression of the associated genes. A large-scale analysis was herein performed to test this hypothesis. By systematically integrating multiple latest information sources, we found 21 significant gene-level SNP-involved miRNA-mediated post-transcriptional regulation modules (SNP-MPRMs) in the form of SNP-miRNA-mRNA triplets in lymphocyte cell lines for the CEU and YRI populations. Among the cognate genes, six including ALG8, DGKE, GNA12, KLF11, LRPAP1, and MMAB are related to multiple genetic diseases such as depressive disorder and Type-II diabetes. Furthermore, we found that ∼35% of the documented transcript intensity-related cis-SNPs (∼950) in a recent publication are identical to, or in significant linkage disequilibrium (LD) (p<0.01) with, one or multiple SNPs located in miRNA target sites. Based on these associations (or identities), 69 significant exon-level SNP-MPRMs and 12 disease genes were further determined for two populations. These results provide concrete in silico evidence for the proposed hypothesis. The discovered modules warrant additional follow-up in independent laboratory studies. PMID:22348086

  13. Estimating genomic diversity and population differentiation - an empirical comparison of microsatellite and SNP variation in Arabidopsis halleri.

    PubMed

    Fischer, Martin C; Rellstab, Christian; Leuzinger, Marianne; Roumet, Marie; Gugerli, Felix; Shimizu, Kentaro K; Holderegger, Rolf; Widmer, Alex

    2017-01-11

    Microsatellite markers are widely used for estimating genetic diversity within and differentiation among populations. However, it has rarely been tested whether such estimates are useful proxies for genome-wide patterns of variation and differentiation. Here, we compared microsatellite variation with genome-wide single nucleotide polymorphisms (SNPs) to assess and quantify potential marker-specific biases and derive recommendations for future studies. Overall, we genotyped 180 Arabidopsis halleri individuals from nine populations using 20 microsatellite markers. Twelve of these markers were originally developed for Arabidopsis thaliana (cross-species markers) and eight for A. halleri (species-specific markers). We further characterized 2 million SNPs across the genome with a pooled whole-genome re-sequencing approach (Pool-Seq). Our analyses revealed that estimates of genetic diversity and differentiation derived from cross-species and species-specific microsatellites differed substantially and that expected microsatellite heterozygosity (SSR-He) was not significantly correlated with genome-wide SNP diversity estimates (SNP-He and θWatterson) in A. halleri. Instead, microsatellite allelic richness (Ar) was a better proxy for genome-wide SNP diversity. Estimates of genetic differentiation among populations (FST) based on both marker types were correlated, but microsatellite-based estimates were significantly larger than those from SNPs. Possible causes include the limited number of microsatellite markers used, marker ascertainment bias, as well as the high variance in microsatellite-derived estimates. In contrast, genome-wide SNP data provided unbiased estimates of genetic diversity independent of whether genome- or only exome-wide SNPs were used. Further, we inferred that a few thousand random SNPs are sufficient to reliably estimate genome-wide diversity and to distinguish among populations differing in genetic variation. We recommend that future analyses of

  14. Development of maizeSNP3072, a high-throughput compatible SNP array, for DNA fingerprinting identification of Chinese maize varieties.

    PubMed

    Tian, Hong-Li; Wang, Feng-Ge; Zhao, Jiu-Ran; Yi, Hong-Mei; Wang, Lu; Wang, Rui; Yang, Yang; Song, Wei

    2015-01-01

    Single nucleotide polymorphisms (SNPs) are abundant and evenly distributed throughout the maize (Zea mays L.) genome. SNPs have several advantages over simple sequence repeats, such as ease of data comparison and integration, high-throughput processing of loci, and identification of associated phenotypes. SNPs are thus ideal for DNA fingerprinting, genetic diversity analysis, and marker-assisted breeding. Here, we developed a high-throughput and compatible SNP array, maizeSNP3072, containing 3072 SNPs developed from the maizeSNP50 array. To improve genotyping efficiency, a high-quality cluster file, maizeSNP3072_GT.egt, was constructed. All 3072 SNP loci were localized within different genes, where they were distributed in exons (43 %), promoters (21 %), 3' untranslated regions (UTRs; 22 %), 5' UTRs (9 %), and introns (5 %). The average genotyping failure rate using these SNPs was only 6 %, or 3 % using the cluster file to call genotypes. The genotype consistency of repeat sample analysis on Illumina GoldenGate versus Infinium platforms exceeded 96.4 %. The minor allele frequency (MAF) of the SNPs averaged 0.37 based on data from 309 inbred lines. The 3072 SNPs were highly effective for distinguishing among 276 examined hybrids. Comparative analysis using Chinese varieties revealed that the 3072SNP array showed a better marker success rate and higher average MAF values, evaluation scores, and variety-distinguishing efficiency than the maizeSNP50K array. The maizeSNP3072 array thus can be successfully used in DNA fingerprinting identification of Chinese maize varieties and shows potential as a useful tool for germplasm resource evaluation and molecular marker-assisted breeding.

  15. SNPWave™: a flexible multiplexed SNP genotyping technology

    PubMed Central

    van Eijk, Michiel J. T.; Broekhof, José L. N.; van der Poel, Hein J. A.; Hogers, René C. J.; Schneiders, Harrie; Kamerbeek, Judith; Verstege, Esther; van Aart, Joris W.; Geerlings, Henk; Buntjer, Jaap B.; van Oeveren, A. Jan; Vos, Pieter

    2004-01-01

    Scalable multiplexed amplification technologies are needed for cost-effective large-scale genotyping of genetic markers such as single nucleotide polymorphisms (SNPs). We present SNPWave™, a novel SNP genotyping technology to detect various subsets of sequences in a flexible fashion in a fixed detection format. SNPWave is based on highly multiplexed ligation, followed by amplification of up to 20 ligated probes in a single PCR. Depending on the multiplexing level of the ligation reaction, the latter employs selective amplification using the amplified fragment length polymorphism (AFLP®) technology. Detection of SNPWave reaction products is based on size separation on a sequencing instrument with multiple fluorescence labels and short run times. The SNPWave technique is illustrated by a 100-plex genotyping assay for Arabidopsis, a 40-plex assay for tomato and a 10-plex assay for Caenorhabditis elegans, detected on the MegaBACE 1000 capillary sequencer. PMID:15004220

  16. SNP-VISTA: An Interactive SNPs Visualization Tool

    SciTech Connect

    Shah, Nameeta; Teplitsky, Michael V.; Pennacchio, Len A.; Hugenholtz, Philip; Hamann, Bernd; Dubchak, Inna L.

    2005-07-05

    Recent advances in sequencing technologies promise better diagnostics for many diseases as well as better understanding of evolution of microbial populations. Single Nucleotide Polymorphisms (SNPs) are established genetic markers that aid in the identification of loci affecting quantitative traits and/or disease in a wide variety of eukaryotic species. With today's technological capabilities, it is possible to re-sequence a large set of appropriate candidate genes in individuals with a given disease and then screen for causative mutations. In addition, SNPs have been used extensively in efforts to study the evolution of microbial populations, and the recent application of random shotgun sequencing to environmental samples makes possible more extensive SNP analysis of co-occurring and co-evolving microbial populations. The program is available at http://genome.lbl.gov/vista/snpvista.

  17. Linear reduction method for predictive and informative tag SNP selection.

    PubMed

    He, Jingwu; Westbrooks, Kelly; Zelikovsky, Alexander

    2005-01-01

    Constructing a complete human haplotype map is helpful when associating complex diseases with their related SNPs. Unfortunately, the number of SNPs is very large and it is costly to sequence many individuals. Therefore, it is desirable to reduce the number of SNPs that should be sequenced to a small number of informative representatives called tag SNPs. In this paper, we propose a new linear algebra-based method for selecting and using tag SNPs. We measure the quality of our tag SNP selection algorithm by comparing actual SNPs with SNPs predicted from selected linearly independent tag SNPs. Our experiments show that for sufficiently long haplotypes, knowing only 0.4% of all SNPs the proposed linear reduction method predicts an unknown haplotype with the error rate below 2% based on 10% of the population.
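
    The idea of keeping only linearly independent SNPs as tags can be sketched as follows. This is not the authors' algorithm, whose details are not given in the abstract; it is a minimal illustration that assumes haplotypes are coded as 0/1 rows and that greedily keeps each SNP column that cannot be written as a linear combination of the columns already kept. Exact rational arithmetic avoids floating-point rank errors. The function name and toy data are hypothetical, and the prediction step described in the abstract is not shown.

```python
from fractions import Fraction


def select_tag_snps(haplotypes):
    """Return indices of SNP columns that are linearly independent (greedy, left to right).

    haplotypes: list of equal-length rows, one per haplotype, with 0/1 alleles.
    """
    n_snps = len(haplotypes[0])
    basis = []  # (pivot_row, reduced_column) pairs for the columns kept so far
    tags = []
    for j in range(n_snps):
        col = [Fraction(row[j]) for row in haplotypes]
        # Subtract the component along each kept column (Gaussian elimination).
        for pivot, vec in basis:
            if col[pivot] != 0:
                factor = col[pivot] / vec[pivot]
                col = [c - factor * v for c, v in zip(col, vec)]
        # A non-zero remainder means the column adds new information.
        pivot = next((i for i, c in enumerate(col) if c != 0), None)
        if pivot is not None:
            basis.append((pivot, col))
            tags.append(j)
    return tags


# Toy haplotypes (hypothetical): SNP 2 equals SNP 0 + SNP 1, so it is redundant.
toy_haplotypes = [
    [1, 0, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 1],
    [1, 0, 1, 1],
]
print(select_tag_snps(toy_haplotypes))  # [0, 1, 3]
```

    In this toy case the untyped SNP 2 could be recovered exactly as the sum of the tag SNPs 0 and 1; on real data the linear relationship would only hold approximately.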

  18. Grouping preprocess for haplotype inference from SNP and CNV data

    NASA Astrophysics Data System (ADS)

    Shindo, Hiroyuki; Chigira, Hiroshi; Nagaoka, Tomoyo; Kamatani, Naoyuki; Inoue, Masato

    2009-12-01

    The method of statistical haplotype inference is an indispensable technique in the field of medical science. The authors previously reported Hardy-Weinberg equilibrium-based haplotype inference that could manage single nucleotide polymorphism (SNP) data. We recently extended the method to cover copy number variation (CNV) data. Haplotype inference from mixed data is important because SNPs and CNVs are occasionally in linkage disequilibrium. The idea underlying the proposed method is simple, but the algorithm for it needs to be quite elaborate to reduce the calculation cost. Consequently, we have focused on the details of the algorithm in this study. Although the main advantage of the method is accuracy, in that it does not use any approximation, its main disadvantage is still the calculation cost, which is sometimes intractable for large data sets with missing values.

  19. Authentication of medicinal plants by SNP-based multiplex PCR.

    PubMed

    Lee, Ok Ran; Kim, Min-Kyeoung; Yang, Deok-Chun

    2012-01-01

    Highly variable intergenic spacer and intron regions from nuclear and cytoplasmic DNA have been used for species identification. Noncoding internal transcribed spacers (ITSs) located in 18S-5.8S-26S, and 5S ribosomal RNA genes (rDNAs) represent suitable regions for medicinal plant authentication. Noncoding regions from two cytoplasmic DNAs, chloroplast DNA (trnT-F intergenic spacer region) and mitochondrial DNA (fourth intron region of the nad7 gene), have also been successfully applied to the identification of medicinal plants. Single-nucleotide polymorphism (SNP) sites obtained from the amplification of intergenic spacer and intron regions can be used to verify medicinal plants at the species level using multiplex PCR. Multiplex PCR is a variant of the PCR technique used to amplify more than two loci simultaneously.

  20. Multilocus analysis of SNP and metabolic data within a given pathway

    PubMed Central

    Kristensen, Vessela N; Tsalenko, Anya; Geisler, Jurgen; Faldaas, Anne; Grenaker, Grethe Irene; Lingjærde, Ole Christian; Fjeldstad, Ståle; Yakhini, Zohar; Lønning, Per Eystein; Børresen-Dale, Anne-Lise

    2006-01-01

    Background: Complex traits, which are under the influence of multiple and possibly interacting genes, have become a subject of new statistical methodological research. One of the greatest challenges facing human geneticists is the identification and characterization of susceptibility genes for common multifactorial diseases and their association to different quantitative phenotypic traits. Results: Two types of data from the same metabolic pathway were used in the analysis: categorical measurements of 18 SNPs; and quantitative measurements of plasma levels of several steroids and their precursors. Using the combinatorial partitioning method we tested various thresholds for each metabolic trait and each individual SNP locus. One SNP in the CYP19 3'UTR, two SNPs in CYP1B1 (R48G and A119S) and one in CYP1A1 (T461N) were significantly differently distributed between the high and low level metabolic groups. The leave-one-out cross-validation method showed that 6 SNPs in concert give 65% correct prediction of phenotype. Further, we used pattern recognition, computing the p-value by Monte Carlo simulation, to identify sets of SNPs and physiological characteristics such as age and weight that contribute to a given metabolic level. Since the SNPs detected by both methods reside either in the same gene (CYP1B1) or in 3 different genes in immediate vicinity on chromosome 15 (CYP19, CYP11 and CYP1A1), we investigated the possibility that they form intragenic and intergenic haplotypes, which may jointly account for a higher activity in the pathway. We identified such haplotypes associated with metabolic levels. Conclusion: The methods reported here may enable the study of multiple low-penetrance genetic factors that together determine various quantitative phenotypic traits. Our preliminary data suggest that several genes coding for proteins involved in a common pathway, that happen to be located on common chromosomal areas and may form intragenic haplotypes, together account for a higher

  1. Eurasiaplex: a forensic SNP assay for differentiating European and South Asian ancestries.

    PubMed

    Phillips, C; Freire Aradas, A; Kriegel, A K; Fondevila, M; Bulbul, O; Santos, C; Serrulla Rech, F; Perez Carceles, M D; Carracedo, Á; Schneider, P M; Lareu, M V

    2013-05-01

    We have selected a set of single nucleotide polymorphisms (SNPs) with the specific aim of differentiating European and South Asian ancestries. The SNPs were combined into a 23-plex SNaPshot primer extension assay: Eurasiaplex, designed to complement an existing 34-plex forensic ancestry test with both marker sets occupying well-spaced genomic positions, enabling their combination as single profile submissions to the Bayesian Snipper forensic ancestry inference system. We analyzed the ability of Eurasiaplex plus 34-plex SNPs to assign ancestry to a total of 1648 profiles from 16 European, 7 Middle East, 13 Central-South Asian and 21 East Asian populations. Ancestry assignment likelihoods were estimated from Snipper using training sets of five-group data (three Eurasian groups, East Asian and African genotypes) and four-group data (Middle East genotypes removed). Five-group differentiations gave assignment success of 91% for NW European populations, 72% for Middle East populations and 39% for Central-South Asian populations, indicating Middle East individuals are not reliably differentiated from either Europeans or Central-South Asians. Four-group differentiations provided markedly improved assignment success rates of 97% for most continental Europeans tested (excluding Turkish and Adygei at the far eastern edge of Europe) and 95% for Central-South Asians, despite applying a probability threshold for the highest likelihood ratio above '100 times more likely'. As part of the assessment of the sensitivity of Eurasiaplex to analyze challenging forensic material we detail Eurasiaplex and 34-plex SNP typing to infer ancestry of a cranium recovered from the sea, achieving 82% SNP genotype completeness. Therefore, Eurasiaplex provides an informative and forensically robust approach to the differentiation of European and South Asian ancestries amongst Eurasian populations. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.

  2. A 50-SNP assay for biogeographic ancestry and phenotype prediction in the U.S. population.

    PubMed

    Gettings, Katherine Butler; Lai, Ronald; Johnson, Joni L; Peck, Michelle A; Hart, Jessica A; Gordish-Dressman, Heather; Schanfield, Moses S; Podini, Daniele S

    2014-01-01

    When an STR DNA profile obtained from crime scene evidence does not match identified suspects or profiles from available databases, further DNA analyses targeted at inferring the possible ancestral origin and phenotypic characteristics of the perpetrator could yield valuable information. Single Nucleotide Polymorphisms (SNPs), the most common form of genetic polymorphisms, have alleles associated with specific populations and/or correlated to physical characteristics. We have used single base primer extension (SBE) technology to develop a 50 SNP assay (composed of three multiplexes) designed to predict ancestry among the primary U.S. populations (African American, East Asian, European American, and Hispanic American/Native American), as well as pigmentation phenotype (eye, hair, and skin color) among European Americans. We have optimized this assay to a sensitivity level comparable to current forensic DNA analyses, and shown robust performance on forensic-type samples. In addition, we developed a prediction model for ancestry in the U.S. population, based on the random match probability and likelihood ratio formulas already used in forensic laboratories. Lastly, we evaluated the biogeographic ancestry prediction model using a test set, and we evaluated an existing model for eye color with our U.S. sample set. Using these models with recommended thresholds, the 50 SNP assay provided accurate ancestry information in 98.6% of the test set samples, and provided accurate eye color information in 61% of the European samples tested (25% were inconclusive and 14% were incorrect). This method, which uses equipment already available in forensic DNA laboratories, is recommended for use in U.S. forensic casework to provide additional information about the donor of a DNA sample when the STR profile has not been linked to an individual. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.

  3. Technical Reproducibility of Genotyping SNP Arrays Used in Genome-Wide Association Studies

    PubMed Central

    Hong, Huixiao; Xu, Lei; Liu, Jie; Jones, Wendell D.; Su, Zhenqiang; Ning, Baitang; Perkins, Roger; Ge, Weigong; Miclaus, Kelci; Zhang, Li; Park, Kyunghee; Green, Bridgett; Han, Tao; Fang, Hong; Lambert, Christophe G.; Vega, Silvia C.; Lin, Simon M.; Jafari, Nadereh; Czika, Wendy; Wolfinger, Russell D.; Goodsaid, Federico; Tong, Weida; Shi, Leming

    2012-01-01

    During the last several years, high-density genotyping SNP arrays have facilitated genome-wide association studies (GWAS) that successfully identified common genetic variants associated with a variety of phenotypes. However, each of the identified genetic variants only explains a very small fraction of the underlying genetic contribution to the studied phenotypic trait. Moreover, discordance observed in results between independent GWAS indicates the potential for Type I and II errors. High reliability of genotyping technology is needed to have confidence in using SNP data and interpreting GWAS results. Therefore, reproducibility of two widely used genotyping technology platforms from Affymetrix and Illumina was assessed by analyzing four technical replicates from each of the six individuals in five laboratories. Genotype concordance of 99.40% to 99.87% within a laboratory for the same platform, 98.59% to 99.86% across laboratories for the same platform, and 98.80% across genotyping platforms was observed. Moreover, arrays with low quality data were detected when comparing genotyping data from technical replicates, but they could not be detected according to vendors' quality control (QC) suggestions. Our results demonstrated the technical reliability of currently available genotyping platforms but also indicated the importance of incorporating some technical replicates for genotyping QC in order to improve the reliability of GWAS results. The impact of discordant genotypes on association analysis results was simulated and could explain, at least in part, the irreproducibility of some GWAS findings when the effect size (i.e. the odds ratio) and the minor allele frequencies are low. PMID:22970228
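
    Genotype concordance between two technical replicates can be sketched as the fraction of SNPs with identical calls. The abstract does not give the exact definition used, so this sketch assumes that SNPs with a missing call in either replicate are excluded; the function name, the "NN" no-call code and the example calls are hypothetical.

```python
def concordance(calls_a, calls_b, missing="NN"):
    """Fraction of SNPs with identical calls, skipping SNPs where either call is missing."""
    compared = [(a, b) for a, b in zip(calls_a, calls_b)
                if a != missing and b != missing]
    if not compared:
        raise ValueError("no SNPs with calls in both replicates")
    return sum(a == b for a, b in compared) / len(compared)

# Hypothetical calls for four SNPs in two replicates of the same sample.
replicate_1 = ["AA", "AB", "BB", "NN"]
replicate_2 = ["AA", "AB", "AB", "BB"]
print(concordance(replicate_1, replicate_2))  # 2 of 3 comparable SNPs agree
```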

  4. TNF-alpha SNP haplotype frequencies in equidae.

    PubMed

    Brown, J J; Ollier, W E R; Thomson, W; Matthews, J B; Carter, S D; Binns, M; Pinchbeck, G; Clegg, P D

    2006-05-01

    Tumour necrosis factor alpha (TNF-alpha) is a pro-inflammatory cytokine that plays a crucial role in the regulation of inflammatory and immune responses. In all vertebrate species the genes encoding TNF-alpha are located within the major histocompatibility complex. In the horse TNF-alpha has been ascribed a role in a variety of important disease processes. Previously two single nucleotide polymorphisms (SNPs) have been reported within the 5' un-translated region of the equine TNF-alpha gene. We have examined the equine TNF-alpha promoter region further for additional SNPs by analysing DNA from 131 horses (Equus caballus), 19 donkeys (E. asinus), 2 Grant's zebras (E. burchellii boehmi) and one onager (E. hemionus). Two further SNPs were identified at nucleotide positions 24 (T/G) and 452 (T/C) relative to the first nucleotide of the 522 bp polymerase chain reaction product. A sequence variant at position 51 was observed between equidae. SNaPSHOT genotyping assays for these and the two previously reported SNPs were performed on 457 horses comprising seven different breeds and 23 donkeys to determine the gene frequencies. SNP frequencies varied considerably between different horse breeds and also between the equine species. In total, nine different TNF-alpha promoter SNP haplotypes and their frequencies were established amongst the various equidae examined, with some haplotypes being found only in horses and others only in donkeys or zebras. The haplotype frequencies observed varied greatly between different horse breeds. Such haplotypes may relate to levels of TNF-alpha production and disease susceptibility and further investigation is required to identify associations between particular haplotypes and altered risk of disease.

  5. Impact of population diversity on the prediction of 7-SNP NAT2 phenotypes using the tagSNP rs1495741 or paired SNPs.

    PubMed

    Suarez-Kurtz, Guilherme; Sortica, Vinicius A; Vargens, Daniela D; Bruxel, Estela M; Petzl-Erler, Maria-Luiza; Petz-Erler, Maria-Luiza; Tsuneto, Luisa T; Hutz, Mara H

    2012-04-01

    A novel NAT2 tagSNP (rs1495741) and a 2-SNP genotype (rs1041983 and rs1801280) have been recently shown to accurately predict the NAT2 acetylator phenotypes in populations of exclusive or predominant European/White ancestry. We confirmed the accuracy of the tagSNP approach in White Brazilians, but not in Brown or Black Brazilians, sub-Saharan Mozambicans, and Guarani Amerindians. The combined rs1041983 and rs1801280 genotypes provided considerably better prediction of the NAT2 phenotype in Guarani, but no consistent improvement in Brown or Black Brazilians and Mozambicans. Best predictions of the NAT2 phenotype in Mozambicans using NAT2 SNP pairs were obtained with rs1801280 and rs1799930, but the accuracy of the estimates remained inadequate for clinical use or for investigations in this sub-Saharan group or in Brazilians with considerable African ancestry. In conclusion, the rs1495741 tagSNP cannot be applied to predict the NAT2 acetylation phenotype in Guarani and African-derived populations, whereas 2-SNP genotypes may accurately predict NAT2 phenotypes in Guarani, but not in Africans.

  6. A new SNP panel for evaluating genetic diversity in a composite cattle breed

    USDA-ARS's Scientific Manuscript database

    A custom 60K SNP panel, extracted from Bovine HD SNP chip was used to evaluate genotypic frequency changes in Braford (BF, a composite breed) when compared to progenitor breeds: Hereford (HF), Brahman (BR), and Nelore (NE). Samples from both the U. S. and Brazil were used. The new panel differentiat...

  7. Development and Applications of a Bovine 50,000 SNP Chip

    USDA-ARS's Scientific Manuscript database

    To develop an Illumina iSelect high density single nucleotide polymorphism (SNP) assay for cattle, the collaborative iBMC (Illumina, USDA ARS Beltsville, University of Missouri, USDA ARS Clay Center) Consortium first performed a de novo SNP discovery project in which genomic reduced representation l...

  8. A genome-wide SNP panel for genetic diversity, mapping and breeding studies in rice

    USDA-ARS's Scientific Manuscript database

    A genome-wide SNP resource was developed for rice using the GoldenGate assay and used to genotype 400 landrace accessions of O. sativa. SNPs were originally discovered using Perlegen re-sequencing technology in 20 diverse landraces of O. sativa as part of OryzaSNP project (http://irfgc.irri.org). An...

  9. Genome-wide copy number variations using SNP genotyping in a mixed breed swine population

    USDA-ARS's Scientific Manuscript database

    Copy number variations (CNVs) are increasingly understood to affect phenotypic variation. This study uses SNP genotyping of trios of mixed breed swine to add to the catalog of known genotypic variation in an important agricultural animal. Porcine SNP60 BeadChip genotypes were collected from 1802 pi...

  10. Methods for the design, implementation, and analysis of illumina infinium™ SNP assays in plants.

    PubMed

    Chagné, David; Bianco, Luca; Lawley, Cindy; Micheletti, Diego; Jacobs, Jeanne M E

    2015-01-01

    The advent of Next-Generation sequencing-by-synthesis technologies has fuelled SNP discovery, genotyping, and screening of populations in myriad ways for many species, including various plant species. One technique widely applied to screening a large number of SNP markers over a large number of samples is the Illumina Infinium™ assay.

  11. A Coordinated Approach to Peach SNP Discovery in RosBREED

    USDA-ARS's Scientific Manuscript database

    In the USDA-funded multi-institutional and trans-disciplinary project, “RosBREED”, crop-specific SNP genome scan platforms are being developed for peach, apple, strawberry, and cherry at a resolution of at least one polymorphic SNP marker every 5 cM in any random cross, for use in Pedigree-Based Ana...

  12. Networks of intergenic long-range enhancers and snpRNAs drive castration-resistant phenotype of prostate cancer and contribute to pathogenesis of multiple common human disorders

    PubMed Central

    Glinskii, Anna B; Ma, Shuang; Ma, Jun; Grant, Denise; Lim, Chang-Uk; Guest, Ian; Sell, Stewart; Buttyan, Ralph

    2011-01-01

    The mechanistic relevance of intergenic disease-associated genetic loci (IDAGL) containing highly statistically significant disease-linked SNPs remains unknown. Here, we present experimental and clinical evidence supporting the importance of the role of IDAGL in human diseases. A targeted RT-PCR screen coupled with sequencing of purified PCR products detects widespread transcription at multiple IDAGL and identifies 96 small noncoding trans-regulatory RNAs of ∼100–300 nt in length containing SNPs (snpRNAs) associated with 21 common disorders. Multiple independent lines of experimental evidence support functionality of snpRNAs by documenting their cell type-specific expression and evolutionary conservation of sequences, genomic coordinates and biological effects. Chromatin state signatures, expression profiling experiments and luciferase reporter assays demonstrate that many IDAGL are Polycomb-regulated long-range enhancers. Expression of snpRNAs in human and mouse cells markedly affects cellular behavior and induces allele-specific clinically relevant phenotypic changes: NLRP1-locus snpRNAs rs2670660 exert regulatory effects on monocyte/macrophage transdifferentiation, induce prostate cancer (PC) susceptibility snpRNAs and transform low-malignancy hormone-dependent human PC cells into highly malignant androgen-independent PC. Q-PCR analysis and luciferase reporter assays demonstrate that snpRNA sequences represent allele-specific “decoy” targets of microRNAs that function as SNP allele-specific modifiers of microRNA expression and activity. We demonstrate that trans-acting RNA molecules facilitating resistance to androgen depletion (RAD) in vitro and castration-resistant phenotype (CRP) in vivo of PC contain intergenic 8q24-locus SNP variants (rs1447295; rs16901979; rs6983267) that were recently linked with increased risk of PC. Q-PCR analysis of clinical samples reveals markedly increased and highly concordant (r = 0.896; p < 0.0001) snpRNA expression

  13. Model, properties and imputation method of missing SNP genotype data utilizing mutual information

    NASA Astrophysics Data System (ADS)

    Wang, Ying; Wan, Weiming; Wang, Rui-Sheng; Feng, Enmin

    2009-07-01

    Mutual information can be used as a measure for the association of a genetic marker or a combination of markers with the phenotype. In this paper, we study the imputation of missing genotype data. We first utilize joint mutual information to compute the dependence between SNP sites, then construct a mathematical model in order to find the two SNP sites having maximal dependence with missing SNP sites, and further study the properties of this model. Finally, an extension method to haplotype-based imputation is proposed to impute the missing values in genotype data. To verify our method, extensive experiments have been performed, and numerical results show that our method is superior to haplotype-based imputation methods. At the same time, numerical results also prove joint mutual information can better measure the dependence between SNP sites. According to experimental results, we also conclude that the dependence between the adjacent SNP sites is not necessarily strongest.
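
    The dependence measure described above can be sketched with a plug-in estimate of mutual information from observed genotype counts. This is only an illustrative building block, not the authors' model or imputation procedure; the function name and the toy genotype vectors (coded 0/1) are hypothetical. Joint mutual information between a pair of sites and a third site can be obtained with the same function by passing the pair as tuples.

```python
from collections import Counter
from math import log2

def mutual_information(x, y):
    """Plug-in estimate of I(X;Y) in bits from two equal-length genotype lists."""
    n = len(x)
    count_x, count_y = Counter(x), Counter(y)
    count_xy = Counter(zip(x, y))
    return sum(
        (c / n) * log2((c / n) / ((count_x[a] / n) * (count_y[b] / n)))
        for (a, b), c in count_xy.items()
    )

# Hypothetical genotypes at three SNP sites across four individuals.
site_1 = [0, 1, 0, 1]
site_2 = [0, 1, 0, 1]   # identical to site_1: maximal dependence
site_3 = [0, 0, 1, 1]   # independent of site_1 in this toy sample

print(mutual_information(site_1, site_2))
print(mutual_information(site_1, site_3))
# Joint mutual information I((site_1, site_3); site_2):
print(mutual_information(list(zip(site_1, site_3)), site_2))
```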

  14. SNP uniqueness problem: a proof-of-principle in HapMap SNPs.

    PubMed

    Doron, Shany; Shweiki, Dorit

    2011-04-01

    SNP-based research strongly affects our biomedical and clinically associated knowledge. Nonunique and false-positive SNP existence in commonly used datasets may thus lead to biased, inaccurate clinically associated conclusions. We designed a computational study to reveal the degree of nonunique/false-positive SNPs in the HapMap dataset. Two sets of SNP flanking sequences were used as queries for BLAT analysis against the human genome. 4.2% and 11.9% of HapMap SNPs align to the genome nonuniquely (long and short, respectively). Furthermore, an average of 7.9% nonunique SNPs are included in common commercial genotyping arrays (according to our designed probes). Nonunique SNPs identified in this study are represented to various degrees in clinically associated databases, stressing the consequence of inaccurate SNP annotation and hence SNP utilization. Unfortunately, our results question some disease-related genotyping analyses, raising a worrisome concern on their validity.

  15. Design and characterization of a 52K SNP chip for goats.

    PubMed

    Tosser-Klopp, Gwenola; Bardou, Philippe; Bouchez, Olivier; Cabau, Cédric; Crooijmans, Richard; Dong, Yang; Donnadieu-Tonon, Cécile; Eggen, André; Heuven, Henri C M; Jamli, Saadiah; Jiken, Abdullah Johari; Klopp, Christophe; Lawley, Cynthia T; McEwan, John; Martin, Patrice; Moreno, Carole R; Mulsant, Philippe; Nabihoudine, Ibouniyamine; Pailhoux, Eric; Palhière, Isabelle; Rupp, Rachel; Sarry, Julien; Sayre, Brian L; Tircazes, Aurélie; Jun Wang; Wang, Wen; Zhang, Wenguang

    2014-01-01

    The success of Genome Wide Association Studies in the discovery of sequence variation linked to complex traits in humans has increased interest in high throughput SNP genotyping assays in livestock species. Primary goals are QTL detection and genomic selection. The purpose here was design of a 50-60,000 SNP chip for goats. The success of a moderate density SNP assay depends on reliable bioinformatic SNP detection procedures, the technological success rate of the SNP design, even spacing of SNPs on the genome and selection of Minor Allele Frequencies (MAF) suitable to use in diverse breeds. Through the federation of three SNP discovery projects consolidated as the International Goat Genome Consortium, we have identified approximately twelve million high quality SNP variants in the goat genome stored in a database together with their biological and technical characteristics. These SNPs were identified within and between six breeds (meat, milk and mixed): Alpine, Boer, Creole, Katjang, Saanen and Savanna, comprising a total of 97 animals. Whole genome and Reduced Representation Library sequences were aligned on >10 kb scaffolds of the de novo goat genome assembly. The 60,000 selected SNPs, evenly spaced on the goat genome, were submitted for oligo manufacturing (Illumina, Inc) and published in dbSNP along with flanking sequences and map position on goat assemblies (i.e. scaffolds and pseudo-chromosomes), sheep genome V2 and cattle UMD3.1 assembly. Ten breeds were then used to validate the SNP content and 52,295 loci could be successfully genotyped and used to generate a final cluster file. The combined strategy of using mainly whole genome Next Generation Sequencing and mapping on a contig genome assembly, complemented with Illumina design tools proved to be efficient in producing this GoatSNP50 chip. Advances in use of molecular markers are expected to accelerate goat genomic studies in coming years.

  16. A customized pigmentation SNP array identifies a novel SNP associated with melanoma predisposition in the SLC45A2 gene.

    PubMed

    Ibarrola-Villava, Maider; Fernandez, Lara P; Alonso, Santos; Boyano, M Dolores; Peña-Chilet, Maria; Pita, Guillermo; Aviles, Jose A; Mayor, Matias; Gomez-Fernandez, Cristina; Casado, Beatriz; Martin-Gonzalez, Manuel; Izagirre, Neskuts; De la Rua, Concepcion; Asumendi, Aintzane; Perez-Yarza, Gorka; Arroyo-Berdugo, Yoana; Boldo, Enrique; Lozoya, Rafael; Torrijos-Aguilar, Arantxa; Pitarch, Ana; Pitarch, Gerard; Sanchez-Motilla, Jose M; Valcuende-Cavero, Francisca; Tomas-Cabedo, Gloria; Perez-Pastor, Gemma; Diaz-Perez, Jose L; Gardeazabal, Jesus; Martinez de Lizarduy, Iñigo; Sanchez-Diez, Ana; Valdes, Carlos; Pizarro, Angel; Casado, Mariano; Carretero, Gregorio; Botella-Estrada, Rafael; Nagore, Eduardo; Lazaro, Pablo; Lluch, Ana; Benitez, Javier; Martinez-Cadenas, Conrado; Ribas, Gloria

    2011-04-29

    As the incidence of Malignant Melanoma (MM) reflects an interaction between skin colour and UV exposure, variations in genes implicated in pigmentation and tanning response to UV may be associated with susceptibility to MM. In this study, 363 SNPs in 65 gene regions belonging to the pigmentation pathway have been successfully genotyped using a SNP array. Five hundred and ninety MM cases and 507 controls were analyzed in a discovery phase I. Ten candidate SNPs based on a p-value threshold of 0.01 were identified. Two of them, rs35414 (SLC45A2) and rs2069398 (SILV/CDK2), were statistically significant after conservative Bonferroni correction. The best six SNPs were further tested in an independent Spanish series (624 MM cases and 789 controls). A novel SNP located on the SLC45A2 gene (rs35414) was found to be significantly associated with melanoma in both phase I and phase II (P<0.0001). None of the other five SNPs were replicated in this second phase of the study. However, three SNPs in TYR, SILV/CDK2 and ADAMTS20 genes (rs17793678, rs2069398 and rs1510521 respectively) had an overall p-value<0.05 when considering the whole DNA collection (1214 MM cases and 1296 controls). Both the SLC45A2 and the SILV/CDK2 variants behave as protective alleles, while the TYR and ADAMTS20 variants seem to function as risk alleles. Cumulative effects were detected when these four variants were considered together. Furthermore, individuals carrying two or more mutations in MC1R, a well-known low penetrance melanoma-predisposing gene, had a decreased MM risk if concurrently bearing the SLC45A2 protective variant. To our knowledge, this is the largest study on Spanish sporadic MM cases to date.
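
    A Bonferroni correction divides the family-wise significance level by the number of tests. The abstract does not state the level or the test count the authors used, so the sketch below assumes a level of 0.05 and one test per genotyped SNP (363); the function name and the example p-values are hypothetical.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold that keeps the family-wise error rate at alpha."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests

# Assumed values: family-wise alpha of 0.05 and 363 tests (one per SNP).
threshold = bonferroni_threshold(0.05, 363)
print(f"per-SNP threshold: {threshold:.2e}")

# Hypothetical p-values, not taken from the study.
for p in (0.01, 0.0005, 0.00005):
    print(p, "significant" if p < threshold else "not significant")
```

    Under these assumptions a SNP passing the phase I screening threshold of 0.01 would still need a p-value below roughly 0.00014 to survive the correction.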

  17. Evaluation of the SNP tagging approach in an independent population sample--array-based SNP discovery in Sami.

    PubMed

    Johansson, Asa; Vavruch-Nilsson, Veronika; Cox, David R; Frazer, Kelly A; Gyllensten, Ulf

    2007-09-01

    Significant efforts have been made to determine the correlation structure of common SNPs in the human genome. One method has been to identify the sets of tagSNPs that capture most of the genetic variation. Here, we evaluate the transferability of tagSNPs between populations using a population sample of Sami, the indigenous people of Scandinavia. Array-based SNP discovery in a 4.4 Mb region of 28 phased copies of chromosome 21 uncovered 5,132 segregating sites, 3,188 of which had a minimum minor allele frequency (mMAF) of 0.1. Due to the population structure and consequently high LD, the number of tagSNPs needed to capture all SNP variation in Sami is much lower than that for the HapMap populations. TagSNPs identified from the HapMap data perform only slightly better in the Sami than choosing tagSNPs at random from the same set of common SNPs. Surprisingly, tagSNPs defined from the HapMap data did not perform better than selecting the same number of SNPs at random from all SNPs discovered in Sami. Nearly half (46%) of the Sami SNPs with a mMAF of 0.1 are not present in the HapMap dataset. Among sites overlapping between Sami and HapMap populations, 18% are not tagged by the European American (CEU) HapMap tagSNPs, while 43% of the SNPs that are unique to Sami are not tagged by the CEU tagSNPs. These results point to serious limitations in the transferability of common tagSNPs to capture random sequence variation, even between closely related populations, such as CEU and Sami.

  18. A Customized Pigmentation SNP Array Identifies a Novel SNP Associated with Melanoma Predisposition in the SLC45A2 Gene

    PubMed Central

    Alonso, Santos; Boyano, M. Dolores; Peña-Chilet, Maria; Pita, Guillermo; Aviles, Jose A.; Mayor, Matias; Gomez-Fernandez, Cristina; Casado, Beatriz; Martin-Gonzalez, Manuel; Izagirre, Neskuts; De la Rua, Concepcion; Asumendi, Aintzane; Perez-Yarza, Gorka; Arroyo-Berdugo, Yoana; Boldo, Enrique; Lozoya, Rafael; Torrijos-Aguilar, Arantxa; Pitarch, Ana; Pitarch, Gerard; Sanchez-Motilla, Jose M.; Valcuende-Cavero, Francisca; Tomas-Cabedo, Gloria; Perez-Pastor, Gemma; Diaz-Perez, Jose L.; Gardeazabal, Jesus; de Lizarduy, Iñigo Martinez; Sanchez-Diez, Ana; Valdes, Carlos; Pizarro, Angel; Casado, Mariano; Carretero, Gregorio; Botella-Estrada, Rafael; Nagore, Eduardo; Lazaro, Pablo; Lluch, Ana; Benitez, Javier; Martinez-Cadenas, Conrado; Ribas, Gloria

    2011-01-01

    As the incidence of Malignant Melanoma (MM) reflects an interaction between skin colour and UV exposure, variations in genes implicated in pigmentation and tanning response to UV may be associated with susceptibility to MM. In this study, 363 SNPs in 65 gene regions belonging to the pigmentation pathway have been successfully genotyped using a SNP array. Five hundred and ninety MM cases and 507 controls were analyzed in a discovery phase I. Ten candidate SNPs based on a p-value threshold of 0.01 were identified. Two of them, rs35414 (SLC45A2) and rs2069398 (SILV/CDK2), were statistically significant after conservative Bonferroni correction. The best six SNPs were further tested in an independent Spanish series (624 MM cases and 789 controls). A novel SNP located on the SLC45A2 gene (rs35414) was found to be significantly associated with melanoma in both phase I and phase II (P<0.0001). None of the other five SNPs were replicated in this second phase of the study. However, three SNPs in TYR, SILV/CDK2 and ADAMTS20 genes (rs17793678, rs2069398 and rs1510521 respectively) had an overall p-value<0.05 when considering the whole DNA collection (1214 MM cases and 1296 controls). Both the SLC45A2 and the SILV/CDK2 variants behave as protective alleles, while the TYR and ADAMTS20 variants seem to function as risk alleles. Cumulative effects were detected when these four variants were considered together. Furthermore, individuals carrying two or more mutations in MC1R, a well-known low penetrance melanoma-predisposing gene, had a decreased MM risk if concurrently bearing the SLC45A2 protective variant. To our knowledge, this is the largest study on Spanish sporadic MM cases to date. PMID:21559390

  19. Identification of Swedish mosquitoes based on molecular barcoding of the COI gene and SNP analysis.

    PubMed

    Engdahl, Cecilia; Larsson, Pär; Näslund, Jonas; Bravo, Mayra; Evander, Magnus; Lundström, Jan O; Ahlm, Clas; Bucht, Göran

    2014-05-01

    Mosquito-borne infectious diseases are emerging in many regions of the world. Consequently, surveillance of mosquitoes and concomitant infectious agents is of great importance for prediction and prevention of mosquito-borne infectious diseases. Currently, morphological identification of mosquitoes is the traditional procedure. However, sequencing of specified genes or standard genomic regions, DNA barcoding, has recently been suggested as a global standard for identification and classification of many different species. Our aim was to develop a genetic method to identify mosquitoes and to study their relationship. Mosquitoes were captured at collection sites in northern Sweden and identified morphologically before the cytochrome c oxidase subunit I (COI) gene sequences of 14 of the most common mosquito species were determined. The sequences obtained were then used for phylogenetic placement, for validation and benchmarking of phenetic classifications and finally to develop a hierarchical PCR-based typing scheme based on single nucleotide polymorphism sites (SNPs) to enable rapid genetic identification, circumventing the need for morphological characterization. The results showed that exact phylogenetic relationships between mosquito taxa were preserved at shorter evolutionary distances, but at deeper levels, they could not be inferred with confidence using COI gene sequence data alone. Fourteen of the most common mosquito species in Sweden were identified by the SNP/PCR-based typing scheme, demonstrating that genetic typing using SNPs of the COI gene is a useful method for identification of mosquitoes with potential for worldwide application. © 2013 John Wiley & Sons Ltd.

  20. Case-control study on association of peroxisome proliferator-activated receptor-δ and SNP-SNP interactions with essential hypertension in Chinese Han population.

    PubMed

    Li, Yubo; Sun, Guoqiang

    2016-01-01

    The aim of this study was to investigate the association of peroxisome proliferator-activated receptor-δ (PPAR-δ) and additional SNP-SNP interaction with essential hypertension (EH) in a Chinese Han population. A total of 1248 subjects (625 males, 623 females), including 620 EH patients and 628 normotensive subjects, were included in the study. The mean age was 51.2 ± 15.1 years. A logistic regression model was used to examine the association between four SNPs and EH; odds ratios (ORs) and 95% confidence intervals (95%CI) were calculated. Generalized multifactor dimensionality reduction (GMDR) was employed to analyze SNP-SNP interaction. EH risk was significantly lower in carriers of the C allele of the rs2016520 polymorphism than in those with TT (TC + CC versus TT, adjusted OR (95%CI) = 0.61 (0.49-0.78)). In addition, we also found a significant association between rs9794 and EH; EH risk was also significantly lower in carriers of the G allele of the rs9794 polymorphism than in those with CC (CG + GG versus CC, adjusted OR (95%CI) = 0.65 (0.53-0.83)). We also found a potential SNP-SNP interaction between rs2016520 and rs9794; subjects with the TC or CC genotype of rs2016520 and the CG or GG genotype of rs9794 had the lowest EH risk, compared to subjects with the TT genotype of rs2016520 and the CC genotype of rs9794; OR (95%CI) was 0.32 (0.23-0.62) after covariate adjustment. Our results support an important association between the rs2016520 and rs9794 minor alleles of PPAR-δ and decreased risk of EH, and an additional interaction between rs2016520 and rs9794.
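
    As an illustration of how an odds ratio and its 95% confidence interval are obtained from a carrier-versus-non-carrier table, here is a minimal sketch. It computes an unadjusted OR with a Woolf (log-scale) interval, which is a simpler technique than the covariate-adjusted logistic regression used in the study. The cell counts are invented and chosen only so that the rows sum to the reported 620 cases and 628 controls.

```python
import math


def odds_ratio_ci(a: int, b: int, c: int, d: int, z: float = 1.96):
    """Unadjusted odds ratio with a Woolf confidence interval.

    a: carrier cases      b: non-carrier cases
    c: carrier controls   d: non-carrier controls
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive")
    or_ = (a * d) / (b * c)
    # Standard error of log(OR) from the four cell counts.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(or_) - z * se)
    upper = math.exp(math.log(or_) + z * se)
    return or_, lower, upper


# Hypothetical counts, not taken from the paper.
or_, lower, upper = odds_ratio_ci(250, 370, 330, 298)
print(f"OR = {or_:.2f} (95% CI {lower:.2f}-{upper:.2f})")
```

    An interval lying entirely below 1 corresponds to the "protective" reading used in the abstract; an adjusted analysis would additionally condition on covariates such as age and sex.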

  1. Analysis of population structure and genetic history of cattle breeds based on high-density SNP data

    USDA-ARS's Scientific Manuscript database

    Advances in single nucleotide polymorphism (SNP) genotyping microarrays have facilitated a new understanding of population structure and evolutionary history for several species. Most existing studies in livestock were based on low density SNP arrays. The first wave of low density SNP studies on cat...

  2. Exploring of new Y-chromosome SNP loci using Pyrosequencing and the SNaPshot methods.

    PubMed

    Wei, Wei; Luo, Hai-Bo; Yan, Jing; Hou, Yi-Ping

    2012-11-01

    The single nucleotide polymorphisms on the Y chromosome (Y-SNP) have been considered to be important in forensic casework. However, Y-SNP loci were mostly population specific and lacked biallelic polymorphisms in the Asian population. In this study, we developed a strategy for seeking and genotyping new Y-SNP markers based on both Pyrosequencing and the SNaPshot methods. As a result, 34 new biallelic markers were observed to be polymorphic in the Chinese Han population by estimation of allele frequencies of 103 candidate Y-SNP loci in DNA pools using Pyrosequencing technology. Then, a multiplex system with 20 Y-SNP loci was genotyped using the SNaPshot™ multiplex kit. The twenty Y-SNP loci defined 56 different haplotypes, and the haplotype diversity was estimated to be 0.9539. Our results demonstrated that the strategy could be used as an efficient tool to search for and genotype biallelic markers from a large number of candidate loci. In addition, the 20 Y-SNP loci constituted a multiplex system, which could provide supplementary information for forensic identification.
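
    Haplotype diversity of the kind reported above is commonly estimated with Nei's unbiased formula, h = n/(n-1) × (1 − Σ pᵢ²). The abstract does not say which estimator was used, so the sketch below is an assumption, and the sample haplotype labels are hypothetical.

```python
from collections import Counter


def haplotype_diversity(haplotypes):
    """Nei's unbiased haplotype diversity for a list of haplotype labels."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two sampled haplotypes")
    counts = Counter(haplotypes)
    # Sum of squared haplotype frequencies (expected homozygosity).
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1 - sum_sq)


# Hypothetical sample: each string stands for one individual's Y-SNP haplotype.
sample = ["H1", "H1", "H2", "H3", "H3", "H3", "H4", "H5"]
print(round(haplotype_diversity(sample), 4))
```

    The value approaches 1 when almost every sampled individual carries a distinct haplotype, and is 0 when all individuals share one.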

  3. Rice SNP-seek database update: new SNPs, indels, and queries.

    PubMed

    Mansueto, Locedie; Fuentes, Roven Rommel; Borja, Frances Nikki; Detras, Jeffery; Abriol-Santos, Juan Miguel; Chebotarov, Dmytro; Sanciangco, Millicent; Palis, Kevin; Copetti, Dario; Poliakov, Alexandre; Dubchak, Inna; Solovyev, Victor; Wing, Rod A; Hamilton, Ruaraidh Sackville; Mauleon, Ramil; McNally, Kenneth L; Alexandrov, Nickolai

    2017-01-04

    We describe updates to the Rice SNP-Seek Database since its first release. We ran a new SNP-calling pipeline followed by filtering that resulted in complete, base, filtered and core SNP datasets. Besides the Nipponbare reference genome, the pipeline was run on genome assemblies of IR 64, 93-11, DJ 123 and Kasalath. New genotype query and display features are added for reference assemblies, SNP datasets and indels. JBrowse now displays BAM, VCF and other annotation tracks, the additional genome assemblies and an embedded VISTA genome comparison viewer. Middleware is redesigned for improved performance by using a hybrid of HDF5 and RDMS for genotype storage. Query modules for genotypes, varieties and genes are improved to handle various constraints. An integrated list manager allows the user to pass query parameters for further analysis. The SNP Annotator adds traits, ontology terms, effects and interactions to markers in a list. Web-service calls were implemented to access most data. These features enable seamless querying of SNP-Seek across various biological entities, a step toward semi-automated gene-trait association discovery. URL: http://snp-seek.irri.org. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

  4. Identification, validation and survey of a single nucleotide polymorphism (SNP) associated with pungency in Capsicum spp.

    PubMed

    Garcés-Claver, Ana; Fellman, Shanna Moore; Gil-Ortega, Ramiro; Jahn, Molly; Arnedo-Andrés, María S

    2007-11-01

    A single nucleotide polymorphism (SNP) associated with pungency was detected within an expressed sequence tag (EST) of 307 bp. This fragment was identified after expression analysis of the EST clone SB2-66 in placenta tissue of Capsicum fruits. Sequence alignments corresponding to this new fragment allowed us to identify an SNP between pungent and non-pungent accessions. Two methods were chosen for the development of the SNP marker linked to pungency: tetra-primer amplification refractory mutation system-PCR (tetra-primer ARMS-PCR) and cleaved amplified polymorphic sequence. Results showed that both methods were successful in distinguishing genotypes. Nevertheless, tetra-primer ARMS-PCR was chosen for SNP genotyping because it was more rapid, more reliable and more cost-effective. The utility of this SNP marker for pungency was demonstrated by the ability to distinguish between 29 pungent and non-pungent cultivars of Capsicum annuum. In addition, the SNP was also associated with phenotypic pungent character in the tested genotypes of C. chinense, C. baccatum, C. frutescens, C. galapagoense, C. eximium, C. tovarii and C. cardenasi. This SNP marker is a faster, cheaper and more reproducible method for identifying pungent peppers than other techniques such as panel tasting, and allows rapid screening of the trait in early growth stages.

  5. Rice SNP-seek database update: new SNPs, indels, and queries

    PubMed Central

    Mansueto, Locedie; Fuentes, Roven Rommel; Borja, Frances Nikki; Detras, Jeffery; Abriol-Santos, Juan Miguel; Chebotarov, Dmytro; Sanciangco, Millicent; Palis, Kevin; Copetti, Dario; Poliakov, Alexandre; Dubchak, Inna; Solovyev, Victor; Wing, Rod A.; Hamilton, Ruaraidh Sackville; Mauleon, Ramil; McNally, Kenneth L.; Alexandrov, Nickolai

    2017-01-01

    We describe updates to the Rice SNP-Seek Database since its first release. We ran a new SNP-calling pipeline followed by filtering that resulted in complete, base, filtered and core SNP datasets. Besides the Nipponbare reference genome, the pipeline was run on genome assemblies of IR 64, 93-11, DJ 123 and Kasalath. New genotype query and display features are added for reference assemblies, SNP datasets and indels. JBrowse now displays BAM, VCF and other annotation tracks, the additional genome assemblies and an embedded VISTA genome comparison viewer. Middleware is redesigned for improved performance by using a hybrid of HDF5 and RDMS for genotype storage. Query modules for genotypes, varieties and genes are improved to handle various constraints. An integrated list manager allows the user to pass query parameters for further analysis. The SNP Annotator adds traits, ontology terms, effects and interactions to markers in a list. Web-service calls were implemented to access most data. These features enable seamless querying of SNP-Seek across various biological entities, a step toward semi-automated gene-trait association discovery. URL: http://snp-seek.irri.org. PMID:27899667

  6. Mutagenic primer design for mismatch PCR-RFLP SNP genotyping using a genetic algorithm.

    PubMed

    Yang, Cheng-Hong; Cheng, Yu-Huei; Yang, Cheng-Huei; Chuang, Li-Yeh

    2012-01-01

    Polymerase chain reaction-restriction fragment length polymorphism (PCR-RFLP) is useful in small-scale basic research studies of complex genetic diseases that are associated with single nucleotide polymorphisms (SNPs). Designing a feasible primer pair is an important step before performing PCR-RFLP for SNP genotyping. However, in many cases no restriction enzyme is available to discriminate the target SNP, so that conventional primer design is not applicable. A mutagenic primer is introduced to solve this problem. GA-based Mismatch PCR-RFLP Primers Design (GAMPD) provides a method that uses a genetic algorithm to search for optimal mutagenic primers and available restriction enzymes from REBASE. In order to improve the efficiency of the proposed method, a mutagenic matrix is employed to judge whether a hypothetical mutagenic primer can discriminate the target SNP by digestion with available restriction enzymes. The available restriction enzymes for the target SNP are mined by the updated core of SNP-RFLPing. GAMPD has been used to simulate the SNPs in the human SLC6A4 gene under different parameter settings and compared with SNP Cutter for mismatch PCR-RFLP primer design. The in silico simulation of the proposed GAMPD program showed that it designs mismatch PCR-RFLP primers. The GAMPD program is implemented in JAVA and is freely available at http://bio.kuas.edu.tw/gampd/.

  7. Leaf Transcriptome Sequencing for Identifying Genic-SSR Markers and SNP Heterozygosity in Crossbred Mango Variety 'Amrapali' (Mangifera indica L.).

    PubMed

    Mahato, Ajay Kumar; Sharma, Nimisha; Singh, Akshay; Srivastav, Manish; Jaiprakash; Singh, Sanjay Kumar; Singh, Anand Kumar; Sharma, Tilak Raj; Singh, Nagendra Kumar

    2016-01-01

    Mango (Mangifera indica L.) is called "king of fruits" due to its sweetness, richness of taste, diversity, large production volume and a variety of end uses. Despite its huge economic importance, genomic resources in mango are scarce and the genetics of useful horticultural traits are poorly understood. Here we generated deep coverage leaf RNA sequence data for mango parental varieties 'Neelam', 'Dashehari' and their hybrid 'Amrapali' using next generation sequencing technologies. De-novo sequence assembly generated 27,528, 20,771 and 35,182 transcripts for the three genotypes, respectively. The transcripts were further assembled into a non-redundant set of 70,057 unigenes that were used for SSR and SNP identification and annotation. A total of 5,465 SSR loci were identified in 4,912 unigenes with 288 type I SSR (n ≥ 20 bp). One hundred type I SSR markers were randomly selected, of which 43 yielded PCR amplicons of expected size in the first round of validation and were designated as validated genic-SSR markers. Further, 22,306 SNPs were identified by aligning high quality sequence reads of the three mango varieties to the reference unigene set, revealing significantly enhanced SNP heterozygosity in the hybrid Amrapali. The present study on leaf RNA sequencing of mango varieties and their hybrid provides a useful genomic resource for genetic improvement of mango.

  8. A system for exact and approximate genetic linkage analysis of SNP data in large pedigrees

    PubMed Central

    Silberstein, Mark; Weissbrod, Omer; Otten, Lars; Tzemach, Anna; Anisenia, Andrei; Shtark, Oren; Tuberg, Dvir; Galfrin, Eddie; Gannon, Irena; Shalata, Adel; Borochowitz, Zvi U.; Dechter, Rina; Thompson, Elizabeth; Geiger, Dan

    2013-01-01

    Motivation: The use of dense single nucleotide polymorphism (SNP) data in genetic linkage analysis of large pedigrees is impeded by significant technical, methodological and computational challenges. Here we describe Superlink-Online SNP, a new powerful online system that streamlines the linkage analysis of SNP data. It features a fully integrated flexible processing workflow comprising both well-known and novel data analysis tools, including SNP clustering, erroneous data filtering, exact and approximate LOD calculations and maximum-likelihood haplotyping. The system draws its power from thousands of CPUs, performing data analysis tasks orders of magnitude faster than a single computer. By providing an intuitive interface to sophisticated state-of-the-art analysis tools coupled with high computing capacity, Superlink-Online SNP helps geneticists unleash the potential of SNP data for detecting disease genes. Results: Computations performed by Superlink-Online SNP are automatically parallelized using novel paradigms, and executed on an unlimited number of private or public CPUs. One novel service is large-scale approximate Markov Chain–Monte Carlo (MCMC) analysis. The accuracy of the results is reliably estimated by running the same computation on multiple CPUs and evaluating the Gelman–Rubin Score to set aside unreliable results. Another service within the workflow is a novel parallelized exact algorithm for inferring maximum-likelihood haplotyping. The reported system enables genetic analyses that were previously infeasible. We demonstrate the system capabilities through a study of a large complex pedigree affected with metabolic syndrome. Availability: Superlink-Online SNP is freely available for researchers at http://cbl-hap.cs.technion.ac.il/superlink-snp. The system source code can also be downloaded from the system website. Contact: omerw@cs.technion.ac.il Supplementary information: Supplementary data are available at Bioinformatics online. PMID:23162081
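
    The Gelman–Rubin score referred to above compares between-chain and within-chain variance across independent MCMC runs. The following is a minimal sketch of the classic potential scale reduction factor for one scalar quantity; it is not Superlink-Online SNP's code, and the function name and toy chains are hypothetical.

```python
import math
from statistics import mean, variance


def gelman_rubin(chains):
    """Potential scale reduction factor (R-hat) for equal-length chains."""
    m = len(chains)
    n = len(chains[0]) if m else 0
    if m < 2 or n < 2 or any(len(c) != n for c in chains):
        raise ValueError("need >= 2 chains of equal length >= 2")
    chain_means = [mean(c) for c in chains]
    grand_mean = mean(chain_means)
    # Between-chain variance B and mean within-chain variance W.
    b = n / (m - 1) * sum((cm - grand_mean) ** 2 for cm in chain_means)
    w = mean([variance(c) for c in chains])
    if w == 0:
        raise ValueError("within-chain variance is zero")
    var_hat = (n - 1) / n * w + b / n
    return math.sqrt(var_hat / w)


# Toy chains: the first pair agrees, the second pair clearly does not.
print(gelman_rubin([[1, 2, 3, 4], [1, 2, 3, 4]]))
print(gelman_rubin([[1, 2, 3, 4], [101, 102, 103, 104]]))
```

    Values close to 1 suggest the runs agree; a common convention, assumed here rather than taken from the abstract, is to treat results with a score above roughly 1.1 as unreliable.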

  9. MDM2 Promoter SNP344T>A (rs1196333) Status Does Not Affect Cancer Risk

    PubMed Central

    Knappskog, Stian; Gansmo, Liv B.; Romundstad, Pål; Bjørnslett, Merete; Trovik, Jone; Sommerfelt-Pettersen, Jan; Løkkevik, Erik; Tollenaar, Rob A. E. M.; Seynaeve, Caroline; Devilee, Peter; Salvesen, Helga B.; Dørum, Anne; Hveem, Kristian; Vatten, Lars; Lønning, Per E.

    2012-01-01

    The MDM2 proto-oncogene plays a key role in central cellular processes like growth control and apoptosis, and the gene locus is frequently amplified in sarcomas. Two polymorphisms located in the MDM2 promoter P2 have been shown to affect cancer risk. One of these polymorphisms (SNP309T>G; rs2279744) facilitates Sp1 transcription factor binding to the promoter and is associated with increased cancer risk. In contrast, SNP285G>C (rs117039649), located 24 bp upstream of rs2279744, and in complete linkage disequilibrium with the SNP309G allele, reduces Sp1 recruitment and lowers cancer risk. Thus, fine tuning of MDM2 expression has proven to be of significant importance with respect to tumorigenesis. We assessed the potential functional effects of a third MDM2 promoter P2 polymorphism (SNP344T>A; rs1196333) located on the SNP309T allele. While in silico analyses indicated SNP344A to modulate TFAP2A, SPIB and AP1 transcription factor binding, we found no effect of SNP344 status on MDM2 expression levels. Assessing the frequency of SNP344A in healthy Caucasians (n = 2,954) and patients suffering from ovarian (n = 1,927), breast (n = 1,271), endometrial (n = 895) or prostatic cancer (n = 641), we detected no significant difference in the distribution of this polymorphism between any of these cancer forms and healthy controls (6.1% in healthy controls, and 4.9%, 5.0%, 5.4% and 7.2% in the cancer groups, respectively). In conclusion, our findings provide no evidence indicating that SNP344A may affect MDM2 transcription or cancer risk. PMID:22558411

  10. SNP-SNP interactions between WNT4 and WNT5A were associated with obesity related traits in Han Chinese Population

    PubMed Central

    Dong, Shan-Shan; Hu, Wei-Xin; Yang, Tie-Lin; Chen, Xiao-Feng; Yan, Han; Chen, Xiang-Ding; Tan, Li-Jun; Tian, Qing; Deng, Hong-Wen; Guo, Yan

    2017-01-01

    Considering the biological roles of WNT4 and WNT5A involved in adipogenesis, we aimed to investigate whether SNPs in WNT4 and WNT5A contribute to obesity related traits in the Han Chinese population. Targeted genomic sequence for WNT4 and WNT5A was determined in 100 Han Chinese subjects and tag SNPs were selected. Both single SNP and SNP × SNP interaction association analyses with body mass index (BMI) were evaluated in the 100 subjects and another independent sample of 1,627 Han Chinese subjects. Meta-analyses were performed and multiple testing corrections were carried out using the Bonferroni method. Consistent with the Genetic Investigation of ANthropometric Traits (GIANT) dataset results, we did not detect significant association signals in single SNP association analyses. However, the interaction between rs2072920 and rs11918967 was associated with BMI after multiple testing corrections (combined P = 2.20 × 10⁻⁴). The signal was also significant in each contributing data set. SNP rs2072920 is located in the 3′-UTR of WNT4 and SNP rs11918967 is located in the intron of WNT5A. Functional annotation results revealed that both SNPs might be involved in transcriptional regulation of gene expression. Our results suggest that a combined effect of SNPs via WNT4-WNT5A interaction may affect the variation of BMI in the Han Chinese population. PMID:28272483

  11. Sequential sentinel SNP Regional Association Plots (SSS-RAP): an approach for testing independence of SNP association signals using meta-analysis data.

    PubMed

    Zheng, Jie; Gaunt, Tom R; Day, Ian N M

    2013-01-01

    Genome-Wide Association Studies (GWAS) frequently incorporate meta-analysis within their framework. However, conditional analysis of individual-level data, which is an established approach for fine mapping of causal sites, is often precluded where only group-level summary data are available for analysis. Here, we present a numerical and graphical approach, "sequential sentinel SNP regional association plot" (SSS-RAP), which estimates regression coefficients (beta) with their standard errors using the meta-analysis summary results directly. Under an additive model, typical for genes with small effect, the effect for a sentinel SNP can be transformed to the predicted effect for a possibly dependent SNP through a 2×2 2-SNP haplotypes table. The approach assumes Hardy-Weinberg equilibrium for test SNPs. SSS-RAP is available as a Web-tool (http://apps.biocompute.org.uk/sssrap/sssrap.cgi). To develop and illustrate SSS-RAP we analyzed lipid and ECG traits data from the British Women's Heart and Health Study (BWHHS), evaluated a meta-analysis for ECG trait and presented several simulations. We compared results with existing approaches such as model selection methods and conditional analysis. Generally findings were consistent. SSS-RAP represents a tool for testing independence of SNP association signals using meta-analysis data, and is also a convenient approach based on biological principles for fine mapping in group level summary data. © 2012 Blackwell Publishing Ltd/University College London.

  12. Performance comparison of SNP detection tools with illumina exome sequencing data—an assessment using both family pedigree information and sample-matched SNP array data

    PubMed Central

    Yi, Ming; Zhao, Yongmei; Jia, Li; He, Mei; Kebebew, Electron; Stephens, Robert M.

    2014-01-01

    To apply exome-seq-derived variants in the clinical setting, there is an urgent need to identify the best variant caller(s) from a large collection of available options. We have used an Illumina exome-seq dataset as a benchmark, with two validation scenarios—family pedigree information and SNP array data for the same samples, permitting global high-throughput cross-validation, to evaluate the quality of SNP calls derived from several popular variant discovery tools from both the open-source and commercial communities using a set of designated quality metrics. To the best of our knowledge, this is the first large-scale performance comparison of exome-seq variant discovery tools using high-throughput validation with both Mendelian inheritance checking and SNP array data, which allows us to gain insights into the accuracy of SNP calling through such high-throughput validation in an unprecedented way, whereas the previously reported comparison studies have only assessed concordance of these tools without directly assessing the quality of the derived SNPs. More importantly, the main purpose of our study was to establish a reusable procedure that applies high-throughput validation to compare the quality of SNP discovery tools with a focus on exome-seq, which can be used to compare any forthcoming tool(s) of interest. PMID:24831545

  13. Using Mendelian inheritance to improve high-throughput SNP discovery.

    PubMed

    Chen, Nancy; Van Hout, Cristopher V; Gottipati, Srikanth; Clark, Andrew G

    2014-11-01

    Restriction site-associated DNA sequencing or genotyping-by-sequencing (GBS) approaches allow for rapid and cost-effective discovery and genotyping of thousands of single-nucleotide polymorphisms (SNPs) in multiple individuals. However, rigorous quality control practices are needed to avoid high levels of error and bias with these reduced representation methods. We developed a formal statistical framework for filtering spurious loci, using Mendelian inheritance patterns in nuclear families, that accommodates variable-quality genotype calls and missing data--both rampant issues with GBS data--and for identifying sex-linked SNPs. Simulations predict excellent performance of both the Mendelian filter and the sex-linkage assignment under a variety of conditions. We further evaluate our method by applying it to real GBS data and validating a subset of high-quality SNPs. These results demonstrate that our metric of Mendelian inheritance is a powerful quality filter for GBS loci that is complementary to standard coverage and Hardy-Weinberg filters. The described method, implemented in the software MendelChecker, will improve quality control during SNP discovery in nonmodel as well as model organisms.
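
    The idea of using Mendelian inheritance as a quality filter can be sketched with a simple hard-call check on parent-offspring trios. This is far simpler than the statistical framework the abstract describes, which accommodates variable-quality calls; it is not MendelChecker's algorithm. Genotypes are coded as the number of alternate alleles (0, 1, 2) at an autosomal biallelic locus, and all names and the example trios are hypothetical.

```python
# Alleles (0 = reference, 1 = alternate) each parental genotype can transmit.
GAMETES = {0: {0}, 1: {0, 1}, 2: {1}}


def is_mendelian_consistent(mother, father, child):
    """True if the child's genotype can arise from the parents' genotypes.

    A missing call (None) cannot contradict inheritance, so it is accepted.
    """
    if None in (mother, father, child):
        return True
    possible = {m + f for m in GAMETES[mother] for f in GAMETES[father]}
    return child in possible


def mendelian_error_rate(trios):
    """Fraction of fully genotyped trios that violate Mendelian inheritance."""
    informative = [t for t in trios if None not in t]
    if not informative:
        return 0.0
    errors = sum(not is_mendelian_consistent(*t) for t in informative)
    return errors / len(informative)


# Hypothetical (mother, father, child) calls at one locus.
locus = [(0, 0, 1), (1, 1, 0), (None, 1, 2), (0, 2, 1)]
print(mendelian_error_rate(locus))
```

    A locus with a high error rate across families would be a candidate for removal; sex-linked loci need a different transmission table and are not handled by this sketch.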

  14. Porcine colonization of the Americas: a 60k SNP story

    PubMed Central

    Burgos-Paz, W; Souza, C A; Megens, H J; Ramayo-Caldas, Y; Melo, M; Lemús-Flores, C; Caal, E; Soto, H W; Martínez, R; Álvarez, L A; Aguirre, L; Iñiguez, V; Revidatti, M A; Martínez-López, O R; Llambi, S; Esteve-Codina, A; Rodríguez, M C; Crooijmans, R P M A; Paiva, S R; Schook, L B; Groenen, M A M; Pérez-Enciso, M

    2013-01-01

    The pig, Sus scrofa, is a foreign species to the American continent. Although pigs originally introduced in the Americas should be related to those from the Iberian Peninsula and Canary islands, the phylogeny of current creole pigs that now populate the continent is likely to be very complex. Because of the extreme climates that America harbors, these populations also provide a unique example of a fast evolutionary phenomenon of adaptation. Here, we provide a genome wide study of these issues by genotyping, with a 60k SNP chip, 206 village pigs sampled across 14 countries and 183 pigs from outgroup breeds that are potential founders of the American populations, including wild boar, Iberian, international and Chinese breeds. Results show that American village pigs are primarily of European ancestry, although the observed genetic landscape is that of a complex conglomerate. There was no correlation between genetic and geographical distances, neither continent wide nor when analyzing specific areas. Most populations showed a clear admixed structure where the Iberian pig was not necessarily the main component, illustrating how international breeds, but also Chinese pigs, have contributed to extant genetic composition of American village pigs. We also observe that many genes related to the cardiovascular system show an increased differentiation between altiplano and genetically related pigs living near sea level. PMID:23250008

  15. Porcine colonization of the Americas: a 60k SNP story.

    PubMed

    Burgos-Paz, W; Souza, C A; Megens, H J; Ramayo-Caldas, Y; Melo, M; Lemús-Flores, C; Caal, E; Soto, H W; Martínez, R; Alvarez, L A; Aguirre, L; Iñiguez, V; Revidatti, M A; Martínez-López, O R; Llambi, S; Esteve-Codina, A; Rodríguez, M C; Crooijmans, R P M A; Paiva, S R; Schook, L B; Groenen, M A M; Pérez-Enciso, M

    2013-04-01

    The pig, Sus scrofa, is a foreign species to the American continent. Although pigs originally introduced in the Americas should be related to those from the Iberian Peninsula and Canary islands, the phylogeny of current creole pigs that now populate the continent is likely to be very complex. Because of the extreme climates that America harbors, these populations also provide a unique example of a fast evolutionary phenomenon of adaptation. Here, we provide a genome wide study of these issues by genotyping, with a 60k SNP chip, 206 village pigs sampled across 14 countries and 183 pigs from outgroup breeds that are potential founders of the American populations, including wild boar, Iberian, international and Chinese breeds. Results show that American village pigs are primarily of European ancestry, although the observed genetic landscape is that of a complex conglomerate. There was no correlation between genetic and geographical distances, neither continent wide nor when analyzing specific areas. Most populations showed a clear admixed structure where the Iberian pig was not necessarily the main component, illustrating how international breeds, but also Chinese pigs, have contributed to extant genetic composition of American village pigs. We also observe that many genes related to the cardiovascular system show an increased differentiation between altiplano and genetically related pigs living near sea level.

  16. A 21-locus autosomal SNP multiplex and its application in forensic science.

    PubMed

    Hou, Guangwei; Jiang, Xianhua; Yang, Yanyan; Jia, Fei; Li, Qiang; Zhao, Jinling; Guo, Fei; Liu, Limin

    2014-01-01

    To develop a cost-effective technique for single-nucleotide polymorphism (SNP) genotyping and improve the efficiency of analyzing degraded DNA, we have established a novel multiplex system including 21 autosomal SNP loci and the amelogenin locus, which was based on allele-specific amplification (ASA) and universal reporter primers (URP). The target amplicons for the 21 SNPs ranged from 63 base pairs (bp) to 192 bp. The system was tested in 539 samples from three ethnic groups (Han, Mongolian, and Zhuang population) in China, and the total power of discrimination (TPD) and cumulative probability of exclusion (CPE) were more than 0.99999999 and 0.98, respectively. The system was further validated with forensic samples and full profiles could be achieved from degraded DNA and 63 case-type samples. In summary, the multiplex system offers an effective technique for individual identification of forensic samples and is much more efficient in the analysis of degraded DNA compared with standard STR typing.

  17. SNP (–617C>A) in ARE-Like Loci of the NRF2 Gene: A New Biomarker for Prognosis of Lung Adenocarcinoma in Japanese Non-Smoking Women

    PubMed Central

    Okano, Yasuko; Nezu, Uru; Enokida, Yasuaki; Lee, Ming Ta Michael; Kinoshita, Hiroko; Lezhava, Alexander; Hayashizaki, Yoshihide; Morita, Satoshi; Taguri, Masataka; Ichikawa, Yasushi; Kaneko, Takeshi; Natsumeda, Yutaka; Yokose, Tomoyuki; Nakayama, Haruhiko; Miyagi, Yohei; Ishikawa, Toshihisa

    2013-01-01

    Purpose: The transcription factor NRF2 plays a pivotal role in protecting normal cells from external toxic challenges and oxidative stress, whereas it can also endow cancer cells with resistance to anticancer drugs. At present little information is available about the genetic polymorphisms of the NRF2 gene and their clinical relevance. We aimed to investigate the single nucleotide polymorphisms in the NRF2 gene as a prognostic biomarker in lung cancer. Experimental Design: We prepared genomic DNA samples from 387 Japanese patients with primary lung cancer and detected SNP (c.–617C>A; rs6721961) in the ARE-like loci of the human NRF2 gene by the rapid genetic testing method we developed in this study. We then analyzed the association between the SNP in the NRF2 gene and patients’ overall survival. Results: Patients harboring wild-type (WT) homozygous (c.–617C/C), SNP heterozygous (c.–617C/A), and SNP homozygous (c.–617A/A) alleles numbered 216 (55.8%), 147 (38.0%), and 24 (6.2%), respectively. Multivariate logistic regression models revealed that SNP homozygote (c.–617A/A) was significantly related to gender. Its frequency was four-fold higher in female patients than in males (10.8% female vs 2.7% male) and was associated with female non-smokers with adenocarcinoma. Interestingly, lung cancer patients carrying NRF2 SNP homozygous alleles (c.–617A/A) and the 309T (WT) allele in the MDM2 gene exhibited remarkable survival over 1,700 days after surgical operation (log-rank p = 0.021). Conclusion: SNP homozygous (c.–617A/A) alleles in the NRF2 gene are associated with female non-smokers with adenocarcinoma and regarded as a prognostic biomarker for assessing overall survival of patients with lung adenocarcinoma. PMID:24040073

  18. Single Nucleotide Polymorphism (SNP)-Strings: An Alternative Method for Assessing Genetic Associations

    PubMed Central

    Goodin, Douglas S.; Khankhanian, Pouya

    2014-01-01

    Background Genome-wide association studies (GWAS) identify disease-associations for single-nucleotide-polymorphisms (SNPs) from scattered genomic-locations. However, SNPs frequently reside on several different SNP-haplotypes, only some of which may be disease-associated. This circumstance lowers the observed odds-ratio for disease-association. Methodology/Principal Findings Here we develop a method to identify the two SNP-haplotypes, which combine to produce each person’s SNP-genotype over specified chromosomal segments. Two multiple sclerosis (MS)-associated genetic regions were modeled; DRB1 (a Class II molecule of the major histocompatibility complex) and MMEL1 (an endopeptidase that degrades both neuropeptides and β-amyloid). For each locus, we considered sets of eleven adjacent SNPs, surrounding the putative disease-associated gene and spanning ∼200 kb of DNA. The SNP-information was converted into an ordered-set of eleven-numbers (subject-vectors) based on whether a person had zero, one, or two copies of particular SNP-variant at each sequential SNP-location. SNP-strings were defined as those ordered-combinations of eleven-numbers (0 or 1), representing a haplotype, two of which combined to form the observed subject-vector. Subject-vectors were resolved using probabilistic methods. In both regions, only a small number of SNP-strings were present. We compared our method to the SHAPEIT-2 phasing-algorithm. When the SNP-information spanning 200 kb was used, SHAPEIT-2 was inaccurate. When the SHAPEIT-2 window was increased to 2,000 kb, the concordance between the two methods, in both of these eleven-SNP regions, was over 99%, suggesting that, in these regions, both methods were quite accurate. Nevertheless, correspondence was not uniformly high over the entire DNA-span but, rather, was characterized by alternating peaks and valleys of concordance. Moreover, in the valleys of poor-correspondence, SHAPEIT-2 was also inconsistent with itself, suggesting that

  19. Impact of the PDE4D gene polymorphism and additional SNP-SNP and gene-smoking interaction on ischemic stroke risk in Chinese Han population.

    PubMed

    Wang, Xianxiang; Sun, Zhongwu; Zhang, Yiquan; Tian, Xuefeng; Li, Qingxin; Luo, Jing

    2017-04-01

    To investigate the association between phosphodiesterase 4D gene (PDE4D) single nucleotide polymorphisms (SNPs) and ischemic stroke (IS) risk, and the impact of additional SNP-SNP and gene-smoking interactions on IS risk in a Chinese population. A total of 1228 subjects (666 males, 562 females) were selected, including 610 IS patients and 618 control subjects. A logistic regression model was used to examine the association between SNPs in the PDE4D gene and IS risk. Generalized multifactor dimensionality reduction (GMDR) was employed to analyze the SNP-SNP and gene-smoking interactions. IS risk was significantly higher in carriers of the A allele of the rs12188950 polymorphism than in those with the GG genotype (GA + AA vs. GG), adjusted OR (95%CI) = 1.61 (1.26-2.19), and also significantly higher in carriers of the T allele of the rs966221 polymorphism than in those with CC (CT + TT vs. CC), adjusted OR (95%CI) = 1.82 (1.39-2.23). We found a significant SNP-SNP interaction between rs966221 and rs12188950. Subjects with the CT or TT genotype of rs966221 and the GA or AA genotype of rs12188950 had the highest IS risk compared to subjects with the CC genotype of rs966221 and the GG genotype of rs12188950; the OR (95%CI) was 3.52 (2.68-4.69). We also found a significant gene-environment interaction between rs966221 and smoking. Smokers with the CT or TT genotype of rs966221 had the highest IS risk compared to never-smokers with the CC genotype of rs966221; the OR (95%CI) was 3.97 (2.25-5.71). Our results support an important association of the rs966221 and rs12188950 minor alleles, and of their interaction, with increased IS risk, as well as an additional interaction between rs966221 and smoking.
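
    For readers unfamiliar with the "OR (95%CI)" notation, the following sketch computes an unadjusted odds ratio and a Woolf (log-scale) confidence interval from a 2x2 table. The counts are hypothetical and chosen only for illustration; the abstract reports adjusted odds ratios from logistic regression, which this sketch does not reproduce.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio with a Woolf confidence interval.

    a: carriers among cases       b: non-carriers among cases
    c: carriers among controls    d: non-carriers among controls
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Woolf approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

if __name__ == "__main__":
    # Hypothetical counts, not taken from the study.
    or_, lo, hi = odds_ratio_ci(20, 10, 10, 20)
    print(f"OR = {or_:.2f} (95% CI {lo:.2f}-{hi:.2f})")
```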

  20. Development and Evaluation of a 9K SNP Array for Peach by Internationally Coordinated SNP Detection and Validation in Breeding Germplasm

    PubMed Central

    Scalabrin, Simone; Gilmore, Barbara; Lawley, Cynthia T.; Gasic, Ksenija; Micheletti, Diego; Rosyara, Umesh R.; Cattonaro, Federica; Vendramin, Elisa; Main, Dorrie; Aramini, Valeria; Blas, Andrea L.; Mockler, Todd C.; Bryant, Douglas W.; Wilhelm, Larry; Troggio, Michela; Sosinski, Bryon; Aranzana, Maria José; Arús, Pere; Iezzoni, Amy; Morgante, Michele; Peace, Cameron

    2012-01-01

    Although a large number of single nucleotide polymorphism (SNP) markers covering the entire genome are needed to enable molecular breeding efforts such as genome wide association studies, fine mapping, genomic selection and marker-assisted selection in peach [Prunus persica (L.) Batsch] and related Prunus species, only a limited number of genetic markers, including simple sequence repeats (SSRs), have been available to date. To address this need, an international consortium (The International Peach SNP Consortium; IPSC) has pursued a coordinated effort to perform genome-scale SNP discovery in peach using next generation sequencing platforms to develop and characterize a high-throughput Illumina Infinium® SNP genotyping array platform. We performed whole genome re-sequencing of 56 peach breeding accessions using the Illumina and Roche/454 sequencing technologies. Polymorphism detection algorithms identified a total of 1,022,354 SNPs. Validation with the Illumina GoldenGate® assay was performed on a subset of the predicted SNPs, verifying ∼75% of genic (exonic and intronic) SNPs, whereas only about a third of intergenic SNPs were verified. Conservative filtering was applied to arrive at a set of 8,144 SNPs that were included on the IPSC peach SNP array v1, distributed over all eight peach chromosomes with an average spacing of 26.7 kb between SNPs. Use of this platform to screen a total of 709 accessions of peach in two separate evaluation panels identified a total of 6,869 (84.3%) polymorphic SNPs. The almost 7,000 SNPs verified as polymorphic through extensive empirical evaluation represent an excellent source of markers for future studies in genetic relatedness, genetic mapping, and dissecting the genetic architecture of complex agricultural traits. The IPSC peach SNP array v1 is commercially available and we expect that it will be used worldwide for genetic studies in peach and related stone fruit and nut species. PMID:22536421

  1. Identification of QTL and Qualitative Trait Loci for Agronomic Traits Using SNP Markers in the Adzuki Bean

    PubMed Central

    Li, Yuan; Yang, Kai; Yang, Wei; Chu, Liwei; Chen, Chunhai; Zhao, Bo; Li, Yisong; Jian, Jianbo; Yin, Zhichao; Wang, Tianqi; Wan, Ping

    2017-01-01

    The adzuki bean (Vigna angularis) is an important grain legume. Fine mapping of quantitative trait loci (QTL) and qualitative trait genes plays an important role in gene cloning, molecular-marker-assisted selection (MAS), and trait improvement. However, the genetic control of agronomic traits in the adzuki bean remains poorly understood. Single-nucleotide polymorphisms (SNPs) are invaluable in the construction of high-density genetic maps. We mapped 26 agronomic QTLs and five qualitative trait genes related to pigmentation using 1,571 polymorphic SNP markers from the adzuki bean genome via restriction-site-associated DNA sequencing of 150 members of an F2 population derived from a cross between cultivated and wild adzuki beans. We mapped 11 QTLs for flowering time and pod maturity on chromosomes 4, 7, and 10. Six 100-seed weight (SD100WT) QTLs were detected. Two major flowering time QTLs were located on chromosome 4, firstly VaFld4.1 (PEVs 71.3%), co-segregating with SNP marker s690-144110, and VaFld4.2 (PEVs 67.6%) at a 0.974 cM genetic distance from the SNP marker s165-116310. Three QTLs for seed number per pod (Snp3.1, Snp3.2, and Snp4.1) were mapped on chromosomes 3 and 4. One QTL VaSdt4.1 of seed thickness (SDT) and three QTLs for branch number on the main stem were detected on chromosome 4. QTLs for maximum leaf width (LFMW) and stem internode length were mapped to chromosomes 2 and 9, respectively. Trait genes controlling the color of the seed coat, pod, stem and flower were mapped to chromosomes 3 and 1. Three candidate genes, VaAGL, VaPhyE, and VaAP2, were identified for flowering time and pod maturity. VaAGL encodes an agamous-like MADS-box protein of 379 amino acids. VaPhyE encodes a phytochrome E protein of 1,121 amino acids. Four phytochrome genes (VaPhyA1, VaPhyA2, VaPhyB, and VaPhyE) were identified in the adzuki bean genome. We found candidate genes VaAP2/ERF.81 and VaAP2/ERF.82 of SD100WT, VaAP2-s4 of SDT, and VaAP2/ERF.86 of LFMW. A candidate gene

  2. Hierarchical Y-SNP assay to study the hidden diversity and phylogenetic relationship of native populations in South America.

    PubMed

    Geppert, Maria; Baeta, Miriam; Núñez, Carolina; Martínez-Jarreta, Begoña; Zweynert, Sarah; Cruz, Omar Wladimir Vacas; González-Andrade, Fabricio; González-Solorzano, Jorge; Nagy, Marion; Roewer, Lutz

    2011-03-01

    Studying the Y chromosomes of indigenous tribes of Ecuador revealed a lack of strategic SNP assays to examine the substructure of South American native populations. In most studies dealing with South American samples so far only the most common Y-SNP M3 of haplogroup Q was analyzed, because this is known to define a founder group in South America. Studies of SNPs ancestral to Q-M3 (Q1a3a) to confirm the results or the typing of Q subclades have often been neglected. For this reason we developed a SNaPshot assay, which allows first for a hierarchical testing of all main haplogroups occurring in South American populations and second for a detailed analysis of haplogroups Q and C thought having ancient Asian descent. We selected 16 SNPs from the YCC haplogroup tree and established two multiplexes. The first multiplex ("SA Major") includes 12 Y-SNPs defining the most frequent haplogroups occurring in South America (M42, M207, M242, M168, M3, M145, M174, M213, RPS4Y711, M45, P170, and M9). The second multiplex ("SA SpecQ") contains Y-SNPs of haplogroup Q, especially of the subclade Q-M3 (M19, M194, P292, M3, and M199). Within our Ecuadorian sample, haplogroup Q-M3 (xM19, M194, P292, and M199) was predominant, but we also found haplogroup E and R, which can be attributed to recent admixture. Moreover, we found four out of 65 samples, which were tested to be haplogroup C3* (C-M217) the modal haplogroup in Mongolians and widespread in indigenous populations of the Russian Far East as well as in Eastern Asia. This haplogroup is not known to be the result of recent admixture and has been found only one time before in South America. Since haplogroup C occurs in Asia and in North America (C3b or C-P39), we assume that these C-lineages are ancient as well. Therefore, we established a third multiplex ("SA SpecC"), which allows the further subtyping of haplogroup C, mainly of subclade C3 defined by the Y-SNP M217 (M407, M48, P53.1, M217, P62, RPS4Y711, M93, M86, and P39

  3. Software for optimization of SNP and PCR-RFLP genotyping to discriminate many genomes with the fewest assays

    PubMed Central

    Gardner, Shea N; Wagner, Mark C

    2005-01-01

    Background Microbial forensics is important in tracking the source of a pathogen, whether the disease is a naturally occurring outbreak or part of a criminal investigation. Results A method and SPR Opt (SNP and PCR-RFLP Optimization) software to perform a comprehensive, whole-genome analysis to forensically discriminate multiple sequences is presented. Tools for the optimization of forensic typing using Single Nucleotide Polymorphism (SNP) and PCR-Restriction Fragment Length Polymorphism (PCR-RFLP) analyses across multiple isolate sequences of a species are described. The PCR-RFLP analysis includes prediction and selection of optimal primers and restriction enzymes to enable maximum isolate discrimination based on sequence information. SPR Opt calculates all SNP or PCR-RFLP variations present in the sequences, groups them into haplotypes according to their co-segregation across those sequences, and performs combinatoric analyses to determine which sets of haplotypes provide maximal discrimination among all the input sequences. Those set combinations requiring that membership in the fewest haplotypes be queried (i.e. the fewest assays be performed) are found. These analyses highlight variable regions based on existing sequence data. These markers may be heterogeneous among unsequenced isolates as well, and thus may be useful for characterizing the relationships among unsequenced as well as sequenced isolates. The predictions are multi-locus. Analyses of mumps and SARS viruses are summarized. Phylogenetic trees created based on SNPs, PCR-RFLPs, and full genomes are compared for SARS virus, illustrating that purported phylogenies based only on SNP or PCR-RFLP variations do not match those based on multiple sequence alignment of the full genomes. Conclusion This is the first software to optimize the selection of forensic markers to maximize information gained from the fewest assays, accepting whole or partial genome sequence data as input. As more sequence data becomes
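
    The idea of grouping co-segregating variants and then searching for the fewest groups that tell all sequences apart can be sketched as follows. This is an illustrative brute-force version under simplifying assumptions (pre-aligned sequences of equal length, substitutions only); the function names are hypothetical and the code is not the SPR Opt implementation.

```python
from itertools import combinations

def variable_columns(seqs):
    """Alignment columns at which the sequences do not all carry the same base."""
    return [i for i in range(len(seqs[0])) if len({s[i] for s in seqs}) > 1]

def group_by_pattern(seqs, cols):
    """Group columns that split the sequences in exactly the same way."""
    groups = {}
    for c in cols:
        seen = {}
        # Relabel bases by order of first appearance so the pattern is canonical.
        pattern = tuple(seen.setdefault(s[c], len(seen)) for s in seqs)
        groups.setdefault(pattern, []).append(c)
    return groups

def smallest_discriminating_set(seqs):
    """Fewest co-segregation groups whose combined pattern is unique per sequence."""
    groups = group_by_pattern(seqs, variable_columns(seqs))
    patterns = list(groups)
    for k in range(1, len(patterns) + 1):
        for combo in combinations(patterns, k):
            profiles = {tuple(p[i] for p in combo) for i in range(len(seqs))}
            if len(profiles) == len(seqs):
                return [groups[p] for p in combo]
    return None  # some sequences are identical and cannot be discriminated

if __name__ == "__main__":
    toy = ["AAAAT", "AACAT", "GACAC", "GAAAC"]  # hypothetical aligned sequences
    print(smallest_discriminating_set(toy))
```

    Each returned inner list is one group of co-segregating columns; assaying any single column from each group would, in this toy setting, give the same discrimination.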

  4. Interim report on updated microarray probes for the LLNL Burkholderia pseudomallei SNP array

    SciTech Connect

    Gardner, S; Jaing, C

    2012-03-27

    The overall goal of this project is to forensically characterize 100 unknown Burkholderia isolates in the US-Australia collaboration. We will identify genome-wide single nucleotide polymorphisms (SNPs) from B. pseudomallei and near neighbor species including B. mallei, B. thailandensis and B. oklahomensis. We will design microarray probes to detect these SNP markers and analyze 100 Burkholderia genomic DNAs extracted from environmental, clinical and near neighbor isolates from Australian collaborators on the Burkholderia SNP microarray. We will analyze the microarray genotyping results to characterize the genetic diversity of these new isolates and triage the samples for whole genome sequencing. In this interim report, we described the SNP analysis and the microarray probe design for the Burkholderia SNP microarray.

  5. SNP ascertainment bias in population genetic analyses: Why it is important, and how to correct it

    PubMed Central

    Lachance, Joseph; Tishkoff, Sarah A.

    2013-01-01

    Summary Whole genome sequencing and SNP genotyping arrays can paint strikingly different pictures of demographic history and natural selection. This is because genotyping arrays contain biased sets of pre-ascertained SNPs. In this short review, we use comparisons between high-coverage whole genome sequences of African hunter-gatherers and data from genotyping arrays to highlight how SNP ascertainment bias distorts population genetic inferences. Sample sizes and the populations in which SNPs are discovered affect the characteristics of observed variants. We find that SNPs on genotyping arrays tend to be older and present in multiple populations. In addition, genotyping arrays cause allele frequency distributions to be shifted towards intermediate frequency alleles, and estimates of linkage disequilibrium are modified. Since population genetic analyses depend on allele frequencies it is imperative that researchers are aware of the effects of SNP ascertainment bias. With this in mind we describe multiple ways to correct for SNP ascertainment bias. PMID:23836388
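
    One simple way to see why arrays shift frequency distributions toward intermediate-frequency alleles is to ask how likely a SNP is to be discovered at all. The sketch below assumes a biallelic SNP and a discovery panel of n randomly sampled chromosomes; it is a textbook binomial calculation, not one of the correction methods described in the review.

```python
def prob_ascertained(p, n):
    """Probability that a biallelic SNP with allele frequency p appears
    polymorphic in a discovery panel of n randomly sampled chromosomes."""
    # The SNP is missed when all n chromosomes carry the same allele.
    return 1.0 - p ** n - (1.0 - p) ** n

if __name__ == "__main__":
    for p in (0.01, 0.1, 0.5):
        print(p, round(prob_ascertained(p, n=4), 4))
```

    With a small discovery panel, rare variants are seldom polymorphic in the panel and so are rarely placed on an array, while intermediate-frequency variants almost always are.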

  6. An overview of SNP interactions in genome-wide association studies.

    PubMed

    Li, Pei; Guo, Maozu; Wang, Chunyu; Liu, Xiaoyan; Zou, Quan

    2015-03-01

    With the recent explosion in high-throughput genotyping technology, the amount and quality of single-nucleotide polymorphism (SNP) data has increased exponentially. Therefore, the identification of SNP interactions that are associated with common diseases is playing an increasing and important role in interpreting the genetic basis of disease susceptibility and in devising new diagnostic tests and treatments. However, because these data sets are large, although they typically have small sample sizes and low signal-to-noise ratios, there has been no major breakthrough despite many efforts, making this a major focus in the field of bioinformatics. In this article, we review the two main aspects of SNP interaction studies in recent years-the simulation and identification of SNP interactions-and then discuss the principles, efficiency and differences between these methods. © The Author 2014. Published by Oxford University Press. All rights reserved. For permissions, please email: journals.permissions@oup.com.

  7. SNP discovery and genotyping using Genotyping-by-Sequencing in Pekin ducks

    PubMed Central

    Zhu, Feng; Cui, Qian-Qian; Hou, Zhuo-Cheng

    2016-01-01

    Genomic selection and genome-wide association studies need thousands to millions of SNPs. However, many non-model species do not have reference chips for detecting variation. Our goal was to develop and validate an inexpensive but effective method for detecting SNP variation. Genotyping by sequencing (GBS) can be a highly efficient strategy for genome-wide SNP detection, as an alternative to microarray chips. Here, we developed a GBS protocol for ducks and used it to genotype 49 Pekin ducks. A total of 169,209 SNPs were identified from all animals, with a mean of 55,920 SNPs per individual. The average SNP density reached 1156 SNPs/Mb. In this study, the first application of GBS to ducks, we demonstrate the power and simplicity of this method. GBS can be used in genetic studies to provide an effective method for genome-wide SNP discovery. PMID:27845353

  8. A user guide to the Brassica 60K Illumina Infinium™ SNP genotyping array.

    PubMed

    Mason, Annaliese S; Higgins, Erin E; Snowdon, Rod J; Batley, Jacqueline; Stein, Anna; Werner, Christian; Parkin, Isobel A P

    2017-04-01

    The Brassica napus 60K Illumina Infinium™ SNP array has had huge international uptake in the rapeseed community due to the revolutionary speed of acquisition and ease of analysis of this high-throughput genotyping data, particularly when coupled with the newly available reference genome sequence. However, further utilization of this valuable resource can be optimized by better understanding the promises and pitfalls of SNP arrays. We outline how best to analyze Brassica SNP marker array data for diverse applications, including linkage and association mapping, genetic diversity and genomic introgression studies. We present data on which SNPs are locus-specific in winter, semi-winter and spring B. napus germplasm pools, rather than amplifying both an A-genome and a C-genome locus or multiple loci. Common issues that arise when analyzing array data will be discussed, particularly those unique to SNP markers, along with how to deal with them in practical Brassica breeding applications.

  9. Set up of cutoff thresholds for kinship determination using SNP loci.

    PubMed

    Cho, Sohee; Shin, Eun Soon; Yu, Hyung Jin; Lee, Ji Hyun; Seo, Hee Jin; Kim, Moon Young; Lee, Soong Deok

    2017-03-08

    The usefulness of single nucleotide polymorphism (SNP) loci for kinship testing has been demonstrated in many case works, and suggested as a promising marker for relationship identification. For interpreting results based on the calculation of the likelihood ratio (LR) in kinship testing, it is important to prepare cutoffs for respective relatives which are dependent on genetic relatedness. For this, analysis using true pedigree data is significant and reliable as it reflects the actual frequencies of markers in the population. In this study, the kinship index was explored through 1209 parent-child pairs, 1373 full sibling pairs, and 247 uncle-nephew pairs using 136 SNP loci. The cutoffs for LR were set up using different numbers of SNP loci with accuracy, sensitivity, and specificity. It is expected that this study can support the application of SNP loci-based kinship testing for various relationships.

  10. Identification of SNP Haplotypes and Prospects of Association Mapping in Watermelon

    USDA-ARS's Scientific Manuscript database

    Watermelon is the fifth most economically important vegetable crop cultivated world-wide. Implementing Single Nucleotide Polymorphism (SNP) marker technology in watermelon breeding and germplasm evaluation programs holds a key to improve horticulturally important traits. Next-generation sequencing...

  11. Use of molecular variation in the NCBI dbSNP database.

    PubMed

    Sherry, S T; Ward, M; Sirotkin, K

    2000-01-01

    While high quality information regarding variation in genes is currently available in locus-specific or specialized mutation databases, the need remains for a general catalog of genome variation to address the large-scale sampling designs required by association studies, gene mapping, and evolutionary biology. In response to this need, the National Center for Biotechnology Information (NCBI) has established the dbSNP database http://ncbi.nlm.nih.gov/SNP/ to serve as a generalized, central variation database. Submissions to dbSNP will be integrated with other sources of information at NCBI such as GenBank, PubMed, LocusLink, and the Human Genome Project data, and the complete contents of dbSNP are available to the public via anonymous FTP. Hum Mutat 15:68-75, 2000. Published 2000 Wiley-Liss, Inc.

  12. Methods of tagSNP selection and other variables affecting imputation accuracy in swine

    PubMed Central

    2013-01-01

    Background Genotype imputation is a cost efficient alternative to use of high density genotypes for implementing genomic selection. The objective of this study was to investigate variables affecting imputation accuracy from low density tagSNP (average distance between tagSNP from 100kb to 1Mb) sets in swine, selected using LD information, physical location, or accuracy for genotype imputation. We compared results of imputation accuracy based on several sets of low density tagSNP of varying densities and selected using three different methods. In addition, we assessed the effect of varying size and composition of the reference panel of haplotypes used for imputation. Results TagSNP density of at least 1 tagSNP per 340kb (∼7000 tagSNP) selected using pairwise LD information was necessary to achieve average imputation accuracy higher than 0.95. A commercial low density (9K) tagSNP set for swine was developed concurrent to this study and an average accuracy of imputation of 0.951 based on these tagSNP was estimated. Construction of a haplotype reference panel was most efficient when these haplotypes were obtained from randomly sampled individuals. Increasing the size of the original reference haplotype panel (128 haplotypes sampled from 32 sire/dam/offspring trios phased in a previous study) led to an overall increase in imputation accuracy (IA = 0.97 with 512 haplotypes), but was especially useful in increasing imputation accuracy of SNP with MAF below 0.1 and for SNP located in the chromosomal extremes (within 5% of chromosome end). Conclusion The new commercially available 9K tagSNP set can be used to obtain imputed genotypes with high accuracy, even when imputation is based on a comparably small panel of reference haplotypes (128 haplotypes). Average imputation accuracy can be further increased by adding haplotypes to the reference panel. In addition, our results show that randomly sampling individuals to genotype for the construction of a reference haplotype
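
    Selection of tagSNP from pairwise LD information is usually based on the r² statistic between SNP pairs. The sketch below computes r² from phased 0/1 haplotypes and applies a simple greedy rule; the threshold of 0.8 and the greedy ordering are assumptions made for illustration and are not taken from the study.

```python
def r_squared(hap_a, hap_b):
    """Pairwise LD (r^2) between two SNPs given 0/1 alleles on the same haplotypes."""
    n = len(hap_a)
    p_a = sum(hap_a) / n
    p_b = sum(hap_b) / n
    p_ab = sum(a & b for a, b in zip(hap_a, hap_b)) / n
    d = p_ab - p_a * p_b  # linkage disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    return d * d / denom if denom > 0 else 0.0  # monomorphic SNPs get r^2 = 0

def greedy_tags(haps_by_snp, threshold=0.8):
    """Keep a SNP as a tag unless an earlier tag already captures it at r^2 >= threshold."""
    tags = []
    for i, hap in enumerate(haps_by_snp):
        if not any(r_squared(hap, haps_by_snp[t]) >= threshold for t in tags):
            tags.append(i)
    return tags

if __name__ == "__main__":
    # Hypothetical data: three SNPs observed on four haplotypes.
    snps = [[0, 1, 0, 1], [0, 1, 0, 1], [0, 0, 1, 1]]
    print(greedy_tags(snps))
```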

  13. Longitudinal SNP-set association analysis of quantitative phenotypes.

    PubMed

    Wang, Zhong; Xu, Ke; Zhang, Xinyu; Wu, Xiaowei; Wang, Zuoheng

    2017-01-01

    Many genetic epidemiological studies collect repeated measurements over time. This design not only provides a more accurate assessment of disease condition, but allows us to explore the genetic influence on disease development and progression. Thus, it is of great interest to study the longitudinal contribution of genes to disease susceptibility. Most association testing methods for longitudinal phenotypes are developed for single variant, and may have limited power to detect association, especially for variants with low minor allele frequency. We propose Longitudinal SNP-set/sequence kernel association test (LSKAT), a robust, mixed-effects method for association testing of rare and common variants with longitudinal quantitative phenotypes. LSKAT uses several random effects to account for the within-subject correlation in longitudinal data, and allows for adjustment for both static and time-varying covariates. We also present a longitudinal trait burden test (LBT), where we test association between the trait and the burden score in linear mixed models. In simulation studies, we demonstrate that LBT achieves high power when variants are almost all deleterious or all protective, while LSKAT performs well in a wide range of genetic models. By making full use of trait values from repeated measures, LSKAT is more powerful than several tests applied to a single measurement or average over all time points. Moreover, LSKAT is robust to misspecification of the covariance structure. We apply the LSKAT and LBT methods to detect association with longitudinally measured body mass index in the Framingham Heart Study, where we are able to replicate association with a circadian gene NR1D2. © 2016 WILEY PERIODICALS, INC.

  14. Prim-SNPing: a primer designer for cost-effective SNP genotyping.

    PubMed

    Chang, Hsueh-Wei; Chuang, Li-Yeh; Cheng, Yu-Huei; Hung, Yu-Chen; Wen, Cheng-Hao; Gu, De-Leung; Yang, Cheng-Hong

    2009-05-01

    Many kinds of primer design (PD) software tools have been developed, but most of them lack a single nucleotide polymorphism (SNP) genotyping service. Here, we introduce the web-based freeware "Prim-SNPing," which, in addition to general PD, provides three kinds of primer design functions for cost-effective SNP genotyping: natural PD, mutagenic PD, and confronting two-pair primers (CTPP) PD. The natural PD and mutagenic PD provide primers and restriction enzyme mining for polymerase chain reaction-restriction fragment of length polymorphism (PCR-RFLP), while CTPP PD provides primers for restriction enzyme-free SNP genotyping. The PCR specificity and efficiency of the designed primers are improved by BLAST searching and evaluating secondary structure (such as GC clamps, dimers, and hairpins), respectively. The length pattern of PCR-RFLP using natural PD is user-adjustable, and the restriction sites of the RFLP enzymes provided by Prim-SNPing are confirmed to be absent within the generated PCR product. In CTPP PD, the need for a separate digestion step in RFLP is eliminated, thus making it faster and cheaper. The output of Prim-SNPing includes the primer list, melting temperature (Tm) value, GC percentage, and amplicon size with enzyme digestion information. The reference SNP (refSNP, or rs) clusters from the Single Nucleotide Polymorphism database (dbSNP) at the National Center for Biotechnology Information (NCBI), and multiple other formats of human, mouse, and rat SNP sequences are acceptable input. In summary, Prim-SNPing provides interactive, user-friendly and cost-effective primer design for SNP genotyping. It is freely available at http://bio.kuas.edu.tw/prim-snping.
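
    Two of the primer properties listed in the output, GC percentage and melting temperature, can be approximated with very little code. The sketch below uses the Wallace rule (2 °C per A/T, 4 °C per G/C), a rough estimate suited to short oligonucleotides; this is an assumption for illustration, since the abstract does not say which Tm formula Prim-SNPing uses.

```python
def gc_percent(primer):
    """Percentage of G and C bases in a primer sequence."""
    seq = primer.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)

def wallace_tm(primer):
    """Rough melting temperature in degrees Celsius by the Wallace rule."""
    seq = primer.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc

if __name__ == "__main__":
    primer = "ATGCATGCATGCATGCATGC"  # hypothetical 20-mer
    print(gc_percent(primer), wallace_tm(primer))
```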

  15. Evaluation of approaches for identifying population informative markers from high density SNP Chips

    PubMed Central

    2011-01-01

    Background Genetic markers can be used to identify and verify the origin of individuals. Motivation for the inference of ancestry ranges from conservation genetics to forensic analysis. High density assays featuring Single Nucleotide Polymorphism (SNP) markers can be exploited to create a reduced panel containing the most informative markers for these purposes. The objectives of this study were to evaluate methods of marker selection and determine the minimum number of markers from the BovineSNP50 BeadChip required to verify the origin of individuals in European cattle breeds. Delta, Wright's FST, Weir & Cockerham's FST and PCA methods for population differentiation were compared. The level of informativeness of each SNP was estimated from the breed specific allele frequencies. Individual assignment analysis was performed using the ranked informative markers. Stringency levels were applied by log-likelihood ratio to assess the confidence of the assignment test. Results A 95% assignment success rate for the 384 individually genotyped animals was achieved with < 80, < 100, < 140 and < 200 SNP markers (with increasing stringency threshold levels) across all the examined methods for marker selection. No further gain in power of assignment was achieved by sampling in excess of 200 SNP markers. The marker selection method that required the lowest number of SNP markers to verify the animal's breed origin was Wright's FST (60 to 140 SNPs depending on the chosen degree of confidence). Certain breeds required fewer markers (< 100) to achieve 100% assignment success. In contrast, closely related breeds require more markers (~200) to achieve > 95% assignment success. The power of assignment success, and therefore the number of SNP markers required, is dependent on the levels of genetic heterogeneity and pool of samples considered. Conclusions While all SNP selection methods produced marker panels capable of breed identification, the power of assignment varied markedly among
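
    Two of the informativeness measures compared here, Delta and Wright's FST, can be written down for the simplest case of one biallelic SNP and two equally weighted breeds. The sketch below uses the standard definitions (Delta as the absolute allele frequency difference, FST as (HT - HS) / HT); the marker names and frequencies are hypothetical, and the study's multi-breed calculations are not reproduced.

```python
def delta(p1, p2):
    """Absolute allele frequency difference between two breeds."""
    return abs(p1 - p2)

def wright_fst(p1, p2):
    """Wright's FST for one biallelic SNP and two equally weighted breeds."""
    h_s = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2  # mean within-breed heterozygosity
    p_bar = (p1 + p2) / 2
    h_t = 2 * p_bar * (1 - p_bar)  # heterozygosity of the pooled population
    return (h_t - h_s) / h_t if h_t > 0 else 0.0

def rank_markers(freqs):
    """Order marker names from most to least informative by FST."""
    return sorted(freqs, key=lambda m: wright_fst(*freqs[m]), reverse=True)

if __name__ == "__main__":
    # Hypothetical breed-specific allele frequencies for three markers.
    freqs = {"snp_a": (0.9, 0.1), "snp_b": (0.5, 0.4), "snp_c": (0.3, 0.3)}
    print(rank_markers(freqs))
```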

  16. Design and validation of a 90K SNP genotyping assay for the water buffalo (Bubalus bubalis).

    PubMed

    Iamartino, Daniela; Nicolazzi, Ezequiel L; Van Tassell, Curtis P; Reecy, James M; Fritz-Waters, Eric R; Koltes, James E; Biffani, Stefano; Sonstegard, Tad S; Schroeder, Steven G; Ajmone-Marsan, Paolo; Negrini, Riccardo; Pasquariello, Rolando; Ramelli, Paola; Coletta, Angelo; Garcia, José F; Ali, Ahmad; Ramunno, Luigi; Cosenza, Gianfranco; de Oliveira, Denise A A; Drummond, Marcela G; Bastianetto, Eduardo; Davassi, Alessandro; Pirani, Ali; Brew, Fiona; Williams, John L

    2017-01-01

    The availability of the bovine genome sequence and SNP panels has improved various genomic analyses, from exploring genetic diversity to aiding genetic selection. However, few of the SNPs on the bovine chips are polymorphic in buffalo; therefore, a panel of single nucleotide DNA markers exclusive to buffalo was necessary for molecular genetic analyses and to develop genomic selection approaches for water buffalo. The creation of a 90K SNP panel for river buffalo and its testing in a genome wide association study for milk production are described here. The genomes of 73 buffaloes of 4 different breeds were sequenced and aligned against the bovine genome, which facilitated the identification of 22 million sequence variants among the buffalo genomes. Based on frequencies of variants within and among buffalo breeds, and their distribution across the genome, inferred from the bovine genome sequence, 90,000 putative single nucleotide polymorphisms were selected to create an Axiom® Buffalo Genotyping Array 90K. This 90K "SNP-Chip" was tested in several river buffalo populations and found to have ∼70% high quality and polymorphic SNPs. Of the 90K SNPs about 24K were also found to be polymorphic in swamp buffalo. The SNP chip was used to investigate the structure of buffalo populations, and could distinguish buffalo from different farms. A Genome Wide Association Study identified genomic regions on 5 chromosomes putatively involved in milk production. The 90K buffalo SNP chip described here is suitable for the analysis of the genomes of river buffalo breeds, and could be used for genetic diversity studies and potentially as a starting point for genome-assisted selection programmes. This SNP Chip could also be used to analyse swamp buffalo, but many loci are not informative and creation of a revised SNP set specific for swamp buffalo would be advised.

  17. Evaluation of breast cancer susceptibility using improved genetic algorithms to generate genotype SNP barcodes.

    PubMed

    Yang, Cheng-Hong; Lin, Yu-Da; Chuang, Li-Yeh; Chang, Hsueh-Wei

    2013-01-01

    Genetic association is a challenging task for the identification and characterization of genes that increase the susceptibility to common complex multifactorial diseases. To fully execute genetic studies of complex diseases, modern geneticists face the challenge of detecting interactions between loci. A genetic algorithm (GA) is developed to detect the association of genotype frequencies of cancer cases and noncancer cases based on statistical analysis. An improved genetic algorithm (IGA) is proposed to improve the reliability of the GA method for high-dimensional SNP-SNP interactions. The strategy offers the top five results to the random population process, in which they guide the GA toward a significant search course. The IGA increases the likelihood of quickly detecting the maximum ratio difference between cancer cases and noncancer cases. The joint effect of 23 SNP combinations of six steroid hormone metabolism and signaling-related genes involved in breast carcinogenesis pathways was systematically evaluated, with the IGA successfully detecting significant ratio differences between breast cancer cases and noncancer cases. The possible breast cancer risks were subsequently analyzed by odds-ratio (OR) and risk-ratio analysis. The estimated OR of the best SNP barcode is significantly higher than 1 (between 1.15 and 7.01) for specific combinations of two to 13 SNPs. Analysis results support that the IGA provides higher ratio difference values than the GA between breast cancer cases and noncancer cases over 3-SNP to 13-SNP interactions. A more specific SNP-SNP interaction profile for the risk of breast cancer is also provided.

  18. SNP detection in Na/K ATP-ase gene α1 subunit of bisexual and parthenogenetic Artemia strains by RFLP screening.

    PubMed

    Manaffar, R; Zare, S; Agh, N; Abdolahzadeh, N; Soltanian, S; Sorgeloos, P; Bossier, P; Van Stappen, G

    2011-01-01

    In order to find a marker for differentiating between a bisexual and a parthenogenetic Artemia strain, Exon-7 of the Na/K ATPase α(1) subunit gene was screened by RFLP technique. The results revealed a constant synonymous SNP (single nucleotide polymorphism) in digestion by the Tru1I enzyme that was consistent with these two types of Artemia. This SNP was identified as an accurate molecular marker for discrimination between bisexual and parthenogenetic Artemia. According to Nei's genetic distance (1973), the lowest genetic distance was found between individuals from Artemia urmiana Günther 1890 and parthenogenetic populations, making the described marker the first marker to easily distinguish between these two cooccurring species. © 2010 Blackwell Publishing Ltd.
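
    How a single SNP changes an RFLP pattern can be sketched as follows. The sketch assumes that Tru1I recognizes TTAA and cuts after the first T, as its isoschizomer MseI does. The two short sequences are invented for illustration and are not the Artemia exon.

```python
def digest(seq, site="TTAA", cut_offset=1):
    """Return fragment lengths after cutting seq at every occurrence of site.

    cut_offset is the number of bases of the site left on the upstream
    fragment (assumed to be 1 for a T^TAA cut).
    """
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:])]


# Hypothetical alleles differing by one base inside the recognition site.
allele_cut = "GGGGTTAACCCC"
allele_uncut = "GGGGTCAACCCC"
print(digest(allele_cut))    # two fragments
print(digest(allele_uncut))  # one uncut fragment
```

    The allele that keeps the site yields two bands and the other yields one, which is the kind of difference an RFLP gel makes visible.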

  19. Exploring germplasm diversity to understand the domestication process in Cicer spp. using SNP and DArT markers.

    PubMed

    Roorkiwal, Manish; von Wettberg, Eric J; Upadhyaya, Hari D; Warschefsky, Emily; Rathore, Abhishek; Varshney, Rajeev K

    2014-01-01

    To estimate genetic diversity within and between 10 interfertile Cicer species (94 genotypes) from the primary, secondary and tertiary gene pool, we analysed 5,257 DArT markers and 651 KASPar SNP markers. Based on successful allele calling in the tertiary gene pool, 2,763 DArT and 624 SNP markers that are polymorphic between genotypes from the gene pools were analyzed further. STRUCTURE analyses were consistent with 3 cultivated populations, representing kabuli, desi and pea-shaped seed types, with substantial admixture among these groups, while two wild populations were observed using DArT markers. AMOVA was used to partition variance among hierarchical sets of landraces and wild species at both the geographical and species level, with 61% of the variation found between species, and 39% within species. Molecular variance among the wild species was high (39%) compared to the variation present in cultivated material (10%). Observed heterozygosity was higher in wild species than the cultivated species for each linkage group. Our results support the Fertile Crescent both as the center of domestication and diversification of chickpea. The collection used in the present study covers all the three regions of historical chickpea cultivation, with the highest diversity in the Fertile Crescent region. Shared alleles between different gene pools suggest the possibility of gene flow among these species or incomplete lineage sorting and could indicate complicated patterns of divergence and fusion of wild chickpea taxa in the past.

  20. Next-generation transcriptome sequencing, SNP discovery and validation in four market classes of peanut, Arachis hypogaea L.

    PubMed

    Chopra, Ratan; Burow, Gloria; Farmer, Andrew; Mudge, Joann; Simpson, Charles E; Wilkins, Thea A; Baring, Michael R; Puppala, Naveen; Chamberlin, Kelly D; Burow, Mark D

    2015-06-01

    Single-nucleotide polymorphisms, which can be identified in the thousands or millions from comparisons of transcriptome or genome sequences, are ideally suited for making high-resolution genetic maps, investigating population evolutionary history, and discovering marker-trait linkages. Despite significant results from their use in human genetics, progress in identification and use in plants, and particularly polyploid plants, has lagged. As part of a long-term project to identify and use SNPs suitable for these purposes in cultivated peanut, which is tetraploid, we generated transcriptome sequences of four peanut cultivars, namely OLin, New Mexico Valencia C, Tamrun OL07 and Jupiter, which represent the four major market classes of peanut grown in the world, and which are important economically to the US southwest peanut growing region. CopyDNA libraries of each genotype were used to generate 2 × 54 paired-end reads using an Illumina GAIIx sequencer. Raw reads were mapped to a custom reference consisting of Tifrunner 454 sequences plus peanut ESTs in GenBank, comprising 43,108 contigs; 263,840 SNP and indel variants were identified among four genotypes compared to the reference. A subset of 6 variants was assayed across 24 genotypes representing four market types using KASP chemistry to assess the criteria for SNP selection. Results demonstrated that transcriptome sequencing can identify SNPs usable as selectable DNA-based markers in complex polyploid species such as peanut. Criteria for effective use of SNPs as markers are discussed in this context.

  1. Exploring Germplasm Diversity to Understand the Domestication Process in Cicer spp. Using SNP and DArT Markers

    PubMed Central

    Roorkiwal, Manish; von Wettberg, Eric J.; Upadhyaya, Hari D.; Warschefsky, Emily; Rathore, Abhishek; Varshney, Rajeev K.

    2014-01-01

    To estimate genetic diversity within and between 10 interfertile Cicer species (94 genotypes) from the primary, secondary and tertiary gene pool, we analysed 5,257 DArT markers and 651 KASPar SNP markers. Based on successful allele calling in the tertiary gene pool, 2,763 DArT and 624 SNP markers that are polymorphic between genotypes from the gene pools were analyzed further. STRUCTURE analyses were consistent with 3 cultivated populations, representing kabuli, desi and pea-shaped seed types, with substantial admixture among these groups, while two wild populations were observed using DArT markers. AMOVA was used to partition variance among hierarchical sets of landraces and wild species at both the geographical and species level, with 61% of the variation found between species, and 39% within species. Molecular variance among the wild species was high (39%) compared to the variation present in cultivated material (10%). Observed heterozygosity was higher in wild species than the cultivated species for each linkage group. Our results support the Fertile Crescent both as the center of domestication and diversification of chickpea. The collection used in the present study covers all the three regions of historical chickpea cultivation, with the highest diversity in the Fertile Crescent region. Shared alleles between different gene pools suggest the possibility of gene flow among these species or incomplete lineage sorting and could indicate complicated patterns of divergence and fusion of wild chickpea taxa in the past. PMID:25010059

  2. SNP2TFBS – a database of regulatory SNPs affecting predicted transcription factor binding site affinity

    PubMed Central

    Kumar, Sunil; Ambrosini, Giovanna; Bucher, Philipp

    2017-01-01

    SNP2TFBS is a computational resource intended to support researchers investigating the molecular mechanisms underlying regulatory variation in the human genome. The database essentially consists of a collection of text files providing specific annotations for human single nucleotide polymorphisms (SNPs), namely whether they are predicted to abolish, create or change the affinity of one or several transcription factor (TF) binding sites. A SNP's effect on TF binding is estimated based on a position weight matrix (PWM) model for the binding specificity of the corresponding factor. These data files are regenerated at regular intervals by an automatic procedure that takes as input a reference genome, a comprehensive SNP catalogue and a collection of PWMs. SNP2TFBS is also accessible over a web interface, enabling users to view the information provided for an individual SNP, to extract SNPs based on various search criteria, to annotate uploaded sets of SNPs or to display statistics about the frequencies of binding sites affected by selected SNPs. Homepage: http://ccg.vital-it.ch/snp2tfbs/. PMID:27899579
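
    The idea of estimating a SNP's effect from a position weight matrix can be sketched in a few lines. This is an illustrative toy and not the SNP2TFBS procedure: the count matrix, the pseudocount, the uniform background and all function names are assumptions, and only the forward strand is scanned.

```python
import math

BASES = "ACGT"


def log_odds_pwm(counts, pseudocount=1.0, background=0.25):
    """Turn per-position base counts into a log2-odds weight matrix."""
    pwm = []
    for col in counts:
        total = sum(col[b] for b in BASES) + 4 * pseudocount
        pwm.append(
            {b: math.log2(((col[b] + pseudocount) / total) / background) for b in BASES}
        )
    return pwm


def score(pwm, site):
    """Sum the weights of the bases in a site of the same length as the PWM."""
    return sum(col[base] for col, base in zip(pwm, site))


def best_score(pwm, seq):
    """Best score over all forward-strand windows of seq."""
    width = len(pwm)
    return max(score(pwm, seq[i:i + width]) for i in range(len(seq) - width + 1))


def snp_effect(pwm, seq, pos, alt):
    """Score change (alt minus ref) over the windows that overlap the SNP."""
    width = len(pwm)
    start = max(0, pos - width + 1)
    end = min(len(seq), pos + width)
    ref_window = seq[start:end]
    offset = pos - start
    alt_window = ref_window[:offset] + alt + ref_window[offset + 1:]
    return best_score(pwm, alt_window) - best_score(pwm, ref_window)


# Hypothetical 4-bp motif with consensus TGAC.
counts = [
    {"A": 0, "C": 0, "G": 0, "T": 10},
    {"A": 0, "C": 0, "G": 10, "T": 0},
    {"A": 10, "C": 0, "G": 0, "T": 0},
    {"A": 0, "C": 10, "G": 0, "T": 0},
]
pwm = log_odds_pwm(counts)
delta = snp_effect(pwm, "AATGACAA", pos=3, alt="C")
print(f"score change: {delta:.2f}")  # negative: the alternate allele weakens the site
```

    A strongly negative change would suggest a weakened or abolished site and a positive change a created or strengthened one; where to draw those lines is a thresholding decision this sketch leaves open.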

  3. SNP rs1511412 in FOXL2 gene as a risk factor for keloid by meta analysis.

    PubMed

    Lu, Wensheng; Zheng, Xiaodong; Liu, Shengli; Ding, Maoqian; Xie, Jian; Yao, Xiuhua; Zhang, Lanfang; Hu, Bai

    2015-01-01

    Determine whether SNP rs1511412 is associated with keloid. One large-scale GWAS identified association between SNP rs1511412 in the FOXL2 gene and keloid disease in the Japanese population. However, researchers didn't observe significant association for keloid in Chinese Han population (PBonferroni>0.05). It's probable that the frequency of this variant in Chinese Han population was relatively low and the sample size was not very large in this study (power =45.5). We performed an independent case control association study in the Chinese Han population and a follow-up large scale meta-analysis for SNP rs1511412. Our study included 309 keloid patients and 1080 controls of the Chinese Han population. A significant association was found between SNP and keloid (P=0.02, OR=2.23). Meta-analysis included 1847 keloid patients and 7229 controls combined from five Asian populations. The association between SNP rs1511412 and keloid became highly significant (P<1×10^-8, OR=1.89). We conclude that SNP rs1511412 in FOXL2 is indeed a genetic risk factor for keloid across different ethnic populations.

  4. SNP rs1511412 in FOXL2 gene as a risk factor for keloid by meta analysis

    PubMed Central

    Lu, Wensheng; Zheng, Xiaodong; Liu, Shengli; Ding, Maoqian; Xie, Jian; Yao, Xiuhua; Zhang, Lanfang; Hu, Bai

    2015-01-01

    Objective: Determine whether SNP rs1511412 is associated with keloid. Design and methods: One large-scale GWAS identified association between SNP rs1511412 in the FOXL2 gene and keloid disease in the Japanese population. However, researchers didn’t observe significant association for keloid in Chinese Han population (PBonferroni>0.05). It’s probable that the frequency of this variant in Chinese Han population was relatively low and the sample size was not very large in this study (power =45.5). We performed an independent case control association study in the Chinese Han population and a follow-up large scale meta-analysis for SNP rs1511412. Results: Our study included 309 keloid patients and 1080 controls of the Chinese Han population. A significant association was found between SNP and keloid (P=0.02, OR=2.23). Meta-analysis included 1847 keloid patients and 7229 controls combined from five Asian populations. The association between SNP rs1511412 and keloid became highly significant (P<1×10^-8, OR=1.89). Conclusion: We conclude that SNP rs1511412 in FOXL2 is indeed a genetic risk factor for keloid across different ethnic populations. PMID:25932232

  5. Construction of a versatile SNP array for pyramiding useful genes of rice.

    PubMed

    Kurokawa, Yusuke; Noda, Tomonori; Yamagata, Yoshiyuki; Angeles-Shim, Rosalyn; Sunohara, Hidehiko; Uehara, Kanako; Furuta, Tomoyuki; Nagai, Keisuke; Jena, Kshirod Kumar; Yasui, Hideshi; Yoshimura, Atsushi; Ashikari, Motoyuki; Doi, Kazuyuki

    2016-01-01

    DNA marker-assisted selection (MAS) has become an indispensable component of breeding. Single nucleotide polymorphisms (SNP) are the most frequent polymorphism in the rice genome. However, SNP markers are not readily employed in MAS because of limitations in genotyping platforms. Here the authors report a Golden Gate SNP array that targets specific genes controlling yield-related traits and biotic stress resistance in rice. As a first step, the SNP genotypes were surveyed in 31 parental varieties using the Affymetrix Rice 44K SNP microarray. The haplotype information for 16 target genes was then converted to the Golden Gate platform with 143-plex markers. Haplotypes for the 14 useful alleles are unique and can discriminate among all other varieties. The genotyping consistency between the Affymetrix microarray and the Golden Gate array was 92.8%, and the accuracy of the Golden Gate array was confirmed in 3 F2 segregating populations. The concept of haplotype-based selection using the constructed SNP array was demonstrated. Copyright © 2015 The Authors. Published by Elsevier Ireland Ltd. All rights reserved.

  6. SNP and mutation data on the web - hidden treasures for uncovering.

    PubMed

    Barnes, Michael R

    2002-01-01

    SNP data has grown exponentially over the last two years, and SNP database evolution has matched this growth: the initial development of several independent SNP databases has given way to one central SNP database, dbSNP. Other SNP databases have instead evolved to complement this central database by providing gene-specific focus and an increased level of curation and analysis on subsets of data derived from the central data set. By contrast, human mutation data, which has been collected over many years, is still stored in disparate sources, although moves are afoot to move to a similar central database. These developments are timely, because human mutation and polymorphism data hold complementary keys to a better understanding of how genes function and malfunction in disease. The impending availability of a complete human genome presents us with an ideal framework to integrate both these forms of data; as our understanding of the mechanisms of disease increases, the full genomic context of variation may become increasingly significant.

  7. SNP-based prediction of the human germ cell methylation landscape.

    PubMed

    Xie, Hehuang; Wang, Min; Bischof, Jared; Bonaldo, Maria de Fatima; Soares, Marcelo Bento

    2009-05-01

    Base substitution occurs at a high rate at CpG dinucleotides due to the frequent methylation of CpG and the deamination of methylated cytosine to thymine. If these substitutions occur in germ cells, they constitute a heritable mutation that may eventually rise to polymorphic frequencies, hence resulting in a SNP that is methylation associated. In this study, we sought to identify clusters of methylation associated SNPs as a basis for prediction of methylation landscapes of germ cell genomes. Genomic regions enriched with methylation associated SNPs, namely "methylation associated SNP clusters", were identified with an agglomerative hierarchical clustering algorithm. Repetitive elements, segmental duplications, and syntenic tandem DNA repeats were enriched in methylation associated SNP clusters. The frequency of methylation associated SNPs in Alu Y/S elements exhibited a gradient pattern suggestive of linear spreading, being higher in proximity to methylation associated SNP clusters and lower closer to CpG islands. Interestingly, methylation associated SNP clusters were over-represented near the transcriptional initiation sites of immune response genes. We propose a de novo DNA methylation model during germ cell development whereby a pattern is established by long-range chromatin interactions through syntenic repeats combined with regional methylation spreading from methylation associated SNP clusters.
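
    The deamination mechanism described above implies a simple sequence signature: a C/T SNP whose C allele sits in a CpG, or the equivalent G/A SNP seen from the opposite strand. The sketch below flags such SNPs from their flanking bases. The rule and the function name are assumptions made for illustration, since the abstract does not give the authors' exact criteria.

```python
def is_cpg_transition(upstream, allele_a, allele_b, downstream):
    """Return True if a biallelic SNP looks like a CpG deamination product.

    upstream and downstream are the single bases immediately 5' and 3'
    of the SNP on the reported strand.
    """
    alleles = {allele_a.upper(), allele_b.upper()}
    # C -> T on the reported strand: the C allele must be followed by G.
    if alleles == {"C", "T"} and downstream.upper() == "G":
        return True
    # C -> T on the opposite strand reads as G -> A here: the G allele
    # must be preceded by C.
    if alleles == {"G", "A"} and upstream.upper() == "C":
        return True
    return False


print(is_cpg_transition("A", "C", "T", "G"))  # True: C/T followed by G
print(is_cpg_transition("C", "G", "A", "T"))  # True: G/A preceded by C
print(is_cpg_transition("A", "C", "T", "A"))  # False: no CpG context
```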

  8. SNP Microarray in FISH Negative Clinically Suspected 22q11.2 Microdeletion Syndrome

    PubMed Central

    Halder, Ashutosh; Jain, Manish; Kalsi, Amanpreet Kaur

    2016-01-01

    The present study evaluated the role of SNP microarray in 101 cases of clinically suspected FISH negative (noninformative/normal) 22q11.2 microdeletion syndrome. SNP microarray was carried out using 300 K HumanCytoSNP-12 BeadChip array or CytoScan 750 K array. SNP microarray identified 8 cases of 22q11.2 microdeletions and/or microduplications in addition to cases of chromosomal abnormalities and other pathogenic/likely pathogenic CNVs. Clinically suspected specific deletions (22q11.2) were detectable in approximately 8% of cases by SNP microarray, mostly from FISH noninformative cases. This study also identified several LOH/AOH loci with known and well-defined UPD (uniparental disomy) disorders. In conclusion, this study suggests more strict clinical criteria for FISH analysis. However, if clinical criteria are few or doubtful, in particular for a newborn/neonate in intensive care, SNP microarray should be the first screening test to be ordered. FISH is an ideal test for detecting mosaicism, screening family members, and prenatal diagnosis in proven families. PMID:27051557

  9. SNP Microarray in FISH Negative Clinically Suspected 22q11.2 Microdeletion Syndrome.

    PubMed

    Halder, Ashutosh; Jain, Manish; Kalsi, Amanpreet Kaur

    2016-01-01

    The present study evaluated the role of SNP microarray in 101 cases of clinically suspected FISH negative (noninformative/normal) 22q11.2 microdeletion syndrome. SNP microarray was carried out using 300 K HumanCytoSNP-12 BeadChip array or CytoScan 750 K array. SNP microarray identified 8 cases of 22q11.2 microdeletions and/or microduplications in addition to cases of chromosomal abnormalities and other pathogenic/likely pathogenic CNVs. Clinically suspected specific deletions (22q11.2) were detectable in approximately 8% of cases by SNP microarray, mostly from FISH noninformative cases. This study also identified several LOH/AOH loci with known and well-defined UPD (uniparental disomy) disorders. In conclusion, this study suggests more strict clinical criteria for FISH analysis. However, if clinical criteria are few or doubtful, in particular for a newborn/neonate in intensive care, SNP microarray should be the first screening test to be ordered. FISH is an ideal test for detecting mosaicism, screening family members, and prenatal diagnosis in proven families.

  10. Electrochemical Li Topotactic Reaction in Layered SnP3 for Superior Li-Ion Batteries

    NASA Astrophysics Data System (ADS)

    Park, Jae-Wan; Park, Cheol-Min

    2016-10-01

    The development of new anode materials having high electrochemical performances and interesting reaction mechanisms is highly required to satisfy the need for long-lasting mobile electronic devices and electric vehicles. Here, we report a layer crystalline structured SnP3 and its unique electrochemical behaviors with Li. The SnP3 was simply synthesized through modification of Sn crystallography by combination with P and its potential as an anode material for LIBs was investigated. During Li insertion reaction, the SnP3 anode showed an interesting two-step electrochemical reaction mechanism comprised of a topotactic transition (0.7–2.0 V) and a conversion (0.0–2.0 V) reaction. When the SnP3-based composite electrode was tested within the topotactic reaction region (0.7–2.0 V) between SnP3 and LixSnP3 (x ≤ 4), it showed excellent electrochemical properties, such as a high volumetric capacity (1st discharge/charge capacity was 840/663 mA h cm‑3) with a high initial coulombic efficiency, stable cycle behavior (636 mA h cm‑3 over 100 cycles), and fast rate capability (550 mA h cm‑3 at 3C). This layered SnP3 anode will be applicable to a new anode material for rechargeable LIBs.

  11. Electrochemical Li Topotactic Reaction in Layered SnP3 for Superior Li-Ion Batteries

    PubMed Central

    Park, Jae-Wan; Park, Cheol-Min

    2016-01-01

    The development of new anode materials having high electrochemical performances and interesting reaction mechanisms is highly required to satisfy the need for long-lasting mobile electronic devices and electric vehicles. Here, we report a layer crystalline structured SnP3 and its unique electrochemical behaviors with Li. The SnP3 was simply synthesized through modification of Sn crystallography by combination with P and its potential as an anode material for LIBs was investigated. During Li insertion reaction, the SnP3 anode showed an interesting two-step electrochemical reaction mechanism comprised of a topotactic transition (0.7–2.0 V) and a conversion (0.0–2.0 V) reaction. When the SnP3-based composite electrode was tested within the topotactic reaction region (0.7–2.0 V) between SnP3 and LixSnP3 (x ≤ 4), it showed excellent electrochemical properties, such as a high volumetric capacity (1st discharge/charge capacity was 840/663 mA h cm−3) with a high initial coulombic efficiency, stable cycle behavior (636 mA h cm−3 over 100 cycles), and fast rate capability (550 mA h cm−3 at 3C). This layered SnP3 anode will be applicable to a new anode material for rechargeable LIBs. PMID:27775090

  12. Electrochemical Li Topotactic Reaction in Layered SnP3 for Superior Li-Ion Batteries.

    PubMed

    Park, Jae-Wan; Park, Cheol-Min

    2016-10-24

    The development of new anode materials having high electrochemical performances and interesting reaction mechanisms is highly required to satisfy the need for long-lasting mobile electronic devices and electric vehicles. Here, we report a layer crystalline structured SnP3 and its unique electrochemical behaviors with Li. The SnP3 was simply synthesized through modification of Sn crystallography by combination with P and its potential as an anode material for LIBs was investigated. During Li insertion reaction, the SnP3 anode showed an interesting two-step electrochemical reaction mechanism comprised of a topotactic transition (0.7-2.0 V) and a conversion (0.0-2.0 V) reaction. When the SnP3-based composite electrode was tested within the topotactic reaction region (0.7-2.0 V) between SnP3 and LixSnP3 (x ≤ 4), it showed excellent electrochemical properties, such as a high volumetric capacity (1st discharge/charge capacity was 840/663 mA h cm(-3)) with a high initial coulombic efficiency, stable cycle behavior (636 mA h cm(-3) over 100 cycles), and fast rate capability (550 mA h cm(-3) at 3C). This layered SnP3 anode will be applicable to a new anode material for rechargeable LIBs.

  13. SNP2TFBS - a database of regulatory SNPs affecting predicted transcription factor binding site affinity.

    PubMed

    Kumar, Sunil; Ambrosini, Giovanna; Bucher, Philipp

    2017-01-04

    SNP2TFBS is a computational resource intended to support researchers investigating the molecular mechanisms underlying regulatory variation in the human genome. The database essentially consists of a collection of text files providing specific annotations for human single nucleotide polymorphisms (SNPs), namely whether they are predicted to abolish, create or change the affinity of one or several transcription factor (TF) binding sites. A SNP's effect on TF binding is estimated based on a position weight matrix (PWM) model for the binding specificity of the corresponding factor. These data files are regenerated at regular intervals by an automatic procedure that takes as input a reference genome, a comprehensive SNP catalogue and a collection of PWMs. SNP2TFBS is also accessible over a web interface, enabling users to view the information provided for an individual SNP, to extract SNPs based on various search criteria, to annotate uploaded sets of SNPs or to display statistics about the frequencies of binding sites affected by selected SNPs. Homepage: http://ccg.vital-it.ch/snp2tfbs/.

  14. Review of alignment and SNP calling algorithms for next-generation sequencing data.

    PubMed

    Mielczarek, M; Szyda, J

    2016-02-01

    Application of massively parallel sequencing technology has become one of the most important issues in life sciences. Therefore, it was crucial to develop bioinformatics tools for next-generation sequencing (NGS) data processing. Currently, two of the most significant tasks include alignment to a reference genome and detection of single nucleotide polymorphisms (SNPs). In many types of genomic analyses, great numbers of reads need to be mapped to the reference genome; therefore, selection of the aligner is an essential step in NGS pipelines. Two main algorithms, suffix tries and hash tables, have been introduced for this purpose. Suffix array-based aligners are memory-efficient and work faster than hash-based aligners, but they are less accurate. In contrast, hash table algorithms tend to be slower, but more sensitive. SNP and genotype callers may also be divided into two main approaches: heuristic and probabilistic methods. A variety of software has been developed over the past several years. In this paper, we briefly review the current development of NGS data processing algorithms and present the available software.
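
    The two index families named in this abstract can be contrasted with a toy exact-match lookup. Production aligners use compressed structures (for example an FM-index) and tolerate mismatches; the sketch below only shows the lookup principle, and all names and the example sequence are hypothetical.

```python
from collections import defaultdict


def build_kmer_index(ref, k):
    """Hash-table index: map every k-mer to the list of its start positions."""
    index = defaultdict(list)
    for i in range(len(ref) - k + 1):
        index[ref[i:i + k]].append(i)
    return index


def build_suffix_array(ref):
    """Naive suffix array: start positions sorted by the suffix they begin."""
    return sorted(range(len(ref)), key=lambda i: ref[i:])


def find_with_suffix_array(ref, sa, pattern):
    """Binary-search the suffix array for all exact occurrences of pattern."""
    m = len(pattern)
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if ref[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    hits = []
    # Matching suffixes are contiguous in the sorted order.
    while lo < len(sa) and ref[sa[lo]:sa[lo] + m] == pattern:
        hits.append(sa[lo])
        lo += 1
    return sorted(hits)


ref = "GATTACAGATTACA"
kmer_index = build_kmer_index(ref, k=4)
sa = build_suffix_array(ref)
print(kmer_index["ATTA"])                       # positions from the hash table
print(find_with_suffix_array(ref, sa, "ATTA"))  # same positions via the suffix array
```

    The hash index answers only queries of the fixed length k, whereas the suffix array handles patterns of any length with a logarithmic search.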

  15. Family-Based Multi-SNP X Chromosome Analysis Using Parent Information.

    PubMed

    Wise, Alison S; Shi, Min; Weinberg, Clarice R

    2016-01-01

    We propose a method for association analysis of haplotypes on the X chromosome that offers both improved power and robustness to population stratification in studies of affected offspring and their parents if all three have been genotyped. The method makes use of assumed parental haplotype exchangeability (PHE), a weaker assumption than Hardy-Weinberg equilibrium (HWE). PHE requires that in the source population, of the three X chromosome haplotypes carried by the two parents, each is equally likely to be carried by the father. We propose a pseudo-sibling approach that exploits that exchangeability assumption. Our method extends the single-SNP PIX-LRT method to multiple SNPs in a high linkage block. We describe methods for testing the PHE assumption and also for determining how apparent violations can be distinguished from true fetal effects or maternally-mediated effects. We show results of simulations that demonstrate nominal type I error rate and good power. The methods are then applied to dbGaP data on the birth defect oral cleft, using both Asian and Caucasian families with cleft.

  16. Family-Based Multi-SNP X Chromosome Analysis Using Parent Information

    PubMed Central

    Wise, Alison S.; Shi, Min; Weinberg, Clarice R.

    2016-01-01

    We propose a method for association analysis of haplotypes on the X chromosome that offers both improved power and robustness to population stratification in studies of affected offspring and their parents if all three have been genotyped. The method makes use of assumed parental haplotype exchangeability (PHE), a weaker assumption than Hardy-Weinberg equilibrium (HWE). PHE requires that in the source population, of the three X chromosome haplotypes carried by the two parents, each is equally likely to be carried by the father. We propose a pseudo-sibling approach that exploits that exchangeability assumption. Our method extends the single-SNP PIX-LRT method to multiple SNPs in a high linkage block. We describe methods for testing the PHE assumption and also for determining how apparent violations can be distinguished from true fetal effects or maternally-mediated effects. We show results of simulations that demonstrate nominal type I error rate and good power. The methods are then applied to dbGaP data on the birth defect oral cleft, using both Asian and Caucasian families with cleft. PMID:26941777

  17. Sensitive DNA detection and SNP discrimination using ultrabright SERS nanorattles and magnetic beads for malaria diagnostics.

    PubMed

    Ngo, Hoan T; Gandra, Naveen; Fales, Andrew M; Taylor, Steve M; Vo-Dinh, Tuan

    2016-07-15

    One of the major obstacles to implement nucleic acid-based molecular diagnostics at the point-of-care (POC) and in resource-limited settings is the lack of sensitive and practical DNA detection methods that can be seamlessly integrated into portable platforms. Herein we present a sensitive yet simple DNA detection method using a surface-enhanced Raman scattering (SERS) nanoplatform: the ultrabright SERS nanorattle. The method, referred to as the nanorattle-based method, involves sandwich hybridization of magnetic beads that are loaded with capture probes, target sequences, and ultrabright SERS nanorattles that are loaded with reporter probes. Upon hybridization, a magnet was applied to concentrate the hybridization sandwiches at a detection spot for SERS measurements. The ultrabright SERS nanorattles, composed of a core and a shell with resonance Raman reporters loaded in the gap space between the core and the shell, serve as SERS tags for signal detection. Using this method, a specific DNA sequence of the malaria parasite Plasmodium falciparum could be detected with a detection limit of approximately 100 attomoles. Single nucleotide polymorphism (SNP) discrimination of wild type malaria DNA and mutant malaria DNA, which confers resistance to artemisinin drugs, was also demonstrated. These test models demonstrate the molecular diagnostic potential of the nanorattle-based method to both detect and genotype infectious pathogens. Furthermore, the method's simplicity makes it a suitable candidate for integration into portable platforms for POC and in resource-limited settings applications.

  18. Sensitive DNA detection and SNP discrimination using ultrabright SERS nanorattles and magnetic beads for malaria diagnostics

    PubMed Central

    Ngo, Hoan T.; Gandra, Naveen; Fales, Andrew M.; Taylor, Steve M.; Vo-Dinh, Tuan

    2016-01-01

    One of the major obstacles to implement nucleic acid-based molecular diagnostics at the point-of-care (POC) and in resource-limited settings is the lack of sensitive and practical DNA detection methods that can be seamlessly integrated into portable platforms. Herein we present a sensitive yet simple DNA detection method using a surface-enhanced Raman scattering (SERS) nanoplatform: the ultrabright SERS nanorattle. The method, referred to as the nanorattle-based method, involves sandwich hybridization of magnetic beads that are loaded with capture probes, target sequences, and ultrabright SERS nanorattles that are loaded with reporter probes. Upon hybridization, a magnet was applied to concentrate the hybridization sandwiches at a detection spot for SERS measurements. The ultrabright SERS nanorattles, composed of a core and a shell with resonance Raman reporters loaded in the gap space between the core and the shell, serve as SERS tags for signal detection. Using this method, a specific DNA sequence of the malaria parasite Plasmodium falciparum could be detected with a detection limit of approximately 100 attomoles. Single nucleotide polymorphism (SNP) discrimination of wild type malaria DNA and mutant malaria DNA, which confers resistance to artemisinin drugs, was also demonstrated. These test models demonstrate the molecular diagnostic potential of the nanorattle-based method to both detect and genotype infectious pathogens. Furthermore, the method’s simplicity makes it a suitable candidate for integration into portable platforms for POC and in resource-limited settings applications. PMID:26913502

  19. A genome-wide SNP scan accelerates trait-regulatory genomic loci identification in chickpea

    PubMed Central

    Kujur, Alice; Bajaj, Deepak; Upadhyaya, Hari D.; Das, Shouvik; Ranjan, Rajeev; Shree, Tanima; Saxena, Maneesha S.; Badoni, Saurabh; Kumar, Vinod; Tripathi, Shailesh; Gowda, C.L.L.; Sharma, Shivali; Singh, Sube; Tyagi, Akhilesh K.; Parida, Swarup K.

    2015-01-01

    We identified 44844 high-quality SNPs by sequencing 92 diverse chickpea accessions belonging to a seed and pod trait-specific association panel using reference genome- and de novo-based GBS (genotyping-by-sequencing) assays. A GWAS (genome-wide association study) in an association panel of 211, including the 92 sequenced accessions, identified 22 major genomic loci showing significant association (explaining 23–47% phenotypic variation) with pod and seed number/plant and 100-seed weight. Eighteen trait-regulatory major genomic loci underlying 13 robust QTLs were validated and mapped on an intra-specific genetic linkage map by QTL mapping. A combinatorial approach of GWAS, QTL mapping and gene haplotype-specific LD mapping and transcript profiling uncovered one superior haplotype and favourable natural allelic variants in the upstream regulatory region of a CesA-type cellulose synthase (Ca_Kabuli_CesA3) gene regulating high pod and seed number/plant (explaining 47% phenotypic variation) in chickpea. The up-regulation of this superior gene haplotype correlated with increased transcript expression of Ca_Kabuli_CesA3 gene in the pollen and pod of high pod/seed number accession, resulting in higher cellulose accumulation for normal pollen and pollen tube growth. A rapid combinatorial genome-wide SNP genotyping-based approach has potential to dissect complex quantitative agronomic traits and delineate trait-regulatory genomic loci (candidate genes) for genetic enhancement in crop plants, including chickpea. PMID:26058368

  20. [Advances in development of gene-gene interaction analysis methods based on SNP data: a review].

    PubMed

    Luan, Yi-Zhao; Zuo, Xiao-Yu; Liu, Ke; Li, Gu; Rao, Shao-Qi

    2013-12-01

    SNP-based association analysis has become one of the most important approaches to interpreting the molecular mechanisms underlying human complex diseases. Nevertheless, the widely used single-locus analysis captures only a small portion of susceptibility SNPs with prominent marginal effects, leaving an important genetic component, epistasis or joint effects, undetected. Identifying the complex interplay among multiple genes in the genome-wide context is an essential task for systematically unraveling the molecular mechanisms of complex diseases. Many approaches have been used to detect genome-wide gene-gene interactions and have provided new insights into the genetic basis of complex diseases. This paper reviews recent advances in methods for detecting gene-gene interaction, categorized into three types (model-based statistical methods, model-free statistical methods, and data mining methods) according to their theoretical characteristics and numerical algorithms. In particular, the basic principle, numerical implementation and cautions for application of each method are elucidated. In addition, this paper briefly discusses the limitations and challenges associated with detecting genome-wide epistasis, in order to provide some methodological guidance for scientists in related fields.

  1. A MAP3k1 SNP Predicts Survival of Gastric Cancer in a Chinese Population

    PubMed Central

    Gu, Dongying; Shen, Lili; Wang, Meilin; Xu, Zhi; Gong, Weida; Tang, Cuiju; Gao, Jinglong; Chen, Jinfei; Zhang, Zhengdong

    2014-01-01

    Objectives Genome-wide association studies (GWAS) have demonstrated that the single nucleotide polymorphism (SNP) MAP3K1 rs889312 is a genetic susceptibility marker significantly associated with a risk of hormone-related tumors such as breast cancer. Considering steroid hormone-mediated signaling pathways have an important role in the progression of gastric cancer, we hypothesized that MAP3K1 rs889312 may be associated with survival outcomes in gastric cancer. The purpose of this study was to test this hypothesis. Methods We genotyped MAP3K1 rs889312 using TaqMan in 884 gastric cancer patients who received subtotal or total gastrectomy. Kaplan-Meier survival analysis and Cox proportional hazard regression were used to analyze the association between MAP3K1 rs889312 genotypes and survival outcomes of gastric cancer. Results Our findings reveal that the rs889312 heterozygous AC genotype was significantly associated with an increased rate of mortality among patients with diffuse-type gastric cancer (log-rank P = 0.028 for AC versus AA/CC, hazard ratio [HR] = 1.32, 95% confidence interval [CI] = 1.03–1.69), compared to those carrying the homozygous variant genotypes (AA/CC). Additionally, univariate and multivariate Cox regression analysis demonstrate that rs889312 polymorphism was an independent risk factor for poor survival in these patients. Conclusions In conclusion, we demonstrate that MAP3K1 rs889312 is closely correlated with outcome among diffuse-type gastric cancer. This raises the possibility for rs889312 polymorphisms to be used as an independent indicator for predicting the prognosis of diffuse-type gastric cancer within the Chinese population. PMID:24759887
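
    The Kaplan-Meier analysis named in this record rests on the product-limit estimator, which multiplies the conditional survival fractions at each observed event time. The following is a minimal Python sketch of that estimator only; the function name and the toy follow-up data are illustrative assumptions and do not come from the study.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times  -- follow-up time for each patient
    events -- 1 if death was observed at that time, 0 if censored
    Returns a list of (time, survival_probability) at each event time.
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients who share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set.
        n_at_risk -= leaving
    return curve


# Hypothetical toy data: four patients, the second one censored.
curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
```

    Comparing genotype groups, as the record describes, would mean computing one such curve per group and testing the difference with a log-rank test, which this sketch does not implement.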

  2. Generation of SNP datasets for orangutan population genomics using improved reduced-representation sequencing and direct comparisons of SNP calling algorithms

    PubMed Central

    2014-01-01

    Background High-throughput sequencing has opened up exciting possibilities in population and conservation genetics by enabling the assessment of genetic variation at genome-wide scales. One approach to reduce genome complexity, i.e. investigating only parts of the genome, is reduced-representation library (RRL) sequencing. Like similar approaches, RRL sequencing reduces ascertainment bias due to simultaneous discovery and genotyping of single-nucleotide polymorphisms (SNPs) and does not require reference genomes. Yet, generating such datasets remains challenging due to laboratory and bioinformatical issues. In the laboratory, current protocols require improvements with regards to sequencing homologous fragments to reduce the number of missing genotypes. From the bioinformatical perspective, the reliance of most studies on a single SNP caller disregards the possibility that different algorithms may produce disparate SNP datasets. Results We present an improved RRL (iRRL) protocol that maximizes the generation of homologous DNA sequences, thus achieving improved genotyping-by-sequencing efficiency. Our modifications facilitate generation of single-sample libraries, enabling individual genotype assignments instead of pooled-sample analysis. We sequenced ~1% of the orangutan genome with 41-fold median coverage in 31 wild-born individuals from two populations. SNPs and genotypes were called using three different algorithms. We obtained substantially different SNP datasets depending on the SNP caller. Genotype validations revealed that the Unified Genotyper of the Genome Analysis Toolkit and SAMtools performed significantly better than a caller from CLC Genomics Workbench (CLC). Of all conflicting genotype calls, CLC was only correct in 17% of the cases. Furthermore, conflicting genotypes between two algorithms showed a systematic bias in that one caller almost exclusively assigned heterozygotes, while the other one almost exclusively assigned homozygotes. Conclusions

  3. Optimal Design of Low-Density SNP Arrays for Genomic Prediction: Algorithm and Applications.

    PubMed

    Wu, Xiao-Lin; Xu, Jiaqi; Feng, Guofei; Wiggans, George R; Taylor, Jeremy F; He, Jun; Qian, Changsong; Qiu, Jiansheng; Simpson, Barry; Walker, Jeremy; Bauck, Stewart

    2016-01-01

    Low-density (LD) single nucleotide polymorphism (SNP) arrays provide a cost-effective solution for genomic prediction and selection, but algorithms and computational tools are needed for the optimal design of LD SNP chips. A multiple-objective, local optimization (MOLO) algorithm was developed for design of optimal LD SNP chips that can be imputed accurately to medium-density (MD) or high-density (HD) SNP genotypes for genomic prediction. The objective function facilitates maximization of non-gap map length and system information for the SNP chip, and the latter is computed either as locus-averaged (LASE) or haplotype-averaged Shannon entropy (HASE) and adjusted for uniformity of the SNP distribution. HASE performed better than LASE with ≤1,000 SNPs, but required considerably more computing time. Nevertheless, the differences diminished when >5,000 SNPs were selected. Optimization was accomplished conditionally on the presence of SNPs that were obligated to each chromosome. The frame location of SNPs on a chip can be either uniform (evenly spaced) or non-uniform. For the latter design, a tunable empirical Beta distribution was used to guide location distribution of frame SNPs such that both ends of each chromosome were enriched with SNPs. The SNP distribution on each chromosome was finalized through the objective function that was locally and empirically maximized. This MOLO algorithm was capable of selecting a set of approximately evenly-spaced and highly-informative SNPs, which in turn led to increased imputation accuracy compared with selection solely of evenly-spaced SNPs. Imputation accuracy increased with LD chip size, and imputation error rate was extremely low for chips with ≥3,000 SNPs. Assuming that genotyping or imputation error occurs at random, imputation error rate can be viewed as the upper limit for genomic prediction error. Our results show that about 25% of imputation error rate was propagated to genomic prediction in an Angus population. The
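
    The locus-averaged Shannon entropy mentioned in this record can be illustrated with a short Python sketch. It assumes biallelic SNPs described only by one allele frequency each, and it omits the adjustment for uniformity of the SNP distribution that the abstract mentions; the function names are illustrative and are not taken from the MOLO software.

```python
import math


def locus_entropy(p):
    """Shannon entropy in bits of a biallelic locus with allele frequency p."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("allele frequency must lie in [0, 1]")
    if p in (0.0, 1.0):
        return 0.0  # a monomorphic locus carries no information
    q = 1.0 - p
    return -(p * math.log2(p) + q * math.log2(q))


def locus_averaged_entropy(freqs):
    """Mean per-locus entropy over a candidate SNP set (unadjusted)."""
    if not freqs:
        raise ValueError("need at least one locus")
    return sum(locus_entropy(p) for p in freqs) / len(freqs)


# Hypothetical candidate set: one maximally informative and one fixed locus.
score = locus_averaged_entropy([0.5, 0.0])
```

    Under this simplified measure, a set of SNPs with frequencies near 0.5 scores higher than a set of rare variants, which is one reason an entropy term can favour informative markers when SNPs are selected for a chip.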

  4. Identification of Laying-Related SNP Markers in Geese Using RAD Sequencing.

    PubMed

    Yu, ShiGang; Chu, WeiWei; Zhang, LiFan; Han, HouMing; Zhao, RongXue; Wu, Wei; Zhu, JiangNing; Dodson, Michael V; Wei, Wei; Liu, HongLin; Chen, Jie

    2015-01-01

    Laying performance is an important economic trait of goose production. As laying performance is of low heritability, it is of significance to develop a marker-assisted selection (MAS) strategy for this trait. Definition of sequence variation related to the target trait is a prerequisite of quantitating MAS, but little is presently known about the goose genome, which greatly hinders the identification of genetic markers for the laying traits of geese. Recently developed restriction site-associated DNA (RAD) sequencing is a possible approach for discerning large-scale single nucleotide polymorphism (SNP) and reducing the complexity of a genome without having reference genomic information available. In the present study, we developed a pooled RAD sequencing strategy for detecting geese laying-related SNP. Two DNA pools were constructed, each consisting of equal amounts of genomic DNA from 10 individuals with either high estimated breeding value (HEBV) or low estimated breeding value (LEBV). A total of 139,013 SNP were obtained from 42,291,356 sequences, of which 18,771,943 were for LEBV and 23,519,413 were for HEBV cohorts. Fifty-five SNP which had different allelic frequencies in the two DNA pools were further validated by individual-based AS-PCR genotyping in the LEBV and HEBV cohorts. Ten out of 55 SNP exhibited distinct allele distributions in these two cohorts. These 10 SNP were further genotyped in a goose population of 492 geese to verify the association with egg numbers. The result showed that 8 of 10 SNP were associated with egg numbers. Additionally, linear regression analysis revealed that SNP Record-111407, 106975 and 112359 were involved in a multiple-gene network affecting laying performance. We used IPCR to extend the unknown regions flanking the candidate RAD tags. The obtained sequences were subjected to BLAST to retrieve the orthologous genes in either ducks or chickens. Five novel genes were cloned for geese which harbored the candidate laying

  5. Identification of Laying-Related SNP Markers in Geese Using RAD Sequencing

    PubMed Central

    Yu, ShiGang; Chu, WeiWei; Zhang, LiFan; Han, HouMing; Zhao, RongXue; Wu, Wei; Zhu, JiangNing; Dodson, Michael V.; Wei, Wei; Liu, HongLin; Chen, Jie

    2015-01-01

    Laying performance is an important economic trait of goose production. As laying performance is of low heritability, it is of significance to develop a marker-assisted selection (MAS) strategy for this trait. Definition of sequence variation related to the target trait is a prerequisite of quantitating MAS, but little is presently known about the goose genome, which greatly hinders the identification of genetic markers for the laying traits of geese. Recently developed restriction site-associated DNA (RAD) sequencing is a possible approach for discerning large-scale single nucleotide polymorphism (SNP) and reducing the complexity of a genome without having reference genomic information available. In the present study, we developed a pooled RAD sequencing strategy for detecting geese laying-related SNP. Two DNA pools were constructed, each consisting of equal amounts of genomic DNA from 10 individuals with either high estimated breeding value (HEBV) or low estimated breeding value (LEBV). A total of 139,013 SNP were obtained from 42,291,356 sequences, of which 18,771,943 were for LEBV and 23,519,413 were for HEBV cohorts. Fifty-five SNP which had different allelic frequencies in the two DNA pools were further validated by individual-based AS-PCR genotyping in the LEBV and HEBV cohorts. Ten out of 55 SNP exhibited distinct allele distributions in these two cohorts. These 10 SNP were further genotyped in a goose population of 492 geese to verify the association with egg numbers. The result showed that 8 of 10 SNP were associated with egg numbers. Additionally, linear regression analysis revealed that SNP Record-111407, 106975 and 112359 were involved in a multiple-gene network affecting laying performance. We used IPCR to extend the unknown regions flanking the candidate RAD tags. The obtained sequences were subjected to BLAST to retrieve the orthologous genes in either ducks or chickens. Five novel genes were cloned for geese which harbored the candidate laying

  6. Optimal Design of Low-Density SNP Arrays for Genomic Prediction: Algorithm and Applications

    PubMed Central

    Wu, Xiao-Lin; Xu, Jiaqi; Feng, Guofei; Wiggans, George R.; Taylor, Jeremy F.; He, Jun; Qian, Changsong; Qiu, Jiansheng; Simpson, Barry; Walker, Jeremy; Bauck, Stewart

    2016-01-01

    Low-density (LD) single nucleotide polymorphism (SNP) arrays provide a cost-effective solution for genomic prediction and selection, but algorithms and computational tools are needed for the optimal design of LD SNP chips. A multiple-objective, local optimization (MOLO) algorithm was developed for design of optimal LD SNP chips that can be imputed accurately to medium-density (MD) or high-density (HD) SNP genotypes for genomic prediction. The objective function facilitates maximization of non-gap map length and system information for the SNP chip, and the latter is computed either as locus-averaged (LASE) or haplotype-averaged Shannon entropy (HASE) and adjusted for uniformity of the SNP distribution. HASE performed better than LASE with ≤1,000 SNPs, but required considerably more computing time. Nevertheless, the differences diminished when >5,000 SNPs were selected. Optimization was accomplished conditionally on the presence of SNPs that were obligated to each chromosome. The frame location of SNPs on a chip can be either uniform (evenly spaced) or non-uniform. For the latter design, a tunable empirical Beta distribution was used to guide location distribution of frame SNPs such that both ends of each chromosome were enriched with SNPs. The SNP distribution on each chromosome was finalized through the objective function that was locally and empirically maximized. This MOLO algorithm was capable of selecting a set of approximately evenly-spaced and highly-informative SNPs, which in turn led to increased imputation accuracy compared with selection solely of evenly-spaced SNPs. Imputation accuracy increased with LD chip size, and imputation error rate was extremely low for chips with ≥3,000 SNPs. Assuming that genotyping or imputation error occurs at random, imputation error rate can be viewed as the upper limit for genomic prediction error. Our results show that about 25% of imputation error rate was propagated to genomic prediction in an Angus population. The

  7. HapRice, an SNP haplotype database and a web tool for rice.

    PubMed

    Yonemaru, Jun-ichi; Ebana, Kaworu; Yano, Masahiro

    2014-01-01

    Genome-wide single nucleotide polymorphism (SNP) analysis is a promising tool to examine the genetic diversity of rice populations and genetic traits of scientific and economic importance. Next-generation sequencing technology has accelerated the re-sequencing of diverse rice varieties and the discovery of genome-wide SNPs. Notably, validation of these SNPs by a high-throughput genotyping system, such as an SNP array, could provide a manageable and highly accurate SNP set. To enhance the potential utility of genome-wide SNPs for geneticists and breeders, analysis tools need to be developed. Here, we constructed an SNP haplotype database, which allows visualization of the allele frequency of all SNPs in the genome browser. We calculated the allele frequencies of 3,334 SNPs in 76 accessions from the world rice collection and 3,252 SNPs in 177 Japanese rice accessions; all these SNPs have been validated in our previous studies. The SNP haplotypes were defined by the allele frequency in each cultivar group (aus, indica, tropical japonica and temperate japonica) for the world rice accessions, and in non-irrigated and three irrigated groups (three variety registration periods) for Japanese rice accessions. We also developed web tools for finding polymorphic SNPs between any two rice accessions and for the primer design to develop cleaved amplified polymorphic sequence markers at any SNP. The 'HapRice' database and the web tools can be accessed at http://qtaro.abr.affrc.go.jp/index.html. In addition, we established a core SNP set consisting of 768 SNPs uniformly distributed in the rice genome; this set is of a practically appropriate size for use in rice genetic analysis.
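
    The two operations this record describes, computing allele frequencies across accessions and finding SNPs that are polymorphic between two accessions, can be sketched in a few lines of Python. The data layout (one dictionary of SNP ID to allele per accession), the SNP IDs and the function names are assumptions made for illustration; they do not reflect the HapRice schema or its web tools.

```python
from collections import Counter


def allele_frequencies(accessions, snp_id):
    """Frequency of each observed allele at one SNP across accessions.

    accessions -- list of dicts mapping SNP ID to allele; missing calls
                  are simply absent from the dict.
    """
    calls = [acc[snp_id] for acc in accessions if snp_id in acc]
    if not calls:
        raise ValueError(f"no calls for {snp_id}")
    counts = Counter(calls)
    return {allele: n / len(calls) for allele, n in counts.items()}


def polymorphic_snps(acc_a, acc_b):
    """SNP IDs called in both accessions whose alleles differ."""
    shared = acc_a.keys() & acc_b.keys()
    return sorted(s for s in shared if acc_a[s] != acc_b[s])


# Hypothetical accessions with made-up SNP IDs.
acc1 = {"snp001": "A", "snp002": "G", "snp003": "T"}
acc2 = {"snp001": "A", "snp002": "C"}
acc3 = {"snp001": "T", "snp002": "C", "snp003": "T"}

freqs = allele_frequencies([acc1, acc2, acc3], "snp002")
diff = polymorphic_snps(acc1, acc2)
```

    SNPs missing in either accession are skipped rather than reported as differences, which is a design choice of this sketch.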

  8. AMY-tree: an algorithm to use whole genome SNP calling for Y chromosomal phylogenetic applications

    PubMed Central

    2013-01-01

    Background Due to the rapid progress of next-generation sequencing (NGS) facilities, an explosion of human whole genome data will become available in the coming years. These data can be used to optimize and to increase the resolution of the phylogenetic Y chromosomal tree. Moreover, the exponential growth of known Y chromosomal lineages will require an automatic determination of the phylogenetic position of an individual based on whole genome SNP calling data and an up to date Y chromosomal tree. Results We present an automated approach, ‘AMY-tree’, which is able to determine the phylogenetic position of a Y chromosome using a whole genome SNP profile, independently from the NGS platform and SNP calling program, whereby mistakes in the SNP calling or phylogenetic Y chromosomal tree are taken into account. Moreover, AMY-tree indicates ambiguities within the present phylogenetic tree and points out new Y-SNPs which may be phylogenetically relevant. The AMY-tree software package was validated successfully on 118 whole genome SNP profiles of 109 males with different origins. Moreover, support was found for an unknown recurrent mutation, wrong reported mutation conversions and a large amount of new interesting Y-SNPs. Conclusions Therefore, AMY-tree is a useful tool to determine the Y lineage of a sample based on SNP calling, to identify Y-SNPs with yet unknown phylogenetic position and to optimize the Y chromosomal phylogenetic tree in the future. AMY-tree will not add lineages to the existing phylogenetic tree of the Y-chromosome but it is the first step to analyse whole genome SNP profiles in a phylogenetic framework. PMID:23405914

  9. AMY-tree: an algorithm to use whole genome SNP calling for Y chromosomal phylogenetic applications.

    PubMed

    Van Geystelen, Anneleen; Decorte, Ronny; Larmuseau, Maarten H D

    2013-02-13

    Due to the rapid progress of next-generation sequencing (NGS) facilities, an explosion of human whole genome data will become available in the coming years. These data can be used to optimize and to increase the resolution of the phylogenetic Y chromosomal tree. Moreover, the exponential growth of known Y chromosomal lineages will require an automatic determination of the phylogenetic position of an individual based on whole genome SNP calling data and an up to date Y chromosomal tree. We present an automated approach, 'AMY-tree', which is able to determine the phylogenetic position of a Y chromosome using a whole genome SNP profile, independently from the NGS platform and SNP calling program, whereby mistakes in the SNP calling or phylogenetic Y chromosomal tree are taken into account. Moreover, AMY-tree indicates ambiguities within the present phylogenetic tree and points out new Y-SNPs which may be phylogenetically relevant. The AMY-tree software package was validated successfully on 118 whole genome SNP profiles of 109 males with different origins. Moreover, support was found for an unknown recurrent mutation, wrong reported mutation conversions and a large amount of new interesting Y-SNPs. Therefore, AMY-tree is a useful tool to determine the Y lineage of a sample based on SNP calling, to identify Y-SNPs with yet unknown phylogenetic position and to optimize the Y chromosomal phylogenetic tree in the future. AMY-tree will not add lineages to the existing phylogenetic tree of the Y-chromosome but it is the first step to analyse whole genome SNP profiles in a phylogenetic framework.

  10. Comparison of Methods for Determining ABO Blood Type in Cynomolgus Macaques (Macaca fascicularis).

    PubMed

    Kim, Tae M; Park, Hyojun; Cho, Kahee; Kim, Jong S; Park, Mi K; Choi, Ju Y; Park, Jae B; Park, Wan J; Kim, Sung J

    2015-05-01

    Thorough examination of ABO blood type in cynomolgus monkeys is an essential experimental step to prevent humoral rejection during transplantation research. In the present study, we evaluated current methods of ABO blood-antigen typing in cynomolgus monkeys by comparing the outcomes obtained by reverse hemagglutination, single-nucleotide polymorphism (SNP) analysis, and buccal mucosal immunohistochemistry. Among 21 animals, 5 were type A regardless of the method. However, of 8 serologically type B animals, 3 had a heterozygous type AB SNP profile, among which 2 failed to express A antigen, as shown by immunohistochemical analysis. Among 8 serologically type AB animals, 2 appeared to be type A by SNP analysis and immunohistochemistry. None of the methods identified any type O subjects. We conclude that the expression of ABO blood-group antigens is regulated by an incompletely understood process and that using both SNP and immunohistochemistry might minimize the risk of incorrect results obtained from the conventional hemagglutination assay.

  11. Is MDM2 SNP309 Variation a Risk Factor for Head and Neck Carcinoma?

    PubMed Central

    Zhuo, Xianlu; Ye, Huiping; Li, Qi; Xiang, Zhaolan; Zhang, Xueyuan

    2016-01-01

    Abstract Murine double minute-2 (MDM2) is a negative regulator of P53, and its T309G polymorphism has been suggested as a risk factor for a variety of cancers. Increasing evidence has shown the association of MDM2 T309G polymorphism with head and neck carcinoma (HNC) risk. However, the results are inconsistent. Thus, we performed a meta-analysis to elucidate the association. The meta-analysis retrieved studies published up to August 2015, and essential information was extracted for analysis. Separate analyses on ethnicity, source of controls, sample size, detection method, and cancer types were also conducted. Odds ratios (ORs) and their 95% confidence intervals (CIs) were used to estimate the association. Pooled data from 16 case–control studies including 4625 cases and 6927 controls failed to indicate a significant association. However, in the subgroup analysis of sample sizes, an increased risk was observed in the largest sample size group (>1000) under a recessive model (OR = 1.52; 95% CI = 1.08–2.13). Increased risks were also found in the nasopharyngeal cancer in the subgroup analysis of cancer types (GG vs TT: OR = 2.07; 95% CI = 1.38–3.12; dominant model: OR = 1.48; 95% CI = 1.13–1.93; recessive model: OR = 1.76; 95% CI = 1.17–2.65). The results suggest that homozygote GG alleles of MDM2 SNP309 may be a low-penetrant risk factor for HNC, and G allele may confer nasopharyngeal cancer susceptibility. PMID:26945408
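
    The odds ratios and 95% confidence intervals reported in this record are the standard effect measure for case-control data. The Python sketch below shows how a single-study OR and its Wald interval on the log scale can be computed from a 2x2 table; the counts are hypothetical, and the pooling across studies that a meta-analysis performs is not shown.

```python
import math


def odds_ratio_ci(exposed_cases, unexposed_cases,
                  exposed_controls, unexposed_controls, z=1.96):
    """Odds ratio with a Wald confidence interval (z=1.96 gives ~95%)."""
    a, b = exposed_cases, unexposed_cases
    c, d = exposed_controls, unexposed_controls
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) from the four cell counts.
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


# Hypothetical counts: GG carriers vs. TT carriers among cases and controls.
or_, lo, hi = odds_ratio_ci(20, 10, 10, 20)
```

    An interval that excludes 1, as in the subgroup results quoted above, is what marks an association as statistically significant at the chosen level.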

  12. A SNP panel for identity and kinship testing using massive parallel sequencing.

    PubMed

    Grandell, Ida; Samara, Raed; Tillmar, Andreas O

    2016-07-01

    Within forensic genetics, there is still a need for supplementary DNA marker typing in order to increase the power to solve cases for both identity testing and complex kinship issues. One major disadvantage with current capillary electrophoresis (CE) methods is the limitation in DNA marker multiplex capability. By utilizing massive parallel sequencing (MPS) technology, this capability can, however, be increased. We have designed a customized GeneRead DNASeq SNP panel (Qiagen) of 140 previously published autosomal forensically relevant identity SNPs for analysis using MPS. One single amplification step was followed by library preparation using the GeneRead Library Prep workflow (Qiagen). The sequencing was performed on a MiSeq System (Illumina), and the bioinformatic analyses were done using the software Biomedical Genomics Workbench (CLC Bio, Qiagen). Forty-nine individuals from a Swedish population were genotyped in order to establish genotype frequencies and to evaluate the performance of the assay. The analyses showed balanced coverage among the included loci, and the heterozygote balance showed less than 0.5% outliers. Analyses of dilution series of the 2800M Control DNA gave reproducible results down to 0.2 ng DNA input. In addition, typing of FTA samples and bone samples was performed with promising results. Further studies and optimizations are, however, required for a more detailed evaluation of the performance with degraded and PCR-inhibited forensic samples. In summary, the assay offers a straightforward sample-to-genotype workflow and could be useful to gain information in forensic casework, both for identity testing and for solving complex kinship issues.

  13. The extent of linkage disequilibrium in beef cattle breeds using high-density SNP genotypes

    PubMed Central

    2014-01-01

    Background The extent of linkage disequilibrium (LD) between molecular markers impacts genome-wide association studies and implementation of genomic selection. The availability of high-density single nucleotide polymorphism (SNP) genotyping platforms makes it possible to investigate LD at an unprecedented resolution. In this work, we characterised LD decay in breeds of beef cattle of taurine, indicine and composite origins and explored its variation across autosomes and the X chromosome. Findings In each breed, LD decayed rapidly and r² was less than 0.2 for marker pairs separated by 50 kb. The LD decay curves clustered into three groups of similar LD decay that distinguished the three main cattle types. At short distances between markers (< 10 kb), taurine breeds showed higher LD (r² = 0.45) than their indicine (r² = 0.25) and composite (r² = 0.32) counterparts. This higher LD in taurine breeds was attributed to a smaller effective population size and a stronger bottleneck during breed formation. Using all SNPs on only the X chromosome, the three cattle types could still be distinguished. However, for taurine breeds, the LD decay on the X chromosome was much faster and the background level much lower than for indicine breeds and composite populations. When using only SNPs that were polymorphic in all breeds, the analysis of the X chromosome mimicked that of the autosomes. Conclusions The pattern of LD mirrored some aspects of the history of breed populations and showed a sharp decay with increasing physical distance between markers. We conclude that the HD chip can be used to detect association signals that remained hidden when using lower density genotyping platforms, since LD dropped below 0.2 at distances of 50 kb. PMID:24661366
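
    The LD statistic reported in this record is the squared correlation between alleles at two markers. A minimal Python sketch follows, assuming phased haplotypes coded 0/1 at each SNP; the study itself worked from array genotypes, so this shows the definition of the statistic rather than the authors' pipeline, and the function name is illustrative.

```python
def ld_r2(hap_a, hap_b):
    """Squared allelic correlation between two biallelic SNPs.

    hap_a, hap_b -- equal-length sequences of 0/1 alleles, one entry per
                    phased haplotype, for SNP A and SNP B respectively.
    """
    n = len(hap_a)
    if n == 0 or n != len(hap_b):
        raise ValueError("need two non-empty haplotype vectors of equal length")
    p_a = sum(hap_a) / n
    p_b = sum(hap_b) / n
    # Frequency of haplotypes carrying allele 1 at both SNPs.
    p_ab = sum(1 for x, y in zip(hap_a, hap_b) if x == 1 and y == 1) / n
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r2 is undefined when either SNP is monomorphic")
    d = p_ab - p_a * p_b  # disequilibrium coefficient D
    return d * d / denom


# Hypothetical haplotypes: perfectly linked and fully independent pairs.
linked = ld_r2([0, 1, 0, 1], [0, 1, 0, 1])
independent = ld_r2([0, 0, 1, 1], [0, 1, 0, 1])
```

    An LD decay curve of the kind described above would average this value over marker pairs binned by physical distance.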

  14. SNP detection using peptide nucleic acid probes and conjugated polymers: applications in neurodegenerative disease identification.

    PubMed

    Gaylord, Brent S; Massie, Michelle R; Feinstein, Stuart C; Bazan, Guillermo C

    2005-01-04

    A strategy employing a combination of peptide nucleic acid (PNA) probes, an optically amplifying conjugated polymer (CP), and S1 nuclease enzyme is capable of detecting SNPs in a simple, rapid, and sensitive manner. The recognition is accomplished by sequence-specific hybridization between the uncharged, fluorescein-labeled PNA probe and the DNA sequence of interest. After subsequent treatment with S1 nuclease, the cationic water soluble CP electrostatically associates with the remaining anionic PNA/DNA complex, leading to sensitized emission of the labeled PNA probe via FRET from the CP. The generation of fluorescent signal is controlled by strand-specific electrostatic interactions and is governed by the complementarity of the probe/target pair. To assess the method, we compared the ability of the sensor system to detect normal, wild-type human DNA sequences, and those sequences containing a single base mutation. Specifically, we examined a PNA probe complementary to a region of the gene encoding the microtubule associated protein tau. The probe sequence covers a known point mutation implicated in a dominant neurodegenerative dementia known as frontotemporal dementia with parkinsonism linked to chromosome 17 (FTDP-17), which has clinical and molecular similarities to Alzheimer's disease. By using an appropriate PNA probe, the conjugated polymer poly[(9,9-bis(6'-N,N,N-trimethylammoniumhexylbromide)fluorene)-co-phenylene] and S1 nuclease, unambiguous FRET signaling is achieved for the wild-type DNA and not the mutant sequence harboring the SNP. Distance relationships in the CP/PNA assay are also discussed to highlight constraints and demonstrate improvements within the system.

  15. Evaluation of inbreeding depression in Holstein cattle using whole-genome SNP markers and alternative measures of genomic inbreeding.

    PubMed

    Bjelland, D W; Weigel, K A; Vukasinovic, N; Nkrumah, J D

    2013-07-01

    The effects of increased pedigree inbreeding in dairy cattle populations have been well documented and result in a negative impact on profitability. Recent advances in genotyping technology have allowed researchers to move beyond pedigree analysis and study inbreeding at a molecular level. In this study, 5,853 animals were genotyped for 54,001 single nucleotide polymorphisms (SNP); 2,913 cows had phenotypic records including a single lactation for milk yield (from either lactation 1, 2, 3, or 4), reproductive performance, and linear type conformation. After removing SNP with poor call rates, low minor allele frequencies, and departure from Hardy-Weinberg equilibrium, 33,025 SNP remained for analyses. Three measures of genomic inbreeding were evaluated: percent homozygosity (FPH), inbreeding calculated from runs of homozygosity (FROH), and inbreeding derived from a genomic relationship matrix (FGRM). Average FPH was 60.5±1.1%, average FROH was 3.8±2.1%, and average FGRM was 20.8±2.3%, where animals with larger values for each of the genomic inbreeding indices were considered more inbred. Decreases in total milk yield to 205d postpartum of 53, 20, and 47kg per 1% increase in FPH, FROH, and FGRM, respectively, were observed. Increases in days open per 1% increase in FPH (1.76 d), FROH (1.72 d), and FGRM (1.06 d) were also noted, as well as increases in maternal calving difficulty (0.09, 0.03, and 0.04 on a 5-point scale for FPH, FROH, and FGRM, respectively). Several linear type traits, such as strength (-0.40, -0.11, and -0.19), rear legs rear view (-0.35, -0.16, and -0.14), front teat placement (0.35, 0.25, 0.18), and teat length (-0.24, -0.14, and -0.13) were also affected by increases in FPH, FROH, and FGRM, respectively. Overall, increases in each measure of genomic inbreeding in this study were associated with negative effects on production and reproductive ability in dairy cows.
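
    Two of the genomic inbreeding measures named in this record have simple definitions that a short Python sketch can make concrete: percent homozygosity is the share of called SNP genotypes that are homozygous, and the runs-of-homozygosity measure is the share of the genome covered by such runs. The genotype coding (0, 1, 2 copies of one allele, None for a missing call), the pre-computed run segments and the function names are assumptions made for illustration; detecting the runs themselves is not shown.

```python
def percent_homozygosity(genotypes):
    """Percentage of called genotypes that are homozygous (coded 0 or 2)."""
    called = [g for g in genotypes if g is not None]
    if not called:
        raise ValueError("no called genotypes")
    homozygous = sum(1 for g in called if g in (0, 2))
    return 100.0 * homozygous / len(called)


def percent_roh(roh_segments, genome_length_bp):
    """Percentage of the genome covered by runs of homozygosity.

    roh_segments -- iterable of (start_bp, end_bp) for each detected run
    """
    if genome_length_bp <= 0:
        raise ValueError("genome length must be positive")
    covered = sum(end - start for start, end in roh_segments)
    return 100.0 * covered / genome_length_bp


# Hypothetical animal: five SNPs (one uncalled) and two runs on a 100 kb genome.
f_ph = percent_homozygosity([0, 1, 2, 2, None])
f_roh = percent_roh([(0, 1000), (5000, 7000)], 100_000)
```

    Percent homozygosity counts every homozygous SNP, including those homozygous by chance, which is consistent with the much higher average reported for it above than for the run-based measure.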

  16. Highly specific SNP detection using 2D graphene electronics and DNA strand displacement.

    PubMed

    Hwang, Michael T; Landon, Preston B; Lee, Joon; Choi, Duyoung; Mo, Alexander H; Glinsky, Gennadi; Lal, Ratnesh

    2016-06-28

    Single-nucleotide polymorphisms (SNPs) in a gene sequence are markers for a variety of human diseases. Detection of SNPs with high specificity and sensitivity is essential for effective practical implementation of personalized medicine. Current DNA sequencing, including SNP detection, primarily uses enzyme-based methods or fluorophore-labeled assays that are time-consuming, need laboratory-scale settings, and are expensive. Previously reported electrical charge-based SNP detectors have insufficient specificity and accuracy, limiting their effectiveness. Here, we demonstrate the use of a DNA strand displacement-based probe on a graphene field effect transistor (FET) for high-specificity, single-nucleotide mismatch detection. The single mismatch was detected by measuring strand displacement-induced resistance (and hence current) change and Dirac point shift in a graphene FET. SNP detection in large double-helix DNA strands (e.g., 47 nt) minimizes false-positive results. Our electrical sensor-based SNP detection technology, without labeling and without apparent cross-hybridization artifacts, would allow fast, sensitive, and portable SNP detection with single-nucleotide resolution. The technology will have a wide range of applications in digital and implantable biosensors and high-throughput DNA genotyping, with transformative implications for personalized medicine.

  17. Objective evaluation measures of genetic marker selection in large-scale SNP genotyping.

    PubMed

    Kaminuma, Eli; Masuya, Hiroshi; Miura, Ikuo; Motegi, Hiromi; Takahasi, Kenzi R; Nakazawa, Miki; Matsui, Minami; Gondo, Yoichi; Noda, Tetsuo; Shiroishi, Toshihiko; Wakana, Shigeharu; Toyoda, Tetsuro

    2008-10-01

    High-throughput single nucleotide polymorphism (SNP) genotyping systems provide two kinds of fluorescent signals detected from different alleles. In current technologies, the process of genotype discrimination requires subjective judgments by expert operators, even when using clustering algorithms. Here, we propose two evaluation measures to manage fluorescent scatter data with nonclear plot aggregation. The first is the marker ranking measure, which provides a ranking system for the SNP markers based on the distance between the scatter plot distribution and a user-defined ideal distribution. The second measure, called individual genotype membership, uses the membership probability of each genotype related to an individual plot in the scatter data. In verification experiments, the marker ranking measure determined the ranking of SNP markers correlated with the subjective order of SNP markers judged by an expert operator. The experiment using the individual genotype membership measure clarified that the total number of unclassified individuals was remarkably reduced compared to that of manually unclassified ones. These two evaluation measures were implemented as the GTAssist software. GTAssist provides objective standards and avoids subjective biases in SNP genotyping workflows.

  18. Leveraging Ethnic Group Incidence Variation to Investigate Genetic Susceptibility to Glioma: A Novel Candidate SNP Approach

    PubMed Central

    Jacobs, Daniel I.; Walsh, Kyle M.; Wrensch, Margaret; Wiencke, John; Jenkins, Robert; Houlston, Richard S.; Bondy, Melissa; Simon, Matthias; Sanson, Marc; Gousias, Konstantinos; Schramm, Johannes; Labussière, Marianne; Di Stefano, Anna Luisa; Wichmann, H.-Erich; Müller-Nurasyid, Martina; Schreiber, Stefan; Franke, Andre; Moebus, Susanne; Eisele, Lewin; Dewan, Andrew T.; Dubrow, Robert

    2012-01-01

    Objectives: Using a novel candidate SNP approach, we aimed to identify a possible genetic basis for the higher glioma incidence in Whites relative to East Asians and African-Americans. Methods: We hypothesized that genetic regions containing SNPs with extreme differences in allele frequencies across ethnicities are most likely to harbor susceptibility variants. We used International HapMap Project data to identify 3,961 candidate SNPs with the largest allele frequency differences in Whites compared to East Asians and Africans and tested these SNPs for association with glioma risk in a set of White cases and controls. Top SNPs identified in the discovery dataset were tested for association with glioma in five independent replication datasets. Results: No SNP achieved statistical significance in either the discovery or replication datasets after accounting for multiple testing or conducting meta-analysis. However, the most strongly associated SNP, rs879471, was found to be in linkage disequilibrium with a previously identified risk SNP, rs6010620, in RTEL1. We estimate rs6010620 to account for a glioma incidence rate ratio of 1.34 for Whites relative to East Asians. Conclusion: We explored genetic susceptibility to glioma using a novel candidate SNP method which may be applicable to other diseases with appropriate epidemiologic patterns. PMID:23091480

  19. Supervised learning-based tagSNP selection for genome-wide disease classifications

    PubMed Central

    Liu, Qingzhong; Yang, Jack; Chen, Zhongxue; Yang, Mary Qu; Sung, Andrew H; Huang, Xudong

    2008-01-01

    Background Comprehensive evaluation of common genetic variations through association of single nucleotide polymorphisms (SNPs) with complex human diseases on the genome-wide scale is an active area in human genome research. One of the fundamental questions in a SNP-disease association study is to find an optimal subset of SNPs with predicting power for disease status. To find that subset while reducing study burden in terms of time and costs, one can potentially reconcile information redundancy from associations between SNP markers. Results We have developed a feature selection method named Supervised Recursive Feature Addition (SRFA). This method combines supervised learning and statistical measures for the chosen candidate features/SNPs to reconcile the redundancy information and, in doing so, improve the classification performance in association studies. Additionally, we have proposed a Support Vector based Recursive Feature Addition (SVRFA) scheme in SNP-disease association analysis. Conclusions We have proposed using SRFA with different statistical learning classifiers and SVRFA for both SNP selection and disease classification and then applying them to two complex disease data sets. In general, our approaches outperform the well-known feature selection method of Support Vector Machine Recursive Feature Elimination and logic regression-based SNP selection for disease classification in genetic association studies. Our study further indicates that both genetic and environmental variables should be taken into account when doing disease predictions and classifications for the most complex human diseases that have gene-environment interactions. PMID:18366619

  20. Mining and Analysis of SNP in Response to Salinity Stress in Upland Cotton (Gossypium hirsutum L.).

    PubMed

    Wang, Xiaoge; Lu, Xuke; Wang, Junjuan; Wang, Delong; Yin, Zujun; Fan, Weili; Wang, Shuai; Ye, Wuwei

    2016-01-01

    Salinity stress is a major abiotic factor that affects crop output, and because cotton is a pioneer crop on saline and alkaline land, the study of its salt tolerance is particularly important. In our experiment, four salt-tolerant varieties with different salt tolerance indexes including CRI35 (65.04%), Kanghuanwei164 (56.19%), Zhong9807 (55.20%) and CRI44 (50.50%), as well as four salt-sensitive cotton varieties including Hengmian3 (48.21%), GK50 (40.20%), Xinyan96-48 (34.90%), ZhongS9612 (24.80%) were used as the materials. These materials were divided into a salt-tolerant group (ST) and a salt-sensitive group (SS). The Illumina Cotton SNP 70K Chip was used to detect SNP in different cotton varieties. SNPv (SNP variation of the same seedling before and after salt stress) in different varieties were screened; polymorphic SNP and SNPr (SNP related to salt tolerance) were obtained. Annotation and analysis of these SNPs showed that (1) the induction efficiency of salinity stress on SNPv of cotton materials with different salt tolerance index was different, in which the induction efficiency on salt-sensitive materials was significantly higher than that on salt-tolerant materials. The induction of salt stress on SNPv was obviously biased. (2) SNPv induced by salt stress may be related to the methylation changes under salt stress. (3) SNPr may influence salt tolerance of plants by affecting the expression of salt-tolerance related genes.

  1. Inferring Loss-of-Heterozygosity from Unpaired Tumors Using High-Density Oligonucleotide SNP Arrays

    PubMed Central

    Park, Yuhyun; Hao, Ke; Zhao, Xiaojun; Garraway, Levi A; Fox, Edward A; Hochberg, Ephraim P; Mellinghoff, Ingo K; Hofer, Matthias D; Descazeaud, Aurelien; Rubin, Mark A; Meyerson, Matthew; Wong, Wing Hung; Sellers, William R; Li, Cheng

    2006-01-01

    Loss of heterozygosity (LOH) of chromosomal regions bearing tumor suppressors is a key event in the evolution of epithelial and mesenchymal tumors. Identification of these regions usually relies on genotyping tumor and counterpart normal DNA and noting regions where heterozygous alleles in the normal DNA become homozygous in the tumor. However, paired normal samples for tumors and cell lines are often not available. With the advent of oligonucleotide arrays that simultaneously assay thousands of single-nucleotide polymorphism (SNP) markers, genotyping can now be done at high enough resolution to allow identification of LOH events by the absence of heterozygous loci, without comparison to normal controls. Here we describe a hidden Markov model-based method to identify LOH from unpaired tumor samples, taking into account SNP intermarker distances, SNP-specific heterozygosity rates, and the haplotype structure of the human genome. When we applied the method to data genotyped on 100 K arrays, we correctly identified 99% of SNP markers as either retention or loss. We also correctly identified 81% of the regions of LOH, including 98% of regions greater than 3 megabases. By integrating copy number analysis into the method, we were able to distinguish LOH from allelic imbalance. Application of this method to data from a set of prostate samples without paired normals identified known regions of prevalent LOH. We have developed a method for analyzing high-density oligonucleotide SNP array data to accurately identify regions of LOH and retention in tumors without the need for paired normal samples. PMID:16699594
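
    The general idea of decoding LOH from tumor-only genotypes can be sketched with a two-state hidden Markov model and the Viterbi algorithm. This is only a toy illustration and not the authors' model: the distance-based switching formula, the `scale` and `het_error` parameters, the equal state priors, and the function name are all assumptions, and the haplotype structure the abstract mentions is ignored.

```python
import math


def viterbi_loh(is_het, het_rates, positions, scale=1e6, het_error=0.01):
    """Most likely retention/LOH path for one non-empty, ordered SNP series.

    is_het[i]    : True if SNP i was called heterozygous in the tumor
    het_rates[i] : assumed population heterozygosity rate of SNP i, in (0, 1)
    positions[i] : base-pair coordinate of SNP i (increasing)
    """
    states = ("RET", "LOH")

    def log_emit(state, i):
        # Under LOH a heterozygous call can only arise from genotyping error.
        p_het = het_rates[i] if state == "RET" else het_error
        return math.log(p_het if is_het[i] else 1.0 - p_het)

    score = {s: math.log(0.5) + log_emit(s, 0) for s in states}
    backpointers = []
    for i in range(1, len(is_het)):
        d = positions[i] - positions[i - 1]
        # Assumed transition model: distant markers switch state more easily.
        switch = max(0.5 * (1.0 - math.exp(-d / scale)), 1e-12)
        new_score, pointer = {}, {}
        for s in states:
            candidates = {
                r: score[r] + math.log(switch if r != s else 1.0 - switch)
                for r in states
            }
            best = max(candidates, key=candidates.get)
            pointer[s] = best
            new_score[s] = candidates[best] + log_emit(s, i)
        backpointers.append(pointer)
        score = new_score

    # Trace the best path backwards from the best final state.
    state = max(score, key=score.get)
    path = [state]
    for pointer in reversed(backpointers):
        state = pointer[state]
        path.append(state)
    path.reverse()
    return path


# Hypothetical data: 3 heterozygous SNPs followed by 30 homozygous ones.
calls = [True] * 3 + [False] * 30
path = viterbi_loh(calls, [0.3] * 33, [10_000 * i for i in range(33)])
```

    On this toy input the long homozygous stretch outweighs the cost of one state change, so the decoded path moves from retention to LOH at the fourth SNP.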

  2. Explaining the disease phenotype of intergenic SNP through predicted long range regulation.

    PubMed

    Chen, Jingqi; Tian, Weidong

    2016-10-14

    Thousands of disease-associated SNPs (daSNPs) are located in intergenic regions (IGR), making it difficult to understand their association with disease phenotypes. Recent analysis found that non-coding daSNPs were frequently located in or close to regulatory elements, inspiring us to try to explain the disease phenotypes of IGR daSNPs through nearby regulatory sequences. Hence, after locating the nearest distal regulatory element (DRE) to a given IGR daSNP, we applied a computational method named INTREPID to predict the target genes regulated by the DRE, and then investigated their functional relevance to the IGR daSNP's disease phenotypes. 36.8% of all IGR daSNP-disease phenotype associations investigated were possibly explainable through the predicted target genes, which were enriched with, were functionally relevant to, or consisted of the corresponding disease genes. This proportion could be further increased to 60.5% if the LD SNPs of daSNPs were also considered. Furthermore, the predicted SNP-target gene pairs were enriched with known eQTL/mQTL SNP-gene relationships. Overall, it's likely that IGR daSNPs may contribute to disease phenotypes by interfering with the regulatory function of their nearby DREs and causing abnormal expression of disease genes. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

  3. Assessment of high resolution melting analysis as a potential SNP genotyping technique in forensic casework.

    PubMed

    Venables, Samantha J; Mehta, Bhavik; Daniel, Runa; Walsh, Simon J; van Oorschot, Roland A H; McNevin, Dennis

    2014-11-01

    High resolution melting (HRM) analysis is a simple, cost effective, closed tube SNP genotyping technique with high throughput potential. The effectiveness of HRM for forensic SNP genotyping was assessed with five commercially available HRM kits evaluated on the ViiA™ 7 Real Time PCR instrument. Four kits performed satisfactorily against forensically relevant criteria. One was further assessed to determine the sensitivity, reproducibility, and accuracy of HRM SNP genotyping. The manufacturer's protocol using 0.5 ng input DNA and 45 PCR cycles produced accurate and reproducible results for 17 of the 19 SNPs examined. Problematic SNPs had GC rich flanking regions which introduced additional melting domains into the melting curve (rs1800407) or included homozygotes that were difficult to distinguish reliably (rs16891982; a G to C SNP). A proof of concept multiplexing experiment revealed that multiplexing a small number of SNPs may be possible after further investigation. HRM enables genotyping of a number of SNPs in a large number of samples without extensive optimization. However, it requires more genomic DNA as template in comparison to SNaPshot®. Furthermore, suitably modifying pre-existing forensic intelligence SNP panels for HRM analysis may pose difficulties due to the properties of some SNPs.

  4. SNP-based association analysis for seedling traits in durum wheat (Triticum turgidum L. durum (Desf.)).

    PubMed

    Sabiel, Salih A I; Huang, Sisi; Hu, Xin; Ren, Xifeng; Fu, Chunjie; Peng, Junhua; Sun, Dongfa

    2017-03-01

    In the present study, 150 accessions of worldwide originated durum wheat germplasm (Triticum turgidum spp. durum) were observed for major seedling traits and their growth. The accessions were evaluated for major seedling traits under controlled conditions of hydroponics at the 13th, 20th, 27th and 34th day after germination. Biomass traits were measured at the 34th day after germination. Correlation analysis was conducted among the seedling traits and three field traits at maturity (plant height, grain weight and 1000-grain weight) observed in four consecutive years. Associations of the measured seedling traits and SNP markers were analyzed based on the mixed linear model (MLM). The results indicated that highly significant genetic variation and robust heritability were found for the seedling and field mature traits. In total, 259 significant associations were detected for all the traits and four growth stages. The phenotypic variation explained (R²) by a single SNP marker is higher than 10% for most (84%) of the significant SNP markers. Forty-six SNP markers were associated with multiple traits, indicating non-negligible pleiotropy at the seedling stage. The associated SNP markers could be helpful for genetic analysis of seedling traits, and marker-assisted breeding of new wheat varieties with strong seedling vigor.

  5. SNP-based association analysis for seedling traits in durum wheat (Triticum turgidum L. durum (Desf.))

    PubMed Central

    Sabiel, Salih A. I.; Huang, Sisi; Hu, Xin; Ren, Xifeng; Fu, Chunjie; Peng, Junhua; Sun, Dongfa

    2017-01-01

    In the present study, 150 accessions of worldwide originated durum wheat germplasm (Triticum turgidum spp. durum) were observed for major seedling traits and their growth. The accessions were evaluated for major seedling traits under controlled conditions of hydroponics at the 13th, 20th, 27th and 34th day after germination. Biomass traits were measured at the 34th day after germination. Correlation analysis was conducted among the seedling traits and three field traits at maturity (plant height, grain weight and 1000-grain weight) observed in four consecutive years. Associations of the measured seedling traits and SNP markers were analyzed based on the mixed linear model (MLM). The results indicated that highly significant genetic variation and robust heritability were found for the seedling and field mature traits. In total, 259 significant associations were detected for all the traits and four growth stages. The phenotypic variation explained (R²) by a single SNP marker is higher than 10% for most (84%) of the significant SNP markers. Forty-six SNP markers were associated with multiple traits, indicating non-negligible pleiotropy at the seedling stage. The associated SNP markers could be helpful for genetic analysis of seedling traits, and marker-assisted breeding of new wheat varieties with strong seedling vigor. PMID:28588384

  6. The Generalized Higher Criticism for Testing SNP-Set Effects in Genetic Association Studies.

    PubMed

    Barnett, Ian; Mukherjee, Rajarshi; Lin, Xihong

    2017-01-01

    It is of substantial interest to study the effects of genes, genetic pathways, and networks on the risk of complex diseases. These genetic constructs each contain multiple SNPs, which are often correlated and function jointly, and might be large in number. However, only a sparse subset of SNPs in a genetic construct is generally associated with the disease of interest. In this article, we propose the generalized higher criticism (GHC) to test for the association between an SNP set and a disease outcome. The higher criticism is a test traditionally used in high-dimensional signal detection settings when marginal test statistics are independent and the number of parameters is very large. However, these assumptions do not always hold in genetic association studies, due to linkage disequilibrium among SNPs and the finite number of SNPs in an SNP set in each genetic construct. The proposed GHC overcomes the limitations of the higher criticism by allowing for arbitrary correlation structures among the SNPs in an SNP-set, while performing accurate analytic p-value calculations for any finite number of SNPs in the SNP-set. We obtain the detection boundary of the GHC test. Using simulations, we empirically compared the power of the GHC method with that of existing SNP-set tests over a range of genetic regions with varied correlation structures and signal sparsity. We apply the proposed methods to analyze the CGEM breast cancer genome-wide association study. Supplementary materials for this article are available online.
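
    For orientation, the classical higher criticism statistic that the GHC generalizes can be written for sorted p-values p(1) <= ... <= p(n) as the maximum over i of sqrt(n) * (i/n - p(i)) / sqrt(p(i) * (1 - p(i))). The sketch below implements only this classical, independence-based form; it is not the GHC, it makes no correction for correlation among SNPs, and the function name is an assumption for this example.

```python
import math


def higher_criticism(p_values):
    """Classical higher criticism statistic for independent p-values.

    Each p-value must lie strictly between 0 and 1. Larger values indicate
    more small p-values than expected under a uniform null distribution.
    """
    n = len(p_values)
    if n == 0:
        raise ValueError("need at least one p-value")
    best = -math.inf
    for i, p in enumerate(sorted(p_values), start=1):
        if not 0.0 < p < 1.0:
            raise ValueError("p-values must be in the open interval (0, 1)")
        # Standardized gap between the empirical fraction i/n and p(i).
        stat = math.sqrt(n) * (i / n - p) / math.sqrt(p * (1.0 - p))
        best = max(best, stat)
    return best


# Hypothetical SNP-set p-values: one sparse strong signal versus none.
hc_signal = higher_criticism([1e-6, 0.3, 0.6, 0.9])
hc_null = higher_criticism([0.2, 0.4, 0.6, 0.8])
```

    A single very small p-value dominates the maximum, which is why the statistic suits the sparse-signal setting the abstract describes.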

  7. Association of NR3C1/Glucocorticoid Receptor gene SNP with azoospermia in Japanese men.

    PubMed

    Chihara, Makoto; Yoshihara, Kosuke; Ishiguro, Tatsuya; Adachi, Sosuke; Okada, Hiroyuki; Kashima, Katsunori; Sato, Takaaki; Tanaka, Atsushi; Tanaka, Kenichi; Enomoto, Takayuki

    2016-01-01

    The molecular pathogenesis of non-obstructive azoospermia (NOA) is unclear. Our aim was to identify the genetic susceptibility for NOA in Japanese men by using a combination of transcriptome network analysis and SNP genotyping. We searched for candidate genes using RNA transcriptome network analysis of 2611 NOA-related genes that we had previously reported. We analyzed candidate genes for disease linkage with single nucleotide polymorphisms (SNP) in the genomes of 335 Japanese men with NOA and 410 healthy controls using SNP-specific real-time polymerase chain reaction TaqMan assays. Three candidate genes (NR3C1, YBX2, and BCL2) were identified by the transcriptome network analysis, each with three SNP. Allele frequency analysis of the nine SNP indicated a significantly higher frequency of the NR3C1 rs852977 G allele in NOA cases compared with controls (corrected P = 5.7e-15; odds ratio = 3.20; 95% confidence interval, 2.40-4.26). The other eight candidate polymorphisms showed no significant association. The NR3C1 rs852977 polymorphism is a potential marker for genetic susceptibility to NOA in Japanese men. Further studies are necessary to clarify the association between the NR3C1 polymorphism and alterations of glucocorticoid signaling pathway leading to male infertility. © 2015 Japan Society of Obstetrics and Gynecology.
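
    An allele-level odds ratio and 95% confidence interval of the kind reported above can be computed from a 2x2 table of allele counts. The sketch uses the standard log odds ratio (Woolf) interval; the abstract does not say which interval method the authors used, and the counts below are invented for illustration rather than taken from the study.

```python
import math


def odds_ratio_ci(case_risk, case_other, control_risk, control_other, z=1.96):
    """Odds ratio with a Woolf (log-scale) confidence interval.

    Arguments are allele counts: risk and non-risk alleles in cases,
    then risk and non-risk alleles in controls. All must be positive.
    """
    odds_ratio = (case_risk * control_other) / (case_other * control_risk)
    # Standard error of ln(OR) from the four cell counts.
    se = math.sqrt(
        1 / case_risk + 1 / case_other + 1 / control_risk + 1 / control_other
    )
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


# Hypothetical allele counts, not the study's data.
or_value, ci_low, ci_high = odds_ratio_ci(20, 10, 10, 20)
```

    Because the interval is symmetric on the log scale, the product of its two limits equals the square of the odds ratio, which gives a quick sanity check on reported values.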

  8. Supervised learning-based tagSNP selection for genome-wide disease classifications.

    PubMed

    Liu, Qingzhong; Yang, Jack; Chen, Zhongxue; Yang, Mary Qu; Sung, Andrew H; Huang, Xudong

    2008-01-01

    Comprehensive evaluation of common genetic variations through association of single nucleotide polymorphisms (SNPs) with complex human diseases on the genome-wide scale is an active area in human genome research. One of the fundamental questions in a SNP-disease association study is to find an optimal subset of SNPs with predicting power for disease status. To find that subset while reducing study burden in terms of time and costs, one can potentially reconcile information redundancy from associations between SNP markers. We have developed a feature selection method named Supervised Recursive Feature Addition (SRFA). This method combines supervised learning and statistical measures for the chosen candidate features/SNPs to reconcile the redundancy information and, in doing so, improve the classification performance in association studies. Additionally, we have proposed a Support Vector based Recursive Feature Addition (SVRFA) scheme in SNP-disease association analysis. We have proposed using SRFA with different statistical learning classifiers and SVRFA for both SNP selection and disease classification and then applying them to two complex disease data sets. In general, our approaches outperform the well-known feature selection method of Support Vector Machine Recursive Feature Elimination and logic regression-based SNP selection for disease classification in genetic association studies. Our study further indicates that both genetic and environmental variables should be taken into account when doing disease predictions and classifications for the most complex human diseases that have gene-environment interactions.

  9. Explaining the disease phenotype of intergenic SNP through predicted long range regulation

    PubMed Central

    Chen, Jingqi; Tian, Weidong

    2016-01-01

    Thousands of disease-associated SNPs (daSNPs) are located in intergenic regions (IGR), making it difficult to understand their association with disease phenotypes. Recent analysis found that non-coding daSNPs were frequently located in or close to regulatory elements, inspiring us to try to explain the disease phenotypes of IGR daSNPs through nearby regulatory sequences. Hence, after locating the nearest distal regulatory element (DRE) to a given IGR daSNP, we applied a computational method named INTREPID to predict the target genes regulated by the DRE, and then investigated their functional relevance to the IGR daSNP's disease phenotypes. 36.8% of all IGR daSNP-disease phenotype associations investigated were possibly explainable through the predicted target genes, which were enriched with, were functionally relevant to, or consisted of the corresponding disease genes. This proportion could be further increased to 60.5% if the LD SNPs of daSNPs were also considered. Furthermore, the predicted SNP-target gene pairs were enriched with known eQTL/mQTL SNP-gene relationships. Overall, it's likely that IGR daSNPs may contribute to disease phenotypes by interfering with the regulatory function of their nearby DREs and causing abnormal expression of disease genes. PMID:27280978

  10. Different SNP combinations in the GCH1 gene and use of labor analgesia

    PubMed Central

    2010-01-01

    Background The aim of this study was to investigate if there is an association between different SNP combinations in the guanosine triphosphate cyclohydrolase (GCH1) gene and a number of pain behavior related outcomes during labor. A population-based sample of pregnant women (n = 814) was recruited at gestational week 18. A plasma sample was collected from each subject. Genotyping was performed and three single nucleotide polymorphisms (SNP) previously defined as a pain-protective SNP combination of GCH1 were used. Results Homozygous carriers of the pain-protective SNP combination of GCH1 arrived to the delivery ward with a more advanced stage of cervical dilation compared to heterozygous carriers and non-carriers. However, homozygous carriers more often used second line labor analgesia compared to the others. Conclusion The pain-protective SNP combination of GCH1 may be of importance in the limited number of homozygous carriers during the initial dilation of cervix but upon arrival at the delivery unit these women are more inclined to use second line labor analgesia. PMID:20633294

  11. Highly specific SNP detection using 2D graphene electronics and DNA strand displacement

    PubMed Central

    Hwang, Michael T.; Landon, Preston B.; Lee, Joon; Choi, Duyoung; Mo, Alexander H.; Glinsky, Gennadi; Lal, Ratnesh

    2016-01-01

    Single-nucleotide polymorphisms (SNPs) in a gene sequence are markers for a variety of human diseases. Detection of SNPs with high specificity and sensitivity is essential for effective practical implementation of personalized medicine. Current DNA sequencing, including SNP detection, primarily uses enzyme-based methods or fluorophore-labeled assays that are time-consuming, need laboratory-scale settings, and are expensive. Previously reported electrical charge-based SNP detectors have insufficient specificity and accuracy, limiting their effectiveness. Here, we demonstrate the use of a DNA strand displacement-based probe on a graphene field effect transistor (FET) for high-specificity, single-nucleotide mismatch detection. The single mismatch was detected by measuring strand displacement-induced resistance (and hence current) change and Dirac point shift in a graphene FET. SNP detection in large double-helix DNA strands (e.g., 47 nt) minimizes false-positive results. Our electrical sensor-based SNP detection technology, without labeling and without apparent cross-hybridization artifacts, would allow fast, sensitive, and portable SNP detection with single-nucleotide resolution. The technology will have a wide range of applications in digital and implantable biosensors and high-throughput DNA genotyping, with transformative implications for personalized medicine. PMID:27298347

  12. MDM2 SNP309 polymorphism is associated with colorectal cancer risk

    PubMed Central

    Wang, Weizhi; Du, Mulong; Gu, Dongying; Zhu, Lingjun; Chu, Haiyan; Tong, Na; Zhang, Zhengdong; Xu, Zekuan; Wang, Meilin

    2014-01-01

    The human murine double minute 2 (MDM2) is known as an oncoprotein through inhibiting P53 transcriptional activity and mediating P53 ubiquitination. Therefore, the amplification of MDM2 may attenuate the P53 pathway and promote tumorigenesis. The SNP309 T>G polymorphism (rs2279744), which is located in the intronic promoter of MDM2 gene, was reported to contribute to the increased level of MDM2 protein. In this hospital-based case-control study, which consisted of 573 cases and 588 controls, we evaluated the association between MDM2 SNP309 and the risk of colorectal cancer (CRC) in a Chinese population by using the TaqMan method to genotype the polymorphism. We found that the MDM2 SNP309 polymorphism was significantly associated with CRC risk. In addition, in our meta-analysis, we found a significant association between MDM2 SNP309 and CRC risk among Asians, which was consistent with our results. In conclusion, we demonstrated that the MDM2 SNP309 polymorphism increased the susceptibility of CRC in Asian populations. PMID:24797837

  13. SnpFilt: A pipeline for reference-free assembly-based identification of SNPs in bacterial genomes.

    PubMed

    Chan, Carmen H S; Octavia, Sophie; Sintchenko, Vitali; Lan, Ruiting

    2016-12-01

    De novo assembly of bacterial genomes from next-generation sequencing (NGS) data allows a reference-free discovery of single nucleotide polymorphisms (SNP). However, substantial rates of errors in genomes assembled by this approach remain a major barrier for the reference-free analysis of genome variations in medically important bacteria. The aim of this report was to improve the quality of SNP identification in bacterial genomes without closely related references. We developed a bioinformatics pipeline (SnpFilt) that constructs an assembly using SPAdes and then removes unreliable regions based on the quality and coverage of re-aligned reads at neighbouring regions. The performance of the pipeline was compared against reference-based SNP calling for Illumina HiSeq, MiSeq and NextSeq reads from a range of bacterial pathogens including Salmonella, which is one of the most common causes of food-borne disease. The SnpFilt pipeline removed all false SNP in all test NGS datasets consisting of paired-end Illumina reads. We also showed that for reliable and complete SNP calls, at least 40-fold coverage is required. Analysis of bacterial isolates associated with epidemiologically confirmed outbreaks using the SnpFilt pipeline produced results consistent with previously published findings. The SnpFilt pipeline improves the quality of de-novo assembly and precision of SNP calling in bacterial genomes by removal of regions of the assembly that may potentially contain assembly errors. SnpFilt is available from https://github.com/LanLab/SnpFilt.

  14. Transcriptome sequencing for SNP discovery across Cucumis melo

    PubMed Central

    2012-01-01

    from India and Africa as compared to commercial cultivars, cultigens and landraces from Eastern Europe, Western Asia and the Mediterranean basin is consistent with the evolutionary history proposed for the species. Group-specific SNVs that will be useful in introgression programs were also detected. In a sample of 143 selected putative SNPs, we verified 93% of the polymorphisms in a panel of 78 genotypes. Conclusions This study provides the first comprehensive resequencing data for wild, exotic, and cultivated (landraces and commercial) melon transcriptomes, yielding the largest melon SNP collection available to date and representing a notable sample of the species diversity. This data provides a valuable resource for creating a catalog of allelic variants of melon genes and it will aid in future in-depth studies of population genetics, marker-assisted breeding, and gene identification aimed at developing improved varieties. PMID:22726804

  15. SNP Discovery by Illumina-Based Transcriptome Sequencing of the Olive and the Genetic Characterization of Turkish Olive Genotypes Revealed by AFLP, SSR and SNP Markers

    PubMed Central

    Kaya, Hilal Betul; Cetin, Oznur; Kaya, Hulya; Sahin, Mustafa; Sefer, Filiz; Kahraman, Abdullah; Tanyolac, Bahattin

    2013-01-01

    Background The olive tree (Olea europaea L.) is a diploid (2n = 2x = 46) outcrossing species mainly grown in the Mediterranean area, where it is the most important oil-producing crop. Because of its economic, cultural and ecological importance, various DNA markers have been used in the olive to characterize and elucidate homonyms, synonyms and unknown accessions. However, a comprehensive characterization and a full sequence of its transcriptome are unavailable, leading to the importance of an efficient large-scale single nucleotide polymorphism (SNP) discovery in olive. The objectives of this study were (1) to discover olive SNPs using next-generation sequencing and to identify SNP primers for cultivar identification and (2) to characterize 96 olive genotypes originating from different regions of Turkey. Methodology/Principal Findings Next-generation sequencing technology was used with five distinct olive genotypes and generated cDNA, producing 126,542,413 reads using an Illumina Genome Analyzer IIx. Following quality and size trimming, the high-quality reads were assembled into 22,052 contigs with an average length of 1,321 bases and 45 singletons. The SNPs were filtered and 2,987 high-quality putative SNP primers were identified. The assembled sequences and singletons were subjected to BLAST similarity searches and annotated with a Gene Ontology identifier. To identify the 96 olive genotypes, these SNP primers were applied to the genotypes in combination with amplified fragment length polymorphism (AFLP) and simple sequence repeats (SSR) markers. Conclusions/Significance This study marks the highest number of SNP markers discovered to date from olive genotypes using transcriptome sequencing. The developed SNP markers will provide a useful source for molecular genetic studies, such as genetic diversity and characterization, high density quantitative trait locus (QTL) analysis, association mapping and map-based gene cloning in the olive. High levels of

  16. Breast cancer-associated high-order SNP-SNP interaction of CXCL12/CXCR4-related genes by an improved multifactor dimensionality reduction (MDR-ER).

    PubMed

    Fu, Ou-Yang; Chang, Hsueh-Wei; Lin, Yu-Da; Chuang, Li-Yeh; Hou, Ming-Feng; Yang, Cheng-Hong

    2016-09-01

    In association studies, the combined effects of single nucleotide polymorphism (SNP)-SNP interactions and the problem of imbalanced data between cases and controls are frequently ignored. In the present study, we used an improved multifactor dimensionality reduction (MDR) approach namely MDR-ER to detect the high order SNP‑SNP interaction in an imbalanced breast cancer data set containing seven SNPs of chemokine CXCL12/CXCR4 pathway genes. Most individual SNPs were not significantly associated with breast cancer. After MDR‑ER analysis, six significant SNP‑SNP interaction models with seven genes (highest cross‑validation consistency, 10; classification error rates, 41.3‑21.0; and prediction error rates, 47.4‑55.3) were identified. CD4 and VEGFA genes were associated in a 2‑loci interaction model (classification error rate, 41.3; prediction error rate, 47.5; odds ratio (OR), 2.069; 95% bootstrap CI, 1.40‑2.90; P=1.71E‑04) and it also appeared in all the best 2‑7‑loci models. When the loci number increased, the classification error rates and P‑values decreased. The powers in 2‑7‑loci in all models were >0.9. The minimum classification error rate of the MDR‑ER‑generated model was shown with the 7‑loci interaction model (classification error rate, 21.0; OR=15.282; 95% bootstrap CI, 9.54‑23.87; P=4.03E‑31). In the epistasis network analysis, the overall effect with breast cancer susceptibility was identified and the SNP order of impact on breast cancer was identified as follows: CD4 = VEGFA > KITLG > CXCL12 > CCR7 = MMP2 > CXCR4. In conclusion, the MDR‑ER can effectively and correctly identify the best SNP‑SNP interaction models in an imbalanced data set for breast cancer cases.

  17. Finding type 2 diabetes causal single nucleotide polymorphism combinations and functional modules from genome-wide association data

    PubMed Central

    2013-01-01

    Background Due to the low statistical power of individual markers from a genome-wide association study (GWAS), detecting causal single nucleotide polymorphisms (SNPs) for complex diseases is a challenge. SNP combinations are suggested to compensate for the low statistical power of individual markers, but SNP combinations from GWAS generate high computational complexity. Methods We aim to detect type 2 diabetes (T2D) causal SNP combinations from a GWAS dataset with optimal filtration and to discover the biological meaning of the detected SNP combinations. Optimal filtration can enhance the statistical power of SNP combinations by comparing the error rates of SNP combinations from various Bonferroni thresholds and p-value range-based thresholds combined with linkage disequilibrium (LD) pruning. T2D causal SNP combinations are selected using random forests with variable selection from an optimal SNP dataset. T2D causal SNP combinations and genome-wide SNPs are mapped into functional modules using expanded gene set enrichment analysis (GSEA) considering pathway, transcription factor (TF)-target, miRNA-target, gene ontology, and protein complex functional modules. The prediction error rates are measured for SNP sets from functional module-based filtration that selects SNPs within functional modules from genome-wide SNPs based on expanded GSEA. Results A T2D causal SNP combination containing 101 SNPs from the Wellcome Trust Case Control Consortium (WTCCC) GWAS dataset is selected using optimal filtration criteria, with an error rate of 10.25%. Matching 101 SNPs with known T2D genes and functional modules reveals the relationships between T2D and SNP combinations. The prediction error rates of SNP sets from functional module-based filtration record no significance compared to the prediction error rates of randomly selected SNP sets and T2D causal SNP combinations from optimal filtration. Conclusions We propose a detection method for complex disease causal SNP combinations

  18. Genome-Wide SNP Markers Based on SLAF-Seq Uncover Breeding Traces in Rapeseed (Brassica napus L.)

    PubMed Central

    Zhou, Qinghong; Zhou, Can; Zheng, Wei; Mason, Annaliese S.; Fan, Shuying; Wu, Caijun; Fu, Donghui; Huang, Yingjin

    2017-01-01

    Single Nucleotide Polymorphisms (SNPs) are the most abundant and richest form of genomic polymorphism, and hence make highly favorable markers for genetic map construction and genome-wide association studies. In this study, a total of 300 rapeseed accessions (278 representative of Chinese germplasm, plus 22 outgroup accessions of different origins and ecotypes) were collected and sequenced using Specific-Locus Amplified Fragment Sequencing (SLAF-seq) technology, obtaining 660.25M reads with an average sequencing depth of 6.27 × and a mean Q30 of 85.96%. Based on the 238,711 polymorphic SLAF tags a total of 1,197,282 SNPs were discovered, and a subset of 201,817 SNPs with minor allele frequency >0.05 and integrity >0.8 were selected. Of these, 30,877 were designated SNP “hotspots,” and 41 SNP-rich genomic regions could be delineated, with 100 genes associated with plant resistance, vernalization response, and signal transduction detected in these regions. Subsequent analysis of genetic diversity, linkage disequilibrium (LD), and population structure in the 300 accessions was carried out based on the 201,817 SNPs. Nine subpopulations were observed based on the population structure analysis. Hierarchical clustering and principal component analysis divided the 300 varieties roughly in accordance with their ecotype origins. However, spring-type varieties were intermingled with semi-winter type varieties, indicating frequent hybridization between spring and semi-winter ecotypes in China. In addition, LD decay across the whole genome averaged 299 kb when r2 = 0.1, but the LD decay in the A genome (43 kb) was much shorter than in the C genome (1,455 kb), supporting the targeted introgression of the A genome from progenitor species B. rapa into Chinese rapeseed. This study also lays the foundation for genetic analysis of important agronomic traits using this rapeseed population. PMID:28503182
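
    The SNP subset described above is defined by two thresholds (minor allele frequency >0.05 and integrity >0.8). A minimal sketch of such a filter follows. It assumes that "integrity" means the fraction of accessions with a called genotype and that loci are biallelic; the function name and the genotype encoding are hypothetical and are not taken from the study.

```python
from collections import Counter


def passes_filter(genotypes, min_maf=0.05, min_integrity=0.8):
    """Return True if a SNP locus passes both thresholds.

    genotypes: one entry per accession, e.g. 'AA', 'AG', or None when the
    genotype was not called (illustrative encoding, an assumption).
    """
    if not genotypes:
        return False
    called = [g for g in genotypes if g is not None]
    # Integrity is assumed here to be the proportion of called genotypes.
    integrity = len(called) / len(genotypes)
    if integrity <= min_integrity:
        return False
    allele_counts = Counter(allele for g in called for allele in g)
    if len(allele_counts) < 2:
        return False  # monomorphic locus, no minor allele
    total = sum(allele_counts.values())
    # For a biallelic locus the second-largest count is the minor allele.
    maf = sorted(allele_counts.values())[-2] / total
    return maf > min_maf
```

    Both comparisons are strict, mirroring the ">" signs in the abstract, so a locus with a minor allele frequency of exactly 0.05 is rejected.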

  19. Single nucleotide polymorphisms typing of Mycobacterium leprae reveals focal transmission of leprosy in high endemic regions of India.

    PubMed

    Lavania, M; Jadhav, R S; Turankar, R P; Chaitanya, V S; Singh, M; Sengupta, U

    2013-11-01

    Earlier studies indicate that genotyping of Mycobacterium leprae based on single-nucleotide polymorphisms (SNPs) is useful for analysis of the global spread of leprosy. In the present study, we investigated the diversity of M. leprae at eight SNP loci using 180 clinical isolates obtained from patients with leprosy residing mainly in Delhi and Purulia (West Bengal) regions. It was observed that the frequency of SNP type 1 and subtype D was most predominant in the Indian population. Further, the SNP type 2 subtype E was noted only from East Delhi region and SNP type 2 subtype G was noted only from the nearby areas of Hoogly district of West Bengal. These results indicate the occurrence of focal transmission of M. leprae infection and demonstrate that analysis by SNP typing has great potential to help researchers in understanding the transmission of M. leprae infection in the community.

  20. SNP discrimination through proofreading and OFF-switch of exo+ polymerase.

    PubMed

    Zhang, Jia; Li, Kai; Pardinas, Jose R; Liao, Duan F; Li, Hong J; Zhang, Xu

    2004-05-01

    Single nucleotide polymorphisms (SNPs) are useful physical markers for genetic studies as well as the cause of some genetic diseases. To develop more reliable SNP assays, we examined the underlying molecular mechanisms by which deoxyribonucleic acid (DNA) polymerases with 3' exonuclease activity maintain the high fidelity of DNA replication. In addition to mismatch removal by proofreading, we have discovered a premature termination of polymerization mediated by a novel OFF-switch mechanism. Two SNP assays were developed, one based on proofreading using 3' end-labeled primer extension and the other based on the newly identified OFF-switch, respectively. These two new assays are well suited for conventional techniques, such as electrophoresis and microplates detection systems as well as the sophisticated microchips. Application of these reliable SNP assays will greatly facilitate genetic and biomedical studies in the postgenome era.

  1. The NHGRI GWAS Catalog, a curated resource of SNP-trait associations

    PubMed Central

    Welter, Danielle; MacArthur, Jacqueline; Morales, Joannella; Burdett, Tony; Hall, Peggy; Junkins, Heather; Klemm, Alan; Flicek, Paul; Manolio, Teri; Hindorff, Lucia; Parkinson, Helen

    2014-01-01

    The National Human Genome Research Institute (NHGRI) Catalog of Published Genome-Wide Association Studies (GWAS) Catalog provides a publicly available manually curated collection of published GWAS assaying at least 100 000 single-nucleotide polymorphisms (SNPs) and all SNP-trait associations with P <1 × 10−5. The Catalog includes 1751 curated publications of 11 912 SNPs. In addition to the SNP-trait association data, the Catalog also publishes a quarterly diagram of all SNP-trait associations mapped to the SNPs’ chromosomal locations. The Catalog can be accessed via a tabular web interface, via a dynamic visualization on the human karyotype, as a downloadable tab-delimited file and as an OWL knowledge base. This article presents a number of recent improvements to the Catalog, including novel ways for users to interact with the Catalog and changes to the curation infrastructure. PMID:24316577

  2. The NHGRI GWAS Catalog, a curated resource of SNP-trait associations.

    PubMed

    Welter, Danielle; MacArthur, Jacqueline; Morales, Joannella; Burdett, Tony; Hall, Peggy; Junkins, Heather; Klemm, Alan; Flicek, Paul; Manolio, Teri; Hindorff, Lucia; Parkinson, Helen

    2014-01-01

    The National Human Genome Research Institute (NHGRI) Catalog of Published Genome-Wide Association Studies (GWAS) Catalog provides a publicly available manually curated collection of published GWAS assaying at least 100,000 single-nucleotide polymorphisms (SNPs) and all SNP-trait associations with P <1 × 10(-5). The Catalog includes 1751 curated publications of 11 912 SNPs. In addition to the SNP-trait association data, the Catalog also publishes a quarterly diagram of all SNP-trait associations mapped to the SNPs' chromosomal locations. The Catalog can be accessed via a tabular web interface, via a dynamic visualization on the human karyotype, as a downloadable tab-delimited file and as an OWL knowledge base. This article presents a number of recent improvements to the Catalog, including novel ways for users to interact with the Catalog and changes to the curation infrastructure.

  3. Bayesian model comparison in genetic association analysis: linear mixed modeling and SNP set testing.

    PubMed

    Wen, Xiaoquan

    2015-10-01

    We consider the problems of hypothesis testing and model comparison under a flexible Bayesian linear regression model whose formulation is closely connected with the linear mixed effect model and the parametric models for Single Nucleotide Polymorphism (SNP) set analysis in genetic association studies. We derive a class of analytic approximate Bayes factors and illustrate their connections with a variety of frequentist test statistics, including the Wald statistic and the variance component score statistic. Taking advantage of Bayesian model averaging and hierarchical modeling, we demonstrate some distinct advantages and flexibilities in the approaches utilizing the derived Bayes factors in the context of genetic association studies. We demonstrate our proposed methods using real or simulated numerical examples in applications of single SNP association testing, multi-locus fine-mapping and SNP set association testing. © The Author 2015. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  4. Multi-marker-LD based genetic algorithm for tag SNP selection.

    PubMed

    Mouawad, Amer E; Mansour, Nashat

    2014-12-01

    Despite the advances in genotyping technologies, which have led to a large reduction in genotyping cost, Tag SNP Selection remains an important problem for computational biologists and geneticists. Selecting the smallest subset of tag SNPs that can predict the other SNPs would considerably reduce the complexity of genome-wide or block-based SNP-disease association studies. These studies would lead to better diagnosis and treatment of diseases. In this work, we propose three variations of a genetic algorithm based on two-marker linkage disequilibrium, multi-marker linkage disequilibrium, and a third measure that we denote by prediction power. The performance of the three algorithms is compared with that of a recognized tag SNP selection algorithm using three different real data sets from the HapMap project. The results indicate that the multi-marker linkage disequilibrium based genetic algorithm yields better prediction accuracy.
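
    Two-marker linkage disequilibrium, one of the measures named above, is commonly summarized by the r² statistic. The sketch below computes r² for one pair of biallelic SNPs from phased haplotypes. It illustrates the standard statistic only, not the authors' genetic algorithm, and the function name and 0/1 allele coding are assumptions.

```python
def r_squared(haplotypes):
    """Pairwise LD (r^2) between two biallelic SNPs.

    haplotypes: list of (a, b) tuples, one per phased haplotype, where a and
    b are 0/1 alleles at the first and second SNP (illustrative coding).
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n   # frequency of allele 1 at SNP A
    p_b = sum(b for _, b in haplotypes) / n   # frequency of allele 1 at SNP B
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                      # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined when either SNP is monomorphic")
    return d * d / denom
```

    A pair with r² close to 1 is largely redundant, so one SNP of the pair can serve as a tag for the other.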

  5. Observation of perturbed 3snp double photoexcited Rydberg series of beryllium atoms

    SciTech Connect

    Yoshida, Fumiko; Matsuoka, Leo; Osaki, Hiroyuki; Kikkawa, Satoshi; Fukushima, Yu; Hasegawa, Shuichi; Nagata, Tetsuo; Azuma, Yoshiro; Obara, Satoshi

    2006-04-15

    We observed the 3snp autoionizing Rydberg series of the Be atom in order to investigate the double-photoexcitation processes in two-s-electron systems. We employed synchrotron radiation to photoexcite the Be atoms and measured the generated Be+ photoions by the time-of-flight method. The 3snp (n=3-9) photoexcitation resonance peaks with interloper state of 3p4s that converges to Be+(3p) threshold were observed. We derived the resonance parameters of 3snp series from a fitting procedure and obtained the Fano parameter q, energy position E_0, and resonance width γ. These parameters are in good agreement with theoretical values. In the vicinity of the 3s5p state these experimental results clearly revealed the influence of the interloper 3p4s state, and the comparison with the numerical calculations indicates that more detailed calculations might be required to fully explain this phenomenon.

  6. Sequential Support Vector Regression with Embedded Entropy for SNP Selection and Disease Classification.

    PubMed

    Liang, Yulan; Kelemen, Arpad

    2011-06-01

    Comprehensive evaluation of common genetic variations through association of SNP structure with common diseases on the genome-wide scale is currently a hot area in human genome research. For less costly and faster diagnostics, advanced computational approaches are needed to select the minimum SNPs with the highest prediction accuracy for common complex diseases. In this paper, we present a sequential support vector regression model with embedded entropy algorithm to deal with the redundancy for the selection of the SNPs that have best prediction performance of diseases. We implemented our proposed method for both SNP selection and disease classification, and applied it to simulation data sets and two real disease data sets. Results show that on the average, our proposed method outperforms the well known methods of Support Vector Machine Recursive Feature Elimination, logistic regression, CART, and logic regression based SNP selections for disease classification.

  7. SNP-Seek database of SNPs derived from 3000 rice genomes

    PubMed Central

    Alexandrov, Nickolai; Tai, Shuaishuai; Wang, Wensheng; Mansueto, Locedie; Palis, Kevin; Fuentes, Roven Rommel; Ulat, Victor Jun; Chebotarov, Dmytro; Zhang, Gengyun; Li, Zhikang; Mauleon, Ramil; Hamilton, Ruaraidh Sackville; McNally, Kenneth L.

    2015-01-01

    We have identified about 20 million rice SNPs by aligning reads from the 3000 rice genomes project with the Nipponbare genome. The SNPs and allele information are organized into the SNP-Seek system (http://www.oryzasnp.org/iric-portal/), which consists of an Oracle database holding close to 60 billion rows of SNP genotypes (20 M SNPs × 3 K rice lines) and a web interface for convenient querying. The database allows quick retrieval of SNP alleles for all varieties in a given genome region, finding different alleles from predefined varieties and querying basic passport and morphological phenotypic information about sequenced rice lines. SNPs can be visualized together with the gene structures in the JBrowse genome browser. Evolutionary relationships between rice varieties can be explored using phylogenetic trees or multidimensional scaling plots. PMID:25429973

  8. k-merSNP discovery: Software for alignment-and reference-free scalable SNP discovery, phylogenetics, and annotation for hundreds of microbial genomes

    SciTech Connect

    2014-11-18

    With the flood of whole genome finished and draft microbial sequences, we need faster, more scalable bioinformatics tools for sequence comparison. An algorithm is described to find single nucleotide polymorphisms (SNPs) in whole genome data. It scales to hundreds of bacterial or viral genomes, and can be used for finished and/or draft genomes available as unassembled contigs or raw, unassembled reads. The method is fast to compute, finding SNPs and building a SNP phylogeny in minutes to hours, depending on the size and diversity of the input sequences. The SNP-based trees that result are consistent with known taxonomy and trees determined in other studies. The approach we describe can handle many gigabases of sequence in a single run. The algorithm is based on k-mer analysis.
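
    The abstract states only that the algorithm is based on k-mer analysis. One common way to find SNPs without alignment is to look for k-mers of odd length whose flanking bases are identical across genomes while the central base differs. The sketch below illustrates that idea under stated assumptions: it ignores reverse complements, and all function names are hypothetical rather than part of the software described.

```python
def kmer_centers(seq, k):
    """Map (left flank, right flank) -> set of central bases seen in seq."""
    half = k // 2
    centers = {}
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        flank = (kmer[:half], kmer[half + 1:])
        centers.setdefault(flank, set()).add(kmer[half])
    return centers


def candidate_snps(genomes, k=5):
    """Return flanks shared by all genomes whose central base differs.

    genomes: dict of name -> sequence string. k must be odd so that each
    k-mer has a single central base.
    """
    if k % 2 == 0:
        raise ValueError("k must be odd")
    per_genome = {name: kmer_centers(seq, k) for name, seq in genomes.items()}
    shared = set.intersection(*(set(c) for c in per_genome.values()))
    snps = {}
    for flank in shared:
        # Skip flanks that occur with more than one central base in a genome.
        if any(len(c[flank]) != 1 for c in per_genome.values()):
            continue
        alleles = {name: next(iter(c[flank])) for name, c in per_genome.items()}
        if len(set(alleles.values())) > 1:
            snps[flank] = alleles
    return snps
```

    Because only exact k-mer matches are compared, no reference genome or multiple alignment is needed, which is what lets this style of method scale to many draft genomes.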

  9. Toward a consensus on SNP and STR mutation rates on the human Y-chromosome.

    PubMed

    Balanovsky, O

    2017-05-01

    The mutation rate on the Y-chromosome matters for estimating the time-to-the-most-recent-common-ancestor (TMRCA, i.e. haplogroup age) in population genetics, as well as for forensic, medical, and genealogical studies. Large-scale sequencing efforts have produced several independent estimates of Y-SNP mutation rates. Genealogical, or pedigree, rates tend to be slightly faster than evolutionary rates obtained from ancient DNA or calibrations using dated (pre)historical events. It is, therefore, suggested to report TMRCAs using an envelope defined by the average aDNA-based rate and the average pedigree-based rate. The current estimate of the "envelope rate" is 0.75-0.89 substitutions per billion base pairs per year. The available Y-SNP mutation rates can be applied to high-coverage data from the entire X-degenerate region, but other datasets may demand recalibrated rates. While a consensus on Y-SNP rates is approaching, the debate on Y-STR rates has continued for two decades, because multiple genealogical rates were consistent with each other but three times faster than the single evolutionary estimate. Applying Y-SNP and Y-STR rates to the same haplogroups recently helped to clarify the issue. Genealogical and evolutionary STR rates typically provide lower and upper bounds of the "true" (SNP-based) age. The genealogical rate often-but not always-works well for haplogroups less than 7000 years old. The evolutionary rate, although calibrated using recent events, inflates ages of young haplogroups and deflates the age of the entire Y-chromosomal tree, but often provides reasonable estimates for intermediate ages (old haplogroups). Future rate estimates and accumulating case studies should further clarify the Y-SNP rates.
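
    The envelope idea above can be made concrete with a short calculation: an age estimate is the number of accumulated substitutions divided by the product of the mutation rate and the number of base pairs surveyed. The sketch below applies the two ends of the quoted envelope (0.75 and 0.89 substitutions per billion base pairs per year). The function names, the inputs in the usage note, and the reading of the two ends as a slower and a faster rate are illustrative assumptions.

```python
# Envelope bounds quoted in the abstract, converted to substitutions
# per base pair per year.
SLOW_RATE = 0.75e-9
FAST_RATE = 0.89e-9


def tmrca_years(mutations_per_lineage, callable_bp, rate_per_bp_per_year):
    """Age estimate: substitutions / (rate x base pairs surveyed)."""
    return mutations_per_lineage / (callable_bp * rate_per_bp_per_year)


def tmrca_envelope(mutations_per_lineage, callable_bp):
    """Return (younger, older) age bounds in years.

    The faster rate gives the younger bound and the slower rate the older one.
    """
    younger = tmrca_years(mutations_per_lineage, callable_bp, FAST_RATE)
    older = tmrca_years(mutations_per_lineage, callable_bp, SLOW_RATE)
    return younger, older
```

    For example, with a hypothetical average of 10 substitutions per lineage over 10 million callable base pairs, tmrca_envelope(10, 10_000_000) gives bounds of roughly 1,124 and 1,333 years.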

  10. An Integrated SNP Mining and Utilization (ISMU) Pipeline for Next Generation Sequencing Data

    PubMed Central

    Azam, Sarwar; Rathore, Abhishek; Shah, Trushar M.; Telluri, Mohan; Amindala, BhanuPrakash; Ruperao, Pradeep; Katta, Mohan A. V. S. K.; Varshney, Rajeev K.

    2014-01-01

    Open source single nucleotide polymorphism (SNP) discovery pipelines for next generation sequencing data commonly require working knowledge of command line interface, massive computational resources and expertise, which is a daunting task for biologists. Further, the SNP information generated may not be readily used for downstream processes such as genotyping. Hence, a comprehensive pipeline has been developed by integrating several open source next generation sequencing (NGS) tools along with a graphical user interface called Integrated SNP Mining and Utilization (ISMU) for SNP discovery and their utilization by developing genotyping assays. The pipeline features functionalities such as pre-processing of raw data, integration of open source alignment tools (Bowtie2, BWA, Maq, NovoAlign and SOAP2), SNP prediction (SAMtools/SOAPsnp/CNS2snp and CbCC) methods and interfaces for developing genotyping assays. The pipeline outputs a list of high quality SNPs between all pairwise combinations of genotypes analyzed, in addition to the reference genome/sequence. Visualization tools (Tablet and Flapjack) integrated into the pipeline enable inspection of the alignment and errors, if any. The pipeline also provides a confidence score or polymorphism information content value with flanking sequences for identified SNPs in standard format required for developing marker genotyping (KASP and Golden Gate) assays. The pipeline enables users to process a range of NGS datasets such as whole genome re-sequencing, restriction site associated DNA sequencing and transcriptome sequencing data at a fast speed. The pipeline is very useful for the plant genetics and breeding community with no computational expertise in order to discover SNPs and utilize them in genomics, genetics and breeding studies. The pipeline has been parallelized to process huge datasets of next generation sequencing. It has been developed in Java language and is available at http://hpc.icrisat.cgiar.org/ISMU as a standalone

  11. SNP markers-based map construction and genome-wide linkage analysis in Brassica napus.

    PubMed

    Raman, Harsh; Dalton-Morgan, Jessica; Diffey, Simon; Raman, Rosy; Alamery, Salman; Edwards, David; Batley, Jacqueline

    2014-09-01

    An Illumina Infinium array comprising 5306 single nucleotide polymorphism (SNP) markers was used to genotype 175 individuals of a doubled haploid population derived from a cross between Skipton and Ag-Spectrum, two Australian cultivars of rapeseed (Brassica napus L.). A genetic linkage map based on 613 SNP and 228 non-SNP (DArT, SSR, SRAP and candidate gene markers) covering 2514.8 cM was constructed and further utilized to identify loci associated with flowering time and resistance to blackleg, a disease caused by the fungus Leptosphaeria maculans. Comparison between genetic map positions of SNP markers and the sequenced Brassica rapa (A) and Brassica oleracea (C) genome scaffolds showed several genomic rearrangements in the B. napus genome. A major locus controlling resistance to L. maculans was identified at both seedling and adult plant stages on chromosome A07. QTL analyses revealed that up to 40.2% of genetic variation for flowering time was accounted for by loci having quantitative effects. Comparative mapping showed Arabidopsis and Brassica flowering genes such as Phytochrome A/D, Flowering Locus C and agamous-Like MADS box gene AGL1 map within marker intervals associated with flowering time in a DH population from Skipton/Ag-Spectrum. Genomic regions associated with flowering time and resistance to L. maculans had several SNP markers mapped within 10 cM. Our results suggest that SNP markers will be suitable for various applications such as trait introgression, comparative mapping and high-resolution mapping of loci in B. napus. © 2014 Society for Experimental Biology, Association of Applied Biologists and John Wiley & Sons Ltd.

  12. Viability of in-house datamarting approaches for population genetics analysis of SNP genotypes

    PubMed Central

    Amigo, Jorge; Phillips, Christopher; Salas, Antonio; Carracedo, Ángel

    2009-01-01

    Background Databases containing very large amounts of SNP (Single Nucleotide Polymorphism) data are now freely available for researchers interested in medical and/or population genetics applications. While many of these SNP repositories have implemented data retrieval tools for general-purpose mining, these alone cannot cover the broad spectrum of needs of most medical and population genetics studies. Results To address this limitation, we have built in-house customized data marts from the raw data provided by the largest public databases. In particular, for population genetics analysis based on genotypes we have built a set of data processing scripts that deal with raw data coming from the major SNP variation databases (e.g. HapMap, Perlegen), stripping them into single genotypes and then grouping them into populations, then merged with additional complementary descriptive information extracted from dbSNP. This allows not only in-house standardization and normalization of the genotyping data retrieved from different repositories, but also the calculation of statistical indices from simple allele frequency estimates to more elaborate genetic differentiation tests within populations, together with the ability to combine population samples from different databases. Conclusion The present study demonstrates the viability of implementing scripts for handling extensive datasets of SNP genotypes with low computational costs, dealing with certain complex issues that arise from the divergent nature and configuration of the most popular SNP repositories. The information contained in these databases can also be enriched with additional information obtained from other complementary databases, in order to build a dedicated data mart. Updating the data structure is straightforward, as well as permitting easy implementation of new external data and the computation of supplementary statistical indices of interest. PMID:19344481

  13. Using Hamming Distance as Information for SNP-Sets Clustering and Testing in Disease Association Studies.

    PubMed

    Wang, Charlotte; Kao, Wen-Hsin; Hsiao, Chuhsing Kate

    2015-01-01

    The availability of high-throughput genomic data has led to several challenges in recent genetic association studies, including the large number of genetic variants that must be considered and the computational complexity in statistical analyses. Tackling these problems with a marker-set study such as SNP-set analysis can be an efficient solution. To construct SNP-sets, we first propose a clustering algorithm, which employs Hamming distance to measure the similarity between strings of SNP genotypes and evaluates whether the given SNPs or SNP-sets should be clustered. A dendrogram can then be constructed based on such distance measure, and the number of clusters can be determined. With the resulting SNP-sets, we next develop an association test HDAT to examine susceptibility to the disease of interest. This proposed test assesses, based on Hamming distance, whether the similarity between a diseased and a normal individual differs from the similarity between two individuals of the same disease status. In our proposed methodology, only genotype information is needed. No inference of haplotypes is required, and SNPs under consideration do not need to locate in nearby regions. The proposed clustering algorithm and association test are illustrated with applications and simulation studies. As compared with other existing methods, the clustering algorithm is faster and better at identifying sets containing SNPs exerting a similar effect. In addition, the simulation studies demonstrated that the proposed test works well for SNP-sets containing a large proportion of neutral SNPs. Furthermore, employing the clustering algorithm before testing a large set of data improves the knowledge in confining the genetic regions for susceptible genetic markers.
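
    The distance measure at the core of this method is simple to state: the Hamming distance between two equal-length genotype strings is the number of positions at which they differ. The sketch below shows only that building block, not the authors' clustering algorithm or the HDAT test. The 0/1/2 genotype coding and the function names are assumptions made for illustration.

```python
def hamming(g1, g2):
    """Number of positions at which two genotype strings differ."""
    if len(g1) != len(g2):
        raise ValueError("genotype strings must have equal length")
    return sum(a != b for a, b in zip(g1, g2))


def pairwise_distances(genotypes):
    """Hamming distance for every unordered pair of labelled strings.

    genotypes: dict of label -> genotype string, e.g. '0121', where each
    character is a genotype coded as 0, 1 or 2 (illustrative coding).
    """
    labels = sorted(genotypes)
    return {
        (x, y): hamming(genotypes[x], genotypes[y])
        for i, x in enumerate(labels)
        for y in labels[i + 1:]
    }
```

    A matrix of such pairwise distances is the kind of input a hierarchical clustering step needs to build a dendrogram.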

  14. Association between CYP19 gene SNP rs2414096 polymorphism and polycystic ovary syndrome in Chinese women.

    PubMed

    Jin, Jia-Li; Sun, Jing; Ge, Hui-Juan; Cao, Yun-Xia; Wu, Xiao-Ke; Liang, Feng-Jing; Sun, Hai-Xiang; Ke, Lu; Yi, Long; Wu, Zhi-Wei; Wang, Yong

    2009-12-16

    Several studies have reported the association of the SNP rs2414096 in the CYP19 gene with hyperandrogenism, which is one of the clinical manifestations of polycystic ovary syndrome (PCOS). These studies suggest that SNP rs2414096 may be involved in the etiopathogenesis of PCOS. To investigate whether the CYP19 gene SNP rs2414096 polymorphism is associated with susceptibility to PCOS, we performed a case-controlled association study including 684 individuals (386 PCOS patients and 298 controls). Genotyping of SNP rs2414096 was conducted by the polymerase chain reaction-restriction fragment length polymorphism (PCR-RFLP) method that was performed on genomic DNA isolated from blood leucocytes. Results were analyzed in respect to clinical test results. The genotypic distributions of rs2414096 (GG, AG, AA) in the CYP19 gene in women with PCOS (0.363, 0.474, 0.163, respectively) were significantly different from those in controls (0.242, 0.500, 0.258, respectively) (P = 0.001). E2/T was different between the AA and GG genotypes. Age at menarche (AAM) and FSH were also significantly different among the GG, AG, and AA genotypes in women with PCOS (P = 0.0391 and 0.0118, respectively). No differences were observed in body mass index (BMI) and other serum hormone concentrations among the three genotypes, either in the PCOS patients or controls. Our data suggest that SNP rs2414096 in the CYP19 gene is associated with susceptibility to PCOS.

  15. An integrated SNP mining and utilization (ISMU) pipeline for next generation sequencing data.

    PubMed

    Azam, Sarwar; Rathore, Abhishek; Shah, Trushar M; Telluri, Mohan; Amindala, BhanuPrakash; Ruperao, Pradeep; Katta, Mohan A V S K; Varshney, Rajeev K

    2014-01-01

    Open source single nucleotide polymorphism (SNP) discovery pipelines for next generation sequencing data commonly require working knowledge of command line interface, massive computational resources and expertise, which is a daunting task for biologists. Further, the SNP information generated may not be readily used for downstream processes such as genotyping. Hence, a comprehensive pipeline has been developed by integrating several open source next generation sequencing (NGS) tools along with a graphical user interface called Integrated SNP Mining and Utilization (ISMU) for SNP discovery and their utilization by developing genotyping assays. The pipeline features functionalities such as pre-processing of raw data, integration of open source alignment tools (Bowtie2, BWA, Maq, NovoAlign and SOAP2), SNP prediction (SAMtools/SOAPsnp/CNS2snp and CbCC) methods and interfaces for developing genotyping assays. The pipeline outputs a list of high quality SNPs between all pairwise combinations of genotypes analyzed, in addition to the reference genome/sequence. Visualization tools (Tablet and Flapjack) integrated into the pipeline enable inspection of the alignment and errors, if any. The pipeline also provides a confidence score or polymorphism information content value with flanking sequences for identified SNPs in standard format required for developing marker genotyping (KASP and Golden Gate) assays. The pipeline enables users to process a range of NGS datasets such as whole genome re-sequencing, restriction site associated DNA sequencing and transcriptome sequencing data at a fast speed. The pipeline is very useful for the plant genetics and breeding community with no computational expertise in order to discover SNPs and utilize them in genomics, genetics and breeding studies. The pipeline has been parallelized to process huge datasets of next generation sequencing. It has been developed in Java language and is available at http://hpc.icrisat.cgiar.org/ISMU as a standalone

  16. Vitis Phylogenomics: Hybridization Intensities from a SNP Array Outperform Genotype Calls

    PubMed Central

    Miller, Allison J.; Matasci, Naim; Schwaninger, Heidi; Aradhya, Mallikarjuna K.; Prins, Bernard; Zhong, Gan-Yuan; Simon, Charles; Buckler, Edward S.; Myles, Sean

    2013-01-01

    Understanding relationships among species is a fundamental goal of evolutionary biology. Single nucleotide polymorphisms (SNPs) identified through next generation sequencing and related technologies enable phylogeny reconstruction by providing unprecedented numbers of characters for analysis. One approach to SNP-based phylogeny reconstruction is to identify SNPs in a subset of individuals, and then to compile SNPs on an array that can be used to genotype additional samples at hundreds or thousands of sites simultaneously. Although powerful and efficient, this method is subject to ascertainment bias because applying variation discovered in a representative subset to a larger sample favors identification of SNPs with high minor allele frequencies and introduces bias against rare alleles. Here, we demonstrate that the use of hybridization intensity data, rather than genotype calls, reduces the effects of ascertainment bias. Whereas traditional SNP calls assess known variants based on diversity housed in the discovery panel, hybridization intensity data survey variation in the broader sample pool, regardless of whether those variants are present in the initial SNP discovery process. We apply SNP genotype and hybridization intensity data derived from the Vitis9kSNP array developed for grape to show the effects of ascertainment bias and to reconstruct evolutionary relationships among Vitis species. We demonstrate that phylogenies constructed using hybridization intensities suffer less from the distorting effects of ascertainment bias, and are thus more accurate than phylogenies based on genotype calls. Moreover, we reconstruct the phylogeny of the genus Vitis using hybridization data, show that North American subgenus Vitis species are monophyletic, and resolve several previously poorly known relationships among North American species. This study builds on earlier work that applied the Vitis9kSNP array to evolutionary questions within Vitis vinifera and has general

  17. Minimal SNP overlap among multiple panels of ancestry informative markers argues for more international collaboration.

    PubMed

    Soundararajan, Usha; Yun, Libing; Shi, Meisen; Kidd, Kenneth K

    2016-07-01

    The century-old use of genetic markers to determine population relationships has morphed in modern forensics into use of markers to determine the ancestry of an individual from a DNA sample. Researchers have identified sets of SNPs that have frequency differences among populations and many sets of SNPs have been published for the purpose of inferring ancestry. Such inference also requires reference datasets for the particular set of SNPs selected. We have identified 21 largely independent published panels of ancestry informative SNPs (AISNPs) and examined their union of 1397 SNPs. No SNP occurs in more than 6 panels. The 1397 SNPs in 21 panels yield a largely empty matrix that is inhibiting progress on more refined ability to infer ancestry for a forensic sample. The most common set of reference populations is the HGDP set of 52 small population samples totaling a thousand individuals. Only 46 (3%) of the 1397 SNPs occur in three or more panels. We assembled a new dataset for 44 of those SNPs involving 4,559 individuals from 73 populations. Analyses of this dataset provided clear differentiation of only five biogeographic regions: sub-Saharan Africa, Europe and SW Asia, South Asia, East Asia, and the Americas. This is an inadequate level of biogeographic resolution already exceeded by other panels. We conclude that more such AISNP panels are not needed and that the forensic community must collaborate to develop a common set of highly differentiating AISNPs typed on a very large number of population samples. How that can be accomplished will be the subject of future discussion. Copyright © 2016 The Authors. Published by Elsevier Ireland Ltd. All rights reserved.

  18. Identity by Descent Mapping of Founder Mutations in Cancer Using High-Resolution Tumor SNP Data

    PubMed Central

    Letouzé, Eric; Sow, Aliou; Petel, Fabien; Rosati, Roberto; Figueiredo, Bonald C.; Burnichon, Nelly; Gimenez-Roqueplo, Anne-Paule

    2012-01-01

    Dense genotype data can be used to detect chromosome fragments inherited from a common ancestor in apparently unrelated individuals. A disease-causing mutation inherited from a common founder may thus be detected by searching for a common haplotype signature in a sample population of patients. We present here FounderTracker, a computational method for the genome-wide detection of founder mutations in cancer using dense tumor SNP profiles. Our method is based on two assumptions. First, the wild-type allele frequently undergoes loss of heterozygosity (LOH) in the tumors of germline mutation carriers. Second, the overlap between the ancestral chromosome fragments inherited from a common founder will define a minimal haplotype conserved in each patient carrying the founder mutation. Our approach thus relies on the detection of haplotypes with significant identity by descent (IBD) sharing within recurrent regions of LOH to highlight genomic loci likely to harbor a founder mutation. We validated this approach by analyzing two real cancer data sets in which we successfully identified founder mutations of well-characterized tumor suppressor genes. We then used simulated data to evaluate the ability of our method to detect IBD tracts as a function of their size and frequency. We show that FounderTracker can detect haplotypes of low prevalence with high power and specificity, significantly outperforming existing methods. FounderTracker is thus a powerful tool for discovering unknown founder mutations that may explain part of the “missing” heritability in cancer. This method is freely available and can be used online at the FounderTracker website. PMID:22567117

  19. Expanded dog leukocyte antigen (DLA) single nucleotide polymorphism (SNP) genotyping reveals spurious class II associations

    PubMed Central

    Safra, N.; Pedersen, N.C.; Wolf, Z.; Johnson, E.G.; Liu, H.W.; Hughes, A.M.; Young, A.; Bannasch, D.L.

    2011-01-01

    The dog leukocyte antigen (DLA) system contains many of the functional genes of the immune system, thereby making it a candidate region for involvement in immune-mediated disorders. A number of studies have identified associations between specific DLA class II haplotypes and canine immune hemolytic anemia, thyroiditis, immune polyarthritis, type I diabetes mellitus, hypoadrenocorticism, systemic lupus erythematosus-related disease complex, necrotizing meningoencephalitis (NME) and anal furunculosis. These studies have relied on sequencing approximately 300 bases of exon 2 of each of the DLA class II genes: DLA-DRB1, DLA-DQA1 and DLA-DQB1. An association (odds ratio = 4.29) was identified by this method between Weimaraner dogs with hypertrophic osteodystrophy (HOD) and DLA-DRB1*01501. In the present study, a genotyping assay of 126 coding single nucleotide polymorphisms (SNPs) from across the entire DLA, spanning a region of 2.5 Mb (3,320,000–5,830,000) on CFA12, was developed and tested on Weimaraners with HOD, as well as two additional breeds with diseases associated with DLA class II: Nova Scotia duck tolling retrievers with hypoadrenocorticism and Pug dogs with NME. No significant associations were found between Weimaraners with HOD or Nova Scotia duck tolling retrievers with hypoadrenocorticism and SNPs spanning the DLA region. In contrast, significant associations were found with NME in Pug dogs, although the associated region extended beyond the class II genes. By including a larger number of genes from a larger genomic region a SNP genotyping assay was generated that provides coverage of the extended DLA region and may be useful in identifying and fine mapping DLA associations in dogs. PMID:21741283

  20. BlueSNP: R package for highly scalable genome-wide association studies using Hadoop clusters.

    PubMed

    Huang, Hailiang; Tata, Sandeep; Prill, Robert J

    2013-01-01

    Computational workloads for genome-wide association studies (GWAS) are growing in scale and complexity, outpacing the capabilities of single-threaded software designed for personal computers. The BlueSNP R package implements GWAS statistical tests in the R programming language and executes the calculations across computer clusters configured with Apache Hadoop, a de facto standard framework for distributed data processing using the MapReduce formalism. BlueSNP makes computationally intensive analyses, such as estimating empirical p-values via data permutation and searching for expression quantitative trait loci over thousands of genes, feasible for large genotype-phenotype datasets. BlueSNP is available at http://github.com/ibm-bioinformatics/bluesnp
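
    The permutation step mentioned in this abstract can be illustrated on a single SNP. The following is a minimal single-machine sketch in Python, not BlueSNP's R/Hadoop implementation; the function names, the carrier/non-carrier coding, and the mean-difference test statistic are all assumptions chosen for illustration.

```python
import random


def mean_difference(values, labels):
    """Difference in mean phenotype between carriers (1) and non-carriers (0)."""
    carriers = [v for v, g in zip(values, labels) if g == 1]
    others = [v for v, g in zip(values, labels) if g == 0]
    return sum(carriers) / len(carriers) - sum(others) / len(others)


def empirical_p_value(values, labels, n_permutations=1000, seed=0):
    """Two-sided empirical p-value obtained by shuffling the genotype labels.

    Hypothetical helper: the statistic and the add-one correction are
    illustrative choices, not taken from the abstract.
    """
    rng = random.Random(seed)
    observed = abs(mean_difference(values, labels))
    shuffled = list(labels)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # breaks any genotype-phenotype link
        if abs(mean_difference(values, shuffled)) >= observed:
            hits += 1
    # The add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_permutations + 1)
```

    Each permutation is independent of the others, which is what makes this kind of workload straightforward to spread across a cluster.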

  1. SNP500Cancer: a public resource for sequence validation, assay development, and frequency analysis for genetic variation in candidate genes.

    PubMed

    Packer, Bernice R; Yeager, Meredith; Burdett, Laura; Welch, Robert; Beerman, Michael; Qi, Liqun; Sicotte, Hugues; Staats, Brian; Acharya, Mekhala; Crenshaw, Andrew; Eckert, Andrew; Puri, Vinita; Gerhard, Daniela S; Chanock, Stephen J

    2006-01-01

    The SNP500Cancer database provides sequence and genotype assay information for candidate SNPs useful in mapping complex diseases, such as cancer. The database is an integral component of the NCI Cancer Genome Anatomy Project (http://cgap.nci.nih.gov). SNP500Cancer reports sequence analysis of anonymized control DNA samples (n = 102 Coriell samples representing four self-described ethnic groups: African/African-American, Caucasian, Hispanic and Pacific Rim). The website is searchable by gene, chromosome, gene ontology pathway, dbSNP ID and SNP500Cancer SNP ID. As of October 2005, the database contains >13 400 SNPs, 9124 of which have been sequenced in the SNP500Cancer population. For each analysed SNP, gene location and >200 bp of surrounding annotated sequence (including nearby SNPs) are provided, with frequency information in total and per subpopulation as well as calculation of Hardy-Weinberg equilibrium for each subpopulation. The website provides the conditions for validated sequencing and genotyping assays, as well as genotype results for the 102 samples, in both viewable and downloadable formats. A subset of sequence validated SNPs with minor allele frequency >5% are entered into a high-throughput pipeline for genotyping analysis to determine concordance for the same 102 samples. In addition, the results of genotype analysis for select validated SNP assays (defined as 100% concordance between sequence analysis and genotype results) are posted for an additional 280 samples drawn from the Human Diversity Panel (HDP). SNP500Cancer provides an invaluable resource for investigators to select SNPs for analysis, design genotyping assays using validated sequence data, choose selected assays already validated on one or more genotyping platforms, and select reference standards for genotyping assays. The SNP500Cancer database is freely accessible via the web page at http://snp500cancer.nci.nih.gov.

  2. Selection and use of SNP markers for animal identification and paternity analysis in U.S. beef cattle.

    PubMed

    Heaton, Michael P; Harhay, Gregory P; Bennett, Gary L; Stone, Roger T; Grosse, W Michael; Casas, Eduardo; Keele, John W; Smith, Timothy P L; Chitko-McKown, Carol G; Laegreid, William W

    2002-05-01

    DNA marker technology represents a promising means for determining the genetic identity and kinship of an animal. Compared with other types of DNA markers, single nucleotide polymorphisms (SNPs) are attractive because they are abundant, genetically stable, and amenable to high-throughput automated analysis. In cattle, the challenge has been to identify a minimal set of SNPs with sufficient power for use in a variety of popular breeds and crossbred populations. This report describes a set of 32 highly informative SNP markers distributed among 18 autosomes and both sex chromosomes. Informativity of these SNPs in U.S. beef cattle populations was estimated from the distribution of allele and genotype frequencies in two panels: one consisting of 96 purebred sires representing 17 popular breeds, and another with 154 purebred American Angus from six herds in four Midwestern states. Based on frequency data from these panels, the estimated probability that two randomly selected, unrelated individuals will possess identical genotypes for all 32 loci was 2.0 × 10^-13 for multi-breed composite populations and 1.9 × 10^-10 for purebred Angus populations. The probability that a randomly chosen candidate sire will be excluded from paternity was estimated to be 99.9% and 99.4% for the same respective populations. The DNA immediately surrounding the 32 target SNPs was sequenced in the 96 sires of the multi-breed panel and found to contain an additional 183 polymorphic sites. Knowledge of these additional sites, together with the 32 target SNPs, allows the design of robust, accurate genotype assays on a variety of high-throughput SNP genotyping platforms.
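
    A match probability of this kind can be sketched as a product over loci. The Python below is an illustrative calculation only: it assumes biallelic loci in Hardy-Weinberg equilibrium, independence between loci, and unrelated individuals, and it does not use the allele frequencies reported in the study. The function names are hypothetical.

```python
def locus_identity_probability(allele_freq):
    """Probability that two unrelated individuals share a genotype at one locus.

    Assumes a biallelic SNP in Hardy-Weinberg equilibrium, so the three
    genotype frequencies are p^2, 2pq and q^2.
    """
    p = allele_freq
    q = 1.0 - p
    return (p * p) ** 2 + (2.0 * p * q) ** 2 + (q * q) ** 2


def panel_identity_probability(allele_freqs):
    """Multiply the per-locus probabilities, assuming independent loci."""
    result = 1.0
    for freq in allele_freqs:
        result *= locus_identity_probability(freq)
    return result
```

    Under these assumptions a locus with an allele frequency of 0.5 contributes the minimum per-locus value of 0.375, so a hypothetical panel of 32 such loci would give 0.375^32, on the order of 10^-14. Real panels have less even frequencies and therefore larger match probabilities.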

  3. Hypertriglyceridemia associated with the c.553G>T APOA5 SNP results from aberrant hetero-disulfide bond formation

    PubMed Central

    Sharma, Vineeta; Witkowski, Andrzej; Witkowska, H. Ewa; Dykstra, Andrew; Simonsen, Jens B.; Nelbach, Lisa; Beckstead, Jennifer A.; Pullinger, Clive R.; Kane, John P.; Malloy, Mary J.; Watson, Gordon; Forte, Trudy M.; Ryan, Robert O.

    2014-01-01

    Objective: Apolipoprotein (apo) A-V is a low abundance plasma protein that modulates triacylglycerol (TG) homeostasis. Gene transfer studies were undertaken in apoa5 (−/−) mice to define the mechanism underlying the correlation between the single nucleotide polymorphism (SNP) c.553G>T in APOA5 and hypertriglyceridemia (HTG). Approach and Results: Adeno-associated virus (AAV) 2/8 mediated gene transfer of wild type (WT) apoA-V induced a dramatic lowering of plasma TG in apoa5 (−/−) mice while AAV2/8-Gly162Cys apoA-V (corresponding to the c.553G>T SNP: rs2075291) had a modest effect. Characterization studies revealed that plasma levels of WT- and G162C apoA-V in transduced mice were similar and within the physiological range. Fractionation of plasma from mice transduced with AAV2/8-G162C apoA-V indicated that, unlike WT apoA-V, >50% of G162C apoA-V was recovered in the lipoprotein-free fraction. Non-reducing SDS-PAGE immunoblot analysis provided evidence that G162C apoA-V present in the lipoprotein-free fraction, but not that portion associated with lipoproteins, displayed altered electrophoretic mobility consistent with disulfide-linked hetero-dimer formation. Immunoprecipitation followed by liquid chromatography/mass spectrometry of human plasma from subjects homozygous for WT APOA5 and c.553G>T APOA5 revealed that G162C apoA-V forms adducts with extraneous plasma proteins including fibronectin, kininogen-1 and others. Conclusion: Substitution of Cys for Gly at position 162 of mature apoA-V introduces a free cysteine that forms disulfide bonds with plasma proteins such that its lipoprotein binding and TG modulation functions are compromised. PMID:25127531

  4. A preliminary assessment of the ForenSeq™ FGx System: next generation sequencing of an STR and SNP multiplex.

    PubMed

    Silvia, Ashley L; Shugarts, Nathan; Smith, Jenifer

    2017-01-01

    The ForenSeq™ FGx System (Illumina, San Diego, CA) was initially evaluated in concordance with SWGDAM guidelines for internal validation to determine the quality of the system's components: the ForenSeq™ DNA Signature Prep Kit reagents, the MiSeq FGx™ instrument, and the ForenSeq™ Universal Analysis Software, for the analysis of targeted, forensically informative single nucleotide polymorphisms (SNPs) and short tandem repeats (STRs). This multiplex consisted of STRs (autosomal, X, and Y) and SNPs (identity, ancestry, and phenotypic) that were run using one preparation process. Overall, the ForenSeq™ FGx System performed as well as the traditional capillary electrophoresis-based method in producing usable profile information, along with additional information that could aid in investigative leads. The MiSeq FGx™ System was validated using DNA samples in studies testing reproducibility, repeatability, concordance, sensitivity, and mock case single donor samples. Overall, genotyping results for STRs and SNPs were concordant with the profiles generated from conventional STR analysis using Identifiler and SNPs typed by 23andMe analysis. Genotypes of the ForenSeq™ aSNPs were used to evaluate biogeographical ancestry estimations using ForenSeq™ Universal Analysis Software, FROG-kb database (KIDD aiSNP 55 panel), and 23andMe. The system was shown to provide reproducible genotypes and reliable results were obtained at levels as low as 50 pg. All mock case samples were concordant with the donor profile. The results support consideration of the ForenSeq™ FGx System as an acceptable alternative to current STR and SNP analysis, pending formal developmental and internal validation studies.

  5. Functional SNP associated with birth weight in independent populations identified with a permutation step added to GBLUP-GWAS

    USDA-ARS's Scientific Manuscript database

    This study was conducted as an initial assessment of a newly available genotyping assay containing about 34,000 common SNP included on previous SNP chips, and 199,000 sequence variants predicted to affect gene function. Objectives were to identify functional variants associated with birth weight in...

  6. Tagging SNP-set selection with maximum information based on linkage disequilibrium structure in genome-wide association studies.

    PubMed

    Wang, Shudong; He, Sicheng; Yuan, Fayou; Zhu, Xinjie

    2017-07-15

    Effective tagging single-nucleotide polymorphism (SNP)-set selection is crucial to SNP-set analysis in genome-wide association studies (GWAS). Most existing tagging SNP-set selection methods cannot make full use of the information hidden in common or rare variants associated with diseases. Some SNPs carry overlapping genetic information owing to the linkage disequilibrium (LD) structure between SNPs. Therefore, when testing the association between SNPs and disease susceptibility, it is sufficient to select the representative SNPs (called the tag SNP-set or tagSNP-set) with maximum information. We propose a new tagSNP-set selection method based on LD information between SNPs, namely TagSNP-Set with Maximum Information. Compared with the classical SNP-set analytical method, our method not only has higher power, but also minimizes the number of selected tagSNPs and maximizes the information they provide, with lower genotyping cost and lower time complexity. Contact: hesicheng12@163.com. Supplementary data are available at Bioinformatics online.
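
    The general idea of choosing a few SNPs to stand in for their LD neighbours can be sketched with a simple greedy procedure. The Python below is a generic r²-threshold tagging sketch and is not the TagSNP-Set with Maximum Information method described in the abstract; the function name, the input matrix layout, and the 0.8 threshold are assumptions for illustration.

```python
def greedy_tag_snps(r2, threshold=0.8):
    """Pick tag SNPs so every SNP has r^2 >= threshold with some chosen tag.

    r2 is a square list of lists of pairwise r^2 values (hypothetical input).
    At each step the uncovered SNP that covers the most still-uncovered SNPs
    is chosen; ties go to the lowest index.
    """
    uncovered = set(range(len(r2)))
    tags = []
    while uncovered:
        best = max(
            sorted(uncovered),
            key=lambda i: sum(1 for j in uncovered if r2[i][j] >= threshold),
        )
        tags.append(best)
        # A tag always covers itself, even if the diagonal entry is missing.
        covered = {j for j in uncovered if r2[best][j] >= threshold} | {best}
        uncovered -= covered
    return tags
```

    On a matrix where SNPs 0 to 2 are in strong mutual LD and SNP 3 is independent, the sketch returns one tag for the block and one for the isolated SNP.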

  7. Genome-wide Target Enrichment-aided Chip Design: a 66 K SNP Chip for Cashmere Goat.

    PubMed

    Qiao, Xian; Su, Rui; Wang, Yang; Wang, Ruijun; Yang, Ting; Li, Xiaokai; Chen, Wei; He, Shiyang; Jiang, Yu; Xu, Qiwu; Wan, Wenting; Zhang, Yaolei; Zhang, Wenguang; Chen, Jiang; Liu, Bin; Liu, Xin; Fan, Yixing; Chen, Duoyuan; Jiang, Huaizhi; Fang, Dongming; Liu, Zhihong; Wang, Xiaowen; Zhang, Yanjun; Mao, Danqing; Wang, Zhiying; Di, Ran; Zhao, Qianjun; Zhong, Tao; Yang, Huanming; Wang, Jian; Wang, Wen; Dong, Yang; Chen, Xiaoli; Xu, Xun; Li, Jinquan

    2017-08-17

    Compared with the commercially available single nucleotide polymorphism (SNP) chip based on the Bead Chip technology, the solution hybrid selection (SHS)-based target enrichment SNP chip is not only design-flexible, but also cost-effective for genotype sequencing. In this study, we propose to design an animal SNP chip using the SHS-based target enrichment strategy for the first time. As an update to the international collaboration on goat research, a 66 K SNP chip for cashmere goat was created from the whole-genome sequencing data of 73 individuals. Verification of this 66 K SNP chip with the whole-genome sequencing data of 436 cashmere goats showed that the SNP call rate was between 95.3% and 99.8%. The average sequencing depth for target SNPs was 40X. The capture regions were shown to span 200 bp flanking the target SNPs. This chip was further tested in a genome-wide association analysis of cashmere fineness (fiber diameter). Several top hit loci were found marginally associated with signaling pathways involved in hair growth. These results demonstrate that the 66 K SNP chip is a useful tool in the genomic analyses of cashmere goats. The successful chip design shows that the SHS-based target enrichment strategy could be applied to SNP chip design in other species.

  8. MDM2 promoter SNP55 (rs2870820) affects risk of colon cancer but not breast-, lung-, or prostate cancer

    PubMed Central

    Helwa, Reham; Gansmo, Liv B.; Romundstad, Pål; Hveem, Kristian; Vatten, Lars; Ryan, Bríd M.; Harris, Curtis C.; Lønning, Per E.; Knappskog, Stian

    2016-01-01

    Two functional SNPs (SNP285G > C; rs117039649 and SNP309T > G; rs2279744) have previously been reported to modulate Sp1 transcription factor binding to the promoter of the proto-oncogene MDM2, and to influence cancer risk. Recently, a third SNP (SNP55C > T; rs2870820) was also reported to affect Sp1 binding and MDM2 transcription. In this large population based case-control study, we genotyped MDM2 SNP55 in 10,779 Caucasian individuals, previously genotyped for SNP309 and SNP285, including cases of colon (n = 1,524), lung (n = 1,323), breast (n = 1,709) and prostate cancer (n = 2,488) and 3,735 non-cancer controls, as well as 299 healthy African-Americans. Applying the dominant model, we found an elevated risk of colon cancer among individuals harbouring SNP55TT/CT genotypes compared to the SNP55CC genotype (OR = 1.15; 95% CI = 1.01–1.30). The risk was found to be highest for left-sided colon cancer (OR = 1.21; 95% CI = 1.00–1.45) and among females (OR = 1.32; 95% CI = 1.01–1.74). Assessing combined genotypes, we found the highest risk of colon cancer among individuals harbouring the SNP55TT or CT together with the SNP309TG genotype (OR = 1.21; 95% CI = 1.00–1.46). Supporting the conclusions from the risk estimates, we found colon cancer cases carrying the SNP55TT/CT genotypes to be diagnosed at younger age as compared to SNP55CC (p = 0.053), in particular among patients carrying the SNP309TG/TT genotypes (p = 0.009). PMID:27624283

  9. SNP array–based karyotyping: differences and similarities between aplastic anemia and hypocellular myelodysplastic syndromes

    PubMed Central

    Afable, Manuel G.; Wlodarski, Marcin; Makishima, Hideki; Shaik, Mohammed; Sekeres, Mikkael A.; Tiu, Ramon V.; Kalaycio, Matt; O'Keefe, Christine L.

    2011-01-01

    In aplastic anemia (AA), contraction of the stem cell pool may result in oligoclonality, while in myelodysplastic syndromes (MDS) a single hematopoietic clone often characterized by chromosomal aberrations expands and outcompetes normal stem cells. We analyzed patients with AA (N = 93) and hypocellular MDS (hMDS, N = 24) using single nucleotide polymorphism arrays (SNP-A) complementing routine cytogenetics. We hypothesized that clinically important cryptic clonal aberrations may exist in some patients with BM failure. Combined metaphase and SNP-A karyotyping improved detection of chromosomal lesions: 19% and 54% of AA and hMDS cases harbored clonal abnormalities including copy-neutral loss of heterozygosity (UPD, 7%). Remarkably, lesions involving the HLA locus suggestive of clonal immune escape were found in 3 of 93 patients with AA. In hMDS, additional clonal lesions were detected in 5 (36%) of 14 patients with normal/noninformative routine cytogenetics. In a subset of AA patients studied at presentation, persistent chromosomal genomic lesions were found in 10 of 33, suggesting that the initial diagnosis may have been hMDS. Similarly, using SNP-A, earlier clonal evolution was found in 4 of 7 AA patients followed serially. In sum, our results indicate that SNP-A identify cryptic clonal genomic aberrations in AA and hMDS leading to improved distinction of these disease entities. PMID:21527527

  10. selectSNP – An R package for selecting SNPs optimal for genetic evaluation

    USDA-ARS's Scientific Manuscript database

    There has been a huge increase in the number of SNPs in the public repositories. This has made it a challenge to design low and medium density SNP panels, which requires careful selection of available SNPs considering many criteria, such as map position, allelic frequency, possible biological functi...

  11. The identification of SNPs with indeterminate positions using the Equine SNP50 BeadChip.

    PubMed

    Corbin, L J; Blott, S C; Swinburne, J E; Vaudin, M; Bishop, S C; Woolliams, J A

    2012-06-01

    We have used linkage disequilibrium (LD) to identify single nucleotide polymorphisms (SNPs) on the Illumina Equine SNP50 BeadChip, which may be incorrectly positioned on the genome map. A total of 1201 Thoroughbred horses were genotyped using the Illumina Equine SNP50 BeadChip. LD was evaluated in a pairwise fashion between all autosomal SNPs, both within and across chromosomes. Filters were then applied to the data, firstly to identify SNPs that may have been mapped to the wrong chromosome and secondly to identify SNPs that may have been incorrectly positioned within chromosomes. We identified a single SNP on ECA28, which showed low LD with neighbouring SNPs but considerable LD with a group of SNPs on ECA10. Furthermore, a cluster of SNPs on ECA5 showed unusually low LD with surrounding SNPs. A total of 39 SNPs met the criteria for unusual within-chromosome LD. The results of this study indicate that some SNPs may be misplaced. This finding is significant, as misplaced SNPs may lead to difficulties in the application of genomic methods, such as homozygosity mapping, for which SNP order is important.
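
    Pairwise LD of the kind evaluated here is often summarised as r². The Python sketch below computes r² as the squared Pearson correlation of genotype dosages (0, 1 or 2 copies of one allele), a common approximation for unphased data. It is an assumption about how such a statistic might be computed, not the authors' pipeline, and the function name is hypothetical.

```python
def genotype_r2(dosages_a, dosages_b):
    """Squared Pearson correlation between two SNPs' genotype dosages.

    Both inputs are equal-length lists of 0/1/2 allele counts for the same
    animals. Returns 0.0 when either SNP is monomorphic in the sample.
    """
    n = len(dosages_a)
    mean_a = sum(dosages_a) / n
    mean_b = sum(dosages_b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(dosages_a, dosages_b))
    var_a = sum((x - mean_a) ** 2 for x in dosages_a)
    var_b = sum((y - mean_b) ** 2 for y in dosages_b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov * cov / (var_a * var_b)
```

    A SNP whose r² with its mapped neighbours is low while its r² with SNPs on another chromosome is high would be a candidate for the kind of misplacement the abstract describes.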

  12. An abbreviated SNP panel for ancestry assignment of honeybees (Apis mellifera)

    USDA-ARS's Scientific Manuscript database

    This paper examines whether an abbreviated panel of 37 single nucleotide polymorphisms (SNPs) has the same power as a larger and more expensive panel of 95 SNPs to assign ancestry of honeybees (Apis mellifera) to three ancestral lineages. We selected 37 SNPs from the original 95 SNP panel using alle...

  13. Making a chocolate chip: development and evaluation of a 6K SNP array for Theobroma cacao.

    USDA-ARS's Scientific Manuscript database

    Theobroma cacao, the key ingredient in chocolate production, is one of the world's most important tree fruit crops, with ~4,000,000 metric tons produced across 50 countries. To move towards gene discovery and marker-assisted breeding in cacao, a single-nucleotide polymorphism (SNP) identification pr...

  14. Longevity and Plasticity of CFTR Provide an Argument for Noncanonical SNP Organization in Hominid DNA

    PubMed Central

    Hill, Aubrey E.; Plyler, Zackery E.; Tiwari, Hemant; Patki, Amit; Tully, Joel P.; McAtee, Christopher W.; Moseley, Leah A.; Sorscher, Eric J.

    2014-01-01

    Like many other ancient genes, the cystic fibrosis transmembrane conductance regulator (CFTR) has survived for hundreds of millions of years. In this report, we consider whether such prodigious longevity of an individual gene – as opposed to an entire genome or species – should be considered surprising in the face of eons of relentless DNA replication errors, mutagenesis, and other causes of sequence polymorphism. The conventions that modern human SNP patterns result either from purifying selection or random (neutral) drift were not well supported, since extant models account rather poorly for the known plasticity and function (or the established SNP distributions) found in a multitude of genes such as CFTR. Instead, our analysis can be taken as a polemic indicating that SNPs in CFTR and many other mammalian genes may have been generated—and continue to accrue—in a fundamentally more organized manner than would otherwise have been expected. The resulting viewpoint contradicts earlier claims of ‘directional’ or ‘intelligent design-type’ SNP formation, and has important implications regarding the pace of DNA adaptation, the genesis of conserved non-coding DNA, and the extent to which eukaryotic SNP formation should be viewed as adaptive. PMID:25350658

  15. EvoSNP-DB: A database of genetic diversity in East Asian populations

    PubMed Central

    Kim, Young Uk; Kim, Young Jin; Lee, Jong-Young; Park, Kiejung

    2013-01-01

    Genome-wide association studies (GWAS) have become popular as an approach for the identification of large numbers of phenotype-associated variants. However, differences in genetic architecture and environmental factors mean that the effect of variants can vary across populations. Understanding population genetic diversity is valuable for the investigation of possible population specific and independent effects of variants. EvoSNP-DB aims to provide information regarding genetic diversity among East Asian populations, including Chinese, Japanese, and Korean. Non-redundant SNPs (1.6 million) were genotyped in 54 Korean trios (162 samples) and were compared with 4 million SNPs from HapMap phase II populations. EvoSNP-DB provides two user interfaces for data query and visualization, and integrates scores of genetic diversity (Fst and VarLD) at the level of SNPs, genes, and chromosome regions. EvoSNP-DB is a web-based application that allows users to navigate and visualize measurements of population genetic differences in an interactive manner, and is available online at [http://biomi.cdc.go.kr/EvoSNP/]. [BMB Reports 2013; 46(8): 416-421] PMID:23977990
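
    Of the two diversity scores named in this abstract, Fst has a simple textbook form for a single biallelic SNP. The Python below is a minimal sketch of that form for two populations, assuming equal population weights and using allele frequencies directly with no sample-size correction. It is not the estimator used by EvoSNP-DB, which the abstract does not specify, and the function name is hypothetical.

```python
def fst_two_populations(freq_1, freq_2):
    """Wright's Fst = (Ht - Hs) / Ht for one biallelic SNP, two populations.

    Hs is the mean expected heterozygosity within populations and Ht is the
    expected heterozygosity of the pooled population (equal weights assumed).
    Returns 0.0 when the SNP is monomorphic in the pooled population.
    """
    hs = (2.0 * freq_1 * (1.0 - freq_1) + 2.0 * freq_2 * (1.0 - freq_2)) / 2.0
    pooled = (freq_1 + freq_2) / 2.0
    ht = 2.0 * pooled * (1.0 - pooled)
    if ht == 0.0:
        return 0.0
    return (ht - hs) / ht
```

    Identical frequencies give a value of 0, and alleles fixed for opposite variants in the two populations give a value of 1.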

  16. Measuring diversity in Gossypium hirsutum using the CottonSNP63K Array

    USDA-ARS's Scientific Manuscript database

    A CottonSNP63K array and accompanying cluster file has been developed and includes 45,104 intra-specific SNPs and 17,954 inter-specific SNPs for automated genotyping of cotton (Gossypium spp.) samples. Development of the cluster file included genotyping of 1,156 samples, a subset of which were iden...

  17. Linkage disequilibrium among commonly genotyped SNP and variants detected from bull sequence

    USDA-ARS's Scientific Manuscript database

    Genomic prediction utilizing causal variants could increase selection accuracy above that achieved with SNP genotyped by commercial assays. A number of variants detected from sequencing influential sires are likely to be causal, but noticeable improvements in prediction accuracy using imputed sequen...

  18. Fine mapping of copy number variations on two cattle genome assemblies using high density SNP array

    USDA-ARS's Scientific Manuscript database

    Btau_4.0 and UMD3.1 are two distinct cattle reference genome assemblies. In our previous study using the low density BovineSNP50 array, we reported a copy number variation (CNV) analysis on Btau_4.0 with 521 animals of 21 cattle breeds, yielding 682 CNV regions with a total length of 139.8 megabases...

  19. An improved consensus linkage map of barley based on flow-sorted chromosomes and SNP markers

    USDA-ARS's Scientific Manuscript database

    Recent advances in high-throughput genotyping have made it easier to combine information from different mapping populations into consensus genetic maps, which provide increased marker density and genome coverage compared to individual maps. Previously, a SNP-based genotyping platform was developed a...

  20. Microsatellite Imputation for parental verification from SNP across multiple Bos taurus and indicus breeds

    USDA-ARS's Scientific Manuscript database

    Microsatellite markers (MS) have traditionally been used for parental verification and are still the international standard in spite of their higher cost, error rate, and turnaround time compared with Single Nucleotide Polymorphisms (SNP)-based assays. Despite domestic and international demands fro...

  1. A web-based genome browser for 'SNP-aware' assay design

    USDA-ARS's Scientific Manuscript database

    Human and animal genomes contain an abundance of single nucleotide polymorphisms (SNPs) that are useful for genetic testing. However, the relatively large number of SNPs present in diverse populations can pose serious problems when designing assays. It is important to “mask” some SNP positions so ...

  2. Use of microsatellite and SNP markers to characterize biotypes in Hessian fly

    USDA-ARS's Scientific Manuscript database

    Exploration of the biotype structure of Hessian fly, Mayetiola destructor (Say), would improve our knowledge regarding variation in virulence phenotypes and difference in genetic background. The objective of this study was to develop and test a panel of 18 microsatellite and 22 SNP markers to reveal...

  3. High-throughput RAD-SNP genotyping for characterization of sugar beet genotypes

    USDA-ARS's Scientific Manuscript database

    High-throughput SNP genotyping provides a rapid way of developing resourceful set of markers for delineating the genetic architecture and for effective species discrimination. In the presented research, we demonstrate a set of 192 SNPs for effective genotyping in sugar beet using high-throughput mar...

  4. A novel approach to analyzing fMRI and SNP data via parallel independent component analysis

    NASA Astrophysics Data System (ADS)

    Liu, Jingyu; Pearlson, Godfrey; Calhoun, Vince; Windemuth, Andreas

    2007-03-01

    There is current interest in understanding genetic influences on brain function in both the healthy and the disordered brain. Parallel independent component analysis, a new method for analyzing multimodal data, is proposed in this paper and applied to functional magnetic resonance imaging (fMRI) and a single nucleotide polymorphism (SNP) array. The method aims to identify the independent components of each modality and the relationship between the two modalities. We analyzed 92 participants, including 29 schizophrenia (SZ) patients, 13 unaffected SZ relatives, and 50 healthy controls. We found a correlation of 0.79 between one fMRI component and one SNP component. The fMRI component consists of activations in cingulate gyrus, multiple frontal gyri, and superior temporal gyrus. The related SNP component is contributed to significantly by 9 SNPs located in sets of genes, including those coding for apolipoprotein A-I, and C-III, malate dehydrogenase 1 and the gamma-aminobutyric acid alpha-2 receptor. A significant difference in the presence of this SNP component is found between the SZ group (SZ patients and their relatives) and the control group. In summary, we constructed a framework to identify the interactions between brain functional and genetic information; our findings provide new insight into understanding genetic influences on brain function in a common mental disorder.

  5. Development and validation of a low-density SNP panel related to prolificacy in sheep

    USDA-ARS's Scientific Manuscript database

    High-density SNP panels (e.g., 50,000 and 600,000 markers) have been used in exploratory population genetic studies with commercial and minor breeds of sheep. However, routine genetic diversity evaluations of large numbers of samples with large panels are in general cost-prohibitive for gene banks. ...

  6. The use of SNP data for the monitoring of genetic diversity in cattle breeds

    USDA-ARS's Scientific Manuscript database

    LD between SNPs contains information about effective population size. In this study, we investigate the use of genome-wide SNP data for marker based estimation of effective population size for two taurine cattle breeds of Africa and two local cattle breeds of Switzerland. Estimated recombination rat...
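
    The link between LD and effective population size can be made concrete with a small sketch. The abstract is truncated and does not name its estimator, so the code below assumes Sved's classical approximation, E[r^2] = 1 / (1 + 4 * Ne * c), where c is the recombination distance in Morgans. The function name `ne_from_ld` and the input values are hypothetical.

```python
def ne_from_ld(mean_r2, c_morgans):
    """Invert Sved's approximation E[r^2] = 1 / (1 + 4*Ne*c) for Ne.

    mean_r2   -- average r^2 among SNP pairs at a given distance (assumed input)
    c_morgans -- recombination distance between those pairs, in Morgans
    """
    if not 0 < mean_r2 < 1:
        raise ValueError("mean_r2 must lie strictly between 0 and 1")
    if c_morgans <= 0:
        raise ValueError("c_morgans must be positive")
    return (1 / mean_r2 - 1) / (4 * c_morgans)


# Illustrative numbers only: mean r^2 of 0.2 for pairs 1 cM (0.01 Morgan) apart.
print(round(ne_from_ld(0.2, 0.01)))
```

    Published estimators usually add corrections for sample size and mutation, which this sketch leaves out.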

  7. Mining for SNPs and SSRs using SNPServer, dbSNP and SSR taxonomy tree.

    PubMed

    Batley, Jacqueline; Edwards, David

    2009-01-01

    Molecular genetic markers represent one of the most powerful tools for the analysis of genomes and the association of heritable traits with underlying genetic variation. The development of high-throughput methods for the detection of single nucleotide polymorphisms (SNPs) and simple sequence repeats (SSRs) has led to a revolution in their use as molecular markers. The availability of large sequence data sets permits mining for these molecular markers, which may then be used for applications such as genetic trait mapping, diversity analysis and marker assisted selection in agriculture. Here we describe web-based automated methods for the discovery of SSRs using SSR taxonomy tree, the discovery of SNPs from sequence data using SNPServer and the identification of validated SNPs from within the dbSNP database. SSR taxonomy tree identifies pre-determined SSR amplification primers for virtually all species represented within the GenBank database. SNPServer uses a redundancy based approach to identify SNPs within DNA sequences. Following submission of a sequence of interest, SNPServer uses BLAST to identify similar sequences, CAP3 to cluster and assemble these sequences and then the SNP discovery software autoSNP to detect SNPs and insertion/deletion (indel) polymorphisms. The NCBI dbSNP database is a catalogue of molecular variation, hosting validated SNPs for several species within a public-domain archive.
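
    The redundancy-based idea described above can be illustrated with a minimal sketch: given sequences that are already aligned (the role BLAST and CAP3 play upstream), a column is reported as a candidate SNP only when at least two alleles are each seen in several sequences, so that isolated sequencing errors are ignored. This is not autoSNP's actual code; the function name `candidate_snps` and the threshold `min_minor` are hypothetical.

```python
from collections import Counter


def candidate_snps(aligned, min_minor=2):
    """Return (position, allele_counts) for alignment columns in which at
    least two alleles are each supported by `min_minor` or more sequences."""
    hits = []
    for pos, column in enumerate(zip(*aligned)):
        # Count only unambiguous bases; gaps and Ns are skipped.
        counts = Counter(base for base in column if base in "ACGT")
        supported = [base for base, n in counts.items() if n >= min_minor]
        if len(supported) >= 2:
            hits.append((pos, dict(counts)))
    return hits


# Toy alignment: column 2 has G x3 and A x2; column 7 has a single stray A.
reads = [
    "ACGTACGT",
    "ACGTACGT",
    "ACATACGT",
    "ACATACGA",
    "ACGTACGT",
]
print(candidate_snps(reads))
```

    Requiring redundant support for the minor allele trades sensitivity for specificity: the singleton at column 7 is treated as a likely error rather than a polymorphism.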

  8. The impact of SNP fingerprinting and parentage analysis on the effectiveness of variety recommendations in cacao

    USDA-ARS's Scientific Manuscript database

    Evidence for the impact of mislabeling and/or pollen contamination on consistency of field performance has been lacking to reinforce the need for strict adherence to quality control protocols in cacao seed garden and germplasm plot management. The present study used SNP fingerprinting at 64 loci to ...

  9. SNP discovery in complex allotetraploid genomes (Gossypium spp., Malvaceae) using genotyping by sequencing

    USDA-ARS's Scientific Manuscript database

    Dramatic decreases in the cost of DNA sequencing have enabled the development of very large numbers of markers based on single nucleotide polymorphism (SNP) for phylogenetic studies, population genetics, linkage mapping, marker-assisted breeding and other applications. Using Illumina next-generatio...

  10. Verification of genetic identity of introduced cacao germplasm in Ghana using single nucleotide polymorphism (SNP) markers

    USDA-ARS's Scientific Manuscript database

    Accurate identification of individual genotypes is important for cacao (Theobroma cacao L.) breeding, germplasm conservation and seed propagation. The development of single nucleotide polymorphism (SNP) markers in cacao offers an effective way to use a high-throughput genotyping system for cacao gen...

  11. Association of Agronomic Traits with SNP Markers in Durum Wheat (Triticum turgidum L. durum (Desf.)).

    PubMed

    Hu, Xin; Ren, Jing; Ren, Xifeng; Huang, Sisi; Sabiel, Salih A I; Luo, Mingcheng; Nevo, Eviatar; Fu, Chunjie; Peng, Junhua; Sun, Dongfa

    2015-01-01

    Association mapping is a powerful approach to detect associations between traits of interest and genetic markers based on linkage disequilibrium (LD) in molecular plant breeding. In this study, 150 accessions of durum wheat germplasm of worldwide origin (Triticum turgidum spp. durum) were genotyped using 1,366 SNP markers. The extent of LD on each chromosome was evaluated. Association of single nucleotide polymorphism (SNP) markers with ten agronomic traits measured in four consecutive years was analyzed under a mixed linear model (MLM). Two hundred and one significant association pairs were detected in the four years. Several markers were associated with one trait, and some markers were associated with multiple traits. Some of the associated markers were in agreement with previous quantitative trait loci (QTL) analyses. The function and homology analyses of the corresponding ESTs of some SNP markers could explain many of the associations for plant height, length of main spike, number of spikelets on main spike, grain number per plant, and 1000-grain weight, etc. The SNP associations for the observed traits are generally clustered in specific chromosome regions of the wheat genome, mainly in 2A, 5A, 6A, 7A, 1B, and 6B chromosomes. This study demonstrates that association mapping can complement and enhance previous QTL analyses and provide additional information for marker-assisted selection.
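
    As a rough illustration of how the extent of LD between two markers can be quantified, the sketch below computes the common r^2 statistic for a pair of biallelic loci. The abstract does not say which LD measure or software was used, so r^2 is an assumption here, as is treating each accession as a single haplotype coded 0/1. The function name `ld_r2` is hypothetical.

```python
def ld_r2(locus_a, locus_b):
    """r^2 between two biallelic loci, each coded 0/1 per haplotype."""
    n = len(locus_a)
    p_a = sum(locus_a) / n                      # frequency of allele 1 at locus A
    p_b = sum(locus_b) / n                      # frequency of allele 1 at locus B
    p_ab = sum(1 for a, b in zip(locus_a, locus_b) if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                        # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both loci must be polymorphic")
    return d * d / denom


# Eight toy haplotypes; the loci disagree at one haplotype only.
a = [0, 0, 1, 1, 0, 1, 1, 0]
b = [0, 0, 1, 1, 0, 1, 0, 0]
print(ld_r2(a, b))
```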

  12. Identification of a SNP marker associated with WB242 nematode resistance in sugar beet

    USDA-ARS's Scientific Manuscript database

    The beet-cyst nematode (Heterodera schachtii Schmidt) is one of the major diseases of sugar beet. The identification of molecular markers associated to the nematode resistance would be helpful for developing resistant varieties. The aim of this study was the identification of SNP (Single Nucleotide ...

  13. Utilization of a whole genome SNP panel for efficient genetic mapping in the mouse

    PubMed Central

    Moran, Jennifer L.; Bolton, Andrew D.; Tran, Pamela V.; Brown, Alison; Dwyer, Noelle D.; Manning, Danielle K.; Bjork, Bryan C.; Li, Cheng; Montgomery, Kate; Siepka, Sandra M.; Vitaterna, Martha Hotz; Takahashi, Joseph S.; Wiltshire, Tim; Kwiatkowski, David J.; Kucherlapati, Raju; Beier, David R.

    2006-01-01

    Phenotype-driven genetics can be used to create mouse models of human disease and birth defects. However, the utility of these mutant models is limited without identification of the causal gene. To facilitate genetic mapping, we developed a fixed single nucleotide polymorphism (SNP) panel of 394 SNPs as an alternative to analyses using simple sequence length polymorphism (SSLP) marker mapping. With the SNP panel, chromosomal locations for 22 monogenic mutants were identified. The average number of affected progeny genotyped for mapped monogenic mutations is nine. Map locations for several mutants have been obtained with as few as four affected progeny. The average size of genetic intervals obtained for these mutants is 43 Mb, with a range of 17–83 Mb. Thus, our SNP panel allows for identification of moderate resolution map position with small numbers of mice in a high-throughput manner. Importantly, the panel is suitable for mapping crosses from many inbred and wild-derived inbred strain combinations. The chromosomal localizations obtained with the SNP panel allow one to quickly distinguish between potentially novel loci or remutations in known genes, and facilitates fine mapping and positional cloning. By using this approach, we identified DNA sequence changes in two ethylnitrosourea-induced mutants. PMID:16461637

  14. Optimal design of low-density SNP arrays for genomic prediction: algorithm and applications

    USDA-ARS's Scientific Manuscript database

    Low-density (LD) single nucleotide polymorphism (SNP) arrays provide a cost-effective solution for genomic prediction and selection, but algorithms and computational tools are needed for their optimal design. A multiple-objective, local optimization (MOLO) algorithm was developed for design of optim...

  15. Association mapping of resistance to leaf rust in emmer wheat using high throughput SNP markers

    USDA-ARS's Scientific Manuscript database

    Emmer wheat (Triticum turgidum L. subsp. dicoccum) is known to be a useful source of genes for many desirable characters for improvement of modern cultivated wheat. Recently, a panel of 181 emmer wheat accessions has been genotyped with wheat 9K SNP (single nucleotide polymorphism) markers and exte...

  16. An innovative SNP genotyping method adapting to multiple platforms and throughputs

    USDA-ARS's Scientific Manuscript database

    Single nucleotide polymorphisms (SNPs) are highly abundant, distributed throughout the genome in various species, and therefore they are widely used as genetic markers. However, the usefulness of this genetic tool relies heavily on the availability of user-friendly SNP genotyping methods. We have d...

  17. MAFsnp: A Multi-Sample Accurate and Flexible SNP Caller Using Next-Generation Sequencing Data.

    PubMed

    Hu, Jiyuan; Li, Tengfei; Xiu, Zidi; Zhang, Hong

    2015-01-01

    Most existing statistical methods developed for calling single nucleotide polymorphisms (SNPs) using next-generation sequencing (NGS) data are based on Bayesian frameworks, and no existing SNP caller produces p-values for calling SNPs in a frequentist framework. To fill this gap, we develop a new method, MAFsnp, a Multiple-sample based Accurate and Flexible algorithm for calling SNPs with NGS data. MAFsnp is based on an estimated likelihood ratio test (eLRT) statistic. In practical situations, the parameter involved is very close to the boundary of the parameter space, so the standard large-sample property is not suitable for evaluating the finite-sample distribution of the eLRT statistic. Observing that the distribution of the test statistic is a mixture of zero and a continuous part, we propose to model the test statistic with a novel two-parameter mixture distribution. Once the parameters in the mixture distribution are estimated, p-values can be easily calculated for detecting SNPs, and the multiple-testing corrected p-values can be used to control false discovery rate (FDR) at any pre-specified level. With simulated data, MAFsnp is shown to have much better control of FDR than the existing SNP callers. Through the application to two real datasets, MAFsnp is also shown to outperform the existing SNP callers in terms of calling accuracy. An R package "MAFsnp" implementing the new SNP caller is freely available at http://homepage.fudan.edu.cn/zhangh/softwares/.
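
    The last step described above, turning per-site p-values into SNP calls at a chosen FDR level, can be sketched as follows. The abstract does not name the correction procedure, so the Benjamini-Hochberg adjustment is assumed here. The function `bh_adjust` and the p-values are hypothetical and are not part of the MAFsnp R package.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Five candidate sites with made-up p-values; call SNPs at FDR 0.05.
pvals = [0.001, 0.008, 0.039, 0.041, 0.60]
adj = bh_adjust(pvals)
calls = [q <= 0.05 for q in adj]
print(calls)
```

    Sites three and four have raw p-values below 0.05 but are not called after adjustment, which is the behaviour a pre-specified FDR level is meant to enforce.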

  18. Assessing the Clinical Utility of SNP Microarray for Prader-Willi Syndrome due to Uniparental Disomy.

    PubMed

    Santoro, Stephanie L; Hashimoto, Sayaka; McKinney, Aimee; Mihalic Mosher, Theresa; Pyatt, Robert; Reshmi, Shalini C; Astbury, Caroline; Hickey, Scott E

    2017-01-01

    Maternal uniparental disomy (UPD) 15 is one of the molecular causes of Prader-Willi syndrome (PWS), a multisystem disorder which presents with neonatal hypotonia and feeding difficulty. Current diagnostic algorithms differ regarding the use of SNP microarray to detect PWS. We retrospectively examined the frequency with which SNP microarray could identify regions of homozygosity (ROH) in patients with PWS. We determined that 7/12 (58%) patients with previously confirmed PWS by methylation analysis and microsatellite-positive UPD studies had ROH (>10 Mb) by SNP microarray. Additional assessment of 5,000 clinical microarrays, performed from 2013 to present, determined that only a single case of ROH for chromosome 15 was not caused by an imprinting disorder or identity by descent. We observed that ROH for chromosome 15 is rarely incidental and strongly associated with hypotonic infants having features of PWS. Although UPD microsatellite studies remain essential to definitively establish the presence of UPD, SNP microarray has important utility in the timely diagnostic algorithm for PWS. © 2017 S. Karger AG, Basel.

  19. Analysis of gene-derived SNP marker polymorphism in wheat (Triticum aestivum L.)

    USDA-ARS's Scientific Manuscript database

    In this study, we analyzed 359 single nucleotide polymorphisms (SNPs) previously discovered in intron sequences of wheat genes to evaluate SNP marker polymorphism in common wheat (Triticum aestivum L.). These SNPs showed an average polymorphism information content (PIC) of 0.181 among 20 US wheat c...
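
    For readers unfamiliar with the polymorphism information content (PIC) mentioned above, here is a minimal sketch of the standard definition, PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2, computed from allele frequencies at one marker. The truncated abstract does not show how the authors computed PIC, so this formula is an assumption. The function name `pic` and the frequencies are illustrative.

```python
from itertools import combinations


def pic(freqs):
    """Polymorphism information content from allele frequencies at one locus."""
    homozygosity = sum(p * p for p in freqs)
    cross_term = sum(2 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1 - homozygosity - cross_term


# A biallelic SNP reaches its maximum PIC of 0.375 at equal allele frequencies.
print(pic([0.5, 0.5]))
print(round(pic([0.9, 0.1]), 4))
```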

  20. DHOEM: a statistical simulation software for simulating new markers in real SNP marker data.

    PubMed

    Jacquin, Laval; Cao, Tuong-Vi; Grenier, Cécile; Ahmadi, Nourollah

    2015-12-03

    Numerous simulation tools based on specific assumptions have been proposed to simulate populations. Here we present a simulation tool named DHOEM (densification of haplotypes by loess regression and maximum likelihood) which is free from population assumptions and simulates new markers in real SNP marker data. The main objective of DHOEM is to generate a new population, which incorporates real and simulated SNP by statistical learning from an initial population, which match the realized features of the latter. To demonstrate DHOEM's abilities, we used a sample of 704 haplotypes for 12 chromosomes with 8336 SNP from a synthetic population, used for breeding upland rice in Latin America. The distributions of allele frequencies, pairwise SNP LD coefficients and data structures, before and after marker densification of the associated marker data set, were shown to be in relatively good agreement at moderate degrees of marker densification. DHOEM is a user-friendly tool that allows the user to specify the level of marker density desired, with a user defined minor allele frequency (MAF) limit, which is produced in a reasonable computation time. DHOEM is a user-friendly and useful tool for simulation and methodological studies in quantitative genetics and breeding.

  1. SNP-based high density genetic map and mapping of btwd1 dwarfing gene in barley

    PubMed Central

    Ren, Xifeng; Wang, Jibin; Liu, Lipan; Sun, Genlou; Li, Chengdao; Luo, Hong; Sun, Dongfa

    2016-01-01

    A high-density linkage map is a valuable tool for functional genomics and breeding. A newly developed sequence-based marker technology, restriction site associated DNA (RAD) sequencing, has been proven to be powerful for the rapid discovery and genotyping of genome-wide single nucleotide polymorphism (SNP) markers and for high-density genetic map construction. The objective of this research was to construct a high-density genetic map of barley using RAD sequencing. 1894 high-quality SNP markers were developed and mapped onto all seven chromosomes together with 68 SSR markers. These 1962 markers constituted a total genetic length of 1375.8 cM and an average of 0.7 cM between adjacent loci. The number of markers within each linkage group ranged from 209 to 396. The new recessive dwarfing gene btwd1 in Huaai 11 was mapped onto the high-density linkage map. The result showed that btwd1 is positioned between SNP markers 7HL_6335336 and 7_249275418 with a genetic distance of 0.9 cM and 0.7 cM on chromosome 7H, respectively. The SNP-based high-density genetic map developed and the dwarfing gene btwd1 mapped in this study provide critical information for positional cloning of the btwd1 gene and molecular breeding of barley. PMID:27530597
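
    The marker totals and average spacing quoted above can be checked with a few lines of arithmetic. The sketch assumes one linkage group per chromosome, so that n markers on seven groups give n - 7 adjacent-marker intervals. The abstract does not state how its 0.7 cM average was derived, and the variable names are illustrative.

```python
snp_markers = 1894
ssr_markers = 68
total_length_cm = 1375.8
linkage_groups = 7  # assumed: one linkage group per barley chromosome

total_markers = snp_markers + ssr_markers
# Each linkage group with k markers contributes k - 1 adjacent intervals.
intervals = total_markers - linkage_groups
mean_spacing_cm = total_length_cm / intervals

print(total_markers)              # 1962
print(round(mean_spacing_cm, 1))  # 0.7
```

    Dividing by the marker count instead of the interval count gives the same value after rounding, so the reported 0.7 cM is consistent under either convention.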

  2. SNP-based genotyping in lentil: linking sequence information with phenotypes

    USDA-ARS's Scientific Manuscript database

    Lentil (Lens culinaris) has been late to enter the world of high throughput molecular analysis due to a general lack of genomic resources. Using a 454 sequencing-based approach, SNPs have been identified in genes across the lentil genome. Several hundred have been turned into single SNP KASP assay...

  3. SNP discovery in candidate adaptive genes using exon capture in a free-ranging alpine ungulate

    Treesearch

    Gretchen H. Roffler; Stephen J. Amish; Seth Smith; Ted Cosart; Marty Kardos; Michael K. Schwartz; Gordon Luikart

    2016-01-01

    Identification of genes underlying genomic signatures of natural selection is key to understanding adaptation to local conditions. We used targeted resequencing to identify SNP markers in 5321 candidate adaptive genes associated with known immunological, metabolic and growth functions in ovids and other ungulates. We selectively targeted 8161 exons in protein-coding...

  4. Association of Agronomic Traits with SNP Markers in Durum Wheat (Triticum turgidum L. durum (Desf.))

    PubMed Central

    Hu, Xin; Ren, Jing; Ren, Xifeng; Huang, Sisi; Sabiel, Salih A. I.; Luo, Mingcheng; Nevo, Eviatar; Fu, Chunjie; Peng, Junhua; Sun, Dongfa

    2015-01-01

    Association mapping is a powerful approach to detect associations between traits of interest and genetic markers based on linkage disequilibrium (LD) in molecular plant breeding. In this study, 150 accessions of durum wheat germplasm of worldwide origin (Triticum turgidum spp. durum) were genotyped using 1,366 SNP markers. The extent of LD on each chromosome was evaluated. Association of single nucleotide polymorphism (SNP) markers with ten agronomic traits measured in four consecutive years was analyzed under a mixed linear model (MLM). Two hundred and one significant association pairs were detected in the four years. Several markers were associated with one trait, and some markers were associated with multiple traits. Some of the associated markers were in agreement with previous quantitative trait loci (QTL) analyses. The function and homology analyses of the corresponding ESTs of some SNP markers could explain many of the associations for plant height, length of main spike, number of spikelets on main spike, grain number per plant, and 1000-grain weight, etc. The SNP associations for the observed traits are generally clustered in specific chromosome regions of the wheat genome, mainly in 2A, 5A, 6A, 7A, 1B, and 6B chromosomes. This study demonstrates that association mapping can complement and enhance previous QTL analyses and provide additional information for marker-assisted selection. PMID:26110423

  5. Priming of seeds with nitric oxide donor sodium nitroprusside (SNP) alleviates the inhibition on wheat seed germination by salt stress.

    PubMed

    Duan, Pei; Ding, Feng; Wang, Fang; Wang, Bao-Shan

    2007-06-01

    The effect of SNP, an NO donor, on seed germination of wheat (Triticum aestivum L. cv. 'DK961') under salt stress was studied. The results showed that priming of seeds with 0.06 mmol/L SNP for 24 h markedly alleviated the decrease of the germination percentage, germination index, vigor index and imbibition rate of wheat seeds under salt stress. SNP significantly alleviated the decrease of the beta-amylase activity but had almost no effect on the alpha-amylase activity of wheat seeds under salt stress. SNP slightly increased the alpha-amylase isoenzymes (especially isoenzyme 3) and significantly increased the beta-amylase isoenzymes (especially isoenzymes d, e, f and g). SNP pretreatment decreased the Na(+) content but increased the K(+) content, resulting in a marked increase of the K(+)/Na(+) ratio of wheat seedlings under salt stress. These results suggested that NO is involved in promoting wheat seed germination under salt stress by increasing the beta-amylase activity.

  6. Estimating the effect of SNP genotype on quantitative traits from pooled DNA samples

    PubMed Central

    2012-01-01

    Background Studies to detect associations between DNA markers and traits of interest in humans and livestock benefit from increasing the number of individuals genotyped. Performing association studies on pooled DNA samples can provide greater power for a given cost. For quantitative traits, the effect of an SNP is measured in the units of the trait and here we propose and demonstrate a method to estimate SNP effects on quantitative traits from pooled DNA data. Methods To obtain estimates of SNP effects from pooled DNA samples, we used logistic regression of estimated allele frequencies in pools on phenotype. The method was tested on a simulated dataset, and a beef cattle dataset using a model that included principal components from a genomic correlation matrix derived from the allele frequencies estimated from the pooled samples. The performance of the obtained estimates was evaluated by comparison with estimates obtained using regression of phenotype on genotype from individual samples of DNA. Results For the simulated data, the estimates of SNP effects from pooled DNA are similar but asymptotically different to those from individual DNA data. Error in estimating allele frequencies had a large effect on the accuracy of estimated SNP effects. For the beef cattle dataset, the principal components of the genomic correlation matrix from pooled DNA were consistent with known breed groups, and could be used to account for population stratification. Correctly modeling the contemporary group structure was essential to achieve estimates similar to those from individual DNA data, and pooling DNA from individuals within groups was superior to pooling DNA across groups. For a fixed number of assays, pooled DNA samples produced results that were more correlated with results from individual genotyping data than were results from one random individual assayed from each pool. Conclusions Use of logistic regression of allele frequency on phenotype makes it possible to estimate SNP

  7. Kernel machine SNP-set testing under multiple candidate kernels.

    PubMed

    Wu, Michael C; Maity, Arnab; Lee, Seunggeun; Simmons, Elizabeth M; Harmon, Quaker E; Lin, Xinyi; Engel, Stephanie M; Molldrem, Jeffrey J; Armistead, Paul M

    2013-04-01

    Joint testing for the cumulative effect of multiple single-nucleotide polymorphisms grouped on the basis of prior biological knowledge has become a popular and powerful strategy for the analysis of large-scale genetic association studies. The kernel machine (KM)-testing framework is a useful approach that has been proposed for testing associations between multiple genetic variants and many different types of complex traits by comparing pairwise similarity in phenotype between subjects to pairwise similarity in genotype, with similarity in genotype defined via a kernel function. An advantage of the KM framework is its flexibility: choosing different kernel functions allows for different assumptions concerning the underlying model and can allow for improved power. In practice, it is difficult to know which kernel to use a priori because this depends on the unknown underlying trait architecture and selecting the kernel which gives the lowest P-value can lead to inflated type I error. Therefore, we propose practical strategies for KM testing when multiple candidate kernels are present based on constructing composite kernels and based on efficient perturbation procedures. We demonstrate through simulations and real data applications that the procedures protect the type I error rate and can lead to substantially improved power over poor choices of kernels and only modest differences in power vs. using the best candidate kernel.
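
    To make the role of the kernel function concrete, the sketch below builds two commonly used genotype kernels (linear and identity-by-state), combines them into a simple composite, and evaluates a score-type statistic Q = r' K r on phenotype residuals. Everything here is an illustrative assumption: the abstract does not specify which kernels or which composite construction the authors use, the trace scaling is just one plausible choice, and significance would still have to come from a perturbation or similar procedure that is not shown. All function names are hypothetical.

```python
def linear_kernel(G):
    """K[i][j] = sum over SNPs of g_ik * g_jk (genotypes coded 0/1/2)."""
    return [[sum(a * b for a, b in zip(gi, gj)) for gj in G] for gi in G]


def ibs_kernel(G):
    """K[i][j] = fraction of alleles shared identical-by-state."""
    p = len(G[0])
    return [[sum(2 - abs(a - b) for a, b in zip(gi, gj)) / (2 * p) for gj in G]
            for gi in G]


def composite_kernel(kernels):
    """Average the kernels after scaling each to unit trace."""
    n = len(kernels[0])
    out = [[0.0] * n for _ in range(n)]
    for K in kernels:
        trace = sum(K[i][i] for i in range(n))
        for i in range(n):
            for j in range(n):
                out[i][j] += K[i][j] / trace / len(kernels)
    return out


def q_statistic(y, K):
    """Q = r' K r with r the residuals from an intercept-only null model."""
    mean_y = sum(y) / len(y)
    r = [v - mean_y for v in y]
    n = len(y)
    return sum(r[i] * K[i][j] * r[j] for i in range(n) for j in range(n))


# Three subjects, four SNPs; subjects 0 and 1 share identical genotypes.
G = [[0, 1, 2, 0], [0, 1, 2, 0], [2, 1, 0, 1]]
y = [1.0, 1.2, 3.0]
K_comp = composite_kernel([linear_kernel(G), ibs_kernel(G)])
print(q_statistic(y, ibs_kernel(G)), q_statistic(y, K_comp))
```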

  8. Development and Validation of a High-Density SNP Genotyping Array for African Oil Palm.

    PubMed

    Kwong, Qi Bin; Teh, Chee Keng; Ong, Ai Ling; Heng, Huey Ying; Lee, Heng Leng; Mohamed, Mohaimi; Low, Joel Zi-Bin; Apparow, Sukganah; Chew, Fook Tim; Mayes, Sean; Kulaveerasingam, Harikrishna; Tammi, Martti; Appleton, David Ross

    2016-08-01

    High-density single nucleotide polymorphism (SNP) genotyping arrays are powerful tools that can measure the level of genetic polymorphism within a population. To develop a whole-genome SNP array for oil palms, SNP discovery was performed using deep resequencing of eight libraries derived from 132 Elaeis guineensis and Elaeis oleifera palms belonging to 59 origins, resulting in the discovery of >3 million putative SNPs. After SNP filtering, the Illumina OP200K custom array was built with 170 860 successful probes. Phenetic clustering analysis revealed that the array could distinguish between palms of different origins in a way consistent with pedigree records. Genome-wide linkage disequilibrium declined more slowly for the commercial populations (ranging from 120 kb at r(2) = 0.43 to 146 kb at r(2) = 0.50) when compared with the semi-wild populations (19.5 kb at r(2) = 0.22). Genetic fixation mapping comparing the semi-wild and commercial population identified 321 selective sweeps. A genome-wide association study (GWAS) detected a significant peak on chromosome 2 associated with the polygenic component of the shell thickness trait (based on the trait shell-to-fruit; S/F %) in tenera palms. Testing of a genomic selection model on the same trait resulted in good prediction accuracy (r = 0.65) with 42% of the S/F % variation explained. The first high-density SNP genotyping array for oil palm has been developed and shown to be robust for use in genetic studies and with potential for developing early trait prediction to shorten the oil palm breeding cycle. Copyright © 2016 The Author. Published by Elsevier Inc. All rights reserved.

  9. Detecting SNP combinations discriminating human populations from HapMap data.

    PubMed

    Ding, XiaoJun; Li, Min; Gu, HaiHua; Peng, XiaoQing; Zhang, Zhen; Wu, FangXiang

    2015-03-01

    The genomes of different human beings are similar. There are only a relatively small number of genetic differences between people, and these differences are well worth studying. Researchers have proposed the fixation index FST to find the single nucleotide polymorphisms (SNPs) that reflect human population differences. However, most SNPs interact and work together, which leads to the differences among human populations. The number of all possible m-locus combinations chosen from n SNPs grows exponentially. Most methods focus on 2-locus interactions. In this paper, we propose a novel method to find a new coordinate system under which the energy distributions of different populations are quite different. We select candidate SNPs from n SNPs by using the information of the axes in the coordinate system. The number of candidate SNPs is small, thus SNP-SNP interactions can be searched efficiently. The method can also find interactions of more than two loci. These interactions should be able to reflect the evolution of human populations from another perspective. The numbers of SNP-SNP interactions are regarded as the differences between pairwise populations and a hierarchical clustering algorithm is used to construct the evolutionary tree. In the experiments, we apply the method to SNP data of four chromosomes separately and the trees constructed on these four chromosomes are highly consistent. Furthermore, the trees are also consistent with previous studies, which indicates that evolutionary information is well mined. The method provides new insight into the analysis of human population differences.
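
    The fixation index used as the single-SNP baseline above can be sketched in a few lines. The version below is Wright's FST = (H_T - H_S) / H_T for one biallelic SNP, computed from per-population allele frequencies with equal population weights. The abstract does not say which FST estimator is meant, so this is an assumption, and the function name `fst` and the frequencies are illustrative.

```python
def fst(freqs):
    """Wright's F_ST for one biallelic SNP.

    freqs -- frequency of the same allele in each population,
             with populations weighted equally (an assumption).
    """
    p_bar = sum(freqs) / len(freqs)
    h_total = 2 * p_bar * (1 - p_bar)           # expected heterozygosity, pooled
    if h_total == 0:
        return 0.0                              # monomorphic everywhere
    h_within = sum(2 * p * (1 - p) for p in freqs) / len(freqs)
    return (h_total - h_within) / h_total


print(fst([0.5, 0.5]))            # identical populations
print(fst([0.0, 1.0]))            # fixed for different alleles
print(round(fst([0.2, 0.8]), 2))  # intermediate differentiation
```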

  10. Comparative SNP diversity among four Eucalyptus species for genes from secondary metabolite biosynthetic pathways

    PubMed Central

    Külheim, Carsten; Hui Yeoh, Suat; Maintz, Jens; Foley, William J; Moran, Gavin F

    2009-01-01

    Background There is little information about the DNA sequence variation within and between closely related plant species. The combination of re-sequencing technologies, large-scale DNA pools and availability of reference gene sequences allowed the extensive characterisation of single nucleotide polymorphisms (SNPs) in genes of four biosynthetic pathways leading to the formation of ecologically relevant secondary metabolites in Eucalyptus. With this approach the occurrence and patterns of SNP variation for a set of genes can be compared across different species from the same genus. Results In a single GS-FLX run, we sequenced over 103 Mbp and assembled them to approximately 50 kbp of reference sequences. An average sequencing depth of 315 reads per nucleotide site was achieved for all four eucalypt species, Eucalyptus globulus, E. nitens, E. camaldulensis and E. loxophleba. We sequenced 23 genes from 1,764 individuals and discovered 8,631 SNPs across the species, with about 1.5 times as many SNPs per kbp in the introns compared to exons. The exons of the two closely related species (E. globulus and E. nitens) had similar numbers of SNPs at synonymous and non-synonymous sites. These species also had similar levels of SNP diversity, whereas E. camaldulensis and E. loxophleba had much higher SNP diversity. Neither the pathway nor the position in the pathway influenced gene diversity. The four species share between 20 and 43% of the SNPs in these genes. Conclusion By using conservative statistical detection methods, we were confident about the validity of each SNP. With numerous individuals sampled over the geographical range of each species, we discovered one SNP in every 33 bp for E. nitens and one in every 31 bp in E. globulus. In contrast, the more distantly related species contained more SNPs: one in every 16 bp for E. camaldulensis and one in 17 bp for E. loxophleba, which is, to the best of our knowledge, the highest frequency of SNPs described in woody plant

  11. Multiplex single nucleotide polymorphism (SNP) assay for detection of soybean mosaic virus resistance genes in soybean.

    PubMed

    Shi, Ainong; Chen, Pengyin; Vierling, Richard; Zheng, Cuming; Li, Dexiao; Dong, Dekun; Shakiba, Ehsan; Cervantez, Innan

    2011-02-01

    Soybean mosaic virus (SMV) is one of the most destructive viral diseases in soybean (Glycine max). Three independent loci for SMV resistance have been identified in soybean germplasm. The use of genetic resistance is the most effective method of controlling this disease. Marker assisted selection (MAS) has become very important and useful in the effort of selecting genes for SMV resistance. Single nucleotide polymorphism (SNP), because of its abundance and high-throughput potential, is a powerful tool in genome mapping, association studies, diversity analysis, and tagging of important genes in plant genomics. In this study, a 10 SNPs plus one insert/deletion (InDel) multiplex assay was developed for SMV resistance: two SNPs were developed from the candidate gene 3gG2 at Rsv1 locus, two SNPs selected from the clone N11PF linked to Rsv1, one 'BARC' SNP screened from soybean chromosome 13 [linkage group (LG) F] near Rsv1, two 'BARC' SNPs from probe A519 linked to Rsv3, one 'BARC' SNP from chromosome 14 (LG B2) near Rsv3, and two 'BARC' SNPs from chromosome 2 (LG D1b) near Rsv4, plus one InDel marker from expressed sequence tag (EST) AW307114 linked to Rsv4. This 11 SNP/InDel multiplex assay showed polymorphism among 47 diverse soybean germplasm, indicating this assay can be used to investigate the mode of inheritance in a SMV resistant soybean line carrying Rsv1, Rsv3, and/or Rsv4 through a segregating population with phenotypic data, and to select a specific gene or pyramid two or three genes for SMV resistance through MAS in soybean breeding program. The presence of two SMV resistance genes (Rsv1 and Rsv3) in J05 soybean was confirmed by the SNP assay.

  12. SNP-set analysis replicates acute lung injury genetic risk factors

    PubMed Central

    2012-01-01

    Background We used a gene-based replication strategy to test the reproducibility of prior acute lung injury (ALI) candidate gene associations. Methods We phenotyped 474 patients from a prospective severe trauma cohort study for ALI. Genomic DNA from subjects’ blood was genotyped using the IBC chip, a multiplex single nucleotide polymorphism (SNP) array. Results were filtered for 25 candidate genes selected using prespecified literature search criteria and present on the IBC platform. For each gene, we grouped SNPs according to haplotype blocks and tested the joint effect of all SNPs on susceptibility to ALI using the SNP-set kernel association test. Results were compared to single SNP analysis of the candidate SNPs. Analyses were separate for genetically determined ancestry (African or European). Results We identified 4 genes in African ancestry and 2 in European ancestry trauma subjects which replicated their associations with ALI. Ours is the first replication of IL6, IL10, IRAK3, and VEGFA associations in non-European populations with ALI. Only one gene, VEGFA, demonstrated association with ALI in both ancestries, with distinct haplotype blocks in each ancestry driving the association. We also report the association between trauma-associated ALI and NFKBIA in European ancestry subjects. Conclusions Prior ALI genetic associations are reproducible and replicate in a trauma cohort. Kernel-based SNP-set analysis is a more powerful method to detect ALI association than single SNP analysis, and thus may be more useful for replication testing. Further, gene-based replication can extend candidate gene associations to diverse ethnicities. PMID:22742663

  13. SNP discovery in the transcriptome of white Pacific shrimp Litopenaeus vannamei by next generation sequencing.

    PubMed

    Yu, Yang; Wei, Jiankai; Zhang, Xiaojun; Liu, Jingwen; Liu, Chengzhang; Li, Fuhua; Xiang, Jianhai

    2014-01-01

    The application of next generation sequencing technology has greatly facilitated high throughput single nucleotide polymorphism (SNP) discovery and genotyping in genetic research. In the present study, SNPs were discovered based on two transcriptomes of Litopenaeus vannamei (L. vannamei) generated from Illumina sequencing platform HiSeq 2000. One transcriptome of L. vannamei was obtained through sequencing on the RNA from larvae at mysis stage and its reference sequence was de novo assembled. The data from another transcriptome were downloaded from NCBI and the reads of the two transcriptomes were mapped separately to the assembled reference by BWA. SNP calling was performed using SAMtools. A total of 58,717 and 36,277 SNPs with high quality were predicted from the two transcriptomes, respectively. SNP calling was also performed using the reads of two transcriptomes together, and a total of 96,040 SNPs with high quality were predicted. Among these 96,040 SNPs, 5,242 and 29,129 were predicted as non-synonymous and synonymous SNPs respectively. Characterization analysis of the predicted SNPs in L. vannamei showed that the estimated SNP frequency was 0.21% (one SNP per 476 bp) and the estimated ratio for transition to transversion was 2.0. Fifty SNPs were randomly selected for validation by Sanger sequencing after PCR amplification and 76% of SNPs were confirmed, which indicated that the SNPs predicted in this study were reliable. These SNPs will be very useful for genetic study in L. vannamei, especially for the high density linkage map construction and genome-wide association studies.
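
    Two of the summary statistics quoted above, the SNP frequency and the transition-to-transversion ratio, are easy to reproduce in a short sketch. The classification rule is standard (purine to purine or pyrimidine to pyrimidine is a transition). The toy SNP list and the function names are hypothetical and do not come from the study's data.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def is_transition(ref, alt):
    """True for purine<->purine or pyrimidine<->pyrimidine substitutions."""
    pair = {ref, alt}
    return pair <= PURINES or pair <= PYRIMIDINES


def ts_tv_ratio(snps):
    """Ratio of transitions to transversions for (ref, alt) pairs."""
    ts = sum(1 for ref, alt in snps if is_transition(ref, alt))
    tv = len(snps) - ts
    return ts / tv


# Toy list: four transitions and two transversions.
toy_snps = [("A", "G"), ("C", "T"), ("G", "A"), ("T", "C"), ("A", "C"), ("G", "T")]
print(ts_tv_ratio(toy_snps))

# Checks on figures from the abstract.
snp_frequency_percent = 100 / 476      # one SNP per 476 bp
validated = round(0.76 * 50)           # 76% of 50 randomly selected SNPs
print(round(snp_frequency_percent, 2), validated)
```

    One SNP per 476 bp corresponds to about 0.21%, and 76% of 50 tested SNPs is 38 confirmed by Sanger sequencing, so the reported figures are internally consistent.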