Construction of a versatile SNP array for pyramiding useful genes of rice.
Kurokawa, Yusuke; Noda, Tomonori; Yamagata, Yoshiyuki; Angeles-Shim, Rosalyn; Sunohara, Hidehiko; Uehara, Kanako; Furuta, Tomoyuki; Nagai, Keisuke; Jena, Kshirod Kumar; Yasui, Hideshi; Yoshimura, Atsushi; Ashikari, Motoyuki; Doi, Kazuyuki
2016-01-01
DNA marker-assisted selection (MAS) has become an indispensable component of breeding. Single nucleotide polymorphisms (SNPs) are the most frequent polymorphisms in the rice genome. However, SNP markers are not readily employed in MAS because of limitations in genotyping platforms. Here the authors report a Golden Gate SNP array that targets specific genes controlling yield-related traits and biotic stress resistance in rice. As a first step, the SNP genotypes were surveyed in 31 parental varieties using the Affymetrix Rice 44K SNP microarray. The haplotype information for 16 target genes was then converted to the Golden Gate platform with 143-plex markers. Haplotypes for the 14 useful alleles are unique and can discriminate among all other varieties. The genotyping consistency between the Affymetrix microarray and the Golden Gate array was 92.8%, and the accuracy of the Golden Gate array was confirmed in 3 F2 segregating populations. The concept of haplotype-based selection using the constructed SNP array was thereby demonstrated. Copyright © 2015 The Authors. Published by Elsevier Ireland Ltd. All rights reserved.
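A between-platform genotyping consistency such as the one reported above can be illustrated with a minimal Python sketch. The abstract does not say how the 92.8% figure was computed; the sketch assumes the simplest definition (share of markers with identical calls, skipping markers with a missing call on either platform). The function name and the toy calls are hypothetical.

```python
def genotype_concordance(calls_a, calls_b, missing=None):
    """Fraction of markers called identically on two platforms.

    Markers with a missing call on either platform are skipped
    (an assumption; the abstract does not state how they were handled).
    """
    if len(calls_a) != len(calls_b):
        raise ValueError("both platforms must report the same markers")
    compared = [(a, b) for a, b in zip(calls_a, calls_b)
                if a != missing and b != missing]
    if not compared:
        raise ValueError("no markers with calls on both platforms")
    matches = sum(1 for a, b in compared if a == b)
    return matches / len(compared)


# Hypothetical toy calls for six markers (not data from the study).
microarray = ["AA", "AB", "BB", "AA", None, "AB"]
golden_gate = ["AA", "AB", "BB", "AB", "AA", "AB"]
rate = genotype_concordance(microarray, golden_gate)
print(f"concordance: {rate:.1%}")
```

Here five markers have calls on both platforms and four of them agree, so the toy concordance is 80%.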
Hernandez-Ferrer, Carles; Quintela Garcia, Ines; Danielski, Katharina; Carracedo, Ángel; Pérez-Jurado, Luis A; González, Juan R
2015-05-20
The well-known Genome-Wide Association Studies (GWAS) have led to many scientific discoveries using SNP data. Even so, they have not been able to explain the full heritability of complex diseases. Now, other structural variants such as copy number variants or DNA inversions, either germ-line or in mosaicism events, are being studied. We present the R package affy2sv to pre-process Affymetrix CytoScan HD/750k arrays (and also Genome-Wide SNP 5.0/6.0 and Axiom arrays) in structural variant studies. We illustrate the capabilities of affy2sv using two different complete pipelines on real data: the first performs a GWAS and a mosaic alteration detection study, and the second detects CNVs and performs inversion calling. Both examples presented in the article show how affy2sv can be used as part of more complex pipelines aimed at analyzing Affymetrix SNP array data in genetic association studies where different types of structural variants are considered.
ITALICS: an algorithm for normalization and DNA copy number calling for Affymetrix SNP arrays.
Rigaill, Guillem; Hupé, Philippe; Almeida, Anna; La Rosa, Philippe; Meyniel, Jean-Philippe; Decraene, Charles; Barillot, Emmanuel
2008-03-15
Affymetrix SNP arrays can be used to determine the DNA copy number measurement of 11 000-500 000 SNPs along the genome. Their high density facilitates the precise localization of genomic alterations and makes them a powerful tool for studies of cancers and copy number polymorphism. Like other microarray technologies, they are influenced by non-relevant sources of variation that require correction. Moreover, the amplitude of variation induced by non-relevant effects is similar to or greater than the biologically relevant effect (i.e. true copy number), making it difficult to estimate non-relevant effects accurately without including the biologically relevant effect. We addressed this problem by developing ITALICS, a normalization method that estimates both biological and non-relevant effects in an alternate, iterative manner, accurately eliminating irrelevant effects. We compared our normalization method with other existing and available methods, and found that ITALICS outperformed these methods for several in-house datasets and one public dataset. These results were validated biologically by quantitative PCR. The R package ITALICS (ITerative and Alternative normaLIzation and Copy number calling for affymetrix Snp arrays) has been submitted to Bioconductor.
Coverage and efficiency in current SNP chips
Ha, Ngoc-Thuy; Freytag, Saskia; Bickeboeller, Heike
2014-01-01
To answer the question as to which commercial high-density SNP chip covers most of the human genome given a fixed budget, we compared the performance of 12 chips of different sizes released by Affymetrix and Illumina for the European, Asian, and African populations. These include Affymetrix' relatively new population-optimized arrays, whose SNP sets are each tailored toward a specific ethnicity. Our evaluation of the chips included the use of two measures, efficiency and cost–benefit ratio, which we developed as supplements to genetic coverage. Unlike coverage, these measures factor in the price of a chip or its substitute size (number of SNPs on chip), allowing comparisons to be drawn between differently priced chips. In this fashion, we identified the Affymetrix population-optimized arrays as offering the most cost-effective coverage for the Asian and African population. For the European population, we established the Illumina Human Omni 2.5-8 as the preferred choice. Interestingly, the Affymetrix chip tailored toward an Eastern Asian subpopulation performed well for all three populations investigated. However, our coverage estimates calculated for all chips proved much lower than those advertised by the producers. All our analyses were based on the 1000 Genome Project as reference population. PMID:24448550
2013-01-02
intensity data from the SNP array were normalized using the Affymetrix GeneChip Targeted Genotyping Analysis Software (GTGS). To assess robustness of SNP...calls, genotypes were called using three algorithms: (i) GTGS, (ii) illuminus (27), and (iii) a heuristic algorithm based on discrete cutoffs of
Unterseer, Sandra; Bauer, Eva; Haberer, Georg; Seidel, Michael; Knaak, Carsten; Ouzunova, Milena; Meitinger, Thomas; Strom, Tim M; Fries, Ruedi; Pausch, Hubert; Bertani, Christofer; Davassi, Alessandro; Mayer, Klaus Fx; Schön, Chris-Carolin
2014-09-29
High density genotyping data are indispensable for genomic analyses of complex traits in animal and crop species. Maize is one of the most important crop plants worldwide; however, a high density SNP genotyping array for analysis of its large and highly dynamic genome has not been available so far. We developed a high density maize SNP array composed of 616,201 variants (SNPs and small indels). Initially, 57 M variants were discovered by sequencing 30 representative temperate maize lines and then stringently filtered for sequence quality scores and predicted conversion performance on the array, resulting in the selection of 1.2 M polymorphic variants assayed on two screening arrays. To identify high-confidence variants, 285 DNA samples from a broad genetic diversity panel of worldwide maize lines including the samples used for sequencing, important founder lines for European maize breeding, hybrids, and proprietary samples with European, US, semi-tropical, and tropical origin were used for experimental validation. We selected 616 k variants according to their performance during validation, support of genotype calls through sequencing data, and physical distribution for further analysis and for the design of the commercially available Affymetrix® Axiom® Maize Genotyping Array. This array is composed of 609,442 SNPs and 6,759 indels. Among these are 116,224 variants in coding regions and 45,655 SNPs of the Illumina® MaizeSNP50 BeadChip for study comparison. In a subset of 45,974 variants, apart from the target SNP additional off-target variants are detected, which show only a minor bias towards intermediate allele frequencies. We performed principal coordinate and admixture analyses to determine the ability of the array to detect and resolve population structure and investigated the extent of LD within a worldwide validation panel.
The high density Affymetrix® Axiom® Maize Genotyping Array is optimized for European and American temperate maize and was developed based on a diverse sample panel by applying stringent quality filter criteria to ensure its suitability for a broad range of applications. With 600 k variants it is the largest currently publicly available genotyping array in crop species.
Gutierrez, Alejandro P; Turner, Frances; Gharbi, Karim; Talbot, Richard; Lowe, Natalie R; Peñaloza, Carolina; McCullough, Mark; Prodöhl, Paulo A; Bean, Tim P; Houston, Ross D
2017-07-05
SNP arrays are enabling tools for high-resolution studies of the genetic basis of complex traits in farmed and wild animals. Oysters are of critical importance in many regions from both an ecological and economic perspective, and oyster aquaculture forms a key component of global food security. The aim of our study was to design a combined-species, medium density SNP array for Pacific oyster (Crassostrea gigas) and European flat oyster (Ostrea edulis), and to test the performance of this array on farmed and wild populations from multiple locations, with a focus on European populations. SNP discovery was carried out by whole-genome sequencing (WGS) of pooled genomic DNA samples from eight C. gigas populations, and restriction site-associated DNA sequencing (RAD-Seq) of 11 geographically diverse O. edulis populations. Nearly 12 million candidate SNPs were discovered and filtered based on several criteria, including preference for SNPs segregating in multiple populations and SNPs with monomorphic flanking regions. An Affymetrix Axiom Custom Array was created and tested on a diverse set of samples (n = 219) showing ∼27 K high quality SNPs for C. gigas and ∼11 K high quality SNPs for O. edulis segregating in these populations. A high proportion of SNPs were segregating in each of the populations, and the array was used to detect population structure and levels of linkage disequilibrium (LD). Further testing of the array on three C. gigas nuclear families (n = 165) revealed that the array can be used to clearly distinguish between both families based on identity-by-state (IBS) clustering parental assignment software. This medium density, combined-species array will be publicly available through Affymetrix, and will be applied for genome-wide association and evolutionary genetic studies, and for genomic selection in oyster breeding programs. Copyright © 2017 Gutierrez et al.
Developing 100K Affymetrix Axiom SNP Array for Polyploid Sugarcane
USDA-ARS's Scientific Manuscript database
Sugarcane genotyping or fingerprinting has long been a daunting task due to its high polyploidy level with large number of chromosomes. Single nucleotide polymorphisms (SNPs) are very abundant DNA sequence variations in the genomes. With the advance of next generation sequencing (NGS) technologies, ...
Wong, Gerard; Leckie, Christopher; Gorringe, Kylie L; Haviv, Izhak; Campbell, Ian G; Kowalczyk, Adam
2010-04-15
High-density single nucleotide polymorphism (SNP) genotyping arrays are efficient and cost effective platforms for the detection of copy number variation (CNV). To ensure accuracy in probe synthesis and to minimize production costs, short oligonucleotide probe sequences are used. The use of short probe sequences limits the specificity of binding targets in the human genome. The specificity of these short probeset sequences has yet to be fully analysed against a normal reference human genome. Sequence similarity can artificially elevate or suppress copy number measurements, and hence reduce the reliability of affected probe readings. For the purpose of detecting narrow CNVs reliably down to the width of a single probeset, sequence similarity is an important issue that needs to be addressed. We surveyed the Affymetrix Human Mapping SNP arrays for probeset sequence similarity against the reference human genome. Utilizing sequence similarity results, we identified a collection of fine-scaled putative CNVs between gender from autosomal probesets whose sequence matches various loci on the sex chromosomes. To detect these variations, we utilized our statistical approach, Detecting REcurrent Copy number change using rank-order Statistics (DRECS), and showed that its performance was superior and more stable than the t-test in detecting CNVs. Through the application of DRECS on the HapMap population datasets with multi-matching probesets filtered, we identified biologically relevant SNPs in aberrant regions across populations with known association to physical traits, such as height, covered by the span of a single probe. This provided empirical confirmation of the existence of naturally occurring narrow CNVs as well as the sensitivity of the Affymetrix SNP array technology in detecting them. The MATLAB implementation of DRECS is available at http://ww2.cs.mu.oz.au/~gwong/DRECS/index.html.
Linkage Disequilibrium And Genome-Wide Association Studies In O. sativa
USDA-ARS's Scientific Manuscript database
There is increasing evidence that genome-wide association studies provide a powerful approach to find the genetic basis of complex phenotypic variation in all kinds of species. For this purpose, we developed the first generation 44K Affymetrix SNP array in rice (see Tung et al. poster). We genotyped...
MMP9 polymorphisms and breast cancer risk: a report from the Shanghai Breast Cancer Genetics Study.
Beeghly-Fadiel, Alicia; Lu, Wei; Shu, Xiao-Ou; Long, Jirong; Cai, Qiuyin; Xiang, Yongbin; Gao, Yu-Tang; Zheng, Wei
2011-04-01
In addition to tumor invasion and angiogenesis, matrix metalloproteinase (MMP)9 also contributes to carcinogenesis and tumor growth. Genetic variation that may influence MMP9 expression was evaluated among participants of the Shanghai Breast Cancer Genetics Study (SBCGS) for associations with breast cancer susceptibility. In stage 1, 11 MMP9 single nucleotide polymorphisms (SNPs) were genotyped by the Affymetrix Targeted Genotyping System and/or the Affymetrix Genome-Wide Human SNP Array 6.0 among 4,227 SBCGS participants. One SNP was further genotyped using the Sequenom iPLEX MassARRAY platform among an additional 6,270 SBCGS participants. Associations with breast cancer risk were evaluated by odds ratios (OR) and 95% confidence intervals (CI) from logistic regression models that included adjustment for age, education, and genotyping stage when appropriate. In Stage 1, rare allele homozygotes for a promoter SNP (rs3918241) or a non-synonymous SNP (rs2274756, R668Q) tended to occur more frequently among breast cancer cases (P value = 0.116 and 0.056, respectively). Given their high linkage disequilibrium (D' = 1.0, r² = 0.97), one (rs3918241) was selected for additional analysis. An association with breast cancer risk was not supported by additional Stage 2 genotyping. In combined analysis, no elevated risk of breast cancer among homozygotes was found (OR: 1.2, 95% CI: 0.8-1.8). Common genetic variation in MMP9 was not found to be significantly associated with breast cancer susceptibility among participants of the Shanghai Breast Cancer Genetics Study.
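To make the odds ratio and confidence interval reporting concrete, the following Python sketch computes an unadjusted odds ratio with a Wald 95% interval from a 2×2 table. This is a simplification: the study estimated adjusted odds ratios from logistic regression models, which the sketch does not reproduce. The function name and the counts are hypothetical and are not data from the study.

```python
import math


def odds_ratio_wald_ci(case_exp, case_unexp, ctrl_exp, ctrl_unexp, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval.

    case_exp / case_unexp: cases with / without the genotype of interest.
    ctrl_exp / ctrl_unexp: controls with / without that genotype.
    z = 1.96 gives an approximate 95% interval.
    """
    cells = (case_exp, case_unexp, ctrl_exp, ctrl_unexp)
    if min(cells) <= 0:
        raise ValueError("all four cell counts must be positive")
    odds_ratio = (case_exp * ctrl_unexp) / (case_unexp * ctrl_exp)
    # Standard error of log(OR) is the root of the summed reciprocal counts.
    se_log_or = math.sqrt(sum(1.0 / c for c in cells))
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts chosen only to exercise the function.
or_, lo, hi = odds_ratio_wald_ci(10, 20, 5, 20)
print(f"OR = {or_:.2f}, 95% CI: {lo:.2f}-{hi:.2f}")
```

An interval that spans 1.0, as in the combined analysis reported above (0.8-1.8), is consistent with no association at the 5% level.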
Development and validation of a high density SNP genotyping array for Atlantic salmon (Salmo salar).
Houston, Ross D; Taggart, John B; Cézard, Timothé; Bekaert, Michaël; Lowe, Natalie R; Downing, Alison; Talbot, Richard; Bishop, Stephen C; Archibald, Alan L; Bron, James E; Penman, David J; Davassi, Alessandro; Brew, Fiona; Tinch, Alan E; Gharbi, Karim; Hamilton, Alastair
2014-02-06
Dense single nucleotide polymorphism (SNP) genotyping arrays provide extensive information on polymorphic variation across the genome of species of interest. Such information can be used in studies of the genetic architecture of quantitative traits and to improve the accuracy of selection in breeding programs. In Atlantic salmon (Salmo salar), these goals are currently hampered by the lack of a high-density SNP genotyping platform. Therefore, the aim of the study was to develop and test a dense Atlantic salmon SNP array. SNP discovery was performed using extensive deep sequencing of Reduced Representation (RR-Seq), Restriction site-Associated DNA (RAD-Seq) and mRNA (RNA-Seq) libraries derived from farmed and wild Atlantic salmon samples (n = 283) resulting in the discovery of > 400 K putative SNPs. An Affymetrix Axiom® myDesign Custom Array was created and tested on samples of animals of wild and farmed origin (n = 96) revealing a total of 132,033 polymorphic SNPs with high call rate, good cluster separation on the array and stable Mendelian inheritance in our sample. At least 38% of these SNPs are from transcribed genomic regions and therefore more likely to include functional variants. Linkage analysis utilising the lack of male recombination in salmonids allowed the mapping of 40,214 SNPs distributed across all 29 pairs of chromosomes, highlighting the extensive genome-wide coverage of the SNPs. An identity-by-state clustering analysis revealed that the array can clearly distinguish between fish of different origins, within and between farmed and wild populations. Finally, Y-chromosome-specific probes included on the array provide an accurate molecular genetic test for sex. This manuscript describes the first high-density SNP genotyping array for Atlantic salmon. 
This array will be publicly available and is likely to be used as a platform for high-resolution genetics research into traits of evolutionary and economic importance in salmonids and in aquaculture breeding programs via genomic selection.
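The identity-by-state comparison mentioned above can be sketched in Python. This is only an illustration of a common pairwise IBS definition, not the authors' pipeline: genotypes are assumed to be coded as counts of one allele (0, 1 or 2) at biallelic SNPs, and the similarity is the mean proportion of alleles shared across markers. The function name and the coding are assumptions.

```python
def ibs_similarity(geno_1, geno_2):
    """Mean proportion of alleles shared identical-by-state by two individuals.

    Genotypes are allele counts (0, 1 or 2) at biallelic SNPs; None marks a
    missing call. At each marker the pair shares 2 - |g1 - g2| alleles.
    """
    if len(geno_1) != len(geno_2):
        raise ValueError("genotype vectors must cover the same markers")
    shared = [2 - abs(a - b) for a, b in zip(geno_1, geno_2)
              if a is not None and b is not None]
    if not shared:
        raise ValueError("no markers called in both individuals")
    return sum(shared) / (2 * len(shared))


# Hypothetical genotype vectors for three fish at five markers.
fish_a = [0, 1, 2, 2, 0]
fish_b = [0, 1, 2, 1, 0]
fish_c = [2, 1, 0, 0, 2]
print(ibs_similarity(fish_a, fish_b))  # near 1: similar individuals
print(ibs_similarity(fish_a, fish_c))  # lower: dissimilar individuals
```

A matrix of such pairwise values (or the corresponding distances, 1 minus similarity) is the usual input to a clustering step.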
Development and validation of a high density SNP genotyping array for Atlantic salmon (Salmo salar)
2014-01-01
Background Dense single nucleotide polymorphism (SNP) genotyping arrays provide extensive information on polymorphic variation across the genome of species of interest. Such information can be used in studies of the genetic architecture of quantitative traits and to improve the accuracy of selection in breeding programs. In Atlantic salmon (Salmo salar), these goals are currently hampered by the lack of a high-density SNP genotyping platform. Therefore, the aim of the study was to develop and test a dense Atlantic salmon SNP array. Results SNP discovery was performed using extensive deep sequencing of Reduced Representation (RR-Seq), Restriction site-Associated DNA (RAD-Seq) and mRNA (RNA-Seq) libraries derived from farmed and wild Atlantic salmon samples (n = 283) resulting in the discovery of > 400 K putative SNPs. An Affymetrix Axiom® myDesign Custom Array was created and tested on samples of animals of wild and farmed origin (n = 96) revealing a total of 132,033 polymorphic SNPs with high call rate, good cluster separation on the array and stable Mendelian inheritance in our sample. At least 38% of these SNPs are from transcribed genomic regions and therefore more likely to include functional variants. Linkage analysis utilising the lack of male recombination in salmonids allowed the mapping of 40,214 SNPs distributed across all 29 pairs of chromosomes, highlighting the extensive genome-wide coverage of the SNPs. An identity-by-state clustering analysis revealed that the array can clearly distinguish between fish of different origins, within and between farmed and wild populations. Finally, Y-chromosome-specific probes included on the array provide an accurate molecular genetic test for sex. Conclusions This manuscript describes the first high-density SNP genotyping array for Atlantic salmon. 
This array will be publicly available and is likely to be used as a platform for high-resolution genetics research into traits of evolutionary and economic importance in salmonids and in aquaculture breeding programs via genomic selection. PMID:24524230
Haraksingh, Rajini R.; Abyzov, Alexej; Gerstein, Mark; Urban, Alexander E.; Snyder, Michael
2011-01-01
Accurate and efficient genome-wide detection of copy number variants (CNVs) is essential for understanding human genomic variation, genome-wide CNV association type studies, cytogenetics research and diagnostics, and independent validation of CNVs identified from sequencing based technologies. Numerous array-based platforms for CNV detection exist, utilizing array Comparative Genome Hybridization (aCGH), Single Nucleotide Polymorphism (SNP) genotyping or both. We have quantitatively assessed the abilities of twelve leading genome-wide CNV detection platforms to accurately detect Gold Standard sets of CNVs in the genome of HapMap CEU sample NA12878, and found significant differences in performance. The technologies analyzed were the NimbleGen 4.2 M, 2.1 M and 3×720 K Whole Genome and CNV focused arrays, the Agilent 1×1 M CGH and High Resolution and 2×400 K CNV and SNP+CGH arrays, the Illumina Human Omni1Quad array and the Affymetrix SNP 6.0 array. The Gold Standards used were a 1000 Genomes Project sequencing-based set of 3997 validated CNVs and an ultra high-resolution aCGH-based set of 756 validated CNVs. We found that sensitivity, total number, size range and breakpoint resolution of CNV calls were highest for CNV focused arrays. Our results are important for cost effective CNV detection and validation for both basic and clinical applications. PMID:22140474
Humble, Emily; Thorne, Michael A S; Forcada, Jaume; Hoffman, Joseph I
2016-08-26
Single nucleotide polymorphism (SNP) discovery is an important goal of many studies. However, the number of 'putative' SNPs discovered from a sequence resource may not provide a reliable indication of the number that will successfully validate with a given genotyping technology. For this it may be necessary to account for factors such as the method used for SNP discovery and the type of sequence data from which it originates, suitability of the SNP flanking sequences for probe design, and genomic context. To explore the relative importance of these and other factors, we used Illumina sequencing to augment an existing Roche 454 transcriptome assembly for the Antarctic fur seal (Arctocephalus gazella). We then mapped the raw Illumina reads to the new hybrid transcriptome using BWA and BOWTIE2 before calling SNPs with GATK. The resulting markers were pooled with two existing sets of SNPs called from the original 454 assembly using NEWBLER and SWAP454. Finally, we explored the extent to which SNPs discovered using these four methods overlapped and predicted the corresponding validation outcomes for both Illumina Infinium iSelect HD and Affymetrix Axiom arrays. Collating markers across all discovery methods resulted in a global list of 34,718 SNPs. However, concordance between the methods was surprisingly poor, with only 51.0 % of SNPs being discovered by more than one method and 13.5 % being called from both the 454 and Illumina datasets. Using a predictive modeling approach, we could also show that SNPs called from the Illumina data were on average more likely to successfully validate, as were SNPs called by more than one method. Above and beyond this pattern, predicted validation outcomes were also consistently better for Affymetrix Axiom arrays. Our results suggest that focusing on SNPs called by more than one method could potentially improve validation outcomes. 
They also highlight possible differences between alternative genotyping technologies that could be explored in future studies of non-model organisms.
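The concordance figure quoted above (the share of SNPs discovered by more than one method) can be illustrated with a short Python sketch. It assumes each discovery method yields a set of SNP identifiers; the method labels and identifiers below are hypothetical placeholders rather than the study's data.

```python
from collections import Counter


def multi_method_fraction(snp_sets):
    """Fraction of all discovered SNPs that were called by more than one method.

    snp_sets maps a method label to the collection of SNP identifiers it called.
    """
    # Count, for every SNP, how many methods reported it.
    support = Counter(snp for snps in snp_sets.values() for snp in set(snps))
    if not support:
        raise ValueError("no SNPs supplied")
    shared = sum(1 for n_methods in support.values() if n_methods > 1)
    return shared / len(support)


# Hypothetical calls from four discovery methods.
calls = {
    "method_1": {"snp1", "snp2", "snp3"},
    "method_2": {"snp2", "snp3", "snp4"},
    "method_3": {"snp5"},
    "method_4": {"snp3", "snp6"},
}
fraction = multi_method_fraction(calls)
print(f"{fraction:.1%} of SNPs were found by more than one method")
```

In this toy example six distinct SNPs are discovered and two of them (snp2, snp3) have support from more than one method.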
Howard, Nicholas P; van de Weg, Eric; Bedford, David S; Peace, Cameron P; Vanderzande, Stijn; Clark, Matthew D; Teh, Soon Li; Cai, Lichun; Luby, James J
2017-01-01
The apple (Malus×domestica) cultivar Honeycrisp has become important economically and as a breeding parent. An earlier study with SSR markers indicated the original recorded pedigree of ‘Honeycrisp’ was incorrect and ‘Keepsake’ was identified as one putative parent, the other being unknown. The objective of this study was to verify ‘Keepsake’ as a parent and identify and genetically describe the unknown parent and its grandparents. A multi-family based dense and high-quality integrated SNP map was created using the apple 8 K Illumina Infinium SNP array. This map was used alongside a large pedigree-connected data set from the RosBREED project to build extended SNP haplotypes and to identify pedigree relationships. ‘Keepsake’ was verified as one parent of ‘Honeycrisp’ and ‘Duchess of Oldenburg’ and ‘Golden Delicious’ were identified as grandparents through the unknown parent. Following this finding, siblings of ‘Honeycrisp’ were identified using the SNP data. Breeding records from several of these siblings suggested that the previously unreported parent is a University of Minnesota selection, MN1627. This selection is no longer available, but now is genetically described through imputed SNP haplotypes. We also present the mosaic grandparental composition of ‘Honeycrisp’ for each of its 17 chromosome pairs. This new pedigree and genetic information will be useful in future pedigree-based genetic studies to connect ‘Honeycrisp’ with other cultivars used widely in apple breeding programs. The created SNP linkage map will benefit future research using the data from the Illumina apple 8 and 20 K and Affymetrix 480 K SNP arrays. PMID:28243452
Yáñez, J M; Naswa, S; López, M E; Bassini, L; Correa, K; Gilbey, J; Bernatchez, L; Norris, A; Neira, R; Lhorente, J P; Schnable, P S; Newman, S; Mileham, A; Deeb, N; Di Genova, A; Maass, A
2016-07-01
A considerable number of single nucleotide polymorphisms (SNPs) are required to elucidate genotype-phenotype associations and determine the molecular basis of important traits. In this work, we carried out de novo SNP discovery accounting for both genome duplication and genetic variation from American and European salmon populations. A total of 9 736 473 nonredundant SNPs were identified across a set of 20 fish by whole-genome sequencing. After applying six bioinformatic filtering steps, 200 K SNPs were selected to develop an Affymetrix Axiom® myDesign Custom Array. This array was used to genotype 480 fish representing wild and farmed salmon from Europe, North America and Chile. A total of 159 099 (79.6%) SNPs were validated as high quality based on clustering properties. A total of 151 509 validated SNPs showed a unique position in the genome. When comparing these SNPs against 238 572 markers currently available in two other Atlantic salmon arrays, only 4.6% of the SNPs overlapped with the panel developed in this study. This novel high-density SNP panel will be very useful for the dissection of economically and ecologically relevant traits, enhancing breeding programmes through genomic selection as well as supporting genetic studies in both wild and farmed populations of Atlantic salmon using high-resolution genomewide information. © 2016 John Wiley & Sons Ltd.
Bassil, Nahla V; Davis, Thomas M; Zhang, Hailong; Ficklin, Stephen; Mittmann, Mike; Webster, Teresa; Mahoney, Lise; Wood, David; Alperin, Elisabeth S; Rosyara, Umesh R; Koehorst-van Putten, Herma; Monfort, Amparo; Sargent, Daniel J; Amaya, Iraida; Denoyes, Beatrice; Bianco, Luca; van Dijk, Thijs; Pirani, Ali; Iezzoni, Amy; Main, Dorrie; Peace, Cameron; Yang, Yilong; Whitaker, Vance; Verma, Sujeet; Bellon, Laurent; Brew, Fiona; Herrera, Raul; van de Weg, Eric
2015-03-07
A high-throughput genotyping platform is needed to enable marker-assisted breeding in the allo-octoploid cultivated strawberry Fragaria × ananassa. Short-read sequences from one diploid and 19 octoploid accessions were aligned to the diploid Fragaria vesca 'Hawaii 4' reference genome to identify single nucleotide polymorphisms (SNPs) and indels for incorporation into a 90 K Affymetrix® Axiom® array. We report the development and preliminary evaluation of this array. About 36 million sequence variants were identified in a 19 member, octoploid germplasm panel. Strategies and filtering pipelines were developed to identify and incorporate markers of several types: di-allelic SNPs (66.6%), multi-allelic SNPs (1.8%), indels (10.1%), and ploidy-reducing "haploSNPs" (11.7%). The remaining SNPs included those discovered in the diploid progenitor F. iinumae (3.9%), and speculative "codon-based" SNPs (5.9%). In genotyping 306 octoploid accessions, SNPs were assigned to six classes with Affymetrix's "SNPolisher" R package. The highest quality classes, PolyHigh Resolution (PHR), No Minor Homozygote (NMH), and Off-Target Variant (OTV) comprised 25%, 38%, and 1% of array markers, respectively. These markers were suitable for genetic studies as demonstrated in the full-sib family 'Holiday' × 'Korona' with the generation of a genetic linkage map consisting of 6,594 PHR SNPs evenly distributed across 28 chromosomes with an average density of approximately one marker per 0.5 cM, thus exceeding our goal of one marker per cM. The Affymetrix IStraw90 Axiom array is the first high-throughput genotyping platform for cultivated strawberry and is commercially available to the worldwide scientific community. The array's high success rate is likely driven by the presence of naturally occurring variation in ploidy level within the nominally octoploid genome, and by effectiveness of the employed array design and ploidy-reducing strategies. 
This array enables genetic analyses including generation of high-density linkage maps, identification of quantitative trait loci for economically important traits, and genome-wide association studies, thus providing a basis for marker-assisted breeding in this high value crop.
Evaluation of copy number variation detection for a SNP array platform
2014-01-01
Background Copy number variations (CNVs) are usually inferred from single nucleotide polymorphism (SNP) arrays using software packages based on particular algorithms. However, the performance of these packages is not clearly understood, which makes it difficult to select one or several of them for CNV detection on a SNP array platform. We selected four publicly available software packages designed for CNV calling from an Affymetrix SNP array: Birdsuite, dChip, Genotyping Console (GTC) and PennCNV. A publicly available dataset generated by array-based comparative genomic hybridization (CGH), with a resolution of 24 million probes per sample, was taken as the “gold standard”. Against this CGH-based dataset, we assessed the success rate, average stability rate, sensitivity, consistency and reproducibility of the four packages. Specifically, we also compared the efficiency of detecting CNVs simultaneously with two, three and all four packages against that of a single package. Results Judged simply by the number of CNVs detected, Birdsuite detected the most and GTC the fewest. We found that Birdsuite and dChip had an obvious detection bias, and GTC seemed inferior because it detected the fewest CNVs. We then investigated the consistency between the calls of each software package and those of the other three. The consistency of dChip was the lowest and that of GTC the highest. When compared with the CGH results, in the matching group GTC called the most matching CNVs and PennCNV-Affy ranked second; in the non-overlapping group, GTC called the fewest CNVs. With regard to the reproducibility of CNV calling, larger CNVs were usually replicated better. PennCNV-Affy showed the best consistency and Birdsuite the poorest.
Conclusion We found that PennCNV outperformed the other three packages in the sensitivity and specificity of CNV calling. Each calling method nevertheless had its own limitations and advantages for different data analyses. Therefore, the optimized calling methods might be identified using multiple algorithms to evaluate the concordance and discordance of SNP array-based CNV calling. PMID:24555668
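Comparing CNV calls with a gold-standard set, as described above, comes down to interval matching. The Python sketch below shows one way this could be done. The abstract does not state the matching criterion that was used, so the 50% reciprocal-overlap threshold here is an assumption, as are the function names and the toy intervals.

```python
def reciprocal_overlap(cnv_a, cnv_b):
    """Reciprocal overlap of two CNVs given as (chrom, start, end), half-open.

    Returns the smaller of the two overlap fractions, or 0.0 when the
    intervals are on different chromosomes or do not intersect.
    """
    if cnv_a[0] != cnv_b[0]:
        return 0.0
    shared = min(cnv_a[2], cnv_b[2]) - max(cnv_a[1], cnv_b[1])
    if shared <= 0:
        return 0.0
    return min(shared / (cnv_a[2] - cnv_a[1]), shared / (cnv_b[2] - cnv_b[1]))


def sensitivity(calls, gold_standard, min_overlap=0.5):
    """Share of gold-standard CNVs matched by at least one call.

    min_overlap=0.5 is an assumed threshold, not taken from the study.
    """
    if not gold_standard:
        raise ValueError("gold standard set is empty")
    detected = sum(
        1 for gold in gold_standard
        if any(reciprocal_overlap(call, gold) >= min_overlap for call in calls)
    )
    return detected / len(gold_standard)


# Hypothetical intervals, not data from the study.
gold = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 100, 300)]
calls = [("chr1", 110, 210), ("chr1", 590, 900), ("chr2", 100, 300)]
print(f"sensitivity: {sensitivity(calls, gold):.2f}")
```

In the toy data two of the three gold-standard CNVs are matched; the call at chr1:590-900 overlaps its target by only 10 bases and is not counted.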
van Geest, Geert; Voorrips, Roeland E; Esselink, Danny; Post, Aike; Visser, Richard Gf; Arens, Paul
2017-08-07
Cultivated chrysanthemum is an outcrossing hexaploid (2n = 6× = 54) with a disputed mode of inheritance. In this paper, we present a single nucleotide polymorphism (SNP) selection pipeline that was used to design an Affymetrix Axiom array with 183 k SNPs from RNA sequencing data (1). With this array, we genotyped four bi-parental populations (with sizes of 405, 53, 76 and 37 offspring plants respectively), and a cultivar panel of 63 genotypes. Further, we present a method for dosage scoring in hexaploids from signal intensities of the array based on mixture models (2) and validation of selection steps in the SNP selection pipeline (3). The resulting genotypic data is used to draw conclusions on the mode of inheritance in chrysanthemum (4), and to make an inference on allelic expression bias (5). With use of the mixture model approach, we successfully called the dosage of 73,936 out of 183,130 SNPs (40.4%) that segregated in any of the bi-parental populations. To investigate the mode of inheritance, we analysed markers that segregated in the large bi-parental population (n = 405). Analysis of segregation of duplex x nulliplex SNPs resulted in evidence for genome-wide hexasomic inheritance. This evidence was substantiated by the absence of strong linkage between markers in repulsion, which indicated absence of full disomic inheritance. We present the success rate of SNP discovery out of RNA sequencing data as affected by different selection steps, among which SNP coverage over genotypes and use of different types of sequence read mapping software. Genomic dosage highly correlated with relative allele coverage from the RNA sequencing data, indicating that most alleles are expressed according to their genomic dosage. The large population, genotyped with a very large number of markers, is a unique framework for extensive genetic analyses in hexaploid chrysanthemum. As starting point, we show conclusive evidence for genome-wide hexasomic inheritance.
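The segregation test on duplex x nulliplex SNPs relies on the dosage ratios expected under hexasomic inheritance. The Python sketch below derives those ratios under simplifying assumptions that are not taken from the paper: random sampling of three of the six homologous chromosomes into each gamete and no double reduction. The function name is hypothetical.

```python
from fractions import Fraction
from math import comb


def gamete_dosage_distribution(parent_dosage, ploidy=6):
    """Probability of each allele dosage in a gamete under polysomic inheritance.

    Assumes a gamete receives ploidy/2 chromosomes drawn at random from the
    homologous set (hypergeometric sampling) and no double reduction.
    """
    gamete_size = ploidy // 2
    total = comb(ploidy, gamete_size)
    distribution = {}
    for k in range(gamete_size + 1):
        ways = comb(parent_dosage, k) * comb(ploidy - parent_dosage, gamete_size - k)
        if ways:
            distribution[k] = Fraction(ways, total)
    return distribution


# Duplex parent (two copies of the allele) crossed with a nulliplex parent:
# the nulliplex parent contributes no copies, so the offspring dosage
# equals the dosage of the duplex parent's gamete.
duplex = gamete_dosage_distribution(2)
for dosage, prob in sorted(duplex.items()):
    print(f"offspring dosage {dosage}: {prob}")
```

Under these assumptions the offspring dosages 0, 1 and 2 are expected in a 1:3:1 ratio, which is the kind of pattern a segregation analysis can compare against observed dosage counts.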
The pitfalls of platform comparison: DNA copy number array technologies assessed
2009-01-01
Background The accurate and high resolution mapping of DNA copy number aberrations has become an important tool by which to gain insight into the mechanisms of tumourigenesis. There are various commercially available platforms for such studies, but there remains no general consensus as to the optimal platform. There have been several previous platform comparison studies, but they have either described older technologies, used less-complex samples, or have not addressed the issue of the inherent biases in such comparisons. Here we describe a systematic comparison of data from four leading microarray technologies (the Affymetrix Genome-wide SNP 5.0 array, Agilent High-Density CGH Human 244A array, Illumina HumanCNV370-Duo DNA Analysis BeadChip, and the Nimblegen 385 K oligonucleotide array). We compare samples derived from primary breast tumours and their corresponding matched normals, well-established cancer cell lines, and HapMap individuals. By careful consideration and avoidance of potential sources of bias, we aim to provide a fair assessment of platform performance. Results By performing a theoretical assessment of the reproducibility, noise, and sensitivity of each platform, notable differences were revealed. Nimblegen exhibited between-replicate array variances an order of magnitude greater than the other three platforms, with Agilent slightly outperforming the others, and a comparison of self-self hybridizations revealed similar patterns. An assessment of the single probe power revealed that Agilent exhibits the highest sensitivity. Additionally, we performed an in-depth visual assessment of the ability of each platform to detect aberrations of varying sizes. As expected, all platforms were able to identify large aberrations in a robust manner. However, some focal amplifications and deletions were only detected in a subset of the platforms. 
Conclusion Although there are substantial differences in the design, density, and number of replicate probes, the comparison indicates a generally high level of concordance between platforms, despite differences in the reproducibility, noise, and sensitivity. In general, Agilent tended to be the best aCGH platform and Affymetrix, the superior SNP-CGH platform, but for specific decisions the results described herein provide a guide for platform selection and study design, and the dataset a resource for more tailored comparisons. PMID:19995423
Micro-Analyzer: automatic preprocessing of Affymetrix microarray data.
Guzzi, Pietro Hiram; Cannataro, Mario
2013-08-01
A current trend in genomics is the investigation of the cell mechanism using different technologies, in order to explain the relationship among genes, molecular processes and diseases. For instance, the combined use of gene-expression arrays and genomic arrays has been demonstrated as an effective instrument in clinical practice. Consequently, in a single experiment different kinds of microarrays may be used, resulting in the production of different types of binary data (images and textual raw data). The analysis of microarray data requires an initial preprocessing phase that makes raw data suitable for use on existing analysis platforms, such as the TIGR M4 (TM4) Suite. An additional challenge to be faced by emerging data analysis platforms is the ability to treat in a combined way those different microarray formats coupled with clinical data. In fact, the resulting integrated data may include both numerical and symbolic data (e.g. gene expression and SNPs regarding molecular data), as well as temporal data (e.g. the response to a drug, time to progression and survival rate), regarding clinical data. Raw data preprocessing is a crucial step in analysis but is often performed in a manual and error-prone way using different software tools. Thus novel, platform-independent, and possibly open source tools enabling the semi-automatic preprocessing and annotation of different microarray data are needed. The paper presents Micro-Analyzer (Microarray Analyzer), a cross-platform tool for the automatic normalization, summarization and annotation of Affymetrix gene expression and SNP binary data. It represents the evolution of the μ-CS tool, extending the preprocessing to SNP arrays, which were not supported in μ-CS. Micro-Analyzer is provided as a Java standalone tool and enables users to read, preprocess and analyse binary microarray data (gene expression and SNPs) by invoking the TM4 platform. It avoids: (i) the manual invocation of external tools (e.g. the Affymetrix Power Tools), (ii) the manual loading of preprocessing libraries, and (iii) the management of intermediate files, such as results and metadata. Micro-Analyzer users can directly manage Affymetrix binary data without worrying about locating and invoking the proper preprocessing tools and chip-specific libraries. Moreover, users of the Micro-Analyzer tool can load the preprocessed data directly into the well-known TM4 platform, thereby also extending the TM4 capabilities. Consequently, Micro-Analyzer offers the following advantages: (i) it reduces possible errors in the preprocessing and further analysis phases, e.g. due to the incorrect choice of parameters or due to the use of old libraries, (ii) it enables the combined and centralized pre-processing of different arrays, (iii) it may enhance the quality of further analysis by storing the workflow, i.e. information about the preprocessing steps, and (iv) finally, Micro-Analyzer is freely available as a standalone application at the project web site http://sourceforge.net/projects/microanalyzer/. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
Analysis and visualization of chromosomal abnormalities in SNP data with SNPscan
Ting, Jason C; Ye, Ying; Thomas, George H; Ruczinski, Ingo; Pevsner, Jonathan
2006-01-01
Background A variety of diseases are caused by chromosomal abnormalities such as aneuploidies (having an abnormal number of chromosomes), microdeletions, microduplications, and uniparental disomy. High density single nucleotide polymorphism (SNP) microarrays provide information on chromosomal copy number changes, as well as genotype (heterozygosity and homozygosity). SNP array studies generate multiple types of data for each SNP site, some with more than 100,000 SNPs represented on each array. The identification of different classes of anomalies within SNP data has been challenging. Results We have developed SNPscan, a web-accessible tool to analyze and visualize high density SNP data. It enables researchers (1) to visually and quantitatively assess the quality of user-generated SNP data relative to a benchmark data set derived from a control population, (2) to display SNP intensity and allelic call data in order to detect chromosomal copy number anomalies (duplications and deletions), (3) to display uniparental isodisomy based on loss of heterozygosity (LOH) across genomic regions, (4) to compare paired samples (e.g. tumor and normal), and (5) to generate a file type for viewing SNP data in the University of California, Santa Cruz (UCSC) Human Genome Browser. SNPscan accepts data exported from Affymetrix Copy Number Analysis Tool as its input. We validated SNPscan using data generated from patients with known deletions, duplications, and uniparental disomy. We also inspected previously generated SNP data from 90 apparently normal individuals from the Centre d'Étude du Polymorphisme Humain (CEPH) collection, and identified three cases of uniparental isodisomy, four females having an apparently mosaic X chromosome, two mislabelled SNP data sets, and one microdeletion on chromosome 2 with mosaicism from an apparently normal female. These previously unrecognized abnormalities were all detected using SNPscan. 
The microdeletion was independently confirmed by fluorescence in situ hybridization, and a region of homozygosity in a UPD case was confirmed by sequencing of genomic DNA. Conclusion SNPscan is useful to identify chromosomal abnormalities based on SNP intensity (such as chromosomal copy number changes) and heterozygosity data (including regions of LOH and some cases of UPD). The program and source code are available at the SNPscan website. PMID:16420694
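The idea of spotting LOH as a long stretch without heterozygous calls can be sketched in a few lines. This is only an illustration and is not how SNPscan itself is implemented; the call labels ("AA", "AB", "BB", "NoCall"), the function name and the `min_snps` threshold are all assumptions, and the calls are taken to be ordered by genomic position on one chromosome.

```python
def homozygous_runs(calls, min_snps=50):
    """Return (start, end) index pairs (end exclusive) for runs of
    homozygous genotype calls containing at least min_snps homozygous SNPs.

    A heterozygous call ("AB") ends a run; any other label, such as
    "NoCall", neither extends nor breaks the current run.
    """
    runs = []
    start = None
    n_hom = 0
    for i, call in enumerate(calls):
        if call == "AB":
            if start is not None and n_hom >= min_snps:
                runs.append((start, i))
            start, n_hom = None, 0
        elif call in ("AA", "BB"):
            if start is None:
                start = i
            n_hom += 1
    if start is not None and n_hom >= min_snps:
        runs.append((start, len(calls)))
    return runs


# Hypothetical toy data: a stretch of eight homozygous calls between heterozygous ones.
example_calls = ["AB"] * 3 + ["AA", "BB"] * 3 + ["NoCall"] + ["AA"] * 2 + ["AB"] + ["BB"] * 2
example_runs = homozygous_runs(example_calls, min_snps=5)
```

A real analysis would also weigh marker spacing, local heterozygosity rates and copy-number intensity, since a deletion and copy-neutral isodisomy both produce runs of homozygous calls.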
Fox, Ervin R.; Young, J. Hunter; Li, Yali; Dreisbach, Albert W.; Keating, Brendan J.; Musani, Solomon K.; Liu, Kiang; Morrison, Alanna C.; Ganesh, Santhi; Kutlar, Abdullah; Ramachandran, Vasan S.; Polak, Josef F.; Fabsitz, Richard R.; Dries, Daniel L.; Farlow, Deborah N.; Redline, Susan; Adeyemo, Adebowale; Hirschorn, Joel N.; Sun, Yan V.; Wyatt, Sharon B.; Penman, Alan D.; Palmas, Walter; Rotter, Jerome I.; Townsend, Raymond R.; Doumatey, Ayo P.; Tayo, Bamidele O.; Mosley, Thomas H.; Lyon, Helen N.; Kang, Sun J.; Rotimi, Charles N.; Cooper, Richard S.; Franceschini, Nora; Curb, J. David; Martin, Lisa W.; Eaton, Charles B.; Kardia, Sharon L.R.; Taylor, Herman A.; Caulfield, Mark J.; Ehret, Georg B.; Johnson, Toby; Chakravarti, Aravinda; Zhu, Xiaofeng; Levy, Daniel; Munroe, Patricia B.; Rice, Kenneth M.; Bochud, Murielle; Johnson, Andrew D.; Chasman, Daniel I.; Smith, Albert V.; Tobin, Martin D.; Verwoert, Germaine C.; Hwang, Shih-Jen; Pihur, Vasyl; Vollenweider, Peter; O'Reilly, Paul F.; Amin, Najaf; Bragg-Gresham, Jennifer L.; Teumer, Alexander; Glazer, Nicole L.; Launer, Lenore; Zhao, Jing Hua; Aulchenko, Yurii; Heath, Simon; Sõber, Siim; Parsa, Afshin; Luan, Jian'an; Arora, Pankaj; Dehghan, Abbas; Zhang, Feng; Lucas, Gavin; Hicks, Andrew A.; Jackson, Anne U.; Peden, John F.; Tanaka, Toshiko; Wild, Sarah H.; Rudan, Igor; Igl, Wilmar; Milaneschi, Yuri; Parker, Alex N.; Fava, Cristiano; Chambers, John C.; Kumari, Meena; JinGo, Min; van der Harst, Pim; Kao, Wen Hong Linda; Sjögren, Marketa; Vinay, D.G.; Alexander, Myriam; Tabara, Yasuharu; Shaw-Hawkins, Sue; Whincup, Peter H.; Liu, Yongmei; Shi, Gang; Kuusisto, Johanna; Seielstad, Mark; Sim, Xueling; Nguyen, Khanh-Dung Hoang; Lehtimäki, Terho; Matullo, Giuseppe; Wu, Ying; Gaunt, Tom R.; Charlotte Onland-Moret, N.; Cooper, Matthew N.; Platou, Carl G.P.; Org, Elin; Hardy, Rebecca; Dahgam, Santosh; Palmen, Jutta; Vitart, Veronique; Braund, Peter S.; Kuznetsova, Tatiana; Uiterwaal, Cuno S.P.M.; Campbell, Harry; Ludwig, 
Barbara; Tomaszewski, Maciej; Tzoulaki, Ioanna; Palmer, Nicholette D.; Aspelund, Thor; Garcia, Melissa; Chang, Yen-Pei C.; O'Connell, Jeffrey R.; Steinle, Nanette I.; Grobbee, Diederick E.; Arking, Dan E.; Hernandez, Dena; Najjar, Samer; McArdle, Wendy L.; Hadley, David; Brown, Morris J.; Connell, John M.; Hingorani, Aroon D.; Day, Ian N.M.; Lawlor, Debbie A.; Beilby, John P.; Lawrence, Robert W.; Clarke, Robert; Collins, Rory; Hopewell, Jemma C.; Ongen, Halit; Bis, Joshua C.; Kähönen, Mika; Viikari, Jorma; Adair, Linda S.; Lee, Nanette R.; Chen, Ming-Huei; Olden, Matthias; Pattaro, Cristian; Hoffman Bolton, Judith A.; Köttgen, Anna; Bergmann, Sven; Mooser, Vincent; Chaturvedi, Nish; Frayling, Timothy M.; Islam, Muhammad; Jafar, Tazeen H.; Erdmann, Jeanette; Kulkarni, Smita R.; Bornstein, Stefan R.; Grässler, Jürgen; Groop, Leif; Voight, Benjamin F.; Kettunen, Johannes; Howard, Philip; Taylor, Andrew; Guarrera, Simonetta; Ricceri, Fulvio; Emilsson, Valur; Plump, Andrew; Barroso, Inês; Khaw, Kay-Tee; Weder, Alan B.; Hunt, Steven C.; Bergman, Richard N.; Collins, Francis S.; Bonnycastle, Lori L.; Scott, Laura J.; Stringham, Heather M.; Peltonen, Leena; Perola, Markus; Vartiainen, Erkki; Brand, Stefan-Martin; Staessen, Jan A.; Wang, Thomas J.; Burton, Paul R.; SolerArtigas, Maria; Dong, Yanbin; Snieder, Harold; Wang, Xiaoling; Zhu, Haidong; Lohman, Kurt K.; Rudock, Megan E.; Heckbert, Susan R.; Smith, Nicholas L.; Wiggins, Kerri L.; Shriner, Daniel; Veldre, Gudrun; Viigimaa, Margus; Kinra, Sanjay; Prabhakaran, Dorairajan; Tripathy, Vikal; Langefeld, Carl D.; Rosengren, Annika; Thelle, Dag S.; MariaCorsi, Anna; Singleton, Andrew; Forrester, Terrence; Hilton, Gina; McKenzie, Colin A.; Salako, Tunde; Iwai, Naoharu; Kita, Yoshikuni; Ogihara, Toshio; Ohkubo, Takayoshi; Okamura, Tomonori; Ueshima, Hirotsugu; Umemura, Satoshi; Eyheramendy, Susana; Meitinger, Thomas; Wichmann, H.-Erich; Cho, Yoon Shin; Kim, Hyung-Lae; Lee, Jong-Young; Scott, James; Sehmi, Joban S.; Zhang, 
Weihua; Hedblad, Bo; Nilsson, Peter; Smith, George Davey; Wong, Andrew; Narisu, Narisu; Stančáková, Alena; Raffel, Leslie J.; Yao, Jie; Kathiresan, Sekar; O'Donnell, Chris; Schwartz, Steven M.; Arfan Ikram, M.; Longstreth, Will T.; Seshadri, Sudha; Shrine, Nick R.G.; Wain, Louise V.; Morken, Mario A.; Swift, Amy J.; Laitinen, Jaana; Prokopenko, Inga; Zitting, Paavo; Cooper, Jackie A.; Humphries, Steve E.; Danesh, John; Rasheed, Asif; Goel, Anuj; Hamsten, Anders; Watkins, Hugh; Bakker, Stephan J.L.; van Gilst, Wiek H.; Janipalli, Charles S.; Radha Mani, K.; Yajnik, Chittaranjan S.; Hofman, Albert; Mattace-Raso, Francesco U.S.; Oostra, Ben A.; Demirkan, Ayse; Isaacs, Aaron; Rivadeneira, Fernando; Lakatta, Edward G.; Orru, Marco; Scuteri, Angelo; Ala-Korpela, Mika; Kangas, Antti J.; Lyytikäinen, Leo-Pekka; Soininen, Pasi; Tukiainen, Taru; Würz, Peter; Twee-Hee Ong, Rick; Dörr, Marcus; Kroemer, Heyo K.; Völker, Uwe; Völzke, Henry; Galan, Pilar; Hercberg, Serge; Lathrop, Mark; Zelenika, Diana; Deloukas, Panos; Mangino, Massimo; Spector, Tim D.; Zhai, Guangju; Meschia, James F.; Nalls, Michael A.; Sharma, Pankaj; Terzic, Janos; Kranthi Kumar, M.J.; Denniff, Matthew; Zukowska-Szczechowska, Ewa; Wagenknecht, Lynne E.; Fowkes, Gerald R.; Charchar, Fadi J.; Schwarz, Peter E.H.; Hayward, Caroline; Guo, Xiuqing; Bots, Michiel L.; Brand, Eva; Samani, Nilesh J.; Polasek, Ozren; Talmud, Philippa J.; Nyberg, Fredrik; Kuh, Diana; Laan, Maris; Hveem, Kristian; Palmer, Lyle J.; van der Schouw, Yvonne T.; Casas, Juan P.; Mohlke, Karen L.; Vineis, Paolo; Raitakari, Olli; Wong, Tien Y.; Shyong Tai, E.; Laakso, Markku; Rao, Dabeeru C.; Harris, Tamara B.; Morris, Richard W.; Dominiczak, Anna F.; Kivimaki, Mika; Marmot, Michael G.; Miki, Tetsuro; Saleheen, Danish; Chandak, Giriraj R.; Coresh, Josef; Navis, Gerjan; Salomaa, Veikko; Han, Bok-Ghee; Kooner, Jaspal S.; Melander, Olle; Ridker, Paul M.; Bandinelli, Stefania; Gyllensten, Ulf B.; Wright, Alan F.; Wilson, James F.; Ferrucci, Luigi; 
Farrall, Martin; Tuomilehto, Jaakko; Pramstaller, Peter P.; Elosua, Roberto; Soranzo, Nicole; Sijbrands, Eric J.G.; Altshuler, David; Loos, Ruth J.F.; Shuldiner, Alan R.; Gieger, Christian; Meneton, Pierre; Uitterlinden, Andre G.; Wareham, Nicholas J.; Gudnason, Vilmundur; Rettig, Rainer; Uda, Manuela; Strachan, David P.; Witteman, Jacqueline C.M.; Hartikainen, Anna-Liisa; Beckmann, Jacques S.; Boerwinkle, Eric; Boehnke, Michael; Larson, Martin G.; Järvelin, Marjo-Riitta; Psaty, Bruce M.; Abecasis, Gonçalo R.; Elliott, Paul; van Duijn, Cornelia M.; Newton-Cheh, Christopher
2011-01-01
The prevalence of hypertension in African Americans (AAs) is higher than in other US groups; yet, few have performed genome-wide association studies (GWASs) in AAs. Among people of European descent, GWASs have identified genetic variants at 13 loci that are associated with blood pressure. It is unknown if these variants confer susceptibility in people of African ancestry. Here, we examined genome-wide and candidate gene associations with systolic blood pressure (SBP) and diastolic blood pressure (DBP) using the Candidate Gene Association Resource (CARe) consortium consisting of 8591 AAs. Genotypes included genome-wide single-nucleotide polymorphism (SNP) data utilizing the Affymetrix 6.0 array with imputation to 2.5 million HapMap SNPs and candidate gene SNP data utilizing a 50K cardiovascular gene-centric array (ITMAT-Broad-CARe [IBC] array). For Affymetrix data, the strongest signal for DBP was rs10474346 (P = 3.6 × 10^-8) located near GPR98 and ARRDC3. For SBP, the strongest signal was rs2258119 in C21orf91 (P = 4.7 × 10^-8). The top IBC association for SBP was rs2012318 (P = 6.4 × 10^-6) near SLC25A42 and for DBP was rs2523586 (P = 1.3 × 10^-6) near HLA-B. None of the top variants replicated in additional AA (n = 11 882) or European-American (n = 69 899) cohorts. We replicated previously reported European-American blood pressure SNPs in our AA samples (SH2B3, P = 0.009; TBX3-TBX5, P = 0.03; and CSK-ULK3, P = 0.0004). These genetic loci represent the best evidence of genetic influences on SBP and DBP in AAs to date. More broadly, this work supports the notion that blood pressure among AAs is a trait with genetic underpinnings but also with significant complexity. PMID:21378095
Fox, Ervin R; Young, J Hunter; Li, Yali; Dreisbach, Albert W; Keating, Brendan J; Musani, Solomon K; Liu, Kiang; Morrison, Alanna C; Ganesh, Santhi; Kutlar, Abdullah; Ramachandran, Vasan S; Polak, Josef F; Fabsitz, Richard R; Dries, Daniel L; Farlow, Deborah N; Redline, Susan; Adeyemo, Adebowale; Hirschorn, Joel N; Sun, Yan V; Wyatt, Sharon B; Penman, Alan D; Palmas, Walter; Rotter, Jerome I; Townsend, Raymond R; Doumatey, Ayo P; Tayo, Bamidele O; Mosley, Thomas H; Lyon, Helen N; Kang, Sun J; Rotimi, Charles N; Cooper, Richard S; Franceschini, Nora; Curb, J David; Martin, Lisa W; Eaton, Charles B; Kardia, Sharon L R; Taylor, Herman A; Caulfield, Mark J; Ehret, Georg B; Johnson, Toby; Chakravarti, Aravinda; Zhu, Xiaofeng; Levy, Daniel
2011-06-01
The prevalence of hypertension in African Americans (AAs) is higher than in other US groups; yet, few have performed genome-wide association studies (GWASs) in AAs. Among people of European descent, GWASs have identified genetic variants at 13 loci that are associated with blood pressure. It is unknown if these variants confer susceptibility in people of African ancestry. Here, we examined genome-wide and candidate gene associations with systolic blood pressure (SBP) and diastolic blood pressure (DBP) using the Candidate Gene Association Resource (CARe) consortium consisting of 8591 AAs. Genotypes included genome-wide single-nucleotide polymorphism (SNP) data utilizing the Affymetrix 6.0 array with imputation to 2.5 million HapMap SNPs and candidate gene SNP data utilizing a 50K cardiovascular gene-centric array (ITMAT-Broad-CARe [IBC] array). For Affymetrix data, the strongest signal for DBP was rs10474346 (P = 3.6 × 10^-8) located near GPR98 and ARRDC3. For SBP, the strongest signal was rs2258119 in C21orf91 (P = 4.7 × 10^-8). The top IBC association for SBP was rs2012318 (P = 6.4 × 10^-6) near SLC25A42 and for DBP was rs2523586 (P = 1.3 × 10^-6) near HLA-B. None of the top variants replicated in additional AA (n = 11 882) or European-American (n = 69 899) cohorts. We replicated previously reported European-American blood pressure SNPs in our AA samples (SH2B3, P = 0.009; TBX3-TBX5, P = 0.03; and CSK-ULK3, P = 0.0004). These genetic loci represent the best evidence of genetic influences on SBP and DBP in AAs to date. More broadly, this work supports the notion that blood pressure among AAs is a trait with genetic underpinnings but also with significant complexity.
Loss of heterozygosity at D8S262: an early genetic event of hepatocarcinogenesis.
Zhu, Qiao; Gong, Li; Liu, Xiaoyan; Wang, Jun; Ren, Pin; Zhang, Wendong; Yao, Li; Han, Xiujuan; Zhu, Shaojun; Lan, Miao; Li, Yanhong; Zhang, Wei
2015-06-16
Hepatocellular carcinoma (HCC) arises through a multi-factor, multi-step, multi-gene and complicated process resulting from the accumulation of sequential genetic and epigenetic alterations. An important change among them is the transition from precancerous lesions to HCC. However, only a few studies have reported on the sequential genetic changes during hepatocarcinogenesis. We first examined the molecular karyotypes of 10 matched HCC using Affymetrix single-nucleotide polymorphism (SNP) 6.0 arrays, and found chromosomal fragments with a high incidence (more than 70%) of loss of heterozygosity (LOH). Then, we selected 28 microsatellite markers at genes spanning these chromosomal fragments, and examined the frequency of LOH in 128 matched HCC and 43 matched precancerous lesions (dysplastic nodules, DN) by a PCR-based analysis. Finally, we investigated the expression of proteins encoded by these genes in HCC, DN and the surrounding hepatic tissues. The Affymetrix SNP 6.0 array results demonstrated that more than 70% (7/10) of cases had chromosomal fragment deletions on 4q13.3-35.1, 8p23.2-21.2, 16q11.2-24.3, and 17p13.3-12. Among the 28 microsatellite markers selected, LOH frequencies at D8S262 for DN and HCC were found to be the highest, 51.2% and 72.7%, respectively. Immunohistochemically, the positive rates for its adjacent gene CSMD1 in HCC, DN, and the surrounding hepatic tissues were 27.3% (35/128), 75% (33/44), and 82% (105/128), respectively. LOH at D8S262 may be associated with an early genetic event of hepatocarcinogenesis, and may serve as a predictor for the monitoring and prevention of HCC. The virtual slides for this article can be found here: http://www.diagnosticpathology.diagnomx.eu/vs/1557074981159099 .
Biological relevance of CNV calling methods using familial relatedness including monozygotic twins.
Castellani, Christina A; Melka, Melkaye G; Wishart, Andrea E; Locke, M Elizabeth O; Awamleh, Zain; O'Reilly, Richard L; Singh, Shiva M
2014-04-21
Studies involving the analysis of structural variation including Copy Number Variation (CNV) have recently exploded in the literature. Furthermore, CNVs have been associated with a number of complex diseases and neurodevelopmental disorders. Common methods for CNV detection use SNP, CNV, or CGH arrays, where the signal intensities of consecutive probes are used to define the number of copies associated with a given genomic region. These practices pose a number of challenges that interfere with the ability of available methods to accurately call CNVs. It has, therefore, become necessary to develop experimental protocols to test the reliability of CNV calling methods from microarray data so that researchers can properly discriminate biologically relevant data from noise. We have developed a workflow for the integration of data from multiple CNV calling algorithms using the same array results. It uses four CNV calling programs: PennCNV (PC), Affymetrix® Genotyping Console™ (AGC), Partek® Genomics Suite™ (PGS) and Golden Helix SVS™ (GH) to analyze CEL files from the Affymetrix® Human SNP 6.0 Array™. To assess the relative suitability of each program, we used individuals of known genetic relationships. We found significant differences in CNV calls obtained by different CNV calling programs. Although the programs showed variable patterns of CNVs in the same individuals, their distribution in individuals of different degrees of genetic relatedness has allowed us to offer two suggestions. The first involves the use of multiple algorithms for the detection of the largest possible number of CNVs, and the second suggests the use of PennCNV over all other methods when the use of only one software program is desirable.
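One common way to compare call sets from several CNV programs on the same arrays is to ask which calls are supported by more than one caller. The sketch below illustrates that idea using reciprocal overlap; it is not the authors' workflow, and the `Cnv` record, the 50% overlap threshold and the toy coordinates are assumptions made for the example. Intervals are treated as half-open with `end > start`.

```python
from collections import namedtuple

# Hypothetical minimal CNV record: one call from one program.
Cnv = namedtuple("Cnv", "chrom start end kind caller")


def reciprocal_overlap(a, b):
    """Smallest fraction of either call covered by the shared interval.

    Returns 0.0 for calls on different chromosomes or of different kind
    (e.g. "gain" versus "loss").
    """
    if a.chrom != b.chrom or a.kind != b.kind:
        return 0.0
    shared = min(a.end, b.end) - max(a.start, b.start)
    if shared <= 0:
        return 0.0
    return min(shared / (a.end - a.start), shared / (b.end - b.start))


def supported_calls(calls, min_overlap=0.5, min_callers=2):
    """Keep calls that at least min_callers distinct programs agree on."""
    kept = []
    for call in calls:
        callers = {call.caller}
        for other in calls:
            if other.caller != call.caller and reciprocal_overlap(call, other) >= min_overlap:
                callers.add(other.caller)
        if len(callers) >= min_callers:
            kept.append((call, sorted(callers)))
    return kept


# Toy calls labelled with the program abbreviations used in the abstract.
example = [
    Cnv("chr1", 100, 200, "loss", "PC"),
    Cnv("chr1", 120, 210, "loss", "AGC"),
    Cnv("chr2", 500, 900, "gain", "PGS"),
]
consensus = supported_calls(example)
```

Setting `min_callers=1` returns the union of all calls, which corresponds to the abstract's first suggestion of using several algorithms to detect as many CNVs as possible.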
Mismatch and G-Stack Modulated Probe Signals on SNP Microarrays
Binder, Hans; Fasold, Mario; Glomb, Torsten
2009-01-01
Background Single nucleotide polymorphism (SNP) arrays are important tools widely used for genotyping and copy number estimation. This technology utilizes the specific affinity of fragmented DNA for binding to surface-attached oligonucleotide DNA probes. We analyze the variability of the probe signals of Affymetrix GeneChip SNP arrays as a function of the probe sequence to identify relevant sequence motifs which potentially cause systematic biases of genotyping and copy number estimates. Methodology/Principal Findings The probe design of GeneChip SNP arrays enables us to disentangle different sources of intensity modulations such as the number of mismatches per duplex, matched and mismatched base pairings including nearest and next-nearest neighbors and their position along the probe sequence. The effect of probe sequence was estimated in terms of triple-motifs with central matches and mismatches which include all 256 combinations of possible base pairings. The probe/target interactions on the chip can be decomposed into nearest neighbor contributions which correlate well with free energy terms of DNA/DNA-interactions in solution. The effect of mismatches is about twice as large as that of canonical pairings. Runs of guanines (G) and the particular type of mismatched pairings formed in cross-allelic probe/target duplexes constitute sources of systematic biases of the probe signals with consequences for genotyping and copy number estimates. The poly-G effect seems to be related to the crowded arrangement of probes which facilitates complex formation of neighboring probes with at minimum three adjacent G's in their sequence. Conclusions The applied method of “triple-averaging” represents a model-free approach to estimate the mean intensity contributions of different sequence motifs which can be applied in calibration algorithms to correct signal values for sequence effects. Rules for appropriate sequence corrections are suggested. PMID:19924253
2009-01-01
Background Array genomic hybridization is being used clinically to detect pathogenic copy number variants in children with intellectual disability and other birth defects. However, there is no agreement regarding the kind of array, the distribution of probes across the genome, or the resolution that is most appropriate for clinical use. Results We performed 500 K Affymetrix GeneChip® array genomic hybridization in 100 idiopathic intellectual disability trios, each comprised of a child with intellectual disability of unknown cause and both unaffected parents. We found pathogenic genomic imbalance in 16 of these 100 individuals with idiopathic intellectual disability. In comparison, we had found pathogenic genomic imbalance in 11 of 100 children with idiopathic intellectual disability in a previous cohort who had been studied by 100 K GeneChip® array genomic hybridization. Among 54 intellectual disability trios selected from the previous cohort who were re-tested with 500 K GeneChip® array genomic hybridization, we identified all 10 previously-detected pathogenic genomic alterations and at least one additional pathogenic copy number variant that had not been detected with 100 K GeneChip® array genomic hybridization. Many benign copy number variants, including one that was de novo, were also detected with 500 K array genomic hybridization, but it was possible to distinguish the benign and pathogenic copy number variants with confidence in all but 3 (1.9%) of the 154 intellectual disability trios studied. Conclusion Affymetrix GeneChip® 500 K array genomic hybridization detected pathogenic genomic imbalance in 10 of 10 patients with idiopathic developmental disability in whom 100 K GeneChip® array genomic hybridization had found genomic imbalance, 1 of 44 patients in whom 100 K GeneChip® array genomic hybridization had found no abnormality, and 16 of 100 patients who had not previously been tested. 
Effective clinical interpretation of these studies requires considerable skill and experience. PMID:19917086
Genome-Wide Association Study of a Varroa-Specific Defense Behavior in Honeybees (Apis mellifera)
Spötter, Andreas; Gupta, Pooja; Mayer, Manfred; Reinsch, Norbert
2016-01-01
Honey bees are exposed to many damaging pathogens and parasites. The most devastating is Varroa destructor, which mainly affects the brood. A promising approach for preventing its spread is to breed Varroa-resistant honey bees. One trait that has been shown to provide significant resistance against the Varroa mite is hygienic behavior, which is a behavioral response of honeybee workers to brood diseases in general. Here, we report the use of an Affymetrix 44K SNP array to analyze SNPs associated with detection and uncapping of Varroa-parasitized brood by individual worker bees (Apis mellifera). For this study, 22 000 individually labeled bees were video-monitored and a sample of 122 cases and 122 controls was collected and analyzed to determine the dependence/independence of SNP genotypes from hygienic and nonhygienic behavior on a genome-wide scale. After false-discovery rate correction of the P values, 6 SNP markers had highly significant associations with the trait investigated (α < 0.01). Inspection of the genomic regions around these SNPs led to the discovery of putative candidate genes. PMID:26774061
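The abstract does not say which false-discovery rate procedure was applied. As an illustration only, the sketch below implements the widely used Benjamini-Hochberg adjustment; the function name and the example P values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the input order.

    Each P value is scaled by m / rank (rank 1 = smallest), and a running
    minimum taken from the largest rank downwards keeps the adjusted
    values monotone.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical per-SNP P values from a case-control association test.
raw_p = [0.01, 0.04, 0.03, 0.005]
adj_p = benjamini_hochberg(raw_p)
significant = [i for i, q in enumerate(adj_p) if q < 0.05]
```

SNPs whose adjusted value falls below the chosen threshold would then be followed up, for example by inspecting the surrounding genomic region for candidate genes.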
Application of Nexus copy number software for CNV detection and analysis.
Darvishi, Katayoon
2010-04-01
Among human structural genomic variation, copy number variants (CNVs) are the most commonly recognized component, comprising gains or losses of DNA segments that are generally 1 kb in length or longer. Array-based comparative genomic hybridization (aCGH) has emerged as a powerful tool for detecting genomic CNVs. With the rapid increase in the density of array technology and with the adoption of new high-throughput technology, a reliable and computationally scalable method for accurate mapping of recurring DNA copy number aberrations has become a main focus in research. Here we introduce Nexus Copy Number software, a platform-independent tool, to analyze the output files of all types of commercial and custom-made comparative genomic hybridization (CGH) and single-nucleotide polymorphism (SNP) arrays, such as those manufactured by Affymetrix, Agilent Technologies, Illumina, and Roche NimbleGen. It also supports data generated by various array image-analysis software tools such as GenePix, ImaGene, and BlueFuse. (c) 2010 by John Wiley & Sons, Inc.
Mapping autism risk loci using genetic linkage and chromosomal rearrangements
Szatmari, Peter; Paterson, Andrew; Zwaigenbaum, Lonnie; Roberts, Wendy; Brian, Jessica; Liu, Xiao-Qing; Vincent, John; Skaug, Jennifer; Thompson, Ann; Senman, Lili; Feuk, Lars; Qian, Cheng; Bryson, Susan; Jones, Marshall; Marshall, Christian; Scherer, Stephen; Vieland, Veronica; Bartlett, Christopher; Mangin, La Vonne; Goedken, Rhinda; Segre, Alberto; Pericak-Vance, Margaret; Cuccaro, Michael; Gilbert, John; Wright, Harry; Abramson, Ruth; Betancur, Catalina; Bourgeron, Thomas; Gillberg, Christopher; Leboyer, Marion; Buxbaum, Joseph; Davis, Kenneth; Hollander, Eric; Silverman, Jeremy; Hallmayer, Joachim; Lotspeich, Linda; Sutcliffe, James; Haines, Jonathan; Folstein, Susan; Piven, Joseph; Wassink, Thomas; Sheffield, Val; Geschwind, Daniel; Bucan, Maja; Brown, Ted; Cantor, Rita; Constantino, John; Gilliam, Conrad; Herbert, Martha; Lajonchere, Clara; Ledbetter, David; Lese-Martin, Christa; Miller, Janet; Nelson, Stan; Samango-Sprouse, Carol; Spence, Sarah; State, Matthew; Tanzi, Rudolph; Coon, Hilary; Dawson, Geraldine; Devlin, Bernie; Estes, Annette; Flodman, Pamela; Klei, Lambertus; Mcmahon, William; Minshew, Nancy; Munson, Jeff; Korvatska, Elena; Rodier, Patricia; Schellenberg, Gerard; Smith, Moyra; Spence, Anne; Stodgell, Chris; Tepper, Ping Guo; Wijsman, Ellen; Yu, Chang-En; Rogé, Bernadette; Mantoulan, Carine; Wittemeyer, Kerstin; Poustka, Annemarie; Felder, Bärbel; Klauck, Sabine; Schuster, Claudia; Poustka, Fritz; Bölte, Sven; Feineis-Matthews, Sabine; Herbrecht, Evelyn; Schmötzer, Gabi; Tsiantis, John; Papanikolaou, Katerina; Maestrini, Elena; Bacchelli, Elena; Blasi, Francesca; Carone, Simona; Toma, Claudio; Van Engeland, Herman; De Jonge, Maretha; Kemner, Chantal; Koop, Frederieke; Langemeijer, Marjolein; Hijmans, Channa; Staal, Wouter; Baird, Gillian; Bolton, Patrick; Rutter, Michael; Weisblatt, Emma; Green, Jonathan; Aldred, Catherine; Wilkinson, Julie-Anne; Pickles, Andrew; Le Couteur, Ann; Berney, Tom; Mcconachie, Helen; Bailey, Anthony; Francis, 
Kostas; Honeyman, Gemma; Hutchinson, Aislinn; Parr, Jeremy; Wallace, Simon; Monaco, Anthony; Barnby, Gabrielle; Kobayashi, Kazuhiro; Lamb, Janine; Sousa, Ines; Sykes, Nuala; Cook, Edwin; Guter, Stephen; Leventhal, Bennett; Salt, Jeff; Lord, Catherine; Corsello, Christina; Hus, Vanessa; Weeks, Daniel; Volkmar, Fred; Tauber, Maïté; Fombonne, Eric; Shih, Andy; Meyer, Kacie
2007-01-01
Autism spectrum disorders (ASD) are common, heritable neurodevelopmental conditions. The genetic architecture of ASD is complex, requiring large samples to overcome heterogeneity. Here we broaden coverage and sample size relative to other studies of ASD by using Affymetrix 10K single nucleotide polymorphism (SNP) arrays and 1168 families with ≥ 2 affected individuals to perform the largest linkage scan to date, while also analyzing copy number variation (CNV) in these families. Linkage and CNV analyses implicate chromosome 11p12-p13 and neurexins, respectively, amongst other candidate loci. Neurexins team with previously-implicated neuroligins for glutamatergic synaptogenesis, highlighting glutamate-related genes as promising candidates for ASD. PMID:17322880
Elmore, James R; Obmann, Melissa A; Kuivaniemi, Helena; Tromp, Gerard; Gerhard, Glenn S; Franklin, David P; Boddy, Amy M; Carey, David J
2009-06-01
The goal of this project was to identify genetic variants associated with abdominal aortic aneurysms (AAAs). A genome wide association study was carried out using pooled DNA samples from 123 AAA cases and 112 controls matched for age, gender, and smoking history using Affymetrix 500K single nucleotide polymorphism (SNP) arrays (Affymetrix, Inc, Santa Clara, Calif). The difference in mean allele frequency between cases and controls was calculated for each SNP and used to identify candidate genomic regions. Association of candidate SNPs with AAA was confirmed by individual TaqMan genotype assays in a total of 2096 cases and controls that included an independent replication sample set. A genome wide association study of AAA cases and controls identified a candidate AAA-associated haplotype on chromosome 3p12.3. By individual genotype analysis, four SNPs in this region were significantly associated with AAA in cases and controls from the original study population. One SNP in this region (rs7635818) was genotyped in a total of 502 cases and 736 controls from the original study population (P = .017) and 448 cases and 410 controls from an independent replication sample (P = .013; combined P value = .0028; combined odds ratio [OR] = 1.33). An even stronger association with AAA was observed in a subset of smokers (391 cases, 241 controls, P = .00041, OR = 1.80), which represent the highest risk group for AAA. The AAA-associated haplotype is located approximately 200 kbp upstream of the CNTN3 gene transcription start site. This study identifies a region on chromosome 3 that is significantly associated with AAA in 2 distinct study populations.
Tips on hybridizing, washing, and scanning Affymetrix microarrays.
Ares, Manuel
2014-02-01
Starting in the late 1990s, Affymetrix, Inc. produced a commercial system for hybridizing, washing, and scanning microarrays that was designed to be easy to operate and reproducible. The system used arrays packaged in a plastic cassette or chamber in which the prefabricated array was mounted and could be filled with fluid through resealable membrane ports either by hand or by an automated "fluidics station" specially designed to handle the arrays. A special rotating hybridization oven and a specially designed scanner were also required. Primarily because of automation and standardization the Affymetrix system was and still remains popular. Here, we provide a skeleton protocol with the potential pitfalls identified. It is designed to augment the protocols provided by Affymetrix.
Johnson, Eric O; Hancock, Dana B; Levy, Joshua L; Gaddis, Nathan C; Saccone, Nancy L; Bierut, Laura J; Page, Grier P
2013-05-01
A great promise of publicly sharing genome-wide association data is the potential to create composite sets of controls. However, studies often use different genotyping arrays, and imputation to a common set of SNPs has shown substantial bias: a problem which has no broadly applicable solution. Based on the idea that using differing genotyped SNP sets as inputs creates differential imputation errors and thus bias in the composite set of controls, we examined the degree to which each of the following occurs: (1) imputation based on the union of genotyped SNPs (i.e., SNPs available on one or more arrays) results in bias, as evidenced by spurious associations (type 1 error) between imputed genotypes and arbitrarily assigned case/control status; (2) imputation based on the intersection of genotyped SNPs (i.e., SNPs available on all arrays) does not evidence such bias; and (3) imputation quality varies by the size of the intersection of genotyped SNP sets. Imputations were conducted in European Americans and African Americans with reference to HapMap phase II and III data. Imputation based on the union of genotyped SNPs across the Illumina 1M and 550v3 arrays showed spurious associations for 0.2 % of SNPs: ~2,000 false positives per million SNPs imputed. Biases remained problematic for very similar arrays (550v1 vs. 550v3) and were substantial for dissimilar arrays (Illumina 1M vs. Affymetrix 6.0). In all instances, imputing based on the intersection of genotyped SNPs (as few as 30 % of the total SNPs genotyped) eliminated such bias while still achieving good imputation quality.
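The union/intersection distinction described above can be sketched in a few lines of Python. This is only an illustration: the SNP identifiers and the two array contents are hypothetical placeholders, not data from the study, and the final line simply restates the abstract's arithmetic that 0.2% of one million imputed SNPs is about 2,000 false positives.

```python
# Hypothetical genotyped SNP sets for two arrays (illustrative IDs only).
array_a = {"rs1", "rs2", "rs3", "rs4", "rs5"}
array_b = {"rs3", "rs4", "rs5", "rs6"}


def imputation_inputs(*snp_sets):
    """Return (union, intersection) of the SNP sets genotyped on each array."""
    union = set().union(*snp_sets)              # SNPs on one or more arrays
    intersection = set.intersection(*snp_sets)  # SNPs on all arrays
    return union, intersection


union, intersection = imputation_inputs(array_a, array_b)

# Fraction of all genotyped SNPs that survive the intersection filter.
shared_fraction = len(intersection) / len(union)

# Expected spurious associations at the 0.2 % rate quoted in the abstract.
false_positives_per_million = round(0.002 * 1_000_000)
```

Restricting the imputation input to `intersection` gives every sample the same genotyped SNPs, which is the property the abstract credits with removing the differential imputation error.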
He, Xianmin; Wei, Qing; Sun, Meiqian; Fu, Xuping; Fan, Sichang; Li, Yao
2006-05-01
Biological techniques such as array-comparative genomic hybridization (CGH), fluorescent in situ hybridization (FISH) and Affymetrix single nucleotide polymorphism (SNP) arrays have been used to detect cytogenetic aberrations. However, on a genomic scale, these techniques are labor intensive and time consuming. Comparative genomic microarray analysis (CGMA) has been used to identify cytogenetic changes in hepatocellular carcinoma (HCC) using gene expression microarray data. However, the CGMA algorithm cannot precisely localize aberrations, fails to identify small cytogenetic changes, and produces false negatives and false positives. Locally un-weighted smoothing cytogenetic aberrations prediction (LS-CAP), based on local smoothing and the binomial distribution, can be expected to address these problems. The LS-CAP algorithm was built and applied to HCC microarray profiles. Eighteen cytogenetic abnormalities were identified; of these, 5 had been reported previously and 12 were confirmed by CGH studies. LS-CAP effectively reduced false negatives and false positives, and precisely located small fragments with cytogenetic aberrations.
Vallejo, Roger L.; Liu, Sixin; Gao, Guangtu; Fragomeni, Breno O.; Hernandez, Alvaro G.; Leeds, Timothy D.; Parsons, James E.; Martin, Kyle E.; Evenhuis, Jason P.; Welch, Timothy J.; Wiens, Gregory D.; Palti, Yniv
2017-01-01
Bacterial cold water disease (BCWD) causes significant mortality and economic losses in salmonid aquaculture. In previous studies, we identified moderate-large effect quantitative trait loci (QTL) for BCWD resistance in rainbow trout (Oncorhynchus mykiss). However, the recent availability of a 57 K SNP array and a reference genome assembly have enabled us to conduct genome-wide association studies (GWAS) that overcome several experimental limitations from our previous work. In the current study, we conducted GWAS for BCWD resistance in two rainbow trout breeding populations using two genotyping platforms, the 57 K Affymetrix SNP array and restriction-associated DNA (RAD) sequencing. Overall, we identified 14 moderate-large effect QTL that explained up to 60.8% of the genetic variance in one of the two populations and 27.7% in the other. Four of these QTL were found in both populations explaining a substantial proportion of the variance, although major differences were also detected between the two populations. Our results confirm that BCWD resistance is controlled by the oligogenic inheritance of few moderate-large effect loci and a large-unknown number of loci each having a small effect on BCWD resistance. We detected differences in QTL number and genome location between two GWAS models (weighted single-step GBLUP and Bayes B), which highlights the utility of using different models to uncover QTL. The RAD-SNPs detected a greater number of QTL than the 57 K SNP array in one population, suggesting that the RAD-SNPs may uncover polymorphisms that are more unique and informative for the specific population in which they were discovered. PMID:29109734
Increased genomic prediction accuracy in wheat breeding using a large Australian panel.
Norman, Adam; Taylor, Julian; Tanaka, Emi; Telfer, Paul; Edwards, James; Martinant, Jean-Pierre; Kuchel, Haydn
2017-12-01
Genomic prediction accuracy within a large panel was found to be substantially higher than that previously observed in smaller populations, and also higher than QTL-based prediction. In recent years, genomic selection for wheat breeding has been widely studied, but this has typically been restricted to population sizes under 1000 individuals. To assess its efficacy in germplasm representative of commercial breeding programmes, we used a panel of 10,375 Australian wheat breeding lines to investigate the accuracy of genomic prediction for grain yield, physical grain quality and other physiological traits. To achieve this, the complete panel was phenotyped in a dedicated field trial and genotyped using a custom Axiom™ Affymetrix SNP array. A high-quality consensus map was also constructed, allowing the linkage disequilibrium present in the germplasm to be investigated. Using the complete SNP array, genomic prediction accuracies were found to be substantially higher than those previously observed in smaller populations and also more accurate compared to prediction approaches using a finite number of selected quantitative trait loci. Multi-trait genetic correlations were also assessed at an additive and residual genetic level, identifying a negative genetic correlation between grain yield and protein as well as a positive genetic correlation between grain size and test weight.
Development and validation of the Axiom® Apple480K SNP genotyping array.
Bianco, Luca; Cestaro, Alessandro; Linsmith, Gareth; Muranty, Hélène; Denancé, Caroline; Théron, Anthony; Poncet, Charles; Micheletti, Diego; Kerschbamer, Emanuela; Di Pierro, Erica A; Larger, Simone; Pindo, Massimo; Van de Weg, Eric; Davassi, Alessandro; Laurens, François; Velasco, Riccardo; Durel, Charles-Eric; Troggio, Michela
2016-04-01
Cultivated apple (Malus × domestica Borkh.) is one of the most important fruit crops in temperate regions, and has great economic and cultural value. The apple genome is highly heterozygous and has undergone a recent duplication which, combined with a rapid linkage disequilibrium decay, makes it difficult to perform genome-wide association (GWA) studies. Single nucleotide polymorphism arrays offer highly multiplexed assays at a relatively low cost per data point and can be a valid tool for the identification of the markers associated with traits of interest. Here, we describe the development and validation of a 487K SNP Affymetrix Axiom® genotyping array for apple and discuss its potential applications. The array has been built from the high-depth resequencing of 63 different cultivars covering most of the genetic diversity in cultivated apple. The SNPs were chosen by applying a focal points approach to enrich genic regions, but also to reach a uniform coverage of non-genic regions. A total of 1324 apple accessions, including the 92 progenies of two mapping populations, have been genotyped with the Axiom® Apple480K to assess the effectiveness of the array. A large majority of SNPs (359 994 or 74%) fell in the stringent class of poly high resolution polymorphisms. We also devised a filtering procedure to identify a subset of 275K very robust markers that can be safely used for germplasm surveys in apple. The Axiom® Apple480K has now been commercially released both for public and proprietary use and will likely be a reference tool for GWA studies in apple. © 2016 The Authors The Plant Journal © 2016 John Wiley & Sons Ltd.
Haraksingh, Rajini R; Abyzov, Alexej; Urban, Alexander Eckehart
2017-04-24
High-resolution microarray technology is routinely used in basic research and clinical practice to efficiently detect copy number variants (CNVs) across the entire human genome. A new generation of arrays combining high probe densities with optimized designs will comprise essential tools for genome analysis in the coming years. We systematically compared the genome-wide CNV detection power of all 17 available array designs from the Affymetrix, Agilent, and Illumina platforms by hybridizing the well-characterized genome of 1000 Genomes Project subject NA12878 to all arrays, and performing data analysis using both manufacturer-recommended and platform-independent software. We benchmarked the resulting CNV call sets from each array using a gold standard set of CNVs for this genome derived from 1000 Genomes Project whole genome sequencing data. The arrays tested comprise both SNP and aCGH platforms with varying designs and contain between ~0.5 and ~4.6 million probes. Across the arrays, CNV detection varied widely in number of CNV calls (4-489), CNV size range (~40 bp to ~8 Mbp), and percentage of non-validated CNVs (0-86%). We discovered strikingly strong effects of specific array design principles on performance. For example, some SNP array designs with the largest numbers of probes and extensive exonic coverage produced a considerable number of CNV calls that could not be validated, compared to designs with probe numbers that are sometimes an order of magnitude smaller. This effect was only partially ameliorated using different analysis software and optimizing data analysis parameters. High-resolution microarrays will continue to be used as reliable, cost- and time-efficient tools for CNV analysis. However, different applications tolerate different limitations in CNV detection. Our study quantified how these arrays differ in total number and size range of detected CNVs as well as sensitivity, and determined how each array balances these attributes. This analysis will inform appropriate array selection for future CNV studies, and allow better assessment of the CNV-analytical power of both published and ongoing array-based genomics studies. Furthermore, our findings emphasize the importance of concurrent use of multiple analysis algorithms and independent experimental validation in array-based CNV detection studies.
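The benchmarking step above (scoring each array's CNV calls against a gold standard and reporting the percentage that cannot be validated) might look roughly like the sketch below. The abstract does not state the matching rule, so the 50% reciprocal-overlap criterion used here is an assumption, as are the function names and the tuple layout `(chrom, start, end)`.

```python
def reciprocal_overlap(a, b):
    """Smaller of the two overlap fractions between intervals a and b.

    a and b are (chrom, start, end) tuples; returns 0.0 when they do not overlap.
    """
    if a[0] != b[0]:
        return 0.0
    overlap = min(a[2], b[2]) - max(a[1], b[1])
    if overlap <= 0:
        return 0.0
    return min(overlap / (a[2] - a[1]), overlap / (b[2] - b[1]))


def percent_non_validated(calls, gold, min_ro=0.5):
    """Percentage of calls with no gold-standard CNV at >= min_ro reciprocal overlap.

    min_ro=0.5 is an assumed threshold, not one taken from the abstract.
    """
    if not calls:
        return 0.0
    unvalidated = sum(
        1 for c in calls
        if not any(reciprocal_overlap(c, g) >= min_ro for g in gold)
    )
    return 100.0 * unvalidated / len(calls)
```

Running the same function over each array's call set would give one non-validated percentage per design, the quantity the abstract reports as ranging from 0 to 86%.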
Genome-wide association studies for multiple diseases of the German Shepherd Dog
Tsai, Kate L.; Noorai, Rooksana E.; Starr-Moss, Alison N.; Quignon, Pascale; Rinz, Caitlin J.; Ostrander, Elaine A.; Steiner, Jörg M.; Murphy, Keith E.
2012-01-01
The German Shepherd Dog (GSD) is a popular working and companion breed for which over 50 hereditary diseases have been documented. Herein, SNP profiles for 197 GSDs were generated using the Affymetrix v2 canine SNP array for a genome-wide association study to identify loci associated with four diseases: pituitary dwarfism, degenerative myelopathy (DM), congenital megaesophagus (ME), and pancreatic acinar atrophy (PAA). A locus on Chr 9 is strongly associated with pituitary dwarfism and is proximal to a plausible candidate gene, LHX3. Results for DM confirm a major locus encompassing SOD1, in which an associated point mutation was previously identified, but do not suggest modifier loci. Several SNPs on Chr 12 are associated with ME and a 4.7 Mb haplotype block is present in affected dogs. Analysis of additional ME cases for a SNP within the haplotype provides further support for this association. Results for PAA indicate more complex genetic underpinnings. Several regions on multiple chromosomes reach genome-wide significance. However, no major locus is apparent and only two associated haplotype blocks, on Chrs 7 and 12 are observed. These data suggest that PAA may be governed by multiple loci with small effects, or it may be a heterogeneous disorder. PMID:22105877
Petrin, Aline L.; Daack-Hirsch, Sandra; L’Heureux, Jamie; Murray, Jeffrey C
2010-01-01
Objective: The objective of this study was to use array-CGH to detect causal microdeletions in samples from subjects with cleft lip and palate. Subjects: We analyzed DNA samples from a male patient and his parents, who were seen during surgical screening for an Operation Smile medical mission in the Philippines. Method: We used the Affymetrix Genome-Wide Human SNP Array 6.0 followed by sequencing and quantitative PCR using SYBR Green I dye. Results: We report the second case of 3q29 microdeletion syndrome including cleft lip with or without cleft palate and the first case of this microdeletion syndrome inherited from a phenotypically normal mosaic parent. Conclusions: Our findings confirm the utility of aCGH to detect causal microdeletions, indicate that parental somatic mosaicism should be considered in healthy parents for genetic counseling of the families, and discuss important ethical implications of sharing health impact results from research studies with the participant families. PMID:20500065
Single-Nucleotide Polymorphism Array-Based Karyotyping of Acute Promyelocytic Leukemia
Gómez-Seguí, Inés; Sánchez-Izquierdo, Dolors; Barragán, Eva; Such, Esperanza; Luna, Irene; López-Pavía, María; Ibáñez, Mariam; Villamón, Eva; Alonso, Carmen; Martín, Iván; Llop, Marta; Dolz, Sandra; Fuster, Óscar; Montesinos, Pau; Cañigral, Carolina; Boluda, Blanca; Salazar, Claudia; Cervera, Jose; Sanz, Miguel A
2014-01-01
Acute promyelocytic leukemia (APL) is characterized by the t(15;17)(q22;q21), but additional chromosomal abnormalities (ACA) and other rearrangements can contribute to the development of the whole leukemic phenotype. We hypothesized that some ACA not detected by conventional techniques may be informative of the onset of APL. We performed high-resolution SNP array (SNP-A) 6.0 (Affymetrix) analysis in 48 patients diagnosed with APL on matched diagnosis and remission samples. Forty-six abnormalities were found as acquired events in 23 patients (48%): 22 duplications, 23 deletions and 1 Copy-Neutral Loss of Heterozygosity (CN-LOH), with a duplication of 8(q24) (23%) and a deletion of 7(q33-qter) (6%) being the most frequent copy-number abnormalities (CNA). Four patients (8%) showed CNAs adjacent to the breakpoints of the translocation. We compared our results with other APL series and found that, except for dup(8q24) and del(7q33-qter), ACA were infrequent (≤3%) but most of them recurrent (70%). Interestingly, CNA and FLT3 mutation were mutually exclusive events. Neither the number of CNA nor any specific CNA was significantly associated with prognosis. This study has delineated recurrent abnormalities in addition to t(15;17) that may act as secondary events and could explain leukemogenesis in up to 40% of APL cases with no ACA by conventional cytogenetics. PMID:24959826
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kerns, Sarah L.; Departments of Pathology and Genetics, Albert Einstein College of Medicine, Bronx, New York; Stock, Richard
2013-01-01
Purpose: To identify single nucleotide polymorphisms (SNPs) associated with development of erectile dysfunction (ED) among prostate cancer patients treated with radiation therapy. Methods and Materials: A 2-stage genome-wide association study was performed. Patients were split randomly into a stage I discovery cohort (132 cases, 103 controls) and a stage II replication cohort (128 cases, 102 controls). The discovery cohort was genotyped using Affymetrix 6.0 genome-wide arrays. The 940 top ranking SNPs selected from the discovery cohort were genotyped in the replication cohort using Illumina iSelect custom SNP arrays. Results: Twelve SNPs identified in the discovery cohort and validated in the replication cohort were associated with development of ED following radiation therapy (Fisher combined P values 2.1 × 10⁻⁵ to 6.2 × 10⁻⁴). Notably, these 12 SNPs lie in or near genes involved in erectile function or other normal cellular functions (adhesion and signaling) rather than DNA damage repair. In a multivariable model including nongenetic risk factors, the odds ratios for these SNPs ranged from 1.6 to 5.6 in the pooled cohort. There was a striking relationship between the cumulative number of SNP risk alleles an individual possessed and ED status (Somers' D P value = 1.7 × 10⁻²⁹). A 1-allele increase in cumulative SNP score increased the odds for developing ED by a factor of 2.2 (P value = 2.1 × 10⁻¹⁹). The cumulative SNP score model had a sensitivity of 84% and specificity of 75% for prediction of developing ED at the radiation therapy planning stage. Conclusions: This genome-wide association study identified a set of SNPs that are associated with development of ED following radiation therapy. These candidate genetic predictors warrant more definitive validation in an independent cohort.
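To make the per-allele effect above concrete, the sketch below shows how an odds ratio of 2.2 per additional risk allele compounds across a cumulative SNP score. It assumes the effect is multiplicative and constant over the whole score range, as in a standard logistic model; the baseline probability passed to `risk_from_baseline` is a hypothetical input, not a figure from the study.

```python
PER_ALLELE_OR = 2.2  # odds ratio per additional risk allele, as reported in the abstract


def odds_multiplier(extra_alleles: int) -> float:
    """Factor by which the odds of ED change for a given number of extra risk alleles."""
    return PER_ALLELE_OR ** extra_alleles


def risk_from_baseline(baseline_prob: float, extra_alleles: int) -> float:
    """Convert a (hypothetical) baseline probability into the probability
    implied after adding `extra_alleles` risk alleles."""
    baseline_odds = baseline_prob / (1.0 - baseline_prob)
    odds = baseline_odds * odds_multiplier(extra_alleles)
    return odds / (1.0 + odds)
```

Under these assumptions, two extra risk alleles multiply the odds by 2.2 squared, or about 4.8, which is why the cumulative score separates cases from controls so sharply.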
UPD detection using homozygosity profiling with a SNP genotyping microarray.
Papenhausen, Peter; Schwartz, Stuart; Risheg, Hiba; Keitges, Elisabeth; Gadi, Inder; Burnside, Rachel D; Jaswaney, Vikram; Pappas, John; Pasion, Romela; Friedman, Kenneth; Tepperberg, James
2011-04-01
Single nucleotide polymorphism (SNP) based chromosome microarrays provide both a high-density whole genome analysis of copy number and genotype. In the past 21 months we have analyzed over 13,000 samples primarily referred for developmental delay using the Affymetrix SNP/CN 6.0 version array platform. In addition to copy number, we have focused on the relative distribution of allele homozygosity (HZ) throughout the genome to confirm a strong association of uniparental disomy (UPD) with regions of isoallelism found in most confirmed cases of UPD. We sought to determine whether a long contiguous stretch of HZ (LCSH) greater than a threshold value found only in a single chromosome would correlate with UPD of that chromosome. Nine confirmed UPD cases were retrospectively analyzed with the array in the study, each showing the anticipated LCSH with the smallest 13.5 Mb in length. This length is well above the average longest run of HZ in a set of control patients and was then set as the prospective threshold for reporting possible UPD correlation. Ninety-two cases qualified at that threshold, 46 of those had molecular UPD testing and 29 were positive. Including retrospective cases, 16 showed complete HZ across the chromosome, consistent with total isoUPD. The average size LCSH in the 19 cases that were not completely HZ was 46.3 Mb with a range of 13.5-127.8 Mb. Three patients showed only segmental UPD. Both the size and location of the LCSH are relevant to correlation with UPD. Further studies will continue to delineate an optimal threshold for LCSH/UPD correlation. Copyright © 2011 Wiley-Liss, Inc.
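The screening rule described above (a long contiguous stretch of homozygosity of at least 13.5 Mb confined to a single chromosome) can be sketched as follows. This is a simplified illustration rather than the laboratory's actual pipeline: it treats each SNP call as strictly homozygous or not, ignores no-calls and genotyping errors, and the input layout and function names are assumptions.

```python
LCSH_THRESHOLD_BP = 13_500_000  # 13.5 Mb reporting threshold from the abstract


def longest_homozygous_run(calls):
    """Length in bp of the longest run of consecutive homozygous calls.

    calls: list of (position_bp, is_homozygous), sorted by position,
    all from one chromosome.
    """
    best = 0
    run_start = None
    for pos, is_hom in calls:
        if is_hom:
            if run_start is None:
                run_start = pos  # a new run begins here
            best = max(best, pos - run_start)
        else:
            run_start = None  # a heterozygous call ends the run
    return best


def flag_possible_upd(calls_by_chrom, threshold=LCSH_THRESHOLD_BP):
    """Return the chromosome name if exactly one chromosome has a run at or
    above the threshold; otherwise return None."""
    flagged = [
        chrom for chrom, calls in calls_by_chrom.items()
        if longest_homozygous_run(calls) >= threshold
    ]
    return flagged[0] if len(flagged) == 1 else None
```

Requiring exactly one flagged chromosome reflects the abstract's restriction to stretches found only in a single chromosome, since long homozygous runs on several chromosomes have other explanations, such as parental relatedness.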
Identification of Genes Promoting Skin Youthfulness by Genome-Wide Association Study
Chang, Anne L.S.; Atzmon, Gil; Bergman, Aviv; Brugmann, Samantha; Atwood, Scott X; Chang, Howard Y; Barzilai, Nir
2014-01-01
To identify genes that promote facial skin youthfulness (SY), a genome-wide association study on an Ashkenazi Jewish discovery group (n=428) was performed using Affymetrix 6.0 Single-Nucleotide Polymorphism (SNP) Array. After SNP quality controls, 901,470 SNPs remained for analysis. The eigenstrat method showed no stratification. Cases and controls were identified by global facial skin aging severity including intrinsic and extrinsic parameters. Linear regression adjusted for age and gender, with no significant differences in smoking history, body mass index, menopausal status, or personal or family history of centenarians. Six SNPs met the Bonferroni threshold with P_allele < 10⁻⁸; two of these six had P_genotype < 10⁻⁸. Quantitative trait loci mapping confirmed linkage disequilibrium. The six SNPs were interrogated by MassARRAY in a replication group (n=436) with confirmation of rs6975107, an intronic region of KCND2 (potassium voltage-gated channel, Shal-related family member 2) (P_genotype = 0.023). A second replication group (n=371) confirmed rs318125, downstream of DIAPH2 (diaphanous homolog 2 (Drosophila)) (P_allele = 0.010, P_genotype = 0.002) and rs7616661, downstream of EDEM1 (ER degradation enhancer, mannosidase α-like 1) (P_genotype = 0.042). DIAPH2 has been associated with premature ovarian insufficiency, an aging phenotype in humans. EDEM1 associates with lifespan in animal models, although not humans. KCND2 is expressed in human skin, but has not been associated with aging. These genes represent new candidate genes to study the molecular basis of healthy skin aging. PMID:24037343
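For reference, a Bonferroni threshold for the 901,470 SNPs that passed quality control can be computed as below. The family-wise error rate of 0.05 is an assumption, since the abstract does not state the alpha that was used.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold that controls the family-wise error rate."""
    return alpha / n_tests


N_SNPS = 901_470  # SNPs remaining after quality control, from the abstract
ALPHA = 0.05      # assumed family-wise error rate (not stated in the abstract)

threshold = bonferroni_threshold(ALPHA, N_SNPS)  # roughly 5.5e-8
```

With these inputs the threshold is on the order of 10⁻⁸ (about 5.5 × 10⁻⁸), so the reported P values below 10⁻⁸ would clear it.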
DMET-analyzer: automatic analysis of Affymetrix DMET data.
Guzzi, Pietro Hiram; Agapito, Giuseppe; Di Martino, Maria Teresa; Arbitrio, Mariamena; Tassone, Pierfrancesco; Tagliaferri, Pierosandro; Cannataro, Mario
2012-10-05
Clinical bioinformatics is a growing field based on the integration of clinical and omics data, aimed at the development of personalized medicine. The introduction of novel technologies able to investigate the relationship between clinical states and biological machinery may therefore help the development of this field. For instance, the Affymetrix DMET (drug metabolism enzymes and transporters) platform can study the relationship between variation in patients' genomes and drug metabolism by detecting SNPs (single nucleotide polymorphisms) in genes related to drug metabolism. In pharmacogenomics and clinical studies, this may make it possible, for instance, to find genetic variants in patients who show different drug responses. Despite this, there is currently a lack of open-source algorithms and tools for the analysis of DMET data. Existing software tools for DMET data generally allow only the preprocessing of binary data (e.g. the DMET-Console provided by Affymetrix) and simple data analysis operations, but do not allow testing the association between the presence of SNPs and the response to drugs. We developed DMET-Analyzer, a tool for automatic association analysis between variation in patient genomes and the clinical conditions of patients, i.e. their different responses to drugs. The proposed system allows: (i) automation of the workflow for analysing DMET-SNP data, avoiding the use of multiple tools; (ii) automatic annotation of DMET-SNP data and searching of existing SNP databases (e.g. dbSNP); (iii) association of SNPs with pathways through searches in PharmGKB, a major knowledge base for pharmacogenomic studies. DMET-Analyzer has a simple graphical user interface that allows users (doctors/biologists) to upload and analyse DMET files produced by Affymetrix DMET-Console in an interactive way. The effectiveness and ease of use of DMET-Analyzer are demonstrated through different case studies regarding the analysis of clinical datasets produced at the University Hospital of Catanzaro, Italy. DMET-Analyzer is a novel tool able to automatically analyse data produced by the DMET platform in case-control association studies. Using such a tool, users may avoid wasting time on the manual execution of multiple statistical tests, avoiding possible errors and reducing the time needed for a whole experiment. Moreover, annotations and direct links to external databases may increase the biological knowledge extracted. The system is freely available for academic purposes at: https://sourceforge.net/projects/dmetanalyzer/files/
Rose, Amy E.; Poliseno, Laura; Wang, Jinhua; Clark, Michael; Pearlman, Alexander; Wang, Guimin; Vega y Saenz de Miera, Eleazar C.; Medicherla, Ratna; Christos, Paul J.; Shapiro, Richard; Pavlick, Anna; Darvishian, Farbod; Zavadil, Jiri; Polsky, David; Hernando, Eva; Ostrer, Harry; Osman, Iman
2011-01-01
Superficial spreading melanoma (SSM) and nodular melanoma (NM) are believed to represent sequential phases of linear progression from radial to vertical growth. Several lines of clinical, pathological and epidemiologic evidence suggest, however, that SSM and NM might be the result of independent pathways of tumor development. We utilized an integrative genomic approach that combines single nucleotide polymorphism array (SNP 6.0, Affymetrix) with gene expression array (U133A 2.0, Affymetrix) to examine molecular differences between SSM and NM. Pathway analysis of the most differentially expressed genes between SSM and NM (N=114) revealed significant differences related to metabolic processes. We identified 8 genes (DIS3, FGFR1OP, G3BP2, GALNT7, MTAP, SEC23IP, USO1, ZNF668) in which NM/SSM-specific copy number alterations correlated with differential gene expression (P<0.05, Spearman’s rank). SSM-specific genomic deletions in G3BP2, MTAP, and SEC23IP were independently verified in two external data sets. Forced overexpression of metabolism-related gene methylthioadenosine phosphorylase (MTAP) in SSM resulted in reduced cell growth. The differential expression of another metabolic related gene, aldehyde dehydrogenase 7A1 (ALDH7A1), was validated at the protein level using tissue microarrays of human melanoma. In addition, we show that the decreased ALDH7A1 expression in SSM may be the result of epigenetic modifications. Our data reveal recurrent genomic deletions in SSM not present in NM, which challenge the linear model of melanoma progression. Furthermore, our data suggest a role for altered regulation of metabolism-related genes as a possible cause of the different clinical behavior of SSM and NM. PMID:21343389
Cifola, Ingrid; Bianchi, Cristina; Mangano, Eleonora; Bombelli, Silvia; Frascati, Fabio; Fasoli, Ester; Ferrero, Stefano; Di Stefano, Vitalba; Zipeto, Maria A; Magni, Fulvio; Signorini, Stefano; Battaglia, Cristina; Perego, Roberto A
2011-06-13
Clear cell renal cell carcinoma (ccRCC) is characterized by recurrent copy number alterations (CNAs) and loss of heterozygosity (LOH), which may have potential diagnostic and prognostic applications. Here, we explored whether ccRCC primary cultures, established from surgical tumor specimens, maintain the DNA profile of parental tumor tissues, allowing more confident discrimination of CNAs and LOH than the original tissues. We established a collection of 9 phenotypically well-characterized ccRCC primary cell cultures. Using the Affymetrix SNP array technology, we performed genome-wide copy number (CN) profiling of both cultures and corresponding tumor tissues. Global concordance for each culture/tissue pair was assayed by evaluating the correlations between whole-genome CN profiles and SNP allelic calls. CN analysis was performed using two software packages, CNAG v3.0 and Partek, and comparing results returned by two different algorithms (Hidden Markov Model and Genomic Segmentation). A very good overlap between the CNAs of each culture and corresponding tissue was observed. The finding, reinforced by high whole-genome CN correlations and SNP call concordances, provided evidence that each culture was derived from its corresponding tissue and maintained the genomic alterations of the parental tumor. In addition, the primary culture DNA profile remained stable for at least 3 weeks, up to the third passage. These cultures showed greater cell homogeneity and enrichment in tumor component than the original tissues, thus enabling better discrimination of CNAs and LOH. Especially for hemizygous deletions, primary cultures presented more evident CN losses, typically accompanied by LOH; in contrast, in the original tissues the intensity of these deletions was weakened by normal cell contamination and LOH calls were missed.
ccRCC primary cultures are a reliable in vitro model, well-reproducing original tumor genetics and phenotype, potentially useful for future functional approaches aimed to study genes or pathways involved in ccRCC etiopathogenesis and to identify novel clinical markers or therapeutic targets. Moreover, SNP array technology proved to be a powerful tool to better define the cell composition and homogeneity of RCC primary cultures. © 2011 Cifola et al; licensee BioMed Central Ltd.
Tzvetkov, Mladen V; Becker, Christian; Kulle, Bettina; Nürnberg, Peter; Brockmöller, Jürgen; Wojnowski, Leszek
2005-02-01
Whole-genome DNA amplification by multiple displacement (MD-WGA) is a promising tool to obtain sufficient DNA amounts from samples of limited quantity. Using Affymetrix' GeneChip Human Mapping 10K Arrays, we investigated the accuracy and allele amplification bias in DNA samples subjected to MD-WGA. We observed an excellent concordance (99.95%) between single-nucleotide polymorphisms (SNPs) called both in the nonamplified and the corresponding amplified DNA. This concordance was only 0.01% lower than the intra-assay reproducibility of the genotyping technique used. However, MD-WGA failed to amplify an estimated 7% of polymorphic loci. Due to the algorithm used to call genotypes, this was detected only for heterozygous loci. We achieved a 4.3-fold reduction of noncalled SNPs by combining the results from two independent MD-WGA reactions. This indicated that inter-reaction variations rather than specific chromosomal loci reduced the efficiency of MD-WGA. Consistently, we detected no regions of reduced amplification, with the exception of several SNPs located near chromosomal ends. Altogether, despite a substantial loss of polymorphic sites, MD-WGA appears to be the current method of choice to amplify genomic DNA for array-based SNP analyses. The number of nonamplified loci can be substantially reduced by amplifying each DNA sample in duplicate.
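The concordance figure and the duplicate-reaction strategy described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the genotype strings, the use of `None` for a no-call, and the rule for merging two replicates (keep a call when the replicates agree or when only one of them produced a call) are assumptions made for illustration only.

```python
from typing import List, Optional, Sequence

Genotype = Optional[str]  # None stands for a SNP that was not called (assumed encoding)


def concordance(ref: Sequence[Genotype], test: Sequence[Genotype]) -> float:
    """Fraction of identical genotypes among SNPs called in both samples."""
    both_called = [(r, t) for r, t in zip(ref, test) if r is not None and t is not None]
    if not both_called:
        raise ValueError("no SNP was called in both samples")
    return sum(r == t for r, t in both_called) / len(both_called)


def merge_duplicates(rep1: Sequence[Genotype], rep2: Sequence[Genotype]) -> List[Genotype]:
    """Combine two amplification replicates into one call set.

    Assumed rule: keep a call when both replicates agree or when only one
    replicate produced a call; discordant calls become no-calls.
    """
    merged: List[Genotype] = []
    for a, b in zip(rep1, rep2):
        if a is None:
            merged.append(b)
        elif b is None or a == b:
            merged.append(a)
        else:
            merged.append(None)
    return merged


# Hypothetical toy data: five SNPs, each replicate misses a different locus.
unamplified = ["AA", "AB", "BB", "AB", "AA"]
wga_rep1 = ["AA", None, "BB", "AB", "AA"]
wga_rep2 = ["AA", "AB", "BB", None, "AA"]

merged = merge_duplicates(wga_rep1, wga_rep2)
print("concordance of replicate 1:", concordance(unamplified, wga_rep1))
print("no-calls before merging:", wga_rep1.count(None), "after merging:", merged.count(None))
```

In this toy case each replicate loses a different SNP, so merging recovers both, which mirrors the abstract's observation that losses vary between reactions rather than clustering at fixed loci.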
Prabhanjan, Manasa; Suresh, Raviraj V; Murthy, Megha N; Ramachandra, Nallur B
2016-03-01
To identify the role of copy number variations (CNVs) on disease risk genes and its effect on disease phenotypes in type 2 diabetes mellitus (T2DM) in 12 random populations using high throughput arrays. CNV analysis was carried out on a total of 1715 individuals from 12 populations, from the ArrayExpress Archive of the European Bioinformatics Institute along with our subjects, using the Affymetrix Genome Wide SNP 6.0 array. The effect of CNVs on T2DM genes was analyzed using several bioinformatics tools, and a molecular protein interaction network was constructed to identify the disease mechanism altered by the CNVs. Analysis showed 34.4% of the total population to be under CNV burden for T2DM, with 83 disease causal and associated genes being under CNV influence. Hotspots were identified on chromosomes 22, 12, 6, 19 and 11. Overlap studies with case cohorts revealed significant disease risk genes such as EGFR, E2F1, PPP1R3A, HLA and TSPAN8. CNVs play a significant role in predisposing to T2DM in normal cohorts and contribute to the phenotypic effects. Thus, CNVs should be considered as one of the major contributors to predisposition to the disease. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
Comparison of Comparative Genomic Hybridization Technologies across Microarray Platforms
In the 2007 Association of Biomolecular Resource Facilities (ABRF) Microarray Research Group (MARG) project, we analyzed HL-60 DNA with five platforms: Agilent, Affymetrix 500K, Affymetrix U133 Plus 2.0, Illumina, and RPCI 19K BAC arrays. Copy number variation (CNV) was analyzed ...
Zhu, Yuerong; Zhu, Yuelin; Xu, Wei
2008-01-01
Background Though microarray experiments are very popular in life science research, managing and analyzing microarray data are still challenging tasks for many biologists. Most microarray programs require users to have sophisticated knowledge of mathematics, statistics and computer skills for usage. With accumulating microarray data deposited in public databases, easy-to-use programs to re-analyze previously published microarray data are in high demand. Results EzArray is a web-based Affymetrix expression array data management and analysis system for researchers who need to organize microarray data efficiently and get data analyzed instantly. EzArray organizes microarray data into projects that can be analyzed online with predefined or custom procedures. EzArray performs data preprocessing and detection of differentially expressed genes with statistical methods. All analysis procedures are optimized and highly automated so that even novice users with limited pre-knowledge of microarray data analysis can complete initial analysis quickly. Since all input files, analysis parameters, and executed scripts can be downloaded, EzArray provides maximum reproducibility for each analysis. In addition, EzArray integrates with Gene Expression Omnibus (GEO) and allows instantaneous re-analysis of published array data. Conclusion EzArray is a novel Affymetrix expression array data analysis and sharing system. EzArray provides easy-to-use tools for re-analyzing published microarray data and will help both novice and experienced users perform initial analysis of their microarray data from the location of data storage. We believe EzArray will be a useful system for facilities with microarray services and laboratories with multiple members involved in microarray data analysis. EzArray is freely available from . PMID:18218103
Schnider, D; Rieder, S; Leeb, T; Gerber, V; Neuditschko, M
2017-12-01
Recurrent airway obstruction (RAO), also known as heaves, is an asthma-like respiratory disease. Its development is strongly influenced by environmental risk factors such as sensitization and exposure to moldy hay, straw bedding and stabling indoors. A hereditary component has been documented in previous studies; however, so far no causative genetic variant that influences the risk of developing RAO has been identified. In this study, we revised an existing dataset and selected 384 horses for genotyping on the Affymetrix high-density equine SNP array. We performed an allelic case-control genome-wide association study, which revealed a suggestively significant association on equine chromosome 13 at 32 843 309 bp. This SNP is located in the protein-coding gene TXNDC11, which is possibly involved in the folding process of the multiprotein complexes DUOX1 and DUOX2. In humans, these proteins are known to take part in regulating the production of H2O2 in the respiratory tract epithelium as well as in MUC5AC mucin expression. Therefore, TXNDC11 may be considered a functional candidate gene, and further research is needed to explore its potential role in RAO-affected horses. © 2017 Stichting International Foundation for Animal Genetics.
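An allelic case-control test of the kind mentioned above compares allele counts, rather than genotype counts, between cases and controls. The sketch below shows one common form, a Pearson chi-square test on a 2x2 table with one degree of freedom. The abstract does not give the counts or the exact software used, so the function name and the example numbers are hypothetical.

```python
import math
from typing import Tuple


def allelic_chi_square(case_a: int, case_b: int, ctrl_a: int, ctrl_b: int) -> Tuple[float, float]:
    """Pearson chi-square test (1 df) on a 2x2 table of allele counts.

    Rows are cases and controls, columns are the two alleles of one SNP.
    Returns (chi-square statistic, p-value). Assumes no row or column sums to zero.
    """
    table = [[case_a, case_b], [ctrl_a, ctrl_b]]
    total = case_a + case_b + ctrl_a + ctrl_b
    row_sums = [case_a + case_b, ctrl_a + ctrl_b]
    col_sums = [case_a + ctrl_a, case_b + ctrl_b]

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_sums[i] * col_sums[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected

    # For one degree of freedom the upper tail of the chi-square distribution
    # equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical allele counts for a single SNP (not taken from the study).
chi2, p = allelic_chi_square(case_a=60, case_b=40, ctrl_a=40, ctrl_b=60)
print(f"chi2 = {chi2:.2f}, p = {p:.4f}")
```

In a genome-wide scan this test would be repeated for every SNP, so each p-value has to be judged against a threshold that accounts for the number of tests.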
Allen, Alexandra M; Winfield, Mark O; Burridge, Amanda J; Downie, Rowena C; Benbow, Harriet R; Barker, Gary L A; Wilkinson, Paul A; Coghill, Jane; Waterfall, Christy; Davassi, Alessandro; Scopes, Geoff; Pirani, Ali; Webster, Teresa; Brew, Fiona; Bloor, Claire; Griffiths, Simon; Bentley, Alison R; Alda, Mark; Jack, Peter; Phillips, Andrew L; Edwards, Keith J
2017-03-01
Targeted selection and inbreeding have resulted in a lack of genetic diversity in elite hexaploid bread wheat accessions. Reduced diversity can be a limiting factor in the breeding of high yielding varieties and crucially can mean reduced resilience in the face of changing climate and resource pressures. Recent technological advances have enabled the development of molecular markers for use in the assessment and utilization of genetic diversity in hexaploid wheat. Starting with a large collection of 819 571 previously characterized wheat markers, here we describe the identification of 35 143 single nucleotide polymorphism-based markers, which are highly suited to the genotyping of elite hexaploid wheat accessions. To assess their suitability, the markers have been validated using a commercial high-density Affymetrix Axiom® genotyping array (the Wheat Breeders' Array), in a high-throughput 384 microplate configuration, to characterize a diverse global collection of wheat accessions including landraces and elite lines derived from commercial breeding communities. We demonstrate that the Wheat Breeders' Array is also suitable for generating high-density genetic maps of previously uncharacterized populations and for characterizing novel genetic diversity produced by mutagenesis. To facilitate the use of the array by the wheat community, the markers, the associated sequence and the genotype information have been made available through the interactive web site 'CerealsDB'. © 2016 The Authors. Plant Biotechnology Journal published by Society for Experimental Biology and The Association of Applied Biologists and John Wiley & Sons Ltd.
SNPConvert: SNP Array Standardization and Integration in Livestock Species.
Nicolazzi, Ezequiel Luis; Marras, Gabriele; Stella, Alessandra
2016-06-09
One of the main advantages of single nucleotide polymorphism (SNP) array technology is providing genotype calls for a specific number of SNP markers at a relatively low cost. Since its first application in animal genetics, the number of available SNP arrays for each species has been constantly increasing. However, in contrast to whole genome sequence data analysis, SNP array data does not have a common set of file formats or coding conventions for allele calling. Therefore, the standardization and integration of SNP array data from multiple sources have become an obstacle, especially for users with basic or no programming skills. Here, we describe the difficulties related to handling SNP array data, focusing on file formats, SNP allele coding, and mapping. We also present the SNPConvert suite, a multi-platform, open-source, and user-friendly set of tools to overcome these issues. This tool, which can be integrated with open-source and open-access tools already available, is a first step towards an integrated system to standardize and integrate any type of raw SNP array data. The tool is available at: https://github.com/nicolazzie/SNPConvert.git.
Stec, James; Wang, Jing; Coombes, Kevin; Ayers, Mark; Hoersch, Sebastian; Gold, David L.; Ross, Jeffrey S; Hess, Kenneth R.; Tirrell, Stephen; Linette, Gerald; Hortobagyi, Gabriel N.; Symmans, W. Fraser; Pusztai, Lajos
2005-01-01
We examined how well differentially expressed genes and multigene outcome classifiers retain their class-discriminating values when tested on data generated by different transcriptional profiling platforms. RNA from 33 stage I-III breast cancers was hybridized to both Affymetrix GeneChip and Millennium Pharmaceuticals cDNA arrays. Only 30% of all corresponding gene expression measurements on the two platforms had Pearson correlation coefficient r ≥ 0.7 when UniGene was used to match probes. There was substantial variation in correlation between different Affymetrix probe sets matched to the same cDNA probe. When cDNA and Affymetrix probes were matched by basic local alignment tool (BLAST) sequence identity, the correlation increased substantially. We identified 182 genes in the Affymetrix and 45 in the cDNA data (including 17 common genes) that accurately separated 91% of cases in supervised hierarchical clustering in each data set. Cross-platform testing of these informative genes resulted in lower clustering accuracy of 45 and 79%, respectively. Several sets of accurate five-gene classifiers were developed on each platform using linear discriminant analysis. The best 100 classifiers showed average misclassification error rate of 2% on the original data that rose to 19.5% when tested on data from the other platform. Random five-gene classifiers showed misclassification error rate of 33%. We conclude that multigene predictors optimized for one platform lose accuracy when applied to data from another platform due to missing genes and sequence differences in probes that result in differing measurements for the same gene. PMID:16049308
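The cross-platform comparison above rests on a per-gene Pearson correlation across the same samples, followed by counting the genes that reach r ≥ 0.7. A minimal sketch of that calculation follows. The gene names and expression values are invented, and matching probes by a shared gene key is a simplification of the UniGene and BLAST matching described in the abstract.

```python
import math
from typing import Dict, List


def pearson_r(x: List[float], y: List[float]) -> float:
    """Pearson correlation coefficient of two equally long value lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fraction_concordant(platform_a: Dict[str, List[float]],
                        platform_b: Dict[str, List[float]],
                        threshold: float = 0.7) -> float:
    """Share of genes present on both platforms whose r meets the threshold."""
    shared_genes = platform_a.keys() & platform_b.keys()
    hits = sum(pearson_r(platform_a[g], platform_b[g]) >= threshold for g in shared_genes)
    return hits / len(shared_genes)


# Hypothetical measurements of two genes in four samples on two platforms.
oligo_array = {"GENE1": [1.0, 2.0, 3.0, 4.0], "GENE2": [1.0, 2.0, 3.0, 4.0]}
cdna_array = {"GENE1": [2.0, 4.0, 6.0, 8.0], "GENE2": [4.0, 3.0, 2.0, 1.0]}

print(fraction_concordant(oligo_array, cdna_array))
```

Here GENE1 tracks perfectly across platforms while GENE2 is inverted, so half of the shared genes pass the threshold.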
USDA-ARS's Scientific Manuscript database
The Axiom® Turkey Genotyping Array interrogates 643,845 probesets on the array, covering 643,845 SNPs. The array development was led by Dr. Julie Long of the USDA-ARS Beltsville Agricultural Research Center under a public-private partnership with Hendrix Genetics, Aviagen, and Affymetrix. The Turk...
VIZARD: analysis of Affymetrix Arabidopsis GeneChip data
NASA Technical Reports Server (NTRS)
Moseyko, Nick; Feldman, Lewis J.
2002-01-01
SUMMARY: The Affymetrix GeneChip Arabidopsis genome array has proved to be a very powerful tool for the analysis of gene expression in Arabidopsis thaliana, the most commonly studied plant model organism. VIZARD is a Java program created at the University of California, Berkeley, to facilitate analysis of Arabidopsis GeneChip data. It includes several integrated tools for filtering, sorting, clustering and visualization of gene expression data as well as tools for the discovery of regulatory motifs in upstream sequences. VIZARD also includes annotation and upstream sequence databases for the majority of genes represented on the Affymetrix Arabidopsis GeneChip array. AVAILABILITY: VIZARD is available free of charge for educational, research, and not-for-profit purposes, and can be downloaded at http://www.anm.f2s.com/research/vizard/ CONTACT: moseyko@uclink4.berkeley.edu.
ArrayInitiative - a tool that simplifies creating custom Affymetrix CDFs
2011-01-01
Background Probes on a microarray represent a frozen view of a genome and are quickly outdated when new sequencing studies extend our knowledge, resulting in significant measurement error when analyzing any microarray experiment. There are several bioinformatics approaches to improve probe assignments, but without in-house programming expertise, standardizing these custom array specifications as a usable file (e.g. as Affymetrix CDFs) is difficult, owing mostly to the complexity of the specification file format. However, without correctly standardized files there is a significant barrier for testing competing analysis approaches since this file is one of the required inputs for many commonly used algorithms. The need to test combinations of probe assignments and analysis algorithms led us to develop ArrayInitiative, a tool for creating and managing custom array specifications. Results ArrayInitiative is a standalone, cross-platform, rich client desktop application for creating correctly formatted, custom versions of manufacturer-provided (default) array specifications, requiring only minimal knowledge of the array specification rules and file formats. Users can import default array specifications, import probe sequences for a default array specification, design and import a custom array specification, export any array specification to multiple output formats, export the probe sequences for any array specification and browse high-level information about the microarray, such as version and number of probes. The initial release of ArrayInitiative supports the Affymetrix 3' IVT expression arrays we currently analyze, but as an open source application, we hope that others will contribute modules for other platforms. Conclusions ArrayInitiative allows researchers to create new array specifications, in a standard format, based upon their own requirements. This makes it easier to test competing design and analysis strategies that depend on probe definitions. 
Since the custom array specifications are easily exported to the manufacturer's standard format, researchers can analyze these customized microarray experiments using established software tools, such as those available in Bioconductor. PMID:21548938
Combined array CGH plus SNP genome analyses in a single assay for optimized clinical testing
Wiszniewska, Joanna; Bi, Weimin; Shaw, Chad; Stankiewicz, Pawel; Kang, Sung-Hae L; Pursley, Amber N; Lalani, Seema; Hixson, Patricia; Gambin, Tomasz; Tsai, Chun-hui; Bock, Hans-Georg; Descartes, Maria; Probst, Frank J; Scaglia, Fernando; Beaudet, Arthur L; Lupski, James R; Eng, Christine; Wai Cheung, Sau; Bacino, Carlos; Patel, Ankita
2014-01-01
In clinical diagnostics, both array comparative genomic hybridization (array CGH) and single nucleotide polymorphism (SNP) genotyping have proven to be powerful genomic technologies utilized for the evaluation of developmental delay, multiple congenital anomalies, and neuropsychiatric disorders. Differences in the ability to resolve genomic changes between these arrays may constitute an implementation challenge for clinicians: which platform (SNP vs array CGH) might best detect the underlying genetic cause for the disease in the patient? While only SNP arrays enable the detection of copy number neutral regions of absence of heterozygosity (AOH), they have limited ability to detect single-exon copy number variants (CNVs) due to the distribution of SNPs across the genome. To provide comprehensive clinical testing for both CNVs and copy-neutral AOH, we enhanced our custom-designed high-resolution oligonucleotide array that has exon-targeted coverage of 1860 genes with 60 000 SNP probes, referred to as Chromosomal Microarray Analysis – Comprehensive (CMA-COMP). Of the 3240 cases evaluated by this array, clinically significant CNVs were detected in 445 cases including 21 cases with exonic events. In addition, 162 cases (5.0%) showed at least one AOH region >10 Mb. We demonstrate that even though this array has a lower density of SNP probes than other commercially available SNP arrays, it reliably detected AOH events >10 Mb as well as exonic CNVs beyond the detection limitations of SNP genotyping. Thus, combining SNP probes and exon-targeted array CGH into one platform provides clinically useful genetic screening in an efficient manner. PMID:23695279
Population Genetic Structure of the People of Qatar
Hunter-Zinck, Haley; Musharoff, Shaila; Salit, Jacqueline; Al-Ali, Khalid A.; Chouchane, Lotfi; Gohar, Abeer; Matthews, Rebecca; Butler, Marcus W.; Fuller, Jennifer; Hackett, Neil R.; Crystal, Ronald G.; Clark, Andrew G.
2010-01-01
People of the Qatar peninsula represent a relatively recent founding by a small number of families from three tribes of the Arabian Peninsula, Persia, and Oman, with indications of African admixture. To assess the roles of both this founding effect and the customary first-cousin marriages among the ancestral Islamic populations in Qatar's population genetic structure, we obtained and genotyped with Affymetrix 500k SNP arrays DNA samples from 168 self-reported Qatari nationals sampled from Doha, Qatar. Principal components analysis was performed along with samples from the Human Genetic Diversity Project data set, revealing three clear clusters of genotypes whose proximity to other human population samples is consistent with Arabian origin, a more eastern or Persian origin, and individuals with African admixture. The extent of linkage disequilibrium (LD) is greater than that of African populations, and runs of homozygosity in some individuals reflect substantial consanguinity. However, the variance in runs of homozygosity is exceptionally high, and the degree of identity-by-descent sharing generally appears to be lower than expected for a population in which nearly half of marriages are between first cousins. Despite the fact that the SNPs of the Affymetrix 500k chip were ascertained with a bias toward SNPs common in Europeans, the data strongly support the notion that the Qatari population could provide a valuable resource for the mapping of genes associated with complex disorders and that tests of pairwise interactions are particularly empowered by populations with elevated LD like the Qatari. PMID:20579625
Halgren, Christina; Bache, Iben; Bak, Mads; Myatt, Mikkel Wanting; Anderson, Claire Marie; Brøndum-Nielsen, Karen; Tommerup, Niels
2012-01-01
Only 20 patients with deletions of 18q12.2 have been reported in the literature and the associated phenotype includes borderline intellectual disability, behavioral problems, seizures, obesity, and eye manifestations. Here, we report a male patient with a de novo translocation involving chromosomes 12 and 18, with borderline IQ, developmental and behavioral disorders, myopia, obesity, and febrile seizures in childhood. We characterized the rearrangement with Affymetrix SNP 6.0 Array analysis and next-generation mate pair sequencing and found truncation of CELF4 at 18q12.2. This second report of a patient with a neurodevelopmental phenotype and a translocation involving CELF4 supports that CELF4 is responsible for the phenotype associated with deletion of 18q12.2. Our study illustrates the utility of high-resolution genome-wide techniques in identifying neurodevelopmental and neurobehavioral genes, and it adds to the growing evidence, including a transgenic mouse model, that CELF4 is important for human brain development. PMID:22617346
Copy number variations and genetic admixtures in three Xinjiang ethnic minority groups
Lou, Haiyi; Li, Shilin; Jin, Wenfei; Fu, Ruiqing; Lu, Dongsheng; Pan, Xinwei; Zhou, Huaigu; Ping, Yuan; Jin, Li; Xu, Shuhua
2015-01-01
Xinjiang is geographically located in central Asia, and it has played an important historical role in connecting eastern Eurasian (EEA) and western Eurasian (WEA) people. However, human population genomic studies in this region have been largely underrepresented, especially with respect to studies of copy number variations (CNVs). Here we constructed the first CNV map of the three major ethnic minority groups, the Uyghur, Kazakh and Kirgiz, using Affymetrix Genome-Wide Human SNP Array 6.0. We systematically compared the properties of CNVs we identified in the three groups with the data from representatives of EEA and WEA. The analyses indicated a typical genetic admixture pattern in all three groups with ancestries from both EEA and WEA. We also identified several CNV regions showing significant deviation of allele frequency from the expected genome-wide distribution, which might be associated with population-specific phenotypes. Our study provides the first genome-wide perspective on the CNVs of three major Xinjiang ethnic minority groups and has implications for both evolutionary and medical studies. PMID:25026903
CGDSNPdb: a database resource for error-checked and imputed mouse SNPs.
Hutchins, Lucie N; Ding, Yueming; Szatkiewicz, Jin P; Von Smith, Randy; Yang, Hyuna; de Villena, Fernando Pardo-Manuel; Churchill, Gary A; Graber, Joel H
2010-07-06
The Center for Genome Dynamics Single Nucleotide Polymorphism Database (CGDSNPdb) is an open-source value-added database with more than nine million mouse single nucleotide polymorphisms (SNPs), drawn from multiple sources, with genotypes assigned to multiple inbred strains of laboratory mice. All SNPs are checked for accuracy and annotated for properties specific to the SNP as well as those implied by changes to overlapping protein-coding genes. CGDSNPdb serves as the primary interface to two unique data sets, the 'imputed genotype resource' in which a Hidden Markov Model was used to assess local haplotypes and the most probable base assignment at several million genomic loci in tens of strains of mice, and the Affymetrix Mouse Diversity Genotyping Array, a high density microarray with over 600,000 SNPs and over 900,000 invariant genomic probes. CGDSNPdb is accessible online through either a web-based query tool or a MySQL public login. Database URL: http://cgd.jax.org/cgdsnpdb/
Accuracy of CNV Detection from GWAS Data.
Zhang, Dandan; Qian, Yudong; Akula, Nirmala; Alliey-Rodriguez, Ney; Tang, Jinsong; Gershon, Elliot S; Liu, Chunyu
2011-01-13
Several computer programs are available for detecting copy number variants (CNVs) using genome-wide SNP arrays. We evaluated the performance of four CNV detection software suites--Birdsuite, Partek, HelixTree, and PennCNV-Affy--in the identification of both rare and common CNVs. Each program's performance was assessed in two ways. The first was its recovery rate, i.e., its ability to call 893 CNVs previously identified in eight HapMap samples by paired-end sequencing of whole-genome fosmid clones, and 51,440 CNVs identified by array Comparative Genome Hybridization (aCGH) followed by validation procedures, in 90 HapMap CEU samples. The second evaluation was program performance calling rare and common CNVs in the Bipolar Genome Study (BiGS) data set (1001 bipolar cases and 1033 controls, all of European ancestry) as measured by the Affymetrix SNP 6.0 array. Accuracy in calling rare CNVs was assessed by positive predictive value, based on the proportion of rare CNVs validated by quantitative real-time PCR (qPCR), while accuracy in calling common CNVs was assessed by false positive/false negative rates based on qPCR validation results from a subset of common CNVs. Birdsuite recovered the highest percentages of known HapMap CNVs containing >20 markers in two reference CNV datasets. The recovery rate increased with decreased CNV frequency. In the tested rare CNV data, Birdsuite and Partek had higher positive predictive values than the other software suites. In a test of three common CNVs in the BiGS dataset, Birdsuite's call was 98.8% consistent with qPCR quantification in one CNV region, but the other two regions showed an unacceptable degree of accuracy. We found relatively poor consistency between the two "gold standards," the sequence data of Kidd et al., and aCGH data of Conrad et al. Algorithms for calling CNVs especially common ones need substantial improvement, and a "gold standard" for detection of CNVs remains to be established.
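The two accuracy measures used above, recovery rate against a reference CNV set and positive predictive value from qPCR validation, can be sketched as follows. The abstract does not state the overlap criterion, so treating any shared base on the same chromosome as a recovered CNV is an assumption, and all coordinates and counts below are hypothetical.

```python
from typing import List, Tuple

Interval = Tuple[str, int, int]  # (chromosome, start, end), inclusive coordinates


def overlaps(a: Interval, b: Interval) -> bool:
    """True when two intervals lie on the same chromosome and share a base."""
    return a[0] == b[0] and a[1] <= b[2] and b[1] <= a[2]


def recovery_rate(reference: List[Interval], calls: List[Interval]) -> float:
    """Fraction of reference CNVs overlapped by at least one called CNV."""
    recovered = sum(any(overlaps(ref, call) for call in calls) for ref in reference)
    return recovered / len(reference)


def positive_predictive_value(validated: int, tested: int) -> float:
    """Fraction of tested CNV calls that were confirmed by validation."""
    return validated / tested


# Hypothetical reference CNVs and software calls.
reference_cnvs = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 100, 200)]
called_cnvs = [("chr1", 150, 250), ("chr2", 300, 400)]

print("recovery rate:", recovery_rate(reference_cnvs, called_cnvs))
print("PPV:", positive_predictive_value(validated=8, tested=10))
```

A stricter criterion, such as requiring reciprocal overlap above some fraction, would lower the recovery rate, which is one reason results from different evaluations are hard to compare.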
Qualitative assessment of gene expression in affymetrix genechip arrays
NASA Astrophysics Data System (ADS)
Nagarajan, Radhakrishnan; Upreti, Meenakshi
2007-01-01
Affymetrix Genechip microarrays are used widely to determine the simultaneous expression of genes in a given biological paradigm. Probes on the Genechip array are atomic entities which by definition are randomly distributed across the array and in turn govern the gene expression. In the present study, we make several interesting observations. We show that there is considerable correlation between the probe intensities across the array, which defies the independence assumption. While the mechanism behind such correlations is unclear, we show that the scaling behavior and the profiles of perfect match (PM) as well as mismatch (MM) probes are similar and immune to background subtraction. We believe that the observed correlations are possibly an outcome of inherent non-stationarities or patchiness in the array devoid of biological significance. This is demonstrated by inspecting the scaling behavior and profiles of the PM and MM probe intensities obtained from publicly available Genechip arrays from three eukaryotic genomes, namely: Drosophila melanogaster (fruit fly), Homo sapiens (humans) and Mus musculus (house mouse) across distinct biological paradigms and across laboratories, with and without background subtraction. The fluctuation functions were estimated using detrended fluctuation analysis (DFA) with fourth-order polynomial detrending. The results presented in this study provide new insights into correlation signatures of PM and MM probe intensities and suggest the choice of DFA as a tool for qualitative assessment of Affymetrix Genechip microarrays prior to their analysis. A more detailed investigation is necessary in order to understand the source of these correlations.
Lu, Timothy Tehua; Lao, Oscar; Nothnagel, Michael; Junge, Olaf; Freitag-Wolf, Sandra; Caliebe, Amke; Balascakova, Miroslava; Bertranpetit, Jaume; Bindoff, Laurence Albert; Comas, David; Holmlund, Gunilla; Kouvatsi, Anastasia; Macek, Milan; Mollet, Isabelle; Nielsen, Finn; Parson, Walther; Palo, Jukka; Ploski, Rafal; Sajantila, Antti; Tagliabracci, Adriano; Gether, Ulrik; Werge, Thomas; Rivadeneira, Fernando; Hofman, Albert; Uitterlinden, André Gerardus; Gieger, Christian; Wichmann, Heinz-Erich; Ruether, Andreas; Schreiber, Stefan; Becker, Christian; Nürnberg, Peter; Nelson, Matthew Roberts; Kayser, Manfred; Krawczak, Michael
2009-07-01
Genetic matching potentially provides a means to alleviate the effects of incomplete Mendelian randomization in population-based gene-disease association studies. We therefore evaluated the genetic-matched pair study design on the basis of genome-wide SNP data (309,790 markers; Affymetrix GeneChip Human Mapping 500K Array) from 2457 individuals, sampled at 23 different recruitment sites across Europe. Using pair-wise identity-by-state (IBS) as a matching criterion, we tried to derive a subset of markers that would allow identification of the best overall matching (BOM) partner for a given individual, based on the IBS status for the subset alone. However, our results suggest that, by following this approach, the prediction accuracy is only notably improved by the first 20 markers selected, and increases proportionally to the marker number thereafter. Furthermore, in a considerable proportion of cases (76.0%), the BOM of a given individual, based on the complete marker set, came from a different recruitment site than the individual itself. A second marker set, specifically selected for ancestry sensitivity using singular value decomposition, performed even more poorly and was no more capable of predicting the BOM than randomly chosen subsets. This leads us to conclude that, at least in Europe, the utility of the genetic-matched pair study design depends critically on the availability of comprehensive genotype information for both cases and controls.
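Matching by pair-wise identity-by-state (IBS), as described above, can be sketched in a few lines. The example assumes genotypes coded as 0, 1 or 2 copies of one allele and no missing data. The individuals, the genotype values and the helper names are hypothetical, and the scoring is a plain mean of shared alleles rather than the authors' exact procedure.

```python
from typing import Dict, List, Optional, Sequence


def ibs_similarity(g1: Sequence[int], g2: Sequence[int]) -> float:
    """Mean proportion of alleles shared identical by state, between 0 and 1.

    At each marker two individuals share 2 - |g1 - g2| alleles out of 2.
    """
    shared = sum(2 - abs(a - b) for a, b in zip(g1, g2))
    return shared / (2 * len(g1))


def best_match(target_id: str,
               genotypes: Dict[str, List[int]],
               markers: Optional[Sequence[int]] = None) -> str:
    """Return the individual with the highest IBS to the target.

    `markers` optionally restricts the comparison to a subset of marker indices.
    """
    target = genotypes[target_id]
    idx = range(len(target)) if markers is None else markers

    def score(other_id: str) -> float:
        other = genotypes[other_id]
        return ibs_similarity([target[i] for i in idx], [other[i] for i in idx])

    candidates = [k for k in genotypes if k != target_id]
    return max(candidates, key=score)


# Hypothetical genotypes at six markers.
genotypes = {
    "ind1": [0, 1, 2, 2, 0, 1],
    "ind2": [0, 1, 2, 1, 0, 1],
    "ind3": [2, 1, 0, 2, 2, 1],
}

print("best match on all markers:", best_match("ind1", genotypes))
print("best match on markers 3 and 5:", best_match("ind1", genotypes, markers=[3, 5]))
```

In this toy data the full marker set picks ind2, while the two-marker subset picks ind3, a small-scale version of the abstract's point that a marker subset may fail to identify the best overall match.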
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kerns, Sarah L.; Ostrer, Harry; Stock, Richard
2010-12-01
Purpose: To identify single nucleotide polymorphisms (SNPs) associated with erectile dysfunction (ED) among African-American prostate cancer patients treated with external beam radiation therapy. Methods and Materials: A cohort of African-American prostate cancer patients treated with external beam radiation therapy was observed for the development of ED by use of the five-item Sexual Health Inventory for Men (SHIM) questionnaire. Final analysis included 27 cases (post-treatment SHIM score ≤7) and 52 control subjects (post-treatment SHIM score ≥16). A genome-wide association study was performed using approximately 909,000 SNPs genotyped on Affymetrix 6.0 arrays (Affymetrix, Santa Clara, CA). Results: We identified SNP rs2268363, located in the follicle-stimulating hormone receptor (FSHR) gene, as significantly associated with ED after correcting for multiple comparisons (unadjusted p = 5.46 × 10^-8, Bonferroni p = 0.028). We identified four additional SNPs that tended toward a significant association with an unadjusted p value < 10^-6. Inference of population substructure showed that cases had a higher proportion of African ancestry than control subjects (77% vs. 60%, p = 0.005). A multivariate logistic regression model that incorporated estimated ancestry and four of the top-ranked SNPs was a more accurate classifier of ED than a model that included only clinical variables. Conclusions: To our knowledge, this is the first genome-wide association study to identify SNPs associated with adverse effects resulting from radiotherapy. It is important to note that the SNP that proved to be significantly associated with ED is located within a gene whose encoded product plays a role in male gonad development and function. Another key finding of this project is that the four SNPs most strongly associated with ED were specific to persons of African ancestry and would therefore not have been identified had a cohort of European ancestry been screened. This study demonstrates the feasibility of a genome-wide approach to investigate genetic predisposition to radiation injury.
Power enhancement via multivariate outlier testing with gene expression arrays.
Asare, Adam L; Gao, Zhong; Carey, Vincent J; Wang, Richard; Seyfert-Margolis, Vicki
2009-01-01
As the use of microarrays in human studies continues to increase, stringent quality assurance is necessary to ensure accurate experimental interpretation. We present a formal approach for microarray quality assessment that is based on dimension reduction of established measures of signal and noise components of expression followed by parametric multivariate outlier testing. We applied our approach to several data resources. First, as a negative control, we found that the Affymetrix and Illumina contributions to MAQC data were free from outliers at a nominal outlier flagging rate of alpha=0.01. Second, we created a tunable framework for artificially corrupting intensity data from the Affymetrix Latin Square spike-in experiment to allow investigation of sensitivity and specificity of quality assurance (QA) criteria. Third, we applied the procedure to 507 Affymetrix microarray GeneChips processed with RNA from human peripheral blood samples. We show that exclusion of arrays by this approach substantially increases inferential power, or the ability to detect differential expression, in large clinical studies. Availability: arrayMvout at http://bioconductor.org/packages/2.3/bioc/html/arrayMvout.html and affyContam at http://bioconductor.org/packages/2.3/bioc/html/affyContam.html (credentials: readonly/readonly).
Quinn, Michael C J; Wilson, Daniel J; Young, Fiona; Dempsey, Adam A; Arcand, Suzanna L; Birch, Ashley H; Wojnarowicz, Paulina M; Provencher, Diane; Mes-Masson, Anne-Marie; Englert, David; Tonin, Patricia N
2009-07-06
As gene expression signatures may serve as biomarkers, there is a need to develop technologies based on mRNA expression patterns that are adaptable for translational research. Xceed Molecular has recently developed a Ziplex technology, that can assay for gene expression of a discrete number of genes as a focused array. The present study has evaluated the reproducibility of the Ziplex system as applied to ovarian cancer research of genes shown to exhibit distinct expression profiles initially assessed by Affymetrix GeneChip analyses. The new chemiluminescence-based Ziplex gene expression array technology was evaluated for the expression of 93 genes selected based on their Affymetrix GeneChip profiles as applied to ovarian cancer research. Probe design was based on the Affymetrix target sequence that favors the 3' UTR of transcripts in order to maximize reproducibility across platforms. Gene expression analysis was performed using the Ziplex Automated Workstation. Statistical analyses were performed to evaluate reproducibility of both the magnitude of expression and differences between normal and tumor samples by correlation analyses, fold change differences and statistical significance testing. Expressions of 82 of 93 (88.2%) genes were highly correlated (p < 0.01) in a comparison of the two platforms. Overall, 75 of 93 (80.6%) genes exhibited consistent results in normal versus tumor tissue comparisons for both platforms (p < 0.001). The fold change differences were concordant for 87 of 93 (94%) genes, where there was agreement between the platforms regarding statistical significance for 71 (76%) of 87 genes. There was a strong agreement between the two platforms as shown by comparisons of log2 fold differences of gene expression between tumor versus normal samples (R = 0.93) and by Bland-Altman analysis, where greater than 90% of expression values fell within the 95% limits of agreement. 
Overall concordance of gene expression patterns based on correlations, statistical significance between tumor and normal ovary data, and fold changes was consistent between the Ziplex and Affymetrix platforms. The reproducibility and ease-of-use of the technology suggests that the Ziplex array is a suitable platform for translational research.
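The Bland-Altman comparison mentioned above can be sketched as follows. This is a generic illustration under stated assumptions, not the study's analysis: it takes the 95% limits of agreement to be the mean difference plus or minus 1.96 sample standard deviations, and the paired values are hypothetical log2 fold changes.

```python
from statistics import mean, stdev


def bland_altman(a, b):
    """Return (bias, lower, upper): mean difference and 95% limits of agreement."""
    diffs = [x - y for x, y in zip(a, b)]
    bias = mean(diffs)
    half_width = 1.96 * stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - half_width, bias + half_width


def fraction_within_limits(a, b):
    """Share of paired differences that fall inside the limits of agreement."""
    _, lower, upper = bland_altman(a, b)
    diffs = [x - y for x, y in zip(a, b)]
    return sum(lower <= d <= upper for d in diffs) / len(diffs)


# Hypothetical paired measurements from two platforms.
platform_a = [1.0, 2.0, 3.0, 4.0]
platform_b = [0.5, 2.5, 2.5, 4.5]
bias, lower, upper = bland_altman(platform_a, platform_b)
share = fraction_within_limits(platform_a, platform_b)
```

A statement such as "greater than 90% of expression values fell within the 95% limits of agreement" corresponds to `share > 0.9` in this sketch.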
Family-Based Benchmarking of Copy Number Variation Detection Software.
Nutsua, Marcel Elie; Fischer, Annegret; Nebel, Almut; Hofmann, Sylvia; Schreiber, Stefan; Krawczak, Michael; Nothnagel, Michael
2015-01-01
The analysis of structural variants, in particular of copy-number variations (CNVs), has proven valuable in unraveling the genetic basis of human diseases. Hence, a large number of algorithms have been developed for the detection of CNVs in SNP array signal intensity data. Using the European and African HapMap trio data, we undertook a comparative evaluation of six commonly used CNV detection software tools, namely Affymetrix Power Tools (APT), QuantiSNP, PennCNV, GLAD, R-gada and VEGA, and assessed their level of pair-wise prediction concordance. The tool-specific CNV prediction accuracy was assessed in silico by way of intra-familial validation. Software tools differed greatly in terms of the number and length of the CNVs predicted as well as the number of markers included in a CNV. All software tools predicted substantially more deletions than duplications. Intra-familial validation revealed consistently low levels of prediction accuracy as measured by the proportion of validated CNVs (34-60%). Moreover, up to 20% of apparent family-based validations were found to be due to chance alone. Software using Hidden Markov models (HMM) showed a trend to predict fewer CNVs than segmentation-based algorithms albeit with greater validity. PennCNV yielded the highest prediction accuracy (60.9%). Finally, the pairwise concordance of CNV prediction was found to vary widely with the software tools involved. We recommend HMM-based software, in particular PennCNV, rather than segmentation-based algorithms when validity is the primary concern of CNV detection. QuantiSNP may be used as an additional tool to detect sets of CNVs not detectable by the other tools. Our study also reemphasizes the need for laboratory-based validation, such as qPCR, of CNVs predicted in silico.
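Intra-familial validation can be illustrated with a small sketch. The abstract does not state the exact criterion used, so the rule below is an assumption: a CNV called in a child counts as validated when a parent carries a call of the same type on the same chromosome that covers at least half of the child's call. The `CNV` type, the `min_overlap` threshold and the coordinates are all hypothetical.

```python
from typing import NamedTuple


class CNV(NamedTuple):
    chrom: str
    start: int
    end: int
    kind: str  # "del" or "dup"


def overlap_fraction(child, parent):
    """Fraction of the child's CNV covered by a parental CNV of the same type."""
    if child.chrom != parent.chrom or child.kind != parent.kind:
        return 0.0
    shared = min(child.end, parent.end) - max(child.start, parent.start)
    return max(shared, 0) / (child.end - child.start)


def validation_rate(child_calls, parent_calls, min_overlap=0.5):
    """Proportion of child CNVs supported by at least one parental CNV."""
    if not child_calls:
        return 0.0
    validated = sum(
        any(overlap_fraction(c, p) >= min_overlap for p in parent_calls)
        for c in child_calls
    )
    return validated / len(child_calls)


# Hypothetical calls for one trio (both parents pooled).
child = [CNV("1", 100, 200, "del"), CNV("2", 500, 900, "dup"), CNV("3", 10, 50, "del")]
parents = [CNV("1", 120, 260, "del"), CNV("2", 850, 1000, "dup")]
rate = validation_rate(child, parents)
```

As the abstract notes, some matches of this kind arise by chance, so a rate computed this way is an upper bound on true validation unless it is compared against a permuted or unrelated-parent baseline.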
Common variants at the promoter region of the APOM confer a risk of rheumatoid arthritis
Hu, Hae-Jin; Jin, Eun-Heui; Yim, Seon-Hee; Yang, So-Young; Jung, Seung-Hyun; Shin, Seung-Hun; Kim, Wan-Uk; Shim, Seung-Cheol; Kim, Tai-Gyu
2011-01-01
Although a genetic component in the etiology of rheumatoid arthritis (RA) has been consistently suggested, many novel genetic loci remain to be uncovered. To identify RA risk loci, we performed a genome-wide association study (GWAS) with 100 RA cases and 600 controls using the Affymetrix SNP array 5.0. The candidate risk locus (APOM gene) was re-sequenced to discover novel promoter and coding variants in a group of the subjects. Replication was performed with an independent case-control set comprising 578 RA cases and 711 controls. Through GWAS, we identified a novel SNP associated with RA at the APOM gene in the MHC class III region on 6p21.33 (rs805297, odds ratio (OR) = 2.28, P = 5.20 × 10^-7). Three more polymorphisms were identified at the promoter region of APOM by the re-sequencing. For the replication, we genotyped the four SNP loci in the independent case-control set. The association of rs805297 identified by GWAS was successfully replicated (OR = 1.40, P = 6.65 × 10^-5). The association became more significant in the combined analysis of discovery and replication sets (OR = 1.56, P = 2.73 × 10^-10). Individuals with the rs805297 risk allele (A) at the promoter region showed a significantly lower level of APOM expression than those homozygous for the protective allele (C). In the logistic regressions by phenotype status, the homozygous risk genotype (A/A) consistently showed higher ORs than the heterozygous genotype (A/C) for the phenotype-positive RA cases. These results indicate that APOM promoter polymorphisms are significantly associated with susceptibility to RA. PMID:21844665
Population genetic structure of the people of Qatar.
Hunter-Zinck, Haley; Musharoff, Shaila; Salit, Jacqueline; Al-Ali, Khalid A; Chouchane, Lotfi; Gohar, Abeer; Matthews, Rebecca; Butler, Marcus W; Fuller, Jennifer; Hackett, Neil R; Crystal, Ronald G; Clark, Andrew G
2010-07-09
People of the Qatar peninsula represent a relatively recent founding by a small number of families from three tribes of the Arabian Peninsula, Persia, and Oman, with indications of African admixture. To assess the roles of both this founding effect and the customary first-cousin marriages among the ancestral Islamic populations in Qatar's population genetic structure, we obtained and genotyped with Affymetrix 500k SNP arrays DNA samples from 168 self-reported Qatari nationals sampled from Doha, Qatar. Principal components analysis was performed along with samples from the Human Genetic Diversity Project data set, revealing three clear clusters of genotypes whose proximity to other human population samples is consistent with Arabian origin, a more eastern or Persian origin, and individuals with African admixture. The extent of linkage disequilibrium (LD) is greater than that of African populations, and runs of homozygosity in some individuals reflect substantial consanguinity. However, the variance in runs of homozygosity is exceptionally high, and the degree of identity-by-descent sharing generally appears to be lower than expected for a population in which nearly half of marriages are between first cousins. Despite the fact that the SNPs of the Affymetrix 500k chip were ascertained with a bias toward SNPs common in Europeans, the data strongly support the notion that the Qatari population could provide a valuable resource for the mapping of genes associated with complex disorders and that tests of pairwise interactions are particularly empowered by populations with elevated LD like the Qatari. Copyright 2010 The American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.
Genetic Risk Score for Essential Hypertension and Risk of Preeclampsia.
Smith, Caitlin J; Saftlas, Audrey F; Spracklen, Cassandra N; Triche, Elizabeth W; Bjonnes, Andrew; Keating, Brendan; Saxena, Richa; Breheny, Patrick J; Dewan, Andrew T; Robinson, Jennifer G; Hoh, Josephine; Ryckman, Kelli K
2016-01-01
Preeclampsia is a hypertensive complication of pregnancy characterized by novel onset of hypertension after 20 weeks gestation, accompanied by proteinuria. Epidemiological evidence suggests that genetic susceptibility exists for preeclampsia; however, whether preeclampsia is the result of underlying genetic risk for essential hypertension has yet to be investigated. Based on the hypertensive state that is characteristic of preeclampsia, we aimed to determine if established genetic risk scores (GRSs) for hypertension and blood pressure are associated with preeclampsia. Subjects consisted of 162 preeclamptic cases and 108 normotensive pregnant controls, all of Iowa residence. Subjects' DNA was extracted from buccal swab samples and genotyped on the Affymetrix Genome-wide Human SNP Array 6.0 (Affymetrix, Santa Clara, CA). Missing genotypes were imputed using MaCH and Minimac software. GRSs were calculated for hypertension, systolic blood pressure (SBP), diastolic blood pressure (DBP), and mean arterial pressure (MAP) using established genetic risk loci for each outcome. Regression analyses were performed to determine the association between GRS and risk of preeclampsia. These analyses were replicated in an independent US population of 516 cases and 1,097 controls of European ancestry. GRSs for hypertension, SBP, DBP, and MAP were not significantly associated with risk for preeclampsia (P > 0.189). The results of the replication analysis also yielded nonsignificant associations. GRSs for hypertension and blood pressure are not associated with preeclampsia, suggesting that an underlying predisposition to essential hypertension is not on the causal pathway of preeclampsia. © American Journal of Hypertension, Ltd 2015. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
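A genetic risk score of the kind described above is commonly computed as a weighted sum of risk-allele counts. The sketch below assumes that form; the abstract does not say whether the scores were weighted, and the SNP identifiers, weights and dosages shown are hypothetical placeholders rather than established loci.

```python
def genetic_risk_score(dosages, weights):
    """Weighted sum of risk-allele dosages over the loci listed in `weights`.

    dosages: dict of SNP id -> risk-allele dosage in [0, 2]; imputed values may be fractional.
    weights: dict of SNP id -> per-allele effect size (for example, a published beta).
    Raises KeyError if a locus in `weights` has no dosage.
    """
    return sum(weight * dosages[snp] for snp, weight in weights.items())


# Hypothetical loci and effect sizes, for illustration only.
weights = {"snp_a": 0.10, "snp_b": 0.25, "snp_c": 0.05}
dosages = {"snp_a": 2, "snp_b": 1, "snp_c": 0.4}
score = genetic_risk_score(dosages, weights)
```

Setting every weight to 1 gives the unweighted variant, a plain count of risk alleles. Either score can then be entered as a predictor in a regression on case status.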
Nicolazzi, Ezequiel L; Caprera, Andrea; Nazzicari, Nelson; Cozzi, Paolo; Strozzi, Francesco; Lawley, Cindy; Pirani, Ali; Soans, Chandrasen; Brew, Fiona; Jorjani, Hossein; Evans, Gary; Simpson, Barry; Tosser-Klopp, Gwenola; Brauning, Rudiger; Williams, John L; Stella, Alessandra
2015-04-10
In recent years, the use of genomic information in livestock species for genetic improvement, association studies and many other fields has become routine. In order to accommodate different market requirements in terms of genotyping cost, manufacturers of single nucleotide polymorphism (SNP) arrays, private companies and international consortia have developed a large number of arrays with different content and different SNP density. The number of currently available SNP arrays differs among species, ranging from one for goats to more than ten for cattle, and the number of arrays available is increasing rapidly. However, there has been limited or no effort to standardize and integrate array-specific (e.g. SNP IDs, allele coding) and species-specific (i.e. past and current assemblies) SNP information. Here we present SNPchiMp v.3, a solution to these issues for the six major livestock species (cow, pig, horse, sheep, goat and chicken). Original data were collected directly from SNP array producers and specific international genome consortia, and stored in a MySQL database. The database was then linked to an open-access web tool and to public databases. SNPchiMp v.3 ensures fast access to the database (retrieving within/across SNP array data) and the possibility of annotating SNP array data in a user-friendly fashion. This platform allows easy integration and standardization, and it is aimed at both industry and research. It also enables users to easily link the information available from the array producer with data in public databases, without the need for additional bioinformatics tools or pipelines. In recognition of the open-access use of Ensembl resources, SNPchiMp v.3 was officially credited as an Ensembl E!mpowered tool. Availability at http://bioinformatics.tecnoparco.org/SNPchimp.
... array, and oligo/SNP combination array. Related terms: comparative genomic hybridization ; copy number variant ; SNP array chromosome ... for example, the AB blood groups in humans comparative genomic hybridization Method in which two DNA samples ( ...
Hoffmann, Thomas J; Zhan, Yiping; Kvale, Mark N; Hesselson, Stephanie E; Gollub, Jeremy; Iribarren, Carlos; Lu, Yontao; Mei, Gangwu; Purdy, Matthew M; Quesenberry, Charles; Rowell, Sarah; Shapero, Michael H; Smethurst, David; Somkin, Carol P; Van den Eeden, Stephen K; Walter, Larry; Webster, Teresa; Whitmer, Rachel A; Finn, Andrea; Schaefer, Catherine; Kwok, Pui-Yan; Risch, Neil
2011-12-01
Four custom Axiom genotyping arrays were designed for a genome-wide association (GWA) study of 100,000 participants from the Kaiser Permanente Research Program on Genes, Environment and Health. The array optimized for individuals of European race/ethnicity was previously described. Here we detail the development of three additional microarrays optimized for individuals of East Asian, African American, and Latino race/ethnicity. For these arrays, we decreased redundancy of high-performing SNPs to increase SNP capacity. The East Asian array was designed using greedy pairwise SNP selection. However, removing SNPs from the target set based on imputation coverage is more efficient than pairwise tagging. Therefore, we developed a novel hybrid SNP selection method for the African American and Latino arrays utilizing rounds of greedy pairwise SNP selection, followed by removal from the target set of SNPs covered by imputation. The arrays provide excellent genome-wide coverage and are valuable additions for large-scale GWA studies. Copyright © 2011 Elsevier Inc. All rights reserved.
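Greedy pairwise SNP selection, as named above, can be sketched as a set-cover loop. This is a generic illustration, not the authors' implementation: it assumes a precomputed map from each candidate SNP to the set of target SNPs it tags (itself included) at some pairwise r² threshold. The function name and the toy tag map are hypothetical.

```python
def greedy_tag_selection(tags):
    """Pick SNPs one at a time, each time taking the one that tags the most uncovered targets.

    tags: dict of candidate SNP -> set of target SNPs it tags (assumed to include itself).
    Ties are broken alphabetically so the result is deterministic.
    """
    uncovered = set().union(*tags.values())
    chosen = []
    while uncovered:
        best = max(sorted(tags), key=lambda snp: len(tags[snp] & uncovered))
        chosen.append(best)
        uncovered -= tags[best]
    return chosen


# Hypothetical tag map: s1 tags s2 and s3; s4 and s5 tag each other.
tags = {
    "s1": {"s1", "s2", "s3"},
    "s2": {"s1", "s2"},
    "s3": {"s1", "s3"},
    "s4": {"s4", "s5"},
    "s5": {"s4", "s5"},
}
selected = greedy_tag_selection(tags)
```

The hybrid method in the abstract adds a second step between rounds: targets already covered by imputation are removed from the target set. In this sketch that would amount to subtracting those SNPs from `uncovered` before the next pick.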
Yahya, Padillah; Sulong, Sarina; Harun, Azian; Wan Isa, Hatin; Ab Rajab, Nur-Shafawati; Wangkumhang, Pongsakorn; Wilantho, Alisa; Ngamphiw, Chumpol; Tongsima, Sissades; Zilfalil, Bin Alwi
2017-09-01
Malay, the main ethnic group in Peninsular Malaysia, is represented by various sub-ethnic groups such as Melayu Banjar, Melayu Bugis, Melayu Champa, Melayu Java, Melayu Kedah, Melayu Kelantan, Melayu Minang and Melayu Patani. Using data retrieved from the MyHVP (Malaysian Human Variome Project) database, a total of 135 individuals from these sub-ethnic groups were profiled using the Affymetrix GeneChip Mapping Xba 50-K single nucleotide polymorphism (SNP) array to identify SNPs that were ancestry-informative markers (AIMs) for Malays of Peninsular Malaysia. Prior to selecting the AIMs, the genetic structure of Malays was explored with reference to 11 other populations obtained from the Pan-Asian SNP Consortium database using principal component analysis (PCA) and ADMIXTURE. Iterative pruning principal component analysis (ipPCA) was further used to identify sub-groups of Malays. Subsequently, we constructed an AIMs panel for Malays using the informativeness for assignment (I_n) of genetic markers, and the K-nearest neighbor classifier (KNN) was used to train the classification models. A model of 250 SNPs ranked by I_n correctly classified Malay individuals with an accuracy of up to 90%. The identified panel of SNPs could be utilized as a panel of AIMs to ascertain the specific ancestry of Malays, which may be useful in disease association studies, biomedical research or forensic investigation. Copyright © 2017 Elsevier B.V. All rights reserved.
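The informativeness-for-assignment statistic can be sketched as below. The abstract does not give a formula, so this uses the commonly cited definition (assumed here) in which, for K populations with allele frequencies p_ij and mean frequency p_j, I_n = Σ_j [ -p_j ln p_j + (1/K) Σ_i p_ij ln p_ij ], with 0 ln 0 taken as 0. The function name and example frequencies are hypothetical.

```python
from math import log


def informativeness(freqs):
    """Informativeness for assignment of one marker.

    freqs: one allele-frequency list per population; each list should sum to 1.
    Returns 0 when all populations share the same frequencies and ln(K) at most.
    """
    k = len(freqs)
    total = 0.0
    for j in range(len(freqs[0])):
        p_mean = sum(pop[j] for pop in freqs) / k
        if p_mean > 0:
            total -= p_mean * log(p_mean)
        # Terms with zero frequency contribute nothing (0 * ln 0 treated as 0).
        total += sum(pop[j] * log(pop[j]) for pop in freqs if pop[j] > 0) / k
    return total


fixed_difference = informativeness([[1.0, 0.0], [0.0, 1.0]])  # alleles fixed in opposite populations
no_difference = informativeness([[0.5, 0.5], [0.5, 0.5]])     # identical frequencies
```

Ranking SNPs by this value and keeping the top entries is one way to build a panel like the 250-SNP model described above, which could then feed a nearest-neighbor classifier.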
A novel homozygous variant in the SMOC1 gene underlying Waardenburg anophthalmia syndrome.
Ullah, Asmat; Umair, Muhammad; Ahmad, Farooq; Muhammad, Dost; Basit, Sulman; Ahmad, Wasim
2017-01-01
Waardenburg anophthalmia syndrome (WAS), also known as ophthalmo-acromelic syndrome or anophthalmia-syndactyly, is a rare congenital disorder that segregates in an autosomal recessive pattern. Clinical features of the syndrome include malformation of the eyes and the skeleton. In most cases, WAS is caused by mutations in the SMOC1 gene. The present report describes a large consanguineous family of Pakistani origin segregating Waardenburg anophthalmia syndrome in an autosomal recessive pattern. Genotyping followed by Sanger sequencing was performed to search for a candidate gene. SNP genotyping using the Affymetrix GeneChip Human Mapping 250K Nsp array established a single homozygous region among affected members on chromosome 14q23.1-q24.3 harboring the SMOC1 gene. Sequencing of the gene revealed a novel homozygous missense mutation (c.812G>A; p.Cys271Tyr) in the family. This is the first report of Waardenburg anophthalmia syndrome caused by a SMOC1 variant in a Pakistani population. The mutation identified in the present investigation extends the body of evidence implicating SMOC1 in causing WAS.
Atopic dermatitis in West Highland white terriers is associated with a 1.3-Mb region on CFA 17.
Roque, Joana B; O'Leary, Caroline A; Duffy, David L; Kyaw-Tanner, Myat; Gharahkhani, Puya; Vogelnest, Linda; Mason, Kenneth; Shipstone, Michael; Latter, Melanie
2012-03-01
Canine atopic dermatitis (AD) is an allergic inflammatory skin disease that shares similarities with AD in humans. Canine AD is likely to be an inherited disease in dogs and is common in West Highland white terriers (WHWTs). We performed a genome-wide association study using the Affymetrix Canine SNP V2 array, consisting of over 42,800 single nucleotide polymorphisms, on 35 atopic and 25 non-atopic WHWTs. A gene-dropping simulation method, using SIB-PAIR, identified a projected 1.3 Mb area of association (genome-wide P = 6 × 10^-5 to P = 7 × 10^-4) on CFA 17. Nineteen genes on CFA 17, including 1 potential candidate gene (PTPN22), were located less than 0.5 Mb from the interval of association identified in the genome-wide association analysis. Four haplotypes within this locus were differently distributed between cases and controls in this population of dogs. These findings suggest that a major locus for canine AD in WHWTs may be located on, or in close proximity to, an area on CFA 17.
Vathipadiekal, Vinod; Wang, Victoria; Wei, Wei; Waldron, Levi; Drapkin, Ronny; Gillette, Michael; Skates, Steven; Birrer, Michael
2015-11-01
To generate a comprehensive "Secretome" of proteins potentially found in the blood and derive a virtual Affymetrix array. To validate the utility of this database for the discovery of novel serum-based biomarkers using ovarian cancer transcriptomic data. The secretome was constructed by aggregating the data from databases of known secreted proteins, transmembrane or membrane proteins, signal peptides, G-protein coupled receptors, or proteins existing in the extracellular region, and the virtual array was generated by mapping them to Affymetrix probeset identifiers. Whole-genome microarray data from ovarian cancer, normal ovarian surface epithelium, and fallopian tube epithelium were used to identify transcripts upregulated in ovarian cancer. We established the secretome from eight public databases and a virtual array consisting of 16,521 Affymetrix U133 Plus 2.0 probesets. Using ovarian cancer transcriptomic data, we identified candidate blood-based biomarkers for ovarian cancer and performed bioinformatic validation by demonstrating rediscovery of known biomarkers including CA125 and HE4. Two novel top biomarkers (FGF18 and GPR172A) were validated in serum samples from an independent patient cohort. We present the secretome, comprising the most comprehensive resource available for protein products that are potentially found in the blood. The associated virtual array can be used to translate gene-expression data into cancer biomarker discovery. A list of blood-based biomarkers for ovarian cancer detection is reported and includes CA125 and HE4. FGF18 and GPR172A were identified and validated by ELISA as being differentially expressed in the serum of ovarian cancer patients compared with controls. ©2015 American Association for Cancer Research.
Correa, Katharina; Lhorente, Jean P; López, María E; Bassini, Liane; Naswa, Sudhir; Deeb, Nader; Di Genova, Alex; Maass, Alejandro; Davidson, William S; Yáñez, José M
2015-10-24
Piscirickettsia salmonis is the causal agent of Salmon Rickettsial Syndrome (SRS), which affects salmon species and causes severe economic losses. Selective breeding for disease resistance represents one approach for controlling SRS in farmed Atlantic salmon. Knowledge concerning the architecture of the resistance trait is needed before deciding on the most appropriate approach to enhance artificial selection for P. salmonis resistance in Atlantic salmon. The purpose of the study was to dissect the genetic variation in resistance to this pathogen in Atlantic salmon. A total of 2,601 Atlantic salmon smolts were experimentally challenged with P. salmonis by means of intra-peritoneal injection. These smolts were the progeny of 40 sires and 118 dams from a Chilean breeding population. Mortalities were recorded daily and the experiment ended at day 40 post-inoculation. Fish were genotyped using a 50K Affymetrix® Axiom® myDesign™ Single Nucleotide Polymorphism (SNP) Genotyping Array. A Genome Wide Association Analysis was performed on data from the challenged fish. Linear regression and logistic regression models were tested. Genome Wide Association Analysis indicated that resistance to P. salmonis is a moderately polygenic trait. There were five SNPs in chromosomes Ssa01 and Ssa17 significantly associated with the traits analysed. The proportion of the phenotypic variance explained by each marker is small, ranging from 0.007 to 0.045. Candidate genes including interleukin receptors and fucosyltransferase have been found to be physically linked with these genetic markers and may play an important role in the differential immune response against this pathogen. Because each significant marker explains only a small amount of variance, we conclude that genetic resistance to this pathogen can be more efficiently improved with the implementation of genetic evaluations incorporating genotype information from a dense SNP array.
Genome-wide association study of acute post-surgical pain in humans
Kim, Hyungsuk; Ramsay, Edward; Lee, Hyewon; Wahl, Sharon; Dionne, Raymond A
2009-01-01
Aims: Testing a relatively small genomic region with a few hundred SNPs provides limited information. Genome-wide association studies (GWAS) provide an opportunity to overcome the limitation of candidate gene association studies. Here, we report the results of a GWAS for the responses to an NSAID analgesic. Materials & methods: European Americans (60 females and 52 males) undergoing oral surgery were genotyped with the Affymetrix 500K SNP assay. Additional SNP genotyping was performed from the gene in linkage disequilibrium with the candidate SNP revealed by the GWAS. Results: GWAS revealed a candidate SNP (rs2562456) associated with analgesic onset, which is in linkage disequilibrium with a gene encoding a zinc finger protein. Additional SNP genotyping of ZNF429 confirmed the association with analgesic onset in humans (p = 1.8 × 10^-10, degrees of freedom = 103, F = 28.3). We also found candidate loci for the maximum post-operative pain rating (rs17122021, p = 6.9 × 10^-7) and post-operative pain onset time (rs6693882, p = 2.1 × 10^-6); however, these genetic associations were not sustained after correcting for multiple comparisons. Conclusion: GWAS for acute clinical pain followed by additional SNP genotyping of a neighboring gene suggests that genetic variations in or near the loci encoding DNA binding proteins play a role in the individual variations in responses to analgesic drugs. PMID:19207018
Yuan, Jingwei; Sun, Congjiao; Dou, Taocun; Yi, Guoqiang; Qu, LuJiang; Qu, Liang; Wang, Kehua; Yang, Ning
2015-01-01
Egg number (EN), egg laying rate (LR) and age at first egg (AFE) are important production traits related to egg production in the poultry industry. To better understand the genetic architecture of dynamic EN during the whole laying cycle and to locate precisely the variants associated with EN, LR and AFE, laying records from 21 to 72 weeks of age were collected individually for 1,534 F2 hens produced by reciprocal crosses between White Leghorn and Dongxiang Blue-shelled chickens, and their genotypes were assayed with chicken 600 K Affymetrix high-density genotyping arrays. Subsequently, pedigree- and SNP-based genetic parameters were estimated and a genome-wide association study (GWAS) was conducted on EN, LR and AFE. The pedigree- and SNP-based heritability estimates were similar, varying from 0.17 to 0.36. In the GWA analysis, we identified nine genome-wide significant loci associated with EN in the laying periods from 21 to 26 weeks, 27 to 36 weeks and 37 to 72 weeks. Analysis of GTF2A1 and CLSPN suggested that they influence the function of the ovary and uterus, and they may be considered relevant candidates. The identified SNP rs314448799 on chromosome 5, associated with accumulative EN from 21 to 40 weeks, produced a phenotypic difference of 6.86 eggs between the two homozygous genotypes and could potentially be applied in molecular breeding for EN selection. Moreover, our findings showed that LR is a moderately polygenic trait. The suggestively significant region on chromosome 16 for AFE points to a relationship between sexual maturity and immunity in the current population. The present study comprehensively evaluates the role of genetic variants in the development of egg laying. The findings will be helpful for investigating the function of causative genes and for future marker-assisted selection and genomic selection in chickens.
Shan, Jingxuan; Al-Rumaihi, Khalid; Rabah, Danny; Al-Bozom, Issam; Kizhakayil, Dhanya; Farhat, Karim; Al-Said, Sami; Kfoury, Hala; Dsouza, Shoba P; Rowe, Jillian; Khalak, Hanif G; Jafri, Shahzad; Aigha, Idil I; Chouchane, Lotfi
2013-05-13
Large databases focused on genetic susceptibility to prostate cancer have been accumulated from population studies of different ancestries, including Europeans and African-Americans. Arab populations, however, have been only rarely studied. Using the Affymetrix Genome-Wide Human SNP Array 6, we conducted a genome-wide association study (GWAS) in which 534,781 single nucleotide polymorphisms (SNPs) were genotyped in 221 Tunisians (90 prostate cancer patients and 131 age-matched healthy controls). TaqMan SNP Genotyping Assays on 11 prostate cancer associated SNPs were performed in a distinct cohort of 337 individuals of Arab ancestry living in Qatar and Saudi Arabia (155 prostate cancer patients and 182 age-matched controls). In-silico expression quantitative trait locus (eQTL) analysis along with mRNA quantification of nearby genes was performed to identify loci potentially cis-regulated by the identified SNPs. Three chromosomal regions, encompassing 14 SNPs, are significantly associated with prostate cancer risk in the Tunisian population (P = 1 × 10^-4 to P = 1 × 10^-5). In addition to SNPs located on chromosome 17q21, previously found to be associated with prostate cancer in Western populations, two novel chromosomal regions are revealed on chromosomes 9p24 and 22q13. eQTL analysis and mRNA quantification indicate that the prostate cancer associated SNPs of chromosome 17 could enhance the expression of the STAT5B gene. Our findings, identifying novel GWAS prostate cancer susceptibility loci, indicate that prostate cancer genetic risk factors could be ethnic specific.
[Prenatal genetic diagnosis for a fetus with atypical neurofibromatosis type 1 microdeletion].
Lin, Shaobin; Wu, Jianzhu; Zhang, Zhiqiang; Ji, Yuanjun; Fang, Qun; Chen, Baojiang; Luo, Yanmin
2016-04-01
To analyze the correlation between atypical neurofibromatosis type 1 (NF1) microdeletion and fetal phenotype. Fetal blood sampling was carried out for a woman bearing a fetus with talipes equinovarus. G-banded karyotyping and single nucleotide polymorphism array (SNP array) analysis were performed on the fetal blood sample. Fluorescence in situ hybridization (FISH) was used to confirm the result of the SNP array analysis. FISH was also carried out on peripheral blood specimens from the parents to ascertain the origin of the mutation. The karyotype of the fetus was found to be 46,XY by G-banding analysis. However, a 3.132 Mb microdeletion was detected in chromosome region 17q11.2 by SNP array, which overlapped with the region of NF1 microdeletion syndrome. FISH analysis of the specimens from the fetus and its parents confirmed it to be a de novo deletion. Talipes equinovarus may be an abnormal sonographic feature of fetuses with atypical NF1 microdeletion, which can be accurately diagnosed with SNP array.
Clevert, Djork-Arné; Mitterecker, Andreas; Mayr, Andreas; Klambauer, Günter; Tuefferd, Marianne; De Bondt, An; Talloen, Willem; Göhlmann, Hinrich; Hochreiter, Sepp
2011-07-01
Cost-effective oligonucleotide genotyping arrays like the Affymetrix SNP 6.0 are still the predominant technique to measure DNA copy number variations (CNVs). However, CNV detection methods for microarrays overestimate both the number and the size of CNV regions and, consequently, suffer from a high false discovery rate (FDR). A high FDR means that many CNVs are wrongly detected and therefore not associated with a disease in a clinical study, though correction for multiple testing takes them into account and thereby decreases the study's discovery power. For controlling the FDR, we propose a probabilistic latent variable model, 'cn.FARMS', which is optimized by a Bayesian maximum a posteriori approach. cn.FARMS controls the FDR through the information gain of the posterior over the prior. The prior represents the null hypothesis of copy number 2 for all samples, from which the posterior can only deviate by strong and consistent signals in the data. On HapMap data, cn.FARMS clearly outperformed the two most prevalent methods with respect to sensitivity and FDR. The software cn.FARMS is publicly available as an R package at http://www.bioinf.jku.at/software/cnfarms/cnfarms.html.
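The notion of scoring a region by the information gain of the posterior over the prior can be illustrated with a toy calculation. The sketch below is not the cn.FARMS model: it assumes, purely for illustration, a univariate standard normal prior and a Gaussian posterior, and takes the Kullback-Leibler divergence as the information gain. The function name and the example numbers are hypothetical.

```python
import math


def information_gain(post_mean: float, post_var: float) -> float:
    """KL divergence KL(N(post_mean, post_var) || N(0, 1)), in nats.

    Assumption for illustration only: the prior is a standard normal and
    the posterior is Gaussian. This is not the cn.FARMS latent variable model.
    """
    if post_var <= 0:
        raise ValueError("posterior variance must be positive")
    return 0.5 * (post_var + post_mean ** 2 - 1.0 - math.log(post_var))


# Weak, inconsistent signal: the posterior stays close to the prior.
noise_gain = information_gain(post_mean=0.1, post_var=0.9)

# Strong, consistent signal: the posterior shifts and sharpens.
signal_gain = information_gain(post_mean=2.0, post_var=0.5)
```

Under these assumptions a region would only be called when its gain is large, which mirrors the statement that the posterior can deviate from the null hypothesis only through strong and consistent signals.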
Duplication of 20p12.3 associated with familial Wolff-Parkinson-White syndrome.
Mills, Kimberly I; Anderson, Jacqueline; Levy, Philip T; Cole, F Sessions; Silva, Jennifer N A; Kulkarni, Shashikant; Shinawi, Marwan
2013-01-01
Wolff-Parkinson-White (WPW) syndrome is caused by preexcitation of the ventricular myocardium via an accessory pathway which increases the risk for paroxysmal supraventricular tachycardia. The condition is often sporadic and of unknown etiology in the majority of cases. Autosomal dominant inheritance and association with congenital heart defects or ventricular hypertrophy were described. Microdeletions of 20p12.3 have been associated with WPW syndrome with either cognitive dysfunction or Alagille syndrome. Here, we describe the association of 20p12.3 duplication with WPW syndrome in a patient who presented with non-immune hydrops. Her paternal uncle carries the duplication and has attention-deficit hyperactivity disorder and electrocardiographic findings consistent with WPW. The 769 kb duplication was detected by the Affymetrix Whole Genome-Human SNP Array 6.0 and encompasses two genes and the first two exons of a third gene. We discuss the potential role of the genes in the duplicated region in the pathogenesis of WPW and possible neurobehavioral abnormalities. Our data provide additional support for a significant role of 20p12.3 chromosomal rearrangements in the etiology of WPW syndrome. Copyright © 2012 Wiley Periodicals, Inc.
Montanari, Sara; Saeed, Munazza; Knäbel, Mareike; Kim, YoonKyeong; Troggio, Michela; Malnoy, Mickael; Velasco, Riccardo; Fontana, Paolo; Won, KyungHo; Durel, Charles-Eric; Perchepied, Laure; Schaffer, Robert; Wiedow, Claudia; Bus, Vincent; Brewer, Lester; Gardiner, Susan E; Crowhurst, Ross N; Chagné, David
2013-01-01
We have used new generation sequencing (NGS) technologies to identify single nucleotide polymorphism (SNP) markers from three European pear (Pyrus communis L.) cultivars and subsequently developed a subset of 1096 pear SNPs into high throughput markers by combining them with the set of 7692 apple SNPs on the IRSC apple Infinium® II 8K array. We then evaluated this apple and pear Infinium® II 9K SNP array for large-scale genotyping in pear across several species, using both pear and apple SNPs. The segregating populations employed for array validation included a segregating population of European pear ('Old Home'×'Louise Bon Jersey') and four interspecific breeding families derived from Asian (P. pyrifolia Nakai and P. bretschneideri Rehd.) and European pear pedigrees. In total, we mapped 857 polymorphic pear markers to construct the first SNP-based genetic maps for pear, comprising 78% of the total pear SNPs included in the array. In addition, 1031 SNP markers derived from apple (13% of the total apple SNPs included in the array) were polymorphic and were mapped in one or more of the pear populations. These results are the first to demonstrate SNP transferability across the genera Malus and Pyrus. Our construction of high density SNP-based and gene-based genetic maps in pear represents an important step towards the identification of chromosomal regions associated with a range of horticultural characters, such as pest and disease resistance, orchard yield and fruit quality.
Vitis Phylogenomics: Hybridization Intensities from a SNP Array Outperform Genotype Calls
Miller, Allison J.; Matasci, Naim; Schwaninger, Heidi; Aradhya, Mallikarjuna K.; Prins, Bernard; Zhong, Gan-Yuan; Simon, Charles; Buckler, Edward S.; Myles, Sean
2013-01-01
Understanding relationships among species is a fundamental goal of evolutionary biology. Single nucleotide polymorphisms (SNPs) identified through next generation sequencing and related technologies enable phylogeny reconstruction by providing unprecedented numbers of characters for analysis. One approach to SNP-based phylogeny reconstruction is to identify SNPs in a subset of individuals, and then to compile SNPs on an array that can be used to genotype additional samples at hundreds or thousands of sites simultaneously. Although powerful and efficient, this method is subject to ascertainment bias because applying variation discovered in a representative subset to a larger sample favors identification of SNPs with high minor allele frequencies and introduces bias against rare alleles. Here, we demonstrate that the use of hybridization intensity data, rather than genotype calls, reduces the effects of ascertainment bias. Whereas traditional SNP calls assess known variants based on diversity housed in the discovery panel, hybridization intensity data survey variation in the broader sample pool, regardless of whether those variants are present in the initial SNP discovery process. We apply SNP genotype and hybridization intensity data derived from the Vitis9kSNP array developed for grape to show the effects of ascertainment bias and to reconstruct evolutionary relationships among Vitis species. We demonstrate that phylogenies constructed using hybridization intensities suffer less from the distorting effects of ascertainment bias, and are thus more accurate than phylogenies based on genotype calls. Moreover, we reconstruct the phylogeny of the genus Vitis using hybridization data, show that North American subgenus Vitis species are monophyletic, and resolve several previously poorly known relationships among North American species. 
This study builds on earlier work that applied the Vitis9kSNP array to evolutionary questions within Vitis vinifera and has general implications for addressing ascertainment bias in array-enabled phylogeny reconstruction. PMID:24236035
Expression Profiling Smackdown: Human Transcriptome Array HTA 2.0 vs. RNA-Seq
Palermo, Meghann; Driscoll, Heather; Tighe, Scott; Dragon, Julie; Bond, Jeff; Shukla, Arti; Vangala, Mahesh; Vincent, James; Hunter, Tim
2014-01-01
The advent of both microarray and massively parallel sequencing has revolutionized high-throughput analysis of the human transcriptome. Due to limitations in microarray technology, detecting and quantifying coding transcript isoforms, in addition to non-coding transcripts, has been challenging. As a result, RNA-Seq has been the preferred method for characterizing the full human transcriptome, until now. A new high-resolution array from Affymetrix, GeneChip Human Transcriptome Array 2.0 (HTA 2.0), has been designed to interrogate all transcript isoforms in the human transcriptome with >6 million probes targeting coding transcripts, exon-exon splice junctions, and non-coding transcripts. Here we compare expression results from GeneChip HTA 2.0 and RNA-Seq data using identical RNA extractions from three samples each of healthy human mesothelial cells in culture, LP9-C1, and healthy mesothelial cells treated with asbestos, LP9-A1. For GeneChip HTA 2.0 sample preparation, we chose to compare two target preparation methods, NuGEN Ovation Pico WTA V2 with the Encore Biotin Module versus Affymetrix's GeneChip WT PLUS with the WT Terminal Labeling Kit, on identical RNA extractions from both untreated and treated samples. These same RNA extractions were used for the RNA-Seq library preparation. All analyses were performed in Partek Genomics Suite 6.6. Expression profiles for control and asbestos-treated mesothelial cells prepared with NuGEN versus Affymetrix target preparation methods (GeneChip HTA 2.0) are compared to each other as well as to RNA-Seq results.
Making a chocolate chip: development and evaluation of a 6K SNP array for Theobroma cacao
Livingstone, Donald; Royaert, Stefan; Stack, Conrad; Mockaitis, Keithanne; May, Greg; Farmer, Andrew; Saski, Christopher; Schnell, Ray; Kuhn, David; Motamayor, Juan Carlos
2015-01-01
Theobroma cacao, the key ingredient in chocolate production, is one of the world's most important tree fruit crops, with ∼4,000,000 metric tons produced across 50 countries. To move towards gene discovery and marker-assisted breeding in cacao, a single-nucleotide polymorphism (SNP) identification project was undertaken using RNAseq data from 16 diverse cacao cultivars. RNA sequences were aligned to the assembled transcriptome of the cultivar Matina 1-6, and 330,000 SNPs within coding regions were identified. From these SNPs, a subset of 6,000 high-quality SNPs were selected for inclusion on an Illumina Infinium SNP array: the Cacao6kSNP array. Using Cacao6KSNP array data from over 1,000 cacao samples, we demonstrate that our custom array produces a saturated genetic map and can be used to distinguish among even closely related genotypes. Our study enhances and expands the genetic resources available to the cacao research community, and provides the genome-scale set of tools that are critical for advancing breeding with molecular markers in an agricultural species with high genetic diversity. PMID:26070980
Discovery of 100K SNP array and its utilization in sugarcane
USDA-ARS's Scientific Manuscript database
Next generation sequencing (NGS) enables us to identify thousands of single nucleotide polymorphism (SNP) markers for genotyping and fingerprinting. However, the process requires very precise bioinformatics analysis and filtering. High throughput SNP array with predefined genomic location co...
Development and Validation of a High-Density SNP Genotyping Array for African Oil Palm.
Kwong, Qi Bin; Teh, Chee Keng; Ong, Ai Ling; Heng, Huey Ying; Lee, Heng Leng; Mohamed, Mohaimi; Low, Joel Zi-Bin; Apparow, Sukganah; Chew, Fook Tim; Mayes, Sean; Kulaveerasingam, Harikrishna; Tammi, Martti; Appleton, David Ross
2016-08-01
High-density single nucleotide polymorphism (SNP) genotyping arrays are powerful tools that can measure the level of genetic polymorphism within a population. To develop a whole-genome SNP array for oil palms, SNP discovery was performed using deep resequencing of eight libraries derived from 132 Elaeis guineensis and Elaeis oleifera palms belonging to 59 origins, resulting in the discovery of >3 million putative SNPs. After SNP filtering, the Illumina OP200K custom array was built with 170 860 successful probes. Phenetic clustering analysis revealed that the array could distinguish between palms of different origins in a way consistent with pedigree records. Genome-wide linkage disequilibrium declined more slowly for the commercial populations (ranging from 120 kb at r² = 0.43 to 146 kb at r² = 0.50) when compared with the semi-wild populations (19.5 kb at r² = 0.22). Genetic fixation mapping comparing the semi-wild and commercial population identified 321 selective sweeps. A genome-wide association study (GWAS) detected a significant peak on chromosome 2 associated with the polygenic component of the shell thickness trait (based on the trait shell-to-fruit; S/F %) in tenera palms. Testing of a genomic selection model on the same trait resulted in good prediction accuracy (r = 0.65) with 42% of the S/F % variation explained. The first high-density SNP genotyping array for oil palm has been developed and shown to be robust for use in genetic studies and with potential for developing early trait prediction to shorten the oil palm breeding cycle. Copyright © 2016 The Author. Published by Elsevier Inc. All rights reserved.
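The linkage disequilibrium values quoted above are r² statistics. As a minimal sketch of how r² between two biallelic SNPs is commonly computed from phased haplotypes (this is the textbook definition, not necessarily the pipeline used in the study; the function name and toy data are hypothetical):

```python
from typing import Sequence, Tuple


def ld_r2(haplotypes: Sequence[Tuple[int, int]]) -> float:
    """Return r^2 between two biallelic loci.

    Each haplotype is a pair (a, b) of 0/1 allele codes at locus A and B.
    r^2 = D^2 / (pA * (1 - pA) * pB * (1 - pB)), with D = pAB - pA * pB.
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("at least one haplotype is required")
    p_a = sum(a for a, _ in haplotypes) / n
    p_b = sum(b for _, b in haplotypes) / n
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both loci must be polymorphic")
    d = p_ab - p_a * p_b
    return d * d / denom


# Toy data (hypothetical): alleles always travel together vs. independently.
linked = [(0, 0), (1, 1), (0, 0), (1, 1)]
unlinked = [(0, 0), (0, 1), (1, 0), (1, 1)]
r2_linked = ld_r2(linked)
r2_unlinked = ld_r2(unlinked)
```

LD decay curves such as those summarized in the abstract are typically obtained by computing this statistic for many SNP pairs and relating it to the physical distance between them.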
Hulse-Kemp, Amanda M.; Lemm, Jana; Plieske, Joerg; Ashrafi, Hamid; Buyyarapu, Ramesh; Fang, David D.; Frelichowski, James; Giband, Marc; Hague, Steve; Hinze, Lori L.; Kochan, Kelli J.; Riggs, Penny K.; Scheffler, Jodi A.; Udall, Joshua A.; Ulloa, Mauricio; Wang, Shirley S.; Zhu, Qian-Hao; Bag, Sumit K.; Bhardwaj, Archana; Burke, John J.; Byers, Robert L.; Claverie, Michel; Gore, Michael A.; Harker, David B.; Islam, Md S.; Jenkins, Johnie N.; Jones, Don C.; Lacape, Jean-Marc; Llewellyn, Danny J.; Percy, Richard G.; Pepper, Alan E.; Poland, Jesse A.; Mohan Rai, Krishan; Sawant, Samir V.; Singh, Sunil Kumar; Spriggs, Andrew; Taylor, Jen M.; Wang, Fei; Yourstone, Scott M.; Zheng, Xiuting; Lawley, Cindy T.; Ganal, Martin W.; Van Deynze, Allen; Wilson, Iain W.; Stelly, David M.
2015-01-01
High-throughput genotyping arrays provide a standardized resource for plant breeding communities that are useful for a breadth of applications including high-density genetic mapping, genome-wide association studies (GWAS), genomic selection (GS), complex trait dissection, and studying patterns of genomic diversity among cultivars and wild accessions. We have developed the CottonSNP63K, an Illumina Infinium array containing assays for 45,104 putative intraspecific single nucleotide polymorphism (SNP) markers for use within the cultivated cotton species Gossypium hirsutum L. and 17,954 putative interspecific SNP markers for use with crosses of other cotton species with G. hirsutum. The SNPs on the array were developed from 13 different discovery sets that represent a diverse range of G. hirsutum germplasm and five other species: G. barbadense L., G. tomentosum Nuttal × Seemann, G. mustelinum Miers × Watt, G. armourianum Kearny, and G. longicalyx J.B. Hutchinson and Lee. The array was validated with 1,156 samples to generate cluster positions to facilitate automated analysis of 38,822 polymorphic markers. Two high-density genetic maps containing a total of 22,829 SNPs were generated for two F2 mapping populations, one intraspecific and one interspecific, and 3,533 SNP markers were co-occurring in both maps. The produced intraspecific genetic map is the first saturated map that associates into 26 linkage groups corresponding to the number of cotton chromosomes for a cross between two G. hirsutum lines. The linkage maps were shown to have high levels of collinearity to the JGI G. raimondii Ulbrich reference genome sequence. The CottonSNP63K array, cluster file and associated marker sequences constitute a major new resource for the global cotton research community. PMID:25908569
2012-01-01
Background: DNA microarrays are used both for research and for diagnostics. In research, Affymetrix arrays are commonly used for genome wide association studies, resequencing, and for gene expression analysis. These arrays provide large amounts of data. This data is analyzed using statistical methods that quite often discard a large portion of the information. Most of the information that is lost comes from probes that systematically fail across chips and from batch effects. The aim of this study was to develop a comprehensive model for hybridization that predicts probe intensities for Affymetrix arrays and that could provide a basis for improved microarray analysis and probe development. The first part of the model calculates probe binding affinities to all the possible targets in the hybridization solution using the Langmuir isotherm. In the second part of the model we integrate details that are specific to each experiment and contribute to the differences between hybridization in solution and on the microarray. These details include fragmentation, wash stringency, temperature, salt concentration, and scanner settings. Furthermore, the model fits probe synthesis efficiency and target concentration parameters directly to the data. All the parameters used in the model have a well-established physical origin. Results: For the 302 chips that were analyzed the mean correlation between expected and observed probe intensities was 0.701 with a range of 0.88 to 0.55. All available chips were included in the analysis regardless of the data quality. Our results show that batch effects arise from differences in probe synthesis, scanner settings, wash strength, and target fragmentation. We also show that probe synthesis efficiencies for different nucleotides are not uniform. Conclusions: To date this is the most complete model for binding on microarrays. This is the first model that includes both probe synthesis efficiency and hybridization kinetics/cross-hybridization.
These two factors are sequence dependent and have a large impact on probe intensity. The results presented here provide novel insight into the effect of probe synthesis errors on Affymetrix microarrays; furthermore, the algorithms developed in this work provide useful tools for the analysis of cross-hybridization, probe synthesis efficiency, fragmentation, wash stringency, temperature, and salt concentration on microarray intensities. PMID:23270536
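The Langmuir isotherm mentioned in this abstract relates the fraction of occupied probe sites to target concentration. A minimal sketch of the basic single-target form follows; the published model adds many experiment-specific terms (fragmentation, wash stringency, scanner settings and so on) that are not represented here, and the function name and parameter values are hypothetical.

```python
def langmuir_occupancy(affinity: float, concentration: float) -> float:
    """Fraction of probe sites bound under the basic Langmuir isotherm.

    theta = K * c / (1 + K * c), where K is the binding affinity and c the
    target concentration (in reciprocal units, so K * c is dimensionless).
    """
    if affinity < 0 or concentration < 0:
        raise ValueError("affinity and concentration must be non-negative")
    x = affinity * concentration
    return x / (1.0 + x)


# Occupancy rises with concentration and saturates below 1.
half = langmuir_occupancy(affinity=1.0, concentration=1.0)
near_saturation = langmuir_occupancy(affinity=1.0, concentration=100.0)
```

The saturation behaviour is why probe intensity is not linear in target concentration at the high end of the range.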
Rice, K L; Lin, X; Wolniak, K; Ebert, B L; Berkofsky-Fessler, W; Buzzai, M; Sun, Y; Xi, C; Elkin, P; Levine, R; Golub, T; Gilliland, D G; Crispino, J D; Licht, J D; Zhang, W
2011-01-01
Polycythemia vera (PV), essential thrombocythemia and primary myelofibrosis, are myeloproliferative neoplasms (MPNs) with distinct clinical features and are associated with the JAK2V617F mutation. To identify genomic anomalies involved in the pathogenesis of these disorders, we profiled 87 MPN patients using Affymetrix 250K single-nucleotide polymorphism (SNP) arrays. Aberrations affecting chr9 were the most frequently observed and included 9pLOH (n=16), trisomy 9 (n=6) and amplifications of 9p13.3–23.3 (n=1), 9q33.1–34.13 (n=1) and 9q34.13 (n=6). Patients with trisomy 9 were associated with elevated JAK2V617F mutant allele burden, suggesting that gain of chr9 represents an alternative mechanism for increasing JAK2V617F dosage. Gene expression profiling of patients with and without chr9 abnormalities (+9, 9pLOH), identified genes potentially involved in disease pathogenesis including JAK2, STAT5B and MAPK14. We also observed recurrent gains of 1p36.31–36.33 (n=6), 17q21.2–q21.31 (n=5) and 17q25.1–25.3 (n=5) and deletions affecting 18p11.31–11.32 (n=8). Combined SNP and gene expression analysis identified aberrations affecting components of a non-canonical PRC2 complex (EZH1, SUZ12 and JARID2) and genes comprising a ‘HSC signature' (MLLT3, SMARCA2 and PBX1). We show that NFIB, which is amplified in 7/87 MPN patients and upregulated in PV CD34+ cells, protects cells from apoptosis induced by cytokine withdrawal. PMID:22829077
2014-01-01
Background: Genome-wide association studies (GWAS) have identified several loci associated with schizophrenia and/or bipolar disorder. We performed a GWAS of psychosis as a broad syndrome rather than within specific diagnostic categories. Methods: 1239 cases with schizophrenia, schizoaffective disorder, or psychotic bipolar disorder; 857 of their unaffected relatives, and 2739 healthy controls were genotyped with the Affymetrix 6.0 single nucleotide polymorphism (SNP) array. Analyses of 695,193 SNPs were conducted using UNPHASED, which combines information across families and unrelated individuals. We attempted to replicate signals found in 23 genomic regions using existing data on nonoverlapping samples from the Psychiatric GWAS Consortium and Schizophrenia-GENE-plus cohorts (10,352 schizophrenia patients and 24,474 controls). Results: No individual SNP showed compelling evidence for association with psychosis in our data. However, we observed a trend for association with the same risk alleles at loci previously associated with schizophrenia (one-sided p = .003). A polygenic score analysis found that the Psychiatric GWAS Consortium’s panel of SNPs associated with schizophrenia significantly predicted disease status in our sample (p = 5 × 10⁻¹⁴) and explained approximately 2% of the phenotypic variance. Conclusions: Although narrowly defined phenotypes have their advantages, we believe new loci may also be discovered through meta-analysis across broad phenotypes. The novel statistical methodology we introduced to model effect size heterogeneity between studies should help future GWAS that combine association evidence from related phenotypes. Applying these approaches, we highlight three loci that warrant further investigation. We found that SNPs conveying risk for schizophrenia are also predictive of disease status in our data. PMID:23871474
Genomic and transcriptomic predictors of triglyceride response to regular exercise
Sarzynski, Mark A; Davidsen, Peter K; Sung, Yun Ju; Hesselink, Matthijs K C; Schrauwen, Patrick; Rice, Treva K; Rao, D C; Falciani, Francesco; Bouchard, Claude
2015-01-01
Aim: We performed genome-wide and transcriptome-wide profiling to identify genes and single nucleotide polymorphisms (SNPs) associated with the response of triglycerides (TG) to exercise training. Methods: Plasma TG levels were measured before and after a 20-week endurance training programme in 478 white participants from the HERITAGE Family Study. Illumina HumanCNV370-Quad v3.0 BeadChips were genotyped using the Illumina BeadStation 500GX platform. Affymetrix HG-U133+2 arrays were used to quantitate gene expression levels from baseline muscle biopsies of a subset of participants (N=52). Genome-wide association study (GWAS) analysis was performed using MERLIN, while transcriptomic predictor models were developed using the R-package GALGO. Results: The GWAS results showed that eight SNPs were associated with TG training-response (ΔTG) at p<9.9×10⁻⁶, while another 31 SNPs showed p values <1×10⁻⁴. In multivariate regression models, the top 10 SNPs explained 32.0% of the variance in ΔTG, while conditional heritability analysis showed that four SNPs statistically accounted for all of the heritability of ΔTG. A molecular signature based on the baseline expression of 11 genes predicted 27% of ΔTG in HERITAGE, which was validated in an independent study. A composite SNP score based on the top four SNPs, each from the genomic and transcriptomic analyses, was the strongest predictor of ΔTG (R²=0.14, p=3.0×10⁻⁶⁸). Conclusions: Our results indicate that skeletal muscle transcript abundance at 11 genes and SNPs at a number of loci contribute to TG response to exercise training. Combining data from genomics and transcriptomics analyses identified a SNP-based gene signature that should be further tested in independent samples. PMID:26491034
Bramon, Elvira; Pirinen, Matti; Strange, Amy; Lin, Kuang; Freeman, Colin; Bellenguez, Céline; Su, Zhan; Band, Gavin; Pearson, Richard; Vukcevic, Damjan; Langford, Cordelia; Deloukas, Panos; Hunt, Sarah; Gray, Emma; Dronov, Serge; Potter, Simon C; Tashakkori-Ghanbaria, Avazeh; Edkins, Sarah; Bumpstead, Suzannah J; Arranz, Maria J; Bakker, Steven; Bender, Stephan; Bruggeman, Richard; Cahn, Wiepke; Chandler, David; Collier, David A; Crespo-Facorro, Benedicto; Dazzan, Paola; de Haan, Lieuwe; Di Forti, Marta; Dragović, Milan; Giegling, Ina; Hall, Jeremy; Iyegbe, Conrad; Jablensky, Assen; Kahn, René S; Kalaydjieva, Luba; Kravariti, Eugenia; Lawrie, Stephen; Linszen, Don H; Mata, Ignacio; McDonald, Colm; McIntosh, Andrew; Myin-Germeys, Inez; Ophoff, Roel A; Pariante, Carmine M; Paunio, Tiina; Picchioni, Marco; Ripke, Stephan; Rujescu, Dan; Sauer, Heinrich; Shaikh, Madiha; Sussmann, Jessika; Suvisaari, Jaana; Tosato, Sarah; Toulopoulou, Timothea; Van Os, Jim; Walshe, Muriel; Weisbrod, Matthias; Whalley, Heather; Wiersma, Durk; Blackwell, Jenefer M; Brown, Matthew A; Casas, Juan P; Corvin, Aiden; Duncanson, Audrey; Jankowski, Janusz A Z; Markus, Hugh S; Mathew, Christopher G; Palmer, Colin N A; Plomin, Robert; Rautanen, Anna; Sawcer, Stephen J; Trembath, Richard C; Wood, Nicholas W; Barroso, Ines; Peltonen, Leena; Lewis, Cathryn M; Murray, Robin M; Donnelly, Peter; Powell, John; Spencer, Chris C A
2014-03-01
Genome-wide association studies (GWAS) have identified several loci associated with schizophrenia and/or bipolar disorder. We performed a GWAS of psychosis as a broad syndrome rather than within specific diagnostic categories. 1239 cases with schizophrenia, schizoaffective disorder, or psychotic bipolar disorder; 857 of their unaffected relatives, and 2739 healthy controls were genotyped with the Affymetrix 6.0 single nucleotide polymorphism (SNP) array. Analyses of 695,193 SNPs were conducted using UNPHASED, which combines information across families and unrelated individuals. We attempted to replicate signals found in 23 genomic regions using existing data on nonoverlapping samples from the Psychiatric GWAS Consortium and Schizophrenia-GENE-plus cohorts (10,352 schizophrenia patients and 24,474 controls). No individual SNP showed compelling evidence for association with psychosis in our data. However, we observed a trend for association with the same risk alleles at loci previously associated with schizophrenia (one-sided p = .003). A polygenic score analysis found that the Psychiatric GWAS Consortium's panel of SNPs associated with schizophrenia significantly predicted disease status in our sample (p = 5 × 10⁻¹⁴) and explained approximately 2% of the phenotypic variance. Although narrowly defined phenotypes have their advantages, we believe new loci may also be discovered through meta-analysis across broad phenotypes. The novel statistical methodology we introduced to model effect size heterogeneity between studies should help future GWAS that combine association evidence from related phenotypes. Applying these approaches, we highlight three loci that warrant further investigation. We found that SNPs conveying risk for schizophrenia are also predictive of disease status in our data. Copyright © 2014 Society of Biological Psychiatry. Published by Elsevier Inc. All rights reserved.
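A polygenic score of the kind referred to in this abstract is, in its simplest form, a weighted sum of risk-allele counts. The sketch below shows only that basic form; the weights and dosages are made-up illustrative values, and the study's actual scoring and testing procedure is not described in the abstract.

```python
from typing import Sequence


def polygenic_score(dosages: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted sum of risk-allele dosages (0, 1 or 2 copies per SNP).

    `weights` would normally be per-SNP effect sizes (for example log odds
    ratios) estimated in an independent discovery sample.
    """
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(d * w for d, w in zip(dosages, weights))


# Hypothetical individual genotyped at three SNPs with hypothetical weights.
example_score = polygenic_score(dosages=[0, 1, 2], weights=[0.1, 0.2, 0.3])
```

Scores computed this way for each individual can then be tested as a predictor of case or control status, which is the sense in which a SNP panel "predicts disease status".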
[Genetic analysis of two cases with Dandy-Walker deformed fetus].
Yao, Juan; Fang, Rong; Shen, Xueping; Shen, Guosong; Zhang, Su
2017-10-10
To explore the genetic etiology of two fetuses with Dandy-Walker malformation using single nucleotide polymorphism microarray (SNP-array). The fetuses and their parents were subjected to G banding karyotype analysis. The fetuses were also subjected to SNP-array analysis. The parents of both fetuses showed a normal karyotype. One fetus had a 46,X,?i(X)(q10) karyotype, while conventional cell culture failed for the other. SNP-array showed that one fetus carried a 6p25.3p25.2 microdeletion, and the other carried a Xp22.33p22.2 deletion and a Yq11.221q11 duplication. The abnormal fragments involved the FOXC1, SHOX and STS genes, which are associated with Dandy-Walker malformation. Alteration of 6p25.3p25.2 and Xp22.33p22.2 copy numbers probably underlies the Dandy-Walker syndrome in the fetuses. The disorder may be attributed to abnormal expression of the FOXC1, SHOX, and STS genes. SNP-array can provide an important supplement for prenatal diagnosis.
Antanaviciute, Laima; Fernández-Fernández, Felicidad; Jansen, Johannes; Banchi, Elisa; Evans, Katherine M; Viola, Roberto; Velasco, Riccardo; Dunwell, Jim M; Troggio, Michela; Sargent, Daniel J
2012-05-25
A whole-genome genotyping array has previously been developed for Malus using SNP data from 28 Malus genotypes. This array offers the prospect of high throughput genotyping and linkage map development for any given Malus progeny. To test the applicability of the array for mapping in diverse Malus genotypes, we applied the array to the construction of a SNP-based linkage map of an apple rootstock progeny. Of the 7,867 Malus SNP markers on the array, 1,823 (23.2%) were heterozygous in one of the two parents of the progeny, 1,007 (12.8%) were heterozygous in both parental genotypes, whilst just 2.8% of the 921 Pyrus SNPs were heterozygous. A linkage map spanning 1,282.2 cM was produced comprising 2,272 SNP markers, 306 SSR markers and the S-locus. The length of the M432 linkage map was increased by 52.7 cM with the addition of the SNP markers, whilst marker density increased from 3.8 cM/marker to 0.5 cM/marker. Just three regions in excess of 10 cM remain where no markers were mapped. We compared the positions of the mapped SNP markers on the M432 map with their predicted positions on the 'Golden Delicious' genome sequence. A total of 311 markers (13.7% of all mapped markers) mapped to positions that conflicted with their predicted positions on the 'Golden Delicious' pseudo-chromosomes, indicating the presence of paralogous genomic regions or mis-assignments of genome sequence contigs during the assembly and anchoring of the genome sequence. We incorporated data for the 2,272 SNP markers onto the map of the M432 progeny and have presented the most complete and saturated map of the full 17 linkage groups of M. pumila to date. The data were generated rapidly in a high-throughput semi-automated pipeline, permitting significant savings in time and cost over linkage map construction using microsatellites. 
The application of the array will permit linkage maps to be developed for QTL analyses in a cost-effective manner, and the identification of SNPs that have been assigned erroneous positions on the 'Golden Delicious' reference sequence will assist in the continued improvement of the genome sequence assembly for that variety.
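The percentages and marker density reported in this abstract can be checked with simple arithmetic. The sketch below uses only figures quoted in the abstract; treating the S-locus as one additional marker when computing density is an assumption made here, and the variable names are illustrative.

```python
# Figures quoted in the abstract.
malus_snps_on_array = 7867
het_one_parent = 1823
het_both_parents = 1007
mapped_snps = 2272
mapped_ssrs = 306
s_locus = 1  # assumption: the S-locus is counted as a single marker
map_length_cm = 1282.2
conflicting_markers = 311

# Share of Malus SNPs heterozygous in one or in both parents.
pct_one_parent = 100 * het_one_parent / malus_snps_on_array
pct_both_parents = 100 * het_both_parents / malus_snps_on_array

# Average spacing between markers on the final map.
total_markers = mapped_snps + mapped_ssrs + s_locus
density_cm_per_marker = map_length_cm / total_markers

# Share of mapped SNPs whose position conflicts with the reference genome.
pct_conflicting = 100 * conflicting_markers / mapped_snps

print(f"{pct_one_parent:.1f}% / {pct_both_parents:.1f}% heterozygous")
print(f"{density_cm_per_marker:.1f} cM/marker, {pct_conflicting:.1f}% conflicting")
```

Rounded to one decimal place these reproduce the 23.2%, 12.8%, 0.5 cM/marker and 13.7% values given in the text, which suggests that the 13.7% figure is expressed relative to the 2,272 mapped SNPs.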
Bianco, Luca; Cestaro, Alessandro; Sargent, Daniel James; Banchi, Elisa; Derdak, Sophia; Di Guardo, Mario; Salvi, Silvio; Jansen, Johannes; Viola, Roberto; Gut, Ivo; Laurens, Francois; Chagné, David; Velasco, Riccardo; van de Weg, Eric; Troggio, Michela
2014-01-01
High-density SNP arrays for genome-wide assessment of allelic variation have made high resolution genetic characterization of crop germplasm feasible. A medium density array for apple, the IRSC 8K SNP array, has been successfully developed and used for screens of bi-parental populations. However, the number of robust and well-distributed markers contained on this array was not sufficient to perform genome-wide association analyses in wider germplasm sets, or Pedigree-Based Analysis at high precision, because of rapid decay of linkage disequilibrium. We describe the development of an Illumina Infinium array targeting 20K SNPs. The SNPs were predicted from re-sequencing data derived from the genomes of 13 Malus × domestica apple cultivars and one accession belonging to a crab apple species (M. micromalus). A pipeline for SNP selection was devised that avoided the pitfalls associated with the inclusion of paralogous sequence variants, supported the construction of robust multi-allelic SNP haploblocks and selected up to 11 entries within narrow genomic regions of ±5 kb, termed focal points (FPs). Broad genome coverage was attained by placing FPs at 1 cM intervals on a consensus genetic map, complementing them with FPs to enrich the ends of each of the chromosomes, and by bridging physical intervals greater than 400 Kbps. The selection also included ∼3.7K validated SNPs from the IRSC 8K array. The array has already been used in other studies where ∼15.8K SNP markers were mapped with an average of ∼6.8K SNPs per full-sib family. The newly developed array with its high density of polymorphic validated SNPs is expected to be of great utility for Pedigree-Based Analysis and Genomic Selection. It will also be a valuable tool to help dissect the genetic mechanisms controlling important fruit quality traits, and to aid the identification of marker-trait associations suitable for the application of Marker Assisted Selection in apple breeding programs. PMID:25303088
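The gap-bridging step described above (adding focal points wherever the physical interval between neighbours exceeds 400 kb) can be illustrated with a small sketch. This is not the authors' pipeline: the function name `bridge_gaps`, the even spacing of the added points and the example coordinates are all assumptions made for illustration.

```python
import math


def bridge_gaps(positions_bp, max_gap_bp=400_000):
    """Return sorted focal-point positions with extra points inserted so that
    no two neighbours on a chromosome are more than max_gap_bp apart.

    Hypothetical helper: the abstract only says that intervals greater than
    400 kb were bridged; spacing the extra points evenly is an assumption.
    """
    pts = sorted(positions_bp)
    bridged = []
    for left, right in zip(pts, pts[1:]):
        bridged.append(left)
        gap = right - left
        if gap > max_gap_bp:
            # Smallest number of extra points that brings every sub-gap
            # down to max_gap_bp or less.
            n_extra = math.ceil(gap / max_gap_bp) - 1
            step = gap / (n_extra + 1)
            bridged.extend(round(left + step * i) for i in range(1, n_extra + 1))
    if pts:
        bridged.append(pts[-1])
    return bridged


# Made-up coordinates on one chromosome, in base pairs.
example = bridge_gaps([0, 1_000_000, 1_200_000])
print(example)
```

In the example, only the first interval (1 Mb) is wider than the threshold, so two points are added inside it and the 200 kb interval is left alone.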
Optimal design of low-density SNP arrays for genomic prediction: algorithm and applications
USDA-ARS's Scientific Manuscript database
Low-density (LD) single nucleotide polymorphism (SNP) arrays provide a cost-effective solution for genomic prediction and selection, but algorithms and computational tools are needed for their optimal design. A multiple-objective, local optimization (MOLO) algorithm was developed for design of optim...
Troggio, Michela; Malnoy, Mickael; Velasco, Riccardo; Fontana, Paolo; Won, KyungHo; Durel, Charles-Eric; Perchepied, Laure; Schaffer, Robert; Wiedow, Claudia; Bus, Vincent; Brewer, Lester; Gardiner, Susan E.; Crowhurst, Ross N.; Chagné, David
2013-01-01
We have used new generation sequencing (NGS) technologies to identify single nucleotide polymorphism (SNP) markers from three European pear (Pyrus communis L.) cultivars and subsequently developed a subset of 1096 pear SNPs into high throughput markers by combining them with the set of 7692 apple SNPs on the IRSC apple Infinium® II 8K array. We then evaluated this apple and pear Infinium® II 9K SNP array for large-scale genotyping in pear across several species, using both pear and apple SNPs. The segregating populations employed for array validation included a segregating population of European pear (‘Old Home’ × ‘Louise Bon Jersey’) and four interspecific breeding families derived from Asian (P. pyrifolia Nakai and P. bretschneideri Rehd.) and European pear pedigrees. In total, we mapped 857 polymorphic pear markers to construct the first SNP-based genetic maps for pear, comprising 78% of the total pear SNPs included in the array. In addition, 1031 SNP markers derived from apple (13% of the total apple SNPs included in the array) were polymorphic and were mapped in one or more of the pear populations. These results are the first to demonstrate SNP transferability across the genera Malus and Pyrus. Our construction of high density SNP-based and gene-based genetic maps in pear represents an important step towards the identification of chromosomal regions associated with a range of horticultural characters, such as pest and disease resistance, orchard yield and fruit quality. PMID:24155917
Making a chocolate chip: development and evaluation of a 6K SNP array for Theobroma cacao.
Livingstone, Donald; Royaert, Stefan; Stack, Conrad; Mockaitis, Keithanne; May, Greg; Farmer, Andrew; Saski, Christopher; Schnell, Ray; Kuhn, David; Motamayor, Juan Carlos
2015-08-01
Theobroma cacao, the key ingredient in chocolate production, is one of the world's most important tree fruit crops, with ∼4,000,000 metric tons produced across 50 countries. To move towards gene discovery and marker-assisted breeding in cacao, a single-nucleotide polymorphism (SNP) identification project was undertaken using RNAseq data from 16 diverse cacao cultivars. RNA sequences were aligned to the assembled transcriptome of the cultivar Matina 1-6, and 330,000 SNPs within coding regions were identified. From these SNPs, a subset of 6,000 high-quality SNPs were selected for inclusion on an Illumina Infinium SNP array: the Cacao6kSNP array. Using Cacao6KSNP array data from over 1,000 cacao samples, we demonstrate that our custom array produces a saturated genetic map and can be used to distinguish among even closely related genotypes. Our study enhances and expands the genetic resources available to the cacao research community, and provides the genome-scale set of tools that are critical for advancing breeding with molecular markers in an agricultural species with high genetic diversity. © The Author 2015. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.
IDENTIFICATION OF INTERSPECIES CONCORDANCE OF MECHANISMS OF ARSENIC-INDUCED BLADDER CANCER
Exposure to arsenic causes cancer by inducing a variety of responses that affect the expression of genes associated with numerous biological pathways leading to altered cell growth and proliferation, signaling, apoptosis and oxidative stress response. Affymetrix GeneChip® arrays ...
Analysis of population structure and genetic history of cattle breeds based on high-density SNP data
USDA-ARS's Scientific Manuscript database
Advances in single nucleotide polymorphism (SNP) genotyping microarrays have facilitated a new understanding of population structure and evolutionary history for several species. Most existing studies in livestock were based on low density SNP arrays. The first wave of low density SNP studies on cat...
Tumor Touch Imprints as Source for Whole Genome Analysis of Neuroblastoma Tumors
Brunner, Clemens; Brunner-Herglotz, Bettina; Ziegler, Andrea; Frech, Christian; Amann, Gabriele; Ladenstein, Ruth; Ambros, Inge M.; Ambros, Peter F.
2016-01-01
Introduction Tumor touch imprints (TTIs) are routinely used for the molecular diagnosis of neuroblastomas by interphase fluorescence in-situ hybridization (I-FISH). However, in order to facilitate a comprehensive, up-to-date molecular diagnosis of neuroblastomas and to identify new markers to refine risk and therapy stratification methods, whole genome approaches are needed. We examined the applicability of an ultra-high density SNP array platform that identifies copy number changes of varying sizes down to a few exons for the detection of genomic changes in tumor DNA extracted from TTIs. Material and Methods DNAs were extracted from TTIs of 46 neuroblastoma and 4 other pediatric tumors. The DNAs were analyzed on the Cytoscan HD SNP array platform to evaluate numerical and structural genomic aberrations. The quality of the data obtained from TTIs was compared to that from randomly chosen fresh or fresh frozen solid tumors (n = 212) and I-FISH validation was performed. Results SNP array profiles were obtained from 48 (out of 50) TTI DNAs of which 47 showed genomic aberrations. The high marker density allowed for single gene analysis, e.g. loss of nine exons in the ATRX gene and the visualization of chromothripsis. Data quality was comparable to fresh or fresh frozen tumor SNP profiles. SNP array results were confirmed by I-FISH. Conclusion TTIs are an excellent source for SNP array processing with the advantage of simple handling, distribution and storage of tumor tissue on glass slides. The minimal amount of tumor tissue needed to analyze whole genomes makes TTIs an economic surrogate source in the molecular diagnostic work up of tumor samples. PMID:27560999
Measuring diversity in Gossypium hirsutum using the CottonSNP63K Array
USDA-ARS's Scientific Manuscript database
A CottonSNP63K array and accompanying cluster file has been developed and includes 45,104 intra-specific SNPs and 17,954 inter-specific SNPs for automated genotyping of cotton (Gossypium spp.) samples. Development of the cluster file included genotyping of 1,156 samples, a subset of which were iden...
Genome-wide copy number variation (CNV) in patients with autoimmune Addison's disease
2011-01-01
Background Addison's disease (AD) is caused by an autoimmune destruction of the adrenal cortex. The pathogenesis is multi-factorial, involving genetic components and hitherto unknown environmental factors. The aim of the present study was to investigate if gene dosage in the form of copy number variation (CNV) could add to the repertoire of genetic susceptibility to autoimmune AD. Methods A genome-wide study using the Affymetrix GeneChip® Genome-Wide Human SNP Array 6.0 was conducted in 26 patients with AD. CNVs in selected genes were further investigated in a larger material of patients with autoimmune AD (n = 352) and healthy controls (n = 353) by duplex Taqman real-time polymerase chain reaction assays. Results We found that low copy number of UGT2B28 was significantly more frequent in AD patients compared to controls; conversely high copy number of ADAM3A was associated with AD. Conclusions We have identified two novel CNV associations to ADAM3A and UGT2B28 in AD. The mechanism by which this susceptibility is conferred is at present unclear, but may involve steroid inactivation (UGT2B28) and T cell maturation (ADAM3A). Characterization of these proteins may unravel novel information on the pathogenesis of autoimmunity. PMID:21851588
Bilateral Wilms tumor with TP53-related anaplasia.
Popov, Sergey D; Vujanic, Gordan M; Sebire, Neil J; Chagtai, Tasnim; Williams, Richard; Vaidya, Sucheta; Pritchard-Jones, Kathy
2013-01-01
Wilms tumor (WT) with diffuse anaplasia has an unfavorable prognosis and is often (>70%) associated with mutations in the TP53 gene. Although most WTs are unilateral, 5-10% are bilateral, and they are almost always present with nephrogenic rests. The latter are considered a precursor of WT. Two cases of bilateral WTs with nephroblastomatosis, in which anaplastic changes were detected over a period of time, were analyzed using clinical, radiological, histopathological, and molecular-genetic data. TP53 was analyzed by direct sequencing of its full coding sequence and intron-exon boundaries in 11 fragments. DNA was extracted from paraffin-embedded or frozen specimens. High-resolution genomic copy number profiling was carried out by UCL Genomics on the Affymetrix Human Mapping 250K Nsp or Genome-Wide Human SNP Array 6.0 platform. Both cases demonstrated a strong association between the appearance of anaplastic clones and TP53 mutations. Synchronous ganglioneuroma was diagnosed in one case. Our cases are unique as they represent a long disease history and demonstrate the difficulties in managing rare cases of bilateral WT with anaplasia. These cases also emphasize the practical importance of modern molecular-genetic techniques and their clinical application. Moreover, they highlight the issue of the adequate sampling needed in order to gather comprehensive, efficient, and sufficient information about genetic events in a single tumor.
2012-01-01
Background A whole-genome genotyping array has previously been developed for Malus using SNP data from 28 Malus genotypes. This array offers the prospect of high throughput genotyping and linkage map development for any given Malus progeny. To test the applicability of the array for mapping in diverse Malus genotypes, we applied the array to the construction of a SNP-based linkage map of an apple rootstock progeny. Results Of the 7,867 Malus SNP markers on the array, 1,823 (23.2%) were heterozygous in one of the two parents of the progeny, 1,007 (12.8%) were heterozygous in both parental genotypes, whilst just 2.8% of the 921 Pyrus SNPs were heterozygous. A linkage map spanning 1,282.2 cM was produced comprising 2,272 SNP markers, 306 SSR markers and the S-locus. The length of the M432 linkage map was increased by 52.7 cM with the addition of the SNP markers, whilst marker density increased from 3.8 cM/marker to 0.5 cM/marker. Just three regions in excess of 10 cM remain where no markers were mapped. We compared the positions of the mapped SNP markers on the M432 map with their predicted positions on the ‘Golden Delicious’ genome sequence. A total of 311 markers (13.7% of all mapped markers) mapped to positions that conflicted with their predicted positions on the ‘Golden Delicious’ pseudo-chromosomes, indicating the presence of paralogous genomic regions or mis-assignments of genome sequence contigs during the assembly and anchoring of the genome sequence. Conclusions We incorporated data for the 2,272 SNP markers onto the map of the M432 progeny and have presented the most complete and saturated map of the full 17 linkage groups of M. pumila to date. The data were generated rapidly in a high-throughput semi-automated pipeline, permitting significant savings in time and cost over linkage map construction using microsatellites. The application of the array will permit linkage maps to be developed for QTL analyses in a cost-effective manner, and the identification of SNPs that have been assigned erroneous positions on the ‘Golden Delicious’ reference sequence will assist in the continued improvement of the genome sequence assembly for that variety. PMID:22631220
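The percentages and the marker density quoted in this abstract can be re-derived from its own counts. The short check below is only a sketch: the variable names are illustrative, and the denominator used for density (all 2,272 SNPs, 306 SSRs and the S-locus together) is an assumption about how the authors computed it.

```python
# Counts quoted in the abstract.
malus_snps_on_array = 7867
het_in_one_parent = 1823
het_in_both_parents = 1007
mapped_snps = 2272
ssr_markers = 306
s_locus = 1
map_length_cm = 1282.2
conflicting_positions = 311


def pct(part, whole):
    """Percentage of `whole` represented by `part`, rounded to one decimal."""
    return round(100.0 * part / whole, 1)


# Assumption: density is taken over every mapped locus on the final map.
total_loci = mapped_snps + ssr_markers + s_locus

summary = {
    "het_one_parent_pct": pct(het_in_one_parent, malus_snps_on_array),
    "het_both_parents_pct": pct(het_in_both_parents, malus_snps_on_array),
    "conflicting_pct": pct(conflicting_positions, mapped_snps),
    "cm_per_marker": round(map_length_cm / total_loci, 1),
}
print(summary)
```

Under that assumption the recomputed values agree with the reported 23.2%, 12.8%, 13.7% and 0.5 cM/marker.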
Yi, Ming; Zhao, Yongmei; Jia, Li; He, Mei; Kebebew, Electron; Stephens, Robert M.
2014-01-01
To apply exome-seq-derived variants in the clinical setting, there is an urgent need to identify the best variant caller(s) from a large collection of available options. We have used an Illumina exome-seq dataset as a benchmark, with two validation scenarios—family pedigree information and SNP array data for the same samples, permitting global high-throughput cross-validation, to evaluate the quality of SNP calls derived from several popular variant discovery tools from both the open-source and commercial communities using a set of designated quality metrics. To the best of our knowledge, this is the first large-scale performance comparison of exome-seq variant discovery tools using high-throughput validation with both Mendelian inheritance checking and SNP array data, which allows us to gain insights into the accuracy of SNP calling through such high-throughput validation in an unprecedented way, whereas the previously reported comparison studies have only assessed concordance of these tools without directly assessing the quality of the derived SNPs. More importantly, the main purpose of our study was to establish a reusable procedure that applies high-throughput validation to compare the quality of SNP discovery tools with a focus on exome-seq, which can be used to compare any forthcoming tool(s) of interest. PMID:24831545
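The two validation scenarios named above, Mendelian inheritance checking and comparison with SNP array genotypes, can be sketched as follows. This is a minimal illustration, not the authors' procedure: it assumes biallelic autosomal SNPs coded as alternate-allele counts (0, 1, 2), and the function names and toy genotypes are hypothetical.

```python
def mendelian_consistent(father, mother, child):
    """True if the child's genotype can be formed from one allele per parent.

    Genotypes are alternate-allele counts (0, 1 or 2) at a biallelic
    autosomal SNP; this coding is an assumption of the sketch.
    """
    # Alternate-allele counts each parent can transmit in a gamete.
    transmissible = {0: {0}, 1: {0, 1}, 2: {1}}
    return any(
        f + m == child
        for f in transmissible[father]
        for m in transmissible[mother]
    )


def mendelian_error_rate(trios):
    """Fraction of fully called (father, mother, child) trios that are inconsistent."""
    called = [t for t in trios if None not in t]
    if not called:
        return float("nan")
    errors = sum(not mendelian_consistent(*t) for t in called)
    return errors / len(called)


def array_concordance(seq_calls, array_calls):
    """Fraction of SNPs called on both platforms whose genotypes agree.

    Both arguments map a SNP identifier to an alternate-allele count.
    """
    shared = [snp for snp in seq_calls if snp in array_calls]
    if not shared:
        return float("nan")
    matches = sum(seq_calls[snp] == array_calls[snp] for snp in shared)
    return matches / len(shared)


# Toy genotypes, invented for illustration only.
trios = [(0, 0, 0), (0, 0, 1), (1, 1, 2), (2, 0, None)]
print(mendelian_error_rate(trios))
print(array_concordance({"rs1": 0, "rs2": 1, "rs3": 2},
                        {"rs1": 0, "rs2": 2, "rs4": 1}))
```

A caller whose genotypes show a lower Mendelian error rate and a higher array concordance on the same samples would rank better under both checks; how the two metrics are weighted is left open here.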
Geraldes, A; Difazio, S P; Slavov, G T; Ranjan, P; Muchero, W; Hannemann, J; Gunter, L E; Wymore, A M; Grassa, C J; Farzaneh, N; Porth, I; McKown, A D; Skyba, O; Li, E; Fujita, M; Klápště, J; Martin, J; Schackwitz, W; Pennacchio, C; Rokhsar, D; Friedmann, M C; Wasteneys, G O; Guy, R D; El-Kassaby, Y A; Mansfield, S D; Cronk, Q C B; Ehlting, J; Douglas, C J; Tuskan, G A
2013-03-01
Genetic mapping of quantitative traits requires genotypic data for large numbers of markers in many individuals. For such studies, the use of large single nucleotide polymorphism (SNP) genotyping arrays still offers the most cost-effective solution. Herein we report on the design and performance of a SNP genotyping array for Populus trichocarpa (black cottonwood). This genotyping array was designed with SNPs pre-ascertained in 34 wild accessions covering most of the species latitudinal range. We adopted a candidate gene approach to the array design that resulted in the selection of 34 131 SNPs, the majority of which are located in, or within 2 kb of, 3543 candidate genes. A subset of the SNPs on the array (539) was selected based on patterns of variation among the SNP discovery accessions. We show that more than 95% of the loci produce high quality genotypes and that the genotyping error rate for these is likely below 2%. We demonstrate that even among small numbers of samples (n = 10) from local populations over 84% of loci are polymorphic. We also tested the applicability of the array to other species in the genus and found that the number of polymorphic loci decreases rapidly with genetic distance, with the largest numbers detected in other species in section Tacamahaca. Finally, we provide evidence for the utility of the array to address evolutionary questions such as intraspecific studies of genetic differentiation, species assignment and the detection of natural hybrids. © 2013 Blackwell Publishing Ltd.
A Discovery Resource of Rare Copy Number Variations in Individuals with Autism Spectrum Disorder
Prasad, Aparna; Merico, Daniele; Thiruvahindrapuram, Bhooma; Wei, John; Lionel, Anath C.; Sato, Daisuke; Rickaby, Jessica; Lu, Chao; Szatmari, Peter; Roberts, Wendy; Fernandez, Bridget A.; Marshall, Christian R.; Hatchwell, Eli; Eis, Peggy S.; Scherer, Stephen W.
2012-01-01
The identification of rare inherited and de novo copy number variations (CNVs) in human subjects has proven a productive approach to highlight risk genes for autism spectrum disorder (ASD). A variety of microarrays are available to detect CNVs, including single-nucleotide polymorphism (SNP) arrays and comparative genomic hybridization (CGH) arrays. Here, we examine a cohort of 696 unrelated ASD cases using a high-resolution one-million feature CGH microarray, the majority of which were previously genotyped with SNP arrays. Our objective was to discover new CNVs in ASD cases that were not detected by SNP microarray analysis and to delineate novel ASD risk loci via combined analysis of CGH and SNP array data sets on the ASD cohort and CGH data on an additional 1000 control samples. Of the 615 ASD cases analyzed on both SNP and CGH arrays, we found that 13,572 of 21,346 (64%) of the CNVs were exclusively detected by the CGH array. Several of the CGH-specific CNVs are rare in population frequency and impact previously reported ASD genes (e.g., NRXN1, GRM8, DPYD), as well as novel ASD candidate genes (e.g., CIB2, DAPP1, SAE1), and all were inherited except for a de novo CNV in the GPHN gene. A functional enrichment test of gene-sets in ASD cases over controls revealed nucleotide metabolism as a potential novel pathway involved in ASD, which includes several candidate genes for follow-up (e.g., DPYD, UPB1, UPP1, TYMP). Finally, this extensively phenotyped and genotyped ASD clinical cohort serves as an invaluable resource for the next step of genome sequencing for complete genetic variation detection. PMID:23275889
Scalabrin, Simone; Gilmore, Barbara; Lawley, Cynthia T.; Gasic, Ksenija; Micheletti, Diego; Rosyara, Umesh R.; Cattonaro, Federica; Vendramin, Elisa; Main, Dorrie; Aramini, Valeria; Blas, Andrea L.; Mockler, Todd C.; Bryant, Douglas W.; Wilhelm, Larry; Troggio, Michela; Sosinski, Bryon; Aranzana, Maria José; Arús, Pere; Iezzoni, Amy; Morgante, Michele; Peace, Cameron
2012-01-01
Although a large number of single nucleotide polymorphism (SNP) markers covering the entire genome are needed to enable molecular breeding efforts such as genome wide association studies, fine mapping, genomic selection and marker-assisted selection in peach [Prunus persica (L.) Batsch] and related Prunus species, only a limited number of genetic markers, including simple sequence repeats (SSRs), have been available to date. To address this need, an international consortium (The International Peach SNP Consortium; IPSC) has pursued a coordinated effort to perform genome-scale SNP discovery in peach using next generation sequencing platforms to develop and characterize a high-throughput Illumina Infinium® SNP genotyping array platform. We performed whole genome re-sequencing of 56 peach breeding accessions using the Illumina and Roche/454 sequencing technologies. Polymorphism detection algorithms identified a total of 1,022,354 SNPs. Validation with the Illumina GoldenGate® assay was performed on a subset of the predicted SNPs, verifying ∼75% of genic (exonic and intronic) SNPs, whereas only about a third of intergenic SNPs were verified. Conservative filtering was applied to arrive at a set of 8,144 SNPs that were included on the IPSC peach SNP array v1, distributed over all eight peach chromosomes with an average spacing of 26.7 kb between SNPs. Use of this platform to screen a total of 709 accessions of peach in two separate evaluation panels identified a total of 6,869 (84.3%) polymorphic SNPs. The almost 7,000 SNPs verified as polymorphic through extensive empirical evaluation represent an excellent source of markers for future studies in genetic relatedness, genetic mapping, and dissecting the genetic architecture of complex agricultural traits. The IPSC peach SNP array v1 is commercially available and we expect that it will be used worldwide for genetic studies in peach and related stone fruit and nut species. PMID:22536421
Identifying the impact of G-quadruplexes on Affymetrix 3' arrays using cloud computing.
Memon, Farhat N; Owen, Anne M; Sanchez-Graillet, Olivia; Upton, Graham J G; Harrison, Andrew P
2010-01-15
A tetramer quadruplex structure is formed by four parallel strands of DNA/RNA containing runs of guanine. These quadruplexes are able to form because guanine can Hoogsteen hydrogen bond to other guanines, and a tetrad of guanines can form a stable arrangement. Recently we have discovered that probes on Affymetrix GeneChips that contain runs of guanine do not measure gene expression reliably. We associate this finding with the likelihood that quadruplexes are forming on the surface of GeneChips. In order to cope with the rapidly expanding size of GeneChip array datasets in the public domain, we are exploring the use of cloud computing to replicate our experiments on 3' arrays to look at the effect of the location of G-spots (runs of guanines). Cloud computing is a recently introduced high-performance solution that takes advantage of the computational infrastructure of large organisations such as Amazon and Google. We expect that cloud computing will become widely adopted because it enables bioinformaticians to avoid capital expenditure on expensive computing resources and to only pay a cloud computing provider for what is used. Moreover, in addition to being financially efficient, cloud computing is an ecologically friendly technology that enables efficient data sharing, and we expect it to be faster for development purposes. Here we propose the advantageous use of cloud computing to perform a large data-mining analysis of public domain 3' arrays.
Ferchaud, Anne-Laure; Pedersen, Susanne H; Bekkevold, Dorte; Jian, Jianbo; Niu, Yongchao; Hansen, Michael M
2014-10-06
The threespine stickleback (Gasterosteus aculeatus) has become an important model species for studying both contemporary and parallel evolution. In particular, differential adaptation to freshwater and marine environments has led to high differentiation between freshwater and marine stickleback populations at the phenotypic trait of lateral plate morphology and the underlying candidate gene Ectodysplasin (EDA). Many studies have focused on this trait and candidate gene, although other genes involved in marine-freshwater adaptation may be equally important. In order to develop a resource for rapid and cost efficient analysis of genetic divergence between freshwater and marine sticklebacks, we generated a low-density SNP (Single Nucleotide Polymorphism) array encompassing markers of chromosome regions under putative directional selection, along with neutral markers for background. RAD (Restriction site Associated DNA) sequencing of sixty individuals representing two freshwater and one marine population led to the identification of 33,993 SNP markers. Ninety-six of these were chosen for the low-density SNP array, among which 70 represented SNPs under putatively directional selection in freshwater vs. marine environments, whereas 26 SNPs were assumed to be neutral. Annotation of these regions revealed several genes that are candidates for affecting stickleback phenotypic variation, some of which have been observed in previous studies whereas others are new. We have developed a cost-efficient low-density SNP array that allows for rapid screening of polymorphisms in threespine stickleback. The array provides a valuable tool for analyzing adaptive divergence between freshwater and marine stickleback populations beyond the well-established candidate gene Ectodysplasin (EDA).
USDA-ARS's Scientific Manuscript database
Oligonucleotide microarrays (GeneChip Bovine Genome Arrays, Affymetrix Inc., Santa Clara, CA) were used to evaluate gene expression profiles in anterior pituitary glands collected from 4 anestrous and 4 cycling postpartum primiparous beef cows to provide insight into genes associated with transitio...
FULL-GENOME ANALYSIS OF ALTERNATIVE SPLICING IN MOUSE LIVER AFTER HEPATOTOXICANT EXPOSURE
Alternative splicing plays a role in determining gene function and protein diversity. We have employed whole genome exon profiling using Affymetrix Mouse Exon 1.0 ST arrays to understand the significance of alternative splicing on a genome-wide scale in response to multiple toxic...
Herbicides are structurally diverse chemicals that inhibit plant-specific targets; however, their off-target and potentially differentiating side-effects are less well defined. In this study, genome-wide expression profiling based on Affymetrix ATH1 arrays was used to identify dis...
In this study, genome-wide expression profiling based on Affymetrix ATH1 arrays was used to identify discriminating responses of Arabidopsis thaliana to five herbicides, which contain active ingredients targeting two different branches of amino acid biosynthesis. One herbicide co...
Nie, Bei; Yang, Min; Fu, Weiling; Liang, Zhiqing
2015-07-07
The surface invasive cleavage assay, because of its innate accuracy and ability for self-signal amplification, provides a potential route for the mapping of hundreds of thousands of human SNP sites. However, its performance on a high density DNA array has not yet been established, due to the unusual "hairpin" probe design on the microarray and the lack of chemical stability of commercially available substrates. Here we present an applicable method to implement a nanocrystalline diamond thin film as an alternative substrate for fabricating an addressable DNA array using maskless light-directed photochemistry, producing the most chemically stable and biocompatible system for genetic analysis and enzymatic reactions. The surface invasive cleavage reaction, followed by degenerated primer ligation and post-rolling circle amplification is consecutively performed on the addressable diamond DNA array, accurately mapping SNP sites from PCR-amplified human genomic target DNA. Furthermore, a specially-designed DNA array containing dual probes in the same pixel is fabricated by following a reverse light-directed DNA synthesis protocol. This essentially enables us to decipher thousands of SNP alleles in a single-pot reaction by the simple addition of enzyme, target and reaction buffers.
Tang, Shaohua; Lv, Jiaojiao; Chen, Xiangnan; Bai, Lili; Li, Huanzheng; Chen, Chong; Wang, Ping; Xu, Xueqin; Lu, Jianxin
2016-01-01
To evaluate the usefulness of single-nucleotide polymorphism (SNP) array for prenatal genetic diagnosis of congenital heart defect (CHD), we used this approach to detect clinically significant copy number variants (CNVs) in fetuses with CHDs. A HumanCytoSNP-12 array was used to detect genomic samples obtained from 39 fetuses that exhibited cardiovascular abnormalities on ultrasound and had a normal karyotype. The relationship between CNVs and CHDs was identified by using genotype-phenotype comparisons and searching of chromosomal databases. All clinically significant CNVs were confirmed by real-time PCR. CNVs were detected in 38/39 (97.4%) fetuses: variants of unknown significance were detected in 2/39 (5.1%), and clinically significant CNVs were identified in 7/39 (17.9%). In 3 of the 7 fetuses with clinically significant CNVs, 3 rare and previously undescribed CNVs were detected, and these CNVs encompassed the CHD candidate genes FLNA (Xq28 dup), BCOR (Xp11.4 dup), and RBL2 (16q12.2 del). Compared with conventional cytogenetic genomics, SNP array analysis provides significantly improved detection of submicroscopic genomic aberrations in pregnancies with CHDs. Based on these results, we propose that genomic SNP array is an effective method which could be used in the prenatal diagnostic test to assist genetic counseling for pregnancies with CHDs. © 2015 S. Karger AG, Basel.
USDA-ARS's Scientific Manuscript database
Our objective was to evaluate whether breed composition of crossbred cattle could be predicted using reference breed frequencies of SNP markers on the BovineSNP50 array. Semen DNA samples of over 2,000 bulls from 16 common commercial beef breeds were genotyped using the array and used to estimate cu...
Fine mapping of copy number variations on two cattle genome assemblies using high density SNP array
USDA-ARS's Scientific Manuscript database
Btau_4.0 and UMD3.1 are two distinct cattle reference genome assemblies. In our previous study using the low density BovineSNP50 array, we reported a copy number variation (CNV) analysis on Btau_4.0 with 521 animals of 21 cattle breeds, yielding 682 CNV regions with a total length of 139.8 megabases...
Diversity analysis of cotton (Gossypium hirsutum L.) germplasm using the CottonSNP63K Array.
Hinze, Lori L; Hulse-Kemp, Amanda M; Wilson, Iain W; Zhu, Qian-Hao; Llewellyn, Danny J; Taylor, Jen M; Spriggs, Andrew; Fang, David D; Ulloa, Mauricio; Burke, John J; Giband, Marc; Lacape, Jean-Marc; Van Deynze, Allen; Udall, Joshua A; Scheffler, Jodi A; Hague, Steve; Wendel, Jonathan F; Pepper, Alan E; Frelichowski, James; Lawley, Cindy T; Jones, Don C; Percy, Richard G; Stelly, David M
2017-02-03
Cotton germplasm resources contain beneficial alleles that can be exploited to develop germplasm adapted to emerging environmental and climate conditions. Accessions and lines have traditionally been characterized based on phenotypes, but phenotypic profiles are limited by the cost, time, and space required to make visual observations and measurements. With advances in molecular genetic methods, genotypic profiles are increasingly able to identify differences among accessions due to the larger number of genetic markers that can be measured. A combination of both methods would greatly enhance our ability to characterize germplasm resources. Recent efforts have culminated in the identification of sufficient SNP markers to establish high-throughput genotyping systems, such as the CottonSNP63K array, which enables a researcher to efficiently analyze large numbers of SNP markers and obtain highly repeatable results. In the current investigation, we have utilized the SNP array for analyzing genetic diversity primarily among cotton cultivars, making comparisons to SSR-based phylogenetic analyses, and identifying loci associated with seed nutritional traits. The SNP markers distinctly separated G. hirsutum from other Gossypium species and distinguished the wild from cultivated types of G. hirsutum. The markers also efficiently discerned differences among cultivars, which was the primary goal when designing the CottonSNP63K array. Population structure within the genus compared favorably with previous results obtained using SSR markers, and an association study identified loci linked to factors that affect cottonseed protein content. Our results provide a large genome-wide variation data set for primarily cultivated cotton. Thousands of SNPs in representative cotton genotypes provide an opportunity to finely discriminate among cultivated cotton from around the world. The SNPs will be relevant as dense markers of genome variation for association mapping approaches aimed at correlating molecular polymorphisms with variation in phenotypic traits, as well as for molecular breeding approaches in cotton.
Zhao, Ni; Wilkerson, Matthew D; Shah, Usman; Yin, Xiaoying; Wang, Anyou; Hayward, Michele C; Roberts, Patrick; Lee, Carrie B; Parsons, Alden M; Thorne, Leigh B; Haithcock, Benjamin E; Grilley-Olson, Juneko E; Stinchcombe, Thomas E; Funkhouser, William K; Wong, Kwok-Kin; Sharpless, Norman E; Hayes, D Neil
2014-11-01
Brain metastases are one of the most malignant complications of lung cancer and constitute a significant cause of cancer-related morbidity and mortality worldwide. Recent investigations have suggested a role for LKB1 in NSCLC development and progression, in synergy with KRAS alteration. In this study, we systematically analyzed how LKB1 and KRAS alterations, measured by mutation, gene expression (GE) and copy number (CN), are associated with brain metastasis in NSCLC. Patients with NSCLC treated at University of North Carolina Hospital from 1990 to 2009 provided frozen, surgically extracted tumors for analysis. GE was measured using Agilent 44,000 custom-designed arrays, CN was assessed by Affymetrix GeneChip Human Mapping 250K Sty Array or the Genome-Wide Human SNP Array 6.0, and gene mutation was detected using ABI sequencing. Integrated analysis was conducted to assess the relationship between these genetic markers and brain metastasis. A model was proposed for brain metastasis prediction using these genetic measurements. Seventeen of the 174 patients developed brain metastasis. LKB1 wild type tumors had significantly higher LKB1 CN (p<0.001) and GE (p=0.002) than the LKB1 mutant group. KRAS wild type tumors had significantly lower KRAS GE (p<0.001) and lower CN, although the latter failed to be significant (p=0.295). Lower LKB1 CN (p=0.039) and KRAS mutation (p=0.007) were significantly associated with more brain metastasis. The predictive model based on nodal (N) stage, patient age, LKB1 CN and KRAS mutation had a good prediction accuracy, with area under the ROC curve of 0.832 (p<0.001). LKB1 CN in combination with KRAS mutation predicted brain metastasis in NSCLC. Copyright © 2014 The Authors. Published by Elsevier Ireland Ltd. All rights reserved.
USDA-ARS's Scientific Manuscript database
Natural antisense transcripts (NATs) are transcripts of the opposite DNA strand to the sense-strand either at the same locus (cis-encoded) or a different locus (trans-encoded). They can affect gene expression at multiple stages including transcription, RNA processing and transport, and translation....
USDA-ARS's Scientific Manuscript database
Transcriptional profiles of the soybean (Glycine max (L.) Merr.) near-isogenic lines Clark (PI548553, iron efficient) and IsoClark (PI547430, iron inefficient) were analyzed and compared using the Affymetrix® GeneChip® Soybean Genome Array. A comparison of plants grown under Fe-sufficient and Fe-limited ...
Goldstein, Orly; Mezey, Jason G.; Schweitzer, Peter A.; Boyko, Adam R.; Gao, Chuan; Bustamante, Carlos D.; Jordan, Julie Ann; Aguirre, Gustavo D.; Acland, Gregory M.
2013-01-01
Purpose. To identify the causative mutations in two early-onset canine retinal degenerations, crd1 and crd2, segregating in the American Staffordshire Terrier and the Pit Bull Terrier breeds, respectively. Methods. Retinal morphology of crd1- and crd2-affected dogs was evaluated by light microscopy. DNA was extracted from affected dogs and related unaffected controls. Association analysis was undertaken using the Illumina Canine SNP array and PLINK (crd1 study), or the Affymetrix Version 2 Canine array, the “MAGIC” genotype algorithm, and Fisher's exact test for association (crd2 study). Positional candidate genes were evaluated for each disease. Results. Structural photoreceptor abnormalities were observed in crd1-affected dogs as young as 11 weeks old. Rod and cone inner segments (IS) and outer segments (OS) were abnormal in size, shape, and number. In crd2-affected dogs, rod and cone IS and OS were abnormal as early as 3 weeks of age, progressing with age to severe loss of the OS and thinning of the outer nuclear layer (ONL) by 12 weeks of age. A genome-wide association study (GWAS) identified association at the telomeric end of CFA3 in crd1-affected dogs and on CFA33 in crd2-affected dogs. Candidate gene evaluation identified a three-base deletion in exon 21 of PDE6B in crd1-affected dogs, and a cytosine insertion in exon 10 of IQCB1 in crd2-affected dogs. Conclusions. Identification of the mutations responsible for these two early-onset retinal degenerations provides new large animal models for comparative disease studies and evaluation of potential therapeutic approaches for the homologous human diseases. PMID:24045995
Reese, Sarah E; Archer, Kellie J; Therneau, Terry M; Atkinson, Elizabeth J; Vachon, Celine M; de Andrade, Mariza; Kocher, Jean-Pierre A; Eckel-Passow, Jeanette E
2013-11-15
Batch effects are due to probe-specific systematic variation between groups of samples (batches) resulting from experimental features that are not of biological interest. Principal component analysis (PCA) is commonly used as a visual tool to determine whether batch effects exist after applying a global normalization method. However, PCA yields linear combinations of the variables that contribute maximum variance and thus will not necessarily detect batch effects if they are not the largest source of variability in the data. We present an extension of PCA to quantify the existence of batch effects, called guided PCA (gPCA). We describe a test statistic that uses gPCA to test whether a batch effect exists. We apply our proposed test statistic derived using gPCA to simulated data and to two copy number variation case studies: the first study consisted of 614 samples from a breast cancer family study using Illumina Human 660 bead-chip arrays, whereas the second case study consisted of 703 samples from a family blood pressure study that used Affymetrix SNP Array 6.0. We demonstrate that our statistic has good statistical properties and is able to identify significant batch effects in two copy number variation case studies. We developed a new statistic that uses gPCA to identify whether batch effects exist in high-throughput genomic data. Although our examples pertain to copy number data, gPCA is general and can be used on other data types as well. The gPCA R package (available via CRAN) provides functionality and data to perform the methods in this article.
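The idea of guiding PCA with batch labels can be illustrated with a minimal sketch. It assumes the statistic is the ratio of the variance captured by the first batch-guided component (from the SVD of Y'X, where Y is a batch indicator matrix) to the variance captured by the first ordinary principal component, with significance judged by permuting the batch labels. The function names `gpca_delta` and `gpca_permutation_test`, the permutation count, and the centering step are illustrative assumptions, not the interface or defaults of the gPCA R package.

```python
import numpy as np

def gpca_delta(X, batch):
    """Ratio of variance along the first batch-guided direction to variance
    along the first ordinary principal component (illustrative statistic)."""
    X = np.asarray(X, dtype=float)
    X = X - X.mean(axis=0)                      # center each feature
    batch = np.asarray(batch)
    levels = np.unique(batch)
    # n x b indicator matrix: Y[i, j] = 1 if sample i belongs to batch j
    Y = (batch[:, None] == levels[None, :]).astype(float)
    # Unguided PCA: first right singular vector of X
    _, _, vt_unguided = np.linalg.svd(X, full_matrices=False)
    # Guided PCA: first right singular vector of Y'X
    _, _, vt_guided = np.linalg.svd(Y.T @ X, full_matrices=False)
    var_guided = np.var(X @ vt_guided[0], ddof=1)
    var_unguided = np.var(X @ vt_unguided[0], ddof=1)
    return var_guided / var_unguided

def gpca_permutation_test(X, batch, n_perm=200, seed=0):
    """Permutation p-value for the statistic, shuffling the batch labels."""
    rng = np.random.default_rng(seed)
    batch = np.asarray(batch)
    observed = gpca_delta(X, batch)
    permuted = np.array(
        [gpca_delta(X, rng.permutation(batch)) for _ in range(n_perm)]
    )
    p_value = (np.sum(permuted >= observed) + 1) / (n_perm + 1)
    return observed, p_value

# Usage on simulated data (hypothetical): 40 samples, 50 probes,
# with the second batch shifted upward on the first 20 probes.
rng = np.random.default_rng(1)
X_demo = rng.normal(size=(40, 50))
batch_demo = np.array([0] * 20 + [1] * 20)
X_demo[batch_demo == 1, :20] += 5.0
delta_demo, p_demo = gpca_permutation_test(X_demo, batch_demo)
```

Because the ordinary first principal component maximizes projected variance, the ratio cannot exceed 1; values near 1 suggest that batch is the dominant source of variation.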
DOE Office of Scientific and Technical Information (OSTI.GOV)
Geraldes, Armando; Hannemann, Jan; Grassa, Chris
2013-01-01
Genetic mapping of quantitative traits requires genotypic data for large numbers of markers in many individuals. Despite the declining costs of genotyping by sequencing, for most studies, the use of large SNP genotyping arrays still offers the most cost-effective solution for large-scale targeted genotyping. Here we report on the design and performance of a SNP genotyping array for Populus trichocarpa (black cottonwood). This genotyping array was designed with SNPs pre-ascertained in 34 wild accessions covering most of the species range. Due to the rapid decay of linkage disequilibrium in P. trichocarpa we adopted a candidate gene approach to the array design that resulted in the selection of 34,131 SNPs, the majority of which are located in, or within 2 kb of, 3,543 candidate genes. A subset of the SNPs (539) was selected based on patterns of variation among the SNP discovery accessions. We show that more than 95% of the loci produce high quality genotypes and that the genotyping error rate for these is likely below 2%, indicating that high-quality data are generated with this array. We demonstrate that even among small numbers of samples (n=10) from local populations over 84% of loci are polymorphic. We also tested the applicability of the array to other species in the genus and found that due to ascertainment bias the number of polymorphic loci decreases rapidly with genetic distance, with the largest numbers detected in other species in section Tacamahaca (P. balsamifera and P. angustifolia). Finally, we provide evidence for the utility of the array for intraspecific studies of genetic differentiation and for species assignment and the detection of natural hybrids.
Li, Ao; Liu, Zongzhi; Lezon-Geyda, Kimberly; Sarkar, Sudipa; Lannin, Donald; Schulz, Vincent; Krop, Ian; Winer, Eric; Harris, Lyndsay; Tuck, David
2011-01-01
There is an increasing interest in using single nucleotide polymorphism (SNP) genotyping arrays for profiling chromosomal rearrangements in tumors, as they allow simultaneous detection of copy number and loss of heterozygosity with high resolution. Critical issues such as signal baseline shift due to aneuploidy, normal cell contamination, and the presence of GC content bias have been reported to dramatically alter SNP array signals and complicate accurate identification of aberrations in cancer genomes. To address these issues, we propose a novel Global Parameter Hidden Markov Model (GPHMM) to unravel tangled genotyping data generated from tumor samples. In contrast to other HMM methods, a distinct feature of GPHMM is that the issues mentioned above are quantitatively modeled by global parameters and integrated within the statistical framework. We developed an efficient EM algorithm for parameter estimation. We evaluated performance on three data sets and show that GPHMM can correctly identify chromosomal aberrations in tumor samples containing as few as 10% cancer cells. Furthermore, we demonstrated that the estimation of global parameters in GPHMM provides information about the biological characteristics of tumor samples and the quality of genotyping signal from SNP array experiments, which is helpful for data quality control and outlier detection in cohort studies. PMID:21398628
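To see why normal cell contamination and baseline shift distort SNP array signals, the following sketch computes idealized expected values of the log R ratio (LRR) and B allele frequency (BAF) for a tumor/normal mixture at a SNP that is heterozygous in the normal genome. It uses a generic textbook mixture model under stated assumptions and is not the GPHMM parameterization; the function name `expected_signals` and its parameters are hypothetical.

```python
import math

def expected_signals(total_cn, b_cn, tumor_fraction, baseline_shift=0.0):
    """Idealized expected (LRR, BAF) at a SNP that is heterozygous (AB) in
    normal cells, for a sample mixing tumor and normal cells.

    total_cn       -- total copy number in tumor cells
    b_cn           -- number of B allele copies in tumor cells
    tumor_fraction -- proportion of cancer cells in the sample (0..1)
    baseline_shift -- additive LRR offset, e.g. from aneuploidy (assumed)
    """
    if not 0.0 <= tumor_fraction <= 1.0:
        raise ValueError("tumor_fraction must be in [0, 1]")
    normal_fraction = 1.0 - tumor_fraction
    # Average copies per cell across the mixture (normal cells: 2 copies, 1 B)
    avg_total = 2.0 * normal_fraction + total_cn * tumor_fraction
    avg_b = 1.0 * normal_fraction + b_cn * tumor_fraction
    if avg_total <= 0.0:
        raise ValueError("no DNA at this locus; LRR and BAF are undefined")
    lrr = math.log2(avg_total / 2.0) + baseline_shift
    baf = avg_b / avg_total
    return lrr, baf

# Hemizygous deletion (one A copy left) in a pure tumor vs. 10% tumor content
pure = expected_signals(total_cn=1, b_cn=0, tumor_fraction=1.0)
diluted = expected_signals(total_cn=1, b_cn=0, tumor_fraction=0.1)
```

Under this simplified model, the deletion signal in the 10% sample sits very close to the normal diploid values, which is why low tumor content makes aberrations hard to call without modeling contamination explicitly.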
Trembizki, Ella; Smith, Helen; Lahra, Monica M; Chen, Marcus; Donovan, Basil; Fairley, Christopher K; Guy, Rebecca; Kaldor, John; Regan, David; Ward, James; Nissen, Michael D; Sloots, Theo P; Whiley, David M
2014-06-01
Neisseria gonorrhoeae antimicrobial resistance (AMR) is a global problem heightened by emerging resistance to ceftriaxone. Appropriate molecular typing methods are important for understanding the emergence and spread of N. gonorrhoeae AMR. We report on the development, validation and testing of a Sequenom MassARRAY iPLEX method for multilocus sequence typing (MLST)-style genotyping of N. gonorrhoeae isolates. An iPLEX MassARRAY method (iPLEX14SNP) was developed targeting 14 informative gonococcal single nucleotide polymorphisms (SNPs) previously shown to predict MLST types. The method was initially validated using 24 N. gonorrhoeae control isolates and was then applied to 397 test isolates collected throughout Queensland, Australia in the first half of 2012. The iPLEX14SNP method provided 100% accuracy for the control isolates, correctly identifying all 14 SNPs for all 24 isolates (336/336). For the 397 test isolates, the iPLEX14SNP assigned results for 5461 of the possible 5558 SNPs (SNP call rate 98.25%), with complete 14 SNP profiles obtained for 364 isolates. Based on the complete SNP profile data, there were 49 different sequence types identified in Queensland, with 11 of the 49 SNP profiles accounting for the majority (n = 280; 77%) of isolates. AMR was dominated by several geographically clustered sequence types. Using the iPLEX14SNP method, up to 384 isolates could be tested within 1 working day for less than Aus$10 per isolate. The iPLEX14SNP offers an accurate and high-throughput method for the MLST-style genotyping of N. gonorrhoeae and may prove particularly useful for large-scale studies investigating the emergence and spread of gonococcal AMR. © The Author 2014. Published by Oxford University Press on behalf of the British Society for Antimicrobial Chemotherapy. All rights reserved.
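The counts reported above can be cross-checked with a few lines of arithmetic. The helper `call_rate` below is a hypothetical name introduced only for this check; all input numbers come from the abstract.

```python
def call_rate(n_isolates, snps_per_isolate, calls_made):
    """Return (possible SNP calls, call rate in percent)."""
    possible = n_isolates * snps_per_isolate
    return possible, 100.0 * calls_made / possible

# Control isolates: 24 isolates x 14 SNPs, all called correctly
control_possible, control_rate = call_rate(24, 14, 336)

# Test isolates: 397 isolates x 14 SNPs, 5461 calls assigned
test_possible, test_rate = call_rate(397, 14, 5461)

# Share of isolates with complete profiles covered by the 11 commonest profiles
top_profile_share = 100.0 * 280 / 364

print(control_possible, round(control_rate, 2))   # 336 100.0
print(test_possible, round(test_rate, 2))         # 5558 98.25
print(round(top_profile_share))                   # 77
```

The 77% figure is reproduced when the denominator is the 364 isolates with complete 14-SNP profiles rather than all 397 test isolates.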
Development and evaluation of the first high-throughput SNP array for common carp (Cyprinus carpio)
2014-01-01
Background. A large number of single nucleotide polymorphisms (SNPs) have been identified in common carp (Cyprinus carpio) but, as yet, no high-throughput genotyping platform is available for this species. C. carpio is an important aquaculture species that accounts for nearly 14% of freshwater aquaculture production worldwide. We have developed an array for C. carpio with 250,000 SNPs and evaluated its performance using samples from various strains of C. carpio. Results. The SNPs used on the array were selected from two resources: the transcribed sequences from RNA-seq data of four strains of C. carpio, and the genome re-sequencing data of five strains of C. carpio. The 250,000 SNPs on the resulting array are distributed evenly across the reference C. carpio genome with an average spacing of 6.6 kb. To evaluate the SNP array, 1,072 C. carpio samples were collected and tested. Of the 250,000 SNPs on the array, 185,150 (74.06%) were found to be polymorphic sites. Genotyping accuracy was checked using genotyping data from a group of full-siblings and their parents, and over 99.8% of the qualified SNPs were found to be reliable. Analysis of the linkage disequilibrium on all samples and on three domestic C. carpio strains revealed that the latter had the longer haplotype blocks. We also evaluated our SNP array on 80 samples from eight species related to C. carpio, which yielded between 53,526 and 71,984 polymorphic SNPs. An identity by state analysis divided all the samples into three clusters; most of the C. carpio strains formed the largest cluster. Conclusions. The Carp SNP array described here is the first high-throughput genotyping platform for C. carpio. Our evaluation of this array indicates that it will be valuable for farmed carp and for genetic and population biology studies in C. carpio and related species. PMID:24762296
Development and evaluation of the first high-throughput SNP array for common carp (Cyprinus carpio).
Xu, Jian; Zhao, Zixia; Zhang, Xiaofeng; Zheng, Xianhu; Li, Jiongtang; Jiang, Yanliang; Kuang, Youyi; Zhang, Yan; Feng, Jianxin; Li, Chuangju; Yu, Juhua; Li, Qiang; Zhu, Yuanyuan; Liu, Yuanyuan; Xu, Peng; Sun, Xiaowen
2014-04-24
A large number of single nucleotide polymorphisms (SNPs) have been identified in common carp (Cyprinus carpio) but, as yet, no high-throughput genotyping platform is available for this species. C. carpio is an important aquaculture species that accounts for nearly 14% of freshwater aquaculture production worldwide. We have developed an array for C. carpio with 250,000 SNPs and evaluated its performance using samples from various strains of C. carpio. The SNPs used on the array were selected from two resources: the transcribed sequences from RNA-seq data of four strains of C. carpio, and the genome re-sequencing data of five strains of C. carpio. The 250,000 SNPs on the resulting array are distributed evenly across the reference C. carpio genome with an average spacing of 6.6 kb. To evaluate the SNP array, 1,072 C. carpio samples were collected and tested. Of the 250,000 SNPs on the array, 185,150 (74.06%) were found to be polymorphic sites. Genotyping accuracy was checked using genotyping data from a group of full-siblings and their parents, and over 99.8% of the qualified SNPs were found to be reliable. Analysis of the linkage disequilibrium on all samples and on three domestic C. carpio strains revealed that the latter had the longer haplotype blocks. We also evaluated our SNP array on 80 samples from eight species related to C. carpio, which yielded between 53,526 and 71,984 polymorphic SNPs. An identity by state analysis divided all the samples into three clusters; most of the C. carpio strains formed the largest cluster. The Carp SNP array described here is the first high-throughput genotyping platform for C. carpio. Our evaluation of this array indicates that it will be valuable for farmed carp and for genetic and population biology studies in C. carpio and related species.
McIntosh, Laura A; Marion, Miranda C; Sudman, Marc; Comeau, Mary E; Becker, Mara L; Bohnsack, John F; Fingerlin, Tasha E; Griffin, Thomas A; Haas, J Peter; Lovell, Daniel J; Maier, Lisa A; Nigrovic, Peter A; Prahalad, Sampath; Punaro, Marilynn; Rosé, Carlos D; Wallace, Carol A; Wise, Carol A; Moncrieffe, Halima; Howard, Timothy D; Langefeld, Carl D; Thompson, Susan D
2017-11-01
Juvenile idiopathic arthritis (JIA) is the most common childhood rheumatic disease and has a strong genomic component. To date, JIA genetic association studies have had limited sample sizes, used heterogeneous patient populations, or included only candidate regions. The aim of this study was to identify new associations between JIA patients with oligoarticular disease and those with IgM rheumatoid factor (RF)-negative polyarticular disease, which are clinically similar and the most prevalent JIA disease subtypes. Three cohorts comprising 2,751 patients with oligoarticular or RF-negative polyarticular JIA were genotyped using the Affymetrix Genome-Wide SNP Array 6.0 or the Illumina HumanCoreExome-12+ Array. Overall, 15,886 local and out-of-study controls, typed on these platforms or the Illumina HumanOmni2.5, were used for association analyses. High-quality single-nucleotide polymorphisms (SNPs) were used for imputation to 1000 Genomes prior to SNP association analysis. Meta-analysis showed evidence of association (P < 1 × 10⁻⁶) at 9 regions: PRR9_LOR (P = 5.12 × 10⁻⁸), ILDR1_CD86 (P = 6.73 × 10⁻⁸), WDFY4 (P = 1.79 × 10⁻⁷), PTH1R (P = 1.87 × 10⁻⁷), RNF215 (P = 3.09 × 10⁻⁷), AHI1_LINC00271 (P = 3.48 × 10⁻⁷), JAK1 (P = 4.18 × 10⁻⁷), LINC00951 (P = 5.80 × 10⁻⁷), and HBP1 (P = 7.29 × 10⁻⁷). Of these, PRR9_LOR, ILDR1_CD86, RNF215, LINC00951, and HBP1 were shown, for the first time, to be autoimmune disease susceptibility loci. Furthermore, associated SNPs included cis expression quantitative trait loci for WDFY4, CCDC12, MTP18, SF3A1, AHI1, COG5, HBP1, and GPR22. This study provides evidence of both unique JIA risk loci and risk loci overlapping between JIA and other autoimmune diseases. These newly associated SNPs are shown to influence gene expression, and their bounding regions tie into molecular pathways of immunologic relevance.
Thus, they likely represent regions that contribute to the pathology of oligoarticular JIA and RF-negative polyarticular JIA. © 2017, American College of Rheumatology.
Linkage Analysis in Autoimmune Addison's Disease: NFATC1 as a Potential Novel Susceptibility Locus.
Mitchell, Anna L; Bøe Wolff, Anette; MacArthur, Katie; Weaver, Jolanta U; Vaidya, Bijay; Erichsen, Martina M; Darlay, Rebecca; Husebye, Eystein S; Cordell, Heather J; Pearce, Simon H S
2015-01-01
Autoimmune Addison's disease (AAD) is a rare, highly heritable autoimmune endocrinopathy. Highly penetrant variants that confer disease susceptibility may remain to be discovered. DNA samples from 23 multiplex AAD pedigrees from the UK and Norway (50 cases, 67 controls) were genotyped on the Affymetrix SNP 6.0 array. Linkage analysis was performed using Merlin. EMMAX was used to carry out a genome-wide association analysis comparing the familial AAD cases to 2706 UK WTCCC controls. To explore some of the linkage findings further, a replication study was performed by genotyping 64 SNPs in two of the four linked regions (chromosomes 7 and 18) on the Sequenom iPlex platform in three European AAD case-control cohorts (1097 cases, 1117 controls). The data were analysed using a meta-analysis approach. In a parametric analysis, applying a rare dominant model, loci on chromosomes 7, 9 and 18 had LOD scores >2.8. In a non-parametric analysis, a locus corresponding to the HLA region on chromosome 6, known to be associated with AAD, had a LOD score >3.0. In the genome-wide association analysis, a SNP cluster on chromosome 2 and a pair of SNPs on chromosome 6 were associated with AAD (P < 5 × 10⁻⁷). A meta-analysis of the replication study data demonstrated that three chromosome 18 SNPs were associated with AAD, including a non-synonymous variant in the NFATC1 gene. This linkage study has implicated a number of novel chromosomal regions in the pathogenesis of AAD in multiplex AAD families and adds further support to the role of HLA in AAD. The genome-wide association analysis has also identified a region of interest on chromosome 2. A replication study has demonstrated that the NFATC1 gene is worthy of future investigation; however, each of the regions identified requires further, systematic analysis.
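For readers unfamiliar with the scale of the LOD scores quoted above, the sketch below computes a textbook two-point LOD score for phase-known meioses: the base-10 logarithm of the likelihood at recombination fraction theta relative to the likelihood under no linkage (theta = 0.5). This is a simplified illustration only; it is not the multipoint pedigree likelihood that Merlin evaluates, and the function name `two_point_lod` is hypothetical.

```python
import math

def two_point_lod(n_meioses, n_recombinants, theta):
    """LOD = log10( L(theta) / L(0.5) ) for phase-known, fully informative
    meioses, where L(theta) = theta^k * (1 - theta)^(n - k)."""
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie in [0, 0.5]")
    if not 0 <= n_recombinants <= n_meioses:
        raise ValueError("recombinant count must be between 0 and n_meioses")
    k, n = n_recombinants, n_meioses
    if theta == 0.0 and k > 0:
        return float("-inf")        # any recombinant excludes theta = 0
    log_l_theta = (k * math.log10(theta) if k else 0.0) \
        + (n - k) * math.log10(1.0 - theta)
    log_l_null = n * math.log10(0.5)
    return log_l_theta - log_l_null

# Ten informative meioses with no recombinants, evaluated at theta = 0
lod_example = two_point_lod(10, 0, 0.0)   # equals 10 * log10(2), about 3.01
```

In this idealized setting, each non-recombinant meiosis contributes log10(2), roughly 0.3, so a LOD above 3 corresponds to about ten fully informative, non-recombinant meioses.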
A map of copy number variations in Chinese populations.
Lou, Haiyi; Li, Shilin; Yang, Yajun; Kang, Longli; Zhang, Xin; Jin, Wenfei; Wu, Bailin; Jin, Li; Xu, Shuhua
2011-01-01
It has been shown that the human genome contains extensive copy number variations (CNVs). Investigating the medical and evolutionary impacts of CNVs requires knowledge of their locations, sizes and frequency distributions within and between populations. However, CNVs in Chinese minorities, which harbor the majority of the genetic diversity of Chinese populations, have been studied far less than those in other populations. Here we constructed, to our knowledge, the first CNV map in seven Chinese populations representing the major linguistic groups in China, with 1,440 CNV regions identified using the Affymetrix SNP 6.0 Array. Considerable differences in the distributions of CNV regions between populations and substantial population structure were observed. We showed that ∼35% of CNV regions identified in minority ethnic groups are not shared by the Han Chinese population, indicating that the contribution of the minorities to the genetic architecture of the Chinese population cannot be ignored. We further identified highly differentiated CNV regions between populations. For example, a common deletion in Dong and Zhuang (44.4% and 50%), which overlaps two keratin-associated protein genes contributing to the structure of hair fibers, was not observed in Han Chinese. Interestingly, the most differentiated CNV deletion between HapMap CEU and YRI, which contains the CCL3L1 gene and was reported in previous studies, was also the most highly differentiated region between Tibetan and other populations. In addition, by jointly analyzing CNVs and SNPs, we found that a CNV region containing the gene CTDSPL was in almost perfect linkage disequilibrium with flanking SNPs in Tibetans but not in other populations except HapMap CHD. Furthermore, we found that the SNP taggability of CNVs in Chinese populations was much lower than that in European populations.
Our results suggest the necessity of a full characterization of CNVs in Chinese populations, and the CNV map we constructed serves as a useful resource in further evolutionary and medical studies.
A genome-wide association search for type 2 diabetes genes in African Americans.
Palmer, Nicholette D; McDonough, Caitrin W; Hicks, Pamela J; Roh, Bong H; Wing, Maria R; An, S Sandy; Hester, Jessica M; Cooke, Jessica N; Bostrom, Meredith A; Rudock, Megan E; Talbert, Matthew E; Lewis, Joshua P; Ferrara, Assiamira; Lu, Lingyi; Ziegler, Julie T; Sale, Michele M; Divers, Jasmin; Shriner, Daniel; Adeyemo, Adebowale; Rotimi, Charles N; Ng, Maggie C Y; Langefeld, Carl D; Freedman, Barry I; Bowden, Donald W; Voight, Benjamin F; Scott, Laura J; Steinthorsdottir, Valgerdur; Morris, Andrew P; Dina, Christian; Welch, Ryan P; Zeggini, Eleftheria; Huth, Cornelia; Aulchenko, Yurii S; Thorleifsson, Gudmar; McCulloch, Laura J; Ferreira, Teresa; Grallert, Harald; Amin, Najaf; Wu, Guanming; Willer, Cristen J; Raychaudhuri, Soumya; McCarroll, Steve A; Langenberg, Claudia; Hofmann, Oliver M; Dupuis, Josée; Qi, Lu; Segrè, Ayellet V; van Hoek, Mandy; Navarro, Pau; Ardlie, Kristin; Balkau, Beverley; Benediktsson, Rafn; Bennett, Amanda J; Blagieva, Roza; Boerwinkle, Eric; Bonnycastle, Lori L; Boström, Kristina Bengtsson; Bravenboer, Bert; Bumpstead, Suzannah; Burtt, Noël P; Charpentier, Guillaume; Chines, Peter S; Cornelis, Marilyn; Couper, David J; Crawford, Gabe; Doney, Alex S F; Elliott, Katherine S; Elliott, Amanda L; Erdos, Michael R; Fox, Caroline S; Franklin, Christopher S; Ganser, Martha; Gieger, Christian; Grarup, Niels; Green, Todd; Griffin, Simon; Groves, Christopher J; Guiducci, Candace; Hadjadj, Samy; Hassanali, Neelam; Herder, Christian; Isomaa, Bo; Jackson, Anne U; Johnson, Paul R V; Jørgensen, Torben; Kao, Wen H L; Klopp, Norman; Kong, Augustine; Kraft, Peter; Kuusisto, Johanna; Lauritzen, Torsten; Li, Man; Lieverse, Aloysius; Lindgren, Cecilia M; Lyssenko, Valeriya; Marre, Michel; Meitinger, Thomas; Midthjell, Kristian; Morken, Mario A; Narisu, Narisu; Nilsson, Peter; Owen, Katharine R; Payne, Felicity; Perry, John R B; Petersen, Ann-Kristin; Platou, Carl; Proença, Christine; Prokopenko, Inga; Rathmann, Wolfgang; Rayner, N William; Robertson, Neil R; 
Rocheleau, Ghislain; Roden, Michael; Sampson, Michael J; Saxena, Richa; Shields, Beverley M; Shrader, Peter; Sigurdsson, Gunnar; Sparsø, Thomas; Strassburger, Klaus; Stringham, Heather M; Sun, Qi; Swift, Amy J; Thorand, Barbara; Tichet, Jean; Tuomi, Tiinamaija; van Dam, Rob M; van Haeften, Timon W; van Herpt, Thijs; van Vliet-Ostaptchouk, Jana V; Walters, G Bragi; Weedon, Michael N; Wijmenga, Cisca; Witteman, Jacqueline; Bergman, Richard N; Cauchi, Stephane; Collins, Francis S; Gloyn, Anna L; Gyllensten, Ulf; Hansen, Torben; Hide, Winston A; Hitman, Graham A; Hofman, Albert; Hunter, David J; Hveem, Kristian; Laakso, Markku; Mohlke, Karen L; Morris, Andrew D; Palmer, Colin N A; Pramstaller, Peter P; Rudan, Igor; Sijbrands, Eric; Stein, Lincoln D; Tuomilehto, Jaakko; Uitterlinden, Andre; Walker, Mark; Wareham, Nicholas J; Watanabe, Richard M; Abecasis, Goncalo R; Boehm, Bernhard O; Campbell, Harry; Daly, Mark J; Hattersley, Andrew T; Hu, Frank B; Meigs, James B; Pankow, James S; Pedersen, Oluf; Wichmann, H-Erich; Barroso, Inês; Florez, Jose C; Frayling, Timothy M; Groop, Leif; Sladek, Rob; Thorsteinsdottir, Unnur; Wilson, James F; Illig, Thomas; Froguel, Philippe; van Duijn, Cornelia M; Stefansson, Kari; Altshuler, David; Boehnke, Michael; McCarthy, Mark I; Soranzo, Nicole; Wheeler, Eleanor; Glazer, Nicole L; Bouatia-Naji, Nabila; Mägi, Reedik; Randall, Joshua; Johnson, Toby; Elliott, Paul; Rybin, Denis; Henneman, Peter; Dehghan, Abbas; Hottenga, Jouke Jan; Song, Kijoung; Goel, Anuj; Egan, Josephine M; Lajunen, Taina; Doney, Alex; Kanoni, Stavroula; Cavalcanti-Proença, Christine; Kumari, Meena; Timpson, Nicholas J; Zabena, Carina; Ingelsson, Erik; An, Ping; O'Connell, Jeffrey; Luan, Jian'an; Elliott, Amanda; McCarroll, Steven A; Roccasecca, Rosa Maria; Pattou, François; Sethupathy, Praveen; Ariyurek, Yavuz; Barter, Philip; Beilby, John P; Ben-Shlomo, Yoav; Bergmann, Sven; Bochud, Murielle; Bonnefond, Amélie; Borch-Johnsen, Knut; Böttcher, Yvonne; Brunner, Eric; 
Bumpstead, Suzannah J; Chen, Yii-Der Ida; Chines, Peter; Clarke, Robert; Coin, Lachlan J M; Cooper, Matthew N; Crisponi, Laura; Day, Ian N M; de Geus, Eco J C; Delplanque, Jerome; Fedson, Annette C; Fischer-Rosinsky, Antje; Forouhi, Nita G; Frants, Rune; Franzosi, Maria Grazia; Galan, Pilar; Goodarzi, Mark O; Graessler, Jürgen; Grundy, Scott; Gwilliam, Rhian; Hallmans, Göran; Hammond, Naomi; Han, Xijing; Hartikainen, Anna-Liisa; Hayward, Caroline; Heath, Simon C; Hercberg, Serge; Hicks, Andrew A; Hillman, David R; Hingorani, Aroon D; Hui, Jennie; Hung, Joe; Jula, Antti; Kaakinen, Marika; Kaprio, Jaakko; Kesaniemi, Y Antero; Kivimaki, Mika; Knight, Beatrice; Koskinen, Seppo; Kovacs, Peter; Kyvik, Kirsten Ohm; Lathrop, G Mark; Lawlor, Debbie A; Le Bacquer, Olivier; Lecoeur, Cécile; Li, Yun; Mahley, Robert; Mangino, Massimo; Manning, Alisa K; Martínez-Larrad, María Teresa; McAteer, Jarred B; McPherson, Ruth; Meisinger, Christa; Melzer, David; Meyre, David; Mitchell, Braxton D; Mukherjee, Sutapa; Naitza, Silvia; Neville, Matthew J; Oostra, Ben A; Orrù, Marco; Pakyz, Ruth; Paolisso, Giuseppe; Pattaro, Cristian; Pearson, Daniel; Peden, John F; Pedersen, Nancy L; Perola, Markus; Pfeiffer, Andreas F H; Pichler, Irene; Polasek, Ozren; Posthuma, Danielle; Potter, Simon C; Pouta, Anneli; Province, Michael A; Psaty, Bruce M; Rayner, Nigel W; Rice, Kenneth; Ripatti, Samuli; Rivadeneira, Fernando; Rolandsson, Olov; Sandbaek, Annelli; Sandhu, Manjinder; Sanna, Serena; Sayer, Avan Aihie; Scheet, Paul; Seedorf, Udo; Sharp, Stephen J; Shields, Beverley; Sijbrands, Eric J G; Silveira, Angela; Simpson, Laila; Singleton, Andrew; Smith, Nicholas L; Sovio, Ulla; Swift, Amy; Syddall, Holly; Syvänen, Ann-Christine; Tanaka, Toshiko; Tönjes, Anke; Uitterlinden, André G; van Dijk, Ko Willems; Varma, Dhiraj; Visvikis-Siest, Sophie; Vitart, Veronique; Vogelzangs, Nicole; Waeber, Gérard; Wagner, Peter J; Walley, Andrew; Ward, Kim L; Watkins, Hugh; Wild, Sarah H; Willemsen, Gonneke; Witteman, 
Jaqueline C M; Yarnell, John W G; Zelenika, Diana; Zethelius, Björn; Zhai, Guangju; Zhao, Jing Hua; Zillikens, M Carola; Borecki, Ingrid B; Loos, Ruth J F; Meneton, Pierre; Magnusson, Patrik K E; Nathan, David M; Williams, Gordon H; Silander, Kaisa; Salomaa, Veikko; Smith, George Davey; Bornstein, Stefan R; Schwarz, Peter; Spranger, Joachim; Karpe, Fredrik; Shuldiner, Alan R; Cooper, Cyrus; Dedoussis, George V; Serrano-Ríos, Manuel; Lind, Lars; Palmer, Lyle J; Franks, Paul W; Ebrahim, Shah; Marmot, Michael; Kao, W H Linda; Pramstaller, Peter Paul; Wright, Alan F; Stumvoll, Michael; Hamsten, Anders; Buchanan, Thomas A; Valle, Timo T; Rotter, Jerome I; Siscovick, David S; Penninx, Brenda W J H; Boomsma, Dorret I; Deloukas, Panos; Spector, Timothy D; Ferrucci, Luigi; Cao, Antonio; Scuteri, Angelo; Schlessinger, David; Uda, Manuela; Ruokonen, Aimo; Jarvelin, Marjo-Riitta; Waterworth, Dawn M; Vollenweider, Peter; Peltonen, Leena; Mooser, Vincent; Sladek, Robert
2012-01-01
African Americans are disproportionately affected by type 2 diabetes (T2DM), yet few studies have examined T2DM using genome-wide association approaches in this population. The aim of this study was to identify genes associated with T2DM in the African American population. We performed a genome-wide association study (GWAS) using the Affymetrix 6.0 array in 965 African-American cases with T2DM and end-stage renal disease (T2DM-ESRD) and 1029 population-based controls. The most significant SNPs (n = 550 independent loci) were genotyped in a replication cohort, and 122 SNPs (n = 98 independent loci) were further tested by genotyping three additional validation cohorts, followed by meta-analysis in all five cohorts totaling 3,132 cases and 3,317 controls. Twelve SNPs had evidence of association in the GWAS (P<0.0071), were directionally consistent in the replication cohort and were associated with T2DM in subjects without nephropathy (P<0.05). Meta-analysis in all cases and controls revealed a single SNP reaching genome-wide significance (P < 2.5 × 10⁻⁸). SNP rs7560163 (P = 7.0 × 10⁻⁹, OR (95% CI) = 0.75 (0.67-0.84)) is located intergenically between RND3 and RBM43. Four additional loci (rs7542900, rs4659485, rs2722769 and rs7107217) were associated with T2DM (P<0.05) and reached more nominal levels of significance (P < 2.5 × 10⁻⁵) in the overall analysis and may represent novel loci that contribute to T2DM. We have identified novel T2DM-susceptibility variants in the African-American population. Notably, T2DM risk was associated with the major allele, which implies an interesting genetic architecture in this population. These results suggest that multiple loci underlie T2DM susceptibility in the African-American population and that these loci are distinct from those identified in other ethnic populations.
A Map of Copy Number Variations in Chinese Populations
Yang, Yajun; Kang, Longli; Zhang, Xin; Jin, Wenfei; Wu, Bailin; Jin, Li; Xu, Shuhua
2011-01-01
It has been shown that the human genome contains extensive copy number variations (CNVs). Investigating the medical and evolutionary impacts of CNVs requires knowledge of their locations, sizes and frequency distributions within and between populations. However, CNV studies of Chinese minorities, which harbor the majority of the genetic diversity of Chinese populations, have been underrepresented relative to efforts in other populations. Here we constructed, to our knowledge, the first CNV map in seven Chinese populations representing the major linguistic groups in China, with 1,440 CNV regions identified using the Affymetrix SNP 6.0 Array. Considerable differences in the distributions of CNV regions between populations and substantial population structure were observed. We showed that ∼35% of CNV regions identified in minority ethnic groups are not shared by the Han Chinese population, indicating that the contribution of the minorities to the genetic architecture of the Chinese population cannot be ignored. We further identified highly differentiated CNV regions between populations. For example, a common deletion in Dong and Zhuang (44.4% and 50%), which overlaps two keratin-associated protein genes contributing to the structure of hair fibers, was not observed in Han Chinese. Interestingly, the most differentiated CNV deletion between HapMap CEU and YRI, containing the CCL3L1 gene and reported in previous studies, was also the most highly differentiated region between Tibetan and other populations. In addition, by jointly analyzing CNVs and SNPs, we found that a CNV region containing the gene CTDSPL was in almost perfect linkage disequilibrium with flanking SNPs in Tibetans but not in other populations except HapMap CHD. Furthermore, we found that the SNP taggability of CNVs in Chinese populations was much lower than that in European populations.
Our results suggest the necessity of a full characterization of CNVs in Chinese populations, and the CNV map we constructed serves as a useful resource in further evolutionary and medical studies. PMID:22087296
Keaton, Jacob M; Gao, Chuan; Guan, Meijian; Hellwege, Jacklyn N; Palmer, Nicholette D; Pankow, James S; Fornage, Myriam; Wilson, James G; Correa, Adolfo; Rasmussen-Torvik, Laura J; Rotter, Jerome I; Chen, Yii-Der I; Taylor, Kent D; Rich, Stephen S; Wagenknecht, Lynne E; Freedman, Barry I; Ng, Maggie C Y; Bowden, Donald W
2018-04-24
Although type 2 diabetes (T2D) results from metabolic defects in insulin secretion and insulin sensitivity, most of the genetic risk loci identified to date relate to insulin secretion. We reported that T2D loci influencing insulin sensitivity may be identified through interactions with insulin secretion loci, thereby leading to T2D. Here, we hypothesize that joint testing of variant main effects and interaction effects with an insulin secretion locus increases power to identify genetic interactions leading to T2D. We tested this hypothesis with an intronic MTNR1B SNP, rs10830963, which is associated with acute insulin response to glucose, a dynamic measure of insulin secretion. rs10830963 was tested for interaction and joint (main + interaction) effects with genome-wide data in African Americans (2,452 cases and 3,772 controls) from five cohorts. Genome-wide genotype data (Affymetrix Human Genome 6.0 array) were imputed to a 1000 Genomes Project reference panel. T2D risk was modeled using logistic regression with rs10830963 dosage, age, sex, and principal component as predictors. Joint effects were captured using the Kraft two degrees of freedom test. Genome-wide significant (P < 5 × 10(-8)) interaction with MTNR1B and joint effects were detected for CMIP intronic SNP rs17197883 (P interaction = 1.43 × 10(-8); P joint = 4.70 × 10(-8)). CMIP variants have been nominally associated with T2D, fasting glucose, and adiponectin in individuals of East Asian ancestry, with high-density lipoprotein, and with waist-to-hip ratio adjusted for body mass index in Europeans. These data support the hypothesis that additional genetic factors contributing to T2D risk, including insulin sensitivity loci, can be identified through interactions with insulin secretion loci. © 2018 WILEY PERIODICALS, INC.
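As an illustration of a two-degrees-of-freedom joint test of a main effect and an interaction effect, the following Python sketch computes a Wald statistic from the two coefficient estimates and their covariance. It is a minimal sketch under stated assumptions: the function name and its inputs are hypothetical, the abstract does not state whether a Wald or a likelihood-ratio form was used, and no values from the study are reproduced.

```python
import math


def joint_2df_wald(beta_main, beta_int, var_main, var_int, cov):
    """Return (statistic, p_value) for a 2-df joint Wald test.

    beta_main, beta_int: estimated main and interaction coefficients
    (hypothetical inputs, e.g. taken from a fitted logistic regression).
    var_main, var_int, cov: entries of their 2x2 covariance matrix.
    """
    det = var_main * var_int - cov * cov
    if det <= 0:
        raise ValueError("covariance matrix must be positive definite")
    # Quadratic form beta' V^-1 beta with the 2x2 inverse written out.
    stat = (
        beta_main ** 2 * var_int
        - 2.0 * beta_main * beta_int * cov
        + beta_int ** 2 * var_main
    ) / det
    # For 2 degrees of freedom the chi-square tail probability is exp(-x/2).
    return stat, math.exp(-stat / 2.0)
```

A variant would be declared significant in such a sketch when the returned p-value falls below the chosen genome-wide threshold.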
Brown, Allan F; Yousef, Gad G; Chebrolu, Kranthi K; Byrd, Robert W; Everhart, Koyt W; Thomas, Aswathy; Reid, Robert W; Parkin, Isobel A P; Sharpe, Andrew G; Oliver, Rebekah; Guzman, Ivette; Jackson, Eric W
2014-09-01
A high-resolution genetic linkage map of B. oleracea was developed from a B. napus SNP array. The work will facilitate genetic and evolutionary studies in Brassicaceae. A broccoli population, VI-158 × BNC, consisting of 150 F2:3 families was used to create a saturated Brassica oleracea (diploid: CC) linkage map using a recently developed rapeseed (Brassica napus) (tetraploid: AACC) Illumina Infinium single nucleotide polymorphism (SNP) array. The map consisted of 547 non-redundant SNP markers spanning 948.1 cM across nine chromosomes with an average interval size of 1.7 cM. As the SNPs are anchored to the genomic reference sequence of the rapid cycling B. oleracea TO1000, we were able to estimate that the map provides 96 % coverage of the diploid genome. Carotenoid analysis of 2 years data identified 3 QTLs on two chromosomes that are associated with up to half of the phenotypic variation associated with the accumulation of total or individual compounds. By searching the genome sequences of the two related diploid species (B. oleracea and B. rapa), we further identified putative carotenoid candidate genes in the region of these QTLs. This is the first description of the use of a B. napus SNP array to rapidly construct high-density genetic linkage maps of one of the constituent diploid species. The unambiguous nature of these markers with regard to genomic sequences provides evidence to the nature of genes underlying the QTL, and demonstrates the value and impact this resource will have on Brassica research.
2011-01-01
Background Integration of genomic variation with phenotypic information is an effective approach for uncovering genotype-phenotype associations. This requires an accurate identification of the different types of variation in individual genomes. Results We report the integration of the whole genome sequence of a single Holstein Friesian bull with data from single nucleotide polymorphism (SNP) and comparative genomic hybridization (CGH) array technologies to determine a comprehensive spectrum of genomic variation. The performance of resequencing SNP detection was assessed by combining SNPs that were identified to be either in identity by descent (IBD) or in copy number variation (CNV) with results from SNP array genotyping. Coding insertions and deletions (indels) were found to be enriched for size in multiples of 3 and were located near the N- and C-termini of proteins. For larger indels, a combination of split-read and read-pair approaches proved to be complementary in finding different signatures. CNVs were identified on the basis of the depth of sequenced reads, and by using SNP and CGH arrays. Conclusions Our results provide high resolution mapping of diverse classes of genomic variation in an individual bovine genome and demonstrate that structural variation surpasses sequence variation as the main component of genomic variability. Better accuracy of SNP detection was achieved with little loss of sensitivity when algorithms that implemented mapping quality were used. IBD regions were found to be instrumental for calculating resequencing SNP accuracy, while SNP detection within CNVs tended to be less reliable. CNV discovery was affected dramatically by platform resolution and coverage biases. The combined data for this study showed that at a moderate level of sequencing coverage, an ensemble of platforms and tools can be applied together to maximize the accurate detection of sequence and structural variants. PMID:22082336
USDA-ARS's Scientific Manuscript database
Meeting the increasing market demands for pork products requires improvement of the feed efficiency of growing pigs. The use of Affymetrix Porcine Gene 1.0 ST array containing 19,211 genes in this study provides a comprehensive gene expression profile of skeletal muscle of finishing pigs in response...
KinSNP software for homozygosity mapping of disease genes using SNP microarrays
2010-01-01
Consanguineous families affected with a recessive genetic disease caused by homozygotisation of a mutation offer a unique advantage for positional cloning of rare diseases. Homozygosity mapping of patient genotypes is a powerful technique for the identification of the genomic locus harbouring the causing mutation. This strategy relies on the observation that in these patients a large region spanning the disease locus is also homozygous with high probability. The high marker density in single nucleotide polymorphism (SNP) arrays is extremely advantageous for homozygosity mapping. We present KinSNP, a user-friendly software tool for homozygosity mapping using SNP arrays. The software searches for stretches of SNPs which are homozygous to the same allele in all ascertained sick individuals. User-specified parameters control the number of allowed genotyping 'errors' within homozygous blocks. Candidate disease regions are then reported in a detailed, coloured Excel file, along with genotypes of family members and healthy controls. An interactive genome browser has been included which shows homozygous blocks, individual genotypes, genes and further annotations along the chromosomes, with zooming and scrolling capabilities. The software has been used to identify the location of a mutated gene causing insensitivity to pain in a large Bedouin family. KinSNP is freely available from http://bioinfo.bgu.ac.il/bsu/software/kinSNP. PMID:20846928
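The search described in this abstract (stretches of SNPs homozygous for the same allele in all affected individuals, with a user-set number of tolerated genotyping errors) can be sketched in Python as follows. This is not KinSNP's code: the function names, the two-letter genotype encoding, and the greedy extension rule are assumptions made for illustration.

```python
def is_shared_homozygous(calls):
    """True if every affected individual has the same homozygous call."""
    first = calls[0]
    return first[0] == first[1] and all(c == first for c in calls)


def homozygous_blocks(affected, min_snps=3, max_errors=0):
    """Return (start, end) index pairs of shared homozygous blocks.

    affected: dict mapping sample name to a list of genotype strings
    such as "AA", "AB", "BB", ordered along the chromosome.
    max_errors: number of non-conforming SNPs tolerated inside a block.
    """
    samples = list(affected.values())
    n = len(samples[0])
    flags = [is_shared_homozygous([s[i] for s in samples]) for i in range(n)]
    blocks = []
    i = 0
    while i < n:
        if not flags[i]:
            i += 1
            continue
        start = end = i
        errors = 0
        j = i + 1
        while j < n:
            if flags[j]:
                end = j  # extend the block to the last conforming SNP
            else:
                errors += 1
                if errors > max_errors:
                    break
            j += 1
        if end - start + 1 >= min_snps:
            blocks.append((start, end))
        i = end + 1
    return blocks
```

In this sketch a block always starts and ends on a conforming SNP, so tolerated errors only count when they lie between two conforming markers.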
[Dandy-walker syndrome and microdeletions on chromosome 7].
Liao, Can; Fu, Fang; Li, Ru; Pan, Min; Yang, Xin; Yi, Cui-xing; Li, Jian; Li, Dong-zhi
2012-02-01
To investigate genetic etiology of Dandy-Walker syndrome with array-based comparative genomic hybridization (array-CGH). Eight fetuses with Dandy-Walker malformations but normal karyotypes by conventional cytogenetic technique were selected. DNA samples were extracted and hybridized with Affymetrix cytogenetic 2.7 M arrays by following the manufacturer's standard protocol. The data were analyzed by special software packages. By using array-CGH technique, common deletions and duplication on chromosome 7p21.3 were identified in three cases, within which were central nervous system disease associated genes NDUFA4 and PHF14. Copy number variations (CNVs) of chromosome 7p21.3 region are associated with Dandy-Walker malformations which may be due to haploinsufficiency or overexpression of NDUFA4 and PHF14 genes.
2012-01-01
Background For decades the tobacco plant has served as a model organism in plant biology to answer fundamental biological questions in the areas of plant development, physiology, and genetics. Due to the lack of sufficient coverage of genomic sequences, however, none of the expressed sequence tag (EST)-based chips developed to date cover gene expression from the whole genome. The availability of Tobacco Genome Initiative (TGI) sequences provides a useful resource to build a whole genome exon array, even if the assembled sequences are highly fragmented. Here, the design of a Tobacco Exon Array is reported and an application to improve the understanding of genes regulated by cadmium (Cd) in tobacco is described. Results From the analysis and annotation of the 1,271,256 Nicotiana tabacum fasta and quality files from methyl filtered genomic survey sequences (GSS) obtained from the TGI and ~56,000 ESTs available in public databases, an exon array with 272,342 probesets was designed (four probes per exon) and tested on two selected tobacco varieties. Two tobacco varieties out of 45 accumulating low and high cadmium in leaf were identified based on the GGE biplot analysis, which is analysis of the genotype main effect (G) plus analysis of the genotype by environment interaction (GE) of eight field trials (four fields over two years) showing reproducibility across the trials. The selected varieties were grown under greenhouse conditions in two different soils and subjected to exon array analyses using root and leaf tissues to understand the genetic make-up of the Cd accumulation. Conclusions An Affymetrix Exon Array was developed to cover a large (~90%) proportion of the tobacco gene space. The Tobacco Exon Array will be available for research use through Affymetrix array catalogue. As a proof of the exon array usability, we have demonstrated that the Tobacco Exon Array is a valuable tool for studying Cd accumulation in tobacco leaves. 
Data from field and greenhouse experiments supported by gene expression studies strongly suggested that the difference in leaf Cd accumulation between the two specific tobacco cultivars is dependent solely on genetic factors and genetic variability rather than on the environment. PMID:23190529
Wang, Yao; Cui, Yazhou; Zhou, Xiaoyan; Han, Jinxiang
2015-01-01
Objective Osteogenesis imperfecta (OI) is a rare inherited skeletal disease, characterized by bone fragility and low bone density. The mutations in this disorder have been widely reported to be on various exonic hotspots of the candidate genes, including COL1A1, COL1A2, CRTAP, LEPRE1, and FKBP10, thus creating a great demand for precise genetic tests. However, large genome sizes make the process daunting and the analyses inefficient and expensive. Therefore, we aimed to develop a fast, accurate, efficient, and cheaper sequencing platform for OI diagnosis; to this end, use of an advanced array-based technique was proposed. Method A CustomSeq Affymetrix Resequencing Array was established for high-throughput sequencing of five genes simultaneously. Genomic DNA extraction from 13 OI patients and 85 normal controls and amplification using long-range PCR (LR-PCR) were followed by DNA fragmentation and chip hybridization, according to standard Affymetrix protocols. Hybridization signals were determined using GeneChip Sequence Analysis Software (GSEQ). To examine feasibility, the outcome from the new resequencing approach was validated by the conventional capillary sequencing method. Result Overall call rates using the resequencing array were 96–98%, and the agreement between microarray and capillary sequencing was 99.99%. Eleven of the 13 OI patients with pathogenic mutations were successfully detected by the chip analysis without adjustment, and one mutation could also be identified using manual visual inspection. Conclusion A high-throughput resequencing array was developed that detects the disease-associated mutations in OI, providing a potential tool to facilitate large-scale genetic screening for OI patients. Through this method, a novel mutation was also found. PMID:25742658
Koning-Boucoiran, Carole F S; Esselink, G Danny; Vukosavljev, Mirjana; van 't Westende, Wendy P C; Gitonga, Virginia W; Krens, Frans A; Voorrips, Roeland E; van de Weg, W Eric; Schulz, Dietmar; Debener, Thomas; Maliepaard, Chris; Arens, Paul; Smulders, Marinus J M
2015-01-01
In order to develop a versatile and large SNP array for rose, we set out to mine ESTs from diverse sets of rose germplasm. For this, RNA-Seq libraries containing about 700 million reads were generated from tetraploid cut and garden roses using Illumina paired-end sequencing, and from diploid Rosa multiflora using 454 sequencing. Separate de novo assemblies were performed in order to identify single nucleotide polymorphisms (SNPs) within and between rose varieties. SNPs among tetraploid roses were selected for constructing a genotyping array that can be employed for genetic mapping and marker-trait association discovery in breeding programs based on tetraploid germplasm, both from cut roses and from garden roses. In total, 68,893 SNPs were included on the WagRhSNP Axiom array. Next, an orthology-guided assembly was performed for the construction of a non-redundant rose transcriptome database. A total of 21,740 transcripts had significant hits with orthologous genes in the strawberry (Fragaria vesca L.) genome. Of these, 13,390 appeared to contain the full-length coding regions. This newly established transcriptome resource adds considerably to the currently available sequence resources for the Rosaceae family in general and the genus Rosa in particular.
Detection of selective sweeps in cattle using genome-wide SNP data
2013-01-01
Background The domestication and subsequent selection by humans to create breeds and biological types of cattle undoubtedly altered the patterning of variation within their genomes. Strong selection to fix advantageous large-effect mutations underlying domesticability, breed characteristics or productivity created selective sweeps in which variation was lost in the chromosomal region flanking the selected allele. Selective sweeps have now been identified in the genomes of many animal species including humans, dogs, horses, and chickens. Here, we attempt to identify and characterise regions of the bovine genome that have been subjected to selective sweeps. Results Two datasets were used for the discovery and validation of selective sweeps via the fixation of alleles at a series of contiguous SNP loci. BovineSNP50 data were used to identify 28 putative sweep regions among 14 diverse cattle breeds. Affymetrix BOS 1 prescreening assay data for five breeds were used to identify 85 regions and validate 5 regions identified using the BovineSNP50 data. Many genes are located within these regions and the lack of sequence data for the analysed breeds precludes the nomination of selected genes or variants and limits the prediction of the selected phenotypes. However, phenotypes that we predict to have historically been under strong selection include horned-polled, coat colour, stature, ear morphology, and behaviour. Conclusions The bias towards common SNPs in the design of the BovineSNP50 assay led to the identification of recent selective sweeps associated with breed formation and common to only a small number of breeds rather than ancient events associated with domestication which could potentially be common to all European taurines. 
The limited SNP density, or marker resolution, of the BovineSNP50 assay significantly impacted the rate of false discovery of selective sweeps; however, we found sweeps in common between breeds which were confirmed using an ultra-high-density assay scored in a small number of animals from a subset of the breeds. No sweep regions were shared between indicine and taurine breeds, reflecting their divergent selection histories and the very different environmental habitats to which these sub-species have adapted. PMID:23758707
Arenillas, Leonor; Mallo, Mar; Ramos, Fernando; Guinta, Kathryn; Barragán, Eva; Lumbreras, Eva; Larráyoz, María-José; De Paz, Raquel; Tormo, Mar; Abáigar, María; Pedro, Carme; Cervera, José; Such, Esperanza; José Calasanz, María; Díez-Campelo, María; Sanz, Guillermo F; Hernández, Jesús María; Luño, Elisa; Saumell, Sílvia; Maciejewski, Jaroslaw; Florensa, Lourdes; Solé, Francesc
2013-12-01
Cytogenetic aberrations identified by metaphase cytogenetics (MC) have diagnostic, prognostic, and therapeutic implications in myelodysplastic syndromes (MDS). However, in some MDS patients the MC study is unsuccessful. Single nucleotide polymorphism array (SNP-A) based karyotyping could be helpful in these cases. We performed SNP-A in 62 samples from bone marrow or peripheral blood of primary MDS with an unsuccessful MC study. SNP-A analysis enabled the detection of aberrations in 31 (50%) patients. We used the copy number alteration information to apply the International Prognostic Scoring System (IPSS) and we observed differences in survival between the low/intermediate-1 and intermediate-2/high risk patients. We also saw differences in survival between the very low/low/intermediate and the high/very high patients when we applied the revised IPSS (IPSS-R). In conclusion, SNP-A can be used successfully in PB samples, and the identification of CNA by SNP-A improves the diagnostic and prognostic evaluation of this group of MDS patients. Copyright © 2013 Wiley Periodicals, Inc.
Fine-scaled human genetic structure revealed by SNP microarrays.
Xing, Jinchuan; Watkins, W Scott; Witherspoon, David J; Zhang, Yuhua; Guthery, Stephen L; Thara, Rangaswamy; Mowry, Bryan J; Bulayeva, Kazima; Weiss, Robert B; Jorde, Lynn B
2009-05-01
We report an analysis of more than 240,000 loci genotyped using the Affymetrix SNP microarray in 554 individuals from 27 worldwide populations in Africa, Asia, and Europe. To provide a more extensive and complete sampling of human genetic variation, we have included caste and tribal samples from two states in South India, Daghestanis from eastern Europe, and the Iban from Malaysia. Consistent with observations made by Charles Darwin, our results highlight shared variation among human populations and demonstrate that much genetic variation is geographically continuous. At the same time, principal components analyses reveal discernible genetic differentiation among almost all identified populations in our sample, and in most cases, individuals can be clearly assigned to defined populations on the basis of SNP genotypes. All individuals are accurately classified into continental groups using a model-based clustering algorithm, but between closely related populations, genetic and self-classifications conflict for some individuals. The 250K data permitted high-level resolution of genetic variation among Indian caste and tribal populations and between highland and lowland Daghestani populations. In particular, upper-caste individuals from Tamil Nadu and Andhra Pradesh form one defined group, lower-caste individuals from these two states form another, and the tribal Irula samples form a third. Our results emphasize the correlation of genetic and geographic distances and highlight other elements, including social factors that have contributed to population structure.
USDA-ARS's Scientific Manuscript database
RNA expression analysis was performed on the corpus luteum tissue at five time points after prostaglandin F2 alpha treatment of midcycle cows using an Affymetrix Bovine Gene v1 Array. The normalized linear microarray data was uploaded to the NCBI GEO repository (GSE94069). Subsequent statistical ana...
R classes and methods for SNP array data.
Scharpf, Robert B; Ruczinski, Ingo
2010-01-01
The Bioconductor project is an "open source and open development software project for the analysis and comprehension of genomic data" (1), primarily based on the R programming language. Infrastructure packages, such as Biobase, are maintained by Bioconductor core developers and serve several key roles to the broader community of Bioconductor software developers and users. In particular, Biobase introduces an S4 class, the eSet, for high-dimensional assay data. Encapsulating the assay data as well as meta-data on the samples, features, and experiment in the eSet class definition ensures propagation of the relevant sample and feature meta-data throughout an analysis. Extending the eSet class promotes code reuse through inheritance as well as interoperability with other R packages and is less error-prone. Recently proposed class definitions for high-throughput SNP arrays extend the eSet class. This chapter highlights the advantages of adopting and extending Biobase class definitions through a working example of one implementation of classes for the analysis of high-throughput SNP arrays.
KinSNP software for homozygosity mapping of disease genes using SNP microarrays.
Amir, El-Ad David; Bartal, Ofer; Morad, Efrat; Nagar, Tal; Sheynin, Jony; Parvari, Ruti; Chalifa-Caspi, Vered
2010-08-01
Consanguineous families affected with a recessive genetic disease caused by homozygotisation of a mutation offer a unique advantage for positional cloning of rare diseases. Homozygosity mapping of patient genotypes is a powerful technique for the identification of the genomic locus harbouring the causing mutation. This strategy relies on the observation that in these patients a large region spanning the disease locus is also homozygous with high probability. The high marker density in single nucleotide polymorphism (SNP) arrays is extremely advantageous for homozygosity mapping. We present KinSNP, a user-friendly software tool for homozygosity mapping using SNP arrays. The software searches for stretches of SNPs which are homozygous to the same allele in all ascertained sick individuals. User-specified parameters control the number of allowed genotyping 'errors' within homozygous blocks. Candidate disease regions are then reported in a detailed, coloured Excel file, along with genotypes of family members and healthy controls. An interactive genome browser has been included which shows homozygous blocks, individual genotypes, genes and further annotations along the chromosomes, with zooming and scrolling capabilities. The software has been used to identify the location of a mutated gene causing insensitivity to pain in a large Bedouin family. KinSNP is freely available from.
Evaluation of Bovine High-Density SNP Genotyping Array in Indigenous Dairy Cattle Breeds.
Dash, S; Singh, A; Bhatia, A K; Jayakumar, S; Sharma, A; Singh, S; Ganguly, I; Dixit, S P
2018-04-03
In total, 52 samples of Sahiwal (19), Tharparkar (17), and Gir (16) cattle were genotyped using the BovineHD SNP chip to analyze minor allele frequency (MAF), genetic diversity, and linkage disequilibrium among these cattle. The SNPs common to the BovineHD and 54K SNP chips were also extracted and evaluated for their performance. Only 40%-50% of the SNPs on these arrays were found to be informative for genetic analysis in these cattle breeds. The overall mean MAF for SNPs of the BovineHD SNP chip was 0.248 ± 0.006, 0.241 ± 0.007, and 0.242 ± 0.009 in Sahiwal, Tharparkar and Gir, respectively, while that for the 54K SNPs was on the lower side. The average Reynolds' genetic distance between breeds ranged from 0.042 to 0.055 based on the BovineHD BeadChip, and from 0.052 to 0.084 based on the 54K SNP chip. The estimates of genetic diversity based on the HD and 54K chips were almost the same; hence, the low-density chip seems good enough to decipher the genetic diversity of these cattle breeds. Linkage disequilibrium started decaying (r^2 < 0.2) at 140 kb inter-marker distance; hence, a 20K low-density customized SNP array derived from the HD chip could be designed for genomic selection in these cattle, or else the 54K BeadChip as such will be useful.
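The two per-SNP quantities named in this abstract, minor allele frequency and pairwise r^2, can be computed as in the Python sketch below. It is a sketch under assumptions rather than the authors' pipeline: genotypes are assumed to be coded as counts of one allele (0, 1 or 2) with None for missing calls, the function names are hypothetical, and r^2 is taken as the squared Pearson correlation of genotype counts, a common approximation for unphased data.

```python
def minor_allele_frequency(genotypes):
    """MAF from allele counts (0, 1, 2); None marks a missing call."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return None
    p = sum(called) / (2 * len(called))  # frequency of the counted allele
    return min(p, 1 - p)


def genotype_r2(x, y):
    """Squared Pearson correlation of allele counts at two SNPs."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    if not pairs:
        return None
    n = len(pairs)
    mx = sum(a for a, _ in pairs) / n
    my = sum(b for _, b in pairs) / n
    sxy = sum((a - mx) * (b - my) for a, b in pairs)
    sxx = sum((a - mx) ** 2 for a, _ in pairs)
    syy = sum((b - my) ** 2 for _, b in pairs)
    if sxx == 0 or syy == 0:
        return None  # a monomorphic SNP carries no LD information
    return sxy ** 2 / (sxx * syy)
```

A SNP with a MAF of zero in a breed is monomorphic there and would not count as informative under any threshold.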
Xu, Li-Xin; Holland, Heidrun; Kirsten, Holger; Ahnert, Peter; Krupp, Wolfgang; Bauer, Manfred; Schober, Ralf; Mueller, Wolf; Fritzsch, Dominik; Meixensberger, Jürgen; Koschny, Ronald
2015-04-01
According to the World Health Organization gangliogliomas are classified as well-differentiated and slowly growing neuroepithelial tumors, composed of neoplastic mature ganglion and glial cells. It is the most frequent tumor entity observed in patients with long-term epilepsy. Comprehensive cytogenetic and molecular cytogenetic data including high-resolution genomic profiling (single nucleotide polymorphism (SNP)-array) of gangliogliomas are scarce but necessary for a better oncological understanding of this tumor entity. For a detailed characterization at the single cell and cell population levels, we analyzed genomic alterations of three gangliogliomas using trypsin-Giemsa banding (GTG-banding) and by spectral karyotyping (SKY) in combination with SNP-array and gene expression array experiments. By GTG and SKY, we could confirm frequently detected chromosomal aberrations (losses within chromosomes 10, 13 and 22; gains within chromosomes 5, 7, 8 and 12), and identify so far unknown genetic aberrations like the unbalanced non-reciprocal translocation t(1;18)(q21;q21). Interestingly, we report on the second so far detected ganglioglioma with ring chromosome 1. Analyses of SNP-array data from two of the tumors and respective germline DNA (peripheral blood) identified few small gains and losses and a number of copy-neutral regions with loss of heterozygosity (LOH) in germline and in tumor tissue. In comparison to germline DNA, tumor tissues did not show substantial regions with significant loss or gain or with newly developed LOH. Gene expression analyses of tumor-specific genes revealed similarities in the profile of the analyzed samples regarding different relevant pathways. Taken together, we describe overlapping but also distinct and novel genetic aberrations of three gangliogliomas. © 2014 Japanese Society of Neuropathology.
Bourret, Vincent; Kent, Matthew P; Primmer, Craig R; Vasemägi, Anti; Karlsson, Sten; Hindar, Kjetil; McGinnity, Philip; Verspoor, Eric; Bernatchez, Louis; Lien, Sigbjørn
2013-02-01
Atlantic salmon (Salmo salar) is one of the most extensively studied fish species in the world due to its significance in aquaculture, fisheries and ongoing conservation efforts to protect declining populations. Yet, limited genomic resources have hampered our understanding of genetic architecture in the species and the genetic basis of adaptation to the wide range of natural and artificial environments it occupies. In this study, we describe the development of a medium-density Atlantic salmon single nucleotide polymorphism (SNP) array based on expressed sequence tags (ESTs) and genomic sequencing. The array was used in the most extensive assessment of population genetic structure performed to date in this species. A total of 6176 informative SNPs were successfully genotyped in 38 anadromous and freshwater wild populations distributed across the species' natural range. Principal component analysis clearly differentiated European and North American populations, and within Europe, three major regional genetic groups were identified for the first time in a single analysis. We assessed the potential for the array to disentangle neutral and putative adaptive divergence of SNP allele frequencies across populations and among regional groups. In Europe, secondary contact zones were identified between major clusters where endogenous and exogenous barriers could be associated, rendering the interpretation of environmental influence on potentially adaptive divergence equivocal. A small number of markers highly divergent in allele frequencies (outliers) were observed between (multiple) freshwater and anadromous populations, between northern and southern latitudes, and when comparing Baltic populations to all others. We also discuss the potential future applications of the SNP array for conservation, management and aquaculture. © 2012 Blackwell Publishing Ltd.
Schulz, Vincent; Chen, Min; Tuck, David
2010-01-01
Background Genotyping platforms such as single nucleotide polymorphism (SNP) arrays are powerful tools to study genomic aberrations in cancer samples. Allele specific information from SNP arrays provides valuable information for interpreting copy number variation (CNV) and allelic imbalance including loss-of-heterozygosity (LOH) beyond that obtained from the total DNA signal available from array comparative genomic hybridization (aCGH) platforms. Several algorithms based on hidden Markov models (HMMs) have been designed to detect copy number changes and copy-neutral LOH making use of the allele information on SNP arrays. However heterogeneity in clinical samples, due to stromal contamination and somatic alterations, complicates analysis and interpretation of these data. Methods We have developed MixHMM, a novel hidden Markov model using hidden states based on chromosomal structural aberrations. MixHMM allows CNV detection for copy numbers up to 7 and allows more complete and accurate description of other forms of allelic imbalance, such as increased copy number LOH or imbalanced amplifications. MixHMM also incorporates a novel sample mixing model that allows detection of tumor CNV events in heterogeneous tumor samples, where cancer cells are mixed with a proportion of stromal cells. Conclusions We validate MixHMM and demonstrate its advantages with simulated samples, clinical tumor samples and a dilution series of mixed samples. We have shown that the CNVs of cancer cells in a tumor sample contaminated with up to 80% of stromal cells can be detected accurately using Illumina BeadChip and MixHMM. Availability The MixHMM is available as a Python package provided with some other useful tools at http://genecube.med.yale.edu:8080/MixHMM. PMID:20532221
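To make the effect of stromal contamination concrete, the Python sketch below computes the expected B-allele frequency and log R ratio at a germline-heterozygous SNP when tumor cells are mixed with normal cells. It uses a generic linear mixing formula and is not MixHMM's actual parameterization: the function name, the arguments, and the assumption that stromal cells carry one A and one B copy are illustrative only.

```python
import math


def expected_signals(n_a, n_b, stromal_fraction):
    """Return (BAF, LRR) for a mixed sample at a heterozygous SNP.

    n_a, n_b: copies of the A and B alleles in the tumor cells.
    stromal_fraction: proportion of normal (diploid, AB) cells, 0 to 1.
    """
    if not 0.0 <= stromal_fraction <= 1.0:
        raise ValueError("stromal_fraction must lie between 0 and 1")
    tumor = 1.0 - stromal_fraction
    # Average copies per cell, weighting tumor and stromal genomes.
    total = tumor * (n_a + n_b) + stromal_fraction * 2.0
    b_copies = tumor * n_b + stromal_fraction * 1.0
    if total == 0:
        raise ValueError("no DNA at this locus (pure homozygous deletion)")
    baf = b_copies / total
    lrr = math.log2(total / 2.0)  # signal relative to a diploid genome
    return baf, lrr
```

Under this sketch, copy-neutral LOH in a sample with 80% stromal cells shifts the expected B-allele frequency only from 0.5 to about 0.6 while leaving the log R ratio at zero, which illustrates why heavily contaminated samples are hard to call.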
DeScipio, Cheryl; Morrissette, Jennifer J.D.; Conlin, Laura K.; Clark, Dinah; Kaur, Maninder; Coplan, James; Riethman, Harold; Spinner, Nancy B.; Krantz, Ian D.
2009-01-01
Two brothers, with dissimilar clinical features, were each found to have different abnormalities of chromosome 20 by subtelomere fluorescence in situ hybridization (FISH). The proband had deletion of 20p subtelomere and duplication of 20q subtelomere, while his brother was found to have a duplication of 20p subtelomere and deletion of 20q subtelomere. Parental cytogenetic studies were initially thought to be normal, both by G-banding and by subtelomere FISH analysis. Since chromosome 20 is a metacentric chromosome and an inversion was suspected, we used anchored FISH to assist in identifying a possible inversion. This approach employed concomitant hybridization of a FISH probe to the short (p) arm of chromosome 20 with the 20q subtelomere probe. We identified a cytogenetically non-visible, mosaic pericentric inversion of one of the maternal chromosome 20 homologues, providing a mechanistic explanation for the chromosomal abnormalities present in these brothers. Array comparative genomic hybridization (CGH) with both a custom-made BAC and cosmid-based subtelomere specific array (TEL array) and a commercially-available SNP-based array confirmed and further characterized these rearrangements, identifying this as the largest pericentric inversion of chromosome 20 described to date. TEL array data indicate that the 20p breakpoint is defined by BAC RP11-978M13, ~900 kb from the pter; SNP array data reveal this breakpoint to occur within BAC RP11-978M13. The 20q breakpoint is defined by BAC RP11-93B14, ~1.7 Mb from the qter, by TEL array; SNP array data refine this breakpoint to within a gap between BACs on the TEL array (i.e. between RP11-93B14 and proximal BAC RP11-765G16). PMID:20101690
Descipio, Cheryl; Morrissette, Jennifer D; Conlin, Laura K; Clark, Dinah; Kaur, Maninder; Coplan, James; Riethman, Harold; Spinner, Nancy B; Krantz, Ian D
2010-02-01
Two brothers, with dissimilar clinical features, were each found to have different abnormalities of chromosome 20 by subtelomere fluorescence in situ hybridization (FISH). The proband had deletion of 20p subtelomere and duplication of 20q subtelomere, while his brother was found to have a duplication of 20p subtelomere and deletion of 20q subtelomere. Parental cytogenetic studies were initially thought to be normal, both by G-banding and by subtelomere FISH analysis. Since chromosome 20 is a metacentric chromosome and an inversion was suspected, we used anchored FISH to assist in identifying a possible inversion. This approach employed concomitant hybridization of a FISH probe to the short (p) arm of chromosome 20 with the 20q subtelomere probe. We identified a cytogenetically non-visible, mosaic pericentric inversion of one of the maternal chromosome 20 homologs, providing a mechanistic explanation for the chromosomal abnormalities present in these brothers. Array comparative genomic hybridization (CGH) with both a custom-made BAC and cosmid-based subtelomere specific array (TEL array) and a commercially available SNP-based array confirmed and further characterized these rearrangements, identifying this as the largest pericentric inversion of chromosome 20 described to date. TEL array data indicate that the 20p breakpoint is defined by BAC RP11-978M13, approximately 900 kb from the pter; SNP array data reveal this breakpoint to occur within BAC RP11-978M13. The 20q breakpoint is defined by BAC RP11-93B14, approximately 1.7 Mb from the qter, by TEL array; SNP array data refine this breakpoint to within a gap between BACs on the TEL array (i.e., between RP11-93B14 and proximal BAC RP11-765G16). Copyright 2010 Wiley-Liss, Inc.
Pasaniuc, Bogdan; Zaitlen, Noah; Lettre, Guillaume; Chen, Gary K; Tandon, Arti; Kao, W H Linda; Ruczinski, Ingo; Fornage, Myriam; Siscovick, David S; Zhu, Xiaofeng; Larkin, Emma; Lange, Leslie A; Cupples, L Adrienne; Yang, Qiong; Akylbekova, Ermeg L; Musani, Solomon K; Divers, Jasmin; Mychaleckyj, Joe; Li, Mingyao; Papanicolaou, George J; Millikan, Robert C; Ambrosone, Christine B; John, Esther M; Bernstein, Leslie; Zheng, Wei; Hu, Jennifer J; Ziegler, Regina G; Nyante, Sarah J; Bandera, Elisa V; Ingles, Sue A; Press, Michael F; Chanock, Stephen J; Deming, Sandra L; Rodriguez-Gil, Jorge L; Palmer, Cameron D; Buxbaum, Sarah; Ekunwe, Lynette; Hirschhorn, Joel N; Henderson, Brian E; Myers, Simon; Haiman, Christopher A; Reich, David; Patterson, Nick; Wilson, James G; Price, Alkes L
2011-04-01
While genome-wide association studies (GWAS) have primarily examined populations of European ancestry, more recent studies often involve additional populations, including admixed populations such as African Americans and Latinos. In admixed populations, linkage disequilibrium (LD) exists both at a fine scale in ancestral populations and at a coarse scale (admixture-LD) due to chromosomal segments of distinct ancestry. Disease association statistics in admixed populations have previously considered SNP association (LD mapping) or admixture association (mapping by admixture-LD), but not both. Here, we introduce a new statistical framework for combining SNP and admixture association in case-control studies, as well as methods for local ancestry-aware imputation. We illustrate the gain in statistical power achieved by these methods by analyzing data of 6,209 unrelated African Americans from the CARe project genotyped on the Affymetrix 6.0 chip, in conjunction with both simulated and real phenotypes, as well as by analyzing the FGFR2 locus using breast cancer GWAS data from 5,761 African-American women. We show that, at typed SNPs, our method yields an 8% increase in statistical power for finding disease risk loci compared to the power achieved by standard methods in case-control studies. At imputed SNPs, we observe an 11% increase in statistical power for mapping disease loci when our local ancestry-aware imputation framework and the new scoring statistic are jointly employed. Finally, we show that our method increases statistical power in regions harboring the causal SNP in the case when the causal SNP is untyped and cannot be imputed. Our methods and our publicly available software are broadly applicable to GWAS in admixed populations.
Analysis of genetic diversity using SNP markers in oat
USDA-ARS's Scientific Manuscript database
A large-scale single nucleotide polymorphism (SNP) discovery was carried out in cultivated oat using Roche 454 sequencing methods. DNA sequences were generated from cDNAs originating from a panel of 20 diverse oat cultivars, and from Diversity Array Technology (DArT) genomic complexity reductions fr...
Liu, Xiao-Gang; Tan, Li-Jun; Lei, Shu-Feng; Liu, Yong-Jun; Shen, Hui; Wang, Liang; Yan, Han; Guo, Yan-Fang; Xiong, Dong-Hai; Chen, Xiang-Ding; Pan, Feng; Yang, Tie-Lin; Zhang, Yin-Ping; Guo, Yan; Tang, Nelson L; Zhu, Xue-Zhen; Deng, Hong-Yi; Levy, Shawn; Recker, Robert R; Papasian, Christopher J; Deng, Hong-Wen
2009-03-01
Low lean body mass (LBM) is related to a series of health problems, such as osteoporotic fracture and sarcopenia. Here we report a genome-wide association (GWA) study on LBM variation, by using Affymetrix 500K single-nucleotide polymorphism (SNP) arrays. In the GWA scan, we tested 379,319 eligible SNPs in 1,000 unrelated US whites and found that two SNPs, rs16892496 (p = 7.55 × 10^-8) and rs7832552 (p = 7.58 × 10^-8), within the thyrotropin-releasing hormone receptor (TRHR) gene were significantly associated with LBM. Subjects carrying unfavorable genotypes at rs16892496 and rs7832552 had, on average, 2.70 and 2.55 kg lower LBM, respectively, compared to those with alternative genotypes. We replicated the significant associations in three independent samples: (1) 1488 unrelated US whites, (2) 2955 Chinese unrelated subjects, and (3) 593 nuclear families comprising 1972 US whites. Meta-analyses of the GWA scan and the replication studies yielded p values of 5.53 × 10^-9 for rs16892496 and 3.88 × 10^-10 for rs7832552. In addition, we found significant interactions between rs16892496 and polymorphisms of several other genes involved in the hypothalamic-pituitary-thyroid and the growth hormone-insulin-like growth factor-I axes. Results of this study, together with the functional relevance of TRHR in muscle metabolism, support the TRHR gene as an important gene for LBM variation.
Hong, Jung Min; Kim, Tae-Ho; Kim, Hyun-Ju; Park, Eui-Kyun
2010-01-01
Multiple factors have been implicated in the development of osteonecrosis of the femoral head (ONFH). In particular, non-traumatic ONFH is directly or indirectly related to injury of the vascular supply to the femoral head. Thus, hypoxia in the femoral head caused by impaired blood flow may be an important risk factor for ONFH. In this study, we investigated whether genetic variations of angiogenesis- and hypoxia-related genes contribute to an increased risk for the development of ONFH. Candidate genes were selected based on known hypoxia and angiogenesis pathways. An association study was performed using an Affymetrix Targeted Genotyping 3K Chip array with 460 ONFH patients and 300 control subjects. We showed that single nucleotide polymorphisms (SNPs) in the genes TF, VEGFC, IGFBP3, and ACE were associated with an increased risk of ONFH. On the other hand, SNPs in the KDR and NRP1 genes were associated with protection against ONFH. The most important finding was that one SNP (rs2453839) in the IGFBP3 gene was significantly associated with a higher risk of ONFH (P = 0.0061, OR 7.74). In subgroup analysis, most candidate gene variations that were associated with ONFH occurred in the idiopathic subgroup. Among other SNPs, ACE SNPs were associated with steroid-induced ONFH (P = 0.0018-0.0037, OR > 3). Collectively, our findings suggest that genetic variations in angiogenesis- and hypoxia-related genes may help to identify susceptibility factors for the development of ONFH in the Korean population. PMID:20215856
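The odds ratios quoted in the abstract above come from the authors' own analysis, which is not described in detail here. As a generic illustration of how an odds ratio and an approximate 95% confidence interval are obtained from a 2x2 case-control table, the sketch below uses the standard Woolf (log) method; the function name and the counts are hypothetical and unrelated to the study's data.

```python
import math

def odds_ratio_ci(case_carrier, case_noncarrier, ctrl_carrier, ctrl_noncarrier, z=1.96):
    """Odds ratio with a Woolf-type confidence interval from a 2x2 table.

    All four cell counts must be positive; z=1.96 gives an approximate 95% CI.
    """
    cells = (case_carrier, case_noncarrier, ctrl_carrier, ctrl_noncarrier)
    if min(cells) <= 0:
        raise ValueError("all cell counts must be positive")
    a, b, c, d = cells
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Woolf approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Hypothetical counts: 20 of 100 cases and 10 of 100 controls carry the risk genotype.
or_, lo, hi = odds_ratio_ci(20, 80, 10, 90)
```

A large odds ratio estimated from a small number of carriers will have a wide interval, which is worth keeping in mind when reading single-SNP results from candidate-gene chips.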
Rare copy number variants and congenital heart defects in the 22q11.2 deletion syndrome.
Mlynarski, Elisabeth E; Xie, Michael; Taylor, Deanne; Sheridan, Molly B; Guo, Tingwei; Racedo, Silvia E; McDonald-McGinn, Donna M; Chow, Eva W C; Vorstman, Jacob; Swillen, Ann; Devriendt, Koen; Breckpot, Jeroen; Digilio, Maria Cristina; Marino, Bruno; Dallapiccola, Bruno; Philip, Nicole; Simon, Tony J; Roberts, Amy E; Piotrowicz, Małgorzata; Bearden, Carrie E; Eliez, Stephan; Gothelf, Doron; Coleman, Karlene; Kates, Wendy R; Devoto, Marcella; Zackai, Elaine; Heine-Suñer, Damian; Goldmuntz, Elizabeth; Bassett, Anne S; Morrow, Bernice E; Emanuel, Beverly S
2016-03-01
The 22q11.2 deletion syndrome (22q11DS; velocardiofacial/DiGeorge syndrome; VCFS/DGS; MIM #192430; 188400) is the most common microdeletion syndrome. The phenotypic presentation of 22q11DS is highly variable; approximately 60-75 % of 22q11DS patients have been reported to have a congenital heart defect (CHD), mostly of the conotruncal type, and/or aortic arch defect. The etiology of the cardiac phenotypic variability is not currently known for the majority of patients. We hypothesized that rare copy number variants (CNVs) outside the 22q11.2 deleted region may modify the risk of being born with a CHD in this sensitized population. Rare CNV analysis was performed using Affymetrix SNP Array 6.0 data from 946 22q11DS subjects with CHDs (n = 607) or with normal cardiac anatomy (n = 339). Although there was no significant difference in the overall burden of rare CNVs, an overabundance of CNVs affecting cardiac-related genes was detected in 22q11DS individuals with CHDs. When the rare CNVs were examined with regard to gene interactions, specific cardiac networks, such as Wnt signaling, appear to be overrepresented in 22q11DS CHD cases but not 22q11DS controls with a normal heart. Collectively, these data suggest that CNVs outside the 22q11.2 region may contain genes that modify risk for CHDs in some 22q11DS patients.
Alvarado, David M.; Aferol, Hyuliya; McCall, Kevin; Huang, Jason B.; Techy, Matthew; Buchan, Jillian; Cady, Janet; Gonzales, Patrick R.; Dobbs, Matthew B.; Gurnett, Christina A.
2010-01-01
Clubfoot is a common musculoskeletal birth defect for which few causative genes have been identified. To identify the genes responsible for isolated clubfoot, we screened for genomic copy-number variants with the Affymetrix Genome-wide Human SNP Array 6.0. A recurrent chromosome 17q23.1q23.2 microduplication was identified in 3 of 66 probands with familial isolated clubfoot. The chromosome 17q23.1q23.2 microduplication segregated with autosomal-dominant clubfoot in all three families but with reduced penetrance. Mild short stature was common and one female had developmental hip dysplasia. Subtle skeletal abnormalities consisted of broad and shortened metatarsals and calcanei, small distal tibial epiphyses, and thickened ischia. Several skeletal features were opposite to those described in the reciprocal chromosome 17q23.1q23.2 microdeletion syndrome associated with developmental delay and cardiac and limb abnormalities. Of note, during our study, we also identified a microdeletion at the locus in a sibling pair with isolated clubfoot. The chromosome 17q23.1q23.2 region contains the T-box transcription factor TBX4, a likely target of the bicoid-related transcription factor PITX1 previously implicated in clubfoot etiology. Our result suggests that this chromosome 17q23.1q23.2 microduplication is a relatively common cause of familial isolated clubfoot and provides strong evidence linking clubfoot etiology to abnormal early limb development. PMID:20598276
Liang, Winnie S.; Chen, Kewei; Lee, Wendy; Sidhar, Kunal; Corneveaux, Jason J.; Allen, April N.; Myers, Amanda; Villa, Stephen; Meechoovet, Bessie; Pruzin, Jeremy; Bandy, Daniel; Fleisher, Adam S.; Langbaum, Jessica B.S.; Huentelman, Matthew J.; Jensen, Kendall; Dunckley, Travis; Caselli, Richard J.; Kaib, Susan; Reiman, Eric M.
2010-01-01
In a genome-wide association study (GWAS) of late-onset Alzheimer's disease (AD), we found an association between common haplotypes of the GAB2 gene and AD risk in carriers of the apolipoprotein E (APOE) ε4 allele, the major late-onset AD susceptibility gene. We previously proposed the use of fluorodeoxyglucose positron emission tomography (FDG-PET) measurements as a quantitative presymptomatic endophenotype, more closely related to disease risk than the clinical syndrome itself, to help evaluate putative genetic and non-genetic modifiers of AD risk. In this study, we examined the relationship between the presence or absence of the relatively protective GAB2 haplotype and PET measurements of regional-to-whole brain FDG uptake in several AD-affected brain regions in 158 cognitively normal late-middle-aged APOEε4 homozygotes, heterozygotes, and non-carriers. GAB2 haplotypes were characterized using Affymetrix Genome-Wide Human SNP 6.0 Array data from each of these subjects. As predicted, the possibly protective GAB2 haplotype was associated with higher regional-to-whole brain FDG uptake in AD-affected brain regions in APOEε4 carriers. While additional studies are needed, this study supports the association between the possibly protective GAB2 haplotype and the risk of late-onset AD in APOEε4 carriers. It also supports the use of brain-imaging endophenotypes to help assess possible modifiers of AD risk. PMID:20888920
Chen, Guo-Bo; Lee, Sang Hong; Brion, Marie-Jo A; Montgomery, Grant W; Wray, Naomi R; Radford-Smith, Graham L; Visscher, Peter M
2014-09-01
As custom arrays are cheaper than generic GWAS arrays, larger sample size is achievable for gene discovery. Custom arrays can tag more variants through denser genotyping of SNPs at associated loci, but at the cost of losing genome-wide coverage. Balancing this trade-off is important for maximizing experimental designs. We quantified both the gain in captured SNP-heritability at known candidate regions and the loss due to imperfect genome-wide coverage for inflammatory bowel disease using immunochip (iChip) and imputed GWAS data on 61,251 and 38,550 samples, respectively. For Crohn's disease (CD), the iChip and GWAS data explained 19 and 26% of variation in liability, respectively, and SNPs in the densely genotyped iChip regions explained 13% of the SNP-heritability for both the iChip and GWAS data. For ulcerative colitis (UC), the iChip and GWAS data explained 15 and 19% of variation in liability, respectively, and the dense iChip regions explained 10 and 9% of the SNP-heritability in the iChip and the GWAS data. From bivariate analyses, estimates of the genetic correlation in risk between CD and UC were 0.75 (SE 0.017) and 0.62 (SE 0.042) for the iChip and GWAS data, respectively. We also quantified the SNP-heritability of genomic regions that did or did not contain the previous 163 GWAS hits for CD and UC, and SNP-heritability of the overlapping loci between the densely genotyped iChip regions and the 163 GWAS hits. For both diseases, over different genomic partitioning, the densely genotyped regions on the iChip tagged at least as much variation in liability as in the corresponding regions in the GWAS data; however, a certain amount of tagged SNP-heritability in the GWAS data was lost using the iChip due to the low coverage at unselected regions. These results imply that custom arrays with a GWAS backbone will facilitate more gene discovery, both at associated and novel loci. © The Author 2014. Published by Oxford University Press. All rights reserved.
NIH CIDR Program Studies For whole exome sequencing projects, we pretest all samples using a high-density SNP array (>200,000 markers). For custom targeted sequencing, we pretest all samples using a 96 SNP GoldenGate assay. This extensive pretesting allows us to unambiguously tie
Novel applications of array comparative genomic hybridization in molecular diagnostics.
Cheung, Sau W; Bi, Weimin
2018-05-31
In 2004, the implementation of array comparative genomic hybridization (array CGH) into clinical practice marked a new milestone for genetic diagnosis. Array CGH and single-nucleotide polymorphism (SNP) arrays enable genome-wide detection of copy number changes at high resolution, and therefore microarray has been recognized as the first-tier test for patients with intellectual disability or multiple congenital anomalies, and has also been applied prenatally for detection of clinically relevant copy number variations in the fetus. Area covered: In this review, the authors summarize the evolution of array CGH technology from their diagnostic laboratory, highlighting exonic SNP arrays developed in the past decade which detect small intragenic copy number changes as well as large DNA segments for the region of heterozygosity. The applications of array CGH to human diseases with different modes of inheritance with the emphasis on autosomal recessive disorders are discussed. Expert commentary: An exonic array is a powerful and most efficient clinical tool in detecting genome-wide small copy number variants in both dominant and recessive disorders. However, whole-genome sequencing may become the single integrated platform for detection of copy number changes, single-nucleotide changes as well as balanced chromosomal rearrangements in the near future.
Babushok, Daria V.; Xie, Hongbo M.; Roth, Jacquelyn J.; Perdigones, Nieves; Olson, Timothy S.; Cockroft, Joshua D.; Gai, Xiaowu; Perin, Juan C.; Li, Yimei; Paessler, Michele E.; Hakonarson, Hakon; Podsakoff, Gregory M.; Mason, Philip J.; Biegel, Jaclyn A.; Bessler, Monica
2013-01-01
Summary The bone marrow failure syndromes (BMFS) are a heterogeneous group of rare blood disorders characterized by inadequate haematopoiesis, clonal evolution, and increased risk of leukaemia. Single nucleotide polymorphism arrays (SNP-A) have been proposed as a tool for surveillance of clonal evolution in BMFS. To better understand the natural history of BMFS and to assess the clinical utility of SNP-A in these disorders, we analysed 124 SNP-A from a comprehensively characterized cohort of 91 patients at our BMFS centre. SNP-A were correlated with medical histories, haematopathology, cytogenetic and molecular data. To assess clonal evolution, longitudinal analysis of SNP-A was performed in 25 patients. We found that acquired copy number-neutral loss of heterozygosity (CN-LOH) was significantly more frequent in acquired aplastic anaemia (aAA) than in other BMFS (odds ratio 12.2, p<0.01). Homozygosity by descent was most common in congenital BMFS, frequently unmasking autosomal recessive mutations. Copy number variants (CNVs) were frequently polymorphic, and we identified CNVs enriched in neutropenia and aAA. Our results suggest that acquired CN-LOH is a general phenomenon in aAA that is probably mechanistically and prognostically distinct from typical CN-LOH of myeloid malignancies. Our analysis of clinical utility of SNP-A shows the highest yield of detecting new clonal haematopoiesis at diagnosis and at relapse. PMID:24116929
Babushok, Daria V; Xie, Hongbo M; Roth, Jacquelyn J; Perdigones, Nieves; Olson, Timothy S; Cockroft, Joshua D; Gai, Xiaowu; Perin, Juan C; Li, Yimei; Paessler, Michele E; Hakonarson, Hakon; Podsakoff, Gregory M; Mason, Philip J; Biegel, Jaclyn A; Bessler, Monica
2014-01-01
The bone marrow failure syndromes (BMFS) are a heterogeneous group of rare blood disorders characterized by inadequate haematopoiesis, clonal evolution, and increased risk of leukaemia. Single nucleotide polymorphism arrays (SNP-A) have been proposed as a tool for surveillance of clonal evolution in BMFS. To better understand the natural history of BMFS and to assess the clinical utility of SNP-A in these disorders, we analysed 124 SNP-A from a comprehensively characterized cohort of 91 patients at our BMFS centre. SNP-A were correlated with medical histories, haematopathology, cytogenetic and molecular data. To assess clonal evolution, longitudinal analysis of SNP-A was performed in 25 patients. We found that acquired copy number-neutral loss of heterozygosity (CN-LOH) was significantly more frequent in acquired aplastic anaemia (aAA) than in other BMFS (odds ratio 12·2, P < 0·01). Homozygosity by descent was most common in congenital BMFS, frequently unmasking autosomal recessive mutations. Copy number variants (CNVs) were frequently polymorphic, and we identified CNVs enriched in neutropenia and aAA. Our results suggest that acquired CN-LOH is a general phenomenon in aAA that is probably mechanistically and prognostically distinct from typical CN-LOH of myeloid malignancies. Our analysis of clinical utility of SNP-A shows the highest yield of detecting new clonal haematopoiesis at diagnosis and at relapse. © 2013 John Wiley & Sons Ltd.
Littlejohn, Mathew D; Turner, Sally-Anne; Walker, Caroline G; Berry, Sarah D; Tiplady, Kathryn; Sherlock, Ric G; Sutherland, Greg; Swift, Simon; Garrick, Dorian; Lacy-Hulbert, S Jane; McDougall, Scott; Spelman, Richard J; Snell, Russell G; Hillerton, J Eric
2018-05-01
Inflammation of the mammary gland following bacterial infection, commonly known as mastitis, affects all mammalian species. Although the aetiology and epidemiology of mastitis in the dairy cow are well described, the genetic factors mediating resistance to mammary gland infection are not well known, due in part to the difficulty in obtaining robust phenotypic information from sufficiently large numbers of individuals. To address this problem, an experimental mammary gland infection experiment was undertaken, using a Friesian-Jersey cross breed F2 herd. A total of 604 animals received an intramammary infusion of Streptococcus uberis in one gland, and the clinical response over 13 milkings was used for linkage mapping and genome-wide association analysis. A quantitative trait locus (QTL) was detected on bovine chromosome 11 for clinical mastitis status using micro-satellite and Affymetrix 10 K SNP markers, and exome and genome sequence data from the six F1 sires of the experimental animals were then used to examine this region in more detail. A total of 485 sequence variants were typed in the QTL interval, and association mapping using these and an additional 37 986 genome-wide markers from the Illumina SNP50 bovine SNP panel revealed association with markers encompassing the interleukin-1 gene cluster locus. This study highlights a region on bovine chromosome 11, consistent with earlier studies, as conferring resistance to experimentally induced mammary gland infection, and newly prioritises the IL1 gene cluster for further analysis in genetic resistance to mastitis.
Diverse Genome-wide Association Studies Associate the IL12/IL23 Pathway with Crohn Disease
Wang, Kai; Zhang, Haitao; Kugathasan, Subra; Annese, Vito; Bradfield, Jonathan P.; Russell, Richard K.; Sleiman, Patrick M.A.; Imielinski, Marcin; Glessner, Joseph; Hou, Cuiping; Wilson, David C.; Walters, Thomas; Kim, Cecilia; Frackelton, Edward C.; Lionetti, Paolo; Barabino, Arrigo; Van Limbergen, Johan; Guthery, Stephen; Denson, Lee; Piccoli, David; Li, Mingyao; Dubinsky, Marla; Silverberg, Mark; Griffiths, Anne; Grant, Struan F.A.; Satsangi, Jack; Baldassano, Robert; Hakonarson, Hakon
2009-01-01
Previous genome-wide association (GWA) studies typically focus on single-locus analysis, which may not have the power to detect the majority of genuinely associated loci. Here, we applied pathway analysis using Affymetrix SNP genotype data from the Wellcome Trust Case Control Consortium (WTCCC) and uncovered significant association between Crohn Disease (CD) and the IL12/IL23 pathway, harboring 20 genes (p = 8 × 10^-5). Interestingly, the pathway contains multiple genes (IL12B and JAK2) or homologs of genes (STAT3 and CCR6) that were recently identified as genuine susceptibility genes only through meta-analysis of several GWA studies. In addition, the pathway contains other susceptibility genes for CD, including IL18R1, JUN, IL12RB1, and TYK2, which do not reach genome-wide significance by single-marker association tests. The observed pathway-specific association signal was subsequently replicated in three additional GWA studies of European and African American ancestry generated on the Illumina HumanHap550 platform. Our study suggests that examination beyond individual SNP hits, by focusing on genetic networks and pathways, is important to unleashing the true power of GWA studies. PMID:19249008
NASA Astrophysics Data System (ADS)
Vukanti, R. V.; Mintz, E. M.; Leff, L. G.
2005-05-01
Bacterial responses to environmental signals are multifactorial and are coupled to changes in gene expression. An understanding of bacterial responses to environmental conditions is possible using microarray expression analysis. In this study, the utility of microarrays for examining changes in gene expression in Escherichia coli under different environmental conditions was assessed. RNA was isolated, hybridized to Affymetrix E. coli Genome 2.0 chips and analyzed using Affymetrix GCOS and Genespring software. Major limiting factors were obtaining enough quality RNA (10^7-10^8 cells to get 10 μg RNA) and accounting for differences in growth rates under different conditions. Stabilization of RNA prior to isolation and taking extreme precautions while handling RNA were crucial. In addition, use of this method in ecological studies is limited by availability and cost of commercial arrays; choice of primers for cDNA synthesis, reproducibility, complexity of results generated and need to validate findings. This method may be more widely applicable with the development of better approaches for RNA recovery from environmental samples and increased number of available strain-specific arrays. Diligent experimental design and verification of results with real-time PCR or northern blots is needed. Overall, there is a great potential for use of this technology to discover mechanisms underlying organisms' responses to environmental conditions.
Veyrieras, Jean-Baptiste; Gaffney, Daniel J.; Pickrell, Joseph K.; Gilad, Yoav; Stephens, Matthew; Pritchard, Jonathan K.
2012-01-01
Mapping of expression quantitative trait loci (eQTLs) is an important technique for studying how genetic variation affects gene regulation in natural populations. In a previous study using Illumina expression data from human lymphoblastoid cell lines, we reported that cis-eQTLs are especially enriched around transcription start sites (TSSs) and immediately upstream of transcription end sites (TESs). In this paper, we revisit the distribution of eQTLs using additional data from Affymetrix exon arrays and from RNA sequencing. We confirm that most eQTLs lie close to the target genes; that transcribed regions are generally enriched for eQTLs; that eQTLs are more abundant in exons than introns; and that the peak density of eQTLs occurs at the TSS. However, we find that the intriguing TES peak is greatly reduced or absent in the Affymetrix and RNA-seq data. Instead our data suggest that the TES peak observed in the Illumina data is mainly due to exon-specific QTLs that affect 3′ untranslated regions, where most of the Illumina probes are positioned. Nonetheless, we do observe an overall enrichment of eQTLs in exons versus introns in all three data sets, consistent with an important role for exonic sequences in gene regulation. PMID:22359548
McKinney, Cushla; Stamp, Lisa K; Dalbeth, Nicola; Topless, Ruth K; Day, Richard O; Kannangara, Diluk Rw; Williams, Kenneth M; Janssen, Matthijs; Jansen, Timothy L; Joosten, Leo A; Radstake, Timothy R; Riches, Philip L; Tausche, Anne-Kathrin; Lioté, Frederic; So, Alexander; Merriman, Tony R
2015-10-13
The acute gout flare results from a localised self-limiting innate immune response to monosodium urate (MSU) crystals deposited in joints in hyperuricaemic individuals. Activation of the caspase recruitment domain-containing protein 8 (CARD8) NOD-like receptor pyrin-containing 3 (NLRP3) inflammasome by MSU crystals and production of mature interleukin-1β (IL-1β) is central to acute gouty arthritis. However very little is known about genetic control of the innate immune response involved in acute gouty arthritis. Therefore our aim was to test functional single nucleotide polymorphism (SNP) variants in the toll-like receptor (TLR)-inflammasome-IL-1β axis for association with gout. 1,494 gout cases of European and 863 gout cases of New Zealand (NZ) Polynesian (Māori and Pacific Island) ancestry were included. Gout was diagnosed by the 1977 ARA gout classification criteria. There were 1,030 Polynesian controls and 10,942 European controls including from the publicly-available Atherosclerosis Risk in Communities (ARIC) and Framingham Heart (FHS) studies. The ten SNPs were either genotyped by Sequenom MassArray or by Affymetrix SNP array or imputed in the ARIC and FHS datasets. Allelic association was done by logistic regression adjusting by age and sex with European and Polynesian data combined by meta-analysis. Sample sets were pooled for multiplicative interaction analysis, which was also adjusted by sample set. Eleven SNPs were tested in the TLR2, CD14, IL1B, CARD8, NLRP3, MYD88, P2RX7, DAPK1 and TNXIP genes. Nominally significant (P < 0.05) associations with gout were detected at CARD8 rs2043211 (OR = 1.12, P = 0.007), IL1B rs1143623 (OR = 1.10, P = 0.020) and CD14 rs2569190 (OR = 1.08; P = 0.036). There was significant multiplicative interaction between CARD8 and IL1B (P = 0.005), with the IL1B risk genotype amplifying the risk effect of CARD8. There is evidence for association of gout with functional variants in CARD8, IL1B and CD14. 
The gout-associated allele of IL1B increases expression of IL-1β - the multiplicative interaction with CARD8 would be consistent with a synergy of greater inflammasome activity (resulting from reduced CARD8) combined with higher levels of pre-IL-1β expression leading to increased production of mature IL-1β in gout.
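The abstract above states that European and Polynesian results were combined by meta-analysis but does not name the method. The sketch below shows one common choice, a fixed-effect inverse-variance meta-analysis of log odds ratios; this is an assumption for illustration, and the function name and input values are hypothetical rather than taken from the study.

```python
import math

def fixed_effect_meta(log_odds_ratios, std_errors):
    """Pool per-cohort log odds ratios by inverse-variance weighting.

    Returns the pooled odds ratio, the standard error of the pooled
    log odds ratio, and a two-sided p-value from a normal approximation.
    """
    if len(log_odds_ratios) != len(std_errors) or not log_odds_ratios:
        raise ValueError("need one standard error per estimate")
    weights = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(w * b for w, b in zip(weights, log_odds_ratios)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    z = pooled / pooled_se
    # Two-sided p-value: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return math.exp(pooled), pooled_se, p_value

# Hypothetical per-cohort estimates with equal precision.
pooled_or, pooled_se, p = fixed_effect_meta(
    [math.log(1.2), math.log(1.0)], [0.1, 0.1]
)
```

A fixed-effect model assumes the same underlying effect in both ancestry groups; when that assumption is doubtful, a random-effects model is the usual alternative.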
Comparison between genotyping by sequencing and SNP-chip genotyping in QTL mapping in wheat
USDA-ARS's Scientific Manuscript database
Array- or chip-based single nucleotide polymorphism (SNP) markers are widely used in genomic studies because of their abundance in a genome and lower cost per data point compared to older marker technologies. Genotyping by sequencing (GBS), a relatively newer approach to genotyping, suggests equal or...
Making a chocolate chip: development and evaluation of a 6K SNP array for Theobroma cacao.
USDA-ARS's Scientific Manuscript database
Theobroma cacao, the key ingredient in chocolate production, is one of the world's most important tree fruit crops, with ~4,000,000 metric tons produced across 50 countries. To move towards gene discovery and marker-assisted breeding in cacao, a single-nucleotide polymorphism (SNP) identification pr...
Identification of a Novel Idiopathic Epilepsy Locus in Belgian Shepherd Dogs
Seppälä, Eija H.; Koskinen, Lotta L. E.; Gulløv, Christina H.; Jokinen, Päivi; Karlskov-Mortensen, Peter; Bergamasco, Luciana; Baranowska Körberg, Izabella; Cizinauskas, Sigitas; Oberbauer, Anita M.; Berendt, Mette; Fredholm, Merete; Lohi, Hannes
2012-01-01
Epilepsy is the most common neurological disorder in dogs, with an incidence ranging from 0.5% to up to 20% in particular breeds. Canine epilepsy can be etiologically defined as idiopathic or symptomatic. Epileptic seizures may be classified as focal with or without secondary generalization, or as primary generalized. Nine genes have been identified for symptomatic (storage diseases) and one for idiopathic epilepsy in different breeds. However, the genetic background of common canine epilepsies remains unknown. We have studied the clinical and genetic background of epilepsy in Belgian Shepherds. We collected 159 cases and 148 controls and confirmed the presence of epilepsy through epilepsy questionnaires and clinical examinations. The MRI was normal while interictal EEG revealed abnormalities and variable foci in the clinically examined affected dogs. A genome-wide association study using Affymetrix 50K SNP arrays in 40 cases and 44 controls mapped the epilepsy locus on CFA37, which was replicated in an independent cohort (81 cases and 88 controls; combined p = 9.70×10−10, OR = 3.3). Fine mapping study defined a ∼1 Mb region including 12 genes of which none are known epilepsy genes or encode ion channels. Exonic sequencing was performed for two candidate genes, KLF7 and ADAM23. No variation was found in KLF7 but a highly-associated non-synonymous variant, G1203A (R387H) was present in the ADAM23 gene (p = 3.7×10−8, OR = 3.9 for homozygosity). Homozygosity for a two-SNP haplotype within the ADAM23 gene conferred the highest risk for epilepsy (p = 6.28×10−11, OR = 7.4). ADAM23 interacts with known epilepsy proteins LGI1 and LGI2. However, our data suggests that the ADAM23 variant is a polymorphism and we have initiated a targeted re-sequencing study across the locus to identify the causative mutation. 
It would establish the affected breed as a novel therapeutic model, help to develop a DNA test for breeding purposes and introduce a novel candidate gene for human idiopathic epilepsies. PMID:22457775
A Genome-Wide Association Search for Type 2 Diabetes Genes in African Americans
Palmer, Nicholette D.; McDonough, Caitrin W.; Hicks, Pamela J.; Roh, Bong H.; Wing, Maria R.; An, S. Sandy; Hester, Jessica M.; Cooke, Jessica N.; Bostrom, Meredith A.; Rudock, Megan E.; Talbert, Matthew E.; Lewis, Joshua P.; Ferrara, Assiamira; Lu, Lingyi; Ziegler, Julie T.; Sale, Michele M.; Divers, Jasmin; Shriner, Daniel; Adeyemo, Adebowale; Rotimi, Charles N.; Ng, Maggie C. Y.; Langefeld, Carl D.; Freedman, Barry I.; Bowden, Donald W.
2012-01-01
African Americans are disproportionately affected by type 2 diabetes (T2DM) yet few studies have examined T2DM using genome-wide association approaches in this ethnicity. The aim of this study was to identify genes associated with T2DM in the African American population. We performed a Genome Wide Association Study (GWAS) using the Affymetrix 6.0 array in 965 African-American cases with T2DM and end-stage renal disease (T2DM-ESRD) and 1029 population-based controls. The most significant SNPs (n = 550 independent loci) were genotyped in a replication cohort and 122 SNPs (n = 98 independent loci) were further tested through genotyping three additional validation cohorts followed by meta-analysis in all five cohorts totaling 3,132 cases and 3,317 controls. Twelve SNPs had evidence of association in the GWAS (P<0.0071), were directionally consistent in the Replication cohort and were associated with T2DM in subjects without nephropathy (P<0.05). Meta-analysis in all cases and controls revealed a single SNP reaching genome-wide significance (P<2.5×10−8). SNP rs7560163 (P = 7.0×10−9, OR (95% CI) = 0.75 (0.67–0.84)) is located intergenically between RND3 and RBM43. Four additional loci (rs7542900, rs4659485, rs2722769 and rs7107217) were associated with T2DM (P<0.05) and reached more nominal levels of significance (P<2.5×10−5) in the overall analysis and may represent novel loci that contribute to T2DM. We have identified novel T2DM-susceptibility variants in the African-American population. Notably, T2DM risk was associated with the major allele and implies an interesting genetic architecture in this population. These results suggest that multiple loci underlie T2DM susceptibility in the African-American population and that these loci are distinct from those identified in other ethnic populations. PMID:22238593
Equalizer reduces SNP bias in Affymetrix microarrays.
Quigley, David
2015-07-30
Gene expression microarrays measure the levels of messenger ribonucleic acid (mRNA) in a sample using probe sequences that hybridize with transcribed regions. These probe sequences are designed using a reference genome for the relevant species. However, most model organisms and all humans have genomes that deviate from their reference. These variations, which include single nucleotide polymorphisms, insertions of additional nucleotides, and nucleotide deletions, can affect the microarray's performance. Genetic experiments comparing individuals bearing different population-associated single nucleotide polymorphisms that intersect microarray probes are therefore subject to systemic bias, as the reduction in binding efficiency due to a technical artifact is confounded with genetic differences between parental strains. This problem has been recognized for some time, and earlier methods of compensation have attempted to identify probes affected by genome variants using statistical models. These methods may require replicate microarray measurement of gene expression in the relevant tissue in inbred parental samples, which are not always available in model organisms and are never available in humans. By using sequence information for the genomes of organisms under investigation, potentially problematic probes can now be identified a priori. However, there is no published software tool that makes it easy to eliminate these probes from an annotation. I present equalizer, a software package that uses genome variant data to modify annotation files for the commonly used Affymetrix IVT and Gene/Exon platforms. These files can be used by any microarray normalization method for subsequent analysis. I demonstrate how use of equalizer on experiments mapping germline influence on gene expression in a genetic cross between two divergent mouse species and in human samples significantly reduces probe hybridization-induced bias, reducing false positive and false negative findings. 
The equalizer package reduces probe hybridization bias from experiments performed on the Affymetrix microarray platform, allowing accurate assessment of germline influence on gene expression.
Delaneau, Olivier; Marchini, Jonathan
2014-06-13
A major use of the 1000 Genomes Project (1000 GP) data is genotype imputation in genome-wide association studies (GWAS). Here we develop a method to estimate haplotypes from low-coverage sequencing data that can take advantage of single-nucleotide polymorphism (SNP) microarray genotypes on the same samples. First the SNP array data are phased to build a backbone (or 'scaffold') of haplotypes across each chromosome. We then phase the sequence data 'onto' this haplotype scaffold. This approach can take advantage of relatedness between sequenced and non-sequenced samples to improve accuracy. We use this method to create a new 1000 GP haplotype reference set for use by the human genetic community. Using a set of validation genotypes at SNP and bi-allelic indels we show that these haplotypes have lower genotype discordance and improved imputation performance into downstream GWAS samples, especially at low-frequency variants.
McCue, Molly E.; Bannasch, Danika L.; Petersen, Jessica L.; Gurr, Jessica; Bailey, Ernie; Binns, Matthew M.; Distl, Ottmar; Guérin, Gérard; Hasegawa, Telhisa; Hill, Emmeline W.; Leeb, Tosso; Lindgren, Gabriella; Penedo, M. Cecilia T.; Røed, Knut H.; Ryder, Oliver A.; Swinburne, June E.; Tozaki, Teruaki; Valberg, Stephanie J.; Vaudin, Mark; Lindblad-Toh, Kerstin
2012-01-01
An equine SNP genotyping array was developed and evaluated on a panel of samples representing 14 domestic horse breeds and 18 evolutionarily related species. More than 54,000 polymorphic SNPs provided an average inter-SNP spacing of ∼43 kb. The mean minor allele frequency across domestic horse breeds was 0.23, and the number of polymorphic SNPs within breeds ranged from 43,287 to 52,085. Genome-wide linkage disequilibrium (LD) in most breeds declined rapidly over the first 50–100 kb and reached background levels within 1–2 Mb. The extent of LD and the level of inbreeding were highest in the Thoroughbred and lowest in the Mongolian and Quarter Horse. Multidimensional scaling (MDS) analyses demonstrated the tight grouping of individuals within most breeds, close proximity of related breeds, and less tight grouping in admixed breeds. The close relationship between the Przewalski's Horse and the domestic horse was demonstrated by pair-wise genetic distance and MDS. Genotyping of other Perissodactyla (zebras, asses, tapirs, and rhinoceros) was variably successful, with call rates and the number of polymorphic loci varying across taxa. Parsimony analysis placed the modern horse as sister taxa to Equus przewalski. The utility of the SNP array in genome-wide association was confirmed by mapping the known recessive chestnut coat color locus (MC1R) and defining a conserved haplotype of ∼750 kb across all breeds. These results demonstrate the high quality of this SNP genotyping resource, its usefulness in diverse genome analyses of the horse, and potential use in related species. PMID:22253606
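The abstract above reports a mean minor allele frequency and counts of polymorphic SNPs per breed. As a rough illustration of how those two quantities are usually derived from array genotypes, here is a minimal sketch. It assumes genotypes are coded as 0, 1 or 2 copies of one allele with `None` for a failed call; the coding and function names are assumptions for illustration, not details taken from the paper.

```python
def minor_allele_frequency(genotypes):
    """Return the minor allele frequency for one SNP.

    genotypes: per-individual counts (0, 1 or 2) of the B allele;
    None marks a missing call and is ignored.
    Returns None when no individual was called.
    """
    called = [g for g in genotypes if g is not None]
    if not called:
        return None
    freq_b = sum(called) / (2 * len(called))
    return min(freq_b, 1.0 - freq_b)


def is_polymorphic(genotypes):
    """A SNP is polymorphic in a sample if both alleles are observed."""
    maf = minor_allele_frequency(genotypes)
    return maf is not None and maf > 0.0


# Hypothetical genotypes for two SNPs in one breed.
snp_a = [0, 1, 2, 1, None, 0]
snp_b = [2, 2, 2]
print(minor_allele_frequency(snp_a), is_polymorphic(snp_a))
print(minor_allele_frequency(snp_b), is_polymorphic(snp_b))
```

Averaging `minor_allele_frequency` over SNPs gives a per-breed mean, and counting SNPs for which `is_polymorphic` is true gives the per-breed polymorphic total.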
Tumino, Giorgio; Voorrips, Roeland E; Rizza, Fulvia; Badeck, Franz W; Morcia, Caterina; Ghizzoni, Roberta; Germeier, Christoph U; Paulo, Maria-João; Terzi, Valeria; Smulders, Marinus J M
2016-09-01
Infinium SNP data analysed as continuous intensity ratios enabled associating genotypic and phenotypic data from heterogeneous oat samples, showing that association mapping for frost tolerance is a feasible option. Oat is sensitive to freezing temperatures, which restricts the cultivation of fall-sown or winter oats to regions with milder winters. Fall-sown oats have a longer growth cycle, mature earlier, and have a higher productivity than spring-sown oats, therefore improving frost tolerance is an important goal in oat breeding. Our aim was to test the effectiveness of a Genome-Wide Association Study (GWAS) for mapping QTLs related to frost tolerance, using an approach that tolerates continuously distributed signals from SNPs in bulked samples from heterogeneous accessions. A collection of 138 European oat accessions, including landraces, old and modern varieties from 27 countries was genotyped using the Infinium 6K SNP array. The SNP data were analyzed as continuous intensity ratios, rather than converting them into discrete values by genotype calling. PCA and Ward's clustering of genetic similarities revealed the presence of two main groups of accessions, which roughly corresponded to Continental Europe and Mediterranean/Atlantic Europe, although a total of eight subgroups can be distinguished. The accessions were phenotyped for frost tolerance under controlled conditions by measuring fluorescence quantum yield of photosystem II after a freezing stress. GWAS were performed by a linear mixed model approach, comparing different corrections for population structure. All models detected three robust QTLs, two of which co-mapped with QTLs identified earlier in bi-parental mapping populations. The approach used in the present work shows that SNP array data of heterogeneous hexaploid oat samples can be successfully used to determine genetic similarities and to map associations to quantitative phenotypic traits.
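The abstract above contrasts continuous intensity ratios with discrete genotype calls. The sketch below illustrates the difference under stated assumptions: the ratio is taken to be Y / (X + Y) for the two allele channel intensities, and the calling thresholds are arbitrary. Neither the formula nor the thresholds comes from the paper, and all names are hypothetical.

```python
def intensity_ratio(x, y):
    """Continuous allele signal: share of total intensity in the Y channel.

    Returns None when there is no signal in either channel.
    """
    total = x + y
    if total <= 0:
        return None
    return y / total


def call_genotype(ratio, low=0.2, high=0.8):
    """Discrete call from a ratio using hypothetical cut-offs."""
    if ratio is None:
        return "NC"  # no call
    if ratio < low:
        return "AA"
    if ratio > high:
        return "BB"
    return "AB"


# Two hypothetical bulked samples with different allele proportions.
for x, y in [(300.0, 100.0), (150.0, 250.0)]:
    ratio = intensity_ratio(x, y)
    print(f"ratio = {ratio:.3f}, discrete call = {call_genotype(ratio)}")
```

Both samples receive the same discrete call even though their ratios differ, which is the kind of information a bulked, heterogeneous accession would lose under genotype calling.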
Genome-Wide SNP Detection, Validation, and Development of an 8K SNP Array for Apple
Chagné, David; Crowhurst, Ross N.; Troggio, Michela; Davey, Mark W.; Gilmore, Barbara; Lawley, Cindy; Vanderzande, Stijn; Hellens, Roger P.; Kumar, Satish; Cestaro, Alessandro; Velasco, Riccardo; Main, Dorrie; Rees, Jasper D.; Iezzoni, Amy; Mockler, Todd; Wilhelm, Larry; Van de Weg, Eric; Gardiner, Susan E.; Bassil, Nahla; Peace, Cameron
2012-01-01
As high-throughput genetic marker screening systems are essential for a range of genetics studies and plant breeding applications, the International RosBREED SNP Consortium (IRSC) has utilized the Illumina Infinium® II system to develop a medium- to high-throughput SNP screening tool for genome-wide evaluation of allelic variation in apple (Malus×domestica) breeding germplasm. For genome-wide SNP discovery, 27 apple cultivars were chosen to represent worldwide breeding germplasm and re-sequenced at low coverage with the Illumina Genome Analyzer II. Following alignment of these sequences to the whole genome sequence of ‘Golden Delicious’, SNPs were identified using SoapSNP. A total of 2,113,120 SNPs were detected, corresponding to one SNP to every 288 bp of the genome. The Illumina GoldenGate® assay was then used to validate a subset of 144 SNPs with a range of characteristics, using a set of 160 apple accessions. This validation assay enabled fine-tuning of the final subset of SNPs for the Illumina Infinium® II system. The set of stringent filtering criteria developed allowed choice of a set of SNPs that not only exhibited an even distribution across the apple genome and a range of minor allele frequencies to ensure utility across germplasm, but also were located in putative exonic regions to maximize genotyping success rate. A total of 7867 apple SNPs was established for the IRSC apple 8K SNP array v1, of which 5554 were polymorphic after evaluation in segregating families and a germplasm collection. This publicly available genomics resource will provide an unprecedented resolution of SNP haplotypes, which will enable marker-locus-trait association discovery, description of the genetic architecture of quantitative traits, investigation of genetic variation (neutral and functional), and genomic selection in apple. PMID:22363718
Schweizer, Rena M; Robinson, Jacqueline; Harrigan, Ryan; Silva, Pedro; Galverni, Marco; Musiani, Marco; Green, Richard E; Novembre, John; Wayne, Robert K
2016-01-01
In an era of ever-increasing amounts of whole-genome sequence data for individuals and populations, the utility of traditional single nucleotide polymorphisms (SNPs) array-based genome scans is uncertain. We previously performed a SNP array-based genome scan to identify candidate genes under selection in six distinct grey wolf (Canis lupus) ecotypes. Using this information, we designed a targeted capture array for 1040 genes, including all exons and flanking regions, as well as 5000 1-kb nongenic neutral regions, and resequenced these regions in 107 wolves. Selection tests revealed striking patterns of variation within candidate genes relative to noncandidate regions and identified potentially functional variants related to local adaptation. We found 27% and 47% of candidate genes from the previous SNP array study had functional changes that were outliers in sweed and bayenv analyses, respectively. This result verifies the use of genomewide SNP surveys to tag genes that contain functional variants between populations. We highlight nonsynonymous variants in APOB, LIPG and USH2A that occur in functional domains of these proteins, and that demonstrate high correlation with precipitation seasonality and vegetation. We find Arctic and High Arctic wolf ecotypes have higher numbers of genes under selection, which highlight their conservation value and heightened threat due to climate change. This study demonstrates that combining genomewide genotyping arrays with large-scale resequencing and environmental data provides a powerful approach to discern candidate functional variants in natural populations. © 2015 John Wiley & Sons Ltd.
Discovery and mapping of single feature polymorphisms in wheat using Affymetrix arrays
Bernardo, Amy N; Bradbury, Peter J; Ma, Hongxiang; Hu, Shengwa; Bowden, Robert L; Buckler, Edward S; Bai, Guihua
2009-01-01
Background Wheat (Triticum aestivum L.) is a staple food crop worldwide. The wheat genome has not yet been sequenced due to its huge genome size (~17,000 Mb) and high levels of repetitive sequences; the whole genome sequence may not be expected in the near future. Available linkage maps have low marker density due to limitation in available markers; therefore new technologies that detect genome-wide polymorphisms are still needed to discover a large number of new markers for construction of high-resolution maps. A high-resolution map is a critical tool for gene isolation, molecular breeding and genomic research. Single feature polymorphism (SFP) is a new microarray-based type of marker that is detected by hybridization of DNA or cRNA to oligonucleotide probes. This study was conducted to explore the feasibility of using the Affymetrix GeneChip to discover and map SFPs in the large hexaploid wheat genome. Results Six wheat varieties of diverse origins (Ning 7840, Clark, Jagger, Encruzilhada, Chinese Spring, and Opata 85) were analyzed for significant probe by variety interactions and 396 probe sets with SFPs were identified. A subset of 164 unigenes was sequenced and 54% showed polymorphism within probes. Microarray analysis of 71 recombinant inbred lines from the cross Ning 7840/Clark identified 955 SFPs and 877 of them were mapped together with 269 simple sequence repeat markers. The SFPs were randomly distributed within a chromosome but were unevenly distributed among different genomes. The B genome had the most SFPs, and the D genome had the least. Map positions of a selected set of SFPs were validated by mapping single nucleotide polymorphism using SNaPshot and comparing with expressed sequence tags mapping data. Conclusion The Affymetrix array is a cost-effective platform for SFP discovery and SFP mapping in wheat. The new high-density map constructed in this study will be a useful tool for genetic and genomic research in wheat. PMID:19480702
Automated tetraploid genotype calling by hierarchical clustering
USDA-ARS's Scientific Manuscript database
SNP arrays are transforming breeding and genetics research for autotetraploids. To fully utilize these arrays, however, the relationship between signal intensity and allele dosage must be inferred independently for each marker. We developed an improved computational method to automate this process, ...
Interim report on updated microarray probes for the LLNL Burkholderia pseudomallei SNP array
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gardner, S; Jaing, C
2012-03-27
The overall goal of this project is to forensically characterize 100 unknown Burkholderia isolates in the US-Australia collaboration. We will identify genome-wide single nucleotide polymorphisms (SNPs) from B. pseudomallei and near neighbor species including B. mallei, B. thailandensis and B. oklahomensis. We will design microarray probes to detect these SNP markers and analyze 100 Burkholderia genomic DNAs extracted from environmental, clinical and near neighbor isolates from Australian collaborators on the Burkholderia SNP microarray. We will analyze the microarray genotyping results to characterize the genetic diversity of these new isolates and triage the samples for whole genome sequencing. In this interim report, we described the SNP analysis and the microarray probe design for the Burkholderia SNP microarray.
2012-01-01
Background It is known from recent studies that more than 90% of human multi-exon genes are subject to Alternative Splicing (AS), a key molecular mechanism in which multiple transcripts may be generated from a single gene. It is widely recognized that a breakdown in AS mechanisms plays an important role in cellular differentiation and pathologies. Polymerase Chain Reactions, microarrays and sequencing technologies have been applied to the study of transcript diversity arising from alternative expression. Last generation Affymetrix GeneChip Human Exon 1.0 ST Arrays offer a more detailed view of the gene expression profile providing information on the AS patterns. The exon array technology, with more than five million data points, can detect approximately one million exons, and it allows performing analyses at both gene and exon level. In this paper we describe BEAT, an integrated user-friendly bioinformatics framework to store, analyze and visualize exon arrays datasets. It combines a data warehouse approach with some rigorous statistical methods for assessing the AS of genes involved in diseases. Meta statistics are proposed as a novel approach to explore the analysis results. BEAT is available at http://beat.ba.itb.cnr.it. Results BEAT is a web tool which allows uploading and analyzing exon array datasets using standard statistical methods and an easy-to-use graphical web front-end. BEAT has been tested on a dataset with 173 samples and tuned using new datasets of exon array experiments from 28 colorectal cancer and 26 renal cell cancer samples produced at the Medical Genetics Unit of IRCCS Casa Sollievo della Sofferenza. To highlight all possible AS events, alternative names, accession Ids, Gene Ontology terms and biochemical pathways annotations are integrated with exon and gene level expression plots. 
The user can customize the results choosing custom thresholds for the statistical parameters and exploiting the available clinical data of the samples for a multivariate AS analysis. Conclusions Despite exon array chips being widely used for transcriptomics studies, there is a lack of analysis tools offering advanced statistical features and requiring no programming knowledge. BEAT provides a user-friendly platform for a comprehensive study of AS events in human diseases, displaying the analysis results with easily interpretable and interactive tables and graphics. PMID:22536968
NuGO contributions to GenePattern
Reiff, C.; Mayer, C.; Müller, M.
2008-01-01
NuGO, the European Nutrigenomics Organization, utilizes 31 powerful computers for, e.g., data storage and analysis. These so-called black boxes (NBXses) are located at the sites of different partners. NuGO decided to use GenePattern as the preferred genomic analysis tool on each NBX. To handle the custom made Affymetrix NuGO arrays, new NuGO modules are added to GenePattern. These NuGO modules execute the latest Bioconductor version ensuring up-to-date annotations and access to the latest scientific developments. The following GenePattern modules are provided by NuGO: NuGOArrayQualityAnalysis for comprehensive quality control, NuGOExpressionFileCreator for import and normalization of data, LimmaAnalysis for identification of differentially expressed genes, TopGoAnalysis for calculation of GO enrichment, and GetResultForGo for retrieval of information on genes associated with specific GO terms. All together, these NuGO modules allow comprehensive, up-to-date, and user friendly analysis of Affymetrix data. A special feature of the NuGO modules is that for analysis they allow the use of either the standard Affymetrix or the MBNI custom CDF-files, which remap probes based on current knowledge. In both cases a .chip-file is created to enable GSEA analysis. The NuGO GenePattern installations are distributed as binary Ubuntu (.deb) packages via the NuGO repository. PMID:19034553
NuGO contributions to GenePattern.
De Groot, P J; Reiff, C; Mayer, C; Müller, M
2008-12-01
NuGO, the European Nutrigenomics Organization, utilizes 31 powerful computers for, e.g., data storage and analysis. These so-called black boxes (NBXses) are located at the sites of different partners. NuGO decided to use GenePattern as the preferred genomic analysis tool on each NBX. To handle the custom made Affymetrix NuGO arrays, new NuGO modules are added to GenePattern. These NuGO modules execute the latest Bioconductor version ensuring up-to-date annotations and access to the latest scientific developments. The following GenePattern modules are provided by NuGO: NuGOArrayQualityAnalysis for comprehensive quality control, NuGOExpressionFileCreator for import and normalization of data, LimmaAnalysis for identification of differentially expressed genes, TopGoAnalysis for calculation of GO enrichment, and GetResultForGo for retrieval of information on genes associated with specific GO terms. All together, these NuGO modules allow comprehensive, up-to-date, and user friendly analysis of Affymetrix data. A special feature of the NuGO modules is that for analysis they allow the use of either the standard Affymetrix or the MBNI custom CDF-files, which remap probes based on current knowledge. In both cases a .chip-file is created to enable GSEA analysis. The NuGO GenePattern installations are distributed as binary Ubuntu (.deb) packages via the NuGO repository.
[Phenotype-genotype correlation analysis of 12 cases with Angelman/Prader-Willi syndrome].
Chen, Chen; Peng, Ying; Xia, Yan; Li, Haoxian; Zhu, Huimin; Pan, Qian; Yin, Fei; Wu, Lingqian
2014-12-01
To investigate the genotype-phenotype correlation in patients with Angelman syndrome/Prader-Willi syndrome (AS/PWS) and assess the application value of high-resolution single nucleotide polymorphism microarrays (SNP array) for such diseases. Twelve AS/PWS patients were diagnosed through SNP array, fluorescence in situ hybridization (FISH) and karyotype analysis. Clinical characteristics were analyzed. Deletions ranging from 4.8 Mb to 7.0 Mb on chromosome 15q11.2-13 were detected in 11 patients. Uniparental disomy (UPD) was detected in only 1 patient. Patients with deletions could be divided into 2 groups, including 7 cases with class I and 4 with class II. The two groups however had no significant phenotypic difference. The UPD patient had relatively better development and language ability. Deletions of 6 patients were confirmed by FISH to be de novo in origin. The risk to their sibs was determined to be less than 1%. The phenotypic differences between AS/PWS patients with class I and class II deletion need to be further studied. SNP array is useful in detecting and distinguishing patients with deletion or UPD. This method may be applied for studying the genotype-phenotype association and the mechanism underlying AS/PWS.
Evaluation of Genomic Instability in the Abnormal Prostate
2006-12-01
array CGH maps copy number aberrations relative to the genome sequence by using arrays of BAC or cDNA clones as the hybridization target instead of...data produced from these analyses complicate the interpretation of results. For these reasons, and as outlined by Davies et al., 22 it is desirable...There have been numerous studies of these abnormalities and several techniques, including 9 chromosome painting, array CGH and SNP arrays, have
Sanges, Remo; Cordero, Francesca; Calogero, Raffaele A
2007-12-15
OneChannelGUI is an add-on Bioconductor package providing a new set of functions extending the capability of the affylmGUI package. This library provides a graphical interface (GUI) for Bioconductor libraries to be used for quality control, normalization, filtering, statistical validation and data mining for single channel microarrays. Affymetrix 3' expression (IVT) arrays as well as the new whole transcript expression arrays, i.e. gene/exon 1.0 ST, are currently implemented. oneChannelGUI is available for most platforms on which R runs, i.e. Windows and Unix-like machines. http://www.bioconductor.org/packages/2.0/bioc/html/oneChannelGUI.html
Chang, Melanie L.; Yokoyama, Jennifer S.; Branson, Nick; Dyer, Donna J.; Hitte, Christophe; Overall, Karen L.
2009-01-01
Until recently, canine genetic research has not focused on population structure within breeds, which may confound the results of case–control studies by introducing spurious correlations between phenotype and genotype that reflect population history. Intrabreed structure may exist when geographical origin or divergent selection regimes influence the choices of potential mates for breeding dogs. We present evidence for intrabreed stratification from a genome-wide marker survey in a sample of unrelated dogs. We genotyped 76 Border Collies, 49 Australian Shepherds, 17 German Shepherd Dogs, and 17 Portuguese Water Dogs for our primary analyses using Affymetrix Canine v2.0 single-nucleotide polymorphism (SNP) arrays. Subsets of autosomal markers were examined using clustering algorithms to facilitate assignment of individuals to populations and estimation of the number of populations represented in the sample. SNPs passing stringent quality control filters were employed for explicitly phylogenetic analyses reconstructing relationships between individuals using maximum parsimony and Bayesian methods. We used simulation studies to explore the possible effects of intrabreed stratification on genome-wide association studies. These analyses demonstrate significant stratification in at least one of our primary breeds of interest, the Border Collie. Demographic and pedigree data suggest that this population substructure may result from geographic isolation or divergent selection regimes practiced by breeders with different breeding program goals. Simulation studies indicate that such stratification could result in false discovery rates significant enough to confound genome-wide association analyses. Intrabreed stratification should be accounted for when designing and interpreting the results of case–control association studies using purebred dogs.
ITGB5 and AGFG1 variants are associated with severity of airway responsiveness.
Himes, Blanca E; Qiu, Weiliang; Klanderman, Barbara; Ziniti, John; Senter-Sylvia, Jody; Szefler, Stanley J; Lemanske, Robert F; Zeiger, Robert S; Strunk, Robert C; Martinez, Fernando D; Boushey, Homer; Chinchilli, Vernon M; Israel, Elliot; Mauger, David; Koppelman, Gerard H; Nieuwenhuis, Maartje A E; Postma, Dirkje S; Vonk, Judith M; Rafaels, Nicholas; Hansel, Nadia N; Barnes, Kathleen; Raby, Benjamin; Tantisira, Kelan G; Weiss, Scott T
2013-08-28
Airway hyperresponsiveness (AHR), a primary characteristic of asthma, involves increased airway smooth muscle contractility in response to certain exposures. We sought to determine whether common genetic variants were associated with AHR severity. A genome-wide association study (GWAS) of AHR, quantified as the natural log of the dosage of methacholine causing a 20% drop in FEV1, was performed with 994 non-Hispanic white asthmatic subjects from three drug clinical trials: CAMP, CARE, and ACRN. Genotyping was performed on Affymetrix 6.0 arrays, and imputed data based on HapMap Phase 2 were used to measure the association of SNPs with AHR using a linear regression model. Replication of primary findings was attempted in 650 white subjects from DAG, and 3,354 white subjects from LHS. Evidence that the top SNPs were eQTL of their respective genes was sought using expression data available for 419 white CAMP subjects. The top primary GWAS associations were in rs848788 (P-value 7.2E-07) and rs6731443 (P-value 2.5E-06), located within the ITGB5 and AGFG1 genes, respectively. The AGFG1 result replicated at a nominally significant level in one independent population (LHS P-value 0.012), and the SNP had a nominally significant unadjusted P-value (0.0067) for being an eQTL of AGFG1. Based on current knowledge of ITGB5 and AGFG1, our results suggest that variants within these genes may be involved in modulating AHR. Future functional studies are required to confirm that our associations represent true biologically significant findings.
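The abstract above describes the phenotype as the natural log of the methacholine dose causing a 20% drop in FEV1, tested against SNPs with a linear regression model. The following is a minimal sketch of that idea for a single SNP. It assumes additive allele dosage coding (0, 1, 2) and no covariates; the abstract does not specify either, and the data and function name are hypothetical.

```python
import math


def ols_slope_intercept(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical data: allele dosage and provocative dose per subject.
dosage = [0, 0, 1, 1, 2, 2]
pc20 = [4.0, 2.0, 2.0, 1.0, 1.0, 0.5]

# Phenotype as defined in the abstract: natural log of the dose.
ln_pc20 = [math.log(dose) for dose in pc20]

slope, intercept = ols_slope_intercept(dosage, ln_pc20)
print(f"change in ln(dose) per allele = {slope:.4f}")
print(f"fold change in dose per allele = {math.exp(slope):.2f}")
```

Because the phenotype is on the log scale, exponentiating the slope gives the multiplicative change in provocative dose per additional allele; in this made-up example each allele halves the dose.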
Homozygous deletion in TUSC3 causing syndromic intellectual disability: a new patient.
Loddo, Sara; Parisi, Valentina; Doccini, Viola; Filippi, Tiziana; Bernardini, Laura; Brovedani, Paola; Ricci, Federica; Novelli, Antonio; Battaglia, Agatino
2013-08-01
Defects in the TUSC3 gene have been identified in individuals with nonsyndromic autosomal recessive intellectual disability (ARID), due to either point mutations or intragenic deletions. We report on a boy with a homozygous microdeletion 8p22, sizing 203 kb, encompassing the first exon of the TUSC3 gene, detected by SNP-array analysis (Human Gene Chip 6.0; Affymetrix). Both nonconsanguineous parents come from a small Sicilian village and were heterozygous carriers of the microdeletion. The propositus had a few dysmorphic features and a moderate cognitive impairment. Verbal communication was impaired, with an inappropriate phonetic inventory, important phono-articulatory distortions, and bucco-phonatory dyspraxia. Comprehension was possible for simple sentences. Behavior was characterized by motor instability, high tendency to irritability and distractibility, anxiety traits, and an oppositional-defiant disorder. His parents were of normal intelligence. TUSC3 is thought to encode a subunit of the endoplasmic reticulum-bound oligosaccharyltransferase complex that catalyzes a pivotal step in the protein N-glycosylation process. TUSC3 has been recently reported as a member of the plasma membrane Mg(2+) transport system, with a possible involvement in learning abilities, working memory and short- and long-term memory. This is the third family in which a deletion has been described. Although the pathogenic mechanism has not been clarified yet, our report argues for a more prominent role of TUSC3 in the etiology of intellectual disability and that deletions encompassing this gene could be more common than expected. Copyright © 2013 Wiley Periodicals, Inc.
Perez, Marco V; Hoffmann, Thomas J; Tang, Hua; Thornton, Timothy; Stefanick, Marcia L; Larson, Joseph C; Kooperberg, Charles; Reiner, Alex P; Caan, Bette; Iribarren, Carlos; Risch, Neil
2013-09-01
Atrial fibrillation (AF) is the most common arrhythmia in women and is associated with higher rates of stroke and death. Rates of AF are lower in African American subjects compared with European Americans, suggesting European ancestry could contribute to AF risk. The Women's Health Initiative (WHI) Observational Study (OS) followed up 93,676 women since the mid 1990s for various cardiovascular outcomes including AF. Multivariate Cox hazard regression analysis was used to measure the association between African American race and incident AF. A total of 8,119 African American women from the WHI randomized clinical trials and OS were genotyped on the Affymetrix Human SNP Array 6.0. Genome-wide ancestry and previously reported single nucleotide polymorphisms associated with AF in European cohorts were tested for association with AF using multivariate logistic regression analyses. Self-reported African American race was associated with lower rates of AF (hazard ratio 0.43, 95% CI 0.32-0.60) in the OS, independent of demographic and clinical risk factors. In the genotyped cohort, there were 558 women with AF. By contrast, genome-wide European ancestry was not associated with AF. None of the single nucleotide polymorphisms previously associated with AF in European populations, including rs2200733, were associated with AF in the WHI African American cohort. African American race is significantly and inversely correlated with AF in postmenopausal women. The etiology of this association remains unclear and may be related to unidentified environmental differences. Larger studies are necessary to identify genetic determinants of AF in African Americans. © 2013.
Kim, Jae-Jung; Hong, Young Mi; Sohn, Saejung; Jang, Gi Young; Ha, Kee-Soo; Yun, Sin Weon; Han, Myung Ki; Lee, Kyung-Yil; Song, Min Seob; Lee, Hyoung Doo; Kim, Dong Soo; Lee, Jong-Eun; Shin, Eun-Soon; Jang, Ji-Hyun; Lee, Yeon-Su; Kim, Sook-Young; Lee, Jong-Young; Han, Bok-Ghee; Wu, Jer-Yuarn; Kim, Kwi-Joo; Park, Young-Mi; Seo, Eul-Joo; Park, In-Sook; Lee, Jong-Keuk
2011-05-01
Kawasaki disease (KD) is an acute self-limited vasculitis of infants and children that manifests as fever and signs of mucocutaneous inflammation. Coronary artery aneurysms develop in approximately 15-25% of untreated children. Although the etiology of KD is largely unknown, epidemiologic data suggest the importance of genetic factors in the susceptibility to KD. In order to identify genetic variants that influence KD susceptibility, we performed a genome-wide association study (GWAS) using Affymetrix SNP array 6.0 in 186 Korean KD patients and 600 healthy controls; 18 and 26 genomic regions with one or more sequence variants were associated with KD and KD with coronary artery lesions (CALs), respectively (p < 1 × 10(-5)). Of these, one locus on chromosome 1p31 (rs527409) was replicated in 266 children with KD and 600 normal controls (odds ratio [OR] = 2.90, 95% confidence interval [CI] = 1.85-4.54, P (combined) = 1.46 × 10(-6)); and a PELI1 locus on chromosome 2p13.3 (rs7604693) was replicated in 86 KD patients with CALs and 600 controls (OR = 2.70, 95% CI = 1.77-4.12, P (combined) = 2.00 × 10(-6)). These results implicate a locus in the 1p31 region and the PELI1 gene locus in the 2p13.3 region as susceptibility loci for KD and CALs, respectively.
Alvarado, David M; Aferol, Hyuliya; McCall, Kevin; Huang, Jason B; Techy, Matthew; Buchan, Jillian; Cady, Janet; Gonzales, Patrick R; Dobbs, Matthew B; Gurnett, Christina A
2010-07-09
Clubfoot is a common musculoskeletal birth defect for which few causative genes have been identified. To identify the genes responsible for isolated clubfoot, we screened for genomic copy-number variants with the Affymetrix Genome-wide Human SNP Array 6.0. A recurrent chromosome 17q23.1q23.2 microduplication was identified in 3 of 66 probands with familial isolated clubfoot. The chromosome 17q23.1q23.2 microduplication segregated with autosomal-dominant clubfoot in all three families but with reduced penetrance. Mild short stature was common and one female had developmental hip dysplasia. Subtle skeletal abnormalities consisted of broad and shortened metatarsals and calcanei, small distal tibial epiphyses, and thickened ischia. Several skeletal features were opposite to those described in the reciprocal chromosome 17q23.1q23.2 microdeletion syndrome associated with developmental delay and cardiac and limb abnormalities. Of note, during our study, we also identified a microdeletion at the locus in a sibling pair with isolated clubfoot. The chromosome 17q23.1q23.2 region contains the T-box transcription factor TBX4, a likely target of the bicoid-related transcription factor PITX1 previously implicated in clubfoot etiology. Our result suggests that this chromosome 17q23.1q23.2 microduplication is a relatively common cause of familial isolated clubfoot and provides strong evidence linking clubfoot etiology to abnormal early limb development. Copyright 2010 The American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.
Abbey, Darren; Hickman, Meleah; Gresham, David; Berman, Judith
2011-01-01
Phenotypic diversity can arise rapidly through loss of heterozygosity (LOH) or by the acquisition of copy number variations (CNV) spanning whole chromosomes or shorter contiguous chromosome segments. In Candida albicans, a heterozygous diploid yeast pathogen with no known meiotic cycle, homozygosis and aneuploidy alter clinical characteristics, including drug resistance. Here, we developed a high-resolution microarray that simultaneously detects ∼39,000 single nucleotide polymorphism (SNP) alleles and ∼20,000 copy number variation loci across the C. albicans genome. An important feature of the array analysis is a computational pipeline that determines SNP allele ratios based upon chromosome copy number. Using the array and analysis tools, we constructed a haplotype map (hapmap) of strain SC5314 to assign SNP alleles to specific homologs, and we used it to follow the acquisition of loss of heterozygosity (LOH) and copy number changes in a series of derived laboratory strains. This high-resolution SNP/CGH microarray and the associated hapmap facilitated the phasing of alleles in lab strains and revealed detrimental genome changes that arose frequently during molecular manipulations of laboratory strains. Furthermore, it provided a useful tool for rapid, high-resolution, and cost-effective characterization of changes in allele diversity as well as changes in chromosome copy number in new C. albicans isolates. PMID:22384363
Gao, Z J; Jiang, Q; Cheng, D Z; Yan, X X; Chen, Q; Xu, K M
2016-10-02
Objective: To evaluate the application of single nucleotide polymorphism (SNP)-microarray and target gene sequencing technology in the clinical molecular genetic diagnosis of unexplained intellectual disability (ID) or developmental delay (DD). Method: Patients with ID or DD were recruited in the Department of Neurology, Affiliated Children's Hospital of Capital Institute of Pediatrics between September 2015 and February 2016. The intellectual assessment of the patients was performed using 0-6-year-old pediatric examination table of neuropsychological development or Wechsler intelligence scale (>6 years). Patients with a DQ less than 49 or IQ less than 51 were included in this study. The patients were scanned by SNP-array for detection of genomic copy number variations (CNV), and the revealed genomic imbalance was confirmed by quantitative real time-PCR. Candidate gene mutation screening was carried out by target gene sequencing technology. Causal mutations or likely pathogenic variants were verified by polymerase chain reaction and direct sequencing. Result: There were 15 children with ID or DD enrolled, 9 males and 6 females. The age of these patients was 7 months-16 years and 9 months. SNP-array revealed that two of the 15 patients had genomic CNV. Both CNV were de novo micro deletions, one involved 11q24.1q25 and the other micro deletion located on 21q22.2q22.3. Both micro deletions were proved to have a clinical significance due to their association with ID, brain DD, unusual faces etc. by querying Decipher database. Thirteen patients with negative findings in SNP-array were consequently examined with target gene sequencing technology, genotype-phenotype correlation analysis and genetic analysis. Five patients were diagnosed with monogenic disorder, two were diagnosed with suspected genetic disorder and six were still negative.
Conclusion: Sequential use of SNP-array and target gene sequencing technology can significantly increase the molecular genetic etiologic diagnosis rate of the patients with unexplained ID or DD. Combined use of these technologies can serve as a useful examinational method in assisting differential diagnosis of children with unexplained ID or DD.
Optimal Design of Low-Density SNP Arrays for Genomic Prediction: Algorithm and Applications
Wu, Xiao-Lin; Xu, Jiaqi; Feng, Guofei; Wiggans, George R.; Taylor, Jeremy F.; He, Jun; Qian, Changsong; Qiu, Jiansheng; Simpson, Barry; Walker, Jeremy; Bauck, Stewart
2016-01-01
Low-density (LD) single nucleotide polymorphism (SNP) arrays provide a cost-effective solution for genomic prediction and selection, but algorithms and computational tools are needed for the optimal design of LD SNP chips. A multiple-objective, local optimization (MOLO) algorithm was developed for design of optimal LD SNP chips that can be imputed accurately to medium-density (MD) or high-density (HD) SNP genotypes for genomic prediction. The objective function facilitates maximization of non-gap map length and system information for the SNP chip, and the latter is computed either as locus-averaged (LASE) or haplotype-averaged Shannon entropy (HASE) and adjusted for uniformity of the SNP distribution. HASE performed better than LASE with ≤1,000 SNPs, but required considerably more computing time. Nevertheless, the differences diminished when >5,000 SNPs were selected. Optimization was accomplished conditionally on the presence of SNPs that were obligated to each chromosome. The frame location of SNPs on a chip can be either uniform (evenly spaced) or non-uniform. For the latter design, a tunable empirical Beta distribution was used to guide location distribution of frame SNPs such that both ends of each chromosome were enriched with SNPs. The SNP distribution on each chromosome was finalized through the objective function that was locally and empirically maximized. This MOLO algorithm was capable of selecting a set of approximately evenly-spaced and highly-informative SNPs, which in turn led to increased imputation accuracy compared with selection solely of evenly-spaced SNPs. Imputation accuracy increased with LD chip size, and imputation error rate was extremely low for chips with ≥3,000 SNPs. Assuming that genotyping or imputation error occurs at random, imputation error rate can be viewed as the upper limit for genomic prediction error. Our results show that about 25% of imputation error rate was propagated to genomic prediction in an Angus population. 
The utility of this MOLO algorithm was also demonstrated in a real application, in which a 6K SNP panel was optimized conditional on 5,260 obligatory SNP selected based on SNP-trait association in U.S. Holstein animals. With this MOLO algorithm, both imputation error rate and genomic prediction error rate were minimal. PMID:27583971
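The locus-averaged Shannon entropy (LASE) mentioned in the abstract above can be illustrated with a minimal sketch. The abstract does not give the exact formula, so this assumes biallelic SNPs coded as alternate-allele counts (0, 1, 2) and takes the entropy of the allele frequencies at each locus; the function names are hypothetical and the uniformity adjustment described in the abstract is not modeled.

```python
import math


def locus_entropy(genotypes):
    """Shannon entropy (in bits) of allele frequencies at one biallelic SNP.

    genotypes: list of alternate-allele counts (0, 1 or 2); None marks a
    missing call. This coding is an assumption, not taken from the paper.
    """
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0
    p = sum(called) / (2 * len(called))  # alternate-allele frequency
    # Terms with zero frequency contribute nothing to the entropy.
    return -sum(q * math.log2(q) for q in (p, 1 - p) if q > 0)


def locus_averaged_entropy(loci):
    """Mean per-locus entropy over a candidate SNP set (list of loci)."""
    return sum(locus_entropy(locus) for locus in loci) / len(loci)


# Hypothetical toy data: one maximally informative SNP, one monomorphic SNP.
candidate_set = [[0, 1, 2, 1], [0, 0, 0]]
print(locus_averaged_entropy(candidate_set))
```

Under this assumption a SNP with allele frequency 0.5 contributes one bit and a monomorphic SNP contributes nothing, which is why an entropy-based objective favors informative markers over evenly spaced but uninformative ones.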
USDA-ARS's Scientific Manuscript database
High-density single nucleotide polymorphism (SNP) genotyping chips are a powerful tool for studying genomic patterns of diversity, inferring ancestral relationships among individuals in populations and studying marker-trait associations in mapping experiments. We developed a genotyping array includ...
Xu, Lingyang; Hou, Yali; Bickhart, Derek M; Song, Jiuzhou; Liu, George E
2013-06-25
Copy number variations (CNVs) are gains and losses of genomic sequence between two individuals of a species when compared to a reference genome. The data from single nucleotide polymorphism (SNP) microarrays are now routinely used for genotyping, but they also can be utilized for copy number detection. Substantial progress has been made in array design and CNV calling algorithms and at least 10 comparison studies in humans have been published to assess them. In this review, we first survey the literature on existing microarray platforms and CNV calling algorithms. We then examine a number of CNV calling tools to evaluate their impacts using bovine high-density SNP data. Large incongruities in the results from different CNV calling tools highlight the need for standardizing array data collection, quality assessment and experimental validation. Only after careful experimental design and rigorous data filtering can the impacts of CNVs on both normal phenotypic variability and disease susceptibility be fully revealed.
Xu, Wang Hong; Zheng, Wei; Cai, Qiuyin; Cheng, Jia-Rong; Cai, Hui; Xiang, Yong-Bing; Shu, Xiao Ou
2008-01-01
We evaluated the interactive effect of polymorphisms in the sex hormone-binding globulin (SHBG) gene with soy isoflavones, tea consumption, and dietary fiber on endometrial cancer risk in a population-based, case-control study of 1,199 endometrial cancer patients and 1,212 controls. Genotyping of polymorphisms was performed by using TaqMan (Applied Biosystems, Foster City, CA) assays (rs6259) or the Affymetrix MegAllele Targeted Genotyping System (Affymetrix, Inc., US) (rs13894, rs858521, and rs2955617). Dietary information was obtained using a validated food frequency questionnaire. A logistic regression model was employed to compute adjusted odds ratios (ORs) and 95% confidence intervals (CIs). We found that the Asp(327)Asn (rs6259) polymorphism was associated with decreased risk of endometrial cancer, particularly among postmenopausal women (OR = 0.79, 95% CI = 0.62-1.00). This single nucleotide polymorphism (SNP) modified associations of soy isoflavones and tea consumption but not fiber intake with endometrial cancer, with the inverse association of soy intake and tea consumption being more evident for those with the Asp/Asp genotype of the SHBG gene at Asp(327)Asn (rs6259), particularly premenopausal women (P(interaction) = 0.06 and 0.02, respectively, for soy isoflavones and tea intake). This study suggests that gene-diet interaction may play an important role in the etiology of endometrial cancer risk.
A SNP genotyping array for hexaploid oat
USDA-ARS's Scientific Manuscript database
Recognizing a need in cultivated hexaploid oat (Avena sativa L.) for a reliable set of reference SNPs, we have developed a 6K BeadChip design containing 257 Infinium I and 5,486 Infinium II designs corresponding to 5,743 SNPs. Of those, 4,975 SNPs yielded successful assays after array manufacturing...
Larson, Wesley; Palti, Yniv; Gao, G.; Warheit, Kenneth I.; Seeb, James E.
2017-01-01
Natural-origin steelhead trout (Oncorhynchus mykiss (Walbaum, 1792)) in the Pacific Northwest, USA, are threatened by a number of factors including habitat destruction, disease, decline in marine survival, and a potential erosion of genetic viability due to introgression from hatchery strains. Our major goal was to use a recently developed SNP array containing ∼57 000 SNPs to identify a subset of SNPs that differentiate hatchery and natural-origin populations. We analyzed 35 765 polymorphic SNPs in nine populations of steelhead trout sampled from Puget Sound, Washington, USA. We then conducted two outlier tests and found 360 loci that were candidates for divergent selection between hatchery and natural-origin populations (mean FCT = 0.29, maximum = 0.65) and 595 SNPs that were candidates for selection among natural-origin populations (mean FST = 0.25, maximum = 0.51). Comparisons with a linkage map revealed that two chromosomes (Omy05 and Omy25) contained significantly more outliers than other chromosomes, suggesting that regions on Omy05 and Omy25 may be of adaptive significance. Our results highlight several advantages of the 57 000 SNP array as a tool for population and conservation genomics studies.
Saka, Ernur; Harrison, Benjamin J; West, Kirk; Petruska, Jeffrey C; Rouchka, Eric C
2017-12-06
Since the introduction of microarrays in 1995, researchers world-wide have used both commercial and custom-designed microarrays for understanding differential expression of transcribed genes. Public databases such as ArrayExpress and the Gene Expression Omnibus (GEO) have made millions of samples readily available. One main drawback to microarray data analysis involves the selection of probes to represent a specific transcript of interest, particularly in light of the fact that transcript-specific knowledge (notably alternative splicing) is dynamic in nature. We therefore developed a framework for reannotating and reassigning probe groups for Affymetrix® GeneChip® technology based on functional regions of interest. This framework addresses three issues of Affymetrix® GeneChip® data analyses: removing nonspecific probes, updating probe target mapping based on the latest genome knowledge and grouping probes into gene, transcript and region-based (UTR, individual exon, CDS) probe sets. Updated gene and transcript probe sets provide more specific analysis results based on current genomic and transcriptomic knowledge. The framework selects unique probes, aligns them to gene annotations and generates a custom Chip Description File (CDF). The analysis reveals only 87% of the Affymetrix® GeneChip® HG-U133 Plus 2 probes uniquely align to the current hg38 human assembly without mismatches. We also tested new mappings on the publicly available data series using rat and human data from GSE48611 and GSE72551 obtained from GEO, and illustrate that functional grouping allows for the subtle detection of regions of interest likely to have phenotypical consequences. Through reanalysis of the publicly available data series GSE48611 and GSE72551, we profiled the contribution of UTR and CDS regions to the gene expression levels globally. 
The comparison between region-based and gene-based results indicated that the expressed genes detected by gene-based and region-based CDFs show high consistency, and that region-based results allow detection of changes in transcript formation.
Roorkiwal, Manish; Jain, Ankit; Kale, Sandip M; Doddamani, Dadakhalandar; Chitikineni, Annapurna; Thudi, Mahendar; Varshney, Rajeev K
2018-04-01
To accelerate genomics research and molecular breeding applications in chickpea, a high-throughput SNP genotyping platform 'Axiom ® CicerSNP Array' has been designed, developed and validated. Screening of whole-genome resequencing data from 429 chickpea lines identified 4.9 million SNPs, from which a subset of 70 463 high-quality nonredundant SNPs was selected using different stringent filter criteria. This was further narrowed down to 61 174 SNPs based on p-convert score ≥0.3, of which 50 590 SNPs could be tiled on array. Among these tiled SNPs, a total of 11 245 SNPs (22.23%) were from the coding regions of 3673 different genes. The developed Axiom ® CicerSNP Array was used for genotyping two recombinant inbred line populations, namely ICCRIL03 (ICC 4958 × ICC 1882) and ICCRIL04 (ICC 283 × ICC 8261). Genotyping data reflected high success and polymorphic rate, with 15 140 (29.93%; ICCRIL03) and 20 018 (39.57%; ICCRIL04) polymorphic SNPs. High-density genetic maps comprising 13 679 SNPs spanning 1033.67 cM and 7769 SNPs spanning 1076.35 cM were developed for ICCRIL03 and ICCRIL04 populations, respectively. QTL analysis using multilocation, multiseason phenotyping data on these RILs identified 70 (ICCRIL03) and 120 (ICCRIL04) main-effect QTLs on genetic map. Higher precision and potential of this array is expected to advance chickpea genetics and breeding applications. © 2017 The Authors. Plant Biotechnology Journal published by Society for Experimental Biology and The Association of Applied Biologists and John Wiley & Sons Ltd.
USDA-ARS's Scientific Manuscript database
Background: In a previously reported genome-wide association study based on a high-density bovine SNP genotyping array, 8 SNP were nominally associated (P=0.003) with average daily gain (ADG) and 3 of these were also associated (P=0.002) with average daily feed intake (ADFI) in a population of c...
A Conductometric Indium Oxide Semiconducting Nanoparticle Enzymatic Biosensor Array
Lee, Dongjin; Ondrake, Janet; Cui, Tianhong
2011-01-01
We report a conductometric nanoparticle biosensor array to address the significant variation of electrical property in nanomaterial biosensors due to the random network nature of nanoparticle thin-film. Indium oxide and silica nanoparticles (SNP) are assembled selectively on the multi-site channel area of the resistors using layer-by-layer self-assembly. To demonstrate enzymatic biosensing capability, glucose oxidase is immobilized on the SNP layer for glucose detection. The sensor chip, packaged onto a ceramic pin grid array, is tested using a syringe-pump-driven feed and a multi-channel I–V measurement system. It is successfully demonstrated that glucose is detected in many different sensing sites within a chip, leading to concentration-dependent currents. The sensitivity has been found to be dependent on the channel length of the resistor, 4–12 nA/mM for channel lengths of 5–20 μm, while the apparent Michaelis-Menten constant is 20 mM. By using the sensor array, analytical data could be obtained with a single step of sample solution feeding. This work sheds light on the applicability of the developed nanoparticle microsensor array to multi-analyte sensors, novel bioassay platforms, and sensing components in a lab-on-a-chip. PMID:22163696
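The concentration-dependent current reported above can be illustrated with the standard Michaelis-Menten form. This is a minimal sketch: only the apparent constant of 20 mM comes from the abstract, while the saturation current `i_max_nA` and the function name are hypothetical placeholders.

```python
def michaelis_menten_current(conc_mM, i_max_nA, km_mM=20.0):
    """Sensor current (nA) at a given glucose concentration (mM).

    km_mM defaults to the apparent Michaelis-Menten constant quoted in the
    abstract; i_max_nA is a hypothetical saturation current, not a reported
    value.
    """
    return i_max_nA * conc_mM / (km_mM + conc_mM)


# With a hypothetical saturation current of 100 nA, the response reaches
# half of saturation when the concentration equals the apparent constant.
for conc in (0.0, 5.0, 20.0, 80.0):
    print(conc, michaelis_menten_current(conc, i_max_nA=100.0))
```

In this model the response is close to linear well below the apparent constant, with an initial slope of `i_max_nA / km_mM`, and it flattens as the concentration rises past it.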
DOE Office of Scientific and Technical Information (OSTI.GOV)
Revet, Ingrid; Huizenga, Gerda; Chan, Alvin
Neuroblastoma is an embryonal tumour of the peripheral sympathetic nervous system (SNS). One of the master regulator genes for peripheral SNS differentiation, the homeobox transcription factor PHOX2B, is mutated in familial and sporadic neuroblastomas. Here we report that inducible expression of PHOX2B in the neuroblastoma cell line SJNB-8 down-regulates MSX1, a homeobox gene important for embryonic neural crest development. Inducible expression of MSX1 in SJNB-8 caused inhibition of both cell proliferation and colony formation in soft agar. Affymetrix micro-array and Northern blot analysis demonstrated that MSX1 strongly up-regulated the Delta-Notch pathway genes DLK1, NOTCH3, and HEY1. In addition, the proneural gene NEUROD1 was down-regulated. Western blot analysis showed that MSX1 induction caused cleavage of the NOTCH3 protein to its activated form, further confirming activation of the Delta-Notch pathway. These experiments describe for the first time regulation of the Delta-Notch pathway by MSX1, and connect these genes to the PHOX2B oncogene, indicative of a role in neuroblastoma biology. Affymetrix micro-array analysis of a neuroblastic tumour series consisting of neuroblastomas and the more benign ganglioneuromas showed that MSX1, NOTCH3 and HEY1 are more highly expressed in ganglioneuromas. This suggests a block in differentiation of these tumours at distinct developmental stages or lineages.
2013-01-01
Background To detect genes correlated with hepatocellular carcinoma (HCC), we developed a triple combination array consisting of methylation array, gene expression array and single nucleotide polymorphism (SNP) array analysis. Methods A surgical specimen obtained from a 68-year-old female HCC patient was analyzed by triple combination array, which identified doublecortin domain-containing 2 (DCDC2) as a candidate tumor suppressor gene of HCC. Subsequently, samples from 48 HCC patients were evaluated for their DCDC2 methylation and expression status using methylation specific PCR (MSP) and semi-quantitative reverse transcriptase (RT) PCR, respectively. Then, we investigated the relationship between clinicopathological factors and methylation status of DCDC2. Results DCDC2 was revealed to be hypermethylated (methylation value 0.846, range 0–1.0) in cancer tissue, compared with adjacent normal tissue (0.212) by methylation array in the 68-year-old female patient. Expression array showed decreased expression of DCDC2 in cancerous tissue. SNP array showed that the copy number of chromosome 6p22.1, in which DCDC2 resides, was normal. MSP revealed hypermethylation of the promoter region of DCDC2 in 41 of the tumor samples. DCDC2 expression was significantly decreased in the cases with methylation (P = 0.048). Furthermore, the methylated cases revealed worse prognosis for overall survival than unmethylated cases (P = 0.048). Conclusions The present study indicates that triple combination array is an effective method to detect novel genes related to HCC. We propose that DCDC2 is a tumor suppressor gene of HCC. PMID:24034596
Genotype imputation in the domestic dog
Meurs, K. M.
2016-01-01
Application of imputation methods to accurately predict a dense array of SNP genotypes in the dog could provide an important supplement to current analyses of array-based genotyping data. Here, we developed a reference panel of 4,885,283 SNPs in 83 dogs across 15 breeds using whole genome sequencing. We used this panel to predict the genotypes of 268 dogs across three breeds with 84,193 SNP array-derived genotypes as inputs. We then (1) performed breed clustering of the actual and imputed data; (2) evaluated several reference panel breed combinations to determine an optimal reference panel composition; and (3) compared the accuracy of two commonly used software algorithms (Beagle and IMPUTE2). Breed clustering was well preserved in the imputation process across eigenvalues representing 75 % of the variation in the imputed data. Using Beagle with a target panel from a single breed, genotype concordance was highest using a multi-breed reference panel (92.4 %) compared to a breed-specific reference panel (87.0 %) or a reference panel containing no breeds overlapping with the target panel (74.9 %). This finding was confirmed using target panels derived from two other breeds. Additionally, using the multi-breed reference panel, genotype concordance was slightly higher with IMPUTE2 (94.1 %) compared to Beagle; Pearson correlation coefficients were slightly higher for both software packages (0.946 for Beagle, 0.961 for IMPUTE2). Our findings demonstrate that genotype imputation from SNP array-derived data to whole genome-level genotypes is both feasible and accurate in the dog with appropriate breed overlap between the target and reference panels. PMID:27129452
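The two accuracy measures quoted above, genotype concordance and the Pearson correlation between actual and imputed genotypes, can be computed as in the following minimal sketch. It assumes genotypes coded as alternate-allele counts (0, 1, 2) with None for missing calls; the function names and toy data are hypothetical and do not reproduce the study's pipeline or the output formats of Beagle or IMPUTE2.

```python
import math


def genotype_concordance(truth, imputed):
    """Fraction of sites where the imputed genotype equals the true one.

    Sites with a missing call (None) in either list are skipped.
    """
    pairs = [(t, i) for t, i in zip(truth, imputed)
             if t is not None and i is not None]
    return sum(t == i for t, i in pairs) / len(pairs)


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical toy genotypes for one dog at five SNPs.
truth = [0, 1, 2, 1, 0]
imputed = [0, 1, 2, 2, 0]
print(genotype_concordance(truth, imputed))
print(pearson_r(truth, imputed))
```

Concordance counts only exact matches, whereas the correlation gives partial credit when a heterozygote is imputed as an adjacent homozygote, so the two measures need not rank methods identically.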
Garinet, Simon; Néou, Mario; de La Villéon, Bruno; Faillot, Simon; Sakat, Julien; Da Fonseca, Juliana P; Jouinot, Anne; Le Tourneau, Christophe; Kamal, Maud; Luscap-Rondof, Windy; Boeva, Valentina; Gaujoux, Sebastien; Vidaud, Michel; Pasmant, Eric; Letourneur, Franck; Bertherat, Jérôme; Assié, Guillaume
2017-09-01
Pangenomic studies identified distinct molecular classes for many cancers, with major clinical applications. However, routine use requires cost-effective assays. We assessed whether targeted next-generation sequencing (NGS) could call chromosomal alterations and DNA methylation status. A training set of 77 tumors and a validation set of 449 (43 tumor types) were analyzed by targeted NGS and single-nucleotide polymorphism (SNP) arrays. Thirty-two tumors were analyzed by NGS after bisulfite conversion, and compared to methylation array or methylation-specific multiplex ligation-dependent probe amplification. Considering allelic ratios, correlation was strong between targeted NGS and SNP arrays (r = 0.88). In contrast, considering DNA copy number, for variations of one DNA copy, correlation was weaker between read counts and SNP array (r = 0.49). Thus, we generated TARGOMICs, optimized for detecting chromosome alterations by combining allelic ratios and read counts generated by targeted NGS. Sensitivity for calling normal, lost, and gained chromosomes was 89%, 72%, and 31%, respectively. Specificity was 81%, 93%, and 98%, respectively. These results were confirmed in the validation set. Finally, TARGOMICs could efficiently align and compute proportions of methylated cytosines from bisulfite-converted DNA from targeted NGS. In conclusion, beyond calling mutations, targeted NGS efficiently calls chromosome alterations and methylation status in tumors. A single run and minor design/protocol adaptations are sufficient. Optimizing targeted NGS should expand translation of genomics to clinical routine. Copyright © 2017 American Society for Investigative Pathology and the Association for Molecular Pathology. Published by Elsevier Inc. All rights reserved.
Tönjes, Anke; Koriath, Moritz; Schleinitz, Dorit; Dietrich, Kerstin; Böttcher, Yvonne; Rayner, Nigel W; Almgren, Peter; Enigk, Beate; Richter, Olaf; Rohm, Silvio; Fischer-Rosinsky, Antje; Pfeiffer, Andreas; Hoffmann, Katrin; Krohn, Knut; Aust, Gabriela; Spranger, Joachim; Groop, Leif; Blüher, Matthias; Kovacs, Peter; Stumvoll, Michael
2009-12-01
Recently, associations of several common genetic variants with height have been reported in different populations. We attempted to identify further variants associated with adult height in a self-contained population (the Sorbs in Eastern Germany) as discovery set. We performed a genome wide association study (GWAS) (approximately 390,000 genetic polymorphisms, Affymetrix gene arrays) on adult height in 929 Sorbian individuals. Subsequently, the best SNPs (P < 0.001) were taken forward to a meta-analysis together with two independent cohorts [Diabetes Genetics Initiative, British 1958 Birth Cohort, (58BC, publicly available)]. Furthermore, we genotyped our best signal for replication in two additional German cohorts (Leipzig, n = 1044 and Berlin, n = 1728). In the primary Sorbian GWAS, we identified 5 loci with a P-value < 10(-5) and 455 SNPs with P-value < 0.001. In the meta-analysis on those 455 SNPs, only two variants in GPR133 (rs1569019 and rs1976930; in LD with each other) retained a P-value at or below 10(-6) and were associated with height in the three cohorts individually. Upon replication, the SNP rs1569019 showed significant effects on height in the Leipzig cohort (P = 0.004, beta = 1.166) and in 577 men of the Berlin cohort (P = 0.049, beta = 1.127) though not in women. The combined analysis of all five cohorts (n = 6,687) resulted in a P-value of 4.7 x 10(-8) (beta = 0.949). In conclusion, our GWAS suggests novel loci influencing height. In view of the robust replication in five different cohorts, we propose GPR133 to be a novel gene associated with adult height.
Deka, Ranjan; Koller, Daniel L.; Lai, Dongbing; Indugula, Subba Rao; Sun, Guangyun; Woo, Daniel; Sauerbeck, Laura; Moomaw, Charles J.; Hornung, Richard; Connolly, E. Sander; Anderson, Craig; Rouleau, Guy; Meissner, Irene; Bailey-Wilson, Joan E.; Huston, John; Brown, Robert D.; Kleindorfer, Dawn O.; Flaherty, Matthew L.; Langefeld, Carl; Foroud, Tatiana; Broderick, Joseph P.
2010-01-01
Purpose To replicate the previous association of single nucleotide polymorphisms (SNPs) with risk of intracranial aneurysm (IA) and to examine the relationship of smoking with these variants and the risk of IA. Methods White probands with an IA from families with multiple affected members were identified by 26 clinical centers located throughout North America, New Zealand, and Australia. White controls free of stroke and IA were selected by random digit dialing from the Greater Cincinnati population. SNPs previously associated with IA on chromosomes 2, 8, and 9 were genotyped using a TaqMan assay or were included in the Affymetrix 6.0 array that was part of a genome-wide association study of 406 IA cases and 392 controls. Logistic regression modeling tested whether the association of replicated SNPs with IA was modulated by smoking. Results The strongest evidence of association with IA was found with the 8q SNP rs10958409 (genotypic P = 9.2 × 10(-5); allelic P = 1.3 × 10(-5); OR = 1.86, 95% CI: 1.40−2.47). We also replicated association with both SNPs on chromosome 9p, rs1333040 and rs10757278, but were not able to replicate the previously reported association of the two SNPs on chromosome 2q. Statistical testing showed a multiplicative relationship between the risk alleles and smoking with regard to the risk of IA. Conclusion Our data provide complementary evidence that the variants on chromosome 8q and 9p are associated with IA and that the risk of IA in patients with these variants is greatly increased with cigarette smoking. PMID:20190001
Copy Number Variation in Patients with Disorders of Sex Development Due to 46,XY Gonadal Dysgenesis
White, Stefan; Ohnesorg, Thomas; Notini, Amanda; Roeszler, Kelly; Hewitt, Jacqueline; Daggag, Hinda; Smith, Craig; Turbitt, Erin; Gustin, Sonja; van den Bergen, Jocelyn; Miles, Denise; Western, Patrick; Arboleda, Valerie; Schumacher, Valerie; Gordon, Lavinia; Bell, Katrina; Bengtsson, Henrik; Speed, Terry; Hutson, John; Warne, Garry; Harley, Vincent; Koopman, Peter; Vilain, Eric; Sinclair, Andrew
2011-01-01
Disorders of sex development (DSD), ranging in severity from mild genital abnormalities to complete sex reversal, represent a major concern for patients and their families. DSD are often due to disruption of the genetic programs that regulate gonad development. Although some genes have been identified in these developmental pathways, the causative mutations have not been identified in more than 50% of 46,XY DSD cases. We used the Affymetrix Genome-Wide Human SNP Array 6.0 to analyse copy number variation in 23 individuals with unexplained 46,XY DSD due to gonadal dysgenesis (GD). Here we describe three discrete changes in copy number that are the likely cause of the GD. Firstly, we identified a large duplication on the X chromosome that included DAX1 (NR0B1). Secondly, we identified a rearrangement that appears to affect a novel gonad-specific regulatory region in a known testis gene, SOX9. Surprisingly this patient lacked any signs of campomelic dysplasia, suggesting that the deletion affected expression of SOX9 only in the gonad. Functional analysis of potential SRY binding sites within this deleted region identified five putative enhancers, suggesting that sequences additional to the known SRY-binding TES enhancer influence human testis-specific SOX9 expression. Thirdly, we identified a small deletion immediately downstream of GATA4, supporting a role for GATA4 in gonad development in humans. These CNV analyses give new insights into the pathways involved in human gonad development and dysfunction, and suggest that rearrangements of non-coding sequences disturbing gene regulation may account for a significant proportion of DSD cases. PMID:21408189
Yang, T-L; Guo, Y; Li, S M; Li, S K; Tian, Q; Liu, Y-J; Deng, H-W
2013-02-01
Genomic copy number variations (CNVs) have been strongly implicated as important genetic factors for obesity. A recent genome-wide association study identified a novel variant, rs12444979, which is in high linkage disequilibrium with CNV 16p12.3, for association with obesity in Europeans. The aim of this study was to directly examine the relationship between the CNV 16p12.3 and obesity phenotypes, including body mass index (BMI) and body fat mass. Subjects were a multi-ethnic sample, including 2286 unrelated subjects from a European population and 1627 unrelated Han subjects from a Chinese population. Body fat mass was measured using dual energy X-ray absorptiometry. Using Affymetrix Genome-Wide Human SNP Array 6.0, we directly detected CNV 16p12.3, with a deletion frequency of 27.26% and 0.8% in the European and Chinese populations, respectively. We confirmed the significant association between this CNV and obesity (BMI: P=1.38 × 10(-2); body fat mass: P=2.13 × 10(-3)) in the European population. Lower copy numbers were associated with lower BMI and body fat mass, and the effect size was estimated to be 0.62 (BMI) and 1.41 (body fat mass), respectively. However, for the Chinese population, we did not observe a significant association signal, and the frequencies of this deletion CNV are quite different between the European and Chinese populations (P<0.001). Our findings are the first to suggest that CNV 16p12.3 might be ethnic specific and cause ethnic phenotypic diversity, which may provide some new clues into the understanding of the genetic architecture of obesity.
Bai, H; Sun, Y; Liu, N; Liu, Y; Xue, F; Li, Y; Xu, S; Ni, A; Ye, J; Chen, Y; Chen, J
2018-06-01
Beak deformity (crossed beaks) is found in several indigenous chicken breeds including Beijing-You studied here. Birds with deformed beaks have reduced feed intake and poor production performance. Recently, copy number variation (CNV) has been examined in many species and is recognized as a source of genetic variation, especially for disease phenotypes. In this study, to unravel the genetic mechanisms underlying beak deformity, we performed genome-wide CNV detection using Affymetrix chicken high-density 600K data on 48 deformed-beak and 48 normal birds using PennCNV. As a result, two and eight CNV regions (CNVRs) covering 0.32 and 2.45 Mb respectively on autosomes were identified in deformed-beak and normal birds respectively. Further RT-qPCR studies validated nine of the 10 CNVRs. The ratios of six CNVRs were significantly different between deformed-beak and normal birds (P < 0.01). Within these six regions, three and 21 known genes were identified in deformed-beak and normal birds respectively. Bioinformatics analysis showed that these genes were enriched in six GO terms and one KEGG pathway. Five candidate genes in the CNVRs were further validated using RT-qPCR. The expression of LRIG2 (leucine rich repeats and immunoglobulin like domains 2) was lower in birds with deformed beaks (P < 0.01). Therefore, the LRIG2 gene could be considered a key factor in view of its known functions and its potential roles in beak deformity. Overall, our results will be helpful for future investigations of the genomic structural variations underlying beak deformity in chickens. © 2018 Stichting International Foundation for Animal Genetics.
Sun, Miao; Li, Ning; Dong, Wu; Chen, Zugen; Liu, Qing; Xu, Yiming; He, Guang; Shi, Yongyong; Li, Xin; Hao, Jiajie; Luo, Yang; Shang, Dandan; Lv, Dan; Ma, Fen; Zhang, Dai; Hua, Rui; Lu, Chaoxia; Wen, Yaran; Cao, Lihua; Irvine, Alan D.; McLean, W.H. Irwin; Dong, Qi; Wang, Ming-Rong; Yu, Jun; He, Lin; Lo, Wilson H.Y.; Zhang, Xue
2009-01-01
Congenital generalized hypertrichosis terminalis (CGHT) is a rare condition characterized by universal excessive growth of pigmented terminal hairs and often accompanied with gingival hyperplasia. In the present study, we describe three Han Chinese families with autosomal-dominant CGHT and a sporadic case with extreme CGHT and gingival hyperplasia. We first did a genome-wide linkage scan in a large four-generation family. Our parametric multipoint linkage analysis revealed a genetic locus for CGHT on chromosome 17q24.2-q24.3. Further two-point linkage and haplotyping with microsatellite markers from the same chromosome region confirmed the genetic mapping and showed in all the families a microdeletion within the critical region that was present in all affected individuals but not in unaffected family members. We then carried out copy-number analysis with the Affymetrix Genome-Wide Human SNP Array 6.0 and detected genomic microdeletions of different sizes and with different breakpoints in the three families. We validated these microdeletions by real-time quantitative PCR and confirmed their perfect cosegregation with the disease phenotype in the three families. In the sporadic case, however, we found a de novo microduplication. Two-color interphase FISH analysis demonstrated that the duplication was inverted. These copy-number variations (CNVs) shared a common genomic region in which CNV is not reported in the public database and was not detected in our 434 unrelated Han Chinese normal controls. Thus, pathogenic copy-number mutations on 17q24.2-q24.3 are responsible for CGHT with or without gingival hyperplasia. Our work identifies CGHT as a genomic disorder. PMID:19463983
Siemiatkowska, Anna M.; Arimadyo, Kentar; Moruz, Luminita M.; Astuti, Galuh D.N.; de Castro-Miro, Marta; Zonneveld, Marijke N.; Strom, Tim M.; de Wijs, Ilse J.; Hoefsloot, Lies H.; Faradz, Sultana M.H.; Cremers, Frans P.M.; den Hollander, Anneke I.
2011-01-01
Purpose Retinitis pigmentosa (RP) is a clinically and genetically heterogeneous retinal disorder. Despite tremendous knowledge about the genes involved in RP, little is known about the genetic causes of RP in Indonesia. Here, we aim to identify the molecular genetic causes underlying RP in a small cohort of Indonesian patients, using genome-wide homozygosity mapping. Methods DNA samples from affected and healthy individuals from 14 Indonesian families segregating autosomal recessive, X-linked, or isolated RP were collected. Homozygosity mapping was conducted using Illumina 6k or Affymetrix 5.0 single nucleotide polymorphism (SNP) arrays. Known autosomal recessive RP (arRP) genes residing in homozygous regions and X-linked RP genes were sequenced for mutations. Results In ten out of the 14 families, homozygous regions were identified that contained genes known to be involved in the pathogenesis of RP. Sequence analysis of these genes revealed seven novel homozygous mutations in ATP-binding cassette, sub-family A, member 4 (ABCA4), crumbs homolog 1 (CRB1), eyes shut homolog (Drosophila) (EYS), c-mer proto-oncogene tyrosine kinase (MERTK), nuclear receptor subfamily 2, group E, member 3 (NR2E3) and phosphodiesterase 6A, cGMP-specific, rod, alpha (PDE6A), all segregating in the respective families. No mutations were identified in the X-linked genes retinitis pigmentosa GTPase regulator (RPGR) and retinitis pigmentosa 2 (X-linked recessive; RP2). Conclusions Homozygosity mapping is a powerful tool to identify the genetic defects underlying RP in the Indonesian population. Compared to studies involving patients from other populations, the same genes appear to be implicated in the etiology of recessive RP in Indonesia, although all mutations that were discovered are novel and as such may be unique for this population. PMID:22128245
Genome-wide association study of sporadic brain arteriovenous malformations.
Weinsheimer, Shantel; Bendjilali, Nasrine; Nelson, Jeffrey; Guo, Diana E; Zaroff, Jonathan G; Sidney, Stephen; McCulloch, Charles E; Al-Shahi Salman, Rustam; Berg, Jonathan N; Koeleman, Bobby P C; Simon, Matthias; Bostroem, Azize; Fontanella, Marco; Sturiale, Carmelo L; Pola, Roberto; Puca, Alfredo; Lawton, Michael T; Young, William L; Pawlikowska, Ludmila; Klijn, Catharina J M; Kim, Helen
2016-09-01
The pathogenesis of sporadic brain arteriovenous malformations (BAVMs) remains unknown, but studies suggest a genetic component. We estimated the heritability of sporadic BAVM and performed a genome-wide association study (GWAS) to investigate association of common single nucleotide polymorphisms (SNPs) with risk of sporadic BAVM in the international, multicentre Genetics of Arteriovenous Malformation (GEN-AVM) consortium. The Caucasian discovery cohort included 515 BAVM cases and 1191 controls genotyped using Affymetrix genome-wide SNP arrays. Genotype data were imputed to 1000 Genomes Project data, and well-imputed SNPs (>0.01 minor allele frequency) were analysed for association with BAVM. 57 top BAVM-associated SNPs (51 SNPs with p<10(-05) or p<10(-04) in candidate pathway genes, and 6 candidate BAVM SNPs) were tested in a replication cohort including 608 BAVM cases and 744 controls. The estimated heritability of BAVM was 17.6% (SE 8.9%, age and sex-adjusted p=0.015). None of the SNPs were significantly associated with BAVM in the replication cohort after correction for multiple testing. 6 SNPs had a nominal p<0.1 in the replication cohort and map to introns in EGFEM1P, SP4 and CDKAL1 or near JAG1 and BNC2. Of the 6 candidate SNPs, 2 in ACVRL1 and MMP3 had a nominal p<0.05 in the replication cohort. We performed the first GWAS of sporadic BAVM in the largest BAVM cohort assembled to date. No GWAS SNPs were replicated, suggesting that common SNPs do not contribute strongly to BAVM susceptibility. However, heritability estimates suggest a modest but significant genetic contribution. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://www.bmj.com/company/products-services/rights-and-licensing/
Bull, Laura N; Hu, Donglei; Shah, Sohela; Temple, Luisa; Silva, Karla; Huntsman, Scott; Melgar, Jennifer; Geiser, Mary T; Sanford, Ukina; Ortiz, Juan A; Lee, Richard H; Kusanovic, Juan P; Ziv, Elad; Vargas, Juan E
2015-01-01
In the Americas, women with Indigenous American ancestry are at increased risk of intrahepatic cholestasis of pregnancy (ICP), relative to women of other ethnicities. We hypothesized that ancestry-related genetic factors contribute to this increased risk. We collected clinical and laboratory data, and performed biochemical assays on samples from U.S. Latinas and Chilean women, with and without ICP. The study sample included 198 women with ICP (90 from California, U.S., and 108 from Chile) and 174 pregnant control women (69 from California, U.S., and 105 from Chile). SNP genotyping was performed using Affymetrix arrays. We compared overall genetic ancestry between cases and controls, and used a genome-wide admixture mapping approach to screen for ICP susceptibility loci. We identified commonalities and differences in features of ICP between the 2 countries and determined that cases had a greater proportion of Indigenous American ancestry than did controls (p = 0.034). We performed admixture mapping, taking country of origin into account, and identified one locus for which Native American ancestry was associated with increased risk of ICP at a genome-wide level of significance (P = 3.1 x 10(-5), Pcorrected = 0.035). This locus has an odds ratio of 4.48 (95% CI: 2.21-9.06) for 2 versus zero Indigenous American chromosomes. This locus lies on chromosome 2, with a 10 Mb 95% confidence interval which does not contain any previously identified hereditary 'cholestasis genes.' Our results indicate that genetic factors contribute to the risk of developing ICP in the Americas, and support the utility of clinical and genetic studies of ethnically mixed populations for increasing our understanding of ICP.
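The odds ratio and confidence interval reported above (4.48, 95% CI: 2.21-9.06) can be sanity-checked on the log scale. The sketch below assumes the interval is a symmetric Wald-type interval around log(OR), which the abstract does not state; under that assumption the geometric mean of the bounds should land near the point estimate, and the interval width gives an implied standard error of log(OR). The function name is hypothetical.

```python
import math


def log_scale_summary(ci_low, ci_high, z=1.96):
    """Summarise a confidence interval assumed symmetric on the log scale.

    Returns (geometric midpoint of the bounds, implied SE of log(OR)).
    z=1.96 corresponds to a two-sided 95% interval.
    """
    midpoint = math.sqrt(ci_low * ci_high)
    se_log_or = (math.log(ci_high) - math.log(ci_low)) / (2 * z)
    return midpoint, se_log_or


# Interval quoted in the abstract above.
midpoint, se_log_or = log_scale_summary(2.21, 9.06)
print(f"midpoint={midpoint:.3f}, implied SE of log(OR)={se_log_or:.3f}")
```

The midpoint agrees with the reported odds ratio of 4.48 to within rounding of the published bounds, which is consistent with (though not proof of) a log-scale interval.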
Medintz, Igor; Wong, Wendy W.; Berti, Lorenzo; Shiow, Lawrence; Tom, Jennifer; Scherer, James; Sensabaugh, George; Mathies, Richard A.
2001-01-01
An assay is described for high-throughput single nucleotide polymorphism (SNP) genotyping on a microfabricated capillary array electrophoresis (CAE) microchip. The assay targets the three common variants at the HFE locus associated with the genetic disease hereditary hemochromatosis (HHC). The assay employs allele-specific PCR (ASPCR) for the C282Y (845g->a), H63D (187c->g), and S65C (193a->t) variants using fluorescently-labeled energy-transfer (ET) allele-specific primers. Using a 96-channel radial CAE microplate, the labeled ASPCR products generated from 96 samples in a reference Caucasian population are simultaneously separated with single-base-pair resolution and genotyped in under 10 min. Detection is accomplished with a laser-excited rotary four-color fluorescence scanner. The allele-specific amplicons are differentiated on the basis of both their size and the color of the label emission. This study is the first demonstration of the combined use of ASPCR with ET primers and microfabricated radial CAE microplates to perform multiplex SNP analyses in a clinically relevant population. PMID:11230165
SNP-array lesions in core binding factor acute myeloid leukemia
Duployez, Nicolas; Boudry-Labis, Elise; Roumier, Christophe; Boissel, Nicolas; Petit, Arnaud; Geffroy, Sandrine; Helevaut, Nathalie; Celli-Lebras, Karine; Terré, Christine; Fenneteau, Odile; Cuccuini, Wendy; Luquet, Isabelle; Lapillonne, Hélène; Lacombe, Catherine; Cornillet, Pascale; Ifrah, Norbert; Dombret, Hervé; Leverger, Guy; Jourdan, Eric; Preudhomme, Claude
2018-01-01
Acute myeloid leukemia (AML) with t(8;21) and inv(16), together referred to as core binding factor (CBF)-AML, are recognized as unique entities. Both rearrangements share a common pathophysiology, the disruption of the CBF, and a relatively good prognosis. Experiments have demonstrated that CBF rearrangements were insufficient to induce leukemia, implying the existence of cooperating events. To explore these aberrations, we performed single nucleotide polymorphism (SNP)-array in a well-annotated cohort of 198 patients with CBF-AML. Excluding breakpoint-associated lesions, the most frequent events included loss of a sex chromosome (53%), deletions at 9q21 (12%) and 7q36 (9%) in patients with t(8;21) compared with trisomy 22 (13%), trisomy 8 (10%) and 7q36 deletions (12%) in patients with inv(16). SNP-array revealed novel recurrent genetic alterations likely to be involved in CBF-AML leukemogenesis. ZBTB7A mutations (20% of t(8;21)-AML) were shown to be a target of copy-neutral losses of heterozygosity (CN-LOH) at chromosome 19p. FOXP1 focal deletions were identified in 5% of inv(16)-AML while sequence analysis revealed that 2% carried FOXP1 truncating mutations. Finally, CCDC26 disruption was found in both subtypes (4.5% of the whole cohort) and possibly highlighted a new lesion associated with aberrant tyrosine kinase signaling in this particular subtype of leukemia. PMID:29464086
SNP-array lesions in core binding factor acute myeloid leukemia.
Duployez, Nicolas; Boudry-Labis, Elise; Roumier, Christophe; Boissel, Nicolas; Petit, Arnaud; Geffroy, Sandrine; Helevaut, Nathalie; Celli-Lebras, Karine; Terré, Christine; Fenneteau, Odile; Cuccuini, Wendy; Luquet, Isabelle; Lapillonne, Hélène; Lacombe, Catherine; Cornillet, Pascale; Ifrah, Norbert; Dombret, Hervé; Leverger, Guy; Jourdan, Eric; Preudhomme, Claude
2018-01-19
Acute myeloid leukemia (AML) with t(8;21) and inv(16), together referred to as core binding factor (CBF)-AML, are recognized as unique entities. Both rearrangements share a common pathophysiology, the disruption of the CBF, and a relatively good prognosis. Experiments have demonstrated that CBF rearrangements were insufficient to induce leukemia, implying the existence of cooperating events. To explore these aberrations, we performed single nucleotide polymorphism (SNP)-array in a well-annotated cohort of 198 patients with CBF-AML. Excluding breakpoint-associated lesions, the most frequent events included loss of a sex chromosome (53%), deletions at 9q21 (12%) and 7q36 (9%) in patients with t(8;21) compared with trisomy 22 (13%), trisomy 8 (10%) and 7q36 deletions (12%) in patients with inv(16). SNP-array revealed novel recurrent genetic alterations likely to be involved in CBF-AML leukemogenesis. ZBTB7A mutations (20% of t(8;21)-AML) were shown to be a target of copy-neutral losses of heterozygosity (CN-LOH) at chromosome 19p. FOXP1 focal deletions were identified in 5% of inv(16)-AML while sequence analysis revealed that 2% carried FOXP1 truncating mutations. Finally, CCDC26 disruption was found in both subtypes (4.5% of the whole cohort) and possibly highlighted a new lesion associated with aberrant tyrosine kinase signaling in this particular subtype of leukemia.
A ddRAD Based Linkage Map of the Cultivated Strawberry, Fragaria ×ananassa
Davik, Jahn; Sargent, Daniel James; Brurberg, May Bente; Lien, Sigbjørn; Kent, Matthew; Alsheikh, Muath
2015-01-01
The cultivated strawberry (Fragaria ×ananassa Duch.) is an allo-octoploid considered difficult to disentangle genetically due to its four relatively similar sub-genomic chromosome sets. This has been alleviated by the recent release of the strawberry IStraw90 whole genome genotyping array. However, array resolution relies on the genotypes used in the array construction and may be of limited general use. SNP detection based on reduced genomic sequencing approaches has the potential of providing better coverage in cases where the studied genotypes are only distantly related to the SNP array’s construction foundation. Here we have used double digest restriction-associated DNA sequencing (ddRAD) to identify SNPs in a 145 seedling F1 hybrid population raised from the cross between the cultivars Sonata (♀) and Babette (♂). A linkage map was constructed containing 907 markers, which spanned 1,581.5 cM across 31 linkage groups representing the 28 chromosomes of the species. Comparing the physical span of the SNP markers with the F. vesca genome sequence, the linkage groups resolved covered 79% of the estimated 830 Mb of the F. ×ananassa genome. Here, we have developed the first linkage map for F. ×ananassa using ddRAD and show that this technique and other related techniques are useful tools for linkage map development and downstream genetic studies in the octoploid strawberry. PMID:26398886
Paulsson, Kajsa; Cazier, Jean-Baptiste; MacDougall, Finlay; Stevens, Jane; Stasevich, Irina; Vrcelj, Nikoletta; Chaplin, Tracy; Lillington, Debra M.; Lister, T. Andrew; Young, Bryan D.
2008-01-01
We present here a genome-wide map of abnormalities found in diagnostic samples from 45 adults and adolescents with acute lymphoblastic leukemia (ALL). A 500K SNP array analysis uncovered frequent genetic abnormalities, with cryptic deletions constituting half of the detected changes, implying that microdeletions are a characteristic feature of this malignancy. Importantly, the pattern of deletions resembled that recently reported in pediatric ALL, suggesting that adult, adolescent, and childhood cases may be more similar on the genetic level than previously thought. Thus, 70% of the cases displayed deletion of one or more of the CDKN2A, PAX5, IKZF1, ETV6, RB1, and EBF1 genes. Furthermore, several genes not previously implicated in the pathogenesis of ALL were identified as possible recurrent targets of deletion. In total, the SNP array analysis identified 367 genetic abnormalities not corresponding to known copy number polymorphisms, with all but two cases (96%) displaying at least one cryptic change. The resolution level of this SNP array study is the highest used to date to investigate a malignant hematologic disorder. Our findings provide insights into the leukemogenic process and may be clinically important in adult and adolescent ALL. Most importantly, we report that microdeletions of key genes appear to be a common, characteristic feature of ALL that is shared among different clinical, morphological, and cytogenetic subgroups. PMID:18458336
USDA-ARS's Scientific Manuscript database
High-throughput genotyping arrays provide a standardized resource for crop research communities that are useful for a breadth of applications including high-density genetic mapping, genome-wide association studies (GWAS), genomic selection (GS), candidate marker and quantitative trait loci (QTL) ide...
USDA-ARS's Scientific Manuscript database
The Axiom® IStraw90 SNP (single nucleotide polymorphism) array was developed to enable high-throughput genotyping in allo-octoploid cultivated strawberry (Fragaria ×ananassa). However, high cost ($80-105 per sample) limits throughput for certain applications. On average the IStraw90 has yielded 50% ...
USDA-ARS's Scientific Manuscript database
Single nucleotide polymorphisms (SNPs) are the most abundant DNA sequence variation in the genomes which can be used to associate genotypic variation to the phenotype. Therefore, availability of a high-density SNP array with uniform genome coverage can advance genetic studies and breeding applicatio...
Jiang, Haojun; Xie, Yifan; Li, Xuchao; Ge, Huijuan; Deng, Yongqiang; Mu, Haofang; Feng, Xiaoli; Yin, Lu; Du, Zhou; Chen, Fang; He, Nongyue
2016-01-01
Short tandem repeats (STRs) and single nucleotide polymorphisms (SNPs) have already been used to perform noninvasive prenatal paternity testing from maternal plasma DNA. The frequently used technologies were PCR followed by capillary electrophoresis and SNP typing array, respectively. Here, we developed a noninvasive prenatal paternity test (NIPAT) based on SNP typing with maternal plasma DNA sequencing. We evaluated the influencing factors (minor allele frequency (MAF), the total number of SNPs, fetal fraction and effective sequencing depth) and designed three different selective SNP panels in order to verify the performance in clinical cases. Combining targeted deep sequencing of selective SNPs and an informative bioinformatics pipeline, we calculated the combined paternity index (CPI) of 17 cases to determine paternity. Sequencing-based NIPAT results fully agreed with the invasive prenatal paternity test using an STR multiplex system. Our study here proved that the maternal plasma DNA sequencing-based technology is feasible and accurate in determining paternity, which may provide an alternative in forensic application in the future.
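The combined paternity index mentioned above is conventionally the product of per-locus paternity indices, under the assumption that the loci are independent. The sketch below illustrates that arithmetic and the usual conversion to a posterior probability of paternity; the function names, the per-locus values and the prior of 0.5 are hypothetical illustrations, not values or code from the study.

```python
from math import prod


def combined_paternity_index(per_locus_pi):
    """Multiply per-locus paternity indices (assumes independent loci)."""
    return prod(per_locus_pi)


def probability_of_paternity(cpi, prior=0.5):
    """Posterior probability of paternity from a CPI and a prior probability."""
    return cpi * prior / (cpi * prior + (1 - prior))


# Hypothetical per-locus paternity indices for three informative SNPs.
per_locus_pi = [2.0, 1.5, 4.0]
cpi = combined_paternity_index(per_locus_pi)
posterior = probability_of_paternity(cpi)
print(cpi, round(posterior, 4))
```

Because the indices multiply, a panel with many weakly informative SNPs can still reach a large CPI, which is one reason the total number of SNPs appears among the factors evaluated in the abstract.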
Du, Tao; Duan, Yu; Li, Kaiwen; Zhao, Xiaomiao; Ni, Renmin; Li, Yu; Yang, Dongzi
2015-01-01
Background. Single-nucleotide polymorphisms (SNPs) in the follicle stimulating hormone receptor (FSHR) gene are associated with PCOS. However, their relationship to the polycystic ovary (PCO) morphology remains unknown. This study aimed to investigate whether PCOS related SNPs in the FSHR gene are associated with PCO in women with PCOS. Methods. Patients were grouped into PCO (n = 384) and non-PCO (n = 63) groups. Genomic genotypes were profiled using Affymetrix human genome SNP chip 6. Two polymorphisms (rs2268361 and rs2349415) of FSHR were analyzed using a statistical approach. Results. Significant differences were found in the allele distributions of the GG genotype of rs2268361 between the PCO and non-PCO groups (27.6% GG, 53.4% GA, and 19.0% AA versus 33.3% GG, 36.5% GA, and 30.2% AA), while no significant differences were found in the allele distributions of the GG genotype of rs2349415. When rs2268361 was considered, there were statistically significant differences of serum follicle stimulating hormone, estradiol, and sex hormone binding globulin between genotypes in the PCO group. In case of the rs2349415 SNP, only serum sex hormone binding globulin was statistically different between genotypes in the PCO group. Conclusions. Functional variants in FSHR gene may contribute to PCO susceptibility in women with PCOS. PMID:26273622
[Phenotypic and genetic analysis of a patient presented with Tietz/Waardenburg type II a syndrome].
Wang, Huanhuan; Tang, Lifang; Zhang, Jingmin; Hu, Qin; Chen, Yingwei; Xiao, Bing
2015-08-01
To determine the genetic cause for a patient featuring decreased pigmentation of the skin and iris, hearing loss and multiple congenital anomalies. Routine chromosomal banding was performed to analyze the karyotype of the patient and his parents. Single nucleotide polymorphism array (SNP array) was employed to identify cryptic chromosome aberrations, and quantitative real-time PCR was used to confirm the results. Karyotype analysis has revealed no obvious anomaly for the patient and his parents. SNP array analysis of the patient has demonstrated a 3.9 Mb deletion encompassing 3p13p14.1, which caused loss of entire MITF gene. The deletion was confirmed by quantitative real-time PCR. Clinical features of the patient have included severe bilateral hearing loss, decreased pigmentation of the skin and iris and multiple congenital anomalies. The patient, carrying a 3p13p14.1 deletion, has features of Tietz syndrome/Waardenburg syndrome type IIa. This case may provide additional data for the study of genotype-phenotype correlation of this disease.
Seven newly identified loci for autoimmune thyroid disease.
Cooper, Jason D; Simmonds, Matthew J; Walker, Neil M; Burren, Oliver; Brand, Oliver J; Guo, Hui; Wallace, Chris; Stevens, Helen; Coleman, Gillian; Franklyn, Jayne A; Todd, John A; Gough, Stephen C L
2012-12-01
Autoimmune thyroid disease (AITD), including Graves' disease (GD) and Hashimoto's thyroiditis (HT), is one of the most common of the immune-mediated diseases. To further investigate the genetic determinants of AITD, we conducted an association study using a custom-made single-nucleotide polymorphism (SNP) array, the ImmunoChip. The SNP array contains all known and genotype-able SNPs across 186 distinct susceptibility loci associated with one or more immune-mediated diseases. After stringent quality control, we analysed 103 875 common SNPs (minor allele frequency >0.05) in 2285 GD and 462 HT patients and 9364 controls. We found evidence for seven new AITD risk loci (P < 1.12 × 10(-6); a permutation test derived significance threshold), five at locations previously associated and two at locations awaiting confirmation, with other immune-mediated diseases.
Wang, Shichen; Wong, Debbie; Forrest, Kerrie; Allen, Alexandra; Chao, Shiaoman; Huang, Bevan E; Maccaferri, Marco; Salvi, Silvio; Milner, Sara G; Cattivelli, Luigi; Mastrangelo, Anna M; Whan, Alex; Stephen, Stuart; Barker, Gary; Wieseke, Ralf; Plieske, Joerg; International Wheat Genome Sequencing Consortium; Lillemo, Morten; Mather, Diane; Appels, Rudi; Dolferus, Rudy; Brown-Guedira, Gina; Korol, Abraham; Akhunova, Alina R; Feuillet, Catherine; Salse, Jerome; Morgante, Michele; Pozniak, Curtis; Luo, Ming-Cheng; Dvorak, Jan; Morell, Matthew; Dubcovsky, Jorge; Ganal, Martin; Tuberosa, Roberto; Lawley, Cindy; Mikoulitch, Ivan; Cavanagh, Colin; Edwards, Keith J; Hayden, Matthew; Akhunov, Eduard
2014-01-01
High-density single nucleotide polymorphism (SNP) genotyping arrays are a powerful tool for studying genomic patterns of diversity, inferring ancestral relationships between individuals in populations and studying marker–trait associations in mapping experiments. We developed a genotyping array including about 90 000 gene-associated SNPs and used it to characterize genetic variation in allohexaploid and allotetraploid wheat populations. The array includes a significant fraction of common genome-wide distributed SNPs that are represented in populations of diverse geographical origin. We used density-based spatial clustering algorithms to enable high-throughput genotype calling in complex data sets obtained for polyploid wheat. We show that these model-free clustering algorithms provide accurate genotype calling in the presence of multiple clusters including clusters with low signal intensity resulting from significant sequence divergence at the target SNP site or gene deletions. Assays that detect low-intensity clusters can provide insight into the distribution of presence–absence variation (PAV) in wheat populations. A total of 46 977 SNPs from the wheat 90K array were genetically mapped using a combination of eight mapping populations. The developed array and cluster identification algorithms provide an opportunity to infer detailed haplotype structure in polyploid wheat and will serve as an invaluable resource for diversity studies and investigating the genetic basis of trait variation in wheat. PMID:24646323
A Universal Genome Array and Transcriptome Atlas for Brachypodium Distachyon
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mockler, Todd
Brachypodium distachyon is the premier experimental model grass platform and is related to candidate feedstock crops for bioethanol production. Based on the DOE-JGI Brachypodium Bd21 genome sequence and annotation we designed a whole genome DNA microarray platform. The quality of this array platform is unprecedented due to the exceptional quality of the Brachypodium genome assembly and annotation and the stringent probe selection criteria employed in the design. We worked with members of the international community and the bioinformatics/design team at Affymetrix at all stages in the development of the array. We used the Brachypodium arrays to interrogate the transcriptomes of plants grown in a variety of environmental conditions, including diurnal and circadian light/temperature conditions. We examined the transcriptional responses of Brachypodium seedlings subjected to various abiotic stresses including heat, cold, salt, and high intensity light. We generated a gene expression atlas representing various organs and developmental stages. The results of these efforts, including all microarray datasets, are published and available at online public databases.
Hartmann, Luise; Stephenson, Christine F; Verkamp, Stephanie R; Johnson, Krystal R; Burnworth, Bettina; Hammock, Kelle; Brodersen, Lisa Eidenschink; de Baca, Monica E; Wells, Denise A; Loken, Michael R; Zehentner, Barbara K
2014-12-01
Array comparative genomic hybridization (aCGH) has become a powerful tool for analyzing hematopoietic neoplasms and identifying genome-wide copy number changes in a single assay. aCGH also has superior resolution compared with fluorescence in situ hybridization (FISH) or conventional cytogenetics. Integration of single nucleotide polymorphism (SNP) probes with microarray analysis allows additional identification of acquired uniparental disomy, a copy neutral aberration with known potential to contribute to tumor pathogenesis. However, a limitation of microarray analysis has been the inability to detect clonal heterogeneity in a sample. This study comprised 16 samples (acute myeloid leukemia, myelodysplastic syndrome, chronic lymphocytic leukemia, plasma cell neoplasm) with complex cytogenetic features and evidence of clonal evolution. We used an integrated manual peak reassignment approach combining analysis of aCGH and SNP microarray data for characterization of subclonal abnormalities. We compared array findings with results obtained from conventional cytogenetic and FISH studies. Clonal heterogeneity was detected in 13 of 16 samples by microarray on the basis of log2 values. Use of the manual peak reassignment analysis approach improved resolution of the sample's clonal composition and genetic heterogeneity in 10 of 13 (77%) patients. Moreover, in 3 patients, clonal disease progression was revealed by array analysis that was not evident by cytogenetic or FISH studies. Genetic abnormalities originating from separate clonal subpopulations can be identified and further characterized by combining aCGH and SNP hybridization results from 1 integrated microarray chip by use of the manual peak reassignment technique. Its clinical utility in comparison to conventional cytogenetic or FISH studies is demonstrated. © 2014 American Association for Clinical Chemistry.
Capalbo, Antonio; Treff, Nathan R; Cimadomo, Danilo; Tao, Xin; Upham, Kathleen; Ubaldi, Filippo Maria; Rienzi, Laura; Scott, Richard T
2015-07-01
Comprehensive chromosome screening (CCS) methods are being extensively used to select chromosomally normal embryos in human assisted reproduction. Some concerns remain about the stage of analysis and which aneuploidy screening method to use. In this study, the reliability of blastocyst-stage aneuploidy screening and the diagnostic performance of the two most widely used CCS methods (quantitative real-time PCR (qPCR) and array comparative genome hybridization (aCGH)) were assessed. aCGH aneuploid blastocysts were rebiopsied, blinded, and evaluated by qPCR. Discordant cases were subsequently rebiopsied, blinded, and evaluated by single-nucleotide polymorphism (SNP) array-based CCS. Although 81.7% of embryos showed the same diagnosis when comparing aCGH and qPCR-based CCS, 18.3% (22/120) of embryos gave a discordant result for at least one chromosome. SNP array reanalysis showed that a discordance was reported in ten blastocysts for aCGH, mostly due to false positives, and in four cases for qPCR. The discordant aneuploidy call rate per chromosome was significantly higher for aCGH (5.7%) compared with qPCR (0.6%; P<0.01). To corroborate these findings, 39 embryos were simultaneously biopsied for aCGH and qPCR during blastocyst-stage aneuploidy screening cycles. Of these, 35 matched, including all 21 euploid embryos. Blinded SNP analysis on rebiopsies of the four discordant embryos matched qPCR. These findings demonstrate the high reliability of diagnosis performed at the blastocyst stage with the use of different CCS methods. However, the application of aCGH can be expected to result in a higher aneuploidy rate than other contemporary methods of CCS.
At what scale should microarray data be analyzed?
Huang, Shuguang; Yeo, Adeline A; Gelbert, Lawrence; Lin, Xi; Nisenbaum, Laura; Bemis, Kerry G
2004-01-01
The hybridization intensities derived from microarray experiments, for example Affymetrix's MAS5 signals, are very often transformed in one way or another before statistical models are fitted. The motivation for performing transformation is usually to satisfy the model assumptions such as normality and homogeneity in variance. Generally speaking, two types of strategies are often applied to microarray data depending on the analysis need: correlation analysis where all the gene intensities on the array are considered simultaneously, and gene-by-gene ANOVA where each gene is analyzed individually. We investigate the distributional properties of the Affymetrix GeneChip signal data under the two scenarios, focusing on the impact of analyzing the data at an inappropriate scale. The Box-Cox type of transformation is first investigated for the strategy of pooling genes. The commonly used log-transformation is particularly applied for comparison purposes. For the scenario where analysis is on a gene-by-gene basis, the model assumptions such as normality are explored. The impact of using a wrong scale is illustrated by log-transformation and quartic-root transformation. When all the genes on the array are considered together, the dependent relationship between the expression and its variation level can be satisfactorily removed by Box-Cox transformation. When genes are analyzed individually, the distributional properties of the intensities are shown to be gene dependent. Derivation and simulation show that some loss of power is incurred when a wrong scale is used, but due to the robustness of the t-test, the loss is acceptable when the fold-change is not very large.
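The effect of scale described in this abstract can be illustrated with a minimal sketch of the Box-Cox family. The replicate intensities below are hypothetical and are constructed so that the standard deviation grows in proportion to the mean; the function names are illustrative and nothing here is taken from the authors' analysis. A power parameter of 0 gives the log transformation, and 0.25 corresponds to the quartic root up to a shift and rescaling.

```python
import math
from statistics import pstdev


def box_cox(x, lam):
    """Box-Cox transform of a positive value x with power parameter lam."""
    if x <= 0:
        raise ValueError("Box-Cox requires positive values")
    if lam == 0:
        return math.log(x)  # limiting case as lam approaches 0
    return (x ** lam - 1.0) / lam


def transform(values, lam):
    """Apply the Box-Cox transform to every value in a list."""
    return [box_cox(v, lam) for v in values]


def sd_ratio(low, high, lam):
    """Ratio of standard deviations (high gene / low gene) after transforming."""
    return pstdev(transform(high, lam)) / pstdev(transform(low, lam))


# Hypothetical replicate signals for a weakly and a strongly expressed gene.
low_gene = [90.0, 100.0, 110.0]
high_gene = [9000.0, 10000.0, 11000.0]

for lam in (1.0, 0.25, 0.0):
    print(f"lambda={lam}: sd ratio high/low = {sd_ratio(low_gene, high_gene, lam):.2f}")
```

On data of this shape, a ratio close to 1 indicates that the transformation has removed the dependence of variation on expression level; real signal data need not follow such a clean proportionality.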
Weighted analysis of paired microarray experiments.
Kristiansson, Erik; Sjögren, Anders; Rudemo, Mats; Nerman, Olle
2005-01-01
In microarray experiments quality often varies, for example between samples and between arrays. The need for quality control is therefore strong. A statistical model and a corresponding analysis method are suggested for experiments with pairing, including designs with individuals observed before and after treatment and many experiments with two-colour spotted arrays. The model is of mixed type with some parameters estimated by an empirical Bayes method. Differences in quality are modelled by individual variances and correlations between repetitions. The method is applied to three real and several simulated datasets. Two of the real datasets are of Affymetrix type with patients profiled before and after treatment, and the third dataset is of two-colour spotted cDNA type. In all cases, the patients or arrays had different estimated variances, leading to distinctly unequal weights in the analysis. We also suggest plots which illustrate the variances and correlations that affect the weights computed by our analysis method. For simulated data the improvement relative to previously published methods without weighting is shown to be substantial.
Renneville, Aline; Abdelali, Raouf Ben; Chevret, Sylvie; Nibourel, Olivier; Cheok, Meyling; Pautas, Cécile; Duléry, Rémy; Boyer, Thomas; Cayuela, Jean-Michel; Hayette, Sandrine; Raffoux, Emmanuel; Farhat, Hassan; Boissel, Nicolas; Terre, Christine; Dombret, Hervé; Castaigne, Sylvie; Preudhomme, Claude
2014-02-28
We recently showed that the addition of fractionated doses of gemtuzumab ozogamicin (GO) to standard chemotherapy improves clinical outcome of acute myeloid leukemia (AML) patients. In the present study, we performed mutational analysis of 11 genes (FLT3, NPM1, CEBPA, MLL, WT1, IDH1/2, RUNX1, ASXL1, TET2, DNMT3A), EVI1 overexpression screening, and 6.0 single-nucleotide polymorphism array (SNP-A) analysis in diagnostic samples of the 278 AML patients enrolled in the ALFA-0701 trial. In cytogenetically normal (CN) AML (n = 146), 38% of the patients had at least 1 SNP-A lesion and 89% of the patients had at least 1 molecular alteration. In multivariate analysis, the independent predictors of higher cumulative incidence of relapse were unfavorable karyotype (P = 0.013) and randomization in the control arm (P = 0.007) in the whole cohort, and MLL partial tandem duplications (P = 0.014) and DNMT3A mutations (P = 0.010) in CN-AML. The independent predictors of shorter overall survival (OS) were unfavorable karyotype (P < 0.001) and SNP-A lesion(s) (P = 0.001) in the whole cohort, and SNP-A lesion(s) (P = 0.006), DNMT3A mutations (P = 0.042) and randomization in the control arm (P = 0.043) in CN-AML. Interestingly, CN-AML patients benefited preferentially more from GO treatment as compared to AML patients with abnormal cytogenetics (hazard ratio for death, 0.52 versus 1.14; test for interaction, P = 0.04). Although the interaction test was not statistically significant, the OS benefit associated with GO treatment appeared also more pronounced in FLT3 internal tandem duplication positive than in negative patients. PMID:24659740
Peace, Cameron; Bassil, Nahla; Main, Dorrie; Ficklin, Stephen; Rosyara, Umesh R.; Stegmeir, Travis; Sebolt, Audrey; Gilmore, Barbara; Lawley, Cindy; Mockler, Todd C.; Bryant, Douglas W.; Wilhelm, Larry; Iezzoni, Amy
2012-01-01
High-throughput genome scans are important tools for genetic studies and breeding applications. Here, a 6K SNP array for use with the Illumina Infinium® system was developed for diploid sweet cherry (Prunus avium) and allotetraploid sour cherry (P. cerasus). This effort was led by RosBREED, a community initiative to enable marker-assisted breeding for rosaceous crops. Next-generation sequencing in diverse breeding germplasm provided 25 billion basepairs (Gb) of cherry DNA sequence from which were identified genome-wide SNPs for sweet cherry and for the two sour cherry subgenomes derived from sweet cherry (avium subgenome) and P. fruticosa (fruticosa subgenome). Anchoring to the peach genome sequence, recently released by the International Peach Genome Initiative, predicted relative physical locations of the 1.9 million putative SNPs detected, preliminarily filtered to 368,943 SNPs. Further filtering was guided by results of a 144-SNP subset examined with the Illumina GoldenGate® assay on 160 accessions. A 6K Infinium® II array was designed with SNPs evenly spaced genetically across the sweet and sour cherry genomes. SNPs were developed for each sour cherry subgenome by using minor allele frequency in the sour cherry detection panel to enrich for subgenome-specific SNPs followed by targeting to either subgenome according to alleles observed in sweet cherry. The array was evaluated using panels of sweet (n = 269) and sour (n = 330) cherry breeding germplasm. Approximately one third of array SNPs were informative for each crop. A total of 1825 polymorphic SNPs were verified in sweet cherry, 13% of these originally developed for sour cherry. Allele dosage was resolved for 2058 polymorphic SNPs in sour cherry, one third of these being originally developed for sweet cherry. 
This publicly available genomics resource represents a significant advance in cherry genome-scanning capability that will accelerate marker-locus-trait association discovery, genome structure investigation, and genetic diversity assessment in this diploid-tetraploid crop group. PMID:23284615
Wen, Weie; He, Zhonghu; Gao, Fengmei; Liu, Jindong; Jin, Hui; Zhai, Shengnan; Qu, Yanying; Xia, Xianchun
2017-01-01
A high-density consensus map is a powerful tool for gene mapping, cloning and molecular marker-assisted selection in wheat breeding. The objective of this study was to construct a high-density, single nucleotide polymorphism (SNP)-based consensus map of common wheat (Triticum aestivum L.) by integrating genetic maps from four recombinant inbred line populations. The populations were each genotyped using the wheat 90K Infinium iSelect SNP assay. A total of 29,692 SNP markers were mapped on 21 linkage groups corresponding to 21 hexaploid wheat chromosomes, covering 2,906.86 cM, with an overall marker density of 10.21 markers/cM. Comparison with previous maps based on the wheat 90K SNP chip identified 22,736 (76.6%) of the SNPs with consistent chromosomal locations, whereas 1,974 (6.7%) showed different chromosomal locations, and 4,982 (16.8%) were newly mapped. Alignment of the present consensus map and the wheat expressed sequence tags (ESTs) Chromosome Bin Map enabled assignment of 1,221 SNP markers to specific chromosome bins, and 819 ESTs were integrated into the consensus map. The marker orders of the consensus map were validated based on physical positions on the wheat genome with Spearman rank correlation coefficients ranging from 0.69 (4D) to 0.97 (1A, 4B, 5B, and 6A), and were also confirmed by comparison with genetic positions on the previous 40K SNP consensus map with Spearman rank correlation coefficients ranging from 0.84 (6D) to 0.99 (6A). Chromosomal rearrangements reported previously were confirmed in the present consensus map and new putative rearrangements were identified. In addition, an integrated consensus map was developed through the combination of five published maps with ours, containing 52,607 molecular markers. The consensus map described here provides a high-density SNP marker map and a reliable order of SNPs, representing a step forward in mapping and validation of chromosomal locations of SNPs on the wheat 90K array. 
Moreover, it can be used as a reference for quantitative trait loci (QTL) mapping to facilitate exploitation of genes and QTL in wheat breeding. PMID:28848588
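The marker-order validation described in this abstract rests on the Spearman rank correlation between genetic and physical positions. The following is a minimal sketch of that statistic with average ranks for ties; the marker positions are hypothetical and the helper names are illustrative, so this shows the general technique rather than the authors' pipeline. The last line reproduces the reported overall marker density from the two figures given in the abstract.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical markers on one chromosome: genetic position (cM) and
# physical position (Mb). Two markers co-segregate at 3.5 cM.
genetic_cm = [0.0, 1.2, 3.5, 3.5, 7.9, 12.4]
physical_mb = [0.4, 2.1, 9.8, 6.3, 15.0, 22.7]
rho = spearman(genetic_cm, physical_mb)
print(f"Spearman rho = {rho:.3f}")

# Overall marker density from the abstract: markers mapped / map length in cM.
density = 29692 / 2906.86
print(f"marker density = {density:.2f} markers/cM")
```

A coefficient near 1 means the genetic order agrees with the physical order; local inversions or misplaced markers lower it.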
"Gap hunting" to characterize clustered probe signals in Illumina methylation array data.
Andrews, Shan V; Ladd-Acosta, Christine; Feinberg, Andrew P; Hansen, Kasper D; Fallin, M Daniele
2016-01-01
The Illumina 450k array has been widely used in epigenetic association studies. Current quality-control (QC) pipelines typically remove certain sets of probes, such as those containing a SNP or with multiple mapping locations. An additional set of potentially problematic probes are those with DNA methylation distributions characterized by two or more distinct clusters separated by gaps. Data-driven identification of such probes may offer additional insights for downstream analyses. We developed a procedure, termed "gap hunting," to identify probes showing clustered distributions. Among 590 peripheral blood samples from the Study to Explore Early Development, we identified 11,007 "gap probes." The vast majority (9199) are likely attributed to an underlying SNP(s) or other variant in the probe, although SNP-affected probes exist that do not produce a gap signal. Specific factors predict which SNPs lead to gap signals, including type of nucleotide change, probe type, DNA strand, and overall methylation state. These expected effects are demonstrated in paired genotype and 450k data on the same samples. Gap probes can also serve as a surrogate for the local genetic sequence on a haplotype scale and can be used to adjust for population stratification. The characteristics of gap probes reflect potentially informative biology. QC pipelines may benefit from an efficient data-driven approach that "flags" gap probes, rather than filtering such probes, followed by careful interpretation of downstream association analyses. Our results should translate directly to the recently released Illumina EPIC array given the similar chemistry and content design.
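The idea of a probe whose methylation values fall into clusters separated by gaps can be sketched as follows. This is a simplified illustration, not the published procedure: the function names, the gap threshold of 0.05 and the minimum group fraction of 0.01 are assumptions chosen for the example, and the beta values are hypothetical.

```python
def find_groups(betas, threshold=0.05):
    """Split sorted methylation values into groups wherever two neighbouring
    values differ by more than `threshold` (an assumed, illustrative cutoff)."""
    ordered = sorted(betas)
    if not ordered:
        return []
    groups = [[ordered[0]]]
    for prev, cur in zip(ordered, ordered[1:]):
        if cur - prev > threshold:
            groups.append([cur])    # a gap: start a new group
        else:
            groups[-1].append(cur)  # no gap: extend the current group
    return groups


def is_gap_probe(betas, threshold=0.05, min_fraction=0.01):
    """Flag a probe when at least two groups each hold at least
    `min_fraction` of the samples (an assumed rule for ignoring outliers)."""
    n = len(betas)
    kept = [g for g in find_groups(betas, threshold) if len(g) / n >= min_fraction]
    return len(kept) >= 2


# Hypothetical beta values: a three-cluster pattern of the kind a SNP in the
# probe might produce, and a unimodal probe for comparison.
snp_like = [0.10, 0.12, 0.11, 0.50, 0.52, 0.90, 0.88]
unimodal = [0.40, 0.42, 0.44, 0.46]

print(len(find_groups(snp_like)), is_gap_probe(snp_like))
print(len(find_groups(unimodal)), is_gap_probe(unimodal))
```

In line with the abstract's recommendation, the boolean result is meant as a flag for later interpretation rather than as a filter that discards the probe.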
Wallenborn, M; Petters, O; Rudolf, D; Hantmann, H; Richter, M; Ahnert, P; Rohani, L; Smink, J J; Bulwin, G C; Krupp, W; Schulz, R M; Holland, H
2018-04-23
In the development of cell-based medicinal products, it is crucial to guarantee that the application of such an advanced therapy medicinal product (ATMP) is safe for the patients. The consensus of the European regulatory authorities is: "In conclusion, on the basis of the state of art, conventional karyotyping can be considered a valuable and useful technique to analyse chromosomal stability during preclinical studies". To evaluate the genetic stability of chondrocyte samples from non-clinical studies, 408 chondrocyte samples (84 monolayers and 324 spheroids) from six patients were analysed using trypsin-Giemsa staining, spectral karyotyping and fluorescence in situ hybridisation. Single nucleotide polymorphism (SNP) array analysis was performed on chondrocyte spheroids from five of the six donors. With this combination of techniques, the genetic analyses revealed no significant genetic instability until passage 3 in monolayer cells and in interphase cells from spheroid cultures at different time points. Clonal occurrence of polyploid metaphases and endoreduplications was associated with prolonged cultivation time. Gonosomal losses were also observed in chondrocyte spheroids with increasing passage and duration of the differentiation phase. Interestingly, in one of the donors, chromosomal aberrations that are also described in extraskeletal myxoid chondrosarcoma were identified. The SNP array analysis revealed chromosomal aberrations in two donors and copy-neutral loss of heterozygosity regions in four donors. This study showed the necessity of combined genetic analyses at defined cultivation time points in quality studies within the field of cell therapy.
Haplotype-Based Genotyping in Polyploids.
Clevenger, Josh P; Korani, Walid; Ozias-Akins, Peggy; Jackson, Scott
2018-01-01
Accurate identification of polymorphisms from sequence data is crucial to unlocking the potential of high throughput sequencing for genomics. Single nucleotide polymorphisms (SNPs) are difficult to accurately identify in polyploid crops due to the duplicative nature of polyploid genomes leading to low confidence in the true alignment of short reads. Implementing a haplotype-based method in contrasting subgenome-specific sequences leads to higher accuracy of SNP identification in polyploids. To test this method, a large-scale 48K SNP array (Axiom Arachis2) was developed for Arachis hypogaea (peanut), an allotetraploid, in which 1,674 haplotype-based SNPs were included. Results of the array show that 74% of the haplotype-based SNP markers could be validated, which is considerably higher than previous methods used for peanut. The haplotype method has been implemented in a standalone program, HAPLOSWEEP, which takes as input bam files and a vcf file and identifies haplotype-based markers. Haplotype discovery can be made within single reads or span paired reads, and can leverage long read technology by targeting any length of haplotype. Haplotype-based genotyping is applicable in all allopolyploid genomes and provides confidence in marker identification and in silico-based genotyping for polyploid genomics.
Bungartz, Annemarie; Klaus, Marius; Mathew, Boby; Léon, Jens; Naz, Ali Ahmad
2016-03-01
The aim of the present study was to develop a new cost-effective PCR-based CAPS marker set using the advantages of high-throughput SNP genotyping. Initially, a SNP survey was made using 20 diverse barley genotypes via 9k iSelect array genotyping, which resulted in 6334 polymorphic SNP markers. Principal component analysis using these marker data showed fine differentiation of the diverse barley gene pool. To this end, we developed 200 SNP-derived CAPS markers distributed across the genome, covering around 991 cM with an average marker density of 5.09 cM. Further, we genotyped 68 CAPS markers in an F2 population (Cheri×ICB181160) segregating for seed color variation in barley. Genetic mapping of seed color revealed putative linkage of a single nuclear gene on chromosome 1H. These findings provide proof of concept for the development and utility of a newer cost-effective genomic tool kit to analyze broader genetic resources of barley worldwide. Copyright © 2016 Elsevier Inc. All rights reserved.
Ganal, Martin W.; Durstewitz, Gregor; Polley, Andreas; Bérard, Aurélie; Buckler, Edward S.; Charcosset, Alain; Clarke, Joseph D.; Graner, Eva-Maria; Hansen, Mark; Joets, Johann; Le Paslier, Marie-Christine; McMullen, Michael D.; Montalent, Pierre; Rose, Mark; Schön, Chris-Carolin; Sun, Qi; Walter, Hildrun; Martin, Olivier C.; Falque, Matthieu
2011-01-01
SNP genotyping arrays have been useful for many applications that require a large number of molecular markers such as high-density genetic mapping, genome-wide association studies (GWAS), and genomic selection. We report the establishment of a large maize SNP array and its use for diversity analysis and high density linkage mapping. The markers, taken from more than 800,000 SNPs, were selected to be preferentially located in genes and evenly distributed across the genome. The array was tested with a set of maize germplasm including North American and European inbred lines, parent/F1 combinations, and distantly related teosinte material. A total of 49,585 markers, including 33,417 within 17,520 different genes and 16,168 outside genes, were of good quality for genotyping, with an average failure rate of 4% and rates up to 8% in specific germplasm. To demonstrate this array's use in genetic mapping and for the independent validation of the B73 sequence assembly, two intermated maize recombinant inbred line populations – IBM (B73×Mo17) and LHRF (F2×F252) – were genotyped to establish two high density linkage maps with 20,913 and 14,524 markers respectively. 172 mapped markers were absent in the current B73 assembly and their placement can be used for future improvements of the B73 reference sequence. Colinearity of the genetic and physical maps was mostly conserved with some exceptions that suggest errors in the B73 assembly. Five major regions containing non-colinearities were identified on chromosomes 2, 3, 6, 7 and 9, and are supported by both independent genetic maps. Four additional non-colinear regions were found on the LHRF map only; they may be due to a lower density of IBM markers in those regions or to true structural rearrangements between lines. Given the array's high quality, it will be a valuable resource for maize genetics and many aspects of maize breeding. PMID:22174790
DOE Office of Scientific and Technical Information (OSTI.GOV)
Stemmer, Kerstin; Ellinger-Ziegelbauer, Heidrun; Lotz, Kerstin
2006-11-15
Laser microdissection in conjunction with microarray technology allows selective isolation and analysis of specific cell populations, e.g., preneoplastic renal lesions. To date, only limited information is available on sample preparation and preservation techniques that result in both optimal histomorphological preservation of sections and high-quality RNA for microarray analysis. Furthermore, amplification of minute amounts of RNA from microdissected renal samples allowing analysis with genechips has only scantily been addressed to date. The objective of this study was therefore to establish a reliable and reproducible protocol for laser microdissection in conjunction with microarray technology using kidney tissue from Eker rats treated p.o. for 7 days and 6 months with 10 and 1 mg Aristolochic acid/kg bw, respectively. Kidney tissues were preserved in RNAlater or snap frozen. Cryosections were cut and stained with either H and E or cresyl violet for subsequent morphological and RNA quality assessment and laser microdissection. RNA quality was comparable in snap frozen and RNAlater-preserved samples; however, the histomorphological preservation of renal sections was much better following cryopreservation. Moreover, the different staining techniques in combination with sample processing time at room temperature can have an influence on RNA quality. Different RNA amplification protocols were shown to have an impact on gene expression profiles as demonstrated with Affymetrix Rat Genome 230 2.0 arrays. Considering all the parameters analyzed in this study, a protocol for RNA isolation from laser microdissected samples with subsequent Affymetrix chip hybridization was established that was also successfully applied to preneoplastic lesions laser microdissected from Aristolochic acid-treated rats.
Sequenza: allele-specific copy number and mutation profiles from tumor sequencing data.
Favero, F; Joshi, T; Marquard, A M; Birkbak, N J; Krzystanek, M; Li, Q; Szallasi, Z; Eklund, A C
2015-01-01
Exome or whole-genome deep sequencing of tumor DNA along with paired normal DNA can potentially provide a detailed picture of the somatic mutations that characterize the tumor. However, analysis of such sequence data can be complicated by the presence of normal cells in the tumor specimen, by intratumor heterogeneity, and by the sheer size of the raw data. In particular, determination of copy number variations from exome sequencing data alone has proven difficult; thus, single nucleotide polymorphism (SNP) arrays have often been used for this task. Recently, algorithms to estimate absolute, but not allele-specific, copy number profiles from tumor sequencing data have been described. We developed Sequenza, a software package that uses paired tumor-normal DNA sequencing data to estimate tumor cellularity and ploidy, and to calculate allele-specific copy number profiles and mutation profiles. We applied Sequenza, as well as two previously published algorithms, to exome sequence data from 30 tumors from The Cancer Genome Atlas. We assessed the performance of these algorithms by comparing their results with those generated using matched SNP arrays and processed by the allele-specific copy number analysis of tumors (ASCAT) algorithm. Comparison between Sequenza/exome and SNP/ASCAT revealed strong correlation in cellularity (Pearson's r = 0.90) and ploidy estimates (r = 0.42, or r = 0.94 after manually inspecting alternative solutions). This performance was noticeably superior to previously published algorithms. In addition, in artificial data simulating normal-tumor admixtures, Sequenza detected the correct ploidy in samples with tumor content as low as 30%. The agreement between Sequenza and SNP array-based copy number profiles suggests that exome sequencing alone is sufficient not only for identifying small scale mutations but also for estimating cellularity and inferring DNA copy number aberrations. © The Author 2014. Published by Oxford University Press on behalf of the European Society for Medical Oncology.
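To see why cellularity and ploidy must be estimated together with copy number, a generic tumor-normal admixture model can be sketched as below. This is a textbook-style illustration under stated assumptions (a diploid normal genome and a germline heterozygous SNP); it is not taken from the abstract and may differ from the model actually implemented in Sequenza or ASCAT. All function and parameter names are hypothetical.

```python
def expected_depth_ratio(cellularity, tumor_copies, ploidy):
    """Expected tumor/normal depth ratio for a segment.

    Assumes a diploid normal genome and normalisation by the average
    copy number of the admixed sample (illustrative model only).
    """
    segment = cellularity * tumor_copies + (1.0 - cellularity) * 2.0
    average = cellularity * ploidy + (1.0 - cellularity) * 2.0
    return segment / average


def expected_baf(cellularity, tumor_copies, b_copies):
    """Expected B-allele frequency at a germline heterozygous SNP.

    The normal cells contribute one B allele out of two copies; the tumor
    cells contribute `b_copies` out of `tumor_copies`.
    """
    b = cellularity * b_copies + (1.0 - cellularity) * 1.0
    total = cellularity * tumor_copies + (1.0 - cellularity) * 2.0
    return b / total


# Hypothetical hemizygous deletion (one copy left, B allele lost) in a
# diploid tumor, viewed at several levels of tumor content.
for rho in (1.0, 0.5, 0.3):
    ratio = expected_depth_ratio(rho, tumor_copies=1, ploidy=2.0)
    baf = expected_baf(rho, tumor_copies=1, b_copies=0)
    print(f"cellularity={rho}: depth ratio={ratio:.3f}, BAF={baf:.3f}")
```

As tumor content falls, both signals move toward the values of a normal diploid sample, which is why the same observed ratio can correspond to different copy-number states at different cellularities.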
The Role of Constitutional Copy Number Variants in Breast Cancer
Walker, Logan C.; Wiggins, George A.R.; Pearson, John F.
2015-01-01
Constitutional copy number variants (CNVs) include inherited and de novo deviations from a diploid state at a defined genomic region. These variants contribute significantly to genetic variation and disease in humans, including breast cancer susceptibility. Identification of genetic risk factors for breast cancer in recent years has been dominated by the use of genome-wide technologies, such as single nucleotide polymorphism (SNP)-arrays, with a significant focus on single nucleotide variants. To date, these large datasets have been underutilised for generating genome-wide CNV profiles despite offering a massive resource for assessing the contribution of these structural variants to breast cancer risk. Technical challenges remain in determining the location and distribution of CNVs across the human genome due to the accuracy of computational prediction algorithms and resolution of the array data. Moreover, better methods are required for interpreting the functional effect of newly discovered CNVs. In this review, we explore current and future application of SNP array technology to assess rare and common CNVs in association with breast cancer risk in humans. PMID:27600231
Xiao, Shijun; Wang, Panpan; Dong, Linsong; Zhang, Yaguang; Han, Zhaofang; Wang, Qiurong
2016-01-01
Whole-genome single-nucleotide polymorphism (SNP) markers are valuable genetic resources for association and conservation studies. Genome-wide SNP development in many teleost species is still challenging because of genome complexity and the cost of re-sequencing. Genotyping-by-sequencing (GBS) provides an efficient reduced-representation method that lowers the cost of SNP detection; however, most recent GBS applications have been reported in plants. In this work, we applied an EcoRI-NlaIII based GBS protocol to the teleost large yellow croaker, an important commercial fish in China and East Asia, and report the first whole-genome SNP development for the species. We detected 69,845 high-quality SNP markers, evenly distributed along the genome, in at least 80% of 500 individuals. Nearly 95% of randomly selected genotypes were successfully validated by the Sequenom MassARRAY assay. Association studies with muscle eicosapentaenoic acid (EPA) and docosahexaenoic acid (DHA) content discovered 39 significant SNP markers, which contributed up to ∼63% of the genetic variance explained by all markers. Functional genes involved in the fat digestion and absorption pathway were identified, such as APOB, CRAT and OSBPL10. Notably, the PPT2 gene, previously identified in an association study of plasma n-3 and n-6 polyunsaturated fatty acid levels in humans, was re-discovered in large yellow croaker. Our study verified that EcoRI-NlaIII based GBS can produce quality SNP markers in a cost-efficient manner in a teleost genome. The developed SNP markers and the EPA- and DHA-associated SNP loci provide invaluable resources for studies of population structure, conservation genetics and genomic selection in large yellow croaker and other fish. PMID:28028455
Genes, age, and alcoholism: analysis of GAW14 data.
Apprey, Victor; Afful, Joseph; Harrell, Jules P; Taylor, Robert E; Bonney, George E
2005-12-30
A genetic analysis of age of onset of alcoholism was performed on the Collaborative Study on the Genetics of Alcoholism data released for Genetic Analysis Workshop 14. Our study illustrates an application of the log-normal age of onset model in our software Genetic Epidemiology Models (GEMs). The phenotype ALDX1 of alcoholism was studied. The analysis strategy was to first find the markers of the Affymetrix SNP dataset with significant association with age of onset, and then to perform linkage analysis on them. ALDX1 revealed strong evidence of linkage for marker tsc0041591 on chromosome 2 and suggestive linkage for marker tsc0894042 on chromosome 3. The largest separation in mean ages of onset of ALDX1 was 19.76 and 24.41 between male smokers who are carriers of the risk allele of tsc0041591 and the non-carriers, respectively. Hence, male smokers who are carriers of marker tsc0041591 on chromosome 2 have an average onset of ALDX1 almost 5 years earlier than non-carriers.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gardner, Shea N.; McLoughlin, Kevin; Be, Nicholas A.
Venezuelan equine encephalitis virus (VEEV) is a mosquito-borne alphavirus that has caused large outbreaks of severe illness in both horses and humans. New approaches are needed to rapidly infer the origin of a newly discovered VEEV strain, estimate its equine amplification and resultant epidemic potential, and predict human virulence phenotype. We performed whole genome single nucleotide polymorphism (SNP) analysis of all available VEE antigenic complex genomes, verified that a SNP-based phylogeny accurately captured the features of a phylogenetic tree based on multiple sequence alignment, and developed a high resolution genome-wide SNP microarray. We used the microarray to analyze a broad panel of VEEV isolates, found excellent concordance between array- and sequence-based SNP calls, genotyped unsequenced isolates, and placed them on a phylogeny with sequenced genomes. The microarray successfully genotyped VEEV directly from tissue samples of an infected mouse, bypassing the need for viral isolation, culture and genomic sequencing. Lastly, we identified genomic variants associated with serotypes and host species, revealing a complex relationship between genotype and phenotype.
Rose, Amy E; Poliseno, Laura; Wang, Jinhua; Clark, Michael; Pearlman, Alexander; Wang, Guimin; Vega Y Saenz de Miera, Eleazar C; Medicherla, Ratna; Christos, Paul J; Shapiro, Richard; Pavlick, Anna; Darvishian, Farbod; Zavadil, Jiri; Polsky, David; Hernando, Eva; Ostrer, Harry; Osman, Iman
2011-04-01
Superficial spreading melanoma (SSM) and nodular melanoma (NM) are believed to represent sequential phases of linear progression from radial to vertical growth. Several lines of clinical, pathologic, and epidemiologic evidence suggest, however, that SSM and NM might be the result of independent pathways of tumor development. We utilized an integrative genomic approach that combines single nucleotide polymorphism array (6.0; Affymetrix) with gene expression array (U133A 2.0; Affymetrix) to examine molecular differences between SSM and NM. Pathway analysis of the most differentially expressed genes between SSM and NM (N = 114) revealed significant differences related to metabolic processes. We identified 8 genes (DIS3, FGFR1OP, G3BP2, GALNT7, MTAP, SEC23IP, USO1, and ZNF668) in which NM/SSM-specific copy number alterations correlated with differential gene expression (P < 0.05; Spearman's rank). SSM-specific genomic deletions in G3BP2, MTAP, and SEC23IP were independently verified in two external data sets. Forced overexpression of metabolism-related gene MTAP (methylthioadenosine phosphorylase) in SSM resulted in reduced cell growth. The differential expression of another metabolic-related gene, aldehyde dehydrogenase 7A1 (ALDH7A1), was validated at the protein level by using tissue microarrays of human melanoma. In addition, we show that the decreased ALDH7A1 expression in SSM may be the result of epigenetic modifications. Our data reveal recurrent genomic deletions in SSM not present in NM, which challenge the linear model of melanoma progression. Furthermore, our data suggest a role for altered regulation of metabolism-related genes as a possible cause of the different clinical behavior of SSM and NM.
Peng, Ke; Liu, Ruiqi; Yu, Yiyi; Liang, Li; Yu, Shan; Xu, Xiaojing; Liu, Tianshu
2018-01-01
Cetuximab is one of the most widely used epidermal growth factor receptor (EGFR) inhibitors for treating patients with metastatic colorectal cancer (mCRC) who have wild-type RAS/RAF status. However, primary and acquired resistance to cetuximab is often found during targeted therapy. To gain insights into the functions of long non-coding RNA (lncRNA) in cetuximab resistance, we used a lncRNA-mining approach to distinguish lncRNA-specific probes in Affymetrix HG-U133A 2.0 arrays. We then performed lncRNA expression profiling in a cetuximab-treated mCRC cohort from the Gene Expression Omnibus (GEO). The candidate lncRNAs were further validated in cell lines with acquired cetuximab resistance and in clinical samples from our hospital. The functions and associated pathways of the prognostic lncRNA were predicted by GO and KEGG analyses. 249 lncRNA-specific probe sets (corresponding to 212 lncRNAs) were represented in Affymetrix HG-U133A 2.0 arrays. We found that 9 lncRNAs were differentially expressed between the disease control group (DCG) and non-responders, and 5 of these 9 lncRNAs were significantly associated with the progression-free survival (PFS) of the patients. Among those 5 lncRNAs, POU5F1P4 was also down-regulated in cells with acquired cetuximab resistance, as well as in cetuximab-resistant patients. Downregulation of POU5F1P4 decreased the sensitivity of colorectal cancer cells to cetuximab. Our findings indicate potential roles of lncRNAs in cetuximab resistance, and may provide useful information for the discovery of new biomarkers and therapeutic targets. Copyright © 2017 Elsevier Masson SAS. All rights reserved.
Draghici, Sorin; Tarca, Adi L; Yu, Longfei; Ethier, Stephen; Romero, Roberto
2008-03-01
The BioArray Software Environment (BASE) is a very popular MIAME-compliant, web-based microarray data repository. However, in BASE, as in most other microarray data repositories, experiment annotation and raw data uploading can be very time-consuming, especially for large microarray experiments. We developed KUTE (Karmanos Universal daTabase for microarray Experiments) as a plug-in for BASE 2.0 that addresses these issues. KUTE provides an automatic experiment annotation feature and a completely redesigned data workflow that dramatically reduce the human-computer interaction time. For instance, in BASE 2.0 a typical Affymetrix experiment involving 100 arrays required 4 h 30 min of user interaction time for experiment annotation, and 45 min for data upload/download. In contrast, for the same experiment, KUTE required only 28 min of user interaction time for experiment annotation, and 3.3 min for data upload/download. http://vortex.cs.wayne.edu/kute/index.html.
Genome-wide association study for milking speed in French Holstein cows.
Marete, Andrew; Sahana, Goutam; Fritz, Sébastien; Lefebvre, Rachel; Barbat, Anne; Lund, Mogens Sandø; Guldbrandtsen, Bernt; Boichard, Didier
2018-04-25
Using a combination of data from the BovineSNP50 BeadChip SNP array (Illumina, San Diego, CA) and a EuroGenomics (Amsterdam, the Netherlands) custom single nucleotide polymorphism (SNP) chip with SNP pre-selected from whole genome sequence data, we carried out an association study of milking speed in 32,491 French Holstein dairy cows. Milking speed was measured by a score given by the farmer. Phenotypes were yield deviations as obtained from the French evaluation system. They were analyzed with a linear mixed model for association studies. We identified SNP on 22 chromosomes significantly associated with milking speed. As clinical mastitis and somatic cell score have an unfavorable genetic correlation with milking speed, we tested whether the most significant SNP on these 22 chromosomes associated with milking speed were also associated with clinical mastitis or somatic cell score. Nine hundred seventy-one genome-wide significant SNP were associated with milking speed. Of these, 86 were associated with clinical mastitis and 198 with somatic cell score. The most significant association signals for milking speed were observed on chromosomes 7, 8, 10, 14, and 18. The most significant signal was located on chromosome 14 (ZFAT gene). Eleven novel milking speed quantitative trait loci (QTL) were observed on chromosomes 7, 10, 11, 14, 18, 25, and 26. Twelve candidate SNP for milking speed mapped directly within genes. Of these 10 were QTL lead SNP, which mapped within the genes HMHA1, POLR2E, GNB5, KLHL29, ZFAT, KCNB2, CEACAM18, CCL24, and LHPP. Limited pleiotropy was observed between milking speed QTL and clinical mastitis. Copyright © 2018 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
Vallejo, Roger L; Silva, Rafael M O; Evenhuis, Jason P; Gao, Guangtu; Liu, Sixin; Parsons, James E; Martin, Kyle E; Wiens, Gregory D; Lourenco, Daniela A L; Leeds, Timothy D; Palti, Yniv
2018-06-05
Previously, accurate genomic predictions for bacterial cold water disease (BCWD) resistance in rainbow trout were obtained using a medium-density single nucleotide polymorphism (SNP) array. Here, the impact of lower-density SNP panels on the accuracy of genomic predictions was investigated in a commercial rainbow trout breeding population. Using progeny performance data, the accuracy of genomic breeding values (GEBV) using 35K, 10K, 3K, 1K, 500, 300 and 200 SNP panels as well as a panel with 70 quantitative trait loci (QTL)-flanking SNP was compared. The GEBVs were estimated using the Bayesian method BayesB, single-step GBLUP (ssGBLUP) and weighted ssGBLUP (wssGBLUP). The accuracy of GEBVs remained high despite the sharp reductions in SNP density, and even with 500 SNP accuracy was higher than the pedigree-based prediction (0.50-0.56 versus 0.36). Furthermore, the prediction accuracy with the 70 QTL-flanking SNP (0.65-0.72) was similar to the panel with 35K SNP (0.65-0.71). Genomewide linkage disequilibrium (LD) analysis revealed strong LD (r² ≥ 0.25) spanning on average over 1 Mb across the rainbow trout genome. This long-range LD likely contributed to the accurate genomic predictions with the low-density SNP panels. Population structure analysis supported the hypothesis that long-range LD in this population may be caused by admixture. Results suggest that lower-cost, low-density SNP panels can be used for implementing genomic selection for BCWD resistance in rainbow trout breeding programs. © 2018 The Authors. This article is a U.S. Government work and is in the public domain in the USA. Journal of Animal Breeding and Genetics published by Blackwell Verlag GmbH.
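For readers unfamiliar with the LD statistic quoted above, the following Python sketch computes the standard pairwise r² between two biallelic SNPs from phased haplotypes coded 0/1. It is a textbook calculation shown under stated assumptions; the function name and toy haplotypes are hypothetical, and the abstract does not say which software the authors used.

```python
def ld_r2(hap_a, hap_b):
    """Pairwise LD r^2 between two biallelic SNPs.

    `hap_a` and `hap_b` are equal-length sequences of 0/1 alleles, one
    entry per phased haplotype. Returns None when either SNP is
    monomorphic, because r^2 is undefined in that case.
    """
    if len(hap_a) != len(hap_b) or not hap_a:
        raise ValueError("haplotype lists must be non-empty and equal length")
    n = len(hap_a)
    p_a = sum(hap_a) / n                                   # allele frequency, SNP A
    p_b = sum(hap_b) / n                                   # allele frequency, SNP B
    p_ab = sum(a * b for a, b in zip(hap_a, hap_b)) / n    # joint 1-1 haplotype frequency
    d = p_ab - p_a * p_b                                   # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return None
    return d * d / denom


# Toy haplotypes (hypothetical): perfectly linked vs. independent SNPs.
linked = ld_r2([0, 1, 0, 1], [0, 1, 0, 1])
independent = ld_r2([0, 0, 1, 1], [0, 1, 0, 1])
```

A pair of SNPs would count toward the "strong LD" class in the abstract when this value is at least 0.25.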
Liao, Can; Fu, Fang; Yang, Xin; Sun, Yi-Min; Li, Dong-Zhi
2011-06-01
Primary ovarian insufficiency (POI) is defined as a primary ovarian defect characterized by absent menarche (primary amenorrhea) or premature depletion of ovarian follicles before the age of 40 years. The etiology of primary ovarian insufficiency in human female patients is still unclear. The purpose of this study is to investigate the potential genetic causes in primary amenorrhea patients by high resolution array based comparative genomic hybridization (array-CGH) analysis. Following the standard karyotyping analysis, genomic DNA from whole blood of 15 primary amenorrhea patients and 15 normal control women was hybridized with Affymetrix cytogenetic 2.7M arrays following the standard protocol. Copy number variations identified by array-CGH were confirmed by real time polymerase chain reaction. All the 30 samples were negative by conventional karyotyping analysis. Microdeletions on chromosome 17q21.31-q21.32 with approximately 1.3 Mb were identified in four patients by high resolution array-CGH analysis. This included the female reproductive secretory pathway related factor N-ethylmaleimide-sensitive factor (NSF) gene. The results of the present study suggest that there may be critical regions regulating primary ovarian insufficiency in women with a 17q21.31-q21.32 microdeletion. This effect might be due to the loss of function of the NSF gene/genes within the deleted region or to effects on contiguous genes.
High-density SNP assay development for genetic analysis in maritime pine (Pinus pinaster).
Plomion, C; Bartholomé, J; Lesur, I; Boury, C; Rodríguez-Quilón, I; Lagraulet, H; Ehrenmann, F; Bouffier, L; Gion, J M; Grivet, D; de Miguel, M; de María, N; Cervera, M T; Bagnoli, F; Isik, F; Vendramin, G G; González-Martínez, S C
2016-03-01
Maritime pine provides essential ecosystem services in the south-western Mediterranean basin, where it covers around 4 million ha. Its scattered distribution over a range of environmental conditions makes it an ideal forest tree species for studies of local adaptation and evolutionary responses to climatic change. Highly multiplexed single nucleotide polymorphism (SNP) genotyping arrays are increasingly used to study genetic variation in living organisms and for practical applications in plant and animal breeding and genetic resource conservation. We developed a 9k Illumina Infinium SNP array and genotyped maritime pine trees from (i) a three-generation inbred (F2) pedigree, (ii) the French breeding population and (iii) natural populations from Portugal and the French Atlantic coast. A large proportion of the exploitable SNPs (2052/8410, i.e. 24.4%) segregated in the mapping population and could be mapped, providing the densest ever gene-based linkage map for this species. Based on 5016 SNPs, natural and breeding populations from the French gene pool exhibited similar level of genetic diversity. Population genetics and structure analyses based on 3981 SNP markers common to the Portuguese and French gene pools revealed high levels of differentiation, leading to the identification of a set of highly differentiated SNPs that could be used for seed provenance certification. Finally, we discuss how the validated SNPs could facilitate the identification of ecologically and economically relevant genes in this species, improving our understanding of the demography and selective forces shaping its natural genetic diversity, and providing support for new breeding strategies. © 2015 John Wiley & Sons Ltd.
A high-density intraspecific SNP linkage map of pigeonpea (Cajanus cajan L. Millsp.)
Mandal, Paritra; Bhutani, Shefali; Dutta, Sutapa; Kumawat, Giriraj; Singh, Bikram Pratap; Chaudhary, A. K.; Yadav, Rekha; Gaikwad, K.; Sevanthi, Amitha Mithra; Datta, Subhojit; Raje, Ranjeet S.; Sharma, Tilak R.; Singh, Nagendra Kumar
2017-01-01
Pigeonpea (Cajanus cajan (L.) Millsp.) is a major food legume cultivated in semi-arid tropical regions including the Indian subcontinent, Africa, and Southeast Asia. It is an important source of protein, minerals, and vitamins for nearly 20% of the world population. Due to high carbon sequestration and drought tolerance, pigeonpea is an important crop for the development of climate resilient agriculture and nutritional security. However, pigeonpea productivity has remained low for decades because of limited genetic and genomic resources, and sparse utilization of landraces and wild pigeonpea germplasm. Here, we present a dense intraspecific linkage map of pigeonpea comprising 932 markers that span a total adjusted map length of 1,411.83 cM. The consensus map is based on three different linkage maps that incorporate a large number of single nucleotide polymorphism (SNP) markers derived from next generation sequencing data, using Illumina GoldenGate bead arrays, and genotyping with restriction site associated DNA (RAD) sequencing. The genotyping-by-sequencing enhanced the marker density but was met with limited success due to lack of common markers across the genotypes of mapping population. The integrated map has 547 bead-array SNP, 319 RAD-SNP, and 65 simple sequence repeat (SSR) marker loci. We also show here correspondence between our linkage map and published genome pseudomolecules of pigeonpea. The availability of a high-density linkage map will help improve the anchoring of the pigeonpea genome to its chromosomes and the mapping of genes and quantitative trait loci associated with useful agronomic traits. PMID:28654689
Tsai, Hsin Y; Robledo, Diego; Lowe, Natalie R; Bekaert, Michael; Taggart, John B; Bron, James E; Houston, Ross D
2016-07-07
High density linkage maps are useful tools for fine-scale mapping of quantitative trait loci, and characterization of the recombination landscape of a species' genome. Genomic resources for Atlantic salmon (Salmo salar) include a well-assembled reference genome, and high density single nucleotide polymorphism (SNP) arrays. Our aim was to create a high density linkage map, and to align it with the reference genome assembly. Over 96,000 SNPs were mapped and ordered on the 29 salmon linkage groups using a pedigreed population comprising 622 fish from 60 nuclear families, all genotyped with the 'ssalar01' high density SNP array. The number of SNPs per group showed a high positive correlation with physical chromosome length (r = 0.95). While the order of markers on the genetic and physical maps was generally consistent, areas of discrepancy were identified. Approximately 6.5% of the previously unmapped reference genome sequence was assigned to chromosomes using the linkage map. Male recombination rate was lower than females across the vast majority of the genome, but with a notable peak in subtelomeric regions. Finally, using RNA-Seq data to annotate the reference genome, the mapped SNPs were categorized according to their predicted function, including annotation of ∼2500 putative nonsynonymous variants. The highest density SNP linkage map for any salmonid species has been created, annotated, and integrated with the Atlantic salmon reference genome assembly. This map highlights the marked heterochiasmy of salmon, and provides a useful resource for salmonid genetics and genomics research. Copyright © 2016 Tsai et al.
Helm, Benjamin M; Langley, Katherine; Spangler, Brooke; Vergano, Samantha
2014-08-01
Single nucleotide polymorphism microarrays have the ability to reveal parental consanguinity which may or may not be known to healthcare providers. Consanguinity can have significant implications for the health of patients and for individual and family psychosocial well-being. These results often present ethical and legal dilemmas that can have important ramifications. Unexpected consanguinity can be confounding to healthcare professionals who may be unprepared to handle these results or to communicate them to families or other appropriate representatives. There are few published accounts of experiences with consanguinity and SNP arrays. In this paper we discuss three cases where molecular evidence of parental incest was identified by SNP microarray. We hope to further highlight consanguinity as a potential incidental finding, how the cases were handled by the clinical team, and what resources were found to be most helpful. This paper aims to contribute further to professional discourse on incidental findings with genomic technology and how they were addressed clinically. These experiences may provide some guidance on how others can prepare for these findings and help improve practice. As genetic and genomic testing is utilized more by non-genetics providers, we also hope to inform about the importance of engaging with geneticists and genetic counselors when addressing these findings.
Laios, Eleftheria; Drogari, Euridiki
2006-12-01
Three mutations in the low density lipoprotein receptor (LDLR) gene account for 49% of familial hypercholesterolemia (FH) cases in Greece. We used the microelectronic array technology of the NanoChip Molecular Biology Workstation to develop a multiplex method to analyze these single-nucleotide polymorphisms (SNPs). Primer pairs amplified the region encompassing each SNP. The biotinylated PCR amplicon was electronically addressed to streptavidin-coated microarray sites. Allele-specific fluorescently labeled oligonucleotide reporters were designed and used for detection of wild-type and SNP sequences. Genotypes were compared to PCR-restriction fragment length polymorphism (PCR-RFLP). We developed three monoplex assays (1 SNP/site) and an optimized multiplex assay (3 SNPs/site). We performed 92 Greece II, 100 Genoa, and 98 Afrikaner-2 NanoChip monoplex assays (addressed to duplicate sites and analyzed separately). Of the 580 monoplex genotypings (290 samples), 579 agreed with RFLP. Duplicate sites of one sample were not in agreement with each other. Of the 580 multiplex genotypings, 576 agreed with the monoplex results. Duplicate sites of three samples were not in agreement with each other, indicating a need for repetition, upon which the discrepancies were resolved. The multiplex assay detects common LDLR mutations in Greek FH patients and can be extended to accommodate additional mutations.
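The agreement figures above can be expressed as concordance rates. The Python sketch below is illustrative only: the `concordance` helper and the toy genotype labels are hypothetical, while the counts 579/580 and 576/580 are taken from the abstract.

```python
def concordance(calls_a, calls_b):
    """Fraction of paired genotype calls that agree between two methods."""
    if len(calls_a) != len(calls_b):
        raise ValueError("call lists must have the same length")
    if not calls_a:
        raise ValueError("no calls to compare")
    agree = sum(a == b for a, b in zip(calls_a, calls_b))
    return agree / len(calls_a)


# Toy example with hypothetical calls (not data from the study).
nanochip = ["wt/wt", "wt/mut", "mut/mut", "wt/wt"]
rflp = ["wt/wt", "wt/mut", "wt/mut", "wt/wt"]
toy_rate = concordance(nanochip, rflp)

# Rates implied by the counts reported in the abstract.
monoplex_vs_rflp = 579 / 580
multiplex_vs_monoplex = 576 / 580
print(f"monoplex vs RFLP: {monoplex_vs_rflp:.1%}")
print(f"multiplex vs monoplex: {multiplex_vs_monoplex:.1%}")
```

The reported counts correspond to about 99.8% agreement for the monoplex assays and about 99.3% for the multiplex assay before repetition.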
Lepoittevin, Camille; Frigerio, Jean-Marc; Garnier-Géré, Pauline; Salin, Franck; Cervera, María-Teresa; Vornam, Barbara; Harvengt, Luc; Plomion, Christophe
2010-01-01
Background: There is considerable interest in the high-throughput discovery and genotyping of single nucleotide polymorphisms (SNPs) to accelerate genetic mapping and enable association studies. This study provides an assessment of EST-derived and resequencing-derived SNP quality in maritime pine (Pinus pinaster Ait.), a conifer characterized by a huge genome size (∼23.8 Gb/C). Methodology/Principal Findings: A 384-SNP GoldenGate genotyping array was built from (i) 184 SNPs originally detected in a set of 40 re-sequenced candidate genes (in vitro SNPs), chosen on the basis of functionality scores, presence of neighboring polymorphisms, minor allele frequencies and linkage disequilibrium, and (ii) 200 SNPs screened from ESTs (in silico SNPs), selected based on the number of ESTs used for SNP detection, the SNP minor allele frequency and the quality of SNP flanking sequences. The global success rate of the assay was 66.9%, and a conversion rate (considering only polymorphic SNPs) of 51% was achieved. In vitro SNPs showed significantly higher genotyping-success and conversion rates than in silico SNPs (+11.5% and +18.5%, respectively). The reproducibility was 100%, and the genotyping error rate was very low (0.54%, dropping to 0.06% when removing four SNPs showing elevated error rates). Conclusions/Significance: This study demonstrates that ESTs provide a resource for SNP identification in non-model species that requires no additional bench work and little bioinformatics analysis. However, the time and cost benefits of in silico SNPs are counterbalanced by a lower conversion rate than in vitro SNPs. This drawback is acceptable for population-based experiments, but could be dramatic in experiments involving samples from narrow genetic backgrounds. 
In addition, we showed that both the visual inspection of genotyping clusters and the estimation of a per SNP error rate should help identify markers that are not suitable to the GoldenGate technology in species characterized by a large and complex genome. PMID:20543950
Troggio, Michela; Surbanovski, Nada; Bianco, Luca; Moretto, Marco; Giongo, Lara; Banchi, Elisa; Viola, Roberto; Fernández, Felicdad Fernández; Costa, Fabrizio; Velasco, Riccardo; Cestaro, Alessandro; Sargent, Daniel James
2013-01-01
High throughput arrays for the simultaneous genotyping of thousands of single-nucleotide polymorphisms (SNPs) have made the rapid genetic characterisation of plant genomes and the development of saturated linkage maps a realistic prospect for many plant species of agronomic importance. However, the correct calling of SNP genotypes in divergent polyploid genomes using array technology can be problematic due to paralogy, and to divergence in probe sequences causing changes in probe binding efficiencies. An Illumina Infinium II whole-genome genotyping array was recently developed for the cultivated apple and used to develop a molecular linkage map for an apple rootstock progeny (M432), but a large proportion of segregating SNPs were not mapped in the progeny, due to unexpected genotype clustering patterns. To investigate the causes of this unexpected clustering we performed BLAST analysis of all probe sequences against the 'Golden Delicious' genome sequence and discovered evidence for paralogous annealing sites and probe sequence divergence for a high proportion of probes contained on the array. Following visual re-evaluation of the genotyping data generated for 8,788 SNPs for the M432 progeny using the array, we manually re-scored genotypes at 818 loci and mapped a further 797 markers to the M432 linkage map. The newly mapped markers included the majority of those that could not be mapped previously, as well as loci that were previously scored as monomorphic, but which segregated due to divergence leading to heterozygosity in probe annealing sites. An evaluation of the 8,788 probes in a diverse collection of Malus germplasm showed that more than half the probes returned genotype clustering patterns that were difficult or impossible to interpret reliably, highlighting implications for the use of the array in genome-wide association studies.
Tomato breeding in the genomics era: insights from a SNP array.
Víquez-Zamora, Marcela; Vosman, Ben; van de Geest, Henri; Bovy, Arnaud; Visser, Richard G F; Finkers, Richard; van Heusden, Adriaan W
2013-05-27
The major bottleneck in genetic and linkage studies in tomato has been the lack of a sufficient number of molecular markers. This has radically changed with the application of next generation sequencing and high throughput genotyping. A set of 6000 SNPs was identified and 5528 of them were used to evaluate tomato germplasm at the level of species, varieties and segregating populations. Of the 5528 SNPs, 1980 originated from 454-sequencing, 3495 from Illumina Solexa sequencing and 53 were additional known markers. Genotyping different tomato samples allowed the evaluation of the level of heterozygosity and introgressions among commercial varieties. Cherry tomatoes were especially different from round/beefs in chromosomes 4, 5 and 12. We were able to identify a set of 750 unique markers distinguishing S. lycopersicum 'Moneymaker' from all its distantly related wild relatives. Clustering and neighbour joining analysis among varieties and species showed expected grouping patterns, with S. pimpinellifolium as the most closely related to commercial tomatoes, consistent with earlier results. Our results show that a SNP search in only a few breeding lines already provides generally applicable markers in tomato and its wild relatives. They also show that the data generated by the Illumina bead array are highly reproducible. Our SNPs can roughly be divided into two categories: SNPs of which both forms are present in the wild relatives and in domesticated tomatoes (originating from common ancestors) and SNPs unique to the domesticated tomato (originating after the domestication event). The SNPs can be used for genotyping, identification of varieties, comparison of genetic and physical linkage maps and to confirm (phylogenetic) relations. The SNPs used for the array hardly overlap with the SolCAP array, and it is strongly recommended to combine both SNP sets and to select a core collection of robust SNPs completely covering the entire tomato genome.
A SNP resource for Douglas-fir: de novo transcriptome assembly and SNP detection and validation.
Howe, Glenn T; Yu, Jianbin; Knaus, Brian; Cronn, Richard; Kolpak, Scott; Dolan, Peter; Lorenz, W Walter; Dean, Jeffrey F D
2013-02-28
Douglas-fir (Pseudotsuga menziesii), one of the most economically and ecologically important tree species in the world, also has one of the largest tree breeding programs. Although the coastal and interior varieties of Douglas-fir (vars. menziesii and glauca) are native to North America, the coastal variety is also widely planted for timber production in Europe, New Zealand, Australia, and Chile. Our main goal was to develop a SNP resource large enough to facilitate genomic selection in Douglas-fir breeding programs. To accomplish this, we developed a 454-based reference transcriptome for coastal Douglas-fir, annotated and evaluated the quality of the reference, identified putative SNPs, and then validated a sample of those SNPs using the Illumina Infinium genotyping platform. We assembled a reference transcriptome consisting of 25,002 isogroups (unique gene models) and 102,623 singletons from 2.76 million 454 and Sanger cDNA sequences from coastal Douglas-fir. We identified 278,979 unique SNPs by mapping the 454 and Sanger sequences to the reference, and by mapping four datasets of Illumina cDNA sequences from multiple seed sources, genotypes, and tissues. The Illumina datasets represented coastal Douglas-fir (64.00 and 13.41 million reads), interior Douglas-fir (80.45 million reads), and a Yakima population similar to interior Douglas-fir (8.99 million reads). We assayed 8067 SNPs on 260 trees using an Illumina Infinium SNP genotyping array. Of these SNPs, 5847 (72.5%) were called successfully and were polymorphic. Based on our validation efficiency, our SNP database may contain as many as ~200,000 true SNPs, and as many as ~69,000 SNPs that could be genotyped at ~20,000 gene loci using an Infinium II array, which is more SNPs than are needed to use genomic selection in tree breeding programs. 
Ultimately, these genomic resources will enhance Douglas-fir breeding and allow us to better understand landscape-scale patterns of genetic variation and potential responses to climate change.
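The validation efficiency and the "~200,000 true SNPs" figure can be reproduced approximately with simple arithmetic. The Python sketch below assumes that the estimate is the validation rate applied to the whole set of putative SNPs; the abstract does not spell out the calculation, so this extrapolation and the function names are assumptions.

```python
def validation_rate(validated, assayed):
    """Share of assayed SNPs that were called successfully and polymorphic."""
    return validated / assayed


def estimated_true_snps(putative, rate):
    """Extrapolate the validation rate to all putative SNPs (assumed method)."""
    return putative * rate


# Counts reported in the abstract.
rate = validation_rate(validated=5847, assayed=8067)
estimate = estimated_true_snps(putative=278_979, rate=rate)
print(f"validation rate: {rate:.1%}")
print(f"estimated true SNPs: {estimate:,.0f}")
```

Under this assumption the extrapolation gives roughly 202,000 SNPs, in line with the rounded figure in the abstract. The ~69,000 genotypable SNPs depend on design constraints not given here and are not reproduced.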
Sulovari, Arvis; Li, Dawei
2014-07-19
Genome-wide association studies (GWAS) have successfully identified genes associated with complex human diseases. Although much of the heritability remains unexplained, combining single nucleotide polymorphism (SNP) genotypes from multiple studies for meta-analysis will increase the statistical power to identify new disease-associated variants. Meta-analysis requires the same allele definition (nomenclature) and genome build across the individual studies. Similarly, imputation, commonly used prior to meta-analysis, requires the same consistency. However, the genotypes from various GWAS are generated using different genotyping platforms, arrays or SNP-calling approaches, resulting in the use of different genome builds and allele definitions. Incorrect assumptions of identical allele definition among combined GWAS lead to a large portion of discarded genotypes or incorrect association findings. There is no published tool that predicts and converts among all major allele definitions. In this study, we have developed a tool, GACT, which stands for Genome build and Allele definition Conversion Tool, that predicts and inter-converts between any of the common SNP allele definitions and between the major genome builds. In addition, we assessed several factors that may affect imputation quality, and our results indicated that inclusion of singletons in the reference had detrimental effects while ambiguous SNPs had no measurable effect. Unexpectedly, exclusion of genotypes with a missing rate > 0.001 (40% of study SNPs) showed no significant decrease of imputation quality (even significantly higher when compared to the imputation with singletons in the reference), especially for rare SNPs.
GACT is a new, powerful, and user-friendly tool with both command-line and interactive online versions that can accurately predict, and convert between any of the common allele definitions and between genome builds for genome-wide meta-analysis and imputation of genotypes from SNP-arrays or deep-sequencing, particularly for data from the dbGaP and other public databases. http://www.uvm.edu/genomics/software/gact.
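To make the allele-definition problem concrete, the sketch below shows two building blocks that any such conversion needs: complementing alleles between strands, and flagging "ambiguous" A/T and C/G SNPs whose strand cannot be inferred from the alleles alone. This is an illustrative sketch, not GACT's actual algorithm or interface; the function names are hypothetical.

```python
# Watson-Crick complements used to move a genotype to the opposite strand.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def is_strand_ambiguous(allele1, allele2):
    """Return True for A/T and C/G SNPs, which read the same on both strands."""
    return COMPLEMENT[allele1] == allele2


def flip_strand(genotype):
    """Complement each allele of a genotype string such as 'AG'."""
    return "".join(COMPLEMENT[allele] for allele in genotype)


print(is_strand_ambiguous("A", "T"))  # ambiguous pair
print(is_strand_ambiguous("A", "G"))  # strand can be resolved
print(flip_strand("AG"))
```

For an A/G SNP, seeing T/C genotypes in a second study signals the opposite strand; for an A/T SNP no such signal exists, which is why ambiguous SNPs are usually handled separately.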
Talseth-Palmer, Bente A; Holliday, Elizabeth G; Evans, Tiffany-Jane; McEvoy, Mark; Attia, John; Grice, Desma M; Masson, Amy L; Meldrum, Cliff; Spigelman, Allan; Scott, Rodney J
2013-03-26
Hereditary non-polyposis colorectal cancer (HNPCC)/Lynch syndrome (LS) is a cancer syndrome characterised by early-onset epithelial cancers, especially colorectal cancer (CRC) and endometrial cancer. The aim of the current study was to use SNP-array technology to identify genomic aberrations which could contribute to the increased risk of cancer in HNPCC/LS patients. Individuals diagnosed with HNPCC/LS (100) and healthy controls (384) were genotyped using the Illumina Human610-Quad SNP-arrays. Copy number variation (CNV) calling and association analyses were performed using Nexus software, with significant results validated using QuantiSNP. TaqMan Copy-Number assays were used for verification of CNVs showing significant association with HNPCC/LS identified by both software programs. We detected copy number (CN) gains associated with HNPCC/LS status on chromosome 7q11.21 (28% cases and 0% controls, Nexus; p =3.60E-20 and QuantiSNP; p < 1.00E-16) and 16p11.2 (46% in cases, while a CN loss was observed in 23% of controls, Nexus; p = 4.93E-21 and QuantiSNP; p = 5.00E-06) via in silico analyses. TaqMan Copy-Number assay was used for validation of CNVs showing significant association with HNPCC/LS. In addition, CNV burden (total CNV length, average CNV length and number of observed CNV events) was significantly greater in cases compared to controls. A greater CNV burden was identified in HNPCC/LS cases compared to controls supporting the notion of higher genomic instability in these patients. One intergenic locus on chromosome 7q11.21 is possibly associated with HNPCC/LS and deserves further investigation. The results from this study highlight the complexities of fluorescent based CNV analyses. The inefficiency of both CNV detection methods to reproducibly detect observed CNVs demonstrates the need for sequence data to be considered alongside intensity data to avoid false positive results.
Wang, Jiao; Chu, Shanshan; Zhang, Huairen; Zhu, Ying; Cheng, Hao; Yu, Deyue
2016-01-01
Domestication of soybeans occurred under the intense human-directed selections aimed at developing high-yielding lines. Tracing the domestication history and identifying the genes underlying soybean domestication require further exploration. Here, we developed a high-throughput NJAU 355 K SoySNP array and used this array to study the genetic variation patterns in 367 soybean accessions, including 105 wild soybeans and 262 cultivated soybeans. The population genetic analysis suggests that cultivated soybeans have tended to originate from northern and central China, from where they spread to other regions, accompanied with a gradual increase in seed weight. Genome-wide scanning for evidence of artificial selection revealed signs of selective sweeps involving genes controlling domestication-related agronomic traits including seed weight. To further identify genomic regions related to seed weight, a genome-wide association study (GWAS) was conducted across multiple environments in wild and cultivated soybeans. As a result, a strong linkage disequilibrium region on chromosome 20 was found to be significantly correlated with seed weight in cultivated soybeans. Collectively, these findings should provide an important basis for genomic-enabled breeding and advance the study of functional genomics in soybean. PMID:26856884
Comparing CNV detection methods for SNP arrays.
Winchester, Laura; Yau, Christopher; Ragoussis, Jiannis
2009-09-01
Data from whole genome association studies can now be used for dual purposes, genotyping and copy number detection. In this review we discuss some of the methods for using SNP data to detect copy number events. We examine a number of algorithms designed to detect copy number changes through the use of signal-intensity data and consider methods to evaluate the changes found. We describe the use of several statistical models in copy number detection in germline samples. We also present a comparison of data using these methods to assess accuracy of prediction and detection of changes in copy number.
User-friendly solutions for microarray quality control and pre-processing on ArrayAnalysis.org
Eijssen, Lars M. T.; Jaillard, Magali; Adriaens, Michiel E.; Gaj, Stan; de Groot, Philip J.; Müller, Michael; Evelo, Chris T.
2013-01-01
Quality control (QC) is crucial for any scientific method producing data. Applying adequate QC introduces new challenges in the genomics field where large amounts of data are produced with complex technologies. For DNA microarrays, specific algorithms for QC and pre-processing including normalization have been developed by the scientific community, especially for expression chips of the Affymetrix platform. Many of these have been implemented in the statistical scripting language R and are available from the Bioconductor repository. However, application is hampered by lack of integrative tools that can be used by users of any experience level. To fill this gap, we developed a freely available tool for QC and pre-processing of Affymetrix gene expression results, extending, integrating and harmonizing functionality of Bioconductor packages. The tool can be easily accessed through a wizard-like web portal at http://www.arrayanalysis.org or downloaded for local use in R. The portal provides extensive documentation, including user guides, interpretation help with real output illustrations and detailed technical documentation. It assists newcomers to the field in performing state-of-the-art QC and pre-processing while offering data analysts an integral open-source package. Providing the scientific community with this easily accessible tool will allow improving data quality and reuse and adoption of standards. PMID:23620278
TIPMaP: a web server to establish transcript isoform profiles from reliable microarray probes.
Chitturi, Neelima; Balagannavar, Govindkumar; Chandrashekar, Darshan S; Abinaya, Sadashivam; Srini, Vasan S; Acharya, Kshitish K
2013-12-27
Standard 3' Affymetrix gene expression arrays have contributed a significantly higher volume of existing gene expression data than other microarray platforms. These arrays were designed to identify differentially expressed genes, but not their alternatively spliced transcript forms. No resource can currently identify the expression pattern of specific mRNA forms using these microarray data, even though it is possible to do this. We report a web server for expression profiling of alternatively spliced transcripts using microarray data sets from 31 standard 3' Affymetrix arrays for human, mouse and rat species. The tool has been experimentally validated for mRNAs transcribed or not-detected in a human disease condition (non-obstructive azoospermia, a male infertility condition). About 4000 gene expression datasets were downloaded from a public repository. 'Good probes' with complete coverage and identity to the latest reference transcript sequences were first identified. Using them, 'Transcript specific probe-clusters' were derived for each platform and used to identify the expression status of possible transcripts. The web server can lead the user to datasets corresponding to specific tissues or conditions via identifiers of the microarray studies or hybridizations, keywords, official gene symbols or reference transcript identifiers. It can identify, in the tissues and conditions of interest, about 40% of known transcripts as 'transcribed', 'not-detected' or 'differentially regulated'. Corresponding additional information for probes, genes, transcripts and proteins can be viewed too. We identified the expression of transcripts in a specific clinical condition and validated a few of these transcripts by experiments (using reverse transcription followed by polymerase chain reaction). The experimental observations agreed with the web server results more often than they contradicted them. The tool is accessible at http://resource.ibab.ac.in/TIPMaP.
The newly developed online tool forms a reliable means for identification of alternatively spliced transcript-isoforms that may be differentially expressed in various tissues, cell types or physiological conditions. Thus, by making better use of existing data, TIPMaP avoids the dependence on precious tissue-samples, in experiments with a goal to establish expression profiles of alternative splice forms--at least in some cases.
Barton, G; Abbott, J; Chiba, N; Huang, DW; Huang, Y; Krznaric, M; Mack-Smith, J; Saleem, A; Sherman, BT; Tiwari, B; Tomlinson, C; Aitman, T; Darlington, J; Game, L; Sternberg, MJE; Butcher, SA
2008-01-01
Background Microarray experimentation requires the application of complex analysis methods as well as the use of non-trivial computer technologies to manage the resultant large data sets. This, together with the proliferation of tools and techniques for microarray data analysis, makes it very challenging for a laboratory scientist to keep up-to-date with the latest developments in this field. Our aim was to develop a distributed e-support system for microarray data analysis and management. Results EMAAS (Extensible MicroArray Analysis System) is a multi-user rich internet application (RIA) providing simple, robust access to up-to-date resources for microarray data storage and analysis, combined with integrated tools to optimise real time user support and training. The system leverages the power of distributed computing to perform microarray analyses, and provides seamless access to resources located at various remote facilities. The EMAAS framework allows users to import microarray data from several sources to an underlying database, to pre-process, quality assess and analyse the data, to perform functional analyses, and to track data analysis steps, all through a single easy to use web portal. This interface offers distance support to users both in the form of video tutorials and via live screen feeds using the web conferencing tool EVO. A number of analysis packages, including R-Bioconductor and Affymetrix Power Tools have been integrated on the server side and are available programmatically through the Postgres-PLR library or on grid compute clusters. Integrated distributed resources include the functional annotation tool DAVID, GeneCards and the microarray data repositories GEO, CELSIUS and MiMiR. EMAAS currently supports analysis of Affymetrix 3' and Exon expression arrays, and the system is extensible to cater for other microarray and transcriptomic platforms. 
Conclusion EMAAS enables users to track and perform microarray data management and analysis tasks through a single easy-to-use web application. The system architecture is flexible and scalable to allow new array types, analysis algorithms and tools to be added with relative ease and to cope with large increases in data volume. PMID:19032776
Rodriguez-Murillo, Laura; Fromer, Menachem; Mazaika, Erica; Vardarajan, Badri; Italia, Michael; Leipzig, Jeremy; DePalma, Steven R.; Golhar, Ryan; Sanders, Stephan J.; Yamrom, Boris; Ronemus, Michael; Iossifov, Ivan; Willsey, A. Jeremy; State, Matthew W.; Kaltman, Jonathan R.; White, Peter S.; Shen, Yufeng; Warburton, Dorothy; Brueckner, Martina; Seidman, Christine; Goldmuntz, Elizabeth; Gelb, Bruce D.; Lifton, Richard; Seidman, Jonathan; Hakonarson, Hakon; Chung, Wendy K.
2014-01-01
Rationale Congenital heart disease (CHD) is among the most common birth defects. Most cases are of unknown etiology. Objective To determine the contribution of de novo copy number variants (CNVs) in the etiology of sporadic CHD. Methods and Results We studied 538 CHD trios using genome-wide dense single nucleotide polymorphism (SNP) arrays and/or whole exome sequencing (WES). Results were experimentally validated using digital droplet PCR. We compared validated CNVs in CHD cases to CNVs in 1,301 healthy control trios. The two complementary high-resolution technologies identified 63 validated de novo CNVs in 51 CHD cases. A significant increase in CNV burden was observed when comparing CHD trios with healthy trios, using either SNP array (p=7×10^−5, Odds Ratio (OR)=4.6) or WES data (p=6×10^−4, OR=3.5) and remained after removing 16% of de novo CNV loci previously reported as pathogenic (p=0.02, OR=2.7). We observed recurrent de novo CNVs on 15q11.2 encompassing CYFIP1, NIPA1, and NIPA2 and single de novo CNVs encompassing DUSP1, JUN, JUP, MED15, MED9, PTPRE, SREBF1, TOP2A, and ZEB2, genes that interact with established CHD proteins NKX2-5 and GATA4. Integrating de novo variants in WES and CNV data suggests that ETS1 is the pathogenic gene altered by 11q24.2-q25 deletions in Jacobsen syndrome and that CTBP2 is the pathogenic gene in 10q sub-telomeric deletions. Conclusions We demonstrate a significantly increased frequency of rare de novo CNVs in CHD patients compared with healthy controls and suggest several novel genetic loci for CHD. PMID:25205790
Morandi, Anita; Bonnefond, Amélie; Lobbens, Stéphane; Carotenuto, Marco; Del Giudice, Emanuele Miraglia; Froguel, Philippe; Maffeis, Claudio
2015-11-01
The Prader-Willi syndrome (PWS) is caused by lack of expression of paternal allele of the 15q11.2-q13 region, due to deletions at paternal 15q11.2-q13 (<70%), maternal uniparental disomy of chromosome 15 (mat-UPD 15) (30%) or imprinting defects (1%). Hyperphagia, intellectual disabilities/behavioral disorders, neonatal hypotonia, and hypogonadism are cardinal features for PWS. Methylation sensitive PCR (MS-PCR) of the SNRPN locus, which assesses the presence of both the unmethylated (paternal) and the methylated (maternal) allele of 15q11.2-q13, is considered a sensitive reference technique for PWS diagnosis regardless of genetic subtype. We describe a 17-year-old girl with severe obesity, short stature, and intellectual disability, without hypogonadism and history of neonatal hypotonia, who was suspected to have an incomplete PWS. The MS-PCR showed a normal pattern with similar maternal and paternal electrophoretic bands. Afterwards, a SNP array showed the presence of iso-UPD 15, that is, UPD15 with two copies of the same chromosome 15, in about 50% of cells, suggesting a diagnosis of partial PWS due to mosaic maternal iso-UPD15 arisen as rescue of a post-fertilization error. A quantitative methylation analysis confirmed the presence of mosaic UPD15 in about 50% of cells. We propose that complete clinical criteria for PWS and MS-PCR should not be considered sensitive in suspecting and diagnosing partial PWS due to mosaic UPD15. In contrast, clinical suspicion based on less restrictive criteria followed by SNP array is a more powerful approach to diagnose atypical PWS due to UPD15 mosaicism. © 2015 Wiley Periodicals, Inc.
Microarray labeling extension values: laboratory signatures for Affymetrix GeneChips
Lee, Yun-Shien; Chen, Chun-Houh; Tsai, Chi-Neu; Tsai, Chia-Lung; Chao, Angel; Wang, Tzu-Hao
2009-01-01
Interlaboratory comparison of microarray data, even when using the same platform, poses several challenges for scientists. RNA quality, RNA labeling efficiency, hybridization procedures and data-mining tools can all contribute variations in each laboratory. In Affymetrix GeneChips, about 11–20 different 25-mer oligonucleotides are used to measure the level of each transcript. Here, we report that ‘labeling extension values (LEVs)’, which are correlation coefficients between probe intensities and probe positions, are highly correlated with the gene expression levels (GEVs) on eukaryotic Affymetrix microarray data. By analyzing LEVs and GEVs in the publicly available 2414 cel files of 20 Affymetrix microarray types covering 13 species, we found that correlations between LEVs and GEVs only exist in eukaryotic RNAs, but not in prokaryotic ones. Surprisingly, Affymetrix results of the same specimens that were analyzed in different laboratories could be clearly differentiated only by LEVs, leading to the identification of ‘laboratory signatures’. In the examined dataset, GSE10797, filtering out high-LEV genes did not compromise the discovery of biological processes that are constructed by differentially expressed genes. In conclusion, LEVs provide a new filtering parameter for microarray analysis of gene expression and may improve the inter- and intralaboratory comparability of Affymetrix GeneChips data. PMID:19295132
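The abstract defines an LEV as a correlation coefficient between probe intensities and probe positions within one probe set. A minimal sketch of that definition, assuming a Pearson correlation and using made-up intensities for an 11-probe set (the abstract does not say which coefficient or how positions are encoded, so both are assumptions here):

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


# Hypothetical probe set: probe order along the transcript and the
# intensity measured for each probe (values invented for illustration).
probe_positions = list(range(1, 12))
probe_intensities = [120, 135, 150, 149, 170, 182, 190, 205, 230, 228, 260]

lev = pearson(probe_positions, probe_intensities)
print(f"LEV for this probe set: {lev:.3f}")
```

A value near +1 or -1 would indicate that intensity changes steadily along the probe positions, which is the kind of labeling-dependent trend the authors describe.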
Characterization of genetic variability of Venezuelan equine encephalitis viruses
Gardner, Shea N.; McLoughlin, Kevin; Be, Nicholas A.; ...
2016-04-07
Venezuelan equine encephalitis virus (VEEV) is a mosquito-borne alphavirus that has caused large outbreaks of severe illness in both horses and humans. New approaches are needed to rapidly infer the origin of a newly discovered VEEV strain, estimate its equine amplification and resultant epidemic potential, and predict human virulence phenotype. We performed whole genome single nucleotide polymorphism (SNP) analysis of all available VEE antigenic complex genomes, verified that a SNP-based phylogeny accurately captured the features of a phylogenetic tree based on multiple sequence alignment, and developed a high resolution genome-wide SNP microarray. We used the microarray to analyze a broad panel of VEEV isolates, found excellent concordance between array- and sequence-based SNP calls, genotyped unsequenced isolates, and placed them on a phylogeny with sequenced genomes. The microarray successfully genotyped VEEV directly from tissue samples of an infected mouse, bypassing the need for viral isolation, culture and genomic sequencing. Lastly, we identified genomic variants associated with serotypes and host species, revealing a complex relationship between genotype and phenotype.
Efficient association study design via power-optimized tag SNP selection
HAN, BUHM; KANG, HYUN MIN; SEO, MYEONG SEONG; ZAITLEN, NOAH; ESKIN, ELEAZAR
2008-01-01
Discovering statistical correlation between causal genetic variation and clinical traits through association studies is an important method for identifying the genetic basis of human diseases. Since fully resequencing a cohort is prohibitively costly, genetic association studies take advantage of local correlation structure (or linkage disequilibrium) between single nucleotide polymorphisms (SNPs) by selecting a subset of SNPs to be genotyped (tag SNPs). While many current association studies are performed using commercially available high-throughput genotyping products that define a set of tag SNPs, choosing tag SNPs remains an important problem for both custom follow-up studies as well as designing the high-throughput genotyping products themselves. The most widely used tag SNP selection method optimizes over the correlation between SNPs (r2). However, tag SNPs chosen based on an r2 criterion do not necessarily maximize the statistical power of an association study. We propose a study design framework that chooses SNPs to maximize power and efficiently measures the power through empirical simulation. Empirical results based on the HapMap data show that our method gains considerable power over a widely used r2-based method, or equivalently reduces the number of tag SNPs required to attain the desired power of a study. Our power-optimized 100k whole genome tag set provides equivalent power to the Affymetrix 500k chip for the CEU population. For the design of custom follow-up studies, our method provides up to twice the power increase using the same number of tag SNPs as r2-based methods. Our method is publicly available via web server at http://design.cs.ucla.edu. PMID:18702637
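For readers unfamiliar with the r2 criterion mentioned above, the following sketch computes the standard linkage disequilibrium r² between two biallelic SNPs from phased haplotypes coded 0/1. It illustrates only the quantity that conventional tag SNP selection thresholds on, not the authors' power-based method; the function name and toy data are hypothetical.

```python
def ld_r2(hap_a, hap_b):
    """r^2 between two biallelic SNPs given phased haplotypes coded 0/1.

    Both SNPs must be polymorphic in the sample, otherwise the
    denominator is zero.
    """
    n = len(hap_a)
    p_a = sum(hap_a) / n                                  # allele 1 frequency at SNP A
    p_b = sum(hap_b) / n                                  # allele 1 frequency at SNP B
    p_ab = sum(a * b for a, b in zip(hap_a, hap_b)) / n   # frequency of the 1-1 haplotype
    d = p_ab - p_a * p_b                                  # disequilibrium coefficient D
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))


# Toy haplotypes (invented): SNP B mirrors SNP A, SNP C is independent of A.
snp_a = [0, 0, 1, 1]
snp_b = [0, 0, 1, 1]
snp_c = [0, 1, 0, 1]

print(ld_r2(snp_a, snp_b))  # perfectly correlated pair
print(ld_r2(snp_a, snp_c))  # uncorrelated pair
```

Under an r²-based scheme, SNP A would be a perfect tag for SNP B but would carry no information about SNP C.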
Schweighofer, Carmen D.; Coombes, Kevin R.; Majewski, Tadeusz; Barron, Lynn L.; Lerner, Susan; Sargent, Rachel L.; O'Brien, Susan; Ferrajoli, Alessandra; Wierda, William G.; Czerniak, Bogdan A.; Medeiros, L. Jeffrey; Keating, Michael J.; Abruzzo, Lynne V.
2013-01-01
Genomic abnormalities, such as deletions in 11q22 or 17p13, are associated with poorer prognosis in patients with chronic lymphocytic leukemia (CLL). We hypothesized that unknown regions of copy number variation (CNV) affect clinical outcome and can be detected by array-based single-nucleotide polymorphism (SNP) genotyping. We compared SNP genotypes from 168 untreated patients with CLL with genotypes from 73 white HapMap controls. We identified 322 regions of recurrent CNV, 82 of which occurred significantly more often in CLL than in HapMap (CLL-specific CNV), including regions typically aberrant in CLL: deletions in 6q21, 11q22, 13q14, and 17p13 and trisomy 12. In univariate analyses, 35 of total and 11 of CLL-specific CNVs were associated with unfavorable time-to-event outcomes, including gains or losses in chromosomes 2p, 4p, 4q, 6p, 6q, 7q, 11p, 11q, and 17p. In multivariate analyses, six CNVs (ie, CLL-specific variations in 11p15.1-15.4 or 6q27) predicted time-to-treatment or overall survival independently of established markers of prognosis. Moreover, genotypic complexity (ie, the number of independent CNVs per patient) significantly predicted prognosis, with a median time-to-treatment of 64 months versus 23 months in patients with zero to one versus two or more CNVs, respectively (P = 3.3 × 10−8). In summary, a comparison of SNP genotypes from patients with CLL with HapMap controls allowed us to identify known and unknown recurrent CNVs and to determine regions and rates of CNV that predict poorer prognosis in patients with CLL. PMID:23273604
Demirci, F. Yesim; Wang, Xingbin; Kelly, Jennifer A.; Morris, David L.; Barmada, M. Michael; Feingold, Eleanor; Kao, Amy H.; Sivils, Kathy L.; Bernatsky, Sasha; Pineau, Christian; Clarke, Ann; Ramsey-Goldman, Rosalind; Vyse, Timothy J.; Gaffney, Patrick M.; Manzi, Susan; Kamboh, M. Ilyas
2016-01-01
Objective Genome-wide association studies (GWASs) in individuals of European ancestry identified a number of systemic lupus erythematosus (SLE) susceptibility loci using earlier versions of high-density genotyping platforms. Follow-up studies on suggestive GWAS regions using larger samples and more markers identified additional SLE loci in European-descent subjects. Here we report the results of a multi-stage study that we performed to identify novel SLE loci. Methods In Stage 1, we conducted a new GWAS of SLE in a North American case-control sample of European ancestry (n=1,166) genotyped on Affymetrix Genome-Wide Human SNP Array 6.0. In Stage 2, we further investigated top new suggestive GWAS hits by in silico evaluation and meta-analysis using an additional dataset of European-descent subjects (>2,500 individuals), followed by replication of top meta-analysis findings in another dataset of European-descent subjects (>10,000 individuals) in Stage 3. Results As expected, our GWAS revealed most significant associations at the major histocompatibility complex locus (6p21), which easily surpassed genome-wide significance threshold (P<5×10−8). Several other SLE signals/loci previously implicated in Caucasians and/or Asians were also supported in Stage 1 discovery sample and strongest signals were observed at 2q32/STAT4 (P=3.6×10−7) and at 8p23/BLK (P=8.1×10−6). Stage 2 meta-analyses identified a new genome-wide significant SLE locus at 12q12 (meta P=3.1×10−8), which was replicated in Stage 3. Conclusion Our multi-stage study identified and replicated a new SLE locus that warrants further follow-up in additional studies. Publicly available databases suggest that this new SLE signal falls within a functionally relevant genomic region and near biologically important genes. PMID:26316170
Fox, Ervin R.; Musani, Solomon K.; Barbalic, Maja; Lin, Honghuang; Yu, Bing; Ogunyankin, Kofo O.; Smith, Nicholas L.; Kutlar, Abdullah; Glazer, Nicole L.; Post, Wendy S.; Paltoo, Dina N.; Dries, Daniel L.; Farlow, Deborah N.; Duarte, Christine W.; Kardia, Sharon L.; Meyers, Kristin J.; Sun, Yan V.; Arnett, Donna K.; Patki, Amit A.; Sha, Jin; Cui, Xiangqui; Samdarshi, Tandaw E.; Penman, Alan D.; Bibbins-Domingo, Kirsten; Bůžková, Petra; Benjamin, Emelia J.; Bluemke, David A.; Morrison, Alanna C.; Heiss, Gerardo; Carr, J. Jeffrey; Tracy, Russell P.; Mosley, Thomas H.; Taylor, Herman A.; Psaty, Bruce M.; Heckbert, Susan R.; Cappola, Thomas P.; Vasan, Ramachandran S.
2013-01-01
Background Using data from four community-based cohorts of African Americans (AA), we tested the association between genome-wide markers (SNPs) and cardiac phenotypes in the Candidate-gene Association REsource (CARe) study. Methods and Results Among 6,765 AA, we related age, sex, height and weight-adjusted residuals for nine cardiac phenotypes (assessed by echocardiogram or MRI) to 2.5 million SNPs genotyped using Genome-Wide Affymetrix Human SNP Array 6.0 (Affy6.0) and the remainder imputed. Within cohort genome-wide association analysis was conducted followed by meta-analysis across cohorts using inverse variance weights (genome-wide significance threshold=4.0 ×10−07). Supplementary pathway analysis was performed. We attempted replication in 3 smaller cohorts of African ancestry and tested look-ups in one consortium of European ancestry (EchoGEN). Across the 9 phenotypes, variants in 4 genetic loci reached genome-wide significance: rs4552931 in UBE2V2 (p=1.43 × 10−07) for left ventricular mass (LVM); rs7213314 in WIPI1 (p=1.68 × 10−07) for LV internal diastolic diameter (LVIDD); rs1571099 in PPAPDC1A (p= 2.57 × 10−08) for interventricular septal wall thickness (IVST); and rs9530176 in KLF5 (p=4.02 × 10−07) for ejection fraction (EF). Associated variants were enriched in three signaling pathways involved in cardiac remodeling. None of the 4 loci replicated in cohorts of African ancestry were confirmed in look-ups in EchoGEN. Conclusions In the largest GWAS of cardiac structure and function to date in AA, we identified 4 genetic loci related to LVM, IVST, LVIDD and EF that reached genome-wide significance. Replication results suggest that these loci may be unique to individuals of African ancestry. Additional large-scale studies are warranted for these complex phenotypes. PMID:23275298
Parra, E. J.; Below, J. E.; Krithika, S.; Valladares, A.; Barta, J. L.; Cox, N. J.; Hanis, C. L.; Wacher, N.; Garcia-Mena, J.; Hu, P.; Shriver, M. D.; Kumate, J.; McKeigue, P. M.; Escobedo, J.; Cruz, M.
2013-01-01
Aims/hypothesis We report a genome-wide association study of type 2 diabetes in an admixed sample from Mexico City and describe the results of a meta-analysis of this study and another genome-wide scan in a Mexican-American sample from Starr County, TX, USA. The top signals observed in this meta-analysis were followed up in the Diabetes Genetics Replication and Meta-analysis Consortium (DIAGRAM) and DIAGRAM+ datasets. Methods We analysed 967 cases and 343 normoglycaemic controls. The samples were genotyped with the Affymetrix Genome-wide Human SNP array 5.0. Associations of genotyped and imputed markers with type 2 diabetes were tested using a missing data likelihood score test. A fixed-effects meta-analysis including 1,804 cases and 780 normoglycaemic controls was carried out by weighting the effect estimates by their inverse variances. Results In the meta-analysis of the two Hispanic studies, markers showing suggestive associations (p<10−5) were identified in two known diabetes genes, HNF1A and KCNQ1, as well as in several additional regions. Meta-analysis of the two Hispanic studies and the recent DIAGRAM+ dataset identified genome-wide significant signals (p<5×10−8) within or near the genes HNF1A and CDKN2A/CDKN2B, as well as suggestive associations in three additional regions, IGF2BP2, KCNQ1 and the previously unreported C14orf70. Conclusions/interpretation We observed numerous regions with suggestive associations with type 2 diabetes. Some of these signals correspond to regions described in previous studies. However, many of these regions could not be replicated in the DIAGRAM datasets. It is critical to carry out additional studies in Hispanic and American Indian populations, which have a high prevalence of type 2 diabetes. PMID:21573907
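The fixed-effects meta-analysis described above weights each study's effect estimate by its inverse variance. A minimal sketch of that pooling step for a single marker, assuming per-study effect estimates and standard errors are already available (the numbers below are invented, not taken from the study):

```python
from math import sqrt


def fixed_effects_meta(betas, standard_errors):
    """Inverse-variance weighted pooled effect and its standard error."""
    weights = [1.0 / se ** 2 for se in standard_errors]
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    pooled_se = sqrt(1.0 / sum(weights))
    return pooled_beta, pooled_se


# Hypothetical log odds ratios and standard errors for one SNP in two studies.
betas = [0.20, 0.40]
standard_errors = [0.10, 0.10]

pooled_beta, pooled_se = fixed_effects_meta(betas, standard_errors)
print(f"pooled beta = {pooled_beta:.3f}, SE = {pooled_se:.4f}, z = {pooled_beta / pooled_se:.2f}")
```

With equal standard errors the pooled estimate is the simple average; a more precise study (smaller standard error) would pull the pooled estimate toward its own value.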
Renault, Victor; Tost, Jörg; Pichon, Fabien; Wang-Renault, Shu-Fang; Letouzé, Eric; Imbeaud, Sandrine; Zucman-Rossi, Jessica; Deleuze, Jean-François; How-Kit, Alexandre
2017-01-01
Copy number variations (CNV) include net gains or losses of part or whole chromosomal regions. They differ from copy neutral loss of heterozygosity (cn-LOH) events which do not induce any net change in the copy number and are often associated with uniparental disomy. These phenomena have long been reported to be associated with diseases and particularly in cancer. Losses/gains of genomic regions are often correlated with lower/higher gene expression. On the other hand, loss of heterozygosity (LOH) and cn-LOH are common events in cancer and may be associated with the loss of a functional tumor suppressor gene. Therefore, identifying recurrent CNV and cn-LOH events can be important as they may highlight common biological components and give insights into the development or mechanisms of a disease. However, no currently available tools allow a comprehensive whole-genome visualization of recurrent CNVs and cn-LOH in groups of samples providing absolute quantification of the aberrations leading to the loss of potentially important information. To overcome these limitations, we developed aCNViewer (Absolute CNV Viewer), a visualization tool for absolute CNVs and cn-LOH across a group of samples. aCNViewer proposes three graphical representations: dendrograms, bi-dimensional heatmaps showing chromosomal regions sharing similar abnormality patterns, and quantitative stacked histograms facilitating the identification of recurrent absolute CNVs and cn-LOH. We illustrated aCNViewer using publicly available hepatocellular carcinomas (HCCs) Affymetrix SNP Array data (Fig 1A). Regions 1q and 8q present a similar percentage of total gains but significantly different copy number gain categories (p-value of 0.0103 with a Fisher exact test), validated by another cohort of HCCs (p-value of 5.6e-7) (Fig 2B).
aCNViewer is implemented in Python and R and is available under a GNU GPLv3 license on GitHub (https://github.com/FJD-CEPH/aCNViewer) and Docker (https://hub.docker.com/r/fjdceph/acnviewer/). Contact: aCNViewer@cephb.fr.
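The quantitative stacked histogram mentioned in this abstract amounts to a per-region tally of samples by absolute copy number. The following is a minimal sketch of that tally only; the function name `copy_number_fractions` and the sample calls are hypothetical illustrations and are not taken from aCNViewer itself.

```python
from collections import Counter


def copy_number_fractions(calls):
    """Return, per region, the fraction of samples at each absolute copy number.

    `calls` maps a region name to a list of integer copy numbers, one per sample.
    """
    fractions = {}
    for region, copies in calls.items():
        counts = Counter(copies)
        total = len(copies)
        # Sort by copy number so the stacked bars would be drawn in a stable order.
        fractions[region] = {cn: n / total for cn, n in sorted(counts.items())}
    return fractions


# Hypothetical calls for two regions across five samples (2 = normal diploid).
example = {"1q": [3, 3, 2, 4, 3], "8q": [4, 5, 2, 4, 3]}
fractions = copy_number_fractions(example)
```

In this made-up input both regions carry gains in four of five samples, yet the gains are distributed over different copy-number categories, which is the kind of difference a total-gain percentage alone would hide.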
Trucks, Holger; Schulz, Herbert; de Kovel, Carolien G.; Kasteleijn-Nolst Trenité, Dorothée; Sonsma, Anja C. M.; Koeleman, Bobby P.; Lindhout, Dick; Weber, Yvonne G.; Lerche, Holger; Kapser, Claudia; Schankin, Christoph J.; Kunz, Wolfram S.; Surges, Rainer; Elger, Christian E.; Gaus, Verena; Schmitz, Bettina; Helbig, Ingo; Muhle, Hiltrud; Stephani, Ulrich; Klein, Karl M.; Rosenow, Felix; Neubauer, Bernd A.; Reinthaler, Eva M.; Zimprich, Fritz; Feucht, Martha; Møller, Rikke S.; Hjalgrim, Helle; De Jonghe, Peter; Suls, Arvid; Lieb, Wolfgang; Franke, Andre; Strauch, Konstantin; Gieger, Christian; Schurmann, Claudia; Schminke, Ulf; Nürnberg, Peter; Sander, Thomas
2015-01-01
Genetic generalised epilepsy (GGE) is the most common form of genetic epilepsy, accounting for 20% of all epilepsies. Genomic copy number variations (CNVs) constitute important genetic risk factors of common GGE syndromes. In our present genome-wide burden analysis, large (≥ 400 kb) and rare (< 1%) autosomal microdeletions with high calling confidence (≥ 200 markers) were assessed by the Affymetrix SNP 6.0 array in European case-control cohorts of 1,366 GGE patients and 5,234 ancestry-matched controls. We aimed to: 1) assess the microdeletion burden in common GGE syndromes, 2) estimate the relative contribution of recurrent microdeletions at genomic rearrangement hotspots and non-recurrent microdeletions, and 3) identify potential candidate genes for GGE. We found a significant excess of microdeletions in 7.3% of GGE patients compared to 4.0% in controls (P = 1.8 × 10^-7; OR = 1.9). Recurrent microdeletions at seven known genomic hotspots accounted for 36.9% of all microdeletions identified in the GGE cohort and showed a 7.5-fold increased burden (P = 2.6 × 10^-17) relative to controls. Microdeletions affecting either a gene previously implicated in neurodevelopmental disorders (P = 8.0 × 10^-18, OR = 4.6) or an evolutionarily conserved brain-expressed gene related to autism spectrum disorder (P = 1.3 × 10^-12, OR = 4.1) were significantly enriched in the GGE patients. Microdeletions found only in GGE patients harboured a high proportion of genes previously associated with epilepsy and neuropsychiatric disorders (NRXN1, RBFOX1, PCDH7, KCNA2, EPM2A, RORB, PLCB1). Our results demonstrate that the significantly increased burden of large and rare microdeletions in GGE patients is largely confined to recurrent hotspot microdeletions and microdeletions affecting neurodevelopmental genes, suggesting a strong impact of fundamental neurodevelopmental processes in the pathogenesis of common GGE syndromes. PMID:25950944
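The reported odds ratio can be checked against the two carrier frequencies quoted in the abstract. This is a small worked sketch that uses only the rounded percentages (7.3% and 4.0%); the function name `odds_ratio` is illustrative, and the authors' own estimate was presumably computed from the underlying counts.

```python
def odds_ratio(p_cases, p_controls):
    """Odds ratio from the carrier proportions in cases and controls."""
    odds_cases = p_cases / (1.0 - p_cases)
    odds_controls = p_controls / (1.0 - p_controls)
    return odds_cases / odds_controls


# Carrier frequencies quoted in the abstract: 7.3% of patients, 4.0% of controls.
or_microdeletions = odds_ratio(0.073, 0.040)
print(round(or_microdeletions, 1))  # prints 1.9
```

The result agrees with the OR of 1.9 given in the abstract to one decimal place.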
Mlynarski, Elisabeth E; Sheridan, Molly B; Xie, Michael; Guo, Tingwei; Racedo, Silvia E; McDonald-McGinn, Donna M; Gai, Xiaowu; Chow, Eva W C; Vorstman, Jacob; Swillen, Ann; Devriendt, Koen; Breckpot, Jeroen; Digilio, Maria Cristina; Marino, Bruno; Dallapiccola, Bruno; Philip, Nicole; Simon, Tony J; Roberts, Amy E; Piotrowicz, Małgorzata; Bearden, Carrie E; Eliez, Stephan; Gothelf, Doron; Coleman, Karlene; Kates, Wendy R; Devoto, Marcella; Zackai, Elaine; Heine-Suñer, Damian; Shaikh, Tamim H; Bassett, Anne S; Goldmuntz, Elizabeth; Morrow, Bernice E; Emanuel, Beverly S
2015-05-07
The 22q11.2 deletion syndrome (22q11DS; velocardiofacial/DiGeorge syndrome; VCFS/DGS) is the most common microdeletion syndrome and the phenotypic presentation is highly variable. Approximately 65% of individuals with 22q11DS have a congenital heart defect (CHD), mostly of the conotruncal type, and/or an aortic arch defect. The etiology of this phenotypic variability is not currently known. We hypothesized that copy-number variants (CNVs) outside the 22q11.2 deleted region might increase the risk of being born with a CHD in this sensitized population. Genotyping with Affymetrix SNP Array 6.0 was performed on two groups of subjects with 22q11DS separated by time of ascertainment and processing. CNV analysis was completed on a total of 949 subjects (cohort 1, n = 562; cohort 2, n = 387), 603 with CHDs (cohort 1, n = 363; cohort 2, n = 240) and 346 with normal cardiac anatomy (cohort 1, n = 199; cohort 2, n = 147). Our analysis revealed that a duplication of SLC2A3 was the most frequent CNV identified in the first cohort. It was present in 18 subjects with CHDs and 1 subject without (p = 3.12 × 10(-3), two-tailed Fisher's exact test). In the second cohort, the SLC2A3 duplication was also significantly enriched in subjects with CHDs (p = 3.30 × 10(-2), two-tailed Fisher's exact test). The SLC2A3 duplication was the most frequent CNV detected and the only significant finding in our combined analysis (p = 2.68 × 10(-4), two-tailed Fisher's exact test), indicating that the SLC2A3 duplication might serve as a genetic modifier of CHDs and/or aortic arch anomalies in individuals with 22q11DS. Copyright © 2015 The American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.
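The two-tailed Fisher's exact test used for the SLC2A3 comparison can be sketched with the standard library alone. The function name `fisher_exact_two_sided` is illustrative, and the cohort-1 table below assumes that the denominators are the cohort-1 group sizes quoted in the abstract (363 subjects with CHDs, 199 without); the authors' exact table is not given, so this is a hedged reconstruction rather than their analysis.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell with fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Sum every table that is no more likely than the observed one
    # (small relative tolerance so floating-point ties are included).
    return sum(
        p
        for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )


# Assumed cohort-1 table: duplication carriers vs non-carriers, CHD vs no CHD.
p_cohort1 = fisher_exact_two_sided(18, 363 - 18, 1, 199 - 1)
```

Printing `p_cohort1` gives a value that can be compared with the cohort-1 p-value reported in the abstract, bearing in mind the assumption about the table.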
Baron-Cohen, Simon; Murphy, Laura; Chakrabarti, Bhismadev; Craig, Ian; Mallya, Uma; Lakatošová, Silvia; Rehnstrom, Karola; Peltonen, Leena; Wheelwright, Sally; Allison, Carrie; Fisher, Simon E; Warrier, Varun
2014-01-01
Mathematical ability is heritable, but few studies have directly investigated its molecular genetic basis. Here we aimed to identify specific genetic contributions to variation in mathematical ability. We carried out a genome-wide association scan using pooled DNA in two groups of U.K. samples, based on end of secondary/high school national academic exam achievement: high (n = 419) versus low (n = 183) mathematical ability while controlling for their verbal ability. Significant differences in allele frequencies between these groups were searched for in 906,600 SNPs using the Affymetrix GeneChip Human Mapping version 6.0 array. After meeting a threshold of p<1.5×10(-5), 12 SNPs from the pooled association analysis were individually genotyped in 542 of the participants and analyzed to validate the initial associations (lowest p-value 1.14 ×10(-6)). In this analysis, one of the SNPs (rs789859) showed significant association after Bonferroni correction, and four (rs10873824, rs4144887, rs12130910, rs2809115) were nominally significant (lowest p-value 3.278 × 10(-4)). Three of the SNPs of interest are located within, or near to, known genes (FAM43A, SFT2D1, C14orf64). The SNP that showed the strongest association, rs789859, is located in a region on chromosome 3q29 that has been previously linked to learning difficulties and autism. rs789859 lies 1.3 kbp downstream of LSG1, and 700 bp upstream of FAM43A, mapping within the potential promoter/regulatory region of the latter. To our knowledge, this is only the second study to investigate the association of genetic variants with mathematical ability, and it highlights a number of interesting markers for future study.
Musunuru, Kiran; Post, Wendy S.; Herzog, William; Shen, Haiqing; O’Connell, Jeffrey R.; McArdle, Patrick F.; Ryan, Kathleen A.; Gibson, Quince; Cheng, Yu-Ching; Clearfield, Elizabeth; Johnson, Andrew D.; Tofler, Geoffrey; Yang, Qiong; O’Donnell, Christopher J.; Becker, Diane M.; Yanek, Lisa R.; Becker, Lewis C.; Faraday, Nauder; Bielak, Lawrence F.; Peyser, Patricia A.; Shuldiner, Alan R.; Mitchell, Braxton D.
2010-01-01
Background: Genome-wide association studies have identified a locus on chromosome 9p21.3 to be strongly associated with myocardial infarction/coronary artery disease (MI/CAD) and ischemic stroke. To gain insights into the mechanisms underlying these associations, we hypothesized that single nucleotide polymorphisms (SNPs) in this region would be associated with platelet reactivity across multiple populations. Methods and Results: Subjects in the initial population included 1,402 asymptomatic Amish adults in whom we measured platelet reactivity (n=788) and/or coronary artery calcification (CAC) (n=939). Platelet reactivity on agonist stimulation was measured by impedance aggregometry, and CAC by electron beam computed tomography. Twenty-nine SNPs at the 9p21.3 locus were genotyped using the Affymetrix 500K array. Twelve correlated SNPs in the locus were significantly associated with platelet reactivity (all p ≤ 0.001). The SNP most strongly associated with platelet reactivity, rs10965219 (p = 0.0002), was also associated with CAC (p = 0.002), along with 9 other SNPs (all p < 0.004). Association of rs10965219 with platelet reactivity persisted after adjustment for CAC, a measure of underlying atherosclerotic burden known to affect platelet reactivity. We then tested rs10965219 for association with platelet function in 2,364 subjects from the Framingham Heart Study (FHS) and 1,169 subjects from the GeneSTAR Study. The rs10965219 G allele (frequency ~ 51% across all three populations) was significantly associated with higher platelet reactivity in FHS (p = 0.001) and trended toward higher reactivity in GeneSTAR (p = 0.087); the combined p-value for meta-analysis was 0.0002. Conclusions: These results suggest that risk alleles at the 9p21.3 locus may have pleiotropic effects on MI/CAD and stroke risk, possibly through their influence on platelet reactivity. PMID:20858905
Population sequencing reveals breed and sub-species specific CNVs in cattle
USDA-ARS's Scientific Manuscript database
Individualized copy number variation (CNV) maps have highlighted the need for population surveys of cattle to detect rare and common variants. While SNP and comparative genomic hybridization (CGH) arrays have provided preliminary data, next-generation sequence (NGS) data analysis offers an increased...
QTL mapping of potato chip color and tuber traits within an autotetraploid family
USDA-ARS's Scientific Manuscript database
Cultivated potato (Solanum tuberosum L.) is a highly heterozygous autotetraploid crop species, and this presents challenges for traditional line development and molecular breeding. Recent availability of a single nucleotide polymorphism (SNP) array with 8303 features and software packages for linkag...
Development and characterization of a microheater array device for real-time DNA mutation detection
NASA Astrophysics Data System (ADS)
Williams, Layne; Okandan, Murat; Chagovetz, Alex; Blair, Steve
2008-04-01
DNA analysis, specifically single nucleotide polymorphism (SNP) detection, is becoming increasingly important in rapid diagnostics and disease detection. Temperature is often controlled to help speed reaction rates and perform melting of hybridized oligonucleotides. The difference in melting temperatures, Tm, between wild-type and SNP sequences, respectively, to a given probe oligonucleotide, is indicative of the specificity of the reaction. We have characterized Tm's in solution and on a solid substrate of three sequences from known mutations associated with Cystic Fibrosis. Taking advantage of Tm differences, a microheater array device was designed to enable individual temperature control of up to 18 specific hybridization events. The device was fabricated at Sandia National Laboratories using surface micromachining techniques. The microheaters have been characterized using an IR camera at Sandia and show individual temperature control with minimal thermal cross talk. Development of the device as a real-time DNA detection platform, including surface chemistry and associated microfluidics, is described.
Development and characterization of a microheater array device for real-time DNA mutation detection
NASA Astrophysics Data System (ADS)
Williams, Layne; Okandan, Murat; Chagovetz, Alex; Blair, Steve
2008-02-01
DNA analysis, specifically single nucleotide polymorphism (SNP) detection, is becoming increasingly important in rapid diagnostics and disease detection. Temperature is often controlled to help speed reaction rates and perform melting of hybridized oligonucleotides. The difference in melting temperatures, Tm, between wild-type and SNP sequences, respectively, to a given probe oligonucleotide, is indicative of the specificity of the reaction. We have characterized Tm's in solution and on a solid substrate of three sequences from known mutations associated with Cystic Fibrosis. Taking advantage of Tm differences, a microheater array device was designed to enable individual temperature control of up to 18 specific hybridization events. The device was fabricated at Sandia National Laboratories using surface micromachining techniques. The microheaters have been characterized using an IR camera at Sandia and show individual temperature control with minimal thermal cross talk. Development of the device as a real-time DNA detection platform, including surface chemistry and associated microfluidics, is described.
Pierson, Tyler Mark; Markello, Thomas; Accardi, John; Wolfe, Lynne; Adams, David; Sincan, Murat; Tarazi, Noor M.; Fajardo, Karin Fuentes; Cherukuri, Praveen F.; Bajraktari, Ilda; Meilleur, Katy G.; Donkervoort, Sandra; Jain, Mina; Hu, Ying; Lehky, Tanya J.; Cruz, Pedro; Mullikin, James C.; Bonnemann, Carsten; Gahl, William A.; Boerkoel, Cornelius F.; Tifft, Cynthia J.
2013-01-01
Early-onset myopathy, areflexia, respiratory distress and dysphagia (EMARDD) is a myopathic disorder associated with mutations in MEGF10. By novel analysis of SNP array hybridization and exome sequence coverage, we diagnosed a 10-year-old girl with EMARDD following identification of a novel homozygous deletion of exon 7 in MEGF10. In contrast to previously reported EMARDD patients, her weakness was more prominent proximally than distally, and involved her legs more than her arms. MRI of her pelvis and thighs showed muscle atrophy and fatty replacement. Ultrasound of several muscle groups revealed dense homogeneous increases in echogenicity. Cloning and sequencing of the deletion breakpoint identified features suggesting the mutation arose by fork stalling and template switching. These findings constitute the first genomic deletion causing EMARDD, expand the clinical phenotype, and provide new insight into the pattern and histology of its muscular pathology. PMID:23453856
Ding, Liang-Hao; Xie, Yang; Park, Seongmi; Xiao, Guanghua; Story, Michael D.
2008-01-01
Despite the tremendous growth of microarray usage in scientific studies, there is a lack of standards for background correction methodologies, especially in single-color microarray platforms. Traditional background subtraction methods often generate negative signals and thus cause large amounts of data loss. Hence, some researchers prefer to avoid background corrections, which typically result in the underestimation of differential expression. Here, by utilizing nonspecific negative control features integrated into Illumina whole genome expression arrays, we have developed a method of model-based background correction for BeadArrays (MBCB). We compared the MBCB with a method adapted from the Affymetrix robust multi-array analysis algorithm and with no background subtraction, using a mouse acute myeloid leukemia (AML) dataset. We demonstrated that differential expression ratios obtained by using the MBCB had the best correlation with quantitative RT–PCR. MBCB also achieved better sensitivity in detecting differentially expressed genes with biological significance. For example, we demonstrated that the differential regulation of Tnfr2, Ikk and NF-kappaB, the death receptor pathway, in the AML samples, could only be detected by using data after MBCB implementation. We conclude that MBCB is a robust background correction method that will lead to more precise determination of gene expression and better biological interpretation of Illumina BeadArray data. PMID:18450815
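The idea of a model-based correction driven by negative control features can be sketched with a normal-plus-exponential convolution model: observed intensity = signal + background, with background taken as normal (mean `mu`, standard deviation `sigma`) and signal as exponential (mean `alpha`). The code below is a minimal illustration under that assumption, using simple moment estimates; it is not the authors' MBCB implementation, and the function name `background_correct` and the example intensities are hypothetical.

```python
import math
import statistics


def _phi(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def _Phi(z):
    """Standard normal cumulative distribution."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def background_correct(intensities, negative_controls):
    """Posterior mean of the signal given each observed intensity.

    Assumes the negative controls have a non-zero standard deviation.
    """
    mu = statistics.mean(negative_controls)      # background mean
    sigma = statistics.stdev(negative_controls)  # background spread
    # Moment estimate of the mean signal; floored to stay positive.
    alpha = max(statistics.mean(intensities) - mu, 1e-9)
    corrected = []
    for x in intensities:
        # Given x, the signal is normal(m, sigma) truncated to positive values.
        m = x - mu - sigma * sigma / alpha
        z = m / sigma
        cdf = _Phi(z)
        if cdf < 1e-300:
            # Far below background: use the asymptotic limit to avoid 0/0.
            corrected.append(sigma * sigma / -m)
        else:
            corrected.append(m + sigma * _phi(z) / cdf)
    return corrected


# Hypothetical data: five negative control beads and four probe intensities.
negatives = [100, 102, 98, 101, 99]
probes = [95, 100, 150, 1000]
corrected = background_correct(probes, negatives)
```

Unlike plain subtraction of the control mean, this estimate stays positive even for probes that read below background (the 95 in the example), which is the data-loss problem the abstract attributes to traditional subtraction.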
Arabidopsis gene expression patterns during spaceflight
NASA Astrophysics Data System (ADS)
Paul, A.-L.; Ferl, R. J.
The exposure of Arabidopsis thaliana (Arabidopsis) plants to spaceflight environments resulted in the differential expression of hundreds of genes. A 5-day mission on orbiter Columbia in 1999 (STS-93) carried transgenic Arabidopsis plants engineered with a transgene composed of the alcohol dehydrogenase (Adh) gene promoter linked to the β-glucuronidase (GUS) reporter gene. The plants were used to evaluate the effects of spaceflight on two fronts. First, expression patterns visualized with the Adh/GUS transgene were used to address specifically the possibility that spaceflight induces a hypoxic stress response, and to assess whether any spaceflight response was similar to control terrestrial hypoxia-induced gene expression patterns (Paul et al., Plant Physiol. 2001, 126:613). Second, genome-wide patterns of native gene expression were evaluated utilizing the Affymetrix ATH1 GeneChip® array of 8,000 Arabidopsis genes. As a control for the veracity of the array analyses, a selection of genes identified with the arrays was further characterized with quantitative real-time RT-PCR (ABI TaqMan™). Comparison of the expression patterns of arrays hybridized with RNA isolated from plants exposed to spaceflight with those of the control arrays revealed hundreds of genes that were differentially expressed in response to spaceflight, yet most genes that are hallmarks of hypoxic stress were unaffected. These results will be discussed in light of current models for plant responses to the spaceflight environment, and with regard to potential future flight opportunities.
A novel approach to analyzing fMRI and SNP data via parallel independent component analysis
NASA Astrophysics Data System (ADS)
Liu, Jingyu; Pearlson, Godfrey; Calhoun, Vince; Windemuth, Andreas
2007-03-01
There is current interest in understanding genetic influences on brain function in both the healthy and the disordered brain. Parallel independent component analysis, a new method for analyzing multimodal data, is proposed in this paper and applied to functional magnetic resonance imaging (fMRI) and a single nucleotide polymorphism (SNP) array. The method aims to identify the independent components of each modality and the relationship between the two modalities. We analyzed 92 participants, including 29 schizophrenia (SZ) patients, 13 unaffected SZ relatives, and 50 healthy controls. We found a correlation of 0.79 between one fMRI component and one SNP component. The fMRI component consists of activations in cingulate gyrus, multiple frontal gyri, and superior temporal gyrus. The related SNP component is contributed to significantly by 9 SNPs located in sets of genes, including those coding for apolipoprotein A-I and C-III, malate dehydrogenase 1 and the gamma-aminobutyric acid alpha-2 receptor. A significant difference in the presence of this SNP component is found between the SZ group (SZ patients and their relatives) and the control group. In summary, we constructed a framework to identify the interactions between brain functional and genetic information; our findings provide new insight into understanding genetic influences on brain function in a common mental disorder.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gardner, Shea; Slezak, Tom
With the flood of whole genome finished and draft microbial sequences, we need faster, more scalable bioinformatics tools for sequence comparison. An algorithm is described to find single nucleotide polymorphisms (SNPs) in whole genome data. It scales to hundreds of bacterial or viral genomes, and can be used for finished and/or draft genomes available as unassembled contigs. The method is fast to compute, finding SNPs and building a SNP phylogeny in seconds to hours. We use it to identify thousands of putative SNPs from all publicly available Filoviridae, Poxviridae, foot-and-mouth disease virus, Bacillus, and Escherichia coli genomes and plasmids. The SNP-based trees that result are consistent with known taxonomy and trees determined in other studies. The approach we describe can handle as input hundreds of gigabases of sequence in a single run. The algorithm is based on k-mer analysis using a suffix array, so we call it saSNP.
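A k-mer view of SNP discovery can be illustrated on toy sequences. The sketch below assumes one common formulation, in which a SNP is a central base that differs between genomes while the flanking bases of an odd-length k-mer match exactly; the abstract does not spell out this detail. It also swaps the suffix array for a plain dictionary, which is fine for a demonstration but would not scale the way the described method does. The function name `kmer_snps` and the sequences are hypothetical.

```python
from collections import defaultdict


def kmer_snps(genomes, k=5):
    """Find central-base differences between genomes sharing identical flanks.

    `genomes` maps a genome name to its sequence; `k` must be odd.
    Returns {(left_flank, right_flank): {genome_name: central_base}}.
    """
    if k % 2 == 0:
        raise ValueError("k must be odd so that the k-mer has a central base")
    half = k // 2
    # (left flank, right flank) -> genome name -> set of central bases seen
    seen = defaultdict(dict)
    for name, seq in genomes.items():
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            key = (kmer[:half], kmer[half + 1:])
            seen[key].setdefault(name, set()).add(kmer[half])
    snps = {}
    for key, per_genome in seen.items():
        # Skip flanks that are ambiguous within a single genome (e.g. repeats).
        if any(len(bases) > 1 for bases in per_genome.values()):
            continue
        calls = {name: next(iter(bases)) for name, bases in per_genome.items()}
        if len(set(calls.values())) > 1:
            snps[key] = calls
    return snps


# Two toy genomes that differ at a single position (C vs T).
toy = {"a": "ACGTACCGTTAGC", "b": "ACGTACTGTTAGC"}
snps = kmer_snps(toy, k=5)
```

For the toy input, the only locus reported is the one flanked by `AC` and `GT`, with `C` in genome `a` and `T` in genome `b`.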
Das, Sayan; Bhat, Prasanna R; Sudhakar, Chinta; Ehlers, Jeffrey D; Wanamaker, Steve; Roberts, Philip A; Cui, Xinping; Close, Timothy J
2008-02-28
Cowpea (Vigna unguiculata L. Walp) is an important food and fodder legume of the semiarid tropics and subtropics worldwide, especially in sub-Saharan Africa. High density genetic linkage maps are needed for marker assisted breeding but are not available for cowpea. A single feature polymorphism (SFP) is a microarray-based marker which can be used for high throughput genotyping and high density mapping. Here we report detection and validation of SFPs in cowpea using a readily available soybean (Glycine max) genome array. Robustified projection pursuit (RPP) was used for statistical analysis using RNA as a surrogate for DNA. Using a 15% outlying score cut-off, 1058 potential SFPs were enumerated between two parents of a recombinant inbred line (RIL) population segregating for several important traits including drought tolerance, Fusarium and brown blotch resistance, grain size and photoperiod sensitivity. Sequencing of 25 putative polymorphism-containing amplicons yielded a SFP probe set validation rate of 68%. We conclude that the Affymetrix soybean genome array is a satisfactory platform for identification of some 1000's of SFPs for cowpea. This study provides an example of extension of genomic resources from a well supported species to an orphan crop. Presumably, other legume systems are similarly tractable to SFP marker development using existing legume array resources.
2006-11-01
study of the NCI60 panel of cancer cell lines [39]. More recently, amplifications of NOTCH3 were noted in ovarian tumors by an SNP array analysis...and the functional role of NOTCH3 was suggested by the ability to suppress cell proliferation by inhibiting NOTCH3 [40]. Allele-specific copy...Identified and functionally validated the oncogene MITF. 40 Park JT, Li M, Nakayama K, et al. Notch3 gene amplification in ovarian cancer. Cancer Res
Development and utilization of 100K SNP array in Saccharum Spp.
USDA-ARS's Scientific Manuscript database
Sugarcane genotyping or fingerprinting has long been a daunting task due to its high polyploidy level with large number of chromosomes. Single nucleotide polymorphisms (SNPs) are very abundant DNA sequence variations in the genome. With the advance of next generation sequencing (NGS) technologies, m...
Ali, Shahin S.; Shao, Jonathan; Strem, Mary D.; Phillips-Mora, Wilberth; Zhang, Dapeng; Meinhardt, Lyndel W.; Bailey, Bryan A.
2015-01-01
Moniliophthora roreri is the fungal pathogen that causes frosty pod rot (FPR) disease of Theobroma cacao L., the source of chocolate. FPR occurs in most of the cacao producing countries in the Western Hemisphere, causing yield losses up to 80%. Genetic diversity within the FPR pathogen population may allow the population to adapt to changing environmental conditions and adapt to enhanced resistance in the host plant. The present study developed single nucleotide polymorphism (SNP) markers from RNASeq results for 13 M. roreri isolates and validated the markers for their ability to reveal genetic diversity in an international M. roreri collection. The SNP resources reported herein represent the first study of RNA sequencing (RNASeq)-derived SNP validation in M. roreri and demonstrates the utility of RNASeq as an approach for de novo SNP identification in M. roreri. A total of 88 polymorphic SNPs were used to evaluate the genetic diversity of 172 M. roreri cacao isolates resulting in 37 distinct genotypes (including 14 synonymous groups). Absence of heterozygosity for the 88 SNP markers indicates reproduction in M. roreri is clonal and likely due to a homothallic life style. The upper Magdalena Valley of Colombia showed the highest levels of genetic diversity with 20 distinct genotypes of which 13 were limited to this region, and indicates this region as the possible center of origin for M. roreri. PMID:26379633
DOE Office of Scientific and Technical Information (OSTI.GOV)
Farahani, Poupak; Chiu, Sally; Bowlus, Christopher L.
Obesity is a complex disease. To date, over 100 chromosomal loci for body weight, body fat, regional white adipose tissue weight, and other obesity-related traits have been identified in humans and in animal models. For most loci, the underlying genes are not yet identified; some of these chromosomal loci will be alleles of known obesity genes, whereas many will represent alleles of unknown genes. Microarray analysis allows simultaneous multiple gene and pathway discovery. cDNA and oligonucleotide arrays are commonly used to identify differentially expressed genes by surveys of large numbers of known and unnamed genes. Two papers previously identified genes differentially expressed in adipose tissue of mouse models of obesity and diabetes by analysis of hybridization to Affymetrix oligonucleotide chips.
Oligonucleotide-arrayed TFT photosensor applicable for DNA chip technology.
Tanaka, Tsuyoshi; Hatakeyama, Keiichi; Sawaguchi, Masahiro; Iwadate, Akihito; Mizutani, Yasushi; Sasaki, Kazuhiro; Tateishi, Naofumi; Takeyama, Haruko; Matsunaga, Tadashi
2006-09-05
A thin film transistor (TFT) photosensor fabricated by semiconductor integrated circuit (IC) technology was applied to DNA chip technology. The surface of the TFT photosensor was coated with TiO2 using a vapor deposition technique for the fabrication of optical filters. The immobilization of thiolated oligonucleotide probes onto a TiO2-coated TFT photosensor using gamma-aminopropyltriethoxysilane (APTES) and N-(gamma-maleimidobutyloxy) sulfosuccinimide ester (GMBS) was optimized. The coverage value of immobilized oligonucleotides reached a plateau at 33.7 pmol/cm2, which was similar to a previous analysis using radioisotope-labeled oligonucleotides. The lowest detection limits were 0.05 pmol/cm2 for quantum dot and 2.1 pmol/cm2 for Alexa Fluor 350. Furthermore, single nucleotide polymorphism (SNP) detection was examined using the oligonucleotide-arrayed TFT photosensor. A SNP present in the aldehyde dehydrogenase 2 (ALDH2) gene was used as a target. The SNPs in ALDH2*1 and ALDH2*2 target DNA were detected successfully using the TFT photosensor. DNA hybridization in the presence of both ALDH2*1 and ALDH2*2 target DNA was observed using both ALDH2*1 and ALDH2*2 detection oligonucleotides-arrayed TFT photosensor. Use of the TFT photosensor will allow the development of a disposable photodetecting device for DNA chip systems. (c) 2006 Wiley Periodicals, Inc.
Transcriptional profiling of Medicago truncatula meristematic root cells
Holmes, Peta; Goffard, Nicolas; Weiller, Georg F; Rolfe, Barry G; Imin, Nijat
2008-01-01
Background: The root apical meristem of the crop and model legume Medicago truncatula is a significantly different stem cell system from that of the widely studied model plant species Arabidopsis thaliana. In this study we used the Affymetrix Medicago GeneChip® to compare the transcriptomes of meristem and non-meristematic root to identify root meristem specific candidate genes. Results: Using mRNA from root meristem and non-meristem, we were able to identify 324 and 363 transcripts differentially expressed from the two regions. With bioinformatics tools developed to functionally annotate the Medicago genome array, we could identify significant changes in metabolism, signalling and the differential expression of 55 transcription factors in meristematic and non-meristematic roots. Conclusion: This is the first comprehensive analysis of M. truncatula root meristem cells using this genome array. These data will facilitate the mapping of regulatory and metabolic networks involved in the open root meristem of M. truncatula and provide candidates for functional analysis. PMID:18302802
Increasing feed efficiency and reducing methane emissions using genomics: An international approach
USDA-ARS's Scientific Manuscript database
Genomic technology (including SNP arrays and next-generation sequencing) is a powerful driver for the genetic improvement of livestock. Phenotype recording can now, to an extent, be partitioned from selection, and even limited to several thousand animals. Rapid development of new technologies and pr...
Diversity analysis of cotton (Gossypium hirsutum L.) germplasm using the CottonSNP63K Array
USDA-ARS's Scientific Manuscript database
Cotton germplasm resources contain beneficial alleles that can be exploited to develop germplasm adapting to emerging environmental and climate conditions, and this germplasm has commonly been characterized based on phenotypes. However, phenotypic profiles are limited by what can be observed and me...
variety of arrays appropriate for a wide breadth of study design needs. Genomic coverage of many of the chromosomal anomalies are services offered at NO ADDITIONAL COST to study investigators with GWAS projects be submitted for both the initial GWAS study as well as replication using our custom SNP service
Troggio, Michela; Šurbanovski, Nada; Bianco, Luca; Moretto, Marco; Giongo, Lara; Banchi, Elisa; Viola, Roberto; Fernández, Felicidad Fernández; Costa, Fabrizio; Velasco, Riccardo; Cestaro, Alessandro; Sargent, Daniel James
2013-01-01
High throughput arrays for the simultaneous genotyping of thousands of single-nucleotide polymorphisms (SNPs) have made the rapid genetic characterisation of plant genomes and the development of saturated linkage maps a realistic prospect for many plant species of agronomic importance. However, the correct calling of SNP genotypes in divergent polyploid genomes using array technology can be problematic due to paralogy, and to divergence in probe sequences causing changes in probe binding efficiencies. An Illumina Infinium II whole-genome genotyping array was recently developed for the cultivated apple and used to develop a molecular linkage map for an apple rootstock progeny (M432), but a large proportion of segregating SNPs were not mapped in the progeny, due to unexpected genotype clustering patterns. To investigate the causes of this unexpected clustering we performed BLAST analysis of all probe sequences against the ‘Golden Delicious’ genome sequence and discovered evidence for paralogous annealing sites and probe sequence divergence for a high proportion of probes contained on the array. Following visual re-evaluation of the genotyping data generated for 8,788 SNPs for the M432 progeny using the array, we manually re-scored genotypes at 818 loci and mapped a further 797 markers to the M432 linkage map. The newly mapped markers included the majority of those that could not be mapped previously, as well as loci that were previously scored as monomorphic, but which segregated due to divergence leading to heterozygosity in probe annealing sites. An evaluation of the 8,788 probes in a diverse collection of Malus germplasm showed that more than half the probes returned genotype clustering patterns that were difficult or impossible to interpret reliably, highlighting implications for the use of the array in genome-wide association studies. PMID:23826289
2011-01-01
Background High-throughput SNP genotyping has become an essential requirement for molecular breeding and population genomics studies in plant species. Large scale SNP developments have been reported for several mainstream crops. A growing interest now exists to expand the speed and resolution of genetic analysis to outbred species with highly heterozygous genomes. When nucleotide diversity is high, a refined diagnosis of the target SNP sequence context is needed to convert queried SNPs into high-quality genotypes using the Golden Gate Genotyping Technology (GGGT). This issue becomes exacerbated when attempting to transfer SNPs across species, a scarcely explored topic in plants, and likely to become significant for population genomics and inter specific breeding applications in less domesticated and less funded plant genera. Results We have successfully developed the first set of 768 SNPs assayed by the GGGT for the highly heterozygous genome of Eucalyptus from a mixed Sanger/454 database with 1,164,695 ESTs and the preliminary 4.5X draft genome sequence for E. grandis. A systematic assessment of in silico SNP filtering requirements showed that stringent constraints on the SNP surrounding sequences have a significant impact on SNP genotyping performance and polymorphism. SNP assay success was high for the 288 SNPs selected with more rigorous in silico constraints; 93% of them provided high quality genotype calls and 71% of them were polymorphic in a diverse panel of 96 individuals of five different species. SNP reliability was high across nine Eucalyptus species belonging to three sections within subgenus Symphomyrtus and still satisfactory across species of two additional subgenera, although polymorphism declined as phylogenetic distance increased. Conclusions This study indicates that the GGGT performs well both within and across species of Eucalyptus notwithstanding its nucleotide diversity ≥2%. 
The development of a much larger array of informative SNPs across multiple Eucalyptus species is feasible, although strongly dependent on having a representative and sufficiently deep collection of sequences from many individuals of each target species. A higher density SNP platform will be instrumental to undertake genome-wide phylogenetic and population genomics studies and to implement molecular breeding by Genomic Selection in Eucalyptus. PMID:21492434
2013-01-01
Background Hereditary non-polyposis colorectal cancer (HNPCC)/Lynch syndrome (LS) is a cancer syndrome characterised by early-onset epithelial cancers, especially colorectal cancer (CRC) and endometrial cancer. The aim of the current study was to use SNP-array technology to identify genomic aberrations which could contribute to the increased risk of cancer in HNPCC/LS patients. Methods Individuals diagnosed with HNPCC/LS (100) and healthy controls (384) were genotyped using the Illumina Human610-Quad SNP-arrays. Copy number variation (CNV) calling and association analyses were performed using Nexus software, with significant results validated using QuantiSNP. TaqMan Copy-Number assays were used for verification of CNVs showing significant association with HNPCC/LS identified by both software programs. Results We detected copy number (CN) gains associated with HNPCC/LS status on chromosome 7q11.21 (28% cases and 0% controls, Nexus; p = 3.60E-20 and QuantiSNP; p < 1.00E-16) and 16p11.2 (46% in cases, while a CN loss was observed in 23% of controls, Nexus; p = 4.93E-21 and QuantiSNP; p = 5.00E-06) via in silico analyses. TaqMan Copy-Number assay was used for validation of CNVs showing significant association with HNPCC/LS. In addition, CNV burden (total CNV length, average CNV length and number of observed CNV events) was significantly greater in cases compared to controls. Conclusion A greater CNV burden was identified in HNPCC/LS cases compared to controls supporting the notion of higher genomic instability in these patients. One intergenic locus on chromosome 7q11.21 is possibly associated with HNPCC/LS and deserves further investigation. The results from this study highlight the complexities of fluorescent based CNV analyses. The inefficiency of both CNV detection methods to reproducibly detect observed CNVs demonstrates the need for sequence data to be considered alongside intensity data to avoid false positive results. PMID:23531357
Yamamoto, Toshio; Nagasaki, Hideki; Yonemaru, Jun-ichi; Ebana, Kaworu; Nakajima, Maiko; Shibaya, Taeko; Yano, Masahiro
2010-04-27
To create useful gene combinations in crop breeding, it is necessary to clarify the dynamics of the genome composition created by breeding practices. A large quantity of single-nucleotide polymorphism (SNP) data is required to permit discrimination of chromosome segments among modern cultivars, which are genetically related. Here, we used a high-throughput sequencer to conduct whole-genome sequencing of an elite Japanese rice cultivar, Koshihikari, which is closely related to Nipponbare, whose genome sequencing has been completed. Then we designed a high-throughput typing array based on the SNP information by comparison of the two sequences. Finally, we applied this array to analyze historical representative rice cultivars to understand the dynamics of their genome composition. The total 5.89-Gb sequence for Koshihikari, equivalent to 15.7 x the entire rice genome, was mapped using the Pseudomolecules 4.0 database for Nipponbare. The resultant Koshihikari genome sequence corresponded to 80.1% of the Nipponbare sequence and led to the identification of 67,051 SNPs. A high-throughput typing array consisting of 1917 SNP sites distributed throughout the genome was designed to genotype 151 representative Japanese cultivars that have been grown during the past 150 years. We could identify the ancestral origin of the pedigree haplotypes in 60.9% of the Koshihikari genome and 18 consensus haplotype blocks which are inherited from traditional landraces to current improved varieties. Moreover, it was predicted that modern breeding practices have generally decreased genetic diversity. Detection of genome-wide SNPs by both high-throughput sequencer and typing array made it possible to evaluate genomic composition of genetically related rice varieties. With the aid of their pedigree information, we clarified the dynamics of chromosome recombination during the historical rice breeding process. 
We also found several genomic regions decreasing genetic diversity which might be caused by a recent human selection in rice breeding. The definition of pedigree haplotypes by means of genome-wide SNPs will facilitate next-generation breeding of rice and other crops.
2012-01-01
Background High-density genotyping arrays that measure hybridization of genomic DNA fragments to allele-specific oligonucleotide probes are widely used to genotype single nucleotide polymorphisms (SNPs) in genetic studies, including human genome-wide association studies. Hybridization intensities are converted to genotype calls by clustering algorithms that assign each sample to a genotype class at each SNP. Data for SNP probes that do not conform to the expected pattern of clustering are often discarded, contributing to ascertainment bias and resulting in lost information - as much as 50% in a recent genome-wide association study in dogs. Results We identified atypical patterns of hybridization intensities that were highly reproducible and demonstrated that these patterns represent genetic variants that were not accounted for in the design of the array platform. We characterized variable intensity oligonucleotide (VINO) probes that display such patterns and are found in all hybridization-based genotyping platforms, including those developed for human, dog, cattle, and mouse. When recognized and properly interpreted, VINOs recovered a substantial fraction of discarded probes and counteracted SNP ascertainment bias. We developed software (MouseDivGeno) that identifies VINOs and improves the accuracy of genotype calling. MouseDivGeno produced highly concordant genotype calls when compared with other methods but it uniquely identified more than 786000 VINOs in 351 mouse samples. We used whole-genome sequence from 14 mouse strains to confirm the presence of novel variants explaining 28000 VINOs in those strains. We also identified VINOs in human HapMap 3 samples, many of which were specific to an African population. Incorporating VINOs in phylogenetic analyses substantially improved the accuracy of a Mus species tree and local haplotype assignment in laboratory mouse strains. 
Conclusion The problems of ascertainment bias and missing information due to genotyping errors are widely recognized as limiting factors in genetic studies. We have conducted the first formal analysis of the effect of novel variants on genotyping arrays, and we have shown that these variants account for a large portion of miscalled and uncalled genotypes. Genetic studies will benefit from substantial improvements in the accuracy of their results by incorporating VINOs in their analyses. PMID:22260749
MASUDA, TAIKI; ISHIKAWA, TOSHIAKI; MOGUSHI, KAORU; OKAZAKI, SATOSHI; ISHIGURO, MEGUMI; IIDA, SATORU; MIZUSHIMA, HIROSHI; TANAKA, HIROSHI; UETAKE, HIROYUKI; SUGIHARA, KENICHI
2016-01-01
We aimed to identify a novel prognostic biomarker related to recurrence in stage II and III colorectal cancer (CRC) patients. Stage II and III CRC tissue mRNA expression was profiled using an Affymetrix Gene Chip, and copy number profiles of 125 patients were generated using an Affymetrix 250K Sty array. Genes showing both upregulated expression and copy number gains in cases involving recurrence were extracted as candidate biomarkers. The protein expression of the candidate gene was assessed using immunohistochemical staining of tissue from 161 patients. The relationship between protein expression and clinicopathological features was also examined. We identified 9 candidate genes related to recurrence of stage II and III CRC, whose mRNA expression was significantly higher in CRC than in normal tissue. Of these proteins, the S100 calcium-binding protein A2 (S100A2) has been observed in several human cancers. S100A2 protein overexpression in CRC cells was associated with significantly worse overall survival and relapse-free survival, indicating that S100A2 is an independent risk factor for stage II and III CRC recurrence. S100A2 overexpression in cancer cells could be a biomarker of poor prognosis in stage II and III CRC recurrence and a target for treatment of this disease. PMID:26783118
Genome-wide analysis links NFATC2 with asparaginase hypersensitivity
Fernandez, Christian A.; Smith, Colton; Yang, Wenjian; Mullighan, Charles G.; Qu, Chunxu; Larsen, Eric; Bowman, W. Paul; Liu, Chengcheng; Ramsey, Laura B.; Chang, Tamara; Karol, Seth E.; Loh, Mignon L.; Raetz, Elizabeth A.; Winick, Naomi J.; Hunger, Stephen P.; Carroll, William L.; Jeha, Sima; Pui, Ching-Hon; Evans, William E.; Devidas, Meenakshi
2015-01-01
Asparaginase is used to treat acute lymphoblastic leukemia (ALL); however, hypersensitivity reactions can lead to suboptimal asparaginase exposure. Our objective was to use a genome-wide approach to identify loci associated with asparaginase hypersensitivity in children with ALL enrolled on St. Jude Children’s Research Hospital (SJCRH) protocols Total XIIIA (n = 154), Total XV (n = 498), and Total XVI (n = 271), or Children’s Oncology Group protocols POG 9906 (n = 222) and AALL0232 (n = 2163). Germline DNA was genotyped using the Affymetrix 500K, Affymetrix 6.0, or the Illumina Exome BeadChip array. In multivariate logistic regression, the intronic rs6021191 variant in nuclear factor of activated T cells 2 (NFATC2) had the strongest association with hypersensitivity (P = 4.1 × 10^-8; odds ratio [OR] = 3.11). RNA-seq data available from 65 SJCRH ALL tumor samples and 52 Yoruba HapMap samples showed that samples carrying the rs6021191 variant had higher NFATC2 expression compared with noncarriers (P = 1.1 × 10^-3 and 0.03, respectively). The top ranked nonsynonymous polymorphism was rs17885382 in HLA-DRB1 (P = 3.2 × 10^-6; OR = 1.63), which is in near complete linkage disequilibrium with the HLA-DRB1*07:01 allele we previously observed in a candidate gene study. The strongest risk factors for asparaginase allergy are variants within genes regulating the immune response. PMID:25987655
Telfer, Emily J; Stovold, Grahame T; Li, Yongjun; Silva-Junior, Orzenil B; Grattapaglia, Dario G; Dungey, Heidi S
2015-01-01
Pedigree reconstruction using molecular markers enables efficient management of inbreeding in open-pollinated breeding strategies, replacing expensive and time-consuming controlled pollination. This is particularly useful in preferentially outcrossed, insect pollinated Eucalypts known to suffer considerable inbreeding depression from related matings. A single nucleotide polymorphism (SNP) marker panel consisting of 106 markers was selected for pedigree reconstruction from the recently developed high-density Eucalyptus Infinium SNP chip (EuCHIP60K). The performance of this SNP panel for pedigree reconstruction in open-pollinated progenies of two Eucalyptus nitens seed orchards was compared with that of two microsatellite panels with 13 and 16 markers respectively. The SNP marker panel out-performed one of the microsatellite panels in the resolution power to reconstruct pedigrees and out-performed both panels with respect to data quality. Parentage of all but one offspring in each clonal seed orchard was correctly matched to the expected seed parent using the SNP marker panel, whereas parentage assignment to less than a third of the expected seed parents was supported using the 13-microsatellite panel. The 16-microsatellite panel supported all but one of the recorded seed parents, one better than the SNP panel, although there was still a considerable level of missing and inconsistent data. SNP marker data was considerably superior to microsatellite data in accuracy, reproducibility and robustness. Although microsatellite and SNP data provide equivalent resolution for pedigree reconstruction, microsatellite analysis requires more time and experience to deal with the uncertainties of allele calling and faces challenges for data transferability across labs and over time. 
While microsatellite analysis will continue to be useful for some breeding tasks due to the high information content, existing infrastructure and low operating costs, the multi-species SNP resource available with the EuCHIP60k opens a whole new array of opportunities for high-throughput, genome-wide or targeted genotyping in species of Eucalyptus.
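The idea of matching offspring to an expected seed parent with a small SNP panel can be illustrated with a simple exclusion check. The following Python sketch is not the method used in the study above; it assumes genotypes are coded as 0, 1 or 2 copies of one allele (None for a missing call), and the clone names and genotype vectors are hypothetical. A locus excludes a candidate parent when parent and offspring are opposite homozygotes, because the parent must transmit one allele to the offspring.

```python
def opposing_homozygotes(parent, offspring):
    """Count loci where parent and offspring are opposite homozygotes (0 vs 2).

    Genotypes are allele counts (0, 1, 2); None marks a missing call and the
    locus is skipped. Returns (mismatches, loci_compared).
    """
    mismatches = 0
    compared = 0
    for p, o in zip(parent, offspring):
        if p is None or o is None:
            continue
        compared += 1
        if {p, o} == {0, 2}:
            mismatches += 1
    return mismatches, compared


def best_candidate(candidates, offspring):
    """Return the candidate name with the lowest mismatch rate."""
    def rate(name):
        mismatches, compared = opposing_homozygotes(candidates[name], offspring)
        return mismatches / compared if compared else 1.0
    return min(candidates, key=rate)


# Hypothetical six-marker example.
candidates = {
    "clone_A": [0, 2, 1, 0, 2, None],
    "clone_B": [2, 0, 1, 2, 0, 1],
}
offspring = [0, 1, 1, 0, 2, 2]
print(best_candidate(candidates, offspring))
```

In practice a small tolerance for genotyping error would normally be allowed before excluding a candidate, and likelihood-based assignment is a common alternative to pure exclusion.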
Karampetsou, Evangelia; Morrogh, Deborah; Chitty, Lyn
2014-01-01
The advantage of microarray (array) over conventional karyotype for the diagnosis of fetal pathogenic chromosomal anomalies has prompted the use of microarrays in prenatal diagnostics. In this review we compare the performance of different array platforms (BAC, oligonucleotide CGH, SNP) and designs (targeted, whole genome, whole genome, and targeted, custom) and discuss their advantages and disadvantages in relation to prenatal testing. We also discuss the factors to consider when implementing a microarray testing service for the diagnosis of fetal chromosomal aberrations. PMID:26237396
Genotyping of 75 SNPs using arrays for individual identification in five population groups.
Hwa, Hsiao-Lin; Wu, Lawrence Shih Hsin; Lin, Chun-Yen; Huang, Tsun-Ying; Yin, Hsiang-I; Tseng, Li-Hui; Lee, James Chun-I
2016-01-01
Single nucleotide polymorphism (SNP) typing offers promise to forensic genetics. Various strategies and panels for analyzing SNP markers for individual identification have been published. However, the best panels with fewer identity SNPs for all major population groups are still under discussion. This study aimed to find more autosomal SNPs with high heterozygosity for individual identification among Asian populations. Ninety-six autosomal SNPs of 502 DNA samples from unrelated individuals of five population groups (208 Taiwanese Han, 83 Filipinos, 62 Thais, 69 Indonesians, and 80 individuals with European, Near Eastern, or South Asian ancestry) were analyzed using arrays in an initial screening, and 75 SNPs (group A, 46 newly selected SNPs; group B, 29 SNPs based on a previous SNP panel) were selected for further statistical analyses. Some SNPs with high heterozygosity from Asian populations were identified. The combined random match probability of the best 40 and 45 SNPs was between 3.16 × 10^-17 and 7.75 × 10^-17 and between 2.33 × 10^-19 and 7.00 × 10^-19, respectively, in all five populations. These loci offer comparable power to short tandem repeats (STRs) for routine forensic profiling. In this study, we demonstrated the population genetic characteristics and forensic parameters of 75 SNPs with high heterozygosity from five population groups. This SNP panel can provide valuable genotypic information and can be helpful in forensic casework for individual identification among these populations.
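A combined random match probability of the kind quoted above is conventionally obtained by multiplying per-locus match probabilities across independent loci. The Python sketch below is an illustration only, assuming biallelic SNPs in Hardy-Weinberg equilibrium and unlinked loci; the allele frequencies are hypothetical and are not taken from the study.

```python
from math import prod


def locus_match_probability(p):
    """Chance that two unrelated people share a genotype at one biallelic SNP.

    p is the frequency of one allele; genotype frequencies follow
    Hardy-Weinberg proportions (p^2, 2pq, q^2). The match probability is
    the sum of squared genotype frequencies.
    """
    q = 1.0 - p
    genotype_freqs = (p * p, 2 * p * q, q * q)
    return sum(f * f for f in genotype_freqs)


def combined_match_probability(allele_freqs):
    """Product of per-locus match probabilities, assuming independent loci."""
    return prod(locus_match_probability(p) for p in allele_freqs)


# Hypothetical panel: 40 SNPs, each with allele frequency 0.5
# (the most informative case for a biallelic marker).
panel = [0.5] * 40
print(f"{combined_match_probability(panel):.2e}")
```

With 40 maximally heterozygous SNPs the product falls in the 10^-18 to 10^-17 range, the same order of magnitude as the values reported for the best 40 SNPs; real panels have frequencies away from 0.5, so each locus contributes somewhat less discrimination.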
Qiao, Y; Tyson, C; Hrynchak, M; Lopez-Rangel, E; Hildebrand, J; Martell, S; Fawcett, C; Kasmara, L; Calli, K; Harvard, C; Liu, X; Holden, J J A; Lewis, S M E; Rajcan-Separovic, E
2013-02-01
Higher resolution whole-genome arrays facilitate the identification of smaller copy number variations (CNVs) and their integral genes contributing to autism and/or intellectual disability (ASD/ID). Our study describes the use of one of the highest resolution arrays, the Affymetrix® Cytogenetics 2.7M array, coupled with quantitative multiplex polymerase chain reaction (PCR) of short fluorescent fragments (QMPSF) for detection and validation of small CNVs. We studied 82 subjects with ASD and ID in total (30 in the validation and 52 in the application cohort) and detected putatively pathogenic CNVs in 6/52 cases from the application cohort. This included a 130-kb maternal duplication spanning exons 64-79 of the DMD gene which was found in a 3-year-old boy manifesting autism and mild neuromotor delays. Other pathogenic CNVs involved 4p14, 12q24.31, 14q32.31, 15q13.2-13.3, and 17p13.3. We established the optimal experimental conditions which, when applied to select small CNVs for QMPSF confirmation, reduced the false positive rate from 60% to 25%. Our work suggests that selection of small CNVs based on the function of integral genes, followed by review of array experimental parameters resulting in highest confirmation rate using multiplex PCR, may enhance the usefulness of higher resolution platforms for ASD and ID gene discovery. © 2012 John Wiley & Sons A/S. Published by Blackwell Publishing Ltd.
Rajasekaran, S; Kanna, Rishi Mugesh; Senthil, Natesan; Raveendran, Muthuraja; Cheung, Kenneth M C; Chan, Danny; Subramaniam, Sakthikanal; Shetty, Ajoy Prasad
2013-10-01
Although the influence of genetics on the process of disc degeneration is well recognized, in recently published studies, there is a wide variation in the race and selection criteria for such study populations. More importantly, the radiographic features of disc degeneration that are selected to represent the disc degeneration phenotype are variable in these studies. The study presented here evaluates the association between single nucleotide polymorphisms (SNPs) of candidate genes and three distinct radiographic features that can be defined as the degenerative disc disease (DDD) phenotype. The study objectives were to examine the allelic diversity of 58 SNPs related to 35 candidate genes related to lumbar DDD, to evaluate the association in a hitherto unevaluated ethnic Indian population that represents more than one-sixth of the world population, and to analyze how genetic associations can vary in the same study subjects with the choice of phenotype. A cross-sectional, case-control study of an ethnic Indian population was carried out. Fifty-eight SNPs in 35 potential candidate genes were evaluated in 342 subjects and the associations were analyzed against three highly specific markers for DDD, namely disc degeneration by Pfirrmann grading, end-plate damage evaluated by total end-plate damage score, and annular tears evaluated by disc herniations and hyperintense zones. Genotyping of cases and controls was performed on a genome-wide SNP array to identify potential associated disease loci. The results from the genome-wide SNP array were then used to facilitate SNP selection and genotype validation was conducted using Sequenom-based genotyping. Eleven of the 58 SNPs provided evidence of association with one of the phenotypes. For annular tears, rs1042631 SNP of AGC1 and rs467691 SNP of ADAMTS5 were highly significantly associated (p<.01) and SNPs in NGFB, IL1B, IL18RAP, and MMP10 were also significantly associated (p<.05). 
The rs4076018 SNP of NGFB was highly significant (p<.01) and rs2292657 SNP of GLI1 was significantly (p<.05) correlated to disc degeneration. For end-plate damage, the rs2252070 SNP of MMP 13 showed a significant association (p<.05). Previously associated genes such as COL 9, SKT, CHST 3, CILP, IGFR, SOXp, BMP, MMP 2-12, ADH2, IL1RN, and COX2 were not significantly associated and new associations (NGFB and GLI1) were identified. The validity of all the associations was found to be phenotype dependent. For the first time, genetic associations with DDD have been performed in an Indian population. Apart from identifying new associations, the highlight of the study was that in the same study population with DDD, SNP associations completely changed when different radiographic features were used to define the DDD phenotype. Our study results therefore indicate that standardization of the phenotypes chosen to study the genetics of disc degeneration is essential and should be strongly considered before planning genetic association studies. Copyright © 2013 Elsevier Inc. All rights reserved.
Nested association mapping for dissecting complex traits using Peanut 58K SNP array
USDA-ARS's Scientific Manuscript database
Genome-wide association studies (GWAS) and linkage mapping have been the two most predominant strategies to dissect complex traits, but are limited by the occurrence of false positives reported for GWAS, and low resolution in the case of linkage analysis. This has led to the development of a joint a...
USDA-ARS's Scientific Manuscript database
Bacterial cold water disease (BCWD) causes significant mortality and economic losses in salmonids aquaculture. In previous studies we have identified moderate-large effect QTL for BCWD resistance in rainbow trout (Oncorhynchus mykiss). However, the recent availability of a high density SNP array and...
Design of a bovine low-density SNP array optimized for imputation
USDA-ARS's Scientific Manuscript database
The Illumina BovineLD BeadChip was designed to support imputation to higher density genotypes in dairy and beef breeds by including single-nucleotide polymorphisms (SNPs) that had a high minor allele frequency as well as uniform spacing across the genome except at the ends of the chromosome where de...
2010-10-01
5 Results ...to disease prognosis and in determining the course of treatment for the patient (2) . Breast cancer is a highly heterogeneous and complex disease...progression is a challenge. Introduction of high density single nucleotide polymorphism (SNP) genotyping arrays has helped not only for whole genome
USDA-ARS's Scientific Manuscript database
Improving water-use efficiency by incorporating drought avoidance traits into new wheat varieties is an important objective for wheat breeding in water-limited environments. This study uses genome wide association studies (GWAS) to identify candidate loci for water-soluble carbohydrate accumulation,...
Arlt, Martin F.; Ozdemir, Alev Cagla; Birkeland, Shanda R.; Lyons, Robert H.; Glover, Thomas W.; Wilson, Thomas E.
2011-01-01
Copy-number variants (CNVs) are a major source of genetic variation in human health and disease. Previous studies have implicated replication stress as a causative factor in CNV formation. However, existing data are technically limited in the quality of comparisons that can be made between human CNVs and experimentally induced variants. Here, we used two high-resolution strategies—single nucleotide polymorphism (SNP) arrays and mate-pair sequencing—to compare CNVs that occur constitutionally to those that arise following aphidicolin-induced DNA replication stress in the same human cells. Although the optimized methods provided complementary information, sequencing was more sensitive to small variants and provided superior structural descriptions. The majority of constitutional and all aphidicolin-induced CNVs appear to be formed via homology-independent mechanisms, while aphidicolin-induced CNVs were of a larger median size than constitutional events even when mate-pair data were considered. Aphidicolin thus appears to stimulate formation of CNVs that closely resemble human pathogenic CNVs and the subset of larger nonhomologous constitutional CNVs. PMID:21212237
Kuhn, Alexandre; Ong, Yao Min; Cheng, Ching-Yu; Wong, Tien Yin; Quake, Stephen R; Burkholder, William F
2014-06-03
Insertions of the human-specific subfamily of LINE-1 (L1) retrotransposon are highly polymorphic across individuals and can critically influence the human transcriptome. We hypothesized that L1 insertions could represent genetic variants determining important human phenotypic traits, and performed an integrated analysis of L1 elements and single nucleotide polymorphisms (SNPs) in several human populations. We found that a large fraction of L1s were in high linkage disequilibrium with their surrounding genomic regions and that they were well tagged by SNPs. However, L1 variants were only partially captured by SNPs on standard SNP arrays, so that their potential phenotypic impact would be frequently missed by SNP array-based genome-wide association studies. We next identified potential phenotypic effects of L1s by looking for signatures of natural selection linked to L1 insertions; significant extended haplotype homozygosity was detected around several L1 insertions. This finding suggests that some of these L1 insertions may have been the target of recent positive selection.
High-density genetic mapping identifies new susceptibility loci for rheumatoid arthritis.
Eyre, Steve; Bowes, John; Diogo, Dorothée; Lee, Annette; Barton, Anne; Martin, Paul; Zhernakova, Alexandra; Stahl, Eli; Viatte, Sebastien; McAllister, Kate; Amos, Christopher I; Padyukov, Leonid; Toes, Rene E M; Huizinga, Tom W J; Wijmenga, Cisca; Trynka, Gosia; Franke, Lude; Westra, Harm-Jan; Alfredsson, Lars; Hu, Xinli; Sandor, Cynthia; de Bakker, Paul I W; Davila, Sonia; Khor, Chiea Chuen; Heng, Khai Koon; Andrews, Robert; Edkins, Sarah; Hunt, Sarah E; Langford, Cordelia; Symmons, Deborah; Concannon, Pat; Onengut-Gumuscu, Suna; Rich, Stephen S; Deloukas, Panos; Gonzalez-Gay, Miguel A; Rodriguez-Rodriguez, Luis; Ärlsetig, Lisbeth; Martin, Javier; Rantapää-Dahlqvist, Solbritt; Plenge, Robert M; Raychaudhuri, Soumya; Klareskog, Lars; Gregersen, Peter K; Worthington, Jane
2012-12-01
Using the Immunochip custom SNP array, which was designed for dense genotyping of 186 loci identified through genome-wide association studies (GWAS), we analyzed 11,475 individuals with rheumatoid arthritis (cases) of European ancestry and 15,870 controls for 129,464 markers. We combined these data in a meta-analysis with GWAS data from additional independent cases (n = 2,363) and controls (n = 17,872). We identified 14 new susceptibility loci, 9 of which were associated with rheumatoid arthritis overall and five of which were specifically associated with disease that was positive for anticitrullinated peptide antibodies, bringing the number of confirmed rheumatoid arthritis risk loci in individuals of European ancestry to 46. We refined the peak of association to a single gene for 19 loci, identified secondary independent effects at 6 loci and identified association to low-frequency variants at 4 loci. Bioinformatic analyses generated strong hypotheses for the causal SNP at seven loci. This study illustrates the advantages of dense SNP mapping analysis to inform subsequent functional investigations.
Genome-wide SNP association-based localization of a dwarfism gene in Friesian dwarf horses.
Orr, N; Back, W; Gu, J; Leegwater, P; Govindarajan, P; Conroy, J; Ducro, B; Van Arendonk, J A M; MacHugh, D E; Ennis, S; Hill, E W; Brama, P A J
2010-12-01
The recent completion of the horse genome and commercial availability of an equine SNP genotyping array has facilitated the mapping of disease genes. We report putative localization of the gene responsible for dwarfism, a trait in Friesian horses that is thought to have a recessive mode of inheritance, to a 2-MB region of chromosome 14 using just 10 affected animals and 10 controls. We successfully genotyped 34,429 SNPs that were tested for association with dwarfism using chi-square tests. The most significant SNP in our study, BIEC2-239376 (P(2df) = 4.54 × 10^-5, P(rec) = 7.74 × 10^-6), is located close to a gene implicated in human dwarfism. Fine-mapping and resequencing analyses did not aid in further localization of the causative variant, and replication of our findings in independent sample sets will be necessary to confirm these results. © 2010 The Authors, Journal compilation © 2010 Stichting International Foundation for Animal Genetics.
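A genotype-based chi-square test with 2 degrees of freedom, as referred to by P(2df) above, compares the counts of the three genotypes between cases and controls. The Python sketch below is a generic illustration, not the authors' analysis pipeline; the genotype counts are hypothetical. It uses the fact that the chi-square survival function with 2 degrees of freedom is exp(-x/2), so no statistics library is needed.

```python
from math import exp


def genotype_chi_square(case_counts, control_counts):
    """Pearson chi-square statistic for a 2 x 3 genotype table.

    Each argument holds counts of genotypes (AA, AB, BB). Cells whose
    expected count is zero are skipped; note that an empty genotype column
    would reduce the degrees of freedom below 2.
    """
    table = [list(case_counts), list(control_counts)]
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            if expected > 0:
                stat += (observed - expected) ** 2 / expected
    return stat


def p_value_2df(stat):
    """Upper-tail probability of a chi-square variable with 2 df."""
    return exp(-stat / 2.0)


# Hypothetical counts for 10 affected and 10 control animals.
cases = (10, 0, 0)      # all affected animals homozygous AA
controls = (1, 5, 4)
stat = genotype_chi_square(cases, controls)
print(f"chi2 = {stat:.3f}, P(2df) = {p_value_2df(stat):.2e}")
```

With only 20 animals the expected cell counts are small, so the chi-square approximation is rough; an exact test would be a reasonable alternative at this sample size.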
High density genetic mapping identifies new susceptibility loci for rheumatoid arthritis
Eyre, Steve; Bowes, John; Diogo, Dorothée; Lee, Annette; Barton, Anne; Martin, Paul; Zhernakova, Alexandra; Stahl, Eli; Viatte, Sebastien; McAllister, Kate; Amos, Christopher I.; Padyukov, Leonid; Toes, Rene E.M.; Huizinga, Tom W.J.; Wijmenga, Cisca; Trynka, Gosia; Franke, Lude; Westra, Harm-Jan; Alfredsson, Lars; Hu, Xinli; Sandor, Cynthia; de Bakker, Paul I.W.; Davila, Sonia; Khor, Chiea Chuen; Heng, Khai Koon; Andrews, Robert; Edkins, Sarah; Hunt, Sarah E; Langford, Cordelia; Symmons, Deborah; Concannon, Pat; Onengut-Gumuscu, Suna; Rich, Stephen S; Deloukas, Panos; Gonzalez-Gay, Miguel A.; Rodriguez-Rodriguez, Luis; Ärlsetig, Lisbeth; Martin, Javier; Rantapää-Dahlqvist, Solbritt; Plenge, Robert; Raychaudhuri, Soumya; Klareskog, Lars; Gregersen, Peter K; Worthington, Jane
2012-01-01
Summary Using the Immunochip custom single nucleotide polymorphism (SNP) array, designed for dense genotyping of 186 genome wide association study (GWAS) confirmed loci, we analysed 11,475 rheumatoid arthritis cases of European ancestry and 15,870 controls for 129,464 markers. The data were combined in meta-analysis with GWAS data from additional independent cases (n=2,363) and controls (n=17,872). We identified fourteen novel loci; nine were associated with rheumatoid arthritis overall and five specifically in anti-citrullinated peptide antibody positive disease, bringing the number of confirmed European ancestry rheumatoid arthritis loci to 46. We refined the peak of association to a single gene for 19 loci, identified secondary independent effects at six loci and association to low frequency variants (minor allele frequency <0.05) at 4 loci. Bioinformatic analysis of the data generated strong hypotheses for the causal SNP at seven loci. This study illustrates the advantages of dense SNP mapping analysis to inform subsequent functional investigations. PMID:23143596
Model-based variance-stabilizing transformation for Illumina microarray data.
Lin, Simon M; Du, Pan; Huber, Wolfgang; Kibbe, Warren A
2008-02-01
Variance stabilization is a step in the preprocessing of microarray data that can greatly benefit the performance of subsequent statistical modeling and inference. Due to the often limited number of technical replicates for Affymetrix and cDNA arrays, achieving variance stabilization can be difficult. Although the Illumina microarray platform provides a larger number of technical replicates on each array (usually over 30 randomly distributed beads per probe), these replicates have not been leveraged in the current log2 data transformation process. We devised a variance-stabilizing transformation (VST) method that takes advantage of the technical replicates available on an Illumina microarray. We have compared VST with log2 and Variance-stabilizing normalization (VSN) by using the Kruglyak bead-level data (2006) and Barnes titration data (2005). The results of the Kruglyak data suggest that VST stabilizes variances of bead-replicates within an array. The results of the Barnes data show that VST can improve the detection of differentially expressed genes and reduce false-positive identifications. We conclude that although both VST and VSN are built upon the same model of measurement noise, VST stabilizes the variance better and more efficiently for the Illumina platform by leveraging the availability of a larger number of within-array replicates. The algorithms and Supplementary Data are included in the lumi package of Bioconductor, available at: www.bioconductor.org.
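The general principle behind a variance-stabilizing transformation can be sketched as follows. This Python example is a sketch under stated assumptions, not the lumi implementation: it assumes the variance of an intensity with mean u follows v(u) = (c1*u + c2)^2 + c3, and the parameter values are hypothetical (in a bead-based setting they would be estimated from the within-array replicates). The transformation h(y) is the integral of 1/sqrt(v(y)), which for this variance function has the closed arcsinh form used below.

```python
import math


def variance_model(u, c1, c2, c3):
    """Assumed variance of an intensity whose mean is u."""
    return (c1 * u + c2) ** 2 + c3


def vst(y, c1, c2, c3):
    """Variance-stabilizing transform for the variance model above.

    h(y) = asinh((c1*y + c2) / sqrt(c3)) / c1, so that
    h'(y) * sqrt(variance_model(y)) = 1 for every y.
    """
    return math.asinh((c1 * y + c2) / math.sqrt(c3)) / c1


def stabilized_sd(y, c1, c2, c3, step=1e-3):
    """Delta-method SD after transformation: |h'(y)| * sd(y)."""
    slope = (vst(y + step, c1, c2, c3) - vst(y - step, c1, c2, c3)) / (2 * step)
    return abs(slope) * math.sqrt(variance_model(y, c1, c2, c3))


# Hypothetical noise parameters; not fitted to any real array.
C1, C2, C3 = 0.05, 2.0, 400.0
for intensity in (50.0, 500.0, 5000.0, 50000.0):
    print(intensity, round(stabilized_sd(intensity, C1, C2, C3), 4))
```

The approximate standard deviation after transformation stays close to 1 across the whole intensity range, whereas on the raw scale it grows with the mean. At high intensities the arcsinh form behaves like a logarithm, which is why it agrees with a log2 transform there while remaining well behaved near zero.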
SNPchiMp: a database to disentangle the SNPchip jungle in bovine livestock.
Nicolazzi, Ezequiel Luis; Picciolini, Matteo; Strozzi, Francesco; Schnabel, Robert David; Lawley, Cindy; Pirani, Ali; Brew, Fiona; Stella, Alessandra
2014-02-11
Currently, six commercial whole-genome SNP chips are available for cattle genotyping, produced by two different genotyping platforms. Technical issues need to be addressed to combine data that originates from the different platforms, or different versions of the same array generated by the manufacturer. For example: i) genome coordinates for SNPs may refer to different genome assemblies; ii) reference genome sequences are updated over time changing the positions, or even removing sequences which contain SNPs; iii) not all commercial SNP IDs are searchable within public databases; iv) SNPs can be coded using different formats and referencing different strands (e.g. A/B or A/C/T/G alleles, referencing forward/reverse, top/bottom or plus/minus strand); v) due to new information being discovered, higher density chips do not necessarily include all the SNPs present in the lower density chips; and, vi) SNP IDs may not be consistent across chips and platforms. Most researchers and breed associations manage SNP data in real-time and thus require tools to standardise data in a user-friendly manner. Here we present SNPchiMp, a MySQL database linked to an open access web-based interface. Features of this interface include, but are not limited to, the following functions: 1) referencing the SNP mapping information to the latest genome assembly, 2) extraction of information contained in dbSNP for SNPs present in all commercially available bovine chips, and 3) identification of SNPs in common between two or more bovine chips (e.g. for SNP imputation from lower to higher density). In addition, SNPchiMp can retrieve this information on subsets of SNPs, accessing such data either via physical position on a supported assembly, or by a list of SNP IDs, rs or ss identifiers. This tool combines many different sources of information that are otherwise time-consuming to obtain and difficult to integrate.
The SNPchiMp not only provides the information in a user-friendly format, but also enables researchers to perform a large number of operations with a few clicks of the mouse. This significantly reduces the time needed to execute the large number of operations required to manage SNP data.
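The idea of finding SNPs shared by two chips when their IDs are not consistent can be sketched as a join on genomic position rather than on name. This is not SNPchiMp's implementation (which the abstract describes only as a MySQL database with a web interface); the manifest layout, the function `common_snps` and the example SNP names below are all hypothetical, and both manifests are assumed to use coordinates from the same genome assembly.

```python
def common_snps(chip_a, chip_b):
    """Return SNPs present on both chips, matched by (chromosome, position).

    Each manifest is a list of (snp_name, chromosome, position) tuples whose
    coordinates are assumed to refer to the same genome assembly.
    """
    by_position = {(chrom, pos): name for name, chrom, pos in chip_a}
    shared = []
    for name_b, chrom, pos in chip_b:
        name_a = by_position.get((chrom, pos))
        if name_a is not None:
            shared.append((name_a, name_b, chrom, pos))
    return sorted(shared, key=lambda row: (row[2], row[3]))


# Hypothetical manifests: the same SNP carries different names on each chip.
low_density = [("LD_snp1", "1", 1000), ("LD_snp2", "1", 5000), ("LD_snp3", "2", 300)]
high_density = [("HD_0001", "1", 1000), ("HD_0002", "1", 2500), ("HD_0003", "2", 300)]
shared = common_snps(low_density, high_density)
```

Matching on position only works after both manifests have been remapped to one assembly, which is the first of the issues the abstract lists.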
Poćwierz-Kotus, Anita; Bernaś, Rafał; Kent, Matthew P; Lien, Sigbjørn; Leliűna, Egidijus; Dębowski, Piotr; Wenne, Roman
2015-05-06
Native populations of Atlantic salmon in Poland, from the southern Baltic region, became extinct in the 1980s. Attempts to restitute salmon populations in Poland have been based on a Latvian salmon population from the Daugava river. Releases of hatchery reared smolts started in 1986, but to date, only one population with confirmed natural reproduction has been observed in the Slupia river. Our aim was to investigate the genetic differentiation of salmon populations in the southern Baltic using a 7K SNP (single nucleotide polymorphism) array in order to assess the impact of salmon restitution in Poland. One hundred and forty salmon samples were collected from: the Polish Slupia river including wild salmon and individuals from two hatcheries, the Swedish Morrum river and the Lithuanian Neman river. All samples were genotyped using an Atlantic salmon 7K SNP array. A set of 3218 diagnostic SNPs was used for genetic analyses. Genetic structure analyses indicated that the individuals from the investigated populations were clustered into three groups i.e. one clade that included individuals from both hatcheries and the wild population from the Polish Slupia river, which was clearly separated from the other clades. An assignment test showed that there were no stray fish from the Morrum or Neman rivers in the sample analyzed from the Slupia river. Global FST over polymorphic loci was high (0.177). A strong genetic differentiation was observed between the Lithuanian and Swedish populations (FST = 0.28). Wild juvenile salmon specimens that were sampled from the Slupia river were the progeny of fish released from hatcheries and, most likely, were not progeny of stray fish from Sweden or Lithuania. Strong genetic differences were observed between the salmon populations from the three studied locations. 
Our recommendation is that future stocking activities that aim at restituting salmon populations in Poland include stocking material from the Lithuanian Neman river because of its closer geographic proximity.
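To make the FST values quoted above easier to interpret, here is a minimal sketch of Wright's FST for one biallelic locus, computed from per-population allele frequencies as (HT - HS) / HT. The abstract does not state which estimator the authors used, so this simple form (equal population weights, no sample-size correction) is an assumption, and the function name `fst_biallelic` is hypothetical.

```python
def fst_biallelic(allele_freqs):
    """Wright's FST for one biallelic locus.

    allele_freqs: frequency of the same allele in each population.
    Populations are weighted equally; no sample-size correction is applied.
    """
    k = len(allele_freqs)
    p_bar = sum(allele_freqs) / k
    h_total = 2 * p_bar * (1 - p_bar)  # expected heterozygosity, pooled
    h_within = sum(2 * p * (1 - p) for p in allele_freqs) / k  # mean within populations
    if h_total == 0:
        return 0.0  # locus is monomorphic overall
    return (h_total - h_within) / h_total


identical = fst_biallelic([0.5, 0.5])   # no differentiation
diverged = fst_biallelic([0.2, 0.8])    # strongly differentiated
fixed = fst_biallelic([0.0, 1.0])       # alternative alleles fixed
```

With this definition, identical allele frequencies give 0 and populations fixed for alternative alleles give 1.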
Small cell ovarian carcinoma: genomic stability and responsiveness to therapeutics.
Gamwell, Lisa F; Gambaro, Karen; Merziotis, Maria; Crane, Colleen; Arcand, Suzanna L; Bourada, Valerie; Davis, Christopher; Squire, Jeremy A; Huntsman, David G; Tonin, Patricia N; Vanderhyden, Barbara C
2013-02-21
The biology of small cell ovarian carcinoma of the hypercalcemic type (SCCOHT), which is a rare and aggressive form of ovarian cancer, is poorly understood. Tumourigenicity, in vitro growth characteristics, genetic and genomic anomalies, and sensitivity to standard and novel chemotherapeutic treatments were investigated in the unique SCCOHT cell line, BIN-67, to provide further insight into the biology of this rare type of ovarian cancer. The tumourigenic potential of BIN-67 cells was determined and the tumours formed in a xenograft model were compared to human SCCOHT. DNA sequencing, spectral karyotyping and high density SNP array analyses were performed. The sensitivity of the BIN-67 cells to standard chemotherapeutic agents and to vesicular stomatitis virus (VSV) and the JX-594 vaccinia virus was tested. BIN-67 cells were capable of forming spheroids in hanging drop cultures. When xenografted into immunodeficient mice, BIN-67 cells developed into tumours that reflected the hypercalcemia and histology of human SCCOHT, notably intense expression of WT-1 and vimentin, and lack of expression of inhibin. Somatic mutations in TP53 and the most common activating mutations in KRAS and BRAF were not found in BIN-67 cells by DNA sequencing. Spectral karyotyping revealed a largely normal diploid karyotype (in greater than 95% of cells) with a visibly shorter chromosome 20 contig. High density SNP array analysis also revealed few genomic anomalies in BIN-67 cells, which included loss of heterozygosity of an estimated 16.7 Mb interval on chromosome 20. SNP array analyses of four SCCOHT samples also indicated a low frequency of genomic anomalies in the majority of cases. Although resistant to platinum chemotherapeutic drugs, BIN-67 cell viability in vitro was reduced by > 75% after infection with oncolytic viruses. These results show that SCCOHT differs from high-grade serous carcinomas by exhibiting few chromosomal anomalies and lacking TP53 mutations.
Although BIN-67 cells are resistant to standard chemotherapeutic agents, their sensitivity to oncolytic viruses suggests that their therapeutic use in SCCOHT should be considered.
Kawakami, Takeshi; Backström, Niclas; Burri, Reto; Husby, Arild; Olason, Pall; Rice, Amber M; Ålund, Murielle; Qvarnström, Anna; Ellegren, Hans
2014-01-01
With access to draft genome sequence assemblies and whole-genome resequencing data from population samples, molecular ecology studies will be able to take truly genome-wide approaches. This now applies to an avian model system in ecological and evolutionary research: Old World flycatchers of the genus Ficedula, for which we recently obtained a 1.1 Gb collared flycatcher genome assembly and identified 13 million single-nucleotide polymorphisms (SNPs) in population resequencing of this species and its sister species, pied flycatcher. Here, we developed a custom 50K Illumina iSelect flycatcher SNP array with markers covering 30 autosomes and the Z chromosome. Using a number of selection criteria for inclusion in the array, both genotyping success rate and polymorphism information content (mean marker heterozygosity = 0.41) were high. We used the array to assess linkage disequilibrium (LD) and hybridization in flycatchers. Linkage disequilibrium declined quickly to the background level at an average distance of 17 kb, but the extent of LD varied markedly within the genome and was more than 10-fold higher in ‘genomic islands’ of differentiation than in the rest of the genome. Genetic ancestry analysis identified 33 F1 hybrids but no later-generation hybrids from sympatric populations of collared flycatchers and pied flycatchers, contradicting earlier reports of backcrosses identified from a much smaller number of markers. With an estimated divergence time as recent as <1 Ma, this suggests strong selection against F1 hybrids and unusually rapid evolution of reproductive incompatibility in an avian system. PMID:24784959
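Linkage disequilibrium between two biallelic SNPs is commonly summarized by r², the squared correlation between alleles at the two sites. The sketch below computes it from phased haplotypes; the abstract does not say which LD statistic or phasing approach was used, so both the choice of r² and the function name `r_squared` are assumptions for illustration.

```python
def r_squared(haplotypes):
    """LD statistic r^2 between two biallelic SNPs.

    haplotypes: list of (a, b) pairs, each allele coded 0 or 1, one pair per
    phased chromosome. Both SNPs must be polymorphic in the sample.
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n
    p_b = sum(b for _, b in haplotypes) / n
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b  # coefficient of disequilibrium
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined when either SNP is monomorphic")
    return d * d / denom


complete_ld = r_squared([(0, 0), (1, 1), (0, 0), (1, 1)])  # alleles always co-occur
no_ld = r_squared([(0, 0), (0, 1), (1, 0), (1, 1)])        # alleles independent
```

Plotting such pairwise values against physical distance is the usual way to read off a decay distance like the 17 kb reported above.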
USDA-ARS's Scientific Manuscript database
Natural-origin steelhead in the Pacific Northwest USA are threatened by a number of factors including habitat destruction, disease, decline in marine survival and a potential erosion of genetic viability due to introgression from hatchery strains. The major goal of this study was to use a recently ...
Discovery of 20,000 RAD-SNPs and development of a 52-SNP array for monitoring river otters
Jeffrey B. Stetz; Seth Smith; Michael A. Sawaya; Alan B. Ramsey; Stephen J. Amish; Michael K. Schwartz; Gordon Luikart
2016-01-01
Many North American river otter (Lontra canadensis) populations are threatened or recovering but are difficult to study because they occur at low densities, it is difficult to visually identify individuals, and they inhabit aquatic environments that accelerate degradation of biological samples. Single nucleotide polymorphisms (SNPs) can improve our ability to...
USDA-ARS's Scientific Manuscript database
Tea [Camellia sinensis (L.) O Kuntze] is an economically important crop cultivated in more than 50 countries. Production and marketing of premium specialty tea products provides opportunities for tea growers, the tea industry and consumers. Rapid market segmentation in the tea industry has resulted ...
USDA-ARS's Scientific Manuscript database
Single nucleotide polymorphisms (SNPs) are capable of providing the highest level of genome coverage for genomic and genetic analysis because of their abundance and relatively even distribution in the genome. Such a capacity, however, cannot be achieved without an efficient genotyping platform such ...
USDA-ARS's Scientific Manuscript database
Bacterial cold water disease (BCWD) causes significant mortality and economic losses in salmonid aquaculture. In previous studies, we identified moderate-large effect QTL for BCWD resistance in rainbow trout (Oncorhynchus mykiss). However, the recent availability of a 57K SNP array and a genome phys...
Cappola, Thomas P; Matkovich, Scot J; Wang, Wei; van Booven, Derek; Li, Mingyao; Wang, Xuexia; Qu, Liming; Sweitzer, Nancy K; Fang, James C; Reilly, Muredach P; Hakonarson, Hakon; Nerbonne, Jeanne M; Dorn, Gerald W
2011-02-08
Common heart failure has a strong undefined heritable component. Two recent independent cardiovascular SNP array studies identified a common SNP at 1p36 in intron 2 of the HSPB7 gene as being associated with heart failure. HSPB7 resequencing identified other risk alleles but no functional gene variants. Here, we further show no effect of the HSPB7 SNP on cardiac HSPB7 mRNA levels or splicing, suggesting that the SNP marks the position of a functional variant in another gene. Accordingly, we used massively parallel platforms to resequence all coding exons of the adjacent CLCNKA gene, which encodes the K(a) renal chloride channel (ClC-K(a)). Of 51 exonic CLCNKA variants identified, one SNP (rs10927887, encoding Arg83Gly) was common, in linkage disequilibrium with the heart failure risk SNP in HSPB7, and associated with heart failure in two independent Caucasian referral populations (n = 2,606 and 1,168; combined P = 2.25 × 10^-6). Individual genotyping of rs10927887 in the two study populations and a third independent heart failure cohort (combined n = 5,489) revealed an additive allele effect on heart failure risk that is independent of age, sex, and prior hypertension (odds ratio = 1.27 per allele copy; P = 8.3 × 10^-7). Functional characterization of recombinant wild-type Arg83 and variant Gly83 ClC-K(a) chloride channel currents revealed ≈ 50% loss-of-function of the variant channel. These findings identify a common, functionally significant genetic risk factor for Caucasian heart failure. The variant CLCNKA risk allele, telegraphed by linked variants in the adjacent HSPB7 gene, uncovers a previously overlooked genetic mechanism affecting the cardio-renal axis.
Schlebusch, Carina M; Soodyall, Himlya
2012-12-01
The San and Khoe people currently represent remnant groups of a much larger and widely distributed population of hunter-gatherers and pastoralists who had exclusive occupation of southern Africa before the arrival of Bantu-speaking groups in the past 1,200 years and sea-borne immigrants within the last 350 years. Genetic studies [mitochondrial deoxyribonucleic acid (DNA) and Y-chromosome] conducted on San and Khoe groups revealed that they harbor some of the most divergent lineages found in living peoples throughout the world. Recently, high-density, autosomal, single-nucleotide polymorphism (SNP)-array studies confirmed the early divergence of Khoe-San population groups from all other human populations. The present study made use of 220 autosomal SNP markers (in the format of both haplotypes and genotypes) to examine the population structure of various San and Khoe groups and their relationship to other neighboring groups. Whereas analyses based on the genotypic SNP data only supported the division of the included populations into three main groups (Khoe-San, Bantu-speakers, and non-African populations), haplotype analyses revealed finer structure within Khoe-San populations. By the use of only 44 short SNP haplotypes (compiled from a total of 220 SNPs), most of the Khoe-San groups could be resolved as separate groups by applying STRUCTURE analyses. Therefore, by carefully selecting a few SNPs and combining them into haplotypes, we were able to achieve the same level of population distinction that was achieved previously in high-density SNP studies on the same population groups. Using haplotypes proved to be a very efficient and cost-effective way to study population structure. Copyright © 2013 Wayne State University Press, Detroit, Michigan 48201-1309.
Clover Biotechnology Research at FAPRU
USDA-ARS's Scientific Manuscript database
Randy Dinkins (USDA-ARS-FAPRU) is conducting research to determine the utility of the Medicago Affymetrix Genechip for use with red clover (Trifolium pratense). The Medicago Affymetrix Genechip contains approximately 51,000 probe sets that are derived from Medicago truncatula, 1,800 from Medi...
The Minnesota Center for Twin and Family Research Genome-Wide Association Study
Miller, Michael B.; Basu, Saonli; Cunningham, Julie; Eskin, Eleazar; Malone, Steven M.; Oetting, William S.; Schork, Nicholas; Sul, Jae Hoon; Iacono, William G.; Mcgue, Matt
2012-01-01
As part of the Genes, Environment and Development Initiative (GEDI), the Minnesota Center for Twin and Family Research (MCTFR) undertook a genome-wide association study (GWAS), which we describe here. A total of 8405 research participants, clustered in 4-member families, have been successfully genotyped on 527,829 single nucleotide polymorphism (SNP) markers using Illumina’s Human660W-Quad array. Quality control screening of samples and markers as well as SNP imputation procedures are described. We also describe methods for ancestry control and how the familial clustering of the MCTFR sample can be accounted for in the analysis using a Rapid Feasible Generalized Least Squares algorithm. The rich longitudinal MCTFR assessments provide numerous opportunities for collaboration. PMID:23363460
Gorkhali, Neena Amatya; Dong, Kunzhe; Yang, Min; Song, Shen; Kader, Adiljian; Shrestha, Bhola Shankar; He, Xiaohong; Zhao, Qianjun; Pu, Yabin; Li, Xiangchen; Kijas, James; Guan, Weijun; Han, Jianlin; Jiang, Lin; Ma, Yuehui
2016-07-22
Sheep have successfully adapted to the extreme high-altitude Himalayan region. To identify genes underlying such adaptation, we genotyped genome-wide single nucleotide polymorphisms (SNPs) of four major sheep breeds living at different altitudes in Nepal and downloaded SNP array data from additional Asian and Middle East breeds. Using a di value-based genomic comparison between four high-altitude and eight lowland Asian breeds, we discovered the most differentiated variants at the locus of FGF-7 (Keratinocyte growth factor-7), which was previously reported as a good protective candidate for pulmonary injuries. We further found a SNP upstream of FGF-7 that appears to contribute to the divergence signature. First, the SNP occurred at an extremely conserved site. Second, the SNP showed an increasing allele frequency with increasing altitude in Nepalese sheep. Third, electrophoretic mobility shift assay (EMSA) analysis using human lung cancer cells revealed allele-specific DNA-protein interactions. We thus hypothesized that the FGF-7 gene potentially enhances lung function in high-altitude sheep by regulating its expression level through altered binding of specific transcription factors. Notably, the FGF-7 gene was not implicated in previous studies of other high-altitude species, suggesting a potential novel adaptive mechanism to high altitude in sheep in the Himalayas.
Huggins, P; Johnson, CK; Schoergendorfer, A; Putta, S; Bathke, AC; Stromberg, AJ; Voss, SR
2011-01-01
The Mexican axolotl (Ambystoma mexicanum) presents an excellent model to investigate mechanisms of brain development that are conserved among vertebrates. In particular, metamorphic changes of the brain can be induced in free-living aquatic juveniles and adults by simply adding thyroid hormone (T4) to rearing water. Whole brains were sampled from juvenile A. mexicanum that were exposed to 0, 8, and 18 days of 50 nM T4, and these were used to isolate RNA and make normalized cDNA libraries for 454 DNA sequencing. A total of 1,875,732 high quality cDNA reads were assembled with existing ESTs to obtain 5,884 new contigs for human RefSeq protein models, and to develop a custom Affymetrix gene expression array (Amby_002) with approximately 20,000 probe sets. The Amby_002 array was used to identify 303 transcripts that differed statistically (p < 0.05, fold change > 1.5) as a function of days of T4 treatment. Further statistical analyses showed that Amby_002 performed concordantly in comparison to an existing, small format expression array. This study introduces a new A. mexicanum microarray resource for the community and the first lists of T4-responsive genes from the brain of a salamander amphibian. PMID:21457787
Huggins, P; Johnson, C K; Schoergendorfer, A; Putta, S; Bathke, A C; Stromberg, A J; Voss, S R
2012-01-01
The Mexican axolotl (Ambystoma mexicanum) presents an excellent model to investigate mechanisms of brain development that are conserved among vertebrates. In particular, metamorphic changes of the brain can be induced in free-living aquatic juveniles and adults by simply adding thyroid hormone (T4) to rearing water. Whole brains were sampled from juvenile A. mexicanum that were exposed to 0, 8, and 18 days of 50 nM T4, and these were used to isolate RNA and make normalized cDNA libraries for 454 DNA sequencing. A total of 1,875,732 high quality cDNA reads were assembled with existing ESTs to obtain 5884 new contigs for human RefSeq protein models, and to develop a custom Affymetrix gene expression array (Amby_002) with approximately 20,000 probe sets. The Amby_002 array was used to identify 303 transcripts that differed statistically (p<0.05, fold change >1.5) as a function of days of T4 treatment. Further statistical analyses showed that Amby_002 performed concordantly in comparison to an existing, small format expression array. This study introduces a new A. mexicanum microarray resource for the community and the first lists of T4-responsive genes from the brain of a salamander amphibian. Copyright © 2011 Elsevier Inc. All rights reserved.
Correction of Spatial Bias in Oligonucleotide Array Data
Lemieux, Sébastien
2013-01-01
Background. Oligonucleotide microarrays allow for high-throughput gene expression profiling assays. The technology relies on the fundamental assumption that observed hybridization signal intensities (HSIs) for each intended target, on average, correlate with their target's true concentration in the sample. However, systematic, nonbiological variation from several sources undermines this hypothesis. Background hybridization signal has been previously identified as one such important source, one manifestation of which appears in the form of spatial autocorrelation. Results. We propose an algorithm, pyn, for the elimination of spatial autocorrelation in HSIs, exploiting the duality of desirable mutual information shared by probes in a common probe set and undesirable mutual information shared by spatially proximate probes. We show that this correction procedure reduces spatial autocorrelation in HSIs; increases HSI reproducibility across replicate arrays; increases differentially expressed gene detection power; and performs better than previously published methods. Conclusions. The proposed algorithm increases both precision and accuracy, while requiring virtually no changes to users' current analysis pipelines: the correction consists merely of a transformation of raw HSIs (e.g., CEL files for Affymetrix arrays). A free, open-source implementation is provided as an R package, compatible with standard Bioconductor tools. The approach may also be tailored to other platform types and other sources of bias. PMID:23573083
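Spatial autocorrelation on an array surface can be quantified with Moran's I, a standard statistic that is positive when neighbouring cells carry similar values and negative when they alternate. The sketch below is not the pyn algorithm from the abstract, whose details are not given here; it only illustrates how the bias being corrected might be measured. The function name `morans_i` and the choice of rook (up, down, left, right) adjacency with unit weights are assumptions.

```python
def morans_i(grid):
    """Moran's I for a rectangular grid using rook adjacency and unit weights.

    grid: list of equal-length rows of numbers (e.g. probe intensities laid
    out by physical position). The grid must not be constant.
    """
    rows, cols = len(grid), len(grid[0])
    values = [v for row in grid for v in row]
    n = len(values)
    mean = sum(values) / n
    denom = sum((v - mean) ** 2 for v in values)
    cross = 0.0
    weight_total = 0
    for r in range(rows):
        for c in range(cols):
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    cross += (grid[r][c] - mean) * (grid[rr][cc] - mean)
                    weight_total += 1
    return (n / weight_total) * (cross / denom)


checkerboard = morans_i([[1, 0], [0, 1]])  # neighbours always differ
blocks = morans_i([[0, 0, 0, 0], [0, 0, 0, 0], [1, 1, 1, 1], [1, 1, 1, 1]])  # smooth regions
```

A value near zero after correction would indicate that intensities no longer depend on where a probe sits on the array.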
Dupľáková, Nikoleta; Renák, David; Hovanec, Patrik; Honysová, Barbora; Twell, David; Honys, David
2007-07-23
Microarray technologies now belong to the standard functional genomics toolbox and have undergone massive development leading to increased genome coverage, accuracy and reliability. The number of experiments exploiting microarray technology has markedly increased in recent years. In parallel with the rapid accumulation of transcriptomic data, on-line analysis tools are being introduced to simplify their use. Global statistical data analysis methods contribute to the development of overall concepts about gene expression patterns and to query and compose working hypotheses. More recently, these applications are being supplemented with more specialized products offering visualization and specific data mining tools. We present a curated gene family-oriented gene expression database, Arabidopsis Gene Family Profiler (aGFP; http://agfp.ueb.cas.cz), which gives the user access to a large collection of normalised Affymetrix ATH1 microarray datasets. The database currently contains NASC Array and AtGenExpress transcriptomic datasets for various tissues at different developmental stages of wild type plants gathered from nearly 350 gene chips. The Arabidopsis GFP database has been designed as an easy-to-use tool for users needing an easily accessible resource for expression data of single genes, pre-defined gene families or custom gene sets, with the further possibility of keyword search. Arabidopsis Gene Family Profiler presents a user-friendly web interface using both graphic and text output. Data are stored at the MySQL server and individual queries are created in PHP script. 
The most distinctive features of the Arabidopsis Gene Family Profiler database are: 1) the presentation of normalized datasets (Affymetrix MAS algorithm and calculation of model-based gene-expression values based on the Perfect Match-only model); 2) the choice between two different normalization algorithms (Affymetrix MAS4 or MAS5 algorithms); 3) an intuitive interface; 4) an interactive "virtual plant" visualizing the spatial and developmental expression profiles of both gene families and individual genes. Arabidopsis GFP gives users the possibility to analyze current Arabidopsis developmental transcriptomic data starting with simple global queries that can be expanded and further refined to visualize comparative and highly selective gene expression profiles.
Characterization of a Genomic Signature of Pregnancy in the Breast
Belitskaya-Lévy, Ilana; Zeleniuch-Jacquotte, Anne; Russo, Jose; Russo, Irma H.; Bordás, Pal; Åhman, Janet; Afanasyeva, Yelena; Johansson, Robert; Lenner, Per; Li, Xiaochun; de Cicco, Ricardo López; Peri, Suraj; Ross, Eric; Russo, Patricia A.; Santucci-Pereira, Julia; Sheriff, Fathima S.; Slifker, Michael; Hallmans, Göran; Toniolo, Paolo; Arslan, Alan A.
2012-01-01
The objective of the current study was to comprehensively compare the genomic profiles in the breast of parous and nulliparous postmenopausal women to identify genes that permanently change their expression following pregnancy. The study was designed as a two-phase approach. In the discovery phase, we compared breast genomic profiles of 37 parous with 18 nulliparous postmenopausal women. In the validation phase, confirmation of the genomic patterns observed in the discovery phase was sought in an independent set of 30 parous and 22 nulliparous postmenopausal women. RNA was hybridized to Affymetrix HG_U133 Plus 2.0 oligonucleotide arrays containing probes to 54,675 transcripts; the arrays were scanned and the images analyzed using Affymetrix GCOS software. Surrogate variable analysis, logistic regression and significance analysis for microarrays were used to identify statistically significant differences in expression of genes. The False Discovery Rate (FDR) approach was used to control for multiple comparisons. We found that 208 genes (305 probe sets) were differentially expressed between parous and nulliparous women in both discovery and validation phases of the study at an FDR of 10% and with at least a 1.25-fold change. These genes are involved in regulation of transcription, centrosome organization, RNA splicing, cell cycle control, adhesion and differentiation. The results provide persuasive evidence that full-term pregnancy induces long-term genomic changes in the breast. The genomic signature of pregnancy could be used as an intermediate marker to assess potential chemopreventive interventions with hormones mimicking the effects of pregnancy for prevention of breast cancer. PMID:21622728
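Controlling the false discovery rate at a level such as 10% is often done with the Benjamini-Hochberg step-up procedure, sketched below. The abstract does not say which FDR procedure was applied (significance analysis for microarrays has its own permutation-based estimate), so Benjamini-Hochberg is an assumption here, as are the function name `benjamini_hochberg` and the example p-values.

```python
def benjamini_hochberg(pvalues, fdr=0.10):
    """Return a list of booleans: True where the hypothesis is rejected.

    Step-up rule: find the largest rank k (1-based, ascending p-values) with
    p_(k) <= (k / m) * fdr, then reject the k smallest p-values.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank / m * fdr:
            k_max = rank
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected


# Hypothetical p-values for five probe sets, tested at a 10% FDR.
calls = benjamini_hochberg([0.001, 0.02, 0.04, 0.5, 0.9], fdr=0.10)
```

In a study like the one above, a probe set would additionally have to pass the fold-change threshold before being reported.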
Schoenfeld, Jonathan; Lessan, Khashayar; Johnson, Nicola A; Charnock-Jones, D Stephen; Evans, Amanda; Vourvouhaki, Ekaterini; Scott, Laurie; Stephens, Richard; Freeman, Tom C; Saidi, Samir A; Tom, Brian; Weston, Gareth C; Rogers, Peter; Smith, Stephen K; Print, Cristin G
2004-01-01
We recently published a review in this journal describing the design, hybridisation and basic data processing required to use gene arrays to investigate vascular biology (Evans et al. Angiogenesis 2003; 6: 93-104). Here, we build on this review by describing a set of powerful and robust methods for the analysis and interpretation of gene array data derived from primary vascular cell cultures. First, we describe the evaluation of transcriptome heterogeneity between primary cultures derived from different individuals, and estimation of the false discovery rate introduced by this heterogeneity and by experimental noise. Then, we discuss the appropriate use of Bayesian t-tests, clustering and independent component analysis to mine the data. We illustrate these principles by analysis of a previously unpublished set of gene array data in which human umbilical vein endothelial cells (HUVEC) cultured in either rich or low-serum media were exposed to vascular endothelial growth factor (VEGF)-A165 or placental growth factor (PlGF)-1(131). We have used Affymetrix U95A gene arrays to map the effects of these factors on the HUVEC transcriptome. These experiments followed a paired design and were biologically replicated three times. In addition, one experiment was repeated using serial analysis of gene expression (SAGE). In contrast to some previous studies, we found that VEGF-A and PlGF consistently regulated only small, non-overlapping and culture media-dependent sets of HUVEC transcripts, despite causing significant cell biological changes.
Pavy, Nathalie; Parsons, Lee S; Paule, Charles; MacKay, John; Bousquet, Jean
2006-01-01
Background: High-throughput genotyping technologies represent a highly efficient way to accelerate genetic mapping and enable association studies. As a first step toward this goal, we aimed to develop a resource of candidate Single Nucleotide Polymorphisms (SNP) in white spruce (Picea glauca [Moench] Voss), a softwood tree of major economic importance. Results: A white spruce SNP resource encompassing 12,264 SNPs was constructed from a set of 6,459 contigs derived from Expressed Sequence Tags (EST) by using the Bayesian statistical software PolyBayes. Several parameters influencing the SNP prediction were analysed including the a priori expected polymorphism, the probability score (PSNP), and the contig depth and length. SNP detection in 3' and 5' reads from the same clones revealed a level of inconsistency between overlapping sequences as low as 1%. A subset of 245 predicted SNPs were verified through the independent resequencing of genomic DNA of a genotype also used to prepare cDNA libraries. The validation rate reached a maximum of 85% for SNPs predicted with either PSNP ≥ 0.95 or ≥ 0.99. A total of 9,310 SNPs were detected by using PSNP ≥ 0.95 as a criterion. The SNPs were distributed among 3,590 contigs encompassing an array of broad functional categories, with an overall frequency of 1 SNP per 700 nucleotide sites. Experimental and statistical approaches were used to evaluate the proportion of paralogous SNPs, with estimates in the range of 8 to 12%. The 3,789 coding SNPs identified through coding region annotation and ORF prediction were distributed into 39% nonsynonymous and 61% synonymous substitutions. Overall, there were 0.9 SNP per 1,000 nonsynonymous sites and 5.2 SNPs per 1,000 synonymous sites, for a genome-wide nonsynonymous to synonymous substitution rate ratio (Ka/Ks) of 0.17.
Conclusion: We integrated the SNP data in the ForestTreeDB database along with functional annotations to provide a tool facilitating the choice of candidate genes for mapping purposes or association studies. PMID:16824208
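The Ka/Ks figure quoted in this abstract follows directly from the two per-site SNP densities it reports. The short check below reproduces that arithmetic; the variable names are illustrative, and the inputs are the rounded values from the abstract, so the result is only as precise as those.

```python
# SNP densities reported in the abstract (per 1,000 sites of each class).
nonsynonymous_per_1000 = 0.9
synonymous_per_1000 = 5.2

# The ratio of the two densities gives the nonsynonymous-to-synonymous ratio;
# the per-1,000 scaling cancels out.
ka_ks = nonsynonymous_per_1000 / synonymous_per_1000
ka_ks_rounded = round(ka_ks, 2)
```

A ratio well below 1, as here, is conventionally read as a predominance of purifying selection on coding sequence.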
Kerr, Jonathan R; Kaushik, Narendra; Fear, David; Baldwin, Don A; Nuwaysir, Emile F; Adcock, Ian M
2005-07-15
This study was undertaken to further examine the role of the host response to parvovirus B19 in the development of symptoms and consequences of viral persistence. Genomic DNA from 42 patients with symptomatic B19 infection was analyzed using the HuSNP assay (Affymetrix), and the results were compared with those from analysis of 53 healthy control individuals. Fifty-seven single-nucleotide polymorphisms were identified that were significantly associated with symptomatic infection. Total RNA from peripheral blood mononuclear cells from 57 B19-seropositive and 13 B19-seronegative donors was analyzed by hybridization to a single-color microarray representing 9522 human genes. Ninety-two genes were shown to be differentially expressed. Differential expression was confirmed in 6 of 38 genes (SKIP, MACF1, SPAG7, FLOT1, c6orf48, and RASSF5) tested using real-time quantitative polymerase chain reaction in a different group of healthy subjects. Genes identified in both studies play a functional role in the cytoskeleton, integrin signaling, and oncosuppression, themes that have been shown to be important in parvovirus infections.
Tüysüz, Beyhan; Bayrakli, Fatih; DiLuna, Michael L; Bilguvar, Kaya; Bayri, Yasar; Yalcinkaya, Cengiz; Bursali, Aysegul; Ozdamar, Elif; Korkmaz, Baris; Mason, Christopher E; Ozturk, Ali K; Lifton, Richard P; State, Matthew W; Gunel, Murat
2008-05-01
Hereditary sensory and autonomic neuropathy type IV (HSAN IV), or congenital insensitivity to pain with anhidrosis, is an autosomal recessive disorder characterized by insensitivity to noxious stimuli, anhidrosis from deinnervated sweat glands, and delayed mental and motor development. Mutations in the neurotrophic tyrosine kinase receptor type 1 (NTRK1), a receptor in the neurotrophin signaling pathway phosphorylated in response to nerve growth factor, are associated with this disorder. We identified six families from Northern Central Turkey with HSAN IV. We screened the NTRK1 gene for mutations in these families. Microsatellite and single nucleotide polymorphism (SNP) markers on the Affymetrix 250K chip platform were used to determine the haplotypes for three families harboring the same mutation. Screening for mutations in the NTRK1 gene demonstrated one novel frameshift mutation, two novel nonsense mutations, and three unrelated kindreds with the same splice-site mutation. Genotyping of the three families with the identical splice-site mutation revealed that they share the same haplotype. This report broadens the spectrum of mutations in NTRK1 that cause HSAN IV and demonstrates a founder mutation in the Turkish population.
Schönhals, Elske Maria; Ding, Jia; Ritter, Enrique; Paulo, Maria João; Cara, Nicolás; Tacke, Ekhard; Hofferbert, Hans-Reinhard; Lübeck, Jens; Strahwald, Josef; Gebhardt, Christiane
2017-08-22
Tuber yield and starch content of the cultivated potato are complex traits of decisive importance for breeding improved varieties. Natural variation of tuber yield and starch content depends on the environment and on multiple, mostly unknown genetic factors. Dissection and molecular identification of the genes and their natural allelic variants controlling these complex traits will lead to the development of diagnostic DNA-based markers, by which precision and efficiency of selection can be increased (precision breeding). Three case-control populations were assembled from tetraploid potato cultivars based on maximizing the differences between high and low tuber yield (TY), starch content (TSC) and starch yield (TSY, arithmetic product of TY and TSC). The case-control populations were genotyped by restriction-site associated DNA sequencing (RADseq) and the 8.3 k SolCAP SNP genotyping array. The allele frequencies of single nucleotide polymorphisms (SNPs) were compared between cases and controls. RADseq identified, depending on data filtering criteria, between 6664 and 450 genes with one or more differential SNPs for one, two or all three traits. Differential SNPs in 275 genes were detected using the SolCAP array. A genome wide association study using the SolCAP array on an independent, unselected population identified SNPs associated with tuber starch content in 117 genes. Physical mapping of the genes containing differential or associated SNPs, and comparisons between the two genome wide genotyping methods and two different populations identified genome segments on all twelve potato chromosomes harboring one or more quantitative trait loci (QTL) for TY, TSC and TSY. Several hundred genes control tuber yield and starch content in potato. They are unequally distributed on all potato chromosomes, forming clusters between 0.5-4 Mbp width. The largest fraction of these genes had unknown function, followed by genes with putative signalling and regulatory functions. 
The genetic control of tuber yield and starch content is interlinked. Most differential SNPs affecting both traits had antagonistic effects: The allele increasing TY decreased TSC and vice versa. Exceptions were 89 SNP alleles which had synergistic effects on TY, TSC and TSY. These and the corresponding genes are primary targets for developing diagnostic markers.
Ni, Guiyan; Cavero, David; Fangmann, Anna; Erbe, Malena; Simianer, Henner
2017-01-16
With the availability of next-generation sequencing technologies, genomic prediction based on whole-genome sequencing (WGS) data is now feasible in animal breeding schemes and was expected to lead to higher predictive ability, since such data may contain all genomic variants including causal mutations. Our objective was to compare prediction ability with high-density (HD) array data and WGS data in a commercial brown layer line with genomic best linear unbiased prediction (GBLUP) models using various approaches to weight single nucleotide polymorphisms (SNPs). A total of 892 chickens from a commercial brown layer line were genotyped with 336 K segregating SNPs (array data) that included 157 K genic SNPs (i.e. SNPs in or around a gene). For these individuals, genome-wide sequence information was imputed based on data from re-sequencing runs of 25 individuals, leading to 5.2 million (M) imputed SNPs (WGS data), including 2.6 M genic SNPs. De-regressed proofs (DRP) for eggshell strength, feed intake and laying rate were used as quasi-phenotypic data in genomic prediction analyses. Four weighting factors for building a trait-specific genomic relationship matrix were investigated: identical weights, -(log10 P) from genome-wide association study results, squares of SNP effects from random regression BLUP, and variable selection based weights (known as BLUP|GA). Predictive ability was measured as the correlation between DRP and direct genomic breeding values in five replications of a fivefold cross-validation. Averaged over the three traits, the highest predictive ability (0.366 ± 0.075) was obtained when only genic SNPs from WGS data were used. Predictive abilities with genic SNPs and all SNPs from HD array data were 0.361 ± 0.072 and 0.353 ± 0.074, respectively.
Prediction with -(log10 P) or squares of SNP effects as weighting factors for building a genomic relationship matrix or BLUP|GA did not increase accuracy, compared to that with identical weights, regardless of the SNP set used. Our results show that little or no benefit was gained when using all imputed WGS data to perform genomic prediction compared to using HD array data regardless of the weighting factors tested. However, using only genic SNPs from WGS data had a positive effect on prediction ability.
Ulloa, Mauricio; Hulse-Kemp, Amanda M; De Santiago, Luis M; Stelly, David M; Burke, John J
2017-01-01
High-density linkage maps are vital to supporting the correct placement of scaffolds and gene sequences on chromosomes and fundamental to contemporary organismal research and scientific approaches to genetic improvement, especially in paleopolyploids with exceptionally complex genomes, e.g., upland cotton (Gossypium hirsutum L., 2n = 52). Three independently developed intraspecific upland mapping populations were analyzed to generate 3 high-density genetic linkage single-nucleotide polymorphism (SNP) maps and a consensus map using the CottonSNP63K array. The populations consisted of a previously reported F2, a recombinant inbred line (RIL), and reciprocal RIL population, from "Phytogen 72" and "Stoneville 474" cultivars. The cluster file provided 7417 genotyped SNP markers, resulting in 26 linkage groups corresponding to the 26 chromosomes (c) of the allotetraploid upland cotton (AD)1, which arose from the merging of 2 genomes ("A" Old World and "D" New World). Patterns of chromosome-specific recombination were largely consistent across mapping populations. The high-density genetic consensus map included 7244 SNP markers that spanned 3538 cM and comprised 3824 SNP bins, of which 1783 and 2041 were in the At and Dt subgenomes with 1825 and 1713 cM map lengths, respectively. Subgenome average distances were nearly identical, indicating that subgenomic differences in bin number arose due to the high numbers of SNPs on the Dt subgenome. Examination of expected recombination frequency or crossovers (COs) on the chromosomes within each population of the 2 subgenomes revealed that COs were also not affected by the SNPs or SNP bin number in these subgenomes. Comparative alignment analyses identified historical ancestral At-subgenomic translocations of c02 and c03, as well as of c04 and c05. The consensus map SNP sequences aligned with high congruency to the NBI assembly of Gossypium hirsutum.
However, the genomic comparisons revealed evidence of additional unconfirmed possible duplications, inversions and translocations, and unbalanced SNP sequence homology or SNP sequence/loci genomic dominance, or homeolog loci bias of the upland tetraploid At and Dt subgenomes. The alignments indicated that 364 SNP-associated previously unintegrated scaffolds can be placed in pseudochromosomes of the NBI G. hirsutum assembly. This is the first intraspecific SNP genetic linkage consensus map assembled in G. hirsutum with a core of reproducible Mendelian SNP markers assayed on different populations and it provides further knowledge of chromosome arrangement of genic and nongenic SNPs. Together, the consensus map and RIL populations provide a synergistically useful platform for localizing and identifying agronomically important loci for improvement of the cotton crop.
Duployez, Nicolas; Boudry-Labis, Elise; Decool, Gauthier; Grzych, Guillaume; Grardel, Nathalie; Abou Chahla, Wadih; Preudhomme, Claude; Roche-Lestienne, Catherine
2015-10-01
Intrachromosomal amplification of chromosome 21 (iAMP21) defines a distinct cytogenetic subgroup of B-cell precursor acute lymphoblastic leukemia (BCP-ALL) with poor prognosis that should be investigated in routine practice. Single-nucleotide polymorphism (SNP)-array provides a useful method to detect such cases showing a highly characteristic profile.
USDA-ARS's Scientific Manuscript database
Fragaria iinumae is recognized as an ancestor of the octoploid strawberry species, including the cultivated strawberry, Fragaria ×ananassa. Here we report the construction of the first high density linkage map for F. iinumae. The map is based on two high-throughput techniques of single nucleotide p...
USDA-ARS's Scientific Manuscript database
The objective of this study was to develop a canonical SNP panel for subtyping of Shiga-toxin producing Escherichia coli (STEC). To this purpose, 906 putative SNPs were identified using resequencing tiling arrays. A subset of 391 SNPs was further screened using high-throughput TaqMan PCR against a d...
USDA-ARS's Scientific Manuscript database
A high-throughput genotyping platform is needed to enable marker-assisted breeding in the allo-octoploid cultivated strawberry Fragaria ×ananassa. Short-read sequences from one diploid and 19 octoploid accessions were aligned to the diploid Fragaria vesca ‘Hawaii 4’ reference genome to identify sing...
A Complex 6p25 Rearrangement in a Child With Multiple Epiphyseal Dysplasia
Bedoyan, Jirair K.; Lesperance, Marci M.; Ackley, Todd; Iyer, Ramaswamy K.; Innis, Jeffrey W.; Misra, Vinod K.
2015-01-01
Genomic rearrangements are increasingly recognized as important contributors to human disease. Here we report on an 11½-year-old child with myopia, Duane retraction syndrome, bilateral mixed hearing loss, skeletal anomalies including multiple epiphyseal dysplasia, and global developmental delay, and a complex 6p25 genomic rearrangement. We have employed oligonucleotide-based comparative genomic hybridization arrays (aCGH) of different resolutions (44 and 244K) as well as a 1 M single nucleotide polymorphism (SNP) array to analyze this complex rearrangement. Our analyses reveal a complex rearrangement involving a ~2.21 Mb interstitial deletion, a ~240 kb terminal deletion, and a 70–80 kb region in between these two deletions that shows maintenance of genomic copy number. The interstitial deletion contains eight known genes, including three Forkhead box containing (FOX) transcription factors (FOXQ1, FOXF2, and FOXC1). The region maintaining genomic copy number partly overlaps the dual specificity protein phosphatase 22 (DUSP22) gene. Array analyses suggest a homozygous loss of genomic material at the 5′ end of DUSP22, which was corroborated using TaqMan® copy number analysis. It is possible that this homozygous genomic loss may render both copies of DUSP22 or its products non-functional. Our analysis suggests a rearrangement mechanism distinct from a previously reported replication-based error-prone mechanism without template switching for a specific 6p25 rearrangement with a 1.22 Mb interstitial deletion. Our study demonstrates the utility and limitations of using oligonucleotide-based aCGH and SNP array technologies of increasing resolutions in order to identify complex DNA rearrangements and gene disruptions. PMID:21204225
Jensen, Lars R; Chen, Wei; Moser, Bettina; Lipkowitz, Bettina; Schroeder, Christopher; Musante, Luciana; Tzschach, Andreas; Kalscheuer, Vera M; Meloni, Ilaria; Raynaud, Martine; van Esch, Hilde; Chelly, Jamel; de Brouwer, Arjan P M; Hackett, Anna; van der Haar, Sigrun; Henn, Wolfram; Gecz, Jozef; Riess, Olaf; Bonin, Michael; Reinhardt, Richard; Ropers, Hans-Hilger; Kuss, Andreas W
2011-01-01
X-linked intellectual disability (XLID), also known as X-linked mental retardation, is a highly genetically heterogeneous condition for which mutations in >90 different genes have been identified. In this study, we used a custom-made sequencing array based on the Affymetrix 50k platform for mutation screening in 17 known XLID genes in patients from 135 families and found eight single-nucleotide changes that were absent in controls. For four mutations affecting ATRX (p.1761M>T), PQBP1 (p.155R>X) and SLC6A8 (p.390P>L and p.477S>L), we provide evidence for a functional involvement of these changes in the aetiology of intellectual disability. PMID:21267006
Alexiev, Borislav A; Zou, Ying S
2014-12-01
Chromosomal microarray analysis using novel Molecular Inversion Probe (MIP) technology demonstrated a 2,570 kb copy neutral LOH of 10q11.22 in two clear cell papillary renal cell carcinomas. In addition, one of the tumors had a large 29,784 kb deletion of 13q11-q14.2. There were two variants of unknown significance, a 2,509 kb gain of Xp22.33 and a 257 kb homozygous deletion of 8p11.22. The somatic mutation panel containing 74 mutations in nine genes did not reveal any mutations. Besides identifying submicroscopic duplications or deletions, SNP microarrays can reveal abnormal allelic imbalances including LOH and copy neutral LOH, which cannot be recognized by chromosome analysis, FISH, or non-SNP microarrays. To the best of our knowledge, this is the first study demonstrating copy neutral LOH of 10q11.22 in clear cell papillary renal cell carcinomas using the new MIP SNP OncoScan FFPE Assay Kit on formalin-fixed paraffin-embedded tumor samples. Copyright © 2014 Elsevier GmbH. All rights reserved.
Foresman, Bradley J.; Oliver, Rebekah E.; Jackson, Eric W.; Chao, Shiaoman; Arruda, Marcio P.; Kolb, Frederic L.
2016-01-01
Barley yellow dwarf viruses (BYDVs) are responsible for the disease barley yellow dwarf (BYD) and affect many cereals including oat (Avena sativa L.). Until recently, the molecular marker technology in oat has not allowed for many marker-trait association studies to determine the genetic mechanisms for tolerance. A genome-wide association study (GWAS) was performed on 428 spring oat lines using a recently developed high-density oat single nucleotide polymorphism (SNP) array as well as a SNP-based consensus map. Marker-trait associations were performed using a Q-K mixed model approach to control for population structure and relatedness. Six significant SNP-trait associations representing two QTL were found on chromosomes 3C (Mrg17) and 18D (Mrg04). This is the first report of BYDV tolerance QTL on chromosome 3C (Mrg17) and 18D (Mrg04). Haplotypes using the two QTL were evaluated and distinct classes for tolerance were identified based on the number of favorable alleles. A large number of lines carrying both favorable alleles were observed in the panel. PMID:27175781
Uniparental disomy and prenatal phenotype
Li, Xiaofei; Liu, Yan; Yue, Song; Wang, Li; Zhang, Tiejuan; Guo, Cuixia; Hu, Wenjie; Kagan, Karl-Oliver; Wu, Qingqing
2017-01-01
Rationale: Uniparental disomy (UPD) describes the inheritance of both homologues of a chromosome pair from the same parent. The consequences of UPD depend on the specific chromosome/segment involved and its parental origin. Patient concerns: We report prenatal phenotypes of 2 rare cases of UPD. Diagnoses: The prenatal phenotype of case 1 included sonographic markers such as enlarged nuchal translucency (NT), absent nasal bone, short femur and humerus length, and several structural malformations involving Dandy–Walker malformation and congenital heart defects. The prenatal phenotype of case 2 comprised sonographic markers, including enlarged NT, thickened nuchal fold, ascites, and polyhydramnios, without apparent structural malformations. Interventions: The conventional G-band karyotype appeared normal in case 1, while it showed normal chromosomes with a small supernumerary marker chromosome (sSMC) in case 2. The genetic etiology remained unknown until a single-nucleotide polymorphism-based array (SNP-array) was performed; segmental paternal UPD 22 was identified in case 1 and segmental paternal UPD 14 in case 2. Outcomes: The parents of case 1 chose termination of pregnancy. The neonate of case 2 was born prematurely with a bell-shaped small thorax and died within a day. Lessons: UPD cases are rare, and their phenotypes differ depending on the parental origin and the chromosomal segment affected. If a fetus shows multiple anomalies that cannot be attributed to a common aneuploidy or a genetic syndrome, or manifests some features possibly related to a UPD syndrome, such as detection of sSMC, SNP-array should be considered. PMID:29137034
Uniparental disomy and prenatal phenotype: Two case reports and review.
Li, Xiaofei; Liu, Yan; Yue, Song; Wang, Li; Zhang, Tiejuan; Guo, Cuixia; Hu, Wenjie; Kagan, Karl-Oliver; Wu, Qingqing
2017-11-01
Uniparental disomy (UPD) describes the inheritance of both homologues of a chromosome pair from the same parent. The consequences of UPD depend on the specific chromosome/segment involved and its parental origin. We report prenatal phenotypes of 2 rare cases of UPD. The prenatal phenotype of case 1 included sonographic markers such as enlarged nuchal translucency (NT), absent nasal bone, short femur and humerus length, and several structural malformations involving Dandy-Walker malformation and congenital heart defects. The prenatal phenotype of case 2 comprised sonographic markers, including enlarged NT, thickened nuchal fold, ascites, and polyhydramnios, without apparent structural malformations. The conventional G-band karyotype appeared normal in case 1, while it showed normal chromosomes with a small supernumerary marker chromosome (sSMC) in case 2. The genetic etiology remained unknown until a single-nucleotide polymorphism-based array (SNP-array) was performed; segmental paternal UPD 22 was identified in case 1 and segmental paternal UPD 14 in case 2. The parents of case 1 chose termination of pregnancy. The neonate of case 2 was born prematurely with a bell-shaped small thorax and died within a day. UPD cases are rare, and their phenotypes differ depending on the parental origin and the chromosomal segment affected. If a fetus shows multiple anomalies that cannot be attributed to a common aneuploidy or a genetic syndrome, or manifests some features possibly related to a UPD syndrome, such as detection of sSMC, SNP-array should be considered.
Männik, Katrin; Parkel, Sven; Palta, Priit; Zilina, Olga; Puusepp, Helen; Esko, Tõnu; Mägi, Reedik; Nõukas, Margit; Veidenberg, Andres; Nelis, Mari; Metspalu, Andres; Remm, Maido; Ounap, Katrin; Kurg, Ants
2011-01-01
The increasing use of whole-genome array screening has revealed the important role of DNA copy-number variations in the pathogenesis of neurodevelopmental disorders and several recurrent genomic disorders have been defined during recent years. However, some variants considered to be pathogenic have also been observed in phenotypically normal individuals. This underlines the importance of further characterization of genomic variants with potentially variable expressivity in both patient and general population cohorts to clarify their phenotypic consequence. In this study whole-genome SNP arrays were used to investigate genomic rearrangements in 77 Estonian families with idiopathic mental retardation. In addition to this family-based approach, phenotype and genotype data from a cohort of 1000 individuals in the general population were used for accurate interpretation of aberrations found in mental retardation patients. Relevant structural aberrations were detected in 18 of the families analyzed (23%). Fifteen of those were in genomic regions where clinical significance has previously been established. In 3 families, 4 novel aberrations associated with intellectual disability were detected in chromosome regions 2p25.1-p24.3, 3p12.1-p11.2, 7p21.2-p21.1 and Xq28. Carriers of imbalances in 15q13.3, 16p11.2 and Xp22.31 were identified among reference individuals, affirming the variable phenotypic consequence of rare variants in some genomic regions considered as pathogenic. Copyright © 2010 Elsevier Masson SAS. All rights reserved.
The effect of algorithms on copy number variant detection.
Tsuang, Debby W; Millard, Steven P; Ely, Benjamin; Chi, Peter; Wang, Kenneth; Raskind, Wendy H; Kim, Sulgi; Brkanac, Zoran; Yu, Chang-En
2010-12-30
The detection of copy number variants (CNVs) and the results of CNV-disease association studies rely on how CNVs are defined, and because array-based technologies can only infer CNVs, CNV-calling algorithms can produce vastly different findings. Several authors have noted the large-scale variability between CNV-detection methods, as well as the substantial false positive and false negative rates associated with those methods. In this study, we use variations of four common algorithms for CNV detection (PennCNV, QuantiSNP, HMMSeg, and cnvPartition) and two definitions of overlap (any overlap and an overlap of at least 40% of the smaller CNV) to illustrate the effects of varying algorithms and definitions of overlap on CNV discovery. We used a 56 K Illumina genotyping array enriched for CNV regions to generate hybridization intensities and allele frequencies for 48 Caucasian schizophrenia cases and 48 age-, ethnicity-, and gender-matched control subjects. No algorithm found a difference in CNV burden between the two groups. However, the total number of CNVs called ranged from 102 to 3,765 across algorithms. The mean CNV size ranged from 46 kb to 787 kb, and the average number of CNVs per subject ranged from 1 to 39. The number of novel CNVs not previously reported in normal subjects ranged from 0 to 212. Motivated by the availability of multiple publicly available genome-wide SNP arrays, investigators are conducting numerous analyses to identify putative additional CNVs in complex genetic disorders. However, the number of CNVs identified in array-based studies, and whether these CNVs are novel or valid, will depend on the algorithm(s) used. Thus, given the variety of methods used, there will be many false positives and false negatives. Both guidelines for the identification of CNVs inferred from high-density arrays and the establishment of a gold standard for validation of CNVs are needed.
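The two overlap definitions named in this abstract (any overlap, and overlap of at least 40% of the smaller CNV) can be sketched as below. This is an illustration only: it assumes both CNVs lie on the same chromosome and are given as half-open `(start, end)` base-pair intervals, and the function names are hypothetical rather than taken from PennCNV, QuantiSNP, HMMSeg, or cnvPartition.

```python
from typing import Tuple

# (start, end) in base pairs, half-open; same chromosome assumed.
Interval = Tuple[int, int]


def overlap_length(a: Interval, b: Interval) -> int:
    """Number of base pairs shared by two intervals (0 if disjoint)."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))


def any_overlap(a: Interval, b: Interval) -> bool:
    """Definition 1: the two CNVs share at least one base pair."""
    return overlap_length(a, b) > 0


def overlaps_fraction_of_smaller(a: Interval, b: Interval,
                                 fraction: float = 0.40) -> bool:
    """Definition 2: shared length is >= `fraction` of the smaller CNV."""
    smaller = min(a[1] - a[0], b[1] - b[0])
    if smaller <= 0:
        raise ValueError("intervals must have positive length")
    return overlap_length(a, b) / smaller >= fraction


cnv_a = (100_000, 200_000)   # 100 kb call
cnv_b = (170_000, 400_000)   # shares 30 kb with cnv_a (30% of the smaller)
cnv_c = (150_000, 400_000)   # shares 50 kb with cnv_a (50% of the smaller)

print(any_overlap(cnv_a, cnv_b), overlaps_fraction_of_smaller(cnv_a, cnv_b))
print(any_overlap(cnv_a, cnv_c), overlaps_fraction_of_smaller(cnv_a, cnv_c))
```

The pair `cnv_a`/`cnv_b` counts as the same CNV under the first definition but not under the second, which is one way the choice of definition can change how many calls are treated as concordant or novel.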
Darabi, Hatef; Beesley, Jonathan; Droit, Arnaud; Kar, Siddhartha; Nord, Silje; Moradi Marjaneh, Mahdi; Soucy, Penny; Michailidou, Kyriaki; Ghoussaini, Maya; Fues Wahl, Hanna; Bolla, Manjeet K.; Wang, Qin; Dennis, Joe; Alonso, M. Rosario; Andrulis, Irene L.; Anton-Culver, Hoda; Arndt, Volker; Beckmann, Matthias W.; Benitez, Javier; Bogdanova, Natalia V.; Bojesen, Stig E.; Brauch, Hiltrud; Brenner, Hermann; Broeks, Annegien; Brüning, Thomas; Burwinkel, Barbara; Chang-Claude, Jenny; Choi, Ji-Yeob; Conroy, Don M.; Couch, Fergus J.; Cox, Angela; Cross, Simon S.; Czene, Kamila; Devilee, Peter; Dörk, Thilo; Easton, Douglas F.; Fasching, Peter A.; Figueroa, Jonine; Fletcher, Olivia; Flyger, Henrik; Galle, Eva; García-Closas, Montserrat; Giles, Graham G.; Goldberg, Mark S.; González-Neira, Anna; Guénel, Pascal; Haiman, Christopher A.; Hallberg, Emily; Hamann, Ute; Hartman, Mikael; Hollestelle, Antoinette; Hopper, John L.; Ito, Hidemi; Jakubowska, Anna; Johnson, Nichola; Kang, Daehee; Khan, Sofia; Kosma, Veli-Matti; Kriege, Mieke; Kristensen, Vessela; Lambrechts, Diether; Le Marchand, Loic; Lee, Soo Chin; Lindblom, Annika; Lophatananon, Artitaya; Lubinski, Jan; Mannermaa, Arto; Manoukian, Siranoush; Margolin, Sara; Matsuo, Keitaro; Mayes, Rebecca; McKay, James; Meindl, Alfons; Milne, Roger L.; Muir, Kenneth; Neuhausen, Susan L.; Nevanlinna, Heli; Olswold, Curtis; Orr, Nick; Peterlongo, Paolo; Pita, Guillermo; Pylkäs, Katri; Rudolph, Anja; Sangrajrang, Suleeporn; Sawyer, Elinor J.; Schmidt, Marjanka K.; Schmutzler, Rita K.; Seynaeve, Caroline; Shah, Mitul; Shen, Chen-Yang; Shu, Xiao-Ou; Southey, Melissa C.; Stram, Daniel O.; Surowy, Harald; Swerdlow, Anthony; Teo, Soo H.; Tessier, Daniel C.; Tomlinson, Ian; Torres, Diana; Truong, Thérèse; Vachon, Celine M.; Vincent, Daniel; Winqvist, Robert; Wu, Anna H.; Wu, Pei-Ei; Yip, Cheng Har; Zheng, Wei; Pharoah, Paul D. P.; Hall, Per; Edwards, Stacey L.; Simard, Jacques; French, Juliet D.; Chenevix-Trench, Georgia; Dunning, Alison M.
2016-01-01
Genome-wide association studies have found SNPs at 17q22 to be associated with breast cancer risk. To identify potential causal variants related to breast cancer risk, we performed a high resolution fine-mapping analysis that involved genotyping 517 SNPs using a custom Illumina iSelect array (iCOGS) followed by imputation of genotypes for 3,134 SNPs in more than 89,000 participants of European ancestry from the Breast Cancer Association Consortium (BCAC). We identified 28 highly correlated common variants, in a 53 Kb region spanning two introns of the STXBP4 gene, that are strong candidates for driving breast cancer risk (lead SNP rs2787486 (OR = 0.92; CI 0.90–0.94; P = 8.96 × 10−15)) and are correlated with two previously reported risk-associated variants at this locus, SNPs rs6504950 (OR = 0.94, P = 2.04 × 10−09, r2 = 0.73 with lead SNP) and rs1156287 (OR = 0.93, P = 3.41 × 10−11, r2 = 0.83 with lead SNP). Analyses indicate only one causal SNP in the region and several enhancer elements targeting STXBP4 are located within the 53 kb association signal. Expression studies in breast tumor tissues found SNP rs2787486 to be associated with increased STXBP4 expression, suggesting this may be a target gene of this locus. PMID:27600471
Wu, Xiaoping; Guldbrandtsen, Bernt; Lund, Mogens Sandø; Sahana, Goutam
2016-09-01
Identification of genetic variants associated with feet and legs disorders (FLD) will aid in the genetic improvement of these traits by providing knowledge on genes that influence trait variations. In Denmark, FLD in cattle has been recorded since the 1990s. In this report, we used deregressed breeding values as response variables for a genome-wide association study. Bulls (5,334 Danish Holstein, 4,237 Nordic Red Dairy Cattle, and 1,180 Danish Jersey) with deregressed estimated breeding values were genotyped with the Illumina Bovine 54k single nucleotide polymorphism (SNP) genotyping array. Genotypes were imputed to whole-genome sequence variants, and then 22,751,039 SNP on 29 autosomes were used for an association analysis. A modified linear mixed-model approach (efficient mixed-model association eXpedited, EMMAX) and a linear mixed model were used for association analysis. We identified 5 (3,854 SNP), 3 (13,642 SNP), and 0 quantitative trait locus (QTL) regions associated with the FLD index in Danish Holstein, Nordic Red Dairy Cattle, and Danish Jersey populations, respectively. We did not identify any QTL that were common among the 3 breeds. In a meta-analysis of the 3 breeds, 4 QTL regions were significant, but no additional QTL region was identified compared with within-breed analyses. Comparison between top SNP locations within these QTL regions and known genes suggested that RASGRP1, LCORL, MOS, and MITF may be candidate genes for FLD in dairy cattle. Copyright © 2016 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
Darabi, Hatef; Beesley, Jonathan; Droit, Arnaud; Kar, Siddhartha; Nord, Silje; Moradi Marjaneh, Mahdi; Soucy, Penny; Michailidou, Kyriaki; Ghoussaini, Maya; Fues Wahl, Hanna; Bolla, Manjeet K; Wang, Qin; Dennis, Joe; Alonso, M Rosario; Andrulis, Irene L; Anton-Culver, Hoda; Arndt, Volker; Beckmann, Matthias W; Benitez, Javier; Bogdanova, Natalia V; Bojesen, Stig E; Brauch, Hiltrud; Brenner, Hermann; Broeks, Annegien; Brüning, Thomas; Burwinkel, Barbara; Chang-Claude, Jenny; Choi, Ji-Yeob; Conroy, Don M; Couch, Fergus J; Cox, Angela; Cross, Simon S; Czene, Kamila; Devilee, Peter; Dörk, Thilo; Easton, Douglas F; Fasching, Peter A; Figueroa, Jonine; Fletcher, Olivia; Flyger, Henrik; Galle, Eva; García-Closas, Montserrat; Giles, Graham G; Goldberg, Mark S; González-Neira, Anna; Guénel, Pascal; Haiman, Christopher A; Hallberg, Emily; Hamann, Ute; Hartman, Mikael; Hollestelle, Antoinette; Hopper, John L; Ito, Hidemi; Jakubowska, Anna; Johnson, Nichola; Kang, Daehee; Khan, Sofia; Kosma, Veli-Matti; Kriege, Mieke; Kristensen, Vessela; Lambrechts, Diether; Le Marchand, Loic; Lee, Soo Chin; Lindblom, Annika; Lophatananon, Artitaya; Lubinski, Jan; Mannermaa, Arto; Manoukian, Siranoush; Margolin, Sara; Matsuo, Keitaro; Mayes, Rebecca; McKay, James; Meindl, Alfons; Milne, Roger L; Muir, Kenneth; Neuhausen, Susan L; Nevanlinna, Heli; Olswold, Curtis; Orr, Nick; Peterlongo, Paolo; Pita, Guillermo; Pylkäs, Katri; Rudolph, Anja; Sangrajrang, Suleeporn; Sawyer, Elinor J; Schmidt, Marjanka K; Schmutzler, Rita K; Seynaeve, Caroline; Shah, Mitul; Shen, Chen-Yang; Shu, Xiao-Ou; Southey, Melissa C; Stram, Daniel O; Surowy, Harald; Swerdlow, Anthony; Teo, Soo H; Tessier, Daniel C; Tomlinson, Ian; Torres, Diana; Truong, Thérèse; Vachon, Celine M; Vincent, Daniel; Winqvist, Robert; Wu, Anna H; Wu, Pei-Ei; Yip, Cheng Har; Zheng, Wei; Pharoah, Paul D P; Hall, Per; Edwards, Stacey L; Simard, Jacques; French, Juliet D; Chenevix-Trench, Georgia; Dunning, Alison M
2016-09-07
Genome-wide association studies have found SNPs at 17q22 to be associated with breast cancer risk. To identify potential causal variants related to breast cancer risk, we performed a high resolution fine-mapping analysis that involved genotyping 517 SNPs using a custom Illumina iSelect array (iCOGS) followed by imputation of genotypes for 3,134 SNPs in more than 89,000 participants of European ancestry from the Breast Cancer Association Consortium (BCAC). We identified 28 highly correlated common variants, in a 53 Kb region spanning two introns of the STXBP4 gene, that are strong candidates for driving breast cancer risk (lead SNP rs2787486 (OR = 0.92; CI 0.90-0.94; P = 8.96 × 10(-15))) and are correlated with two previously reported risk-associated variants at this locus, SNPs rs6504950 (OR = 0.94, P = 2.04 × 10(-09), r(2) = 0.73 with lead SNP) and rs1156287 (OR = 0.93, P = 3.41 × 10(-11), r(2) = 0.83 with lead SNP). Analyses indicate only one causal SNP in the region and several enhancer elements targeting STXBP4 are located within the 53 kb association signal. Expression studies in breast tumor tissues found SNP rs2787486 to be associated with increased STXBP4 expression, suggesting this may be a target gene of this locus.
Sung, Yun J; Gu, C Charles; Tiwari, Hemant K; Arnett, Donna K; Broeckel, Ulrich; Rao, Dabeeru C
2012-07-01
Genotype imputation provides imputation of untyped single nucleotide polymorphisms (SNPs) that are present on a reference panel such as those from the HapMap Project. It is popular for increasing statistical power and comparing results across studies using different platforms. Imputation for African American populations is challenging because their linkage disequilibrium blocks are shorter and also because no ideal reference panel is available due to admixture. In this paper, we evaluated three imputation strategies for African Americans. The intersection strategy used a combined panel consisting of SNPs polymorphic in both CEU and YRI. The union strategy used a panel consisting of SNPs polymorphic in either CEU or YRI. The merge strategy merged results from two separate imputations, one using CEU and the other using YRI. Because recent investigators are increasingly using the data from the 1000 Genomes (1KG) Project for genotype imputation, we evaluated both 1KG-based imputations and HapMap-based imputations. We used 23,707 SNPs from chromosomes 21 and 22 on Affymetrix SNP Array 6.0 genotyped for 1,075 HyperGEN African Americans. We found that 1KG-based imputations provided a substantially larger number of variants than HapMap-based imputations, about three times as many common variants and eight times as many rare and low-frequency variants. This higher yield is expected because the 1KG panel includes more SNPs. Accuracy rates using 1KG data were slightly lower than those using HapMap data before filtering, but slightly higher after filtering. The union strategy provided the highest imputation yield with next highest accuracy. The intersection strategy provided the lowest imputation yield but the highest accuracy. The merge strategy provided the lowest imputation accuracy. We observed that SNPs polymorphic only in CEU had much lower accuracy, reducing the accuracy of the union strategy. 
Our findings suggest that 1KG-based imputations can facilitate discovery of significant associations for SNPs across the whole MAF spectrum. Because the 1KG Project is still under way, we expect that later versions will provide better imputation performance. © 2012 Wiley Periodicals, Inc.
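The difference between the intersection and union reference panels described in this abstract can be illustrated with plain set operations. The following is a minimal sketch only: the SNP identifiers and minor-allele frequencies are hypothetical, and "polymorphic" is assumed here to mean a non-zero minor-allele frequency in the panel, which the abstract does not spell out.

```python
# Hypothetical minor-allele frequencies per SNP in each reference population.
ceu_maf = {"rs1": 0.20, "rs2": 0.00, "rs3": 0.05, "rs4": 0.00}
yri_maf = {"rs1": 0.10, "rs2": 0.30, "rs3": 0.00, "rs4": 0.00}


def polymorphic(maf_by_snp):
    """Return the set of SNPs with a non-zero minor-allele frequency."""
    return {snp for snp, maf in maf_by_snp.items() if maf > 0.0}


ceu_poly = polymorphic(ceu_maf)
yri_poly = polymorphic(yri_maf)

# Intersection strategy: SNPs polymorphic in both CEU and YRI.
intersection_panel = ceu_poly & yri_poly
# Union strategy: SNPs polymorphic in either CEU or YRI.
union_panel = ceu_poly | yri_poly
# SNPs polymorphic only in CEU, the group the abstract reports as least accurate.
ceu_only = ceu_poly - yri_poly

print(sorted(intersection_panel))
print(sorted(union_panel))
print(sorted(ceu_only))
```

The union panel is always a superset of the intersection panel, which is consistent with the higher yield reported for the union strategy.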
Mahnke, Donna K.; Larson, Joshua M.; Ghanta, Sujana; Feng, Ying; Simpson, Pippa M.; Broeckel, Ulrich; Duffy, Kelly; Tweddell, James S.; Grossman, William J.; Routes, John M.; Mitchell, Michael E.
2010-01-01
22q11.2 Deletion syndrome (22q11.2 DS) [DiGeorge syndrome type 1 (DGS1)] occurs in ∼1:3,000 live births; 75% of children with DGS1 have severe congenital heart disease requiring early intervention. The gold standard for detection of DGS1 is fluorescence in situ hybridization (FISH) with a probe at the TUPLE1 gene. However, FISH is costly and is typically ordered in conjunction with a karyotype analysis that takes several days. Therefore, FISH is underutilized and the diagnosis of 22q11.2 DS is frequently delayed, often resulting in profound clinical consequences. Our goal was to determine whether multiplexed, quantitative real-time PCR (MQPCR) could be used to detect the haploinsufficiency characteristic of 22q11.2 DS. A retrospective blinded study was performed on 382 subjects who had undergone congenital heart surgery. MQPCR was performed with a probe localized to the TBX1 gene on human chromosome 22, a gene typically deleted in 22q11.2 DS. Cycle threshold (Ct) was used to calculate the relative gene copy number (rGCN). Confirmation analysis was performed with the Affymetrix 6.0 Genome-Wide SNP Array. With MQPCR, 361 subjects were identified as nondeleted with an rGCN near 1.0 and 21 subjects were identified as deleted with an rGCN near 0.5, indicative of a hemizygous deletion. The sensitivity (21/21) and specificity (361/361) of MQPCR to detect 22q11.2 deletions was 100% at an rGCN value drawn at 0.7. One of 21 subjects with a prior clinical (not genetically confirmed) DGS1 diagnosis was found not to carry the deletion, while another subject, not previously identified as DGS1, was detected as deleted and subsequently confirmed via microarray. The MQPCR assay is a rapid, inexpensive, sensitive, and specific assay that can be used to screen for 22q11.2 deletion syndrome. The assay is readily adaptable to high throughput. PMID:20551144
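As a rough illustration of how a relative gene copy number could be derived from Ct values and compared with the 0.7 cutoff mentioned above, the sketch below uses the common 2^(-ΔΔCt) relative quantification formula. That formula, the calibrator sample, and all function names are assumptions made for illustration; the abstract does not state how rGCN was actually computed.

```python
def relative_gene_copy_number(ct_target, ct_reference,
                              cal_ct_target, cal_ct_reference):
    """Assumed 2^(-ddCt) estimate of copy number relative to a calibrator.

    ct_target / ct_reference: Ct of the target (e.g. TBX1) and of a
    two-copy reference gene in the test sample.
    cal_ct_*: the same two Ct values in a known non-deleted calibrator.
    """
    delta_sample = ct_target - ct_reference
    delta_calibrator = cal_ct_target - cal_ct_reference
    return 2.0 ** -(delta_sample - delta_calibrator)


def call_deletion(rgcn, cutoff=0.7):
    """Call a hemizygous deletion when rGCN falls below the cutoff."""
    return rgcn < cutoff


def sensitivity_specificity(calls, truth):
    """Return (sensitivity, specificity); both classes must be present."""
    tp = sum(1 for c, t in zip(calls, truth) if c and t)
    fn = sum(1 for c, t in zip(calls, truth) if not c and t)
    tn = sum(1 for c, t in zip(calls, truth) if not c and not t)
    fp = sum(1 for c, t in zip(calls, truth) if c and not t)
    return tp / (tp + fn), tn / (tn + fp)


# One extra cycle for the target relative to the calibrator halves the estimate.
deleted = relative_gene_copy_number(25.0, 24.0, 24.0, 24.0)
normal = relative_gene_copy_number(24.0, 24.0, 24.0, 24.0)
print(deleted, call_deletion(deleted))
print(normal, call_deletion(normal))
```

With these definitions, a sample near 0.5 is called deleted and a sample near 1.0 is not, matching the two clusters described in the abstract.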
Mitochondrial DNA variants in obesity.
Knoll, Nadja; Jarick, Ivonne; Volckmar, Anna-Lena; Klingenspor, Martin; Illig, Thomas; Grallert, Harald; Gieger, Christian; Wichmann, Heinz-Erich; Peters, Annette; Wiegand, Susanna; Biebermann, Heike; Fischer-Posovszky, Pamela; Wabitsch, Martin; Völzke, Henry; Nauck, Matthias; Teumer, Alexander; Rosskopf, Dieter; Rimmbach, Christian; Schreiber, Stefan; Jacobs, Gunnar; Lieb, Wolfgang; Franke, Andre; Hebebrand, Johannes; Hinney, Anke
2014-01-01
Heritability estimates for body mass index (BMI) variation are high. For mothers and their offspring higher BMI correlations have been described than for fathers. Variation(s) in the exclusively maternally inherited mitochondrial DNA (mtDNA) might contribute to this parental effect. Thirty-two to 40 mtDNA single nucleotide polymorphisms (SNPs) were available from genome-wide association study SNP arrays (Affymetrix 6.0). For discovery, we analyzed association in a case-control (CC) sample of 1,158 extremely obese children and adolescents and 435 lean adult controls. For independent confirmation, 7,014 population-based adults were analyzed as CC sample of n = 1,697 obese cases (BMI ≥ 30 kg/m2) and n = 2,373 normal weight and lean controls (BMI<25 kg/m2). SNPs were analyzed as single SNPs and haplogroups determined by HaploGrep. Fisher's two-sided exact test was used for association testing. Moreover, the D-loop was re-sequenced (Sanger) in 192 extremely obese children and adolescents and 192 lean adult controls. Association testing of detected variants was performed using Fisher's two-sided exact test. For discovery, nominal association with obesity was found for the frequent allele G of m.8994G/A (rs28358887, p = 0.002) located in ATP6. Haplogroup W was nominally overrepresented in the controls (p = 0.039). These findings could not be confirmed independently. For two of the 252 identified D-loop variants nominal association was detected (m.16292C/T, p = 0.007, m.16189T/C, p = 0.048). Only eight controls carried the m.16292T allele, five of whom belonged to haplogroup W that was initially enriched among these controls. m.16189T/C might create an uninterrupted poly-C tract located near a regulatory element involved in replication of mtDNA. 
Though follow-up of some D-loop variants still is conceivable, our hypothesis of a contribution of variation in the exclusively maternally inherited mtDNA to the observed larger correlations for BMI between mothers and their offspring could not be substantiated by the findings of the present study.
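For readers unfamiliar with the association test named in this abstract, a two-sided Fisher's exact test on a 2×2 allele-by-status table can be sketched with the standard library alone. The counts in the usage lines are hypothetical and are not taken from the study; the two-sided rule used here (summing all tables no more probable than the observed one) is the usual convention and is assumed, not confirmed, to match the authors' software.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Rows could be cases/controls and columns the two alleles of an mtDNA SNP.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum every table that is at most as probable as the observed one.
    total = sum(prob(x) for x in range(lowest, highest + 1)
                if prob(x) <= p_observed * (1 + 1e-7))
    return min(1.0, total)


# Hypothetical allele counts: cases (row 1) versus controls (row 2).
print(fisher_exact_two_sided(3, 0, 0, 3))
print(fisher_exact_two_sided(1, 1, 1, 1))
```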
Williams, Robert C; Elston, Robert C; Kumar, Pankaj; Knowler, William C; Abboud, Hanna E; Adler, Sharon; Bowden, Donald W; Divers, Jasmin; Freedman, Barry I; Igo, Robert P; Ipp, Eli; Iyengar, Sudha K; Kimmel, Paul L; Klag, Michael J; Kohn, Orly; Langefeld, Carl D; Leehey, David J; Nelson, Robert G; Nicholas, Susanne B; Pahl, Madeleine V; Parekh, Rulan S; Rotter, Jerome I; Schelling, Jeffrey R; Sedor, John R; Shah, Vallabh O; Smith, Michael W; Taylor, Kent D; Thameem, Farook; Thornley-Brown, Denyse; Winkler, Cheryl A; Guo, Xiuqing; Zager, Phillip; Hanson, Robert L
2016-05-04
The presence of population structure in a sample may confound the search for important genetic loci associated with disease. Our four samples in the Family Investigation of Nephropathy and Diabetes (FIND), European Americans, Mexican Americans, African Americans, and American Indians, are part of a genome-wide association study in which population structure might be particularly important. We therefore decided to study in detail one component of this, individual genetic ancestry (IGA). From SNPs present on the Affymetrix 6.0 Human SNP array, we identified 3 sets of ancestry informative markers (AIMs), each maximized for the information in one of the three contrasts among ancestral populations: Europeans (HAPMAP, CEU), Africans (HAPMAP, YRI and LWK), and Native Americans (full heritage Pima Indians). We estimate IGA and present an algorithm for their standard errors, compare IGA to principal components, emphasize the importance of balancing information in the ancestry informative markers (AIMs), and test the association of IGA with diabetic nephropathy in the combined sample. A fixed parental allele maximum likelihood algorithm was applied to the FIND to estimate IGA in four samples: 869 American Indians; 1385 African Americans; 1451 Mexican Americans; and 826 European Americans. When the information in the AIMs is unbalanced, the estimates are incorrect with large error. Individual genetic admixture is highly correlated with principal components for capturing population structure. It takes ~700 SNPs to reduce the average standard error of individual admixture below 0.01. When the samples are combined, the resulting population structure creates associations between IGA and diabetic nephropathy. The identified set of AIMs, which includes American Indian parental allele frequencies, may be particularly useful for estimating genetic admixture in populations from the Americas. 
Failure to balance information in maximum likelihood, poly-ancestry models creates biased estimates of individual admixture with large error. This also occurs when estimating IGA using the Bayesian clustering method as implemented in the program STRUCTURE. Odds ratios for the associations of IGA with disease are consistent with what is known about the incidence and prevalence of diabetic nephropathy in these populations.
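The idea of estimating individual ancestry by maximum likelihood with fixed parental allele frequencies can be shown on a deliberately reduced case. The sketch below assumes only two ancestral populations (the study contrasts three), uses a simple grid search rather than the authors' algorithm, and relies on hypothetical allele frequencies and genotypes; it omits the standard-error calculation entirely.

```python
from math import log


def log_likelihood(m, genotypes, p1, p2):
    """Log-likelihood of admixture proportion m from population 1.

    genotypes: copies (0, 1 or 2) of the counted allele at each marker.
    p1, p2: fixed parental frequencies of that allele, strictly in (0, 1).
    """
    total = 0.0
    for g, f1, f2 in zip(genotypes, p1, p2):
        # Expected allele frequency in an individual with ancestry m.
        q = m * f1 + (1.0 - m) * f2
        total += g * log(q) + (2 - g) * log(1.0 - q)
    return total


def estimate_admixture(genotypes, p1, p2, steps=1000):
    """Grid-search maximum likelihood estimate of m on [0, 1]."""
    grid = [i / steps for i in range(steps + 1)]
    return max(grid, key=lambda m: log_likelihood(m, genotypes, p1, p2))


# Ten hypothetical, highly informative markers (frequency 0.9 versus 0.1).
p1 = [0.9] * 10
p2 = [0.1] * 10
print(estimate_admixture([1] * 10, p1, p2))  # heterozygous everywhere
print(estimate_admixture([2] * 10, p1, p2))  # homozygous for the pop-1 allele
```

Markers whose parental frequencies are close together contribute little curvature to this likelihood, which is one way to see why the abstract stresses the information content of the AIMs.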
Wang-Renault, Shu-Fang; Letouzé, Eric; Imbeaud, Sandrine; Zucman-Rossi, Jessica; Deleuze, Jean-François; How-Kit, Alexandre
2017-01-01
Motivation Copy number variations (CNV) include net gains or losses of part or whole chromosomal regions. They differ from copy neutral loss of heterozygosity (cn-LOH) events which do not induce any net change in the copy number and are often associated with uniparental disomy. These phenomena have long been reported to be associated with diseases, particularly cancer. Losses/gains of genomic regions are often correlated with lower/higher gene expression. On the other hand, loss of heterozygosity (LOH) and cn-LOH are common events in cancer and may be associated with the loss of a functional tumor suppressor gene. Therefore, identifying recurrent CNV and cn-LOH events can be important as they may highlight common biological components and give insights into the development or mechanisms of a disease. However, no currently available tools allow a comprehensive whole-genome visualization of recurrent CNVs and cn-LOH in groups of samples providing absolute quantification of the aberrations, leading to the loss of potentially important information. Results To overcome these limitations, we developed aCNViewer (Absolute CNV Viewer), a visualization tool for absolute CNVs and cn-LOH across a group of samples. aCNViewer proposes three graphical representations: dendrograms, bi-dimensional heatmaps showing chromosomal regions sharing similar abnormality patterns, and quantitative stacked histograms facilitating the identification of recurrent absolute CNVs and cn-LOH. We illustrated aCNViewer using publicly available hepatocellular carcinomas (HCCs) Affymetrix SNP Array data (Fig 1A). Regions 1q and 8q present a similar percentage of total gains but significantly different copy number gain categories (p-value of 0.0103 with a Fisher exact test), validated by another cohort of HCCs (p-value of 5.6e-7) (Fig 2B). 
Availability and implementation aCNViewer is implemented in python and R and is available with a GNU GPLv3 license on GitHub https://github.com/FJD-CEPH/aCNViewer and Docker https://hub.docker.com/r/fjdceph/acnviewer/. Contact aCNViewer@cephb.fr PMID:29261730
The genome-wide structure of two economically important indigenous Sicilian cattle breeds.
Mastrangelo, S; Saura, M; Tolone, M; Salces-Ortiz, J; Di Gerlando, R; Bertolini, F; Fontanesi, L; Sardina, M T; Serrano, M; Portolano, B
2014-11-01
Genomic technologies, such as high-throughput genotyping based on SNP arrays, provided background information concerning genome structure in domestic animals. The aim of this work was to investigate the genetic structure, the genome-wide estimates of inbreeding, coancestry, effective population size (Ne), and the patterns of linkage disequilibrium (LD) in 2 economically important Sicilian local cattle breeds, Cinisara (CIN) and Modicana (MOD), using the Illumina Bovine SNP50K v2 BeadChip. To understand the genetic relationship and to place both Sicilian breeds in a global context, genotypes from 134 other domesticated bovid breeds were used. Principal component analysis showed that the Sicilian cattle breeds were closer to individuals of Bos taurus taurus from Eurasia and formed nonoverlapping clusters with other breeds. Between the Sicilian cattle breeds, MOD was the most differentiated, whereas the animals belonging to the CIN breed showed a lower value of assignment, the presence of substructure, and genetic links with the MOD breed. The average molecular inbreeding and coancestry coefficients were moderately high, and the current estimates of Ne were low in both breeds. These values indicated a low genetic variability. Considering levels of LD between adjacent markers, the average r² in the MOD breed was comparable to those reported for other cattle breeds, whereas CIN showed a lower value. Therefore, these results support the need for denser SNP arrays for high-power association mapping and genomic selection efficiency, particularly for the CIN cattle breed. Controlling molecular inbreeding and coancestry would restrict inbreeding depression, the probability of losing beneficial rare alleles, and therefore the risk of extinction. The results generated from this study have important implications for the development of conservation and/or selection breeding programs in these 2 local cattle breeds.
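The LD statistic referred to in this abstract, the squared correlation between alleles at two markers, has a standard definition that is easy to sketch. The example below assumes phased haplotypes coded 0/1, which is a simplification: SNP arrays yield unphased genotypes, and the abstract does not say how r² was estimated. The haplotype data are hypothetical.

```python
def r_squared(hap_a, hap_b):
    """Squared allelic correlation between two biallelic markers.

    hap_a, hap_b: equal-length lists of 0/1 alleles, one entry per
    phased haplotype, for marker A and the adjacent marker B.
    """
    n = len(hap_a)
    p_a = sum(hap_a) / n
    p_b = sum(hap_b) / n
    # Frequency of haplotypes carrying allele 1 at both markers.
    p_ab = sum(a * b for a, b in zip(hap_a, hap_b)) / n
    d = p_ab - p_a * p_b  # linkage disequilibrium coefficient D
    denominator = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denominator == 0:
        raise ValueError("both markers must be polymorphic")
    return d * d / denominator


# Hypothetical haplotypes: complete LD, then no LD.
print(r_squared([1, 1, 0, 0], [1, 1, 0, 0]))
print(r_squared([1, 1, 0, 0], [1, 0, 1, 0]))
```

A low average r² between adjacent markers means each array SNP tags fewer neighbouring variants, which is the reasoning behind the call for denser arrays.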
Zhang, Zhongyang; Hao, Ke
2015-11-01
Cancer genomes exhibit profound somatic copy number alterations (SCNAs). Studying tumor SCNAs using massively parallel sequencing provides unprecedented resolution and meanwhile gives rise to new challenges in data analysis, complicated by tumor aneuploidy and heterogeneity as well as normal cell contamination. While the majority of read depth based methods utilize total sequencing depth alone for SCNA inference, the allele specific signals are undervalued. We proposed a joint segmentation and inference approach using both signals to meet some of the challenges. Our method consists of four major steps: 1) extracting read depth supporting reference and alternative alleles at each SNP/Indel locus and comparing the total read depth and alternative allele proportion between tumor and matched normal sample; 2) performing joint segmentation on the two signal dimensions; 3) correcting the copy number baseline from which the SCNA state is determined; 4) calling SCNA state for each segment based on both signal dimensions. The method is applicable to whole exome/genome sequencing (WES/WGS) as well as SNP array data in a tumor-control study. We applied the method to a dataset containing no SCNAs to test the specificity, created by pairing sequencing replicates of a single HapMap sample as normal/tumor pairs, as well as a large-scale WGS dataset consisting of 88 liver tumors along with adjacent normal tissues. Compared with representative methods, our method demonstrated improved accuracy, scalability to large cancer studies, capability in handling both sequencing and SNP array data, and the potential to improve the estimation of tumor ploidy and purity.
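Step 1 of the method, comparing total read depth and alternative allele proportion between tumor and matched normal at each locus, can be sketched as two per-locus signals. This is an illustrative reduction under stated assumptions: the log2 depth ratio and the difference in alternative allele proportion are common choices but are not confirmed to be the exact statistics used by the authors, and no library-size normalization is applied. The function name and read counts are hypothetical.

```python
from math import log2


def locus_signals(tumor_ref, tumor_alt, normal_ref, normal_alt):
    """Return (log2 depth ratio, shift in alternative allele proportion).

    Each argument is the number of reads supporting the reference or
    alternative allele at one heterozygous SNP/indel locus.
    """
    tumor_depth = tumor_ref + tumor_alt
    normal_depth = normal_ref + normal_alt
    # Signal 1: total read depth, tumor relative to matched normal.
    depth_log_ratio = log2(tumor_depth / normal_depth)
    # Signal 2: alternative allele proportion, tumor minus normal.
    allele_shift = tumor_alt / tumor_depth - normal_alt / normal_depth
    return depth_log_ratio, allele_shift


# Hypothetical loci: a balanced one, and one consistent with a
# gain of the reference-allele haplotype.
print(locus_signals(30, 30, 30, 30))
print(locus_signals(60, 20, 20, 20))
```

Keeping the two signals as a pair per locus is what allows a segmentation step to operate jointly on both dimensions, as step 2 describes.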
Zhang, Zhongyang; Hao, Ke
2015-01-01
Cancer genomes exhibit profound somatic copy number alterations (SCNAs). Studying tumor SCNAs using massively parallel sequencing provides unprecedented resolution and meanwhile gives rise to new challenges in data analysis, complicated by tumor aneuploidy and heterogeneity as well as normal cell contamination. While the majority of read depth based methods utilize total sequencing depth alone for SCNA inference, the allele specific signals are undervalued. We proposed a joint segmentation and inference approach using both signals to meet some of the challenges. Our method consists of four major steps: 1) extracting read depth supporting reference and alternative alleles at each SNP/Indel locus and comparing the total read depth and alternative allele proportion between tumor and matched normal sample; 2) performing joint segmentation on the two signal dimensions; 3) correcting the copy number baseline from which the SCNA state is determined; 4) calling SCNA state for each segment based on both signal dimensions. The method is applicable to whole exome/genome sequencing (WES/WGS) as well as SNP array data in a tumor-control study. We applied the method to a dataset containing no SCNAs to test the specificity, created by pairing sequencing replicates of a single HapMap sample as normal/tumor pairs, as well as a large-scale WGS dataset consisting of 88 liver tumors along with adjacent normal tissues. Compared with representative methods, our method demonstrated improved accuracy, scalability to large cancer studies, capability in handling both sequencing and SNP array data, and the potential to improve the estimation of tumor ploidy and purity. PMID:26583378
Zhu, Bo; Niu, Hong; Zhang, Wengang; Wang, Zezhao; Liang, Yonghu; Guan, Long; Guo, Peng; Chen, Yan; Zhang, Lupei; Guo, Yong; Ni, Heming; Gao, Xue; Gao, Huijiang; Xu, Lingyang; Li, Junya
2017-06-14
Fatty acid composition of muscle is an important trait contributing to meat quality. Recently, genome-wide association studies (GWAS) have been extensively used to explore the molecular mechanism underlying important traits in cattle. In this study, we performed GWAS using a high-density SNP array to analyze the association between SNPs and fatty acids and evaluated the accuracy of genomic prediction for fatty acids in Chinese Simmental cattle. Using the BayesB method, we identified 35 and 7 regions in Chinese Simmental cattle that displayed significant associations with individual fatty acids and fatty acid groups, respectively. We further obtained several candidate genes which may be involved in fatty acid biosynthesis including elongation of very long chain fatty acids protein 5 (ELOVL5), fatty acid synthase (FASN), caspase 2 (CASP2) and thyroglobulin (TG). Specifically, we obtained strong evidence of association signals for one SNP located at 51.3 Mb for FASN using Genome-wide Rapid Association Mixed Model and Regression-Genomic Control (GRAMMAR-GC) approaches. Also, a region-based association test identified multiple SNPs within FASN and ELOVL5 for C14:0. In addition, our results revealed that the effectiveness of genomic prediction for fatty acid composition using BayesB was slightly superior to GBLUP in Chinese Simmental cattle. We identified several significantly associated regions and loci which can be considered as potential candidate markers for genomics-assisted breeding programs. Using multiple methods, our results revealed that FASN and ELOVL5 are associated with fatty acids with strong evidence. Our findings also suggested that it is feasible to perform genomic selection for fatty acids in Chinese Simmental cattle.
Duployez, Nicolas; Boudry-Labis, Elise; Decool, Gauthier; Grzych, Guillaume; Grardel, Nathalie; Abou Chahla, Wadih; Preudhomme, Claude; Roche-Lestienne, Catherine
2015-01-01
Key Clinical Message Intrachromosomal amplification of chromosome 21 (iAMP21) defines a distinct cytogenetic subgroup of B-cell precursor acute lymphoblastic leukemia (BCP-ALL) with poor prognosis that should be investigated in routine practice. Single-nucleotide polymorphism (SNP)-array provides a useful method to detect such cases showing a highly characteristic profile. PMID:26509013
USDA-ARS's Scientific Manuscript database
Bacterial cold water disease (BCWD) causes significant mortality and economic losses in salmonid aquaculture. In previous studies, we identified moderate-large effect QTL for BCWD resistance in rainbow trout (Oncorhynchus mykiss). However, the recent availability of a 57K SNP array and a genome phys...
Davey, Mark W; Graham, Neil S; Vanholme, Bartel; Swennen, Rony; May, Sean T; Keulemans, Johan
2009-01-01
Background 'Systems-wide' approaches such as microarray RNA-profiling are ideally suited to the study of the complex overlapping responses of plants to biotic and abiotic stresses. However, commercial microarrays are only available for a limited number of plant species and development costs are so substantial as to be prohibitive for most research groups. Here we evaluate the use of cross-hybridisation to Affymetrix oligonucleotide GeneChip® microarrays to profile the response of the banana (Musa spp.) leaf transcriptome to drought stress using a genomic DNA (gDNA)-based probe-selection strategy to improve the efficiency of detection of differentially expressed Musa transcripts. Results Following cross-hybridisation of Musa gDNA to the Rice GeneChip® Genome Array, ~33,700 gene-specific probe-sets had a sufficiently high degree of homology to be retained for transcriptomic analyses. In a proof-of-concept approach, pooled RNA representing a single biological replicate of control and drought stressed leaves of the Musa cultivar 'Cachaco' were hybridised to the Affymetrix Rice Genome Array. A total of 2,910 Musa gene homologues with a >2-fold difference in expression levels were subsequently identified. These drought-responsive transcripts included many functional classes associated with plant biotic and abiotic stress responses, as well as a range of regulatory genes known to be involved in coordinating abiotic stress responses. This latter group included members of the ERF, DREB, MYB, bZIP and bHLH transcription factor families. Fifty-two of these drought-sensitive Musa transcripts were homologous to genes underlying QTLs for drought and cold tolerance in rice, including in 2 instances QTLs associated with a single underlying gene. The list of drought-responsive transcripts also included genes identified in publicly-available comparative transcriptomics experiments. 
Conclusion Our results demonstrate that despite the general paucity of nucleotide sequence data in Musa and only distant phylogenetic relations to rice, gDNA probe-based cross-hybridisation to the Rice GeneChip® is a highly promising strategy to study complex biological responses and illustrates the potential of such strategies for gene discovery in non-model species. PMID:19758430
Gene expression and pathway analysis of human hepatocellular carcinoma cells treated with cadmium
DOE Office of Scientific and Technical Information (OSTI.GOV)
Cartularo, Laura; Laulicht, Freda; Sun, Hong
Cadmium (Cd) is a toxic and carcinogenic metal naturally occurring in the Earth's crust. A common route of human exposure is via diet and cadmium accumulates in the liver. The effects of Cd exposure on gene expression in human hepatocellular carcinoma (HepG2) cells were examined in this study. HepG2 cells were acutely-treated with 0.1, 0.5, or 1.0 μM Cd for 24 h; or chronically-treated with 0.01, 0.05, or 0.1 μM Cd for three weeks and gene expression analysis was performed using Affymetrix GeneChip® Human Gene 1.0 ST Arrays. Acute and chronic exposures significantly altered the expression of 333 and 181 genes, respectively. The genes most upregulated by acute exposure included several metallothioneins. Downregulated genes included the monooxygenase CYP3A7, involved in drug and lipid metabolism. In contrast, CYP3A7 was upregulated by chronic Cd exposure, as was DNAJB9, an anti-apoptotic J protein. Genes downregulated following chronic exposure included the transcriptional regulator early growth response protein 1. Ingenuity Pathway Analysis revealed that the top networks altered by acute exposure were lipid metabolism, small molecule biosynthesis, cell morphology, organization, and development; while top networks altered by chronic exposure were organ morphology, cell cycle, cell signaling, and renal and urological diseases/cancer. Many of the dysregulated genes play important roles in cellular growth, proliferation, and apoptosis, and may be involved in carcinogenesis. In addition to gene expression changes, HepG2 cells treated with cadmium for 24 h indicated a reduction in global levels of histone methylation and acetylation that persisted 72 h post-treatment.
Highlights: • A common route of human exposure to the carcinogenic metal cadmium is via diet. • HepG2 cells were treated acutely or chronically with varying doses of cadmium. • Gene expression analysis was performed using Affymetrix Human Gene 1.0 Arrays. • Acute and chronic exposures altered the expression of 333 and 181 genes, respectively. • Acute cadmium exposure altered global levels of histone methylation and acetylation.
Al-Absi, Boshra; Razif, Muhammad F M; Noor, Suzita M; Saif-Ali, Riyadh; Aqlan, Mohammed; Salem, Sameer D; Ahmed, Radwan H; Muniandy, Sekaran
2017-10-01
Genome-wide and candidate gene association studies have previously revealed links between a predisposition to acute lymphoblastic leukemia (ALL) and genetic polymorphisms in the following genes: IKZF1 (7p12.2; ID: 10320), DDC (7p12.2; ID: 1644), CDKN2A (9p21.3; ID: 1029), CEBPE (14q11.2; ID: 1053), and LMO1 (11p15; ID: 4004). In this study, we aimed to investigate the possible association between polymorphisms in these genes and ALL within a sample of Yemeni children of Arab-Asian descent. Seven single-nucleotide polymorphisms (SNPs) in IKZF1, three SNPs in DDC, two SNPs in CDKN2A, two SNPs in CEBPE, and three SNPs in LMO1 were genotyped in 289 Yemeni children (136 cases and 153 controls), using the nanofluidic Dynamic Array (Fluidigm 192.24 Dynamic Array). Logistic regression analyses were used to estimate ALL risk, and the strength of association was expressed as odds ratios with 95% confidence intervals. We found that the IKZF1 SNP rs10235796 C allele (p = 0.002), the IKZF1 rs6964969 A>G polymorphism (p = 0.048, GG vs. AA), the CDKN2A rs3731246 G>C polymorphism (p = 0.047, GC+CC vs. GG), and the CDKN2A SNP rs3731246 C allele (p = 0.007) were significantly associated with ALL in Yemenis of Arab-Asian descent. In addition, a borderline association was found between the IKZF1 rs4132601 T>G variant and ALL risk. No associations with ALL were found for the IKZF1 SNPs (rs11978267; rs7789635), DDC SNPs (rs3779084; rs880028; rs7809758), CDKN2A SNP (rs3731217), CEBPE SNPs (rs2239633; rs12434881) or LMO1 SNPs (rs442264; rs3794012; rs4237770) in Yemeni children. The IKZF1 SNPs, rs10235796 and rs6964969, and the CDKN2A SNP rs3731246 (previously unreported) could serve as risk markers for ALL susceptibility in Yemeni children.
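The odds ratios with 95% confidence intervals reported in studies of this kind can be illustrated from a single 2×2 table. For one binary predictor with no covariates, the odds ratio from logistic regression equals the cross-product ratio computed below; the Wald-type interval on the log scale is a standard approximation and is assumed here, since the abstract does not describe the interval method or any covariate adjustment. All counts are hypothetical.

```python
from math import exp, log, sqrt


def odds_ratio_ci(case_carrier, case_noncarrier,
                  control_carrier, control_noncarrier, z=1.96):
    """Unadjusted odds ratio with a Wald 95% confidence interval.

    All four cell counts must be greater than zero.
    """
    odds_ratio = ((case_carrier * control_noncarrier)
                  / (case_noncarrier * control_carrier))
    # Standard error of log(OR) from the four cell counts.
    se_log_or = sqrt(1 / case_carrier + 1 / case_noncarrier
                     + 1 / control_carrier + 1 / control_noncarrier)
    lower = exp(log(odds_ratio) - z * se_log_or)
    upper = exp(log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical risk-allele carrier counts in cases and controls.
estimate, low, high = odds_ratio_ci(20, 10, 10, 20)
print(round(estimate, 2), round(low, 2), round(high, 2))
```

An interval that excludes 1 corresponds to a nominally significant association at the 5% level under this approximation.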
Hadfield, K D; Smith, M J; Urquhart, J E; Wallace, A J; Bowers, N L; King, A T; Rutherford, S A; Trump, D; Newman, W G; Evans, D G
2010-11-25
Biallelic inactivation of the NF2 gene occurs in the majority of schwannomas. This usually involves a combination of a point mutation or multiexon deletion, in conjunction with either a second point mutation or loss of heterozygosity (LOH). We have performed DNA sequence and dosage analysis of the NF2 gene in a panel of 239 schwannoma tumours: 97 neurofibromatosis type 2 (NF2)-related schwannomas, 104 sporadic vestibular schwannomas (VS) and 38 schwannomatosis-related schwannomas. In total, we identified germline NF2 mutations in 86 out of 97 (89%) NF2 patients and a second mutational event in 77 out of 97 (79%). LOH was by far the most common form of second hit. A combination of microsatellite analysis with either conventional comparative genomic hybridization (CGH) or multiplex ligation-dependent probe amplification (MLPA) identified mitotic recombination (MR) as the cause of LOH in 14 out of 72 (19%) total evaluable tumours. Among sporadic VS, at least one NF2 mutation was identified by sequence analysis or MLPA in 65 out of 98 (66%) tumours. LOH occurred in 54 out of 96 (56%) evaluable tumours, but MR only accounted for 5 out of 77 (6%) tested. LOH was present in 28 out of 34 (82%) schwannomatosis-related schwannomas. In all eight patients who had previously tested positive for a germline SMARCB1 mutation, this involved loss of the whole, or part of the long arm, of chromosome 22. In contrast, 5 out of 22 (23%) tumours from patients with no germline SMARCB1 mutation exhibited MR. High-resolution Affymetrix SNP6 genotyping and copy number (CN) analysis (Affymetrix, Santa Clara, CA, USA) were used to determine the chromosomal breakpoint locations in tumours with MR. A range of unique recombination sites, spanning approximately 11.4 Mb, were identified. This study shows that MR is a mechanism of LOH in NF2 and SMARCB1-negative schwannomatosis-related schwannomas, occurring less frequently in sporadic VS. 
We found no evidence of MR in SMARCB1-positive schwannomatosis, suggesting that susceptibility to MR varies according to the disease context.
Wu, Jianhui; Huang, Shuo; Zeng, Qingdong; Liu, Shengjie; Wang, Qilin; Mu, Jingmei; Yu, Shizhou; Han, Dejun; Kang, Zhensheng
2018-06-16
A major stripe rust resistance QTL on chromosome 4BL was localized to a 4.5-Mb interval using comparative QTL mapping methods and validated in 276 wheat genotypes by haplotype analysis. CIMMYT-derived wheat line P10103 was previously identified to have adult plant resistance (APR) to stripe rust in the greenhouse and field. The conventional approach for QTL mapping in common wheat is laborious. Here, we performed QTL detection of APR using a combination of genome-wide scanning and extreme pool-genotyping. SNP-based genetic maps were constructed using the Wheat 55K SNP array to genotype a recombinant inbred line (RIL) population derived from the cross Mingxian 169 × P10103. Five stable QTL were detected across multiple environments. After comparing SNP profiles from contrasting, extreme DNA pools of RILs, six putative QTL were located to approximate chromosome positions. A major QTL on chromosome 4B was identified in F2:4 contrasting pools from cross Zhengmai 9023 × P10103. A consensus QTL (LOD = 26-40, PVE = 42-55%), named QYr.nwafu-4BL, was defined and localized to a 4.5-Mb interval flanked by SNP markers AX-110963704 and AX-110519862 in chromosome arm 4BL. Based on stripe rust response, marker genotypes, pedigree analysis and mapping data, QYr.nwafu-4BL is likely to be a new APR QTL. The applicability of the SNP-based markers flanking QYr.nwafu-4BL was validated on a diversity panel of 276 wheat lines. The additional minor QTL on chromosomes 4A, 5A, 5B and 6A enhanced the level of resistance conferred by QYr.nwafu-4BL. Marker-assisted pyramiding of QYr.nwafu-4BL and other favorable minor QTL in new wheat cultivars should improve the level of APR to stripe rust.
McClure, Matthew C; Bickhart, Derek; Null, Dan; Vanraden, Paul; Xu, Lingyang; Wiggans, George; Liu, George; Schroeder, Steve; Glasscock, Jarret; Armstrong, Jon; Cole, John B; Van Tassell, Curtis P; Sonstegard, Tad S
2014-01-01
The recent discovery of bovine haplotypes with negative effects on fertility in the Brown Swiss, Holstein, and Jersey breeds has allowed producers to identify carrier animals using commercial single nucleotide polymorphism (SNP) genotyping assays. This study was devised to identify the causative mutations underlying defective bovine embryo development contained within three of these haplotypes (Brown Swiss haplotype 1 and Holstein haplotypes 2 and 3) by combining exome capture with next generation sequencing. Of the 68,476,640 sequence variations (SV) identified, only 1,311 genome-wide SNP were concordant with the haplotype status of 21 sequenced carriers. Validation genotyping of 36 candidate SNP identified only 1 variant that was concordant to Holstein haplotype 3 (HH3), while no variants located within the refined intervals for HH2 or BH1 were concordant. The variant strictly associated with HH3 is a non-synonymous SNP (T/C) within exon 24 of the Structural Maintenance of Chromosomes 2 (SMC2) on Chromosome 8 at position 95,410,507 (UMD3.1). This polymorphism changes amino acid 1135 from phenylalanine to serine and causes a non-neutral, non-tolerated, and evolutionarily unlikely substitution within the NTPase domain of the encoded protein. Because only exome capture sequencing was used, we could not rule out the possibility that the true causative mutation for HH3 might lie in a non-exonic genomic location. Given the essential role of SMC2 in DNA repair, chromosome condensation and segregation during cell division, our findings strongly support the non-synonymous SNP (T/C) in SMC2 as the likely causative mutation. The absence of concordant variations for HH2 or BH1 suggests either the underlying causative mutations lie within a non-exomic region or in exome regions not covered by the capture array.
McClure, Matthew C.; Bickhart, Derek; Null, Dan; VanRaden, Paul; Xu, Lingyang; Wiggans, George; Liu, George; Schroeder, Steve; Glasscock, Jarret; Armstrong, Jon; Cole, John B.; Van Tassell, Curtis P.; Sonstegard, Tad S.
2014-01-01
The recent discovery of bovine haplotypes with negative effects on fertility in the Brown Swiss, Holstein, and Jersey breeds has allowed producers to identify carrier animals using commercial single nucleotide polymorphism (SNP) genotyping assays. This study was devised to identify the causative mutations underlying defective bovine embryo development contained within three of these haplotypes (Brown Swiss haplotype 1 and Holstein haplotypes 2 and 3) by combining exome capture with next generation sequencing. Of the 68,476,640 sequence variations (SV) identified, only 1,311 genome-wide SNP were concordant with the haplotype status of 21 sequenced carriers. Validation genotyping of 36 candidate SNP identified only 1 variant that was concordant to Holstein haplotype 3 (HH3), while no variants located within the refined intervals for HH2 or BH1 were concordant. The variant strictly associated with HH3 is a non-synonymous SNP (T/C) within exon 24 of the Structural Maintenance of Chromosomes 2 (SMC2) on Chromosome 8 at position 95,410,507 (UMD3.1). This polymorphism changes amino acid 1135 from phenylalanine to serine and causes a non-neutral, non-tolerated, and evolutionarily unlikely substitution within the NTPase domain of the encoded protein. Because only exome capture sequencing was used, we could not rule out the possibility that the true causative mutation for HH3 might lie in a non-exonic genomic location. Given the essential role of SMC2 in DNA repair, chromosome condensation and segregation during cell division, our findings strongly support the non-synonymous SNP (T/C) in SMC2 as the likely causative mutation. The absence of concordant variations for HH2 or BH1 suggests either the underlying causative mutations lie within a non-exomic region or in exome regions not covered by the capture array. PMID:24667746
Re-Ranking Sequencing Variants in the Post-GWAS Era for Accurate Causal Variant Identification
Faye, Laura L.; Machiela, Mitchell J.; Kraft, Peter; Bull, Shelley B.; Sun, Lei
2013-01-01
Next generation sequencing has dramatically increased our ability to localize disease-causing variants by providing base-pair level information at costs increasingly feasible for the large sample sizes required to detect complex-trait associations. Yet, identification of causal variants within an established region of association remains a challenge. Counter-intuitively, certain factors that increase power to detect an associated region can decrease power to localize the causal variant. First, combining GWAS with imputation or low coverage sequencing to achieve the large sample sizes required for high power can have the unintended effect of producing differential genotyping error among SNPs. This tends to bias the relative evidence for association toward better genotyped SNPs. Second, re-use of GWAS data for fine-mapping exploits previous findings to ensure genome-wide significance in GWAS-associated regions. However, using GWAS findings to inform fine-mapping analysis can bias evidence away from the causal SNP toward the tag SNP and SNPs in high LD with the tag. Together these factors can reduce power to localize the causal SNP by more than half. Other strategies commonly employed to increase power to detect association, namely increasing sample size and using higher density genotyping arrays, can, in certain common scenarios, actually exacerbate these effects and further decrease power to localize causal variants. We develop a re-ranking procedure that accounts for these adverse effects and substantially improves the accuracy of causal SNP identification, often doubling the probability that the causal SNP is top-ranked. Application to the NCI BPC3 aggressive prostate cancer GWAS with imputation meta-analysis identified a new top SNP at 2 of 3 associated loci and several additional possible causal SNPs at these loci that may have otherwise been overlooked. This method is simple to implement using R scripts provided on the author's website. PMID:23950724
Delannoy, Sabine; Mariani-Kurkdjian, Patricia; Webb, Hattie E.; Bonacorsi, Stephane; Fach, Patrick
2017-01-01
Shiga toxin-producing Escherichia coli of serotype O26:H11/H- constitute a diverse group of strains and several clones with distinct genetic characteristics have been identified and characterized. Whole genome sequencing was performed using Illumina and PacBio technologies on eight stx2-positive O26:H11 strains circulating in France. Comparative analyses of the whole genome of the stx2-positive O26:H11 strains indicate that several clones of EHEC O26:H11 are co-circulating in France. Phylogenetic analysis of the French strains together with stx2-positive and stx-negative E. coli O26:H11 genomes obtained from Genbank indicates the existence of four clonal complexes (SNP-CCs) separated in two distinct lineages, one of which comprises the “new French clone” (SNP-CC1) that appears genetically closely related to stx-negative attaching and effacing E. coli (AEEC) strains. Interestingly, the whole genome SNP (wgSNP) phylogeny is summarized in the cas gene phylogeny, and a simple qPCR assay targeting the CRISPR array specific to SNP-CC1 (SP_O26-E) can distinguish between the two main lineages. The PacBio sequencing allowed a detailed analysis of the mobile genetic elements (MGEs) of the strains. Numerous MGEs were identified in each strain, including a large number of prophages and up to four large plasmids, representing overall 8.7–19.8% of the total genome size. Analysis of the prophage pool of the strains shows a considerable diversity with a complex history of recombination. Each clonal complex (SNP-CC) is characterized by a unique set of plasmids and phages, including stx-prophages, suggesting evolution through separate acquisition events. Overall, the MGEs appear to play a major role in O26:H11 intra-serotype clonal diversification. PMID:28932209
Identification of the mechanism underlying a human chimera by SNP array analysis.
Shin, So Youn; Yoo, Han-Wook; Lee, Beom Hee; Kim, Kun Suk; Seo, Eul-Ju
2012-09-01
Human chimerism resulting from the fusion of two different zygotes is a rare phenomenon. Two mechanisms of chimerism have been hypothesized: dispermic fertilization of an oocyte and its second polar body and dispermic fertilization of two identical gametes from parthenogenetic activation, and these can be identified and discriminated using DNA polymorphism. In the present study we describe a patient with chimerism presenting as a true hermaphrodite and applied single nucleotide polymorphism array analysis to demonstrate dispermic fertilization of two identical gametes from parthenogenetic activation as the underlying mechanism at the whole chromosome level. We suggest that application of genotyping array analysis to the diagnostic process in patients with disorders of sex development will help identify more human chimera patients and increase our understanding of the underlying mechanisms. Copyright © 2012 Wiley Periodicals, Inc.
Maouche, Seraya; Poirier, Odette; Godefroy, Tiphaine; Olaso, Robert; Gut, Ivo; Collet, Jean-Phillipe; Montalescot, Gilles; Cambien, François
2008-01-01
Background: In this study we assessed the respective ability of Affymetrix and Illumina microarray methodologies to answer a relevant biological question, namely the change in gene expression between resting monocytes and macrophages derived from these monocytes. Five RNA samples for each type of cell were hybridized to the two platforms in parallel. In addition, a reference list of differentially expressed genes (DEG) was generated from a larger number of hybridizations (mRNA from 86 individuals) using the RNG/MRC two-color platform. Results: Our results show an important overlap of the Illumina and Affymetrix DEG lists. In addition, more than 70% of the genes in these lists were also present in the reference list. Overall the two platforms had very similar performance in terms of biological significance, evaluated by the presence in the DEG lists of an excess of genes belonging to Gene Ontology (GO) categories relevant for the biology of monocytes and macrophages. Our results support the conclusion of the MicroArray Quality Control (MAQC) project that the criteria used to constitute the DEG lists strongly influence the degree of concordance among platforms. However, the importance of prioritizing genes by magnitude of effect (fold change) rather than statistical significance (p-value) to enhance cross-platform reproducibility recommended by the MAQC authors was not supported by our data. Conclusion: Functional analysis based on GO enrichment demonstrates that the 2 compared technologies delivered very similar results and identified most of the relevant GO categories enriched in the reference list. PMID:18578872
Vaginal Gene Expression During Treatment With Aromatase Inhibitors.
Kallak, Theodora Kunovac; Baumgart, Juliane; Nilsson, Kerstin; Åkerud, Helena; Poromaa, Inger Sundström; Stavreus-Evers, Anneli
2015-12-01
Aromatase inhibitor (AI) treatment suppresses estrogen biosynthesis and causes genitourinary symptoms of menopause such as vaginal symptoms, ultimately affecting the quality of life for many postmenopausal women with breast cancer. Thus, the aim of this study was to examine vaginal gene expression in women during treatment with AIs compared with estrogen-treated women. The secondary aim was to study the presence and localization of vaginal aromatase. Vaginal biopsies were collected from postmenopausal women treated with AIs and from age-matched control women treated with vaginal estrogen therapy. Differential gene expression was studied with the Affymetrix Gene Chip Gene 1.0 ST Array (Affymetrix Inc, Santa Clara, CA) system, Ingenuity pathway analysis, quantitative real-time polymerase chain reaction, and immunohistochemistry. The expression of 279 genes differed between the 2 groups; AI-treated women had low expression of genes involved in cell differentiation, proliferation, and cell adhesion. Some differentially expressed genes were found to interact indirectly with the estrogen receptor alpha. In addition, aromatase protein staining was evident in the basal and the intermediate vaginal epithelium layers, and also in stromal cells with a slightly stronger staining intensity found in AI-treated women. In this study, we demonstrated that genes involved in cell differentiation, proliferation, and cell adhesion are differentially expressed in AI-treated women. The expression of vaginal aromatase suggests that this could be the result of local and systemic inhibition of aromatase. Our results emphasize the role of estrogen for vaginal cell differentiation and proliferation and future drug candidates should be aimed at improving cell differentiation and proliferation. Copyright © 2015 Elsevier Inc. All rights reserved.
2010-01-01
Background: Thoroughbred horses have been selected for traits contributing to speed and stamina for centuries. It is widely recognized that inherited variation in physical and physiological characteristics is responsible for variation in individual aptitude for race distance, and that muscle phenotypes in particular are important. Results: A genome-wide SNP-association study for optimum racing distance was performed using the EquineSNP50 Bead Chip genotyping array in a cohort of n = 118 elite Thoroughbred racehorses divergent for race distance aptitude. In a cohort-based association test we evaluated genotypic variation at 40,977 SNPs between horses suited to short distance (≤ 8 f) and middle-long distance (> 8 f) races. The most significant SNP was located on chromosome 18: BIEC2-417495 ~690 kb from the gene encoding myostatin (MSTN) [P(unadj.) = 6.96 × 10⁻⁶]. Considering best race distance as a quantitative phenotype, a peak of association on chromosome 18 (chr18:65809482-67545806) comprising eight SNPs encompassing a 1.7 Mb region was observed. Again, similar to the cohort-based analysis, the most significant SNP was BIEC2-417495 (P(unadj.) = 1.61 × 10⁻⁹; P(Bonf.) = 6.58 × 10⁻⁵). In a candidate gene study we have previously reported a SNP (g.66493737C>T) in MSTN associated with best race distance in Thoroughbreds; however, its functional and genome-wide relevance were uncertain. Additional re-sequencing in the flanking regions of the MSTN gene revealed four novel 3' UTR SNPs and a 227 bp SINE insertion polymorphism in the 5' UTR promoter sequence. Linkage disequilibrium was highest between g.66493737C>T and BIEC2-417495 (r² = 0.86). Conclusions: Comparative association tests consistently demonstrated the g.66493737C>T SNP as the superior variant in the prediction of distance aptitude in racehorses (g.66493737C>T, P = 1.02 × 10⁻¹⁰; BIEC2-417495, P(unadj.) = 1.61 × 10⁻⁹). 
Functional investigations will be required to determine whether this polymorphism affects putative transcription-factor binding and gives rise to variation in gene and protein expression. Nonetheless, this study demonstrates that the g.66493737C>T SNP provides the most powerful genetic marker for prediction of race distance aptitude in Thoroughbreds. PMID:20932346
Hill, Emmeline W; McGivney, Beatrice A; Gu, Jingjing; Whiston, Ronan; Machugh, David E
2010-10-11
Thoroughbred horses have been selected for traits contributing to speed and stamina for centuries. It is widely recognized that inherited variation in physical and physiological characteristics is responsible for variation in individual aptitude for race distance, and that muscle phenotypes in particular are important. A genome-wide SNP-association study for optimum racing distance was performed using the EquineSNP50 Bead Chip genotyping array in a cohort of n = 118 elite Thoroughbred racehorses divergent for race distance aptitude. In a cohort-based association test we evaluated genotypic variation at 40,977 SNPs between horses suited to short distance (≤ 8 f) and middle-long distance (> 8 f) races. The most significant SNP was located on chromosome 18: BIEC2-417495 ~690 kb from the gene encoding myostatin (MSTN) [P(unadj.) = 6.96 x 10⁻⁶]. Considering best race distance as a quantitative phenotype, a peak of association on chromosome 18 (chr18:65809482-67545806) comprising eight SNPs encompassing a 1.7 Mb region was observed. Again, similar to the cohort-based analysis, the most significant SNP was BIEC2-417495 (P(unadj.) = 1.61 x 10⁻⁹; P(Bonf.) = 6.58 x 10⁻⁵). In a candidate gene study we have previously reported a SNP (g.66493737C>T) in MSTN associated with best race distance in Thoroughbreds; however, its functional and genome-wide relevance were uncertain. Additional re-sequencing in the flanking regions of the MSTN gene revealed four novel 3' UTR SNPs and a 227 bp SINE insertion polymorphism in the 5' UTR promoter sequence. Linkage disequilibrium was highest between g.66493737C>T and BIEC2-417495 (r² = 0.86). Comparative association tests consistently demonstrated the g.66493737C>T SNP as the superior variant in the prediction of distance aptitude in racehorses (g.66493737C>T, P = 1.02 x 10⁻¹⁰; BIEC2-417495, P(unadj.) = 1.61 x 10⁻⁹). 
Functional investigations will be required to determine whether this polymorphism affects putative transcription-factor binding and gives rise to variation in gene and protein expression. Nonetheless, this study demonstrates that the g.66493737C>T SNP provides the most powerful genetic marker for prediction of race distance aptitude in Thoroughbreds.
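The relationship between the unadjusted and Bonferroni-corrected P-values quoted in this abstract can be checked with a short worked example. It assumes the correction simply multiplies the unadjusted P-value by the 40,977 SNPs tested; the helper name `bonferroni` is illustrative and not taken from the study.

```python
def bonferroni(p_unadjusted: float, n_tests: int) -> float:
    """Bonferroni-adjusted P-value: multiply by the number of tests, cap at 1."""
    return min(1.0, p_unadjusted * n_tests)


N_SNPS = 40_977  # SNPs evaluated, as stated in the abstract

# Quantitative-phenotype analysis, BIEC2-417495
p_quantitative = bonferroni(1.61e-9, N_SNPS)
# Cohort-based analysis, BIEC2-417495
p_cohort = bonferroni(6.96e-6, N_SNPS)

print(f"quantitative: {p_quantitative:.2e}")
print(f"cohort-based: {p_cohort:.3f}")
```

Under this assumption the quantitative-phenotype result comes out at about 6.6 × 10⁻⁵, within 1% of the reported 6.58 × 10⁻⁵; the small gap is presumably due to rounding of the published unadjusted value. The same arithmetic suggests the cohort-based P-value of 6.96 × 10⁻⁶ would not stay below 0.05 after correction.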
Gao, Guangtu; Nome, Torfinn; Pearse, Devon E; Moen, Thomas; Naish, Kerry A; Thorgaard, Gary H; Lien, Sigbjørn; Palti, Yniv
2018-01-01
Single-nucleotide polymorphisms (SNPs) are highly abundant markers, which are broadly distributed in animal genomes. For rainbow trout (Oncorhynchus mykiss), SNP discovery has been previously done through sequencing of restriction-site associated DNA (RAD) libraries, reduced representation libraries (RRL) and RNA sequencing. Recently we performed high coverage whole genome resequencing with 61 unrelated samples, representing a wide range of rainbow trout and steelhead populations, with 49 new samples added to 12 aquaculture samples from AquaGen (Norway) that we previously used for SNP discovery. Of the 49 new samples, 11 were doubled haploid lines from Washington State University (WSU) and 38 represented wild and hatchery populations from a wide range of geographic distribution and with divergent migratory phenotypes. We then mapped the sequences to the new rainbow trout reference genome assembly (GCA_002163495.1), which is based on the Swanson YY doubled haploid line. Variant calling was conducted with FreeBayes and SAMtools mpileup, followed by filtering of SNPs based on quality score, sequence complexity, read depth on the locus, and number of genotyped samples. Results from the two variant calling programs were compared and genotypes of the doubled haploid samples were used for detecting and filtering putative paralogous sequence variants (PSVs) and multi-sequence variants (MSVs). Overall, 30,302,087 SNPs were identified on the 29 chromosomes of the rainbow trout genome and 1,139,018 on unplaced scaffolds, with 4,042,723 SNPs having high minor allele frequency (MAF > 0.25). The average SNP density on the chromosomes was one SNP per 64 bp, or 15.6 SNPs per 1 kb. Results from the phylogenetic analysis that we conducted indicate that the SNP markers contain enough population-specific polymorphisms for recovering population relationships despite the small sample size used. 
Intra-population polymorphism assessment revealed a high level of polymorphism and heterozygosity within each population. We also provide functional annotation based on the genome position of each SNP and evaluate the use of clonal lines for filtering of PSVs and MSVs. These SNPs form a new database, which provides an important resource for a new high density SNP array design and for other SNP genotyping platforms used for genetic and genomics studies of this iconic salmonid fish species.
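Two of the quantities in this abstract, SNP density and minor allele frequency, can be made concrete with a short calculation. The sketch below uses only the figures stated above (one SNP per 64 bp, MAF threshold of 0.25); the function names and the allele counts in the example are hypothetical.

```python
def snps_per_kb(bp_per_snp: float) -> float:
    """Convert 'one SNP every N bp' into SNPs per 1,000 bp."""
    return 1000.0 / bp_per_snp


def minor_allele_frequency(ref_count: int, alt_count: int) -> float:
    """Frequency of the less common allele among all observed alleles."""
    total = ref_count + alt_count
    if total == 0:
        raise ValueError("no alleles observed at this site")
    return min(ref_count, alt_count) / total


density = snps_per_kb(64)  # one SNP per 64 bp, as reported
print(round(density, 1))

# Hypothetical site: 61 diploid samples -> 122 alleles, 32 of them alternate.
maf = minor_allele_frequency(ref_count=90, alt_count=32)
print(round(maf, 3), maf > 0.25)
```

The density figure reproduces the 15.6 SNPs per kb quoted in the abstract; the allele counts are invented solely to show a site that would pass the MAF > 0.25 threshold.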
Zago, V H S; Scherrer, D Z; Parra, E S; Panzoldo, N B; Alexandre, F; Nakandakare, E R; Quintão, E C R; de Faria, E C
2015-03-01
ATP binding cassette transporter G1 (ABCG1) promotes lipidation of nascent high-density lipoprotein (HDL) particles, acting as an intracellular transporter. SNP rs1893590 (c.-204A > C) of the ABCG1 gene has been previously studied and reported as functional over plasma HDL-C and lipoprotein lipase activity. This study aimed to investigate the relationships of SNP rs1893590 with plasma lipids and lipoproteins in a large Brazilian population. A total of 654 asymptomatic and normolipidemic volunteers of both genders were selected. Clinical and anthropometric data were collected and blood samples were drawn after a 12 h fast. Plasma lipids and lipoproteins, as well as HDL particle size and volume, were determined. Genomic DNA was isolated for SNP rs1893590 detection with the TaqMan® OpenArray® Real-Time PCR Platform (Applied Biosystems). The statistical tests used were Mann-Whitney U, chi-square and two-way ANOVA. No significant differences were found between the allele groups for any of the studied parameters. However, significant interactions were observed between SNP and age over plasma HDL-C, where volunteers under 60 years with the AA genotype had increased HDL-C (p = 0.048). Similar results were observed in the group with body mass index (BMI) < 25 kg/m², where volunteers with the AA genotype had higher HDL-C levels (p = 0.0034), plus an increased HDL particle size (p = 0.01). These findings indicate that SNP rs1893590 of ABCG1 has a significant impact over HDL-C under asymptomatic clinical conditions in an age- and BMI-dependent way.
Cohort analysis of a single nucleotide polymorphism on DNA chips.
Schwonbeck, Susanne; Krause-Griep, Andrea; Gajovic-Eichelmann, Nenad; Ehrentreich-Förster, Eva; Meinl, Walter; Glatt, Hansrüdi; Bier, Frank F
2004-11-15
A method has been developed to determine SNPs on DNA chips by applying a flow-through bioscanner. As a practical application we demonstrated the fast and simple SNP analysis of 24 genotypes in an array of 96 spots with a single hybridisation and dissociation experiment. The main advantage of this methodical concept is the parallel and fast analysis without any need of enzymatic digestion. Additionally, the DNA chip format used is appropriate for parallel analysis of up to 400 spots. The polymorphism in the gene of the human phenol sulfotransferase SULT1A1 was studied as a model SNP. Biotinylated PCR products containing the SNP (mutant) and those containing no mutation (wild-type) were brought onto the chips coated with NeutrAvidin using non-contact spotting. This was followed by an analysis which was carried out in a flow-through biochip scanner while constantly rinsing with buffer. After removing the non-biotinylated strand, a fluorescent probe complementary to the wild-type sequence was hybridised. If this probe binds to a mutant sequence, a single base is mismatched, so the mismatched hybrid (mutant) is less stable than the fully matched hybrid (wild-type). The final step after hybridisation on the chip involves rinsing with a buffer to start dissociation of the fluorescent probe from the immobilised DNA strand. The online measurement of the fluorescence intensity by the biochip scanner makes it possible to follow the kinetics of the hybridisation and dissociation processes. Because the full match and the mismatch differ in stability, either visual discrimination or kinetic analysis can distinguish the SNP-containing sequence from the wild-type sequence.
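The kinetic discrimination described above can be illustrated with a small sketch. It assumes, purely for illustration, that probe dissociation follows first-order decay, F(t) = F0 · exp(−k·t), so that a less stable mismatched hybrid shows a larger rate constant k. The abstract does not state the model the authors fitted, and the time points, intensities and rate constants below are synthetic.

```python
import math
from typing import Sequence


def dissociation_rate(times: Sequence[float], intensities: Sequence[float]) -> float:
    """Estimate k in F(t) = F0 * exp(-k t) by least squares on ln(F) versus t."""
    logs = [math.log(v) for v in intensities]
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(logs) / n
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, logs))
    sxx = sum((t - mean_t) ** 2 for t in times)
    return -sxy / sxx  # the slope of ln(F) is -k


# Synthetic fluorescence traces (arbitrary units) during buffer rinsing.
times = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
wild_type = [1000.0 * math.exp(-0.05 * t) for t in times]  # full match, slow loss
mutant = [1000.0 * math.exp(-0.30 * t) for t in times]     # mismatch, fast loss

k_wild_type = dissociation_rate(times, wild_type)
k_mutant = dissociation_rate(times, mutant)
print(k_wild_type < k_mutant)
```

With real scanner data the traces would be noisy and might need background subtraction before taking logarithms; a spot could then be assigned to one genotype or the other by comparing its fitted rate with those of known controls.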
Di Pierro, Erica A; Gianfranceschi, Luca; Di Guardo, Mario; Koehorst-van Putten, Herma JJ; Kruisselbrink, Johannes W; Longhi, Sara; Troggio, Michela; Bianco, Luca; Muranty, Hélène; Pagliarani, Giulia; Tartarini, Stefano; Letschka, Thomas; Lozano Luis, Lidia; Garkava-Gustavsson, Larisa; Micheletti, Diego; Bink, Marco CAM; Voorrips, Roeland E; Aziz, Ebrahimi; Velasco, Riccardo; Laurens, François; van de Weg, W Eric
2016-01-01
Quantitative trait loci (QTL) mapping approaches rely on the correct ordering of molecular markers along the chromosomes, which can be obtained from genetic linkage maps or a reference genome sequence. For apple (Malus domestica Borkh), the genome sequence v1 and v2 could not meet this need; therefore, a novel approach was devised to develop a dense genetic linkage map, providing the most reliable marker-loci order for the highest possible number of markers. The approach was based on four strategies: (i) the use of multiple full-sib families, (ii) the reduction of missing information through the use of HaploBlocks and alternative calling procedures for single-nucleotide polymorphism (SNP) markers, (iii) the construction of a single backcross-type data set including all families, and (iv) a two-step map generation procedure based on the sequential inclusion of markers. The map comprises 15 417 SNP markers, clustered in 3 K HaploBlock markers spanning 1 267 cM, with an average distance between adjacent markers of 0.37 cM and a maximum distance of 3.29 cM. Moreover, chromosome 5 was oriented according to its homoeologous chromosome 10. This map was useful to improve the apple genome sequence, design the Axiom Apple 480 K SNP array and perform multifamily-based QTL studies. Its collinearity with the genome sequences v1 and v3 are reported. To our knowledge, this is the shortest published SNP map in apple, while including the largest number of markers, families and individuals. This result validates our methodology, proving its value for the construction of integrated linkage maps for any outbreeding species. PMID:27917289
Tran, Frances; Penniket, Carolyn; Patel, Rohan V; Provart, Nicholas J; Laroche, André; Rowland, Owen; Robert, Laurian S
2013-06-01
Despite their importance, there remains a paucity of large-scale gene expression-based studies of reproductive development in species belonging to the Triticeae. As a first step to address this deficiency, a gene expression atlas of triticale reproductive development was generated using the 55K Affymetrix GeneChip® wheat genome array. The global transcriptional profiles of the anther/pollen, ovary and stigma were analyzed at concurrent developmental stages, and co-expressed as well as preferentially expressed genes were identified. Data analysis revealed both novel and conserved regulatory factors underlying Triticeae floral development and function. This comprehensive resource rests upon detailed gene annotations, and the expression profiles are readily accessible via a web browser. © 2013 Her Majesty the Queen in Right of Canada as represented by the Minister of Agriculture and Agri-Food Canada.
USDA-ARS's Scientific Manuscript database
Moniliophthora roreri is the fungal pathogen that causes frosty pod rot (FPR) disease of Theobroma cacao L., the source of chocolate. FPR occurs in most of the cacao producing countries in the Western Hemisphere, causing yield losses up to 80%. Genetic diversity within the FPR pathogen population ma...
Sun, Huihui; Wan, Naijun; Wang, Xinli; Chang, Liang; Cheng, Dazhi
2018-01-01
18p deletion syndrome is a rare chromosomal disease caused by deletion of the short arm of chromosome 18. By using cytogenetic and SNP array analysis, we identified a girl with 18p deletion syndrome exhibiting craniofacial anomalies, intellectual disability, and short stature. G-banding analysis of metaphase cells revealed an abnormal karyotype 46,XX,del(18)(p10). Further, SNP array detected a 15.3-Mb deletion at 18p11.21p11.32 (chr18:12842-15375878) including 61 OMIM genes. Genotype-phenotype correlation analysis showed that clinical manifestations of the patient were correlated with LAMA1, TWSG1, and GNAL deletions. Her neuropsychological assessment demonstrated delay in most cognitive functions, including impaired mathematics, linguistic skills, visual motor perception, response speed, and executive function. Meanwhile, her integrated visual and auditory continuous performance test (IVA-CPT) indicated a severe comprehensive attention deficit. At age 7 and 1/12 years, her height was 110.8 cm (-2.5 SD height for age). Growth hormone (GH) treatment was initiated. After 27 months of treatment, her height had increased to 129.6 cm (-1.0 SD height for age) at 9 and 4/12 years, indicating an effective response to GH treatment. © 2018 S. Karger AG, Basel.
Ellis, David; Chavez, Oswaldo; Coombs, Joseph J; Soto, Julian V; Gomez, Rene; Douches, David S; Panta, Ana; Silvestre, Rocio; Anglin, Noelle Lynette
2018-05-24
Breeders rely on the genetic integrity of material from genebanks; however, mislabeling and errors in original data can occur. Paired samples of original material and their in vitro counterparts from 250 diverse potato landrace accessions from the International Potato Center (CIP) were fingerprinted using the Infinium 12K V2 Potato Array to confirm genetic identity and evaluate genetic diversity. Diploid, triploid, and tetraploid accessions were included, representing seven cultivated potato taxa (Hawkes, 1990). Fingerprints of mother field plants and in vitro clones were used to evaluate identity, relatedness, and ancestry. Clones of the same accession grouped together; however, eleven (4.4%) accessions were genetic mismatches. SNP genotypes were used to construct a phylogeny to evaluate inter- and intraspecific relationships and population structure. The data suggest that the triploids evaluated are genetically similar. STRUCTURE analysis identified several putative hybrids and suggests six populations with significant gene flow between them. This study provides a model for verifying the genetic identity of plant genetic resources collections. Mistakes in the conservation of these collections and in genebanks are a reality, and confirmed identity is critical for breeders and other users of these collections as well as for quality management programs. The study also provides insights into the diversity of the accessions evaluated.
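The identity check between a field plant and its in vitro clone amounts to comparing two SNP fingerprints. The sketch below is an illustration under stated assumptions, not the procedure used in the study: genotypes are coded as allele dosages (0 to 4 for a tetraploid), missing calls are `None`, and the function name `genotype_concordance` is hypothetical.

```python
from typing import Optional, Sequence


def genotype_concordance(field: Sequence[Optional[int]],
                         in_vitro: Sequence[Optional[int]]) -> float:
    """Fraction of SNPs with identical calls, ignoring sites missing in either sample."""
    compared = 0
    matches = 0
    for a, b in zip(field, in_vitro):
        if a is None or b is None:
            continue  # skip sites without a call in one of the samples
        compared += 1
        matches += (a == b)
    if compared == 0:
        raise ValueError("no SNPs with calls in both samples")
    return matches / compared


# Toy fingerprints (allele dosages); the values are invented.
mother_plant = [0, 1, 2, None, 4, 3]
true_clone = [0, 1, 2, 3, 4, 3]
mislabeled = [4, 1, 0, 3, 2, 3]

print(genotype_concordance(mother_plant, true_clone))
print(genotype_concordance(mother_plant, mislabeled))

# Mismatch rate reported in the abstract: 11 of 250 accessions.
mismatch_percent = round(11 / 250 * 100, 1)
print(mismatch_percent)
```

A pair would be flagged as a mismatch when its concordance falls below a chosen cut-off; the abstract gives no threshold, so any value used in practice would have to be calibrated against known technical replicates.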
Comparison of Two Methods for Detecting Alternative Splice Variants Using GeneChip® Exon Arrays
Fan, Wenhong; Stirewalt, Derek L.; Radich, Jerald P.; Zhao, Lueping
2011-01-01
The Affymetrix GeneChip Exon Array can be used to detect alternative splice variants. Microarray Detection of Alternative Splicing (MIDAS) and Partek® Genomics Suite (Partek® GS) are among the most popular analytical methods used to analyze exon array data. While both methods utilize statistical significance for testing, MIDAS and Partek® GS could produce somewhat different results due to different underlying assumptions. Comparing MIDAS and Partek® GS is quite difficult due to their substantially different mathematical formulations and assumptions regarding alternative splice variants. For meaningful comparison, we have used the previously published generalized probe model (GPM) which encompasses both MIDAS and Partek® GS under different assumptions. We analyzed a colon cancer exon array data set using MIDAS, Partek® GS and GPM. MIDAS and Partek® GS produced quite different sets of genes that are considered to have alternative splice variants. Further, we found that GPM produced results similar to MIDAS as well as to Partek® GS under their respective assumptions. Within the GPM, we show how discoveries relating to alternative variants can be quite different due to different assumptions. MIDAS focuses on relative changes in expression values across different exons within genes and tends to be robust but less efficient. Partek® GS, however, uses absolute expression values of individual exons within genes and tends to be more efficient but more sensitive to the presence of outliers. From our observations, we conclude that MIDAS and Partek® GS produce complementary results, and discoveries from both analyses should be considered. PMID:23675234
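The contrast drawn in this abstract between relative and absolute exon-level changes can be illustrated with a toy calculation. This is not the MIDAS or Partek GS algorithm: both are statistical tests, whereas the sketch only compares group means. It also assumes that the gene-level signal of a sample is the mean of its exon log2 signals. All names and numbers are made up.

```python
from statistics import mean
from typing import List

Samples = List[List[float]]  # samples x exons, log2 signal


def absolute_exon_shift(group_a: Samples, group_b: Samples) -> List[float]:
    """Per-exon difference in mean log2 signal between two groups."""
    n_exons = len(group_a[0])
    return [mean(s[e] for s in group_b) - mean(s[e] for s in group_a)
            for e in range(n_exons)]


def relative_exon_shift(group_a: Samples, group_b: Samples) -> List[float]:
    """Same comparison after subtracting each sample's gene-level mean."""
    def normalise(samples: Samples) -> Samples:
        return [[x - mean(s) for x in s] for s in samples]
    return absolute_exon_shift(normalise(group_a), normalise(group_b))


control = [[8.0, 8.0, 8.0], [8.0, 8.0, 8.0]]
whole_gene_up = [[9.0, 9.0, 9.0], [9.0, 9.0, 9.0]]  # every exon up by 1
one_exon_up = [[8.0, 8.0, 10.0], [8.0, 8.0, 10.0]]  # only exon 3 changes

abs_gene = absolute_exon_shift(control, whole_gene_up)
rel_gene = relative_exon_shift(control, whole_gene_up)
abs_exon = absolute_exon_shift(control, one_exon_up)
rel_exon = relative_exon_shift(control, one_exon_up)
print(abs_gene, rel_gene)
print(abs_exon, rel_exon)
```

In the first comparison the absolute view reports a shift at every exon, while the relative view reports none, which is the behaviour expected when the whole gene changes expression without any change in splicing. In the second comparison both views single out the third exon.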
Metaplastic breast carcinomas display genomic and transcriptomic heterogeneity
Weigelt, Britta; Ng, Charlotte KY; Shen, Ronglai; Popova, Tatiana; Schizas, Michail; Natrajan, Rachael; Mariani, Odette; Stern, Marc-Henri; Norton, Larry; Vincent-Salomon, Anne; Reis-Filho, Jorge S
2015-01-01
Metaplastic breast carcinoma is a rare and aggressive histologic type of breast cancer, preferentially displaying a triple-negative phenotype. We sought to define the transcriptomic heterogeneity of metaplastic breast cancers on the basis of current gene expression microarray-based classifiers, and to determine whether these tumors display gene copy number profiles consistent with those of BRCA1-associated breast cancers. Twenty-eight consecutive triple-negative metaplastic breast carcinomas were reviewed, and the metaplastic component present in each frozen specimen was defined (ie, spindle cell, squamous, chondroid metaplasia). RNA and DNA extracted from frozen sections with tumor cell content >60% were subjected to gene expression (Illumina HumanHT-12 v4) and copy number profiling (Affymetrix SNP 6.0), respectively. Using the best practice PAM50/claudin-low microarray-based classifier, all metaplastic breast carcinomas with spindle cell metaplasia were of claudin-low subtype, whereas those with squamous or chondroid metaplasia were preferentially of basal-like subtype. Triple-negative breast cancer subtyping using a dedicated website (http://cbc.mc.vanderbilt.edu/tnbc/) revealed that all metaplastic breast carcinomas with chondroid metaplasia were of mesenchymal-like subtype, spindle cell carcinomas preferentially of unstable or mesenchymal stem-like subtype, and those with squamous metaplasia were of multiple subtypes. None of the cases was classified as immunomodulatory or luminal androgen receptor subtype. Integrative clustering, combining gene expression and gene copy number data, revealed that metaplastic breast carcinomas with spindle cell and chondroid metaplasia were preferentially classified as of integrative clusters 4 and 9, respectively, whereas those with squamous metaplasia were classified into six different clusters. Eight of the 26 metaplastic breast cancers subjected to SNP6 analysis were classified as BRCA1-like. 
The diversity of histologic features of metaplastic breast carcinomas is reflected at the transcriptomic level, and an association between molecular subtypes and histology was observed. BRCA1-like genomic profiles were found only in a subset (31%) of metaplastic breast cancers, and were not associated with a specific molecular or histologic subtype. PMID:25412848
Al-Tobasei, Rafet; Ali, Ali; Leeds, Timothy D; Liu, Sixin; Palti, Yniv; Kenney, Brett; Salem, Mohamed
2017-08-07
Coding/functional SNPs change the biological function of a gene and, therefore, could serve as "large-effect" genetic markers. In this study, we used two bioinformatics pipelines, GATK and SAMtools, for discovering coding/functional SNPs with allelic-imbalances associated with total body weight, muscle yield, muscle fat content, shear force, and whiteness. Phenotypic data were collected for approximately 500 fish, representing 98 families (5 fish/family), from a growth-selected line, and the muscle transcriptome was sequenced from 22 families with divergent phenotypes (4 low- versus 4 high-ranked families per trait). GATK detected 59,112 putative SNPs; of these SNPs, 4798 showed allelic imbalances (>2.0 as an amplification and <0.5 as loss of heterozygosity). SAMtools detected 87,066 putative SNPs; and of them, 4962 had allelic imbalances between the low- and high-ranked families. Only 1829 SNPs with allelic imbalances were common between the two datasets, indicating significant differences in algorithms. The two datasets contained 7930 non-redundant SNPs of which 4439 mapped to 1498 protein-coding genes (with 6.4% non-synonymous SNPs) and 684 mapped to 295 lncRNAs. Validation of a subset of 92 SNPs revealed 1) 86.7-93.8% success rate in calling polymorphic SNPs and 2) 95.4% consistent matching between DNA and cDNA genotypes indicating a high rate of identifying SNPs with allelic imbalances. In addition, 4.64% SNPs revealed random monoallelic expression. Genome distribution of the SNPs with allelic imbalances exhibited high density for all five traits in several chromosomes, especially chromosome 9, 20 and 28. Most of the SNP-harboring genes were assigned to important growth-related metabolic pathways. These results demonstrate utility of RNA-Seq in assessing phenotype-associated allelic imbalances in pooled RNA-Seq samples. 
The SNPs identified in this study were included in a new SNP-Chip design (available from Affymetrix) for genomic and genetic analyses in rainbow trout.
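The two cutoffs quoted in the abstract (>2.0 and <0.5) amount to a three-way classification of an allelic ratio. The sketch below is only an illustration: the abstract does not say how the ratio is computed, so the function takes the ratio as given, and the function name and the "balanced" label are hypothetical.

```python
def classify_allelic_ratio(ratio: float) -> str:
    """Label an allelic ratio using the cutoffs quoted in the abstract.

    How the ratio itself is derived is not stated in the abstract,
    so it is treated here as an input (an assumption of this sketch).
    """
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    if ratio > 2.0:
        return "amplification"
    if ratio < 0.5:
        return "loss of heterozygosity"
    return "balanced"  # hypothetical label for ratios within [0.5, 2.0]


# Made-up ratios, for illustration only.
for value in (3.1, 1.0, 0.2):
    print(value, classify_allelic_ratio(value))
```

Both inequalities are strict, as written in the abstract, so ratios of exactly 2.0 or 0.5 are not flagged.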
Transcriptome-wide targets of alternative splicing by RBM4 and possible role in cancer.
Markus, M Andrea; Yang, Yee Hwa J; Morris, Brian J
2016-04-01
This study determined transcriptome-wide targets of the splicing factor RBM4 using Affymetrix GeneChip® Human Exon 1.0 ST Arrays and HeLa cells treated with RBM4-specific siRNA. This revealed 238 transcripts that were targeted for alternative splicing. Cross-linking and immunoprecipitation experiments identified 945 RBM4 targets in mouse HEK293 cells, 39% of which were ascribed to "alternative splicing" by in silico pathway analysis. Mouse embryonic stem cells transfected with Rbm4 siRNA hairpins exhibited reduced colony numbers and size consistent with involvement of RBM4 in cell proliferation. RBM4 cDNA probing of a cancer cDNA array involving 18 different tumor types from 13 different tissues and matching normal tissue found overexpression of RBM4 mRNA (p<0.01) in cervical, breast, lung, colon, ovarian and rectal cancers. Many RBM4 targets we identified have been implicated in these cancers. In conclusion, our findings reveal transcriptome-wide targets of RBM4 and point to potential cancer-related targets and mechanisms that may involve RBM4. Copyright © 2016 Elsevier Inc. All rights reserved.
Population-genetic properties of differentiated copy number variations in cattle.
Xu, Lingyang; Hou, Yali; Bickhart, Derek M; Zhou, Yang; Hay, El Hamidi Abdel; Song, Jiuzhou; Sonstegard, Tad S; Van Tassell, Curtis P; Liu, George E
2016-03-23
While single nucleotide polymorphism (SNP) is typically the variant of choice for population genetics, copy number variation (CNV) which comprises insertion, deletion and duplication of genomic sequence, is an informative type of genetic variation. CNVs have been shown to be both common in mammals and important for understanding the relationship between genotype and phenotype. However, CNV differentiation, selection and its population genetic properties are not well understood across diverse populations. We performed a population genetics survey based on CNVs derived from the BovineHD SNP array data of eight distinct cattle breeds. We generated high resolution results that show geographical patterns of variations and genome-wide admixture proportions within and among breeds. Similar to the previous SNP-based studies, our CNV-based results displayed a strong correlation of population structure and geographical location. By conducting three pairwise comparisons among European taurine, African taurine, and indicine groups, we further identified 78 unique CNV regions that were highly differentiated, some of which might be due to selection. These CNV regions overlapped with genes involved in traits related to parasite resistance, immunity response, body size, fertility, and milk production. Our results characterize CNV diversity among cattle populations and provide a list of lineage-differentiated CNVs.
Single-nucleotide polymorphism genotyping on optical thin-film biosensor chips.
Zhong, Xiao-Bo; Reynolds, Robert; Kidd, Judith R; Kidd, Kenneth K; Jenison, Robert; Marlar, Richard A; Ward, David C
2003-09-30
Single-nucleotide polymorphisms (SNPs) constitute the bulk of human genetic variation and provide excellent markers to identify genetic factors contributing to complex disease susceptibility. A rapid, sensitive, and inexpensive assay is important for large-scale SNP scoring. Here we report the development of a multiplex SNP detection system using silicon chips coated to create a thin-film optical biosensor. Allele-discriminating, aldehyde-labeled oligonucleotides are arrayed and covalently attached to a hydrazine-derivatized chip surface. Target sequences (e.g., PCR amplicons) then are hybridized in the presence of a mixture of biotinylated detector probes, one for each SNP, and a thermostable DNA ligase. After a stringent wash (0.01 M NaOH), ligation of biotinylated detector probes to perfectly matched capture oligomers is visualized as a color change on the chip surface (gold to blue/purple) after brief incubations with an anti-biotin IgG-horseradish peroxidase conjugate and a precipitable horseradish peroxidase substrate. Testing of PCR fragments is completed in 30-40 min. Up to several hundred SNPs can be assayed on a 36-mm² chip, and SNP scoring can be done by eye or with a simple digital-camera system. This assay is extremely robust, exhibits high sensitivity and specificity, and is format-flexible and economical. In studies of mutations associated with risk for venous thrombosis and genotyping/haplotyping of African-American samples, we document high-fidelity analysis with 0 misassignments in 500 assays performed in duplicate.
Jeon, Jae Pil; Shim, Sung Mi; Jung, Jong Sun; Nam, Hye Young; Lee, Hye Jin; Oh, Berm Seok; Kim, Kuchan; Kim, Hyung Lae; Han, Bok Ghee
2009-09-30
To examine copy number variations among the Korean population, we compared individual genomes with the Korean reference genome assembly using the publicly available Korean HapMap SNP 50 k chip data from 90 individuals. Korean individuals exhibited 123 copy number variation regions (CNVRs) covering 27.2 Mb, equivalent to 1.0% of the genome in the copy number variation (CNV) analysis using the combined criteria of P value (P<0.01) and standard deviation of copy numbers (SD ≥ 0.25) among study subjects. In contrast, when compared to the Affymetrix reference genome assembly from multiple ethnic groups, considerably more CNVRs (n=643) were detected in larger proportions (5.0%) of the genome covering 135.1 Mb even by more stringent criteria (P<0.001 and SD ≥ 0.25), reflecting ethnic diversity of structural variations between Korean and other populations. Some CNVRs were validated by the quantitative multiplex PCR of short fluorescent fragment (QMPSF) method, and then copy number invariant regions were detected among the study subjects. These copy number invariant regions would be used as good internal controls for further CNV studies. Lastly, we demonstrated that the CNV information could stratify even a single ethnic population with a proper reference genome assembly from multiple heterogeneous populations.
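The combined criteria described above (a P-value cutoff together with a minimum standard deviation of copy numbers across subjects) can be expressed as a simple filter. This is a minimal sketch under stated assumptions: the abstract does not say how the P value is obtained or whether a sample or population standard deviation is used, so the P value is taken as an input and the sample standard deviation is assumed. The function name and the example values are hypothetical.

```python
from statistics import stdev


def passes_cnvr_criteria(p_value, copy_numbers, max_p=0.01, min_sd=0.25):
    """Return True when a region meets both criteria quoted in the abstract.

    p_value      -- per-region P value (its derivation is not given in the abstract)
    copy_numbers -- estimated copy number of the region in each study subject
    max_p        -- P-value cutoff (0.01, or 0.001 for the more stringent setting)
    min_sd       -- minimum standard deviation of copy numbers (0.25)
    """
    # Sample standard deviation is an assumption of this sketch.
    return p_value < max_p and stdev(copy_numbers) >= min_sd


# Made-up regions, for illustration only.
variable_region = [2.0, 2.0, 3.0, 1.0, 2.0]
stable_region = [2.0, 2.05, 1.95, 2.0]

print(passes_cnvr_criteria(0.001, variable_region))
print(passes_cnvr_criteria(0.001, stable_region))
```

Requiring both conditions means a region with a small P value but little spread in copy number across subjects is not reported as a CNVR.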
Allele-Skewed DNA Modification in the Brain: Relevance to a Schizophrenia GWAS
Gagliano, Sarah A.; Ptak, Carolyn; Mak, Denise Y.F.; Shamsi, Mehrdad; Oh, Gabriel; Knight, Joanne; Boutros, Paul C.; Petronis, Arturas
2016-01-01
Numerous recent studies have suggested that phenotypic effects of DNA sequence variants can be mediated or modulated by their epigenetic marks, such as allele-skewed DNA modification (ASM). Using Affymetrix SNP microarrays, we performed a comprehensive search of ASM effects in human post-mortem brain and sperm samples (total n = 256) from individuals with major psychosis and control individuals. Depending on the phenotypic category of the brain samples, 1.4%–7.5% of interrogated SNPs exhibited ASM effects. Next, we investigated ASM in the context of genetic studies of schizophrenia and detected that brain ASM SNPs were significantly overrepresented among sub-threshold SNPs from a schizophrenia genome-wide association study (GWAS). Brain ASM SNPs showed a much stronger enrichment in a schizophrenia GWAS than in 17 large GWASs of non-psychiatric diseases and traits, arguing that ASM effects are at least partially tissue specific. Studies of germline and control brain ASM SNPs supported a causal association between ASM and schizophrenia. Finally, significantly higher proportions of ASM SNPs than of non-ASM SNPs were detected at loci exhibiting epigenetic signatures of enhancers and promoters, and they were overrepresented within transcription factor binding regions and DNase I hypersensitive sites. All of these findings collectively indicate that ASM SNPs should be prioritized in follow-up GWASs. PMID:27087318
Mokhtar, Siti Shuhada; Marshall, Christian R.; Phipps, Maude E.; Thiruvahindrapuram, Bhooma; Lionel, Anath C.; Scherer, Stephen W.; Peng, Hoh Boon
2014-01-01
Copy number variation (CNV) has been recognized as a major contributor to human genome diversity. It plays an important role in determining phenotypes and has been associated with a number of common and complex diseases. However, CNV data from diverse populations are still limited. Here we report the first investigation of CNV in the indigenous populations from Peninsular Malaysia. We genotyped 34 Negrito genomes from Peninsular Malaysia using the Affymetrix SNP 6.0 microarray and identified 48 putative novel CNVs, consisting of 24 gains and 24 losses, of which 5 were identified in at least 2 unrelated samples. These CNVs appear unique to the Negrito population and were absent in the DGV, HapMap3 and Singapore Genome Variation Project (SGVP) datasets. Analysis of gene ontology revealed that genes within these CNVs were enriched in the immune system (GO:0002376), response to stimulus mechanisms (GO:0050896), the metabolic pathways (GO:0001852), as well as regulation of transcription (GO:0006355). Copy number gains in CNV regions (CNVRs) enriched with genes were significantly higher than the losses (P value <0.001). In view of the small population size, relative isolation and semi-nomadic lifestyles of this community, we speculate that these CNVs may be attributed to recent local adaptation of Negritos from Peninsular Malaysia. PMID:24956385
Pierson, Tyler Mark; Simeonov, Dimitre R; Sincan, Murat; Adams, David A; Markello, Thomas; Golas, Gretchen; Fuentes-Fajardo, Karin; Hansen, Nancy F; Cherukuri, Praveen F; Cruz, Pedro; Blackstone, Craig; Tifft, Cynthia; Boerkoel, Cornelius F; Gahl, William A
2012-01-01
Fatty acid hydroxylase-associated neurodegeneration due to fatty acid 2-hydroxylase deficiency presents with a wide range of phenotypes including spastic paraplegia, leukodystrophy, and/or brain iron deposition. All previously described families with this disorder were consanguineous, with homozygous mutations in the probands. We describe a 10-year-old male, from a non-consanguineous family, with progressive spastic paraplegia, dystonia, ataxia, and cognitive decline associated with a sural axonal neuropathy. The use of high-throughput sequencing techniques combined with SNP array analyses revealed a novel paternally derived missense mutation and an overlapping novel maternally derived ∼28-kb genomic deletion in FA2H. This patient provides further insight into the consistent features of this disorder and expands our understanding of its phenotypic presentation. The presence of a sural nerve axonal neuropathy had not been previously associated with this disorder and so may extend the phenotype. PMID:22146942
Li, X; Buitenhuis, A J; Lund, M S; Li, C; Sun, D; Zhang, Q; Poulsen, N A; Su, G
2015-11-01
The identification of causal genes or genomic regions associated with fatty acids (FA) will enhance our understanding of the pathways underlying FA synthesis and provide opportunities for changing milk fat composition through a genetic approach. The linkage disequilibrium between adjacent markers is highly consistent between the Chinese and Danish Holstein populations, such that a joint genome-wide association study (GWAS) can be performed. In this study, a joint GWAS was performed for 16 milk FA traits based on data of 784 Chinese and 371 Danish Holstein cows genotyped by a high-density bovine single nucleotide polymorphism (SNP) array. A total of 486,464 SNP markers on 29 bovine autosomes were used. Bonferroni corrections were applied to adjust the significance thresholds for multiple testing at the genome- and chromosome-wide levels. According to the analysis of either the Chinese or Danish data individually, the total numbers of overlapping SNP that were significant at the chromosome level were 94 for C14:1, 208 for the C14 index, and 1 for C18:0. Joint analysis using the combined data of the 2 populations detected greater numbers of significant SNP compared with either of the individual populations alone for 7 and 10 traits at the genome- and chromosome-wide significance levels, respectively. Greater numbers of significant SNP were detected for C18:0 and the C18 index in the Chinese population compared with the joint analysis. Sixty-five significant SNP across all traits had significantly different effects in the 2 populations. Ten FA were influenced by a quantitative trait locus (QTL) region including DGAT1. Both C14:1 and the C14 index were influenced by a QTL region including SCD1 in the combined population. Other QTL regions also showed significant associations with the studied FA. A large region (14.9-24.9 Mbp) in BTA26 significantly influenced C14:1 and the C14 index in both populations, most likely due to the SNP in SCD1.
A QTL region (69.97-73.69 Mbp) on BTA9 showed a significantly different effect on C18:0 between the 2 populations. Detection of these important SNP and the corresponding QTL regions will be helpful for follow-up studies to identify causal mutations and their interaction with environments for milk FA in dairy cattle. Copyright © 2015 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
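The Bonferroni adjustment mentioned in this abstract divides a nominal error rate by the number of tests. The sketch below illustrates the arithmetic for genome-wide and chromosome-wide cutoffs. It assumes a nominal level of 0.05, which the abstract does not state, and uses a hypothetical SNP count for a single chromosome; only the total of 486,464 SNP comes from the abstract.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test p-value cutoff that holds the family-wise error rate at alpha."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


ALPHA = 0.05                   # assumed nominal level; not stated in the abstract
N_SNPS_GENOME = 486_464        # total SNP count reported in the abstract
N_SNPS_ON_CHROMOSOME = 20_000  # hypothetical count for one chromosome

genome_wide = bonferroni_threshold(ALPHA, N_SNPS_GENOME)
chromosome_wide = bonferroni_threshold(ALPHA, N_SNPS_ON_CHROMOSOME)

print(f"genome-wide cutoff:     {genome_wide:.3e}")
print(f"chromosome-wide cutoff: {chromosome_wide:.3e}")
```

Because a chromosome carries fewer SNP than the whole genome, the chromosome-wide cutoff is less stringent, which is consistent with the abstract reporting more traits with significant SNP at the chromosome-wide level.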
Wong, Gerard; Leckie, Christopher; Kowalczyk, Adam
2012-01-15
Feature selection is a key concept in machine learning for microarray datasets, where features represented by probesets are typically several orders of magnitude larger than the available sample size. Computational tractability is a key challenge for feature selection algorithms in handling very high-dimensional datasets beyond a hundred thousand features, such as in datasets produced on single nucleotide polymorphism microarrays. In this article, we present a novel feature set reduction approach that enables scalable feature selection on datasets with hundreds of thousands of features and beyond. Our approach enables more efficient handling of higher resolution datasets to achieve better disease subtype classification of samples for potentially more accurate diagnosis and prognosis, which allows clinicians to make more informed decisions in regards to patient treatment options. We applied our feature set reduction approach to several publicly available cancer single nucleotide polymorphism (SNP) array datasets and evaluated its performance in terms of its multiclass predictive classification accuracy over different cancer subtypes, its speedup in execution as well as its scalability with respect to sample size and array resolution. Feature Set Reduction (FSR) was able to reduce the dimensions of an SNP array dataset by more than two orders of magnitude while achieving at least equal, and in most cases superior predictive classification performance over that achieved on features selected by existing feature selection methods alone. An examination of the biological relevance of frequently selected features from FSR-reduced feature sets revealed strong enrichment in association with cancer. FSR was implemented in MATLAB R2010b and is available at http://ww2.cs.mu.oz.au/~gwong/FSR.
Blood pressure loci identified with a gene-centric array.
Johnson, Toby; Gaunt, Tom R; Newhouse, Stephen J; Padmanabhan, Sandosh; Tomaszewski, Maciej; Kumari, Meena; Morris, Richard W; Tzoulaki, Ioanna; O'Brien, Eoin T; Poulter, Neil R; Sever, Peter; Shields, Denis C; Thom, Simon; Wannamethee, Sasiwarang G; Whincup, Peter H; Brown, Morris J; Connell, John M; Dobson, Richard J; Howard, Philip J; Mein, Charles A; Onipinla, Abiodun; Shaw-Hawkins, Sue; Zhang, Yun; Davey Smith, George; Day, Ian N M; Lawlor, Debbie A; Goodall, Alison H; Fowkes, F Gerald; Abecasis, Gonçalo R; Elliott, Paul; Gateva, Vesela; Braund, Peter S; Burton, Paul R; Nelson, Christopher P; Tobin, Martin D; van der Harst, Pim; Glorioso, Nicola; Neuvrith, Hani; Salvi, Erika; Staessen, Jan A; Stucchi, Andrea; Devos, Nabila; Jeunemaitre, Xavier; Plouin, Pierre-François; Tichet, Jean; Juhanson, Peeter; Org, Elin; Putku, Margus; Sõber, Siim; Veldre, Gudrun; Viigimaa, Margus; Levinsson, Anna; Rosengren, Annika; Thelle, Dag S; Hastie, Claire E; Hedner, Thomas; Lee, Wai K; Melander, Olle; Wahlstrand, Björn; Hardy, Rebecca; Wong, Andrew; Cooper, Jackie A; Palmen, Jutta; Chen, Li; Stewart, Alexandre F R; Wells, George A; Westra, Harm-Jan; Wolfs, Marcel G M; Clarke, Robert; Franzosi, Maria Grazia; Goel, Anuj; Hamsten, Anders; Lathrop, Mark; Peden, John F; Seedorf, Udo; Watkins, Hugh; Ouwehand, Willem H; Sambrook, Jennifer; Stephens, Jonathan; Casas, Juan-Pablo; Drenos, Fotios; Holmes, Michael V; Kivimaki, Mika; Shah, Sonia; Shah, Tina; Talmud, Philippa J; Whittaker, John; Wallace, Chris; Delles, Christian; Laan, Maris; Kuh, Diana; Humphries, Steve E; Nyberg, Fredrik; Cusi, Daniele; Roberts, Robert; Newton-Cheh, Christopher; Franke, Lude; Stanton, Alice V; Dominiczak, Anna F; Farrall, Martin; Hingorani, Aroon D; Samani, Nilesh J; Caulfield, Mark J; Munroe, Patricia B
2011-12-09
Raised blood pressure (BP) is a major risk factor for cardiovascular disease. Previous studies have identified 47 distinct genetic variants robustly associated with BP, but collectively these explain only a few percent of the heritability for BP phenotypes. To find additional BP loci, we used a bespoke gene-centric array to genotype an independent discovery sample of 25,118 individuals that combined hypertensive case-control and general population samples. We followed up four SNPs associated with BP at our p < 8.56 × 10⁻⁷ study-specific significance threshold and six suggestively associated SNPs in a further 59,349 individuals. We identified and replicated a SNP at LSP1/TNNT3, a SNP at MTHFR-NPPB independent (r² = 0.33) of previous reports, and replicated SNPs at AGT and ATP2B1 reported previously. An analysis of combined discovery and follow-up data identified SNPs significantly associated with BP at p < 8.56 × 10⁻⁷ at four further loci (NPR3, HFE, NOS3, and SOX6). The high number of discoveries made with modest genotyping effort can be attributed to using a large-scale yet targeted genotyping array and to the development of a weighting scheme that maximized power when meta-analyzing results from samples ascertained with extreme phenotypes, in combination with results from nonascertained or population samples. Chromatin immunoprecipitation and transcript expression data highlight potential gene regulatory mechanisms at the MTHFR and NOS3 loci. These results provide candidates for further study to help dissect mechanisms affecting BP and highlight the utility of studying SNPs and samples that are independent of those studied previously even when the sample size is smaller than that in previous studies. Copyright © 2011 The American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.
Assumption-free estimation of the genetic contribution to refractive error across childhood.
Guggenheim, Jeremy A; St Pourcain, Beate; McMahon, George; Timpson, Nicholas J; Evans, David M; Williams, Cathy
2015-01-01
Studies in relatives have generally yielded high heritability estimates for refractive error: twins 75-90%, families 15-70%. However, because related individuals often share a common environment, these estimates are inflated (via misallocation of unique/common environment variance). We calculated a lower-bound heritability estimate for refractive error free from such bias. Between the ages 7 and 15 years, participants in the Avon Longitudinal Study of Parents and Children (ALSPAC) underwent non-cycloplegic autorefraction at regular research clinics. At each age, an estimate of the variance in refractive error explained by single nucleotide polymorphism (SNP) genetic variants was calculated using genome-wide complex trait analysis (GCTA) using high-density genome-wide SNP genotype information (minimum N at each age=3,404). The variance in refractive error explained by the SNPs ("SNP heritability") was stable over childhood: Across age 7-15 years, SNP heritability averaged 0.28 (SE=0.08, p<0.001). The genetic correlation for refractive error between visits varied from 0.77 to 1.00 (all p<0.001) demonstrating that a common set of SNPs was responsible for the genetic contribution to refractive error across this period of childhood. Simulations suggested lack of cycloplegia during autorefraction led to a small underestimation of SNP heritability (adjusted SNP heritability=0.35; SE=0.09). To put these results in context, the variance in refractive error explained (or predicted) by the time participants spent outdoors was <0.005 and by the time spent reading was <0.01, based on a parental questionnaire completed when the child was aged 8-9 years old. Genetic variation captured by common SNPs explained approximately 35% of the variation in refractive error between unrelated subjects. 
This value sets an upper limit for predicting refractive error using existing SNP genotyping arrays, although higher-density genotyping in larger samples and inclusion of interaction effects is expected to raise this figure toward twin- and family-based heritability estimates. The same SNPs influenced refractive error across much of childhood. Notwithstanding the strong evidence of association between time outdoors and myopia, and time reading and myopia, less than 1% of the variance in myopia at age 15 was explained by crude measures of these two risk factors, indicating that their effects may be limited, at least when averaged over the whole population.
Dynamic variable selection in SNP genotype autocalling from APEX microarray data.
Podder, Mohua; Welch, William J; Zamar, Ruben H; Tebbutt, Scott J
2006-11-30
Single nucleotide polymorphisms (SNPs) are DNA sequence variations, occurring when a single nucleotide--adenine (A), thymine (T), cytosine (C) or guanine (G)--is altered. Arguably, SNPs account for more than 90% of human genetic variation. Our laboratory has developed a highly redundant SNP genotyping assay consisting of multiple probes with signals from multiple channels for a single SNP, based on arrayed primer extension (APEX). This mini-sequencing method is a powerful combination of a highly parallel microarray with distinctive Sanger-based dideoxy terminator sequencing chemistry. Using this microarray platform, our current genotype calling system (known as SNP Chart) is capable of calling single SNP genotypes by manual inspection of the APEX data, which is time-consuming and exposed to user subjectivity bias. Using a set of 32 Coriell DNA samples plus three negative PCR controls as a training data set, we have developed a fully-automated genotyping algorithm based on simple linear discriminant analysis (LDA) using dynamic variable selection. The algorithm combines separate analyses based on the multiple probe sets to give a final posterior probability for each candidate genotype. We have tested our algorithm on a completely independent data set of 270 DNA samples, with validated genotypes, from patients admitted to the intensive care unit (ICU) of St. Paul's Hospital (plus one negative PCR control sample). Our method achieves a concordance rate of 98.9% with a 99.6% call rate for a set of 96 SNPs. By adjusting the threshold value for the final posterior probability of the called genotype, the call rate reduces to 94.9% with a higher concordance rate of 99.6%. We also reversed the two independent data sets in their training and testing roles, achieving a concordance rate up to 99.8%. The strength of this APEX chemistry-based platform is its unique redundancy having multiple probes for a single SNP. 
Our model-based genotype calling algorithm captures the redundancy in the system considering all the underlying probe features of a particular SNP, automatically down-weighting any 'bad data' corresponding to image artifacts on the microarray slide or failure of a specific chemistry. In this regard, our method is able to automatically select the probes which work well and reduce the effect of other so-called bad performing probes in a sample-specific manner, for any number of SNPs.
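The trade-off between call rate and concordance described in this abstract comes from thresholding the posterior probability of the called genotype. The sketch below illustrates that idea with a deliberately simplified, one-feature linear discriminant analysis (class means, class priors, pooled variance). It is not the authors' algorithm: it omits the multiple probe sets and the dynamic variable selection, and the feature, the training values, the function names and the 0.95 threshold are all hypothetical.

```python
import math
from collections import defaultdict


def fit_lda(xs, labels):
    """Fit a one-feature LDA: per-class means, class priors, pooled variance."""
    groups = defaultdict(list)
    for x, lab in zip(xs, labels):
        groups[lab].append(x)
    means = {lab: sum(v) / len(v) for lab, v in groups.items()}
    priors = {lab: len(v) / len(xs) for lab, v in groups.items()}
    # Pooled within-class variance, shared by all classes (the LDA assumption).
    ss = sum((x - means[lab]) ** 2 for x, lab in zip(xs, labels))
    pooled_var = ss / (len(xs) - len(groups))
    return means, priors, pooled_var


def posteriors(x, model):
    """Posterior probability of each genotype class for signal x."""
    means, priors, var = model
    log_scores = {
        lab: math.log(priors[lab]) - (x - means[lab]) ** 2 / (2 * var)
        for lab in means
    }
    top = max(log_scores.values())  # subtract the maximum for numerical stability
    weights = {lab: math.exp(s - top) for lab, s in log_scores.items()}
    total = sum(weights.values())
    return {lab: w / total for lab, w in weights.items()}


def call_genotype(x, model, min_posterior=0.95):
    """Return the most probable genotype, or None (no-call) below the threshold."""
    post = posteriors(x, model)
    best = max(post, key=post.get)
    return best if post[best] >= min_posterior else None


# Made-up training signals (for example, a log ratio of two allele channels).
train_x = [-2.1, -1.9, -2.0, 0.1, -0.1, 0.0, 2.0, 1.9, 2.1]
train_y = ["AA"] * 3 + ["AB"] * 3 + ["BB"] * 3
model = fit_lda(train_x, train_y)

for signal in (-2.0, -1.0, 1.95):
    print(signal, call_genotype(signal, model))
```

Raising `min_posterior` turns more ambiguous signals into no-calls, which lowers the call rate while tending to raise concordance among the calls that remain, the same direction of trade-off the abstract reports.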
Bangera, Rama; Correa, Katharina; Lhorente, Jean P; Figueroa, René; Yáñez, José M
2017-01-31
Salmon Rickettsial Syndrome (SRS) caused by Piscirickettsia salmonis is a major disease affecting the Chilean salmon industry. Genomic selection (GS) is a method wherein genome-wide markers and phenotype information of full-sibs are used to predict genomic EBV (GEBV) of selection candidates and is expected to have increased accuracy and response to selection over traditional pedigree based Best Linear Unbiased Prediction (PBLUP). Widely used GS methods such as genomic BLUP (GBLUP), SNPBLUP, Bayes C and Bayesian Lasso may perform differently with respect to accuracy of GEBV prediction. Our aim was to compare the accuracy, in terms of reliability of genome-enabled prediction, from different GS methods with PBLUP for resistance to SRS in an Atlantic salmon breeding program. Number of days to death (DAYS), binary survival status (STATUS) phenotypes, and 50 K SNP array genotypes were obtained from 2601 smolts challenged with P. salmonis. The reliability of different GS methods at different SNP densities with and without pedigree were compared to PBLUP using a five-fold cross validation scheme. Heritability estimated from GS methods was significantly higher than PBLUP. Pearson's correlation between predicted GEBV from PBLUP and GS models ranged from 0.79 to 0.91 and 0.79-0.95 for DAYS and STATUS, respectively. The relative increase in reliability from different GS methods for DAYS and STATUS with 50 K SNP ranged from 8 to 25% and 27-30%, respectively. All GS methods outperformed PBLUP at all marker densities. DAYS and STATUS showed superior reliability over PBLUP even at the lowest marker density of 3 K and 500 SNP, respectively. 20 K SNP showed close to maximal reliability for both traits with little improvement using higher densities. These results indicate that genomic predictions can accelerate genetic progress for SRS resistance in Atlantic salmon and implementation of this approach will contribute to the control of SRS in Chile. 
We recommend GBLUP for routine GS evaluation because this method is computationally faster and its results are very similar to those of other GS methods. The use of lower-density SNP panels, or the combination of low-density SNP panels and an imputation strategy, may help to reduce genotyping costs without compromising gain in reliability.
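As a rough illustration of what GBLUP builds on, the sketch below computes a genomic relationship matrix from SNP allele counts. The abstract does not say how the relationship matrix was constructed, so the VanRaden-style scaling used here is an assumption, and the function name and toy genotypes are hypothetical.

```python
def genomic_relationship_matrix(genotypes):
    """Return a VanRaden-style genomic relationship matrix G.

    genotypes: list of rows (one per animal), each a list of allele
    counts coded 0, 1 or 2 (one entry per SNP).
    Assumption: G = Z Z' / (2 * sum(p * (1 - p))), with Z the genotype
    matrix centred by twice the observed allele frequency p.
    """
    n_animals = len(genotypes)
    n_snps = len(genotypes[0])
    # Observed allele frequency of the counted allele at each SNP.
    freqs = [sum(row[j] for row in genotypes) / (2.0 * n_animals)
             for j in range(n_snps)]
    # Centre every SNP column so that its mean is zero.
    centred = [[row[j] - 2.0 * freqs[j] for j in range(n_snps)]
               for row in genotypes]
    scale = 2.0 * sum(p * (1.0 - p) for p in freqs)
    if scale == 0:
        raise ValueError("all SNPs are monomorphic")
    return [[sum(a * b for a, b in zip(row_i, row_k)) / scale
             for row_k in centred]
            for row_i in centred]


# Hypothetical toy data: 4 animals genotyped at 5 SNPs.
toy_genotypes = [
    [0, 1, 2, 1, 0],
    [1, 1, 2, 0, 0],
    [2, 0, 1, 1, 1],
    [1, 2, 0, 2, 1],
]
G = genomic_relationship_matrix(toy_genotypes)
```

In a GBLUP model this matrix would take the place of the pedigree-based relationship matrix used by PBLUP; the mixed-model solving step itself is not shown.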
A Transcriptomic Comparison of Two Bambara Groundnut Landraces under Dehydration Stress
Khan, Faraz; Chai, Hui Hui; Ajmera, Ishan; Hodgman, Charlie; Mayes, Sean; Lu, Chungui
2017-01-01
The ability to grow crops under low-water conditions is a significant advantage in relation to global food security. Bambara groundnut is an underutilised crop grown by subsistence farmers in Africa and is known to survive in regions of water deficit. This study focuses on the analysis of the transcriptomic changes in two bambara groundnut landraces in response to dehydration stress. A cross-species hybridisation approach based on the Soybean Affymetrix GeneChip array has been employed. The differential gene expression analysis of a water-limited treatment, however, showed that the two landraces responded with almost completely different sets of genes. Hence, both landraces with very similar genotypes (as assessed by the hybridisation of genomic DNA onto the Soybean Affymetrix GeneChip) showed contrasting transcriptional behaviour in response to dehydration stress. In addition, both genotypes showed a high expression of dehydration-associated genes, even under water-sufficient conditions. Several gene regulators were identified as potentially important. Some are already known, such as WRKY40, but others may also be considered, namely PRR7, ATAUX2-11, CONSTANS-like 1, MYB60, AGL-83, and a Zinc-finger protein. These data provide a basis for drought trait research in the bambara groundnut, which will facilitate functional genomics studies. An analysis of this dataset has identified that both genotypes appear to be in a dehydration-ready state, even in the absence of dehydration stress, and may have adapted in different ways to achieve drought resistance. This will help in understanding the mechanisms underlying the ability of crops to produce viable yields under drought conditions. In addition, cross-species hybridisation to the soybean microarray has been shown to be informative for investigating the bambara groundnut transcriptome. PMID:28420201
Melendez, Roberto I.; McGinty, Jacqueline F.; Kalivas, Peter W.; Becker, Howard C.
2014-01-01
Neuroadaptations that participate in the ontogeny of alcohol dependence are likely a result of altered gene expression in various brain regions. The present study investigated brain region-specific changes in the pattern and magnitude of gene expression immediately following chronic intermittent ethanol (CIE) exposure and 8 hours following final ethanol exposure [i.e. early withdrawal (EWD)]. High-density oligonucleotide microarrays (Affymetrix 430A 2.0, Affymetrix, Santa Clara, CA, USA) and bioinformatics analysis were used to characterize gene expression and function in the prefrontal cortex (PFC), hippocampus (HPC) and nucleus accumbens (NAc) of C57BL/6J mice (Jackson Laboratories, Bar Harbor, ME, USA). Gene expression levels were determined using gene chip robust multi-array average followed by statistical analysis of microarrays and validated by quantitative real-time reverse transcription polymerase chain reaction and Western blot analysis. Results indicated that immediately following CIE exposure, changes in gene expression were strikingly greater in the PFC (284 genes) compared with the HPC (16 genes) and NAc (32 genes). Bioinformatics analysis revealed that most of the transcriptionally responsive genes in the PFC were involved in Ras/MAPK signaling, notch signaling or ubiquitination. In contrast, during EWD, changes in gene expression were greatest in the HPC (139 genes) compared with the PFC (four genes) and NAc (eight genes). The most transcriptionally responsive genes in the HPC were involved in mRNA processing or actin dynamics. Of the few genes detected in the NAc, the most representative were involved in circadian rhythms. Overall, these findings indicate that brain region-specific and time-dependent neuroadaptive alterations in gene expression play an integral role in the development of alcohol dependence and withdrawal. PMID:21812870
Arakawa, Yusuke; Shimada, Mitsuo; Utsunomiya, Tohru; Imura, Satoru; Morine, Yuji; Ikemoto, Tetsuya; Mori, Hiroki; Kanamoto, Mami; Iwahashi, Shuichi; Saito, Yu; Takasu, Chie
2014-08-01
In general, the spleen is one of the abdominal organs connected by the portal system, and a splenectomy improves hepatic function in the setting of partial hepatectomy (Hx) for portal hypertensive cases or living donor liver transplantation with excessive portal vein flow. The precise mechanisms remain unclear; therefore, we investigated the gene expression profile in the spleen after 90% Hx in rats using complementary DNA microarray and pathway analysis. Messenger RNAs (mRNAs) were prepared from three rat spleens at each time point (0, 3, and 6 h after 90% Hx). The mRNA was hybridized to the Affymetrix GeneChip Rat Genome 230 2.0 Array (Affymetrix®) and pathway analysis was done with Ingenuity Pathway Analysis (IPA®). We determined the 3-h or 6-h/0-h ratio to assess the influence of Hx, and cut-off values were set at more than 2.0-fold or less than 1/2 (0.5)-fold. Chemokine activity-related genes, including Cxcl1 (GRO1) and Cxcl2 (MIP-2), and their related pathways were upregulated in the spleen. Immediate early response genes, including early growth response-1 (EGR1), FBJ murine osteosarcoma (FOS) and activating transcription factor 3 (ATF3), and their related pathways were also upregulated in the spleen. We concluded that numerous inflammation-related genes are expressed in the spleen after 90% Hx. The spleen could play a harmful role and have a negative impact during the post-Hx phase owing to the induction of chemokines and transcription factors including GRO1 and EGR1. © 2014 Journal of Gastroenterology and Hepatology Foundation and Wiley Publishing Asia Pty Ltd.
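The fold-change cut-offs described above (more than 2.0-fold or less than 0.5-fold relative to the 0-h sample) can be sketched as a simple classifier. This is only an illustration under stated assumptions: the function name, the gene labels and the signal values are hypothetical, and any normalisation applied before taking the ratio is not shown.

```python
def classify_by_fold_change(baseline, treated, up=2.0, down=0.5):
    """Label a gene by the treated/baseline expression ratio.

    baseline: signal at 0 h; treated: signal at 3 h or 6 h.
    up / down: cut-offs taken from the abstract (2.0-fold and 0.5-fold).
    """
    if baseline <= 0:
        raise ValueError("baseline signal must be positive")
    ratio = treated / baseline
    if ratio > up:
        return "up"
    if ratio < down:
        return "down"
    return "unchanged"


# Hypothetical signals as (0-h value, 3-h value); not data from the study.
signals = {
    "gene_a": (100.0, 350.0),   # ratio 3.5
    "gene_b": (200.0, 80.0),    # ratio 0.4
    "gene_c": (150.0, 160.0),   # ratio about 1.07
}
calls = {gene: classify_by_fold_change(b, t) for gene, (b, t) in signals.items()}
```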
Kulski, Jerzy K; Kenworthy, William; Bellgard, Matthew; Taplin, Ross; Okamoto, Koichi; Oka, Akira; Mabuchi, Tomotaka; Ozawa, Akira; Tamiya, Gen; Inoko, Hidetoshi
2005-12-01
Gene expression profiling was performed on biopsies of affected and unaffected psoriatic skin and normal skin from seven Japanese patients to obtain insights into the pathways that control this disease. HUG95A Affymetrix DNA chips that contained oligonucleotide arrays of approximately 12,000 well-characterized human genes were used in the study. The statistical analysis of the Affymetrix data, based on the ranking of the Student t-test statistic, revealed a complex regulation of molecular stress and immune gene responses. The majority of the 266 induced genes in affected and unaffected psoriatic skin were involved with interferon mediation, immunity, cell adhesion, cytoskeleton restructuring, protein trafficking and degradation, RNA regulation and degradation, signalling transduction, apoptosis and atypical epidermal cellular proliferation and differentiation. The disturbances in the normal protein degradation equilibrium of skin were reflected by the significant increase in the gene expression of various protease inhibitors and proteinases, including the induced components of the ATP/ubiquitin-dependent non-lysosomal proteolytic pathway that is involved with peptide processing and presentation to T cells. Some of the up-regulated genes, such as TGM1, IVL, FABP5, CSTA and SPRR, are well-known psoriatic markers involved in atypical epidermal cellular organization and differentiation. In the comparison between the affected and unaffected psoriatic skin, the transcription factor JUNB was found at the top of the statistical rankings for the up-regulated genes in affected skin, suggesting that it has an important but as yet undefined role in psoriasis. Our gene expression data and analysis suggest that psoriasis is a chronic interferon- and T-cell-mediated immune disease of the skin where the imbalance in epidermal cellular structure, growth and differentiation arises from the molecular antiviral stress signals initiating inappropriate immune responses.
Ram, R; Wakil, S M; Muiya, N P; Andres, E; Mazhar, N; Hagos, S; Alshahid, M; Meyer, B F; Morahan, G; Dzimiri, N
2017-03-01
Hypertriglyceridemia (hTG) is a lipid disorder, resulting from an elevation in triglyceride levels, with a strong genetic component. It constitutes a significant risk factor for coronary artery disease (CAD), a leading cause of death worldwide. In this study, we performed a common variant association study for hTG in ethnic Saudi Arabs. We genotyped 5501 individuals in a two-phase experiment using the Affymetrix Axiom® Genome-Wide CEU 1 Array (Affymetrix, Santa Clara, CA), which contains a total of 587,352 single nucleotide polymorphisms (SNPs). The lead variant was rs1558861 [1.99 (1.73-2.30); p = 7.37 × 10⁻²²], residing on chromosome (chr) 11 at the apolipoprotein A-I/A-5 (APOA1/APOA5) locus. The rs780094 variant [1.34 (1.21-1.49); p = 8.57 × 10⁻⁸] on chr 2 at the glucokinase regulatory protein (GCKR) locus was similarly significantly associated, while rs10911205 [1.29 (1.16-1.44); p = 3.52 × 10⁻⁶] on chr 1 at the laminin subunit gamma-1 (LAMC1) locus showed suggestive association with disease. Furthermore, rs17145738 [0.68 (0.60-0.77); p = 6.69 × 10⁻⁹] on chr 7 at the carbohydrate-responsive element-binding protein-encoding (MLXIPL) gene locus displayed significant protective characteristics, while another variant, rs6982502 [0.76 (0.68-0.84); p = 5.31 × 10⁻⁷] on chr 8, showed similar but weaker properties. These findings were replicated in 317 cases vs 1415 controls from the same ethnic Arab population. Our study identified several variants across the human genome that are associated with hTG in ethnic Arabs. © 2016 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
Trejter, Marcin; Jopek, Karol; Celichowski, Piotr; Tyczewska, Marianna; Malendowicz, Ludwik K; Rucinski, Marcin
2015-01-01
Adrenocortical activity in various species is sensitive to androgens and estrogens. They may affect adrenal cortex growth and functioning either via central pathways (CRH and ACTH) or directly, via specific receptors expressed in the cortex and/or by interfering with adrenocortical enzymes, among them those involved in steroidogenesis. Only limited data on expression of androgen and estrogen receptors in adrenal glands are available. Therefore the present study aimed to characterize, at the level of mRNA, expression of these receptors in specific components of the adrenal cortex of intact adult male and female rats. Studies were performed on adult male and female (estrus) Wistar rats. Total RNA was isolated from adrenal zona glomerulosa (ZG) and fasciculata/reticularis (ZF/R). Gene expression was evaluated by means of the Affymetrix® Rat Gene 1.1 ST Array Strip and QPCR. By means of the Affymetrix® Rat Gene 1.1 ST Array we examined adrenocortical sex differences in the expression of nearly 30,000 genes. All data were analyzed in relation to the adrenals of the male rats. 32 genes were differentially expressed in ZG, and 233 genes in ZF/R. In the ZG, expression levels of 24 genes were lower and 8 higher in female rats. More distinct sex differences were observed in the ZF/R, in which expression levels of 146 genes were lower and 87 genes higher in female rats. These analyses did not reveal sex differences in the expression levels of either androgen (AR) or estrogen (ER) receptor genes in the adrenal cortex of male and female rats. Therefore the array data were validated by QPCR. QPCR revealed higher expression levels of the AR gene in both ZG and ZF/R of male than of female rats. On the other hand, QPCR did not reveal sex-related differences in the expression levels of ERα, ERβ and the non-genomic GPR30 (GPER-1) receptor. Of those genes, expression levels of ERα were the highest. In the studied adrenal samples the relative expression of ERα mRNA was higher than that of ERβ mRNA.
In adrenals of adult male and female rats, expression levels of estrogen-related receptors ERRα and ERRβ were similar, and only in the ZF/R of female rats were ERRγ expression levels significantly higher than in males. We also analyzed the expression profile of three isoforms of steroid 5α-reductase (Srd5a1, Srd5a2 and Srd5a3) and aromatase (Cyp19a1); expression levels of all these genes were similar in ZG and ZF/R of male and female rats. In contrast to the Affymetrix microarray data, QPCR revealed higher expression levels of the AR gene in adrenal glands of the male rats. In adrenals of both sexes, expression levels of ERα, ERβ, non-genomic GPR30 (GPER-1), ERRα and ERRβ receptors were comparable. The obtained results suggest that the acute steroidogenic effect of estrogens on corticosteroid secretion may be mediated by non-genomic GPR30.
2011-01-01
Background Single nucleotide polymorphisms (SNPs) are the most abundant source of genetic variation among individuals of a species. New genotyping technologies allow examining hundreds to thousands of SNPs in a single reaction for a wide range of applications such as genetic diversity analysis, linkage mapping, fine QTL mapping, association studies, marker-assisted or genome-wide selection. In this paper, we evaluated the potential of highly-multiplexed SNP genotyping for genetic mapping in maritime pine (Pinus pinaster Ait.), the main conifer used for commercial plantation in southwestern Europe. Results We designed a custom GoldenGate assay for 1,536 SNPs detected through the resequencing of gene fragments (707 in vitro SNPs/Indels) and from Sanger-derived Expressed Sequence Tags assembled into a unigene set (829 in silico SNPs/Indels). Offspring from three-generation outbred (G2) and inbred (F2) pedigrees were genotyped. The success rate of the assay was 63.6% and 74.8% for in silico and in vitro SNPs, respectively. A genotyping error rate of 0.4% was further estimated from segregating data of SNPs belonging to the same gene. Overall, 394 SNPs were available for mapping. A total of 287 SNPs were integrated with previously mapped markers in the G2 parental maps, while 179 SNPs were localized on the map generated from the analysis of the F2 progeny. Based on 98 markers segregating in both pedigrees, we were able to generate a consensus map comprising 357 SNPs from 292 different loci. Finally, the analysis of sequence homology between mapped markers and their orthologs in a Pinus taeda linkage map made it possible to align the 12 linkage groups of both species. Conclusions Our results show that the GoldenGate assay can be used successfully for high-throughput SNP genotyping in maritime pine, a conifer species that has a genome seven times the size of the human genome.
This SNP-array will be extended thanks to recent sequencing effort using new generation sequencing technologies and will include SNPs from comparative orthologous sequences that were identified in the present study, providing a wider collection of anchor points for comparative genomics among the conifers. PMID:21767361
Cuzick, Jack; Brentnall, Adam R; Segal, Corrinne; Byers, Helen; Reuter, Caroline; Detre, Simone; Lopez-Knowles, Elena; Sestak, Ivana; Howell, Anthony; Powles, Trevor J; Newman, William G; Dowsett, Mitchell
2017-03-01
Purpose At least 94 common single nucleotide polymorphisms (SNPs) are associated with breast cancer. The extent to which an SNP panel can refine risk in women who receive preventive therapy has not been directly assessed previously. Materials and Methods A risk score on the basis of 88 SNPs (SNP88) was investigated in a nested case-control study of women enrolled in the International Breast Intervention Study (IBIS-I) or the Royal Marsden study. A total of 359 women who developed cancer were matched to 636 controls by age, trial, follow-up time, and treatment arm. Genotyping was done using the OncoArray. Conditional logistic regression and matched concordance indices (mC) were used to measure the performance of SNP88 alone and with other breast cancer risk factors assessed using the Tyrer-Cuzick (TC) model. Results SNP88 was predictive of breast cancer risk overall (interquartile range odds ratio [IQ-OR], 1.37; 95% CI, 1.14 to 1.66; mC, 0.55), but mainly for estrogen receptor-positive disease (IQ-OR, 1.44; 95% CI, 1.16 to 1.79; P for heterogeneity = .10) versus estrogen receptor-negative disease. However, the observed risk of SNP88 was only 46% (95% CI, 19% to 74%) of expected. No significant interaction was observed with treatment arm (placebo IQ-OR, 1.46; 95% CI, 1.13 to 1.87; tamoxifen IQ-OR, 1.25; 95% CI, 0.96 to 1.64; P for heterogeneity = .5). The predictive power was similar to the TC model (IQ-OR, 1.45; 95% CI, 1.21 to 1.73; mC, 0.55), but SNP88 was independent of TC (Spearman rank-order correlation, 0.012; P = .7), and when combined multiplicatively, a substantial improvement was seen (IQ-OR, 1.64; 95% CI, 1.36 to 1.97; mC, 0.60). Conclusion A polygenic risk score may be used to refine risk from the TC or similar models in women who are at an elevated risk of breast cancer and considering preventive therapy. Recalibration may be necessary for accurate risk assessment.
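To make the idea of combining a SNP panel with a model-based risk "multiplicatively" concrete, here is a minimal sketch. It assumes a generic log-additive SNP score (the product of per-allele odds ratios raised to the risk-allele count); the abstract does not give the actual SNP88 formula or its normalisation, so every function name and number below is hypothetical.

```python
import math


def snp_risk_score(allele_counts, per_allele_odds_ratios):
    """Generic log-additive SNP score: product of OR ** allele count.

    Assumption: each SNP acts independently and multiplicatively; any
    normalisation to a population-average risk is left out.
    """
    if len(allele_counts) != len(per_allele_odds_ratios):
        raise ValueError("one odds ratio is required per SNP")
    log_score = sum(count * math.log(odds)
                    for count, odds in zip(allele_counts, per_allele_odds_ratios))
    return math.exp(log_score)


def combine_multiplicatively(model_relative_risk, snp_score):
    """Multiply a model-based relative risk by the SNP score."""
    return model_relative_risk * snp_score


# Hypothetical example: three SNPs and a model-based relative risk of 1.5.
counts = [0, 1, 2]
odds_ratios = [1.10, 0.90, 1.20]
score = snp_risk_score(counts, odds_ratios)       # 0.90 * 1.20**2
combined = combine_multiplicatively(1.5, score)
```

A multiplicative combination is natural when the two scores are close to independent, as the reported near-zero rank correlation between SNP88 and TC suggests.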
A Genome-Wide Breast Cancer Scan in African Americans
2010-06-01
SNPs from the African American breast cancer scan to COGs, a European collaborative study which has designed a SNP array that will be genotyped... Award Number: W81XWH-08-1-0383. Title: A Genome-wide Breast Cancer Scan in African Americans. Principal Investigator: Christopher A...
DOE Office of Scientific and Technical Information (OSTI.GOV)
McKown, Athena; Klapste, Jaroslav; Guy, Robert
2014-01-01
To uncover the genetic basis of phenotypic trait variation, we used 448 unrelated wild accessions of black cottonwood (Populus trichocarpa Torr. & Gray) from natural populations throughout western North America. Extensive information from large-scale trait phenotyping (with spatial and temporal replications within a common garden) and genotyping (with a 34K Populus SNP array) of all accessions were used for gene discovery in a genome-wide association study (GWAS).
Flanagan, Jonathan M.; Vege, Sunitha; Luban, Naomi L. C.; Brown, R. Clark; Ware, Russell E.; Westhoff, Connie M.
2017-01-01
RH genes are highly polymorphic and encode the most complex of the 35 human blood group systems. This genetic diversity contributes to Rh alloimmunization in patients with sickle cell anemia (SCA) and is not avoided by serologic Rh-matched red cell transfusions. Standard serologic testing does not distinguish variant Rh antigens. Single nucleotide polymorphism (SNP)–based DNA arrays detect many RHD and RHCE variants, but the number of alleles tested is limited. We explored a next-generation sequencing (NGS) approach using whole-exome sequencing (WES) in 27 Rh alloimmunized and 27 matched non-alloimmunized patients with SCA who received chronic red cell transfusions and were enrolled in a multicenter study. We demonstrate that WES provides a comprehensive RH genotype, identifies SNPs not interrogated by DNA array, and accurately determines RHD zygosity. Among this multicenter cohort, we demonstrate an association between an altered RH genotype and Rh alloimmunization: 52% of Rh immunized vs 19% of non-immunized patients expressed variant Rh without co-expression of the conventional protein. Our findings suggest that RH allele variation in patients with SCA is clinically relevant, and NGS technology can offer a comprehensive alternative to targeted SNP-based testing. This is particularly relevant as NGS data becomes more widely available and could provide the means for reducing Rh alloimmunization in children with SCA. PMID:29296782
Campoy, José Antonio; Lerigoleur-Balsemin, Emilie; Christmann, Hélène; Beauvieux, Rémi; Girollet, Nabil; Quero-García, José; Dirlewanger, Elisabeth; Barreneche, Teresa
2016-02-24
Depiction of the genetic diversity, linkage disequilibrium (LD) and population structure is essential for the efficient organization and exploitation of genetic resources. The objectives of this study were (i) to evaluate the genetic diversity and detect the patterns of LD, (ii) to estimate the levels of population structure and (iii) to identify a 'core collection' suitable for association genetic studies in sweet cherry. A total of 210 genotypes including modern cultivars and landraces from 16 countries were genotyped using the RosBREED cherry 6 K SNP array v1. Two groups, mainly bred cultivars and landraces, respectively, were first detected using STRUCTURE software and confirmed by Principal Coordinate Analysis (PCoA). Further analyses identified nine subgroups using STRUCTURE and Discriminant Analysis of Principal Components (DAPC). Several subgroups correspond to different eco-geographic regions of landrace distribution. Linkage disequilibrium was evaluated and showed lower values than in peach, the reference Prunus species. A 'core collection' containing 156 accessions was selected using the maximum length subtree method. The present study constitutes the first population genetics analysis in cultivated sweet cherry using a medium-density SNP (single nucleotide polymorphism) marker array. We provide estimates of linkage disequilibrium and genetic structure, and define a first INRA Sweet Cherry core collection useful for breeding programs, germplasm management and association genetics studies.
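For readers unfamiliar with how pairwise LD is usually quantified, the following sketch computes the standard r² statistic between two biallelic SNPs. It assumes phased haplotypes coded 0/1, which is a simplification: the abstract does not state which LD measure or software was used, and the function name and toy haplotypes are hypothetical.

```python
def ld_r_squared(haplotypes):
    """Return r^2 between two biallelic loci.

    haplotypes: list of (a, b) pairs, each allele coded 0 or 1,
    one pair per phased chromosome.
    r^2 = D^2 / (pA * (1 - pA) * pB * (1 - pB)), with D = pAB - pA * pB.
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("at least one haplotype is required")
    p_a = sum(a for a, _ in haplotypes) / n
    p_b = sum(b for _, b in haplotypes) / n
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    denominator = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denominator == 0:
        raise ValueError("r^2 is undefined when a locus is monomorphic")
    d = p_ab - p_a * p_b
    return d * d / denominator


# Hypothetical toy haplotypes.
perfect_ld = [(1, 1), (1, 1), (0, 0), (0, 0)]      # alleles always co-occur
no_ld = [(1, 1), (1, 0), (0, 1), (0, 0)]           # alleles independent
r2_perfect = ld_r_squared(perfect_ld)
r2_none = ld_r_squared(no_ld)
```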
Krug, Utz O.; Lee, Dhong Hyun Tony; Kawamata, Norihiko; Iwanski, Gabriela B.; Lasho, Terra; Weiss, Tamara; Nowak, Daniel; Koren-Michowitz, Maya; Kato, Motohiro; Sanada, Masashi; Shih, Lee-Yung; Nagler, Arnon; Raynaud, Sophie D.; Müller-Tidow, Carsten; Mesa, Ruben; Haferlach, Torsten; Gilliland, D. Gary; Tefferi, Ayalew; Ogawa, Seishi; Koeffler, H. Phillip
2010-01-01
Philadelphia chromosome–negative myeloproliferative neoplasms (MPNs) including polycythemia vera, essential thrombocythemia, and primary myelofibrosis show an inherent tendency for transformation into leukemia (MPN-blast phase), which is hypothesized to be accompanied by acquisition of additional genomic lesions. We, therefore, examined chromosomal abnormalities by high-resolution single nucleotide polymorphism (SNP) array in 88 MPN patients, as well as 71 cases with MPN-blast phase, and correlated these findings with their clinical parameters. Frequent genomic alterations were found in MPN after leukemic transformation with up to 3-fold more genomic changes per sample compared with samples in chronic phase (P < .001). We identified commonly altered regions involved in disease progression including not only established targets (ETV6, TP53, and RUNX1) but also new candidate genes on 7q, 16q, 19p, and 21q. Moreover, trisomy 8 or amplification of 8q24 (MYC) was almost exclusively detected in JAK2V617F− cases with MPN-blast phase. Remarkably, copy number–neutral loss of heterozygosity (CNN-LOH) on either 7q or 9p including homozygous JAK2V617F was related to decreased survival after leukemic transformation (P = .01 and P = .016, respectively). Our high-density SNP-array analysis of MPN genomes in the chronic compared with leukemic stage identified novel target genes and provided prognostic insights associated with the evolution to leukemia. PMID:20068225
Craddock, Nick; Hurles, Matthew E; Cardin, Niall; Pearson, Richard D; Plagnol, Vincent; Robson, Samuel; Vukcevic, Damjan; Barnes, Chris; Conrad, Donald F; Giannoulatou, Eleni; Holmes, Chris; Marchini, Jonathan L; Stirrups, Kathy; Tobin, Martin D; Wain, Louise V; Yau, Chris; Aerts, Jan; Ahmad, Tariq; Andrews, T Daniel; Arbury, Hazel; Attwood, Anthony; Auton, Adam; Ball, Stephen G; Balmforth, Anthony J; Barrett, Jeffrey C; Barroso, Inês; Barton, Anne; Bennett, Amanda J; Bhaskar, Sanjeev; Blaszczyk, Katarzyna; Bowes, John; Brand, Oliver J; Braund, Peter S; Bredin, Francesca; Breen, Gerome; Brown, Morris J; Bruce, Ian N; Bull, Jaswinder; Burren, Oliver S; Burton, John; Byrnes, Jake; Caesar, Sian; Clee, Chris M; Coffey, Alison J; Connell, John M C; Cooper, Jason D; Dominiczak, Anna F; Downes, Kate; Drummond, Hazel E; Dudakia, Darshna; Dunham, Andrew; Ebbs, Bernadette; Eccles, Diana; Edkins, Sarah; Edwards, Cathryn; Elliot, Anna; Emery, Paul; Evans, David M; Evans, Gareth; Eyre, Steve; Farmer, Anne; Ferrier, I Nicol; Feuk, Lars; Fitzgerald, Tomas; Flynn, Edward; Forbes, Alistair; Forty, Liz; Franklyn, Jayne A; Freathy, Rachel M; Gibbs, Polly; Gilbert, Paul; Gokumen, Omer; Gordon-Smith, Katherine; Gray, Emma; Green, Elaine; Groves, Chris J; Grozeva, Detelina; Gwilliam, Rhian; Hall, Anita; Hammond, Naomi; Hardy, Matt; Harrison, Pile; Hassanali, Neelam; Hebaishi, Husam; Hines, Sarah; Hinks, Anne; Hitman, Graham A; Hocking, Lynne; Howard, Eleanor; Howard, Philip; Howson, Joanna M M; Hughes, Debbie; Hunt, Sarah; Isaacs, John D; Jain, Mahim; Jewell, Derek P; Johnson, Toby; Jolley, Jennifer D; Jones, Ian R; Jones, Lisa A; Kirov, George; Langford, Cordelia F; Lango-Allen, Hana; Lathrop, G Mark; Lee, James; Lee, Kate L; Lees, Charlie; Lewis, Kevin; Lindgren, Cecilia M; Maisuria-Armer, Meeta; Maller, Julian; Mansfield, John; Martin, Paul; Massey, Dunecan C O; McArdle, Wendy L; McGuffin, Peter; McLay, Kirsten E; Mentzer, Alex; Mimmack, Michael L; Morgan, Ann E; Morris, Andrew P; 
Mowat, Craig; Myers, Simon; Newman, William; Nimmo, Elaine R; O'Donovan, Michael C; Onipinla, Abiodun; Onyiah, Ifejinelo; Ovington, Nigel R; Owen, Michael J; Palin, Kimmo; Parnell, Kirstie; Pernet, David; Perry, John R B; Phillips, Anne; Pinto, Dalila; Prescott, Natalie J; Prokopenko, Inga; Quail, Michael A; Rafelt, Suzanne; Rayner, Nigel W; Redon, Richard; Reid, David M; Renwick; Ring, Susan M; Robertson, Neil; Russell, Ellie; St Clair, David; Sambrook, Jennifer G; Sanderson, Jeremy D; Schuilenburg, Helen; Scott, Carol E; Scott, Richard; Seal, Sheila; Shaw-Hawkins, Sue; Shields, Beverley M; Simmonds, Matthew J; Smyth, Debbie J; Somaskantharajah, Elilan; Spanova, Katarina; Steer, Sophia; Stephens, Jonathan; Stevens, Helen E; Stone, Millicent A; Su, Zhan; Symmons, Deborah P M; Thompson, John R; Thomson, Wendy; Travers, Mary E; Turnbull, Clare; Valsesia, Armand; Walker, Mark; Walker, Neil M; Wallace, Chris; Warren-Perry, Margaret; Watkins, Nicholas A; Webster, John; Weedon, Michael N; Wilson, Anthony G; Woodburn, Matthew; Wordsworth, B Paul; Young, Allan H; Zeggini, Eleftheria; Carter, Nigel P; Frayling, Timothy M; Lee, Charles; McVean, Gil; Munroe, Patricia B; Palotie, Aarno; Sawcer, Stephen J; Scherer, Stephen W; Strachan, David P; Tyler-Smith, Chris; Brown, Matthew A; Burton, Paul R; Caulfield, Mark J; Compston, Alastair; Farrall, Martin; Gough, Stephen C L; Hall, Alistair S; Hattersley, Andrew T; Hill, Adrian V S; Mathew, Christopher G; Pembrey, Marcus; Satsangi, Jack; Stratton, Michael R; Worthington, Jane; Deloukas, Panos; Duncanson, Audrey; Kwiatkowski, Dominic P; McCarthy, Mark I; Ouwehand, Willem; Parkes, Miles; Rahman, Nazneen; Todd, John A; Samani, Nilesh J; Donnelly, Peter
2010-04-01
Copy number variants (CNVs) account for a major proportion of human genetic polymorphism and have been predicted to have an important role in genetic susceptibility to common disease. To address this we undertook a large, direct genome-wide study of association between CNVs and eight common human diseases. Using a purpose-designed array we typed approximately 19,000 individuals into distinct copy-number classes at 3,432 polymorphic CNVs, including an estimated 50% of all common CNVs larger than 500 base pairs. We identified several biological artefacts that lead to false-positive associations, including systematic CNV differences between DNAs derived from blood and cell lines. Association testing and follow-up replication analyses confirmed three loci where CNVs were associated with disease (IRGM for Crohn's disease, HLA for Crohn's disease, rheumatoid arthritis and type 1 diabetes, and TSPAN8 for type 2 diabetes), although in each case the locus had previously been identified in single nucleotide polymorphism (SNP)-based studies, reflecting our observation that most common CNVs that are well-typed on our array are well tagged by SNPs and so have been indirectly explored through SNP studies. We conclude that common CNVs that can be typed on existing platforms are unlikely to contribute greatly to the genetic basis of common human diseases.
Comprehensive comparison of three commercial human whole-exome capture platforms.
Asan; Xu, Yu; Jiang, Hui; Tyler-Smith, Chris; Xue, Yali; Jiang, Tao; Wang, Jiawei; Wu, Mingzhi; Liu, Xiao; Tian, Geng; Wang, Jun; Wang, Jian; Yang, Huangming; Zhang, Xiuqing
2011-09-28
Exome sequencing, which allows the global analysis of protein coding sequences in the human genome, has become an effective and affordable approach to detecting causative genetic mutations in diseases. Currently, there are several commercial human exome capture platforms; however, the relative performances of these have not been characterized sufficiently to know which is best for a particular study. We comprehensively compared three platforms: NimbleGen's Sequence Capture Array and SeqCap EZ, and Agilent's SureSelect. We assessed their performance in a variety of ways, including number of genes covered and capture efficacy. Differences that may affect the choice of platform were that Agilent SureSelect covered approximately 1,100 more genes, while NimbleGen provided better flanking sequence capture. Although all three platforms achieved similar capture specificity of targeted regions, the NimbleGen platforms showed better uniformity of coverage and greater genotype sensitivity at 30- to 100-fold sequencing depth. All three platforms showed similar power in exome SNP calling, including medically relevant SNPs. Compared with genotyping and whole-genome sequencing data, the three platforms achieved a similar accuracy of genotype assignment and SNP detection. Importantly, all three platforms showed similar levels of reproducibility, GC bias and reference allele bias. We demonstrate key differences between the three platforms, particularly the advantages of solution-based capture over array-based capture and the importance of a large gene target set.
Simčič, Mojca; Smetko, Anamarija; Sölkner, Johann; Seichter, Doris; Gorjanc, Gregor; Kompan, Dragomir; Medugorac, Ivica
2015-01-01
The aim of this study was to obtain unbiased estimates of the diversity parameters, the population history, and the degree of admixture in Cika cattle which represents the local admixed breeds at risk of extinction undergoing challenging conservation programs. Genetic analyses were performed on the genome-wide Single Nucleotide Polymorphism (SNP) Illumina Bovine SNP50 array data of 76 Cika animals and 531 animals from 14 reference populations. To obtain unbiased estimates we used short haplotypes spanning four markers instead of single SNPs to avoid an ascertainment bias of the BovineSNP50 array. Genome-wide haplotypes combined with partial pedigree and type trait classification show the potential to improve identification of purebred animals with a low degree of admixture. Phylogenetic analyses demonstrated unique genetic identity of Cika animals. Genetic distance matrix presented by rooted Neighbour-Net suggested long and broad phylogenetic connection between Cika and Pinzgauer. Unsupervised clustering performed by the admixture analysis and two-dimensional presentation of the genetic distances between individuals also suggest Cika is a distinct breed despite being similar in appearance to Pinzgauer. Animals identified as the most purebred could be used as a nucleus for a recovery of the native genetic background in the current admixed population. The results show that local well-adapted strains, which have never been intensively managed and differentiated into specific breeds, exhibit large haplotype diversity. They suggest a conservation and recovery approach that does not rely exclusively on the search for the original native genetic background but rather on the identification and removal of common introgressed haplotypes would be more powerful. 
Successful implementation of such an approach should be based on combining phenotype, pedigree, and genome-wide haplotype data of the breed of interest and a spectrum of reference breeds which potentially have had direct or indirect historical contribution to the genetic makeup of the breed of interest. PMID:25923207
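The idea of scoring short four-marker haplotypes rather than single SNPs can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say whether the windows overlap or how phasing was done, so the non-overlapping windows, the 0/1 allele coding, the function name `block_haplotypes` and the toy data are all hypothetical.

```python
from collections import Counter


def block_haplotypes(phased, width=4):
    """Count haplotype alleles in non-overlapping blocks of `width` markers.

    `phased` is a list of equal-length strings, one per phased chromosome,
    with each character giving the allele (0/1) at one SNP.
    """
    n_markers = len(phased[0])
    counts = []
    for start in range(0, n_markers - width + 1, width):
        # Each block is treated as one multi-allelic marker.
        counts.append(Counter(h[start:start + width] for h in phased))
    return counts


# Hypothetical toy data: four phased chromosomes, eight SNPs each.
toy = ["00110101", "00111101", "10110101", "00110111"]
blocks = block_haplotypes(toy)
```

Each `Counter` then plays the role of an allele-frequency table for one block, so diversity statistics can be computed on blocks instead of on individual array SNPs.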
Li, Gang; Hillier, LaDeana W; Grahn, Robert A; Zimin, Aleksey V; David, Victor A; Menotti-Raymond, Marilyn; Middleton, Rondo; Hannah, Steven; Hendrickson, Sher; Makunin, Alex; O'Brien, Stephen J; Minx, Pat; Wilson, Richard K; Lyons, Leslie A; Warren, Wesley C; Murphy, William J
2016-06-01
High-resolution genetic and physical maps are invaluable tools for building accurate genome assemblies, and interpreting results of genome-wide association studies (GWAS). Previous genetic and physical maps anchored good quality draft assemblies of the domestic cat genome, enabling the discovery of numerous genes underlying hereditary disease and phenotypes of interest to the biomedical science and breeding communities. However, these maps lacked sufficient marker density to order thousands of shorter scaffolds in earlier assemblies, which instead relied heavily on comparative mapping with related species. A high-resolution map would aid in validating and ordering chromosome scaffolds from existing and new genome assemblies. Here, we describe a high-resolution genetic linkage map of the domestic cat genome based on genotyping 453 domestic cats from several multi-generational pedigrees on the Illumina 63K SNP array. The final maps include 58,055 SNP markers placed relative to 6637 markers with unique positions, distributed across all autosomes and the X chromosome. Our final sex-averaged maps span a total autosomal length of 4464 cM, the longest described linkage map for any mammal, confirming length estimates from a previous microsatellite-based map. The linkage map was used to order and orient the scaffolds from a substantially more contiguous domestic cat genome assembly (Felis catus v8.0), which incorporated ∼20 × coverage of Illumina fragment reads. The new genome assembly shows substantial improvements in contiguity, with a nearly fourfold increase in N50 scaffold size to 18 Mb. We use this map to report probable structural errors in previous maps and assemblies, and to describe features of the recombination landscape, including a massive (∼50 Mb) recombination desert (of virtually zero recombination) on the X chromosome that parallels a similar desert on the porcine X chromosome in both size and physical location. Copyright © 2016 Li et al.
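For readers unfamiliar with the N50 statistic quoted above, a minimal sketch of the usual definition follows: the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least half of the assembly. The function name and the toy scaffold lengths are illustrative assumptions and are not taken from the Felis catus assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold lengths."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Stop at the first scaffold that brings the running sum to half.
        if 2 * running >= total:
            return length
    raise ValueError("no scaffold lengths given")


# Hypothetical scaffold lengths in Mb (total 290, half is 145).
toy_lengths = [80, 70, 50, 40, 30, 20]
toy_n50 = n50(toy_lengths)
```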
Hao, Chenyang; Wang, Yuquan; Chao, Shiaoman; Li, Tian; Liu, Hongxia; Wang, Lanfen; Zhang, Xueyong
2017-01-30
A Chinese wheat mini core collection was genotyped using the wheat 9 K iSelect SNP array. A total of 2420 and 2396 polymorphic SNPs were detected on the A and the B genome chromosomes, respectively, which formed 878 haplotype blocks. There were more blocks in the B genome, but the average block size was significantly (P < 0.05) smaller than in the A genome. Intense selection (domestication and breeding) had a stronger effect on the A than on the B genome chromosomes. Based on the genetic pedigrees, many blocks can be traced back to a well-known Strampelli cross, which was made one century ago. Furthermore, polyploidization of wheat (both tetraploidization and hexaploidization) induced revolutionary changes in both the A and the B genomes, with a greater increase of gene diversity compared to their diploid ancestors. Modern breeding has dramatically increased diversity in the gene coding regions, though obvious blocks were formed on most of the chromosomes in both tetraploid and hexaploid wheats. Tag-SNP markers identified in this study can be used for marker-assisted selection using haplotype blocks as a wheat breeding strategy. This strategy can also be employed to facilitate genome selection in other self-pollinating crop species.
Bailey, Swneke D.; Xie, Changchun; Do, Ron; Montpetit, Alexandre; Diaz, Rafael; Mohan, Viswanathan; Keavney, Bernard; Yusuf, Salim; Gerstein, Hertzel C.; Engert, James C.; Anand, Sonia
2010-01-01
OBJECTIVE Thiazolidinediones are used to treat type 2 diabetes. Their use has been associated with peripheral edema and congestive heart failure, outcomes that may have a genetic etiology. RESEARCH DESIGN AND METHODS We genotyped 4,197 participants of the multiethnic DREAM (Diabetes REduction Assessment with ramipril and rosiglitazone Medication) trial with a 50k single nucleotide polymorphism (SNP) array, which captures ∼2000 cardiovascular, inflammatory, and metabolic genes. We tested 32,088 SNPs for an association with edema among Europeans who received rosiglitazone (n = 965). RESULTS One SNP, rs6123045, in NFATC2 was significantly associated with edema (odds ratio 1.89 [95% CI 1.47–2.42]; P = 5.32 × 10⁻⁷, corrected P = 0.017). Homozygous individuals had the highest edema rate (hazard ratio 2.89, P = 4.22 × 10⁻⁴) when compared with individuals homozygous for the protective allele, with heterozygous individuals having an intermediate risk. The interaction between the SNP and rosiglitazone for edema was significant (P = 7.68 × 10⁻³). Six SNPs in NFATC2 were significant in both Europeans and Latin Americans (P < 0.05). CONCLUSIONS Genetic variation at the NFATC2 locus contributes to edema among individuals who receive rosiglitazone. PMID:20628086
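The reported corrected P value is numerically consistent with a Bonferroni adjustment over the 32,088 SNPs tested. The abstract does not name the correction method, so the sketch below is an assumption that happens to reproduce the published figure, not a statement of the authors' procedure.

```python
def bonferroni(p_value, n_tests):
    """Bonferroni-adjusted P value, capped at 1."""
    return min(1.0, p_value * n_tests)


raw_p = 5.32e-7   # reported P for rs6123045
n_snps = 32_088   # SNPs tested among Europeans on rosiglitazone
corrected_p = bonferroni(raw_p, n_snps)
print(f"{corrected_p:.3f}")
```

The product is about 0.0171, which rounds to the corrected P = 0.017 given in the abstract.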
Nuñez-Acuña, Gustavo; Valenzuela-Muñoz, Valentina; Gallardo-Escárate, Cristian
2014-06-01
The salmon louse Caligus rogercresseyi is the dominant ectoparasite species affecting the salmon aquaculture industry in the Southern hemisphere, and it is currently the main cause for economic losses in Chilean aquaculture. However, despite the great concern over Caligus infestations, genomic information on this louse is still scarce, even while the need to develop high-resolution molecular markers is growing. This study provides the first deep transcriptome survey to identify thousands of SNP markers from C. rogercresseyi, with a total of 69,466 SNPs identified using the MiSeq platform (Illumina®), 30,605 (52%) of which were found in contigs successfully annotated against known protein databases. Furthermore, in silico gene expression profiles associated with SNP variants were evaluated, and the results evidenced a wide array of genes that were down- and upregulated throughout the developmental stages of C. rogercresseyi. Interestingly, putative KEGG pathways involved in resistance to antiparasitic agents were also identified, where ten pathways were associated with the nervous system and one was related to ABC transporters. Taken together, this information could be highly useful for investigating the molecular underpinnings involved in the susceptibility or resistance of salmon lice to chemical treatments. Copyright © 2014 Elsevier Inc. All rights reserved.
Ultrasoft x-ray imaging system for the National Spherical Torus Experiment
NASA Astrophysics Data System (ADS)
Stutman, D.; Finkenthal, M.; Soukhanovskii, V.; May, M. J.; Moos, H. W.; Kaita, R.
1999-01-01
A spectrally resolved ultrasoft x-ray imaging system, consisting of arrays of high resolution (<2 Å) and throughput (⩾tens of kHz) miniature monochromators, and based on multilayer mirrors and absolute photodiodes, is being designed for the National Spherical Torus Experiment. Initially, three poloidal arrays of diodes filtered for C 1s-np emission will be implemented for fast tomographic imaging of the colder start-up plasmas. Later on, mirrors tuned to the C Lyα emission will be added in order to enable the arrays to "see" the periphery through the hot core and to study magnetohydrodynamic activity and impurity transport in this region. We also discuss possible core diagnostics, based on tomographic imaging of the Lyα emission from the plume of recombined, low Z impurity ions left by neutral beams or fueling pellets. The arrays can also be used for radiated power measurements and to map the distribution of high Z impurities injected for transport studies. The performance of the proposed system is illustrated with results from test channels on the CDX-U spherical torus at Princeton Plasma Physics Laboratory.
Jackson, Eric M.; Sievert, Angela J.; Gai, Xiaowu; Hakonarson, Hakon; Judkins, Alexander R; Tooke, Laura; Perin, Juan Carlos; Xie, Hongbo; Shaikh, Tamim H.; Biegel, Jaclyn A.
2009-01-01
Translational Relevance Previous reports suggested that abnormalities of INI1 could be detected in 70–75% of malignant rhabdoid tumors. The mechanism of inactivation in the other 25% remained unclear. The goal of this study was to perform a high-resolution genomic analysis of a large series of rhabdoid tumors with the expectation of identifying additional loci related to the initiation or progression of these malignancies. We also developed a comprehensive set of assays, including a new MLPA assay, to interrogate the INI1 locus in 22q11.2. Intragenic deletions could be detected using the Illumina 550K Beadchip, whereas single exon deletions could be detected using MLPA. The current study demonstrates that with a multi-platform approach, alterations at the INI1 locus can be detected in almost all cases. Thus, appropriate molecular genetic testing can be used as an aid in the diagnosis and for treatment planning for most patients. Purpose A high-resolution genomic profiling and comprehensive targeted analysis of INI1/SMARCB1 of a large series of pediatric rhabdoid tumors was performed. The aim was to identify regions of copy number change and loss of heterozygosity that might pinpoint additional loci involved in the development or progression of rhabdoid tumors, and define the spectrum of genomic alterations of INI1 in this malignancy. Experimental Design A multi-platform approach, utilizing Illumina single nucleotide polymorphism (SNP) based oligonucleotide arrays, multiplex ligation dependent probe amplification (MLPA), fluorescence in situ hybridization (FISH), and coding sequence analysis was used to characterize genome wide copy number changes, loss of heterozygosity, and genomic alterations of INI1/SMARCB1 in a series of pediatric rhabdoid tumors. Results The bi-allelic alterations of INI1 that led to inactivation were elucidated in 50 of 51 tumors. 
INI1 inactivation was demonstrated by a variety of mechanisms, including deletions, mutations, and loss of heterozygosity. The results from the array studies highlighted the complexity of rearrangements of chromosome 22, compared to the low frequency of alterations involving the other chromosomes. Conclusions The results from the genome wide SNP-array analysis suggest that INI1 is the primary tumor suppressor gene involved in the development of rhabdoid tumors with no second locus identified. In addition, we did not identify hot spots for the breakpoints in sporadic tumors with deletions of chromosome 22q11.2. By employing a multimodality approach, the wide spectrum of alterations of INI1 can be identified in the majority of patients, which increases the clinical utility of molecular diagnostic testing. PMID:19276269
Bertolini, F; Galimberti, G; Schiavo, G; Mastrangelo, S; Di Gerlando, R; Strillacci, M G; Bagnato, A; Portolano, B; Fontanesi, L
2018-01-01
Commercial single nucleotide polymorphism (SNP) arrays have been recently developed for several species and can be used to identify informative markers to differentiate breeds or populations for several downstream applications. To identify the most discriminating genetic markers among thousands of genotyped SNPs, a few statistical approaches have been proposed. In this work, we compared several methods of SNP preselection (Delta, Fst and principal component analyses (PCA)) in addition to Random Forest classifications to analyse SNP data from six dairy cattle breeds, including cosmopolitan (Holstein, Brown and Simmental) and autochthonous Italian breeds raised in two different regions and subjected to limited or no breeding programmes (Cinisara, Modicana, raised only in Sicily and Reggiana, raised only in Emilia Romagna). From these classifications, two panels of 96 and 48 SNPs that contain the most discriminant SNPs were created for each preselection method. These panels were evaluated in terms of the ability to discriminate as a whole and breed-by-breed, as well as linkage disequilibrium within each panel. The obtained results showed that for the 48-SNP panel, the error rate increased mainly for autochthonous breeds, probably as a consequence of their admixed origin, lower selection pressure and ascertainment bias in the construction of the SNP chip. The 96-SNP panels were generally more able to discriminate all breeds. The panel derived by PCA-chrom (obtained by a preselection chromosome by chromosome) could identify informative SNPs that were particularly useful for the assignment of minor breeds that reached the lowest value of Out Of Bag error even in the Cinisara, whose value was quite high in all other panels. Moreover, this panel also contained the lowest number of SNPs in linkage disequilibrium. Several selected SNPs are located near genes affecting breed-specific phenotypic traits (coat colour and stature) or associated with production traits. 
In general, our results demonstrated the usefulness of Random Forest in combination with other reduction techniques to identify population informative SNPs.
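As a rough illustration of the Delta preselection step named above, the sketch below ranks SNPs by the absolute allele-frequency difference between two breeds. The abstract does not define Delta, so this common two-population definition is an assumption, as are the 0/1/2 genotype coding, the function names and the toy data; it does not reproduce the authors' six-breed pipeline or the Random Forest stage.

```python
def allele_freq(genotypes):
    """Frequency of the counted allele from 0/1/2 genotype codes."""
    return sum(genotypes) / (2 * len(genotypes))


def delta(geno_a, geno_b):
    """Absolute allele-frequency difference between two breeds."""
    return abs(allele_freq(geno_a) - allele_freq(geno_b))


# Hypothetical genotypes: SNP -> (breed A animals, breed B animals).
toy = {
    "snp1": ([0, 0, 1, 0], [2, 2, 1, 2]),
    "snp2": ([1, 1, 0, 2], [1, 0, 1, 2]),
    "snp3": ([0, 1, 0, 0], [1, 2, 1, 1]),
}

# Most discriminating SNPs first; a panel would keep the top k.
ranked = sorted(toy, key=lambda snp: delta(*toy[snp]), reverse=True)
```

A preselected panel of this kind would then be passed to a classifier, with the out-of-bag error used to compare panels.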
Scherrer, Daniel Zanetti; Zago, Vanessa Helena de Souza; Vieira, Isabela Calanca; Parra, Eliane Soler; Panzoldo, Natália Baratella; Alexandre, Fernanda; Secolin, Rodrigo; Baracat, Jamal; Quintão, Eder Carlos Rocha; de Faria, Eliana Cotta
2015-01-01
Background Evidence suggests that paraoxonase 1 (PON1) confers important antioxidant and anti-inflammatory properties when associated with high-density lipoprotein (HDL). Objective To investigate the relationships between p.Q192R SNP of PON1, biochemical parameters and carotid atherosclerosis in an asymptomatic, normolipidemic Brazilian population sample. Methods We studied 584 volunteers (females n = 326, males n = 258; 19-75 years of age). Total genomic DNA was extracted and SNP was detected in the TaqMan® SNP OpenArray® genotyping platform (Applied Biosystems, Foster City, CA). Plasma lipoproteins and apolipoproteins were determined and PON1 activity was measured using paraoxon as a substrate. High-resolution β-mode ultrasonography was used to measure cIMT and the presence of carotid atherosclerotic plaques in a subgroup of individuals (n = 317). Results The presence of p.192Q was associated with a significant increase in PON1 activity (RR = 12.30 (11.38); RQ = 46.96 (22.35); QQ = 85.35 (24.83) μmol/min; p < 0.0001), HDL-C (RR = 45 (37); RQ = 62 (39); QQ = 69 (29) mg/dL; p < 0.001) and apo A-I (RR = 140.76 ± 36.39; RQ = 147.62 ± 36.92; QQ = 147.49 ± 36.65 mg/dL; p = 0.019). Stepwise regression analysis revealed that heterozygous and p.192Q carriers influenced by 58% PON1 activity towards paraoxon. The univariate linear regression analysis demonstrated that p.Q192R SNP was not associated with mean cIMT; as a result, in the multiple regression analysis, no variables were selected with 5% significance. In logistic regression analysis, the studied parameters were not associated with the presence of carotid plaques. Conclusion In low-risk individuals, the presence of the p.192Q variant of PON1 is associated with a beneficial plasma lipid profile but not with carotid atherosclerosis. PMID:26039660
Yoshikawa, Munemitsu; Yamashiro, Kenji; Miyake, Masahiro; Oishi, Maho; Akagi-Kurashige, Yumiko; Kumagai, Kyoko; Nakata, Isao; Nakanishi, Hideo; Oishi, Akio; Gotoh, Norimoto; Yamada, Ryo; Matsuda, Fumihiko; Yoshimura, Nagahisa
2014-10-21
We investigated the association between refractive error in a Japanese population and myopia-related genes identified in two recent large-scale genome-wide association studies. Single-nucleotide polymorphisms (SNPs) in 51 genes that were reported by the Consortium for Refractive Error and Myopia and/or the 23andMe database were genotyped in 3712 healthy Japanese volunteers from the Nagahama Study using HumanHap610K Quad, HumanOmni2.5M, and/or HumanExome Arrays. To evaluate the association between refractive error and recently identified myopia-related genes, we used three approaches to perform quantitative trait locus analyses of mean refractive error in both eyes of the participants: per-SNP, gene-based top-SNP, and gene-based all-SNP analyses. Association plots of successfully replicated genes also were investigated. In our per-SNP analysis, eight myopia gene associations were replicated successfully: GJD2, RASGRF1, BICC1, KCNQ5, CD55, CYP26A1, LRRC4C, and B4GALNT2. Seven additional gene associations were replicated in our gene-based analyses: GRIA4, BMP2, QKI, BMP4, SFRP1, SH3GL2, and EHBP1L1. The signal strength of the reported SNPs and their tagging SNPs increased after considering different linkage disequilibrium patterns across ethnicities. Although two previous studies suggested strong associations between PRSS56, LAMA2, TOX, and RDH5 and myopia, we could not replicate these results. Our results confirmed the significance of the myopia-related genes reported previously and suggested that gene-based replication analyses are more effective than per-SNP analyses. Our comparison with two previous studies suggested that BMP3 SNPs cause myopia primarily in Caucasian populations, while they may exhibit protective effects in Asian populations. Copyright 2014 The Association for Research in Vision and Ophthalmology, Inc.
Scherrer, Daniel Zanetti; Zago, Vanessa Helena de Souza; Vieira, Isabela Calanca; Parra, Eliane Soler; Panzoldo, Natália Baratella; Alexandre, Fernanda; Secolin, Rodrigo; Baracat, Jamal; Quintão, Eder Carlos Rocha; Faria, Eliana Cotta de
2015-07-01
Evidence suggests that paraoxonase 1 (PON1) confers important antioxidant and anti-inflammatory properties when associated with high-density lipoprotein (HDL). To investigate the relationships between p.Q192R SNP of PON1, biochemical parameters and carotid atherosclerosis in an asymptomatic, normolipidemic Brazilian population sample. We studied 584 volunteers (females n = 326, males n = 258; 19-75 years of age). Total genomic DNA was extracted and SNP was detected in the TaqMan® SNP OpenArray® genotyping platform (Applied Biosystems, Foster City, CA). Plasma lipoproteins and apolipoproteins were determined and PON1 activity was measured using paraoxon as a substrate. High-resolution β-mode ultrasonography was used to measure cIMT and the presence of carotid atherosclerotic plaques in a subgroup of individuals (n = 317). The presence of p.192Q was associated with a significant increase in PON1 activity (RR = 12.30 (11.38); RQ = 46.96 (22.35); QQ = 85.35 (24.83) μmol/min; p < 0.0001), HDL-C (RR = 45 (37); RQ = 62 (39); QQ = 69 (29) mg/dL; p < 0.001) and apo A-I (RR = 140.76 ± 36.39; RQ = 147.62 ± 36.92; QQ = 147.49 ± 36.65 mg/dL; p = 0.019). Stepwise regression analysis revealed that heterozygous and p.192Q carriers influenced by 58% PON1 activity towards paraoxon. The univariate linear regression analysis demonstrated that p.Q192R SNP was not associated with mean cIMT; as a result, in the multiple regression analysis, no variables were selected with 5% significance. In logistic regression analysis, the studied parameters were not associated with the presence of carotid plaques. In low-risk individuals, the presence of the p.192Q variant of PON1 is associated with a beneficial plasma lipid profile but not with carotid atherosclerosis.
A SNP resource for Douglas-fir: de novo transcriptome assembly and SNP detection and validation
2013-01-01
Background Douglas-fir (Pseudotsuga menziesii), one of the most economically and ecologically important tree species in the world, also has one of the largest tree breeding programs. Although the coastal and interior varieties of Douglas-fir (vars. menziesii and glauca) are native to North America, the coastal variety is also widely planted for timber production in Europe, New Zealand, Australia, and Chile. Our main goal was to develop a SNP resource large enough to facilitate genomic selection in Douglas-fir breeding programs. To accomplish this, we developed a 454-based reference transcriptome for coastal Douglas-fir, annotated and evaluated the quality of the reference, identified putative SNPs, and then validated a sample of those SNPs using the Illumina Infinium genotyping platform. Results We assembled a reference transcriptome consisting of 25,002 isogroups (unique gene models) and 102,623 singletons from 2.76 million 454 and Sanger cDNA sequences from coastal Douglas-fir. We identified 278,979 unique SNPs by mapping the 454 and Sanger sequences to the reference, and by mapping four datasets of Illumina cDNA sequences from multiple seed sources, genotypes, and tissues. The Illumina datasets represented coastal Douglas-fir (64.00 and 13.41 million reads), interior Douglas-fir (80.45 million reads), and a Yakima population similar to interior Douglas-fir (8.99 million reads). We assayed 8067 SNPs on 260 trees using an Illumina Infinium SNP genotyping array. Of these SNPs, 5847 (72.5%) were called successfully and were polymorphic. Conclusions Based on our validation efficiency, our SNP database may contain as many as ~200,000 true SNPs, and as many as ~69,000 SNPs that could be genotyped at ~20,000 gene loci using an Infinium II array—more SNPs than are needed to use genomic selection in tree breeding programs. 
Ultimately, these genomic resources will enhance Douglas-fir breeding and allow us to better understand landscape-scale patterns of genetic variation and potential responses to climate change. PMID:23445355
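The validation arithmetic behind these estimates can be checked in a few lines. The abstract gives the counts but not the authors' exact extrapolation, so applying the validation rate directly to the full set of putative SNPs is an assumption made here for illustration.

```python
assayed = 8_067       # SNPs placed on the Infinium array
validated = 5_847     # called successfully and polymorphic
putative = 278_979    # unique putative SNPs in the database

validation_rate = validated / assayed
# Assumption: the assayed sample is representative of all putative SNPs.
estimated_true = putative * validation_rate

print(f"validation rate: {validation_rate:.1%}")
print(f"estimated true SNPs: {estimated_true:,.0f}")
```

Under that assumption the estimate comes to roughly 202,000, in line with the "~200,000 true SNPs" stated in the conclusions.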
Patel, Viralkumar M.; Balakrishnan, Kumudha; Douglas, Mark; Tibbitts, Thomas; Xu, Ethan Y.; Kutok, Jeffery L.; Ayers, Mary; Sarkar, Aloke; Guerrieri, Renato; Wierda, William G.; O’Brien, Susan; Jain, Nitin; Stern, Howard M.; Gandhi, Varsha
2017-01-01
Duvelisib, an oral dual inhibitor of PI3K-δ and PI3K-γ, is in phase III trials for the treatment of chronic lymphocytic leukemia (CLL) and indolent non-Hodgkin’s lymphoma (iNHL). In CLL, duvelisib monotherapy is associated with high iwCLL and nodal response rates, but complete remissions are rare. To characterize the molecular effect of duvelisib, we obtained samples from CLL patients on the duvelisib phase I trial. Gene-expression studies (RNA seq, Nanostring, Affymetrix array, and real time RT-PCR) demonstrated increased expression of BCL2 along with several BH3-only pro-apoptotic genes. In concert with induction of transcript levels, reverse phase protein arrays and immunoblots confirmed increase at the protein level. The BCL2 inhibitor venetoclax induced greater apoptosis in ex-vivo cultured CLL cells obtained from patients on duvelisib compared to pre-treatment CLL cells from the same patients. In vitro combination of duvelisib and venetoclax resulted in enhanced apoptosis even in CLL cells cultured under conditions that simulate the tumor microenvironment. These data provide a mechanistic rationale for testing the combination of duvelisib and venetoclax in the clinic. Such combination regimen (NCT02640833) is being evaluated for patients with B-cell malignancies including CLL. PMID:28017967
Patel, V M; Balakrishnan, K; Douglas, M; Tibbitts, T; Xu, E Y; Kutok, J L; Ayers, M; Sarkar, A; Guerrieri, R; Wierda, W G; O'Brien, S; Jain, N; Stern, H M; Gandhi, V
2017-09-01
Duvelisib, an oral dual inhibitor of PI3K-δ and PI3K-γ, is in phase III trials for the treatment of chronic lymphocytic leukemia (CLL) and indolent non-Hodgkin's lymphoma. In CLL, duvelisib monotherapy is associated with high iwCLL (International Workshop on Chronic Lymphocytic Leukemia) and nodal response rates, but complete remissions are rare. To characterize the molecular effect of duvelisib, we obtained samples from CLL patients on the duvelisib phase I trial. Gene expression studies (RNAseq, Nanostring, Affymetrix array and real-time RT-PCR) demonstrated increased expression of BCL2 along with several BH3-only pro-apoptotic genes. In concert with induction of transcript levels, reverse phase protein arrays and immunoblots confirmed increase at the protein level. The BCL2 inhibitor venetoclax induced greater apoptosis in ex vivo-cultured CLL cells obtained from patients on duvelisib compared with pre-treatment CLL cells from the same patients. In vitro combination of duvelisib and venetoclax resulted in enhanced apoptosis even in CLL cells cultured under conditions that simulate the tumor microenvironment. These data provide a mechanistic rationale for testing the combination of duvelisib and venetoclax in the clinic. Such combination regimen (NCT02640833) is being evaluated for patients with B-cell malignancies including CLL.
Van Goor, Angelica; Bolek, Kevin J; Ashwell, Chris M; Persia, Mike E; Rothschild, Max F; Schmidt, Carl J; Lamont, Susan J
2015-12-17
Losses in poultry production due to heat stress have considerable negative economic consequences. Previous studies in poultry have elucidated a genetic influence on response to heat. Using a unique chicken genetic resource, we identified genomic regions associated with body temperature (BT), body weight (BW), breast yield, and digestibility measured during heat stress. Identifying genes associated with a favorable response during high ambient temperature can facilitate genetic selection of heat-resilient chickens. Generations F18 and F19 of a broiler (heat-susceptible) × Fayoumi (heat-resistant) advanced intercross line (AIL) were used to fine-map quantitative trait loci (QTL). Six hundred and thirty-one birds were exposed to daily heat cycles from 22 to 28 days of age, and phenotypes were measured before heat treatment, on the 1st day and after 1 week of heat treatment. BT was measured at these three phases and BW at pre-heat treatment and after 1 week of heat treatment. Breast muscle yield was calculated as the percentage of BW at day 28. Ileal feed digestibility was assayed from digesta collected from the ileum at day 28. Four hundred and sixty-eight AIL were genotyped using the 600 K Affymetrix chicken SNP (single nucleotide polymorphism) array. Trait heritabilities were estimated using an animal model. A genome-wide association study (GWAS) for these traits and changes in BT and BW was conducted using Bayesian analyses. Candidate genes were identified within 200-kb regions around SNPs with significant association signals. Heritabilities were low to moderate (0.03 to 0.35). We identified QTL for BT on Gallus gallus chromosome (GGA)14, 15, 26, and 27; BW on GGA1 to 8, 10, 14, and 21; dry matter digestibility on GGA19, 20 and 21; and QTL of very large effect for breast muscle yield on GGA1, 15, and 22 with a single 1-Mb window on GGA1 explaining more than 15% of the genetic variation. 
This is the first study to estimate heritabilities and perform GWAS using this AIL for traits measured during heat stress. Significant QTL as well as low to moderate heritabilities were found for each trait, and these QTL may facilitate selection for improved animal performance in hot climatic conditions.
Jobs, Magnus; Howell, W. Mathias; Strömqvist, Linda; Mayr, Torsten; Brookes, Anthony J.
2003-01-01
Genotyping technologies need to be continually improved in terms of their flexibility, cost-efficiency, and throughput, to push forward genome variation analysis. To this end, we have leveraged the inherent simplicity of dynamic allele-specific hybridization (DASH) and coupled it to recent innovations of centrifugal arrays and iFRET. We have thereby created a new genotyping platform we term DASH-2, which we demonstrate and evaluate in this report. The system is highly flexible in many ways (any plate format, PCR multiplexing, serial and parallel array processing, spectral-multiplexing of hybridization probes), thus supporting a wide range of application scales and objectives. Precision is demonstrated to be in the range 99.8–100%, and assay costs are 0.05 USD or less per genotype assignment. DASH-2 thus provides a powerful new alternative for genotyping practice, which can be used without the need for expensive robotics support. PMID:12727908
TLR4 Asp299Gly polymorphism may be protective against chronic periodontitis.
Sellers, R M; Payne, J B; Yu, F; LeVan, T D; Walker, C; Mikuls, T R
2016-04-01
Periodontitis results from interplay between genetic and environmental factors. Single nucleotide polymorphisms (SNPs) in the coding region of the toll-like receptor 4 gene (TLR4) may be associated with periodontitis, although previous studies have been inconclusive. Moreover, the interaction between environmental factors, such as cigarette smoking (a major risk factor for periodontitis), and Porphyromonas gingivalis (a major periodontal pathogen) with the TLR4 coding region Asp299Gly SNP (rs4986790; a SNP associated with lipopolysaccharide-mediated inflammatory responses in periodontitis), have been largely ignored in previous reports. Therefore, the objective of this study was to examine the association between TLR4 Asp299Gly (rs4986790) with alveolar bone height loss (ABHL) and periodontitis, accounting for interactions between this SNP with smoking and P. gingivalis prevalence. The CD14/-260 SNP (rs2569190) served as a control, as a recent meta-analysis suggested no relationship between this SNP and periodontitis. This multicenter study included 617 participants who had rheumatoid arthritis or osteoarthritis. This report presents a secondary outcome from the primary case-control study examining the relationship of periodontitis with established rheumatoid arthritis. The Centers for Disease Control/American Academy of Periodontology case definitions of periodontitis were used for this analysis. Participants received a full-mouth clinical periodontal examination and panoramic radiograph. Percentage ABHL was measured on posterior teeth. The TLR4 Asp299Gly and CD14/-260 SNPs were selected a priori and genotypes were determined using the ImmunoChip array (Illumina®). Minor allele frequencies and associations with periodontitis and ABHL did not differ according to rheumatoid arthritis vs. osteoarthritis status; therefore, data from these two groups were pooled. The presence of P. gingivalis was detected in subgingival plaque by PCR. 
Multivariate ordinal logistic regression examined associations between the SNPs and periodontitis or ABHL. SNP interactions with smoking and P. gingivalis were analyzed. A significant, negative interaction was observed between the TLR4 SNP and the presence of P. gingivalis (p = 0.045) with respect to periodontitis. The TLR4 minor variant was also associated with less ABHL: 16.8% of individuals with low ABHL, 9.0% with moderate ABHL and 11.2% with high ABHL had the minor allele [p = 0.029; odds ratio = 0.58 (95% confidence interval: 0.36-0.95)]. The interaction between the TLR4 SNP and smoking was not significant with respect to periodontitis or ABHL. The CD14 SNP was not associated with periodontitis or ABHL. The TLR4 Asp299Gly SNP significantly interacted with P. gingivalis in conferring a decreased risk of periodontitis and may be protective against ABHL, a feature of periodontitis. Agents blocking TLR4 signaling, a strategy currently under investigation for the treatment of other inflammatory conditions, may warrant investigation in the context of periodontitis related to the presence of P. gingivalis. © 2015 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
A genome wide association study of pulmonary tuberculosis susceptibility in Indonesians.
Png, Eileen; Alisjahbana, Bachti; Sahiratmadja, Edhyana; Marzuki, Sangkot; Nelwan, Ron; Balabanova, Yanina; Nikolayevskyy, Vladyslav; Drobniewski, Francis; Nejentsev, Sergey; Adnan, Iskandar; van de Vosse, Esther; Hibberd, Martin L; van Crevel, Reinout; Ottenhoff, Tom H M; Seielstad, Mark
2012-01-13
There is reason to expect strong genetic influences on the risk of developing active pulmonary tuberculosis (TB) among latently infected individuals. Many of the genome wide linkage and association studies (GWAS) to date have been conducted on African populations. In order to identify additional targets in genetically dissimilar populations, and to enhance our understanding of this disease, we performed a multi-stage GWAS in a Southeast Asian cohort from Indonesia. In stage 1, we used the Affymetrix 100 K SNP GeneChip marker set to genotype 259 Indonesian samples. After quality control filtering, 108 cases and 115 controls were analyzed for association of 95,207 SNPs. In stage 2, we attempted validation of 2,453 SNPs with promising associations from the first stage, in 1,189 individuals from the same Indonesian cohort, and finally in stage 3 we selected 251 SNPs from this stage to test TB association in an independent Caucasian cohort (n = 3,760) from Russia. Our study suggests evidence of association (P = 0.0004-0.0067) for 8 independent loci (nominal significance P < 0.05), which are located within or near the following genes involved in immune signaling: JAG1, DYNLRB2, EBF1, TMEFF2, CCL17, HAUS6, PENK and TXNDC4. Mechanisms of immune defense suggested by some of the identified genes exhibit biological plausibility and may suggest novel pathways involved in the host containment of infection with TB.
[Association between single-nucleotide polymorphisms in the IRAK-4 gene and allergic rhinitis].
Zhang, Yuan; Xi, Lin; Zhao, Yan-ming; Zhao, Li-ping; Zhang, Luo
2012-06-01
To investigate the genetic association pattern between single-nucleotide polymorphisms (SNPs) in the interleukin-1 receptor-associated kinase 4 (IRAK-4) gene and allergic rhinitis (AR). A population of 379 patients diagnosed with AR and 333 healthy controls living in the Beijing region was recruited. A total of 8 representative marker SNPs in the IRAK-4 gene region were selected according to the Beijing population data from the HapMap website. Individual genotyping was performed on the MassARRAY platform. SPSS 13.0 software was used for statistical analysis. Subgroup analysis for the presence of different allergen sensitivities displayed associations only in the house dust mite-allergic cohorts (rs3794262: P = 0.0034, OR = 1.7388; rs4251481: P = 0.0023, OR = 2.6593), but not in subjects who were allergic to pollens or to mixed allergens. The potential genetic contribution of the IRAK-4 gene to AR demonstrated an allergen-dependent association pattern in a Chinese population.
Clonal evolution through loss of chromosomes and subsequent polyploidization in chondrosarcoma.
Olsson, Linda; Paulsson, Kajsa; Bovée, Judith V M G; Nord, Karolin H
2011-01-01
Near-haploid chromosome numbers have been found in less than 1% of cytogenetically reported tumors, but seem to be more common in certain neoplasms including the malignant cartilage-producing tumor chondrosarcoma. By a literature survey of published karyotypes from chondrosarcomas we could confirm that loss of chromosomes resulting in hyperhaploid-hypodiploid cells is common and that these cells may polyploidize. Sixteen chondrosarcomas were investigated by single nucleotide polymorphism (SNP) array and the majority displayed SNP patterns indicative of a hyperhaploid-hypodiploid origin, with or without subsequent polyploidization. Except for chromosomes 5, 7, 19, 20 and 21, autosomal loss of heterozygosity was commonly found, resulting from chromosome loss and subsequent duplication of monosomic chromosomes giving rise to uniparental disomy. Additional gains, losses and rearrangements of genetic material, and even repeated rounds of polyploidization, may affect chondrosarcoma cells resulting in highly complex karyotypes. Loss of chromosomes and subsequent polyploidization was not restricted to a particular chondrosarcoma subtype and, although commonly found in chondrosarcoma, binucleated cells did not seem to be involved in these events.
A common variant near TGFBR3 is associated with primary open angle glaucoma.
Li, Zheng; Allingham, R Rand; Nakano, Masakazu; Jia, Liyun; Chen, Yuhong; Ikeda, Yoko; Mani, Baskaran; Chen, Li-Jia; Kee, Changwon; Garway-Heath, David F; Sripriya, Sarangapani; Fuse, Nobuo; Abu-Amero, Khaled K; Huang, Chukai; Namburi, Prasanthi; Burdon, Kathryn; Perera, Shamira A; Gharahkhani, Puya; Lin, Ying; Ueno, Morio; Ozaki, Mineo; Mizoguchi, Takanori; Krishnadas, Subbiah Ramasamy; Osman, Essam A; Lee, Mei Chin; Chan, Anita S Y; Tajudin, Liza-Sharmini A; Do, Tan; Goncalves, Aurelien; Reynier, Pascal; Zhang, Hong; Bourne, Rupert; Goh, David; Broadway, David; Husain, Rahat; Negi, Anil K; Su, Daniel H; Ho, Ching-Lin; Blanco, Augusto Azuara; Leung, Christopher K S; Wong, Tina T; Yakub, Azhany; Liu, Yutao; Nongpiur, Monisha E; Han, Jong Chul; Hon, Do Nhu; Shantha, Balekudaru; Zhao, Bowen; Sang, Jinghong; Zhang, NiHong; Sato, Ryuichi; Yoshii, Kengo; Panda-Jonas, Songhomita; Ashley Koch, Allison E; Herndon, Leon W; Moroi, Sayoko E; Challa, Pratap; Foo, Jia Nee; Bei, Jin-Xin; Zeng, Yi-Xin; Simmons, Cameron P; Bich Chau, Tran Nguyen; Sharmila, Philomenadin Ferdinamarie; Chew, Merwyn; Lim, Blanche; Tam, Pansy O S; Chua, Elaine; Ng, Xiao Yu; Yong, Victor H K; Chong, Yaan Fun; Meah, Wee Yang; Vijayan, Saravanan; Seongsoo, Sohn; Xu, Wang; Teo, Yik Ying; Cooke Bailey, Jessica N; Kang, Jae H; Haines, Jonathan L; Cheng, Ching Yu; Saw, Seang-Mei; Tai, E-Shyong; Richards, Julia E; Ritch, Robert; Gaasterland, Douglas E; Pasquale, Louis R; Liu, Jianjun; Jonas, Jost B; Milea, Dan; George, Ronnie; Al-Obeidan, Saleh A; Mori, Kazuhiko; Macgregor, Stuart; Hewitt, Alex W; Girkin, Christopher A; Zhang, Mingzhi; Sundaresan, Periasamy; Vijaya, Lingam; Mackey, David A; Wong, Tien Yin; Craig, Jamie E; Sun, Xinghuai; Kinoshita, Shigeru; Wiggs, Janey L; Khor, Chiea-Chuen; Yang, Zhenglin; Pang, Chi Pui; Wang, Ningli; Hauser, Michael A; Tashiro, Kei; Aung, Tin; Vithana, Eranga N
2015-07-01
Primary open angle glaucoma (POAG), a major cause of blindness worldwide, is a complex disease with a significant genetic contribution. We performed Exome Array (Illumina) analysis on 3504 POAG cases and 9746 controls with replication of the most significant findings in 9173 POAG cases and 26 780 controls across 18 collections of Asian, African and European descent. Apart from confirming strong evidence of association at CDKN2B-AS1 (rs2157719 [G], odds ratio [OR] = 0.71, P = 2.81 × 10^-33), we observed one SNP showing significant association to POAG (CDC7-TGFBR3 rs1192415, OR(G allele) = 1.13, P(meta) = 1.60 × 10^-8). This particular SNP has previously been shown to be strongly associated with optic disc area and vertical cup-to-disc ratio, which are regarded as glaucoma-related quantitative traits. Our study now extends this by directly implicating it in POAG disease pathogenesis. © The Author 2015. Published by Oxford University Press.
A common variant near TGFBR3 is associated with primary open angle glaucoma
Li, Zheng; Allingham, R. Rand; Nakano, Masakazu; Jia, Liyun; Chen, Yuhong; Ikeda, Yoko; Mani, Baskaran; Chen, Li-Jia; Kee, Changwon; Garway-Heath, David F.; Sripriya, Sarangapani; Fuse, Nobuo; Abu-Amero, Khaled K.; Huang, Chukai; Namburi, Prasanthi; Burdon, Kathryn; Perera, Shamira A.; Gharahkhani, Puya; Lin, Ying; Ueno, Morio; Ozaki, Mineo; Mizoguchi, Takanori; Krishnadas, Subbiah Ramasamy; Osman, Essam A.; Lee, Mei Chin; Chan, Anita S.Y.; Tajudin, Liza-Sharmini A.; Do, Tan; Goncalves, Aurelien; Reynier, Pascal; Zhang, Hong; Bourne, Rupert; Goh, David; Broadway, David; Husain, Rahat; Negi, Anil K.; Su, Daniel H; Ho, Ching-Lin; Blanco, Augusto Azuara; Leung, Christopher K.S.; Wong, Tina T.; Yakub, Azhany; Liu, Yutao; Nongpiur, Monisha E.; Han, Jong Chul; Hon, Do Nhu; Shantha, Balekudaru; Zhao, Bowen; Sang, Jinghong; Zhang, NiHong; Sato, Ryuichi; Yoshii, Kengo; Panda-Jonas, Songhomita; Ashley Koch, Allison E.; Herndon, Leon W.; Moroi, Sayoko E.; Challa, Pratap; Foo, Jia Nee; Bei, Jin-Xin; Zeng, Yi-Xin; Simmons, Cameron P.; Bich Chau, Tran Nguyen; Sharmila, Philomenadin Ferdinamarie; Chew, Merwyn; Lim, Blanche; Tam, Pansy O.S.; Chua, Elaine; Ng, Xiao Yu; Yong, Victor H.K.; Chong, Yaan Fun; Meah, Wee Yang; Vijayan, Saravanan; Seongsoo, Sohn; Xu, Wang; Teo, Yik Ying; Cooke Bailey, Jessica N.; Kang, Jae H.; Haines, Jonathan L.; Cheng, Ching Yu; Saw, Seang-Mei; Tai, E-Shyong; Richards, Julia E.; Ritch, Robert; Gaasterland, Douglas E.; Pasquale, Louis R.; Liu, Jianjun; Jonas, Jost B.; Milea, Dan; George, Ronnie; Al-Obeidan, Saleh A.; Mori, Kazuhiko; Macgregor, Stuart; Hewitt, Alex W.; Girkin, Christopher A.; Zhang, Mingzhi; Sundaresan, Periasamy; Vijaya, Lingam; Mackey, David A.; Wong, Tien Yin; Craig, Jamie E.; Sun, Xinghuai; Kinoshita, Shigeru; Wiggs, Janey L.; Khor, Chiea-Chuen; Yang, Zhenglin; Pang, Chi Pui; Wang, Ningli; Hauser, Michael A.; Tashiro, Kei; Aung, Tin; Vithana, Eranga N.
2015-01-01
Primary open angle glaucoma (POAG), a major cause of blindness worldwide, is a complex disease with a significant genetic contribution. We performed Exome Array (Illumina) analysis on 3504 POAG cases and 9746 controls with replication of the most significant findings in 9173 POAG cases and 26 780 controls across 18 collections of Asian, African and European descent. Apart from confirming strong evidence of association at CDKN2B-AS1 (rs2157719 [G], odds ratio [OR] = 0.71, P = 2.81 × 10^-33), we observed one SNP showing significant association to POAG (CDC7–TGFBR3 rs1192415, OR(G allele) = 1.13, P(meta) = 1.60 × 10^-8). This particular SNP has previously been shown to be strongly associated with optic disc area and vertical cup-to-disc ratio, which are regarded as glaucoma-related quantitative traits. Our study now extends this by directly implicating it in POAG disease pathogenesis. PMID:25861811
Massa, Alicia N; Manrique-Carpintero, Norma C; Coombs, Joseph J; Zarka, Daniel G; Boone, Anne E; Kirk, William W; Hackett, Christine A; Bryan, Glenn J; Douches, David S
2015-09-14
The objective of this study was to construct a single nucleotide polymorphism (SNP)-based genetic map at the cultivated tetraploid level to locate quantitative trait loci (QTL) contributing to economically important traits in potato (Solanum tuberosum L.). The 156 F1 progeny and parents of a cross (MSL603) between "Jacqueline Lee" and "MSG227-2" were genotyped using the Infinium 8303 Potato Array. Furthermore, the progeny and parents were evaluated for foliar late blight reaction to isolates of the US-8 genotype of Phytophthora infestans (Mont.) de Bary and vine maturity. Linkage analyses and QTL mapping were performed using a novel approach that incorporates allele dosage information. The resulting genetic maps contained 1972 SNP markers with an average density of 1.36 markers per cM. QTL mapping identified the major source of late blight resistance in "Jacqueline Lee." The best SNP marker mapped ~0.54 Mb from a resistance hotspot on the long arm of chromosome 9. For vine maturity, the major-effect QTL was located on chromosome 5 with allelic effects from both parents. A candidate SNP marker for this trait mapped ~0.25 Mb from the StCDF1 gene, which is a candidate gene for the maturity trait. The identification of markers for P. infestans resistance will enable the introgression of multiple sources of resistance through marker-assisted selection. Moreover, the discovery of a QTL for late blight resistance not linked to the QTL for vine maturity provides the opportunity to use marker-assisted selection for resistance independent of the selection for vine maturity classifications. Copyright © 2015 Massa et al.
Massa, Alicia N.; Manrique-Carpintero, Norma C.; Coombs, Joseph J.; Zarka, Daniel G.; Boone, Anne E.; Kirk, William W.; Hackett, Christine A.; Bryan, Glenn J.; Douches, David S.
2015-01-01
The objective of this study was to construct a single nucleotide polymorphism (SNP)-based genetic map at the cultivated tetraploid level to locate quantitative trait loci (QTL) contributing to economically important traits in potato (Solanum tuberosum L.). The 156 F1 progeny and parents of a cross (MSL603) between “Jacqueline Lee” and “MSG227-2” were genotyped using the Infinium 8303 Potato Array. Furthermore, the progeny and parents were evaluated for foliar late blight reaction to isolates of the US-8 genotype of Phytophthora infestans (Mont.) de Bary and vine maturity. Linkage analyses and QTL mapping were performed using a novel approach that incorporates allele dosage information. The resulting genetic maps contained 1972 SNP markers with an average density of 1.36 markers per cM. QTL mapping identified the major source of late blight resistance in “Jacqueline Lee.” The best SNP marker mapped ∼0.54 Mb from a resistance hotspot on the long arm of chromosome 9. For vine maturity, the major-effect QTL was located on chromosome 5 with allelic effects from both parents. A candidate SNP marker for this trait mapped ∼0.25 Mb from the StCDF1 gene, which is a candidate gene for the maturity trait. The identification of markers for P. infestans resistance will enable the introgression of multiple sources of resistance through marker-assisted selection. Moreover, the discovery of a QTL for late blight resistance not linked to the QTL for vine maturity provides the opportunity to use marker-assisted selection for resistance independent of the selection for vine maturity classifications. PMID:26374597
Association Analysis of the Ephrin-B2 Gene in African-Americans with End-Stage Renal Disease
Hicks, Pamela J.; Staten, Jennifer L.; Palmer, Nicholette D.; Langefeld, Carl D.; Ziegler, Julie T.; Keene, Keith L.; Sale, Michele M.; Bowden, Donald W.; Freedman, Barry I.
2008-01-01
Background: Genome scans in African-Americans with end-stage renal disease (ESRD) identified linkage on chromosome 13q33 in the region containing the ephrin-B2 ligand (EFNB2) genes. Interactions between the ephrin-B2 receptor and ephrin-B2 ligand play essential roles in renal angiogenesis, blood vessel maturation, and kidney disease. Methods: The EFNB2 gene was evaluated as a positional candidate for non-diabetic and diabetic ESRD susceptibility in 1,071 unrelated African-American subjects: 316 with non-diabetic etiologies of ESRD, 394 with type 2 diabetes-associated ESRD and 361 healthy controls. Single nucleotide polymorphism (SNP) genotyping was performed on the Sequenom Mass Array System. Statistical analyses were computed using Dandelion version 1.26, Snpaddmix version 1.4 and Haploview version 3.32. Results: Twenty-eight HapMap tag SNPs were genotyped spanning the 39 kilobases (kb) of the EFNB2 coding region, with average spacing of 1.43 kb. Analysis of 710 ESRD patient samples and 361 controls provided no evidence of single SNP associations in either diabetic or non-diabetic ESRD, although nominal evidence of association with all-cause ESRD was observed with a two-SNP (p = 0.022) and a three-SNP (p = 0.023) haplotype, both containing SNPs rs7490924 and rs2391335 in intron 1. Conclusions: Although an attractive positional candidate gene, polymorphisms in the EFNB2 gene do not appear to contribute in a substantial way to non-diabetic, diabetic or all-cause ESRD susceptibility in African-Americans. Additional genes within the chromosome 13q33 linkage interval are likely contributors to African-American non-diabetic ESRD. PMID:18580054
Bhat, Somanath; Polanowski, Andrea M; Double, Mike C; Jarman, Simon N; Emslie, Kerry R
2012-01-01
Recent advances in nanofluidic technologies have enabled the use of Integrated Fluidic Circuits (IFCs) for high-throughput Single Nucleotide Polymorphism (SNP) genotyping (GT). In this study, we implemented and validated a relatively low cost nanofluidic system for SNP-GT with and without Specific Target Amplification (STA). As proof of principle, we first validated the effect of input DNA copy number on genotype call rate using well characterised, digital PCR (dPCR) quantified human genomic DNA samples and then implemented the validated method to genotype 45 SNPs in the humpback whale, Megaptera novaeangliae, nuclear genome. When STA was not incorporated, for a homozygous human DNA sample, reaction chambers containing, on average 9 to 97 copies, showed 100% call rate and accuracy. Below 9 copies, the call rate decreased, and at one copy it was 40%. For a heterozygous human DNA sample, the call rate decreased from 100% to 21% when predicted copies per reaction chamber decreased from 38 copies to one copy. The tightness of genotype clusters on a scatter plot also decreased. In contrast, when the same samples were subjected to STA prior to genotyping a call rate and a call accuracy of 100% were achieved. Our results demonstrate that low input DNA copy number affects the quality of data generated, in particular for a heterozygous sample. Similar to human genomic DNA, a call rate and a call accuracy of 100% was achieved with whale genomic DNA samples following multiplex STA using either 15 or 45 SNP-GT assays. These calls were 100% concordant with their true genotypes determined by an independent method, suggesting that the nanofluidic system is a reliable platform for executing call rates with high accuracy and concordance in genomic sequences derived from biological tissue.
Genomewide association study of liver abscess in beef cattle.
Keele, J W; Kuehn, L A; McDaneld, T G; Tait, R G; Jones, S A; Keel, B N; Snelling, W M
2016-02-01
Fourteen percent of U.S. cattle slaughtered in 2011 had liver abscesses, resulting in reduced carcass weight, quality, and value. Liver abscesses can result from a common bacterial cause, , which inhabits rumen lesions caused by acidosis and subsequently escapes into the blood stream, is filtered by the liver, and causes abscesses in the liver. Our aim was to identify SNP associated with liver abscesses in beef cattle. We used lung samples as a DNA source because they have low economic value, they have abundant DNA, and we had unrestricted access to sample them. We collected 2,304 lung samples from a beef processing plant: 1,152 from animals with liver abscess and 1,152 from animals without liver abscess. Lung tissue from pairs of animals, 1 with abscesses and another without, was collected from near one another on the viscera table to ensure that pairs of phenotypically extreme animals came from the same lot. Within each phenotype (abscess or no abscess), cattle were pooled by slaughter sequence into 12 pools of 96 cattle for each phenotype for a total of 24 pools. The pools were constructed by equal volume of frozen lung tissue from each animal. The DNA needed to allelotype each pool was then extracted from pooled lung tissue and the BovineHD Bead Array (777,962 SNP) was run on all 24 pools. Total intensity (TI), an indicator of copy number variants, was the sum of intensities from red and green dyes. Pooling allele frequency (PAF) was red dye intensity divided by TI. Total intensity and PAF were weighted by the inverse of their respective genomic covariance matrices computed over all SNP across the genome. A false discovery rate ≤ 5% was achieved for 15 SNP for PAF and 20 SNP for TI.
Genes within 50 kbp of significant SNP were in diverse pathways including maintenance of pH homeostasis in the gastrointestinal tract, maintenance of immune defenses in the liver, migration of leukocytes from the blood into infected tissues, transport of glutamine into the kidney in response to acidosis to facilitate production of bicarbonate to increase pH, aggregation of platelets at liver injury to facilitate liver repair, and facilitation of axon guidance. Evidence from the 35 detected SNP associations combined with evidence of polygenic variation indicates that there is adequate genetic variation in incidence rate of liver abscesses, which could be exploited to select sires for reduced susceptibility to subacute acidosis and associated liver abscess.
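The two per-pool signals defined in this abstract are simple functions of the dye intensities: total intensity (TI) is red plus green, and pooling allele frequency (PAF) is red divided by TI. The sketch below shows that arithmetic only. The intensity values and the function name are hypothetical, and the covariance weighting and false discovery rate steps described in the abstract are not reproduced.

```python
def pooled_signals(red, green):
    """Return (total intensity, pooling allele frequency) for one SNP in one pool."""
    total = red + green
    if total <= 0:
        raise ValueError("total intensity must be positive")
    return total, red / total


# Hypothetical dye intensities for one SNP in an abscess pool and a control pool.
case_ti, case_paf = pooled_signals(red=1200.0, green=800.0)
ctrl_ti, ctrl_paf = pooled_signals(red=900.0, green=1100.0)

# A difference in PAF between phenotype pools is the raw signal that an
# association test on pooled DNA would then evaluate.
paf_difference = case_paf - ctrl_paf

print(f"case pool:    TI={case_ti:.0f}, PAF={case_paf:.2f}")
print(f"control pool: TI={ctrl_ti:.0f}, PAF={ctrl_paf:.2f}")
print(f"PAF difference: {paf_difference:.2f}")
```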
Preprocessing of gene expression data by optimally robust estimators
2010-01-01
Background The preprocessing of gene expression data obtained from several platforms routinely includes the aggregation of multiple raw signal intensities to one expression value. Examples are the computation of a single expression measure based on the perfect match (PM) and mismatch (MM) probes for the Affymetrix technology, the summarization of bead level values to bead summary values for the Illumina technology or the aggregation of replicated measurements in the case of other technologies including real-time quantitative polymerase chain reaction (RT-qPCR) platforms. The summarization of technical replicates is also performed in other "-omics" disciplines like proteomics or metabolomics. Preprocessing methods like MAS 5.0, Illumina's default summarization method, RMA, or VSN show that the use of robust estimators is widely accepted in gene expression analysis. However, the selection of robust methods seems to be mainly driven by their high breakdown point and not by efficiency. Results We describe how optimally robust radius-minimax (rmx) estimators, i.e. estimators that minimize an asymptotic maximum risk on shrinking neighborhoods about an ideal model, can be used for the aggregation of multiple raw signal intensities to one expression value for Affymetrix and Illumina data. With regard to the Affymetrix data, we have implemented an algorithm which is a variant of MAS 5.0. Using datasets from the literature and Monte-Carlo simulations we provide some reasoning for assuming approximate log-normal distributions of the raw signal intensities by means of the Kolmogorov distance, at least for the discussed datasets, and compare the results of our preprocessing algorithms with the results of Affymetrix's MAS 5.0 and Illumina's default method. The numerical results indicate that when using rmx estimators an accuracy improvement of about 10-20% is obtained compared to Affymetrix's MAS 5.0 and about 1-5% compared to Illumina's default method. 
The improvement is also visible in the analysis of technical replicates where the reproducibility of the values (in terms of Pearson and Spearman correlation) is increased for all Affymetrix and almost all Illumina examples considered. Our algorithms are implemented in the R package named RobLoxBioC which is publicly available via CRAN, The Comprehensive R Archive Network (http://cran.r-project.org/web/packages/RobLoxBioC/). Conclusions Optimally robust rmx estimators have a high breakdown point and are computationally feasible. They can lead to a considerable gain in efficiency for well-established bioinformatics procedures and thus, can increase the reproducibility and power of subsequent statistical analysis. PMID:21118506
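The motivation for robust summarization in this abstract can be illustrated with a toy probe set. The sketch below is not the rmx estimator and does not use the RobLoxBioC package; it substitutes the sample median as a stand-in robust estimator and compares it with the mean on log2-transformed intensities. The intensity values are hypothetical, with one deliberately contaminated probe.

```python
import math
import statistics

# Hypothetical raw intensities for one probe set; the last value is an outlier.
probe_intensities = [210.0, 195.0, 205.0, 200.0, 6400.0]

# Summarize on the log2 scale, as is common for expression data.
log2_values = [math.log2(x) for x in probe_intensities]

mean_summary = statistics.mean(log2_values)      # not robust: pulled up by the outlier
median_summary = statistics.median(log2_values)  # robust stand-in, NOT the rmx estimator

print(f"mean of log2 intensities:   {mean_summary:.3f}")
print(f"median of log2 intensities: {median_summary:.3f}")
```

The median ignores the single contaminated probe, which shows robustness but says nothing about efficiency. The abstract's point is that rmx estimators are chosen to balance both.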
Hussein, Ibtessam R; Bader, Rima S; Chaudhary, Adeel G; Bassiouni, Randa; Alquaiti, Maha; Ashgan, Fai; Schulten, Hans-Juergen; Al Qahtani, Mohammad H
2018-06-01
Congenital heart defects (CHDs) are the most common birth defects in neonatal life. CHDs may present as isolated defects or in association with developmental delay (DD) and/or other congenital malformations. A small proportion of cardiac defects are caused by chromosomal abnormalities or single gene defects; however, in a large proportion of cases no genetic diagnosis can be achieved by clinical examination and conventional genetic analysis. The development of the genome-wide array-Comparative Genomic Hybridization technique (array-CGH) allowed for the detection of cryptic chromosomal imbalances and pathogenic copy number variants (CNVs) not detected by conventional techniques. We investigated 94 patients having CHDs associated with other malformations and/or DD. Clinical examination and echocardiography were performed on all patients to evaluate the type of CHD. To investigate genomic defects we applied high-density array-CGH 2 × 400K (41 patients) and CGH/SNP microarray 2 × 400K (Agilent) for 53 patients. Confirmation of results was done using fluorescence in situ hybridization (FISH) or qPCR techniques in certain cases. Chromosomal abnormalities such as trisomy 18, 13, 21, microdeletions: del22q11.2, del7q11.23, del18 (p11.32; p11.21), tetrasomy 18p, trisomy 9p, del11q24-q25, add 15p, add(18)(q21.3), and der 9, 15 (q34.2; q11.2) were detected in 21/94 patients (22%) using both conventional cytogenetics methods and array-CGH technique. Cryptic chromosomal anomalies and pathogenic variants were detected in 15/73 (20.5%) cases. CNVs were observed in a large proportion of the studied samples (27/56) (48%). Clustering of variants was observed in chromosome 1p36, 1p21.1, 2q37, 3q29, 5p15, 7p22.3, 8p23, 11p15.5, 14q11.2, 15q11.2, 16p13.3, 16p11.2, 18p11, 21q22, and 22q11.2. CGH/SNP array could detect loss of heterozygosity (LOH) in different chromosomal loci in 10/25 patients.
Array-CGH technique allowed for detection of cryptic chromosomal imbalances that could not be detected by conventional cytogenetics methods. CHDs associated with DD/congenital malformations presented with a relatively high rate of cryptic chromosomal abnormalities. Clustering of CNVs in certain genome loci needs further analysis to identify candidate genes that may provide clues for understanding the molecular pathway of cardiac development.
Walter, Vonn; Patel, Nirali M.; Eberhard, David A.; Hayward, Michele C.; Salazar, Ashley H.; Jo, Heejoon; Soloway, Matthew G.; Wilkerson, Matthew D.; Parker, Joel S.; Yin, Xiaoying; Zhang, Guosheng; Siegel, Marni B.; Rosson, Gary B.; Earp, H. Shelton; Sharpless, Norman E.; Gulley, Margaret L.; Weck, Karen E.
2015-01-01
The recent FDA approval of the MiSeqDx platform provides a unique opportunity to develop targeted next generation sequencing (NGS) panels for human disease, including cancer. We have developed a scalable, targeted panel-based assay termed UNCseq, which involves a NGS panel of over 200 cancer-associated genes and a standardized downstream bioinformatics pipeline for detection of single nucleotide variations (SNV) as well as small insertions and deletions (indel). In addition, we developed a novel algorithm, NGScopy, designed for samples with sparse sequencing coverage to detect large-scale copy number variations (CNV), similar to human SNP Array 6.0 as well as small-scale intragenic CNV. Overall, we applied this assay to 100 snap-frozen lung cancer specimens lacking same-patient germline DNA (07–0120 tissue cohort) and validated our results against Sanger sequencing, SNP Array, and our recently published integrated DNA-seq/RNA-seq assay, UNCqeR, where RNA-seq of same-patient tumor specimens confirmed SNV detected by DNA-seq, if RNA-seq coverage depth was adequate. In addition, we applied the UNCseq assay on an independent lung cancer tumor tissue collection with available same-patient germline DNA (11–1115 tissue cohort) and confirmed mutations using assays performed in a CLIA-certified laboratory. We conclude that UNCseq can identify SNV, indel, and CNV in tumor specimens lacking germline DNA in a cost-efficient fashion. PMID:26076459
High throughput SNP discovery and genotyping in hexaploid wheat.
Rimbert, Hélène; Darrier, Benoît; Navarro, Julien; Kitt, Jonathan; Choulet, Frédéric; Leveugle, Magalie; Duarte, Jorge; Rivière, Nathalie; Eversole, Kellye; Le Gouis, Jacques; Davassi, Alessandro; Balfourier, François; Le Paslier, Marie-Christine; Berard, Aurélie; Brunel, Dominique; Feuillet, Catherine; Poncet, Charles; Sourdille, Pierre; Paux, Etienne
2018-01-01
Because of their abundance and their amenability to high-throughput genotyping techniques, Single Nucleotide Polymorphisms (SNPs) are powerful tools for efficient genetics and genomics studies, including characterization of genetic resources, genome-wide association studies and genomic selection. In wheat, most of the previous SNP discovery initiatives targeted the coding fraction, leaving almost 98% of the wheat genome largely unexploited. Here we report on the use of whole-genome resequencing data from eight wheat lines to mine for SNPs in the genic, the repetitive and non-repetitive intergenic fractions of the wheat genome. Eventually, we identified 3.3 million SNPs, 49% being located on the B-genome, 41% on the A-genome and 10% on the D-genome. We also describe the development of the TaBW280K high-throughput genotyping array containing 280,226 SNPs. Performance of this chip was examined by genotyping a set of 96 wheat accessions representing the worldwide diversity. Sixty-nine percent of the SNPs can be efficiently scored, half of them showing a diploid-like clustering. The TaBW280K was proven to be a very efficient tool for diversity analyses, as well as for breeding as it can discriminate between closely related elite varieties. Finally, the TaBW280K array was used to genotype a population derived from a cross between Chinese Spring and Renan, leading to the construction of a dense genetic map comprising 83,721 markers. The results described here will provide the wheat community with powerful tools for both basic and applied research.
Mei, Mei; Yang, Lin; Zhan, Guodong; Wang, Huijun; Ma, Duan; Zhou, Wenhao; Huang, Guoying
2014-06-01
To screen for genomic copy number variations (CNVs) in two unrelated neonates with multiple congenital abnormalities using an Affymetrix SNP chip and to try to find the critical region associated with congenital heart disease. Two neonates were tested for genomic copy number variations using the Cytogenetic SNP chip. Rare CNVs with potential clinical significance were selected, requiring deletion segments larger than 50 kb and duplication segments larger than 150 kb based on analysis with ChAS software, and excluding false-positive CNVs and segments found in the normal population. The identified CNVs were compared with those of the cases in the DECIPHER and ISCA databases. Eleven rare CNVs with size from 546.6-27 892 kb were identified in the 2 neonates. The deletion region and size of case 1 were 8p23.3-p23.1 (387 912-11 506 771 bp) and 11.1 Mb respectively, the duplication region and size of case 1 were 8p23.1-p11.1 (11 508 387-43 321 279 bp) and 31.8 Mb respectively. The deletion region and size of case 2 were 8p23.3-p23.1 (46 385-7 809 878 bp) and 7.8 Mb respectively, the duplication region and size of case 2 were 8p23.1-p11.21 (12 260 914-40 917 092 bp) and 28.7 Mb respectively. The comparison with the DECIPHER and ISCA databases supported the previous view that 8p23.1 is associated with congenital heart disease and that the region between 7 809 878-11 506 771 bp may play a role in the severe cardiac defects associated with 8p23.1 deletions. Case 1 had serious cardiac abnormalities; in this case GATA4 was located in the duplication segment, with increased copy number, while SOX7 was located in the deletion segment, with decreased copy number. The region between 7 809 878-11 506 771 bp in 8p23.1 is associated with heart defects and copy number variants of SOX7 and GATA4 may result in congenital heart disease.
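The size thresholds and segment sizes quoted in this abstract can be reproduced from the reported coordinates. The sketch below is only an illustration of that arithmetic, not the ChAS workflow: the `Segment` class, the function name, and the final small duplication are hypothetical, while the four 8p segments use the base-pair coordinates given above.

```python
from dataclasses import dataclass

# Size thresholds quoted in the abstract.
DELETION_MIN_KB = 50
DUPLICATION_MIN_KB = 150


@dataclass
class Segment:
    label: str
    kind: str       # "deletion" or "duplication"
    start_bp: int
    end_bp: int

    @property
    def size_kb(self):
        return (self.end_bp - self.start_bp) / 1000


def passes_size_filter(seg):
    """Keep deletions > 50 kb and duplications > 150 kb."""
    threshold = DELETION_MIN_KB if seg.kind == "deletion" else DUPLICATION_MIN_KB
    return seg.size_kb > threshold


segments = [
    Segment("case 1 8p23.3-p23.1", "deletion", 387_912, 11_506_771),
    Segment("case 1 8p23.1-p11.1", "duplication", 11_508_387, 43_321_279),
    Segment("case 2 8p23.3-p23.1", "deletion", 46_385, 7_809_878),
    Segment("case 2 8p23.1-p11.21", "duplication", 12_260_914, 40_917_092),
    # Hypothetical 100 kb duplication, included only to show a rejected call.
    Segment("hypothetical small dup", "duplication", 1_000_000, 1_100_000),
]

kept = [s for s in segments if passes_size_filter(s)]
sizes_mb = {s.label: round(s.size_kb / 1000, 1) for s in kept}

for label, size in sizes_mb.items():
    print(f"{label}: {size} Mb")
```

The computed sizes (11.1, 31.8, 7.8 and 28.7 Mb) match the values stated in the abstract.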
Livingstone, Donald; Stack, Conrad; Mustiga, Guiliana M; Rodezno, Dayana C; Suarez, Carmen; Amores, Freddy; Feltus, Frank A; Mockaitis, Keithanne; Cornejo, Omar E; Motamayor, Juan C
2017-01-01
Cacao (Theobroma cacao L.) is an important cash crop in tropical regions around the world and has a rich agronomic history in South America. As a key component in the cosmetic and confectionery industries, millions of people worldwide use products made from cacao, ranging from shampoo to chocolate. An Illumina Infinium II array was created using 13,530 SNPs identified within a small diversity panel of cacao. Of these SNPs, 12,643 derive from variation within annotated cacao genes. The genotypes of 3,072 trees were obtained, including two mapping populations from Ecuador. High-density linkage maps for these two populations were generated and compared to the cacao genome assembly. Phenotypic data from these populations were combined with the linkage maps to identify the QTLs for yield and disease resistance.
Mazzarelli, Joan M; Brestelli, John; Gorski, Regina K; Liu, Junmin; Manduchi, Elisabetta; Pinney, Deborah F; Schug, Jonathan; White, Peter; Kaestner, Klaus H; Stoeckert, Christian J
2007-01-01
EPConDB (http://www.cbil.upenn.edu/EPConDB) is a public web site that supports research in diabetes, pancreatic development and beta-cell function by providing information about genes expressed in cells of the pancreas. EPConDB displays expression profiles for individual genes and information about transcripts, promoter elements and transcription factor binding sites. Gene expression results are obtained from studies examining tissue expression, pancreatic development and growth, differentiation of insulin-producing cells, islet or beta-cell injury, and genetic models of impaired beta-cell function. The expression datasets are derived using different microarray platforms, including the BCBC PancChips and Affymetrix gene expression arrays. Other datasets include semi-quantitative RT-PCR and MPSS expression studies. For selected microarray studies, lists of differentially expressed genes, derived from PaGE analysis, are displayed on the site. EPConDB provides database queries and tools to examine the relationship between a gene, its transcriptional regulation, protein function and expression in pancreatic tissues.
Singh, Sudhir P; Jeet, Raja; Kumar, Jitendra; Shukla, Vishnu; Srivastava, Rakesh; Mantri, Shrikant S; Tuli, Rakesh
2014-01-01
Wheat is one of the most important cereal crops in the world. To identify the candidate genes for mineral accumulation, it is important to examine differential transcriptome between wheat genotypes, with contrasting levels of minerals in grains. A transcriptional comparison of developing grains was carried out between two wheat genotypes- Triticum aestivum Cv. WL711 (low grain mineral), and T. aestivum L. IITR26 (high grain mineral), using Affymetrix GeneChip Wheat Genome Array. The study identified a total of 580 probe sets as differentially expressed (with log2 fold change of ≥2 at p≤0.01) between the two genotypes, during grain filling. Transcripts with significant differences in induction or repression between the two genotypes included genes related to metal homeostasis, metal tolerance, lignin and flavonoid biosynthesis, amino acid and protein transport, vacuolar-sorting receptor, aquaporins, and stress responses. Meta-analysis revealed spatial and temporal signatures of a majority of the differentially regulated transcripts.
Singh, Sudhir P.; Jeet, Raja; Kumar, Jitendra; Shukla, Vishnu; Srivastava, Rakesh; Mantri, Shrikant S.; Tuli, Rakesh
2014-01-01
Wheat is one of the most important cereal crops in the world. To identify candidate genes for mineral accumulation, it is important to compare the transcriptomes of wheat genotypes with contrasting levels of minerals in their grains. A transcriptional comparison of developing grains was carried out between two wheat genotypes, Triticum aestivum cv. WL711 (low grain mineral) and T. aestivum L. IITR26 (high grain mineral), using the Affymetrix GeneChip Wheat Genome Array. The study identified a total of 580 probe sets as differentially expressed (log2 fold change ≥2 at p≤0.01) between the two genotypes during grain filling. Transcripts with significant differences in induction or repression between the two genotypes included genes related to metal homeostasis, metal tolerance, lignin and flavonoid biosynthesis, amino acid and protein transport, vacuolar-sorting receptors, aquaporins, and stress responses. Meta-analysis revealed spatial and temporal signatures for a majority of the differentially regulated transcripts. PMID:25364903
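The selection rule quoted in this abstract (log2 fold change ≥2 at p≤0.01) can be illustrated with a minimal sketch. The abstract does not describe the authors' software, so everything here is an assumption: the `ProbeResult` record and the `select_differential` function are hypothetical names, and the threshold is applied to the absolute fold change because the abstract mentions both induction and repression.

```python
from dataclasses import dataclass


@dataclass
class ProbeResult:
    """One probe set with its comparison statistics (hypothetical record layout)."""
    probe_id: str
    log2_fold_change: float
    p_value: float


def select_differential(results, min_abs_log2_fc=2.0, max_p=0.01):
    """Keep probe sets whose absolute log2 fold change and p-value pass both cutoffs.

    Using the absolute value is an assumption made so that both induced
    and repressed transcripts are retained.
    """
    return [
        r for r in results
        if abs(r.log2_fold_change) >= min_abs_log2_fc and r.p_value <= max_p
    ]


if __name__ == "__main__":
    # Made-up values for illustration only; these are not data from the study.
    demo = [
        ProbeResult("probe_a", 2.5, 0.001),
        ProbeResult("probe_b", -3.0, 0.01),
        ProbeResult("probe_c", 1.9, 0.0001),
        ProbeResult("probe_d", 4.0, 0.05),
    ]
    for hit in select_differential(demo):
        print(hit.probe_id, hit.log2_fold_change, hit.p_value)
```

Both cutoffs are left as parameters so that a different stringency can be tried without changing the function.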
Duker, Angela L; Ballif, Blake C; Bawle, Erawati V; Person, Richard E; Mahadevan, Sangeetha; Alliman, Sarah; Thompson, Regina; Traylor, Ryan; Bejjani, Bassem A; Shaffer, Lisa G; Rosenfeld, Jill A; Lamb, Allen N; Sahoo, Trilochan
2010-11-01
Prader-Willi syndrome (PWS) is a neurobehavioral disorder manifested by hypotonia and feeding difficulties in infancy, followed by morbid obesity secondary to hyperphagia. It is caused by deficiency of paternally expressed transcript(s) within the human chromosome region 15q11.2. PWS patients harboring balanced chromosomal translocations with breakpoints within small nuclear ribonucleoprotein polypeptide N (SNRPN) have provided indirect evidence for a role for the imprinted C/D box containing small nucleolar RNA (snoRNA) genes encoded downstream of SNRPN. In addition, recently published data provide strong evidence in support of a role for the snoRNA SNORD116 cluster (HBII-85) in PWS etiology. In this study, we performed detailed phenotypic, cytogenetic, and molecular analyses including chromosome analysis, array comparative genomic hybridization (array CGH), expression studies, and single-nucleotide polymorphism (SNP) genotyping for parent-of-origin determination of the 15q11.2 microdeletion on an 11-year-old child expressing the major components of the PWS phenotype. This child had an ∼236.29 kb microdeletion at 15q11.2 within the larger Prader-Willi/Angelman syndrome critical region that included the SNORD116 cluster of snoRNAs. Analysis of SNP genotypes in proband and mother provided evidence in support of the deletion being on the paternal chromosome 15. This child also met most of the major PWS diagnostic criteria including infantile hypotonia, early-onset morbid obesity, and hypogonadism. Identification and characterization of this case provide unequivocal evidence for a critical role for the SNORD116 snoRNA molecules in PWS pathogenesis. Array CGH testing for genomic copy-number changes in cases with complex phenotypes is proving to be invaluable in detecting novel alterations and enabling better genotype-phenotype correlations.
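The parent-of-origin reasoning from proband and maternal SNP genotypes can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: inside a heterozygous deletion the proband carries a single allele per SNP, and if that allele is always present in the mother's genotype, the retained chromosome is compatible with maternal origin, which in turn is consistent with the deletion lying on the paternal chromosome. The function name and the input layout are hypothetical.

```python
def maternal_consistency(proband_alleles, mother_genotypes):
    """Count SNPs where the proband's single retained allele is, or is not, found in the mother.

    proband_alleles:  dict of SNP id -> single allele observed in the deleted interval
    mother_genotypes: dict of SNP id -> two-character genotype string such as "AG"
    Returns (consistent, inconsistent); SNPs not genotyped in the mother are skipped.
    """
    consistent = 0
    inconsistent = 0
    for snp_id, allele in proband_alleles.items():
        mother = mother_genotypes.get(snp_id)
        if mother is None:
            continue  # no maternal call at this SNP, so it is uninformative
        if allele in mother:
            consistent += 1
        else:
            inconsistent += 1
    return consistent, inconsistent


if __name__ == "__main__":
    # Made-up genotypes for illustration only.
    proband = {"snp1": "A", "snp2": "G", "snp3": "T"}
    mother = {"snp1": "AA", "snp2": "AG", "snp3": "TT"}
    print(maternal_consistency(proband, mother))
```

A result with no inconsistent SNPs only supports, and does not prove, a paternal deletion, which matches the hedged wording of the abstract ("evidence in support of").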
Brøndum, R F; Su, G; Janss, L; Sahana, G; Guldbrandtsen, B; Boichard, D; Lund, M S
2015-06-01
This study investigated the effect on the reliability of genomic prediction when a small number of significant variants from single-marker analysis based on whole genome sequence data were added to the regular 54k single nucleotide polymorphism (SNP) array data. The extra markers were selected with the aim of augmenting the custom low-density Illumina BovineLD SNP chip (San Diego, CA) used in the Nordic countries. The single-marker analysis was done breed-wise on all 16 index traits included in the breeding goals for Nordic Holstein, Danish Jersey, and Nordic Red cattle plus the total merit index itself. Depending on the trait's economic weight, 15, 10, or 5 quantitative trait loci (QTL) were selected per trait per breed and 3 to 5 markers were selected to tag each QTL. After removing duplicate markers (same marker selected for more than one trait or breed) and filtering for high pairwise linkage disequilibrium and assaying performance on the array, a total of 1,623 QTL markers were selected for inclusion on the custom chip. Genomic prediction analyses were performed for Nordic and French Holstein and Nordic Red animals using either a genomic BLUP or a Bayesian variable selection model. When the QTL markers were included in the genomic BLUP analysis, reliability increased by up to 4 percentage points for production traits in Nordic Holstein animals, up to 3 percentage points for Nordic Reds, and up to 5 percentage points for French Holstein. Smaller gains of up to 1 percentage point were observed for mastitis, but only a 0.5 percentage point increase was seen for fertility. When using a Bayesian model, accuracies were generally higher with only 54k data compared with the genomic BLUP approach, but increases in reliability were relatively smaller when QTL markers were included.
Results from this study indicate that the reliability of genomic prediction can be increased by including markers significant in genome-wide association studies on whole genome sequence data alongside the 54k SNP set. Copyright © 2015 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
Zhang, Linsheng; Znoyko, Iya; Costa, Luciano J; Conlin, Laura K; Daber, Robert D; Self, Sally E; Wolff, Daynna J
2011-12-01
Chronic lymphocytic leukemia (CLL) is a clinically heterogeneous disease. The methods currently used for monitoring CLL and determining conditions for treatment are limited in their ability to predict disease progression, patient survival, and response to therapy. Although clonal diversity and the acquisition of new chromosomal abnormalities during the disease course (clonal evolution) have been associated with disease progression, their prognostic potential has been underappreciated because cytogenetic and fluorescence in situ hybridization (FISH) studies have a restricted ability to detect genomic abnormalities and clonal evolution. We hypothesized that whole genome analysis using high resolution single nucleotide polymorphism (SNP) microarrays would be useful to detect diversity and infer clonal evolution to offer prognostic information. In this study, we used the Infinium Omni1 BeadChip (Illumina, San Diego, CA) array for the analysis of genetic variation and percent mosaicism in 25 non-selected CLL patients to explore the prognostic value of the assessment of clonal diversity in patients with CLL. We calculated the percentage of mosaicism for each abnormality by applying a mathematical algorithm to the genotype frequency data and by manual determination using the Simulated DNA Copy Number (SiDCoN) tool, which was developed from a computer model of mosaicism. At least one genetic abnormality was identified in each case, and the SNP data was 98% concordant with FISH results. Clonal diversity, defined as the presence of two or more genetic abnormalities with differing percentages of mosaicism, was observed in 12 patients (48%), and the diversity correlated with the disease stage. Clonal diversity was present in most cases of advanced disease (Rai stages III and IV) or those with previous treatment, whereas 9 of 13 patients without detected clonal diversity were asymptomatic or clinically stable. 
In conclusion, SNP microarray studies with simultaneous evaluation of genomic alterations and mosaic distribution of clones can be used to assess apparent clonal evolution via analysis of clonal diversity. Since clonal evolution in CLL is strongly correlated with disease progression, whole genome SNP microarray analysis provides a new comprehensive and reliable prognostic tool for CLL patients. Copyright © 2011 Elsevier Inc. All rights reserved.
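The abstract's definition of clonal diversity (two or more genetic abnormalities with differing percentages of mosaicism) can be written as a small check. This is a minimal sketch, not the authors' algorithm or the SiDCoN tool: the function name is hypothetical, and the `tolerance` parameter, which decides how far apart two percentages must be to count as differing, is an assumption the abstract does not specify.

```python
def has_clonal_diversity(mosaicism_percentages, tolerance=5.0):
    """Return True when at least two abnormalities differ in percent mosaicism.

    mosaicism_percentages: one percentage (0-100) per detected abnormality
    tolerance: hypothetical minimum gap, in percentage points, for two values
               to be treated as different rather than as measurement noise
    """
    if len(mosaicism_percentages) < 2:
        return False  # a single abnormality cannot show diversity
    # If the extreme values are within tolerance, every pair of values is too.
    return max(mosaicism_percentages) - min(mosaicism_percentages) > tolerance


if __name__ == "__main__":
    # Made-up percentages for illustration only.
    print(has_clonal_diversity([80.0, 30.0]))
    print(has_clonal_diversity([40.0, 42.0]))
```

Comparing only the largest and smallest values is enough here, because any pair that differs by more than the tolerance implies that the extremes do as well.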
Pendergrass, Sarah A; Verma, Shefali S; Holzinger, Emily R; Moore, Carrie B; Wallace, John; Dudek, Scott M; Huggins, Wayne; Kitchner, Terrie; Waudby, Carol; Berg, Richard; McCarty, Catherine A; Ritchie, Marylyn D
2013-01-01
Investigating the association between biobank-derived genomic data and the information of linked electronic health records (EHRs) is an emerging area of research for dissecting the architecture of complex human traits, where cases and controls for study are defined through the use of electronic phenotyping algorithms deployed in large EHR systems. For our study, 2580 cataract cases and 1367 controls were identified within the Marshfield Personalized Medicine Research Project (PMRP) Biobank and linked EHR, which is a member of the NHGRI-funded electronic Medical Records and Genomics (eMERGE) Network. Our goal was to explore potential gene-gene and gene-environment interactions within these data for 529,431 single nucleotide polymorphisms (SNPs) with minor allele frequency > 1%, in order to explore higher level associations with cataract risk beyond investigations of single SNP-phenotype associations. To build our SNP-SNP interaction models we utilized a prior-knowledge-driven filtering method called Biofilter to minimize the multiple testing burden of exploring the vast array of interaction models possible from our extensive number of SNPs. Using the Biofilter, we developed 57,376 prior-knowledge directed SNP-SNP models to test for association with cataract status. We selected models that required 6 sources of external domain knowledge. We identified 5 statistically significant models with an interaction term with p-value < 0.05, as well as an overall model with p-value < 0.05 associated with cataract status. We also conducted gene-environment interaction analyses for all GWAS SNPs and a set of environmental factors from the PhenX Toolkit: smoking, UV exposure, and alcohol use; these environmental factors have been previously associated with the formation of cataracts. We found a total of 288 models that exhibit an interaction term with a p-value ≤ 1×10⁻⁴ associated with cataract status.
Our results show that these approaches enable advanced searches for epistasis and gene-environment interactions beyond GWAS, and that the EHR-based approach provides an additional source of data for seeking these advanced explanatory models of the etiology of complex diseases and outcomes such as cataracts.
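The idea of prior-knowledge filtering, where SNP-SNP models are built only for gene pairs supported by external knowledge sources, can be sketched as below. This is not the Biofilter software or its API. The function name, the input dictionaries, and the reading of "required 6 sources" as "at least 6 sources" are all assumptions made for illustration.

```python
from itertools import product


def build_snp_pair_models(gene_pair_sources, snps_by_gene, min_sources=6):
    """Build candidate SNP-SNP models from gene pairs with enough supporting sources.

    gene_pair_sources: dict of (gene_a, gene_b) -> number of knowledge sources linking the pair
    snps_by_gene:      dict of gene -> list of SNP ids mapped to that gene
    min_sources:       hypothetical cutoff; pairs with fewer sources are dropped
    Returns a sorted list of unique (snp_1, snp_2) tuples.
    """
    models = set()
    for (gene_a, gene_b), n_sources in gene_pair_sources.items():
        if n_sources < min_sources:
            continue  # not enough prior knowledge to justify testing this pair
        snps_a = snps_by_gene.get(gene_a, [])
        snps_b = snps_by_gene.get(gene_b, [])
        for snp_a, snp_b in product(snps_a, snps_b):
            if snp_a != snp_b:
                # Sort within the pair so (x, y) and (y, x) count as one model.
                models.add(tuple(sorted((snp_a, snp_b))))
    return sorted(models)


if __name__ == "__main__":
    # Made-up genes and SNP ids for illustration only.
    pairs = {("G1", "G2"): 6, ("G1", "G3"): 2}
    snps = {"G1": ["rs1", "rs2"], "G2": ["rs3"], "G3": ["rs4"]}
    print(build_snp_pair_models(pairs, snps))
```

Restricting the candidate set this way is what reduces the multiple testing burden: only the surviving pairs would go on to an interaction test against case or control status.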